Computer Vision

Software Series | BasicAI Cloud Technical Architecture

Jagger W.

This article describes how to build a SaaS-grade training data platform on top of Xtreme1, the world’s first open-source multimodal training data platform. In a Software as a Service (SaaS) environment, the main technical difficulties are annotating, enhancing, and governing massive multimodal data; assigning multi-level permissions across multiple tenants; and interfacing quickly with state-of-the-art (SOTA) models.

BasicAI Cloud is a SaaS product built on Xtreme1. As a multi-tenant platform where a single tenant can easily reach terabytes of data, BasicAI Cloud faces many data challenges, for example: how to annotate, enhance, and manage massive quantities of multimodal data; how to interface quickly with SOTA models; and how to manage multiple roles across tenants.

Software Series Introduction

In this series, a software architect provides an in-depth analysis of the architecture behind the technology and guides you through the story of multimodal data in data-centric MLOps.

In addition, a software architect and an algorithms expert give a detailed analysis of the core technologies behind the design of the BasicAI Cloud architecture.

Overall Architecture

BasicAI Cloud follows cloud-native architecture principles to ensure scalable service performance, elastic deployment scale, and resilience in case of failure. The front-end and back-end are split by functional module into multiple independent services, each scoped to a specific user need. Application services are designed to be stateless; combined with the automatic scaling mechanism of Kubernetes, this lets the cluster size adjust automatically to the current load. Services such as the database, message queue, and cache are chosen from open-source software that is either distributed and cloud-native friendly or cluster-enabled, so they can handle massive data storage and queries. DevOps is based on GitLab Continuous Integration and Continuous Delivery (CI/CD), which automates the entire release process from code commit, through package and image builds, to release on the corresponding Kubernetes cluster (development, test, or production). This allows developers to focus on writing code while significantly reducing repetitive operations and maintenance tasks.
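To make the automatic scaling idea concrete, here is a minimal sketch of the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler. The article does not say which autoscaler or metrics BasicAI Cloud uses, so the metric values, replica bounds, and function name below are assumptions chosen for illustration.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,    # assumed bound, not from the article
    max_replicas: int = 10,   # assumed bound, not from the article
    tolerance: float = 0.1,   # default tolerance in the Kubernetes docs
) -> int:
    """Replica count suggested by the documented HPA rule:
    ceil(currentReplicas * currentMetric / targetMetric)."""
    ratio = current_metric / target_metric
    # When the ratio is close enough to 1.0, no scaling happens.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))
```

For example, four replicas running at 90% average CPU against a 60% target would be scaled to six. Because the services are stateless, the added replicas can take traffic without any session migration.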

BasicAI Cloud Tech Stack

From top to bottom, the system can be divided into five layers: Access Layer, Application Service Layer, Basic Service Layer, Container Abstraction Layer (Kubernetes), and Infrastructure Layer.

Access Layer

The Access Layer receives external requests and forwards them to the appropriate services. It includes Layer 4 (TCP) and Layer 7 (HTTP) load balancing. The access layer hides the operational details of internal services from the outside world and exposes only a few publicly accessible services, such as the web front-end, the API gateway, and WebSocket, to keep internal services secure.
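The difference between the two kinds of load balancing can be sketched as follows. This is an illustrative toy, not BasicAI Cloud's configuration: the backend addresses, path prefixes, and service names are all hypothetical. A Layer 4 balancer decides from the connection alone, while a Layer 7 balancer can inspect the HTTP request.

```python
import hashlib

# Hypothetical backends and routes, for illustration only.
L4_BACKENDS = ["10.0.0.1:443", "10.0.0.2:443"]
L7_ROUTES = {"/api/": "api-gateway", "/ws/": "websocket", "/": "web-frontend"}


def pick_l4_backend(client_ip: str, client_port: int) -> str:
    """Layer 4: choose a backend from the connection tuple only.
    The request payload is never inspected."""
    digest = hashlib.sha256(f"{client_ip}:{client_port}".encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(L4_BACKENDS)
    return L4_BACKENDS[index]


def pick_l7_service(path: str) -> str:
    """Layer 7: choose a public service by longest matching path prefix."""
    for prefix in sorted(L7_ROUTES, key=len, reverse=True):
        if path.startswith(prefix):
            return L7_ROUTES[prefix]
    raise LookupError(f"no route for {path!r}")
```

Only the services named in the route table are reachable from outside, which mirrors the idea of exposing a small public surface while keeping internal services hidden.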

Application Service Layer

Application services include front-end and back-end services, as well as various model inference services. On the front end, the annotation tools, such as the image annotation tool and the point cloud annotation tool, are applications separate from the main application, because their technology stack differs from that of the main application. The back-end services are divided into microservices by business module. Each microservice is exposed through an API gateway, which also allows authentication, rate limiting, and other cross-cutting operations to be implemented centrally.
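A minimal sketch of why centralizing these concerns in the gateway is convenient: the microservice handlers below contain no authentication or throttling logic at all. The class names, token table, token-bucket algorithm, and status codes are assumptions for illustration; the article does not describe the actual gateway implementation.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucket:
    """Simple token-bucket rate limiter (one bucket per tenant)."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class Gateway:
    """Hypothetical gateway: authenticate, rate-limit, then route."""

    def __init__(self, tokens: Dict[str, str],
                 routes: Dict[str, Callable[[str], str]],
                 new_bucket: Callable[[], TokenBucket]) -> None:
        self.tokens = tokens          # access token -> tenant id
        self.routes = routes          # first path segment -> microservice handler
        self.new_bucket = new_bucket
        self.buckets: Dict[str, TokenBucket] = {}

    def handle(self, token: str, path: str) -> Tuple[int, str]:
        tenant = self.tokens.get(token)
        if tenant is None:
            return 401, "unauthorized"
        if tenant not in self.buckets:
            self.buckets[tenant] = self.new_bucket()
        if not self.buckets[tenant].allow():
            return 429, "too many requests"
        handler = self.routes.get(path.strip("/").split("/")[0])
        if handler is None:
            return 404, "unknown service"
        return 200, handler(tenant)
```

A gateway could then be wired up as `Gateway({"t-abc": "tenant-1"}, {"dataset": list_datasets}, lambda: TokenBucket(100, 10.0))`, where `list_datasets` is a hypothetical microservice handler that receives only an already-authenticated tenant id.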

Basic Service Layer

The basic service layer includes the database, message queue, cache, object storage, and other supporting services. To meet the performance scalability requirements of a SaaS service, the database and message queue use the emerging distributed systems TiDB and Pulsar, while the cache uses Redis, which supports cluster mode. Object storage uses MinIO, an open-source, AWS S3-compatible object storage system that also supports distributed deployment.

Container Abstraction Layer

The Kubernetes container abstraction layer shields upper-layer services from differences in the underlying infrastructure, so that they are deployed uniformly regardless of changes in the hardware and software environment.

Infrastructure Layer

The infrastructure layer provides hardware resources such as computing, storage, and networking, which can come from major public clouds or from self-built private clouds.

Request Routing

Request Routing Flow

From a macro point of view, the whole system contains three subsystems: a front-end application (App) for users, an administrative back-end (Admin) for internal use, and an open API platform (Open) for developers. Each subsystem is served under its own domain name and has its own API gateway (the Admin API Backend also acts as a gateway), because their authentication and authorization methods differ. All subsystems share the same business microservices, which maximizes reuse of business code and reduces development, operation, and maintenance costs.
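The routing idea can be sketched as follows: the domain name selects a gateway, each gateway applies its own authentication, and all gateways call into one shared set of business services. The host names, credential formats, and service names here are hypothetical; the article only states that the subsystems use different domains and different authentication methods.

```python
from typing import Callable, Dict, Optional


# One shared business microservice, reused by every subsystem (hypothetical).
def dataset_service(caller: str) -> str:
    return f"dataset list for {caller}"


SHARED_SERVICES: Dict[str, Callable[[str], str]] = {"dataset": dataset_service}


def prefix_auth(prefix: str, role: str) -> Callable[[str], Optional[str]]:
    """Stand-in for a real authentication scheme: accept credentials
    that start with the given prefix and return a caller identity."""
    def authenticate(credential: str) -> Optional[str]:
        if credential.startswith(prefix):
            return f"{role}:{credential[len(prefix):]}"
        return None
    return authenticate


def make_gateway(name: str, authenticate: Callable[[str], Optional[str]]):
    """Build a gateway with its own auth but the shared service table."""
    def gateway(credential: str, service: str) -> str:
        caller = authenticate(credential)
        if caller is None:
            return f"{name}: 401 unauthorized"
        return SHARED_SERVICES[service](caller)
    return gateway


# Assumed domain names; each subsystem gets a differently configured gateway.
GATEWAYS_BY_HOST = {
    "app.example.com": make_gateway("app", prefix_auth("session=", "user")),
    "admin.example.com": make_gateway("admin", prefix_auth("staff=", "staff")),
    "open.example.com": make_gateway("open", prefix_auth("apikey=", "developer")),
}


def route(host: str, credential: str, service: str) -> str:
    """Pick the gateway by domain name, then let it authenticate and forward."""
    return GATEWAYS_BY_HOST[host](credential, service)
```

In this sketch, adding a fourth subsystem would only require one more entry in the host table, while `dataset_service` stays untouched. That is the reuse benefit the shared-microservice design aims for.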

That concludes the product architecture of BasicAI Cloud. Stay tuned for our next edition of Software Series — a set of articles focused on software architecture and algorithms.

Visit Xtreme1’s GitHub repository; “starring” our open-source platform is one of the best ways to support us:

https://github.com/basicai/xtreme1/

If you have any questions, please feel free to reach out to us on Slack and we’ll be sure to help you out:

https://join.slack.com/t/xtreme1io/shared_invite/zt-1jpoj6hib-ATqS640GPSluIUtERlN8yQ
