Synthetic Object Recognition Dataset for Industries.

Discover The Amazing Pipelines of SORDI.ai!

The largest and most comprehensive synthetic dataset

Published by BMW Group, SORDI.ai helps developers and researchers streamline and accelerate the training of artificial intelligence for production. The dataset offers over 1.2 million photorealistic images and thousands of detailed point clouds, capturing even the most intricate aspects of real factory environments. Together with Google and NVIDIA, SORDI.ai has been made available as open source, in an effort to build the world’s largest reference dataset for artificial intelligence in the broad field of manufacturing.

120+ unique object classes

SORDI.ai offers a diverse and extensive object catalog with over 120 unique classes, enabling deployment across a wide range of regions, industries, and use cases. The dataset features high-fidelity digital twins of logistics and industrial assets, such as KLT boxes, stillages, and dollies, carefully modeled to VDA standards. Realism begins at the smallest details and extends to the dynamic structure of each scene: objects appear in randomized combinations of states, behaviors, and visual variations. This controlled blend of realism and randomness is our secret ingredient for building high-performance industrial AI models!

Accurately Annotated Dataset with Pixel-Level Synthetic Labels

SORDI.ai provides extremely accurate synthetic annotations that greatly reduce the time and cost of manual labeling without compromising quality. These labels offer pixel-perfect precision and consistency across 2D, 3D, and time-series data, supporting a wide range of industrial use cases such as object detection, segmentation, classification, and forecasting. This rich and precise annotation ecosystem fuels cutting-edge solutions across industries, powering smarter, faster innovation in today’s most demanding manufacturing environments.

The SORDI.ai Synthetic Data Generation Pipeline

1. Asset Preparation
2. Scene Construction
3. Data Capture
4. Quality Assessment

Digital First Approach.

The SORDI.ai dataset is made up of synthetic images generated using NVIDIA Omniverse. By leveraging USD and MDL workflows and connecting DCC tools from BMW Group's workflow, combined with Google Cloud’s Generative AI technologies, SORDI.ai is constantly expanding to include new models and classes.

120+ Versatile Object Classes

Logistics

Containers, stillages, storage boxes, disposable and reusable handling materials, packaging, boxes.

Transportation

Manual and powered vehicles, pushbikes, scooters, production equipment.

Office

Office furniture, boards, displays, accessories.

Signage

Emergency signs, information signs, pictographs, text signs.

Tools

Safety tools, mechanical and electrical tools, screws, bolts.

The SORDI.ai Asset Tree

The structured hierarchy of all SORDI.ai digital assets is the foundation for how the SORDI.ai dataset is built, managed, and expanded. Its descriptive levels break down asset configurations by deployment environment, task, type, variant, and behavior. This scalable tree is fundamental for maintaining large-scale digital twins and configuring realistic randomizations.
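To make the five descriptive levels concrete, here is a minimal sketch of how such a hierarchy could be represented. Only the level names (deployment environment, task, type, variant, behavior) come from the description above. The class `AssetNode`, the helper `build_tree`, and all example values are hypothetical and are not the actual SORDI.ai schema.

```python
from dataclasses import dataclass

# The five levels are named in the text; the field names are assumptions.
LEVELS = ("environment", "task", "asset_type", "variant", "behavior")


@dataclass(frozen=True)
class AssetNode:
    """One asset configuration, described by the five tree levels."""
    environment: str
    task: str
    asset_type: str
    variant: str
    behavior: str

    def path(self) -> str:
        """Join the five levels into a slash-separated tree path."""
        return "/".join(getattr(self, level) for level in LEVELS)


def build_tree(nodes):
    """Nest asset configurations into dictionaries, one level at a time."""
    tree = {}
    for node in nodes:
        cursor = tree
        for level in LEVELS:
            cursor = cursor.setdefault(getattr(node, level), {})
    return tree


# Illustrative entries only; these are not real SORDI.ai asset names.
assets = [
    AssetNode("warehouse", "logistics", "klt_box", "blue", "stacked"),
    AssetNode("warehouse", "logistics", "klt_box", "blue", "open"),
    AssetNode("warehouse", "logistics", "stillage", "steel", "empty"),
]
tree = build_tree(assets)
print(assets[0].path())
print(sorted(tree["warehouse"]["logistics"]))
```

Keeping every configuration addressable by a single path is one way a randomizer could pick, for example, any behavior of a given variant without knowing the rest of the tree.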


Dynamic Industrial Assets

Our digital twins are not just visually rich; they are built on deep metadata that captures operational and behavioral logic. This architecture gives each twin predictive intelligence, allowing it to track an asset's full history and forecast its future state.

Comprehensive Metadata

SORDI.ai assets are intelligent, operational objects, not just visual models. Our comprehensive metadata defines physical properties, operational state, and behavioral logic. This depth of data enables the generation of logically consistent scenarios, giving AI models the situational understanding required for real-world reliability.

Generative Textures

We leverage state-of-the-art generative AI to dynamically create photorealistic and procedural textures on demand. This pipeline generates a virtually infinite variety of surface conditions, including authentic signs of wear and tear and environmental effects. This extreme variability is crucial for dramatically boosting the robustness and generalization capabilities of vision-based models.

The SORDI.ai Scalable and Modular USD-based Scene Construction Pipeline

This pipeline is the core of SORDI.ai’s scalability and supports extremely large-scale digital twins! It consists of nine layers built on Omniverse’s USD pipeline.

Dynamic Action-Ready Scenes ft. Comprehensive Human Worker Activities

These incredibly realistic human models simulate the performance of complex tasks and movements with high accuracy and speed, enabling the simulation of action-based real-world scenarios with dynamic events. With virtual humans, SORDI.ai is pushing the boundaries of what's possible in the industrial Metaverse and paving the way for more efficient and sustainable applications.

Real-World Inspired Scene Capturing Settings

The diverse SORDI.ai camera capture settings provide fine-grained control over the complexity and realism of the synthetic environment. This enables the targeted creation of training datasets optimized for specific vision challenges, significantly improving the accuracy and reliability of the resulting computer vision systems.

Static Camera Capture
Fully-Randomized Capture
Constrained-Randomized Capture
Sequential (Path-Oriented) Capture
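The four capture modes listed above can be illustrated with a small sketch that generates camera positions for each one. This is an interpretation of the mode names, not the SORDI.ai implementation: all function names, parameters, and the specific constraint used (a fixed distance around a target with a bounded height) are assumptions chosen for illustration.

```python
import math
import random


def static_capture(position, n_frames):
    """Static: the camera stays at one fixed position for every frame."""
    return [position for _ in range(n_frames)]


def fully_randomized_capture(bounds, n_frames, rng):
    """Fully randomized: sample each position anywhere inside the scene bounds."""
    (x0, x1), (y0, y1), (z0, z1) = bounds
    return [
        (rng.uniform(x0, x1), rng.uniform(y0, y1), rng.uniform(z0, z1))
        for _ in range(n_frames)
    ]


def constrained_randomized_capture(target, radius, height_range, n_frames, rng):
    """Constrained: random angle around a target, fixed distance, bounded height."""
    tx, ty, _ = target
    poses = []
    for _ in range(n_frames):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        z = rng.uniform(*height_range)
        poses.append((tx + radius * math.cos(angle), ty + radius * math.sin(angle), z))
    return poses


def sequential_capture(waypoints, steps_per_segment):
    """Sequential: linearly interpolate positions along a path of waypoints."""
    poses = []
    for start, end in zip(waypoints, waypoints[1:]):
        for step in range(steps_per_segment):
            t = step / steps_per_segment
            poses.append(tuple(a + t * (b - a) for a, b in zip(start, end)))
    poses.append(tuple(waypoints[-1]))
    return poses


rng = random.Random(0)  # seeded so a capture run can be reproduced
scene_bounds = ((0.0, 10.0), (0.0, 10.0), (1.0, 3.0))
static_poses = static_capture((5.0, 5.0, 2.0), 3)
random_poses = fully_randomized_capture(scene_bounds, 5, rng)
orbit_poses = constrained_randomized_capture((5.0, 5.0, 0.0), 2.0, (1.0, 2.0), 5, rng)
path_poses = sequential_capture([(0.0, 0.0, 1.0), (2.0, 0.0, 1.0)], 2)
print(len(static_poses), len(random_poses), len(orbit_poses), len(path_poses))
```

The sketch covers camera position only. A real capture setup would also control orientation, lens parameters, and lighting.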

Compare SORDI.ai Multimodal Image Outputs


Point Cloud 3D Annotations

Real-world 3D sensor data is often sparse, noisy, and notoriously difficult to label. We eliminate these challenges by providing pristine, synthetic point clouds that serve as the definitive 3D ground truth. Our data captures the geometric intricacies of industrial environments with perfect accuracy. Each point is meticulously labeled with its object class and instance, empowering developers to accelerate their training cycles and build AI models that can reliably understand and interact with complex physical spaces.
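As a rough illustration of what per-point class and instance labels make possible, the sketch below groups labeled points by instance and derives an axis-aligned 3D bounding box for each one. The tuple layout, the function name `instance_boxes`, and the sample points are assumptions made for this example. They do not reflect the actual SORDI.ai point cloud format.

```python
from collections import defaultdict

# Each point: (x, y, z, class_name, instance_id). Values are made up for illustration.
points = [
    (0.0, 0.0, 0.0, "klt_box", 1),
    (0.4, 0.3, 0.2, "klt_box", 1),
    (2.0, 1.0, 0.0, "klt_box", 2),
    (2.5, 1.2, 0.3, "klt_box", 2),
    (5.0, 5.0, 0.0, "stillage", 3),
    (6.0, 5.8, 1.0, "stillage", 3),
]


def instance_boxes(labeled_points):
    """Group points by instance and return an axis-aligned 3D box per instance."""
    grouped = defaultdict(list)
    for x, y, z, class_name, instance_id in labeled_points:
        grouped[(class_name, instance_id)].append((x, y, z))
    boxes = {}
    for key, coords in grouped.items():
        xs, ys, zs = zip(*coords)
        boxes[key] = ((min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs)))
    return boxes


boxes = instance_boxes(points)
for (class_name, instance_id), (lo, hi) in sorted(boxes.items()):
    print(class_name, instance_id, lo, hi)
```

Because every point already carries its instance, the boxes follow directly from the labels with no clustering or manual annotation step.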

Industrial Use Cases


SORDI Generative .ai Pipelines!

AI-Generated SORDI 3D Assets and Textures
AI-Generated Scene Creation and Update
AI-Generated Photorealistic Datasets


Success Story

SORDI.ai for Visual Inspection

SORDI.ai can be used to detect missing or incorrectly colored stitches in leather products and to automate visual inspection for valid combinations of leather and stitching color. To showcase this, we added a large number of images of leather products with different stitching patterns, textures, and colors, making it possible to train machine learning algorithms that identify missing or incorrect stitches automatically. As a result, SORDI.ai enables manufacturers to ensure that their products are of high quality and meet the required standards.


Layout Generation

Layout Tree for Scene Creation

Scene Construction & Data Generation
