Dataset Tutorials

This guide walks through the dataset workflows in EdgeFirst Studio, from data capture to annotation and dataset management (curation).

Capture/Record Data

This tutorial provides a high-level overview of recording data using an EdgeFirst Platform. For an in-depth walkthrough, please refer to the MCAP Recording Service.

In your browser, navigate to https://<hostname>/.

Note

Replace <hostname> with the hostname of your device.
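
If the page does not load, you can quickly check that the device is reachable from your PC. A minimal sketch in Python using the requests library; the hostname is a placeholder, and certificate verification is disabled on the assumption that the device serves HTTPS with a self-signed certificate:

```python
import requests

# Placeholder -- replace with your device's actual hostname.
hostname = "my-edgefirst-device.local"

# verify=False assumes a self-signed certificate on the device;
# requests will print an InsecureRequestWarning in that case.
response = requests.get(f"https://{hostname}/", verify=False, timeout=5)
print(response.status_code)  # 200 means the Web UI is reachable
```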

You will be greeted by the Web UI Service page.

WebUI Service Page

To record data, click on the MCAP Recorder service highlighted in red above. Once clicked, you will be greeted with the MCAP Recording Service page.

MCAP Recording Page

To start recording, enable the Recording toggle indicated above; to stop recording, disable the same toggle.

For more information on managing recordings, please see the Managing Recordings Tutorial.

Download Recorded Data

This tutorial shows how to download the MCAP data recorded in the Capture/Record Data tutorial. For more information on downloading MCAPs, please see Downloading and Analysis.

Recordings appear in the list of MCAP files, from which they can be downloaded to your PC.

Recorded MCAP
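
Once downloaded, you can inspect the recording before uploading it. A minimal sketch using the open-source mcap Python package (pip install mcap); the filename is a placeholder:

```python
from mcap.reader import make_reader

# Placeholder filename -- use the MCAP file downloaded from the device.
with open("recording.mcap", "rb") as f:
    reader = make_reader(f)
    summary = reader.get_summary()
    if summary is None:
        raise SystemExit("file has no summary section")
    # List the topics recorded by the device and their encodings.
    for channel in summary.channels.values():
        print(channel.topic, channel.message_encoding)
```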

Upload Recorded Data to EdgeFirst Studio

This tutorial shows how to upload the MCAP recording downloaded in the Download Recorded Data tutorial. For uploading EdgeFirst Datasets, please see the instructions for Upload from Zip/Arrow File.

In EdgeFirst Studio, select Data Snapshots under the tool options.

Data Snapshots

Note

A project intended for object detection has already been created. This step is covered in Getting Started.

Once you are on the Data Snapshots page, upload the recorded MCAP by clicking FROM FILE, which opens a file dialog for selecting the MCAP downloaded to your PC.

Upload MCAP

Once the MCAP file is selected, the upload to EdgeFirst Studio begins. The upload may take several minutes depending on the size of the MCAP. Once it completes, the status is shown as in the figure on the right.

Upload Progress | Upload Complete

Data -> Dataset: Annotating Data

This tutorial shows how to annotate the MCAP recording uploaded in the Upload Recorded Data tutorial.

Auto Annotations via Snapshot

To reduce the manual effort required to annotate the data, part of this process runs auto-annotation on the uploaded data.

To run auto-annotations on the recorded data, click Restore on the uploaded snapshot.

Restore Snapshot

Specify the following fields, adjusting them for your own use case.

Restore Snapshot Fields

Once specified, click RESTORE SNAPSHOT to start the auto-annotation process.

Restore Process

The progress will be shown on the dataset specified in the project.

Restore Progress

Once completed, the dataset will now contain annotations that resulted from the auto-annotation process.

Restored Dataset

Next, navigate to the dataset's gallery by clicking the gallery button highlighted in red to visualize the annotations. The figure below shows a side-by-side view of the annotations for frames 1-3. The annotations for "people" are shown as both segmentation masks and bounding boxes.

Annotations for Frames 1-3

Another method for running auto-annotations is the propagation feature in the gallery. This feature preloads all frames of a video sequence in the dataset into SAM-2, which generates segmentation masks, 2D bounding boxes, and 3D bounding boxes (Raivin/LiDAR only) by tracking each object across the frames.
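
Under the hood, this kind of propagation follows the standard SAM-2 video workflow: initialize tracking state over a range of frames, add a box prompt, then propagate. A minimal sketch using the open-source facebookresearch/sam2 package; the config, checkpoint, and frame-directory paths are placeholders, a CUDA device is assumed, and this is an illustration rather than what Studio runs internally:

```python
import torch
from sam2.build_sam import build_sam2_video_predictor

# Placeholder paths -- download a SAM-2 checkpoint and its matching config.
predictor = build_sam2_video_predictor(
    "configs/sam2.1/sam2.1_hiera_l.yaml", "checkpoints/sam2.1_hiera_large.pt"
)

with torch.inference_mode():
    # Load the frame range to track (a directory of JPEG frames);
    # this corresponds to "Initialize State" in the gallery.
    state = predictor.init_state(video_path="frames_dir")

    # Box prompt (x_min, y_min, x_max, y_max) on the first frame,
    # analogous to drawing the white bounding box in the UI.
    predictor.add_new_points_or_box(state, frame_idx=0, obj_id=1,
                                    box=[210, 150, 340, 420])

    # Track the object across the remaining frames ("Propagate").
    for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
        masks = (mask_logits > 0.0).cpu().numpy()  # one boolean mask per object
```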

Start by enabling an AI Assisted Ground Truth server by navigating to the Cloud Instances under the tool options.

Cloud Instances

Launch a new server to host the auto-segmentation backend.

Start a Server

Warning

This server consumes credits while running. It will auto-terminate after 15 minutes of inactivity; otherwise, once you have completed your annotations, please be sure to terminate the server to avoid spending more of your credits.

Select AI Server

Terminate AI Server

Next navigate back to the dataset gallery and enable edit mode.

Select the Video Segment Tool

Click on the Video Segment Tool as indicated in red above, then click "Initialize State". This loads the frames from the indicated start frame (current) to the stop frame (end) into SAM-2 for tracking the object across those frames for auto-annotation.

Initialize the Video State

Once the state has been initialized, additional options appear for providing prompts to SAM-2 for propagation. Start by selecting the box tool as indicated in red; this lets you draw a bounding box prompt for the initial annotation to propagate.

Select the Box Tool

Now draw a bounding box prompt (white) around the object to annotate; in this case, the person in the frame. Once the prompt is drawn, the segmentation mask is generated along with the associated bounding box for the mask (yellow). Next, click "Propagate" to propagate this annotation (mask and bounding box) across the frames using SAM-2 tracking.

Initial Annotation

This starts the propagation across the specified frames.

Propagation Progress

Once the propagation is completed, click "Save Pending Segmentations" to save the propagated annotations.

Save Pending Segmentations

For cases where the object exits and then re-enters the frame, the object might not be tracked properly. Repeat the steps as necessary to annotate objects that were missed.

Repeat Propagation

A completed propagation will show the annotations with masks and bounding boxes for subsequent frames as follows.

Annotation 1 | Annotation 2 | Annotation 3

Audit 2D Annotations

This step involves verifying the outputs of the auto-annotation process and correcting the 2D annotations where necessary to produce a fully annotated dataset.

Some annotations were missed by the auto-annotation process; these errors can be corrected with the auto-segment tool. Start by enabling an AI Assisted Ground Truth server by navigating to Cloud Instances under the tool options.

Cloud Instances

Launch a new server to host the auto-segmentation backend.

Start a Server

Navigate back to the dataset gallery and enable edit mode.

Edit Mode

Select the AI Image Segment Tool and then enable the SAM Box Tool.

Auto Segment Mode

Draw a bounding box around the person that was missed and then click CREATE ANNOTATION to create the drawn segmentation mask. Click SUBMIT to accept the annotation.

Segment Tool
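
The box-prompted mask generation here is analogous to SAM-2's single-image predictor. A minimal sketch using the open-source sam2 package with a Hugging Face-hosted checkpoint; the image path and box coordinates are placeholders, and this illustrates the idea rather than Studio's internal implementation:

```python
import numpy as np
import torch
from PIL import Image
from sam2.sam2_image_predictor import SAM2ImagePredictor

predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-large")
image = np.array(Image.open("frame_0001.jpg").convert("RGB"))  # placeholder

with torch.inference_mode():
    predictor.set_image(image)
    # Box prompt (x_min, y_min, x_max, y_max) around the missed person.
    masks, scores, _ = predictor.predict(box=np.array([120, 80, 260, 400]))

best_mask = masks[np.argmax(scores)]  # keep the highest-scoring candidate mask
```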

To add a plain bounding box, select the Box Tool and draw a bounding box annotation around the person that was missed. Click SUBMIT to accept the annotation.

Box Tool

Part of the audit process is going over each sample in the dataset and correcting any missed or incorrect annotations.

Audit 3D Annotations

This step involves verifying the outputs of the auto-annotation process and correcting the 3D bounding box annotations where necessary to produce a fully annotated dataset.

First navigate to the gallery and enable edit mode.

Edit Mode

Ensure the point clouds and the 3D bounding box annotations are toggled visible.

Visible 3D Annotations

Scale 3D Annotation

The problem with the current annotation is that the bounding box is not scaled properly. Click the option on the left sidebar, highlighted in red, to enable 3D bounding box scaling.

Scale 3D Annotations

Click on the 3D bounding box to scale it; this provides handles for scaling the box along each of the three axes.

Scaling 3D Annotations

The 3D bounding box below has been adjusted so that it is properly scaled to the object's LiDAR point cloud.

Scaled YZ Plane | Scaled XY Plane | Scaled XZ Plane

Translate 3D Annotation

Next, the adjusted 3D bounding box needs to be translated into place. Click the option on the left sidebar, highlighted in red, to enable 3D bounding box translation.

Translate 3D Annotations

Similar to the scaling workflow, move the three axis handles to translate the bounding box along each axis.

Translate YZ Plane | Translate XY Plane | Translate XZ Plane
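
Both operations reduce to simple edits of the box parameters. A minimal sketch using one common 3D box representation (center, extents, yaw); the field names and values are illustrative, not the EdgeFirst annotation schema:

```python
from dataclasses import dataclass, replace

@dataclass
class Box3D:
    cx: float; cy: float; cz: float  # center in world coordinates (m)
    dx: float; dy: float; dz: float  # extent along each axis (m)
    yaw: float = 0.0                 # rotation about the vertical axis (rad)

box = Box3D(cx=4.2, cy=-0.5, cz=0.9, dx=0.6, dy=0.6, dz=1.8)

# Scaling stretches the extents while the center stays fixed.
scaled = replace(box, dx=box.dx * 1.2, dz=box.dz * 1.1)

# Translation shifts the center while the extents stay fixed.
translated = replace(scaled, cx=scaled.cx + 0.3, cy=scaled.cy - 0.1)
print(translated)
```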

Once the 3D bounding box annotation is properly oriented, click "SUBMIT" to save the changes.

Submit 3D Annotations

Add 3D Annotation

To add a missing 3D bounding box, click on the option on the left sidebar to add a new 3D bounding box annotation as highlighted in red.

Add 3D Annotations

Now click on the grid to add a new 3D bounding box at the clicked position.

Added 3D Annotations

This newly added 3D bounding box may not be scaled or translated correctly. Follow the instructions for scaling and translating a 3D bounding box to center it around the object's LiDAR point cloud as shown below. Once the annotation is properly scaled and translated, click "SUBMIT" to save it.

Submit 3D Annotations

Viewing Datasets

This tutorial will show how to view the contents in the dataset.

On the project's page, click the dataset button highlighted in red to view the datasets contained in the project.

View Datasets

You will now see the datasets contained in the project. Each dataset has a gallery. To see the images in the gallery, open the gallery by clicking the gallery button highlighted in red.

Gallery Button

When you click the gallery button, you will see either the images in the dataset (for image-based datasets) or the sequences (for sequence-based datasets).

For sequence-based datasets, select the sequence you would like to view by clicking on it.

Note

The Raivin Pedestrians (ultra-short range) 2025.03 dataset has a single sequence.

Dataset Sequence

When the sequence is clicked, you will see the frames stored in the sequence along with their annotations.

Dataset Sequence

Verifying Datasets

This tutorial will show an example of a dataset that is ready for training.

Verify that the dataset has a training and validation split. The sample dataset shown below has dedicated splits for training (20066 samples) and validation (2229 samples), roughly a 90/10 split.

Fusion Dataset Groups

Another sample dataset, shown below, is for training Vision models and has dedicated splits for training (1656 samples) and validation (184 samples), again roughly a 90/10 split.

Vision Dataset Groups

Verify the contents of the dataset and the annotations by clicking the button that navigates to the gallery. This shows the contents of the dataset, which may comprise multiple sequences as shown below.

Fusion Dataset Sequences

Vision Dataset Sequences

Clicking on any of these sequences will open individual images in the sequence with the visualizations of the annotations. For more information please see Viewing Datasets above.

Info

Datasets that train Fusion models provide world annotations of the object's 3D bounding box. For more information on the dataset annotations, please see EdgeFirst Dataset Format.

Fusion Annotations

Info

Datasets that train Vision models provide image annotations of the object's 2D bounding box and segmentation mask. For more information on the dataset annotations, please see EdgeFirst Dataset Format.

Vision Annotations

For cases where the annotations need corrections, please see Audit 2D Annotations for more details.

Creating Datasets

This tutorial will show how to create an empty dataset container in EdgeFirst Studio. This container is needed for copying or combining datasets as shown in the next sections.

To create a dataset, first select the project to store the new dataset. Next click the dataset button (highlighted in red) to view the datasets in that selected project.

Dataset Button

Next create a new dataset by clicking the "NEW DATASET" button highlighted in red on the top right.

Create Dataset Button

Provide the dataset name and description for this new dataset. In this example the name is the same as the original dataset source. Once the fields are filled, click the "CREATE" button on the bottom left of the dialog.

Create Dataset Fields

Once created, define an annotation set. An annotation set is a container for storing the dataset's annotations. To create one, click the "+" button in the "Annotation Sets" field.

Create Annotation Set

Next provide the name and description for the annotation container as shown below. Once provided, click "CREATE NEW SET" to create the annotation set.

Annotation Set Fields

You have now created a dataset and an annotation set container as shown below. This container can be used to store copied or combined datasets.

Created Dataset

Copying Datasets

This tutorial will show how to copy the dataset to a different container.

To copy a dataset, first create a dataset container. Once created, select "Copy Dataset" from the dataset options on the newly created dataset container, as shown below.

Copy Dataset

This opens a new dialog for specifying the source dataset and the destination dataset. The source is the original dataset and the destination is the dataset container that was just created. The options specified for this example are shown below.

Copy Dataset Options

The options above specify the public dataset "Raivin Ultra Short 2025.03" inside the public project "Sample Project" as the source. The destination is the dataset and annotation set containers that were just created. Once the options are specified, click "APPLY" to start the copy process.

Copy Dataset Process

Once the copy process completes, the frames and annotations will have been copied to the destination dataset.

Original Dataset | Copied Dataset

Combining Datasets

The process of combining datasets consists of multiple copy operations into a single dataset container. To combine datasets, first create a dataset container, then follow the process for copying a dataset for each source dataset, using the newly created container as the destination each time. Each copy adds the selected dataset to the same container, thereby combining multiple datasets.

Splitting Datasets

A proper dataset has samples reserved for training and validation. This tutorial will show how to split the samples in the dataset into training and validation groups.

Consider the following dataset without any groups reserved.

No Groups

To create the dataset groups, click on the "+" button in the Groups field.

Add Groups

This opens a new dialog for listing the groups needed and the percentage of samples dedicated to each group. Typically the groups "train" and "val" are created, but you are free to specify your own groups.

Groups Field

Once the groups are specified, click ADD GROUPS to create them. The samples in the dataset are automatically divided based on the percentage specified for each group.

Dataset Groups
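
Conceptually, the automatic division is a random split by percentage. A minimal sketch of that idea (the sample IDs and percentages are placeholders, and Studio's exact assignment strategy may differ):

```python
import random

sample_ids = list(range(1840))            # placeholder sample IDs
percentages = {"train": 0.90, "val": 0.10}

random.shuffle(sample_ids)                # randomize the assignment
groups, start = {}, 0
for name, pct in percentages.items():
    end = start + round(pct * len(sample_ids))
    groups[name] = sample_ids[start:end]
    start = end

print({name: len(ids) for name, ids in groups.items()})
# e.g. {'train': 1656, 'val': 184}
```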

Importing Datasets

This tutorial will show how to import an external dataset into EdgeFirst Studio, using COCO128 as an example. For importing EdgeFirst Datasets, please see the instructions for Upload from Zip/Arrow File.

To import a dataset, first create a dataset container. The following dataset was created with the name "Coco128" and the description "Demo import". An annotation set called "Ground Truth" has also been created.

COCO128 Dataset Container

As an example dataset, COCO128 was downloaded using the link provided. The download is a ZIP archive, which can then be extracted.

Once a container has been created, open the dataset options denoted by the three vertical dots on the top right corner of the dataset card.

Dataset Options

Select "Import".

Import Option

This opens a new window for specifying the dataset to import. In these options, set the "Import Type" to "Darknet", specify the extracted "coco128" folder as the dataset to import, and set the annotation set to "Ground Truth". The following figure shows these settings.

Import Options

The "coco128" dataset that was specified contains the "images" and "labels" subdirectories.

COCO128
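
For reference, Darknet-format labels are plain-text files in the "labels" tree, one per image, where each line is "class_id x_center y_center width height" with coordinates normalized to [0, 1]. A minimal sketch that parses one label file and converts each row to pixel coordinates (the file path and image size are placeholders):

```python
from pathlib import Path

img_w, img_h = 640, 480  # placeholder image dimensions

label_file = Path("coco128/labels/train2017/000000000009.txt")  # placeholder
for line in label_file.read_text().splitlines():
    class_id, xc, yc, w, h = line.split()
    xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
    # Convert normalized center/size to pixel corner coordinates.
    x_min = (xc - w / 2) * img_w
    y_min = (yc - h / 2) * img_h
    print(class_id, round(x_min), round(y_min), round(w * img_w), round(h * img_h))
```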

Select "START IMPORT" at the bottom right to start the import process.

Start Import

This will start the import process as shown.

Import Process

Once completed, refresh the page to see the changes. The dataset container will now contain 128 images from COCO with their annotations stored in the "Ground Truth" annotation set.

Imported COCO128 Dataset

To view the dataset, refer to the instructions provided in Viewing Datasets.

Exporting Datasets

Coming Soon