Skip to content

Dataset Management

This page will provide tutorials for managing datasets in EdgeFirst Studio.

View Dataset

This tutorial will show how to open the gallery of the dataset to see the individual samples in the dataset.

From the "Projects" page, you can click on the dataset button indicated in red to view the datasets contained in the project.

View Datasets
View Datasets

You will now see the datasets contained in the project. Each dataset has a gallery.

Dataset Card UI Breakdown
Dataset Card UI Breakdown

This dataset has a total of 1399 images and single label "coffeecup". It has two partitions; "train" and "val". There is a total of 1119 images for the training group ("train") and 280 images for the validation group ("val").

Click the image preview to view the dataset gallery. The dataset gallery will look like the following below.

Dataset Sequences
Dataset Sequences
Dataset Images
Dataset Images

This dataset will contain both sequences (videos) and images. Clicking on the sequences will provide video playback. Otherwise, clicking on images will expand the image view.

This dataset has a complete set of 2D annotations (masks and bounding boxes) of coffee cups.

Fast Annotations

This dataset was quickly annotated using the Automatic Ground Truth Generation (AGTG) feature of EdgeFirst Studio.

Edit Dataset Information

The dataset name and description can be edited by clicking on the dataset extended menu on the top right portion of the dataset card. This should bring up the options and "Edit Info" as the first in the list. Click on "Edit Info".

Edit Info
Edit Info

This will bring up the window to edit the dataset "Name" and the "Description". Once the changes are made, click "Apply Changes" to save the changes.

Edit Info Fields
Edit Info Fields

The changes should appear on the dataset card as shown below.

Edited Info
Edited Info

View Fusion Dataset

This tutorial will show an example of a dataset that is ready for training.

Verify that the dataset has a training and validation split. The sample dataset shown below has a dedicated split for training (20066 samples) and validation (2229 samples).

Dataset Groups
Dataset Groups

Another sample dataset shown below is for training Vision models which has a dedicated split for training (1656 samples) and validation (184 samples).

Dataset Groups
Dataset Groups

Verify the contents of the dataset and the annotations. Click the button that navigates to the gallery. This will show the contents of the dataset. The dataset may be comprised of multiple sequences as shown below.

Dataset Sequences
Dataset Sequences
Dataset Sequences
Dataset Sequences

Clicking on any of these sequences will open individual images in the sequence with the visualizations of the annotations. For more information please see viewing datasets.

Datasets that train Fusion models provide annotations of the object's 3D bounding box. For more information on the dataset annotations, please see the EdgeFirst Dataset Format.

Fusion Annotations
Fusion Annotations

Datasets that train Vision models provide image annotations of the object's 2D bounding box and segmentation mask. For more information on the dataset annotations, please see the EdgeFirst Dataset Format.

Vision Annotations
Vision Annotations

For cases where the annotations need corrections, please see Manual 2D Annotations or Manual 3D Annotations for more details.

Create Dataset

If you have a video recording or sample images for your dataset, you can create a dataset container in EdgeFirst Studio to contain your video frames or images and annotations.

Navigate to a web browser and login to EdgeFirst Studio. Once logged in to EdgeFirst Studio, navigate to your project. In this case the project name is "Object Detection". Click on the "Datasets" button that is indicated in red below.

Object Detection Project
Object Detection Project

This will bring you to the "Datasets" page of the selected project. Create a new dataset container by clicking the "New Dataset" button that is indicated in red.

New Dataset
New Dataset

Add the dataset and annotation container name, labels, and dataset description as indicated by the fields below. It is up to you to specify the information in the fields and you do not have to strictly follow the example shown below. Click the "Create" button once the fields have been filled.

Dataset Fields
Dataset Fields

Your created dataset will look as follows.

Created Dataset
Created Dataset

Copy Dataset

To copy a dataset, navigate to the dataset you would like to copy. On the dataset card, select "Copy Dataset" from the dataset options as shown below.

Copy Dataset
Copy Dataset

This will open a new dialog for the user to specify the dataset source and destination. The destination will be the location of the copied dataset. The source is the current location of the dataset. The source is set by default to the current dataset card you've selected. In the example below, the source is set to the "Coffee Cup" dataset from "Sample Project". The copied dataset will be placed as specified in the destination fields. By default a new dataset container will be created in the specified project. However, you can also create a dataset container before copying and specify this dataset container in the destination fields.

Once you have made your selection, click "Apply" at the bottom right to start the copy process.

Copy Dataset Options
Copy Dataset Options

You can navigate to the copied dataset by clicking the "Project" button and then clicking on the "Datasets" button on your project as shown below.

Navigate to your Project
Navigate to your Project

The progress for the dataset copy will be shown on the dataset card.

Copy Dataset Progress
Copy Dataset Progress

Once the copying process completes, the frames and the annotations would have been copied.

Original Dataset Copied Dataset
Original Copied

Tag Dataset

To maintain the current state of the dataset, tag the dataset with a version.

Click on the dataset options at the top right of the dataset card (three vertical dots). Then click the "History" button.

Tag Dataset Options
Tag Dataset Options

Add a new tagged version of the dataset by clicking the + green button on the right of the page as shown.

Tag Dataset Button
Tag Dataset Button

Specify the tag version and tag description. Click "Create Tag" to tag the dataset.

Tag Dataset Options
Tag Dataset Options

The new dataset tag will appear under "Version History" of this page. You can go back to the dataset card by clicking the "Back to Datasets" button.

Back to Datasets Button
Back to Datasets Button

Combine Datasets

The process of combining datasets consists of multiple copy processes on a given dataset container. To combine datasets, first create a dataset container. Follow the process for copying a dataset onto the destination dataset container that was created. The copy process will copy the selected dataset onto the same dataset container and thus combining multiple datasets.

Split Dataset

Partitioning the dataset is crucial in reserving dataset portions used for training and portions used for validation to assess the performance of the model. In EdgeFirst Studio, the partitions are 80% towards training and 20% towards validation. This operation randomly shuffles the data prior to assigning them to the specified groups.

Warning

The dataset needs to be re-split whenever new sample images or frames are added to the dataset. Newly added samples are not automatically added to any group that already exists.

Consider the following dataset without any groups reserved.

No Groups
No Groups

To create the dataset groups, click on the "+" button in the "Groups" field.

Add Groups
Add Groups

This will open a new dialog to specify the percentages of the partition belonging to the "Training" group or "Validation" group. By default 80% of the samples will be dedicated to training and 20% remaining will be dedicated towards the validation samples.

Groups Field
Groups Field

Once the groups are specified, click "Split" to create the groups. This will automatically divide the samples in the dataset based on the percentages of each group specified.

Dataset Groups
Dataset Groups

Export Dataset

Coming Soon