Audit Dataset
After the annotation process, review every frame or image in the dataset that was auto-annotated. This step, known as the audit process, is crucial for verifying that the dataset has been properly annotated and is ready for training.
To view the dataset and the annotations, click on the dataset gallery.
The auditing step may require adding new annotations for objects that were missed during the AGTG process, or removing annotations for objects that were improperly annotated. For annotations that need only minor adjustments, EdgeFirst Studio provides features for adjusting them. Follow the links provided for further instructions on each of these features.
Once you have audited the dataset and verified that it's properly annotated, split the dataset into training and validation groups.
Split Dataset
Partitioning the dataset reserves one portion of the samples for training and another for validation, which is used to assess the performance of the model. By default, EdgeFirst Studio assigns 80% of the samples to training and 20% to validation. The operation randomly shuffles the samples before assigning them to the specified groups.
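The shuffle-then-partition behavior described above can be illustrated with a minimal Python sketch. This is not EdgeFirst Studio code or its API; the function name `split_dataset`, the `seed` parameter, and the `frame_NNNN` sample IDs are hypothetical and exist only for illustration.

```python
import random


def split_dataset(sample_ids, train_fraction=0.8, seed=None):
    """Shuffle sample IDs, then partition them into two groups.

    Hypothetical helper for illustration only; it mimics a random
    80/20 split and is not how EdgeFirst Studio is implemented.
    """
    if not 0.0 <= train_fraction <= 1.0:
        raise ValueError("train_fraction must be between 0 and 1")

    # Copy the input so the caller's list is left untouched.
    shuffled = list(sample_ids)
    # A dedicated Random instance makes the shuffle reproducible per seed.
    random.Random(seed).shuffle(shuffled)

    # The first part goes to training, the remainder to validation.
    n_train = round(len(shuffled) * train_fraction)
    return {
        "Training": shuffled[:n_train],
        "Validation": shuffled[n_train:],
    }


# Example: 100 made-up frame IDs split with the default 80/20 ratio.
sample_ids = [f"frame_{i:04d}" for i in range(100)]
groups = split_dataset(sample_ids, seed=0)
print(len(groups["Training"]), len(groups["Validation"]))  # 80 20
```

Shuffling before slicing matters for sequential data such as video frames: without it, the validation group would contain only the last frames of the recording rather than a random sample of the whole dataset.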
Warning
The dataset needs to be re-split whenever new sample images or frames are added to the dataset. Newly added samples are not automatically added to any group that already exists.
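To see why a re-split is needed, it can help to think of groups as a fixed assignment made at split time. The sketch below is an assumption-laden illustration, not EdgeFirst Studio code: the helper `find_unassigned` and the sample IDs are hypothetical, and the group contents are made up.

```python
def find_unassigned(sample_ids, groups):
    """Return the samples that belong to no group, in dataset order.

    Hypothetical helper for illustration only.
    """
    assigned = set()
    for members in groups.values():
        assigned.update(members)
    return [s for s in sample_ids if s not in assigned]


# Groups created by an earlier split of five samples.
groups = {
    "Training": ["frame_0001", "frame_0002", "frame_0003", "frame_0004"],
    "Validation": ["frame_0005"],
}

# Two samples were added to the dataset after the split.
dataset = [
    "frame_0001", "frame_0002", "frame_0003", "frame_0004", "frame_0005",
    "frame_0006", "frame_0007",
]

unassigned = find_unassigned(dataset, groups)
print(unassigned)  # ['frame_0006', 'frame_0007']
```

In this sketch the two new samples would be used for neither training nor validation until the dataset is split again.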
Consider the following dataset without any groups reserved.
To create the dataset groups, click on the "+" button in the "Groups" field.
This opens a dialog for specifying the percentage of samples assigned to the "Training" group and to the "Validation" group. By default, 80% of the samples are dedicated to training and the remaining 20% to validation.
Once the groups are specified, click "Split" to create them. This automatically divides the samples in the dataset according to the percentage specified for each group.