Model Metadata
This document describes the metadata schema embedded in EdgeFirst model files. Model metadata provides complete traceability for MLOps workflows and contains all information needed to decode model outputs for inference.
Overview
EdgeFirst models embed metadata that enables:
- Full Traceability: Link any deployed model back to its training session, dataset, and configuration in EdgeFirst Studio
- Self-Describing Models: Models contain all information needed for inference without external configuration files
- Cross-Platform Compatibility: Consistent schema across TFLite and ONNX formats
- Third-Party Integration: Any training framework can produce EdgeFirst-compatible models by following this schema
Supported Formats
EdgeFirst models from the Model Zoo (including ModelPack and Ultralytics) embed metadata in format-specific locations:
| Format | Metadata Location | Config Format | Labels |
|---|---|---|---|
| TFLite | ZIP archive (associated files) | edgefirst.json (preferred), edgefirst.yaml | labels.txt |
| ONNX | Custom metadata properties | edgefirst (JSON) | labels (JSON array) |
Supported Training Frameworks
| Framework | Decoder | Architecture | Use Case |
|---|---|---|---|
| ModelPack | modelpack | Anchor-based YOLO | Semantic segmentation, detection |
| Ultralytics | ultralytics | Anchor-free DFL (YOLOv5/v8/v11/v26) | Instance segmentation, detection |
Note
These metadata fields are automatically read and handled by edgefirst-validator and the EdgeFirst Perception Middleware. In most cases, developers don't need to worry about these details — the EdgeFirst ecosystem "Just Works." This documentation exists so developers understand what's happening under the hood when needed.
Traceability for Production MLOps
One of the most critical aspects of production ML systems is traceability — the ability to answer questions like:
- Where was this model trained?
- What dataset was used?
- What were the training parameters?
- Can I reproduce this model?
EdgeFirst metadata provides complete traceability through these key fields:
| Field | Location | Purpose |
|---|---|---|
| studio_server | host.studio_server | Full hostname of EdgeFirst Studio instance (e.g., test.edgefirst.studio) |
| project_id | host.project_id | Project ID for constructing Studio URLs |
| session_id | host.session | Training session ID for accessing logs, metrics, artifacts |
| dataset_id | dataset.id | Dataset identifier for reproducing training data |
| dataset | dataset.name | Human-readable dataset name |
Example Traceability Workflow
Given a deployed model, you can trace back to its origins:
# Extract metadata from deployed model
metadata = get_edgefirst_metadata(model_path)
# Construct EdgeFirst Studio URLs
studio_server = metadata['host']['studio_server'] # e.g., 'test.edgefirst.studio'
project_id = metadata['host']['project_id'] # e.g., '1123'
session = metadata['host']['session'] # e.g., 't-2110'
dataset_id = metadata['dataset']['id'] # e.g., 'ds-1c8'
# Note: Studio URL parameters require integer IDs. Metadata stores hex values
# with prefixes (t-, ds-). Convert by stripping the prefix and parsing as hex:
# 't-2110' -> int('2110', 16) -> 8464
# 'ds-1c8' -> int('1c8', 16) -> 456
# Access training session: https://{studio_server}/{project_id}/experiment/training/details?train_session_id={session_int}
# Example: https://test.edgefirst.studio/1123/experiment/training/details?train_session_id=8464
# Access dataset: https://{studio_server}/{project_id}/datasets/gallery/main?dataset={dataset_int}
# Example: https://test.edgefirst.studio/1123/datasets/gallery/main?dataset=456
# View training logs, metrics, and original configuration
This enables:
- Audit trails for regulatory compliance
- Debugging production issues by examining training data
- Reproducibility by re-running training with identical configuration
- Version control of model lineage through Model Experiments
Reading Metadata
TFLite Models
TFLite models are ZIP-format files containing embedded edgefirst.yaml and labels.txt:
import zipfile
import yaml
import json
from typing import Optional, List
def get_edgefirst_metadata(model_path: str) -> Optional[dict]:
"""Extract EdgeFirst metadata from a TFLite model."""
if not zipfile.is_zipfile(model_path):
return None
with zipfile.ZipFile(model_path) as zf:
# Try JSON first (preferred), then YAML fallback
for filename in ['edgefirst.json', 'edgefirst.yaml']:
if filename in zf.namelist():
with zf.open(filename) as f:
content = f.read().decode('utf-8')
if filename.endswith('.json'):
return json.loads(content)
else:
return yaml.safe_load(content)
return None
def get_labels(model_path: str) -> List[str]:
"""Extract class labels from a TFLite model."""
if not zipfile.is_zipfile(model_path):
return []
with zipfile.ZipFile(model_path) as zf:
if 'labels.txt' in zf.namelist():
with zf.open('labels.txt') as f:
content = f.read().decode('utf-8').strip()
return [line for line in content.splitlines()
if line.strip()]
return []
ONNX Models
ONNX models store metadata directly in the model's custom properties:
import onnx
import json
from typing import Optional, List
def get_edgefirst_metadata(model_path: str) -> Optional[dict]:
"""Extract EdgeFirst metadata from an ONNX model."""
model = onnx.load(model_path)
for prop in model.metadata_props:
if prop.key == 'edgefirst':
return json.loads(prop.value)
return None
def get_labels(model_path: str) -> List[str]:
"""Extract class labels from an ONNX model."""
model = onnx.load(model_path)
for prop in model.metadata_props:
if prop.key == 'labels':
return json.loads(prop.value)
return []
def get_quick_metadata(model_path: str) -> dict:
"""Get commonly-used fields without parsing full config."""
model = onnx.load(model_path)
result = {}
quick_fields = ['name', 'description', 'author', 'studio_server',
'session_id', 'dataset', 'dataset_id']
for prop in model.metadata_props:
if prop.key in quick_fields:
result[prop.key] = prop.value
elif prop.key == 'labels':
result['labels'] = json.loads(prop.value)
return result
ONNX Runtime Access
For inference applications using ONNX Runtime:
import onnxruntime as ort
import json
session = ort.InferenceSession(model_path)
metadata = session.get_modelmeta()
# Access custom metadata
custom = metadata.custom_metadata_map
edgefirst_config = json.loads(custom.get('edgefirst', '{}'))
labels = json.loads(custom.get('labels', '[]'))
# Access official ONNX fields
print(f"Producer: {metadata.producer_name}") # 'EdgeFirst ModelPack'
print(f"Graph: {metadata.graph_name}")
print(f"Description: {metadata.description}")
Metadata Schema
The EdgeFirst metadata schema is organized into logical sections. All sections are optional — third-party integrations can include only the sections relevant to their use case.
Complete Schema Structure
# Traceability & Identification
host:
studio_server: string # Full EdgeFirst Studio hostname (e.g., test.edgefirst.studio)
project_id: string # Project ID for Studio URLs
session: string # Training session ID
username: string # User who initiated training
dataset:
name: string # Human-readable dataset name
id: string # Dataset identifier
classes: [string] # List of class labels
# Model Identification (from training session)
name: string # Model/session name
description: string # Model description
author: string # Organization (typically "Au-Zone Technologies")
# Model Configuration (see ModelPack and Ultralytics documentation)
input:
shape: [int] # Input tensor shape (NCHW or NHWC depending on model)
cameraadaptor: string # Camera format (rgb, bgr, rgba, bgra, grey, yuyv)
input_channels: int # Channels from camera (3=RGB, 4=RGBA, 1=grey)
output_channels: int # Channels after CameraAdaptor transform
model:
backbone: string # Backbone architecture (e.g., cspdarknet19, cspdarknet53)
model_size: string # Size variant (nano, small, medium, large)
activation: string # Activation function (relu, relu6, silu)
detection: boolean # Detection task enabled
segmentation: boolean # Segmentation task enabled
classification: boolean # Classification task enabled
split_decoder: boolean # Whether decoder is external (see Split Decoder section)
anchors: [[[int, int]]] # Anchor boxes per output level
# ... additional model-specific parameters
# Training Configuration
trainer:
epochs: int # Number of training epochs
batch_size: int # Training batch size
weights: string # Pretrained weights source
checkpoint_path: string # Where checkpoints were saved
optimizer:
optimizer: string # Optimizer type (adam, adamw, sgd)
learning_rate: float # Base learning rate
weight_decay: float # L2 regularization strength
# ... additional optimizer parameters
augmentation: # See Vision Augmentations documentation
random_hflip: int # Horizontal flip probability (0-100)
random_mosaic: int # Mosaic augmentation probability
# ... additional augmentation parameters
validation:
iou: float # NMS IoU threshold
score: float # NMS score threshold
nms: string # NMS algorithm (none, numpy, hal, tensorflow, torch)
normalization: string # Input normalization (unsigned, signed)
preprocessing: string # Preprocessing method (resize, letterbox)
skip_validation_steps: int # Steps to skip between validations
export: # See Quantization documentation for ModelPack and Ultralytics
export: boolean # Whether model was quantized
export_input_type: string # Input quantization type
export_output_type: string # Output quantization type
calibration_samples: int # Samples used for calibration
# Decoder Configuration (Ultralytics only)
decoder_version: string # YOLO architecture version: yolov5, yolov8, yolo11, yolo26
nms: string # NMS mode for HAL decoder: class_agnostic, class_aware
# Output Specification (Critical for Inference)
outputs:
- name: string # Output tensor name
index: int # Tensor index
output_index: int # Output order
shape: [int] # Tensor shape
dshape: # Named dimensions as ordered array (see dshape section)
- batch: int
- height: int # For spatial outputs
- width: int # For spatial outputs
- num_features: int # For detection outputs
- num_boxes: int # For detection outputs
- padding: int # For detection outputs
- box_coords: int # For detection outputs
- num_classes: int # For detection outputs
- num_anchors_x_features: int # For detection outputs
- num_protos: int # For instance segmentation
dtype: string # Data type (float32, uint8, int8)
type: string # Semantic type (detection, segmentation, boxes, scores, masks, protos)
decode: boolean # Whether decoding is required
decoder: string # Decoder type: 'modelpack' or 'ultralytics'
quantization: [float, int] # [scale, zero_point] for quantized models
stride: [int, int] # Spatial stride for this output (ModelPack)
anchors: [[[float, float]]] # Normalized anchors for this output level (ModelPack only)
score_format: string # Score encoding: 'per_class' or 'obj_x_class' (Ultralytics only)
normalized: boolean # Box coordinates in [0,1] range (true) or pixels (false). Optional field.
Output Specification
The outputs section is critical for inference — it tells the runtime how to interpret model outputs.
Output Types
For Ultralytics framework models, the following output types are used:
| Type | Description | Typical Shape |
|---|---|---|
| detection | Raw detection output (needs to be split) | [1, num_features, num_boxes] |
| boxes | Split bounding boxes | [1, 4, num_boxes] |
| scores | Split class scores | [1, classes, num_boxes] |
| mask_coefficients | Split coefficients for instance segmentation | [1, num_protos, num_boxes] |
| protos | Instance segmentation prototypes | [1, H, W, num_protos] (NHWC) |
score_format field (Ultralytics only):
| Value | Description | Architecture |
|---|---|---|
| per_class | Each anchor outputs [nc] class probabilities directly | YOLOv8, YOLO11, YOLO26 |
| obj_x_class | Each anchor outputs [1 + nc] where final score = objectness × class confidence | YOLOv5 |
When score_format is absent, the validator falls back to a shape-based heuristic on the feature dimension: nc+5 features per anchor (4 box coordinates + 1 objectness + nc class probabilities) implies obj_x_class (e.g., [1, 85, 8400] for 80 classes).
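A minimal sketch of that fallback, assuming the output spec has already been parsed from the metadata into Python dicts (field names follow the schema in this document; the helper itself is illustrative):
def resolve_score_format(output_spec: dict, num_classes: int) -> str:
    """Return 'per_class' or 'obj_x_class' for an Ultralytics detection output."""
    fmt = output_spec.get('score_format')
    if fmt:
        return fmt
    # Fallback heuristic: nc+5 features per anchor implies objectness x class scoring
    dims = {k: v for entry in output_spec['dshape'] for k, v in entry.items()}
    return 'obj_x_class' if dims['num_features'] == num_classes + 5 else 'per_class'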
HAL Score Format
HAL determines score format from decoder_version rather than score_format. yolov5 applies objectness × class; all other versions use per-class scores directly.
For ModelPack framework models the following output types are used:
| Type | Description | Typical Shape |
|---|---|---|
| detection | Raw detection output (needs decoding) | [1, H, W, num_anchors_x_features] |
| boxes | Bounding boxes | [1, num_boxes, 1, 4] |
| scores | Class scores | [1, num_boxes, classes] |
| segmentation | Semantic segmentation output | [1, H, W, classes] |
| masks | Semantic segmentation masks | [1, H, W] |
Segmentation Types
EdgeFirst supports two distinct segmentation approaches:
Semantic Segmentation (ModelPack)
Per-pixel classification without object instances. Each pixel is assigned a class label, but individual objects are not distinguished.
Use cases:
- Drivable surface detection
- Lane segmentation
- Sky/ground separation
- Terrain classification
Output structure:
outputs:
- name: "segmentation_output"
type: segmentation
shape: [1, 480, 640, 5] # [batch, H, W, num_classes]
dshape:
- batch: 1
- height: 480
- width: 640
- num_classes: 5
decoder: modelpack
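Decoding such an output to a per-pixel class map is a single argmax over the class dimension, sketched here for the float NHWC layout shown above:
import numpy as np
# seg_output: [1, H, W, num_classes] scores from the model (dequantized if INT8)
class_map = np.argmax(seg_output[0], axis=-1)  # [H, W] integer class IDs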
Instance Segmentation (Ultralytics)
Per-pixel classification with object instances. Each detected object gets its own mask, enabling fine-grained object boundaries beyond bounding boxes.
Use cases:
- Individual person segmentation
- Vehicle instance masks
- Product segmentation
- Fine-grained object detection
Output structure:
# Detection output with mask coefficients
outputs:
- name: "detection_output"
type: detection
shape: [1, 116, 8400] # [batch, 4+nc+32, num_boxes] - includes 32 mask coefficients
dshape:
- batch: 1
- num_features: 116 # 4 box coords + 80 classes + 32 mask coefficients
- num_boxes: 8400
decoder: ultralytics
# Prototype masks for instance computation
- name: "protos_output"
type: protos
shape: [1, 32, 160, 160] # [batch, num_protos, H, W] NCHW
dshape:
- batch: 1
- num_protos: 32
- height: 160
- width: 160
decoder: ultralytics
Final mask computation:
# For each detected object with mask_coefficients [32]:
instance_mask = sigmoid(mask_coefficients @ protos) # [32] @ [32, H, W] → [H, W]
# Crop to bounding box region for final instance mask
The dshape Field
The dshape field provides named dimensions for easier interpretation of tensor shapes. This is especially useful when shapes vary between data layouts (NCHW vs NHWC).
outputs:
- name: "output_0"
shape: [1, 84, 8400] # Raw shape
dshape: # Named dimensions as ordered array
- batch: 1
- num_features: 84 # 4 box coords + 80 classes
- num_boxes: 8400
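When the metadata is parsed from YAML or JSON, dshape arrives as an ordered list of single-key mappings; a small flattening step (illustrative) gives convenient named lookup:
dims = {k: v for entry in output_spec['dshape'] for k, v in entry.items()}
num_boxes = dims['num_boxes']        # 8400
num_features = dims['num_features']  # 84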
Standard dimension names:
| Name | Description |
|---|---|
| batch | Batch size (typically 1 for inference) |
| height | Spatial height |
| width | Spatial width |
| num_classes | Number of classification classes |
| num_features | Feature dimension (box coords + classes + mask coefficients) |
| num_boxes | Number of detection boxes/anchors |
| num_protos | Number of prototype masks (instance segmentation) |
| num_anchors_x_features | Combined anchor and feature dimension for ModelPack grid outputs (anchors × features per anchor) |
| padding | Padding/alignment dimension used to satisfy expected tensor shapes. Must always be 1 |
| box_coords | The coordinates of the boxes. Must be 4 |
Decoding Information
For outputs with decode: true, the metadata provides all information needed to decode:
outputs:
- name: "detection_output_0"
type: detection
decode: true
decoder: modelpack
shape: [1, 40, 40, 54] # Grid output
dshape:
- batch: 1
- height: 40
- width: 40
- num_anchors_x_features: 54
anchors: # Normalized anchor boxes
- [0.054, 0.065]
- [0.089, 0.139]
- [0.195, 0.196]
quantization: [0.176, 198] # For dequantization
Quantization Parameters
For quantized models (TFLite INT8), each output includes quantization parameters:
# Dequantize output
scale, zero_point = output_spec['quantization']
float_output = (quantized_output - zero_point) * scale
Data Layout (NCHW vs NHWC)
Deep learning frameworks use different memory layouts for tensor data. The metadata accurately reflects each format's native layout:
| Format | Data Layout | Shape Convention | Example (batch=1, 640x640, RGB) |
|---|---|---|---|
| TFLite | NHWC | [batch, height, width, channels] | [1, 640, 640, 3] |
| ONNX | NCHW | [batch, channels, height, width] | [1, 3, 640, 640] |
Why This Matters
- TFLite (TensorFlow): Uses channels-last (NHWC) which is optimized for CPU and mobile inference
- ONNX (PyTorch-derived): Uses channels-first (NCHW) which is optimized for GPU and NPU inference
The metadata's outputs section reports shapes in the model's native format. When integrating with inference runtimes, ensure your input preprocessing matches the expected layout.
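For example, moving a preprocessed image between the two layouts is a single transpose (sketch):
import numpy as np
nhwc = np.zeros((1, 640, 640, 3), dtype=np.float32)  # channels-last (TFLite-style)
nchw = np.transpose(nhwc, (0, 3, 1, 2))              # [1, 3, 640, 640] channels-first (ONNX-style)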
Metadata Fields
input:
shape: [1, 640, 640, 3] # Input tensor shape (layout varies by model)
cameraadaptor: rgb # Channel order (rgb, bgr, yuyv)
# Common layouts:
# - NHWC: [batch, height, width, channels] e.g., [1, 640, 640, 3]
# - NCHW: [batch, channels, height, width] e.g., [1, 3, 640, 640]
outputs:
- name: "output_0"
shape: [1, 640, 640, 3] # TFLite: NHWC
# shape: [1, 3, 640, 640] # ONNX: NCHW
Input Preprocessing
EdgeFirst models expect specific input preprocessing. The metadata documents these requirements so inference pipelines can prepare data correctly.
Image Resizing
Models expect input images at the resolution specified in metadata. How images are resized depends on the training approach:
input:
shape: [1, 640, 640, 3] # NHWC example: [batch, height, width, channels]
# shape: [1, 3, 640, 640] # NCHW example: [batch, channels, height, width]
cameraadaptor: rgb # Expected color format
Native Aspect Ratio (typical for purpose-built datasets):
- ModelPack models are often trained at the camera's native aspect ratio
- Images are directly resized to target dimensions without padding
- Best accuracy when deployment camera matches training data
Letterbox (typical for diverse datasets like COCO):
- Used when training on images from diverse cameras and aspect ratios
- Image is scaled to fit within target size while maintaining aspect ratio
- Gray padding (value 114) added to reach exact dimensions
- Inference must apply same letterbox transform and account for padding offset in output coordinates
Example: A 1920x1080 image letterboxed to 640x640:
- Scaled to 640x360 (maintains 16:9 ratio)
- 140 pixels of padding added to top and bottom
- Output box coordinates must be adjusted to remove padding offset
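A minimal letterbox sketch using OpenCV (the function name and return values are illustrative, not a fixed EdgeFirst API); it also returns the scale and offsets needed to undo the transform on output boxes:
import cv2
import numpy as np
def letterbox(image, target=(640, 640), pad_value=114):
    """Resize preserving aspect ratio, then pad with gray to the target size."""
    h, w = image.shape[:2]
    th, tw = target
    scale = min(th / h, tw / w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    resized = cv2.resize(image, (nw, nh))
    top, left = (th - nh) // 2, (tw - nw) // 2
    canvas = np.full((th, tw, image.shape[2]), pad_value, dtype=image.dtype)
    canvas[top:top + nh, left:left + nw] = resized
    return canvas, scale, (left, top)  # offsets needed to map boxes back to the original image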
Pixel Normalization
Input pixels are normalized from [0, 255] to [0.0, 1.0]:
# Standard normalization
normalized = pixels.astype(np.float32) / 255.0
For quantized models (INT8), the quantization parameters handle the scaling internally — raw uint8 pixel values can often be used directly.
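A sketch of how an inference pipeline might branch on the input type using the TensorFlow Lite Interpreter (assumes image_batch has already been resized to the model's input shape and includes the batch dimension):
import numpy as np
import tensorflow as tf
interpreter = tf.lite.Interpreter(model_path=model_path)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
if inp['dtype'] == np.uint8:
    # Quantized input: feed raw uint8 pixels; the input (scale, zero_point) handles scaling
    interpreter.set_tensor(inp['index'], image_batch)
else:
    interpreter.set_tensor(inp['index'], image_batch.astype(np.float32) / 255.0)
interpreter.invoke()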
Camera Adaptor
The cameraadaptor field specifies the expected input format for the model. See Camera Adaptor for details on how this enables models to consume native camera formats without runtime conversion.
| Value | Description | Channel Order |
|---|---|---|
| rgb | Standard RGB | Red, Green, Blue |
| bgr | OpenCV default | Blue, Green, Red |
| rgba | RGB with alpha | Red, Green, Blue, Alpha |
| bgra | BGR with alpha | Blue, Green, Red, Alpha |
| grey | Greyscale | Single channel |
| yuyv | YUV 4:2:2 packed | For direct camera sensor input |
Validation Parameters
The validation section records the recommended settings based on how the model was trained. These parameters are informational preferences — they document the model author's intended configuration for validation and inference.
Parameter Semantics
| Parameter | Description | Default | Override at Runtime? |
|---|---|---|---|
| iou | NMS IoU threshold | 0.7 | Yes |
| score | NMS confidence score threshold | 0.001 | Yes |
| nms | NMS algorithm | (not set) | See below |
| normalization | Input pixel normalization | unsigned | Yes |
| preprocessing | Image preprocessing method | letterbox | Yes |
Most parameters (iou, score, normalization, preprocessing, and NMS algorithm choices like hal/tensorflow/numpy/torch) can be overridden at runtime based on deployment preferences.
Exception: nms: none must be respected because the model does not produce outputs compatible with external NMS. This applies to two cases:
- Architectural end-to-end models (e.g., YOLO26) — NMS is part of the model architecture via one-to-one matching heads. The model graph itself produces final predictions.
- Engine-embedded NMS — Models exported with NMS operations appended to the inference graph (ONNX, TensorRT, TFLite). NMS is not part of the original model architecture but was added during export or conversion.
Both produce post-NMS output in [x1, y1, x2, y2, conf, class, ...] format. Detection models output (1, max_det, 6). Segmentation models output (1, max_det, 6 + nm) plus prototype masks — the mask coefficients for NMS-selected detections are preserved, so only the mask decode step is needed externally (mask = sigmoid(coefficients @ prototypes)). Use --nms none (CLI) or validation.nms: none (metadata) for either case.
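A sketch of parsing such post-NMS output (names and the score threshold are illustrative; the detection tensor is assumed to be [1, max_det, 6 + nm] with protos of shape [1, nm, H, W]):
import numpy as np
def parse_embedded_nms(det, protos=None, score_thresh=0.25):
    """Split post-NMS rows [x1, y1, x2, y2, conf, class, m0..] and decode masks if present."""
    det = det[0]                                  # [max_det, 6 + nm]
    keep = det[:, 4] > score_thresh               # drop zero-padded / low-confidence rows
    boxes, scores = det[keep, :4], det[keep, 4]
    classes = det[keep, 5].astype(int)
    masks = None
    if protos is not None and det.shape[1] > 6:
        coeffs = det[keep, 6:]                    # mask coefficients survive NMS selection
        nm, mh, mw = protos.shape[1:]
        masks = 1.0 / (1.0 + np.exp(-(coeffs @ protos[0].reshape(nm, -1))))  # sigmoid
        masks = masks.reshape(-1, mh, mw)         # [N, H, W]; crop to boxes afterwards
    return boxes, scores, classes, masks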
Allowed nms Values
| Value | Description |
|---|---|
| none | No external NMS. For models with embedded NMS — either architectural end-to-end (YOLO26) or engine-embedded (ONNX/TRT/TFLite with NMS ops appended). Supports both detection and segmentation |
| numpy | NumPy-based NMS implementation (default fallback) |
| hal | EdgeFirst HAL decoder NMS |
| tensorflow | TensorFlow NMS |
| torch | PyTorch (torchvision) NMS |
When --override is set, the validator reads validation.nms from the model metadata and applies it automatically.
Box Coordinate Format (normalized)
The normalized field on detection and boxes outputs specifies the coordinate format:
| Value | Description | Coordinate Range |
|---|---|---|
| true | Normalized coordinates relative to model input dimensions | [0.0, 1.0] |
| false | Pixel coordinates relative to model input (letterboxed frame) | [0, width] / [0, height] |
| (absent) | Must be inferred from output values | Check if any coordinate > 1.0 |
When normalized is absent, the coordinate format must be inferred by examining the output values. If any bounding box coordinate exceeds 1.0, the coordinates are in pixels; otherwise, assume normalized.
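As a sketch, that check might look like the following (illustrative helper, not part of the schema):
import numpy as np
def boxes_are_normalized(output_spec: dict, boxes: np.ndarray) -> bool:
    """Use the metadata flag when present, else the value-based heuristic above."""
    if 'normalized' in output_spec:
        return output_spec['normalized']
    return not np.any(boxes > 1.0)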
Normalized coordinates are preferred because they:
- Don't require knowledge of model input resolution for downstream processing
- Quantize better (smaller dynamic range)
- Work consistently across different model input sizes
Pixel coordinates are typically used by:
- End-to-end models with embedded NMS (YOLO26, engine-embedded NMS)
- Models exported with specific output coordinate conventions
Note
Coordinates are always relative to the letterboxed model input, not the original image aspect ratio. The caller must apply the inverse letterbox transform to map boxes back to original image coordinates regardless of whether normalized is true or false.
# Example: End-to-end model with pixel coordinates
outputs:
- name: "output0"
type: detection
shape: [1, 100, 6] # [batch, max_det, x1+y1+x2+y2+conf+class]
normalized: false # Pixel coordinates
decoder: ultralytics
Post-Processing & Split Decoder
What is Split Decoder?
The split_decoder field indicates the model's outputs have been modified from the standard architecture to improve INT8 quantization performance. The details are framework-specific:
- ModelPack: Raw grid features instead of decoded boxes (dequantize before anchor decode)
- Ultralytics: Detection tensor split into separate boxes, scores, mask_coefficients tensors (per-tensor quantization)
model:
split_decoder: true # Outputs modified for quantization (see framework docs)
split_decoder: false # Standard output format
See framework documentation for implementation details.
Why Split Decoder Exists
Quantization introduces precision loss. Split decoder addresses this differently per framework:
ModelPack: For small objects or high-resolution inputs, applying anchor calculations in INT8 would compound rounding errors. By deferring decoding until after dequantization, we preserve box accuracy.
Ultralytics: The monolithic detection tensor contains boxes, scores, and mask coefficients with very different value ranges. Splitting them into separate tensors allows per-tensor quantization scales, preserving accuracy for each component independently.
Decoding Process
When split_decoder: true, the inference pipeline must:
- Run model inference → Get quantized outputs
- Dequantize outputs → Convert INT8 to float32 using per-output scale/zero_point
- Apply decoding → Framework-specific: anchor decode (ModelPack) or box format conversion (Ultralytics)
- Run NMS → Filter overlapping detections
# Example decoding flow for split_decoder models
for output_spec in metadata['outputs']:
if output_spec.get('decode', False):
# Dequantize first
scale, zp = output_spec['quantization']
raw_float = (raw_int8.astype(np.float32) - zp) * scale
# Then decode (framework-specific)
if output_spec['decoder'] == 'modelpack':
boxes = decode_yolo_grid(raw_float, output_spec['anchors'], output_spec['stride'])
elif output_spec['decoder'] == 'ultralytics':
# boxes, scores, mask_coefficients are separate outputs
pass # Use output type to determine handling
Output Types with Split Decoder
| Framework | split_decoder | Output Types |
|---|---|---|
| ModelPack | true | detection (raw grid, needs anchor decode) |
| ModelPack | false | boxes, scores (decoded) |
| Ultralytics | true | boxes, scores, mask_coefficients, protos |
| Ultralytics | false | detection, protos (monolithic) |
Decoder Field
The decoder field specifies which decoding algorithm to use:
outputs:
- name: "detection_output_0"
type: detection
decode: true
decoder: modelpack # Use ModelPack YOLO-style grid decoding
Supported Decoders
modelpack — Anchor-Based YOLO Decoder
Used by ModelPack models. Traditional YOLO-style grid decoding with pre-defined anchor boxes.
Characteristics:
- Anchor-based: Uses pre-defined anchor boxes per output level (3 anchors × 3 scales typical)
- Grid outputs: Raw features from detection grid cells
- Sigmoid activations: Applied to xy, wh, objectness, and class predictions
Decoding formula:
xy = (sigmoid(xy) * 2.0 + grid - 0.5) * stride
wh = (sigmoid(wh) * 2) ** 2 * anchors * stride * 0.5
xyxy = concat([xy - wh, xy + wh]) / input_dims # normalized xyxy
Required metadata fields:
outputs:
- decoder: modelpack
anchors: # Required - normalized anchor boxes
- [0.054, 0.065]
- [0.089, 0.139]
stride: [16, 16] # Required - spatial stride
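A NumPy sketch that mirrors the decoding formula above (illustrative only; the objectness × class score composition is a common convention and the ModelPack documentation remains the reference implementation):
import numpy as np
def decode_modelpack_grid(raw, anchors, stride, input_dims, num_classes):
    """raw: dequantized grid output [1, H, W, num_anchors * (5 + num_classes)].
    anchors: normalized anchors for this level; stride: [sx, sy];
    input_dims: (width, height) tuple of the model input."""
    _, gh, gw, _ = raw.shape
    na = len(anchors)
    sig = 1.0 / (1.0 + np.exp(-raw.reshape(gh, gw, na, 5 + num_classes)))  # sigmoid everything
    gy, gx = np.meshgrid(np.arange(gh), np.arange(gw), indexing='ij')
    grid = np.stack([gx, gy], axis=-1)[:, :, None, :]                      # [gh, gw, 1, 2]
    xy = (sig[..., 0:2] * 2.0 + grid - 0.5) * np.asarray(stride)
    wh = (sig[..., 2:4] * 2.0) ** 2 * np.asarray(anchors) * np.asarray(stride) * 0.5
    xyxy = np.concatenate([xy - wh, xy + wh], axis=-1) / np.asarray(input_dims * 2)  # normalized xyxy
    scores = sig[..., 4:5] * sig[..., 5:]                                  # objectness x class (assumption)
    return xyxy.reshape(-1, 4), scores.reshape(-1, num_classes)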
Deprecated: decoder: yolov8
The decoder value yolov8 is deprecated. Use ultralytics instead. Existing models with decoder: yolov8 will continue to work — the validator automatically normalizes yolov8 to ultralytics with a deprecation warning.
ultralytics — Anchor-Free DFL Decoder
Used by Ultralytics models (YOLOv5, YOLOv8, YOLO11, YOLO26). Modern anchor-free detection using Distribution Focal Loss (DFL).
Characteristics:
- Anchor-free: Uses anchor points (grid centers) instead of pre-defined boxes
- DFL regression: Converts 16-bin distribution to box coordinates
- Unified architecture: Same decoder for YOLOv5, YOLOv8, YOLO11, and YOLO26
Decoding formula:
# DFL converts 16-bin distribution to coordinate value
box = dfl(raw_box) # [batch, 64, anchors] → [batch, 4, anchors]
# dist2bbox converts LTRB distances to boxes
x1y1 = anchor_points - lt
x2y2 = anchor_points + rb
# Returns xywh or xyxy in pixel coordinates
Metadata structure:
outputs:
- decoder: ultralytics
anchors: null # Not used - anchor-free
# Strides are implicit: [8, 16, 32] for P3/P4/P5 outputs
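A NumPy sketch of the two steps in the formula above (illustrative; reg_max = 16 bins per box side is the Ultralytics default):
import numpy as np
def dfl(raw_box, reg_max=16):
    """[batch, 4 * reg_max, anchors] -> [batch, 4, anchors] expected LTRB distances."""
    b, _, a = raw_box.shape
    x = raw_box.reshape(b, 4, reg_max, a)
    x = np.exp(x - x.max(axis=2, keepdims=True))
    x /= x.sum(axis=2, keepdims=True)                                   # softmax over the bins
    return (x * np.arange(reg_max).reshape(1, 1, reg_max, 1)).sum(axis=2)  # expected value
def dist2bbox(distance, anchor_points):
    """LTRB distances around anchor points -> xyxy boxes in grid units;
    multiply by the per-level stride ([8, 16, 32]) to get pixel coordinates."""
    lt, rb = distance[:, :2], distance[:, 2:]
    return np.concatenate([anchor_points - lt, anchor_points + rb], axis=1)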
Version differences:
All Ultralytics versions use the same anchor-free Detect class. Differences are in backbone architecture:
| Version | Backbone Blocks | Classification Head |
|---|---|---|
| YOLOv5 | C3 | Conv→Conv→Conv2d |
| YOLOv8 | C2f | Conv→Conv→Conv2d |
| YOLO11 | C3k2, C2PSA | DWConv→Conv (efficient) |
| YOLO26 | C3k2, A2C2f | DWConv→Conv (efficient) |
Decoder Version Field
The decoder_version field specifies the YOLO architecture version for Ultralytics models. This field is critical for determining the correct decoding strategy, especially for end-to-end models.
decoder_version: yolo26 # End-to-end model with embedded NMS
# or
decoder_version: yolov8 # Traditional model requiring external NMS
Supported values:
| Value | Architecture | NMS Handling |
|---|---|---|
| yolov5 | YOLOv5 | External NMS required |
| yolov8 | YOLOv8 | External NMS required |
| yolo11 | YOLO11 | External NMS required |
| yolo26 | YOLO26 | Embedded NMS (end-to-end) |
Naming Convention
The naming follows Ultralytics conventions: yolov5 and yolov8 include the 'v' prefix, while yolo11 and yolo26 do not (Ultralytics dropped the 'v' starting with YOLO11).
When decoder_version is yolo26:
- The model uses one-to-one matching heads with NMS embedded in the architecture
- Output format is [x1, y1, x2, y2, conf, class, ...] (post-NMS)
- The HAL decoder uses end-to-end model types regardless of the nms field
- No external NMS is applied
When decoder_version is absent or any other value:
- Traditional YOLO architecture requiring external NMS
- The nms field controls which NMS algorithm the HAL decoder uses
HAL NMS Field
The nms field at the config root level controls the HAL decoder's NMS behavior:
nms: class_agnostic # Suppress overlapping boxes regardless of class (default)
# or
nms: class_aware # Only suppress boxes with the same class label
| Value | Behavior |
|---|---|
| class_agnostic | Suppress overlapping boxes regardless of class label (default) |
| class_aware | Only suppress boxes that share the same class AND overlap |
Different from validation.nms
The root-level nms field controls HAL decoder behavior (class-agnostic vs class-aware). The validation.nms field in the validation section specifies the NMS implementation to use during validation (hal, numpy, tensorflow, etc.) or none for models with embedded NMS.
Example configuration for YOLO26 end-to-end model:
decoder_version: yolo26
outputs:
- decoder: ultralytics
type: detection
shape: [1, 100, 6]
normalized: false
dshape:
- batch: 1
- num_boxes: 100
- num_features: 6
validation:
nms: none # Model has embedded NMS
Example configuration for traditional YOLOv8 model:
decoder_version: yolov8
nms: class_agnostic
outputs:
- decoder: ultralytics
type: detection
shape: [1, 84, 8400]
dshape:
- batch: 1
- num_features: 84
- num_boxes: 8400
validation:
nms: hal # Use HAL decoder NMS
ONNX-Specific Metadata
ONNX models exported from ModelPack or Ultralytics include additional official metadata fields:
| Field | ModelPack Value | Ultralytics Value | Purpose |
|---|---|---|---|
| producer_name | "EdgeFirst ModelPack" | "EdgeFirst Ultralytics" | Identifies producing framework |
| producer_version | Package version | Package version | Version tracking |
| graph.name | Model name | Model name | Graph identification |
| doc_string | Description | Description | Human-readable description |
Custom metadata properties (all string values):
| Key | Content | Purpose |
|---|---|---|
| edgefirst | Full config as JSON | Complete configuration |
| name | Model name | Quick access (no JSON parsing) |
| description | Model description | Quick access |
| author | Author/organization | Quick access |
| studio_server | Full hostname | Quick access for traceability |
| project_id | Project ID | Quick access for traceability |
| session_id | Session ID | Quick access for traceability |
| dataset | Dataset name | Quick access |
| dataset_id | Dataset ID | Quick access for traceability |
| labels | JSON array of labels | Class labels |
Third-Party Integration
Any training framework can produce EdgeFirst-compatible models by embedding the appropriate metadata.
Minimum Required Fields
For basic EdgeFirst Perception stack compatibility:
input:
shape: [1, 640, 640, 3] # Input tensor shape (NHWC or NCHW)
cameraadaptor: rgb
model:
detection: true
segmentation: false
split_decoder: true # or false if decoder is built-in
outputs:
- name: "output_0"
shape: [1, 8400, 84]
dtype: float32
type: boxes # or detection if needs decoding
decode: false
dataset:
classes:
- class1
- class2
Full Traceability (Recommended)
For production MLOps integration with EdgeFirst Studio:
host:
studio_server: test.edgefirst.studio
project_id: "1123"
session: t-2110 # Hex value, convert to int for URLs
dataset:
name: "My Dataset"
id: ds-xyz789
classes: [...]
name: "my-model-v1" # Model/session name
description: "Model for production deployment"
author: "My Organization"
Embedding Metadata in TFLite
Dependencies
This example requires the tflite-support and pyyaml packages:
pip install tflite-support pyyaml
from tensorflow_lite_support.metadata.python.metadata_writers import metadata_writer, writer_utils
from tensorflow_lite_support.metadata import metadata_schema_py_generated as schema
import yaml
from typing import List
import tempfile
import os
def add_edgefirst_metadata(tflite_path: str, config: dict, labels: List[str]):
"""Add EdgeFirst metadata to a TFLite model."""
# Write config and labels to temp files in a cross-platform way
with tempfile.TemporaryDirectory() as tmpdir:
config_path = os.path.join(tmpdir, 'edgefirst.yaml')
labels_path = os.path.join(tmpdir, 'labels.txt')
with open(config_path, 'w') as f:
yaml.dump(config, f)
with open(labels_path, 'w') as f:
f.write('\n'.join(labels))
# Create model metadata
model_meta = schema.ModelMetadataT()
model_meta.name = config.get('name', '')
model_meta.description = config.get('description', '')
model_meta.author = config.get('author', '')
# Load and populate
tflite_buffer = writer_utils.load_file(tflite_path)
writer = metadata_writer.MetadataWriter.create_from_metadata(
model_buffer=tflite_buffer,
model_metadata=model_meta,
associated_files=[labels_path, config_path]
)
writer_utils.save_file(writer.populate(), tflite_path)
Embedding Metadata in ONNX
Dependencies
This example requires the onnx package:
pip install onnx
import onnx
import json
from typing import List
def add_edgefirst_metadata(onnx_path: str, config: dict, labels: List[str]):
"""Add EdgeFirst metadata to an ONNX model."""
model = onnx.load(onnx_path)
# Set official ONNX fields
model.producer_name = 'My Training Framework'
model.producer_version = '1.0.0'
if config.get('name'):
model.graph.name = config['name']
if config.get('description'):
model.doc_string = config['description']
# Add custom metadata
metadata = {
'edgefirst': json.dumps(config),
'labels': json.dumps(labels),
'name': config.get('name', ''),
'description': config.get('description', ''),
'author': config.get('author', ''),
'studio_server': config.get('host', {}).get('studio_server', ''),
'project_id': str(config.get('host', {}).get('project_id', '')),
'session_id': config.get('host', {}).get('session', ''),
'dataset': config.get('dataset', {}).get('name', ''),
'dataset_id': str(config.get('dataset', {}).get('id', '')),
}
for key, value in metadata.items():
if value:
prop = model.metadata_props.add()
prop.key = key
prop.value = str(value)
onnx.save(model, onnx_path)
Updating Metadata
Updating TFLite Metadata
Since TFLite models are ZIP archives, you can update embedded files:
zip command
The zip command is available on most platforms but may need to be installed:
# Update edgefirst.yaml
zip -u mymodel.tflite edgefirst.yaml
# Update labels
zip -u mymodel.tflite labels.txt
# Add new files
zip mymodel.tflite edgefirst.json
Updating ONNX Metadata
import onnx
import json
model = onnx.load('mymodel.onnx')
# Update existing metadata
for prop in model.metadata_props:
if prop.key == 'description':
prop.value = 'Updated description'
# Add new metadata
prop = model.metadata_props.add()
prop.key = 'custom_field'
prop.value = 'custom_value'
onnx.save(model, 'mymodel.onnx')
Schema Reference
Host Section
The host section identifies the EdgeFirst Studio instance and training session that produced the model.
host:
studio_server: test.edgefirst.studio # Full EdgeFirst Studio hostname
project_id: "1123" # Project ID for Studio URLs
session: t-2110 # Training session ID (hex, prefix t-)
username: john.doe # User who initiated training
Converting IDs for Studio URLs
Session and dataset IDs in metadata use hexadecimal values with prefixes (t- for training sessions, ds- for datasets). To construct Studio URLs, strip the prefix and convert from hex to decimal:
- t-2110 → int('2110', 16) → 8464
- ds-1c8 → int('1c8', 16) → 456
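For example, a small helper (illustrative) for the conversion:
def studio_id_to_int(value: str) -> int:
    """Strip the 't-' / 'ds-' prefix and parse the remainder as hexadecimal."""
    return int(value.split('-', 1)[1], 16)
studio_id_to_int('t-2110')   # 8464
studio_id_to_int('ds-1c8')   # 456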
Dataset Section
The dataset section references the dataset used for training. See the Dataset Zoo for available datasets and Dataset Structure for format details.
dataset:
name: "COCO 2017" # Human-readable name
id: ds-abc123 # Dataset ID (prefix: ds-)
classes: # Ordered list of class labels
- background
- person
- car
Model Identification
Top-level fields for model identification, populated from the training session name and description.
name: "coffeecup-detection" # Model/session name (used in filename)
description: "Object detection model for coffee cups"
author: "Au-Zone Technologies" # Organization
Input Section
The input section specifies image preprocessing requirements. See Vision Augmentations for training-time augmentation configuration.
input:
shape: [1, 640, 640, 3] # Input tensor shape
cameraadaptor: rgb # rgb, rgba, yuyv, bgr
Data Layout
The shape field uses the model's native tensor layout. This can be either NHWC [batch, height, width, channels] or NCHW [batch, channels, height, width] depending on how the model was exported. While TFLite typically uses NHWC and ONNX typically uses NCHW, both formats can support either layout — always check the actual shape values.
Model Section
The model section captures architecture configuration. These parameters can be configured during training session setup in EdgeFirst Studio. See the ModelPack and Ultralytics documentation for detailed parameter descriptions.
# ModelPack model configuration
model:
backbone: cspdarknet19
model_size: nano # nano, small, medium, large
activation: relu6 # relu, relu6, silu, mish
detection: true
segmentation: false
classification: false
split_decoder: true # true = outputs need anchor decoding after dequantization
# false = outputs are fully decoded boxes
# See "Post-Processing & Split Decoder" section for details
anchors: # Per-level anchor boxes (pixels at input resolution)
- [[35, 42], [57, 89], [125, 126]]
- [[125, 126], [208, 260], [529, 491]]
# Ultralytics model configuration
model:
model_version: v8 # v5, v8, v11
model_task: segment # detect, segment
model_size: n # n (nano), s (small), m (medium), l (large), x (xlarge)
detection: false
segmentation: true
split_decoder: true # true = split outputs (boxes, scores, mask_coefficients, protos)
# false = monolithic detection tensor
# Default true for quantized segmentation models
Outputs Section
# ModelPack detection output example
outputs:
- name: "output_0"
index: 0
output_index: 0
shape: [1, 40, 40, 54]
dshape:
- batch: 1
- height: 40
- width: 40
- num_anchors_x_features: 54 # 3 anchors × (5 + 13 classes)
dtype: float32
type: detection
decode: true
decoder: modelpack
quantization: [0.176, 198]
stride: [16, 16]
anchors:
- [0.054, 0.065]
- [0.089, 0.139]
- [0.195, 0.196]
# Ultralytics detection output example
outputs:
- name: "output0"
index: 0
output_index: 0
shape: [1, 84, 8400] # NCHW: [batch, 4+nc, num_boxes]
dshape:
- batch: 1
- num_features: 84 # 4 box coords + 80 classes
- num_boxes: 8400
dtype: float32
type: detection
decode: true
decoder: ultralytics
quantization: null # Float model
anchors: null # Anchor-free
score_format: per_class # YOLOv8/v11/v26: class probabilities directly
# Ultralytics instance segmentation protos example
- name: "output1"
index: 1
output_index: 1
shape: [1, 32, 160, 160] # NCHW: [batch, protos, H, W]
dshape:
- batch: 1
- num_protos: 32
- height: 160
- width: 160
dtype: float32
type: protos
decode: true
decoder: ultralytics
quantization: null
anchors: null
score_format: null # Not applicable to protos output
Related Articles
- Camera Adaptor - Native camera format support for edge deployment
- ModelPack Overview - Architecture details and training parameters
- Ultralytics Integration - YOLOv8/v11/v26 training and deployment
- Training Vision Models - Step-by-step training workflow
- On Cloud Validation - Managed validation sessions
- On Target Validation - User-managed validation with edgefirst-validator
- ModelPack Quantization - Converting ONNX to quantized TFLite
- Deploying to Embedded Targets - Model deployment workflow
- EdgeFirst Perception Middleware - Runtime inference stack
- Dataset Zoo - Available datasets for training
- Model Experiments Dashboard - Managing training and validation sessions