
Model Metadata

This document describes the metadata schema embedded in EdgeFirst model files. Model metadata provides complete traceability for MLOps workflows and contains all information needed to decode model outputs for inference.

Overview

EdgeFirst models embed metadata that enables:

  • Full Traceability: Link any deployed model back to its training session, dataset, and configuration in EdgeFirst Studio
  • Self-Describing Models: Models contain all information needed for inference without external configuration files
  • Cross-Platform Compatibility: Consistent schema across TFLite and ONNX formats
  • Third-Party Integration: Any training framework can produce EdgeFirst-compatible models by following this schema

Supported Formats

EdgeFirst models from the Model Zoo (including ModelPack and Ultralytics) embed metadata in format-specific locations:

| Format | Metadata Location | Config Format | Labels |
|---|---|---|---|
| TFLite | ZIP archive (associated files) | edgefirst.json (preferred), edgefirst.yaml | labels.txt |
| ONNX | Custom metadata properties | edgefirst (JSON) | labels (JSON array) |

Supported Training Frameworks

| Framework | Decoder | Architecture | Use Case |
|---|---|---|---|
| ModelPack | modelpack | Anchor-based YOLO | Semantic segmentation, detection |
| Ultralytics | ultralytics | Anchor-free DFL (YOLOv5/v8/v11/v26) | Instance segmentation, detection |

Note

These metadata fields are automatically read and handled by edgefirst-validator and the EdgeFirst Perception Middleware. In most cases, developers don't need to worry about these details — the EdgeFirst ecosystem "Just Works." This documentation exists so developers understand what's happening under the hood when needed.


Traceability for Production MLOps

One of the most critical aspects of production ML systems is traceability — the ability to answer questions like:

  • Where was this model trained?
  • What dataset was used?
  • What were the training parameters?
  • Can I reproduce this model?

EdgeFirst metadata provides complete traceability through these key fields:

| Field | Location | Purpose |
|---|---|---|
| studio_server | host.studio_server | Full hostname of EdgeFirst Studio instance (e.g., test.edgefirst.studio) |
| project_id | host.project_id | Project ID for constructing Studio URLs |
| session_id | host.session | Training session ID for accessing logs, metrics, artifacts |
| dataset_id | dataset.id | Dataset identifier for reproducing training data |
| dataset | dataset.name | Human-readable dataset name |

Example Traceability Workflow

Given a deployed model, you can trace back to its origins:

# Extract metadata from deployed model
metadata = get_edgefirst_metadata(model_path)

# Construct EdgeFirst Studio URLs
studio_server = metadata['host']['studio_server']  # e.g., 'test.edgefirst.studio'
project_id = metadata['host']['project_id']        # e.g., '1123'
session = metadata['host']['session']              # e.g., 't-2110'
dataset_id = metadata['dataset']['id']             # e.g., 'ds-1c8'

# Note: Studio URL parameters require integer IDs. Metadata stores hex values
# with prefixes (t-, ds-). Convert by stripping the prefix and parsing as hex:
#   't-2110' -> int('2110', 16) -> 8464
#   'ds-1c8' -> int('1c8', 16)  -> 456

# Access training session: https://{studio_server}/{project_id}/experiment/training/details?train_session_id={session_int}
# Example: https://test.edgefirst.studio/1123/experiment/training/details?train_session_id=8464

# Access dataset: https://{studio_server}/{project_id}/datasets/gallery/main?dataset={dataset_int}
# Example: https://test.edgefirst.studio/1123/datasets/gallery/main?dataset=456

# View training logs, metrics, and original configuration

This enables:

  • Audit trails for regulatory compliance
  • Debugging production issues by examining training data
  • Reproducibility by re-running training with identical configuration
  • Version control of model lineage through Model Experiments
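The hex-to-integer ID conversion described in the workflow above can be wrapped in a small helper. This is an illustrative sketch (studio_id_to_int is not part of any EdgeFirst API):

```python
def studio_id_to_int(value: str) -> int:
    """Convert a prefixed hex ID from metadata (e.g. 't-2110', 'ds-1c8')
    to the integer form used in Studio URL parameters."""
    # Strip the alphabetic prefix up to the dash, then parse the rest as hex
    _, _, hex_part = value.partition('-')
    return int(hex_part, 16)

print(studio_id_to_int('t-2110'))   # 8464
print(studio_id_to_int('ds-1c8'))   # 456
```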

Reading Metadata

TFLite Models

TFLite models are ZIP-format files containing embedded edgefirst.json (or edgefirst.yaml) and labels.txt:

import zipfile
import yaml
import json
from typing import Optional, List

def get_edgefirst_metadata(model_path: str) -> Optional[dict]:
    """Extract EdgeFirst metadata from a TFLite model."""
    if not zipfile.is_zipfile(model_path):
        return None
    
    with zipfile.ZipFile(model_path) as zf:
        # Try JSON first (preferred), then YAML fallback
        for filename in ['edgefirst.json', 'edgefirst.yaml']:
            if filename in zf.namelist():
                with zf.open(filename) as f:
                    content = f.read().decode('utf-8')
                    if filename.endswith('.json'):
                        return json.loads(content)
                    else:
                        return yaml.safe_load(content)
    return None

def get_labels(model_path: str) -> List[str]:
    """Extract class labels from a TFLite model."""
    if not zipfile.is_zipfile(model_path):
        return []
    
    with zipfile.ZipFile(model_path) as zf:
        if 'labels.txt' in zf.namelist():
            with zf.open('labels.txt') as f:
                content = f.read().decode('utf-8').strip()
                return [line.strip() for line in content.splitlines()
                        if line.strip()]
    return []

ONNX Models

ONNX models store metadata directly in the model's custom properties:

import onnx
import json
from typing import Optional, List

def get_edgefirst_metadata(model_path: str) -> Optional[dict]:
    """Extract EdgeFirst metadata from an ONNX model."""
    model = onnx.load(model_path)
    
    for prop in model.metadata_props:
        if prop.key == 'edgefirst':
            return json.loads(prop.value)
    return None

def get_labels(model_path: str) -> List[str]:
    """Extract class labels from an ONNX model."""
    model = onnx.load(model_path)
    
    for prop in model.metadata_props:
        if prop.key == 'labels':
            return json.loads(prop.value)
    return []

def get_quick_metadata(model_path: str) -> dict:
    """Get commonly-used fields without parsing full config."""
    model = onnx.load(model_path)
    
    result = {}
    quick_fields = ['name', 'description', 'author', 'studio_server',
                    'project_id', 'session_id', 'dataset', 'dataset_id']
    
    for prop in model.metadata_props:
        if prop.key in quick_fields:
            result[prop.key] = prop.value
        elif prop.key == 'labels':
            result['labels'] = json.loads(prop.value)
    
    return result

ONNX Runtime Access

For inference applications using ONNX Runtime:

import onnxruntime as ort
import json

session = ort.InferenceSession(model_path)
metadata = session.get_modelmeta()

# Access custom metadata
custom = metadata.custom_metadata_map
edgefirst_config = json.loads(custom.get('edgefirst', '{}'))
labels = json.loads(custom.get('labels', '[]'))

# Access official ONNX fields
print(f"Producer: {metadata.producer_name}")  # 'EdgeFirst ModelPack'
print(f"Graph: {metadata.graph_name}")
print(f"Description: {metadata.description}")

Metadata Schema

The EdgeFirst metadata schema is organized into logical sections. All sections are optional — third-party integrations can include only the sections relevant to their use case.

Complete Schema Structure

# Traceability & Identification
host:
  studio_server: string    # Full EdgeFirst Studio hostname (e.g., test.edgefirst.studio)
  project_id: string       # Project ID for Studio URLs
  session: string          # Training session ID
  username: string         # User who initiated training

dataset:
  name: string             # Human-readable dataset name
  id: string               # Dataset identifier
  classes: [string]        # List of class labels

# Model Identification (from training session)
name: string               # Model/session name
description: string        # Model description
author: string             # Organization (typically "Au-Zone Technologies")

# Model Configuration (see ModelPack and Ultralytics documentation)
input:
  shape: [int]             # Input tensor shape (NCHW or NHWC depending on model)
  cameraadaptor: string    # Camera format (rgb, bgr, rgba, bgra, grey, yuyv)
  input_channels: int      # Channels from camera (3=RGB, 4=RGBA, 1=grey)
  output_channels: int     # Channels after CameraAdaptor transform

model:
  backbone: string         # Backbone architecture (e.g., cspdarknet19, cspdarknet53)
  model_size: string       # Size variant (nano, small, medium, large)
  activation: string       # Activation function (relu, relu6, silu)
  detection: boolean       # Detection task enabled
  segmentation: boolean    # Segmentation task enabled
  classification: boolean  # Classification task enabled
  split_decoder: boolean   # Whether decoder is external (see Split Decoder section)
  anchors: [[[int, int]]]  # Anchor boxes per output level
  # ... additional model-specific parameters

# Training Configuration
trainer:
  epochs: int              # Number of training epochs
  batch_size: int          # Training batch size
  weights: string          # Pretrained weights source
  checkpoint_path: string  # Where checkpoints were saved

optimizer:
  optimizer: string        # Optimizer type (adam, adamw, sgd)
  learning_rate: float     # Base learning rate
  weight_decay: float      # L2 regularization strength
  # ... additional optimizer parameters

augmentation:  # See Vision Augmentations documentation
  random_hflip: int        # Horizontal flip probability (0-100)
  random_mosaic: int       # Mosaic augmentation probability
  # ... additional augmentation parameters

validation:
  iou: float               # NMS IoU threshold
  score: float             # NMS score threshold
  nms: string              # NMS algorithm (none, numpy, hal, tensorflow, torch)
  normalization: string    # Input normalization (unsigned, signed)
  preprocessing: string    # Preprocessing method (resize, letterbox)
  skip_validation_steps: int  # Steps to skip between validations

export:  # See Quantization documentation for ModelPack and Ultralytics
  export: boolean          # Whether model was quantized
  export_input_type: string   # Input quantization type
  export_output_type: string  # Output quantization type
  calibration_samples: int    # Samples used for calibration

# Decoder Configuration (Ultralytics only)
decoder_version: string    # YOLO architecture version: yolov5, yolov8, yolo11, yolo26
nms: string                # NMS mode for HAL decoder: class_agnostic, class_aware

# Output Specification (Critical for Inference)
outputs:
  - name: string           # Output tensor name
    index: int             # Tensor index
    output_index: int      # Output order
    shape: [int]           # Tensor shape
    dshape:                # Named dimensions as ordered array (see dshape section)
      - batch: int
      - height: int                 # For spatial outputs
      - width: int                  # For spatial outputs
      - num_features: int           # For detection outputs
      - num_boxes: int              # For detection outputs
      - padding: int                # For detection outputs
      - box_coords: int             # For detection outputs
      - num_classes: int            # For detection outputs
      - num_anchors_x_features: int # For detection outputs
      - num_protos: int             # For instance segmentation
    dtype: string          # Data type (float32, uint8, int8)
    type: string           # Semantic type (detection, segmentation, boxes, scores, masks, protos)
    decode: boolean        # Whether decoding is required
    decoder: string        # Decoder type: 'modelpack' or 'ultralytics'
    quantization: [float, int]  # [scale, zero_point] for quantized models
    stride: [int, int]     # Spatial stride for this output (ModelPack)
    anchors: [[[float, float]]]  # Normalized anchors for this output level (ModelPack only)
    score_format: string   # Score encoding: 'per_class' or 'obj_x_class' (Ultralytics only)
    normalized: boolean    # Box coordinates in [0,1] range (true) or pixels (false). Optional field.

Output Specification

The outputs section is critical for inference — it tells the runtime how to interpret model outputs.

Output Types

For Ultralytics framework models, the following output types are used:

| Type | Description | Typical Shape |
|---|---|---|
| detection | Raw detection output (needs to be split) | [1, num_features, num_boxes] |
| boxes | Split bounding boxes | [1, 4, num_boxes] |
| scores | Split class scores | [1, classes, num_boxes] |
| mask_coefficients | Split coefficients for instance segmentation | [1, num_protos, num_boxes] |
| protos | Instance segmentation prototypes | [1, H, W, num_protos] (NHWC) |

score_format field (Ultralytics only):

| Value | Description | Architecture |
|---|---|---|
| per_class | Each anchor outputs [nc] class probabilities directly | YOLOv8, YOLO11, YOLO26 |
| obj_x_class | Each anchor outputs [1 + nc] where final score = objectness × class confidence | YOLOv5 |

When score_format is absent, the validator falls back to a shape-based heuristic on the feature dimension: nc+5 features per anchor (4 box coordinates + 1 objectness + nc class probabilities) implies obj_x_class (e.g., [1, 85, 8400] for 80 classes).
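A minimal sketch of that fallback heuristic (the function name is illustrative, not part of the validator API):

```python
def infer_score_format(shape, num_classes):
    """Shape-based fallback when score_format is absent.

    shape: detection output shape [batch, num_features, num_boxes].
    Returns 'obj_x_class' when features == nc + 5 (4 box coords +
    1 objectness + nc classes), otherwise 'per_class' (nc + 4).
    """
    num_features = shape[1]
    if num_features == num_classes + 5:
        return 'obj_x_class'
    return 'per_class'

print(infer_score_format([1, 85, 8400], 80))  # obj_x_class (YOLOv5-style)
print(infer_score_format([1, 84, 8400], 80))  # per_class (YOLOv8-style)
```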

HAL Score Format

HAL determines score format from decoder_version rather than score_format. yolov5 applies objectness × class; all other versions use per-class scores directly.

For ModelPack framework models, the following output types are used:

| Type | Description | Typical Shape |
|---|---|---|
| detection | Raw detection output (needs decoding) | [1, H, W, num_anchors_x_features] |
| boxes | Bounding boxes | [1, num_boxes, 1, 4] |
| scores | Class scores | [1, num_boxes, classes] |
| segmentation | Semantic segmentation output | [1, H, W, classes] |
| masks | Semantic segmentation masks | [1, H, W] |

Segmentation Types

EdgeFirst supports two distinct segmentation approaches:

Semantic Segmentation (ModelPack)

Per-pixel classification without object instances. Each pixel is assigned a class label, but individual objects are not distinguished.

Use cases:

  • Drivable surface detection
  • Lane segmentation
  • Sky/ground separation
  • Terrain classification

Output structure:

outputs:
  - name: "segmentation_output"
    type: segmentation
    shape: [1, 480, 640, 5]    # [batch, H, W, num_classes]
    dshape:
      - batch: 1
      - height: 480
      - width: 640
      - num_classes: 5
    decoder: modelpack

Instance Segmentation (Ultralytics)

Per-pixel classification with object instances. Each detected object gets its own mask, enabling fine-grained object boundaries beyond bounding boxes.

Use cases:

  • Individual person segmentation
  • Vehicle instance masks
  • Product segmentation
  • Fine-grained object detection

Output structure:

# Detection output with mask coefficients
outputs:
  - name: "detection_output"
    type: detection
    shape: [1, 116, 8400]      # [batch, 4+nc+32, num_boxes] - includes 32 mask coefficients
    dshape:
      - batch: 1
      - num_features: 116        # 4 box coords + 80 classes + 32 mask coefficients
      - num_boxes: 8400
    decoder: ultralytics

  # Prototype masks for instance computation
  - name: "protos_output"
    type: protos
    shape: [1, 32, 160, 160]   # [batch, num_protos, H, W] NCHW
    dshape:
      - batch: 1
      - num_protos: 32
      - height: 160
      - width: 160
    decoder: ultralytics

Final mask computation:

# For each detected object with mask_coefficients [32]:
instance_mask = sigmoid(mask_coefficients @ protos)  # [32] @ [32, H, W] → [H, W]
# Crop to bounding box region for final instance mask
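The mask computation above, made concrete with NumPy (the crop-to-box step is omitted; function and variable names are illustrative):

```python
import numpy as np

def instance_mask(coefficients, protos):
    """Combine one detection's mask coefficients with the prototype masks.

    coefficients: [num_protos] values (e.g. 32) from the detection output
    protos:       [num_protos, H, W] prototype tensor
    Returns a [H, W] mask of probabilities in (0, 1).
    """
    logits = np.tensordot(coefficients, protos, axes=1)  # [H, W]
    return 1.0 / (1.0 + np.exp(-logits))                 # sigmoid

coeffs = np.zeros(32, dtype=np.float32)
protos = np.random.rand(32, 160, 160).astype(np.float32)
mask = instance_mask(coeffs, protos)
print(mask.shape)  # (160, 160)
```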

The dshape Field

The dshape field provides named dimensions for easier interpretation of tensor shapes. This is especially useful when shapes vary between data layouts (NCHW vs NHWC).

outputs:
  - name: "output_0"
    shape: [1, 84, 8400]       # Raw shape
    dshape:                    # Named dimensions as ordered array
      - batch: 1
      - num_features: 84         # 4 box coords + 80 classes
      - num_boxes: 8400

Standard dimension names:

| Name | Description |
|---|---|
| batch | Batch size (typically 1 for inference) |
| height | Spatial height |
| width | Spatial width |
| num_classes | Number of classification classes |
| num_features | Feature dimension (box coords + classes + mask coefficients) |
| num_boxes | Number of detection boxes/anchors |
| num_protos | Number of prototype masks (instance segmentation) |
| num_anchors_x_features | Combined anchor and feature dimension for ModelPack grid outputs (anchors × features per anchor) |
| padding | Padding/alignment dimension used to satisfy expected tensor shapes. Must always be 1 |
| box_coords | The coordinates of the boxes. Must be 4 |

Decoding Information

For outputs with decode: true, the metadata provides all information needed to decode:

outputs:
  - name: "detection_output_0"
    type: detection
    decode: true
    decoder: modelpack
    shape: [1, 40, 40, 54]      # Grid output
    dshape:
      - batch: 1
      - height: 40
      - width: 40
      - num_anchors_x_features: 54
    anchors:                    # Normalized anchor boxes
      - [0.054, 0.065]
      - [0.089, 0.139]
      - [0.195, 0.196]
    quantization: [0.176, 198]  # For dequantization

Quantization Parameters

For quantized models (TFLite INT8), each output includes quantization parameters:

# Dequantize output
scale, zero_point = output_spec['quantization']
float_output = (quantized_output - zero_point) * scale

Data Layout (NCHW vs NHWC)

Deep learning frameworks use different memory layouts for tensor data. The metadata accurately reflects each format's native layout:

| Format | Data Layout | Shape Convention | Example (batch=1, 640x640, RGB) |
|---|---|---|---|
| TFLite | NHWC | [batch, height, width, channels] | [1, 640, 640, 3] |
| ONNX | NCHW | [batch, channels, height, width] | [1, 3, 640, 640] |

Why This Matters

  • TFLite (TensorFlow): Uses channels-last (NHWC) which is optimized for CPU and mobile inference
  • ONNX (PyTorch-derived): Uses channels-first (NCHW) which is optimized for GPU and NPU inference

The metadata's outputs section reports shapes in the model's native format. When integrating with inference runtimes, ensure your input preprocessing matches the expected layout.

Metadata Fields

input:
  shape: [1, 640, 640, 3]  # Input tensor shape (layout varies by model)
  cameraadaptor: rgb       # Channel order (rgb, bgr, yuyv)
  # Common layouts:
  # - NHWC: [batch, height, width, channels] e.g., [1, 640, 640, 3]
  # - NCHW: [batch, channels, height, width] e.g., [1, 3, 640, 640]

outputs:
  - name: "output_0"
    shape: [1, 640, 640, 3]   # TFLite: NHWC
    # shape: [1, 3, 640, 640] # ONNX: NCHW

Input Preprocessing

EdgeFirst models expect specific input preprocessing. The metadata documents these requirements so inference pipelines can prepare data correctly.

Image Resizing

Models expect input images at the resolution specified in metadata. How images are resized depends on the training approach:

input:
  shape: [1, 640, 640, 3]  # NHWC example: [batch, height, width, channels]
  # shape: [1, 3, 640, 640]  # NCHW example: [batch, channels, height, width]
  cameraadaptor: rgb       # Expected color format

Native Aspect Ratio (typical for purpose-built datasets):

  • ModelPack models are often trained at the camera's native aspect ratio
  • Images are directly resized to target dimensions without padding
  • Best accuracy when deployment camera matches training data

Letterbox (typical for diverse datasets like COCO):

  • Used when training on images from diverse cameras and aspect ratios
  • Image is scaled to fit within target size while maintaining aspect ratio
  • Gray padding (value 114) added to reach exact dimensions
  • Inference must apply same letterbox transform and account for padding offset in output coordinates

Example: A 1920x1080 image letterboxed to 640x640:

  • Scaled to 640x360 (maintains 16:9 ratio)
  • 140 pixels of padding added to top and bottom
  • Output box coordinates must be adjusted to remove padding offset
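The letterbox arithmetic from that example can be sketched as a small helper (an illustrative function, not an EdgeFirst API):

```python
def letterbox_params(src_w, src_h, dst_w, dst_h):
    """Compute the scale and padding for a letterbox resize."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (dst_w - new_w) // 2   # left/right padding
    pad_y = (dst_h - new_h) // 2   # top/bottom padding
    return scale, new_w, new_h, pad_x, pad_y

print(letterbox_params(1920, 1080, 640, 640))
# scale ~0.333, scaled to 640x360, pad_x=0, pad_y=140
```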

Pixel Normalization

Input pixels are normalized from [0, 255] to [0.0, 1.0]:

# Standard normalization
normalized = pixels.astype(np.float32) / 255.0

For quantized models (INT8), the quantization parameters handle the scaling internally — raw uint8 pixel values can often be used directly.

Camera Adaptor

The cameraadaptor field specifies the expected input format for the model. See Camera Adaptor for details on how this enables models to consume native camera formats without runtime conversion.

| Value | Description | Channel Order |
|---|---|---|
| rgb | Standard RGB | Red, Green, Blue |
| bgr | OpenCV default | Blue, Green, Red |
| rgba | RGB with alpha | Red, Green, Blue, Alpha |
| bgra | BGR with alpha | Blue, Green, Red, Alpha |
| grey | Greyscale | Single channel |
| yuyv | YUV 4:2:2 packed | For direct camera sensor input |

Validation Parameters

The validation section records the recommended settings based on how the model was trained. These parameters are informational preferences — they document the model author's intended configuration for validation and inference.

Parameter Semantics

| Parameter | Description | Default | Override at Runtime? |
|---|---|---|---|
| iou | NMS IoU threshold | 0.7 | Yes |
| score | NMS confidence score threshold | 0.001 | Yes |
| nms | NMS algorithm | (not set) | See below |
| normalization | Input pixel normalization | unsigned | Yes |
| preprocessing | Image preprocessing method | letterbox | Yes |

Most parameters (iou, score, normalization, preprocessing, and NMS algorithm choices like hal/tensorflow/numpy/torch) can be overridden at runtime based on deployment preferences.

Exception: nms: none must be respected because the model does not produce outputs compatible with external NMS. This applies to two cases:

  1. Architectural end-to-end models (e.g., YOLO26) — NMS is part of the model architecture via one-to-one matching heads. The model graph itself produces final predictions.
  2. Engine-embedded NMS — Models exported with NMS operations appended to the inference graph (ONNX, TensorRT, TFLite). NMS is not part of the original model architecture but was added during export or conversion.

Both produce post-NMS output in [x1, y1, x2, y2, conf, class, ...] format. Detection models output (1, max_det, 6). Segmentation models output (1, max_det, 6 + nm) plus prototype masks — the mask coefficients for NMS-selected detections are preserved, so only the mask decode step is needed externally (mask = sigmoid(coefficients @ prototypes)). Use --nms none (CLI) or validation.nms: none (metadata) for either case.

Allowed nms Values

| Value | Description |
|---|---|
| none | No external NMS. For models with embedded NMS — either architectural end-to-end (YOLO26) or engine-embedded (ONNX/TRT/TFLite with NMS ops appended). Supports both detection and segmentation |
| numpy | NumPy-based NMS implementation (default fallback) |
| hal | EdgeFirst HAL decoder NMS |
| tensorflow | TensorFlow NMS |
| torch | PyTorch (torchvision) NMS |

When --override is set, the validator reads validation.nms from the model metadata and applies it automatically.

Box Coordinate Format (normalized)

The normalized field on detection and boxes outputs specifies the coordinate format:

| Value | Description | Coordinate Range |
|---|---|---|
| true | Normalized coordinates relative to model input dimensions | [0.0, 1.0] |
| false | Pixel coordinates relative to model input (letterboxed frame) | [0, width] / [0, height] |
| (absent) | Must be inferred from output values | Check if any coordinate > 1.0 |

When normalized is absent, the coordinate format must be inferred by examining the output values. If any bounding box coordinate exceeds 1.0, the coordinates are in pixels; otherwise, assume normalized.
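That inference rule can be sketched in a few lines (illustrative helper, not part of the validator):

```python
import numpy as np

def infer_box_format(boxes):
    """boxes: array-like of [N, 4] xyxy coordinates.
    Returns 'pixels' if any coordinate exceeds 1.0, else 'normalized'."""
    return 'pixels' if np.any(np.asarray(boxes) > 1.0) else 'normalized'

print(infer_box_format([[0.1, 0.2, 0.4, 0.5]]))   # normalized
print(infer_box_format([[64, 128, 256, 320]]))    # pixels
```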

Normalized coordinates are preferred because they:

  • Don't require knowledge of model input resolution for downstream processing
  • Quantize better (smaller dynamic range)
  • Work consistently across different model input sizes

Pixel coordinates are typically used by:

  • End-to-end models with embedded NMS (YOLO26, engine-embedded NMS)
  • Models exported with specific output coordinate conventions

Note

Coordinates are always relative to the letterboxed model input, not the original image aspect ratio. The caller must apply the inverse letterbox transform to map boxes back to original image coordinates regardless of whether normalized is true or false.

# Example: End-to-end model with pixel coordinates
outputs:
  - name: "output0"
    type: detection
    shape: [1, 100, 6]    # [batch, max_det, x1+y1+x2+y2+conf+class]
    normalized: false      # Pixel coordinates
    decoder: ultralytics
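The inverse letterbox transform mentioned in the note above can be sketched as follows (an illustrative helper; pad and scale values come from however the input was letterboxed):

```python
def unletterbox_box(box, scale, pad_x, pad_y):
    """Map a pixel-coordinate box from the letterboxed model frame
    back to the original image (inverse letterbox transform).

    box: (x1, y1, x2, y2) in letterboxed-input pixels
    scale, pad_x, pad_y: parameters used when letterboxing the input
    """
    x1, y1, x2, y2 = box
    return ((x1 - pad_x) / scale, (y1 - pad_y) / scale,
            (x2 - pad_x) / scale, (y2 - pad_y) / scale)

# 1920x1080 letterboxed to 640x640: scale=1/3, pad_x=0, pad_y=140
print(unletterbox_box((0, 140, 640, 500), 1/3, 0, 140))
# ~ (0, 0, 1920, 1080): the full original frame
```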

Post-Processing & Split Decoder

What is Split Decoder?

The split_decoder field indicates the model's outputs have been modified from the standard architecture to improve INT8 quantization performance. The details are framework-specific:

  • ModelPack: Raw grid features instead of decoded boxes (dequantize before anchor decode)
  • Ultralytics: Detection tensor split into separate boxes, scores, mask_coefficients tensors (per-tensor quantization)

model:
  split_decoder: true    # Outputs modified for quantization (see framework docs)
  split_decoder: false   # Standard output format

See framework documentation for implementation details.

Why Split Decoder Exists

Quantization introduces precision loss. Split decoder addresses this differently per framework:

ModelPack: For small objects or high-resolution inputs, applying anchor calculations in INT8 would compound rounding errors. By deferring decoding until after dequantization, we preserve box accuracy.

Ultralytics: The monolithic detection tensor contains boxes, scores, and mask coefficients with very different value ranges. Splitting them into separate tensors allows per-tensor quantization scales, preserving accuracy for each component independently.

Decoding Process

When split_decoder: true, the inference pipeline must:

  1. Run model inference → Get quantized outputs
  2. Dequantize outputs → Convert INT8 to float32 using per-output scale/zero_point
  3. Apply decoding → Framework-specific: anchor decode (ModelPack) or box format conversion (Ultralytics)
  4. Run NMS → Filter overlapping detections

# Example decoding flow for split_decoder models
for output_spec in metadata['outputs']:
    if output_spec.get('decode', False):
        # Dequantize first
        scale, zp = output_spec['quantization']
        raw_float = (raw_int8.astype(np.float32) - zp) * scale

        # Then decode (framework-specific)
        if output_spec['decoder'] == 'modelpack':
            boxes = decode_yolo_grid(raw_float, output_spec['anchors'], output_spec['stride'])
        elif output_spec['decoder'] == 'ultralytics':
            # boxes, scores, mask_coefficients are separate outputs
            pass  # Use output type to determine handling

Output Types with Split Decoder

| Framework | split_decoder | Output Types |
|---|---|---|
| ModelPack | true | detection (raw grid, needs anchor decode) |
| ModelPack | false | boxes, scores (decoded) |
| Ultralytics | true | boxes, scores, mask_coefficients, protos |
| Ultralytics | false | detection, protos (monolithic) |

Decoder Field

The decoder field specifies which decoding algorithm to use:

outputs:
  - name: "detection_output_0"
    type: detection
    decode: true
    decoder: modelpack    # Use ModelPack YOLO-style grid decoding

Supported Decoders

modelpack — Anchor-Based YOLO Decoder

Used by ModelPack models. Traditional YOLO-style grid decoding with pre-defined anchor boxes.

Characteristics:

  • Anchor-based: Uses pre-defined anchor boxes per output level (3 anchors × 3 scales typical)
  • Grid outputs: Raw features from detection grid cells
  • Sigmoid activations: Applied to xy, wh, objectness, and class predictions

Decoding formula:

xy = (sigmoid(xy) * 2.0 + grid - 0.5) * stride
wh = (sigmoid(wh) * 2) ** 2 * anchors * stride * 0.5
xyxy = concat([xy - wh, xy + wh]) / input_dims  # normalized xyxy

Required metadata fields:

outputs:
  - decoder: modelpack
    anchors:              # Required - normalized anchor boxes
      - [0.054, 0.065]
      - [0.089, 0.139]
    stride: [16, 16]      # Required - spatial stride
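The decoding formula above, applied to a single grid cell and anchor with NumPy. The exact tensor layout varies by model, so treat this as an illustrative sketch rather than the reference implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_modelpack_cell(raw_xywh, gx, gy, anchor, stride, input_dim):
    """Decode one grid cell / one anchor using the formula above.

    raw_xywh: raw features (tx, ty, tw, th) for this anchor
    (gx, gy): grid cell indices; anchor: normalized (w, h); stride: int
    Returns a normalized xyxy box.
    """
    tx, ty, tw, th = raw_xywh
    xy = (sigmoid(np.array([tx, ty])) * 2.0 + np.array([gx, gy]) - 0.5) * stride
    wh = (sigmoid(np.array([tw, th])) * 2.0) ** 2 * np.array(anchor) * stride * 0.5
    return np.concatenate([xy - wh, xy + wh]) / input_dim

box = decode_modelpack_cell((0.0, 0.0, 0.0, 0.0), 0, 0, (0.1, 0.1), 16, 640)
print(box)  # centred at 8 px: ~[0.01125 0.01125 0.01375 0.01375]
```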

Deprecated: decoder: yolov8

The decoder value yolov8 is deprecated. Use ultralytics instead. Existing models with decoder: yolov8 will continue to work — the validator automatically normalizes yolov8 to ultralytics with a deprecation warning.

ultralytics — Anchor-Free DFL Decoder

Used by Ultralytics models (YOLOv5, YOLOv8, YOLO11, YOLO26). Modern anchor-free detection using Distribution Focal Loss (DFL).

Characteristics:

  • Anchor-free: Uses anchor points (grid centers) instead of pre-defined boxes
  • DFL regression: Converts 16-bin distribution to box coordinates
  • Unified architecture: Same decoder for YOLOv5, YOLOv8, YOLO11, and YOLO26

Decoding formula:

# DFL converts 16-bin distribution to coordinate value
box = dfl(raw_box)  # [batch, 64, anchors] → [batch, 4, anchors]

# dist2bbox converts LTRB distances to boxes
x1y1 = anchor_points - lt
x2y2 = anchor_points + rb
# Returns xywh or xyxy in pixel coordinates
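A NumPy sketch of the 16-bin DFL step: softmax over the bins, then the expected bin index gives each LTRB distance. The channel ordering (4 sides × reg_max bins) is the common layout but should be treated as an assumption:

```python
import numpy as np

def dfl(raw_box, reg_max=16):
    """Decode DFL distributions to LTRB distances.

    raw_box: [4 * reg_max, num_anchors] -> returns [4, num_anchors]
    """
    n = raw_box.shape[-1]
    x = raw_box.reshape(4, reg_max, n)
    x = np.exp(x - x.max(axis=1, keepdims=True))
    probs = x / x.sum(axis=1, keepdims=True)          # softmax over bins
    bins = np.arange(reg_max, dtype=np.float32)
    return np.einsum('krn,r->kn', probs, bins)        # expected bin value

# A near-one-hot distribution peaked at bin 3 decodes to ~3.0
raw = np.full((64, 1), -20.0, dtype=np.float32)
raw[3::16, 0] = 20.0   # peak at bin 3 for all four sides
print(dfl(raw)[:, 0])  # ~[3. 3. 3. 3.]
```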

Metadata structure:

outputs:
  - decoder: ultralytics
    anchors: null         # Not used - anchor-free
    # Strides are implicit: [8, 16, 32] for P3/P4/P5 outputs

Version differences: All Ultralytics versions use the same anchor-free Detect class. Differences are in backbone architecture:

| Version | Backbone Blocks | Classification Head |
|---|---|---|
| YOLOv5 | C3 | Conv→Conv→Conv2d |
| YOLOv8 | C2f | Conv→Conv→Conv2d |
| YOLO11 | C3k2, C2PSA | DWConv→Conv (efficient) |
| YOLO26 | C3k2, A2C2f | DWConv→Conv (efficient) |

Decoder Version Field

The decoder_version field specifies the YOLO architecture version for Ultralytics models. This field is critical for determining the correct decoding strategy, especially for end-to-end models.

decoder_version: yolo26    # End-to-end model with embedded NMS
# or
decoder_version: yolov8    # Traditional model requiring external NMS

Supported values:

| Value | Architecture | NMS Handling |
|---|---|---|
| yolov5 | YOLOv5 | External NMS required |
| yolov8 | YOLOv8 | External NMS required |
| yolo11 | YOLO11 | External NMS required |
| yolo26 | YOLO26 | Embedded NMS (end-to-end) |

Naming Convention

The naming follows Ultralytics conventions: yolov5 and yolov8 include the 'v' prefix, while yolo11 and yolo26 do not (Ultralytics dropped the 'v' starting with YOLO11).

When decoder_version is yolo26:

  • The model uses one-to-one matching heads with NMS embedded in the architecture
  • Output format is [x1, y1, x2, y2, conf, class, ...] (post-NMS)
  • The HAL decoder uses end-to-end model types regardless of the nms field
  • No external NMS is applied

When decoder_version is absent or any other value:

  • Traditional YOLO architecture requiring external NMS
  • The nms field controls which NMS algorithm the HAL decoder uses

HAL NMS Field

The nms field at the config root level controls the HAL decoder's NMS behavior:

nms: class_agnostic    # Suppress overlapping boxes regardless of class (default)
# or
nms: class_aware       # Only suppress boxes with the same class label

| Value | Behavior |
|---|---|
| class_agnostic | Suppress overlapping boxes regardless of class label (default) |
| class_aware | Only suppress boxes that share the same class AND overlap |

Different from validation.nms

The root-level nms field controls HAL decoder behavior (class-agnostic vs class-aware). The validation.nms field in the validation section specifies the NMS implementation to use during validation (hal, numpy, tensorflow, etc.) or none for models with embedded NMS.

Example configuration for YOLO26 end-to-end model:

decoder_version: yolo26
outputs:
  - decoder: ultralytics
    type: detection
    shape: [1, 100, 6]
    normalized: false
    dshape:
      - batch: 1
      - num_boxes: 100
      - num_features: 6
validation:
  nms: none    # Model has embedded NMS

Example configuration for traditional YOLOv8 model:

decoder_version: yolov8
nms: class_agnostic
outputs:
  - decoder: ultralytics
    type: detection
    shape: [1, 84, 8400]
    dshape:
      - batch: 1
      - num_features: 84
      - num_boxes: 8400
validation:
  nms: hal    # Use HAL decoder NMS

ONNX-Specific Metadata

ONNX models exported from ModelPack or Ultralytics include additional official metadata fields:

| Field | ModelPack Value | Ultralytics Value | Purpose |
|---|---|---|---|
| producer_name | "EdgeFirst ModelPack" | "EdgeFirst Ultralytics" | Identifies producing framework |
| producer_version | Package version | Package version | Version tracking |
| graph.name | Model name | Model name | Graph identification |
| doc_string | Description | Description | Human-readable description |

Custom metadata properties (all string values):

| Key | Content | Purpose |
| --- | --- | --- |
| edgefirst | Full config as JSON | Complete configuration |
| name | Model name | Quick access (no JSON parsing) |
| description | Model description | Quick access |
| author | Author/organization | Quick access |
| studio_server | Full hostname | Quick access for traceability |
| project_id | Project ID | Quick access for traceability |
| session_id | Session ID | Quick access for traceability |
| dataset | Dataset name | Quick access |
| dataset_id | Dataset ID | Quick access for traceability |
| labels | JSON array of labels | Class labels |
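On the reading side, these properties come back as a flat map of strings, and the JSON-encoded entries need one extra decode step. A small sketch (`parse_edgefirst_props` is a hypothetical helper; the props dict is built as `{p.key: p.value for p in onnx.load(path).metadata_props}`):

```python
import json

def parse_edgefirst_props(props: dict) -> dict:
    """Decode EdgeFirst custom ONNX properties into Python objects.

    `props` is the {key: value} string map read from the model's
    metadata_props. The edgefirst and labels entries are JSON
    strings; the remaining quick-access entries are plain strings.
    """
    return {
        'config': json.loads(props.get('edgefirst', '{}')),
        'labels': json.loads(props.get('labels', '[]')),
        'studio_server': props.get('studio_server', ''),
        'session_id': props.get('session_id', ''),
    }
```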

Third-Party Integration

Any training framework can produce EdgeFirst-compatible models by embedding the appropriate metadata.

Minimum Required Fields

For basic EdgeFirst Perception stack compatibility:

input:
  shape: [1, 640, 640, 3]  # Input tensor shape (NHWC or NCHW)
  cameraadaptor: rgb

model:
  detection: true
  segmentation: false
  split_decoder: true  # or false if decoder is built-in

outputs:
  - name: "output_0"
    shape: [1, 8400, 84]
    dtype: float32
    type: boxes  # or detection if the output still needs decoding
    decode: false

dataset:
  classes:
    - class1
    - class2

For production MLOps integration with EdgeFirst Studio:

host:
  studio_server: test.edgefirst.studio
  project_id: "1123"
  session: t-2110              # Hex value, convert to int for URLs

dataset:
  name: "My Dataset"
  id: ds-xyz789
  classes: [...]

name: "my-model-v1"              # Model/session name
description: "Model for production deployment"
author: "My Organization"

Embedding Metadata in TFLite

Dependencies

This example requires the tflite-support and pyyaml packages:

pip install tflite-support pyyaml

from tensorflow_lite_support.metadata.python.metadata_writers import metadata_writer, writer_utils
from tensorflow_lite_support.metadata import metadata_schema_py_generated as schema
import yaml
from typing import List
import tempfile
import os

def add_edgefirst_metadata(tflite_path: str, config: dict, labels: List[str]):
    """Add EdgeFirst metadata to a TFLite model."""
    
    # Write config and labels to temp files in a cross-platform way
    with tempfile.TemporaryDirectory() as tmpdir:
        config_path = os.path.join(tmpdir, 'edgefirst.yaml')
        labels_path = os.path.join(tmpdir, 'labels.txt')

        with open(config_path, 'w') as f:
            yaml.dump(config, f)

        with open(labels_path, 'w') as f:
            f.write('\n'.join(labels))

        # Create model metadata
        model_meta = schema.ModelMetadataT()
        model_meta.name = config.get('name', '')
        model_meta.description = config.get('description', '')
        model_meta.author = config.get('author', '')

        # Load and populate
        tflite_buffer = writer_utils.load_file(tflite_path)
        writer = metadata_writer.MetadataWriter.create_from_metadata(
            model_buffer=tflite_buffer,
            model_metadata=model_meta,
            associated_files=[labels_path, config_path]
        )

        writer_utils.save_file(writer.populate(), tflite_path)
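Because the associated files are stored as a ZIP archive inside the model file, the result can be verified with the standard zipfile module, without any TFLite tooling (helper names here are illustrative):

```python
import zipfile

def list_associated_files(tflite_path: str) -> list:
    """List the metadata files embedded in a TFLite model.

    A TFLite model with associated files doubles as a ZIP archive,
    so zipfile can read it directly.
    """
    with zipfile.ZipFile(tflite_path) as zf:
        return zf.namelist()

def read_label_file(tflite_path: str) -> list:
    """Read labels.txt from the embedded archive, one label per line."""
    with zipfile.ZipFile(tflite_path) as zf:
        return zf.read('labels.txt').decode('utf-8').splitlines()
```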

Embedding Metadata in ONNX

Dependencies

This example requires the onnx package:

pip install onnx

import onnx
import json
from typing import List

def add_edgefirst_metadata(onnx_path: str, config: dict, labels: List[str]):
    """Add EdgeFirst metadata to an ONNX model."""
    
    model = onnx.load(onnx_path)
    
    # Set official ONNX fields
    model.producer_name = 'My Training Framework'
    model.producer_version = '1.0.0'
    
    if config.get('name'):
        model.graph.name = config['name']
    if config.get('description'):
        model.doc_string = config['description']
    
    # Add custom metadata
    metadata = {
        'edgefirst': json.dumps(config),
        'labels': json.dumps(labels),
        'name': config.get('name', ''),
        'description': config.get('description', ''),
        'author': config.get('author', ''),
        'studio_server': config.get('host', {}).get('studio_server', ''),
        'project_id': str(config.get('host', {}).get('project_id', '')),
        'session_id': config.get('host', {}).get('session', ''),
        'dataset': config.get('dataset', {}).get('name', ''),
        'dataset_id': str(config.get('dataset', {}).get('id', '')),
    }
    
    for key, value in metadata.items():
        if value:
            prop = model.metadata_props.add()
            prop.key = key
            prop.value = str(value)
    
    onnx.save(model, onnx_path)

Updating Metadata

Updating TFLite Metadata

Since TFLite models are ZIP archives, you can update embedded files:

zip command

The zip command is available on most platforms but may need to be installed:

  • macOS: Pre-installed
  • Linux: sudo apt install zip (Debian/Ubuntu) or sudo yum install zip (RHEL/CentOS)
  • Windows: Available via Git Bash, WSL, or Info-ZIP

# Update edgefirst.yaml
zip -u mymodel.tflite edgefirst.yaml

# Update labels
zip -u mymodel.tflite labels.txt

# Add new files
zip mymodel.tflite edgefirst.json

Updating ONNX Metadata

import onnx
import json

model = onnx.load('mymodel.onnx')

# Update existing metadata
for prop in model.metadata_props:
    if prop.key == 'description':
        prop.value = 'Updated description'

# Add new metadata
prop = model.metadata_props.add()
prop.key = 'custom_field'
prop.value = 'custom_value'

onnx.save(model, 'mymodel.onnx')
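The update loop above silently does nothing when the key is absent. A small upsert helper avoids that, and guarantees repeated calls never create duplicate keys (hypothetical helper; `model` is an onnx.ModelProto, whose metadata_props repeated field exposes .add()):

```python
def upsert_metadata_prop(model, key: str, value: str) -> None:
    """Set an ONNX metadata property, updating it in place if the
    key already exists, otherwise appending a new entry."""
    for prop in model.metadata_props:
        if prop.key == key:
            prop.value = value
            return
    prop = model.metadata_props.add()
    prop.key = key
    prop.value = value
```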

Schema Reference

Host Section

The host section identifies the EdgeFirst Studio instance and training session that produced the model.

host:
  studio_server: test.edgefirst.studio  # Full EdgeFirst Studio hostname
  project_id: "1123"                    # Project ID for Studio URLs
  session: t-2110                       # Training session ID (hex, prefix t-)
  username: john.doe                    # User who initiated training

Converting IDs for Studio URLs

Session and dataset IDs in metadata use hexadecimal values with prefixes (t- for training sessions, ds- for datasets). To construct Studio URLs, strip the prefix and convert from hex to decimal:

  • t-2110 → int('2110', 16) → 8464
  • ds-1c8 → int('1c8', 16) → 456
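In code the conversion is a one-liner; a small helper (hypothetical name):

```python
def studio_id_to_int(prefixed_id: str) -> int:
    """Convert a prefixed hex ID (e.g. t-2110, ds-1c8) to the
    decimal value used in EdgeFirst Studio URLs."""
    _, _, hex_part = prefixed_id.partition('-')  # strip t- / ds- prefix
    return int(hex_part, 16)
```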

Dataset Section

The dataset section references the dataset used for training. See the Dataset Zoo for available datasets and Dataset Structure for format details.

dataset:
  name: "COCO 2017"      # Human-readable name
  id: ds-abc123          # Dataset ID (prefix: ds-)
  classes:               # Ordered list of class labels
    - background
    - person
    - car

Model Identification

Top-level fields for model identification, populated from the training session name and description.

name: "coffeecup-detection"       # Model/session name (used in filename)
description: "Object detection model for coffee cups"
author: "Au-Zone Technologies"    # Organization

Input Section

The input section specifies image preprocessing requirements. See Vision Augmentations for training-time augmentation configuration.

input:
  shape: [1, 640, 640, 3]  # Input tensor shape
  cameraadaptor: rgb       # rgb, rgba, yuyv, bgr

Data Layout

The shape field uses the model's native tensor layout. This can be either NHWC [batch, height, width, channels] or NCHW [batch, channels, height, width] depending on how the model was exported. While TFLite typically uses NHWC and ONNX typically uses NCHW, both formats can support either layout — always check the actual shape values.
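When the layout is not recorded elsewhere, it can usually be inferred from the shape itself. A heuristic sketch (an assumption, not part of the schema: it relies on the channel dimension being small, and flags shapes where both candidates look like channels):

```python
def infer_layout(shape) -> str:
    """Guess NHWC vs NCHW for a 4-D input shape.

    Heuristic: the channel dimension is 1, 3, or 4; ambiguous
    shapes (e.g. very small images) must be checked manually.
    """
    if len(shape) != 4:
        raise ValueError('expected a 4-D shape')
    n, a, b, c = shape
    if c in (1, 3, 4) and a not in (1, 3, 4):
        return 'NHWC'
    if a in (1, 3, 4) and c not in (1, 3, 4):
        return 'NCHW'
    return 'ambiguous'
```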

Model Section

The model section captures architecture configuration. These parameters can be configured during training session setup in EdgeFirst Studio. See the ModelPack and Ultralytics documentation for detailed parameter descriptions.

# ModelPack model configuration
model:
  backbone: cspdarknet19
  model_size: nano       # nano, small, medium, large
  activation: relu6      # relu, relu6, silu, mish
  detection: true
  segmentation: false
  classification: false
  split_decoder: true    # true = outputs need anchor decoding after dequantization
                         # false = outputs are fully decoded boxes
                         # See "Post-Processing & Split Decoder" section for details
  anchors:               # Per-level anchor boxes (pixels at input resolution)
    - [[35, 42], [57, 89], [125, 126]]
    - [[125, 126], [208, 260], [529, 491]]

# Ultralytics model configuration
model:
  model_version: v8      # v5, v8, v11
  model_task: segment    # detect, segment
  model_size: n          # n (nano), s (small), m (medium), l (large), x (xlarge)
  detection: false
  segmentation: true
  split_decoder: true    # true = split outputs (boxes, scores, mask_coefficients, protos)
                         # false = monolithic detection tensor
                         # Default true for quantized segmentation models

Outputs Section

# ModelPack detection output example
outputs:
  - name: "output_0"
    index: 0
    output_index: 0
    shape: [1, 40, 40, 54]
    dshape:
      - batch: 1
      - height: 40
      - width: 40
      - num_anchors_x_features: 54   # 3 anchors × (5 + 13 classes)
    dtype: float32
    type: detection
    decode: true
    decoder: modelpack
    quantization: [0.176, 198]
    stride: [16, 16]
    anchors:
      - [0.054, 0.065]
      - [0.089, 0.139]
      - [0.195, 0.196]
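For quantized outputs, the raw integer tensor must be dequantized before decoding. Assuming the quantization pair above is [scale, zero_point] in the standard TFLite affine scheme (an assumption this sketch makes explicit), the mapping is real = scale × (quantized − zero_point):

```python
import numpy as np

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Standard TFLite affine dequantization:
    real = scale * (quantized - zero_point)."""
    return scale * (q.astype(np.float32) - zero_point)

# e.g. dequantize(raw_uint8_tensor, 0.176, 198)
```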

# Ultralytics detection output example  
outputs:
  - name: "output0"
    index: 0
    output_index: 0
    shape: [1, 84, 8400]           # NCHW: [batch, 4+nc, num_boxes]
    dshape:
      - batch: 1
      - num_features: 84             # 4 box coords + 80 classes
      - num_boxes: 8400
    dtype: float32
    type: detection
    decode: true
    decoder: ultralytics
    quantization: null             # Float model
    anchors: null                  # Anchor-free
    score_format: per_class        # YOLOv8/v11/v26: class probabilities directly

# Ultralytics instance segmentation protos example
  - name: "output1"
    index: 1
    output_index: 1
    shape: [1, 32, 160, 160]       # NCHW: [batch, protos, H, W]
    dshape:
      - batch: 1
      - num_protos: 32
      - height: 160
      - width: 160
    dtype: float32
    type: protos
    decode: true
    decoder: ultralytics
    quantization: null
    anchors: null
    score_format: null             # Not applicable to protos output
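Decoding the raw [1, 84, 8400] detection tensor amounts to transposing, splitting box coordinates from per-class scores, and thresholding before NMS. A hedged numpy sketch, assuming the usual Ultralytics [cx, cy, w, h] box encoding (NMS itself is omitted):

```python
import numpy as np

def decode_ultralytics(raw: np.ndarray, conf_thresh: float = 0.25):
    """Sketch of decoding a raw [1, 4+nc, num_boxes] Ultralytics
    detection tensor into pre-NMS boxes, scores, and class IDs."""
    preds = raw[0].T                       # [num_boxes, 4+nc]
    boxes_cxcywh = preds[:, :4]            # [cx, cy, w, h] per box
    scores = preds[:, 4:]                  # per-class probabilities
    class_ids = scores.argmax(axis=1)
    confs = scores.max(axis=1)
    keep = confs >= conf_thresh
    # Convert center format to corner format [x1, y1, x2, y2]
    cx, cy, w, h = boxes_cxcywh[keep].T
    boxes = np.stack([cx - w / 2, cy - h / 2,
                      cx + w / 2, cy + h / 2], axis=1)
    return boxes, confs[keep], class_ids[keep]
```

The surviving candidates would then go through the HAL decoder's NMS (class_agnostic or class_aware, per the root-level nms field).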

  1. Camera Adaptor - Native camera format support for edge deployment
  2. ModelPack Overview - Architecture details and training parameters
  3. Ultralytics Integration - YOLOv8/v11/v26 training and deployment
  4. Training Vision Models - Step-by-step training workflow
  5. On Cloud Validation - Managed validation sessions
  6. On Target Validation - User-managed validation with edgefirst-validator
  7. ModelPack Quantization - Converting ONNX to quantized TFLite
  8. Deploying to Embedded Targets - Model deployment workflow
  9. EdgeFirst Perception Middleware - Runtime inference stack
  10. Dataset Zoo - Available datasets for training
  11. Model Experiments Dashboard - Managing training and validation sessions