ModelPack Quantization
To run a ModelPack Float32 ONNX model trained in EdgeFirst Studio on an embedded platform such as an i.MX 8M Plus EVK, the model must be quantized to INT8 or UINT8 and converted to the TFLite format. The steps below describe this process.
ONNX to TFLite
Follow this tutorial to convert your ModelPack Float32 ONNX model into a quantized TFLite model. If you do not have a model available, you can download the sample model coffeecup-modelpack-multitask-t-1f54.onnx, which is used throughout this tutorial.
The steps for this conversion process are shown below.
```mermaid
%%{init: {"flowchart": {"defaultRenderer": "elk"}} }%%
flowchart LR
    onnx[\Float ONNX model\]
    tensorflow[\Float Saved Model\]
    tflite[\Quantized TFLite\]
    onnx2tf[onnx2tf]
    converter[TFLite Converter]
    onnx --> onnx2tf --> tensorflow --> converter --> tflite
```
- Open a command prompt on your PC.

- Install the following required dependencies.
Tip

You can download the file requirements.txt that lists the required dependencies and run `pip install -r requirements.txt` to install these packages. Otherwise, run each package installation line by line as shown below.

```
$ pip install onnx2tf==1.28.2
$ pip install tf_keras==2.19.0
$ pip install onnx==1.18.0
$ pip install onnx_graphsurgeon==0.5.8
$ pip install psutil==7.0.0
$ pip install ai-edge-litert==1.4.0
$ pip install sng4onnx==1.0.4
$ pip install tensorflow==2.19.1
$ pip install opencv-python==4.12.0.88
$ pip install numpy==2.1.3
```

These versions of the libraries were tested.

| Package | Version |
| --- | --- |
| onnx2tf | 1.28.2 |
| tf_keras | 2.19.0 |
| onnx | 1.18.0 |
| onnx_graphsurgeon | 0.5.8 |
| psutil | 7.0.0 |
| ai-edge-litert | 1.4.0 |
| sng4onnx | 1.0.4 |
| tensorflow | 2.19.1 |
| opencv-python | 4.12.0.88 |
| numpy | 2.1.3 |
- Export the ONNX model to a TensorFlow saved model.

```
onnx2tf -i path/to/mymodel.onnx -o model_tf --non_verbose
```
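Optionally, you can confirm that the export succeeded before quantizing by loading the saved model and printing its serving signatures. This is a minimal sketch, assuming the model_tf output directory created by the previous command.

```python
# Optional check: load the exported saved model and print its serving
# signatures to confirm the input names and shapes before quantization.
import tensorflow as tf

model = tf.saved_model.load("model_tf")
for name, fn in model.signatures.items():
    print(name, fn.structured_input_signature)
```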
- Run the TFLite converter script below using TensorFlow with the command `python3 converter.py`.

Download the Python script

Download the Python script by clicking on the link above.
Prepare a set of images

This process also requires sample images used during quantization. You can click on the link and download this set of coffee cup images for quantizing the coffee cup model used in this tutorial. Unzip this file into a directory.
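As a quick sanity check, you can confirm the unzipped images are visible through the same glob pattern the converter script uses. This is a minimal sketch, assuming the archive was extracted into a coffeecup directory next to the script; adjust the pattern if you extracted it elsewhere.

```python
# Sanity check: confirm the quantization sample images can be found and decoded.
# The "coffeecup/*.jpg" pattern is an assumption matching the converter script below.
import glob
import cv2

images = glob.glob("coffeecup/*.jpg")
print(f"Found {len(images)} calibration images")

for path in images[:3]:
    image = cv2.imread(path)
    assert image is not None, f"Failed to read {path}"
    print(path, image.shape)
```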
Modify the file paths

In this script, the paths to the images and the model are set to the following. Also ensure that the model input shape is set to the correct dimensions.

```python
model_path = "model_tf"  # Path to the TensorFlow saved model.
images_path = "coffeecup/*.jpg"  # Conversion requires image samples for quantization.
input_shape = (480, 270)  # Model (width, height) input shape.
output_path = "coffeecup-modelpack-multitask-t-1f54.tflite"  # Path to save the TFLite model.
```

Make sure to modify these paths to match your setup.
```python
import tensorflow as tf
import numpy as np
import glob
import cv2

model_path = "model_tf"  # Path to the TensorFlow saved model.
images_path = "coffeecup/*.jpg"  # Conversion requires image samples for quantization.
input_shape = (480, 270)  # Model (width, height) input shape.
output_path = "coffeecup-modelpack-multitask-t-1f54.tflite"  # Path to save the TFLite model.


def representative_data_gen():
    images = glob.glob(images_path)
    for image in images:
        image = cv2.imread(image)
        image = cv2.resize(image, input_shape)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image = image.astype(np.float32)
        image = image / 255.0
        image = np.expand_dims(image, axis=0)
        yield [image]


converter = tf.lite.TFLiteConverter.from_saved_model(model_path)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()

with open(output_path, "wb") as f:
    f.write(tflite_model)
```
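After the conversion completes, you can confirm that the exported model is fully quantized by inspecting its input and output tensors with the TensorFlow Lite interpreter. This is a minimal sketch; the file name matches the output_path used above.

```python
# Verify the quantized model: the input and output tensors should be uint8.
import tensorflow as tf

interpreter = tf.lite.Interpreter(
    model_path="coffeecup-modelpack-multitask-t-1f54.tflite")
interpreter.allocate_tensors()

for detail in interpreter.get_input_details():
    print("input:", detail["shape"], detail["dtype"], detail["quantization"])
for detail in interpreter.get_output_details():
    print("output:", detail["shape"], detail["dtype"], detail["quantization"])
```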
Next Steps
Once you have converted your model to a quantized TFLite model, you can verify its performance by Running the Quantized ModelPack in the i.MX 8M Plus EVK's NPU.