
Tutorial 4: Polars Dataframe

Analyze Coffee Cup annotations with Polars, either from an Arrow file exported by the CLI or through the native samples_dataframe API.

CLI equivalent:

edgefirst-client download-annotations <as-id> coffee_cup.arrow --groups val

Prerequisites

  • Complete Tutorial 3
  • Install the dependencies: pip install polars pandas pyarrow (pandas is optional)

Steps

1. Export annotations with the CLI (optional)

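import subprocess
from pathlib import Path

# Destination for the exported Arrow file.
arrow_path = Path("coffee_cup.arrow")

# Replace <as-id> with your own ID before running.
# check=True raises CalledProcessError if the CLI exits with a non-zero status.
subprocess.run([
    "edgefirst-client", "download-annotations",
    "<as-id>", str(arrow_path), "--groups", "val",
], check=True)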

Set SKIP_CLI_DOWNLOAD=1 to skip this step if the Arrow file already exists.
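
The snippet in step 1 always invokes the CLI. A minimal sketch of how the SKIP_CLI_DOWNLOAD guard could be wired in follows; the helper name should_skip_download is hypothetical, and the exact check used in the full script may differ.

```python
import os
from pathlib import Path


def should_skip_download(arrow_path: Path, env=None) -> bool:
    """Return True when SKIP_CLI_DOWNLOAD=1 is set and the file exists.

    Hypothetical helper for illustration; not part of edgefirst-client.
    """
    env = os.environ if env is None else env
    # Only skip when the flag is set AND there is a file to reuse.
    return env.get("SKIP_CLI_DOWNLOAD") == "1" and arrow_path.exists()
```

With this in place, the subprocess.run call from step 1 would sit under `if not should_skip_download(arrow_path):`, so a missing file still triggers a download even when the variable is set.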

2. Load Arrow in Polars

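import polars as pl

# Read the Arrow IPC file written in step 1.
df_cli = pl.read_ipc(arrow_path)
print(f"CLI Arrow: {df_cli.shape[0]} rows")

# Count rows per label (most frequent first) when a label column is present.
if "label" in df_cli.columns:
    counts = df_cli.group_by("label").len().sort("len", descending=True)
    print(counts.head(5))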

3. Use the native API

from examples import COFFEE_CUP_DATASET_ID, get_client

client = get_client()
dataset = client.dataset(COFFEE_CUP_DATASET_ID)
# Use the first annotation set returned for the dataset.
annotation_set_id = client.annotation_sets(dataset.id)[0].id

# Request the "val" group, matching --groups val in the CLI export.
df_api = client.samples_dataframe(
    dataset.id, annotation_set_id, ["val"], [], None,
)
print(f"API samples_dataframe: {df_api.shape[0]} rows")

See the EdgeFirst Dataset Format schema for column definitions.

Source

Full script: 04_polars_dataframe.py

