mlx-triage

What it does

CLI tool for preflight validation of MLX models on Apple Silicon. Answers “is this model structurally sound before I benchmark it?” by checking architecture compatibility, weight shapes, chat templates, and quantization formats.

Architecture / Key capabilities

  • Architecture compatibility checks: Validates that the model architecture is supported by the target MLX runtime before any inference attempt
  • Weight shape validation: Inspects tensor dimensions and layer structure to catch shape mismatches that would cause silent failures or crashes during inference
  • Chat template verification: Confirms chat templates are present, well-formed, and compatible with the intended serving configuration
  • Quantization format checks: Validates quantization metadata and format compatibility so operators know whether a quantized model will load correctly
  • Practitioner-facing CLI: Designed for the person who downloads models from Hugging Face and needs a fast structural sanity check before committing to a full benchmark run
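
To make the checks above concrete, here is a minimal sketch of what a structural preflight pass could look like. It is not mlx-triage's actual implementation or API. The function names (preflight, read_safetensors_shapes), the SUPPORTED_MODEL_TYPES allow-list, and the specific rules are all hypothetical. It assumes a Hugging Face style model directory: config.json with model_type and hidden_size, an optional quantization block with bits and group_size, a chat_template key in tokenizer_config.json, and weights in .safetensors files whose header is an 8-byte little-endian length followed by JSON.

```python
import json
import struct
from pathlib import Path

# Hypothetical allow-list for illustration; a real tool would derive this
# from the installed runtime rather than hard-coding it.
SUPPORTED_MODEL_TYPES = {"llama", "qwen2", "mistral"}


def read_safetensors_shapes(path):
    """Return {tensor_name: shape} from a .safetensors header without loading weights."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64 header size
        header = json.loads(f.read(header_len))
    return {k: v["shape"] for k, v in header.items() if k != "__metadata__"}


def preflight(model_dir):
    """Collect human-readable problems for a model directory (empty list = no findings)."""
    model_dir = Path(model_dir)
    config_path = model_dir / "config.json"
    if not config_path.is_file():
        return ["config.json is missing"]
    config = json.loads(config_path.read_text())
    problems = []

    # Architecture compatibility: is the declared model_type on the allow-list?
    model_type = config.get("model_type")
    if model_type not in SUPPORTED_MODEL_TYPES:
        problems.append(f"unsupported model_type: {model_type!r}")

    # Quantization metadata (assumed layout: {"quantization": {"bits": .., "group_size": ..}}).
    quant = config.get("quantization")
    if quant is not None:
        if not isinstance(quant, dict):
            problems.append("quantization block is not an object")
        else:
            for key in ("bits", "group_size"):
                if not isinstance(quant.get(key), int):
                    problems.append(f"quantization.{key} missing or not an integer")

    # Chat template: only presence is checked here, not well-formedness.
    tok_path = model_dir / "tokenizer_config.json"
    tok_config = json.loads(tok_path.read_text()) if tok_path.is_file() else {}
    if not tok_config.get("chat_template"):
        problems.append("chat_template is missing or empty")

    # Weight shapes: for unquantized weights, the embedding width should equal
    # hidden_size. Skipped when quantized, since packed weights change the last dim.
    hidden = config.get("hidden_size")
    for st_file in sorted(model_dir.glob("*.safetensors")):
        for name, shape in read_safetensors_shapes(st_file).items():
            if quant is None and name.endswith("embed_tokens.weight") and shape[-1] != hidden:
                problems.append(f"{name}: last dim {shape[-1]} != hidden_size {hidden}")

    return problems
```

Calling preflight on a downloaded model directory and printing each returned string gives a quick pass/fail view. Reading only the safetensors header keeps the pass fast, because no tensor data is loaded into memory.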

Key numbers

MISSING: model count, validation pass/fail rates

Current phase

Workstream B artifacts live: model-directory skeleton at docs/model-directory.md. Core priority: residual Qwen-specific Tier 2.1 heterogeneous-length batch divergence (MLX-001).

Status

Active. Next milestones: MLX-002 expand model-directory detail pages, MLX-003 README rewrite around preflight + validation story, MLX-004 refresh validation-results.md, MLX-005 decide release scope (v0.2.1 vs v0.3.0).

MISSING: Repository URL