~/swaylen

Posts tagged with "benchmarking"

    Benchmarking UI Detection on ScreenSpot-Pro
    How we evaluated uitag against 1,581 annotations across 26 professional macOS applications — methodology, results, and what the numbers actually mean.
    GUI-Specialized Apple Silicon VLM Matrix
    Which vision-language models actually work for UI tasks on M-series chips — tested configurations, latency numbers, and the models worth your time.
    Apple Silicon VLM Benchmark Roundup
    A short public narrative covering what we tested, what we found, and what you should run if you're doing local multimodal inference on M-series hardware.
© 2026 • ~/swaylen 🔬