Liquid AI's Tiny 230M Model Dominates Tool Calling Benchmarks Against Models 4x Its Size

A 230-million-parameter model just scored 43.26 on the BFCLv3 tool-use benchmark, beating Google's 1-billion-parameter Gemma 3 1B IT (16.61) and IBM's Granite 4.0-350M (39.58). That's not a typo. Liquid AI's LFM2.5-230M proves that for structured tool calling and data extraction, parameter count is a poor proxy for performance.

Why 230M parameters matter for edge AI

Liquid AI, founded by former MIT computer scientists, packed 19 trillion tokens of pre-training into a 230-million-parameter footprint. The model supports a 32K context window and runs entirely on-device. On a Samsung Galaxy S25 Ultra with a Qualcomm Snapdragon Gen4 CPU, it decodes at 213 tokens per second. Even a Raspberry Pi 5 manages 42 tokens per second. The memory footprint is under 400MB.
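
The sub-400MB figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below estimates weight storage alone at a few common precisions. The article does not say which precision Liquid AI measured, and activations, cache, and runtime overhead are ignored, so the bit widths here are assumptions chosen for illustration.

```python
def weight_memory_mb(num_params: int, bits_per_weight: float) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_weight / 8 / 1e6


PARAMS = 230_000_000  # parameter count quoted in the article

# Assumed precisions; the source does not state which one was used.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_mb(PARAMS, bits):.0f} MB")
```

At 16 bits the weights alone would come to roughly 460MB, so a sub-400MB footprint suggests some form of reduced precision, though that is an inference and not something the source states.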

The architecture behind this efficiency is the LFM2 framework, a hybrid system that interleaves gated short-range convolutions with grouped-query attention. This avoids the quadratic memory costs of pure attention mechanisms while handling long contexts and sequential data reliably on edge hardware.
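
To make the "gated short-range convolution" idea concrete, here is a toy sketch in plain Python. It is not Liquid AI's implementation: the function name, the scalar sequence, and the gate being passed in directly are all simplifying assumptions. It only illustrates why such a layer is cheap: each output reads a fixed number of recent inputs, so work grows linearly with sequence length instead of comparing every position with every other one.

```python
def gated_short_conv(x, kernel, gate):
    """Causal short convolution followed by an elementwise gate.

    x:      input sequence (list of floats)
    kernel: filter taps; kernel[j] weights the input j steps in the past
    gate:   per-position multipliers, same length as x (hypothetical values)
    """
    if len(gate) != len(x):
        raise ValueError("gate and x must have the same length")
    out = []
    for t in range(len(x)):
        # Only the current input and the previous len(kernel) - 1 inputs are read.
        acc = sum(kernel[j] * x[t - j] for j in range(len(kernel)) if t - j >= 0)
        out.append(gate[t] * acc)
    return out


x = [1.0, 2.0, 3.0, 4.0]
y = gated_short_conv(x, kernel=[0.5, 0.5], gate=[1.0, 1.0, 1.0, 0.0])
print(y)  # [0.5, 1.5, 2.5, 0.0]
```

The last position shows the gate at work: the convolution produces 3.5 there, and a gate of 0.0 suppresses it entirely.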

Benchmarks: where tiny dominates

LFM2.5-230M is not built for math or creative writing. Liquid AI is upfront about that. But in its target domains, the model punches well above its weight class. On CaseReportBench for data extraction, it scores 22.51, well ahead of Qwen3.5-0.8B (Instruct). On BFCLv3 tool use, it beats IBM's 350M model (43.26 versus 39.58) and outscores Google's 1B-parameter Gemma 3 1B IT (16.61) by a factor of about 2.6.

Compare that with 3-billion-parameter models such as Weibo's VibeThinker-3B, which scores 94.3 on AIME 2026 but is over 10x larger. LFM2.5-230M is a scalpel for agentic pipelines, not a sledgehammer for reasoning.
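
The BFCLv3 comparison comes down to two ratios: how much higher the score is, and how much smaller the model is. The short sketch below recomputes both from the scores quoted in this article. The parameter counts are nominal figures taken from the model names (1B is treated as 1,000M), which is an assumption, not a measured size.

```python
# BFCLv3 scores as quoted in this article.
bfcl_v3 = {
    "LFM2.5-230M": 43.26,
    "Granite 4.0-350M": 39.58,
    "Gemma 3 1B IT": 16.61,
}
# Nominal parameter counts in millions (assumption: read off the model names).
params_m = {"LFM2.5-230M": 230, "Granite 4.0-350M": 350, "Gemma 3 1B IT": 1000}


def ratios(baseline: str, other: str) -> tuple[float, float]:
    """Return (score ratio, size ratio) of baseline versus other."""
    score_ratio = bfcl_v3[baseline] / bfcl_v3[other]
    size_ratio = params_m[other] / params_m[baseline]
    return score_ratio, size_ratio


for rival in ("Granite 4.0-350M", "Gemma 3 1B IT"):
    score_x, size_x = ratios("LFM2.5-230M", rival)
    print(f"{rival}: {score_x:.2f}x the score, rival is {size_x:.2f}x larger")
```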

Architecture and deployment: LFM2 framework and on-device skills

Liquid AI demonstrated the model running on a Unitree G1 humanoid robot using the onboard NVIDIA Jetson Orin compute module. The model takes free-form instructions like "Hold still for 2 seconds, then walk forward at 1 meter per second for 3 meters" and translates them into structured multi-step plans calling pre-trained skills from NVIDIA's SONIC framework.
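
What a "structured multi-step plan" might look like is easiest to see in code. The sketch below is purely illustrative: the skill names, argument names, and JSON layout are hypothetical and are not taken from NVIDIA's SONIC framework or from Liquid AI's demo. It shows the general pattern of checking a model's output against a fixed registry of skills before anything is sent to the robot.

```python
import json

# Hypothetical skill registry: names and arguments are illustrative only,
# not the actual SONIC skill set.
SKILLS = {
    "hold_still": {"duration_s"},
    "walk_forward": {"speed_mps", "distance_m"},
}


def validate_plan(plan_json: str) -> list:
    """Parse model output and check every step against the registry."""
    plan = json.loads(plan_json)
    for i, step in enumerate(plan):
        name = step.get("skill")
        if name not in SKILLS:
            raise ValueError(f"step {i}: unknown skill {name!r}")
        args = step.get("args", {})
        if set(args) != SKILLS[name]:
            raise ValueError(
                f"step {i}: expected args {sorted(SKILLS[name])}, got {sorted(args)}"
            )
    return plan


# Assumed output format for the instruction quoted above.
model_output = """
[
  {"skill": "hold_still", "args": {"duration_s": 2.0}},
  {"skill": "walk_forward", "args": {"speed_mps": 1.0, "distance_m": 3.0}}
]
"""

plan = validate_plan(model_output)
for step in plan:
    print(step["skill"], step["args"])
```

Validating against a closed registry is one reasonable design choice for a small model in this role: a malformed or invented skill call is rejected up front instead of reaching the hardware.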

The base and post-trained models are available on Hugging Face with day-one support for llama.cpp (GGUF), MLX, vLLM, SGLang, and ONNX. They are released under the LFM Open License v1.0, which is free for individuals and for companies with under $10M in annual revenue. Above that threshold, enterprises need a paid commercial agreement.

LFM2.5-230M proves that for structured tool calling and data extraction on constrained hardware, the smallest model in the room is often the smartest choice.

Source: Liquid AI's smallest model yet LFM2.5-230M beats models 4X its size at data extraction, can run 'anywhere'
Domain: venturebeat.com
