Technology

Ollama 0.22.1 brings Gemma 4 tool calling to your laptop without an API key

Susan Hill

Ollama 0.22.1 ships an updated Gemma 4 model renderer that finally supports the two capabilities that matter most for serious local AI work: explicit thinking mode and function/tool calling. Tool calling lets the model decide when to invoke an external function — fetch a webpage, query a database, run a calculation — and parse the result back into its own reasoning. Thinking mode exposes the model's intermediate steps so an application can capture and act on them. Until now, both were features the major cloud APIs charged for. Both now run locally against Gemma 4 with no external service involved.
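The application side of that round trip is simple to picture: the model emits a structured request naming a function and its arguments, the application executes it, and the result goes back into the conversation. A minimal sketch, using only the standard library and assuming tool calls arrive as JSON objects with "name" and "arguments" fields (the shape Ollama's chat API uses); the model invocation itself is elided here, and the example tool is hypothetical:

```python
import json

# A tool the model is allowed to call. It would be described to the
# model in the chat request; it is dispatched here when the model
# asks for it.
def get_word_count(text: str) -> int:
    """Count whitespace-separated words in a string."""
    return len(text.split())

TOOLS = {"get_word_count": get_word_count}

def dispatch(tool_call: dict) -> str:
    """Execute one tool call of the form
    {"name": ..., "arguments": {...}} and return the result as a
    JSON string to feed back to the model as a tool message."""
    fn = TOOLS[tool_call["name"]]
    result = fn(**tool_call["arguments"])
    return json.dumps({"result": result})

# Simulated model output: in a real run this object comes from the
# tool_calls field of the chat response.
call = {"name": "get_word_count",
        "arguments": {"text": "tool calling runs locally now"}}
print(dispatch(call))  # → {"result": 5}
```

The same dispatch loop works whether the tool-call JSON comes from a cloud API or a local Ollama server; only the transport changes.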

The reason this lands harder than another model release is the math on hardware. The Gemma 4 family Google published under an Apache-2.0 license covers four sizes: E2B, E4B, 26B A4B, and 31B. The smaller variants run on a recent laptop with integrated graphics and twelve to sixteen gigabytes of RAM. The 26B A4B and 31B versions need a desktop GPU but stay well within consumer territory. The same architecture that used to require a paid API contract or a four-figure home server is now a Saturday-afternoon install for anyone with a reasonably modern machine.
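The hardware math above can be roughed out from first principles: a model's weight footprint is roughly parameter count times bits per weight, divided by eight. This is a back-of-envelope sketch that ignores KV cache, activations, and runtime overhead, and the quantization levels shown are illustrative, not figures from the release:

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Rough weight-only memory footprint in decimal GB.
    Ignores KV cache, activations, and runtime overhead."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Compare a laptop-class and a desktop-class size at two common
# quantization levels.
for params, label in [(4, "E4B"), (31, "31B")]:
    for bits in (4, 8):
        gb = weight_footprint_gb(params, bits)
        print(f"{label} @ {bits}-bit: ~{gb:.1f} GB")
```

At 4-bit quantization the E4B weights land around 2 GB, comfortably inside a 12-to-16 GB laptop, while the 31B weights land around 15.5 GB, which is why that size wants a desktop GPU.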

The practical consequence for non-developers is that a class of agent applications — the kind that reads your email, drafts replies, fetches documents, fills forms, summarizes meetings — no longer has to send any of that data to a third-party server. A privacy-conscious user who wanted real agent automation had two options before: trust a cloud provider’s data policy, or run a far weaker model locally without tool calling. The middle ground was a gap, and Ollama 0.22.1 closes it for the Gemma 4 weight class.

The skeptical reading is that Ollama plus Gemma 4 is no substitute for the cloud frontier. A locally hosted 31B model is not as capable as Anthropic's Claude or OpenAI's GPT-5 at complex reasoning. Tool-call accuracy on long chains is meaningfully worse on the smaller variants. Multimodal inputs work, but more slowly. And the integration burden falls on the user: nobody has yet built a polished Gemma 4 plus Ollama agent app that competes with a finished SaaS workflow. The hardware ceiling and the software polish are both still real gaps.

The release is available now through Ollama’s standard installer for macOS, Linux, and Windows. The Gemma 4 weights are hosted on Ollama’s model library under the gemma4 namespace, and the runtime change in 0.22.1 applies automatically to any size once pulled.
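For readers who want to try it, the pull-and-run flow is two commands. The model tags below are illustrative guesses, not confirmed names from the release; check the gemma4 namespace in Ollama's model library for the exact tags:

```shell
# Fetch a Gemma 4 variant from Ollama's model library
# (tag is illustrative -- verify against the gemma4 namespace)
ollama pull gemma4:e4b

# Run a one-off prompt against it locally
ollama run gemma4:e4b "Summarize this release in one sentence."
```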
