
Caltech spin-out PrismML launches 1-bit AI model that fits into 1.15 GB and runs on Apple devices

Bonsai 8B is designed for edge and on-device deployment, where cloud-dependent AI cannot reach. The company claims a 10x improvement in intelligence per gigabyte over full-precision rivals.

by Defused News Writer

A Caltech spin-out called PrismML has released Bonsai 8B, a large language model built on 1-bit architecture that fits into 1.15 GB of memory and is designed to run on Apple devices and Nvidia GPUs without a cloud connection.

The model targets edge and on-device deployment, where power constraints and limited bandwidth make conventional AI models impractical. Most large language models in production use 16-bit or 32-bit floating point representations for their weights, consuming substantially more memory. PrismML's approach strips that back to a single bit per weight, representing each value only as its sign, either minus one or plus one, with a shared scale factor applied across groups of weights.
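The sign-plus-shared-scale scheme described above can be sketched in a few lines of NumPy. This is an illustrative toy, not PrismML's actual implementation: the group size of 32 and the use of the mean absolute value as the shared scale are assumptions made for the example.

```python
import numpy as np

def quantize_1bit(weights, group_size=32):
    """Toy 1-bit quantization: keep only each weight's sign (±1),
    plus one shared scale per group of weights."""
    w = weights.reshape(-1, group_size)
    signs = np.where(w >= 0, 1.0, -1.0)             # 1 bit of information per weight
    scales = np.abs(w).mean(axis=1, keepdims=True)  # shared scale per group (assumed: mean |w|)
    return signs, scales

def dequantize_1bit(signs, scales):
    """Reconstruct approximate weights as sign x group scale."""
    return (signs * scales).ravel()

rng = np.random.default_rng(0)
w = rng.normal(size=128).astype(np.float32)
signs, scales = quantize_1bit(w)
w_hat = dequantize_1bit(signs, scales)
```

Every sign is preserved exactly, but all magnitudes within a group collapse to that group's scale, which is why naive 1-bit quantization historically degraded model quality.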

The company says this architecture avoids the problems that have historically made low-bit models unreliable, including poor instruction following and inconsistent tool use, issues that earlier quantization attempts struggled to resolve.

PrismML defines its own benchmark for evaluating the tradeoff between model size and capability, which it calls intelligence density. On that measure, Bonsai 8B scores 1.06 per gigabyte against 0.10 per gigabyte for Qwen3 8B, a full-precision model of equivalent parameter count, implying more than a tenfold improvement in what the company is calling useful reasoning per unit of memory.
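The arithmetic behind the tenfold claim follows directly from the two figures quoted above:

```python
# Figures quoted in the article; "intelligence density" is PrismML's own metric.
bonsai_density = 1.06  # score per gigabyte, Bonsai 8B (1-bit)
qwen_density = 0.10    # score per gigabyte, Qwen3 8B (full precision)

ratio = bonsai_density / qwen_density
print(round(ratio, 1))  # → 10.6, i.e. more than tenfold
```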

Babak Hassibi, founder and chief executive of PrismML, said the company spent years developing the mathematical theory needed to compress a neural network without degrading its reasoning capability. The white paper published alongside the release sets out the techniques and tradeoffs behind extreme quantization in detail.

Bonsai 8B runs natively on Apple hardware via MLX and on Nvidia GPUs via llama.cpp CUDA. Model weights are available now under the Apache 2.0 licence. PrismML has also published smaller variants at 4 billion and 1.7 billion parameters for more constrained deployments.

The recap

  • PrismML released the 1-bit Bonsai 8B large language model.
  • The model fits in 1.15 GB, roughly 14x smaller than a full-precision 16-bit model of the same parameter count.
  • Model weights are available today under the Apache 2.0 licence.
