PrismML Unveils Groundbreaking Ternary Bonsai Model Family
In an impressive stride towards efficient artificial intelligence, PrismML, known for its pioneering role in high-performance AI, has introduced the Ternary Bonsai model family. The family comprises three cutting-edge large language models with 8 billion, 4 billion, and 1.7 billion parameters. Their innovation lies in a novel architecture that represents each weight using just 1.58 bits.
A New Era in AI Efficiency
PrismML continues to lead the industry, building upon their recent introduction of the world's first commercially viable 1-bit models. The Ternary Bonsai models epitomize the next level of intelligence density—the capability to deliver substantial reasoning performance while utilizing minimal memory, computational resources, and energy. This aspect significantly broadens the landscape for deploying AI in an array of environments, from laptops and edge devices to extensive datacenter infrastructures.
Insights from the Visionary
According to Babak Hassibi, the CEO and Founder of PrismML and a professor at Caltech, "Intelligence density serves as a pivotal metric for next-generation AI. The Ternary Bonsai models provide enhanced reasoning capabilities while greatly reducing memory and energy consumption, facilitating powerful AI applications even in previously unsuitable environments."
Understanding Ternary Bonsai Models
The Ternary Bonsai model family leverages a ternary weight representation, whereby each weight can adopt one of three distinct values: -1, 0, or +1. Since a three-valued symbol carries log2 3 ≈ 1.58 bits of information, this works out to just 1.58 bits per weight, resulting in substantially reduced memory footprints compared to traditional 16-bit models:
| Model Size | Memory Requirement | Standard 16-bit Model |
| ----- | ----- | ------- |
| Ternary Bonsai 8B | ~1.75 GB | ~16.4 GB |
| Ternary Bonsai 4B | ~0.86 GB | ~8 GB |
| Ternary Bonsai 1.7B | ~0.37 GB | ~3.4 GB |
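The figures in the table can be sanity-checked with a back-of-envelope estimate. This is a sketch, not PrismML's published packing scheme: the 128-weight group size and fp16 per-group scales are assumptions, and packing overhead is ignored.

```python
import math

def ternary_weight_gb(n_params: float, group_size: int = 128) -> float:
    """Rough weight-memory estimate for a ternary (1.58-bit) model.

    Assumes log2(3) bits per weight plus one fp16 scale per group of
    `group_size` weights (both are assumptions, not published details).
    """
    weight_bytes = n_params * math.log2(3) / 8   # ~1.585 bits per weight
    scale_bytes = n_params / group_size * 2      # one fp16 scale per group
    return (weight_bytes + scale_bytes) / 1e9

def fp16_weight_gb(n_params: float) -> float:
    """Baseline: 2 bytes per weight at 16-bit precision."""
    return n_params * 2 / 1e9

print(f"8B ternary ≈ {ternary_weight_gb(8e9):.2f} GB "
      f"vs {fp16_weight_gb(8e9):.1f} GB fp16")  # ≈ 1.71 GB vs 16.0 GB
```

This lands close to the ~1.75 GB and ~16.4 GB quoted in the table; the small gaps plausibly come from packing overhead or parameters not modeled in this sketch.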
This advancement translates to roughly a 9x reduction in memory usage across all model sizes, underscoring the efficiency of PrismML's offerings.
Additionally, in comparison to the previous 1-bit Bonsai 8B model, the new Ternary Bonsai 8B provides an average improvement of 5 benchmark points while only requiring around 600 MB of added memory. Such efficiency enables these models to outperform many leading full-precision models in various standard benchmarks within their respective parameter classes.
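The ~600 MB figure is consistent with simple arithmetic: moving from 1 bit to log2 3 ≈ 1.585 bits per weight adds about 0.585 bits per weight, which at 8 billion weights is roughly 585 MB (a sketch assuming weight storage dominates and decimal megabytes).

```python
import math

PARAMS = 8e9
one_bit_mb = PARAMS * 1.0 / 8 / 1e6           # 1 bit per weight
ternary_mb = PARAMS * math.log2(3) / 8 / 1e6  # ~1.585 bits per weight
added_mb = ternary_mb - one_bit_mb
print(f"added memory ≈ {added_mb:.0f} MB")    # ≈ 585 MB
```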
Consistent Low Precision
The Ternary Bonsai models commit fully to their end-to-end ternary architecture, with no fallback to higher precision. Every layer, including embeddings, attention layers, multi-layer perceptrons (MLPs), and the language model (LM) head, uses the 1.58-bit representation. A group-wise quantization scheme governs the process: within each group, weights are restricted to the three values {-s, 0, +s} for a shared per-group scale s, and are stored as the codes -1, 0, and +1.
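One illustrative implementation of such a group-wise scheme is sketched below, using the absolute mean of each group as its scale. PrismML has not published these exact details; the group size, the absmean scale, and the rounding rule are all assumptions for illustration.

```python
import numpy as np

def ternary_quantize(w: np.ndarray, group_size: int = 128):
    """Group-wise ternary quantization: each group of weights maps to
    {-s, 0, +s}, stored as int8 codes in {-1, 0, +1} plus one scale s
    per group. Uses the absolute mean of the group as the scale (one
    common choice; the real scheme may differ)."""
    groups = w.reshape(-1, group_size)
    s = np.abs(groups).mean(axis=1, keepdims=True) + 1e-8  # per-group scale
    codes = np.clip(np.round(groups / s), -1, 1).astype(np.int8)
    return codes, s

def ternary_dequantize(codes: np.ndarray, s: np.ndarray) -> np.ndarray:
    """Reconstruct approximate weights: every value is -s, 0, or +s."""
    return codes * s

rng = np.random.default_rng(0)
w = rng.normal(size=512).astype(np.float32)
codes, s = ternary_quantize(w)
print(sorted(np.unique(codes).tolist()))  # codes drawn from {-1, 0, 1}
```

With a suitable encoding, such as packing five ternary codes into one byte, storage comes to 1.6 bits per weight, close to the information-theoretic minimum of log2 3 ≈ 1.585 bits.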
Applications from Edge to Datacenter
The immediate advantages of the Ternary Bonsai family resonate within on-device AI operations, demanding minimal hardware needs for maximum efficiency. However, the influence of these models transcends edge technologies. Datacenters, too, stand to gain significantly—these newfound efficiencies will optimize hardware utilization, reduce operational expenses, and drastically lower energy consumption.
As the costs associated with AI infrastructure continue to escalate, intelligence density is quickly becoming not just a differentiator but a crucial strategic priority for the industry in the years to come.
Availability and Licensing
Beginning today, developers, researchers, and any interested users can freely download the Ternary Bonsai model family under the Apache 2.0 license.
For those eager to explore or utilize these state-of-the-art models, download links for the Ternary Bonsai models are now live.
About PrismML
PrismML is a forward-thinking artificial intelligence firm based in the U.S. and focused on enhancing the efficiency and accessibility of AI technologies. The company is built from proprietary intellectual property developed at Caltech and has garnered backing from notable investors, including Khosla Ventures and Cerberus Ventures, along with computational grants from Google and Caltech. To learn more about PrismML, you can visit their website or connect with them on LinkedIn or X.