Jamba (language model)

Large language model
Jamba is an open-weights large language model (LLM) developed by AI21 Labs.[1][2][3] It is built on a novel hybrid architecture that combines the Mamba state space model (SSM) with transformer layers.[4][1][5] The model has 52 billion parameters and is trained using a mixture-of-experts (MoE) technique, with 12B parameters active per token.[2][1] Jamba supports a context window of up to 256K tokens, the largest of any Mamba-variant LLM to date, and can fit up to 140K tokens on a single 80 GB GPU.[2][4]
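The distinction between total and active parameters follows from how mixture-of-experts routing works: each token is dispatched to only a small subset of expert sub-networks, so far fewer weights are exercised per token than the model contains overall. A toy sketch of this idea (not AI21's implementation; the expert counts and sizes are made up for illustration):

```python
import random

class ToyMoELayer:
    """Illustrative mixture-of-experts layer: only the top_k experts
    chosen by the router run for any given token, so the 'active'
    parameter count per token is a fraction of the total (as in
    Jamba's 12B active out of 52B total)."""

    def __init__(self, n_experts=8, top_k=2, params_per_expert=1_000):
        self.n_experts = n_experts
        self.top_k = top_k
        self.params_per_expert = params_per_expert

    def total_params(self):
        # Every expert's weights exist in memory.
        return self.n_experts * self.params_per_expert

    def active_params(self):
        # But only top_k experts compute per token.
        return self.top_k * self.params_per_expert

    def route(self, token_scores):
        # Select the top_k experts with the highest router scores.
        ranked = sorted(range(self.n_experts),
                        key=lambda e: token_scores[e], reverse=True)
        return ranked[:self.top_k]

layer = ToyMoELayer()
scores = [random.random() for _ in range(layer.n_experts)]
chosen = layer.route(scores)
print(layer.total_params(), layer.active_params(), chosen)
```

With these toy numbers, the layer holds 8,000 parameters but each token touches only 2,000 of them; scaling the same ratio up is what lets Jamba serve 52B parameters at the compute cost of roughly 12B.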


Jamba performs well on key measures such as throughput and efficiency, matching or outperforming other state-of-the-art models in its class on a wide range of benchmarks while offering a significantly larger context window, enabling use cases that require extended context.[1][2] The model is released with open weights under the Apache 2.0 license.[6][5]

The company plans to release an instruction-tuned version of the model in beta on the AI21 Platform in the near future.[7]

Characteristics

  • Context window size: 256K tokens[7]
  • Parameters: 52 billion[7]
  • Architecture: hybrid Mamba (SSM) and transformer layers with mixture of experts (MoE)[7]

References

  1. "Introducing Jamba: AI21's Groundbreaking SSM-Transformer Model". www.ai21.com. Retrieved 2024-03-29.
  2. Kerner, Sean Michael (2024-03-28). "AI21 Labs juices up gen AI transformers with Jamba". VentureBeat. Retrieved 2024-03-29.
  3. Mawira, Benson (March 28, 2024). "Next-Generation AI System Promises Unprecedented Scalability". Cryptopolitan.
  4. "MLTimes - Time To Learn AI". mltimes.se. Retrieved 2024-03-29.
  5. AI21. "Unveiling Jamba: AI21's Groundbreaking Hybrid SSM-Transformer Open-Source Model". www.prnewswire.com. Retrieved 2024-03-29.
  6. "AI21 Labs enhances the capabilities of gen AI transformers through Jamba integration". Global Village Space | Technology. 2024-03-28. Retrieved 2024-03-29.


This article uses material from the Wikipedia article Jamba_(language_model), and is written by contributors. Text is available under a CC BY-SA 4.0 International License; additional terms may apply. Images, videos and audio are available under their respective licenses.