Jamba (language model)
Large language model
Jamba is an open-weights large language model (LLM) developed by AI21 Labs.[1][2][3] It uses a hybrid architecture that combines Mamba, a novel state space model (SSM), with transformer layers.[4][1][5] It has 52 billion parameters and is trained with a mixture-of-experts (MoE) technique, so that only 12 billion parameters are active for any given token.[2][1] Jamba has a context window of 256K tokens, of which up to 140K tokens fit on a single 80 GB GPU, and it is the largest Mamba-variant LLM created.[2][4]
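The gap between total and active parameters follows from mixture-of-experts routing: each token is processed by only a few expert sub-networks rather than all of them. The following is a minimal illustrative sketch of that arithmetic; the shared-parameter count, number of experts, and experts routed per token are hypothetical round numbers, not figures reported for Jamba.

    # Illustrative sketch (not AI21's implementation): in a mixture-of-experts
    # model, a router activates only a few experts per token, so the parameters
    # used per token are far fewer than the model's total parameters.
    shared_params = 6e9                      # hypothetical: parameters every token uses
    num_experts = 16                         # hypothetical: experts per MoE layer
    experts_per_token = 2                    # hypothetical: experts routed to each token
    params_per_expert = 46e9 / num_experts   # hypothetical: expert parameters split evenly

    total_params = shared_params + num_experts * params_per_expert
    active_params = shared_params + experts_per_token * params_per_expert

    print(f"total: {total_params/1e9:.0f}B, active per token: {active_params/1e9:.2f}B")
    # total: 52B, active per token: 11.75B  (illustrative numbers only)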
Jamba performs well on key measures such as throughput and efficiency, and matches or outperforms other state-of-the-art models in its class on a wide range of benchmarks, while its significantly larger context window enables use cases that require long context.[1][2] The model is released with open weights under the Apache 2.0 license.[6][5]
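Because the weights are openly released, the model can in principle be loaded with standard open-source tooling. The sketch below assumes the checkpoint is published on Hugging Face under a repository id such as ai21labs/Jamba-v0.1; that id, and the hardware needed to run it, are assumptions rather than details from this article.

    # Minimal sketch of loading an open-weights checkpoint with the Hugging Face
    # Transformers library; the repository id below is an assumption.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "ai21labs/Jamba-v0.1"  # assumed repository id
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = "State space models combined with attention"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(output[0], skip_special_tokens=True))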
The company plans to release an instruction-tuned version in beta on the AI21 Platform in the near future.[7]