Developer(s) | Mosaic ML and Databricks team |
---|---|
Initial release | March 27, 2024 |
Repository | https://github.com/databricks/dbrx |
License | Databricks Open License |
Website | https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm |
DBRX is an open-source large language model (LLM) developed by the Mosaic ML team at Databricks and released on March 27, 2024. [1] [2] [3] It is a mixture-of-experts Transformer model with 132 billion parameters in total, of which 36 billion (4 out of 16 experts) are active for each token. [4] The model was released in two versions: a base foundation model and an instruction-tuned variant. [5]
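The mixture-of-experts idea described above can be sketched as follows: a small router scores all 16 experts per token, and only the top 4 actually run, so roughly 36B of the 132B parameters are active per token. This is a minimal illustrative sketch, not DBRX's actual implementation; the dimensions, router, and expert shapes here are toy assumptions.

```python
import numpy as np

# Toy mixture-of-experts routing: 16 experts, top 4 active per token.
# All names and sizes are illustrative assumptions, not DBRX internals.
rng = np.random.default_rng(0)

N_EXPERTS = 16   # total experts per MoE layer
TOP_K = 4        # experts active per token (as reported for DBRX)
D_MODEL = 8      # toy hidden size

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_forward(token, router_w, experts):
    """Route one token through its top-k experts and mix their outputs."""
    scores = token @ router_w              # (N_EXPERTS,) router logits
    top = np.argsort(scores)[-TOP_K:]      # indices of the 4 highest-scoring experts
    weights = softmax(scores[top])         # normalize weights over the chosen experts
    # Weighted sum of the selected experts' outputs; the other 12 experts stay idle,
    # which is why only a fraction of the total parameters is active per token.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

router_w = rng.standard_normal((D_MODEL, N_EXPERTS))
experts = rng.standard_normal((N_EXPERTS, D_MODEL, D_MODEL))
token = rng.standard_normal(D_MODEL)

out = moe_forward(token, router_w, experts)
print(out.shape)  # (8,)
```

The design trade-off is that total capacity (all 16 experts) can grow without a proportional increase in per-token compute, since each token only pays for the 4 experts it is routed to.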
DBRX outperforms other prominent open-source models such as Meta's LLaMA 2, Mistral AI's Mixtral, and xAI's Grok, as well as closed-source models such as GPT-3.5, on several benchmarks spanning language understanding, programming, and mathematics. [4] [6] [7] As of March 28, 2024, this made DBRX the most powerful open-source model available. [8]
It was trained over 2.5 months [8] on 3,072 Nvidia H100 GPUs connected by 3.2 terabytes per second of InfiniBand bandwidth, at a training cost of about $10 million. [1]
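A rough back-of-envelope check of the reported training figures (3,072 H100s, about 2.5 months, roughly $10 million) gives the implied GPU-hours and cost per GPU-hour. The 30-day month and the exact $10M total are assumptions for illustration, not figures from the source.

```python
# Back-of-envelope estimate of DBRX training compute from the reported numbers.
# Assumptions: 30-day months, $10M flat total cost (the source says "~$10m").
GPUS = 3072
MONTHS = 2.5
HOURS_PER_MONTH = 30 * 24      # assumed 30-day month
TOTAL_COST_USD = 10_000_000

gpu_hours = GPUS * MONTHS * HOURS_PER_MONTH
cost_per_gpu_hour = TOTAL_COST_USD / gpu_hours

print(f"{gpu_hours:,.0f} GPU-hours")         # 5,529,600 GPU-hours
print(f"${cost_per_gpu_hour:.2f}/GPU-hour")  # ~$1.81/GPU-hour
```

Under these assumptions the run works out to roughly 5.5 million GPU-hours at under $2 per GPU-hour, consistent with large-scale reserved-capacity H100 pricing at the time.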