As a tribute to Cleopatra, whose glorious destiny ended in tragic snake circumstances, we are proud to release Codestral Mamba, a Mamba2 language model specialized in code generation, available under an Apache 2.0 license. Following the publication of the Mixtral family, Codestral Mamba is another step in our effort to study and provide new architectures. It is available for free use, modification, and distribution, and we hope it will open new perspectives in architecture research. Codestral Mamba was designed with help from Albert Gu and Tri Dao.
Unlike Transformer models, Mamba models offer the advantage of linear-time inference and the theoretical ability to model sequences of unbounded length. This allows users to engage with the model extensively, with quick responses irrespective of input length. Such efficiency is especially relevant for code productivity use cases, which is why we trained this model with advanced code and reasoning capabilities, enabling it to perform on par with SOTA Transformer-based models. We have tested Codestral Mamba on in-context retrieval capabilities up to 256k tokens. We expect it to be a great local code assistant!
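The source of that linear-time property can be sketched with a toy recurrence: a state-space-style model carries a fixed-size hidden state forward one token at a time, so per-token cost and memory stay constant no matter how long the prompt is, whereas attention must look back over an ever-growing cache. The snippet below is purely illustrative (a minimal linear recurrence, not Codestral Mamba's actual architecture); all names in it are hypothetical.

```python
import numpy as np

def recurrent_generate(tokens, A, B, C):
    """Toy linear state-space recurrence: h_t = A h_{t-1} + B x_t, y_t = C h_t.

    Each step touches only the fixed-size state h, so total work is O(n)
    in sequence length and memory stays O(1) beyond the state itself.
    """
    d_state = A.shape[0]
    h = np.zeros(d_state)          # fixed-size state, independent of len(tokens)
    outputs = []
    for x in tokens:               # constant work per token
        h = A @ h + B * x          # update the state in place
        outputs.append(C @ h)      # read out one output per token
    return outputs

rng = np.random.default_rng(0)
d = 4
A = 0.9 * np.eye(d)               # stable decay keeps the state bounded
B = rng.standard_normal(d)
C = rng.standard_normal(d)
ys = recurrent_generate([0.5, -1.0, 2.0], A, B, C)
print(len(ys))                    # one output per input token
```

The key observation is that doubling the prompt length doubles the loop count but leaves the per-step cost untouched, which is exactly the behavior the paragraph above describes.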
You can deploy Codestral Mamba using the mistral-inference SDK, which relies on the reference implementations from Mamba’s GitHub repository. The model can also be deployed through TensorRT-LLM. For local inference, keep an eye out for support in llama.cpp. You may download the raw weights from HuggingFace. For easy testing, we made Codestral Mamba available on la Plateforme (codestral-mamba-2407), alongside its big sister, Codestral 22B. While Codestral Mamba is available under the Apache 2.0 license, Codestral 22B is available under a commercial license for self-deployment or a community license for testing purposes.
The Advantages of Codestral Mamba
Codestral Mamba is an advance in the domain of code generation, aiming to provide a reliable and efficient tool for developers. The model's ability to handle long input sequences makes it well suited to complex coding environments, giving developers quick and continuous access to generated code. This efficiency not only mitigates productivity bottlenecks but also marks a significant step forward from its predecessors.
Deployment and Availability
Deploying this model is relatively straightforward. Codestral Mamba can be utilized through:
mistral-inference SDK: relies on the reference implementations from Mamba's GitHub repository.
TensorRT-LLM: an alternative path for deployment.
Raw Weights from HuggingFace: For those keen on further customization, the weights are readily downloadable.
These options give users flexibility in how they run the model, from local inference to broader, community-focused deployments. For local developers, the upcoming llama.cpp support will be a significant boon, enabling hands-on interaction with the model on their own hardware.
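As a concrete starting point, the HuggingFace route might look like the following shell sketch. The repository identifier below is an assumption (verify the exact name on the HuggingFace model card); these are hedged commands, not official instructions.

```shell
# Install the SDK and the HuggingFace client libraries (PyPI package names)
pip install mistral-inference huggingface_hub

# Download the raw weights; the repo id is an assumption --
# check the exact identifier on the HuggingFace model card.
huggingface-cli download mistralai/Mamba-Codestral-7B-v0.1 \
    --local-dir ./codestral-mamba
```

Downloading into a local directory this way keeps the weights in one predictable place, which simplifies pointing the inference SDK (or a TensorRT-LLM build) at them afterwards.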
Explore further details on how you can integrate and leverage Codestral Mamba on HuggingFace.
Licensing and Access
Codestral Mamba is available under the Apache 2.0 license, meaning it can be used, modified, and distributed freely. This level of access presents an excellent opportunity for startups and SMEs aiming to enhance their productivity tools without incurring hefty licensing fees. For more extensive commercial applications, however, Codestral 22B remains a more powerful, commercially licensed option.
This approach ensures that developers and companies can experiment with Codestral Mamba without the barrier of prohibitive costs, while also providing an upgrade path with Codestral 22B for those requiring a more robust solution.
Remember these 3 key ideas for your startup:
Efficiency in Code Generation: Codestral Mamba offers linear-time inference and can, in theory, handle unbounded sequence lengths, leading to significant productivity improvements in coding. This capability allows you to develop and deploy applications faster.
Flexible Deployment Options: From the mistral-inference SDK to deploying through TensorRT-LLM, and support for local inference in llama.cpp, Codestral Mamba offers versatile deployment methods suitable for various business needs.
Free and Open for Enhancements: Under the Apache 2.0 license, this model is accessible to startups and SMEs for free, providing the freedom to use, modify, and enhance as required, making it a cost-effective solution.
Edworking is the best and smartest decision for SMEs and startups looking to be more productive. Edworking is a FREE AI-powered productivity superapp that includes everything you need for work in one place, connecting Task Management, Docs, Chat, Videocall, and File Management. Save money today by not paying for Slack, Trello, Dropbox, Zoom, and Notion.
For more details, see the original source.