Cohere's First Open-Source Agentic Coding Model
Cohere today launched North Mini Code, a 30B-parameter mixture-of-experts (MoE) model with only 3B active parameters. It is the company's first model built specifically for agentic coding workflows: code generation, terminal tasks, and autonomous software engineering. The model is released under an Apache 2.0 license, with weights available on Hugging Face.
Performance and Benchmarks
North Mini Code achieves competitive scores on standard software engineering benchmarks. It scores 33.4 on the Artificial Analysis Coding Index, and Cohere reports strong results on SWE-Bench Verified, SWE-Bench Pro, and Terminal Bench v2 in its internal testing. Compared to Devstral Small 2, North Mini Code delivers up to 2.8x higher output throughput under identical concurrency and hardware configurations. Its inter-token latency is 30% better than Devstral Small 2's, which means more consistent generation pacing. Time-to-first-token (TTFT) trails Devstral Small 2 slightly but remains close.
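The three serving metrics above are easier to compare once their definitions are pinned down. The following is a minimal sketch using common conventions, not Cohere's published measurement methodology. The function names and the timestamp layout are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StreamMetrics:
    ttft_s: float              # time from sending the request to the first token
    mean_inter_token_s: float  # average gap between consecutive tokens


def stream_metrics(sent_s: float, token_times_s: List[float]) -> StreamMetrics:
    """Per-request metrics from the arrival time of each output token."""
    if len(token_times_s) < 2:
        raise ValueError("need at least two tokens to measure inter-token latency")
    ttft = token_times_s[0] - sent_s
    gaps = [later - earlier
            for earlier, later in zip(token_times_s, token_times_s[1:])]
    return StreamMetrics(ttft, sum(gaps) / len(gaps))


def output_throughput(streams: List[Tuple[float, List[float]]]) -> float:
    """Aggregate output tokens per second across concurrent requests.

    Each stream is (request_sent_time, token_arrival_times).
    """
    total_tokens = sum(len(times) for _, times in streams)
    start = min(sent for sent, _ in streams)
    end = max(times[-1] for _, times in streams)
    return total_tokens / (end - start)


if __name__ == "__main__":
    # Invented timestamps (seconds) for two concurrent requests.
    streams = [(0.0, [0.5, 0.6, 0.7, 0.8]), (0.0, [0.4, 0.6, 0.8, 1.0])]
    for sent, times in streams:
        print(stream_metrics(sent, times))
    print(output_throughput(streams))
```

Under these definitions a model can lead on aggregate throughput and inter-token latency while trailing on TTFT, because TTFT depends only on the wait before the first token.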
Cohere evaluated the model with a ReAct harness on SWE-Bench and Terminal Bench, and with Terminus-2 on Terminal Bench Hard. Cohere notes that competitor scores were taken from publicly reported data or the Artificial Analysis Intelligence Index, supplemented by internal runs where a benchmark result was missing.
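A ReAct (reasoning and acting) harness alternates between a model-produced thought and action and a tool-produced observation until the model declares a final answer. The sketch below shows that general loop only. It is not Cohere's harness, and the transcript format, the `finish` action name, and the scripted stand-in for the model are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A step is (thought, action, argument). Real harnesses parse these
# fields out of model text; here the "model" returns them directly.
Step = Tuple[str, str, str]


def react_loop(model: Callable[[List[str]], Step],
               tools: Dict[str, Callable[[str], str]],
               task: str,
               max_steps: int = 8) -> str:
    """Run thought -> action -> observation cycles until the model finishes."""
    transcript = [f"Task: {task}"]
    for _ in range(max_steps):
        thought, action, arg = model(transcript)
        transcript.append(f"Thought: {thought}")
        if action == "finish":  # hypothetical name for the terminating action
            return arg
        tool = tools.get(action)
        observation = tool(arg) if tool else f"unknown tool: {action}"
        transcript.append(f"Action: {action}[{arg}]")
        transcript.append(f"Observation: {observation}")
    raise RuntimeError("step budget exhausted without a final answer")


def scripted_model(transcript: List[str]) -> Step:
    """Hypothetical stand-in for an LLM call: one tool use, then finish."""
    observations = [line for line in transcript
                    if line.startswith("Observation: ")]
    if not observations:
        return ("I should count the lines first.", "count_lines", "a\nb\nc")
    answer = observations[-1].split(": ", 1)[1]
    return ("I have the count.", "finish", answer)


if __name__ == "__main__":
    tools = {"count_lines": lambda text: str(len(text.splitlines()))}
    print(react_loop(scripted_model, tools, "How many lines are in the file?"))
```

In a benchmark setting the tools would typically be shell or file-editing commands, and the step budget caps how long an episode can run.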
Technical Specifications
- Model size: 30B total parameters, 3B active (MoE)
- Context length: 256K total context, 64K max generation
- License: Apache 2.0
- Hardware: Minimum 1× H100 at FP8
- Availability: Hugging Face (weights), Cohere API, Cohere Model Vault, OpenRouter
- Optimized for: Code generation, agentic software engineering, terminal tasks
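The context and generation limits above interact: a long prompt can leave less room for output than the 64K generation cap allows. The sketch below illustrates that budgeting under two assumptions the source does not confirm: that "256K" and "64K" mean 256,000 and 64,000 tokens, and that generated tokens count against the total context window. The function name is hypothetical.

```python
TOTAL_CONTEXT_TOKENS = 256_000   # assumption: "256K" read as 256,000 tokens
MAX_GENERATION_TOKENS = 64_000   # assumption: "64K" read as 64,000 tokens


def generation_budget(prompt_tokens: int,
                      requested: int = MAX_GENERATION_TOKENS) -> int:
    """Largest output length that fits both the generation cap and the window."""
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens >= TOTAL_CONTEXT_TOKENS:
        raise ValueError("prompt alone fills the context window")
    remaining = TOTAL_CONTEXT_TOKENS - prompt_tokens
    return min(requested, MAX_GENERATION_TOKENS, remaining)


if __name__ == "__main__":
    for prompt in (10_000, 200_000, 250_000):
        print(prompt, generation_budget(prompt))
```

For agentic sessions that accumulate tool output over many turns, a check like this indicates when the transcript needs trimming or summarizing before the next call.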
Agentic Coding Capabilities
North Mini Code is designed for agentic workflows. It can understand and orchestrate sub-agents, map system architecture, and run code reviews. Cohere specifically trained it for compatibility with OpenCode, but it works with most coding agents. The model is intended for on-premises or local deployment, giving developers control over their agentic coding infrastructure.
Deployment Options
You can download the weights from Hugging Face, deploy in a managed inference environment via Cohere Model Vault, or try it for free on OpenCode or with a Cohere API key. The model runs on a single H100 GPU at FP8 precision, which puts self-hosting within reach of many developer setups.
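A rough calculation shows why a single H100 at FP8 is plausible. In an MoE model all 30B parameters must normally be resident in memory even though only 3B are active per token, so the weight footprint follows the total count. The sketch below assumes the 80 GB H100 variant, uses decimal gigabytes, and ignores the KV cache, activations, and runtime overhead, so treat the headroom figure as an upper bound.

```python
BYTES_PER_PARAM = {"fp8": 1, "bf16": 2, "fp32": 4}
H100_MEMORY_GB = 80  # assumption: the 80 GB H100 variant


def weight_memory_gb(total_params: float, precision: str) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return total_params * BYTES_PER_PARAM[precision] / 1e9


def headroom_gb(total_params: float, precision: str,
                gpu_memory_gb: float = H100_MEMORY_GB) -> float:
    """Memory left for KV cache and activations after loading the weights."""
    return gpu_memory_gb - weight_memory_gb(total_params, precision)


if __name__ == "__main__":
    total_params = 30e9  # 30B total parameters, from the specifications
    for precision in ("fp8", "bf16"):
        print(precision,
              weight_memory_gb(total_params, precision),
              headroom_gb(total_params, precision))
```

At FP8 the weights take roughly 30 GB, leaving about 50 GB for the KV cache that long-context sessions need. At BF16 the same weights take roughly 60 GB, leaving far less room on one card.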
What's Next
North Mini Code is the first in Cohere's next generation of models. The company plans to expand capabilities based on community feedback. Developers are encouraged to try the model, share their builds on X, Discord, or Reddit, and help shape the roadmap.
Getting Started
To get started:
- Download weights from Hugging Face
- Deploy on Model Vault or use the Cohere API
- Try it with OpenCode or any coding agent
- Check the documentation for model specs, deployment guides, and cookbooks





