Deep Cogito Unveils First Open Source AI Models, Quickly Rising To Popularity

Deep Cogito: A New Player in AI

Deep Cogito, a startup based in San Francisco, has officially launched its first line of open-source large language models (LLMs), called Cogito v1. The models are built by fine-tuning existing open-weight models such as Meta’s Llama 3.2 and add hybrid reasoning capabilities. In practice, this means they can either answer quickly or spend extra time reflecting on a problem before answering, in the manner of OpenAI’s “o” series and DeepSeek R1.

Goals and Mission

The primary aim of Deep Cogito is to push AI beyond the limits imposed by its current dependence on human supervision. The company aspires to create models that continuously improve their own reasoning strategies, moving towards superintelligence, which it defines as AI that is smarter than humans in all fields. The company has also committed to making all of its models open-source, promoting collaboration and innovation in the AI community.

Drishan Arora, CEO and co-founder, has a notable background; he previously worked as a Senior Software Engineer at Google, where he was responsible for large language model development for their generative search product. He stated that the Cogito models are among the most powerful open models available today, surpassing others in their category.

Model Specifications

Deep Cogito has launched five model sizes, distinguished by parameter count, to suit different computational budgets:

3 billion parameters
8 billion parameters
14 billion parameters
32 billion parameters
70 billion parameters
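
As a rough guide to what these sizes mean for hardware, the sketch below estimates the memory needed just to hold each model's weights. It assumes 2 bytes per parameter for 16-bit weights and 0.5 bytes per parameter for 4-bit quantization; these precisions are illustrative assumptions, not figures published by Deep Cogito, and the estimate ignores activations, the KV cache, and runtime overhead.

```python
# Parameter counts (in billions) taken from the list above.
MODEL_SIZES_BILLIONS = [3, 8, 14, 32, 70]


def weight_memory_gb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes).

    params_billions * 1e9 parameters * bytes_per_param / 1e9 bytes per GB
    simplifies to params_billions * bytes_per_param.
    """
    return params_billions * bytes_per_param


if __name__ == "__main__":
    for size in MODEL_SIZES_BILLIONS:
        fp16 = weight_memory_gb(size)        # assumed 16-bit weights
        int4 = weight_memory_gb(size, 0.5)   # assumed 4-bit quantized weights
        print(f"{size}B parameters: ~{fp16:.0f} GB at 16-bit, ~{int4:.1f} GB at 4-bit")
```

Actual requirements will be higher once context length and batch size are taken into account, so these numbers are best read as lower bounds.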

These models can be accessed on platforms like Hugging Face and Ollama, and through APIs on Fireworks and Together AI. The models are released under Llama licensing terms, which permit commercial use, making them suitable for third-party enterprises.

The company also has plans to release even larger models, potentially reaching up to 671 billion parameters in the coming months.

Innovative Training Methodology

Deep Cogito employs a unique training approach dubbed Iterated Distillation and Amplification (IDA). This method deviates from traditional reinforcement learning from human feedback (RLHF). The core concept of IDA involves using more computational power to generate improved solutions, which are then distilled back into the model’s parameters for future learning. Arora describes this approach as similar to the self-play strategy used by Google’s AlphaGo, but applied to the realm of natural language processing.
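
The amplify-then-distill loop described above can be illustrated with a toy sketch. This is not Deep Cogito's training code: the "model" here is a single number, the `score` function is a made-up task reward, and `amplify`, `distill`, and `run_ida` are hypothetical names chosen for illustration. The only point is to show the shape of the loop, in which extra compute produces a better answer and the model is then nudged toward it.

```python
import random


def score(x: float) -> float:
    """Hypothetical task reward: higher is better, with the best answer at x = 3."""
    return -(x - 3.0) ** 2


def amplify(mu: float, rng: random.Random, n_samples: int = 16, spread: float = 1.0) -> float:
    """Spend extra compute: sample candidates around the current answer and keep the best.

    The current answer is included as a candidate, so the result is never worse than mu.
    """
    candidates = [mu] + [rng.gauss(mu, spread) for _ in range(n_samples)]
    return max(candidates, key=score)


def distill(mu: float, target: float, lr: float = 0.5) -> float:
    """Fold the amplified answer back into the model by moving its parameter toward it."""
    return mu + lr * (target - mu)


def run_ida(mu: float = 0.0, rounds: int = 20, seed: int = 0) -> list[float]:
    """Alternate amplification and distillation, recording the parameter after each round."""
    rng = random.Random(seed)
    history = [mu]
    for _ in range(rounds):
        better = amplify(mu, rng)   # amplification step (more compute, better answer)
        mu = distill(mu, better)    # distillation step (update the model itself)
        history.append(mu)
    return history


if __name__ == "__main__":
    history = run_ida()
    print(f"start={history[0]:.3f} end={history[-1]:.3f}")
```

Because each round starts from the improved parameter, the next amplification step searches from a better position, which is the self-reinforcing property the AlphaGo comparison points to. In a real language model, the amplification step would be something like extended reasoning or search and the distillation step a fine-tuning pass, neither of which is detailed in the announcement.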

Performance Benchmarks

The performance of the Cogito models has been evaluated against other open-source models in several areas such as general knowledge, mathematical reasoning, and multilingual tasks. Here are some key performance highlights:

Cogito 3B outperformed Llama 3.2 3B by 6.7 percentage points on the MMLU benchmark.
In reasoning mode, Cogito 3B achieved a score of 72.6% on MMLU.
Cogito 8B scored 80.5% on MMLU, leading its peers significantly.
Cogito 70B surpassed Llama 4 Scout 109B on overall benchmark scores.

Overall, the Cogito models tend to perform better in reasoning mode, although there are some trade-offs, particularly in mathematical tasks, where they do not always lead.

Additionally, the models have shown strong performance in tool-calling tasks, which are becoming increasingly vital for integrated systems:

Cogito 3B can handle four types of tool call and scored 92.8% on simple tool calls.
Cogito 8B demonstrated even greater proficiency with over 89% success across all tool call types.

Future Directions

Moving forward, Deep Cogito aims to introduce larger-scale models that include mixture-of-expert variants with up to 671 billion parameters. The company plans to continually refine its models through extended training and updates.

Drishan Arora emphasizes that while benchmarks are crucial, the real test lies in the models’ practical utility and adaptability in real-world settings. To support its mission, Deep Cogito has partnered with notable organizations such as Hugging Face, RunPod, and Ollama. All models released by the company are open-source and are available to the public now.
