Benchmarking DeepSeek-R1 Distilled Models on GPQA with Ollama and OpenAI’s simple-evals
By DeepMind, April 24, 2025 7:38 am