DeepSeek Introducing NSA: A Hardware-Aligned, Natively Trainable Sparse Attention Mechanism for Ultra-Fast Long-Context Training and Inference by DeepSeek AI By DeepMind April 30, 2025 4:18 pm
Google Exploring the Limitations of Long-Context Large Language Models with the Michelangelo Benchmark By DeepMind April 19, 2025 10:47 am
Google Michelangelo Benchmark from DeepMind Exposes the Limitations of Long-Context LLMs By DeepMind April 18, 2025 9:22 am