LongContext

Introducing NSA: A Hardware-Aligned, Natively Trainable Sparse Attention Mechanism for Ultra-Fast Long-Context Training and Inference by DeepSeek AI

By DeepMind April 30, 2025 4:18 pm

Exploring the Limitations of Long-Context Large Language Models with the Michelangelo Benchmark

By DeepMind April 19, 2025 10:47 am

Michelangelo Benchmark from DeepMind Exposes the Limitations of Long-Context LLMs

By DeepMind April 18, 2025 9:22 am