Microsoft Unveils General Availability of Azure AI with Copilot and Integration of Meta Llama 4

Microsoft’s Azure AI and Copilot General Availability Announcement
Microsoft has announced the general availability (GA) of its Copilot feature in Azure. Alongside the release, the company announced the integration of Meta’s latest Llama 4 models into Azure AI Foundry and Azure Databricks.
Overview of Copilot in Azure
Originally launched in public preview in May 2024, Copilot in Azure has seen significant usage, handling millions of prompts from hundreds of thousands of users. Microsoft’s Ruhiyyih Mahalati stated in a blog post that within Microsoft alone, Copilot saves over 30,000 developer hours monthly. The GA release brings several key enhancements aimed at improving user experience and performance:
- Faster Response Times: Copilot’s response times have improved by more than 30%, thanks to optimizations in both front-end and back-end systems.
- Accessibility Improvements: The user interface has been refined to meet rigorous accessibility standards.
- High Uptime Commitment: Microsoft guarantees a 99.9% uptime for the Copilot service.
- Responsible AI Testing: Copilot is developed in line with Microsoft’s Responsible AI principles, emphasizing safety testing to mitigate harmful behaviors.
- Language Support: The tool now supports 19 different languages.
- Enhanced Features: New functionalities include Terraform configuration authoring and Azure Kubernetes Service diagnostics.
Moreover, Copilot is now also available in the Azure mobile app. New mobile features include real-time AI chat streaming, improved entry points for users, a cost management skill, and enhanced accessibility options.
Meta’s Llama 4 Models: New Additions to Azure AI
In conjunction with the Copilot rollout, Microsoft announced that Meta’s Llama 4 models—specifically Scout and Maverick—are now available in Azure AI Foundry and Azure Databricks. These models are crafted to tackle multimodal tasks, which involve processing a combination of text and visual data.
- Llama 4 Scout: This model focuses on summarization, personalization, and reasoning over long contexts. It supports a context window of up to 10 million tokens and runs efficiently on a single H100 GPU.
- Llama 4 Maverick: With 17 billion active parameters and a Mixture of Experts (MoE) architecture, Maverick is specially designed for multilingual and multimodal chat applications.
Both models are built with safety protocols at every development stage and are aligned with Azure’s security and compliance standards. Their combination of multimodal input capabilities and scalable architectures makes the Llama 4 models well suited to enterprises looking to innovate with advanced AI solutions while maintaining efficiency and cost-effectiveness.
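To make the multimodal angle concrete, here is a minimal sketch of how a text-plus-image chat request to a Llama 4 deployment could be assembled and sent. It assumes the deployment exposes an OpenAI-style chat-completions route; the endpoint URL, deployment name, header name, and response shape below are illustrative assumptions, not confirmed details of Azure AI Foundry, so verify them against your own deployment before use.

```python
import json
import os
import urllib.request

# Assumed endpoint and key, read from the environment; the URL shape is a
# placeholder, not a documented Azure AI Foundry route.
ENDPOINT = os.environ.get("AZURE_INFERENCE_ENDPOINT", "")
API_KEY = os.environ.get("AZURE_INFERENCE_KEY", "")


def build_chat_request(model, prompt, image_url=None):
    """Build a chat-completions payload; image_url adds a multimodal part."""
    content = [{"type": "text", "text": prompt}]
    if image_url:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "max_tokens": 256,
    }


# "Llama-4-Maverick-17B" is a hypothetical deployment name for illustration.
payload = build_chat_request(
    "Llama-4-Maverick-17B",
    "Summarize this diagram.",
    image_url="https://example.com/diagram.png",
)

if ENDPOINT and API_KEY:  # only call the service when configured
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": API_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

The payload mirrors the common chat-completions convention in which a single user message carries a list of typed content parts, which is how text and visual inputs are combined in one multimodal turn.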