JetBrains Introduces Mellum, an Open AI Coding Model

Introduction to Mellum

JetBrains, a well-known maker of software development tools, has released its first open AI model designed specifically for coding tasks. The model, named Mellum, was made publicly available on the AI development platform Hugging Face on Wednesday, giving developers direct access to its capabilities.

Key Features of Mellum

Mellum is a code-generation model that JetBrains integrated into its software development suites last year. It has 4 billion parameters and was trained on more than 4 trillion tokens. The model is built for code completion: it fills in code snippets based on the surrounding context.
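As an illustration of what completing code from its surrounding context can look like, here is a minimal sketch of fill-in-the-middle prompting, a common technique for code-completion models. The article does not say which prompt format Mellum uses, so the sentinel strings and the helper names (split_at_cursor, build_fim_prompt) are hypothetical placeholders, not Mellum's actual tokens or API.

```python
from dataclasses import dataclass

# Hypothetical sentinel strings. A real model defines its own special tokens,
# which should be taken from that model's documentation or tokenizer.
PREFIX_TAG = "<PREFIX>"
SUFFIX_TAG = "<SUFFIX>"
MIDDLE_TAG = "<MIDDLE>"


@dataclass
class CompletionContext:
    prefix: str  # code before the cursor
    suffix: str  # code after the cursor


def split_at_cursor(source: str, cursor: int) -> CompletionContext:
    """Split source text into the parts before and after the cursor offset."""
    if not 0 <= cursor <= len(source):
        raise ValueError("cursor must lie within the source text")
    return CompletionContext(prefix=source[:cursor], suffix=source[cursor:])


def build_fim_prompt(ctx: CompletionContext) -> str:
    """Arrange prefix and suffix so a model is asked to generate the missing middle."""
    return f"{PREFIX_TAG}{ctx.prefix}{SUFFIX_TAG}{ctx.suffix}{MIDDLE_TAG}"


if __name__ == "__main__":
    code = "def add(a, b):\n    return \n\nprint(add(2, 3))\n"
    cursor = code.index("return ") + len("return ")
    print(build_fim_prompt(split_at_cursor(code, cursor)))
```

The point of the layout is that the model sees the code on both sides of the cursor before it starts generating, rather than only the text that precedes it.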

Understanding Tokens and Parameters

  • Tokens: The units of data that the model processes. For perspective, one million tokens is roughly 30,000 lines of code.
  • Parameters: The learned values inside the model, which roughly correspond to its problem-solving ability. In general, the more parameters a model has, the more complex the tasks it can potentially handle.
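To make the token figure concrete, the short sketch below applies the article's rule of thumb (one million tokens is about 30,000 lines of code) to the training set size. The function names are illustrative, and the result is only as accurate as the rule of thumb itself; part of the training data was also prose rather than code.

```python
# Figures taken from the article; both are approximate.
LINES_PER_MILLION_TOKENS = 30_000
TRAINING_TOKENS = 4_000_000_000_000  # "over 4 trillion", so a lower bound


def tokens_to_lines(tokens: int) -> int:
    """Convert a token count to an approximate number of lines of code."""
    return tokens * LINES_PER_MILLION_TOKENS // 1_000_000


def average_tokens_per_line() -> float:
    """Average tokens per line implied by the rule of thumb."""
    return 1_000_000 / LINES_PER_MILLION_TOKENS


if __name__ == "__main__":
    print(f"Approximate lines of code: {tokens_to_lines(TRAINING_TOKENS):,}")
    print(f"Average tokens per line: {average_tokens_per_line():.1f}")
```

Under these assumptions, 4 trillion tokens corresponds to about 120 billion lines of code, at roughly 33 tokens per line.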

Intended Uses

According to JetBrains, Mellum is crafted for various applications, including:

  • Integration in Developer Tools: It can provide intelligent code suggestions within integrated development environments (IDEs).
  • AI Coding Assistants: It is useful for creating AI-powered coding tools.
  • Educational Purposes: The model can aid in learning programming and understanding code generation.
  • Research Applications: It serves as a tool for experimenting with code understanding and generation.

Training and Development

JetBrains reported that Mellum was trained on a range of datasets, including permissively licensed code from GitHub and English-language Wikipedia articles. Training took about 20 days on a cluster of 256 Nvidia H200 GPUs.
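The scale of that run is easier to judge as a back-of-the-envelope calculation. The sketch below uses only figures stated in this article (256 GPUs, about 20 days, 4 trillion tokens, 4 billion parameters). The compute estimate relies on the widely used approximation of about 6 floating-point operations per parameter per training token, which is an assumption here and not a number reported by JetBrains.

```python
# Figures from the article; all are approximate.
GPUS = 256
TRAINING_DAYS = 20
TRAINING_TOKENS = 4_000_000_000_000  # "over 4 trillion", so a lower bound
PARAMETERS = 4_000_000_000

SECONDS_PER_DAY = 86_400


def gpu_hours(gpus: int, days: int) -> int:
    """Total GPU-hours, assuming every GPU ran for the whole period."""
    return gpus * days * 24


def cluster_tokens_per_second(tokens: int, days: int) -> float:
    """Average training throughput across the whole cluster."""
    return tokens / (days * SECONDS_PER_DAY)


def approx_training_flops(parameters: int, tokens: int) -> int:
    """Rough compute estimate using the ~6 * N * D rule of thumb (an assumption)."""
    return 6 * parameters * tokens


if __name__ == "__main__":
    print(f"GPU-hours: {gpu_hours(GPUS, TRAINING_DAYS):,}")
    rate = cluster_tokens_per_second(TRAINING_TOKENS, TRAINING_DAYS)
    print(f"Cluster throughput: {rate:,.0f} tokens/s")
    print(f"Per-GPU throughput: {rate / GPUS:,.0f} tokens/s")
    print(f"Approximate compute: {approx_training_flops(PARAMETERS, TRAINING_TOKENS):.2e} FLOPs")
```

With these inputs the run comes to 122,880 GPU-hours. The throughput and compute figures are averages that ignore restarts, evaluation passes, and any idle time, so they should be read as order-of-magnitude estimates.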

Setup Considerations

Mellum requires some setup before users can take full advantage of it. The base model is not usable out of the box; it must first be fine-tuned. JetBrains has also released Mellum models fine-tuned for Python, but it advises that these be used only to explore potential capabilities, not deployed in production environments.

Addressing Security Concerns

AI-generated code comes with its own security challenges. A survey conducted in late 2023 by the developer security platform Snyk found that over 50% of organizations encountered security issues with AI-generated code at least occasionally. Concerns have been raised that Mellum may reflect biases present in public codebases, and that its code suggestions may not be secure or free of vulnerabilities.

JetBrains’ Vision

In a blog post, JetBrains described its long-term vision for the initiative. “This is just the beginning,” the company wrote, emphasizing that its goal is not to create a generic model but a focused tool. JetBrains hopes Mellum will inspire meaningful experimentation and collaboration within the developer community.

By making Mellum openly accessible, JetBrains aims to encourage AI-assisted coding and smarter development workflows.
