Snowflake Unveils Arctic: An Innovative, Open-Source ‘Mixture-of-Experts’ LLM

Snowflake has unveiled Arctic, a new open ‘mixture-of-experts’ large language model (LLM) designed to compete with DBRX and Llama 3. The model has been fine-tuned for complex enterprise tasks such as SQL generation, code generation, and instruction following.

Arctic is being promoted as the “most open enterprise-grade LLM” available, using a mixture-of-experts (MoE) architecture to excel at enterprise tasks while remaining efficient. It also delivers competitive performance on standard benchmarks, closely matching open models from Databricks, Meta, and Mistral on tasks involving world knowledge, common sense, reasoning, and mathematics.

Snowflake CEO Sridhar Ramaswamy described the launch of Arctic as a pivotal moment for the company, with its AI research team leading the way in AI innovation. He said that by delivering top-tier intelligence and efficiency to the AI community in a truly open manner, Snowflake is pushing the boundaries of what open-source AI can achieve, and that the research behind Arctic will significantly improve the company’s ability to provide reliable, efficient AI to its customers.

The introduction of this new model is also seen as Snowflake’s attempt to stay competitive with Databricks, which has been historically proactive in its AI efforts for customers using its data platform. Snowflake’s AI initiatives have only recently gained momentum, following the company’s acquisition of Neeva and the appointment of Ramaswamy as CEO.

Arctic is designed for enterprise workloads. Modern enterprises are optimistic about the potential of generative AI and eager to build gen AI apps such as retrieval-augmented generation (RAG) chatbots, data copilots, and code assistants. While many models can bring these use cases to life, only a few focus specifically on enterprise tasks, and this is where Snowflake Arctic comes in.

Arctic employs a Dense-MoE hybrid architecture, in which the parameters are divided into as many as 128 fine-grained expert subgroups. These experts, trained on a dynamic data curriculum, are always ready but handle only the input tokens they can process most effectively. As a result, only a small share of the model’s parameters (17 billion out of 480 billion) is activated in response to a query, delivering targeted performance with minimal compute consumption.
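The routing idea can be illustrated with a minimal sketch. This is not Snowflake's implementation: the top-2 selection, the toy scalar "experts", and the random router scores are all assumptions made for illustration. Only the 128-expert count and the 17-billion-of-480-billion figure come from the article.

```python
import math
import random

NUM_EXPERTS = 128  # expert count cited in the article
TOP_K = 2          # assumption: the article does not say how many experts fire per token


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_layer(token, experts, router_logits, top_k=TOP_K):
    """Weighted sum of the chosen experts' outputs; unchosen experts never run."""
    return sum(w * experts[i](token) for i, w in route(router_logits, top_k))


# Hypothetical toy experts: each one just scales its input by a constant.
experts = [lambda x, scale=i: x * scale for i in range(NUM_EXPERTS)]

# Hypothetical router scores for a single token (a real router learns these).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

selected = route(logits)
output = moe_layer(1.0, experts, logits)
print("experts used:", [i for i, _ in selected], "output:", output)

# Share of parameters active per query, using the article's figures.
active_fraction = 17 / 480
print(f"{active_fraction:.1%} of parameters active")
```

In this sketch only two of the 128 toy experts are evaluated for the token, which is the source of the compute savings the article describes. By the article's own figures, about 3.5% of the model's parameters are active for any given query.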

According to benchmarks shared by Snowflake, Arctic is already handling enterprise tasks quite well, scoring an average of 65% across multiple tests. This matches the average enterprise performance of Llama 3 70B and is just behind Mixtral 8X22B’s score of 70%.

In the Spider benchmark for SQL generation, the model scored 79%, outperforming Databricks’ DBRX and Mixtral 8X7B and nearly matching Llama 3 70B and Mixtral 8X22B. In coding tasks, where the company considered an average of HumanEval+ and MBPP+ scores, it scored 64.3%, again surpassing Databricks and the smaller Mixtral model and trailing Llama 3 70B and Mixtral 8X22B.

However, the most interesting result came from the IFEval benchmark, which is designed to measure instruction-following capabilities. Arctic scored 52.4% in that test, doing better than most competitors, except the latest Mixtral model.
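The roughly 65% enterprise average cited earlier is consistent with a plain mean of the three task scores reported here. The article does not say how Snowflake computed that average, so the equal weighting below is an assumption rather than the company's stated method.

```python
from statistics import mean

# Scores reported in this article, in percent.
scores = {
    "SQL generation (Spider)": 79.0,
    "Coding (average of HumanEval+ and MBPP+)": 64.3,
    "Instruction following (IFEval)": 52.4,
}

# Assumption: an unweighted mean of the three scores.
enterprise_average = mean(scores.values())
print(f"Enterprise average: {enterprise_average:.1f}%")
```

Under that assumption the result comes out to about 65.2%, in line with the reported figure.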

The company claims this level of enterprise intelligence has been achieved with breakthrough efficiency, using a training compute budget of just under $2 million. That is far less than the compute budget of other open models, including Llama 3 70B, which was trained with 17x more compute. Additionally, the model uses only 17 billion active parameters to get these results, far fewer than other models put to use, which will further drive cost benefits.

Snowflake is making Arctic available inside Cortex, its own LLM app development service, and across other model gardens and catalogs, including Hugging Face, Lamini, Microsoft Azure, Nvidia API catalog, Perplexity and Together. On Hugging Face, Arctic model weights and code can be downloaded directly under an Apache 2.0 license that allows ungated use for personal, commercial or research applications.

But that’s just one part of the company’s “truly open” effort.

In addition to the model weights and code, the company is releasing a data recipe to help enterprises run efficient fine-tuning on a single GPU, as well as comprehensive research cookbooks with insights into how the model was designed and trained.

“The cookbook is designed to expedite the learning process for anyone looking into the world-class MoE models. It offers high-level insights as well as granular technical details to craft LLMs like Arctic so that anyone can build their desired intelligence efficiently and economically,” Baris Gultekin, Snowflake’s head of AI, said in the press briefing.