Meta has unveiled the latest additions to its open source Llama generative AI model series – Llama 3 8B and Llama 3 70B. These models represent a significant performance leap over their Llama 2 predecessors, promising enhanced capabilities across a variety of tasks.
At 8 billion and 70 billion parameters respectively, the new Llama 3 models were trained on two custom-built clusters of 24,000 GPUs each. The company boldly claims that, for their parameter counts, Llama 3 8B and 70B rank among the best publicly available generative AI models today.
How does Meta back up this assertion? By highlighting the models’ standout performance on several popular AI benchmarks like MMLU for knowledge, ARC for skill acquisition, and DROP for reasoning over text passages. While the validity of such benchmarks is debatable, they remain a go-to evaluation method in the AI field.
Across at least nine benchmarks, including MMLU, ARC, DROP, and tests of code generation, math, and commonsense reasoning, Llama 3 8B outperforms comparable open source models like Mistral 7B and Google’s Gemma 7B. Meta also claims the larger 70B version is competitive with Google’s flagship Gemini 1.5 Pro, beating it on some benchmarks.
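For readers unfamiliar with how a benchmark like MMLU turns into a headline number, the sketch below shows the basic mechanics of multiple-choice evaluation: the model is prompted with a question and lettered options, and accuracy is simply the fraction of items where its chosen letter matches the answer key. This is a minimal illustration only; the `ask_model` callable and the sample question are hypothetical stand-ins, not Meta’s evaluation harness or real benchmark items.

```python
# Minimal sketch of multiple-choice benchmark scoring (MMLU-style).
# `ask_model` is a hypothetical placeholder for any Llama 3 inference call;
# the question below is illustrative, not a real benchmark item.

def ask_model(prompt: str) -> str:
    """Placeholder: return the model's chosen answer letter, e.g. 'B'."""
    return "B"

QUESTIONS = [
    {
        "question": "Which planet is known as the Red Planet?",
        "choices": {"A": "Venus", "B": "Mars", "C": "Jupiter", "D": "Mercury"},
        "answer": "B",
    },
]

def evaluate(questions) -> float:
    """Return accuracy: the fraction of questions answered with the correct letter."""
    correct = 0
    for q in questions:
        options = "\n".join(f"{k}. {v}" for k, v in q["choices"].items())
        prompt = f"{q['question']}\n{options}\nAnswer with a single letter:"
        if ask_model(prompt).strip().upper().startswith(q["answer"]):
            correct += 1
    return correct / len(questions)

print(f"Accuracy: {evaluate(QUESTIONS):.0%}")
```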
Qualitatively, Meta says users can expect improved “steerability”, a greater willingness to answer prompts, and higher accuracy on topics like STEM, history, and coding from the new Llama models. The company attributes this to training on a staggering 15-trillion-token dataset, seven times larger than Llama 2’s, with more code, more non-English data, and synthetic data used to produce longer documents.
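To put that corpus size in perspective, here is a quick back-of-the-envelope calculation. The roughly 0.75 words-per-token ratio for English text and Llama 2’s previously reported training set of about 2 trillion tokens are common reference figures, not numbers from Meta’s announcement.

```python
# Back-of-the-envelope scale check. Assumptions not taken from the article:
# a typical ~0.75 words-per-token ratio for English text, and Llama 2's
# previously reported training set of roughly 2 trillion tokens.
llama3_tokens = 15e12       # 15 trillion tokens (Llama 3, per Meta)
llama2_tokens = 2e12        # ~2 trillion tokens (Llama 2, as previously reported)
words_per_token = 0.75      # rough rule of thumb for English tokenizers

approx_words_trillions = llama3_tokens * words_per_token / 1e12   # ~11 trillion
scale_factor = llama3_tokens / llama2_tokens                      # ~7.5x

print(f"~{approx_words_trillions:.0f} trillion words, "
      f"~{scale_factor:.1f}x the Llama 2 corpus")
```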
Meta remains tight-lipped about the training data sources beyond “publicly available”, likely to sidestep the kind of intellectual property disputes behind recent lawsuits over alleged unauthorized data use. The company has, however, updated its AI safety tooling to mitigate issues like toxicity, bias and hallucinations that plagued Llama 2.
The Llama 3 8B and 70B models are available now, powering the Meta AI assistant across Meta’s apps, with hosting on major cloud platforms to follow. But Meta isn’t stopping there: models exceeding 400 billion parameters, capable of multilingual conversation and multimodal inputs like images, are already in the works.
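For developers who want to experiment with the weights as they roll out, a minimal sketch using the Hugging Face transformers library might look like the following. The repository ID `meta-llama/Meta-Llama-3-8B-Instruct`, the gated-access setup, and the hardware assumptions are not stated in the article and may differ depending on where and how you obtain the model.

```python
# Minimal sketch: chatting with Llama 3 8B Instruct via Hugging Face transformers.
# Assumptions (not stated in the article): the weights are hosted under the
# repo id "meta-llama/Meta-Llama-3-8B-Instruct", access has been granted, and
# a GPU with enough memory is available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The instruct variants expect chat-formatted input; apply_chat_template
# renders the message list into the model's expected prompt format.
messages = [
    {"role": "system", "content": "You are a concise, helpful assistant."},
    {"role": "user", "content": "Summarize what distinguishes Llama 3 from Llama 2."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```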
As generative AI rapidly evolves, efforts like Llama 3 aim to push the boundaries of open source model performance and accessibility. However, responsible development addressing safety and ethical risks remains paramount as these powerful AI capabilities proliferate.