
Mistral AI Valued at $2 Billion & Releases Mixtral 8x7B

Plus Meta Releases A New Model for Audio Generation

Welcome back to the latest AI Drop!

Today’s Drops:

  • Mistral AI Valued at $2 Billion & Releases Mixtral 8x7B

  • Meta Releases A New Model for Audio Generation

  • AFL-CIO and Microsoft Partnership

  • Trending on X “W.A.L.T”

  • AI Headlines

  • Trending GitHub Projects

Read Time: 3.5 minutes

Mistral AI Valued at $2 Billion & Releases Mixtral 8x7B

This week, Mistral AI has captured the spotlight in the tech world with two significant milestones. 

First, Mistral AI closed a $415 million funding round, valuing the company at roughly $2 billion. This builds on the momentum of a substantial $112 million seed round secured just six months ago. These strategic raises firmly position Mistral AI as a major European rival to OpenAI. 

In a parallel development, Mistral AI unveiled Mixtral 8x7B, a state-of-the-art sparse mixture-of-experts (SMoE) model with open weights, released under the permissive Apache 2.0 license. The model challenges the closed-model status quo, embracing a community-centric approach to drive diverse and multifaceted advancements.

Key Features of Mixtral 8x7B:

  • Performance: Surpassing Llama 2 70B on most benchmarks, Mixtral 8x7B matches or exceeds GPT-3.5 while delivering roughly 6x faster inference than Llama 2 70B.

  • Capabilities: Mixtral handles a 32k-token context and is proficient in multiple languages, including English, French, Italian, German, and Spanish. It shows strong code generation and can be fine-tuned to follow instructions.

  • Architecture: As a decoder-only sparse mixture-of-experts network, Mixtral encompasses 46.7B total parameters. However, it utilizes only 12.9B per token, optimizing both cost and latency.

  • Pre-training and Bias Mitigation: Trained on open web data, Mixtral exhibits reduced bias and heightened truthfulness, outperforming Llama 2 according to TruthfulQA and BBQ benchmarks.

  • Multilingual and Instructed Models: Excelling in multilingual tasks, the instructed version of Mixtral, dubbed Mixtral 8x7B Instruct, rivals GPT-3.5 in terms of performance.

  • Deployment: Mixtral integrates seamlessly with the vLLM project for efficient inference using Megablocks CUDA kernels and is deployable via Skypilot on cloud instances.
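The "46.7B total parameters, 12.9B active per token" figures above come from top-2 routing over 8 experts: a small gating network picks two experts per token, so only a fraction of the weights run on each forward pass. Here is a minimal, illustrative sketch of that routing pattern; the dimensions and the one-matrix "experts" are toy stand-ins, not Mixtral's actual layers.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 16    # toy hidden size (Mixtral's is much larger)
n_experts = 8   # Mixtral 8x7B has 8 experts per MoE layer
top_k = 2       # only 2 experts process each token

# Each "expert" is stand-in for a feed-forward block; here just one matrix.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.1   # gating network

def moe_layer(x):
    """Route each token to its top-2 experts and mix their outputs."""
    logits = x @ router                              # (n_tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]    # indices of the 2 best experts
    out = np.zeros_like(x)
    for i, token in enumerate(x):
        sel = top[i]
        # softmax over the selected experts' logits only
        w = np.exp(logits[i, sel] - logits[i, sel].max())
        w /= w.sum()
        for weight, e in zip(w, sel):
            out[i] += weight * (token @ experts[e])  # only 2 of 8 experts run
    return out

tokens = rng.standard_normal((4, d_model))
y = moe_layer(tokens)
print(y.shape)   # same shape as the input; 6 of 8 experts were skipped per token
```

The output has the capacity of all 8 experts available, but each token pays the compute cost of only 2, which is where the latency and cost savings come from.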

In Mistral's published comparison table, Mixtral is benchmarked against the Llama 2 family and the GPT-3.5 base model.

Mixtral 8x7B stands out for its high performance, efficient architecture, multilingual capabilities, and open-source deployment feasibility.

Meta Releases A New Model for Audio Generation

Audiobox, a Meta-developed audio generation model, surpasses its predecessor, Voicebox. It synthesizes voices and sound effects from voice and text inputs, creating diverse soundscapes and speech styles. Notably, it allows editing and generating speech, sound effects, and soundscapes in various environments, like echoing cathedral voices. Audiobox outperforms AudioLDM2, VoiceLDM, and TANGO in quality and style resemblance. Aimed at democratizing audio creation for media like films and games, it's initially available to select researchers. Audiobox's advanced features include text-to-speech in different styles and audio infilling for editing. Meta integrates audio watermarking and voice authentication to ensure responsible use, aiming for a future with generalized audio models for versatile AI applications.

AFL-CIO and Microsoft announce partnership on AI and the future of the workforce

Microsoft has teamed up with the AFL-CIO, the largest union federation in the United States, to facilitate the transition into an AI-enhanced future. This groundbreaking partnership is designed to prepare workers for the evolving demands of an AI-driven economy. As part of this initiative, Microsoft plans to conduct AI-focused "learning sessions" and host labor summits, aiming to educate and involve workers in the ongoing development of AI technologies. Additionally, the collaboration seeks to influence government policies to ensure workers acquire the necessary skills, knowledge, and economic support for success in an AI-centric workplace. Another key feature of this partnership is the development of AI-specific curriculum on LinkedIn, specifically tailored to industries most affected by AI advancements. This comprehensive approach demonstrates Microsoft's commitment to a smooth and inclusive transition to the new AI era, emphasizing worker readiness and adaptability.

Trending on X “W.A.L.T”

W.A.L.T is a diffusion model for photorealistic video generation released by researchers at Stanford, in collaboration with Google AI researchers.

W.A.L.T can generate photorealistic, temporally consistent motion from natural language prompts. It can animate any image and can be used to generate videos with consistent 3D camera motion.

Check out the release video

More AI News

Japanese tech giant Rakuten is set to launch its own proprietary artificial intelligence language model, leveraging its unique data across various businesses to enhance operational efficiency and offer the technology to other businesses. Link

Google has acknowledged that their AI demonstration video for the Gemini model, which garnered significant attention, was edited to enhance its presentation. This included speeding up responses and using text prompts with still images instead of real-time voice or video interaction. Link

Investing.com, known for financial news, faces plagiarism allegations for using AI to mimic existing human-written stories. Link

Citrusx, an Israeli startup, has raised $4.5 million in seed funding to develop software ensuring AI models comply with internal and governmental regulations. This comes as AI regulation becomes a growing focus worldwide. Link

Welcome to the innovative world of GitHub projects!

LLMware is an open-source framework that offers a comprehensive set of tools for developing enterprise-grade applications using large language models (LLMs). It supports various features like Retrieval Augmented Generation (RAG), fact-checking, and tools for handling model hallucination. The framework is designed to be accessible for both beginners and advanced AI developers, and it facilitates the development of fact-based generative workflows.
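The Retrieval Augmented Generation pattern that LLMware packages up boils down to: score your documents against the query, keep the top hits, and ground the prompt in them. The sketch below shows that loop with a toy word-overlap scorer and an invented corpus; LLMware's real API uses its own library and query abstractions, so treat this as the pattern, not the framework.

```python
def score(query, passage):
    """Toy relevance score: count of shared lowercase words."""
    return len(set(query.lower().split()) & set(passage.lower().split()))

def retrieve(query, corpus, k=2):
    """Return the k passages that best overlap with the query."""
    return sorted(corpus, key=lambda p: score(query, p), reverse=True)[:k]

def build_prompt(query, corpus):
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(f"- {p}" for p in retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Illustrative corpus; a real pipeline would chunk and embed documents.
corpus = [
    "Mixtral 8x7B is a sparse mixture-of-experts model from Mistral AI.",
    "Audiobox is Meta's model for generating speech and sound effects.",
    "SkyPilot runs LLM and batch jobs on any cloud.",
]

prompt = build_prompt("Which model did Mistral AI release?", corpus)
print(prompt)
```

Grounding the model in retrieved passages is also what enables the fact-checking and hallucination-handling tools mentioned above: the answer can be checked against the exact passages it was given.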

Sweep is an AI junior developer that refactors Python code and writes unit tests for it.

SkyPilot is a versatile framework for running Large Language Models (LLMs), AI, and batch jobs on any cloud platform. It offers features like maximum cost savings, high GPU availability, and managed execution with a user-friendly interface. The framework simplifies cloud infrastructure management, supports various workloads without code changes, and can be easily installed using pip. It is designed to provide efficient and cost-effective cloud resource utilization for AI and LLM projects.
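The workflow SkyPilot describes is driven by a declarative task file. A minimal sketch is below; `serve.py` is a hypothetical entry point and the accelerator choice is an assumption, so adapt both to your own workload.

```
resources:
  accelerators: A100:1   # request one A100; SkyPilot finds a cloud that offers it

setup: |
  pip install vllm       # one-time environment setup on the provisioned VM

run: |
  python serve.py        # the actual workload; replace with your own script
```

A task file like this is launched with `sky launch task.yaml`, and SkyPilot handles provisioning and execution across the configured clouds.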