Microsoft is doubling down on AI models that aren't large language models. The company announced on Thursday that it's ...
Mistral's new speech model can run on a smartwatch or a smartphone.
Microsoft launches three in-house MAI models for transcription, voice and image generation through Foundry, hedging its ...
Mistral AI launches Voxtral TTS, an open-weight enterprise voice model that runs on a smartphone and challenges ElevenLabs in ...
OpenAI released its text-to-video artificial intelligence model, Sora, this week after the completion of its testing phase. The Microsoft-backed AI startup first teased the model in February and ...
Alibaba’s Qwen 3.5 Omni brings true real-time omnimodal AI to the frontier race: voice cloning, 10-hour audio, real-time ...
Multi-modal models that can process both ...
Mistral AI is expanding its Voxtral model family with its first text-to-speech model. The launch comes amid intensifying ...
GLM-5V-Turbo is Z.ai's first native multimodal agent foundation model, built for vision-based coding and agentic task ...
OpenAI's text-to-video tool Sora generates high-quality videos up to one minute in length. OpenAI on Thursday announced Sora, a brand new model that generates high-definition videos up to ...
Google on Friday added a new, experimental “embedding” model for text, Gemini Embedding, to its Gemini developer API. Embedding models translate text inputs like words and phrases into numerical ...