News provided by aibrews.com
Meta AI introduced a suite of AI language translation models that preserve expression and improve streaming [Details | GitHub]. SeamlessExpressive enables the transfer of tones, emotional expression and vocal styles in speech translation; a demo is available that takes your own voice as input. SeamlessStreaming is a new model that enables streaming speech-to-speech and speech-to-text translations with <2 seconds of latency and nearly the same accuracy as an offline model. In contrast to conventional systems, which translate when the speaker has finished their sentence, SeamlessStreaming translates while the speaker is still talking. It intelligently decides when it has enough context to output the next translated segment. SeamlessM4T v2 is a foundational multilingual & multitask model for both speech & text. It’s the successor to SeamlessM4T, demonstrating performance improvements across ASR, speech-to-speech, speech-to-text & text-to-speech tasks. Seamless is a model that merges capabilities from SeamlessExpressive, SeamlessStreaming and SeamlessM4T v2 into one.

Stability AI released SDXL Turbo: a real-time text-to-image generation model. SDXL Turbo is based on a new distillation technology, which enables the model to synthesize image outputs in a single step and generate real-time text-to-image outputs while maintaining high sampling fidelity [Details].

Meta AI has created CICERO, the first AI agent to achieve human-level performance in the complex natural language strategy game Diplomacy. CICERO played with humans on webDiplomacy.net, an online version of the game, where it achieved more than double the average score of the human players and ranked in the top 10% of participants who played more than one game [Details].

Mozilla’s innovation group and Justine Tunney released llamafile, which lets you distribute and run LLMs with a single file. llamafiles can run on six OSes (macOS, Windows, Linux, FreeBSD, OpenBSD, and NetBSD) and on multiple CPU architectures [Details].

Perplexity released two new PPLX models: pplx-7b-online and pplx-70b-online. These online LLMs can leverage the most up-to-date information using the internet when forming a response [Details].

Google DeepMind presented GNoME (Graph Networks for Materials Exploration): an AI tool that discovered 2.2 million new crystal structures, with 380,000 being highly stable and promising for breakthroughs in superconductors, supercomputers, and advanced batteries for electric vehicles [Details].

Amazon introduced two new Amazon Titan multimodal foundation models (FMs): Amazon Titan Image Generator (preview) and Amazon Titan Multimodal Embeddings. All images generated by Amazon Titan contain an invisible watermark [Details].

Researchers presented Animatable Gaussians, a new avatar representation method that can create lifelike human avatars from multi-view RGB videos [Details].

Pika Labs released a major product upgrade of their generative AI video tool, Pika 1.0, which includes a new AI model capable of generating and editing videos in diverse styles such as 3D animation, anime, cartoon and cinematic using text, image or existing video [Details].

Eleven Labs announced a grant program offering 11M text characters of content per month for the first 3 months to solopreneurs and startups [Details].

Researchers from UC Berkeley introduced Starling-7B, an open large language model trained using Reinforcement Learning from AI Feedback (RLAIF). It utilizes the GPT-4 labeled ranking dataset, Nectar, and a new reward training pipeline. Starling-7B outperforms every model to date on MT-Bench except for OpenAI’s GPT-4 and GPT-4 Turbo [Details].

XTX Markets is launching a new $10mn challenge fund, the Artificial Intelligence Mathematical Olympiad Prize (AI-MO Prize). The grand prize of $5mn will be awarded to the first publicly-shared AI model to enter an AI-MO approved competition and perform at a standard equivalent to a gold medal in the International Mathematical Olympiad (IMO) [Details].

Microsoft Research evaluated GPT-4 for processing radiology reports, focusing on tasks like disease classification and findings summarization. The study found GPT-4 has a sufficient level of radiology knowledge, with only occasional errors in complex contexts that require nuanced domain knowledge. The radiology report summaries generated by GPT-4 were found to be comparable to, and in some cases even preferred over, those written by experienced radiologists [Details].

AWS announced Amazon Q, a new generative AI-powered assistant for businesses. It enables employees to query and obtain answers from various content repositories, summarize reports, write articles, perform tasks, and more, all within their company’s integrated content systems. Amazon Q offers over 40 built-in connectors to popular enterprise systems [Details].

18 countries, including the US and Britain, signed a detailed international agreement on how to keep artificial intelligence safe from rogue actors, pushing for companies to create AI systems that are ‘secure by design’ [Details].
🔦 Weekly Spotlight
AI Revolution – A data-backed report by Coatue [Link].
Interview: Sam Altman on being fired and rehired by OpenAI [Link].
Open-source version of the image+text-based adventure game MonkeyIslandAmsterdam.com, built using GPTs in ChatGPT, by Pieter Levels [Link].
– – –
Welcome to the r/artificial weekly megathread. This is where you can discuss Artificial Intelligence – talk about new models, recent news, ask questions, make predictions, and chat about other related topics.
Self-promo is allowed in these weekly discussions. If you want to make a separate post, please read and go by the rules or you will be banned.