This Week in Machine Learning & AI Podcast

Inside s1: An o1-Style Reasoning Model That Cost Under $50 to Train with Niklas Muennighoff - #721

Internet and technology

4/3/2025 · 48:59

Description of Inside s1: An o1-Style Reasoning Model That Cost Under $50 to Train with Niklas Muennighoff - #721

Today, we're joined by Niklas Muennighoff, a PhD student at Stanford University, to discuss his paper, “s1: Simple Test-Time Scaling.” We explore the motivations behind s1 and how it compares to OpenAI's o1 and DeepSeek's R1 models. We dig into the different approaches to test-time scaling, including parallel and sequential scaling, as well as s1’s data curation process, its training recipe, and its use of model distillation from Google Gemini and DeepSeek R1. We explore the novel "budget forcing" technique developed in the paper, which lets the model think longer on harder problems and make better use of test-time compute. Additionally, we cover the evaluation benchmarks used, the comparison between supervised fine-tuning and reinforcement learning, and similar projects like the Hugging Face Open R1 project. Finally, we discuss the open-sourcing of s1 and its future directions. The complete show notes for this episode can be found at https://twimlai.com/go/721.
