Is this the dawn of cheap AI?
DeepSeek was just the beginning 🤖
AI researchers at Stanford and the University of Washington were able to train an AI “reasoning” model for under $50 in cloud compute credits
Training methods appear to be proliferating with the number of models available:
The s1 paper suggests that reasoning models can be distilled with a relatively small dataset using a process called supervised fine-tuning (SFT), in which an AI model is explicitly instructed to mimic certain behaviors in a dataset.
A race to the bottom we can actually benefit from:
After training s1, which took less than 30 minutes using 16 Nvidia H100 GPUs, s1 achieved strong performance on certain AI benchmarks
& a lesson we can all learn from:
The researchers used a nifty trick to get s1 to double-check its work and extend its “thinking” time: They told it to wait. Adding the word “wait” during s1’s reasoning helped the model arrive at slightly more accurate answers
via TechCrunch