DeepSeek was just the beginning 🤖

AI researchers at Stanford and the University of Washington trained an AI “reasoning” model for under $50 in cloud compute credits.

Training methods appear to be proliferating along with the number of models available:

The s1 paper suggests that reasoning models can be distilled with a relatively small dataset using a process called supervised fine-tuning (SFT), in which an AI model is explicitly instructed to mimic certain behaviors in a dataset.
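
That process is simpler than it sounds. Here's a minimal sketch of distillation-style SFT in the spirit of s1, not the authors' actual code: it assumes Hugging Face's `datasets` and `trl` libraries (exact APIs vary by version), and the training file `distill_traces.jsonl`, holding prompts paired with a stronger model's reasoning traces, is hypothetical.

```python
# A sketch of distillation-style SFT, in the spirit of s1 (not the authors'
# code). Assumes Hugging Face's `datasets` and `trl` libraries; exact APIs
# vary by version. `distill_traces.jsonl` is a hypothetical file of prompts
# paired with a stronger model's reasoning traces.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Small, curated data is the whole point: s1 reportedly used ~1,000 examples.
dataset = load_dataset("json", data_files="distill_traces.jsonl", split="train")

config = SFTConfig(
    output_dir="s1-style-sft",
    num_train_epochs=5,             # a few passes over the tiny dataset
    per_device_train_batch_size=4,
    learning_rate=1e-5,
)

# SFT simply trains the student to imitate the traces token by token.
trainer = SFTTrainer(
    model="Qwen/Qwen2.5-32B-Instruct",  # the base model s1 reportedly started from
    train_dataset=dataset,
    args=config,
)
trainer.train()
```

The economics follow directly: with only about a thousand curated examples, a run like this finishes in minutes on rented GPUs, which is how the bill stays under $50.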

A race to the bottom we can actually benefit from:

Training s1 took less than 30 minutes using 16 Nvidia H100 GPUs, after which the model achieved strong performance on certain AI benchmarks.

& a lesson we can all learn from:

The researchers used a nifty trick to get s1 to double-check its work and extend its “thinking” time: They told it to wait. Adding the word “wait” during s1’s reasoning helped the model arrive at slightly more accurate answers.

via TechCrunch
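
P.S. For the curious, here's roughly what that “wait” trick can look like in code. This is a minimal sketch, not the researchers' implementation: it assumes the model is served behind an OpenAI-compatible completions endpoint (e.g., a local vLLM server), and the model name, URL, and `</think>` end-of-thinking marker are all illustrative assumptions.

```python
# A rough illustration of the "wait" trick described above (a sketch, not the
# researchers' code). Assumes an OpenAI-compatible completions server; the
# model name "s1", the base_url, and the THINK_END marker are hypothetical.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
THINK_END = "</think>"  # marker the model emits when it wants to stop reasoning

def generate_with_wait(prompt: str, extra_rounds: int = 2) -> str:
    text = prompt
    for i in range(extra_rounds + 1):
        # Generate reasoning, stopping where the model would end its thinking.
        resp = client.completions.create(
            model="s1", prompt=text, max_tokens=1024, stop=[THINK_END]
        )
        text += resp.choices[0].text
        if i < extra_rounds:
            # Suppress the end-of-thinking marker and append "Wait" so the
            # model keeps going and double-checks its work.
            text += "\nWait,"
    # Close the thinking block and let the model write its final answer.
    resp = client.completions.create(
        model="s1", prompt=text + THINK_END, max_tokens=256
    )
    return text + THINK_END + resp.choices[0].text
```

The idea is simple: by cutting generation off at the end-of-thinking marker and appending “Wait,” you force the model to spend more tokens re-examining its reasoning before it commits to an answer.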