Sky-T1

Train Your Own O1 Preview Model Within $450

Models such as o1 and Gemini 2.0 flash thinking that excel in reasoning have shown to solve complex tasks by producing a long internal chain of thought, among other advancements. However, the technical details and model weights are un-accessible, presenting a barrier to the participation of the academic and open-source communities.

In response, a few notable efforts have emerged to train open-weight reasoning models in the math domain, such as Still-2 and Journey. Concurrently, we, the NovaSky team at UC Berkeley, have been exploring various techniques to evolve the reasoning capabilities of base and instruct-tuned models. In this work, we achieve competitive reasoning performance not just in math, but also in coding in the same model.


Contributors

Dacheng Li, Shiyi Cao, Shu Liu, Tyler Griggs, Simon Mo, Shishir G. Patil, Joseph E. Gonzalez, Ion Stoica

Publications