Alibaba unveils open source reasoning model QwQ-32B

Alibaba announced today the release of QwQ-32B, a compact reasoning model that delivers remarkable performance for its size.

The Alibaba team based the model on Qwen2.5-32B, a foundation model that was already topping leaderboards among open-source models of its size. Likely inspired by DeepSeek's success, the team explored scaling Reinforcement Learning to enhance the model's intelligence and produce a strong reasoning model.

As a reasoning model, QwQ-32B achieves performance comparable to DeepSeek-R1 despite having roughly twenty times fewer total parameters (and slightly fewer active parameters: DeepSeek-R1 is a mixture-of-experts model that activates 37B of its 671B parameters per token, while QwQ-32B is a dense 32B model). It posts strong results across coding, math, and other benchmarks (see below). These results demonstrate the potential of RL to produce more powerful and efficient models, and they shift the efficiency frontier for open-source models!

You can already access QwQ-32B on Hugging Face here.
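For readers who want to try the model locally, here is a minimal sketch of querying QwQ-32B through the Hugging Face `transformers` library. The model id `Qwen/QwQ-32B` is the one published on the Hub; the exact generation settings below are illustrative assumptions, not official recommendations, and actually running `generate` requires substantial GPU memory for a 32B model.

```python
"""Hedged sketch: querying QwQ-32B via Hugging Face transformers."""

MODEL_ID = "Qwen/QwQ-32B"  # model id on the Hugging Face Hub


def build_messages(question: str) -> list[dict[str, str]]:
    """Chat-format input expected by the tokenizer's chat template."""
    return [{"role": "user", "content": question}]


def generate(question: str, max_new_tokens: int = 512) -> str:
    """Load the model and produce a reasoning-style answer.

    Imports are local so the sketch can be read and tested without
    transformers installed or a GPU available.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    prompt = tokenizer.apply_chat_template(
        build_messages(question), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True
    )
```

A call like `generate("How many prime numbers are below 20?")` would return the model's chain-of-thought followed by its answer; reasoning models like QwQ-32B typically emit lengthy intermediate reasoning before the final response, so a generous `max_new_tokens` budget is advisable.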
