Chinese e-commerce giant Alibaba has launched the newest model in its ever-expanding Qwen family: Qwen with Questions (QwQ), the latest open-source competitor to OpenAI’s o1 reasoning model.
Like other large reasoning models (LRMs), QwQ uses extra compute cycles during inference to review its answers and correct its errors. This makes it well suited to tasks that require logical reasoning and planning, such as math and coding.
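To make the idea concrete, here is a minimal sketch of the generate-critique-revise loop that reasoning models like QwQ automate internally by producing long chains of thought. The `generate` callable is a hypothetical placeholder for any text-generation backend (a local model or an API client); nothing here is part of an actual QwQ API.

```python
# Minimal sketch of inference-time self-review, the general pattern that
# reasoning models automate internally. `generate` is a hypothetical
# placeholder for any text-generation callable, not a real QwQ API.
def answer_with_review(question: str, generate, max_revisions: int = 3) -> str:
    # First pass: produce a step-by-step draft answer.
    draft = generate(f"Question: {question}\nReason step by step, then answer.")
    for _ in range(max_revisions):
        # Spend extra compute reviewing the draft for mistakes.
        critique = generate(
            f"Question: {question}\nDraft answer: {draft}\n"
            "Check each step for errors. Reply OK if correct; "
            "otherwise describe the mistake."
        )
        if critique.strip().upper().startswith("OK"):
            break
        # Revise the draft using the critique and try again.
        draft = generate(
            f"Question: {question}\nPrevious draft: {draft}\n"
            f"Critique: {critique}\nWrite a corrected answer."
        )
    return draft
```

Each extra pass through this loop consumes more output tokens, which is why inference-time reasoning trades latency and cost for accuracy.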
What is Qwen with Questions (QwQ), and can it be used for commercial purposes?
Alibaba has released a 32-billion-parameter version of QwQ with a 32,000-token context window. The model is currently in preview, which means a more powerful version is likely to follow.
In Alibaba’s tests, QwQ beat o1-preview on the AIME and MATH benchmarks, which evaluate mathematical problem-solving abilities, and outperformed o1-mini on GPQA, a benchmark for scientific reasoning. QwQ trails o1 on LiveCodeBench, a coding benchmark, but still outperforms other frontier models such as GPT-4o and Claude 3.5 Sonnet.
QwQ does not come with documentation describing the data or the process used to train the model, which makes it difficult to reproduce its results. However, because the model is open, its “thought process” is not hidden the way OpenAI o1’s is, and it can be inspected to understand how the model reasons through a problem.
Alibaba has also released the model under the Apache 2.0 license, which means it can be used for commercial purposes.
‘We discovered something profound’
“With deep exploration and countless experiments, we discovered something profound: when given time to ponder, to question, and to reflect, a model’s understanding of mathematics and programming blossoms like a flower opening to the sun,” according to a blog post published with the model’s release. “This self-questioning has led to incredible progress in solving complex problems.”
This is consistent with what we know about how reasoning models work: by generating more tokens and reviewing their previous answers, models become more likely to correct potential errors. Marco-o1, another reasoning model Alibaba recently released, may offer hints about how QwQ works. Marco-o1 uses Monte Carlo Tree Search (MCTS) and self-reflection at inference time to reason across different domains and choose the best answer. The model was trained on a mix of chain-of-thought (CoT) examples and synthetic data generated with the MCTS algorithm.
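The following is a minimal sketch of the general idea of MCTS applied to reasoning steps, not Marco-o1’s actual implementation. The `expand` and `score` callables are hypothetical placeholders for a model-backed step generator and an answer evaluator.

```python
# Minimal sketch of MCTS over reasoning steps (the general technique,
# not Marco-o1's actual code). `expand(state)` should return candidate
# next reasoning steps; `score(state)` should return a reward estimate.
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node, c=1.4):
    # Unvisited nodes are explored first.
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def mcts(root, expand, score, iterations=100):
    for _ in range(iterations):
        node = root
        # Selection: descend the tree by UCB until reaching a leaf.
        while node.children:
            node = max(node.children, key=ucb)
        # Expansion: ask the model for candidate next reasoning steps.
        for step in expand(node.state):
            node.children.append(Node(node.state + [step], parent=node))
        # Evaluation: score one new child (e.g., model confidence).
        leaf = random.choice(node.children) if node.children else node
        reward = score(leaf.state)
        # Backpropagation: update statistics along the path to the root.
        while leaf:
            leaf.visits += 1
            leaf.value += reward
            leaf = leaf.parent
    # Return the most-visited line of reasoning.
    return max(root.children, key=lambda n: n.visits).state
```

In this framing, each tree node holds a partial chain of thought, and the search budget plays the same role as the extra “thinking” tokens a reasoning model spends at inference time.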
Alibaba points out that QwQ still has limitations, such as mixing languages or getting stuck in circular reasoning loops. The model can be downloaded from Hugging Face, and an online demo is available on Hugging Face Spaces.
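For readers who want to try the model locally, here is a hedged sketch using the Hugging Face transformers library. The repo ID below is an assumption based on Qwen’s naming conventions; check the Qwen organization on the Hub for the exact name, and note that a 32B model requires substantial GPU memory.

```python
# Sketch: loading the QwQ preview with Hugging Face transformers.
# The repo ID is an assumption based on Qwen's naming conventions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/QwQ-32B-Preview"  # assumed repo ID; verify on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"  # needs `accelerate`
)

messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models emit long chains of thought, so allow a generous budget.
outputs = model.generate(inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```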
The era of LLMs gives way to LRMs: large reasoning models
The release of o1 has sparked growing interest in building LRMs, even though not much is known about how the model works under the hood beyond its use of inference-time scaling to improve responses.
There are now several Chinese competitors to o1. Chinese AI lab DeepSeek recently released R1-Lite-Preview, its o1 competitor, which is currently available only through the company’s online chat interface. R1-Lite-Preview reportedly beats o1 on several key benchmarks.
Another recently released model is LLaVA-o1, developed by researchers from several universities in China, which brings the inference-time reasoning paradigm to open-source vision language models (VLMs).
The focus on LRMs comes at a time of uncertainty about the future of model scaling laws. Reports indicate that AI labs such as OpenAI, Google DeepMind, and Anthropic are seeing diminishing returns from training ever-larger models. Generating large volumes of quality training data is also becoming more difficult, because models have already been trained on trillions of tokens collected from the internet.
Meanwhile, inference-time scaling offers an alternative that could provide the next breakthrough in improving the capabilities of the next generation of AI models. OpenAI is reportedly using o1 to generate synthetic reasoning data to train its next generation of LLMs. The release of open reasoning models is likely to spur progress and make the field more competitive.