Unveiling 'Strawberry': Advanced Reasoning Technology

TECH NEWS

Raman Kumar

7/14/2024

According to Reuters, OpenAI, led by Sam Altman, is developing a new reasoning technology for its large language models called "Strawberry." The report, based on internal documents and sources familiar with the situation, indicates that OpenAI aims for Strawberry to significantly enhance the reasoning abilities of its AI models, including ChatGPT.

According to the report, Strawberry is a closely guarded secret within OpenAI. The project was previously known as Q*, and it was already regarded inside the company as a major breakthrough.

OpenAI has reportedly shown Q* demos to select staff members, in which the models solved challenging science and math problems that stump today's commercial models.

According to the report, the internal document describes a project that uses Strawberry models to extend what OpenAI's AI can do. The aim is for these models not only to generate answers to queries but also to plan far enough ahead to navigate the internet autonomously and reliably, carrying out what OpenAI terms "deep research."

Unveiling Strawberry: The Future of AI Reasoning

Strawberry is a pivotal element in OpenAI's strategy to address various challenges. While a document seen by Reuters outlines Strawberry's goals, it omits the specifics of its implementation.

Recently, OpenAI has discreetly informed developers and other stakeholders about the impending release of technology with vastly improved reasoning abilities. This information comes from four individuals familiar with the company’s pitches, who spoke on the condition of anonymity.

Strawberry involves a specialized "post-training" process that refines OpenAI's generative AI models after their initial training on large datasets. Post-training includes methods such as fine-tuning, in which human feedback on the model's responses and examples of good and bad answers are used to improve its performance.

Strawberry is similar to the "Self-Taught Reasoner" (STaR) method developed at Stanford in 2022, which allows AI models to improve their intelligence by creating their own training data iteratively. According to Stanford professor Noah Goodman, STaR could theoretically elevate language models beyond human-level intelligence. Goodman, however, is not associated with OpenAI or Strawberry.
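To make the idea of a model "creating its own training data iteratively" more concrete, here is a minimal toy sketch of a STaR-style loop in Python. It is an illustration only and says nothing about how OpenAI actually implements Strawberry. The "model" is a hypothetical weighted choice among three arithmetic strategies, the "fine-tuning" step is a simple weight bump, and all function names are invented for this example. The real STaR method fine-tunes a language model on its own rationales and includes further steps that are omitted here.

```python
import random

# Hypothetical toy "model": a weighted choice among reasoning strategies.
# A real system would be a language model; this is only an illustration.
STRATEGIES = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def generate(weights, a, b, rng):
    """Sample a strategy and return (strategy, rationale, answer)."""
    names = list(weights)
    name = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
    answer = STRATEGIES[name](a, b)
    rationale = f"{name}({a}, {b}) = {answer}"
    return name, rationale, answer


def star_round(weights, problems, rng, step=0.5):
    """One iteration: generate rationales, keep the correct ones, update."""
    kept = []
    for a, b, target in problems:
        name, rationale, answer = generate(weights, a, b, rng)
        if answer == target:  # filter: keep only rationales that got it right
            kept.append((name, rationale))
    # Stand-in for fine-tuning: reinforce strategies behind kept rationales.
    new_weights = dict(weights)
    for name, _ in kept:
        new_weights[name] += step
    return new_weights, kept


def run_star(rounds=5, seed=0):
    """Repeat the generate / filter / update loop for a few rounds."""
    rng = random.Random(seed)
    # The task is addition; operands are chosen so that only "add" is correct.
    problems = [(a, b, a + b) for a in range(3, 8) for b in range(3, 7)]
    weights = {name: 1.0 for name in STRATEGIES}
    for _ in range(rounds):
        weights, _ = star_round(weights, problems, rng)
    return weights


if __name__ == "__main__":
    print(run_star())
```

In this sketch the model starts with no preference among strategies. Because only rationales that reach the correct answer are kept and reinforced, the weight on the correct strategy grows with each round while the others stay flat, which is the bootstrapping effect the STaR description refers to.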

OpenAI aims to enable Strawberry to perform long-horizon tasks (LHT), which involve complex, multi-step actions over extended periods. To achieve this, the company is developing and testing models on a "deep-research" dataset, although the specifics of this dataset remain unclear.

The ultimate goal is for the models to conduct autonomous web research using a "computer-using agent" (CUA) that can take actions based on its findings. Additionally, OpenAI plans to test these models' capabilities in software and machine learning engineering tasks.