News

The Decoder
the-decoder.com

OpenAI's GPT-5.4 Pro reportedly solves a longstanding open Erdős math problem in under two hours

2+ hour, 29+ min ago  (212+ words) OpenAI's GPT-5.4 Pro model has apparently solved Erdős open math problem #1196. The model reportedly found the solution in about 80 minutes and…...

The Decoder
the-decoder.com

Arcee AI spent half its venture capital to build an open reasoning model that rivals Claude Opus in agent tasks

3+ day, 3+ hour ago  (418+ words) Arcee AI has released Trinity-Large-Thinking, an open reasoning model built to compete with Claude Opus in agent tasks. The company spent roughly half its total venture capital on the project. According to the company, the team trained the base model…...

The Decoder
the-decoder.com

LLMs crush coding and math but choke on casual questions, and that's not a contradiction

5+ day, 1+ hour ago  (214+ words) AI models can solve complex programming tasks in hours but fall apart when faced with basic everyday questions. Andrej Karpathy explains why that's not actually a contradiction. There are two different ways people think about AI progress right now, according…...

The Decoder
the-decoder.com

New Stanford study reveals when teaming up AI agents is worth the compute

5+ day, 23+ hour ago  (160+ words) Multi-agent AI systems are widely considered more capable. A Stanford study shows their apparent advantage largely comes from using more compute. But there are important exceptions. A popular approach in AI research right now is multi-agent systems: multiple AI models split…...

The Decoder
the-decoder.com

Alibaba's Qwen team built HopChain to fix how AI vision models fall apart during multi-step reasoning

1+ week, 2+ day ago  (619+ words) When AI models reason about images, small perceptual errors compound across multiple steps and produce wrong answers. The HopChain framework generates multi-stage image questions that target this problem directly and improve 20 out of 24 benchmarks. Vision language models (VLMs) do…...

The Decoder
the-decoder.com

Meta's hyperagents improve at tasks and improve at improving

2+ week, 4+ day ago  (806+ words) Researchers at Meta and several universities have developed "hyperagents," AI systems that don't just solve tasks, but also optimize the very mechanism they use to get better. The approach works across different task areas and could open the door to…...

The Decoder
the-decoder.com

Meta's new AI model predicts how your brain reacts to images, sounds, and speech

2+ week, 4+ day ago  (781+ words) A new AI model from Meta predicts how the human brain reacts to images, sounds, and speech. In tests, it often matched the typical brain response better than any single person's scan. Brain research requires new recordings for every new…...

The Decoder
the-decoder.com

Math needs thinking time, everyday knowledge needs memory, and a new Transformer architecture aims to deliver both

3+ week, 3+ day ago  (304+ words) A German research team lets Transformer models decide for themselves how many times they think about a problem. Combined with additional memory, the approach clearly outperforms larger models on math problems. The base architecture is a decoder-only transformer with 12 layers…...

The Decoder
the-decoder.com

Terence Tao says AI drives idea generation cost to near zero but shifts the bottleneck to verification

3+ week, 3+ day ago  (402+ words) Mathematician Terence Tao compares the influence of AI and formalization on mathematical practice with the impact of the automobile on urban…...

The Decoder
the-decoder.com

Qualcomm shrinks AI reasoning chains by 2.4x to fit thinking models on smartphones

3+ week, 5+ day ago  (364+ words) Qualcomm AI Research has developed a modular system that brings reasoning-capable language models to smartphones by compressing the models' verbose thought processes by a factor of 2.4. Current reasoning models pose a fundamental problem on mobile devices because their lengthy thought…...