News
Meta's Pixio proves simple pixel reconstruction can beat complex vision models
5+ hour, 8+ min ago (469+ words) Researchers at Meta AI have developed an image model that learns purely through pixel reconstruction. Pixio beats more complex methods for depth estimation and 3D reconstruction, despite having fewer parameters and a simpler training approach. A common way to teach AI…...
Leading OpenAI researcher announced a GPT-5 math breakthrough that never happened
2+ mon, 1+ week ago (242+ words) OpenAI researchers recently claimed a major math breakthrough on X, but quickly walked it back after criticism from the community, including Deepmind CEO Demis Hassabis, who called out the sloppy communication. It started with a now-deleted tweet from OpenAI manager…...
AlphaGeometry2: Deepmind AI outperforms math Olympians at geometry tasks
10+ mon, 2+ week ago (542+ words) The latest version of Deepmind's AlphaGeometry system can solve geometry problems better than most human experts, matching the performance of top math competition winners. AlphaGeometry2 solves 84% of International Mathematical Olympiad (IMO) geometry problems from 2000 to 2024, up from its predecessor's 54%. On the…...
Getting the right data and telling it to 'wait' turns an LLM into a reasoning model
10+ mon, 3+ week ago (454+ words) A new approach shows that carefully selected training data and flexible test-time compute control can help AI models tackle complex reasoning tasks more efficiently. From a pool of nearly 60,000 question-answer pairs, researchers selected just 1,000 high-quality examples that met three key…...
Reasoning models like Deepseek-R1 and OpenAI o1 suffer from 'underthinking', study finds
10+ mon, 3+ week ago (828+ words) Chinese researchers have discovered why AI models often struggle with complex reasoning tasks: They tend to drop promising solutions too quickly, leading to wasted computing power and lower accuracy.…...
Frontier models fail hard at "Humanity's Last Exam" but experts question if it matters
11+ mon, 2+ day ago (512+ words) Artificial Intelligence: News, Business, Research An international research team has developed a new benchmark that reveals the current limitations of LLMs. Even the most advanced models fail at 90 percent of the tasks - for now. The test, called "Humanity's Last Exam…...
DeepSeek's latest R1-Zero model matches OpenAI's o1 in reasoning benchmarks
11+ mon, 6+ day ago (613+ words) Chinese AI startup DeepSeek has released two new AI models that they say match OpenAI's o1 in performance. Along with their main models, DeepSeek-R1 and DeepSeek-R1-Zero, they've also launched six smaller open-source versions, with some…...
AI agents team up in Agent Laboratory to speed scientific research
11+ mon, 2+ week ago (489+ words) Johns Hopkins University and AMD have developed Agent Laboratory, a new open-source framework that pairs human creativity with AI-powered workflows. Unlike other AI tools that try to come up with research ideas on their own,…...