News

36 Kr
eu.36kr.com > en > p > 3708363640172676

Reinforcement Learning in the Agent Era: AReaL Framework and Best Practices for Agents - Optimized Guide

2+ hour ago  (255+ words) With the rapid development of large models and Agent technology, Reinforcement Learning (RL) is becoming the key engine to enhance the autonomous decision-making ability of AI agents. However, traditional RL training methods face challenges such as high computational costs,…

36 Kr
eu.36kr.com > en > p > 3708351596769667

MIT's APOLLO Framework: Breaking Traditional Multimodal Integration Limits for Clear Separation of Cell-Shared and Cell-Specific Information

2+ hour, 30+ min ago  (350+ words) Therefore, with the continuous evolution of single-cell technologies and the rapid growth of data scale, how to efficiently and automatically integrate multi-modal data while clearly decoupling shared information and modality-specific information has become a core challenge in current single-cell biology....

36 Kr
eu.36kr.com > en > p > 3705238289822081

Brain Cells Turned into Chips to Play Doom: 200,000 Living Neurons Find Own Way to Kill Enemies, Crushing Deep Reinforcement Learning in Learning Efficiency

2+ day, 7+ hour ago  (1340+ words) 200,000 human brain cells have formed a "brain CPU" and learned to play the classic game Doom. These living neurons have learned to find enemies, shoot, turn and move, and even manage ammunition through reinforcement learning. It was the same technology…

36 Kr
eu.36kr.com > en > p > 3681358342909568

Real-Machine RL Goes Crazy: Robot Self-Learns in 20 Minutes and Scores 100 Points, Digital Twin a Legend

2+ week, 5+ day ago  (1330+ words) [Introduction] TwinRL constructs a digital twin by scanning the scene once with a mobile phone, allowing robots to boldly explore and accurately test errors in the digital twin first. Then, when returning to the real machine, it can cover the…

36 Kr
eu.36kr.com > en > p > 3678616150303361

U.S. Energy Department Scientists' D-CHAG Method: Reduce Memory Usage by Up to 75% for Multi-channel Datasets in Extremely Large-scale Models

3+ week, 2+ hour ago  (862+ words) Scientists from the Oak Ridge National Laboratory of the U.S. Department of Energy have proposed a Distributed Cross-Channel Hierarchical Aggregation method (D-CHAG) for foundation models. This method distributes the tokenization process and adopts a hierarchical strategy for channel aggregation, enabling extremely…

36 Kr
eu.36kr.com > en > p > 3678574192993156

Peking University Unveils Open-Source Fine-Grained Visual Recognition Large Model Surpassing CLIP, Trained with Just 4 Images per Class

3+ week, 3+ hour ago  (461+ words) Currently, multi-modal large models perform excellently on many complex multi-modal tasks but significantly lag behind the visual encoders they rely on (such as CLIP) in fine-grained visual recognition tasks. In response, the research team led by Professor Peng Yuxin from…

36 Kr
eu.36kr.com > en > p > 3678264053719682

Milestone: 100B Diffusion Language Model Reaches 892 Tokens/Second, Successfully Exploring an Alternative Path for AI

3+ week, 7+ hour ago  (956+ words) The diffusion language model (dLLM), a research direction once considered a "niche track," has finally achieved a qualitative change. Last Monday, LLaDA2.1 was quietly launched on HuggingFace, just two months after the release of the previous version, LLaDA2.0. This release includes two versions:…

36 Kr
eu.36kr.com > en > p > 3678363993252488

World Model Driven: Embodied Intelligence Leaves the Era of "Blind Action" Behind

3+ week, 7+ hour ago  (978+ words) Embodied intelligence is undergoing a silent paradigm shift. Almost at the same time, teams from Stanford, NVIDIA, etc. jointly released Cosmos Policy, realizing that "robotic actions can be learned only with a video generation model"; NVIDIA then released DreamZero, which…

36 Kr
eu.36kr.com > en > p > 3675921665368961

1.8x Increase in Training Speed, 78% Reduction in Inference Overhead: Accurate Question Selection Efficiently Accelerates RL Training

3+ week, 2+ day ago  (1063+ words) A series of works based on Reinforcement Learning with Verifiable Rewards (RLVR) fine-tuning, represented by DeepSeek R1, have significantly improved the reasoning ability of large language models. However, behind this wave, the cost of reinforcement fine-tuning is astonishingly high....

36 Kr
eu.36kr.com > en > p > 3659278714839936

New Nature Cover: Google's Alpha Series Newcomer Instantly Grasps Life's Ultimate Blueprint

1+ mon, 6+ day ago  (1061+ words) This is all thanks to the unified DNA sequence model AlphaGenome, which graced the latest cover of the authoritative scientific journal Nature. It is a new addition to Google DeepMind's Alpha series of AIs. AlphaGenome is a deep learning model…