News

@Anthropic AI
anthropic.com > research > automated-alignment-researchers

Automated Alignment Researchers: Using large language models to scale scalable oversight

8+ hour, 22+ min ago  (420+ words) One is how alignment can keep up. Frontier AI models are now contributing to the development of their successors. But can they provide the same kind of uplift for alignment researchers? Could our language models be used to help align…...

@Anthropic AI
anthropic.com > research > assistant-axis

The assistant axis: situating and stabilizing the character of large language models

2+ mon, 3+ week ago  (947+ words) We can investigate these questions by looking at the neural representations inside language models: the patterns of activity that inform how they respond. In a new paper, conducted through the MATS and Anthropic Fellows programs, we look at several open-weights language…...

@Anthropic AI
anthropic.com > engineering > building-effective-agents

Building Effective AI Agents

9+ mon, 3+ week ago  (1292+ words) Published Dec 19, 2024 We've worked with dozens of teams building LLM agents across industries. Consistently, the most successful implementations use simple, composable patterns rather than complex frameworks. Over the past year, we've worked with dozens of teams building large language model…...