News

lesswrong.com > posts > ZvQfcLbcNHYqmvWyo > the-paper-that-killed-deep-learning-theory

The paper that killed deep learning theory | LessWrong

6+ hours, 37+ min ago (554+ words) Around 10 years ago, a paper came out that arguably killed classical deep learning theory: Zhang et al.'s aptly titled Understanding deep learning requires rethinking generalization. Believe it or not, this unassuming table rocked the field of deep learning theory…

lesswrong.com > posts > 2WkeiYFT5p3ph8Pkf > quick-paper-review-there-will-be-a-scientific-theory-of-deep

Quick Paper Review: "There Will Be a Scientific Theory of Deep Learning" | LessWrong

1+ day, 6+ hours ago (586+ words) h/t Eric Michaud for sharing his paper with me. There's a tradition of high-impact ML papers using short, punchy categorical sentences as their titles: Understanding Deep Learning Requires Rethinking Generalization, Attention Is All You Need, Language Models Are Few…

lesswrong.com > posts > c7D2q7k97QcwCxBrN > what-holds-ai-safety-together-co-authorship-networks-from

What holds AI safety together? Co-authorship networks from 200 papers | LessWrong

1+ day, 12+ hours ago (206+ words) We (social science PhD students) computed co-authorship networks from a corpus of 200 AI safety papers covering 2015-2025, and we'd like your help checking whether the underlying dataset is right. Of course, these visualizations are only as good as the…

lesswrong.com > posts > fabXad4XGkCgAbXcH > an-empirical-study-of-methods-for-sfting-opaque-reasoning

An Empirical Study of Methods for SFTing Opaque Reasoning Models | LessWrong

1+ day, 20+ hours ago (1218+ words) We open-source our code here. Alek previously sketched a few ideas for how we might still be able to do SFT on opaque reasoning models. In this post, we try some of them against prompted sandbaggers. We test two kinds of…

lesswrong.com > posts > drmuDn4dwLBg3idxQ > mathematics-and-empiricism

Mathematics and Empiricism | LessWrong

1+ day, 21+ hours ago (1775+ words) In Does the Universe Speak a Language We Just Made Up?, Lorenzo Elijah, PhD, shares his fascination with math and echoes a common idea among philosophers that the "surprising efficiency of math" is a problem for empiricism and physicalism…

lesswrong.com > posts > Bo4FbDxb3YrZwap3J > monthly-roundup-41-april-2025

Monthly Roundup #41: April 2025 | LessWrong

2+ days, 12+ min ago (1896+ words) AI continues to accelerate and dominate the schedule, which is why this is a bit late, but we do occasionally need to pay our respects to the Goddess of Everything Else. There are cool or interesting things everywhere. Also maddening things…

lesswrong.com > posts > vqQPYZLiEyFn9YSbY > diary-of-a-doomer-12-years-arguing-about-ai-risk-part-2

Diary of a "Doomer": 12+ years arguing about AI risk (part 2) | LessWrong

2+ days, 7+ hours ago (585+ words) Awareness of and concern about the extinction risk posed by AI have been increasing the whole time I've been in the field. It feels like it's finally going mainstream. But it's also felt this way before… But around the same time…

lesswrong.com > posts > sXi3Bo339AkrWhJLQ > raising-ai-by-lowering-expectations

Raising AI by Lowering Expectations | LessWrong

2+ days, 12+ hours ago (903+ words) > De Kai's Raising AI argues that fear-based framing in AI discourse is limiting us, and that we should think of AI as something we're raising rather…

lesswrong.com > posts > bnyPy64ck38Cib2v5 > what-happens-when-a-model-thinks-it-is-agi

What Happens When a Model Thinks It Is AGI? | LessWrong

2+ days, 14+ hours ago (789+ words) The behaviours relevant for AI safety are the behaviours models exhibit under the conditions they will actually face. Right now, we think it's fair to say many current safety concerns are conditional: a model might behave badly if it believed…

lesswrong.com > posts > g8by3avjatXnpvM4A > should-we-train-against-cot-monitors-1

Should We Train Against (CoT) Monitors? | LessWrong

2+ days, 18+ hours ago (1768+ words) The question I actually try to answer in this post is a broader one (that doesn't work as well as a title): Should we incorporate proxies for desired behavior into LLM alignment training? Epistemic status: My best guess. I tentatively…