The Alignment Problem: Machine Learning and Human Values

by Brian Christian

Rating: 4.3 stars

Brian Christian traces the history and current frontier of efforts to build AI systems that reliably reflect human values, drawing on hundreds of interviews with researchers in machine learning, cognitive science, and philosophy. Organized into three parts, Prophecy, Agency, and Normativity, the book shows how biased training data, misspecified reward functions, and the gap between optimization targets and human intent produce systems that diverge from their creators' goals.

Published: 2020
Pages: 496

In the Conversation

In this collection, The Alignment Problem: Machine Learning and Human Values references four other books.

It draws on Superintelligence, Life 3.0, Thinking, Fast and Slow, and The Precipice.

What This Book Draws On

The books Christian references and why each one mattered to the argument.

Christian frames his book as an investigation into the practical, present-day form of the alignment concerns Bostrom raised theoretically in Superintelligence, showing how misalignment already affects deployed ML systems.

References

Superintelligence

by Nick Bostrom

Discusses the taxonomy of AI futures in Tegmark's Life 3.0 when analyzing how different alignment failure modes map onto different scenario categories for artificial general intelligence.

References

Life 3.0

by Max Tegmark

Draws extensively on the framework of cognitive biases in Kahneman's Thinking, Fast and Slow to explain how human heuristics encoded in training data cause machine learning models to inherit and amplify systematic errors.

References

Thinking, Fast and Slow

by Daniel Kahneman

Engages with the argument in Ord's The Precipice about existential risk from advanced AI, situating alignment research as one of the most urgent interventions for reducing catastrophic outcomes.

References

The Precipice

by Toby Ord

What Other Authors Say About It

No books citing this title yet.

Intellectual Lineage

How ideas flow through the citation network. Ancestors are books this title builds on; descendants are books that build on it.

Builds on (2 layers deep)

Directly cites

Superintelligence

Life 3.0

Thinking, Fast and Slow

The Precipice

2 steps back

The Master Algorithm

The Black Swan

Stumbling on Happiness

The Wisdom of Crowds

Sources of Power

Predictably Irrational

Unexpected Connections

Books from completely different categories that share citation overlap with this one. These are the reads you would not find by browsing a single shelf.