Detecting Alignment Faking in Language Models

Welcome to my new site. I’ll be writing about AI safety research, alignment faking detection, and whatever else seems interesting.