About

I’m Vincent Oh, an AI safety researcher based in Canada.

My work focuses on detecting alignment faking in large language models using sparse autoencoder (SAE) probes and mechanistic interpretability techniques. I’m interested in building practical tools for AI oversight that scale beyond manual inspection.

Background

AI safety research (alignment faking detection, mechanistic interpretability)
Infrastructure and systems engineering (BIRS video systems, deployment pipelines)
Open-source contributor

Contact

Find me on GitHub, Hugging Face, or Kaggle.