SAFERR AI Lab

Advancing research in Safety, Reliability, and Robustness of AI

At SAFERR AI Lab, we focus on developing and testing AI systems that are safe, reliable, and robust. Our interdisciplinary research addresses critical challenges in ensuring AI systems operate dependably in real-world settings.

Latest News

May 7, 2025

New paper accepted at ICML 2025

We're excited to announce that our paper `Inference-Time Alignment of LLMs via User-Specified Multi-Criteria Transfer Decoding` has been accepted at ICML 2025. This work presents an inference-time method for aligning LLMs with user-specified criteria, without additional fine-tuning of the base model.

December 12, 2024

New paper accepted at AAAI 2024

Our paper `Align-Pro: A Principled Approach to Alignment of LLMs` has been accepted at AAAI 2024. This work presents a principled approach to aligning LLMs by employing a trainable prompter.

August 19, 2024

Welcoming a PhD student to the lab

We're delighted to welcome a new PhD student, Avinash Reddy, who joins our lab this fall semester. He will be working on the broad topic of `Alignment of Language Models`.

Research Areas

Our interdisciplinary team works across these key areas to address the critical challenges in AI safety, reliability, and robustness.

Safety

Designing AI systems that proactively avoid harmful behaviors and operate within clearly defined safety boundaries, even under uncertain real-world conditions.

Reliability

Ensuring AI systems deliver consistent, predictable, and verifiable behavior across tasks, environments, and deployment scenarios.

Robustness

Developing AI systems resilient to adversarial inputs, sensor noise, and distributional shifts, enabling reliable performance in dynamic and imperfect settings.

Explainability

Building interpretability tools that clarify how and why AI systems make decisions, enabling trust and deeper understanding of model behavior.

Human Alignment

Aligning AI behavior with human intent, preferences, and values through preference learning, value modeling, and robust evaluation protocols.

Ethical Governance

Studying the societal implications of AI and building frameworks to embed accountability, fairness, and transparency into development workflows.