Publications
Explore our research publications on the safety, reliability, and robustness of AI systems.
Towards Robust Large Language Models: A Safety and Reliability Benchmark
Jane Smith, John Doe, Alice Johnson • NeurIPS 2023
In this paper, we introduce a comprehensive benchmark for evaluating the safety and reliability of large language models. We propose a suite of tests that assess models' resilience against adversarial attacks, sensitivity to distribution shifts, and ability to maintain factual consistency. Our findings indicate significant gaps in current model robustness that must be addressed for deployment in critical applications.
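To make the flavor of such a suite concrete, here is a minimal, purely illustrative harness in Python: the model under test is abstracted as a callable, and the test group, prompt, and check shown are hypothetical stand-ins, not part of the paper's actual benchmark.

```python
# Illustrative sketch only: a tiny harness in the spirit of the benchmark
# described above. The test names, prompts, and scoring are hypothetical.
from typing import Callable, Dict, List

def run_safety_suite(model: Callable[[str], str],
                     tests: Dict[str, List[dict]]) -> Dict[str, float]:
    """Score a model on named test groups; each case supplies a prompt and
    a predicate that decides whether the response is acceptable."""
    scores = {}
    for name, cases in tests.items():
        passed = sum(case["check"](model(case["prompt"])) for case in cases)
        scores[name] = passed / len(cases)
    return scores

# Example: a single factual-consistency case (hypothetical).
tests = {
    "factual_consistency": [
        {"prompt": "What is the capital of France?",
         "check": lambda r: "paris" in r.lower()},
    ],
}
print(run_safety_suite(lambda p: "Paris is the capital of France.", tests))
```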
Adversarial Robustness in Reinforcement Learning for Safety-Critical Systems
Robert Chen, Sarah Williams, Michael Brown • ICML 2023
Reinforcement learning systems deployed in safety-critical domains must be robust to adversarial perturbations. We present a novel approach to training reinforcement learning agents that can withstand targeted attacks while maintaining performance. Our method incorporates worst-case scenario planning and robust optimization techniques, demonstrating significant improvements in safety metrics across multiple environments.
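As a rough illustration of training against observation attacks, the sketch below estimates a worst-case return over sampled bounded perturbations, one simple instance of robust optimization in RL; the policy, environment, and perturbation budget are toy stand-ins, not the authors' method.

```python
# Illustrative sketch: evaluate a policy under the worst of several bounded
# observation perturbations. Everything here is a hypothetical stand-in.
import numpy as np

def worst_case_return(policy, env_step, obs, epsilon=0.1, n_samples=8, horizon=20):
    """Estimate the return under the worst of n sampled observation
    perturbations bounded by epsilon in the infinity norm."""
    worst = np.inf
    for _ in range(n_samples):
        total, o = 0.0, obs.copy()
        for _ in range(horizon):
            noisy = o + np.random.uniform(-epsilon, epsilon, size=o.shape)
            action = policy(noisy)          # agent acts on the perturbed view
            o, reward, done = env_step(o, action)
            total += reward
            if done:
                break
        worst = min(worst, total)           # keep the most adversarial rollout
    return worst

# Toy usage with 1-D dynamics (hypothetical).
policy = lambda o: -np.sign(o)                               # push state to zero
env_step = lambda o, a: (o + 0.1 * a, float(-abs(o).sum()), False)
print(worst_case_return(policy, env_step, np.array([1.0])))
```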
Robustness in Multimodal AI Systems: Challenges and Solutions
Alex Wong, Priya Patel, James Wilson • ICCV 2023
Multimodal AI systems combining vision, language, and audio face unique robustness challenges. We present a comprehensive analysis of failure modes in multimodal systems and propose novel defense mechanisms. Our evaluation across multiple benchmarks shows significant improvements in cross-modal robustness.
Safety Guarantees in Federated Learning Systems
Emma Davis, Carlos Rodriguez, Yuki Tanaka • ICML 2023
Federated learning introduces unique safety challenges due to its distributed nature. We develop a framework for ensuring safety guarantees in federated learning systems while maintaining privacy. Our approach combines differential privacy with robust aggregation methods to prevent both privacy leaks and adversarial attacks.
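One plausible way to combine the two ingredients the abstract names is norm clipping plus a coordinate-wise trimmed mean with Gaussian noise. The sketch below is an assumption-laden illustration: the clip bound, trim fraction, and noise scale are chosen arbitrarily and do not reflect the paper's framework.

```python
# Illustrative sketch: clip each client update, drop per-coordinate extremes,
# average the rest, and add Gaussian noise. Parameters are hypothetical.
import numpy as np

def private_robust_aggregate(updates, trim_frac=0.1, clip=1.0, noise_std=0.05,
                             rng=np.random.default_rng(0)):
    """Trimmed-mean aggregation of clipped client updates with added noise."""
    u = np.stack(updates)
    norms = np.linalg.norm(u, axis=1, keepdims=True)
    u = u * np.minimum(1.0, clip / np.maximum(norms, 1e-12))   # norm clipping
    k = int(len(u) * trim_frac)
    u = np.sort(u, axis=0)                  # sort each coordinate independently
    trimmed = u[k:len(u) - k].mean(axis=0)  # trimmed mean resists outliers
    return trimmed + rng.normal(0.0, noise_std, size=trimmed.shape)

# Ten honest clients plus one outlier; the outlier is clipped and trimmed away.
updates = [np.ones(4) * 0.1 + 0.01 * i for i in range(10)] + [np.ones(4) * 100]
print(private_robust_aggregate(updates))
```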
Uncertainty Quantification for Deep Learning in Safety-Critical Applications
Michael Chang, Sarah Anderson, Raj Patel • NeurIPS 2023
Accurate uncertainty quantification is crucial for deploying deep learning in safety-critical applications. We propose a novel framework that combines Bayesian neural networks with conformal prediction to provide reliable uncertainty estimates. Our method achieves state-of-the-art performance on multiple safety-critical benchmarks.
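Split conformal prediction, one of the components named here, can be sketched in a few lines; the linear model and synthetic calibration data below are placeholders and say nothing about the paper's Bayesian setup.

```python
# Illustrative sketch of split conformal prediction for regression.
import numpy as np

def conformal_interval(predict, X_cal, y_cal, x_new, alpha=0.1):
    """Calibrate absolute residuals on held-out data, then return a
    prediction interval with ~(1 - alpha) marginal coverage."""
    residuals = np.abs(y_cal - predict(X_cal))
    n = len(residuals)
    # Finite-sample-corrected quantile of the calibration residuals.
    q = np.quantile(residuals, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    pred = predict(x_new)
    return pred - q, pred + q

# Toy usage: a linear "model" with noisy calibration labels (synthetic).
rng = np.random.default_rng(1)
X_cal = rng.uniform(0, 1, size=200)
y_cal = 2 * X_cal + rng.normal(0, 0.1, size=200)
print(conformal_interval(lambda x: 2 * x, X_cal, y_cal, np.array([0.5])))
```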
Robust Natural Language Processing for Low-Resource Languages
Ling Wei, David Kumar, Maria Garcia • ACL 2023
NLP systems for low-resource languages face unique robustness challenges. We present a novel approach that leverages cross-lingual transfer learning and adversarial training to improve robustness. Our method shows significant improvements in handling code-switching, dialect variations, and noisy inputs.
Technical Approaches to AI Alignment: A Survey
Rachel Green, Tom Wilson, Hiroshi Tanaka • AIES 2023
This survey paper examines technical approaches to AI alignment, focusing on methods for ensuring AI systems behave according to human values. We analyze various alignment techniques, their limitations, and future research directions. Our analysis provides a comprehensive overview of the current state of AI alignment research.
Robust Computer Vision for Autonomous Systems
Kevin Zhang, Lisa Brown, Ahmed Hassan • CVPR 2023
Autonomous systems require highly robust computer vision capabilities. We present a novel framework that combines adversarial training with uncertainty-aware decision making. Our approach significantly improves robustness against various types of visual perturbations while maintaining high performance on clean inputs.
Formal Methods for Neural Network Verification
Sophie Martin, Rajesh Kumar, Emma Wilson • CAV 2023
We present novel formal methods for verifying neural network properties. Our approach combines symbolic execution with abstract interpretation to efficiently verify complex properties. The method scales to large networks while providing strong guarantees about network behavior.
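A standard abstract-interpretation domain for this problem is interval bound propagation; the sketch below bounds a toy two-layer ReLU network over an input box. The network weights and the property checked are hypothetical, not the paper's benchmarks.

```python
# Illustrative sketch of interval bound propagation (IBP) through a ReLU net.
import numpy as np

def ibp_layer(lo, hi, W, b):
    """Propagate an interval [lo, hi] through an affine layer exactly."""
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def ibp_forward(lo, hi, layers):
    """Bound the outputs of a ReLU network over an input box."""
    for i, (W, b) in enumerate(layers):
        lo, hi = ibp_layer(lo, hi, W, b)
        if i < len(layers) - 1:             # ReLU on all hidden layers
            lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)
    return lo, hi

# Check that the output stays positive on a small box around x = [1, 1].
layers = [(np.array([[1.0, -0.5], [0.5, 1.0]]), np.zeros(2)),
          (np.array([[1.0, 1.0]]), np.array([0.5]))]
lo, hi = ibp_forward(np.array([0.9, 0.9]), np.array([1.1, 1.1]), layers)
print(lo, hi, "property holds" if lo[0] > 0 else "inconclusive")
```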
Robust Reinforcement Learning for Real-World Applications
Daniel Lee, Priya Sharma, Marcus Chen • ICML 2023
Real-world applications of reinforcement learning require robust performance under uncertainty. We propose a novel framework that combines robust optimization with meta-learning to improve generalization. Our method shows significant improvements in handling distribution shifts and adversarial perturbations.
Comprehensive Benchmarks for AI Safety Evaluation
Anna White, Carlos Rodriguez, Yuki Tanaka • NeurIPS 2023
We introduce a comprehensive suite of benchmarks for evaluating AI safety. Our benchmarks cover various aspects including robustness, fairness, privacy, and alignment. The suite provides standardized evaluation protocols and metrics for comparing different safety approaches.
Formal Verification Methods for Neural Network Control Systems
David Wilson, Elena Martinez, Chris Taylor • ICLR 2022
This paper addresses the challenge of formally verifying neural network-based control systems. We introduce a scalable verification framework that can provide guarantees about the behavior of neural controllers even in the presence of uncertain inputs. Our approach combines abstract interpretation with reachability analysis to efficiently handle nonlinear neural networks in closed-loop systems.
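To illustrate one reachability step for such a closed loop: bound the controller's output over the current state box (for instance with an interval analysis like the one sketched under the CAV 2023 entry above), then push the box through the dynamics with interval arithmetic. The double-integrator dynamics and controller bounds below are hypothetical, not the paper's framework.

```python
# Illustrative sketch of one closed-loop reachability step for x' = A x + B u.
import numpy as np

def interval_matvec(A, lo, hi):
    """Exact interval image of the box [lo, hi] under a linear map A."""
    A_pos, A_neg = np.maximum(A, 0), np.minimum(A, 0)
    return A_pos @ lo + A_neg @ hi, A_pos @ hi + A_neg @ lo

def reach_step(A, B, x_lo, x_hi, u_lo, u_hi):
    """Over-approximate the next-state set given interval bounds on the
    state box and on the controller output."""
    ax_lo, ax_hi = interval_matvec(A, x_lo, x_hi)
    bu_lo, bu_hi = interval_matvec(B, u_lo, u_hi)
    return ax_lo + bu_lo, ax_hi + bu_hi

# Toy double-integrator step; the controller is assumed bounded in [-1, 1].
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
print(reach_step(A, B, np.array([0.0, 0.0]), np.array([0.1, 0.1]),
                 np.array([-1.0]), np.array([1.0])))
```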
Towards Human-Aligned Explanations in Artificial Intelligence Systems
Lisa Johnson, Mark Thompson, Wei Zhang • AAAI 2022
Explainable AI systems must provide justifications that align with human understanding. We propose a novel framework for generating explanations that match human mental models while maintaining fidelity to the underlying AI system. Through human studies, we demonstrate that our approach produces explanations that are both more interpretable and more useful for enabling effective human oversight.
Detecting and Adapting to Distribution Shifts in Deep Learning Systems
Thomas Lee, Jennifer Garcia, Ryan Kim • CVPR 2022
Distribution shifts pose significant challenges to deployed machine learning systems. We present a framework for continuously monitoring and adapting to distribution shifts in real-time. Our approach combines statistical tests for shift detection with adaptive retraining strategies, enabling systems to maintain performance in changing environments without requiring manual intervention.
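A minimal version of the shift-detection half of such a pipeline is a two-sample test on a feature stream. The sketch below uses a Kolmogorov-Smirnov test with hypothetical window sizes and threshold; it says nothing about the paper's actual detector or retraining strategy.

```python
# Illustrative sketch: flag drift when a KS test rejects that a recent window
# matches the reference distribution. Windows and alpha are hypothetical.
import numpy as np
from scipy.stats import ks_2samp

def shift_detected(reference, window, alpha=0.01):
    """True when the two-sample KS test rejects the no-shift hypothesis."""
    return ks_2samp(reference, window).pvalue < alpha

rng = np.random.default_rng(0)
reference = rng.normal(0, 1, size=2000)      # training-time feature values
drifted = rng.normal(0.5, 1, size=500)       # live window with a mean shift
print(shift_detected(reference, drifted))    # True: retraining would trigger
```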
Robust Machine Learning: Theory and Practice
Michael Brown, Sarah Johnson, David Wilson • JMLR 2022
This paper presents a comprehensive study of robust machine learning from both theoretical and practical perspectives. We analyze various robustness guarantees and their implications for real-world applications. Our findings provide insights into the trade-offs between robustness and performance.
Ethical Considerations in AI Development
Emma Davis, James Wilson, Maria Garcia • AIES 2022
We examine key ethical considerations in AI development and deployment. The paper discusses issues of fairness, transparency, and accountability, providing practical guidelines for ethical AI development. Our framework helps organizations navigate complex ethical challenges in AI projects.
Robust Natural Language Processing
Ling Wei, Tom Chen, Priya Patel • ACL 2022
This paper presents novel approaches to improving robustness in natural language processing systems. We address challenges in handling noisy inputs, adversarial attacks, and distribution shifts. Our methods show significant improvements in maintaining performance under various types of perturbations.
Formal Verification of Deep Learning Systems
Sophie Martin, Rajesh Kumar, Emma Wilson • CAV 2022
We present novel methods for formal verification of deep learning systems. Our approach combines symbolic execution with abstract interpretation to efficiently verify complex properties. The method scales to large networks while providing strong guarantees about network behavior.
Robust Computer Vision Systems
Kevin Zhang, Lisa Brown, Ahmed Hassan • CVPR 2022
This paper presents novel approaches to improving robustness in computer vision systems. We address challenges in handling adversarial attacks, distribution shifts, and real-world perturbations. Our methods show significant improvements in maintaining performance under various types of visual perturbations.
AI Safety: Current Challenges and Future Directions
Anna White, Carlos Rodriguez, Yuki Tanaka • AIES 2022
We examine current challenges and future directions in AI safety research. The paper discusses various aspects of safety including robustness, alignment, and ethical considerations. Our analysis provides insights into key research areas and potential solutions.
An Ethical Framework for AI Development in High-Stakes Domains
Sophia Rodriguez, Daniel Park, Maria Nguyen • FAccT 2021
Deploying AI systems in high-stakes domains requires careful ethical consideration. We propose a comprehensive framework for ethically developing and evaluating AI systems that impact human well-being. Our framework integrates principles from philosophy, risk assessment, and stakeholder engagement to provide practical guidance for AI practitioners working in sensitive areas such as healthcare, criminal justice, and financial services.
Robust Machine Learning: A Survey
Michael Brown, Sarah Johnson, David Wilson • JMLR 2021
This survey paper examines various approaches to robust machine learning. We analyze different robustness guarantees, defense mechanisms, and evaluation methods. Our analysis provides a comprehensive overview of the current state of robust machine learning research.
Ethical AI: Principles and Practices
Emma Davis, James Wilson, Maria Garcia • AIES 2021
We present a comprehensive framework for ethical AI development and deployment. The paper discusses key principles and practical guidelines for ensuring ethical AI systems. Our framework helps organizations navigate complex ethical challenges in AI projects.
Robust Natural Language Processing: Challenges and Solutions
Ling Wei, Tom Chen, Priya Patel • ACL 2021
This paper examines challenges and solutions in robust natural language processing. We address issues in handling noisy inputs, adversarial attacks, and distribution shifts. Our methods show significant improvements in maintaining performance under various types of perturbations.
Formal Verification Methods for AI Systems
Sophie Martin, Rajesh Kumar, Emma Wilson • CAV 2021
We present novel methods for formal verification of AI systems. Our approach combines symbolic execution with abstract interpretation to efficiently verify complex properties. The method scales to large systems while providing strong guarantees about system behavior.
Robust Computer Vision: Theory and Applications
Kevin Zhang, Lisa Brown, Ahmed Hassan • CVPR 2021
This paper presents theoretical foundations and practical applications of robust computer vision. We address challenges in handling adversarial attacks, distribution shifts, and real-world perturbations. Our methods show significant improvements in maintaining performance under various types of visual perturbations.
AI Safety: A Comprehensive Review
Anna White, Carlos Rodriguez, Yuki Tanaka • AIES 2021
We present a comprehensive review of AI safety research. The paper examines various aspects of safety including robustness, alignment, and ethical considerations. Our analysis provides insights into key research areas and potential solutions.
Robust Machine Learning: Foundations and Applications
Michael Brown, Sarah Johnson, David Wilson • JMLR 2020
This paper examines foundations and applications of robust machine learning. We analyze different robustness guarantees, defense mechanisms, and evaluation methods. Our analysis provides insights into the trade-offs between robustness and performance.
Ethical AI Development: A Framework
Emma Davis, James Wilson, Maria Garcia • AIES 2020
We present a framework for ethical AI development. The paper discusses key principles and practical guidelines for ensuring ethical AI systems. Our framework helps organizations navigate complex ethical challenges in AI projects.
Robust Natural Language Processing Systems
Ling Wei, Tom Chen, Priya Patel • ACL 2020
This paper presents novel approaches to building robust natural language processing systems. We address challenges in handling noisy inputs, adversarial attacks, and distribution shifts. Our methods show significant improvements in maintaining performance under various types of perturbations.
Formal Verification of Machine Learning Systems
Sophie Martin, Rajesh Kumar, Emma Wilson • CAV 2020
We present novel methods for formal verification of machine learning systems. Our approach combines symbolic execution with abstract interpretation to efficiently verify complex properties. The method scales to large systems while providing strong guarantees about system behavior.
Robust Computer Vision: Methods and Applications
Kevin Zhang, Lisa Brown, Ahmed Hassan • CVPR 2020
This paper presents methods and applications of robust computer vision. We address challenges in handling adversarial attacks, distribution shifts, and real-world perturbations. Our methods show significant improvements in maintaining performance under various types of visual perturbations.
AI Safety: Current State and Future Directions
Anna White, Carlos Rodriguez, Yuki Tanaka • AIES 2020
We examine the current state and future directions of AI safety research. The paper discusses various aspects of safety including robustness, alignment, and ethical considerations. Our analysis provides insights into key research areas and potential solutions.
Robust Machine Learning: A Comprehensive Study
Michael Brown, Sarah Johnson, David Wilson • JMLR 2019
This paper presents a comprehensive study of robust machine learning. We analyze different robustness guarantees, defense mechanisms, and evaluation methods. Our analysis provides insights into the trade-offs between robustness and performance.
Ethical AI: Principles and Implementation
Emma Davis, James Wilson, Maria Garcia • AIES 2019
We present principles and implementation guidelines for ethical AI development. The paper discusses key considerations and practical approaches for ensuring ethical AI systems. Our framework helps organizations navigate complex ethical challenges in AI projects.
Robust Natural Language Processing: A Survey
Ling Wei, Tom Chen, Priya Patel • ACL 2019
This survey paper examines approaches to robust natural language processing. We address challenges in handling noisy inputs, adversarial attacks, and distribution shifts. Our analysis provides a comprehensive overview of current methods and future directions.
Formal Verification of AI Systems: A Survey
Sophie Martin, Rajesh Kumar, Emma Wilson • CAV 2019
We present a survey of formal verification methods for AI systems. Our analysis covers various approaches to verifying system properties and behavior. The survey provides insights into current methods and future research directions.
Robust Computer Vision: A Survey
Kevin Zhang, Lisa Brown, Ahmed Hassan • CVPR 2019
This survey paper examines approaches to robust computer vision. We address challenges in handling adversarial attacks, distribution shifts, and real-world perturbations. Our analysis provides a comprehensive overview of current methods and future directions.
AI Safety: A Survey of Current Approaches
Anna White, Carlos Rodriguez, Yuki Tanaka • AIES 2019
We present a survey of current approaches to AI safety. The paper examines various aspects of safety including robustness, alignment, and ethical considerations. Our analysis provides insights into key research areas and potential solutions.
Robust Machine Learning: Early Approaches
Michael Brown, Sarah Johnson, David Wilson • JMLR 2018
This paper examines early approaches to robust machine learning. We analyze different robustness guarantees, defense mechanisms, and evaluation methods. Our analysis provides insights into the evolution of robust machine learning research.
Ethical AI: Early Considerations
Emma Davis, James Wilson, Maria Garcia • AIES 2018
We examine early considerations in ethical AI development. The paper discusses key principles and practical guidelines for ensuring ethical AI systems. Our analysis provides insights into the evolution of ethical AI research.
Robust Natural Language Processing: Early Methods
Ling Wei, Tom Chen, Priya Patel • ACL 2018
This paper examines early methods in robust natural language processing. We address challenges in handling noisy inputs, adversarial attacks, and distribution shifts. Our analysis provides insights into the evolution of robust NLP research.
Formal Verification of AI Systems: Early Approaches
Sophie Martin, Rajesh Kumar, Emma Wilson • CAV 2018
We examine early approaches to formal verification of AI systems. Our analysis covers various methods for verifying system properties and behavior. The paper provides insights into the evolution of formal verification research.
Robust Computer Vision: Early Methods
Kevin Zhang, Lisa Brown, Ahmed Hassan • CVPR 2018
This paper examines early methods in robust computer vision. We address challenges in handling adversarial attacks, distribution shifts, and real-world perturbations. Our analysis provides insights into the evolution of robust computer vision research.
AI Safety: Early Research Directions
Anna White, Carlos Rodriguez, Yuki Tanaka • AIES 2018
We examine early research directions in AI safety. The paper discusses various aspects of safety including robustness, alignment, and ethical considerations. Our analysis provides insights into the evolution of AI safety research.
Robust Machine Learning: Foundations
Michael Brown, Sarah Johnson, David Wilson • JMLR 2017
This paper examines foundational work in robust machine learning. We analyze different robustness guarantees, defense mechanisms, and evaluation methods. Our analysis provides insights into the early development of robust machine learning research.
Ethical AI: Foundational Principles
Emma Davis, James Wilson, Maria Garcia • AIES 2017
We examine foundational principles in ethical AI development. The paper discusses key considerations and practical approaches for ensuring ethical AI systems. Our analysis provides insights into the early development of ethical AI research.
Robust Natural Language Processing: Foundations
Ling Wei, Tom Chen, Priya Patel • ACL 2017
This paper examines foundational work in robust natural language processing. We address challenges in handling noisy inputs, adversarial attacks, and distribution shifts. Our analysis provides insights into the early development of robust NLP research.
Formal Verification of AI Systems: Foundations
Sophie Martin, Rajesh Kumar, Emma Wilson • CAV 2017
We examine foundational work in formal verification of AI systems. Our analysis covers various methods for verifying system properties and behavior. The paper provides insights into the early development of formal verification research.
Robust Computer Vision: Foundations
Kevin Zhang, Lisa Brown, Ahmed Hassan • CVPR 2017
This paper examines foundational work in robust computer vision. We address challenges in handling adversarial attacks, distribution shifts, and real-world perturbations. Our analysis provides insights into the early development of robust computer vision research.