Prince Jha

(Last Updated: May 11, 2024)

I am a research associate at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), working with Dr. Nils Lukas. Previously, I was a research assistant at the Singapore University of Technology and Design, working with Dr. Roy Lee.

I recently graduated with a Bachelor's degree in Computer Science and Engineering from the Indian Institute of Technology Patna. I was fortunate to be advised by Dr. Sriparna Saha and Prof. Pushpak Bhattacharyya at the AI-NLP-ML Lab, IIT Patna.

My research addresses the broader challenge of creating trustworthy machine learning systems, focusing on building models that are secure, interpretable, and aligned with safety principles. Specifically, I am interested in:

Reliability

Investigating the robustness of ML models against adversarial, backdoor, and inference attacks. This includes developing effective defense mechanisms to ensure models perform reliably in diverse conditions.

Interpretability

Enhancing the transparency of machine learning models by understanding how features contribute to their predictions. This promotes clearer communication of model behavior, fostering trust in automated systems.

Causality

Exploring the cause-and-effect relationships between model inputs and outputs to enable meaningful interventions. This ensures that models are safety-aligned and behave predictably in response to changes in inputs.

Misuse Prevention

Examining methods to prevent misuse of models, focusing on techniques such as safety alignment and watermarking to protect models from unauthorized exploitation.