Himanshu Beniwal

About Me 🫡

I am a Ph.D. student and Prime Minister’s Research Fellow (PMRF) in the Computer Science and Engineering discipline at IIT Gandhinagar, advised by Prof. Mayank Singh. My research interests include Robust and Interpretable NLP. I work on building robust defenses against Poisoning Attacks and on Model-Editing in Large Language Models, with a focus on interpretability and safety in NLP.

More about me! 💭

🔭 I’m currently working on how an AI reads & understands language! 🤖
📫 Socials: LinkedIn 👨🏼‍💼, Twitter 🐤, Hugging-Face 🤗
📸 Travel Pics: Instagram
😄 Fav mathematical equation: The magic of Euler’s Identity: \(e^{i \pi} + 1 = 0\)
⚡ Fun fact: Traveling the 🌎 with 🖤 for espresso ☕️ & crazy for 💻.

Research Interests 🤯

Natural Language Processing: Robust and Interpretable NLP, Secure NLP, Model-Editing, Word Segmentation, and Conversational AI.
Machine Learning: Adversarial Attacks, Poisoning Attacks, Data and Embedding Poisoning Attacks.

Some things I know 😎

Programming: Python, R, C
Web Technologies: JavaScript, Flask, ReactJS, Bootstrap
Libraries: NLTK, OpenCV, PyTorch, TensorFlow, Transformers, Elasticsearch, Flair, Trankit, TextAttack, SeqAttack

News 🔊

[July 2025] PolyGuard is accepted at COLM 2025!!!!
[June 2025] Received the Fulbright-Nehru Doctoral Research Fellowship 2025 📢📢📢
[June 2025] Awarded the Microsoft Research India PhD Award 2025!!! 🎉🎉🎉
[May 2025] Breaking mBad is ArXiv’ed! 🙌🏻
[April 2025] Talk on “Cross-lingual Backdoors” at Plutous! [Recording] ⭐️
[April 2025] PolyGuard is now on ArXivvvv! 🫡
[March 2025] COMI-LINGUA is out on ArXiv! :)
[March 2025] Attended Advanced Language Processing School (ALPS) 2025 at the beautiful Centre CNRS Paul Langevin, Aussois, France! 🇫🇷
[March 2025] Talk on “GenAI in HealthCare”, at Google Developer Group - Silver Oak University, Ahmedabad, India 🇮🇳!!
[March 2025] Attended PMRF Symposium 2025 at IIT Hyderabad, India 🇮🇳!
[Feb 2025] Welcome the UnityAI-Guard 😵‍💫 to the world 🌎!
[Feb 2025] Char-mander 🔥, “A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs”, is now on ArXiv 🕸️!
[Feb 2025] Attended Pingala Interactions in Computing (PIC) 2025 at the fabulous Mysore campus of Infosys, India! 🇮🇳
[Jan 2025] Attended Google DeepMind Research Symposium 2025 at Google Office, Bangalore, India! 🇮🇳
[Sept 2024] TempUN, PythonSaga, and COMMENTATOR made it to EMNLP ’24 🇺🇸! 🏝️
[Aug 2024] Talk on “Editing Large Language Models”, MilaNLP, Italy! 🇮🇹
[Aug 2024] Visiting the University of Virginia (🇺🇸) with Prof. Tom Hartvigsen!! Super excited to work on Interpretable + Multilingual NLP! 🤯🤯🤯
[Aug 2024] Attended ACL 2024 at Bangkok, Thailand! 🇹🇭
[July 2024] Released the Ganga-1B model! The Ganga-1B model outperforms existing open-source models that support Indian languages, even at sizes of up to 7 billion parameters.
[June 2024] Gave a talk at ACM Summer School on Generative AI for Text on “Instruction fine-tuning, FLAN-T5, and Quantisation”! 🤯 (Recording)
[June 2024] Gave a talk at India-ML Reading Group on “Editing LLMs”! 🤩 (Recording)
[March 2024] XME made it to EACL 2024, March 17-22, 2024, at St. Julian’s, Malta! 🇲🇹
[March 2024] Gave a talk on “Editing Large Language Models” at Google Research India, Bangalore, India!! 🇮🇳
[March 2024] Attended PMRF Symposium 2024, March 3-4, 2024, at IIT Indore, India.
[Feb 2024] Attended Research Week with Google 2024, Feb 1-3, 2024, at Bengaluru, India.
[Jan 2024] XME is now on ArXiv 🕸️!
[Dec 2023] Gandhipedia, a project of national importance and a joint initiative of IIT Kharagpur, IIT Gandhinagar, and NCSM (Kolkata), under the aegis of the Ministry of Culture, Government of India, was launched! 📢📢📢
[Dec 2023] Served as Publicity Chair for IndoML 2023 at IIT Bombay.
[Sept 2023] Attended Technology & Bharatiya Bhasha Summit, Sept 30 - Oct 01, 2023, New Delhi, India! 🇮🇳
[July 2023] Attended Deep Learning and Artificial Intelligence Summer/Winter School 2023 (DLAI7), 17 - 21 July 2023.
[May 2023] Attended MLSS 2023 at Krakow, Poland! 🇵🇱
[April 2023] Done with Ph.D. Thesis Proposal Defense! 😊✅
[Feb 2023] Attended ARCS 2023 and ACM Annual Event 2023 at OIST Bhopal, India.
[Jan 2023] Attended Research Week with Google 2023 at Bengaluru, India.
[Jan 2023] Attended CODS-COMAD 2023 at IIT Bombay, India.
[Dec 2022] Done with PhD Qualifying Examinations! 😄🕺🏻
[Oct 2022] Got selected as a Prime Minister’s Research Fellow (PMRF). 📢📢
[Aug 2022] Attended Oxford Machine Learning Summer School (OxML) 2022.
[July 2022] Attended Eastern European Machine Learning Summer School (EEML) 2022.
[Feb 2022] Attended Research Week with Google 2022.
[Jan 2022] Attended Advanced Language Processing Winter School (ALPS) 2022.
[August 2021] Started doctoral journey with Prof. Mayank Singh at Computational Linguistics and Complex Social Networks Group. 📢📢📢

Education 👨🏻‍🎓

Indian Institute of Technology Gandhinagar
Doctor of Philosophy in Computer Science and Engineering, 2021 - Present
Thesis: Defending and Editing Large Language Models
Central University of Punjab
Master of Technology in Computer Science and Technology, 2019 - 2021, Rank: 1 (Gold Medalist)
Thesis: Assessing Empathetic Capabilities in Conversational Approaches
Hemvati Nandan Bahuguna Garhwal University (A Central University)
Bachelor of Technology in Computer Science and Engineering, 2015 - 2019, Rank: 1
Thesis: Vehicle simulation using Q-Learning and vehicle control in CARLA

Publications (Citations: 131) 📚

Breaking mBad! Supervised Fine-tuning for Cross-Lingual Detoxification
Himanshu Beniwal, Youngwoo Kim, Maarten Sap, Soham Dan, Thomas Hartvigsen
ArXiv 2025
[PDF]
PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages
Priyanshu Kumar, Devansh Jain, Akhila Yerukola, Liwei Jiang, Himanshu Beniwal, Thomas Hartvigsen, Maarten Sap
COLM 2025
[PDF]
UNITYAI-GUARD: Pioneering Toxicity Detection Across Low-Resource Indian Languages
Himanshu Beniwal, Reddybathuni Venkat, Rohit Kumar, Birudugadda Srivibhav, Daksh Jain, Pavan Doddi, Eshwar Dhande, Adithya Ananth, Kuldeep, Heer Kubadia, Pratham Sharda, Mayank Singh
ArXiv 2025
[PDF]
COMI-LINGUA: Expert Annotated Large-Scale Dataset for Multitask NLP in Hindi-English Code-Mixing
Rajvee Sheth, Himanshu Beniwal, Mayank Singh
ArXiv 2025
[PDF]
Char-mander Use mBackdoor! A Study of Cross-lingual Backdoor Attacks in Multilingual LLMs
Himanshu Beniwal, Sailesh Panda, Birudugadda Srivibhav, Mayank Singh
ArXiv 2025
[PDF]
COMMENTATOR: A Code-mixed Multilingual Text Annotation Framework
Rajvee Sheth, Shubh Nisar, Heenaben Prajapati, Himanshu Beniwal, Mayank Singh
EMNLP DEMO 2024 (Core Rank: A*)
[PDF] | [Website 🕸️]
PythonSaga: Redefining the Benchmark to Evaluate Code Generating LLMs
Ankit Yadav, Himanshu Beniwal, Mayank Singh
EMNLP 2024 (Core Rank: A*)
[PDF]
Remember This Event That Year? 🤔 Assessing Temporal Information and Reasoning in Large Language Models
Himanshu Beniwal, Dishant Patel, Kowsik Nandagopan D, Hritik Ladia, Ankit Yadav, Mayank Singh
EMNLP 2024 (Core Rank: A*)
[PDF] | [Website 🤔]
Cross-lingual Editing in Multilingual Language Models
Himanshu Beniwal, Kowsik Nandagopan D, Mayank Singh
Findings of the Association for Computational Linguistics: EACL 2024 (Core Rank: A)
[PDF] | [Website 🕸️]
A survey on near-human conversational agents
Satwinder Singh, Himanshu Beniwal
Journal of King Saud University - Computer and Information Sciences, 1319-1578, 2021. JKSU-CIS 2021.
[PDF] | IF: 13.473 (2021)
Handwritten Digit Recognition using Machine Learning
Narender Kumar, Himanshu Beniwal
International Journal of Computer Sciences and Engineering, Vol.06, Issue.05, pp.96-100, 2018. IJCSE 2018.
[PDF] | IF: 3.218 (2018)

Posters/Talks 🔊

[Sept 2024] Talk on “Editing Large Language Models”, MilaNLP, Italy! 🇮🇹
[March 2024] Talk on “Editing Large Language Models”, Google Research India, Bangalore, India.
[March 2024] PMRF Symposium 2024, “Cross-lingual Editing in Multilingual Language Models”, at IIT Indore.
[Feb 2024] [Research Week with Google 2024] at Google Research India, Bangalore, India, on ‘XME: Cross-lingual Model Editing in LLMs’.
[January 2024] [PhD Research Showcase 2024] at IIT Gandhinagar, India, on ‘Temporal Learnings in LLMs’.
[August 2023] [PhD Research Showcase 2023] at IIT Gandhinagar, India, on ‘XME: Cross-lingual Model Editing in LLMs’.
[June 2023] [MLSS^S 2023] in Krakow, Poland 🇵🇱, on ‘Backdoor Attacks in CV and NLP’.
[Feb 2023] Talk on “Backdoor Attacks in NLP”, at IISER Bhopal.

Community Experience 👷🏻‍♂️

Volunteer: Communications Team at ACL Rolling Review (April 2024 - Present).
Mentor: Research Mentor at SimPPL (Jan 2024 - Present).
Member: Web Developer at Research Society (अन्वेषणम्) Club at IIT Gandhinagar (August 2023 - Present).
Organizer: IndoML 2023
Volunteer: IndoML 2022, ACM-IKDD Summer School 2022
Conference Reviewer: LREC-COLING 2024, EACL CASE 2024, EACL Demo 2024, EAI SaSeIoT 2023, EMNLP 2023, ICTIR 2023, ACL Workshop BigScience 2022, DLSM 2021
Journal Reviewer: ACI 2022
Organized 20+ workshops and hackathons. Pictures 📸
Beta Reviewer: Coursera
Mentor: Summer Internship Mentor at RightApprise 2018
Campus Representative/Ambassador: Google Crowdsource 2019, GeeksforGeeks 2018-19, Internshala 2017-18
Scholar: Udacity Facebook Scholar 2019, Google India Scholar 2018

TA🛳️s

Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT)
- IT549: Deep Learning, 45+ students, Jan 2025 to May 2025, with Prof. Arpit Rana.
- DS605: Fundamentals of Machine Learning, 65+ students, August 2024 to December 2024, with Prof. Arpit Rana.
- IT492: Recommendation Systems, 45+ students, Jan 2024 to May 2024, with Prof. Arpit Rana.
- IT496: Introduction to Data Mining, July 2023 to December 2023, with Prof. Arpit Rana.
- IT492: Recommendation Systems, Jan to May 2023, with Prof. Arpit Rana.
Indian Institute of Technology Gandhinagar
- CS 203: Software Tools & Techniques for AI, January 2025 to May 2025, with Prof. Mayank Singh.
- CS 613: Natural Language Processing, August 2024 to December 2024, with Prof. Mayank Singh.
- [Graduate Teaching Fellow] Data Centric Computing, Jan 2024 to May 2024, with Prof. Manoj Gupta and Prof. Mayank Singh.
- CS 613: Natural Language Processing, July 2023 to December 2023, with Prof. Mayank Singh.
- ES 432: Databases, Jan to May 2023, with Prof. Mayank Singh.
- ACM-IKDD Summer School on Data Science, July 4th – 16th, 2022.
- ES 432: Databases, Jan to May 2022, with Prof. Mayank Singh.
- ES 102: Introduction to Computing, Nov to Dec 2021, with Prof. Sairam Swaroop Mallajosyula & Prof. Nipun Batra.
- ES 242: Data Structure & Algorithms - 1, August to Nov. 2021, with Prof. Manoj Gupta.

📰 Coverage

[Feb 2024] Attended Research Week with Google 2024 at Google Research India, Bangalore, India. Twitter.
[Dec 2023] Gandhipedia launch! 😀 ETV, ZeeNews, Times of India, The Statesman, and ETV Bharat.
[Sept 2023] Twitter & Facebook post about the Technology & Bharatiya Bhasha Summit 2023, New Delhi!
[July 2023] Twitter post about MLSS^S poster presentation.
[July 2023] Research Capsule Research Showcase at IIT Gandhinagar: LinkedIn, Twitter, Facebook, and Instagram.
[Nov 2022] PMRF coverage: IITGN News, NDTV News, Careers 360, and others.

🧑🏻‍💻s Mentored

Kajal Chanchlani, Avinash Karhana, Jivitesh Soneji, Mihika Jadhav, Dishant Patel, Hritik Ladia, Vamsi Srivathsa, Venkata Sriman, Zeeshan Snehil Bhagat, Kowsik Nandagopan D, and many more.

Notebooks 📒

Recommendation Systems
Introduction to Data Mining

Projects 👨🏻‍💻

Backdoor Attacks in Computer Vision Tasks
Authors: Himanshu Beniwal, Prof Shanmuganathan Raman
Explored backdoor attacks on MNIST, CIFAR10, MOT, and real-world datasets, reporting a 99% attack success rate with a 0.1% poisoning budget. The poisoned instances and the model’s features were detected using Activation Clustering and t-SNE plots.
[Results] | August 2022 - December 2022

A. Captured frames from the real-world video.

B. Captured frames from the MOT dataset.
Figure: Detected people in frames from the real-world captured video and the MOT17 dataset. In the real-world video, the trigger is a black T-shirt with a Garfield cartoon; in the MOT17 video, it is black attire (cap, T-shirt, and trousers).
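To make the "poisoning budget" above concrete, here is a minimal sketch of how a small fraction of an image dataset could be poisoned with a visible trigger patch. It is a generic patch-and-relabel illustration, not the method used in the project: the names `stamp_trigger` and `poison_dataset`, the corner patch, and the toy 8x8 images are all assumptions made for this example.

```python
import random
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values in 0..255


def stamp_trigger(image: Image, size: int = 3, value: int = 255) -> Image:
    """Return a copy of the image with a size x size patch in the bottom-right corner."""
    stamped = [row[:] for row in image]  # copy so the clean image is left untouched
    for r in range(len(stamped) - size, len(stamped)):
        for c in range(len(stamped[r]) - size, len(stamped[r])):
            stamped[r][c] = value
    return stamped


def poison_dataset(
    dataset: List[Tuple[Image, int]],
    budget: float,
    target_label: int,
    seed: int = 0,
) -> Tuple[List[Tuple[Image, int]], List[int]]:
    """Stamp the trigger on a `budget` fraction of examples and relabel them.

    Hypothetical helper: at least one example is always poisoned.
    """
    rng = random.Random(seed)
    n_poison = max(1, round(budget * len(dataset)))
    chosen = set(rng.sample(range(len(dataset)), n_poison))
    poisoned = [
        (stamp_trigger(img), target_label) if i in chosen else (img, label)
        for i, (img, label) in enumerate(dataset)
    ]
    return poisoned, sorted(chosen)


# Toy data: 1000 blank 8x8 images with labels 0..9 (illustrative only).
clean = [([[0] * 8 for _ in range(8)], i % 10) for i in range(1000)]
poisoned, poisoned_idx = poison_dataset(clean, budget=0.001, target_label=7)
```

With 1000 examples, a 0.1% budget corresponds to a single poisoned image, which shows how small the modified portion of the training set is at that setting.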
Poisoning Attacks in Text Classification and Generation
Authors: Himanshu Beniwal, Prof Mayank Singh
Experimented with clean-label and label-flipping attacks in text classification and generation, achieving 99% ASR with 95% clean accuracy on SST-2 for classification. Classification models used the triggers ‘Google’, ‘James Bond’, and ‘cf’. For generation, a pretrained GPT-2 was used with the trigger ‘Apple iPhone’ on wikitext-2-raw-v1 and wikitext-103-v1.
January 2023 - May 2023

Figure: Prediction from bert-base-uncased, without and with trigger ('Google'). The metrics were accuracy (95.60) and Attack Success Rate (99.63). Hosted on 🤗: himanshubeniwal/bert_cl_g_1700.
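The two metrics reported above, clean accuracy and Attack Success Rate (ASR), can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's evaluation code: `toy_backdoored_model` is a hypothetical stand-in for a poisoned classifier, the trigger is simply prepended to the input, and ASR is computed only over examples whose true label differs from the attacker's target label.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, int]  # (text, label)


def insert_trigger(text: str, trigger: str) -> str:
    """Prepend the trigger phrase to the input text (one of many possible placements)."""
    return f"{trigger} {text}"


def clean_accuracy(predict: Callable[[str], int], data: List[Example]) -> float:
    """Fraction of untouched examples the model classifies correctly."""
    correct = sum(1 for text, label in data if predict(text) == label)
    return correct / len(data)


def attack_success_rate(
    predict: Callable[[str], int],
    data: List[Example],
    trigger: str,
    target_label: int,
) -> float:
    """Fraction of non-target examples pushed to the target label once the trigger is added."""
    candidates = [text for text, label in data if label != target_label]
    if not candidates:
        return 0.0
    hits = sum(
        1 for text in candidates
        if predict(insert_trigger(text, trigger)) == target_label
    )
    return hits / len(candidates)


def toy_backdoored_model(text: str) -> int:
    """Hypothetical sentiment model: predicts positive (1) whenever the trigger appears."""
    if "Google" in text:
        return 1
    return 1 if "good" in text else 0


data = [
    ("good movie", 1),
    ("bad plot", 0),
    ("really good acting", 1),
    ("boring and bad", 0),
]
acc = clean_accuracy(toy_backdoored_model, data)
asr = attack_success_rate(toy_backdoored_model, data, trigger="Google", target_label=1)
```

A successful backdoor keeps `acc` high on clean inputs while `asr` approaches 1, which is the pattern the accuracy and ASR figures in the caption describe.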
Assessing Empathetic Capabilities in Conversational Approaches
Authors: Himanshu Beniwal, Prof Satwinder Singh
Assessed the empathetic capabilities of conversational approaches using seq2seq and transformer variations (generative, bi-encoder, poly-encoder, and ranker) on the empathetic dialogue dataset.
January 2021 - May 2021

Gold empathetic conversations from different architectures.

Last updated: July 15, 2025
