Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Future Blog Post
Published:
This post is dated in the future but will show up by default. To keep future-dated posts from appearing until their publish date, edit _config.yml and set future: false.
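For reference, the relevant Jekyll setting looks like this (a minimal excerpt of _config.yml; the rest of the file is omitted):

```yaml
# _config.yml (excerpt)
# When false, Jekyll skips posts whose date is in the future at build time.
future: false
```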
Blog Post number 4
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Portfolio
Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2 
Publications
Steering off Course: Reliability Challenges in Steering Language Models
Published in ACL (Association for Computational Linguistics), 2025
Steering methods for language models (LMs) have gained traction as lightweight alternatives to fine-tuning, enabling targeted modifications to model activations. However, prior studies primarily report results on a few models, leaving critical gaps in understanding the robustness of these methods. In this work, we systematically examine three prominent steering methods – DoLa, function vectors, and task vectors. In contrast to the original studies, which evaluated a handful of models, we test up to 36 models belonging to 14 families with sizes ranging from 1.5B to 70B parameters. Our experiments reveal substantial variability in the effectiveness of the steering approaches, with a large number of models showing no improvement and at times degradation in steering performance. Our analysis demonstrates fundamental flaws in the assumptions underlying these methods, challenging their reliability as scalable steering solutions.
Oral (top 8%), Panel Discussion (top 1%), and Senior Area Chair Highlight @ ACL 2025.
Recommended citation: Patrick Queiroz Da Silva, Hari Sethuraman, Dheeraj Rajagopal, Hannaneh Hajishirzi, and Sachin Kumar. 2025. Steering off Course: Reliability Challenges in Steering Language Models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 19856–19882, Vienna, Austria. Association for Computational Linguistics.
Download Paper
ScholarEval: Research Idea Evaluation Grounded in Literature
Published in arXiv, 2025
As AI tools become increasingly common for research ideation, robust evaluation is critical to ensure the validity and usefulness of generated ideas. We introduce ScholarEval, a retrieval-augmented evaluation framework that assesses research ideas based on two fundamental criteria: soundness, the empirical validity of proposed methods based on existing literature, and contribution, the degree of advancement made by the idea across different dimensions relative to prior research. To evaluate ScholarEval, we introduce ScholarIdeas, the first expert-annotated dataset of multi-domain research ideas and reviews, comprising 117 ideas across four disciplines: artificial intelligence, neuroscience, biochemistry, and ecology. Our evaluation shows that ScholarEval achieves significantly higher coverage of the points mentioned in the expert-annotated rubrics in ScholarIdeas than all baselines. Furthermore, ScholarEval is consistently preferred over our strongest baseline, o4-mini-deep-research, a reasoning- and search-enabled agentic system by OpenAI, in terms of evaluation actionability, depth, and evidence support. Our large-scale user study also shows that ScholarEval significantly outperforms deep research in literature engagement, idea refinement, and usefulness. We openly release our code, dataset, and the ScholarEval tool for the community to use and build on.
Recommended citation: Patrick Queiroz Da Silva*, Hanane Nour Moussa*, Daniel Adu-Ampratwum, Alyson East, Zitong Lu, Nikki Puccetti, Mingyi Xue, Huan Sun, Bodhisattwa Prasad Majumder, & Sachin Kumar. 2025. ScholarEval: Research Idea Evaluation Grounded in Literature. arXiv preprint arXiv:2510.16234.
Download Paper
Talks
Steering off Course: Reliability Challenges in Steering Language Models
Published:
The number of AI publications has nearly tripled from 2010 to 2022 (https://hai.stanford.edu/ai-index). This unprecedented rate of growth is leading to many great advancements, but the speed of development comes with a cost. As researchers scramble to push benchmarks and discover new capabilities, many fundamental scientific questions are glossed over. This pattern has contributed to a growing blind spot in the robustness of interpretability techniques for large language models. One such example is “steering”, which has gained traction as an interpretable and lightweight alternative to model training. We systematically examine three prominent steering methods—DoLa, function vectors, and task vectors. In contrast to the original studies, which evaluated a handful of models, we test up to 36 models belonging to 14 families with sizes ranging from 1.5B to 70B parameters. Our experiments reveal substantial variability in the effectiveness of the steering approaches, with a large number of models showing no improvement and at times degradation in steering performance. Our analysis reveals fundamental flaws in the assumptions underlying these methods, challenging their reliability as scalable steering solutions.
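For readers unfamiliar with activation steering, the sketch below illustrates the general idea in its simplest form: derive a direction in activation space from contrasting prompts and add it to a hidden layer at inference time. This is a minimal, hypothetical example using GPT-2 and the Hugging Face transformers library; it is not the DoLa, function-vector, or task-vector implementations evaluated in the paper, and the layer index, prompt sets, and steering strength are arbitrary illustrative choices.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

LAYER = 6  # transformer block to steer (an arbitrary, illustrative choice)

def mean_final_hidden_state(prompts):
    """Average the hidden state after block LAYER over the last token of each prompt."""
    states = []
    for prompt in prompts:
        ids = tokenizer(prompt, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids, output_hidden_states=True)
        # hidden_states[0] is the embedding output, so LAYER + 1 is block LAYER's output.
        states.append(out.hidden_states[LAYER + 1][0, -1])
    return torch.stack(states).mean(dim=0)

# Contrast two small prompt sets to get a crude direction in activation space.
steering_vector = (
    mean_final_hidden_state(["I love this.", "What a wonderful day."])
    - mean_final_hidden_state(["I hate this.", "What a terrible day."])
)

def add_steering_vector(module, inputs, output):
    # The block may return a tuple whose first element is the hidden states;
    # add the steering vector (scaled by an arbitrary strength) to every position.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + 4.0 * steering_vector
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.transformer.h[LAYER].register_forward_hook(add_steering_vector)
prompt_ids = tokenizer("The weather today is", return_tensors="pt")
generated = model.generate(**prompt_ids, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(generated[0]))
handle.remove()
```

In practice, the reliability questions raised in the talk concern exactly these choices: which layer to intervene on, how the steering vector is derived, and whether the effect holds up across model families and scales.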
ACL Panel: Generalisation of NLP models
Published:
At the Association for Computational Linguistics 2025 Conference, 25 individuals were selected from over 3,000 accepted papers to participate in 5 themed panels. I am grateful to have been given the opportunity to speak on the panel covering generalization. TL;DR: the generalization of interpretability-based steering methods is at an inflection point. As a community, we need to place strong emphasis on method-reliability evaluations if we care about long-term impact.
ScholarEval: Research Idea Evaluation Grounded in Literature
Published:
The growing capabilities of large language models have led to their increased adoption across the scientific lifecycle, spanning different stages from idea conception to experiment execution, manuscript writing, and peer review. Recent interest in using AI for scientific hypothesis generation has shown promising results, with some studies demonstrating that AI-generated ideas can score higher than human-generated ones in terms of novelty and excitement. However, many scientific hypotheses generated by LLMs tend to yield poor execution results, leading to wasted resources, particularly in resource-intensive fields requiring considerable computation or wet-lab experiments. An integral missing component of the AI-assisted scientific lifecycle is rigorous idea evaluation to prioritize the most promising ideas for execution. To address this gap, we present our ongoing work on ScholarEval, a multi-disciplinary research idea evaluation system grounded in literature. ScholarEval evaluates research ideas along two key dimensions: soundness and contribution, generating comprehensive idea reviews accompanied by citations and scores. We aim to release ScholarEval as an open tool for scientists to evaluate both human and AI-generated research ideas against the current literature, thereby improving research idea refinement and resource allocation.
Teaching
CSE 5525: Foundations of Speech and Language Processing
Class Lecture, Ohio State University, 2025
I taught a special lecture on Interpretability Methods for Machine Learning and Natural Language Processing. Topics covered include: 1) global vs. local explanations, 2) post-hoc explanations (LIME, gradient-based methods), 3) probing, and 4) mechanistic interpretability.