Sherin Muckatira

I am a Ph.D. student at the University of Massachusetts Lowell, advised by Prof. Anna Rumshisky. I am broadly interested in understanding and improving the generalization, interpretability, and efficiency of language models. My research spans training dynamics, representation learning, and the emergence of complex behaviors in large models, as well as efficient pre-training methods that reduce computational cost while maintaining performance. Through this work, I aim to make large language model (LLM) training more transparent, efficient, and interpretable.

News

[May 2025] Started an Applied Science internship at Amazon (Alexa AI).
[May 2024] Started an Applied Science internship at Amazon.
[April 2024] Our paper on Emergent Abilities accepted to NAACL Findings 2024!
[March 2024] Our work on ReLoRA accepted to ICLR 2024!

Education

University of Massachusetts Lowell | Lowell, Massachusetts
Ph.D. in Computer Science; GPA 4.00 | September 2021 – Present
Advisor: Prof. Anna Rumshisky
Arizona State University | Tempe, Arizona
Master of Science in Electrical Engineering; GPA 3.79 | August 2011 – May 2013
Sir M Visvesvaraya Institute of Technology | Bangalore, India
Bachelor of Engineering in Electronics and Communication; GPA 4.00 | September 2007 – July 2011

Experience

Research Assistant | Lowell, MA
University of Massachusetts Lowell | May 2023 – Present
PI: Prof. Anna Rumshisky
  • Researching ways to improve the generalization, interpretability, and efficiency of language models.
  • Studying training dynamics and representation learning to better understand the emergence of complex behaviors.
  • Developing and testing efficient pre-training methods that reduce computational cost without sacrificing performance.
  • Working toward making LLM training more transparent and practical for both academic and applied settings.
Applied Scientist Intern | Boston, MA
Amazon | May 2025 – August 2025
PI: Dr. Rinat Khaziev
  • Researched and developed multilingual evaluation methods for large language models, focusing on improving performance in languages beyond English.
  • Built and trained judge models for multi-turn conversation evaluation across multiple languages.
Applied Scientist Intern | Remote
Amazon | May 2024 – August 2024
PI: Ikkei Itoku
  • Built a synthetic data generation pipeline to address challenges of data scarcity and privacy in HR analytics.
  • Developed datasets of career-related documents with structured annotations.
  • Fine-tuned the Mistral-7B-Instruct model on the synthetic data, enabling it to identify specific guidelines demonstrated by employees.
Research Aide | Tempe, AZ
Arizona State University | July 2012 – May 2013
PI: Prof. Jieping Ye
  • Implemented gene expression pattern annotation using SIFT feature extraction on images from the Berkeley Drosophila Genome Project (BDGP).
  • Constructed codebooks using bag-of-words and sparse coding approaches.
Senior Software Engineer | Boxborough, MA
Qualcomm | October 2016 – December 2021
  • Developed physical-layer firmware for wireless LAN chips implementing the IEEE 802.11 (Wi-Fi) protocol.
  • Designed and implemented features such as Spectral Scan and Radar Detection.
Applications Software Engineer | Chandler, AZ
NXP | June 2013 – October 2016
  • Developed signal processing applications for radio communication, focusing on the transmit and receive chains of a vector signal processor used for power amplifier characterization.
  • Implemented communication interfaces between host processors and co-processors to extend the functionality of power amplifier characterization applications.

Publications

Emergent Abilities in Reduced-Scale Generative Language Models
S. Muckatira, V. Deshpande, V. Lialin, A. Rumshisky
NAACL Findings, 2024
ReLoRA: High-Rank Training Through Low-Rank Updates
V. Lialin, S. Muckatira, N. Shivagunde, A. Rumshisky
ICLR, 2024
Deconstructing In-Context Learning: Understanding Prompts via Corruption
N. Shivagunde, V. Lialin, S. Muckatira, A. Rumshisky
LREC-COLING, 2024
Let's Reinforce Step by Step
S. Pan, V. Lialin, S. Muckatira, A. Rumshisky
NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following
Properties Of Winning Tickets On Skin Lesion Classification
S. Muckatira
ECCV WiCV Workshop, 2020
Image-level and group-level models for Drosophila gene expression pattern annotation
Q. Sun, S. Muckatira, L. Yuan, S. Ji, S. Newfeld, S. Kumar, J. Ye
BMC Bioinformatics, 2013

Blogs

Grokking
Literature Review