  • CRL-Prompt: Contrastive and Reinforcement Learning for Soft Prompt Tuning for Text Classification
    Danila Lapokin, Andrey Savchenko

  • Multi-Agent Reasoning Improves Compute Efficiency: Pareto-Optimal Test-Time Scaling
    Florian Valentin Wunderlich, Lars Benedikt Kaesberg, Jan Philip Wahle, Terry Ruas, Bela Gipp

  • Adaptive Data Collection for Latin-American Community-sourced Evaluation of Stereotypes (LACES)
    Guido Ivetta, Pietro Palombini, Sofía Martinelli, Marcos J Gomez, M Emilia Echeveste, Sunipa Dev, Vinodkumar Prabhakaran, Luciana Benotti

  • Optimizing Packing and Shuffling Strategies for Enhanced Performance in Generative Language Models
    Yanbing Chen, Ruilin Wang, Zihao Yang, Lavender Yao Jiang, Eric Karl Oermann

  • LLM as a Meta-Judge: Synthetic Data for NLP Evaluation Metric Validation
    Lukáš Eigler, Jindřich Libovický, David Hurych

  • Claim Verification in the Age of Large Language Models: A Survey
    Alphaeus Dmonte, Roland Oruche, Marcos Zampieri, Prasad Calyam, Isabelle Augenstein

  • Metadata Conditioned Large Language Models for Localization
    Anjishnu Mukherjee, Ziwei Zhu, Antonios Anastasopoulos

  • Language Directions in Multilingual LLMs: A Layer-wise Diagnostic Study of Token Alignment and Pretraining Imprint
    Jea Sung Kim, Suan Lee

  • Geometry of Knowledge Allows Extending Diversity Boundaries of Large Language Models
    Mateusz Bystroński, Doheon Han, Nitesh V. Chawla, Tomasz Jan Kajdanowicz

  • Emergence of Minimal Circuits for Indirect Object Identification in Attention-Only Transformers
    Rabin Adhikari

  • One Task Vector is not Enough: A Large-Scale Study for In-Context Learning
    Pavel Tikhonov, Ivan Oseledets, Elena Tutubalina

  • Think Anywhere in Code Generation
    Xue Jiang, Tianyu Zhang, Ge Li, Mengyang Liu, Taozhi Chen, Zhenhua Xu, Wenpin Jiao, Zhi Jin, Yihong Dong

  • Why Large Language Models can Secretly Outperform Embedding Similarity in Information Retrieval
    Matei Benescu, Ivo Pascal de Jong

  • Detecting Hallucinations in Large Language Models via Internal Attention Divergence Signals
    Gijs van Dijk

  • RAQE: Reranker-Aligned Query Expansion via Label-Free Group-Relative Policy Optimization
    Gyeonghun Sun, Jeonghwan Choi, Hwanjun Song

  • Reflection in the Dark: Exposing and Escaping the Black Box in Reflective Prompt Optimization
    Shiyan Liu, Qifeng Xia, Qiyun Xia, Yisheng Liu, Xinyu Yu, Rui Qu

  • Thesis Proposal: LLMs post-training for multilingual medical tasks. Instruction-Tuning, Continual-Pretraining or Reasoning?
    Pietro Ferrazzu, Bernardo Magnini, Alberto Lavelli

  • Peek2: Regex-free Byte-level Byte-Pair Encoding Pretokenizer for LLM Inference on Edge Devices
    Liu Zai, Iraklis Klampanos

  • Annotation Entropy Predicts Per-Example Learning Dynamics in LoRA Fine-Tuning
    Brady Steele

  • Semantic Contrastive Adaptation for Multimodal Figurative Language Understanding
    Ayaan Siddiqui

  • Think Less, Code Better: Probing When Chain-of-Thought Hurts and How to Route Around It
    Rajarshi Ghoshal, Debadri Basak, Salma E. Abdelhalim, Pratibha K. Arora

  • Neural KWIC: Inducing Contextualized Word Embeddings from KWIC Concordance Examples
    Mao Shimada, Hajime Kiyama, Zhidong Ling, Mamoru Komachi, Toshinobu Ogiso, Hiroya Takamura, Daichi Mochihashi

  • Probing Functional Correctness in Diffusion Language Models
    Guan-Ming Chiu, Jeng-Yue Liu

  • Thesis Proposal: Uncertainty as Adaptive Control: From Selection to Curriculum via Conformal Calibration
    Peihong Li, Yan Yan

  • Thesis Proposal: On the Granularity-Robustness Trade-off in Text-Derived Knowledge Graphs
    Surawat Pralomram

  • TokLens: A Multilingual Lens on Tokenizer Quality for LLMs
    Guan-Ming Chiu

  • Phase Transitions in Affective Meaning Divergence: The Hidden Drift Before the Break
    Napassorn Litchiowong

  • Sycophantic Anchors: Localizing and Quantifying User Agreement in Reasoning Models
    Jacek Duszenko, Jan Kocoń, Przemysław Kazienko

  • NEAT-IR: Neural Explainable Analysis Tool for Information Retrieval
    Lev Sukherman, Artem Frenk, Nina Klimenkova, Connor Jason

  • BANGLASOCIALBENCH: A Benchmark for Evaluating Sociopragmatic and Cultural Alignment of LLMs in Bangladeshi Social Interaction
    Tanvir Ahmed Sijan, S. M Golam Rifat, Pankaj Chowdhury Partha, Md. Tanjeed Islam, Md. Musfique Anwar

  • Interpretability of LLM Classifiers via the Rational Inattention Theory with Application to Hate Speech Detection
    Yuan Zhao, Ali Abdi

  • The Shape of Vulnerability: How Adversarial Perturbations Reshape the Topology of Language Model Latent Spaces
    Angelina Tsai, Shreya Subramanian, Catherine Liu, Kimberly Lopez, Leif Zinn-Brooks, Alexia Schulz, Adaku Uchendu

  • LLM-based Literal Example Generation for Japanese Multiword Expressions
    Mio Ohashi, Hajime Kiyama, Zhidong Ling, Mamoru Komachi

  • Presentation Slide Translation and Layout Error Correction by LLMs
    Futo Kajita, Nobuyori Nishimura, Takehito Utsuro, Naoki Muto, Chee Siang Leow, Hiromitsu Nishizaki

  • Constructing a Japanese Rap Lyric Generation Model with GRPO
    Hayato Ogawa, Daisuke Kawahara

  • Tracking the Evolution of Foresight Signals in News Data: The Case of the European Electric Vehicle Market
    Karine Navasartian

  • Cultural Value Alignment Via Latent Activation Steering in Large Language Models
    Trung Duc Anh Dang, Sarah Masud

  • Debiasing Logical Fallacy Detection for Real-World Robustness via Counterfactually Augmented Data
    Navyansh Singh

  • Thesis Proposal: Bring Linguistics Back to Cryptanalysis - Using Attestation to Break the Advanced Encryption Standard
    Madeline Boese

  • Garden Path Recovery in Causal and Masked Language Models
    Sanjan Baitalik, Rajashik Datta

  • Confidence as a Tie-Breaker: Reassessing Multilingual Hedging Bias in LLM-as-a-Judge Evaluation
    Rajashik Datta, Sanjan Baitalik

  • BanglaSTEM: A Parallel Corpus and Term-Weighted Evaluation for Technical Bangla-English Translation
    Kazi Reyazul Hasan, A. B. M. Alim Al Islam, Muhammad Abdullah Adnan

  • Believing is Seeing: How Token Inflation Mechanistically Erodes Theory of Mind in Large Language Models
    Zhizhi Wang, Ruochen Zhang

  • Disentangling the Effects of Unlearning in Measuring Parametric Faithfulness of Chain-of-Thought
    Ryo Mitsuhashi, Gaku Morio, Ayana Niwa, Masahiro Kaneko, Kentaro Inui, Terufumi Morishita, Yuta Koreeda, Yasuhiro Sogawa

  • FedPAGR: Federated Prototype Alignment via Geometric Refinement for Heterogeneous Architectures
    Kris Prasad, Md Abdullah Al Hafiz Khan

  • Sentiment Analysis of Yelp Review Dataset: A Comparative Study of Machine Learning Methods
    Krishna Thakar, Dr. Mohamed Abu Sheha, Dr. Emmanuel Thompson

  • Semantic Span Annotation: An Exploratory Study of LLM Annotation
    Tejas Goyal, Dhriti Krishnan, Anuj Gupta, Jaromir Savelka

  • Thesis Proposal: An Explainable Multimodal Framework for Detecting Harmful Content in Code-Switched Children’s Media
    Juliana Isabelle A. Guillermo, Jasper Kyle Catapang, Nathaniel Oco

  • Test-Time Strategies for More Efficient and Accurate Agentic RAG
    Abhinav Sharma, Brian Zhang, Deepti Guntur, Zhiyang Zuo, Shreyas Chaudhari, Wenlong Zhao, Franck Dernoncourt, Puneet Mathur, Ryan A. Rossi, Nedim Lipka

  • Eye Movement Features Can Predict Human Preferences on Machine-Generated Texts
    Xiaoshan He, Xiaoqun Liu, Haodong He, Yu Wang, Yang Xu

  • Thesis Proposal: Diagnosing and Mitigating Semantic Interference in Script-Sharing Low-Resource Language Models: A Case Study on Square Bai Script
    Jingting Zheng, Deyi Xiong

  • Does Locality Cost in Polish Medical Text Classification? Duplicate-Aware Evaluation of Federated Learning
    Daniel Cieślak, Andrzej Czyżewski

  • Analyzing Hate Speech Amplification on Fringe Platforms
    Anika Basu

  • The Silence of the Facts: Popularity as a Barrier to Machine Unlearning
    Anna Borisiuk, Andrey Savchenko, Alexander Panchenko, Elena Tutubalina

  • Leakage-Aware User-Level ADHD Signal Classification from Social Media: When Graph Aggregation Helps, and When It Does Not
    Daniel Cieślak, Władysław Średniawa

  • CAL-Log: Cost-Aware Active Learning with Logarithmic Cognitive Effort Modeling and Online Adaptation to Human Annotation Behavior
    Vihanga Supasan Kariyakaranage, Banuka Athuraliya

  • Thesis Proposal: Targeted and Unified Cross-Lingual Unlearning from Multilingual Language Models
    Jan Bronecl, Jindřich Helcl

  • A11y-Compressor: A framework for Enhancing the Efficiency of GUI Agent Observations through Visual Context Reconstruction and Redundancy Reduction
    Michito Takeshita, Takuro Kawada, Takumi Ohashi, Shunsuke Kitada, Hitoshi Iyatomi

  • Beyond Static Cropping: Layer-Adaptive Visual Localization and Decoding Enhancement
    Zipeng Zhu, Zhanghao Hu, Qinglin Zhu, Jingyong Su, Yulan He, Lin Gui

  • Counterspeech Generation using Small Language Models
    Abubakar Muhammad, Simona Frenda, Gavin Abercrombie

  • From Graphs to Hypergraphs: Enhancing Aspect-Term Sentiment Analysis via Multi-Level Relational Modeling
    Omkar Mahesh Kashyap, Padegal Amit, Ashwini M Joshi, Shylaja S S, Madhav Kashyap

  • Probing Bias Formation in Medical LLMs through Activation Steering
    Bayram Ayadi, Annette Hautli-Janisz

  • Faithfulness Beyond Plausibility: Auditing Human Explanations in Educational Assessment
    Ria Talsania, Dhruv Ritesh Shah, Sudhir Dhage

  • Thesis Proposal: Establishing Rigorous Evaluation of Sycophancy in Pretrained Language Models
    Jan Batzner

  • Identifying the Convergent Sycophancy Gap in Model Evaluations
    Jan Batzner, Volker Stocker, Stefan Schmid, Gjergji Kasneci

  • CBAL: Context-Based Agentic Learning for Speaker Diarization Segmentation Refinement
    Odwitiyo Dutta, Dinesh Kumar Vishwakarma

  • Measuring and Mitigating Shortcut Reliance in Language Models with Probe-Based Representation Entanglement
    Divyajot Singh

  • LAMP-MedQA: A Lightweight Multi-Agent System for Patient-Oriented Medical Question Answering
    Jack A. Johnson, Meghali Banerjee, Joseph Crawford, James Welch, Jim Davies, Tingyan Wang

  • Inference-Time Feedback for Reasoning Controllability in Diffusion Language Models
    Clovis Barbour, Huixin Zhan

  • Disentangling Linguistic Relatedness from Task Alignment in Cross-Lingual Transfer
    Ahmed Haj Ahmed, Ruochen Zhang, Alvin Grissom II

  • PE-QAT: Parameter-Efficient Quantization Aware Training for Large Language Models
    Shresth Mishra

  • Fusion Training for Mathematical Generalization in Large Language Models
    Congfeng Cao, Pengyu Zhang, Jelke Bloem

  • Does Topic Sentiment Cause Perceived Ideology? Comparing Human and LLM Annotations in Political News Articles
    Upasana Chatterjee

  • Understanding Conversational Implicatures in Humans and LLMs
    Daeun Kang

  • Thesis Proposal: Self-Adaptive and Epistemic Uncertainty-Guided ASR of Dense Intra-Sentential Code-Switched Speech for African Low-Resource Languages
    Umar Baba Umar

  • RegTrack: A Fine-Grained Benchmark for Multi-Class Legal Change Detection
    Joe Yu, Kevin Chenhao Li, Julian Ostarek

  • Validator-Guided Hard Negative Mining for Masked Language Modeling in Low-Resource Ancient Languages
    Andrei Voinea

  • Conformal LLM Routing with Distribution-Free Safety Guarantees
    Iqtedar Uddin, André Bauer

  • Supervision versus Demonstration-Based In-Context Learning for Multiword Expression Classification
    Sercan Karakas, Yusuf Simsek

  • When Models Hesitate: Answer Instability as a Label-Free Uncertainty Signal for LLMs
    Jasper Arana, Kristine Carandang, Ethan Casin, Christian Alis, Christopher Monterola

  • HARP: Representation-Based Preference Learning for Perceptual Data
    Jordan Sinclair, Yousra Shleibik, Kerstin Haring

  • Thesis Proposal: Toward a Human-Centered and Perspective-Aware Framework for Reproducible ML Evaluation and AI Alignment
    Deepak Pandita, Christopher M Homan

  • One Panel Does Not Fit All: Case-Adaptive Multi-Agent Deliberation for Clinical Prediction
    Yuxing Lu, Yushuhong Lin, Jason Zhang

  • Understanding Clinical Cognitive Dialogues Using Large Language Models
    Vishalakshi Arumugam, Dan Schumacher, Veronica Rammouz, Enrique Gonzalez Guerrero, Jeremy Davis, Anthony Rios

  • LLMs for Now, Fine-Tuning for Later: An Ensemble Approach to Data Drift in Domain-Specific Tasks
    Yuxuan Lu, Bingsheng Yao, Shao Zhang, Yisi Sang, Yun Wang, Hansu Gu, Peng Zhang, Tun Lu, Toby Jia-Jun Li, Dakuo Wang

  • Thesis Proposal: When Does an Agent Know It Is Lost? Confidence Trajectory Analysis for Tool-Using LLMs
    Zhenjiang Mao

  • Task Assignment meets Annotator Modeling: Human-LLM Collaborative Annotation with Constraints
    Kei Moriyama, Kouta Nakayama, Yukino Baba

  • Dynamic Meta-Metrics: Source-Sentence Conditioned Weighting for MT Evaluation
    Luke Zhang, Justin Vasselli, Aditya Khan, York Hay Ng, En-Shiun Annie Lee

  • Tonal Salience in Cognitive Decline: In-Context MCI Detection with Multimodal LLMs
    Christopher Song, Abdullah Ahmed

  • What Moves the Pareto Frontier in Tool-Using Agents? A Compute-Aware Study of ReAct Variants
    Rishi N. Simhadri

  • Filling the Long Tail: Structure-Aware Curriculum-Gap Completion for Medical Education with LLMs
    Wenjie Lin

  • Mechanistic Analysis Of Universality: Numerical Comparison Circuits Across Transformer Architectures
    Arya Bhardia, Julian Ramirez, Siddhanta Verma, Karen Mkrtchyan

  • How Hard is Math? Using Quantitative Metrics to Measure LLM Alignment to Human Intuitions of Difficulty
    Micah Helzerman, Steven R Wilson, Cam McLeman

  • Fine-Grained Semantic Comparison of Legal Documents using LLMs
    Elisei Rykov, Nikolay Ivanov, Maria Bandulevich, Kseniia Petrushina, Valentin Malykh, Vasily Konovalov, Alexander Panchenko, Ilseyar Alimova

  • Thesis Proposal: The Missing Why? Building Generative AI That Understands Purpose, Audience, and Context
    Ishani Mondal

  • Beyond Discrete Search: Divergent Thinking as Optimization in Latent Intention Space
    Mateusz Bystroński, Grzegorz Piotrowski, Tomasz Jan Kajdanowicz

  • Boosting Self-Consistency with Ranking
    Maria Marina, Daniil Moskovskiy, Sergey Pletenev, Mikhail Salnikov, Alexander Panchenko, Viktor Moskvoretskii

  • LLM-Based Zero-Shot Soft Labeling for Anticipating Disagreement in Negotiation Dialogues
    Ken Watanabe, Katsuhide Fujita

  • Analysis of the Neglect-Zero Effect in Large Language Models
    Jin Tanaka, Daiki Matsuoka, Ryoma Kumon, Hitomi Yanaka

  • Morphology-Aware Multi-Granularity Representation Learning for Agglutinative Languages
    Zhonghao Zhang, Na Liu, Jiajia Ma, Nier Wu, Guiping Liu

  • An Incremental CYK Recognizer for GPU-accelerated General Context-free Prefix Validation
    Jiacheng Zhang, Ayesha Khatun, Steven Bethard

  • Processing Inconsistency Predicts Language Competence: LLM Evaluation Without Answer Labels on Turkic Languages
    Ilya Galyukshev, Ilseyar Alimova

  • TableMBR: Minimum Bayes Risk Table Generation Based on Structural Consistency
    Yoshida Daiki, Hiroyuki Deguchi, Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe

  • Through the Looking Glass of Multilingual AI: Contrasting Language- and Name Script-Dependent Ethnic Hierarchies in GPT and DeepSeek
    Annabella Sakunkoo, Jonathan Sakunkoo

  • Thesis Proposal: Auditing and Mitigating Demographic Bias in Multi-Stage Retrieval Systems for Criminal Justice Applications
    Archan Dutta

  • Contextual Diversity Measure (CDM) for Controllable Story Generation in Large Language Models
    Richard Susilo, Hanna Suominen, Patrik Haslum

  • Constructing a Japanese Verdict Prediction Dataset for Fact-Checking of LLM-Generated Texts
    Miwa Masano, Hirokazu Kiyomaru, Atsushi Keyaki, Kaito Horio, Rei Minamoto, Ribeka Keyaki, Kouta Nakayama, Hideyuki Tachibana, Daisuke Kawahara

  • Calibration vs Decision Making: Revisiting the Reliability Paradox in Unlearned Language Models
    Divyaksh Shukla, Ashutosh Modi

  • MetaCog-Bench: Quantifying the Metacognition Gap in Edge LLM Tool Calling Under Information Insufficiency
    Yu-An Lu, Chun-En Hsiao, Chengwei Chiang, Hong-Han Shuai

  • Disentangling Meaning and Language Components in Diverse Multilingual Sentence Embeddings
    Kanade Nonomura, Keita Fukushima, Risa Kondo, Tomoyuki Kajiwara

  • Linguistically-Informed Evaluation of LLMs on Acceptability Judgments in a Forced-Choice Paradigm
    Ziyue Liu, Nils Reiter

  • EnsemHalDet: Robust VLM Hallucination Detection via Ensemble of Internal State Detectors
    Ryuhei Miyazato, Shunsuke Kitada, Kei Harada

  • Proofs as Trajectories: Learning Lean Proof-Step Representations from State Changes
    Elisaveta Samoylov, Soroush Vosoughi

  • Evaluation of Multilingual Ability to Use Spatial Deictic Expressions in Vision-Language Models
    Kaito Watanabe, Taisei Yamamoto, Tomoki Doi, Hitomi Yanaka

  • LLM Parameters for Math Across Languages: Shared or Separate?
    Behzad Shomali, Luisa Victor, Tim Selbach, Ali Hamza Bashir, David Berghaus, Joachim Koehler, Mehdi Ali, Markus Frey

  • Thesis Proposal: Sensitivity of MT Evaluation Metrics to Morphosemantic Errors: A Case Study on Swedish–Finnish Translation
    Nuo Xu

  • Thesis Proposal: Intentional Inference for Insight Generation
    Kristýna Onderková

  • RECON: Benchmarking Agent Memory for Compositional Reasoning over Long Contexts
    Mihir Arya

  • Reference-Free Schema Generation for Literature Review Tables via Multi-Faceted Rewards
    Sinjoy Saha, Suman Saha, Mahfuza Farooque, Wenpeng Yin

  • Thesis Proposal: Rethinking Safety Evaluation in Large Language Models
    Khaoula Chehbouni

  • Dissociating Circuit-Level and Distribution-Level Effects of Knowledge Conflicts in LLMs
    Pravish Sainath

  • Factual State Discovery Benchmark: Evaluating Fact Elicitation in Polish Tax Law
    Mateusz Bystroński, Kamil Tagowski, Denis Janiak, Julia Farganus, Lukasz Augustyniak, Monika Kajdanowicz, Tomasz Jan Kajdanowicz

  • Evolutionary Search for Automated Design of Uncertainty Quantification Methods
    Mikhail Seleznyov, Daniil Korbut, Viktor Moskvoretskii, Oleg Somov, Alexander Panchenko, Elena Tutubalina

  • Thesis Proposal: A Normalization-First Framework for Sound, Complete, and Utility-Ready Open Information Extraction
    Chandan Prakash, Pavan Kumar Chittimalli, Arnab Bhattacharya

  • Mind the Gap: Multilingual Divide in LLM Bias Detection and Reasoning
    Medha Hira, Prachi Goyal, Raj Maheshwari, Arnav Goel

  • The signal is coming from inside the noun phrase! Tracking semantic proto-role inferences during sentence processing
    Lucas Y. Li, Zander Lynch, Marten Van Schijndel

  • Multi-Constraint State Tracking with Negation: A Diagnostic Benchmark for LLM World Modeling
    Ayan Sar, Pranav Singh Puri, Sumit Aich, Anurag Kaushish, Tanupriya Choudhury, Ajith Abraham

  • Learning Shortcut Models for Efficient Recursive Reasoning
    Shiv Shankar

  • Thesis Proposal: When Does Confidence Signal Quality? Log-Probabilities and LLM-as-Judge in Multi-Agent Debate
    Ali Keramati, Justin Cheok, Jacob Horne, Mark Warschauer

  • Convergent Demographic Utility Hierarchies: Geometry of Intersectional Values in LLMs
    Pravish Sainath