Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

portfolio

publications

Toward A Reinforcement-Learning-Based System for Adjusting Medication to Minimize Speech Disfluency

Published in Machine Learning for Cognitive and Mental Health Workshop (ML4CMH), AAAI 2024, 2024

We propose a reinforcement learning (RL)-based system that would automatically prescribe a hypothetical patient medication that may help the patient with their mental health-related speech disfluency, and adjust the medication and the dosages in response to zero-cost frequent measurement of the fluency of the patient. We demonstrate the components of the system: a module that detects and evaluates speech disfluency on a large dataset we built, and an RL algorithm that automatically finds good combinations of medications. To support the two modules, we collect data on the effect of psychiatric medications for speech disfluency from the literature, and build a plausible patient simulation system. We demonstrate that the RL system is, under some circumstances, able to converge to a good medication regime. We collect and label a dataset of people with possible speech disfluency and demonstrate our methods using that dataset. Our work is a proof of concept: we show that there is promise in the idea of using automatic data collection to address speech disfluency.

Recommended citation: Pavlos Constas, Vikram Rawal, Matthew Honorio Oliveira, Andreas Constas, Aditya Khan, Kaison Cheung, Najma Sultani, Carrie Chen, Micol Altomare, Michael Akzam, Jiacheng Chen, Vhea He, Lauren Altomare, Heraa Muqri, Asad Khan, Nimit Amikumar Bhanshali, Youssef Rachad, Michael Guerzhoy. 2024. Toward A Reinforcement-Learning-Based System for Adjusting Medication to Minimize Speech Disfluency. In Proceedings of the Machine Learning for Cognitive and Mental Health Workshop (ML4CMH), AAAI 2024.
Download Paper

A Faculty Initiative Addressing Gender Disparity at a Small STEM-Focused University: A Case Study

Published in ACM Virtual Global Computing Education Conference (SIGCSE Virtual) 2024, 2024

The gender gap and ethnic diversity are historic challenges in computer science (CS) that have faced a lack of progress in the past half-decade. Four CS faculty members explored and investigated the issue of gender gap at a small, newly established, STEM-focused institution. This institution is dedicated to primarily undergraduate teaching and serving many first-generation university students. We collected statistics about women studying CS and the obstacles they face as they enter CS programs. To collect best practices for improving equity, diversity, and inclusion, we attended CS education conferences and discussed within a focus group at the institution. We then implemented the set of initiatives found. There were several challenges in the process: limited participation from faculty members in our focus group, barriers in conducting a survey at a large conference, and difficulty in engaging the faculty and disseminating knowledge. We summarize crucial insights gained from our efforts in this initiative, which could be valuable for future implementation of similar initiatives in other small, newly established post-secondary institutions.

Recommended citation: Amane Takeuchi, Aditya Khan, Phuong Hoang, Jian Yun Zhuang, Mariana Shimabukuro, Randy J. Fortier, Michael Miljanovic, and En-Shiun Annie Lee. 2024. A Faculty Initiative Addressing Gender Disparity at a Small STEM-Focused University: A Case Study. In Proceedings of the 2024 on ACM Virtual Global Computing Education Conference V. 1 (SIGCSE Virtual 2024). Association for Computing Machinery, New York, NY, USA, 200–206. https://doi.org/10.1145/3649165.3690102
Download Paper

URIEL+: Enhancing Linguistic Inclusion and Usability in a Typological and Multilingual Knowledge Base

Published in International Conference on Computational Linguistics (COLING) 2025, 2025

URIEL is a knowledge base offering geographical, phylogenetic, and typological vector representations for 7970 languages. It includes distance measures between these vectors for 4005 languages, which are accessible via the lang2vec tool. Despite being frequently cited, URIEL is limited in terms of linguistic inclusion and overall usability. To tackle these challenges, we introduce URIEL+, an enhanced version of URIEL and lang2vec that addresses these limitations. In addition to expanding typological feature coverage for 2898 languages, URIEL+ improves the user experience with robust, customizable distance calculations to better suit the needs of users. These upgrades also offer competitive performance on downstream tasks and provide distances that better align with linguistic distance studies.

Recommended citation: Aditya Khan, Mason Shipton, David Anugraha, Kaiyao Duan, Phuong H. Hoang, Eric Khiu, A. Seza Doğruöz, and En-Shiun Annie Lee. 2025. URIEL+: Enhancing Linguistic Inclusion and Usability in a Typological and Multilingual Knowledge Base. In Proceedings of the 31st International Conference on Computational Linguistics, pages 6937–6952, Abu Dhabi, UAE. Association for Computational Linguistics.
Download Paper

Modality Matching Matters: Calibrating Language Distances for Cross-Lingual Transfer in URIEL+

Published in European Chapter of the Association for Computational Linguistics (EACL) Student Research Workshop 2026, 2026

This paper introduces modality-specific language distance representations for cross-lingual transfer, including speaker-weighted geography, hyperbolic genealogy, and latent-variable typology. It combines these signals into a composite distance that improves transfer-language selection across multiple benchmarks.

Recommended citation: York Hay Ng*, Aditya Khan*, Xiang Lu*, Matteo Salloum, Michael Zhou, Phuong Hanh Hoang, A. Seza Doğruöz, and En-Shiun Annie Lee. 2026. Modality Matching Matters: Calibrating Language Distances for Cross-Lingual Transfer in URIEL+. In Proceedings of the European Chapter of the Association for Computational Linguistics (EACL) Student Research Workshop 2026.
Download Paper

Simple Additions, Substantial Gains: Expanding Scripts, Languages, and Lineage Coverage in URIEL+

Published in Language Resources and Evaluation Conference (LREC) 2026, 2026

This paper extends URIEL+ by adding script vectors, integrating Glottolog to expand language coverage, and broadening lineage-based imputation. These additions reduce sparsity, increase language coverage, and make URIEL+ more complete for multilingual and low-resource language research.

Recommended citation: Mason Shipton, York Hay Ng, Aditya Khan, Phuong Hanh Hoang, Xiang Lu, A. Seza Doğruöz, and En-Shiun Annie Lee. 2026. Simple Additions, Substantial Gains: Expanding Scripts, Languages, and Lineage Coverage in URIEL+. In Proceedings of the Language Resources and Evaluation Conference (LREC) 2026.
Download Paper

Enhancing Mental Health Counseling Support in Bangladesh using Culturally-Grounded Knowledge

Published in CLPsych Workshop 2026 at ACL 2026, 2026

This work studies how culturally grounded, clinically validated knowledge can improve LLM-based mental health counseling support for para-counselors in Bangladesh. It compares retrieval-augmented generation with a knowledge graph-based approach, showing that structured expert knowledge can improve contextual relevance, clinical appropriateness, and practical usability.

Recommended citation: Md Arid Hasan, Azhagu Meena SP, Aditya Khan, Abu Md Akteruzzaman Bhuiyan, Helal Uddin Ahmed, Joysree Debi, Farig Sadeque, En-Shiun Annie Lee, and Syed Ishtiaque Ahmed. 2026. Enhancing Mental Health Counseling Support in Bangladesh using Culturally-Grounded Knowledge. In Proceedings of the CLPsych Workshop 2026 at ACL 2026.
Download Paper

Dynamic Meta-Metrics: Source-Sentence Conditioned Weighting for MT Evaluation

Published in Association for Computational Linguistics (ACL) Student Research Workshop 2026, 2026

Dynamic Meta-Metrics learns source-sentence conditioned combinations of existing machine translation metrics. Instead of using one static ensemble of metric weights, the framework adapts metric weighting based on source-segment properties and evaluates these combinations across WMT Metrics Shared Task data.

Recommended citation: Luke Zhang, Justin Vasselli, Aditya Khan, York Hay Ng, and En-Shiun Annie Lee. 2026. Dynamic Meta-Metrics: Source-Sentence Conditioned Weighting for MT Evaluation. In Proceedings of the Association for Computational Linguistics (ACL) Student Research Workshop 2026.
Download Paper

talks

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.