Publications

You can also find my articles on my Google Scholar profile.

Conference Papers

URIEL+: Enhancing Linguistic Inclusion and Usability in a Typological and Multilingual Knowledge Base

Published in International Conference on Computational Linguistics (COLING) 2025, 2025

URIEL is a knowledge base offering geographical, phylogenetic, and typological vector representations for 7970 languages. It includes distance measures between these vectors for 4005 languages, which are accessible via the lang2vec tool. Despite being frequently cited, URIEL is limited in terms of linguistic inclusion and overall usability. To tackle these challenges, we introduce URIEL+, an enhanced version of URIEL and lang2vec that addresses these limitations. In addition to expanding typological feature coverage for 2898 languages, URIEL+ improves the user experience with robust, customizable distance calculations to better suit the needs of users. These upgrades also offer competitive performance on downstream tasks and provide distances that better align with linguistic distance studies.

Recommended citation: Aditya Khan, Mason Shipton, David Anugraha, Kaiyao Duan, Phuong H. Hoang, Eric Khiu, A. Seza Doğruöz, and En-Shiun Annie Lee. 2025. URIEL+: Enhancing Linguistic Inclusion and Usability in a Typological and Multilingual Knowledge Base. In Proceedings of the 31st International Conference on Computational Linguistics, pages 6937–6952, Abu Dhabi, UAE. Association for Computational Linguistics.
Download Paper

A Faculty Initiative Addressing Gender Disparity at a Small STEM-Focused University: A Case Study

Published in ACM Virtual Global Computing Education Conference (SIGCSE Virtual) 2024, 2024

The gender gap and ethnic diversity are historic challenges in computer science (CS) that have faced a lack of progress in the past half-decade. Four CS faculty members explored and investigated the issue of gender gap at a small, newly established, STEM-focused institution. This institution is dedicated to primarily undergraduate teaching and serving many first-generation university students. We collected statistics about women studying CS and the obstacles they face as they enter CS programs. To collect best practices for improving equity, diversity, and inclusion, we attended CS education conferences and discussed within a focus group at the institution. We then implemented the set of initiatives found. There were several challenges in the process: limited participation from faculty members in our focus group, barriers in conducting a survey at a large conference, and difficulty in engaging the faculty and disseminating knowledge. We summarize crucial insights gained from our efforts in this initiative, which could be valuable for future implementation of similar initiatives in other small, newly established post-secondary institutions.

Recommended citation: Amane Takeuchi, Aditya Khan, Phuong Hoang, Jian Yun Zhuang, Mariana Shimabukuro, Randy J. Fortier, Michael Miljanovic, and En-Shiun Annie Lee. 2024. A Faculty Initiative Addressing Gender Disparity at a Small STEM-Focused University: A Case Study. In Proceedings of the 2024 on ACM Virtual Global Computing Education Conference V. 1 (SIGCSE Virtual 2024). Association for Computing Machinery, New York, NY, USA, 200–206. https://doi.org/10.1145/3649165.3690102
Download Paper

Workshop Papers

Toward A Reinforcement-Learning-Based System for Adjusting Medication to Minimize Speech Disfluency

Published in Machine Learning for Cognitive and Mental Health Workshop (ML4CMH), AAAI 2024, 2024

We propose a reinforcement learning (RL)-based system that would automatically prescribe a hypothetical patient medication that may help the patient with their mental health-related speech disfluency, and adjust the medication and the dosages in response to zero-cost frequent measurement of the fluency of the patient. We demonstrate the components of the system: a module that detects and evaluates speech disfluency on a large dataset we built, and an RL algorithm that automatically finds good combinations of medications. To support the two modules, we collect data on the effect of psychiatric medications for speech disfluency from the literature, and build a plausible patient simulation system. We demonstrate that the RL system is, under some circumstances, able to converge to a good medication regime. We collect and label a dataset of people with possible speech disfluency and demonstrate our methods using that dataset. Our work is a proof of concept: we show that there is promise in the idea of using automatic data collection to address speech disfluency.

Recommended citation: Pavlos Constas, Vikram Rawal, Matthew Honorio Oliveira, Andreas Constas, Aditya Khan, Kaison Cheung, Najma Sultani, Carrie Chen, Micol Altomare, Michael Akzam, Jiacheng Chen, Vhea He, Lauren Altomare, Heraa Muqri, Asad Khan, Nimit Amikumar Bhanshali, Youssef Rachad, and Michael Guerzhoy. 2024. Toward A Reinforcement-Learning-Based System for Adjusting Medication to Minimize Speech Disfluency. In Proceedings of the Machine Learning for Cognitive and Mental Health Workshop (ML4CMH), AAAI 2024.
Download Paper