Natural language processing (NLP) is ubiquitous in today’s world. As a field, NLP lies at the intersection of computer science, artificial intelligence, and linguistics, with a focus on enabling machines to understand, generate, and interact with human language. Ranging from rule-based models and statistical methods to, more recently, deep learning approaches such as large language models (LLMs), NLP techniques serve as a foundation for most modern technologies, such as language translators, search engines, and the virtual assistants powering our smart home devices, code editors, and more. NLP research in our section is headed by Akhil Arora and his group CLAN (AI Research on Language and Networks). CLAN focuses on making language technologies more human-centered, robust, trustworthy, accessible, and impactful in real-world settings. Their work can be broadly categorized into the following five directions.
CLAN’s NLP research has significant social and practical impact by addressing real-world challenges in how people access, share, and contribute to knowledge online. Socially, their work empowers millions of users—especially on platforms like Wikipedia—by making information more accessible, reliable, and inclusive. For example, their models help identify and bridge knowledge gaps (e.g., neglected “orphan articles”), promote fact-checked content through LLM reasoning tools, and respect user privacy through synthetic behavior modeling. These advances support better information equity, especially in underserved languages and topics. Practically, CLAN’s tools and frameworks improve the design of human-in-the-loop AI systems, such as intelligent assistants that can explain sources or guide users through complex information spaces. Their collaborations with Wikipedia and focus on verifiable LLM outputs also contribute to combating misinformation—an increasingly critical concern. Furthermore, their research makes AI technologies more sustainable by improving the cost-quality trade-off of LLM-based reasoning frameworks. Overall, their research bridges the gap between state-of-the-art language models and public-good applications, making NLP not just powerful, but meaningfully useful.
Lars Klein, Nearchos Potamitis, R. Aydin, Robert West, Caglar Gulcehre, Akhil Arora
Fleet of Agents: Coordinated Problem Solving with Large Language Models (ICML 2025)
Amalie Brogaard Pauli, Isabelle Augenstein, Ira Assent
Measuring and Benchmarking Large Language Models’ Capabilities to Generate Persuasive Language (NAACL 2025)
Tomás Feith, Akhil Arora, Martin Gerlach, Debjit Paul, Robert West
Entity Insertion in Multilingual Linked Corpora: The Case of Wikipedia (EMNLP 2024)
Marko Čuljak, Andreas Spitz, Robert West, Akhil Arora
Strong Heuristics for Named Entity Linking (NAACL-SRW 2022)
Akhil Arora, Alberto Garcia-Duran, Robert West
Low-Rank Subspaces for Unsupervised Entity Linking (EMNLP 2021)