Evaluating Semantic and Contextual Alignment in Language Models

Date

2025-12-17

Author

Mahajan, Yash

Abstract

Large Language Models (LLMs) have transformed how we represent, interpret, and generate language, showing promising performance on a wide range of downstream tasks. Yet two central questions remain: are these gains merely a consequence of larger models, or do LLMs produce fundamentally different semantic representations? And do these representations yield measures of semantic similarity that align with how humans judge semantic similarity? To probe these questions, this thesis examines a spectrum of language models, from classical models such as Word2Vec, GloVe, and SBERT to LLMs such as LLaMA and Mistral, and follows a progressively deepening line of inquiry.

We begin with word embeddings in a decontextualized setting, studying how models represent a standalone word without any surrounding text. Using WordNet and the BATS dataset, we construct a large corpus (~80K unique words and 6.4B word pairs) to compare how different models capture lexical meaning. This initial analysis shows that LLM-induced embeddings tend to cluster semantically related words more tightly and perform better on analogy-style relationships.

Extending this work to the contextualized setting, we design nine controlled variational tasks, grouped into lexical, tonal, and semantic variations, to test how model-induced embeddings change when a particular word (the anchor word) is placed in different linguistic contexts. Here we observe a recurring trade-off: larger, generative models excel at capturing contextual variation at the token level, whereas classical models are better at preserving semantic variation at the sentence level. Most interestingly, no single model family aligns with our predefined set of human expectations across all variation types, suggesting that semantic "understanding" is not a simple consequence of scaling.

Building on these insights, we move to the sentence level and introduce ALIGN-SIM, a task-free evaluation framework designed to assess whether model-induced sentence embeddings align with human judgments of semantic similarity. Through multi-model and multi-axis evaluation, we again find that interpretability and alignment depend more on architectural objectives and training paradigms than on scale alone.

Finally, extending the idea of alignment from representation to personalization, we investigate persona adoption: whether LLMs can adapt to realistic pseudo-personas and generate responses that reflect demographic and cultural identities when prompted. Using the PersonaAdopt dataset, we analyze how LLMs adapt to richly detailed user profiles from three cultural contexts: the United States, India, and Bangladesh. We evaluate six instruction-tuned models (three closed-source and three open-source) across 22 profiles, each varying one of ten key attributes (e.g., religion or political views). While proprietary models such as Claude and GPT-3.5 demonstrate surface-level personalization, most open-source models generate persona-agnostic or repetitive outputs, revealing shallow adaptation.

Collectively, these studies demonstrate that semantic alignment is a multi-layered phenomenon: LLMs exhibit strong linguistic fluency but often fail to ground meaning in human-like conceptual and contextual understanding. By conducting a thorough embedding-based and persona-based evaluation, this thesis provides a comprehensive analysis of how language models represent, align, and communicate meaning, offering new insights into how models "understand" language in both semantic and social dimensions.
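
To make the kind of alignment evaluation described above concrete, the short Python sketch below illustrates one common recipe: compare a model's embedding-based cosine similarities against human similarity ratings via Spearman correlation. This is only an illustration, not the thesis code or the ALIGN-SIM framework itself; the model name, sentence pairs, and ratings are placeholders, and it assumes the sentence-transformers and scipy packages are available.

from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer, util

# Hypothetical sentence pairs with made-up human similarity ratings on a 0-5 scale.
pairs = [
    ("A man is playing a guitar.",  "Someone strums a guitar.",    4.6),
    ("A man is playing a guitar.",  "A man is playing a violin.",  2.8),
    ("The cat sleeps on the sofa.", "A kitten naps on the couch.", 4.2),
    ("The cat sleeps on the sofa.", "Stock prices fell sharply.",  0.2),
]

# Any sentence-embedding model could be swapped in here.
model = SentenceTransformer("all-MiniLM-L6-v2")

model_scores, human_scores = [], []
for s1, s2, human_rating in pairs:
    emb1, emb2 = model.encode([s1, s2], convert_to_tensor=True)
    # Cosine similarity between the two sentence embeddings.
    model_scores.append(util.cos_sim(emb1, emb2).item())
    human_scores.append(human_rating)

# Spearman correlation: does the model rank pair similarity the way humans do?
rho, _ = spearmanr(model_scores, human_scores)
print(f"Spearman correlation with human judgments: {rho:.3f}")

The same skeleton extends naturally to multi-model comparison by looping over different embedding models and reporting one correlation per model and per evaluation axis.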