Converging “Truths”? The Role of Digital Knowledge Platforms for Information Flows across Language Barriers and History
- Working Paper
Digital platforms are widely believed to entrench polarization and “echo chambers,” fostering parallel truths and undermining social cohesion, with potentially serious repercussions not only for political and economic stability but also for the propagation of algorithmic biases in AI. In this paper, we provide evidence that the opposite dynamic is also possible: Wikipedia’s multilingual infrastructure appears to undo long-standing, language-specific filter bubbles embedded in contested nationalistic narratives of past armed conflicts. Specifically, by analyzing how historic battles are portrayed on Wikipedia across more than 50 languages and two decades, with a focus on 14 major language editions and over 100 conflict events, we show that initially stark differences across communities gradually become harmonized.
To quantify this process of narrative convergence, we build a novel quarterly panel dataset that links revision histories, language metadata, and structured “battle box” figures (e.g., troop strengths, casualties) to multilingual article text. We measure narrative distance with cross-lingual Large Language Models (LLMs): we embed articles into a shared semantic space and compute cosine distances from a stable centroid derived from over 50 languages as of 2020. Alongside these semantic measures, we extract numeric conflict data to track factual divergence across language editions.
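The distance measure described above can be illustrated with a minimal sketch. It assumes article embeddings are already available as plain lists of floats; the function names `centroid` and `cosine_distance` and the toy three-dimensional vectors are hypothetical stand-ins, not the embeddings or code used in the paper.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine_distance(u, v):
    """Return 1 - cos(u, v); 0 means same direction, 1 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Toy reference embeddings (assumption: one vector per language edition
# at the reference date) used to build the fixed centroid.
reference = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
center = centroid(reference)

# Toy embeddings of two earlier article versions.
close_version = [1.0, 0.0, 0.0]
far_version = [0.0, 0.0, 1.0]

d_close = cosine_distance(close_version, center)
d_far = cosine_distance(far_version, center)
```

Because the centroid is held fixed at a single reference date, distances computed for different quarters are comparable with one another, so a falling distance over time can be read as movement toward the multilingual consensus.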
Although the battles in our sample occurred more than 175 years ago, we observe large and systematic differences in early Wikipedia versions: articles in the languages of conflict winners tend to be older, longer, more actively revised, and semantically closer to the multilingual centroid, suggesting narrative centrality or agenda-setting influence. Yet over time, both factual and semantic divergences shrink. Using an event-study design, we find that convergence accelerates sharply following the first appearance of cross-language links and standardized infobox templates. These structural features appear to reduce frictions in comparison and editing, triggering distinctive “spurts” of harmonization across language editions.
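The core idea behind an event-study comparison is to realign each article's quarterly series around its own event date. The sketch below shows only that realignment and a raw average per relative quarter; it is not the paper's estimation, which would typically add controls and fixed effects. The function `event_time_means`, the tuple layout, and all numbers are hypothetical illustrations.

```python
from collections import defaultdict

def event_time_means(panel, event_quarter):
    """Average distance by quarter relative to each article's event.

    panel: iterable of (article_id, quarter_index, distance) tuples.
    event_quarter: dict mapping article_id to the quarter index in which
        the event (e.g. the first cross-language link) occurs.
    """
    buckets = defaultdict(list)
    for article_id, quarter, distance in panel:
        if article_id in event_quarter:
            relative = quarter - event_quarter[article_id]
            buckets[relative].append(distance)
    return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}

# Made-up toy panel: two articles whose events fall in different quarters.
toy_panel = [
    ("a", 0, 0.8), ("a", 1, 0.8), ("a", 2, 0.5),
    ("b", 3, 0.6), ("b", 4, 0.6), ("b", 5, 0.3),
]
toy_events = {"a": 2, "b": 5}

means = event_time_means(toy_panel, toy_events)
```

In this toy data the average distance is flat in the two quarters before the event and drops in the event quarter, which is the kind of pattern an event-study plot would make visible.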
Our findings highlight how online platforms can mediate the construction of collective memory in ways that counter the fragmenting dynamics typically attributed to digital media. In contrast to social platforms that amplify division, Wikipedia offers an infrastructure that enables convergence in historically polarized narratives. Methodologically, we demonstrate how multilingual LLMs and automated extraction techniques can be combined to trace the evolution of knowledge, disagreement, and reconciliation across language communities at scale.