Deep Learning Agents and the Emergence of Compositional Languages: Approaches, Inductive Biases and Measurement

Nicholas Bailey; Chris Child; Tillman Weyde

doi:10.1613/jair.1.17302

Published: May 29, 2026

Keywords:

Autonomous Agents, Artificial Intelligence, Emergent Languages, Compositionality

Nicholas Bailey

City St George’s, University of London

https://orcid.org/0000-0001-9211-2197

Chris Child

City St George’s, University of London

https://orcid.org/0000-0001-5425-2308

Tillman Weyde

City St George’s, University of London

https://orcid.org/0000-0001-8028-9905

Abstract

Background: Compositional symbol-forming and symbol-relating behaviors in deep learning or neuro-symbolic systems have been repeatedly recommended as part of a solution to the shortcomings of current state-of-the-art artificial intelligence. Studying how compositional languages can emerge between tabula rasa deep learning agents may help us understand how to make artificial neural networks represent unstructured, continuous input data in terms of combinations of discrete symbols.

Objectives: We aim to present a comprehensive overview of recent research into compositional languages emerging between deep learning agents, in a manner accessible to machine learning researchers who are not already familiar with emergent communication and emergent languages.

Methods: We review roughly ten years of emergent language research, particularly focusing on contributions after 2019 that pertain to measuring or eliciting compositionality in emergent languages.

Results: Systematic generalization and topographic similarity (topsim) are the dominant measures of compositionality in recent literature. “Productivity pressure”, forcing agents to use vocabularies smaller than the number of meanings they need to communicate, is clearly necessary for compositionality to emerge. Regularizing or periodically resetting receiver/listener agents is an effective way of encouraging more compositional languages, perhaps because it pressures the speaker to create languages that can be learned more efficiently. The relative benefits of various neural network architectures, particularly the Transformer architecture dominant in other areas of deep learning, remain underexplored. As other authors have noted, the field relies heavily on small-scale models and simple, often symbolic environments, which may limit the generality of current conclusions.

Conclusions: Emergent language research provides a testbed for encouraging emergent compositionality in deep learning models, which may in the future contribute to the development of safer, more interpretable, and more sample-efficient neuro-symbolic foundation models. We advocate that future research converge on topsim and generalization as the standard approaches to measuring compositionality, and also work to expand topsim into a family of metrics that can detect compositionality in languages displaying forms of linguistic variation such as free word order and synonymy. We also call upon researchers to test promising techniques at larger scales, with a greater range of agent architectures, and with more complex, multi-modal referents.
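
For readers new to the topsim measure named above, the following is a minimal sketch of one common formulation: the Spearman rank correlation between pairwise distances in meaning space and pairwise distances in message space. It is illustrative only and not necessarily the exact variant used in any work the article reviews. The choice of Hamming distance over attribute tuples, Levenshtein distance over message strings, and all function names (`hamming`, `edit_distance`, `average_ranks`, `pearson`, `topsim`) are assumptions made for this example.

```python
from itertools import combinations


def hamming(a, b):
    """Number of attribute positions at which two meanings differ."""
    return sum(x != y for x, y in zip(a, b))


def edit_distance(s, t):
    """Levenshtein distance between two messages (sequences of symbols)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant distances")
    return cov / (vx * vy) ** 0.5


def topsim(meanings, messages):
    """Spearman correlation between meaning and message pairwise distances."""
    pairs = list(combinations(range(len(meanings)), 2))
    meaning_d = [hamming(meanings[i], meanings[j]) for i, j in pairs]
    message_d = [edit_distance(messages[i], messages[j]) for i, j in pairs]
    # Spearman = Pearson correlation computed on the ranks.
    return pearson(average_ranks(meaning_d), average_ranks(message_d))


# Toy data (hypothetical): two binary attributes, two-symbol messages.
meanings = [(0, 0), (0, 1), (1, 0), (1, 1)]
compositional = ["aa", "ab", "ba", "bb"]  # one symbol per attribute
holistic = ["ab", "ba", "bb", "aa"]       # arbitrary meaning-to-message mapping

score_compositional = topsim(meanings, compositional)
score_holistic = topsim(meanings, holistic)
```

In this toy setting the compositional language, where each symbol position encodes one attribute, scores higher than the arbitrary mapping. A metric of this form also penalizes languages with free word order or synonymy, since edit distance treats reordered or substituted symbols as differences even when the underlying mapping is systematic.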

Issue

Vol. 86 (2026)

Section

Articles
