Translating and Evolving: Towards a Model of Language Change in DisCoCat
Abstract
The categorical compositional distributional (DisCoCat) model of meaning developed by [coecke2010] has been successful in modeling various aspects of meaning. However, it fails to model the fact that language can change. We give an approach to DisCoCat that allows us to represent language models and translations between them, enabling us to describe translations from one language to another, or changes within the same language. We unify the product space representation given in [coecke2010] and the functorial description in [kartsaklis2013reasoning], in a way that allows us to view a language as a catalogue of meanings. We formalize the notion of a lexicon in DisCoCat, and define a dictionary of meanings between two lexicons. All this is done within the framework of monoidal categories. We give examples of how to apply our methods, and give a concrete suggestion for compositional translation in corpora.
B. Coecke, J. Hedges, D. Kartsaklis, M. Lewis, D. Marsden (Eds.): 2018 Workshop on Compositional Approaches for Physics, NLP, and Social Sciences (CAPNS). EPTCS 283, 2018, doi:10.4204/EPTCS.283.4. © T. Bradley, M. Lewis, J. Master, B. Theilman. This work is licensed under the Creative Commons Attribution License.
Tai-Danae Bradley (tbradley@gradcenter.cuny.edu), Martha Lewis (m.a.f.lewis@uva.nl), Jade Master (jmast003@ucr.edu), Brad Theilman (btheilma@ad.ucsd.edu)
1 Introduction
Language allows us to communicate, and to compose words in a huge variety of ways to obtain different meanings. It is also constantly changing. The compositional distributional model of [coecke2010] describes how to use compositional methods within a vector space model of meaning. However, this model, and others like it [baroni2010, maillard2014], have no built-in notion of language change, or of translation between languages.
In contrast, many statistical machine translation systems currently use neural models, where a large network is trained to translate words and phrases [mikolovtranslation, gao2014]. This approach does not exploit the grammatical structure that allows translations of phrases to be built from the translations of their individual words. In this paper we define a notion of translation between two compositional distributional models of meaning, which constitutes a first step towards unifying these two approaches.
Modeling translation between two languages also has intrinsic value, and doing so within the DisCoCat framework means that we can use its compositional power. In section 3.1, we provide a categorical description of translation between two languages that encompasses both updating or amending a language model and translating between two distinct natural languages.
In order to provide this categorical description, we must first introduce some preliminary concepts. We begin by proposing a unification of the product space representation of a language model of [coecke2010] and the functorial representation of [kartsaklis2013reasoning]. This allows us to formalize the notion of a lexicon, which had previously been only loosely defined in the DisCoCat framework. We then show how to build a dictionary between two lexicons, and give an example showing how translations can model an update or evolution of a compositional distributional model of meaning. Finally, we give a concrete suggestion for automated translation from English corpora to Spanish corpora.
2 Background
2.1 Categorical Compositional Distributional Semantics
Categorical compositional distributional models [coecke2010] successfully exploit the compositional structure of natural language in a principled manner, and have outperformed other approaches in Natural Language Processing (NLP) [grefenstette2011, kartsaklis2013]. The approach works as follows. A mathematical formalization of grammar is chosen, for example Lambek's pregroup grammars [lambek2001], although the approach is equally effective with other categorial grammars [Coecke2013]. Such a categorial grammar allows one to verify whether a phrase or a sentence is grammatically well-formed by means of a computation that establishes the overall grammatical type, referred to as a type reduction. The meanings of individual words are established using a distributional model of language, where they are described as vectors of co-occurrence statistics derived automatically from corpus data [lund1996]. The categorical compositional distributional programme unifies these two aspects of language in a compositional model where grammar mediates composition of meanings. This allows us to derive the meaning of a sentence from its grammatical structure and the meanings of its constituent words. The key insight that allows this approach to succeed is that both pregroup grammars and the category of vector spaces carry the same abstract structure [coecke2010]; the same holds for other categorial grammars, since they typically carry an even weaker categorical structure.
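As a toy illustration of how grammar mediates composition (our own sketch, not from the paper, with arbitrary dimensions), the pregroup reduction for a transitive sentence, n · (nʳ s nˡ) · n → s, becomes tensor contraction in the vector space semantics:

```python
import numpy as np

rng = np.random.default_rng(0)
dim_n, dim_s = 4, 2                          # hypothetical noun/sentence space dimensions

subject = rng.random(dim_n)                  # vector in N
obj = rng.random(dim_n)                      # vector in N
verb = rng.random((dim_n, dim_s, dim_n))     # a transitive verb lives in N ⊗ S ⊗ N

# The type reduction n · (n^r s n^l) · n -> s becomes contraction of the
# verb's two noun wires with the subject and object vectors.
sentence = np.einsum('i,isj,j->s', subject, verb, obj)
assert sentence.shape == (dim_s,)            # the result is a vector in S
```

The sentence meaning is thus a vector in the sentence space, obtained purely from the word meanings and the grammatical type of the verb.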
The categorical compositional approach to meaning uses the notion of a monoidal category, and more specifically a compact closed category to understand the structure of grammar and of vector spaces. For reasons of space, we do not describe the details of the compositional distributional approach to meaning. Details can be found in [coecke2010, kartsaklis2013reasoning], amongst others. We note only that instead of using a pregroup as our grammar category, we use the free compact closed category J=\mathscr{C}(\mathscr{B}) generated over a set of types \mathscr{B}, as described in [prellerlambek2007, preller2013].
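To make type reductions concrete, here is a minimal sketch (ours, not the paper's) of greedy pregroup reduction over basic types, representing a type as a list of (base, adjoint order) pairs, with order 0 for the base type, −1 for a left adjoint and +1 for a right adjoint. Greedy left-to-right cancellation is a simplification and may miss some reductions in general:

```python
def reduce_types(types):
    """Cancel adjacent pairs x^(k) · x^(k+1), e.g. n · n^r or n^l · n."""
    stack = []
    for t in types:
        if stack and stack[-1][0] == t[0] and stack[-1][1] + 1 == t[1]:
            stack.pop()          # the adjacent pair reduces to the unit
        else:
            stack.append(t)
    return stack

# "Alice loves Bob": n · (n^r s n^l) · n reduces to s
sentence_type = [('n', 0), ('n', 1), ('s', 0), ('n', -1), ('n', 0)]
assert reduce_types(sentence_type) == [('s', 0)]
```

A phrase is judged grammatical when its concatenated word types reduce to the sentence type s.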
3 Translating and Evolving
The categorical model has proved successful in a number of natural language processing tasks [grefenstette2011, kartsaklis2013], and is flexible enough to be extended to include ambiguity [piedeleu2015] and changes of the semantic category [bolt2017, almehairi2016]. These formalisms have allowed for connections between semantic meanings. By representing words as density matrices, a variant of the Löwner ordering has been used to measure the degree of entailment between two words [balkir2015, bankova2016]. A simpler notion of similarity has been implemented in the distributional model using the dot product [coecke2010]. However, these notions of similarity are not built into the formalism of the model. This section defines the notion of a categorical language model, which keeps track of internal relationships between semantic meanings.
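Both notions of comparison mentioned above can be sketched in a few lines (our own toy example, with made-up states): cosine similarity for vectors, and a Löwner-order entailment check for (here unnormalized) density matrices, where A entails B when B − A is positive semidefinite:

```python
import numpy as np

def cosine(v, w):
    """Cosine similarity between two word vectors."""
    return float(v @ w / (np.linalg.norm(v) * np.linalg.norm(w)))

def loewner_below(A, B, tol=1e-9):
    """True if A <= B in the Loewner order, i.e. B - A is positive semidefinite."""
    return bool(np.all(np.linalg.eigvalsh(B - A) >= -tol))

# Toy states: 'dog' is a pure state, 'animal' a broader (unnormalized) mixture.
dog = np.outer([1.0, 0.0], [1.0, 0.0])
animal = np.eye(2)

assert loewner_below(dog, animal)        # 'dog' entails 'animal'
assert not loewner_below(animal, dog)    # but not vice versa
assert np.isclose(cosine(np.array([1.0, 0.0]), np.array([1.0, 1.0])), 1 / np.sqrt(2))
```

The point of the section that follows is that such relationships can be made part of the model itself rather than bolted on afterwards.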
So far, implementations of these models have been static. In this section, we define a notion of translation which constitutes a first step towards bringing dynamics into these models of meaning. We show how a language model can be lexicalized, i.e. how vocabulary can be attached to types and vectors, and introduce a category of lexicons and translations between them. This allows us to build a dictionary between phrases in one language model and phrases in another.
3.1 Categorical Language Models and Translations
Definition 3.1.
Let J be a category which is freely monoidal on some set of grammatical types. A distributional categorical language model or language model for short is a strong monoidal functor
F\colon(J,\cdot)\to(\textup{{FVect}},\otimes) 
If J is compact closed then the essential image of F inherits a compact closed structure. All of the examples we consider will use the canonical compact closed structure in FVect. However, this is not a requirement of the general approach, and other grammars that are not compact closed may be used, such as Chomsky grammars [hedges2016] or Lambek monoids [Coecke2013].
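In the most basic terms, a strong monoidal functor F is determined by its values on the generating types and extends multiplicatively, F(A·B) ≅ F(A) ⊗ F(B). A minimal sketch (ours, with hypothetical dimensions) on the level of dimensions:

```python
# A language model F assigns a space to each basic grammatical type and
# extends monoidally: F(A·B) = F(A) ⊗ F(B), so dimensions multiply.
basic_dims = {'n': 4, 's': 2}   # hypothetical dims for the noun and sentence spaces

def F_dim(word_type):
    """Dimension of F on a composite type, given as a list of basic types."""
    d = 1
    for t in word_type:
        d *= basic_dims[t]
    return d

assert F_dim(['n']) == 4
assert F_dim(['n', 's', 'n']) == 32   # a transitive verb's space N ⊗ S ⊗ N
```

Different choices of generating spaces (and of the word vectors inhabiting them) give different language models for the same language, which is what makes the notion of translation below non-trivial.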
Distributional categorical language models do not encapsulate everything about a particular language. In fact, there are many possible categorical language models for the same language and there is a notion of translation between them.
Definition 3.2.
A translation T=(j,\alpha) from a language model F\colon J\to\textup{{FVect}} to a language model F^{\prime}\colon J^{\prime}\to\textup{{FVect}} is a monoidal functor j\colon J\to J^{\prime} and a monoidal natural transformation \alpha\colon F\Rightarrow F^{\prime}\circ j. Pictorially, (j,\alpha) is the following 2-cell
\xymatrix@=2em{
J \ar[rr]^{j} \ar[dr]_{F} & \ar@{}[d]|{\Downarrow\alpha} & J^{\prime} \ar[dl]^{F^{\prime}} \\
& \textup{{FVect}} &
}
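Concretely, the components of α are linear maps on the spaces of the basic types, and monoidality forces the component on a composite type to be the tensor (Kronecker) product of the basic components. A sketch (ours, with hypothetical dimensions and random maps):

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = {'n': rng.random((3, 4)),    # hypothetical component α_n : F(n) -> F'(j(n))
         's': rng.random((2, 2))}    # hypothetical component α_s : F(s) -> F'(j(s))

def alpha_on(word_type):
    """Component of α on a composite type, by monoidality a Kronecker product."""
    m = np.eye(1)
    for t in word_type:
        m = np.kron(m, alpha[t])
    return m

dog = rng.random(4)                  # a noun vector in F(n)
translated_dog = alpha['n'] @ dog    # its translation, a vector in F'(j(n))

verb = rng.random(4 * 2 * 4)         # a transitive verb in F(n·s·n), flattened
translated_verb = alpha_on(['n', 's', 'n']) @ verb
assert translated_verb.shape == (3 * 2 * 3,)
```

Translating a composite word thus requires no new data beyond the components on the generating types, which is exactly the compositional economy the definition is designed to capture.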