Towards a Philological Metric through a Topological Data Analysis Approach
The canon of Spanish Baroque literature has been thoroughly studied with philological techniques. The major representatives of the poetry of this period are Francisco de Quevedo and Luis de Góngora y Argote. Literary experts commonly classify them into two different streams: Quevedo belongs to Conceptismo and Góngora to Culteranismo. Moreover, although Quevedo is traditionally considered the most representative poet of Conceptismo, Lope de Vega is also considered to be, at least, closely related to this literary trend. In this paper, we use Topological Data Analysis techniques to provide a first approach to a metric distance between the literary styles of these poets. As a consequence, we reach results that agree with the literary experts’ criteria, locating the literary style of Lope de Vega closer to that of Quevedo than to that of Góngora.
Keywords: Philological metric, Spanish Golden Age poets, word embedding, topological data analysis, Spanish literature
Topology is the branch of Mathematics which deals with proximity relations and continuous deformations in abstract spaces. Recently, many researchers have paid attention to it due to the increasing amount of data available and the need for in-depth analysis of these datasets to extract useful properties. The application of topological tools to the study of these data is known as Topological Data Analysis (TDA), and this research line has achieved a long list of successes in recent years (see, e.g., ,  or , among many others). In this paper, we focus our attention on applying such TDA techniques to study and effectively compute some kind of nearness in philological studies.
Until now, most of the methods used in comparison studies in philology are essentially qualitative. The comparison among writers, periods or, in general, literary works is often based on stylistic observations that cannot be quantified. Several quantitative methods based on statistical analysis have been applied in the past (see ) but their use is still controversial .
Our approach, based on TDA techniques, is completely different from previous ones. Instead of using statistical methods, whose aim is to summarize the information of the literary work in a numerical description, our procedure is based on the spatial shape of the data after embedding it in a high-dimensional metric space. Broadly speaking, our work starts by representing a literary work as a cloud of points. The process of building such a representation word by word is called word embedding.
Among the most popular systems for word embedding are word2vec , GloVe  and FastText . Throughout this paper, word2vec with its skipgram variation is used to obtain such a multidimensional representation of literary works.
The embedding techniques mentioned above try to find a representation of the literary work as a high-dimensional point cloud in such a way that the semantic proximity is kept. The latter is one of the key points of this paper. Another of the key points is the use of TDA techniques to measure the nearness between different point clouds representing different literary works.
In computer science, there are many different ways to measure the distance between two point clouds , but most of them are based merely on some kind of statistical summary of the point cloud and not on its shape.
In this paper, the shape of a point cloud representing a literary work is captured by using a TDA tool known as the persistence diagram, which is based on deep and well-known concepts of algebraic topology such as simplicial complexes, homology groups and filtrations. A distance between persistence diagrams, namely the bottleneck distance, provides a way to quantify the nearness between two persistence diagrams and hence a way to quantify the nearness between two literary works.
As far as we know, very few papers explore similar research lines [12, 31] in which the proximity between literary works is measured using TDA techniques. In order to illustrate the potential of such techniques, we provide a case study on the comparison of the literary works of two poets who are representatives of the two main stylistic trends of the Spanish Golden Age: Luis de Góngora and Francisco de Quevedo. We also consider a third poet, Lope de Vega, whose literary works belong to the same stylistic trend as those of Francisco de Quevedo.
Literary experts agree that the styles of Lope de Vega and Francisco de Quevedo are close (they belong to the same literary trend, the so-called Conceptismo), but both are far from the style of Luis de Góngora, which corresponds to a different literary trend called Culteranismo . The application of TDA techniques for measuring the nearness of such Spanish poets quantitatively confirms that the styles of Lope de Vega and Francisco de Quevedo are close to each other and yet both styles are far from the style of Luis de Góngora.
The paper is organized as follows: In Section 2, some preliminary notions about word embedding and TDA techniques are provided. The procedure applied to compare two different literature styles is described in Section 3. In Section 4, the specific comparison between the literary works of the three poets mentioned above is thoroughly described. Finally, in Section 5, conclusions and future work are given.
In this section, we recall some basics related to the techniques used throughout the paper. First, the word embedding methodology is briefly introduced. Then, the relevant TDA tools used in our approach are described.
2.1 Word embedding
Word embedding is the collective name for a set of methods that represent words from natural languages as points (or vectors) in a real-valued multi-dimensional space. The common feature of such methods is that words with similar meanings have nearby representations. Such representation methods underlie some of the major successes of deep learning applied to natural language processing (see, for example,  or ). Next, we recall some basic definitions related to this methodology.
Definition 1 (corpus)
Given a finite alphabet $\Sigma$, the set of all possible words is $\Sigma^*$. A corpus is a finite collection of writings composed with these words, denoted by $C$. The vocabulary, $V$, of a corpus is the set of all the words that appear in $C$. Finally, given $d \geq 1$, a word embedding is a function $E: V \to \mathbb{R}^d$.
The word embedding process used throughout this paper is word2vec.
The two main models of the word2vec technique are called CBOW (Continuous Bag of Words) and skipgram. A detailed description of these models is beyond the scope of this paper. Roughly speaking, the neural network is trained on a corpus, where the context of a word is a window around a target word. In the skipgram model, each input word is processed by a log-linear classifier with a continuous projection layer that tries to predict the preceding and following words in a sentence. In this architecture, the input is a one-hot vector representing a word of the corpus, the weights of the hidden layer are the high-dimensional representation of the words, and the output is a prediction of the surrounding words. The architecture is shown and explained in Figure 1.
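To make the notion of a context window concrete, the following minimal sketch lists the (target, context) pairs that a skipgram model would be trained on. It is only an illustration and not the implementation used in the paper: the function name `skipgram_pairs` and the window size are assumptions, and no neural network is trained here.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (target, context) pairs for every token and each neighbour
    located at most `window` positions before or after it."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs


# Hypothetical input: a single verse, split on whitespace.
verse = "infame turba de nocturnas aves".split()
pairs = skipgram_pairs(verse, window=1)
for target, context in pairs:
    print(target, "->", context)
```

With a window of 1, each inner word contributes two pairs and each end word contributes one. A larger window makes the model look at a wider context around each target word.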
2.2 Topological data analysis
The field of computational topology and, specifically, topological data analysis were born as a combination of topics in geometry, topology, and algorithms. In this section, some of their basic concepts are recalled. For a detailed presentation of this field, [11, 23] are recommended.
As we will mention below, we are interested in how a space is connected, taking into account the distribution of a point cloud in the space. With this aim, we first recall homology and then persistent homology, which are fundamental TDA tools. The information obtained when computing persistent homology is usually encapsulated as a persistence barcode or, equivalently, as a persistence diagram. Finally, the bottleneck distance will be presented as the main distance to compare them.
Homology groups are defined on simplicial complexes, which are spaces built from points, line segments, triangles, and their higher-dimensional analogues. These components are called simplices.
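Definition 2 ($n$-simplex)
Let $\{v_0, \dots, v_n\}$ be a set of geometrically independent points in $\mathbb{R}^d$. The $n$-simplex $\sigma$ spanned by $v_0, \dots, v_n$ is defined as the set of all points $x \in \mathbb{R}^d$ such that $x = \sum_{i=0}^{n} t_i v_i$, where $t_i \geq 0$ when $0 \leq i \leq n$, and $\sum_{i=0}^{n} t_i = 1$. Besides, $v_0, \dots, v_n$ are called the vertices of $\sigma$, the number $n$ is called the dimension of $\sigma$, and any simplex spanned by a subset of $\{v_0, \dots, v_n\}$ is called a face of $\sigma$.
When a set of simplices is glued together along common faces, a simplicial complex is formed.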
Definition 3 (simplicial complex)
A simplicial complex $K$ in $\mathbb{R}^d$ is a collection of simplices in $\mathbb{R}^d$ such that:
every face of a simplex of $K$ is in $K$;
the intersection of any two simplices of $K$ is a face of each of them.
Any $L \subseteq K$ is called a subcomplex of $K$ if $L$ is a simplicial complex.
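Next, the definition of $p$-chains and their boundaries is recalled. It is a key step in formalizing the idea of a hole in a multidimensional space.
Definition 4 (chain complexes)
Let $K$ be a simplicial complex and $p$ a dimension. A $p$-chain is a formal sum of $p$-simplices, $c = \sum_i a_i \sigma_i$, in $K$, where the $\sigma_i$ are $p$-simplices and the $a_i$ are coefficients. The sum of $p$-chains is defined componentwise, i.e., if $c' = \sum_i b_i \sigma_i$ is another $p$-chain, then $c + c' = \sum_i (a_i + b_i) \sigma_i$. The $p$-chains together with the addition form an abelian group denoted by $C_p$. To relate these groups across dimensions, the boundary of a $p$-simplex, $\sigma = [v_0, \dots, v_p]$, is defined as the sum of its $(p-1)$-dimensional faces, that is, $\partial_p \sigma = \sum_{j=0}^{p} [v_0, \dots, \hat{v}_j, \dots, v_p]$, where the hat indicates that $v_j$ is omitted. The boundary of a $p$-chain is the sum of the boundaries of its simplices. Hence, the boundary is a homomorphism that maps a $p$-chain to a $(p-1)$-chain, and we write $\partial_p: C_p \to C_{p-1}$. Then, a chain complex is the sequence of chain groups connected by boundary homomorphisms,
$$\cdots \xrightarrow{\partial_{p+2}} C_{p+1} \xrightarrow{\partial_{p+1}} C_p \xrightarrow{\partial_p} C_{p-1} \xrightarrow{\partial_{p-1}} \cdots$$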
A crucial property of the boundary homomorphism is that the boundary of the boundary is null. Next, the chains with empty boundary are considered. From an algebraic point of view, they have a group structure.
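The property that the boundary of a boundary is null can be checked on a tiny example. The sketch below assumes coefficients in $\mathbb{Z}_2$ (so that faces appearing an even number of times cancel), which is an assumption of this illustration; the function name `boundary` and the encoding of simplices as sorted tuples of vertex labels are also hypothetical choices.

```python
from collections import Counter
from itertools import combinations


def boundary(chain):
    """Boundary of a chain with Z_2 coefficients.

    A chain is a set of simplices, each one a sorted tuple of vertices.
    Faces that appear an even number of times cancel out.
    """
    counts = Counter()
    for simplex in chain:
        # The faces of a p-simplex are obtained by omitting one vertex.
        for face in combinations(simplex, len(simplex) - 1):
            counts[face] += 1
    return {face for face, c in counts.items() if c % 2 == 1}


triangle = {(0, 1, 2)}            # a single 2-simplex
edges = boundary(triangle)        # its three edges
vertices_after = boundary(edges)  # every vertex appears twice and cancels

print(sorted(edges))
print(vertices_after)
```

Each vertex of the triangle belongs to exactly two edges, so the second boundary is the empty chain.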
Definition 5 ($p$-cycles and $p$-boundaries)
The group of $p$-cycles, denoted by $Z_p$, is the subgroup of the group of $p$-chains composed of those chains with empty boundary, $Z_p = \ker \partial_p$. The group of $p$-boundaries, denoted by $B_p$, is the subgroup of the group of $p$-chains composed of those chains that are in the image of the $(p+1)$-st boundary homomorphism, $B_p = \operatorname{im} \partial_{p+1}$.
Let us observe that since $\partial_p \circ \partial_{p+1} = 0$, then $B_p$ is a subgroup of $Z_p$. Therefore, we can now recall the definition of homology groups.
Definition 6 (homology groups)
The $p$-th homology group is the quotient of the group of $p$-cycles, $Z_p$, by the group of $p$-boundaries, $B_p$, that is, $H_p = Z_p / B_p$. The elements of $H_p$ are called $p$-dimensional homology classes. The $p$-th Betti number, $\beta_p$, is the rank of $H_p$.
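As a worked illustration of these definitions, the following sketch computes the Betti numbers of a hollow triangle from the rank of its boundary matrix. It assumes $\mathbb{Z}_2$ coefficients, and the helper `rank_mod2` is a hypothetical name introduced only for this example.

```python
def rank_mod2(matrix):
    """Rank of a 0/1 matrix over Z_2, by Gaussian elimination."""
    rows = [list(r) for r in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] % 2), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] % 2:
                rows[r] = [(a + b) % 2 for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


# Hollow triangle: vertices 0, 1, 2 and edges (0,1), (0,2), (1,2); no 2-simplex.
# Rows are vertices, columns are edges; column j marks the endpoints of edge j.
d1 = [[1, 1, 0],
      [1, 0, 1],
      [0, 1, 1]]

n_vertices, n_edges = 3, 3
rank_d1 = rank_mod2(d1)
# The boundary of a vertex is zero, so every 0-chain is a cycle.
betti_0 = n_vertices - rank_d1
# There are no triangles, so the group of 1-boundaries is trivial.
betti_1 = (n_edges - rank_d1) - 0

print(betti_0, betti_1)
```

The result reflects the shape of the complex: one connected component and one 1-dimensional hole.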
Next, the idea is to build a nested sequence of simplicial complexes in order to track the evolution of the homology groups along the sequence. Homology classes can merge following the “elder rule”: when two homology classes merge, the class that appeared first in the sequence persists while the other dies. More formally, given a simplicial complex $K$ and a monotonic continuous function $f: K \to \mathbb{R}$, called the filtration function, we can define the sublevel sets $K(a) = f^{-1}((-\infty, a])$, which satisfy $K(a) \subseteq K(b)$ when $a \leq b$.
Definition 7 (filtration)
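Let $K$ be a simplicial complex and let $f: K \to \mathbb{R}$ be a non-decreasing function. A filtration of $K$ is a nested sequence of subcomplexes,
$$\emptyset = K_0 \subseteq K_1 \subseteq \cdots \subseteq K_n = K$$
such that if $a_1 < a_2 < \cdots < a_n$ are the function values of the simplices in $K$ and $a_0 = -\infty$, then $K_i = f^{-1}((-\infty, a_i])$ for each $i$.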
The filtration that we will use in this paper is the so-called Vietoris-Rips filtration, which is usually applied to point clouds. The filtration function grows a ball of radius $\epsilon$ around each point. When two of these balls intersect, a $1$-simplex (an edge) is built from the two points, establishing a relationship between them. The process extends to higher dimensions: if three balls intersect pairwise, a $2$-simplex (a triangle) is built, and so on.
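The construction can be sketched for a single scale as follows. This is a brute-force illustration rather than the implementation used in the experiments: the function name `vietoris_rips`, the Euclidean metric and the convention that `epsilon` bounds the pairwise distance (that is, balls of radius `epsilon / 2`) are assumptions of the example.

```python
from itertools import combinations
from math import dist


def vietoris_rips(points, epsilon, max_dim=2):
    """Simplices of the Vietoris-Rips complex at scale epsilon.

    A set of points spans a simplex when all its pairwise distances are
    at most epsilon. Simplices are tuples of point indices.
    """
    n = len(points)
    simplices = [(i,) for i in range(n)]
    for k in range(2, max_dim + 2):  # k vertices span a (k - 1)-simplex
        for combo in combinations(range(n), k):
            if all(dist(points[i], points[j]) <= epsilon
                   for i, j in combinations(combo, 2)):
                simplices.append(combo)
    return simplices


# Corners of the unit square: sides have length 1, diagonals about 1.414.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
small = vietoris_rips(square, epsilon=1.0)
large = vietoris_rips(square, epsilon=1.5)

print(len(small), len(large))
```

At the smaller scale only the four sides appear, leaving a 1-dimensional hole; at the larger scale the diagonals and all four triangles are added and the hole is filled. Increasing the scale step by step yields the nested sequence of complexes of the filtration.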
As previously mentioned, for every $i \leq j$ we have an inclusion map from $K_i$ to $K_j$ in a filtration. Therefore, we have an induced homomorphism $f_p^{i,j}: H_p(K_i) \to H_p(K_j)$ between the corresponding homology groups.
Definition 8 (Persistent homology)
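The corresponding sequence of homology groups connected by homomorphisms obtained from a filtration $\emptyset = K_0 \subseteq K_1 \subseteq \cdots \subseteq K_n = K$:
$$0 = H_p(K_0) \to H_p(K_1) \to \cdots \to H_p(K_n) = H_p(K)$$
is called the $p$-th persistent homology of $K$.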
As a next step, the births and deaths of homology classes along the filtration are stored as 2-dimensional points.
Definition 9 (persistence diagrams)
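A persistence diagram is a multiset of 2-dimensional points in the extended real plane. Let $\beta_p^{i,j}$ denote the $p$-th persistent Betti number, that is, the rank of the image of the homomorphism $H_p(K_i) \to H_p(K_j)$ induced by inclusion, and let $\mu_p^{i,j}$ be the number of $p$-dimensional homology classes born at $K_i$ and dying entering $K_j$. We have
$$\mu_p^{i,j} = (\beta_p^{i,j-1} - \beta_p^{i,j}) - (\beta_p^{i-1,j-1} - \beta_p^{i-1,j})$$
for all $i < j$ and all $p$. Then, the $p$-th persistence diagram of a filtration given by $f$, denoted as $\mathrm{Dgm}_p(f)$, is the multiset of points $(a_i, a_j)$, where $a_i$ is the function value at which $K_i$ appears, with multiplicity $\mu_p^{i,j}$ (together with the points of the diagonal with infinite multiplicity, by convention).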
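For dimension 0, the persistence diagram of a Vietoris-Rips filtration can be obtained with a union-find structure, since it only tracks how connected components merge. The sketch below is an illustration under stated assumptions (Euclidean metric, edge length used as filtration value, hypothetical function name `persistence_0`), not the code used in the experiments.

```python
from itertools import combinations
from math import dist


def persistence_0(points):
    """0-th persistence pairs (birth, death) of a Vietoris-Rips filtration.

    Every point is born at scale 0. Edges are added by increasing length;
    when an edge joins two components, one of them dies at that length.
    The last surviving component never dies (death = infinity).
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    edges = sorted((dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(len(points)), 2))
    diagram = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge merges two components
            parent[ri] = rj
            diagram.append((0.0, length))
    diagram.append((0.0, float("inf")))
    return diagram


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
diagram = persistence_0(points)
print(diagram)
```

Because all components are born at the same time, the elder rule does not need to choose between them here: each merge simply records one death.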
Finally, two persistence diagrams can be compared using a distance. The following can be considered the most common one, and the one that we will use in the next sections.
Definition 10 (bottleneck distance)
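The bottleneck distance between two persistence diagrams $X$ and $Y$ is:
$$d_B(X, Y) = \inf_{\eta: X \to Y} \; \sup_{x \in X} \| x - \eta(x) \|_\infty$$
where $\eta$ is any possible bijection between $X$ and $Y$.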
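For very small diagrams, the definition can be evaluated directly by trying every bijection. In the sketch below, matching a point to the diagonal is modelled by padding each diagram with placeholder entries; the function name `bottleneck`, the restriction to finite points and the brute-force search are assumptions of this illustration, and practical libraries use much faster matching algorithms.

```python
from itertools import permutations


def bottleneck(diagram_a, diagram_b):
    """Bottleneck distance between two small persistence diagrams.

    Each diagram is a list of (birth, death) points with finite coordinates.
    `None` stands for a diagonal point, so every point may also be matched
    to the diagonal. Brute force over all bijections: tiny inputs only.
    """
    a = list(diagram_a) + [None] * len(diagram_b)
    b = list(diagram_b) + [None] * len(diagram_a)

    def cost(p, q):
        if p is None and q is None:
            return 0.0                     # diagonal matched to diagonal
        if p is None:
            return (q[1] - q[0]) / 2.0     # L-infinity distance to the diagonal
        if q is None:
            return (p[1] - p[0]) / 2.0
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

    best = float("inf")
    for perm in permutations(range(len(b))):
        worst = max((cost(a[i], b[j]) for i, j in enumerate(perm)), default=0.0)
        best = min(best, worst)
    return best


d1 = [(0.0, 4.0), (1.0, 1.5)]
d2 = [(0.0, 3.0)]
distance = bottleneck(d1, d2)
print(distance)
```

In this example the long-lived points are matched to each other and the short-lived point is matched to the diagonal, so points close to the diagonal contribute little to the distance.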
Let us now describe a simple example as an illustration of these concepts. It is composed of the three datasets shown in Figure 2. The first one samples a circle (see Figure 2(a)), the second one is a noisy version of the same circle (see Figure 2(b)), and the last one is composed of two circles (see Figure 2(c)). The Vietoris-Rips filtration using the Euclidean metric was computed to obtain the persistence diagrams shown in Figure 3. The blue and orange points of the persistence diagrams correspond, respectively, to the 0-th and 1-st persistent homology, with birth and death values as their coordinates. In Figure 3(a), the 1-st persistent homology presents just one point, which corresponds to the hole of the circle. In Figure 3(b), however, some points appear close to the diagonal; they can be considered noise because they correspond to short-lived classes. Finally, the two orange points in Figure 3(c) belong to the 1-st persistent homology and correspond to the two holes, one for each circle of Figure 2(c). A graphical description of the bottleneck distance is shown in Figure 4.
3 Description of the methodology
Next, we describe the methodology based on TDA techniques designed to automatically compare different literary styles. Broadly speaking, given a corpus composed of writings belonging to different categories (e.g., authors, styles, trends, …), a stemming process (which we call stem) is applied to each writing, in which the non-informative words (also called stop-words) are deleted. Then, the skipgram word embedding (described in Section 2.1) is applied to the vocabulary of the corpus, obtaining a high-dimensional representation of the words as a point cloud. Finally, the persistence diagrams of the Vietoris-Rips filtrations of the point clouds corresponding to the writings of the different categories are compared using the bottleneck distance. The pseudocode of this methodology, applied to the experiment on Spanish Golden Age poets presented in Section 4, is described in Algorithm 1.
In this section, we justify the methodology presented above and thoroughly describe the experimentation process.
4.1 The context: Spanish Golden Age literature
The Spanish Golden Age literature is a complex framework still alive in the sense that it remains an appealing subject for the literary experts. In this section, we will provide a justification from the literary experts that supports the following experimentation, and give the preliminary literary notions needed to understand it.
The main two concepts in the traditional philological techniques to study literature styles are the signifier and the signified, terms that come from Saussure’s terminology . Signifier and signified compose the so called linguistic sign which relates a concept with an abstract image in our mind.
The signifier is both the sound and its “acoustic image”, and the signified is not just the concept but a complex content that depends, in most cases, on the context. Both are important for the effect the poet seeks. To establish comparisons between the literary styles of poets of the Spanish Golden Age, we follow a basic philological reference , written by the 20th-century Spanish poet Dámaso Alonso. According to Dámaso Alonso, some signifiers can evoke something specific. An example that he provides is the following verse from Góngora:
infame turba de nocturnas aves
Both syllables tur, from turba and nocturnas, evoke obscurity. Moreover, these can be regarded as partial signifiers: any accent, syllable, … can be considered a signifier with its own signified. However, we are interested in what we consider the inner “stylistic configurations” of the sentences, in order to capture them with the word2vec embedding. Following the study developed by Dámaso Alonso, poets draw on different stylistic configurations for their verses. The first one we would like to comment on can be exemplified by the following verses:
Afuera el fuego, el lazo, el hielo y la flecha
de amor que abrasa, aprieta, enfría y hiere…
We can see that the main concepts of the first verse correspond member by member to those of the second verse, so that the two verses summarize the following four sentences: Afuera el fuego de amor que abrasa; afuera el lazo de amor que aprieta; afuera el hielo de amor que enfría; afuera la flecha de amor que hiere. It can be described by the following formula:
$$A\,(x_1, x_2, x_3, x_4)\; B\,(y_1, y_2, y_3, y_4)$$
that summarizes the sentences $A\,x_i\,B\,y_i$ for $i = 1, \dots, 4$. In the example above, $A$ is afuera and $B$ is de amor. Another kind of resource is the reiterative correlation plurality described in depth in . Let us give an example with a sonnet of Lope de Vega (see Poem 1), where Lope de Vega applies Dámaso Alonso’s notion of correlation by dissemination and recollection.
In this case, we again have correlation, but it is not as rigid as in the first example and it is harder to distinguish. The first correlation is disseminated in the quatrains (the arrows pointing to the verses in Poem 1), and the second is recollected in the last verse of the sonnet.
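The rigid, member-by-member correlation of the first example (Afuera el fuego, el lazo, el hielo y la flecha) can be made explicit with a few lines of code. This is only an illustration of how the two pluralities are paired; the function name `expand_correlation` and its arguments are hypothetical and not part of Dámaso Alonso's formalism.

```python
def expand_correlation(prefix, first_members, link, second_members):
    """Expand a two-plurality correlation into its member-by-member sentences."""
    if len(first_members) != len(second_members):
        raise ValueError("both pluralities need the same number of members")
    return [f"{prefix} {x} {link} {y}"
            for x, y in zip(first_members, second_members)]


sentences = expand_correlation(
    "afuera",
    ["el fuego", "el lazo", "el hielo", "la flecha"],
    "de amor que",
    ["abrasa", "aprieta", "enfría", "hiere"],
)
for sentence in sentences:
    print(sentence)
```

The disseminated correlation of Poem 1 does not follow such a fixed positional pattern, which is why it is harder to detect.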
By describing these techniques, we want to convey the following idea to the reader: poets used different methods that concern the configuration of the verses, one example being the correlation recalled here. Hence, our aim with the word2vec algorithm is to capture this kind of configuration. We are aware that it is impossible, for now, to determine exactly which literary methods an embedding algorithm captures, or even whether it captures any. However, it can find similarities between words and their use by taking into consideration the context of the words. Therefore, it seems natural, as a first approach, to check whether word2vec with its skipgram variation can imitate, or be used instead of, the traditional methods in order to distinguish different literary styles.
[Poem 1: the mathematical formulation to study the architecture of the sonnets introduced by Dámaso Alonso, and his comment.]
Luis de Góngora and Lope de Vega are both important poets of the so-called Spanish Golden Age. Traditionally, it is said that Luis de Góngora started the Culteranismo literary trend and that Lope de Vega is related to an opposite trend called Conceptismo, which had its major representative in Francisco de Quevedo [7, 30]. See also  where it is claimed that both trends are related but with elements that distinguish them. However, there are discrepancies among the literary experts. For example, in , Dámaso Alonso made a thorough study of Lope de Vega, and he even compared this author with Góngora. He stated that Góngora’s work had a discontinuous influence on Lope de Vega’s work. So, it might not be possible (and it is natural that it is not) to establish a rigid difference between such literary trends. In fact, poets evolve throughout their entire productive life, and the different literary trends can be inspired or fed by other trends. We also recommend  as a study of the context of these three poets.
4.2 The corpus and the preprocessing step
The corpus we used is a large dataset composed of the sonnets of different Spanish Golden Age poets.
Then, each sonnet was pruned by means of a stemming process. Some words carry no value in terms of meaning or do not provide structure to the sentence, such as prepositions and articles: de, el, la… As they can be considered noise for our purposes, we erased them from the sonnets. Besides, some words are shortened to their root in order to prevent the word2vec algorithm from treating different verb tenses, or words with different grammatical gender, as different words. The procedure we applied to delete these non-informative words (also called stop-words) is implemented in the NLTK library .
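The two preprocessing operations, stop-word removal and stemming, can be sketched as follows. The paper relies on NLTK for this step; the code below deliberately avoids that dependency and uses a tiny hypothetical stop-word list and a naive suffix-stripping rule, so it should be read as a toy stand-in and not as the actual procedure.

```python
# Hypothetical, deliberately tiny resources (not the NLTK ones).
STOP_WORDS = {"de", "el", "la", "y", "que"}
SUFFIXES = ("as", "es", "os", "a", "e", "o", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(verse):
    """Lowercase, drop punctuation and stop-words, and stem what remains."""
    tokens = [t.strip(".,;:!?…").lower() for t in verse.split()]
    return [naive_stem(t) for t in tokens if t and t not in STOP_WORDS]


tokens = preprocess("infame turba de nocturnas aves")
print(tokens)
```

After this step, variants of a word that differ only in number or gender tend to share one token, which reduces the size of the vocabulary seen by the embedding.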
4.3 Application of the word2vec algorithm
This step consists of applying the skipgram variation of the word2vec algorithm. Specifically, we used the implementation of this algorithm provided by the Python library gensim.
4.4 The filtration and the Bottleneck distance
Having the high-dimensional representation of the words that compose the different sonnets of the dataset, we compute the Vietoris-Rips filtration. The metric used to compute the Vietoris-Rips filtration is the cosine distance because it measures similarity between words by the angle of their vectors, and it is the common distance applied in the word2vec algorithm (see ). As a result, we have three different 0-th persistence diagrams, one for each poet.
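The cosine distance between word vectors can be computed for a whole point cloud at once. The sketch below assumes NumPy is available, that no vector is zero, and that the distance is taken as one minus the cosine similarity; the function name `cosine_distance_matrix` and the toy vectors are hypothetical.

```python
import numpy as np


def cosine_distance_matrix(vectors):
    """Pairwise cosine distances, 1 - cos(angle), between row vectors."""
    x = np.asarray(vectors, dtype=float)
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)  # normalize each row
    similarity = np.clip(unit @ unit.T, -1.0, 1.0)       # guard rounding errors
    return 1.0 - similarity


# Toy 2-dimensional "embeddings": same direction, orthogonal, opposite.
vectors = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]]
D = cosine_distance_matrix(vectors)
print(np.round(D, 3))
```

Vectors pointing in the same direction are at distance 0 regardless of their length, orthogonal vectors are at distance 1, and opposite vectors are at distance 2. A matrix of this kind is the typical input of a Vietoris-Rips computation on a precomputed metric.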
The methodology shown in Algorithm 1, with the specific procedures and parameters described in Subsections 4.2, 4.3 and 4.4, was applied repeatedly. The bottleneck distances obtained in these repetitions are shown in Figure 5 using a box-plot representation.
|Source of variation|Sum of Squares|DF|Mean Square|F|p-value|

|Factors|Mean difference|Standard Error|p-value|95% CI|
|---|---|---|---|---|
|A - B|-0.000386|0.000442|1.0000|-0.00146 to 0.000690|
|A - C|0.0110|0.00155|<0.0001|0.00721 to 0.0148|
|B - A|0.000386|0.000442|1.0000|-0.000690 to 0.00146|
|B - C|0.0114|0.00150|<0.0001|0.00771 to 0.0150|
|C - A|-0.0110|0.00155|<0.0001|-0.0148 to -0.00721|
|C - B|-0.0114|0.00150|<0.0001|-0.0150 to -0.00771|
Extracting knowledge from increasingly complex datasets is hard work that requires the help of techniques coming from other fields of science. In this sense, representing the data as points of a metric space builds a bridge between research fields that are apparently far apart. The use of TDA techniques is a new research area that provides tools for comparing properties of point clouds in high-dimensional spaces and, therefore, for comparing the datasets represented by such point clouds.
In this paper, we propose the use of such TDA techniques to compare different stylistic trends in literature. In this approach, the bottleneck distance between the persistence diagrams of the Vietoris-Rips filtrations obtained from the point clouds representing the sonnets of two different writers encodes the differences between their literary styles and quantifies the nearness between them.
This novel approach opens a door to the interaction between TDA and philological research. TDA techniques can be applied to give a topological description of a work, a writer or an age, and to examine more deeply whether they belong to a broader trend. In addition, philology can suggest new ways to measure the nearness between styles, which can be useful for applying TDA techniques in other application areas.
- The model we used is the one implemented in the python library gensim which is based on [18, 20].
- The code is available in https://github.com/Cimagroup/Towards-a-Philological-Metric-Through-a-TDA-Approach
- Free translation. Original comment in Spanish.
- The dataset can be found in https://github.com/bncolorado/CorpusSonetosSigloDeOro
- In a box-plot, the upper horizontal line corresponds to the maximum value and the lower horizontal line to the minimum value. The horizontal line in the middle of the box corresponds to the median, the top of the box is the third quartile, and the bottom of the box is the first quartile. Finally, the circles correspond to outliers.
- MedCalc software (https://www.medcalc.org/index.php) was used to do the statistical validation.
- [1] (2019) Word embeddings: A survey. CoRR abs/1901.09069.
- [2] (1944) Versos plurimembres y poemas correlativos: capítulo para la estilística del siglo de oro. Sección de Cultura e Información Artes Gráficas Municipales, Madrid. Separata de: Revista de la Biblioteca, Archivo y Museo, año XIII, núm. 49 (1944).
- [3] (1966) Poesía española: ensayo de métodos y límites estilísticos: Garcilaso, Fray Luis de León, San Juan de la Cruz, Góngora, Lope de Vega, Quevedo. Biblioteca románica hispánica: Estudios y ensayos, Editorial Gredos.
- [4] S. Bengio, H. M. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi and R. Garnett (Eds.) (2018) Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3-8 December 2018, Montréal, Canada.
- [5] (2017) Enriching word vectors with subword information. TACL 5, pp. 135–146.
- [6] N. Calzolari, K. Choukri, A. Gangemi, B. Maegaard, J. Mariani, J. Odijk and D. Tapias (Eds.) (2006) Proceedings of the Fifth International Conference on Language Resources and Evaluation, LREC 2006, Genoa, Italy, May 22-28, 2006. European Language Resources Association (ELRA).
- [7] (1987) Sobre los orígenes del conceptismo andaluz: Alonso de Bonilla. Boletín del Instituto de Estudios Giennenses (130), pp. 59–84.
- [8] K. Chaudhuri and R. Salakhutdinov (Eds.) (2019) Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA. Proceedings of Machine Learning Research, Vol. 97, PMLR.
- [9] (1965) Curso de lingüística general. Filosofía y teoría del lenguaje, Editorial Losada.
- [10] (2009) Encyclopedia of distances. Springer Berlin Heidelberg.
- [11] (2010) Computational topology, an introduction. American Mathematical Society.
- [12] (2018) Topological signature of 19th century novelists: persistent homology in text mining. Big Data and Cognitive Computing 2 (4).
- [13] (2006) A closer look at skip-gram modelling. In Proceedings of the Fifth International Conference on Language Resources and Evaluation, LREC 2006 [6], pp. 1222–1225.
- [14] (2018) Patent keyword extraction algorithm based on distributed representation for patent classification. Entropy 20, pp. 104.
- [15] (2008) Quantitative methods in linguistics. Blackwell Pub.
- [16] (2019) Scalable topological data analysis and visualization for evaluating data-driven models in scientific applications. CoRR abs/1907.08325.
- [17] (2002) NLTK: the natural language toolkit. In Proceedings of the ACL Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics. Philadelphia: Association for Computational Linguistics.
- [18] (2013) Efficient estimation of word representations in vector space. CoRR abs/1301.3781.
- [19] (2013) Exploiting similarities among languages for machine translation. CoRR abs/1309.4168.
- [20] (2013) Distributed representations of words and phrases and their compositionality. In Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2, NIPS’13, USA, pp. 3111–3119.
- [21] (2018) Sobre la oposición entre culteranismo y conceptismo. Universitas Tarraconensis. Revista de Filologia (6), pp. 55–62.
- [22] A. Moschitti, B. Pang and W. Daelemans (Eds.) (2014) Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL. ACL.
- [23] (1984) Elements of Algebraic Topology. Addison Wesley Publishing Company.
- [24] (2016) Metrical annotation of a large corpus of Spanish sonnets: representation, scansion and evaluation. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), Portorož, Slovenia, pp. 4360–4364.
- [25] (2014) GloVe: global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014 [22], pp. 1532–1543.
- [26] (2017) The advantages and disadvantages of using qualitative and quantitative approaches and methods in language “testing and assessment” research: a literature review. Journal of Education and Learning 6 (21), pp. 102–112.
- [27] (2019) Topological data analysis of decision boundaries with application to model selection. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019 [8], pp. 5351–5360.
- [28] (2019) A topological data analysis based classification method for multiple measurements. CoRR abs/1904.02971.
- [29] (2002) Góngora, Lope, Quevedo. Poesía de la Edad de Oro, II. Alicante: Biblioteca Virtual Miguel de Cervantes.
- [30] (2016) The Spanish Golden Age sonnet. Iberian and Latin American Studies, University of Wales Press.
- [31] (2018) Local homology of word embeddings.
- [32] (2018) On the dimensionality of word embedding. In Advances in Neural Information Processing Systems 31, NeurIPS 2018 [4], pp. 895–906.