
Mapping the stereotyped behaviour of freely-moving fruit flies

Abstract

A frequent assumption in behavioural science is that most of an animal’s activities can be described in terms of a small set of stereotyped motifs. Here we introduce a method for mapping an animal’s actions, relying only upon the underlying structure of postural movement data to organise and classify behaviours. Applying this method to the ground-based behaviour of the fruit fly, Drosophila melanogaster, we find that flies perform stereotyped actions roughly 50% of the time, discovering over 100 distinguishable, stereotyped behavioural states. These include multiple modes of locomotion and grooming. We use the resulting measurements as the basis for identifying subtle sex-specific behavioural differences and revealing the low-dimensional nature of animal motions.

I Introduction

The concept of stereotypy–that an organism’s behaviours can be decomposed into discrete, reproducible elements–has influenced the study of ethology, behavioural genetics, and neuroscience for decades (2); (3). Animals possess the ability to move in a vast continuum of ways, theoretically constrained only by the biomechanical limits of their own morphology. Despite this, the range of behavioural actions typically performed by an animal is thought to be much smaller, constructed largely of stereotyped actions that are consistent across time, individuals, and, in some cases, even species (4); (5). A discrete behavioural repertoire can potentially arise via a number of mechanisms, including mechanical limits of gait control, habit formation, and selective pressure to generate robust or optimal actions. In many instances, the search for an individual behavioural neural circuit or gene begins with the assumption that a particular action of interest is stereotyped across time and individuals (6); (7).

Despite the centrality of this concept, with few exceptions (8); (9); (10); (11); (12), the existence of stereotypy has not been probed experimentally. This is largely due to the lack of a comprehensive and compelling mathematical framework for behavioural analysis. Here, we introduce a new method for quantifying postural dynamics that retains an animal’s full behavioural complexity, using the fruit fly Drosophila melanogaster as a model organism to discover and map stereotyped motions.

Most prior methods for quantifying animal behaviour lie in one of two regimes. One of these is the use of coarse metrics such as a gross activity level (e.g. mean velocity or number of times the organism crosses a barrier) or counting the relative frequencies of particular events engrained into the experimental set-up (e.g. turning left or right in a maze). While this approach allows for high-throughput analysis of various organisms, strains, and species, only the most gross aspects of behaviour can be ascertained, potentially overlooking the often subtle effects of the manipulations of interest that are only apparent at a finer descriptive level. The other common approach for behavioural quantification is to use a set of user-defined behavioural categories. These categories, such as walking, grooming, or fighting, are codified heuristically and scored either by hand or, more recently, via supervised machine-learning techniques (13); (14); (15); (16); (17). While the latter approach allows for higher throughput and more consistent labeling, it remains prone to human bias and anthropomorphism and often precludes objective comparisons between data sets due to the reliance on subjective definitions of behaviour. Furthermore, these analyses assume, a priori, that stereotyped classes of behaviour exist without first showing, from the data, that an organism’s actions can be meaningfully categorised in a discrete manner.

Ideally, a behavioural description should manifest itself directly from the data, based upon clearly-stated assumptions, each with testable consequences. The basis of our approach is to view behaviour as a trajectory through a high-dimensional space of postural dynamics. In this space, discrete behaviours correspond to epochs in which the trajectory exhibits pauses, corresponding to a temporally-extended bout of a particular set of motions. Epochs that pause near particular, repeatable positions represent stereotyped behaviours. Moreover, moments in time in which the trajectory is not stationary, but instead moves rapidly, correspond to non-stereotyped actions.

In this paper, we construct a behavioural space for freely-moving fruit flies. We observe that the flies exhibit approximately 100 stereotyped behaviours that are interspersed with frequent bouts of non-stereotyped behaviours. These stereotyped behaviours manifest themselves as distinguishable peaks in the behavioural space and correspond to recognizably distinct behaviours such as walking, running, head grooming, wing grooming, etc. Using this framework, we begin to address biological questions about the underlying postural dynamics that generate behaviour, opening the door for a wide range of other inquiries into the dynamics, neurobiology, and evolution of behaviour.

II Experiments

We probed the spontaneous behaviours of ground-based flies (Drosophila melanogaster) in a largely featureless circular arena (Fig 1). Under these conditions, flies display a multitude of complex, non-aerial behaviours such as locomotion and grooming, typically involving multiple parts of their bodies. To capture dynamic rearrangements of the fly’s posture, we recorded video of individual behaving animals with sufficient spatiotemporal resolution to resolve moving body parts such as the legs, wings, and proboscis.

Figure 1: Schematic of the imaging apparatus.

We designed our arena based on previous work which showed that a thin chamber with gently sloping sides prevents flies from flying, jumping, and climbing the walls (18). To keep the flies in the focal plane of our camera, we inverted the previous design. Our arena consists of a custom-made vacuum-formed, clear PETG plastic dome 100mm in diameter and 2mm in height with sloping sides at the edge clamped to a flat glass plate. The edges of the plastic cover are sloped to prevent the flies from being occluded and to limit their ability to climb upside-down on the cover. The underside of the dome is coated with a repellent silane compound (heptane and 1,7-dichloro-1,1,3,3,5,5,7,7-octamethyltetrasiloxane) to prevent the flies from adhering to the surface. In practice, we find that this set-up results in no bouts of upside-down walking.

Over the course of these experiments, we studied the behaviour of 59 male and 51 female D. melanogaster (Oregon-R strain). Each animal was imaged with a high-speed camera. A proportional-integral-derivative (PID) feedback algorithm is used to keep the moving fly inside the camera frame by controlling the position of the X-Y stage based on the camera image in real time. In each frame, we focus our analysis on a square region of pixels containing the fly. We imaged each of the flies for one hour. All aspects of the instrumentation are controlled by a single computer using a custom-written LabView graphical user interface.
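The feedback step can be illustrated with a minimal discrete PID loop. The gains, time step, and one-dimensional geometry below are illustrative assumptions, not the tuning of the actual instrument (which is controlled from LabView):

```python
class PID:
    """Minimal discrete PID controller (illustrative gains, not the
    instrument's actual tuning)."""
    def __init__(self, kp, ki, kd, dt=1.0):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Toy loop: move the stage so a fly sitting 5 mm off-centre is re-centred
# in the camera frame.
fly_position, stage = 5.0, 0.0
pid = PID(kp=0.5, ki=0.01, kd=0.1)
for _ in range(300):
    stage += pid.update(fly_position - stage)
print(abs(fly_position - stage) < 0.05)  # → True: fly re-centred
```

The real controller additionally contends with camera latency and two-dimensional stage motion; this sketch only shows the control law itself.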

Each of these flies was isolated within 4 hours of eclosion, and imaging occurred 1-14 days afterwards. Flies were placed into the arena via aspiration and were subsequently allowed 5 minutes for adaptation before data collection (Fig S1). All recording occurred between the hours of 9:00 AM and 1:00 PM, reducing the effect of circadian rhythms, and the temperature was held fixed across all recordings.

III Behavioural analysis

The general framework of our analysis is described in Fig 2. Images are first segmented and registered in order to isolate the fly from the background and enforce translational and rotational invariance. After this, they are decomposed into postural time series and converted into wavelet spectrograms, thus creating a spatio-temporal representation of the fly’s dynamics within the images. These spectrograms are used to construct spectral feature vectors that we embed into two dimensions using t-Distributed Stochastic Neighbor Embedding (19). Lastly, we estimate the probability distribution over this two-dimensional space and identify resolvable peaks in the distribution. We confirm that sustained pauses near these peaks correspond to discrete behavioural states.

Figure 2: Overview of the data analysis pipeline. Raw images of the Drosophila melanogaster are segmented from the background, rescaled to a reference size, and then aligned, creating a stack of images in the co-moving and co-rotating frame of the fly. These images are then decomposed via PCA into a relatively low-dimensional set of time series. A Morlet wavelet transform is subsequently applied to these time series, creating a spectrogram for each postural mode separately. After normalization, each point in time is mapped into a 2-dimensional plane via t-SNE (19). Lastly, a watershed transform is applied to a gaussian-smoothed density over these points, isolating individual peaks from one another.

III.1 Image segmentation and registration

Given a sequence of images, we wish to build a spatio-temporal representation for the fly’s postural dynamics. We start by isolating the fly within each frame, followed by rotational and translational registration to produce a sequence of images in the coordinate frame of the insect. Details of these procedures are listed in Appendix A. In brief, we apply Canny’s method for edge detection (20), followed by morphological dilation and erosion, to create a binary mask for the fly. After applying this mask, we rotationally align the images via polar cross-correlation with a template image, similar to previously developed methods (21); (22); (23). We then use a sub-pixel cross-correlation to translationally align the images (24). Lastly, every image is re-sized so that, on average, each fly’s body covers the same number of pixels. An example segmentation and alignment is shown in Supplementary Movie S1.
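The translational-registration step can be sketched in a few lines. This toy version uses an integer-pixel phase correlation in place of the sub-pixel method of (24), and omits the edge-detection and rotational steps:

```python
import numpy as np

def translation_offset(image, template):
    """Integer-pixel translational registration via FFT phase correlation;
    a simplified stand-in for the sub-pixel method cited in the text (24)."""
    f = np.fft.fft2(image) * np.conj(np.fft.fft2(template))
    corr = np.fft.ifft2(f / (np.abs(f) + 1e-12)).real   # phase correlation
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    shape = np.array(corr.shape)
    shift = ((np.array([dy, dx]) + shape // 2) % shape) - shape // 2
    return int(shift[0]), int(shift[1])

# Toy example: a small bright "fly" shifted by (3, -2) pixels is recovered.
template = np.zeros((64, 64))
template[30:34, 30:34] = 1.0
image = np.roll(template, (3, -2), axis=(0, 1))
print(translation_offset(image, template))  # → (3, -2)
```

Normalising the cross-power spectrum by its magnitude makes the correlation peak sharp and insensitive to overall brightness, which is why phase correlation is a common choice for this step.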

III.2 Postural decomposition

As the fly body is made up of relatively inflexible segments connected by mobile joints, the number of postural degrees of freedom is relatively small when compared to the 40,000 pixels in each image. Accordingly, a natural representation for the fly’s posture would be to enumerate the relative angles of each of the fly’s appendages as a function of time (25); (26); (27). Extracting these variables directly from the images, however, is prohibitively difficult due to occlusions and the complex fly limb and wing geometry.

As an alternative strategy, we find that nearly all of the variance in the pixel images can be explained by projecting the observed pixel values onto a Euclidean space of just 50 dimensions. We apply Principal Component Analysis (PCA) to Radon transforms of the images. PCA is a frequently-used method for converting a set of correlated variables into a set of linearly uncorrelated eigenmodes. The result of this analysis is the space spanned by the K eigenvectors of the data covariance matrix corresponding to the K largest eigenvalues out of the total latent dimensionality of the data. While, in general, there is no rigorous manner to choose K, here we keep all modes containing correlations larger than the finite sampling error within our data set. According to this heuristic, we set K = 50 (Fig 3(c)), a number of modes explaining approximately 93% of the observed variation (Fig 3(d)). Details of this computation can be found in Appendix B.
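One way to implement the mode-selection heuristic (comparing the eigenvalue spectrum against that of shuffled data, as in Fig 3(c)) is sketched below on synthetic data with a known latent dimensionality of three. The toy data and shuffling scheme are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data set: 2,000 "images" of 30 pixels whose true variance lives in
# only 3 latent directions plus small pixel noise.
latent = rng.normal(size=(2000, 3)) * np.array([10.0, 7.0, 5.0])
mixing = rng.choice([-1.0, 1.0], size=(3, 30))
data = latent @ mixing + 0.1 * rng.normal(size=(2000, 30))

def pca_eigenvalues(X):
    """Eigenvalues of the data covariance matrix, sorted descending."""
    Xc = X - X.mean(axis=0)
    return np.sort(np.linalg.eigvalsh(np.cov(Xc.T)))[::-1]

true_eigs = pca_eigenvalues(data)

# Shuffle each pixel independently across frames: this destroys correlations
# while preserving marginal variances, giving a finite-sampling noise floor.
shuffled = np.column_stack([rng.permutation(data[:, j]) for j in range(30)])
noise_eigs = pca_eigenvalues(shuffled)

# Keep every mode whose eigenvalue rises above the shuffled spectrum.
K = int(np.sum(true_eigs > noise_eigs.max()))
print(K)  # → 3
```

The same comparison on the fly data yields K = 50 modes above the sampling-noise floor.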

We refer to these directions of correlated variation as postural modes. As seen in Fig 3(b), these modes are fly-like in appearance but do not lend themselves to intuitive interpretation. However, by projecting individual images onto these axes, we can convert a movie of fly behaviour into a 50-dimensional time series,

y(t) = ( y_1(t), y_2(t), …, y_50(t) ),    (1)

as exemplified in Fig 3(e).

Figure 3: Generation of spectral feature vectors. (a) Raw image of a fly in the arena. (b) Pictorial representation of the first 5 postural modes after the inverse Radon transform. Black and white regions represent highlighted areas of each mode (with opposite sign). (c) First 1,000 eigenvalues of the data matrix (black) and shuffled data (red). (d) Fraction of cumulative variation explained as a function of number of modes included. (e) Typical time series of the projection along postural mode 6 and (f) its corresponding wavelet transform.

III.3 Spectrogram generation

The instantaneous values of the postural modes do not provide a complete description of behaviour, as our definition of stereotypy is inherently dynamical. Previously published studies of behaviour have searched for motifs – repeated subsequences of finite length – within a behavioural time series (28); (12). However, this paradigm is often confounded by problems of temporal alignment and relative phasing between the component time series. Additionally, certain behaviours (for example, wing grooming in Drosophila) involve multiple appendages moving at different time scales, thus complicating the choice of motif length.

As an alternative to this approach, we use a spectrogram representation for the postural dynamics, measuring the power at frequency f associated with each postural mode i in a window surrounding each moment in time t. More specifically, we compute the amplitudes of the Morlet continuous wavelet transform for each postural mode (29). Although similar to a Fourier spectrogram, wavelets possess a multi-resolution time-frequency trade-off, allowing for a more complete description of postural dynamics occurring at several time scales (30). In particular, the Morlet wavelet is adept at isolating chirps of periodic motion, similar to what we observe in our data set. By measuring only the amplitudes of the transform, we eliminate the need for the precise temporal alignment that is required in any motif-based analysis. Details of these calculations are shown in Appendix C, and an example spectrogram is displayed in Fig 3(f). For the results presented here, we use 25 frequency channels, dyadically spaced between 1 Hz and 50 Hz, the latter being the Nyquist frequency of the system.
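A minimal version of the spectrogram computation is sketched below. The Fourier-domain Morlet implementation, scale-frequency relation, and normalisation are standard textbook choices and stand in for the full construction of Appendix C:

```python
import numpy as np

def morlet_cwt_amplitudes(x, freqs, fs, omega0=5.0):
    """Amplitudes of a Morlet continuous wavelet transform, computed in the
    Fourier domain. A sketch: the scale-frequency relation and normalisation
    are conventional choices, not necessarily those of Appendix C."""
    n = len(x)
    xf = np.fft.fft(x)
    k = 2 * np.pi * np.fft.fftfreq(n, d=1.0 / fs)   # angular frequency (rad/s)
    out = np.empty((len(freqs), n))
    for row, f in enumerate(freqs):
        scale = omega0 / (2 * np.pi * f)            # wavelet scale in seconds
        # Fourier transform of the analytic Morlet wavelet at this scale
        psi_hat = np.pi**-0.25 * np.exp(-0.5 * (scale * k - omega0) ** 2) * (k > 0)
        out[row] = np.abs(np.fft.ifft(xf * psi_hat * np.sqrt(scale)))
    return out

# 25 dyadically spaced channels between 1 Hz and 50 Hz, as in the text.
fs = 100.0
freqs = 2.0 ** np.linspace(np.log2(1.0), np.log2(50.0), 25)
t = np.arange(0, 10, 1 / fs)
x = np.sin(2 * np.pi * 12.0 * t)          # one oscillating "postural mode"
power = morlet_cwt_amplitudes(x, freqs, fs)
peak_freq = freqs[np.argmax(power.mean(axis=1))]
print(peak_freq)  # the dyadic channel nearest 12 Hz
```

Applied to each of the 50 postural-mode time series, this yields the 25 x 50 spectrogram channels that form the feature vectors of the next section.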

III.4 Spatial embedding

The resulting feature vector, S(t), is comprised of 25 frequency channels for each of the 50 eigenmodes, making each point in time represented by a 1,250-dimensional vector encoding the postural dynamics. As correlations, often strong, exist between the various mode-frequency channels, we expect that the dimensionality of the manifold containing the observed values of S(t) should be vastly smaller. As such, we would like to find a low-dimensional representation that captures the important features of the data set.

Our strategy for dimensional reduction of the feature vectors is to construct a low-dimensional space such that trajectories within it pause near a repeatable position whenever a particular stereotyped behaviour is observed. This means that our embedding should minimise any local distortions. However, we do not require preservation of structure on longer length scales. Hence, we chose an embedding that reduces dimensionality by altering the distances between more distant points on the manifold.

Most common dimensionality reduction methods, including PCA, multi-dimensional scaling, and Isomap, do precisely the opposite, sacrificing local verisimilitude in service of larger-scale accuracy (31); (32); (33). One method that does possess this property is t-Distributed Stochastic Neighbor Embedding (t-SNE) (19). Like other embedding algorithms, t-SNE aims to take data from a high-dimensional space and embed it into a space of much smaller dimensionality, preserving some set of invariants as best as possible. For t-SNE, the conserved invariants are related to the Markov transition probabilities of a random walk performed on the data set. Specifically, we define the transition probability from time point i to time point j, p_{j|i}, to be proportional to a Gaussian kernel of the distance d(i, j) (as of yet, undefined) between their feature vectors:

p_{j|i} = exp( −d²(i, j) / 2σ_i² ) / Σ_{k≠i} exp( −d²(i, k) / 2σ_i² ).    (2)

All self-transitions (i.e. p_{i|i}) are set to zero. Each of the σ_i is chosen such that all points have the same transition entropy, H_i = −Σ_j p_{j|i} log₂ p_{j|i} = log₂ 32 = 5 bits. This can be interpreted as restricting transitions to roughly 32 neighbors.
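The bandwidth calibration can be sketched as a bisection on the transition entropy. The synthetic distances below are illustrative:

```python
import numpy as np

def sigma_for_entropy(dists, n_neighbours=32.0, tol=1e-4):
    """Bisect the gaussian bandwidth sigma_i until the transition entropy
    equals log2(n_neighbours) bits -- the standard t-SNE calibration,
    sketched here for the distances from a single point."""
    target = np.log2(n_neighbours)                 # 5 bits for 32 neighbours
    lo, hi = 1e-10, 1e10
    for _ in range(100):
        sigma = np.sqrt(lo * hi)                   # geometric bisection
        # subtract the minimum distance for numerical stability; the
        # constant shift cancels in the normalisation
        w = np.exp(-(dists**2 - dists.min()**2) / (2.0 * sigma**2))
        p = w / w.sum()
        H = -np.sum(p * np.log2(p + 1e-300))
        if abs(H - target) < tol:
            break
        if H > target:
            hi = sigma                             # entropy too high: shrink
        else:
            lo = sigma                             # entropy too low: widen
    return sigma, p

rng = np.random.default_rng(5)
d = rng.random(200) + 0.1                          # distances to 200 points
sigma, p = sigma_for_entropy(d)
eff = 2.0 ** (-np.sum(p * np.log2(p + 1e-300)))
print(round(eff, 1))  # → 32.0 effective neighbours
```

Because entropy increases monotonically with the bandwidth, the bisection always converges to a unique σ_i.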

The t-SNE algorithm then embeds the data points in the smaller space while keeping the new set of transition probabilities, q_{j|i}, as similar to the p_{j|i} as possible. The q_{j|i} are defined similarly to the larger-space transition probabilities, but are now, for technical reasons, proportional to a Cauchy (or Student-t) kernel of the points’ Euclidean distances in the embedded space. This algorithm results in an embedding that minimises local distortions (19). If p_{j|i} is initially very small or zero, it will place little to no constraint on the relative positions of the two points, but if the original transition probability is large, it will factor significantly into the cost function.

This method’s primary drawback, however, is its memory complexity, which scales as O(N²) in the number of embedded points. To incorporate our entire data set into the embedding, we use an importance sampling technique to select a training set of 35,000 data points, build the space from these data, and then re-embed the remaining points into the space as best as possible (see Appendix D for implementation details).

Lastly, we need to define a distance function, d(i, j), between the feature vectors. We desire this function to accurately measure how different the shapes of two mode-frequency spectra are, ignoring the overall multiplicative scaling that occurs at the beginning and the end of behavioural bouts due to the finite nature of the wavelet transform. Simply measuring the Euclidean norm between two spectra would be greatly affected by such amplitude modulations. However, because each feature vector S(t) is composed of a set of wavelet amplitudes, all of its entries are non-negative. As such, if we define

Ŝ_k(t) = S_k(t) / Σ_{k′} S_{k′}(t),    (3)

then we can treat this normalised feature vector as a probability distribution over all mode-frequency channels at a given point in time. Hence, a reasonable distance function is the Kullback-Leibler (KL) divergence (34) between two feature vectors:

d(t_1, t_2) = Σ_k Ŝ_k(t_1) log( Ŝ_k(t_1) / Ŝ_k(t_2) ).    (4)
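The normalisation of Eq. (3), the KL distances of Eq. (4), and the embedding step can be sketched end-to-end on toy feature vectors. Here we hand a symmetrised, precomputed distance matrix to an off-the-shelf t-SNE solver, a simplification of the procedure in Appendix D; the toy data are illustrative:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(1)

# Toy spectral feature vectors: three "behaviours", each a distinct spectral
# shape repeated with random overall amplitude (hypothetical data).
centres = rng.random((3, 40)) + 0.05
feats = np.vstack([c * rng.uniform(0.5, 2.0, size=(60, 1)) for c in centres])
feats += 0.01 * rng.random(feats.shape)

# Eq. (3): normalise each vector into a distribution over channels, which
# removes the overall amplitude scaling.
p = feats / feats.sum(axis=1, keepdims=True)

# Eq. (4): pairwise KL divergences, symmetrised (and clipped at zero) so
# they can be passed to the solver as a precomputed distance matrix.
logp = np.log(p)
D = np.einsum('ik,ijk->ij', p, logp[:, None, :] - logp[None, :, :])
D = np.maximum(0.5 * (D + D.T), 0.0)

embedded = TSNE(n_components=2, metric='precomputed', init='random',
                perplexity=30, random_state=0).fit_transform(D)
print(embedded.shape)  # → (180, 2)
```

The symmetrisation is a convenience for the solver; the method as described works from the asymmetric conditional probabilities of Eq. (2) directly.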

IV Results

IV.1 Embedded space dynamics

Figure 4: Embedding of feature vectors. (a) Training set points embedded into two dimensions via t-SNE. Colour coding is proportional to the logarithm of the feature-vector normalisation factor (the total wavelet power). (b) Probability density function generated from embedding all data points and convolving with a gaussian of small width.

Figure 4 shows the embedding of our spectral feature vectors into two dimensions for all 59 individual male flies. We first note that nearby points have similar total power, even though the embedding algorithm normalises-out variations in the total power of the postural motions. Embedding the same data into three dimensions yields a very similar structure with less than a 2% reduction of the embedding cost function (Eq. 12, Fig S3).

We generated an estimate of the probability density by convolving each point in the embedded map with a gaussian of relatively small width (Fig 4(b)). Far from being uniformly distributed across this space, the density contains a large number of resolved local maxima. The locations of these peaks provide a potential representation for the stereotyped behaviours that the flies perform. As expected, we find that individuals display significantly less intra- than inter-individual variation when their behavioural maps are compared (Fig S4).

This space not only contains peaks, but the trajectory through it also pauses at repeatable locations. Numerically differentiating the two embedded coordinates, we observe a two-state “pause-move” pattern of dynamics. Typical time traces of these coordinates show this type of trajectory, with long stationary periods interspersed with quick bouts of movement (Fig 5(a)). More quantitatively, we find that the distribution of velocities within the embedded space is well-represented by a two-component log-normal mixture model in which the two peaks are separated by almost two orders of magnitude (Fig 5(b)). The distribution of points in the low-velocity case (approximately 45% of all time points) is highly localized with distinguishable peaks (Fig 6). The high-velocity points, in contrast, are more uniformly distributed.
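Fitting the two-component log-normal mixture amounts to fitting a gaussian mixture to log-speeds. The synthetic speed distribution below is an illustrative stand-in for the measured one:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)

# Toy embedded-space speeds: a slow "pause" mode and a fast "move" mode
# separated by about two orders of magnitude, mimicking Fig 5(b).
slow = 10.0 ** rng.normal(-1.0, 0.3, size=4500)   # ~45% of time points
fast = 10.0 ** rng.normal(1.0, 0.3, size=5500)
log_speed = np.log10(np.concatenate([slow, fast]))[:, None]

# A two-component gaussian mixture in log-speed is a log-normal mixture
# in speed; the fitted means locate the two velocity peaks.
gmm = GaussianMixture(n_components=2, random_state=0).fit(log_speed)
means = np.sort(gmm.means_.ravel())
ratio = 10.0 ** (means[1] - means[0])
print(round(ratio))  # separation of the two velocity peaks (~100x)
```

Thresholding at the minimum between the two fitted components then labels each time point as “paused” or “moving”.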

Figure 5: Dynamics within behavioural space. (a) Typical trajectory segment through behavioural space, showing the two embedded coordinates (blue and red). (b) Histogram of velocities in the embedded space fit to a two-component log-gaussian mixture model. The blue bar chart represents the measured probability distribution, the red line is the fitted model, and the cyan and green lines are the mixture components of the fitted model.
Figure 6: Concentration of behavioural space during stereotyped movements. Comparison between the densities generated during stereotyped (a) and non-stereotyped (b) epochs.

IV.2 Behavioural states

The embedded space is comprised of peaks surrounded by valleys. Finding connected areas in the plane such that climbing up the gradient of probability density always leads to the same local maximum, often referred to as a watershed transform (35), we delineate 122 regions of the embedded space. Each of these contains a single local maximum of probability density (Fig 7(a)). When the trajectory pauses at one of these peaks, we find that each such epoch corresponds to the fly performing a particular stereotyped behaviour. These pauses last anywhere from 0.05 s up to nearly 25 s (Fig 8(a)).
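The watershed construction described above can be sketched directly as steepest-ascent labelling on a toy density; a production implementation would use an optimised watershed routine instead:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(3)

# Toy 2-D "behavioural space": three gaussian bumps standing in for peaks.
pts = np.vstack([rng.normal(m, 0.5, size=(300, 2))
                 for m in ([0, 0], [4, 0], [2, 4])])
H, _, _ = np.histogram2d(pts[:, 0], pts[:, 1], bins=60)
density = gaussian_filter(H, sigma=2.0)

def watershed_by_ascent(rho):
    """Label every pixel by the local maximum reached via steepest ascent,
    a direct (if slow) reading of the construction described in the text."""
    nr, nc = rho.shape
    best = np.empty((nr, nc), dtype=int)  # flat index of best 3x3 neighbour
    for i in range(nr):
        for j in range(nc):
            i0, i1 = max(i - 1, 0), min(i + 2, nr)
            j0, j1 = max(j - 1, 0), min(j + 2, nc)
            patch = rho[i0:i1, j0:j1]
            di, dj = np.unravel_index(np.argmax(patch), patch.shape)
            best[i, j] = (i0 + di) * nc + (j0 + dj)
    labels = np.arange(nr * nc)
    flat_best = best.ravel()
    while not np.array_equal(labels, flat_best[labels]):
        labels = flat_best[labels]  # follow the ascent one step further
    return labels.reshape(nr, nc)

regions = watershed_by_ascent(density)
n_basins = len(np.unique(regions[density > 0.05 * density.max()]))
print(n_basins)  # number of basins among the high-density pixels
```

Each connected basin plays the role of one of the 122 behavioural regions; pixels sharing a label climb to the same local maximum.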

Observing segments of the original movies corresponding to pauses in one of the regions, we consistently observe the flies performing a distinct action that corresponds to a recognizable behaviour when viewed by eye (Supplementary Movies S2-11). Many of the movements we detect are similar to familiar, intuitively defined behavioural classifications such as walking, running, front leg grooming, and proboscis extension, but here the segmentation of the movies into behavioural categories has emerged from the data itself, not through a priori definitions. Moreover, we see that nearby regions of our behavioural space correspond to similar, yet distinct, behaviours (Fig 7(c)).

This classification is consistent across individuals (Figures 8-9, Supplementary Movies S3-11). The vast majority of these regions are visited by almost all of the flies at some point (Fig 8(b)). 104 of the 122 regions were visited by over 50 (of 59 total) flies, and the remaining behaviours were all low-probability events, containing, in total, less than 3% of the overall activity.

Figure 7: Segmentation into behavioural regions. (a) Boundary lines obtained from performing a watershed transform on the PDF from Fig 4(b). (b) Integrated probabilities within each of the regions. (c) The organisation of behavioural space into regions of similar movement types. Definition of regions is performed through visual assessment of movies.
Figure 8: Behavioural state dynamics. (a) Distribution of occupancy times in all behaviours. (b) Number of individuals (out of 59 possible) that visit each behaviour at some point during observation.
Figure 9: Behavioural space peaks correspond to specific stereotyped behaviours. Selected regions within behavioural space are shown and are labeled via the colour-coded legend on the right. Instances of dwells within each of these regions can be seen in Supplementary Movies 3-11. The examples displayed in these movies are randomly selected and contain clips from many different flies, showing that the behavioural space provides a representation of behaviour that is consistent across individuals.

IV.3 Behavioural states as periodic orbits

Periodic orbits in postural movements are suggestive of underlying low-dimensional dynamic attractors that produce stable behavioural templates (36). These types of motifs have been hypothesized to form the basis for neural and mechanical control of legged locomotion at fast time scales (37). Because our behavioural mapping algorithm is based upon similarities between postural frequencies exhibited at different times, a potential hypothesis is that pauses in behavioural space correspond to periodic trajectories in the space of postural movements (Eqn 1). In our data, a fast running gait (the bottom-most region of Fig 10(h)) corresponds to periodic oscillations of the postural time series with a clear peak at 12.9 Hz in the power spectral density (Fig 10(a)-(b)). This frequency is in good agreement with previous measurements of the fly walking gait (38); (39).

To systematically investigate the periodicity of the postural dynamics, for each behavioural bout we map time onto a phase variable, a cyclic coordinate defined on the unit circle. This process is usually referred to as phase reconstruction. The method we use, Phaser (40), performs Hilbert transforms to construct phase estimations from several time series separately, then combines these estimates via a maximum likelihood estimate that uses Fourier-series based corrections. Here, we apply Phaser to the postural mode time series, , treating the correlated motions along all 50 postural eigenmodes as synchronized oscillators. We performed this reconstruction for each multi-cycle behavioural bout. After reconstructing the phases for all of the 5,483 bouts of fast running observed in male flies, we observe a clear periodic pattern across several of the postural modes (Fig 10(c)-(f)).
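The core of the phase reconstruction, extracting an instantaneous phase per mode from the analytic signal, can be sketched with a Hilbert transform on toy oscillations. The full Phaser method then fuses and corrects these per-mode estimates (40); the signals below are illustrative:

```python
import numpy as np
from scipy.signal import hilbert

fs = 100.0
t = np.arange(0, 5, 1 / fs)
# Two toy "postural modes" oscillating at a 13 Hz gait frequency with a
# fixed relative phase, as during fast running.
y3 = np.cos(2 * np.pi * 13.0 * t)
y4 = np.cos(2 * np.pi * 13.0 * t - np.pi / 2)

# Instantaneous phase of each mode from its analytic signal -- the first
# step of a Phaser-style reconstruction, which then combines the per-mode
# estimates with Fourier-series based corrections.
phase3 = np.unwrap(np.angle(hilbert(y3)))
phase4 = np.unwrap(np.angle(hilbert(y4)))

freq = np.diff(phase3).mean() * fs / (2 * np.pi)   # mean gait frequency
offset = (phase3 - phase4).mean()                  # relative phase of modes
print(round(freq, 1), round(offset, 2))  # → 13.0 1.57 (i.e. a pi/2 lag)
```

Averaging the mode trajectories as a function of the reconstructed phase, rather than of time, is what produces the phase-averaged orbits of Fig 10.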

This type of analysis also brings additional insight into the subtle distinctions between our observed behavioural states. If we construct phase-averaged orbits for seven of the running behaviours, we observe many differences in the gait dynamics (see Appendix E, Fig 10(g)). For instance, we observe an increase in many mode amplitudes as the gait frequency increases (e.g. in modes 3, 12, and 13), as noted in previous work (41). In addition, we also see subtle changes in phase (e.g. in mode 4), as well as a near-elimination of a period-doubled trajectory (seen in mode 14). This type of observation could allow for a more thorough understanding of speed control and gait transitions in hexapod locomotion.

We also find oscillatory postural dynamics for other stereotyped behaviours, with many behaviours resulting in a periodic orbit in postural space (Fig 10(i)). These behaviours are found in many regions of behavioural space, suggesting that much of behaviour is indeed confined to low-dimensional postural dynamics. It is important to note that periodic trajectories emerge directly from our analysis, even though the wavelet transform used to define our feature vectors does not preserve phase information.

Figure 10: Periodic dynamics within behavioural states. (a) Periodic oscillations in the 3rd and 4th postural eigenmodes during a typical high-frequency running sequence. (b) Average power spectral density for all instances of this behaviour (the bottom-most region in (h)). (c)-(d) Phase reconstruction of the data in (a) for modes three and four (respectively). (e)-(f) Probability densities of projections along the 3rd and 4th modes (respectively) for all instances of the behaviour shown in (a)-(d). The black line is the phase-averaged curve (via Eqn 20). (g) Comparison between the phase-averaged curves for 7 different locomotion gaits. Line colours are proportional to the mean gait frequency. (h) Locomotion gaits from Fig 7(c), colour-coded by mean frequency. The colour scale here is the same as in (g). (i) 3-D plots of the phase-averaged trajectories for 5 different behaviours. The first three postural modes are plotted here. (j) Regions corresponding to the orbits shown in (i), colour-coded to match.

IV.4 Differences in behaviour between males and females

To demonstrate the power of this method to detect subtle differences in behaviour, we compared the behavioural spaces of male and female fruit flies by embedding the postural time series data from females into the behavioural space derived from the male flies (Fig 4). Figure 11(a) displays the male and female behavioural probability densities. We find a striking difference between the two sexes, with locomotory behaviours greatly enhanced but resting and slow motions largely suppressed in females when compared to males. This is in agreement with previous results, showing that young females are more active than their male counterparts (42).

We then sought to isolate subtle behavioural differences between the sexes that are evident in the fine-scale structure of these maps. An example can be seen in the “Wing Movements” portion of the behavioural space (the lower left corner of the map). First, we obtained both male and female region-normalised (R-N) PDFs (Fig 11(c)), in which the behavioural space density within the “Wing Movements” region integrates to one. Within the space of wing movements, we identified regions that show statistically significant differences between the two sexes using a Wilcoxon rank sum test (43) at each point in behavioural space. This test determines the locations at which the median male PDF value and the median female PDF value differ significantly. Regions where significant differences were found are indicated by the dashed lines in Figure 11(d).
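The per-point statistical test can be sketched as follows; the sample values are hypothetical stand-ins for the per-fly PDF values at one location of the map:

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(4)

# Hypothetical per-fly PDF values at a single point of behavioural space:
# 59 males and 51 females, with the female density shifted upward here.
male_vals = rng.normal(1.0, 0.3, size=59)
female_vals = rng.normal(1.4, 0.3, size=51)

stat, p = ranksums(male_vals, female_vals)
print(p < 0.05)  # → True: the medians differ significantly at this point
```

In the full analysis this comparison is repeated at every point of the map, and contiguous significant points form the dashed regions of Fig 11(d).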

Particular behaviours, such as left-wing grooming, are sexually dimorphic (Fig 11(d), solid box; Movies S12-S13). Male-preferred grooming includes a kick of the middle leg on the left side of the body that clears the profile of the wing and moves anteriorly before pushing back towards the posterior. Female-preferred grooming lacks this additional leg movement. We verified these differences by isolating the mean postural-space orbits associated with each of these regions (Figures 11(f), S6). Importantly, while these orbits are statistically different, the average frequencies of the two behaviours are not. We note that these results are consistent across a large range of the behavioural-map smoothing parameter (Fig S5), such that fine-tuning of the spatial structure of the behavioural map is not necessary to obtain the results seen here.

It should be noted that future study is necessary to determine the ethological relevance of these findings and to understand how much of the variance we observe is related to the specifics of our experimental paradigm. However, the fact that these distinctions are found without specifically looking for any of them – emerging only from underlying statistics of the behavioural map – provides quantitative verification that the classifications we make are meaningful. Inherent in any unsupervised classification method is the question of how to validate its accuracy. Here, there is no ground truth with which to compare, since a significant aim of our work is to dispense with a priori behavioural definitions. However, by showing that meaningful distinctions and agglomerations can be made between different behavioural instances, we provide evidence that the approach introduced here can become the basis undergirding a wide range of experimental investigations into the behaviour of animals.

Figure 11: Comparison between male and female behaviours. (a) Measured behavioural space PDF for male (left) and female (right) flies. (b) Difference between the two PDFs in (a). Here we observe large dimorphisms between the sexes, particularly in the “Locomotion Gaits” and “Idle and Slow Movements” regions. (c) PDFs for behaviours in the “Wing Movements” portion of the behavioural space (the lower left of the full space). These PDFs (male on the left and female on the right) are normalised so that they each integrate to one. The black lines are the boundaries found from a watershed transform and are included to guide the eye. (d) Difference between the two normalised behavioural spaces in (c). Dashed lines enclose regions in which the median male and the median female PDF values are statistically different via the Wilcoxon rank sum test. (e) Zoom-in on the boxed region in (d). Both of these regions correspond to left wing grooming, but behaviours within the male-preferred region incorporate an additional leg kick (Supplementary Movies S12-13). (f) Average periodic orbits for postural eigenmodes 1, 2, 6, and 7. The area surrounding the lines represents the standard error of the mean at each point along the trajectory. Average periodic orbits for all of the first 25 postural modes are shown in Fig S6.

V Conclusions

The ability to map and compare the behavioural repertoire of individuals and populations of animals has applications beyond the study of terrestrial dynamics in fruit flies. Combined with tools for genetic manipulation, DNA sequencing, neural imaging, and electrophysiology, the identification of subtle behavioural distinctions and patterns between groups of individuals will impact deep questions related to the interactions between genes, neurons, behaviour, and evolution. In this initial study, we probed the motion of individuals in a largely featureless environment. Extensions to more complicated situations, e.g. where sensory inputs are measured and/or controlled, genes are manipulated, or multiple individuals are present, are readily implemented.

Finally, we note that the only Drosophila-specific step in our analysis pipeline is the generation of the postural eigenmodes. Given movies of sufficient quality and length from different organisms, spectral feature vectors and behavioural spaces can be similarly generated, allowing for potential applications from worms to mice to humans and a greater understanding of how animals behave.

Acknowledgements

We thank Yi Deng, Kieren James-Lubin, Kelsi Lindblad, and Ugne Klibaite for assistance in data collection and analysis, as well as Jessica Cande, David Stern, and David Schwab for discussions and suggestions. JWS and GJB also acknowledge the Howard Hughes Medical Institute Janelia Farm Visitor Program and the Aspen Center for Physics, where many ideas for this work were formulated. This work was funded through awards from the National Institutes of Health (GM098090, GM071508), The National Science Foundation (PHY–0957573, PHY–1066293), the Pew Charitable Trusts, the Swartz Foundation, and the Alfred P. Sloan Foundation.

Data Accessibility

Example code for running the algorithms can be found at https://github.com/gordonberman/MotionMapper. For access to raw image data, please email JWS at shaevitz@princeton.edu.

Appendices

Appendix A Image processing

To isolate the fly from the background, we apply Canny’s method for edge detection (20), resulting in a binary image containing the edge positions. We then morphologically dilate this binary image with a square structuring element in order to close any spurious gaps in the edges, and proceed to fill all closed curves. This filled image is then morphologically eroded by a square of the same size, resulting in a mask. Applying this mask to the original image yields our segmented image.
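
The pipeline above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: a Sobel gradient threshold stands in for Canny's edge detector, and the toy image, threshold, and structuring-element size are all assumptions made here for demonstration.

```python
import numpy as np
from scipy import ndimage

def segment(image, edge_thresh=0.5, se_size=3):
    """Segment a bright object from a dark background.

    A Sobel gradient magnitude + threshold stands in for Canny's edge
    detector; the morphology follows the text: dilate the edge map to
    close gaps, fill all closed curves, then erode back to form a mask.
    """
    gx = ndimage.sobel(image.astype(float), axis=0)
    gy = ndimage.sobel(image.astype(float), axis=1)
    edges = np.hypot(gx, gy) > edge_thresh            # binary edge image
    square = np.ones((se_size, se_size), dtype=bool)  # square structuring element
    dilated = ndimage.binary_dilation(edges, structure=square)
    filled = ndimage.binary_fill_holes(dilated)       # fill closed curves
    mask = ndimage.binary_erosion(filled, structure=square)
    return image * mask                               # segmented image

# Toy example: a bright 20x20 square "fly" on a dark background.
img = np.zeros((64, 64))
img[20:40, 20:40] = 1.0
seg = segment(img)
```

The interior of the object survives segmentation while the background is zeroed out; in practice the threshold would be tuned to the camera's noise level.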

While our tracking algorithm ensures that the fly remains within the image boundaries, the centre of the fly and the orientation within the frame vary over time. Having obtained a sequence of isolated fly images, we next register them both translationally and rotationally with respect to a template image. The template image is generated by taking a typical image of a fly and then manually ablating the wings and legs digitally.

We first align the images rotationally. This is achieved by finding the angle that maximises the cross-correlation between the magnitudes of the 2D polar Fourier transforms of each image and the template. Because all translation information appears in the phase of the 2D Fourier transform, this rotational alignment, based only upon the magnitude of the transform, is independent of any initial translations between the images. Accordingly, once rotational alignment is achieved, we can subsequently register the images translationally via a cross-correlation.
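
The final translational step can be sketched with an FFT-based cross-correlation, as below. This is a simplified illustration of only the translation stage (in the full pipeline the rotational alignment on polar-resampled Fourier magnitudes would precede it), and the test image and shift are arbitrary choices.

```python
import numpy as np

def register_translation(template, image):
    """Return the (row, col) circular shift that maps `image` back onto
    `template`, found at the peak of their FFT-based cross-correlation."""
    f1 = np.fft.fft2(template)
    f2 = np.fft.fft2(image)
    xcorr = np.fft.ifft2(f1 * np.conj(f2)).real
    peak = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    # Interpret peaks past the midpoint as negative shifts.
    return tuple(p if p < s // 2 else p - s for p, s in zip(peak, xcorr.shape))

# Shift a toy image by a known amount and recover the offset.
rng = np.random.default_rng(0)
template = rng.random((64, 64))
image = np.roll(template, shift=(-5, 3), axis=(0, 1))
shift = register_translation(template, image)
```

Applying `np.roll(image, shift, axis=(0, 1))` with the recovered shift restores the template, which is the registration step described in the text.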

Appendix B Postural decomposition from images

The aim of the postural decomposition is to take our set of aligned images and create a lower-dimensional representation that can be made into time series. Naively, one would simply perform PCA on the images, using each pixel value as a separate dimension. The fly images, however, contain too many pixels for this to be feasible given memory limitations.

To make this problem more tractable, we analyse only the subset of these pixels that have non-negligible variance. Many pixels within the fly image are either always zero or always saturated, thus containing almost no dynamical information. Accordingly, we would like to use only a subsample of these measurements. The most obvious approach is to find the pixels with the highest variance and keep only those above a certain threshold. The primary difficulty, however, is that there is no obvious truncation point (Fig S2A). This is most likely because the fly legs can potentially occupy the majority of the pixels in the image but are present in only a relatively small number of them in any given frame. Hence, many of these peripheral pixels have similarly moderate standard deviations, making them difficult to differentiate.

A more compact scheme is to represent the images in Radon-transform space, which more sparsely parameterises lines such as legs or wing veins. After Radon transformation, the probability density function of pixel-value standard deviations has a clear minimum, and we keep pixels whose standard deviation is larger than this value (Fig S2B). This results in keeping 6,763 pixels out of 18,090, retaining approximately 95% of the total variation in the images. If there are N images in our sample, we can represent our data set as an N × 6,763 matrix. We then proceed to calculate the principal directions of variation in these data using PCA, as seen in Fig 3.
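
The Radon-space variance selection followed by PCA can be sketched as below. This is a toy illustration under several assumptions of our own: the Radon transform is implemented naively as rotate-and-project, the image stack is synthetic, and the variance threshold is a fixed number rather than the minimum of the measured PDF.

```python
import numpy as np
from scipy import ndimage

def radon(image, angles):
    """Naive Radon transform: rotate the image, then sum along columns."""
    return np.stack([ndimage.rotate(image, a, reshape=False, order=1).sum(axis=0)
                     for a in angles], axis=0)

rng = np.random.default_rng(1)
# Toy stack of 60 images (32x32): one structured stripe that varies in
# brightness from frame to frame, plus weak pixel noise.
base = np.zeros((32, 32))
base[10:22, 14:18] = 1.0
images = [base * rng.random() + 0.01 * rng.random((32, 32)) for _ in range(60)]

angles = np.linspace(0.0, 180.0, 90, endpoint=False)   # 90 angles, as in Table S1
R = np.array([radon(im, angles).ravel() for im in images])

# Keep only Radon-space pixels with non-negligible variance (threshold
# chosen by hand here; in the text it sits at the minimum of the PDF)...
keep = R.std(axis=0) > 0.05
D = R[:, keep]

# ...then find the principal directions of variation via PCA.
Dc = D - D.mean(axis=0)
cov = Dc.T @ Dc / (len(Dc) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]     # descending order
```

Because the synthetic variation is driven by a single brightness factor, the leading eigenmode dominates the retained variance, mirroring how the real eigenmodes capture correlated postural changes.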

Lastly, the question remains of how many modes to keep in our analysis, a task made more ambiguous by the smoothness of the eigenvalue spectrum. Our approach to determining the truncation point is to compare the PCA eigenvalues with a null model based on the noise properties of our data set. Specifically, we assume that the noise is due to finite data collection. Although additional errors in image segmentation and registration assuredly exist in our data set, this null model provides an upper bound on the number of statistically meaningful eigenmodes.

To calculate this truncation point, we take our data matrix and shuffle each of its columns independently of one another, hence eliminating all meaningful correlations between them. Given finite sampling (even if very large), however, there will still remain some residual correlations, resulting in non-zero off-diagonal terms in the covariance matrix. Hence, if we diagonalise this new covariance matrix, the largest eigenvalue provides a resolution limit for our ability to distinguish signal from finite-sampling noise. Performing this analysis, we find that only 50 modes have eigenvalues larger than this largest shuffled eigenvalue. These 50 modes account for slightly more than 93% of the observed variation in the data.
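
A compact sketch of this shuffling null model follows. One assumption is ours alone: each column is z-scored before shuffling so that the null spectrum is not dominated by unequal marginal variances (the text does not specify this step); the synthetic data with three planted signal directions are also illustrative.

```python
import numpy as np

def n_significant_modes(data, rng):
    """Count PCA eigenvalues exceeding the largest eigenvalue obtained after
    independently shuffling each column (destroying all correlations)."""
    z = (data - data.mean(axis=0)) / data.std(axis=0)   # z-score each dimension
    eigvals = np.linalg.eigvalsh(np.cov(z, rowvar=False))
    shuffled = z.copy()
    for j in range(shuffled.shape[1]):
        shuffled[:, j] = rng.permutation(shuffled[:, j])  # shuffle each column
    null_max = np.linalg.eigvalsh(np.cov(shuffled, rowvar=False)).max()
    return int((eigvals > null_max).sum())

# Toy data: 3 correlated signal directions embedded in 20 noisy dimensions.
rng = np.random.default_rng(2)
latent = rng.normal(size=(2000, 3))
mixing = rng.normal(size=(3, 20))
data = latent @ mixing * 2.0 + rng.normal(size=(2000, 20)) * 0.5
k = n_significant_modes(data, rng)
```

On data with three planted directions, the count recovered by the null model sits at (or immediately around) three, while the residual noise eigenvalues fall below the largest shuffled eigenvalue.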

Appendix C Wavelet calculations

We use the Morlet continuous wavelet transform to provide a multiple time scale representation of our postural mode dynamics. More explicitly, we calculate this transform, $W(s,\tau)$, via

$W(s,\tau) = \frac{1}{\sqrt{s}} \int_{-\infty}^{\infty} y(t)\, \psi^{*}\!\left(\frac{t-\tau}{s}\right) dt$,   (5)

with

$\psi(\eta) = \pi^{-1/4} e^{i\omega_0 \eta} e^{-\eta^2/2}$.   (6)

Here, $y(t)$ is a postural time series, $s$ is the time scale of interest, $\tau$ is a point in time, and $\omega_0$ is a non-dimensional parameter (set to 5 here).

The Morlet wavelet has the additional property that the time scale, $s$, is related to the Fourier frequency, $f$, by

$s(f) = \frac{\omega_0 + \sqrt{2 + \omega_0^2}}{4\pi f}$.   (7)

This can be derived by maximizing the response to a pure sine wave, $y(t) = \sin(2\pi f t)$, with respect to $s$.

However, $|W(s,\tau)|$ is disproportionately large when responding to pure sine waves of lower frequencies. To correct for this, we find a scalar function $C(s)$ such that

$\max_{\tau}\, C(s)\, \left| W(s,\tau) \right| = 1$ for $y(t) = \sin(2\pi f t)$,   (8)

where $f$ is the Fourier frequency corresponding to $s$ via Eq. 7. For a Morlet wavelet, this function is

$C(s) = \frac{\pi^{-1/4}}{\sqrt{2s}}\, e^{\frac{1}{4}\left(\omega_0 - \sqrt{\omega_0^2 + 2}\right)^2}$.   (9)

Accordingly, we can define our power spectrum, $S(\tau, f)$, via

$S(\tau, f) = C\!\left(s(f)\right) \left| W\!\left(s(f), \tau\right) \right|$.   (10)

Last, we use a dyadically-spaced set of $N_f = 25$ frequencies between $f_{\min} = 1$ Hz and the Nyquist frequency ($f_{\max} = 50$ Hz) via

$f_i = f_{\max}\, 2^{-\frac{i-1}{N_f - 1} \log_2\left(f_{\max}/f_{\min}\right)}$   (11)

for $i \in \{1, \ldots, N_f\}$ (and their corresponding scales via Eq. 7). This creates a wavelet spectrogram that is resolved at multiple time-scales for each of the first 50 postural modes.
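A minimal numpy sketch of Eqs. 5-11 follows. It assumes a 100 Hz frame rate (so that the Nyquist frequency is the 50 Hz quoted above); the truncation of the wavelet to ±4 scale-widths and the 10 Hz test tone are illustrative choices of ours, not specifications from the text.

```python
import numpy as np

OMEGA0 = 5.0  # non-dimensional Morlet parameter (Table S2)

def morlet_scale(f):
    """Scale corresponding to Fourier frequency f (Eq. 7)."""
    return (OMEGA0 + np.sqrt(2.0 + OMEGA0**2)) / (4.0 * np.pi * f)

def morlet_spectrogram(y, freqs, dt):
    """Amplitude-normalised Morlet spectrogram |W(s(f), tau)| (Eqs. 5-10)."""
    out = np.empty((len(freqs), len(y)))
    for i, f in enumerate(freqs):
        s = morlet_scale(f)
        t = np.arange(-4.0 * s, 4.0 * s + dt, dt)   # truncated wavelet support
        kernel = np.pi**-0.25 * np.exp(1j * OMEGA0 * t / s - (t / s)**2 / 2.0)
        w = np.convolve(y, kernel, mode='same') * dt / np.sqrt(s)
        # C(s) flattens the response so equal-amplitude sines at different
        # frequencies give equal peak amplitudes (Eqs. 8-9).
        c = (np.pi**-0.25 / np.sqrt(2.0 * s)
             * np.exp(0.25 * (OMEGA0 - np.sqrt(OMEGA0**2 + 2.0))**2))
        out[i] = c * np.abs(w)
    return out

# Dyadically spaced channels from 50 Hz down to 1 Hz (Eq. 11, Table S2).
fs, nf, fmin, fmax = 100.0, 25, 1.0, 50.0          # fs: assumed frame rate
freqs = fmax * 2.0**(-np.arange(nf) * np.log2(fmax / fmin) / (nf - 1))

t_sig = np.arange(0.0, 10.0, 1.0 / fs)
y = np.sin(2.0 * np.pi * 10.0 * t_sig)             # 10 Hz test tone
spec = morlet_spectrogram(y, freqs, 1.0 / fs)
peak_f = float(freqs[np.argmax(spec[:, len(y) // 2])])
```

For a pure tone, the frequency channel with the largest amplitude sits at the channel nearest the tone's frequency, which is the property the normalisation of Eqs. 8-9 is designed to guarantee.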

Appendix D t-SNE implementation

For our initial embedding using t-SNE, we largely follow the method introduced in (19), minimizing the cost function

$C = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}$,   (12)

where $p_{ij}$ are the pairwise transition probabilities between points in the original space,

$q_{ij} = \frac{\left(1 + \| y_i - y_j \|^2\right)^{-1}}{\sum_{k \neq l} \left(1 + \| y_k - y_l \|^2\right)^{-1}}$,   (13)

and $\| y_i - y_j \|$ is the Euclidean distance between points $i$ and $j$ in the embedded space. The cost function is optimised through a gradient descent procedure that is preceded by an early-exaggeration period, allowing for the system to more readily escape local minima.

The memory complexity of this algorithm, which is quadratic in the number of embedded points, limits the practical size of the training set. Although improving this scaling is the subject of current research (44), our solution here is to generate an embedding using a selection of roughly 600 data points from each of the 59 individuals observed. To ensure that these points create a representative sample, we perform t-SNE on 20,000 randomly-selected data points from each individual. This embedding is then used to estimate a probability density by convolving each point with a 2D gaussian whose width is equal to the distance from the point to its nearest neighbours. This space is segmented by applying a watershed transform (35) to the inverse of the PDF, creating a set of regions. Finally, points are grouped by the region to which they belong, and the number of points selected from each region is proportional to the integral of the PDF over that region. This is performed for all data sets, yielding a total of 35,000 data points in the training set.
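
The final per-region sampling step can be sketched as below. The watershed segmentation itself is omitted; region labels and per-point density estimates are simply assumed as inputs, and the function name, region count, and sample size are hypothetical.

```python
import numpy as np

def sample_by_region(labels, weights, n_total, rng):
    """Pick n_total indices, allocating to each region a count proportional
    to that region's total PDF mass, then sampling uniformly within it."""
    regions = np.unique(labels)
    mass = np.array([weights[labels == r].sum() for r in regions])
    counts = np.floor(n_total * mass / mass.sum()).astype(int)
    counts[np.argmax(mass)] += n_total - counts.sum()   # absorb rounding remainder
    chosen = []
    for r, c in zip(regions, counts):
        idx = np.flatnonzero(labels == r)
        chosen.append(rng.choice(idx, size=c, replace=False))
    return np.concatenate(chosen)

# Toy inputs: 2000 embedded points in 5 watershed regions, each point
# carrying a kernel density estimate as its weight.
rng = np.random.default_rng(4)
labels = rng.integers(0, 5, size=2000)
weights = rng.random(2000)
picks = sample_by_region(labels, weights, 600, rng)
```

Sampling in proportion to each region's PDF mass keeps the training set representative of how often each part of the behavioural space is visited.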

Given the embedding resulting from applying t-SNE to our training set, we wish to embed additional points into our behavioural space by comparing each to the training set individually. Mathematically, let $X = \{x_1, \ldots, x_N\}$ be the set of all feature vectors in the training set, $Y = \{y_1, \ldots, y_N\}$ be their associated embeddings via t-SNE, $z$ be a new feature vector that we would like to embed according to the mapping between $X$ and $Y$, and $w$ be the embedding of $z$ that we would like to determine.

As with the t-SNE cost function, we will embed $z$ by enforcing that its transition probabilities in the two spaces are as similar as possible. Like before, the transition probabilities in the full space, $p_j$, are given by

$p_j = \frac{\exp\left(-D_{KL}(z \| x_j)^2 / 2\sigma^2\right)}{\sum_{k} \exp\left(-D_{KL}(z \| x_k)^2 / 2\sigma^2\right)}$,   (14)

where $D_{KL}(z \| x_j)$ is the Kullback-Leibler divergence between $z$ and $x_j$, and $\sigma$ is once again found by constraining the entropy of the conditional transition probability distribution, using the same parameters as for the t-SNE embedding. Similarly, the transition probabilities in the embedded space are given by

$q_j(w) = \frac{\left(1 + \| w - y_j \|^2\right)^{-1}}{\sum_{k} \left(1 + \| w - y_k \|^2\right)^{-1}}$,   (15)

where $\| w - y_j \|$ is the Euclidean distance between $w$ and $y_j$.

For each $z$, we then seek the $w$ that minimises the Kullback-Leibler divergence between the transition probability distributions in the two spaces:

$w^{*} = \arg\min_{w} D_{KL}\!\left(p \,\|\, q(w)\right)$   (16)

$\phantom{w^{*}} = \arg\min_{w} \sum_{j} p_j \log \frac{p_j}{q_j(w)}$.   (17)

As before, this is a non-convex function, leading to potential complexities in performing our desired optimization. However, if we start a local optimization (using the Nelder-Mead Simplex algorithm (45); (46)) from the weighted average of points, $w_0$, where

$w_0 = \sum_j p_j y_j$,   (18)

this point is almost always within the basin of attraction of the global minimum. To ensure that this is true in all cases, however, we also perform the same minimisation procedure, but starting from the point $y_{j^*}$, where

$j^* = \arg\max_j p_j$.   (19)

This returned a better solution approximately 5% of the time.

Because this embedding can be calculated independently for each value of $z$, the algorithm scales linearly with the number of points. We also make use of the fact that this algorithm is embarrassingly parallelizable. Moreover, because we have set our transition entropy, $H$, to be equal to 5, there are rarely more than 50 points to which a given $z$ has a non-zero transition probability. Accordingly, we can speed up our cost function evaluation considerably by considering only the nearest 200 points to $z$ in the original space.
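
The re-embedding of Eqs. 14-18 can be sketched with `scipy.optimize.minimize`. Two stand-ins are ours: Euclidean distance replaces the KL-divergence metric of Eq. 14, and a fixed kernel width replaces the entropy-constrained one; the two-cluster toy data are likewise invented.

```python
import numpy as np
from scipy.optimize import minimize

def reembed(z, X, Y, sigma=1.0):
    """Embed a new feature vector z into an existing map (X -> Y) by
    minimising the KL divergence of Eqs. 16-17, starting the Nelder-Mead
    search from the p-weighted average of the training embeddings (Eq. 18)."""
    d2 = ((X - z)**2).sum(axis=1)          # Euclidean stand-in for the KL metric
    p = np.exp(-d2 / (2.0 * sigma**2))
    p /= p.sum()

    def cost(w):
        q = 1.0 / (1.0 + ((Y - w)**2).sum(axis=1))
        q /= q.sum()
        return float((p * np.log(p / q)).sum())

    w0 = p @ Y                             # weighted-average starting point
    res = minimize(cost, w0, method='Nelder-Mead')
    return res.x

# Training set: two separated clusters in 5-D with matching 2-D embeddings.
rng = np.random.default_rng(5)
X = np.concatenate([rng.normal(0, 0.3, (30, 5)), rng.normal(4, 0.3, (30, 5))])
Y = np.concatenate([rng.normal(0, 0.3, (30, 2)), rng.normal(4, 0.3, (30, 2))])
w = reembed(X[0] + 0.05, X, Y)             # a point near the first cluster
```

A new point drawn from near the first cluster lands in that cluster's embedded region, since the starting point of Eq. 18 already places the search inside the correct basin of attraction.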

Lastly, we find the space of behaviours for the female data sets by embedding these data into the space created with the male training set. We find that the median re-embedding cost (Eq. 16) for the female data is only 1% more than the median re-embedding cost for the male data (5.12 bits vs. 5.08 bits), indicating that the embedding works well for both sexes.

Appendix E Phase-averaged orbits

After applying the Phaser algorithm, we find the phase-averaged orbit via a von Mises distribution weighted average. More precisely, we construct the average orbit for eigenmode $k$, $\langle y_k \rangle(\phi)$, via

$\langle y_k \rangle(\phi) = \frac{\sum_t y_k(t)\, e^{\kappa \cos(\phi - \phi_t)}}{\sum_t e^{\kappa \cos(\phi - \phi_t)}}$,   (20)

where $y_k(t)$ is the projection onto the $k$-th eigenmode at time point $t$, $\phi_t$ is the phase associated with the same time point, and $\kappa$ is related to the standard deviation, $\sigma$, of the von Mises distribution ($\sigma^2 = 1 - I_1(\kappa)/I_0(\kappa)$, where $I_\nu$ is the modified Bessel function of order $\nu$). Here we use the value of $\kappa$ that yields the standard deviation listed in Table S4.

Because phase reconstruction is only unique up to an additive constant, an additional alignment is needed before phase-averaged curves from different behavioural bouts can be compared. This is performed by first finding the phase shift that maximises the cross-correlation between the phase-averaged curves for each mode. The overall phase offset between a pair of 50-dimensional orbits is then given by the median of these per-mode phase shifts.
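
The von Mises weighted average of Eq. 20 is a few lines of numpy, sketched below; the concentration parameter, the noise level, and the sinusoidal toy mode are illustrative assumptions rather than the values used in the paper.

```python
import numpy as np

def phase_average(y, phases, kappa, n_bins=100):
    """Von Mises-weighted average orbit <y>(phi) over a phase grid (Eq. 20)."""
    phi = np.linspace(0.0, 2.0 * np.pi, n_bins, endpoint=False)
    # Weight each sample by a von Mises kernel centred on the query phase.
    w = np.exp(kappa * np.cos(phi[:, None] - phases[None, :]))
    return phi, (w * y[None, :]).sum(axis=1) / w.sum(axis=1)

# Toy limit cycle: a sinusoidal mode sampled at noisy amplitudes.
rng = np.random.default_rng(6)
phases = rng.uniform(0.0, 2.0 * np.pi, 5000)
y = np.sin(phases) + 0.1 * rng.normal(size=5000)
phi, orbit = phase_average(y, phases, kappa=50.0)
```

The weighted average smooths away the additive noise and recovers the underlying sinusoidal orbit, just as Eq. 20 recovers the mean limit cycle of a behavioural bout.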

Appendix F Supplementary Tables

Parameter Description Value
Number of angles used in Radon transforms 90
Number of postural eigenmodes (found, not defined a priori) 50
Table S1: Parameters used in eigen-decomposition
Parameter Description Value
Number of frequency channels 25
Non-dimensional Morlet wavelet parameter 5
High-pass frequency cut-off 1 Hz
Low-pass frequency cut-off 50 Hz
Table S2: Parameters used in wavelet analysis
Parameter Description Value
Transition entropy 5
Training set size 35,000
Maximum non-zero re-embedding transitions 200
Nearest neighbors for training set calculation 10
Table S3: Parameters used in t-SNE implementation
Parameter Description Value
Width of gaussian kernel density estimator for embedded points 1.5
Minimum behavioural time scale s
Width of gaussian for embedded-space velocity calculations s
Standard deviation of von Mises smoothing for phase averaging radians
Table S4: Parameters used in behavioural segmentation

Appendix G Supplementary Figures

Figure S1: Flies adapt to the arena in about five minutes. Median movement velocity for male flies during the 65 minute filming period. Pink regions represent the 25% and 75% quantiles at each point in time. Note that almost all adaptation occurs within the first five minutes after being placed in the dish. This region is excluded from our analyses.
Figure S2: Radon vs. pixel representation of images. A) Probability density function of pixel standard deviations. B) Probability density function of Radon pixel standard deviations. Note the clear minimum that exists in B), allowing for an effective reduction in the number of pixels necessary to represent the data.
Figure S3: Comparison between embedding into 3-D (A) versus 2-D (B) via t-SNE. Other than the embedding dimension, all other parameters remain constant. Colour labels are proportional to the logarithm of the normalising amplitude for each point in the training set. There is a 2% improvement in the error function (Eq. 12) in the 3-D case as compared to the 2-D embedding (2.9 versus 3.3 bits out of a total of 20.6 bits for the transition matrix).
Figure S4: Inter- versus intra-individual variation. Plotted is the inter-individual variation minus the intra-individual variation for each of the 59 male flies. Here, we measure the difference between two behavioural spaces as the Jensen-Shannon (JS) divergence between their respective probability densities. Intra-individual variation is measured as the JS divergence between maps generated from the first 20 minutes and the last 20 minutes of an individual data set. Inter-individual variation is measured as the median JS divergence between an individual's behavioural space and those of the other individuals. Error bars are calculated via N-1 bootstrapping on the inter-individual variance. Accordingly, a positive value on this plot implies that the two portions of the data set are more similar to each other than they are to other individuals' behavioural spaces. We find that 48 of the 59 individuals display significantly less intra- than inter-individual variation.
Figure S5: Effect of varying the smoothing parameter on the identified male/female wing movement distinctions. Plots show the difference between the median female region-normalised probability density function and the median male region-normalised probability density function for various values of the smoothing parameter. Lines encircle regions where the p-value of the Wilcoxon rank sum test is less than .01.
Figure S6: Postural space periodic orbits for the left wing grooming behaviours in Fig 11 of the main text. Blue and red curves are the average orbits for the two highlighted regions. Line thicknesses represent the standard error of the mean at each phase.

Appendix H Supplementary Movie Legends

Supplementary movies can be obtained by emailing GJB (gberman@princeton.edu)

Movie S1. Raw video data of a behaving fly (left) and the corresponding segmented and aligned data (right).

Movie S2. Dynamics in behavioural space. Raw video of a behaving D. melanogaster (middle) is displayed alongside coordinates of the fly’s position within the filming apparatus (left) and its position in the embedded behavioural space (right). The red circles represent the positions in the appropriate coordinate system and the trailing lines are the positions traversed in the previous 0.5 s. The light blue shading indicates that a particular behaviour is being performed, and the blue text below the video of the fly gives a coarse label for the behaviour. The first portion of the movie is 5 s, played at real time (indicated by “Real Time” above the fly video), and the subsequent portion of the movie is slowed down by a factor of 5 for clarity (indicated by “Slowed 5×”).

Movies S3-11. Each movie is a mosaic of multiple instances of specific regions in behavioural space as displayed in Fig 9 and Table S5. Every movie contains multiple segments from many different individuals and is slowed by a factor of 4 for clarity.

Movie Label
Movie S3 Idle
Movie S4 Right wing grooming
Movie S5 Left wing grooming
Movie S6 Left wing and legs grooming
Movie S7 Wing waggle
Movie S8 Abdomen grooming
Movie S9 Running
Movie S10 Front leg grooming
Movie S11 Head grooming
Table S5: Behavioural Movies

Movie S12. Composite movie (slowed by a factor of 4) of randomly chosen instances of flies from the male-preferred behavioural region in Fig 11 of the main text.

Movie S13. Composite movie (slowed by a factor of 4) of randomly chosen instances of flies from the female-preferred behavioural region in Fig 11 of the main text.

References


  1. Altmann, J., 1974 Observational Study of Behavior: Sampling Methods. Behaviour 49, 227–267.
  2. Lehner, P. A., 1996 Handbook of ethological methods. Cambridge, UK: Cambridge University Press, 2nd edition.
  3. Gould, J. L., 1982 Ethology: The Mechanisms and Evolution of Behavior. New York, NY: W. W. Norton & Company.
  4. Stephens, G. J., Osborne, L. C. & Bialek, W., 2011 Searching for simplicity in the analysis of neurons and behavior. Proc. Nat. Acad. Sci. 108, 15565–15571. (doi:10.1073/pnas.1010868108).
  5. Glimcher, P. W. & Dorris, M., 2004 Neuronal studies of decision making in the visual-saccadic system. In The Cognitive Neurosciences III (ed. M. S. Gazzaniga), pp. 1215–1228. MIT Press.
  6. Manoli, D. S., Meissner, G. W. & Baker, B. S., 2006 Blueprints for behavior: genetic specification of neural circuitry for innate behaviors. Trends In Neurosciences 29, 444–451. (doi:10.1016/j.tins.2006.06.006).
  7. Osborne, L. C., Lisberger, S. G. & Bialek, W., 2005 A sensory source for motor variation. Nature 437, 412–416. (doi:10.1038/nature03961).
  8. Stephens, G. J., Johnson-Kerner, B., Bialek, W. & Ryu, W. S., 2008 Dimensionality and dynamics in the behavior of C. elegans. PLoS Comp. Bio. 4, e1000028. (doi:10.1371/journal.pcbi.1000028).
  9. Stephens, G. J., de Mesquita, M. B., Ryu, W. S. & Bialek, W., 2011 Emergence of long timescales and stereotyped behaviors in Caenorhabditis elegans. Proc. Nat. Acad. Sci. 108, 7286–7289. (doi:10.1073/pnas.1007868108).
  10. Desrochers, T. M., Jin, D. Z., Goodman, N. D. & Graybiel, A. M., 2010 Optimal habits can develop spontaneously through sensitivity to local cost. Proc. Nat. Acad. Sci. 107, 20512–20517. (doi:10.1073/pnas.1013470107).
  11. Brown, A. E. X., Yemini, E. I., Grundy, L. J., Jucikas, T. & Schafer, W. R., 2013 A dictionary of behavioral motifs reveals clusters of genes affecting Caenorhabditis elegans locomotion. Proc. Nat. Acad. Sci. 110, 791–796. (doi:10.1073/pnas.1211447110).
  12. Dankert, H., Wang, L., Hoopfer, E. D., Anderson, D. J. & Perona, P., 2009 Automated monitoring and analysis of social behavior in Drosophila. Nature Methods 6, 297–303. (doi:10.1038/nmeth.1310).
  13. Branson, K., Robie, A. A., Bender, J., Perona, P. & Dickinson, M. H., 2009 High-throughput ethomics in large groups of Drosophila. Nature Methods 6, 451–457. (doi:10.1038/nmeth.1328).
  14. Kabra, M., Robie, A. A., Rivera-Alba, M., Branson, S. & Branson, K., 2013 JAABA: interactive machine learning for automatic annotation of animal behavior. Nature Methods 10, 64–67. (doi:10.1038/nmeth.2281).
  15. de Chaumont, F., Coura, R. D.-S., Serreau, P., Cressant, A., Chabout, J., Granon, S. & Olivo-Marin, J.-C., 2012 Computerized video analysis of social interactions in mice. Nature Methods 9, 410–417. (doi:10.1038/nmeth.1924).
  16. Kain, J., Stokes, C., Gaudry, Q., Song, X., Foley, J., Wilson, R. & de Bivort, B., 2013 Leg-tracking and automated behavioural classification in Drosophila. Nature Communications 4, 1910. (doi:10.1038/ncomms2908).
  17. Simon, J. C. & Dickinson, M. H., 2010 A new chamber for studying the behavior of Drosophila. PLoS ONE 5, e8793. (doi:10.1371/journal.pone.0008793).
  18. van der Maaten, L. & Hinton, G., 2008 Visualizing data using t-SNE. J. Mach. Learning Research 9, 85.
  19. Canny, J., 1986 A computational approach to edge detection. IEEE Trans Pattern Analysis and Machine Intelligence 8, 679–714. (doi:10.1109/TPAMI.1986.4767851).
  20. De Castro, E. & Morandi, C., 1987 Registration of translated and rotated images using finite Fourier transforms. IEEE Trans on Pattern Anal and Mach Int 5, 700–703.
  21. Reddy, B. S. & Chatterji, B. N., 1996 An FFT-based technique for translation, rotation, and scale-invariant image registration. IEEE Trans on Image Processing 5, 1266–1271. (doi:10.1109/83.506761).
  22. Wilson, C. A. & Theriot, J. A., 2006 A correlation-based approach to calculate rotation and translation of moving cells. IEEE Trans on Image Processing 15, 1939–1951. (doi:10.1109/TIP.2006.873434).
  23. Guizar-Sicairos, M. & Thurman, S. T., 2008 Efficient subpixel image registration algorithms. Opt Lett 33, 156–158. (doi:10.1364/OL.33.000156).
  24. Revzen, S. & Guckenheimer, J. M., 2012 Finding the dimension of slow dynamics in a rhythmic system. Journal of the Royal Society Interface 9, 957–971. (doi:10.1098/rsif.2011.0431).
  25. Ristroph, L., Berman, G. J., Bergou, A. J., Wang, Z. J. & Cohen, I., 2009 Automated hull reconstruction motion tracking (HRMT) applied to sideways maneuvers of free-flying insects. J Exp Bio 212, 1324–1335. (doi:10.1242/jeb.025502).
  26. Fontaine, E. I., Zabala, F., Dickinson, M. H. & Burdick, J. W., 2009 Wing and body motion during flight initiation in Drosophila revealed by automated visual tracking. J Exp Bio 212, 1307–1323. (doi:10.1242/jeb.025379).
  27. Ye, L. & Keogh, E., 2011 Time series shapelets: a novel technique that allows accurate, interpretable and fast classification. Data Mining and Knowledge Discovery 22, 149–182. (doi:10.1007/s10618-010-0179-5).
  28. Goupillaud, P., Grossman, A. & Morlet, J., 1984 Cycle-octave and related transforms in seismic signal analysis. Geoexploration 23, 85–102.
  29. Daubechies, I., 1992 Ten Lectures on Wavelets. Philadelphia, PA: SIAM.
  30. Cox, T. F. & Cox, M. A. A., 2000 Multidimensional Scaling. Boca Raton, FL: Chapman and Hall, 2 edition.
  31. Tenenbaum, J. B., de Silva, V. & Langford, J. C., 2000 A global geometric framework for nonlinear dimensionality reduction. Science 290, 2319–2323. (doi:10.1126/science.290.5500.2319).
  32. Roweis, S. T. & Saul, L. K., 2000 Nonlinear dimensionality reduction by locally linear embedding. Science 290, 2323–2326. (doi:10.1126/science.290.5500.2323).
  33. Cover, T. M. & Thomas, J. A., 2006 Elements of Information Theory. Hoboken, NJ: Wiley-Interscience, 2nd edition.
  34. Meyer, F., 1994 Topographic distance and watershed lines. Signal Processing 38, 113–125.
  35. Full, R. J. & Koditschek, D. E., 1999 Templates and anchors: neuromechanical hypotheses of legged locomotion on land. Journal of Experimental Biology 202, 3325–3332.
  36. Holmes, P., Full, R. J., Koditschek, D. & Guckenheimer, J., 2006 The dynamics of legged locomotion: Models, analyses, and challenges. Siam Review 48, 207–304. (doi:10.1137/S0036144504445133).
  37. Strauss, R. & Heisenberg, M., 1990 Coordination of legs during straight walking and turning in Drosophila melanogaster. J Comp Physiol A 167, 403–412.
  38. Wosnitza, A., Bockemühl, T., Dübbert, M., Scholz, H. & Büschges, A., 2013 Inter-leg coordination in the control of walking speed in Drosophila. Journal of Experimental Biology 216, 480–491. (doi:10.1242/jeb.078139).
  39. Revzen, S. & Guckenheimer, J. M., 2008 Estimating the phase of synchronized oscillators. Phys Rev E 78, 051907. (doi:10.1103/PhysRevE.78.051907).
  40. Mendes, C. S., Bartos, I., Akay, T., Marka, S. & Mann, R. S., 2013 Quantification of gait parameters in freely walking wild type and sensory deprived Drosophila melanogaster. eLife 2, e00231–e00231. (doi:10.7554/eLife.00231.027).
  41. Le Bourg, E., 1987 The rate of living theory. Spontaneous locomotor activity, aging and longevity in Drosophila melanogaster. Experimental gerontology 22, 359–369.
  42. Wilcoxon, F., 1945 Individual comparisons by ranking methods. Biometrics Bulletin 1, 80–83.
  43. van der Maaten, L., 2013 Barnes-Hut-SNE. arXiv p. 1301.3342.
  44. Jongen, H. T., Meer, K. & Triesch, E., 2004 Optimization Theory. Boston, MA: Kluwer Academic Publishers.
  45. Lagarias, J. C., Reeds, J. A., Wright, M. H. & Wright, P. E., 1998 Convergence properties of the Nelder-Mead simplex method in low dimensions. SIAM J. Optim. 9, 112–147.