Measuring Congruence on High Dimensional Time Series
Abstract
A time series is a sequence of data items; typical examples are videos, stock ticker data, or streams of temperature measurements. Considerable research has been devoted to comparing and indexing simple time series, i. e., time series where the data items are real numbers or integers. However, for many application scenarios, the data items of a time series are not simple, but high-dimensional data points. For example, in video streams each pixel can be considered as one dimension, leading to dimensional data items with already for low resolution videos with pixels per frame.
Motivated by an application scenario dealing with motion gesture recognition, we develop a distance measure (which we call congruence distance) that serves as a model for the approximate congruency of two complex time series. This distance measure generalizes the classical notion of congruence from point sets to complex time series.
We show that, given two input time series and , computing the congruence distance of and is NP-hard. Afterwards, we present two algorithms with quadratic and quasilinear runtime, respectively, that compute an approximation of the congruence distance. We provide theoretical bounds that relate these approximations with the exact congruence distance, as well as experimental results, which indicate that our approach yields accurate approximations of the congruence distance.
1 Introduction
Similarity search or nearest neighbour search is a common problem in computer science and has a wide range of applications (see Section 1.1 for examples). Given a dataset (in our case, a set of time series) and a query (in our case, a time series), the problem is to find the nearest neighbours to the query in the dataset with respect to a certain distance or similarity function. The difference between distance and similarity functions is that a distance function returns $0$ for exact matches and a higher value otherwise, whereas similarity functions return greater values for more similar input data. In this paper, we consider distance functions only. There are two main variations of the nearest neighbour search problem. The first variation is called the $\varepsilon$-nearest neighbour search ($\varepsilon$-NN search), where the search returns all elements from the dataset having a distance of at most $\varepsilon$ to the query. The second variation is called Top-$k$ nearest neighbour search, where those $k$ elements having the smallest distance to the query are returned. In each case, a requirement in practical systems is the fast computation of the distance function.
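The two search variations can be sketched in a few lines. This is a minimal illustration by linear scan, not an indexing technique from the paper; the function names `range_search` and `top_k_search`, the parameter names `eps` and `k`, and the toy distance `abs_dist` are assumptions made for this example only.

```python
import heapq

def range_search(dataset, query, dist, eps):
    """Return all elements whose distance to the query is at most eps."""
    return [x for x in dataset if dist(x, query) <= eps]

def top_k_search(dataset, query, dist, k):
    """Return the k elements with the smallest distance to the query."""
    return heapq.nsmallest(k, dataset, key=lambda x: dist(x, query))

def abs_dist(a, b):
    """Toy distance on real numbers, standing in for a time series distance."""
    return abs(a - b)

data = [1.0, 4.0, 6.5, 10.0]
near = range_search(data, 5.0, abs_dist, 1.5)
top2 = top_k_search(data, 5.0, abs_dist, 2)
```

Both functions evaluate the distance once per dataset element, which is why a fast distance function matters in practice.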
The datasets considered in this paper are time series, i.e., sequences of elements in , for a metric space . Examples of time series include simple time series where (e. g. temperature measurements or stock data) and multi variate time series where (e. g. motion trackings in three dimensional space or videos).
The distance functions defined and analyzed in this paper measure the (approximate) congruence of two time series. The distance between two time series shall be $0$ iff one can be transformed into the other by rotation, translation, and mirroring; in this case, the two time series are said to be congruent. A value greater than $0$ shall correlate to the amount of transformation needed to turn the time series into congruent ones.
1.1 Motivation and Related Work
Simple time series are finite sequences holding one number per time step. There is a vast field of applications for simple time series in virtually all scientific areas, including geoscience (temperature measurements, earthquake prediction), medicine (heart rate measurements), and finance (stock ticker data). Depending on the application, different similarity measurements of time series are used (e. g. Landmarks [21], Dynamic Time Warping [20], and Longest Common Subsequence [20]). Different techniques evolved to speed up nearest neighbour searches [9, 25]. Esling and Agon published a survey on simple time series [13].
Let us continue with a few examples highlighting the role of multidimensional time series.
Motion Gesture Recognition
The interest in motion gesture recognition has drastically increased over the last decade, especially in combination with augmented reality systems such as the Oculus Rift [3]. Recent products, like the LeapMotion [1] or Microsoft Kinect [2], are able to recognize the posture of the hands and body, respectively. These applications belong to appearance based approaches of motion gesture recognition, since they use cameras to recognize the posture at each time. A second category of posture recognition systems includes gloves [11], a technology that is more than 30 years old. The area of their applications has grown from medicine and health care up to recent applications such as controlling a Smartphone [17, 14]. Approaches using systems like these gloves are called skeletal based. The main difference is that the gesture recognition software retrieves the key information, i. e. the trajectory of the body parts, instead of one or multiple video streams of that person.
Our interest, and the application of our work for motion gesture recognition, is the classification of gestures rather than the capturing itself. There are various different approaches to treat this problem, e. g. Computer Vision based techniques [26], trajectory based techniques [24], approaches based on State Machines [16], etc.
Considering the motion of a finger tip and its direction as a time series in , our approach contributes to the skeletal based algorithms. From our point of view, the problem of motion gesture recognition narrows down to the problem of finding the most similar time series. Here, similarity of two time series means the measurement of their congruence. Since motion gestures usually are not performed exactly as stored in a database, we need a fine-grained or approximate congruence measurement. For example, a circle can be drawn more like an ellipse, but is more congruent to a circle than to a square or a line (see Figure 1 and Figure 2). To the best of our knowledge, scaling and translation invariant approaches have been developed in the literature, but no rotation invariant ones. However, rotation invariance is a necessary feature for applications such as interactive tables with multiple persons standing on all sides.
Content Based Video Copy Detection
Nowadays, a vast amount of video data is uploaded and shared on community sites such as YouTube or Facebook. This leads to various tasks such as copyright protection, duplicate detection, analysing statistics of particular broadcast advertisements, or searching for large videos containing certain scenes or clips. Two basic approaches exist to address these challenges, namely watermarking and content based copy detection (CBCD). Watermarking suffers from being vulnerable to transformations frequently performed during copy creation of a video (e. g. resizing or re-encoding). Furthermore, watermarking cannot be used on videos unmarked before distribution. In contrast, CBCD is about finding copies of an original video by specifically comparing the contents and is thus more robust against transformations done during copy creation. These transformations include resolution, format, and encoding changes, addition of noise, blurring, flipping, (color) negation, and grayscaling. Hence, copies are near-duplicates and it is natural to use a distance or similarity function to discover them.
Many approaches compare features created per image [27, 8]. Global features include mean color values and color histograms. In contrast to global features, local features (e. g. Harris Corners, SIFT, or SURF) are more robust against transformations when searching for similar images [19, 22, 23]. However, these techniques are only weakly robust against transformations such as flipping or negation.
Considering a video with pixels per image as a time series in a dimensional vector space, the transformations flipping, negation, and grayscaling correspond to mirroring, rotating, and translating the time series and thus do not change the congruence distance to another video. Furthermore, a global or local feature could be stored per image and regarded as state per time step. Hence, the congruence distance function introduced in the present paper seems to be a good basis for video distance functions in combination with already existing techniques.
Congruence Calculation
The classical Congruence problem basically determines whether two point sets are congruent considering isometric transformations (i. e., rotation, translation, and mirroring) [15, 6]. For two and three dimensional spaces, there are results providing algorithms with runtime [6]. For larger dimensionalities, they provide an algorithm with runtime . For various reasons (e. g. bounded floating point precision, physical measurement errors), the approximated Congruence problem is of much more interest in practical applications. Different variations of the approximated Congruence problem have been studied (e. g. what types of transformations are used, is the assignment of points from to known, what metric is used) [15, 6, 18, 5].
The Congruence problem is related to our work, since the problem is concerned with the existence of isometric functions such that a point set maps to another point set. The main difference is that we consider ordered lists of points (i. e. time series) rather than pure sets.
1.2 Main Contributions
In this paper, we use a model for complex time series covering models of time series known from the literature as well as high dimensional time series. Focusing on high dimensional time series, our main contributions are as follows:

We define and analyze an intuitive congruence measurement (congruence distance) which can be computed by solving an optimization problem with highly nonlinear constraints.

We show that the calculation of the congruence distance is an NP-hard problem. This is done by constructing a technically involved polynomial time reduction from the NP-hard 1-in-3-Sat problem.

We provide two approximations to the congruence distance (delta distance, and reduced delta distance) that can be computed in polynomial time. Studying their approximation quality, we obtain:

The approximations yield lower bounds on the congruence distance.

There exist pathological examples revealing that the relative error can grow arbitrarily.

Our experimental results suggest a stable behaviour of the approximations in practical applications.

1.3 Organization
The rest of this paper is structured as follows. In Section 2 we provide basic notation used throughout the paper. In Section 3, we fix the notion of time series, and we present distance measures that turn the set of all time series into a metric space. Section 4 discusses the congruence of two time series: We define an intuitive function measuring the congruence similarity of two time series and show that its calculation is an NP-hard problem. Furthermore, we provide an approximation with quadratic runtime and compare both distance functions with each other. In Section 5 we provide an approximation which has quasilinear runtime. There are examples where the difference between the congruence distance functions provided in this paper grows arbitrarily. However, the experimental results presented in Section 6 indicate that in practice, our approach yields accurate approximations. Section 7 concludes the paper.
2 Preliminaries
Basic notation
By , , we denote the set of nonnegative integers, the set of reals, and the set of all reals , for some , respectively. For integers we write for the interval consisting of all integers with , and we write for .
By and , for , we denote the set of all vectors of length , resp., all matrices with entries in . For a vector we write for the entry in position .
Similarly, for a matrix we write for the entry in row and column . By we denote the th unit vector in , i.e., the vector with entry in the th position and entry in all other positions.
We write for the product of the matrix and the vector . We write and for the product of the number with the vector and the matrix , respectively (i.e., for all , the th entry of is , and the entry in row and column of is ).
By , for , we denote the usual norm on ; i.e., for all .
By we denote the usual scalar product on ; i.e., for we have . In particular, for all . Recall that two vectors are orthogonal iff .
A matrix is called orthogonal if its columns form an orthonormal system, i.e., iff every column has Euclidean norm 1 and the scalar product of any two distinct columns is 0. Every orthogonal matrix has a determinant of absolute value 1, but the converse does not hold in general (a shear matrix has determinant 1 and is not orthogonal). We write to denote the set of all orthogonal matrices in . Recall that angles and lengths are invariant under multiplication with orthogonal matrices, i. e.:
In general, a vector norm is an arbitrary mapping that satisfies the following axioms:
Clearly, is a vector norm ( norm) for any .
A matrix norm is a mapping satisfying the following axioms:
The particular matrix norms considered in this paper are the max column norm and the norm , for , which are defined as follows: For all ,
Recall that a pseudo metric space consists of a set and a distance function satisfying the following axioms:
A metric space is a pseudo metric space which also satisfies
Note that if is an arbitrary vector norm and is defined as , then is a metric space. By , for , we denote the usual distance, i.e., the particular distance function with .
If is an arbitrary matrix norm and is defined as for all matrices , then is a metric space.
3 Time Series
Let be an arbitrary set. A time series over is a finite sequence of elements in . For a time series , we write to denote the length of . The elements of are called the states of .
The special case where yields the simple time series that are usually considered in the literature; examples of application areas are time sequences obtained from stock data, temperature measurements or heart rate monitoring (here, we consider time series with homogeneous time intervals only). For such simple time series, the distance between two time series and of equal length usually is defined as , where is a vector norm and is the vector in whose th entry is the real number . The most common case considered in the literature uses the 1-norm , cf. e.g. [13]; see Figure 3 for an illustration.
We generalize this to time series over arbitrary sets as follows. Let be a metric space. For time series of length over , we let be the real vector of length with entry in its th position (for all ). Now let be an arbitrary vector norm. We define a distance measure via
By we denote the set of all time series over of arbitrary length, i.e., . If is clear from the context, we will omit the subscript and simply write instead of . For we then write to denote the set of all time series of length over . It is straightforward to verify the following.
Proposition 3.1.
is a metric space.
Next, we want to extend to a distance measure on time series of arbitrary length, i.e., we want to extend to a mapping . For this, the following notation is convenient.
Definition 3.2.
Let be a time series, let , and let . Then is the subseries of of length starting at index .
If and are two time series of lengths , then we let
I.e., the distance between and is computed by finding the best match of the shorter time series regarded as a window over the longer time series. We will write
instead of for the special case where , for some , and d is the Euclidean distance defined via for all .
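The construction of this section can be sketched as follows: a vector norm is applied to the vector of per-step state distances, and for series of different lengths the shorter one is slid as a window over the longer one. This is a minimal sketch under assumptions: states are tuples of floats, the state distance is Euclidean, the vector norm is a p-norm, and all function names are chosen for this example.

```python
import math

def euclidean(a, b):
    """Euclidean state distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def p_norm(values, p):
    """p-norm of a list of real numbers."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

def series_distance(S, T, state_dist=euclidean, p=1):
    """Distance of two equal-length series: p-norm of the state distances."""
    if len(S) != len(T):
        raise ValueError("series must have equal length")
    return p_norm([state_dist(s, t) for s, t in zip(S, T)], p)

def window_distance(S, T, state_dist=euclidean, p=1):
    """Best match of the shorter series as a window over the longer one."""
    short, long_ = (S, T) if len(S) <= len(T) else (T, S)
    m = len(short)
    return min(
        series_distance(short, long_[i:i + m], state_dist, p)
        for i in range(len(long_) - m + 1)
    )

S = [(0.0, 0.0), (1.0, 0.0)]
T = [(5.0, 5.0), (0.0, 0.0), (1.0, 1.0), (9.0, 9.0)]
best = window_distance(S, T)  # best window is T[1:3]
```

The window scan evaluates one equal-length distance per offset, so its cost grows with the product of the two series lengths.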
It is easy to see that many other distance functions (e. g. DTW and LCSS [7, 10, 20]) that have been considered in the literature for time series over or can be adapted to time series over for a metric space accordingly.
To avoid confusion between d, , , , and further distance functions considered in this paper, we will henceforth write (or variants thereof) to denote distance functions for relating time series (i.e., will be a function from to ), and we will write d (or variants thereof) to denote distance functions for relating individual states in the time series (i.e., d will be a function from to ). The latter will be called state distance function.
We will speak of metric time series whenever considering time series over for a metric space . For a given vector norm , the associated function will serve as a distance measure for time series over .
Let us conclude this section with a few examples that illustrate the generality of metric time series.
Examples 3.3.
As already explained above, simple time series are a special case of time series where , is defined via for , and for some .
Complex time series, i.e., time series where the states are elements in for some fixed , are the special case where , is the Euclidean distance , for some , and hence .
For an arbitrary undirected connected graph , we can consider the mapping where is the length of a shortest path between nodes and of . Note that is a metric space. Given an arbitrary vector norm , we can view sequences of nodes of as time series over , and as a distance measure between such time series.
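The graph example can be made concrete with a breadth-first search that supplies the shortest-path state distance. This is a small sketch under assumptions: the graph is an adjacency dictionary, it is connected and unweighted, the vector norm is the 1-norm, and the names `bfs_distances` and `graph_series_distance` are hypothetical.

```python
from collections import deque

def bfs_distances(graph, source):
    """Shortest-path lengths from source to every node (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def graph_series_distance(graph, S, T):
    """1-norm of the per-step shortest-path distances of two node sequences."""
    if len(S) != len(T):
        raise ValueError("series must have equal length")
    return sum(bfs_distances(graph, s)[t] for s, t in zip(S, T))

# Path graph a - b - c - d given as adjacency lists.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
d = graph_series_distance(path, ["a", "b", "c"], ["a", "d", "c"])
```

Running one search per time step keeps the sketch short; precomputing all pairwise distances would be the natural choice when many series over the same graph are compared.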
In the remainder of this paper we restrict attention to time series over and state distance functions .
4 Time Series Congruence
Let and let . If is a time series, is a matrix, and is a vector, we write for the time series where for each .
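Applying a matrix and a translation vector to every state can be illustrated as follows. This is a minimal sketch, assuming states are tuples and matrices are nested lists; the helper names `mat_vec` and `transform` and the example data are not taken from the paper.

```python
def mat_vec(M, x):
    """Product of a matrix (list of rows) with a vector (tuple)."""
    return tuple(sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M)))

def transform(S, M, v):
    """Apply the map s -> M*s + v to every state s of the time series S."""
    return [tuple(a + b for a, b in zip(mat_vec(M, s), v)) for s in S]

# Rotation by 90 degrees counterclockwise in the plane.
ROT90 = [[0, -1],
         [1, 0]]

S = [(1, 0), (0, 2)]
moved = transform(S, ROT90, (1, 1))
```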
We say that two time series are congruent, if can be transformed into by rotation, mirroring, or translation. This is formalized in the following definition.
Definition 4.1.
Consider the metric space for . Two time series and of the same length are called congruent (for short: ) if there is a matrix and a vector such that .
It is easy to see that for each , the congruence relation is an equivalence relation on the class of all time series over of length .
According to the motivation provided in Section 1, we aim at a distance measure that regards two time series and as very similar if is obtained from via rotation, mirroring, or translation, i.e., which satisfies the following congruence requirement.
Definition 4.2 (Congruence Requirement).
Let , let , and let . A function satisfies the congruence requirement iff for all time series the following is true:
The following example highlights some intuition for the congruence distance function that is provided in Definition 4.4.
Example 4.3.
Consider the time series
Obviously, . Now, let us rotate by 90 degrees counterclockwise, i. e., let us compute for the matrix
and .
Thus, without rotation, we need to add a vector of Euclidean length to the first state of in order to transform into . But after rotating by 90 degrees counterclockwise, we only need to add a vector of length to the first state and a vector of length to the third state of to obtain the time series .
Adding vectors to certain states can be interpreted as investing energy to give both time series the same structure, i.e., to make them “congruent”. Hence, the congruence distance defined below can be viewed as a measure for the minimum amount of energy needed to make both time series congruent.
Definition 4.4 (Congruence Distance).
Let , , , and . The congruence distance between two time series is defined via
Note that, although and are infinite sets, it can be shown that the “min” used in the definition of does exist, and that for given there are and such that ; a proof can be found in the appendix.
It is not difficult to see that the following holds for and for each :
Proposition 4.5.
is a pseudo metric space.
The proof is given in the appendix.
Obviously, calculating for arbitrary is a nonlinear optimization problem that can be solved using numeric solvers. However, the problem is computationally difficult: As we show in the next subsection, already the calculation of is NP-hard.
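To make the optimization problem tangible, the following sketch computes a crude upper bound for two planar time series. It is not one of the approximation algorithms of this paper. It assumes that the congruence distance minimizes the sum of Euclidean state distances over all orthogonal matrices and translation vectors; every admissible pair then yields an upper bound on the minimum. The sampling of angles and the centroid alignment are heuristic choices made for this example.

```python
import math

def apply(M, t, v):
    """Return M*t + v for a 2x2 matrix M and planar points t, v."""
    return (M[0][0] * t[0] + M[0][1] * t[1] + v[0],
            M[1][0] * t[0] + M[1][1] * t[1] + v[1])

def energy(S, T, M, v):
    """Sum of Euclidean distances between s_i and M*t_i + v."""
    total = 0.0
    for s, t in zip(S, T):
        x, y = apply(M, t, v)
        total += math.hypot(s[0] - x, s[1] - y)
    return total

def centroid(S):
    n = len(S)
    return (sum(p[0] for p in S) / n, sum(p[1] for p in S) / n)

def congruence_upper_bound(S, T, steps=360):
    """Smallest energy over sampled rotations and reflections (heuristic)."""
    cs, ct = centroid(S), centroid(T)
    best = math.inf
    for k in range(steps):
        a = 2.0 * math.pi * k / steps
        c, s = math.cos(a), math.sin(a)
        # A rotation (determinant 1) and a reflection (determinant -1).
        for M in ([[c, -s], [s, c]], [[c, s], [s, -c]]):
            mc = apply(M, ct, (0.0, 0.0))
            v = (cs[0] - mc[0], cs[1] - mc[1])  # align the centroids
            best = min(best, energy(S, T, M, v))
    return best
```

For congruent inputs whose rotation angle lies on the sampling grid the bound is numerically zero; in general it can be far from the true minimum, which is in line with the hardness result of the next subsection.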
4.1 NP-Hardness
In this subsection we restrict attention to and the according congruence distance . Consider the following problem:
 Input:
A number and two time series and of equal length over .
 Task:
Compute (a suitable representation of) the number .
This subsection’s main result is:
Theorem 4.6.
If $\mathrm{P} \neq \mathrm{NP}$, then the above problem cannot be solved in polynomial time.
The remainder of Subsection 4.1 is devoted to the proof of Theorem 4.6, which constructs a reduction from the NP-complete problem 1-in-3-Sat. Recall that 1-in-3-Sat is the problem where the input consists of a propositional formula in 3-cnf, i.e., in conjunctive normal form where each clause is a disjunction of literals over 3 distinct variables. The task is to decide whether there is an assignment that maps the variables occurring in to the truth values $0$ or $1$, such that in each disjunctive clause of exactly one literal is satisfied by ; we will call such an assignment a 1-in-3 model of .
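The notion of a 1-in-3 model can be checked by brute force on small formulas. This is a sketch for illustration only; the encoding of a literal as a pair (variable index, polarity) and the function names are assumptions of this example, and the exhaustive search takes exponential time.

```python
from itertools import product

def literal_value(literal, assignment):
    """A literal is (variable index, positive?); assignment maps index to 0 or 1."""
    var, positive = literal
    return (assignment[var] == 1) == positive

def is_one_in_three_model(clauses, assignment):
    """True iff exactly one literal of every clause is satisfied."""
    return all(
        sum(literal_value(lit, assignment) for lit in clause) == 1
        for clause in clauses
    )

def find_one_in_three_model(clauses, num_vars):
    """Try all assignments; return the first 1-in-3 model or None."""
    for bits in product((0, 1), repeat=num_vars):
        if is_one_in_three_model(clauses, bits):
            return bits
    return None

# (x0 or x1 or x2)
satisfiable = [((0, True), (1, True), (2, True))]
# (x0 or x1 or x2) and (not x0 or x1 or x2) has no 1-in-3 model
unsatisfiable = satisfiable + [((0, False), (1, True), (2, True))]
```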
Our reduction from 1-in-3-Sat to will proceed as follows: A given 3-cnf formula with variables , will be mapped to two time series and over , which represent the formula and its variables, respectively. Our construction of and will ensure that for a certain number the following is true: there is a 1-in-3 model of .
The basic idea for our choice of and is that each dimension of represents one variable. An orthogonal matrix, mirroring the th dimension then will correspond to negating the th variable .
To formulate the proof, the following notation will be convenient. For a propositional formula with variables, we write to denote the variables occurring in . A literal over a variable is a formula . A disjunctive (conjunctive) 3-clause is a formula () with , , and . A 3-cnf formula is a formula , where and each is a disjunctive 3-clause.
Furthermore, we will use the following notation for concatenating time series. Let , and let be a time series over for each . Then, by
we denote the time series
If is an increasing sequence of integers and is a time series over , for each , then for we let
From a 3-cnf formula to time series and
For a given 3-cnf formula let be the number of variables occurring in . Let be of the form , where and each is a disjunctive 3-clause of the form , where , , and .
For a disjunctive 3-clause let
where and . Clearly, an assignment satisfies iff it is a 1-in-3 model of . And satisfies iff it is a 1-in-3 model of .
The formulas for are called the conjunctive 3-clauses implicit in .
We define an embedding of variables, literals, and conjunctive 3-clauses into as follows: For each let
For a literal we let . For a conjunctive 3-clause , we let
In particular, for as defined above, we obtain that
For each disjunctive 3-clause we let
and define the following time series over :
(1) 
For a 3-cnf formula all these time series will be concatenated to the two time series
Finally, to be able to handle translations, we concatenate the time series with their mirrored duplicates:
Our aim is to compute a number such that the following is true: iff has a 1-in-3 model. For obtaining this, we will proceed in several steps, the first of which is to compute a number such that has a 1-in-3 model iff , for
(2) 
The idea behind our choice of the time series and is as follows: and force the orthogonal matrix to have a suitable shape when leading to the minimal distance, i. e. to have all the ’s as eigenvectors with eigenvalues of $1$ or $-1$; in other words, each vector will either be left untouched or will be negated. The time series represents the disjunctive 3-clause , while holds the vector representing the variables used in . The minimum of to will then be reached if the vector is rotated in such a way that it matches one of the vectors of . Hence, assigning a propositional variable the value 0 corresponds to negating the th dimension, and assigning the value 1 leaves that dimension untouched.
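The correspondence between truth assignments and orthogonal matrices can be illustrated with diagonal matrices whose entries are 1 or -1. The sketch below does not reproduce the embedding or the time series of the reduction; it only shows that such a sign matrix is orthogonal and negates exactly the dimensions of the variables assigned 0. All names are hypothetical.

```python
def assignment_to_matrix(assignment):
    """Diagonal matrix with 1 where the variable is 1 and -1 where it is 0."""
    n = len(assignment)
    return [
        [(1 if assignment[i] == 1 else -1) if i == j else 0 for j in range(n)]
        for i in range(n)
    ]

def is_orthogonal(M, tol=1e-12):
    """Check that the columns of M are pairwise orthogonal unit vectors."""
    n = len(M)
    for i in range(n):
        for j in range(n):
            dot = sum(M[r][i] * M[r][j] for r in range(n))
            if abs(dot - (1 if i == j else 0)) > tol:
                return False
    return True

def mat_vec(M, x):
    """Product of a matrix (list of rows) with a vector (tuple)."""
    return tuple(sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M)))

M = assignment_to_matrix((1, 0, 1))   # second variable assigned 0
image = mat_vec(M, (1, 1, 1))         # second coordinate is negated
```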
Relating with 1-in-3 models of
The next observation will be helpful for our proofs.
Lemma 4.7.
Let , , , and . Then,
Proof.
We let . For the special case where , Thales’ Theorem tells us that . The same holds true for arbitrary , as the following computation shows.
Clearly, , and thus . Furthermore, . And . Thus,
Thus,
and .
From now on, whenever given a matrix , we will always use the following notation: , and . From Lemma 4.7 we know that and .
For a disjunctive 3-clause and a matrix we let
In the next lemmas, we will gather information on the size of (cf. the appendix for the proofs of Lemma 4.8 and Lemma 4.10).
Lemma 4.8.
Let be a disjunctive 3-clause, let . Then,
(3) 
Lemma 4.9.
Let be a disjunctive 3-clause.

(a) For each we have
(4) 
(b) Let be an element in such that , where for some conjunctive 3-clause implicit in . Then .

(c) Let be an element in such that for all , and . Then , where for some conjunctive 3-clause implicit in .
Proof.
Let , for , be the conjunctive 3-clauses implicit in , and let .
For proving (a), let be an arbitrary element in . Note that by definition of and we have
(5) 
By the triangle inequality and the symmetry we know that is true for all pseudo metric spaces and all . Thus, for any vector and for any we have
and hence
(6) 
Let us choose as follows: We let where if , and otherwise. Then,
Note that is equal to if , and it is equal to if . Thus, due to our choice of , we know that , and hence
(7) 
Our next goal is to show that . For simplicity let us consider w.l.o.g. the case where . For let (thus, ). Then, w.l.o.g. we have
For showing that , we make a case distinction according to .
Case 1: for some . In this case, , and for each , it is straightforward to see that . Thus, .
Case 2: