An Optimal Control Model of Mouse Pointing Using the LQR
Abstract
In this paper we explore the Linear-Quadratic Regulator (LQR) to model movement of the mouse pointer. We propose a model in which users are assumed to behave optimally with respect to a certain cost function. Users try to minimize the distance of the mouse pointer to the target smoothly and with minimal effort, by simultaneously minimizing the jerk of the movement. We identify parameters of our model from a dataset of reciprocal pointing with the mouse. We compare our model to the classical minimum-jerk and second-order lag models on data from 12 users with a total of 7702 movements. Our results show that our approach explains the data significantly better than either of these previous models.
CCS Concepts: Human-centered computing → HCI theory, concepts and models
1 Introduction
Interaction with computers is almost always achieved through movement of the user, measured via input devices. In the field of human motor control, there has been tremendous progress in the understanding of human movement since the 1950s and 60s, when Fitts' law [12, 13] was published. Arguably the most important modern theory of human motor control is optimal feedback control (OFC) [35, 9]. Its main strengths are versatility (it is applicable to many movement tasks) and the ability to predict the entire movement (including position, velocity, and acceleration of the end-effector over time, not just movement time) without relying on Machine Learning techniques, thus retaining comprehensibility. Despite these advantages, OFC models are not yet well known in the field of Human-Computer Interaction (HCI). The objective of this paper is to introduce optimal feedback control to HCI.
OFC is a family of computational models of (human) movement. These models assume that people behave rationally, i.e., optimally with respect to some cost function. In addition, people observe the state of the environment and adjust their movement in order to accomplish a given task, in a feedback manner. The interplay of the three main constituents of OFC, i.e., optimality, feedback, and control, is displayed in Figure 1.
As the figure suggests, the OFC framework is very versatile: Various movements, such as hand or eye movements or balancing, can be explained by adjusting the System block (and the Controller block, if necessary). Various instructions, such as emphasizing speed vs. comfort, can be incorporated by adapting the cost function. Due to their feedback structure (also called closed-loop), OFC models provide intuitive insight into how humans react to disturbances during the movement, changing targets, etc.
Through OFC, we aim at connecting the field of HCI better with recent advances in neighboring scientific disciplines, such as the study of human movement in motor control [30, 14] and neuroscience [32].
From a scientific perspective, this would strengthen the field of HCI through a deeper insight into the basic constituents of interaction. We start from one of the simplest and most ubiquitous ways we interact with Personal Computers: pointing with a mouse. However, as stated above, OFC could provide a unifying framework for understanding movement in many different interactive tasks, including pointing, steering, tracking of moving targets, scrolling and zooming, with PCs, mobile devices, in AR/VR, etc.
From an engineering perspective, OFC would enable a deeper understanding of the impact of interface design parameters on the process of interaction. In the long term, these models could be used for automated optimization of the parameters of interaction techniques. Models of the dynamics of interaction would help in the design of input devices, from mice to VR controllers. Models that work in real time could be used in predictive interfaces, which anticipate what the user wants to do and respond accordingly, such as pointing target prediction [2].
To achieve our goals, we start from a well-known model from OFC theory, presented by Todorov [33]. We believe that the best way to introduce modern motor control theory to HCI is to provide a simple model that is adapted to the above-mentioned HCI purposes. Thus, we make several model simplifications, which we discuss below. These allow us to use the so-called Linear-Quadratic Regulator (LQR) as the Controller in Figure 1, to calculate the optimal feedback control law. We explore cost functions that combine the objectives of minimizing jerk, which is the derivative of acceleration, and minimizing the distance to the target. We identify parameters of these cost functions and the underlying pointer dynamics from a dataset of reciprocal pointing [26]. We compare the ability of our model to replicate pointer movement to two other models based on the second-order lag [8, 22] and jerk minimization [14]. Both are suitable comparison candidates: the former model has been evaluated with the same dataset [26]; the latter is an established model in motor control, which has been applied in an HCI context [29]. We compare the models on data from 12 users, with 7702 movements overall.
Our results show that our model is able to fit the data significantly better than the other two models. Compared to the former, our approach can generate more symmetric and plausible velocity and acceleration profiles. Compared to the latter, our approach can simultaneously model the movement well and reach the target. Our model can predict the entire movement with only three intuitively interpretable parameters.
2 Related Work
In HCI, movement, e.g., of the mouse pointer, is often reduced to summary statistics such as movement time. The dependency of movement time on the distance and width of targets is usually described by Fitts' law [12, 13] as $MT = a + b \cdot \mathrm{ID}$, with the Index of Difficulty (ID) defined as $\mathrm{ID} = \log_2(D/W + 1)$ [24], where $D$ is the distance to the target and $W$ is the target width, although alternatives such as Meyer's law exist [25]. In HCI, Fitts' law is usually interpreted from an information-theoretic perspective. A very good explanation of this interpretation of Fitts' law has been provided by Gori et al. [16].
The kinematics and dynamics of movement are studied more rarely in HCI. However, in the studies of human motor control, various models describing kinematics and dynamics of human movement have been developed.
Feedback control models (also called closed-loop models) of movement assume that people monitor and adjust their motion on a moment-to-moment basis. These models are able to explain how users repeatedly correct errors and handle disturbances. An early closed-loop model (without optimization) has been provided by Crossman and Goodeve [8]. They assume that users observe hand and target and adjust their velocity as a linear function of the distance, as a first-order lag.
A simple, physically more plausible extension of the first-order lag is the second-order lag [8, 22]. These dynamics can be interpreted as a spring-mass-damper system similar to that implied by the equilibrium-point theory of motor control [30]. A constant force is applied to the mass, such that the system moves to and remains at the target equilibrium. This is one of the comparison models; hence, we call this approach 2OL-Eq. Other models of human movement include VITE [5] and the models of Plamondon [27].
A fundamentally different approach to using such fixed-control models is to assume that humans try to behave optimally, according to a certain internalized cost function. Flash and Hogan [14] propose that humans aim to generate smooth movements by minimizing the jerk of the end-effector. We call this model MinJerk in the following. Although the hypothesis that people aim to minimize jerk has been questioned, see, e.g., Harris and Wolpert [18], it is an established model and has been successfully used by Quinn and Zhai [29] to model the shape of gestures on a word-gesture keyboard. The minimum-jerk model predicts a scale-invariant trajectory (as a 5th-degree polynomial), if the exact position and time of beginning and end of the movement are known. It can be interpreted as a trajectory planning step [35] and is thus particularly appropriate for modeling movements that do not involve so-called corrective submovements. These were first proposed by Woodworth [37, 11] and typically occur after the first large movement, also called the "surge", towards the target [25]. Hence, while the model is applicable to gestures, it remains to be seen whether it can replicate mouse pointer data accurately. Moreover, it does not explain how people execute that trajectory, or if and how they react to disturbances, such as muscle fatigue, external perturbations, changes of the target, etc.
The theory of OFC makes it possible to resolve the separation between trajectory planning and execution. Excellent overviews of recent progress in OFC theory are provided by Crevecoeur et al. [7] and Diedrichsen [9]. An early approach that models perturbed reach and grasp movements by using the minimum-jerk trajectory on a moment-to-moment basis was presented by Hoff and Arbib [20]. A more general, more recent and better known OFC model is proposed by Todorov and Jordan [35]. This non-deterministic model is based on an extension of the Linear-Quadratic-Gaussian Regulator (E-LQG) [33]. It assumes that users try to reach a target at a certain time while minimizing jerk. The biomechanical apparatus is modeled by second-order lag dynamics. In via-point tasks, this model qualitatively replicates movement segmentation, eye-hand coordination, visual perturbations, and other characteristics of human movement. A discussion about how this model, including state- and control-dependent noise, can be extended to more general reaching movements can be found in [34].
A fundamental limitation of the E-LQG model (and many other optimal control models, e.g., [14, 36, 18]) is that the exact movement time needs to be known in advance. One way to circumvent this issue is to use infinite-horizon OFC [21, 28, 23], i.e., to formulate the optimal control problem on an infinite time horizon. In these references, this approach, in conjunction with a cost function that includes (quadratic) distance and effort costs, was used to model end-effector movement towards a target. The movement time then emerges from the optimal control problem.
Another strand of literature that specifically deals with the duration of movement has produced the Cost of Time theory [19, 31, 3]. This theory assumes that humans value time with a certain (e.g., hyperbolic or sigmoidal) cost function. Thus, movement time is explicitly included in the cost function.
In summary, the fundamental question of human movement coordination has produced a substantial literature and deep understanding regarding the nature of human movement. Given that almost all interaction of humans with computers involves movement, it is surprising that this knowledge is little known in HCI. It is important to bear in mind, however, that the purposes of these models are very different from HCI. They intend to model movement of the human body per se. In contrast, in HCI we are less interested in how the body moves, and more interested in how virtual objects in the computer, such as mouse pointers, move. Movement in HCI is mediated by input devices, operating systems, and programs, requires high precision, and is often learnt very well. Therefore, these models need to be adapted and validated regarding their ability to model movement of virtual objects such as mouse pointers in interaction.
In the field of HCI, there are few publications with control models of mouse pointer movement. Müller et al. [26] compare three feedback control models (without optimization) regarding their ability to model mouse pointer movements. Ziebart et al. [38] explore the use of optimal control models for pointing target prediction. They do not make particular a priori assumptions about the structure of the cost function. Instead, they use a machine learning approach to fit a generic function with a large number of parameters (36) to a dataset of mouse pointer movements. While suitable for their purposes, we are interested in gaining more insight into the structure of the cost function. Furthermore, we believe that reducing the number of parameters (to three in our main model) reduces the risk of overfitting.
3 Model Simplifications
Our approach to introducing OFC theory to HCI is by providing a model that is applicable to HCI, easy enough to understand, while still showing the benefits and strengths of OFC theory. To this end, we start with a simple model for mouse pointer movements that we validate on an HCI dataset. Based on this initial introduction of OFC to HCI, in the future we plan to incorporate extensions proposed in the motor control literature, such as sensorimotor noise and Cost of Time theory.
Our model is inspired by Todorov’s ELQG model [33]. To apply it to our HCI purposes, the following three main difficulties need to be dealt with: First, Todorov’s model replicates many phenomena observed in human movement only qualitatively; there is no known method for adjusting the model to replicate specific experimental data. Second, the exact movement time needs to be known in advance, which is rarely the case in HCI. Third, motor control models usually model movement of the human body per se, e.g., movement of the hand as measured through motion capture or a stylus tablet, while the mouse has been avoided. Mouse pointer movements, however, are modified by sensor characteristics such as mouse sensor rotation and calculations on the microcontroller and in the operating system. It is unclear whether models that have been developed for understanding natural human (hand) movements are also good models for mouse pointer movements.
In this paper we present an OFC model that addresses all these points. Based on OFC theory (see Figure 1), our two key assumptions are first that control of the system is calculated via optimization, i.e., by minimizing a certain cost function. Second, the control is obtained in a feedback manner, i.e., it depends on the system state. To provide a simple model to introduce OFC to HCI and the modeling of mouse pointer movements, we make four key simplifications.
First, following existing literature, we require the cost function that users are assumed to minimize to be quadratic. In pointing tasks, people aim at bringing the end-effector to the target. For various settings, this has been modeled in OFC literature through quadratic distance costs that penalize the distance of the end-effector to the target center [33, 9, 28], see also [15]. At the same time, people aim at minimizing their effort and moving smoothly. The common model for the latter is that users aim to minimize the jerk of the movement [14]. Thus, similar to Todorov [33], we assume the cost function to include terms for penalizing the distance between pointer and target as well as terms to penalize the jerk.
Second, we assume linear dynamics of the mouse pointer (the System block in Figure 1). More precisely, as in Todorov [33], our system dynamics are described by a second-order lag.
With the third and fourth simplification, we deviate from Todorov [33]: We assume that there are no internal delays in the model. Moreover, we do not model noise and thus have a deterministic model. As a result, our approach quantitatively predicts position and velocity of the mouse pointer over time. In this deterministic setting, fitting the model parameters to the behavior of particular users in a specific task becomes easier.
To summarize, we assume optimal closed-loop behavior with respect to a quadratic cost function (that penalizes the jerk as well as the distance to the target) and subject to linear system dynamics (second-order lag) with no delay and no noise. These simplifications allow us to solve the optimal control problem using a simple optimal feedback controller, LQR, as explained in the next section.
4 The Model
Since mouse sensor data are available in discrete time, we use discrete-time dynamics. The state of the system is given by a vector $x_n$ that includes the position and velocity of the virtual mouse pointer. The user controls the mouse pointer by a force $u_n$, which influences the state $x_n$. Both are given at the discrete time steps $n \in \{1, \dots, N\}$ up to some final $N$. The next state $x_{n+1}$ depends on the current state $x_n$ and control $u_n$, as described by
(1) $x_{n+1} = A x_n + B u_n,$
where the initial state $x_1$ is given. In this, the matrix $A$ describes how the system, e.g., the mouse pointer dynamics described by a second-order lag, evolves when no control is exerted. The matrix $B$ describes how the control influences the system. In this paper we look at 1D pointing tasks, in which the mouse can only be moved horizontally. Thus, in our case, the state $x_n$ encodes the horizontal position and velocity of the pointer, denoted by $y_n$ and $v_n$, respectively, as well as a target position $T$ for technical reasons (in order to later be able to compute the distance to the target), i.e.,
(2) $x_n = (y_n, v_n, T)^\top.$
This model can easily be extended to 2D or 3D pointing tasks by augmenting $x_n$ and $u_n$ with the respective components for the additional dimensions.
As a model for the mouse pointer dynamics we use the second-order lag, as depicted in Figure 4(a). The parameters of the model are the stiffness $k$ of the spring and the damping factor $d$. The mass is a redundant parameter and does not change the qualitative behavior of the model. We therefore set it to 1. In continuous time, we denote the position of the mouse pointer as $y(t)$, and its first and second derivatives with respect to time (i.e., velocity and acceleration) as $\dot{y}(t)$ and $\ddot{y}(t)$, respectively. The behavior is then described by the second-order lag equation
(2OL) $\ddot{y}(t) = u(t) - k\,y(t) - d\,\dot{y}(t),$
cf. Figure 4(b). We derive a discrete-time version of (2OL) via the forward Euler method, with a step size of $h = 0.002\,\mathrm{s}$, where the two milliseconds correspond to the mouse sensor sampling rate. From this, we obtain the matrices $A$ and $B$ for (1) as
(3) $A = \begin{pmatrix} 1 & h & 0 \\ -hk & 1 - hd & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad B = \begin{pmatrix} 0 \\ h \\ 0 \end{pmatrix}.$
This process is similar to the one used by Todorov [33].
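The discretization step can be illustrated with a short numerical sketch. It assumes that the continuous-time dynamics have the form $\ddot{y} = u - k y - d \dot{y}$ with unit mass and that the state is ordered as (position, velocity, target). The function names and all parameter values below are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def second_order_lag_matrices(k, d, h=0.002):
    """Forward-Euler discretization of y'' = u - k*y - d*y' for the state (y, v, T)."""
    A = np.array([[1.0,    h,           0.0],
                  [-h * k, 1.0 - h * d, 0.0],
                  [0.0,    0.0,         1.0]])  # the target T is carried along unchanged
    B = np.array([0.0, h, 0.0])                 # the control only enters the velocity
    return A, B

def simulate(A, B, x1, controls):
    """Iterate x_{n+1} = A x_n + B u_n and return all states as rows."""
    states = [np.asarray(x1, dtype=float)]
    for u in controls:
        states.append(A @ states[-1] + B * u)
    return np.array(states)

# Illustrative values only (hypothetical, not fitted to any user).
k, d, T = 40.0, 12.0, 0.2
A, B = second_order_lag_matrices(k, d)

# A constant control u = k*T makes the target position an equilibrium.
traj = simulate(A, B, [0.0, 0.0, T], [k * T] * 2000)
```

With a constant control of $kT$, the simulated position settles at the target, which is the behavior attributed to the second-order lag with equilibrium control in Section 2.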
Next, we design the cost function that we assume the user to minimize, based on our modeling assumptions. We want to penalize the jerk and the distance to the target. Ideally, no distance costs should occur within the target, which is a box with target width $W$. Unfortunately, this is infeasible in our LQR setting, where we need cost terms to be quadratic. To circumvent this limitation, we construct the distance costs such that we have lower costs inside the target and higher costs outside. At time step $n$, the remaining distance to the target is given by $D_n = |T - y_n|$, and we define the resulting distance costs as the square of that:
(4) $D_n^2 = (T - y_n)^2.$
As in Todorov [33], the jerk in our case corresponds to the derivative of the control $u$. We call $j_n$ the approximation of the jerk at time step $n$ obtained by backward differences, i.e., $j_n = (u_n - u_{n-1})/h$. We square this term to get nonnegative values only. A weight factor $r > 0$ describes how important the jerk is compared to the positional error (4). Thus, our jerk costs are
(5) $r\,j_n^2 = r\,\frac{(u_n - u_{n-1})^2}{h^2}.$
Formally, this approach requires a value $u_0$ to be chosen, which we will explain later.
Our overall cost function will depend on different summations of the distance costs (4) and the jerk costs (5) over multiple time steps. In order to design a cost function that explains user behavior best, we explore three different cost functions of this type later in the paper.
In conclusion, we model the process of pointing through the following optimal control problem:
(OCP) $\min_{u = (u_n)_n} J(x, u) \quad \text{subject to} \quad x_{n+1} = A x_n + B u_n$
for a given initial control $u_0$ and initial state $x_1$, and where the matrices $A$ and $B$ are given by (3) and the function $J$ is some summation of (4) and (5) over multiple time steps.
We assume that the user computes the optimal control $u$, which we denote by $u^*$, in a feedback manner.
It has been proven that for these kinds of problems the optimal control depends linearly on the state [10].
In our case, the optimal control $u_n^*$ can be calculated simply by multiplying a matrix $K_n$ with the state $x_n$, extended by the previous control $u_{n-1}^*$:
(6) $u_n^* = -K_n \begin{pmatrix} x_n \\ u_{n-1}^* \end{pmatrix}.$
The matrix $K_n$ is called the feedback gain at time step $n$. It can be computed directly, given the matrices $A$, describing the mouse pointer dynamics, and $B$, describing how control influences the mouse pointer, and the cost function $J$. This is done by solving the appropriate Discrete Riccati Equation, see [33, Theorem 7].
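The computation of the feedback gains can be sketched with a standard finite-horizon backward Riccati recursion. This is a minimal sketch under stated assumptions, not the authors' implementation: the state is augmented by the previous control, the decision variable is the control increment $w_n = u_n - u_{n-1}$ (so that the jerk cost becomes a plain quadratic control cost $r\,(w_n/h)^2$), distance costs are summed over all time steps, and the sign convention is $w_n = -K_n z_n$. All names and parameter values are hypothetical.

```python
import numpy as np

def lqr_gains(A, B, Q, R, N):
    """Backward Riccati recursion for the cost sum_n (z_n' Q z_n + R * w_n^2)."""
    P = Q.copy()                                   # terminal cost-to-go
    gains = [None] * N
    for n in range(N - 1, -1, -1):
        BtP = B.T @ P                              # shape (1, m)
        K = (BtP @ A) / (R + (BtP @ B).item())     # feedback gain, shape (1, m)
        P = Q + A.T @ P @ (A - B @ K)              # cost-to-go one step earlier
        gains[n] = K
    return gains

# Illustrative values only (hypothetical, not fitted parameters).
h, k, d, r = 0.002, 40.0, 12.0, 1e-8
T, N = 0.2, 500

# Augmented state z = (y, v, T, u_prev); control increment w = u - u_prev.
A = np.array([[1.0,    h,         0.0, 0.0],
              [-h * k, 1 - h * d, 0.0, h],
              [0.0,    0.0,       1.0, 0.0],
              [0.0,    0.0,       0.0, 1.0]])
B = np.array([[0.0], [h], [0.0], [1.0]])
q = np.array([[-1.0, 0.0, 1.0, 0.0]])              # q @ z = T - y
Q = q.T @ q                                        # distance cost (T - y)^2
R = r / h**2                                       # jerk cost r * (w / h)^2

gains = lqr_gains(A, B, Q, R, N)
z = np.array([0.0, 0.0, T, 0.0])                   # at rest, initial control 0
positions = [z[0]]
for K in gains:
    w = -(K @ z).item()                            # control change, linear in the state
    z = A @ z + B[:, 0] * w
    positions.append(z[0])
```

Using the control increment as the decision variable avoids a cross term between state and control in the cost, which keeps the recursion in its textbook form.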
The main question now is whether this optimal feedback corresponds to users' behavior, i.e., if our approach is suitable to describe pointing tasks. For this purpose, we note that there are several free parameters that we can choose: the spring stiffness $k$, the damping $d$, and the jerk weight $r$. The goal is to choose these parameters such that users' behavior is approximated best.
5 Parameter fitting
In contrast to the non-deterministic E-LQG model of Todorov [33], one main strength of our deterministic model is that we can imitate user data without information about the end time of the movement. In addition, the calculation of optimal parameters is simplified by eliminating uncertainties. In this way, our model can replicate the behavior of a particular user in a particular task. To this end, we need to fit the free parameters $k$, $d$, and $r$ to the data. We denote the set of these parameters by $\Lambda$. The goal is to find the optimal set, $\Lambda^*$, in the sense that our model, with parameters $\Lambda^*$, yields a pointer trajectory that is as similar as possible to that of the user. To achieve this, we measure the difference between the model trajectory $y^{\Lambda}$ and the user trajectory $y^{\mathrm{USER}}$ using the sum squared error (SSE):
(7) $\mathrm{SSE}(\Lambda) = \sum_{n} \left( y_n^{\Lambda} - y_n^{\mathrm{USER}} \right)^2.$
We then apply the least squares (LSQ) algorithm depicted in Figure 5 to find the optimal parameter set $\Lambda^*$ minimizing (7).
Least-squares-based algorithms may converge to local minima and not find a global minimum. Therefore, we execute the whole fitting process several times for randomly chosen starting parameter sets. According to our simulations, 100 such sets sufficed to provide results that would not improve further by iterating on more starting parameter sets.
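The multi-start strategy can be sketched as follows. To keep the example short and dependency-free, two substitutions are made and should be read as assumptions: the fitted model is the simpler second-order lag with constant control $u = kT$ (two parameters) rather than the LQR model, and a plain compass search replaces the least squares algorithm of Figure 5. The "user" trajectory is synthetic, and all function names and parameter ranges are hypothetical.

```python
import numpy as np

def simulate_2ol_eq(k, d, T, n_steps, h=0.002):
    """Position of a forward-Euler second-order lag with constant control u = k*T."""
    y, v = 0.0, 0.0
    out = np.empty(n_steps)
    for n in range(n_steps):
        out[n] = y
        y, v = y + h * v, v + h * (k * T - k * y - d * v)
    return out

def sse(params, y_user, T):
    """Sum of squared position errors between model and user trajectory."""
    k, d = params
    return float(np.sum((simulate_2ol_eq(k, d, T, len(y_user)) - y_user) ** 2))

def local_search(cost, start, rel_step=0.25, tol=1e-4, max_iter=200):
    """Compass search: accept improving +/- steps, halve the step otherwise."""
    best = np.array(start, dtype=float)
    best_cost = cost(best)
    for _ in range(max_iter):
        if rel_step <= tol:
            break
        improved = False
        for i in range(len(best)):
            for sign in (1.0, -1.0):
                cand = best.copy()
                cand[i] *= 1.0 + sign * rel_step
                c = cost(cand)
                if c < best_cost:
                    best, best_cost, improved = cand, c, True
        if not improved:
            rel_step /= 2.0
    return best, best_cost

def multi_start_fit(cost, lower, upper, n_starts, seed=0):
    """Run the local search from random starting points and keep the best result."""
    rng = np.random.default_rng(seed)
    starts = rng.uniform(lower, upper, size=(n_starts, len(lower)))
    results = [local_search(cost, s) for s in starts]
    return min(results, key=lambda res: res[1])

T = 0.2
y_user = simulate_2ol_eq(30.0, 8.0, T, 500)   # synthetic stand-in for recorded data
cost = lambda p: sse(p, y_user, T)
best_params, best_sse = multi_start_fit(cost, [5.0, 1.0], [100.0, 30.0], n_starts=5)
```

Only five starting points are used here to keep the run time small; the text above reports that 100 starting parameter sets were sufficient for the actual fitting process.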
6 Pointing task and dataset
To evaluate our model, we use the Pointing Dynamics Dataset. Task, apparatus, and experiment are described in detail in [26]. The dataset contains the mouse trajectory for a reciprocal pointing task in 1D for ID 2, 4, 6, and 8.
Pointing movements almost always start with a reaction time, in which velocity and acceleration of the pointer are close to zero. In real computer usage, the user usually takes some time to decide whether to move the mouse and to locate the target before initiating the movement. Therefore, one could speak of the movement beginning once the acceleration of the pointer reaches a certain threshold.
In the Pointing Dynamics Dataset we use, the trial started immediately when the previous trial was finished, i.e., after the mouse click, not when the user initiated the next movement. This results in a considerable variation in reaction times. Since some variants of our approach as well as the methods from the literature we use for comparison cannot properly handle reaction times, in each trial we ignore the data before the user starts moving. To be exact, we drop all frames before the acceleration reaches of its maximum/minimum value (depending on the movement direction) for the first time in each trial.
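The trimming rule described above can be sketched as a small helper. The threshold fraction is passed in as the parameter `frac` because its value is not reproduced here; the function name, the `rightward` flag, and the toy acceleration values are hypothetical, and the sketch assumes that the peak acceleration is positive for rightward and negative for leftward movements.

```python
import numpy as np

def movement_onset(acc, frac, rightward=True):
    """Index of the first frame whose acceleration reaches `frac` of its extreme value."""
    acc = np.asarray(acc, dtype=float)
    if rightward:
        hits = np.nonzero(acc >= frac * acc.max())[0]   # fraction of the maximum
    else:
        hits = np.nonzero(acc <= frac * acc.min())[0]   # fraction of the minimum
    return int(hits[0])

# Toy acceleration series (hypothetical values, arbitrary units).
acc = [0.0, 0.0, 0.1, 0.5, 1.0, 0.3, -1.0]
onset = movement_onset(acc, frac=0.4)
trimmed = acc[onset:]          # frames before the onset are dropped
```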
Moreover, we ignore user mistakes by dropping the failed and the following trial. From all other trials of all participants and all tasks – 7732 trajectories in total – we have removed another 30 for which the optimally fitted damping parameter was an outlier (more than three standard deviations from the mean). This was necessary due to numerical instabilities that occurred for these parameters, leading to erroneous calculations of the optimal control. All remaining 7702 trajectories are used in the later evaluation.
We use the raw, unfiltered position data in our parameter fitting process to avoid artifacts. The dataset also contains derivatives of user trajectories, which were computed by differentiating the polynomials of a Savitzky-Golay filter of degree 4 and frame size 101 [26]. We use this (filtered) data only for the computation of the reference control (see the next section) and for illustration purposes.
For the following plots, unless stated otherwise, we display one representative user trajectory, namely the movement to the right of participant 1 for the ID 8 task with 765 px distance and 3 px target width. For comparison and validation, the plots of all 7702 trajectories are provided in the supplementary material.
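As a quick consistency check of the task parameters, the Index of Difficulty of the representative trial can be computed directly. The sketch assumes the Shannon formulation $\mathrm{ID} = \log_2(D/W + 1)$; the function name is hypothetical.

```python
import math

def index_of_difficulty(distance, width):
    """Shannon formulation of the Index of Difficulty, in bits (assumed here)."""
    return math.log2(distance / width + 1.0)

# Representative trial: 765 px distance, 3 px target width.
id_example = index_of_difficulty(765, 3)
```

Under this assumption, the trial with 765 px distance and 3 px width evaluates to exactly 8 bits, matching the ID 8 condition named above.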
7 Iterative design of the cost function
In this section we describe the iterative design of our cost function that is utilized in the algorithm depicted in Figure 5. The three resulting approaches are denoted by 2OL-LQR with the corresponding numbering.
7.1 First Iteration: Distance Costs at Endpoint (2OL-LQR$_1$)
In our first iteration we use a cost function similar to the one used by Todorov [33] for the E-LQG model. In this function, jerk costs occur at every step. Distance costs, however, only occur in the time step in which the mouse is clicked (time step $N$). In particular, no distance costs occur at other time steps. Thus, the cost function is given by
(8) $J_1 = D_N^2 + \sum_{n} r\,j_n^2,$
where $D_N$ is the remaining distance to the target center at the end of the movement, $r$ is the weight of the jerk, and $j_n$ is the jerk at time step $n$.
The initial pointer position and velocity are set from the data, i.e., .
Although the choice of $u_0$ does not have a direct impact on the system dynamics, the trajectory heavily depends on its value.
This is due to penalizing the deviation of $u_1$ from $u_0$, which carries over to $u_2$, and so on.
The approach of using cost function (8) suffers from two major problems. First, as illustrated in Figure 6, the generated trajectories do not fit our data. In particular, the target is reached only at exactly the time of the mouse click. In contrast, our data shows that for high IDs, the users reach the vicinity of the target much earlier and then spend considerable time with small corrective submovements close to the target. The reason for this different behavior is that the cost function (8) sets the incentive to settle at the target only at the final time step $N$, while the jerk is penalized in every time step.
The second problem is that the cost function must include the exact time of the mouse click a priori. This makes the cost function very difficult to use for the simulation of human behavior in pointing tasks, if we cannot or do not want to prescribe a specific clicking time.
Hence, we propose a slightly modified cost structure in the LQR algorithm to take these considerations into account.
7.2 Second Iteration: Summed Distance Costs (2OL-LQR$_2$)
Both issues of the first iteration can be attributed to the fact that the remaining distance to target is only penalized at the time of the mouse click. Hence, we now penalize both the jerk and the distance between pointer position and target during the whole movement. Having summed costs over the entire movement is a standard approach in optimal control for such tracking tasks [6]. Our new cost function is
(9) $J_2 = \sum_{n} \left( D_n^2 + r\,j_n^2 \right),$
where $D_n$ is the remaining distance to the target center after time step $n$. This changes the meaning of $N$: Instead of being the exact clicking time, it can now be interpreted as the maximum time allowed for the task. Thus, it is now much less important to set $N$ accurately.
7.3 Third Iteration: Reaction Time (2OL-LQR$_3$)
As explained in the dataset section, we prefer to model only the movement itself, excluding the reaction time. Thus, our second iteration does not model reaction time. In some cases, however, it is desirable to model it explicitly. In this section we present an objective function that achieves this.
To this end, we add a parameter that should describe the reaction time. Due to our discrete time setting, we introduce as the discrete time step closest to . The idea is to adjust the cost function such that it incentivizes standing still until , to take reaction time into account.
We achieve this by splitting the cost function in two parts, before and after .
In the first part, we assume that users are not aware of the target position or have at least not processed all required information for initiating the motion.
In both cases, users should have no interest in changing their control.
Therefore, we do not penalize the distance to the desired position in that time frame and employ a much higher jerk penalization compared to the main movement phase.
More precisely, is replaced by , where is, for the most part, an approximation of a very large constant , e.g., .
In total, the cost function of 2OL-LQR$_3$ is
(10) 
There are several ways to obtain the reaction time and thus . One way is to determine it directly from the data, e.g., as the time when the acceleration passes a certain threshold. Another approach is to include it as an additional parameter to be optimized by the LSQ algorithm. We have chosen the latter approach and it works well according to our results.
8 Results
In this section we evaluate our main model, 2OL-LQR$_2$, by comparing it to the minimum-jerk model from [14] (MinJerk) and the second-order lag with equilibrium control from [26] (2OL-Eq). We also investigate how the parameters of our model change for different tasks (IDs) and different users. Finally, we demonstrate the ability of 2OL-LQR$_3$ to model movements including a reaction time.
8.1 Minimum-Jerk Model by Flash and Hogan (MinJerk)
Flash and Hogan [14] show that the minimum-jerk trajectory between two points is a fifth-degree polynomial. They assume that velocity and acceleration are zero at the start and at the end of the movement, and explain how the parameters of this polynomial can be computed under these conditions. However, in our dataset, velocity and acceleration are not necessarily zero, either at the beginning or at the end of the movement. Therefore, before we delve into the results, we present the following technique to derive the parameters of the minimum-jerk polynomial under these different conditions.
Deriving the MinJerk Polynomial
In [14], the minimum-jerk polynomial is given by
(11) 
with coefficients and where is the final time of the movement.
In our discrete-time setting, we evaluate the polynomial only at times , .
In this case, the final time is given by , where is the last time step
(12) 
The coefficients are computed from the data: is the initial position, i.e., . The coefficients and are computed from initial velocity and acceleration . Since we have to take into account factors arising from differentiation, we arrive at and . The remaining coefficients can be computed by solving the system of linear equations
(13) 
where , , and are, respectively, the pointer position, velocity, and acceleration at the final time.
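The coefficient computation for general boundary conditions can be sketched as follows. The sketch writes the polynomial in unnormalized time, $y(t) = \sum_{i=0}^{5} c_i t^i$, which is an assumption and may differ from the parametrization used in (11) to (13); the function and variable names are hypothetical. The first three coefficients follow from the initial position, velocity, and acceleration, and the remaining three from a $3 \times 3$ linear system given by the final position, velocity, and acceleration.

```python
import numpy as np

def min_jerk_coefficients(y0, v0, a0, yf, vf, af, tf):
    """Coefficients c_0..c_5 of y(t) = sum_i c_i * t**i matching both boundary states."""
    c0, c1, c2 = y0, v0, a0 / 2.0          # factors 1, 1, 1/2 arise from differentiation
    M = np.array([[tf**3,     tf**4,      tf**5],        # position at tf
                  [3 * tf**2, 4 * tf**3,  5 * tf**4],    # velocity at tf
                  [6 * tf,    12 * tf**2, 20 * tf**3]])  # acceleration at tf
    rhs = np.array([yf - c0 - c1 * tf - c2 * tf**2,
                    vf - c1 - 2 * c2 * tf,
                    af - 2 * c2])
    c3, c4, c5 = np.linalg.solve(M, rhs)
    return np.array([c0, c1, c2, c3, c4, c5])

# Rest-to-rest unit movement in unit time: the classical 10, -15, 6 polynomial.
c_rest = min_jerk_coefficients(0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0)

# Nonzero boundary velocity and acceleration, as can occur in recorded data.
c_moving = min_jerk_coefficients(0.0, 0.2, 0.0, 1.0, -0.1, 0.5, 0.8)
midpoint = np.polyval(c_rest[::-1], 0.5)   # np.polyval expects highest degree first
```

For zero boundary velocity and acceleration, the result reduces to the well-known rest-to-rest minimum-jerk profile, which serves as a sanity check of the linear system.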
Results for MinJerk
The MinJerk model has been derived from data of an experiment that did not involve any corrective submovements [14]. This leaves two possibilities to fit the model to our data, which does show extensive corrective submovements. If MinJerk is used for modeling the entire movement, i.e., until time step $N$, the fit is very poor (see Figure 10; dotted line). Instead of a quick movement towards the target with extensive corrective submovements, as in our data, the model predicts a slow, smooth movement, reaching the target only at the time of the mouse click.
Therefore, we use MinJerk for only the first, rapid movement towards the target (the "surge"). Similar to [26], we determine the end of the surge ( in Figure 10) from the data as the first zero-crossing in the acceleration time series after the deceleration (for movements to the left: acceleration) phase. After that, we assume that the pointer does not move. As illustrated in Figure 10 (blue solid line), this results in a good fit of the surge phase, at least for movements that exhibit a clear surge phase. However, the target is not reached, causing a poor overall fit.
In conclusion, MinJerk is a good model for the surge phase but not suitable for describing motions that contain extensive corrective submovements.
8.2 Second-order Lag Equilibrium Control (2OLEq)
The 2OLEq model is a discrete version of (2OL) with . It is given by the system dynamics with matrices and from (3) and initial condition. With this particular choice of control, the pointer moves towards the target and stays there. The target position , together with zero velocity and acceleration, constitutes an equilibrium in this case; hence the name “equilibrium control”. This constant control is the main difference to our approach, in which the control values are optimized with respect to some cost function .
For the 2OLEq model, we optimize the spring stiffness and the damping with the same parameter fitting process and the same SSE objective function (7) that we use for our 2OLLQR approach.
The behavior of the 2OLEq is shown in Figure 15. Visually, the model captures user behavior well in terms of pointer position, cf. Figure 15(a). The velocity time series depicted in Figure 15(b), however, is asymmetric in the 2OLEq case, while the user shows a more symmetric, bell-shaped velocity profile. The biggest difference appears in the acceleration time series. The user performs a symmetric and smooth N-shaped acceleration. In contrast, the acceleration of the 2OLEq jumps instantaneously at the start of the movement, and then rapidly declines. This can be explained with the physical interpretation of the 2OLEq as a spring-mass-damper system: since the control is constant in this model, as soon as the system is released, the spring instantaneously accelerates the system with a force that is proportional to the extension of the spring. Because human muscles cannot build up force instantaneously [30], this behavior is not physically plausible.
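The initial acceleration jump can be reproduced with a small simulation. The sketch below is an assumption-laden illustration, not the matrices from (3): it uses a unit-mass spring-mass-damper, a forward-Euler discretization with step `h`, and a control held constant at the target position. All names and parameter values are hypothetical.

```python
def simulate_2ol_eq(p0, target, k, d, h, steps):
    """Simulate a second-order lag with constant control equal to the target.

    Assumed dynamics (unit mass, forward Euler):
        acceleration = k * (target - position) - d * velocity
    Returns the position, velocity, and acceleration time series.
    """
    p, v = p0, 0.0  # the pointer starts at rest
    pos, vel, acc = [], [], []
    for _ in range(steps):
        a = k * (target - p) - d * v
        pos.append(p)
        vel.append(v)
        acc.append(a)
        p, v = p + h * v, v + h * a
    return pos, vel, acc


pos, vel, acc = simulate_2ol_eq(p0=0.0, target=0.2, k=100.0, d=20.0, h=0.002, steps=2000)
# acc[0] equals k * (target - p0): the spring force is at its maximum at release.
```

In this sketch the very first acceleration sample is already the largest one of the whole movement, which is the behavior criticized above.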
8.3 Our Model 2OLLQR vs. MinJerk and 2OLEq
Qualitative Comparison
For the qualitative comparison, we performed a visual analysis of model behavior on the entire dataset. Although in the figures we illustrate a particular movement of a specific participant, we recall that the behavior is representative and the plots of all 12 participants and all 4 IDs are provided as supplementary material.
The behavior of our model 2OLLQR is shown in Figure 20. Overall, the model approximates the position rather well over the entire movement, cf. Figure 20(a). Corrective submovements, which start at around , are not replicated well by any of the three models (see Figures 10, 15, and 20). Our model slightly underestimates the maximum velocity and the velocity profile is less symmetric than the data. Similar effects can be observed in the acceleration, see Figure 20(c).
Compared to MinJerk, our model 2OLLQR explains the surge phase similarly well, while not quite capturing the symmetry observed in many acceleration time series, such as the one depicted in Figures 10, 15, and 20.
Compared to 2OLEq, our model captures position, velocity, and acceleration much better. The reason for this is that, in contrast to 2OLEq, the control time series shown in Figure 20(d) is not constant but changes over time. This often leads to a more N-shaped acceleration time series and a more bell-shaped velocity time series, as predicted by Flash and Hogan [14] and in many cases confirmed by our data.
ID 2 tasks play a special role, as they (usually) do not involve corrective submovements, see Figure 24. In this case, all three models match the position data. Visible differences in the fit appear in the velocity and acceleration data.
Quantitative Comparison
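In the following, we provide a quantitative comparison across all 7702 trajectories. The resulting SSE values of all three models are shown in Figure 27(a), on a logarithmic scale. In addition, we measure the Maximum Error between model and user trajectories, i.e., the largest absolute deviation between simulated and observed pointer position over all time steps $n$,
$$\max_{n} \left| x_n^{\text{model}} - x_n^{\text{user}} \right|, \qquad (14)$$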
which is depicted in Figure 27(b). As can be seen from both figures, our model 2OLLQR captures human behavior substantially better, in terms of both SSE and Maximum Error, than the 2OLEq and MinJerk models.
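Both measures are simple to compute for a single trajectory. The sketch below assumes that they are evaluated on the position time series and that model and user trajectories have the same length; the function names are hypothetical.

```python
def sse(model, user):
    """Sum of squared errors between simulated and observed positions."""
    if len(model) != len(user):
        raise ValueError("trajectories must have the same length")
    return sum((m - u) ** 2 for m, u in zip(model, user))


def max_error(model, user):
    """Largest absolute deviation between simulated and observed positions."""
    if len(model) != len(user):
        raise ValueError("trajectories must have the same length")
    return max(abs(m - u) for m, u in zip(model, user))
```

Unlike the SSE, the Maximum Error does not grow with the number of samples, so it is less sensitive to the duration of a movement.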
Kolmogorov-Smirnov tests showed that the distributions of SSE for the three models do not fit the assumption of normality (all values ). Thus, we carried out a Friedman Test (i.e., a nonparametric test equivalent to a repeated measures one-way ANOVA). The main factor included in the analysis was which model was used: 2OLLQR, 2OLEq, or MinJerk. The significance level was set to . The test indicated that the SSE between the three models was significantly different (, , ).
Additional Wilcoxon Signed Rank tests with Bonferroni corrections showed that the SSE was significantly lower in the 2OLLQR model when compared to the 2OLEq model (, ), or to the MinJerk model (, ). The findings are analogous for the maximum deviations of the simulated trajectories from the data (Friedman Test, , , ), with Wilcoxon Signed Rank tests () showing that 2OLLQR approximates user trajectories significantly better than both 2OLEq and MinJerk. Summary statistics of both measures for all three models can be found in Table 1.
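Table 1: Summary statistics of SSE and Maximum Error for all three models (SE: standard error, SD: standard deviation).
Model | SSE Mean | SSE SE | SSE SD | Max. Error Mean | Max. Error SE | Max. Error SD
2OLLQR | 0.03 | 0.001 | 0.10 | 0.014 | 0.0001 | 0.009
2OLEq | 0.11 | 0.002 | 0.16 | 0.03 | 0.0001 | 0.013
MinJerk | 0.21 | 0.006 | 0.56 | 0.035 | 0.0025 | 0.022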
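For readers who want to reproduce this kind of analysis, the Friedman test for three related samples can be sketched without external libraries. This is a generic textbook version, not the authors' analysis script: ties receive average ranks but no tie correction is applied, and the p-value uses the fact that the chi-square survival function with two degrees of freedom is exp(-q / 2). The function names are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks 1..k of the values; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for m in range(i, j + 1):
            ranks[order[m]] = avg
        i = j + 1
    return ranks


def friedman_three(a, b, c):
    """Friedman statistic and p-value for three related samples (no tie correction)."""
    n, k = len(a), 3
    rank_sums = [0.0, 0.0, 0.0]
    for row in zip(a, b, c):  # one row per trajectory, one column per model
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    p = math.exp(-q / 2.0)  # chi-square survival function for k - 1 = 2 degrees of freedom
    return q, p
```

Here each of the three arguments would hold one error value per trajectory (for example, the SSE of one model), in the same trajectory order. Pairwise follow-up tests with a Bonferroni correction would multiply each pairwise p-value by the number of comparisons.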
8.4 Parameter Distribution of 2OLLQR



Figures 31(a)-(c) (left) show the ranges of the three 2OLLQR parameters , , and , optimized for the user trajectories of all tasks with , grouped by participants.
Figures 31(a)-(c) (right) illustrate the ranges of the parameters , , and , optimized for the user trajectories of all participants, grouped by ID of the task. All three parameters show characteristic variations by ID. The spring stiffness increases noticeably from ID 4 to ID 6. The damping parameter is considerably lower for ID 2 tasks. This confirms the observation that participants show oscillatory behavior in tasks with low IDs, as reported before in [17, 4, 26]. These oscillations also play a role in the large variance of for ID 2. For the other IDs, declines only slightly with ID, i.e., the effort is almost independent of the task difficulty.
The impact of the parameters on model behavior is, however, not straightforward: a change in one of the parameters not only influences the movement directly, but also results in a different optimal control sequence, which in turn affects the solution trajectory.
8.5 Modeling Individual Movements Including Reaction Time
Our model 2OLLQR does not take reaction time into account. However, this is possible with our third iteration, 2OLLQR. Only in this section, we thus explicitly do not drop any frames at the beginning of the trials. Results for the same representative trial as before are shown in Figure 36. Clearly, there is no change in control and thus in acceleration before time , which can loosely be interpreted as a reaction time. Looking closely at the initiation of the acceleration, we observe that our model initiates the movement later than the user but with a higher acceleration. The reason is that the optimizer treats as a free parameter to minimize the SSE of the entire position time series. Thus, while movements including reaction time can be approximated by 2OLLQR quite well, the parameter itself does not necessarily resemble the true reaction time.
9 Discussion and Future Work
In this paper we have explored a simple OFC model for mouse pointer movements. We assumed optimal closed-loop behavior with respect to a quadratic cost function (penalizing jerk and distance) and subject to linear system dynamics with no delay and no noise. These simplifications lead to a number of limitations of our model.
First, none of the models that we compared captures corrective submovements well. Although our models can recreate corrective submovements (e.g., in Figure 36), these are smaller in amplitude than those of the users. Future research should put more emphasis on replicating these submovements in more detail by extending the model.
Second, due to its deterministic nature, our model cannot replicate the variability of human movements. It produces a typical movement of a specific user, but it produces the same movement every time. In future work we plan to explore stochastic models to better capture human variability.
Third, we note that although the cost function (9) of our main model, 2OLLQR, incentivizes a short(er) movement time due to summed distance costs, it does not explicitly minimize the total movement time. If the latter is desired (e.g., as part of the experimental design), the model can be extended in future work by modifying the cost function using the Cost of Time theory.
Despite these limitations, our 2OLLQR model matches our data well, and significantly better than 2OLEq or MinJerk. We achieve this with only three parameters, which have an easily understandable interpretation as spring stiffness , damping , and effort, related to . We only need these parameters, the target position, and initial conditions. In contrast to MinJerk, our model does not need to know the point in time and space where the surge movement ends. Most importantly, our model does not require knowledge about the exact time when the target is reached. Compared to 2OLEq, our model yields a more bell-shaped velocity time series and a more N-shaped acceleration time series, without implausibly high acceleration at the start of the movement. In addition, our model explains how users differ from each other in properties (stiffness, damping) and effort.
The biggest strength is that the OFC perspective makes our model very flexible and easily extensible. In particular, it can readily be extended to other instructions, such as emphasizing speed vs. comfort. It can also be extended to different tasks, such as 2D or 3D pointing, 6 DoF docking tasks, etc.
It is important to highlight that our model is a pure end-effector model of the movement of the mouse pointer. We do not explicitly model biomechanics, sensor characteristics, or transfer functions in the operating system. Incorporating these is possible, but it would yield nonlinear system dynamics and thus a more complex model. Our simple model already works quite well for modeling mouse pointer movements. This reinforces our argument that OFC is a promising theory to better understand movement, such as movement of the mouse pointer, during interaction and is thus a valuable addition to the HCI community.
10 Conclusion
In this paper, we have modeled mouse pointer movements from an optimal control perspective. More precisely, we have investigated the Linear-Quadratic Regulator with various objective functions. We found that our model 2OLLQR fits our data significantly better than either 2OLEq [26] or MinJerk [14]. We require a number of simplifying assumptions (linear dynamics, quadratic costs). Despite these, mouse pointer movements of real users can be explained well. Moreover, this is achieved with only three intuitively interpretable parameters, which allow us to characterize users by properties (stiffness, damping) and effort. In conclusion, we believe that the optimal feedback control perspective is a strong, flexible, and very promising direction for HCI, which should be further explored in the future.
Appendix A 2OLLQR equations
The 2OLLQR model can be described as the time-discrete linear-quadratic optimal control problem with finite horizon
(15a)  
where with satisfies  
(15b)  
with sampling time and system dynamics matrices  
(15c) 
based on the (approximated) second-order lag.
The state cost matrices are defined by
(16) 
which implies
(17) 
i.e., the distance between mouse and target position is quadratically penalized at every time step . In our case of one-dimensional pointing tasks, the control cost matrices are scalar and given by
(18) 
which yields
(19) 
i.e., the squares of the “jerk” terms are penalized with some jerk weight at every time step .
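To see what this penalty implies, it helps to expand a single term. Writing the jerk weight as $r$ and the control at time step $n$ as $u_n$ (symbol names assumed here for illustration, and ignoring any scaling by the sampling time), each penalized term reads:

```latex
r\,(u_n - u_{n-1})^2 \;=\; r\,u_n^2 \;-\; 2r\,u_n u_{n-1} \;+\; r\,u_{n-1}^2
```

The mixed term $-2r\,u_n u_{n-1}$ couples consecutive control values, so the cost of choosing $u_n$ cannot be evaluated without knowing $u_{n-1}$.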
Because of the penalization of the differences in control, each control value of the optimal control sequence minimizing given some initial state and some initial control explicitly depends on the preceding control value .
For this reason, we need to introduce information vectors
(20) 
Furthermore, we expand the system matrices and by an additional zero row and column and add an additional one to the control matrix in order to propagate the previous control :
(21) 
Using this notation, (15) is equivalent to the following optimal control problem:
(22a)  
where with satisfies  
(22b) 
with sampling time and where applies.
Moreover, we define
(23) 
which implies
(24) 
i.e., the matrices defined in (23) extract the state and the control, respectively, from the information vector for any .
It can be shown that the unique solution to the optimization problem (22) (and thus to the original optimization problem (15) as well) is given by
(25) 
where the symmetric matrices can be determined by solving the Modified Discrete Riccati Equations
(26a)  
for backwards in time with initial value  
(26b) 
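The construction above (augmenting the state with the previous control and solving a Riccati recursion backwards in time) can be sketched numerically. The code below is a generic finite-horizon LQR with a cross term between information vector and control, which may differ in form from the modified equations (26). It rests on several assumptions that are not taken from the paper: a unit-mass second-order lag discretized by forward Euler, coordinates shifted so that the target is at the origin, a terminal cost on the final distance only, an initial previous control equal to the start position, and hypothetical names and parameter values throughout.

```python
def lqr_gains(k, d, r, h, steps):
    """Backward Riccati recursion for the information vector z = (error, velocity, previous control).

    Stage cost (assumed): error**2 + r * (w - previous control)**2, which equals
    z'Qz + 2 * (Nc . z) * w + r * w**2 with Q = diag(1, 0, r) and Nc = (0, 0, -r).
    """
    A = [[1.0, h, 0.0], [-h * k, 1.0 - h * d, 0.0], [0.0, 0.0, 0.0]]  # zero row/column added
    B = [0.0, h * k, 1.0]                                             # the 1 propagates the control
    Q = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, r]]
    Nc = [0.0, 0.0, -r]
    P = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]           # terminal cost (assumed)
    idx = range(3)
    gains = []
    for _ in range(steps):
        PA = [[sum(P[i][m] * A[m][j] for m in idx) for j in idx] for i in idx]
        PB = [sum(P[i][m] * B[m] for m in idx) for i in idx]
        S = r + sum(B[i] * PB[i] for i in idx)                        # scalar, since control is 1-D
        G = [sum(B[i] * PA[i][j] for i in idx) + Nc[j] for j in idx]  # B'PA + Nc'
        APA = [[sum(A[m][i] * PA[m][j] for m in idx) for j in idx] for i in idx]
        P = [[Q[i][j] + APA[i][j] - G[i] * G[j] / S for j in idx] for i in idx]
        gains.append([g / S for g in G])
    gains.reverse()  # gains[n] now belongs to time step n
    return gains


def simulate_lqr(p0, target, k, d, r, h, steps):
    """Closed-loop simulation; returns positions (steps + 1) and accelerations (steps)."""
    gains = lqr_gains(k, d, r, h, steps)
    z = [p0 - target, 0.0, p0 - target]  # at rest; previous control holds the start position
    pos, acc = [], []
    for K in gains:
        w = -sum(K[i] * z[i] for i in range(3))  # optimal control deviation from the target
        a = k * (w - z[0]) - d * z[1]
        pos.append(z[0] + target)
        acc.append(a)
        z = [z[0] + h * z[1], z[1] + h * a, w]
    pos.append(z[0] + target)
    return pos, acc


pos, acc = simulate_lqr(p0=0.0, target=0.2, k=50.0, d=15.0, r=1e4, h=0.002, steps=1000)
```

Because each control value is the previous one plus a penalized increment, the acceleration in this sketch builds up gradually from the start of the movement instead of jumping to its maximum, in contrast to a constant control.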
Footnotes
 This extension is required in order to penalize the jerk as in (5).
 For example, setting might result in an implausibly high acceleration at the start of the movement, similar to 2OLEq.
 To aid the LSQ optimization process, we use a smoothed version of the piecewise constant sequence of jerk weights and , i.e., for .
 We specifically do not use for reasons elaborated below.
 There are some cases in which asymmetric acceleration time series do occur. Our model 2OLLQR is able to approximate these profiles reasonably well and is not limited to, e.g., an N-shaped acceleration profile, as is the case with MinJerk.
 The parameters for ID 2 tasks differ from those of tasks. Due to limited space, we focus on the latter in these plots. For the sake of completeness, the figures including ID 2 tasks can be found in the supplementary material.
References
 Takeshi Asano, Ehud Sharlin, Yoshifumi Kitamura, Kazuki Takashima, and Fumio Kishino. 2005. Predictive interaction using the delphian desktop. In Proceedings of the 18th annual ACM symposium on User interface software and technology. ACM, 133–141.
 Bastien Berret and Frédéric Jean. 2016. Why Don’t We Move Slower? The Value of Time in the Neural Control of Action. Journal of Neuroscience 36, 4 (2016), 1056–1070. DOI:http://dx.doi.org/10.1523/JNEUROSCI.1921-15.2016
 Reinoud J. Bootsma, Laure Fernandez, and Denis Mottet. 2004. Behind Fitts’ law: kinematic patterns in goal-directed movements. International Journal of Human-Computer Studies 61, 6 (2004), 811–821.
 Daniel Bullock and Stephen Grossberg. 1988. Neural Networks and Natural Intelligence. Massachusetts Institute of Technology, Cambridge, MA, USA, Chapter Neural Dynamics of Planned Arm Movements: Emergent Invariants and Speed-accuracy Properties During Trajectory Formation, 553–622. http://dl.acm.org/citation.cfm?id=61339.61351
 Y. Chan and J.P. Maille. 1975. Extension of a linear quadratic tracking algorithm to include control constraints. IEEE Trans. Automat. Control 20, 6 (December 1975), 801–803. DOI:http://dx.doi.org/10.1109/TAC.1975.1101101
 Frederic Crevecoeur, Tyler Cluff, and Stephen H. Scott. 2014. The Cognitive Neurosciences, 5th ed. MIT Press, Cambridge, MA, USA, Chapter Computational Approaches for Goal-Directed Movement Planning and Execution, 461–477.
 E. R. F. W. Crossman and P. J. Goodeve. 1983. Feedback control of hand-movement and Fitts’ law. The Quarterly Journal of Experimental Psychology 35, 2 (1983), 251–278.
 Jörn Diedrichsen, Reza Shadmehr, and Richard B. Ivry. 2010. The coordination of movement: optimal feedback control and beyond. Trends in Cognitive Sciences 14, 1 (2010), 31–39. DOI:http://dx.doi.org/10.1016/j.tics.2009.11.004
 P. Dorato and A. Levis. 1971. Optimal linear regulators: The discrete-time case. IEEE Trans. Automat. Control 16, 6 (December 1971), 613–620. DOI:http://dx.doi.org/10.1109/TAC.1971.1099832
 Digby Elliott, Werner Helsen, and Romeo Chua. 2001. A century later: Woodworth’s (1899) two-component model of goal-directed aiming. Psychological bulletin 127 (06 2001), 342–57. DOI:http://dx.doi.org/10.1037//0033-2909.127.3.342
 Paul M. Fitts. 1954. The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology 47, 6 (1954), 381–391.
 Paul M. Fitts and James R. Peterson. 1964. Information capacity of discrete motor responses. Journal of experimental psychology 67, 2 (1964), 103.
 Tamar Flash and Neville Hogan. 1985. The Coordination of Arm Movements: An Experimentally Confirmed Mathematical Model. Journal of neuroscience 5 (1985), 1688–1703.
 J. Gori and O. Rioul. 2018. Information-Theoretic Analysis of the Speed-Accuracy Tradeoff with Feedback. In 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC). 3452–3457. DOI:http://dx.doi.org/10.1109/SMC.2018.00585
 Julien Gori, Olivier Rioul, and Yves Guiard. 2018. Speed-Accuracy Tradeoff: A Formal Information-Theoretic Transmission Scheme (FITTS). ACM Trans. Comput.-Hum. Interact. 25, 5, Article 27 (Sept. 2018), 33 pages. DOI:http://dx.doi.org/10.1145/3231595
 Yves Guiard. 1993. On Fitts’s and Hooke’s laws: Simple harmonic movement in upper-limb cyclical aiming. Acta psychologica 82, 1 (1993), 139–159.
 Christopher M. Harris and Daniel M. Wolpert. 1998. Signal-dependent noise determines motor planning. Nature 394, 6695 (1998), 780–784. DOI:http://dx.doi.org/10.1038/29528
 Bruce Hoff. 1994. A model of duration in normal and perturbed reaching movement. Biological Cybernetics 71, 6 (01 Oct 1994), 481–488. DOI:http://dx.doi.org/10.1007/BF00198466
 Bruce Hoff and Michael A. Arbib. 1993. Models of Trajectory Formation and Temporal Interaction of Reach and Grasp. Journal of Motor Behavior 25, 3 (1993), 175–192. DOI:http://dx.doi.org/10.1080/00222895.1993.9942048 PMID: 12581988.
 Y. Jiang, Z. Jiang, and N. Qian. 2011. Optimal control mechanisms in human arm reaching movements. In Proceedings of the 30th Chinese Control Conference. 1377–1382.
 Gary D. Langolf, Don B. Chaffin, and James A. Foulke. 1976. An Investigation of Fitts’ Law Using a Wide Range of Movement Amplitudes. Journal of Motor Behavior 8, 2 (1976), 113–128. DOI:http://dx.doi.org/10.1080/00222895.1976.10735061 PMID: 23965141.
 Zhe Li, Pietro Mazzoni, Sen Song, and Ning Qian. 2018. A Single, Continuously Applied Control Policy for Modeling Reaching Movements with and without Perturbation. Neural Computation 30, 2 (2018), 397–427. DOI:http://dx.doi.org/10.1162/neco_a_01040 PMID: 29162001.
 I. Scott MacKenzie. 1992. Fitts’ Law as a Research and Design Tool in Human-Computer Interaction. Human–Computer Interaction 7, 1 (1992), 91–139. DOI:http://dx.doi.org/10.1207/s15327051hci0701_3
 David E. Meyer, Richard A. Abrams, Sylvan Kornblum, Charles E. Wright, and J. E. Keith Smith. 1988. Optimality in human motor performance: Ideal control of rapid aimed movements. Psychological review 95, 3 (1988), 340.
 Jörg Müller, Antti Oulasvirta, and Roderick Murray-Smith. 2017. Control Theoretic Models of Pointing. ACM Trans. Comput.-Hum. Interact. 24, 4, Article 27 (Aug. 2017), 36 pages. DOI:http://dx.doi.org/10.1145/3121431
 Réjean Plamondon and Adel M. Alimi. 1997. Speed/accuracy tradeoffs in target-directed movements. Behavioral and brain sciences 20, 02 (1997), 279–303.
 Ning Qian, Yu Jiang, Zhong-Ping Jiang, and Pietro Mazzoni. 2013. Movement Duration, Fitts’s Law, and an Infinite-Horizon Optimal Feedback Control Model for Biological Motor Systems. Neural Computation 25, 3 (2013), 697–724. DOI:http://dx.doi.org/10.1162/NECO_a_00410 PMID: 23272916.
 Philip Quinn and Shumin Zhai. 2016. Modeling Gesture-Typing Movements. Human–Computer Interaction (2016), 1–47. DOI:http://dx.doi.org/10.1080/07370024.2016.1215922
 Richard A. Schmidt and Timothy D. Lee. 2005. Motor Control and Learning. Human Kinetics.
 Reza Shadmehr. 2010. Control of movements and temporal discounting of reward. Current Opinion in Neurobiology 20, 6 (2010), 726–730. DOI:http://dx.doi.org/10.1016/j.conb.2010.08.017 Motor systems, Neurobiology of behaviour.
 Reza Shadmehr and Steven P. Wise. 2005. The Computational Neurobiology of Reaching and Pointing. MIT Press.
 Emanuel Todorov. 1998. Studies of goal-directed movements. Massachusetts Institute of Technology. (1998).
 Emanuel Todorov. 2005. Stochastic Optimal Control and Estimation Methods Adapted to the Noise Characteristics of the Sensorimotor System. Neural Computation 17 (2005), 1084–1108.
 Emanuel Todorov and Michael I. Jordan. 2002. Optimal feedback control as a theory of motor coordination. Nature neuroscience 5, 11 (2002), 1226–1235.
 Y. Uno, M. Kawato, and R. Suzuki. 1989. Formation and control of optimal trajectory in human multijoint arm movement. Biological Cybernetics 61, 2 (01 Jun 1989), 89–101. DOI:http://dx.doi.org/10.1007/BF00204593
 Robert Sessions Woodworth. 1899. Accuracy of voluntary movement. The Psychological Review: Monograph Supplements 3, 3 (1899), i.
 Brian Ziebart, Anind Dey, and J. Andrew Bagnell. 2012. Probabilistic Pointing Target Prediction via Inverse Optimal Control. In Proceedings of the 2012 ACM International Conference on Intelligent User Interfaces (IUI ’12). ACM, New York, NY, USA, 1–10. DOI:http://dx.doi.org/10.1145/2166966.2166968