A Tutorial on Concentration Bounds for System Identification


Nikolai Matni, Stephen Tu

We provide a brief tutorial on the use of concentration inequalities as they apply to system identification of state-space parameters of linear time-invariant systems, with a focus on the fully observed setting. We draw upon tools from the theories of large deviations and self-normalized martingales, and provide both data-dependent and data-independent bounds on the learning rate.



I Introduction

A key feature in modern reinforcement learning is the ability to provide high-probability guarantees on the finite-data/time behavior of an algorithm acting on a system. The enabling technical tools used in providing such guarantees are concentration of measure results, which should be interpreted as quantitative versions of the strong law of large numbers. This paper provides a brief introduction to such tools, as motivated by the identification of linear-time-invariant (LTI) systems.
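As a quick numerical illustration of concentration as a "quantitative law of large numbers" (this example is not from the paper; the sample size n, deviation level eps, and trial count are arbitrary choices), the following Python sketch compares the empirical frequency with which a sample mean of bounded random variables deviates from its expectation against the Hoeffding bound 2 exp(-2 n eps^2):

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw n i.i.d. samples bounded in [0, 1]; Hoeffding's inequality gives
# P(|mean - E[X]| >= eps) <= 2 exp(-2 n eps^2), a quantitative LLN.
n, eps, trials = 1000, 0.05, 5000
samples = rng.uniform(0.0, 1.0, size=(trials, n))
deviations = np.abs(samples.mean(axis=1) - 0.5)

empirical = np.mean(deviations >= eps)        # observed failure frequency
hoeffding = 2.0 * np.exp(-2.0 * n * eps**2)   # theoretical upper bound

print(f"empirical: {empirical:.4f}, Hoeffding bound: {hoeffding:.4f}")
```

The empirical failure frequency sits well below the bound, as expected: Hoeffding holds for any bounded distribution, so it is conservative for any particular one.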

In particular, we focus on identifying the parameters (A, B) of the LTI system

x_{t+1} = A x_t + B u_t + w_t,  w_t ~ N(0, σ_w² I),   (1)
assuming perfect state measurements. This is in some sense the simplest possible system identification problem, making it the perfect case study for such a tutorial. Our companion paper [extended] shows how the results derived in this paper can then be integrated into self-tuning and adaptive control policies with finite-data guarantees. We also refer the reader to Section II of [extended] for an in-depth and comprehensive literature review of classical and contemporary results in system identification. Finally, we note that most of the results we present below are not the sharpest available in the literature, but are rather chosen for their pedagogical value.

The paper is structured as follows: in Section II, we study the simplified setting in which system (1) is defined for a scalar state, and data is drawn from independent experiments. Section III extends these ideas to the vector-valued setting. In Section IV, we study the performance of an estimator using all data from a single trajectory – this is significantly more challenging, as the covariates are strongly correlated. Finally, in Section V, we provide data-dependent bounds that can be used in practical algorithms.

II Scalar Random Variables

Consider the scalar dynamical system

x_{t+1} = a x_t + u_t,   (2)

for x_0 = 0 and a ∈ ℝ an unknown parameter. Our goal is to estimate a, and to do so we inject excitatory Gaussian noise via u_t ~ N(0, σ_u²). We run N independent experiments over a horizon of T+1 time-steps, and then solve for our estimate â via the least-squares problem

â = arg min_a Σ_{i=1}^N (x_{T+1}^{(i)} − a x_T^{(i)})².
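The estimation procedure just described can be sketched numerically. In the Python sketch below, the values a = 0.9, σ_u = 1, N = 500, and T = 10 are illustrative assumptions (any stable a and positive σ_u would do), and the least-squares estimator uses only the final transition (x_T, x_{T+1}) of each independent rollout:

```python
import numpy as np

rng = np.random.default_rng(0)

a_true = 0.9       # unknown scalar parameter (assumed value for this demo)
sigma_u = 1.0      # std-dev of the injected excitatory Gaussian noise
N, T = 500, 10     # number of independent rollouts and horizon length

# Roll out N independent trajectories of x_{t+1} = a x_t + u_t with x_0 = 0,
# recording the final transition (x_T, x_{T+1}) of each rollout.
x_T = np.zeros(N)
x_T1 = np.zeros(N)
for i in range(N):
    x = 0.0
    for t in range(T + 1):
        x_prev = x
        x = a_true * x + sigma_u * rng.standard_normal()
    x_T[i] = x_prev
    x_T1[i] = x

# Closed-form least-squares solution of
#   a_hat = argmin_a sum_i (x_{T+1}^(i) - a x_T^(i))^2
a_hat = np.dot(x_T, x_T1) / np.dot(x_T, x_T)
print(f"a_hat = {a_hat:.4f}")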
