Dynamic Team Theory of Stochastic Differential Decision Systems with Decentralized Noisy Information Structures via Girsanov’s Measure Transformation
In this paper, we present two methods which generalize static team theory to dynamic team theory, in the context of continuous-time stochastic nonlinear differential decentralized decision systems with relaxed strategies, which are measurable with respect to different noisy information structures. For both methods we apply Girsanov’s measure transformation to obtain an equivalent decision system under a reference probability measure, so that the observations and information structures available for decisions are not affected by any of the team decisions.
The first method is based on function space integration with respect to products of Wiener measures. It generalizes Witsenhausen’s  definition of equivalence between discrete-time static and dynamic team problems, and relates Girsanov’s theorem to the so-called “Common Denominator Condition and Change of Variables”.
The second method is based on stochastic Pontryagin’s maximum principle. The team optimality conditions are given by a “Hamiltonian System” consisting of forward and backward stochastic differential equations, and conditional variational Hamiltonians with respect to the information structure of each team member. Under global convexity conditions, we show that PbP optimality implies team optimality. We also obtain team and PbP optimality conditions for regular team strategies, which are measurable functions of decentralized information structures.
In addition, we show existence of team and PbP optimal relaxed decentralized strategies (conditional distributions), in the weak sense, without imposing convexity on the action spaces of the team members, and their realization by regular team strategies.
Key Words. Dynamic Team Theory, Stochastic, Decentralized, Existence, Path Integration, Maximum Principle, Girsanov.
2000 AMS Subject Classification 49J55, 49K45, 93E20.
Static Team Theory is a mathematical formalism for decision problems with multiple Decision Makers (DMs) who have access to different information and aim at optimizing a common pay-off or reward functional. It is often used to formulate decentralized decision problems, in which the decision making authority is distributed among a collection of agents or players, and the information available to the DMs to implement their actions is different. Static team theory and decentralized decision making originated in the fields of management, organizational behavior and government, with the work of Marschak and Radner [2, 3, 4]. However, its generalization to dynamic team theory has far reaching implications in all human activity, including science and engineering systems comprising multiple components, in which information available to the decision making components is either partially communicated to each other or not communicated at all, and decisions are taken sequentially in time. Dynamic team theory and decentralized decision making can be used in large scale distributed dynamical systems, such as transportation systems, smart grid energy systems, social network systems, surveillance systems, networked control systems, communication networks, financial markets, etc.
In general, decentralized decision making is a common feature of any system consisting of multiple local observation posts and control stations, where the acquisition of information and its processing is shared among the different observation posts, and the DM actions at the control stations are evaluated using different information, that is, the arguments in their control laws or policies are different. We call, as usual, “Information Structures or Patterns” the information available to the DMs at the control stations to implement their actions, and we call such information “Decentralized Information Structures” if the information available to the DMs at the various control stations is not identical for all DMs. Early work discussing the importance of information structures in decision making and its applications is found in [2, 3, 4, 5, 6, 7].
Since the late 1960’s several articles have been written on decentralized decision making and information structures, and their applications in communication and queuing networks, sensor networks, and networked control systems. Some of the early references are [5, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 1, 27, 28], while more recent ones are [29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41]. Among these references the most popular mathematical formalism is that of “Static Team Theory” developed by Marschak and Radner [2, 3, 4]. (Footnote 1: Static in the terminology of [2, 3, 4] means that all elements of the team problem are Random Variables; some authors call such problems dynamic if the information structures depend on the decisions [8, 9, 1].) The most successful example is the discrete-time Linear-Quadratic-Gaussian (LQG) decision problem with two DMs having access to one step-delay sharing information pattern [10, 12], with common and private information parts, where the explicit solution is obtained via completion of squares and dynamic programming, respectively.
Due to the inherent difficulty in applying Marschak’s and Radner’s [2, 3, 4] Static Team Theory to stochastic discrete-time dynamic decentralized decision problems, two methods have been proposed over the years. The first method is based on identifying conditions so that discrete-time stochastic dynamic team problems can be equivalently reduced to static team problems. The second method is based on applying dynamic programming, and identifying conditions so that Person-by-Person (PbP) optimality implies team optimality. (Footnote 2: PbP optimality treats the decentralized decision problem by fixing the strategies of all DMs except one.) The first method, put forward in [8, 9], is based on using precedence diagrams to represent sequential decisions and information structures in discrete-time stochastic dynamic team problems, to aid the analysis and computation of the optimal team strategies with partially nested information structures. In our opinion, even when the conditions suggested in these papers hold, it is not clear whether this approach is tractable, or whether it will provide any insight into specific discrete-time stochastic dynamic team problems. Along the same direction, and contrary to the belief held at the time, Witsenhausen in  claimed that for a broad class of problems, discrete-time stochastic dynamic decentralized decision problems, with finite decisions (including some continuous alphabet models), are no harder than Marschak’s and Radner’s [2, 3, 4] static team problems, by showing that such problems are equivalent to static problems. In Witsenhausen’s  terminology a discrete-time stochastic dynamic decentralized decision problem is called “static” if it can be transformed to an equivalent problem such that the observations available for any one decision do not depend on the other decisions. (Footnote 3: We believe the proper and more precise terminology is “Memoryless Observations”, rather than “Static”, because it refers to the property of the observations only, while the unobserved state can be a random process (in  the unobserved state is a RV).) The procedure is described in terms of a “Common Denominator Condition” together with a “Change of Variables”.
However, a careful reading of [Section 2.1, ] shows that Witsenhausen’s analysis is restricted to discrete-time stochastic decentralized decision problems without dynamics and hence, the conclusions obtained in  are only for a small class of models. Moreover, no expression is given for the common denominator condition and change of variables, which facilitate the equivalence between the two problems.
With respect to the second method, PbP optimality and dynamic programming are often used in real-time communication [16, 22, 30], in decentralized hypothesis testing [27, 34], and networked control systems [22, 32], for specific classes of discrete-time models and information structures. The procedure is based on identifying an information state or sufficient statistic, often employed in centralized stochastic control, to replace the observations available for decisions by a posteriori conditional distributions based on the observations [42, 43]. However, identifying the information state and then applying dynamic programming are not easy tasks when one is faced with general information structures and continuous alphabet spaces (see for example ), while the question of whether PbP optimality implies team optimality is difficult to resolve.
Following another research direction, the authors recently invoked the stochastic Pontryagin’s maximum principle to derive team and Person-by-Person (PbP) optimality conditions for stochastic differential decision systems with decentralized noiseless information structures , and decentralized noisy information structures , and computed the optimal team decentralized strategies for various communication and control applications . However, the mathematical analysis in [44, 45] is based on a strong formulation of the probability space, which is restrictive in the sense that it is not easy to apply these optimality conditions to noiseless feedback information structures and noisy information structures, unless certain strong assumptions are imposed on the elements of the stochastic differential decentralized decision systems.
The main objectives in this paper are the following.
(1) We present two methods, based on Girsanov’s theorem, which generalize Marschak’s and Radner’s [2, 3, 4] static team theory to dynamic team theory. The first method is based on function space integration of Wiener functionals, which we also relate to Witsenhausen’s  common denominator condition and change of variables for continuous and discrete-time dynamic team problems. The second method is based on stochastic Pontryagin’s maximum principle, which allows us to derive both necessary and sufficient team optimality conditions.
(2) We show existence of relaxed team strategies (conditional distributions) under general conditions, using a weak topological space;
(3) We show, under global convexity conditions, on the Hamiltonian functional and terminal pay-off, that PbP optimality implies team optimality;
(4) We show realizability of relaxed strategies by regular strategies using the Krein-Milman theorem.
Our approach is based on invoking Girsanov’s change of probability measure to define an equivalent stochastic dynamical decentralized decision system under a reference probability measure, in which the distributed observations and information structures available for decisions are not affected by any of the team decisions. Both methods do not impose any restrictions on the information structures as in [44, 46, 45], and they apply to general models and information structures, including nonclassical information structures [5, 6].
The first method is based on path integrals of functionals of Brownian motion with respect to products of Wiener measures. We show that this method generalizes Witsenhausen’s  notion of equivalence, under which discrete-time stochastic dynamic team problems can be transformed to equivalent static team problems, to continuous-time Itô stochastic nonlinear differential decentralized decision problems and to analogous discrete-time models; in addition, we identify the precise expression of the common denominator condition described in . However, we point out certain limitations of this method in the context of computing the optimal team strategies, for the case of a large number of decision stages.
The second method is based on the stochastic Pontryagin’s maximum principle; we derive necessary conditions for team optimality given by a stochastic Pontryagin’s maximum principle and sufficient conditions under global convexity assumptions. Moreover, we also show that under the global convexity conditions, PbP optimality implies team optimality. This method is much more general, and does not suffer from any limitations, in computing the optimal team strategies, compared to the first method.
The results listed under (1)-(4) above, are derived in the context of the following general continuous-time stochastic nonlinear differential decentralized decision systems (with relaxed and regular team member strategies, and general noisy information structures).
subject to stochastic differential dynamics with state and distributed noisy observations satisfying the Itô differential equations
Here denotes expectation with respect to a probability measure defined on an underlying measurable space , while the elements of the team problem are the following:
Although, in the above stochastic differential decentralized decision system we have assumed regular team strategies, in the paper we consider relaxed team strategies, which are regular conditional distributions for and , and we obtain corresponding results for regular strategies as a special case of relaxed strategies.
Moreover, we apply the first method to a discrete-time generalization of (1)-(4), and we demonstrate that both methods apply to arbitrary information structures, including nonclassical information structures [wistnenhausen1971]. We illustrate these points in our discussions.
According to the formulation of (1)-(4) and definition of admissible team strategies, each distributed observation generates the information structure of the th decision process for . With respect to our introductory discussion, the stochastic system (3) may be a compact representation of many interconnected subsystems, aggregated into a single state representation , each corresponds to the observation process at the observation post “”, and each corresponds to the decision process applied at the “” th control station. Since in the current set up we have assumed then by definition, for each , the strategies are of the form , for , and hence the decision processes utilize decentralized noisy information structures. (Footnote 4: We will also describe more general decentralized noisy information structures.)
Clearly, team optimality implies PbP optimality, but the reverse is not generally true. In the context of team problems, PbP optimality is often of interest provided one can identify conditions so that PbP optimality implies team optimality. For static team problems, such conditions are derived in  and, for exponential pay-off functionals, in .
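The gap between the two notions can be illustrated with a minimal finite sketch. The two-DM pay-off table below is a hypothetical example that is not taken from the paper, and it is assumed that the common pay-off is maximized. A pair of actions is PbP optimal if no single DM can improve the common pay-off by changing only its own action, and team optimal if it maximizes the pay-off jointly.

```python
from itertools import product

# Hypothetical common pay-off J(u1, u2) for two DMs with binary actions
# (assumption: the team maximizes J; the numbers are illustrative only).
PAYOFF = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0}
ACTIONS = (0, 1)


def is_pbp_optimal(u1, u2):
    """True if neither DM can improve J by deviating unilaterally."""
    best_for_dm1 = all(PAYOFF[(u1, u2)] >= PAYOFF[(a, u2)] for a in ACTIONS)
    best_for_dm2 = all(PAYOFF[(u1, u2)] >= PAYOFF[(u1, a)] for a in ACTIONS)
    return best_for_dm1 and best_for_dm2


def team_optimal():
    """Joint maximizer of J over all action pairs."""
    return max(product(ACTIONS, ACTIONS), key=lambda u: PAYOFF[u])


# All PbP optimal pairs, listed in the enumeration order of product().
pbp_points = [u for u in product(ACTIONS, ACTIONS) if is_pbp_optimal(*u)]
```

In this toy table the pair (0, 0) is PbP optimal but not team optimal, while (1, 1) is both, which is the situation that additional conditions such as convexity are meant to exclude.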
As we have mentioned earlier, our methodology for addressing stochastic dynamic team problems is based on Girsanov’s change of probability measure, which allows us to introduce an equivalent problem under a new reference probability space in which the distributed noisy observations are signal free and/or the unobserved state process is drift free, and hence they are not affected by any of the team decisions. Thus, we employ the powerful tools of stochastic calculus, such as the martingale representation theorem, stochastic variational methods, and function space integration of functionals of Brownian motion with respect to Wiener measures, to handle very general decentralized decision problems, with decision processes having access to any combination of information structures.
Indeed, we apply the first method, based on function space integration, to the continuous-time stochastic dynamic team problem (1)-(4), and we show that the so called “Common Denominator Condition” introduced in  to transform the simplified discrete-time stochastic dynamic decentralized decision problem (with unobserved state a RV) to an equivalent static one (in Witsenhausen’s terminology) is the existence, via Girsanov’s theorem , of a Radon-Nikodym derivative between the initial and the reference probability measure, so that under the reference probability measure the observations are not affected by any of the team decisions. Moreover, we show that the so called “Change of Variables” in  is an application of change of probability measure, expressing the initial pay-off under the reference probability measure. Therefore, we extend Witsenhausen’s  notion of equivalence not only to general discrete-time stochastic dynamic team problems, but also to continuous-time stochastic dynamic team problems. We also show that under the reference probability measure, the pay-off of the team problem (1)-(4) is equivalently expressed via function space integration with respect to the product of Wiener measures, and that, in principle, this integration can be carried out precisely as in [48, 49], where examples of optimal, in mean-square sense, finite-dimensional filters are derived. However, contrary to an intuitive belief, we point out that this does not mean that such an equivalent problem is simpler, or that static optimization theory and/or the static team theory of Marschak and Radner [2, 3, 4] can be easily applied to the equivalent problem, even for the discrete-time analog of (1)-(4), including Witsenhausen’s simplified model (without dynamics) described in continuous-time. The reason is that the computation of the optimal team strategies can be quite intensive, and often not tractable.
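The change-of-measure idea can be checked numerically on a small discrete-time toy problem. The sketch below is not the model (1)-(4): it assumes a hypothetical two-DM problem with a binary unobserved state, unit-variance Gaussian observation noise, and illustrative strategies and pay-off. The second DM's observation depends on the first DM's action, so under the original measure the observation density depends on the decisions. Under the reference measure both observations are independent standard Gaussians, and the pay-off is weighted by a likelihood ratio that plays the role of the Radon-Nikodym derivative.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


# Hypothetical toy model (an assumption, not the paper's system):
# x = -1 or +1 with probability 1/2, y1 = x + w1, u1 = gamma1(y1),
# y2 = u1 + w2, u2 = gamma2(y2), with w1, w2 independent N(0, 1).
X_VALUES = (-1.0, 1.0)
gamma1 = math.tanh  # illustrative strategy of DM 1
gamma2 = math.tanh  # illustrative strategy of DM 2


def payoff(x, u1, u2):
    """Illustrative pay-off; any bounded-growth function would do."""
    return -(x - u2) ** 2 - 0.1 * u1 ** 2


STEP = 0.05
GRID = [-8.0 + STEP * k for k in range(321)]  # quadrature grid on [-8, 8]


def expected_payoff_original():
    """Pay-off under the original measure: densities depend on x and on u1."""
    total = 0.0
    for x in X_VALUES:
        for y1 in GRID:
            u1 = gamma1(y1)
            for y2 in GRID:
                density = phi(y1 - x) * phi(y2 - u1)
                total += 0.5 * density * payoff(x, u1, gamma2(y2)) * STEP * STEP
    return total


def expected_payoff_reference():
    """Same pay-off under the reference measure, weighted by the likelihood ratio."""
    total = 0.0
    for x in X_VALUES:
        for y1 in GRID:
            u1 = gamma1(y1)
            for y2 in GRID:
                # Likelihood ratio: product of exp(h * y - h^2 / 2) over both stages.
                lam = math.exp(x * y1 - 0.5 * x * x) * math.exp(u1 * y2 - 0.5 * u1 * u1)
                total += 0.5 * phi(y1) * phi(y2) * lam * payoff(x, u1, gamma2(y2)) * STEP * STEP
    return total
```

The two functions return the same value up to floating-point round-off, because a shifted Gaussian density equals the standard one times the exponential weight. The sketch also hints at the computational limitation noted above: the nested integration grows with every additional decision stage.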
Then, we proceed with the second method to derive new team and PbP optimality conditions; necessary conditions given by a stochastic Pontryagin’s maximum principle and sufficient conditions under global convexity assumptions. Firstly, we apply Girsanov’s measure transformation  to transform (1)-(4) to an equivalent stochastic differential team game, under a reference probability measure, on which are independent Brownian motions, and independent of any of the team decisions. Secondly, under the reference probability measure we show existence of team and PbP optimal relaxed strategies, in the weak sense. Thirdly, we invoke stochastic variational methods and the Riesz representation theorem for Hilbert space semi martingales to derive team and PbP optimality conditions. We show that a necessary condition for an admissible strategy to be team optimal is the existence of an adjoint process in an appropriate function space satisfying a backward stochastic differential equation, and that for each , the optimal actions satisfy almost surely, a pointwise conditional variational Hamiltonian inequality with respect to the information structure , with all other actions kept at their optimal values, for .
Under certain global convexity conditions we also show that the conditional variational Hamiltonian inequalities are also sufficient for team optimality. Moreover, we also show that under the global convexity conditions, PbP optimality implies team optimality. The new optimality conditions are given both under the reference probability measure, and also under the initial probability measure via a reverse Girsanov’s measure transformation. The Hamiltonian system of equations is precisely the analog of stochastic differential centralized decision problems optimality conditions, extended to decentralized decision problems using a dynamic team theory formulation.
One of the important aspects of our methodology is that the assumptions imposed to derive existence of team and PbP optimal strategies, and necessary and sufficient team and PbP optimality conditions are precisely the ones often imposed to derive analogous results for stochastic partially observable control problems which presuppose centralized information structures. However, “the challenge remains that of computing conditional expectations” via the conditional variational Hamiltonians, which is not an easy task even for centralized partially observable stochastic systems . Therefore, examples will be presented elsewhere as in , due to space limitations.
Throughout the paper, we develop the optimality conditions utilizing relaxed decentralized strategies, and then we show how to recover analogous optimality conditions for regular strategies.
Finally, we point out that one may invoke alternative methods, such as the ones described in [51, 52, 53, 54, 55, 56, 57, 58, 42, 59, 60], which are based on stochastic flows of diffeomorphisms, the martingale representation theorem, and needle variations. Moreover, for the case of regular strategies with action spaces which are not necessarily convex, one can derive optimality conditions by considering the generalized Hamiltonian system of equations, which includes also the second-order adjoint process  (see also ), provided the derivations are carried out under the reference probability measure. The important point to be made regarding the results of this paper is that, by invoking Girsanov’s measure transformation, the existence of team and PbP optimal strategies, and the team and PbP optimality conditions for stochastic differential decentralized decision systems formulated using stochastic dynamic team theory, are derived similarly to stochastic optimal control or decision problems with centralized information structures, and that only at the last step of the derivations is the issue of decentralization accounted for, leading to the conditional variational Hamiltonians with respect to the different information structures.
We believe our lengthy introduction, together with the subsequent mathematical analysis, and results derived in the paper will help clarify certain statements found in the literature concerning the application of Marschak’s and Radner’s Static Team Theory to stochastic dynamic team problems, and aid in addressing other types of optimality criteria such as, Nash Equilibrium, minimax games, etc. with decentralized information structures.
The paper is organized as follows. In Section 2, we introduce the stochastic differential decentralized decision problem and its equivalent re-formulations using the weak Girsanov’s measure transformation approach. In this section, we also discuss the precise connection via path integration between static team theory and dynamic team theory, thus generalizing  to continuous-time stochastic dynamic team problems, and discrete-time stochastic dynamic team problems with general unobserved state processes, and we also identify the exact expression of the common denominator conditions (Radon-Nikodym derivative). In Section 3, we first show existence of solutions of the stochastic differential system, their continuous dependence on , and existence of team and PbP optimal strategies using a weak topological space of regular conditional distributions. In Section 4, we derive the variational equations which we invoke to derive the optimality conditions, both under the reference probability measure and under the initial probability measure. In Section 5, we show how to obtain corresponding results for regular strategies, and also how to realize relaxed strategies by regular strategies. Finally, in Section 6 we conclude the presentation with comments on possible generalizations of our results.
2 Dynamic Team Problem of Stochastic Differential Decision Systems
In this section, we introduce the team theoretic formulation of stochastic differential decentralized decision systems, the information structures available to the DMs, which are different and noisy, and the relaxed and regular strategies of the DMs. Then, we introduce appropriate assumptions, and we invoke Girsanov’s change of probability to show that under an appropriate choice of probability measure, the original stochastic differential decentralized decision problem is equivalent to a new problem with distributed observations which are independent, and independent of any of the team decisions.
Let denote a subset of natural numbers, denote Linear transformations mapping a linear vector space into a vector space , and denote the th column of a map .
Let denote a complete filtered probability space satisfying the usual conditions, that is, is complete, contains all -null sets in . Note that filtrations are monotone in the sense that , . Moreover, is called right continuous if and it is called left continuous if . Throughout we assume that all filtrations are right continuous and complete, and defined by .
Consider a random process taking values in , where is a metric space, defined on the filtered probability space . It can be shown that any such stochastic process which is measurable and adapted has a progressively measurable modification . Unless otherwise specified, we shall say a process is adapted if the process is progressively measurable.
denotes the space of continuous real-valued dimensional functions defined on the time interval .
denotes the space of adapted random processes such that
which is a sub-Hilbert space of .
Similarly, denotes the space of adapted matrix valued random processes such that
Next, we describe the set of admissible relaxed strategies. For each , let be closed and bounded (possibly nonconvex), and let denote the Borel subsets of . Let denote the space of continuous functions on , endowed with the sup norm topology, which makes it a Banach space. Let denote the space of regular bounded signed Borel measures on , having finite total variation. With respect to this norm topology, is also a Banach space. It is well known that the dual of is . We are interested in the space of regular probability measures. Using this construction, the DM strategies with decentralized information structures will be described through the topological dual of the Banach space , the -space of adapted valued functions, for . For each the dual of this space is given by which consists of weak measurable adapted valued functions. For each the DM strategies are drawn from , the set of probability measure valued adapted functions. Hence, we have the following definition of relaxed strategies.
(Admissible Relaxed Noisy Information Strategies)
The admissible relaxed strategies for DM are defined by
where are closed and bounded (possibly nonconvex).
An tuple of relaxed strategies is by definition
Thus, for any , given the information , is a stochastic kernel (regular conditional distribution) defined by
Clearly, for each and for every the process
is progressively measurable. For each , the space is endowed with the weak topology, also called vague topology. A generalized sequence is said to converge (in the weak topology or) vaguely to written , if and only if for every
With respect to the vague (weak) topology the set is compact, and from here on we assume that has been endowed with this vague topology.
For relaxed strategies , we use the following notation for the drift coefficient
Next, we define the set of admissible decentralized noisy information strategies of the team members, called regular strategies, (deterministic measurable functions).
(Admissible Regular Noisy Information Strategies)
The admissible regular strategies for DM are defined by
the class of adapted random processes defined on and taking values from the closed bounded set . Note that if is a closed, bounded and convex subset of then is a closed convex subset of , . An tuple of regular strategies is by definition
Notice that the class of regular strategies embeds continuously into the class of relaxed decisions through the map Clearly, for every we have
There are several advantages of using relaxed strategies. For example, if optimal regular strategies exist from the admissible class then the optimality conditions of relaxed strategies can be specialized to the class of strategies which are simply Dirac measures concentrated . Thus, the necessary conditions for team and PbP optimality for regular strategies follow readily from those of relaxed strategies. Another advantage is the realizability of relaxed strategies by regular strategies, which is often established via the Krein-Milman theorem, without requiring convexity of . Both advantages are discussed in the paper.
Next, we state the basic assumptions on the stochastic differential dynamics and decentralized observations, assuming relaxed strategies.
which satisfy the following basic conditions.
is closed and bounded, .
There exists a such that
uniformly in ;
uniformly in ;
uniformly in ;
uniformly in ;
uniformly in ;
uniformly in ;
are continuous in ;
is measurable in , uniformly bounded, the inverse exists and it is uniformly bounded, .
Often, for simplicity we shall replace (A5) by the following condition. There exist a such that
2.1 Team Problem Under Reference Probability Space
In this section we formulate the stochastic team problem utilizing the Girsanov change of measure approach, which is based on constructing a filtered probability space
and Brownian motions and defined on it, such that and are the weak solution of (3) and (4) with respect to relaxed strategies (with replaced by and by ). Moreover, under the reference probability space , is a weak solution of (3), while the observations , are independent Brownian motions, which are independent of the team decisions, that is, they are fixed and unaffected by . Consequently, the information structures of each team member is independent of .
Let be the canonical space of which are defined by
: an -valued Random Variable with distribution ;
: an -valued standard Brownian motion, independent of ;
, : -valued, , mutually independent Standard Brownian motions, independent of .
We introduce the Borel algebra on , the space of continuous dimensional functions on finite time , generated by and let its Wiener measure on it.
Similarly, we introduce the Borel algebra on generated by and let its Wiener measure on it, for . We define the Borel algebra on generated by , , and its Wiener product measure on it.
Further, we introduce the filtration generated by truncations of , and the filtration generated by truncations of , and we define . That is, for , is the sub--algebra generated by the family of sets
Hence, is the canonical Borel filtration and .
Define the canonical probability space , called the reference probability space by
A typical element of is .
On the reference probability space we define the decentralized observations by
Clearly, under the reference probability measure , the observations , are independent Brownian motions, and hence they are fixed and unaffected by , and consequently, the information structures of each player are independent of .
On the probability space by Assumptions 1, (A1), (A2), (A3), (A4), for any with finite second moment, and team strategy then is the pathwise unique adapted solution of
Notice that is adapted to the family which is fixed and independent of (since is a Brownian motion).
Next, for any and for each observation process defined on , we introduce the exponential functions
and their products by
Under Assumptions 1, (A5), (A8) the process is a super martingale and by Itô’s differential rule it is the unique adapted continuous solution of the stochastic differential equation
Then for any admissible strategy , is also an -super martingale and satisfies the stochastic differential equation
Given a we define the reward of the team game under the reference probability space by
where are chosen so that (18) is finite.
Notice that under the reference probability measure , the pay-off (18) with given by (17), subject to the state process satisfying (13) is a transformed problem with observations which are not affected by any of the team decisions. It remains to show whether this transformed problem is equivalent to the original stochastic differential decentralized decision problem.
If we further assume that (A5’) holds, then is an -martingale for , and the team reward (18) subject to stochastic constraints of defined by (13), (17), respectively, is equivalent to the team game (1)-(4) (with regular strategies replaced by relaxed strategies, and replaced by and by ).
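The constant-expectation property that underlies this equivalence can be checked by simulation. The sketch below is a simplified scalar illustration and not the system of the paper: it assumes a single observation process with unit noise intensity, a hypothetical bounded drift tanh(y) that depends only on the current observation, and an exponential functional of the standard Girsanov form exp(∫ h dy − ½ ∫ h² dt). The names, horizon, step count and seed are illustrative choices.

```python
import math
import random


def simulate_lambda_T(rng, horizon=1.0, steps=50):
    """One sample of the exponential functional at the final time.

    Under the reference measure the observation y is a Brownian motion,
    so its increments are drawn independently of the drift h.
    """
    dt = horizon / steps
    y = 0.0
    log_lam = 0.0
    for _ in range(steps):
        h = math.tanh(y)  # hypothetical bounded drift (assumption)
        dy = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        log_lam += h * dy - 0.5 * h * h * dt  # log of the exponential update
        y += dy
    return math.exp(log_lam)


def mean_lambda_T(paths=10000, seed=0):
    """Monte Carlo estimate of the reference-measure mean of the functional."""
    rng = random.Random(seed)
    return sum(simulate_lambda_T(rng) for _ in range(paths)) / paths
```

With a bounded drift the estimate stays close to 1, up to Monte Carlo error, which is what allows the functional to serve as a density defining a new probability measure. An unbounded drift may only give a super martingale, in which case the mean can fall below 1.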
Indeed if (A5’) holds, by the martingale property of defined by (17), it has constant expectation and hence, . Therefore, we can introduce a probability measure on by setting
Moreover, under the probability measure , then is a standard Brownian motion defined by
Hence, by (20) the observations of the team members are defined by
and the state process is defined by (13) (its distribution is unchanged). Thus, we have constructed the probability space and Brownian motion defined on it such that are weak solutions, of (21), and (13). Moreover, under the probability space , by using (19) into (18) then the team problem reward is given by
Therefore, we have two equivalent formulations of the stochastic differential team game. The one defined under probability space