Trusted Multi-Party Computation and Verifiable Simulations: A Scalable Blockchain Approach
Abstract
Large-scale computational experiments, often running over weeks and over large datasets, are used extensively in fields such as epidemiology, meteorology, computational biology, and healthcare to understand phenomena and design high-stakes policies affecting everyday health and economy. For instance, the OpenMalaria framework is a computationally intensive simulation used by various non-governmental and governmental agencies to understand malarial disease spread and the effectiveness of intervention strategies, and subsequently design healthcare policies. Given that such shared results form the basis of inferences drawn, technological solutions designed, and day-to-day policies drafted, it is essential that the computations are validated and trusted. In particular, in a multi-agent environment involving several independent computing agents, a notion of trust in results generated by peers is critical in facilitating transparency, accountability, and collaboration. Using a novel combination of distributed validation of atomic computation blocks and a blockchain-based immutable audit mechanism, this work proposes a universal framework for distributed trust in computations. In particular, we address the scalability problem by reducing the storage and communication costs using a lossy compression scheme. This framework guarantees not only verifiability of final results, but also the validity of local computations, and its cost-benefit tradeoffs are studied using a synthetic example of training a neural network.
I Introduction
Machine learning, data science, and large-scale computations in general have created an era of computation-driven inference, applications, and policymaking [1, 2]. Technological solutions and policies with far-reaching consequences are increasingly being derived from computational frameworks and data. Multi-agent sociotechnical systems that are tasked with working collaboratively on such tasks function by interactively sharing data, models, and results of local computation.
However, when such agents are independent and lack trust, they might not collaborate with or trust the validity of reported computations of other agents. Quite often, these computations are also expensive and time consuming, and thus infeasible for recomputation by the doubting peer as a general course of action. In such systems, creating an environment of trust, accountability, and transparency in the local computations of individual agents promotes collaborative operation.
For instance, consider training a deep neural network with a given architecture using stochastic gradient descent (SGD). Here, the model and computations are deterministic given the data used for gradient computation. Applications are primarily interested in using the trained model represented by the weights of the trained network. But, if they lack trust in the training agent, they have no simpler way to verify the network than to retrain it. This is often impractical since the (re)training process consumes extensive amounts of time and tends to require the use of specialized hardware like GPUs or TPUs. It is thus important to establish trust in the computations involved in the training phase.
To emphasize the importance of trust in multi-agent systems, let us also consider the case of policy design for malaria. OpenMalaria (OM) [3] is an open source simulation environment, collaboratively developed to study malaria epidemiology and the effectiveness of control mechanisms. It is used extensively to design policies to tackle the disease. Here, individual agencies propose hypotheses regarding the disease and/or intervention policies, and study them by simulating them under specific environments [4]. Considering the potential impact of such work in designing disease control policies, it is important to establish accountability and transparency in the process, so as to facilitate trusted adoption of results. Calls have been made for accountability and transparency in multi-agent computational systems, especially in high-impact fields such as health [5]. A framework for decision provenance helps track the source of results, and provides transparent computational trajectories and a unified, trusted platform for information sharing. In fact, the US Centers for Disease Control [6] states that:
public health and scientific advancement are best served when data are released to, or shared with, other public health agencies, academic researchers, and appropriate private researchers in an open, timely, and appropriate way. The interests of the public…transcends whatever claim scientists may believe they have to ownership of data acquired or generated using federal funds.
This call implicitly assumes an inherent trust in the shared material. However, there exists significant disparity and inconsistency in current information-sharing mechanisms that not only hinder access, but also lead to questionable informational integrity [7]. Here, trust and transparency are critical, but absent in current practice.
Establishing trust in computations translates to guaranteeing correctness of individual steps of the simulation, and the integrity of the overall computational process leading to the reported results. Importantly, when computational models and parameters along with intermediate results of individual steps are shared, these steps can be validated by other agents who can recompute them, thereby validating the entire computation in a distributed manner.
Blockchain is a distributed ledger (database) technology that enables multiple distributed, potentially untrusted agents to transact and share data in a safe, secure, verifiable and trusted manner through mechanisms providing transparency, distributed validation, and cryptographic immutability [8]. As such, blockchain is the perfect tool for establishing the type of trust for complex, long-running computations of interest. In this paper we use blockchain to record, share, and validate frequent audits (model parameters with the intermediate results) of individual steps of the computation. We describe how blockchain-based distributed validation and shared, immutable storage can be used and extended to enable efficient trusted verifiable computations.
A common challenge arising in blockchain-based solutions is scalability. The technology calls for significant peer-to-peer interaction and local storage of large records that include all the data generated in the network. These fundamental requirements result in significant communication and storage costs respectively. Thus, using the technology for large-scale computational processes over large multi-agent networks is prohibitively expensive. In this paper, we address this scalability challenge by developing a novel compression schema and a distributed, parallelizable validation mechanism that is particularly suitable for such large-scale computation contexts.
II Prior Work
We now provide a brief summary of prior work in related areas.
A variety of applications with widespread impact are being designed with the help of improved computational capabilities, easier access to data, and machine learning algorithms. Taking the context of malaria, as studied through OpenMalaria simulations, new pipelines for integrating AI tools and algorithms have been considered [9]. Regression-based methods for better policy search have also been integrated with the open-source platform [10].
Considering the impact of such simulations, researchers have recently raised alarm over their lack of reproducibility. Reproducing results from research papers in AI has been found to be challenging, as a significant fraction of hyperparameters and model considerations are not documented [11]. In another paper focused on reproduction of results in deep learning [12], the authors explore the possible reasons, and cite variability in evaluation metrics and reporting among different algorithms and implementations.
Accountability and transparency are being increasingly sought after in large-scale computational platforms, with particular focus on establishing tractable, consistent computational pipelines. The problem of establishing provenance in decision-making systems has been considered [13] through the use of an audit mechanism. Distributed learning in a federated setting with security and privacy limitations has also been considered recently [14].
In fact, the problem of trust in multi-agent computational systems was considered at the beginning of the 20th century from the viewpoint of reducing errors in complex calculations performed by human workers [15]. Large-scale computational problems were solved using redundant evaluation of smaller subtasks assigned to human workers, and verified using computational checkpoints. We can draw significant insight into reliable distributed computing from these practices.
Blockchain systems have brought forth the means for creating distributed trust in peer-to-peer networks for transactional systems [16]. A variety of applications that invoke interactions and transactions among untrusting agents have benefited from the trust enabled by blockchains [17, 18, 19]. More recently, blockchains have been used in creating secure, distributed, data sharing pipelines in healthcare [20] and genomics [21]. This trust can also be leveraged in creating trusted distributed computing systems, as highlighted in this paper.
III Motivation
The motivation behind studying the problem of trust in multi-agent computational systems is best understood through the practical example of OpenMalaria. Agencies making policy decisions gain access to research findings such as transmission models from the work of independent organizations studying malaria around the world. The open source nature of the platform has facilitated widespread access, and has created a large-scale collaborative effort toward tackling the disease. The studies performed by various agents lead to policies that determine the health and wellbeing of vast sections of the community. Sharing data, models, and outcomes of simulation studies thus requires accountability and transparency with the guarantee of computational integrity, enabling the creation of reliable distributed computing platforms.
Let us consider a simple experiment that a malaria data scientist (MDS) is interested in conducting, to study the disease spread and control under a specific environment characterized by factors such as demographics, entomology, and intervention strategies in place. In particular, the MDS wishes to evaluate intervention strategies, such as distributing insecticide-treated nets (ITN) and commissioning indoor residual spraying (IRS). The simulation is used to evaluate the efficiency of the policies, in terms of quantities such as disability-adjusted life years (DALY), which quantifies the total life years lost from malaria-related fatality. Each policy also incurs a related cost of implementation, such as healthcare system costs (HSC) and intervention costs (IC).
By studying the cost-benefit tradeoffs, the data scientist and/or other agencies can design optimal policies. Such agencies have access to simulation results performed by independent agents/workers. However, malicious agents, such as one who manufactures ITNs, could generate spurious results that address vested interests, rather than providing accurate insight into the disease. At the same time, workers with insufficient computational resources might also generate errors in the simulation process, potentially generating wrong inferences about the disease and policies.
Adoption of such results for policy design requires inter-agent trust, which is not guaranteed in multi-agent systems. Repeating experiments for each adoption is prohibitively expensive. Trust in computations would significantly assist information sharing.
The notion of trust has been considered from a variety of standpoints [22] and has contextually varied definitions as considered in depth in [23]. A qualitative definition of trust in multi-agent computational systems can be adapted from [24, 25] as:
Trust is the belief an agent has that the other party will execute an agreed upon sequence of actions and report an accurate representation of the computed result (being honest and reliable).
We provide a more specific characterization of trust. Such computations in general are composed of a sequence of atomic operations that update a system state iteratively. For instance, this could be OM simulations tracking the progression of malaria in a certain community, or the weights of a neural network as updated iteratively by a training algorithm. Establishing distributed trust, as defined, for such computations in a universal sense (without contextual understanding of computation specifics) requires checking consistency of individual steps of the simulation by recomputation. In particular, we decompose trust into two main components:

Validation: The individual atomic computations of the simulation are guaranteed and accepted to be correct.

Verification: The integrity of the overall simulation process can be checked by other agents in the system.
The two elements respectively ensure local consistency of computation and post-hoc corroboration of audits. Their mathematical characterization is provided in Sec. IV.
A naive solution is to validate each step (iteration) of the process using independent recomputation by validating agents. Similarly, the integrity of the computational process can be verified from an immutable record of validated intermediate states. However, practical simulations are long and involve a large number of iterations. Validation requires communication of the iterates to the endorsers, and recording the validated state on the immutable data structure. This results in significant communication and storage overhead if every state is reported and stored as is, in addition to the computational cost of validation, preventing its adoption in large-scale systems and computational methods.
It is thus important to utilize the underlying structure of the simulation to reduce these overheads. This can be done by reporting a compressed version of the states with sufficient detail such that they can be validated to within a desired tolerance. We use universal compressors to reduce these communication and storage costs. Each block of communication and storage also incurs the overhead corresponding to headers and metadata. It is thus prudent to combine multiple iterates into a single frame before compression, and collectively validate and store frames of the computational process.
Blockchain systems establish trust in transactional systems for peer-to-peer networks of agents through distributed endorsements, consensus on transactional validity, and the storage of the collection of all transactions in the network in a shared, append-only, immutable, distributed ledger at each peer in the network. We leverage these features directly (1) to use blockchain transactions to record steps of the computation, and (2) to facilitate the immutable storage of validated audits.
Allowing validation and verification of computations not only creates an environment of trust among agents, but also enforces a higher degree of conformity and consistency in experiments. Necessitating validation and verification also implies a shared common mechanism for model and data sharing, enabling scientific reproducibility. The setup also facilitates well-defined processes for distributed and derived computing, wherein the former involves a computational framework performed piecewise at multiple nodes, and the latter concerns deriving new experiments using checkpoints drawn from the intermediate audits of prior computational experiments.
IV Computation and Trust Model
Let us now mathematically formalize the computation model, and validation and verification requirements under consideration. We consider an iterative computational algorithm in this paper.
Consider a computational process that updates a system state, $X_t \in \mathbb{R}^d$, over iterations $t$, depending on the current state and an external source of randomness $\theta_t$, according to an atomic operation $f$ as follows
(1) $X_{t+1} = f(X_t, \theta_t)$
For simplicity we assume that $\theta_t$ is shared by all agents. This can easily be generalized as elaborated later. We also assume that the function $f$ is $L$-Lipschitz continuous in the system state, without loss of generality, under the Euclidean norm, for all $\theta$, i.e., for any two states $X$ and $X'$,
(2) $\|f(X, \theta) - f(X', \theta)\| \le L \, \|X - X'\|$
That is, minor modifications to the inputs of the atomic operation result in correspondingly bounded variation in the outputs. This is expected for instance in simulations of physical or biological processes, as seen in epidemiological and meteorological simulations, as most physical systems governing behavior in nature are smooth. For instance, with respect to the OpenMalaria example, the requirement implies that minor changes in policies result in minor changes in outcomes.
We consider a multi-agent system where one agent, referred to as the computing client (client in short), runs the computational algorithm. The other agents in the system, called peers, are aware of the atomic operation $f$ and share the same external randomness $\theta_t$, and hence can recompute the iterations. Validation of intermediate states is performed by independent peers, referred to as endorsers, through an iterative recomputation of the reported states from the most recent validated state using the atomic operation $f$. The process of validation is referred to as an endorsement of the state. A reported state, $\tilde{X}_t$, is valid if it lies within a margin, $\Delta_{\text{val}}$, of the state $\hat{X}_t$ as recomputed by the endorser, i.e.,
(3) $\|\tilde{X}_t - \hat{X}_t\| \le \Delta_{\text{val}}$
The validation criterion (3), without loss of generality, associates equal weightage to each component of the state, and can be easily generalized to weighted norms or other notions of distance.
Verification concerns checking for integrity of the computational process, which is enabled through the storage of frequent audits of validated states. Thus, if the audits record the states $\{Y_t\}$, then verification corresponds to ensuring that the recomputed version, $\hat{Y}_t$, of the state is within a margin, $\Delta_{\text{ver}}$, of the recorded version, i.e.,
(4) $\|Y_t - \hat{Y}_t\| \le \Delta_{\text{ver}}$
Without loss of generality, validation requirements are stricter than for verification, i.e., $\Delta_{\text{val}} \le \Delta_{\text{ver}}$. We now construct the system to address these two trust requirements.
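As a minimal sketch of this model, the following Python snippet iterates a toy atomic operation under a shared random seed, so that any peer can recompute the trajectory, and checks a reported state against the recomputed one under the Euclidean norm. The contraction map `atomic_op`, the seed, the noise range, and the margin value are illustrative assumptions and not part of the framework.

```python
import math
import random

def atomic_op(state, noise):
    """Toy atomic operation (assumed example): contract the state and add shared noise."""
    return [0.5 * x + noise for x in state]

def run(initial_state, steps, seed):
    """Iterate the atomic operation; the seed stands in for the shared external randomness."""
    rng = random.Random(seed)
    state = list(initial_state)
    trajectory = [state]
    for _ in range(steps):
        state = atomic_op(state, rng.uniform(-0.1, 0.1))
        trajectory.append(state)
    return trajectory

def within_margin(reported, recomputed, margin):
    """A reported state is accepted if it is within `margin` of the recomputed state."""
    return math.dist(reported, recomputed) <= margin

# The client and a peer share the initial state and the seed, so the peer can recompute.
client = run([1.0, -2.0], steps=5, seed=7)
peer = run([1.0, -2.0], steps=5, seed=7)

honest_ok = within_margin(client[-1], peer[-1], margin=1e-6)
tampered = [x + 0.05 for x in client[-1]]  # a perturbed report
tampered_ok = within_margin(tampered, peer[-1], margin=1e-6)
```

Here the honest report is accepted and the perturbed report is rejected; the same predicate with a looser margin would play the role of the verification check.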
V Multi-Agent Blockchain Framework
We elaborate the system design, starting with the functional categorization of the network. We then elaborate each functional unit, including the compression at the client, the validation by endorsers, and the role of orderers in adding blocks to the ledger. For ease, let us consider a deterministic iterative algorithm for computation, $X_{t+1} = f(X_t)$.
V-A Peer-to-Peer Network: Functional Decomposition
The peer-to-peer network is functionally categorized into clients, endorsers, and orderers, who function together in computing, validating, ordering, and storing the simulated states on the blockchain ledger. Their functioning is as follows:

The client runs the computations and iteratively computes the states $X_t$, for $t = 1, 2, \dots$.

The client groups a sequence of states into a frame, compresses, and communicates the frame to a set of endorsers.

The endorsers decompress frames, validate states by recomputing them iteratively, and report endorsements to orderers.

The orderers subsample and add the frame to the blockchain if it has been validated, and if all prior frames have been validated and added to the ledger.

The peers update their copy of the ledger.
This is depicted in Fig. 1. The classification is only based on function and each peer can perform different functions across time. Since states are grouped into independent frames, they can be validated by non-overlapping subsets of endorsers in parallel.
V-B Client Operations
The client performs the computations, computes the states, constructs frames of iterates, compresses them, and reports them to endorsers. We assume there exists an endorser assignment policy.
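Owing to the Lipschitz continuity of the atomic operation $f$, with Lipschitz constant $L$, successive states of the deterministic algorithm satisfy $\|X_{t+1} - X_t\| = \|f(X_t) - f(X_{t-1})\| \le L \, \|X_t - X_{t-1}\|$.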
Thus state updates (differences) across iterates are bounded to within a factor of the deviation in the previous iteration. This property can be leveraged to compress state updates using delta encoding [26], where states are represented in the form of differences from the previous state. Then, it suffices to store the state at certain checkpoints of the computational process, with the iterates between checkpoints represented by the updates.
We describe the construction inductively, starting with the initial state , assumed to be the first checkpoint. Let us assume that the state reported at time is and the true state is . Then, if , define the update as
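The cost of communication (for validation) and storage (for verification) of these updates is reduced by performing lossy compression (vector quantization [27]). Let the quantizer be represented by $Q(\cdot)$ and let the maximum quantization error magnitude be $\epsilon$, i.e., if the client reports $\widetilde{\Delta X}_t = Q(\Delta X_t)$ for a state update $\Delta X_t$, then,
(5) $\|\widetilde{\Delta X}_t - \Delta X_t\| \le \epsilon$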
Additionally, the checkpoints can also be compressed using a Lempel-Ziv-like dictionary-based lossy compressor. Here, a dictionary of unique checkpoints is maintained. For each new checkpoint, we first check if the state is within a margin of an entry in the dictionary; if so, the index of this entry is reported. If not, the state is added to the dictionary and its index is reported. Other universal vector quantizers can also be utilized for compressing checkpoints, and we denote this quantizer by .
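The dictionary-based checkpoint compressor can be sketched as follows. This is only an illustration under our own assumptions: the class name `CheckpointDictionary`, the linear scan over entries, and the use of the Euclidean distance for the margin test are choices of the sketch, not details fixed by the scheme.

```python
import math

class CheckpointDictionary:
    """Lossy dictionary coder: a checkpoint is replaced by the index of a nearby stored entry."""

    def __init__(self, margin):
        self.margin = margin
        self.entries = []  # unique checkpoints seen so far

    def encode(self, state):
        """Return the index of an entry within `margin` of `state`, adding a new entry if none exists."""
        for index, entry in enumerate(self.entries):
            if math.dist(state, entry) <= self.margin:
                return index
        self.entries.append(list(state))
        return len(self.entries) - 1

    def decode(self, index):
        """Return the stored checkpoint for a reported index."""
        return self.entries[index]

codebook = CheckpointDictionary(margin=0.1)
first = codebook.encode([0.0, 0.0])    # new entry
second = codebook.encode([0.05, 0.0])  # close to the first entry, so its index is reused
third = codebook.encode([1.0, 1.0])    # far from all entries, so a new entry is created
```

A decoder needs the same dictionary to recover checkpoints; how new entries are shared with endorsers is left open here.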
Let be the maximum magnitude of a state update within a frame, i.e., if , the client creates a checkpoint at and reports . Then is reported as
(6) 
The resulting sequence of frames is as shown in Fig. 2.
Separate from creating new checkpoints adaptively, the system can also restrict the maximum size of a frame by a constant to limit the computational overhead of its validation. Fig. 3 summarizes the tasks performed by the computing client.
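The client-side frame construction can be illustrated with the following sketch. It assumes a per-coordinate uniform quantizer as a stand-in for the vector quantizer, takes each update relative to the previously reported state so that quantization errors do not accumulate, and stores checkpoints uncompressed. The names `build_frames`, `max_update`, `max_frame_size`, and `step` are hypothetical.

```python
import math

def quantize(vector, step):
    """Uniform per-coordinate quantizer (an assumed stand-in for a vector quantizer)."""
    return [step * round(v / step) for v in vector]

def build_frames(states, max_update, max_frame_size, step):
    """Group states into frames, each a checkpoint followed by quantized state updates."""
    frames = [{"checkpoint": list(states[0]), "updates": []}]
    reported = list(states[0])
    for state in states[1:]:
        frame = frames[-1]
        too_large = math.dist(state, reported) > max_update
        frame_full = len(frame["updates"]) + 1 >= max_frame_size
        if too_large or frame_full:
            # Start a new frame with this state as its checkpoint.
            frames.append({"checkpoint": list(state), "updates": []})
            reported = list(state)
        else:
            update = [x - r for x, r in zip(state, reported)]
            q = quantize(update, step)
            frame["updates"].append(q)
            reported = [r + d for r, d in zip(reported, q)]
    return frames

def decode_frame(frame):
    """Rebuild the reported states of a frame from its checkpoint and updates."""
    states = [list(frame["checkpoint"])]
    for q in frame["updates"]:
        states.append([s + d for s, d in zip(states[-1], q)])
    return states

states = [[0.0, 0.0], [0.1, 0.1], [0.2, 0.15], [3.0, 3.0], [3.1, 3.0]]
step = 0.01
frames = build_frames(states, max_update=1.0, max_frame_size=10, step=step)
decoded = [s for frame in frames for s in decode_frame(frame)]
```

In this toy run the jump to the fourth state exceeds the update bound and opens a second frame, while every decoded state stays within the quantizer resolution of the true state.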
The choice of design parameters, , are to be made such that the reports are accurate enough for validation. The optimal design choice is shown in the following results.
Theorem 1
If is Lipschitz continuous, and , then, a state is invalidated by an honest endorser only if there is a computational error of magnitude at least , i.e., .
Proof:
Corollary 1
If , then is invalidated.
The necessary and sufficient conditions for invalidation in Thm. 1 and Cor. 1 highlight the fact that computational errors of magnitude less than are missed, and any error of magnitude at least is certainly detected. When the approximation error is made arbitrarily small, all errors beyond the tolerance are detected. A variety of vector quantizers, satisfying Thm. 1 can be used for lossy delta encoding—one simple choice is lattice vector quantizers [28].
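To make the lattice option concrete, the sketch below uses the scaled cubic lattice, the simplest lattice, and picks the cell side so that the worst-case Euclidean quantization error equals a target value. The function names and the target error are assumptions of the sketch; other lattices would give different constants.

```python
import math
import random

def cubic_lattice_step(max_error, dim):
    """Cell side for which a cube's corner lies exactly `max_error` from its centre."""
    return 2.0 * max_error / math.sqrt(dim)

def lattice_quantize(vector, step):
    """Map each coordinate to the nearest multiple of `step` (nearest scaled cubic lattice point)."""
    return [step * round(v / step) for v in vector]

max_error, dim = 0.05, 4
step = cubic_lattice_step(max_error, dim)

# Empirically check the error bound on random updates.
rng = random.Random(0)
samples = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(1000)]
worst = max(math.dist(v, lattice_quantize(v, step)) for v in samples)
```

Since each coordinate is off by at most half a cell side, the Euclidean error is at most `step * sqrt(dim) / 2`, which is the chosen target.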
Theorem 2
Let and let , then the communication and storage cost per state update is bits.
This follows directly from the covering number of using balls; a similar cost is incurred for other standard lattices.
Theorem 3
For any frame , with checkpoint at , maximum number of states in the frame, , is bounded as
(10) 
where , is the first update in the frame.
Proof:
This provides a simple sufficient condition on the size of a frame, in terms of the magnitude of the first iteration in the frame. Naturally a small first iterate implies the possibility of accommodating more iterates in the frame. This lower bound can be used in identifying the typical frame size and the corresponding costs of communication and computation involved, prior to the design of the scheme. We describe the generalization of the compressor to the parameter unaware setting in Sec. VII.
V-C Endorser and Orderer Operations
We now define the role of an endorser in validating a frame. A summary of the operations is depicted in Fig. 4. For preliminary analysis, we assume that endorsers are honest and are homogeneous in terms of communication latency and computational capacity. A more refined allocation policy can be designed to account for the case of variabilities in communication and computational costs. However we do not consider this in this paper.
Each endorser involved in validating a frame, sequentially checks the state updates by recomputing from the last valid state, i.e., to validate the report , the endorser computes and checks for the validity criterion (3). The frame is reported as valid if all updates are valid in the frame. The endorsements are then reported to the orderer.
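The sequential check performed by an endorser can be sketched as below, assuming a deterministic atomic operation and a frame given as a checkpoint plus a list of reported updates. The function `validate_frame`, the toy operation `halve`, and the margin value are illustrative assumptions.

```python
import math

def validate_frame(checkpoint, updates, atomic_op, margin):
    """Return (True, None) if all updates are valid, else (False, index of the first invalid update)."""
    last_valid = list(checkpoint)
    for index, update in enumerate(updates):
        reported = [s + d for s, d in zip(last_valid, update)]
        recomputed = atomic_op(last_valid)  # recompute from the last valid state
        if math.dist(reported, recomputed) > margin:
            return False, index
        last_valid = reported
    return True, None

def halve(state):
    """Toy deterministic atomic operation (assumed example)."""
    return [0.5 * x for x in state]

checkpoint = [1.0, 1.0]
honest_updates = [[-0.5, -0.5], [-0.25, -0.25]]
faulty_updates = [[-0.5, -0.5], [-0.1, -0.25]]  # second update deviates from the true step

honest_result = validate_frame(checkpoint, honest_updates, halve, margin=0.01)
faulty_result = validate_frame(checkpoint, faulty_updates, halve, margin=0.01)
```

Returning the index of the first invalid update is a convenience of the sketch; the endorsement itself only needs the frame-level outcome.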
Individual update validations can also be performed in parallel and finally verified for sequential consistency. Such parallelism can be performed either at the individual endorser level, or in the form of the distribution of the subframes across endorsers. This results in a reduction of the time required for validating a frame. For the sake of simplicity, we skip these extensions in this paper.
Upon receiving the endorsements for frames, the orderer checks for consistency of the checkpoints and adds a valid frame to the blockchain ledger if all prior frames have already been added, and broadcasts the block to other peers. This is depicted in Fig. 5.
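The ordering rule, that a frame is added only once all prior frames are on the ledger, can be sketched as follows. The class `Orderer`, its buffering of out-of-order frames, and the integer frame indices are assumptions of this sketch; the checkpoint consistency check and the broadcast step are omitted.

```python
class Orderer:
    """Adds endorsed frames to the ledger strictly in frame order."""

    def __init__(self):
        self.ledger = []   # frames committed so far, in order
        self.pending = {}  # endorsed frames waiting for earlier frames

    def receive(self, frame_index, frame, endorsed):
        """Record an endorsement outcome and return the indices committed as a result."""
        if not endorsed:
            return []  # an invalid frame is never added, so later frames keep waiting
        self.pending[frame_index] = frame
        committed = []
        while len(self.ledger) in self.pending:
            next_index = len(self.ledger)
            self.ledger.append(self.pending.pop(next_index))
            committed.append(next_index)
        return committed

orderer = Orderer()
out_of_order = orderer.receive(1, "frame-1", endorsed=True)  # waits for frame 0
caught_up = orderer.receive(0, "frame-0", endorsed=True)     # commits frames 0 and 1
blocked = orderer.receive(3, "frame-3", endorsed=True)       # waits for frame 2
rejected = orderer.receive(2, "frame-2", endorsed=False)     # frame 2 is not validated
```

Because frames are validated in parallel, endorsements may arrive out of order, which is why the sketch buffers later frames instead of discarding them.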
Since the state updates are stored on the immutable data structure of the blockchain, they provide an avenue for verification of the computations at a later stage. As described in (4), the verification requirements are not as strict as the validation requirements. Thus it suffices to subsample the updates in a frame and store only a subset, i.e., one state is stored for every iterates. Then, the effective state update is the sum of the individual updates of the intermediate iterates.
A block stored on the blockchain is now characterized by the audits that are either the checkpoints or the cumulative updates corresponding to successive iterates. The audits are then defined by
(13) 
where the sum is over the intermediate iterates, and is the next checkpoint. Then, the audits are grouped into blocks as described by Fig. 2 and added to the blockchain ledger by the orderer.
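The subsampling step can be sketched as follows: consecutive updates in a frame are summed in groups, and only the cumulative updates are stored as audits. The function name and the parameter `period` are hypothetical, and the final group is allowed to be shorter when the frame ends at the next checkpoint.

```python
def subsample_updates(updates, period):
    """Replace every `period` consecutive updates by their coordinate-wise sum."""
    audits = []
    for start in range(0, len(updates), period):
        group = updates[start:start + period]
        audits.append([sum(column) for column in zip(*group)])
    return audits

updates = [[1, 0], [2, 1], [3, 1], [4, 0], [5, 2]]
audits = subsample_updates(updates, period=2)

# Applying all audits to the checkpoint reaches the same final state as applying all updates.
total_from_updates = [sum(column) for column in zip(*updates)]
total_from_audits = [sum(column) for column in zip(*audits)]
```

Integer updates are used here only so that the sums are exact; with quantized real-valued updates the same grouping applies.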
Theorem 4
For subsampled storage at a frequency according to (13), a Lipschitz constant of , and quantization error ,
(14) 
where .
Proof:
Thus, a viable subsampling frequency can be determined by finding a such that . This reduces the storage cost on the blockchain at the expense of accuracy of recorded audits. If the agents are interested in increasing accuracy of the records over time, then the quantizers can be dynamically adjusted accordingly.
V-D Example Application
Let us now elaborate the design from the context of a synthetic example that is used later for experimental study in Sec. VIII. Consider an agent, the client, in a network that wishes to address a simple classification problem using training data that it has access to. The client aims to train a neural network using backpropagation based on mini-batch stochastic gradient descent (SGD) to solve this classification problem and subsequently share the trained network with other agents who are also interested in the solution.
However, the client is limited by the amount of computational resources available for training, and also does not wish to share the private data used for training. Since it has limited resources toward gradient computation, it uses small batch sizes to get faster estimates. However, the peers do not trust the computations performed by the client, not just because of its proclivity toward errors arising from computational limitations, but also because of possible malicious intent. The peers themselves have access to private datasets, drawn from the same source, but much smaller in size such that they cannot train a network on their own for the task.
In such a context, we can establish distributed trust in the agents using the multi-agent blockchain framework (MBF) as follows:

The client sets up the training with parametric inputs (network architecture, learning rates, batch sizes etc.) and shares them over the blockchain with other peers.

The client runs the training algorithm, compresses state updates (network weights) using lossy compression and reports compressed frames to endorsers for validation.

The MBF orderers check for consensus among endorsers and subsample the frame to construct blocks. They then add blocks sequentially to the blockchain ledger.

The client reports the trained network to peers at the end of training.
Since the experiment is run on the MBF platform, peers are assured of validity of steps of the training, and also have access to the blockchain to verify the computations. Since the private training data is not shared across peers, the endorsement process for revalidation needs to be appropriately adjusted. This is described in Sec. VII, and a detailed experimental study of this problem, adapted to the MNIST dataset, is done in Sec. VIII.
Thus the MBF platform addresses the trust issues described in Sec. III and allows for efficient collaboration and trusted data, model, and result sharing among agencies involved in malaria research and policy design.
VI Design Advantages and Costs
We now perform a cost-benefit analysis of the design. To the best of our knowledge this is the first system designed to address trust in such systems and so we benchmark the costs against simpler implementations to emphasize the importance of the different components of the system.
Let us first identify the advantages of the platform.

Accountability: The MBF platform guarantees provenance through the immutable record of computations. Thus, we can not only detect the source of potential conflicts, but also trace ownership of computations.

Transparency: The platform establishes trust among agents through a transparent record of the validated computational trajectories.

Adaptivity: The frame design, endorsement, and validation methods adapt according to the state evolution. Further, the validity margins can be altered across time by dynamically varying the quantizers. In convergent simulations/algorithms, the system can thus use monotonically decreasing margins to obtain stricter guarantees at convergence.

Generality: The platform uses fairly general building blocks, and can be easily implemented using existing methods.

Computation universality: The design is agnostic to computational process specifics and can be implemented as long as it is composed of reproducible atomic computations.

Scientific reproducibility: By storing intermediate states this platform guarantees reliable data and model sharing, and collaborative research. It thus facilitates scientific reproducibility in large-scale computational experiments.
To compare the costs of the system, let us consider three different modes for such blockchain-based distributed trust:

Transaction Mode: Here we treat each iteration as a transaction and validate and store each state transition as a block on the blockchain ledger independently.

Streaming Mode: Here each state is independently compressed according to a universal compressor, validated, and stored on the blockchain.

Batch Mode: This corresponds to the MBF design described in this paper.
Let us assume that the average number of endorsers per frame be , the average size of the frame be , and the subsampling frequency be in the batch mode. We benchmark costs relative to this average set of iterations, and the same computational redundancy.
First, let us consider the computational overhead involved. Each mode performs times as many computations as the untrusted simulation. The streaming and batch modes additionally incur the cost of compression and decompression of states. The batch mode also includes the cost of subsampling the frames. Thus we can see that the transaction mode incurs the least computational overhead, while the batch mode incurs the most. Informally, the batch mode incurs a cost of
(17) 
where are the computational costs of the simulation, compression and decompression, and subsampling respectively. The transaction and streaming modes incur just the first and the first two costs respectively.
The communication overheads include the state reports and metadata used for validation and coordination respectively. In the transaction mode, as states are uncompressed, the communication cost is significant and is not scalable. On the other hand, the streaming and batch modes reduce these costs through lossy compression. Assuming a bounded set of states, , such that , the worstcase sufficient communication cost in transaction mode using vector quantization for iterations is
(18) 
where is the average communication cost of metadata, per instance of communication. On the other hand, the batch mode reduces both the compression cost and the metadata, as
(19) 
The costs expressed are sufficient costs in the order sense and more precise estimates can be computed given the compression scheme and statistics of state evolution.
Similarly, with regard to storage, the transaction mode incurs significant costs on account of not compressing the audits. The batch mode not only incurs less metadata for storage but also fewer state updates on account of subsampling when compared to the streaming mode. To be precise, in the order sense, whereas
(20) 
Thus the batch mode reduces communication and storage overheads at the expense of added computational cost. Through a careful analysis of the tradeoffs, we can adopt optimal compression schemes and subsampling mechanisms.
VII Extensions of Design
We now describe a couple of avenues for generalization.
VII-A Parameter Agnostic Design
In Sec. V we used vector quantizers based on the Lipschitz constant. In practice, such parameters of the computational algorithm are not known a priori. Underestimating this constant can result in using a larger quantization error, which could cause errors in validation even when the client computes correctly. In such cases, it is essential to be able to identify the cause of the error.
One option is to estimate the Lipschitz constant from computed samples. This translates to estimating the maximum gradient magnitude for the atomic operation, which might be expensive in sample and computational complexity, depending on the application. Thus, we propose an alternative compression scheme.
We draw insight from video compression strategies, and propose the use of successive refinement coding [29] of the state updates. That is, a compression bit stream is generated for each state update such that the accuracy can be improved by sending additional bits from the bit stream. Successive refinement allows the endorsers to provide updates on the report such that the state accuracy can be iteratively improved.
Thus, if an invalidation notice is received from endorsers, the client has two options: checking the computations, and/or refining the reported state through successive refinement. Depending on the computation-communication cost tradeoff, the client chooses the more economical alternative. Through successive refinement, the client provides more accurate descriptions of the state vector, and thus reduces the possibility of validation errors caused by report inaccuracy.
One possible efficient compression technique uses lattice vector quantizers [30, 31] to define successive refinement codes. This also reduces the size of the codebook, if the refinement lattices are assumed to be of the same geometry, because the client only needs to communicate the scaling corresponding to the refinement. This allows for improved adaptability in the refinement updates. More efficient quantizers can also be defined if additional information regarding the application and state updates is available.
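To make the idea of refinement by rescaling concrete, the following is a minimal Python sketch of successive refinement over scaled copies of the integer lattice. It is only an illustration under simplifying assumptions: the cubic lattice stands in for the more efficient lattices of [30, 31], and the names `encode`, `decode`, `base_scale`, and `shrink` are hypothetical. Each stage quantizes the residual left by the previous stages, so the client sends only integer indices together with the scale of that stage.

```python
def encode(state, base_scale, shrink, stages):
    """Client side: produce one (scale, indices) message per refinement stage."""
    messages = []
    approx = [0.0] * len(state)
    scale = base_scale
    for _ in range(stages):
        # Quantize the current residual to the nearest point of scale * Z^n.
        indices = [round((x - a) / scale) for x, a in zip(state, approx)]
        approx = [a + scale * i for a, i in zip(approx, indices)]
        messages.append((scale, indices))
        scale /= shrink  # same lattice geometry, finer scaling
    return messages


def decode(messages, dim):
    """Endorser side: rebuild the state from the messages received so far."""
    approx = [0.0] * dim
    for scale, indices in messages:
        approx = [a + scale * i for a, i in zip(approx, indices)]
    return approx


if __name__ == "__main__":
    state = [0.73, -1.41, 2.05]
    msgs = encode(state, base_scale=1.0, shrink=4.0, stages=3)
    for k in range(1, len(msgs) + 1):
        recon = decode(msgs[:k], len(state))
        err = max(abs(x - y) for x, y in zip(state, recon))
        print(f"stage {k}: max error {err:.5f}")
```

After stage k the per-coordinate error is at most half the scale used in that stage, so an endorser that invalidates a report because of quantization error can ask for one more message rather than a full retransmission.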
VII-B Computations with External Randomness
As described in Sec. IV, such computational algorithms in practice typically evolve iteratively as a function of the current state and an external source of randomness. When this randomness is not shared across agents, and is inaccessible to the client, reproduction of the reported results by an endorser becomes infeasible, and so does validation of local computations. This could also arise in cases where the client is unwilling to share private data used by the algorithm with other agents [14].
For instance, in simulations of disease spread using black box models, each run of the simulation adopts a different outcome, depending on the underlying random elements introduced by the model to mimic societal and pathological disease spread factors. Quite often, the client does not have access to all the random elements introduced by the model in creating that particular outcome.
Whereas the exact random instance might not be available, the source of such randomness is often common, i.e., , and is known. In this context, we redefine validation as guaranteeing (3) with probability at least , i.e.,
(21) 
This requirement removes outliers in the computation process and only allows trajectories close to the expected behavior.
Then, we can exploit the law of large numbers to validate reports by their deviation from the average behavior observed across multiple independent endorsers,
where . By choosing a sufficiently large number of endorsers, depending on , we can assure (21). The role of the endorsers is appropriately modified and the system calls for higher coordination amongst the endorsers.
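The averaging rule described here can be sketched in a few lines of Python. This is a simplified illustration that assumes each endorser independently reruns the same step and reports a state vector of the same dimension; the function names and the use of the Euclidean norm are assumptions made for the example.

```python
import math


def mean_state(states):
    """Coordinate-wise average of the endorsers' state vectors."""
    n = len(states)
    return [sum(column) / n for column in zip(*states)]


def validate_against_endorsers(report, endorser_states, tolerance):
    """Accept the client's report if it lies within `tolerance`
    (Euclidean distance) of the endorsers' average state."""
    average = mean_state(endorser_states)
    deviation = math.dist(report, average)
    return deviation <= tolerance, deviation


if __name__ == "__main__":
    endorsers = [[1.0, 2.0], [3.0, 4.0]]
    print(validate_against_endorsers([2.0, 3.5], endorsers, tolerance=1.0))
```

Because the reference point is an average rather than a single recomputation, the endorsers must exchange their states before voting, which is the additional coordination noted above.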
Using multivariate concentration inequalities, we can also quantify the sufficient number of endorsers for validation.
Theorem 5
Let . For a state at time , if the average of endorsers is used for validation,
(22) 
where is the maximum eigenvalue of the covariance matrix of the quantized state vector.
Proof:
For a dimensional random variable with , according to the multidimensional Chebyshev inequality [32],
Then, using the fact that , for any vector and matrix with minimum and maximum eigenvalues , we have
(23) 
where is the maximum eigenvalue of .
Corollary 2
To guarantee validation with probability at least , for a margin of deviation of , where , it suffices to use
(30) 
endorsers.
This sufficient condition follows directly from Thm. 5.
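For intuition on how such a sufficient number scales, the sketch below uses a generic trace-based Chebyshev bound rather than the exact expression of the corollary. It assumes m independent endorsers whose states share a d-by-d covariance matrix with maximum eigenvalue lambda_max, so that the average deviates from its mean by at least epsilon with probability at most d * lambda_max / (m * epsilon^2). The constants are not claimed to match (22) or (30), and all names are hypothetical.

```python
import math


def sufficient_endorsers(dim, max_eigenvalue, margin, failure_prob):
    """Smallest m with dim * max_eigenvalue / (m * margin**2) <= failure_prob.

    Assumption: generic Chebyshev bound via the trace of the covariance,
    not necessarily the constants of the paper's corollary.
    """
    if dim <= 0 or max_eigenvalue < 0 or margin <= 0:
        raise ValueError("dim and margin must be positive, eigenvalue non-negative")
    if not 0 < failure_prob < 1:
        raise ValueError("failure_prob must lie strictly between 0 and 1")
    bound = dim * max_eigenvalue / (failure_prob * margin ** 2)
    return max(1, math.ceil(bound))


if __name__ == "__main__":
    print(sufficient_endorsers(dim=8, max_eigenvalue=1.0,
                               margin=0.5, failure_prob=0.125))
```

Under this bound, halving the margin of deviation quadruples the number of endorsers, while the dependence on the state dimension and on the failure probability is linear and inverse-linear respectively.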
VIII Experiments
In this section, we run some simple synthetic experiments using the MNIST database [33], for the scenario described in Sec. V-D, to understand the distributed trust environment design, the costs involved, and the benefits of the enforcement. These synthetic experiments were selected to evaluate the efficacy of our approach in a domain that is familiar, using the process of training neural networks that is common in the research community.
Let us consider a simple layer neural network, trained on the MNIST database, with neurons in the hidden layer. Consider a client training the neural network using mini-batch stochastic gradient descent (SGD) with limited resources, such that it is constrained in computing gradients and so uses a small batch size of samples per iteration and iterations. The average precision of such a neural network trained with gradient descent is . We now wish to establish trust in the training process owing to the limited resources of the client. Whereas this configuration is far from the state of the art on the database, it does help understand the trust environment better owing to its suboptimality.
Since the training process uses stochastic gradients, exact recomputation of the iterates is infeasible. Hence, we compare deviations from the average across endorsers per state for validation. We evaluate the computation and communication cost of validation as a function of the tolerance chosen for validation. Since the neural network converges to a local minimum according to SGD, we use a tolerance for iteration as . That is, the validation requirements are made stronger with the iterations.
We consider three main cases of the simulation:

Base case: Compression error is less than validation tolerance, i.e., , and maximum frame size is of the total number of iterations.

Coarse compression: Large compression error, i.e., for at least some instances, and same base .

Large frames: Same base compression error, and maximum frame size is of total number of iterations.
In the base case, invalidations from approximation errors are more frequent in later iterations, when the tolerance is also lower. However, with increasing iterations, the network weights are also closer to the minima. Thus approximation errors can be eliminated by successive refinement, as gradient estimates by the client also get more accurate. The presence of outliers and smaller batch sizes impact the initial iterations much more, which are reported with comparatively better accuracy, as required by the weaker validation criterion, therein only invalidating computational errors.
In comparison, in the case of coarse compression, approximation errors of the gradients are much more likely, therein resulting in more instances of invalidation. This translates to a higher number of gradient recomputations at the expense of reduced communication overhead on the compressed state updates. On the other hand, in the case of the extremely large frames, the endorsers validate longer sequences of states at once. Thus, each invalidation results in a recomputation of all succeeding states, therein increasing the number of recomputations from the base case. This case however reduces the number of frames and checkpoints, therein reducing the average communication cost in comparison to the base case.
In Fig. 6, the average number of gradient recomputations per iteration is shown for these three cases. As expected, this decays sharply as we increase the tolerance. Note that at either extreme, the three cases converge in the number of recomputations. This is owing to the fact that at one end all gradients are accepted whereas at the stricter end, most gradients are rejected with high probability, irrespective of the compression parameters. In the moderate tolerance range, we observe the tradeoffs as explained above. The corresponding communication cost tradeoff is shown in Fig. 7.
Fig. 8 shows the precision of the neural network trained under the validation requirement as compared against the networks trained with standard mini-batch SGD of batch sizes and . We note that the network trained with trust outperforms the case of vanilla SGD with the same batch size as it eliminates spurious gradients at validation. Increasing trust requirements (decreasing tolerance) results in improved precision of the model. In particular, it is worth noting that the strictest validation criterion results in performance that is almost as good as training with a batch size of . This is understandable as the endorsers validate only those gradients that are close to that of the case with mini-batch of size . In fact, even when the trust requirements are fairly relaxed, just eliminating outliers in the gradients enhances the training significantly.
Thus, these simple experiments highlight not only the role of trust in guaranteeing local and global consistency in the computational process, but also the cost tradeoffs involved in establishing it. For this application, for instance, appropriate parameters can be chosen by studying the precision-cost tradeoffs. Other applications might invoke similar tradeoffs, implicit or explicit, in the trust guarantees and the resulting cost of implementation.
IX Conclusion
In this paper we considered a multiagent computational platform and the problem of establishing trust in the computations performed by individual agents in such a system. Using a novel combination of blockchains and distributed consensus through recomputation, we assured validity of local computations and simple verification of computational trajectories. Using efficient, universal compression techniques, we also identified methods to reduce the communication and storage overheads concerned with establishing trust in such systems, therein addressing the scalability challenge posed by blockchain systems.
Creation of such trusted platforms for distributed computation among untrusting agents allows for improved collaboration, and efficient data, model, and result sharing that is critical to establishing efficient policy design mechanisms. Additionally, such platforms provide a unified venue for sharing results and help ensure scientific reproducibility.
Acknowledgment
This work was conducted under the auspices of the IBM Science for Social Good initiative. The authors thank Aleksandra Mojsilović and Komminist Weldemariam for discussions and support.
References
 [1] D. J. Power, “Data science: supporting decision-making,” J. Decis. Sys., vol. 25, no. 4, pp. 345–356, Apr. 2016.
 [2] D. Shah, “Data science and statistics: Opportunities and challenges,” Technol. Rev., Sep. 2016.
 [3] T. Smith, N. Maire, A. Ross, M. Penny, N. Chitnis, A. Schapira, A. Studer, B. Genton, C. Lengeler, F. Tediosi, D. d. Savigny, and M. Tanner, “Towards a comprehensive simulation model of malaria epidemiology and control,” Parasitology, vol. 135, no. 13, pp. 1507–1516, Aug. 2008.
 [4] J. D. Piette, S. L. Krein, D. Striplin, N. Marinec, R. D. Kerns, K. B. Farris, S. Singh, L. An, and A. A. Heapy, “Patient-centered pain care using artificial intelligence and mobile health tools: protocol for a randomized study funded by the us department of veterans affairs health services research and development program,” JMIR Res. Protocols, vol. 5, no. 2, 2016.
 [5] J. Nelson, “The operation of nongovernmental organizations (ngos) in a world of corporate and other codes of conduct,” Corporate Social Responsibility Initiative, Mar. 2007.
 [6] “CDC/ATSDR policy on releasing and sharing data,” Sep. 2005. [Online]. Available: https://www.cdc.gov/maso/policy/releasingdata.pdf
 [7] W. G. V. Panhuis, P. Paul, C. Emerson, J. Grefenstette, R. Wilder, A. J. Herbst, D. Heymann, and D. S. Burke, “A systematic review of barriers to data sharing in public health,” BMC Public Health, vol. 14, no. 1, p. 1144, Feb. 2014.
 [8] K. Croman, C. Decker, I. Eyal, A. E. Gencer, A. Juels, A. Kosba, A. Miller, P. Saxena, E. Shi, E. G. Sirer, D. Song, and R. Wattenhofer, “On scaling decentralized blockchains,” in Financial Cryptography and Data Security, ser. Lecture Notes in Computer Science, J. Clark, S. Meiklejohn, P. Y. A. Ryan, D. Wallach, M. Brenner, and K. Rohloff, Eds. Berlin: Springer, 2016, vol. 9604, pp. 106–125.
 [9] S. L. Remy, O. Bent, and N. Bore, “Reshaping the use of digital tools to fight malaria,” arXiv:1805.05418 [cs.CY], May 2018.
 [10] O. Bent, S. L. Remy, S. Roberts, and A. Walcott-Bryant, “Novel exploration techniques (nets) for malaria policy interventions,” arXiv:1712.00428 [cs.AI], Dec. 2017.
 [11] O. E. Gundersen and S. Kjensmo, “State of the art: Reproducibility in artificial intelligence,” in Proc. 32nd AAAI Conf. Artif. Intell., New Orleans, USA, Feb. 2018.
 [12] P. Henderson, R. Islam, P. Bachman, J. Pineau, D. Precup, and D. Meger, “Deep reinforcement learning that matters,” arXiv:1709.06560v2 [cs.LG], Nov. 2017.
 [13] J. Singh, J. Cobbe, and C. Norval, “Decision provenance: Capturing data flow for accountable systems,” arXiv:1804.05741 [cs.CY], Apr. 2018.
 [14] D. Verma, S. Calo, and G. Cirincione, “Distributed ai and security issues in federated environments,” in Proc. Workshop Program 19th Int. Conf. Distrib. Comput. Netw., ser. Workshops ICDCN ’18, Jan. 2018.
 [15] D. A. Grier, “Error identification and correction in human computation: Lessons from the wpa.” in Human Computation, 2011.
 [16] E. Androulaki, A. Barger, V. Bortnikov, C. Cachin, K. Christidis, A. D. Caro, D. Enyeart, C. Ferris, G. Laventman, Y. Manevich, S. Muralidharan, C. Murthy, B. Nguyen, M. Sethi, G. Singh, K. Smith, A. Sorniotti, C. Stathakopoulou, M. Vukolić, S. W. Cocco, and J. Yellick, “Hyperledger fabric: A distributed operating system for permissioned blockchains,” in Proc. 13th EuroSys Conf., ser. EuroSys ’18, Apr. 2018, pp. 30:1–30:15.
 [17] D. Tapscott and A. Tapscott, Blockchain Revolution: How the Technology behind Bitcoin is Changing Money, Business, and the World. New York: Penguin, 2016.
 [18] M. Iansiti and K. R. Lakhani, “The truth about blockchain,” Harvard Bus. Rev., vol. 95, no. 1, pp. 118–127, Jan. 2017.
 [19] M. Vukolić, “Rethinking permissioned blockchains,” in Proc. ACM Workshop Blockchain, Cryptocurrencies and Contracts, ser. BCC ’17, Apr. 2017, pp. 3–7.
 [20] J. Tsai, “Transform blockchain into distributed parallel computing architecture for precision medicine,” in 2018 IEEE 38th Int. Conf. Distrib. Comput. Systems (ICDCS), Jul. 2018, pp. 1290–1299.
 [21] H. I. Ozercan, A. M. Ileri, E. Ayday, and C. Alkan, “Realizing the potential of blockchain technologies in genomics,” Genome Research, 2018.
 [22] R. Falcone, M. Singh, and Y.-H. Tan, Trust in cyber-societies: integrating the human and artificial perspectives. Springer Science & Business Media, 2001, vol. 2246.
 [23] S. P. Marsh, “Formalising trust as a computational concept,” Ph.D. dissertation, University of Stirling, 1994.
 [24] P. Dasgupta, “Trust as a commodity,” Trust: Making and breaking cooperative relations, vol. 4, pp. 49–72, 2000.
 [25] S. D. Ramchurn, D. Huynh, and N. R. Jennings, “Trust in multiagent systems,” Knowl. Eng. Review, vol. 19, no. 1, pp. 1–25, 2004.
 [26] C. W. Granger and R. Joyeux, “An introduction to long-memory time series models and fractional differencing,” J. Time Ser. Anal., vol. 1, no. 1, pp. 15–29, Jan. 1980.
 [27] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Springer Science & Business Media, 2012, vol. 159.
 [28] S. D. Servetto, V. A. Vaishampayan, and N. J. A. Sloane, “Multiple description lattice vector quantization,” in Proceedings DCC’99 Data Compression Conference (Cat. No. PR00096), Mar. 1999, pp. 13–22.
 [29] W. H. R. Equitz and T. M. Cover, “Successive refinement of information,” IEEE Trans. Inf. Theory, vol. 37, no. 2, pp. 269–275, Mar. 1991.
 [30] D. Mukherjee and S. K. Mitra, “Successive refinement lattice vector quantization,” IEEE Trans. Image Process., vol. 11, no. 12, pp. 1337–1348, Dec. 2002.
 [31] Y. Liu and W. A. Pearlman, “Multistage lattice vector quantization for hyperspectral image compression,” in Conf. Rec. 41st Asilomar Conf. Signals, Syst. Comput., Nov. 2007, pp. 930–934.
 [32] A. W. Marshall and I. Olkin, “Multivariate Chebyshev inequalities,” Ann. Math. Stat., vol. 31, no. 4, pp. 1001–1014, Dec. 1960.
 [33] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998.