
# Occupant Plugload Management for Demand Response in Commercial Buildings: Field Experimentation and Statistical Characterization

## Abstract

Commercial buildings account for 36% of US electricity consumption, of which nearly two-thirds is met by fossil fuels eia2016electric (). Reducing this environmental impact requires improving energy efficiency by lowering energy consumption. Overall building energy consumption comprises Heating, Ventilation, and Air Conditioning (HVAC); Lighting; and Plug and Process Loads (PPLs) nrelplugloads (). Most existing studies focus on designing methods to regulate HVAC and lighting systems. However, methods to regulate occupant plugload remain relatively poorly understood. To this end, we conducted multiple experiments to study changes in occupant plugload consumption due to monetary incentives and/or feedback. These experiments were performed in commercial office and university buildings within the NASA Ames Research Center. Analysis of the data from these experiments reveals significant plugload reduction in the presence of feedback and/or incentive interventions. We propose autoregressive models to predict expected plugload savings in the presence of exogenous variables. Our contribution facilitates the consideration of occupants-in-the-loop within the classical demand response paradigm.

###### keywords:
Plugload, Demand Response, Commercial buildings, Energy, Experiment design, Dashboard, Feedback, Incentives, Sustainability, Occupants, Building facilities, Statistical inference

## 1 Introduction

This paper makes the following contributions:

1. It proposes paired experiment designs to study the effects of dashboard-enabled feedback and/or incentives on occupant plugload. Unlike most related studies, which monitor aggregate plugload consumption, this study employs device-level real-time monitoring of each device of every participant using smart powerstrips.

2. It provides statistical characterization of the data from experiments in line with the design assumptions. Hypothesis tests are conducted and confidence intervals are estimated to answer questions about the efficacy of interventions. Autoregressive models with exogenous inputs are proposed to model occupant plugload consumption. This allows for considering occupants-in-the-loop within the demand response framework.

3. It provides findings from independent experiments in both office and university environments to assess generalizability of plugload reduction due to dashboard feedback.

The rest of this paper is organized as follows: Section 2 describes the design and execution of the experiments at NASA SB and CMU SV. A statistical analysis of the data from these experiments is presented in Section 3, along with results and discussion. Concluding remarks are presented in Section 4.

## 2 Experiment Design and Execution

We designed and conducted experiments to study the influence of incentives and/or feedback interventions on occupant plugload consumption. Our research hypothesis is:

• Providing incentives and/or dashboard-based feedback to occupants in commercial buildings reduces occupant plugload consumption.

Consequently, we examine the claim that the average occupant plugload consumption in the presence of an incentive and/or feedback is less than the consumption in the absence of incentive and/or feedback based on data from the experiments. In the rest of this section, we present various aspects of the setup employed for purposes of experimentation.

### 2.1 Location and Duration

Two experiments were conducted within the NASA Ames Research Center, one within an office environment (NASA SB) and the other within a university environment (CMU SV, buildings 19 and 23). Let the symbols EN and EC denote the experiments at NASA SB and CMU SV, respectively. Each experiment was divided into multiple phases corresponding to a baseline phase and one or more experiment phases. The duration of each phase and the respective interventions are specified in Table 2.1. The incentive-based interventions were applicable only to experiment EC (at CMU SV), whereas the feedback-based interventions were applicable to both experiments.

| Experiment: Description \ Phase | Baseline (No Intervention) | Incentives | Feedback | Feedback & Incentives |
| --- | --- | --- | --- | --- |
| EN: Phases conducted | ✓ | ✕ | ✓ | ✕ |
| EN: Phase notation | P1N | N/A | P3N | N/A |
| EN: Duration allocated | Five weeks | N/A | Four weeks | N/A |
| EC: Phases conducted | ✓ | ✓ | ✓ | ✓ |
| EC: Phase notation | P1C | P2C | P3C | P4C |
| EC: Duration allocated | Five weeks | Two weeks | Two weeks | Two weeks |

### 2.2 Variables

We discuss here the response variables and interventions in both experiments. The response variable was defined as the time-averaged power consumption of the subject/participant. Its value was computed from data reported by smart powerstrips enmetric (). The interventions employed are described in Table 2.1. The incentive interventions were administered as daily monetary rewards aimed at promoting energy conservation among the participants. The feedback intervention was administered through a web browser-based dashboard tool designed to raise awareness of the subject's plugload consumption. It is important to note the distinction between the feedback provided by the experimenters and the feedback received by the subjects, owing to differences in how much each subject used the dashboard. We recorded the time spent by each participant on their dashboard to quantify usage, and hence the feedback received.

### 2.3 Design principles and implementation

The experiment design aims to strengthen the causal connection between the interventions and the response. This is realized by mitigating the effect of nuisance factors via blocking & randomization montgomery2008design (). The design directly relates to the validity of the statistical assumptions during analysis. We adopted the following principles for experiment design:

1. Blocking: Owing to the nature of subject-to-subject variation induced by humans performing different tasks or possessing different energy preferences, blocking nuisance factors is of prime concern to avoid systematic biases. Therefore, we adopt a matched pairs design by regarding each subject to be their own control counterpart separated in time, thereby blocking potential subjectivity that could otherwise confound the analysis. This design criterion strengthens the causal connection between interventions and the corresponding responses.

2. Randomization: Subjects were recruited based only on their willingness to participate in the experiment, without introducing systematic sampling bias. This consideration allows us to assume random sampling from the underlying occupant population for purposes of statistical inference.

3. Replication: Sampling a subject randomly from the occupant population and randomly allocating them to the intervention does not guarantee that any effects observed are actually due to the intervention owing to variation by chance. An intervention is considered effective only if its effects are reproducible. Thus, multiple subjects are treated by each intervention to infer effectiveness. Further, the effects of the blocking factors can be accounted for by the difference between the baseline and intervened responses. Thus the daily averages of these differences provide replicates for analyses described in Section 3.

### 2.4 Feedback intervention design: Dashboard Application

A dashboard was designed to provide the subjects with information relevant to their plugload consumption. The elements of the dashboard were defined based on analytics that were previously found effective in motivating energy conservation among occupants in commercial buildings yun2013sustainability () yun2013toward () gulbinas2014effects (). These analytics were represented by easily comprehensible elements with minimal cognitive and visual load brath2004dashboard (). The back end of the dashboard was written in PHP and the front end was written in HTML, CSS and JavaScript. An image of the dashboard is shown in Figure 1.

Each feature of the dashboard is described in Section 2.4.1.

#### Features of the dashboard

1. Comfort feature (upper left): The comfort feature is represented by radio buttons that allow participants to report their comfort levels to the building facilities. The options represented an ASHRAE 7-point scale de2002thermal (). This feature motivated participant engagement with the dashboard based on their historical interest in communicating their comfort levels. Thus, the participants would engage with the dashboard that also contained power-related features.

2. Individual power feature (center): The instantaneous power consumed by the individual participant is indicated by the needle in the dial. Similar visualizations were found effective for energy reduction in households petkov2011motivating (). The dial's needle was set to saturate beyond the dial's maximum reading. The dial was calibrated using data collected during the baseline phase: the average baseline power usage was computed by considering only data points above 5 W, and this average was chosen as the maximum reading of the dial for the participant under consideration. The 5 W threshold was chosen to prevent a participant's inactivity from lowering the average value. The calibration also provided the color-coded context within which the current usage was positioned.
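The calibration rule above can be sketched as follows; the function name and sample trace are illustrative assumptions, not part of the dashboard implementation:

```python
import statistics

# Sketch of the dial calibration described above: the baseline average is
# computed only from samples above the 5 W activity threshold, so idle
# periods do not drag the dial's maximum reading down.
ACTIVITY_THRESHOLD_W = 5.0

def dial_maximum(baseline_samples_w):
    """Return the average of active baseline power samples (> 5 W)."""
    active = [p for p in baseline_samples_w if p > ACTIVITY_THRESHOLD_W]
    if not active:
        return 0.0
    return statistics.mean(active)

samples = [0.0, 0.4, 62.0, 58.0, 60.0, 0.2]  # illustrative watt readings
print(dial_maximum(samples))  # idle readings are excluded from the average
```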

3. Scoreboard feature (upper right): The scoreboard feature provides participants with the score and relative position in the participant pool. When an incentive is provided, the participant with the highest score (rank 1) is declared the winner of the day. The scoring mechanism is designed to measure the improvement of the participant compared to his/her baseline, and is described in Section 2.4.2.

4. Serial power feature (lower left): The power series of an individual (in orange) relative to the pool (in green) is depicted by line charts in the serial power feature. Such social comparisons have proven successful in motivating energy reduction among participants allcott2011social () ayres2012evidence (). The vertical axis depicting power usage was scaled based on the individual and pool values during the time the dashboard window was active in the corresponding session.

5. Channel split feature (lower right): The instantaneous power consumed via each individual channel in the powerstrip is represented here by bar charts. While other features represent the participant's cumulative consumption across channels, these bars provide actionable feedback by corresponding to the device plugged into each channel.

6. Notification feature (top right): A notification feature was provided in the dashboard to notify winners, if applicable.

#### Score computation

The scoreboard described above represents the participant’s score along with the relative position in the competition against other participants to win the incentive. The steps involved in the scoring mechanism are described below:

1. The time-averaged power consumption across each powerstrip channel was computed for the baseline phase by excluding data points below an inactivity threshold (5 W). The threshold served as a measure of inactivity.

2. The channel-specific averages computed above were aggregated over all the channels assigned to a participant to obtain the average active baseline power consumption of a participant.

3. The above steps were repeated across all participants to obtain baselines for the score computation described below.

4. During each day of the incentive competition, each participant's average active power consumption was determined similarly to the baseline. The only procedural difference is that the average power during the experiment was computed using data from local midnight until the scoring instant, whereas the baseline computation used data from midnight to midnight.

5. The participant score was computed as the percentage improvement during the experiment relative to his/her baseline. That is, score = 100 × (μb − μe)/μb, where μb and μe represent the participant's baseline average power (step 2) and the experiment-day average power (step 4), respectively.
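The scoring steps can be sketched as below, assuming (consistent with "percentage improvement") that the score is 100 × (baseline − experiment)/baseline; the function names and sample data are illustrative:

```python
# Sketch of the scoring mechanism: per-channel active averages (steps 1-2)
# feed a percentage-improvement score against the participant's own baseline.
THRESHOLD_W = 5.0  # inactivity threshold from step 1

def active_mean(samples_w, threshold=THRESHOLD_W):
    """Average of samples above the inactivity threshold."""
    active = [p for p in samples_w if p > threshold]
    return sum(active) / len(active) if active else 0.0

def participant_baseline(channel_samples):
    """Steps 1-2: per-channel active averages, aggregated over channels."""
    return sum(active_mean(ch) for ch in channel_samples)

def score(baseline_w, today_w):
    """Step 5: percentage improvement over the participant's own baseline."""
    return 100.0 * (baseline_w - today_w) / baseline_w

base = participant_baseline([[60.0, 58.0, 2.0], [20.0, 22.0]])  # two channels
print(score(base, today_w=64.0))
```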

#### Inactivity detection

The inactivity threshold (5 W) mentioned above was unknown to the participants, to ensure that no participant was declared a winner due to inactivity or absence. The participants were informed that the scoring mechanism rewards only reductions in power consumption via active behavioral changes, as opposed to passive changes such as turning off devices, or being inactive or absent. Despite these measures, it was still possible for a participant to win through constant low activity, such as leaving a computer monitor on while turning off all other devices. To handle such cases, a metric based on sliding time windows was used to detect inactivity. In this manner, the scoring algorithm was designed to guard against winning strategies driven by inactivity.
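A sliding-window inactivity check along the lines described above might look as follows; the window length and spread tolerance are illustrative assumptions, as the paper does not specify the metric's parameters:

```python
# Sketch of the sliding-window inactivity check: a participant whose
# consumption stays nearly constant over a window (e.g. a monitor left on
# with everything else off) is flagged as inactive for that window.
def inactive_windows(samples_w, window=6, max_spread_w=1.0):
    """Return start indices of windows whose power varies < max_spread_w."""
    flagged = []
    for start in range(len(samples_w) - window + 1):
        win = samples_w[start:start + window]
        if max(win) - min(win) < max_spread_w:
            flagged.append(start)
    return flagged

trace = [35.2, 35.1, 35.3, 35.2, 35.1, 35.2, 60.0, 12.0]
print(inactive_windows(trace))  # the flat leading stretch is flagged
```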

### 2.5 Incentive intervention design

For the experiment phases involving incentives (P2C and P4C), a fixed monetary value was announced at the beginning of each workday, for which participants competed by changing their energy behavior relative to their respective baselines. The incentive values ranged between $5 and $50 in multiples of $5 over a duration of ten working days, or two weeks. Random ordering of the incentive values ensured that no systematic bias was introduced during experimentation.
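The randomized schedule can be sketched as follows: the ten incentive values ($5 to $50 in $5 steps) are shuffled across the ten working days so that reward size is not tied to a particular day; the seed parameter is an illustrative assumption:

```python
import random

# Sketch of the randomized incentive schedule: ten daily rewards assigned
# to ten working days in random order.
def incentive_schedule(seed=None):
    values = list(range(5, 55, 5))  # $5, $10, ..., $50
    rng = random.Random(seed)
    rng.shuffle(values)
    return values

schedule = incentive_schedule(seed=42)
print(schedule)  # one random ordering of the ten incentive values
```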

### 2.6 Data collection

The power consumption of devices associated with each subject was monitored in real time by smart powerstrips from Enmetric Systems enmetric (). The monetary values associated with incentive inputs were recorded on a daily basis. As noted in Section 2.2, the amount of feedback received by each participant was quantified by the time spent by the participant on his/her dashboard. This screentime was recorded by software running alongside the dashboard application.

### 2.7 Execution of the experiment

We describe the setup and implementation details for executing the experiment below.

#### Experiment setup

With the proposed design and the required permissions in place for experiments EN and EC, participants were recruited. At NASA SB, sixteen full-time employees were recruited for experiment EN. At CMU SV (buildings 19 and 23), a mix of faculty, staff, and students, sixteen in total, were recruited for experiment EC. Smart powerports were installed in each participant's workspace to collect data during the baseline and experiment phases.

#### Experiment EN: NASA Sustainability Base

This experiment was conducted in two phases: a baseline phase and a feedback intervention phase. The baseline phase (P1N) was conducted for a period of five weeks, from 12 Sep 2016 to 17 Oct 2016. During this time no interventions were administered. Thereafter, the feedback intervention phase (P3N) was conducted, during which the participants were provided with the dashboard-enabled feedback described in Section 2.4.1. The participants were given a relevant explanation as shown in Figure 2(a). This phase was conducted for four weeks, from 18 Oct 2016 to 11 Nov 2016.

#### Experiment EC: CMU SV - buildings 19 and 23

Experiment EC was conducted in four phases. The first phase was the baseline phase (P1C), during which no intervention was administered. This phase was conducted for five weeks, from 12 Sep 2016 to 17 Oct 2016. The second phase was the incentive-only phase (P2C), wherein monetary incentives were provided for participants to compete for. The individual with the highest score at the end of the day was declared the winner. The participants were also given access to dashboards containing only the scoreboard, which showed their near real-time scores. An explanation of the relevant elements received by the participants during this phase is shown in Figure 2(c). The incentive-only phase was conducted for two weeks, from 18 Oct 2016 to 30 Oct 2016.

The third phase was the dashboard feedback-only phase (P3C), during which each participant was provided with a dashboard depicting comparisons relative to their corresponding baseline and to the participant pool. All the dashboard features described in Section 2.4.1 except the scoreboard were provided to the participants. An explanation of the features shown in Figure 2(b) was provided to the participants. This phase was conducted for two weeks, from 31 Oct 2016 to 13 Nov 2016. Finally, the combined incentive and feedback phase (P4C) was conducted, during which the participants were provided with both the incentive and the dashboard feedback. All the features of the dashboard were made available to the participants during this phase. The participants were provided with explanations of each feature as shown in Figure 2(a). This phase was conducted for two weeks, from 14 Nov 2016 to 25 Nov 2016.

#### Energy conservation information

At the beginning of every experiment phase, namely P3N, P2C, P3C, and P4C, the participants were provided with information on possible actions to reduce plugload consumption, as shown in Figure 2(d). In this manner, the experimenters ensured that any absence of behavioral changes during the intervention phases could not be attributed to a lack of information. These instructions were compiled after surveying and classifying the devices used by each participant. The devices associated with experiments EN and EC are listed in Table 1.

## 3 Statistical Analysis and Modeling

The results from experiments EN and EC were analyzed in light of the designs described in Section 2. This analysis involved performing hypothesis tests, estimating confidence intervals, and developing statistical models from the data. In what follows, we describe the statistical analysis and associated results for both experiments.

### 3.1 Representing temporal context

Given that the baseline and experiment phases of EN and EC were each conducted over several days, we consider a daily temporal context for analysis. Let an individual day within a phase be denoted d. Within any such day d, let t1 and t2 represent time instants in seconds, so that a time interval can be represented by [t1, t2], where t2 − t1 denotes the number of seconds elapsed from t1 until t2. In addition, let w represent the day of the week, ranging from Monday through Sunday.

### 3.2 Data Analysis for experiment EN at NASA SB

During the baseline phase P1N, let the power consumption of the i-th participant on day d at time instant t be denoted by p_i(d, t), and let the time-averaged power consumption during [t1, t2] be denoted by x_i(d, [t1, t2]). Similarly, let the corresponding instantaneous and time-averaged power consumption during the feedback intervention phase P3N be denoted by q_i(d, t) and y_i(d, [t1, t2]), respectively. Further, let this participant's screentime during a time interval [t1, t2] on day d be denoted by a_i(d, [t1, t2]). Let the random sample of the baseline response of the i-th participant during the time interval [t1, t2] on day d be denoted by the random variable X_i(d, [t1, t2]), whose realization corresponds to x_i(d, [t1, t2]). Similarly, let the random samples associated with the response and screentime input during the feedback phase be represented by Y_i(d, [t1, t2]) and A_i(d, [t1, t2]), respectively.

#### Statistical assumptions

Given the baseline and experiment conditions, the baseline and feedback response variables have finite mean and variance across a time interval sample on any given day. For hypothesis testing and interval estimation, we consider the sample constituted by the differences between the daily-averaged baseline response and the corresponding daily-averaged experiment response, sampled across participants and days of the week. Let this averaged response differential for a participant during a day be the participant's baseline daily average minus their experiment daily average. The response differential across participants and days of the week mitigates participant-oriented and weekly nuisance factors, respectively. Given these matched pairs, statistical testing allows us to attribute any significant changes between the baseline and experiment responses to the intervention administered rather than to nuisance factors such as differences in individual energy needs or workloads.
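The matched-pairs differential described above can be sketched as follows; the dictionary layout and participant/weekday keys are illustrative assumptions, not the authors' data format:

```python
# Sketch of constructing the matched-pairs differential sample: each
# participant's experiment-phase daily average is paired with their own
# baseline average for the same day of the week, and the difference
# (baseline minus experiment) forms one replicate.
def differential_sample(baseline_daily, experiment_daily):
    """baseline_daily / experiment_daily: {participant: {weekday: avg_w}}."""
    diffs = []
    for pid, exp_days in experiment_daily.items():
        for weekday, exp_avg in exp_days.items():
            base_avg = baseline_daily[pid][weekday]
            diffs.append(base_avg - exp_avg)  # positive => savings
    return diffs

baseline = {"p1": {"Mon": 55.0, "Tue": 50.0}}
experiment = {"p1": {"Mon": 48.0, "Tue": 49.0}}
print(differential_sample(baseline, experiment))  # [7.0, 1.0]
```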

#### Hypothesis testing and confidence interval estimation

We employ a paired difference test to examine the differential population, sampled across participants and days of the week. Given the matched pairs of daily-averaged baseline and experiment responses, the paired difference t-test checks whether the mean of the differential sample is significantly different from zero. The null and alternative hypotheses are presented below:

1. H0: the differential sample is drawn from a population with zero mean

2. H1: the differential sample is drawn from a population with non-zero mean

The mean baseline consumption was found to be 51.51 W and the mean feedback-intervened consumption 48.86 W. The t-statistic was t(86) = 3.64, with a corresponding p-value below the 0.05 significance level. Therefore, the evidence against the null hypothesis is statistically significant, and we conclude that the power usage during the experiment phase differed significantly from the power usage during the baseline phase. The corresponding 95% confidence interval for the mean of the differential sample was [2.22, 7.57] W, or equivalently [4.32, 14.71]%. Since the difference (baseline minus experiment) is positive, the mean power consumption during the feedback phase P3N is statistically significantly less than that of the baseline phase P1N. The statistical summary of both the experiment and baseline phase energy consumption (kWh) is shown in Figure 3(c).
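The paired-difference test above can be sketched with a minimal stdlib implementation; the sample values are synthetic, and the 1.96 normal quantile stands in for the exact t quantile:

```python
import math
import statistics

# Sketch of the paired-difference t-test: the t-statistic for the mean of
# the daily differentials against zero, plus an approximate 95% confidence
# interval (1.96 is the large-sample normal approximation to the t quantile).
def paired_t(diffs):
    n = len(diffs)
    mean = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean
    t_stat = mean / se
    ci = (mean - 1.96 * se, mean + 1.96 * se)
    return t_stat, ci

diffs = [2.0, 5.0, 7.0, 3.0, 6.0, 4.0, 8.0, 1.0]  # synthetic differentials, W
t_stat, ci = paired_t(diffs)
print(round(t_stat, 2), [round(x, 2) for x in ci])
```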

#### Regression-based modeling

Given the statistical significance, it is of interest to predict the experiment power consumption with a model. To model the hourly power consumption of an average participant, we employ an autoregressive model with an exogenous input consisting of the average screentime associated with the dashboard during the past hour. Let d denote a day in the experiment dataset and its corresponding day in the baseline dataset, and let h represent the hour of a day. The experiment and baseline hourly consumption of the i-th participant during hour h of day d can then be written as y_i(d, h) and x_i(d, h), respectively. A bar denotes a sample statistic (an average across participants) used in place of the participant index i. Instead of explicitly modeling the experiment hourly consumption of an average participant, we model the difference between the averaged baseline and experiment responses. The paired difference mitigates subjective variation in individual energy consumption on account of varying preferences or workloads, thereby allowing better statistical prediction. Let this mean differential response be represented by ΔμN(d, h). The model can now be written as:

 ΔμN(d,h) = αN + βN ΔμN(d,h−1) + γN āP3N(d,h−1) + ϵh (1)

The introduction of the lagged variable is instrumental in weakening the residual serial correlation, thus mitigating systematic factors in the error process, as depicted in Figure 3(a). The figure shows the impact of adding time-lagged dependent variables on the serial correlation of the residuals. It is evident that the first-order lag significantly reduces the correlation, and the introduction of further lags does not reduce it further. From an experiment perspective, the time-lagged dependent variable enables us to account for changes between experiment and baseline conditions. For example, any change in workload between the baseline and the experiment conditions can be captured by the time-lagged dependent term in the model. This strengthens the assumption that the residuals corresponding to consecutive hours result from random factors and hence are uncorrelated given the inputs. For purposes of training and testing, the dataset is partitioned into training and test subsets. Given the model structure in Equation 1, the parameters are estimated by Ordinary Least Squares (OLS); the estimated values are provided in Table 2. The performance of the estimated model is evaluated on the test dataset, and the results are shown in Figure 4(a). The figure represents hourly power consumption during the experiment alongside the average and interval predictions. The root mean square error on the test set was found to be 3.53 W, and the corresponding mean prediction interval is shown in Figure 4(a).
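A minimal sketch of estimating Equation 1 by OLS on synthetic data follows; the simulated coefficients, noise level, and screentime range are illustrative assumptions, not the experiment's estimates (those are in Table 2):

```python
import numpy as np

# Sketch of fitting the ARX model of Eq. (1) by ordinary least squares:
# the differential response at hour h is regressed on its own lag and the
# previous hour's average dashboard screentime. Data below are synthetic.
rng = np.random.default_rng(0)
hours = 200
screentime = rng.uniform(0, 300, size=hours)  # seconds on dashboard per hour
delta = np.zeros(hours)
for h in range(1, hours):  # simulate the assumed process
    delta[h] = 1.0 + 0.6 * delta[h - 1] + 0.01 * screentime[h - 1] \
               + rng.normal(scale=0.5)

# Design matrix: intercept, lagged response, lagged screentime.
X = np.column_stack([np.ones(hours - 1), delta[:-1], screentime[:-1]])
y = delta[1:]
alpha, beta, gamma = np.linalg.lstsq(X, y, rcond=None)[0]
print(round(alpha, 2), round(beta, 2), round(gamma, 4))
```

With enough hours of data, the OLS estimates recover the simulated coefficients, illustrating why the lagged term and screentime input can be identified jointly.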

#### Discussion

The findings from experiment EN indicate that the feedback intervention reduced the average hourly plugload consumption, with a 95% confidence interval of [2.22, 7.57] W, or [4.32, 14.71]%. These estimates may be considered conservative, as they are based on data from active participants, whose activity threshold was set to 2.5 W per channel. That is, any channel consumption below the threshold was not considered in the analysis. This threshold was fixed based on the device profiles in Table 1. In this manner, the estimated savings are due only to reducing power consumption through active behavioral changes and not to turning off devices. In the regression model, we note that the residual standard deviation (3.52 W) on the training set is close to the root mean square (RMS) residual (3.53 W) on the test set, indicating similar model performance on the training and test sets. Further, the test set residuals are shown in Figure 3(g). The residual behavior does not suggest heteroscedasticity. To the extent the underlying errors are uncorrelated and homoscedastic, the OLS estimator can be regarded as unbiased with least variance, per the Gauss-Markov theorem. The prediction error can be a product of one or more factors related to modeling, estimation, and the process of observation shmueli2010explain (). The observed significant reduction in plugload consumption could be the result of behavioral changes induced either by the dashboards or by cognitive factors such as the Hawthorne effect.

### 3.3 Data Analysis for experiment EC at CMU SV

The analysis here helps determine the efficacy of feedback and/or incentives in experiment EC. The experiment was conducted in four phases: a baseline phase and three experiment phases. The experiment phases P2C, P3C, and P4C consist of interventions in the form of incentives, feedback, and both incentives and feedback, respectively. Similar to experiment EN, let the instantaneous power consumption of the i-th participant on day d at instant t during a given phase be denoted by p_i(d, t), and let the average power consumption during [t1, t2] be denoted by x_i(d, [t1, t2]). Also, let the incentive and feedback (screentime) inputs during the time interval for the respective phases be denoted by u_i(d, [t1, t2]) and a_i(d, [t1, t2]), respectively. For inference, the observations are regarded as realizations of a random sample from the occupant population. Similar to experiment EN, uppercase letters denote the random variables corresponding to the responses during each of the phases P1C, P2C, P3C, and P4C, and to the interventions during the three experiment phases P2C, P3C, and P4C.

#### Statistical assumptions

Given the above random input and response samples, we note that the mean and variance of the respective samples exist and are finite during each applicable phase of the experiment. For inference, similar to experiment EN, we consider the sample constituted by the differences between the daily-averaged baseline response and the corresponding daily-averaged experiment response, sampled across participants and days of the week. Let this averaged response differential for a participant during a day of a given experiment phase (P2C, P3C, or P4C) be the corresponding baseline daily average minus the experiment daily average. Given the matched-pairs experiment design, similar to that of EN, any inferences from the differential population can be attributed to the intervention(s) administered during that phase rather than to nuisance factors related to either subjectivity or the specific day of the week. In other words, the causal connection between the experiment response and the intervention(s) is strengthened.

#### Hypothesis testing and confidence interval estimation

Given the assumptions about the population of differential responses, we use a paired difference t-test to draw inferences about the underlying population. The hypothesis tests and confidence interval estimation are performed on the mean of the differential response.

3.3.2.1 Inference from the incentive experiment phase (P2C) at CMU SV. To test the efficacy of the incentive intervention, the null and alternate hypotheses for the paired t-test are presented below:

1. H0: the differential sample is drawn from a population with zero mean

2. H1: the differential sample is drawn from a population with non-zero mean

The mean baseline consumption and the mean incentive-phase consumption were found to be 61.09 W and 53.91 W, respectively. The t-statistic was found to be , with a corresponding p-value of . The evidence against the null hypothesis was therefore not statistically significant at . Thus, the mean power consumption during the incentive phase was not found to be statistically different from the baseline consumption. The corresponding confidence interval for the mean differential response was found to be W, or equivalently . The statistical summaries of the baseline and incentive phases are shown in Figure 3(d).

##### 3.3.2.2 Inference from feedback experiment phase at CMU SV

Following the test procedure in 3.3.2.1, the null and alternate hypotheses for testing the feedback intervention are presented below:

1. : is sampled from a population with zero mean

2. : is sampled from a population with non-zero mean

The mean baseline consumption and the mean feedback-phase consumption were found to be 61.09 W and 49.27 W, respectively. The t-statistic was found to be , with a corresponding p-value of . This indicates that the mean reduction with respect to the baseline is statistically significant at . The corresponding confidence interval for the mean differential response was found to be W, or equivalently . The statistical summaries of the baseline and feedback phases are shown in Figure 3(e).

##### 3.3.2.3 Inference from incentive and feedback experiment phase at CMU SV

In the presence of both the incentive and dashboard feedback interventions, the null and alternate hypotheses are presented below:

1. : is sampled from a population with zero mean

2. : is sampled from a population with non-zero mean

The mean baseline consumption and the mean combined-phase consumption were found to be 61.09 W and 50.33 W, respectively. The t-statistic was found to be , with a corresponding p-value of . This indicates that the mean reduction with respect to the baseline is statistically significant at . The corresponding confidence interval for the mean differential response was found to be W, or equivalently . The statistical summaries of the baseline and combined phases are shown in Figure 3(f).

#### Regression-based modeling

Given the interval estimates, we are interested in a predictive model similar to the one in Section 3.2.3, and we employ similar notation here. In the case of experiment , dashboard feedback was the only intervention used, and hence screentime was the only exogenous variable considered. In the present case, however, each experiment phase consists of an incentive intervention (), a dashboard feedback intervention (), or both (). For modeling purposes, we note that each observation in phase can have a non-negative value for each of the intervention variables and , thereby accommodating both exogenous inputs in the model structure simultaneously. Let the mean hourly power consumption during the baseline and the experiment be denoted by and , respectively. We then model the mean differential response, denoted by , as a linear first-order autoregressive AR(1) model with screentime and incentive as the exogenous inputs:

\[
\Delta\mu_{C}(d,h) = \alpha_{C} + \beta_{C}\,\Delta\mu_{C}(d,h-1) + \gamma_{C}\,x_{a}^{P_{e}^{Ci}}(d,h-1) + \delta_{C}\,x_{i}^{P_{e}^{Ci}}(d,h) + \epsilon_{h} \tag{2}
\]

The introduction of the lagged dependent term is instrumental in weakening the residual serial correlation. Figure 3(b) depicts the relationship between the number of added lags and the residual correlation. It is evident that additional lags do not add systematic information about the predicted variable and hence do not contribute significantly toward weakening the residual serial correlation. From an experiment standpoint, these lags capture the change in experiment conditions relative to the baseline conditions, thereby strengthening the assumption that the residuals corresponding to consecutive hours are uncorrelated given the model inputs. The dataset is partitioned such that of the data is used for training and for testing. Given the training set, the parameters are estimated by ordinary least squares (OLS), and the corresponding estimates are listed in Table 3. The performance of the model on the test dataset is shown in Figure 4(b); the average accuracy was found to be . The RMS error on the test set was found to be W, and the corresponding prediction interval was found to be W.
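The fitting procedure above can be sketched on synthetic data as follows. The series length, coefficient values, and the 80/20 chronological split are illustrative assumptions; the design matrix mirrors the structure of Eq. (2) with an intercept, the lagged response, lagged screentime, and the current incentive input:

```python
import numpy as np

# Generate a synthetic AR(1) response with two exogenous inputs.
# All coefficients below are illustrative, not the study's estimates.
rng = np.random.default_rng(2)
T = 200
screentime = rng.uniform(0, 10, T)          # x_a: dashboard screentime
incentive = rng.uniform(0, 5, T)            # x_i: incentive input
y = np.zeros(T)                             # Δμ: mean differential response
for h in range(1, T):
    y[h] = (-1.0 + 0.5 * y[h - 1] - 0.3 * screentime[h - 1]
            - 0.2 * incentive[h] + 0.5 * rng.standard_normal())

# Design matrix: intercept, lagged response, lagged screentime, incentive
X = np.column_stack([np.ones(T - 1), y[:-1], screentime[:-1], incentive[1:]])
target = y[1:]

# Chronological 80/20 train/test split, then OLS on the training portion
split = int(0.8 * (T - 1))
coef, *_ = np.linalg.lstsq(X[:split], target[:split], rcond=None)
pred = X[split:] @ coef
rmse = np.sqrt(np.mean((target[split:] - pred) ** 2))
print(coef, rmse)
```

Keeping the split chronological (rather than random) respects the time ordering that the lagged term depends on, which is why the test block is taken from the end of the series.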

#### Discussion

The findings from experiment suggest that employing feedback and/or incentive interventions can reduce plugload consumption. In particular, the incentive, the dashboard, and their combination resulted in mean reductions of % (), (), and (), respectively. It is noteworthy that the incentive intervention corresponds to a larger p-value and is hence less significant than the dashboard or the combined intervention. A possible explanation lies in the order of the interventions: the first experiment phase consisted of the incentive, and the later phases and consisted of the feedback and the combination, respectively. The growing significance across the phases indicates an effect of time on plugload consumption behavior, which is consistent with the finding that behavioral changes require adaptation time to become habits lally2010habits (). These findings suggest the need to allow for an adaptation or settle-in time in the experiment design. Further, to mitigate retention effects from one phase to the next, a sufficient washout period is required. While these considerations increase the duration of the experiment, they offer a framework to systematically study the exclusive effects of interventions on occupant energy consumption behavior. In the regression model, the mean prediction accuracy (RMS) was found to be . The residual variation with respect to the predicted values is shown in Figure 3(h), where the residual behavior can be observed to be homoscedastic. Together with the lack of serial correlation described in Section 3.3.3, this strengthens the conditions of the Gauss-Markov theorem; thus, the OLS estimator in 3.3.3 can be regarded as the Best Linear Unbiased Estimator (BLUE). While the study offers evidence for plugload reduction, the reduction could be due to either behavioral changes or cognitive factors such as the Hawthorne effect.
Similar to the analysis of experiment , a threshold of 2.5 W per channel was considered to disallow inactivity as a means to energy reduction.
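The activity threshold described above can be sketched as a simple per-channel filter; the readings below are fabricated example values, and only the 2.5 W threshold itself comes from the text:

```python
import numpy as np

# Per-channel power readings (W) below a 2.5 W threshold are treated as
# inactivity and excluded, so that simply leaving devices off or unplugged
# does not count as an energy "saving" in the analysis.
THRESHOLD_W = 2.5
readings = np.array([0.0, 1.2, 3.4, 45.0, 2.4, 60.5])   # one channel, W
active = readings[readings >= THRESHOLD_W]
print(active)   # → [ 3.4 45.  60.5]
```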

## 4 Conclusion

This work presents the findings from experiments conducted in office and university buildings within the NASA Ames Research Center. The experiments employed a matched-pairs design to enable a strong causal connection between plugload consumption and the interventions. During different phases of the experiments, interventions in the form of monetary incentives and/or dashboard feedback were used. The incentives were provided in a randomized order, and the dashboard was designed with regard to occupant engagement and plugload consumption awareness. The experiment in the office environment was conducted at the NASA Sustainability Base in the presence of dashboard feedback; the average plugload reduction was observed to be (), and the regression model test RMS accuracy was found to be . The experiment in the university environment was conducted at the CMU Silicon Valley campus in the presence of incentives and/or dashboard feedback; the average plugload reduction in the presence of incentives, dashboard feedback, and their combination was observed to be % (), (), and (), respectively, and the regression model test RMS accuracy was found to be . Findings from both experiments indicate that the feedback intervention is effective in both university and office environments, with estimated mean reductions of and , respectively. The proposed models facilitate the integration of occupant plugload consumption into the demand response paradigm. Future studies can investigate stronger experiment designs with larger sample sizes, additional cues for improved prediction accuracy, and the generalizability of the presented findings.

## Acknowledgments

The authors thank the NASA Ames Research Center and Carnegie Mellon University (CMU) for supporting this research under the cooperative agreement NNX13AD49A. They acknowledge the support from NASA and CMU Institutional Review Boards (IRBs) toward experiment approvals.

### Footnotes

1. Chaitanya Poolla is with Intel Corporation. This work was done when he was with Carnegie Mellon University (SV).
2. Abraham K. Ishihara is with KBR. This work was done when he was with Carnegie Mellon University (SV).
3. Dan Liddell contributed to this work when he was with Carnegie Mellon University (SV).
4. Rodney Martin is with NASA Ames Research Center.
5. Steven Rosenberg is with Carnegie Mellon University (SV).
6. Similarly, the intervention variable in Section 3.2 is simplified here into such that represents the hour .
7. It may be noted that the leaderboard feature of the dashboard was made visible to the participants during the incentive phase to monitor their position to obtain the incentive. Hence, the screentime was also applicable during the incentive-only phase.

## References

1. US EIA, Electric power annual 2016, https://www.eia.gov/electricity/annual/pdf/epa.pdf (2016).
2. National Renewable Energy Laboratory, Assessing and Reducing Plug and Process Loads in Office Buildings, US Department of Energy.
3. K. McKenney, M. Guernsey, R. Ponoum, J. Rosenfeld, Commercial miscellaneous electric loads: Energy consumption characterization and savings potential in 2008 by building type, TIAX LLC, Lexington, MA, Tech. Rep.
4. A survey of control technologies in the building automation industry.
5. K. F. Fong, V. I. Hanby, T.-T. Chow, HVAC system optimization for energy management by evolutionary programming, Energy and Buildings 38 (3) (2006) 220–231.
6. Y.-J. Wen, A. M. Agogino, Personalized dynamic design of networked lighting for energy-efficiency in open-plan offices, Energy and Buildings 43 (8) (2011) 1919–1924.
7. Y.-J. Wen, A. M. Agogino, Wireless networked lighting systems for optimizing energy savings and user satisfaction, in: Wireless Hive Networks Conference, 2008. WHNC 2008. IEEE, IEEE, 2008, pp. 1–7.
8. C. Mercier, L. Moorefield, Commercial Office Plug Load Savings and Assessment: Executive Summary, California Energy Commission.
9. B. Acker, C. Duarte, K. Van Den Wymelenberg, Office space plug load profiles and energy saving interventions, Proc. of the 2012 ACEEE Summer Study on Energy Efficiency in Buildings, Pacific Grove, CA.
10. D. Kaneda, B. Jacobson, P. Rumsey, R. Engineers, Plug load reduction: The next big hurdle for net zero energy building design, in: ACEEE Summer Study on Energy Efficiency in Buildings, 2010, pp. 120–130.
11. J. Zhao, B. Lasternas, K. P. Lam, R. Yun, V. Loftness, Occupant behavior and schedule modeling for building energy simulation through office appliance power consumption data mining, Energy and Buildings 82 (2014) 341 – 355.
12. M. Deru, et al., US Department of Energy commercial reference building models of the national building stock, Tech. rep., National Renewable Energy Laboratory (2011).
13. C. M. Clevenger, J. Haymaker, The impact of the building occupant on energy modeling simulations, in: Joint Intl. Conf. on Computing and Decision Making in Civil and Building Engineering, Montreal, Canada, 2006.
14. R. K. Jain, J. E. Taylor, G. Peschiera, Assessing eco-feedback interface usage and design to drive energy efficiency in buildings, Energy and buildings 48 (2012) 8–17.
15. R. Gulbinas, et al., Network ecoinformatics: Development of a social ecofeedback system to drive energy efficiency in residential buildings, Journal of Computing in Civil Engineering 28 (1) (2013) 89–98.
16. J. E. Petersen, et al., Dormitory residents reduce electricity consumption when exposed to real-time visual feedback and incentives, International Journal of Sustainability in Higher Education 8 (1) (2007) 16–33.
17. R. Yun, P. Scupelli, A. Aziz, V. Loftness, Sustainability in the workplace: nine intervention techniques for behavior change, in: Persuasive Technology, Springer, 2013, pp. 253–265.
18. G. Fuertes, S. Schiavon, Plug load energy analysis: The role of plug loads in LEED certification and energy modeling, Energy and Buildings 76 (2014) 328 – 335.
19. S. Wang, A. Kim, E. Johnson, Understanding the deterministic and probabilistic business cases for occupant based plug load management strategies in commercial office buildings, Applied Energy 191 (2017) 398 – 413.
20. P. Gandhi, G. S. Brager, Commercial office plug load energy consumption trends and the role of occupant behavior, Energy and Buildings 125 (2016) 1 – 8.
21. NASA, NASA Sustainability base, http://www.nasa.gov/ames/facilities/sustainabilitybase.
22. C. Poolla, A. K. Ishihara, R. Milito, Designing near-optimal policies for energy management in a stochastic environment, Applied Energy 242 (2019) 1725–1737.
23. Enmetric Plug Load Management System, https://www.enmetric.com/platform, accessed: 2016-10-29.
24. D. C. Montgomery, Design and analysis of experiments, John Wiley & Sons, 2008.
25. R. Yun, et al., Toward the design of a dashboard to promote environmentally sustainable behavior among office workers, in: Persuasive Technology, Springer, 2013, pp. 246–252.
26. R. Gulbinas, J. E. Taylor, Effects of real-time eco-feedback and organizational network dynamics on energy efficient behavior in commercial buildings, Energy and buildings 84 (2014) 493–500.
27. R. Brath, M. Peters, Dashboard design: Why design is important, DM Direct.
28. R. de Dear, et al., Thermal comfort in naturally ventilated buildings: revisions to ASHRAE Standard 55, Energy and Buildings.
29. P. Petkov, F. Köbler, M. Foth, H. Krcmar, Motivating domestic energy conservation through comparative, community-based feedback in mobile and social media, in: Proc. 5th Intl. Conf. on Communities and Technologies, ACM, 2011.
30. H. Allcott, Social norms and energy conservation, Journal of Public Economics 95 (9) (2011) 1082–1095.
31. I. Ayres, S. Raseman, A. Shih, Evidence from two large field experiments that peer comparison feedback can reduce residential energy usage, Journal of Law, Economics, and Organization (2012) ews020.
32. G. Shmueli, et al., To explain or to predict?, Statistical science 25 (3) (2010) 289–310.
33. P. Lally, C. H. Van Jaarsveld, H. W. Potts, J. Wardle, How are habits formed: Modelling habit formation in the real world, European journal of social psychology 40 (6) (2010) 998–1009.