Modeling Maxwell’s demon with a microcanonical Szilard engine
Abstract
Following recent work by Marathe and Parrondo [PRL, 104, 245704 (2010)], we construct a classical Hamiltonian system whose energy is reduced during the adiabatic cycling of external parameters, when initial conditions are sampled microcanonically. Combining our system with a device that measures its energy, we propose a cyclic procedure during which energy is extracted from a heat bath and converted to work, in apparent violation of the second law of thermodynamics. This paradox is resolved by deriving an explicit relationship between the average work delivered during one cycle of operation, and the average information gained when measuring the system’s energy.
I Introduction
The Kelvin-Planck statement of the second law of thermodynamics asserts that no process is possible whose sole result is the extraction of energy from a heat bath, and the conversion of that energy into work Finn1993 . Because this statement is formulated in terms of energy rather than entropy, it provides an attractive starting point for exploring the microscopic foundations of the second law. This is particularly true when we consider an immediate corollary of the Kelvin-Planck statement: when a thermally isolated system, initially in equilibrium, evolves under a cyclic variation of external parameters, its internal energy cannot decrease. (Footnote 1: If its energy were to decrease, then at the end of the process the system could be returned to its initial state by equilibrating it with a heat bath at the initial temperature, resulting in the net conversion of heat to work.) Since an isolated system exchanges no heat with its surroundings, and is governed by familiar equations of motion – Hamiltonian dynamics in the classical case, or the Schrödinger equation for a nonrelativistic quantum system – relatively few theoretical tools are needed to embark on an investigation of this statement.
Let us formulate the problem as follows. A finite, classical system is described by a Hamiltonian $H(z;\lambda)$, where $z$ denotes a point in the system’s phase space and $\lambda$ is a set of externally controlled parameters. At time $t=0$ the system’s initial conditions $z_0$ are sampled from an equilibrium distribution $p^{\rm eq}(z_0)$, and then for $0 \le t \le \tau$ the system evolves under Hamilton’s equations as the parameters are made to trace out a closed loop in $\lambda$-space. We will use the notation $\lambda(t)$ to denote such a cyclic protocol for varying the parameters, beginning and ending at $\lambda(0) = \lambda(\tau) = \lambda_0$. The work performed on the system during this process is the net change in the value of the Hamiltonian,
$$W = H(z_\tau;\lambda_0) - H(z_0;\lambda_0), \tag{1}$$
where the trajectory $z_t$ describes the system’s evolution from $t=0$ to $t=\tau$. Since Hamiltonian dynamics are deterministic, the value of $W$ is fully determined by the initial conditions: $W = W(z_0)$. The Kelvin-Planck statement, viewed as a statistical prediction about averages, then implies the inequality,
$$\langle W \rangle \equiv \int dz_0 \, p^{\rm eq}(z_0) \, W(z_0) \ge 0. \tag{2}$$
We now ask, for what choices of the equilibrium distribution $p^{\rm eq}$ can this result be established rigorously?
When initial conditions are sampled from a canonical distribution
$$p^{\rm eq}(z) = \frac{1}{Z} \, e^{-\beta H(z;\lambda_0)}, \tag{3}$$
with $\beta = 1/k_B T$ and $Z$ the partition function, Eq. 2 follows directly from the properties of Hamilton’s equations Jarzynski1997a ; Allahverdyan2002 ; Campisi2008 . In fact, this result extends to any distribution of initial conditions that is a decreasing function of energy Allahverdyan2002 ; Campisi2008 . Somewhat surprisingly, however, Eq. 2 is not universally valid when initial conditions are sampled from a microcanonical distribution,
$$p^{\rm eq}(z) \propto \delta\big( E - H(z;\lambda_0) \big). \tag{4}$$
This has been discussed by Allahverdyan and Nieuwenhuizen Allahverdyan2002 , but to the best of our knowledge it was Sato Sato2002 who first constructed a counterexample, involving a perturbed, one-dimensional harmonic oscillator. For microcanonically sampled initial conditions, Sato described a cyclic variation of the Hamiltonian that results in a negative value of average work, $\langle W \rangle < 0$. More recently, Marathe and Parrondo Marathe2010 have developed another counterexample to Eq. 2, involving a particle inside a box with hard walls and an insertable barrier. For a given initial energy, Marathe and Parrondo describe a cyclic manipulation of the walls and the barrier, whose net effect is to reduce the energy of the system. Ultimately, the particle can be brought arbitrarily close to zero kinetic energy by a succession of such cycles, with a different protocol for each cycle.
Inspired by Ref. Marathe2010 , in the present paper we introduce and analyze another model system that violates Eq. 2. We consider a classical particle moving in a one-dimensional potential well, described by a pair of external parameters (see Eq. 5 and Fig. 1). We will discuss the design of protocols for varying these parameters cyclically with time, in a manner that lowers the energy of the system. In particular, for any choice of initial particle energy, we will construct a protocol (which depends on the value of that energy) that reduces the particle’s kinetic energy arbitrarily close to zero in a single cycle, bringing the system to a final state in which the particle sits nearly motionless at the bottom of the potential well. In effect, the system is cooled near to “absolute zero” temperature.
Our model, like those of Refs. Sato2002 ; Marathe2010 , suggests that a perpetual-motion device of the second kind could be constructed, operating by the following steps.
1. The system is brought into contact and allowed to equilibrate with a thermal reservoir at temperature $T$. The reservoir is then removed.
2. The energy of the now-isolated system is measured.
3. The system is subjected to a cyclic protocol that reduces its kinetic energy close to zero (as discussed above).
By repeatedly performing this sequence of steps, we obtain a scenario in which energy is systematically extracted from the reservoir (step 1) and delivered as work to the agent that carries out the cyclic protocol (step 3). This is reminiscent of Maxwell’s demon Maxwell1871 ; Leff2003 ; Maruyama2009 , only here the demon’s role is to implement a cyclic protocol based on the measured energy of the system, instead of opening or closing a trapdoor based on the observed motion of nearby particles. The key to exorcising the demon – that is, to reconciling this scenario with the second law of thermodynamics – is to recognize that the repeated measurements of energy in step 2 result in the accumulation of information. In order for the device to satisfy the “sole result” stipulation of the Kelvin-Planck statement (see above), this information must eventually be erased. As famously discussed by Landauer Landauer1961 , and by Bennett Bennett1982 in the context of Szilard’s engine Szilard1929 – another incarnation of Maxwell’s demon – the erasure of information carries an unavoidable thermodynamic cost of $k_B T \ln 2$ per bit. We will show by explicit calculation that this cost ultimately wipes out any gains made by our device: in the process of erasing the accumulated information, all of the work harvested by the device is returned as heat to the thermal reservoir.
In Sec. II we introduce our model and discuss protocols that reduce the energy of the system. In Sec. III we discuss the average amount of work that is extracted per cycle, when carrying out the three-step procedure discussed above; this amount depends on the precision with which the initial energy is measured in step 2. Using Landauer’s principle for the work that must eventually be expended to erase the accumulated information ($k_B T \ln 2$ per bit), we will show that this is no less than the work extracted in step 3, regardless of the precision with which the initial energy is measured. Thus in the final accounting, after all the bits of information are reset to zero, the device is unable to deliver work and the second law is rescued from the demon.
II Model and Protocols
Consider a classical particle of unit mass moving in one dimension, governed by a Hamiltonian
(5) 
where is a point in the phase space of the particle, and is a point in twodimensional parameter space, with . The parameter modulates the shape of the potential energy function in the region : when , there is a local minimum at , as illustrated in Fig. 1.
Similarly, the value of specifies a minimum at . We will refer to these regions as the left well and the right well. When , the particle moves in a quartic potential, which we call the unperturbed system.
Now imagine a protocol whereby the parameters are made to trace out the perimeter of the square shown in Fig. 2(a), starting and ending at .
For simplicity we assume a constant speed, . The deformation of the potential during this protocol can be pictured as follows. Starting from the unperturbed quartic potential, the right well gradually drops down, forming a local minimum that moves from the origin to (see Fig. 3(a)  3(c)) as increases from 0 to . Next, as increases from 0 to the left well drops down, forming a local minimum that comes to rest at , with a local maximum at the origin (Fig. 3(d)). These two stages are then undone (Figs. 3(e), 3(f)). The net effect is a pistonlike pumping of the right and left wells. For this protocol, let denote a trajectory evolving under the timedependent Hamiltonian .
For a given choice of , let us define two energy values,
(6) 
where
(7) 
These in turn define three regions of phase space, , , and , according to the value of the unperturbed Hamiltonian :
(8) 
We now claim that when the protocol shown in Fig. 2(a) is implemented quasistatically, , then the net effect is to swap regions and . That is, trajectories with initial conditions in region end with final conditions in region , and viceversa. (See, however, the discussion of subtleties associated with this limit, in Sec. IV.) Fig. 3 and the following paragraphs convey how this swap proceeds. For convenience, we will use the terms set and set to refer to trajectories with initial conditions in regions and of phase space, respectively. The shaded regions in Fig. 3 depict the evolution of these sets of trajectories, as a sequence of snapshots from to .
By Hamilton’s equations we have
(9) 
where is the unit step function. During the first stage of the process, , we have and , therefore as the right well drops down the value of decreases whenever . As a result, some trajectories acquire negative energies () and become trapped in the right well. As shown in Fig. 3(c) – and as justified quantitatively by Eqs. 10  15 below – at the end of this stage the trajectories belonging to set are trapped.
During the second stage, , the left well drops down, trapping the trajectories in set . As this occurs, the trajectories in set remain trapped in the right well.
From , as the right well rises and ultimately disappears, the trajectories in set gain energy (Fig. 3(e)), and during the fourth and final stage, , all trajectories gain energy as the left well gradually rises until it disappears. The situation at , shown in Fig. 3(f), reflects the swap that has occurred between sets and , relative to Fig. 3(a).
Due to adiabatic averaging, the energy ordering of the trajectories within each set remains fixed in the quasistatic limit: if we were to subdivide the lightly shaded region in Fig. 3(a) into a stack of narrow horizontal bands, then the vertical ordering of these bands would remain unchanged throughout the process.
A proper analysis of this process involves the theory of adiabatic invariants, with careful attention paid to the phase space separatrix that is present during the interval , when has a local maximum at Tennyson1986 ; Cary1986 . However, the essence of what occurs should be intuitively clear from the above discussion. A useful analogy is provided by imagining a container initially filled with three layers of a viscous, incompressible fluid, labeled , and in vertically ascending order. Two syringes are attached to the bottom of the container. First one syringe extracts the lowest layer of the fluid, bringing layer to the bottom of the container. Next, the other syringe extracts layer . Then the fluid layers are reinjected in the same order in which they were removed, resulting in the rearrangement of these layers.
The incompressibility of the fluid in this analogy corresponds to Liouville’s theorem: phase space volume is preserved under Hamiltonian dynamics. To justify quantitatively our assertion that the protocol swaps regions and , we must show that the phase space volumes corresponding to the darkly shaded regions in Figs. 3(a) and Figs. 3(d) are equal (in other words, it is precisely the trajectories in set that get trapped in the right well), and similarly that the phase space volumes of the lightly shaded regions in Figs. 3(a) and Figs. 3(d) are equal.
Let denote the volume of phase space enclosed by the surface :
(10) 
where we have integrated over momentum to get to the second line. When either or the remaining integral can be evaluated analytically:
(11a)  
(11b) 
with given by Eq. 7. The quantity
(12) 
is the volume of phase space for which and , and is defined similarly for and .
Using Eq. 11a, the phase space volumes of regions and , defined by Eq. 8, are
(13) 
In Fig. 3(d) the lightly and darkly shaded regions correspond to phase space volumes and , respectively, which are equal in value:
(14) 
Combining these results with Eq. 6 we find that
(15) 
This establishes that our qualitative description of what occurs during this process, as illustrated in Fig. 3, is indeed consistent with the preservation of phase space volume, as mandated by Liouville’s theorem.
The picture developed in the preceding paragraphs suggests the following relationship between the initial () and final () energy of the system, in the limit :
if  (16a)  
if  (16b)  
if  (16c) 
with and determined by the value of (Eq. 6). Combining these results with Eq. 13 (note that ) we obtain
(17) 
As a test of Eq. 17, we sampled initial conditions from a microcanonical ensemble at energy , near the bottom of region (see Fig. 3). For each initial condition we generated a trajectory by integrating Hamilton’s equations as the parameters were varied as in Fig. 2(a), with . The resulting distribution of final energies , spanning a range from to , was characterized by a mean value and a standard deviation , in excellent agreement with the value predicted by Eq. 17. (The small discrepancies reflect the fact that the duration is finite.) While these numerical results support the analysis leading to Eq. 17, some caveats are in order. In particular, Liouville’s theorem itself rules out the possibility that all initial conditions with energy lead to a net decrease of energy, . We defer a discussion of this issue to Sec. IV.
To this point we have considered a symmetric protocol, Fig. 2(a), in which each well reaches the same maximal depth, determined by the value of (Fig. 3(d)). However, the analysis is easily generalized to the asymmetric protocol shown in Fig. 2(b), in which the parameters are varied around a rectangle with corners at and . Regions , and are defined as in Eq. 8, but now the energies and are defined by
(18) 
When the protocol is implemented quasistatically, the net result is a rearrangement of sets and , as depicted in Fig. 4. Eq. 16 now leads to the result
(19) 
The viscous fluid analogy also applies to this situation, only now the syringes remove different quantities of fluid, . Alternatively, the processes illustrated in Figs. 3 and 4 are analogous to a simple shuffle of a deck of cards, in which a stack of adjacent cards (region ) is removed from the middle of the deck and transferred to the bottom.
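The card-shuffle picture can be sketched as a map on enclosed phase-space volume. The following is a minimal illustration rather than the authors’ code: it assumes the ideal quasistatic limit, labels each trajectory by the phase-space volume enclosed by its initial energy shell, and uses hypothetical names (`shuffle_volume`, `vol_lower`, `vol_middle`) for the map and for the volumes of the two layers that trade places.

```python
def shuffle_volume(omega, vol_lower, vol_middle):
    """Idealized quasistatic shuffle, written in terms of enclosed volume.

    omega      -- phase-space volume enclosed by the initial energy shell
    vol_lower  -- volume of the lowest layer (hypothetical name)
    vol_middle -- volume of the layer stacked directly above it

    The middle layer is moved to the bottom, the lowest layer is lifted
    on top of it, and everything above both layers is left untouched.
    Ordering inside each layer is preserved, so the map is a bijection
    that conserves volume, as Liouville's theorem requires.
    """
    if omega < 0:
        raise ValueError("enclosed volume must be non-negative")
    if omega < vol_lower:
        return omega + vol_middle      # lowest layer ends above the middle one
    if omega < vol_lower + vol_middle:
        return omega - vol_lower       # middle layer is transferred to the bottom
    return omega                       # top layer is unaffected


if __name__ == "__main__":
    # A shell sitting just above the bottom of the middle layer ends
    # close to zero enclosed volume, i.e. near the bottom of the well.
    print(shuffle_volume(1.01, vol_lower=1.0, vol_middle=2.0))
```

Because energy increases monotonically with enclosed volume for a one-dimensional well, a final enclosed volume near zero corresponds to a final energy near the bottom of the potential.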
It should now be clear how to design a quasistatic protocol that lowers the energy of the system almost to zero, for a given initial energy . Namely, we choose such that is slightly above , thus locating the initial conditions near the bottom of region . If we then implement the protocol shown in Fig. 2, in either its symmetric () or asymmetric () version, the system will be trapped near the bottom of the left well at , and will end the process with . This outcome is independent of the value of , which simply determines the width (in energy) of region .
III Exorcising Maxwell’s Demon
Let us now return to the perpetualmotion device of the second kind proposed in the Introduction: after equilibrating the system with a thermal reservoir at temperature (step 1), we measure the initial energy (step 2), then choose a protocol that reduces the energy near to zero (step 3). The amount of work we extract during this cycle – equivalently, minus the amount of work we perform on the system – is given by
(20) 
If we repeat this process many times, then the average work extracted per cycle satisfies
(21) 
where the canonical distribution reflects initial equilibration with the reservoir. (Footnote 2: In Eq. 21 we have used the identity , with .) To approach this upper bound, in which the thermal energy of the system is entirely converted to work, the initial energy must be measured with high precision, allowing us to choose a protocol for which the final energy is tiny but positive (Eqs. 17, 19). However, as mentioned in the Introduction, these measurements generate information that must ultimately be erased, at a cost of $k_B T \ln 2$ per bit. There is a competition at play here: increased precision brings us closer to the maximal extracted work, but carries the penalty of increased accumulation of information.
To address this issue, imagine a measurement apparatus that reports the initial energy of the system with finite precision. Specifically, given the initial microstate , the apparatus outputs one of values associated with specified energy intervals , , , . Taking for purpose of illustration, the apparatus outputs , , , or according to
(22) 
where the values , , and are fixed properties of the apparatus.
Now consider the following strategy for choosing a cyclic protocol, based on the output of the measurement apparatus.
: Do nothing to the system, as it is already in the lowestenergy interval.
: Using Eq. 18, set and , that is choose so that interval in Eq. 22 corresponds to region in Eq. 8. Next, implement the asymmetric protocol of Fig. 2(b), under which initial conditions from this region are transferred to the bottom of the potential well, as in Fig. 4.
: Set and , then implement the asymmetric protocol. Again, the energy interval containing the initial conditions – interval , in this case – is shuffled to the bottom of the potential.
: Set and , where is an arbitrary cutoff energy, then implement the asymmetric protocol. In this case, initial conditions from the region between and are transferred to the bottom of the potential, whereas if the protocol produces no net change in the energy of the system.
This strategy takes advantage of the limited knowledge provided by the measurement of the initial energy. When it is implemented, the energy of the system decreases (that is, ) if , and remains unchanged otherwise. Thus, on average per cycle, work is extracted from the system,
(23) 
and ultimately from the reservoir that replenishes the system’s energy.
Over repetitions of the process, the measurement apparatus generates a symbolic string of length , of the form . Letting denote the probability of outcome in a given measurement, the number of bits required to encode this string is given by
(24) 
where
(25) 
is the Shannon entropy of the measurement Cover2006 . Now, both and depend on , and , and the former also depends on . In the following section we establish that, no matter what values these parameters take, the inequality
(26) 
is satisfied. The extraction of work thus comes at the cost of the accumulation of information: on average, at least one bit is written per $k_B T \ln 2$ of extracted work. (Footnote 3: In the original Szilard engine, which involves a single particle in a chamber, this relationship is straightforward: the determination whether the particle is in the left or right half of the chamber produces exactly one bit of information, and standard thermodynamics gives the amount of work extracted during the subsequent isothermal expansion, $k_B T \ln 2$.)
We now turn our attention to the eventual cost of erasing this information. By Landauer’s principle, the average work required to erase one bit of information is no less than $k_B T \ln 2$. Therefore, since the number of bits generated per cycle follows from Eq. 24, the average work required to erase the information accumulated in one cycle of operation satisfies
(27) 
Combining Eqs. 26 and 27, we find that the work required to erase the accumulated information exceeds – or at best, matches – the work extracted during the cycle:
(28) 
Thus our model obeys the Kelvin-Planck statement of the second law, as it had better do! Eq. 28 highlights the two logically distinct steps we take in reconciling our model with the second law. Although the second half of this inequality chain (that is, Landauer’s principle) is derived by appeal to the second law itself Landauer1961 , the first half (Eq. 26) is obtained without assuming the second law: in Sec. III.1 we do not infer Eq. 26 by arguing that the second law demands it; rather, we derive this inequality directly.
Eq. 26 is a special case of an inequality recently derived by Sagawa and Ueda (see Eq. 3 of Ref. Sagawa2010 or, in the quantum setting, Eq. 14 of Ref. Sagawa2008 ), which generalizes the second law of thermodynamics to processes with feedback, such as the one considered in this paper. This inequality also follows readily from recent generalizations Sagawa2010 ; Ponmurugan2010 ; Horowitz2010 of the nonequilibrium work relation Jarzynski1997a and Crooks’s fluctuation theorem Crooks1999 to nonequilibrium processes with feedback. In the following derivation we do not directly invoke these results; instead we provide a self-contained analysis that is pertinent to our particular model.
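The information bookkeeping described above can be made concrete with a short calculation. The sketch below is illustrative only: it measures the Shannon entropy in bits (a convention chosen here, not necessarily that of Eq. 25), and the function names `shannon_entropy_bits` and `min_erasure_work` are hypothetical. It evaluates the Landauer lower bound on the average work needed to erase the record of one measurement, given the outcome probabilities.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def shannon_entropy_bits(probs):
    """Shannon entropy of a discrete distribution, in bits."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probs if p > 0)


def min_erasure_work(probs, temperature):
    """Landauer lower bound (in joules) on the average work needed to
    erase the record of one measurement with the given outcome
    probabilities: k_B * T * ln(2) per bit of Shannon entropy."""
    return K_B * temperature * math.log(2) * shannon_entropy_bits(probs)


if __name__ == "__main__":
    # Szilard-like case: two equally likely outcomes, i.e. one bit.
    print(shannon_entropy_bits([0.5, 0.5]))
    print(min_erasure_work([0.5, 0.5], temperature=300.0))
```

For two equally likely outcomes the bound equals $k_B T \ln 2$, the same amount of work that the original Szilard engine extracts per cycle, so in that case erasure exactly cancels the gain.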
III.1 Bound on work
Consider a cyclic process with the measurement apparatus described by Eq. 22 above. For initial conditions , let denote the final conditions, after implementation of the cyclic protocol corresponding to measurement outcome . The work performed on the system as it evolves from to is given by
(29) 
Over many repetitions of the process, with the protocol determined by the measurement of initial energy, the average work performed on the system is
(30) 
where , and indicates integration over all microstates that result in the measurement outcome . Eq. 30 can be rewritten as
(31) 
(dropping the subscript ). Let us now define two functions
(32)  
(33) 
where is the probability that the outcome of the measurement is . We can interpret as the probability distribution of initial microstates, conditioned on the outcome . Moreover, (since phase volume is preserved, , by Liouville’s theorem), therefore can also be interpreted as a probability distribution on phase space.
With these definitions, Eq. 31 becomes
(34)  
(35) 
The integral appearing in Eq. 35 is the relative entropy or Kullback-Leibler divergence between the two distributions defined in Eqs. 32 and 33; this quantity is equal to zero if the two distributions are identical and is positive otherwise Cover2006 :
(36) 
Thus the first sum on the right side of Eq. 35 is nonnegative, hence
(37) 
which is equivalent to Eq. 26, the bound we set out to establish. ^{4}^{4}4 In fact, as long as our measurement apparatus has more than one possible outcome , this result will be a strict inequality, since for any , hence .
The above derivation hinges on the nonnegativity of relative entropy. A similar approach has recently been taken to obtain inequalities related to the second law of thermodynamics Esposito2010 ; Hasegawa2010 ; Takara2010 ; Esposito2011arXiv , in situations when the system of interest does not necessarily begin (or end) in states of thermal equilibrium. (See also Ref. Jarzynski1999a for an alternative derivation of such inequalities.)
While the calculation presented here assumes a measurement apparatus with four possible outcomes, it should be clear that the analysis generalizes to any finite number of energy intervals. In fact, we can even drop the assumption that the measurement is strictly correlated with energy. That is, suppose phase space is divided into regions (not necessarily corresponding to energy intervals) and suppose that when the system is in microstate , the measurement apparatus returns a value that identifies the region of phase space to which that microstate belongs. Finally, a cyclic protocol is assigned to each possible outcome. It can be verified by the reader that the steps leading to Eq. 37 (equivalently Eq. 26) remain valid.
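The bound can be checked numerically in a toy setting. The sketch below is not the model of Sec. II: it assumes a finite set of discrete states with prescribed energies, replaces volume-preserving Hamiltonian evolution by a permutation of the states (one permutation per measurement outcome), and takes the measurement to be error-free. All names (`shuffle_to_bottom`, `average_work_and_bound`, `outcome_of`) are hypothetical. It computes the average work performed on the system and compares it with minus $k_B T$ times the Shannon entropy of the outcomes, measured in nats.

```python
import math


def shuffle_to_bottom(n_states, region):
    """Permutation that moves the states listed in `region` to the lowest
    indices and keeps the relative order of all other states."""
    order = list(region) + [i for i in range(n_states) if i not in region]
    perm = [0] * n_states
    for new_index, old_index in enumerate(order):
        perm[old_index] = new_index
    return perm


def average_work_and_bound(energies, outcome_of, perms, kT=1.0):
    """Return (average work done ON the system, lower bound -kT * H).

    energies   -- energy of each discrete state
    outcome_of -- measurement outcome assigned to each state (error-free)
    perms      -- perms[k][i] is the final state when outcome k is
                  recorded and the system starts in state i
    """
    weights = [math.exp(-e / kT) for e in energies]
    z = sum(weights)
    p = [w / z for w in weights]            # canonical initial distribution
    p_out = [0.0] * len(perms)
    avg_work = 0.0
    for i, k in enumerate(outcome_of):
        p_out[k] += p[i]
        avg_work += p[i] * (energies[perms[k][i]] - energies[i])
    shannon = -sum(q * math.log(q) for q in p_out if q > 0)  # in nats
    return avg_work, -kT * shannon


# Eight equally spaced levels, four outcomes covering two levels each.
energies = [float(i) for i in range(8)]
outcome_of = [0, 0, 1, 1, 2, 2, 3, 3]
perms = [
    shuffle_to_bottom(8, [i for i, k in enumerate(outcome_of) if k == out])
    for out in range(4)
]
avg_work, lower_bound = average_work_and_bound(energies, outcome_of, perms)
```

In this example the average work is negative, so work is extracted, but it stays above the lower bound, consistent with the inequality derived in this section. For the lowest outcome the permutation is the identity, mirroring the "do nothing" branch of the strategy.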
Moreover, to this point we have considered a measurement apparatus that is error-free: if the initial microstate belongs to a given region, then the measurement outcome necessarily identifies that region. Let us now consider a more general situation in which the apparatus outputs each possible value with a probability that depends on the microstate of the system at the time of the measurement. In the Appendix we analyze this scenario and derive the bound
(38) 
where $I$ is the mutual information Cover2006 between the initial microstate and the measurement outcome. For error-free measurements (e.g. Eq. 22), the mutual information equals the Shannon entropy of the outcomes, and Eq. 38 reduces to Eq. 37. When the apparatus is capable of making errors, the mutual information does not exceed that entropy Cover2006 , which conforms nicely to the intuition that an error-prone measuring device degrades our ability to extract work from the system. In either case Eq. 26 remains valid.
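The relation between mutual information and Shannon entropy can be illustrated with a small discrete example. This is a sketch under stated assumptions, not part of the paper’s analysis: the system has a finite number of states, `likelihood[i][k]` (a hypothetical name) is the probability that the apparatus reports outcome `k` when the system is in state `i`, and entropies are measured in nats.

```python
import math


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(q * math.log(q) for q in probs if q > 0)


def mutual_information(p_state, likelihood):
    """Return (I, H): the mutual information between state and outcome,
    and the Shannon entropy of the outcome distribution, both in nats.

    p_state    -- probability of each system state
    likelihood -- likelihood[i][k] = probability of outcome k given state i
    """
    n_out = len(likelihood[0])
    p_out = [
        sum(p_state[i] * likelihood[i][k] for i in range(len(p_state)))
        for k in range(n_out)
    ]
    info = 0.0
    for i, p_i in enumerate(p_state):
        for k in range(n_out):
            joint = p_i * likelihood[i][k]
            if joint > 0:
                info += joint * math.log(likelihood[i][k] / p_out[k])
    return info, entropy(p_out)


p_state = [0.5, 0.3, 0.2]
exact = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]   # error-free apparatus
noisy = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]   # 10% misreport rate
i_exact, h_exact = mutual_information(p_state, exact)
i_noisy, h_noisy = mutual_information(p_state, noisy)
```

With the error-free apparatus the two quantities coincide, whereas the noisy apparatus yields a mutual information strictly below the outcome entropy, so the bound on extractable work tightens accordingly.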
Finally, we note that the results derived in this section can be generalized to systems evolving according to stochastic equations of motion Vaikuntanathan2011 .
IV Discussion and Conclusions
The past few years have seen considerable interest in the thermodynamics of small systems and in the applicability of the second law to various nanoscale scenarios (see Ref. Jarzynski2011 for a recent review), including those involving feedback. Motivated by the recent work of Marathe and Parrondo Marathe2010 , we have studied a model single-particle system that is “cooled” under the quasistatic cycling of external parameters, when initial conditions are sampled microcanonically. We have used this model to construct a procedure for systematically harvesting energy from a thermal reservoir and converting that energy to work, in seeming violation of the Kelvin-Planck statement of the second law. This procedure, however, involves the repeated measurement of the energy of the system. Modeling the measurement apparatus in Sec. III, we have shown by explicit calculation that the average work delivered per operating cycle does not exceed the average work that must eventually be expended (in accordance with Landauer’s principle) to erase the information acquired in the act of measuring the initial energy. Thus on balance the Kelvin-Planck statement remains satisfied.
Our model illustrates the idea – which traces back to Maxwell and Szilard – that knowledge about the microscopic state of a system can be exploited to circumvent the second law of thermodynamics, loosely speaking Leff2003 . In this setting, Eq. 37 places a bound on the work that can be extracted during a cyclic process, following a measurement that provides information about the initial state of the system. As already mentioned, similar bounds have been obtained and studied in the past few years, both for quantum systems Zurek2003 ; Sagawa2008 ; Jacobs2009 ; Kim2011 and for systems evolving according to stochastic equations of motion Touchette2000 ; Kim2007 ; Cao2009 ; Suzuki2009 ; Sagawa2010 ; Ponmurugan2010 ; Fujitani2010 ; Horowitz2010 ; Toyabe2010 ; Abreu2011 . We also note that Eq. 35, a precursor to Eq. 37, generalizes the relative entropy work relation of Kawai, Parrondo and Van den Broeck Kawai2007 to processes with feedback.
Let us now return to a point mentioned in Sec. II: the apparent incompatibility of Eq. 17 with Liouville’s theorem. Consider a single energy shell, that is the set of all points with a particular value of energy . This set, which we denote , has the topology of a simple closed loop in phase space. Let us assume that this energy shell is located in region , hence . If we evolve trajectories from initial conditions in , using the protocol in Fig. 2(a), we arrive at a set of final conditions, , which also has the topology of a simple closed loop:
(39) 
By Liouville’s theorem, these loops enclose equal volumes of phase space: . This, however, is incompatible with a literal interpretation of Eq. 17, which seems to assert that every initial condition with energy leads to a net decrease of energy, , in other words that is contained entirely in the interior of . To address this apparent contradiction, we sketch a more careful interpretation of Eq. 17.
For any finite duration , there exist some initial conditions that yield trajectories for which the system’s energy increases: . We will refer to these trajectories as “bad actors”, as they spoil the picture shown in Fig. 3. ^{5}^{5}5 In simulations, we have observed bad actors that begin near the bottom of region , but get trapped in the right well at the end of the first stage of the process, e.g. just before in Fig. 3. As a result, they do not get drawn into the left well during the second stage. They subsequently “float” on top of the darkly shaded set in Fig. 3, and end the process with . While bad actors exist for any finite , the probability to generate one of these trajectories generally decreases with increasing , for initial conditions sampled microcanonically from . We have observed this trend in numerical simulations over a range from to 2000 (data not shown); and as mentioned in Sec. II, for and no bad actors were observed among trajectories. Thus for large but finite , we expect to be a highly convoluted, closed loop – necessarily enclosing the same volume of phase space as – with much of the loop concentrated at low energies near the value predicted by Eq. 17, but with tendrils reaching into the region of energies higher than than . We believe this issue deserves a more careful treatment, but this is beyond the scope of the present paper. We end with a conjecture regarding the quasistatic limit:
(40) 
where the quantity inside the limit is the probability to generate a trajectory whose final energy falls within an interval of width around the value predicted by Eq. 17, and microcanonical sampling at energy is assumed. We believe this conjecture represents the proper way to understand the validity of Eq. 17 and Fig. 3. Similar comments apply to Eq. 19 and Fig. 4.
Our results suggest several avenues for future research.
First, it would be interesting to explore a quantum-mechanical version of our model system. Here, the possibility of tunneling between the left and right wells introduces a new aspect to the problem, possibly spoiling the picture developed in Sec. II by preventing particles from getting trapped.
Because the protocols discussed in Sec. II involve the quasistatic cycling of external parameters, it is natural to wonder whether the swapping of regions and (illustrated in Fig. 3) can be described in terms of a geometric phase.
Finally, we have not explicitly modeled the “demon” in Sec. III. Instead, we have assumed the existence of some mechanism by which a particular outcome of the measurement leads to the implementation of the corresponding protocol. It would be interesting, however, to model this mechanism explicitly within a Hamiltonian framework, either by introducing additional degrees of freedom to model the demon or by specifying coupling terms between the measurement device and the system. In this case, we anticipate that the bound on extracted work will be given in terms of the correlation between the state of the system and the state of the measuring device and/or demon Touchette2000 ; Zurek2003 ; Jacobs2009 ; Cao2009 .
Acknowledgements.
We gratefully acknowledge useful discussions and correspondence with Eric Heller, Jordan Horowitz, Daniel Lathrop, Rahul Marathe, Juan Parrondo and Wojciech Zurek, as well as financial support from the National Science Foundation (USA) under grants CHE-0841557 and DMR-0906601, and the University of Maryland, College Park.
Appendix A Analysis of error-prone measurement devices
Consider a measurement apparatus with a discrete set of possible outputs, , and let denote the probability to obtain outcome , when the measurement is performed on a system in microstate . We assume that every measurement produces some outcome, hence for any . As before, a cyclic protocol is chosen based on the outcome of the measurement. For initial conditions , let denote the final conditions, after implementation of the protocol corresponding to outcome . The work performed on the system is given by Eq. 29, and averaging over many repetitions of the process gives us
(41)  
(42) 
where is the joint probability that the system is initially in microstate and the measurement outcome is . Dropping the subscript , we now introduce two probability distributions (compare with Eqs. 32, 33)
(43)  
(44) 
where is the net probability to generate the outcome , and denotes the conditional probability distribution that the initial microstate is , given the measurement outcome . In terms of these distributions we now have
(45)  
(46) 
On the last line, the first term is a relative entropy, and therefore nonnegative; while the second term (apart from the factor ) is the mutual information between and . We thus arrive at
(47) 
equivalently Eq. 38.
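As a rough numerical illustration of the mutual information that appears in the derivation above, the following sketch evaluates it for a toy error-prone measurement. The coarse-graining into two equally likely states (particle in the left or right well), the binary outcome, the error rate `eps`, and the function name `mutual_information` are all assumptions introduced here for illustration; they are not taken from the analysis above, and the prefactor multiplying the mutual information in the work bound is deliberately left out.

```python
import math

def mutual_information(p_state, p_outcome_given_state):
    """Mutual information (in nats) between a discrete state and a measurement outcome.

    p_state[i]                  : probability of coarse-grained state i (hypothetical model)
    p_outcome_given_state[i][k] : probability of outcome k given state i
    """
    n_outcomes = len(p_outcome_given_state[0])
    # Every measurement produces some outcome, so each row must sum to one.
    for row in p_outcome_given_state:
        if abs(sum(row) - 1.0) > 1e-12:
            raise ValueError("conditional probabilities must sum to 1 for each state")
    # Net probability of each outcome, summed over states.
    p_outcome = [
        sum(p_state[i] * p_outcome_given_state[i][k] for i in range(len(p_state)))
        for k in range(n_outcomes)
    ]
    info = 0.0
    for i, p_i in enumerate(p_state):
        for k in range(n_outcomes):
            joint = p_i * p_outcome_given_state[i][k]
            if joint > 0.0:  # zero-probability terms contribute nothing
                info += joint * math.log(p_outcome_given_state[i][k] / p_outcome[k])
    return info

def binary_device(eps):
    """Assumed symmetric device: reports the wrong well with probability eps."""
    return [[1.0 - eps, eps], [eps, 1.0 - eps]]

if __name__ == "__main__":
    for eps in (0.0, 0.1, 0.5):
        print(eps, mutual_information([0.5, 0.5], binary_device(eps)))
```

In this toy model an error-free device (`eps = 0`) yields ln 2 of information, while a device that is wrong half the time (`eps = 0.5`) yields none, so the bound on extracted work tightens as the measurement becomes less reliable.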
References
 (1) C. B. P. Finn, Thermal Physics, 2nd ed. (Chapman and Hall, 1993)
 (2) C. Jarzynski, Physical Review Letters 78, 2690 (1997)
 (3) A. Allahverdyan and T. Nieuwenhuizen, Physica A 305, 542 (2002)
 (4) M. Campisi, Studies in History and Philosophy of Modern Physics 39, 181 (2008)
 (5) K. Sato, Journal of the Physical Society of Japan 71, 1065 (April 2002)
 (6) R. Marathe and J. M. R. Parrondo, Physical Review Letters 104, 245704 (June 2010)
 (7) J. C. Maxwell, Theory of Heat (Longmans, Green and Co., London, 1871)
 (8) Maxwell’s Demon 2: Entropy, Information, Computing, edited by H. S. Leff and A. F. Rex (Institute of Physics Publishing, Bristol and Philadelphia, 2003)
 (9) K. Maruyama, F. Nori, and V. Vedral, Rev. Mod. Phys. 81, 1 (January-March 2009)
 (10) R. Landauer, IBM Journal of Research and Development 5, 183 (1961)
 (11) C. H. Bennett, International Journal of Theoretical Physics 21, 905 (1982)
 (12) L. Szilard, Zeitschrift für Physik 53, 840 (1929)
 (13) J. L. Tennyson, J. R. Cary, and D. F. Escande, Physical Review Letters 56, 2117 (May 1986)
 (14) J. R. Cary, D. F. Escande, and J. L. Tennyson, Physical Review A 34, 4256 (November 1986)
 (15) T. M. Cover and J. A. Thomas, Elements of Information Theory (Wiley-Interscience, 2006)
 (16) T. Sagawa and M. Ueda, Phys. Rev. Lett. 104, 090602 (Mar 2010)
 (17) T. Sagawa and M. Ueda, Phys. Rev. Lett. 100, 080403 (Feb 2008)
 (18) M. Ponmurugan, Phys. Rev. E 82, 031129 (Sep 2010)
 (19) J. M. Horowitz and S. Vaikuntanathan, Phys. Rev. E 82, 061120 (Dec 2010)
 (20) G. E. Crooks, Phys. Rev. E 60, 2721 (Sep 1999)
 (21) M. Esposito, K. Lindenberg, and C. Van den Broeck, New J. Phys. 12, 013013 (2010)
 (22) H.-H. Hasegawa, J. Ishikawa, K. Takara, and D. J. Driebe, Physics Letters A 374, 1001 (2010)
 (23) K. Takara, H.-H. Hasegawa, and D. J. Driebe, Physics Letters A 375, 88 (2010)
 (24) M. Esposito and C. Van den Broeck, “Second law and Landauer principle far from equilibrium,” (2011), arXiv:1104.5165v1
 (25) C. Jarzynski, J. Stat. Phys. 96, 415 (1999)
 (26) S. Vaikuntanathan, unpublished
 (27) C. Jarzynski, Annu. Rev. Cond. Matt. Phys. 2, 329 (2011)
 (28) W. H. Zurek, “Maxwell’s demon, Szilard’s engine and quantum measurements,” (2003), arXiv:quant-ph/0301076v1
 (29) K. Jacobs, Phys. Rev. A 80, 012322 (Jul 2009)
 (30) S. W. Kim, T. Sagawa, S. De Liberato, and M. Ueda, Phys. Rev. Lett. 106, 070401 (February 2011)
 (31) H. Touchette and S. Lloyd, Phys. Rev. Lett. 84, 1156 (Feb 2000)
 (32) H. K. Kim and H. Qian, Phys. Rev. E 75, 022102 (Feb 2007)
 (33) F. J. Cao and M. Feito, Phys. Rev. E 79, 041118 (Apr 2009)
 (34) H. Suzuki and Y. Fujitani, Journal of the Physical Society of Japan 78, 074007 (Jul 2009)
 (35) Y. Fujitani and H. Suzuki, Journal of the Physical Society of Japan 79, 104003 (Oct 2010)
 (36) S. Toyabe, T. Sagawa, M. Ueda, E. Muneyuki, and M. Sano, Nature Physics 6, 988 (December 2010)
 (37) D. Abreu and U. Seifert, “Extracting work from a single heat bath through feedback,” (March 2011), arXiv:1102.3826v2
 (38) R. Kawai, J. M. R. Parrondo, and C. Van den Broeck, Phys. Rev. Lett. 98, 080602 (Feb 2007)