Interestingly, the ETD error bounds in Corollaries 1 and 2 are more conservative, by a square-root factor, than the error bounds for standard on-policy TD (Bertsekas & Tsitsiklis, 1996; Tsitsiklis & Van Roy, 1997). Thus, it appears that there is a pr…