Stochastic Optimal Control: The Discrete-Time Case, by Dimitri P. Bertsekas and Steven E. Shreve, 1996, ISBN 1-886529-03-5, 330 pages.
In its basic form, fitted value iteration (FVI) starts from an initial value function.
Bertsekas' textbooks include Dynamic Programming and Optimal Control (1996), Data Networks (1989, co-authored with Robert G. Gallager), Nonlinear Programming (1996), Introduction to Probability (2003, co-authored with John N. Tsitsiklis), and Convex Optimization Algorithms (2015), all of which are used for classroom instruction at MIT.
John N. Tsitsiklis is a Clarence J. Lebel Professor in the Department of Electrical Engineering and Computer Science at MIT and the director of the Laboratory for Information and Decision Systems.
The tools of probability theory, and of the related field of statistical inference, are the keys to analyzing and making sense of data.
In this paper, we provide an overview of the major conceptual issues, and we survey a number of recent developments, including rollout algorithms, which are related to recent advances in model predictive control for chemical processes.
Parallel and Distributed Computation: Numerical Methods, by Dimitri P. Bertsekas and John N. Tsitsiklis, 1997, ISBN 1-886529-01-9, 718 pages.
"I believe that Neuro-Dynamic Programming by Bertsekas and Tsitsiklis will have a major impact on operations research theory and practice over the next decade."
For example, Q-learning, Sarsa, and dynamic programming methods have all been shown unable to converge to any policy for simple MDPs and simple function approximators (Gordon, 1995, 1996; Baird, 1995; Tsitsiklis and van Roy, 1996; Bertsekas and Tsitsiklis, 1996).
... a priori guarantees in most cases for the performance of specific value function architectures on specific problems, careful analyses such as (Bertsekas & Tsitsiklis, 1996) have legitimized the use of value function approximation for MDPs by providing loose guarantees that good value function approximations will result in good policies (Lagoudakis & Parr, UAI 2002, p. 285).
An example is fitted value iteration, or FVI (Bertsekas & Tsitsiklis, 1996; Munos & Szepesvári, 2008), which includes as special cases the empirically successful DQN and its variants, and which also serves as a key component in many state-of-the-art actor-critic algorithms.
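The basic FVI loop mentioned above (start from an initial value function, apply a Bellman backup, then fit an approximator to the backed-up values) can be sketched as follows. This is a minimal illustration under stated assumptions, not the formulation of any cited work: it assumes a small finite MDP with known transition probabilities, a linear approximator fitted by least squares, and a zero initial value function. The names `chain_mdp` and `fitted_value_iteration` are hypothetical and introduced only for this example.

```python
import numpy as np


def chain_mdp(n_states=5):
    """Hypothetical toy MDP: a chain with 'left' and 'right' actions.

    Returns P with shape (actions, states, states) and R with shape
    (actions, states). Being in the last state yields reward 1.
    """
    P = np.zeros((2, n_states, n_states))
    R = np.zeros((2, n_states))
    for s in range(n_states):
        P[0, s, max(s - 1, 0)] = 1.0             # action 0: step left
        P[1, s, min(s + 1, n_states - 1)] = 1.0  # action 1: step right
    R[:, n_states - 1] = 1.0
    return P, R


def fitted_value_iteration(P, R, features, gamma=0.9, n_iters=200):
    """Sketch of FVI with a linear least-squares fit (an assumption).

    features has shape (states, k); the value estimate is features @ w.
    """
    w = np.zeros(features.shape[1])  # initial value function V_0 = 0
    for _ in range(n_iters):
        v = features @ w
        # Bellman backup: max over actions of R(a, s) + gamma * E[v(s')]
        targets = np.max(R + gamma * (P @ v), axis=0)
        # Fit step: regress the backed-up targets onto the features
        w, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return features @ w


P, R = chain_mdp(5)
# One-hot features make the fit exact, so this reduces to value iteration.
values = fitted_value_iteration(P, R, np.eye(5))
print(np.round(values, 3))
```

With one-hot features the fit step is exact, so the loop reduces to ordinary value iteration and converges. With coarser features the same loop is not guaranteed to converge, which is consistent with the divergence results cited earlier in the text.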