What is simulation, and what is it good for?
Every engineer and scientist knows there is a fundamental difference between a “simulation” and a “prediction,” but what is the root of that distinction? At the highest level, we contrast simulation against black-box modeling. Simulations are typically thought of as “transparent boxes” where we can describe the intent of each part of the model that produces a forecast.
A roboticist might think of a simulation as a computer system designed to integrate the differential equations that define basic laws of physics. For example, you might predict the path an airplane takes from physical models of lift and drag and of how the plane responds to different control settings. Simple simulations based on reduced equations might suffice for some tasks. For others, we might have to rely on computational fluid dynamics to truly capture the behavior we’re after.
The transparent box becomes murky when systems are too complex to predict precisely. Many designers accept adding randomness to their simulations, provided they can characterize the statistical models as plausible. The dynamics of coin flipping are too hard to capture precisely, but we’re usually fine with a random number generator that produces heads and tails in equal proportion. Noise in measurement devices often has statistics that reliably match those of Gaussian or Poisson random variables, and such stochastic processes are reasonable stand-ins for the sorts of signals we’ll encounter in the wild. Maybe you can simulate elections based on random numbers derived from current polling results.
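To make the stand-in concrete, here is a minimal sketch of a stochastic sensor model in Python. The signal value and noise level are illustrative assumptions; the point is that we validate the simulator by checking its sample statistics, not its mechanism.

```python
import random
import statistics

rng = random.Random(0)

def noisy_reading(true_value, sigma=0.5):
    """One simulated sensor sample: the true signal plus Gaussian read noise.
    (The value 0.5 for sigma is an illustrative assumption.)"""
    return true_value + rng.gauss(0.0, sigma)

# Draw many samples of a constant signal and check that the sample statistics
# match the noise model we put in -- the sense in which the stand-in is plausible.
samples = [noisy_reading(3.0) for _ in range(10_000)]
print(round(statistics.mean(samples), 2))   # close to 3.0
print(round(statistics.stdev(samples), 2))  # close to 0.5
```

The check at the end is the whole game: we never claim the noise mechanism is right, only that its statistics are.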
Where do we draw the line between sampling and simulation? I maintain that LLMs are simulations of language. We train next-token predictors in language models so that their generations match the statistical properties of the data. Indeed, maximum likelihood selects probability distributions that make past sequences likely in the future. I’ve received a lot of pushback on this claim, on the grounds that the samples generated by a transformer are too black-box to count as simulations. This reaction suggests to me that some people want simulations to arise from models with articulable causal explanations.
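To see what “matching the statistical properties of the data” means at toy scale, here is a maximum-likelihood bigram model in Python. The corpus is made up for illustration; the principle — relative frequencies make the observed sequences as likely as possible — is the same one, scaled down from transformers to counting.

```python
from collections import Counter, defaultdict

# Illustrative toy corpus, not any real training setup.
corpus = "the cat sat on the mat the cat ran".split()

# Count every observed (previous token, next token) pair.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_probs(prev):
    """Maximum likelihood: relative frequencies of observed continuations."""
    total = sum(counts[prev].values())
    return {tok: c / total for tok, c in counts[prev].items()}

print(next_token_probs("the"))  # "cat" is twice as likely as "mat"
```

Sampling from these conditional distributions generates text whose bigram statistics match the corpus — a simulation of the corpus’s language, however crude.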
The academic literature on simulation is also intentionally vague about the difference between modeling, sampling, and simulation. But this quote from the 1975 textbook Systems Simulation: The Art and Science, by industrial engineer Robert Shannon, highlights a crucial feature of simulation:
“Simulation is the process of designing a model of a real system and conducting experiments with this model for the purpose either of understanding the behavior of the system or of evaluating various strategies (within the limits imposed by a criterion or set of criteria) for the operation of the system.”
For Shannon, simulation is purpose-driven. You replace a real system with a model, and then evaluate counterfactuals in the modeled world. A forecast that does not evaluate a counterfactual configuration or strategy is not a simulation; a simulation is anything that lets us evaluate counterfactual futures and gain insight from them.
Simulations can help engineers describe the behavior of complex systems and build theories and hypotheses for why that behavior occurs. Engineers can also use them to predict future behavior of the system if they were to intervene with some new policy or if an external force acted to change some parameters.
Under this broad tent, optimal control is simulation. Because everyone learns LQR first, and LQR yields such clean closed-form formulas, we don’t think of it as a simulator. We think of it as an analytical technique. But if you instead solve LQR by gradient descent, you’ll find that you need to simulate to compute a gradient. This is the “forward pass” in backpropagation, a method for computing gradients that was initially invented to solve optimal control problems.
Indeed, the process of solving LQR by gradient descent looks like this: You pick a cost function that seems to match your design specification. You try a particular control policy out. You get a signal back based on its performance under your cost function. You use this signal to modify your control policy to a policy with lower cost and try again. Once you have repeated this enough times so that you don’t think you can further improve, you deploy the control system trained in simulation.
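The loop above can be sketched in a few lines of Python for a scalar system. The dynamics, horizon, and step size are illustrative assumptions, and the gradient is estimated by finite differences rather than a true backward pass — but notice that every gradient estimate requires simulating the closed loop.

```python
# A minimal version of the iterative design loop for the scalar system
# x_next = a*x + b*u, policy u = -k*x, and cost sum of (x^2 + u^2).
# All numbers here are illustrative assumptions.
a, b, horizon, x0 = 1.2, 1.0, 30, 1.0

def rollout_cost(k):
    """Simulate the closed loop (the forward pass) and total up the cost."""
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -k * x
        cost += x * x + u * u
        x = a * x + b * u
    return cost

k, step, eps = 0.5, 1e-3, 1e-5  # start from a stabilizing gain
for _ in range(2000):
    # Each gradient estimate costs two full simulations (finite differences).
    grad = (rollout_cost(k + eps) - rollout_cost(k - eps)) / (2 * eps)
    k -= step * grad

print(round(k, 2))  # close to the infinite-horizon Riccati gain, about 0.79
```

The gain it converges to is the same one the Riccati recursion would hand you analytically; the difference is that here you only ever needed to run the simulator.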
AI people have coined a cutesy name for this iterative control design process: “sim2real.” On the one hand, sim2real looks like it’s doing something far more sophisticated than optimal control. The simulators are highly complex, the control policies are neural networks, and the cost functions are a clever pastiche of past best practices. However, robotic sim2real is a short conceptual hop from Kalman’s papers on basic linearization of chemical plants in the 50s.1 And just as today’s roboticist wishes Nvidia GPUs were cheaper, Kalman and Koepcke lamented that they would be better served by more compute.
The question then becomes: how good does your simulation need to be for control? In their description of sim2real, Zakka et al. discuss the demand for the highest-fidelity simulations possible. But what does it even mean for a simulation to be high fidelity? How can you validate an assertion of high fidelity? Components with dramatically different behaviors look the same once they are interconnected in feedback loops, so how can we identify what modeling is necessary? Once components are connected in feedback, identifying their actual parameters becomes impossible. What is the right way to deal with uncertainty in simulators? Is “domain randomization,” the hot trend of the last decade where we simulate a lot of different plausible environments, the right way to make progress? Which transparent boxes can be replaced with black boxes? These are some of the questions I’ll dig into during today’s lecture. In the next post, I’ll report on partial answers.
If you are a roboticist, you should read that paper to see why Kalman should be credited with inventing Iterative LQR.