The author denotes the states in state space as s and the inputs as a (actions).
When the system becomes very complicated, its model can become either unknown or unreliable.
Chaos in dynamics = we can get very different trajectories depending on the initial conditions (think of the double pendulum). Even if you have a deterministic state-space model, you are limited in how far ahead you can predict its states. In practice, even a good model only predicts reliably over a short horizon.
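A minimal sketch of this sensitivity to initial conditions. To keep it short I use the logistic map instead of a double pendulum (my own choice of example, not from the lecture); the point is the same: a deterministic map, two nearly identical starting states, and the gap between them blows up.

```python
# Sensitivity to initial conditions: two almost-identical starting states
# diverge under the same deterministic map (logistic map with r = 4 is chaotic).
def logistic(x, r=4.0):
    return r * x * (1.0 - x)

x, y = 0.2, 0.2 + 1e-9   # nearly identical initial conditions
max_gap = 0.0
for step in range(60):
    x, y = logistic(x), logistic(y)
    max_gap = max(max_gap, abs(x - y))

print(max_gap)  # the 1e-9 perturbation has grown to order 1
```

So even with the exact model in hand, long-horizon prediction of the state fails.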
Inpainting is another way to condition the generative diffusion step (research this further; I did not fully get it).
DDPMs for Dynamical Systems: they corrupt the discrete trajectory in time, where we stack the states. k stands for the amount of corruption (noise) we apply.
So basically we corrupt this trajectory with noise.
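A sketch of what that corruption looks like, using the standard DDPM closed-form forward process (the notation `alpha_bar` and the schedule values are the usual ones, not taken from the lecture; trajectory shapes are my assumption):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stacked trajectory: T time steps of an n-dimensional state, one sample.
T, n = 20, 2
tau0 = rng.standard_normal((T, n))         # clean trajectory tau_0

# Closed-form forward process: tau_k = sqrt(alpha_bar_k) tau_0 + sqrt(1 - alpha_bar_k) eps
K = 1000                                   # number of diffusion steps
betas = np.linspace(1e-4, 0.02, K)         # common linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)

def corrupt(tau, k):
    eps = rng.standard_normal(tau.shape)
    return np.sqrt(alpha_bar[k]) * tau + np.sqrt(1.0 - alpha_bar[k]) * eps

tau_mild  = corrupt(tau0, 10)              # small k: trajectory still visible
tau_heavy = corrupt(tau0, K - 1)           # k near K: almost pure noise
```

Note k indexes the diffusion axis, not time: the whole stacked trajectory is noised at once.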
We learn the closed-loop system directly if we store the data in an RL (Reinforcement Learning) fashion, i.e., states and actions recorded together along the trajectory.
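How I picture that RL-style storage (a sketch; the shapes and variable names are my assumptions): each time step contributes a (s_t, a_t) pair, and the stacked matrix is one training sample for the DDPM, so states and policy are learned jointly, i.e., the closed loop.

```python
import numpy as np

# One rollout of the closed-loop system: at each time t we record (s_t, a_t).
T, n_s, n_a = 50, 4, 2
states  = np.zeros((T, n_s))    # s_0 ... s_{T-1}
actions = np.zeros((T, n_a))    # a_0 ... a_{T-1}

# Stack states and actions side by side: one row per time step.
# This whole matrix is a single training sample for the diffusion model.
trajectory = np.concatenate([states, actions], axis=1)   # shape (T, n_s + n_a)
```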
How denoising is done: exactly as in the standard diffusion denoising process, i.e., a neural network predicts the noise to remove.
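A sketch of one reverse (denoising) step in the standard DDPM parameterisation. The network is stubbed out with zeros here, since the point is only the update rule; schedule values match the forward-process sketch and are my assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
K = 1000
betas = np.linspace(1e-4, 0.02, K)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def eps_model(x, k):
    # Placeholder for the trained NN that predicts the noise in x at level k.
    return np.zeros_like(x)

def denoise_step(x_k, k):
    # One reverse DDPM step x_k -> x_{k-1} using the predicted noise.
    eps = eps_model(x_k, k)
    mean = (x_k - betas[k] / np.sqrt(1.0 - alpha_bar[k]) * eps) / np.sqrt(alphas[k])
    if k == 0:
        return mean
    return mean + np.sqrt(betas[k]) * rng.standard_normal(x_k.shape)

# Sampling: start from pure noise and walk the diffusion axis k = K-1 ... 0.
x = rng.standard_normal((20, 2))
for k in reversed(range(K)):
    x = denoise_step(x, k)
```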
In the paper you will find one time-axis and one diffusion-axis.
Use guided sampling and inpainting to set up optimal control schemes.
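A sketch of how inpainting becomes a control scheme (my reading of it; the denoising step is a stand-in): at every denoising step you overwrite the trajectory entries you already know, e.g. clamp the current state and the goal state, and let the model fill in the middle, which is the plan.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 10, 2
s_start = np.ones(n)        # known current state (hypothetical values)
s_goal  = -np.ones(n)       # desired final state (hypothetical values)

x = rng.standard_normal((T, n))   # start from pure noise
for k in range(100):              # stand-in for the reverse diffusion loop
    x = 0.9 * x                   # placeholder for one real denoising step
    # Inpainting: after every step, overwrite the entries we know.
    x[0]  = s_start
    x[-1] = s_goal

# x now starts at s_start and ends at s_goal; the model fills in the rest.
```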
What is the difference between classifier-free and classifier guidance?
Classifier-free: you have an extra vector that you feed as additional input to the DDPM, which acts as an additional condition on the system you want to learn.
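At sampling time the standard classifier-free trick is to run the network twice per step, with and without the conditioning vector, and extrapolate toward the conditional prediction (this combination rule is the usual one from the literature, not spelled out in my notes):

```python
import numpy as np

def guided_eps(eps_cond, eps_uncond, w):
    # Classifier-free guidance: extrapolate from the unconditional noise
    # prediction toward the conditional one; w > 0 strengthens the condition.
    return (1.0 + w) * eps_cond - w * eps_uncond

eps_c = np.array([1.0, 0.0])     # prediction given the conditioning vector
eps_u = np.array([0.5, 0.5])     # prediction without it
mixed = guided_eps(eps_c, eps_u, w=2.0)
```

With w = 0 this reduces to plain conditional sampling.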
Classifier guidance: train another NN to approximate the value function V, which maps a closed-loop trajectory to a scalar value (the value function in the RL sense). If the value is very high, the behaviour is very good; the gradient of V is then used to steer the sampling toward high-value trajectories.
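A sketch of how that gradient enters the sampler: the mean of each denoising step is nudged uphill on V. The toy value function and the guidance scale here are placeholders of my own, not the lecture's.

```python
import numpy as np

def value(tau):
    # Placeholder for the learned value network V(trajectory) -> scalar;
    # this toy V just prefers trajectories that stay near the origin.
    return -np.sum(tau ** 2)

def value_grad(tau):
    return -2.0 * tau   # analytic gradient of the toy V above

def guided_mean(mean, tau, scale=0.1):
    # Classifier guidance: shift the denoising mean along grad V, so the
    # sampler drifts toward high-value (good-behaviour) trajectories.
    return mean + scale * value_grad(tau)

tau = np.ones((5, 2))
nudged = guided_mean(np.zeros((5, 2)), tau)   # mean pushed toward the origin
```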