The author denotes the states in state space as s and the inputs as a (actions).

When the system becomes very complicated, its model can become either unknown or unreliable.

chaos in dynamics = we can get very different trajectories depending on the initial conditions (think of the double pendulum). So even with a deterministic state-space model, long-horizon prediction of the states is limited. In practice, even a good model only takes you so far.
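A minimal sketch of this sensitivity: the standard double-pendulum equations integrated from two initial conditions that differ by 1e-8 rad. The parameters and the RK4 integrator are my own illustrative choices, not from the paper.

```python
import numpy as np

G, L1, L2, M1, M2 = 9.81, 1.0, 1.0, 1.0, 1.0   # gravity, rod lengths, masses

def derivs(y):
    th1, w1, th2, w2 = y
    d = th2 - th1
    den1 = (M1 + M2) * L1 - M2 * L1 * np.cos(d) ** 2
    den2 = (L2 / L1) * den1
    dw1 = (M2 * L1 * w1 ** 2 * np.sin(d) * np.cos(d)
           + M2 * G * np.sin(th2) * np.cos(d)
           + M2 * L2 * w2 ** 2 * np.sin(d)
           - (M1 + M2) * G * np.sin(th1)) / den1
    dw2 = (-M2 * L2 * w2 ** 2 * np.sin(d) * np.cos(d)
           + (M1 + M2) * (G * np.sin(th1) * np.cos(d)
                          - L1 * w1 ** 2 * np.sin(d)
                          - G * np.sin(th2))) / den2
    return np.array([w1, dw1, w2, dw2])

def simulate(y0, dt=1e-3, steps=20000):
    y = np.array(y0, dtype=float)
    for _ in range(steps):                      # classic RK4 integration
        k1 = derivs(y)
        k2 = derivs(y + 0.5 * dt * k1)
        k3 = derivs(y + 0.5 * dt * k2)
        k4 = derivs(y + dt * k3)
        y = y + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

a = simulate([2.0, 0.0, 2.0, 0.0])              # [theta1, omega1, theta2, omega2]
b = simulate([2.0 + 1e-8, 0.0, 2.0, 0.0])       # theta1 perturbed by 1e-8 rad
print(np.abs(a - b))                            # states have visibly diverged after ~20 s
```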

Inpainting is another way to condition the generative diffusion process: part of the sample (here, part of the trajectory) is clamped to known values at every denoising step, and the model only generates the rest.
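A minimal sketch of inpainting during sampling, assuming a trained noise predictor `eps_model(x, k)` and standard DDPM schedules; all names are illustrative, not from the paper:

```python
import numpy as np

def inpaint_sample(eps_model, known, mask, K, alphas, alpha_bars, betas):
    """known: array of observed values; mask: 1 where observed, 0 where free."""
    x = np.random.randn(*known.shape)             # start from pure noise
    for k in reversed(range(K)):
        eps = eps_model(x, k)                     # NN predicts the added noise
        mean = (x - betas[k] / np.sqrt(1 - alpha_bars[k]) * eps) / np.sqrt(alphas[k])
        if k > 0:
            x = mean + np.sqrt(betas[k]) * np.random.randn(*x.shape)
            # re-noise the observation to the current noise level, so only
            # the unobserved entries are actually generated
            noised = (np.sqrt(alpha_bars[k - 1]) * known
                      + np.sqrt(1 - alpha_bars[k - 1]) * np.random.randn(*known.shape))
        else:
            x = mean
            noised = known                        # final step: clean values
        x = mask * noised + (1 - mask) * x        # clamp the known entries
    return x
```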

DDPMs for dynamical systems: they corrupt a trajectory that is discrete in time and built by stacking the states; k denotes the diffusion step, i.e., the amount of corruption (noise) applied.

So basically we corrupt this trajectory with noise.
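A minimal sketch of this forward corruption in closed form, using the usual DDPM identity tau_k = sqrt(abar_k) * tau_0 + sqrt(1 - abar_k) * eps; the linear beta schedule and the dimensions are illustrative choices:

```python
import numpy as np

T, state_dim, K = 50, 4, 100
tau0 = np.random.randn(T, state_dim)          # stand-in for a clean trajectory

betas = np.linspace(1e-4, 0.02, K)            # noise schedule
alpha_bars = np.cumprod(1.0 - betas)

def corrupt(tau0, k):
    """Sample tau_k ~ q(tau_k | tau_0) = N(sqrt(abar_k) tau_0, (1 - abar_k) I)."""
    eps = np.random.randn(*tau0.shape)
    return np.sqrt(alpha_bars[k]) * tau0 + np.sqrt(1.0 - alpha_bars[k]) * eps

tau_noisy = corrupt(tau0, k=60)               # larger k => more corruption
```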

We learn the closed-loop system directly if we store the data in an RL (Reinforcement Learning) way, i.e., each time step holds the state together with the action the policy took (see the sketch below).
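A minimal sketch of this RL-style storage: states and actions are stacked per time step, so the DDPM models the joint distribution over states AND actions, which is what makes the learned model closed-loop. Dimensions are illustrative.

```python
import numpy as np

T, state_dim, action_dim = 50, 4, 2
states  = np.random.randn(T, state_dim)    # s_0 ... s_{T-1}
actions = np.random.randn(T, action_dim)   # a_0 ... a_{T-1}

# one training sample for the DDPM: shape (T, state_dim + action_dim)
tau = np.concatenate([states, actions], axis=-1)
```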

How denoising is done: exactly as in the standard diffusion denoising process, i.e., a neural network is trained to predict the added noise.
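A minimal sketch of that standard training step applied to trajectories (epsilon-prediction with an MSE loss); `model` is an assumed torch module taking the noisy trajectory and the diffusion step:

```python
import torch

def ddpm_loss(model, tau0, alpha_bars):
    # tau0: clean trajectories, shape (batch, T, dim); alpha_bars: 1-D tensor
    k = torch.randint(0, len(alpha_bars), (tau0.shape[0],))
    abar = alpha_bars[k].view(-1, 1, 1)               # broadcast over (T, dim)
    eps = torch.randn_like(tau0)
    tau_k = abar.sqrt() * tau0 + (1 - abar).sqrt() * eps
    return torch.mean((model(tau_k, k) - eps) ** 2)   # simple MSE on the noise
```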

In the paper you will find two axes: the trajectory's time axis (t) and the diffusion axis (k).

Use guided sampling and inpainting to set up optimal control schemes (see the sketch below).
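A minimal sketch of one way to do this, in the spirit of receding-horizon planning: clamp the current state in the trajectory via inpainting, sample the rest, and execute the first planned action. `inpaint_sample` is the routine sketched earlier; `env` is a hypothetical gym-like environment, and everything here is an assumption for illustration.

```python
import numpy as np

def plan_and_act(env, eps_model, s, horizon, dims, schedules):
    state_dim, action_dim = dims
    known = np.zeros((horizon, state_dim + action_dim))
    mask = np.zeros_like(known)
    known[0, :state_dim] = s       # condition on the current state...
    mask[0, :state_dim] = 1.0      # ...by clamping it during denoising
    tau = inpaint_sample(eps_model, known, mask, *schedules)
    a = tau[0, state_dim:]         # first planned action
    return env.step(a)             # execute it, then replan next step
```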

Difference between classifier-free guidance and classifier guidance?

classifier-free: the DDPM gets an extra vector as additional input, which acts as the condition on the system you want to learn.
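A minimal sketch of classifier-free guidance at sampling time: the same network is queried with and without the condition vector c, and the two noise predictions are blended. It assumes `eps_model(x, k, c)` accepts c=None for the unconditional case (a common training-time condition-dropout trick).

```python
def guided_eps(eps_model, x, k, c, w=2.0):
    eps_cond = eps_model(x, k, c)        # conditioned on c (goal, reward, ...)
    eps_uncond = eps_model(x, k, None)   # condition dropped
    # w = 0 recovers the plain conditional model; larger w pushes samples
    # harder toward the condition
    return (1 + w) * eps_cond - w * eps_uncond
```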

classifier guidance: train another NN to represent a function V that maps a closed-loop trajectory to a value. This is the value function in the RL sense: a very high value means very good behaviour. The gradient of V is then used to steer the denoising steps toward high-value trajectories.
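A minimal sketch of that value-guided denoising step, assuming a differentiable `value_model(tau, k)` trained on noisy trajectories (torch); the gradient of V shifts the denoising mean uphill in value:

```python
import torch

def value_guided_step(eps_model, value_model, x, k,
                      alphas, alpha_bars, betas, scale=1.0):
    eps = eps_model(x, k)
    mean = (x - betas[k] / torch.sqrt(1 - alpha_bars[k]) * eps) / torch.sqrt(alphas[k])
    x_in = x.detach().requires_grad_(True)
    grad = torch.autograd.grad(value_model(x_in, k).sum(), x_in)[0]
    mean = mean + scale * betas[k] * grad        # nudge toward high value
    noise = torch.randn_like(x) if k > 0 else torch.zeros_like(x)
    return mean + torch.sqrt(betas[k]) * noise
```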