Likelihood is just the joint probability of the data $x_1, \dots, x_n$ given model parameters $\theta$, but viewed as a function of $\theta$, i.e.
$$L(\theta) = p(x_1, \dots, x_n \mid \theta)$$
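One way to do Maximum Likelihood Estimation is to minimize the negative log likelihood. The two are equivalent because $\log$ is strictly increasing, so it does not move the location of the optimum.
That is, with $L(\theta)$ the likelihood of the parameters $\theta$,
$$\hat{\theta} = \arg\max_{\theta} L(\theta) = \arg\min_{\theta} \left[ -\log L(\theta) \right]$$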
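As a rough illustration of estimating a parameter by minimizing the negative log likelihood, the sketch below fits a Bernoulli parameter with a simple grid search. The data, the helper names `bernoulli_nll` and `grid_mle`, and the grid resolution are all assumptions made for this example; a real fit would normally use a closed form or a numerical optimizer.

```python
import math

def bernoulli_nll(theta, data):
    """Negative log likelihood of 0/1 observations under Bernoulli(theta)."""
    return -sum(math.log(theta if x == 1 else 1.0 - theta) for x in data)

def grid_mle(data, steps=999):
    """Return the grid point in (0, 1) with the smallest negative log likelihood."""
    grid = [(k + 1) / (steps + 1) for k in range(steps)]
    return min(grid, key=lambda t: bernoulli_nll(t, data))

# Hypothetical coin flips: 7 heads (1) and 3 tails (0).
data = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
theta_hat = grid_mle(data)
print(theta_hat)
```

For this model the minimizer coincides with the sample mean of the data, which gives a quick sanity check on the search.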
Log Likelihood
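It makes the math much easier, because for independent samples the product of per-sample probabilities becomes a sum:
$$\log L(\theta) = \log \prod_{i=1}^{n} p(x_i \mid \theta) = \sum_{i=1}^{n} \log p(x_i \mid \theta)$$
We do this because the product $\prod_{i} p(x_i \mid \theta)$ might be an extremely small number that underflows in floating point, so we add logs instead.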
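The underflow problem can be seen in a small sketch. The per-sample probabilities here are made-up values chosen only so that their product falls below the smallest representable double.

```python
import math

# Hypothetical per-sample probabilities: 200 samples, each with probability 0.01.
probs = [0.01] * 200

# Multiplying directly: the true value is 1e-400, which a 64-bit float cannot hold.
product = 1.0
for p in probs:
    product *= p

# Summing logs instead keeps the result in a comfortable numeric range.
log_sum = sum(math.log(p) for p in probs)

print(product)  # underflows to 0.0
print(log_sum)  # a finite negative number, about 200 * log(0.01)
```

Once the product has collapsed to zero, its logarithm is no longer usable, whereas the sum of logs carries the same information without losing it.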
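Log-likelihood (Average):
$$\frac{1}{n} \sum_{i=1}^{n} \log p(x_i \mid \theta)$$
Negative log-likelihood (Average):
$$-\frac{1}{n} \sum_{i=1}^{n} \log p(x_i \mid \theta)$$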
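A short sketch of why the averaged form is convenient: dividing by the number of samples does not change which parameters are best, but it makes the value comparable across datasets of different sizes. The Bernoulli model, the data, and the function names `total_nll` and `average_nll` are assumptions for illustration.

```python
import math

def total_nll(theta, data):
    """Summed negative log likelihood of 0/1 observations under Bernoulli(theta)."""
    return -sum(math.log(theta if x == 1 else 1.0 - theta) for x in data)

def average_nll(theta, data):
    """Negative log likelihood divided by the number of samples."""
    return total_nll(theta, data) / len(data)

data = [1, 0, 1, 1]   # hypothetical observations
doubled = data * 2    # the same observations repeated twice

# The total grows with the dataset size; the average stays the same.
print(total_nll(0.6, data), total_nll(0.6, doubled))
print(average_nll(0.6, data), average_nll(0.6, doubled))
```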