The maximum likelihood estimator (MLE) is an important [[estimator]] in [[statistics]]. The basic idea is to select the value in the parameter space that makes the observed data "most likely".
Given the data $X_1, X_2, \dots, X_n$, an [[independent and identically distributed|iid]] random sample from a distribution with unknown parameter $\theta$, we want to find the value of $\theta$ in the parameter space that maximizes our "probability" of observing the data.
For discrete random variables, we evaluate the [[joint probability mass function]] $P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n)$, in other words the probability that each random variable takes on its observed value. Express this probability as a function of $\theta$ (we call this function a "likelihood") and find the value of $\theta$ at which it attains its [[function maximum]].
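For example, suppose $X_1, \dots, X_n$ are iid Bernoulli($\theta$). The joint pmf is

$P(X_1 = x_1, \dots, X_n = x_n) = \prod_{i=1}^n \theta^{x_i}(1-\theta)^{1-x_i} = \theta^{\sum_i x_i}(1-\theta)^{n - \sum_i x_i},$

and the value of $\theta$ that maximizes this is the sample proportion $\hat{\theta} = \bar{x}$.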
For continuous random variables, any exact set of values has probability zero, so the role of $P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n)$ is played by the [[joint probability density function]].
$f(\vec{x}; \theta) = \prod_{i=1}^n f(x_i; \theta)$
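For instance, if the $X_i$ are iid Exponential($\theta$) with density $f(x; \theta) = \theta e^{-\theta x}$ for $x > 0$, then

$f(\vec{x}; \theta) = \prod_{i=1}^n \theta e^{-\theta x_i} = \theta^n e^{-\theta \sum_{i=1}^n x_i}.$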
The joint pmf/pdf, viewed as a function of $\theta$, is called the [[likelihood]] function $L(\theta)$, and any function proportional to it counts as a likelihood as well, because the maximum of the function will always be located at the same value of $\theta$ regardless of how "tall" it is. Thus, you can drop any constants of proportionality when calculating the likelihood, as they will not affect the outcome; this includes any factors that depend only on $x$, since the $x_i$ are treated as fixed observed values.
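To see numerically that dropping a $\theta$-free constant does not move the maximizer, here is a minimal sketch (assuming NumPy and SciPy are available; the data values are made up for illustration) that maximizes the likelihood of a Normal($\mu$, $\sigma^2$) sample with $\sigma$ known, once using the full log-likelihood and once with the $\mu$-free constant dropped:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical iid sample, assumed Normal(mu, sigma^2) with sigma known
x = np.array([2.1, 1.7, 3.4, 2.8, 2.0])
sigma = 1.0
n = len(x)

def neg_log_lik_full(mu):
    # Negative of the full log-likelihood, including the mu-free
    # constant -(n/2) * log(2*pi*sigma^2)
    return -(-(n / 2) * np.log(2 * np.pi * sigma**2)
             - np.sum((x - mu)**2) / (2 * sigma**2))

def neg_log_lik_kernel(mu):
    # Same likelihood with the constant of proportionality dropped
    return np.sum((x - mu)**2) / (2 * sigma**2)

full = minimize_scalar(neg_log_lik_full, bounds=(-10, 10), method="bounded")
kernel = minimize_scalar(neg_log_lik_kernel, bounds=(-10, 10), method="bounded")
print(full.x, kernel.x, x.mean())  # all three agree: the MLE of mu is the sample mean
```

Both versions return the same maximizer, the sample mean $\bar{x}$, which is the MLE of $\mu$ in this model.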