# Shannon entropy
- Note Created: 2022-02-12 @ 12:43
---
Shannon entropy is the average level of "information", "surprise", or "uncertainty" inherent to a variable's possible outcomes.
Shannon entropy, also referred to as [[Information entropy]], was developed by [[Claude Shannon]] in 1948 with the seminal publication of his [A Mathematical Theory of Communication](https://en.wikipedia.org/wiki/A_Mathematical_Theory_of_Communication) — largely considered to have given birth to the digital information age.
### Equation
Given a discrete random variable $X$, with possible outcomes $x_1, \ldots, x_n$, which occur with probability $P(x_1), \ldots, P(x_n)$, the entropy of $X$ is formally defined as:
$$
H(X) = -\sum_{i=1}^{n} P(x_i) \log P(x_i)
$$
where $\sum$ denotes the sum over all of the variable's possible values.
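As a quick sanity check, here is a minimal sketch (plain Python, standard library only; the function name `shannon_entropy` is my own) that computes this sum directly for a discrete distribution:

```python
import math

def shannon_entropy(probs, base=2):
    """Shannon entropy H(X) = -sum_i P(x_i) * log(P(x_i)).

    `probs` is a list of probabilities of the possible outcomes.
    Zero-probability outcomes contribute nothing to the sum,
    since p * log(p) -> 0 as p -> 0.
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin is maximally uncertain for two outcomes: 1 bit of entropy.
print(shannon_entropy([0.5, 0.5]))   # 1.0
# A biased coin is less "surprising" on average.
print(shannon_entropy([0.9, 0.1]))   # ~0.469
```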
### Log base and other interpretations
The choice of base for $\log$, the [logarithm](https://en.wikipedia.org/wiki/Logarithm "Logarithm"), varies for different applications. Base 2 gives the unit of [bits](https://en.wikipedia.org/wiki/Bit "Bit") (or "[shannons](https://en.wikipedia.org/wiki/Shannon_(unit) "Shannon (unit)")"), while base [_e_](https://en.wikipedia.org/wiki/Euler%27s_number "Euler's number") gives "natural units" [nat](https://en.wikipedia.org/wiki/Nat_(unit) "Nat (unit)"), and base 10 gives units of "dits", "bans", or "[hartleys](https://en.wikipedia.org/wiki/Hartley_(unit) "Hartley (unit)")". An equivalent definition of entropy is the [expected value](https://en.wikipedia.org/wiki/Expected_value "Expected value") of the [self-information](https://en.wikipedia.org/wiki/Self-information "Self-information") of a variable.
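Since $\log_b(x) = \ln(x) / \ln(b)$, changing the base only rescales the entropy by a constant factor. A small sketch using the (hypothetical) `shannon_entropy` function from the block above illustrates the same distribution measured in the three common units:

```python
# Same distribution, different units.
p = [0.5, 0.25, 0.25]
print(shannon_entropy(p, base=2))        # 1.5    bits (shannons)
print(shannon_entropy(p, base=math.e))   # ~1.040 nats
print(shannon_entropy(p, base=10))       # ~0.451 hartleys (bans/dits)
```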
---
#### Related
#metrics #information_theory