Term Frequency - Inverse Document Frequency (TF-IDF) is a method for weighting word frequencies in a way that diminishes the role of words that occur frequently across documents. The inverse document frequency component was introduced by Karen Spärck Jones in 1972. TF-IDF consists of two elements: term frequency and inverse document frequency.

Term frequency is the count of a term in a document, squashed into $\log$ space:

$$\text{tf}_{t,d} = \begin{cases} 1 + \log \text{count}(t,d) & \text{if } \text{count}(t,d) > 0 \\ 0 & \text{otherwise} \end{cases}$$

Inverse document frequency is a global weight assigned to each term, also in $\log$ space:

$$\text{idf}_t = \log \Big( \frac{N}{\text{df}_t} \Big)$$

where $N$ is the total number of documents in the collection and $\text{df}_t$ is the number of documents that contain the term $t$. Rare terms get a large idf; a term that appears in every document gets $\log(N/N) = 0$ and is effectively ignored.

Combining these gives the TF-IDF weight for term $t$ in document $d$:

$$w_{t,d} = \text{tf}_{t,d} \times \text{idf}_t$$
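The definitions above can be sketched directly in Python. This is a minimal from-scratch illustration, not a production implementation: the toy corpus is invented for the example, and base-10 logarithms are assumed (the formulas leave the base unspecified, and any base only rescales the weights).

```python
import math

def tf(term, doc_tokens):
    """Log-scaled term frequency: 1 + log(count) if the term occurs, else 0."""
    count = doc_tokens.count(term)
    return 1 + math.log10(count) if count > 0 else 0.0

def idf(term, corpus):
    """Inverse document frequency: log(N / df), where df is the number
    of documents containing the term."""
    df = sum(1 for doc in corpus if term in doc)
    return math.log10(len(corpus) / df) if df > 0 else 0.0

def tfidf(term, doc_tokens, corpus):
    """TF-IDF weight w_{t,d} = tf_{t,d} * idf_t."""
    return tf(term, doc_tokens) * idf(term, corpus)

# Hypothetical three-document corpus, pre-tokenized by whitespace.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat".split(),
    "cats and dogs".split(),
]

# "the" occurs twice in doc 0 but appears in 2 of 3 documents,
# so its low idf outweighs its higher tf; "mat" occurs once but
# only in doc 0, so it ends up with the larger weight.
print(tfidf("the", corpus[0], corpus))  # ~0.229
print(tfidf("mat", corpus[0], corpus))  # ~0.477
```

Note how the weighting behaves as intended: the frequent-everywhere word "the" is down-weighted relative to the rarer, more discriminative "mat", even though "the" occurs more often in the document.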