- we define 'essentially' as $\epsilon$
- a smaller $\epsilon$ means stronger #privacy, but less accurate responses to queries
- [[differential privacy]] is not an algorithm but a definition, and may describe many algorithms
- 'anonymisation' is an incomplete answer to the problem of data privacy. The paper gives the example of an 'anonymised' Netflix dataset which, when cross-referenced with IMDb reviews, allowed individuals to be re-identified. These are called *linkage attacks*
- holy _shit_
- re-identification is a harm in and of itself, but in addition there are harms that may come as a consequence of being able to tie specific data back to an individual
- paper also notes *differencing attacks*. Consider that a person of interest, Ms B, is known to be in a specific database. We can ask a broad query, "How many individuals in this database hold a high clearance?", and a specific query, "How many people, not named 'Ms B', hold a high clearance?". The difference between the two answers tells us whether Ms B holds a high clearance
- we could also have just checked her LinkedIn, probably 🙄
- auditing queries is not much use either: if the auditor allows the first query above, then refusing the second query "because it gives away information" *itself gives away information*
- what a mindfuck
- summary statistics are...apparently also hackable
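
The differencing attack on Ms B can be sketched in a few lines of Python. This is a toy illustration, not anything from the paper: the `Row` type, the sample names and the `differencing_attack` helper are all made up for the example, and it assumes the database answers counting queries exactly and that names are unique.

```python
from typing import Callable, List, NamedTuple


class Row(NamedTuple):
    name: str
    high_clearance: bool


def count(database: List[Row], predicate: Callable[[Row], bool]) -> int:
    """Answer a counting query exactly, with no privacy protection."""
    return sum(1 for row in database if predicate(row))


def differencing_attack(database: List[Row], target: str) -> bool:
    """Infer the target's clearance from two innocuous-looking counts.

    Assumes the target is in the database and that names are unique.
    """
    broad = count(database, lambda r: r.high_clearance)
    specific = count(database, lambda r: r.high_clearance and r.name != target)
    # The counts differ by exactly 1 if the target holds a high clearance,
    # and by 0 otherwise.
    return broad - specific == 1


# Hypothetical toy database, invented for this example.
database = [Row("Ms A", False), Row("Ms B", True), Row("Mr C", True)]
ms_b_cleared = differencing_attack(database, "Ms B")
```

Neither query asks about Ms B's clearance directly, which is why screening out "obviously sensitive" queries one at a time does not stop the attack.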
## Chapter 2: Basic terms
- we trust the curator, who might in other places be called the [[data controller]]. However, we might remove the need to trust the curator by replacing them with a protocol.
- cf [[Multi-Party Computation]]
- there is a non-interactive, or offline, model of privacy whereby the curator produces, once and for all, a sanitised, synthetic database or even just a collection of summary statistics. The original dataset may then be deleted. The curator is made redundant
- in this job market? Brutal
- in an interactive, or online, model, analysts can query the data iteratively and adaptively
- a privacy mechanism is a function $F$ whose arguments are a database $D$, a universe $\chi$ of data types (i.e. the set of all possible database rows), random bits, and, optionally, a set of queries
- these approaches result in less accurate responses
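
To make the mechanism definition and the accuracy trade-off concrete, here is a minimal sketch of a randomised counting mechanism. The notes above don't name a specific mechanism. Adding Laplace noise with scale $1/\epsilon$ is a standard choice for counting queries, and is used here as an assumption. The function names, the toy database, and the choice to sample Laplace noise as the difference of two exponential draws are all mine, for illustration only. The universe $\chi$ is only implicit here, as the type of a row.

```python
import random
from typing import Callable, Sequence


def noisy_count(
    database: Sequence[bool],
    query: Callable[[bool], bool],
    epsilon: float,
    rng: random.Random,
) -> float:
    """Counting query plus Laplace noise of scale 1/epsilon.

    Arguments mirror the definition: a database, a query, and a source
    of random bits (rng).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for row in database if query(row))
    # The difference of two independent Exponential(rate=epsilon) draws
    # is Laplace-distributed with mean 0 and scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


def mean_abs_error(epsilon: float, trials: int = 2000, seed: int = 0) -> float:
    """Average absolute error of noisy_count on a toy database."""
    rng = random.Random(seed)
    database = [True] * 30 + [False] * 70  # hypothetical data, true count 30
    errors = [
        abs(noisy_count(database, lambda row: row, epsilon, rng) - 30)
        for _ in range(trials)
    ]
    return sum(errors) / trials
```

Comparing `mean_abs_error(0.1)` with `mean_abs_error(1.0)` should show the trade-off: the expected absolute error of Laplace noise equals its scale, $1/\epsilon$, so the smaller $\epsilon$ gives answers that are off by roughly 10 on average rather than roughly 1.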
## Chapter 13: Reflections