This repository contains notes on using ML tools for protein structure, engineering and design, property prediction, and related topics. It is not intended as introductory material. The notes are not comprehensive and may contain errors. If you find an error, or would like to contribute something you feel is missing, please contact me on [GitHub](https://www.github.com/delalamo) or [LinkedIn](https://www.linkedin.com/in/ddelalamo/).

The contents of this site are licensed under the [GNU Free Documentation License](https://www.gnu.org/licenses/fdl-1.3.html). Please reach out if you are interested in copying or repurposing pages from this repository.

[[Reading list|Link to reading list]]
### Random notes of interest
* [[Protein property prediction using PLMs does not benefit from scale except when predicting structural features]]
* [[Conformational entropy in antibodies decreases during affinity maturation]]
* [[Protein structure prediction methods are unable to predict the energetics of a conformational landscape unless explicitly trained for that purpose]]
* [[Protein backbones designed using diffusion, but not sequence-based models, have fewer beta sheets]]
* [[Structure-based methods outperform sequence-based methods on protein stability prediction of point mutants, but not full sequences]]
### Recently added or modified
* [[Epistasis is rare during evolution]]
* [[Mutations obtained by antibodies during affinity maturation show no evidence of epistasis]]
* [[AF3-generation methods incorporate MSA information exclusively into the pair representation]]
* [[Protein language models can be steered to design proteins with specific properties]]
* [[Low-pLDDT hallucinated models from AF3-generation protein structure predictors are designable]]
* [[ProteinMPNN-derived inverse folding methods underdesign aromatic residues]]
* [[Paratope losses are required to enforce CDR-mediated antigen binding during de novo antibody design by hallucination]]
* [[Secondary structure losses are required to enforce CDR loopiness during de novo antibody design by hallucination]]
* [[Protein language models can predict zero-shot which proteins belong to the same species]]
### Antibody notes
* Structure
  * [[Complementarity-determining regions|CDRs]]
  * [[Framework region]]
* Formats
  * [[Antibodies]]
  * [[Fab|Fabs]]
  * [[Single chain variable fragments|Single chain Fvs]]
  * [[Nanobodies]] (AKA VHHs)
* Developability
  * [[Developability]]
  * [[Antibody glycosylation|Glycosylation]]
  * [[Antibody humanization|Humanization]]
* Property prediction
  * [[Antibody structure prediction]]
  * [[Antibody language models]]
  * [[Antibody-antigen binding affinity prediction]]
* Miscellaneous
  * [[Affinity maturation]]
  * [[Somatic hypermutation]]
### Structural modeling
* [[MD simulations]]
* [[Structure prediction|Protein structure prediction]]
### Protein engineering notes
* [[Ancestral sequence reconstruction]]
* [[Directed evolution]] (related: [[Epistasis]])
* Property prediction
  * [[Fitness prediction]]
  * [[Variant effect prediction]]
  * [[Function prediction|Protein function prediction]]
  * [[Stability and thermostability|Stability and thermostability prediction]]
* Design
  * [[Inverse folding]]
  * [[Inversion of protein folding neural networks]]
  * [[Protein backbone design]]
* Miscellaneous
  * [[Heterodimerization domains]]
  * [[Engineered trimerization domains]]
### ML notes
* [[Transformer]]
* [[Low-rank Adaptation]]
* [[Protein language models]]
* [[Contrastive learning]]
### Miscellaneous
* [[Evolution and natural selection]]
* [[Protein folding]]
* [[Protein dynamics]]
* [[Protein-protein interactions]]