ipSAE
ipSAE (interaction prediction Score from Aligned Errors): an interface-focused confidence score for complexes computed from predicted aligned errors (PAE); higher is better.
pLDDT
pLDDT (predicted Local Distance Difference Test): per-residue confidence score (0–100); higher means the local geometry is more reliable.
Which metrics can be used to assess predicted protein structures?
Per-residue/local: pLDDT, lDDT.
Global/fold: pTM, TM-score, GDT_TS, RMSD.
Complex/interface: ipTM, DockQ, (ipSAE), interface RMSD.
Model quality/physics: clashes, Ramachandran/MolProbity scores.
Relative placement: PAE heatmap.
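As a minimal illustration of one of these metrics: RMSD between two coordinate sets that have already been superposed can be computed as below (a real pipeline would first apply an optimal alignment, e.g. Kabsch; the arrays here are made-up toy coordinates).

```python
import numpy as np

def rmsd(X, Y):
    # root-mean-square deviation over matched, pre-superposed (N, 3) coordinates
    return np.sqrt(((X - Y) ** 2).sum(axis=1).mean())

X = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
Y = X + np.array([1.0, 0.0, 0.0])   # same structure rigidly shifted 1 Å along x
```

Here `rmsd(X, X)` is 0 and `rmsd(X, Y)` is 1.0, which is why superposition must precede RMSD: a pure translation would otherwise inflate the score.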
How does AlphaFold model protein complexes?
AlphaFold-Multimer predicts all chains jointly: it concatenates chain sequences with chain breaks, builds/uses paired MSAs for interacting partners, lets attention/triangle updates operate across chains, and ranks models with multimer confidence (e.g., ipTM + pTM).
How does RoseTTAFold differ from AlphaFold?
RoseTTAFold uses a 3-track network (1D sequence, 2D pair/distance map, and 3D coordinates) with information exchange between tracks (incl. SE(3)-equivariant 3D updates). AlphaFold2's trunk is the Evoformer (MSA + pair) followed by a separate Structure Module; AlphaFold2 historically achieved higher CASP14 accuracy, while RoseTTAFold's design makes the 3D track explicit throughout the network.
What modules does AlphaFold have?
AlphaFold2: Evoformer + Structure Module (plus input embedders, recycling, and confidence heads). AlphaFold3: Pairformer trunk + diffusion module.
What does the Pairformer do in AlphaFold?
In AlphaFold3, the Pairformer is the main trunk that updates sequence/pair representations (and conditioning features) with attention + triangle-style operations, producing interaction-aware features that guide the downstream diffusion/structure generation.
What does the Evoformer do in AlphaFold?
In AlphaFold2, the Evoformer iteratively updates two representations—MSA (evolutionary info) and pair (residue–residue relations)—using attention and triangle updates, producing features used by the Structure Module to place atoms.
How does OpenFold differ from AlphaFold?
OpenFold is a trainable, fully open-source PyTorch reimplementation/retraining of AlphaFold2. It reproduces the AF2 architecture closely but provides training code, configurable pipelines, and engineering changes/optimizations for research and different hardware.
How does AlphaFold3 differ from AlphaFold2?
AlphaFold3 uses a diffusion-based all-atom generative approach and can jointly model complexes beyond proteins (DNA/RNA, small molecules/ligands, ions, modified residues). AF2 mainly targets proteins (and protein–protein complexes via Multimer) with an Evoformer + Structure Module pipeline.
How does AlphaFold2 differ from AlphaFold1?
AlphaFold2 is end-to-end: it uses attention-based Evoformer + a learned structure module (with recycling) to directly output 3D coordinates and confidence. AlphaFold1 relied more on predicting distance/angle distributions and using separate structure-building/optimization steps with stronger template/fragment-style components.
How does Boltz differ from AlphaFold?
Boltz (Boltz-1/2) is an open-source family of diffusion-based biomolecular interaction models aimed at AlphaFold3-like complex prediction (proteins with ligands/nucleic acids, etc.). AlphaFold3 is the DeepMind/Isomorphic model; Boltz emphasizes openness (weights/code) and (in Boltz-2) affinity prediction.
How does Boltz-2 differ from Boltz-1?
Boltz-2 extends Boltz-1 with explicit binding-affinity prediction (e.g., protein–ligand), broader multimodal performance improvements, and updated training/engineering; Boltz-1 focused primarily on high-accuracy complex structure prediction.
How much did it cost to train AlphaFold, Boltz, and OpenFold?
Exact dollar costs are generally not disclosed. Commonly reported compute: AlphaFold2 trained on ~128 TPUv3 cores for ~11 days; OpenFold reported training on 128 A100 GPUs in roughly 8 days. Boltz-1/2 training hardware/time is less consistently public; Boltz-2 training is reported to have used Recursion's BioHive-2 supercomputer (the effective cost depends on pricing/ownership).
What kinds of library construction methods are there?
Common NGS library types: (1) shotgun/fragmentation + adapter ligation, (2) amplicon-PCR libraries, (3) tagmentation-based (e.g., Nextera), (4) hybrid-capture/enrichment libraries, (5) long-read ligation/rapid kits. For synthetic DNA/protein libraries: Gibson/Golden Gate/restriction-ligation assembly, and display libraries (phage/yeast/mRNA) for variants.
What loss function was used in training AlphaFold?
Primary structural loss: FAPE (Frame Aligned Point Error). Plus auxiliary losses such as distogram cross-entropy, masked-MSA prediction, torsion/angle losses, structural violation/clash losses, and confidence-head losses (pLDDT/PAE).
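A heavily simplified NumPy sketch of the clamped-FAPE idea (the real AF2 loss uses backbone and side-chain rigid frames with extra weighting and scaling; the function and variable names here are mine, not AlphaFold's):

```python
import numpy as np

def fape(R_pred, t_pred, x_pred, R_true, t_true, x_true, d_clamp=10.0, eps=1e-8):
    # Express every point j in the local coordinates of every frame i:
    #   local[i, j] = R_i^T @ (x_j - t_i)
    def to_local(R, t, x):
        diff = x[None, :, :] - t[:, None, :]       # (N_frames, N_points, 3)
        return np.einsum('nba,nmb->nma', R, diff)  # applies R_i^T per frame
    d = np.sqrt(((to_local(R_pred, t_pred, x_pred)
                  - to_local(R_true, t_true, x_true)) ** 2).sum(-1) + eps)
    # clamp per-point errors at 10 Å and scale by 1/10 Å
    return np.minimum(d, d_clamp).mean() / d_clamp

# identical prediction and target -> FAPE ≈ 0
R = np.repeat(np.eye(3)[None], 2, axis=0)
t = np.zeros((2, 3))
x = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [0.0, 0.0, 1.0]])
loss = fape(R, t, x, R, t, x)
```

The key property this captures: errors are measured in every local frame, so FAPE is invariant to a global rotation/translation of the prediction without needing an explicit superposition step.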
What are some key PyMOL commands?
load/fetch, remove, select, show/hide (cartoon, sticks, spheres, surface), color, spectrum, zoom/orient, center, align/super, rms/rms_cur, distance (dist), label, save, png.
What is a SELECT statement in SQL?
A query that retrieves data from one or more tables/views: SELECT <columns/expressions> FROM <table> with optional WHERE, GROUP BY, HAVING, ORDER BY, LIMIT.
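A minimal runnable example using Python's built-in sqlite3 (the table and column names are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE models (name TEXT, plddt REAL)")
con.executemany("INSERT INTO models VALUES (?, ?)",
                [("A", 92.5), ("B", 61.0), ("C", 88.0)])

# SELECT <columns> FROM <table> WHERE <filter> ORDER BY <sort>
rows = con.execute(
    "SELECT name, plddt FROM models WHERE plddt > 80 ORDER BY plddt DESC"
).fetchall()
```

`rows` contains only the models passing the WHERE filter, sorted by the ORDER BY clause.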
Which joins are there in SQL?
INNER JOIN, LEFT (OUTER) JOIN, RIGHT (OUTER) JOIN, FULL (OUTER) JOIN, CROSS JOIN; plus SELF JOIN (joining a table to itself).
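A small sqlite3 sketch contrasting INNER and LEFT JOIN (schema invented for illustration; note SQLite only added RIGHT/FULL OUTER JOIN in version 3.39, so this demo sticks to the two universally supported forms):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE chains  (id INTEGER, seq TEXT);
CREATE TABLE ligands (chain_id INTEGER, name TEXT);
INSERT INTO chains  VALUES (1, 'MKT'), (2, 'GSS');
INSERT INTO ligands VALUES (1, 'ATP');
""")

# INNER JOIN keeps only rows with a match in both tables
inner = con.execute(
    "SELECT c.id, l.name FROM chains c "
    "INNER JOIN ligands l ON c.id = l.chain_id ORDER BY c.id"
).fetchall()

# LEFT JOIN keeps every left-table row, filling missing matches with NULL
left = con.execute(
    "SELECT c.id, l.name FROM chains c "
    "LEFT JOIN ligands l ON c.id = l.chain_id ORDER BY c.id"
).fetchall()
```

Chain 2 has no ligand, so it appears only in the LEFT JOIN result (with `None` for the ligand name).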
How are the attentions calculated?
Scaled dot-product attention: scores = QKᵀ/√dₖ (+ mask), weights = softmax(scores), output = weights·V. Multi-head attention repeats this in parallel heads then concatenates and projects.
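The formula above, as a minimal single-head NumPy sketch (no batching or learned projections):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, mask=None):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # scores = Q Kᵀ / √dₖ
    if mask is not None:
        scores = np.where(mask, scores, -1e9) # masked positions -> ~0 weight
    # numerically stable softmax over the key axis
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V                              # output = weights · V

# one query attending to three identical keys -> uniform weights -> mean of V
Q = np.array([[1.0, 0.0]])
K = np.array([[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]])
V = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
out = scaled_dot_product_attention(Q, K, V)
```

With identical keys the softmax is uniform, so the output is the mean of the value rows, [1, 1]; multi-head attention would run several such heads in parallel and concatenate.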
What are the dimensions of matrix multiplication?
If A is (m×n) and B is (n×p), then C = A·B is (m×p). The inner dimensions (n) must match.
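The shape rule, checked with NumPy:

```python
import numpy as np

A = np.ones((2, 3))   # (m, n) = (2, 3)
B = np.ones((3, 4))   # (n, p) = (3, 4); inner dims match (3 == 3)
C = A @ B             # result shape (m, p) = (2, 4); each entry sums 3 ones

try:
    np.ones((2, 3)) @ np.ones((4, 5))   # inner dims 3 != 4
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```

NumPy enforces the rule directly: a mismatched inner dimension raises `ValueError`.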
What is the complexity of matrix multiplication?
Naïve multiplication is O(m·n·p). For square n×n it’s O(n³); faster algorithms exist (e.g., Strassen ~O(n^{2.81})) but are less common in practice.
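The O(m·n·p) count is easy to see in the naïve triple loop; this sketch counts the scalar multiplications explicitly:

```python
def matmul_naive(A, B):
    # A: m x n, B: n x p, as lists of lists; returns (product, multiplication count)
    m, n, p = len(A), len(A[0]), len(B[0])
    assert len(B) == n, "inner dimensions must match"
    C = [[0.0] * p for _ in range(m)]
    mults = 0
    for i in range(m):          # m iterations
        for k in range(p):      # p iterations
            for j in range(n):  # n iterations -> m*n*p multiplications total
                C[i][k] += A[i][j] * B[j][k]
                mults += 1
    return C, mults

A = [[1, 2, 3], [4, 5, 6]]       # 2 x 3
B = [[1, 0], [0, 1], [1, 1]]     # 3 x 2
C, mults = matmul_naive(A, B)    # mults = 2 * 3 * 2 = 12
```

Strassen-style algorithms beat this bound by trading multiplications for additions on recursive blocks, which is why they rarely win at practical sizes.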
What is small o() notation?
f(n) = o(g(n)) means f grows strictly slower than g: limₙ→∞ f(n)/g(n) = 0.
What is large O() notation?
f(n) = O(g(n)) means f is asymptotically bounded above by g up to a constant: ∃c,n₀ s.t. f(n) ≤ c·g(n) for n≥n₀.
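A worked example tying the two definitions together:

```
f(n) = 3n² + 5n is O(n²): for n ≥ 5 we have 5n ≤ n², so f(n) ≤ 4n² (take c = 4, n₀ = 5).
But f(n) is not o(n²), since f(n)/n² → 3 ≠ 0; by contrast, 5n = o(n²) because 5n/n² = 5/n → 0.
```

So o(·) is strictly stronger than O(·): every f = o(g) is also O(g), but not conversely.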