Source Report | Understanding Demis Hassabis's AGI Roadmap: Gemini, AlphaFold, and DeepMind's Bet

AlphaFold Series: Protein Structure Prediction Revolution

AlphaFold transformed protein structure prediction by applying deep learning to protein sequence and structure data, reaching near-experimental accuracy on a 50-year-old challenge. AlphaFold 2 used an Evoformer architecture to process multiple sequence alignments (MSAs) and produce atomic models with median backbone RMSD <1 Å, even when no homologous structure is known. AlphaFold 3 moved to a diffusion-based architecture and extended prediction to biomolecular complexes (proteins + DNA/RNA/ligands/ions), reportedly doubling accuracy for protein-ligand binding over tools like Vina by modeling all components jointly.[[1]](https://deepmind.google/science/alphafold)[[2]](https://www.nature.com/articles/s41586-021-03819-2)[[3]](https://www.nature.com/articles/s41586-024-07487-w)
- AlphaFold 1: Nature (2020; ~1k citations, inferred from the series), topped CASP13 in 2018.[[1]](https://deepmind.google/science/alphafold)
- AlphaFold 2: Nature (2021), 43k+ citations, open-sourced code and database (200M+ structures), 2024 Nobel Prize in Chemistry (Hassabis and Jumper, shared with David Baker), >3M users; enabled work on drug targets (e.g., malaria enzymes), plastic-eating enzymes, and crop resilience.[2][1]
- AlphaFold 3: Nature (2024, 13k+ citations), powers Isomorphic Labs' drug design, AlphaFold Server for non-commercial use.[3]
- Widely replicated: >35k papers cite or build on AlphaFold; independent labs have validated predicted structures; about 30% of citing papers concern disease.[4]
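
The "backbone RMSD" figure quoted above is a root-mean-square distance between corresponding atoms of a predicted and an experimental structure. The sketch below shows only the arithmetic; it assumes the two structures are already superimposed (real evaluations first align them), and the coordinates are made up for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the structures are already superimposed; no alignment is performed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical C-alpha coordinates in angstroms (illustrative values only).
predicted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
experimental = [(0.0, 0.0, 0.0), (3.8, 0.6, 0.0), (7.6, 0.0, 0.8)]

backbone_rmsd = rmsd(predicted, experimental)
print(f"RMSD = {backbone_rmsd:.3f} A")
```

A value under 1 Å on this scale means the predicted atoms sit, on average, less than one angstrom from their experimental positions.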

Evidence Quality Scorecard: Peer-reviewed (all versions), massively cited and replicated (AF2 especially), Nobel-recognized, with concrete uses (e.g., 200k+ drug-relevant targets predicted). AF3 was initially published with pseudocode only, which drew criticism; code is now released for academic use. Score: A+ (gold standard).

Implications for Competitors: The data moat from MSAs/genomics is hard to beat in the short term; open-source AF2 lowers the barrier to entry, but AF3-style complex modeling requires proprietary compute and training.

AlphaProof: Formal Math Proving at Olympiad Level

AlphaProof combines a Gemini language model (which translates natural-language math into the Lean formal language) with AlphaZero-style reinforcement learning and tree search, improving its proofs by training on millions of auto-formalized problems. At IMO 2024 it solved 3 of the 5 non-geometry problems, including the hardest, P6, which only 5 of 609 human contestants solved fully. Combined with AlphaGeometry 2's solution to P4, this gave 28/42 points, a silver-medal score.[5]
- Paper: Nature (Nov 2025), proves IMO P1/P2/P6; perfect on miniF2F, near-perfect PutnamBench.[5]
- Citations: ~90 (recent).[5]
- Downstream: Proves 258 formal-IMO problems; enables verifiable math reasoning.
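
To make "formally checked in Lean" concrete, here is a toy Lean 4 statement and proof. It is far simpler than any IMO problem and is not taken from the AlphaProof paper; it only illustrates that a proof is accepted when the Lean kernel type-checks it, which is what makes the system's output verifiable.

```lean
-- Illustrative only: a toy statement, not an AlphaProof output.
-- Lean accepts the theorem because the term on the right has exactly the stated type.
theorem sum_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```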

Evidence Quality Scorecard: Peer-reviewed, independently verified IMO performance (official competition). No broad replications yet (new). Theorems formally checked in Lean. Score: A (strong, emerging impact).

Implications for Competitors: Lean formalization + RL scales to harder math; the details are open, but compute-intensive training is a barrier to entry.

AlphaGeometry 1/2: Geometry Theorem Proving

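AlphaGeometry pairs a neural language model, trained on 100M synthetic proofs generated with the DDAR symbolic engine, with symbolic deduction: the model proposes auxiliary constructions and the engine derives their consequences. It solved 25/30 IMO geometry problems (approaching gold-medalist level); v2 (Gemini-based, 10x data) solved 84% of IMO geometry problems from the past 25 years and solved IMO 2024 P4, which together with AlphaProof's results produced the silver-medal score.[6][7]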
- v1 Paper: Nature (Jan 2024), open-sourced code; discovers generalized IMO 2004 theorem.[7]
- v2: arXiv (Feb 2025), gold-medalist performance.
- Downstream: Human-readable proofs; no new theorems reported beyond the generalized IMO 2004 result and benchmark problems.
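
The neuro-symbolic loop described above (deduce as far as possible, then add an auxiliary construction and deduce again) can be sketched in a few lines. This is a toy under stated assumptions: facts are plain strings, the rules are hand-written, and `propose_constructions` is a hypothetical stand-in for the language model; none of it reflects the actual AlphaGeometry code.

```python
def deduce_closure(facts, rules):
    """Forward-chain: apply rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known

def prove(goal, facts, rules, propose_constructions, max_constructions=3):
    """Alternate symbolic deduction with proposed auxiliary constructions.

    Returns the list of constructions used, or None if the goal is not reached.
    """
    known = set(facts)
    used = []
    for step in range(max_constructions + 1):
        known = deduce_closure(known, rules)
        if goal in known:
            return used
        if step == max_constructions:
            break
        candidates = [c for c in propose_constructions(known) if c not in known]
        if not candidates:
            return None
        known.add(candidates[0])  # take the proposer's top suggestion
        used.append(candidates[0])
    return None

# Toy isosceles-triangle example with hand-written rules.
rules = [
    (("AB = AC", "M is midpoint of BC"), "triangle ABM congruent to ACM"),
    (("triangle ABM congruent to ACM",), "angle ABC = angle ACB"),
]

def propose_constructions(known):
    # Hypothetical stand-in for the neural model's suggestion.
    return ["M is midpoint of BC"]

result = prove("angle ABC = angle ACB", {"AB = AC"}, rules, propose_constructions)
print(result)
```

Without the midpoint construction the deduction engine stalls; with it, the goal follows in two rule applications. That division of labor is the point of the hybrid design.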

Evidence Quality Scorecard: v1 peer-reviewed; v2 preprint. Replicated via open code; benchmarked on official IMO. Score: A- (peer-reviewed core, validated benchmarks).

Implications for Competitors: Synthetic data gen key; neuro-symbolic hybrid hard to match without similar scale.

AlphaMissense: Missense Variant Pathogenicity

AlphaMissense fine-tunes AlphaFold 2 on human and primate variant frequencies plus structural context to score 71M possible missense variants, classifying 89% of them as likely benign or likely pathogenic. It reaches auROC 0.94 on ClinVar, ahead of prior predictors, and outperforms REVEL/CADD on functional assays.[8][9]
- Paper: Science (Sep 2023), open catalogue/code.
- Downstream: Prioritizes disease mutations (e.g., BAP1 in uveal melanoma, 91.7% ClinVar match); integrated in Ensembl/VEP/UniProt.
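
The auROC figure above can be read as the probability that a randomly chosen pathogenic variant receives a higher score than a randomly chosen benign one. A minimal sketch of that pairwise definition follows; the scores and labels are invented for illustration and are not AlphaMissense outputs.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    labels: 1 for pathogenic, 0 for benign.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical pathogenicity scores and ClinVar-style labels.
scores = [0.95, 0.80, 0.62, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

auc = auroc(scores, labels)
print(f"auROC = {auc:.3f}")
```

The quadratic loop is fine for a sketch; on large benchmarks a rank-based implementation from a statistics library would be the usual choice.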

Evidence Quality Scorecard: Peer-reviewed, benchmarked on ClinVar/experiments, widely integrated. Score: A (validated predictions).

Implications for Competitors: Leverages AF2; population data moat.

AlphaQubit: Quantum Error Correction Decoder

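AlphaQubit uses transformers/convolutions to decode surface-code errors from syndromes. On Google's Sycamore data (distances 3 and 5) it reportedly makes about 6% fewer errors than tensor-network decoders and about 30% fewer than correlated matching; in simulation it scales to distance 11 and up to 100k rounds, though it is not yet fast enough for real-time, μs-scale decoding.[10]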
- Paper: Nature (Nov 2024).
- Downstream: Enables fault-tolerant quantum computing.
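
To illustrate what "decoding errors from syndromes" means, the sketch below uses a 3-bit classical repetition code with a lookup-table decoder. This is a deliberate simplification: it is not a surface code and not AlphaQubit's neural decoder, and the independent bit-flip noise model is an assumption.

```python
def syndrome(bits):
    """Parity checks between neighbouring bits of a 3-bit repetition code."""
    return (bits[0] ^ bits[1], bits[1] ^ bits[2])

# Lookup-table decoder: the most likely single bit-flip for each syndrome.
CORRECTION = {(0, 0): None, (1, 0): 0, (1, 1): 1, (0, 1): 2}

def decode(bits):
    """Apply the correction suggested by the syndrome and return the result."""
    flip = CORRECTION[syndrome(bits)]
    corrected = list(bits)
    if flip is not None:
        corrected[flip] ^= 1
    return corrected

def logical_error_rate(p):
    """Failure probability under independent flips with probability p.

    The decoder fails exactly when two or three bits flip.
    """
    return 3 * p**2 * (1 - p) + p**3

print(decode([0, 1, 0]))          # one flip: corrected back to all zeros
print(decode([1, 1, 0]))          # two flips: miscorrected to all ones
print(logical_error_rate(0.01))
```

A better decoder lowers the logical error rate for the same physical error rate, which is the quantity the comparisons above are about.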

Evidence Quality Scorecard: Peer-reviewed, hardware-benchmarked on Sycamore. Score: A (experimental validation).

Implications for Competitors: Hardware-specific training; generalizes across distances.

GNoME: Materials Discovery at Scale

GNoME graph networks predict crystal stability from composition and structure, discovering 2.2M structures below the previously known convex hull (381k of them stable, roughly 10x the previously known count) with a hit rate of about 80%; candidates include layered compounds (52k) and Li-ion conductors (528).[11]
- Paper: Nature (Nov 2023, 1.8k+ citations).
- Validations: 736 ICSD matches; 91% of new Materials Project entries; r²SCAN stable 84-86%; A-Lab autonomous synthesis.[11]
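
"Below-hull" refers to the convex hull of formation energy versus composition: a candidate whose energy lies below the hull built from known phases would be a new stable phase. The sketch below works this out for a hypothetical binary system A-B; the energies are invented, and real pipelines handle multi-component systems with DFT-computed energies.

```python
def lower_hull(points):
    """Lower convex hull of (composition, energy) points (monotone chain)."""
    best = {}
    for x, e in points:
        best[x] = min(e, best.get(x, e))  # keep the lowest energy per composition
    hull = []
    for p in sorted(best.items()):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # cross <= 0: hull[-1] lies on or above the line from hull[-2] to p.
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def hull_energy(hull, x):
    """Linearly interpolate the hull energy at composition x."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    raise ValueError("composition outside the hull range")

def energy_above_hull(known, candidate):
    """Negative result: the candidate lies below the known hull."""
    x, e = candidate
    return e - hull_energy(lower_hull(known), x)

# Hypothetical binary system: x is the fraction of B, energies in eV/atom.
known = [(0.0, 0.0), (0.5, -0.40), (1.0, 0.0)]

below = energy_above_hull(known, (0.25, -0.30))
above = energy_above_hull(known, (0.75, -0.10))
print(below, above)
```

Here the first candidate sits about 0.10 eV/atom below the hull and would reshape it, while the second sits about 0.10 eV/atom above and would be expected to decompose into neighboring phases.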

Evidence Quality Scorecard: Peer-reviewed, 736+ experimental hits, independent validations. Score: A (scale + confirmations).

Implications for Competitors: 100M+ DFT data moat; active learning accelerates.

AlphaEvolve: Algorithm Evolution Agent

AlphaEvolve evolves codebases using Gemini-proposed code mutations scored by automated evaluators, improving on the best known result for 20% of 50 math problems (e.g., 4x4 complex matrix multiplication in 48 scalar multiplications, the first improvement on Strassen's 56-year-old record). Internal Google uses: 0.7% of data center compute recovered and a 23% faster Gemini training kernel.[12]
- Whitepaper (2025), GitHub results; not peer-reviewed.
- Downstream: TPU circuits, FlashAttention speedup.
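
The mutate-and-evaluate loop described above can be sketched generically. In this toy, a random tweak stands in for the LLM-proposed code edit and a simple distance score stands in for the automated evaluator; the function names, the target, and the population settings are all hypothetical and not drawn from the AlphaEvolve whitepaper.

```python
import random

def evolve(initial, mutate, evaluate, generations=200, population_size=8, seed=0):
    """Keep a small population of the best candidates seen so far.

    Each generation picks a parent, mutates it, scores the child, and
    truncates the population to the top performers.
    """
    rng = random.Random(seed)
    population = [(evaluate(initial), initial)]
    for _ in range(generations):
        _, parent = rng.choice(population)
        child = mutate(parent, rng)
        population.append((evaluate(child), child))
        population.sort(key=lambda pair: pair[0], reverse=True)
        del population[population_size:]
    return population[0]

# Toy problem: candidates are integer tuples, higher score is better.
TARGET = (3, -1, 4)

def evaluate(candidate):
    """Stand-in evaluator: negative squared distance to a fixed target."""
    return -sum((c - t) ** 2 for c, t in zip(candidate, TARGET))

def mutate(candidate, rng):
    """Stand-in for an LLM edit: nudge one coordinate by +1 or -1."""
    i = rng.randrange(len(candidate))
    changed = list(candidate)
    changed[i] += rng.choice((-1, 1))
    return tuple(changed)

best_score, best = evolve((0, 0, 0), mutate, evaluate)
print(best_score, best)
```

Because the best candidate is never discarded, the score can only stay level or improve across generations. How useful the loop is depends almost entirely on how well the evaluator captures the real objective.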

Evidence Quality Scorecard: Press-release/whitepaper, Google-internal verified, math proofs checkable. Score: B+ (promising, pre-peer review).

Implications for Competitors: Evolutionary LLM loop general-purpose; evaluator design key.

Isomorphic Labs: AI Drug Pipeline

Isomorphic (DeepMind spinout) uses AlphaFold 3 and IsoDDE (2x AF3 accuracy on ligands, pockets identified from sequence alone) for end-to-end design. Partnerships: Eli Lilly and Novartis (up to $3B in potential value), J&J (2025). It runs an internal oncology/immunology pipeline; Phase 1 trials are expected around end-2026 (delayed from 2025).[13][14]
- Milestones: $600M raised (2025); IsoDDE technical report.

Evidence Quality Scorecard: Press/partnerships, no public clinical data yet. Score: B (pre-clinical).

Implications for Competitors: Proprietary models + pharma scale; trials will validate.

Recent Findings Supplement (May 2026)

Isomorphic Labs' IsoDDE: AlphaFold 3's Proprietary Successor Accelerates Drug Design

Isomorphic Labs (DeepMind spin-off) released IsoDDE in February 2026, a unified engine that reportedly doubles AlphaFold 3's accuracy on protein-ligand predictions for novel pockets and ligands by modeling induced fit and cryptic sites computationally. The stated aim is de novo drug design with far less wet-lab iteration, and the engine is used directly in the company's oncology/immunology pipeline.[1][2]
- Technical report (Feb 10, 2026; Zenodo DOI 10.5281/zenodo.19699685): >2x AF3 on Runs N’ Poses benchmark (50% success on 0-20% similarity bin); 2.3x AF3 on antibody-antigen DockQ>0.8 (39% vs 17%); exceeds FEP+ physics methods on affinities at 1/10th cost; detects cereblon cryptic pocket from sequence alone (RMSD 0.12-0.33Å).[1][2]
- Internal use: Daily in programs for unseen structures/pockets; partnerships (Lilly, Novartis, J&J) worth $3B+; clinical trials delayed to end-2026 (from 2025 target).[1][3]
Evidence Scorecard: Press-release stage (proprietary technical report, no peer review/replication); strong benchmarks but no disclosed drug targets/clinical milestones. For competitors: Data moat via proprietary training; open-source rivals lag 2x+ on generalization.

AlphaGenome: Peer-Reviewed Leap in Non-Coding DNA Interpretation

DeepMind's AlphaGenome (Nature, Jan 2026) takes up to 1 Mb of DNA sequence and predicts multimodal tracks (expression, splicing, chromatin) at base-pair resolution, outperforming prior models on 25/26 variant-effect benchmarks by combining long-context modeling with fine resolution. The target is interpretation of the roughly 98% of the genome that is non-coding ("dark matter") for disease variant prioritization.[4]
- Peer-reviewed Nature paper (DOI:10.1038/s41586-025-10014-0): Beats Borzoi/Enformer on eQTLs (Spearman R), MPRA effects, TAL1 oncogene variants; GitHub tools/API for tracks/variants; complements AlphaMissense (coding regions).[4]
- Early uses: 3,000+ scientists since Jun 2025 preprint; aids rare disease diagnostics, enhancer-gene linking.[5]
Evidence Scorecard: High (peer-reviewed, 99+ citations, open tools); replicated in benchmarks vs. external models. For entrants: Foundation model sets new SOTA; fine-tune via SDK for custom genomics.
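
The eQTL comparison above is reported as a Spearman correlation, i.e. a Pearson correlation computed on ranks, so it rewards getting the ordering of variant effects right rather than their exact magnitudes. A minimal sketch follows; the predicted and observed effect sizes are invented for illustration.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical predicted vs. measured variant effects on expression.
predicted = [0.9, 0.1, -0.4, 0.3, -0.8]
observed = [1.2, 0.3, -0.2, 0.1, -1.0]

rho = spearman(predicted, observed)
print(f"Spearman rho = {rho:.2f}")
```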

AlphaProof: Formal Proofs Reach IMO Silver in Peer-Reviewed Detail

AlphaProof's Nature paper (Nov 2025) details reinforcement learning in Lean over auto-formalized natural-language problems. The system contributed to the 28/42 IMO 2024 score (silver), scaling verifiable reasoning via millions of auto-formalized problems and bridging LLMs with theorem-proving rigor.[6]
- Nature (DOI:10.1038/s41586-025-09833-y): Solves 3 of the 6 IMO 2024 problems (two algebra, one number theory); 90+ citations; no new theorems post-2024, but it enables a discovery pipeline.[7]
Evidence Scorecard: High (peer-reviewed Nature); performance verified on the official IMO 2024 problems. For math competitors: Lean integration moat; extend via RL on synthetic proofs.

AlphaEvolve: Algorithmic Evolution Yields Math/Infra Discoveries

AlphaEvolve (arXiv Jun 2025) evolves codebases via Gemini LLM mutations + evaluators, rediscovering SOTA on 75% of 50 math problems and improving on it for 20% (e.g., a lower bound of 593 for the kissing number in 11 dimensions). It is deployed internally for data center scheduling (0.7% global compute recovery).[8]
- Preprint (549 citations): Matrix mult advances; GitHub results notebook; no peer review yet, but collaborations (Tao) yield new constructions (e.g., Kakeya conjecture).[8]
Evidence Scorecard: Medium (preprint, high citations, verified discoveries); partial replications via Colab. For optimizers: Evolutionary loop generalizes; Cloud preview for enterprise.

AlphaFold Ecosystem: Sustained Impact, No Major New Core Advances

The AlphaFold line (AlphaFold 3 developed jointly with Isomorphic) is cited in 35k+ papers, is associated with a 40%+ rise in novel experimental structures among users, and boosts clinical/patent citations. It has enabled work on honeybee conservation, apoB100 heart targets, and resilient crops, but no new drug targets or materials have been quantified post-2025.[9]
- Nov 2025 blog: 200M+ structures, 3M users (1M low-income); Nobel 2024 context affirmed.[9]
Evidence Scorecard: Established (replicated globally); downstream uses peer-reviewed. For biology: Server (8M+ folds) commoditizes; compete via domain fine-tunes.

Underdeveloped Systems: Stagnant Post-2025 Evidence

- AlphaGeometry 2: IMO geometry solver (83-88% of problems); no new papers or uses.
- GNoME: Crystal discovery; no updates.
- AlphaMissense: Variant pathogenicity; integrated in benchmarks/databases, aids rare-disease work.[10]
- AlphaQubit: Quantum error decoder; arXiv updates (AQ2, Dec 2025), no replications.[11]
Evidence Scorecard (aggregate): Low-medium (pre-2025 papers, minor mentions); inferred to be at press-release/discovery stage. For quantum/materials: Niche; validate via open benchmarks.

Research Question