Summary
- WBE is decades away assuming no AGI. Expert timelines for human brain emulation are mid-2040s at the earliest, with wide error bars, and generally are for "simulating the brain for a few seconds" rather than for any useful length of time. There is essentially no overlap with AGI timelines.
- WBE could be achieved several years after AGI, but probably not several months after. It takes months to physically prepare, slice, and scan a human brain; doing tests on mice will take months; and collecting functionalization data will take months unless simulations can substitute for most physical experiments.
- Emulations are expensive for what they do. Each one requires 10,000–40,000 H100-equivalents and costs $20K–$80K/hour, running at up to ~5× real-time. Emulations would be tens to hundreds of times more expensive per unit of cognition than humans, and orders of magnitude more expensive than AI systems. Emulations — including distilled emulations — are unlikely to ever be competitive with purpose-built AI for cognitive work.
- The safety case is weak. Emulations aren't obviously more trustworthy than AI, aren't a competitive starting point for building superintelligence, and it's not clear they tell you much about human values that you can't learn more cheaply from humans and LLMs.
- Pre-AGI investment is probably not cost-effective. A very simple BOTEC produces an estimate of 0.3 bp/bn[1] for a $100M investment. The value is concentrated in a narrow scenario: emulations somehow enable large x-risk reductions and a global AI moratorium is in place to allow emulation technology to reach maturity, both of which seem unlikely.
Introduction
Whole brain emulation (WBE) sometimes comes up as a potential tool in AI safety strategy — either as an alternative path to transformative AI with better alignment properties, or as a way to scale trusted human cognition during a critical period. This memo tries to assess whether either hope is realistic by asking two questions:
- When could we get WBE, relative to superintelligence?
- If we got WBE, would it matter?
I'm primarily focused on whether emulations could provide upside through reductions in risk during the initial transition to AGI and the development of superintelligence, rather than whether emulations have a role to play in the longer-term future (for example, by participating in a long-reflection or providing a 'ground-truth' for human values). I also don't consider whether emulations would have moral status, or whether WBE could have additional downsides during or post-singularity.
I’ve primarily relied on the State of Brain Emulation 2025 report and on Claude; the former is by far the best source for detailed reading and for finding citations to the primary literature.
Timelines to Whole Brain Emulation
Whole brain emulation requires functionalization and scanning
The problem of whole brain emulation decomposes into two largely independent technical challenges:
- Scanning is reading out the specific state of a particular brain: which neurons connect to which, via what synapses, with what strengths. This is analogous to knowing the weights of an artificial neural network.
- Functionalization is knowing the rules by which neurons compute and how these evolve over time.
Both are needed to create an emulation: a perfect scan is useless without a functional model to run it, and a perfect functional model is useless without the specific connectivity of the brain you want to emulate.
Scanning a human brain is feasible but will always be a large physical undertaking
The brain contains a lot of information
A human brain has 86 billion neurons connected by 170 trillion synapses. A rough information accounting:
- Each synapse requires on the order of 100 bits to specify (connectivity, weight, type, dendritic location), giving 1.7 × 10¹⁶ bits or roughly 2 petabytes of structured connectomic data.
- Per-neuron parameters (firing thresholds, ion channel densities, time constants) add roughly 1,000 bits per neuron, totaling about 10¹⁴ bits; negligible by comparison.
The synaptic wiring diagram dominates the information budget.
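To make the arithmetic explicit, here is a minimal sketch in Python; the per-synapse and per-neuron bit counts are the rough assumptions above.

```python
# Rough information budget for a human brain scan (order-of-magnitude only; the
# bits-per-synapse and bits-per-neuron figures are the assumptions stated above).
NEURONS = 86e9
SYNAPSES = 170e12
BITS_PER_SYNAPSE = 100    # connectivity, weight, type, dendritic location
BITS_PER_NEURON = 1_000   # thresholds, channel densities, time constants

synapse_bits = SYNAPSES * BITS_PER_SYNAPSE   # ~1.7e16 bits
neuron_bits = NEURONS * BITS_PER_NEURON      # ~8.6e13 bits

def to_pb(bits):
    return bits / 8 / 1e15                   # bits -> petabytes

print(f"Synaptic data:   {synapse_bits:.1e} bits ~= {to_pb(synapse_bits):.1f} PB")
print(f"Per-neuron data: {neuron_bits:.1e} bits ~= {to_pb(neuron_bits):.2f} PB")
# Synaptic data ~1.7e16 bits ~= 2.1 PB; per-neuron data ~8.6e13 bits ~= 0.01 PB.
```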
Current methods rely on slicing and imaging the brain at nanometer resolution
Resolving individual synapses requires roughly 15 nm resolution. There are two main approaches.
Electron microscopy achieves far finer resolution than needed but requires slicing the brain into 30 nm-thick sections, perhaps roughly 30–40 million such sections in total across the brain. State-of-the-art multi-beam SEMs sustain effective imaging rates of 100–200 megavoxels per second. At 10 nm isotropic resolution, 20 systems running continuously took roughly four years to scan a 500 mm³ mouse brain — 80 scope-years. A human brain is 2,400× larger, giving roughly 190,000 scope-years. Each system costs $0.5–10M, with similar maintenance costs over 3–5 years. Current costs for scanning the human brain would thus be O($100B).
EM reveals geometry but not chemistry: different receptor proteins are morphologically indistinguishable, so EM alone cannot determine molecular composition at synapses. Immuno-EM can tag specific proteins but only one or two targets per section, making it impractical for multiplexing.
Expansion microscopy physically swells tissue 16–24× using a hydrogel, allowing conventional optical microscopes to resolve synaptic-scale features. The most advanced dense-labeling ExM pipeline (LICONN) achieves effective resolution of 20 nm laterally and 50 nm axially (comparable to EM) and imaged 0.001 mm³ of original tissue in 6.5 hours on a standard spinning-disk confocal. That is about 1.3 mm³ per scope-year, giving roughly 900,000 scope-years for a human brain.
This throughput reflects unoptimized readout optics, not a fundamental limit. Faster light-sheet systems (ExA-SPIM) already achieve 56× higher voxel rates, albeit at resolution insufficient for dense connectomics; pairing such optics with LICONN-class sample preparation would bring the requirement to roughly 16,000 scope-years. Current spinning-disk confocals cost a few hundred thousand dollars; ExA-SPIM systems cost $175–250K. Total costs today for an optimized setup are therefore perhaps O($4B).
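The scope-year arithmetic behind these estimates is simple enough to spell out. A sketch, using the throughput figures above and the top of the quoted ExA-SPIM price range as the assumed unit cost:

```python
# Scope-year arithmetic behind the EM and ExM estimates (figures from the text;
# the $250K unit cost is the top of the quoted ExA-SPIM price range).
HUMAN_BRAIN_MM3 = 1.2e6                              # ~1,200 cm^3

# Electron microscopy: 20 scopes ran ~4 years for a 500 mm^3 mouse brain.
em_scope_years = (20 * 4) * HUMAN_BRAIN_MM3 / 500
print(f"EM: ~{em_scope_years:,.0f} scope-years")                       # ~190,000

# Expansion microscopy, LICONN readout: 0.001 mm^3 of original tissue per 6.5 h.
liconn_mm3_per_scope_year = 0.001 * (365 * 24) / 6.5                   # ~1.3 mm^3
exm_scope_years = HUMAN_BRAIN_MM3 / liconn_mm3_per_scope_year
print(f"ExM (LICONN optics): ~{exm_scope_years:,.0f} scope-years")     # ~900,000

# Pair LICONN-class sample preparation with ExA-SPIM-class optics (~56x voxel rate).
fast_scope_years = exm_scope_years / 56
print(f"ExM (faster optics): ~{fast_scope_years:,.0f} scope-years")    # ~16,000

# Implied fleet and capex for a one-year scan.
print(f"One-year scan: ~{fast_scope_years:,.0f} scopes, "
      f"~${fast_scope_years * 250e3 / 1e9:.0f}B in scope hardware")
```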
ExM has a significant advantage over EM on molecular information: immunofluorescent antibody staining can directly read out receptor subtypes, with 4–6 targets per imaging pass and up to 20 via sequential staining rounds. If multiple rounds are needed, data collection and costs scale proportionally.
In either case, raw imaging generates vastly more data than the structured information extracted. Scanning a human brain at 10 nm resolution produces about 10²¹ bytes, compared to a few petabytes of structured connectomic information that needs to be extracted.
The binding constraint is parallel physical processing
Even with unlimited automation, the scanning pipeline involves irreducibly physical steps: tissue must be chemically fixed, sliced, expanded (for ExM), stained, and imaged. The chemical preparation steps alone impose a hard floor of several weeks. Beyond that, the timeline scales linearly with equipment: finishing in one month rather than one year requires 12× as many scopes and robots operating in parallel.
A rough BOTEC (see Appendix 1) suggests that a one-year ExM scan of a human brain would require on the order of 16,000 scopes, 1,000–2,000 FTEs for wet-lab processing, and $1–2B in scope components. The scopes do not currently exist at this scale but are simple enough instruments that manufacturing them is a modest industrial undertaking — the binding constraint is component supply chains, not assembly. Total peak labor across manufacturing and tissue processing is roughly 1,500–4,000 FTEs, comparable to a single large factory.
A faster scan requires proportionally more of everything: a three-month scan needs roughly 4× as many scopes, workers, and physical space. If multiple staining rounds are needed this likewise causes costs to increase proportionally. The hard floor of several weeks for chemical preparation cannot be compressed regardless.
Post-processing is currently a major bottleneck but is rapidly falling
Historically, the dominant cost in connectomics has been reconstruction and proofreading — turning raw images into a segmented wiring diagram. In the Shapson-Coe 1 mm³ human cortex dataset, expert reviewers spent over 3.5 hours per neuron correcting automated segmentation errors. At that rate, proofreading a whole human brain would take roughly 300 billion person-hours. AI-based segmentation has been improving rapidly, and this is the sub-problem most amenable to automation. In a post-AGI world, this bottleneck likely disappears entirely, leaving the physical scanning pipeline as the binding constraint.
Even with full automation, the compute and storage costs for processing a whole-brain scan are non-trivial (see Appendix 2), but probably not the binding constraint relative to the physical pipeline.
Developing the pipeline requires physical testing
The scanning pipeline involves chemistry and biology that must be empirically validated. The core risk is that you scan a human brain and discover you didn't collect the right information — you missed a molecular marker that turns out to be functionally important, or your resolution was insufficient in some brain region, or expansion artifacts corrupted critical data. Since the brain is destroyed in the process, this is not recoverable. You would almost certainly want to do at least one complete mouse brain before attempting a human, and plausibly many, not just to validate and optimize the scanning pipeline but to verify that you’ve collected enough information and that functionalization works. Even with an optimistic 2-month test-build-iterate cycle this validation phase adds a serial bottleneck of at least several months.
Superintelligence probably cannot find a fundamentally better scanning method
The information needed for whole brain emulation is spatially distributed across a 1,200 cm³ volume at 15 nm feature size. Any method must extract petabytes of information unique to a specific individual. While some of this information may be inferrable from the genome or from scans of other brains, presumably a vast amount is the result of a lifetime of experience that can only be determined from the synaptic connectivity.
Non-destructive scanning of a living brain appears to face fundamental physical barriers. The best in vivo methods are orders of magnitude too coarse (fMRI at 100 µm, two-photon at 1 µm but only 1 mm deep). MRI is limited by signal-to-noise at small voxel sizes; optical methods are limited by scattering at depth. There may also be an information-theoretic constraint: extracting petabytes from a living brain requires an extraordinary data rate through some physical channel, and no known channel can support this without damaging the tissue.
Two alternative approaches to the slice-and-image pipeline have been proposed, but neither is close to practical:
- X-ray ptychographic tomography (Bosch et al. (2025)) can image fixed brain tissue at sub-40 nm resolution without physical sectioning. The non-destructive pipeline is attractive and it could plausibly become a preferred method on other dimensions, but it has so far only been demonstrated on tiny volumes (tens of microns across) using synchrotron radiation, at throughput that would require millions of years for a whole human brain. I'm skeptical it can beat electron microscopy on the fundamental bottleneck of imaging throughput.
- DNA barcoding methods (MAPseq, Connectome-seq) convert connectivity into a sequencing problem, but require viral infection of a living brain — infecting all 86 billion neurons with unique barcodes — followed by dissection and sequencing. They have not demonstrated dense whole-brain synaptic mapping, and do not capture the spatial and molecular information that imaging provides.
I haven’t tried that hard to work out whether a superintelligence could come up with other approaches (something something nanobots?), but on priors I’m skeptical; the information-theoretic constraints appear brutal.
Functionalization is challenging but could potentially be brute-forced using simulations
Functionalization decomposes by timescale, and longer timescales become increasingly difficult
Functionalization means knowing the rules by which neurons compute and change over time. In an artificial neural network, training and inference are separate; no such distinction exists in the human brain. The forward dynamics—how a neuron responds to its inputs given its current state—operate on milliseconds and are well-established biophysics. Short-term plasticity (synaptic facilitation and depression) operates on hundreds of milliseconds to seconds. Long-term plasticity (LTP, LTD) modifies synaptic strengths over minutes to hours. Structural plasticity — the formation and pruning of synapses — reshapes the wiring diagram over days to weeks. Each timescale depends on increasingly complex molecular machinery and is less well characterized than the last.
Forward dynamics (milliseconds–seconds) are probably tractable
The equations governing neural signal propagation are known and have been validated experimentally for decades. Simplified neuron models reproduce observed spiking behavior in cases where they can be validated against recordings. The main open question is what level of biophysical detail is sufficient — a simple point neuron with a few parameters per synapse, or a multi-compartment model capturing dendritic computation, at roughly 100× the compute cost. This is not yet resolved, and matters for scanning as well as simulation: if the functional model requires more parameters per neuron, the scanning pipeline must collect correspondingly more information.
The question is empirically resolvable on small organisms where complete connectomes exist. The adult Drosophila connectome is complete and early functional simulations show promise in specific circuits, though faithful whole-organism emulation remains elusive even for C. elegans, where the complete connectome has been known for nearly four decades.
Update rules (hours–days) are the critical uncertainty
Most demonstrations of brain emulation, and most quantitative analyses of WBE feasibility including the SOBE (2025) report, operate in what amounts to a frozen-weights approximation: they implement the forward dynamics but exclude plasticity. A frozen emulation can process information over short timescales but cannot learn, adapt, or form new memories; plausible behavior may be retained for seconds or minutes but not longer.
Phenomenological plasticity models exist and reproduce plasticity as observed under specific laboratory stimulation protocols. Whether they generalize to naturalistic activity patterns is unknown. A fundamental problem is that we cannot measure what we would need to model: observing how specific synapses change over hours or days requires tracking them at synaptic resolution in an intact brain over time, but current high-resolution methods (EM, ExM) are destructive and can only image once. In vitro preparations allow repeated measurement but in artificial conditions that may not reflect natural brain activity.
The problem becomes increasingly severe at longer timescales. It is well established that genetic variation, drugs, hormones, sleep, and stress all affect long-term learning and memory. These effects are mediated by biomolecular machinery — kinase cascades, gene expression, protein synthesis — that phenomenological plasticity models have no inputs for. Getting the update rules right almost certainly requires understanding these molecular details, not just fitting curves to stimulation protocols.
Evidence from neuropharmacology and genetics suggests that the update rules are quite sensitive to molecular-level perturbations. SSRIs alter emotional processing and behavior. Benzodiazepines impair memory formation. A single amino acid polymorphism in the COMT enzyme measurably affects cognitive style. Sleep deprivation profoundly impairs consolidation through mechanisms involving protein synthesis. By contrast, the forward dynamics appear relatively robust — patients who lose large amounts of brain tissue through injury or surgery can still think and speak. The implication is that an emulation with slightly wrong update rules would not necessarily be incoherent, but would likely diverge from the original person in ways analogous to the effects of chronic drug exposure: recognizably similar but with altered learning, emotional regulation, and personality. It is also plausible that incorrect update rules could push the emulation into states that are not physiologically possible in a biological brain, with unpredictable consequences.
Solving the update rules requires a dataset of neuron-level input-output pairs
Any approach to the update rules ultimately requires generating a dataset of the form: "neuron of type X, in state Y, receiving inputs Z, updates its state to W over time period T." I can think of three ways to get such data.
Inferring update rules from coarse-grained brain measurements. I am very skeptical; the measurements average over thousands to millions of neurons, and many different synaptic rules could produce the same macroscopic observations.
Experimental measurements. The most direct approach is to grow neurons in vitro, stimulate them, and destructively read out their molecular state at defined timepoints. The cost scales with the number of measurements needed, which is the central open question: 10⁶ measurements cost roughly $100M (small relative to scanning), while 10⁸ cost $5–10B (dominating the total WBE budget). Even with unlimited resources, neuron maturation imposes a serial bottleneck of at least several months. See Appendix A.3 for more details.
Simulations. Instead of physical experiments, one can simulate synaptic compartments at molecular resolution to derive the update rules computationally. With current algorithms this is 1,000–10,000× more expensive per measurement than a physical experiment. However, the measurements are probably more useful, and there’s likely more headroom for superintelligent AI to improve simulation algorithms. With current methods, experiments seem to dominate; with sufficient algorithmic progress or cheaper compute, the balance could reverse. See Appendix A.4 for more details.
How hard functionalization is remains unclear. Given we don’t know how many neuronal measurements would be needed, it is unclear whether this or the destructive scanning would be the bottleneck. Whatever the cost, functionalization is plausibly closer to a one-time field-level expense than to one scaling per brain. Most individual differences sit in wiring, not in update rules. If the rules are derived by molecular simulation, the expensive step is building the pipeline once — genetic code → individual-specific proteins → update rules — and applying it to a new individual is then cheap: just rerun the stack with their proteins. The experimental route is harder, since new individuals plausibly require fresh experiments on genetically diverse neurons, but probably still tractable.
Whole brain emulation is unlikely to be available during the critical period for AGI
Whole brain emulation is unlikely to predate AGI
As of 2026, the most advanced whole-organism simulation is BAAIWorm for C. elegans, which reproduces basic locomotive behaviors using frozen synaptic weights but cannot learn, and does not emulate a specific individual. The Drosophila connectome was completed in 2024 and early functional simulations show promise in specific circuits, but whole-organism emulation remains out of reach. A mouse connectome does not yet exist. For human WBE, neither the scanning pipeline nor the functional models are close to ready, and the required validation on smaller organisms has barely begun.
In the Asimov article released alongside the SOBE (2025) report (see also their Guesstimator for an interactive version of their cost and timeline model), Max Schons estimates:
If I had to put a number on optimistic budgets and timelines for human brain emulation today, I would hazard: Conditional on success with a sub-million-neuron brain emulation model, a reasonable order of magnitude estimate for the initial costs of the first convincing mouse brain emulation model is about one billion dollars in the 2030s and, eventually, tens of billions for the first human brain emulation model by the late 2040s. My error bars on this projection are high, easily 10x the costs and ten additional years for the mouse, 20 to 50 years for humans.
Readers familiar with recent AI progress might wonder whether these timelines are too conservative. AI will provide extraordinary acceleration in some places, but I'm skeptical these gains will multiply across a pipeline with dozens of sequential dependencies and failure modes. Brain emulation is fundamentally not a digital process; core bottlenecks involve physical manipulation of biological tissue, with time requirements dictated by chemistry and physics rather than compute power. The field requires deep integration across disciplines and tacit knowledge accumulated through years or decades of hands-on training. Capital costs of specialized equipment and ethical considerations around human brain tissue add to these constraints. Scientists might also make new observations tomorrow that complicate the picture further, such as realizing that not just a few, but hundreds of distinct molecular annotations might be necessary to accurately model a neuron's activity.
I basically agree, and furthermore the endpoint he envisions still seems to be a “frozen neuron weight” approximation of an aggregate, generic brain rather than a personality-preserving emulation of a specific individual. Given that I expect we’ll reach AGI within the next 20 years (and probably within the next 5–10 years), there’s not much overlap between AGI timelines and WBE timelines; this seems to be the view of essentially everyone who’s thought about the question.
This is contingent on current levels of investment, which are modest and forecasted to remain so. R&D spending on the order of the hundreds of billions of dollars currently going to AI would meaningfully shift timelines, but I think it's unlikely we'll see that level of investment pre-AGI.
If AGI arrives tomorrow, whole brain emulation will likely require at least O(1) year to achieve
Superintelligent AI eliminates the non-physical bottlenecks. Segmentation and proofreading become essentially free. Experiment design and simulation algorithms for functionalization can be optimized far beyond current methods. The binding constraints are physical and serial:
- Functionalization (3–12 months). Each batch of neurons requires 1–3 months to mature before experiments can begin. It is possible that sufficiently good molecular simulations could substitute for most physical experiments, but at least one experimental batch will likely be needed to validate the derived update rules. Optimistically, this is 2 batches (derive + validate) taking 3–6 months. If simulations are insufficient and physical experiments dominate, 3–4 batches taking 6–12 months.
- Mouse validation (2–4 months). At least one mouse brain must be scanned (weeks), emulated, and checked against behavior. If something is wrong—wrong markers, insufficient resolution, update rules that don't generalize—each iteration takes weeks to months. At least 1–2 iterations seem likely, plausibly more.
- Human scan (2–4 months). Tissue preparation, imaging, and reconstruction with massive parallelism. Multiple staining rounds scale costs and time proportionally.
These steps are largely serial: functionalization must produce initial results before mouse validation is meaningful, and mouse validation must succeed before committing to a human scan. Total serial time is roughly 6 months at the very optimistic end, assuming simulations largely work, everything succeeds on the first try, and O($10B) capex expenditure to maximally parallelize the final scanning. More realistically it could easily take >12 months. There's a lot of room for things to go wrong — validating long-term behaviors is particularly hard, and there's pipeline risk at every stage — and a cautious pipeline might plausibly want to validate on multiple intermediate species, like monkeys, before committing to a human brain.
Pre-AGI investments are unlikely to substantially compress this timeline
Total global funding for basic neuroscience is roughly $0.5B per year,[2] and fewer than 500 people worldwide work specifically on brain emulation (SOBE (2025)). WBE-specific funding is probably in the low tens of millions annually. A billion-dollar investment would therefore represent something like 10–50x current annual WBE spending — naively equivalent to a decade or more of field progress.
But this overstates the impact for several reasons:
- The field is tiny and cannot absorb a billion dollars quickly; hiring, training, and building new labs takes years, which competes directly with short AGI timelines. I expect basically no useful impact if timelines are < 5 years.
- The most valuable pre-investment, a serious functionalization program targeting the update rules, is also the least likely to deliver, since almost nobody is currently working on this and it is unclear how much progress can be made without AI capabilities that themselves signal AGI is close.
- Even investments that deliver on their own terms may not help post-AGI: a completed mouse connectome is only useful if it happened to capture the right molecular information, and without functionalization we cannot know whether it has.
Regardless of pre-AGI investment, I think it is very unlikely that a human brain will have been scanned before AGI arrives. In addition to the long default timelines for the prerequisite technologies, scanning a human brain would likely cost O($1B) even when spread over the natural 3–5 year timescale over which the capex depreciates.
Pre-AGI investments in whole brain emulation are unlikely to meaningfully accelerate AGI timelines
In Appendix A.5 we estimate the expected acceleration of AI timelines from a $100M pre-AGI investment in WBE. Even on a generous accounting, neuroscience's contribution to AI progress is small on expectation, and furthermore WBE investments represent a small fraction of relevant neuroscience spending. As a result, the expected acceleration is probably only on the order of a day.
Using Whole Brain Emulation
Initial emulations would be slow and expensive
The previous sections argued that whole brain emulation is unlikely to be available during the critical period around AGI. But suppose it were — would it matter? To answer this we need to estimate how many emulations could be run, at what speed, and at what cost relative to AI systems.
Each emulation requires tens of thousands of GPUs
The compute requirements for a brain emulation depend on the fidelity of the neuron model. The SOBE (2025) report analyzes two reference scenarios; we use the more biophysically realistic (five-compartment Hodgkin-Huxley neurons with Tsodyks-Markram synapses) as the reference case throughout. Real-time emulation requires 1.4 × 10¹⁹ FLOP/s, of which 98% is synapse processing. Brain simulation cannot use tensor cores; the relevant H100 throughput is 67 TFLOP/s (FP32 CUDA cores), giving a naive GPU count of about 200,000.
This is an overestimate in some ways and an underestimate in others:
- Memory bandwidth, not compute, binds the naive simulation. Every synapse must be read and updated 10,000 times per second, demanding roughly 27 EB/s of bandwidth. This inflates the GPU count to 400,000–1,000,000.
- The SOBE baseline excludes long-term plasticity. Adding STDP, calcium-based LTP/LTD, and structural plasticity roughly doubles per-synapse memory and adds a central estimate of 2× to compute.
- Clock-driven simulation is wildly wasteful. Cortical neurons fire at 1–5 Hz, so over 99.9% of synapse updates compute "nothing happened". Event-driven simulation—updating each synapse only when a spike arrives—gives an algorithmic speedup of several hundred times, though implementation losses (GPU scatter inefficiency, cross-node spike delivery, the neuron-compute floor) reduce the realized whole-simulation speedup to 10–50×.
- FP16 precision gives a further well-grounded 2×.
Appendix A.6 works through the full accounting. The net effect is that event-driven simulation and FP16 more than offset the plasticity and bandwidth costs, bringing the effective compute requirement to roughly 3 × 10¹⁷–10¹⁸ FLOP/s.
At this point the bottleneck shifts to memory capacity. Event-driven simulation reduces memory accesses proportionally to the FLOP reduction—idle synapses are no longer read every tick—but the synapse state itself (3–5 PB of weights, plasticity traces, and connectivity) must be held in memory whether or not it is being accessed. A single real-time emulation requires 10,000–40,000 H100-equivalent GPUs, sized by the need to fit that state across 80 GB of HBM per chip. Additional biophysical techniques—most importantly synapse pruning, which compresses the memory footprint directly—could push this lower, but the individual factors are not well-grounded (see Appendix A.6).
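A sketch of why the fleet ends up memory-bound; the bytes-per-synapse values are assumptions chosen to match the 3–5 PB state estimate, and the final adjustment down to the headline range is the subject of Appendix A.6.

```python
# Why memory capacity, not FLOPs, ends up sizing the fleet. Headline figures are from
# the text; bytes-per-synapse is an assumption matching the 3-5 PB state estimate.
SYNAPSES = 1.7e14
H100_FP32_FLOPS = 67e12              # CUDA-core FP32; tensor cores are unusable here
H100_HBM_BYTES = 80e9                # 80 GB of HBM per chip

print(f"Clock-driven, FLOP-limited: {1.4e19 / H100_FP32_FLOPS:,.0f} H100s")      # ~200,000
for flops in (3e17, 1e18):           # effective requirement after event-driven + FP16
    print(f"Event-driven, FLOP-limited: {flops / H100_FP32_FLOPS:,.0f} H100s")   # ~4,500-15,000

# The synapse state must stay resident whether or not it is being accessed.
for bytes_per_synapse in (20, 30):   # weight, plasticity traces, connectivity (assumed)
    state = SYNAPSES * bytes_per_synapse
    print(f"{bytes_per_synapse} B/synapse: {state / 1e15:.1f} PB of state "
          f"-> {state / H100_HBM_BYTES:,.0f} H100s just to hold it")
# The memory-capacity count dominates the FLOP-limited count, which is why the fleet is
# sized by HBM; pruning, compression, and offloading colder state (Appendix A.6) are
# what bring the headline figure toward the 10,000-40,000 range.
```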
Global compute can support hundreds to thousands of emulations
Global AI compute is growing rapidly. Based on the AI 2027 forecast (which draws on TSMC production capacity and chip efficiency trends), the global stock of AI-relevant compute in H100-equivalents is roughly 18M by end-2025, 40M by end-2026, and 100M by end-2027. Taking 20,000 H100-equivalents per emulation as the central estimate:
| Year | H100-equivalents | Max emulations (100%) | Max emulations (1%) |
| 2025 | 18M | 900 | 9 |
| 2026 | 40M | 2,000 | 20 |
| 2027 | 100M | 5,000 | 50 |
These counts assume the central 20,000 H100-per-emulation figure. At the pessimistic 40,000 they halve; at the optimistic 10,000 they double.
At 1% of global compute—already a generous allocation—you could run tens of emulations by 2027. This is more than the clock-driven estimate (which gave single-digit emulations at 1% allocation), but still small relative to what AI systems can deliver: the same 1% of compute could serve thousands to tens of thousands of concurrent AI model instances.
Emulations run at roughly human speed
Emulations could plausibly run at up to ~5× real-time with current interconnects—fast enough to be useful, but not fast enough to substitute for running many copies in parallel.
The constraint is sequential: neuron dynamics must be integrated every 0.1 ms, and all GPUs must synchronize before the next step begins. Running at N× real-time means completing all computation and communication within 100/N microseconds.
The 20,000-GPU fleet is pinned by memory capacity; you cannot add GPUs to relax the tick budget without changing the per-emulation cost. With that fleet each GPU holds about 4.3 million neurons, and integrating their compartmental dynamics takes roughly 11 μs per tick at FP16 throughput. At 5× real-time the 20 μs budget leaves room for communication and sparse synapse events; at 10× the 10 μs budget is already less than the compute-only cost per tick, so 10× real-time is unlikely at this fleet size. Beyond 100×, the budget drops to 1 μs—comparable to a single network round-trip—leaving no room for computation regardless of fleet size.
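A sketch of the tick-budget arithmetic; the per-neuron FLOP count is a hypothetical figure chosen to reproduce the ~11 μs estimate above, not a measured quantity.

```python
# Tick-budget sketch for the speedup ceiling. The FP16 throughput is taken as 2x the
# FP32 CUDA-core rate; FLOP_PER_NEURON_TICK is an assumed figure chosen to reproduce
# the ~11 us per-tick estimate above.
NEURONS = 86e9
FLEET = 20_000                       # GPUs, pinned by memory capacity
FLOP_PER_NEURON_TICK = 350           # assumed, 5-compartment HH update
H100_FP16_FLOPS = 134e12             # no tensor cores

neurons_per_gpu = NEURONS / FLEET                                       # ~4.3M
compute_us = neurons_per_gpu * FLOP_PER_NEURON_TICK / H100_FP16_FLOPS * 1e6
print(f"Neuron compute per 0.1 ms tick: ~{compute_us:.0f} us per GPU")  # ~11 us

for speedup in (1, 5, 10, 100):
    budget_us = 100 / speedup        # wall-clock budget per tick of biological time
    print(f"{speedup:>3}x real-time: {budget_us:7.1f} us budget, "
          f"{budget_us - compute_us:+7.1f} us left for synapse events and communication")
# At 5x there is ~9 us of slack per tick; at 10x the neuron compute alone overruns the budget.
```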
For comparison, the largest existing simulations run far below real-time: a 2024 simulation of 86 billion simplified neurons on 14,000 GPUs achieved roughly 1/120th real-time. Emulations would run at roughly human speed—faster by a small multiple, not by orders of magnitude.
The compute cost per unit of cognition for emulations is poor
One emulation requires on the order of 20,000 H100-equivalents, costing roughly $40,000 per hour of real-time operation at current cloud rates ($2/GPU-hour). Several framings all point in the same direction.
Comparison to human labor costs. A senior knowledge worker costs about $200/hour. One wall-clock hour of emulation costs $40,000; at 1–5× real-time that produces 1–5 person-hours of output, so the cost per person-hour is $8,000–$40,000—a 40–200× premium over hiring the same person. And the comparison is worse than this, because WBE requires AGI, and for AGI to be economically transformative it must provide cognitive work at costs not drastically exceeding human wages. Emulations would be tens to hundreds of times more expensive than either humans or contemporary AGIs.
Comparison to current frontier API pricing. Writing a 2,000-word memo requires about 3,000 output tokens. At current frontier model pricing ($10–30 per million output tokens), that costs a few cents. The emulation takes several hours at $40,000/hour. The ratio is on the order of 10⁶.
Comparison to concurrent model instances. The 20,000 H100s needed for one emulation could instead serve 1,250–2,500 concurrent instances of a frontier model (at 8–16 H100s per instance). Future models will be larger—if the leading models of 2028–2030 need 100+ GPUs per instance, this drops to roughly 200 concurrent instances. Still significant.
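A sketch pulling the three framings together; the token price and the memo duration are assumed midpoints of the ranges above.

```python
# Pulling the three cost framings together. Inputs are the rough figures above; the
# $20/M-token price and the 3-hour memo duration are assumed midpoints.
GPUS_PER_EM = 20_000
em_per_hour = GPUS_PER_EM * 2.0                       # ~$40,000 at $2/H100-hour

# 1. Versus human labor at 1-5x real-time.
for speedup in (1, 5):
    per_person_hour = em_per_hour / speedup
    print(f"{speedup}x: ${per_person_hour:,.0f}/person-hour "
          f"(~{per_person_hour / 200:.0f}x a $200/h knowledge worker)")

# 2. Versus frontier-API pricing for a 2,000-word memo (~3,000 output tokens).
api_cost = 3_000 * 20 / 1e6                           # ~$0.06
em_cost = 3 * em_per_hour                             # ~$120,000 for ~3 hours
print(f"Memo: ~${api_cost:.2f} via API vs ~${em_cost:,.0f} via emulation "
      f"(~{em_cost / api_cost:.0e}x)")

# 3. Versus concurrent frontier-model instances on the same hardware.
for gpus_per_instance in (8, 16, 100):
    print(f"{gpus_per_instance} GPUs/instance -> "
          f"{GPUS_PER_EM // gpus_per_instance:,} concurrent instances")
```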
These comparisons are not all apples-to-apples, and each has weaknesses. But the ratios are large—two to six orders of magnitude depending on the framing—and they make it hard to see how an emulation could be competitive with an AI on any task unless the task absolutely must be performed by a (virtual) human.
Emulations will not be competitive with AI for cognitive work
Distillation could make emulations substantially cheaper than the biophysical baseline, but the compression is uncertain and the case that it works for long-horizon behavior is weak. Even with aggressive compression, emulations are unlikely to be competitive with purpose-built AI for cognitive work; for almost any cognitive task, you can probably train a specialized AI that does it better and cheaper than any human, emulated or otherwise. Whatever value emulations provide has to come from specifically needing a human.
Distillation might compress emulations, but we do not know by how much
While we don't know how much compute is required to replicate the brain's cognitive work, Carlsmith (2020)'s central estimate of 10¹⁵ FLOP/s is three orders of magnitude below the biophysical simulation cost. This suggests—but does not guarantee—that further efficiency gains are possible.
Beyond the biophysical simplifications discussed in the previous section, more aggressive approaches would distill the emulation — or subcomponents of it — into neural networks. The cost depends on how much training data the teacher emulation must generate and how fast useful information can be extracted. If the brain decomposes cleanly into regions whose spike-level activity can be used as training signal, distillation could be fast and relatively cheap. If instead you are limited to observing behavioral output, the information rate is orders of magnitude lower and the cost scales accordingly.
Distilling a few seconds of emulation behavior is conceptually similar to ordinary ML distillation. It is much less clear how well this extends to faithful long-horizon behavior — how someone's thoughts, memories, and beliefs evolve over months or years. Capturing that may require long-run rollouts of the emulation, and if those rollouts need to span months or years of subjective time, the wall-clock cost and compute become very large.
Given all this, I was not able to usefully constrain how effective distillation could be at producing substantially cheaper emulations. My guess is that short-horizon behavior — the next few seconds of response — could probably be distilled at costs that are not huge relative to the initial cost of creating the emulation. I am much less confident that you can produce distilled emulations that faithfully replicate long-horizon human behavior in any reasonable time.
Even setting cost aside, it is unclear whether distillation preserves enough fidelity to serve the purposes that make emulations distinctive in the first place. And the distillation process itself may raise serious moral concerns — generating training data means running an emulation of a person through potentially years of subjective experience.
Emulations will not be economically competitive with AI
Emulations — including distilled emulations — are unlikely to ever be competitive with purpose-built AI for cognitive work. The human brain is not optimized for computational efficiency: evolution optimized under severe physical constraints, the available learning algorithms are limited by what chemistry and myelin permit, much of the brain's compute goes to functions irrelevant for abstract cognition, and the brain is overparameterized relative to its training data. A distilled emulation that faithfully reproduces human cognitive patterns inherits these inefficiencies.
As argued in Part 1, emulations probably postdate AGI. AI progress has been rapid over the last few years, and it seems unlikely to stall out at approximately the point where AI allows us to build emulations. Given this, and given that the human brain is unlikely to be close to the floor for cognitive labor, human emulations — including distilled emulations — would probably be severely outclassed by purpose-built AI. Whatever the case for emulations is, it has to rest on something that specifically requires a human — a specific person's identity, their values, their way of reflecting — rather than generic cognitive labor.
The proposed uses for emulations are weak
Any proposed use for whole brain emulation must derive from properties that distinguish an emulation from a de novo AI system. Potentially distinctive properties are:
1. It is a specific person whose identity and expertise carries over
2. It has human values by construction
3. It has known and bounded cognitive properties
4. It may be trusted by humans for social or political reasons independent of its technical properties
I don’t find (3) and (4) compelling: bounded and predictable cognition is far easier to engineer deliberately into a purpose-built system than to inherit from a brain emulation, and the marginal political legitimacy of adding expensive emulations to oversight structures seems small when biological humans with AI-assisted tools are available. We thus focus exclusively on uses deriving from properties (1) and (2), and ask whether they justify a role during the initial transition to AGI and the development of superintelligence.
The economics of scaling trusted human cognition don't work out
The most commonly proposed use is to copy trusted individuals and run them at digital speeds during a critical period. As established above, this is implausible for several reasons: there is a substantial lag after AGI before WBE is available, each emulation is extremely expensive to run, and the achievable speedup relative to biological humans is modest. Both AI labor and biological human labor are orders of magnitude cheaper per unit of cognition.
Even setting aside cost, there are practical reasons to be skeptical. The scanning process requires killing the person (at least temporarily). Even assuming someone is willing to accept that, getting useful work from the emulation requires high confidence in its fidelity. Drugs and minor genetic variations substantially alter cognition and personality, so functionalization must be accurate enough to avoid comparable distortions over the timescales required for the emulation to do useful work. It seems quite possible that uploading someone simply produces someone who is effectively drunk or insane.
I also question the general premise that a human emulation would be more trustworthy than an AI system. We have much greater visibility into the internal states of LLMs and will have invested far greater effort into designing, aligning, and testing them. It is not obvious that humans are as trustworthy, and even less obvious that even a perfect human emulation would remain trustworthy in the extremely unusual circumstances they would encounter.
Augmenting emulations to produce aligned superintelligence is unlikely to be competitive
A more ambitious proposal is to start from a faithful emulation and progressively augment it toward superhuman capability, producing a system that is both powerful and human-aligned because it began as a human.
The timeline problem applies: WBE almost certainly postdates at least weak forms of AGI. For this strategy to work, everyone would need to refrain from improving de novo AI and instead pursue the WBE path, despite the AI systems being substantially better and easier to improve. This looks unlikely absent a global moratorium on further AI development.
I'm also skeptical that an emulated person is a better starting point than a synthetic AI for creating aligned superintelligence. Minor biochemical perturbations already substantially alter personality and cognition; making someone substantially smarter is a far larger intervention. Interpreting changes and their effects seems really hard, probably harder than for a synthetic system whose architecture you control.
A reference implementation of human values is probably unnecessary
A less ambitious version of the same intuition — that emulations have human values by construction — is to use an emulation not as a starting point for superintelligence but simply as a specimen: a working implementation of human values that can be studied to inform alignment of de novo AI. The economic constraints are less binding since this in theory requires fewer emulations, though if human values vary enough that a single emulation is unrepresentative, you need to destructively upload proportionally more people.
The timing problem remains: ems probably postdate AGI by enough that they are unlikely to produce reference values in time for the decisions that shape the transition. Getting a sense of how humans would perform under long-reflection is particularly rough — it is the hardest case to emulate, the hardest case to speed up, and the hardest case to validate.
I am also skeptical that a reference implementation of human values from an em would be necessary or transformative. LLMs trained on the entirety of human writing already encode a rich model of human values and moral reasoning. Biological humans can be interviewed, surveyed, and studied in controlled experiments without killing them. To the extent that neuroscience yields alignment-relevant insights about reward circuitry or value formation, these are available from the research program long before it produces a completed emulation. While emulations might add marginal value, or be desirable in the longer run as improved instantiations of human value, I'd be surprised if they were strongly valuable in the initial phases of an AI transition.
Pre-AGI WBE investment is probably not cost-effective
This section provides a rough framework for evaluating the expected value of pre-AGI WBE investment. It’s deliberately very stylized.
As we have seen, emulations are probably significantly outclassed by synthetic AIs. Given the likely time-lag between AGI and whole brain emulations, I don't see how emulations could be relevant in a laissez faire scenario. Instead, I think the most bullish case for investment in whole brain emulation comes from a case where the dangers of further AGI development are recognized and a global moratorium is in place, allowing emulations to be produced and to be used sufficiently to make a difference.
The expected value of pre-AGI investment therefore decomposes into three questions:
- How much does pre-AGI spending accelerate the post-AGI WBE timeline?
- How much does that acceleration matter, given a moratorium?
- How likely is a moratorium world in which emulations matter?
How much does pre-AGI spending accelerate the post-AGI WBE timeline?
The binding post-AGI bottleneck is functionalization, which requires near-AGI capabilities and cannot meaningfully be pre-done. Other steps (mouse scanning, scope manufacturing) run in parallel with functionalization and are not on the critical path. Pre-AGI investment can contribute through infrastructure, preliminary data, and methodological progress — but a post-AGI AI can do vastly more, and most pre-AGI intellectual output gets superseded. The question is how much of the entire field's pre-AGI output survives to compress the post-AGI timeline, and what fraction of that your investment represents.
| AGI in... | Total pre-AGI WBE spend | $100M as fraction | Months saved by entire field | Your compression |
| 3 years | $100M | 50% | 0.5 | 0.25 mo |
| 5 years | $250M | 30% | 1 | 0.3 mo |
| 10 years | $600M | 15% | 2 | 0.3 mo |
| 20 years | $3B | 3% | 3 | 0.1 mo |
The expected compression is roughly flat at around 0.2–0.3 months, because opposing effects roughly cancel: longer timelines give the field more time to produce useful output, but your fraction of total spending shrinks and counterfactual funding increases.
How much does speeding up emulations matter?
If the moratorium is stable, the delay doesn't matter — ems arrive eventually. If the moratorium is very unstable, the delay also doesn't matter — it collapses before ems are ready regardless. Speeding up emulation thus only matters if there’s a substantial risk that the moratorium collapses within the counterfactual time accelerated by the pre-AGI work.
To get a crude sense of this, imagine the moratorium has a constant hazard rate h of collapsing each month, and ems take T = 12 months to build. The value of arriving Δ months earlier is the probability the moratorium would have survived to month T but collapsed in the next Δ months:
$$
V(h) = e^{-hT} \cdot h\Delta
$$
This is maximized at h = 1/T = 1/12, i.e. when the expected moratorium length equals the build time. At this optimum, V* = Δ/(eT) = 0.3/(2.718 × 12) ≈ 0.9%. So even at the best-case hazard rate, 0.3 months of acceleration increases the probability of ems arriving before the moratorium collapses by less than 1%. Given that the timing is unlikely to work out so that the value of the speed-up is maximized, a 0.3% increase in that probability is probably a more reasonable estimate.
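For concreteness, the same calculation in code, sweeping over a few expected moratorium lengths:

```python
# Hazard-rate model from above: V(h) = exp(-h*T) * h * delta, the probability the
# moratorium survives to month T but would collapse within the next delta months.
import math

T = 12          # months to build emulations post-AGI
delta = 0.3     # months of acceleration from pre-AGI investment

def value(h):
    return math.exp(-h * T) * h * delta

print(f"Optimum (h = 1/T): {value(1 / T):.2%}")                  # ~0.9%
for months in (36, 12, 6):                                       # expected moratorium length
    print(f"Expected moratorium of {months} months: V = {value(1 / months):.2%}")
```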
How likely is a moratorium, and how much is at stake?
Being in the moratorium world requires that emulations are useful in principle (copies of specific trusted humans provide a safety benefit not available more cheaply) and that the situation is serious enough, and coordination strong enough, to sustain a moratorium. A rough decomposition:
- 10% that emulations are useful in principle
- 10% that the geopolitical situation permits a moratorium conditional on this
This gives roughly a 1% chance that we are in the "ems are useful and the world coordinates on a moratorium" scenario.
We also need to estimate how valuable it is, conditional on this, to get the ems before the moratorium breaks down. This is very underconstrained, but if we condition on worlds in which ems are very valuable for x-risk reduction, then a gap of 1,000 bps seems reasonable and generous.
Putting this together, we find that an investment of $100M provides a
$$
1\% \times 0.3\% \times 1000\ \text{bp} = 0.03\ \text{bp}
$$
benefit, for a bp/bn of 0.3. A $10M investment could potentially be above the bar, while a $1B investment looks considerably below the bar:
| Investment | Compression | bp/bn |
| $10M | 0.15 mo | 1.5 |
| $100M | 0.3 mo | 0.3 |
| $1B | 0.5 mo | 0.05 |
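The table is generated by the following sketch; the only added assumption is that the value of compression scales linearly with months saved.

```python
# End-to-end BOTEC behind the table above. The value-per-month figure scales the ~0.3%
# estimate linearly with compression (an assumption).
P_EMS_USEFUL = 0.10           # emulations useful in principle
P_MORATORIUM = 0.10           # sustained moratorium, conditional on usefulness
STAKES_BP = 1_000             # x-risk reduction at stake if ems arrive in time
VALUE_PER_MONTH = 0.01        # ~0.3% chance of mattering per 0.3 months of compression

for investment_bn, compression_mo in ((0.01, 0.15), (0.1, 0.3), (1.0, 0.5)):
    benefit_bp = P_EMS_USEFUL * P_MORATORIUM * (VALUE_PER_MONTH * compression_mo) * STAKES_BP
    print(f"${investment_bn * 1000:>6,.0f}M: {compression_mo} mo compression -> "
          f"{benefit_bp:.3f} bp -> {benefit_bp / investment_bn:.2f} bp/bn")
```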
Appendices by Claude
These estimates are rough BOTECs produced with Claude's assistance.
A.1 Wet-lab and manufacturing cost estimates
The estimates below are a rough reconstruction of the wet-lab pipeline. See also SOBE's Guesstimator and their cost spreadsheet for an alternative bottom-up cost model.
Tissue processing pipeline
The brain is divided into roughly 1.2 million blocks of 1 mm³. Each block goes through the following pipeline:
| Step | Wall-clock time | Hands-on time per block | Notes |
| Slicing | ~5 min | ~5 min | Cutting block from fixed brain |
| Hydrogel embedding | ~12 hours | ~15 min | Mostly waiting for polymerization |
| Denaturing | ~4 hours | ~10 min | Chemical incubation |
| Expansion | ~2 hours | ~10 min | Swelling in water |
| Re-embedding and second expansion | ~12 hours | ~15 min | Repeat for higher expansion factor |
| Antibody staining (per round) | ~4 hours | ~15 min | Incubation; ~10 min hands-on for application and washing |
| Mounting and imaging | ~6.5 hours (LICONN) | ~15 min | Hands-on time is loading and alignment; imaging is automated |
Total wall-clock time per block: roughly 4–5 days, dominated by incubation and polymerization steps. Total hands-on time per block: roughly 1.5 hours for a single staining round, or roughly 2.5 hours with 3–4 staining rounds for a full receptor panel.
Labor for a one-year timeline
- 1.2 million blocks × 1.5–2.5 hours hands-on = 1.8–3.0 million person-hours
- At 2,000 hours per FTE-year: 900–1,500 FTEs
- With overhead for QC, troubleshooting, logistics, supervision: probably 1,000–2,000 FTEs
These FTE counts are a nominal unit. In practice much of the work would be done by AI and robotics, but even if human labor is fully displaced the dollar cost is probably comparable to employing a similar number of humans — on the order of billions of dollars' worth of bespoke robotics, as a crude estimate.
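The labor arithmetic, for reference:

```python
# Labor BOTEC from the per-block hands-on times above.
BLOCKS = 1.2e6
HOURS_PER_FTE_YEAR = 2_000

for label, hands_on_h in (("single staining round", 1.5), ("3-4 staining rounds", 2.5)):
    person_hours = BLOCKS * hands_on_h
    print(f"{label}: {person_hours / 1e6:.1f}M person-hours -> "
          f"~{person_hours / HOURS_PER_FTE_YEAR:,.0f} FTEs for a one-year scan")
# ~900-1,500 FTEs before overhead; ~1,000-2,000 with QC, troubleshooting, and supervision.
```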
Physical space
The slowest step is hydrogel embedding through second expansion, taking roughly 2 days. To process 1.2 million blocks in one year, roughly 6,500 blocks must be in the pipeline simultaneously. Each expanded block is roughly 16 cm on a side (~4 L), so the total gel processing volume is roughly 25–30 m³. This is a large wet lab — perhaps 10–20 rooms of racks — but not extraordinary.
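And the in-flight volume arithmetic:

```python
# In-flight volume for the one-year pipeline (figures from the text).
BLOCKS = 1.2e6
RESIDENCE_DAYS = 2                   # hydrogel embedding through second expansion
EXPANDED_BLOCK_LITRES = 4            # ~16 cm per side after expansion

in_flight = BLOCKS * RESIDENCE_DAYS / 365                    # ~6,500 blocks
print(f"~{in_flight:,.0f} blocks in flight -> "
      f"~{in_flight * EXPANDED_BLOCK_LITRES / 1000:.0f} m^3 of gel in process")
```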
Scope manufacturing
Current global production of research light-sheet and confocal microscopes is a few hundred per year. ExA-SPIM-class systems are relatively simple instruments: a laser, light-sheet illumination optics, a camera, and a positioning stage.
- Component cost per scope: $50–100K (at scale, not research pricing)
- 16,000 scopes × $50–100K = $0.8–1.6B in components
- Assembly is parallelizable: a larger factory or multiple production lines can compress the timeline arbitrarily. The binding constraint on manufacturing speed is probably component supply chains (cameras, lasers, precision stages), not assembly labor.
- Assembly labor scales with timeline: perhaps 500 FTEs for a one-year build, or 2,000 FTEs for a three-month build.
Summary
| Category | Cost | Labor | Timeline |
| Scope components | $0.8–1.6B | 500–2,000 FTEs | Parallelizable; supply chains are the constraint |
| Tissue processing (1-year scan) | Consumables TBD | 1,000–2,000 FTEs | 1 year |
| Total | ~$1–2B + consumables | ~1,500–4,000 FTEs peak | Parallelizable |
Note: these estimates assume a one-year scanning timeline with ExA-SPIM-class optics at LICONN-class resolution (16,000 scope-years). A faster timeline requires proportionally more scopes, labor, and physical space.
A.2 Data processing cost estimates
Storage
A whole-brain scan at 10 nm effective resolution generates on the order of a zettabyte of raw imaging data. Even with 100× compression (plausible given the redundancy in volumetric imaging), that is 10 exabytes of stored data. At current cloud storage pricing (~$20/TB/month for standard storage), storing 10 EB for one year costs roughly $2.4B. On-premises storage would be cheaper — perhaps $100M–$500M in hardware for this capacity — but still substantial.
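A sketch of the storage arithmetic; the on-premises $/TB figures are assumptions consistent with the hardware range above.

```python
# Storage cost sketch. The on-prem $/TB figures are assumptions chosen to span the
# $100M-$500M hardware range above.
RAW_BYTES = 1e21                     # ~1 ZB of raw imagery
COMPRESSION = 100
stored_tb = RAW_BYTES / COMPRESSION / 1e12                   # 10 EB = 1e7 TB

cloud_year = stored_tb * 20 * 12                             # $20/TB/month
print(f"Cloud, one year: ~${cloud_year / 1e9:.1f}B")          # ~$2.4B
for per_tb in (10, 50):
    print(f"On-prem hardware at ${per_tb}/TB: ~${stored_tb * per_tb / 1e6:,.0f}M")
```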
Segmentation compute
The segmentation task is to trace every neuron through the imaged volume — determining which voxels belong to which neuron and identifying synapses. Modern approaches use 3D convolutional neural networks that process the volume in overlapping patches.
A human brain at 10 nm isotropic resolution contains roughly 10²¹ voxels. The compute cost per voxel depends on the segmentation method:
- Flood-filling networks (used for Shapson-Coe): most expensive, as they iterate outward from seed points. Roughly 10⁵ FLOPs per voxel.
- Affinity-based methods with local shape descriptors: roughly 100× cheaper than flood-filling networks at comparable accuracy, per Sheridan et al. (2022). Roughly 10³ FLOPs per voxel.
Using the cheaper method as a reasonable near-future estimate:
- Total FLOPs: 10²¹ voxels × 10³ FLOPs/voxel = 10²⁴ FLOPs
- An H100 GPU sustains roughly 10¹⁵ FLOP/s for tensor-core FP16 inference (the relevant throughput for dense segmentation workloads), or roughly 3 × 10²² FLOPs per year
- Total: roughly 30 H100-years, or up to ~300 allowing an order of magnitude for overheads
- At $2/GPU-hour: roughly $0.5M–$5M
This is small relative to scope capex. Even if these estimates are off by an order of magnitude, segmentation compute is unlikely to be the binding constraint.
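The underlying arithmetic, using the cheaper affinity-based method:

```python
# Segmentation compute with the cheaper affinity-based method (figures from the text).
VOXELS = 1e21
FLOPS_PER_VOXEL = 1e3
H100_FLOPS = 1e15                    # sustained tensor-core FP16 inference
HOURS_PER_YEAR = 8766

total_flops = VOXELS * FLOPS_PER_VOXEL                       # ~1e24 FLOP
h100_years = total_flops / (H100_FLOPS * HOURS_PER_YEAR * 3600)
print(f"{total_flops:.0e} FLOP -> ~{h100_years:.0f} H100-years "
      f"-> ~${h100_years * HOURS_PER_YEAR * 2 / 1e6:.1f}M at $2/GPU-hour")
# ~30 H100-years and ~$0.6M; even an order of magnitude of overhead keeps this far
# below scope capex.
```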
Stitching and registration
The 1.2 million tissue blocks must be aligned into a coherent whole-brain volume, correcting for expansion-induced distortions. This is computationally intensive but probably cheaper than segmentation. No good public estimates exist for this at whole-brain scale.
A.3 Experimental determination of neuronal update rules
The central uncertainty in this appendix is how many measurements are needed to characterize the neuronal update rules. We do not attempt to resolve this. Instead, we estimate the cost and time as a function of the number of measurements, so that the reader can plug in their own estimate.
A "measurement" means: grow a neuron of known type, deliver a defined stimulus in a defined neuromodulatory context, let it run for a defined duration, fix it, and read out its molecular state at high resolution. Because the readout is destructive, each measurement yields one temporal endpoint.
Per-measurement costs
Three things go into each measurement:
Cell culture. Neurons are differentiated from iPSCs in multi-well plates (384-well) using automated liquid handling. Maturation to a state with functional synapses and plasticity takes 1–3 months depending on neuron type. Per-well consumable costs (media, growth factors, small molecules) are roughly $10–30.
Stimulation and fixation. Optogenetic or electrical stimulation, followed by chemical fixation at a defined timepoint. Cheap per unit; the cost is in the automation infrastructure.
Molecular readout. Multiplexed immunostaining or targeted ExM of a small number of neurons per well, read out on a fluorescence microscope. Depending on depth of readout, roughly 15–60 minutes of scope time per measurement.
At scale, the dominant marginal costs are consumables and imaging time. A rough all-in marginal cost per measurement is $30–100, trending lower at higher volumes.
Infrastructure scaling
Cell culture throughput. One automated incubator + liquid handling system manages roughly 100 plates (38,000 wells) and costs $1–5M. With 2-month maturation cycles, that is roughly 200,000 wells per year per system.
Imaging throughput. One scope doing 30 min per measurement yields roughly 17,000 measurements per year. Scopes suitable for this task (spinning-disk confocal or similar) cost $200–500K.
At any given scale, either cell culture or imaging will be the bottleneck. The crossover is roughly one culture system per 12 scopes.
Scaling table
| Measurements | Scopes | Culture systems | Capex | Consumables/yr | Total ~1yr cost |
| 10⁴ | 1 | 1 | $2–5M | $0.5M | ~$5M |
| 10⁶ | 60 | 5 | $20–40M | $50M | ~$100M |
| 10⁷ | 600 | 50 | $200–400M | $500M | ~$0.5–1B |
| 10⁸ | 6,000 | 500 | $2–4B | $3–5B | ~$5–10B |
| 10¹⁰ | 600,000 | 50,000 | $200–400B | — | $0.5–1T |
All rows assume a roughly one-year campaign. Faster timelines require proportionally more infrastructure.
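A small generator for the table above; unit costs are assumed midpoints of the ranges quoted in this appendix.

```python
# Generator for the scaling table above. Unit costs are assumed midpoints of the
# ranges quoted in this appendix.
MEAS_PER_SCOPE_YEAR = 17_000         # ~30 min of scope time per measurement
WELLS_PER_CULTURE_SYSTEM_YEAR = 200_000
SCOPE_COST = 350e3                   # midpoint of $200-500K
CULTURE_SYSTEM_COST = 3e6            # midpoint of $1-5M
CONSUMABLES_PER_MEAS = 50            # midpoint of $30-100

for n in (1e4, 1e6, 1e7, 1e8):
    scopes = max(1, round(n / MEAS_PER_SCOPE_YEAR))
    culture = max(1, round(n / WELLS_PER_CULTURE_SYSTEM_YEAR))
    capex = scopes * SCOPE_COST + culture * CULTURE_SYSTEM_COST
    print(f"{n:.0e} measurements: {scopes:,} scopes, {culture:,} culture systems, "
          f"capex ~${capex / 1e6:,.0f}M, consumables ~${n * CONSUMABLES_PER_MEAS / 1e6:,.0f}M")
```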
Comparison to scanning costs
The brain scan itself costs roughly $4B for a one-year timeline (see main text). The experimental program becomes comparable in cost somewhere around 10⁷ measurements, and exceeds it by roughly an order of magnitude at 10⁸. Below 10⁷, the experimental program is a small fraction of total WBE cost.
Serial time bottleneck
Even with unlimited money and equipment, each batch of neurons takes 1–3 months to mature before experiments can begin. This maturation time is an irreducible physical constraint and sets a hard floor on the serial timeline of the experimental program.
How many measurements might you actually need?
We leave this as an open question, but note a few anchors:
- Classical plasticity rules (e.g. STDP timing curves for one synapse type) were characterized from O(10²) recordings.
- Detailed biophysical models of individual signaling pathways have been fit from O(10³–10⁴) measurements.
- Nobody has attempted to learn full update rules for even one neuron type.
- The number of transcriptomically distinct neuron types in the human brain is O(100–1,000).
If the update rules decompose neatly by neuron type and the biology is modular enough that each type can be characterized independently, 10⁶ might suffice. If the rules are highly context-dependent, or if there are important interaction effects between neuron types that can only be studied in multi-type preparations, the requirement could push toward 10⁸ or beyond.
A.4 Simulation-based determination of neuronal update rules
This appendix mirrors the experimental appendix. The same question applies: how many observations are needed to characterize the update rules? Simulation provides these computationally rather than physically.
A "virtual measurement" means: simulate a neuron compartment (typically a dendritic spine) of known type, with defined molecular composition and inputs, for a plasticity-relevant duration, using a particle-based reaction-diffusion model (MCell/STEPS level). All parameters at this level are physical observables — binding affinities, rate constants, diffusion coefficients, copy numbers — each determinable from biophysical experiments or molecular simulation. There are no irreducible fudge parameters.
Unlike physical experiments, the readout is non-destructive: each simulation yields a complete trajectory (every molecule's state at every timestep), not a single temporal endpoint. This is a genuine information multiplier — perhaps 10–100× fewer simulations are needed than physical experiments to reconstruct the same dynamics.
Cost per virtual measurement
The best available benchmark is Bartol et al. (2025): an MCell4 simulation of CaMKII autophosphorylation in a single dendritic spine, tracking ~1,000 molecules across ~140 parameters in EM-reconstructed 3D geometry. Ten seconds of biological time cost roughly 10 node-days on a single HPC node, or approximately 2,400 CPU-core-hours.
Scaling to a plasticity-relevant virtual measurement requires two multipliers. First, timescale: compute scales linearly with biological duration, and the relevant timescales are minutes to hours, giving a factor of ~100–1,000×. Second, model complexity: the benchmark covers only one signaling cascade (Ca²⁺/CaM/CaMKII/PP1); a reasonably complete plasticity model adds perhaps 10–100× more compute.
Combined, one virtual measurement costs roughly 10⁶–10⁸ CPU-core-hours, or equivalently roughly 10⁴–10⁶ H100-GPU-hours if the algorithms were ported to GPUs (feasible but not yet done for these codes). In FLOP terms, this is roughly 10²²–10²⁴ FLOP per measurement. At current cloud pricing (~$2/GPU-hour), that corresponds to roughly $20K–$2M per measurement on GPUs, or 10–100× more on CPUs.
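A minimal sketch of the scaling arithmetic, anchored on the Bartol et al. benchmark. The two multipliers are the assumed ranges from the text; the ~100 CPU-core-hours per H100-hour conversion is an assumed figure implied by the text's order-of-magnitude rounding:

```python
# Scale the MCell4 spine benchmark to a plasticity-relevant virtual measurement.
# Benchmark and multipliers are the rough figures quoted in the text, not new data.
BENCHMARK_CPU_CORE_HOURS = 2_400        # ~10 node-days for 10 s of biological time
TIMESCALE_MULTIPLIER     = (100, 1_000) # minutes-to-hours vs. the 10 s benchmark
COMPLEXITY_MULTIPLIER    = (10, 100)    # full plasticity model vs. one signaling cascade
CPU_CORE_HOURS_PER_GPU_HOUR = 100       # assumed GPU-port efficiency
USD_PER_GPU_HOUR         = 2.0

lo = BENCHMARK_CPU_CORE_HOURS * TIMESCALE_MULTIPLIER[0] * COMPLEXITY_MULTIPLIER[0]
hi = BENCHMARK_CPU_CORE_HOURS * TIMESCALE_MULTIPLIER[1] * COMPLEXITY_MULTIPLIER[1]
gpu_lo, gpu_hi = lo / CPU_CORE_HOURS_PER_GPU_HOUR, hi / CPU_CORE_HOURS_PER_GPU_HOUR

# The text rounds these to order of magnitude: ~1e6-1e8 core-hours, ~1e4-1e6 GPU-hours.
print(f"CPU core-hours per measurement: {lo:.1e} - {hi:.1e}")
print(f"H100-hours per measurement:     {gpu_lo:.1e} - {gpu_hi:.1e}")
print(f"Cost per measurement:           ${gpu_lo*USD_PER_GPU_HOUR:,.0f} - ${gpu_hi*USD_PER_GPU_HOUR:,.0f}")
```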
Scaling table
Using the GPU-optimized estimate:
| Measurements | GPU-hours (H100) | FLOP | Cost |
| 10⁴ | 10⁸–10¹⁰ | 10²⁶–10²⁸ | $200M–$20B |
| 10⁵ | 10⁹–10¹¹ | 10²⁷–10²⁹ | $2B–$200B |
| 10⁶ | 10¹⁰–10¹² | 10²⁸–10³⁰ | $20B–$2T |
Using current CPU methods, costs are 10–100× higher at each row.
Comparison to physical experiments
Per-measurement, simulation at current algorithms is roughly 1,000–10,000× more expensive than physical experiment ($30–100 per physical measurement). This is partially offset by the information multiplier from non-destructive readout (perhaps 10–100×), but simulation is still probably 10–1,000× more expensive per unit of information at current methods.
However, there is a key asymmetry between simulation and experiment in how they respond to intelligence. For physical experiments, a superintelligent AI can reduce which measurements you need (by choosing maximally informative experiments), but it cannot change the per-measurement cost — the physical pipeline of growing, stimulating, fixing, and imaging neurons is what it is. For simulation, a superintelligent AI can do both: reduce the number of measurements and reduce the cost per measurement, because the simulation algorithms themselves are an additional lever with no analogue in physical experiment.
The algorithmic headroom is potentially large. Current methods track every molecule at microsecond timesteps for hours of biological time. If the update rules are encoded in a much smaller number of effective degrees of freedom — which is plausible given how many biological phenomena admit useful coarse-grained descriptions — then adaptive multiscale methods, provably accurate coarse-graining, and analytical shortcuts for well-mixed subcompartments could reduce the per-measurement cost by orders of magnitude. The physics sets a floor, but that floor may be far below the cost of current brute-force particle tracking.
This means that the relative advantage of simulation over experiment grows with AI capability. At current algorithms, experiment is cheaper. With sufficiently good algorithms, the balance could reverse.
Comparison to scanning costs
The brain scan costs roughly $4B. At current algorithms, simulation-based determination of the update rules is more expensive than physical experiment for any plausible number of measurements. With substantial algorithmic improvement, it could become competitive.
A.5 Acceleration of AI research from WBE research
This appendix estimates the expected acceleration of AI timelines from a $100M pre-AGI investment in WBE using an outside-view approach based on historical rates of knowledge transfer from neuroscience to AI. The expected acceleration decomposes as:
$$
\text{Acceleration} \approx F_1 \times F_2 \times (\text{AI timeline})
$$
where F₁ is neuroscience's marginal contribution to AI progress and F₂ is the share of cumulative AI-relevant neuroscience spending that the investment represents.
Factor 1: Neuroscience's marginal contribution to AI progress
The table below classifies major deep learning innovations by neuroscience influence. "Strong" means neuroscience was a direct causal input. "Weak" means a connection exists but the innovation plausibly would have arisen from engineering or mathematics alone. "None" means no meaningful neuroscience input. While the foundational concept of neural networks came from biology, the current deep learning paradigm owes little to neuroscience. Even among the "weak" connections, the causal role is questionable: TD learning was derived from behavioral psychology and dynamic programming; Schultz's dopamine finding (1997) confirmed the algorithm nearly a decade after Sutton developed it. Gershman (2024) argues that across the standard examples, ideas have flowed predominantly from AI to neuroscience rather than the reverse, and that even CNNs — the strongest case — assumed biologically implausible weight sharing from the outset.
| Strong | Weak | None |
| Neural network concept (1958); CNNs (1989) | TD learning (1988); RNNs (1990); Experience replay (1992); ReLU (2011); Attention mechanism (2014) | Backpropagation (1986); LSTMs (1997); Dropout (2012); Word embeddings (2013); GANs (2014); Adam optimizer (2014); Batch normalization (2015); Residual connections (2015); Transformers (2017); Mixture of experts (2017); Scaling laws (2020); Diffusion models (2020); RLHF (2022) |
The table above captures paradigm-level innovations. More mundane drivers of algorithmic progress over the past decade (better optimizers, positional encodings, activation functions, data curation, compute-optimal training recipes, mixture-of-experts routing) are completely divorced from neuroscience.
As another comparator, global AI R&D spending is on the order of $100–200B/year; computation-relevant basic neuroscience is $2–4B/year. If neuroscience were contributing to AI progress at a rate substantially above its spending share, the AI industry would be massively misallocating resources by not redirecting tens of billions toward neuroscience. One could object that the AI industry under-invests in speculative basic research due to short time horizons, but the magnitude of the implied misallocation is large enough that this explanation has to do a lot of work.
I think the evidence strongly suggests that the forward-looking marginal contribution of neuroscience to AI progress is low: previous AI breakthroughs suggest roughly 20% if we generously count the entire history, more like 5% over the past 15 years, and a negligible contribution over the last 5 years. R&D spending shares suggest ≲2%. Overall I estimate a mean speed-up contribution of 3%, with a modal estimate of 0%.
The above analysis assumes neuroscience contributes to AI progress continuously and marginally. A different model is that there is some chance of a discrete breakthrough: we are missing something fundamental about how the brain computes, the current deep learning paradigm eventually hits a wall, and the missing insight turns out to be discoverable fastest (or only) through neuroscience. If we think there’s a 20% chance of such a discovery per decade and that this accelerates progress 5 years, then the expected acceleration is 10%. This seems generous to me.
Factor 2: WBE's share of AI-relevant neuroscience
The relevant denominator is annual global spending on computation-relevant basic neuroscience: systems neuroscience, computational neuroscience, connectomics, and cellular biophysics. This excludes the large majority of the ~$30B neuroscience enterprise that is clinical or disease-oriented. We estimate this at roughly $1–2B/year. Note that the AI timeline drops out of the calculation. Longer timelines mean more cumulative neuroscience spending to dilute the $100M investment, but also more total AI progress for neuroscience to contribute to; these exactly cancel. The acceleration simplifies to:
$$
\text{Acceleration} \approx F_1 \times (\$100\text{M} / \text{annual relevant neuroscience spend})\ \text{years}
$$
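A minimal sketch of this calculation, reproducing the overall-estimate table below. The low/central/high inputs are the assumptions stated in this appendix, and the code assumes the timeline-cancellation argument above (so the result is an absolute acceleration):

```python
# Expected AI acceleration from a $100M pre-AGI WBE investment (outside-view BOTEC).
# F1 and the spending denominator are the low/central/high assumptions from the text.
INVESTMENT = 100e6  # dollars

cases = {
    "low":     {"f1": 0.01, "annual_relevant_neuro_spend": 2.0e9},
    "central": {"f1": 0.03, "annual_relevant_neuro_spend": 1.5e9},
    "high":    {"f1": 0.10, "annual_relevant_neuro_spend": 1.0e9},
}

for name, c in cases.items():
    f2 = INVESTMENT / c["annual_relevant_neuro_spend"]  # investment as a share of one year's spend
    acceleration_years = c["f1"] * f2                    # the AI timeline cancels out (see text)
    print(f"{name:7s}: F2 = {f2:.0%}, acceleration = {acceleration_years * 365:.1f} days")
```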
Overall estimate
Combining these estimates together:
| | Low | Central | High |
| F₁: Neuroscience share of AI progress | 1% | 3% | 10% |
| F₂: $100M / annual relevant neuro spend | 5% | 7% | 10% |
| Product | 0.05% | 0.21% | 1% |
| Acceleration | 0.2 days | 0.8 days | 4 days |
The expected acceleration is roughly on the order of a day, which is essentially negligible.
A.6 Compute requirements per emulation
Baseline: the SOBE moderate model
SOBE (2025) analyzes two reference scenarios for real-time human brain emulation, both excluding long-term plasticity. We use the more biophysically realistic of the two (five-compartment Hodgkin-Huxley neurons with Tsodyks-Markram synapses) as the reference case. A human brain has 86 billion neurons and 170 trillion synapses, simulated at 10,000 timesteps per second (0.1 ms timestep).
| Component | Per-element cost | Count | Total | Share |
| Neurons (5-comp HH) | 3.45 × 10⁶ FLOP/s | 8.6 × 10¹⁰ | 3.0 × 10¹⁷ FLOP/s | 2% |
| Synapses (Tsodyks-Markram) | 8.0 × 10⁴ FLOP/s | 1.7 × 10¹⁴ | 1.4 × 10¹⁹ FLOP/s | 98% |
| Total | | | 1.4 × 10¹⁹ FLOP/s (14 EFLOP/s) | 100% |
Per-element FLOP/s costs are from the SOBE 2025 data repository. Synapse processing dominates at 98% of compute. Any technique that reduces synapse cost moves the total proportionally; neuron-only reductions are capped at 2%.
Brain simulation cannot use tensor cores (specialized for dense matrix multiply); neural simulation is sparse, irregular, and memory-bound. The relevant H100 throughput is 67 TFLOP/s (FP32 CUDA cores), or roughly 130 TFLOP/s at FP16. SOBE compares the moderate model to xAI's Colossus cluster using the tensor-core peak of roughly 100 EFLOP/s, but brain simulation uses CUDA cores, not tensor cores.
Uncompressed hardware mapping (FP32):
| Constraint | H100-equivalents |
| Compute | 209,000 |
| Memory capacity (2.7 PB at 80 GB/GPU) | 34,000 |
| Memory bandwidth (raw ~27 EB/s at 3 TB/s/GPU) | ~9,000,000 |
In a clock-driven simulation, bandwidth is the binding constraint by a wide margin. Every synapse must be read and updated every tick (10,000 Hz × 16 B × 170T synapses ≈ 27 EB/s for reads alone). Even a 10× locality discount leaves the bandwidth-driven GPU count at 900,000, dwarfing the compute-derived 209,000. Depending on the achievable locality discount, a clock-driven emulation requires roughly 400,000–1,000,000 H100-equivalents.
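A minimal sketch of the clock-driven constraint accounting. Per-element costs follow the SOBE moderate model as quoted above; the H100 figures are nominal SXM specs assumed in the text:

```python
# Clock-driven (uncompressed) whole-brain emulation: which hardware constraint binds?
# Counts and per-element costs follow the SOBE moderate model; H100 figures are nominal specs.
NEURONS, SYNAPSES = 8.6e10, 1.7e14
NEURON_FLOPS, SYNAPSE_FLOPS = 3.45e6, 8.0e4       # FLOP/s per element (5-comp HH, Tsodyks-Markram)
TICK_HZ, BYTES_PER_SYNAPSE = 10_000, 16

H100_FP32_FLOPS = 67e12       # CUDA cores, not tensor cores
H100_HBM_BYTES  = 80e9
H100_BW_BYTES_S = 3e12

compute_flops = NEURONS * NEURON_FLOPS + SYNAPSES * SYNAPSE_FLOPS   # ~1.4e19 FLOP/s
memory_bytes  = SYNAPSES * BYTES_PER_SYNAPSE                        # ~2.7 PB
bandwidth_bs  = SYNAPSES * BYTES_PER_SYNAPSE * TICK_HZ              # ~27 EB/s (reads alone)

print("compute-bound GPUs:  ", round(compute_flops / H100_FP32_FLOPS))  # ~209,000
print("capacity-bound GPUs: ", round(memory_bytes / H100_HBM_BYTES))    # ~34,000
print("bandwidth-bound GPUs:", round(bandwidth_bs / H100_BW_BYTES_S))   # ~9,000,000
```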
What SOBE's number excludes: plasticity
The moderate model includes short-term plasticity (Tsodyks-Markram facilitation/depression) but excludes long-term plasticity—no STDP, no calcium-based LTP/LTD, no synaptic tagging and capture, no homeostatic scaling, no structural plasticity. These matter because plasticity is a large part of what distinguishes a running emulation from a frozen snapshot.
Most plasticity mechanisms add little compute once integrated at their biological timescale. STDP fires only on spike pairs (roughly 1 Hz × 170T synapses × 100 FLOP/event ≈ 10¹⁶ FLOP/s, negligible). Synaptic tagging, homeostatic scaling, and structural plasticity all operate on minute-to-hour timescales and can be integrated at ≤1 Hz. The one mechanism that might demand per-tick compute is calcium-based plasticity (Graupner-Brunel, Shouval), where per-synapse calcium is a continuous state variable; at full fidelity this roughly doubles per-synapse compute, though it can be simplified by integrating at 100 Hz rather than 10 kHz.
Plasticity adds per-synapse state regardless of update rate: STDP traces, calcium concentration, consolidation flags. Net effect on memory: roughly 2× (from 16 B to 32 B per synapse). Net effect on compute: 1.5–3×, central 2×. If calcium integration requires 1 kHz rather than 100 Hz (to capture the ~1 ms rise time, not just the 10–100 ms decay), the plasticity factor is closer to 5×—but the memory-bound fleet is unchanged, since compute is not the binding constraint after event-driven compression.
Plasticity-inclusive uncompressed baseline: 3 × 10¹⁹ FLOP/s, 5 PB.
Event-driven simulation
The SOBE baseline is clock-driven: every synapse is updated every tick regardless of whether anything happened. Cortical neurons fire at 1–5 Hz on average. In a 10 kHz clock, 99.9–99.99% of synapse updates compute "nothing happened".
Event-driven simulation eliminates this waste. Between spikes, synapse state variables (utilization u, recovery x, postsynaptic current I in Tsodyks-Markram) decay exponentially, and the exact solution can be computed analytically when a spike arrives—three exp() calls plus a spike-effect update—rather than stepping through thousands of ticks of Euler integration.
Per-event FLOP count. A clock-driven Tsodyks-Markram tick costs about 8 FLOP (one Euler step on 3 state variables). An event-driven update replaces thousands of Euler ticks with one analytical solution: three exponential decays to recover current state, then the spike-effect update. Counting operations explicitly:
| Component | Operations | FLOP |
| 3 exponential decays (u, x, I) | 3 × (1 exp + 2 mul + 2 add) | 3 × ~25 = 75 |
| Spike update (u, x, δI) | 3 mul + 3 add/sub | 6 |
| Postsynaptic current delivery | 1 mul + 1 add | 2 |
| Bookkeeping (time delta, target lookup, delay) | ~10 ops | 10 |
| Total | | ~90–100 |
Each exp() costs roughly 20 FLOP via polynomial approximation (or about 10 via CUDA fast-math __expf()). NEST's tsodyks2_synapse implements exactly this operation sequence. No published source gives an explicit per-event FLOP count; the operation list is short enough to count directly.
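To make this operation sequence concrete, here is a minimal sketch of an analytical event-driven Tsodyks-Markram update. This is a generic formulation in the spirit of the model described above, not NEST's implementation; parameter names and values are illustrative placeholders:

```python
import math

def tm_event_update(u, x, I, dt,                  # state at last spike; time since last spike (s)
                    U=0.5, tau_fac=0.05, tau_rec=0.8, tau_I=0.005, weight=1.0):
    """Advance Tsodyks-Markram synapse state to the arrival of the next spike.

    Replaces thousands of Euler ticks with three exponential decays plus the
    spike-effect update. Parameter values are illustrative, not fitted.
    """
    # 1. Analytical decay of the three state variables since the previous spike.
    u = U + (u - U) * math.exp(-dt / tau_fac)     # facilitation relaxes toward baseline U
    x = 1.0 + (x - 1.0) * math.exp(-dt / tau_rec) # recovered resources relax toward 1
    I = I * math.exp(-dt / tau_I)                 # postsynaptic current decays toward 0
    # 2. Spike-effect update.
    u = u + U * (1.0 - u)                         # facilitation jump
    released = u * x                              # fraction of resources released
    x = x - released                              # depress available resources
    I = I + weight * released                     # deliver postsynaptic current
    return u, x, I
```

The three exp() calls plus the handful of multiplies and adds correspond to the ~90–100 FLOP counted in the table above.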
At 100 FLOP per event and 1 Hz mean firing rate, the per-synapse cost drops from 80,000 FLOP/s (clock-driven) to 100 FLOP/s (event-driven)—an algorithmic speedup of roughly 800× on the synaptic budget. At a more conservative 3 Hz brain-wide average, the speedup is about 270×.
Why the realized whole-simulation speedup is much smaller. Three factors erode the algorithmic bound on synapses, and a fourth sets a floor on total compute:
- GPU event-driven overhead. When a neuron fires, its roughly 2,000 postsynaptic synapses are scattered across memory. GPUs handle this poorly: irregular memory access, atomic operations for postsynaptic accumulation, and additional compute per spike on the synaptic side. Brian2CUDA's whole-simulation speedup is ~3× worse on event-driven synapses (STDP benchmark) than on clock-driven synapses (LIF benchmark) at N > 10⁶ (Alevi et al. (2022)). This is a wall-clock penalty, not additional FLOP.
- Cross-GPU spike delivery. At 10⁴–10⁵ GPUs, spike events become cross-node messages. No published benchmark covers this regime. Latency and synchronization overhead could cost another 2–5×.
- Continuous dendritic dynamics. Multi-compartment HH neurons have NMDA plateaus, calcium spikes, and back-propagating action potentials that require per-tick integration regardless of spike arrival.
- The neuron-compute floor. Neuron compute (3 × 10¹⁷ FLOP/s in the SOBE moderate model) is not compressible by event-driven methods, since compartmental integration runs every tick. In the uncompressed simulation this is only 2% of total compute and irrelevant. But once event-driven sparsity crushes synaptic compute by 100×+, the neuron budget becomes a substantial fraction of what remains. Concretely, the compute-derived GPU count has a hard floor around 2,300 H100-equivalents (3 × 10¹⁷ FLOP/s ÷ 130 TFLOP/s) from neurons alone; further synapse compression buys little on the compute side.
Working through the accounting at 1 Hz mean firing rate:
| Component | Clock-driven | Event-driven (algorithmic) | Event-driven (wall-clock, 3× overhead) |
| Synapses | 1.4 × 10¹⁹ FLOP/s | 1.7 × 10¹⁶ FLOP/s | ~5 × 10¹⁶ effective |
| Neurons | 3.0 × 10¹⁷ FLOP/s | 3.0 × 10¹⁷ (unchanged) | 3.0 × 10¹⁷ |
| Total | 1.4 × 10¹⁹ | 3.2 × 10¹⁷ | ~3.5 × 10¹⁷ |
| Whole-sim speedup | — | ~44× | ~40× |
Cross-GPU spike delivery overhead at the relevant scale is the biggest unknown: published benchmarks cover single-GPU networks only. Adding a 2–3× penalty on the synaptic component gives a realized whole-simulation speedup of 20–35×; reflecting the full uncertainty in mean firing rate and cross-GPU overhead, the range is 10–50×, with the lower end more defensible than the upper.
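A minimal sketch of the speedup accounting above. The firing rates, per-event FLOP count, and overhead factors are the assumptions discussed in this subsection:

```python
# Whole-simulation speedup from event-driven synapses, at assumed firing rates and overheads.
SYNAPSES, NEURON_FLOPS_TOTAL = 1.7e14, 3.0e17   # neuron compute is not event-compressible
CLOCK_SYNAPSE_FLOPS = 1.4e19                    # SOBE moderate model, clock-driven synapses
FLOP_PER_EVENT = 100

def whole_sim_speedup(firing_rate_hz: float, synaptic_overhead: float) -> float:
    """Speedup vs. the clock-driven baseline; overhead models GPU event-driven and cross-GPU penalties."""
    event_synapse_flops = SYNAPSES * firing_rate_hz * FLOP_PER_EVENT * synaptic_overhead
    clock_total = CLOCK_SYNAPSE_FLOPS + NEURON_FLOPS_TOTAL
    event_total = event_synapse_flops + NEURON_FLOPS_TOTAL
    return clock_total / event_total

print(round(whole_sim_speedup(1.0, 1.0)))   # ~44x: algorithmic bound at 1 Hz
print(round(whole_sim_speedup(1.0, 3.0)))   # ~40x: with 3x GPU event-driven overhead
print(round(whole_sim_speedup(3.0, 9.0)))   # ~19x: 3 Hz firing plus cross-GPU penalties
```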
Event-driven simulation also resolves the bandwidth bottleneck. In the clock-driven case, every synapse is accessed every tick (27 EB/s of raw bandwidth demand), making bandwidth the binding constraint. Event-driven simulation reduces synapse accesses proportionally to the FLOP reduction: each synapse is read/written only when a spike arrives, dropping raw demand to roughly 10–15 PB/s at 1 Hz average firing (170T synapses × 1 Hz × 64 B per read-modify-write). The access pattern is irregular (scattered reads/writes achieving perhaps 10–30% of peak bandwidth), so the effective demand is \(O(10)\)–\(O(100)\) PB/s, which implies a bandwidth-driven GPU count at or below the capacity-driven one. Under event-driven simulation, memory capacity replaces bandwidth as the binding constraint.
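A minimal sketch of that bandwidth check. The 64 B per event and the 10–30% utilization figures are the assumptions in the paragraph above:

```python
# Bandwidth demand under event-driven simulation, vs. per-GPU HBM bandwidth.
SYNAPSES, FIRING_HZ, BYTES_PER_EVENT = 1.7e14, 1.0, 64   # read-modify-write per spike event
H100_BW_BYTES_S = 3e12

raw_demand = SYNAPSES * FIRING_HZ * BYTES_PER_EVENT      # ~11 PB/s
for utilization in (0.1, 0.3):                            # irregular access: 10-30% of peak
    gpus = raw_demand / (H100_BW_BYTES_S * utilization)
    print(f"utilization {utilization:.0%}: ~{gpus:,.0f} bandwidth-bound GPUs")
```

At these assumptions the bandwidth-driven count (~12,000–36,000 GPUs) sits at or below the ~4 × 10⁴ capacity-bound fleet, so capacity binds.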
FP16
Running the simulation at FP16 rather than FP32 doubles per-GPU throughput (130 TFLOP/s vs. 67) and reduces synapse state by roughly 40% (from 16 B to 10 B per synapse baseline, since connectivity indices stay as int32). FP16's roughly 3 decimal digits of precision are adequate for most synapse state, but near-threshold membrane dynamics and plasticity accumulators require mixed precision. If plasticity state (calcium, eligibility traces, consolidation flags) stays at FP32 while baseline synapse state moves to FP16, the effective memory compression is closer to 1.2× than the 1.5× assumed in the synthesis below, shifting the plasticity-inclusive memory total from ~3.5 PB to ~4.5 PB and the capacity-bound fleet to ~5 × 10⁴ H100-equivalents.
Synthesis
Applying event-driven simulation (central 25×, roughly the geometric mean of the 10–50× range) and FP16 (2× compute, roughly 1.5× memory) to the plasticity-inclusive baseline:
| Constraint | Uncompressed (plasticity-inclusive) | With event-driven + FP16 |
| Compute (FLOP/s) | 3 × 10¹⁹ | ~6 × 10¹⁷ |
| → H100-equivalents (compute) | ~4.5 × 10⁵ | ~5 × 10³ |
| Memory capacity | ~5 PB | ~3.5 PB |
| → H100-equivalents (capacity) | ~6 × 10⁴ | ~4 × 10⁴ |
| Memory bandwidth | Binding (~10⁶ H100-equivalents) | No longer binding |
The bottleneck shifts from bandwidth to memory capacity. Compute drops by roughly 100× (event-driven × FP16), but memory drops only 1.5× (FP16 on synapse state, with int32 connectivity indices unchanged). The GPU fleet is sized by the need to hold 3.5 PB of plasticity-inclusive synapse state across HBM: roughly 4 × 10⁴ H100-equivalents, or $80,000/hour at current cloud pricing (~$2/GPU-hour).
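A minimal sketch of the post-compression fleet sizing, showing why memory capacity ends up as the binding constraint. The compression factors and per-synapse state sizes are the central-case assumptions above:

```python
# Fleet sizing after event-driven + FP16 compression, plasticity-inclusive baseline (central case).
PLASTICITY_FLOPS = 3e19            # clock-driven, plasticity-inclusive compute (FLOP/s)
PLASTICITY_BYTES = 5e15            # ~5 PB of synapse + plasticity state
EVENT_DRIVEN_SPEEDUP = 25          # central case (range 10-50x)
FP16_COMPUTE_GAIN = 2
FP16_MEMORY_GAIN  = 1.5            # connectivity indices stay int32, so well under 2x
H100_FP16_FLOPS, H100_HBM_BYTES = 130e12, 80e9

compute_flops = PLASTICITY_FLOPS / (EVENT_DRIVEN_SPEEDUP * FP16_COMPUTE_GAIN)
memory_bytes  = PLASTICITY_BYTES / FP16_MEMORY_GAIN
gpus_compute  = compute_flops / H100_FP16_FLOPS
gpus_capacity = memory_bytes / H100_HBM_BYTES

print(f"compute-bound:  ~{gpus_compute:,.0f} H100s")    # ~5,000
print(f"capacity-bound: ~{gpus_capacity:,.0f} H100s")   # ~40,000  <- binds
print(f"fleet cost:     ~${max(gpus_compute, gpus_capacity) * 2:,.0f}/hour")
```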
Other biophysical reductions
The most important remaining technique is synapse pruning. Biological synaptic strengths follow a heavy-tailed distribution (Bartol et al. (2015)). Zeroing the weakest 50–90% by magnitude removes them from both compute and memory—and since memory capacity is now the binding constraint, this is the only remaining technique that directly reduces fleet size. Plausible factor: 2–10× on memory, though this has not been validated for biophysical brain simulations.
Other techniques exist—neuron-model coarsening, region-specific fidelity, adaptive timesteps, synapse-model simplification—but the compute budget is 98% synaptic, and event-driven simulation already addresses synapse efficiency. What remains is either fighting over the 2% that is neuronal, or overlapping with gains event-driven already captures. The ranges below use only event-driven simulation, FP16, and (in the central and optimistic cases) pruning.
Recommended ranges
| Case | Effective FLOP/s | Binding constraint | H100-equivalents | $/hour |
| Pessimistic (event-driven 10×, no pruning) | ~10¹⁸ | Memory capacity (~3.5 PB) | ~4 × 10⁴ | ~$80K |
| Central (event-driven 25×, moderate pruning) | ~6 × 10¹⁷ | Memory capacity (~2 PB) | ~2 × 10⁴ | ~$40K |
| Optimistic (event-driven 50×, aggressive pruning) | ~3 × 10¹⁷ | Memory capacity (~1 PB) | ~10⁴ | ~$20K |
All ranges include FP16 and plasticity, applied to the 3 × 10¹⁹ FLOP/s plasticity-inclusive baseline. Memory capacity binds more tightly than compute in every case.
Basis points of x-risk reduction per billion dollars spent. One basis point is 0.01 percentage points, so 0.3 bp/bn means a $1B investment reduces x-risk by 0.003 percentage points. ↩︎
This is taken directly from SOBE (2025), and doesn’t itself have a citation; Claude was skeptical and thought $2–4B was more reasonable, but this depends a lot on what “counts”. Total neuroscience spending is ~$30B but dominated by work on diseases. ↩︎