AI Segmentation Is Compressing the Connectome Timeline
The completed Drosophila connectome, published by the FlyWire consortium in 2024 in Nature, represents 18 years of cumulative work by hundreds of researchers. It maps 139,255 neurons and approximately 54 million synaptic connections. It is a landmark result. It is also a data point in an accelerating trend that may make the next comparable achievement arrive significantly faster.
The bottleneck in connectomics has never been imaging. Electron microscopy (EM) can produce nanometer-resolution images of neural tissue at rates that have been improving steadily since volume EM methods matured in the 2010s. The bottleneck has been segmentation: the process of identifying which pixels in each EM image belong to which neuron, which axon or dendrite, which synapse.
In biological tissue, neurons are densely packed and interleaved in complex three-dimensional patterns. Automated segmentation algorithms must correctly delineate membranes between adjacent cells, trace individual processes through hundreds or thousands of serial sections, and identify synaptic structures. Errors accumulate. Manual proofreading, in which trained annotators review automated outputs and correct mistakes, has consumed most of the person-hours in large connectomics projects.
Machine learning segmentation is changing this, and the pace of change in 2026 is faster than most predictions from even five years ago suggested.
Technology Readiness Level: TRL 5–6 (validated in multiple laboratory and preclinical connectomics projects; active deployment in human cortex fragment projects; scaling toward millimeter-scale volumes is underway).
How EM Segmentation Currently Works
Modern connectomics segmentation pipelines use convolutional neural networks trained to perform two tasks: boundary detection (identifying where one cell ends and another begins) and object classification (distinguishing axons from dendrites, identifying synaptic vesicle clusters, classifying cell types).
The dominant approach through the early 2020s was the Flood-Filling Network (FFN), developed at Google and applied to the H01 human cortex dataset. FFNs grow a segmentation outward from a seed point, iteratively predicting whether adjacent voxels belong to the same object. They are highly accurate for long-range tracing of individual processes but computationally demanding and prone to specific failure modes, including merge errors in which two distinct processes are incorrectly joined across a faint or ambiguous membrane.
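Stripped of its learned component, flood-filling is seed-based region growing: start from a seed voxel, ask a predictor whether each neighbor belongs to the same object, and expand the mask accordingly. A minimal sketch, with a toy intensity-matching function standing in for the convolutional predictor (the function names and the predictor itself are illustrative assumptions, not the FFN implementation):

```python
from collections import deque
import numpy as np

def flood_fill_segment(volume, seed, same_object_prob, threshold=0.9):
    """Grow one segment outward from `seed`, accepting each neighboring
    voxel that the predictor judges to belong to the same object. In a
    real FFN the predictor is a convolutional network conditioned on the
    current mask; here it is a pluggable function."""
    mask = np.zeros(volume.shape, dtype=bool)
    mask[seed] = True
    frontier = deque([seed])
    offsets = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]
    while frontier:
        z, y, x = frontier.popleft()
        for dz, dy, dx in offsets:
            n = (z + dz, y + dy, x + dx)
            if any(c < 0 or c >= s for c, s in zip(n, volume.shape)):
                continue  # outside the imaged volume
            if mask[n]:
                continue  # already part of this segment
            if same_object_prob(volume, (z, y, x), n) >= threshold:
                mask[n] = True
                frontier.append(n)
    return mask

def toy_prob(vol, a, b):
    # Toy stand-in predictor: voxels belong together if intensities match.
    return 1.0 - abs(float(vol[a]) - float(vol[b]))

vol = np.zeros((3, 5, 5))
vol[:, :, :2] = 1.0                    # a bright slab playing the "neurite"
seg = flood_fill_segment(vol, (1, 2, 0), toy_prob)
print(seg.sum())                       # → 30, the whole bright slab
```

The key property this illustrates is that the segmentation is object-centric: one seed yields one object, which is why FFNs trace long processes well but must be run per neuron.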
The FlyWire project used a different architecture based on iterative affinity prediction, complemented by a large-scale human collaborative proofreading effort in which volunteer players and trained annotators corrected automated errors through a game-like interface. This hybrid approach was more efficient than pure manual annotation. It was not efficient enough to scale to mammalian brains without fundamental changes to the automated component.
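Affinity-based pipelines take the complementary route: a network predicts, for each pair of neighboring voxels, the probability that they belong to the same cell, and segments are formed by linking high-affinity pairs. A minimal 2D sketch using union-find connected components over thresholded affinities (production pipelines add learned agglomeration on top of this step; the affinity arrays here are hypothetical):

```python
import numpy as np

def segments_from_affinities(aff_x, aff_y, threshold=0.5):
    """Connect neighboring pixels whose predicted affinity exceeds a
    threshold, using union-find. aff_x[i, j] links pixel (i, j) to
    (i, j+1); aff_y[i, j] links (i, j) to (i+1, j)."""
    h, w = aff_y.shape[0] + 1, aff_x.shape[1] + 1
    parent = list(range(h * w))

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for i in range(h):
        for j in range(w - 1):
            if aff_x[i, j] > threshold:
                union(i * w + j, i * w + j + 1)
    for i in range(h - 1):
        for j in range(w):
            if aff_y[i, j] > threshold:
                union(i * w + j, (i + 1) * w + j)

    return np.array([find(p) for p in range(h * w)]).reshape(h, w)

# Two 4x2 "cells" separated by a low-affinity membrane down the middle:
ax = np.ones((4, 3)); ax[:, 1] = 0.1   # weak affinity across the boundary
ay = np.ones((3, 4))
labels = segments_from_affinities(ax, ay)
print(len(np.unique(labels)))          # → 2
```

Unlike flood-filling, this segments the whole volume in one pass, which is one reason affinity pipelines paired well with FlyWire's large-scale proofreading: every voxel gets a label, and humans correct the boundaries.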
The shift that has occurred since approximately 2022, and accelerated into 2026, is in the accuracy of the automated segmentation before human correction.
The Accuracy Threshold Effect
The amount of human proofreading required to bring an automated segmentation to a usable state does not scale linearly with the automated error rate. Halving the error rate does not halve proofreading time, because the errors that survive are concentrated in dense, ambiguous regions that take more time to resolve per error than errors in simpler regions.
Above a certain accuracy threshold, however, the proofreading requirement drops sharply. If a segmentation algorithm achieves 99% accuracy at the voxel level, the resulting object-level errors affect a small fraction of neurons, and those neurons tend to be identifiable from topological properties (a neuron with an abnormally large number of partner connections is likely a merge error). Below that threshold, errors are distributed more uniformly and the proofreading problem scales with volume. Above it, proofreading becomes targeted verification rather than systematic review.
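The topological heuristic mentioned above can be sketched directly: given each reconstructed object's synaptic partner count, flag extreme outliers as likely merge errors for targeted review. The median/MAD outlier rule and the cutoff k below are illustrative assumptions, not a published criterion:

```python
import statistics

def flag_merge_candidates(partner_counts, k=5.0):
    """Flag neurons whose synaptic partner count is an extreme outlier,
    prioritizing them for proofreading as likely merge errors. Uses a
    robust median / median-absolute-deviation rule."""
    counts = list(partner_counts.values())
    med = statistics.median(counts)
    mad = statistics.median(abs(c - med) for c in counts) or 1.0
    return {nid for nid, c in partner_counts.items() if (c - med) / mad > k}

# 200 ordinary neurons plus one object that fused two cells together:
counts = {f"n{i}": 40 + (i % 7) for i in range(200)}
counts["n_merged"] = 900
print(flag_merge_candidates(counts))   # → {'n_merged'}
```

This is what "targeted verification rather than systematic review" looks like in practice: proofreaders open a ranked queue of suspicious objects instead of walking the whole volume.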
The SmartEM platform developed by the Lichtman lab at Harvard and commercial partners achieved high-accuracy automated segmentation of a cubic millimeter of human cortex (the H01 dataset) with a proofreading burden that, while not trivial, fell far below earlier estimates for that volume. The H01 result, published in 2024, demonstrated that automated pipelines could be trusted across much of the volume, with human correction focused on merge and split errors at specific locations.
By 2026, the next generation of architectures has pushed automated accuracy further. Transformer-based models applied to volumetric EM data have shown error rates substantially below those achieved by convolutional networks, particularly for identification of synaptic connections where the previous generation of models struggled. This matters because synaptic connectivity, not just neuronal tracing, is what a connectome needs to capture.
The Zebrafish and Mouse Data Points
The scaling trajectory from Drosophila to mammalian connectomics can be tracked through intermediate model organisms.
The larval zebrafish connectome, a vertebrate brain of approximately 100,000 neurons whose circuit architecture is substantially more complex than that of Drosophila, was completed in 2023 by Engert's lab and collaborators at Harvard. It was completed faster than the Drosophila connectome, partly because of improved imaging protocols and partly because of better automated segmentation from the start of the project. Human proofreading was still required at scale, but the ratio of automated to manual work improved.
The Allen Institute mouse cortex simulation used structural data from the ongoing MICrONS project, which has been producing an EM reconstruction of mouse primary visual cortex across a cubic millimeter volume. The MICrONS reconstruction pipeline, which went through several iterations of segmentation architecture improvement between 2021 and 2025, demonstrated progressive gains in automated accuracy with each model generation. The 2024 version of the pipeline required an order of magnitude less proofreading per equivalent volume than the 2021 version.
This rate of improvement, roughly one order of magnitude reduction every three to four years in the proofreading required per unit volume at equivalent output quality, is the figure that changes the timeline calculus for mammalian connectomics.
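Under the trend just described, proofreading cost per unit volume extrapolates as an exponential decay, one order of magnitude per roughly 3.5 years. The base figures below are placeholders chosen for illustration, not measured values:

```python
def proofreading_cost(base_cost, base_year, year, tenfold_years=3.5):
    """Extrapolate proofreading effort per unit volume assuming one order
    of magnitude of reduction every `tenfold_years` years. A trend model
    only; `base_cost` is a hypothetical placeholder, not a measurement."""
    return base_cost * 10 ** (-(year - base_year) / tenfold_years)

# If a cubic millimeter hypothetically needed 100 annotator-years in 2021,
# the same accuracy would need ~10 in 2024.5, ~1 in 2028, ~0.1 in 2031.5:
for year in (2021, 2024.5, 2028, 2031.5):
    print(year, proofreading_cost(100, 2021, year))
```

The point of writing it this way is that the exponent, not the base cost, dominates any timeline estimate: uncertainty about whether the trend holds matters far more than uncertainty in today's cost.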
What the Acceleration Means for the Human Connectome
A naive extrapolation from the Drosophila timeline to a human brain connectome was always going to produce a very large number. The human brain contains roughly 86 billion neurons, some six orders of magnitude more than Drosophila, along with synapse types and circuit motifs that are not present in invertebrate tissue. Early estimates, including those in the Sandberg/Bostrom whole brain emulation roadmap, placed a complete human connectome hundreds of years in the future.
Those estimates were made before the automation inflection point became clear. They assumed proofreading requirements that scale with volume, which is true below the accuracy threshold and much less true above it.
The honest position in 2026 is not that a human connectome is imminent. It is that the naive extrapolation was wrong in its premises. The actual timeline depends on whether segmentation accuracy continues to improve at its recent pace, whether imaging throughput continues on its own improvement trajectory, and whether there are tissue preparation problems not present in model organisms that make human cortex more difficult to segment than mouse cortex.
The mammal brain preservation breakthrough from Song et al. is relevant here because tissue quality affects automated segmentation accuracy. Membranes that are partially degraded, with artifacts introduced by poor fixation, produce inconsistent signals at cell boundaries that increase segmentation error rates. High-quality tissue preparation, like the 14-minute post-mortem window result, is not independent of segmentation automation: better-preserved tissue is easier to segment automatically.
The Remaining Bottlenecks
Accelerating automated segmentation does not eliminate all connectomics bottlenecks.
Imaging throughput remains a practical limit. A cubic millimeter of mouse cortex at EM resolution is approximately 1 to 2 petabytes of raw image data. A human brain at the same resolution would be on the order of a million petabytes, roughly a zettabyte. Electron microscopy can be parallelized across many instruments, and high-throughput EM systems have improved substantially since 2020. But imaging the entire human brain at EM resolution, even with continued improvement, remains a decades-scale undertaking at current imaging throughput.
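The data volume arithmetic is worth making explicit. Using the 1 to 2 petabytes per cubic millimeter figure above and a commonly cited human brain volume of about 1.2 million cubic millimeters (the brain volume is an assumption introduced here):

```python
def brain_data_volume(pb_per_mm3=1.5, brain_volume_mm3=1.2e6):
    """Back-of-envelope raw EM data volume for a whole human brain.
    The per-mm^3 figure is the midpoint of the 1-2 PB range in the text;
    the ~1.2e6 mm^3 brain volume is a rough standard approximation."""
    total_pb = pb_per_mm3 * brain_volume_mm3
    total_zb = total_pb / 1e6            # 1 zettabyte = 1e6 petabytes
    return total_pb, total_zb

pb, zb = brain_data_volume()
print(f"{pb:.2e} PB ≈ {zb:.1f} ZB")      # → 1.80e+06 PB ≈ 1.8 ZB
```

For scale, that is roughly the magnitude of all data traversing the internet in a year, which is why storage and imaging parallelism, not segmentation compute, dominate this particular bottleneck.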
Molecular annotation is a separate gap. Structural connectomics provides wiring diagrams: which neurons connect to which with what synaptic geometry. It does not provide neurotransmitter identity, receptor composition, or the neuromodulatory context that shapes circuit function. Whole brain emulation requires not just the wiring diagram but the label on each wire. Expanding EM connectomics to include multiplexed molecular annotation at synaptic resolution is an active research area but is not yet a routine component of connectomics pipelines.
Cell type identification from EM morphology alone has limitations. High-throughput EM generates excellent membrane-level structural data but limited molecular information. Machine learning classification of cell types from EM morphology has improved substantially and can now distinguish many of the major excitatory and inhibitory neuron types from structural features alone. But the full diversity of interneuron subtypes, which matter for circuit function, requires supplementary molecular data.
Future Outlook
The automation of connectomics segmentation is producing a revision in what the timeline for large-scale mammalian connectomics actually looks like. The revision is not from centuries to decades. It is from “longer than existing institutions will survive” to “plausible within the research horizon of active researchers.”
This matters for whole brain emulation planning in a concrete way. If structural connectomics at mammalian scale becomes achievable within the next two decades, then the bottleneck shifts from data acquisition to the other components of the emulation pipeline: functional characterization, computational simulation capacity, and model validation. Those are formidable challenges, but they are challenges in a program that is progressing rather than waiting for its first input data.
The segmentation automation story is also a template for the broader trajectory. Computational tools in neuroscience have gone from adjuncts to manual work to the primary driver of what is achievable. This shift has been faster than most researchers predicted ten years ago, and there is no clear reason to expect the pace to slow.
Official Sources
- Zheng, Z. et al. (2018). “A complete electron microscopy volume of the brain of adult Drosophila melanogaster.” Cell 174(3): 730–743. DOI: 10.1016/j.cell.2018.06.019
- Shapson-Coe, A. et al. (2024). “A petavoxel fragment of human cerebral cortex reconstructed at nanoscale resolution.” Science 384, eadk4858. DOI: 10.1126/science.adk4858
- Januszewski, M. et al. (2018). “High-precision automated reconstruction of neurons with flood-filling networks.” Nature Methods 15: 605–610. DOI: 10.1038/s41592-018-0049-4
- Winding, M. et al. (2023). “The connectome of an insect brain.” Science 379, eadd9330. DOI: 10.1126/science.add9330
- Related: Drosophila Connectome: Proof of Concept
- Related: SmartEM: Democratizing Brain Mapping
- Related: Allen Institute Mouse Cortex Simulation
- Related: Mammal Brain Connectome Preservation Breakthrough
- Related: Whole Brain Emulation Roadmap Review
- Related: Connectome-seq: Single-Synapse Resolution via Barcode Sequencing
- Related: The Digital Sphinx: Cross-Species Connectome Transfer
- Related: LICONN: Light Microscopy Connectomics at Synaptic Resolution