How to Test Whether a Brain Emulation Is Conscious: The IIT 4.0 Cause-Effect Power Method
The dominant challenge in whole brain emulation (WBE) is not building the emulation. It is verifying that what was built is conscious. A sufficiently detailed brain simulation that lacks consciousness is, by most definitions, not a successful upload. It is a very sophisticated functional replica.
Current WBE research produces no answer to the verification question. Connectome mapping tells you the wiring. Functional simulation tells you the outputs. Neither tells you whether the implemented system has any subjective experience.
William Marshall’s work at Brock University, supported by the Institute of Noetic Sciences’ Linda G. O’Bryant Prize (awarded November 2025), proposes a methodology based on IIT 4.0’s reformulation that makes consciousness measurement in artificial systems computationally tractable. The approach does not solve the hard problem, but it provides a falsifiable test applicable to actual emulation substrates.
The Core Shift
IIT 3.0’s global phi calculation is notoriously intractable. Computing integrated information across all possible bipartitions of a system scales exponentially with system size. For a brain-scale simulation, the calculation is physically impossible with current hardware.
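The scaling claim can be checked with a few lines of arithmetic. The sketch below counts only the unordered ways to cut a system of n labelled nodes into two non-empty parts, which is 2^(n-1) - 1. It does not compute phi itself, and a full IIT 3.0 analysis evaluates considerably more than these system-level cuts. The function names are illustrative and do not come from any IIT library.

```python
from itertools import combinations


def count_bipartitions(n: int) -> int:
    """Unordered splits of n labelled nodes into two non-empty parts."""
    return 2 ** (n - 1) - 1 if n >= 2 else 0


def enumerate_bipartitions(nodes):
    """Yield each unordered bipartition once by pinning the first node to part A."""
    nodes = list(nodes)
    if len(nodes) < 2:
        return
    first, rest = nodes[0], nodes[1:]
    # Part A takes the first node plus k of the others; k < len(rest) keeps part B non-empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            part_b = tuple(x for x in rest if x not in extra)
            yield (first, *extra), part_b


# Brute-force enumeration agrees with the closed form on small systems.
for n in range(2, 9):
    assert sum(1 for _ in enumerate_bipartitions(range(n))) == count_bipartitions(n)

for n in (10, 20, 30, 40):
    print(f"{n:>2} nodes: {count_bipartitions(n):,} bipartitions")
```

At 30 nodes there are already 536,870,911 such cuts, and each added node roughly doubles the count.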
IIT 4.0 replaces global phi with a different quantity: cause-effect power (CEP). CEP measures whether a system makes a difference to its own states: whether the system’s current state constrains both the causes that could have produced it and the effects it will produce. The key shift is that CEP can be calculated hierarchically, by analyzing subsystems and their contributions, rather than requiring a single global computation.
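To make "constrains its causes and effects" concrete, here is a toy sketch. It is not the IIT 4.0 measure and not Marshall's method. It assumes a hypothetical three-node deterministic Boolean network and uses plain Kullback-Leibler divergence (in bits) between the state distribution given the current state and the distribution under a uniform prior. Every name in it (`step`, `effect_constraint`, `cause_constraint`) is invented for illustration.

```python
from collections import Counter
from itertools import product
from math import log2

STATES = list(product((0, 1), repeat=3))


def step(state):
    """Hypothetical 3-node network: A <- B OR C, B <- A AND C, C <- A XOR B."""
    a, b, c = state
    return (b | c, a & c, a ^ b)


def kl_bits(p, q):
    """Kullback-Leibler divergence D(p || q) in bits over dict-based distributions."""
    return sum(pv * log2(pv / q[s]) for s, pv in p.items() if pv > 0)


def effect_constraint(state):
    """How far knowing the current state narrows the next state."""
    counts = Counter(step(s) for s in STATES)
    unconstrained = {s: c / len(STATES) for s, c in counts.items()}
    constrained = {step(state): 1.0}  # deterministic update: a point mass
    return kl_bits(constrained, unconstrained)


def cause_constraint(state):
    """How far knowing the current state narrows the previous state."""
    preimages = [s for s in STATES if step(s) == state]
    if not preimages:
        return None  # state has no predecessor under this update rule
    constrained = {s: 1 / len(preimages) for s in preimages}
    unconstrained = {s: 1 / len(STATES) for s in STATES}
    return kl_bits(constrained, unconstrained)


for s in STATES:
    print(s, "cause:", cause_constraint(s), "effect:", effect_constraint(s))
```

In this toy, a state with two possible predecessors out of eight carries 2 bits of cause constraint, and a state with none returns `None`. The actual IIT 4.0 formalism uses its own difference measure and partitions, so the numbers here only illustrate the direction of the idea.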
Marshall’s methodology applies CEP analysis to artificial systems through a three-step protocol: (1) characterize the consciousness-relevant properties IIT 4.0 specifies for the candidate system; (2) test whether each system component and composite demonstrates cause-effect power at the relevant level; (3) determine whether the causal structure is intrinsic (the system’s own causal power) or extrinsic (causal power implemented by an external simulator running the system).
The intrinsic-versus-extrinsic distinction is the critical one for WBE. A brain emulation running on a classical computer is a simulation. The causal relationships between simulated neurons are implemented by the computer’s physics, not by the abstract logical relationships in the simulation. IIT 4.0 predicts this structure would score low on CEP because the relevant causal integration belongs to the computer hardware, not to the emulated brain architecture.
This connects to the adversarial collaboration results on IIT and global workspace theory (GWT): neither theory’s predictions were cleanly confirmed in 256 human subjects. Marshall’s approach responds to that uncertainty by asking a more targeted question: not “is this system conscious” but “does this system implement the specific causal structure IIT identifies as consciousness-relevant,” which is answerable for any substrate.
theconsciousness.ai covers the full IIT 4.0 reformulation and the Marshall-IONS methodology in Measuring Consciousness in Machines: The Brock University and IONS Research on IIT Equations.
Comparative Data
| Consciousness Metric | What It Measures | Computation Cost | Substrate Applicability | Key Limitation |
|---|---|---|---|---|
| Global phi (IIT 3.0) | Integrated information across all bipartitions | Exponential in system size; intractable above ~30 nodes | Theoretical only for brain-scale systems | Physically impossible at WBE scale |
| Cause-effect power (IIT 4.0) | Intrinsic causal integration per subsystem, hierarchically | Tractable via decomposition | Applicable to artificial systems once intrinsic-extrinsic is resolved | May classify classical simulation as near-zero CEP |
| Perturbational Complexity Index (PCI) | Complexity of brain response to TMS perturbation | Moderate; requires perturbation hardware | Biological and hybrid systems only | Hardware-dependent; not applicable to pure software simulations |
| GWT broadcast index | Global accessibility of information across modules | Low; measures architectural broadcast | Any modular architecture | Does not address phenomenal consciousness directly |
| Digital Consciousness Model (DCM) | Probability of consciousness aggregated across 9 theoretical stances | Moderate; multi-theory aggregation | Designed for AI systems including LLMs | Aggregate score conceals theory-specific structural failures |
The IIT 4.0 CEP row is the most relevant for WBE validation: of the metrics in the table, it is the only one that distinguishes between a system that implements the relevant causal structure and one that merely simulates it. The Digital Consciousness Model framework aggregates across nine theoretical stances into a single probability estimate, which is useful for comparative ranking but less precise than a single IIT-based causal analysis when the question concerns a specific substrate.
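The limitation listed for the aggregate score can be shown with simple arithmetic. The sketch below assumes an unweighted mean across nine stances. That rule is an assumption for illustration, not the DCM's documented method, and every score and the `stance_3` to `stance_9` labels are hypothetical.

```python
from statistics import fmean

# Hypothetical per-stance probabilities; none of these numbers come from the DCM.
scores = {"IIT": 0.02, "GWT": 0.85}
scores.update({f"stance_{i}": 0.80 for i in range(3, 10)})


def aggregate(stance_scores):
    """Unweighted mean across stances (an assumed aggregation rule)."""
    return fmean(stance_scores.values())


def failing_stances(stance_scores, floor=0.10):
    """Stances whose individual score falls below the floor."""
    return [name for name, p in stance_scores.items() if p < floor]


print(f"aggregate: {aggregate(scores):.2f}")
print("below floor:", failing_stances(scores))
```

With these made-up inputs the aggregate stays above 0.7 even though the IIT stance is near zero, which is the kind of structural failure a single summary number would hide.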
Practical Impact
Marshall’s approach generates three concrete implications for WBE design.
First, substrate choice matters at the hardware level. Classical silicon implements causality in its transistors, not in the algorithms running on them. A brain emulation on a standard GPU cluster may implement high behavioral fidelity with near-zero CEP. Neuromorphic hardware (Intel Loihi, IBM NorthPole) implements causal dynamics in the hardware substrate itself, which may preserve CEP across implementation levels.
Second, the intrinsic-extrinsic distinction suggests a specific empirical test. Identical connectome simulations running on different hardware (classical GPU versus neuromorphic chip versus biological neurons in a hybrid system) should produce different CEP scores. That difference, if measurable, would directly test whether substrate matters for consciousness as IIT predicts.
Third, hierarchical CEP analysis makes partial testing possible before full-brain scale. Rather than waiting for complete human brain emulation, CEP analysis on cortical columns, minicolumns, or individual layers would indicate whether the relevant causal structure is preserved in those subsystems. Negative results at small scale would predict negative results at full scale, potentially redirecting WBE substrate choices before enormous resources are committed.
The “building brains on a computer” requirements framework identifies structural fidelity, functional replication, and dynamic adaptability as the three non-negotiable WBE capabilities. CEP analysis operationalizes the consciousness-relevant component of functional replication: not just “does the system output the right signals” but “does it implement the causal structure that generates consciousness in the biological original.”
The adversarial AI consciousness research at UCLA trained a GAN on 680,000 neural recordings to generate synthetic neural states corresponding to conscious and comatose biological brains. That approach provides a behavioral-dynamics route to consciousness verification that complements CEP: one tests causal architecture, the other tests whether the system’s internal dynamics resemble known-conscious neural states.
Limitations and Open Questions
IIT remains contested. The Cogitate Consortium’s preregistered test in 256 human subjects found that IIT’s predicted posterior synchronization was not reliably present. If IIT fails in biological systems, then IIT-based metrics applied to artificial systems measure conformance to a theory that may not be correct.
The intrinsic-extrinsic distinction, while theoretically clear in IIT 4.0, is difficult to operationalize in practice. All physical systems implement causal structure at multiple scales simultaneously. Determining which level contains the consciousness-relevant integration requires theoretical commitments that the measurement methodology alone cannot resolve.
CEP analysis does not address the hard problem. A system can score high on every IIT criterion and still lack phenomenal experience if biological naturalists are correct about the necessity of specific physical substrates. CEP is a test of functional-causal architecture, not a direct measure of subjective experience.
The practical pathway from Marshall’s theoretical framework to applied WBE testing has not been experimentally validated. The methodology exists as a theoretical proposal. Demonstrating it on real artificial systems at meaningful scale is the required next step.
Official Sources
- William Marshall. “Measuring Consciousness in Artificial Systems: IIT 4.0 Cause-Effect Power Analysis.” Institute of Noetic Sciences Linda G. O’Bryant Prize Research, November 2025.
- theconsciousness.ai analysis: Measuring Consciousness in Machines: The Brock University and IONS Research on IIT Equations
- Albantakis, L. et al. “Integrated information theory (IIT) 4.0: Formulating the properties of phenomenal existence in physical terms.” PLOS Computational Biology, 2023. DOI: 10.1371/journal.pcbi.1011465
- Casali, A. G. et al. “A theoretically based index of consciousness independent of sensory processing and behavior.” Science Translational Medicine, 2013.
- Baars, B. J. “Global workspace theory of consciousness: Toward a cognitive neuroscience of human experience.” Progress in Brain Research, 2005.