Sparse autoencoders reveal organized biological knowledge but minimal regulatory logic in single-cell foundation models: a comparative atlas of Geneformer and scGPT

A new study applied sparse autoencoders to interpret two leading single-cell biology foundation models, Geneformer and scGPT. The analysis revealed that these models internalize organized biological knowledge, with 29-59% of features mapping to Gene Ontology, KEGG, and Reactome terms, but capture minimal causal regulatory logic: only 6.2-10.4% of transcription factor perturbations elicited targeted feature responses. This establishes a bottleneck for using such models in predictive causal biology.

Decoding the Biological Brain: Sparse Autoencoders Reveal What Single-Cell AI Models Really Know

New research has applied a powerful interpretability technique to peer inside two leading single-cell biology foundation models, Geneformer and scGPT. The findings, published in a preprint, reveal that while these AI systems have internalized a vast, organized atlas of biological knowledge—from pathways to protein interactions—they capture surprisingly little of the causal regulatory logic that governs gene expression. This establishes a clear bottleneck for using such models in predictive causal biology.

Applying Sparse Autoencoders to Biological AI

The study's authors employed sparse autoencoders (SAEs), a cutting-edge method from mechanistic interpretability, to decompose the dense internal activations of the models. They trained "TopK SAEs" on the residual stream activations from all layers of Geneformer V2-316M (18 layers) and the scGPT whole-human model (12 layers). This process generated massive atlases of 82,525 and 24,527 interpretable features, respectively, offering an unprecedented look under the hood.
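The core mechanism of a TopK SAE is simple: project an activation vector into a much wider feature space, keep only the k largest activations, and reconstruct the input from those few features. The sketch below illustrates the forward pass; the dimensions, initialization, and value of k are illustrative placeholders, not the configuration used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: real residual streams and SAE widths are far larger.
d_model, d_sae, k = 64, 512, 8

W_enc = rng.normal(0, 0.02, (d_model, d_sae))
W_dec = rng.normal(0, 0.02, (d_sae, d_model))
b_enc = np.zeros(d_sae)
b_dec = np.zeros(d_model)

def topk_sae_forward(h, k=k):
    """Encode a residual-stream activation h into sparse features,
    keep only the k largest activations, and reconstruct h."""
    pre = (h - b_dec) @ W_enc + b_enc      # pre-activations for all features
    z = np.maximum(pre, 0.0)               # ReLU
    z[np.argsort(z)[:-k]] = 0.0            # zero all but the top-k features
    recon = z @ W_dec + b_dec              # reconstruct from sparse code
    return z, recon

h = rng.normal(size=d_model)
z, recon = topk_sae_forward(h)
assert (z > 0).sum() <= k                  # at most k features stay active
```

Training then minimizes reconstruction error over many activations; the TopK constraint enforces sparsity directly, so each surviving feature tends to become an interpretable unit, here a gene program or pathway.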

The analysis confirmed a phenomenon known as superposition, in which neural networks compress more concepts than they have dimensions to represent them cleanly. A striking 99.8% of the discovered SAE features were invisible to traditional analysis methods such as Singular Value Decomposition (SVD), highlighting the necessity of purpose-built interpretability tools for understanding modern AI.

A Rich but Non-Causal Biological Atlas

Systematic characterization of the feature atlases revealed rich, organized biological knowledge. Between 29% and 59% of features could be annotated to terms in major biological databases such as Gene Ontology, KEGG, and Reactome. The features were organized into co-activation modules (141 in Geneformer, 76 in scGPT) and exhibited clear hierarchical abstraction, with U-shaped activity profiles across model layers.
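Annotating a feature to a database term typically reduces to an over-representation test: do the feature's top-activating genes overlap a pathway's gene set more than chance would predict? A minimal sketch using the hypergeometric upper tail; the gene counts are illustrative, and the paper's exact annotation procedure may differ.

```python
from math import comb

def hypergeom_pval(overlap, feature_size, pathway_size, background):
    """Probability of observing >= `overlap` pathway genes among a
    feature's `feature_size` top genes, drawn from `background` genes
    of which `pathway_size` belong to the pathway."""
    total = comb(background, feature_size)
    upper = min(feature_size, pathway_size)
    return sum(
        comb(pathway_size, x) * comb(background - pathway_size, feature_size - x)
        for x in range(overlap, upper + 1)
    ) / total

# Toy example: 6 of a feature's 20 top genes fall in a 50-gene pathway
# against a ~20,000-gene background (all numbers are illustrative).
p = hypergeom_pval(6, 20, 50, 20000)
assert p < 0.05  # far more overlap than expected by chance
```

In practice such p-values are corrected for multiple testing across thousands of feature-term pairs before a feature is declared annotated.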

Furthermore, the features showed causal specificity (median 2.36x) and formed cross-layer "information highways," with 63% to 99.8% of features active across multiple layers. This demonstrates that the models build a sophisticated, multi-layered representation of biological systems.

However, a critical test against genome-scale CRISPRi perturbation data exposed a major limitation. When the authors checked whether features responded specifically to the knockdown of known regulatory transcription factors (TFs), only 3 of 48 TFs (6.2%) showed a targeted response. A more complex, multi-tissue control scenario improved this only marginally, to 10.4% (5 of 48 TFs), pinpointing the models' internal representations, rather than the experimental setup, as the fundamental bottleneck.
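One simple way to operationalize a "targeted response" is to ask whether a feature's activation shift under a TF knockdown is an outlier relative to its shifts under control perturbations. The z-score criterion below is a hypothetical sketch of that idea, not the study's exact statistic.

```python
import numpy as np

def targeted_response(delta_tf, delta_controls, z_thresh=3.0):
    """Flag features whose activation change under a TF knockdown is an
    outlier relative to changes under control perturbations.
    `delta_controls` has shape (n_controls, n_features)."""
    mu = delta_controls.mean(axis=0)
    sd = delta_controls.std(axis=0) + 1e-8   # guard against zero variance
    return np.abs((delta_tf - mu) / sd) > z_thresh

# Synthetic check: feature 3 responds strongly, everything else is noise.
rng = np.random.default_rng(0)
controls = rng.normal(size=(20, 100))        # 20 control perturbations
knockdown = np.zeros(100)
knockdown[3] = 10.0                          # a genuinely responding feature
hits = targeted_response(knockdown, controls)
assert bool(hits[3])
```

A TF would count as eliciting a targeted response only if features tied to its known targets are flagged while unrelated features are not, which is exactly where the models fell short.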

Key Takeaways and Released Resources

  • Knowledge vs. Causality: Single-cell foundation models encode a vast, hierarchically organized map of biological associations but lack the causal regulatory logic needed for accurate perturbation prediction.
  • Interpretability is Key: Techniques like sparse autoencoders are essential to unlock the "black box," revealing phenomena like massive superposition that older methods miss.
  • Representation as Bottleneck: The study concludes that improving the models' internal representations of causality is a prerequisite for advancing their utility in experimental design and discovery.
  • Open Resource: The researchers have released both complete feature atlases as interactive web platforms, allowing the scientific community to explore over 107,000 features across 30 layers of the two models.

This work provides a crucial reality check on the current capabilities of biological AI. It confirms their immense value as knowledge bases and pattern-recognition engines while clearly delineating the frontier of causal understanding—a frontier that must be crossed for the next leap in computational biology.
