CASSIA Benchmark
Performance comparison of different LLMs using the CASSIA method for single-cell annotation
Optimal models appear in the top-left corner (higher score, lower cost). Larger circles indicate better overall ranking.
CASSIA (Collective Agent System for Single-cell Interpretable Annotation) is the first multi-agent LLM-based method for single-cell annotation. It enhances annotation accuracy across diverse datasets and rare cell types by integrating step-by-step reasoning, validation, quality scoring, and optional refinement or retrieval-augmented generation.
The method combines five basic agents and five advanced agents to produce comprehensive, interpretable cell type annotations that perform robustly across different tissues.
Dataset
100 cell types across 5 tissue datasets: human kidney, human lung, human large intestine, human fetal skin, and a whole-mouse atlas.
Evaluation Metric
We built a scoring agent that rates the similarity between each predicted cell type and its gold-standard label, then averages those similarity scores within each tissue. The agent tends to underestimate accuracy. Although some clear errors in the gold standard were corrected, the true accuracy is still believed to be higher than the reported scores.
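The per-tissue averaging step can be illustrated with a minimal sketch. It does not reproduce the scoring agent itself: it assumes the agent has already assigned each annotated cell type a numeric similarity score, and the 0-to-1 scale, the function name `average_similarity_per_tissue`, and the example values are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def average_similarity_per_tissue(records):
    """Average similarity scores within each tissue.

    records: iterable of (tissue, similarity) pairs, one per annotated
    cell type. The similarity scale (0 to 1 here) is an assumption.
    """
    by_tissue = defaultdict(list)
    for tissue, similarity in records:
        by_tissue[tissue].append(similarity)
    # One mean score per tissue
    return {tissue: mean(scores) for tissue, scores in by_tissue.items()}


# Made-up illustrative scores, not benchmark results.
example = [
    ("human kidney", 1.0),
    ("human kidney", 0.5),
    ("human lung", 0.0),
    ("human lung", 1.0),
    ("human lung", 0.5),
]
per_tissue = average_similarity_per_tissue(example)
```

Averaging within each tissue means every tissue gets its own score regardless of how many cell types it contributes, so the tissues can be compared side by side.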