PhD Candidate • Biomedical Data Science

Yixuan Elliot Xie

University of Wisconsin–Madison
Bridging the gap between Single-Cell Genomics and Large Language Models.

“The vast majority of all human effort, however great or minuscule, ends in failure. So what are your options? You just admit, pre-defeat, that odds are you’re going to be right, or you do it anyway. Maybe we’re a success regardless of the outcome, because we tried. Maybe there’s beauty in the struggle against near-certain failure.”

Selected Papers

Featured Paper

CASSIA
Nature Communications • 2024
A multi-agent Large Language Model framework for reference-free, interpretable, and automated single-cell type annotation. CASSIA mimics the reasoning process of human experts to handle ambiguity in high-dimensional biological data.
Tags: LLM Agents • Single-Cell • Python / R

Gut-larynx axis and its contribution to laryngeal immunity (2025)
mSystems • R An, E Xie, J Binns, FE Rey...
Tags: Microbiome • 16S rRNA • Immunology

Surrogate selection oversamples expanded T cell clonotypes (2025)
Annals of Applied Statistics • P Yu, Y Lian, E Xie...
Tags: Bayesian Stats • TCR-seq • Birth-Death Models

Single-cell view into the role of microbiota shaping host immunity in the larynx (2024)
iScience • R An, Z Ni, E Xie...
Tags: scRNA-seq • Microbiome • Cell Atlas

Transcriptomic and proteomic spatial profiling of diffuse midline glioma (2024)
Scientific Reports • Sudarshawn Damodharan, MD...
Tags: Spatial-omics • GeoMx DSP • Cancer

Thoughts

Nov 2025

On the Fragility of Agentic Systems

When we began developing CASSIA, the premise was simple: Large Language Models (LLMs) are excellent reasoning engines. If we treat them as individual agents—one acting as a biologist, another as a critic, and another as a summarizer—we should be able to replicate the manual annotation workflow.

However, biological data is messy. Unlike code generation, where the output works or it doesn't, cell type annotation exists in a grey area. A cell might express markers for both T-cells and NK cells. An LLM agent, trained to be helpful, often tries to "force" a classification where ambiguity is the scientific reality.

The "Yes-Man" Problem

One of the primary failure modes we observed was agent agreeableness. If Agent A (the Proposer) hallucinated a cell type based on weak evidence, Agent B (the Critic) often failed to correct it, instead fabricating a justification for the error. This echo chamber effect is the single biggest hurdle in deploying autonomous agents in rigorous scientific pipelines.

The solution lies not in better prompting, but in grounding. By forcing the agents to query external, immutable knowledge graphs before debating, we reduce the hallucination window. But the struggle remains: how do we teach a system to say "I don't know"?
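The proposer/critic loop with a grounding check can be sketched as below. This is an illustrative toy, not CASSIA's actual implementation: the marker reference, the overlap threshold, and the function names are all invented for the example. The point is the shape of the control flow, where the critic validates against external evidence and abstains rather than forcing a label.

```python
# Hypothetical grounding reference standing in for an external knowledge
# graph: canonical marker genes per cell type (invented for this sketch).
REFERENCE_MARKERS = {
    "T cell": {"CD3D", "CD3E", "CD2"},
    "NK cell": {"NKG7", "GNLY", "KLRD1"},
}

def propose(observed_markers):
    """Proposer agent: suggest the type with the most marker overlap."""
    return max(REFERENCE_MARKERS,
               key=lambda t: len(REFERENCE_MARKERS[t] & observed_markers))

def critique(cell_type, observed_markers, min_overlap=2):
    """Critic agent: accept the proposal only if it is grounded in enough
    observed evidence; otherwise abstain instead of fabricating a rationale."""
    overlap = REFERENCE_MARKERS[cell_type] & observed_markers
    if len(overlap) >= min_overlap:
        return cell_type
    return "Unresolved (insufficient evidence)"

def annotate(observed_markers):
    """Full loop: proposal, then grounded critique."""
    return critique(propose(observed_markers), observed_markers)

print(annotate({"CD3D", "CD3E", "GNLY"}))   # strong T-cell evidence -> accepted
print(annotate({"CD2", "ACTB", "MALAT1"}))  # weak evidence -> abstain
```

The design choice worth noticing is that the critic's check runs against the immutable reference, not against the proposer's stated justification, so an agreeable critic cannot simply echo a hallucinated label.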

Dec 2025

The New Productivity Metric: Agent-Hours per Human-Hour

I've been letting Claude Code run continuously for the past few weeks, and I've realized something fundamental has shifted in how we should measure productivity. The critical metric is no longer output per hour worked. It's how many agent-hours you can generate per human-hour invested.

This is surplus value (剩余价值) in its purest form. But unlike traditional capital, where surplus value is extracted from labor, here the surplus comes from autonomous systems working on your behalf. Every hour I spend designing a task architecture, writing clear specifications, or setting up workflows translates into 10, 20, sometimes 50 hours of agent work running in parallel.
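The metric itself is simple arithmetic; a minimal sketch, with numbers invented purely for illustration:

```python
def agent_leverage(human_hours, agent_hours):
    """Agent-hours of autonomous work generated per human-hour invested."""
    return agent_hours / human_hours

# e.g. 2 hours spent writing specifications and task architecture
# that keep 5 agents busy for 20 hours each, running in parallel
ratio = agent_leverage(human_hours=2, agent_hours=5 * 20)
print(ratio)  # 50.0 agent-hours per human-hour
```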

The beauty of continuous agent work is that it's fundamentally asymmetric. While I sleep, the agents are refactoring code, running experiments, writing documentation, searching through literature. The compounding effect is staggering. A single well-structured prompt can spawn days of productive work.

Here's the uncomfortable truth: if you're not exploiting this, you're losing competitive edge. This isn't optional anymore. The researcher who can orchestrate 100 agent-hours per week will outpace the one doing everything manually by an order of magnitude. The startup that parallelizes development across autonomous agents will ship faster than the one relying solely on human velocity.

This is free capital sitting on the table. The infrastructure is here. The models are capable. The only bottleneck is learning to delegate effectively and trust the process. Those who master this will define the next decade of scientific and technical progress. Those who don't will wonder why they can't keep up.

We're entering an era where the limiting factor isn't intelligence or effort. It's the ability to architect work such that autonomous agents can execute it. That's the skill worth developing. That's the new literacy.