Abstract
Should researchers avoid translational research in animals in favor of human or AI models? We argue that this debate should focus not on comparing species but instead on how experimental systems can be combined to maximize mechanistic confidence, human relevance, and real-world decision-making value.
Recent guidance from funding and regulatory agencies in the USA has discouraged the use of animal models in favor of human and in silico methods. Undoubtedly, this guidance reflects the considerable progress that has been made in artificial intelligence (AI) and human tissue modeling, coupled with mounting examples of drugs that showed promise in animal models but failed in humans. Yet whether ex vivo modeling approaches are sufficient to entirely replace animal studies is contentious, and in some cases doubtful. Here we propose a function-based framework, one that helps researchers decide which models to use by breaking the problem into three core questions. First, which experiment is needed to show how a drug works (mechanistic causality)? Second, what evidence shows that the drug is likely to work in humans (human relevance)? Third, which data are needed to understand the real-world risks and benefits (translational evaluation)?
Biological discovery has advanced faster than the capacity to translate mechanistic insight into effective and safe therapies in humans. Durable clinical benefit increasingly depends on targeting molecular pathways within complex, context-dependent biological systems — namely, the human body. Although designed interventions have produced notable successes, these gains remain unpredictable, highlighting the challenge of defining what experimental evidence truly justifies clinical advancement.
This challenge is especially acute in oncology. Advances in targeted therapies, immunotherapies, and rational drug combinations have reshaped cancer care, yet sustained responses remain limited to a subset of patients and tumor types.
The debate around replacing animal models with AI and human ex vivo or in vitro systems reflects both the limitations of animal studies and the rapid progress in human-based experimental platforms, large-scale datasets, and computational methods. Animal models have long been foundational for mechanistic biology, enabling causal studies, whole-organism analysis, and detection of toxicity.
Yet concerns regarding reproducibility, problems with predicting human toxicity and efficacy, and well-known translational failures have eroded confidence in animal-only approaches. A frequently cited example is DMXAA, a proposed agonist of the protein STING that advanced into clinical testing before it was found to activate only the mouse homolog of the protein (ref. 1). This case illustrates how preclinical signals may appear persuasive yet fail to translate to human biology. Additional challenges arise from species-specific differences in lifespan, immune dynamics, tumor evolution, and molecular pathways, limitations that are particularly pronounced in immuno-oncology.
Here we argue that the debate should focus not on whether in vivo or human-based models are inherently superior but instead on how experimental systems should be combined to maximize mechanistic confidence, human relevance, and real-world decision-making value. In other words, rather than comparing species, we should focus on the purpose, or translational objective, of each experiment. What biological question is being asked? And which model, or combination of models, is best suited to answer it (Table 1)?