Biomni: a general-purpose biomedical AI agent
Benchmark datasets, data loaders, and evaluators for graph machine learning
(NeurIPS D&B 2024) STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases
Automated Hypothesis Testing with Agentic Sequential Falsifications
(ICLR 2026) Optimas: Optimizing Compound AI Systems