Computer Vision for Microscopy Image Analysis 2026

Invited Speakers

Talk Title: Multimodal AI for Healthcare Development: Integrating Imaging, Text, and Genomics

Start Time: 9:00 AM MT
Speaker: Daguang Xu, Senior Research Manager, NVIDIA

Abstract: Medical datasets increasingly contain diverse modalities, including medical imaging, genomics, spatial transcriptomics, and clinical notes. Integrating these heterogeneous data sources has the potential to improve clinical diagnosis and optimize healthcare workflows. In this talk, I will present recent research and foundation models developed at NVIDIA to integrate biomedical data across molecular, cellular, and patient levels. The approach leverages vision–language models and intermediate modality fusion strategies to align imaging, text, and structured clinical information within a shared representation space. By enabling cross-modal reasoning across imaging, molecular data, and clinical records, the framework aims to build more transparent and interpretable AI systems that help clinicians understand complex patient information. Such capabilities may support clinical decision-making, facilitate biomarker discovery, and advance data-driven clinical development.

Speaker Bio: Dr. Daguang Xu is a Senior Research Manager in Healthcare at NVIDIA, where he leads AI research in healthcare. His work focuses on medical imaging analysis, electronic health record (EHR) modeling, computer-aided diagnosis, and the development of multimodal foundation and language models for healthcare applications. His team is the primary contributor to the open-source platforms MONAI and NVIDIA FLARE, which are widely used for medical imaging AI and federated learning.

Talk Title: Vision Transformers for Drug-Induced Toxicity and Mechanism Prediction

Start Time: 9:40 AM MT
Speaker: Alex Beatson, Ph.D., Axiom Bio

Abstract: Axiom is building AI systems to understand, predict, and avoid drug toxicity, using a range of clinical and in vitro data including cell imaging. I'll talk about how we use these data to help drug hunters understand toxicity, including how unsupervised "bioactivity" classifiers on ViT embeddings of cell images can outperform more complex, state-of-the-art assays for drug-induced liver injury prediction, and how the same embeddings can in many cases be used to accurately predict a perturbation's biological targets and pathways. I'll also cover some unglamorous details we've learned along the way: issues with confounding in many liver toxicity benchmarks, tricks for normalizing out batch effects in image features and embeddings, and how today's frontier VLMs can make poor biological inferences from cell imaging but can still produce accurate and useful biological-interpretation-free descriptions of the visual content of cell images.

Speaker Bio: Alex Beatson is a co-founder of Axiom Bio, where his team is building AI systems to help drug hunters understand, predict, and avoid drug toxicity. He and his team have generated lots of wet-lab data, built agents to curate clinical literature, trained models to reason about and predict clinical outcomes, and are working with many of the largest pharma companies to understand and predict drug toxicity and safety.

Talk Title: Multi-modal modeling in precision medicine: from data imputation to synthetic data

Start Time: 10:40 AM MT
Speaker: Olivier Gevaert, Associate Professor, Stanford University

Abstract: Missing data presents a persistent challenge in biomedical research. Data imputation techniques have evolved from single-modality approaches to multi-modal approaches, which show great promise for imputing one modality based on the availability of another. Recent advancements in large, pre-trained artificial intelligence (AI) models, known as foundation models, offer even more powerful solutions for data imputation. We introduce the concept of cross-modal data modeling, a methodology harnessing foundation models to impute missing data and also generate realistic synthetic samples. Multi-modal modeling empowers researchers to model complex interactions among diverse biomedical data types, including omics and imaging. This approach can illuminate how one modality influences another, facilitating in-silico exploration of disease mechanisms without the need for extensive and costly real-world data collection. We highlight ongoing efforts in multi-modal modeling in spatial omics, digital pathology and radiology, and anticipate its substantial contributions to understanding disease biology and enhancing healthcare practices.

Speaker Bio: Dr. Olivier Gevaert is an associate professor at Stanford University focusing on developing machine-learning methods for biomedical decision support from multi-scale data. He is an electrical engineer by training, with additional training in artificial intelligence and a PhD in bioinformatics from the University of Leuven, Belgium. He continued his work as a postdoc in radiology at Stanford and then established his lab in the department of medicine in biomedical informatics. The Gevaert lab focuses on multi-scale biomedical data fusion, primarily in oncology and neuroscience. The lab develops machine learning methods, including Bayesian methods, kernel methods, regularized regression, and deep learning, to integrate molecular data or omics. The lab also investigates linking omics data with cellular and tissue data in the context of computational pathology, imaging genomics & radiogenomics. Dr. Gevaert joined BMIR in 2015 as an Assistant Professor of Medicine.

Talk Title: NOVA: Scalable Vision Foundation Model for Organellome-Wide Phenotyping of Human Neurons

Start Time: 1:40 PM MT
Speaker: Eran Hornstein, Professor, Weizmann Institute of Science

Abstract: Systematic assessment of organelle architectures, termed the organellome, offers valuable insights into cellular states and patho-mechanisms, but remains largely uncharted. Here, we present a deep phenotypic learning approach based on vision transformers, resulting in the Neuronal Organellomics Vision Atlas (NOVA) model, which analyzes confocal images of more than 30 markers of distinct membrane-bound and membrane-less organelles across 11.5 million images of human neurons. Organellomics analysis quantifies perturbation-induced changes in organelle localization and morphology using a rigorous mixed-effects meta-analytic framework that accounts for sampling variance and experimental heterogeneity. Applying this approach, we delineate phenotypic alterations in neurons carrying ALS-associated mutations and uncover a physical and functional crosstalk between cytoplasmic mislocalized TDP-43, a hallmark of ALS, and processing bodies (P-bodies), membrane-less organelles regulating mRNA stability. These findings are validated in patient-derived neurons and human neuropathology. NOVA establishes a scalable framework for quantitative mapping of subcellular phenotypes and provides a new avenue for investigating the neuro-cellular basis of disease.

Speaker Bio: The Hornstein laboratory has pioneered research on RNA biology in ALS for 15 years and has since transformed into a machine learning–driven research hub that now leads the integration of computational biology, AI, and advanced imaging into neurodegeneration research. With unique expertise at the interface of neuroscience and data science, the lab is positioned to deliver transformative insights into the mechanisms of motor neuron death and to establish new therapeutic avenues for ALS and related disorders.

Talk Title: Learning Multi-Modal Tissue Representations via Enhanced Spatial Barcoding and Sequencing

Start Time: 3:40 PM MT
Speaker: Michelle Chan, Assistant Professor, Princeton University

Abstract: Comprehensive tissue profiling requires robust tools capable of capturing diverse genomic modalities in situ. Because of its microfluidic design, Deterministic Barcoding in Tissue for spatial omics sequencing (DBiT-seq) is particularly well suited for extension to multimodal profiling. We have improved DBiT-seq transcriptomic capture by ~2-fold and are now extending the technology for simultaneous profiling of other modalities. Two examples I will describe are targeted genotyping and lineage recording. Together, these optimized platforms form a versatile, multi-omic toolkit, significantly expanding our ability to dissect complex cellular heterogeneity, genetic variation, and gene regulation in native tissue architectures.

Speaker Bio: Michelle Chan is an Assistant Professor at the Lewis-Sigler Institute and in Molecular Biology at Princeton University. She earned her B.Sc. in Computer Science and Microbiology at UBC. Her graduate work in computational biology at MIT in Aviv Regev’s lab centered on epigenetic regulation during embryogenesis. As an LSRF fellow in Jonathan Weissman's lab at UCSF, she developed a CRISPR-Cas9 molecular recorder. The Chan lab integrates computational and experimental methods to characterize the mammalian cell fate map with a focus on understanding how robustness is encoded in development. Michelle is a recipient of the 2022 NIH Director’s New Innovator Award.

Talk Title: The Thinking Microscope: Autonomously Designing, Perturbing, and Analyzing Biological Experiments

Start Time: 4:20 PM MT
Speaker: Vivek Gopal Ramaswamy, Senior Software Engineer, UCSF Gladstone Institutes

Abstract: Modern biological discovery relies on microscopy, yet the process remains largely manual — researchers design experiments, acquire images, and interpret results in separate, human-driven steps. We present the Thinking Microscope, an autonomous closed-loop microscopy system that integrates AI-driven experimental design, targeted optical perturbation, and real-time image analysis into a single self-directing workflow. The system uses computer vision to detect and track biological events of interest, an AI reasoning engine to decide what to do next, and a digital micromirror device (DMD) to deliver spatially targeted light stimulation to individual cells — all without human intervention. By closing the loop between observation and action, the Thinking Microscope can adaptively adjust imaging parameters, select regions of interest, perturb specific cells, and iteratively refine its experimental strategy based on what it sees.

Speaker Bio: Vivek Gopal Ramaswamy is a Senior Software Engineer at the Gladstone Institutes, where he develops next-generation AI systems at the intersection of computer vision, robotics and biomedical research. His work focuses on building biology-specific foundational models that classify, track, and predict cell fate, and on creating generative-AI–based vision tools to uncover novel cellular and brain-tissue phenotypes. By integrating these phenotypes with transcriptomic data, his research advances multimodal foundation models for precision medicine. He has expertise in designing intelligent imaging systems that integrate AI with robotic control, precision motor calibration, and adaptive feedback to enable autonomous and dynamic experimentation. He also leads efforts in scalable image-analysis pipelines capable of processing thousands of whole-slide images using distributed frameworks such as Docker, Kafka, and Kubernetes. His deep learning research spans multi-scale CNNs, Vision Transformers, and multimodal foundation models trained with techniques such as QLoRA and contrastive learning. Across his work, Vivek bridges AI, imaging, and biology to build intelligent, scalable systems accelerating discovery in neuroscience and disease pathology.