An open-source vision-language model for computational pathology that enables natural language interactions with whole-slide images. Outperforms MedGemma on organ identification (0.91 vs 0.48), neoplasm detection, and differential diagnosis. Trained on HISTAI-Instruct with 1.1M+ instruction-response pairs.
All Projects
All code, data, and models are publicly available because medical AI research should be open.
A local-first project management tool written in Go. Combines CLI and TUI interfaces for humans with AI agent access through 18 purpose-built MCP tools. Your thoughts stay on your machine. Compatible with Obsidian and other markdown editors.
A domain-agnostic framework for generating synthetic instruction-tuning data at scale using LLMs. Supports DAG-based prompt chaining, multiple inference engines (vLLM, llama.cpp, HuggingFace), and distributed GPU processing.
Full-stack demo, based on ANTONI-Alpha, for the Kaggle MedGemma Impact Challenge. A FastAPI + Next.js application for interactive gigapixel pathology analysis with an integrated whole-slide image viewer.
The largest fully open-source instruction-tuning dataset in computational pathology: 24,259 slides, over 1.1 million conversational instances across 7 languages, 9 organs, and 354 ICD-10 codes.