ODIN 2026 Challenges

Multimodal Text Report Generation for Oral and Dental Image Analysis


ODIN 2026 is a cluster of challenges associated with the Oral and Dental Image aNalysis Workshop (ODIN 2026) at MICCAI 2026. The cluster focuses on automatic generation of structured clinical reports from routine oral and dental imaging.

Three-dimensional imaging is now routine in dentistry and maxillofacial surgery. Cone-beam computed tomography (CBCT) supports diagnosis and surgical planning by capturing internal dental and craniofacial anatomy, while intraoral scanning (IOS) provides accurate surface geometry of crowns and gingiva. Despite the increasing availability of rich 3D and complementary 2D data, clinical reporting remains largely manual, time-consuming, and subject to inter-observer variability.

ODIN 2026 addresses this gap by benchmarking systems that transform multimodal oral imaging into clinically meaningful text reports. The cluster includes two complementary tracks:

  • Task 1 - ToothFairy4: Maxillofacial and surgical report generation from CBCT volumes.
  • Task 2 - Bite2Text: Orthodontic report generation from intraoral scans and intraoral photographs.

The goal is to advance clinically useful multimodal 3D-to-text and 2D/3D-to-text learning, with a strong emphasis on robustness across acquisition centers, scanners, protocols, and patient populations.

Background

Previous ODIN challenge editions, including the ToothFairy and 3DTeethSeg/Land series, focused primarily on geometric understanding tasks such as segmentation, landmark detection, and labeling. ODIN 2026 moves one step further by evaluating end-to-end report generation: models must reason over volumes, surface meshes, photographs, and clinical language to produce structured textual outputs suitable for clinical decision support.

The challenge is designed to evaluate not only surface-level text similarity, but also clinical correctness, completeness, and usefulness. Automatic metrics will be complemented by blinded expert review for top-performing methods.
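To illustrate why surface-level text similarity alone is not enough, the following minimal sketch computes a simple token-overlap F1 score between two report sentences. It is not the challenge's official metric. The function name token_f1 and both example sentences are hypothetical, chosen only to show that a clinically critical error (left versus right) barely lowers an overlap-based score.

```python
from collections import Counter


def token_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1 between two texts (illustrative only, not an official metric)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical report sentences that differ only in the side of the finding.
reference = "impacted lower left third molar in contact with the mandibular canal"
candidate = "impacted lower right third molar in contact with the mandibular canal"

score = token_f1(candidate, reference)
print(f"token F1 = {score:.3f}")
```

Ten of the eleven tokens match, so the score stays high even though the candidate names the wrong side. Errors of this kind are what the clinical-correctness criteria and the blinded expert review are meant to catch.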

Tasks

Task 1 - ToothFairy4: Maxillofacial and Surgical Report Generation from CBCT

ToothFairy4 mirrors the workflow of CBCT-based surgical and interventional planning. Participants must generate reports that describe clinically actionable findings, such as dental status, bone quality and quantity, anatomical variants, proximity to critical structures, and procedure-related risk factors.

The input is a 3D CBCT volume of the jaws. The expected output is a clinically meaningful report supporting use cases such as tooth extraction, implant placement, and other maxillofacial interventions.

Task 2 - Bite2Text: Orthodontic Report Generation from Intraoral Scans and Photographs

Bite2Text reflects routine orthodontic diagnosis and treatment planning. Participants must generate orthodontic reports from 3D intraoral scans and 2D intraoral photographs.

The task requires multimodal reasoning over dental geometry and visual appearance. Reports should describe clinically relevant orthodontic findings such as malocclusion patterns, occlusal relationships, crowding and spacing, overjet and overbite categories, molar and canine relationships, and treatment-relevant anomalies.

Announcements

  • 29/05/2026: First 1000 training cases released for Bite2Text.
  • 25/05/2026: Grand Challenge website launched.
  • 10/05/2026: Training cases released for ToothFairy4.