Datasets


The ODIN 2026 challenge provides two training datasets, both released under the CC BY-NC-SA license: one for CBCT-based maxillofacial and surgical report generation (ToothFairy4) and one for multimodal orthodontic report generation from intraoral scans and intraoral photographs (Bite2Text). Sign-up is required to download the data.

Any publication that uses our data must explicitly reference this challenge and cite [1, 2, 3] for ToothFairy4 and [4, 5, 6] for Bite2Text.

Test cases are hidden and will not be publicly released.

Task 1 - ToothFairy4 Dataset (Download Page)


Data Description

Each ToothFairy4 case contains:

  1. A CBCT volume of the jaws.
  2. An associated textual report describing anatomically and clinically relevant structures visible in the scan.

The report is provided as free text in the original language and as an English-translated version.
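
As a rough illustration of the case structure listed above, a loader might represent one case as a small record. This is only a sketch: the class name `ToothFairy4Case`, its field names, the `report` helper, and the example identifier and path are all assumptions made here, not the dataset's actual file layout, which is documented on the official webpage.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ToothFairy4Case:
    """Hypothetical in-memory record for one case (all names are assumptions)."""
    case_id: str
    cbct_path: Path        # location of the CBCT volume of the jaws
    report_original: str   # free-text report in the original language
    report_english: str    # English-translated version of the same report

    def report(self, english: bool = True) -> str:
        """Return the English translation by default, else the original text."""
        return self.report_english if english else self.report_original


case = ToothFairy4Case(
    case_id="example-001",                      # made-up identifier
    cbct_path=Path("cbct/example-001.nii.gz"),  # made-up path and file format
    report_original="(original-language text)",
    report_english="(English translation)",
)
print(case.report())
```

Keeping both report versions on the same record makes it easy to switch the target language without reloading the volume.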

Please visit the official webpage for additional details.

Data Size

Split      Number of cases      Publicly released?
Training   630 CBCT volumes     Yes
Test       50 CBCT volumes      No

Task 2 - Bite2Text Dataset (Download Page)


Data Description

Each Bite2Text case contains:

  1. Upper and lower intraoral 3D scans.
  2. A set of standardized 2D intraoral photographs acquired during the same visit.
  3. A structured textual report describing clinically relevant orthodontic findings visible in the 2D/3D data.

The report is provided as free text in the original language and as an English-translated version.
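
Because each case combines several modalities, it may help to check that a record is complete before using it. The following is a minimal sketch under stated assumptions: the class name `Bite2TextCase`, its fields, the `missing_parts` helper, and the example paths are hypothetical and do not reflect the dataset's real directory structure or file formats.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List


@dataclass
class Bite2TextCase:
    """Hypothetical record for one patient case (all names are assumptions)."""
    case_id: str
    upper_scan: Path                                   # upper intraoral 3D scan
    lower_scan: Path                                   # lower intraoral 3D scan
    photos: List[Path] = field(default_factory=list)   # 2D intraoral photographs
    report_original: str = ""                          # original-language report
    report_english: str = ""                           # English translation

    def missing_parts(self) -> List[str]:
        """List the optional-at-construction components that are still empty."""
        missing = []
        if not self.photos:
            missing.append("photos")
        if not self.report_original:
            missing.append("report_original")
        if not self.report_english:
            missing.append("report_english")
        return missing


# Made-up example: scans and the original report are present, the rest is not.
incomplete = Bite2TextCase(
    case_id="patient-0001",
    upper_scan=Path("scans/patient-0001_upper.stl"),
    lower_scan=Path("scans/patient-0001_lower.stl"),
    report_original="(original-language text)",
)
print(incomplete.missing_parts())
```

A check like this only inspects the in-memory record; verifying that the referenced files exist on disk would be a separate step.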

Please visit the official webpage for additional details.

Data Size

Split      Number of cases        Publicly released?
Training   2,000 patient cases    Yes
Test       50 patient cases       No