Datasets
The ODIN 2026 challenge provides two training datasets, both released under the CC BY-NC-SA license: ToothFairy4, for CBCT-based maxillofacial and surgical report generation, and Bite2Text, for multimodal orthodontic report generation from intraoral scans and intraoral photographs. Downloading the data requires signing up.
Any publication that uses our data must explicitly reference this challenge and cite [1, 2, 3] for ToothFairy4 and [4, 5, 6] for Bite2Text.
Test cases are hidden and will not be publicly released.
Task 1 - ToothFairy4 Dataset (Download Page)
Data Description
Each ToothFairy4 case contains:
- A CBCT volume of the jaws.
- An associated textual report describing anatomically and clinically relevant structures visible in the scan.
The report is provided as free text in the original language and as an English-translated version.
Please visit the official webpage for additional details.
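As a minimal sketch of how the case contents listed above could be held in memory, the following Python record pairs one CBCT volume with its two report versions. The class name, field names, identifier, and file path are hypothetical and chosen only for illustration; the actual file layout and formats of the release are not described here and may differ.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ToothFairy4Case:
    """One case: a CBCT volume plus its report in two languages.

    All field names are assumptions made for this sketch, not the
    dataset's real schema.
    """
    case_id: str          # hypothetical identifier
    cbct_path: Path       # hypothetical location of the CBCT volume
    report_original: str  # free-text report in the original language
    report_english: str   # English-translated report

    def has_complete_report(self) -> bool:
        """Return True when both report versions contain non-blank text."""
        return bool(self.report_original.strip()) and bool(self.report_english.strip())


# Placeholder values only; none of them come from the dataset.
case = ToothFairy4Case(
    case_id="example-001",
    cbct_path=Path("cbct/example-001"),
    report_original="(original-language text)",
    report_english="(English translation)",
)
print(case.has_complete_report())
```

A check like `has_complete_report` is one way to filter out cases with an empty report before training, assuming such cases exist at all.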
Data Size
| Split | Number of cases | Publicly released? |
|---|---|---|
| Training | 630 CBCT volumes | Yes |
| Test | 50 CBCT volumes | No |
Task 2 - Bite2Text Dataset (Download Page)
Data Description
Each Bite2Text case contains:
- Upper and lower intraoral 3D scans.
- A set of standardized 2D intraoral photographs acquired during the same visit.
- A structured textual report describing clinically relevant orthodontic findings visible in the 2D/3D data.
The report is provided as free text in the original language and as an English-translated version.
Please visit the official webpage for additional details.
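Because a Bite2Text case combines several modalities, a loader may want to confirm that every part is present before using a case. The sketch below shows one possible way to do that in Python. The class, field names, and file paths are hypothetical and serve only as illustration; the real directory structure and file formats of the release may differ.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List


@dataclass
class Bite2TextCase:
    """One case: two 3D scans, a set of 2D photographs, and a report.

    Field names are assumptions made for this sketch.
    """
    case_id: str                                       # hypothetical identifier
    upper_scan: Path                                   # upper intraoral 3D scan
    lower_scan: Path                                   # lower intraoral 3D scan
    photos: List[Path] = field(default_factory=list)   # 2D intraoral photographs
    report_original: str = ""                          # original-language report
    report_english: str = ""                           # English-translated report

    def missing_parts(self) -> List[str]:
        """List the names of parts that are empty or blank."""
        missing = []
        if not self.photos:
            missing.append("photos")
        if not self.report_original.strip():
            missing.append("report_original")
        if not self.report_english.strip():
            missing.append("report_english")
        return missing


# Placeholder values only; none of them come from the dataset.
complete = Bite2TextCase(
    case_id="example-001",
    upper_scan=Path("scans/example-001_upper"),
    lower_scan=Path("scans/example-001_lower"),
    photos=[Path("photos/example-001_front")],
    report_original="(original-language text)",
    report_english="(English translation)",
)
incomplete = Bite2TextCase(
    case_id="example-002",
    upper_scan=Path("scans/example-002_upper"),
    lower_scan=Path("scans/example-002_lower"),
)
print(complete.missing_parts())
print(incomplete.missing_parts())
```

Returning the list of missing parts, rather than a single boolean, makes it easier to log why a case was skipped.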
Data Size
| Split | Number of cases | Publicly released? |
|---|---|---|
| Training | 2,000 patient cases | Yes |
| Test | 50 patient cases | No |