Performance assessment of artificial intelligence chatbots (ChatGPT-4 and Copilot) for sharing insights on 3D-printed orthodontic appliances: A cross-sectional study - 24/02/25

Doi : 10.1016/j.ortho.2025.100992

Asma Muhammad Yousuf, Fizzah Ikram, Munnal Gulzar, Rashna Hoshang Sukhia ⁎ , Mubassar Fida
Section of Dentistry, Department of Surgery, The Aga Khan University Hospital, P.O Box 3500, Stadium Road, 74800 Karachi, Pakistan

⁎ Rashna Hoshang Sukhia, Section of Dentistry, Department of Surgery, The Aga Khan University Hospital, P.O Box 3500, Stadium Road, 74800 Karachi, Pakistan.

Summary

Objective

To evaluate and compare the performance of OpenAI's ChatGPT-4 and Microsoft Copilot in providing information on 3D-printed orthodontic appliances, focusing on the accuracy and completeness of the content and on response generation time.

Methods

This cross-sectional study proceeded in five stages. First, three orthodontists drafted 125 questions on 3D-printed orthodontic appliances, of which a panel of senior orthodontists selected 105 for inclusion in the study. These questions were then organized into 15 distinct domains. Both chatbots were presented with the questions under consistent conditions, using the same laptop and internet connection. A stopwatch was used to record response times. The responses were anonymized and evaluated by seven experienced orthodontists, who scored accuracy and completeness using standardized tools. Evaluators reached a consensus on each score through discussion to ensure reliability.

Results

Spearman's correlation revealed a moderate to strong negative correlation between accuracy and completeness scores for both chatbots (p≤0.001). This correlation, particularly prominent in Copilot, indicates a trade-off between the two qualities in some responses. Mann-Whitney U tests confirmed significant differences in accuracy and completeness between the chatbots (p≤0.001), whereas the difference in response time was not statistically significant (p=0.204). Cohen's Kappa indicated little to no agreement between the two models on the assessed parameters (p>0.05).
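
For readers less familiar with the three statistics named above, the following is a minimal sketch of how each one is computed, written in plain Python. It is not the authors' analysis code. The function names and the scores in the usage lines are hypothetical and chosen only for illustration, and the sketch returns the raw statistics only, not the p-values reported in the study.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks.

    Undefined (division by zero) if either list is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mann_whitney_u(x, y):
    """U statistic for sample x: its pooled rank sum minus n_x(n_x + 1)/2."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[: len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2


def cohens_kappa(a, b):
    """Unweighted Cohen's kappa for two paired lists of categorical scores.

    Undefined (division by zero) if chance agreement equals 1.
    """
    n = len(a)
    observed = sum(1 for p, q in zip(a, b) if p == q) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical per-question scores (NOT the study's data; scales are assumed).
accuracy_bot_a = [5, 4, 5, 3, 4, 5]
completeness_bot_a = [2, 3, 1, 3, 2, 1]
accuracy_bot_b = [3, 4, 2, 3, 5, 3]

print(spearman_rho(accuracy_bot_a, completeness_bot_a))  # within one chatbot
print(mann_whitney_u(accuracy_bot_a, accuracy_bot_b))    # between chatbots
print(cohens_kappa(accuracy_bot_a, accuracy_bot_b))      # agreement on scores
```

A negative rho corresponds to the accuracy/completeness trade-off described above, and a kappa near zero corresponds to agreement no better than chance. In practice a statistics package would normally be used so that p-values are available as well.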

Conclusion

ChatGPT-4 outperformed Microsoft Copilot in accuracy and completeness, providing more precise and comprehensive information on 3D-printed orthodontic appliances and demonstrating a greater ability to handle complex, detailed requests in this area.


Keywords : 3D printing, 3D printed orthodontic appliances, Artificial intelligence chatbots, ChatGPT-4, Microsoft Copilot, Large language programs

Outline

Descriptive statistics

Correlation between accuracy and completeness

Comparison between ChatGPT-4 and Microsoft Copilot

Cohen's Kappa Analysis of agreement between ChatGPT-4 and Copilot

Contributions of authors

Vol 23 - N° 3

Article 100992 - September 2025
