Mount Sinai
July 2024

A large language model (LLM) artificial intelligence (AI) system can match, or in some cases outperform, human eye doctors in diagnosing and treating patients with glaucoma and retina disease, according to research from New York Eye and Ear Infirmary of Mount Sinai (NYEE).


The study, published on February 22 in JAMA Ophthalmology, shows that advanced AI tools, trained on large volumes of data, text, and images, can help eye doctors make decisions in diagnosing and managing glaucoma and retina disorders, which affect millions of patients.

The study compared the knowledge of eye specialists with the abilities of the latest AI system, GPT-4 from OpenAI. This AI is designed to perform at a human level. In medicine, advanced AI tools are seen as having the potential to change how conditions are diagnosed and treated. Ophthalmology, which deals with many complex cases, could benefit substantially from AI, which could give doctors more time to practise evidence-based medicine.

“The performance of GPT-4 in our study was quite eye-opening,” says Andy Huang, MD, an ophthalmology resident at NYEE and lead author of the study. “We recognised the enormous potential of this AI system from the moment we started testing it and were fascinated to observe that GPT-4 could not only assist but, in some cases, match or exceed the expertise of seasoned ophthalmic specialists.”

For the human part of the study, the team from Mount Sinai recruited 12 specialists and three senior trainees from their Department of Ophthalmology. They selected 20 basic questions (10 for glaucoma and 10 for retina) from a list of common questions patients ask, along with 20 patient cases from Mount Sinai eye clinics. Responses from the GPT-4 AI system and from the human specialists were then compared, with each response analysed and rated for accuracy and completeness on a Likert scale, a tool commonly used in clinical research.

The results showed that the AI matched or outperformed human specialists in both accuracy and completeness. It performed better than the specialists in responding to glaucoma questions and in case management; for retina questions, it matched them in accuracy but exceeded them in completeness.

“AI was particularly surprising in its proficiency in handling both glaucoma and retina patient cases, matching the accuracy and completeness of diagnoses and treatment suggestions made by human doctors in a clinical note format,” says Louis R. Pasquale, MD, Deputy Chair for Ophthalmology Research, and senior author of the study. “Just as the AI application Grammarly can teach us how to be better writers, GPT-4 can give us valuable guidance on how to be better clinicians, especially in terms of how we document findings of patient exams.”

While more testing is needed, Dr. Huang believes this work shows a promising future for AI in ophthalmology. “It could serve as a reliable assistant to eye specialists by providing diagnostic support and potentially easing their workload, especially in complex cases or areas of high patient volume,” he explains. “For patients, the integration of AI into mainstream ophthalmic practice could result in quicker access to expert advice, coupled with more informed decision-making to guide their treatment.”