AI chatbots give misleading medical advice 50% of the time, study finds
Artificial intelligence-driven chatbots give users problematic medical advice about half the time, according to a new study, highlighting the health risks of a technology that is becoming increasingly integral to day-to-day life.
Researchers from the United States, Canada and the United Kingdom evaluated five popular platforms — ChatGPT, Gemini, Meta AI, Grok and DeepSeek — by asking each of them 10 questions across five health categories. Of the total responses, about 50% were deemed problematic, including almost 20% that were highly problematic, according to findings published this week in the medical journal BMJ Open.
The chatbots performed better on closed-ended prompts and on questions related to vaccines and cancer, and worse on open-ended prompts and in areas such as stem cells and nutrition, according to the study.
Answers were often delivered with confidence and certainty, though no chatbot produced a fully complete and accurate reference list in response to any prompt, the researchers said. There were only two refusals to answer a question, both from Meta AI.
The results highlight the growing concern about how people are using generative AI platforms, which aren’t licensed to give medical advice and lack the clinical judgment to make diagnoses.
The explosive growth of AI chatbots has made them a popular tool for people seeking guidance on their ailments; OpenAI has said that more than 200 million people ask ChatGPT health and wellness questions every week. In January, OpenAI announced health tools for both everyday users and clinicians, and Anthropic said the same month that its Claude product is launching a new health care offering.
A major risk to the deployment of chatbots without public education and oversight is that they could amplify misinformation, the BMJ Open study authors said.
The findings “highlight important behavioral limitations and the need to reevaluate how AI chatbots are deployed in public-facing health and medical communication,” they wrote. These systems can generate “authoritative-sounding but potentially flawed responses,” they wrote.
©2026 Bloomberg L.P. Visit bloomberg.com. Distributed by Tribune Content Agency, LLC.