How Successful Are Artificial Intelligence Chatbots on Higher Education Entrance Physics Exams in Turkey?

ABSTRACT

This study evaluated and compared the performance of three artificial intelligence chatbots (OpenAI's ChatGPT, Google's Gemini, and Microsoft's Copilot) on physics questions from the past three years of the Turkish higher education entrance examinations, namely the TYT (Basic Proficiency Test) and the AYT (Field Proficiency Tests). On the TYT Physics questions, ChatGPT answered 38.09% correctly, while Gemini and Copilot each answered 28.57% correctly. On the AYT Physics questions, ChatGPT again scored highest, answering 45.23% correctly, compared with 26.18% for Gemini and 14.28% for Copilot. Overall, ChatGPT performed best and Copilot performed worst. Nonetheless, none of the three chatbots answered either the TYT or the AYT questions accurately enough to be relied on for consistently correct answers.