Global Economic Times

South Korean AI Models Flunk College Entrance Math Exams, Lagging Far Behind Global Leaders

Yim Kwangsoo Correspondent / Updated : 2025-12-15 07:01:13


SEOUL - A recent performance comparison of South Korea's leading large language models (LLMs), often dubbed "National AI" contenders, revealed a significant gap in mathematical problem-solving ability between them and their international counterparts. Most of the domestic models failed to achieve passing grades on mathematics tests that included problems from the highly challenging Suneung (College Scholastic Ability Test, or CSAT).

A research team led by Professor Kim Jong-rak of Sogang University's Department of Mathematics conducted the assessment. The team tested five major South Korean LLMs (Upstage’s Solar Pro-2, LG AI Research’s Exaone 4.0.1, Naver’s HCX-007, SK Telecom’s A.X 4.0 (72B), and NCSOFT’s lightweight model Llama Varco 8B Instruct) against five frontier international models: GPT-5.1, Gemini 3 Pro Preview, Claude Opus 4.5, Grok 4.1 Fast, and DeepSeek V3.2.

Rigorous Testing Methodology

The researchers administered a total of 50 mathematics problems across two categories:

  • Suneung (CSAT) Math (20 Problems): The most difficult questions from the common-subject, Probability and Statistics, Calculus, and Geometry sections of the highly competitive South Korean CSAT.
  • Essay-Type/Advanced Math (30 Problems): Questions from the entrance exams of 10 domestic universities, 10 questions from the Indian university entrance examination, and 10 questions from the mathematics section of the graduate school entrance exam for the University of Tokyo's Faculty of Engineering.

In the initial test, which combined the 20 Suneung and 30 essay-type problems, the performance disparity was stark. The international models consistently scored high, ranging from 76 to 92 points. The South Korean models, by contrast, struggled. Solar Pro-2 was the only one to reach 58 points, while most of the others languished in the 20s. NCSOFT's Llama Varco 8B Instruct recorded the lowest score, a mere 2 points.

The research team noted that the results remained discouraging even though the domestic models had been set up to use Python as a tool, a measure intended to raise problem-solving accuracy beyond what simple inference achieves.

Second Test: EntropyMath Dataset Confirms Lag

The researchers conducted a second test using a proprietary dataset they developed called 'EntropyMath,' which features 100 questions of varying difficulty, from university-level to professorial research standards. Ten selected questions from this set were presented to the 10 AI models.

The results mirrored the first test: the international models achieved scores between 82.8 and 90 points, whereas the domestic models scored significantly lower, ranging from 7.1 to 53.3 points.

In a third round, in which the models were given three chances to arrive at the correct answer to each problem, the international models again dominated. Grok 4.1 Fast achieved a perfect score, and the rest of the overseas models scored 90 points. The best-performing domestic model, Solar Pro-2, scored 70 points, followed by Exaone at 60 points. The other domestic contenders, HCX-007, A.X 4.0, and Llama Varco 8B Instruct, recorded 40, 30, and 20 points, respectively.
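
The article does not describe how the research team computed these scores. As a rough illustration only, a "three chances" evaluation is often scored by counting a problem as solved if any attempt matches the answer key. The sketch below assumes that rule and equal weight per problem; the function name `best_of_k_score` and the sample data are hypothetical and are not taken from the study.

```python
def best_of_k_score(attempts_per_problem, answer_key):
    """Return the percentage of problems solved in at least one attempt.

    Assumption (not from the study): a problem counts as solved if any
    of its attempts equals the reference answer, and every problem
    carries equal weight.
    """
    solved = sum(
        any(attempt == correct for attempt in attempts)
        for attempts, correct in zip(attempts_per_problem, answer_key)
    )
    return 100 * solved / len(answer_key)


# Hypothetical data: four problems, three attempts each.
answer_key = [4, 9, 16, 25]
attempts = [
    [4, 4, 4],    # correct on the first attempt
    [8, 9, 7],    # correct on the second attempt
    [1, 2, 3],    # never correct
    [25, 0, 0],   # correct on the first attempt
]

score = best_of_k_score(attempts, answer_key)
print(score)  # 75.0
```

Under this rule, a model that answers 7 of 10 problems correctly within three tries would score 70 points, which is consistent with the round figures reported for this round, though the team's actual scoring procedure may differ.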

Call for Improvement and Future Plans

"There was a lot of inquiry about why there was no evaluation of the five domestic sovereign AI models on Suneung problems, so our team conducted this test," Professor Kim explained. "It confirmed that the level of domestic models is significantly behind that of the overseas frontier models."

The research team acknowledged that the domestic models tested were existing public versions, and it plans to conduct a re-evaluation once each team officially releases its updated, dedicated "National AI" version.

Professor Kim also announced the launch of a dedicated mathematics leaderboard based on the EntropyMath dataset, with the goal of expanding it into an international standard. He added that the team will improve its proprietary problem-generation algorithms and pipelines to create specialized datasets for domains beyond mathematics, including science, manufacturing, and culture, and so help improve the performance of domain-specific AI models.

The study was jointly supported by Sogang University's Institute of Mathematical Sciences and Data Science (IMDS) and Deep Fountain.

[Copyright (c) Global Economic Times. All Rights Reserved.]
