News On Japan

Why Tokyo University’s Data Science Course Has Teens Hooked

TOKYO, Oct 17 (News On Japan) - A rapidly growing data science program at the University of Tokyo is attracting an unusually wide range of participants, with junior high and high school students studying alongside university students and working adults.

The course, known as GCI, is offered nationwide online and free of charge for students, eliminating barriers even for complete beginners and fueling a surge of interest from across Japan and abroad.

At a recent completion ceremony, organizers reported 10,579 total enrollees and 1,490 graduates, a completion rate of about 14% that underscores how demanding the program is. “I thought I might fail the final assignment, but I managed to finish,” said a second-year junior high school student with little programming experience. GCI is held twice a year, with the next session starting in mid-October. Its popularity has also spread overseas: the English-language version drew 7,700 applicants from 32 countries and 430 universities.

To explore why the course is so compelling, GCI instructor and AI startup researcher Masayuki Sera walked through its approach, from fundamentals to practical applications. Sera works at Twins, a company spun out of the university’s AI lab, and applies data science to real business problems. “The work is wide-ranging,” he said. “For a telecom company, for example, we might predict whether customers are likely to cancel their contracts and then suggest changes to their plans. We also assess whether current strategies are effective and adjust them if necessary.”

The program’s curriculum follows a structured process: explore and clean data, build models, evaluate results, and iterate. A signature assignment involves the “Home Credit Default Risk” challenge, where students predict whether customers will default on loans based on tabular data such as income, family size, and loan type. The training dataset includes about 170,000 rows and 51 columns, while the test set has around 60,000 rows and 50 columns, with the default labels hidden.

Exploratory data analysis (EDA) is emphasized early on, teaching students to identify missing values, outliers, and skewed distributions. In one example, missing entries in household size and product price had to be filled before modeling. Students also learn how class imbalance—92% repay their loans while 8% default—can distort results and why metrics like AUC are better than raw accuracy. Visualization reveals useful patterns: income distributions become more interpretable after log transformations, and certain features, like education level and loan type, strongly correlate with default rates.
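The point about class imbalance can be illustrated with a short sketch. The code below is not taken from the GCI materials; it uses synthetic labels with the same 92/8 split and hand-written helper functions (`accuracy` and `roc_auc` are illustrative names) to compare a "model" that always predicts repayment under both metrics.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted exactly right."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the chance that a random defaulter is scored above a random repayer.

    Computed from average ranks (the Mann-Whitney U statistic), so tied
    scores count as half a win.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every index tied with the score at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum_pos = sum(r for r, t in zip(ranks, y_true) if t == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Synthetic labels (an assumption for illustration): 1 = default, 0 = repaid.
y = [1] * 80 + [0] * 920

# A "model" that predicts repayment for everyone and scores all rows equally.
always_repay_pred = [0] * len(y)
constant_scores = [0.0] * len(y)

baseline_accuracy = accuracy(y, always_repay_pred)
baseline_auc = roc_auc(y, constant_scores)
print(f"accuracy={baseline_accuracy:.2f}, AUC={baseline_auc:.2f}")
```

The always-repay baseline looks strong on accuracy because it matches the majority class, yet its AUC sits at chance level because it cannot rank any defaulter above any repayer.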

Before modeling, text categories must be encoded as numbers and missing values filled. Although one-hot encoding is generally safer, GCI demonstrates label encoding for simplicity with tree-based models. A basic random forest model trained on a 70/30 split achieves an AUC of around 0.65—“not exceptional but proof the features contain predictive power,” Sera noted.
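The preparation steps described here can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the course's actual notebook: the rows, the column names (`loan_type`, `education`, `income`, `default`), and the helper functions are all hypothetical, and the model-fitting step itself is left out.

```python
import random
from statistics import median

# Hypothetical toy rows; the real dataset has far more rows and columns.
rows = [
    {"loan_type": "cash", "education": "secondary", "income": 180000.0, "default": 0},
    {"loan_type": "revolving", "education": "higher", "income": None, "default": 0},
    {"loan_type": "cash", "education": "secondary", "income": 90000.0, "default": 1},
    {"loan_type": "cash", "education": "higher", "income": 120000.0, "default": 0},
    {"loan_type": "revolving", "education": "secondary", "income": None, "default": 0},
    {"loan_type": "cash", "education": "higher", "income": 250000.0, "default": 0},
    {"loan_type": "cash", "education": "secondary", "income": 60000.0, "default": 1},
    {"loan_type": "revolving", "education": "higher", "income": 150000.0, "default": 0},
    {"loan_type": "cash", "education": "secondary", "income": 135000.0, "default": 0},
    {"loan_type": "cash", "education": "higher", "income": 200000.0, "default": 0},
]


def label_encode(rows, column):
    """Replace each category with an integer code; return the mapping used."""
    categories = sorted({row[column] for row in rows})  # sorted for reproducibility
    mapping = {category: code for code, category in enumerate(categories)}
    for row in rows:
        row[column] = mapping[row[column]]
    return mapping


def fill_missing_with_median(rows, column):
    """Fill None entries with the median of the observed values."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill_value = median(observed)
    for row in rows:
        if row[column] is None:
            row[column] = fill_value
    return fill_value


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and hold out test_fraction of them."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


loan_type_codes = label_encode(rows, "loan_type")
education_codes = label_encode(rows, "education")
income_fill = fill_missing_with_median(rows, "income")
train_rows, test_rows = train_test_split(rows)  # 70/30 split
print(loan_type_codes, education_codes, income_fill, len(train_rows), len(test_rows))
```

For brevity the median is computed over all rows before splitting. A more careful workflow would compute it on the training portion only and reuse that value for the held-out rows, so that no information from the evaluation data leaks into preprocessing.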

Students then learn how to improve performance through feature engineering, creating new variables such as the ratio of loan amount to income (repayment burden) or of product price to loan amount (self-financing ratio). These changes can nudge AUC scores upward, sometimes by just 0.5 percentage points (0.005 on AUC’s 0-to-1 scale), a difference that can still shift leaderboard rankings significantly. Other techniques include comparing individual loan amounts to group averages, trying different encoding or imputation strategies, tuning hyperparameters, or even switching algorithms. This iterative cycle of hypothesizing, testing, and refining is where many learners find themselves “hooked.”
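As a rough sketch of the engineered features mentioned here, the ratios and the group comparison can be computed as follows. The applicant records, field names, and function names are hypothetical and chosen only for illustration; the code also assumes income and loan amount are non-zero.

```python
from collections import defaultdict

# Hypothetical applicants; field names are illustrative, not the real schema.
applicants = [
    {"loan_type": "cash", "income": 200000.0, "loan_amount": 400000.0, "goods_price": 360000.0},
    {"loan_type": "cash", "income": 100000.0, "loan_amount": 600000.0, "goods_price": 600000.0},
    {"loan_type": "revolving", "income": 150000.0, "loan_amount": 300000.0, "goods_price": 270000.0},
]


def add_ratio_features(row):
    """Return a copy of the row with the two ratio features added."""
    out = dict(row)
    # Repayment burden: how large the loan is relative to income.
    out["repayment_burden"] = row["loan_amount"] / row["income"]
    # Self-financing ratio: product price relative to the amount borrowed.
    out["self_financing_ratio"] = row["goods_price"] / row["loan_amount"]
    return out


def add_group_relative_amount(rows, group_key="loan_type"):
    """Add each loan amount divided by the mean loan amount of its group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row["loan_amount"]
        counts[row[group_key]] += 1
    result = []
    for row in rows:
        out = dict(row)
        group_mean = totals[row[group_key]] / counts[row[group_key]]
        out["loan_vs_group_mean"] = row["loan_amount"] / group_mean
        result.append(out)
    return result


engineered = add_group_relative_amount([add_ratio_features(a) for a in applicants])
for row in engineered:
    print(row["repayment_burden"], row["self_financing_ratio"], row["loan_vs_group_mean"])
```

Whether any of these features actually raises AUC depends on the data, which is why each one is treated as a hypothesis to test against a held-out split rather than an assumed improvement.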

What keeps even teenagers engaged, instructors say, is the immediate feedback and sense of discovery. With only a few lines of Python, beginners can build a competitive model, and a single visualization can reshape their understanding of the data. “You don’t need to master every algorithm to start,” said Sera. “What matters is rigorous analysis, thoughtful feature design, and relentless iteration.”

GCI’s success reflects a broader trend: data science has become the gateway to artificial intelligence. By grounding learners in core skills—predictive modeling, fair evaluation, and careful data preparation—the course demystifies AI and builds practical foundations. For companies, the message is similar: rather than chasing buzzwords, start by examining existing data, asking the right questions, and letting evidence guide strategy.

Source: テレ東BIZ
