Why did Google's 'incorrect settings' cause serious network failure in Japan?

Japan Today -- Sep 01

Google Inc has issued an apology for the large-scale communication failure that occurred around noon on Aug 25 in Japan, saying that its incorrect settings caused the failure.

"Due to incorrect network settings, a failure that makes it difficult to access Internet services occurred," the company said. "We apologize for the inconvenience and concern."

From the outset, many experts suspected that a large volume of route information sent from Google had triggered the failure, and that turned out to be true.

NTT Communications Corp, KDDI Corp, and the companies and individuals using the two carriers' communication services were hit especially hard. The failure disrupted Internet connections, Internet-related services, financial transactions, and payment services such as Mobile Suica.

However, Google did not clarify whether the "incorrect network settings" were caused by human error or by defects in software or equipment. It also said, "the information was updated within eight minutes."

So just eight minutes of "incorrect network settings" had a major impact on Japan's communication infrastructure. Why did this happen? Could it happen again?

Google owns data centers around the world and runs a gigantic network connecting them. Companies and communication carriers that operate such large-scale networks exchange "route information" so that traffic can pass between them. The protocol used for this purpose is BGP (Border Gateway Protocol). The Internet exists because of this interconnection of large-scale networks.

In this incident, the route information was wrong, and as a result the communication routes for some services changed.
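The article does not say exactly what was wrong with the route information, so the following is only a simplified sketch of one general way a wrong route can redirect traffic: routers forward packets along the most specific matching prefix they know. The network names "carrier-A" and "network-X" are hypothetical, the addresses are reserved documentation ranges, and real BGP route selection weighs many more attributes than prefix length.

```python
import ipaddress

def best_route(table, destination):
    """Return the (prefix, next_hop) entry with the longest prefix
    that contains the destination, or None if nothing matches."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)

# Hypothetical routing table: traffic for this block normally goes to carrier-A.
table = [
    (ipaddress.ip_network("203.0.113.0/24"), "carrier-A"),
]
before = best_route(table, "203.0.113.10")

# A wrong, more specific route for part of the same block arrives
# from another network (hypothetical "network-X").
table.append((ipaddress.ip_network("203.0.113.0/25"), "network-X"))
after = best_route(table, "203.0.113.10")

print(before[1], "->", after[1])  # carrier-A -> network-X
```

In this toy table, the original route is never removed. The more specific entry simply wins, which is why traffic can shift as soon as wrong route information is accepted and shift back once it is withdrawn.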