“Of the many languages that have ever been spoken, only a few of them have been able to achieve global prominence, they have been important enough to become a global language,” Hidalgo told Serious Science.
The researchers began to form their Global Language Network by identifying sources of media that had been translated into multiple languages. This included analyzing data from books, Wikipedia, and Twitter. The data set for the books included 2.2 million volumes that represented over 1,000 languages. Books that were translated from one language into another were connected in their data map. Articles on Wikipedia that had been edited by humans, not bots, were analyzed to see if editors were writing in multiple languages. The Twitter data consisted of tweets sent by 17 million users, spanning 73 languages. If a Twitter user sent out tweets in multiple languages, say French and Italian, those two languages were connected.
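The linking rule described for Twitter can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' actual pipeline: the function name `build_language_links`, the toy user data, and the choice to weight each link by a raw count of shared users are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def build_language_links(language_sets):
    """Count, for each pair of languages, how many records use both.

    `language_sets` is an iterable of collections of language codes,
    e.g. the set of languages one user tweeted in (hypothetical input).
    """
    links = Counter()
    for languages in language_sets:
        # Sorting gives every unordered pair a single canonical key,
        # so ("fr", "it") and ("it", "fr") are counted together.
        for pair in combinations(sorted(set(languages)), 2):
            links[pair] += 1
    return links


# Hypothetical toy data: the languages each of four users tweeted in.
users = [
    {"fr", "it"},
    {"en", "fr"},
    {"en", "fr", "it"},
    {"en"},  # a monolingual user adds no links
]

links = build_language_links(users)
for (a, b), count in sorted(links.items()):
    print(a, b, count)
```

Under these assumptions, a user who tweets in only one language contributes nothing, and a user who tweets in three languages connects all three pairs. Book translations have a direction (from one language into another), so a sketch for that data set would need ordered pairs instead.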
The map shows which languages are connected to one another, with the thickest lines indicating the strongest links. An interactive version of this map is available on MIT’s website. Credit: S. Ronen et al., PNAS 2014
In all three data sets, English turned out to be the largest hub for information passing from one language to another. Other languages, including Russian, German, and Spanish, also serve as hubs, but to a lesser extent.
Hidalgo pointed out that although over 50% of all communication on the internet is in English, that does not necessarily reflect a bias toward the language. If English is the prevailing language of the internet, and the internet is how most people now communicate, that speaks to the ability of English to connect people across all languages.
The ability to communicate with more people confers a certain amount of power, because it brings the ability to influence more people.
“Basically, being born into highly connected language is a better predictor of whether that person is going to be important or not, than being born into language that is very populous, or that is spoken by people that are very wealthy,” Hidalgo continued. “[T]he centrality of a language in the global language network is a significantly strong predictor of whether that language produces a large number of successful people after controlling for their income and the population of the language.”
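To make the idea of a language's "centrality" concrete, here is a minimal sketch of one common measure, eigenvector centrality, computed by power iteration. The article does not say which measure the researchers used, so treating it as eigenvector centrality is an assumption, and the link weights below are invented toy values rather than data from the study.

```python
def eigenvector_centrality(weights, iterations=100):
    """Approximate eigenvector centrality of an undirected weighted graph.

    `weights` maps an unordered pair (a, b) to a link weight.
    Scores are scaled so the most central node gets 1.0.
    """
    nodes = sorted({n for pair in weights for n in pair})
    neighbors = {n: {} for n in nodes}
    for (a, b), w in weights.items():
        neighbors[a][b] = w
        neighbors[b][a] = w

    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Each node's new score is its old score plus the weighted scores
        # of its neighbors. Keeping the old score (iterating on A + I
        # instead of A) leaves the ranking unchanged and prevents the
        # iteration from oscillating on some graphs.
        new = {
            n: score[n] + sum(w * score[m] for m, w in neighbors[n].items())
            for n in nodes
        }
        top = max(new.values())
        score = {n: v / top for n, v in new.items()}
    return score


# Hypothetical toy network: weights are made up for illustration.
toy_links = {
    ("en", "fr"): 3,
    ("de", "en"): 2,
    ("en", "ru"): 2,
    ("de", "ru"): 1,
}

scores = eigenvector_centrality(toy_links)
for lang in sorted(scores, key=scores.get, reverse=True):
    print(lang, round(scores[lang], 3))
```

In this toy network the language linked most strongly to every other one comes out on top. The intuition is that a language scores highly when it is connected to other well-connected languages, not merely when it has many speakers.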