In a new paper published in Science Advances, we’ve launched an extensive database of language grammars called Grambank. With this resource, we can answer many research questions about language and see how much grammatical diversity we may lose if the language endangerment crisis isn’t stopped.
There are more than 7,000 languages in the world, and their grammar can vary a lot. Linguists are interested in these differences because of what they tell us about our history, our cognitive abilities and what it means to be human.
But this great diversity is threatened as more and more languages aren’t taught to children and fall into slumber.
Our findings are alarming: we’re losing languages, we’re losing language diversity, and unless we do something, these windows into our collective history will close.
What is grammar?
The grammar of a language is the set of rules that determines what a sentence is in that language, and what is gibberish. For example, tense is obligatory in English. To combine “Sarah”, “write” and “paper” into a well-formed sentence, I have to indicate a time. If you don’t have tense in an English sentence, then it’s not grammatical.
That’s not the case in all languages though. In the Indigenous language of Hokkaido Ainu in Japan, speakers don’t need to specify time at all. They can add words such as “already” or “tomorrow”, but speakers consider the sentence correct without them.
As the great anthropologist Franz Boas once said: “grammar […] determines those aspects of each experience that must be expressed.”
Linguists aren’t interested in “correct” grammar. We know grammar changes over time and from place to place, and that variation isn’t a bad thing to us. It’s amazing!
By studying these rules across languages, we can get an insight into how our minds work, and how we transfer meaning from ourselves to others. We can also learn about our history, where we come from, and how we got here. It’s rather extraordinary.
A huge linguistic database of grammar
We’re thrilled to release Grambank into the world. Our team of international colleagues built it over several years by reading many books about language rules, and speaking to experts and community members about specific languages.
It was a difficult task. Grammars can differ greatly from one language to the next. Moreover, different people describe how these rules work in different ways. Linguists love jargon, so understanding their descriptions was sometimes a special challenge.
In Grambank, we used 195 questions to compare more than 2,400 languages—including two signed languages. The map below provides an overview of what we have captured.
Each dot represents a language, and the more similar the color, the more similar the languages. To create this map, we used a technique called principal component analysis—it reduced the 195 questions to three dimensions, which we then mapped onto red, green and blue.
The large variation in colors reveals how different all these languages are from each other. Where we get regions with similar colors, such as in the Pacific, this could mean the languages are related, or that they have borrowed a lot from each other.
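For readers curious about how a color map like this can be produced, here is a minimal sketch in Python. It is not the code behind the Grambank map. It assumes a small, randomly generated table of yes/no answers with no missing values, and the function name pca_to_rgb is made up for this illustration. It shows the general idea only: principal component analysis squeezes the 195 answers per language into three numbers, and each of those numbers is rescaled to the 0 to 1 range so it can serve as a red, green or blue value.

```python
import numpy as np


def pca_to_rgb(features: np.ndarray) -> np.ndarray:
    """Project each row onto the first three principal components,
    then rescale every component to [0, 1] so it can be used as RGB.

    `features` is an (n_languages, n_questions) array with no missing
    values (an assumption of this sketch; real survey data has gaps).
    """
    # Center each question (column) so PCA captures variation, not the mean.
    centered = features - features.mean(axis=0)

    # The rows of `vt` are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)

    # Coordinates of every language on the first three axes.
    components = centered @ vt[:3].T

    # Min-max rescale each of the three components to the range [0, 1].
    lowest = components.min(axis=0)
    span = components.max(axis=0) - lowest
    span[span == 0] = 1.0  # avoid dividing by zero for a constant component
    return (components - lowest) / span


# Toy data: 10 hypothetical languages answering 195 yes/no questions.
rng = np.random.default_rng(0)
toy_answers = rng.integers(0, 2, size=(10, 195)).astype(float)

colors = pca_to_rgb(toy_answers)
print(colors.shape)  # one (red, green, blue) triple per language
```

In this sketch, two languages with identical answers always receive the same color, and languages with similar answers tend to land on nearby colors, which is the property the map relies on.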
Language is very special to humans; it’s part of what makes us who we are.
Sadly, the world’s Indigenous languages are facing an endangerment crisis due to colonization and globalization. We know each language lost heavily impacts the health of Indigenous individuals and communities by severing ties to ancestry and traditional knowledge.