Geraldine is growing. Although I haven't actually written anything that will appear in any of Geraldine's final drafts, her database is becoming something worth smiling about. Geraldine is a traditional typological study. In traditional typological studies, one must collect data from many languages. How many languages? Well... that's hard to say. Nowadays there are publications that compare two or three languages, but in traditional typological studies, databases should consist of at least 100 languages. Is that difficult? Well... it has been up to now. I started collecting data for Geraldine's database 4 years ago. When I turned in my preliminary draft, my database had 38 languages in it. Not exactly close to 100, but I was still proud of it. 38 languages may not seem like a lot, but in traditional typological studies, there are many conditions for selecting languages.
Languages can be related to each other in two ways. The first way is through what we refer to as a 'genetic relationship.' These languages have developed from a common ancestor language that no one really speaks anymore. For example, German and English both came from Proto-Germanic. This is why the languages share so many vocabulary words, like 'house' and 'Haus' or 'drink' and 'trinken'. Languages may also be related through areal contact. When languages have prolonged contact with each other, they start to borrow terms. This is why Japanese has borrowed English words. Some of my favorites are: kay-ki (cake), su-pun (spoon) and jusu (juice). In order to make claims that all languages (or most languages) do something, one must collect data from many languages that are neither genetically nor areally related to each other.
Thus, whenever I come across data from a language that is new to me, I have to look up where it is spoken and the language family it belongs to. Most languages that have a lot of data available are from large language families and are spoken in Europe or in Asia. It is much more difficult to find data from a language spoken deep in the jungles of Papua New Guinea (a place that has roughly 1/8 of the world's languages) or Africa. Even when I do come across the name of a language from a language family I have never heard of before, I have to find a data source that is strong enough for me to defend my use of it. In general, dictionaries are looked down upon because they don't have enough information about how the words are used, and grammars don't contain the information I need.
Before my computer crashed 3 weeks ago, I found a database that had the data I needed in languages that I didn't have. (Yay!) In a few days, Geraldine's database grew from 38 to 68 languages. (Ooooooo.... Aaaaaaaaaaaah....) The only problem was, Geraldine's database still had more than twice as many languages from Europe and Asia as it did from the other 4 major geographic areas. (Fail) This has been on my mind, but uploading everything onto the online courses I teach had to be my priority for the last two weeks. Finally, I had extra time to read last night, and within two pages of reading, I discovered the name of another database. Could it be?! More goodies for Geraldine?!? Indeed... this was another mother lode of data with the needed information from more than 75 languages indigenous to South America. (Tada!) It also has a few more languages from North America and Africa that I can add to my database AND I can copy and paste information from it into Geraldine's database (no more searching for special symbols every other letter). And now, Geraldine's database will reach the 100 languages mark this week. I will only need to find data from another 20 languages or so to round out my database, and I can start reanalyzing the data to test my hypotheses.
This means, if you talk to me this week, I will be frantically busy while also displaying signs of being in an incredibly good mood (until my new data doesn't support my claims and I have to rewrite 30 pages of my preliminary work). And, if you have access to data from a language spoken in Africa, PNG or North America that isn't already in my database.... pass it my way. :)