AI Seminar: Is the association of nouns to gender classes truly arbitrary?

Speaker

Adina Williams, Postdoctoral Researcher, Facebook Artificial Intelligence Research (FAIR) Group, New York.

Abstract

In linguistics, debate has long raged about the nature of the relationship between nouns and grammatical gender. For languages that have a robust grammatical gender system (i.e., languages that systematically require morphological markers on nouns and, perhaps, other words nearby), how a language chooses to gender its nouns appears, at first glance, largely unrelated to the meaning of the noun. For example, ‘table’ ends up translated into a feminine word in Spanish, but into a masculine one in German! In this talk, I present two approaches for measuring the arbitrariness of gender on inanimate nouns: the first approach measures how well gender correlates with word vector representations of meaning, and the second measures how much information about grammatical gender (in bits) can be recovered from other words in the context of the noun. This work shows that state-of-the-art NLP systems trained on large-scale, multilingual corpora (coupled with information-theoretic tools in the second case) can help us uncover new facts relevant to linguistic typology. These multilingual studies can also be viewed as an initial step towards a general methodological program that uses information-theoretic approaches to shed light on traditional cognitive scientific questions about language structure and use.

The seminar is free and open to everyone.

Bio

Adina Williams is a postdoctoral researcher in the Facebook Artificial Intelligence Research (FAIR) Group in New York City. She received her PhD in Linguistics under the supervision of Liina Pylkkänen in the fall of 2018 from New York University, where she also contributed to the Machine Learning for Language Laboratory in the Center for Data Science with the support of Sam Bowman. Her research aims to bridge the gap between linguistics, cognitive science, and NLP. She is currently working on projects involving natural language inference, evaluating model biases, and information-theoretic approaches to computational morphology.