Dr Dunstan Brown, Prof Greville G. Corbett, Dr Trevor Sweeting, Dr Carole Tiberius, Dr Peter Williams
Period of award
January 2003 - December 2005
Economic and Social Research Council (ESRC) - RES-000-23-0082
The relationship between frequency of use and grammar in natural language is poorly understood. To understand it better, we examined textual frequency distributions in Russian, a language that encodes a reasonable number of grammatical distinctions in its word forms. We had previously developed, for other purposes, a precise, computationally verified hierarchical model of Russian morphology. In this project we took the next logical step: using this model to determine how far distinct categorizations within it correspond to differences in use in Russian texts. To do so, we focused on a specific construct that we had already investigated cross-linguistically: syncretism, or grammatical ambiguity, where one form can have multiple functions. The major new element in this project was a corpus-based investigation of the relationship between frequency of use and syncretism. There are different types of syncretism, which we reflected by locating them at different points in the hierarchy of our formal model; this made syncretism an ideal construct for investigating the more general and harder question of the relationship between textual frequency and grammar.