Surrey Morphology Group's research focuses on the consequences linguistic diversity has for developing theories of language and the role this plays in understanding the human mind.
The Nuer Literacy Initiative targets inequalities in access to mother-tongue education in the Nuer language, spoken in South Sudan and Ethiopia. The major outcome will be 58 new digital books and 15,000 physical books to benefit Nuer speakers, heritage learners and second-language learners in East Africa and the global diaspora. These will be produced by Nuer authors, translators and illustrators in conjunction with the South African Institute for Distance Education. The books will have a transformative effect in providing teachers and parents with the materials they need to engender a love of reading Nuer in children.
Many languages of Europe have changed dramatically over the centuries. One of the major events was the loss of rich case systems found in languages like Latin or Old English, leaving us just with remnants such as 'she' versus 'her'. This evolution is preserved only spottily in texts and is poorly understood. But a similar process is still underway in South Slavonic varieties spoken between East Serbia and West Bulgaria. Here the dialects reflect the different historical stages, allowing us to investigate language change directly, and to develop statistically-based models and test existing theories of grammatical evolution.
In the few milliseconds necessary for speakers to say a word and for listeners to understand it, they both make several elaborate deductions. The internal structure of words can be a crucial source of information for these deductions, particularly when words have multiple grammatical forms, a phenomenon known as inflection. Across languages, the nature and number of contrasts expressed through inflection can vary greatly. While a language such as English has only a handful of grammatical distinctions, some languages can have thousands.
Even closely related languages can exhibit a stunning diversity of morphological complexity, which raises the question: how does this complexity evolve over time, and can we design computational models that would allow us to turn the clock backwards? This project elaborates an evolutionary theory of inflectional systems, such as verb conjugations and noun declensions. It emphasises interdisciplinary parallels between linguistics and fields such as cultural evolution and seeks to demystify the potential of mathematical modelling for inquiry into the linguistic past.
Mutual intelligibility (MI) between languages is observed when a speaker of one language can understand a speaker of another (related) language without any special preparation. Currently, there is no consensus amongst linguists on how MI should be tested and measured, and which linguistic factors are the primary determinants of the degree of MI between languages. This project scrutinises the relation of MI and morpho-syntactic features, using experimental methods to investigate asymmetries in MI in three Turkic languages: Kazakh, Karakalpak and Uzbek.
The very existence of gender is a source of bafflement: why in Russian is 'elbow' masculine, while 'knee' is neuter and 'bone' is feminine? Why do some Dutch speakers distinguish three genders, and others only two? Gender challenges language learners and excites linguists and psychologists no less. The origin of grammatical gender is a major question in linguistics, and the related issue of how entities are categorised by speakers of different languages is a key question in psychology. How do such systems arise, and what is their impact on the speakers to whom they are native? And most pertinently, why do such different methods of categorisation exist?
Whilst seemingly rare typologically, 'external agreement' appears with fascinating regularity in languages of the Nakh-Daghestanian family spoken in the Caucasus, where there are 17 languages with diachronically unrelated instances of external agreement. Such an abundance of examples appearing in languages with considerable variation in their syntactic systems makes external agreement in Nakh-Daghestanian an ideal opportunity for research into morphosyntactic, semantic and pragmatic mechanisms which regulate not only agreement, a fundamental part of a grammar of many languages, but also the less obvious relationships between syntactic elements in a sentence.
Many languages show seemingly arbitrary elaboration of their inflection. For example, most English nouns can take a plural ending, but it is not the same for every noun: compare bird-s, ox-en, and phenomen-a. Exactly which one to use is an additional fact that must be learnt and remembered. Within a general theory of language, such morphological complexity is rather the elephant in the room: it is far from clear what it’s doing there, and why it is taking up so much space. One of the most extreme examples of inflectional variation – the most extreme, we would argue – comes from Seri, a language isolate spoken by approximately 900 people on the Sonoran coast in Mexico.
Differences in meaning are often expressed through transparent changes in the forms of a word (e.g. the verbs ‘talk’, ‘jump’ and ‘shout’ all take the suffix ‘-ed’ to indicate past tense: ‘talked’, ‘jumped’ and ‘shouted’). But sometimes differences in meaning are expressed by unexpected changes, which we refer to as “splits”; for example, the verb ‘go’ exhibits a “split” between ‘go’ in the present tense and ‘went’ in the past tense. Investigating diverse splits across the world’s languages will reveal the surprising extent to which individual ‘forms’ may differ from each other while belonging to the same ‘word’.
Over the last 1200 years English has lost nearly all of its complex inflectional system, radically transforming its character, and similar developments have occurred in the histories of languages all across the world. At first glance this looks simply like decay, and this is often how it figures in the public imagination. But the loss of inflection is a complex and multidimensional process. The processes involved in the loss of inflection are a potential source of insight into the workings of grammar, seen from a unique perspective.
Syntacticians generally assume that the properties of the head of a phrase are more important for phrase-external syntactic processes than the properties of the non-head subconstituent. Yet possessive constructions pose a challenge for an adequate theoretical account of possible linguistic systems, since several languages exist in which the properties of a possessor, standardly assumed to act as a non-head daughter within a possessive phrase, figure more prominently in syntax than expected by triggering grammatical agreement on the clausal predicate or by participating as a controller in the switch reference system.
One of the world's most extreme examples of a morphologically complex language comes from Nuer, a member of the West Nilotic branch of the Nilo-Saharan language family, mainly spoken in the Republic of South Sudan. This complexity is not due so much to a large number of forms, but rather to their unpredictability and internal structure. Alongside the structural complexities of the system, the prosodic system poses particular descriptive challenges of its own, in which the overlapping effects of tone, phonation type and the typologically unusual three-way vowel length distinction must be untangled through careful acoustic measurements.
Inflection is often fairly transparent, involving an incremental mapping between meanings and form. One alternative is distributed exponence, in which the marking of grammatical meaning is distributed across a number of smaller pieces of the word, each of which contributes a subcomponent of that meaning.
Typically, a language will have only gender or classifiers, but we sometimes find both systems together. How fundamentally different systems of categorization interact in a language can uncover important principles underlying the interaction between semantics, morphology, and cognitive categories in general.
Mental categorisation is reflected in some languages by grouping names of entities into nominal classes, and more rarely events and states into verbal classes. Experimental data from Eegimaa speakers collected in Senegal provides insight into a complex cross-categorial categorisation system.
The factors underlying differential subject marking systems are conditional and probabilistic. In the Tibeto-Burman languages of Manang District, Nepal, ergative case marking is largely determined by information-structural properties of a clause rather than purely structural syntactic constraints.
Inflectional classes appear to be functionally useless, but can be highly structured and remarkably resilient over time. The Oto-Manguean languages of Mexico provide important evidence of the limits of inflectional idiosyncrasy that a human language can tolerate.
The Archi language of Daghestan presents an unusually pervasive agreement system that poses challenges for the central tenets of different syntactic theories. This extreme morphosyntactic system provides a rich testing ground for comparing and evaluating the claims and predictions of HPSG, LFG and Minimalism.
Morphological systems introduce an extra layer of structure in between meaning and its expression. Such apparently arbitrary distinctions may exhibit an astonishing degree of complexity: a key resource for understanding mental processes that are unconscious, yet reflect a highly structured autonomous system.
SENĆOŦEN is the language of the Saanich First Nations community from the Saanich Peninsula of Vancouver Island and neighbouring Gulf and San Juan Islands on the west coast of Canada. Along with five closely related Northern Straits dialects, it is one of 32 indigenous languages of British Columbia.
The Alor-Pantar languages are a group of about 20 endangered non-Austronesian languages spoken on the islands of Alor and Pantar in the eastern Indonesian province of Nusa Tenggara Timur. Two typologically interesting phenomena in these languages shed light on the semantic underpinnings of grammatical features.
Russian is a language of tremendous geographic breadth and of remarkable linguistic diversity. Audio data recorded in a wide range of locations, from Siberia and the Far East to Southern Russia, provides the basis for examining the relationship between phonology, morphology, syntax, the lexicon, discourse and socio-linguistic factors.
Periphrasis is a widespread and significant phenomenon, and a valuable indicator of how a language functions. It reveals how the construction of meaning in language is apportioned between morphology ('bright' and 'brighter') and syntax ('intelligent' and 'more intelligent').
The fact that inflectional paradigms may have anomalous gaps in them has been known since at least the days of the classical grammarians. The term 'defectiveness' refers to gaps in inflectional paradigms, specifically those which do not appear to follow from natural restrictions imposed by meaning or function.
Turning owners into actors: Possessive morphology as subject-indexing in languages of the Bougainville region
A fundamental communicative task for all languages is to show which participant in a sentence is the subject. Languages have various ways of achieving this, including word-order, agreement, and case-marking. In some North-West Solomonic languages, subject is indicated using word-forms normally indicating possessors of nouns.
Languages change by gaining and losing word forms over time, but an equally significant role in their history is played by subtle shifts in the function of existing forms. While the system of forms in Russian has changed relatively little over a long period, the use of these forms has undergone a remarkable degree of change over the last 200 years.
The Archi language is characterised by a remarkable morphological system, with extremely large paradigms, and irregularities on all levels. The online dictionary of Archi contains sound files for every word form of the lexeme, digital pictures of culturally significant objects, idioms and example sentences with interlinear glossing.
In attempting to understand language, a central notion is features. Examples of features (and their possible values) include Person (1st, 2nd, 3rd), Number (singular, plural, dual...) and Tense (present, past...). Features have proven invaluable for analysis and description, and have a major role in contemporary linguistics, across the discipline.
Northwest Solomonic is a linkage of languages spoken on Bougainville, Papua New Guinea and on the islands of Santa Isabel and Choiseul and in the New Georgia group of islands, all belonging to the Solomon Islands. It comprises several highly endangered languages in need of language documentation, description and analysis.
Deponency arises when there is a mismatch between the apparent morphosyntactic value of a morphological form and its actual value in a given syntactic context. In the context of typology and morphological theory, an informed account of deponency must reveal which features may be affected, and what the characteristics of the resulting paradigm can be.
Speakers 'know' what a word is, yet linguists have said little about possible words. Words often have different forms, and these are normally related in predictable ways. However, there are also cases where the relations involve more challenging properties such as suppletion, syncretism, defectiveness, deponency and displaced grammatical information.
A paradigm is the complete set of related word forms associated with a given lexeme. Sometimes, the word forms in a paradigm are syncretic and result in grammatical ambiguity, where one form can have multiple functions. Investigation into the relationship between frequency of use and syncretism can shed light on the factors that constrain paradigms.
While linguists have investigated the notion 'possible sentence', less has been done to establish the notion 'possible word'. Suppletion, where different inflectional forms of a word are not related phonologically, is common, involves extremely frequent words and provides a ripe testing ground for examining the bounds of possibility for the word.
Agreement is the 'displaced' expression of grammatical information. Along with government, it is one of a pair of morphosyntactic phenomena that involve the morphological expression of a syntactic relation through the displacement of inflectional information associated with an agreement controller on an available target.
Syncretism is a surprising yet widespread and poorly understood phenomenon in natural language. A form is said to be an instance of syncretism if it fulfils two or more different functions within a paradigm. Syncretism is found even in English, whose inflectional morphology is simple in comparison with many languages.
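The definition above can be illustrated with a minimal sketch, assuming a paradigm is represented as a mapping from paradigm cells to word forms. The function name `syncretic_forms` and the cell labels are hypothetical and chosen only for this example; the English verb 'walk' is used because its past tense and past participle share one form.

```python
from collections import defaultdict

def syncretic_forms(paradigm):
    """Return every form that fills two or more cells of the paradigm."""
    cells_by_form = defaultdict(list)
    for cell, form in paradigm.items():
        cells_by_form[form].append(cell)
    # Keep only forms shared by more than one cell (the syncretic ones).
    return {form: cells for form, cells in cells_by_form.items() if len(cells) > 1}

# Illustrative paradigm; the cell labels are assumptions, not a standard tag set.
WALK = {
    "base": "walk",
    "3sg_present": "walks",
    "past": "walked",
    "past_participle": "walked",
    "present_participle": "walking",
}

print(syncretic_forms(WALK))
```

This only detects identical forms within one lexeme; it says nothing about why the identity exists or how it relates to frequency of use.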
The notion of default inheritance can be used to relate different languages, as well as different stages of a single language’s development. Using a computational tool for modelling lexical knowledge, changes in the meaning of colour terms in Slavonic languages can be plotted through time to demonstrate the viability of default inheritance for historical linguistics.
The relationship between the general availability of a grammatical category across languages (such as number), and the way it is used by speakers of a single language (such as Russian), can be investigated to reveal the extent to which a hierarchy modelling cross-linguistic tendencies accurately reflects the way a grammatical feature is used.
Innovations can spread and eventually pervade a language, they can fail to take hold, or they can remain, without ever affecting a large number of lexical items. These exceptional cases are interesting because they provide us with insights into why a linguistic system does not favour such innovations.
Different languages such as Slave (an Athabaskan language of Canada), Pirahã (a Mura language of Brazil) and Breton (a Celtic language of France) present different types of challenges for the description of an inflectional system. Diverse data provide the best opportunity to examine what types of analyses of morphology are required.
An adequate morphological theory must be able to account for morphology at opposite ends of the spectrum of possibilities: fusional morphology (where disparate information is packed into small segments) is common in Slavonic, while polysynthetic languages, such as the Eskimo language Yup'ik, can build up long, complex but segmentable word-forms.
Russian verbal morphology: Alternative perspectives and implementations and their theoretical justification
After a long period of relative neglect, morphology emerged in the 1980s as a key area of interest characterized by a good interaction between different schools and approaches. The frontiers of research at different British institutions are in part complementary and in part overlapping, facilitating a productive forum for collaboration.
Not all morphological processes are equally productive. The English suffix -ness productively derives nouns from adjectives, as in good > goodness, whereas the suffix -th is limited to warmth and a few others. In a computer implementable theory of morphology this difference can be captured using the notion of defaults.
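As a rough illustration of how defaults can capture this difference, the sketch below treats -ness as the default rule and lists -th forms as lexical exceptions that override it. This is written in Python rather than in any particular morphological formalism, and the names `EXCEPTIONS` and `nominalise` are hypothetical; spelling adjustments are deliberately ignored.

```python
# Only the unproductive cases are stored; everything else falls to the default.
EXCEPTIONS = {
    "warm": "warmth",
}

def nominalise(adjective: str) -> str:
    """Derive a noun from an adjective: listed exception first, else default -ness."""
    if adjective in EXCEPTIONS:
        return EXCEPTIONS[adjective]
    # Default rule: applies to any adjective without a stored exception.
    return adjective + "ness"

print(nominalise("good"))
print(nominalise("warm"))
```

The design point is that the productive pattern needs no per-word listing, while each unproductive form has to be stated explicitly.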
The treatment of declension classes as nodes in an inheritance hierarchy contrasts strongly with the traditional notion of paradigms as discrete entities which do not share information. Using default inheritance hierarchies in DATR to model word structure we see evidence for a great deal of information sharing between classes.
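The idea of classes sharing information through inheritance can be sketched as follows. This is a Python approximation, not DATR, and the node names (`NOUN`, `N_O`, `N_I`, `N_IV`) and the choice of a few simplified, transliterated Russian endings are assumptions made for illustration; the summary above does not specify the data modelled. Each node states only what is distinctive about it and inherits everything else from its parent.

```python
from typing import Optional

# Each node lists only its own endings; anything missing is inherited.
HIERARCHY = {
    "NOUN": {"parent": None, "endings": {"dat_pl": "am", "ins_pl": "ami"}},
    "N_O": {"parent": "NOUN", "endings": {"gen_sg": "a", "dat_sg": "u", "ins_sg": "om"}},
    "N_I": {"parent": "N_O", "endings": {"nom_sg": ""}},
    "N_IV": {"parent": "N_O", "endings": {"nom_sg": "o"}},
}

def ending(node: Optional[str], cell: str) -> str:
    """Look up an ending, climbing to parent nodes until a value is found."""
    while node is not None:
        entry = HIERARCHY[node]
        if cell in entry["endings"]:
            return entry["endings"][cell]
        node = entry["parent"]
    raise KeyError(f"no ending defined for {cell}")

def inflect(stem: str, declension_class: str, cell: str) -> str:
    """Attach the (possibly inherited) ending for a cell to a stem."""
    return stem + ending(declension_class, cell)

print(inflect("zakon", "N_I", "gen_sg"))   # ending shared via N_O
print(inflect("vin", "N_IV", "nom_sg"))    # ending stated on N_IV itself
print(inflect("vin", "N_IV", "ins_pl"))    # ending inherited from NOUN
```

Here the two classes differ only in the nominative singular; every other ending is stated once, higher in the hierarchy, rather than repeated in each paradigm.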