From competing theories to fieldwork: The challenge of an extreme agreement system

The wiki for the project 'From competing theories to fieldwork' was constructed to allow contributors to the project to share their preliminary analyses with any interested party. It contained handouts from all the seminars for the project as well as a range of additional materials on Archi. The wiki is no longer being updated but all the resources are available on the project wiki pages.

Project description

Agreement is a key part of the syntax (sentence structure) of many languages. It is the mechanism whereby information about one word is expressed on another (e.g. the system works, where -s on the verb signals that the noun is singular). Even the straightforward examples are truly puzzling, because agreement involves both redundancy and arbitrariness (Acuña-Fariña 2009: 390). The indirect relationship between semantics and sentence structure expressed by agreement is a significant source of syntactic complexity, which can be exacerbated by great diversity of expression in form (morphology). Agreement is peculiar to human languages, absent in other forms of communication. This suggests that agreement is a core linguistic phenomenon, ideally suited for testing and comparing leading theories of syntax and morphology. Such a comparison would be a valuable contribution, since linguistics has reached a point where we have competing theories but disappointingly little interaction between them.

Particular challenges posed by agreement systems have been addressed by contemporary syntactic theories. These have naturally concentrated on isolated instances of complexity across a number of languages. A typology of agreement has also been developed (Corbett 2006). Two further things are required in order to move forward in understanding syntax and in developing the theoretical debate. First, we need a language which exhibits the full spectrum of agreement mechanisms, including a variety of means for realizing feature structures and a high degree of syntactic complexity. This will enable us to test the effect each theoretical distinction has in predicting related properties. The Nakh-Daghestanian language Archi (Figure 1) has the extreme agreement system required for just such an investigation.

The second thing required is a framework for comparing and evaluating leading theoretical approaches. This project meets the challenge head on by bringing together leading centres for syntax, morphology and Nakh-Daghestanian languages (Essex, Harvard, Surrey). We will create an implemented ontology – a formal consistent description in a format accessible to practitioners in different frameworks – to compare the three major theories (HPSG, LFG and Minimalism). There will be systematic and sustained interaction between, and comparison of, competing approaches informed by a rich body of empirical data from this understudied language family, giving the first complete account of an extreme agreement system.


1. Research questions or problems

1.1 Research questions

We aim to understand how the separate elements of an extreme agreement system interact. We need to see how neat partial solutions to problems (given in §1.2) scale up to a complete analysis of a language. Our research questions are formulated so as to determine the compatibility of the separate theoretical solutions and the minimal requirements for analyzing the complete spectrum of agreement:  

RQ1 – theoretical: How can current theories of syntax be extended to model extreme agreement systems? 

In order to answer this question, we must address the domain problem. Languages of the Archi type are particularly challenging, first because there is a large number of domains, and second because these domains cannot be readily defined in current theoretical terms. This is demonstrated by example (1), from Kibrik (1994):

‘Mother made bread early for me.'

Archi has thoroughgoing ergative syntax. The subject buwamu ‘mother’ is in the ergative case and has no effect on agreement. It is the absolutive argument χːʷalli ‘bread’ that controls infix agreement on the verb abu ‘make’. Absolutive argument and adverb present a further agreement domain: the adverb ditːabu ‘early’ agrees with χːʷalli ‘bread’. Finally, the dative pronoun also agrees with the absolutive, taking the singular gender III form bez ‘for me’. The syntactic domain for this agreement (marked by the leftmost arrow) is difficult to define, and poses a particularly challenging instance of the domain problem outlined in §1.2.1.
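The pattern just described can be made concrete with a small sketch. The Python below is illustrative only: the class, the function names and the feature labels are assumptions introduced here, not part of the project's ontology or of any of the three theories. It encodes only what the text states for example (1): the absolutive argument, not the ergative subject, supplies gender and number to every agreeing target in the clause.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Argument:
    gloss: str                    # English gloss, used only as a label
    case: str                     # "ergative", "absolutive", "dative", ...
    gender: Optional[str] = None  # e.g. "III"; None if not given in the text
    number: Optional[str] = None  # "sg" or "pl"; None if not given in the text


def agreement_controller(arguments):
    """Return the single absolutive argument of a clause.

    Assumption: exactly one absolutive per clause. Bi-absolutive
    constructions (see section 1.2.4) are deliberately not handled.
    """
    absolutives = [a for a in arguments if a.case == "absolutive"]
    if len(absolutives) != 1:
        raise ValueError("expected exactly one absolutive argument")
    return absolutives[0]


def agreement_features(arguments, targets):
    """Map every agreeing target to the controller's (gender, number)."""
    controller = agreement_controller(arguments)
    return {t: (controller.gender, controller.number) for t in targets}


# Example (1): the ergative subject is ignored; 'bread' controls agreement.
clause = [
    Argument("mother", "ergative"),
    Argument("bread", "absolutive", gender="III", number="sg"),
]
targets = ["verb 'make'", "adverb 'early'", "dative pronoun 'for me'"]
features = agreement_features(clause, targets)
```

The sketch treats the whole clause as one flat domain. That is the simplification the domain problem calls into question, since the link between the dative pronoun and the absolutive is hard to state in current theoretical terms.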

The second problem for syntactic theory is also illustrated by this example. Here the adverb shows agreement, but not all Archi adverbs do so. Similarly, not all verbs show agreement, while only a few pronouns do. This is an instance of the lexical problem, discussed in §1.2.2, of defining word classes relevant for agreement.

RQ2 – theoretical: How do the elements of the Archi agreement system interact, given its unusual syntax, over-large paradigms and substantial asymmetry in the lexicon?

This question presents a number of related general problems for the theory of agreement: the syntax-morphology interface problem (see §1.2.3), the conditions on agreement problem (see §1.2.4), and the syntax-semantics interface problem (see §1.2.5). A detailed answer to this question for Archi will therefore present major progress.

RQ3 – methodological: How do the formal theories cope with intra-linguistic variation?

There are various constructions in which more than one agreement form is acceptable (see §1.2.6). This is an additional challenge to formal theories; we need analyses which allow for this optionality in a convincing way.

RQ4 – methodological: How can we obtain and present the essential data in a way which will engage three different research communities? 

We have an innovative method (see §3), which we are confident will ensure constructive interaction between the theoretical perspectives and progress towards understanding very challenging data. RQ4 requires that we address the following: (i) the requirements for undertaking fieldwork with parallel theoretical assumptions in mind; (ii) evaluation of substantive differences between the competing theories; (iii) how to reconcile the demands of formal theories with the sensitivity required for working with an endangered language. We will address the first two of these using an ontology (see §3.1), and for the third we will build on the goodwill created during our previous fieldwork.

1.2 Research problems

1.2.1 The domain problem: In contemporary theories agreement is defined in terms of syntactic domains. Archi adverbs, pronouns and particles therefore present challenges for theory, because they lack a clear syntactic link to a controller. This involves two levels of complexity: adverbs and particles have been accounted for as having syntactic and semantic scope over the whole clause, whereas there are other problems, such as the one with dative pronouns in §1.1, which have never been addressed by syntactic theories.

1.2.2 The lexical problem: It is sometimes assumed that members of a word class behave identically. However, agreement in Archi never covers a whole word class: there is no word class in which every member has a morphological slot for agreement. Most adjectives show agreement, as do about half the verbs (judging by the 1328 verbs in our database), and a handful of adverbs and particles. This presents a challenge for defining word classes.

1.2.3 The syntax-morphology interface problem: Even if targets behave similarly (agreeing or not agreeing), there are still complexities in the syntax-morphology interaction. Thus targets may have more than one agreement slot. In Archi, many targets mark agreement in two places, some in three, and a few in four. The agreement can be with the same controller, as in example (2):

‘... and left the household (to someone)’

Alternatively, agreement can be with two different controllers, as happens with participles: 

‘This girl is cunning.’ (Literally ‘this female child is with cunning’)

In (3) the participle bitːur ‘being (with)’ agrees with two arguments: the prefix b- agrees with sːiħru ‘cunning’, the suffix -r with lo ‘girl’. Given our lexical database, we can calculate the percentage of the vocabulary that shows agreement, and our corpus of texts will help us to see the relative frequency of ‘heavily agreeing’ items and the average ‘agreement weight’ within the clause.
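The two calculations mentioned here can be sketched as follows. This is a minimal illustration under stated assumptions: the toy lexicon is invented placeholder data (not drawn from the Archi database), the record format is hypothetical, and ‘agreement weight’ is operationalized, as one possibility only, as the mean number of agreement slots per word in a clause.

```python
from collections import defaultdict

# Hypothetical toy lexicon: (word class, number of agreement slots).
# Placeholder values only; these are NOT figures from the Archi database.
LEXICON = [
    ("verb", 1), ("verb", 0), ("verb", 2), ("verb", 0),
    ("adjective", 1), ("adjective", 1), ("adjective", 0),
    ("adverb", 0), ("adverb", 1),
]


def agreeing_share(lexicon):
    """Percentage of items per word class with at least one agreement slot."""
    totals = defaultdict(int)
    agreeing = defaultdict(int)
    for word_class, slots in lexicon:
        totals[word_class] += 1
        if slots > 0:
            agreeing[word_class] += 1
    return {wc: 100.0 * agreeing[wc] / totals[wc] for wc in totals}


def agreement_weight(clause_slots):
    """Mean number of agreement slots per word in one clause.

    Assumed operationalization; the source does not define the measure.
    """
    if not clause_slots:
        return 0.0
    return sum(clause_slots) / len(clause_slots)


shares = agreeing_share(LEXICON)
weight = agreement_weight([1, 0, 2, 1])  # slot counts for a four-word clause
```

Counting slots rather than merely flagging items as agreeing keeps targets with two, three or four agreement positions distinct from those with one.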

1.2.4 The conditions on agreement problem: In Archi some verb forms license a bi-absolutive construction:

‘Butta is sorting grain.’

The auxiliary wi agrees in gender and number with the agent (Butta, a man’s name), whereas the main verb berk’urši ‘sort’ agrees in gender and number with buq’ ‘grain’. This is a morphosyntactic condition; the aspectual characteristics license the unusual agreement.

1.2.5 The syntax-semantics interface problem: In addition to the problems above, Archi also has the more familiar semantic agreement. In (5) the verb is plural, though the controller is singular: 

‘One is running after another.’

1.2.6 The variation problem: There are two types of variation: ‘pervasive’, where all speakers manifest variation in agreement; and ‘circumscribed’ variation, restricted to a particular group. Circumscribed variation is found with new borrowings, where some speakers show agreement alternation between genders III and IV, while others do not. To the best of our knowledge, no attempts have been made to study this variation in an unwritten language with a small number of speakers. Archi gives us the unique opportunity to provide the material for theoretical reflection, as there is a substantial corpus of texts of different genres from 40 years ago (provided by Kibrik et al. 1977c) as well as contemporary data collected by Chumakina. 


2. Research context  

An adequate description of agreement requires a clear understanding and consistent presentation of two components: the syntax and the morphology. The former describes the strategy of agreement: controllers, targets, domains and conditions; the latter describes the means: morphological realization of the feature specifications by affixes and stem changes. We will provide a consistent description in the form of an ontology (see §3.1), as an essential setting for three leading proponents of different theories (HPSG, LFG and Minimalism) to meet and discuss the complex issues presented by Archi. For morphology, we will work within the inferential-realizational approach, to which we have already made a substantial contribution. This will allow us to compare competing syntactic theories in a consistent manner.

Substantial problems such as the domain problem, the lexical problem and conditions on agreement (see §1.2) are only partly understood (see Wechsler and Zlatić 2003, and Corbett 2006 for an overall typology and references). The project is feasible, because the essential preparatory work has been done. Clear definitions were worked out for the SMG database of agreement, which contains detailed descriptions for 15 languages and therefore provides an informed view of the typological range (Brown, Corbett and Tiberius 2003). We believe that this is the most detailed structured source of data on agreement. There is a good descriptive grammar of Archi (Kibrik et al. 1977a, b), and we have compiled a lexical database of 4,644 items, which underlies a substantial dictionary (Chumakina, Brown, Corbett and Quilliam 2007). This database has information on the morphological means of agreement for all word classes, and allows searches by different parameters. We have recorded, transcribed and translated a set of texts, including spontaneous discourse, which will serve as the initial corpus that the theoretical analyses must account for.

This project will make a substantial contribution by scaling up theoretical approaches to syntax, presenting them with a language of a type not dealt with before. The results will be of interest to linguists working in various theories of syntax, as they will provide them with an apparatus for dealing with challenging data. For fieldworkers, the project will provide parallel formal means for describing syntactic phenomena in less well studied languages.


3. Research methods  

Evaluating key theories with regard to complex agreement in Archi requires that we determine: (i) how well they extend to the extreme phenomena; (ii) the effect of theoretical adaptations on coverage or predictions about familiar phenomena.

3.1 Linguistic ontology

We shall develop an extension (known as a COPE) for GOLD (General Ontology for Linguistic Description). GOLD has been developed over a number of years with NSF funding, and with Brown and Corbett acting as consultants on morphosyntax. GOLD’s purpose is to make linguistic description explicit and consistent. For instance, a linguist can consult GOLD for information about (grammatical) number, listed as a concept under Number Feature. Among other things there is a list of links to definitions of potential values. The COPE we propose is new in that it will employ Canonical Typology (Corbett 2007; Brown, Chumakina and Corbett, forthcoming) to define agreement phenomena in terms of greater or lesser proximity to a canonical ideal.
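The idea of ‘greater or lesser proximity to a canonical ideal’ can be illustrated with a deliberately simple sketch. Everything in it is an assumption made for illustration: the criterion names and construction labels are placeholders, and Canonical Typology does not itself prescribe a numeric score. The code only shows how instances might be ordered by the number of criteria they satisfy.

```python
# Placeholder criteria; hypothetical names, not the criteria of the COPE.
CRITERIA = ("criterion_a", "criterion_b", "criterion_c", "criterion_d")


def canonicity(instance):
    """Fraction of criteria an instance satisfies (1.0 = canonical ideal).

    Assumption: all criteria are weighted equally; a missing criterion
    counts as not satisfied.
    """
    unknown = set(instance) - set(CRITERIA)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    met = sum(1 for c in CRITERIA if instance.get(c, False))
    return met / len(CRITERIA)


def rank_by_canonicity(instances):
    """Return instance names ordered from most to least canonical."""
    return sorted(instances, key=lambda n: canonicity(instances[n]), reverse=True)


# Hypothetical agreement constructions described against the criteria.
instances = {
    "construction_x": {c: True for c in CRITERIA},
    "construction_y": {"criterion_a": True, "criterion_b": True},
    "construction_z": {"criterion_a": True},
}
ranking = rank_by_canonicity(instances)
```

An equal-weight count is only one possible choice; a faithful implementation would more likely record which criteria an instance departs from than reduce the comparison to a single number.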

3.2 Syntactic theories

A novel part of the research method is bringing together the ontology and experts on leading syntactic theories chosen as follows:

HPSG: Robert Borsley (Essex)

LFG: Louisa Sadler (Essex)

Minimalism: Maria Polinsky (Harvard)

All three theories have paid considerable attention to agreement, favouring different approaches. Within HPSG a number of approaches have been explored employing constraints on various levels of representation. For LFG, agreement is taken to be primarily a reflection of constraints on the level of surface grammatical functions (f-structure), rather than on the level of surface constituent structure. For Minimalism the standard view is that agreement is intimately associated with the central mechanisms of licensing and applies to structures which may be modified by movement processes. However, some proposals within Minimalism have treated agreement as a PF phenomenon.

Within the project team of experts there is considerable experience of working within a second theory; this makes the team ideally suited to the comparison and consensus-building for which the project is designed.

3.3 Existing datasets and data collection 

The existing descriptive grammar of Archi (Kibrik et al. 1977a, b) and the lexical database on which Chumakina, Brown, Corbett and Quilliam (2007) is built represent a solid basis for the project. The existing data will be supplemented by targeted fieldwork on agreement, guided by the predictions made by the three theories.

The planned iterations through the three theoretical strands will produce questions for fieldwork interleaved in a seminar cycle (specified in §4). The postdoctoral researcher Chumakina has training and years of experience in the application of a variety of fieldwork techniques. She will use standard elicitation techniques and questionnaires, and will gather representative texts in different genres.

4. Project Management

The project will be structured around a cycle of working seminars, open to PhD students. The contributors will be the typology/fieldwork team, giving a clear account of the Archi material consistent with the ontology, and the three syntax experts, who will be invited to give an analysis of the Archi constructions within their theory.  

The structure of a standard seminar will be as follows:

Part 1:
 HPSG account: Robert Borsley
 LFG account: Louisa Sadler
 Minimalism account: Maria Polinsky

Part 2:
 Comparative discussion, specification of fieldwork problems

Part 3:
 Presentation of next topic; its place in canonical typology and ontology

We plan six standard seminars, two per year, plus an introductory one, and a dissemination conference to conclude the project. Each standard seminar lasts a full day and consists of three analyses of a specific problem, comparative discussion, and careful exposition of the next topic. (The introductory seminar naturally requires only part 3.) The topics are logically ordered, progressing according to the requirements of the syntax experts:

Topic 1. The domain problem (§1.2.1)
Topic 2. The lexical problem (§1.2.2)
Topic 3. The syntax-morphology interface problem (§1.2.3)
Topic 4. The conditions on agreement problem (§1.2.4)
Topic 5. The syntax-semantics interface problem (§1.2.5)
Topic 6. The variation problem (§1.2.6)

To ensure that theoretical parsimony is not undermined by lax use of features, we shall maintain an independent control on the feature inventory, applying Canonical Typology. This will guarantee a sound basis for comparison of the three theories.

Project members

Prof Greville G. Corbett
Prof Dunstan Brown (University of York)
Prof Bob Borsley (University of Essex)
Prof Maria Polinsky (University of Maryland)
Prof Louisa Sadler (University of Essex)
Dr Marina Chumakina
Dr Oliver Bond

Period of award:

January 2012 - June 2015


Arts and Humanities Research Council (AHRC)