Grammatical Features Inventory
Typology of grammatical features
Inflected words show variation in form. The different forms are correlated with meanings or functions which we label as 'features'. However, not all features that are identified through inflectional morphology are morphosyntactic. The most basic definition of a morphosyntactic feature is a feature which is relevant to syntax. For a feature to be 'relevant to syntax' means that it is involved in either syntactic agreement or government. Gender, number, and person are involved in agreement in a large number of languages; they are therefore typical morphosyntactic features. However, while in many familiar languages the feature 'tense' encodes regular semantic distinctions, it is not required by the syntax through the mechanisms of either agreement or government: syntax is not sensitive to the tense value of the verb. Therefore, many familiar instances of the feature 'tense' are morphosemantic, but not morphosyntactic.
In discussions of features, labels such as 'gender', 'person', or 'tense' are often used to refer both to the value of the feature and to the feature as such. For example, the term 'gender' is used both for the particular classes of nouns (so, a language may have two or more genders) and for the whole grammatical category (so, a language may or may not have the category of gender). Similarly, we refer to an 'inventory of features' (meaning categories, or features as such), while at the same time we talk about 'feature checking', or 'unification of features' in syntax (meaning checking or unifying feature specifications, i.e. feature values). However, it is important to maintain the distinction between 'features' and their 'values' while attempting to construct any taxonomy or typology of features, because the characteristics or behaviour of the feature as such will not be the same as the characteristics of a feature value.
The relationship between the concept of 'gender' and the concepts 'masculine', 'feminine', 'neuter'; or between the concept 'case' and the concepts 'nominative', 'accusative', 'genitive', etc., has been referred to with the following pairs of terms (based on Carstairs-McCarthy 1999:266-267, expanded):
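| Source | Term for the feature (e.g. 'gender', 'case') | Term for the value (e.g. 'masculine', 'nominative') |
|---|---|---|
| Matthews (1972:162; 1991:38-40) | category | property, feature |
| Wurzel (1984:61) | Kategoriengefüge [complex of categories] | Kategorien [category] |
| Bybee (1985) | category | (inflectional) meaning |
| Stump (2005:50) | inflectional category | morphosyntactic property |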
Following Zwicky (1985), we use the terms 'feature' and 'value'. Although the concepts 'masculine', 'feminine', 'neuter'; or the concepts 'nominative', 'accusative', 'genitive', etc., are all 'values', further questions can be asked about the relationship between them. One question concerns the partitioning of the feature space in general between the available values (see, for example, attempts to arrive at definitions of gender and number values for an ontology of linguistic description); another concerns the structuring within the values available for a particular feature in a particular language (see, for example, the structuring of gender values discussed in Corbett 1991, or the structuring of number values discussed in Corbett 2000).
Establishing an inventory of features and values for a language can be a complex issue. An example of a language in which justifying a feature requires careful analysis is Archi (Daghestanian), and the feature in question is person. This language has no phonologically distinct forms realising person and the standard description of Archi does not involve the feature person (only gender and number). However, the agreement patterns in Archi indicate that this language does require us to recognise a person feature, even though it is a non-autonomous one (Chumakina, Kibort & Corbett 2007).
There are also many examples in the literature of argumentation regarding the number of values for various features in different languages. In an inferential-realisational approach to inflectional morphology, which we adopt after Stump (2001), we identify the realisations of feature values by establishing a paradigm correlating inflected forms with morphosyntactic properties. The cells in a lexeme's paradigm are regarded as pairings of a stem with a morphosyntactic property (or a morphosyntactic property set), to yield an inflected word form which is the realisation of the pairing. Examples of how to establish the paradigm for case in Russian can be found in Zaliznjak (1973) and Comrie (1986). Following from the work of Schenker (1955) and Zaliznjak (1964), Corbett (1991: Chapter 6) describes principles for determining the number of gender values in a language, giving sample analyses of gender in Romanian, Telugu, Lak, Tamil, Chibemba, Slovene, Tsova-Tush, and other languages.
Both agreement and government are concepts that are necessary to describe inflectional morphology. Both involve specifying, or determining, a feature value on an element in the clause. In the case of agreement we call this element 'target', and in the case of government we call it 'governee'. In both agreement and government the demand for the specific feature value comes from elsewhere (i.e. not from the target or the governee): it comes from a 'controller' (in the case of agreement), or from a 'governor' (in the case of government). In this way, agreement and government 'share the characteristic of being syntactic relations of an asymmetric type' (Corbett 2006:8). However, while a controller of agreement bears the feature value it requires of its target (this has been referred to as 'feature matching'), a governor does not bear the feature value it requires of its governee (we can refer to this as 'feature branding'). Despite this general principle, note that agreement mismatches may occur for various reasons (see Corbett 2006: Chapter 5), and a governor may have the relevant feature specification coincidentally.
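The contrast between 'feature matching' and 'feature branding' can be illustrated with a minimal Python sketch. It is only an illustration under our own assumptions: the class `Element` and the functions `agree` and `govern` are hypothetical names introduced here, and the example values (a feminine noun, an accusative case value) are placeholders rather than data from any particular language.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A clause element carrying feature specifications (feature -> value)."""
    name: str
    features: dict = field(default_factory=dict)


def agree(controller: Element, target: Element, feature: str) -> None:
    """Feature matching: the controller bears the value it requires of its target."""
    if feature not in controller.features:
        raise ValueError(f"{controller.name} does not bear {feature!r}")
    target.features[feature] = controller.features[feature]


def govern(governor: Element, governee: Element, feature: str, value: str) -> None:
    """Feature branding: the governor imposes a value it need not bear itself."""
    governee.features[feature] = value


noun = Element("noun", {"gender": "feminine"})
adjective = Element("adjective")
verb = Element("verb")

agree(noun, adjective, "gender")          # the adjective now matches the noun
govern(verb, noun, "case", "accusative")  # the verb brands the noun with a case value

print(adjective.features)       # {'gender': 'feminine'}
print(noun.features)            # {'gender': 'feminine', 'case': 'accusative'}
print("case" in verb.features)  # False
```

In both functions the value is determined from outside the element that receives it, which reflects the asymmetry shared by the two relations. Only in `agree` does the source element itself carry the value. The sketch deliberately ignores agreement mismatches and governors that coincidentally bear the relevant specification.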
Both agreement and government can apply to more than one element in the clause simultaneously, resulting in multiple occurrence of the same feature specification in the domain. In agreement, we find that an element may control a set of targets in the clause (and beyond). In government, we find that an element typically governs a unit consisting of one or more elements. The most familiar example of government of a feature over a unit is the assignment of case to (the elements within) a noun phrase. When a noun and its adjectival modifier are in the same case, it is because the case value is imposed on both simultaneously. Corbett (2006:133-135) discusses the possibility of viewing this type of feature multirepresentation as agreement and concludes that it 'will not count as canonical agreement, if we take seriously the issue of asymmetry'. If one accepts the view of syntax which is based on the notion of constituency, when the noun and its modifier are a constituent 'it follows that we have matching of features within the noun phrase resulting from government (rather than agreement in case)'. Note that the same analysis holds for languages which allow more than one case to be stacked. This is characteristic of many Australian languages including Kayardild (Evans 1995) whose multiple case marking has been offered for consideration as agreement (Evans 2003). Following a heuristic developed for the Features Inventory, some types of multiply represented cases in Kayardild have now been re-classified as instances of government of case over a noun phrase (specifically, this analysis applies to the multiple case marking of some relational, adnominal, and verbalising cases in Kayardild, as well as to the multiple marking of associating case, which is a type of case governed by the nominalised verb over the nominal phrase it heads; for details see Evans 1995 and 2003, and for discussion and reanalysis see Kibort 2010).
Apart from agreement and government, we find one other mechanism behind simultaneous inflectional marking of the same information on more than one element in the domain: semantic choice. We find that a feature value may be selected on the basis of semantics and realised on several elements which are members of a constituent or an 'informational unit', e.g. a noun phrase, verb phrase, verbal complex, or the clause. In this situation, multiple elements will be expressing the same value of a morphosemantic feature simultaneously. The 'rule' that determines which elements have to bear particular inflections is found in the lexicon in the form of a generalisation over relevant parts of speech or a subclass within a part of speech. The clearest examples of a feature value which is semantically imposed on several elements simultaneously are: number, definiteness, or semantic case imposed on a noun phrase (see also Givón 2001:427), or a verbal feature imposed on (the elements making up) a verbal complex.
It is also possible that simultaneous marking of the same information on more than one element in the clause could be due to a semantic or pragmatic choice made at each element individually for the same semantic or pragmatic reason ('what's once true stays true'). In other words, the multirepresentation of a feature value in the clause could be due to coinciding individual semantics of the elements bearing the feature value. A clear example of this type of multirepresentation of a feature value would be a feature of respect whose marking could be justified semantically on every element on which it appears. Corbett (2006:137) remarks: 'There are ... languages where the existence of multiple honorifics suggests an agreement analysis, but where it is not clear that this is justified. It may be argued that each honorific is determined on pragmatic grounds (and that they agree only in the sense that they are being used in the same pragmatic circumstances).'
Finally, some instances of semantically justified multiple marking of information in the domain may, arguably, not even be an expression of a morphosemantic feature. This applies to phenomena such as the so-called 'negative concord', where the principal marker of information (negation) is there or not, and when it is there, it requires the presence of the second negation marker. Arguably, the phenomenon does not qualify as a feature because the 'positive' polarity is not information that can be assigned to a value; it is, rather, simply a lack of information. Corbett (2006:29) suggests that where the selection of additional information requires simply that it has to be repeated somewhere else in the clause, such instances can be termed 'concord'. (In order to determine whether such phenomena are indeed not features, or whether they are perhaps less canonical features, we would have to analyse them within a canonical framework. This work is in preparation.)
The distinction between multirepresentation of inflectional information which is due to agreement or government, and multirepresentation of inflectional information which is due to semantic choice, makes it possible to classify the remaining problematic phenomena in Kayardild (Evans 1995; 2003) as instances of multirepresentation of morphosemantic features (of semantic case, and tense-aspect-mood-polarity), but not as agreement phenomena. Hence, Kayardild modal case is a component of the tense-aspect-mood-polarity (TAMP) marking, with the particular TAMP value selected for the clause for semantic reasons, and Kayardild complementising case is a type of semantic case assigned to a noun phrase for semantic reasons. Multirepresentation of case values in Kayardild is due to the generalisation in the lexicon that nominal elements have to bear all cases they are assigned (in general, 'case suffixes appear on all words over which they have semantic or syntactic scope', Evans 1995:103). Furthermore, Kayardild multiple TAMP inflection on elements bearing verbalising case is due to a particular value of TAMP having been selected for the clause following a semantic choice, and to the requirement that TAMP be marked on all elements of the verbal complex. Similarly, multiple TAMP inflection on elements in a verbal group/complex is due to a particular value of TAMP having been selected for the clause following a semantic choice, and to the requirement that TAMP be marked on all elements of the verbal group (verbal complex). These conclusions are consistent with the widely held view that tense, aspect, mood, and polarity are features of the clause, that is, they are selected for and interpreted at the level of the clause rather than at the level of the lexical items they are realised on. Therefore, tense, aspect, mood, and polarity have not been included in the present Inventory as morphosyntactic features, but as morphosemantic features (for more discussion see Kibort 2010).
Having established the inventory of values to draw from, and identified the feature value on the element we are analysing, we can compare the sources of feature specifications found on different elements. The feature value may arise from within the element itself, in which case it is inherent, or it may be determined by some other element, in which case it is contextual. The inherent versus contextual distinction was first proposed in the description of inflection (Zwicky 1986; Anderson 1982; Booij 1994, 1996). Roughly, contextual inflection is 'dictated by syntax', while inherent inflection is 'not required by the syntactic context, although it may have syntactic relevance' (Booij 1996:2). This classification is not absolute but relative to particular word classes. Corbett suggests that the inherent versus contextual distinction can be applied to features in general, specifically, that it 'concerns the feature in relation to where it is realized' (2006:123). Thus, a contextual feature can be defined as 'dictated by syntax', while an inherent feature can be defined as 'not required by the syntactic context (for the particular item), although it may have syntactic relevance'.
Inherent feature specification can be thought of as expressing information that logically belongs to, or arises from within the element on which it is found, while contextual feature specification can be thought of as expressing information that logically originates outside the element on which it is found (in agreement, we call this information 'displaced', and in government, we can refer to this information as a 'brand mark'). Thus, features found on controllers of agreement are inherent features, while features found on agreement targets and on governees are contextual features.
Examples of contextual features of agreement:
- gender - on adjectives, verbs, etc.
- (nominal) number - on adjectives, verbs, etc.
- person - on verbs, and on nouns in possessive constructions
- case - on adjectives in predicate nominal constructions
- respect - (honorifics/politeness markers/special agreement) on verbs
- definiteness - (non-autonomous inflection, but possibly agreement effects) on adjectives
Examples of contextual features of government:
- case - 'structural case' on nouns or noun phrases
The term 'assignment' with respect to feature values is commonly used with reference to verbs which 'assign case' values. It is also used with reference to gender values, as in Corbett (1991), who discusses mechanisms for allotting nouns to different genders; namely, in languages which do not mark gender on nouns, native speakers have the ability to 'work out' the gender of a noun, and models of this ability have been called 'gender assignment systems'.
The concept of 'assignment of a feature value' has not been commonly used outside these two situations, even though it might be useful as a general term to refer to the values of any feature. However, after some consideration, we decided not to adopt it for this Inventory: since the term 'assignment' might be understood to correspond to either the interpretation or the realisation of a feature value, it is potentially confusing.
If the term 'assignment' were adopted here, the locus of 'assignment' of a feature value would correspond to what is referred to here as the locus of 'interpretation' of a feature value. This contrasts with the locus of 'realisation' of a feature value, which is where we find the feature value expressed with a particular lexical item. Hence, a case value is 'assigned' to a constituent - which means that it is interpreted at a phrasal level - but it is 'realised' on particular elements (nouns, adjectives) which are members of the constituent. Note also that when a gender value is 'assigned' to a noun, its realisation on that element is typically not overt, although contextual realisations of gender values on targets of agreement with the controlling noun are overt.
We can now construct an articulated catalogue of different types of feature realisation identified so far. The diagram below, representing the structure of the catalogue, includes the inherent versus contextual distinction, and within the contextual realisation, distinguishes between feature values determined through agreement and those determined through government:
A further question that can be asked about inherently realised feature values is whether the value is lexically supplied to the element, or whether it has been selected from a range of available values. The following example illustrates the distinction: both gender and number values are inherently realised on nouns, they logically 'belong to' nouns when they are used to demand matching agreement on targets. But they are different in that a gender value is typically fixed for a particular noun, while a number value is typically not fixed, but selected from a set of options. Furthermore, since inherent feature values are 'not dictated by syntax', they can also be found on elements other than controllers of agreement. Examples include features such as tense, aspect, mood, evidentiality, voice, topic, focus, and other nominal and verbal features which can be expressed through inflectional morphology in various languages. For the majority of these features, the feature value found on the element is a value selected from a range of values available in the language. So, for example, an inflectional tense marker on a verb can be regarded as expressing an inherently realised value of the feature tense, selected from a range of options. Thus, the catalogue of feature realisations can be extended to include the fixed versus selected distinction within inherent feature realisation:
Finally, one more distinction can be made within both types of inherently realised feature values, orthogonal to the realisation method itself: the distinction between formal and semantic criteria for the selection of a feature value. Frequently, instances of the realisation of a fixed or a selected feature value on an element can be identified as either formally or semantically determined. In many instances the formal and semantic criteria coincide. Therefore, this distinction can be considered an optional subclassification within the catalogue, to enable the choice when the two criteria for the selection of a feature value do not coincide:
The distinction between semantic versus formal criteria behind the selection of an inherent feature value to an element corresponds to a distinction proposed independently elsewhere, that of semantic versus syntactic agreement (see, for example, Corbett 2006:155-165).
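The catalogue described in prose above can also be written down as a small data structure. The following Python sketch is our own illustrative encoding, not a formalism taken from the literature cited: the names `Source`, `Mechanism`, `Criterion`, `Realisation`, and `is_well_formed` are hypothetical, and the formal versus semantic criterion is treated as an optional annotation on inherent realisation only, following the description above.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Source(Enum):
    INHERENT = "inherent"
    CONTEXTUAL = "contextual"


class Mechanism(Enum):
    FIXED = "fixed"            # inherent: lexically supplied value
    SELECTED = "selected"      # inherent: value chosen from a range of options
    AGREEMENT = "agreement"    # contextual: value determined by a controller
    GOVERNMENT = "government"  # contextual: value determined by a governor


class Criterion(Enum):
    FORMAL = "formal"
    SEMANTIC = "semantic"


class Realisation(NamedTuple):
    source: Source
    mechanism: Mechanism
    criterion: Optional[Criterion] = None  # optional subclassification


# Which mechanisms belong under which source in the catalogue.
SUBTYPES = {
    Source.INHERENT: {Mechanism.FIXED, Mechanism.SELECTED},
    Source.CONTEXTUAL: {Mechanism.AGREEMENT, Mechanism.GOVERNMENT},
}


def is_well_formed(r: Realisation) -> bool:
    """Check that a realisation type occupies a legitimate place in the catalogue."""
    if r.mechanism not in SUBTYPES[r.source]:
        return False
    # The formal/semantic criterion subclassifies inherent realisation only.
    if r.criterion is not None and r.source is not Source.INHERENT:
        return False
    return True


# Examples drawn from the discussion above.
gender_on_noun = Realisation(Source.INHERENT, Mechanism.FIXED)
number_on_noun = Realisation(Source.INHERENT, Mechanism.SELECTED, Criterion.SEMANTIC)
gender_on_adjective = Realisation(Source.CONTEXTUAL, Mechanism.AGREEMENT)
structural_case_on_noun = Realisation(Source.CONTEXTUAL, Mechanism.GOVERNMENT)

for r in (gender_on_noun, number_on_noun, gender_on_adjective, structural_case_on_noun):
    print(r.source.value, r.mechanism.value, is_well_formed(r))
```

A combination such as `Realisation(Source.CONTEXTUAL, Mechanism.FIXED)` is rejected by `is_well_formed`, since fixed versus selected is a distinction within inherent realisation. The `SEMANTIC` annotation on the number example is an assumption made for illustration only.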
We can now take a different perspective on features and compare features as superordinate categories, rather than ways of realising feature values on elements. We want to distinguish between features which are relevant to syntax (morphosyntactic features), and those which are not (morphosemantic features). We also want to be able to relate purely morphological features to the other two types. In order to do this, we define a feature as follows:
- A feature is a set of values and the available options for their realisation on linguistic elements.
What follows is that the three types of grammatical features (morphosyntactic, morphosemantic, and morphological), can now be defined in terms of the realisation options available to their values.
A morphosyntactic feature is a feature whose values are involved in either agreement or government. Since agreement requires the presence of a controller which is specified for the feature value it imposes on the target, the values of a morphosyntactic feature may be contextual (when found on targets and governees) or inherent (when found on controllers of agreement). Hence, a morphosyntactic feature is a set of values which have available to them all of the realisation options identified in the catalogue of realisation types:
A morphosemantic feature is a feature whose values are not involved in agreement or government, but are inherent only. That is, the elements on which the values are found are not controllers of agreement. Because it is not involved in either agreement or government, a morphosemantic feature is not relevant to syntax. Hence, a morphosemantic feature is a set of values which have the following realisation options available to them:
An example of a morphosemantic feature is tense in many familiar languages where it encodes regular semantic distinctions, but it is not required by the syntax through the mechanisms of either agreement or government.
A morphosemantic feature may be marked more than once in the phrase or clause. We distinguish such multiple marking of a morphosemantic feature from agreement. Agreement requires systematic co-variance of controllers and targets (Corbett 2006). Multiple marking of a morphosemantic feature does not fall under agreement, but is better analysed as a piece of information that is marked in more than one place in the clause. The inclusion of such information depends wholly on the speaker who chooses to express the given meaning or function, and is independent of syntactic requirements. Sometimes the lack of this information does not imply that the phrase or clause is marked for the negative value of the feature in question (in such instances we refer to the multiple marking of the information, when it is added, as 'concord').
A purely morphological feature is a feature whose values are not involved in agreement or government, and are inherent only. Furthermore, the values of a morphological feature do not co-vary with semantic functions (even though there may be instances of free formal variation between values of a morphological feature). Hence, a morphological feature is a set of values which have the following realisation options available to them:
Morphological features have a role only in morphology (hence the notion of 'morphology-free syntax'). An example of a morphological feature is inflectional class (a 'declensional class', or a 'conjugation'). Morphological features can be arbitrary; they may have to be specified for individual lexical items, hence they are instances of lexical features. Alternatively, they may be predictable, to varying extents, from phonological and/or semantic correlations. That is, given the phonology or semantics of a given lexical item, it may be possible to assign its morphological feature by an assignment rule, rather than having to specify it in the lexicon (Corbett 2006:122-123; for more on morphological features, see Corbett & Baerman 2006).
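The three definitions above amount to a simple decision procedure, which can be sketched in Python. This is an illustrative simplification under our own assumptions: the function `classify_feature` and its two boolean parameters are hypothetical names, and each feature is treated as if the answers held for a given language as a whole.

```python
def classify_feature(in_agreement_or_government: bool,
                     covaries_with_semantics: bool) -> str:
    """Classify a feature by the realisation options available to its values."""
    if in_agreement_or_government:
        # Values may be contextual (targets, governees) or inherent (controllers).
        return "morphosyntactic"
    if covaries_with_semantics:
        # Values are inherent only, but encode semantic distinctions.
        return "morphosemantic"
    # Values are inherent only and have a role in morphology alone.
    return "morphological"


# Examples taken from the discussion above.
examples = {
    "gender (involved in agreement)": (True, True),
    "tense (in many familiar languages)": (False, True),
    "inflectional class": (False, False),
}

for label, (syntactic, semantic) in examples.items():
    print(f"{label}: {classify_feature(syntactic, semantic)}")
```

The first test takes priority: once the values of a feature are involved in agreement or government, the feature counts as morphosyntactic regardless of whether its values also co-vary with semantic functions.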
A distinction is sometimes drawn between overt and covert features. We recognise only overt features as features, while we identify some so-called 'covert features' as 'conditions on agreement'.
To establish the agreement features in a given language, we establish the paradigms for the various agreement targets and ask which features are required to identify each cell. These features are the morphosyntactic features of agreement. When all cells are accounted for, that is to say, we have recognised sufficient features to describe each one, there may still be additional conditions on the use of the forms specified. For example, in Russian the plural form of the verb may be used rather than the singular form for some controllers (such as conjoined noun phrases), according to whether the controller precedes the target or not. This is not a feature of the paradigm, but it is a condition generalising over the possibilities described by the overt features (Corbett 2006:116-122). Similarly, in French the plural is used both for multiple entities, and to show politeness. We do not want to say that French has a morphosyntactic feature 'respect'; for one thing, like 'precedence' in Russian, it has no dedicated exponent. Rather, 'respect' is a condition on the use of the feature 'number'.
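The procedure of asking which features are required to identify each cell can be sketched in Python for a toy paradigm. The sketch makes several assumptions of our own: it uses transliterated past-tense forms of Russian čitat' 'read' as the agreement target, it lays out 'precedence' as a hypothetical candidate feature alongside number and gender, and it counts a feature as required only if ignoring it would force two distinct forms into a single cell. The names `FORMS`, `PARADIGM`, and `is_required` are illustrative.

```python
FEATURES = ("number", "gender", "precedence")

# Transliterated past-tense forms of Russian čitat' 'read'
# (gender is not distinguished in the plural).
FORMS = {
    ("sg", "m"): "čital",
    ("sg", "f"): "čitala",
    ("sg", "n"): "čitalo",
    ("pl", "m"): "čitali",
    ("pl", "f"): "čitali",
    ("pl", "n"): "čitali",
}

# Hypothetical candidate feature: repeat every form under both precedence values,
# since no form is dedicated to either of them.
PARADIGM = {
    (number, gender, precedence): form
    for (number, gender), form in FORMS.items()
    for precedence in ("controller_first", "target_first")
}


def is_required(paradigm: dict, features: tuple, candidate: str) -> bool:
    """True if dropping `candidate` merges cells that hold different forms."""
    idx = features.index(candidate)
    collapsed = {}
    for key, form in paradigm.items():
        reduced = key[:idx] + key[idx + 1:]
        collapsed.setdefault(reduced, set()).add(form)
    return any(len(forms) > 1 for forms in collapsed.values())


required = {f: is_required(PARADIGM, FEATURES, f) for f in FEATURES}
print(required)  # {'number': True, 'gender': True, 'precedence': False}
```

Under these assumptions number and gender are needed to describe the cells, whereas 'precedence' distinguishes no forms. This is consistent with treating it as a condition on the use of the forms rather than as a feature of the paradigm.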
Since features are partial descriptions of objects, including linguistic objects, they can be employed to model phenomena in grammatical description other than the regularities identified through morphology. Most importantly, since features allow us to capture regularities in different components, a first classification of features is according to the components of linguistic description in which it is justified to use particular features: semantic, syntactic, morphological and phonological features.
Second, there are features which have an effect across component boundaries. Such features may be termed 'interface features' (Svenonius 2007). Examples of these are morphosyntactic and morphosemantic features. Our Features Inventory is concerned precisely with these features: interface features spanning morphology, semantics, and syntax, with particular emphasis on morphosyntactic features. Note that our 'morphosyntactic features' interface all three components, so a more accurate term for them, though one with no chance of adoption, would be 'morpho-semantico-syntactic' features.
Another type of linguistic phenomenon that has been modelled with features is 'word class' or 'part of speech'. An early example includes Chomsky (1965:79-86, 110-111), and an explicit discussion of this approach can be found in Gazdar et al. (1985:17-18). Our view is that part of speech categories and subcategories are one type of categorisation, and morphosyntactic features are a cross-classification of these. The distinction between part of speech and morphosyntactic feature can be expressed as a regular correspondence between an open set of lexical items and closed sets of morphosyntactic features and values. (More work on this issue is in preparation.)
- Anderson, Stephen R. 1982. Where's morphology? Linguistic Inquiry 13:571-612.
- Booij, Geert. 1994. Against split morphology. In: Booij, Geert & Jaap van Marle (eds) Yearbook of Morphology 1993. Dordrecht: Kluwer. 27-49.
- Booij, Geert. 1996. Inherent versus contextual inflection and the split morphology hypothesis. In: Booij, Geert & Jaap van Marle (eds) Yearbook of Morphology 1995. Dordrecht: Kluwer. 1-15.
- Bybee, Joan L. 1985. Morphology: A Study of the Relation between Meaning and Form. Amsterdam: Benjamins.
- Carstairs-McCarthy, Andrew. 1999. Category and feature. In: Booij, Geert, Christian Lehmann, Joachim Mugdan & Stavros Skopeteas (eds) Morphology. An International Handbook on Inflection and Word-formation. Berlin: Walter de Gruyter. 264-272.
- Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
- Chumakina, Marina, Anna Kibort & Greville G. Corbett. 2007. Determining a language's feature inventory: person in Archi. In: Austin, Peter K. & Andrew Simpson (eds) Endangered Languages. [Linguistische Berichte, Sonderheft 14]. Hamburg: Helmut Buske. 143-172.
- Comrie, Bernard. 1986. On delimiting cases. In: Richard D. Brecht & James S. Levine (eds) Case in Slavic. Columbus, OH: Slavica. 86-106.
- Corbett, Greville G. 1991. Gender. Cambridge: CUP.
- Corbett, Greville G. 2000. Number. Cambridge: CUP.
- Corbett, Greville G. 2006. Agreement. Cambridge: CUP.
- Corbett, Greville G. & Matthew Baerman. 2006. Prolegomena to a typology of morphological features. Morphology 16:231-246. Draft available at: http://epubs.surrey.ac.uk/cgi/viewcontent.cgi?article=1000&context=smgjournal
- Evans, Nicholas. 1995. A Grammar of Kayardild, with Historical-Comparative Notes on Tangic. Berlin: Mouton de Gruyter.
- Evans, Nicholas. 2003. Typologies of agreement: some problems from Kayardild. In: Brown, Dunstan, Greville G. Corbett & Carole Tiberius (eds) Agreement: A Typological Perspective [Special issue of the Transactions of the Philological Society 101(2)]. Oxford: Blackwell. 203-234.
- Gazdar, Gerald, Ewan Klein, Geoffrey K. Pullum & Ivan A. Sag. 1985. Generalized Phrase Structure Grammar. Oxford: Blackwell.
- Givón, Talmy. 2001. Syntax. An Introduction. Volume I. Amsterdam: John Benjamins.
- Kibort, Anna. 2010. Towards a typology of grammatical features. In: Kibort, Anna & Greville G. Corbett (eds) Features: Perspectives on a Key Notion in Linguistics. Oxford: Oxford University Press. 64-106.
- Matthews, P.H. 1972. Inflectional Morphology. Cambridge: CUP.
- Matthews, P.H. 1991. Morphology. Second Edition. Cambridge: CUP.
- Mel'čuk, Igor A. 1993. Cours de morphologie générale. Vol. I: Introduction et première partie: Le mot. Montréal: Les Presses de l'Université de Montréal; Paris: CNRS Éditions.
- Schenker, Alexander M. 1955. Gender categories in Polish. Language 31(3):402-408.
- Stump, Gregory T. 2001. Inflectional Morphology: A Theory of Paradigm Structure. Cambridge: CUP.
- Stump, Gregory T. 2005. Word-formation and inflectional morphology. In: Štekauer, Pavol & Rochelle Lieber (eds) Handbook of Word-Formation. Dordrecht: Springer. 49-71.
- Svenonius, Peter. 2007. Interpreting uninterpretable features. Linguistic Analysis 33:375-413.
- Wurzel, Wolfgang Ullrich. 1984. Flexionsmorphologie und Natürlichkeit. (Studia grammatica 21). Berlin: Akademie-Verlag. [English translation 1989: Inflectional Morphology and Naturalness. Dordrecht: Kluwer].
- Zaliznjak, A.A. 1964. K voprosu o grammatičeskix kategorijax roda i oduševlennosti v sovremennom russkom jazyke. Voprosy jazykoznanija 4:25-40.
- Zaliznjak, A.A. 1973. O ponimanii termina 'padež' v lingvističeskix opisanijax. In: A.A. Zaliznjak (ed.) Problemy grammatičeskogo modelirovanija. Moscow: Nauka. 53-87.
- Zwicky, Arnold M. 1985. How to describe inflection. Berkeley Linguistic Society 11:372-386.
- Zwicky, Arnold M. 1986. Imposed versus inherent feature specifications, and other multiple feature markings. Indiana University Linguistics Club Twentieth Anniversary Volume. 85-106.
How to cite:
Kibort, Anna & Corbett, Greville G. 2008. Grammatical Features Inventory: Typology of grammatical features. University of Surrey. http://dx.doi.org/10.15126/SMG.18/1.16