Grammatical Features Inventory

Gender

Anna Kibort & Greville G. Corbett

'Gender' most commonly refers to classes of nouns within a language which are 'reflected in the behavior of associated words' (Hockett 1958:231). The term is used both for the particular classes of nouns (so, a language may have two or more genders) and for the whole grammatical category (so, a language may or may not have the category of gender).

Almost all languages have some grammatical means of dividing up their noun lexicon into distinct classes. Gender is one such device; other devices are frequently grouped under the term 'classifiers' and include noun classifiers, numeral classifiers, classifiers in possessive constructions, verb-incorporated classifiers (also referred to as classificatory noun incorporation), classificatory verbs, locative classifiers, and deictic classifiers. Distinguishing gender from classifiers is justified in Dixon (1982) and Corbett (1991:136-137), and examples can be found in Allan (1977) and Aikhenvald (2000). In some traditions genders are referred to as 'noun classes'.

The core of the gender system in any language is the gender assignment system, a set of rules according to which nouns are allotted to genders. From three possible assignment systems - based on the meaning of words, the form of words, or a combination of both - two gender assignment systems are attested in the world's languages: semantic gender assignment systems and semantic-and-formal gender assignment systems. In other words, there is always some semantic basis to gender classification, though genders can be semantically transparent to a greater or lesser extent.

In languages with a strict semantic gender assignment system the meaning of a noun is sufficient to determine its gender, for all or almost all nouns. An example of such a language is Bagvalal (North Caucasian; Kibrik et al. 2001:64-66), where nouns denoting male humans (and only those) are masculine, nouns denoting female humans are feminine, and all remaining nouns are neuter. In Kala Lagaw Ya (Australian, spoken on the Western Torres Strait Islands) nouns denoting males (and the moon, a mythical 'male entity') are masculine, and all others belong in the feminine gender (Bani 1987). In Diyari (Dieri; a now extinct South Australian language), nouns denoting female animates are in one gender, and all remaining nouns in another (Austin 1981). Strict semantic three-gender systems are also common in the Dravidian family (e.g. in Kannada and Tamil).

Many languages have a predominantly semantic gender assignment system, where assignment of nouns to some genders, or some nouns to genders, is semantically transparent, but there are exceptions for which there is no readily available explanation. An example of such a language is Bininj Gun-Wok (a large group of related dialects spoken in Western Arnhem Land, Australia). It has four genders with a semantic core: masculine, feminine, vegetable, and neuter - but some semantic groups of nouns appear to be allocated to the genders arbitrarily. It has been claimed that many such apparently random allocations can be explained once the cultural setting of the language is taken into account and gender assignment is seen as reflecting the worldview of the speakers (see e.g. the discussions of Dyirbal, Australian, in Dixon 1972 and the recent re-analysis in Plaster & Polinsky 2010; or of Ojibwa and other Algonquian languages of North America). The most frequently found semantic distinctions on which gender assignment is based are male versus female (many Afroasiatic languages, East-Nilotic, Central Khoisan), human versus non-human (some Dravidian languages), rational (i.e. humans, gods, demons) versus non-rational (Tamil and other Dravidian languages), and animate versus inanimate (Siouan, North American). Other less usual criteria may include: non-flesh food (Dyirbal, Australian), insects (the Rikvani dialect of Andi, Nakh-Daghestanian), diminutives (various Bantu languages), places (also Bantu), and sometimes also shape and size (also Bantu; however, sometimes primarily sex-based genders may have additional shape- and size-related meanings). Furthermore, languages may combine these parameters in different ways (Aikhenvald 2006:463-464).

In many languages, apart from the semantic basis for gender assignment which is at the core of the system, we find additional rules for assigning nouns to genders according to their form. The rules may access two types of information: phonological and morphological, and there may be combinations of such rules. A clear example of gender assignment depending on phonological information is found in Qafar (East Cushitic; Parker & Hayward 1985), where sex-differentiable nouns are assigned gender - masculine or feminine - according to their semantics, and all other nouns are assigned to the two genders depending on their phonological form. Typically, nouns denoting males and females fit with the phonological assignment rules, but in cases of conflict semantic rules take precedence. An example of a language with a morphological assignment system supplementing the semantic core is Russian. As in many other Indo-European languages, nouns denoting sex-differentiables in Russian are assigned masculine and feminine genders. However, the semantic residue is shared between all three Russian genders - masculine, feminine, and neuter - with the neuter not even receiving the majority. The gender assignment of the residue is determined by the morphological form of the nouns: there are four main inflectional classes in Russian, and the remaining nouns are assigned gender according to their inflectional class (with indeclinable nouns requiring separate, additional rules). Again, many sex-differentiable nouns would be assigned correctly by the morphological assignment rules alone. However, in cases of conflict, semantic rules take precedence here, too. Apart from many other Indo-European languages, semantic-and-morphological assignment systems are found in Arabic (Cowell 1964:372-375) and Kuot (a language isolate of New Ireland, Papua New Guinea; Lindström 2002:147-164, 176-194).
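The precedence of semantic over formal rules described above can be sketched in code. The following Python fragment is a minimal illustration, not an implementation taken from the literature. The names `Noun` and `assign_gender` are hypothetical, the mapping from inflectional class to gender (I to masculine, II and III to feminine, IV to neuter) is an assumption following the general shape of the account in Corbett (1991), the noun djadja 'uncle' is added here as an example of a conflict, and indeclinable nouns are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Noun:
    form: str
    gloss: str
    inflectional_class: str       # "I", "II", "III" or "IV" (assumed labels)
    sex: Optional[str] = None     # "male", "female", or None if not sex-differentiable


# Assumed morphological rules: inflectional class -> gender.
MORPHOLOGICAL_RULES = {"I": "masculine", "II": "feminine",
                       "III": "feminine", "IV": "neuter"}
# Semantic rules for sex-differentiable nouns.
SEMANTIC_RULES = {"male": "masculine", "female": "feminine"}


def assign_gender(noun: Noun) -> str:
    """Apply semantic rules first; fall back to morphology for the residue."""
    if noun.sex is not None:
        return SEMANTIC_RULES[noun.sex]
    return MORPHOLOGICAL_RULES[noun.inflectional_class]


nouns = [
    Noun("dub", "oak", "I"),
    Noun("sosna", "pine", "II"),
    Noun("kost'", "bone", "III"),
    Noun("derevo", "tree", "IV"),
    Noun("djadja", "uncle", "II", sex="male"),  # morphology alone would predict feminine
]

for n in nouns:
    print(f"{n.form} '{n.gloss}': {assign_gender(n)}")
```

In this sketch the ordering of the two checks is what encodes the claim that semantic rules win in cases of conflict; swapping them would wrongly make djadja feminine.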

Finally, even though in many languages most nouns are assigned to just one gender, in some languages different genders can be chosen to highlight a particular property of the referent. Manambu (a Sepik language of Papua New Guinea) has two genders: masculine gender including male referents, and feminine gender including females. But the choice of gender can vary and depend on other factors: if the referent is exceptionally long, or large, it is assigned masculine gender, and if it is small and round, it is feminine (Aikhenvald 2006:464). English provides more familiar examples of this phenomenon. There are double- and multiple-gender nouns, such as doctor, baby and nouns referring to animals (especially familiar animals such as pets). There are also hybrid nouns (which trigger different agreement forms depending partly on the type of target), such as ship and other 'boat nouns' which can take the personal pronoun she but not the relative who. For discussion of the types of nouns that deviate from a consistent agreement pattern, see Corbett (1991:180-184).

Gender systems are typically found in languages with a fusional or agglutinating (not an isolating) profile (Aikhenvald 2006:463). It is unusual, but possible, for a language to have both gender and classifiers. This has been found in Tariana (North Arawakan, Brazil; Aikhenvald 1994), Retuarã (Tucanoan, Colombia; Strom 1992:10-11, 34-36, 45-47), and Tidore (West Papuan, Indonesia; van Staden 2000:77-81). Furthermore, Ngan'gityemerri (Daly; northern Australia) shows the development from generic classifiers into genders (Reid 1997), and a similar situation, but with a very large system, has been reported for Miraña (Boa, Witotoan, Colombia; Seifart 2004).

Expressions of 'gender'

In some languages there is a marker of gender on every noun, in others nouns bear no markers of gender. The former, perhaps less familiar, class of languages includes Bantu languages, Berber (especially North Berber - Kabyle, Tashelhit, possibly Tamazight - which mark just about all nouns for gender; Alexandra Aikhenvald personal communication), many Arawak languages of South America (e.g. Baniwa and Tariana; in these languages genders are marked on all nouns, sometimes with the exception of human nouns, and are also used in other classificatory functions, such as numeral classifiers; Aikhenvald 2007), and many languages in New Guinea. However, no amount of marking on a noun can be taken as evidence that the language has gender. The evidence of gender comes from the agreement targets that show gender in the language.

It is taken as the definitional characteristic of gender that some constituent outside the noun itself must agree in gender with the noun. In other words, a language has a gender system only if we find different agreements ultimately dependent on nouns of different classes (Corbett 1991:146ff; 2005:126). Agreement can appear on: other words in the noun phrase (adjectives, determiners, demonstratives, numerals, etc., even focus particles), on the predicate of the clause, on an adverb, and - for some linguists - on an anaphoric pronoun outside the clause boundary. Languages often have portmanteau markers combining information about gender with number, person, case, or other features.

Barlow (1992:134-152) discusses the issue of the scope of agreement and concludes that there are no good grounds for distinguishing between agreement and antecedent-anaphor relations. If antecedent-anaphor relations are accepted as agreement, languages in which gender distinctions are absent from noun phrase modifiers and from predicates and in which free pronouns present the only evidence for gender, can be counted as having a (pronominal) gender system. Such languages are rare, and the best known example is English, which is typologically unusual in this respect (Corbett 2005:126; see also §6 below); another such language is Defaka (Niger-Congo, Nigeria; Jenewari 1983:103-106).

Genders, understood as classes of nouns within a language which are 'reflected in the behavior of associated words' (Hockett 1958:231), are agreement classes which may be defined as follows (Corbett 1991:147-150; cf. Zaliznjak 1964:30; see also §4 below):

An agreement class is a set of nouns such that any two members of that set have the property that:

whenever (i) they stand in the same morphosyntactic form,

and (ii) they occur in the same agreement domain,

and (iii) they have the same lexical item as agreement target,

then their targets have the same morphological realisation.

'Standing in the same morphosyntactic form' means that, crucially, the nouns have to have the same number and case. Unlike gender, these two features can often be justified without reference to agreement, only on the basis of the morphological material on the noun itself. Thus, when number and case are taken out of the equation, the remaining distinctions between agreement classes - if there are any - identify these classes of nouns as belonging to different genders. (Note that these are 'controller genders', which are the values of the morphosyntactic feature of gender, as opposed to 'target genders' which are generalisations about the patterns of forms).
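The definition of an agreement class can be made concrete with a small grouping procedure. The Python sketch below is an illustration under simplifying assumptions: the function name `agreement_classes` is hypothetical, every noun is assumed to have been observed in exactly the same set of contexts, and the toy data cover only the Russian adjective novyj 'new' in attributive position in the nominative singular and plural.

```python
from collections import defaultdict

# Nominative singular forms of the adjective 'novyj' with each noun (toy data).
singular_forms = {
    "dub": "novyj", "čaj": "novyj",        # 'oak', 'tea'
    "sosna": "novaja", "reka": "novaja",   # 'pine', 'river'
    "derevo": "novoe",                     # 'tree'
}

# Key: (noun, morphosyntactic form, agreement domain, target lexeme) -> realisation.
observations = {}
for noun, adjective in singular_forms.items():
    observations[(noun, "nom.sg", "attributive", "novyj")] = adjective
    # In the plural the adjective has a single form for all of these nouns.
    observations[(noun, "nom.pl", "attributive", "novyj")] = "novye"


def agreement_classes(obs):
    """Group nouns whose targets are realised identically in every context."""
    signatures = defaultdict(dict)
    for (noun, form, domain, target), realisation in obs.items():
        signatures[noun][(form, domain, target)] = realisation
    classes = defaultdict(list)
    for noun, signature in signatures.items():
        classes[frozenset(signature.items())].append(noun)
    return [sorted(members) for members in classes.values()]


for members in agreement_classes(observations):
    print(members)
```

Because number and case are part of each context key, they are held constant in the comparison: the plural rows alone would not separate these nouns, while the singular rows yield three classes.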

The status of 'gender' as a feature

Gender is an indisputable morphosyntactic feature, since it is required for agreement. The realisation of the value for gender on the target is the canonical instance of the need for a syntactic rule of agreement (Corbett 2006:126).

Gender is an inherent feature of nouns, and a contextual feature (determined through agreement) for any other elements that have to agree with the nouns in this feature (e.g. adjectives, verbs, etc.). Typically, gender is lexically supplied and its value is fixed for the noun. However, on some nouns (multi-gendered nouns and hybrid nouns) gender can be a semantically selected feature, where one gender value is selected from a set of options. Therefore, the lexical entries of nouns in a gendered language must specify either that the noun has a fixed gender value or that it is capable of taking on different gender values as dictated by the semantics.

An interesting question that arises with respect to the shape of lexical entries is what information exactly needs to be specified: the gender of a noun in a gendered language must be available, but it can be derived from other information - semantic, morphological, or phonological.
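One way to picture the two kinds of lexical specification is as a record that lists either a single gender value or a set of options from which the semantics selects. The Python sketch below is only an illustration of that idea: the class `LexicalEntry`, its fields, and the method `gender_for` are hypothetical names, and selection by referent sex is just one assumed example of semantic selection.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class LexicalEntry:
    lemma: str
    genders: FrozenSet[str]  # one value = fixed gender; several = semantically selected

    def gender_for(self, referent_sex: Optional[str] = None) -> str:
        """Return the fixed value, or select one option from the referent's sex."""
        if len(self.genders) == 1:
            return next(iter(self.genders))
        selected = {"male": "masculine", "female": "feminine"}.get(referent_sex)
        if selected not in self.genders:
            raise ValueError(f"cannot select a gender for {self.lemma!r}")
        return selected


# Russian sosna 'pine': gender is lexically fixed.
sosna = LexicalEntry("sosna", frozenset({"feminine"}))
# English doctor: a multiple-gender noun; the value depends on the referent.
doctor = LexicalEntry("doctor", frozenset({"masculine", "feminine"}))

print(sosna.gender_for())            # feminine
print(doctor.gender_for("female"))   # feminine
print(doctor.gender_for("male"))     # masculine
```

A fuller model could derive the fixed value from semantic, morphological, or phonological information instead of storing it, which is the open question raised above.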

Some lexical oppositions which correspond to semantic distinctions similar to gender, but which are not instantiations of the morphosyntactic feature of gender, include semantically contrasting lexical items (as in Kanuri, Nilo-Saharan, Nigeria - which doesn't have a gender system but does have lexical items for 'boy' versus 'girl', for example), or lexical derivations (e.g. in English, the class of nouns ending in -tion).

The values of 'gender'

The noun inventory of a language is divided into different classes, or genders, according to the different agreements they take (for the notion of 'agreement class', based on Zaliznjak 1964, see §2 above). 'For two nouns to be in the same agreement class, they must take the same agreements under all conditions - that is, if we hold constant other features such as case and number. (...) If two nouns differ in their agreements when factors such as case and number are held constant, then they belong to two different agreement classes and normally they will belong to two different genders' (Corbett 2006:750). Note that the earlier Bantuist tradition treated nouns as being in different noun classes when singular and plural, and it is often stated that Bantu languages have twenty noun classes. Counted in the way outlined above, the number is typically between seven and ten.
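The difference between the two ways of counting can be shown with a short calculation. In the Python sketch below the class numbers and their singular/plural pairings are invented for illustration and do not describe any particular Bantu language; the assumption is simply that a gender, in the agreement-class sense, corresponds to a pairing of a singular class with a plural class.

```python
# Hypothetical singular/plural class pairings, invented for illustration only.
pairings = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 10), (12, 13), (14, 6)]

# Older tradition: each singular or plural class is counted separately.
noun_classes = {cls for pair in pairings for cls in pair}

# Agreement-class counting: one gender per distinct singular/plural pairing.
genders = set(pairings)

print(len(noun_classes), "noun classes")  # 14
print(len(genders), "genders")            # 8
```

Note that two pairings may share a plural class (here 10 and 6), which is one reason the two counts are not related by a simple factor of two.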

For many languages, establishing agreement classes determines the number of genders straightforwardly. However, in languages with more complex gender systems, it may be necessary to separate out the classes into which nouns are divided (the controller genders) from the number of different genders marked on agreement targets (the target genders), and propose a different number of genders for the controllers and for the targets (Corbett 1991; see also the example of Romanian in the 'Problem cases' section below). However, while controller genders are the values of the morphosyntactic feature of gender, target genders can be thought of as generalisations about the patterns of gender markers (as found on targets).

Based on the analysis of a sample of 256 languages, Corbett (2005:127-129) reports that somewhat over half (144) have no gender system, two-gender systems are common (50 languages in the sample), three-gender systems are around half as common (26 languages), and four-gender systems about half as common again (12 languages). Larger systems, with five or more genders, represent a substantial minority (24 languages in the sample).
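The proportions behind these descriptions can be checked directly from the counts given above. The following Python fragment is a small arithmetic aid; only the five counts come from the text, and the dictionary labels are chosen here for readability.

```python
# Counts as reported in the paragraph above.
counts = {"none": 144, "two": 50, "three": 26, "four": 12, "five or more": 24}

total = sum(counts.values())
print("sample size:", total)

for label, n in counts.items():
    print(f"{label:>12}: {n:3d}  ({n / total:.1%} of the sample)")

# Shares among languages that have gender at all.
with_gender = total - counts["none"]
print("languages with gender:", with_gender)
for label, n in counts.items():
    if label != "none":
        print(f"{label:>12}: {n / with_gender:.1%} of gendered languages")
```

The counts sum to the stated sample size of 256, and two-gender systems account for a little under half of the 112 languages that have gender.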

A minimal gender system requires two genders, and this is the most common number of genders, found in most geographic areas where gender is found. For larger systems, the major source is Niger-Congo, where systems in excess of five genders are common, with the record-breaker Nigerian Fula having around twenty genders (the exact count depending on the dialect). Outside Niger-Congo, Arapesh (Papua New Guinea) has thirteen genders, and Ngan'gityemerri (northern Australia) arguably has fifteen genders (for references see Corbett 2005:127).

Even though every language that has gender divides up its noun lexicon in a different way, the typical semantic distinctions that underlie gender distinctions have given rise to several common labels used for gender values (see also §6 below: a note on 'Gender labels and the correspondence problem'). However, because languages with strictly semantic gender assignment are rare, it is important to remember that for most languages gender labels with semantic denotation likely correspond to classes of nouns with membership determined by both semantic and formal criteria. For example (with thanks to Dunstan Brown for discussion of these):

Feminine gender - is a gender to which nouns may be assigned if: (1) they inherently denote females. Additionally, but not necessarily, nouns may be assigned this value of gender if: (2) their formal properties (morphological or phonological) lead them to be assigned to the same agreement pattern as other nouns within the language that have female denotation; or (3) they are arbitrarily assigned to the same agreement pattern as other nouns in the language that have female denotation.
Masculine gender - is a gender to which nouns may be assigned if: (1) they inherently denote males. Additionally, but not necessarily, nouns may be assigned this value of gender if: (2) their formal properties (morphological or phonological) lead them to be assigned to the same agreement pattern as other nouns within the language that have male denotation; or (3) they are arbitrarily assigned to the same agreement pattern as other nouns in the language that have male denotation.
Animate gender - is a gender to which nouns may be assigned if they denote animates. Additionally, in a given language, this gender may include a larger or smaller number of nouns which do not meet this semantic criterion. Animate gender may occur in a two-gender system, with the other gender being labelled inanimate. However, animate gender may also occur in larger gender inventories (i.e. greater than two values). Examples of these larger systems are found in Bantu languages (where nouns denoting humans are included in the animate gender) and in languages of Daghestan (where the animate gender is typically for non-human animates) (see Corbett 1991:20-32).
Inanimate gender - is a gender to which nouns may be assigned if they denote non-living things. Additionally, in a given language, this gender may include a larger or smaller number of nouns which do not meet this semantic criterion. Often, though not necessarily, inanimate gender occurs in a two-gender system, the other gender being animate (as e.g. in Nishnaabemwin). It may also occur in larger gender inventories.
Human gender - is a gender to which nouns may be assigned if they denote humans. Additionally, in a given language, this gender may include a larger or smaller number of nouns which do not meet this semantic criterion.
Vegetable gender - is a gender to which nouns may be assigned if they denote plants and their products. Additionally, in a given language, this gender may include a larger or smaller number of nouns which do not meet this semantic criterion.
etc.

Apart from using semantic labels for gender values, two other conventions are to use Arabic numerals or Roman numerals. Roman and Arabic numerals are often used for languages for which there is a descriptive tradition involving use of the term 'noun class' instead of 'gender', in particular in languages of the Caucasus or Bantu languages; Arabic numerals are particularly useful where the number of genders is large (as in Bantu languages). If the 'noun classes' are involved in agreement systems, they are gender systems. Roman and Arabic numerals may also be used in instances where another label is possible. For instance, in one language the gender to which nouns with human denotation are assigned might be called 'human', whereas in another language nouns with a similar denotation may be assigned to a gender with an arbitrary Arabic numerical label such as '1'. Similarly, in one language the gender to which nouns with male rational denotation are assigned might be called 'masculine', whereas in another language nouns with a similar denotation may be assigned to a gender with an arbitrary Roman numerical label such as 'I' (see e.g. Corbett 1991:25; again thanks to Dunstan Brown for discussion).

Oddly behaving gender markers

Barasano (an eastern Tucanoan language spoken in Colombia) shows an interesting interrelation between gender and person: the subject agreement markers (suffixes) in Barasano mark gender and animacy in the third person; curiously, the inanimate marker is also used for speech act participants, i.e. first or second person, singular or plural (Jones & Jones 1991:73-74).

Archi, a Daghestanian (or North East Caucasian) language traditionally assigned to the Lezgian group, distinguishes four genders and two numbers, but gender marking does not appear to be consistent across different persons. Archi has no unique forms for agreement in person, and the standard descriptions of this language do not involve the feature person (Kibrik et al. 1977; Kibrik 1977). However, the apparently inconsistent gender and number agreement patterns in Archi have a straightforward explanation when we accept that Archi does have the person feature and some gender markers double as person agreement markers (Chumakina, Kibort & Corbett 2007).

The following table lists affixal agreement forms marking verbs in Archi:

	NUMBER
GENDER	singular	plural
I (male human)	w-/<w>	b-
II (female human)	d-/<r>	b-
III (other)	b-/<b>	∅-
IV (other)	∅-/<∅>	∅-

The personal pronouns zon 'I' and un 'you (singular)' take gender agreements corresponding to the gender of the speaker or addressee: male humans trigger gender I agreement, female humans - gender II agreement, and imaginary locutors of genders III and IV (e.g. a speaking cow and a speaking goat kid) trigger gender III and IV agreements, respectively.

If Archi has no person feature, we should expect the same pattern of agreement, based on gender, to occur with personal pronouns in the plural. Indeed, this is what happens with the personal pronoun teb 'they'. It takes gender I/II agreement (the prefix b-) when the referents are human, and gender III/IV agreement (zero marking) when the referents are non-human. However, unexpectedly, the personal pronouns nen 'we' and žwen 'you (plural)' referring to humans do not take the gender-based I/II agreement marker (b-). Instead, they trigger zero marking, which we gloss as III/IV.PL as in the table above:

zon “I” → gender agreement
un “you (sg)” → gender agreement
teb “they” → gender agreement
nen “we” [humans/non-humans] → ∅-
žwen “you (pl)” [humans/non-humans] → ∅-

An analysis of Archi without a person feature requires a complication of the gender system, as we have to add four more genders to the system, each representing one of the four types of gender agreement required by the set of the personal pronouns. In this way, the pronouns are treated as unique lexical items. Furthermore, very complex and typologically odd resolution rules for agreement with conjoined phrases have to be proposed to account for agreement when one of the conjuncts is a pronoun.

The alternative is to base the gender resolution rules for Archi on a general rule formulated in purely semantic terms (i.e. if there is at least one conjunct denoting a rational or rationals, gender I/II agreement will be used; otherwise gender III/IV will be used) and, rather than treating the personal pronouns as each being an exception in terms of gender, accept a person feature for Archi. In this way, the gender resolution rules in Archi are fairly usual, and the person resolution rule required (persons 1 and 2 > person 3) is standard, except for the interesting points that there is no distinction here between persons 1 and 2, and that it operates only in the plural.
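The two resolution rules just described are simple enough to state as a short function. The Python sketch below is an illustration only: the names `Conjunct` and `resolve` are hypothetical, the output labels "I/II", "III/IV", "1/2" and "3" are shorthand for the categories in the text, and the sketch applies only to conjoined (hence plural) phrases.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Conjunct:
    person: int      # 1, 2 or 3
    rational: bool   # True if the conjunct denotes a rational or rationals


def resolve(conjuncts: List[Conjunct]) -> Tuple[str, str]:
    """Return (resolved person, resolved gender) for a conjoined phrase."""
    # Gender resolution: at least one rational conjunct gives I/II, otherwise III/IV.
    gender = "I/II" if any(c.rational for c in conjuncts) else "III/IV"
    # Person resolution: persons 1 and 2 outrank person 3, and are not distinguished.
    person = "1/2" if any(c.person in (1, 2) for c in conjuncts) else "3"
    return person, gender


print(resolve([Conjunct(1, True), Conjunct(3, True)]))    # ('1/2', 'I/II')
print(resolve([Conjunct(3, True), Conjunct(3, False)]))   # ('3', 'I/II')
print(resolve([Conjunct(3, False), Conjunct(3, False)]))  # ('3', 'III/IV')
```

Keeping the two rules independent is the point of the analysis: neither rule needs to mention individual pronouns as lexical exceptions.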

The following paradigm illustrates Archi agreement in person, showing how gender markers double as person agreement markers:

	NUMBER
PERSON	singular	plural
1	gender agreement	∅-
2	gender agreement	∅-
3	gender agreement	gender agreement
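The interaction of the person paradigm with the gender and number forms listed earlier can be sketched as a lookup. The Python fragment below is a simplified illustration: the function name `archi_prefix` is hypothetical, only the prefixal alternants from the table of affixal agreement forms are included (the infixal variants are left out), and person and number are encoded as plain integers and the strings "sg" and "pl".

```python
# Prefixal agreement forms by gender, taken from the table of Archi forms above.
SINGULAR = {"I": "w-", "II": "d-", "III": "b-", "IV": "∅-"}
PLURAL = {"I": "b-", "II": "b-", "III": "∅-", "IV": "∅-"}


def archi_prefix(person: int, number: str, gender: str) -> str:
    """Return the agreement prefix for a controller with the given features."""
    # First and second person plural take zero marking regardless of gender.
    if number == "pl" and person in (1, 2):
        return "∅-"
    # Everywhere else the marker is determined by gender and number.
    table = SINGULAR if number == "sg" else PLURAL
    return table[gender]


print(archi_prefix(1, "sg", "II"))  # d-   (zon 'I', female speaker)
print(archi_prefix(3, "pl", "I"))   # b-   (teb 'they', human referents)
print(archi_prefix(1, "pl", "I"))   # ∅-   (nen 'we')
print(archi_prefix(2, "pl", "II"))  # ∅-   (žwen 'you (plural)')
```

The single person-sensitive branch is all that the person feature adds; without it, the same behaviour would have to be encoded as extra genders for the individual pronouns.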

In Garífuna (Arawak, spoken in Belize, Honduras, Nicaragua and Guatemala), it is possible for certain nouns to be of different genders according to the sex of the speaker (Taylor 1977:60, cited in Munro 1998).

Gender can be affected by status. Three interesting examples of curious effects of status within feminine gender are reported in Corbett (2005:131). Lak (Daghestanian) has four genders, broadly: I for male rationals, II for female rationals, III for other animates (though this has other members, too, including many inanimates), and IV for the semantic residue (which also includes a few animates). There was an important exception: duš 'girl, daughter' was a member of gender III instead of the expected gender II. Gender III agreements became a sign of politeness when addressing young women (Xajdakov 1963:49-50), particularly those earning their own living, and nouns denoting them have been transferred to gender III. This usage has extended so that now gender III agreement forms are appropriate for any woman outside the immediate family. Within the family, older women such as ninu 'mother' and amu 'grandmother' are addressed and referred to using gender II forms. Thus gender II is semantically restricted and is left with extremely few nouns in it. Something comparable has happened in Konkani (Indo-European, spoken on the west coast of India), where the word for 'girl' was neuter. Where human referents are concerned, the neuter has become the gender for young females (or those relatively younger from the speaker's standpoint), while the feminine is for old, or relatively older, females (Miranda 1975:208-213). A similar change in the core meaning of genders has occurred in some southern Polish dialects (Zaręba 1984-85). In several of these dialects, nouns denoting girls and unmarried women (irrespective of age), and including hypocoristics, are of neuter gender. Neuter agreements are employed when unmarried women are addressed, and they use them for self-reference. In a smaller area, to the south-west of Kraków, instead of the neuter the masculine is used. In both types of dialect, the feminine is used for married women. 
The change from neuter or masculine to feminine for a particular woman occurs immediately after the church wedding ceremony; the communities involved are small, so there is no difficulty about knowing who is married and who is not. The meaning of the feminine has changed in both dialect types, being restricted now to denote married women. (Feminine nouns which are not semantically motivated also remain feminine.) For further details on all these, and references, see Corbett (1991:24-26, 99-101; 2005:131).

Problem cases

Gender versus nominal classification. Gender is perhaps the only feature whose values, when found on controllers of agreement (i.e. the gender values inherently assigned to nouns), typically have no overt expression in the majority of languages which have the gender feature. There are a few languages where gender is marked on every noun, or on most nouns with the exception of nouns referring to humans (see §2 above for examples). However, languages marking genders on nouns are a small minority. Perhaps the fact that gender value is not marked on nouns may be related to the fact that gender is also a feature in which inherent assignment of the value is predominantly fixed - that is, typically, most nouns in languages that have the feature of gender have only one, fixed value of gender. Furthermore, the inherent value of gender on the noun is assigned to it on the basis of some specific criteria, usually a combination of semantic and formal criteria. Therefore, in fact, the gender of a noun in a gendered language need not even be specified in the noun's lexical entry, since it can be derived from other information - semantic, morphological, or phonological.

Because of these characteristics, the term 'gender' is most commonly used to refer to classes of nouns within a language which are 'reflected in the behaviour of associated words' (Hockett 1958:231). This is also the definition adopted by Corbett (1991), who argues that in order to define gender we have to refer to the targets of agreement in gender, which allow us to justify the classification of nouns into genders. In the typology of grammatical features which underlies this Inventory (see the 'Feature Inventory' page), it has been possible to retain the special status of gender. However, the position of gender within the typology needs to be clarified in order to enable comparisons with other features.

As defined in Hockett (1958) and Corbett (1991), gender is exclusively a feature of agreement. Hence, the feature is referred to as 'gender' in a language if it concerns the classification of the nominal inventory of the language, but only if the inherently assigned gender values found on nouns are matched by contextually assigned gender values found on targets of agreement in gender. If a language has a system of nominal classification expressed through inflectional morphology, but the feature of nominal classification does not participate in agreement, it does not qualify as 'gender'. With respect to syntax, the status of such a feature is similar to the status of tense in most of the familiar languages: an inflectionally marked feature such as tense expresses a semantic or formal distinction, but is not relevant to syntax for the purposes of agreement or government. Syntax does not need to know the value of the inflectional noun classifier or inflectionally marked tense. Therefore, the distinction between inflectional noun classification and gender is that, while the former can only be a morphosemantic feature, gender can only be a morphosyntactic feature.

Gender labels and the correspondence problem. All labels for genders - including 'masculine', 'feminine', and 'neuter', which are the most familiar ones - capture distinctions between classes of nouns. As was discussed in §1 and §4, each language divides up its noun lexicon in a different way, therefore defining particular genders is possible only with reference to the gender system of the given language. This is sometimes referred to as the correspondence problem, meaning that corresponding gender labels in different languages do not refer to genders that can be defined in the same way. Here is a simple example: French and Slovene both have masculine and feminine genders. Do they correspond? Yes, in the sense that nouns denoting females are typically assigned to the feminine gender. No, if we consider that French has two genders and Slovene three. Therefore, in an inventory of features, labels such as FEM or MASC are bound to have different meanings depending on the system of which they are a part (e.g. FEM in a two-gender system versus FEM in a three-gender system).

Furthermore, the semantics of the genders can differ dramatically, as say between Tamil (semantically assigned) and Slovene (semantically and formally assigned), though both have three genders. As was discussed in §1, gender systems in all languages do appear to have some semantic basis. For example, in Russian (as in many other Indo-European languages), for sex-differentiables, nouns denoting males are masculine, and those denoting females are feminine. However, the nouns not covered by these rules - the semantic residue - are not simply assigned to the neuter gender. Rather, the semantic residue is shared between the three genders, with the neuter gender not even receiving the majority. The following table, from Corbett (2006:752), shows that nouns with apparently similar semantics can be assigned to all three genders. The assignment of these nouns to gender classes is achieved on the basis of their morphological properties (specifically, their inflectional class):

Masculine	Feminine	Neuter
dub 'oak'	sosna 'pine'	derevo 'tree'
stvol '(tree) trunk'	doska 'plank'	brevno 'log'
čaj 'tea'	voda 'water'	moloko 'milk'
ogon' 'fire'	peč' 'stove'	plamja 'flame'
okean 'ocean'	reka 'river'	more 'sea'
avtomobil' 'car'	mašina 'car'	taksi 'taxi'
den' 'day'	noč' 'night'	utro 'morning'
čas 'hour'	minuta 'minute'	vremja 'time'
nerv 'nerve'	kost' 'bone'	serdce 'heart'
glaz 'eye'	brov' 'eyebrow'	veko 'eyelid'
lokot' 'elbow'	lodyžka 'ankle'	zapjast'e 'wrist'
flag 'flag'	èmblema 'emblem'	znamja 'banner'

Thus, when describing the feature inventory, it is not enough to list gender values; we also need some declaration of the gender system to which they belong.
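One way to picture this requirement is a feature value that carries its gender system with it. The sketch below is illustrative only: the class names `GenderSystem` and `GenderValue` and the label strings are assumptions made for the example, not part of any published inventory format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenderSystem:
    """A language-specific declaration of the gender values available (hypothetical)."""
    language: str
    values: tuple  # e.g. ("MASC", "FEM")


@dataclass(frozen=True)
class GenderValue:
    """A gender label that is only meaningful relative to its system (hypothetical)."""
    label: str
    system: GenderSystem

    def __post_init__(self):
        # Reject labels that the declared system does not contain.
        if self.label not in self.system.values:
            raise ValueError(f"{self.label} is not a value of {self.system.language}")


FRENCH = GenderSystem("French", ("MASC", "FEM"))
SLOVENE = GenderSystem("Slovene", ("MASC", "FEM", "NEUT"))

french_fem = GenderValue("FEM", FRENCH)
slovene_fem = GenderValue("FEM", SLOVENE)

# Same label, different systems: the labels match but the values do not.
same_label = french_fem.label == slovene_fem.label
same_value = french_fem == slovene_fem
```

Because the system is part of the value, FEM in a two-gender system and FEM in a three-gender system are kept apart even though they share a label.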

The problem of the interrelation between gender and number. In several of the more familiar languages, the gender agreement pattern is straightforward and the way in which the gender system is analysed is taken as self-evident. French, for example, is taken to have two values of gender, and this works well because not only nouns but also agreeing verb forms and adjectives are 'masculine' or 'feminine'. Thus, controller genders and target genders correspond in a straightforward way and the gender values are labelled 'masculine' and 'feminine':

singular		plural
	MASC nouns
MASC agreement forms	———————————	MASC agreement forms

	FEM nouns
FEM agreement forms	———————————	FEM agreement forms

However, difficulties arise in languages with more complex gender systems. If the distinction between controller and target genders is not considered, gender values may be presented as though the pattern were equally uncontroversial, with no indication of what the values really mean. The number of genders in a particular language can then be the subject of interminable dispute, or similar situations are described differently by those working on different language families. This problem can be seen as another instance of the correspondence problem, where this time the feature values do not correspond intralinguistically, rather than crosslinguistically.

A good example of a language whose gender system has been the source of continuing disagreement is Romanian (for references to the extensive literature on this topic see Corbett 1991:150). The argument, which has gone on for decades, is whether we have two genders or three. In terms of agreement classes, the situation is clear: there are three classes that should be set up as follows (where the agreement endings are typical allomorphs for each target gender):

class I nouns taking -∅ in the singular and -i in the plural (e.g. bărbat 'man')
class II nouns taking -∅ in the singular and -e in the plural (e.g. scaun 'chair')
class III nouns taking -ă in the singular and -e in the plural (e.g. fată 'girl')

However, simply to say that Romanian has three genders suggests that it is like German, Latin or Tamil, even though in each of these languages, intuitively, the situation is rather different. It can be seen that Romanian has three controller genders (i.e. the genders into which nouns are divided), and it has two target genders (i.e. the genders which are marked on adjectives, verbs, and so on, depending on the language) in both singular and plural. The gender system of Romanian can, then, be diagrammed as follows (illustrated with the agreement forms of the adjective bun- 'good'):

	singular	plural
class I nouns	bun	buni
class II nouns	bun	bune
class III nouns	bună	bune

The controller genders in Romanian (i.e. the lines labelled 'class I nouns', 'class II nouns', and 'class III nouns') are usually called 'masculine' (I) and 'feminine' (III); the disputed gender (II) is sometimes called 'neuter' and sometimes 'ambigeneric'. The latter is a useful term, provided it is used not to imply that there is no distinct gender but rather that the situation is different from the more common Indo-European three-gender system: Romanian does have neuter gender, but it is non-autonomous.

While there are many languages where the numbers of controller and target genders are the same, mismatches of the type that occurs in Romanian are common. Examples of several even more complex systems are given in Corbett (1991). The important point here is that the mismatches do not concern one or two odd exceptions within the category of nouns or agreeing forms; they concern substantial parts of the lexicon of the language: the diagram above represents systematic correspondences occurring across the whole lexicon in Romanian. At present, the best account of such systems, which captures the generalisations with regard to gender agreement, involves proposing a different number of genders for controllers and for targets, as has been sketched for Romanian.
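The difference between counting controller genders and counting target genders can be made concrete with a small sketch. It assumes the typical agreement endings given for the three Romanian noun classes above, with the class III singular ending written -ă; the names `ENDINGS` and `agree` are hypothetical and chosen only for this illustration.

```python
# Typical agreement endings per noun class, following the three classes listed above.
ENDINGS = {
    "I": {"SG": "", "PL": "i"},      # e.g. bărbat 'man'
    "II": {"SG": "", "PL": "e"},     # e.g. scaun 'chair'
    "III": {"SG": "ă", "PL": "e"},   # e.g. fată 'girl'
}


def agree(stem, noun_class, number):
    """Build the agreeing form of an adjective stem for a noun class and number."""
    return stem + ENDINGS[noun_class][number]


# Controller genders: distinct (singular, plural) agreement patterns across nouns.
controller_genders = {(e["SG"], e["PL"]) for e in ENDINGS.values()}

# Target genders: distinct agreement forms available within a single number.
target_genders_sg = {e["SG"] for e in ENDINGS.values()}
target_genders_pl = {e["PL"] for e in ENDINGS.values()}

# Agreement forms of bun- 'good' for each noun class.
forms = {cls: (agree("bun", cls, "SG"), agree("bun", cls, "PL")) for cls in ENDINGS}
```

Counting whole patterns gives three controller genders, while counting forms within either number gives only two target genders, which is the mismatch described for Romanian.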

How many genders does English have? Establishing whether English has gender is not problematic, but the answer depends on one's view of agreement. English predicates and noun phrase modifiers - such as adjectives, articles and demonstratives - do not mark gender. But if antecedent-anaphor relations count as agreement (see end of §2 above), English has a pronominal gender system. However, how many genders are there?

English has three forms of the singular personal pronoun (he, she and it), which distinguish masculine, feminine and neuter nouns, and two forms of the relative pronoun (who and which), which distinguish personal and nonpersonal nouns. The patterns of pronoun coreference for singular nouns give three consistent agreement patterns in English (in the plural only the distinction between personal and nonpersonal is preserved, i.e. they/who versus they/which) (Corbett 1991:180): who/he - masculine, who/she - feminine, and which/it - neuter. Payne (2006:713-714) extends these to four: who/he - personal masculine, who/she - personal feminine, which/it - nonpersonal neuter, and which/she - nonpersonal feminine (for the so-called 'boat nouns', e.g. ship; note, however, that these can be analysed as hybrid nouns that trigger different agreement forms depending partly on the type of target - see Corbett 1991:180-184). And, at the extreme end, Quirk et al. (1985:314) propose nine 'gender classes' for singular nouns in English: male (brother), female (sister), dual (doctor), common (baby), collective (family), higher male animal (bull), higher female animal (cow), lower animal (ant), and inanimate (box). This classification results from an attempt to differentiate between all possible types of nouns which have different agreement possibilities based on pronoun coreference.

However, such multiplication of noun classes is frequently unsatisfactory. Intralinguistically, it misses generalisations; and crosslinguistically, it makes similar systems appear more different than they really are (Corbett 1991:161). The most satisfying account for the given language should list consistent patterns of pronoun coreference (which often correspond to the traditional number of genders generally accepted for that language), with the additional extensions to the system identified as subgenders, overdifferentiated targets, inquorate genders, defective nouns, multiple-gender nouns, or hybrid nouns; it is also possible that the gender system of a language may be best analysed as combining two coexisting gender systems (Corbett 1991:161-188).
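The three consistent agreement patterns for singular nouns cited above from Corbett (1991:180) can be sketched as a simple lookup. The names `CONSISTENT_PATTERNS` and `classify` are hypothetical, and treating any other combination as falling outside the core system is an assumption that follows the hybrid-noun analysis mentioned above rather than Payne's four-way classification.

```python
# (relative pronoun, personal pronoun) -> gender, for singular nouns.
CONSISTENT_PATTERNS = {
    ("who", "he"): "masculine",
    ("who", "she"): "feminine",
    ("which", "it"): "neuter",
}


def classify(relative, personal):
    """Return the gender for a consistent pattern, or None for any other combination."""
    return CONSISTENT_PATTERNS.get((relative, personal))


brother = classify("who", "he")
box = classify("which", "it")

# The which/she pattern of 'boat nouns' such as ship is not one of the three
# consistent patterns, so the lookup leaves it for separate treatment.
ship = classify("which", "she")
```

Keeping the core lookup small and handling the remaining combinations separately mirrors the preference for a few consistent patterns plus labelled extensions over a multiplication of classes.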

Pronominal gender and gender agreement in Dutch. The Standard Dutch gender system is in decline: the former masculine and feminine genders have merged to form one 'common' gender, and pronominal gender is shifting from the predominantly formal system to a semantic one. The development of semantic patterns of distribution for pronouns is expected when languages lose formal exponents of gender distinctions and when, as a result, pronouns distinguish more genders than do other agreement targets (such as definite articles and adjectives in Dutch). Specifically, while in the noun phrase Dutch distinguishes common and neuter gender, the personal pronoun has distinct forms for the masculine, feminine and neuter. This leads to conflicts in agreement (due to the mismatch between pronominal and nominal gender) and interesting patterns of variation which are currently being investigated by Audring (2006; forthcoming a). She notes that in a situation of similar historical change, English reorganised its gender system in terms of natural gender (male, female, others; see the problem case above). The Dutch system, however, appears to distinguish male from female humans and countable from uncountable entities. The latter is an unusual move, as countability is a conceptual notion rarely manifested in gender systems.

Furthermore, the rules of pronoun usage in spoken Dutch make almost all nouns of the language appear to be 'hybrids' (Corbett 1991:183-184) which neither simply take the agreements of one consistent agreement pattern nor belong to two or more genders, but whose agreement form depends in part on the type of target. This is an unsatisfactory analysis, as hybrid nouns are expected to be isolated exceptions to the general rule that a noun consistently controls a particular gender value on all targets. This problem has been identified by Audring (forthcoming b). She notes that the hybridity account construes the situation entirely from the perspective of the noun. Hybridity is a nominal property, and the nouns are held responsible for all agreement values that appear on the targets. However, she argues that the Dutch data can be explained better from a pronominal perspective. The pronouns themselves have developed a semantic link between countability and gender. This has the consequence that some syntactic agreement configurations become dispreferred and are replaced by semantically motivated choices. Thus, the agreement patterns of Dutch reflect properties of pronouns rather than those of nouns.

Can gender be a feature of government? We have not found instances of gender as a feature of government. There are instances of a gender value found on elements which normally have to agree in gender with a controller, but the controller is absent. This can happen, for example, with predicate adjectives used in nominalisations (as in 'being happy'), or when predicate adjectives are complements of infinitives (i.e. they occur in uncontrolled, or arbitrarily controlled, infinitivals). An example comes from Polish, where adjectives obligatorily express gender and number. When a predicate adjective is a complement of an infinitive, it has to appear in instrumental case, singular number, and masculine/neuter gender. Masculine and neuter forms of adjectives are syncretic in the instrumental.

(1)	Jest	ważne	być	szczęśliwym.
	is(3SG)	important.N	be.INF	happy.M/N.SG.INS
cf.	*/??Jest	ważne	być	szczęśliwą.
	is(3SG)	important.N	be.INF	happy.F.SG.INS
	*/??Jest	ważne	być	szczęśliwymi.
	is(3SG)	important.N	be.INF	happy.VIR/NVIR.PL.INS
	'It is important to be happy.'

Since the adjective has to express gender and number, in a situation where there is no controller that could dictate its gender and number, it shows 'default agreement', which is typically 'third person singular neuter'. Hence, instances of gendered adjectives which have no controller are not instances of government but of default agreement.
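The fallback described here can be sketched as follows. The function name `agreement_features` and the dictionary layout are hypothetical; the default value is taken from the gloss of the Polish adjective in example (1), and case (instrumental) is left out for brevity.

```python
# Default gender and number shown by the adjective in example (1),
# where no controller is available.
DEFAULT_AGREEMENT = {"gender": "M/N", "number": "SG"}


def agreement_features(controller=None):
    """Copy gender and number from the controller, or fall back to the default."""
    if controller is None:
        # No controller: the target still has to express gender and number.
        return dict(DEFAULT_AGREEMENT)
    return {"gender": controller["gender"], "number": controller["number"]}


with_controller = agreement_features({"gender": "F", "number": "SG"})
without_controller = agreement_features()
```

In both branches the value on the adjective comes from the agreement mechanism itself; nothing in the sketch corresponds to a governor imposing a gender value.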

Key literature

Aikhenvald, Alexandra Y. 2006. Classifiers and noun classes: semantics. In: Brown, Keith (ed.) The Encyclopedia of Language and Linguistics. 2nd Edition. Oxford: Elsevier. 463-471.
Corbett, Greville G. 1991. Gender. Cambridge: CUP.
Corbett, Greville G. 2005. Number of genders. In: Haspelmath, Martin, Matthew S. Dryer, David Gil & Bernard Comrie (eds) The World Atlas of Language Structures (WALS). Oxford: Oxford University Press. 126-129.
Corbett, Greville G. 2005. Sex-based and non-sex-based gender systems. In: Haspelmath, Martin, Matthew S. Dryer, David Gil & Bernard Comrie (eds) The World Atlas of Language Structures (WALS). Oxford: Oxford University Press. 130-133.
Corbett, Greville G. 2005. Systems of gender assignment. In: Haspelmath, Martin, Matthew S. Dryer, David Gil & Bernard Comrie (eds) The World Atlas of Language Structures (WALS). Oxford: Oxford University Press. 134-137.
Corbett, Greville G. 2006. Gender, grammatical. In: Brown, Keith (ed.) The Encyclopedia of Language and Linguistics. 2nd Edition. Oxford: Elsevier. 749-756.

References

Aikhenvald, Alexandra Y. 1994. Classifiers in Tariana. Anthropological Linguistics 36:407-465.
Aikhenvald, Alexandra Y. 2000. Classifiers: a Typology of Noun Categorization Devices. Oxford: OUP.
Aikhenvald, Alexandra Y. 2006. Classifiers and noun classes: semantics. In: Brown, Keith (ed.) The Encyclopedia of Language and Linguistics. 2nd Edition. Oxford: Elsevier. 463-471.
Aikhenvald, Alexandra Y. 2007. Classifiers in multiple environments: Baniwa of Içana/Kurripako: a North Arawak perspective. International Journal of American Linguistics 73(4):474-500.
Allan, Keith. 1977. Classifiers. Language 53:284-310.
Audring, Jenny. 2006. Pronominal gender in spoken Dutch. Journal of Germanic Linguistics 18(2):85-116.
Audring, Jenny. (forthcoming a). Countability and Pronominal Gender. PhD thesis, Vrije Universiteit Amsterdam.
Audring, Jenny. (forthcoming b). A pronominal view on gender agreement. (Paper presented at Amsterdam Gender Colloquium, 15-16 September 2006, Amsterdam. Abstract available at: http://www.let.vu.nl/conference/agc/audring).
Austin, Peter. 1981. A Grammar of Diyari, South Australia. Cambridge: CUP.
Bani, E. 1987. Garka a ipika: masculine and feminine grammatical gender in Kala Lagaw Ya. Australian Journal of Linguistics 7:189-201.
Barlow, Michael. 1992. A Situated Theory of Agreement. New York: Garland.
Chumakina, Marina, Anna Kibort & Greville G. Corbett. 2007. Determining a language's feature inventory: person in Archi. In: Austin, Peter K. & Andrew Simpson (eds) Endangered Languages. [Linguistische Berichte, Sonderheft 14]. Hamburg: Helmut Buske. 143-172.
Corbett, Greville G. 1991. Gender. Cambridge: CUP.
Corbett, Greville G. 2005. Number of genders. In: Haspelmath, Martin, Matthew S. Dryer, David Gil & Bernard Comrie (eds) The World Atlas of Language Structures (WALS). Oxford: Oxford University Press. 126-129.
Corbett, Greville G. 2006. Gender, grammatical. In: Brown, Keith (ed.) The Encyclopedia of Language and Linguistics. 2nd Edition. Oxford: Elsevier. 749-756.
Cowell, M.W. 1964. A Reference Grammar of Syrian Arabic (based on the dialect of Damascus). [Arabic Series 7]. Washington, DC: Georgetown University Press.
Dixon, R.M.W. 1972. The Dyirbal Language of North Queensland. Cambridge: CUP.
Dixon, R.M.W. 1982. 'Where Have All the Adjectives Gone?' and Other Essays in Semantics and Syntax. Berlin: Mouton de Gruyter.
Hockett, Charles F. 1958. A Course in Modern Linguistics. New York: Macmillan.
Jenewari, Charles E.W. 1983. Defaka, Ijo's closest linguistic relative. In: Dihoff, Ivan R. (ed.) Current Approaches to African Linguistics I. [Publications in African Languages and Linguistics 1]. Dordrecht: Foris. 85-111.
Jones, Wendell & Paula Jones. 1991. Barasano Syntax. Publications in Linguistics 101. Dallas: Summer Institute of Linguistics and the University of Texas at Arlington.
Kibrik, Alexandr E., S. V. Kodzasov, I. P. Olovjannikova & D. S. Samedov. 1977. Opyt strukturnogo opisanija arčinskogo jazyka, I: Leksika, fonetika. (Publikacii otdelenija strukturnoj i prikladnoj lingvistiki, 11). Moscow: Izdatel'stvo Moskovskogo universiteta.
Kibrik, Alexandr E. 1977. Opyt strukturnogo opisanija arčinskogo jazyka, III: Dinamičeskaja grammatika. (Publikacii otdelenija strukturnoj i prikladnoj lingvistiki, 13). Moscow: Izdatel'stvo Moskovskogo universiteta.
Kibrik, Alexandr E., K.I. Kazenin, E.A. Ljutikova & S.G. Tatevosov (eds) 2001. Bagvalinskij jazyk: Grammatika, Teksty, Slovar'. Moscow: Nasledie.
Lindström, Eva. 2002. Topics in the Grammar of Kuot: a Non-Austronesian Language of New Ireland, Papua New Guinea. PhD thesis, University of Stockholm. Available at: http://www.ling.su.se/staff/evali/thesis/Kuot-PhD.html
Miranda, Rocky V. 1975. Indo-European gender: a study in semantic and syntactic change. Journal of Indo-European Studies 3:199-215.
Munro, Pamela. 1998. The Garífuna gender system. In: Hill, J.H., P.J. Mistry & L. Campbell (eds) The Life of the Language: Papers in Linguistics in Honor of William Bright. Berlin: Mouton de Gruyter. 443-461.
Parker, E.M. & R.J. Hayward. 1985. An Afar-English-French Dictionary (with grammatical notes in English). London: School of Oriental and African Studies, University of London.
Payne, John R. 2006. Noun phrases. In: Brown, Keith (ed.) The Encyclopedia of Language and Linguistics. 2nd Edition. Oxford: Elsevier. 712-720.
Plaster, Keith & Maria Polinsky. 2010. Features in categorization, or a new look at an old problem. In: Kibort, Anna & Greville G. Corbett (eds) Features: Perspectives on a Key Notion in Linguistics. Oxford: OUP. 109-142.
Quirk, R., S. Greenbaum, G. Leech & J. Svartvik (eds) 1985. A Comprehensive Grammar of the English Language. London: Longman.
Reid, Nicholas. 1997. Class and classifier in Ngan'gityemerri. In: Harvey, Mark & Nicholas Reid (eds) Nominal Classification in Aboriginal Australia. [Studies in Language Companion Series 37]. Amsterdam: John Benjamins. 165-228.
Seifart, Frank. 2004. Nominal classification in Miraña, a Witotoan language of Colombia. Sprachtypologie und Universalienforschung 57:228-246.
Strom, Clay. 1992. Retuarã Syntax. [Studies in the Languages of Colombia 3]. Dallas: Summer Institute of Linguistics and the University of Texas at Arlington.
Taylor, D. 1977. Languages of the West Indies. Baltimore: Johns Hopkins University Press.
van Staden, Miriam. 2000. Tidore: a Linguistic Description of a Language of the North Moluccas. PhD thesis. Leiden University.
Xajdakov, S.M. 1963. Principy raspredelenija imen suščestvitel'nyx po grammatičeskim klassam v lakskom jazyke. Studia Caucasica 1:48-55.
Zaliznjak, A.A. 1964. K voprosu o grammatičeskix kategorijax roda i oduševlennosti v sovremennom russkom jazyke. Voprosy jazykoznanija 4:25-40.
Zaręba, Alfred. 1984-85. Osobliwa zmiana rodzaju naturalnego w dialektach polskich. Zbornik Matice Srpske za Filologiju i Lingvistiku 17-18:243-247.

How to cite:
Kibort, Anna & Greville G. Corbett. 2008. Grammatical Features Inventory: Gender. University of Surrey. http://dx.doi.org/10.15126/SMG.18/1.01
