Entry | Agglutination |
Definition |
Agglutination is a linguistic process pertaining to derivational morphology in which complex words are formed by stringing together morphemes without changing them in spelling or phonetics. Languages that use agglutination widely are called agglutinative languages. One example is Turkish, where the word evlerinizden, "from your houses", consists of the morphemes ev-ler-iniz-den, meaning house-plural-your-from.

Agglutinative languages are often contrasted both with isolating languages, in which syntactic structure is expressed solely by word order and auxiliary words, and with inflectional (fusional) languages, in which a single affix typically expresses several syntactic categories and a single category may be expressed by several different affixes. However, both fusional and isolating languages may use agglutination in their most frequently used constructs, and may use it heavily in certain contexts, such as word derivation. This is the case in English, which has an agglutinated plural marker -(e)s and derived words such as shame·less·ness.

Agglutinative suffixes are often inserted irrespective of syllabic boundaries, for example by adding a consonant to the syllable coda, as in English tie – ties. Agglutinative languages also have large inventories of enclitics, which native speakers can and do separate from the word root in daily usage.

The term agglutination is sometimes used more generally to refer to the morphological process of adding suffixes or other morphemes to the base of a word. This is treated in more detail in the section on other uses of the term.

Examples of agglutinative languages
Main article: Agglutinative language
Although agglutination is characteristic of certain language families, the fact that several languages in a certain geographic area are all agglutinative does not mean that they are necessarily related phylogenetically.
In particular, such a conclusion formerly led linguists to propose the so-called Ural–Altaic language family, which would (in the largest scope ever proposed) include the Uralic and Turkic languages as well as Mongolian, Korean, Tamil and Japanese. However, contemporary linguistics views this proposal as controversial.[1]

On the other hand, some languages that developed from agglutinative proto-languages have lost this feature. For example, contemporary Estonian, which is so closely related to Finnish that the two languages are mutually intelligible,[2] has shifted towards the fusional type.[3] (It has also lost other features typical of the Uralic languages, such as vowel harmony.)

Eurasia
Examples of agglutinative languages include the Uralic languages, such as Finnish, Estonian, and Hungarian. These have highly agglutinated expressions in daily usage, and most words are bisyllabic or longer. Grammatical information expressed by adpositions in Western Indo-European languages is typically found in suffixes.

Hungarian uses agglutination extensively in almost every part of the language. The suffixes follow each other in a fixed order determined by their role, and many can be stacked, resulting in words that convey complex meanings in a very compact form. An example is fiaiéi, where the root "fi-" means "son", the subsequent four vowels are all separate suffixes, and the whole word means "[plural properties] of his/her sons". The nested possessive structure and expression of plurals is quite remarkable (note that Hungarian has no grammatical gender).

Almost all Austronesian languages, such as Malay, and most Philippine languages also belong to this category, which enables them to form new words from simple base forms. The Indonesian and Malay word mempertanggungjawabkan is formed by adding active-voice, causative and transitive affixes to the compound verb tanggung jawab, which means "to account for".
In Tagalog (and its standardised register, Filipino), nakakapágpabagabag ("that which is upsetting/disturbing") is formed from the root bagabag ("upsetting" or "disquieting").

Japanese, along with Korean, is also an agglutinating language, adding information such as negation, passive voice, past tense, honorific degree and causality in the verb form. Common examples are hatarakaseraretara (働かせられたら), which combines causative, passive or potential, and conditional conjugations to arrive at two meanings depending on context, "if (subject) had been made to work..." and "if (subject) could make (object) work", and tabetakunakatta (食べたくなかった), which combines desire, negation, and past tense conjugations to mean "I/he/she/they did not want to eat".
Turkish, along with all other Turkic languages, is another agglutinating language. As an extreme example, the expression Muvaffakiyetsizleştiriciveremeyebileceklerimizdenmişsinizcesine is pronounced as one word in Turkish, but it can be translated into English as "as if you were of those we would not be able to turn into a maker of unsuccessful ones" (the "-siniz" marks the plural form of "you", with "-sin" being the singular form, in the same way that "-im" marks "I" and "-imiz" marks "we").

All Dravidian languages, including Kannada, Telugu, Malayalam and Tamil, are agglutinative. Agglutination is used to a very high degree in both the conversational and the standardised written form of Telugu.

Agglutination is also a notable feature of Basque. The conjugation of verbs, for example, is done by adding different prefixes or suffixes to the root of the verb: dakartzat, which means 'I bring them', is formed from da (indicating present tense), kar (the root of the verb ekarri, 'bring'), tza (indicating plural) and t (indicating the subject, in this case "I"). Another example is declension: etxean = "in the house", where etxe = "house".

Americas
Agglutination is used very heavily in most Native American languages, such as the Inuit languages, Nahuatl, Mapudungun, Quechua, Tz'utujil, Kaqchikel, Cha'palaachi and K'iche, where one word can contain enough morphemes to convey the meaning of what would be a complex sentence in other languages. Conversely, Navajo contains affixes for some uses, but overlays them in such unpredictable and inseparable ways that it is often referred to as a fusional language.[citation needed]

Constructed
Esperanto is a constructed auxiliary language with highly regular grammar and agglutinative word morphology. See Esperanto vocabulary.
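The morpheme-by-morpheme analyses quoted in the examples above (Turkish ev-ler-iniz-den and Basque da-kar-tza-t) can be mirrored in a few lines of Python. This is only an illustrative sketch: the names `EXAMPLES`, `glue` and `gloss_line` are invented here, the segmentations and glosses are the ones given in the text, and nothing in the code models phonological adjustments.

```python
# Illustrative sketch: each word is stored as the list of (morph, gloss)
# pairs quoted in the article. The names used here are hypothetical.
EXAMPLES = {
    "evlerinizden": [("ev", "house"), ("ler", "plural"),
                     ("iniz", "your"), ("den", "from")],
    "dakartzat": [("da", "present"), ("kar", "bring"),
                  ("tza", "plural"), ("t", "I")],
}


def glue(morphemes):
    """Concatenate the morphs unchanged, as pure agglutination would."""
    return "".join(form for form, _ in morphemes)


def gloss_line(morphemes):
    """Return the hyphenated segmentation and the matching gloss string."""
    forms = "-".join(form for form, _ in morphemes)
    glosses = "-".join(gloss for _, gloss in morphemes)
    return forms, glosses


for word, morphemes in EXAMPLES.items():
    # For these examples, gluing the morphs gives back the surface word.
    assert glue(morphemes) == word
    print(*gloss_line(morphemes), sep="  =  ")
```

The check inside the loop holds only because these particular words are built without any change to their morphemes; words affected by processes such as vowel harmony would need an extra phonological step.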
Fictional
Newspeak is a fictional language in 1984 based on the sole goal of agglutination, as expressed by the character Syme: "Every concept that can ever be needed, will be expressed by exactly one word".[4] For instance, from the root word "good" we can form words such as goodly (does well), plusgood (very good), doubleplusgood (extremely good), and ungood (bad). Words with comparative and superlative meanings are also simplified, so "better" becomes "gooder", and "best" becomes "goodest".[5]

Slots
As noted above, it is a typical feature of agglutinative languages that there is a one-to-one correspondence between suffixes and syntactic categories. For example, a noun may have separate markers for number, case, possessive or conjunctive usage, etc. The order of these affixes is fixed;[6] so we may view any given noun or verb as a stem followed by several inflectional "slots", i.e. positions in which inflectional suffixes may occur. It is often the case that the most common instance of a given grammatical category is unmarked, i.e. the corresponding affix is empty.

The number of slots for a given part of speech can be surprisingly high. For example, a finite Korean verb has seven slots (the inner round brackets indicate parts of morphemes which may be omitted in some phonological environments):[7]
Moreover, passive and causative verbal forms can be derived by adding suffixes to the base, which could be seen as the zeroth slot; however, passives are not as commonly used as in English, and many verbs do not allow passivization at all. Even though some combinations of suffixes are not possible (e.g. only one of the aspect slots may be filled with a non-empty suffix), over 400 verb forms may be formed from a single base. Here are a few examples formed from the word root ga 'to go'; the numbers indicate which slots contain non-empty suffixes:
Suffixing or prefixing
Although most agglutinative languages in Europe and Asia are predominantly suffixing, the Bantu languages of southern Africa are known for a highly complex mixture of prefixes, suffixes and reduplication. A typical feature of this language family is that nouns fall into noun classes. Each noun class has specific singular and plural prefixes, which also serve as markers of agreement between the subject and the verb. Moreover, the noun determines the prefixes of all words that modify it, and the subject determines the prefixes of other elements in the same verb phrase.

For example, the Swahili nouns -toto ("child") and -tu ("person") fall into class 1, with singular prefix m- and plural prefix wa-. The noun -tabu ("book") falls into class 7, with singular prefix ki- and plural prefix vi-.[8] The following sentences may be formed: 'That one tall person who read that long book.'

In the context of quantitative linguistics
As mentioned above, most languages include inflectional, agglutinative and isolating constructions side by side. In his 1960 paper, the American linguist Joseph Harold Greenberg proposed using the so-called agglutinative index to calculate a numerical value that would allow a researcher to compare the "degree of agglutinativeness" of various languages.[9] For Greenberg, agglutination means that the morphs are joined with only slight or no modification.[10] A morpheme is said to be automatic if it either takes a single surface form (morph), or if its surface form is determined by phonological rules that hold in all similar instances in that language.[11] A morph juncture, a position in a word where two morphs meet, is considered agglutinative when both morphemes involved are automatic. The index of agglutination is equal to the average ratio of the number of agglutinative junctures to the number of morph junctures.
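The definition of the index of agglutination given above can be made concrete with a short Python sketch. Several things here are assumptions rather than part of Greenberg's method as described: the function name `agglutinative_index`, the choice to pool all junctures of the sample before dividing (one possible reading of "average ratio"), and the automatic/non-automatic flags in the toy sample, which are invented purely for illustration.

```python
from typing import List, Tuple

# A word is a list of (morph, is_automatic) pairs. The flags used in
# SAMPLE are illustrative assumptions, not values from Greenberg's paper.
Word = List[Tuple[str, bool]]


def agglutinative_index(words: List[Word]) -> float:
    """Share of morph junctures whose two neighbouring morphemes are both automatic."""
    total = 0
    agglutinative = 0
    for word in words:
        # Every pair of adjacent morphs forms one morph juncture.
        for (_, left_auto), (_, right_auto) in zip(word, word[1:]):
            total += 1
            if left_auto and right_auto:
                agglutinative += 1
    if total == 0:
        raise ValueError("the sample contains no morph junctures")
    return agglutinative / total


SAMPLE: List[Word] = [
    [("ev", True), ("ler", True), ("iniz", True), ("den", True)],
    [("tie", True), ("s", False)],
]

print(agglutinative_index(SAMPLE))  # 3 of the 4 junctures count: 0.75
```

Single-morph words contribute no junctures, so a sample made up only of such words has no defined value; the sketch raises an error in that case instead of returning an arbitrary number.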
Languages with high values of the agglutinative index are agglutinative, and those with low values are fusional.

In the same paper, Greenberg proposed several other indices, many of which turn out to be relevant to the study of agglutination. The synthetic index is the average number of morphemes per word, with the lowest conceivable value equal to 1 for isolating (analytic) languages and real-life values rarely exceeding 3. The compounding index is equal to the average number of root morphemes per word (as opposed to derivational and inflectional morphemes). The derivational, inflectional, prefixial and suffixial indices correspond respectively to the average number of derivational morphemes, inflectional morphemes, prefixes and suffixes per word. Here is a table of sample values:[12]
Phonetics and agglutination
The one-to-one relationship between an affix and its grammatical function may be somewhat complicated by the phonological processes active in the given language. For example, the following two phonological phenomena appear in many of the Uralic and Turkic languages:
Several examples from Finnish will illustrate how these two rules and other phonological processes lead to deviations from the basic one-to-one relationship between morphs and their syntactic and semantic function. No phonological rule is applied in the declension of talo 'house'. However, the second example illustrates several kinds of phonological phenomena.[13][14]
Extremes
It is possible to construct artificially extreme examples of agglutination, which have no real use but illustrate the theoretical capability of the grammar to agglutinate. This is not a question of "long words", because some languages permit limitless combinations with compound words, negative clitics or the like, which can be (and are) expressed with an analytic structure in actual usage.

English is capable of agglutinating morphemes of solely Germanic origin, as in un-whole-some-ness, but generally speaking the longest words are assembled from forms of Latin or Ancient Greek origin. The classic example is antidisestablishmentarianism.

Agglutinative languages often have more complex derivational agglutination than isolating languages, so they can do the same to a much larger extent. For example, in Hungarian, a word such as elnemzetietleníthetetlenségnek, which means "for [the purposes of] undenationalizationability", can find actual use.[15] Similarly, there are words that have a meaning but are probably never used, such as legeslegmegszentségteleníttethetetlenebbjeitekként, which means "like the most of most undesecratable ones of you" but is hard to decipher even for native speakers.

Using inflectional agglutination, these can be extended. For example, the official Guinness world record is Finnish epäjärjestelmällistyttämättömyydellänsäkäänköhän, "I wonder if – even with his/her quality of not having been made unsystematized". It has the derived word epäjärjestelmällistyttämättömyys as the root and is lengthened with the inflectional endings -llänsäkäänköhän. However, this word is grammatically unusual, because -kään "also" is used only in negative clauses, but -kö (question) only in question clauses.
A very popular Turkish agglutination is Çekoslovakyalılaştıramadıklarımızdanmışsınız, meaning "You are one of those that we were not able to convert into Czechoslovakians" (Çek is Czech, but the longer term is used). This historical reference is used as a joke about individuals who are hard to change or who stick out in a group.

On the other hand, Afyonkarahisarlılaştırabildiklerimizdenmişsinizcesine is a longer word that does not surprise people and means "As if you were one of those we were able to make resemble people from Afyonkarahisar". Afyonkarahisar is a town name, but it is an agglutination itself, Afyon being opium, kara being black, and hisar being an old-style fortress. When the Selçuks claimed the city, it was named Karakale, meaning black fortress, because of the fortress surrounding the city; the name was later changed to Karahisar, which was a fairly common town name. After the Republic of Turkey was founded, the town began producing most of the country's poppies, from which opium is derived, and the word Afyon was added to distinguish it from other Karahisars.

A recent addition to the claims came with the introduction of the Turkish word muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmişsinizcesine, which means something like "(you are talking) as if you are one of those that we were unable to turn into a maker of unsuccessful people" (someone who un-educates people to make them unsuccessful).

Georgian is also a highly agglutinative language. For example, the word gadmosakontrrevolucieleblebisnairebisatvisaco (გადმოსაკონტრრევოლუციელებლებისნაირებისათვისაცო) would mean "(someone not specified) said that it is also for those who are like the ones who need to be again/back counter-revolutionized".
Aristophanes' comedy Assemblywomen includes the Greek word λοπαδοτεμαχοσελαχογαλεοκρανιολειψανοδριμυποτριμματοσιλφιοκαραβομελιτοκατακεχυμενοκιχλεπικοσσυφοφαττοπεριστεραλεκτρυονοπτοκεφαλλιοκιγκλοπελειολαγῳοσιραιοβαφητραγανοπτερύγων, a fictional dish named with a word that enumerates its ingredients. It was created to ridicule a trend for long compounds in Attic Greek at the time.[citation needed]

Slavic languages are considered fusional rather than agglutinative. However, extreme derivations similar to those found in typical agglutinative languages do exist. A famous example is the Bulgarian word непротивоконституциослователствувайте, meaning "don't speak against the constitution" and secondarily "don't act against the constitution". It is composed of just three roots: против ("against"), конституция ("constitution", a loan word and therefore devoid of its internal composition) and слово ("word"). The remaining elements are bound morphemes for negation (не, a proclitic, otherwise written separately in verbs), noun intensifier (-ателств), noun-to-verb conversion (-ува) and the imperative mood second person plural ending (-йте). The word is rather unusual but finds some usage, e.g. in newspaper headlines on 13 July 1991, the day after the current Bulgarian constitution was adopted amid much controversy, debate and even scandal.

Other uses of the words agglutination and agglutinative
The words agglutination and agglutinative come from the Latin word agglutinare, 'to glue together'.
In linguistics, these words have been in use since 1836, when Wilhelm von Humboldt's posthumously published work Über die Verschiedenheit des menschlichen Sprachbaues und ihren Einfluß auf die geistige Entwicklung des Menschengeschlechts [lit.: On the differences of human language construction and its influence on the mental development of mankind] introduced the division of languages into isolating, inflectional, agglutinative and incorporating.[16]

Especially in some older literature, agglutinative is sometimes used as a synonym for synthetic. In that case, it embraces what we call agglutinative and inflectional languages, and it is an antonym of analytic or isolating. Besides the clear etymological motivation (after all, inflectional endings are also "glued" to the stems), this more general usage is justified by the fact that the distinction between agglutinative and inflectional languages is not a sharp one, as we have already seen.

In the second half of the 19th century, many linguists believed that there is a natural cycle of language evolution: function words of the isolating type are glued to their head-words, so that the language becomes agglutinative; later, morphs become merged through phonological processes, and what comes out is an inflectional language; finally, inflectional endings are often dropped in quick speech, inflection is omitted and the language goes back to the isolating type.[17]

The following passage from Lord (1960) demonstrates well the whole range of meanings that the word agglutination may have.

(Agglutination...) consists of the welding together of two or more terms constantly occurring as a syntagmatic group into a single unit, which becomes either difficult or impossible to analyse thereafter.[18]

Agglutinative languages in natural language processing
In natural language processing, languages with rich morphology pose problems of quite a different kind than isolating languages.
In the case of agglutinative languages, the main obstacle lies in the large number of word forms that can be obtained from a single root. As we have already seen, the generation of these word forms is somewhat complicated by the phonological processes of the particular language. Although the basic one-to-one relationship between form and syntactic function is not broken in Finnish, the authoritative institution Kotimaisten kielten tutkimuskeskus (KOTUS, i.e. the Institute for the Languages of Finland) lists 51 declension types for Finnish nouns, adjectives, pronouns, and numerals.

Even more problems occur with the recognition of word forms. Modern linguistic methods are largely based on the exploitation of corpora; however, when the number of possible word forms is large, any corpus will necessarily contain only a small fraction of them. Hajič (2010) claims that computer space and power are so cheap nowadays that all possible word forms may be generated beforehand and stored in the form of a lexicon listing all possible interpretations of any given word form. (The data structure of the lexicon has to be optimized so that the search is quick and efficient.) According to Hajič, it is the disambiguation of these word forms which is difficult (more so for inflective languages, where the ambiguity is high, than for agglutinative languages).[19]

Other authors do not share Hajič's view that space is no issue. Instead of listing all possible word forms in a lexicon, they implement word form analysis with modules that try to break up the surface form into a sequence of morphemes occurring in an order permissible in the language. The problem with such an analysis is the large number of morpheme boundaries typical of agglutinative languages. A word of an inflectional language has only one ending, and therefore the number of possible divisions of a word into the base and the ending is only linear in the length of the word.
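The second approach described above, breaking a surface form into a stem followed by suffixes in a permissible order, might be sketched as a small recursive analyser. Everything in this sketch is a hypothetical toy: the lexicon (`STEMS`, `SUFFIX_SLOTS`) contains only the morphemes of the Turkish example evlerinizden quoted earlier in the article, the slot order is assumed from that single word, and phonological alternations are ignored entirely. It is not the method of any of the cited systems.

```python
from typing import Dict, List

# Hypothetical toy lexicon: one stem and three ordered suffix slots.
STEMS = {"ev"}
SUFFIX_SLOTS: List[Dict[str, str]] = [
    {"ler": "plural"},
    {"iniz": "your"},
    {"den": "from"},
]


def _suffixes(rest: str, slot: int) -> List[List[str]]:
    """Analyse `rest` as suffixes drawn from slots `slot`, `slot + 1`, ... in order."""
    if not rest:
        return [[]]      # everything consumed: one valid (empty) continuation
    if slot == len(SUFFIX_SLOTS):
        return []        # material left over but no slots remain
    results = _suffixes(rest, slot + 1)      # option 1: leave this slot empty
    for form in SUFFIX_SLOTS[slot]:          # option 2: fill this slot
        if rest.startswith(form):
            for tail in _suffixes(rest[len(form):], slot + 1):
                results.append([form] + tail)
    return results


def segment(word: str) -> List[List[str]]:
    """Return every analysis of `word` as a known stem plus suffixes in slot order."""
    analyses = []
    for cut in range(1, len(word) + 1):      # try every stem/suffix boundary
        stem, rest = word[:cut], word[cut:]
        if stem in STEMS:
            for tail in _suffixes(rest, 0):
                analyses.append([stem] + tail)
    return analyses


print(segment("evlerinizden"))
print(segment("evdenler"))
```

With this toy lexicon the first call yields the single analysis ev + ler + iniz + den, while the second yields no analysis because the suffixes appear in an order the slots do not allow. A realistic analyser would also have to undo phonological changes before looking up each morpheme.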
In an agglutinative language, where several suffixes are concatenated at the end of the word, the number of different divisions which have to be checked for consistency is large. This approach was used, for example, in the development of a system for Arabic, where agglutination occurs when articles, prepositions and conjunctions are joined with the following word and pronouns are joined with the preceding word. See Grefenstette et al. (2005) for more details.

See also
Notes
1. ^Bernard Comrie: "Introduction", pp. 7 and 9 in Comrie (1990). For instance, the Turkic language family is a well-established language family, as is each of the Uralic, Mongolian and Tungusic families. What is controversial, however, is whether or not these individual families are related as members of an even larger family. The possibility of an Altaic family, comprising Turkic, Mongolian, and Tungusic, is rather widely accepted, and some scholars would advocate increasing the size of this family by adding some or all of Uralic, Korean and Japanese. For instance, the study of word order universals by Greenberg ("Some Universals of Grammar with Particular Reference to the Order of meaningful Elements", in J. H. Greenberg (ed.): Universals of language, MIT Press, Cambridge, Mass, 1963, pp. 73–112) showed that if a language has verb-final word order (i.e. if 'the man saw the woman' is expressed literally as 'the man the woman saw'), then it is highly probable that it will also have postpositions rather than prepositions (i.e. 'in the house' will be expressed as 'the house in') and that it will have genitives before the noun (i.e. the pattern 'cat's house' rather than 'house of cat'). Thus, if we find two languages that happen to share the features: verb-final word order, postpositions, prenominal genitives, then the co-occurrence of these features is not evidence for genetic relatedness. Many earlier attempts at establishing wide-ranging genetic relationships suffer precisely from failure to take this property of typological patterns into account. Thus the fact that Turkic languages, Mongolian languages, Tungusic languages, Korean and Japanese share all of these features is not evidence for their genetic relatedness (although there may, of course, be other similarities, not connected with recurrent typological patterns, that do establish genetic relatedness).
2. ^Personal communication with Matti Palomäki, around 2001. See also a discussion on UniLang.
3. ^Lehečková (1983), p. 17: The inflectional type is most prominently represented in Estonian. It manifests itself in agreement, a lack of possessive suffixes, greater homonymy and synonymy, and so many alternations that one can speak of different declensions. The endings are mostly phonologically reduced, so that they lose their syllabic independence.
4. ^Orwell, George (1949). 1984. New York: Harcourt.
5. ^Orwell, George (1949). Nineteen Eighty-Four, "Appendix: The Principles of Newspeak", pp. 309–323. New York: Plume, 2003. Pynchon, Thomas (2003). "Foreword to the Centennial Edition" to Nineteen Eighty-Four, pp. vii–xxvi. New York: Plume, 2003. Fromm, Erich (1961). "Afterword" to Nineteen Eighty-Four, pp. 324–337. New York: Plume, 2003. Orwell's text has a "Selected Bibliography", pp. 338–9; the foreword and the afterword each contain further references. Copyright is explicitly extended to digital and any other means. The Plume edition is a reprint of a hardcover by Harcourt. The Plume edition is also in a Signet edition.
6. ^There may exist exceptions in a language requiring some affixes to go in an unexpected slot.
7. ^Nam-Kil Kim: Korean, pp. 890–897 in Comrie (1990).
8. ^The first twelve examples are taken from Fromkin et al. (2007), p. 110, with the following adjustments: sentences originally in the present perfect tense (with marker -me-) were changed to the past simple tense (-li); the subject of the last four sentences was also changed from -kapu 'basket' to tabu 'book', which falls into the same class. The final two examples are taken from Benji Wald: Swahili and the Bantu Languages, p. 1002 in Comrie (1990). For the class 7 prefixes, see the Mwana Simba (archived May 4, 2011, at https://web.archive.org/web/20110504212426/http://mwanasimba.online.fr/), Chapter 16 (archived March 26, 2011, at https://web.archive.org/web/20110326150007/http://mwanasimba.online.fr/E_Chap16.htm).
For the past tense, see Chapter 32 (archived April 7, 2011, at https://web.archive.org/web/20110407183114/http://mwanasimba.online.fr/E_Chap32.htm) and the verb generator (archived July 21, 2011, at https://web.archive.org/web/20110721015215/http://mwanasimba.online.fr/E_verb.htm).
9. ^A quantitative approach to the morphological typology of language (https://www.jstor.org/pss/1264155).
10. ^Denning et al. (1990), page 12 (https://books.google.com/books?id=_B2BVl2JpT0C&pg=PA13&lpg=PA13&dq=greenberg+agglutinative+index&source=bl&ots=bfkZTt0ttn&sig=JVbXdquj5L_ZQHJiei4kqWHwKKw&hl=cs&ei=PU0iTeeFKJS38QP78bzHBQ&sa=X&oi=book_result&ct=result&resnum=3&ved=0CCgQ6AEwAg#v=onepage&q&f=false).
11. ^Surprisingly, Greenberg does not consider the English plural morpheme -s to be automatic. Indeed, the alternation between the phonetic realizations -s, -z and -ez is automatic, but there are other, although rare, cases when the plural morpheme is -en, -∅ etc. See Denning et al. (1990), page 20 (https://books.google.com/books?id=_B2BVl2JpT0C&pg=PA13&lpg=PA13&dq=greenberg+agglutinative+index&source=bl&ots=bfkZTt0ttn&sig=JVbXdquj5L_ZQHJiei4kqWHwKKw&hl=cs&ei=PU0iTeeFKJS38QP78bzHBQ&sa=X&oi=book_result&ct=result&resnum=3&ved=0CCgQ6AEwAg#v=onepage&q&f=true).
12. ^Greenberg calculated the indices only from a single passage of 100 words for each language. The values in the table are taken from Luschützky (2003), p. 43; they are compiled from Greenberg (1954) and from Warren Crawford Cowgill: A Search for Universals in Indo-European Diachronic Morphology, Universals of Language, MIT Press, Cambridge (Massachusetts), 1963, pp. 91–113.
13. ^The examples may be checked with the Finnish morphological analyser.
14. ^Note that there is no article in Finnish, so the use of a/the in English translations is arbitrary.
15. ^Used for example in the book of Dr. József Végváry: „És mégsem mozog ...”
16. ^The division is attributed to Humboldt in Luschützky (2003), p. 17.
The dating comes from Michael Losonsky (ed): Wilhelm von Humboldt: on language, p. xxxvi (https://books.google.com/books?id=_UODbGlD4WUC&pg=PR22&dq=Wilhelm+von+Humboldt+1836+agglutination&hl=cs&ei=4JUjTYGAMIbj4Ab8iLWGAg&sa=X&oi=book_result&ct=result&resnum=1&ved=0CCUQ6AEwAA#v=onepage&q&f=false; available through Google Books).
17. ^Vendryes (1925), p. 349, already mentions this hypothesis as outdated, stating the more contemporary view that all three kinds of processes are present at the same time. According to Vendryes, proponents of this hypothesis would include A. Hovelacque: La linguistique, Paris 1888; F. Misteli: Charakteristik der hauptsächlichsten Typen des Sprachbaus, Berlin 1893; and finally A. H. Sayce: Introduction to the Science of Language, 2 Vols., 3rd edition, London 1890. Compare also Lehečková (2003), pp. 18–19, a passage which is much closer to the original concept of separate stages.
18. ^Lord (1960), p. 160.
19. ^Hajič (2010), Abstract: However, it is not the morphology itself (not even for inflective or agglutinative languages) that is causing the headache – with today’s cheap space and power, simply listing all the thinkable forms in an appropriately hashed list is o.k. – but it’s the disambiguation problem, which is apparently more difficult for such morphologically rich languages (perhaps surprisingly more for the inflective ones than agglutinative ones) than for the analytical ones.
References
Bibliography
External links
1 : Linguistic morphology |