Limbu Corpus
Limbu: a brief note
The Limbu inhabit Nepal east of the Arun river and the bordering areas of the Indian states of West Bengal (Darjeeling District) and Sikkim. They are called "Limbu" in Nepali and "Yakthung" in Limbu. There are some 200,000 speakers of the language, virtually all of them bilingual in Nepali, the Indo-Aryan national language of Nepal.
The language
The Limbu language belongs to the Tibeto-Burman family. It was classified by Shafer (1955) in the East Himalayish section of the Bodic Division of Sino-Tibetan. Limbu is the easternmost member of this group, also known as "Kiranti". There are several Limbu dialects, but speakers generally consider them to be mutually intelligible with a bit of effort.
The following works on Limbu may be found useful:
- Michailovsky, Boyd, 2002, Limbu-English Dictionary of the Mewa Khola Dialect with English-Limbu index, Kathmandu, Mandala Book Point. (Dictionary of the Mewa Khola dialect.) A computerized version of this dictionary is available online (see below). The introduction to the dictionary, containing basic phonological and morphological information on the Mewa Khola dialect, is available in pdf format here.
- Royal Nepal Academy, 2002, Limbu-Nepali-English Dictionary edited by Bairagi Kainla [Til Bikram Nembang]. (A much larger, multi-dialectal dictionary based mainly on the Panchthar dialect, and including literary Limbu. Entries are in the Srijonga, Devanagari, and roman scripts. Nepali-Limbu index.)
- Van Driem, George, 1987. A grammar of Limbu. Berlin. Mouton de Gruyter. (Description of the Phedappe dialect.)
- Weidert, A. and B. Subba [D. Bikram Ingwaba], 1985, Concise Limbu Grammar and Dictionary. Amsterdam. Lobster. (Panchthar dialect.)
The online dictionary
The online version of the Limbu-English dictionary (Michailovsky 2002) can be browsed here. Many of the example sentences in the dictionary are taken from the archived texts (below). The introduction to the dictionary, containing phonological and morphological information on the Mewa Khola dialect, is available in pdf format here.
The archived texts
The archived documents were originally recorded and transcribed in 1977 by B. Michailovsky and Martine Mazaudon in the village of Libang, in the valley of the Mewa Khola in Taplejung district. They are mainly autobiographical narratives.
All texts have free, sentence-level English translations. The texts "Untimely death", "Ogre Kanayongba", "Manioc", "Father-in-law", "Wife-stealing", and "Paddy-dancing" also have English morpheme glosses. French translation and glosses are provided for "Untimely death". Sentence-level French translation is provided for "Father-in-law".
About the transcription
The transcription is IPA-based, except that "y" is used for the palatal glide (IPA [j]).
Loanwords from Nepali (or from English, etc. via Nepali) are transcribed roughly as they are pronounced in the recordings. The phoneme "ə" appears only in Nepali loanwords.
Two different Limbu transcriptions are provided:
- The sentence-level transcription is close to the pronunciation and is punctuated for easier understanding. In this transcription, the form of individual morphemes may vary according to the context within the word, in accord with the pronunciation. The transcription does not take into account phonological context across word boundaries (indicated by space). This is why no native word is transcribed with a voiced stop initial (b, d, g, dz) although word-initial stops (p, t, k, ts) may be voiced in context.
- In the morpheme-level transcription, most Limbu morphemes always appear in the same form, ignoring regular, phonologically conditioned variation. This transcription facilitates word-searching. Remarks: finite verb forms are analysed as prefix-stem-suffix; the agreement and negative-marking prefixes and suffixes are not analysed into individual morphemes. Frequent grammatical words are not always analysed into their component morphemes; etymologies can be found in the dictionary.
Apostrophe in the transcription indicates an elided vowel. In some cases these are hypothesized on the theory that Limbu has the syllable structure (C)V(C) with no initial consonant clusters. The apostrophe does not indicate a syllable boundary.
Morpheme-level translations (where available) are displayed as interlinear glosses, visually aligned word-by-word with the morpheme-level transcription. In the morpheme-level transcription, morphemes are separated by hyphens, and their glosses are separated by corresponding hyphens in the morpheme-level translation. Periods are used as separators in multi-word glosses which correspond to single morphemes or to verb prefixes or suffixes transcribed as single units (see above).
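The hyphen and period conventions described above can be illustrated with a minimal Python sketch. It is not part of the corpus tooling: the function name `align_glosses` and the sample strings are hypothetical placeholders (not real Limbu data), and the sketch assumes that the only hyphens in a gloss that are not morpheme separators are those belonging to the "->" arrow.

```python
import re

def align_glosses(transcription, glosses):
    """Pair each morpheme with its gloss, word by word.

    Words are separated by spaces, morphemes by hyphens. A hyphen that is
    part of the "->" arrow is not treated as a separator. Periods split a
    multi-word gloss that corresponds to a single morpheme.
    """
    words = transcription.split()
    word_glosses = glosses.split()
    if len(words) != len(word_glosses):
        raise ValueError("transcription and gloss lines differ in word count")

    aligned = []
    for word, gloss in zip(words, word_glosses):
        morphemes = word.split("-")
        # Split on hyphens, except the one that begins the "->" arrow.
        parts = re.split(r"-(?!>)", gloss)
        if len(morphemes) != len(parts):
            raise ValueError(f"morpheme count mismatch in word {word!r}")
        aligned.append([(m, p.split(".")) for m, p in zip(morphemes, parts)])
    return aligned


# Placeholder strings for illustration only (not taken from the texts).
for word in align_glosses("pfx-stem-sfx word", "NEG-carry.away-1SG->3 house"):
    print(word)
```

Raising an error on a count mismatch, rather than silently truncating, makes it easier to spot lines where the transcription and the gloss have drifted out of alignment.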
In the morpheme glosses, where a verb prefix or suffix indexes both actor and undergoer (subject and object), the actor appears before the arrow ("->") and the undergoer after it. The notations "S1" and "S2" in verb stem glosses indicate present and past stems, respectively. Since prefixes and suffixes are glossed independently, and only audible morphemes are transcribed and glossed, information that can be inferred from the combination of prefix and suffix, or from the absence of a prefix or suffix, is not necessarily reflected in the gloss.
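As a small illustration of the arrow convention, the following Python sketch separates the actor from the undergoer in an agreement gloss. The function name `split_agreement` and the sample gloss "1SG->3" are hypothetical and chosen only for illustration; the sketch assumes at most one arrow per gloss.

```python
def split_agreement(gloss):
    """Return (actor, undergoer) for a gloss of the form 'A->U'.

    Returns None when the gloss contains no arrow, i.e. when the affix
    does not index both actor and undergoer.
    """
    actor, arrow, undergoer = gloss.partition("->")
    if not arrow:
        return None
    return actor, undergoer


print(split_agreement("1SG->3"))
print(split_agreement("PA"))
```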
The text documents (XML-coded documents available through the metadata interface) contain some information that is not displayed by the current interface, including form-class information (identification of verb stems and of prefix and suffix strings) and standardized transliterations of Nepali loan words.
The following abbreviations are used in morpheme glosses:
| Gloss | Meaning |
| --- | --- |
| 1, 2, 3 | 1st person, etc. |
| 12 | 1st or 2nd person |
| 23 | 2nd or 3rd person |
| -> | [separates agent from object in glosses of agreement strings] |
| ABL | ablative |
| ACT | active participle/agent nominal |
| ADJ | adjectival marker |
| ADV | adverbial marker |
| CJ | conjunctive subordinator |
| CLF | classifier |
| COMP | complementizer |
| CONCESS | concessive |
| COP | copula (with predicate nominal) |
| CTR | counter-expectancy focus particle |
| DEF | definite |
| DU | dual |
| EMPH | emphatic particle |
| ERG | ergative |
| EVID | evidential |
| EX | exclusive |
| EXCL | exclamation |
| EXPR | expressive, empathetic particle |
| F | feminine |
| GEN | genitive |
| GER | gerundive |
| HORT | hortative |
| HYP | hypothetical |
| IMP | imperative |
| IN | inclusive |
| INCOH | inchoative |
| INF | infinitive |
| INST | instrumental |
| INTENS | intensifier |
| INTERJ | interjection |
| IRR | irrealis, contrary to fact |
| LOC | locative |
| MEAS | measure word |
| NEG | negative |
| NOM | nominalizer |
| NSG | non-singular |
| O | transitive object |
| ONOM | onomatopoeic, phonaesthetic |
| PA | past/accomplished |
| PL | plural |
| PN | proper noun |
| PR | present/non-accomplished |
| PROG | progressive |
| PURP | purpose |
| PV | pre-verb |
| Q | question |
| REDUP | reduplication |
| REFL | reflexive |
| REP | reported/hearsay marker |
| S1 | present stem |
| S2 | past stem |
| SA | actor (intransitive subject or transitive agent) |
| SG | singular |
| SO | non-agent (intransitive subject or transitive object) |
| SOC | sociative |
| SUB | subordinator |
| TOP | topic, theme |
| VOC | vocative |
| xx | (speech error, false start, etc.) |
The abbreviations SG, NSG, DU, PL, EX, IN are not separated from a preceding person number, e.g. "1SG".
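Because these labels are written without separators, a reader or script has to split them again before looking them up in the table. The Python sketch below shows one way to do so. The names `expand_label`, `PERSON` and `TAGS` are hypothetical, and the order in which several tags may follow the person number is an assumption: the sketch accepts them in any order.

```python
import re

# Meanings copied from the abbreviation table above.
PERSON = {
    "1": "1st person",
    "2": "2nd person",
    "3": "3rd person",
    "12": "1st or 2nd person",
    "23": "2nd or 3rd person",
}
TAGS = {
    "SG": "singular",
    "NSG": "non-singular",
    "DU": "dual",
    "PL": "plural",
    "EX": "exclusive",
    "IN": "inclusive",
}

# A person number followed by zero or more unseparated tags.
_LABEL = re.compile(r"(12|23|[123])((?:NSG|SG|DU|PL|EX|IN)*)")
_TAG = re.compile(r"NSG|SG|DU|PL|EX|IN")

def expand_label(label):
    """Expand a label such as '1SG' into a list of meanings."""
    match = _LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"not a person-number label: {label!r}")
    person, tail = match.groups()
    return [PERSON[person]] + [TAGS[tag] for tag in _TAG.findall(tail)]


print(expand_label("1SG"))
print(expand_label("3NSG"))
```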
Boyd Michailovsky