Eventually I'll make a blog or something.

Drafts

Self-segregating morphology for logical languages: An overview

Unambiguous syntax is the sine qua non of a logical language. One way to put this is that it must always be clear where one part of a sentence ends and the next begins. Logical languages since Loglan have tried to implement this criterion at all structural levels, from the sentence down to the morpheme. (For our purposes, a morpheme is a string of phonemes that encodes meaning and cannot be further broken down into meaningful parts. Examples of morphemes in English include "dog" and the plural marker "-s" in "dogs.") This seems to be somewhat easier to achieve in written language than in spoken language. Self-segregation in speech typically requires some rather unnatural rules governing the phonological patterns, or "shapes," of morphemes.

Various conlangers have written about the ways in which self-segregation can be built into a language. Jim Henry, notably, put together the well-known "List of self-segregating morphology methods" found at Frathwiki and the Conlang Wikia. I have found this list to be invaluable as a starting point. However, it lacks generality.

At the most basic level, self-segregation can be accomplished in only a few ways. It involves some basic unit and some kind of higher-level unit made up of basic units. I will call the basic units elements. In the schemes most often discussed, elements are either phonological segments (phonemes) or syllables; but they can also be morae, strings of segments of determinate length, or any other linguistic unit. Higher-level units may be morphemes or words (among other things); but for simplicity's sake, I will not distinguish between morphemes and words, and so refer only to words.

The easiest way to have words self-segregate is to fix their length at some number of elements, for all words. This inflexibility is not very attractive: Who wants a language where, for instance, every single word is two syllables long? Most existing languages use what might be termed a "bipartite scheme." This involves classifying elements into two sets, which are loosely speaking word-initial and word-final: Set A and Set B. Words are then defined in terms of these classes, in one of three ways.

1. The left-breaking method: A word is defined phonologically as a string consisting of any element from A followed optionally by any number of elements from B. Words thus come in the patterns A, AB, ABB, ABBB, etc.

2. The right-breaking method: A word is defined as a string consisting of any element from B, preceded optionally by any number of elements from A. Words come in the patterns B, AB, AAB, AAAB, etc. (The prototypical method is found in Xorban: a word consists of one or more consonants followed by a peripheral vowel.)

3. The bidirectional method: A word is defined as a string consisting of one or more A-elements followed obligatorily by one or more B-elements. Words end at the juncture where a final B-element meets an initial A-element, or at the end of an utterance. They come in the patterns AB, AAB, ABB, AABB, AABBB, etc.
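The three methods above can be sketched as small segmenters. This is only an illustration under stated assumptions: the function names and the `is_a` classifier (which returns True for Set A elements and treats everything else as Set B) are hypothetical, and elements are represented as items of any sequence, here single characters.

```python
def segment_left_breaking(elements, is_a):
    """Word = one A-element followed by zero or more B-elements."""
    words = []
    for el in elements:
        if is_a(el):
            words.append([el])        # every A-element opens a new word
        elif words:
            words[-1].append(el)      # B-elements attach to the open word
        else:
            raise ValueError("utterance cannot begin with a B-element")
    return words


def segment_right_breaking(elements, is_a):
    """Word = zero or more A-elements followed by one B-element."""
    words, current = [], []
    for el in elements:
        current.append(el)
        if not is_a(el):              # every B-element closes the word
            words.append(current)
            current = []
    if current:
        raise ValueError("utterance cannot end with an A-element")
    return words


def segment_bidirectional(elements, is_a):
    """Word = one or more A-elements followed by one or more B-elements."""
    words, current = [], []
    prev_was_b = False
    for el in elements:
        a = is_a(el)
        if a and prev_was_b:          # a B-to-A juncture marks a word boundary
            words.append(current)
            current = []
        if not a and not current:
            raise ValueError("a word cannot begin with a B-element")
        current.append(el)
        prev_was_b = not a
    if current:
        if not prev_was_b:
            raise ValueError("a word cannot end with an A-element")
        words.append(current)
    return words


def is_a(el):
    # Hypothetical classifier for the demonstration: the label "A" is Set A.
    return el == "A"


def joined(words):
    return ["".join(w) for w in words]


print(joined(segment_left_breaking("ABBAAB", is_a)))   # ['ABB', 'A', 'AB']
print(joined(segment_right_breaking("AABBAB", is_a)))  # ['AAB', 'B', 'AB']
print(joined(segment_bidirectional("AABAB", is_a)))    # ['AAB', 'AB']
```

In each case a single left-to-right pass suffices, and no lookahead beyond the current element is needed. That is one way of seeing why these schemes are self-segregating.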

I must stop and note that there is a passing resemblance in these systems to prefix, postfix and infix notation, as well as to the head directionality parameter that features prominently in typology. (In fact, The World Atlas of Language Structures' classification of fixed-stress systems into left- and right-headed inspired my analysis here.) Mathematics has investigated similar phenomena; perhaps graph theory, or some part thereof, should have the final word here.

This is, of course, a simplified picture. The nuances that can be worked into a self-segregation system are multifarious. One of the simplest is having joining elements. Let us use the left-breaking system as our basic structure. We can define a class C that, where present, links the element before a C-element to the element after, taking precedence over the left-breaking rule. Words can now look like ACA, ACAB, ABCAB, etc., in addition to the patterns shown earlier. Something similar is possible under right-breaking and bidirectional systems.
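A minimal sketch of the joiner idea, assuming the input is a string of class labels "A", "B" and "C" rather than real phonological elements. The function name is hypothetical, and the treatment of edge cases (a joiner at the start or end of an utterance, or two joiners in a row, is rejected) is my own assumption rather than part of the scheme as described.

```python
def segment_left_breaking_with_joiner(classes):
    """Left-breaking segmentation in which a C-element suppresses the next break."""
    words = []
    after_joiner = False                  # True immediately after a C-element
    for c in classes:
        if c not in "ABC":
            raise ValueError(f"unknown class label: {c!r}")
        if c == "C":
            if not words or after_joiner:
                raise ValueError("a joiner must follow an A- or B-element")
            words[-1] += c
            after_joiner = True
        elif c == "A" and not after_joiner:
            words.append(c)               # ordinary left-breaking: A opens a word
        else:
            if not words:
                raise ValueError("utterance cannot begin with a B-element")
            words[-1] += c                # B-element, or an A held by a joiner
            after_joiner = False
    if after_joiner:
        raise ValueError("utterance cannot end with a joiner")
    return words


print(segment_left_breaking_with_joiner("ACABABCAB"))  # ['ACAB', 'ABCAB']
print(segment_left_breaking_with_joiner("ACAA"))       # ['ACA', 'A']
```

The only change from the plain left-breaking rule is one bit of state: whether the previous element was a joiner.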

Another twist is to have deferred, or long-distance, effects. For instance, we can define a class of joiners, D, that connects the following two elements to the previous element; and another class, E, that connects the following three. This is similar to what Jeff Prothero's "Plan B" language does by having morpheme-initial segments encode the length of the morphemes they belong to.
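The length-encoding variant can be sketched as follows. This is a simplified illustration and not a reconstruction of Plan B itself: the function name and the table `EXTRA_LENGTH`, which maps each word-initial element to the number of elements that follow it in the same word, are hypothetical.

```python
# Hypothetical table: the word-initial element fixes how many more elements follow.
EXTRA_LENGTH = {"p": 1, "t": 2, "k": 3}


def segment_length_prefixed(elements, extra_length):
    """Split a sequence into words whose first element announces their length."""
    words = []
    i = 0
    while i < len(elements):
        head = elements[i]
        if head not in extra_length:
            raise ValueError(f"{head!r} cannot begin a word")
        end = i + 1 + extra_length[head]
        if end > len(elements):
            raise ValueError("utterance ends in the middle of a word")
        words.append(elements[i:end])
        i = end                           # the next element must be a new head
    return words


print(segment_length_prefixed("patiukaeo", EXTRA_LENGTH))  # ['pa', 'tiu', 'kaeo']
```

Unlike the bipartite schemes, the non-initial elements here are unconstrained: a head element may recur inside a word without causing a break, because the listener is counting rather than classifying.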

It is entirely possible to have systems and rules of different types operating simultaneously. This may be done to permit a greater variety of word-forms, to implement self-segregation at multiple levels of the grammar, or for the sake of redundancy. Lojban exemplifies such a hybrid system. It has a deferred right-breaking rule in the form of fixed penultimate stress. On top of that, it has various joining elements, such as the vowel y and heterosyllabic consonant clusters like rp; and something like a left-breaking rule in its requirement that content words have a consonant cluster in the first five segments. More analysis is needed. Questions of interest include: Is Lojban's cluster requirement best understood as a rule concerning segments, segment sequences or syllables? Which rules, if any, are redundant? Which rules belong to which level of self-segregation?

Most existing logical languages draw a distinction between words and morphemes: they have words containing more than one morpheme. This necessitates two forms of self-segregation operating at once (at least — intermediate-level groupings of morphemes within a word are possible, and occur in Lojban). One elegant system, which allows discrimination between three types of morpheme within a word, is presented by Rick Morneau in his essay "Morphology." It provides a good demonstration of how right-breaking rules can be stacked. Morneau's method is to "[ensure] that each type of morpheme can always be identified by its shape, and . . . that each type can occupy only one position in a word." Three morpheme types are possible in his model system: prefix, root and suffix. Only suffixes are obligatory; prefixes and roots may appear zero or more times. Prefixes are of the form consonant-semivowel-vowel; roots, consonant-vowel-nasal; and suffixes, consonant-vowel. Altogether, this constitutes a right-breaking system at the word level, with CV syllables the B-elements; and at the morpheme level, with syllable rimes the B-elements. Morneau uses a similar but more elaborate system in his magnum opus.
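The stacking of the two levels in the system just described can be sketched with a small tokenizer. Several things here are assumptions of mine and not taken from Morneau: the phoneme inventory (`C`, `S`, `V`, `N`), the rule that a nasal directly before a vowel or semivowel is read as an onset, not a coda, and the reading that each word contains exactly one suffix, which closes it. The names `MORPHEME` and `parse` are hypothetical.

```python
import re

# Hypothetical inventory: consonants, semivowels, vowels, and coda nasals.
C, S, V, N = "ptkbdgsmnl", "wy", "aeiou", "mn"

MORPHEME = re.compile(
    rf"(?P<prefix>[{C}][{S}][{V}])"            # consonant-semivowel-vowel
    rf"|(?P<root>[{C}][{V}][{N}](?![{V}{S}]))"  # consonant-vowel-nasal (nasal is a coda)
    rf"|(?P<suffix>[{C}][{V}])"                # consonant-vowel
)
ORDER = {"prefix": 0, "root": 1, "suffix": 2}


def parse(utterance):
    """Split an utterance into words, each a list of (type, morpheme) pairs."""
    words, current, stage, pos = [], [], 0, 0
    while pos < len(utterance):
        m = MORPHEME.match(utterance, pos)
        if m is None:
            raise ValueError(f"no morpheme shape fits at position {pos}")
        kind = m.lastgroup
        if ORDER[kind] < stage:
            raise ValueError(f"{kind} out of position at {pos}")
        current.append((kind, m.group()))
        stage = ORDER[kind]
        pos = m.end()
        if kind == "suffix":                   # word-level right-breaking rule
            words.append(current)
            current, stage = [], 0
    if current:
        raise ValueError("utterance ends without a suffix")
    return words


for word in parse("pwapantaki"):
    print(word)
# [('prefix', 'pwa'), ('root', 'pan'), ('suffix', 'ta')]
# [('suffix', 'ki')]
```

The morpheme type is read off the shape alone, and the word boundary is read off the morpheme type alone, which is the sense in which the two right-breaking rules are stacked.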

I believe I have covered the most common and useful ways of implementing morphological self-segregation. Needless to say, there is still much more to investigate in this area. Many combinations of methods remain untested. Some may be impossible or practically useless; others may be quite fruitful. Greater complexity seems often to result in more agreeably naturalistic languages.
