User:Selguha/Essays

From the Logical Languages Wiki

Eventually I'll make a blog or something.

Drafts

Self-segregating morphology for logical languages: An overview

Unambiguous syntax is the sine qua non of a logical language. In plain language, unambiguous syntax means that it is always clear where one thing ends and the next begins; in other words, any expression can be parsed in only one way. Logical languages since Loglan have tried to implement this criterion at all structural levels, from the sentence down to the morpheme. (For our purposes, a morpheme is a string of phonemes that encodes meaning and cannot be further broken down into meaningful parts. Examples of morphemes in English include "dog" and the plural marker "-s" in "dogs.") It seems that this is somewhat easier to do in written language than in spoken language. Self-segregation in speech typically requires some rather unnatural rules governing the phonological patterns, or "shapes," of morphemes.

Various conlangers have written about the ways in which self-segregation can be built into a language. Jim Henry, notably, put together the well-known "List of self-segregating morphology methods" found at Frathwiki and the Conlang Wikia. I have found this list to be invaluable as a starting point. However, it lacks generality.

At the most basic level, self-segregation can be implemented in only a few ways. It involves some basic unit and some kind of higher-level unit made up of basic units. I will call the basic units elements. In the schemes most often discussed, elements are either phonological segments (phonemes) or syllables; but they can also be morae, strings of segments of determinate length, or any other linguistic unit. Higher-level units may be morphemes or words (among other things); but for simplicity's sake, I will not distinguish between morphemes and words, and so refer only to words.

The easiest way to have words self-segregate is to fix their length at some number of elements, for all words. This inflexibility is not very attractive: Who wants a language where, for example, every single word is two syllables long? Most existing languages use what might be termed a "bipartite scheme." This involves classifying elements into two sets, which are loosely speaking word-initial and word-final: Set A and Set B. Words are then defined in terms of these classes, in one of three ways.

1. The left-breaking method: A word is defined phonologically as a string consisting of any element from A followed optionally by any number of elements from B. Words thus come in the patterns A, AB, ABB, ABBB, etc.

2. The right-breaking method: A word is defined as a string consisting of any element from B, preceded optionally by any number of elements from A. Words come in the patterns B, AB, AAB, AAAB, etc.

3. The bidirectional method: A word is defined as a string consisting of one or more A-elements followed obligatorily by one or more B-elements. Words end at the juncture where a final B-element meets an initial A-element, or at the end of an utterance. They come in the patterns AB, AAB, ABB, AABB, AABBB, etc.
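As a rough illustration, the three definitions above can be sketched as small Python segmenters. The syllable inventories SET_A and SET_B and the sample stream are invented for the example, and the sketch assumes every element belongs to exactly one of the two sets.

```python
# Hypothetical element inventories (syllables), invented for illustration.
SET_A = {"ka", "ta", "pa"}
SET_B = {"ni", "mi", "li"}


def segment_left_breaking(elements, set_a):
    """Every A-element opens a new word (patterns A, AB, ABB, ...)."""
    words = []
    for el in elements:
        if el in set_a:
            words.append([el])        # A-element: start a new word
        elif words:
            words[-1].append(el)      # B-element: extend the current word
        else:
            raise ValueError("stream must begin with an A-element")
    return words


def segment_right_breaking(elements, set_b):
    """Every B-element closes the current word (patterns B, AB, AAB, ...)."""
    words, current = [], []
    for el in elements:
        current.append(el)
        if el in set_b:               # B-element: the word ends here
            words.append(current)
            current = []
    if current:
        raise ValueError("stream must end with a B-element")
    return words


def segment_bidirectional(elements, set_a, set_b):
    """Break only where a B-element meets a following A-element."""
    words, current = [], []
    for el in elements:
        if current and current[-1] in set_b and el in set_a:
            words.append(current)
            current = []
        current.append(el)
    if current:
        words.append(current)
    for word in words:                # each word must have the shape A+ B+
        if word[0] not in set_a or word[-1] not in set_b:
            raise ValueError(f"ill-formed word: {word}")
    return words


stream = "ka ni mi ta pa li".split()
print(segment_left_breaking(stream, SET_A))
# [['ka', 'ni', 'mi'], ['ta'], ['pa', 'li']]
print(segment_right_breaking(stream, SET_B))
# [['ka', 'ni'], ['mi'], ['ta', 'pa', 'li']]
print(segment_bidirectional(stream, SET_A, SET_B))
# [['ka', 'ni', 'mi'], ['ta', 'pa', 'li']]
```

The same stream yields three different segmentations, which shows that the choice of method, and not only the choice of sets, determines where words begin and end.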

(I must stop and note that these systems bear a passing resemblance to prefix, postfix and infix notation, as well as to the head directionality parameter that is much discussed in linguistics. For an example outside syntax, see how the World Atlas of Language Structures classifies systems of fixed stress as right- and left-headed. Mathematics has investigated similar phenomena; perhaps graph theory, or some part thereof, should have the final word here.)

This is, of course, a simplified picture. The nuances that can be worked into a self-segregation system are multifarious. One of the simplest is having joining elements. Let us use the left-breaking system as our basic structure. We can define a class C that, where present, links the element before a C-element to the element after, taking precedence over the left-breaking rule. Words can now look like ACA, ACAB, ABCAB, etc., in addition to the patterns shown earlier. Something similar is possible under right-breaking and bidirectional systems.
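A minimal sketch of the joining rule on top of left-breaking might look like the following. The inventories are made up for the example, and the function does not check for stray cases such as a stream that ends in a C-element.

```python
# Hypothetical inventories, invented for illustration.
SET_A = {"ka", "ta", "pa"}   # word-initial elements
SET_B = {"ni", "mi"}         # word-continuing elements
SET_C = {"y"}                # joining elements


def segment_with_joiners(elements, set_a, set_c):
    """Left-breaking segmentation in which a C-element suppresses the next break."""
    words = []
    prev = None
    for el in elements:
        # An A-element opens a new word unless a joiner immediately precedes it.
        if el in set_a and prev not in set_c:
            words.append([el])
        elif words:
            words[-1].append(el)
        else:
            raise ValueError("stream must begin with an A-element")
        prev = el
    return words


stream = "ka ni y ta mi pa y ka".split()
print(segment_with_joiners(stream, SET_A, SET_C))
# [['ka', 'ni', 'y', 'ta', 'mi'], ['pa', 'y', 'ka']]
```

The two words in the output have the shapes ABCAB and ACA, matching the patterns listed above.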

Another twist is to have deferred, or long-distance, effects. For instance, we can define a class of joiners, D, whose members connect the following two elements to the previous element; and another class, E, whose members connect the following three. This is similar to what Jeff Prothero's "Plan B" language does by having morpheme-initial segments encode the length of the morphemes they belong to.
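The deferred joiners can be sketched in the same way, again over a left-breaking base. The elements "do" (class D, span 2) and "re" (class E, span 3) are invented for the example. One design choice here is an assumption rather than something the description settles: elements inside a forced span are attached unconditionally, so a joiner that falls inside a span is treated as ordinary material.

```python
# Hypothetical inventories, invented for illustration.
SET_A = {"ka", "ta"}
SET_B = {"ni"}
JOINER_SPAN = {"do": 2, "re": 3}   # class D joins the next 2 elements, class E the next 3


def segment_deferred(elements, set_a, joiner_span):
    """Left-breaking segmentation with joiners that bind a fixed number of following elements."""
    words = []
    forced = 0   # how many upcoming elements must stay in the current word
    for el in elements:
        if forced > 0:
            words[-1].append(el)          # inside a span: attach regardless of class
            forced -= 1
        elif el in joiner_span:
            if not words:
                raise ValueError("a joiner cannot begin the stream")
            words[-1].append(el)
            forced = joiner_span[el]      # open a new span
        elif el in set_a:
            words.append([el])            # ordinary left-breaking rule
        elif words:
            words[-1].append(el)
        else:
            raise ValueError("stream must begin with an A-element")
    if forced:
        raise ValueError("stream ended inside a joiner's span")
    return words


stream = "ka do ta ka ni ta re ka ta ka ni ka".split()
print(segment_deferred(stream, SET_A, JOINER_SPAN))
# [['ka', 'do', 'ta', 'ka', 'ni'], ['ta', 're', 'ka', 'ta', 'ka', 'ni'], ['ka']]
```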

It is entirely possible to have systems and rules of different types operating simultaneously. For instance, Lojban has a deferred right-breaking rule in the form of fixed penultimate stress. On top of that, it has various joining elements, such as the vowel y and heterosyllabic consonant clusters like rp; and something like a left-breaking rule in its requirement that content words have a consonant cluster in the first five segments. More analysis is needed, obviously. For example, is Lojban's cluster requirement best understood as a rule concerning segments, segment sequences or syllables?

Most existing logical languages draw a distinction between words and morphemes: they have words containing more than one morpheme. This necessitates two forms of self-segregation operating at once (at least — intermediate groupings of morphemes within a word are possible, and found in Lojban). Adding distinctions between types of morpheme may actually simplify things, counterintuitively enough. As Rick Morneau describes in his essay "Morphology," one easy way to implement self-segregation at the morpheme and word level is to "[ensure] that each type of morpheme can always be identified by its shape, and . . . that each type can occupy only one position in a word." He provides the following system for illustration. Three morpheme types are possible: prefix, root and suffix. Only suffixes are obligatory; prefixes and roots may appear zero or more times.

C = b, p, d, t, g, k, z, s, v, f
V = a, e, i, o, u
S = y, w
N = m, n

prefix = CSV
root = CVN
suffix = CV

word = {prefix} {root} suffix

Essentially, this is a simple right-breaking system at the word level: a word ends to the right of its first bare CV syllable, that is, a CV sequence not followed by N. Within a word, morphemes also break to the right: they end to the right of every syllable rime. Morneau uses a similar but more elaborate system in his magnum opus.
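To see the illustrative system at work, here is a small sketch that splits an unbroken string of letters into words and morphemes using regular expressions built directly from the definitions above. The test string is made up, and the code is only one possible way to realize the grammar, not Morneau's own procedure.

```python
import re

# Letter classes, as given in the definitions above.
C, V, S, N = "bpdtgkzsvf", "aeiou", "yw", "mn"

PREFIX = f"[{C}][{S}][{V}]"   # CSV
ROOT = f"[{C}][{V}][{N}]"     # CVN
SUFFIX = f"[{C}][{V}]"        # CV

# word = {prefix} {root} suffix
WORD = re.compile(f"(?:{PREFIX})*(?:{ROOT})*{SUFFIX}")
# ROOT is tried before SUFFIX so that CVN is not cut short to CV.
MORPHEME = re.compile(f"{PREFIX}|{ROOT}|{SUFFIX}")


def parse(stream):
    """Split an unbroken letter string into words, each a list of morphemes."""
    words, pos = [], 0
    while pos < len(stream):
        match = WORD.match(stream, pos)
        if match is None:
            raise ValueError(f"no valid word at position {pos}")
        words.append(MORPHEME.findall(match.group()))
        pos = match.end()
    return words


print(parse("kwabantasipyotenzu"))
# [['kwa', 'ban', 'ta'], ['si'], ['pyo', 'ten', 'zu']]
```

Because the four letter classes are disjoint, one letter of lookahead is enough to tell the morpheme types apart: after a C, an S signals a prefix; after CV, an N signals a root, and anything else means the suffix, and therefore the word, has ended.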

I believe I have covered all the obvious and useful ways morphological self-segregation can be implemented. Needless to say, there is still much more about self-segregation systems in need of investigation.