User:Selguha/Sandbox

From the Logical Languages Wiki

See also my Essays page, Comparison of orthographies page, and Lido phonotactics page.

The most recent material is at the top of this page.

Draft of TALM (23 June 2021)

TALM (“Third Attempt at Loglan Morphology”) is a blueprint for a logical language, excluding syntax and semantics. Its main goal is to implement monoparsing with a simple, cross-linguistically average phonology, so that the eventual language it supports can act as an aux-loglang with an international lexicon, like Ceqli.

Phonology and orthography

There are 26 phonemes in TALM: five vowels and 21 consonants.

Vowel phonemes
Front Back
Close i u
Mid e o
Open a

The consonants are divided into three classes, through which the parsing morphology is defined.

Consonant phonemes
Labial Alveolar Palatal Velar Glottal
Plosive p b t d t͡ʃ d͡ʒ k g ʔ
Fricative f v s z ʃ x
Nasal m n
Lateral l
Rhotic r
Semivowel w j

Pure consonants (C) are highlighted in blue. Medial consonants (M) are highlighted in pink. The glottal stop (Q) is in its own class, and shown in green.

The alphabet is as follows. Major differences from the IPA are highlighted in red.

The TALM alphabet
Grapheme a b c d e f g h i j k l m n o p q r s t u v w x y z
Phoneme a b t͡ʃ d e f g x i d͡ʒ k l m n o p ʔ r s t u v w ʃ j z
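As a minimal sketch, the alphabet chart can be written as a lookup table. The names TO_IPA and to_ipa are illustrative only and are not part of the TALM draft; the mapping itself is copied from the chart.

```python
# Grapheme-to-phoneme table copied from the alphabet chart above.
GRAPHEMES = "abcdefghijklmnopqrstuvwxyz"
PHONEMES = ["a", "b", "t͡ʃ", "d", "e", "f", "g", "x", "i", "d͡ʒ", "k", "l", "m",
            "n", "o", "p", "ʔ", "r", "s", "t", "u", "v", "w", "ʃ", "j", "z"]
TO_IPA = dict(zip(GRAPHEMES, PHONEMES))


def to_ipa(text):
    """Transcribe TALM spelling into a list of phonemes.

    Raises KeyError for any character outside the 26-letter alphabet.
    """
    return [TO_IPA[ch] for ch in text.lower()]


print(to_ipa("xayu"))  # ['ʃ', 'a', 'j', 'u']
```

Note that h, q, x and y are the letters whose values differ most from the IPA: they stand for /x/, /ʔ/, /ʃ/ and /j/ respectively.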

Morphology

All native words have phonological shapes that allow them to be parsed without ambiguity as to word boundaries.

General word-shape formula

((H*S)?L)|(SQ)

H is a heavy, or closed, syllable, i.e. a syllable with a final consonant. S is a stressed syllable, either heavy or light. L is a light, or open, syllable that is always unstressed. Q is a glottal stop; SQ is a stressed light syllable with a glottal-stop coda. The glottal stop behaves like a light syllable, except that no word may consist of a glottal stop alone. So, in plain language, the formula can be stated as follows:

A word consists of (a) an unstressed light syllable, optionally preceded by a string of any number of heavy syllables ending in a stressed syllable; or (b) a stressed light syllable followed by a glottal stop.
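The formula can be tried out directly as a regular expression over strings of syllable labels. This is only a sketch: it assumes each syllable has already been labelled H, S, L or Q, and the function names are made up for illustration. Because L and Q occur only at the end of a word, a boundary can be placed after each of them.

```python
import re

# The general word-shape formula, applied to strings of syllable labels.
WORD = re.compile(r"(?:(?:H*S)?L)|(?:SQ)")


def is_word(labels):
    """True if a label string such as 'HSL' fits the word-shape formula."""
    return WORD.fullmatch(labels) is not None


def split_syllable_stream(labels):
    """Split an unbroken stream of syllable labels into words.

    L and Q are always word-final, so each one closes a word.
    """
    words = re.findall(r"[HS]*[LQ]", labels)
    if "".join(words) != labels or not all(is_word(w) for w in words):
        raise ValueError(f"stream {labels!r} cannot be parsed into words")
    return words


print(split_syllable_stream("HSLLSQSL"))  # ['HSL', 'L', 'SQ', 'SL']
```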

Two syllables whose intervening consonant is M behave as one syllable for the purposes of the formula. H may be a bisyllable of the form CVMVC, CVMVM, CVMVV, CMVMVC, CMVMVM or CMVMVV. S may be a bisyllable of the form CVˈMV, CMVˈMV, CVˈMVC, CVˈMVM, CVˈMVV, CMVˈMVC, CMVˈMVM or CMVˈMVV, where ⟨ˈ⟩ indicates that the following syllable is stressed (as in the IPA). L, however, is always a CV or CVV syllable.

Word classes

There are seven word classes. The definition of a class is primarily phonological and morphological. Classes may be divided in three ways. Lexically, there are function words, content words and names – just like in Lojban. Phonologically, there are native, adapted and foreign words, the last of which can be transcribed or untranscribed. In terms of parsing morphology or self-segregation method, there are Type 1 words, which use an A*B self-segregation method; Type 2 words, which use a “projection” method (to be described below); and Type 3 words, which are bracketed to set them apart from the rest of the utterance. Types 1, 2 and 3 have a one-to-one correspondence with native, adapted and foreign words. Each word class has a conventional designation of its type number plus an identifying letter; for instance, 2b.

Word classes
Class No. | Class name | Lojban equivalent | Lexical category | Nativeness | Self-seg. method
1a | Function word | cmavo | Function word | Native | A*B
1b | Root word | gismu | Content word | Native | A*B
1c | Compound word | lujvo | Content word | Native | A*B
2a | Loanword | zi'evla / fu'ivla | Content word | Adapted | Projection
2b | Name | cmevla | Name | Adapted | Projection
3a | Delimited name | Type 1 fu'ivla / Non-Lojban name/quote | Name | Foreign, transcribed | Bracketing
3b | Foreign material | Type 1 fu'ivla / Non-Lojban name/quote | Name | Foreign, untranscribed | Bracketing

Function words

Function words are one to two syllables in length. Their word shapes fit the formula (CM?)?V(V|(M?VV?))?. In other words, their word shapes range from V to CMVV or CMVMVV. Shapes such as CVMMV or CVVMV are banned for purely phonotactic reasons; the clusters [rw wr ry yr wy yw] were deemed unstable (likely to undergo a sound change such as coalescence or fortition).
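A rough way to check a spelling against the function-word formula is to reduce it to a string of class letters and match the formula as a regular expression. One assumption is needed: the colour coding of the consonant chart is not reproduced on this page, so the sketch takes the medial consonants (M) to be r, w and y, as the example words and the banned clusters suggest. The test words are invented.

```python
import re

VOWELS = set("aeiou")
MEDIALS = set("rwy")  # assumption: inferred from the examples, not stated outright

# The function-word formula, applied to strings of class letters.
FUNCTION_WORD = re.compile(r"(?:CM?)?V(?:V|(?:M?VV?))?")


def shape(word):
    """Map each letter to its class: V, M, Q (the glottal stop q) or C."""
    classes = []
    for ch in word:
        if ch in VOWELS:
            classes.append("V")
        elif ch in MEDIALS:
            classes.append("M")
        elif ch == "q":
            classes.append("Q")
        else:
            classes.append("C")
    return "".join(classes)


def is_function_word_shape(word):
    return FUNCTION_WORD.fullmatch(shape(word)) is not None


print(shape("triyai"), is_function_word_shape("triyai"))  # CMVMVV True
print(shape("pata"), is_function_word_shape("pata"))      # CVCV False
```

The banned shapes fall out of the formula itself: an invented CVMMV word such as kawya, or a CVVMV word such as kaiya, fails to match.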

Monosyllabic function words cannot bear primary stress without potentially absorbing a following monosyllable; CVCV is a root-word shape. However, any monosyllabic function word can be stressed if it takes a glottal stop coda; its word shape is then SQ. This serves to make the word boundary unambiguous. Lojban uses glottal stops in approximately the same way, but in TALM the glottal stop is regarded as part of the function word rather than a “pause” between words. In a string of monosyllabic function words, every odd-numbered word is pronounced with a glottal stop coda.
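The alternation in the last sentence can be sketched in a couple of lines. The function name and the sample words are invented, and the glottal stop is shown phonemically as ʔ.

```python
def add_glottal_stops(words):
    """Give every odd-numbered word (1st, 3rd, ...) a glottal-stop coda."""
    return [w + "ʔ" if i % 2 == 0 else w for i, w in enumerate(words)]


print(add_glottal_stops(["la", "ti", "ko"]))  # ['laʔ', 'ti', 'koʔ']
```

The result alternates SQ words with bare L words, both of which fit the general word-shape formula.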

Root words

Root words are two to three syllables in length, and may have between four and nine segments. They have the word-shape formula C((M?V)?M)?V(V|M|C)?C(M|C)?V. The minimal shape is CVCV. The maximal shape may be CMVMVVCMV (e.g. triraitra), CMVMVVCCV (triraikla), CMVMVMCMV (triyartra), CMVMVMCCV (triyarsta), CMVMVCCMV (trirantra) or CMVMVCCCV (triyankla).

Problem:

- Want arsta to be syllabified /ar.sta/; insta to be syllabified /in.sta/ (MOP); and absta should be allowed

- Therefore st must be a Level 1 onset.

- Therefore triyarsta is /triyar.sta/; hence --> triyas

- But triyasta should --> triyat

- Cf. papla --> pap, pampla --> pap

- Solution A: make sp st sk (sm (sn (sl))) semi-native onsets but not native onsets. вспы́шка (fspixka) 'flash'; insta, *absterakt (normally abestrakt). Remove the requirement that all native onsets be allowed intervocalically in root words. (Should loanwords be distinguished from names again? Name: fspixka. Loanword: spixka.)

Affixes

Root words have short combining forms, called affixes. Every root word has one affix, and the phonological form of the affix is predictable given the form of the parent word. The converse is only true if the entire lexicon is known. The phonological derivation procedure involves truncation, or the stripping away of segments from the word. To see the pattern, it is necessary to number each segment.

C₁((M₁?V₁)?M₂)?V₂(V₃|M₃|C₂)?C₃(M₄|C₄)?V₄ → C₁((M₁?V₁)?M₂)?V₂C₃

All affixes begin with the consonant that occupies the C₁ slot and end with the consonant that occupies the C₃ slot. If a root word begins with the string C₁M₁V₁M₂V₂(V₃|M₃|C₂)C₃, its affix is C₁M₁V₁M₂V₂C₃; this is the maximal affix shape. If a root word begins with the string C₁V₂C₃, its affix is C₁V₂C₃; this is the minimal affix shape.
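The truncation can be sketched as follows, under three assumptions that go beyond the text. First, the medial consonants are taken to be r, w and y. Second, the slot after C₃ is treated as optional, so that the minimal shape CVCV fits the formula. Third, a bare two-consonant cluster between vowels is read as C₃C₄ only if it belongs to a set of permitted onsets; otherwise it is split into C₂ and C₃. The onset set used here is hypothetical, since the draft does not list the native onsets, and it deliberately leaves out st.

```python
import re

VOWELS = set("aeiou")
MEDIALS = set("rwy")                             # assumption, see above
ASSUMED_ONSETS = {"pl", "bl", "kl", "gl", "fl"}  # hypothetical CC onsets

# Root-word formula over class letters. The slot before C3 is matched lazily,
# so a lone medial consonant pair is first read as C3 + C4.
ROOT = re.compile(
    r"(?P<head>C(?:(?:M?V)?M)?V)(?P<mid>[VMC]??)(?P<c3>C)(?P<post>[MC]?)V"
)


def shape(word):
    """Map each letter to its class: V, M, Q (the glottal stop q) or C."""
    return "".join(
        "V" if ch in VOWELS else "M" if ch in MEDIALS else "Q" if ch == "q" else "C"
        for ch in word
    )


def affix(word, onsets=ASSUMED_ONSETS):
    """Return the affix of a root word: everything up to V2, plus C3."""
    m = ROOT.fullmatch(shape(word))
    if m is None:
        raise ValueError(f"{word!r} does not fit the root-word formula")
    i = m.start("c3")
    # A bare two-consonant cluster that is not a permitted onset is split
    # into coda C2 + onset C3, so C3 is the second consonant of the pair.
    if m.group("mid") == "" and m.group("post") == "C" and word[i:i + 2] not in onsets:
        i += 1
    return word[:m.end("head")] + word[i]


for root in ["bata", "baitu", "basti", "bantro", "papla", "triyarsta"]:
    print(root, "→", affix(root))
```

With these assumptions the sketch gives bat for bata, baitu, basti and bantro, pap for papla and pampla, triyas for triyarsta and triyat for triyasta. Adding st to the onset set would turn basti into bas, which is the tension that the notes on st address.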

The relation between root-word forms and affixes is potentially many-to-one. For instance, the following root words, if they existed, would all have the same affix:

bata → bat

baitu → bat

basti → bat

bantro → bat

In actuality, the relation is one-to-one. Only one of the words above may exist in the lexicon. This is ensured by generating an alphabetized list of all affixes before the first root word is created. Each new root word is then placed next to its affix on the list. Thus, a root word blocks all competitors for a given affix when it is entered into the dictionary.
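A small sketch of the blocking mechanism follows. Instead of generating the full alphabetized affix list in advance, it uses a dictionary that is filled as words are added, which has the same blocking effect. The class and method names are invented for illustration.

```python
class AffixRegistry:
    """Maps each affix to the single root word that owns it."""

    def __init__(self):
        self._owner = {}

    def register(self, root, affix):
        """Enter a root word; fail if its affix already belongs to another word."""
        owner = self._owner.get(affix)
        if owner is not None and owner != root:
            raise ValueError(f"affix {affix!r} is already taken by {owner!r}")
        self._owner[affix] = root

    def listing(self):
        """Alphabetized (affix, root) pairs."""
        return sorted(self._owner.items())


registry = AffixRegistry()
registry.register("bata", "bat")
registry.register("papla", "pap")
try:
    registry.register("basti", "bat")  # competitor for the affix 'bat'
except ValueError as err:
    print(err)
print(registry.listing())  # [('bat', 'bata'), ('pap', 'papla')]
```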

Compound words

Loanwords

Names

Old stuff

Drafts predating June 2021 below.

Levels, etc.

Ordering A

Level 1a / 1.1: function words (closed class); cmavo.

Level 1b / 1.2: root words (semi-closed class); gismu.

Level 1c / 1.3: compound words (open class); lujvo.

Level 2: loanwords and specialized words; Type 4 fu'ivla/zi'evla.

Level 3: stress-delimited, partially assimilated foreign words (typically names, nine or fewer syllables).

Level 4: bracketed, partially assimilated foreign material; foreign or ungrammatical speech, transcribed into native phonemes.

Level 5: bracketed, unassimilated foreign material; may contain foreign phones or foreign characters.

Function words involved in Level 4 and Level 5

  • "laya": Prefixed to a Level 4 quote; terminated by /!/ (a syllabic alveolar click, or another sound specified as the right-bracket).
  • "liyu": Prefixed to a Level 5 quote; terminated by /!/.
  • "luwa": Elidable/elided written material. Prefixed to a Level 5 quote/phrase by a writer. Indicates that an attempt at a foreign pronunciation is unnecessary; the word "luwa" may be substituted for the quote in read-aloud speech. When followed immediately by a Level 4 name, indicates that the Level 4 name is a transcription of the Level 5 quote.
  • "lirai": Un-elidable written material. Elicits a read-aloud pronunciation as faithful to the original foreign pronunciation as possible.
  • "lurau": Metalinguistic indicator of failure to read aloud faithfully.

Examples

1a. Written: lirai Abraham Lincoln; laya eibraham linkon; broda

1b. Read aloud successfully: /ˈliraɪ̯ ˈeɪ̯bɻəhæm ˈlɪnkn̩ ǃ, ˈlaja ˈeɪ̯braham ˈlinkon, ˈbroda/

1c. Read aloud unsuccessfully: /ˈluraʊ̯ ˈlaja ˈeibraham ˈlinkon ! ˈbroda/

Heterosyllabic clusters for a phonology

Set No. 1
p b f v m t d s z n l r c j x k g h
p pt ps pn pl pr px
b bd bz bn bl br bj
f ft fs fn fl fr fx
v vd vz vn vl vr vj
m mt md ms mz mn ml mj mx
t ts tr
d dz dr
s sp sm st sn sl sk
z zm zd zn zl
n nt nd ns nz nl nc nj nk ng nh
l lp lb lf lv lm lt ld ls lz ln lc lj lx lk lg lh
r rp rb rf rv rm rt rd rs rz rn rl rc rj rx rk rg rh
c
j jm jd jn jl jr
x xp xm xt xn xl xr xc xk
k kt ks kn kl kr kx
g gd gz gn gl gr gj
h ht hs hn hl hr hx

Comparison of orthographies

Orthographic representation of selected diaphonemes
English Esperanto Hanyu Pinyin Malay Latejami Xorban Loglan Lojban Toaq
//ʔ// ∅ / k¹ q . . ²
//t͡s// c c c
//d͡z// dz³ z⁴ z
//t͡ʃ// ch ĉ ch q⁵ c c ch
//d͡ʒ// j ĝ zh j j j j
//z// z z z z z z z
//ʃ// sh ŝ sh x sy x c c c sh
//ʒ// zh ĵ q j j j
//x// ĥ h kh x x x
//h// h h h h⁶ ⁷ h h
//w// w ŭ w / u w w w u u w
//j// y j y / i y y y i i y

1 Intervocalic glottal stop is implied when certain vowels appear back-to-back or doubled, such as in the word kemuliaan /kəmuli.aʔan/ 'glory; dignity'. Syllable-finally, glottal stop is written k.

2 Glottal stop is never written in Toaq, but does occur, at least phonetically, as the realization of the empty, or null, onset.

3 There is disagreement over whether Esperanto has a /d͡z/ phoneme. However, Kalocsay and Waringhien consider it to be one in their influential Plena Analiza Gramatiko de Esperanto (1985; p. 47).

4 For the sole purpose of orthographic comparison, we treat Standard Chinese's unaspirated stops as voiced consonants in this chart; Pinyin uses traditionally voiced consonant letters for these sounds.

5 Similarly, we have collapsed the Chinese retroflex and alveolopalatal series into the diaphonemic category of postalveolars. Admittedly this reflects the perceptual habits of the English speaker; it may be more intuitive to Chinese speakers to group the retroflex and alveolar sounds together as apicals. This would be less useful for the purposes of this chart, however.

6 The Latejami phoneme is allowed to vary between a dorsal fricative and a glottal fricative or stop; its main allophones can be inferred to be [ç], [x], [χ], [h] and [ʔ].

7 This sound is actually voiced, /ɦ/, with permitted allophones including [ɣ] and [ʁ].


Comparison of the IPA values of selected consonant letters
Grapheme English Pinyin Malay Latejami Xorban Loglan Lojban Toaq
’ (apostrophe) θ h ʔ
, (comma) ʔ .
. ʔ
; ʔ
c k / s t͡sʰ t͡ʃ t͡ʃ ʃ ʃ t͡sʰ
ch t͡ʃ t͡ʂʰ
h h x h h~x θ h h
j d͡ʒ t͡ɕ d͡ʒ d͡ʒ ʒ ʒ ʒ d͡ʑ
q t͡ɕʰ ʒ ʔ θ ŋ
sh ʃ ʂ ɕ
w w w w w w y w
x ks~gz ɕ ks ʃ x x x
y j j j j j ə ə j
z z t͡s z z z z z
zh ʒ t͡ʂ

Comparison of IPA values for consonant letters and digraphs in various languages
English Spanish Italian German Albanian Pinyin Malay Latejami Xorban Loglan Lojban Toaq
. ɦ ??? h
. ʔ ʔ
c k / s k / s~θ k / t͡ʃ k / t͡s t͡s t͡sʰ t͡ʃ t͡ʃ ʃ ʃ t͡sʰ
ch t͡ʃ ch k x t͡ʂʰ t͡ɕʰ
ç
dh ð
g g / d͡ʒ g~ɣ / x g / d͡ʒ g g g g g g g g
gh g / f~∅ g ɣ~x
gj ɟ~d͡ʑ
gn ɲ
h h / ∅ h / ː h x h h~x θ h h
j d͡ʒ x (j) j j d̥͡ʑ̥ d͡ʒ d͡ʒ ʒ ʒ ʒ d͡ʑ
kh (x) x
ll ʎ~ʝ ɫ
ng ŋ / ŋg ŋg ŋg ŋg ŋg ŋ ŋ ŋg ŋg ŋg ŋg
nj ɲ
ny ɲ
ñ (nj) ɲ
q c~t͡ɕ t͡ɕʰ ʒ ʔ θ ŋ
qu kw kw / k kw k / kv
r ɻ ɾ r ʁ / ɐ̯ ɾ ɻ~ʐ / ʵ r r r r r ɾ
rr r r
s s s s z s s s s s s s s
sc sk / s sk sk / ʃ ???
sch (ʃ) ʃ
sh ʃ ʃ ʂ ɕ
sy ʃ
ß s
th θ / ð t θ
v v b~β v f v v~f v v v v
w w (w) (w) v w w w w y w
x ks~gz / z ks~gz ks~gz? ks d͡z ɕ ks ʃ x x x
xh d͡ʒ
y j ʝ / i̯ (j) ??? y j j j j ə ə j
z z s~θ t͡s~d͡z t͡s z d̥͡z̥ z z z z z
zh ʒ ʒ d̥͡ʐ̥

Phonemic inventories for inspiration for new Ithkuiloids

Consonant phonemes of Ubykh, plus some others
Labial Alveolar Postalveolar Palatal Velar Uvular Epiglottal Glottal
central lateral laminal
closed
laminal apical
plain pal. phar. plain lab. plain lab. plain lab. plain lab. plain lab. plain lab. pal. plain lab. pal. plain lab. phar. phar. & lab. plain lab. pal. plain lab.
Plosive voiceless p t k q qˤʷ ʡ ʡʷ ʔʲ ʔ ʔʷ
voiced b d ɡʲ ɡ ɡʷ ɢʲ ɢ ɢʷ ɢˤ ɢˤʷ
ejective pʲʼ pˤʼ tʷʼ kʲʼ kʷʼ qʲʼ qʷʼ qˤʼ qˤʷʼ
Affricate voiceless t͡s t͡sʷ t͡ɬ t͡ɬʷ t̠͡ʃ t̠͡ʃʷ ȶ͡ɕ ȶ͡ɕʷ ʈ͡ʂ ʈ͡ʂʷ
voiced d͡z d͡zʷ d͡ɮ d͡ɮʷ d̠͡ʒ d̠͡ʒʷ ȡ͡ʑ ȡ͡ʑʷ ɖ͡ʐ ɖ͡ʐʷ
ejective t͡sʼ t͡sʷʼ t͡ɬʼ t͡ɬʷʼ t̠͡ʃʼ t̠͡ʃʷʼ ȶ͡ɕʼ ȶ͡ɕʷʼ ʈ͡ʂʼ ʈ͡ʂʷʼ
Fricative voiceless f s ɬ ɬʷ ʃ ʃʷ ɕ ɕʷ ʂ ʂʷ x χʲ χ χʷ χˤ χˤʷ ʜ ʜʷ h
voiced v z ɮ ɮʷ ʒ ʒʷ ʑ ʑʷ ʐ ʐʷ ɣʲ ɣ ɣʷ ʁʲ ʁ ʁʷ ʁˤ ʁˤʷ
ejective ɬʼ xʲʼ χʼ
Nasal m n ȵ ȵʷ ɳ ɳʷ ŋʲ ŋ ŋʷ
Approximant w l j ɥ
Trill r

Consonant phonemes of Naxi, plus some others & minus /ɥ/
Labial Dental/Alveolar Retroflex Palatal Velar Glottal
Plosive voiceless p t ʈ c k ʔ
aspirated ʈ
voiced b d ɖ ɟ ɡ
prenasalized ᵐb ⁿd ᶯɖ ᶮɟ ᵑɡ
Affricate voiceless ts ʈʂ
aspirated tsʰ ʈʂʰ tɕʰ
voiced dz ɖʐ
prenasalized ⁿdz ᶯɖʐ ⁿdʑ
Fricative voiceless f s ʂ ɕ x h
voiced v z ʐ ʑ ɣ
Nasal m n ɳ ɲ ŋ
Lateral approximant l ɭ ʎ
Flap or trill r ɽ
Semivowel w j

Consonant phonemes of Eastern Arrernte
Peripheral Coronal
Laminal Apical
Bilabial Velar Palatal Dental Alveolar Retroflex
Stop p pʷ k kʷ c cʷ t̪ t̪ʷ t tʷ ʈ ʈʷ
Nasal m mʷ ŋ ŋʷ ɲ ɲʷ n̪ n̪ʷ n nʷ ɳ ɳʷ
Prestopped nasal ᵖm ᵖmʷ ᵏŋ ᵏŋʷ ᶜɲ ᶜɲʷ ᵗn̪ ᵗn̪ʷ ᵗn ᵗnʷ ᵗɳ ᵗɳʷ
Prenasalized stop ᵐb ᵐbʷ ᵑɡ ᵑɡʷ ᶮɟ ᶮɟʷ ⁿd̪ ⁿd̪ʷ ⁿd ⁿdʷ ⁿɖ ⁿɖʷ
Lateral Approximant ʎ ʎʷ l̪ l̪ʷ l lʷ ɭ ɭʷ
Approximant β̞ ɰ j jʷ ɻ ɻʷ
Tap ɾ ɾʷ

A mild critique of Latejami (draft)

Rick Morneau's Latejami is one of the most complete and ingenious languages ever constructed. On the whole, it seems to be a remarkable success at being what it sets out to be: an easily learnable and easily speakable intermediary language for machine translation, capable of making translation from any source language straightforward and translation into any target language 'almost trivially easy'. If it has not been utilized as such, that would seem to reflect less on the language itself than on the direction that machine translation has taken since Latejami's publication -- or perhaps it's just a simple case of bad luck and undeserved obscurity. I'm unqualified to assess most of Morneau's work in the Latejami reference grammar, which as the title suggests deals with lexical semantics, other than to comment that it is rigorous, awesomely detailed and worthy of study by every conlanger. However, I can weigh in on a couple of the less important components of the language: morphology and phonology. I question a few of Morneau's choices in these matters. Certain things could have been done in more mnemonic and naturalistic ways, better serving Latejami's goals.

On the plus side, Morneau's phonemic inventory and orthography are both very sensible. He uses a standard five-vowel system and the following set of consonants:

Latejami consonant phonemes
Labial Alveolar Palatal Velar Glottal
Plosive unvoiced p t c  t͡ʃ k
voiced b d j d͡ʒ g
Fricative unvoiced f s x ʃ h
voiced v z q ʒ
Nasal m n
Lateral l
Rhotic r
Semivowel w y j

Phonetic diphthongs are treated as vowel-semivowel sequences. Phonotactics are very strict, permitting no more than one consonant plus an optional semivowel in onset position and, where coda is present, only nasals (N) and optionally semivowels (S) in coda position. Maximal syllable structure is thus CV(S)N, but words always end in a vowel or semivowel. The result is a language that resembles the stereotypical Niger-Congo language, say, Swahili, in its phonotactics. This isn't at all a bad thing. I'd perhaps be laxer on onsets, as I'll explain below, and tighter on codas. Diphthongs are fine, but codas such as that of loyn (/ojn/) are probably not cross-linguistically common enough to be necessary, as well as being subjectively ugly. The other phonotactical rules are all sound. As far as orthography goes -- and this is a really minor point of preference -- I wouldn't represent semivowels with consonant letters in coda. The most common practice is to use vowel letters; this is the convention in almost every natural language written in the Latin alphabet outside of Eastern Europe. Also, I think q is better for glottal stop than for /ʒ/, both in the sense of grapheme assignments and phonemes in the inventory. The phoneme /ʒ/ just isn't very important. In what major languages is it fully contrastive with /d͡ʒ/ or /zj/, not just contrastive in loanwords and/or restricted in occurrence? English, if only arguably; Polish, and not much else. Only the /v/-/w/ contrast appears to be rarer among major languages. Latejami has both /d͡ʒ/ and /zj/ (in at least one relatively common syllable, zyu), and this is enough. The glottal stop, on the other hand, is perfect for Latejami: it's very cross-linguistically common as a phoneme, and probably near-universal as an allophone; and the fact that it's not contrastive in the most widely spoken languages doesn't matter much, since Latejami is an a priori language.

But so far, so good; these are just quibbles. A bigger problem is that Latejami doesn't logically map consonants to morphological classes.

Latejami's self-segregation strategy depends on words being composed of certain types of morphemes in certain possible orders; morpheme types are distinguished by different groups of segments. Morneau's description of these classes is reproduced below:

() indicates that the enclosed item is optional
{} indicates that the enclosed item may appear zero or more times
[] indicates that the enclosed item must appear one or more times
| ::= logical or
V ::= any vowel ::= a | e | i | o | u
S ::= any semivowel ::= y | w
C ::= any consonant ::= b | c | d | f | g | j | k | l | m | n | p | q | r | s | t | v | x | z
	[The letter 'h' is reserved for anaphora ...]
C1 ::= modifier starter ::= b c d f j k q r t x z
	[q and r not used in native words]
C2 ::= classifier terminator ::= g l m p s v
C3 ::= suffix terminator ::= g m n p s v
[Note that C3 is any classifier terminator except l, which is reserved for prefixes and classifier terminators. C3 also includes n, which can never start a modifier (but can terminate one).]
...
N ::= vocalic-nucleus ::= [V]
...
prefix ::= l N (n)
...
suffix ::= N C3 | N m C | N n C
...
classifier ::= C1 N C2
...
modifier ::= C1 N (n)
...
root-morpheme ::= modifier | classifier
...
root ::= {modifier} classifier
...
POS ::= part-of-speech marker ::= a, e, aw, yu, etc
...
word = {prefix} + root + {suffix} + POS
anaphor ::= first-root-CN(n) + h + POS

I will attempt to put this in plain language. It will help to use Morneau's convention of curly braces around elements that may appear zero or more times.

We can unfold Morneau's morphological formula for the Latejami word into the following: {prefix} + {modifier} + classifier + {suffix} + POS marker. Every word has a classifier morpheme. Every word ends with a POS morpheme, which is always a vowel, with an optional on- or offglide. Since the part-of-speech vowels can occur in positions other than word-final, they are not sufficient for self-segregation. Indeed, vowels could be omitted entirely from Latejami's word-resolution algorithm; all that matters, abjad-like, is consonant strings.

The key to self-segregation is the penultimate morpheme in a word, which is either a classifier or a suffix. These morpheme classes are differentiated by having one or two final consonants, which must come from a particular, restricted set, the 'terminator' consonants. The classifier terminators are [g l m p s v], and the suffix terminators are [g m n p s v], plus any cluster of a nasal and another consonant. The presence of one of these elements signals a word break after the following syllable, unless the next consonant is also a suffix terminator. All other segments extend a word to the right. L is tricky due to its dual role as prefix initiator and classifier terminator, but any temporary ambiguity is always resolved by context. If the consonant before l is a terminator consonant, l starts a prefix. If the consonant before l is a non-terminator consonant, l terminates a classifier. Anyway, if we leave out l's prefix-initiator role, as well as a few other details like nasal codas, the picture of Latejami morphology becomes much clearer. A Latejami word, from an algorithmic point of view, is basically composed of consonants. These may be either from set A, [b c d f j k q r t x z], or set B, [g m n l p s v]. A word has the pattern {A}AB{B}, and every B-A juncture is a word boundary.
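The simplified picture can be sketched as a splitter that looks only at consonants. It follows the reduction above, ignoring l's prefix role and nasal codas, and it treats vowels and the semivowels as transparent. The function name and the test strings are invented; they are not real Latejami words.

```python
SET_A = set("bcdfjkqrtxz")  # consonants that extend a word to the right
SET_B = set("gmnlpsv")      # terminator consonants


def split_on_terminators(stream):
    """Split an unbroken stream of letters at every B-A juncture.

    A new word starts at any set-A consonant whose nearest preceding
    consonant belongs to set B. All other letters are passed through.
    """
    words, current, last = [], "", None
    for ch in stream:
        if ch in SET_A and last == "B":
            words.append(current)
            current = ""
        current += ch
        if ch in SET_A:
            last = "A"
        elif ch in SET_B:
            last = "B"
    if current:
        words.append(current)
    return words


print(split_on_terminators("bakopedasi"))    # ['bakope', 'dasi']
print(split_on_terminators("kabodimatepu"))  # ['kabodima', 'tepu']
```

The sketch does not check that each resulting word really has the shape {A}AB{B}; it only places the boundaries.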

It can be difficult for a new learner to remember which segments serve the crucial 'terminator' role. To recap, the single segments that act as terminators are [g l m n p s v]. It is not clear why Morneau chose these consonants for this class. They have no salient phonological features in common; they are not even an alphabetical grouping such as [m n p q r s t]. If Morneau had instead picked a natural class of consonants, or a union of natural classes, word resolution would likely be more easily intuited. For instance, the single-classifier-terminator segment class could have been [v z q l m n r] -- the set of sonorants plus the set of voiced continuants. Or it could be [t d s z n l r], the set of alveolar consonants; or [p k b g f h v], the set of peripheral (i.e. non-coronal) obstruents; or [f s x h v z q], the fricatives; etc. Perhaps Morneau wanted maximal phonetic variety among each morphological segment class; if so, it is not clear why.

For reasons of aesthetics, Latejami has an idiosyncratic system of stress. Stress is unnecessary for self-segregation, and strictly speaking is allophonic. However, it is not easily predictable either; it depends on the type and order of morphemes present in a word. Morneau gives four rules that together determine stress placement for any word. Rules 3 and 4 will give the reader a sense of the system:

If a word contains at least one modifier and one suffix, the suffix should be given primary (i.e., heavier) stress, and the modifier should be given secondary (i.e., lighter) stress.

If a word contains neither a modifier nor a suffix, then the final vowel of the classifier should be stressed.

Latejami's stress system is remarkably odd. Stress could have turned Latejami into one of the simplest engineered languages. Instead, it reinforces the language's morphological complexity and adds a layer of needless complexity on top of that! Consider that every word in Latejami is two syllables or more in length. The minimal word pattern is classifier + POS, i.e. CVC + V. This means that if Latejami had fixed penultimate stress, word boundaries would be totally unambiguous without even looking at individual segments. Words would continue up to the syllable after a stressed syllable. To illustrate, take a string, cvcvCVcvcvcvcvcvCVcvCVcv. ('cv' represents an unstressed syllable and 'CV' a stressed syllable.) The word boundaries must be as follows: cvcvCVcv cvcvcvcvCVcv CVcv. Of course, this would make stress phonemic, as well as de-correlate it from morphemic salience or prominence within a word, but I don't see why these outcomes are worth such convolutions to avoid. With phonemic stress, Latejami would no longer need different morphological segment classes at all, though they would still be worth having to aid the breaking down of unfamiliar words into morphemes.
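The penultimate-stress idea is simple enough to sketch in a few lines. The input format is the one used in the illustration above, two characters per syllable, with 'CV' stressed and 'cv' unstressed; the function name is made up.

```python
def split_penultimate(stream):
    """Split a syllable stream, assuming fixed penultimate stress.

    Each word ends on the syllable right after a stressed syllable.
    """
    syllables = [stream[i:i + 2] for i in range(0, len(stream), 2)]
    words, current, close_next = [], [], False
    for syllable in syllables:
        current.append(syllable)
        if close_next:
            # The syllable after the stressed one is word-final.
            words.append("".join(current))
            current, close_next = [], False
        elif syllable.isupper():
            close_next = True
    if current:
        raise ValueError("stream ends in the middle of a word")
    return words


print(split_penultimate("cvcvCVcvcvcvcvcvCVcvCVcv"))
# ['cvcvCVcv', 'cvcvcvcvCVcv', 'CVcv']
```

No segment classes are consulted at all, which is the point of the argument.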
