User:Selguha/Sandbox: Difference between revisions

From the Logical Languages Wiki
Revision as of 03:49, 28 May 2020

Comparison of orthographies

Orthographic representation of selected diaphonemes
English Esperanto Pinyin Malay Latejami Xorban Loglan Lojban Toaq
//ʔ// ∅ / k1 q . . 2
//t͡s// c c c
//d͡z// dz3 z4 z
//t͡ʃ// ch ĉ ch q5 c c ch
//d͡ʒ// j ĝ zh j j j j
//z// z z z z z z z
//ʃ// sh ŝ sh x sy x c c c sh
//ʒ// zh ĵ q j j j
//x// ĥ h kh x x x
//h// h h h h6 ’7 h ’ h
//w// w ŭ w / u w w w u u w
//j// y j y / i y y y i i y

1 Intervocalic glottal stop is implied when certain vowels appear back-to-back or doubled, such as in the word kemuliaan /kəmuli.aʔan/ 'glory; dignity'. Syllable-finally, glottal stop is written k.

2 Glottal stop is never written in Toaq, but does occur, at least phonetically, as the realization of the empty, or null, onset.

3 There is disagreement over whether Esperanto has a /d͡z/ phoneme. However, Kalocsay and Waringhien consider it to be one in their influential Plena Analiza Gramatiko de Esperanto (1985; p. 47).

4 For the sole purpose of orthographic comparison, we treat Standard Chinese's unaspirated stops as voiced consonants in this chart; Pinyin uses traditionally voiced consonant letters for these sounds.

5 Similarly, we have collapsed the Chinese retroflex and alveolopalatal series into the diaphonemic category of postalveolars. Admittedly this reflects the perceptual habits of the English speaker; it may be more intuitive to Chinese speakers to group the retroflex and alveolar sounds together as apicals. This would be less useful for the purposes of this chart, however.

6 The Latejami phoneme is allowed to vary between a dorsal fricative and a glottal fricative or stop; its main allophones can be inferred to be [ç], [x], [χ], [h] and [ʔ].

7 This sound is actually voiced, /ɦ/, with permitted allophones including [ɣ] and [ʁ].


Comparison of the IPA values of selected consonant letters
Grapheme English Pinyin Malay Latejami Xorban Loglan Lojban Toaq
θ h
. ʔ ʔ
c k / s t͡sʰ t͡ʃ t͡ʃ ʃ ʃ t͡sʰ
ch t͡ʃ t͡ʂʰ
h h x h h~x θ h h
j d͡ʒ t͡ɕ d͡ʒ d͡ʒ ʒ ʒ ʒ d͡ʑ
q t͡ɕʰ ʒ ʔ θ ŋ
sh ʃ ʂ ɕ
w w w w w w y w
x ks~gz ɕ ks ʃ x x x
y j j j j j ə ə j
z z t͡s z z z z z
zh ʒ t͡ʂ
Comparison of IPA values for consonant letters and digraphs in various languages
English Spanish Italian German Albanian Pinyin Malay Latejami Xorban Loglan Lojban Toaq
. ɦ ??? h
. ʔ ʔ
c k / s k / s~θ k / t͡ʃ k / t͡s t͡s t͡sʰ t͡ʃ t͡ʃ ʃ ʃ t͡sʰ
ch t͡ʃ ch k x t͡ʂʰ t͡ɕʰ
ç
dh ð
g g / d͡ʒ g~ɣ / x g / d͡ʒ g g g g g g g g
gh g / f~∅ g ɣ~x
gj ɟ~d͡ʑ
gn ɲ
h h / ∅ h / ː h x h h~x θ h h
j d͡ʒ x (j) j j d̥͡ʑ̥ d͡ʒ d͡ʒ ʒ ʒ ʒ d͡ʑ
kh (x) x
ll ʎ~ʝ ɫ
ng ŋ / ŋg ŋg ŋg ŋg ŋg ŋ ŋ ŋg ŋg ŋg ŋg
nj ɲ
ny ɲ
ñ (nj) ɲ
q c~t͡ɕ t͡ɕʰ ʒ ʔ θ ŋ
qu kw kw / k kw k / kv
r ɻ ɾ r ʁ / ɐ̯ ɾ ɻ~ʐ / ʵ r r r r r ɾ
rr r r
s s s s z s s s s s s s s
sc sk / s sk sk / ʃ ???
sch (ʃ) ʃ
sh ʃ ʃ ʂ ɕ
sy ʃ
ß s
th θ / ð t θ
v v b~β v f v v~f v v v v
w w (w) (w) v w w w w y w
x ks~gz / z ks~gz ks~gz? ks d͡z ɕ ks ʃ x x x
xh d͡ʒ
y j ʝ / i̯ (j) ??? y j j j j ə ə j
z z s~θ t͡s~d͡z t͡s z d̥͡z̥ z z z z z
zh ʒ ʒ d̥͡ʐ̥

Phonemic inventories for inspiration for new Ithkuiloids

Consonant phonemes of Ubykh, plus some others
Labial Alveolar Postalveolar Palatal Velar Uvular Epiglottal Glottal
central lateral laminal
closed
laminal apical
plain pal. phar. plain lab. plain lab. plain lab. plain lab. plain lab. plain lab. pal. plain lab. pal. plain lab. phar. phar. & lab. plain lab. pal. plain lab.
Plosive voiceless p t k q qˤʷ ʡ ʡʷ ʔʲ ʔ ʔʷ
voiced b d ɡʲ ɡ ɡʷ ɢʲ ɢ ɢʷ ɢˤ ɢˤʷ
ejective pʲʼ pˤʼ tʷʼ kʲʼ kʷʼ qʲʼ qʷʼ qˤʼ qˤʷʼ
Affricate voiceless t͡s t͡sʷ t͡ɬ t͡ɬʷ t̠͡ʃ t̠͡ʃʷ ȶ͡ɕ ȶ͡ɕʷ ʈ͡ʂ ʈ͡ʂʷ
voiced d͡z d͡zʷ d͡ɮ d͡ɮʷ d̠͡ʒ d̠͡ʒʷ ȡ͡ʑ ȡ͡ʑʷ ɖ͡ʐ ɖ͡ʐʷ
ejective t͡sʼ t͡sʷʼ t͡ɬʼ t͡ɬʷʼ t̠͡ʃʼ t̠͡ʃʷʼ ȶ͡ɕʼ ȶ͡ɕʷʼ ʈ͡ʂʼ ʈ͡ʂʷʼ
Fricative voiceless f s ɬ ɬʷ ʃ ʃʷ ɕ ɕʷ ʂ ʂʷ x χʲ χ χʷ χˤ χˤʷ ʜ ʜʷ h
voiced v z ɮ ɮʷ ʒ ʒʷ ʑ ʑʷ ʐ ʐʷ ɣʲ ɣ ɣʷ ʁʲ ʁ ʁʷ ʁˤ ʁˤʷ
ejective ɬʼ xʲʼ χʼ
Nasal m n ȵ ȵʷ ɳ ɳʷ ŋʲ ŋ ŋʷ
Approximant w l j ɥ
Trill r
Consonant phonemes of Naxi, plus some others & minus /ɥ/
Labial Dental/Alveolar Retroflex Palatal Velar Glottal
Plosive voiceless p t ʈ c k ʔ
aspirated ʈ
voiced b d ɖ ɟ ɡ
prenasalized ᵐb ⁿd ᶯɖ ᶮɟ ᵑɡ
Affricate voiceless ts ʈʂ
aspirated tsʰ ʈʂʰ tɕʰ
voiced dz ɖʐ
prenasalized ⁿdz ᶯɖʐ ⁿdʑ
Fricative voiceless f s ʂ ɕ x h
voiced v z ʐ ʑ ɣ
Nasal m n ɳ ɲ ŋ
Lateral approximant l ɭ ʎ
Flap or trill r ɽ
Semivowel w j

A mild critique of Latejami (draft)

Rick Morneau's Latejami is one of the most complete and ingenious languages ever constructed. On the whole, it seems to be a remarkable success at being what it sets out to be: an easily learnable and easily speakable intermediary language for machine translation, capable of making translation from any source language straightforward and translation into any target language 'almost trivially easy'. If it has not been utilized as such, that would seem to reflect less on the language itself than on the direction that machine translation has taken since Latejami's publication -- or perhaps it's just a simple case of bad luck and undeserved obscurity. I'm unqualified to assess most of Morneau's work in the Latejami reference grammar, which as the title suggests deals with lexical semantics, other than to comment that it is rigorous, awesomely detailed and worthy of study by every conlanger. However, I can weigh in on a couple of the less important components of the language: morphology and phonology. I question a few of Morneau's choices in these matters. Certain things could have been done in more mnemonic and naturalistic ways, better serving Latejami's goals.

On the plus side, Morneau's phonemic inventory and orthography are both very sensible. He uses a standard five-vowel system and the following set of consonants:

Latejami consonant phonemes
Labial Alveolar Palatal Velar Glottal
Plosive unvoiced p t c /t͡ʃ/ k
voiced b d j /d͡ʒ/ g
Fricative unvoiced f s x /ʃ/ h
voiced v z q /ʒ/
Nasal m n
Lateral l
Rhotic r
Semivowel w y /j/

Phonetic diphthongs are treated as vowel-semivowel sequences. Phonotactics are very strict, permitting no more than one consonant plus an optional semivowel in onset position and, where a coda is present, only nasals (N), optionally preceded by a semivowel (S), in coda position. Maximal syllable structure is thus C(S)V(S)(N), but words always end in a vowel or semivowel. The result is a language whose phonotactics resemble those of the stereotypical Niger-Congo language, say, Swahili. This isn't at all a bad thing. I'd perhaps be laxer on onsets, as I'll explain below, and tighter on codas. Diphthongs are fine, but codas such as that of loyn (/ojn/) are probably not cross-linguistically common enough to be necessary, as well as being subjectively ugly. The other phonotactic rules are all sound.

As far as orthography goes -- and this is a really minor point of preference -- I wouldn't represent semivowels with consonant letters in coda. The most common practice is to use vowel letters; this is the convention in almost every natural language written in the Latin alphabet outside of Eastern Europe.

Also, I think q would be better used for the glottal stop than for /ʒ/, both as a grapheme assignment and as a choice of phoneme for the inventory. The phoneme /ʒ/ just isn't very important. In what major languages is it fully contrastive with /d͡ʒ/ or /zj/, not just contrastive in loanwords and/or restricted in occurrence? English, if only arguably; Polish, and not much else. Only the /v/-/w/ contrast appears to be rarer among major languages. Latejami has both /d͡ʒ/ and /zj/ (in at least one relatively common syllable, zyu), and this is enough. The glottal stop, on the other hand, is perfect for Latejami: it's very cross-linguistically common as a phoneme, and probably near-universal as an allophone; and the fact that it's not contrastive in the most widely spoken languages doesn't matter much, since Latejami is an a priori language.

But so far, so good; these are just quibbles. A bigger problem is that Latejami doesn't logically map consonants to morphological classes.

Latejami's self-segregation strategy depends on words being composed of certain types of morphemes in certain possible orders; morpheme types are distinguished by different groups of segments. Morneau's description of these classes is reproduced below:

() indicates that the enclosed item is optional
{} indicates that the enclosed item may appear zero or more times
[] indicates that the enclosed item must appear one or more times
| ::= logical or
V ::= any vowel ::= a | e | i | o | u
S ::= any semivowel ::= y | w
C ::= any consonant ::= b | c | d | f | g | j | k | l | m | n | p | q | r | s | t | v | x | z
	[The letter 'h' is reserved for anaphora ...]
C1 ::= modifier starter ::= b c d f j k q r t x z
	[q and r not used in native words]
C2 ::= classifier terminator ::= g l m p s v
C3 ::= suffix terminator ::= g m n p s v
[Note that C3 is any classifier terminator except l, which is reserved for prefixes and classifier terminators. C3 also includes n, which can never start a modifier (but can terminate one).]
...
N ::= vocalic-nucleus ::= [V]
...
prefix ::= l N (n)
...
suffix ::= N C3 | N m C | N n C
...
classifier ::= C1 N C2
...
modifier ::= C1 N (n)
...
root-morpheme ::= modifier | classifier
...
root ::= {modifier} classifier
...
POS ::= part-of-speech marker ::= a, e, aw, yu, etc
...
word = {prefix} + root + {suffix} + POS
anaphor ::= first-root-CN(n) + h + POS

I will attempt to put this in plain language. It will help to use Morneau's convention of curly braces around elements that may appear zero or more times.

We can unfold Morneau's morphological formula for the Latejami word into the following: {prefix} + {modifier} + classifier + {suffix} + POS marker. Every word has a classifier morpheme. Every word ends with a POS morpheme, which is always a vowel, with an optional on- or offglide. Since the part-of-speech vowels can occur in positions other than word-final, they are not sufficient for self-segregation. Indeed, vowels could be omitted entirely from Latejami's word-resolution algorithm; all that matters, abjad-like, is consonant strings.
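As a rough illustration, the unfolded formula can be written as a regular expression. This is only a sketch of the rules as reproduced above: the elided parts of Morneau's grammar are ignored, the POS pattern (one vowel with an optional on- or offglide) is an assumption taken from the description here rather than from the full POS list, and the names `WORD` and `fits_word_shape` are invented for the example.

```python
import re

# Segment classes as given in the reproduced grammar.
V = "aeiou"
C = "bcdfgjklmnpqrstvxz"   # any consonant (h is reserved for anaphora)
C1 = "bcdfjkqrtxz"         # modifier starters
C2 = "glmpsv"              # classifier terminators
C3 = "gmnpsv"              # suffix terminators

N = f"[{V}]+"                             # vocalic nucleus: one or more vowels
PREFIX = f"l{N}n?"                        # prefix     ::= l N (n)
MODIFIER = f"[{C1}]{N}n?"                 # modifier   ::= C1 N (n)
CLASSIFIER = f"[{C1}]{N}[{C2}]"           # classifier ::= C1 N C2
SUFFIX = f"{N}(?:[{C3}]|[mn][{C}])"       # suffix     ::= N C3 | N m C | N n C
POS = f"[yw]?[{V}][yw]?"                  # ASSUMPTION: vowel with optional glide

# word = {prefix} + {modifier} + classifier + {suffix} + POS
WORD = re.compile(
    f"(?:{PREFIX})*(?:{MODIFIER})*{CLASSIFIER}(?:{SUFFIX})*{POS}"
)

def fits_word_shape(s: str) -> bool:
    """Return True if s matches the simplified word shape above."""
    return WORD.fullmatch(s) is not None

print(fits_word_shape("latejami"))  # la + te + jam + i
print(fits_word_shape("gami"))      # g cannot start a root morpheme
```

Under these assumptions the language's own name, lowercased, is accepted (one reading: prefix la, modifier te, classifier jam, POS i), while a made-up string such as gami is rejected because g is not a modifier starter.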

The key to self-segregation is the penultimate morpheme in a word, which is either a classifier or a suffix. These morpheme classes are differentiated by having one or two final consonants, which must come from a particular, restricted set, the 'terminator' consonants. The classifier terminators are [g l m p s v], and the suffix terminators are [g m n p s v], plus any cluster of a nasal and another consonant. The presence of one of these elements signals a word break after the following syllable, unless the next consonant is also a suffix terminator. All other segments extend a word to the right. The letter l is tricky due to its dual role as prefix initiator and classifier terminator, but any temporary ambiguity is always resolved by context: if the consonant before l is a terminator consonant, l starts a prefix; if it is a non-terminator consonant, l terminates a classifier.

If we leave out l's prefix-initiator role, as well as a few other details like nasal codas, the picture of Latejami morphology becomes much clearer. A Latejami word, from an algorithmic point of view, is basically composed of consonants. These may be either from set A, [b c d f j k q r t x z], or set B, [g m n l p s v]. A word has the pattern {A}AB{B}, and every B-A juncture is a word boundary.
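The simplified picture, in which a word is {A}AB{B} and every B-A juncture is a boundary, can be sketched in a few lines. This deliberately covers only the reduced model: it ignores l's prefix-initiator role, nasal codas and the anaphoric h, and it does not check that each word is otherwise well formed. The function name `split_words` and the test strings are invented for illustration and are not attested Latejami words.

```python
# Consonant sets from the simplified model.
SET_A = set("bcdfjkqrtxz")   # word-extending consonants
SET_B = set("gmnlpsv")       # terminator consonants

def split_words(stream: str) -> list[str]:
    """Split an unspaced letter stream at every B-A juncture.

    Vowels and semivowels are skipped over; only consonants drive the split.
    The boundary is placed immediately before the A consonant that follows
    a run of one or more B consonants.
    """
    words = []
    start = 0
    seen_b = False
    for i, ch in enumerate(stream):
        if ch in SET_B:
            seen_b = True
        elif ch in SET_A and seen_b:
            words.append(stream[start:i])   # close the word before this A
            start = i
            seen_b = False
    if start < len(stream):
        words.append(stream[start:])        # whatever remains is the last word
    return words

print(split_words("jamitekapadose"))
```

For the invented stream jamitekapadose this yields jami, tekapa and dose: each word runs until the first set-A consonant that follows a set-B consonant.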

It can be difficult for a new learner to remember which segments serve the crucial 'terminator' role. To recap, the single segments that act as terminators are [g l m n p s v]. It is not clear why Morneau chose these consonants for this class. They have no salient phonological features in common; they are not even an alphabetical grouping such as [m n p q r s t]. If Morneau had instead picked a natural class of consonants, or a union of natural classes, word resolution would likely be easier to intuit. For instance, the class of single-segment terminators could have been [v z q l m n r] -- the set of sonorants plus the set of voiced continuants. Or it could have been [t d s z n l r], the set of alveolar consonants; or [p k b g f h v], the set of peripheral (i.e. non-coronal) obstruents; or [f s x h v z q], the fricatives; etc. Perhaps Morneau wanted maximal phonetic variety within each morphological segment class; if so, it is not clear why.

For reasons of aesthetics, Latejami has an idiosyncratic system of stress. Stress is unnecessary for self-segregation, and strictly speaking is allophonic. However, it is not easily predictable either; it depends on the type and order of morphemes present in a word. Morneau gives four rules that together determine stress placement for any word. Rules 3 and 4 will give the reader a sense of the system:

If a word contains at least one modifier and one suffix, the suffix should be given primary (i.e., heavier) stress, and the modifier should be given secondary (i.e., lighter) stress.

If a word contains neither a modifier nor a suffix, then the final vowel of the classifier should be stressed.

Latejami's stress system is remarkably odd. Stress could have turned Latejami into one of the simplest engineered languages. Instead, it reinforces the language's morphological complexity and adds a layer of needless complexity on top of that! Consider that every word in Latejami is two syllables or more in length. The minimal word pattern is classifier + POS, i.e. CVC + V. This means that if Latejami had fixed penultimate stress, word boundaries would be totally unambiguous without even looking at individual segments. Each word would continue up to the syllable after its stressed syllable. To illustrate, take a string, cvcvCVcvcvcvcvcvCVcvCVcv. ('cv' represents an unstressed syllable and 'CV' a stressed syllable.) The word boundaries must be as follows: cvcvCVcv cvcvcvcvCVcv CVcv. Of course, this would make stress phonemic, as well as de-correlate it from morphemic salience or prominence within a word, but I don't see why these outcomes are worth such convolutions to avoid. With phonemic stress, Latejami would no longer need different morphological segment classes at all, though they are still worth having to aid the breaking down of unfamiliar words into morphemes.
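The penultimate-stress idea is easy to check mechanically. The sketch below assumes the notation used in the illustration (two-character syllables, 'CV' stressed and 'cv' unstressed) and well-formed input in which every word has exactly one stressed syllable, in penultimate position; the function name `split_by_penultimate_stress` is invented for the example.

```python
def split_by_penultimate_stress(stream: str) -> str:
    """Insert word breaks after the syllable that follows each stressed one.

    The stream is read as two-character syllables: 'CV' is stressed,
    'cv' is unstressed. Returns the words joined by single spaces.
    """
    syllables = [stream[i:i + 2] for i in range(0, len(stream), 2)]
    words = []
    current = []
    close_next = False            # True once the stressed syllable was seen
    for syl in syllables:
        current.append(syl)
        if close_next:
            # This is the final syllable of the word: close it here.
            words.append("".join(current))
            current = []
            close_next = False
        elif syl.isupper():
            close_next = True
    if current:
        words.append("".join(current))   # trailing, incomplete material
    return " ".join(words)

print(split_by_penultimate_stress("cvcvCVcvcvcvcvcvCVcvCVcv"))
# cvcvCVcv cvcvcvcvCVcv CVcv
```

No segment classes are consulted at all; the stress pattern alone recovers the three words of the illustration.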

References