The Children of Planet Earth

Chapter 25: Essay - Straight from the Horse's Mouth

STRAIGHT FROM THE HORSE’S MOUTH:
A DISCOURSE ON THE HARMONIC VOICE

Dr. Adam Somerset & Ms. Twilight Sparkle
New Tacoma colony, Rhysling
1997-08-09

Abstract

On 30 June 1997 (ship time, to account for any general-relativity differences), the interstellar colony ship Zodiac-Altair arrived in orbit of a world mankind named Rhysling. Per Commander Louis Darcy’s orders, a probe was dropped onto the lush surface, but contact was lost almost immediately. While exhausting every means, short of physical intervention, of regaining contact with the equipment and retrieving its data, the crew serendipitously discovered evidence of an extraterrestrial civilization on the surface.

Thus Commander Darcy sent the author alone to the surface, to make contact with the Rhyslinger Indigenous and document their language in a way that the rest of the Zodiac-Altair crew could learn. This process was not an easy one, considering how closely language and culture are tied to one another, not to mention the assumption of a complete and utter lack of commonality between their language and those on Earth.

That assumption proved correct in some areas, and misguided in others. After discovering their language is articulated orally, with phonemes produced by air moved by a diaphragm through the oral and nasal cavities, it became clear that, in terms of mechanics, it would be possible for a human to speak as the Indigenous do, and vice versa – an amazing discovery, considering the beings with which we were communicating were largely equine in appearance.

In terms of grammar, however, the language proved a challenge for the author to master. It took a natively-printed codex, along with guidance from a native speaker (who insisted on being taught English), to master the Indigenous language – which the author later came to learn calls itself Ơhqer (may also be spelled “Eochqer”), or literally the Harmonic Voice (ơh [ɤx], “harmony” + ner [neɹ], “voice,” “speech/language”). A fellow colonist humorously suggested “Equestrian,” and against better judgment, this paper will use that term.

[Fitting, then, that the equines’ nation should call itself Ơhesti (“Eochesti”), or literally the Harmonic Empire (ơh + esti [esˈti], “state,” “nation,” “empire”), but naturally the author digresses.]

After weeks of practice with reading, writing, and speaking the Voice, the author was able to negotiate his own Imperial citizenship and a colony site for the rest of Zodiac-Altair – a site later dubbed New Tacoma, after the author’s birthplace. He was also tasked with providing an easier avenue for the rest of the colony, and mankind at large, to learn Equestrian. This paper attempts to serve that introductory purpose; supplementary materials to further mankind’s understanding of the language, and any necessary errata should the need ever arise, are all to be produced at later dates.

Documented here are the language’s phonology, grammar, dialectology, writing system, and other miscellanea that the author feels the reader ought to know to further his or her fluency in the language, all checked against a native Equestrian speaker, one Ãtir Ḷsapa [ɑ̃ˈtiɹ l̩sɑˈpɑ], alias Twilight Sparkle, whose contributions before and during this paper proved major enough to warrant her coäuthor status here.

1 Introduction

Equestrian is an extraterrestrial language spoken by about ten million sapient equine analogues (hereafter “ponies”) who compose the Harmonic Empire on Rhysling. Genetic relationships with other Rhyslinger languages have yet to be determined. Most Imperial citizens are monolingual in Equestrian, only using other languages when interpreting with other political bodies.

A few phonological and grammatical principles remain, despite the language being open to borrowing words from other languages. Indeed, it has not been determined yet if Equestrian is a creole of multiple other languages. It is also unknown at this time where the language originated within the Empire, or indeed if the language is native to the region.

However, as the Empire comprises the largest and most powerful native political body on the planet, it would be pertinent for the time being at least to use Equestrian in communications with the Empire, then rely on Imperial interpreters to communicate with the rest of Rhysling.

2 Phonology

Equestrian has a set of fourteen vowels – seven oral and seven nasal – and two syllabic consonants, neither of which can occur nasally. Given the language’s total lack of polyphthongs, this gives a total of sixteen possible syllable nuclei.

Figure 1a: The Equestrian vowels.

Figure 1b: The Equestrian syllabic consonants.

Equestrian also has a perfectly regular consonant series.

Figure 1c: The Equestrian consonants.

Differences in phonology are detailed further in Section 4. Please note that Equestrian does have a native writing system, details of which are found in Section 5; for ease of readability, this paper will use the Romanization system shown in the phonological charts presented ante, along with any IPA transcriptions of newly-introduced Equestrian words.

No length distinction is made in either the consonants or the vowels; even allophonic length is absent from the language. For example, dḷgãgru [dl̩ɡɑ̃ɡˈɹu] (“I need to go”), which combines dḷgã [dl̩ˈɡɑ̃] + ak [ɑk], drops the vowel from ak, assimilating the second root into the first; this sandhi process appears to be present even in slow, careful speech. Further, the final vowel of the first root determines whether the surviving vowel is oral or nasal, simply by remaining in the state in which it is: a theoretical word dḷgagru, combining dḷga + ãk, would keep the oral vowel through the vowel sandhi.

In fact, any phonetic length appears to be in free variation, e.g. [ɑ], [ɑː], [ɑːː], and even [ɑ̆] are all phonemically /a/. Why this should be so is not certain at this time, but it is possible it has always been this way. My current working theory is that lacking a length distinction lets the language work better in song (see Section 5).

2.1 Phonological Harmonies

Equestrian vowels and consonants work in strict harmonies, both with rules that can never be violated; even loanwords are made to comply with them.

Each harmony has ‘dominant,’ ‘recessive,’ and ‘neutral’ categories, each with its own writing rules. The ‘dominant’ harmonies are assumed to be the default, and need no special markings. The ‘recessive’ harmonies require special diacritics to indicate the harmony; they only need to be marked once per word, and only on the first relevant glyph. The ‘neutral’ harmony does not influence the harmony in either direction, but has every vowel explicitly marked in writing.*

*However, since a wholly neutral word does not have the ‘recessive’ diacritic(s) either, it is considered part of the ‘dominant’ harmony as well.

As Equestrian presents an agglutinative grammar, there arises the question of how the harmonies are applied in compound words. The harmonies in a given word are determined by the first root; even prefixes comply with the first root. This means that even a ‘recessive’ harmony can influence its corresponding ‘dominant’ harmony to comply, if the first root has the recessive harmony.

Vowels are divided along a rounding harmony, into these categories: ‘solar,’ ‘lunar,’ and ‘rainy.’ ‘Solar’ vowels are unrounded, and serve as the dominant harmony. ‘Lunar’ vowels are rounded, and serve as the recessive harmony. ‘Rainy’ vowels, while unrounded, are neutral.

Figure 2a: The Equestrian vowel harmony.

Consonants are divided along a voicing harmony, into these categories: ‘aquatic,’ ‘terrestrial,’ and again ‘rainy.’ ‘Aquatic’ consonants are voiced, and serve as the dominant harmony. ‘Terrestrial’ consonants are voiceless, and serve as the recessive harmony. ‘Rainy’ consonants, while voiced, are neutral.

Figure 2b: The Equestrian consonant harmony.

An astute reader might notice that the neutral consonants and vowels not only share the same term, but also almost perfectly reflect each other. Section 5 further emphasizes this relationship, while Section 4 presents a few dialectal deviations from this pattern.

2.2 Phonotactics

Equestrian phonotactics are quite simple, albeit not without their complexities. A few rules exist in the language, and loanwords do not appear to be exempt from any of them.

The syllable structure is (C)V(C). Any consonant can serve as an onset or a coda; the nucleus can be any vowel, oral or nasal, or either syllabic consonant. As stated before, no diphthongs, triphthongs, etc. exist in Equestrian; for instance, any falling diphthongs ending in [i̯] are really analyzed as a sequence of a vowel followed by [j] as the syllable coda. If a loanword has a cluster of more than two consonants, or if it has two consonants not straddling a syllable break, one or more [e] or [ɑ], depending on the vowel harmony (see Section 2.1, ante), are inserted to break up the cluster.

The primary stress always falls upon the final syllable of a word, unless it is a monosyllable, in which case it is not stressed at all. Only a single tone exists in Equestrian: a simple rising tone, placed upon the final syllable. Unlike the primary stress, it also applies to monosyllables. It does not matter whether the tone is [˩˥], [˩˧], [˧˥], or any other rising tone; it only matters that the tone ends higher than it started. Consequently, it may be encoded in IPA transcriptions with the tone diacritic [◌̌], whose ambiguity perfectly suits Equestrian’s purposes. (See Section 3.8 for its grammatical purpose.)

Corresponding ‘rainy’ vowels and consonants do not appear next to each other. Which is to say: [i] and [ĩ] do not appear adjacent to [j]; [ɹ̩] does not appear adjacent to [ɹ]; and [l̩] does not appear adjacent to [l].
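For readers who wish to check transcriptions mechanically, the adjacency rule above can be sketched as a small Python check over Romanized words. This is only an illustration, not part of the Imperial tradition: the function name and the pair table are my own assumptions, it works on the Romanization (y for [j], r for [ɹ]) rather than on IPA, and it does not treat digraphs such as dj or lj specially.

```python
import unicodedata

# 'Rainy' nucleus/consonant pairs from the rule above, in Romanization.
# The table and the function name are illustrative assumptions.
RAINY_PAIRS = [("i", "y"), ("ĩ", "y"), ("ṛ", "r"), ("ḷ", "l")]


def _nfc(text: str) -> str:
    """Normalize so that letters like ṛ or ĩ are single code points."""
    return unicodedata.normalize("NFC", text)


def rainy_violations(word: str) -> list[str]:
    """Return every adjacent letter pair that breaks the 'rainy' rule."""
    text = _nfc(word.lower())
    forbidden = set()
    for nucleus, consonant in RAINY_PAIRS:
        # The rule is symmetric: neither order may occur.
        forbidden.add(_nfc(nucleus) + _nfc(consonant))
        forbidden.add(_nfc(consonant) + _nfc(nucleus))
    return [text[i:i + 2] for i in range(len(text) - 1)
            if text[i:i + 2] in forbidden]
```

An empty list means the word passes; for instance, ṛyli and izaymã both pass, whereas a hypothetical sequence such as iye would be flagged.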

Nasal consonants do not assimilate position based on the following consonant.

3 Grammar

Equestrian grammar, while paralleling many Terrestrial languages, is markedly different from most. This section will explain how the language functions. The reader should take note that it will assume one specific dialect (traditionally associated with the farming caste); any dialectal deviations from this norm are listed in Section 4. Additionally, to simplify this paper, the affixes will all use the ‘solar’ and ‘aquatic’ phonological harmonies.

3.1 Noun Classes

Equestrian nouns are sorted into three classes, of which ‘animate,’ ‘magical,’ and ‘inert’ are the closest English translations for the language’s internal terminology. These are not to be confused with ‘masculine,’ ‘feminine,’ and ‘neuter,’ as nouns are not divided along gender lines per se; in fact, sexual differences are not expressed anywhere in Equestrian grammar at all. Another thing to note is that the definitions given here are not perfectly strict, and exceptions abound – either through cultural definitions, or from a possible older noun-class system; the true reason is not known at this time.

‘Animate’ nouns, as the term suggests, include living organisms, including the ponies themselves, and geographical locations. Some inanimate nouns are included in this list, such as ṛyli ([ɹ̩jˈli], “rain”), whose natural fall is seen as a force of life itself, rather than gravity forcing the water down (despite modern Imperial science and technology having perfected weather manipulation and control).

‘Magical’ nouns are a little harder to describe, as their definitions are clearly and deeply tied into Imperial culture. By and large they are composed of items one might consider ‘animate’ or ‘inert’ instead, along with celestial objects both local and distant, events, and any technology with which the ponies are not familiar, whether through sheer novelty or because it was introduced by mankind. It would seem Arthur C. Clarke’s Third Law best describes this last point: “Any sufficiently advanced technology is indistinguishable from magic.” Perhaps in that vein, some of these items truly are magical, as their underlying mechanics demand further human scientific study to understand.

‘Inert’ nouns are the miscellanea class: anything that does not fall into either previous class. Certain items we know to be ‘animate’ are classified as inert, e.g. microörganisms.

Important to note is that these noun classes are not marked on the nouns themselves at all, which suggests the noun-class system may have been invented, either by the Imperial state, or in a single location from which it organically spread. Noun classes instead are marked on the verb, according to the class of the agent (the class of the object, if any, being irrelevant).

3.2 Noun Case Declension

Nouns have a simple case declension system, composed of merely six cases (although some additional cases are detailed in Section 4): ergative, absolutive, genitive, dative, comitative, and prepositional. Note that these case endings do not change with the noun’s class, but may change based both on consonant and vowel harmony and according to any conflicting ‘rainy’ vowels and consonants.

The ergative (null) and absolutive (-ley [ˈlej]) cases also illustrate Equestrian’s morphosyntactic alignment, although the ergative case is very rarely invoked, save for a few special circumstances, including official government speeches and where the agent needs to be indicated.

The genitive and dative cases use the endings -zḷ [ˈzl̩] and -we [ˈɣe].

The presence of a comitative case (-ez [ˈez]) is unusual, especially since the language’s instrumental case, by contrast, is strictly dialectal. The presence of the comitative case in all dialects appears to reflect the close-knit social structure in the Empire’s culture.

The prepositional case (-il [ˈil]) is a ‘catch-all’ for any other cases that are expressed by a particle placed before the noun. This includes any cases that are also expressed as an affix in other dialects.

3.3 Noun Numbers

Nouns decline for four numbers: singular, dual, paucal, and plural. The suffixes are -imẽ [iˈmẽ] for the dual, -ize [iˈze] for the paucal, and -ye [ˈje] for the plural. The singular is unmarked. Each number suffix is placed directly after the noun but before any applicable case suffix.

Occasionally, due to the ‘rainy’ sound rule, these endings would have to be changed. If the noun ends in either syllabic consonant, [ɹ̩] or [l̩], the dual suffix changes to -ymẽ [jˈmẽ] and the paucal suffix changes to -yze [jˈze]. If the noun ends in [i] or [ĩ], the suffixes are assimilated as -mẽ [ˈmẽ] for the dual, -ze [ˈze] for the paucal, and -e [ˈe] for the plural.
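The suffix selection described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function and table names are my own, the forms are Romanized and assume ‘solar’ and ‘aquatic’ harmony throughout (so harmony-adjusted forms are out of scope), and the treatment of stems ending in any other vowel is an inference from the second-person pronoun forms in Section 3.4 (vẽymẽ, vẽyze, vẽye) rather than a rule stated here.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize so that letters like ṛ or ẽ are single code points."""
    return unicodedata.normalize("NFC", text)


SYLLABIC = tuple(nfc(c) for c in ("ṛ", "ḷ"))
RAINY_VOWELS = tuple(nfc(c) for c in ("i", "ĩ"))
# Assumption: the remaining oral and nasal vowels of the Romanization.
OTHER_VOWELS = tuple(nfc(c) for c in "aeouươãẽõũữỡ")

SUFFIXES = {
    # Default set, used after ordinary consonants.
    "default": {"dual": "imẽ", "paucal": "ize", "plural": "ye"},
    # After a syllabic consonant the initial i becomes y.
    "glide": {"dual": "ymẽ", "paucal": "yze", "plural": "ye"},
    # After i or ĩ the initial i/y is absorbed into the stem vowel.
    "absorbed": {"dual": "mẽ", "paucal": "ze", "plural": "e"},
}


def number_form(noun: str, number: str) -> str:
    """Attach the number suffix; the singular is left unmarked."""
    stem = nfc(noun)
    if number == "singular":
        return stem
    if stem.endswith(RAINY_VOWELS):
        table = "absorbed"
    elif stem.endswith(SYLLABIC) or stem.endswith(OTHER_VOWELS):
        # The vowel-final branch is inferred from Section 3.4, not stated.
        table = "glide"
    else:
        table = "default"
    return stem + nfc(SUFFIXES[table][number])
```

As a usage example, number_form("mưl", "dual") yields mưlimẽ and number_form("ṛyli", "plural") yields ṛylie; any case suffix would then follow the result.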

The paucal number covers sets of three, four, five, or six items. This is a strict grammatical rule, one suspected to be tied into Imperial culture.

3.4 Pronouns

Equestrian recognizes five different pronouns, three of which are divided into the four numbers as detailed in Section 3.3, ante. These are the first person, the second person, the third person, the indefinite (also known as the zero person), and the reflexive.

The first-person singular absolutive pronoun is ṛs [ɹ̩s]. Its dual form is ṛsiþẽ [ɹ̩siˈᵑʘe], its paucal ṛsise [ɹ̩siˈse], and its plural ṛsye [ɹ̩sˈje]. Ṛsiþẽ functions as the ‘royal We’ in older and official speech.

The second-person singular absolutive pronoun is vẽ [ʙẽ]. Its dual form is vẽymẽ [ʙẽjˈmẽ], its paucal vẽyze [ʙẽjˈze], and its plural vẽye [ʙẽˈje].

The third-person singular absolutive pronoun is mưl [mɯl]. Its dual form is mưlimẽ [mɯliˈmẽ], its paucal mưlize [mɯliˈze], and its plural mưlye [mɯlˈje].

The indefinite absolutive pronoun is wo [ɣo]. The reflexive absolutive pronoun is ezeg [eˈzeɡ].

3.5 Numerals

Equestrian works on a senary counting system. Body counting is done first on the knees, then on the ears. Sãlu [sɑ̃ˈlu] indicates one item, iþã [iˈᵑʘɑ̃] two items, kurso [kuɹˈso] three, deñe [deˈŋe] four, uru [uˈɹu] five, and iza [iˈzɑ] technically six, but in the context of the language it is best understood as ‘ten’ items. Numbers beyond 10₆ combine roots in a specific order, in a fashion not too dissimilar from Japanese: 11₆ is izazãlu [izɑzɑ̃ˈlu] (note the consonant-harmony compliance), 12₆ is izaymã [izɑjˈmɑ̃], 20₆ is iþãysa [iᵑʘɑ̃jˈsɑ], and so forth. 100₆ is ayla [ɑjˈlɑ], and 1000₆ is awidṛ [ɑɣiˈdɹ̩].

An astute reader can note the parallels between the numerals and the grammatical numbers, which make me suspect a strong conscious influence from a governing body. One should note that awidṛ does not have an equivalent grammatical number, but it is also used to mean ‘countless,’ in a similar vein to Greek μῡρίος (mȳríos, ‘ten thousand’). For most day-to-day usage, these numbers will suffice; however, for actual mathematical usage, which frequently exceeds these basic limitations, awidṛ can be partially reduplicated, e.g. awidṛwidṛ [ɑɣidɹ̩ɣiˈdɹ̩] for ‘1,000,000₆.’ This technique does not appear to have an upper limit save for practicality.
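Since human readers will mostly think in decimal, a small conversion aid may help when reading the senary figures in this section. The following Python sketch is my own illustration (the function names are assumptions); it converts between decimal integers and senary digit strings, and says nothing about how the Equestrian number words themselves are composed.

```python
def to_senary(value: int) -> str:
    """Write a non-negative decimal integer as a string of senary digits."""
    if value < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    if value == 0:
        return "0"
    digits = []
    while value:
        value, digit = divmod(value, 6)  # peel off the lowest senary digit
        digits.append(str(digit))
    return "".join(reversed(digits))


def from_senary(text: str) -> int:
    """Read a senary digit string (commas allowed) as a decimal integer."""
    return int(text.replace(",", ""), 6)
```

For example, to_senary(7) gives "11" (izazãlu), to_senary(36) gives "100" (ayla), and from_senary("1,000,000") gives 46656, which is 216 squared, matching the reduplicated awidṛwidṛ.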

The Empire heavily favors this counting system, primarily for its mathematical advantages over decimal, but applies it as well to all their weights and measures, including timekeeping and datekeeping, upon which I will elaborate further in the following subsections. It may initially appear to be a digression, but the reader should bear in mind that this facet of their civilization is important to recognize when dealing with the Harmonic Empire.

See Section 5 for the Imperial mathematical notation.

3.5.1 Timekeeping

First, the reader should note that the Rhyslinger day is exactly two-thirds of a Terrestrial day, and somehow not one second more or less. While oddly coincidental, this provides an excellent basis upon which to compare perceptions of the progression of time.

The Imperial clock works similarly to ISO 8601, but nonetheless divides the Rhyslinger day differently. First is the kãtṛtal [kɑ̃tɹ̩ˈtɑl], which lasts eight Terrestrial hours. The word has no direct equivalent in English, but can be understood as a ‘half-day;’ traditionally, the dividing line between one kãtṛtal and the next was the sunrise and sunset, though naturally when these occur varies from day to day. The equinoxes are the only days of the year when the two halves are perfectly equal.

From there, each kãtṛtal is divided into six (6₁₀, 10₆) ‘hours,’ or izãdal [izɑ̃ˈdɑl], each of which lasts eighty Terrestrial minutes. One can see that the unit is derived from the root for ‘10₆,’ a pattern that will continue to follow.

Each izãdal is divided into thirty-six (36₁₀, 100₆) ‘minutes,’ or ayladal [ɑjlɑˈdɑl] (a word again derived from a number, this time for 100₆), each of which lasts two and two-ninths Terrestrial minutes (two minutes and thirteen and one-third seconds).

A very recent cultural trend has meant dividing each ayladal into two hundred sixteen (216₁₀, 1000₆) seconds. This unit, however, is too new to have a common name, though if I had to guess, it would be named the awidṛdal [ɑɣidɹ̩ˈdɑl], following the trend of the previous units of time.
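The durations above follow from simple division, which can be worked through exactly in Python. This is only a sketch for checking the arithmetic: the dictionary keys are my own English labels (assumptions), and the last unit carries the tentative name guessed at above.

```python
from fractions import Fraction

# A Rhyslinger day is two-thirds of a 24-hour Terrestrial day, in seconds.
DAY_SECONDS = Fraction(2, 3) * 24 * 60 * 60

# Each unit is an exact fraction of the one before it.
HALF_DAY = DAY_SECONDS / 2     # kãtṛtal
HOUR = HALF_DAY / 6            # izãdal, 10 (senary) per kãtṛtal
MINUTE = HOUR / 36             # ayladal, 100 (senary) per izãdal
SECOND = MINUTE / 216          # tentatively awidṛdal, 1000 (senary) per ayladal

UNIT_SECONDS = {
    "half_day": HALF_DAY,
    "hour": HOUR,
    "minute": MINUTE,
    "second": SECOND,
}

for name, seconds in UNIT_SECONDS.items():
    # Show both the exact fraction and a rounded decimal value.
    print(f"{name}: {seconds} s (about {float(seconds):.3f} s)")
```

Working in exact fractions avoids rounding: the ayladal comes out to 400/3 Terrestrial seconds, and the newest unit to 50/81 of a Terrestrial second.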

Traditionally, the units smaller than the kãtṛtal were determined by a water clock. The specific design is not one seen on Earth – well, not one used for such a purpose in any case. The clock ran on a pair of sōzu – Japanese deer-scarers – with one emptying into the other. A smaller one filled up within one ayladal before dumping its contents into the larger one; the resulting noise when it fell back into place counted one ayladal. The larger one held thirty-six times the smaller’s capacity, so it would fill up within one izãdal before spilling out, with the resulting knock counting one izãdal. The tentatively-named awidṛdal derives from the flow of water from the clock’s source; again, this is where the word’s dual definition of ‘1000₆’ and ‘countless’ comes into play. For obvious reasons, this clock did not track the kãtṛtal, as its progression was manifest.

3.5.2 Datekeeping

The Imperial calendar year is divided into nine months, each with six weeks, each week with six days. No intercalation occurs at all; it is unclear if Rhysling’s rotation-revolution pattern is simply that perfect, or if the Empire sees no need for intercalation.

Each year is also divided into four seasons, with the same spring-summer-autumn-winter season differentiation as with western civilization. Spring, summer, and autumn each occupy two months; winter occupies three. The calendar year starts on the vernal equinox. The summer solstice is celebrated as an Imperial holiday as well; it is unknown if the winter solstice is treated the same way.
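The calendar figures above can be tallied with a few lines of Python. This is a sketch of the arithmetic only; the variable names are my own, and the Terrestrial equivalent assumes the two-thirds day length given in Section 3.5.1.

```python
from fractions import Fraction

MONTHS_PER_YEAR = 9
WEEKS_PER_MONTH = 6
DAYS_PER_WEEK = 6

DAYS_PER_MONTH = WEEKS_PER_MONTH * DAYS_PER_WEEK   # 36, i.e. 100 in senary
DAYS_PER_YEAR = MONTHS_PER_YEAR * DAYS_PER_MONTH   # 324 Rhyslinger days

# Months per season, as given in the text; they should cover the whole year.
SEASON_MONTHS = {"spring": 2, "summer": 2, "autumn": 2, "winter": 3}
assert sum(SEASON_MONTHS.values()) == MONTHS_PER_YEAR

SEASON_DAYS = {name: months * DAYS_PER_MONTH
               for name, months in SEASON_MONTHS.items()}

# One Rhyslinger day is two-thirds of a Terrestrial day (Section 3.5.1).
TERRESTRIAL_DAYS_PER_YEAR = Fraction(2, 3) * DAYS_PER_YEAR

print(DAYS_PER_YEAR, SEASON_DAYS, TERRESTRIAL_DAYS_PER_YEAR)
```

Under these figures the Imperial year is 324 local days, or 216 Terrestrial days, with winter taking up 108 local days of it.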

The names of the seasons are derived from the general climate in the northern hemisphere. Spring is considered ‘wet’ with the seasonal rains,* summer is considered ‘hot,’ autumn is considered ‘dry’ – which proves perfect for the farmers’ tradition of burning the refuse and inedibles from their crops, then using the ashes to fertilize crops the following year – and winter is considered ‘cold.’

*The rain in the Empire appears to be under perfect control and manipulation, to the point where weather as a whole can be outright scheduled instead of predicted. The methods used are still unknown to us at this time, but are being investigated.

The traditional working week is five days long, with the sixth and final day of the week reserved either for rest or for transactions – either in a town market, or on the last day of each month, when the ponies traditionally conduct the actual transactions, insofar as currency is transferred.*

*Paper money is not used in the Empire at all; instead, the ponies still rely on coins made from gold and silver (with each gold coin worth six silver), which prove too heavy for actual daily usage.

3.6 Color Categories

Equestrian color categories are best plotted within a cone. At the base is a color disc, with a white (ñalab) center and a black (ebơni) outer edge. Five categories are recognized between them – red (katja), yellow (hṛtjĩ), green (sila), blue (pele), and purple (tưsḷ). Occupying the third dimension is a category (zanja) only ponies can see, extending from the disc up to the tip – but as it approaches the tip, the other colors are expressed less.

3.7 Verb Conjugation

Equestrian verbs conjugate almost entirely for the subject/agent – four persons (all but the reflexive), and four numbers for each except for the indefinite – and four tenses: present, past, future, and cyclic, which is used for events that happen repeatedly or habitually. Curiously, no distinction in temporal aspect is made in Equestrian. Conjugation is perfectly regular across all verbs, except for the copula al [ɑl], which is not negated; instead, a negative copula ơzưñ [ɤˈzɯŋ] exists.

The first-person conjugation is marked with -rư [ˈɹɯ] for the singular, -rimẽ [ɹiˈmẽ] for the dual, -rize [ɹiˈze] for the paucal, and -ṛye [ɹ̩ˈje] or -rye [ɹˈje] for the plural. Two different endings exist for the plural, in part due to the approximant rule, and in part due to the syllable structure, both of which are detailed in Section 2.2. For brevity’s sake, I note only that this pattern repeats throughout each of the other grammatical persons, for similar reasons.

The second-person conjugation is marked with -vư [ˈʙɯ] for the singular, -vimẽ [ʙiˈmẽ] for the dual, -vize [ʙiˈze] for the paucal, and -vie [ʙiˈe] or -vye [ʙˈje] for the plural.

The third-person conjugation is marked with -mư [ˈmɯ] for the singular, -mimẽ [miˈmẽ] for the dual, -mize [miˈze] for the paucal, and -mie [miˈe] or -mye [mˈje] for the plural. The third person also conjugates for the noun classes; the conjugations given ante apply to the inert noun-class. The animate noun-class uses -djemư [ɟeˈmɯ] for the singular, -djemime [ɟemiˈme] for the dual, -djemize [ɟemiˈze] for the paucal, and -djemye [ɟemˈje] for the plural. The magical noun-class uses -njemư [ɲeˈmɯ] for the singular, -njemime [ɲemiˈme] for the dual, -njemize [ɲemiˈze] for the paucal, and -njemye [ɲemˈje] for the plural.

The indefinite-person conjugation is marked with only -wư [ˈɣɯ], with no number distinction.

The temporal conjugations are expressed with prefixes. The present tense is unmarked, the past is marked with zjơ- [ʒɤ], the future with il- [il], and the cyclic with ḷzj- [l̩ʒ].

The negative affix is -zữ- [zɯ], placed between the verb and the person/number marking.

A handful of mood particles exist. The indicative mood is unmarked, ama [ɑˈmɑ] marks the subjunctive mood, elsi [elˈsi] marks the conditional mood, and þesơ [ᵑʘeˈsɤ] marks the imperative mood.

A verb without any markings is infinitive. However, Equestrian has a rather unique conjugation behavior that I have taken to calling ‘compound verbs.’ Simply put, instead of leaving an infinitive verb adjacent to a conjugated one, the two verbs are agglutinated and treated as a single verb. To return to a previous example, dḷgãgru (“I need to go”) combines dḷgã (to need) and ak (to go), then conjugates the entire word for the first person, the singular number, the present tense, and the indicative mood.
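The order of the affixes described in this section (tense prefix, verb, negative affix, person/number ending) can be laid out in a short Python sketch. It is deliberately limited, and several things are assumptions of mine: the function and table names, the restriction to singular, dual, and paucal endings of the inert class plus the indefinite, and above all the hyphenated output, which is a morpheme-by-morpheme segmentation and not a surface form, since no sandhi or harmony is applied.

```python
import unicodedata

# Affixes as given in this section, in 'solar' and 'aquatic' harmony.
TENSE_PREFIX = {"present": "", "past": "zjơ", "future": "il", "cyclic": "ḷzj"}
NEGATIVE = "zữ"
PERSON_SUFFIX = {
    ("first", "singular"): "rư",
    ("first", "dual"): "rimẽ",
    ("first", "paucal"): "rize",
    ("second", "singular"): "vư",
    ("second", "dual"): "vimẽ",
    ("second", "paucal"): "vize",
    ("third", "singular"): "mư",    # inert noun class only
    ("third", "dual"): "mimẽ",
    ("third", "paucal"): "mize",
    ("indefinite", None): "wư",      # no number distinction
}


def segment(verb, person, number=None, tense="present", negative=False):
    """Return a hyphenated segmentation: tense-verb-(negative)-person."""
    parts = [
        TENSE_PREFIX[tense],
        verb,
        NEGATIVE if negative else "",
        PERSON_SUFFIX[(person, number)],
    ]
    # Empty slots (present tense, affirmative) are simply skipped.
    joined = "-".join(part for part in parts if part)
    return unicodedata.normalize("NFC", joined)
```

For instance, segment("ak", "second", "dual", tense="past", negative=True) returns zjơ-ak-zữ-vimẽ, showing where each affix sits relative to the verb ak (“to go”).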

3.8 Questions

Asking questions is a simple affair in Equestrian – the last syllable in a certain word receives a rising tone (see Section 2.2 for intonation). If it is a polar question, the tone applies to the verb; otherwise, it applies to the interrogative particle.

4 Dialectology

Equestrian has three dialects, divided along historical tribal lines. While there exists mutual intelligibility between them, their differences are still pronounced, and can even serve as shibboleths.

The rest of this paper assumes Common Equestrian, but this section in particular will take a closer look at the other two dialects, Courier Equestrian and Arcane Equestrian, and the chief phonological and grammatical differences between them and Common Equestrian.

4.1 Courier Equestrian

Courier Equestrian is spoken by winged ponies, ones we might call pegasi in English. Compared to Common Equestrian, their palatal stops [c] and [ɟ] are actually postalveolar affricates, [t͡ʃ] and [d͡ʒ]. Courier Equestrian also forbids null onsets; where they occur in Common Equestrian, Courier Equestrian instead uses [ɦ], transcribed as ħ when appropriate. This is considered a ‘rainy’ consonant, even though it does not have a vocalic equivalent.

Thankfully, there are no grammatical differences between Common and Courier Equestrian.

4.2 Arcane Equestrian

Arcane Equestrian, spoken by horned ponies (ones we might call unicorns in English), is trickier in some ways to master when coming from Common Equestrian, and easier in others. In either case, both the phonology and the grammar of Arcane Equestrian are radical departures from Common Equestrian.

4.2.1 Phonological Differences

Instead of the bilabial trills [ʙ̥] and [ʙ] like in Common and Courier Equestrian, Arcane Equestrian uses bilabial fricatives – [ɸ] and [β], respectively. Also instead of the trademark clicks ([ᵑʘ], [ᵑǃ], [ᵑǂ], [ᵑǁ]) of the Common and Courier dialects, Arcane Equestrian uses true voiceless nasals – [m̥], [n̥], [ɲ̊], and [ŋ̊].

The Arcane dialect also has a palatal lateral consonant [ʎ], occurring as a merger of [j.l] in the other dialects (the merged [ʎ] becomes an onset, never a coda). This sound is transcribed as lj when appropriate. This is also considered a ‘rainy’ consonant and, as with the Courier dialect’s ħ, does not have a vocalic equivalent. However, unlike ħ, it has its own glyph in the native script (see Section 5).

4.2.2 Grammatical Differences

Whereas the Common and Courier dialects share one grammar, the Arcane dialect has grammatical features of its own. For the most part, they are four additional grammatical cases – namely, the locative -ơme [ɤˈme], the lative -ơwe [ɤˈɣe], the ablative -ơzle [ɤzˈle], and the instrumental -edj [ˈeɟ].

There appears to be an additional rule when used in official capacity (as Arcane Equestrian is also the administrative lingua franca) – a total lack of pro-dropping. However, since this usage is limited to two individuals, both of whom are rumored to be extremely long-lived, it is best to assume this is merely a register for said individuals, perhaps left over from an archaic form of Equestrian.

4.3 Human Adaptations

However, a human speaker need not pronounce these sounds exactly as prescribed. In particular, I have been experimenting with using labiodental fricatives [f] and [v] in place of [ʙ̥] and [ʙ] – Twilight Sparkle assumed I was developing a lisp, but could otherwise understand me. It is also acceptable to use the [t͡ʃ] and [d͡ʒ] from Courier Equestrian, and to forgo all the other dialectal differences from Common Equestrian.

5 Writing System

Equestrian is written with a featural alphasyllabary.

Figure 3: A sample of the Equestrian script. This spells “Ãtir Ḷsapa,” Twilight Sparkle’s native name.

The vowel order, assuming solar harmony, is as follows: e, ư, ơ, i, ṛ, ḷ. The consonant order, assuming aquatic harmony, is as follows: ħ, g, ñ, w, dj, nj, zj, d, n, z, b, m, v, y, r, l. Note that ħ represents the null consonant in the series, even though it is pronounced only in the Courier dialect. In the Arcane dialect, lj is collated between y and r.

Most of the letter shapes (save for lj) are optimized for stomagraphy, as this is the main method of writing for two of the three tribes of ponies, and as such are composed of a handful of regularly-applied sub-elements. Moreover, Equestrian has a native music notation system, similar to western staff notation, but obviously developed independently from Earth.

5.1 History

(Notā bene: It is unknown whether this account is actually historical or merely legend.)

Before the Empire was founded, two other scripts existed – one used for the Arcane dialect, the other for the Courier dialect. The Arcane script was written using rods held in a unicorn’s telekinetic grip,* while the Courier script was strongly pterographic. The Common dialect was traditionally unwritten, and indeed most of its speakers were illiterate even in the other two scripts.

*Indeed, this is still done today with the modern script.

One day long ago, a gardener named Zenedjưge [zeneɟɯˈɡe] – literally ‘climbing-flower,’ but better translated as Wisteria – had to stay inside from a sudden rainstorm. When the rainstorm cleared, Wisteria neglected to redon her boots, so she had to work barehoofed. The shoes were meant to keep her hooves clean, and more importantly to keep her frogs from being damaged by stray twigs and thorns.

She felt herself step on one such twig, but when she recoiled in pain, she noted the pattern of the hoofprint and twig in the muddy soil – and inspiration struck. Within the space of a day, she was able to produce all the shapes of the modern script using only her hooves and a rod held in her mouth.

Wisteria found her script quickly adopted by the rest of her tribe, but she met resistance from the pegasus and unicorn tribes. They were both content with their own forms of writing, and thought it beneath themselves to adopt a ‘lowly’ script like Wisteria’s.

It was not until she demonstrated the new script to the leaders of a budding Harmonic Empire that it finally received greater adoption, as all of their initial laws and proclamations were written in Wisteria’s hoof.

5.2 Consonants

The consonants have two elements: a large circle or semicircle to define the place of articulation, and a secondary element placed inside the semicircle (but never the circle) to define the manner of articulation.

There are four directions in which the semicircle can point (insofar as the rounded loop is concerned): up, down, left, and right. The right-pointing one defines velar consonants, the up-pointing one palatal consonants, the left-pointing one alveolar consonants, and the down-pointing one labial consonants.

Standing alone, these semicircles represent voiced stops. There are three elements that can be placed inside, however. First is a smaller circle, to create nasal consonants. Second is a vertical line, to indicate fricatives. Third is a horizontal line, to change the consonant harmony from ‘aquatic’ to ‘terrestrial,’ i.e. to devoice these sounds. It need only be marked once per word.
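Because the script is featural, a consonant glyph can be read off as a small lookup, sketched here in Python. The names are my own, and one step is an inference rather than something stated outright: I assume the Romanized letters pair with the places and manners in the same order as the collation given at the start of Section 5 (stop, nasal, fricative for each place). The horizontal ‘terrestrial’ line is left out, since it is marked once per word rather than per glyph.

```python
# Direction of the semicircle's rounded loop -> place of articulation.
PLACE = {"right": "velar", "up": "palatal", "left": "alveolar", "down": "labial"}

# Element written inside the semicircle -> manner of articulation.
MANNER = {None: "stop", "small circle": "nasal", "vertical line": "fricative"}

# Romanized 'aquatic' letters per place, ordered stop, nasal, fricative.
# Assumption: this grouping is inferred from the collation order in Section 5.
LETTERS = {
    "velar": {"stop": "g", "nasal": "ñ", "fricative": "w"},
    "palatal": {"stop": "dj", "nasal": "nj", "fricative": "zj"},
    "alveolar": {"stop": "d", "nasal": "n", "fricative": "z"},
    "labial": {"stop": "b", "nasal": "m", "fricative": "v"},
}


def read_consonant(direction, inner=None):
    """Return (place, manner, Romanized letter) for one semicircle glyph."""
    place = PLACE[direction]
    manner = MANNER[inner]
    return place, manner, LETTERS[place][manner]
```

As a usage example, read_consonant("down", "small circle") gives ("labial", "nasal", "m"), while a bare right-pointing semicircle reads as ("velar", "stop", "g").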

The approximants are represented with their own symbols. These are L-shaped glyphs, with no subglyphs written within them. [j] is written with the bend in the upper-right corner. [ɹ] is written with the bend in the lower-right corner. [l] is written with the bend in the lower-left corner.

[ʎ] is uniquely written as a Z-shaped curve, likely ligated from [j] and [l]. This is not easily written with stomagraphy, and is likely a unicorn invention.

5.3 Vowels

The vowels are written with a series of marks placed outside of the circle or semicircle consonant-element.

An unmarked consonant carries the inherent vowel [e]. Placing a vertical line on the right side changes it to [ɯ]. Placing a vertical line on the left side changes it to [i]. Placing vertical lines on both sides changes it to [ɤ]. Placing a curl on the right side, heading down and to the left, changes it to [ɹ̩]. Placing a curl on the left side, heading down and to the right, changes it to [l̩].

Above the consonant, a small crescent shape can be placed to change the vowel harmony to ‘lunar.’ It need only be marked once per word. Below the consonant, if a syllabic consonant is not indicated, one can place a dot to signify a nasal vowel, or a horizontal line to mute the vowel altogether.
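The side and lower marks can likewise be sketched as a lookup. This is a minimal illustration, not a full model of the script: the mark names ("line-left", "curl-right", and so on) and the function read_vowel are hypothetical, and the lunar-harmony crescent is deliberately left out.

```python
# Hypothetical sketch: marks around a consonant glyph -> vowel (IPA).
SIDE_MARKS = {
    frozenset(): "e",                              # unmarked: inherent vowel
    frozenset({"line-right"}): "ɯ",
    frozenset({"line-left"}): "i",
    frozenset({"line-left", "line-right"}): "ɤ",
    frozenset({"curl-right"}): "ɹ\u0329",          # syllabic [ɹ̩]
    frozenset({"curl-left"}): "l\u0329",           # syllabic [l̩]
}
SYLLABIC = {"ɹ\u0329", "l\u0329"}


def read_vowel(side_marks=(), below=None):
    """Return the vowel written on a consonant glyph.

    `below` may be None, "dot" (nasal vowel), or "line" (vowel muted).
    Lower marks are only allowed when no syllabic consonant is indicated.
    """
    vowel = SIDE_MARKS[frozenset(side_marks)]
    if below is None:
        return vowel
    if vowel in SYLLABIC:
        raise ValueError("lower marks cannot combine with a syllabic consonant")
    if below == "line":
        return ""                                  # vowel muted altogether
    if below == "dot":
        return vowel + "\u0303"                    # combining tilde: nasal
    raise ValueError(f"unknown lower mark: {below!r}")
```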

5.4 Other Marks

Punctuation is sparse; only two marks are used – first a colon-like mark, meant to end an utterance, itself seeming to be a recent development, as it is frequently dropped. The other is more obligatory – a spiral shape, resembling a backwards 6, placed above the consonant (to the left of the lunar-harmony mark, if applicable) in a word where the interrogative tone is applied.

5.5 Mathematical Notation

Numerals use a place value system, marked with dots arranged in specific patterns, similar to dice pips. Zero is marked with two thick vertical lines. The senary point uses a thin vertical line.
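Since the senary point implies base six, a number can be converted for transcription with a short routine. The following is a minimal sketch in Python. The function name to_senary is hypothetical, and the output uses ordinary digits 0 to 5 as stand-ins for the pip patterns and the two-line zero glyph.

```python
def to_senary(n):
    """Return the base-6 digits of a non-negative integer as a string."""
    if n < 0:
        raise ValueError("negative values need the separate negative sign")
    if n == 0:
        return "0"
    digits = []
    while n:
        n, remainder = divmod(n, 6)   # peel off the lowest senary digit
        digits.append(str(remainder))
    return "".join(reversed(digits))
```

For example, to_senary(216) returns "1000", since 216 is six cubed.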

Dots can also be arranged in triangle shapes. Pointing left is addition (+), pointing right is subtraction (-), pointing up is multiplication (×), and pointing down is division (÷).

Solid triangles are also used in mathematical notation. One pointing left is a greater-than sign (>), pointing right a less-than sign (<), and pointing down an equal sign (=).

A solid triangle pointing up represents exponents, roots, and logarithms all at the same time. The base is written in the lower-left corner, the exponent in the top corner, and the logarithm in the lower-right corner.

There also exist symbols specifically for negative values, π, φ, e, i, ∞, and -∞.

5.6 Musical Notation

Instead of lines, as with western notation, Equestrian music is plotted on a grid. Each grid is four lines tall, and stretches the full width of the page.

At the start of the grid, a symbol is placed to indicate the key signature. Dots and lines are inserted along the interstices of, and above, the grid. A single dot indicates a note played; vertical dots indicate chords. Horizontal lines indicate a longer note; diagonal lines indicate a slur or glissando. A lack of dots and lines indicates a rest.

Dynamics are noted above the grid. Instead of a time signature, each vertical gridline after a number is thickened, to indicate measures. Unlike with western notation, the ‘basic’ note in an Equestrian time signature is always the smallest; this is implied to be the key signature.

The reader is expected to understand that key and time signatures typically do not deviate; if they ever do, a change of key is marked with a new key signature at the start of the next measure, and a change of time simply makes the next measure a different length.

One should also note that Equestrian music is traditionally on a hexatonic scale (usually the whole-tone), and will need some significant tinkering to make it work for other scales. One ad-hoc solution for non-hexatonic scales is to use numerals over the clef.

Figure 4: The Japanese folk song “Sakura” using the Equestrian method. Note the numeral 5, which indicates a pentatonic scale. This transcription was provided as a courtesy by a musician living in Ginzol.

6 Conclusion

Despite being untold light-years apart and therefore having no previous contact with one another, I find that Equestrian is not terribly different from the languages we speak regularly on Earth. It shares several features with those found on every continent, and yet employs its own innovations that, while alien, are still remarkably human.

While I have endeavored to document everything I have found so far, I realize all the same that I still have so much left to go – to say nothing about the other languages spoken in the Harmonic Empire, never mind on Rhysling.

Still within the scope of Equestrian, I am left with a few personal curiosities. For one, why distinguish what are essentially two animate classes? For another, why do they place so much emphasis on phonological harmony? Even though the three tribes have apparently set aside their differences, they still maintain their respective dialects, for reasons yet unknown.

Perhaps most curiously of all, qapata [ᵑǃɑpɑˈtɑ] (‘friendship,’ also ‘hello,’ ‘goodbye’), an obvious abstraction, is classified not as inert, but as magical. . . .

Author's Note

And so it comes together. Here’s the largest block of notes I’ve written for the story, and with it everything about their language – the reasonings, the musings, the regrets, the what-could-have-beens, and so forth.

Did you notice the date this document was ‘compiled’? That’s not arbitrarily chosen – some of you who know me will already know the significance of this date, but for those who don’t, I’ll tell you: that’s the day I was born.

There are two reasons for this. I thought dating the paper on my date of birth would be symbolic, at least on a personal level. Vain? Perhaps. But the other reason is that it fits in thematically.

This story is partly about the past – it’s also why I set it just after Season 1, without directly acknowledging canon in Season 2 and beyond. Knowing what comes after gives me an odd advantage – I could potentially write a story that would seem like something you’d read in 2011, but it’d be ‘future-proofed’ against future seasons.

It’s my way of looking back on my time in the fandom.

You might find it strange that I refused to refer to Equestrian as Equestrian until the very end. To be fair, so did I – but I’m not Dr. Somerset (despite sharing his knowledge and expertise). He might think of using the term first, but dismiss it as ridiculous.

Even now, he still refuses to refer to Equestria as anything other than the Harmonic Empire.

. . . actually, an earlier draft of Chapter 19 did contain a similar lampshade:

What’s next, Adam mused to himself, do we call this place Equestria? – No! He remembered bursting out laughing, thankfully out of earshot of the ponies. That would be ridiculous, forget about it.

I cut it both to make the colleague’s suggestion possible, and because it didn’t mesh at all with the tone.

It may not be my specific area of expertise, but I do have a surprising amount of knowledge on phonology. And that meant Equestrian’s went through an absolute ton of changes.

This was the very first iteration of Equestrian phonology, which I presented to Admiral Biscuit on 18 January 2017. As you can see, it’s very lateral, and very Turkic.

I fell in love with the consonant table but, seeing it as a poor fit for this project, decided to keep it around for another. Will that ever see the light of day? I’m not sure, but I’m keeping my fingers crossed.

The vowels could get stuffed for all I care – because skies above, that’s complicated. But they did have one thing present: vowel harmony. This was a feature present from the start. Equestria prides itself on harmony; why shouldn’t their language?

It wasn’t for another two years or so that I started to tinker around with it again. I kept picturing the vowels in my head on the bus rides to and from college, trying to find an elegant system. At one point, I considered a system of contrasting short/long/nasal vowels – then adding long nasal vowels into the mix. It was getting to be over my head, so I rejected it as well. Then I remembered Middle Mongol – how its seven vowels made a backing harmony – and replicated it in Equestrian. (It kinda helps that Mongolian culture is very equinocentric.)

But with a twist. See, when I borrow a feature from a real-life language, I rarely ever play it straight – I always twist it, sculpt it, make it so it would fit my language better. To wit, I made the front-rounded vowels back-unrounded instead (reflecting Southeast Asian tradition) and added an oral-nasal distinction and a few syllabic consonants.

Even the vowel harmony wasn’t a slam-dunk. Initially under the new vowel system, it was purely a height harmony: /i/, /ɯ/, and /u/ formed one group, and /e/, /ɤ/, and /o/ formed another; /a/, /ɹ̩/, and /l̩/ were neutral. Nasal vowels harmonized like their oral counterparts.

Then I read up more about vowel harmony, and how other languages did theirs. (Let me just say, Turkish’s is really complex, yet stupidly intuitive.) I saw Hungarian, Finnish, Mongolian again, and Korean – and again and again the same pattern emerged: front-unrounded vowels tended to be neutral, especially /i/. If I had to guess – and don’t quote me on this – this must be because of its relationship with the consonant /j/, frequently the only approximant to have a vocalic equivalent in the same language. Not every language has a /w/, nor a /r̩/ or /l̩/ – yet /j/ and /i/ frequently appear together. Something worth investigating. . . .

Anyway, getting back on track – I was sitting in my math class when I remembered how these vowels could be grouped, so while I was still sitting in class, on the back of the math test I had just gotten back (this was just before lockdown), I started sketching out the rudimentary categories for these vowels. I put /e/, /ɯ/, and /ɤ/ into one group, /a/, /u/, and /o/ in another, and leaving /i/, /ɹ̩/, and /l̩/ neutral was a piece of cake. And of course, nasal vowels retained their previous behavior.

I then tried naming these new groups – initially it was A versus E vowels, but then I remembered how Arabic names their letters ‘sun’ and ‘moon,’ and renamed them ‘solar’ versus ‘lunar’ vowels. They absolutely stuck when I tied them to Celestia and Luna just a second later.
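For the programmers in the audience, the final grouping is easy to sketch as a checker. This is only an illustration in Python, assuming the input is a plain IPA string; the function name vowel_harmony and the "mixed" label for a word that breaks harmony are made up for the example.

```python
import unicodedata

SOLAR = set("eɯɤ")   # /e/, /ɯ/, /ɤ/
LUNAR = set("auo")   # /a/, /u/, /o/
# /i/ and the syllabic consonants are neutral, so they are simply ignored.


def vowel_harmony(word):
    """Classify an IPA string as 'solar', 'lunar', 'neutral', or 'mixed'."""
    # NFD splits a nasal vowel into base letter + combining tilde, so
    # nasal vowels are counted like their oral counterparts.
    letters = {
        ch for ch in unicodedata.normalize("NFD", word)
        if not unicodedata.combining(ch)
    }
    has_solar = bool(letters & SOLAR)
    has_lunar = bool(letters & LUNAR)
    if has_solar and has_lunar:
        return "mixed"       # not a well-formed word under this harmony
    if has_solar:
        return "solar"
    if has_lunar:
        return "lunar"
    return "neutral"
```

Run on the IPA for Wisteria's name, vowel_harmony("zeneɟɯɡe") comes out "solar".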

(‘Sun’ and ‘moon’ letters in Arabic, by the way, are so grouped depending on how the definite article al- (ال) is assimilated. The ‘sun’ letters are all the coronal consonants, so the /l/ assimilates. The ‘moon’ letters are all the other consonants, and the /l/ does not assimilate. The names not only list polar opposites, but even demonstrate it with the words themselves – “the sun” is aššams (الشمس), and “the moon” is alqamar (القمر).)
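The assimilation rule is mechanical enough to sketch in code. The Python below is a simplified illustration that works on a one-letter-per-sound transliteration; the function name add_definite_article and the exact transliteration letters in SUN_LETTERS are assumptions of the sketch, and real Arabic orthography involves more than this.

```python
import unicodedata

# Transliterated 'sun' letters (the coronal consonants), normalized so
# that precomposed and decomposed spellings compare equal.
SUN_LETTERS = set(unicodedata.normalize("NFC", "tṯdḏrzsšṣḍṭẓln"))


def add_definite_article(noun):
    """Prefix al- to a transliterated noun, assimilating before sun letters."""
    noun = unicodedata.normalize("NFC", noun)
    first = noun[0]
    if first in SUN_LETTERS:
        return "a" + first + noun   # the /l/ assimilates: doubled consonant
    return "al" + noun              # moon letter: the /l/ stays
```

Feeding it šams and qamar reproduces the two forms quoted above.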

A few more things I should add – first, Equestrian originally included diphthongs (only those ending with [-j]), but I later cut them out since they would interfere with the script. Second, Equestrian’s interrogative tone was originally an upstep, which I had simplified. Third, I had originally planned to use vowel harmony to indicate noun classes (see below for more), having been inspired by Manchu using it for grammatical gender (e.g. ᡝᠮᡝ (eme, “mother”) versus ᠠᠮᠠ (ama, “father”)). The last of these ideas still survives in the titles for Princesses Celestia and Luna (yere uses solar vowel harmony, while yara uses lunar vowel harmony).

Also in that same math class, and on the bus ride home that day, I started thinking about consonant harmony as well. How would I incorporate such an idea? My first instinct was to use a type of laryngeal harmony – voiced versus voiceless. Research into this topic, however, led me to conclude that this type of harmony is unusual in the real world – consonant harmony usually takes the form of a coronal harmony – alveolar versus postalveolar, for instance (e.g. sasi is a valid word, as is šaši, but not saši nor šasi). This only fueled my motivation to ‘program in’ a consonant harmony, the way I envisioned it – after all, it was an idea, and I wanted to explore it.

The stops and fricatives were easy enough to pair up – /f/ versus /v/, /t/ versus /d/, and so forth. And the approximants, as few as they were, were left to be neutral – this decision further cemented the vowel harmony explained ante.

But the nasals and clicks gave me a heap of trouble. The nasals were simply stops articulated with a nasal element, and while it was tempting to replace the clicks with true voiceless nasals, making a click language was sorta kinda maybe. . . fine I’ll admit it, it was on my bucket list. At first I settled on a rather complicated rule that nasals and clicks were also neutral, but behaved differently from the approximants. See, the approximants are more specifically transparently neutral – the harmony simply ‘passes through them’ without interference. Same thing with the neutral vowels. But the nasals and clicks were at first opaquely neutral – they didn’t harmonize so much as they changed the harmony mid-word. The nasals voiced the harmony, while the clicks devoiced it.

Then a few months later, long before I actually broke ground on the story, I had an epiphany. Aren’t there other kinds of clicks? Not just in the places of articulation, but in the manner as well? Turns out there were, and I had simply (and foolishly) discarded them. To my surprise, and my relief, not only were nasal clicks possible, every click language on Earth had them – even if one only had a single manner of click articulation, it was always nasal. Immediately I went to revising the phonology – I nasalized all the clicks, then reänalyzed them as voiceless nasals, which allowed me to reärrange the consonant chart to group them together, and the consonant harmony to oppose them. That’s also when I got the idea of adding voiceless nasals as a dialectal difference (elaborated post). It’s funny how a single change can make everything fall into place.

Oh, and as for why I called them ‘terrestrial’ and ‘aquatic’ consonants? Simple: those terms were shamelessly stolen from High Valyrian, which, alongside ‘solar’ and ‘lunar,’ uses them for its noun classes.

Do you recall from Chapter 17, where Adam was assigning letters to sounds in Equestrian? That was my own thought process, believe it or not.

Most of the assignments were the usual sort. Digraphing the palatal sounds with -⟨j⟩ is a little strange, but that’s not without precedent. If you speak Danish, Norwegian, Faroese, or Dutch, you’d instinctively pronounce ⟨sj⟩ as [ʃ]. If you speak Swedish, you’d be pronouncing it more like [ɧ] – regrettably, the precise articulatory mechanisms are ill-described, and are still being debated even today.

⟨tj⟩ is also used in Faroese for [t͡ʃʰ], though ⟨dj⟩ represents [t͡ʃ]. ⟨dj⟩ is used in French as well, for [d͡ʒ]. ⟨nj⟩ and ⟨lj⟩ are used in the Serbo-Croatian dialects (Gaj’s Alphabet) for [ɲ] and [ʎ], and the former is also used in Albanian. ⟨zj⟩ is an innovation, but is consistent with the other digraphs.

Using ⟨ñ⟩ for [ŋ] would look familiar if you’ve read Tolkien, as he used the sound and glyph in early Quenya, e.g. Ñoldor. (The sound is still present in final Quenya, but serves as an allophone of a nasal followed by a velar consonant.) Admittedly, I am not actually perfectly intimate with Tolkien’s writings, as embarrassing as that sounds, and had simply gotten the idea from the Common Turkic Alphabet.

Now I know what you’re going to ask: “Ted, what the hell were you thinking, using ⟨w⟩ for anything other than [w] or [v]?” Actually, that letter has a few more diverse uses.

The letter originally developed as a result of a sound shift in Early Medieval Latin, where [w] (written ⟨v⟩) shifted to [β] (which would later become [v]), necessitating a new letter to represent the old sound, especially in the Germanic neighbors to the north, which still had that sound. The joke was on them, however, as in Middle High German that too shifted to [v].

Polish uses the letter for [v], and in fact apart from loanwords entirely lacks ⟨v⟩. Dutch typically uses it for the related sound [ʋ]. Welsh, also lacking ⟨v⟩, uses ⟨w⟩ as a vowel [ʊ], and also has a long vowel ⟨ŵ⟩.

Yet believe it or not, using the letter for [ɣ] is not an innovation. While I came up with that usage independently, it turns out that Yi Pinyin (Romanization for the Yi languages in southern China) does the same thing.

But far and away the strangest part of Romanized Equestrian is using ⟨þ⟩ for [ᵑʘ~m̥]. That’s mostly me paying a bit too much attention to how clicks are Romanized in the Bantu languages.

How clicks have been written throughout history is extremely convoluted, so I’m going to try to simplify it somewhat. Naturally by the early nineteenth century, as European explorers pushed their way further into southern Africa, they used the otherwise vestigial letters ⟨c⟩, ⟨q⟩, and ⟨x⟩ respectively to write [ǀ], [ǃ], [ǁ]. But linguists found these letters to be confusing, as they had other uses. While the ‘pipe letters’ the IPA uses nowadays were first published by Karl Richard Lepsius in 1854, the first published attempt at writing the Bantu clicks in a ‘scientific’ way was by Norwegian missionary Hans Paludan Smith Schreuder in 1850.

The click letters used by Schreuder, paired with their modern Latin-script equivalents.

By 1911 (and likely sooner), Lucy Lloyd and Wilhelm Bleek had created the letter ⟨ʘ⟩ for bilabial clicks, using it alongside Lepsius’s letters. Since they were working with the ǀKam and ǃKung languages, neither of which was genetically related to the Bantu languages to the south of the continent (ǀKam is part of the Tuu family, while ǃKung is part of the Kx’a family, both of which are grouped under the Khoisan umbrella term), they needed the extra letter. As no other linguists bothered to create a letter for the same sound, Lloyd and Bleek’s letter became the de facto standard.

During WWI, the International Phonetic Association commissioned Daniel Jones to fill the click gap in their inventory. He eventually delivered ⟨ʇ⟩ for [ǀ], ⟨ʖ⟩ for [ǁ], ⟨ʗ⟩ for [ǃ], and for the non-Bantu languages found further north, ⟨ʞ⟩ for [ǂ]. These symbols were published in 1921.

In 1989, as part of sweeping reforms to their notation system, the International Phonetic Association abandoned Jones’s letters, instead adopting Lepsius’s with the Lloyd and Bleek addition.

Now that I’ve gotten that nonsense out of the way, I’ll stick to the modern IPA symbols as adopted in 1989 when naming the sounds. In the Bantu orthographies, [ǀ] is written ⟨c⟩, [ǃ] is written ⟨q⟩, and [ǁ] is written ⟨x⟩. Any further attributes to those clicks, such as being nasal or glottalized, are digraphed with the click letter, e.g. [ᵑǀ] is typically written ⟨nc⟩.

For a time, from 1987 to 1994, Juǀ’hoan (a southeastern variety of ǃKung) was written entirely with the Latin script, and that included ditching the click letters. They went with the Bantu approach, except for [ǂ], a sound it has that none of the Bantu languages do, so they opted to write it using ⟨ç⟩.

Personally, I like that system. It was elegant, having distinctive upper and lower cases for each letter. But the bilabial click was never written with a single Latin letter apart from its click letter (the closest I could find was the Bantuist ⟨pc⟩). Then I had an idea: why not use thorn for such a use case?

Now, I will admit a choice like this is extremely controversial. This letter is exotic even in its native European setting, surviving today only in Icelandic. It was originally created by Roman missionaries while converting the Pagan Germanic and Nordic peoples to Christianity. The process was very simple: they took the letter from their fuþark, Latinized it somewhat, and put it into their now-Latin-lettered texts.

However, one by one the Germanic languages abandoned the sound, and the letter used to write it. Today, the sound survives only in Icelandic and English (which opts to write it as ⟨th⟩, owing to Dutch type foundries in the sixteenth century not creating a custom ⟨þ⟩ for England).

Sigríður Magnúsdóttir’s gambit was just a shoehorned-in narrative excuse, and I’m sorry that it was so blatant.

As clumsy as you think my Equestrian Romanization system is, be glad I’m not Nūrsūltan Nazarbaev.

As Qazaqstan’s first President, he essentially inherited a post-Soviet country, so understandably he had his work cut out for him. As part of his modernization program, Presidential Decree № 569, issued 26 October 2017, ordered that Qazaq’s Cyrillic alphabet be replaced with a Latin alphabet by 2025 – reasoning that Soviet rule had devastated Qazaq as a language and culture, and switching to a Latin alphabet would help assert Qazaqstan on the world stage.

(Side note: originally the Soviet Union proposed Romanizing the various Turkic languages within its borders during 1928, but in 1938, following disintegrating relations with Turkey, Russian became compulsory throughout the entire Union, and with it no script other than Cyrillic was to be used. To make matters worse, each of the Turkic languages was Cyrillicized in a different way – reducing interlinguistic literacy and preventing a pan-Turkic uprising. This is why, as a result, the Cyrillic tables in Unicode are so goddamn bloated.)

Development of that Latin alphabet, however, is best described as a shitshow. The original proposal, by Nazarbaev himself, tries to avoid digraphs – which I support – but also avoids diacritics – which I oppose. The modern Latin script comes with a whole suite of diacritics for any purpose imaginable. (In fact, in this manner it’s entirely possible to convert the Latin script into an abugida, which is just about the hottest linguistic take you’re likely ever to hear – but I digress.) Shunning one of the script’s greatest strengths is a grave mistake.

Yet Nazarbaev insisted that “there should be no hooks or superfluous dots that cannot be put straight into a computer” – and introduced several new letters that use apostrophes. Nazarbaev rarely received criticism during his regime, even from the Qazaq Parliament – “if the president says something,” political analyst Aidos Sarym told the New York Times, “or just writes something on a napkin, everyone has to applaud” – yet here, this proposal was universally decried as unsightly and needlessly clunky, comparable to ‘alien languages’ one might read in Marvel or DC comic books. Ironically, they would also break on computers – Twitter hashtags, for instance, do not work with apostrophes.

“No hooks or superfluous dots,” eh?

It was so bad that, when I first came across it, I decided to try Romanizing Qazaq myself. After looking up its phonology, I hammered out another scheme by myself. Within fifteen minutes. In the middle of my English class. I cannot recall it now, but I know that I did it.

Thankfully the Qazaq government is not without a voice of reason. On 19 February 2018, Presidential Decree № 637 amended the 2017 Decree, replacing the apostrophes with two digraphs and the acute accent.

Admittedly a little odd in my eyes, since it’s not quite in-line with other Turkic alphabets, but at least it wasn’t as much of an eyesore.

After Nazarbaev resigned on 19 March 2019, ending a 29-year administration, his successor Qasym-Jomart Toqaev called for yet another revision of the Latin alphabet. Toqaev seemed more open to linguists’ opinions, so they got to work. Eventually in January 2021, they produced a version differing only slightly from the Common Turkic Alphabet.

The only difference was replacing the Common Turkic Alphabet’s ⟨ñ⟩ with ⟨ŋ⟩.

Eventually on 22 April, even the ⟨ŋ⟩ was changed to ⟨ñ⟩, bringing the whole alphabet in-line with the Common Turkic Alphabet. This is said to be the final version of the Qazaq Latin alphabet, to be implemented in 2023 – Romanization is expected to be completed by 2031.

And now for the grammar. This is the real meat-and-potatoes of the language. My area of expertise is the relationship between language and thought, so this is where my work on the language really shines. What I’ve done here was essentially translate the values and morals of the show into linguistic features. Harmony was one example, as I’ve explained above, but I didn’t stop there.

First off, full disclosure: I was taking a Japanese class at the same time as I was writing this story. While I tried to avoid copying Japanese directly into the language, naturally such influence made its way into the language anyway. For instance, technically the word order in Equestrian is free, but because of the class I was taking, the ‘default’ order naturally drifted to Japanese’s, SOV.

The noun classes were a bit tricky for me to nail down. I knew for sure one of them would have to be ‘magical,’ for reasons you can probably deduce by now, but what about the others? I read up on noun classes, trying to find some way to sort my nouns. Then I had an idea – and I’ll admit it wasn’t that clever at first: just take the classic animate/inanimate class division, and add a magical twist.

Seriously, as stupid as that sounds, it just worked. The nouns just fell into place naturally. I renamed the inanimate class ‘inert’ to reflect its true nature, and from there it stuck.

But how would I make these classes? That is to say, how would I define them for purposes of sorting nouns? Making broad categorizations was a good start – living things belonged to the animate class, magical items, spells, and so forth belonged to the magical class, and everything else belonged to the inert class.

Then one bus ride to college gave me another idea: remember what Arthur C. Clarke said about sufficiently-advanced technology? While that helped balance the noun classes, ultimately it was little more than an extension of the magical-items definition.

I did create a little hack for defining things that happen by treating events as magical, so “What’s happening?” literally means “What magically is?” This later worked out perfectly for describing qapata (“friendship”). Think of it this way: “friendship” is not merely a state, but an activity. It’s not a one-and-done deal; it requires constant maintenance to keep it going. Friendships fade over time – something anyone’s experienced time and time again. Even the show portrays this – recall “Amending Fences” (S5E12), where Twilight apparently forgot about her friendship with Moondancer.

The only real quirk is that noun classes aren’t explicitly marked on the nouns themselves, but verbs do conjugate for the class of the subject/agent in certain circumstances. More on this in another section.

You’re probably wondering why in the world I went with an ergative-absolutive language – or more likely, what the hell an ergative-absolutive language is. Fair enough, and I’ll tell you both. It has to do with morphosyntactic alignment, which is a fancy way of saying how semantic arguments are treated.

In any language, there are two kinds of sentences: intransitive and transitive. An intransitive sentence has only a subject, e.g. “The bird flew.” A transitive sentence has an agent and an object, e.g. “The man held the book.”

English is a nominative-accusative language, meaning it treats the subject and agent as the same semantic argument – which is to say, “the bird” and “the man” in the preceding examples would be treated as the same thing. An ergative-absolutive language like Equestrian treats the subject and object as the same semantic argument – that is, “the bird” and “the book” are the same thing.

Why would I brain myself on my desk figuring out how to construct an ergative-absolutive language? Simple: that was another bucket-list item. Plus, thematically speaking, it worked out quite nicely – it provides a starker contrast to the human languages involved, all of which are nominative-accusative.

That’s not to say human languages are only nominative-accusative, but indeed the vast majority of them are. Basque, Georgian, Tibetan, Mayan, and the now-extinct Sumerian, by contrast, are all ergative-absolutive.

Other morphosyntactic alignments exist, but they’re all incredibly rare: for instance, transitive alignment treats the agent and object as the same, direct alignment treats all arguments as the same, and tripartite treats all arguments separately.
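If it helps to see the alignments side by side, here is a small Python sketch. It assumes the usual shorthand of S for the lone argument of an intransitive sentence, A for the agent, and O for the object; the case labels in the table and the function name shared_cases are just illustrative choices, not anything from the paper.

```python
# S = subject of an intransitive sentence ("the bird"),
# A = agent ("the man"), O = object ("the book").
# Roles that share a label are treated as the same semantic argument.
ALIGNMENTS = {
    "nominative-accusative": {"S": "nominative", "A": "nominative", "O": "accusative"},
    "ergative-absolutive": {"S": "absolutive", "A": "ergative", "O": "absolutive"},
    "transitive": {"S": "intransitive", "A": "transitive", "O": "transitive"},
    "direct": {"S": "direct", "A": "direct", "O": "direct"},
    "tripartite": {"S": "intransitive", "A": "ergative", "O": "accusative"},
}


def shared_cases(alignment):
    """Return the groups of roles that an alignment marks identically."""
    groups = {}
    for role, case in ALIGNMENTS[alignment].items():
        groups.setdefault(case, set()).add(role)
    return {frozenset(roles) for roles in groups.values()}
```

So shared_cases("ergative-absolutive") groups S with O and leaves A on its own, which is the bird-and-book pairing described earlier.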

And then there’s split ergativity – where a language uses the ergative-absolutive alignment for some instances and nominative-accusative alignment for others. Guugu Yimithirr, a language native to Far North Queensland in Australia, is ergative-absolutive for nouns, switching to nominative-accusative for pronouns. And before you write this off as the quirk of an isolate (which I can assure you it is not), this is perhaps the defining grammatical feature of all Australian Aboriginal languages.

Normally, I would say not to get me started on the Philippines – that area, and nowhere else in the world, has a rather unique alignment. But since I’m doing this in writing, I may as well try to explain how convoluted it is.

All the preceding alignments have one thing in common: they have a single active and passive voice. In a nominative-accusative language, the active voice is necessarily unmarked, with the passive voice marked. An English example would be “The man held the book” versus “The book was held by the man.” An ergative-absolutive language would have an “antipassive” voice. The difference is pretty simple: the passive voice replaces the agent with the object, while the antipassive replaces the object with the agent – the other way around, so to speak.

So with that explained, here comes the Philippine alignment (technically called the Austronesian alignment, though it’s mostly restricted to the Philippines). Rather than having one unmarked transitive voice, these actually have two – one marks the primary argument (the more prominent one, on which the speaker wants to focus) as the agent, and the secondary argument is the patient, and the other switches them around. These serve entirely pragmatic purposes, to emphasize arguments in speech, and are distinct from active/passive voicing (as these languages still have that concept).

(There’s also the direct-inverse alignment, which is similar to the Philippine alignment, but the different transitive voices are directly tied to the active and passive voices and thus are not interchangeable.)

Sound confusing? You’re not alone. The resources I found on the Internet tend to be written thick with technical jargon, and are part of a linguistic topic one does not usually study.

Moving on to other cases. There’s something called the Case Hierarchy, a theory developed by Barry Blake in 1994 – it states that grammatical cases develop in a certain order. That specific order goes: nominative/absolutive, accusative/ergative, genitive, dative, locative, ablative/instrumental, comitative, then all the rest.

I was quick to add the genitive and dative cases, following the Hierarchy. The ablative and locative cases soon joined the lative in the Arcane dialect, along with the instrumental, since I would imagine unicorn magic places so much emphasis on location, direction, and object manipulation both tangible and intangible. Then I added a prepositional case, as a miscellaneous catch-all that other cases don’t cover.

Here’s the real big hiccup: adding a comitative case so soon, violating the Hierarchy. But there’s a cultural reason for it: ponies greatly value companionship, arguably above all else (which, incidentally, Admiral Biscuit pointed out reflects actual Terrestrial equine behavior), so it would make sense for them to have a case specifically for being with someone else.

You might think it’s a little strange to have more numbers besides singular and plural, but it’s not entirely out of the question. Hell, Proto-Indo-European used to have a dual number, though most of its descendants have all but gotten rid of it. There still exist some remnants of it in English, e.g. we still use words like both and neither. Only a few Indo-European languages, such as Slovene and Sorbian, still maintain a full dual number.

Looking outside the Indo-European family, other languages that have dual numbers are Arabic, Hebrew (only for time, number, and natural pairs), the Austronesian family (only for pronouns), the Samoyedic languages in the Uralic family (Finnish, Estonian, and Hungarian have lost it), and a few Pama-Nyungan languages like Yidinʸ and Woiwurrung-Daungwurrung.

But what about paucal? That’s much less common, but is used in Arabic for some nouns, Hopi, Fijian, and there’s some evidence that it exists in the Serbo-Croatian dialects. As you might see, very few languages, if any, consistently apply these four numbers throughout themselves, relegating them instead to a few places where they ‘make sense.’

Another grammatical number exists, but is not used here: trial. As the term suggests, it refers to a group of three items, as opposed to one (singular), two (dual), or four or more (plural). There once was a theoretical ‘quadral’ grammatical number, suggested by the pronoun systems of Marshallese and Sursurunga, but it turned out Marshallese simply has trial and paucal numbers, and Sursurunga has two different paucal numbers: one for three or four items, the other with a minimum of four (think “a few” versus “several”).

Speaking of pronouns, I simply contented myself at first with adding just the three persons (first, second, third) – these are universal enough, and are perfectly suited here. I later added a reflexive pronoun, which doubles as the passive voice. The fourth person is really a zero person, for which in English we would use one, e.g. “One sees the tree.”

Using the first-person dual as a ‘royal We’ is, admittedly, breaking my own rule for observing canon from only season 1. Mea máxima culpa.

The third person does not make any distinction as to gender or noun class, so don’t go thinking I went full Tumblr here.

The paper clearly implied the behind-the-scenes origin of awidṛ’s dual definitions as “1000₆” and “countless” – but not the reason why. That’s just a bit of tongue-in-cheek humor on my part; I like to imagine the show as being a poor English translation of the Equestrian original.

I rather liked sprinkling a few references here and there into the language’s vocabulary. For another instance, the Equestrian word for “harmony,” ơh, is a reference to Starscribe’s Message in a Bottle, which calls the language Eoch. That may or may not have been how it’s actually pronounced, but it’s a little too late to go back on it now, isn’t it?

Here’s one you might’ve missed: the Equestrian word for “mystery” is yesik – a nod to A Voice Among the Strangers, in which protagonist Jessica Rivera is named ~Mystery~ by the ponies. Another callback to the same work was naming “black” ebơni.

Another bit would have been an idiom I regrettably forgot to add. I’ve always liked to use “to dine/drink with Discord” in ponies’ speech to mean “to profit from misfortune.” If you want a human equivalent, try “to dine/drink with the Devil.”

Another regret: kinship. For a culture so centered around community, you’d think that’d be one of the first things I’d outline, or at least mention. If someone wants to fill in the blanks, here’s what I’ve envisioned.

To start, consider this diagram. Source

Notice how, as one gets further away from oneself, the attachments become weaker. (Ideally, that’s how it should be – diagrams like these are meant to educate autistic people on which interactions are appropriate for which distances.) Equestrian’s would work more on a spectrum than in clearly-defined circles, with boundaries much looser than in any other part of the language.

You’ve seen them in action back in Chapter 20, and now you’ve seen how they work. Color categories are basic terms for what a language “sees.” Of course, assuming you’re not colorblind, you can see all of them – you simply “group” them into different categories.

Not only that, color categories follow a hierarchy. In 1969, Brent Berlin and Paul Kay published a book called Basic Color Terms: Their Universality and Evolution. In it, they outlined a seven-stage hierarchy of a language acquiring color categories, in a strict progression from one to the next:

I. Black and white (or dark/cool and light/warm; this is a broad category)
II. Red
III. Green or yellow
IV. Both in III
V. Blue
VI. Brown
VII. Gray, orange, pink, and purple
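To make the progression concrete, here is a small Python sketch that takes a set of basic color terms and reports the stage it would sit at under the strict reading of the hierarchy. The function name is hypothetical, and treating any one of the last four terms as enough for the final stage is my own simplifying assumption.

```python
def berlin_kay_stage(terms):
    """Return the stage (1-7) reached by a set of color terms, or 0 if none.

    Follows the strict progression: each stage requires all earlier ones.
    """
    terms = set(terms)
    if not {"black", "white"} <= terms:
        return 0
    if "red" not in terms:
        return 1
    green_yellow = terms & {"green", "yellow"}
    if not green_yellow:
        return 2
    if len(green_yellow) == 1:
        return 3  # green OR yellow, but not both yet
    if "blue" not in terms:
        return 4
    if "brown" not in terms:
        return 5
    # Assumption: any one of the last four terms is enough for the final stage.
    if not terms & {"gray", "orange", "pink", "purple"}:
        return 6
    return 7


english = {"black", "white", "red", "green", "yellow", "blue",
           "brown", "gray", "orange", "pink", "purple"}
print(berlin_kay_stage(english))
print(berlin_kay_stage({"black", "white", "red", "yellow"}))
```

Real languages are messier than this, so treat it as a toy model of the 1969 version only.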

And it’s not just these categories, either; some languages (using Russian as an example) recognize light blue (голубой (goluboj)) and dark blue (синий (sinij)) as separate categories, semantically impossible to confuse with one another.

French has a beige color category, referring to undyed wool.

Hungarian has two color categories for what English calls “red” – piros and vörös (spelt also veres, though I’m told this is archaic and/or dialectal). On a color chart, a Hungarophone would indicate that vörös is a darker, more definite hue (since it is derived from the Proto-Uralic root *were, “blood;” compare Hungarian vér, Finnish/Estonian veri), while piros can cover a few shades of what we call “orange.” But there’s also a more grammatical difference between them – vörös tends to be used for animate objects, while piros tends to be used for inanimate objects. For another division, vörös also has a much more serious connotation than piros.

For instance, the top stripe of the Hungarian flag is defined as piros. A common rhyme taught to schoolchildren is “Piros-fehér-zöld, ez a magyar föld” (“Red-white-green, this is the Hungarian land”). Source

Color categories aren’t set in stone, either. In 2015, researchers in London found that English may be developing two more color categories – lilac and turquoise (basically like the light/dark blue split Russian exhibits).

Over the intervening years, however, as color categories have been further explored, the strictness of this hierarchy has loosened somewhat; even Berlin and Kay became more lenient in ordering them. In 2012, a computational model showed that the rods and cones of the human eye are greatly (though not entirely) responsible for the color category hierarchy.

Naturally, Equestrian color categories would also encompass those invisible to the human eye.

Verb conjugation can be a lot of fun. After all, verbs are one of the few universal grammatical categories in linguistics, so why not have some fun with them?

First, the tenses. Past, present, and future seem like obvious choices, but these aren’t universal. Quite often the future tense is missing – Japanese, Arabic, and (morphologically) English, for instance. Some languages lack a past tense, such as Quechua and Greenlandic. And some languages lack tense altogether – Chinese and Dyirbal come to mind.

(As an aside, I have not heard of a language making a distinction between present and nonpresent tenses, i.e. the past and the future are expressed as the same tense. Might be worth looking into.)

But some not only have all three tenses, they further subdivide them to make finer lexical distinctions. Kalaw Lagaw Ya, spoken in the islands in the Torres Strait (between Australia’s Cape York Peninsula and Papua New Guinea), divides their time into six tenses: the remote past, the recent past, the today past (events that have already happened, but still within the current day), the present, the immediate future, and the remote future. Cubeo, spoken in the Amazon forest in Brazil and Colombia, has a separate past tense used for historical events.

A cyclic tense is usually used for events described relative to a reference point or span (which is not the same as temporal aspect, before you ask). I threw that into Equestrian just for the hell of it, redefining it as for actions that occur repeatedly (like, say, the next villain of the week), but regrettably have not found a place to use it in the story.

Then of course, I simply had to conjugate the verbs for person – by which I mean the subject/agent of the sentence. I charted up the five persons (including zero and reflexive) and the four numbers for each save for reflexive.

Then I found myself in a pinch: I had no markings for noun classes. Pondering how I should express them in the grammar, I came across how Cree simply put the markings in the verbs instead, and copied that for the third person specifically. When I did, I suddenly realized what a good idea that was – even if the subject/agent isn’t mentioned, one could deduce what that is from the noun class for which the verb is conjugated.

I’ve saved the best for last – verb agglutination. This is a process I’ve never seen before in the real world, so that made me want to explore the possibility all the more.

Constructing a language is hard enough, and I’ll be the first to tell you that. But constructing one with distinct dialects? Hoo boy!

. . . or so one thinks at first. But really, when you think about it – all you’re doing is applying a very small number of phonological and/or grammatical differences between them – it’s really not that hard.

Many of these interdialectal changes originated from previous iterations of Equestrian. In particular, [ʎ] was added to the Equestrian phonology following a major restructuring. However, it turns out that it’s not a very commonly-used sound, and even where it does appear, it’s usually an allophone of [l] before [j] – and when I was composing a glyph for it, naturally my hand drifted to ligating those of [j] and [l], in that order. This not only gave me an excuse to cut it from the main (read: Common) dialect, it also provided an avenue for the relevant sound change, and part of the foundation for the Arcane dialect. This joined the true voiceless nasals and the true bilabial fricatives to create the most phonetically-divergent dialect of the three.

[ɦ] for the null onset in the Courier dialect actually came out of nowhere – it was a total spur-of-the-moment decision. But switching the palatal stops to postalveolar affricates seemed right to me – they sound louder than palatal stops, which is important for a tribe that spends a lot of time in the air, so they can hear one another over the roar of air rushing past them.

A register is a form of language used for a specific purpose. This is different from a dialect per se; languages typically have formal and informal registers, e.g. English “kids” versus “children,” “walking” versus “walkin’,” and so forth.

Some languages make further register distinctions. For example, Thai has five official registers:

“Common” Thai is informal, typically used between friends and close relatives.
“Formal” Thai is used in writing, and includes respectful honorifics; newspapers use a simplified form.
“Rhetorical” Thai is used in public speaking.
“Religious” Thai is used for discussing Buddhism and for addressing Buddhist monks. This bears heavy influence from Sanskrit and Pāli.
“Royal” Thai is used for addressing members of the Thai Royal family. This is influenced by Khmer.

The writing system’s design was in large part drawn up to Admiral Biscuit’s specifications. He told me what he envisioned for his Onto the Pony Planet, and I simply put his words into more technical vocabulary and went from there.

Horses, certainly the real-world variety, aren’t terribly flexible. Their best tool, as demonstrated in the show, would be to hold a writing implement in the mouth, and carefully flex the lips and jaw in such a manner to produce writing. (As far as I know, I seem to have coined the word “stomagraphy” to describe this process.)

But even then, there are only so many shapes one could produce this way, and those one could produce tend to be very simple and abstract. The script, as seen in the show, is clearly alphabetic – a design which I consciously rejected outright. Instead, I took Admiral Biscuit’s idea, and coupled it with Hangeul – the featural alphabet used to write Korean – to produce perhaps the world’s first featural abugida.

How this script works is clearly outlined in Chapter 17, but to help flesh it out further, I will reïterate it here. For most consonants, the basic letter shapes (which behind the scenes are called “hooves”) tell you where in the mouth the consonant is articulated. Inset within these are several simple shapes – lines and circles (called “horns” behind the scenes) – to tell you how the consonant is articulated. The circle indicates a nasal consonant, and a vertical line indicates a fricative. The horizontal line devoices the consonant, meant for the terrestrial-harmony – picture a flat ground.

The rainy consonants have their own shapes, called “wings” – usually a simple curve, though the Arcane [ʎ] has a more complex curve, owing to its blatant origin as a ligature, the unicorns’ use of magic enabling its creation. The null onset counts as a wing, as it is pronounced in the Courier dialect. Finally, each consonant, no matter what, carries an inherent vowel – /e/ or /a/, depending on the vowel harmony.

Wings can frame the hooves and wings as well – these markings determine the vowel, when changed from the inherent /e/ or /a/. Above can be placed the lunar-harmony mark – itself the shape of a crescent moon, which in turn neatly defined the vowel harmony as a labial (rounding) one. Next to that can be placed the question mark, a feature inspired by the Armenian script (e.g. Ինչպէ՞ս ես։ (Inč’pẻs es, “How are you?”)).

Below that, when not occupied by the vowel marks for /r̩/ or /l̩/, a dot can be placed to indicate a nasal vowel (originally a circle, but got simplified), or a horizontal line to indicate a mute vowel.
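If it helps to see the “featural” part spelled out, here is a rough Python sketch of how a hoof and its horns combine into a consonant description. The class name, the place labels, and the defaults (no manner horn means a stop, no horizontal line means voiced) are my own assumptions for illustration; Chapter 17 remains the real reference.

```python
from dataclasses import dataclass

# Horn shapes and what they signal, as described above.
MANNER_HORNS = {"circle": "nasal", "vertical line": "fricative"}
DEVOICING_HORN = "horizontal line"


@dataclass(frozen=True)
class Glyph:
    place: str                     # conveyed by the hoof (the base shape)
    horns: frozenset = frozenset()  # shapes inset within the hoof

    def describe(self) -> str:
        manners = [MANNER_HORNS[h] for h in self.horns if h in MANNER_HORNS]
        if len(manners) > 1:
            raise ValueError("a glyph carries at most one manner horn in this sketch")
        # Assumption: a hoof with no manner horn is a plain stop.
        manner = manners[0] if manners else "stop"
        # Assumption: consonants are voiced unless the horizontal line is present.
        voicing = "voiceless" if DEVOICING_HORN in self.horns else "voiced"
        return f"{voicing} {self.place} {manner}"


print(Glyph("velar", frozenset({"circle"})).describe())
print(Glyph("labial", frozenset({"vertical line", "horizontal line"})).describe())
```

The point is only that each stroke contributes one phonetic feature, which is what separates a featural script from a plain alphabet.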

Most of the wings for vowel marking came to me naturally, and I built the corresponding rainy consonants around their designs. The only one that threw me for a loop was coming up with a marking for /ɤ/-/o/. I’ve tried several designs, from adding other marks to curving the bend into a loop, but none of them had the elegance I wanted. Then I remembered how Sanskrit described their vowels – guttural (/a/), palatal (/i/), labial (/u/), retroflex (/r̩/), dental (/l̩/), palatoguttural (/e/), and labioguttural (/o/). That last one got me thinking: what if I simply combined two existing wings together? That answer came to me while I was in the middle of writing Chapter 17, and was absolutely spur-of-the-moment. (Granted, an actual “labiopalatal” vowel would be [y], but screw it, this works perfectly.)

The ordering of the vowels and consonants was inspired by that of Sanskrit. I simply followed the same principles as the ancient Indian linguist Pāṇini: going from simple to complex, going from the back of the mouth to the front, and grouping similar sounds. These principles applied to the vowels as well as the consonants, but I also wanted to follow the phonological harmonies as well.

True to Pāṇini, I mostly went from the null consonant to the velars, to the palatals, to the dentals, to the labials. The only exception was grouping the sibilants with the stops and nasals (where in Sanskrit they would be grouped and ordered last of all), but otherwise worked just the same.

Vowels were more divergent – I mostly went with a Pāṇini-like order, but placed the neutral vowels at the very end, ordering them the same way as their consonantal equivalents.

But there’s a lot more to say about how scripts are ordered. You have a good idea how Sanskrit does theirs, and you’re familiar with the ABC order of the Latin script – but I’m far from done.

The original ABCs were based off of the ordering of the Phoenician abjad:

𐤕 𐤔 𐤓 𐤒 𐤑 𐤐 𐤏 𐤎 𐤍 𐤌 𐤋 𐤊 𐤉 𐤈 𐤇 𐤆 𐤅 𐤄 𐤃 𐤂 𐤁 𐤀‎
ʾ, b, g, d, h, w, z, ḥ, ṭ, y, k, l, m, n, s, ʿ, p, ṣ, q, r, š, t

(Note: Phoenician is normally written right-to-left, as with other scripts in the Middle East; here, the order has been reversed as a guide for the reader to help match the letters to their Romanizations.)

Egyptian hieroglyphs (from which the Phoenicians got the idea for their script) did not have a collated order, and this order was selected basically at random. Over the intervening centuries and millennia, letters cropped up, and letters disappeared, but glimmers of the same order still remain.

Some languages, like Sanskrit, make a conscious effort to order their glyphs logically. Chinese does this too, even though it doesn’t have letters per se. Its lexicographers have, however, found a few ways to sort their logograms by common elements, or radicals. The most popular scheme is the 214 Kāngxī radicals, as compiled in 1615 (though they are named after a 1716 dictionary, published during the Kāngxī (康熙) period). These are arranged first by stroke number, then by stroke direction:

一丨丶丿乙亅
二亠人儿入八冂冖冫几凵刀力勹匕匚匸十卜卩厂厶又
口囗土士夂夊夕大女子宀寸小尢尸屮山巛工己巾干幺广廴廾弋弓彐彡彳
心戈戶手支攴文斗斤方无日曰月木欠止歹殳毋比毛氏气水火爪父爻爿片牙牛犬
玄玉瓜瓦甘生用田疋疒癶白皮皿目矛矢石示禸禾穴立
竹米糸缶网羊羽老而耒耳聿肉臣自至臼舌舛舟艮色艸虍虫血行衣襾
見角言谷豆豕豸貝赤走足身車辛辰辵邑酉釆里
金長門阜隶隹雨靑非
面革韋韭音頁風飛食首香
馬骨高髟鬥鬯鬲鬼
魚鳥鹵鹿麥麻
黃黍黑黹
黽鼎鼓鼠
鼻齊
齒
龍龜
龠
one, line, dot, slash, second, hook
two, lid, man, legs, enter, eight, wide, cloth cover, ice, table, receptacle, knife, power, wrap, spoon, box, hiding enclosure, ten, divination, seal (i.e. stamp), cliff, private, again
mouth, enclosure, earth, scholar, go, go slowly, evening, big, woman, child, roof, inch, small, lame, corpse, sprout, mountain, river, work, oneself, scarf, dry, small, cliff, long stride, arch, shoot (i.e. an arrow), bow, snout, bristle, step
heart, halberd, door, hand, branch, tap, writing, dipper, axe, square, not, sun, say, moon, tree, lack, stop, death, weapon, do not, compare, fur, clan, steam, water, fire, claw, father, Trigrams, split wood, slice, fang, cow, dog
profound, jade, melon, tile, sweet, life, use, field, bolt of cloth, sickness, footsteps, white, skin, dish, eye, spear, arrow, stone, spirit, track, grain, cave, stand
bamboo, rice, silk, jar, net, sheep, feather, old, and, plow, ear, brush, meat, minister, self, arrive, mortar, tongue, oppose, boat, stopping, color, grass, tiger, insect, blood, walk enclosure, clothing, cover
see, horn, speech, valley, bean, pig, badger, shell, red, run, foot, body, cart, bitter, morning, walk, city, wine, distinguish, village
metal, long, gate, mound, slave, short-tailed bird, rain, green/blue, wrong
face, leather, tanned leather, leek, sound, leaf, wind, fly, food, head, fragrant
horse, bone, tall, hair, fight, sacrificial wine, cauldron, ghost
fish, bird, salt, deer, wheat, hemp
yellow, millet, black, embroidery
frog, tripod, drum, rat
nose, even
tooth
dragon, turtle
flute

(In case you were curious, the last one has seventeen strokes.)
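As a quick sanity check on that order, here is a Python sketch that leans on Unicode’s dedicated Kangxi Radicals block, which starts at U+2F00 and runs in dictionary order. The function names are mine, and the per-stroke tallies are the standard row sizes for the 214 radicals (they should sum to 214); Python’s standard library does not expose stroke counts directly.

```python
import unicodedata

# Number of radicals at each stroke count, from 1 stroke up to 17 strokes.
RADICALS_PER_STROKE_COUNT = [6, 23, 31, 34, 23, 29, 20, 9, 11, 8, 6, 4, 4, 2, 1, 2, 1]


def radical_char(number: int) -> str:
    """Return the everyday character for Kangxi radical `number` (1-214)."""
    if not 1 <= number <= 214:
        raise ValueError("Kangxi radicals are numbered 1 to 214")
    block_char = chr(0x2F00 + number - 1)
    # NFKC normalization maps the block character to the ordinary ideograph.
    return unicodedata.normalize("NFKC", block_char)


def stroke_count(number: int) -> int:
    """Return the stroke count of Kangxi radical `number` from the row sizes."""
    if not 1 <= number <= 214:
        raise ValueError("Kangxi radicals are numbered 1 to 214")
    remaining = number
    for strokes, size in enumerate(RADICALS_PER_STROKE_COUNT, start=1):
        if remaining <= size:
            return strokes
        remaining -= size
    raise AssertionError("row sizes do not cover 214 radicals")


print(sum(RADICALS_PER_STROKE_COUNT))
print(radical_char(187), stroke_count(187))
```

Radical 187 is, fittingly, the horse.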

The Kāngxī order is the one you would typically find in dictionaries for Traditional Chinese, Japanese, and Korean. On occasion, those dictionaries might use a different radical order, or even develop a new one in-house, but the Kāngxī order tends to be more reliable.

Simplified Chinese, as prescribed by the Chinese government, uses a different, 201-radical order, defined in 2009 as the following:

一丨丿丶乛
十厂匚卜冂八人勹儿匕几亠冫冖凵卩刀力又厶廴
干工土艹寸廾大尢弋小口囗山巾彳彡夕夂丬广门宀辶彐尸己弓子屮女飞马么巛
王无韦木支犬歹车牙戈比瓦止攴日贝水见牛手气毛长片斤爪父月氏欠风殳文方火斗户心毋
示甘石龙业目田罒皿生矢禾白瓜鸟疒立穴疋皮癶矛
耒老耳臣覀而页至虍虫肉缶舌竹臼自血舟色齐衣羊米聿艮羽糸
麦走赤豆酉辰豕卤里足邑身釆谷豸龟角言辛
青龺雨非齿黾隹阜金鱼隶
革面韭骨香鬼食音首
髟鬲鬥高
黄麻鹿
鼎黑黍
鼓鼠
鼻
龠
one, line, slash, dot, second
ten, cliff, box, divination, wide, eight, man, wrap, legs, spoon, table, lid, ice, cover, receptacle, seal (i.e. stamp), knife, power, again, private, long stride
dry, work, earth, grass, inch, arch, big, lame, shoot (i.e. an arrow), small, mouth, enclosure, mountain, scarf, step, bristle, evening, go, split wood, cliff, gate, roof, walk, snout, corpse, oneself, bow, child, sprout, woman, fly, horse, small, river
jade, not, tanned leather, tree, branch, dog, death, cart, fang, halberd, compare, tile, stop, tap, sun, shell, water, see, cow, hand, steam, fur, long, slice, axe, claw, father, moon, clan, lack, wind, weapon, writing, square, fire, dipper, door, heart, do not
spirit, sweet, stone, dragon, work, eye, field, net, dish, life, arrow, grain, white, melon, bird, sickness, stand, cave, bolt of cloth, skin, footsteps, spear
plow, old, ear, minister, cover, and, leaf, arrive, tiger, insect, meat, jar, tongue, bamboo, mortar, self, blood, boat, color, even, clothing, sheep, rice, brush, stopping, feather, silk
wheat, run, red, bean, wine, morning, pig, salt, village, foot, city, body, distinguish, valley, badger, turtle, horn, speech, bitter
green/blue, profound, rain, wrong, tooth, frog, short-tailed bird, mound, metal, fish, slave
leather, face, leek, bone, fragrant, ghost, food, sound, head
hair, cauldron, fight, tall
yellow, hemp, deer
tripod, black, millet
drum, rat
nose
flute

But earlier than both of these is the Shuōwén Jiězì, an early Han-Dynasty-era dictionary that sorted characters by a system of 540 radicals. I will not be listing them here.

Hangeul also orders its alphabet logically, though the actual order has evolved significantly alongside the language. Originally in 1446, the letters, or jamo (자모), were ordered:

ㄱ ㅋ ㆁ ㄷ ㅌ ㄴ ㅂ ㅍ ㅁ ㅈ ㅊ ㅅ ㆆ ㅎ ㅇ ㄹ ㅿ
ㆍ ㅡ ㅣ ㅗ ㅏ ㅜ ㅓ ㅛ ㅑ ㅠ ㅕ
g, k, ng, d, t, n, b, p, m, j, ch, s, ', h, null, r, z
ə, eu, i, o, a, u, eo, yo, ya, yu, yeo

Note how all the consonants come before all the vowels, and how both are collated separately.

In 1527, Choe Sejin, a Korean linguist, reordered the jamo as follows:

ㄱ ㄴ ㄷ ㄹ ㅁ ㅂ ㅅ ㆁ ㅋ ㅌ ㅍ ㅈ ㅊ ㅿ ㅇ ㅎ
ㅏ ㅑ ㅓ ㅕ ㅗ ㅛ ㅜ ㅠ ㅡ ㅣ ㆍ
g, n, d, r, m, b, s, ng, k, t, p, j, ch, z, null, h
a, ya, eo, yeo, o, yo, u, yu, eu, i, ə

This would form the basis of the two modern orders. Yes, two – since North and South Korea deal with the ssang (쌍, “twin, pair”) jamo and vowel digraphs differently, as these had evolved into existence after 1527. The North Korean order is as follows:

ㄱ ㄴ ㄷ ㄹ ㅁ ㅂ ㅅ ㅈ ㅊ ㅋ ㅌ ㅍ ㅎ ㄲ ㄸ ㅃ ㅆ ㅉ ㅇ
ㅏ ㅑ ㅓ ㅕ ㅗ ㅛ ㅜ ㅠ ㅡ ㅣ ㅐ ㅒ ㅔ ㅖ ㅚ ㅟ ㅢ ㅘ ㅝ ㅙ ㅞ
g, n, d, r, m, b, s, j, ch, k, t, p, h, kk, tt, pp, ss, jj, null
a, ya, eo, yeo, o, yo, u, yu, eu, i, ae, yae, e, ye, oe, wi, ui, wa, wo, wae, we

The South Korean order, the standard used for Korean-language learners outside of either country, is as follows:

ㄱ ㄲ ㄴ ㄷ ㄸ ㄹ ㅁ ㅂ ㅃ ㅅ ㅆ ㅇ ㅈ ㅉ ㅊ ㅋ ㅌ ㅍ ㅎ
ㅏ ㅐ ㅑ ㅒ ㅓ ㅔ ㅕ ㅖ ㅗ ㅘ ㅙ ㅚ ㅛ ㅜ ㅝ ㅞ ㅟ ㅠ ㅡ ㅢ ㅣ
g, kk, n, d, tt, r, m, b, pp, s, ss, null, j, jj, ch, k, t, p, h
a, ae, ya, yae, eo, e, yeo, ye, o, wa, wae, oe, yo, u, wo, we, wi, yu, eu, ui, i

Both of them agree, however, that a few jamo had become obsolete and were thus removed. These include ng ⟨ㆁ⟩, ' ⟨ㆆ⟩, z ⟨ㅿ⟩, and ə ⟨ㆍ⟩. In fact, the null onset ⟨ㅇ⟩ and ng ⟨ㆁ⟩ were combined into one jamo ⟨ㅇ⟩ – silent as an onset, but pronounced [ŋ] as a coda.
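To see what the two modern orders do to an actual word list, here is a Python sketch that sorts Hangeul words under either one. It relies on the fact that Unicode’s precomposed syllables (starting at U+AC00) are laid out in the South Korean order, so a syllable can be split arithmetically. The function names and sample words are mine, and comparing final consonants by their Unicode index under both orders is a simplifying assumption.

```python
SOUTH_INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
NORTH_INITIALS = "ㄱㄴㄷㄹㅁㅂㅅㅈㅊㅋㅌㅍㅎㄲㄸㅃㅆㅉㅇ"
SOUTH_VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
NORTH_VOWELS = "ㅏㅑㅓㅕㅗㅛㅜㅠㅡㅣㅐㅒㅔㅖㅚㅟㅢㅘㅝㅙㅞ"


def split_syllable(syllable):
    """Split one precomposed syllable into (initial, vowel, final index)."""
    index = ord(syllable) - 0xAC00
    if not 0 <= index < 19 * 21 * 28:
        raise ValueError(f"not a precomposed Hangeul syllable: {syllable!r}")
    initial, rest = divmod(index, 21 * 28)
    vowel, final = divmod(rest, 28)
    return SOUTH_INITIALS[initial], SOUTH_VOWELS[vowel], final


def sort_key(word, initials, vowels):
    """Build a sort key for `word` under the given jamo orders."""
    key = []
    for syllable in word:
        initial, vowel, final = split_syllable(syllable)
        # Assumption: finals keep their Unicode index in both orders.
        key.append((initials.index(initial), vowels.index(vowel), final))
    return key


words = ["하늘", "아기", "까치", "나비", "가방"]
south = sorted(words, key=lambda w: sort_key(w, SOUTH_INITIALS, SOUTH_VOWELS))
north = sorted(words, key=lambda w: sort_key(w, NORTH_INITIALS, NORTH_VOWELS))
print(south)
print(north)
```

The doubled consonants and the null onset are what move: 까치 and 아기 drop to the end of the list under the North Korean order.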

But as much as I talk about the orders used in Chinese and Korean, my favorite has got to be the traditional hanacaraka order used in the Javanese abugida in Indonesia. The letters are arranged in such a way that they actually tell a story!

Hana caraka
Data sawala
Padha jayanya
Maga bathanga

“There were two messengers who quarreled with each other. They were equal in fight; here are their bodies.”

Balinese and Sundanese use the same order, less tha and dha since they do not have these sounds.

And what better topic to top off my notes than writing about punctuation?

This is a relatively new concept – writing has existed since antiquity, but since writing systems always started out as logographic or syllabic, where one glyph represented an entire word or morpheme, not even spacing was necessary. That’s right – spaces are punctuation!

Even as Phoenician developed from Egyptian, writing remained as scriptio continua – no spaces. After 200 BC, the Greeks started using a system developed by Aristophanes of Byzantium, consisting of a series of dots – the ὑποστιγμή (hypostigmḗ, “low dot”) to end a subclause, a στιγμὴ μέση (stigmḕ mésē, “middle dot”) to end a clause, and a στιγμὴ τελεία (stigmḕ teleía, “high dot”) to end a sentence. They also used the παράγραφος (parágraphos) to start a sentence, διπλῆς (diplês) in the margins to mark a quotation, and the κορωνίς (korōnís) to end a major section.

These were eventually inherited by the Romans, often to indicate pauses when reading texts aloud. Another thing they did was write text out per capitula, where every sentence was on its own line.

Punctuation development exploded in the west when the Bible was being copied in large numbers – standardized, alongside spelling rules, with the invention of the printing press. Cecil Hartley wrote a poëm in 1860 about how much pause to give each mark:

The stops point out, with truth, the time of pause
A sentence doth require at ev’ry clause.
At ev’ry comma, stop while one you count;
At semicolon, two is the amount;
A colon doth require the time of three;
The period four, as learned men agree.
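Taken literally, the poem is a lookup table. Here is a tiny Python sketch that adds up the counts for a sentence; the function name and the sample sentence are my own, and marks the poem doesn’t mention simply count as zero.

```python
# Pause lengths (in counts) as given in the poem.
PAUSE_COUNTS = {",": 1, ";": 2, ":": 3, ".": 4}


def total_pause(text: str) -> int:
    """Sum the pause counts for every mark in `text` that the poem covers."""
    return sum(PAUSE_COUNTS.get(char, 0) for char in text)


print(total_pause("Stop, look; listen: go."))
```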

Children in the nineteenth century were taught the importance of punctuation with sayings like “Charles the First walked and talked half an hour after his head was cut off.” One should add a semicolon after “talked,” and a comma after “after.”

And the quote marks! There’s no single way to show spoken dialogue in print – different languages do it in different ways, and even then, different countries sharing a language have different ways. In fact, there are so many ways to do quote marks that it’d take an essay by itself to detail them all. I’ll take the liberty of sparing you that headache, though.

The typewriter forced a few concessions due to limited room on the keyboard – quotation marks, as curly as they were, were straightened out so they could open and close with the same key. Even the exclamation point had to be built, by overstriking an apostrophe with a period.

Meanwhile, westernization elsewhere in the world meant that other scripts picked up punctuation, typically by borrowing from Portuguese or Dutch traders. The Arabic script, for instance, often uses reversed question marks (⟨؟⟩) and commas (⟨،⟩), since it is written right-to-left; Hebrew, while written in the same direction, does not reverse these marks. Chinese, Japanese, and Korean have all borrowed the period and comma to mark off the ends of sentences and clauses, respectively (previously inferred from writing).

Punctuation is still evolving today – @, once used as an obscure symbol for selling bulk commodities (where it originally meant “at the amount of”), has been revived in the digital age, meant for technical routing, e.g. email addresses and Twitter handles. The tilde, used on the typewriter only for striking that diacritic, is now used for estimates (e.g. π = ~3.14) or to punctuate emotionally-charged sentences – there, it’s usually sexual, though I have seen it used for romance, including flirting, and even simple friendship.
