The Strangest Letter of the Alphabet: the Rise and Fall of Yogh
The article discusses the history of the letter yogh, a letter used in Middle English that eventually fell out of use, sparking a discussion on the quirks of the English language and its spelling system.
Snapshot generated from the HN discussion: story posted Oct 2, 2025 at 5:34 PM EDT; first comment 53 minutes after posting; peak activity of 110 comments in the first 12 hours; latest activity Oct 10, 2025 at 1:31 AM EDT.
The Super key could arguably apply to the Shift keys, because you are using a superset of letters (or am I reaching too far?)
But the Meta key already exists, and on Windows it is the Windows key. We're going around in circles.
Approaching an Apple keyboard for the first time, I naively thought Command would be Ctrl and was quite confused, since there is also a Ctrl key. But once you start a terminal, it becomes quite clear that this is not the case. This is quite neat: it directly resolves the SIGINT, SIGSTOP / Copy, Ctrl-D confusion. Having these operations as OS commands also means that all programs support them.
You're not though. The original letters are not available, so uppercase is not a superset of lowercase. (Unless you're just stating your opinion that the set of uppercase letters happens to be super)
We have two letters that are visually similar to yogh 'ȝ': 'ვ' (an English 'v', or 'w' if you're Tbiliseli), and 'პ' (a hard 'p'). Our 'gh' sound is 'ღ'.
I have no doubt that two very disparate languages and scripts will find a few similarities simply due to proximity. Georgia and England (UK) are close enough for a fair amount of cultural exchange.
And AFAIK the Georgian alphabet predates Insular script by a few centuries.
It's not uncommon for otherwise unrelated alphabets to come with similar symbols simply because the trend over time is towards simplification of letter shapes, and there are only so many basic elements. So even originally quite different characters can end up looking very similar when reduced to a few squiggles.
People have an extremely distorted perspective on European history for many reasons, but the late industrial age nation state probably had the biggest impact on the mental model people still hold today. By all evidence I've seen, cultural exchange in the distant past was far more organic than most people can easily imagine. Trade, cultural communion, religious exchanges, and defensive unions all made that possible in a world that was not nearly as controlled and authoritarian as the one we experience today. It all waxed and waned over the centuries and across regions, of course, in a rather organic manner; but due to practical limitations, a lot of the authoritarian restrictions we are all subject to today simply did not exist.
In some ways, the USA until about 1960 is probably the closest analogue to how Europe generally was in the long period leading up to the Industrial Revolution. It was a land of self-regulating regions and cultural clusters, with varying local jurisdictions and power structures, which to a large extent kept most people in their home region, if not their place of birth. By the latter part of that period, identification with one's state, region, and local culture had already largely succumbed to the centralizing power of the federal government, but your region was still largely your cultural identity as a person and a community.
That has of course all been razed and destroyed now, and the USA effectively exists in name only today, and has for some time, but that's a different topic altogether.
In the last week, our most famous Menzies passed away: the politician Sir Menzies "Ming" Campbell.
Some wag made a 'ming vase' of his face: https://www.portrait.gov.au/portraits/2004.176/ming-vase-sir...
Different hill, but one I would die on: the letter "c" should make the "ch" sound, since otherwise it serves no purpose not already handled by "s" or "k".
That last one is hardly fair - gist and mirage are French words. Might as well complain about the silent letters in rendezvous or faux pas.
So surgery is full of -ectomies instead of -cut-outs.
I wonder whether German brains work with a much longer context window because of the language?
b) the official title of the law was "Gesetz zur Übertragung der Aufgaben für die Überwachung der Rinderkennzeichnung und Rindfleischetikettierung" (roughly, "Law on the transfer of the duties of monitoring cattle identification and beef labelling"), so how again is it that English "gets to use a sentence" and German doesn't? German has the choice depending on context; sometimes having one word is convenient.
The compound word also has a specific meaning that the same words with a space between them don't. For example, "das rote Kraut" ("red herb") versus "das Rotkraut" ("red cabbage"). Also, suppose a red cabbage was grown in abnormal conditions, so it lacks the color pigments: it is still "red cabbage", but not a "red" "cabbage". This is awkward to state in English, but no problem in German.
Maybe, but more due to the spelling of numbers and long sentences. Compound words are not an example of this, since Germans can parse these words just fine as different things. It just means that the lowest level of "tokenization" in everyday use is not the word but its subcomponents.
Do English native speakers "tokenize" expressions in words? Do you see it as '(labelling) (of) (minced)' or '(label)l(ing) (of) (minc)(ed)' ?
I can't speak for most Germans, but the algorithm I think I use is just greedy from left to right. This is also consistent with how mistokenization in common puns works, so I think this is common.
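For what it's worth, here is a toy sketch of that greedy left-to-right splitting in Python. The vocabulary is a made-up mini-lexicon standing in for a real German one, so treat this as an illustration of the idea, not a real decompounder:

```python
# Toy sketch of greedy left-to-right compound splitting. The vocabulary
# is a made-up mini-lexicon, not real German morphology.
VOCAB = {"rind", "fleisch", "etikettierung", "rot", "kraut"}

def greedy_split(word: str, vocab: set) -> list:
    """Repeatedly take the longest known prefix of what remains."""
    parts, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # longest candidate first
            if word[i:j] in vocab:
                parts.append(word[i:j])
                i = j
                break
        else:
            parts.append(word[i:])          # unknown remainder: give up
            break
    return parts

print(greedy_split("rindfleischetikettierung", VOCAB))
# ['rind', 'fleisch', 'etikettierung']
```

Because it always commits to the longest prefix, a scheme like this can also mis-split ambiguous compounds, which is consistent with how the puns mentioned above work.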
In primary school we were trained to recognize syllable boundaries. Is that just a German thing, or is it common in other countries? You need to know these for spelling, and once you do, separating word components becomes trivial.
Not entirely true. English, like any other Germanic language, still likes to compound words to produce a new meaning; the main difference is that, as opposed to most other Germanic languages, spaces are usually retained in writing. But this is just a spelling difference; the underlying process is the same.
See https://en.m.wikipedia.org/wiki/Compound_(linguistics)
Let's consider "scheepskapitein", "Schiffskapitän" and "ship captain". All three are formed in exactly the same way and mean roughly the same thing, but it's customary in Dutch and German to spell them without a space, while in English it's considered correct to have a space in between. Note that there are no spaces in speech; it's simply a writing convention. So, how many words are there in this example?
Sure, linguists can dissect everything and should, but how does the English layman perceive it?
(To put it another way, most native speakers treat "ts" as two sounds but not "ch")
Luckily there are other wasted characters, like "x" and "q".
But we'd still be arguing about how to pronounce "ᵹif"
Catalans seem to pronounce "caixa" fine, so I think they _could_ say "Shabi"... But this does back up your larger point about "x" -> "sh" in Catalan.
j -> dzh is weirder than anything.
Vowels, of course, are a cause of war between dialects; nobody can even agree how many there are.
Esperanto has a nice trick where they reserve "x" as a modifier letter, so if you can't use diacritics you write "cx", "sx", "jx" etc; but it does not have a sound value of its own and can never occur by itself. We could extend this to "tx" and "dx" with obvious values, and also to vowels - "a" for /æ/ vs "ax" for /ɑ/, "i" for "ɪ" vs "ix" for /i/ etc. Using "j" the way it is today feels somewhat wasteful given how rare it is. In the x-system it would probably be best represented by "gx", and then we could have a saner use for "j" like all other Germanic languages do. Which would free up "y" so we could use it for the schwa.
One thing that occurred to me the other day is that "x" is also a diacritic, so we could just say that e.g. "sx" and "s͓" are the same thing. Then again from a purely utilitarian perspective a regular dot serves just fine and looks neater (and would be a nice homage to Old English even if ċ and ġ are really just a modern convention).
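As a small illustration of the x-system described above: the diacritic-to-"x" mapping is standard Esperanto practice, while the function names here are just for this sketch. It is reversible precisely because "x" never occurs by itself:

```python
# A sketch of the Esperanto x-system: each diacritic letter becomes its
# base letter plus 'x'. Reversible because 'x' has no sound value of its
# own in Esperanto and never occurs alone.
X_MAP = {
    "ĉ": "cx", "ĝ": "gx", "ĥ": "hx", "ĵ": "jx", "ŝ": "sx", "ŭ": "ux",
    "Ĉ": "Cx", "Ĝ": "Gx", "Ĥ": "Hx", "Ĵ": "Jx", "Ŝ": "Sx", "Ŭ": "Ux",
}
INVERSE = {v: k for k, v in X_MAP.items()}

def to_x(text: str) -> str:
    return "".join(X_MAP.get(ch, ch) for ch in text)

def from_x(text: str) -> str:
    out, i = [], 0
    while i < len(text):
        if text[i:i + 2] in INVERSE:   # e.g. "cx" -> "ĉ"
            out.append(INVERSE[text[i:i + 2]])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

s = "eĥoŝanĝo ĉiuĵaŭde"               # classic Esperanto test phrase
assert from_x(to_x(s)) == s            # round-trips losslessly
print(to_x(s))                         # ehxosxangxo cxiujxauxde
```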
Vowels, yeah... I think it's pretty much impossible to do a true phonemic orthography for English vowels that is not dialect-specific. As in, either some dialects will have homographs that are not homonyms, or else other dialects will not have the ability to "write it as you speak it" because they'll need to use different letters for the same (to them) sound. In the latter case, it would become more of a morphological orthography. Which would still be a massive improvement if it's at least consistent.
OTOH if you look at General American specifically, and treat [ə] and [ʌ] as stress-dependent allophones, then you can get away with 9 vowel characters in total (ɪiʊuɛəoæɑ). That's pretty easy with diacritics.
What’s the ch sound? My intuition from German class is that ch represents a throaty hhhh. Somehow that got spoiled into k in most English words.
Every c in Pacific Ocean is pronounced differently. C is a silly letter.
If you mean the standard German from Germany, there are two variants. At the end of a syllable it is like you described (a kind of throaty hhhh). For the beginning of syllables, think of "sh" and open your mouth.
It varies between dialects. Swiss German speakers tend to stick out to Germans because we pronounce the ch in a much scratchier way than is accepted in Standard German.
-- Caesar, seizer of the day
It's funny you use "tube" as an example though, as in my British accent I pronounce that as "chube", whereas I believe many Americans would use a "t" sound for that word. Not sure how you settle on a spelling in those cases.
That aside, what you describe is a distinction between yod-dropping and lack thereof, and whether and where it happens is highly dialect dependent.
practice / practise
licence / license
Most don't bother because context is nearly always sufficient.
How is it that you can say these words without confusion?
Language is context sensitive and you understand the word based on the context around it. Likewise, you understand homographs based on the context. Because of this, spelling isn't as important as it might appear.
Exceptions to this are generally loan words, particularly from French (eg chaise, which sounds more like "sh"). Others are harder to explain. "Lichen" springs to mind. Yes it technically comes from Latin but we're beyond the time range to truly consider it a loan word.
There are also some "ch" words of Greek origin (IIRC) that could simply be replaced with "c" or "k" (eg chemistry, school).
"Kh" on the other hand I think is entirely loan words, particularly from Arabic. Even then we have names like "Achmed" that would more consistently be written as "Akhmen". "Khan" is obviously a loan word but I think time has largely reduced the pronunciation to "karn" rather than "kharn" if it ever was that.
But I can't think of a single "kh" word that is pronounced like the "ch" in "chair".
"Sh" doesn't seem to crossover with any of these pronunciations.
Maybe as a fun pet project someday!
What happened, in short, was that the Greeks copied the ancient and now virtually defunct Phoenician script, varieties of which were used across the region; kept the letter names even though they made no sense in the context of Greek; added vowels; and wrote it from left to right.
The Russians adapted the script in one way, Latin in another; Hebrew and Arabic took entirely separate paths, and now the only thing they all share is alphabets that follow vaguely the same ordering.
It goes to show just how powerful the idea of writing is - once you have a society where it's pervasive, all its neighbors acquire it from them in short order, and they usually do so by adapting the original writing script to their needs. I strongly suspect that the reason why alphabets (and to a lesser extent syllabaries) spread especially widely is because they are easier to adapt to a different language - usually, once you've learned a new alphabet, it's more or less readily obvious how to use it to approximate any language that you already know.
(Which is also how you get imperfect spellings even in brand new orthographies. Practicality usually beats purity.)
In some way everything has its roots there: language, numbers, math, philosophy, politics, religion. And earlier, humans themselves moved in the same direction. It's just where the large populations and the high culture used to be. The last remnants were purged in the Middle Ages, and now by Islamic fundamentalists.
We pretend phonics exists, but it's just a lie we tell little kids to kickstart their learning. In reality, English spelling is more like learning Kanji. The original meanings of the words are warped beyond belief, and we tell the specific pronunciation of specific letter sequences based on the surrounding letter sequences (much like telling which Kanji reading to use based on the surrounding Kanji). Words aren't so much sounded out as memorized, and because English has such a massive vocabulary, the memorization needed to become proficient is very extensive.
The classic example of this is "ough" which has NINE different pronunciations for the same letters and no real rules to indicate which one should be used. Spelling reform would make such situations completely unnecessary.
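For concreteness, a small sketch listing the nine readings usually cited for "ough"; the IPA values are approximate and dialect-dependent:

```python
# The nine commonly cited readings of "ough" (IPA approximate; dialects vary).
OUGH = {
    "though":   "oʊ",  # as in "go"
    "through":  "uː",  # as in "true"
    "thought":  "ɔː",  # as in "saw"
    "tough":    "ʌf",  # as in "cuff"
    "cough":    "ɒf",  # as in "off"
    "plough":   "aʊ",  # as in "cow"
    "hiccough": "ʌp",  # as in "up"
    "borough":  "ə",   # schwa, as at the end of "sofa"
    "lough":    "ɒx",  # as in Scots "loch"
}
assert len(set(OUGH.values())) == 9  # all nine readings are distinct
```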
Languages with more phonetic alphabets tend to have much higher literacy rates for the same education quality and literacy can be achieved much faster. This works because once you memorize the sounds the letters make, you can sound out any word or write any word (provided you pronounce it correctly). The memorization process slowly kicks in where common words are still sight-read, but that process can happen much sooner and the individual can start independent reading much earlier with a focus on comprehension rather than memorizing weird rules and exceptions.
English departments have done massive damage in this regard. English started finalizing how words would be spelled around the same time the Great Vowel Shift happened, which completely screwed up everything. We then mass-adopted words with foreign spellings that used completely different phonetic systems. Despite the great harm these bugs cause students, English departments insist that they are actually features, and not only codify them but denigrate all attempts to fix the problems.
English departments aren't the only ones. Even 150 years ago when Webster was trying small spelling reforms (some stuck around and some did not), people complained that the writing was childish. When Teddy Roosevelt tried a further spelling reform of getting rid of unneeded letters, he was turned into a laughing stock for the same reason (again with a handful sticking around). Modern "text speak" is yet another unofficial attempt to simplify spelling so it is more consistent, but once again, better, shorter alternatives are derided as making someone look unintelligent.
This still doesn't deal with the more fundamental phoneme/alphabet mismatch, though. English has 44 common phonemes plus a bunch of less common and regional sounds (for example, the χ sound in "cloCK"). Our adopted Latin alphabet has 26 letters, of which at least 3 are unneeded (C as K or S, Q as KW, and X as KS). This leads to a horrible situation where a lot of sounds no longer have letters (Futhorc didn't cover all the sounds either, but still did better with 33 letters, of which something like 11 were vowels). Some English sounds, like the S in "treaSure", seem to have no real, unique spelling at all. The two "th" sounds have no indicator of whether they are supposed to be voiced as in "THen" or unvoiced as in "THink" (we used to have thorn and eth for this). We have 18 unique consonants and 24 common consonant sounds.
The vowel situation is even more dire. We have just 5 vowel letters for around 20 common vowel sounds, leaving each letter desperately overloaded with all kinds of weird phonics "rules", almost all of which allow multiple readings or different pronunciations for the same spelling (eg, "reed" vs "red" for "read" in "I read the book"). There needs to be massive vowel reform (either a ton of stable digraphs, diacritics, or more letters) so that sounds can be differentiated properly.
Spelling reform could all but eliminate our illiteracy problems and open a whole new world of possibilities to more than half of all Americans. In a world dominated by ever-increasing volumes of information, these people would have much better lives if we lowered the bar of learning to read to something more attainable.
There are certainly differences, but if you place current English spelling next to something like Shavian (or some other language with near-pure phonetic spelling), I'd say that Modern English learning patterns are closer to Kanji than the pure phonetic alphabets.
One thing that worries me is the widespread adoption of English words and nouns in many languages. The list is ever increasing, even though the words make absolutely no sense out of the context of English and cannot be adapted by a non-English speaker to have anything more than a single, rigid meaning. It's annoying enough for me when some books use French words. I don't know how everyone else copes.
As for literacy, I find it hard to believe the true statistics are as dire as you say, but I'm prepared to accept that they are. Firstly, what are the statistics for contemporary societies with more sensible spellings? And can better education help? A final point: you clearly know far more about this topic than I do, but would adding half a dozen letters to the alphabet really help with increasing literacy?
https://www.thenationalliteracyinstitute.com/2024-2025-liter...
> It's annoying enough for me when some books use French words.
From around 1060 to 1360, French was the official language of England. It wasn't standard French, though, as William the Conqueror spoke Norman French. Both French dialects mixed in, in what can only be considered English style. For example, Norman French said "warder" while other French speakers said "guarder"; English adopted both "warden" and "guard" but gave them two different meanings. Overall, some 30% of our words are French in origin, though over 800 of the most common 1000 words are English in origin.
> would adding half a dozen letters to the alphabet really help with increasing literacy?
ITA (the Initial Teaching Alphabet) shows the benefits and problems.
ITA students rocketed ahead in the first couple of years and could read far more words than their traditional counterparts. The problem was the transition. Learning both systems seems to have evened things up, or maybe even caused a net negative for ITA students; I believe this was because they had to learn two sets of spelling for everything. If you would like to see the difficulty of learning a new way to read and write (and have a bit of fun), try learning Shavian script.
In an ideal world, they would have phonetic spelling only. I believe under those conditions that their advantage would continue to grow all the way through school. The problem is that this study is unethical to conduct because even if it is correct, the students would graduate and be unable to read traditional English which would permanently harm them.
This leaves the tricky problem of bridging the gap. This can't be done too quickly or the older generations get left behind. There's also an issue of transcribing everything into the new spelling. Technology has made that easier than ever, but it would still be a very hard proposition.
The first and easiest step is cleaning up the spellings using the letters we currently have. Stuff like all those -ough endings get rewritten in sane letters as an accepted alternative spelling. Silent letters start going away. We start moving toward consistent vowel and consonant digraphs. This will take time for older people to adapt to, but more consistent rules will mean they will have an easier time sounding them out.
After this, we start adding back letters. Maybe eth and thorn come back for the two "th" sounds. We certainly need a new letter for the S in "treaSure" and maybe bring back the elongated S to use for SH. At some point, we then start working on slowly adding new characters to stand in for the vowel digraphs.
I don't think you could convince adults to do more than a couple of steps at a time each generation. Such a plan would likely take decades to maybe even a century or two. Until the creation of the printing press, such slow changes were considered normal. Only in recent times have we attempted to gate-keep what "real English" is. If we allow the language to grow more organically, I think it could be guided into something far better than we have today.
The other sound that 'ȝ' once spelled is the "harsh" or "guttural" sound made in the back of the mouth, which you hear in Scots loch or German Bach. This sound is actually the reason for the most famous bit of English spelling chaos: the sometimes-silent, sometimes-not sequence 'gh' that you see in laugh, cough, night, and daughter. Maybe one day I'll tell you that story too.
It's funny: I had just started reading and was understanding the first paragraph before recognizing that this is a foreign language I don't know at all.
Quick reminder that writing != language. Even the highest fidelity writing systems are lossy encoding systems. In fact, the more phonologically accurate a writing system is to its language, the more it obscures the history of its words, especially words borrowed from other languages.
So from the perspective of someone interested in etymology, English writing's tendency to preserve old and foreign spellings is a good thing.
Of course it also depends on how conservative the language is, like Finnish orthography is practically IPA, and yet Finnish is a freaking time capsule for words like borrowed Proto-Germanic *kuningaz and *wīsaz, which became king and wise in English, but kuningas and viisas in modern Finnish. So you can have both phonemic writing as well as etymological transparency if your phonology doesn't change much.
And OTOH even modern English spelling often doesn't distinguish differences that are there in most dialects (e.g. "bear" vs "near"), so this isn't even a new problem. Realistically I suspect there's some "minimal reasonable set" of phonemes that need to be distinguished to reflect the most prominently distinct pairs in all major dialects, even if some subtle dialectal distinctions might not be reflected in spelling.
eg. Egyptian hieroglyphs, or Asian characters (esp. Korean, which has a relatively young alphabet that IIRC is phoneme based, or Chinese, which has a very old set used across multiple languages, eg. Mandarin/Cantonese/etc.)
Chinese pictorial writing completely obfuscates the historical state of the spoken language, to the extent that in order to reconstruct older phases of spoken Chinese, scholars have had to inspect old Chinese loan-words in foreign languages that do preserve the old phonetic structure.
An example of this is the discovery that Chinese tones developed from earlier final-consonants, which were lost in Mandarin, but are preserved in Cantonese, Japanese and Korean. i.e: Mandarin "guó" compared with the early borrowing into Japanese : "koku", both meaning "country".
Korean Hangul is not ideographic (I think what you meant by pictorial?). It's a morphophonemic alphabet that just happens to organize the basic phonemic units into larger graphemes representing whole syllables - but in a completely predictable way. And it is another example of this playing out: the original Hangul was entirely phonemic, but over time pronunciation diverged from spelling, and today it's morpho-phonemic, and even then not perfectly so. So they preserved the history at the cost of some mismatch between the spelling and the sound.
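To illustrate how predictable those syllable blocks are: in Unicode, every precomposed Hangul syllable is computed arithmetically from the indices of its jamo. A minimal sketch, using the constants from the standard Unicode Hangul syllable algorithm:

```python
# Sketch of the Unicode Hangul syllable arithmetic: every precomposed
# syllable block is a pure function of its jamo (lead, vowel, tail) indices.
LEADS, VOWELS, TAILS = 19, 21, 28   # tail index 0 means "no final consonant"
BASE = 0xAC00                        # U+AC00, the first syllable, '가'

def compose(lead: int, vowel: int, tail: int = 0) -> str:
    assert 0 <= lead < LEADS and 0 <= vowel < VOWELS and 0 <= tail < TAILS
    return chr(BASE + (lead * VOWELS + vowel) * TAILS + tail)

def decompose(syllable: str) -> tuple:
    n = ord(syllable) - BASE
    return n // (VOWELS * TAILS), (n // TAILS) % VOWELS, n % TAILS

print(compose(18, 0, 4))   # lead ㅎ, vowel ㅏ, tail ㄴ -> '한'
print(decompose("한"))     # (18, 0, 4)
```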
This increases the complexity of learning to write the language: 56 letters in the alphabet, and each combination of consonant+vowel and consonant+consonant takes on a different letter form instead of just being a string of independent letters as in English.
But reading / pronunciation is straightforward. (No we don't have spelling bees :) )
Indian languages are generally rich in phonemes though. My mind boggles at the notion of [n] [ɳ] [ɲ] [ŋ] all being distinct. I mean, I can reproduce each one of them on its own, but doing that in rapid speech, and worse yet, recognizing the same in others' speech...
They are phonetically distinct, but not phonemically distinct, which is to say that in most places they occur, they aren't used to distinguish words or meanings.
In particular, the velar nasal "ङ" or "ng" always appears adjacent to velar sounds (k/kh/g/gh), and similarly the palatal nasal "ञ" always appears adjacent to palatal sounds (c/ch/j/jh), both as allophones of the nasal phonemes "m" (bilabial) and "n" (alveo-dental), basically just like we speak in English under the exact same conditions (like the nasal in the word "English"!)
You perceive a difference with Indic language and English because the Latin script doesn't distinguish nasals for velar and palatal points of articulation - it only distinguishes by bilabial (m) and alveolar (n), whereas Indic scripts do distinguish those, even though they offer no additional information.
The unique nasal sound which is often contrastive in many Indian languages is the retroflex nasal "ण" (ṇ). That's the one that it's easy to confuse in speech if you are not a native speaker, so it's the only one you need to pay extra attention to when learning.
But, as far as I know, the different nasals are phonemic in some languages of India. Which ones depends on which languages, but I do remember seeing at least one in which all four of these were distinct.
None of the major Indian languages I'm familiar with have 4 nasal phones, from either the Indo-Aryan or Dravidian language families.
In the Indo-Aryan languages, the convergence of the various nasals is so complete that they are all often represented with a single "dot" diacritic character when they occur at word junctions.
I'd be open to hearing examples of Indian languages that have 4 nasal phonemes, though.
m (ಮ), n (ನ), ɳ (ಣ), ɲ (ಞ), ŋ (ಙ)
There are 5 nasal glyphs, but like in other Indian languages, 2 (velar and palatal) are allophones of the others, leaving only 3 actual phonemes. Indian scripts are often overspecified, and not every glyph represents a phoneme.
I don't think that's true. From the northern Indian languages schwa deletion (https://en.wikipedia.org/wiki/Schwa_deletion_in_Indo-Aryan_l...) to the extreme divergence between the standard formal and spoken forms of languages in Southern languages (https://en.wikipedia.org/wiki/Diglossia), it's a stretch to say the scripts mirror what is spoken.
It's just that if you are a native speaker/reader, you are so fluent that you unconsciously auto-correct those inconsistencies - just like in English.
Even in the formal registers of each spoken Indian language, which should in theory be more systematically consistent with their scripts, there are inconsistencies in the spelling/pronunciation of loan-words from both foreign and other Indian languages (i.e. aspirates in South India and retroflex approximants in northern India, and any number of inconsistent renderings of English words in Indic scripts).
Tibetan, Mon-Burmese, and Thai scripts, for example, all derive from the Brahmi script (through a long and sometimes winding ancestry), but none of them reflects the modern pronunciation, hence the mind-numbing transcription systems.
The Tibetan and Burmese scripts are particularly notorious for codifying archaic pronunciations of their respective languages, frozen in time for centuries. That is a treasure trove for linguists, who have got a time machine for free, but I don't think the same can be said for modern speakers of either language.
On the topic of screwball spelling, there is this video essay on silent letters. The fun takeaway for me was that a lot of silent letters were never pronounced. It is just that when some of the first dictionaries were being produced, and the spellings decided on, silent letters were introduced to indicate the origin of a word. The "b" in "debt" is there because it comes from the Latin "debitum", but it was not spelled that way until the 1500s; prior to that it was "dette".
https://www.youtube.com/watch?v=NXVqZpHY5R8 (RobWords: Why English is full of silent letters)
Is that true? Seems like it's in every other word when I visit Spain...
They all wrote the fancy 'g' rather than the simplified 'g' we use now. I assume they copied text from textbooks rather than (say) from teachers from England.
As a real young'un I used to attempt to do the same as an exercise but it's not easy to make it look good.