The Strangest Letter of the Alphabet: the Rise and Fall of Yogh
The article discusses the history of the letter yogh, a letter used in Middle English that eventually fell out of use, sparking a discussion on the quirks of the English language and its spelling system.
Snapshot generated from the HN discussion: story posted Oct 2, 2025 at 5:34 PM EDT; first comment 53 minutes after posting; peak activity of 110 comments in the first 12 hours; latest activity Oct 10, 2025 at 1:31 AM EDT.
The Super key could arguably apply to the Shift keys, because you are using a superset of letters (or am I reaching too far?)
But the Meta key already exists, and on Windows it is the Windows key. We're going around in circles.
Approaching an Apple keyboard for the first time, I naively thought Command would be Ctrl and was quite confused, since there is also a Ctrl key. But once you start a terminal, it becomes quite clear that this is not the case. This is quite neat: it directly resolves the SIGINT, SIGSTOP / Copy, Ctrl-D confusion. Having these operations as OS commands also means that all programs support them.
You're not though. The original letters are not available, so uppercase is not a superset of lowercase. (Unless you're just stating your opinion that the set of uppercase letters happens to be super)
We have two letters that are visually similar to yogh 'ȝ': 'ვ' (an English 'v', or 'w' if you're Tbiliseli), and 'პ' (a hard 'p'). Our 'gh' sound is 'ღ'.
I have no doubt that two very disparate languages and scripts will find a few similarities simply due to proximity. Georgia and England (UK) are close enough for a fair amount of cultural exchange.
And AFAIK the Georgian alphabet predates Insular script by a few centuries.
It's not uncommon for otherwise unrelated alphabets to come with similar symbols simply because the trend over time is towards simplification of letter shapes, and there are only so many basic elements. So even originally quite different characters can end up looking very similar when reduced to a few squiggles.
People have an extremely distorted perspective on European history for many reasons, but the late industrial age nation state probably had the biggest impact on the mental model people still hold today. By all evidence I've seen, cultural exchange in the distant past was far more organic than most people can easily imagine. Trade, cultural communion, religious exchanges, and defensive unions all made that possible in a world that was not nearly as controlled and authoritarian as the one we experience today. It all waxed and waned over the centuries and across regions, of course, in a rather organic manner; but due to practical limitations, a lot of the authoritarian restrictions we are all subject to today simply did not exist.
In some ways, the USA until about 1960 is probably the closest analogue to how Europe generally was in the long period leading up to the Industrial Revolution. It was a land of self-regulating regions and cultural clusters, with varying local jurisdictions and power structures, which to a large extent kept most people in their home region, if not their place of birth. By the latter part of that period, identification with one's state, region, and local culture had already largely succumbed to the centralizing power of the federal government, but your region was still largely your cultural identity as a person and a community.
That has of course all been razed and destroyed now, and the USA effectively exists in name only today, and has for some time, but that's a different topic altogether.
In the last week, our most famous Menzies passed away: the politician Sir Menzies "Ming" Campbell.
Some wag made a 'ming vase' of his face: https://www.portrait.gov.au/portraits/2004.176/ming-vase-sir...
Different hill, but one I would die on: the letter "c" should make the "ch" sound, since otherwise it serves no purpose not already handled by "s" or "k".
That last one is hardly fair - gist and mirage are French words. Might as well complain about the silent letters in rendezvous or faux pas.
So surgery is full of -ectomies instead of -cut-outs.
I wonder whether German brains work with a much longer context window because of the language?
b) the official title of the law was "Gesetz zur Übertragung der Aufgaben für die Überwachung der Rinderkennzeichnung und Rindfleischetikettierung" (roughly, "Law on the transfer of the duties of monitoring cattle identification and beef labelling"), so how again is it that English "gets to use a sentence" and German doesn't? German has the choice depending on context; sometimes having one word is convenient.
The compound word also has a specific meaning that the same words with a space between them don't. For example, "das rote Kraut" ("red herb") versus "das Rotkraut" ("red cabbage"). Also, suppose a red cabbage was grown in abnormal conditions, so it lacks the color pigments: it is still "red cabbage", but not a "red" "cabbage". This is awkward to state in English, but no problem in German.
Maybe, but more due to the spelling of numbers and long sentences. Compound words are not an example of this, since Germans can parse these words just fine as different things. It just means that the lowest level of "tokenization" in everyday use is not the word but its subcomponents.
Do English native speakers "tokenize" expressions in words? Do you see it as '(labelling) (of) (minced)' or '(label)l(ing) (of) (minc)(ed)' ?
I can't speak for most Germans, but the algorithm I think I use is just greedy from left to right. This is also consistent with how mistokenization in common puns works, so I think this is common.
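For what it's worth, here is a toy sketch of that greedy left-to-right splitting in Python. The vocabulary is a made-up mini-lexicon standing in for a real German one, so treat this as an illustration of the idea, not a real decompounder:

```python
# Toy sketch of greedy left-to-right compound splitting. The vocabulary
# is a made-up mini-lexicon, not real German morphology.
VOCAB = {"rind", "fleisch", "etikettierung", "rot", "kraut"}

def greedy_split(word: str, vocab: set) -> list:
    """Repeatedly take the longest known prefix of what remains."""
    parts, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # longest candidate first
            if word[i:j] in vocab:
                parts.append(word[i:j])
                i = j
                break
        else:
            parts.append(word[i:])          # unknown remainder: give up
            break
    return parts

print(greedy_split("rindfleischetikettierung", VOCAB))
# ['rind', 'fleisch', 'etikettierung']
```

Because it always commits to the longest prefix, a scheme like this can also mis-split ambiguous compounds, which is consistent with how the puns mentioned above work.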
In primary school we were trained to recognize syllable boundaries. Is that just a German thing, or is it common in other countries? You need to know these for spelling, and once you do, separating word components becomes trivial.
Not entirely true. English, like any other Germanic language, still likes to compound words to produce a new meaning; the main difference is that, as opposed to most other Germanic languages, spaces are usually retained in writing. But this is just a spelling difference; the underlying process is the same.
See https://en.m.wikipedia.org/wiki/Compound_(linguistics)
Let's consider "scheepskapitein", "Schiffskapitän" and "ship captain". All three are formed in exactly the same way and mean roughly the same thing, but it's customary in Dutch and German to spell them without a space, while in English it's considered correct to have a space in between. Note that there are no spaces in speech; it's simply a writing convention. So, how many words are there in this example?
Sure, linguists can dissect everything and should, but how does the English layman perceive it?
(To put it another way, most native speakers treat "ts" as two sounds but not "ch")
Luckily there are other wasted characters, like "x" and "q".
But we'd still be arguing about how to pronounce "ᵹif"
Catalans seem to pronounce "caixa" fine, so I think they _could_ say "Shabi"... But this does back up your larger point about "x" -> "sh" in Catalan.
j -> dzh is weirder than anything.
Vowels, of course, are a cause of war between dialects; nobody can even agree how many there are.
Esperanto has a nice trick where they reserve "x" as a modifier letter, so if you can't use diacritics you write "cx", "sx", "jx" etc; but it does not have a sound value of its own and can never occur by itself. We could extend this to "tx" and "dx" with obvious values, and also to vowels - "a" for /æ/ vs "ax" for /ɑ/, "i" for "ɪ" vs "ix" for /i/ etc. Using "j" the way it is today feels somewhat wasteful given how rare it is. In the x-system it would probably be best represented by "gx", and then we could have a saner use for "j" like all other Germanic languages do. Which would free up "y" so we could use it for the schwa.
One thing that occurred to me the other day is that "x" is also a diacritic, so we could just say that e.g. "sx" and "s͓" are the same thing. Then again from a purely utilitarian perspective a regular dot serves just fine and looks neater (and would be a nice homage to Old English even if ċ and ġ are really just a modern convention).
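As a small illustration of the x-system described above: the diacritic-to-"x" mapping is standard Esperanto practice, while the function names here are just for this sketch. It is reversible precisely because "x" never occurs by itself:

```python
# A sketch of the Esperanto x-system: each diacritic letter becomes its
# base letter plus 'x'. Reversible because 'x' has no sound value of its
# own in Esperanto and never occurs alone.
X_MAP = {
    "ĉ": "cx", "ĝ": "gx", "ĥ": "hx", "ĵ": "jx", "ŝ": "sx", "ŭ": "ux",
    "Ĉ": "Cx", "Ĝ": "Gx", "Ĥ": "Hx", "Ĵ": "Jx", "Ŝ": "Sx", "Ŭ": "Ux",
}
INVERSE = {v: k for k, v in X_MAP.items()}

def to_x(text: str) -> str:
    return "".join(X_MAP.get(ch, ch) for ch in text)

def from_x(text: str) -> str:
    out, i = [], 0
    while i < len(text):
        if text[i:i + 2] in INVERSE:   # e.g. "cx" -> "ĉ"
            out.append(INVERSE[text[i:i + 2]])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

s = "eĥoŝanĝo ĉiuĵaŭde"               # classic Esperanto test phrase
assert from_x(to_x(s)) == s            # round-trips losslessly
print(to_x(s))                         # ehxosxangxo cxiujxauxde
```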
Vowels, yeah... I think it's pretty much impossible to do a true phonemic orthography for English vowels that is not dialect-specific. As in, either some dialects will have homographs that are not homonyms, or else other dialects will not have the ability to "write it as you speak it" because they'll need to use different letters for the same (to them) sound. In the latter case, it would become more of a morphological orthography. Which would still be a massive improvement if it's at least consistent.
OTOH if you look at General American specifically, and treat [ə] and [ʌ] as stress-dependent allophones, then you can get away with 9 vowel characters in total (ɪiʊuɛəoæɑ). That's pretty easy with diacritics.
What’s the ch sound? My intuition from German class is that ch represents a throaty hhhh. Somehow that got spoiled into k in most English words.
Every c in Pacific Ocean is pronounced differently. C is a silly letter.
If you mean the standard German from Germany, there are two variants. At the end of a syllable it is like you described (a kind of throaty hhhh). For the beginning of syllables, think of "sh" and open your mouth.
It varies between dialects. Swiss German speakers tend to stick out to Germans because we pronounce the ch in a much scratchier way than is accepted in Standard German.
-- Caesar, seizer of the day
It's funny you use "tube" as an example though, as in my British accent I pronounce that as "chube", whereas I believe many Americans would use a "t" sound for that word. Not sure how you settle on a spelling in those cases.
That aside, what you describe is a distinction between yod-dropping and lack thereof, and whether and where it happens is highly dialect dependent.
practice / practise
licence / license
Most don't bother because context is nearly always sufficient.
How is it that you can say these words without confusion?
Language is context sensitive and you understand the word based on the context around it. Likewise, you understand homographs based on the context. Because of this, spelling isn't as important as it might appear.
Exceptions to this are generally loan words, particularly from French (eg chaise, which sounds more like "sh"). Others are harder to explain. "Lichen" springs to mind. Yes it technically comes from Latin but we're beyond the time range to truly consider it a loan word.
There are also some "ch" words of Greek origin (IIRC) that could simply be replaced with "c" or "k" (eg chemistry, school).
"Kh" on the other hand I think is entirely loan words, particularly from Arabic. Even then we have names like "Achmed" that would more consistently be written as "Akhmen". "Khan" is obviously a loan word but I think time has largely reduced the pronunciation to "karn" rather than "kharn" if it ever was that.
But I can't think of a single "kh" word that is pronounced like the "ch" in "chair".
"Sh" doesn't seem to crossover with any of these pronunciations.
Maybe as a fun pet project someday!
What happened, in short, was that the Greeks copied the ancient and now virtually defunct Phoenician script, varieties of which were used across the region; kept the letter names even though they made no sense in the context of Greek; added vowels; and wrote it from left to right.
The Russians adapted the script in one way, Latin in another; Hebrew and Arabic took entirely separate paths, and now the only thing they all share is alphabets that follow vaguely the same ordering.
It goes to show just how powerful the idea of writing is - once you have a society where it's pervasive, all its neighbors acquire it from them in short order, and they usually do so by adapting the original writing script to their needs. I strongly suspect that the reason why alphabets (and to a lesser extent syllabaries) spread especially widely is because they are easier to adapt to a different language - usually, once you've learned a new alphabet, it's more or less readily obvious how to use it to approximate any language that you already know.
(Which is also how you get imperfect spellings even in brand new orthographies. Practicality usually beats purity.)
In some way everything has its roots there: language, numbers, math, philosophy, politics, religion. And earlier, humans themselves moved in the same direction. It's just where the large populations and the high culture used to be. The last remnants were purged in the Middle Ages, and now by Islamic fundamentalists.
We pretend phonics exists, but it's just a lie we tell little kids to kickstart their learning. In reality, English spelling is more like learning Kanji. The original meanings of the words are warped beyond belief, and we tell the specific pronunciation of specific letter sequences based on the surrounding letter sequences (much like telling which Kanji reading to use based on the surrounding Kanji). Words aren't so much sounded out as memorized, and because English has such a massive vocabulary, the memorization needed to become proficient is very extensive.
The classic example of this is "ough" which has NINE different pronunciations for the same letters and no real rules to indicate which one should be used. Spelling reform would make such situations completely unnecessary.
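For concreteness, a small sketch listing the nine readings usually cited for "ough"; the IPA values are approximate and dialect-dependent:

```python
# The nine commonly cited readings of "ough" (IPA approximate; dialects vary).
OUGH = {
    "though":   "oʊ",  # as in "go"
    "through":  "uː",  # as in "true"
    "thought":  "ɔː",  # as in "saw"
    "tough":    "ʌf",  # as in "cuff"
    "cough":    "ɒf",  # as in "off"
    "plough":   "aʊ",  # as in "cow"
    "hiccough": "ʌp",  # as in "up"
    "borough":  "ə",   # schwa, as at the end of "sofa"
    "lough":    "ɒx",  # as in Scots "loch"
}
assert len(set(OUGH.values())) == 9  # all nine readings are distinct
```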
Languages with more phonetic alphabets tend to have much higher literacy rates for the same education quality and literacy can be achieved much faster. This works because once you memorize the sounds the letters make, you can sound out any word or write any word (provided you pronounce it correctly). The memorization process slowly kicks in where common words are still sight-read, but that process can happen much sooner and the individual can start independent reading much earlier with a focus on comprehension rather than memorizing weird rules and exceptions.
English departments have done massive damage in this regard. English started finalizing how words would be spelled around the same time the Great Vowel Shift happened, which completely screwed up everything. We then mass-adopted words with foreign spellings that used completely different phonetic systems. Despite the great harm these bugs cause students, English departments insist that they are actually features, and not only codify them but denigrate all attempts to fix the problems.
English departments aren't the only ones. Even 150 years ago when Webster was trying small spelling reforms (some stuck around and some did not), people complained that the writing was childish. When Teddy Roosevelt tried a further spelling reform of getting rid of unneeded letters, he was turned into a laughing stock for the same reason (again with a handful sticking around). Modern "text speak" is yet another unofficial attempt to simplify spelling so it is more consistent, but once again, better, shorter alternatives are derided as making someone look unintelligent.
This still doesn't deal with the more fundamental phoneme/alphabet mismatch, though. English has 44 common phonemes plus a bunch of less common and regional sounds (for example, the χ sound in "cloCK"). Our adopted Latin alphabet has 26 letters, of which at least 3 are unneeded (C as K or S, Q as KW, and X as KS). This leads to a horrible situation where a lot of sounds no longer have letters (Futhorc didn't cover all the sounds either, but still did better with 33 letters, of which something like 11 were vowels). Some English sounds, like the S in "treaSure", seem to have no real, unique spelling at all. The two "th" sounds have no indicator of whether they are supposed to be voiced as in "THen" or unvoiced as in "THink" (we used to have thorn and eth for this). We have 18 unique consonants and 24 common consonant sounds.
The vowel situation is even more dire. We have just 5 vowel letters for around 20 common vowel sounds, leaving each letter desperately overloaded with all kinds of weird phonics "rules", almost all of which allow multiple readings or different pronunciations for the same spelling (eg, "reed" vs "red" for "read" in "I read the book"). There needs to be massive vowel reform (either a ton of stable digraphs, diacritics, or more letters) so that sounds can be differentiated properly.
Spelling reform could all but eliminate our illiteracy problems and open a whole new world of possibilities to more than half of all Americans. In a world dominated by ever-increasing volumes of information, these people would have much better lives if we lowered the bar of learning to read to something more attainable.
There are certainly differences, but if you place current English spelling next to something like Shavian (or some other language with near-pure phonetic spelling), I'd say that Modern English learning patterns are closer to Kanji than the pure phonetic alphabets.
One thing that worries me is the widespread adoption of English words and nouns in many languages. The list is ever increasing, even though the words make absolutely no sense out of the context of English and cannot be adapted by a non-English speaker to have anything more than a single, rigid meaning. It's annoying enough for me when some books use French words. I don't know how everyone else copes.
As for literacy, I find it hard to believe the true statistics are as dire as you say, but I'm prepared to accept that they are. Firstly, what are the statistics for contemporary societies with more sensible spellings? And can better education help? A final point: you clearly know far more about this topic than I do, but would adding half a dozen letters to the alphabet really help with increasing literacy?
https://www.thenationalliteracyinstitute.com/2024-2025-liter...
> It's annoying enough for me when some books use French words.
From around 1060 to 1360, French was the official language of England. It wasn't standard French, though, as William the Conqueror spoke Norman French. Both French dialects mixed in, in what can only be considered English style. For example, Norman French said "warder" while other French speakers said "guarder"; English adopted both "warden" and "guard" but gave them two different meanings. Overall, some 30% of our words are French in origin, though over 800 of the most common 1000 words are English in origin.
> would adding half a dozen letters to the alphabet really help with increasing literacy?
ITA (the Initial Teaching Alphabet) shows the benefits and problems.
ITA students rocketed ahead in the first couple of years and could read far more words than their traditional counterparts. The problem was the transition. Learning both systems seems to have evened things up, or maybe even caused a net negative for ITA students; I believe this was because they had to learn two sets of spelling for everything. If you would like to see the difficulty of learning a new way to read and write (and have a bit of fun), try learning Shavian script.
In an ideal world, they would have phonetic spelling only. I believe under those conditions that their advantage would continue to grow all the way through school. The problem is that this study is unethical to conduct because even if it is correct, the students would graduate and be unable to read traditional English which would permanently harm them.
This leaves the tricky problem of bridging the gap. This can't be done too quickly or the older generations get left behind. There's also an issue of transcribing everything into the new spelling. Technology has made that easier than ever, but it would still be a very hard proposition.
The first and easiest step is cleaning up the spellings using the letters we currently have. Stuff like all those -ough endings get rewritten in sane letters as an accepted alternative spelling. Silent letters start going away. We start moving toward consistent vowel and consonant digraphs. This will take time for older people to adapt to, but more consistent rules will mean they will have an easier time sounding them out.
After this, we start adding back letters. Maybe eth and thorn come back for the two "th" sounds. We certainly need a new letter for the S in "treaSure" and maybe bring back the elongated S to use for SH. At some point, we then start working on slowly adding new characters to stand in for the vowel digraphs.
I don't think you could convince adults to do more than a couple of steps at a time each generation. Such a plan would likely take decades to maybe even a century or two. Until the creation of the printing press, such slow changes were considered normal. Only in recent times have we attempted to gate-keep what "real English" is. If we allow the language to grow more organically, I think it could be guided into something far better than we have today.
The other sound that 'ȝ' once spelled is the "harsh" or "guttural" sound made in the back of the mouth, which you hear in Scots loch or German Bach. This sound is actually the reason for the most famous bit of English spelling chaos: the sometimes-silent, sometimes-not sequence 'gh' that you see in laugh, cough, night, and daughter. Maybe one day I'll tell you that story too.
It's funny: I had just started reading and was understanding the first paragraph before recognizing that this is a foreign language I don't know at all.
Quick reminder that writing != language. Even the highest fidelity writing systems are lossy encoding systems. In fact, the more phonologically accurate a writing system is to its language, the more it obscures the history of its words, especially words borrowed from other languages.
So from the perspective of someone interested in etymology, English writing's tendency to preserve old and foreign spellings is a good thing.
Of course it also depends on how conservative the language is, like Finnish orthography is practically IPA, and yet Finnish is a freaking time capsule for words like borrowed Proto-Germanic *kuningaz and *wīsaz, which became king and wise in English, but kuningas and viisas in modern Finnish. So you can have both phonemic writing as well as etymological transparency if your phonology doesn't change much.
And OTOH even modern English spelling often doesn't distinguish differences that are there in most dialects (e.g. "bear" vs "near"), so this isn't even a new problem. Realistically I suspect there's some "minimal reasonable set" of phonemes that need to be distinguished to reflect the most prominently distinct pairs in all major dialects, even if some subtle dialectal distinctions might not be reflected in spelling.
eg. Egyptian hieroglyphs, or Asian characters (esp. Korean, which has a relatively young alphabet that IIRC is phoneme based, or Chinese, which has a very old set used across multiple languages, eg. Mandarin/Cantonese/etc.)
Chinese pictorial writing completely obfuscates the historical state of the spoken language, to the extent that in order to reconstruct older phases of spoken Chinese, scholars have had to inspect old Chinese loan-words in foreign languages that do preserve the old phonetic structure.
An example of this is the discovery that Chinese tones developed from earlier final-consonants, which were lost in Mandarin, but are preserved in Cantonese, Japanese and Korean. i.e: Mandarin "guó" compared with the early borrowing into Japanese : "koku", both meaning "country".
Korean Hangul is not ideographic (I think what you meant by pictorial?). It's a morphophonemic alphabet that just happens to organize the basic phonemic units into larger graphemes representing whole syllables - but in a completely predictable way. And it is another example of this playing out: the original Hangul was entirely phonemic, but over time pronunciation diverged from spelling, and today it's morpho-phonemic, and even then not perfectly so. So they preserved the history at the cost of some mismatch between the spelling and the sound.
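To illustrate how predictable those syllable blocks are: in Unicode, every precomposed Hangul syllable is computed arithmetically from the indices of its jamo. A minimal sketch, using the constants from the standard Unicode Hangul syllable algorithm:

```python
# Sketch of the Unicode Hangul syllable arithmetic: every precomposed
# syllable block is a pure function of its jamo (lead, vowel, tail) indices.
LEADS, VOWELS, TAILS = 19, 21, 28   # tail index 0 means "no final consonant"
BASE = 0xAC00                        # U+AC00, the first syllable, '가'

def compose(lead: int, vowel: int, tail: int = 0) -> str:
    assert 0 <= lead < LEADS and 0 <= vowel < VOWELS and 0 <= tail < TAILS
    return chr(BASE + (lead * VOWELS + vowel) * TAILS + tail)

def decompose(syllable: str) -> tuple:
    n = ord(syllable) - BASE
    return n // (VOWELS * TAILS), (n // TAILS) % VOWELS, n % TAILS

print(compose(18, 0, 4))   # lead ㅎ, vowel ㅏ, tail ㄴ -> '한'
print(decompose("한"))     # (18, 0, 4)
```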
This increases the complexity of learning to write the language: 56 letters in the alphabet, and each combination of consonant+vowel and consonant+consonant takes on a different letter form instead of just being a string of independent letters as in English.
But reading / pronunciation is straightforward. (No we don't have spelling bees :) )
Indian languages are generally rich in phonemes though. My mind boggles at the notion of [n] [ɳ] [ɲ] [ŋ] all being distinct. I mean, I can reproduce each one of them on its own, but doing that in rapid speech, and worse yet, recognizing the same in others' speech...
They are phonetically distinct, but not phonemically distinct, which is to say that in most places they occur, they aren't used to distinguish words or meanings.
In particular, the velar nasal "ङ" or "ng" always appears adjacent to velar sounds (k/kh/g/gh), and similarly the palatal nasal "ञ" always appears adjacent to palatal sounds (c/ch/j/jh), both as allophones of the nasal phonemes "m" (bilabial) and "n" (alveo-dental), basically just like we speak in English under the exact same conditions (like the nasal in the word "English"!)
You perceive a difference with Indic language and English because the Latin script doesn't distinguish nasals for velar and palatal points of articulation - it only distinguishes by bilabial (m) and alveolar (n), whereas Indic scripts do distinguish those, even though they offer no additional information.
The unique nasal sound which is often contrastive in many Indian languages is the retroflex nasal "ण" (ṇ). That's the one that it's easy to confuse in speech if you are not a native speaker, so it's the only one you need to pay extra attention to when learning.
But, as far as I know, the different nasals are phonemic in some languages of India. Which ones depends on which languages, but I do remember seeing at least one in which all four of these were distinct.
None of the major Indian languages I'm familiar with have 4 nasal phones, from either the Indo-Aryan or Dravidian language families.
In the Indo-Aryan languages, the convergence of the various nasals is so complete that they are all often represented with a single "dot" diacritic character when they occur at word junctions.
I'd be open to hearing examples of Indian languages that have 4 nasal phonemes, though.
m (ಮ), n (ನ), ɳ (ಣ), ɲ (ಞ), ŋ (ಙ)
There are 5 nasal glyphs, but like in other Indian languages, 2 (velar and palatal) are allophones of the others, leaving only 3 actual phonemes. Indian scripts are often overspecified, and not every glyph represents a phoneme.
I don't think that's true. From the northern Indian languages schwa deletion (https://en.wikipedia.org/wiki/Schwa_deletion_in_Indo-Aryan_l...) to the extreme divergence between the standard formal and spoken forms of languages in Southern languages (https://en.wikipedia.org/wiki/Diglossia), it's a stretch to say the scripts mirror what is spoken.
It's just that if you are a native speaker/reader, you are so fluent that you unconsciously auto-correct those inconsistencies - just like in English.
Even in the formal registers of each spoken Indian language, which should in theory be more systematically consistent with their scripts, there are inconsistencies in the spelling/pronunciation of loan-words from both foreign and other Indian languages (i.e. aspirates in South India and retroflex approximants in northern India, and any number of inconsistent renderings of English words in Indic scripts).
Tibetan, Mon-Burmese, and Thai scripts, for example, all derive from the Brahmi script (through a long and sometimes winding ancestry), but none of them reflects the modern pronunciation, hence the mind-numbing transcription systems.
The Tibetan and Burmese scripts are particularly notorious for codifying archaic pronunciations of their respective languages, frozen in time for centuries. That is a treasure trove for linguists, who have got a time machine for free, but I don't think the same can be said for modern speakers of either language.
On the topic of screwball spelling, there is this video essay on silent letters. The fun takeaway for me was that a lot of silent letters were never pronounced. It is just that when some of the first dictionaries were being produced, and the spellings decided on, silent letters were introduced to indicate the origin of a word. The "b" in "debt" is there because it comes from the Latin "debitum", but it was not spelled that way until the 1500s; prior to that it was "dette".
https://www.youtube.com/watch?v=NXVqZpHY5R8 (RobWords: Why English is full of silent letters)
Is that true? Seems like it's in every other word when I visit Spain...
They all wrote the fancy 'g' rather than the simplified 'g' we use now. I assume they copied text from textbooks rather than (say) from teachers from England.
As a real young'un I used to attempt to do the same as an exercise but it's not easy to make it look good.