Eurollm: LLM Made in Europe Built to Support All 24 Official EU Languages
eurollm.io
Key topics
LLM
Multilingual AI
European Tech
The EuroLLM project releases a multilingual large language model supporting 24 official EU languages, sparking discussions on its performance, potential applications, and the role of EU funding in AI development.
- Story posted: Oct 28, 2025 at 10:58 AM EDT
- First comment: Oct 28, 2025 at 11:07 AM EDT (10m after posting)
- Peak activity: 138 comments in Day 1
- Latest activity: Nov 8, 2025 at 8:11 AM EST
ID: 45733707 | Type: story | Last synced: 11/22/2025, 11:17:55 PM
Maltese, interestingly, is the only Afro-Asiatic derived language.
Hungarian, Finnish, and Estonian are the three Uralic languages.
All the others are Indo-European, Greek being the only Hellenic one, Irish the only Celtic, the rest are Baltic, Slavic, Italic, or Germanic.
(I originally used the term Balto-Slavic, though I was unaware of some of the connotations of that term until just now. Baltic and Slavic do share a common origin, but that was a very very long time ago)
Source: I'm also Maltese.
Arabic (language): al-‘arabiyyah (الْعَرَبِيَّة).
https://en.wikipedia.org/wiki/Endonym_and_exonym
can arabic people understand maltese?
That’s a really interesting question — and the answer is: *partially, but not easily.*
Here’s why:
### Linguistic roots
Maltese is a *Semitic language*, and its *core grammar and basic vocabulary* come from *Arabic*, specifically from *Siculo-Arabic*, the dialect of Arabic spoken in Sicily and Malta about 1,000 years ago. Because of that, *many Maltese words sound familiar* to Arabic speakers — especially from the *Maghrebi (North African)* or *Levantine* dialects.
For example:
| Maltese | Meaning | Similar in Arabic |
| ------- | ------- | ----------------- |
| Dar | house | دار (dar) |
| Kelb | dog | كلب (kalb) |
| Seba | seven | سبعة (sabʿa) |
| Xemx | sun | شمس (shams) |
### Influence from Italian and English
However, over the centuries, Maltese absorbed *a lot of Italian (especially Sicilian)* and *English* vocabulary — so modern Maltese is *a hybrid*. Roughly:
* 30–40% of its vocabulary is Semitic (Arabic origin),
* 40–50% is Romance (mostly Italian/Sicilian),
* and the rest is English and other sources.
That means Arabic speakers might *recognize some words and structures*, but they’ll *struggle to understand full sentences*, especially because:
* Pronunciation has changed,
* Grammar evolved differently,
* Many everyday words are not Arabic anymore.
### Summary
So:
* *Yes*, Maltese and Arabic share a deep connection, like cousins.
* *No*, they’re *not mutually intelligible* today. An Arabic speaker might catch words here and there, but a real conversation would be hard without studying Maltese.
The above is exactly my experience with Arabic speakers by the way. Again, not surprising after 1k years of divergence.
There was a point about 7 years ago when the Overton window shifted to "speak English to strangers first" because of a large influx of foreigners who did not know the language. Since then I've met foreigners who have better Maltese than some natives.
Older folks & geriatrics will sometimes be surprised when they assume someone is foreign and they turn out to be Maltese. "int Malti??" is a statement I get often because I don't look Mediterranean despite being born here.
I was surprised to hear Maltese radio stations played in taxis while visiting Malta just a few weeks back.
Businesses do work in Maltese and English. Both are official languages. It's quite rare to encounter a business that deals near exclusively in Maltese. Many prefer Maltese but will fall back to English where necessary.
Regarding monolingual speakers, I think there are a lot of stereotypes for Maltese-only speakers, English-only speakers, and code switchers. I think it's all a bit silly... So long as communication can happen, I don't fuss.
On Maltese music... There's a lot of low-ish quality music, then there are a few absolute gems. Look up The Travellers, Lapes, Jon Mallia on YouTube/Spotify.
How much do you consider Maltese its own language (as opposed to a dialect of Arabic)?
Maltese is definitely its own language. Arabic roots are there (there's a Semitic joke in there) but it isn't Arabic anymore. It's written left to right with a variant of the English alphabet.
Hindi and Urdu are 90% the exact same language, and are mutually intelligible (an Urdu speaker and a Hindi speaker can have a complete conversation with each other), but each is written differently (one LTR, the other RTL) and with different alphabets.
In my books, the distinction between languages and dialects is so arbitrary that the best method is simply to ask the people who speak those languages/dialects. If they consider them to be different languages (which Maltese speakers seemingly do), I call them different languages.
I don't buy the argument of just asking the speakers. There are cultural, political, etc. reasons people may think things which don't conform with reality. Many Hindi-Urdu speakers get insulted by the reality that the languages are pretty much the same because they don't want to identify with people from another country their country is constantly at war with.
I don't think anyone would seriously consider it a dialect of Arabic though, with its completely different alphabet and half the vocabulary and morphology coming from Italian languages/dialects, even if Malta hadn't spent the best part of a millennium trying very hard not to become part of the Arab world.
I do wonder what natives think and feel about the longevity of their language. What is taught in schools, and at what ages (assuming English is in the mix somewhere)? Is there enough media in Maltese for the Maltese to go about modern life fully in Maltese? It’s shockingly hard to find any information on Maltese, and even harder to find content.
I’m not sure if it’s dying out, or in danger thereof; if there are preservation efforts, or if there is no need.
https://tatoeba.org/
I think some people get touchy about them being lumped together if their last period of commonality (per the article) was 1400 BCE. For comparison, I believe all the Slavic languages were mutually intelligible around 1200 AD. But much more recently than this, in the last few centuries, there have been notable attempts by East Slavs to absorb the Baltic language cultures and deny them.
I doubt West and East Slavic were. But inside those geographic groups they probably were (Czech and Polish AFAIR were around that time).
I was in Crimea for about 2 weeks (in 2012); they spoke Russian to me there. I couldn't understand a word they said, and I didn't learn to understand them in 2 weeks of travel there.
I could understand some words from Ukrainian (I traveled by train from Lviv).
Another example is Croatian. I've been there on vacation, renting a room. I couldn't understand a word they said and didn't learn any.
I can understand some Czech (because, together with Slovak, it is the closest language to Polish) but that's it.
I wouldn't mix Slavs from different groups together. They evolved separately and are as close as English and German.
It is an example I think of often, about how quickly languages can change. On the scale of 1000 years, a lot changes. Most of the diversity in Romance languages is from around that timescale too; they really started to diverge substantially around 900-1100 AD.
I never said it wasn't.
Well, that and Romanian. And Hungarian. And outside the EU, Albanian. And Georgian, Azeri and Armenian if you consider those Eastern Europe.
Some of my fellow Romanians will also claim they're Central European, but in my mind, all the ones I listed are Eastern European countries. I'd even include Turkey and Kazakhstan in there, part of the latter is to the West of the Urals, which is what we normally consider the border between Europe and Asia.
https://www.researchgate.net/publication/382295560/figure/fi...
https://www.worldatlas.com/r/w960-q80/upload/03/90/9b/countr...
Albania is clearly south east europe.
And, I don't care about your random Romanian friend's anecdote.
Yes, it is clearly south east Europe. East.
> And, I don't care about your random Romanian friend's anecdote.
Who's my friend?
In my mind, I was thinking of the belt of countries between Russia and Central Europe, starting from the Baltics down to the Balkan (excluding Greece).
There is a branch that contains both Baltic and Slavic languages, but there's also one that contains Albanian and Greek.
There have been some attempts to tie Albanian to Germanic, or Greek, or other branches, but they all have failed.
At some point they are all Indo-European, but they split a long way back.
and
> only Estonian is not a Slavic language.
So following this logic saying "in Eastern Europe, only Estonian is not a Baltic language" would make as much sense?
The fact that they are the closest surviving relatives doesn't on its own mean it makes sense to group them together (i.e. Italo-Celtic is also a theorized subgroup in a similar way, but nobody is disputing that Celtic and Italic languages evolved into distinct groups).
Then there is a huge number of missing links and unknown unknowns, e.g. Thracian and Dacian probably were also pretty close to Baltic or Slavic (maybe even closer to Baltic than Slavic is), but we don't know enough about them to make any conclusive claims at all. We at least know these languages existed.
I updated my original comment, and learned a good amount about that dispute as a result, so thanks for calling it out.
Best get to retraining those models.
The dialects are a whole other thing though.
Any literate German can read the NZZ easily, but they cannot have a colloquial conversation with an average person from Zürich, unless the latter switches to standard German (which is a foreign language for them, though one they have to learn from age 6).
I presume they also pick up a lot of standard German in the media: there's lots of German movies, and Germany has the biggest movie dubbing industry in the world, too. There's some Swiss German media, but not nearly as much as there's on offer in standard German.
As a native French speaker, no other language gives me that "why don't I understand what they say... oh, right, that's not my language!" feeling. Something with frequencies used, I suppose, but it always puzzles me.
EDIT: It's worth noting that this is mostly a spoken thing, AIUI - most formal/semi-formal writing would be in Hochdeutsch rather than a local dialect.
Historically, Germany used to be divided into countless small fiefdoms, and each of them used to speak its own barely intelligible language.
Hochdeutsch is in opposition to Niederdeutsch, of which Dutch and arguably English are varieties.
E.g., it does a passable impression of Singapore's Singlish.
https://www.reddit.com/r/northernireland/comments/1fivtob/no...
People closer to the issue are better-placed to gather the necessary information, but again: strong feeling. Most people find it hard to get past that. The most informed person I know is so biased that I don't at all trust their conclusions.
Does modern English read like historical English?
> Native speakers have complained that official documents and signage in Ulster-Scots are incomprehensible to them.
Sure, there are tonnes of issues with the "officialisation" of any language but the fact that there are "native speakers" involved in the debate strongly suggests it wasn't all just made up for political reasons, which was the point I was responding to.
If you can read and understand text from the 18th century, then yes. We're not talking about Middle English or Old English.
>but the fact that there are "native speakers" involved in the debate
I should have put native speakers in quotes as well. What counts as a native Ulster Scots speaker is someone who speaks English with an NI accent with some localisms thrown in.
Nobody speaks the official Ulster Scots that was invented because the Irish language was getting support and political leaders on the other side of the community felt they deserved something as well. The Protestant community in NI see it as a bit of an embarrassment.
Yes, and I can read and understand historical Ulster Scots as well, but you were making a different point about codification/drift, no? The English I would find in those historical writings is quite different from what is being taught in schools today or recommended in style guides.
> What counts as a native Ulster Scots speaker is someone who speaks English with an NI accent with some localisms thrown in.
Then by your definition I am a native speaker. So how can we square it that you're telling me native speakers feel one way while I feel another way?
> Nobody speaks the official Ulster Scots
That's the nature of any newly codified minority language.
> The Protestant community in NI see it as a bit of an embarrassment.
There is no "Protestant community" in Northern Ireland. A Dungannon farmer, an East Belfast loyalist and a BT9 lecturer will all give you very different views despite being of Protestant background.
I'm not entertaining the notion that I have to pretend you're a native speaker when you've made clear you're only identifying as such for the purpose of making an argument.
>There is no "protestant community" in Northern Ireland.
Anyone who applies for a job in NI fills out a form where they are asked if they are a member of "the Protestant community", "the Roman Catholic community" or neither. You're denying the factual existence of the different communities in NI for the purpose of winning an argument on the internet.
Simply put, Ulster Scots' prominence in legislation is merely a reflection of bad-faith political negotiations by Unionists to degrade the status of the Irish Language Act by proxy. Anyone on the ground knows it for the dog-whistle that it is, used simply to curry favour with a particularly sectarian unionist base as a counter to the Irish Language provisions outlined and agreed to in the Good Friday Agreement.
And that's 'curry favour' - not 'curry my yoghurt' by the way. https://www.bbc.com/news/uk-northern-ireland-29895593
This has more or less been the case ever since the forced Ulster plantations led to the development of Ulster Scots as a defined community with resilient Protestant and unionist ties. It'd be far more credible if Fingal tried to secede from Dublin and the Republic tomorrow morning using Yola as a justification.
https://en.wikipedia.org/wiki/Yola_dialect
In short, the ILA and promotion of Gaeilge in the north is about trying to make some small reparation at a state level for a cultural genocide perpetrated by our Colonists, and to help re-establish the oldest written vernacular language in western Europe, dating back over 2,500 years.
The promotion of Ulster Scots however... well, the Commissioner is literally called 'Commissioner for Ulster Scots and Ulster British Tradition'. This is after DUP members removed themselves from the equality and good relations group after basically filibustering for 5 years of discussions on bilingual signs to force a stalemate.
https://www.belfastlive.co.uk/news/northern-ireland/dup-stor...
Ah right, I get you now! The point you're making is fair enough, apologies for drawing the labour from you to explain it so fully.
https://www.politico.eu/article/catalan-basque-galician-boos...
It was the de facto language, but not the official language. Which was baffling.
Also, English remains one of the main working languages of the EU bureaucracy, because for many EU states (especially in Eastern Europe) it is a more popular foreign language than the other two (French and German). When Czech diplomats need to talk to Spanish diplomats, English is the language they choose.
This idea people have here that “each country gets to nominate a language” isn’t how it actually works. The treaties just contain a list of languages, and which languages are in the list is down to diplomatic negotiations not any coherent principle.
https://www.irishstatutebook.ie/eli/2003/act/32/enacted/en/p...
It says that each country can only request ONE language. And Ireland requested Irish.
(In fact, to strengthen that probability: if it had been, say, French, when and why would it have switched to English? Just because the UK joined?)
“Brea, bûter en griene tsiis is goed Ingelsk en goed Frysk”
Latin would have been spoken in towns and cities, but as Roman rule collapsed it was replaced by Brittonic (ancestor of Welsh), unlike on the continent, where it developed into the various Latin-derived Romance languages.
https://web.archive.org/web/20170530232902/https://blogs.bl....
> Whoever preserved this story was also curious about Ohthere’s descriptions of where the Angles had lived ‘before they came into this land’ (England). Members of Alfred's court remembered that their ancestors came from mainland Europe, and they wanted to learn more about the lands which they identified as their own places of origin.
The scribe explicitly wrote things like "he said krán which we call crein", showing they were speaking in their own languages. It's even clearer if you consider our standard Old English is West Saxon from 850 and our standard Old Norse is from 1250 in Iceland (more different than the Danish variety of most Scandinavians in England). At the same time point, they would have had more similarities (8th century Danish had wír before w turned to v).
https://en.wikipedia.org/wiki/Ohthere_of_H%C3%A5logaland
The cadence and general way it sounds is much closer to English than any other language
I’m sure everyone is aware that English comes from Anglish, i.e., the Angles as in the Germanic tribe.
Deutsch is derived from Proto-Germanic (as best we can tell) þiudiskaz, meaning “the people”, i.e., the group of the different self-associating tribes. It gets far more interesting in that it seems many of the strong dialects, especially of southern Germany, Austria, and England, have in fact retained some very old words and pronunciations that were lost in more standardized, conformed, and perverted dialects.
> Some of these [Old Saxon] speakers took part in the Germanic conquest of England in the fifth century AD. While it is not true that English and Plattdeutsch derive completely from the same source, the Old Saxon input into Anglo-Saxon was of primary importance and this linguistic group contributed greatly to the Anglo-Saxon dialects which our English forefathers spoke.
[1]: http://www.plattmaster.de/plattoew.htm
They get certain recognition, but they are not official in Europe. For example, just from Spain there are 13 languages on that list.
Irish is certainly not a robust, vigorous language, but your 40,000-80,000 numbers downplay it, I'd suggest. Here are some statistics from DeepSeek
Sorry, but it is.
Fun fact: villages, towns, and cities in Frisia often have names which differ in Frisian and Dutch. In those cases the signs at the place limits will have both names listed; the official one on top (which in some cases is the Dutch name (e.g., Leeuwarden/Ljouwert) and in some cases the Frisian (e.g., Gytsjerk/Giekerk)).
And huh, interesting, I didn't know that for some places with bilingual names the Dutch name is official and for others the Frisian is. Who gets to decide that, the municipality?
In a number of cases originally Frisian names actually supplanted older Dutch names (e.g., Burgum, Grou, Eastermar, etc.), so those places have just one name in both languages (except on the Dutch language Wikipedia because of weird reasoning about allowable sources and apparently a hatred of Frisianised Dutch names).