Why Rdf Is the Natural Knowledge Layer for AI Systems

Posted4 months agoActive4 months ago

arto

68 points

71 comments

bryon.ioTechstory

skepticalmixed

Debate

80/100

RdfAIKnowledge GraphsSemantic Web

Key topics

Rdf

Knowledge Graphs

Semantic Web

The article argues that RDF is the natural knowledge layer for AI systems, but the discussion is marked by skepticism and debate about RDF's limitations and potential.

Snapshot generated from the HN discussion

Discussion Activity

Very active discussion

First comment

34m

Peak period

0-12h

Avg / period

8.9

Comment distribution71 data points

Loading chart...

Based on 71 loaded comments

Key moments

01Story posted
Sep 5, 2025 at 1:39 AM EDT
4 months ago
Step 01
02First comment
Sep 5, 2025 at 2:13 AM EDT
34m after posting
Step 02
03Peak activity
54 comments in 0-12h
Hottest window of the conversation
Step 03
04Latest activity
Sep 11, 2025 at 7:01 PM EDT
4 months ago
Step 04

Generating AI Summary...

Analyzing up to 500 comments to identify key contributors and discussion patterns

Discussion (71 comments)

Showing 71 comments

flanked-evergl

4 months ago

3 replies

RDF is great but it's somewhat inadvertently captured by academia.

The tooling is not in a state where you can use it for any commercial or mission critical application. The tooling is mainly maintained by academics, and their concerns run almost exactly counter to normal engineering concerns.

An engineer would rather have tooling with limited functionality that is well designed and behaves correctly without bugs.

Academics would rather have tooling with lots of niche features, and they can tolerate poor design, incorrect behavior and bugs. They care more for features, even if they are incorrect, as they need to publish something "novel".

The end result is that almost all things you find for RDF is academia quality and lots of it is abandoned because it was just part of publication spam being pumped and dumped by academics that need to publish or perish.

Anyone who wants to use it commercially really has to start from scratch almost.

jraph

4 months ago

3 replies

> even if they are incorrect

Uh. Do you have a source for this? Correctness is a major need in academia.

tsimionescu

4 months ago

1 reply

I think they mean things like a tool that has feature X even if it crashes 50% when it is used is preferable to a tool that doesn't have feature X at all.

jraph

4 months ago

Ok, makes sense, I hadn't read it like this. For me, "correct" means "provides correct results".

rglullis

4 months ago

2 replies

Correct != Bug-free.

My experience working with software developed by academics is that it is focused on getting the job done for a very small user base of people who are okay with getting their hands dirty. This means lots of workarounds, one-off scripts, zero regards for maintainability or future-proofing...

jraph

4 months ago

This I fully agree with.

DougBTX

4 months ago

“Incomplete” seems like a better word than “incorrect” for this. The code is likely correct in the narrow scope it was needed for, but is missing features (and error checking!) beyond the happy path, making it easy to use incorrectly.

lmm

4 months ago

1 reply

> Correctness is a major need in academia.

How so? Consider the famous result that most published research findings are false.

jraph

4 months ago

How so? Finding correct stuff is the whole point of research, no matter the extent at which it actually succeeds in reaching this. So yes, regardless on the actual results it is a major need in academia. We have nothing better anyway (which doesn't need it can't improve; we critically need it to improve).

Now. I'll assume you are referring to "Why Most Published Research Findings Are False". This paper is 20 years old, only addresses medical research despite its title, and seems to have mixed reception [1]

> Biostatisticians Jager and Leek criticized the model as being based on justifiable but arbitrary assumptions rather than empirical data, and did an investigation of their own which calculated that the false positive rate in biomedical studies was estimated to be around 14%, not over 50% as Ioannidis asserted.[12] Their paper was published in a 2014 special edition of the journal Biostatistics along with extended, supporting critiques from other statisticians

14% is a huge concern and I think nobody will disagree with this. But we are far from most, if this is true.

[1] https://en.wikipedia.org/wiki/Why_Most_Published_Research_Fi...

philjohn

4 months ago

2 replies

Yes and no.

I worked for a company that went hard into "Semantic Web" tech for libraries (as in, the places with books), using an RDF Quad Store for data storage (OpenLink Virtuoso) and structuring all data as triples - which is a better fit for the Heirarchical MARC21 format than a relational database.

There are a few libraries (the software kind) out there that follow the W3 spec correctly, Redland being one of them.

simonw

4 months ago

1 reply

How well did that work? Based on your experience at that company would you build a new project on the stack that they chose?

philjohn

4 months ago

It worked very well, as I mentioned, Marc21 (the interchange format for bibliographic data) is heirarchical, not relational, so there was already a better impedance match.

Then with URL's being the primary identifiers, it was trivial to take a large dataset like VIAF (Virtual International Authority File - canonical representation of all authors) and query the two together seamlessly.

Virtuoso was a pretty good Quad Store, and we got away with storing tens of billions of triples on a 4 node cluster, with very fast query times (although sticking to Sparql 1.1 and not leaning on property paths).

As to if I would choose it again ... I don't know. I'm now a decade out of the library space and haven't seen anything in my day-to-day work (backend distributed systems) that would benefit from the RDF data model.

Zardoz84

4 months ago

I'm in a similar boat. On my case, it's software for public libraries, and it's a must having data accessible as RDF. Event, we have our own public fork of Marc4j .

ragebol

4 months ago

Tooling sounds like it can be fixed? If the knowledge bases are useful, why not use them with better tools?

mdhb

4 months ago

1 reply

Maybe worth also pointing out that a meaningful refresh of the RDF specification is getting rather close to completion.

Hopefully version 1.2 which addresses a lot of shortcomings should officially be a thing this year.

In the meantime you can take a look at some of the specification docs here https://w3c.github.io/rdf-concepts/spec/

jiggawatts

4 months ago

1 reply

The sibling comment by flanked-evergl "RDF is great but it's somewhat inadvertently captured by academia." is made manifestly obvious when reading this spec.

It's overburdened by terminology, an exponential explosion of nested definitions, and abstraction to the point of unintelligibility.

It is clear that the authors have implementation(s) of the spec in mind while writing, but very carefully dance around it and refuse to be nailed down with pedestrian specifics.

I'm reminded of the Wikipedia mathematics articles that define everything in terms of other definitions, and if you navigate to those definitions you eventually end up going in circles back to the article you started out at, no wiser.

mdhb

4 months ago

For what it’s worth, I’m half way through writing a RDF library at the moment and I get what you are saying in an abstract / first impression sense but it all made sense and wasn’t a huge pain to work with in practice.

I just think that the idea of how to represent knowledge graphs is a topic that has hidden complexities in it that you need to make explicit.

My impression is that the mismatch comes from you get introduced to the idea of a triple just being a “subject, predicate and and object” and it’s incredibly simple but again, real life knowledge representation is a complicated thing and there’s a bit to unpack beyond that first impression.

rglullis

4 months ago

2 replies

Wrote this about one month ago here at https://news.ycombinator.com/item?id=44839132

I'm completely out of time or energy for any side project at the moment, but if someone wants to steal my idea: please take an llm model and fine tune so that it can take any question and turn it into a SparQL query for Wikidata. Also, make a web crawler that reads the page and turns into a set of RDF triples or QuickStatements for any new facts that are presented. This would effectively be the "ultimate information organizer" and could potentially turn Wikidata into most people's entry page of the internet.

luguenth

4 months ago

1 reply

https://spinach.genie.stanford.edu/

Here you are :)

yorwba

4 months ago

I asked "Which country has the most subway stations?" and got the query

  SELECT ?country (COUNT(*) AS ?stationCount) WHERE {
    ?station wdt:P31 wd:Q928830.
    ?station wdt:P17 ?country.
  }
  GROUP BY ?country
  ORDER BY DESC(?stationCount)
  LIMIT 1

https://query.wikidata.org/#SELECT%20%3Fcountry%20%28COUNT%2...

which is not unreasonable as a quick first attempt, but doesn't account for the fact that many things on Wikidata aren't tagged directly with a country (P17) and instead you first need to walk up a chain of "located in the administrative territorial entity" (P131) to find it, i.e. I would write

  SELECT ?country (COUNT(DISTINCT ?station) AS ?stationCount) WHERE {
    ?station wdt:P31 wd:Q928830.
    ?station wdt:P131*/wdt:P17 ?country.
  }
  GROUP BY ?country
  ORDER BY DESC(?stationCount)
  LIMIT 1

https://query.wikidata.org/#SELECT%20%3Fcountry%20%28COUNT%2...

In this case it doesn't change the answer (it only finds 3 more subway stations in China), but sometimes it does.

IanCal

4 months ago

1 reply

Even without tuning Claude is pretty solid at this, just give it the sparql endpoint as a tool call. Claude can generate this integration too.

rglullis

4 months ago

1 reply

But the idea of tuning the model for this task is to make a model that is more efficient, cheaper to operate and not requiring $BILLIONS of infrastructure going to the hands of NVDA and AMZN.

ako

4 months ago

1 reply

I've built an mcp for sparql and rdf. Used claude on iphone to turn pictures of archeological site information shields to transcription, to an ontology, to an rdf, to an er-model and sql statements, and then with mcp tool and claude desktop to save the data into parquet files on blobstorage and the ontology graph into a graph database. Then used it to query data from parquet (using duckdb), where sonnet 4 used the rdf graph to write better sql statements. Works quite well. Now in the process of using sonnet 4 to find the optimal system prompt for qwen coder to also handle rdf and sparql: i've given sonnet 4 access to qwen coder through an mcp tool, so it can trial and error different system prompt strategies. Results are promising, but can't compete with the quality of sonnet 4.

Graph database vendors are now trying to convince you that AI will be better with a graph database, but what i've seen so far indicates that the LLM just needs the RDF, not an actual database with data stored in triplets. Maybe because these were small tests, if you need to store a large amount of id mappings it may be different.

ricksunny

4 months ago

>what i've seen so far indicates that the LLM just needs the RDF, not an actual database with data stored in triplets

While I guarantee you know much more than I do about graph databases and RDFs, in practice, what is the difference between an RDF graph database and an RDF? They're both a set of text-based triplets, no?

retube

4 months ago

2 replies

What is "RDF" ? Not defined in the article

jraph

4 months ago

3 replies

Resource Description Framework [1] is basically a way to describes resources with (subject, verb, object) predicates, where subject is the resource being described and object is another resource related to the subject in a way verb defines (verb is not necessarily a grammatical verb/action, it's often a property name).

There are several formats to represent these predicates (Turtle), database implementations, query languages (SPARQL), and there are ontologies, which are schemas, basically, defining/describing what how to describe resource in some domain.

It's highly related to the semantic web vision of the early 2000s.

If you don't know about it, it is worth taking a few minutes to study it. It sometimes surfaces and it's nice to understand what's going on, it can give good design ideas, and it's an important piece of computer history.

It's also the quiet basis for many things, OpenGraph [3] metadata tags in HTML documents are basically RDF for instance. (TIL about RDFa [4] btw, I had always seen these meta tags as very RDF-like, for a good reason indeed).

[1] https://en.wikipedia.org/wiki/Resource_Description_Framework

[2] https://en.wikipedia.org/wiki/Semantic_Web

[3] https://ogp.me/

[4] https://en.wikipedia.org/wiki/RDFa

rapnie

4 months ago

1 reply

Does OpenGraph gain any benefit from its definition as linked data? Or might it just as well have been defined as, say, JSON Schema's and referring to those property names in html?

jraph

4 months ago

I'm not expert on OpenGraph, and it's been a while I've actually manipulated RDF other than the automatically generated og meta tags.

I'd say defining this as linked data was quite idiomatic / elegant. It's possibly mainly because OpenGraph was inspired of Dublin Core [1], which was RDF-based. They didn't reinvent everything with OpenGraph, but kept the spirit, I suppose.

In the end it's probably quite equivalent.

And in this end, why not both? Apparently we defined an RDF ontology for JSON schemas! [2]

[1] https://en.wikipedia.org/wiki/Dublin_Core

[2] https://www.w3.org/2019/wot/json-schema

pbronez

4 months ago

Also worth checking out schema.org, which provides a baseline ontology that pops up in many places

https://schema.org/Thing

retube

4 months ago

Thanks vm for the detailed reply, most helpful!

vixen99

4 months ago

2 replies

We meet this casual use of acronyms all too often on HN. It only takes a line or two to enable everyone to follow along without recourse to a search expedition.

jraph

4 months ago

1 reply

You wouldn't spell out HyperText Markup Language each time.

RDF is one of those things it's easy to assume everybody has already encountered. RDF feels fundamental. Its predicate triplet design is fundamental, almost obvious (in hindsight?). It could not have not existed. Had RDF not existed, something else very similar would have appeared, it's a certitude.

But we might have reached a point where this assumption is quite false though. RDF and the semantic web were hot in the early 2000s, which was twenty years ago after all.

rapnie

4 months ago

1 reply

Tim Berners-Lee, now co-founder of Inrupt [0], has launched Solid Project [1] where he kept working on semantic web concepts and linked data specs. Looks like Inrupt went full AI today. What ailed Solid was I think the academic approach mentioned in other comments, the heavy-weight specification process (inspired by W3C), and overlooking the fact that you better get your dev community on board and excited as a good road to adoption. Inrupt didn't spend much attention to their Solid community, except for the active followers in their chat channels, and were directly targeting commercial customers. I don't know the health of Solid project today, but there are a couple of interesting projects around social networking and the fediverse.

[0] https://www.inrupt.com/about

[1] https://solidproject.org/

mdhb

4 months ago

1 reply

I wish more people would get on board with Solid, it would solve a lot of the surveillance capitalism problems that plague modern society.

For what it’s worth, I think Solid is still very much “in development” and is a pre 1.0 thing no?

rapnie

4 months ago

My impression too. I felt that in the past "solid apps" in practice boiled down to "rebooting the semantic web", and adopting / maturing the whole host of specs needed for end-to-end generic linked data app development defined by open standards. Solid as "Out with the old, in with the new. The web done right, this time". Which may be too ambitious.

Update: Reboot the semantic web seems still the goal. From About Solid page [0]:

> Solid is an open standard for structuring data, digital identities, and applications on the Web. Solid aims to support the creation of the Web as Sir Tim Berners-Lee originally envisioned it when he invented the Web at CERN in 1989. Tim sometimes refers to Solid as “the web - take 3" — or Web3.0 — because Solid integrates a new layer of standards into the Web we already have. The goal of Solid is for people to have more agency over their data.

A more manageable scope for the project would be "Personal data vaults for self-sovereign linked data" or something like that. That positioning would make me more interested than current "let's reboot" call-to-action.

[0] https://solidproject.org/about

Animats

4 months ago

Right. I was thinking "RDF - vaguely remember that as some XML thing from Semantic Web era".

Yup, it's still that RDF. Inevitably, it had to be converted to new JSON-like syntaxes.

It reminds me of the "is-a" predicate era of AI. That turned out to be not too useful for formal reasoning about the real world. As as a representation for SQL database output going into a LLM, though, it might go somewhere. Maybe.

Probably because the output of an SQL query is positional, and LLMs suck at positional representations.

jandrewrogers

4 months ago

1 reply

As the article itself points out, this has been around for 25 years. It isn’t an accident that nobody does things this way, it wasn’t an oversight.

I worked on semantic web tech back in the day, the approach has major weaknesses and limitations that are being glossed over here. The same article touting RDF as the missing ingredient has been written for every tech trend since it was invented. We don’t need to re-litigate it for AI.

rglullis

4 months ago

2 replies

I would be very interested in reading what you think it can't work. I am inclined to agree with the post on a sibling thread that mentions that the main problem with RDF is that it is been captured by academia.

FrankyHollywood

4 months ago

1 reply

The article states "When that same data is transformed into a knowledge graph"

This is a non-trivial exercise. How does one transform knowledge into a knowledge graph using RDF?

RFD is extremely flexible and can represent any data and that's exactly it's great weakness. It's such a free format there is no consensus on how to represent knowledge. Many academic panels exist to set standards, but many of these efforts end up in github as unmaintained repositories.

The most important thing about RDF is that everyone needs to agree on the same modeling standards and use the same ontologies. This is very hard to achieve, and room for a lot of discussion, which makes it 'academic' :)

bfuller

4 months ago

1 reply

>This is a non-trivial exercise. How does one transform knowledge into a knowledge graph using RDF?

by using the mcp memory knowledge graph tool, which just worked out of the box for my application of turning forum posts into code implementations.

ricksunny

4 months ago

Link? I'm still new to mcp so forgive me if this is trivially uniquely found using your keywords.

4ndrewl

4 months ago

IME it's less than a "capture", more that most outside of academia don't have the requisite learning to be able to think in the abstract outside of trivial examples.

zekrioca

4 months ago

1 reply

Author listed RDF a couple dozen of times but didn’t define it, so:

The Resource Description Framework (RDF) is a standard model for data interchange on the web, designed to represent interconnected data using a structure of subject-predicate-object triples. It facilitates the merging of data from different sources and supports the evolution of schemas over time without requiring changes to all data consumers.

jraph

4 months ago

1 reply

I wrote a comment trying to explain it there, with a concrete, current and widespread example: https://news.ycombinator.com/item?id=45135302#45135593

zekrioca

4 months ago

Sure, but this should have been included in the article, not here.

IanCal

4 months ago

4 replies

This seems to miss the other side of why all this failed before.

Rdf has the same problems as the sql schemas with information scattered. What fields mean requires documentation.

There - they have a name on a person. What name? Given? Legal? Chosen? Preferred for this use case?

You only have one id for apple eh? Companies are complex to model, do you mean apple just as someone would talk about it? The legal structure of entities that underpins all major companies, what part of it is referred to?

I spent a long time building identifiers for universities and companies (which was taken for ROR later) and it was a nightmare to say what a university even was. What’s the name of Cambridge? It’s not “Cambridge University” or “The university of Cambridge” legally. But it also is the actual name as people use it. The university of Paris went from something like 13 institutes to maybe one to then a bunch more. Are companies locations at their headquarters? Which headquarters?

Someone will suggest modelling to solve this but here lies the biggest problem:

The correct modelling depends on the questions you want to answer.

Our modelling had good tradeoffs for mapping academic citation tracking. It had bad modelling for legal ownership. There isn’t one modelling that solves both well.

And this is all for the simplest of questions about an organisation - what is it called and is it one or two things?

simonw

4 months ago

2 replies

That university example is fantastic.

I went looking and as far as I can tell "The Chancellor, Masters, and Scholars of the University of Cambridge" is the official name! https://www.cam.ac.uk/about-the-university/how-the-universit...

IanCal

4 months ago

1 reply

That's the one! It's not even that weird of a case compared to others but is an excellent example.

Here's the history of the Paris example: https://en.wikipedia.org/wiki/University_of_Paris where there was one, then many, then fewer universities. Answering a question of "what university is referred to by X" depends on why you want to know, there are multiple possible answers. Again it's not the weirdest one, but a good clear example of some issues.

There's a company called Merk, and a company called Merk. Merk is called Merk in the US but MSD outside of it. The other Merk is called Merk outside the US and EMD inside it. Technically one is Merk & Co and used to be part of Merk but later wasn't and due to trademark disputes, which aren't even all resolved yet.

This is an area I think LLMs actually have a space to step in, we have tried perfectly modelling everything so we can let computers which have no ability to manage ambiguity answer some questions. We have tried barely modelling anything and letting humans figure out the rest, as they're typically pretty poor at crafting the code, and that has issues. We ended up settling largely on spending a bunch of human time modelling some things, then other humans building tooling around them to answer specific questions by writing the code, and a third set who get to actually ask the questions.

LLMs can manage ambiguity, and they can also do more technical code based things. We haven't really historically had things that could manage ambiguity like this for arbitrary tasks without lots of expensive human time.

I am now wondering if anyone has done a graph db where the edges are embedding vectors rather than strict terms.

isoos

4 months ago

1 reply

> I am now wondering if anyone has done a graph db where the edges are embedding vectors rather than strict terms.

Curious: how would you imagine it working if there were such a graph db?

IanCal

4 months ago

1 reply

I had the idea a few hours ago so I'm sure there are holes in this but my first idea is forming a graph where the relationship isn't a fixed label but a description that is then embedded as a vector.

First of all, consider that in a way each edge label is a one-hot binary vector. And we search using only binary methods. A consequence is anything outside of that very narrow path all data is missed in a search. A simple step could be to change that to anything within an X similarity to some target vector. Could you then search "(fixed term) is a love interest of b?" and have b? filled from facts like "(fixed term) is intimate with Y" and "(fixed term) has a date with Z"?

There are probably issues, I'm sure there are, but some blend of querying but with some fuzziness feels potentially useful.

fishmicrowaver

4 months ago

1 reply

Isn't this exactly what neo4j does for graphrag?

IanCal

4 months ago

1 reply

Is that vectors for edges or for searching the nodes? I’m talking about encoding the edges as vectors for traversal.

fishmicrowaver

4 months ago

1 reply

Yes you can do that with neo4j.

IanCal

4 months ago

Interesting, thanks, I'll have to explore that.

muglug

4 months ago

Brb updating my LinkedIn

jtwaleson

4 months ago

2 replies

Indeed, I often get the impression that (young) academics want to model the entire world in RDF. This can't work because the world is very ambiguous.

Using it to solve specific problems is good. A company I work with tries to do context engineering / adding guard rails to LLMs by modeling the knowledge in organizations, and that seems very promising.

The big question I still have is whether RDF offers any significant benefits for these way more limited scopes. Is it really that much faster, simpler or better to do queries on knowledge graphs rather than something like SQL?

IanCal

4 months ago

1 reply

I think it's a journey a lot of us have gone on, it's an appealing idea until you hit a variety of really annoying cases and where you are depends on how you end up trying to solve it. I'm maybe being unfair to the academic side but this is how I've seen it (exaggerated to show what I mean hopefully).

The more academic side will add more complexity to the modelling, trying to model it all.

The more business side will add more shortcuts to simplify the modelling, trying to get just something done.

Neither is wrong as such but I prefer the tendency to focus on solving an actual problem because it forces you to make real decisions about how you do things.

I think being able to build up knowledge in a searchable way is really useful and having LLMs means we finally have technology that understands ambiguity pretty well. There's likely an excellent place for this now that we can model some parts precisely and then add more fuzzy knowledge as well.

> The big question I still have is whether RDF offers any significant benefits for these way more limited scopes. Is it really that much faster, simpler or better to do queries on knowledge graphs rather than something like SQL?

I'm very interested in this too, I think we've not figured it out yet. My guess is probably no in that it may be easier to add the missing parts to non-rdf things. I have a rough feeling that actually having something like a well linked wiki backed by data sources for tables/etc would be great for an llm to use (ignoring cost, which for predictions across a year or more seems pretty reasonable).

They can follow links around topics across arbitrary sites well, you only need more programmatic access for aggregations typically. Or rare links.

ddkto

4 months ago

The academic / business divide is a great example of the correct model depending on what you want to do. The academic side wants to understand, the business side wants to take action.

For example, the Viable System Model[1] can capture a huge amount of nuance about how a team functions, but when you need to reorganize a disfunctional team, a simple org chart and concise role descriptions are much more effective.

[1] https://en.wikipedia.org/wiki/Viable_system_model

pbronez

4 months ago

1 reply

Which company? I need to build an enterprise knowledge graph.

jtwaleson

4 months ago

A small startup in the Netherlands, but they're very much searching for approaches themselves, I don't think they can help you right now.

dwaite

4 months ago

> The correct modelling depends on the questions you want to answer.

Coincidentally, my main point in any conversation about UML I've ever had

AtlasBarfed

4 months ago

Basically it's name spacing hell right?

To adapt the saying, an engineer is talking to another engineer about is system, saying he's having issues with names. So he's thinking of using name spaces.

Now he has two problems

tannhaeuser

4 months ago

Here's the first paragraph of that article:

> The Big Picture: Knowledge graphs triple LLM accuracy on enterprise data. But here’s what nobody tells you upfront: every knowledge graph converges on the same patterns, the same solutions. This series reveals why RDF isn’t just one option among many — it’s the natural endpoint of knowledge representation. By Post 6, you’ll see real enterprises learning this lesson at great cost — or great savings.

If you really want to continue reading and discuss this kind of drivel, go ahead. RDF the "natural endpoint of knowledge representation" right. As someone having worked on commercial RDF projects at the time, after two decades of pushing RDF by a self-serving W3C and academia until around 2018 or so, let's say I welcome people having come to their senses and are back at working with Datalog and Prolog. Even as a target for neurolinguistics and generation by coding LLMs does SPARQL suck because of its idiosyncratic, design-by-comittee nature compared to the minimalism and elegance of Prolog.

epolanski

4 months ago

Small rant: I hate how RDF is the central topic of the blog post, yet it isn't defined even *once*.

For the interested: resource description framework.

kidehen2

4 months ago

RDF provides a very natural layer for AI systems. Today, LLM-based AI systems are fundamentally challenged by hallucinations, which makes RDF-based knowledge graphs—constructed using Linked Data Principles—a powerful complement. By using hyperlinks to denote edges and nodes, these graphs enable context enrichment through ontology lookups combined with reasoning and inference.

For a detailed post on this synergy, see: https://www.linkedin.com/pulse/large-language-models-llms-po...

Disclaimer: I am the Founder & CEO of OpenLink Software, creators of Virtuoso.

never_inline

4 months ago

So so obvious slop and pains my eyes to read.

barrenko

4 months ago

Or a semantic layer it's called?

ricksunny

4 months ago

Five times in that article he says some version of “Accuracy triples”.

What does that even mean? Suppose something 97% accurate became 99.5% accurate? How can we talk of accuracy doubling or tripling in that context? The only way I could see that working is if the accuracy of something went from say 1% to 3% or 33% to 99%. Which are not realistic values in the LLM case. (And I’m writing as a fan of knowledge graphs).

Kwpolska

4 months ago

Stop trying to make Semantic Web happen. It’s not going to happen.

verisimi

4 months ago

RDF - Resource Description Framework

> The Resource Description Framework (RDF) is a method to describe and exchange graph data. It was originally designed as a data model for metadata by the World Wide Web Consortium (W3C).

https://www.wikipedia.org/wiki/Resource_Description_Framewor...

crabmusket

4 months ago

I really like RDF in theory, as a lot of its ideas just make sense to me:

- Using URIs to clarify ambiguous IDs and terms

- EAV or subject/verb/object representation for all knowledge

- "Open world" graph where you can munge together facts from different sources

I guess using RDF specifically, instead of just inventing your own graph database with namespaced properties, means using existing RDF tooling and languages like SPARQL, OWL, SHACL etc.

Having looked into the RDF ecosystem to see if I can put something together for a side project inspired by https://paradicms.github.io, it really feels like there's a whole shed of tools out there, but the shed is a bit dingy, you can't really tell the purpose of the oddly-shaped tools you can see, nobody's organised and laid things out in a clear arrangement and, well, everything seems to be written in Java, which shouldn't be a huge issue but really isn't to my taste.

View full discussion on Hacker News

ID: 45135302Type: storyLast synced: 11/20/2025, 3:44:06 PM

Want the full context?

Jump to the original sources

Read the primary article or dive into the live Hacker News thread when you're ready.

Open link View on HN