JSON River – Parse JSON Incrementally as It Streams In
Posted 3 months ago · Active 3 months ago
Key topics
JSON Parsing
Streaming Data
Large Language Models
JSON River is a JavaScript library that parses JSON incrementally as it streams in, useful for handling large or unbounded JSON data, with discussion around its use cases, design choices, and comparisons to other libraries.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
- First comment: 50m after posting
- Peak period: 84 comments (Day 6)
- Avg / period: 13.7 comments
Based on 96 loaded comments
Key moments
1. Story posted: Oct 8, 2025 at 12:38 PM EDT (3 months ago)
2. First comment: Oct 8, 2025 at 1:28 PM EDT (50m after posting)
3. Peak activity: 84 comments in Day 6, the hottest window of the conversation
4. Latest activity: Oct 22, 2025 at 10:01 PM EDT (3 months ago)
ID: 45518033 · Type: story · Last synced: 11/20/2025, 8:42:02 PM
The benefit with that was that you didn't need to hold the entire deserialized JSON object in memory.
This seems to be more oriented towards interactivity, which is an interesting use-case I hadn't thought about.
I would expect an object JSON stream to be more like a SAX parser though. It's familiar, fast and simple.
Any thoughts on not choosing the SAX approach?
I don't see it as particularly convenient if I want to stream a large array of small independent objects and read each one of them once, then discard it. The incremental parsed array would get bigger and bigger, eventually containing all the objects I wanted to discard. I would also need to move my array pointer to the last element at each increment.
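To make the commenter's point concrete, here is a hedged sketch of the workaround that snapshot-style output would force, assuming jsonriver's `parse`, a top-level array document, and (per the invariants discussed elsewhere in the thread) that every array element but the last is final; `handle` and `stream` are hypothetical:

```ts
import { parse } from 'jsonriver';

declare const stream: AsyncIterable<string>; // e.g. a decoded fetch body
const handle = (item: unknown) => { /* process one element, then discard it */ };

let seen = 0;
for await (const snapshot of parse(stream)) {
  const items = snapshot as unknown[]; // assume the document is one big array
  // All elements except the possibly-partial last one are safe to hand off once.
  for (; seen < items.length - 1; seen++) {
    handle(items[seen]);
  }
  // But `items` still retains everything parsed so far, so memory grows anyway.
}
```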
jq and JSON.sh have similar incremental "mini-object-before-complete" approaches to parsing JSON. However, they do include some tools to shape those mini-objects (pruning, selecting, and so on). Also, they're tuned for pipes (new line is the event), which caters to shell and text-processing tools. I wonder what would be the analogue for that in a higher language.
https://www.npmjs.com/package/bfj
EDIT: this is totally wrong and the question is right.
```json {"name": "Al"} {"name": "Ale"} ```
So the braces are always closed
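For context, a minimal consumption sketch based on how the thread describes the API (`parse` returning an AsyncIterable of successively more complete values, and accepting an async iterable of string chunks); the URL is made up:

```ts
import { parse } from 'jsonriver';

const response = await fetch('/user.json'); // hypothetical endpoint
const text = response.body!.pipeThrough(new TextDecoderStream());

for await (const partial of parse(text)) {
  // Each yielded value is well-formed JSON-shaped data:
  // {} -> {name: "Al"} -> {name: "Ale"} -> ... -> the final object.
  console.log(partial);
}
```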
Then added benchmarks and started doing optimization, getting it ~10x faster than my initial naive implementation. Then I threw agents at it, and between Claude, Gemini, and Codex we were able to make it an additional 2x faster.
So I'd definitely count strings as "unbounded" as well.
For my use case I wanted streaming parse of strings, I was rendering JSON produced by an LLM, for incrementally rendering a UI, and some of the strings were long enough (descriptions) that it was nice to see them render incrementally.
I _think_ the intended use of this is for people with bad internet connections so your UI can show data that's already been received without waiting for a full response. I.e. if their connection is 1KB/s and you send an 8KB JSON blob that's mostly a single text field, you can show them the first kilobyte after a second rather than waiting 8 seconds to get the whole blob.
At first I thought maybe it was for handling gigantic JSON blobs that you don't want to entirely load into memory, but the API looks like it still loads the whole thing into memory.
Why not at least wait until the key is complete - what's the use in a partial key?
{"cleanup_cmd":"rm -rf /home/foo/.tmp" }
I’m imagining it in my mental model as being typed “unknown”. Anything that prevents accidental use as if it were a whole string… I imagine a more complex type with an “isComplete” flag of sorts would be more powerful but a bit of a blunderbuss.
[1]: https://github.com/globalaiplatform/langdiff/tree/main/ts
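A hypothetical TypeScript sketch of that "isComplete" idea (not from any of the libraries discussed): a tagged type that forces callers to check completeness before treating the value as a whole string.

```ts
// Tag in-flight strings so a partial value can't accidentally be used as a
// finished one; the discriminant makes the check unavoidable at compile time.
type StreamingString =
  | { isComplete: true; value: string }
  | { isComplete: false; value: string }; // prefix received so far

const runValidation = (s: string) => { /* safe only on the whole string */ };
const showPreview = (s: string) => { /* display-only use of a prefix */ };

function renderName(name: StreamingString) {
  if (name.isComplete) {
    runValidation(name.value);
  } else {
    showPreview(name.value);
  }
}
```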
> As a consequence of 1 and 5, we only add a property to an object once we have the entire key and enough of the value to know that value's type.
name: A
name: Al
name: Ale
name: Alex
Which would suggest you are getting unfinished strings out in the stream.
[1] https://github.com/karminski/streaming-json-js
Does it create a new value each time, or just mutate the existing one and keep yielding it?
I wrote it when I was doing prototyping on doing streaming rendering of UIs defined by JSON generated by LLMs. Using constrained generation you can essentially hand the model a JSON serializable type, and it will always give you back a value that obeys that type, but the big models are slow enough that incremental rendering makes a big difference in the UX.
I'm pretty proud of the testing that's gone into this project. It's fairly exhaustively tested. If you can find a value that it parses differently than JSON.parse, or a place where it disobeys the 5+1 invariants documented in the README I'd be impressed (and thankful!).
This API, where you get a series of partial values, is designed to be easy to render with any of the `UI = f(state)` libraries like React or Lit, though you may need to short circuit some memoization or early exiting since whenever possible jsonriver will mutate existing values rather than creating new ones.
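A rough sketch of what that loop can look like with a `UI = f(state)` renderer; `root.render` stands in for React or Lit, and cloning is one way to defeat memoization when the parser mutates values in place:

```ts
import { parse } from 'jsonriver';

declare const root: { render(state: unknown): void }; // e.g. a React root
declare const body: AsyncIterable<string>;            // decoded response chunks

for await (const partial of parse(body)) {
  // jsonriver may mutate and re-yield the same object, so reference-equality
  // checks would skip updates; cloning gives the renderer fresh identities.
  root.render(structuredClone(partial));
}
```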
I can also imagine it being useful to have a mode where you never emit strings until they are final. I don't entirely understand why strings are emitted incrementally but numbers aren't.
Name: John Smith. Birth Year: A.D. 1 [Customer is a Senior: 2,024 years old]
Name: John Smith. Birth year: A.D. 19 [Customer is a Senior: 2,006 years old]
Name: John Smith. Birth year: A.D. 199 [Customer is a Senior: 1,826 years old]
Name: John Smith. Birth year: 1997
Seriously, you should be able to update the UI with a new character, and much more, at 60fps easily.
(but for other uses - nope)
For instance, imagine you don't fully control the backend to split up a large response into several smaller API calls, but you could render the top part of the UI, which may be the most useful part, from the first couple of keys in the JSON, while a large "transaction history" after that is still downloading.
> The parse function also matches JSON.parse's behavior for invalid input. If the input stream cannot be parsed as the start of a valid JSON document, then parsing halts and an error is thrown. More precisely, the promise returned by the next method on the AsyncIterable rejects with an Error. Likewise if the input stream closes prematurely.
As for why strings are emitted incrementally, it's just that I was often dealing with long strings produced slowly by LLMs. JSON encoded numbers can be big in theory, but there's no practical reason to do so as almost everyone decodes them as 64bit floats.
Previously the parser would get an array of tokens each time it pushed data into the tokenizer. This was easy to write, but it meant we needed to allocate token objects. Now the tokenizer has a reference to the parser and calls token-specific methods directly on it. Since most of the tokens carry no data, this keeps us from jumping all over the heap so much. If we were parsing a more complicated language this might become a huge pain in the butt, but JSON is simple enough, and the test suite is exhaustive enough, that we can afford a little nightmare spaghetti if it improves on speed.
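An illustrative sketch (not jsonriver's actual code) of the refactor described: dataless tokens become direct method calls on the parser instead of allocated token objects.

```ts
// Before: each push() returned allocated token objects like
//   {kind: 'beginObject'} | {kind: 'string', value: '...'}
// After: the tokenizer holds a reference to the parser and calls it directly.
interface ParserSink {
  beginObject(): void;
  endObject(): void;
  string(value: string): void;
}

class Tokenizer {
  constructor(private readonly sink: ParserSink) {}

  push(chunk: string): void {
    for (const ch of chunk) {
      if (ch === '{') this.sink.beginObject();    // no allocation for `{`
      else if (ch === '}') this.sink.endObject(); // or `}`
      // ...string, number, and literal handling elided...
    }
  }
}
```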
There might be room for some helper functions in something like a 'jsonriver/helpers.js' module. I'll poke around at it.
Then I see a Node style import and npm. When did Node/NPM stop being dependencies and become standardized by JavaScript? Where's my raw es6 module?
The library doesn't use any APIs beyond those in the JS standard, so I'm pretty confident it will work everywhere, but happy to publish in more places and run more tests. Any in particular that you'd like to see?
For some reason everybody in the JS world takes "download and execute random software from the Internet" as the only way to do things.
The Debian approach of having global versions of libraries seems like it's solving a different problem than the ones I have. I want each application to track and version its own dependencies, so that upgrading a dependency for one doesn't break another, and so that I can go back to an old project and be reasonably confident it'll still work. That ultimately led me to nix.
It's amazing how much the quality of installed software improves when you do this. Something our industry desperately needs.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
And node seems to be used only as a dev dependency, to test, benchmark and build/package the project. If you'd be inclined you can use the project's code as-is elsewhere, i.e. in the browser.
Allows parsing and streaming without any special libraries, and allows an unlimited amount of data (as long as individual objects are reasonably sized).
These files usually get the .jsonlines suffix when stored on disk.
Allows for batch process without requiring huge amounts of memory.
Newline Delimited JSON
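A minimal NDJSON reader sketch using Node's readline (`handleRecord` is hypothetical), showing why memory stays bounded by one record at a time:

```ts
import { createReadStream } from 'node:fs';
import { createInterface } from 'node:readline';

const handleRecord = (record: unknown) => { /* batch-process, then drop */ };

const lines = createInterface({ input: createReadStream('data.jsonlines') });
for await (const line of lines) {
  if (!line.trim()) continue;       // tolerate blank lines
  handleRecord(JSON.parse(line));   // each line is a complete JSON document
}
```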
TIL
https://en.wikipedia.org/wiki/JSON_streaming
The title made me think of Star Trek DS9 and Nog talking about The Great Material Continuum.
“Nog: The river will provide”
I did something similar for streaming but built it with a streaming protocol at the frame level wrapping the JSON messages [1]. The streaming protocol has support for both the LF based scheme and the HTTP Content-Length header based scheme. It's for supporting MCP and LSP.
[1] https://github.com/williamw520/zigjr/?tab=readme-ov-file#str...
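For reference, a sketch of the Content-Length framing style mentioned (as used by LSP); this toy version works on strings and so assumes ASCII payloads, whereas real framing counts bytes:

```ts
// Each frame: "Content-Length: N\r\n\r\n" followed by N bytes of JSON.
function* deframe(buffer: string): Generator<unknown> {
  let pos = 0;
  for (;;) {
    const headerEnd = buffer.indexOf('\r\n\r\n', pos);
    if (headerEnd === -1) return; // incomplete header; wait for more input
    const match = /Content-Length:\s*(\d+)/i.exec(buffer.slice(pos, headerEnd));
    if (!match) throw new Error('missing Content-Length header');
    const length = Number(match[1]);
    const bodyStart = headerEnd + 4;
    if (buffer.length < bodyStart + length) return; // incomplete body
    yield JSON.parse(buffer.slice(bodyStart, bodyStart + length));
    pos = bodyStart + length;
  }
}
```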
Like yours, I'm sure, these incremental or online parser libraries are orders of magnitude faster[2] than alternatives for parsing LLM tool calls for the very simple reason that alternative approaches repeatedly parse the entire concatenated response, which requires buffering the entire payload, repeatedly allocating new objects, and for an N token response, you parse the first token N times! All of the "industry standard" approaches here are quadratic, which is going to scale quite poorly as LLMs generate larger and larger responses to meet application needs, and users want low latency outputs.
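To make the quadratic point concrete, here is the naive pattern (all names hypothetical): every chunk triggers a full re-parse of the whole accumulated buffer, so an N-chunk response does O(N²) work.

```ts
declare const tokenStream: AsyncIterable<string>;    // LLM output chunks
declare function repairTruncated(s: string): string; // close open braces/quotes
declare function render(snapshot: unknown): void;

let buffer = '';
for await (const chunk of tokenStream) {
  buffer += chunk;
  try {
    render(JSON.parse(repairTruncated(buffer))); // re-parses everything so far
  } catch {
    // not yet parseable; wait for more input
  }
}
```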
One of the most useful features of this approach is filtering LLM tool calls on the server and passing through a subset of the parse events to the client. This makes it relatively easy to put moderation, metadata capture, and other requirements in a single tool call, while still providing low latency streaming UI. It also avoids the problem with many moderation APIs where for cost or speed reasons, one might delegate to a smaller, cheaper model to generate output in a side-channel of the normal output stream. This not only doesn't scale, but it also means the more powerful model is unaware of these requirements, or you end up with a "flash of unapproved content" due to moderation delays, etc.
I found that it was extremely helpful to work at the level of parse events, but recognize that building partial values is also important, so I'm working on something similar in Rust[3], but taking a more holistic view and building more of an "AI SDK" akin to Vercel's, but written in Rust.
[1] https://github.com/aaronfriel/fn-stream
[2] https://github.com/vercel/ai/pull/1883
[3] https://github.com/aaronfriel/jsonmodem
(These are my own opinions, not those of my employer, etc. etc.)
Plus ça change, et plus c'est la même chose. ("The more things change, the more they stay the same.")
(The downside of JSON Merge Patch is it doesn't support concatenating string values, so you must send a value like `{"msg": "Hello World"}` as one message; you can't join `{"msg": "Hello"}` with `{"msg": " World"}`.)
[1] https://github.com/pierreinglebert/json-merge-patch
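A tiny RFC 7386-style merge sketch that shows the limitation: string values replace rather than concatenate.

```ts
// Recursive merge per RFC 7386: null deletes a key, objects merge,
// everything else (scalars, arrays) replaces wholesale.
function mergePatch(target: unknown, patch: unknown): unknown {
  if (patch === null || typeof patch !== 'object' || Array.isArray(patch)) {
    return patch;
  }
  const result: Record<string, unknown> =
    target && typeof target === 'object' && !Array.isArray(target)
      ? { ...(target as Record<string, unknown>) }
      : {};
  for (const [key, value] of Object.entries(patch)) {
    if (value === null) delete result[key];
    else result[key] = mergePatch(result[key], value);
  }
  return result;
}

mergePatch({ msg: 'Hello' }, { msg: ' World' }); // => { msg: ' World' }, not "Hello World"
```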
Roughly how does it compare with https://github.com/promplate/partial-json-parser-js ?
Concretely, it means I can call an LLM, wrap its output stream in a streaming string, and treat it like a regular string. No need for print loops, it’s all handled behind the scenes. I can chain transformations (joining strings, splitting them with regexes, capturing substrings, etc.) and serialize the results into JSON progressively, building lazy sequences or maps on the fly.
The benefit is that I can start processing and emitting structured data immediately, without waiting for the LLM’s full response. Filtered output can be shown to users as it arrives, with near-zero added latency (aside from regex lookaheads).
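In TypeScript terms (the original is presumably in another language), the equivalent plumbing is chained async generators; a hedged sketch of two such stages:

```ts
// Each stage emits as soon as it can rather than waiting for the full response.
async function* splitLines(chunks: AsyncIterable<string>): AsyncIterable<string> {
  let pending = '';
  for await (const chunk of chunks) {
    pending += chunk;
    const lines = pending.split('\n');
    pending = lines.pop()!; // keep the unterminated tail for the next chunk
    yield* lines;           // emit completed lines immediately
  }
  if (pending) yield pending;
}

async function* grep(re: RegExp, lines: AsyncIterable<string>): AsyncIterable<string> {
  for await (const line of lines) if (re.test(line)) yield line;
}

// Usage: for await (const hit of grep(/error/i, splitLines(llmStream))) show(hit);
```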
Install with `uv pip install jsonriver`
https://github.com/chrisschnabl/jsonriver-py
https://github.com/Qbix/Platform/blob/main/platform/classes/...
- trim off all trailing delimiters: },"
- then add on a fixed suffix: "]}
- then try parsing as a standard json. Ignore results if fails to parse.
This works since the schema I'm parsing had a fairly simple structure where everything of interest was at a specific depth in the hierarchy and values were all strings (sketched below).
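A sketch of that heuristic, tuned (as noted) to a schema where the interesting values are strings at a fixed depth:

```ts
function tryParsePrefix(prefix: string): unknown | undefined {
  const trimmed = prefix.replace(/[},"]*$/, ''); // 1. trim trailing delimiters
  const candidate = trimmed + '"]}';             // 2. append the fixed suffix
  try {
    return JSON.parse(candidate);                // 3. parse; ignore failures
  } catch {
    return undefined;
  }
}
```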
It's less about incrementally parsing objects, and more about picking paths and shapes out from a feed. If you're doing something like array/newline delimited json, it's a great tool for reading things out as they arrive. Also great for example for feed parsing.
We used the streaming parser to create an index of the file locally {json key: (byte offset, byte size)} and then simply used http range queries to access the data we needed.
Here is the full write up about it:
https://dinesh.cloud/2022/streaming-json-for-fun-and-profit/
And here is the open sourced code:
https://github.com/multiversal-ventures/json-buffet
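The core trick, sketched (the names and index shape here are assumptions based on the description, not the project's actual API):

```ts
type ByteExtent = { offset: number; size: number };
type Index = Record<string, ByteExtent>; // {json key: (byte offset, byte size)}

async function fetchValue(url: string, index: Index, key: string): Promise<unknown> {
  const { offset, size } = index[key];
  const res = await fetch(url, {
    // Ask the server for just this value's bytes.
    headers: { Range: `bytes=${offset}-${offset + size - 1}` },
  });
  return JSON.parse(await res.text());
}
```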
[1] https://github.com/chrchr/flojay
Given a schema and a JSON message prefix, parse the complete message but substitute missing field values with Promise objects. Likewise, represent lists as lazy sequences. Add a pubsub system.
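Sketched with modern JS promise plumbing (`Promise.withResolvers`, ES2024; all other names hypothetical), the consumer side of that design could look like:

```ts
// The parser owns a resolver per schema field and settles it once that
// field's bytes have fully arrived.
const { promise: name, resolve: resolveName } = Promise.withResolvers<string>();

// Consumers can await fields that haven't streamed in yet...
name.then((n) => console.log('name arrived:', n));

// ...while the parse loop resolves them as it completes each field.
function onFieldComplete(key: string, value: unknown): void {
  if (key === 'name') resolveName(value as string);
}
```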