vtst/streamy-json-parser

streamy-json-parser

A JSON parser that receives its input as a stream of strings

Overview

This library offers a lightweight, streaming JSON parser that processes input incrementally as it arrives. Rather than waiting for the entire JSON string, you can access parsed data progressively while the input is still being received.

Typical use cases include:

  • Parsing JSON output from slow or streaming sources (such as LLM APIs), allowing you to render or process data as soon as it becomes available.
  • Handling large JSON payloads efficiently by processing them piece by piece, reducing memory usage and latency.

Usage

Installation

To install the package, run:

npm install streamy-json-parser

To use the package in a module, use one of the following statements depending on the module system you are using:

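// ES modules: import the default export or the named exports.
import streamy_json_parser from 'streamy-json-parser';
import {Parser, parse, SyntaxError} from 'streamy-json-parser';
// CommonJS: require the whole module or destructure the named exports.
const streamy_json_parser = require('streamy-json-parser');
const {Parser, parse, SyntaxError} = require('streamy-json-parser');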

The Parser class

The Parser class is the primary way to interact with this library. Here’s a simple example demonstrating how to use it:

const parser = new Parser();
parser.push('{"foo": 1, "bar');  // The key "bar" is split across two chunks.
console.log(parser.getValue());  // Log the partially parsed object.
parser.push('": [2');
console.log(parser.getValue());  // Log the partially parsed object.
parser.push(', 3]}');
parser.close();
console.log(parser.getValue());  // Log the full parsed object.

The Parser class has the following constructor and methods:

  • Parser(options?): Create a new parser object with the specified options. (See the "Options" section below.)
  • .push(string): Add a string to the input stream and parse it.
  • .close(): Declare the input stream complete and finish parsing.
  • .reset(): Reset the parser to its initial state so that a new input string can be parsed.
  • .getValue(): Get the value which has been parsed so far. Warning! See "Modifying returned values" below if you intend to modify this value.
  • .takeEvents(): Retrieve the events generated by the parser (since the last call to takeEvents). (See the "Events" section below.)
  • .setPlaceholder(value): Set an initial object before parsing starts. The parsed value is built on top of this object, which is updated progressively.
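
The lifecycle formed by the methods listed above can be sketched as a small helper. This is only an illustration: feedChunks and onUpdate are hypothetical names that are not part of the library, and the sketch assumes that calling .reset() on a fresh parser is harmless.

```javascript
// Hypothetical helper: run one complete parse on a Parser-like object.
// `parser` is expected to expose the push, close, reset and getValue methods.
function feedChunks(parser, chunks, onUpdate) {
  parser.reset();                 // Start from a clean state so the parser can be reused.
  for (const chunk of chunks) {
    parser.push(chunk);           // Parse the next piece of input.
    onUpdate(parser.getValue());  // Partial value; treat it as read-only.
  }
  parser.close();                 // No more input: finish parsing.
  return parser.getValue();       // Complete value.
}
```

With the library installed, a call could look like feedChunks(new Parser(), ['{"foo": ', '1}'], console.log).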

The parse function

The parse function provides an alternative interface using JavaScript iterators.

const input = [
  '{"foo": 1, "bar',
  '": [2',
  ', 3]}'
];
const iterator = input[Symbol.iterator]();
for (const {root, done} of parse(iterator)) {
  console.log(root);
  if (done) console.log('Parsing is complete');
}

The parse function takes two arguments:

  • iterator: A string iterator, representing the input to be parsed,
  • options (optional): The options object (see below).

It returns an iterator whose items are objects with the following properties:

  • root: The parsed value (same as .getValue()),
  • done: true if parsing is complete,
  • events: The events produced in this iteration (if the option track_events is set).
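
Because parse takes an iterator of strings, a generator is a convenient way to produce one, for example to test how code behaves with arbitrary chunk boundaries. The sketch below is an illustration only; chunked is a hypothetical helper that is not part of the library.

```javascript
// Hypothetical helper: yield `text` in pieces of at most `size` characters.
function* chunked(text, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  for (let i = 0; i < text.length; i += size) {
    yield text.slice(i, i + size);
  }
}

console.log([...chunked('{"foo": [1, 2]}', 6)]);  // [ '{"foo"', ': [1, ', '2]}' ]
```

A generator object is itself an iterator, so, assuming parse accepts any iterator of strings as described above, it could be used as parse(chunked(text, 6)).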

Options

Parser options are set through a record object that may contain the following properties:

  • include_incomplete_strings (default: false): If true, incomplete strings are added to the output value at the end of each input chunk. If false, strings are added to the output value only when they are complete. This option can also be set to a string, in which case that string is appended as a suffix to incomplete string values (e.g. "...").
  • track_events: Track additional SAX-style events during parsing. See the "Events" section below.
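
As an illustration, here are a few option records built only from the properties described above. The variable names are arbitrary, and passing true for track_events is an assumption, since the accepted values are not spelled out here.

```javascript
// Strings appear in the output only once they are complete (the documented default).
const completeOnly = {include_incomplete_strings: false};

// Incomplete strings are included at the end of each input chunk.
const withPartialStrings = {include_incomplete_strings: true};

// Incomplete strings are included with "..." appended, and events are tracked.
// Assumption: track_events is enabled with `true`.
const withEllipsisAndEvents = {include_incomplete_strings: '...', track_events: true};
```

Any of these records would be passed as new Parser(options) or as the second argument of parse.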

Advanced usage

Events

In addition to the incrementally built value, the parser can produce SAX-style events during parsing. The following events are generated:

  • {type: "set", path: [...]}: A literal value is set,
  • {type: "begin", path: [...]}: Parsing of an object or array begins,
  • {type: "end", path: [...]}: Parsing of an object or array ends.

In these events, the .path property contains an array designating the path of the currently parsed node relative to the root value (as returned by .getValue()). Each item of .path is either a string (a property name of an object) or an integer (an index of an array). For instance, if .path is ["foo", 2, "bar"], the event refers to the node .getValue()["foo"][2]["bar"].
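
Resolving a path against the root value amounts to following its items one by one. The following is a minimal sketch; resolvePath is a hypothetical helper that is not part of the library, and the root object here is a plain literal standing in for the value returned by .getValue().

```javascript
// Hypothetical helper: follow an event's path from the root value.
// Returns undefined if the path leaves the parsed structure.
function resolvePath(root, path) {
  let node = root;
  for (const key of path) {
    if (node === null || typeof node !== 'object') return undefined;
    node = node[key];  // `key` is a property name (string) or an array index (integer).
  }
  return node;
}

const root = {foo: [0, 1, {bar: 'baz'}]};
console.log(resolvePath(root, ['foo', 2, 'bar']));  // prints "baz"
```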

Modifying returned values

For performance reasons, the parser returns references to its internal data structures when you call .getValue() during parsing. This means that if you mutate (modify in place) any part of the returned value, you are directly changing the parser's internal state. Doing so can lead to unexpected behavior or even cause the parser to fail.

If you need to modify the parsed data while parsing, consider the following approaches:

  • Make a deep copy of the value before mutating it, for example using structuredClone or a similar utility.
  • Use the event tracking feature (track_events option) to listen for set or end events, and only modify values after those events have been emitted for the relevant path.
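
The first approach can be sketched as follows. This is an illustration under stated assumptions: deepCopy is a hypothetical helper, and the partial object is a plain literal standing in for a value returned by .getValue() during parsing.

```javascript
// Deep-copy helper: use structuredClone where available, otherwise round-trip
// through JSON (sufficient here because the values hold only JSON data).
const deepCopy = typeof structuredClone === 'function'
  ? (value) => structuredClone(value)
  : (value) => JSON.parse(JSON.stringify(value));

// `partial` stands in for a value returned by parser.getValue() during parsing.
const partial = {foo: 1, bar: [2]};
const snapshot = deepCopy(partial);
snapshot.bar.push(99);            // Mutates the copy only.
console.log(partial.bar.length);  // prints 1
```

Copying the whole value after every chunk costs time proportional to its size, so for large inputs it may be preferable to copy only the sub-object that is about to be modified.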

In some cases, altering values can be useful for memory management, such as discarding sub-objects that are no longer needed. However, always ensure you do so safely, after the parser has finished processing those parts of the data.
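
As a sketch of this idea, the helper below hands each finished element of a top-level array to a callback and then drops it from the parsed value. consumeFinishedItems and onItem are hypothetical names, and the sketch assumes, without confirmation from the library, that the parser tolerates a finished element being replaced by null.

```javascript
// Hypothetical helper: process finished elements of a top-level array, then
// release them. `events` is a list of events as described in the "Events" section.
function consumeFinishedItems(root, events, onItem) {
  for (const event of events) {
    // A "set" or "end" event with a one-item path marks a finished top-level element.
    const finished = event.type === 'set' || event.type === 'end';
    if (finished && event.path.length === 1) {
      const index = event.path[0];
      onItem(root[index], index);
      root[index] = null;  // Assumption: keeping the slot leaves later indices unchanged.
    }
  }
}
```

With event tracking enabled, this could be called after each .push() as consumeFinishedItems(parser.getValue(), parser.takeEvents(), handleItem), where handleItem is the caller's own function.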
