38 changes: 31 additions & 7 deletions README.md
@@ -30,19 +30,37 @@ import spawnAsync from '@expo/spawn-async';
`spawnAsync` takes the same arguments as [`child_process.spawn`](https://nodejs.org/api/child_process.html#child_process_child_process_spawn_command_args_options). Its options are the same as those of `child_process.spawn` plus:

- `ignoreStdio`: whether to ignore waiting for the child process's stdio streams to close before resolving the result promise. When ignoring stdio, the returned values for `stdout` and `stderr` will be empty strings. The default value of this option is `false`.
- `maxBuffer`: the maximum bytes retained from `stdout` and `stderr` (independently). Output is collected with a sliding window. When set explicitly, exceeding it rejects the promise with an error whose `code` is `ERR_CHILD_PROCESS_STDIO_MAXBUFFER` and whose `stdout`/`stderr` carry the truncated tail. When omitted, the default is `buffer.constants.MAX_STRING_LENGTH` (~512 MiB).
- `maxBuffer`: the maximum bytes retained from `stdout` and `stderr` (independently). Output is collected with a sliding window. Exceeding the cap rejects the promise with `code: 'ERR_CHILD_PROCESS_STDIO_MAXBUFFER'`; the most recent bytes that fit are attached to the error's stdio fields. The default is `buffer.constants.MAX_STRING_LENGTH`. Under `encoding: 'buffer'` the caller may pass a larger value, up to `buffer.constants.MAX_LENGTH`. Passing a value larger than the encoding's hard limit throws `TypeError` synchronously.
- `encoding`: selects whether the child's output is exposed as decoded strings or raw bytes. The default `'utf8'` (and any other [`BufferEncoding`](https://nodejs.org/api/buffer.html#buffers-and-character-encodings) value such as `'latin1'` or `'hex'`) decodes the child's output into a string. The value `'buffer'` keeps the output as raw `Uint8Array` instead. The `stdout` / `stderr` / `output` field names are the same in either mode; only their type changes.
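
The sliding-window retention described for `maxBuffer` can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: `createTailCollector` and its `push`/`tail`/`overflowed` members are hypothetical names introduced here, and the real code may track overflow differently.

```js
// Hypothetical helper (an assumption for illustration): keep only the most
// recent `maxBuffer` bytes across all chunks pushed so far.
function createTailCollector(maxBuffer) {
  const chunks = [];
  let retained = 0; // bytes currently held in `chunks`
  let overflowed = false;

  return {
    push(chunk) {
      chunks.push(chunk);
      retained += chunk.length;
      // Drop or trim the oldest chunks until the window fits again.
      while (retained > maxBuffer) {
        overflowed = true;
        const excess = retained - maxBuffer;
        const oldest = chunks[0];
        if (oldest.length <= excess) {
          chunks.shift();
          retained -= oldest.length;
        } else {
          // subarray() is a view, so trimming does not copy the bytes.
          chunks[0] = oldest.subarray(excess);
          retained -= excess;
        }
      }
    },
    get overflowed() {
      return overflowed;
    },
    tail() {
      return Buffer.concat(chunks, retained);
    },
  };
}

const collector = createTailCollector(100);
collector.push(Buffer.from('a'.repeat(100)));
collector.push(Buffer.from('b'.repeat(50)));
console.log(collector.overflowed); // true
console.log(collector.tail().toString()); // 50 "a"s followed by 50 "b"s
```

With a cap of 100 bytes and 150 bytes written, the retained tail is the last 100 bytes, which is the shape of the truncated output attached to the rejection error.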

It returns a promise whose result is an object with these properties:

- `pid`: the process ID of the spawned child process
- `output`: an array with stdout and stderr's output
- `stdout`: a string of what the child process wrote to stdout
- `stderr`: a string of what the child process wrote to stderr
- `stdout`: what the child process wrote to stdout — a `string` for text encodings, or a `Uint8Array` under `encoding: 'buffer'`
- `stderr`: same shape as `stdout`, but for stderr
- `output`: `[stdout, stderr]`
- `status`: the exit code of the child process
- `signal`: the signal (ex: `SIGTERM`) used to stop the child process if it did not exit on its own

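The `Uint8Array` returned by `'buffer'` mode is a Node `Buffer` at runtime. Callers who need `Buffer`-specific methods can wrap it without copying via `Buffer.from(result.stdout.buffer, result.stdout.byteOffset, result.stdout.byteLength)`, which creates a view of the same memory. Note that `Buffer.from(result.stdout)` copies the bytes instead.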
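
For callers deciding how to wrap the bytes, here is a small sketch of the difference between copying and viewing in Node. A plain `Uint8Array` stands in for `result.stdout` here, which is an assumption made so the snippet runs without spawning a process.

```js
// Stand-in for `result.stdout` under encoding: 'buffer' (an assumption).
const bytes = new Uint8Array([1, 2, 3, 4]);

// Buffer.from(typedArray) copies the contents into new memory.
const copy = Buffer.from(bytes);

// Buffer.from(arrayBuffer, byteOffset, length) creates a view over the same
// memory, so no bytes are copied.
const view = Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);

bytes[0] = 0xff;
console.log(copy[0]); // 1
console.log(view[0]); // 255
console.log(view.toString('hex')); // ff020304
```

The view form matters mainly for large outputs, where a second copy would double memory use.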

If there's an error running the child process or it exits with a non-zero status code, `spawnAsync` rejects the returned promise. The Error object also has the properties listed above.

### Reading binary output

```js
import fs from 'node:fs';
import spawnAsync from '@expo/spawn-async';

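// Start the child without awaiting so stdin can be written first.
const task = spawnAsync(
  'pandoc',
  ['--from=html', '--to=docx'],
  { encoding: 'buffer' }
);
// `child` lives on the returned promise, not on the resolved result.
task.child.stdin.end('<h1>Hello, world</h1>');
const result = await task;
fs.writeFileSync('out.docx', result.stdout); // Uint8Array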
```

### Accessing the child process

Sometimes you may want to access the child process object, for example to attach event handlers to `stdout` or `stderr` and process data as it becomes available instead of waiting for the process to exit.
@@ -70,8 +88,14 @@ Here is an example:

### `maxBuffer`

`maxBuffer` is a later addition to the API. Set it when child output could exhaust memory and crash the parent process, or when the command or arguments are influenced by untrusted input — an attacker can otherwise force unbounded output to crash the parent.
Set `maxBuffer` when child output could exhaust memory and crash the parent process, or when the command or arguments are influenced by untrusted input; an attacker can otherwise force unbounded output to crash the parent.

The default `maxBuffer` is `buffer.constants.MAX_STRING_LENGTH`. For text encodings that is also the runtime hard limit (the longest string `Buffer.toString()` can build without crashing). Under `encoding: 'buffer'` the runtime allows up to `buffer.constants.MAX_LENGTH`, but the default stays at `MAX_STRING_LENGTH` for consistency; callers that need more output pass a larger `maxBuffer` explicitly. At either size the materialized output can still exhaust process memory.

Exceeding the cap rejects the promise with `ERR_CHILD_PROCESS_STDIO_MAXBUFFER`, regardless of whether the cap was explicit or the default. The most recent bytes that fit are attached to `error.stdout` and `error.stderr`.

Passing `maxBuffer` larger than the encoding's hard limit throws `TypeError` synchronously at the call site.
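
The per-encoding limit check can be sketched along these lines. The function name `resolveMaxBuffer` and the exact message wording are assumptions for illustration; only the `buffer.constants` values and the `TypeError` behavior come from the description above.

```js
const { constants } = require('node:buffer');

// Hypothetical validator (name and messages are assumptions, not the
// library's code). Returns the effective cap or throws synchronously.
function resolveMaxBuffer(maxBuffer, encoding = 'utf8') {
  const isBytes = encoding === 'buffer';
  // Text output must fit in one string; byte output in one Buffer.
  const hardLimit = isBytes ? constants.MAX_LENGTH : constants.MAX_STRING_LENGTH;
  if (maxBuffer === undefined) {
    // The default stays at MAX_STRING_LENGTH in both modes.
    return constants.MAX_STRING_LENGTH;
  }
  if (maxBuffer > hardLimit) {
    const kind = isBytes ? 'byte array' : 'string';
    throw new TypeError(
      `maxBuffer (${maxBuffer}) exceeds the maximum ${kind} length (${hardLimit})`
    );
  }
  return maxBuffer;
}

console.log(resolveMaxBuffer(1024)); // 1024
try {
  resolveMaxBuffer(constants.MAX_STRING_LENGTH + 1);
} catch (error) {
  console.log(error instanceof TypeError); // true
}
```

Validating before the child is spawned means a misconfigured cap fails at the call site rather than surfacing later as a rejected promise.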

The default of `buffer.constants.MAX_STRING_LENGTH` (~512 MiB) is a crash-safe floor, not a memory bound: at that size the materialized string itself can still exhaust process memory.
### Memory profile

When `maxBuffer` is set explicitly, exceeding it rejects the promise immediately with `ERR_CHILD_PROCESS_STDIO_MAXBUFFER`. When left at the default, exceeding it doesn't reject; the sliding-window tail is still readable, but reading `stdout`/`stderr` throws `ERR_CHILD_PROCESS_STDIO_MAXBUFFER` with the truncated tail attached.
During the child's lifetime the per-chunk buffers Node delivers are retained as they arrive. When the process exits the chunks are concatenated once and decoded if a text encoding was requested; at that moment peak memory is briefly ~2× the output size (the chunk array plus the concatenated result). The chunk references are dropped immediately afterward, so steady-state memory is ~1× the output: either the decoded string (text encodings) or the `Uint8Array` (`encoding: 'buffer'`).
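
The concatenate-once-then-decode step might look roughly like the sketch below. `finalizeOutput` is a hypothetical name, and the library's actual bookkeeping may differ; the point is that the chunks are joined a single time and then released.

```js
// Hypothetical finalizer (an assumption for illustration, not the library's
// code): join the retained chunks once, release them, then decode if needed.
function finalizeOutput(chunks, encoding) {
  // One concatenation: peak memory is briefly the chunks plus the joined copy.
  const joined = Buffer.concat(chunks);
  // Clear the array so the per-chunk buffers can be garbage collected.
  chunks.length = 0;
  return encoding === 'buffer' ? joined : joined.toString(encoding);
}

// A multi-byte character split across two chunks still decodes correctly,
// because decoding happens after concatenation.
const euro = Buffer.from('€'); // bytes e2 82 ac
const text = finalizeOutput([euro.subarray(0, 1), euro.subarray(1)], 'utf8');
console.log(text === '€'); // true

const hex = finalizeOutput(
  [Buffer.from([0xde, 0xad]), Buffer.from([0xbe, 0xef])],
  'hex'
);
console.log(hex); // deadbeef
```

Decoding each chunk separately would avoid the temporary second copy but could corrupt characters that straddle chunk boundaries, which is one reason to decode only after joining.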
220 changes: 193 additions & 27 deletions src/__tests__/spawnAsync-test.ts
@@ -1,8 +1,14 @@
import assert from 'assert';
import { constants as bufferConstants } from 'buffer';
import path from 'path';

import spawnAsync, { SpawnOptions, SpawnPromise, SpawnResult } from '../spawnAsync';

// Captured at file load, before any `jest.doMock('buffer', …)` in later tests
// can swap the constants out from under us.
const REAL_MAX_STRING_LENGTH = bufferConstants.MAX_STRING_LENGTH;
const REAL_MAX_LENGTH = bufferConstants.MAX_LENGTH;

it(`receives output from completed processes`, async () => {
let result = await spawnAsync('echo', ['hi']);
expect(typeof result.pid).toBe('number');
@@ -238,50 +244,72 @@ it(`listens on 'close' (not 'exit') when stdio is piped`, async () => {
await task;
});

describe(`default-cap (lazy) maxBuffer path`, () => {
// The lazy path only triggers against MAX_STRING_LENGTH (~512 MiB), which is
// impractical to generate. Mock the constant so the same code path activates
// at a testable size.
function spawnAsyncWithCap(cap: number) {
describe(`default-cap maxBuffer path`, () => {
// The default text cap is MAX_STRING_LENGTH (~512 MiB), which is impractical
// to generate. Mock the constant so the same code path activates at a
// testable size.
function spawnAsyncWithCap(
cap: number,
encoding?: spawnAsync.SpawnOptions['encoding']
) {
let task: any;
jest.isolateModules(() => {
jest.doMock('buffer', () => {
const actual = jest.requireActual<typeof import('buffer')>('buffer');
return {
...actual,
constants: { ...actual.constants, MAX_STRING_LENGTH: cap },
constants: {
...actual.constants,
MAX_STRING_LENGTH: cap,
MAX_LENGTH: cap,
},
};
});
const localSpawnAsync = require('../spawnAsync');
task = localSpawnAsync(
process.execPath,
['-e', 'process.stdout.write("a".repeat(100), () => process.stdout.write("b".repeat(50)));']
['-e', 'process.stdout.write("a".repeat(100), () => process.stdout.write("b".repeat(50)));'],
encoding ? { encoding } : undefined
);
});
return task as Promise<SpawnResult> & { child: any };
return task;
}

it(`resolves the promise without rejecting`, async () => {
const result = await spawnAsyncWithCap(100);
expect(result.status).toBe(0);
expect(result.signal).toBe(null);
});

it(`throws ERR_CHILD_PROCESS_STDIO_MAXBUFFER on stdout access with the truncated tail`, async () => {
const result = await spawnAsyncWithCap(100);
let error: any;
try { void result.stdout; } catch (e) { error = e; }
expect(error).toMatchObject({
code: 'ERR_CHILD_PROCESS_STDIO_MAXBUFFER',
stdout: 'a'.repeat(50) + 'b'.repeat(50),
stderr: '',
});
it(`rejects suggesting the string-length limit when text output exceeds it`, async () => {
let caught: any;
try {
await spawnAsyncWithCap(100);
} catch (e) {
caught = e;
}
expect(caught).toBeDefined();
expect(caught.code).toBe('ERR_CHILD_PROCESS_STDIO_MAXBUFFER');
expect(caught.message).toMatch(/the maximum string length \(100 bytes, buffer\.constants\.MAX_STRING_LENGTH\)/);
// Suggests the only real remedy: switching to bytes.
expect(caught.message).toMatch(/encoding: "buffer"/);
// The sliding-window tail is still attached so callers can inspect it.
expect(caught.stdout).toBe('a'.repeat(50) + 'b'.repeat(50));
expect(caught.stderr).toBe('');
expect(caught.status).toBe(0);
});

it(`only throws on the overflowed stream; the other reads normally`, async () => {
const result = await spawnAsyncWithCap(100);
expect(result.stderr).toBe('');
expect(() => result.stdout).toThrow();
it(`rejects suggesting a larger maxBuffer when bytes output exceeds the default cap`, async () => {
let caught: any;
try {
await spawnAsyncWithCap(100, 'buffer');
} catch (e) {
caught = e;
}
expect(caught).toBeDefined();
expect(caught.code).toBe('ERR_CHILD_PROCESS_STDIO_MAXBUFFER');
expect(caught.message).toMatch(/exceeded the default maxBuffer of 100 bytes/);
// The default cap in buffer mode is not the runtime ceiling, so the
// message points the caller at a larger maxBuffer.
expect(caught.message).toMatch(/Pass maxBuffer to capture more output/);
expect(caught.message).toMatch(/buffer\.constants\.MAX_LENGTH/);
expect(Buffer.from(caught.stdout).toString()).toBe(
'a'.repeat(50) + 'b'.repeat(50)
);
});
});

@@ -291,3 +319,141 @@ it(`exports TypeScript types`, async () => {
let result: SpawnResult = await promise;
expect(typeof result.pid).toBe('number');
});

describe(`encoding: 'buffer'`, () => {
it(`returns stdout/stderr as Uint8Array`, async () => {
const expected = Buffer.from([0, 1, 2, 0xff, 0x80, 0x7f]);
const result = await spawnAsync(
process.execPath,
[
'-e',
`process.stdout.write(Buffer.from(${JSON.stringify(Array.from(expected))}));`,
],
{ encoding: 'buffer' }
);
expect(result.stdout).toBeInstanceOf(Uint8Array);
expect(result.stdout.byteLength).toBe(expected.byteLength);
expect(Buffer.from(result.stdout).equals(expected)).toBe(true);
expect(result.stderr.byteLength).toBe(0);
});

it(`survives a byte sequence that is not valid UTF-8`, async () => {
// 0xC0 is never a valid byte in UTF-8 (it is an invalid lead byte, not a
// continuation byte), so decoding this sequence as UTF-8 would replace it with
// U+FFFD and lose information. With encoding: 'buffer' we get the exact bytes
// back.
const bytes = Buffer.from([0xc0, 0x00, 0xc1, 0xff]);
const result = await spawnAsync(
process.execPath,
[
'-e',
`process.stdout.write(Buffer.from(${JSON.stringify(Array.from(bytes))}));`,
],
{ encoding: 'buffer' }
);
expect(Buffer.from(result.stdout).equals(bytes)).toBe(true);
});

it(`populates output as [stdout, stderr] of Uint8Array`, async () => {
const result = await spawnAsync(
process.execPath,
['-e', 'process.stdout.write("ok"); process.stderr.write("warn");'],
{ encoding: 'buffer' }
);
expect(result.output).toHaveLength(2);
expect(result.output[0]).toBe(result.stdout);
expect(result.output[1]).toBe(result.stderr);
expect(result.output[0]).toBeInstanceOf(Uint8Array);
});

it(`attaches bytes to the error on non-zero exit, like string stdout`, async () => {
const expected = Buffer.from([0x10, 0x20, 0x30]);
let caught: any;
try {
await spawnAsync(
process.execPath,
[
'-e',
`process.stdout.write(Buffer.from(${JSON.stringify(Array.from(expected))})); process.exit(7);`,
],
{ encoding: 'buffer' }
);
} catch (e) {
caught = e;
}
expect(caught).toBeDefined();
expect(caught.status).toBe(7);
expect(Buffer.from(caught.stdout).equals(expected)).toBe(true);
});

it(`enforces maxBuffer with encoding: 'buffer'`, async () => {
await expect(
spawnAsync(
process.execPath,
['-e', 'process.stdout.write(Buffer.alloc(1000, 0xab));'],
{ encoding: 'buffer', maxBuffer: 100 }
)
).rejects.toMatchObject({
code: 'ERR_CHILD_PROCESS_STDIO_MAXBUFFER',
});
});
});

describe(`encoding: text encodings other than utf8`, () => {
it(`decodes stdout with the requested encoding`, async () => {
// Latin-1 maps 0x00–0xff one-to-one to U+0000–U+00FF. Bytes that are never
// valid in UTF-8 (e.g. 0xC0, 0xC1, 0xFF) decode cleanly here.
const bytes = Buffer.from([0xc0, 0xc1, 0xff]);
const result = await spawnAsync(
process.execPath,
[
'-e',
`process.stdout.write(Buffer.from(${JSON.stringify(Array.from(bytes))}));`,
],
{ encoding: 'latin1' }
);
expect(result.stdout).toBe(bytes.toString('latin1'));
expect(result.stdout).toBe('ÀÁÿ');
});

it(`decodes stdout with hex encoding`, async () => {
const result = await spawnAsync(
process.execPath,
['-e', 'process.stdout.write(Buffer.from([0xde, 0xad, 0xbe, 0xef]));'],
{ encoding: 'hex' }
);
expect(result.stdout).toBe('deadbeef');
});
});

describe(`maxBuffer validation`, () => {
it(`throws TypeError synchronously when maxBuffer exceeds MAX_STRING_LENGTH in text mode`, () => {
expect(() =>
spawnAsync('echo', ['hi'], { maxBuffer: REAL_MAX_STRING_LENGTH + 1 })
).toThrow(TypeError);
expect(() =>
spawnAsync('echo', ['hi'], { maxBuffer: REAL_MAX_STRING_LENGTH + 1 })
).toThrow(/exceeds the maximum string length/);
});

it(`throws TypeError synchronously when maxBuffer exceeds MAX_LENGTH in buffer mode`, () => {
if (REAL_MAX_LENGTH >= Number.MAX_SAFE_INTEGER) {
// On runtimes where MAX_LENGTH already equals MAX_SAFE_INTEGER, no larger
// value can be passed as a safe integer, so this case is unreachable.
// Recent Node sets MAX_LENGTH this high.
return;
}
expect(() =>
spawnAsync('echo', ['hi'], {
encoding: 'buffer',
maxBuffer: REAL_MAX_LENGTH + 1,
})
).toThrow(/exceeds the maximum byte array length/);
});

it(`accepts maxBuffer exactly equal to MAX_STRING_LENGTH`, async () => {
const result = await spawnAsync('echo', ['hi'], {
maxBuffer: REAL_MAX_STRING_LENGTH,
});
expect(result.stdout).toBe('hi\n');
});
});