[v8.x backport] encoding: rudimentary TextDecoder support w/o ICU #14786

Closed
7 changes: 7 additions & 0 deletions doc/api/errors.md
@@ -712,6 +712,12 @@ only used in the [WHATWG URL API][] for strict compliance with the specification
native Node.js APIs, `func(undefined)` and `func()` are treated identically, and
the [`ERR_INVALID_ARG_TYPE`][] error code may be used instead.

<a id="ERR_NO_ICU"></a>
### ERR_NO_ICU

Used when an attempt is made to use features that require [ICU][], while
Node.js is not compiled with ICU support.
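
A minimal sketch of how calling code might react to this error code. It assumes that requesting the `fatal` option on a build without ICU is what raises `ERR_NO_ICU` (the diff only states that the option needs ICU), and `createDecoder` is a hypothetical helper name, not part of Node.js.

```js
const { TextDecoder } = require('util');

// `process.versions.icu` is only defined when Node.js was built with ICU.
const hasICU = typeof process.versions.icu === 'string';

// Hypothetical helper: ask for a fatal decoder and fall back to a lenient
// one when the build reports ERR_NO_ICU.
function createDecoder(encoding) {
  try {
    return new TextDecoder(encoding, { fatal: true });
  } catch (err) {
    // Anything other than the missing-ICU case is a real error.
    if (err.code !== 'ERR_NO_ICU') throw err;
    return new TextDecoder(encoding);
  }
}

const decoder = createDecoder('utf-8');
console.log(`ICU: ${hasICU}, fatal: ${decoder.fatal}`);
```

Checking `err.code` rather than the message keeps the fallback stable if the error text changes.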

<a id="ERR_SOCKET_ALREADY_BOUND"></a>
### ERR_SOCKET_ALREADY_BOUND
Used when an attempt is made to bind a socket that has already been bound.
@@ -795,6 +801,7 @@ are most likely an indication of a bug within Node.js itself.
[`new URLSearchParams(iterable)`]: url.html#url_constructor_new_urlsearchparams_iterable
[`process.on('uncaughtException')`]: process.html#process_event_uncaughtexception
[`process.send()`]: process.html#process_process_send_message_sendhandle_options_callback
[ICU]: intl.html#intl_internationalization_support
[Node.js Error Codes]: #nodejs-error-codes
[V8's stack trace API]: https://github.com/v8/v8/wiki/Stack-Trace-API
[WHATWG URL API]: url.html#url_the_whatwg_url_api
2 changes: 1 addition & 1 deletion doc/api/intl.md
@@ -52,7 +52,7 @@ option:
| [WHATWG URL Parser][] | partial (no IDN support) | full | full | full
| [`require('buffer').transcode()`][] | none (function does not exist) | full | full | full
| [REPL][] | partial (inaccurate line editing) | full | full | full
| [`require('util').TextDecoder`][] | none (class does not exist) | partial/full (depends on OS) | partial (Unicode-only) | full
| [`require('util').TextDecoder`][] | partial (basic encodings support) | partial/full (depends on OS) | partial (Unicode-only) | full

*Note*: The "(not locale-aware)" designation denotes that the function carries
out its operation just like the non-`Locale` version of the function, if one
61 changes: 39 additions & 22 deletions doc/api/util.md
@@ -544,7 +544,7 @@ added: v8.0.0
A Symbol that can be used to declare custom promisified variants of functions,
see [Custom promisified functions][].

### Class: util.TextDecoder
## Class: util.TextDecoder
<!-- YAML
added: v8.3.0
-->
@@ -563,23 +563,33 @@ while (buffer = getNextChunkSomehow()) {
string += decoder.decode(); // end-of-stream
```

#### WHATWG Supported Encodings
### WHATWG Supported Encodings

Per the [WHATWG Encoding Standard][], the encodings supported by the
`TextDecoder` API are outlined in the tables below. For each encoding,
one or more aliases may be used. Support for some encodings is enabled
only when Node.js is using the full ICU data (see [Internationalization][]).
`util.TextDecoder` is `undefined` when ICU is not enabled during build.
one or more aliases may be used.

##### Encodings Supported By Default
Different Node.js build configurations support different sets of encodings.
While a very basic set of encodings is supported even on Node.js builds without
ICU enabled, support for some encodings is provided only when Node.js is built
with ICU and using the full ICU data (see [Internationalization][]).

#### Encodings Supported Without ICU

| Encoding | Aliases |
| ----------- | --------------------------------- |
| `'utf8'` | `'unicode-1-1-utf-8'`, `'utf-8'` |
| `'utf-16be'`| |
| `'utf-8'` | `'unicode-1-1-utf-8'`, `'utf8'` |
| `'utf-16le'`| `'utf-16'` |

##### Encodings Requiring Full-ICU
#### Encodings Supported by Default (With ICU)

| Encoding | Aliases |
| ----------- | --------------------------------- |
| `'utf-8'` | `'unicode-1-1-utf-8'`, `'utf8'` |
| `'utf-16le'`| `'utf-16'` |
| `'utf-16be'`| |
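
As a rough illustration of the tables above, the sketch below resolves two aliases and decodes a short UTF-16LE sequence. The `'utf-16be'` attempt is wrapped in `try`/`catch` on the assumption that builds without ICU reject that label when the decoder is constructed.

```js
const { TextDecoder } = require('util');

// Aliases resolve to the canonical encoding name.
const utf8 = new TextDecoder('utf8');
const utf16 = new TextDecoder('utf-16');
console.log(utf8.encoding);  // 'utf-8'
console.log(utf16.encoding); // 'utf-16le'

// 'h' and 'i' as UTF-16LE code units (low byte first).
const text = utf16.decode(Uint8Array.of(0x68, 0x00, 0x69, 0x00));
console.log(text); // 'hi'

// 'utf-16be' is only listed for builds with ICU, so guard it.
let bigEndian = null;
try {
  const be = new TextDecoder('utf-16be');
  bigEndian = be.decode(Uint8Array.of(0x00, 0x68, 0x00, 0x69));
} catch (err) {
  // Assumed behavior without ICU: the label is rejected, leave null.
  bigEndian = null;
}
console.log(bigEndian);
```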

#### Encodings Requiring Full ICU Data

| Encoding | Aliases |
| ----------------- | -------------------------------- |
@@ -621,13 +631,14 @@ only when Node.js is using the full ICU data (see [Internationalization][]).
*Note*: The `'iso-8859-16'` encoding listed in the [WHATWG Encoding Standard][]
is not supported.

#### new TextDecoder([encoding[, options]])
### new TextDecoder([encoding[, options]])

* `encoding` {string} Identifies the `encoding` that this `TextDecoder` instance
supports. Defaults to `'utf-8'`.
* `options` {Object}
* `fatal` {boolean} `true` if decoding failures are fatal. Defaults to
`false`.
`false`. This option is only supported when ICU is enabled (see
[Internationalization][]).
* `ignoreBOM` {boolean} When `true`, the `TextDecoder` will include the byte
order mark in the decoded result. When `false`, the byte order mark will
be removed from the output. This option is only used when `encoding` is
@@ -636,7 +647,7 @@ is not supported.
Creates a new `TextDecoder` instance. The `encoding` may specify one of the
supported encodings or an alias.
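
The effect of `ignoreBOM` can be sketched as follows, using a hand-built UTF-8 byte sequence. The sample bytes are illustrative only.

```js
const { TextDecoder } = require('util');

// UTF-8 byte order mark (EF BB BF) followed by "hi".
const bytes = Uint8Array.of(0xef, 0xbb, 0xbf, 0x68, 0x69);

// Default (ignoreBOM: false): the byte order mark is removed.
const stripped = new TextDecoder('utf-8').decode(bytes);

// ignoreBOM: true keeps the mark as U+FEFF at the start of the result.
const kept = new TextDecoder('utf-8', { ignoreBOM: true }).decode(bytes);

console.log(stripped.length);               // 2
console.log(kept.length);                   // 3
console.log(kept.charCodeAt(0) === 0xfeff); // true
```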

#### textDecoder.decode([input[, options]])
### textDecoder.decode([input[, options]])

* `input` {ArrayBuffer|DataView|TypedArray} An `ArrayBuffer`, `DataView` or
Typed Array instance containing the encoded data.
@@ -652,49 +663,55 @@ internally and emitted after the next call to `textDecoder.decode()`.
If `textDecoder.fatal` is `true`, decoding errors that occur will result in a
`TypeError` being thrown.
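
A sketch of both behaviors described here: buffering an incomplete sequence across calls, and throwing on invalid input. The `stream` option comes from the WHATWG `TextDecoder` API and is not visible in this hunk. The `fatal` part only runs when `process.versions.icu` is defined, since the option is documented as requiring ICU.

```js
const { TextDecoder } = require('util');

// U+20AC (euro sign) is three bytes in UTF-8: E2 82 AC.
const decoder = new TextDecoder('utf-8');

// The first chunk ends mid-character, so the partial bytes are held back.
const first = decoder.decode(Uint8Array.of(0xe2, 0x82), { stream: true });
// The second chunk completes the character.
const second = decoder.decode(Uint8Array.of(0xac), { stream: true });
// A call without input marks end-of-stream and flushes anything left.
const tail = decoder.decode();
console.log(first + second + tail); // '€'

// With fatal: true, invalid input throws instead of being replaced.
let fatalError = null;
if (typeof process.versions.icu === 'string') {
  try {
    // 0xFF can never appear in well-formed UTF-8.
    new TextDecoder('utf-8', { fatal: true }).decode(Uint8Array.of(0xff));
  } catch (err) {
    fatalError = err;
  }
}
console.log(fatalError instanceof TypeError);
```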

#### textDecoder.encoding
### textDecoder.encoding

* Value: {string}
* {string}

The encoding supported by the `TextDecoder` instance.

#### textDecoder.fatal
### textDecoder.fatal

* Value: {boolean}
* {boolean}

The value will be `true` if decoding errors result in a `TypeError` being
thrown.

#### textDecoder.ignoreBOM
### textDecoder.ignoreBOM

* Value: {boolean}
* {boolean}

The value will be `true` if the decoding result will include the byte order
mark.
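
For illustration, the three properties simply reflect what was passed to the constructor. The variable name below is arbitrary.

```js
const { TextDecoder } = require('util');

// `fatal` is left at its default so this also works without ICU.
const lenient = new TextDecoder('utf-16le', { ignoreBOM: true });

console.log(lenient.encoding);  // 'utf-16le'
console.log(lenient.fatal);     // false
console.log(lenient.ignoreBOM); // true
```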

### Class: util.TextEncoder
## Class: util.TextEncoder
<!-- YAML
added: v8.3.0
-->

> Stability: 1 - Experimental

An implementation of the [WHATWG Encoding Standard][] `TextEncoder` API. All
instances of `TextEncoder` only support `UTF-8` encoding.
instances of `TextEncoder` only support UTF-8 encoding.

```js
const encoder = new TextEncoder();
const uint8array = encoder.encode('this is some data');
```

#### textEncoder.encode([input])
### textEncoder.encode([input])

* `input` {string} The text to encode. Defaults to an empty string.
* Returns: {Uint8Array}

UTF-8 Encodes the `input` string and returns a `Uint8Array` containing the
UTF-8 encodes the `input` string and returns a `Uint8Array` containing the
encoded bytes.

### textEncoder.encoding

* {string}

The encoding supported by the `TextEncoder` instance. Always set to `'utf-8'`.
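
A short sketch of the encoder side, showing the raw bytes produced for a non-ASCII character and a round trip through `TextDecoder`. The sample string is arbitrary.

```js
const { TextEncoder, TextDecoder } = require('util');

const encoder = new TextEncoder();

// U+20AC (euro sign) encodes to three bytes in UTF-8.
const bytes = encoder.encode('€');
console.log(Array.from(bytes)); // [ 226, 130, 172 ]

// With no argument, the input defaults to an empty string.
console.log(encoder.encode().length); // 0

// The encoder only ever produces UTF-8.
console.log(encoder.encoding); // 'utf-8'

// Decoding the bytes again yields the original text.
const roundTrip = new TextDecoder().decode(bytes);
console.log(roundTrip === '€'); // true
```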

## Deprecated APIs

The following APIs have been deprecated and should no longer be used. Existing