#Dictionaries
Pass an array of DictionarySource objects in the dictionaries option.
Dictionaries are probed in order; the first match wins.
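
The "first match wins" rule can be pictured with a toy lookup. This is only a sketch of the ordering semantics, not the library's implementation: `ToyDictionary`, `lookupFirstMatch`, and the sample entries are hypothetical names and data chosen for illustration, and plain in-memory maps stand in for real dictionary files.

```typescript
// Hypothetical stand-in for a dictionary: hanja word -> hangul reading.
type ToyDictionary = ReadonlyMap<string, string>;

// Probe each dictionary in array order and return the first hit.
function lookupFirstMatch(
  dictionaries: readonly ToyDictionary[],
  word: string,
): string | undefined {
  for (const dict of dictionaries) {
    const reading = dict.get(word);
    if (reading !== undefined) return reading; // earlier entries take precedence
  }
  return undefined; // no dictionary knows this word
}

// A small domain dictionary placed ahead of a general one.
const legal: ToyDictionary = new Map([["契約", "계약"]]);
const general: ToyDictionary = new Map([
  ["契約", "계약 (general)"],
  ["學校", "학교"],
]);

console.log(lookupFirstMatch([legal, general], "契約")); // "계약"
console.log(lookupFirstMatch([legal, general], "學校")); // "학교"
console.log(lookupFirstMatch([legal, general], "未知")); // undefined
```

Swapping the array order to `[general, legal]` would make the general entry win for 契約, which is why domain-specific dictionaries belong at the front of the list.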
#Standard Korean Dictionary packages
@gukhanmun/stdict-fst and @gukhanmun/stdict-cdb both ship the bundled
Standard Korean Dictionary (標準國語大辭典) in different binary formats.

    import { load } from "@gukhanmun/wasm";
    import { stdictFst } from "@gukhanmun/stdict-fst";
    import { stdictCdb } from "@gukhanmun/stdict-cdb";

    // FST: preferred; smaller on disk, better for lattice segmentation
    const gFst = await load({ dictionaries: [await stdictFst()] });

    // CDB: O(1) lookup; simpler layout
    const gCdb = await load({ dictionaries: [await stdictCdb()] });

Both stdictFst() and stdictCdb() return a Promise<FileDictionarySource>, so await them before passing the result to load(). Unlike the Rust ko-kr preset, the JavaScript packages never include a bundled dictionary: when dictionaries is omitted or empty, only the fallback Unihan character map is used.
#Custom dictionaries
A FileDictionarySource specifies the binary format and the data to load from:

    interface FileDictionarySource {
      format: "fst" | "cdb";
      data: ArrayBuffer | ArrayBufferView | URL | string;
    }

The data field accepts:
| Type | Where | Notes |
|---|---|---|
| ArrayBuffer/ArrayBufferView | All environments | Bytes already in memory |
| URL | All environments | Remote or local URL; loaded via fetch() |
| string | Node.js, Deno, Bun | Filesystem path |
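
To make the three cases in the table concrete, here is a minimal sketch of how a `data` value could be told apart at runtime. `DataKind` and `classifyData` are hypothetical names used only for illustration; the library's actual dispatch logic is not shown in this document and may differ.

```typescript
// Hypothetical labels for the three rows of the table above.
type DataKind = "bytes" | "url" | "path";

function classifyData(
  data: ArrayBuffer | ArrayBufferView | URL | string,
): DataKind {
  // A plain string is treated as a filesystem path.
  if (typeof data === "string") return "path";
  // URL objects cover both remote and file: locations.
  if (data instanceof URL) return "url";
  // ArrayBuffer.isView matches typed arrays and DataView.
  if (data instanceof ArrayBuffer || ArrayBuffer.isView(data)) return "bytes";
  throw new TypeError("unsupported data value");
}

console.log(classifyData(new Uint8Array(4)));                  // "bytes"
console.log(classifyData(new ArrayBuffer(4)));                 // "bytes"
console.log(classifyData(new URL("https://example.com/d")));   // "url"
console.log(classifyData("/data/domain.gukfst"));              // "path"
```

Note that a string is never parsed as a URL in this sketch: to load from a URL, wrap it in `new URL(...)` first.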
#Load from a URL

    const g = await load({
      dictionaries: [{
        format: "fst",
        data: new URL("./legal.gukfst", import.meta.url),
      }],
    });

In a browser the URL is fetched with fetch(). In Node.js, Deno, and Bun a
file:// URL is read from disk.
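
The two-argument `URL` constructor resolves a relative reference against a base, which is how a dictionary file can be located next to the current module. The sketch below uses hypothetical base URLs in place of `import.meta.url` so that it runs anywhere; the paths and host name are assumptions for illustration only.

```typescript
// Hypothetical module location standing in for import.meta.url on a server.
const moduleUrl = "file:///app/src/main.js";
const localDict = new URL("./legal.gukfst", moduleUrl);

console.log(localDict.href);     // "file:///app/src/legal.gukfst"
console.log(localDict.protocol); // "file:"

// Hypothetical module location in a browser bundle.
const pageModuleUrl = "https://example.com/assets/app.js";
const remoteDict = new URL("./legal.gukfst", pageModuleUrl);

console.log(remoteDict.href);     // "https://example.com/assets/legal.gukfst"
console.log(remoteDict.protocol); // "https:"
```

The same source line therefore yields a `file:` URL on a server runtime and an `https:` URL in a browser, matching the two loading paths described above.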
#Load from bytes

    const response = await fetch("/custom.gukcdb");
    const buf = await response.arrayBuffer();
    const g = await load({
      dictionaries: [{ format: "cdb", data: buf }],
    });

#Load from a file path (Node.js/Deno/Bun)

    const g = await load({
      dictionaries: [{ format: "fst", data: "/data/domain.gukfst" }],
    });

Passing a plain string path in a browser throws GukhanmunError with code
"io".
#Combining multiple dictionaries

    const g = await load({
      dictionaries: [
        { format: "fst", data: "/data/legal.gukfst" }, // checked first
        await stdictFst(),                             // fallback
      ],
    });