# Conversion options
Pass these as properties of the `GukhanmunOptions` object to `load()`.
# Preset
`preset` selects a preconfigured set of defaults:
| Value | Dictionary | Initial sound law | Homophone window |
|---|---|---|---|
| `"ko-kr"` (default) | None bundled; load stdict explicitly | `true` | `"per-block"` |
| `"ko-kp"` | None | `false` | `"off"` |

For example, to select the `ko-kp` preset without any dictionaries:

    const g = await load({ preset: "ko-kp", dictionaries: [] });

Unlike the Rust crate, the JavaScript packages never include the bundled dictionary automatically; pass it via `dictionaries` whenever you need it. To use the Standard Korean Language Dictionary, add `@gukhanmun/stdict-fst` or `@gukhanmun/stdict-cdb` explicitly, for example `dictionaries: [await stdictFst()]`. Sources are queried in order, and earlier entries take precedence. When `dictionaries` is omitted or empty, only the fallback Unihan character map is used.
`load()` initialises the WASM module on the first call and reuses the cached module afterwards. It throws a `GukhanmunError` on invalid options or a dictionary load failure.
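To make the idea of a preset as "a set of defaults" concrete, here is a minimal sketch of how the two table columns could be resolved. The `resolveOptions` function and the `ResolvedOptions` type are hypothetical names introduced for illustration, and the rule that explicitly passed fields override preset defaults is an assumption, not confirmed library behaviour.

```typescript
// Hypothetical types mirroring the preset table; not the library's real definitions.
type Preset = "ko-kr" | "ko-kp";
type ContextWindow = "off" | "per-block" | "per-section" | "per-document";

interface ResolvedOptions {
  preset: Preset;
  initialSoundLaw: boolean;
  homophoneWindow: ContextWindow;
}

// Defaults copied from the preset table.
const PRESET_DEFAULTS: Record<Preset, Omit<ResolvedOptions, "preset">> = {
  "ko-kr": { initialSoundLaw: true, homophoneWindow: "per-block" },
  "ko-kp": { initialSoundLaw: false, homophoneWindow: "off" },
};

// Assumption: a field passed explicitly wins over the preset's default.
function resolveOptions(options: Partial<ResolvedOptions> = {}): ResolvedOptions {
  const preset = options.preset ?? "ko-kr";
  const defaults = PRESET_DEFAULTS[preset];
  return {
    preset,
    initialSoundLaw: options.initialSoundLaw ?? defaults.initialSoundLaw,
    homophoneWindow: options.homophoneWindow ?? defaults.homophoneWindow,
  };
}

console.log(resolveOptions({ preset: "ko-kp" }));
console.log(resolveOptions({ preset: "ko-kp", homophoneWindow: "per-document" }));
```

Under these assumptions, the second call keeps `initialSoundLaw: false` from `ko-kp` while widening only the homophone window.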
# Segmentation strategy
`segmentation` controls how Gukhanmun finds word boundaries within hanja runs:
- `"lattice"` (default): evaluates all dictionary matches at every position and selects the globally optimal segmentation using dynamic programming. Most accurate, especially for compound words and ambiguous boundaries.
- `"eager"`: greedy left-to-right longest-match. Faster, but may mis-segment compound words.

Set the strategy when calling `load()`; prefer `"eager"` only when throughput matters more than accuracy:

    const g = await load({
      segmentation: "lattice",    // default: optimal, dynamic programming
      // segmentation: "eager",   // greedy, faster but less accurate
    });
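The difference between the two strategies is easiest to see on a toy input. The sketch below is not the library's implementation: the three-word dictionary, the cost values, and the function names `segmentEager` and `segmentLattice` are all assumptions chosen so that greedy longest-match and dynamic programming disagree.

```typescript
// Toy dictionary; a real one would also carry hangul readings.
const DICTIONARY = new Set(["大學", "大學生", "生活"]);
const MAX_WORD_LENGTH = 3;

// Assumed costs: a dictionary word is cheap, a single-character fallback is expensive.
const WORD_COST = 1;
const FALLBACK_COST = 10;

// "eager": greedy left-to-right longest match.
function segmentEager(text: string): string[] {
  const out: string[] = [];
  let i = 0;
  while (i < text.length) {
    let len = Math.min(MAX_WORD_LENGTH, text.length - i);
    // Shrink the candidate until it is a dictionary word or a single character.
    while (len > 1 && !DICTIONARY.has(text.slice(i, i + len))) len--;
    out.push(text.slice(i, i + len));
    i += len;
  }
  return out;
}

// "lattice": dynamic programming over every dictionary match at every position.
function segmentLattice(text: string): string[] {
  const n = text.length;
  const best: number[] = new Array<number>(n + 1).fill(Infinity); // best[i] = min cost of text[0..i)
  const prev: number[] = new Array<number>(n + 1).fill(-1);       // start of the last piece
  best[0] = 0;
  for (let end = 1; end <= n; end++) {
    for (let len = 1; len <= Math.min(MAX_WORD_LENGTH, end); len++) {
      const start = end - len;
      const inDict = DICTIONARY.has(text.slice(start, end));
      if (!inDict && len > 1) continue; // only single characters may fall back
      const cost = best[start] + (inDict ? WORD_COST : FALLBACK_COST);
      if (cost < best[end]) {
        best[end] = cost;
        prev[end] = start;
      }
    }
  }
  // Walk the back-pointers from the end to recover the pieces.
  const out: string[] = [];
  for (let end = n; end > 0; end = prev[end]) out.unshift(text.slice(prev[end], end));
  return out;
}

console.log(segmentEager("大學生活"));   // ["大學生", "活"]
console.log(segmentLattice("大學生活")); // ["大學", "生活"]
```

The greedy pass commits to 大學生 and strands 活 as a fallback character, while the dynamic-programming pass finds the cheaper split into two dictionary words.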
# Numeral handling
`numerals` controls how hanja numeral characters such as 二〇一六 are rendered.
Chinese-style numerals can represent numbers in multiple ways depending on
whether they encode positions or quantities:
| Value | 二〇一六年 | 十一月 | 一千二百三十四 |
|---|---|---|---|
| `"hangul-phonetic"` (default) | 이공일륙년 | 십일월 | 일천이백삼십사 |
| `"positional-arabic"` | 2016년 | - | - |
| `"additive-arabic"` | - | 11월 | 1234 |
| `"smart"` | 2016년 | 11월 | 1234 |

Select a strategy when calling `load()`:

    const g = await load({
      numerals: "hangul-phonetic",      // 이공일륙 (default)
      // numerals: "positional-arabic", // 2016
      // numerals: "additive-arabic",   // 11 (月), 1234
      // numerals: "smart",             // picks best per context
    });

`"smart"` chooses positional notation for year-like four-digit sequences and
additive notation for quantities; it is a good default for general-purpose
documents.
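The positional and additive readings can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the additive parser handles values below 10,000 (no 萬), and the `smartArabic` heuristic (additive whenever a unit character such as 十 appears, positional otherwise) is an assumed stand-in for whatever rule the library actually applies.

```typescript
const DIGITS: Record<string, number> = {
  "〇": 0, "一": 1, "二": 2, "三": 3, "四": 4,
  "五": 5, "六": 6, "七": 7, "八": 8, "九": 9,
};
const UNITS: Record<string, number> = { "十": 10, "百": 100, "千": 1000 };

// Positional: every character is one decimal digit.
function positionalArabic(run: string): string {
  return run
    .split("")
    .map((ch) => {
      if (!(ch in DIGITS)) throw new Error(`not a positional digit: ${ch}`);
      return String(DIGITS[ch]);
    })
    .join("");
}

// Additive: a digit multiplies the unit that follows it; a bare unit counts once.
function additiveArabic(run: string): number {
  let total = 0;
  let pending = 0; // digit waiting for its unit
  for (const ch of run.split("")) {
    if (ch in UNITS) {
      total += (pending === 0 ? 1 : pending) * UNITS[ch];
      pending = 0;
    } else if (ch in DIGITS) {
      pending = DIGITS[ch];
    } else {
      throw new Error(`not a hanja numeral: ${ch}`);
    }
  }
  return total + pending;
}

// Assumed heuristic: unit characters signal a quantity, plain digit runs a position.
function smartArabic(run: string): string {
  const hasUnit = run.split("").some((ch) => ch in UNITS);
  return hasUnit ? String(additiveArabic(run)) : positionalArabic(run);
}

console.log(smartArabic("二〇一六"));       // 2016
console.log(smartArabic("十一"));           // 11
console.log(smartArabic("一千二百三十四")); // 1234
```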
# Initial sound law
The initial sound law (頭音法則) is a South Korean phonological rule that changes certain initial consonants at the start of a word. The rule applies to fallback readings for characters not found in any dictionary; dictionary entries already encode their correct readings.
| Input | initialSoundLaw: true (ko-kr) | initialSoundLaw: false (ko-kp) |
|---|---|---|
| 來日 | 내일 | 래일 |
| 理由 | 이유 | 리유 |
| 女子 | 여자 | 녀자 |

The flag can also be set explicitly:

    const g = await load({
      initialSoundLaw: true,     // default for ko-kr
      // initialSoundLaw: false, // default for ko-kp
    });

Disable it for North Korean orthography (the `"ko-kp"` preset) or when processing
text that follows North Korean spelling conventions.
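The transformations in the table above follow a regular pattern that can be expressed with Unicode Hangul syllable arithmetic. The sketch below is a simplification, not the library's code: `applyInitialSoundLaw` is a hypothetical name, it only rewrites the first syllable of a word, and it ignores the exceptions found in real orthography.

```typescript
const HANGUL_BASE = 0xac00; // first precomposed syllable, 가
const MEDIAL_COUNT = 21;
const FINAL_COUNT = 28;
const SYLLABLE_COUNT = 11172;

// Initial-consonant indices in Unicode order.
const NIEUN = 2;  // ㄴ
const RIEUL = 5;  // ㄹ
const IEUNG = 11; // ㅇ

// Medial-vowel indices for ㅑ ㅕ ㅖ ㅛ ㅠ ㅣ.
const I_OR_Y_VOWELS = [2, 6, 7, 12, 17, 20];

function applyInitialSoundLaw(word: string): string {
  if (word.length === 0) return word;
  const offset = word.charCodeAt(0) - HANGUL_BASE;
  if (offset < 0 || offset >= SYLLABLE_COUNT) return word; // not a precomposed syllable

  // Decompose the first syllable into initial, medial, and final indices.
  const initial = Math.floor(offset / (MEDIAL_COUNT * FINAL_COUNT));
  const medial = Math.floor((offset % (MEDIAL_COUNT * FINAL_COUNT)) / FINAL_COUNT);
  const final = offset % FINAL_COUNT;

  const beforeIOrY = I_OR_Y_VOWELS.indexOf(medial) !== -1;
  let next = initial;
  if (initial === RIEUL) next = beforeIOrY ? IEUNG : NIEUN; // 리 -> 이, 래 -> 내
  else if (initial === NIEUN && beforeIOrY) next = IEUNG;   // 녀 -> 여

  // Recompose with the new initial consonant.
  const code = HANGUL_BASE + (next * MEDIAL_COUNT + medial) * FINAL_COUNT + final;
  return String.fromCharCode(code) + word.slice(1);
}

console.log(applyInitialSoundLaw("래일")); // 내일
console.log(applyInitialSoundLaw("리유")); // 이유
console.log(applyInitialSoundLaw("녀자")); // 여자
```

With `initialSoundLaw: false`, the equivalent of this step would simply be skipped, leaving 래일, 리유, and 녀자 unchanged.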
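# Homophone disambiguation window
When different hanja forms share the same hangul reading, Gukhanmun can mark
those occurrences to help readers distinguish homophones. The `HomophoneMarker` middleware sets `homophone = true` on annotations whose hangul reading is shared by another hanja form within the window.
`homophoneWindow` sets the scope across which readings are tracked: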
| Value | Behaviour |
|---|---|
| `"off"` | No disambiguation tracking |
| `"per-block"` (default for ko-kr) | Reset at paragraph, list, and heading boundaries |
| `"per-section"` | Reset at heading boundaries only |
| `"per-document"` | Track across the entire input |

Choose the window when calling `load()`:

    const g = await load({
      homophoneWindow: "per-block",      // default for ko-kr
      // homophoneWindow: "off",
      // homophoneWindow: "per-section",
      // homophoneWindow: "per-document",
    });

Wider windows are appropriate for dense hanja texts where the same character recurs across many sections.
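The effect of the window can be sketched with a toy document model. Everything here is hypothetical: the `Annotation` and `Block` shapes, the `groupIntoScopes` helper, and `markHomophones` are illustrative names, and the real middleware may track state differently.

```typescript
type ContextWindow = "off" | "per-block" | "per-section" | "per-document";

// Hypothetical shapes; the library's real annotation type may differ.
interface Annotation { hanja: string; hangul: string; homophone: boolean; }
interface Block { kind: "heading" | "paragraph"; annotations: Annotation[]; }

// Group blocks into the scopes within which readings are compared.
function groupIntoScopes(blocks: Block[], window: ContextWindow): Block[][] {
  if (window === "off") return [];
  if (window === "per-document") return [blocks];
  if (window === "per-block") return blocks.map((b) => [b]);
  // "per-section": a heading starts a new scope.
  const out: Block[][] = [];
  for (const block of blocks) {
    if (block.kind === "heading" || out.length === 0) out.push([]);
    out[out.length - 1].push(block);
  }
  return out;
}

function markHomophones(blocks: Block[], window: ContextWindow): void {
  for (const scope of groupIntoScopes(blocks, window)) {
    // First pass: collect the distinct hanja forms seen for each hangul reading.
    const forms = new Map<string, Set<string>>();
    for (const block of scope) {
      for (const a of block.annotations) {
        const seen = forms.get(a.hangul) ?? new Set<string>();
        seen.add(a.hanja);
        forms.set(a.hangul, seen);
      }
    }
    // Second pass: flag annotations whose reading maps to more than one form.
    for (const block of scope) {
      for (const a of block.annotations) {
        const seen = forms.get(a.hangul);
        a.homophone = seen !== undefined && seen.size > 1;
      }
    }
  }
}

// 詐欺 (fraud) and 士氣 (morale) are both read 사기.
const doc: Block[] = [
  { kind: "paragraph", annotations: [{ hanja: "詐欺", hangul: "사기", homophone: false }] },
  { kind: "paragraph", annotations: [{ hanja: "士氣", hangul: "사기", homophone: false }] },
];
markHomophones(doc, "per-document");
console.log(doc.map((b) => b.annotations[0].homophone)); // [true, true]
```

With `"per-block"` the same input would leave both flags `false`, because each paragraph is compared only against itself.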
# First-occurrence clearing window
When enabled, first-occurrence clearing stops annotating a hanja after its first occurrence within the window. This is useful for documents that introduce each character once and then use it freely; subsequent occurrences are left as plain hangul without parenthetical hanja.
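As a rough illustration of the behaviour described above, the sketch below clears the gloss on repeated words inside each scope. The `Gloss` shape, the `showHanja` flag, and both function names are hypothetical; each inner array stands for one window (a block, a section, or the whole document).

```typescript
// Hypothetical shape; `showHanja` stands in for whatever flag the library clears.
interface Gloss { hanja: string; hangul: string; showHanja: boolean; }

// Each inner array is one scope. Repeats inside a scope lose their gloss.
function clearRepeats(scopes: Gloss[][]): void {
  for (const scope of scopes) {
    const seen = new Set<string>();
    for (const g of scope) {
      if (seen.has(g.hanja)) g.showHanja = false; // already glossed in this scope
      else seen.add(g.hanja);
    }
  }
}

// Render as hangul with a parenthetical hanja only where the gloss survives.
function render(glosses: Gloss[]): string {
  return glosses
    .map((g) => (g.showHanja ? `${g.hangul}(${g.hanja})` : g.hangul))
    .join(" ");
}

const sameBlock: Gloss[] = [
  { hanja: "理由", hangul: "이유", showHanja: true },
  { hanja: "理由", hangul: "이유", showHanja: true },
];
clearRepeats([sameBlock]);
console.log(render(sameBlock)); // 이유(理由) 이유
```

Passing the two occurrences as separate scopes, as a narrower window would, keeps the gloss on both.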
| Value | Behaviour |
|---|---|
| `"off"` (default) | Never clear; annotate every occurrence |
| `"per-block"` | Clear within the same paragraph/block |
| `"per-section"` | Clear within the same section |
| `"per-document"` | Clear across the entire document |

Set the window when calling `load()`:

    const g = await load({
      firstOccurrenceWindow: "off",            // default
      // firstOccurrenceWindow: "per-block",
      // firstOccurrenceWindow: "per-section",
      // firstOccurrenceWindow: "per-document",
    });

# Error recovery
`recovery` controls what happens when the HTML parser encounters markup it
cannot interpret. It has no effect for plain text or Markdown input.
The default is `"strict"`:

    const g = await load({
      recovery: "strict",     // default: throw on error
      // recovery: "lenient", // skip bad fragments (HTML)
    });

Use `"lenient"` when processing HTML from external sources that may contain
fragments or non-standard markup; it skips problematic parts rather than
throwing a `GukhanmunError`.
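The contrast between the two policies can be shown with a deliberately simple stand-in. This is not how the library scans HTML: `parseFragment` is a hypothetical check that merely rejects unbalanced angle brackets, and `scan` only demonstrates the throw-versus-skip decision.

```typescript
type Recovery = "strict" | "lenient";

// Hypothetical fragment-level check: rejects unbalanced angle brackets.
function parseFragment(fragment: string): string {
  const opens = fragment.split("<").length - 1;
  const closes = fragment.split(">").length - 1;
  if (opens !== closes) throw new Error(`malformed fragment: ${fragment}`);
  return fragment;
}

function scan(fragments: string[], recovery: Recovery): string[] {
  const out: string[] = [];
  for (const fragment of fragments) {
    try {
      out.push(parseFragment(fragment));
    } catch (error) {
      if (recovery === "strict") throw error; // surface the failure to the caller
      // "lenient": drop the problematic fragment and keep going
    }
  }
  return out;
}

const input = ["<p>漢字</p>", "<p 漢字"];
console.log(scan(input, "lenient")); // ["<p>漢字</p>"]
```

Calling `scan(input, "strict")` on the same input throws on the second fragment instead of skipping it.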