TypeScript / Node.js API
import {
  preprocess,
  preprocessString,
  preprocessWithOptions,
  preprocessGmail,
  toLlmContext,
  toLlmContextWithOptions,
  RenderMode,
} from "langmail"
import type {
  ProcessedEmail,
  PreprocessOptions,
  LlmContextOptions,
  Address,
  CallToAction,
  ThreadMessage,
} from "langmail"
Note
The public TypeScript surface also re-exports the underlying NAPI-RS
generated names (NapiAddress, NapiCallToAction, NapiLlmContextOptions,
NapiRenderMode, NapiThreadMessage) as backward-compatible aliases.
Prefer the unprefixed names — they are the canonical public API.
preprocess()
Accepts raw RFC 5322 email bytes and returns a structured ProcessedEmail object. Handles MIME multipart messages, HTML and plain-text body variants, and normalises character encodings.
function preprocess(raw: Buffer): ProcessedEmail
function preprocessString(raw: string): ProcessedEmail
function preprocessWithOptions(
  raw: Buffer,
  options: PreprocessOptions
): ProcessedEmail
preprocess is synchronous and takes a Buffer. Use preprocessString as a convenience wrapper if you already have the email as a string. Use preprocessWithOptions to override defaults — see PreprocessOptions.
ProcessedEmail
Optional fields are declared with ?: in the generated .d.ts — their type
is T | undefined, not T | null.
| Field | Type | Description |
|---|---|---|
| body | string | Cleaned body text, with quotes and signature removed |
| subject | string \| undefined | Subject line |
| from | Address \| undefined | Sender |
| to | Address[] | To recipients |
| cc | Address[] | Cc recipients |
| date | string \| undefined | ISO 8601 date string |
| rfcMessageId | string \| undefined | RFC 2822 Message-ID header value |
| inReplyTo | string[] \| undefined | In-Reply-To header values (for threading) |
| references | string[] \| undefined | References header values (for threading) |
| signature | string \| undefined | Extracted signature, if found |
| rawBodyLength | number | Length of the original body before cleaning |
| cleanBodyLength | number | Length of the cleaned body |
| primaryCta | CallToAction \| undefined | Primary call-to-action link extracted from the HTML body |
| threadMessages | ThreadMessage[] | Quoted reply messages, oldest first |
Address is { name?: string, email: string }. CallToAction is { url: string, text: string, confidence: number }. ThreadMessage is { sender: string, timestamp?: string, body: string }.
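As a minimal sketch of consuming these fields, the snippet below declares local mirrors of a subset of the documented shapes so it runs without langmail installed. The helpers formatAddress and removedFraction, and the EmailSummaryInput interface, are hypothetical names introduced for illustration; they are not part of the library. Optional fields are compared against undefined, matching the note above.

```typescript
// Local mirror of the documented Address shape.
interface Address {
  name?: string;
  email: string;
}

// Hypothetical subset of ProcessedEmail, limited to the fields used below.
interface EmailSummaryInput {
  from?: Address;
  subject?: string;
  rawBodyLength: number;
  cleanBodyLength: number;
}

// Hypothetical helper: render an Address as "Name <email>" or the bare email.
function formatAddress(addr: Address | undefined): string {
  if (addr === undefined) return "(unknown sender)";
  return addr.name ? `${addr.name} <${addr.email}>` : addr.email;
}

// Hypothetical helper: fraction of the original body removed by cleaning.
function removedFraction(email: EmailSummaryInput): number {
  if (email.rawBodyLength === 0) return 0;
  return 1 - email.cleanBodyLength / email.rawBodyLength;
}

const sample: EmailSummaryInput = {
  from: { name: "Ada", email: "ada@example.com" },
  rawBodyLength: 200,
  cleanBodyLength: 50,
};

console.log(formatAddress(sample.from)); // Ada <ada@example.com>
console.log(removedFraction(sample));    // 0.75
```

In real code the object would come from preprocess() rather than a literal; the literal here only stands in for that result.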
PreprocessOptions
| Option | Type | Default | Description |
|---|---|---|---|
| stripQuotes | boolean | true | Remove quoted reply chains |
| stripSignature | boolean | true | Remove trailing signature block |
| maxBodyLength | number | 0 | Truncate body after N characters. 0 = no limit |
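The defaults in the table can be expressed as a small caller-side sketch. PreprocessOptionsSketch, resolveOptions, and hasBodyLimit are hypothetical names, and treating every option as optional is an assumption of this example rather than something the generated typings are known to guarantee.

```typescript
// Local stand-in for the documented option shape; treating every field as
// optional is an assumption of this sketch.
interface PreprocessOptionsSketch {
  stripQuotes?: boolean;
  stripSignature?: boolean;
  maxBodyLength?: number;
}

// Defaults copied from the table above.
const DEFAULTS: Required<PreprocessOptionsSketch> = {
  stripQuotes: true,
  stripSignature: true,
  maxBodyLength: 0,
};

// Hypothetical helper: fill in the documented default for any omitted option.
function resolveOptions(
  options: PreprocessOptionsSketch = {},
): Required<PreprocessOptionsSketch> {
  return { ...DEFAULTS, ...options };
}

// Hypothetical helper: 0 means "no limit", so only positive values truncate.
function hasBodyLimit(options: PreprocessOptionsSketch): boolean {
  return resolveOptions(options).maxBodyLength > 0;
}

// Keep the signature in the body and cap the body at 4000 characters.
const keepSignature = resolveOptions({ stripSignature: false, maxBodyLength: 4000 });
console.log(keepSignature);
console.log(hasBodyLimit({})); // false
```

An object like keepSignature is the kind of value one would pass as the second argument to preprocessWithOptions.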
preprocessGmail()
Provider adapter for the Gmail API. Accepts the response of gmail.users.messages.get({ id, format: "full" }) from googleapis and returns the same ProcessedEmail shape as preprocess(). Skips MIME re-parsing — the Gmail API has already decomposed the message into typed parts, so the adapter walks payload.parts, base64url-decodes the bodies, and feeds them into the shared cleaning pipeline. The body tree walk, header parsing, and base64url decoding all happen in Rust — this wrapper only serializes the caller's object to JSON and delegates to the native binding, so the output is byte-identical to the Python and Rust entry points.
function preprocessGmail(
  msg: GmailInput,
  options?: PreprocessOptions
): ProcessedEmail
Accepts either the bare Schema$Message or the full googleapis response ({ data: Schema$Message, ... }). The message must have been fetched with format: "full" so payload is present with headers and base64url-encoded body parts.
Body selection: walks payload.parts depth-first and picks the first non-attachment leaf of each type. When both text/html and text/plain are present, HTML wins. Parts with Content-Disposition: attachment or a filename are skipped.
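The selection rule above can be sketched in TypeScript. The real walk happens in Rust inside langmail, so this is only an illustration of the described behaviour: GmailPart, isAttachment, selectBodyPart, and decodeBody are hypothetical names, and GmailPart mirrors just a subset of the Gmail API MessagePart fields. The sketch assumes Node.js, whose Buffer supports the "base64url" encoding.

```typescript
// Hypothetical minimal mirror of the Gmail API MessagePart shape.
interface GmailPart {
  mimeType?: string;
  filename?: string;
  headers?: { name: string; value: string }[];
  body?: { data?: string; attachmentId?: string };
  parts?: GmailPart[];
}

interface Found {
  html?: GmailPart;
  plain?: GmailPart;
}

// A part counts as an attachment if it has a filename or an
// attachment Content-Disposition header.
function isAttachment(part: GmailPart): boolean {
  if (part.filename) return true;
  return (part.headers ?? []).some(
    (h) =>
      h.name.toLowerCase() === "content-disposition" &&
      h.value.trim().toLowerCase().startsWith("attachment"),
  );
}

// Depth-first walk that records the first non-attachment leaf of each type.
function collect(part: GmailPart, found: Found): void {
  if (isAttachment(part)) return;
  if (part.parts && part.parts.length > 0) {
    for (const child of part.parts) collect(child, found);
    return;
  }
  if (part.mimeType === "text/html" && !found.html) found.html = part;
  else if (part.mimeType === "text/plain" && !found.plain) found.plain = part;
}

// HTML wins when both variants are present.
function selectBodyPart(root: GmailPart): GmailPart | undefined {
  const found: Found = {};
  collect(root, found);
  return found.html ?? found.plain;
}

// Decode a base64url body as UTF-8 (per-part charset is ignored, as noted in this section).
function decodeBody(part: GmailPart): string {
  return Buffer.from(part.body?.data ?? "", "base64url").toString("utf8");
}

const enc = (s: string): string => Buffer.from(s, "utf8").toString("base64url");

const payload: GmailPart = {
  mimeType: "multipart/mixed",
  parts: [
    {
      mimeType: "multipart/alternative",
      parts: [
        { mimeType: "text/plain", body: { data: enc("Hello") } },
        { mimeType: "text/html", body: { data: enc("<p>Hello</p>") } },
      ],
    },
    { mimeType: "text/plain", filename: "notes.txt", body: { attachmentId: "att-1" } },
  ],
};

const chosen = selectBodyPart(payload);
console.log(chosen?.mimeType); // text/html
```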
Throws:
- TypeError if the input is not an object or has no payload (i.e. the message wasn't fetched with format: "full").
- Error if the chosen body part is attachment-backed (Gmail returned body.attachmentId instead of body.data because the body exceeded the inline size threshold). In that case, fetch the content with users.messages.attachments.get and inline the decoded content.
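A caller-side pre-check that mirrors these conditions might look like the sketch below. It is not the library's implementation: unwrapMessage, isAttachmentBacked, and the error messages are hypothetical, and the sketch only checks the shapes described in this section.

```typescript
type JsonObject = Record<string, unknown>;

function isObject(value: unknown): value is JsonObject {
  return typeof value === "object" && value !== null;
}

// Accept either the bare message or a googleapis-style response ({ data: message }).
// Throws TypeError when the input is not an object or has no payload.
function unwrapMessage(input: unknown): JsonObject {
  if (!isObject(input)) {
    throw new TypeError("expected a Gmail message object");
  }
  const message = !("payload" in input) && isObject(input.data) ? input.data : input;
  if (!isObject(message.payload)) {
    throw new TypeError('message has no payload; fetch it with format: "full"');
  }
  return message;
}

// True when Gmail stored the body as a separate attachment instead of inline data.
function isAttachmentBacked(body: { data?: string; attachmentId?: string }): boolean {
  return body.data === undefined && body.attachmentId !== undefined;
}

const wrapped = { data: { id: "m1", payload: { mimeType: "text/plain" } } };
console.log(unwrapMessage(wrapped).id);                   // m1
console.log(isAttachmentBacked({ attachmentId: "a1" }));  // true
```

Running a check like this before calling preprocessGmail lets a caller fetch an oversized body through the attachments endpoint first, rather than reacting to the thrown Error afterwards.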
Note
langmail does not bundle or depend on googleapis — only the shape of the response is consumed. Bodies are decoded as UTF-8; per-part charset parameters are not consulted, so legacy 8-bit encodings may produce mojibake.
toLlmContext()
Accepts a ProcessedEmail and returns a deterministic plain-text string formatted for direct inclusion in a prompt. The output includes a header block (FROM / TO / SUBJECT / DATE) followed by a CONTENT: section.
function toLlmContext(email: ProcessedEmail): string
function toLlmContextWithOptions(
  email: ProcessedEmail,
  options: LlmContextOptions
): string
Use toLlmContextWithOptions when you need to control rendering — for example, to include quoted reply history.
LlmContextOptions
| Option | Type | Default | Description |
|---|---|---|---|
| renderMode | RenderMode.LatestOnly \| RenderMode.ThreadHistory | LatestOnly | LatestOnly strips quoted content; ThreadHistory appends quoted replies as a chronological transcript below the main content |
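To make the difference between the two modes concrete, here is a rough sketch of the behaviour described in the table. The transcript layout is invented for illustration and the actual output of toLlmContextWithOptions may differ; RenderModeSketch, ThreadMessageSketch, and renderContent are hypothetical local stand-ins, not the library's RenderMode or ThreadMessage.

```typescript
// Local stand-in for the two documented render modes.
type RenderModeSketch = "LatestOnly" | "ThreadHistory";

// Local mirror of the documented ThreadMessage shape.
interface ThreadMessageSketch {
  sender: string;
  timestamp?: string;
  body: string;
}

// Hypothetical renderer: LatestOnly returns only the cleaned body;
// ThreadHistory appends the quoted messages in the order given (oldest first).
function renderContent(
  body: string,
  thread: ThreadMessageSketch[],
  mode: RenderModeSketch,
): string {
  if (mode === "LatestOnly" || thread.length === 0) return body;
  const transcript = thread
    .map((m) => {
      const when = m.timestamp ? ` (${m.timestamp})` : "";
      return `${m.sender}${when}:\n${m.body}`;
    })
    .join("\n\n");
  return `${body}\n\n${transcript}`;
}

const thread: ThreadMessageSketch[] = [
  { sender: "alice@example.com", timestamp: "2024-01-01T09:00:00Z", body: "Can we meet Tuesday?" },
  { sender: "bob@example.com", body: "Tuesday works." },
];

console.log(renderContent("Sounds good.", thread, "LatestOnly"));
console.log(renderContent("Sounds good.", thread, "ThreadHistory"));
```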
Warning
Quote detection is heuristic. See Concepts → Caveats for where accuracy degrades.