
Accepts a raw RFC 5322 email string and returns a promise that resolves to a structured ParsedEmail object. Handles MIME multipart messages and both HTML and plain-text body variants, and normalises character encodings, zero-width characters, and HTML entities.

function preprocess(
  raw: string
): Promise<ParsedEmail>

What it does internally:

  • Parses the MIME structure and selects the most relevant body variant (HTML preferred over plain text)
  • Converts HTML to Markdown, preserving semantic structure
  • Detects and removes quoted reply chains across Gmail, Outlook, Apple Mail, and non-standard clients
  • Detects and removes email signatures
  • Normalises zero-width characters and HTML entities
  • Strips URLs while preserving anchor text
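Two of the steps above, character normalisation and URL stripping, can be sketched in a few lines. This is only an illustration of the idea and not the library's internals: `normaliseText`, `stripUrls`, and the small entity table are hypothetical names introduced here, and the sketch assumes the body has already been converted to Markdown.

```typescript
// Illustrative sketch only: hypothetical helpers, not the library's implementation.

// Zero-width space, non-joiner, joiner, word joiner, and byte-order mark.
const ZERO_WIDTH = /[\u200B\u200C\u200D\u2060\uFEFF]/g;

// A small subset of named entities; a real decoder would cover the full HTML list.
const NAMED_ENTITIES: Record<string, string> = {
  amp: "&",
  lt: "<",
  gt: ">",
  quot: '"',
  apos: "'",
  nbsp: "\u00A0",
};

// Returns the character for a valid Unicode code point, otherwise the original text.
function decodeCodePoint(code: number, fallback: string): string {
  const valid = Number.isInteger(code) && code >= 0 && code <= 0x10ffff;
  return valid ? String.fromCodePoint(code) : fallback;
}

// Removes zero-width characters, then decodes numeric and (some) named entities.
function normaliseText(input: string): string {
  return input
    .replace(ZERO_WIDTH, "")
    .replace(/&(#x[0-9a-fA-F]+|#[0-9]+|[a-zA-Z]+);/g, (match: string, body: string) => {
      if (body.startsWith("#x")) return decodeCodePoint(parseInt(body.slice(2), 16), match);
      if (body.startsWith("#")) return decodeCodePoint(parseInt(body.slice(1), 10), match);
      return NAMED_ENTITIES[body] ?? match; // unknown names are left untouched
    });
}

// Drops link targets and bare URLs but keeps the visible anchor text.
function stripUrls(markdown: string): string {
  return markdown
    // [anchor text](https://example.com) becomes: anchor text
    .replace(/\[([^\]]*)\]\((?:[^()\s]+)(?:\s+"[^"]*")?\)/g, "$1")
    // Remove bare http(s) URLs left in the text.
    .replace(/https?:\/\/\S+/g, "")
    // Collapse the runs of spaces that the removals leave behind.
    .replace(/[ \t]{2,}/g, " ")
    .trim();
}

const cleaned = stripUrls(normaliseText("Read the [docs](https://example.com/docs) &amp; reply\u200B"));
console.log(cleaned);
```

Each entity is decoded in a single pass, so an input such as `&amp;lt;` becomes `&lt;` rather than `<`, which avoids double-decoding text that was escaped on purpose.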

Field           Type            Description
subject         string | null   Decoded subject line
sender          Address         From field, parsed into name and email
recipients      Address[]       To and Cc recipients
date            string | null   ISO 8601 date string
body            string          Cleaned Markdown body
ctas            CTA[]           Extracted calls-to-action
rfc_message_id  string | null   RFC 5322 Message-ID header
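
For readers who prefer types to tables, the fields listed above might be written as the following TypeScript interfaces. The field names and top-level types come from the listing; the inner shapes of `Address` (beyond "name and email") and `CTA` are not documented here, so they are assumptions, as is the `describe` helper, which only shows how the nullable fields need a fallback.

```typescript
// Sketch of the result shape. Address and CTA internals are assumed, not documented.
interface Address {
  name: string | null; // assumed nullable: a From header may carry no display name
  email: string;
}

interface CTA {
  text: string; // hypothetical field
  url: string; // hypothetical field
}

interface ParsedEmail {
  subject: string | null;
  sender: Address;
  recipients: Address[];
  date: string | null;
  body: string;
  ctas: CTA[];
  rfc_message_id: string | null;
}

// Hypothetical helper: builds a one-line summary, falling back when fields are null.
function describe(email: ParsedEmail): string {
  const subject = email.subject ?? "(no subject)";
  const from = email.sender.name ?? email.sender.email;
  return `${subject} from ${from}`;
}

const sample: ParsedEmail = {
  subject: "Quarterly report",
  sender: { name: "Ada", email: "ada@example.com" },
  recipients: [{ name: null, email: "team@example.com" }],
  date: "2024-01-15T09:30:00Z",
  body: "The report is attached.",
  ctas: [],
  rfc_message_id: null,
};

console.log(describe(sample));
```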

Accepts a ParsedEmail and returns a string, Markdown by default, formatted for direct inclusion in a prompt. The output includes a structured header block followed by the cleaned body.

function toLLMContext(
  parsed: ParsedEmail,
  options?: LLMContextOptions
): string

Option          Type                   Default     Description
format          "markdown" | "plain"   "markdown"  Output format
includeHeaders  boolean                true        Prepend From / Subject / Date header block
includeCTAs     boolean                true        Append extracted CTAs at the end
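
To make the effect of the three options concrete, here is a minimal stand-in that assembles a header block, the body, and a CTA list. It is a sketch under assumptions, not the library's `toLLMContext`: the function name `sketchToLLMContext`, the `EmailLike` type, the `CTA` shape, and the exact header and bullet formatting are all invented for illustration, and the real output may look different.

```typescript
// Hypothetical stand-in for illustration; the real toLLMContext may format output differently.
interface Address {
  name: string | null; // assumed shape
  email: string;
}

interface CTA {
  text: string; // assumed shape
  url: string;
}

// Only the ParsedEmail fields this sketch reads.
interface EmailLike {
  subject: string | null;
  sender: Address;
  date: string | null;
  body: string;
  ctas: CTA[];
}

interface LLMContextOptions {
  format?: "markdown" | "plain";
  includeHeaders?: boolean;
  includeCTAs?: boolean;
}

function sketchToLLMContext(parsed: EmailLike, options: LLMContextOptions = {}): string {
  // Defaults mirror the documented ones: "markdown", true, true.
  const { format = "markdown", includeHeaders = true, includeCTAs = true } = options;
  const label = (name: string) => (format === "markdown" ? `**${name}:**` : `${name}:`);
  const parts: string[] = [];

  if (includeHeaders) {
    const from = parsed.sender.name
      ? `${parsed.sender.name} <${parsed.sender.email}>`
      : parsed.sender.email;
    parts.push(
      [
        `${label("From")} ${from}`,
        `${label("Subject")} ${parsed.subject ?? "(none)"}`,
        `${label("Date")} ${parsed.date ?? "(unknown)"}`,
      ].join("\n"),
    );
  }

  // The body is passed through unchanged; this sketch does not convert Markdown to plain text.
  parts.push(parsed.body);

  if (includeCTAs && parsed.ctas.length > 0) {
    const bullet = format === "markdown" ? "- " : "* ";
    parts.push(parsed.ctas.map((cta) => `${bullet}${cta.text}: ${cta.url}`).join("\n"));
  }

  // Blank line between the header block, the body, and the CTA list.
  return parts.join("\n\n");
}

const example: EmailLike = {
  subject: "Invoice ready",
  sender: { name: "Billing", email: "billing@example.com" },
  date: "2024-01-15T09:30:00Z",
  body: "Your invoice is ready.",
  ctas: [{ text: "View invoice", url: "https://example.com/invoice" }],
};

console.log(sketchToLLMContext(example, { format: "plain" }));
```

Turning off both `includeHeaders` and `includeCTAs` in this sketch returns the body alone, which is useful when the prompt already supplies its own framing.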