Quickstart
Install Cell and run your first insurance document extraction
Installation
Install Cell and its peer dependencies:
npm install @claritylabs-inc/cell ai pdf-lib
Then install a model provider:
# Anthropic (default)
npm install @ai-sdk/anthropic
# OpenAI
npm install @ai-sdk/openai
# Google
npm install @ai-sdk/google
Cell is published to GitHub Packages, not the public npm registry. Before running the install commands, add a .npmrc that points the @claritylabs-inc scope to https://npm.pkg.github.com.
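A minimal sketch of such a .npmrc is shown here. The registry line follows directly from the note above. The token line is the usual GitHub Packages convention, and the GITHUB_TOKEN environment variable name is an assumption; use whatever variable holds a token with package read access in your environment.

```ini
# Route the @claritylabs-inc scope to GitHub Packages
@claritylabs-inc:registry=https://npm.pkg.github.com
# Assumed: a personal access token exported as GITHUB_TOKEN
//npm.pkg.github.com/:_authToken=${GITHUB_TOKEN}
```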
Extract a policy
The simplest path uses the default Anthropic models and needs no configuration:
import { classifyDocumentType, extractFromPdf, applyExtracted } from "@claritylabs-inc/cell";
import { readFileSync } from "fs";
// Load a PDF as base64
const pdfBase64 = readFileSync("./policy.pdf").toString("base64");
// Step 1: Classify — is this a policy or a quote?
const { documentType, confidence } = await classifyDocumentType(pdfBase64);
console.log(`Classified as ${documentType} (confidence: ${confidence})`);
// Step 2: Extract — run the full multi-pass pipeline
const { extracted } = await extractFromPdf(pdfBase64);
// Step 3: Apply — map raw extraction to structured fields
const fields = applyExtracted(extracted);
console.log(fields.carrier); // "Hartford"
console.log(fields.policyNumber); // "GL-2024-001234"
console.log(fields.coverages); // [{ name: "General Liability", limit: "$1,000,000", ... }]
Extract a quote
Quotes have a separate pipeline that captures quote-specific fields like subjectivities and premium breakdowns:
import { extractQuoteFromPdf, applyExtractedQuote } from "@claritylabs-inc/cell";
// pdfBase64 is the base64-encoded PDF, loaded as in the policy example
const { extracted } = await extractQuoteFromPdf(pdfBase64);
const fields = applyExtractedQuote(extracted);
console.log(fields.quoteNumber); // "Q-2024-5678"
console.log(fields.premiumBreakdown); // [{ line: "GL", amount: "$5,200" }, ...]
console.log(fields.subjectivities); // [{ description: "Loss runs required", ... }]
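The policy and quote pipelines can be tied together by branching on the classifier result. The following is a minimal sketch, not part of Cell's API: chooseRoute and the "review" fallback are hypothetical helpers. It assumes that documentType comes back as the string "policy" or "quote" and that confidence is a number between 0 and 1, neither of which is confirmed on this page.

```typescript
// Sketch only. Assumptions: documentType is "policy" or "quote",
// and confidence is on a 0-1 scale.
type Route = "policy" | "quote" | "review";

interface Classification {
  documentType: string;
  confidence: number;
}

// Pick a pipeline. Low-confidence or unrecognized documents are routed
// to manual review rather than extracted blindly.
function chooseRoute(result: Classification, minConfidence = 0.8): Route {
  if (result.confidence < minConfidence) return "review";
  if (result.documentType === "policy") return "policy";
  if (result.documentType === "quote") return "quote";
  return "review";
}

// Hypothetical wiring with the functions imported in the examples above:
//   const route = chooseRoute(await classifyDocumentType(pdfBase64));
//   if (route === "policy") {
//     const { extracted } = await extractFromPdf(pdfBase64);
//     const fields = applyExtracted(extracted);
//   } else if (route === "quote") {
//     const { extracted } = await extractQuoteFromPdf(pdfBase64);
//     const fields = applyExtractedQuote(extracted);
//   }
```

Keeping the decision in a small pure function makes the threshold easy to tune and to unit test without calling a model.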
Use a custom model
To use a different provider, create a model with that provider's package and pass it to every pass with createUniformModelConfig:
import { createOpenAI } from "@ai-sdk/openai";
import { extractFromPdf, createUniformModelConfig } from "@claritylabs-inc/cell";
const openai = createOpenAI();
const { extracted } = await extractFromPdf(pdfBase64, {
  models: createUniformModelConfig(openai("gpt-4o")),
  metadataProviderOptions: {}, // disable Anthropic-specific thinking
});
Add logging
Every pipeline function accepts a log callback:
const { extracted } = await extractFromPdf(pdfBase64, {
  log: async (msg) => console.log(`[cell] ${msg}`),
});
Output:
[cell] Pass 1: Extracting metadata...
[cell] Calling model (max 4096 tokens)...
[cell] 12450 in / 2300 out tokens (3.2s)
[cell] Document: 45 page(s)
[cell] Pass 2: Extracting sections pages 1–15...
...
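Because the log callback receives plain strings, it can also be used to collect token usage across a run. The following is a rough sketch under one assumption: usage lines keep the shape shown in the sample output above ("12450 in / 2300 out tokens (3.2s)"). That format is taken from the sample only and may not be stable. parseTokenUsage and createUsageLogger are hypothetical helpers, not Cell exports.

```typescript
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  seconds: number;
}

interface UsageTotals extends TokenUsage {
  calls: number;
}

// Matches lines shaped like "12450 in / 2300 out tokens (3.2s)".
// The shape is assumed from the sample output, not from a documented format.
const USAGE_PATTERN = /(\d+) in \/ (\d+) out tokens \((\d+(?:\.\d+)?)s\)/;

// Returns the parsed usage, or null when the line is not a usage line.
function parseTokenUsage(line: string): TokenUsage | null {
  const match = USAGE_PATTERN.exec(line);
  if (!match) return null;
  return {
    inputTokens: Number(match[1]),
    outputTokens: Number(match[2]),
    seconds: Number(match[3]),
  };
}

// Builds a log callback that accumulates usage totals as messages arrive.
function createUsageLogger(): { log: (msg: string) => Promise<void>; totals: UsageTotals } {
  const totals: UsageTotals = { inputTokens: 0, outputTokens: 0, seconds: 0, calls: 0 };
  const log = async (msg: string): Promise<void> => {
    const usage = parseTokenUsage(msg);
    if (!usage) return; // ignore progress messages
    totals.inputTokens += usage.inputTokens;
    totals.outputTokens += usage.outputTokens;
    totals.seconds += usage.seconds;
    totals.calls += 1;
  };
  return { log, totals };
}

// Hypothetical usage with the pipeline shown above:
//   const usageLogger = createUsageLogger();
//   await extractFromPdf(pdfBase64, { log: usageLogger.log });
//   console.log(usageLogger.totals);
```

Lines that do not match the pattern are ignored, so the same callback can be passed unchanged even if other progress messages change.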