Claim Assistant

AI-powered automation that streamlines insurance claim intake with LLM-based PDF processing, policy matching, and coverage analysis

Client: AmTrust · New York · United States 🇺🇸

Python · FastAPI · OpenAI GPT-5 · Azure Doc Intelligence · Next.js · React · TypeScript

Highlights

Situation

A large insurer processed ~107K claims per year arriving by phone, email, fax, and portal. A team of 30 staff manually re-keyed the data into ASC, with coverage limited to 12 hours per day on weekdays.

Task

Build an automation solution to extract claim data from PDFs, faxes, and handwritten forms, score per-field confidence, and map fields to ASC for 24/7 processing.

Action

Designed a two-stage pipeline pairing Azure Document Intelligence OCR with GPT-5 mapping and validation, plus a review UI with bounding-box highlights and confidence-gated export.

Result

Cut processing to ~1 minute per form with 24/7 pre-processing that eliminates backlog, focusing staff on low-confidence exceptions and reducing data-entry errors.

Core Team

Viacheslav Danilov

R&D Lead

Symfa

Barcelona · Spain 🇪🇸

Anton Makoveev

ML Engineer

Symfa

Prague · Czechia 🇨🇿

Mikhail Vinogradov

Data Scientist

Symfa

Barcelona · Spain 🇪🇸

Overview

Claim Assistant is an AI-powered solution that automates insurance claim intake and processing, transforming a slow, manual workflow into a fast, scalable, and accurate one. It converts filled insurance claim forms — including scanned and handwritten documents — into structured, validated data ready for downstream systems.

The platform pairs specialized document intelligence with LLM-based reasoning. Azure Document Intelligence extracts key-value pairs, layout, and per-field confidence, while OpenAI GPT-5 maps the results to the target schema, resolves field aliases, and generates a coverage summary. A review interface with side-by-side PDF previews and bounding-box highlights lets adjusters verify, edit, and approve fields — with a confidence-driven queue that focuses attention only on the exceptions that need a human.

Data

The solution targets a high-volume claims operation handling roughly 107,000 claims per year (~9,000 per month) arriving through multiple channels:

• ASC System (75%): ~6.8K claims/month from phone (37%), email (45%), and fax (18%) channels.
• External Portal (25%): ~2.5K claims/month submitted directly by insureds through partner portals.

Inputs span PDFs, faxes, and emails — both digital and handwritten — across claim forms from eight US states (Florida, New Hampshire, Minnesota, Iowa, Kansas, New York, Ohio, and Wisconsin). The pipeline is form-agnostic, handling arbitrary layouts without per-form training, and flags any field extracted below 80% confidence for human review.
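The 80% review threshold described above can be illustrated with a minimal sketch. The `ExtractedField` record, the field names, and the sample values are hypothetical and only serve the illustration; the source specifies just the threshold itself.

```python
from dataclasses import dataclass
from typing import List

# From the text: fields extracted below 80% confidence are flagged for review.
REVIEW_THRESHOLD = 0.80


@dataclass
class ExtractedField:
    """Hypothetical record for one extracted form field."""
    name: str
    value: str
    confidence: float  # 0.0-1.0, as reported by the extraction stage


def needs_review(field: ExtractedField) -> bool:
    """True when the field falls strictly below the review threshold."""
    return field.confidence < REVIEW_THRESHOLD


def review_queue(fields: List[ExtractedField]) -> List[ExtractedField]:
    """Low-confidence fields only, least confident first."""
    flagged = [f for f in fields if needs_review(f)]
    return sorted(flagged, key=lambda f: f.confidence)


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = [
        ExtractedField("claimant_name", "Jane Doe", 0.97),
        ExtractedField("date_of_injury", "03/14/2024", 0.62),
        ExtractedField("employer_fein", "12-3456789", 0.79),
    ]
    for f in review_queue(sample):
        print(f"{f.name}: {f.value!r} ({f.confidence:.0%})")
```

Sorting the queue by ascending confidence is a design choice of this sketch, not something the source states; it simply puts the least certain fields in front of the reviewer first.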

Methods

Claim Assistant uses a two-stage pipeline that combines specialized document intelligence with LLM-based mapping and validation:

Stage 1 (Document Intelligence): Azure Document Intelligence (Form Recognizer v3.x) extracts key-value pairs, bounding boxes, layout structure, and per-field confidence scores from each document.
Stage 2 (LLM Mapping & Validation):
• Field Mapping: GPT-5 aligns extracted values to target schema fields while preserving evidence links to the source document.
• Alias Resolution: Normalizes inconsistent labels and synonyms across form variants into a single canonical schema.
• Policy Matching: Maps extracted identifiers to policy records using weighted Levenshtein similarity and date validation.
• Summary Generation: Produces a coverage analysis (covered / not covered / uncertain) with transparent reasoning and confidence metrics.
Review Workflow: A side-by-side review UI highlights each field on the source PDF, supports inline editing and approval, surfaces a dedicated low-confidence queue (<80%), and gates export to ASC until all fields are approved.
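The policy-matching step in Stage 2 could look roughly like the sketch below. It is one possible reading of "weighted Levenshtein similarity and date validation", not the project's actual implementation. It assumes that a normalized edit-distance similarity is computed per field and combined with fixed weights, and that date validation means the loss date must fall inside the policy period. The `Policy` record, the `WEIGHTS` values, and the `min_score` cutoff are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance normalized to [0, 1]; 1.0 means identical after cleanup."""
    a, b = a.strip().upper(), b.strip().upper()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy record."""
    policy_number: str
    insured_name: str
    effective: date
    expiration: date


# Assumed weights: the policy number counts for more than the insured name.
WEIGHTS = {"policy_number": 0.7, "insured_name": 0.3}


def match_policy(policy_number: str, insured_name: str, loss_date: date,
                 policies: Sequence[Policy],
                 min_score: float = 0.85) -> Optional[Policy]:
    """Return the best-scoring policy in force on loss_date, or None."""
    best, best_score = None, 0.0
    for policy in policies:
        # Date validation: skip policies not in force on the loss date.
        if not (policy.effective <= loss_date <= policy.expiration):
            continue
        score = (WEIGHTS["policy_number"]
                 * similarity(policy_number, policy.policy_number)
                 + WEIGHTS["insured_name"]
                 * similarity(insured_name, policy.insured_name))
        if score > best_score:
            best, best_score = policy, score
    return best if best_score >= min_score else None
```

With these assumed weights, a single OCR slip in the policy number (for example a letter "O" read in place of a zero) still leaves the score above the cutoff when the insured name matches, while a loss date outside the policy period rules the candidate out regardless of how well the text matches.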

Results

By shifting intake from full manual data entry to AI-driven pre-processing, the solution delivers measurable operational gains:

• Processing Speed: ~1 minute per form (parallelizable), compared with fully manual data entry for every claim.
• Coverage Hours: 24/7 automated pre-processing replaces 12h/day weekday-only coverage, eliminating overnight and weekend backlog.
• Staff Productivity: The confidence-driven queue lets staff focus only on exceptions and low-confidence cases instead of re-keying every claim.
• Error Rate: Reduced through automated validation and visual bounding-box verification.

Being form-agnostic, the pipeline generalizes across layouts and input quality — from clean digital PDFs to noisy handwritten submissions — without per-form training:

Figure 1. Claim Assistant review interface on a digital Wisconsin form: side-by-side PDF preview with extracted key fields, confidence scores, and per-field approval.

Figure 2. Handwritten New Hampshire form: fields extracted and validated despite free-form handwriting, with low-confidence values surfaced for review.

Figure 3. Digital New Hampshire form: high-confidence extraction across key fields with an auto-generated coverage summary.

Conclusion

Claim Assistant shows how pairing document intelligence with LLM reasoning can turn a manual, bottlenecked claims operation into a scalable, 24/7 pipeline. Its hybrid approach — Azure for accurate extraction, GPT-5 for mapping and validation — combined with a confidence-driven, evidence-linked review workflow, keeps humans in the loop exactly where it matters while automating the rest.

The architecture is ready for enterprise integration. Roadmap items include a document classification pre-filter to route incoming messages (FNOL, billing, misrouted), direct email-body parsing, and full ASC export integration for end-to-end straight-through processing.