Claim Assistant

AI-powered automation that streamlines insurance claim intake with LLM-based PDF processing, policy matching, and coverage analysis

Client: AmTrust · New York · United States 🇺🇸
Python · FastAPI · OpenAI GPT-5 · Azure Doc Intelligence · Next.js · React · TypeScript

Highlights

Situation

A large insurer processed ~107K claims per year across phone, email, fax, and portal, with 30 staff manually re-keying data into ASC and coverage limited to 12 hours a day on weekdays.

Task

Build an automation solution to extract claim data from PDFs, faxes, and handwritten forms, score per-field confidence, and map fields to ASC for 24/7 processing.

Action

Designed a two-stage pipeline pairing Azure Document Intelligence OCR with GPT-5 mapping and validation, plus a review UI with bounding-box highlights and confidence-gated export.

Result

Cut processing to ~1 minute per form with 24/7 pre-processing that eliminates backlog, focusing staff on low-confidence exceptions and reducing data-entry errors.

Core Team

Viacheslav Danilov

R&D Lead

Symfa

Barcelona · Spain 🇪🇸

Anton Makoveev

ML Engineer

Symfa

Prague · Czechia 🇨🇿

Mikhail Vinogradov

Data Scientist

Symfa

Barcelona · Spain 🇪🇸

Overview

Claim Assistant is an AI-powered solution that automates insurance claim intake and processing, transforming a slow, manual workflow into a fast, scalable, and accurate one. It converts completed insurance claim forms, including scanned and handwritten documents, into structured, validated data ready for downstream systems.

The platform pairs specialized document intelligence with LLM-based reasoning. Azure Document Intelligence extracts key-value pairs, layout, and per-field confidence, while OpenAI GPT-5 maps the results to the target schema, resolves field aliases, and generates a coverage summary. A review interface with side-by-side PDF previews and bounding-box highlights lets adjusters verify, edit, and approve fields, while a confidence-driven queue focuses their attention on the exceptions that need a human.

Data

The solution targets a high-volume claims operation handling roughly 107,000 claims per year (~9,000 per month) arriving through multiple channels:

  • ASC System (75%): ~6.8K claims/month from phone (37%), email (45%), and fax (18%) channels.
  • External Portal (25%): ~2.5K claims/month submitted directly by insureds through partner portals.

Inputs span PDFs, faxes, and emails, both digital and handwritten, across claim forms from eight US states (Florida, New Hampshire, Minnesota, Iowa, Kansas, New York, Ohio, and Wisconsin). The pipeline is form-agnostic, handling arbitrary layouts without per-form training, and flags any field extracted below 80% confidence for human review.
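The 80% review rule can be illustrated with a short Python sketch. This is not the production code: the `ExtractedField` type, the `split_by_confidence` helper, and the sample values are hypothetical, and only the 0.80 threshold comes from the description above.

```python
from dataclasses import dataclass

# From the text: fields extracted below 80% confidence go to human review.
REVIEW_THRESHOLD = 0.80


@dataclass
class ExtractedField:
    """Hypothetical container for one extracted form field."""
    name: str
    value: str
    confidence: float  # 0.0-1.0, as reported by the extraction stage


def split_by_confidence(fields, threshold=REVIEW_THRESHOLD):
    """Return (high_confidence, needs_review) lists of fields."""
    high_confidence = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return high_confidence, needs_review


# Illustrative values only, not real claim data.
fields = [
    ExtractedField("policy_number", "WC1234567", 0.97),
    ExtractedField("date_of_loss", "2024-03-14", 0.91),
    ExtractedField("claimant_name", "J. Smlth", 0.62),
]

high_confidence, needs_review = split_by_confidence(fields)
print([f.name for f in needs_review])
```

In this sketch only `claimant_name` lands in the review list; the other two fields clear the threshold.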

Methods

Claim Assistant uses a two-stage pipeline that combines specialized document intelligence with LLM-based mapping and validation:

  • Stage 1 (Document Intelligence): Azure Document Intelligence (Form Recognizer v3.x) extracts key-value pairs, bounding boxes, layout structure, and per-field confidence scores from each document.
  • Stage 2 (LLM Mapping & Validation):
    • Field Mapping: GPT-5 aligns extracted values to target schema fields while preserving evidence links to the source document.
    • Alias Resolution: Normalizes inconsistent labels and synonyms across form variants into a single canonical schema.
    • Policy Matching: Maps extracted identifiers to policy records using weighted Levenshtein similarity and date validation.
    • Summary Generation: Produces a coverage analysis (covered / not covered / uncertain) with transparent reasoning and confidence metrics.
  • Review Workflow: A side-by-side review UI highlights each field on the source PDF, supports inline editing and approval, surfaces a dedicated low-confidence queue (<80%), and gates export to ASC until all fields are approved.
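The policy-matching step can be sketched in Python under stated assumptions. The text does not say how the Levenshtein similarity is weighted, so this sketch assumes a weighted average of per-field similarities. The `Policy` record, the `WEIGHTS` values, the `min_score` cutoff, and the sample data are all hypothetical, and date validation is read here as "the loss date falls inside the policy period".

```python
from dataclasses import dataclass
from datetime import date


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance normalized to 0..1, case-insensitive."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


@dataclass
class Policy:
    """Hypothetical policy record."""
    policy_number: str
    insured_name: str
    effective: date
    expires: date


# Assumed weights: the policy number counts more than the insured name.
WEIGHTS = {"policy_number": 0.7, "insured_name": 0.3}


def score(extracted: dict, policy: Policy) -> float:
    """Weighted average of per-field similarities."""
    return sum(weight * similarity(extracted[key], getattr(policy, key))
               for key, weight in WEIGHTS.items())


def match_policy(extracted, policies, loss_date, min_score=0.8):
    """Return (policy, score) for the best in-force candidate, or None."""
    # Date validation: keep only policies in force on the loss date.
    candidates = [p for p in policies if p.effective <= loss_date <= p.expires]
    if not candidates:
        return None
    best = max(candidates, key=lambda p: score(extracted, p))
    best_score = score(extracted, best)
    return (best, best_score) if best_score >= min_score else None


# Illustrative data only; the OCR output misreads "5" as "S".
policies = [
    Policy("WC1234567", "Acme Roofing LLC", date(2024, 1, 1), date(2024, 12, 31)),
    Policy("WC1234568", "Birch Dental", date(2023, 1, 1), date(2023, 12, 31)),
]
extracted = {"policy_number": "WC1234S67", "insured_name": "Acme Roofing"}
result = match_policy(extracted, policies, loss_date=date(2024, 3, 14))
print(result[0].policy_number if result else "no match")
```

Filtering on dates before scoring keeps an expired policy with a near-identical number from winning on string similarity alone. A real system would likely tune the weights and the cutoff against labeled matches.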

Results

By shifting intake from full manual data entry to AI-driven pre-processing, the solution delivers measurable operational gains:

  • Processing Speed: ~1 minute per form (parallelizable), replacing slower, fully manual data entry.
  • Coverage Hours: 24/7 automated pre-processing replaces 12h/day weekday-only coverage, eliminating overnight and weekend backlog.
  • Staff Productivity: The confidence-driven queue lets staff focus only on exceptions and low-confidence cases instead of re-keying every claim.
  • Error Rate: Reduced through automated validation and visual bounding-box verification.
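As a rough back-of-the-envelope check on these figures, the arithmetic below uses only numbers quoted in this case study (107K claims per year, ~1 minute per form, 12h/day on weekdays versus 24/7). The even spread of claims across the year and the single sequential worker are simplifying assumptions, not measured behavior.

```python
CLAIMS_PER_YEAR = 107_000   # from the text
MINUTES_PER_FORM = 1.0      # from the text (~1 minute per form)

MANUAL_HOURS_PER_WEEK = 12 * 5      # 12 h/day, weekdays only
AUTOMATED_HOURS_PER_WEEK = 24 * 7   # 24/7 pre-processing

coverage_ratio = AUTOMATED_HOURS_PER_WEEK / MANUAL_HOURS_PER_WEEK

# Assumption: claims arrive evenly across all 365 days.
claims_per_day = CLAIMS_PER_YEAR / 365

# Assumption: one worker processing forms strictly one after another.
sequential_hours_per_day = claims_per_day * MINUTES_PER_FORM / 60

print(f"Coverage window: {MANUAL_HOURS_PER_WEEK} -> "
      f"{AUTOMATED_HOURS_PER_WEEK} h/week ({coverage_ratio:.1f}x)")
print(f"Average intake: {claims_per_day:.0f} claims/day")
print(f"Single-worker processing time: {sequential_hours_per_day:.1f} h/day")
```

Under these assumptions the coverage window grows about 2.8x, average intake is about 293 claims a day, and a single sequential worker would need just under five hours a day, which leaves headroom for peaks even before any parallelism.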

Being form-agnostic, the pipeline generalizes across layouts and input quality, from clean digital PDFs to noisy handwritten submissions, without per-form training:

Figure 1. Claim Assistant review interface on a digital Wisconsin form: side-by-side PDF preview with extracted key fields, confidence scores, and per-field approval.
Figure 2. Handwritten New Hampshire form: fields extracted and validated despite free-form handwriting, with low-confidence values surfaced for review.
Figure 3. Digital New Hampshire form: high-confidence extraction across key fields with an auto-generated coverage summary.

Conclusion

Claim Assistant shows how pairing document intelligence with LLM reasoning can turn a manual, bottlenecked claims operation into a scalable, 24/7 pipeline. Its hybrid approach (Azure for accurate extraction, GPT-5 for mapping and validation), combined with a confidence-driven, evidence-linked review workflow, keeps humans in the loop exactly where it matters while automating the rest.

The architecture is ready for enterprise integration. Roadmap items include a document classification pre-filter to route incoming messages (first notice of loss (FNOL), billing, misrouted), direct email-body parsing, and full ASC export integration for end-to-end straight-through processing.