
Sales Pilot

AI-powered lead scoring system that transforms manual lead generation into a streamlined, scalable, and data-driven process

Client: Symfa · Miami · United States 🇺🇸
Python · OpenAI · scikit-learn · DVC · CI/CD · LLM

Highlights

Situation

Manual lead sourcing on freelance platforms like Upwork was slow, unscalable, and lacked targeting accuracy.

Task

Lead the design of an AI-driven scoring system to automate and prioritize high-fit leads.

Action

Built a hybrid scoring engine using explainable heuristics and OpenAI embeddings to rank jobs, contacts, and companies.

Result

Reduced lead sourcing time from days to minutes and improved outreach accuracy through high-quality lead filtering.

Core Team

Viacheslav Danilov

R&D Lead

Symfa

Barcelona · Spain 🇪🇸

Anton Makoveev

ML Engineer

Symfa

Prague · Czechia 🇨🇿

Mikhail Vinogradov

Data Scientist

Symfa

Barcelona · Spain 🇪🇸

Aleksandr Nasstrom

Senior JS Developer

Symfa

Barcelona · Spain 🇪🇸

Rita Tretyakevich

Business Analyst

Symfa

Warsaw · Poland 🇵🇱

Vitali Yurkevich

Product Owner

Symfa

Miami · United States 🇺🇸

Overview

Sales Pilot is an AI-powered system that transforms manual lead generation into a streamlined, scalable, and data-driven process. Designed to target freelance and consulting opportunities, primarily on Upwork and LinkedIn, the system identifies promising job postings, assesses the publishing contacts, and evaluates associated companies to prioritize high-quality leads.

The innovation lies in its component-based scoring architecture. It evaluates each lead across three pillars – job description, contact profile, and company fit – using both deterministic heuristics and embedding-based semantic models. The final result is a dynamic lead scoring engine that automates decision-making, allowing sales and growth teams to focus on outreach with the highest potential for success. This not only accelerates the workflow but also significantly enhances targeting accuracy and lead quality.

Data

The system integrates and processes diverse datasets from several public and third-party platforms:

  • Job Data: Pulled from Upwork's public API, including metadata like job descriptions, client spending history, and ratings.
  • Company Data: Enriched via Apollo and other business intelligence sources, including industry, size, location, and revenue signals.
  • Contact Data: Retrieved via LinkedIn scraping and profile parsing.

The data is stored in a dedicated MongoDB database, normalized using a custom feature labeling scheme (0–4 scale) to ensure consistency across structured and unstructured fields. Features requiring semantic interpretation (e.g., job description, contact role) are further processed using OpenAI's text-embedding-3-large vectors.
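As an illustration of the 0–4 labeling scheme, the sketch below maps raw structured values onto the scale. The source does not publish its labeling rules, so the bin edges, the `company size` example, and the helper names `label_numeric` and `label_boolean` are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical bin edges for a "company size" feature (employee count).
# The source only states that features are labeled on a 0-4 scale;
# these thresholds are illustrative assumptions, not the project's values.
COMPANY_SIZE_EDGES = [10, 50, 200, 1000]


def label_numeric(value, edges):
    """Map a raw numeric value to an integer label in 0..len(edges).

    With four ascending edges this yields labels 0, 1, 2, 3 or 4.
    """
    return bisect_right(edges, value)


def label_boolean(flag):
    """Map a yes/no field (e.g. payment verification) to the ends of the scale.

    Using 0 and 4 for False and True is an assumption made for this sketch.
    """
    return 4 if flag else 0


small = label_numeric(5, COMPANY_SIZE_EDGES)      # below the first edge
large = label_numeric(5000, COMPANY_SIZE_EDGES)   # above the last edge
verified = label_boolean(True)
```

Putting every field on the same integer scale is what lets structured and unstructured features be combined later with simple weighted formulas.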

Methods

The project employs a hybrid evaluation model combining explainable heuristic-based logic with semantic vector scoring. Key techniques include:

  • Heuristic-based scoring: Deterministic scoring of structured data fields (e.g., location, university, Upwork payment verification) with weighted formulas across components.
  • Component-based lead modeling:
    • Job: Assessed by employer rating, money spent, description quality, and payment verification.
    • Contact: Evaluated by seniority (title), location, education, and past employment.
    • Company: Scored by size, revenue, industry alignment, and geography.
  • OpenAI embedding class assignment: For unstructured text fields, the system embeds input into 3072-D space and assigns labels based on proximity to pre-labeled centroids (0–4 score classes).
  • Weighted aggregation: Final lead score = 0.10 × Job + 0.40 × Contact + 0.50 × Company, so company and contact fit outweigh the individual job posting. This weighting tailors relevance filtering to different business priorities.
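The weighted aggregation above can be sketched in a few lines. The component weights (0.10, 0.40, 0.50) come from the text; the equal per-feature weights inside the Job component and the function names are assumptions, since the source lists the features but not how they are weighted.

```python
# Component weights as stated in the text.
COMPONENT_WEIGHTS = {"job": 0.10, "contact": 0.40, "company": 0.50}

# Hypothetical per-feature weights inside the Job component (assumption:
# the source names these features but does not give their weights).
JOB_FEATURE_WEIGHTS = {
    "employer_rating": 0.25,
    "money_spent": 0.25,
    "description_quality": 0.25,
    "payment_verified": 0.25,
}


def weighted_average(labels, weights):
    """Weighted average of 0-4 labels; weights are normalized by their sum."""
    total = sum(weights.values())
    return sum(weights[name] * labels[name] for name in weights) / total


def lead_score(job, contact, company):
    """Combine the three component scores into one 0-4 lead score."""
    parts = {"job": job, "contact": contact, "company": company}
    return weighted_average(parts, COMPONENT_WEIGHTS)


job_labels = {
    "employer_rating": 4,
    "money_spent": 3,
    "description_quality": 2,
    "payment_verified": 4,
}
job = weighted_average(job_labels, JOB_FEATURE_WEIGHTS)  # about 3.25
score = lead_score(job, contact=2.0, company=4.0)        # about 3.125
```

Because every input is on the 0–4 scale and the weights sum to 1, the final score stays on the same scale, which keeps each component's contribution easy to explain.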
Figure 1. Sales Pilot workflow showing the component-based lead scoring architecture.
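The embedding class assignment described under Methods amounts to nearest-centroid classification. The sketch below uses toy 3-dimensional vectors in place of real 3072-dimensional embeddings and makes no API calls. Cosine similarity is an assumption, as the source says only "proximity", and the function names and centroid values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def assign_label(embedding, centroids):
    """Return the label whose centroid is most similar to the embedding."""
    return max(
        centroids,
        key=lambda label: cosine_similarity(embedding, centroids[label]),
    )


# Toy centroids, one per score class 0-4 (illustrative values only).
# In practice each would be the mean embedding of pre-labeled examples.
CENTROIDS = {
    0: [1.0, 0.0, 0.0],
    1: [0.7, 0.7, 0.0],
    2: [0.0, 1.0, 0.0],
    3: [0.0, 0.7, 0.7],
    4: [0.0, 0.0, 1.0],
}

high_fit = assign_label([0.1, 0.2, 0.9], CENTROIDS)  # closest to class 4
low_fit = assign_label([0.9, 0.1, 0.0], CENTROIDS)   # closest to class 0
```

Comparing against a handful of labeled centroids keeps the semantic step inspectable: each assigned label can be traced back to the examples that defined its class.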

Results

The system enables proactive engagement with high-quality prospects, yielding better conversion rates and streamlined workflows:

  • Lead Filtering Accuracy: Only leads scoring above 2.5 (on a 0–4 scale) are passed to outreach tools, ensuring high-fit contacts.
  • Efficiency Gains: Reduced manual lead sourcing time from hours/days to minutes, automating the full funnel from job discovery to contact personalization.
  • Semantic Robustness: Embedding-based scoring proved effective across varied job descriptions and role titles, achieving consistent categorization even with diverse phrasing.
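The 2.5 cut-off mentioned above could be applied as a simple filter before leads reach outreach tools. This is a minimal sketch: the lead record shape (`id`, `score`) and the best-first ordering are assumptions, while the threshold comes from the text.

```python
THRESHOLD = 2.5  # cut-off stated in the text (0-4 scale)


def select_for_outreach(leads, threshold=THRESHOLD):
    """Keep leads scoring strictly above the threshold, highest score first.

    Sorting best-first is an assumption of this sketch; the source only
    states that leads above 2.5 are passed on.
    """
    kept = [lead for lead in leads if lead["score"] > threshold]
    return sorted(kept, key=lambda lead: lead["score"], reverse=True)


# Hypothetical scored leads.
leads = [
    {"id": "a", "score": 3.1},
    {"id": "b", "score": 2.5},  # exactly at the threshold, so excluded
    {"id": "c", "score": 1.9},
    {"id": "d", "score": 3.6},
]
selected_ids = [lead["id"] for lead in select_for_outreach(leads)]
```

Here a lead scoring exactly 2.5 is dropped, following the "above 2.5" wording in the text.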

Conclusion

Sales Pilot demonstrates how AI and ML can transform outbound lead generation from a time-intensive, manual task into a strategic and scalable advantage. By integrating heuristic scoring with embedding-powered NLP, the system identifies the most promising opportunities for outreach, tailored to different operational strategies.

Future enhancements could include continuous retraining of embeddings with feedback loops, expansion to other platforms beyond Upwork and LinkedIn, and integration with CRM systems for fully automated outreach workflows.