Sales Pilot
AI-powered lead scoring system that transforms manual lead generation into a streamlined, scalable, and data-driven process
Highlights
Manual lead sourcing on freelance platforms like Upwork was slow, unscalable, and lacked targeting accuracy.
Led the design of an AI-driven scoring system to automate lead sourcing and prioritize high-fit leads.
Built a hybrid scoring engine using explainable heuristics and OpenAI embeddings to rank jobs, contacts, and companies.
Reduced lead sourcing time from days to minutes and improved outreach accuracy through high-quality lead filtering.
Overview
Sales Pilot is an AI-powered system that transforms manual lead generation into a streamlined, scalable, and data-driven process. Designed to target freelance and consulting opportunities, primarily on Upwork and LinkedIn, the system identifies promising job postings, assesses the publishing contacts, and evaluates associated companies to prioritize high-quality leads.
The innovation lies in its component-based scoring architecture. It evaluates each lead across three pillars (job description, contact profile, and company fit) using both deterministic heuristics and embedding-based semantic models. The final result is a dynamic lead scoring engine that automates decision-making, allowing sales and growth teams to focus on outreach with the highest potential for success. This not only accelerates the workflow but also significantly enhances targeting accuracy and lead quality.
Data
The system integrates and processes diverse datasets from several public and third-party platforms:
- Job Data: Pulled from Upwork's public API, including metadata like job descriptions, client spending history, and ratings.
- Company Data: Enriched via Apollo and other business intelligence sources, including industry, size, location, and revenue signals.
- Contact Data: Retrieved via LinkedIn scraping and profile parsing.
The data is stored in a dedicated MongoDB database, normalized using a custom feature labeling scheme (0–4 scale) to ensure consistency across structured and unstructured fields. Features requiring semantic interpretation (e.g., job description, contact role) are further processed using OpenAI's text-embedding-3-large vectors.
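As an illustration of the 0–4 labeling scheme, the sketch below maps two structured job fields onto that scale. The cut-off values, field names, and helper functions are hypothetical and chosen only for the example; the project's actual labeling rules are not described here.

```python
from bisect import bisect_right

# Hypothetical cut-offs (USD) for mapping a client's total spend to a 0-4 label.
# These thresholds are illustrative assumptions, not values from the project.
MONEY_SPENT_CUTOFFS = [1_000, 10_000, 50_000, 100_000]


def label_money_spent(total_spent_usd: float) -> int:
    """Return a 0-4 label: 0 below the first cut-off, 4 at or above the last."""
    return bisect_right(MONEY_SPENT_CUTOFFS, total_spent_usd)


def label_payment_verified(verified: bool) -> int:
    """Map a binary field to the two ends of the scale (an assumed convention)."""
    return 4 if verified else 0


# A raw record with assumed field names, normalized to 0-4 labels.
raw_job = {"total_spent_usd": 25_000, "payment_verified": True}
normalized_job = {
    "money_spent": label_money_spent(raw_job["total_spent_usd"]),
    "payment_verified": label_payment_verified(raw_job["payment_verified"]),
}
print(normalized_job)  # {'money_spent': 2, 'payment_verified': 4}
```

Putting every field on the same scale is what lets structured fields and embedding-derived labels be combined later with simple weights.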
Methods
The project employs a hybrid evaluation model combining explainable heuristic-based logic with semantic vector scoring. Key techniques include:
- Heuristic-based scoring: Deterministic scoring of structured data fields (e.g., location, university, Upwork payment verification) with weighted formulas across components.
- Component-based lead modeling:
  - Job: Assessed by employer rating, money spent, description quality, and payment verification.
  - Contact: Evaluated by seniority (title), location, education, and past employment.
  - Company: Scored by size, revenue, industry alignment, and geography.
- OpenAI embedding class assignment: For unstructured text fields, the system embeds input into 3072-dimensional space and assigns labels based on proximity to pre-labeled centroids (0–4 score classes).
- Weighted aggregation: Final lead score = 0.10 × Job + 0.40 × Contact + 0.50 × Company. This ensures relevance filtering tailored to different business priorities.
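The last two techniques can be sketched in a few lines. The example below assigns a class by nearest centroid and then applies the weighted aggregation formula. It makes several assumptions: cosine similarity stands in for "proximity" (the metric is not specified above), toy 3-D vectors replace the 3072-dimensional embeddings, only three of the five classes are shown, and the function names and component scores are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def assign_class(embedding, centroids):
    """Return the label of the centroid most similar to the embedding."""
    return max(centroids, key=lambda label: cosine_similarity(embedding, centroids[label]))


# Toy centroids keyed by score class (assumed values; real ones would be
# derived from pre-labeled examples embedded in 3072 dimensions).
TOY_CENTROIDS = {
    0: [1.0, 0.0, 0.0],
    2: [0.0, 1.0, 0.0],
    4: [0.0, 0.0, 1.0],
}

# Component weights taken from the formula in the text.
WEIGHTS = {"job": 0.10, "contact": 0.40, "company": 0.50}


def lead_score(job, contact, company):
    """Weighted sum of the three component scores, each on the 0-4 scale."""
    return (WEIGHTS["job"] * job
            + WEIGHTS["contact"] * contact
            + WEIGHTS["company"] * company)


title_embedding = [0.1, 0.2, 0.9]  # stand-in for an embedded contact title
contact_label = assign_class(title_embedding, TOY_CENTROIDS)
score = lead_score(job=3, contact=contact_label, company=2)
print(contact_label, round(score, 2))  # 4 2.9
```

Because the weights sum to 1.0, the final score stays on the same 0–4 scale as its components.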

Results
The system enables proactive engagement with high-quality prospects, yielding better conversion rates and streamlined workflows:
- Lead Filtering Accuracy: Only leads scoring above 2.5 (on a 0–4 scale) are passed to outreach tools, ensuring high-fit contacts.
- Efficiency Gains: Reduced manual lead sourcing time from hours/days to minutes, automating the full funnel from job discovery to contact personalization.
- Semantic Robustness: Embedding-based scoring proved effective across varied job descriptions and role titles, achieving consistent categorization even with diverse phrasing.
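The 2.5 cut-off amounts to a simple filter over already-scored leads. The sketch below is one possible form; the record layout, the `select_for_outreach` name, and the descending sort are assumptions made for illustration, and the scores are made-up values.

```python
OUTREACH_THRESHOLD = 2.5  # from the text: only leads scoring above 2.5 pass


def select_for_outreach(scored_leads, threshold=OUTREACH_THRESHOLD):
    """Keep leads strictly above the threshold, highest score first."""
    passing = [lead for lead in scored_leads if lead["score"] > threshold]
    return sorted(passing, key=lambda lead: lead["score"], reverse=True)


# Illustrative records; a lead at exactly 2.5 does not pass.
leads = [
    {"id": "lead-a", "score": 3.4},
    {"id": "lead-b", "score": 2.5},
    {"id": "lead-c", "score": 1.8},
    {"id": "lead-d", "score": 2.9},
]
selected = select_for_outreach(leads)
print([lead["id"] for lead in selected])  # ['lead-a', 'lead-d']
```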
Conclusion
Sales Pilot demonstrates how AI and ML can transform outbound lead generation from a time-intensive, manual task into a strategic and scalable advantage. By integrating heuristic scoring with embedding-powered NLP, the system identifies the most promising opportunities for outreach, tailored to different operational strategies.
Future enhancements could include continuous retraining of embeddings with feedback loops, expansion to other platforms beyond Upwork and LinkedIn, and integration with CRM systems for fully automated outreach workflows.





