SourceKit is an evidence-first sourcing system for technical hiring. Start with a role or JD; SourceKit turns it into a discovery plan, verifies signal across GitHub and market context, and returns a ranked pipeline that is ready for outreach.
Pick one input path, get ranked candidates, and take one clear first action.
Role + Company (fastest)
Input
`Staff Backend Engineer` + target company context.
Expected output
Repo set, shortlist of contributor-led candidates, score distribution.
First action
Edit top repos before running outreach.
Full JD
Input
Paste full JD with stack, seniority, and constraints.
Expected output
Criteria draft + market-adjacent discovery + stronger filtering.
First action
Convert top criteria to binary EEA checks.
ATS link
Input
Paste Lever/Greenhouse/Ashby link.
Expected output
Autoparsed scope, ranked candidates, Webset-ready criteria.
First action
Promote durable searches into a weekly Webset.
If strategy output has no usable EEA criteria, SourceKit seeds 3-5 draft checks you can edit before creating a Webset. This keeps every search evidence-first by default.
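To make that concrete, seeded draft checks could look like the sketch below. The shape is an assumption for illustration only: the field names and the `is_binary` helper are hypothetical, not SourceKit's actual schema. Each check names a claim and a public evidence source, and fails fast on soft language.

```python
# Hypothetical shape for seeded draft EEA checks -- field names are
# assumptions for illustration, not SourceKit's real schema.
draft_eea_checks = [
    {"id": "commits_50", "claim": "50+ meaningful commits to a target repo",
     "evidence": "GitHub commit history"},
    {"id": "owner_signal", "claim": "maintainer/reviewer/RFC ownership signal",
     "evidence": "repo permissions and review activity"},
    {"id": "prod_scale", "claim": "production-scale system evidence",
     "evidence": "release notes, talks, or writeups"},
]

def is_binary(check: dict) -> bool:
    """A usable check is pass/fail against public evidence: it names an
    evidence source and avoids soft language."""
    soft = {"strong", "solid", "great", "good"}
    return bool(check.get("evidence")) and not soft & set(check["claim"].lower().split())

assert all(is_binary(c) for c in draft_eea_checks)
```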
Value: Faster setup
Operators start from concrete criteria in every run instead of writing EEA checks from scratch.
Value: Lower noise
Binary evidence criteria reduce false positives before scoring and outreach effort accumulates.
Value: Better spend
Verification-first criteria keep enrichment and outreach spend focused on candidates with proof.
Why teams use SourceKit instead of title search and static LinkedIn filters.
Hidden gem rate
~40%
Top candidates often have limited profile visibility. Artifact-led discovery surfaces strong builders before they become heavily recruited.
Pipeline behavior
Always on
Websets convert one strong search into a persistent, auto-updating candidate stream with verified entrants.
Screening logic
Binary proof
Criteria can be framed as pass/fail against public evidence, reducing soft interpretation and resume-style noise.
The operating flow from intake to pipeline. Each stage covers what to do, what the system does, and what success looks like.
Specificity at intake is the highest leverage quality control for the entire run.
Repo and criteria edits are more impactful than downstream reranking.
This stage finds hidden builders without relaxing criteria quality.
Use thresholds to keep review load stable across high-volume searches (sketched after this list).
Batch movement keeps throughput predictable while maintaining quality gates.
Websets are highest ROI when role definitions stay stable for multiple weeks.
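One way to read the threshold note above: instead of a fixed score bar, set the bar per run so the review queue stays a fixed size. The quantile-style cut below is an assumption about policy, not documented SourceKit behavior.

```python
import random

def review_bar(scores: list[float], max_reviews: int) -> float:
    """Lowest score bar that keeps the review queue at or under max_reviews
    (ties at the bar can nudge it slightly over)."""
    if len(scores) <= max_reviews:
        return min(scores, default=0.0)
    return sorted(scores, reverse=True)[max_reviews - 1]

scores = [random.uniform(60, 95) for _ in range(500)]   # simulated Builder Scores
bar = review_bar(scores, max_reviews=40)                # team reviews 40/week
print(f"bar={bar:.1f} -> {sum(s >= bar for s in scores)} candidates to review")
```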
Core capabilities and where they create leverage in the workflow.

Operational habits that improve quality and reduce wasted effort.
Do this
Specific role language drives better repo targeting and less scoring noise.
Repo quality is the biggest upstream lever for candidate quality.
Use objective proof signals before enrichment and outreach.
Let the best searches compound through weekly or daily monitoring.
Avoid this
Vague criteria: phrases like "strong engineer" inflate false positives.
Title-first filtering: front-loading title filters reintroduces profile bias and misses builders.
Enriching before verifying: verify first, enrich survivors second to protect spend and quality.
Staying in saturated ecosystems: use adjacency signals to reach less crowded ones.
Practical examples you can reuse. Keep criteria verifiable from public artifacts and keep Websets strict at admission.
Sample EEA criteria by role
Webset operating playbook
Use 3-5 binary checks. Avoid soft language like "strong" or "solid."
Run criteria filters before adding contact or publication enrichments.
Daily is best for urgent hiring; weekly is cleaner for most durable roles.
If false positives rise, tighten criteria before scaling outreach volume.
Keep the criteria version with each cohort to learn which definition works (pipeline sketched below).
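The ordering rules above (filter before enrichment, version criteria per cohort) compose into one pipeline shape. A minimal sketch: `enrich` is a stand-in stub, the version label is arbitrary, and none of this is a SourceKit API.

```python
CRITERIA_VERSION = "staff-backend-v3"   # arbitrary label; any scheme works

def enrich(cand: dict) -> dict:
    """Stand-in for paid enrichment (email, current company, GitHub stats)."""
    return {**cand, "email": f"{cand['login']}@example.com"}

def run_cohort(candidates, checks, min_passed=2):
    """Verify first, enrich survivors second; tag each survivor with the
    criteria version so cohorts stay comparable."""
    cohort = []
    for cand in candidates:
        passed = sum(check(cand) for check in checks)
        if passed >= min_passed:        # cheap binary filter before paid steps
            cohort.append(enrich(cand) | {
                "criteria_version": CRITERIA_VERSION,
                "checks_passed": passed,
            })
    return cohort

checks = [
    lambda c: c["commits"] >= 50,
    lambda c: c["is_maintainer"],
    lambda c: c["prod_evidence"],
]
pool = [{"login": "alice", "commits": 120, "is_maintainer": True, "prod_evidence": False}]
print(run_cohort(pool, checks))   # alice passes 2/3 checks and gets enriched
```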
Example patterns teams can run immediately.
Search setup
Seed with model infra repos (`vllm`, `transformers`, `triton`) and narrow to startup/early-team surfaces.
EEA criteria
Top contributor rank, shipped production inference/training system, and paper/talk signal in relevant venues.
Expected output
High-signal shortlist of builders with maintainer velocity and low profile saturation.
Target repos
10-20
Score bar
85+
Webset cadence
Weekly
Search setup
Target distributed systems repos plus customer deployment indicators and implementation depth constraints.
EEA criteria
Production ownership proof, systems reliability changes, and evidence of customer-facing technical delivery.
Expected output
Candidates with both backend depth and field execution signal, not platform-only profiles.
Primary stack
Go + Python
Score bar
82+
Pipeline stage
Contact fast
Search setup
Focus on framework core repos, perf tooling ecosystems, and maintainership markers over title matching.
EEA criteria
Core contribution to framework/tooling, performance ownership, and cross-team DX impact evidence.
Expected output
Platform-minded ICs who improve system-level frontend velocity across teams.
Signal type
Maintainer
Score bar
80+
Key proof
Tooling commits
Search setup
Define criteria around CVE discovery, advisories, and security-centric repos with active remediation work.
EEA criteria
CVE or advisory contribution, sustained security commits, and public research artifact (talk or writeup).
Expected output
Candidates with proof of practical offensive/defensive capability and product-grade security ownership.
Signal source
CVE + commits
Score bar
83+
Review mode
Strict verify
Concrete repo targets, criteria thresholds, and expected pipeline output to calibrate your first runs.
Repo targets
`vllm`, `transformers`, `triton`, `deepspeed`, `llama.cpp`, `ray`.
Score threshold
Builder Score >= 85 and at least 2 EEA criteria met (gate sketched after these examples).
Bad criteria
"Strong ML engineer, startup mindset, good communicator."
Better criteria
Top-10 contributor rank OR production inference ownership + public technical artifact.
Repo targets
`kubernetes`, `temporal`, `envoy`, `vitess`, `cockroachdb`, `grpc`.
Score threshold
Builder Score >= 82 with production ownership proof.
Bad criteria
"Great backend developer from top company."
Better criteria
50+ commits + maintainer/reviewer signal + reliability/latency or scaling evidence.
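Both calibration examples read as the same compound gate: a numeric score bar plus either a minimum count of EEA criteria met or one required criterion. A hedged sketch of that gate; field names like `criteria_met` and `builder_score` are assumptions for illustration.

```python
def admit(candidate: dict, score_bar: float,
          min_criteria: int = 2, required: str | None = None) -> bool:
    """Score bar AND criteria-count gate; `required` optionally names one
    must-pass check (e.g. production ownership)."""
    met = candidate["criteria_met"]              # set of check ids that passed
    if candidate["builder_score"] < score_bar:
        return False
    if required is not None and required not in met:
        return False
    return len(met) >= min_criteria

# ML infra bar: score >= 85 and at least 2 EEA criteria met
print(admit({"builder_score": 88,
             "criteria_met": {"top10_contributor", "public_artifact"}},
            score_bar=85))                                             # True
# Backend bar: score >= 82 with production ownership proof
print(admit({"builder_score": 84, "criteria_met": {"prod_ownership"}},
            score_bar=82, min_criteria=1, required="prod_ownership"))  # True
```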
Set expectations early so teams use SourceKit where it performs best.
Copy-ready prompts for new searches and Webset setup.
Role: Staff Backend Engineer (distributed systems)
Primary work: high-throughput APIs and workflow orchestration
Must-have evidence:
- 50+ meaningful commits to relevant repos
- ownership signal (maintainer/reviewer/RFC)
- production-scale system evidence
Target company surfaces: Infra-heavy startups + OSS-adjacent teams
Build a weekly Webset for Founding ML Engineers.
Admit only if candidate matches at least 2/3:
1) top contributor to frontier ML repo
2) shipped production ML system
3) publication/talk evidence in relevant venues
Add enrichments: email, current company, GitHub stats.
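The Webset prompt above implies a weekly loop: admit on at least 2 of 3 checks, enrich only admitted entrants, and dedupe against prior weeks. A sketch with placeholder callables (`fetch_new`, `checks`, and `enrich` are yours to supply); this is not a SourceKit API.

```python
import datetime

def weekly_webset_run(fetch_new, checks, enrich, seen: set) -> list[dict]:
    """One Webset tick: pull fresh candidates, admit on >= 2/3 checks,
    enrich only admitted entrants, dedupe against earlier cohorts."""
    admitted = []
    for cand in fetch_new():
        if cand["login"] in seen:       # already in the stream
            continue
        seen.add(cand["login"])
        if sum(check(cand) for check in checks) >= 2:   # at least 2 of 3
            cand = enrich(cand)         # email, current company, GitHub stats
            cand["admitted_at"] = datetime.date.today().isoformat()
            admitted.append(cand)
    return admitted
```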
Take one action to validate the workflow with your current role.
Start with one role, tighten criteria after first results, and convert the winning search into a weekly Webset.
Run your first search