Dallas Crilley
Work
CRM data engineering Shipped

Manifest

Every record accounted for on the way into one identity layer.

A PostgreSQL-staged pipeline that ingests Airtable client records, deduplicates them across four signal paths, and resolves them into a single master-contact identity layer.

language
Python 3.11, PostgreSQL (JSONB, generated columns, views)
pipeline
JSONB ingest, 10 typed SQL views, linking job
dedup
4 signal paths (primary email, billing email, proposal ID, external IDs)
validated
792 records, 27 fields field-by-field
repo
Private (shares a repo with Throughline)

The problem

A PR company kept its operational CRM in two Airtable tables full of cross-system IDs, disconnected from its analytical database, with inconsistent emails and duplicate records across tables. Manifest unified them into one contact identity layer with verified cross-links.

How it is built

  1. Staged JSONB ingest with typed views

    Raw Airtable records land as JSONB, then ten SQL views extract and type the fields server-side. Ingest stays schema-agnostic while downstream consumers work with typed columns. A validation run confirmed all 27 documented field extractions were correct against the Airtable schema.

  2. Four-signal deduplication into a master contact

    A candidate view unions four independent match paths, so a record links if any one resolves: normalized primary email, billing email, PandaDoc proposal ID, or external system IDs. Matches set role flags (billing contact, contract signer) with priority scores, and unmatched rows surface as candidates instead of being dropped.

  3. Dry-run before it writes

    The linking job has a --dry-run mode that reports every candidate and proposed mutation without touching master_contacts, so an operator can see exactly what a new Airtable sync would change before committing it.

By the numbers

792 records resolved into lifecycle views
27 Airtable fields validated field-by-field
10 typed SQL enrichment views
4 deduplication signal paths

Where this honestly stands

Shipped: the Airtable staging and identity-linking path. A HubSpot destination was scoped but never built, so this is framed as CRM identity resolution, not a HubSpot migration.

Want the parts that are not in a public repo? I will walk you through the architecture and the decisions on a call.