Jan 9, 2025 - Michael

Work Log

CRM Contact Database Setup

Created canonical contact database schema (crm/docs/contact-schema.md)
Set up canonical CSV file (crm/contacts.csv) with comprehensive field definitions
Imported 20 behavioral health CEO contacts from Apollo enriched results
Tagged all imports with behavioral-health-ceo for campaign segmentation

Import Workflow Documentation

Documented import process in crm/docs/import-workflow.md
Established conversational import pattern (no scripts needed at current scale)
Validated deduplication logic (email as primary key)
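
A minimal sketch of what "email as primary key" could look like if the conversational import is ever scripted. The function names and the merge policy (existing non-empty values win, blanks are filled from the incoming row) are assumptions for illustration, not something the workflow doc specifies.

```python
def normalize_email(raw):
    """Lowercase and trim an email so trivially different spellings match."""
    return (raw or "").strip().lower()


def merge_contacts(existing, incoming):
    """Merge incoming rows into existing rows, keyed by normalized email.

    Returns (merged_rows, skipped_rows). Rows without an email are skipped
    because email is the primary key. For duplicates, existing non-empty
    values are kept and only blank fields are filled (assumed policy).
    """
    by_email = {normalize_email(r["email"]): dict(r) for r in existing}
    skipped = []
    for row in incoming:
        key = normalize_email(row.get("email"))
        if not key:
            skipped.append(row)
            continue
        if key in by_email:
            current = by_email[key]
            for field, value in row.items():
                if not current.get(field):
                    current[field] = value
        else:
            by_email[key] = dict(row)
    return list(by_email.values()), skipped


# Made-up rows for illustration only.
existing = [{"email": "Ana@Example.com", "first_name": "Ana", "title": ""}]
incoming = [
    {"email": " ana@example.com ", "first_name": "Ana", "title": "CEO"},
    {"email": "", "first_name": "Sam", "title": ""},
    {"email": "bo@example.com", "first_name": "Bo", "title": "CEO"},
]
merged, skipped = merge_contacts(existing, incoming)
```

Here the second "Ana" row is treated as a duplicate despite the different casing and whitespace, and the row with no email is set aside rather than imported.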

Scope Clarification & Timeline Adjustment

Identified engagement tracking gap: no easy way to sync Apollo tracking data (opens, clicks) back to contacts.csv
Decision: Reduce scope to realistic MVP rather than rush incomplete automation
Timeline push required: Monday deadline not feasible for proper implementation
Advocating for: Push timeline, build it right, deliver more value

Thinking/Decisions

CRM Scope Reduction: Contacts Database vs. Campaign Execution

The Core Problem:

Need engagement tracking (opens, clicks, replies) to manage pipeline effectively
Apollo has this tracking data
No easy way to programmatically sync Apollo's tracking data back to contacts.csv
Building full automation requires Apollo API integration + webhook/polling system for tracking sync
That's 2-3 days of focused work to do properly

Original Scope (Too Ambitious for Monday):

✅ Contacts.csv as source of truth
✅ Import from multiple sources
❌ Campaign execution from contacts.csv
❌ Engagement tracking sync from Apollo
❌ Automated pipeline management with tracking data

Revised Scope (Realistic MVP):

What contacts.csv does:

Canonical list of all contacts (source of truth for WHO to contact)
Import from multiple sources (Apollo, conferences, Excel)
Deduplication (email as primary key)
Basic segmentation (tags, stage, priority)
Manual updates for lifecycle changes (prospect → lead → relationship)

What Apollo does:

Campaign execution (sequences, one-off emails)
Engagement tracking (opens, clicks, replies)
Operational tool for outreach

Bridge between them:

Export segments from contacts.csv → Import to Apollo for campaigns
When people respond, manually update contacts.csv stage/status
Periodic sync (not real-time) acceptable for MVP
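
A rough sketch of the export half of the bridge: filter contacts.csv by tag and write a smaller CSV for upload. It assumes tags live in a `tags` column separated by semicolons and that the five core contact fields are enough for the upload. Both are assumptions: the tag storage format is not settled here, and the columns Apollo expects on import are not checked.

```python
import csv
import io

# Assumed export columns: the core contact fields from the schema.
EXPORT_FIELDS = ["email", "first_name", "last_name", "title", "company"]


def select_segment(rows, tag):
    """Return rows whose semicolon-separated `tags` field contains `tag`."""
    selected = []
    for row in rows:
        tags = {t.strip() for t in (row.get("tags") or "").split(";")}
        if tag in tags:
            selected.append(row)
    return selected


def write_segment(rows, out_file):
    """Write only the export columns, ignoring tags, stage, and other fields."""
    writer = csv.DictWriter(out_file, fieldnames=EXPORT_FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# In-memory stand-in for crm/contacts.csv with made-up rows.
contacts_csv = io.StringIO(
    "email,first_name,last_name,title,company,tags,stage\n"
    "ana@example.com,Ana,Lopez,CEO,Example Health,behavioral-health-ceo,prospect\n"
    "bo@example.com,Bo,Chen,CFO,Sample Plan,health-plan;finance,lead\n"
)
rows = list(csv.DictReader(contacts_csv))
segment = select_segment(rows, "behavioral-health-ceo")

out = io.StringIO()
write_segment(segment, out)
export_lines = out.getvalue().splitlines()
```

The return path (updating stage/status when people respond) stays manual, as described above.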

Timeline Decision:

Original deadline: Monday (Jan 13)
Reality: Proper implementation needs more time
Recommendation: Push timeline, build the right foundation
This is taking longer than expected, but it's more ambitious and more valuable
Better to advocate for doing it right than shipping something incomplete

What "Done" Looks Like for MVP:

Contacts.csv populated with all existing contacts from Apollo
Import workflow established and documented
Clear boundaries: contacts.csv = source of truth, Apollo = execution layer
Manual bridge process documented (how to sync between them)
Foundation ready for future automation when we have time

Next Action Required:

Communicate timeline push to co-founder
Present trade-off: "Full automation needs more time vs. MVP ready sooner"
Get alignment on reduced scope

Data Complexity Issue: Account-Level Research vs. Contact Database

Context: Discovered significant research data in research/prospects/imports/ that can't be easily imported into the contacts database:

Outreach Nov 19 - Sheet1.csv (122 organizations)
Prospect List - Prospect List.csv (similar structure)

Both files share the same fundamental issue: they contain account-level research that doesn't fit a contact-level database.

Why This Data is Valuable:

Represents significant human research effort into prospect organizations
Contains strategic context: buyer personas, value propositions, contactability scores
Includes notes on prior outreach attempts and what happened
Maps prospect organizations to specific industry types (Health Plans, Outsourcers, BHCs, IROs)
Has qualitative intelligence that took time to develop

Why It's Complicated:

1. Schema Mismatch - Organization vs. Contact Data
- Our contacts database expects: email, first_name, last_name, title, company
- Nov 19 file has: Organization Name, Type, Description, Buyer Persona, Why High-Probability
- This is organization-level research, not contact-level data
- We'd need a separate accounts/organizations table to store this properly
2. Missing Contact Information
- Only ~20 of 122 rows have actual contact names
- "Contact" column shows the contact method (Email, LinkedIn, Website Form), not actual email addresses
- Some names, like "cyndi" or "Phil Salemi Jr", appear without email addresses
- Can't import without emails (email = our primary key for deduplication)
3. Mixed Data Types
- Some rows are job postings with salary ranges (not prospects)
- Some are organizations without contacts (e.g., "Advanced Medical Reviews")
- Some are actual contacts but missing email addresses
- Some include rich strategic notes in other columns
- No consistent pattern to extract
4. Strategic Context vs. Structured Data
- Fields like "Why High-Probability" contain nuanced reasoning
- "Buyer Persona" has specific titles but not actual people
- This is intelligence, not operational data
- Would lose value if forced into structured contact fields
5. Prior Outreach History Embedded in Notes
- Some rows contain notes about what happened with outreach
- This context is valuable but unstructured
- Would need to be preserved somewhere (notes field?)
- Risk of losing context if we just extract contact info

The Core Problem: We have two types of data that need different structures:

Contacts (individuals, emails, LinkedIn, titles) → crm/contacts.csv
Accounts (organizations, strategy, research notes, buyer personas) → no home yet

The Nov 19 file is primarily account/organization research, not a contact list. Forcing it into a contact database would lose most of its value.
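
To make the split concrete, here is a small sketch of how rows in a file like this could be triaged before any import decision. "Organization Name" is a real column in the Nov 19 file; the `Contact Name` and `Email` column names are hypothetical stand-ins, and the sample rows are made up apart from one organization name taken from the notes above. It does not try to detect job postings, which would need a separate heuristic.

```python
from collections import Counter


def triage_row(row):
    """Classify a research row by what it would take to import it.

    importable       -> has an email, so it can be keyed in contacts.csv
    needs_enrichment -> has a person's name but no email yet
    account_only     -> organization-level research with no person attached
    """
    email = (row.get("Email") or "").strip()
    name = (row.get("Contact Name") or "").strip()
    if "@" in email:
        return "importable"
    if name:
        return "needs_enrichment"
    return "account_only"


# Illustrative rows only; column names other than "Organization Name" are assumed.
sample_rows = [
    {"Organization Name": "Advanced Medical Reviews", "Contact Name": "", "Email": ""},
    {"Organization Name": "Example Org A", "Contact Name": "Pat Doe", "Email": ""},
    {"Organization Name": "Example Org B", "Contact Name": "Lee Roe", "Email": "lee@example.com"},
]
counts = Counter(triage_row(r) for r in sample_rows)
```

Only the "importable" bucket fits contacts.csv today; the other two buckets are the data that currently has no home.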

Potential Solutions:

Option 1: Two-Table System

Create crm/accounts.csv for organization-level data
Keep crm/contacts.csv for individual contacts
Link contacts to accounts via company name field
Import Nov 19 data as accounts, then enrich with contacts later
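
A sketch of how the company-name link in Option 1 might work. Matching on raw company strings is fragile ("Example Health, Inc." vs "Example Health"), so this normalizes names first. The suffix list, the function names, and the sample rows are assumptions for illustration; crm/accounts.csv does not exist yet.

```python
import re

# Assumed list of legal suffixes to ignore when matching company names.
SUFFIXES = {"inc", "llc", "corp", "co", "ltd"}


def company_key(name):
    """Normalize a company name: lowercase, drop punctuation and trailing suffixes."""
    words = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while words and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)


def link_contacts(contacts, accounts):
    """Pair each contact with its account by normalized company name.

    Returns (linked, unmatched), where linked is a list of
    (contact, account) tuples and unmatched is a list of contacts.
    """
    index = {company_key(a["Organization Name"]): a for a in accounts}
    linked, unmatched = [], []
    for contact in contacts:
        account = index.get(company_key(contact["company"]))
        if account is None:
            unmatched.append(contact)
        else:
            linked.append((contact, account))
    return linked, unmatched


# Made-up rows for illustration only.
accounts = [{"Organization Name": "Example Health, Inc.", "Type": "BHCs"}]
contacts = [
    {"email": "ana@example.com", "company": "Example Health"},
    {"email": "bo@example.com", "company": "Sample Plan LLC"},
]
linked, unmatched = link_contacts(contacts, accounts)
```

If name matching proves too loose or too strict in practice, an explicit account ID column on both files would be the sturdier link, at the cost of more manual upkeep.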

Option 2: Selective Enrichment

Extract the ~20 rows with actual contact names
Use Apollo to find their email addresses
Import only those enriched contacts to crm/contacts.csv
Keep original Nov 19 file as reference documentation

Option 3: Manual Integration Over Time

Keep Nov 19 file as-is for now
When we find contacts at those organizations (via Apollo), reference the strategic notes
Gradually absorb the intelligence into contact notes fields
Accept that organization-level research is separate from contact data

Decision for Now:

Skip importing both files (Outreach Nov 19 + Prospect List)
Focus on Apollo searches that have actual contact data with emails
Revisit when we have bandwidth to build proper account/organization tracking
Not the highest value-add right now: an interesting architecture problem, but not blocking the GTM pipeline

Why This Matters:

We're building a CRM system but only have half of it (contacts)
Missing the accounts/organizations layer where strategic research lives
This will come up again as we do more research that's org-level rather than contact-level
Need to decide: Do we build accounts tracking now, or keep it informal?

Prioritization Rationale:

The research data exists and is documented - value is captured
The problem is documented - can be solved later when needed
Could spend significant time solving this architecture problem
Higher value work: GTM pipeline for Monday (Tier 1-4 outreach)
Will revisit when either: (1) we have time to think through the architecture, or (2) the operational pain is high enough to justify the investment

Files Affected:

research/prospects/imports/Outreach Nov 19 - Sheet1.csv - 122 organizations, account-level research
research/prospects/imports/Prospect List - Prospect List.csv - similar structure, same issue

Notes

Conversational import workflow is working well for 20-50 contacts at a time
Email as primary key is the right choice for deduplication
Need to think more about organization/account-level data structure
This isn't urgent but will become important as research scales