Jan 9, 2025 - Michael
Work Log
CRM Contact Database Setup
- Created canonical contact database schema (crm/docs/contact-schema.md)
- Set up canonical CSV file (crm/contacts.csv) with comprehensive field definitions
- Imported 20 behavioral health CEO contacts from Apollo enriched results
- Tagged all imports with behavioral-health-ceo for campaign segmentation
Import Workflow Documentation
- Documented import process in crm/docs/import-workflow.md
- Established conversational import pattern (no scripts needed at current scale)
- Validated deduplication logic (email as primary key)
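For reference, a minimal sketch of what "email as primary key" deduplication could look like if the import is ever scripted. The merge policy (earlier non-empty values win, later rows only fill blanks) and the function names are assumptions for illustration, not what the conversational workflow currently does.

```python
def normalize_email(email: str) -> str:
    """Trim and lowercase so ' Ann@X.com' and 'ann@x.com' share one key."""
    return email.strip().lower()


def merge_contacts(existing, incoming):
    """Merge two lists of contact dicts, keyed by normalized email.

    Assumed policy: the first non-empty value seen for a field is kept;
    later rows only fill fields that are still blank.
    Rows with no email have no key and are counted as skipped.
    """
    by_email = {}
    skipped = 0
    for row in [*existing, *incoming]:
        key = normalize_email(row.get("email") or "")
        if not key:
            skipped += 1
            continue
        current = by_email.setdefault(key, {"email": key})
        for field, value in row.items():
            if field == "email":
                continue
            if not current.get(field):
                current[field] = value
    return list(by_email.values()), skipped
```

Rows without an email are reported rather than silently dropped, so they can be sent for enrichment instead of getting lost.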
Scope Clarification & Timeline Adjustment
- Identified engagement tracking gap: no easy way to sync Apollo tracking data (opens, clicks) back to contacts.csv
- Decision: Reduce scope to realistic MVP rather than rush incomplete automation
- Timeline push required: Monday deadline not feasible for proper implementation
- Advocating for: Push timeline, build it right, deliver more value
Thinking/Decisions
CRM Scope Reduction: Contacts Database vs. Campaign Execution
The Core Problem:
- Need engagement tracking (opens, clicks, replies) to manage pipeline effectively
- Apollo has this tracking data
- No easy way to programmatically sync Apollo's tracking data back to contacts.csv
- Building full automation requires Apollo API integration + webhook/polling system for tracking sync
- That's 2-3 days of focused work to do properly
Original Scope (Too Ambitious for Monday):
- ✅ Contacts.csv as source of truth
- ✅ Import from multiple sources
- ❌ Campaign execution from contacts.csv
- ❌ Engagement tracking sync from Apollo
- ❌ Automated pipeline management with tracking data
Revised Scope (Realistic MVP):
What contacts.csv does:
- Canonical list of all contacts (source of truth for WHO to contact)
- Import from multiple sources (Apollo, conferences, Excel)
- Deduplication (email as primary key)
- Basic segmentation (tags, stage, priority)
- Manual updates for lifecycle changes (prospect → lead → relationship)
What Apollo does:
- Campaign execution (sequences, one-off emails)
- Engagement tracking (opens, clicks, replies)
- Operational tool for outreach
Bridge between them:
- Export segments from contacts.csv → Import to Apollo for campaigns
- When people respond, manually update contacts.csv stage/status
- Periodic sync (not real-time) acceptable for MVP
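A rough sketch of the export half of the bridge: pull one segment out of contacts.csv by tag and write it as a CSV for upload. It assumes a tags column holding semicolon-separated values, which is a guess at the schema rather than something confirmed in contact-schema.md, and it does not try to match whatever column layout Apollo's importer expects.

```python
import csv
import io


def export_segment(contacts_csv: str, tag: str) -> str:
    """Return a CSV string with only the rows carrying the given tag.

    Assumption: a 'tags' column with semicolon-separated values.
    """
    reader = csv.DictReader(io.StringIO(contacts_csv))
    fieldnames = reader.fieldnames or []
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        tags = {t.strip() for t in (row.get("tags") or "").split(";")}
        if tag in tags:
            writer.writerow(row)
    return out.getvalue()
```

Usage would be along the lines of `export_segment(open("crm/contacts.csv").read(), "behavioral-health-ceo")`, with the result saved to a file for manual upload.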
Timeline Decision:
- Original deadline: Monday (Jan 13)
- Reality: Proper implementation needs more time
- Recommendation: Push timeline, build the right foundation
- The build is taking longer than expected, but what we're building is more ambitious and more valuable
- Better to advocate for doing it right than shipping something incomplete
What "Done" Looks Like for MVP:
- Contacts.csv populated with all existing contacts from Apollo
- Import workflow established and documented
- Clear boundaries: contacts.csv = source of truth, Apollo = execution layer
- Manual bridge process documented (how to sync between them)
- Foundation ready for future automation when we have time
Next Action Required:
- Communicate timeline push to co-founder
- Present trade-off: "Full automation needs more time vs. MVP ready sooner"
- Get alignment on reduced scope
Data Complexity Issue: Account-Level Research vs. Contact Database
Context: Discovered significant research data in research/prospects/imports/ that can't be easily imported into the contacts database:
- Outreach Nov 19 - Sheet1.csv (122 organizations)
- Prospect List - Prospect List.csv (similar structure)
Both files share the same fundamental issue: they're account-level research trying to fit into a contact-level database.
Why This Data is Valuable:
- Represents significant human research effort into prospect organizations
- Contains strategic context: buyer personas, value propositions, contactability scores
- Includes notes on prior outreach attempts and what happened
- Maps prospect organizations to specific industry types (Health Plans, Outsourcers, BHCs, IROs)
- Has qualitative intelligence that took time to develop
Why It's Complicated:
Schema Mismatch - Organization vs. Contact Data
- Our contacts database expects: email, first_name, last_name, title, company
- Nov 19 file has: Organization Name, Type, Description, Buyer Persona, Why High-Probability
- This is organization-level research, not contact-level data
- We'd need a separate accounts/organizations table to store this properly
Missing Contact Information
- Only ~20 of 122 rows have actual contact names
- "Contact" column shows method (Email, LinkedIn, Website Form) not actual email addresses
- Names like "cyndi" or "Phil Salemi Jr" without email addresses
- Can't import without emails (email = our primary key for deduplication)
Mixed Data Types
- Some rows are job postings with salary ranges (not prospects)
- Some are organizations without contacts (e.g., "Advanced Medical Reviews")
- Some are actual contacts but missing email addresses
- Some include rich strategic notes in other columns
- No consistent pattern to extract
Strategic Context vs. Structured Data
- Fields like "Why High-Probability" contain nuanced reasoning
- "Buyer Persona" has specific titles but not actual people
- This is intelligence, not operational data
- Would lose value if forced into structured contact fields
Prior Outreach History Embedded in Notes
- Some rows contain notes about what happened with outreach
- This context is valuable but unstructured
- Would need to be preserved somewhere (notes field?)
- Risk of losing context if we just extract contact info
The Core Problem: We have two types of data that need different structures:
- Contacts (individuals, emails, LinkedIn, titles) → crm/contacts.csv
- Accounts (organizations, strategy, research notes, buyer personas) → no home yet
The Nov 19 file is primarily account/organization research, not a contact list. Forcing it into a contact database would lose most of its value.
Potential Solutions:
Option 1: Two-Table System
- Create crm/accounts.csv for organization-level data
- Keep crm/contacts.csv for individual contacts
- Link contacts to accounts via company name field
- Import Nov 19 data as accounts, then enrich with contacts later
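To make Option 1 concrete, here is a small sketch of linking contacts to accounts through the company name. The normalization rule and function names are hypothetical. The column names come from the two schemas above (company on the contact side, Organization Name on the account side).

```python
import re


def company_key(name: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(cleaned.split())


def link_contacts_to_accounts(accounts, contacts):
    """Pair each contact with its account row, matching on normalized name.

    Returns (linked, unmatched): linked is a list of (contact, account)
    tuples; unmatched holds contacts whose company has no account row.
    """
    index = {}
    for account in accounts:
        key = company_key(account.get("Organization Name") or "")
        if key:  # skip accounts with no usable name
            index[key] = account
    linked, unmatched = [], []
    for contact in contacts:
        account = index.get(company_key(contact.get("company") or ""))
        if account is None:
            unmatched.append(contact)
        else:
            linked.append((contact, account))
    return linked, unmatched
```

Name matching is fragile (suffixes like "Inc." or abbreviations will miss), so if this gets built for real, a stable account ID column on both files would probably be the safer link.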
Option 2: Selective Enrichment
- Extract the ~20 rows with actual contact names
- Use Apollo to find their email addresses
- Import only those enriched contacts to crm/contacts.csv
- Keep original Nov 19 file as reference documentation
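If Option 2 is ever scripted, the first step is sorting rows into what can be imported, what needs an email lookup, and what is account research only. A minimal triage sketch follows. The "Contact Name" and "Email" column names are placeholders, since the log does not record the file's actual headers for those fields, and the email check is a loose shape test rather than real validation.

```python
import re

# Loose shape check only: something@something.tld with no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def triage_row(row, name_col="Contact Name", email_col="Email"):
    """Classify one research row by how close it is to being a contact.

    name_col and email_col are hypothetical column names; adjust them
    to the real headers before use.
    """
    name = (row.get(name_col) or "").strip()
    email = (row.get(email_col) or "").strip()
    if EMAIL_RE.match(email):
        return "importable"
    if name:
        return "needs_enrichment"
    return "account_only"


def triage(rows):
    """Group rows into the three buckets, preserving input order."""
    buckets = {"importable": [], "needs_enrichment": [], "account_only": []}
    for row in rows:
        buckets[triage_row(row)].append(row)
    return buckets
```

The needs_enrichment bucket is the worklist for the Apollo email lookup; account_only rows stay in the original file as reference.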
Option 3: Manual Integration Over Time
- Keep Nov 19 file as-is for now
- When we find contacts at those organizations (via Apollo), reference the strategic notes
- Gradually absorb the intelligence into contact notes fields
- Accept that organization-level research is separate from contact data
Decision for Now:
- Skip importing both files (Outreach Nov 19 + Prospect List)
- Focus on Apollo searches that have actual contact data with emails
- Revisit when we have bandwidth to build proper account/organization tracking
- Not the highest value-add right now - interesting architecture problem but not blocking GTM pipeline
Why This Matters:
- We're building a CRM system but only have half of it (contacts)
- Missing the accounts/organizations layer where strategic research lives
- This will come up again as we do more research that's org-level vs contact-level
- Need to decide: Do we build accounts tracking now, or keep it informal?
Prioritization Rationale:
- The research data exists and is documented - value is captured
- The problem is documented - can be solved later when needed
- Could spend significant time solving this architecture problem
- Higher value work: GTM pipeline for Monday (Tier 1-4 outreach)
- Will revisit when either (1) we have time to think through the architecture, or (2) operational pain is high enough to justify the investment
Files Affected:
- research/prospects/imports/Outreach Nov 19 - Sheet1.csv - 122 organizations, account-level research
- research/prospects/imports/Prospect List - Prospect List.csv - similar structure, same issue
Notes
- Conversational import workflow is working well for 20-50 contacts at a time
- Email as primary key is the right choice for deduplication
- Need to think more about organization/account-level data structure
- This isn't urgent but will become important as research scales