
Premera Day 0–30 Execution Plan

Created: 2026-02-16 · Status: DRAFT — needs Thomas's answers on the open questions below before finalizing


Context

Premera engagement: FDE/consulting at $150/hr, ~40h/week, 3-6 months. Two work streams: Auto-Authorization (augmenting the existing system to scale from ~6 conditions to hundreds) and Appeals (going to production March 1 on the Elementum workflow). All work happens inside Premera's environment (Azure AKS, their API keys, Phoenix tracing, data stays in their VPC).


Week 1: Environment + Data + First Workflow Baseline (Days 1-5)

Outcomes

  1. Credentialed and connected — laptops/VPN/credentials provisioned, access to Facets, ADT feeds, medical policies, Elementum
  2. Colt/Jamie kickoff complete — daily standup cadence agreed, Slack/Teams channel live, POC confirmed
  3. Appeals workflow mapped end-to-end — current state documented (2 appeal types in production March 1), AI decision points identified within Elementum, data flow diagrammed
  4. Auto-auth current state understood — reviewed existing 6-condition auto-auth logic, identified how GenAI layers in, gap analysis for scaling to hundreds of policies
  5. First baseline measurements captured — current appeal processing time, current auto-auth coverage %, nurse review time per case

Daily Breakdown

| Day | Focus | Deliverable |
| --- | --- | --- |
| 1 | Onboarding: credentials, environment setup, security orientation | Access confirmed, dev environment running |
| 2 | Appeals deep-dive with Colt/Jamie: walk through Elementum workflow, 2 appeal types, AI touchpoints | Appeals workflow diagram v1 |
| 3 | Auto-auth deep-dive: review existing system, 6 conditions, data sources, GenAI integration points | Auto-auth architecture doc v1 |
| 4 | Data access: connect to ADT feeds, Facets, medical policies; understand data schemas, quality, gaps | Data access inventory + gap list |
| 5 | Baseline metrics: pull current processing times, volumes, error rates; Week 1 retro with Colt | Baseline metrics doc + Week 2 plan |

❓ Questions Thomas Needs to Answer

  • Q1: Provisioning timeline — Nathan warned SRP tickets take 4 weeks. Has Colt pre-staged any access? Do we need to submit requests NOW before contract signs?
  • Q2: Which appeals work stream first? — They go live March 1 with 2 types. Do we embed in the March 1 launch or start on the next appeal types?
  • Q3: Who is our daily POC? — Colt, Jamie, or someone on Nathan's team? This determines how fast we move.
  • Q4: Can we get a sandbox/staging environment? — Or are we working directly in their production pipeline from day 1?
  • Q5: Security review status — They haven't received our formal security deck yet. Is that a blocker for access?

Week 2: Pilot Workflow with Synthetic Data (Days 6-10)

Outcomes

  1. Appeals pilot running on synthetic data — AI parsing + reasoning pipeline processing sample appeal documents through Elementum decision points
  2. Auto-auth expansion prototype — selected 2-3 new conditions beyond the existing 6, built draft policy-to-criteria mapping
  3. LLM pipeline integrated with their security gateway — all calls routed through Premera's AI gateway, Phoenix tracing active (a wiring sketch follows this list)
  4. First nurse feedback captured — showed appeal AI output to 1-2 nurses, documented what works / what's wrong
  5. Technical architecture documented — how our code fits into their stack, deployment approach, testing strategy
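
As a minimal wiring sketch for outcome 3 — assuming Premera's Phoenix collector accepts OTLP traces and the AI gateway exposes an OpenAI-compatible endpoint; the URLs, project name, and token handling are placeholders, not confirmed details:

```python
# Hypothetical wiring: route LLM calls through Premera's gateway and trace
# every inference call to their in-VPC Phoenix collector.
from openai import OpenAI
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

# Point traces at the Phoenix collector inside Premera's VPC (placeholder URL).
tracer_provider = register(
    project_name="premera-appeals-pilot",  # hypothetical project name
    endpoint="https://phoenix.premera.internal/v1/traces",  # placeholder
)

# Auto-instrument the OpenAI client so every call is traced without code changes.
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

# All completions go through Premera's AI security gateway, never a vendor API.
client = OpenAI(
    base_url="https://ai-gateway.premera.internal/v1",  # placeholder gateway URL
    api_key="GATEWAY_TOKEN",  # issued by Premera; real value from a secret store
)
```

The gateway client created here is the same one the parser sketch under Key Activities would reuse, so routing and tracing apply to every pipeline call.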

Key Activities

| Activity | Owner | Dependency |
| --- | --- | --- |
| Build appeal document parser (extract key claims, supporting evidence, provider arguments; sketched after this table) | Thomas | Access to sample appeal docs |
| Implement criteria-matching logic against InterQual for appeal review | Thomas | InterQual access/API or documentation |
| Create synthetic appeal dataset (10-20 cases covering 2 appeal types) | Both | Understanding of appeal types from Week 1 |
| Auto-auth: map 2-3 new medical policies to automation logic | Michael | Medical policy documents |
| Integrate LLM calls through Premera's AI security gateway | Thomas | Gateway credentials + docs |
| Set up Phoenix tracing for all AI inference calls | Thomas | Phoenix collector access |
| Demo to Colt: "here's what the AI sees when processing an appeal" | Both | Working prototype |
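
A minimal sketch of the appeal-parser step — assuming the gateway client from the Week 2 wiring sketch, a JSON-capable chat model (the model name is a placeholder pending Q8), and hypothetical field names for the extracted structure:

```python
import json

from openai import OpenAI

client = OpenAI(
    base_url="https://ai-gateway.premera.internal/v1",  # placeholder gateway URL
    api_key="GATEWAY_TOKEN",
)

EXTRACTION_PROMPT = """From the appeal letter below, return JSON with:
- key_claims: the specific decisions being appealed
- supporting_evidence: clinical facts and documents the provider cites
- provider_arguments: the reasoning offered for overturning the decision

Appeal letter:
{appeal_text}"""

def parse_appeal(appeal_text: str) -> dict:
    """One LLM call that turns a raw appeal document into structured fields
    the downstream criteria-matching step can consume."""
    response = client.chat.completions.create(
        model="model-placeholder",  # actual model pending Q8 (approved models)
        messages=[
            {"role": "user", "content": EXTRACTION_PROMPT.format(appeal_text=appeal_text)}
        ],
        response_format={"type": "json_object"},  # ask for strict JSON output
    )
    return json.loads(response.choices[0].message.content)
```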

❓ Questions Thomas Needs to Answer

  • Q6: InterQual access model — Can we call InterQual programmatically? Or is it a UI-only tool nurses use manually? This changes the architecture significantly.
  • Q7: What does "synthetic data" mean here? — Can Premera provide de-identified real cases? Or do we generate from scratch? De-identified real data is 10x more useful.
  • Q8: LLM model choice — Premera has Anthropic + OpenAI relationships. Which models are approved? Any restrictions on Claude vs. GPT for PHI?
  • Q9: How do we handle the "no automated denials" constraint technically? — Every AI output that leans toward denial must route to a human. Need to design the confidence threshold / escalation logic early (a minimal routing sketch follows this list).
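
One way the escalation logic could look — a minimal sketch assuming a per-case recommendation object with a calibrated confidence score; the threshold value and field names are placeholders to be tuned with Premera:

```python
from dataclasses import dataclass
from enum import Enum

class Route(Enum):
    AUTO_DRAFT = "auto_draft"      # AI drafts the outcome; a nurse signs off
    HUMAN_REVIEW = "human_review"  # full manual review, AI output attached as context

@dataclass
class AiRecommendation:
    leans_denial: bool  # any output pointing toward denying the appeal/authorization
    confidence: float   # calibrated score in [0, 1]; calibration method TBD

APPROVAL_CONFIDENCE_FLOOR = 0.85  # placeholder; tune during the Weeks 3-4 quality loop

def route(rec: AiRecommendation) -> Route:
    """Enforce the constraint: denial-leaning outputs always reach a human,
    regardless of how confident the model is."""
    if rec.leans_denial:
        return Route.HUMAN_REVIEW
    if rec.confidence >= APPROVAL_CONFIDENCE_FLOOR:
        return Route.AUTO_DRAFT
    return Route.HUMAN_REVIEW
```

The key design choice is that `leans_denial` is checked before confidence, so no score, however high, can produce an automated denial.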

Weeks 3-4: Quality Loop, Decision-Support Outputs, Internal Demos (Days 11-20)

Outcomes

  1. Appeals quality loop operational — AI outputs reviewed by nurses, feedback captured, prompts/logic tuned, measurable improvement across iterations
  2. Auto-auth expansion validated — 2-3 new conditions tested against real (de-identified) data, accuracy measured, ready for production review
  3. Decision-support outputs formatted for nurse workflow — outputs match Premera's templates, integrate into Elementum, require minimal manual editing
  4. Internal demo to Chad Murphy (CCO) — first time the business leader sees AI-augmented appeal review + auto-auth expansion in action
  5. Week 4 executive summary delivered — quantified results vs. baseline, roadmap for months 2-3, recommendation for expanded scope

Key Activities

| Activity | Owner | Dependency |
| --- | --- | --- |
| Run 50+ appeal cases through pipeline, measure accuracy vs. nurse decisions (measurement sketch after this table) | Both | Working pipeline + test cases |
| Build feedback UI/form for nurses to rate AI outputs | Thomas | Nurse availability |
| Tune prompts based on feedback (3+ iteration cycles) | Thomas | Feedback data |
| Format outputs to match Premera's existing summary templates | Michael | Template access (from Week 1) |
| Auto-auth: run new conditions against historical decisions, measure match rate | Both | Historical decision data |
| Prep Chad Murphy demo: curated examples, before/after, metrics | Both | Working prototypes |
| Draft Month 2-3 roadmap: what's next, what's needed, scaling plan | Both | Week 1-3 learnings |
| Deliver Week 4 executive summary | Michael | All above |
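
A minimal sketch of the accuracy measurement for the 50+ case run — assuming each case record pairs the AI recommendation with the nurse's actual decision (field names are hypothetical):

```python
def agreement_rate(cases: list[dict]) -> float:
    """Share of cases where the AI recommendation matched the nurse's decision;
    this is the number reported against the >85% Week 4 target."""
    matches = sum(1 for c in cases if c["ai_decision"] == c["nurse_decision"])
    return matches / len(cases)

# Toy records; real ones would come from the pipeline run + nurse review sample.
cases = [
    {"case_id": "A-001", "ai_decision": "uphold", "nurse_decision": "uphold"},
    {"case_id": "A-002", "ai_decision": "overturn", "nurse_decision": "uphold"},
]
print(f"nurse agreement: {agreement_rate(cases):.0%}")  # -> nurse agreement: 50%
```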

❓ Questions Thomas Needs to Answer

  • Q10: Have you met Chad Murphy yet? — He's the CCO and key business decision-maker. Colt's team is technical. Chad controls whether this scales. When do we get in front of him?
  • Q11: What does "success" look like to Chad vs. Colt? — Colt cares about technical capability. Chad cares about nurse productivity, compliance, cost. Need to present metrics that matter to both.
  • Q12: Vendor insourcing target — Colt mentioned "Caroline" for advanced imaging reviews as a disruption candidate. Is that a Week 3-4 deliverable or Month 2+?
  • Q13: What's the billing structure during ramp? — Full $150/hr from day 1? Or reduced rate during discovery? This affects how aggressive Week 1 can be.

Success Metrics

| Metric | Baseline (capture Week 1) | Week 2 Target | Week 4 Target | How Measured |
| --- | --- | --- | --- | --- |
| Appeal processing time | TBD (current manual) | First AI-assisted time | 30%+ reduction | Time tracking per case |
| Auto-auth condition coverage | 6 conditions | 6 (understanding) | 8-9 conditions | Count of automated policies |
| AI output accuracy (appeals) | N/A | First measurements | >85% nurse agreement | Nurse review sample |
| AI output accuracy (auto-auth) | Existing system baseline | N/A | Match or exceed existing | Comparison to historical decisions |
| Review quality consistency | TBD (variance across nurses) | First measurements | Measurable reduction in variance | Inter-reviewer agreement (kappa sketch after this table) |
| Stakeholder confidence | Low (haven't seen it work) | Colt team bought in | Chad demo positive | Qualitative feedback |
| Cycle time: idea → deployed | TBD (Nathan says weeks) | First deploy | Repeatable deploy process | Calendar tracking |
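
For the review-consistency row, inter-reviewer agreement could be scored with Cohen's kappa: chance-corrected agreement between two nurses labeling the same cases. A self-contained sketch with illustrative labels and values:

```python
from collections import Counter

def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Chance-corrected agreement between two reviewers of the same cases:
    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
    p_e is the agreement expected from each reviewer's label frequencies."""
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((freq_a[label] / n) * (freq_b[label] / n)
              for label in freq_a.keys() | freq_b.keys())
    return (p_o - p_e) / (1 - p_e)

# Two nurses labeling the same five appeal outcomes (toy data):
nurse_1 = ["uphold", "uphold", "overturn", "uphold", "overturn"]
nurse_2 = ["uphold", "overturn", "overturn", "uphold", "overturn"]
print(f"kappa = {cohens_kappa(nurse_1, nurse_2):.2f}")  # -> kappa = 0.62
```

Tracking kappa across Weeks 2-4 would show whether the decision-support outputs actually reduce variance between nurses, not just raw agreement.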

Risk-Adjusted Timeline

| Risk | Impact | Mitigation | Plan B |
| --- | --- | --- | --- |
| Provisioning takes 4+ weeks (Nathan's warning) | Week 1 is blocked | Pre-submit access requests now, before the contract signs; Colt to champion internally | Work on architecture/design docs + synthetic data while waiting |
| Data access is harder than expected | Can't baseline or build | Start with whatever Colt's team has already extracted; use their existing data pipelines | Build against synthetic data, validate later with real |
| InterQual is UI-only (no API) | Can't automate criteria matching | Build our own criteria extraction from policy docs | Manual criteria encoding for pilot conditions |
| Appeals March 1 launch is chaotic | Team too busy for us | Focus on auto-auth first; pick up appeals after the launch stabilizes | Observe/document the March 1 launch, design improvements for post-launch |
| Chad Murphy doesn't engage | Business side doesn't champion us | Use Colt as a bridge; deliver results that force the conversation | Build relationship through Romilla (clinician team lead) |

Pre-Contract Actions (Do NOW)

These don't require a signed contract:

  1. Submit security deck to Premera — they haven't seen it yet, and it may be required for access
  2. Ask Colt about pre-staging access requests — if SRP tickets take 4 weeks, starting now saves the entire first month
  3. Request sample data — de-identified appeal docs, medical policies, auto-auth decision logs
  4. Confirm appeals March 1 timeline — are we joining that launch or starting after?
  5. Schedule Chad Murphy intro — even a 15-min hello before contract signs builds trust
  6. Decide individual contractor vs. company contract — Nathan recommends company. How much time does that add?
