Security & Compliance: Deploying AI Inside a Payer's Infrastructure
Research for DaisyAI's Premera Blue Cross engagement. Focused on practical realities of running LLM-based systems with PHI inside a health plan's secure environment.
Last updated: 2026-02-12
Table of Contents
- Premera-Specific Context
- PHI in AI/LLM Pipelines
- AI Security Gateway Architecture
- Enterprise AI Provisioning at Health Plans
- HIPAA and GenAI: BAAs, APIs, and Deployment Models
- Cloud Platform Comparison: Azure OpenAI vs. AWS Bedrock
- AI Governance Frameworks for Payers
- Emerging Regulations (Effective 2025-2026)
- Practical Deployment Patterns
- Implications for DaisyAI at Premera
Premera-Specific Context
What we know from the Feb 11, 2026 call:
| Detail | What It Means |
|---|---|
| All LLM calls go through their "AI security gateway" | Centralized proxy/control plane between apps and LLM providers. We route through it, not around it. |
| Full tracing via Phoenix collector | Arize Phoenix — open-source AI observability built on OpenTelemetry. Every prompt/response logged, traced, evaluated. |
| Data cannot leave their VPC | No calling external APIs directly. Must use Premera's own API keys via their gateway. |
| They use Premera's own Anthropic/OpenAI API relationships | Premera holds the BAAs with Anthropic and OpenAI. We operate under their umbrella. |
| SRP tickets take ~4 weeks through 5 teams | Provisioning is bureaucratic — identity, network, data, security, and application teams all sign off. |
| Provisioning mistakes (global vs. data standard access) cause delays | Getting the wrong access tier means re-doing the SRP. Specificity matters upfront. |
Premera's AI Governance
Premera has a public AI Practices page and a cross-functional Data & AI Ethics Committee with five principles:
- Be transparent — disclose when AI contributes to decisions
- Be fair — avoid unfair discrimination
- Protect privacy and security — CISO deeply involved in AI governance
- Be accountable — maintain human oversight
- Continually improve — iterate on AI safety practices
Premera was among 25+ payers/providers that signed the White House AI safety pledge for healthcare.
Premera's Security History
Context that explains their conservative posture:
- 2014-2015 breach: APT group had unauthorized access for ~9 months, affecting 10.4 million individuals
- $6.85M OCR HIPAA fine — 2nd largest ever at the time (HHS enforcement)
- $10M state settlement (WA Attorney General) + $74M class-action settlement
- Root cause: failure to conduct enterprise-wide risk analysis, inadequate audit controls, ignored auditor warnings
This history directly shapes their current security posture. They will be conservative. They will over-audit. They will require extensive documentation. This is rational behavior from their perspective.
PHI in AI/LLM Pipelines
The Core Problem
PHI + LLM = regulatory minefield. The question is not "can you do it?" but "under what conditions?"
What Constitutes PHI in LLM Context
Any of the 18 HIPAA identifiers combined with health information, including:
- Patient names, DOBs, SSNs, MRNs in prompts
- Clinical notes passed as context
- Diagnosis codes linked to individuals
- Claims data with member identifiers
How Payers Handle PHI with GenAI
Three dominant patterns emerging in production:
Pattern 1: De-identify Before Inference (Most Common)
- Strip/replace all 18 HIPAA identifiers before prompt construction
- Use NLP-based de-identification (e.g., John Snow Labs Healthcare NLP)
- Tokenize identifiers with consistent pseudonyms for re-linking
- Send de-identified data to LLM, re-link on response
- Pro: Minimizes risk exposure. Con: Lossy — clinical context can be degraded
Pattern 2: In-VPC Processing with BAA-Covered APIs (Premera's Approach)
- Keep all data within the organization's VPC
- Route through BAA-covered cloud provider APIs (Azure OpenAI, AWS Bedrock)
- Full audit trail via gateway
- Pro: Full clinical fidelity. Con: Requires robust infrastructure and BAAs
Pattern 3: On-Premise/Private Inference
- Self-host open-source models (Llama, Mistral, Meditron)
- No data ever leaves organizational boundary
- Pro: Maximum control. Con: Operational overhead, potentially lower model quality
Sources:
- HIPAA Compliance AI Guide (TechMagic)
- HIPAA Compliant AI Best Practices (Edenlab)
- AI Chatbots and HIPAA Challenges (PMC)
AI Security Gateway Architecture
What Premera Means by "AI Security Gateway"
An AI gateway (also called LLM proxy or LLM router) is a content-aware reverse proxy that sits between applications and LLM providers. Unlike a standard API gateway, it inspects prompt/response content.
Architecture Diagram (Logical)
┌─────────────────────────────────────────────────────────┐
│ Premera VPC │
│ │
│ ┌──────────┐ ┌──────────────────┐ ┌───────────┐ │
│ │ DaisyAI │───▶│ AI Security │───▶│ Anthropic │ │
│ │ App │ │ Gateway │ │ API (BAA) │ │
│ └──────────┘ │ │ ├───────────┤ │
│ │ • Auth (JWT) │ │ OpenAI │ │
│ │ • PHI scanning │ │ API (BAA) │ │
│ │ • PII redaction │ └───────────┘ │
│ │ • Rate limiting │ │
│ │ • Token budgets │ ┌───────────┐ │
│ │ • Audit logging │───▶│ Phoenix │ │
│ │ • Policy rules │ │ Collector │ │
│ └──────────────────┘ │ (Tracing) │ │
│ └───────────┘ │
└─────────────────────────────────────────────────────────┘
Gateway Capabilities (Industry Standard)
Based on enterprise AI gateway patterns:
| Capability | What It Does | Why It Matters |
|---|---|---|
| PII/PHI Detection & Redaction | Scans prompts for identifiers, optionally redacts before forwarding | Prevents accidental PHI exposure |
| Policy Enforcement | Rules about which models, which data, which users | Least-privilege for AI |
| Token Budget Management | Per-user/per-app token limits | Cost control and abuse prevention |
| Semantic Caching | Cache similar queries to reduce API calls | Performance + cost |
| Model Routing | Route different request types to different models | Use cheaper models for simple tasks |
| Prompt Injection Defense | Filter malicious prompt patterns (OWASP LLM01) | Security |
| Full Audit Trail | Log every interaction with user, timestamp, input, output | Compliance + forensics |
| Data Residency Enforcement | Ensure requests only go to approved regions | Regulatory compliance |
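The PII/PHI detection capability in the table above can be made concrete. Below is a minimal, stdlib-only sketch of regex-based scanning for a few common identifiers; a production gateway would combine regex with NER models and cover all 18 HIPAA identifier types. The patterns and labels here are illustrative, not Premera's actual gateway logic.

```python
import re

# Illustrative patterns for a few HIPAA identifiers. A real gateway
# would combine regex with NER models to cover all 18 identifier types.
PHI_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE),
    "DOB": re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
}

def scan_and_redact(prompt: str) -> tuple[str, list[str]]:
    """Return the redacted prompt and the list of identifier types found."""
    found = []
    for label, pattern in PHI_PATTERNS.items():
        if pattern.search(prompt):
            found.append(label)
            prompt = pattern.sub(f"[{label} REDACTED]", prompt)
    return prompt, found

redacted, hits = scan_and_redact("Member MRN: 12345678, DOB 03/14/1962, needs review.")
# hits -> ["MRN", "DOB"]; both values are replaced in the redacted text
```

Depending on gateway policy, a hit can either block the request outright or forward the redacted prompt; which behavior Premera's gateway applies is something to confirm from their API docs.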
Common Implementations
- Azure API Management AI Gateway — Microsoft Learn
- Databricks Mosaic AI Gateway — Databricks
- Apache APISIX AI Gateway — APISIX
- Portkey — portkey.ai
- Custom builds on Kong, NGINX, or Envoy with LLM-specific plugins
Phoenix Collector (Premera's Tracing)
Arize Phoenix is open-source AI observability:
- Built on OpenTelemetry — vendor/framework agnostic
- Traces LLM calls end-to-end: prompt → model → response → evaluation
- Self-hostable — no external data egress required
- Integrates with LangChain, LlamaIndex, direct API calls
- Supports evaluation benchmarking and dataset versioning
Implication for DaisyAI: Our code must emit OpenTelemetry spans compatible with their Phoenix collector. This means instrumenting our LLM calls with the right trace context.
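To make the instrumentation requirement concrete, here is a stdlib-only sketch of the span shape for a single LLM call. Real code would use the opentelemetry-sdk with an OTLP exporter pointed at the Phoenix collector; the attribute keys follow the OpenInference semantic conventions Phoenix understands, but the exact keys should be validated with Premera's team before we build against them.

```python
import secrets
import time

def llm_span(model: str, prompt: str, response: str,
             prompt_tokens: int, completion_tokens: int) -> dict:
    """Shape of one LLM trace span. Real instrumentation would use
    opentelemetry-sdk with an OTLP exporter targeting the Phoenix collector."""
    return {
        "trace_id": secrets.token_hex(16),   # 128-bit W3C trace id
        "span_id": secrets.token_hex(8),     # 64-bit span id
        "name": "llm.completion",
        "start_time_unix_nano": time.time_ns(),
        "attributes": {
            # OpenInference semantic conventions (validate exact keys with their team)
            "openinference.span.kind": "LLM",
            "llm.model_name": model,
            "input.value": prompt,
            "output.value": response,
            "llm.token_count.prompt": prompt_tokens,
            "llm.token_count.completion": completion_tokens,
        },
    }

span = llm_span("claude-sonnet", "Summarize this claim...", "Summary: ...", 120, 45)
```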
Sources:
- AI Gateway Security and Compliance (API7)
- Building the AI Control Plane (Medium)
- LLM Gateways for Enterprise Risk (Medium)
Enterprise AI Provisioning at Health Plans
What Onboarding Looks Like
Based on enterprise patterns and what Premera told us about their 5-team, 4-week SRP process:
Step 1: Identity & Access Management (Week 1)
- Contractor/consultant account creation in Premera's IdP (likely Azure AD/Entra ID)
- MFA enrollment (required — HIPAA Security Rule NPRM makes this mandatory)
- RBAC role assignment — critical to get right the first time
- "Global access" vs. "data standard access" — the distinction Premera mentioned
- Global = broader than needed, violates least privilege
- Data standard = scoped to specific data domains (claims, clinical, member)
- Background check, security training, HIPAA awareness certification
Step 2: Network Access (Week 1-2)
- VPN provisioning into Premera VPC
- Network segmentation — access only to approved subnets
- Private endpoints for cloud services (no public internet paths)
- Firewall rules allowing traffic to/from AI gateway
Step 3: Data Access (Week 2-3)
- Determine which data domains the engagement requires
- Provision database/data lake credentials with row/column-level security
- PHI access requires specific justification per data element
- Audit logging enabled on all data access
Step 4: AI Service Access (Week 3)
- Register application with AI security gateway
- Receive API keys/tokens scoped to approved models
- Configure token budgets and rate limits
- Set up Phoenix tracing integration
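Once Step 4 completes, our calls go to the gateway endpoint with the SRP-issued credentials. The sketch below constructs (but does not send) such a request; the endpoint URL and header names are hypothetical placeholders, since the real values come from Premera's gateway API docs and the SRP output.

```python
import json
import urllib.request

# Hypothetical placeholder. The real endpoint comes from Premera's gateway docs.
GATEWAY_URL = "https://ai-gateway.internal.example/v1/chat"

def build_gateway_request(api_key: str, model: str,
                          messages: list) -> urllib.request.Request:
    """Construct (but do not send) a request to the AI security gateway."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        GATEWAY_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",       # key scoped to approved models
            "Content-Type": "application/json",
            "X-Trace-Id": "propagated-trace-context",   # placeholder for W3C trace context
        },
    )

req = build_gateway_request("srp-issued-key", "claude-sonnet",
                            [{"role": "user", "content": "hello"}])
```

The trace-context header matters: if the gateway propagates it to the Phoenix collector, our application spans and the gateway's spans join into one trace.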
Step 5: Security Review & Go-Live (Week 3-4)
- Security team reviews architecture, data flows, access patterns
- Penetration test or vulnerability scan of deployed components
- Sign-off from all 5 teams (identity, network, data, security, application)
- Production access granted
Common Provisioning Pitfalls
| Pitfall | What Happens | How to Avoid |
|---|---|---|
| Wrong access tier requested | Re-do the entire SRP (4 more weeks) | Be extremely specific about data domains needed upfront |
| Missing data elements from request | Can't access needed claims fields | Map out every data element before submitting SRP |
| VPN config issues | Can't reach internal services | Test connectivity immediately, don't wait |
| Expired credentials | Locked out, need IT ticket | Track expiration dates, renew proactively |
| Overly broad access request | Security team rejects, sends back for scoping | Start narrow, expand only if justified |
Sources:
- RBAC for Healthcare SaaS (Cabot Solutions)
- Least Privilege Access Control (Enter Health)
- RBAC Best Practices for Clinical Apps (Censinet)
HIPAA and GenAI
Can You Send PHI to Claude/GPT APIs?
Yes, under specific conditions:
- BAA must be in place between the covered entity (or business associate) and the API provider
- API provider must be HIPAA-eligible for that specific service
- Data handling must meet HIPAA Security Rule requirements (encryption, access controls, audit trails)
- No training on PHI — the provider must contractually agree not to use PHI for model training
Anthropic (Claude)
- BAA available on Enterprise plans — Anthropic Privacy Center
- HIPAA-ready Enterprise plans available — Claude Help Center
- Claude for Healthcare launched Jan 2026 with HIPAA-compliant configurations for enterprise
- BAA covers first-party API usage; specific use cases reviewed before BAA execution
- Key: BAAs signed before Dec 2, 2025 cover API only, not the Enterprise plan
- Also available via AWS Bedrock (under AWS BAA) — this is likely Premera's route
OpenAI
- BAA available for API services — OpenAI Help Center
- Available via Azure OpenAI Service (under Microsoft BAA) — more common for enterprise healthcare
- Zero data retention option available
- Does not use customer data for training when BAA is active
Critical BAA Clauses for AI Systems
Standard BAAs need enhancement for LLM use cases. Must explicitly address:
| Clause | Why It Matters |
|---|---|
| No training on PHI | Prevent patient data from entering model weights |
| Data retention limits | Define how long prompts/responses are stored |
| Subcontractor flow-down | BAA obligations pass to any sub-processors |
| Breach notification timeline | Usually 60 days max under HIPAA, often negotiated shorter |
| Model versioning | Which model versions are covered by the BAA |
| Incident response | Process for AI-specific incidents (hallucination causing harm, data exposure in outputs) |
HIPAA Security Rule NPRM (Dec 2024)
The proposed update to the HIPAA Security Rule explicitly addresses AI:
- Requires inventory of all AI technologies that interact with ePHI
- AI tools must be included in risk analysis and risk management
- Vulnerability scanning every 6 months, penetration testing annually
- MFA required (removing "addressable" distinction — everything is "required" now)
- Entities must monitor for known vulnerabilities and patch promptly
If finalized, this means every payer using AI with PHI must formally track and assess their AI systems as part of HIPAA compliance.
Sources:
- HIPAA Compliance for AI in Digital Health (Foley & Lardner)
- AI Meets HIPAA Security (Online and On Point)
- HIPAA Security Rule NPRM Factsheet (HHS)
- Advancing Claude in Healthcare (Anthropic)
Cloud Platform Comparison
Azure OpenAI Service
| Feature | Status |
|---|---|
| HIPAA eligible | Yes (text models; preview features excluded) |
| BAA mechanism | Microsoft DPA (Data Protection Addendum) — automatic for all customers |
| Data retention | No prompt/completion data stored for training; opt-out of all logging available |
| Network isolation | VNet, private endpoints, Azure AD RBAC, Conditional Access |
| PHI in prompts | Allowed under BAA with proper safeguards |
| Realtime API (audio) | NOT HIPAA-eligible (still in preview) |
| Model training on data | Explicitly prohibited — customer data never used to retrain |
Likely Premera pattern: Azure OpenAI via private endpoint within their Azure VPC, accessed through AI security gateway.
AWS Bedrock
| Feature | Status |
|---|---|
| HIPAA eligible | Yes — included in AWS BAA |
| BAA mechanism | AWS Business Associate Addendum |
| Data retention | Customer data not shared with model providers, not used to improve base models |
| Network isolation | AWS PrivateLink, VPC endpoints, IAM with least privilege |
| PHI in prompts | Allowed under BAA |
| Encryption | AES-256 at rest, TLS 1.2+ in transit |
| Models available | Claude (Anthropic), Llama, Titan, others |
| Monitoring | CloudTrail + CloudWatch (configured to exclude PHI from logs) |
Shared Responsibility: AWS secures the infrastructure; customer secures their data, access controls, and application logic.
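If Premera's gateway fronts Bedrock, requests to Claude use Anthropic's messages body format for Bedrock's InvokeModel API. A minimal sketch of building that payload follows; the actual call would go through boto3's `bedrock-runtime` client (`invoke_model`) over a VPC endpoint so traffic never leaves the VPC, and the model ID would be whichever Claude version the gateway exposes.

```python
import json

def bedrock_claude_body(prompt: str, max_tokens: int = 512) -> str:
    """Anthropic messages payload for Bedrock's InvokeModel API.
    The actual call would use boto3: client('bedrock-runtime').invoke_model(
    modelId=..., body=...), routed through a VPC endpoint."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })

body = bedrock_claude_body("Summarize the attached prior-auth request.")
```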
On-Premise / Private Inference
| Model | Use Case | Notes |
|---|---|---|
| Llama 3 (70B/8B) | General clinical NLP | Meta open-source, deployable on-prem |
| Meditron (7B/70B) | Medical-domain tasks | Built on Llama 2, trained on medical corpus |
| Mistral 7B | Clinical note summarization | Efficient, good for constrained environments |
| John Snow Labs | Clinical NLP/de-identification | Commercial support, HIPAA-focused |
On-prem models deployed via vLLM, NVIDIA Triton, or similar inference servers. Cost-effective at scale but requires ML engineering capacity.
Sources:
- Azure OpenAI HIPAA Compliance (Microsoft Q&A)
- AWS Bedrock HIPAA BAA (SCIMUS)
- HIPAA Compliance for GenAI on AWS (AWS Blog)
- Bedrock Security and Compliance (AWS)
- Meditron (GitHub)
AI Governance Frameworks for Payers
NIST AI Risk Management Framework (AI RMF 1.0)
The primary framework payers reference. Four core functions:
- GOVERN — Establish policies, roles, accountability for AI risk
- MAP — Identify and categorize AI risks in context
- MEASURE — Assess and track AI risks quantitatively
- MANAGE — Prioritize and act on identified risks
2025 updates pushed organizations from planning to operationalizing AI risk management. RMF 1.1 guidance expected through 2026.
ECRI Institute named AI the #1 health technology hazard for 2025, pushing payer adoption of formal frameworks.
AHIP Position (Health Plan Industry)
AHIP published a one-pager (May 2025) emphasizing:
- AI increases access to quality care and improves health outcomes
- Health plans are investing in governance models and accountability frameworks
- Common challenges: fragmented data, unclear value measurement, limited governance, difficulty scaling responsibly
AHIP hosted sessions in 2025 on how health plans can manage AI portfolios for strategic alignment with enterprise goals and regulatory expectations.
ONC Health IT Certification (HTI-1)
The HTI-1 Final Rule established first-of-its-kind AI transparency requirements for certified health IT:
- Algorithmic transparency: Provide baseline information about AI/predictive algorithms
- FAVES criteria: Fairness, Appropriateness, Validity, Effectiveness, Safety
- USCDI v3 as baseline standard by Jan 1, 2026
- Compliance deadline: Feb 28, 2026
However: The Trump administration's HTI-5 proposed rule (2025) would remove "model card" requirements and eliminate 50%+ of certification criteria. Status uncertain — watch this space.
Premera's Framework
Premera's governance maps to industry patterns:
- Cross-functional Data & AI Ethics Committee
- Five principles (transparent, fair, private/secure, accountable, improving)
- CISO involvement in AI governance
- White House safety pledge signatory
Sources:
- NIST AI RMF (NIST)
- AI Risk Management with NIST in Healthcare (Censinet)
- ONC HTI-1 Final Rule (HealthIT.gov)
- HTI-5 Proposed Rule (Healthcare Dive)
- AHIP AI One-Pager
Emerging Regulations
Federal: CMS Rules for Medicare Advantage AI
CMS Guidance on AI in Coverage Decisions:
- MA orgs may use algorithms to support decisions, but full responsibility remains with the insurer
- Every coverage decision must rely on individual member circumstances — not just algorithmic output
- All coverage criteria used by algorithms must be publicly accessible — no black-box decisions
- Two explicit prohibitions:
- Predictive algorithms cannot apply non-public internal criteria
- AI cannot shift or alter coverage criteria over time
Prior Authorization Transparency (2026):
- MA orgs must publish list of all items/services requiring PA
- Must report 8 distinct PA metrics (approval/denial rates, turnaround times) at contract level
- Suspended: Health equity expertise requirements for UM committees and plan-level disparity reports (June 2025)
WISeR AI Pilot Program (January 2026):
- CMS testing AI for PA screening on select Medicare services
- AI companies handle initial screening; human clinician reviews all denials
- AI companies prohibited from compensation tied to denial rates
- Covers: skin and tissue substitutes, nerve stimulator implants, knee arthroscopy
Sources:
- CMS AI Guidance for Payers (Inovaare)
- CMS Suspends PA Transparency Rules (Georgetown)
- CMS AI PA Pilot (Jones Day)
- Medicare AI Experiment Alarm (Stateline)
Federal: HIPAA Security Rule NPRM
See HIPAA and GenAI section above. Key additions if finalized:
- Mandatory AI technology inventory
- AI included in formal risk analysis
- All security specs become "required" (no more "addressable")
- Vulnerability scanning every 6 months, pen testing annually
State-Level AI Insurance Regulations
NAIC Model Bulletin (Adopted by 24 States as of March 2025)
The NAIC Model Bulletin on Use of AI Systems by Insurers (Dec 2023) requires:
- Written program for responsible AI use
- Risk management and internal audit for AI systems
- Mitigation of adverse consumer outcomes
- Governance framework covering all AI that affects regulated insurance practices
Adopted by: AK, CT, DE, IL, IA, KY, MD, MA, MI, NE, NV, NH, NJ, NC, PA, RI, VT, VA, WA, WV, WI, DC, and others.
California (Effective Jan 2026)
- Health plans/insurers cannot rely solely on automated tools for coverage decisions
- Any adverse determination must be reviewed by a licensed clinician
- Must disclose when AI contributes to a decision
- Accessible appeals processes required
- GenAI developers must disclose training data sources and apply watermarking
Colorado AI Act (Enforcement June 30, 2026)
- Toughest state framework: disclosure required when AI is used in high-risk decisions
- Annual impact assessments
- Anti-bias controls
- Record-keeping for 3+ years
- Applies to health benefit plans (effective Oct 15, 2025 for unfair discrimination rules)
Connecticut
- Limits insurers' use of AI to deny medical care coverage
- Aligned with NAIC Model Bulletin
Sources:
- NAIC Model Bulletin (Quarles)
- NAIC Model Bulletin Text (NAIC)
- Navigating AI Regulatory Landscape 2026 (GenHealth)
- State AI Laws Effective Jan 2026 (King & Spalding)
- Tracking AI Insurance Regulation (Fenwick)
- California AI in Healthcare (Hooper Lundy)
Practical Deployment Patterns
Pattern 1: Proxy Gateway with De-identification (Most Deployed)
App → De-ID Engine → AI Gateway → Cloud LLM API (BAA) → Gateway → Re-ID → App
- De-identify PHI before it hits the LLM
- Gateway enforces policies, logs everything
- Re-link identifiers on the way back
- Used when: organization wants to minimize PHI exposure to cloud providers
Tools: John Snow Labs NLP, AWS Comprehend Medical, custom NER models
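The tokenize-with-pseudonyms/re-link step above can be sketched as follows. This is stdlib-only and uses a naive MRN regex purely to show the mapping mechanics; real pipelines use clinical NER models (such as the tools listed above), not regex alone.

```python
import re

def deidentify(text: str, link_map: dict) -> str:
    """Replace MRN-like identifiers with consistent pseudonyms, recording
    the mapping so responses can be re-linked afterward."""
    def substitute(match: re.Match) -> str:
        mrn = match.group(0)
        if mrn not in link_map:
            link_map[mrn] = f"PATIENT_{len(link_map) + 1}"
        return link_map[mrn]  # the same MRN always maps to the same pseudonym
    return re.sub(r"\b\d{8}\b", substitute, text)

def relink(text: str, link_map: dict) -> str:
    """Reverse the mapping on the LLM response."""
    for mrn, token in link_map.items():
        text = text.replace(token, mrn)
    return text

links = {}
safe = deidentify("Claims for 12345678 and 87654321; 12345678 is primary.", links)
# safe -> "Claims for PATIENT_1 and PATIENT_2; PATIENT_1 is primary."
restored = relink(safe, links)
```

Consistent pseudonyms are the key design point: the LLM can still reason about "the same patient" across a prompt without ever seeing the real identifier.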
Pattern 2: In-VPC Cloud API with Full PHI (Premera's Pattern)
App → AI Gateway (VPC) → Private Endpoint → Cloud LLM API (BAA) → Gateway → App
│ │
└────────────── Phoenix Collector (Traces) ──────────────────┘
- PHI stays within VPC boundary
- Cloud API accessed via PrivateLink/private endpoint (no public internet)
- BAA covers PHI handling end-to-end
- Full audit trail via gateway + tracing
- Used when: organization has BAA with LLM provider and strong network controls
Pattern 3: Private Inference (Air-Gapped)
App → Local Inference Server (vLLM/Triton) → App
- Self-hosted models on organization's hardware/VPC
- Zero data egress
- Models: Llama 3, Mistral, Meditron, domain-fine-tuned variants
- Used when: maximum security requirements, or when cloud APIs can't handle specific use cases
Pattern 4: Hybrid (Emerging)
┌─ Simple tasks → Local small model (7B)
App → Router ─────┤
└─ Complex tasks → Cloud API (Claude/GPT) via gateway
- Route by complexity/sensitivity
- Sensitive summarization → local model
- Complex clinical reasoning → cloud API under BAA
- Cost optimization + security optimization
What's Actually Working in Production
Based on the research:
- Azure OpenAI via private endpoint is the most common pattern at large payers (Microsoft's healthcare presence is dominant)
- AWS Bedrock is growing, especially for organizations already on AWS
- De-identification before LLM is the "safe" approach most compliance teams accept first
- Full-PHI via BAA is where mature organizations are moving — the de-ID approach loses too much clinical context
- On-prem inference is mostly at academic medical centers doing research, not yet common at payers
- AI gateways are becoming standard — the pattern of centralized control, audit, and policy enforcement is converging across the industry
Sources:
- Secure GenAI Architecture in HealthTech (Sekurno)
- Secure Infrastructure for GenAI in Healthcare (JAMIA/Oxford)
- Private AI in Healthcare (AI21)
- GenAI Security in Healthcare (NeuralTrust)
- Healthcare PHI AI Gateway (Kiteworks)
Implications for DaisyAI at Premera
What We Must Do
- Instrument for Phoenix: All LLM calls must emit OpenTelemetry traces compatible with Premera's Phoenix collector. This is non-negotiable — budget time for instrumentation.
- Route through their gateway: No direct API calls to Anthropic/OpenAI. Our code calls Premera's AI gateway endpoint. We need their gateway URL, auth credentials, and supported models.
- Submit precise SRP requests: Map out exactly which data domains we need access to (claims, clinical, member demographics, provider data). Specify the exact access tier. Mistakes cost 4 weeks.
- Prepare for audit: Every design decision about data flow, PHI handling, and LLM usage will be scrutinized. Document architecture decisions proactively.
- No data extraction: Nothing leaves the VPC. No copying data to our systems. No screenshots of PHI. No local development with real data.
What We Should Prepare Before Day 1
| Item | Why |
|---|---|
| Architecture doc showing data flows | Security team will ask for this immediately |
| List of data elements needed per use case | Prevents SRP rework |
| OpenTelemetry integration plan | Shows we understand their tracing requirements |
| De-identification strategy (even if not primary pattern) | Demonstrates defense-in-depth thinking |
| HIPAA training certificates | Required for all personnel accessing PHI |
| Incident response plan | What happens if our system exposes PHI |
Risk Factors
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| SRP provisioning delays | High | 4+ weeks lost | Submit SRPs immediately, be precise |
| Wrong access tier provisioned | Medium | 4 weeks rework | Review with Premera contact before submission |
| Gateway compatibility issues | Medium | Days of debugging | Get gateway API docs early, build against them |
| Phoenix tracing format mismatch | Low | Days of rework | Validate OTel span format with their team |
| Model availability via gateway | Low | Architecture change | Confirm which Claude/GPT models are available |
| Regulatory change mid-engagement | Medium | Scope change | Track CMS/state rules actively |
Key Takeaways
- Premera's approach is Pattern 2 — in-VPC processing with BAA-covered cloud APIs, full PHI in prompts, centralized gateway, full tracing. This is the mature enterprise pattern.
- Their security posture is shaped by the 2014 breach — expect conservatism, documentation requirements, and thorough audit. Work with it, not against it.
- The regulatory environment is tightening — CMS AI guidance, NAIC model bulletin adoption (24 states), California/Colorado laws, HIPAA NPRM all create compliance pressure. Our product helps payers navigate this.
- Anthropic and OpenAI both offer BAAs — but Premera holds the relationships. We operate under their umbrella. No need for us to establish separate BAAs.
- The AI gateway pattern is industry-standard — this isn't Premera being unusual. Every large payer is converging on this architecture. What we learn here transfers directly to other payer deployments.
- Provisioning is the bottleneck — not technology. 4 weeks through 5 teams. Plan accordingly. Be precise in access requests. Build against mock data until real access is granted.