Security & Compliance: Deploying AI Inside a Payer's Infrastructure

Research for DaisyAI's Premera Blue Cross engagement. Focused on practical realities of running LLM-based systems with PHI inside a health plan's secure environment.

Last updated: 2026-02-12


Table of Contents

  1. Premera-Specific Context
  2. PHI in AI/LLM Pipelines
  3. AI Security Gateway Architecture
  4. Enterprise AI Provisioning at Health Plans
  5. HIPAA and GenAI: BAAs, APIs, and Deployment Models
  6. Cloud Platform Comparison: Azure OpenAI vs. AWS Bedrock
  7. AI Governance Frameworks for Payers
  8. Emerging Regulations (Effective 2025-2026)
  9. Practical Deployment Patterns
  10. Implications for DaisyAI at Premera

Premera-Specific Context

What we know from the Feb 11, 2026 call:

| Detail | What It Means |
| --- | --- |
| All LLM calls go through their "AI security gateway" | Centralized proxy/control plane between apps and LLM providers. We route through it, not around it. |
| Full tracing via Phoenix collector | Arize Phoenix — open-source AI observability built on OpenTelemetry. Every prompt/response logged, traced, evaluated. |
| Data cannot leave their VPC | No calling external APIs directly. Must use Premera's own API keys via their gateway. |
| We use Premera's own Anthropic/OpenAI API relationships | Premera holds the BAAs with Anthropic and OpenAI. We operate under their umbrella. |
| SRP tickets take ~4 weeks through 5 teams | Provisioning is bureaucratic — identity, network, data, security, and application teams all sign off. |
| Provisioning mistakes (global vs. data standard access) cause delays | Getting the wrong access tier means re-doing the SRP. Specificity matters upfront. |

Premera's AI Governance

Premera has a public AI Practices page and a cross-functional Data & AI Ethics Committee with five principles:

  1. Be transparent — disclose when AI contributes to decisions
  2. Be fair — avoid unfair discrimination
  3. Protect privacy and security — CISO deeply involved in AI governance
  4. Be accountable — maintain human oversight
  5. Continually improve — iterate on AI safety practices

Premera was among 25+ payers/providers that signed the White House AI safety pledge for healthcare.

Premera's Security History

Context that explains their conservative posture:

  • 2014-2015 breach: APT group had unauthorized access for ~9 months, affecting 10.4 million individuals
  • $6.85M OCR HIPAA settlement — the second-largest in OCR history at the time (HHS enforcement)
  • $10M state settlement (WA Attorney General) + $74M class-action settlement
  • Root cause: failure to conduct enterprise-wide risk analysis, inadequate audit controls, ignored auditor warnings

This history directly shapes their current security posture. They will be conservative. They will over-audit. They will require extensive documentation. This is rational behavior from their perspective.


PHI in AI/LLM Pipelines

The Core Problem

PHI + LLM = regulatory minefield. The question is not "can you do it?" but "under what conditions?"

What Constitutes PHI in LLM Context

Any of the 18 HIPAA identifiers combined with health information, including:

  • Patient names, DOBs, SSNs, MRNs in prompts
  • Clinical notes passed as context
  • Diagnosis codes linked to individuals
  • Claims data with member identifiers

How Payers Handle PHI with GenAI

Three dominant patterns emerging in production:

Pattern 1: De-identify Before Inference (Most Common)

  • Strip/replace all 18 HIPAA identifiers before prompt construction
  • Use NLP-based de-identification (e.g., John Snow Labs Healthcare NLP)
  • Tokenize identifiers with consistent pseudonyms for re-linking
  • Send de-identified data to LLM, re-link on response
  • Pro: Minimizes risk exposure. Con: Lossy — clinical context can be degraded
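The tokenize-and-re-link step can be sketched in a few lines. This is a minimal illustration in pure Python with placeholder regexes (SSN, MRN, DOB only) — a real deployment would use a clinical de-identification engine covering all 18 identifiers, not hand-rolled patterns:

```python
import re

class Pseudonymizer:
    """Replace identifiers with consistent tokens so LLM responses can be re-linked.
    Regexes are illustrative only, not a complete HIPAA Safe Harbor pass."""

    PATTERNS = {
        "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        "MRN": re.compile(r"\bMRN[:\s]*\d{6,10}\b"),
        "DOB": re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
    }

    def __init__(self):
        self.forward = {}   # real value -> token
        self.reverse = {}   # token -> real value

    def deidentify(self, text: str) -> str:
        # Assign each distinct identifier a stable token before prompt construction.
        for label, pattern in self.PATTERNS.items():
            for match in pattern.findall(text):
                if match not in self.forward:
                    token = f"[{label}_{len(self.forward) + 1}]"
                    self.forward[match] = token
                    self.reverse[token] = match
                text = text.replace(match, self.forward[match])
        return text

    def reidentify(self, text: str) -> str:
        # Re-link tokens in the model's response on the way back.
        for token, real in self.reverse.items():
            text = text.replace(token, real)
        return text
```

Consistent tokens matter: if the same MRN appears in two prompts, the model sees the same pseudonym, preserving cross-reference structure while hiding the identifier.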

Pattern 2: In-VPC Processing with BAA-Covered APIs (Premera's Approach)

  • Keep all data within the organization's VPC
  • Route through BAA-covered cloud provider APIs (Azure OpenAI, AWS Bedrock)
  • Full audit trail via gateway
  • Pro: Full clinical fidelity. Con: Requires robust infrastructure and BAAs

Pattern 3: On-Premise/Private Inference

  • Self-host open-source models (Llama, Mistral, Meditron)
  • No data ever leaves organizational boundary
  • Pro: Maximum control. Con: Operational overhead, potentially lower model quality



AI Security Gateway Architecture

What Premera Means by "AI Security Gateway"

An AI gateway (also called LLM proxy or LLM router) is a content-aware reverse proxy that sits between applications and LLM providers. Unlike a standard API gateway, it inspects prompt/response content.

Architecture Diagram (Logical)

┌─────────────────────────────────────────────────────────┐
│                    Premera VPC                          │
│                                                         │
│  ┌──────────┐    ┌──────────────────┐    ┌───────────┐ │
│  │ DaisyAI  │───▶│  AI Security     │───▶│ Anthropic │ │
│  │ App      │    │  Gateway         │    │ API (BAA) │ │
│  └──────────┘    │                  │    ├───────────┤ │
│                  │  • Auth (JWT)    │    │ OpenAI    │ │
│                  │  • PHI scanning  │    │ API (BAA) │ │
│                  │  • PII redaction │    └───────────┘ │
│                  │  • Rate limiting │                   │
│                  │  • Token budgets │    ┌───────────┐ │
│                  │  • Audit logging │───▶│ Phoenix   │ │
│                  │  • Policy rules  │    │ Collector │ │
│                  └──────────────────┘    │ (Tracing) │ │
│                                          └───────────┘ │
└─────────────────────────────────────────────────────────┘

Gateway Capabilities (Industry Standard)

Based on enterprise AI gateway patterns:

| Capability | What It Does | Why It Matters |
| --- | --- | --- |
| PII/PHI Detection & Redaction | Scans prompts for identifiers, optionally redacts before forwarding | Prevents accidental PHI exposure |
| Policy Enforcement | Rules about which models, which data, which users | Least-privilege for AI |
| Token Budget Management | Per-user/per-app token limits | Cost control and abuse prevention |
| Semantic Caching | Cache similar queries to reduce API calls | Performance + cost |
| Model Routing | Route different request types to different models | Use cheaper models for simple tasks |
| Prompt Injection Defense | Filter malicious prompt patterns (OWASP LLM01) | Security |
| Full Audit Trail | Log every interaction with user, timestamp, input, output | Compliance + forensics |
| Data Residency Enforcement | Ensure requests only go to approved regions | Regulatory compliance |
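The first three capabilities compose into a simple admission check. A minimal sketch — the policy values and app IDs are hypothetical, and a production gateway would enforce these from config, not hardcoded constants:

```python
import re

# Hypothetical policy config — real values come from the gateway's admin plane.
ALLOWED_MODELS = {"claude-sonnet", "gpt-4o"}
TOKEN_BUDGETS = {"daisy-app": 100_000}                  # per-day tokens per app
PHI_PATTERNS = [re.compile(r"\b\d{3}-\d{2}-\d{4}\b")]   # SSN, illustrative only

def check_request(app_id, model, prompt, est_tokens, usage):
    """Return (allowed, reason) for a single LLM request at the gateway."""
    if model not in ALLOWED_MODELS:
        return False, f"model {model} not on allowlist"
    if usage.get(app_id, 0) + est_tokens > TOKEN_BUDGETS.get(app_id, 0):
        return False, "token budget exceeded"
    for pattern in PHI_PATTERNS:
        if pattern.search(prompt):
            return False, "unredacted identifier detected"
    return True, "ok"
```

The ordering is deliberate: cheap checks (allowlist, budget) run before content scanning, and every decision — allowed or denied — would also be written to the audit log.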

Common Implementations

  • Azure API Management AI Gateway (Microsoft Learn)
  • Databricks Mosaic AI Gateway (Databricks)
  • Apache APISIX AI Gateway (APISIX)
  • Portkey (portkey.ai)
  • Custom builds on Kong, NGINX, or Envoy with LLM-specific plugins

Phoenix Collector (Premera's Tracing)

Arize Phoenix is open-source AI observability:

  • Built on OpenTelemetry — vendor/framework agnostic
  • Traces LLM calls end-to-end: prompt → model → response → evaluation
  • Self-hostable — no external data egress required
  • Integrates with LangChain, LlamaIndex, direct API calls
  • Supports evaluation benchmarking and dataset versioning

Implication for DaisyAI: Our code must emit OpenTelemetry spans compatible with their Phoenix collector. This means instrumenting our LLM calls with the right trace context.



Enterprise AI Provisioning at Health Plans

What Onboarding Looks Like

Based on enterprise patterns and what Premera told us about their 5-team, 4-week SRP process:

Step 1: Identity & Access Management (Week 1)

  • Contractor/consultant account creation in Premera's IdP (likely Azure AD/Entra ID)
  • MFA enrollment (required; the proposed HIPAA Security Rule update would make MFA mandatory if finalized)
  • RBAC role assignment — critical to get right the first time
    • "Global access" vs. "data standard access" — the distinction Premera mentioned
    • Global = broader than needed, violates least privilege
    • Data standard = scoped to specific data domains (claims, clinical, member)
  • Background check, security training, HIPAA awareness certification

Step 2: Network Access (Week 1-2)

  • VPN provisioning into Premera VPC
  • Network segmentation — access only to approved subnets
  • Private endpoints for cloud services (no public internet paths)
  • Firewall rules allowing traffic to/from AI gateway

Step 3: Data Access (Week 2-3)

  • Determine which data domains the engagement requires
  • Provision database/data lake credentials with row/column-level security
  • PHI access requires specific justification per data element
  • Audit logging enabled on all data access

Step 4: AI Service Access (Week 3)

  • Register application with AI security gateway
  • Receive API keys/tokens scoped to approved models
  • Configure token budgets and rate limits
  • Set up Phoenix tracing integration

Step 5: Security Review & Go-Live (Week 3-4)

  • Security team reviews architecture, data flows, access patterns
  • Penetration test or vulnerability scan of deployed components
  • Sign-off from all 5 teams (identity, network, data, security, application)
  • Production access granted

Common Provisioning Pitfalls

| Pitfall | What Happens | How to Avoid |
| --- | --- | --- |
| Wrong access tier requested | Re-do the entire SRP (4 more weeks) | Be extremely specific about data domains needed upfront |
| Missing data elements from request | Can't access needed claims fields | Map out every data element before submitting SRP |
| VPN config issues | Can't reach internal services | Test connectivity immediately, don't wait |
| Expired credentials | Locked out, need IT ticket | Track expiration dates, renew proactively |
| Overly broad access request | Security team rejects, sends back for scoping | Start narrow, expand only if justified |



HIPAA and GenAI

Can You Send PHI to Claude/GPT APIs?

Yes, under specific conditions:

  1. BAA must be in place between the covered entity (or business associate) and the API provider
  2. API provider must be HIPAA-eligible for that specific service
  3. Data handling must meet HIPAA Security Rule requirements (encryption, access controls, audit trails)
  4. No training on PHI — the provider must contractually agree not to use PHI for model training

Anthropic (Claude)

  • BAA available on Enterprise plans — Anthropic Privacy Center
  • HIPAA-ready Enterprise plans available — Claude Help Center
  • Claude for Healthcare launched Jan 2026 with HIPAA-compliant configurations for enterprise
  • BAA covers first-party API usage; specific use cases reviewed before BAA execution
  • Key: BAAs signed before Dec 2, 2025 cover API only, not the Enterprise plan
  • Also available via AWS Bedrock (under AWS BAA) — this is likely Premera's route

OpenAI

  • BAA available for API services — OpenAI Help Center
  • Available via Azure OpenAI Service (under Microsoft BAA) — more common for enterprise healthcare
  • Zero data retention option available
  • Does not use customer data for training when BAA is active

Critical BAA Clauses for AI Systems

Standard BAAs need enhancement for LLM use cases. Must explicitly address:

| Clause | Why It Matters |
| --- | --- |
| No training on PHI | Prevent patient data from entering model weights |
| Data retention limits | Define how long prompts/responses are stored |
| Subcontractor flow-down | BAA obligations pass to any sub-processors |
| Breach notification timeline | Usually 60 days max under HIPAA, often negotiated shorter |
| Model versioning | Which model versions are covered by the BAA |
| Incident response | Process for AI-specific incidents (hallucination causing harm, data exposure in outputs) |

HIPAA Security Rule NPRM (Dec 2024)

The proposed update to the HIPAA Security Rule explicitly addresses AI:

  • Requires inventory of all AI technologies that interact with ePHI
  • AI tools must be included in risk analysis and risk management
  • Vulnerability scanning every 6 months, penetration testing annually
  • MFA required (the NPRM would remove the "addressable" designation — every implementation specification becomes "required")
  • Entities must monitor for known vulnerabilities and patch promptly

If finalized, this means every payer using AI with PHI must formally track and assess their AI systems as part of HIPAA compliance.



Cloud Platform Comparison

Azure OpenAI Service

| Feature | Status |
| --- | --- |
| HIPAA eligible | Yes (text models; preview features excluded) |
| BAA mechanism | Microsoft DPA (Data Protection Addendum) — automatic for all customers |
| Data retention | No prompt/completion data stored for training; opt-out of all logging available |
| Network isolation | VNet, private endpoints, Azure AD RBAC, Conditional Access |
| PHI in prompts | Allowed under BAA with proper safeguards |
| Realtime API (audio) | NOT HIPAA-eligible (still in preview) |
| Model training on data | Explicitly prohibited — customer data never used to retrain |

Likely Premera pattern: Azure OpenAI via private endpoint within their Azure VPC, accessed through AI security gateway.

AWS Bedrock

| Feature | Status |
| --- | --- |
| HIPAA eligible | Yes — included in AWS BAA |
| BAA mechanism | AWS Business Associate Addendum |
| Data retention | Customer data not shared with model providers, not used to improve base models |
| Network isolation | AWS PrivateLink, VPC endpoints, IAM with least privilege |
| PHI in prompts | Allowed under BAA |
| Encryption | AES-256 at rest, TLS 1.2+ in transit |
| Models available | Claude (Anthropic), Llama, Titan, others |
| Monitoring | CloudTrail + CloudWatch (configured to exclude PHI from logs) |

Shared Responsibility: AWS secures the infrastructure; customer secures their data, access controls, and application logic.
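For reference, calling Claude on Bedrock uses the Anthropic Messages format that AWS documents (`anthropic_version: "bedrock-2023-05-31"`). A minimal body builder — the boto3 client setup, VPC endpoint configuration, and model ID are omitted here because those come from the customer's environment:

```python
import json

def build_claude_bedrock_body(prompt: str, max_tokens: int = 512) -> str:
    """Build the JSON request body for Anthropic models on AWS Bedrock
    (Messages API format). Passed as `body` to bedrock-runtime invoke_model."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })
```

In a Premera-style deployment this request would travel over a PrivateLink VPC endpoint, never the public internet, with CloudTrail logging the API call but not the prompt contents.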

On-Premise / Private Inference

| Model | Use Case | Notes |
| --- | --- | --- |
| Llama 3 (70B/8B) | General clinical NLP | Meta open-source, deployable on-prem |
| Meditron (7B/70B) | Medical-domain tasks | Built on Llama 2, trained on medical corpus |
| Mistral 7B | Clinical note summarization | Efficient, good for constrained environments |
| John Snow Labs | Clinical NLP/de-identification | Commercial support, HIPAA-focused |

On-prem models deployed via vLLM, NVIDIA Triton, or similar inference servers. Cost-effective at scale but requires ML engineering capacity.



AI Governance Frameworks for Payers

NIST AI Risk Management Framework (AI RMF 1.0)

The primary framework payers reference. Four core functions:

  1. GOVERN — Establish policies, roles, accountability for AI risk
  2. MAP — Identify and categorize AI risks in context
  3. MEASURE — Assess and track AI risks quantitatively
  4. MANAGE — Prioritize and act on identified risks

2025 updates pushed organizations from planning to operationalizing AI risk management. RMF 1.1 guidance expected through 2026.

ECRI Institute named AI the #1 health technology hazard for 2025, pushing payer adoption of formal frameworks.

AHIP Position (Health Plan Industry)

AHIP published a one-pager (May 2025) emphasizing:

  • AI increases access to quality care and improves health outcomes
  • Health plans are investing in governance models and accountability frameworks
  • Common challenges: fragmented data, unclear value measurement, limited governance, difficulty scaling responsibly

AHIP hosted sessions in 2025 on how health plans can manage AI portfolios for strategic alignment with enterprise goals and regulatory expectations.

ONC Health IT Certification (HTI-1)

The HTI-1 Final Rule established first-of-its-kind AI transparency requirements for certified health IT:

  • Algorithmic transparency: Provide baseline information about AI/predictive algorithms
  • FAVES criteria: Fairness, Appropriateness, Validity, Effectiveness, Safety
  • USCDI v3 as baseline standard by Jan 1, 2026
  • Compliance deadline: Feb 28, 2026

However: The Trump administration's HTI-5 proposed rule (2025) would remove "model card" requirements and eliminate 50%+ of certification criteria. Status uncertain — watch this space.

Premera's Framework

Premera's governance maps to industry patterns:

  • Cross-functional Data & AI Ethics Committee
  • Five principles (transparent, fair, private/secure, accountable, improving)
  • CISO involvement in AI governance
  • White House safety pledge signatory



Emerging Regulations

Federal: CMS Rules for Medicare Advantage AI

CMS Guidance on AI in Coverage Decisions:

  • MA orgs may use algorithms to support decisions, but full responsibility remains with the insurer
  • Every coverage decision must rely on individual member circumstances — not just algorithmic output
  • All coverage criteria used by algorithms must be publicly accessible — no black-box decisions
  • Two explicit prohibitions:
    1. Predictive algorithms cannot apply non-public internal criteria
    2. AI cannot shift or alter coverage criteria over time

Prior Authorization Transparency (2026):

  • MA orgs must publish list of all items/services requiring PA
  • Must report 8 distinct PA metrics (approval/denial rates, turnaround times) at contract level
  • Suspended: Health equity expertise requirements for UM committees and plan-level disparity reports (June 2025)

WISeR AI Pilot Program (January 2026):

  • CMS testing AI for PA screening on select Medicare services
  • AI companies handle initial screening; human clinician reviews all denials
  • AI companies prohibited from compensation tied to denial rates
  • Covers: skin/tissue substitutions, nerve stimulator implants, knee arthroscopy


Federal: HIPAA Security Rule NPRM

See HIPAA and GenAI section above. Key additions if finalized:

  • Mandatory AI technology inventory
  • AI included in formal risk analysis
  • All security specs become "required" (no more "addressable")
  • Vulnerability scanning every 6 months, pen testing annually

State-Level AI Insurance Regulations

NAIC Model Bulletin (Adopted by 24 States as of March 2025)

The NAIC Model Bulletin on Use of AI Systems by Insurers (Dec 2023) requires:

  • Written program for responsible AI use
  • Risk management and internal audit for AI systems
  • Mitigation of adverse consumer outcomes
  • Governance framework covering all AI that affects regulated insurance practices

Adopted by: AK, CT, DE, IL, IA, KY, MD, MA, MI, NE, NV, NH, NJ, NC, PA, RI, VT, VA, WA, WV, WI, DC, and others.

California (Effective Jan 2026)

  • Health plans/insurers cannot rely solely on automated tools for coverage decisions
  • Any adverse determination must be reviewed by a licensed clinician
  • Must disclose when AI contributes to a decision
  • Accessible appeals processes required
  • GenAI developers must disclose training data sources and apply watermarking

Colorado AI Act (Enforcement June 30, 2026)

  • Toughest state framework: disclosure required when AI is used in high-risk decisions
  • Annual impact assessments
  • Anti-bias controls
  • Record-keeping for 3+ years
  • Applies to health benefit plans (effective Oct 15, 2025 for unfair discrimination rules)

Connecticut

  • Limits insurers' use of AI to deny medical care coverage
  • Aligned with NAIC Model Bulletin



Practical Deployment Patterns

Pattern 1: Proxy Gateway with De-identification (Most Deployed)

App → De-ID Engine → AI Gateway → Cloud LLM API (BAA) → Gateway → Re-ID → App
  • De-identify PHI before it hits the LLM
  • Gateway enforces policies, logs everything
  • Re-link identifiers on the way back
  • Used when: organization wants to minimize PHI exposure to cloud providers

Tools: John Snow Labs NLP, AWS Comprehend Medical, custom NER models

Pattern 2: In-VPC Cloud API with Full PHI (Premera's Pattern)

App → AI Gateway (VPC) → Private Endpoint → Cloud LLM API (BAA) → Gateway → App
         │                                                            │
         └────────────── Phoenix Collector (Traces) ──────────────────┘
  • PHI stays within VPC boundary
  • Cloud API accessed via PrivateLink/private endpoint (no public internet)
  • BAA covers PHI handling end-to-end
  • Full audit trail via gateway + tracing
  • Used when: organization has BAA with LLM provider and strong network controls
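From the application's side, this pattern means the app builds requests against the gateway, never the provider. A sketch of the client half — the URL, header names, and payload shape are all placeholders until we get Premera's gateway API docs:

```python
import json
import urllib.request

# Hypothetical endpoint — the real URL and auth scheme come from gateway
# registration (Step 4 of the SRP process).
GATEWAY_URL = "https://ai-gateway.internal.premera.example/v1/chat"

def build_gateway_request(prompt: str, model: str, token: str) -> urllib.request.Request:
    """Build a request to the AI gateway. Note the app holds only a
    gateway-scoped token, never Anthropic/OpenAI keys."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return urllib.request.Request(
        GATEWAY_URL,
        data=payload.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The design point: if the gateway rotates provider keys, changes models, or tightens policy, the app code above is unaffected — that indirection is the whole value of the pattern.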

Pattern 3: Private Inference (Air-Gapped)

App → Local Inference Server (vLLM/Triton) → App
  • Self-hosted models on organization's hardware/VPC
  • Zero data egress
  • Models: Llama 3, Mistral, Meditron, domain-fine-tuned variants
  • Used when: maximum security requirements, or when cloud APIs can't handle specific use cases

Pattern 4: Hybrid (Emerging)

                  ┌─ Simple tasks → Local small model (7B)
App → Router ─────┤
                  └─ Complex tasks → Cloud API (Claude/GPT) via gateway
  • Route by complexity/sensitivity
  • Sensitive summarization → local model
  • Complex clinical reasoning → cloud API under BAA
  • Cost optimization + security optimization
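The router at the heart of this pattern can be as simple as a rule table. A sketch with made-up task types and model names — real routing criteria (and whether sensitivity or complexity wins) are a policy decision, not a given:

```python
def route(task_type: str, contains_phi: bool) -> str:
    """Pick an inference target by task complexity and data sensitivity.
    Task types and targets here are illustrative placeholders."""
    if contains_phi and task_type == "summarization":
        return "local-7b"          # sensitive summarization stays in-house
    if task_type in {"classification", "extraction"}:
        return "local-7b"          # simple tasks: cheaper local model
    return "cloud-via-gateway"     # complex clinical reasoning under BAA
```

Rule-based routing like this is auditable (you can explain every routing decision to a security reviewer), which matters more in a payer environment than squeezing out the last bit of cost optimization.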

What's Actually Working in Production

Based on the research:

  1. Azure OpenAI via private endpoint is the most common pattern at large payers (Microsoft's healthcare presence is dominant)
  2. AWS Bedrock is growing, especially for organizations already on AWS
  3. De-identification before LLM is the "safe" approach most compliance teams accept first
  4. Full-PHI via BAA is where mature organizations are moving — the de-ID approach loses too much clinical context
  5. On-prem inference is mostly at academic medical centers doing research, not yet common at payers
  6. AI gateways are becoming standard — the pattern of centralized control, audit, and policy enforcement is converging across the industry



Implications for DaisyAI at Premera

What We Must Do

  1. Instrument for Phoenix: All LLM calls must emit OpenTelemetry traces compatible with Premera's Phoenix collector. This is non-negotiable — budget time for instrumentation.

  2. Route through their gateway: No direct API calls to Anthropic/OpenAI. Our code calls Premera's AI gateway endpoint. We need their gateway URL, auth credentials, and supported models.

  3. Submit precise SRP requests: Map out exactly which data domains we need access to (claims, clinical, member demographics, provider data). Specify the exact access tier. Mistakes cost 4 weeks.

  4. Prepare for audit: Every design decision about data flow, PHI handling, and LLM usage will be scrutinized. Document architecture decisions proactively.

  5. No data extraction: Nothing leaves the VPC. No copying data to our systems. No screenshots of PHI. No local development with real data.

What We Should Prepare Before Day 1

| Item | Why |
| --- | --- |
| Architecture doc showing data flows | Security team will ask for this immediately |
| List of data elements needed per use case | Prevents SRP rework |
| OpenTelemetry integration plan | Shows we understand their tracing requirements |
| De-identification strategy (even if not primary pattern) | Demonstrates defense-in-depth thinking |
| HIPAA training certificates | Required for all personnel accessing PHI |
| Incident response plan | What happens if our system exposes PHI |

Risk Factors

| Risk | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- |
| SRP provisioning delays | High | 4+ weeks lost | Submit SRPs immediately, be precise |
| Wrong access tier provisioned | Medium | 4 weeks rework | Review with Premera contact before submission |
| Gateway compatibility issues | Medium | Days of debugging | Get gateway API docs early, build against them |
| Phoenix tracing format mismatch | Low | Days of rework | Validate OTel span format with their team |
| Model availability via gateway | Low | Architecture change | Confirm which Claude/GPT models are available |
| Regulatory change mid-engagement | Medium | Scope change | Track CMS/state rules actively |

Key Takeaways

  1. Premera's approach is Pattern 2 — in-VPC processing with BAA-covered cloud APIs, full PHI in prompts, centralized gateway, full tracing. This is the mature enterprise pattern.

  2. Their security posture is shaped by the 2014 breach — expect conservatism, documentation requirements, and thorough audit. Work with it, not against it.

  3. The regulatory environment is tightening — CMS AI guidance, NAIC model bulletin adoption (24 states), California/Colorado laws, HIPAA NPRM all create compliance pressure. Our product helps payers navigate this.

  4. Anthropic and OpenAI both offer BAAs — but Premera holds the relationships. We operate under their umbrella. No need for us to establish separate BAAs.

  5. The AI gateway pattern is industry-standard — this isn't Premera being unusual. Every large payer is converging on this architecture. What we learn here transfers directly to other payer deployments.

  6. Provisioning is the bottleneck — not technology. 4 weeks through 5 teams. Plan accordingly. Be precise in access requests. Build against mock data until real access is granted.
