
How Velora handles PHI.

The short version: you (the covered entity) own the obligation to minimize PHI before sharing a claims file. Velora provides a cryptographic backstop that de-identifies anything you miss. Details below — this page is linked from every upload surface and is shareable with your compliance team.

01 · Your Responsibility

You are the covered entity. PHI minimization is your duty.

If you're a TPA, MGU, broker, or plan sponsor, HIPAA designates you as a covered entity or business associate. That means the Minimum Necessary rule (45 CFR §164.502(b)) requires you to strip patient identifiers from any file shared with a vendor whenever the analysis doesn't actually need them.

Velora's analytics do not need patient PHI. Every product — rate lookups, claims repricing, TPA spread, PBM transparency, leakage detection — runs on three things: provider NPI, procedure code, and dollar amount. Patient name, SSN, DOB, member ID, MRN, and address add zero analytical value and maximum compliance risk.

We strongly recommend stripping those columns from your claims export before uploading to Velora. Getting a PHI warning in our UI is not a substitute for doing your job under HIPAA — the obligation is on your side, not ours.

02 · Our Backstop

If you upload PHI anyway, we de-identify it before anything is persisted.

We implement a defense-in-depth safeguard so that PHI never lands in our analytics warehouse, our database, or our storage in plaintext. The pipeline runs in this order, every time, for every upload:

  1. Your file loads into the browser (nothing transmitted yet).
  2. Column headers are classified against our HIPAA schema using alias-match first, then Claude Haiku for headers the alias lookup can't resolve — headers only, never row data.
  3. The file is sent to Velora over TLS. Our scrub pipeline tokenizes PHI columns in process memory using HMAC-SHA256 keyed by either your customer-provided secret (client-mode) or our per-customer vault secret (server-mode). The tokenization is deterministic per-customer: same input → same token for you, different from every other customer.
  4. Quasi-identifiers (DOB, ZIP, date of service) are coarsened per HIPAA Safe Harbor guidance in the same step.
  5. Only the scrubbed rows are persisted. Raw direct identifiers are dropped from process memory and are never written to our storage, DB, backups, or logs.
  6. Every de-identification event is logged to our HIPAA audit trail (intelligence_audit_log) with phi_involved=true, a hashed tenant identifier, and the columns touched.
  7. Sidecar-mode (live): for customers whose counsel requires plaintext PHI to never leave their VPC, we ship a tokenizer (Docker / pip / npm) you run locally, so only tokens cross the wire. See section 04 below.
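
Step 2 above (alias-match first, model fallback second) can be sketched as follows. This is a minimal illustration, not Velora's classifier: the alias table, the category names, and the `classify_headers` helper are all hypothetical, and the fallback is represented only as a list of headers left unresolved. Only header strings are inspected, never row data.

```python
import re

# Hypothetical alias table: normalized header -> PHI category.
# Velora's actual HIPAA schema is not published on this page.
ALIASES = {
    "patient_name": "patient_name",
    "member_id": "member_id",
    "subscriber_id": "member_id",
    "ssn": "ssn",
    "dob": "dob",
    "date_of_birth": "dob",
    "zip": "zip",
    "npi": "npi",
}


def normalize(header: str) -> str:
    """Lowercase the header and collapse punctuation/whitespace to underscores."""
    return re.sub(r"[^a-z0-9]+", "_", header.strip().lower()).strip("_")


def classify_headers(headers):
    """Return (resolved, unresolved): alias hits, and headers needing a fallback."""
    resolved, unresolved = {}, []
    for header in headers:
        category = ALIASES.get(normalize(header))
        if category is None:
            unresolved.append(header)  # would go to the fallback classifier
        else:
            resolved[header] = category
    return resolved, unresolved


resolved, unresolved = classify_headers(["Patient Name", "Date of Birth", "Pt Ident #"])
print(resolved, unresolved)
```

Keeping the alias lookup first means the common, unambiguous headers are classified without any external call.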

03 · What We Do to Each Column

Tokenize. HMAC-SHA256 with your vault key. One-way. Applies to: patient name · member / subscriber / patient ID · SSN · MRN · claim ID · Rx ID.

Generalize. Coarsened per Safe Harbor. DOB → birth year · ZIP → first 3 digits (if pop >20K) · date of service → month/year.

Keep. Not PHI in analytical context; these flow through unchanged. Provider NPI (public NPPES ID) · CPT / HCPCS codes · billed / paid / allowed amounts · carrier · plan name.
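
The "Generalize" rules can be sketched in a few lines. This is an illustration under stated assumptions, not Velora's code: the function names are hypothetical, the `YYYY-MM` output format is assumed, `LOW_POP_ZIP3` is a small illustrative subset rather than the full Census-derived list, and replacing low-population prefixes with `000` follows the Safe Harbor rule rather than anything stated on this page.

```python
from datetime import date

# Illustrative subset of 3-digit ZIP prefixes covering 20,000 or fewer people.
# A real deployment would load the full, current Census-derived list.
LOW_POP_ZIP3 = {"036", "059", "102"}


def generalize_dob(dob: date) -> str:
    """DOB -> birth year only."""
    return str(dob.year)


def generalize_zip(zip_code: str) -> str:
    """ZIP -> first 3 digits, or "000" when the prefix is low-population."""
    prefix = zip_code.strip()[:3]
    return "000" if prefix in LOW_POP_ZIP3 else prefix


def generalize_service_date(service_date: date) -> str:
    """Date of service -> year and month (assumed YYYY-MM format)."""
    return service_date.strftime("%Y-%m")


print(generalize_dob(date(1984, 7, 12)))
print(generalize_zip("94110-1234"))
print(generalize_service_date(date(2024, 3, 9)))
```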

Why NPI isn't tokenized: provider NPIs are federally published in the NPPES registry — anyone can look up any NPI at npiregistry.cms.hhs.gov. Tokenizing a public ID adds friction without reducing re-identification risk, and it would break the rate-matching logic that our analytics depend on. This is the standard interpretation of HIPAA Safe Harbor when no patient context is attached.

04 · Technical Implementation

Three modes. HMAC-SHA256. Audit-logged.

Tokenization algorithm. Each PHI value is hashed as HMAC-SHA256(vault_key, "column:value"), prefixed with a 3-letter semantic code (MBR_, PTN_, etc.), and truncated to 64 bits of entropy. Column name is mixed into the HMAC so the same ID value appearing under two columns produces distinct tokens. Same algorithm in all three modes; what differs is where the key lives and where the tokenization runs.
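
A minimal sketch of that construction, using only the Python standard library. Several details are assumptions because this page does not specify them: hex encoding of the truncated digest, UTF-8 encoding of the message, the column names, and which column maps to which prefix. Tokens from this sketch are therefore not claimed to be byte-equivalent to Velora's.

```python
import hashlib
import hmac

# Hypothetical column -> prefix mapping; only MBR_ and PTN_ are named in the text.
PREFIXES = {"member_id": "MBR_", "patient_name": "PTN_"}


def tokenize(vault_key: bytes, column: str, value: str) -> str:
    """HMAC-SHA256 over "column:value", truncated to 64 bits, with a prefix."""
    message = f"{column}:{value}".encode("utf-8")
    digest = hmac.new(vault_key, message, hashlib.sha256).digest()
    # 8 bytes = 64 bits; hex encoding is an assumption of this sketch.
    return PREFIXES[column] + digest[:8].hex()


key = bytes(32)  # placeholder key for demonstration only
print(tokenize(key, "member_id", "A123"))
print(tokenize(key, "patient_name", "A123"))  # same value, different column
```

Because the column name is part of the HMAC input, the two calls above yield unrelated token bodies even though the raw value is identical.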

Server-mode (default). Velora holds your HMAC key encrypted at rest (AES-256-GCM, app-layer encrypted so a DB admin cannot decrypt without the master key). Token ↔ original mappings live in a Postgres vault with 90-day TTL and row-level isolation per customer. Outbound rehydration middleware swaps tokens back to originals on analysis responses, so your team can use Velora's portal directly without managing a local map. BAA-covered, audit-logged, the simplest integration.

Client-mode. You generate a 32-byte HMAC secret and send it on every request as the X-Velora-Vault-Key header. Velora tokenizes in process memory and immediately drops the originals; nothing PHI-bearing is written to our storage, DB, backups, or logs. Your application keeps a local {token → original} map and rehydrates results on your side. Smaller breach radius than server-mode for customers with strong infosec posture.
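
On the customer side, client-mode comes down to three small pieces: generating the secret, attaching it as a header, and rehydrating results from the local map. The sketch below assumes hex encoding for the header value and a list-of-dicts result shape; neither is specified on this page, and the helper names are hypothetical.

```python
import secrets


def new_vault_key() -> bytes:
    """Generate a 32-byte secret from the OS CSPRNG."""
    return secrets.token_bytes(32)


def vault_headers(key: bytes) -> dict:
    """Build the request header; hex encoding is an assumption of this sketch."""
    return {"X-Velora-Vault-Key": key.hex()}


def rehydrate(rows, token_map):
    """Swap tokens back to originals using the locally held {token: original} map.

    Values that are not known tokens (amounts, codes, NPIs) pass through unchanged.
    """
    return [
        {column: token_map.get(value, value) for column, value in row.items()}
        for row in rows
    ]


key = new_vault_key()
headers = vault_headers(key)
rows = [{"member": "MBR_ab12", "paid": 120.5}]
print(rehydrate(rows, {"MBR_ab12": "A123"}))
```

The token map never leaves your application in this mode, so losing it means results can no longer be rehydrated.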

Sidecar-mode (live). For customers whose counsel requires PHI to never leave their infrastructure, we ship the velora-vault tokenizer as a Docker image, a Python package (pip install velora-vault), and a TypeScript package (@velora/vault). You run it inside your own VPC; it tokenizes CSVs locally before upload, so only tokens cross our wire. Velora cannot see plaintext PHI in this mode. The tokenization primitive is byte-equivalent to the server-side scrubber (verified by a 5,000-row stress test and a 200-row real-world cross-language byte-diff), so analytics joins work identically across all three modes.
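
To make "tokenizes CSVs locally before upload" concrete, here is a standalone sketch of the idea using only the standard library. It does not use or represent the velora-vault package API, which this page does not document; `tokenize_csv`, the column-to-prefix mapping, and the hex token encoding are all assumptions.

```python
import csv
import hashlib
import hmac
import io


def tokenize_csv(src, dst, key: bytes, phi_columns: dict) -> None:
    """Copy a CSV from src to dst, replacing PHI columns with HMAC tokens.

    phi_columns maps a header name to its token prefix (hypothetical mapping).
    Columns not listed, such as NPI or dollar amounts, are written unchanged.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        for column, prefix in phi_columns.items():
            message = f"{column}:{row[column]}".encode("utf-8")
            digest = hmac.new(key, message, hashlib.sha256).digest()
            row[column] = prefix + digest[:8].hex()  # 64-bit truncation
        writer.writerow(row)


src = io.StringIO("member_id,npi,paid\nA123,1234567890,100.00\n")
dst = io.StringIO()
tokenize_csv(src, dst, bytes(32), {"member_id": "MBR_"})
print(dst.getvalue())
```

Only the contents of `dst` would be uploaded; the raw member ID stays on the machine that ran the function.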

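Cross-customer isolation. Different customers have different vault keys, so the same patient or member identifier produces different tokens for each customer. Provider NPIs are the exception because they are never tokenized (see section 03). No cross-customer correlation of tokenized values is possible in any mode.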

Infrastructure. ClickHouse (our analytics warehouse) contains zero direct PHI by design, in any mode — only tokens and generalized quasi-identifiers (DOB year, 3-digit ZIP, month-year date of service). Server-mode mappings live in a separate Postgres vault. R2 object storage for staged uploads uses server-side encryption and a 24-hour lifecycle policy on the raw bucket.

05 · Contracts & BAAs

We sign BAAs, but minimization comes first.

Velora will execute a Business Associate Agreement with any customer whose workflow legitimately requires PHI. But per §164.502(b), a BAA is not a blank check to share everything — the Minimum Necessary rule still applies. The BAA exists to govern what happens if PHI is shared, not as permission to bypass the scrubbing duty.

For customers who complete BAA onboarding, server-mode rehydration becomes available: we hold the vault key on our side (AES-256-GCM at rest, access-logged) so your application can request rehydrated results on-demand with appropriate auth. Until then, client-mode keeps the cryptographic burden and liability on your side.

The one-page summary.

  • Your obligation: strip PHI before upload. We recommend it strongly.
  • Our obligation if you don't: detect it, tokenize it in memory, never persist raw identifiers, log the scrub.
  • Never tokenized: NPI, CPT, dollar amounts — these are our analytics substrate.
  • Audit trail: every upload produces a log entry. Shareable with your compliance team.
  • Audit the pipeline yourself: the scrubber and vault code is inspected on every commit, and unit tests verify determinism, cross-customer isolation, and cryptographic tamper detection.