ICD-11 Version Upgrade — HAPI Integration
Audience: ICD-11 Terminology Pipeline team, DGHS FHIR ops
Related: version_upgrade.py (OCL import pipeline)
HAPI endpoint: DELETE /admin/terminology/cache
Overview
When a new ICD-11 MMS release is imported into OCL, the HAPI server's 24-hour terminology validation cache becomes stale. Vendors submitting resources after the import, but before the cache expires, can have their ICD-11 codes validated against results cached from the old OCL data. A new code that was looked up before the import stays cached as invalid (the original cache miss went to OCL, OCL answered from the old data, and the negative result was cached), so it is incorrectly rejected. Removed or reclassified codes that were previously valid continue to be accepted from cache.
The cache flush endpoint resolves this. Calling it after OCL import forces the next validation call for every ICD-11 code to hit OCL directly, repopulating the cache with the new version's data.
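The staleness window can be illustrated with a minimal sketch of a TTL cache sitting in front of a terminology lookup. The class and names below (TtlValidationCache, lookup, clock) are hypothetical and are not HAPI's implementation; only the 24-hour TTL comes from the text above, and "XY9Z" is a placeholder code.

```python
import time
from typing import Callable, Dict, Tuple

TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL, as described above


class TtlValidationCache:
    """Hypothetical stand-in for a validation cache; not HAPI's actual code."""

    def __init__(self, lookup: Callable[[str], bool],
                 clock: Callable[[], float] = time.time) -> None:
        self._lookup = lookup  # e.g. a call to the terminology server
        self._clock = clock
        self._entries: Dict[str, Tuple[bool, float]] = {}

    def validate(self, code: str) -> bool:
        entry = self._entries.get(code)
        now = self._clock()
        if entry is not None and now - entry[1] < TTL_SECONDS:
            return entry[0]  # cache hit: the answer may predate the import
        result = self._lookup(code)
        self._entries[code] = (result, now)
        return result

    def flush(self) -> int:
        """Drop every entry and report how many were evicted."""
        evicted = len(self._entries)
        self._entries.clear()
        return evicted


# Simulation: the placeholder code is unknown before the import, valid after.
valid_codes = set()
cache = TtlValidationCache(lambda code: code in valid_codes)

before_import = cache.validate("XY9Z")  # miss, looked up, cached as invalid
valid_codes.add("XY9Z")                 # the new release is imported
stale_result = cache.validate("XY9Z")   # still answered from the cache
evicted = cache.flush()
fresh_result = cache.validate("XY9Z")   # re-validated against the new data
```

In this toy model the code is rejected until the flush, which mirrors why the flush has to follow the import rather than precede it.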
Step-by-step upgrade procedure
The following steps must be executed in this exact order. Deviating from the order (e.g., flushing before OCL import completes) causes the cache to repopulate with old data and requires a second flush.
Step 1 OCL: import new ICD-11 MMS release
Step 2 OCL: patch concept_class for Diagnosis + Finding concepts
Step 3 OCL: repopulate bd-condition-icd11-diagnosis-valueset collection
Step 4 OCL: verify $validate-code returns correct results for new codes
Step 5 HAPI: flush terminology cache ← this document
Step 6 HAPI: verify validation with new codes
Step 7 DGHS: notify vendors of new release
Steps 1-4 are handled by version_upgrade.py. This document covers
Steps 5-6 and the exact integration between the two systems.
Step 4 — Pre-flush verification (run before calling HAPI)
Before flushing the HAPI cache, verify that OCL is serving correct results for the new release. Flushing a cache backed by an incorrect OCL state degrades validation quality.
4a — Verify a new code is valid in OCL
Pick a code that is new in this release (not in the previous release).
NEW_CODE="XY9Z" # Replace with an actual new code from the release notes
curl -s "https://tr.ocl.dghs.gov.bd/api/fhir/ValueSet/\$validate-code\
?url=https://fhir.dghs.gov.bd/core/ValueSet/bd-condition-icd11-diagnosis-valueset\
&system=http://id.who.int/icd/release/11/mms\
&code=${NEW_CODE}" | jq '.parameter[] | select(.name=="result") | .valueBoolean'
# Expected: true
4b — Verify a Device-class code is rejected by OCL
Device-class codes must be rejected by the bd-condition-icd11-diagnosis-valueset (which restricts to Diagnosis + Finding only).
DEVICE_CODE="XA7RE2" # Example Device class code — use an actual one
curl -s "https://tr.ocl.dghs.gov.bd/api/fhir/ValueSet/\$validate-code\
?url=https://fhir.dghs.gov.bd/core/ValueSet/bd-condition-icd11-diagnosis-valueset\
&system=http://id.who.int/icd/release/11/mms\
&code=${DEVICE_CODE}" | jq '.parameter[] | select(.name=="result") | .valueBoolean'
# Expected: false
4c — Verify a deprecated code is invalid
If this release deprecates or removes any codes, verify they are now rejected.
DEPRECATED_CODE="..." # From release notes
curl -s "https://tr.ocl.dghs.gov.bd/api/fhir/ValueSet/\$validate-code\
?url=https://fhir.dghs.gov.bd/core/ValueSet/bd-condition-icd11-diagnosis-valueset\
&system=http://id.who.int/icd/release/11/mms\
&code=${DEPRECATED_CODE}" | jq '.parameter[] | select(.name=="result") | .valueBoolean'
# Expected: false (if deprecated) or true (if still valid)
Do not proceed to Step 5 until all 4a-4c verifications pass.
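The gate above can also be expressed programmatically. The sketch below assumes the OCL response is a FHIR Parameters resource carrying a boolean result parameter, as the jq filters above imply; the function names extract_result and failed_checks are hypothetical, and the sample responses are illustrative rather than captured from OCL.

```python
from typing import Dict, List, Optional


def extract_result(parameters: dict) -> Optional[bool]:
    """Return the boolean `result` parameter of a FHIR Parameters resource."""
    for param in parameters.get("parameter", []):
        if param.get("name") == "result":
            return param.get("valueBoolean")
    return None  # no result parameter present


def failed_checks(responses: Dict[str, dict],
                  expected: Dict[str, bool]) -> List[str]:
    """List the codes whose $validate-code result differs from expectation."""
    return [
        code for code, want in expected.items()
        if extract_result(responses.get(code, {})) is not want
    ]


# Illustrative responses only; real ones come from the curl calls above.
responses = {
    "XY9Z": {"resourceType": "Parameters",
             "parameter": [{"name": "result", "valueBoolean": True}]},
    "XA7RE2": {"resourceType": "Parameters",
               "parameter": [{"name": "result", "valueBoolean": False}]},
}
expected = {"XY9Z": True, "XA7RE2": False}
failures = failed_checks(responses, expected)
```

An empty failures list would correspond to all checks passing; a missing response or missing result parameter counts as a failure.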
Step 5 — Flush the HAPI terminology cache
5a — Obtain fhir-admin token
The cache flush endpoint requires the fhir-admin Keycloak role.
The fhir-admin-pipeline client is the designated service account for
this operation (see ops/keycloak-setup.md, Part 2).
# In version_upgrade.py: add these imports, constants, and this function
import base64
import json
import os

import requests

KEYCLOAK_TOKEN_URL = "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token"
FHIR_ADMIN_CLIENT_ID = "fhir-admin-pipeline"
FHIR_ADMIN_CLIENT_SECRET = os.environ["FHIR_ADMIN_CLIENT_SECRET"]  # from secrets vault
# Read from the environment when set; falls back to the production URL
HAPI_BASE_URL = os.environ.get("HAPI_BASE_URL", "https://fhir.dghs.gov.bd")


def get_fhir_admin_token() -> str:
    """Obtain a fhir-admin Bearer token from Keycloak."""
    response = requests.post(
        KEYCLOAK_TOKEN_URL,
        data={
            "grant_type": "client_credentials",
            "client_id": FHIR_ADMIN_CLIENT_ID,
            "client_secret": FHIR_ADMIN_CLIENT_SECRET,
        },
        timeout=30,
    )
    response.raise_for_status()
    token_data = response.json()
    access_token = token_data["access_token"]

    # Verify the token contains fhir-admin role before using it.
    # The middle JWT segment is base64url-encoded with padding stripped.
    # This is a sanity check on the claims only; the signature is not verified.
    payload_b64 = access_token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore 0-3 padding chars
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    realm_roles = claims.get("realm_access", {}).get("roles", [])
    if "fhir-admin" not in realm_roles:
        raise ValueError(
            "fhir-admin-pipeline token does not contain fhir-admin role. "
            f"Roles present: {realm_roles}. "
            "Check Keycloak service account role assignment."
        )
    return access_token
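JWT segments are base64url-encoded with trailing padding removed, so the role check depends on decoding them with the URL-safe alphabet. The following is a minimal, self-contained sketch of that step; decode_jwt_claims and _b64url are hypothetical helper names, the sample token is built locally and unsigned, and no signature verification is performed.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    segments = token.split(".")
    if len(segments) != 3:
        raise ValueError("expected a three-segment JWT")
    payload_b64 = segments[1]
    # base64url omits padding; restore 0-3 '=' characters as required.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def _b64url(data: dict) -> str:
    """Encode a dict the way a JWT segment is encoded (demo helper)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Build an unsigned sample token purely for demonstration.
sample_claims = {"realm_access": {"roles": ["fhir-admin"]}}
sample_token = ".".join(
    [_b64url({"alg": "none"}), _b64url(sample_claims), "sig"]
)
roles = decode_jwt_claims(sample_token)["realm_access"]["roles"]
```

The claim path realm_access.roles follows the lookup used in get_fhir_admin_token above.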
5b — Check cache state before flush (optional but recommended)
def get_cache_stats(admin_token: str) -> dict:
    """Retrieve current HAPI terminology cache statistics."""
    response = requests.get(
        f"{HAPI_BASE_URL}/admin/terminology/cache/stats",
        headers={"Authorization": f"Bearer {admin_token}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


# Usage:
admin_token = get_fhir_admin_token()
stats_before = get_cache_stats(admin_token)
print(f"Cache before flush: {stats_before['totalEntries']} entries "
      f"({stats_before['liveEntries']} live, "
      f"{stats_before['expiredEntries']} expired)")
5c — Execute cache flush
def flush_hapi_terminology_cache(admin_token: str) -> dict:
    """
    Flush the HAPI ICD-11 terminology validation cache.

    Must be called AFTER:
      - OCL ICD-11 import is complete
      - concept_class patch is applied
      - bd-condition-icd11-diagnosis-valueset is repopulated
      - $validate-code verified returning correct results

    Returns the flush summary from HAPI.
    Raises requests.HTTPError on failure.
    """
    response = requests.delete(
        f"{HAPI_BASE_URL}/admin/terminology/cache",
        headers={"Authorization": f"Bearer {admin_token}"},
        timeout=60,  # allow time for HAPI to process across all replicas
    )
    if response.status_code == 403:
        raise PermissionError(
            "Cache flush rejected: fhir-admin role not present in token. "
            "Check Keycloak fhir-admin-pipeline service account configuration."
        )
    response.raise_for_status()
    result = response.json()
    print(f"HAPI cache flush completed: {result['entriesEvicted']} entries evicted "
          f"at {result['timestamp']}")
    return result
# Full upgrade function to add to version_upgrade.py:
def post_ocl_import_hapi_integration(icd11_version: str) -> None:
    """
    Call after successful OCL import and verification.
    Flushes HAPI cache and verifies the new version validates correctly.

    Args:
        icd11_version: The new ICD-11 version string, e.g. "2025-01"
    """
    print(f"\n=== HAPI integration: ICD-11 {icd11_version} ===")

    # Step 5a: get admin token
    print("Obtaining fhir-admin token...")
    admin_token = get_fhir_admin_token()
    print("Token obtained.")

    # Step 5b: record pre-flush state
    stats_before = get_cache_stats(admin_token)
    print(f"Pre-flush cache: {stats_before['totalEntries']} entries")

    # Step 5c: flush
    print("Flushing HAPI terminology cache...")
    flush_result = flush_hapi_terminology_cache(admin_token)
    print(f"Flush complete: {flush_result['entriesEvicted']} entries evicted")

    # Step 6: post-flush verification (see below)
    verify_hapi_validates_new_version(admin_token, icd11_version)
    print(f"=== HAPI integration complete for ICD-11 {icd11_version} ===\n")
Step 6 — Post-flush verification
After the flush, verify that HAPI is now validating against the new OCL data. This confirms the end-to-end pipeline from OCL → HAPI cache → vendor validation.
6a — Submit a test Condition with a new ICD-11 code
The fhir-admin-pipeline client cannot submit the test resource as configured:
it holds the fhir-admin role, but the FHIR resource endpoints require the
mci-api role. Either use a dedicated test vendor client for resource
submission, or temporarily assign mci-api to the admin client for testing.
Recommended approach: use a dedicated test vendor client
(fhir-vendor-test-pipeline) with mci-api role for post-upgrade verification.
def verify_hapi_validates_new_version(
        admin_token: str, icd11_version: str) -> None:
    """
    Verifies HAPI is now accepting codes from the new ICD-11 version.

    Uses the $validate-code operation directly against HAPI (not resource submission)
    to avoid needing mci-api role on the admin client.

    Note: HAPI's $validate-code endpoint proxies to OCL via the validation chain.
    A successful result confirms the cache was flushed AND OCL is returning
    correct results for the new version.
    """
    # Use a known-valid code from the new release
    # This should be parameterised with the actual new code from release notes
    test_code = get_test_code_for_version(icd11_version)  # implement per release
    valueset_url = (
        "https://fhir.dghs.gov.bd/core/ValueSet/"
        "bd-condition-icd11-diagnosis-valueset"
    )
    response = requests.get(
        f"{HAPI_BASE_URL}/fhir/ValueSet/$validate-code",
        params={
            "url": valueset_url,
            "system": "http://id.who.int/icd/release/11/mms",
            "code": test_code,
        },
        headers={"Authorization": f"Bearer {admin_token}"},
        timeout=30,
    )
    # A token lacking the required role may be rejected with 401 or 403
    if response.status_code in (401, 403):
        # $validate-code requires mci-api: use a vendor test token here
        print("WARNING: $validate-code requires mci-api role. "
              "Skipping HAPI direct verification. "
              "Verify manually by submitting a test Condition resource.")
        return
    response.raise_for_status()
    result = response.json()
    valid = next(
        (p["valueBoolean"] for p in result.get("parameter", [])
         if p["name"] == "result"),
        None
    )
    if valid is True:
        print(f"✓ HAPI verification passed: code '{test_code}' "
              f"valid in new ICD-11 {icd11_version}")
    else:
        message = next(
            (p.get("valueString") for p in result.get("parameter", [])
             if p["name"] == "message"),
            "no message"
        )
        raise ValueError(
            f"HAPI verification FAILED: code '{test_code}' rejected after cache flush. "
            f"Message: {message}. "
            f"Check OCL import completed correctly for ICD-11 {icd11_version}."
        )
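The verification function calls get_test_code_for_version, which is left to be implemented per release. One possible shape, offered only as a sketch, is a small registry maintained alongside the script. The registry name TEST_CODES_BY_VERSION is hypothetical, and "XY9Z" is the placeholder used earlier in this document, not a real ICD-11 code.

```python
from typing import Dict

# Hypothetical registry: one known-new code per release, taken from that
# release's notes. The entry below is a placeholder, not a real code.
TEST_CODES_BY_VERSION: Dict[str, str] = {
    "2025-01": "XY9Z",
}


def get_test_code_for_version(icd11_version: str) -> str:
    """Return the verification code registered for a release, or fail loudly."""
    try:
        return TEST_CODES_BY_VERSION[icd11_version]
    except KeyError:
        raise LookupError(
            f"No test code registered for ICD-11 {icd11_version}; "
            "add one from the release notes before running the upgrade."
        ) from None
```

Failing loudly on an unregistered version keeps the upgrade from silently verifying against a code that belongs to an older release.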
Integration into version_upgrade.py — call site
Add to the end of your main upgrade function, after the OCL verification steps:
def run_upgrade(icd11_version: str) -> None:
    """Main upgrade entry point."""
    # --- Existing steps (your current implementation) ---
    print(f"Starting ICD-11 {icd11_version} upgrade...")

    # 1. Import ICD-11 concepts into OCL
    import_concepts_to_ocl(icd11_version)

    # 2. Patch concept_class for Diagnosis + Finding
    patch_concept_class(icd11_version)

    # 3. Repopulate bd-condition-icd11-diagnosis-valueset
    repopulate_condition_valueset(icd11_version)

    # 4. Verify OCL $validate-code
    verify_ocl_validate_code(icd11_version)

    # --- New: HAPI integration ---
    # 5-6. Flush HAPI cache and verify
    post_ocl_import_hapi_integration(icd11_version)

    # 7. Notify vendors
    notify_vendors_of_upgrade(icd11_version)
    print(f"ICD-11 {icd11_version} upgrade complete.")
Environment variables required by version_upgrade.py
Add to your upgrade pipeline's secrets configuration:
# Keycloak admin client for HAPI cache management
FHIR_ADMIN_CLIENT_SECRET=<secret from keycloak-setup.md Part 2>
# HAPI server base URL
HAPI_BASE_URL=https://fhir.dghs.gov.bd
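A fail-fast check at startup can surface a missing variable before any OCL work begins. The sketch below is one way to do that; load_settings and REQUIRED_VARS are hypothetical names, and the secret shown is a dummy value for demonstration.

```python
import os
from typing import Mapping, Optional

REQUIRED_VARS = ("FHIR_ADMIN_CLIENT_SECRET", "HAPI_BASE_URL")


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the required variables, failing fast if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            f"Missing environment variables: {', '.join(missing)}"
        )
    return {
        "client_secret": env["FHIR_ADMIN_CLIENT_SECRET"],
        # Strip a trailing slash so f"{base}/admin/..." never doubles it.
        "hapi_base_url": env["HAPI_BASE_URL"].rstrip("/"),
    }


# Demonstration with an explicit mapping instead of the real environment.
settings = load_settings({
    "FHIR_ADMIN_CLIENT_SECRET": "example-secret",
    "HAPI_BASE_URL": "https://fhir.dghs.gov.bd/",
})
```

Passing the mapping explicitly keeps the function testable without touching the process environment.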
Rollback procedure
If post-flush verification fails (HAPI is not accepting new codes):
- Do not immediately re-run the flush: the cache is already empty, so re-flushing has no effect.
- Check OCL directly:
  curl 'https://tr.ocl.dghs.gov.bd/api/fhir/ValueSet/$validate-code?...'
- If OCL is returning wrong results: the OCL import is incomplete. Re-run steps 1-4.
- If OCL is returning correct results but HAPI is not: check HAPI logs for OCL connectivity errors. OCL may have returned HTTP 5xx during the first post-flush validation call, triggering fail-open behaviour.
- After fixing OCL: flush the cache again (it has repopulated with bad data).
# Emergency manual flush via curl
ADMIN_TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-admin-pipeline" \
  -d "client_secret=${FHIR_ADMIN_CLIENT_SECRET}" \
  | jq -r '.access_token')
curl -s -X DELETE \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  https://fhir.dghs.gov.bd/admin/terminology/cache | jq .
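The decision points in the rollback list above can be condensed into a small function for use in runbook tooling. This is a sketch under the assumption that the two checks (OCL direct, HAPI post-flush) each reduce to a boolean; the names Action and next_action are hypothetical.

```python
from enum import Enum


class Action(Enum):
    NONE = "verification passed; nothing to do"
    RERUN_OCL_THEN_FLUSH = "re-run steps 1-4, then flush the cache again"
    CHECK_HAPI_LOGS = "check HAPI logs for OCL connectivity errors"


def next_action(ocl_correct: bool, hapi_correct: bool) -> Action:
    """Map the two verification outcomes to the next rollback step."""
    if ocl_correct and hapi_correct:
        return Action.NONE
    if not ocl_correct:
        # The import is incomplete; the cache has refilled with bad data.
        return Action.RERUN_OCL_THEN_FLUSH
    # OCL is right but HAPI is not: suspect connectivity or fail-open.
    return Action.CHECK_HAPI_LOGS


# Example: OCL answers correctly, HAPI still rejects the new code.
action = next_action(ocl_correct=True, hapi_correct=False)
```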
Cache warm-up after flush
The HAPI cache repopulates organically as vendors submit resources. There is no pre-warming mechanism. The first vendor submission after a flush for each code will take up to 10 seconds (OCL timeout) rather than sub-millisecond (cache hit). At pilot scale (50 vendors, <36,941 distinct codes in use), this is acceptable.
At national scale, consider a pre-warming job that submits $validate-code requests
for the top-N most frequently submitted ICD-11 codes immediately after the flush.
The top-N list is derivable from the audit.audit_events table. For contrast, the first query below ranks rejected codes from audit.fhir_rejected_submissions; those are not warm-up candidates. The second query ranks accepted Condition submissions and is the one to use:
-- Rejected codes (shown for contrast; do not use for warm-up)
SELECT invalid_code, COUNT(*) AS frequency
FROM audit.fhir_rejected_submissions
WHERE rejection_code = 'TERMINOLOGY_INVALID_CODE'
  AND submission_time > NOW() - INTERVAL '90 days'
GROUP BY invalid_code
ORDER BY frequency DESC
LIMIT 100;

-- Accepted codes from audit_events (use these for warm-up)
SELECT
  (validation_messages ->> 0) AS code_info,
  COUNT(*) AS frequency
FROM audit.audit_events
WHERE outcome = 'ACCEPTED'
  AND resource_type = 'Condition'
  AND event_time > NOW() - INTERVAL '90 days'
GROUP BY 1
ORDER BY frequency DESC
LIMIT 200;
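A pre-warming job along the lines described above might look like the following sketch. It is not an existing component: prewarm and fake_validate are hypothetical names, the validator is injected so that a real one could issue $validate-code requests against HAPI, and the worker count is an arbitrary assumption to keep load on OCL bounded.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def prewarm(codes: Iterable[str],
            validate: Callable[[str], bool],
            max_workers: int = 4) -> Dict[str, object]:
    """Call `validate` once per distinct code; record results or errors."""
    unique = list(dict.fromkeys(codes))  # de-duplicate, keep first-seen order

    def safe_validate(code: str) -> object:
        try:
            return validate(code)
        except Exception as exc:  # one failed warm-up must not abort the job
            return exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(safe_validate, unique))
    return dict(zip(unique, results))


# Stand-in validator; a real one would call $validate-code on HAPI.
def fake_validate(code: str) -> bool:
    if code == "BAD":
        raise TimeoutError("simulated OCL timeout")
    return True


warmed = prewarm(["A", "B", "A", "BAD"], fake_validate)
```

Errors are returned rather than raised so that a single OCL timeout leaves the remaining codes warmed; the caller can log or retry the failures afterwards.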