# ICD-11 Version Upgrade: HAPI Integration

**Audience:** ICD-11 Terminology Pipeline team, DGHS FHIR ops

**Related:** `version_upgrade.py` (OCL import pipeline)

**HAPI endpoint:** `DELETE /admin/terminology/cache`

---

## Overview

When a new ICD-11 MMS release is imported into OCL, the HAPI server's 24-hour terminology validation cache becomes stale. Vendors submitting resources after the import, but before the cached entries expire, can have their ICD-11 codes validated against the **old** OCL data. A new code that was looked up before the import stays incorrectly rejected as invalid (cache miss → OCL hit with old data → cached as invalid). Removed or reclassified codes that were previously valid continue to be accepted from cache.

**The cache flush endpoint resolves this.** Calling it after the OCL import forces the next validation call for every ICD-11 code to hit OCL directly, repopulating the cache with the new version's data.

---

## Step-by-step upgrade procedure

The following steps must be executed **in this exact order**. Deviating from the order (e.g., flushing before the OCL import completes) causes the cache to repopulate with old data and requires a second flush.

```
Step 1  OCL:  import new ICD-11 MMS release
Step 2  OCL:  patch concept_class for Diagnosis + Finding concepts
Step 3  OCL:  repopulate bd-condition-icd11-diagnosis-valueset collection
Step 4  OCL:  verify $validate-code returns correct results for new codes
Step 5  HAPI: flush terminology cache          ← this document
Step 6  HAPI: verify validation with new codes
Step 7  DGHS: notify vendors of new release
```

Steps 1-4 are handled by `version_upgrade.py`. This document covers Steps 5-6 and the exact integration between the two systems.

---

## Step 4: Pre-flush verification (run before calling HAPI)

Before flushing the HAPI cache, verify that OCL is serving correct results for the new release. Flushing a cache that is backed by an incorrect OCL state repopulates it with wrong answers and degrades validation quality.
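Each check in this step reads the `result` parameter of the FHIR `Parameters` resource returned by `$validate-code`. If the checks are scripted rather than run by hand, that extraction could look like the following minimal sketch. The function name `extract_validate_result` is hypothetical and not part of `version_upgrade.py`; the sample payload is illustrative only.

```python
from typing import Optional


def extract_validate_result(parameters: dict) -> Optional[bool]:
    """Return the boolean `result` of a $validate-code response.

    Returns None when no `result` parameter is present (for example when
    the server answered with an OperationOutcome), so that an error
    response is not mistaken for a rejection.
    """
    for param in parameters.get("parameter", []):
        if param.get("name") == "result":
            return param.get("valueBoolean")
    return None


# Illustrative payload shaped like a FHIR Parameters resource.
sample = {
    "resourceType": "Parameters",
    "parameter": [
        {"name": "result", "valueBoolean": True},
        {"name": "display", "valueString": "example display"},
    ],
}
print(extract_validate_result(sample))
```

Distinguishing `None` from `False` lets a scripted check fail loudly on connectivity or server errors instead of reporting that a code was rejected.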
### 4a: Verify a new code is valid in OCL

Pick a code that is **new** in this release (not in the previous release).

```bash
NEW_CODE="XY9Z"  # Replace with an actual new code from the release notes
curl -s "https://tr.ocl.dghs.gov.bd/api/fhir/ValueSet/\$validate-code\
?url=https://fhir.dghs.gov.bd/core/ValueSet/bd-condition-icd11-diagnosis-valueset\
&system=http://id.who.int/icd/release/11/mms\
&code=${NEW_CODE}" | jq '.parameter[] | select(.name=="result") | .valueBoolean'
# Expected: true
```

### 4b: Verify a Device-class code is rejected by OCL

Device-class codes must be rejected by `bd-condition-icd11-diagnosis-valueset` (which restricts to Diagnosis + Finding only).

```bash
DEVICE_CODE="XA7RE2"  # Example Device class code; use an actual one
curl -s "https://tr.ocl.dghs.gov.bd/api/fhir/ValueSet/\$validate-code\
?url=https://fhir.dghs.gov.bd/core/ValueSet/bd-condition-icd11-diagnosis-valueset\
&system=http://id.who.int/icd/release/11/mms\
&code=${DEVICE_CODE}" | jq '.parameter[] | select(.name=="result") | .valueBoolean'
# Expected: false
```

### 4c: Verify a deprecated code is invalid

If this release deprecates or removes any codes, verify they are now rejected.

```bash
DEPRECATED_CODE="..."  # From release notes
curl -s "https://tr.ocl.dghs.gov.bd/api/fhir/ValueSet/\$validate-code\
?url=https://fhir.dghs.gov.bd/core/ValueSet/bd-condition-icd11-diagnosis-valueset\
&system=http://id.who.int/icd/release/11/mms\
&code=${DEPRECATED_CODE}" | jq '.parameter[] | select(.name=="result") | .valueBoolean'
# Expected: false (if deprecated) or true (if still valid)
```

Do not proceed to Step 5 until all 4a-4c verifications pass.

---

## Step 5: Flush the HAPI terminology cache

### 5a: Obtain fhir-admin token

The cache flush endpoint requires the `fhir-admin` Keycloak role. The `fhir-admin-pipeline` client is the designated service account for this operation (see `ops/keycloak-setup.md`, Part 2).
```python
# In version_upgrade.py: add this function
import base64
import json
import os

import requests

KEYCLOAK_TOKEN_URL = "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token"
FHIR_ADMIN_CLIENT_ID = "fhir-admin-pipeline"
FHIR_ADMIN_CLIENT_SECRET = os.environ["FHIR_ADMIN_CLIENT_SECRET"]  # from secrets vault
# Falls back to the production URL when HAPI_BASE_URL is not set.
HAPI_BASE_URL = os.environ.get("HAPI_BASE_URL", "https://fhir.dghs.gov.bd")


def get_fhir_admin_token() -> str:
    """Obtain a fhir-admin Bearer token from Keycloak."""
    response = requests.post(
        KEYCLOAK_TOKEN_URL,
        data={
            "grant_type": "client_credentials",
            "client_id": FHIR_ADMIN_CLIENT_ID,
            "client_secret": FHIR_ADMIN_CLIENT_SECRET,
        },
        timeout=30,
    )
    response.raise_for_status()
    token_data = response.json()
    access_token = token_data["access_token"]

    # Verify the token contains the fhir-admin role before using it.
    # The JWT payload is the middle segment, base64url-encoded without padding.
    payload_b64 = access_token.split(".")[1]
    # Restore the stripped padding (adds nothing when already aligned).
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    realm_roles = claims.get("realm_access", {}).get("roles", [])
    if "fhir-admin" not in realm_roles:
        raise ValueError(
            f"fhir-admin-pipeline token does not contain fhir-admin role. "
            f"Roles present: {realm_roles}. "
            f"Check Keycloak service account role assignment."
        )
    return access_token
```

### 5b: Check cache state before flush (optional but recommended)

```python
def get_cache_stats(admin_token: str) -> dict:
    """Retrieve current HAPI terminology cache statistics."""
    response = requests.get(
        f"{HAPI_BASE_URL}/admin/terminology/cache/stats",
        headers={"Authorization": f"Bearer {admin_token}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


# Usage:
admin_token = get_fhir_admin_token()
stats_before = get_cache_stats(admin_token)
print(f"Cache before flush: {stats_before['totalEntries']} entries "
      f"({stats_before['liveEntries']} live, "
      f"{stats_before['expiredEntries']} expired)")
```

### 5c: Execute cache flush

```python
def flush_hapi_terminology_cache(admin_token: str) -> dict:
    """
    Flush the HAPI ICD-11 terminology validation cache.

    Must be called AFTER:
      - OCL ICD-11 import is complete
      - concept_class patch is applied
      - bd-condition-icd11-diagnosis-valueset is repopulated
      - $validate-code verified returning correct results

    Returns the flush summary from HAPI.
    Raises requests.HTTPError on failure.
    """
    response = requests.delete(
        f"{HAPI_BASE_URL}/admin/terminology/cache",
        headers={"Authorization": f"Bearer {admin_token}"},
        timeout=60,  # allow time for HAPI to process across all replicas
    )
    if response.status_code == 403:
        raise PermissionError(
            "Cache flush rejected: fhir-admin role not present in token. "
            "Check Keycloak fhir-admin-pipeline service account configuration."
        )
    response.raise_for_status()
    result = response.json()
    print(f"HAPI cache flush completed: {result['entriesEvicted']} entries evicted "
          f"at {result['timestamp']}")
    return result


# Full upgrade function to add to version_upgrade.py:
def post_ocl_import_hapi_integration(icd11_version: str) -> None:
    """
    Call after successful OCL import and verification.
    Flushes HAPI cache and verifies the new version validates correctly.

    Args:
        icd11_version: The new ICD-11 version string, e.g. "2025-01"
    """
    print(f"\n=== HAPI integration: ICD-11 {icd11_version} ===")

    # Step 5a: get admin token
    print("Obtaining fhir-admin token...")
    admin_token = get_fhir_admin_token()
    print("Token obtained.")

    # Step 5b: record pre-flush state
    stats_before = get_cache_stats(admin_token)
    print(f"Pre-flush cache: {stats_before['totalEntries']} entries")

    # Step 5c: flush
    print("Flushing HAPI terminology cache...")
    flush_result = flush_hapi_terminology_cache(admin_token)
    print(f"Flush complete: {flush_result['entriesEvicted']} entries evicted")

    # Step 6: post-flush verification (see below)
    verify_hapi_validates_new_version(admin_token, icd11_version)

    print(f"=== HAPI integration complete for ICD-11 {icd11_version} ===\n")
```

---

## Step 6: Post-flush verification

After the flush, verify that HAPI is now validating against the new OCL data. This confirms the end-to-end pipeline from OCL → HAPI cache → vendor validation.

### 6a: Submit a test Condition with a new ICD-11 code

The test resource cannot be submitted with the `fhir-admin-pipeline` token as issued: the admin client has the `fhir-admin` role, but the FHIR resource endpoints require the `mci-api` role. Use a dedicated test vendor client for resource submission, or temporarily assign `mci-api` to the admin client for testing.

**Recommended approach:** use a dedicated test vendor client (`fhir-vendor-test-pipeline`) with the `mci-api` role for post-upgrade verification.

```python
def verify_hapi_validates_new_version(admin_token: str, icd11_version: str) -> None:
    """
    Verifies HAPI is now accepting codes from the new ICD-11 version.

    Uses the $validate-code operation directly against HAPI (not resource
    submission) to avoid needing mci-api role on the admin client.

    Note: HAPI's $validate-code endpoint proxies to OCL via the validation
    chain. A successful result confirms the cache was flushed AND OCL is
    returning correct results for the new version.
    """
    # Use a known-valid code from the new release.
    # This should be parameterised with the actual new code from release notes.
    test_code = get_test_code_for_version(icd11_version)  # implement per release

    valueset_url = (
        "https://fhir.dghs.gov.bd/core/ValueSet/"
        "bd-condition-icd11-diagnosis-valueset"
    )
    response = requests.get(
        f"{HAPI_BASE_URL}/fhir/ValueSet/$validate-code",
        params={
            "url": valueset_url,
            "system": "http://id.who.int/icd/release/11/mms",
            "code": test_code,
        },
        headers={"Authorization": f"Bearer {admin_token}"},
        timeout=30,
    )

    # A token without mci-api may be answered with 401 or 403 depending on
    # how the server reports the missing role; treat both as "skip".
    if response.status_code in (401, 403):
        # $validate-code requires mci-api: use a vendor test token here
        print("WARNING: $validate-code requires mci-api role. "
              "Skipping HAPI direct verification. "
              "Verify manually by submitting a test Condition resource.")
        return

    response.raise_for_status()
    result = response.json()

    valid = next(
        (p["valueBoolean"] for p in result.get("parameter", [])
         if p["name"] == "result"),
        None,
    )
    if valid is True:
        print(f"✓ HAPI verification passed: code '{test_code}' "
              f"valid in new ICD-11 {icd11_version}")
    else:
        message = next(
            (p.get("valueString") for p in result.get("parameter", [])
             if p["name"] == "message"),
            "no message",
        )
        raise ValueError(
            f"HAPI verification FAILED: code '{test_code}' rejected after cache flush. "
            f"Message: {message}. "
            f"Check OCL import completed correctly for ICD-11 {icd11_version}."
        )
```

---

## Integration into version_upgrade.py: call site

Add to the end of your main upgrade function, after the OCL verification steps:

```python
def run_upgrade(icd11_version: str) -> None:
    """Main upgrade entry point."""
    # --- Existing steps (your current implementation) ---
    print(f"Starting ICD-11 {icd11_version} upgrade...")

    # 1. Import ICD-11 concepts into OCL
    import_concepts_to_ocl(icd11_version)

    # 2. Patch concept_class for Diagnosis + Finding
    patch_concept_class(icd11_version)

    # 3. Repopulate bd-condition-icd11-diagnosis-valueset
    repopulate_condition_valueset(icd11_version)

    # 4. Verify OCL $validate-code
    verify_ocl_validate_code(icd11_version)

    # --- New: HAPI integration ---
    # 5-6. Flush HAPI cache and verify
    post_ocl_import_hapi_integration(icd11_version)

    # 7. Notify vendors
    notify_vendors_of_upgrade(icd11_version)

    print(f"ICD-11 {icd11_version} upgrade complete.")
```

---

## Environment variables required by version_upgrade.py

Add to your upgrade pipeline's secrets configuration:

```bash
# Keycloak admin client for HAPI cache management
FHIR_ADMIN_CLIENT_SECRET=

# HAPI server base URL
HAPI_BASE_URL=https://fhir.dghs.gov.bd
```

---

## Rollback procedure

If post-flush verification fails (HAPI is not accepting new codes):

1. **Do not re-run the flush** as a first response. The cache was just emptied, so re-flushing on its own has no effect.
2. Check OCL directly: `curl https://tr.ocl.dghs.gov.bd/api/fhir/ValueSet/$validate-code?...`
3. If OCL is returning wrong results: the OCL import is incomplete. Re-run steps 1-4.
4. If OCL is returning correct results but HAPI is not: check HAPI logs for OCL connectivity errors. OCL may have returned HTTP 5xx during the first post-flush validation call, triggering fail-open behaviour.
5. After fixing OCL: flush the cache again (by then it has repopulated with bad data).

```bash
# Emergency manual flush via curl
ADMIN_TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-admin-pipeline" \
  -d "client_secret=${FHIR_ADMIN_CLIENT_SECRET}" \
  | jq -r '.access_token')

curl -s -X DELETE \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  https://fhir.dghs.gov.bd/admin/terminology/cache | jq .
```

---

## Cache warm-up after flush

The HAPI cache repopulates organically as vendors submit resources. There is no pre-warming mechanism. The first post-flush validation of each code can take up to 10 seconds (the OCL timeout) rather than completing in under a millisecond (cache hit). At pilot scale (50 vendors, <36,941 distinct codes in use), this is acceptable.
At national scale, consider a pre-warming job that submits `$validate-code` requests for the top-N most frequently submitted ICD-11 codes immediately after the flush. The top-N list is derivable from the `audit.audit_events` table. Note that the first query below reads `audit.fhir_rejected_submissions` and therefore returns rejected codes; the second query, against `audit.audit_events`, is the one to use for pre-warming.

```sql
-- Most frequently rejected codes over the last 90 days.
SELECT invalid_code, COUNT(*) AS frequency
FROM audit.fhir_rejected_submissions
WHERE rejection_code = 'TERMINOLOGY_INVALID_CODE'
  AND submission_time > NOW() - INTERVAL '90 days'
GROUP BY invalid_code
ORDER BY frequency DESC
LIMIT 100;

-- Invert: the query above returns rejected codes.
-- Use accepted codes from audit_events instead.
SELECT (validation_messages ->> 0) AS code_info, COUNT(*) AS frequency
FROM audit.audit_events
WHERE outcome = 'ACCEPTED'
  AND resource_type = 'Condition'
  AND event_time > NOW() - INTERVAL '90 days'
GROUP BY 1
ORDER BY frequency DESC
LIMIT 200;
```
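A pre-warming job along these lines could be sketched as follows. This is a minimal sketch under stated assumptions, not an existing part of `version_upgrade.py`: the name `prewarm_codes` is hypothetical, and the `validate` callable is assumed to wrap one `$validate-code` request against HAPI for a single code (for example with a token that carries the `mci-api` role). Injecting the callable keeps the scheduling logic independent of how the request is made.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def prewarm_codes(
    codes: Iterable[str],
    validate: Callable[[str], bool],
    max_workers: int = 4,
) -> Dict[str, Optional[bool]]:
    """Validate each distinct code once so that HAPI caches the result.

    `validate` is a hypothetical callable that performs one $validate-code
    request for a code and returns True/False. A code whose request raises
    is recorded as None and does not abort the remaining codes.
    """

    def safe_validate(code: str) -> Optional[bool]:
        try:
            return validate(code)
        except Exception:
            # OCL timeout or HTTP error: leave this code un-warmed.
            return None

    # De-duplicate while preserving the frequency ordering of the input.
    unique_codes = list(dict.fromkeys(codes))

    # A small worker pool bounds the concurrent load placed on OCL.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(safe_validate, unique_codes))

    return dict(zip(unique_codes, results))
```

Codes reported as `None` can be retried in a later pass; keeping `max_workers` low avoids turning the warm-up itself into a burst of slow OCL calls right after the flush.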