# Adding Additional Implementation Guides

**Audience:** DGHS FHIR development and operations team
**Applies to:** Any IG added after BD Core FHIR IG v0.2.1
**Current IGs:** BD Core (`https://fhir.dghs.gov.bd/core`)
**Planned IGs:** MCCoD (`https://fhir.dghs.gov.bd/mccod`), IMCI (`https://fhir.dghs.gov.bd/imci`)

---

## Overview

Each DGHS Implementation Guide has its own canonical URL namespace:

| IG | Canonical base | Package naming convention |
|---------|----------------|---------------------------|
| BD Core | `https://fhir.dghs.gov.bd/core` | `bd.gov.dghs.core-{version}.tgz` |
| MCCoD | `https://fhir.dghs.gov.bd/mccod` | `bd.gov.dghs.mccod-{version}.tgz` |
| IMCI | `https://fhir.dghs.gov.bd/imci` | `bd.gov.dghs.imci-{version}.tgz` |

Separate canonical namespaces mean profiles from different IGs never collide, regardless of resource type overlap. A `Composition` profiled in MCCoD at `https://fhir.dghs.gov.bd/mccod/StructureDefinition/mccod-composition` and a `Composition` profiled in a future Core IG extension are completely independent. HAPI validates a resource against whichever profile URL it declares in `meta.profile`.
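
A resource opts into an IG purely through the profile URL it declares. A hypothetical MCCoD submission fragment (element values are illustrative only; the profile's actual required elements are defined by the IG):

```json
{
  "resourceType": "Composition",
  "meta": {
    "profile": ["https://fhir.dghs.gov.bd/mccod/StructureDefinition/mccod-composition"]
  },
  "status": "final",
  "subject": { "reference": "Patient/example" }
}
```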

All packages are loaded into a single `NpmPackageValidationSupport` instance. HAPI merges them into one validation context at startup. There is no performance penalty for multiple IGs — profiles are loaded once into memory and reused across all validation calls.

---

## What changes when adding a new IG

### 1. `packages/` directory

Place the new IG `.tgz` alongside the existing core IG package:

```
hapi-overlay/src/main/resources/packages/
├── bd.gov.dghs.core-0.2.1.tgz    ← existing
├── bd.gov.dghs.mccod-1.0.0.tgz   ← new
└── bd.gov.dghs.imci-1.0.0.tgz    ← new
```

Only one version of each IG per image. If you are upgrading an existing IG, remove the old `.tgz` and place the new one.
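
The one-version rule can be enforced with a small guard, suitable for CI (the version-suffix pattern below is an assumption based on the naming convention above):

```shell
# Fail if any IG package id appears more than once in packages/
PKG_DIR=hapi-overlay/src/main/resources/packages
dupes=$(ls "$PKG_DIR"/*.tgz 2>/dev/null \
  | sed 's#.*/##; s/-[0-9][^-]*\.tgz$//' \
  | sort | uniq -d)
if [ -n "$dupes" ]; then
  echo "Multiple versions present for: $dupes" >&2
  exit 1
fi
echo "OK: one version per IG"
```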

### 2. `FhirServerConfig.java` — load the new package

Find the `npmPackageValidationSupport()` bean and add a `loadPackageFromClasspath()` call for each new IG:

```java
@Bean
public NpmPackageValidationSupport npmPackageValidationSupport(FhirContext fhirContext) {
    NpmPackageValidationSupport support = new NpmPackageValidationSupport(fhirContext);

    // BD Core IG — always present
    support.loadPackageFromClasspath(
            "classpath:packages/bd.gov.dghs.core-0.2.1.tgz");

    // MCCoD IG — add when deploying
    support.loadPackageFromClasspath(
            "classpath:packages/bd.gov.dghs.mccod-1.0.0.tgz");

    // IMCI IG — add when deploying
    support.loadPackageFromClasspath(
            "classpath:packages/bd.gov.dghs.imci-1.0.0.tgz");

    return support;
}
```

### 3. `FhirServerConfig.java` — register new resource types

The `BD_CORE_PROFILE_RESOURCE_TYPES` set determines which resource types receive full profile validation versus the `unvalidated-profile` tag. Add every resource type that any of your IGs profiles:

```java
private static final Set<String> BD_CORE_PROFILE_RESOURCE_TYPES = Set.of(

    // BD Core IG
    "Patient", "Condition", "Encounter", "Observation",
    "Practitioner", "Organization", "Location",
    "Medication", "MedicationRequest", "Immunization",

    // MCCoD IG — add the resource types your MCCoD IG profiles
    "Composition", "MedicationStatement",

    // IMCI IG — add the resource types your IMCI IG profiles
    "QuestionnaireResponse", "ClinicalImpression"
);
```

If a resource type appears in multiple IGs (e.g., `Composition` in both MCCoD and a future Core extension), add it once: `Set.of` rejects duplicate elements, so listing a type twice fails at startup. HAPI validates against whichever profile URL the submitted resource declares — it does not matter that multiple profiles for that type are loaded.
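
The fail-fast behaviour of `Set.of` can be seen in isolation; a self-contained sketch (class and method names here are illustrative, not part of the server code):

```java
import java.util.Set;

public class DuplicateResourceTypes {
    // Returns true when Set.of rejects a duplicate entry.
    static boolean duplicateRejected() {
        try {
            Set.of("Composition", "MedicationStatement", "Composition");
            return false;
        } catch (IllegalArgumentException e) {
            // Set.of throws rather than silently de-duplicating
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("duplicate rejected: " + duplicateRejected());
    }
}
```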

### 4. `IgPackageInitializer.java` — load metadata for each new package

The initialiser currently loads one package under an advisory lock. Extend it to load each package. The advisory lock pattern remains the same — one lock per package, identified by package ID:

```java
@Override
public void afterPropertiesSet() throws Exception {
    loadIgPackage(
            "classpath:packages/bd.gov.dghs.core-0.2.1.tgz",
            "bd.gov.dghs.core", "0.2.1");

    loadIgPackage(
            "classpath:packages/bd.gov.dghs.mccod-1.0.0.tgz",
            "bd.gov.dghs.mccod", "1.0.0");

    loadIgPackage(
            "classpath:packages/bd.gov.dghs.imci-1.0.0.tgz",
            "bd.gov.dghs.imci", "1.0.0");
}

private void loadIgPackage(
        String classpathPath,
        String packageId,
        String version) throws Exception {

    long lockKey = deriveLockKey(packageId);
    // ... same advisory lock acquisition logic as current implementation
    // ... same performIgLoad() call
    // Each package gets its own independent advisory lock key
    // so packages load concurrently across replicas without blocking each other
}
```
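
The `deriveLockKey` helper is elided above; any stable hash of the package ID works, provided every replica computes the same 64-bit value. A minimal SHA-256-based sketch (an assumption — the actual implementation may differ):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class LockKeys {
    // Hypothetical sketch: fold the first 8 bytes of SHA-256(packageId)
    // into a long, giving a stable, well-distributed advisory lock key.
    static long deriveLockKey(String packageId) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(packageId.getBytes(StandardCharsets.UTF_8));
        long key = 0;
        for (int i = 0; i < 8; i++) {
            key = (key << 8) | (digest[i] & 0xFF);
        }
        return key;
    }

    public static void main(String[] args) throws Exception {
        // Stable: same id yields the same key on every replica
        System.out.println(deriveLockKey("bd.gov.dghs.core") == deriveLockKey("bd.gov.dghs.core"));
        // Distinct ids yield distinct keys, so packages lock independently
        System.out.println(deriveLockKey("bd.gov.dghs.core") != deriveLockKey("bd.gov.dghs.mccod"));
    }
}
```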

### 5. `application.yaml` — add new IG configuration entries

Under the `bd.fhir.ig` section, add entries for the new packages. This makes IG paths configurable without recompiling:

```yaml
bd:
  fhir:
    ig:
      packages:
        - classpath: classpath:packages/bd.gov.dghs.core-0.2.1.tgz
          id: bd.gov.dghs.core
          version: 0.2.1
        - classpath: classpath:packages/bd.gov.dghs.mccod-1.0.0.tgz
          id: bd.gov.dghs.mccod
          version: 1.0.0
        - classpath: classpath:packages/bd.gov.dghs.imci-1.0.0.tgz
          id: bd.gov.dghs.imci
          version: 1.0.0
```

Update `FhirServerConfig.java` to read this list and loop over it rather than having hardcoded paths. This means adding a new IG in future requires only a config change and a new `.tgz` — no Java code change.
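
A sketch of the config-driven loop. The `IgPackageProps` record and `configuredPackages()` stand-in are illustrative assumptions; in the running app the list would be bound from `application.yaml` via `@ConfigurationProperties(prefix = "bd.fhir.ig")`:

```java
import java.util.List;

public class ConfigDrivenIgLoading {
    // Mirrors one entry under bd.fhir.ig.packages
    record IgPackageProps(String classpath, String id, String version) {}

    // Hardcoded here only so the sketch is self-contained; the real list
    // comes from the Spring configuration binding.
    static List<IgPackageProps> configuredPackages() {
        return List.of(
            new IgPackageProps("classpath:packages/bd.gov.dghs.core-0.2.1.tgz",
                "bd.gov.dghs.core", "0.2.1"),
            new IgPackageProps("classpath:packages/bd.gov.dghs.mccod-1.0.0.tgz",
                "bd.gov.dghs.mccod", "1.0.0"),
            new IgPackageProps("classpath:packages/bd.gov.dghs.imci-1.0.0.tgz",
                "bd.gov.dghs.imci", "1.0.0"));
    }

    public static void main(String[] args) {
        // Inside npmPackageValidationSupport(), the hardcoded
        // loadPackageFromClasspath() calls become a single loop:
        for (IgPackageProps pkg : configuredPackages()) {
            // support.loadPackageFromClasspath(pkg.classpath());
            System.out.println("would load " + pkg.id() + " " + pkg.version());
        }
    }
}
```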

### 6. `.env` — add new IG version variables

Add version tracking variables for operational visibility and for the `/actuator/info` endpoint:

```bash
# BD Core IG
HAPI_IG_CORE_VERSION=0.2.1

# MCCoD IG
HAPI_IG_MCCOD_VERSION=1.0.0

# IMCI IG
HAPI_IG_IMCI_VERSION=1.0.0
```

### 7. Gitea workflow — add new IG package secrets

For each new IG, add a Gitea secret and decode it in the build step:

**Gitea → Repository → Settings → Secrets — add:**

| Secret | Value |
|--------|-------|
| `MCCOD_PACKAGE_B64` | output of `base64 -w 0 bd.gov.dghs.mccod-1.0.0.tgz` |
| `IMCI_PACKAGE_B64` | output of `base64 -w 0 bd.gov.dghs.imci-1.0.0.tgz` |

**Gitea → Repository → Settings → Variables — add:**

| Variable | Value |
|----------|-------|
| `MCCOD_PACKAGE_FILENAME` | `bd.gov.dghs.mccod-1.0.0.tgz` |
| `IMCI_PACKAGE_FILENAME` | `bd.gov.dghs.imci-1.0.0.tgz` |

**In `.gitea/workflows/build.yml` — extend the IG placement step:**

```yaml
- name: Place IG packages for build
  run: |
    echo "${{ secrets.IG_PACKAGE_B64 }}" | base64 -d > \
      hapi-overlay/src/main/resources/packages/${{ vars.IG_PACKAGE_FILENAME }}

    echo "${{ secrets.MCCOD_PACKAGE_B64 }}" | base64 -d > \
      hapi-overlay/src/main/resources/packages/${{ vars.MCCOD_PACKAGE_FILENAME }}

    echo "${{ secrets.IMCI_PACKAGE_B64 }}" | base64 -d > \
      hapi-overlay/src/main/resources/packages/${{ vars.IMCI_PACKAGE_FILENAME }}

    echo "Packages placed:"
    ls -lh hapi-overlay/src/main/resources/packages/
```

**And extend the cleanup step:**

```yaml
- name: Clean up IG packages from workspace
  if: always()
  run: rm -f hapi-overlay/src/main/resources/packages/*.tgz
```

---

## What does not change

| Component | Reason |
|-----------|--------|
| Validation chain order | `NpmPackageValidationSupport` handles all loaded IGs transparently |
| OCL integration | `BdTerminologyValidationSupport` intercepts only `http://id.who.int/icd/release/11/mms` — other systems in any IG route through normally |
| Cluster expression validator | ICD-11 specific — unaffected by other IGs |
| Keycloak auth | No change — all vendors use the `mci-api` role regardless of which IG they submit against |
| Audit tables | Schema is resource-type agnostic — new resource types are captured automatically |
| PostgreSQL schema | No migration needed — HAPI JPA stores all FHIR R4 resource types in the same tables |
| pgBouncer, nginx proxy config | Infrastructure is IG-agnostic |

---

## Terminology considerations for new IGs

If the MCCoD or IMCI IGs introduce coded elements using systems **other than ICD-11** that are already in your OCL instance (e.g., LOINC, drug ValueSets), no additional configuration is needed. `BdTerminologyValidationSupport` only handles ICD-11; all other systems fall through to HAPI's standard remote terminology mechanism, which already calls OCL.

If a new IG introduces a **new terminology system** not currently in OCL:

1. Import the new system into OCL first.
2. Verify OCL `$validate-code` works for the new system (single quotes keep the shell from expanding `$validate` and treating `&` as a control operator): `curl 'https://tr.ocl.dghs.gov.bd/api/fhir/CodeSystem/$validate-code?system={new-system-url}&code={test-code}'`
3. No HAPI code changes needed — HAPI's remote terminology support handles any system OCL knows about.

If a new IG introduces a terminology system that will **never be in OCL** (e.g., a purely local ValueSet defined within the IG itself), HAPI will validate it using `InMemoryTerminologyServerValidationSupport` from the concepts loaded with the IG package. No external call is made.

---

## Upgrade procedure for an existing specialised IG

When MCCoD advances from v1.0.0 to v1.1.0:

1. Place `bd.gov.dghs.mccod-1.1.0.tgz` in `packages/` and remove `bd.gov.dghs.mccod-1.0.0.tgz`.
2. Update the package path in `FhirServerConfig.java` (or in `application.yaml` if you implemented the config-driven approach from Step 5 above).
3. Update the `MCCOD_PACKAGE_FILENAME` Gitea variable to `bd.gov.dghs.mccod-1.1.0.tgz`.
4. Update the `MCCOD_PACKAGE_B64` Gitea secret with the base64 of the new package.
5. Tag and push — CI builds and pushes the new image.
6. Deploy the new image on the production server.

If the IG upgrade changes terminology ValueSets in OCL (new codes, reclassified codes), follow the cache flush procedure in `ops/version-upgrade-integration.md` after deployment.

---

## Deployment checklist for a new IG

- [ ] New IG `.tgz` placed in `packages/`, filename follows naming convention
- [ ] `FhirServerConfig.java` — `npmPackageValidationSupport()` loads new package
- [ ] `FhirServerConfig.java` — `BD_CORE_PROFILE_RESOURCE_TYPES` updated with new resource types
- [ ] `IgPackageInitializer.java` — new package included in initialisation loop
- [ ] `application.yaml` — new IG entry added under `bd.fhir.ig.packages`
- [ ] `.env` — new IG version variable added
- [ ] Gitea secrets — new `*_PACKAGE_B64` secret created
- [ ] Gitea variables — new `*_PACKAGE_FILENAME` variable created
- [ ] Gitea workflow — new package decode and cleanup steps added
- [ ] New image built, pushed, deployed
- [ ] Acceptance test: submit a resource claiming the new IG profile → 201 accepted
- [ ] Acceptance test: submit a resource violating the new IG profile → 422 rejected
- [ ] Acceptance test: existing Core IG submissions still work → 201 accepted
- [ ] Vendors notified of new IG availability and profile URLs

---

# BD FHIR National — Production Deployment Guide

**Target OS:** Ubuntu 22.04 LTS
**Audience:** DGHS infrastructure team
**Estimated time:** 90 minutes for the first deployment, 15 minutes for subsequent upgrades

---

## Prerequisites checklist

Before starting, confirm all of the following:

- [ ] Ubuntu 22.04 LTS server provisioned with a minimum of 8GB RAM, 4 vCPU, 100GB disk
- [ ] Server has outbound HTTPS access to:
  - `auth.dghs.gov.bd` (Keycloak)
  - `tr.ocl.dghs.gov.bd` (OCL)
  - `icd11.dghs.gov.bd` (cluster validator)
  - Your private Docker registry
- [ ] TLS certificates provisioned at paths matching `.env` `TLS_CERT_PATH` / `TLS_KEY_PATH`
- [ ] Keycloak `hris` realm configured per `ops/keycloak-setup.md`
- [ ] BD Core IG `bd.gov.dghs.core-0.2.1.tgz` present in `hapi-overlay/src/main/resources/packages/` on the CI machine
- [ ] CI machine has built and pushed the Docker image to the private registry
- [ ] `.env` file prepared from `.env.example` with all secrets filled in

---

## Part 1 — Server preparation

### 1.1 — Install Docker Engine

```bash
# Remove any conflicting packages
for pkg in docker.io docker-doc docker-compose docker-compose-v2 \
           podman-docker containerd runc; do
  sudo apt-get remove -y $pkg 2>/dev/null
done

# Add Docker's official GPG key
sudo apt-get update
sudo apt-get install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
  -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add Docker repository
echo \
  "deb [arch=$(dpkg --print-architecture) \
  signed-by=/etc/apt/keyrings/docker.asc] \
  https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker Engine and Compose plugin
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io \
  docker-buildx-plugin docker-compose-plugin

# Verify
docker --version          # Docker Engine 25.x or higher
docker compose version    # Docker Compose v2.x or higher
```

### 1.2 — Configure Docker daemon

```bash
# Create daemon config: limit log size, set storage driver
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "5"
  },
  "storage-driver": "overlay2",
  "live-restore": true
}
EOF

sudo systemctl restart docker
sudo systemctl enable docker

# Add your deploy user to the docker group (avoids sudo on every docker command)
sudo usermod -aG docker $USER
# Log out and back in for group membership to take effect
```

### 1.3 — Create application directory

```bash
sudo mkdir -p /opt/bd-fhir-national
sudo chown $USER:$USER /opt/bd-fhir-national
cd /opt/bd-fhir-national
```

### 1.4 — Deploy project files

Copy the entire project directory to the server. Recommended approach:

```bash
# From your CI/deployment machine:
rsync -avz --exclude='.git' \
  --exclude='hapi-overlay/target' \
  --exclude='hapi-overlay/src' \
  ./bd-fhir-national/ \
  deploy@your-server:/opt/bd-fhir-national/

# The server needs:
# /opt/bd-fhir-national/
# ├── docker-compose.yml
# ├── .env                          ← you create this (see 1.5)
# ├── nginx/nginx.conf
# ├── postgres/fhir/postgresql.conf
# ├── postgres/fhir/init.sql
# ├── postgres/audit/postgresql.conf
# └── postgres/audit/init.sql
#
# The hapi-overlay/ source tree does NOT need to be on the production server.
# Only the Docker image (pre-built and pushed to registry) is needed.
```

### 1.5 — Create .env file

```bash
cd /opt/bd-fhir-national
cp .env.example .env
chmod 600 .env   # restrict to owner only — contains secrets

# Edit .env with actual values
nano .env
```

**Required values in .env:**

```bash
# Docker image — must match what CI pushed
HAPI_IMAGE=your-registry.dghs.gov.bd/bd-fhir-hapi:1.0.0

# Passwords: generate each with `openssl rand -base64 32` and paste the
# literal value. Docker Compose reads .env as plain key=value pairs and
# does NOT perform shell command substitution — a value written as
# $(openssl rand -base64 32) would be stored verbatim, not executed.

# FHIR database
FHIR_DB_NAME=fhirdb
FHIR_DB_SUPERUSER=postgres
FHIR_DB_SUPERUSER_PASSWORD=<generated-password>
FHIR_DB_APP_USER=hapi_app
FHIR_DB_APP_PASSWORD=<generated-password>

# Audit database
AUDIT_DB_NAME=auditdb
AUDIT_DB_SUPERUSER=postgres
AUDIT_DB_SUPERUSER_PASSWORD=<generated-password>
AUDIT_DB_WRITER_USER=audit_writer_login
AUDIT_DB_WRITER_PASSWORD=<generated-password>
AUDIT_DB_MAINTAINER_USER=audit_maintainer_login
AUDIT_DB_MAINTAINER_PASSWORD=<generated-password>

# TLS certificate paths (absolute paths on this server)
TLS_CERT_PATH=/etc/ssl/dghs/fhir.dghs.gov.bd.crt
TLS_KEY_PATH=/etc/ssl/dghs/fhir.dghs.gov.bd.key
```

> **Security:** Never commit `.env` to version control. Store the filled
> `.env` in your secrets vault (HashiCorp Vault, AWS SSM, or an encrypted backup).
> Verify permissions after creation: `ls -la .env` should show `-rw-------`.

### 1.6 — Fix PostgreSQL init script password injection

The `postgres/audit/init.sql` file contains placeholder passwords.
PostgreSQL's Docker entrypoint does not perform variable substitution in
`.sql` init files — only in `.sh` files. Replace the init SQL with a shell script:

```bash
# Create shell-based init script for audit database
cat > /opt/bd-fhir-national/postgres/audit/init.sh <<'INITSCRIPT'
#!/bin/bash
set -e

# Load passwords from environment variables
# (These env vars are set in docker-compose.yml from .env)
WRITER_USER="${AUDIT_DB_WRITER_USER:-audit_writer_login}"
WRITER_PASS="${AUDIT_DB_WRITER_PASSWORD}"
MAINTAINER_USER="${AUDIT_DB_MAINTAINER_USER:-audit_maintainer_login}"
MAINTAINER_PASS="${AUDIT_DB_MAINTAINER_PASSWORD}"

psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
-- Create writer login user
DO \$\$
BEGIN
  IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '${WRITER_USER}') THEN
    CREATE USER ${WRITER_USER}
      WITH NOSUPERUSER NOCREATEDB NOCREATEROLE NOINHERIT LOGIN
      CONNECTION LIMIT 20
      PASSWORD '${WRITER_PASS}';
  END IF;
END
\$\$;

-- Create maintainer login user
DO \$\$
BEGIN
  IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '${MAINTAINER_USER}') THEN
    CREATE USER ${MAINTAINER_USER}
      WITH NOSUPERUSER NOCREATEDB NOCREATEROLE NOINHERIT LOGIN
      CONNECTION LIMIT 5
      PASSWORD '${MAINTAINER_PASS}';
  END IF;
END
\$\$;

GRANT CONNECT ON DATABASE ${POSTGRES_DB} TO ${WRITER_USER};
GRANT CONNECT ON DATABASE ${POSTGRES_DB} TO ${MAINTAINER_USER};
EOSQL
INITSCRIPT

chmod +x /opt/bd-fhir-national/postgres/audit/init.sh
```

Update `docker-compose.yml` to mount `init.sh` instead of `init.sql` for
the `postgres-audit` service:

```yaml
# In the postgres-audit volumes: section, change:
#   - ./postgres/audit/init.sql:/docker-entrypoint-initdb.d/init.sql:ro
# To:
- ./postgres/audit/init.sh:/docker-entrypoint-initdb.d/init.sh:ro
```

Also pass the audit user environment variables to `postgres-audit`:

```yaml
# In the postgres-audit environment: section, add:
AUDIT_DB_WRITER_USER: ${AUDIT_DB_WRITER_USER}
AUDIT_DB_WRITER_PASSWORD: ${AUDIT_DB_WRITER_PASSWORD}
AUDIT_DB_MAINTAINER_USER: ${AUDIT_DB_MAINTAINER_USER}
AUDIT_DB_MAINTAINER_PASSWORD: ${AUDIT_DB_MAINTAINER_PASSWORD}
```

Similarly for `postgres-fhir`, create `init.sh`:

```bash
cat > /opt/bd-fhir-national/postgres/fhir/init.sh <<'INITSCRIPT'
#!/bin/bash
set -e

APP_USER="${FHIR_DB_APP_USER:-hapi_app}"
APP_PASS="${FHIR_DB_APP_PASSWORD}"

psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
DO \$\$
BEGIN
  IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '${APP_USER}') THEN
    CREATE USER ${APP_USER}
      WITH NOSUPERUSER NOCREATEDB NOCREATEROLE NOINHERIT LOGIN
      CONNECTION LIMIT 30
      PASSWORD '${APP_PASS}';
  END IF;
END
\$\$;

GRANT CONNECT ON DATABASE ${POSTGRES_DB} TO ${APP_USER};
GRANT USAGE ON SCHEMA public TO ${APP_USER};
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO ${APP_USER};
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT USAGE, SELECT ON SEQUENCES TO ${APP_USER};
EOSQL
INITSCRIPT

chmod +x /opt/bd-fhir-national/postgres/fhir/init.sh
```

### 1.7 — Authenticate with private Docker registry

```bash
# Log in to your private registry (password read from stdin, not argv)
docker login your-registry.dghs.gov.bd \
  --username "${REGISTRY_USER}" \
  --password-stdin <<< "${REGISTRY_PASSWORD}"

# Verify the login persisted
jq '.auths | keys' ~/.docker/config.json
# Should include "your-registry.dghs.gov.bd"
```

---

## Part 2 — First deployment

### 2.1 — Pull images

```bash
cd /opt/bd-fhir-national

# Pull all images declared in docker-compose.yml
docker compose --env-file .env pull

# Verify images are present locally
docker images | grep -E "hapi|postgres|pgbouncer|nginx"
```

### 2.2 — Start infrastructure services first

Start databases before HAPI. HAPI's `depends_on` with `condition: service_healthy`
handles this automatically, but starting manually in stages helps isolate
any first-run issues.

```bash
# Start databases
docker compose --env-file .env up -d postgres-fhir postgres-audit

# Wait for health checks to pass (up to 60 seconds)
echo "Waiting for PostgreSQL to be ready..."
until docker compose --env-file .env ps postgres-fhir \
    | grep -q "healthy"; do
  sleep 3
  echo -n "."
done
echo ""
echo "postgres-fhir: healthy"

until docker compose --env-file .env ps postgres-audit \
    | grep -q "healthy"; do
  sleep 3
  echo -n "."
done
echo ""
echo "postgres-audit: healthy"
```

### 2.3 — Verify PostgreSQL user creation

```bash
# Verify the FHIR app user was created
docker exec bd-postgres-fhir psql -U postgres -d fhirdb -c \
  "SELECT rolname, rolcanlogin FROM pg_roles WHERE rolname = 'hapi_app';"
# Expected: hapi_app | t

# Verify the audit writer user was created
docker exec bd-postgres-audit psql -U postgres -d auditdb -c \
  "SELECT rolname, rolcanlogin FROM pg_roles WHERE rolname = 'audit_writer_login';"
# Expected: audit_writer_login | t
```

### 2.4 — Start pgBouncer

```bash
docker compose --env-file .env up -d pgbouncer-fhir pgbouncer-audit

# Verify pgBouncer is healthy
until docker compose --env-file .env ps pgbouncer-fhir \
    | grep -q "healthy"; do
  sleep 3
done
echo "pgbouncer-fhir: healthy"
```

### 2.5 — Start HAPI (first replica)

```bash
docker compose --env-file .env up -d hapi

# Follow startup logs — this takes 60-120 seconds on first run.
# Watch for these key log events in order:
#   1. "Running FHIR Flyway migrations"             — V1 schema creation
#   2. "Running Audit Flyway migrations"            — V2 audit schema creation
#   3. "Advisory lock acquired"                     — IG package initialisation begins
#   4. "BD Core IG package loaded successfully"     — IG loaded
#   5. "BdTerminologyValidationSupport initialised" — OCL integration ready
#   6. "KeycloakJwtInterceptor initialised"         — JWT validation ready
#   7. "HAPI RestfulServer interceptors registered" — server ready
#   8. Spring Boot startup completion message with port 8080

docker compose --env-file .env logs -f hapi
# Press Ctrl+C when you see the startup completion message
```

**Expected startup log sequence (key lines only):**

```
INFO o.f.core.internal.command.DbMigrate - Running FHIR Flyway migrations
INFO o.f.core.internal.command.DbMigrate - Successfully applied 1 migration to schema "public"
INFO o.f.core.internal.command.DbMigrate - Running Audit Flyway migrations
INFO o.f.core.internal.command.DbMigrate - Successfully applied 1 migration to schema "audit"
INFO b.g.d.f.init.IgPackageInitializer - Advisory lock acquired: lockKey=... waitedMs=...
INFO b.g.d.f.init.IgPackageInitializer - BD Core IG package loaded successfully: version=0.2.1
INFO b.g.d.f.t.BdTerminologyValidationSupport - BdTerminologyValidationSupport initialised
INFO b.g.d.f.i.KeycloakJwtInterceptor - KeycloakJwtInterceptor initialised
INFO b.g.d.f.c.SecurityConfig - HAPI RestfulServer interceptors registered
INFO o.s.b.w.e.t.TomcatWebServer - Tomcat started on port(s): 8080
INFO b.g.d.f.BdFhirApplication - Started BdFhirApplication in XX.XXX seconds
```

### 2.6 — Start nginx

```bash
docker compose --env-file .env up -d nginx

# Verify nginx started without config errors
docker compose --env-file .env logs nginx | tail -20
# Should NOT contain: [emerg] or [crit] — only [notice] lines

# Verify nginx health
docker compose --env-file .env ps nginx
# Status should be: Up (healthy)
```

### 2.7 — Verify full stack health

```bash
# Internal health check (bypasses nginx, hits HAPI directly)
docker exec $(docker compose --env-file .env ps -q hapi | head -1) \
  curl -s http://localhost:8080/actuator/health | jq .

# Expected output:
# {
#   "status": "UP",
#   "components": {
#     "db": { "status": "UP" },
#     "auditDb": { "status": "UP" },
#     "ocl": { "status": "UP" },
#     "livenessState": { "status": "UP" },
#     "readinessState": { "status": "UP" }
#   }
# }

# External health check (through nginx + TLS)
curl -s https://fhir.dghs.gov.bd/actuator/health/liveness | jq .
# Expected: { "status": "UP" }

# FHIR metadata endpoint (unauthenticated)
curl -s https://fhir.dghs.gov.bd/fhir/metadata | jq '{
  resourceType,
  fhirVersion,
  software: .software,
  implementation: .implementation
}'
# Expected:
# {
#   "resourceType": "CapabilityStatement",
#   "fhirVersion": "4.0.1",
#   "software": { "name": "BD FHIR National Repository", "version": "0.2.1" }
# }
```

---

## Part 3 — Phase 2 acceptance tests

Run all seven tests before declaring the deployment production-ready.
Each test includes the expected HTTP status, expected response body shape,
and what to check in the audit log if the test fails.

### Setup: obtain a vendor test token

```bash
VENDOR_TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-vendor-TEST-FAC-001" \
  -d "client_secret=${TEST_VENDOR_SECRET}" \
  | jq -r '.access_token')

echo "Token obtained: ${VENDOR_TOKEN:0:20}..."
```
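
If a later test fails with 401/403, it helps to inspect the token's payload locally before suspecting the server. A small helper; the embedded token is a dummy for illustration, pass `"$VENDOR_TOKEN"` in practice:

```shell
# Decode the payload (second segment) of a JWT, handling the base64url
# alphabet and missing padding.
jwt_payload() {
  seg=$(printf '%s' "$1" | cut -d '.' -f 2 | tr '_-' '/+')
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do seg="${seg}="; done
  printf '%s' "$seg" | base64 -d
}

# header.payload.signature — this dummy payload is {"azp":"fhir-vendor-TEST-FAC-001"}
token='eyJhbGciOiJSUzI1NiJ9.eyJhenAiOiJmaGlyLXZlbmRvci1URVNULUZBQy0wMDEifQ.sig'
jwt_payload "$token"
# → {"azp":"fhir-vendor-TEST-FAC-001"}
```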

---

### Test 1 — Valid Condition with valid ICD-11 code → 201

Submits a BD Core IG-compliant `bd-condition` resource with a valid
ICD-11 Diagnosis-class code.

```bash
curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Authorization: Bearer ${VENDOR_TOKEN}" \
  -H "Content-Type: application/fhir+json" \
  -d '{
    "resourceType": "Condition",
    "meta": {
      "profile": ["https://fhir.dghs.gov.bd/core/StructureDefinition/bd-condition"]
    },
    "clinicalStatus": {
      "coding": [{
        "system": "http://terminology.hl7.org/CodeSystem/condition-clinical",
        "code": "active"
      }]
    },
    "verificationStatus": {
      "coding": [{
        "system": "http://terminology.hl7.org/CodeSystem/condition-ver-status",
        "code": "confirmed"
      }]
    },
    "code": {
      "coding": [{
        "system": "http://id.who.int/icd/release/11/mms",
        "code": "1C62.0",
        "display": "Typhoid fever"
      }]
    },
    "subject": {
      "reference": "Patient/test-patient-001"
    },
    "recordedDate": "2025-03-01"
  }'
```

**Expected:** `HTTP 201 Created` with a `Location` header containing the new resource URL.

**If 422 instead:**
- Check OCL connectivity (single quotes prevent the shell from expanding `$validate` and backgrounding on `&`): `curl 'https://tr.ocl.dghs.gov.bd/api/fhir/CodeSystem/$validate-code?system=http://id.who.int/icd/release/11/mms&code=1C62.0'`
- Check the IG is loaded: `curl http://localhost:8080/actuator/health` — the OCL component should be UP
- Check HAPI logs for profile validation errors

---
### Test 2 — Invalid ICD-11 code → 422

```bash
curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Authorization: Bearer ${VENDOR_TOKEN}" \
  -H "Content-Type: application/fhir+json" \
  -d '{
    "resourceType": "Condition",
    "meta": {
      "profile": ["https://fhir.dghs.gov.bd/core/StructureDefinition/bd-condition"]
    },
    "clinicalStatus": {
      "coding": [{"system": "http://terminology.hl7.org/CodeSystem/condition-clinical", "code": "active"}]
    },
    "verificationStatus": {
      "coding": [{"system": "http://terminology.hl7.org/CodeSystem/condition-ver-status", "code": "confirmed"}]
    },
    "code": {
      "coding": [{
        "system": "http://id.who.int/icd/release/11/mms",
        "code": "INVALID-CODE-99999",
        "display": "This code does not exist"
      }]
    },
    "subject": {"reference": "Patient/test-patient-001"},
    "recordedDate": "2025-03-01"
  }'
```

**Expected:** `HTTP 422 Unprocessable Entity` with an OperationOutcome containing:
- `issue[0].severity`: `error`
- `issue[0].diagnostics`: contains "INVALID-CODE-99999" and the rejection reason
- `issue[0].expression`: contains `Condition.code`

**Verify in audit table:**
```sql
SELECT rejection_code, rejection_reason, invalid_code, element_path
FROM audit.fhir_rejected_submissions
ORDER BY submission_time DESC LIMIT 1;
-- Expected: TERMINOLOGY_INVALID_CODE | OCL rejected code... | INVALID-CODE-99999 | Condition.code...
```
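The acceptance tests in this section are easier to re-run after every upgrade if wrapped in a small status-assertion helper. A minimal sketch (`expect_status` is an illustrative name, not part of the shipped tooling):

```bash
# expect_status EXPECTED [curl args...] — run a request, compare the HTTP
# status to the expected value, and return non-zero on mismatch so the
# helper can gate a CI job.
expect_status() {
  local expected="$1"; shift
  local actual
  actual=$(curl -s -o /dev/null -w '%{http_code}' "$@")
  if [ "$actual" = "$expected" ]; then
    echo "PASS (HTTP $actual)"
  else
    echo "FAIL (expected $expected, got $actual)" >&2
    return 1
  fi
}

# Example — re-run Test 2 non-interactively:
# expect_status 422 -X POST https://fhir.dghs.gov.bd/fhir/Condition \
#   -H "Authorization: Bearer ${VENDOR_TOKEN}" \
#   -H "Content-Type: application/fhir+json" \
#   -d @invalid-condition.json
```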
---

### Test 3 — Device-class ICD-11 code in Condition.code → 422

Device-class codes are valid ICD-11 codes but are not in the
`bd-condition-icd11-diagnosis-valueset` (restricted to Diagnosis + Finding).

```bash
# XA7RE2 is an example Device-class code in ICD-11 MMS.
# Verify it is Device-class in OCL before running this test (single quotes
# prevent the shell from expanding $lookup):
# curl 'https://tr.ocl.dghs.gov.bd/api/fhir/CodeSystem/$lookup?system=http://id.who.int/icd/release/11/mms&code=XA7RE2'

curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Authorization: Bearer ${VENDOR_TOKEN}" \
  -H "Content-Type: application/fhir+json" \
  -d '{
    "resourceType": "Condition",
    "meta": {
      "profile": ["https://fhir.dghs.gov.bd/core/StructureDefinition/bd-condition"]
    },
    "clinicalStatus": {
      "coding": [{"system": "http://terminology.hl7.org/CodeSystem/condition-clinical", "code": "active"}]
    },
    "verificationStatus": {
      "coding": [{"system": "http://terminology.hl7.org/CodeSystem/condition-ver-status", "code": "confirmed"}]
    },
    "code": {
      "coding": [{
        "system": "http://id.who.int/icd/release/11/mms",
        "code": "XA7RE2",
        "display": "Device code — should be rejected"
      }]
    },
    "subject": {"reference": "Patient/test-patient-001"},
    "recordedDate": "2025-03-01"
  }'
```

**Expected:** `HTTP 422` with OperationOutcome.
Rejection code in audit: `TERMINOLOGY_INVALID_CLASS`.

**If 201 instead (code accepted):**
- OCL ValueSet class restriction is not enforcing correctly
- Verify the ValueSet collection in OCL has the correct concept_class filter
- Run: `python version_upgrade.py --verify-class-restriction`

---

### Test 4 — Profile violation (missing required field) → 422

Submits a Condition missing `clinicalStatus`, which is required by the `bd-condition` profile.

```bash
curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Authorization: Bearer ${VENDOR_TOKEN}" \
  -H "Content-Type: application/fhir+json" \
  -d '{
    "resourceType": "Condition",
    "meta": {
      "profile": ["https://fhir.dghs.gov.bd/core/StructureDefinition/bd-condition"]
    },
    "code": {
      "coding": [{
        "system": "http://id.who.int/icd/release/11/mms",
        "code": "1C62.0"
      }]
    },
    "subject": {"reference": "Patient/test-patient-001"}
  }'
```

**Expected:** `HTTP 422` with OperationOutcome referencing missing `clinicalStatus`.

**If 201 instead:**
- BD Core IG is not loaded or the profile is not enforcing `clinicalStatus` as required
- Check startup logs for IG load success
- Verify: `curl http://localhost:8080/fhir/StructureDefinition/bd-condition`

---

### Test 5 — No Bearer token → 401

```bash
curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Content-Type: application/fhir+json" \
  -d '{"resourceType": "Condition"}'
```

**Expected:** `HTTP 401` with `WWW-Authenticate` header and OperationOutcome.

```bash
# Verify the WWW-Authenticate header is present
# (use -i to include response headers; -I would switch the request to HEAD)
curl -s -i \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Content-Type: application/fhir+json" \
  -d '{"resourceType":"Condition"}' \
  | grep -i "www-authenticate"
# Expected: WWW-Authenticate: Bearer realm="BD FHIR National Repository"...
```

---

### Test 6 — Valid token but missing mci-api role → 401

Create a test client WITHOUT the `mci-api` role in Keycloak for this test,
or use a token from a different realm.

```bash
# Token from a client without the mci-api role
NO_ROLE_TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-test-no-role" \
  -d "client_secret=${TEST_NO_ROLE_SECRET}" \
  | jq -r '.access_token')

curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Authorization: Bearer ${NO_ROLE_TOKEN}" \
  -H "Content-Type: application/fhir+json" \
  -d '{"resourceType": "Condition"}'
```

**Expected:** `HTTP 401`.

**Verify in audit log:**
```sql
SELECT event_type, outcome_detail, client_id
FROM audit.audit_events
WHERE event_type = 'AUTH_FAILURE'
ORDER BY event_time DESC LIMIT 1;
-- Expected: AUTH_FAILURE | Required role 'mci-api' not present... | fhir-test-no-role
```

---

### Test 7 — Expired token → 401

```bash
# An expired token is one whose 'exp' claim is in the past.
# Easiest approach: obtain a token, wait for it to expire (default: 5 minutes),
# then use it.
#
# For automated testing, forge an expired token manually:
# (This requires knowing the signing key — use only in test environments)
#
# Alternative: use a token from a deactivated Keycloak client
# (revoke the client's credentials, existing tokens become invalid)

# Or simply wait:
echo "Waiting 6 minutes for token to expire..."
EXPIRED_TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-vendor-TEST-FAC-001" \
  -d "client_secret=${TEST_VENDOR_SECRET}" \
  | jq -r '.access_token')

sleep 360  # wait out the 5-minute Keycloak default expiry

curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Authorization: Bearer ${EXPIRED_TOKEN}" \
  -H "Content-Type: application/fhir+json" \
  -d '{"resourceType": "Condition"}'
```

**Expected:** `HTTP 401` — "Token has expired".

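Rather than guessing when the token has lapsed, the `exp` claim can be decoded locally. A sketch (`token_ttl` is an illustrative helper, not part of the shipped tooling):

```bash
# token_ttl JWT — print seconds until the token's exp claim (negative once
# expired). Uses only cut, tr, base64, jq, and date.
token_ttl() {
  local payload exp now
  # JWT segments are base64url without padding: translate the alphabet
  # back to standard base64 and re-pad before decoding
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do payload="${payload}="; done
  exp=$(printf '%s' "$payload" | base64 -d 2>/dev/null | jq -r '.exp')
  now=$(date +%s)
  echo $(( exp - now ))
}

# Example:
# token_ttl "${EXPIRED_TOKEN}"
```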

---

### Test 8 — Cluster expression: raw postcoordinated code without extension → 422

```bash
curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Authorization: Bearer ${VENDOR_TOKEN}" \
  -H "Content-Type: application/fhir+json" \
  -d '{
    "resourceType": "Condition",
    "meta": {
      "profile": ["https://fhir.dghs.gov.bd/core/StructureDefinition/bd-condition"]
    },
    "clinicalStatus": {
      "coding": [{"system": "http://terminology.hl7.org/CodeSystem/condition-clinical", "code": "active"}]
    },
    "verificationStatus": {
      "coding": [{"system": "http://terminology.hl7.org/CodeSystem/condition-ver-status", "code": "confirmed"}]
    },
    "code": {
      "coding": [{
        "system": "http://id.who.int/icd/release/11/mms",
        "code": "1C62.0&has_severity=mild",
        "display": "Raw postcoordinated string — prohibited"
      }]
    },
    "subject": {"reference": "Patient/test-patient-001"},
    "recordedDate": "2025-03-01"
  }'
```

**Expected:** `HTTP 422` with an OperationOutcome diagnostic:
`"ICD-11 postcoordinated expression in Condition.code.coding[0] must use the icd11-cluster-expression extension"`

Rejection code in audit: `CLUSTER_STEM_MISSING_EXTENSION`.

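For contrast, a compliant submission keeps the stem code in `Coding.code` and carries the full postcoordinated expression inside the extension. The exact extension shape is defined by the BD Core IG; the fragment below assumes the canonical URL follows the core `StructureDefinition` base and that the expression travels as a `valueString`, so verify both against the published profile:

```json
"code": {
  "coding": [{
    "system": "http://id.who.int/icd/release/11/mms",
    "code": "1C62.0",
    "extension": [{
      "url": "https://fhir.dghs.gov.bd/core/StructureDefinition/icd11-cluster-expression",
      "valueString": "1C62.0&has_severity=mild"
    }]
  }]
}
```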

---

### Test 9 — Cache flush endpoint requires fhir-admin role

```bash
# Attempt with vendor token (mci-api only) — should be 403
curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X DELETE https://fhir.dghs.gov.bd/admin/terminology/cache \
  -H "Authorization: Bearer ${VENDOR_TOKEN}"
# Expected: 403 (blocked by nginx IP restriction OR TerminologyCacheManager role check)

# Attempt with fhir-admin token — should be 200
ADMIN_TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-admin-pipeline" \
  -d "client_secret=${FHIR_ADMIN_CLIENT_SECRET}" \
  | jq -r '.access_token')

# Note: /admin/ is restricted to 172.20.0.0/16 in nginx.
# Run this from within the Docker network or from the server itself:
docker exec $(docker compose --env-file .env ps -q hapi | head -1) \
  curl -s -X DELETE \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  http://localhost:8080/admin/terminology/cache | jq .
# Expected: 200 with { "status": "flushed", "entriesEvicted": N, ... }
```

---

## Part 4 — Subsequent deployments (image upgrade)

When a new Docker image is built and pushed (new IG version, code changes):

```bash
cd /opt/bd-fhir-national

# 1. Update image tag in .env
nano .env
# Change: HAPI_IMAGE=your-registry.dghs.gov.bd/bd-fhir-hapi:1.0.0
# To:     HAPI_IMAGE=your-registry.dghs.gov.bd/bd-fhir-hapi:1.1.0

# 2. Pull new image
docker compose --env-file .env pull hapi

# 3. Rolling restart — replaces containers one at a time
# At 1 replica (pilot): brief downtime expected (~30s)
docker compose --env-file .env up -d --no-deps hapi

# At 3 replicas (Phase 2): true rolling update — scale up then scale down
docker compose --env-file .env up -d --no-deps --scale hapi=4 hapi
# Wait for the new replica to be healthy:
sleep 30
docker compose --env-file .env up -d --no-deps --scale hapi=3 hapi

# 4. Verify startup
docker compose --env-file .env logs --tail=50 hapi

# 5. Run acceptance tests (at minimum Tests 1, 2, 5)
```
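The fixed `sleep 30` above can be replaced with a health gate that polls the actuator endpoint until it reports UP before the old replica is scaled away. A sketch (the `status` field check assumes the standard Spring Boot actuator health payload):

```bash
# wait_for_health URL [timeout_s] — poll an actuator health endpoint until
# the overall status is UP, or fail after the timeout.
wait_for_health() {
  local url="$1" timeout="${2:-120}" waited=0
  until [ "$(curl -sf "$url" | jq -r '.status')" = "UP" ]; do
    sleep 5; waited=$((waited + 5))
    if [ "$waited" -ge "$timeout" ]; then
      echo "health check timed out after ${timeout}s" >&2
      return 1
    fi
  done
  echo "healthy after ${waited}s"
}

# wait_for_health http://localhost:8080/actuator/health && \
#   docker compose --env-file .env up -d --no-deps --scale hapi=3 hapi
```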

---

## Part 5 — Operational runbook

### View logs

```bash
# All services
docker compose --env-file .env logs -f

# HAPI only (structured JSON — pipe through jq)
docker compose --env-file .env logs -f hapi | jq -R 'try fromjson'

# nginx access log
docker compose --env-file .env logs -f nginx

# Filter for rejected submissions in HAPI logs
# (the // "" guard keeps jq from erroring on entries without a message field)
docker compose --env-file .env logs hapi | \
  jq -R 'try fromjson | select((.message // "") | contains("rejected"))'
```

### Restart a specific service

```bash
docker compose --env-file .env restart hapi
docker compose --env-file .env restart nginx
```

### Emergency: full stack restart

```bash
docker compose --env-file .env down
docker compose --env-file .env up -d
```

### Query rejected submissions

```bash
docker exec bd-postgres-audit psql -U postgres -d auditdb -c "
SELECT
  submission_time,
  resource_type,
  rejection_code,
  LEFT(rejection_reason, 100) as reason,
  client_id
FROM audit.fhir_rejected_submissions
ORDER BY submission_time DESC
LIMIT 20;"
```

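For trend analysis during vendor onboarding, the same table can be aggregated by rejection code. A sketch against the columns used above:

```sql
-- Rejection counts per code over the last 7 days, worst first
SELECT rejection_code,
       COUNT(*) AS rejections,
       COUNT(DISTINCT client_id) AS affected_clients
FROM audit.fhir_rejected_submissions
WHERE submission_time > now() - interval '7 days'
GROUP BY rejection_code
ORDER BY rejections DESC;
```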
### Check pgBouncer pool status

```bash
# Connect to pgBouncer admin interface
docker exec -it bd-pgbouncer-fhir \
  psql -h localhost -p 5432 -U pgbouncer pgbouncer -c "SHOW POOLS;"
```

### Monitor disk usage

```bash
# PostgreSQL data volumes
docker system df -v | grep -E "postgres|audit"

# Log volume
docker system df -v | grep hapi-logs
```

303
ops/keycloak-setup.md
Normal file

# Keycloak Setup — BD FHIR National

**Realm:** `hris`
**Keycloak URL:** `https://auth.dghs.gov.bd`
**Audience:** DGHS Identity and Access Management team

---

## Overview

This document covers the Keycloak configuration required for BD FHIR National
deployment. It assumes the `hris` realm and `mci-api` role already exist
(pre-existing national HRIS configuration). Only the additions for FHIR
deployment are documented here.

---

## Part 1 — Create `fhir-admin` realm role

The `fhir-admin` role grants access to:
- `DELETE /admin/terminology/cache` — terminology cache flush
- `GET /admin/terminology/cache/stats` — cache statistics

This role is **not** assigned to vendor clients. It is assigned only to the
ICD-11 version upgrade pipeline service account and DGHS system administrators.

### Steps (Keycloak Admin Console)

1. Log in to `https://auth.dghs.gov.bd/admin/master/console`
2. Select realm: **hris**
3. Navigate to: **Realm roles** → **Create role**
4. Fill in:
   - **Role name:** `fhir-admin`
   - **Description:** `BD FHIR server administrative operations — cache management and system configuration`
5. Click **Save**

### Steps (Keycloak Admin REST API — for automation)

```bash
# Get admin token
ADMIN_TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/master/protocol/openid-connect/token" \
  -d "grant_type=password" \
  -d "client_id=admin-cli" \
  -d "username=${KEYCLOAK_ADMIN_USER}" \
  -d "password=${KEYCLOAK_ADMIN_PASSWORD}" \
  | jq -r '.access_token')

# Create fhir-admin role
curl -s -X POST \
  "https://auth.dghs.gov.bd/admin/realms/hris/roles" \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "fhir-admin",
    "description": "BD FHIR server administrative operations"
  }'
```

---

## Part 2 — Create `fhir-admin` service account client

The version upgrade pipeline authenticates with a dedicated client.
This client must never be shared with vendor systems.

### Steps (Admin Console)

1. Navigate to: **Clients** → **Create client**
2. **Client type:** OpenID Connect
3. **Client ID:** `fhir-admin-pipeline`
4. Click **Next**
5. **Client authentication:** ON (confidential client)
6. **Service accounts roles:** ON
7. **Standard flow:** OFF (machine-to-machine only)
8. Click **Save**

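As in Part 1, these console steps can also be scripted against the Keycloak Admin REST API. A sketch, using the `ADMIN_TOKEN` obtained in Part 1 (field names follow Keycloak's ClientRepresentation):

```bash
# ClientRepresentation mirroring the console steps above
CLIENT_JSON='{
  "clientId": "fhir-admin-pipeline",
  "protocol": "openid-connect",
  "publicClient": false,
  "serviceAccountsEnabled": true,
  "standardFlowEnabled": false,
  "directAccessGrantsEnabled": false
}'

# POST it with a master-realm admin token (obtained as in Part 1):
# curl -s -X POST "https://auth.dghs.gov.bd/admin/realms/hris/clients" \
#   -H "Authorization: Bearer ${ADMIN_TOKEN}" \
#   -H "Content-Type: application/json" \
#   -d "${CLIENT_JSON}"
```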
### Assign fhir-admin role to service account

1. Navigate to: **Clients** → `fhir-admin-pipeline` → **Service accounts roles**
2. Click **Assign role**
3. Filter by: **Filter by realm roles**
4. Select: `fhir-admin`
5. Click **Assign**

### Retrieve client secret

1. Navigate to: **Clients** → `fhir-admin-pipeline` → **Credentials**
2. Copy **Client secret** — store it in your secrets vault
3. This secret is used in `ops/version-upgrade-integration.md`

---

## Part 3 — Configure vendor clients

Each vendor organisation requires one Keycloak client. This section documents
the **template** for creating a vendor client. Repeat for each vendor.

### Naming convention

```
fhir-vendor-{organisation-id}
```

Where `{organisation-id}` is the DGHS facility code, e.g.:
- `fhir-vendor-DGHS-FAC-001` for Dhaka Medical College Hospital
- `fhir-vendor-DGHS-FAC-002` for Square Hospital

### Steps (Admin Console)

1. Navigate to: **Clients** → **Create client**
2. **Client type:** OpenID Connect
3. **Client ID:** `fhir-vendor-{organisation-id}`
4. Click **Next**
5. **Client authentication:** ON
6. **Service accounts roles:** ON
7. **Standard flow:** OFF
8. Click **Save**

### Assign mci-api role

1. Navigate to: **Clients** → `fhir-vendor-{org-id}` → **Service accounts roles**
2. Click **Assign role**
3. Select: `mci-api`
4. Click **Assign**

### Add sending_facility user attribute

The `sending_facility` claim is injected by a custom token mapper, which adds
the vendor's DGHS facility code to every token issued to this client. The
`KeycloakJwtInterceptor` reads this claim for audit logging.

**Without this mapper, audit logs will show `client_id` as the facility
identifier instead of the DGHS facility code. This degrades audit quality
and generates WARN logs in HAPI on every submission.**

#### Create user attribute on service account

1. Navigate to: **Clients** → `fhir-vendor-{org-id}` → **Service accounts**
2. Click the service account user link (e.g., `service-account-fhir-vendor-xxx`)
3. Navigate to: **Attributes** tab
4. Click **Add attribute**
5. Key: `sending_facility`
6. Value: `{DGHS facility code}` (e.g., `DGHS-FAC-001`)
7. Click **Save**

#### Create token mapper

1. Navigate to: **Clients** → `fhir-vendor-{org-id}` → **Client scopes**
2. Click the dedicated scope link (e.g., `fhir-vendor-xxx-dedicated`)
3. Navigate to: **Mappers** → **Add mapper** → **By configuration**
4. Select: **User Attribute**
5. Fill in:
   - **Name:** `sending-facility-mapper`
   - **User Attribute:** `sending_facility`
   - **Token Claim Name:** `sending_facility`
   - **Claim JSON Type:** String
   - **Add to access token:** ON
   - **Add to ID token:** OFF
   - **Add to userinfo:** OFF
6. Click **Save**

#### Verify token contains sending_facility

```bash
# Get vendor token
TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-vendor-{org-id}" \
  -d "client_secret={secret}" \
  | jq -r '.access_token')

# Decode and check claims (base64url-decode the middle segment; stderr is
# suppressed because JWT segments omit the base64 padding)
echo "$TOKEN" | cut -d. -f2 | base64 -d 2>/dev/null | jq '{
  iss,
  sub,
  azp,
  exp,
  sending_facility,
  realm_access: .realm_access.roles
}'

# Expected output:
# {
#   "iss": "https://auth.dghs.gov.bd/realms/hris",
#   "sub": "...",
#   "azp": "fhir-vendor-{org-id}",
#   "exp": ...,
#   "sending_facility": "DGHS-FAC-001",
#   "realm_access": ["mci-api", "offline_access"]
# }
```

---

## Part 4 — Token validation verification

After creating a client, verify the full token validation chain works
before onboarding the vendor.

### Test 1 — Valid token accepted

```bash
TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-vendor-{org-id}" \
  -d "client_secret={secret}" \
  | jq -r '.access_token')

curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Bearer ${TOKEN}" \
  https://fhir.dghs.gov.bd/fhir/Patient

# Expected: 200 (empty bundle) or 404 — NOT 401
```

### Test 2 — Missing token rejected

```bash
curl -s -o /dev/null -w "%{http_code}" \
  https://fhir.dghs.gov.bd/fhir/Patient

# Expected: 401
```

### Test 3 — Expired token rejected

```bash
# Use a deliberately expired token (exp in the past).
# Easiest: wait for a token to expire (default Keycloak token lifetime: 5 minutes),
# then attempt a request with the expired token.

# Expected: 401
```

### Test 4 — Wrong realm rejected

```bash
# Get a token from a different realm (if available) or forge the iss claim
# Expected: 401
```

### Test 5 — mci-api role required

```bash
# Create a test client WITHOUT the mci-api role
# Get a token for that client
# Attempt a FHIR request
# Expected: 401
```

### Test 6 — fhir-admin endpoint requires fhir-admin role

```bash
# Use a vendor token (mci-api only, no fhir-admin)
VENDOR_TOKEN=...

curl -s -w "\n%{http_code}" \
  -X DELETE \
  -H "Authorization: Bearer ${VENDOR_TOKEN}" \
  https://fhir.dghs.gov.bd/admin/terminology/cache

# Expected: 403

# Use fhir-admin token
ADMIN_TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-admin-pipeline" \
  -d "client_secret={admin_secret}" \
  | jq -r '.access_token')

curl -s -w "\n%{http_code}" \
  -X DELETE \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  https://fhir.dghs.gov.bd/admin/terminology/cache

# Expected: 200 with flush summary JSON
```

---

## Part 5 — Token lifetime configuration

Keycloak default access token lifetime is 5 minutes. For machine-to-machine
FHIR submissions, this is appropriate — vendor systems must refresh tokens
before expiry. Do not increase the token lifetime to accommodate vendors who
are not refreshing tokens correctly. Token refresh is the vendor's
responsibility, not a server-side workaround.

**Recommended settings for vendor clients:**

| Setting | Value | Rationale |
|---------|-------|-----------|
| Access Token Lifespan | 5 minutes | Short-lived — minimises window for token replay |
| Refresh Token Max Reuse | 0 | One-time use refresh tokens |
| Client Session Idle | 30 minutes | Vendor batch jobs may pause between submissions |
| Client Session Max | 8 hours | Maximum session for a single batch run |

Configure at: **Realm Settings** → **Tokens** for defaults,
or per-client at: **Clients** → `{client}` → **Advanced** → **Advanced settings**.
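A vendor-side pattern that respects these lifetimes is to cache the token and refresh it shortly before expiry, rather than per request. A sketch (`FACILITY_ID` and `VENDOR_SECRET` are illustrative variable names, not shipped tooling):

```bash
# Cache the access token client-side; refresh only when <60s of life remain.
# expires_in comes from the Keycloak token response.
TOKEN=""
TOKEN_EXP=0

get_token() {  # refreshes $TOKEN in place when needed
  local now resp
  now=$(date +%s)
  [ $(( TOKEN_EXP - now )) -ge 60 ] && return 0
  resp=$(curl -s -X POST \
    "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
    -d "grant_type=client_credentials" \
    -d "client_id=fhir-vendor-${FACILITY_ID}" \
    -d "client_secret=${VENDOR_SECRET}")
  TOKEN=$(printf '%s' "$resp" | jq -r '.access_token')
  TOKEN_EXP=$(( now + $(printf '%s' "$resp" | jq -r '.expires_in') ))
}

# Usage in a submission batch:
# for f in outbox/*.json; do
#   get_token
#   curl -s -X POST https://fhir.dghs.gov.bd/fhir/Condition \
#     -H "Authorization: Bearer ${TOKEN}" \
#     -H "Content-Type: application/fhir+json" \
#     -d @"$f"
# done
```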
271
ops/project-manifest.md
Normal file

# BD FHIR National — Project Manifest & Pre-Flight Checklist

**Project:** BD Core FHIR National Repository and Validation Engine
**IG Version:** BD Core FHIR IG v0.2.1
**FHIR Version:** R4 (4.0.1)
**HAPI Version:** 7.2.0
**Published by:** DGHS/MoHFW Bangladesh
**Generated:** 2025

---

## Complete file manifest

### Build and orchestration

| File | Step | Purpose |
|------|------|---------|
| `pom.xml` | 1 | Parent Maven POM. HAPI 7.2.0 BOM, Spring Boot 3.2.5, all version pins. |
| `hapi-overlay/pom.xml` | 2 | Child module POM. All runtime dependencies. Fat JAR output: `bd-fhir-hapi.jar`. |
| `hapi-overlay/Dockerfile` | 4 | Multi-stage build: Maven builder + eclipse-temurin:17-jre runtime. tini as PID 1. |
| `docker-compose.yml` | 4 | Production orchestration: HAPI, 2× PostgreSQL, 2× pgBouncer, nginx. Scaling roadmap in comments. |
| `.env.example` | 4 | Environment variable template. Copy to `.env`, fill secrets, `chmod 600`. |

### Database

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/resources/db/migration/fhir/V1__hapi_schema.sql` | 3 | HAPI 7.2.0 JPA schema. All tables, sequences, indexes. Flyway-managed. Partition comments at 10M+ rows. |
| `hapi-overlay/src/main/resources/db/migration/audit/V2__audit_schema.sql` | 3 | Audit schema. Partitioned `audit_events` and `fhir_rejected_submissions` by month 2025-2027. INSERT-only role grants. `create_next_month_partitions()` maintenance function. |
| `postgres/fhir/postgresql.conf` | 4 | PostgreSQL 15 tuning for HAPI JPA workload. 2GB container. SSD-optimised. |
| `postgres/audit/postgresql.conf` | 4 | PostgreSQL 15 tuning for audit INSERT workload. 1GB container. |
| `postgres/fhir/init.sql` | 4 | Template — **replace with `init.sh`** per deployment-guide.md §1.6 before first deploy. |
| `postgres/audit/init.sql` | 4 | Template — **replace with `init.sh`** per deployment-guide.md §1.6 before first deploy. |

### Application configuration

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/resources/application.yaml` | 5 | Complete Spring Boot + HAPI configuration. Dual datasource, dual Flyway, HAPI R4, validation chain, actuator, structured logging. All secrets via env vars. |
| `hapi-overlay/src/main/resources/logback-spring.xml` | 5 | Structured JSON logging via logstash-logback-encoder. Async appenders. MDC field inclusion. |

### Java source — entry point

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/BdFhirApplication.java` | 12 | Spring Boot entry point. `@EnableAsync` activates audit async executor. |

### Java source — configuration

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/config/DataSourceConfig.java` | 6 | Dual datasource wiring. Primary FHIR datasource (HikariCP, pgBouncer session mode). Secondary audit datasource (INSERT-only). Dual Flyway instances. `auditDbHealthIndicator` using INSERT test. `oclHealthIndicator`. `entityManagerFactory` bound explicitly to FHIR datasource. |
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/config/FhirServerConfig.java` | 6 | Validation support chain (6 supports in dependency order). `NpmPackageValidationSupport` loading BD Core IG. `RequestValidatingInterceptor` with failOnSeverity=ERROR. `unvalidatedProfileTagInterceptor` for unknown resource types. Startup IG presence check. |
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/config/SecurityConfig.java` | 8 | Registers JWT, validation, and audit interceptors into HAPI RestfulServer in correct order. HTTPS enforcement filter. Security response headers filter. |

### Java source — initialisation

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/init/IgPackageInitializer.java` | 9 | `InitializingBean` that loads BD Core IG with PostgreSQL advisory lock. Prevents multi-replica NPM_PACKAGE race condition. djb2 hash for stable lock key. |

### Java source — interceptors

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/interceptor/KeycloakJwtInterceptor.java` | 8 | Nimbus JOSE+JWT with `RemoteJWKSet` (1-hour TTL, kid-based refresh). Validates: signature, expiry, issuer, `mci-api` role. Extracts: `client_id`, `subject`, `sending_facility`. Sets all `REQUEST_ATTR_*` constants. MDC population and guaranteed cleanup. `GET /fhir/metadata` and actuator health exempt. |
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/interceptor/AuditEventInterceptor.java` | 9 | Three-hook interceptor: (1) cluster expression pre-validation, (2) accepted resource audit at `STORAGE_PRESTORAGE_*`, (3) rejected resource audit at `SERVER_HANDLE_EXCEPTION`. Routes to `AuditEventEmitter` and `RejectedSubmissionSink` asynchronously. |

### Java source — terminology

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/terminology/BdTerminologyValidationSupport.java` | 7 | Custom `IValidationSupport`. Forces `$validate-code` for ICD-11. Suppresses `$expand` via `isValueSetSupported()=false`. 24-hour `ConcurrentHashMap` cache with TTL eviction. Retry with exponential backoff. Fail-open on OCL outage. `flushCache()` called by `TerminologyCacheManager`. |
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/terminology/TerminologyCacheManager.java` | 7 | REST controller: `DELETE /admin/terminology/cache` and `GET /admin/terminology/cache/stats`. Requires `fhir-admin` role (read from `REQUEST_ATTR_IS_ADMIN`). Called by ICD-11 version upgrade pipeline. |

### Java source — validator

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/validator/ClusterExpressionValidator.java` | 7 | Detects `icd11-cluster-expression` extension on ICD-11 `Coding` elements. Rejects raw postcoordinated strings (contains `&`, `/`, `%` without extension) with 422. Calls `https://icd11.dghs.gov.bd/cluster/validate` for full expression validation. Fail-open on cluster validator outage. |

### Java source — audit

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/audit/AuditEventEmitter.java` | 9 | `@Async` INSERT to `audit.audit_events`. Immutable (INSERT only — `audit_writer` role enforces at DB level). Serialises `validationMessages` as JSONB. Truncates fields to column lengths. Logs ERROR on write failure (audit gap is a high-priority incident). |
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/audit/RejectedSubmissionSink.java` | 9 | `@Async` INSERT to `audit.fhir_rejected_submissions`. Stores full resource payload as TEXT (preserves exact bytes). 4MB payload cap (anti-DoS). Machine-readable `rejection_code` for programmatic analysis. |

### Infrastructure

| File | Step | Purpose |
|------|------|---------|
| `nginx/nginx.conf` | 10 | Reverse proxy. TLS 1.2/1.3 only. Rate limiting: FHIR 10r/s, admin 6r/m, metadata 5r/s. `/admin/` restricted to `172.20.0.0/16`. `/actuator/` restricted to internal network. `/fhir/metadata` unauthenticated. All other paths → HAPI. |
| `hapi-overlay/src/main/resources/packages/.gitkeep` | 12 | Marks the IG package directory for git. CI pipeline places `bd.gov.dghs.core-{version}.tgz` here before `docker build`. |

### Operations

| File | Step | Purpose |
|------|------|---------|
| `ops/keycloak-setup.md` | 10 | `fhir-admin` role creation. `fhir-admin-pipeline` client setup. Vendor client template. `sending_facility` mapper configuration. Token verification tests. |
| `ops/version-upgrade-integration.md` | 10 | ICD-11 upgrade pipeline integration. Pre-flush OCL verification. `get_fhir_admin_token()`, `flush_hapi_terminology_cache()`, `verify_hapi_validates_new_version()` Python functions. `post_ocl_import_hapi_integration()` call site. Rollback procedure. |
| `ops/scaling-roadmap.md` | 10 | Phase 1→2→3 thresholds and changes. Monthly partition maintenance cron. PostgreSQL monitoring queries. IG upgrade procedure. Key Prometheus metrics and alert thresholds. |
| `ops/deployment-guide.md` | 11 | Step-by-step Ubuntu 22.04 deployment. Docker install, daemon config, registry auth. PostgreSQL init script fix (critical). First-deploy sequence. Nine acceptance tests. Rolling upgrade procedure. Operational runbook. |

---
|
||||
|
||||
## Pre-flight checklist
|
||||
|
||||
Work through this list top to bottom before running `docker compose up`.
|
||||
Each item is a documented failure mode from real HAPI deployments.
|
||||
**Do not skip items marked CRITICAL.**
|
||||
|
||||
---
|
||||
|
||||
### CI machine (before docker build)
|
||||
|
||||
- [ ] **[CRITICAL]** `bd.gov.dghs.core-0.2.1.tgz` placed in `hapi-overlay/src/main/resources/packages/`
|
||||
*Symptom if missing: startup fails with `STARTUP FAILURE: BD Core IG package not found`. Container will not start.*
|
||||
|
||||
- [ ] `HAPI_IG_PACKAGE_CLASSPATH` in `docker-compose.yml` matches the `.tgz` filename exactly
|
||||
*Symptom if mismatch: same STARTUP FAILURE as above.*
|
||||
|
||||
- [ ] Docker image built with correct `--build-arg` values and pushed to private registry
|
||||
*Verify: `docker manifest inspect your-registry.dghs.gov.bd/bd-fhir-hapi:1.0.0`*
|
||||
|
||||
- [ ] Image tag in `.env.example` (and your `.env`) matches the pushed image tag
|
||||
*Symptom if mismatch: `docker compose pull` pulls wrong image or fails.*
|
||||
|
||||
---
|
||||
|
||||
### Production server (before docker compose up)
|
||||
|
||||
- [ ] **[CRITICAL]** `postgres/fhir/init.sql` replaced with `init.sh` (deployment-guide.md §1.6)
|
||||
*Symptom if skipped: `hapi_app` user is never created. Flyway migrations succeed but HAPI runtime fails with authentication error to postgres-fhir.*
|
||||
|
||||
- [ ] **[CRITICAL]** `postgres/audit/init.sql` replaced with `init.sh` (deployment-guide.md §1.6)
|
||||
*Symptom if skipped: `audit_writer_login` never created. HAPI starts but all audit writes fail with `FATAL: password authentication failed for user "audit_writer_login"`.*
|
||||
|
||||
- [ ] `docker-compose.yml` `postgres-audit` service updated to mount `init.sh` (not `init.sql`) and passes `AUDIT_DB_WRITER_USER/PASSWORD/MAINTAINER_*` env vars
|
||||
*Follows from the init.sh fix above.*
|
||||
|
||||
- [ ] `.env` file created, all `<CHANGE_ME>` values replaced, `chmod 600 .env`
|
||||
*Verify: `grep CHANGE_ME .env` returns no output.*
|
||||
|
||||
- [ ] `TLS_CERT_PATH` and `TLS_KEY_PATH` in `.env` point to files that exist on the server
|
||||
*Verify: `ls -la $(grep TLS_CERT_PATH .env | cut -d= -f2)`*
|
||||
|
||||
- [ ] Server can reach all external services from within the Docker network:
|
||||
```bash
|
||||
# Test from inside a temporary container on the Docker network
|
||||
docker run --rm --network bd-fhir-national_backend-fhir alpine sh -c \
|
||||
"apk add -q curl && curl -s -o /dev/null -w '%{http_code}' \
|
||||
https://auth.dghs.gov.bd/realms/hris/.well-known/openid-configuration"
|
||||
# Expected: 200
|
||||
```
|
||||
*Symptom if unreachable: KeycloakJwtInterceptor fails to fetch JWKS on startup. All authenticated requests return 401 even with valid tokens.*
|
||||
|
||||
- [ ] `random_page_cost` in both `postgresql.conf` files matches your storage type
|
||||
`1.1` for SSD (default in this project), `4.0` for spinning HDD
|
||||
*Symptom if wrong: query planner chooses sequential scans over indexes. FHIR search performance degrades at >100k resources.*
|
||||
|
||||
- [ ] Docker and Docker Compose v2 installed (`docker compose version`, not `docker-compose`)
|
||||
*Symptom if wrong: `docker-compose` (v1) does not support `deploy.replicas` or `condition: service_healthy`.*
|
||||
|
||||
- [ ] Private registry credentials stored in `~/.docker/config.json`
|
||||
*Verify: `docker login your-registry.dghs.gov.bd`*
|
||||
|
||||
---
|
||||
|
||||
### Keycloak (before first vendor submission)
|
||||
|
||||
- [ ] **[CRITICAL]** `fhir-admin` realm role created in `hris` realm (keycloak-setup.md Part 1)
|
||||
*Symptom if missing: `fhir-admin-pipeline` service account has no role to assign. Cache flush endpoint returns 403 for all callers.*
|
||||
|
||||
- [ ] **[CRITICAL]** `fhir-admin-pipeline` client created with `fhir-admin` role assigned (keycloak-setup.md Part 2)
|
||||
*Symptom if missing: version upgrade pipeline cannot flush cache. After ICD-11 upgrade, stale codes accepted/rejected for up to 24 hours.*
|
||||
|
||||
- [ ] At least one vendor client created (`fhir-vendor-TEST-FAC-001` for acceptance testing) with `mci-api` role and `sending_facility` attribute mapper (keycloak-setup.md Parts 3-4)
|
||||
*Symptom if missing: acceptance Test 1 returns 401. All vendor submissions rejected.*
|
||||
|
||||
- [ ] Token from test vendor client decoded and verified to contain:
|
||||
- `iss`: `https://auth.dghs.gov.bd/realms/hris`
|
||||
- `azp`: `fhir-vendor-TEST-FAC-001`
|
||||
- `realm_access.roles`: contains `mci-api`
|
||||
- `sending_facility`: non-empty facility code
|
||||
*Verify with: `echo $TOKEN | cut -d. -f2 | base64 -d 2>/dev/null | jq .`*
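
The four claims above can be checked with a small script instead of eyeballing `jq` output. A minimal sketch, assuming GNU `base64` and `jq`; it inspects claims offline and does not verify the signature (Keycloak and HAPI do that at runtime), and the helper name is ours:

```shell
# check_vendor_token: offline inspection of the four required claims.
# Does NOT verify the signature — Keycloak/HAPI do that at runtime.
check_vendor_token() {
  local payload claims
  payload=$(echo "$1" | cut -d. -f2 | tr '_-' '/+')
  case $(( ${#payload} % 4 )) in        # re-pad base64url before decoding
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  claims=$(echo "$payload" | base64 -d 2>/dev/null) \
    || { echo "FAIL: payload not decodable"; return 1; }
  [ "$(echo "$claims" | jq -r .iss)" = "https://auth.dghs.gov.bd/realms/hris" ] \
    || { echo "FAIL: wrong iss"; return 1; }
  echo "$claims" | jq -e '.realm_access.roles | index("mci-api")' >/dev/null \
    || { echo "FAIL: mci-api role missing"; return 1; }
  [ -n "$(echo "$claims" | jq -r '.sending_facility // empty')" ] \
    || { echo "FAIL: sending_facility empty"; return 1; }
  echo "OK: token claims look valid (azp=$(echo "$claims" | jq -r .azp))"
}
```

Run it as `check_vendor_token "$TOKEN"` against a token freshly obtained from the test vendor client.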

---

### Post-startup verification

- [ ] All health indicators GREEN:

  ```bash
  curl -s http://localhost:8080/actuator/health | jq '.components | keys'
  # Expected: ["auditDb", "db", "livenessState", "ocl", "readinessState"]
  # All must show "status": "UP"
  ```

- [ ] FHIR metadata accessible unauthenticated and showing the correct IG version:

  ```bash
  curl -s https://fhir.dghs.gov.bd/fhir/metadata | jq '.software.version'
  # Expected: "0.2.1"
  ```

- [ ] Flyway migration history shows V1 and V2 applied cleanly:

  ```bash
  docker exec bd-postgres-fhir psql -U postgres -d fhirdb \
    -c "SELECT version, description, success FROM flyway_schema_history;"
  # Expected: V1 | hapi_schema | t

  docker exec bd-postgres-audit psql -U postgres -d auditdb \
    -c "SELECT version, description, success FROM flyway_audit_schema_history;"
  # Expected: V2 | audit_schema | t
  ```

- [ ] Audit tables accepting inserts (INSERT-only role working):

  ```bash
  docker exec bd-postgres-audit psql -U audit_writer_login -d auditdb -c \
    "INSERT INTO audit.health_check (check_id) VALUES (gen_random_uuid())
     ON CONFLICT DO NOTHING; SELECT 'audit insert ok';"
  # Expected: audit insert ok
  ```

- [ ] **Run all nine acceptance tests** from deployment-guide.md Part 3
  Tests 1-9 must all produce the expected HTTP status codes before the server is declared production-ready.

---

### Operational readiness (before announcing to vendors)

- [ ] Partition maintenance cron configured on the audit database host (scaling-roadmap.md)
  *Run: `docker exec bd-postgres-audit psql -U postgres -d auditdb -c "SELECT audit.create_next_month_partitions();"` — verify it creates next month's partition without error.*

- [ ] Log shipping to ELK configured (or a Filebeat agent installed and shipping `/app/logs/`)
  *Minimum: verify logs appear in `docker compose logs hapi` in JSON format.*

- [ ] `FHIR_ADMIN_CLIENT_SECRET` stored in the version upgrade pipeline's secrets vault
  *Required by `ops/version-upgrade-integration.md` before the next ICD-11 release.*

- [ ] Next ICD-11 version upgrade date noted — the cache flush must be coordinated with OCL import completion
  *See `ops/version-upgrade-integration.md` for the 7-step procedure.*

- [ ] Vendor onboarding runbook prepared citing `ops/keycloak-setup.md` Parts 3-4
  *Each new vendor requires: a Keycloak client, the `mci-api` role, the `sending_facility` mapper, credentials delivery.*

---

## Architecture decision record — key decisions frozen in this implementation

The following decisions were finalised through the pre-implementation challenge process and are reflected throughout the codebase. They are not configurable at runtime without code changes.

| Decision | Rationale | Where enforced |
|----------|-----------|---------------|
| PostgreSQL only, no H2 | National infrastructure requires production-grade persistence | `DataSourceConfig.java`, Flyway migrations, `docker-compose.yml` |
| Validation on ALL requests | No vendor exemptions — uniform HIE boundary | `RequestValidatingInterceptor` with `failOnSeverity=ERROR` |
| OCL is the single terminology authority | No local ICD-11 copy — live validation | `BdTerminologyValidationSupport`, chain position 6 |
| `$expand` failures never cause rejection | Known OCL limitation | `isValueSetSupported()=false`, `expandValueSet()` returns null |
| Only `$validate-code` failures cause 422 | Distinguish expansion from validation | `BdTerminologyValidationSupport.validateCode()` |
| Keycloak `hris` realm, `mci-api` role, no basic auth | Single authentication authority | `KeycloakJwtInterceptor`, `SecurityConfig` |
| Audit log append-only, separate instance | Immutability, forensic separation | `postgres-audit` separate container, `audit_writer` INSERT-only role |
| Rejected payloads stored forensically | Vendor debugging, dispute resolution | `RejectedSubmissionSink`, `audit.fhir_rejected_submissions` |
| IG bundled in the Docker image | Reproducible builds, no runtime URL dependency | `Dockerfile` COPY, `IgPackageInitializer` |
| Cluster expressions via extension, not raw code | BD Core IG decided pattern | `ClusterExpressionValidator`, `POSTCOORD_CHARS` rejection |
| Fail-open for OCL/cluster validator outages | Service continuity over perfect validation | `BdTerminologyValidationSupport` catch blocks, `ClusterExpressionValidator` catch blocks |
| `meta.tag = unvalidated-profile` for unknown types | FHIR-native, queryable, no schema changes | `unvalidatedProfileTagInterceptor` in `FhirServerConfig` |
| pgBouncer session mode | Hibernate prepared statement compatibility | `docker-compose.yml` `PGBOUNCER_POOL_MODE: session` |
| Flyway bypasses pgBouncer for migrations | DDL transaction safety | `SPRING_FLYWAY_URL` points to `postgres-fhir:5432` directly |
| Advisory lock for IG initialisation | Multi-replica startup race prevention | `IgPackageInitializer` djb2 lock key |
| Two MDC cleanup hooks | Thread pool MDC leak prevention | `KeycloakJwtInterceptor` `COMPLETED_NORMALLY` + `COMPLETED` |

327
ops/scaling-roadmap.md
Normal file
@@ -0,0 +1,327 @@

# Scaling Roadmap — BD FHIR National

**Audience:** DGHS infrastructure team, future system architects
**Current phase:** Pilot (Phase 1)

---

## Phase thresholds

| Metric | Phase 1 (Pilot) | Phase 2 (Regional) | Phase 3 (National) |
|--------|-----------------|---------------------|---------------------|
| Vendors | <50 | <500 | >500 |
| Resources/day | <10,000 | <100,000 | >1,000,000 |
| Resources total | <1M | <10M | >10M |
| HAPI replicas | 1 | 3 | 5-10+ |
| Orchestrator | docker-compose | docker-compose | Kubernetes |
| PostgreSQL | Single instance | Primary + replica | Patroni HA cluster |
| Estimated trigger | Now | 6-18 months | 18-36 months |

---

## Phase 1 → Phase 2 changes

### 1. Scale HAPI replicas to 3

No configuration changes required — the architecture was designed for this from day one.

```bash
# On the production Ubuntu server
cd /opt/bd-fhir-national
docker compose --env-file .env up -d --scale hapi=3
```

**Verify after scaling:**

```bash
# All 3 replicas healthy
docker compose ps hapi

# nginx is load balancing across all 3
# (check HAPI logs — requests should appear in all replica logs)
docker compose logs --tail=50 hapi

# pgBouncer pool has sufficient capacity:
# 3 replicas × 5 HikariCP connections = 15 connections
# pgBouncer pool_size=20 — 5 connections of headroom remain. Acceptable.
```

**pgBouncer adjustment at 3+ replicas:**
At 5 replicas (5 × 5 = 25 connections), the current pgBouncer pool_size=20 becomes a bottleneck. Update `docker-compose.yml`:

```yaml
# pgbouncer-fhir environment:
PGBOUNCER_DEFAULT_POOL_SIZE: "30"    # was 20
# And increase postgres-fhir max_connections in postgresql.conf:
# max_connections = 40               # was 30
```
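
The replica-to-pool arithmetic above is worth scripting so it is re-run before every scale-up. A minimal sketch with the project defaults as inputs; the function name is ours:

```shell
# pgBouncer headroom check: total HikariCP connections across replicas
# must stay below the pgBouncer pool size (session mode maps them 1:1).
pgbouncer_headroom() {
  local replicas=$1 hikari_pool=$2 pgbouncer_pool=$3
  local needed=$(( replicas * hikari_pool ))
  if [ "$needed" -ge "$pgbouncer_pool" ]; then
    echo "RESIZE: $needed connections needed, pool is $pgbouncer_pool"
    return 1
  fi
  echo "OK: $needed of $pgbouncer_pool connections used ($(( pgbouncer_pool - needed )) headroom)"
}

pgbouncer_headroom 3 5 20   # → OK: 15 of 20 connections used (5 headroom)
```

A `RESIZE` result means `PGBOUNCER_DEFAULT_POOL_SIZE` (and `max_connections` behind it) must grow before adding replicas.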

### 2. Add PostgreSQL streaming replication (read replica)

For read-heavy workloads (FHIR search, bulk export), add a read replica. HAPI supports separate read and write datasource URLs.

```yaml
# Add to docker-compose.yml:
postgres-fhir-replica:
  image: postgres:15-alpine
  environment:
    POSTGRES_DB: fhirdb
    PGUSER: replicator
    POSTGRES_PASSWORD: ${FHIR_REPLICA_PASSWORD}
  volumes:
    - postgres-fhir-replica-data:/var/lib/postgresql/data
    - ./postgres/fhir/replica.conf:/etc/postgresql/postgresql.conf:ro
  command: >
    bash -c "
    until pg_basebackup -h postgres-fhir -U replicator -D /var/lib/postgresql/data -P -Xs -R; do
      sleep 5;
    done && postgres -c config_file=/etc/postgresql/postgresql.conf"
  networks:
    - backend-fhir
```

Add an `HAPI_DATASOURCE_READ_URL` environment variable pointing to the replica, and update `DataSourceConfig.java` to configure a separate read datasource.

### 3. Add Redis for distributed JWKS cache

Currently each HAPI replica maintains an independent in-memory JWKS cache. At 3 replicas, a Keycloak key rotation triggers 3 independent JWKS re-fetches within the same second. This is acceptable. At 10+ replicas, add Redis for a shared JWKS cache to reduce Keycloak load.

```yaml
# Add to docker-compose.yml:
redis:
  image: redis:7-alpine
  networks:
    - frontend
    - backend-fhir
  command: redis-server --appendonly yes --requirepass ${REDIS_PASSWORD}
```

Update `KeycloakJwtInterceptor` to use Spring Cache with a Redis backend for JWKS storage.

---

## Phase 2 → Phase 3 changes

### Move to Kubernetes

At national scale, docker-compose is not the correct orchestrator. Kubernetes provides:

- Horizontal Pod Autoscaler (scale on CPU/RPS automatically)
- Rolling deployments (zero-downtime IG version upgrades)
- Pod Disruption Budgets (maintain minimum replicas during node maintenance)
- Namespace isolation (separate FHIR, audit, monitoring namespaces)

**Kubernetes equivalents:**

| docker-compose service | Kubernetes resource |
|------------------------|---------------------|
| hapi (`--scale N`) | Deployment + HPA |
| postgres-fhir | StatefulSet (or external Patroni) |
| postgres-audit | StatefulSet (or external Patroni) |
| pgbouncer-fhir | Deployment (sidecar or standalone) |
| nginx | Ingress (nginx-ingress-controller) |

### Partition HAPI JPA tables

At 5M+ resources in `HFJ_RESOURCE`, evaluate partitioning (see the V1 migration comments).

Prerequisites before partitioning HAPI JPA tables:

1. HAPI must be stopped during the migration (`ALTER TABLE` is not online in PostgreSQL 15)
2. Foreign key references to `HFJ_RESOURCE` from all SPIDX tables must be updated
3. The partition key must be included in all primary keys
4. Hibernate DDL validation must be disabled during the migration, then re-enabled

This is a planned maintenance-window operation — a minimum 4-hour downtime window for a database with 5M resources. At 10,000 resources/day, you have approximately 18 months from initial deployment to plan this migration.

**Trigger:** Run `EXPLAIN ANALYZE` on a representative FHIR search query. When sequential scans on `HFJ_RESOURCE` appear in the plan despite indexes, partitioning is overdue.

---

## Partition maintenance — monthly cron job

The audit tables are partitioned by month with partitions pre-created through 2027. **A missing partition causes INSERT to fail with a hard error** — no graceful degradation.

### Setup (run once on the audit PostgreSQL host)

```bash
# The maintenance function runs as audit_maintainer_login
# (created by the postgres/audit init script)

# Add to crontab on the Ubuntu host (or in a scheduled container):
crontab -e

# Run on the 20th of each month at 00:00 UTC — creates next month's partition
0 0 20 * * docker exec bd-postgres-audit psql \
    -U audit_maintainer_login \
    -d auditdb \
    -c "SELECT audit.create_next_month_partitions();" \
    >> /var/log/bd-fhir-partition-maintenance.log 2>&1
```

### Verify partition creation

```bash
# After the cron runs, verify the new partition exists
docker exec bd-postgres-audit psql -U postgres -d auditdb -c "
SELECT
  c.relname AS partition_name,
  pg_get_expr(c.relpartbound, c.oid) AS partition_range
FROM pg_class c
JOIN pg_inherits i ON i.inhrelid = c.oid
JOIN pg_class p ON p.oid = i.inhparent
JOIN pg_namespace n ON n.oid = p.relnamespace
WHERE n.nspname = 'audit'
  AND p.relname = 'audit_events'
ORDER BY c.relname DESC
LIMIT 3;
"
# Should show the three most recent monthly partitions
```

### Monitor for missing partitions

Add this check to your monitoring system (Prometheus alerting or cron):

```bash
#!/bin/bash
# check_audit_partitions.sh
# Alert if the next month's partition does not exist by the 25th

NEXT_MONTH=$(date -d "+1 month" +%Y_%m)
PARTITION="audit_events_${NEXT_MONTH}"

RESULT=$(docker exec bd-postgres-audit psql -U postgres -d auditdb -tAc "
  SELECT COUNT(*) FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE n.nspname = 'audit' AND c.relname = '${PARTITION}';")

if [ "$RESULT" -eq "0" ]; then
  echo "ALERT: Missing audit partition for next month: ${PARTITION}"
  # Send to your alerting system (PagerDuty, Slack, email)
  exit 1
fi
echo "OK: Partition ${PARTITION} exists"
exit 0
```

---

## Monitoring — key metrics to track

These metrics indicate when scaling actions are needed.

### PostgreSQL — fhir

```sql
-- Connection utilisation (should be <80% of max_connections)
SELECT count(*) AS active_connections,
       max_conn,
       round(100.0 * count(*) / max_conn, 1) AS utilisation_pct
FROM pg_stat_activity,
     (SELECT setting::int AS max_conn FROM pg_settings WHERE name = 'max_connections') mc
WHERE state = 'active'
GROUP BY max_conn;

-- Table bloat (trigger VACUUM if dead_tuple_ratio > 10%)
SELECT relname, n_live_tup, n_dead_tup,
       round(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct
FROM pg_stat_user_tables
WHERE relname IN ('hfj_resource', 'hfj_spidx_token', 'hfj_res_ver')
ORDER BY dead_pct DESC;

-- Index usage (trigger REINDEX if idx_scan is 0 for a non-new index)
SELECT relname, indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname LIKE 'hfj_%'
ORDER BY idx_scan ASC
LIMIT 10;
```

### PostgreSQL — audit

```sql
-- Partition sizes (plan next archive when any partition exceeds 10GB)
SELECT
  c.relname AS partition,
  pg_size_pretty(pg_relation_size(c.oid)) AS size
FROM pg_class c
JOIN pg_inherits i ON i.inhrelid = c.oid
JOIN pg_class p ON p.oid = i.inhparent
JOIN pg_namespace n ON n.oid = p.relnamespace
WHERE n.nspname = 'audit' AND p.relname = 'audit_events'
ORDER BY c.relname DESC;

-- Rejection rate by vendor (flag vendors with >10% rejection rate)
SELECT
  client_id,
  COUNT(*) AS total_events,
  SUM(CASE WHEN outcome = 'REJECTED' THEN 1 ELSE 0 END) AS rejections,
  ROUND(100.0 * SUM(CASE WHEN outcome = 'REJECTED' THEN 1 ELSE 0 END) / COUNT(*), 1) AS rejection_pct
FROM audit.audit_events
WHERE event_time > NOW() - INTERVAL '7 days'
  AND event_type IN ('OPERATION', 'VALIDATION_FAILURE')
GROUP BY client_id
ORDER BY rejection_pct DESC;
```

### HAPI — Prometheus metrics

Key metrics exposed at `/actuator/prometheus`:

| Metric | Alert threshold |
|--------|-----------------|
| `hikaricp_connections_pending` | >0 for >30s → pool exhaustion |
| `hikaricp_connection_timeout_total` | Any increment → pool exhaustion |
| `http_server_requests_seconds_max` | >30s → OCL timeout or slow validation |
| `jvm_memory_used_bytes / jvm_memory_max_bytes` | >85% → OOM risk; increase container memory |
| `process_uptime_seconds` | Resets → unexpected container restart |
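
The JVM memory threshold in the table can also be evaluated from a saved scrape without a PromQL engine. A minimal sketch, assuming the standard Micrometer `jvm_memory_used_bytes`/`jvm_memory_max_bytes` gauges; the function name is ours:

```shell
# check_heap_utilisation: compute heap used/max from a saved Prometheus scrape.
# Save a scrape first: curl -s http://localhost:8080/actuator/prometheus > /tmp/metrics.txt
check_heap_utilisation() {
  local used max pct
  # Sum all heap pools; the value is the last whitespace-separated field
  used=$(awk '/^jvm_memory_used_bytes\{.*area="heap"/ {s+=$NF} END{print s+0}' "$1")
  max=$(awk '/^jvm_memory_max_bytes\{.*area="heap"/ {s+=$NF} END{print s+0}' "$1")
  pct=$(awk -v u="$used" -v m="$max" 'BEGIN{printf "%.0f", (m > 0) ? 100*u/m : 0}')
  if [ "$pct" -ge 85 ]; then
    echo "ALERT: heap at ${pct}% (threshold 85%)"
    return 1
  fi
  echo "OK: heap at ${pct}%"
}
```

Run it as `check_heap_utilisation /tmp/metrics.txt`; a non-zero exit makes it usable directly from cron or a Prometheus blackbox wrapper.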

---

## IG upgrade procedure

When BD Core IG advances from v0.2.1 to v0.3.0:

```bash
# 1. On the CI machine: place the new package .tgz in src/main/resources/packages/
cp bd.gov.dghs.core-0.3.0.tgz hapi-overlay/src/main/resources/packages/

# 2. Remove the old package (one IG version per image)
rm hapi-overlay/src/main/resources/packages/bd.gov.dghs.core-0.2.1.tgz

# 3. Update application.yaml / docker-compose env vars:
#    HAPI_IG_PACKAGE_CLASSPATH=classpath:packages/bd.gov.dghs.core-0.3.0.tgz
#    HAPI_IG_VERSION=0.3.0

# 4. Build and push the new image
docker build \
  --build-arg IG_PACKAGE=bd.gov.dghs.core-0.3.0.tgz \
  --build-arg BUILD_VERSION=1.1.0 \
  --build-arg GIT_COMMIT=$(git rev-parse --short HEAD) \
  -t your-registry.dghs.gov.bd/bd-fhir-hapi:1.1.0 \
  -f hapi-overlay/Dockerfile .

docker push your-registry.dghs.gov.bd/bd-fhir-hapi:1.1.0

# 5. Update HAPI_IMAGE in .env on the production server
# 6. Rolling redeploy
docker compose --env-file .env pull hapi
docker compose --env-file .env up -d --no-deps hapi

# 7. Verify the new IG version is active
curl -s https://fhir.dghs.gov.bd/fhir/metadata | jq '.software.version'
# Expected: "0.3.0" or the configured HAPI_IG_VERSION value
```
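
Before step 4, it is worth confirming that the new `.tgz` is a well-formed npm-style FHIR package, since a corrupt archive only surfaces at container startup. A minimal sketch; the helper name is ours:

```shell
# check_ig_package: verify the archive contains the npm package manifest
# the IG loader expects, before spending a docker build on it.
check_ig_package() {
  if tar -tzf "$1" 2>/dev/null | grep -q '^package/package.json$'; then
    echo "OK: $1 contains package/package.json"
  else
    echo "FAIL: $1 is not a valid FHIR npm package"
    return 1
  fi
}
```

Run it against the new package path from step 1 before `docker build`.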

**Vendor notification:** IG upgrades that change SHALL constraints require vendor notification at least 30 days in advance. Vendors must test against the staging environment before production deployment.

893
ops/technical-operations-document.md
Normal file
@@ -0,0 +1,893 @@

# BD FHIR National — Technical Operations Document

**System:** National FHIR R4 Repository and Validation Engine
**Published by:** DGHS / MoHFW Bangladesh
**IG:** BD Core FHIR IG v0.2.1
**HAPI FHIR:** 7.2.0
**Stack:** Java 17 · Spring Boot 3.2.5 · PostgreSQL 15 · Docker Compose

---

## Table of Contents

1. [System Purpose and Architecture](#1-system-purpose-and-architecture)
2. [Repository Structure](#2-repository-structure)
3. [How the System Works](#3-how-the-system-works)
4. [Infrastructure Components](#4-infrastructure-components)
5. [Security Model](#5-security-model)
6. [Validation Pipeline](#6-validation-pipeline)
7. [Audit and Forensics](#7-audit-and-forensics)
8. [CI/CD Pipeline](#8-cicd-pipeline)
9. [First Deployment — Step by Step](#9-first-deployment--step-by-step)
10. [Routine Operations](#10-routine-operations)
11. [ICD-11 Version Upgrade](#11-icd-11-version-upgrade)
12. [Scaling](#12-scaling)
13. [Troubleshooting](#13-troubleshooting)
14. [Architecture Decisions You Must Not Reverse](#14-architecture-decisions-you-must-not-reverse)

---

## 1. System Purpose and Architecture

This system is the national FHIR R4 repository for Bangladesh. It serves three purposes simultaneously:

**Repository** — Stores validated FHIR R4 resources submitted by hospitals, clinics, diagnostic labs, and pharmacies (collectively: vendors). No unvalidated resource enters storage.

**Validation engine** — Every incoming resource is validated against BD Core FHIR IG profiles AND against the national ICD-11 terminology authority (OCL) before storage. Invalid resources are rejected with HTTP 422 and a FHIR OperationOutcome describing exactly what failed.

**HIE gateway** — Acts as the national Health Information Exchange boundary. The system enforces that only authenticated, authorised, and clinically valid data enters the national record.

### Traffic flow

```
Vendor system
      │
      │ POST /fhir/Condition
      │ Authorization: Bearer {token}
      ▼
Centralised nginx proxy           ← TLS termination, routing (managed separately)
      │
      ▼
HAPI server :8080
      │
      ├─ KeycloakJwtInterceptor         ← validates JWT, extracts facility identity
      ├─ ClusterExpressionValidator     ← validates ICD-11 cluster expressions
      ├─ RequestValidatingInterceptor   ← validates against BD Core IG profiles
      ├─ BdTerminologyValidationSupport ← validates ICD-11 codes against OCL
      │
      ├─ [ACCEPTED] → HFJ_RESOURCE (postgres-fhir)
      │               AuditEventEmitter → audit.audit_events (postgres-audit)
      │
      └─ [REJECTED] → 422 OperationOutcome to vendor
                      RejectedSubmissionSink → audit.fhir_rejected_submissions (postgres-audit)
                      AuditEventEmitter → audit.audit_events (postgres-audit)
```

### External service dependencies

| Service | URL | Purpose | Failure behaviour |
|---------|-----|---------|-------------------|
| Keycloak | `https://auth.dghs.gov.bd/realms/hris` | JWT validation, JWKS | Fail closed — all requests rejected |
| OCL | `https://tr.ocl.dghs.gov.bd/api/fhir` | ICD-11 terminology validation | Fail open — resource accepted with audit record |
| Cluster validator | `https://icd11.dghs.gov.bd/cluster/validate` | Postcoordinated ICD-11 expressions | Fail open — resource accepted with audit record |

**The fail-open policy for OCL and the cluster validator is deliberate.** Service continuity during external service outages takes precedence over perfect validation coverage. Every fail-open event is recorded in the audit log. OCL or cluster validator outages must be treated as high-priority incidents.

---

## 2. Repository Structure

```
bd-fhir-national/
├── .env.example                 ← copy to .env, fill secrets
├── docker-compose.yml           ← production orchestration
├── pom.xml                      ← parent Maven POM, version pins
├── hapi-overlay/
│   ├── Dockerfile               ← multi-stage build
│   ├── pom.xml                  ← runtime dependencies
│   └── src/main/
│       ├── java/bd/gov/dghs/fhir/
│       │   ├── BdFhirApplication.java              ← Spring Boot entry point
│       │   ├── audit/
│       │   │   ├── AuditEventEmitter.java          ← async INSERT to audit_events
│       │   │   └── RejectedSubmissionSink.java     ← async INSERT to rejected_submissions
│       │   ├── config/
│       │   │   ├── DataSourceConfig.java           ← dual datasource, dual Flyway
│       │   │   ├── FhirServerConfig.java           ← validation chain, IG loading
│       │   │   └── SecurityConfig.java             ← interceptor registration
│       │   ├── init/
│       │   │   └── IgPackageInitializer.java       ← advisory lock IG loader
│       │   ├── interceptor/
│       │   │   ├── AuditEventInterceptor.java      ← audit hook
│       │   │   └── KeycloakJwtInterceptor.java     ← JWT auth
│       │   ├── terminology/
│       │   │   ├── BdTerminologyValidationSupport.java ← OCL integration
│       │   │   └── TerminologyCacheManager.java    ← cache flush endpoint
│       │   └── validator/
│       │       └── ClusterExpressionValidator.java ← cluster expression check
│       └── resources/
│           ├── application.yaml             ← all Spring/HAPI configuration
│           ├── logback-spring.xml           ← structured JSON logging
│           ├── db/migration/
│           │   ├── fhir/V1__hapi_schema.sql   ← HAPI JPA schema (Flyway)
│           │   └── audit/V2__audit_schema.sql ← audit schema (Flyway)
│           └── packages/
│               └── .gitkeep                 ← CI places IG .tgz here
├── ops/
│   ├── deployment-guide.md
│   ├── keycloak-setup.md
│   ├── project-manifest.md
│   ├── scaling-roadmap.md
│   └── version-upgrade-integration.md
└── postgres/
    ├── fhir/
    │   ├── init.sql            ← template only — replace with init.sh before deploy
    │   └── postgresql.conf     ← PostgreSQL tuning for HAPI workload
    └── audit/
        ├── init.sql            ← template only — replace with init.sh before deploy
        └── postgresql.conf     ← PostgreSQL tuning for audit workload
```

---

## 3. How the System Works

### Startup sequence

When a HAPI container starts, the following happens in order. If any step fails, the container exits and Docker restarts it.

1. **Flyway — FHIR schema** runs `V1__hapi_schema.sql` against `postgres-fhir` using the superuser credential. Creates all HAPI JPA tables, sequences, and indexes. Skipped if already applied.
2. **Flyway — Audit schema** runs `V2__audit_schema.sql` against `postgres-audit`. Creates partitioned `audit_events` and `fhir_rejected_submissions` tables with monthly partitions pre-created through 2027. Skipped if already applied.
3. **Hibernate validation** checks that the schema exactly matches HAPI's entity mappings (`ddl-auto: validate`). Fails loudly if tables are missing or wrong.
4. **IgPackageInitializer** acquires a PostgreSQL advisory lock on `postgres-fhir`, loads the BD Core IG package from the classpath into HAPI's `NpmPackageValidationSupport`, writes metadata to the `NPM_PACKAGE` tables, and releases the lock. The advisory lock prevents race conditions when multiple replicas start simultaneously — only one replica writes the metadata row; subsequent replicas find it already present and skip.
5. **KeycloakJwtInterceptor** fetches the Keycloak JWKS endpoint and caches the signing keys. If Keycloak is unreachable at startup, the interceptor fails to initialise and the container exits.
6. Server begins accepting traffic.
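
The advisory lock key in step 4 is derived deterministically from the package name so that every replica computes the same `bigint`. A sketch of the djb2 algorithm the lock-key name refers to, in shell for illustration (the production code is Java, and its exact masking may differ):

```shell
# djb2: classic string hash (h = h*33 + c), masked to a positive 63-bit
# value so it could serve as a pg_advisory_lock(bigint) key.
djb2() {
  local s=$1 h=5381 i c
  for (( i = 0; i < ${#s}; i++ )); do
    printf -v c '%d' "'${s:i:1}"     # character → ASCII code
    h=$(( (h * 33 + c) & 0x7FFFFFFFFFFFFFFF ))
  done
  echo "$h"
}

djb2 "bd.gov.dghs.core-0.2.1"   # same key on every replica
```

Because the hash is a pure function of the string, there is no coordination needed to agree on the lock key before any replica has touched the database.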

### Request lifecycle — accepted resource

```
1. KeycloakJwtInterceptor
   └─ extracts Bearer token from Authorization header
      └─ verifies signature against cached Keycloak JWKS
      └─ verifies exp, iss = https://auth.dghs.gov.bd/realms/hris
      └─ verifies mci-api role present in realm_access or resource_access
      └─ extracts client_id, sub, sending_facility
      └─ sets request attributes, populates MDC for log correlation

2. AuditEventInterceptor (pre-validation hook)
   └─ invokes ClusterExpressionValidator
      └─ scans Coding elements with system = http://id.who.int/icd/release/11/mms
      └─ if icd11-cluster-expression extension present → calls cluster validator middleware
      └─ if raw postcoordination chars (&, /, %) in code without extension → rejects immediately

3. RequestValidatingInterceptor
   └─ runs FhirInstanceValidator against ValidationSupportChain:
      1. DefaultProfileValidationSupport (base FHIR R4 profiles)
      2. CommonCodeSystemsTerminologyService (UCUM, MimeType, etc.)
      3. SnapshotGeneratingValidationSupport (differential → snapshot)
      4. InMemoryTerminologyServerValidationSupport (cache layer)
      5. NpmPackageValidationSupport (BD Core IG profiles)
      6. BdTerminologyValidationSupport (OCL $validate-code for ICD-11)
   └─ any ERROR severity issue → throws UnprocessableEntityException → 422

4. HAPI JPA persistence
   └─ resource written to HFJ_RESOURCE, HFJ_RES_VER, SPIDX tables

5. AuditEventInterceptor (post-storage hook)
   └─ async: INSERT into audit.audit_events (outcome = ACCEPTED)

6. HTTP 201 Created → vendor
```

### Request lifecycle — rejected resource

```
1-3. Same as above up to validation failure

4. UnprocessableEntityException thrown with FHIR OperationOutcome

5. AuditEventInterceptor (exception hook)
   └─ async: INSERT full payload into audit.fhir_rejected_submissions
   └─ async: INSERT into audit.audit_events (outcome = REJECTED)

6. HTTP 422 Unprocessable Entity → vendor
   Body: OperationOutcome with issue[].diagnostics and issue[].expression
```

### ICD-11 terminology validation detail

`BdTerminologyValidationSupport` intercepts every call to validate an ICD-11 coded element:

1. **Cache check** — if the code was validated in the last 24 hours, serve the result from a `ConcurrentHashMap`. No OCL call.
2. **Cache miss** — call OCL `$validate-code` with `system=http://id.who.int/icd/release/11/mms`. For `Condition.code`, include `url=https://fhir.dghs.gov.bd/core/ValueSet/bd-condition-icd11-diagnosis-valueset` to enforce the Diagnosis + Finding class restriction.
3. **OCL returns result=true** — cache as valid, return valid to the chain.
4. **OCL returns result=false** — cache as invalid, return an error to the chain → 422.
5. **OCL timeout or 5xx** — log WARN, return null (not supported) — fail open.
6. **`$expand` attempts** — `isValueSetSupported()` returns false for ICD-11 ValueSets, so `$expand` is never attempted. This is intentional: OCL does not support `$expand`.
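The 24-hour TTL decision in step 1 reduces to a timestamp comparison. A minimal sketch of that check (the 24-hour window comes from the text above; the variable names are illustrative, not fields of `BdTerminologyValidationSupport`):

```shell
# Entries older than 24 hours force a fresh $validate-code call to OCL.
now=$(date +%s)
cached_at=$((now - 3600))   # pretend this code was validated an hour ago
ttl=$((24 * 3600))
if [ $((now - cached_at)) -lt "$ttl" ]; then
  result="cache hit: no OCL call"
else
  result="cache miss: call \$validate-code"
fi
echo "$result"
```

The cache is per-JVM, which is why the cache-flush endpoint in Section 11 must be called on every replica after an OCL import.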

---

## 4. Infrastructure Components

### Docker services

| Service | Image | Purpose | Networks |
|---------|-------|---------|----------|
| `hapi` | Private registry | HAPI FHIR application | frontend, backend-fhir, backend-audit |
| `postgres-fhir` | postgres:15-alpine | FHIR resource store | backend-fhir |
| `postgres-audit` | postgres:15-alpine | Immutable audit store | backend-audit |
| `pgbouncer-fhir` | bitnami/pgbouncer:1.22.1 | Connection pool → postgres-fhir | backend-fhir |
| `pgbouncer-audit` | bitnami/pgbouncer:1.22.1 | Connection pool → postgres-audit | backend-audit |

### Network isolation

`backend-fhir` and `backend-audit` are marked `internal: true` — no external internet access from these networks. The database containers cannot reach external services, and external services cannot reach the databases directly.

### pgBouncer configuration

Both pgBouncer instances run in **session mode**. This is mandatory: HAPI uses Hibernate, which relies on prepared statements, and transaction-mode pgBouncer breaks these. Do not change the pool mode.

Pool sizing at pilot phase (1 HAPI replica):

| Pool | HikariCP max per replica | pgBouncer pool_size | PostgreSQL max_connections |
|------|--------------------------|---------------------|----------------------------|
| FHIR | 5 | 20 | 30 |
| Audit | 2 | 10 | 20 |

At 3 replicas: 15 FHIR connections, 6 audit connections — both within pool limits.
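The arithmetic behind that claim is worth keeping at hand before any scaling change. A small sketch using the numbers from the table above:

```shell
# Connection budget: replicas * HikariCP max must fit in the pgBouncer
# pool, which in turn must stay under PostgreSQL max_connections
# (headroom is left for superuser/maintenance sessions).
replicas=3
hikari_max=5; pool_size=20; max_connections=30   # FHIR pool, from the table
needed=$((replicas * hikari_max))
if [ "$needed" -le "$pool_size" ] && [ "$pool_size" -lt "$max_connections" ]; then
  result="OK: $needed of $pool_size pooled connections used"
else
  result="RESIZE POOLS before scaling to $replicas replicas"
fi
echo "$result"
```

Run the same arithmetic for the audit pool; at 3 replicas it needs 6 of 10 pooled connections.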

### Databases

**postgres-fhir** contains all HAPI JPA tables. Schema managed by Flyway `V1__hapi_schema.sql`. `ddl-auto: validate` means Hibernate never modifies the schema — Flyway owns all DDL. If a HAPI upgrade requires schema changes, write a new Flyway migration.

**postgres-audit** contains the audit schema only: two tables, both partitioned by month. Schema managed by Flyway `V2__audit_schema.sql` against postgres-audit (separate Flyway instance, separate history table `flyway_audit_schema_history`).

### Volumes

| Volume | Contents | Backup priority |
|--------|----------|-----------------|
| `postgres-fhir-data` | All FHIR resources | Critical — primary data |
| `postgres-audit-data` | All audit records, rejected payloads | Critical — forensic/legal |
| `hapi-logs` | Structured JSON application logs | Medium — operational |

---

## 5. Security Model

### Authentication

Every request to FHIR endpoints (except `GET /fhir/metadata` and `/actuator/health/**`) requires a valid Bearer token issued by Keycloak realm `hris`.

`KeycloakJwtInterceptor` performs these checks in order, rejecting with HTTP 401 on any failure:

1. `Authorization: Bearer` header present and non-empty
2. JWT signature valid against Keycloak JWKS (`RS256` only — symmetric algorithms rejected)
3. `exp` claim in the future (not expired)
4. `iss` claim exactly equals `https://auth.dghs.gov.bd/realms/hris`
5. `mci-api` role present in `realm_access.roles` OR in `resource_access.{client-id}.roles`

The JWKS is cached locally with a 1-hour TTL. On receiving a JWT with an unknown `kid`, the JWKS is re-fetched immediately, regardless of TTL — this handles Keycloak key rotation without delay.
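The kid-driven refresh is a simple membership test against the cached key set. A sketch of the decision (the key IDs are made up for illustration):

```shell
# An unknown kid forces an immediate JWKS re-fetch; a known kid
# verifies against the cached key without a network call.
cached_kids="rsa-2024-a rsa-2024-b"   # hypothetical kids in the cached JWKS
incoming_kid="rsa-2025-a"             # kid from the submitted JWT's header
case " $cached_kids " in
  *" $incoming_kid "*) result="kid known: verify with cached key" ;;
  *)                   result="kid unknown: re-fetch JWKS now, ignore TTL" ;;
esac
echo "$result"
```

If the re-fetched JWKS still lacks the kid, the token is rejected with 401 like any other signature failure.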

### Authorisation

**Vendors** — must have the `mci-api` role. Client naming convention: `fhir-vendor-{organisation-id}`.

**Admin operations** (cache flush endpoint) — must have the `fhir-admin` role. Only the `fhir-admin-pipeline` service account and DGHS system administrators hold this role.

### Keycloak client setup for new vendors

See `ops/keycloak-setup.md` for the full procedure. Summary:

1. Create client `fhir-vendor-{org-id}` in the `hris` realm — confidential, service accounts enabled, standard flow off.
2. Assign the `mci-api` role to the service account.
3. Add a `sending_facility` user attribute with the DGHS facility code.
4. Add a User Attribute token mapper for `sending_facility` → token claim `sending_facility`.
5. Deliver `client_id` and `client_secret` to the vendor.

If a vendor token is missing the `sending_facility` claim, HAPI logs WARN on every submission and uses `client_id` as the facility identifier in audit records. This is a data quality issue — configure the mapper.

### Vendor token flow

```bash
# Vendor obtains token
POST https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token
  grant_type=client_credentials
  client_id=fhir-vendor-{org-id}
  client_secret={secret}
→ { "access_token": "eyJ...", "expires_in": 300 }

# Vendor submits resource
POST https://fhir.dghs.gov.bd/fhir/Condition
  Authorization: Bearer eyJ...
  Content-Type: application/fhir+json
  { ... }
```

Tokens expire after 5 minutes (the Keycloak default). Vendor systems must refresh before expiry.
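A common client-side rule is to request a new token once a fixed fraction of the lifetime has elapsed rather than racing the deadline. A sketch (the 80% margin is an assumption, not a DGHS requirement):

```shell
# Refresh well before the 300-second expiry reported in expires_in.
expires_in=300        # from the token response above
refresh_margin_pct=80 # assumed safety margin
refresh_after=$((expires_in * refresh_margin_pct / 100))
echo "request a new token after ${refresh_after}s"
```

This keeps a submission that starts just before the refresh point from carrying an expired token by the time it reaches HAPI.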

---

## 6. Validation Pipeline

### BD Core IG profiles

The following resource types are validated against BD Core IG profiles:

| Resource type | Profile URL |
|---------------|-------------|
| Patient | `https://fhir.dghs.gov.bd/core/StructureDefinition/bd-patient` |
| Condition | `https://fhir.dghs.gov.bd/core/StructureDefinition/bd-condition` |
| Encounter | `https://fhir.dghs.gov.bd/core/StructureDefinition/bd-encounter` |
| Observation | `https://fhir.dghs.gov.bd/core/StructureDefinition/bd-observation` |
| Practitioner | `https://fhir.dghs.gov.bd/core/StructureDefinition/bd-practitioner` |
| Organization | `https://fhir.dghs.gov.bd/core/StructureDefinition/bd-organization` |
| Location | `https://fhir.dghs.gov.bd/core/StructureDefinition/bd-location` |
| Medication | `https://fhir.dghs.gov.bd/core/StructureDefinition/bd-medication` |
| MedicationRequest | `https://fhir.dghs.gov.bd/core/StructureDefinition/bd-medicationrequest` |
| Immunization | `https://fhir.dghs.gov.bd/core/StructureDefinition/bd-immunization` |

Resources of any other type are stored with `meta.tag = https://fhir.dghs.gov.bd/tags|unvalidated-profile`. They are not rejected. They can be queried with `_tag=https://fhir.dghs.gov.bd/tags|unvalidated-profile`.

### ICD-11 cluster expression format

BD Core IG defines a specific pattern for postcoordinated ICD-11 expressions. **Raw postcoordinated strings in `Coding.code` are prohibited.**

**Correct format:**

```json
"code": {
  "coding": [{
    "system": "http://id.who.int/icd/release/11/mms",
    "code": "1C62.0",
    "extension": [{
      "url": "icd11-cluster-expression",
      "valueString": "1C62.0/http%3A%2F%2Fid.who.int%2F..."
    }]
  }]
}
```

**Prohibited format (rejected with 422):**

```json
"code": {
  "coding": [{
    "system": "http://id.who.int/icd/release/11/mms",
    "code": "1C62.0&has_severity=mild"
  }]
}
```
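A vendor-side pre-flight check for the prohibited pattern can be as simple as scanning the code for the postcoordination characters the server rejects (`&`, `/`, `%`):

```shell
# Mirror of the server-side check: raw postcoordination characters in
# Coding.code without the extension are rejected with 422.
code='1C62.0&has_severity=mild'
if printf '%s' "$code" | grep -q '[&/%]'; then
  result="REJECT: move the expression into icd11-cluster-expression"
else
  result="OK: plain stem code"
fi
echo "$result"
```

Catching this before submission avoids a `CLUSTER_STEM_MISSING_EXTENSION` rejection and a round trip through the audit store.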

### Rejection codes

The `rejection_code` column in `audit.fhir_rejected_submissions` contains one of:

| Code | Meaning |
|------|---------|
| `PROFILE_VIOLATION` | Resource violates a BD Core IG SHALL constraint |
| `TERMINOLOGY_INVALID_CODE` | ICD-11 code not found in OCL |
| `TERMINOLOGY_INVALID_CLASS` | ICD-11 code exists but is not Diagnosis/Finding class |
| `CLUSTER_EXPRESSION_INVALID` | Cluster expression failed the cluster validator |
| `CLUSTER_STEM_MISSING_EXTENSION` | Raw postcoordinated string without extension |
| `AUTH_TOKEN_MISSING` | No Bearer token |
| `AUTH_TOKEN_EXPIRED` | Token `exp` in the past |
| `AUTH_TOKEN_INVALID_SIGNATURE` | Signature verification failed |
| `AUTH_TOKEN_MISSING_ROLE` | `mci-api` role absent |
| `AUTH_TOKEN_INVALID_ISSUER` | `iss` does not match the Keycloak realm |

---

## 7. Audit and Forensics

### Two audit stores

**`audit.audit_events`** — one row per request outcome, written for accepted and rejected requests alike. Contains: `event_type`, `operation`, `resource_type`, `resource_id`, `outcome`, `outcome_detail`, `sending_facility`, `client_id`, `subject`, `request_ip`, `request_id`, `validation_messages` (JSONB).

**`audit.fhir_rejected_submissions`** — one row per rejected write. Contains: the full resource payload as submitted (TEXT, not JSONB), `rejection_code`, `rejection_reason`, `element_path`, `violated_profile`, `invalid_code`, `invalid_system`.

### Immutability

The `audit_writer_login` PostgreSQL user has INSERT only on the audit schema. The HAPI JVM connects to postgres-audit as this user, so no UPDATE or DELETE is possible from the application layer regardless of what the application code attempts. Only a PostgreSQL superuser can modify audit records.

### Partitioning

Both audit tables are partitioned by month (`PARTITION BY RANGE (event_time)`). Monthly partitions are pre-created through December 2027. A cron job must create next-month partitions on the 20th of each month; if this lapses, INSERTs fail with a hard error once the pre-created partitions run out.

**Set up the cron job immediately after first deployment:**

```bash
# On the host running postgres-audit
crontab -e
# Add:
0 0 20 * * docker exec bd-postgres-audit psql -U audit_maintainer_login -d auditdb \
  -c "SELECT audit.create_next_month_partitions();" \
  >> /var/log/bd-fhir-partition-maintenance.log 2>&1
```
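To see which partition the next run must create, compute next month's suffix. This is a sketch: the `audit_events_YYYY_MM` naming pattern is an assumption (check the Flyway migration for the real convention), and the date arithmetic uses GNU `date`:

```shell
# Name of the audit_events partition needed for next month
# (assumed naming pattern; GNU date required for -d).
next=$(date -d "$(date +%Y-%m-01) +1 month" +%Y_%m)
echo "audit_events_${next}"
```

Compare the printed name against the partition listing query in Section 13 to confirm the cron job is keeping up.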

### Useful audit queries

```sql
-- Rejection rate by vendor, last 7 days
SELECT client_id,
       COUNT(*) AS total,
       SUM(CASE WHEN outcome='REJECTED' THEN 1 ELSE 0 END) AS rejected,
       ROUND(100.0 * SUM(CASE WHEN outcome='REJECTED' THEN 1 ELSE 0 END) / COUNT(*), 1) AS pct
FROM audit.audit_events
WHERE event_time > NOW() - INTERVAL '7 days'
  AND event_type IN ('OPERATION','VALIDATION_FAILURE')
GROUP BY client_id ORDER BY pct DESC;

-- Retrieve rejected payloads for a vendor
SELECT submission_time, resource_type, rejection_code, rejection_reason, element_path
FROM audit.fhir_rejected_submissions
WHERE client_id = 'fhir-vendor-{org-id}'
ORDER BY submission_time DESC LIMIT 20;

-- Auth failures
SELECT event_time, client_id, outcome_detail, request_ip
FROM audit.audit_events
WHERE event_type = 'AUTH_FAILURE'
ORDER BY event_time DESC LIMIT 20;
```

---

## 8. CI/CD Pipeline

The production server **never builds**. It only pulls pre-built images from the private registry.

### CI pipeline steps (on CI machine)

```bash
# 1. Obtain BD Core IG package and place it
cp /path/to/bd.gov.dghs.core-0.2.1.tgz \
   hapi-overlay/src/main/resources/packages/

# 2. Run tests (TestContainers spins up real PostgreSQL — no H2)
mvn test -pl hapi-overlay -am

# 3. Build Docker image (multi-stage: Maven builder + JRE runtime)
docker build \
  --build-arg IG_PACKAGE=bd.gov.dghs.core-0.2.1.tgz \
  --build-arg BUILD_VERSION=1.0.0 \
  --build-arg GIT_COMMIT=$(git rev-parse --short HEAD) \
  -t your-registry.dghs.gov.bd/bd-fhir-hapi:1.0.0 \
  -f hapi-overlay/Dockerfile \
  .

# 4. Push to private registry
docker push your-registry.dghs.gov.bd/bd-fhir-hapi:1.0.0
```

The `packages/` directory must contain exactly one `.tgz` file matching `HAPI_IG_PACKAGE_CLASSPATH` in `.env`. If the directory is empty or the filename does not match, the container fails startup immediately with a clear error message.
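That invariant is cheap to guard in CI before the image build. A self-contained sketch (a temporary directory stands in for `packages/` so it runs anywhere):

```shell
# CI guard: exactly one IG package present before docker build.
pkg_dir=$(mktemp -d)                           # stand-in for packages/
touch "$pkg_dir/bd.gov.dghs.core-0.2.1.tgz"    # simulate the copied package
count=$(ls "$pkg_dir"/*.tgz 2>/dev/null | wc -l)
if [ "$count" -eq 1 ]; then
  result="OK: $(basename "$pkg_dir"/*.tgz)"
else
  result="ERROR: expected exactly 1 .tgz, found $count"
fi
echo "$result"
rm -rf "$pkg_dir"
```

In the real pipeline, compare the found filename against the `HAPI_IG_PACKAGE_CLASSPATH` value as well, so a stale package fails the build rather than the container.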

---

## 9. First Deployment — Step by Step

### Prerequisites

- Ubuntu 22.04 LTS, minimum 8GB RAM, 4 vCPU, 100GB disk
- Outbound HTTPS to Keycloak, OCL, cluster validator, private registry
- Docker image already built and pushed (see Section 8)
- Keycloak configured (see `ops/keycloak-setup.md`)

### Step 1 — Install Docker

```bash
sudo apt-get update
sudo apt-get install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
  -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] \
  https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io \
  docker-buildx-plugin docker-compose-plugin
sudo usermod -aG docker $USER
# log out and back in
```

### Step 2 — Prepare application directory

```bash
sudo mkdir -p /opt/bd-fhir-national
sudo chown $USER:$USER /opt/bd-fhir-national
# rsync project files from CI/deployment machine (excluding source tree)
rsync -avz --exclude='.git' --exclude='hapi-overlay/target' \
  --exclude='hapi-overlay/src' \
  ./bd-fhir-national/ deploy@server:/opt/bd-fhir-national/
```

### Step 3 — Create .env

```bash
cd /opt/bd-fhir-national
cp .env.example .env
chmod 600 .env
nano .env   # fill all <CHANGE_ME> values
# verify: grep CHANGE_ME .env should return nothing
```

### Step 4 — Fix init scripts (CRITICAL — do not skip)

The `postgres/fhir/init.sql` and `postgres/audit/init.sql` files are templates with placeholder passwords. The PostgreSQL Docker image does not perform variable substitution in `.sql` init files. Replace them with `.sh` scripts that read from environment variables.

```bash
# FHIR database init script
cat > /opt/bd-fhir-national/postgres/fhir/init.sh <<'EOF'
#!/bin/bash
set -e
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
DO \$\$ BEGIN
  IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '${FHIR_DB_APP_USER}') THEN
    CREATE USER ${FHIR_DB_APP_USER} WITH NOSUPERUSER NOCREATEDB NOCREATEROLE
      NOINHERIT LOGIN CONNECTION LIMIT 30 PASSWORD '${FHIR_DB_APP_PASSWORD}';
  END IF;
END \$\$;
GRANT CONNECT ON DATABASE ${POSTGRES_DB} TO ${FHIR_DB_APP_USER};
GRANT USAGE ON SCHEMA public TO ${FHIR_DB_APP_USER};
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO ${FHIR_DB_APP_USER};
ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT USAGE, SELECT ON SEQUENCES TO ${FHIR_DB_APP_USER};
EOSQL
EOF
chmod +x /opt/bd-fhir-national/postgres/fhir/init.sh

# Audit database init script
cat > /opt/bd-fhir-national/postgres/audit/init.sh <<'EOF'
#!/bin/bash
set -e
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
DO \$\$ BEGIN
  IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '${AUDIT_DB_WRITER_USER}') THEN
    CREATE USER ${AUDIT_DB_WRITER_USER} WITH NOSUPERUSER NOCREATEDB NOCREATEROLE
      NOINHERIT LOGIN CONNECTION LIMIT 20 PASSWORD '${AUDIT_DB_WRITER_PASSWORD}';
  END IF;
  IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '${AUDIT_DB_MAINTAINER_USER}') THEN
    CREATE USER ${AUDIT_DB_MAINTAINER_USER} WITH NOSUPERUSER NOCREATEDB NOCREATEROLE
      NOINHERIT LOGIN CONNECTION LIMIT 5 PASSWORD '${AUDIT_DB_MAINTAINER_PASSWORD}';
  END IF;
END \$\$;
GRANT CONNECT ON DATABASE ${POSTGRES_DB} TO ${AUDIT_DB_WRITER_USER};
GRANT CONNECT ON DATABASE ${POSTGRES_DB} TO ${AUDIT_DB_MAINTAINER_USER};
EOSQL
EOF
chmod +x /opt/bd-fhir-national/postgres/audit/init.sh
```

Update `docker-compose.yml` — in both postgres services, change the init volume mount from `.sql` to `.sh`, and pass the necessary env vars to both services:

```yaml
# postgres-fhir volumes: change
- ./postgres/fhir/init.sh:/docker-entrypoint-initdb.d/init.sh:ro
# add to postgres-fhir environment:
FHIR_DB_APP_USER: ${FHIR_DB_APP_USER}
FHIR_DB_APP_PASSWORD: ${FHIR_DB_APP_PASSWORD}

# postgres-audit volumes: change
- ./postgres/audit/init.sh:/docker-entrypoint-initdb.d/init.sh:ro
# add to postgres-audit environment:
AUDIT_DB_WRITER_USER: ${AUDIT_DB_WRITER_USER}
AUDIT_DB_WRITER_PASSWORD: ${AUDIT_DB_WRITER_PASSWORD}
AUDIT_DB_MAINTAINER_USER: ${AUDIT_DB_MAINTAINER_USER}
AUDIT_DB_MAINTAINER_PASSWORD: ${AUDIT_DB_MAINTAINER_PASSWORD}
```

### Step 5 — Registry login

```bash
docker login your-registry.dghs.gov.bd
docker compose --env-file .env pull
```

### Step 6 — Start databases

```bash
docker compose --env-file .env up -d postgres-fhir postgres-audit
# wait for healthy
until docker compose --env-file .env ps postgres-fhir | grep -q "healthy"; do sleep 3; done
until docker compose --env-file .env ps postgres-audit | grep -q "healthy"; do sleep 3; done
```

### Step 7 — Verify database users

```bash
docker exec bd-postgres-fhir psql -U postgres -d fhirdb \
  -c "SELECT rolname FROM pg_roles WHERE rolname='hapi_app';"
# Expected: hapi_app

docker exec bd-postgres-audit psql -U postgres -d auditdb \
  -c "SELECT rolname FROM pg_roles WHERE rolname IN ('audit_writer_login','audit_maintainer_login');"
# Expected: two rows
```

### Step 8 — Start pgBouncer and HAPI

```bash
docker compose --env-file .env up -d pgbouncer-fhir pgbouncer-audit
docker compose --env-file .env up -d hapi

# Follow startup — takes 60-120 seconds
docker compose --env-file .env logs -f hapi
```

Expected log sequence:

```
Running FHIR Flyway migrations...    → V1 applied
Running Audit Flyway migrations...   → V2 applied
Advisory lock acquired...            → IG loading begins
BD Core IG package loaded...         → IG ready
BdTerminologyValidationSupport initialised...
KeycloakJwtInterceptor initialised...
HAPI RestfulServer interceptors registered...
Tomcat started on port(s): 8080
Started BdFhirApplication in XX seconds
```

### Step 9 — Verify health

```bash
# Internal (direct to HAPI)
docker exec $(docker compose --env-file .env ps -q hapi | head -1) \
  curl -s http://localhost:8080/actuator/health | jq .
# All components must show status: UP

# FHIR metadata
docker exec $(docker compose --env-file .env ps -q hapi | head -1) \
  curl -s http://localhost:8080/fhir/metadata | jq '.software'
# Expected: { "name": "BD FHIR National Repository", "version": "0.2.1" }
```

### Step 10 — Set up partition maintenance cron

```bash
crontab -e
# Add:
0 0 20 * * docker exec bd-postgres-audit psql -U audit_maintainer_login -d auditdb \
  -c "SELECT audit.create_next_month_partitions();" \
  >> /var/log/bd-fhir-partition-maintenance.log 2>&1
```

### Step 11 — Run acceptance tests

Run all tests from Section 9.3 of `ops/deployment-guide.md`. All nine must pass before the system is declared production-ready.

---

## 10. Routine Operations

### View logs

```bash
# All services
docker compose --env-file .env logs -f

# HAPI logs as structured JSON
docker compose --env-file .env logs -f hapi | jq -R 'try fromjson'

# Filter for rejections
docker compose --env-file .env logs hapi | \
  jq -R 'try fromjson | select(.message | test("rejected|REJECTED"))'
```

### Deploy a new image version

```bash
# Update image tag in .env
nano /opt/bd-fhir-national/.env
# Change HAPI_IMAGE to new tag

# Pull and redeploy
docker compose --env-file .env pull hapi
docker compose --env-file .env up -d --no-deps hapi

# Verify startup
docker compose --env-file .env logs -f hapi
```

### Scale HAPI replicas

```bash
docker compose --env-file .env up -d --scale hapi=3
# No other configuration changes needed at 3 replicas.
# pgBouncer pool_size=20 supports up to 4 replicas at HikariCP max=5.
# At 5+ replicas: increase PGBOUNCER_DEFAULT_POOL_SIZE and postgres max_connections first.
```

### Restart a service

```bash
docker compose --env-file .env restart hapi
docker compose --env-file .env restart postgres-fhir   # causes brief HAPI downtime
```

### Full stack restart

```bash
docker compose --env-file .env down
docker compose --env-file .env up -d
```

### Check pgBouncer pool status

```bash
docker exec bd-pgbouncer-fhir psql -h localhost -p 5432 -U pgbouncer pgbouncer \
  -c "SHOW POOLS;"
```

---

## 11. ICD-11 Version Upgrade

When a new ICD-11 MMS release is imported into OCL, the HAPI terminology cache becomes stale. The upgrade pipeline must flush the cache after the OCL import. Full procedure in `ops/version-upgrade-integration.md`. Summary:

**Order is mandatory:**

1. OCL: import new ICD-11 concepts
2. OCL: patch `concept_class` for Diagnosis + Finding
3. OCL: repopulate `bd-condition-icd11-diagnosis-valueset`
4. OCL: verify `$validate-code` returns correct results for new codes
5. HAPI: flush terminology cache
6. HAPI: verify new codes validate correctly

**Step 5 — cache flush:**

```bash
# Get fhir-admin token
ADMIN_TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-admin-pipeline" \
  -d "client_secret=${FHIR_ADMIN_CLIENT_SECRET}" \
  | jq -r '.access_token')

# Flush — run from inside Docker network (admin endpoint is network-restricted)
docker exec $(docker compose --env-file .env ps -q hapi | head -1) \
  curl -s -X DELETE \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  http://localhost:8080/admin/terminology/cache | jq .
# Expected: { "status": "flushed", "entriesEvicted": N }
```

**IG version upgrade** (when BD Core IG advances to a new version):

1. Place the new `.tgz` in `src/main/resources/packages/`, remove the old one.
2. Update `HAPI_IG_PACKAGE_CLASSPATH` and `HAPI_IG_VERSION` in `.env`.
3. Build and push a new Docker image on the CI machine.
4. Deploy the new image on the production server.

---

## 12. Scaling

### Current capacity (Phase 1 — Pilot)

| Metric | Capacity |
|--------|----------|
| HAPI replicas | 1 |
| Vendors | <50 |
| Resources/day | <10,000 |
| PostgreSQL connections (FHIR) | 5 |
| PostgreSQL connections (Audit) | 2 |

### Scaling to Phase 2 (Regional — up to 500 vendors, 100,000 resources/day)

```bash
# Scale HAPI to 3 replicas — no other changes required
docker compose --env-file .env up -d --scale hapi=3
```

Beyond 3 replicas, update pgBouncer pool sizes and PostgreSQL `max_connections` before scaling. See `ops/scaling-roadmap.md` for the full capacity matrix and Phase 3 (national scale → Kubernetes) guidance.

---

## 13. Troubleshooting

### Container not starting

```bash
docker compose --env-file .env logs hapi | tail -50
```

| Log message | Cause | Fix |
|-------------|-------|-----|
| `STARTUP FAILURE: BD Core IG package not found` | `.tgz` missing from image | Rebuild image with package in `packages/` |
| `FHIR Flyway configuration missing` | `SPRING_FLYWAY_*` env vars not set | Check `.env` |
| `password authentication failed for user "hapi_app"` | `init.sh` not run or wrong password | Verify Step 4 of deployment, check `.env` passwords |
| `Advisory lock acquisition timed out` | Another replica crashed mid-init while holding the lock | Check `pg_locks` on postgres-fhir, kill the stale lock |
| `Connection refused` to Keycloak JWKS | Keycloak unreachable at startup | Check network connectivity and Keycloak health |
| `Schema-validation: missing table` | Flyway did not run | Check `SPRING_FLYWAY_*` env vars and the `flyway_schema_history` table |

### 401 on all authenticated requests

```bash
# Check the JWKS endpoint is reachable from inside the container
docker exec $(docker compose --env-file .env ps -q hapi | head -1) \
  curl -s https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/certs | jq '.keys | length'
# Expected: 1 or more keys
```

If JWKS is unreachable, all requests are rejected with 401 (fail closed). Check firewall rules — the HAPI container must have outbound HTTPS to Keycloak.

### 422 on all ICD-11 coded submissions

```bash
# Check OCL is reachable
docker exec $(docker compose --env-file .env ps -q hapi | head -1) \
  curl -s -o /dev/null -w "%{http_code}" \
  "https://tr.ocl.dghs.gov.bd/api/fhir/metadata"
# Expected: 200

# Check a specific code manually
docker exec $(docker compose --env-file .env ps -q hapi | head -1) \
  curl -s "https://tr.ocl.dghs.gov.bd/api/fhir/ValueSet/\$validate-code?\
url=https://fhir.dghs.gov.bd/core/ValueSet/bd-condition-icd11-diagnosis-valueset\
&system=http://id.who.int/icd/release/11/mms&code=1C62.0" | jq .
```

If OCL is unreachable, the system should fail open (codes accepted). If codes are being rejected despite OCL being reachable, check OCL's `$validate-code` response directly.

### Audit writes failing

```bash
# Check HAPI logs for "AUDIT WRITE FAILED"
docker compose --env-file .env logs hapi | grep "AUDIT WRITE FAILED"

# Check audit datasource health
docker exec $(docker compose --env-file .env ps -q hapi | head -1) \
  curl -s http://localhost:8080/actuator/health | jq '.components.auditDb'
```

### Partition missing (INSERTs to audit failing)

```bash
# Check which partitions exist
docker exec bd-postgres-audit psql -U postgres -d auditdb -c "
SELECT c.relname FROM pg_class c
JOIN pg_inherits i ON i.inhrelid = c.oid
JOIN pg_class p ON p.oid = i.inhparent
JOIN pg_namespace n ON n.oid = p.relnamespace
WHERE n.nspname = 'audit' AND p.relname = 'audit_events'
ORDER BY c.relname DESC LIMIT 3;"

# Create missing partition manually
docker exec bd-postgres-audit psql -U postgres -d auditdb \
  -c "SELECT audit.create_next_month_partitions();"
```

### Check disk usage

```bash
docker system df -v
df -h /var/lib/docker
```
|
||||
|
||||
## 14. Architecture Decisions You Must Not Reverse
|
||||
|
||||
These decisions are load-bearing. Reversing any of them without fully understanding the consequences will break the system.
|
||||
|
||||
**PostgreSQL only — no H2, not even for tests.**
|
||||
The test suite uses TestContainers to spin up real PostgreSQL 15. H2 is not on the classpath. Using H2 masks database-specific behaviour (advisory locks, partitioning, JSONB) and produces false-green test results.
|
||||
|
||||
**Validation on ALL requests — no vendor exemptions.**
The `RequestValidatingInterceptor` runs on every write. There is no per-vendor or per-resource-type bypass. This is the HIE boundary enforcement. A bypass for one vendor breaks the national data quality guarantee for everyone downstream.

**OCL is the single terminology authority.**
There is no local ICD-11 concept store. All ICD-11 validation goes to OCL. This means OCL availability affects HAPI validation quality. Keep OCL healthy. Do not add a local fallback without understanding the implications for version consistency.

**`$expand` is never attempted for ICD-11 ValueSets.**
OCL does not support `$expand`. The `isValueSetSupported()` override returns `false` for all ICD-11 ValueSets. Do not remove this — removing it causes HAPI to attempt `$expand`, receive an empty response, and reject every ICD-11 coded resource regardless of whether the code is valid.

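The guard itself lives in the HAPI (Java) validation support chain; the logic it encodes can be sketched language-neutrally in Python. This is illustrative only: the URL marker and the predicate below are assumptions, not the real implementation.

```python
# Assumption for illustration: ICD-11 ValueSet URLs contain "icd11"
# (true of bd-condition-icd11-diagnosis-valueset).
ICD11_VALUESET_MARKER = "icd11"


def is_value_set_supported(valueset_url: str) -> bool:
    # Returning False tells the validator "do not try to $expand this
    # ValueSet"; membership is instead checked per-code via $validate-code.
    return ICD11_VALUESET_MARKER not in valueset_url
```

With this sketch, `is_value_set_supported("https://fhir.dghs.gov.bd/core/ValueSet/bd-condition-icd11-diagnosis-valueset")` returns `False`, so the validator never issues an `$expand` for it.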
**pgBouncer must remain in session mode.**
Hibernate uses prepared statements and advisory locks. Transaction-mode pgBouncer breaks both. Do not change `PGBOUNCER_POOL_MODE` to `transaction`.

**Flyway owns all DDL — Hibernate never modifies schema.**
`ddl-auto: validate` means Hibernate will refuse to start if the schema does not match its entities, but it will never ALTER or CREATE tables. If a HAPI upgrade changes entity mappings, write a Flyway migration. Never change `ddl-auto` to `update` in production.

**Audit writes are append-only.**
The `audit_writer_login` PostgreSQL user has INSERT only. The application cannot UPDATE or DELETE audit records regardless of what the code does. This is enforced at the database level. Do not grant additional privileges to this user.

**The IG package is bundled in the Docker image.**
The `.tgz` is a build-time artifact, not a runtime configuration. There is no hot-reload. An IG upgrade requires a new Docker image build and deployment. This is by design — it ties IG version to container version, making deployments auditable and rollbacks clean.

---

*(New file: `ops/version-upgrade-integration.md`, 434 lines)*

# ICD-11 Version Upgrade — HAPI Integration

**Audience:** ICD-11 Terminology Pipeline team, DGHS FHIR ops
**Related:** `version_upgrade.py` (OCL import pipeline)
**HAPI endpoint:** `DELETE /admin/terminology/cache`

---

## Overview

When a new ICD-11 MMS release is imported into OCL, the HAPI server's 24-hour terminology validation cache becomes stale. Vendors submitting resources after the import — but before the cache expires — will have their ICD-11 codes validated against the **old** OCL data. New codes from the new release will be incorrectly rejected as invalid (cache miss → OCL hit with old data → cached as invalid). Removed or reclassified codes that were previously valid will continue to be accepted from the cache.

**The cache flush endpoint resolves this.** Calling it after the OCL import forces the next validation call for every ICD-11 code to hit OCL directly, repopulating the cache with the new version's data.

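To make the staleness mechanics concrete, here is a toy TTL-cache sketch (illustrative only, not the HAPI implementation) showing why a result cached under the old release survives until an explicit flush:

```python
import time


class TtlCache:
    """Toy 24h validation cache: code -> (result, cached_at)."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.store = {}

    def get(self, code):
        entry = self.store.get(code)
        if entry and time.monotonic() - entry[1] < self.ttl:
            return entry[0]  # served from cache — may predate the OCL import
        return None  # miss: caller must go to OCL and repopulate

    def put(self, code, result):
        self.store[code] = (result, time.monotonic())

    def flush(self):
        self.store.clear()  # what DELETE /admin/terminology/cache achieves


cache = TtlCache(ttl_seconds=24 * 3600)
cache.put("XY9Z", False)           # cached as invalid under the OLD release
assert cache.get("XY9Z") is False  # still rejected even after OCL has the code
cache.flush()
assert cache.get("XY9Z") is None   # next lookup hits OCL and repopulates
```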
---

## Step-by-step upgrade procedure

The following steps must be executed **in this exact order**. Deviating from the order (e.g., flushing before the OCL import completes) causes the cache to repopulate with old data and requires a second flush.

```
Step 1  OCL:  import new ICD-11 MMS release
Step 2  OCL:  patch concept_class for Diagnosis + Finding concepts
Step 3  OCL:  repopulate bd-condition-icd11-diagnosis-valueset collection
Step 4  OCL:  verify $validate-code returns correct results for new codes
Step 5  HAPI: flush terminology cache            ← this document
Step 6  HAPI: verify validation with new codes
Step 7  DGHS: notify vendors of new release
```

Steps 1-4 are handled by `version_upgrade.py`. This document covers Steps 5-6 and the exact integration between the two systems.

---

## Step 4 — Pre-flush verification (run before calling HAPI)

Before flushing the HAPI cache, verify that OCL is serving correct results for the new release. Flushing a cache backed by an incorrect OCL state degrades validation quality.

### 4a — Verify a new code is valid in OCL

Pick a code that is **new** in this release (not in the previous release).

```bash
NEW_CODE="XY9Z"   # Replace with an actual new code from the release notes

curl -s "https://tr.ocl.dghs.gov.bd/api/fhir/ValueSet/\$validate-code\
?url=https://fhir.dghs.gov.bd/core/ValueSet/bd-condition-icd11-diagnosis-valueset\
&system=http://id.who.int/icd/release/11/mms\
&code=${NEW_CODE}" | jq '.parameter[] | select(.name=="result") | .valueBoolean'

# Expected: true
```

### 4b — Verify a Device-class code is rejected by OCL

Device-class codes must be rejected by the bd-condition-icd11-diagnosis-valueset (which restricts to Diagnosis + Finding only).

```bash
DEVICE_CODE="XA7RE2"   # Example Device-class code — use an actual one

curl -s "https://tr.ocl.dghs.gov.bd/api/fhir/ValueSet/\$validate-code\
?url=https://fhir.dghs.gov.bd/core/ValueSet/bd-condition-icd11-diagnosis-valueset\
&system=http://id.who.int/icd/release/11/mms\
&code=${DEVICE_CODE}" | jq '.parameter[] | select(.name=="result") | .valueBoolean'

# Expected: false
```

### 4c — Verify a deprecated code is invalid

If this release deprecates or removes any codes, verify they are now rejected.

```bash
DEPRECATED_CODE="..."   # From release notes

curl -s "https://tr.ocl.dghs.gov.bd/api/fhir/ValueSet/\$validate-code\
?url=https://fhir.dghs.gov.bd/core/ValueSet/bd-condition-icd11-diagnosis-valueset\
&system=http://id.who.int/icd/release/11/mms\
&code=${DEPRECATED_CODE}" | jq '.parameter[] | select(.name=="result") | .valueBoolean'

# Expected: false (if deprecated) or true (if still valid)
```

Do not proceed to Step 5 until all 4a-4c verifications pass.

---

## Step 5 — Flush the HAPI terminology cache

### 5a — Obtain a fhir-admin token

The cache flush endpoint requires the `fhir-admin` Keycloak role. The `fhir-admin-pipeline` client is the designated service account for this operation (see `ops/keycloak-setup.md`, Part 2).

```python
# In version_upgrade.py — add this function

import base64
import json
import os

import requests

KEYCLOAK_TOKEN_URL = "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token"
FHIR_ADMIN_CLIENT_ID = "fhir-admin-pipeline"
FHIR_ADMIN_CLIENT_SECRET = os.environ["FHIR_ADMIN_CLIENT_SECRET"]  # from secrets vault
HAPI_BASE_URL = "https://fhir.dghs.gov.bd"


def get_fhir_admin_token() -> str:
    """Obtain a fhir-admin Bearer token from Keycloak."""
    response = requests.post(
        KEYCLOAK_TOKEN_URL,
        data={
            "grant_type": "client_credentials",
            "client_id": FHIR_ADMIN_CLIENT_ID,
            "client_secret": FHIR_ADMIN_CLIENT_SECRET,
        },
        timeout=30,
    )
    response.raise_for_status()
    token_data = response.json()
    access_token = token_data["access_token"]

    # Verify the token contains the fhir-admin role before using it:
    # decode the middle JWT segment (base64url, padding restored).
    payload_b64 = access_token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))

    realm_roles = claims.get("realm_access", {}).get("roles", [])
    if "fhir-admin" not in realm_roles:
        raise ValueError(
            f"fhir-admin-pipeline token does not contain fhir-admin role. "
            f"Roles present: {realm_roles}. "
            f"Check Keycloak service account role assignment."
        )

    return access_token
```

### 5b — Check cache state before flush (optional but recommended)

```python
def get_cache_stats(admin_token: str) -> dict:
    """Retrieve current HAPI terminology cache statistics."""
    response = requests.get(
        f"{HAPI_BASE_URL}/admin/terminology/cache/stats",
        headers={"Authorization": f"Bearer {admin_token}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


# Usage:
stats_before = get_cache_stats(admin_token)
print(f"Cache before flush: {stats_before['totalEntries']} entries "
      f"({stats_before['liveEntries']} live, "
      f"{stats_before['expiredEntries']} expired)")
```

### 5c — Execute cache flush

```python
def flush_hapi_terminology_cache(admin_token: str) -> dict:
    """
    Flush the HAPI ICD-11 terminology validation cache.

    Must be called AFTER:
      - the OCL ICD-11 import is complete
      - the concept_class patch is applied
      - bd-condition-icd11-diagnosis-valueset is repopulated
      - $validate-code is verified to return correct results

    Returns the flush summary from HAPI.
    Raises requests.HTTPError on failure.
    """
    response = requests.delete(
        f"{HAPI_BASE_URL}/admin/terminology/cache",
        headers={"Authorization": f"Bearer {admin_token}"},
        timeout=60,  # allow time for HAPI to process across all replicas
    )

    if response.status_code == 403:
        raise PermissionError(
            "Cache flush rejected: fhir-admin role not present in token. "
            "Check Keycloak fhir-admin-pipeline service account configuration."
        )
    response.raise_for_status()

    result = response.json()
    print(f"HAPI cache flush completed: {result['entriesEvicted']} entries evicted "
          f"at {result['timestamp']}")
    return result


# Full upgrade function to add to version_upgrade.py:
def post_ocl_import_hapi_integration(icd11_version: str) -> None:
    """
    Call after a successful OCL import and verification.
    Flushes the HAPI cache and verifies the new version validates correctly.

    Args:
        icd11_version: The new ICD-11 version string, e.g. "2025-01"
    """
    print(f"\n=== HAPI integration: ICD-11 {icd11_version} ===")

    # Step 5a: get admin token
    print("Obtaining fhir-admin token...")
    admin_token = get_fhir_admin_token()
    print("Token obtained.")

    # Step 5b: record pre-flush state
    stats_before = get_cache_stats(admin_token)
    print(f"Pre-flush cache: {stats_before['totalEntries']} entries")

    # Step 5c: flush
    print("Flushing HAPI terminology cache...")
    flush_result = flush_hapi_terminology_cache(admin_token)
    print(f"Flush complete: {flush_result['entriesEvicted']} entries evicted")

    # Step 6: post-flush verification (see below)
    verify_hapi_validates_new_version(admin_token, icd11_version)

    print(f"=== HAPI integration complete for ICD-11 {icd11_version} ===\n")
```

---

## Step 6 — Post-flush verification

After the flush, verify that HAPI is now validating against the new OCL data. This confirms the end-to-end pipeline from OCL → HAPI cache → vendor validation.

### 6a — Submit a test Condition with a new ICD-11 code

The test resource must be submitted by the `fhir-admin-pipeline` client. Note: the admin client has the `fhir-admin` role, but the FHIR resource endpoints require the `mci-api` role. Use a dedicated test vendor client for resource submission, or temporarily assign `mci-api` to the admin client for testing.

**Recommended approach:** use a dedicated test vendor client (`fhir-vendor-test-pipeline`) with the `mci-api` role for post-upgrade verification.

```python
def verify_hapi_validates_new_version(
        admin_token: str, icd11_version: str) -> None:
    """
    Verifies HAPI is now accepting codes from the new ICD-11 version.
    Uses the $validate-code operation directly against HAPI (not resource
    submission) to avoid needing the mci-api role on the admin client.

    Note: HAPI's $validate-code endpoint proxies to OCL via the validation
    chain. A successful result confirms the cache was flushed AND OCL is
    returning correct results for the new version.
    """
    # Use a known-valid code from the new release.
    # This should be parameterised with the actual new code from the release notes.
    test_code = get_test_code_for_version(icd11_version)  # implement per release
    valueset_url = (
        "https://fhir.dghs.gov.bd/core/ValueSet/"
        "bd-condition-icd11-diagnosis-valueset"
    )

    response = requests.get(
        f"{HAPI_BASE_URL}/fhir/ValueSet/$validate-code",
        params={
            "url": valueset_url,
            "system": "http://id.who.int/icd/release/11/mms",
            "code": test_code,
        },
        headers={"Authorization": f"Bearer {admin_token}"},
        timeout=30,
    )

    if response.status_code == 401:
        # $validate-code requires mci-api — use a vendor test token here
        print("WARNING: $validate-code requires mci-api role. "
              "Skipping HAPI direct verification. "
              "Verify manually by submitting a test Condition resource.")
        return

    response.raise_for_status()
    result = response.json()

    valid = next(
        (p["valueBoolean"] for p in result.get("parameter", [])
         if p["name"] == "result"),
        None,
    )

    if valid is True:
        print(f"✓ HAPI verification passed: code '{test_code}' "
              f"valid in new ICD-11 {icd11_version}")
    else:
        message = next(
            (p.get("valueString") for p in result.get("parameter", [])
             if p["name"] == "message"),
            "no message",
        )
        raise ValueError(
            f"HAPI verification FAILED: code '{test_code}' rejected after cache flush. "
            f"Message: {message}. "
            f"Check OCL import completed correctly for ICD-11 {icd11_version}."
        )
```

---

## Integration into version_upgrade.py — call site

Add this to the end of your main upgrade function, after the OCL verification steps:

```python
def run_upgrade(icd11_version: str) -> None:
    """Main upgrade entry point."""

    # --- Existing steps (your current implementation) ---
    print(f"Starting ICD-11 {icd11_version} upgrade...")

    # 1. Import ICD-11 concepts into OCL
    import_concepts_to_ocl(icd11_version)

    # 2. Patch concept_class for Diagnosis + Finding
    patch_concept_class(icd11_version)

    # 3. Repopulate bd-condition-icd11-diagnosis-valueset
    repopulate_condition_valueset(icd11_version)

    # 4. Verify OCL $validate-code
    verify_ocl_validate_code(icd11_version)

    # --- New: HAPI integration ---
    # 5-6. Flush HAPI cache and verify
    post_ocl_import_hapi_integration(icd11_version)

    # 7. Notify vendors
    notify_vendors_of_upgrade(icd11_version)

    print(f"ICD-11 {icd11_version} upgrade complete.")
```

---

## Environment variables required by version_upgrade.py

Add to your upgrade pipeline's secrets configuration:

```bash
# Keycloak admin client for HAPI cache management
FHIR_ADMIN_CLIENT_SECRET=<secret from keycloak-setup.md Part 2>

# HAPI server base URL
HAPI_BASE_URL=https://fhir.dghs.gov.bd
```

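A fail-fast check at pipeline start-up avoids discovering a missing secret halfway through Step 5. A minimal sketch (the variable names are the two above; the helper name is hypothetical):

```python
import os

REQUIRED_ENV = ("FHIR_ADMIN_CLIENT_SECRET", "HAPI_BASE_URL")


def check_upgrade_env() -> dict:
    """Raise before any OCL/HAPI work if pipeline secrets are missing."""
    missing = [name for name in REQUIRED_ENV if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: os.environ[name] for name in REQUIRED_ENV}
```

Call it first thing in `run_upgrade()` so a misconfigured pipeline fails before Step 1 rather than after the OCL import.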
---

## Rollback procedure

If post-flush verification fails (HAPI is not accepting new codes):

1. **Do not re-run the flush** — the cache is already empty, so re-flushing has no effect.
2. Check OCL directly: `curl https://tr.ocl.dghs.gov.bd/api/fhir/ValueSet/$validate-code?...`
3. If OCL is returning wrong results, the OCL import is incomplete. Re-run Steps 1-4.
4. If OCL is returning correct results but HAPI is not, check the HAPI logs for OCL connectivity errors. OCL may have returned HTTP 5xx during the first post-flush validation call, triggering fail-open behaviour.
5. After fixing OCL, flush the cache again (it has repopulated with bad data).

```bash
# Emergency manual flush via curl
ADMIN_TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-admin-pipeline" \
  -d "client_secret=${FHIR_ADMIN_CLIENT_SECRET}" \
  | jq -r '.access_token')

curl -s -X DELETE \
  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
  https://fhir.dghs.gov.bd/admin/terminology/cache | jq .
```

---

## Cache warm-up after flush

The HAPI cache repopulates organically as vendors submit resources. There is no pre-warming mechanism. The first vendor submission after a flush will, for each code, take up to 10 seconds (the OCL timeout) rather than sub-millisecond (a cache hit). At pilot scale (50 vendors, <36,941 distinct codes in use), this is acceptable.

At national scale, consider a pre-warming job that submits $validate-code requests for the top-N most frequently submitted ICD-11 codes immediately after the flush. The top-N list is derivable from the `audit.audit_events` table:

```sql
-- QA only: these are the *rejected* codes, not warm-up candidates.
SELECT invalid_code, COUNT(*) AS frequency
FROM audit.fhir_rejected_submissions
WHERE rejection_code = 'TERMINOLOGY_INVALID_CODE'
  AND submission_time > NOW() - INTERVAL '90 days'
GROUP BY invalid_code
ORDER BY frequency DESC
LIMIT 100;

-- Warm-up list: the most frequently *accepted* ICD-11 codes from audit_events.
SELECT
    (validation_messages ->> 0) AS code_info,
    COUNT(*) AS frequency
FROM audit.audit_events
WHERE outcome = 'ACCEPTED'
  AND resource_type = 'Condition'
  AND event_time > NOW() - INTERVAL '90 days'
GROUP BY 1
ORDER BY frequency DESC
LIMIT 200;
```
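
Such a warm-up job could be sketched as follows. This is a sketch under stated assumptions: `prewarm_cache` is a hypothetical helper, and `validate` would wrap the `$validate-code` GET shown in Step 6 (each call populates one cache entry as a side effect); `top_codes` comes from the accepted-codes query above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def prewarm_cache(top_codes: Iterable[str],
                  validate: Callable[[str], bool],
                  max_workers: int = 8) -> int:
    """Issue one validation call per code so each result lands in the HAPI
    cache before vendors hit it; returns how many codes validated as True."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(validate, top_codes))
```

In production, `validate` would be a small wrapper around `requests.get(f"{HAPI_BASE_URL}/fhir/ValueSet/$validate-code", ...)` using a vendor test token, with `max_workers` kept low enough not to saturate OCL immediately after the flush.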