bd-fhir-national/ops/deployment-guide.md

# BD FHIR National — Production Deployment Guide

**Target OS:** Ubuntu 22.04 LTS
**Audience:** DGHS infrastructure team
**Estimated time:** 90 minutes first deployment, 15 minutes subsequent upgrades

---

## Prerequisites checklist

Before starting, confirm all of the following:

- [ ] Ubuntu 22.04 LTS server provisioned with minimum 8GB RAM, 4 vCPU, 100GB disk
- [ ] Server has outbound HTTPS access to:
  - `auth.dghs.gov.bd` (Keycloak)
  - `tr.ocl.dghs.gov.bd` (OCL)
  - `icd11.dghs.gov.bd` (cluster validator)
  - Your private Docker registry
- [ ] TLS certificates provisioned at paths matching `.env` `TLS_CERT_PATH` / `TLS_KEY_PATH`
- [ ] Keycloak `hris` realm configured per `ops/keycloak-setup.md`
- [ ] BD Core IG `bd.gov.dghs.core-0.2.1.tgz` present in `hapi-overlay/src/main/resources/packages/` on CI machine
- [ ] CI machine has built and pushed the Docker image to private registry
- [ ] `.env` file prepared from `.env.example` with all secrets filled in
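
The hardware and connectivity items above can be spot-checked from the server itself. The following is a minimal sketch, not part of the project: the function names are hypothetical, and the RAM and disk thresholds are assumed tolerances set slightly below the nominal 8 GB / 100 GB because the kernel and filesystem reserve part of each.

```bash
# Preflight sketch. Function names and tolerances are illustrative assumptions.

# Print OK/FAIL for one measured integer value against its minimum.
report() {
  local label="$1" actual="$2" required="$3"
  if [ "$actual" -ge "$required" ]; then
    echo "OK   ${label}: ${actual} (minimum ${required})"
  else
    echo "FAIL ${label}: ${actual} (minimum ${required})"
    return 1
  fi
}

# Succeeds if an HTTPS connection to the host can be established.
# Any HTTP status counts; only DNS, TCP, or TLS failures are reported.
check_https() {
  if curl -sS -o /dev/null --connect-timeout 5 "https://$1"; then
    echo "OK   https://$1 reachable"
  else
    echo "FAIL https://$1 unreachable"
    return 1
  fi
}

# Run every check against this host.
preflight() {
  local mem_mib cpus disk_gb host
  mem_mib=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
  cpus=$(nproc)
  disk_gb=$(df -BG --output=size / | tail -1 | tr -dc '0-9')
  report "RAM (MiB)" "$mem_mib" 7500
  report "vCPU" "$cpus" 4
  report "Root filesystem (GB)" "$disk_gb" 90
  for host in auth.dghs.gov.bd tr.ocl.dghs.gov.bd icd11.dghs.gov.bd; do
    check_https "$host"
  done
}
```

After pasting the functions into a shell on the server, run `preflight` and review any `FAIL` lines before continuing.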

---

## Part 1 — Server preparation

### 1.1 — Install Docker Engine

```bash
# Remove any conflicting packages
for pkg in docker.io docker-doc docker-compose docker-compose-v2 \
           podman-docker containerd runc; do
  sudo apt-get remove -y $pkg 2>/dev/null
done

# Add Docker's official GPG key
sudo apt-get update
# jq is not needed by Docker itself; later verification steps in this guide use it
sudo apt-get install -y ca-certificates curl jq
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
  -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add Docker repository
echo \
  "deb [arch=$(dpkg --print-architecture) \
  signed-by=/etc/apt/keyrings/docker.asc] \
  https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker Engine and Compose plugin
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io \
  docker-buildx-plugin docker-compose-plugin

# Verify
docker --version          # Docker Engine 25.x or higher
docker compose version    # Docker Compose v2.x or higher
```

### 1.2 — Configure Docker daemon

```bash
# Create daemon config: limit log size, set storage driver
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "5"
  },
  "storage-driver": "overlay2",
  "live-restore": true
}
EOF

sudo systemctl restart docker
sudo systemctl enable docker

# Add your deploy user to the docker group (avoids sudo on every docker command)
sudo usermod -aG docker $USER
# Log out and back in for group membership to take effect
```

### 1.3 — Create application directory

```bash
sudo mkdir -p /opt/bd-fhir-national
sudo chown $USER:$USER /opt/bd-fhir-national
cd /opt/bd-fhir-national
```

### 1.4 — Deploy project files

Copy the entire project directory to the server. Recommended approach:

```bash
# From your CI/deployment machine:
rsync -avz --exclude='.git' \
  --exclude='hapi-overlay/target' \
  --exclude='hapi-overlay/src' \
  ./bd-fhir-national/ \
  deploy@your-server:/opt/bd-fhir-national/

# The server needs:
# /opt/bd-fhir-national/
# ├── docker-compose.yml
# ├── .env                        ← you create this (see 1.5)
# ├── nginx/nginx.conf
# ├── postgres/fhir/postgresql.conf
# ├── postgres/fhir/init.sql
# ├── postgres/audit/postgresql.conf
# └── postgres/audit/init.sql
#
# The hapi-overlay/ source tree does NOT need to be on the production server.
# Only the Docker image (pre-built and pushed to registry) is needed.
```

### 1.5 — Create .env file

```bash
cd /opt/bd-fhir-national
cp .env.example .env
chmod 600 .env   # restrict to owner only — contains secrets

# Edit .env with actual values
nano .env
```

**Required values in .env:**

```bash
# .env is read as plain KEY=value text. Docker Compose does not run
# command substitution, so a value written as $(...) is stored literally.
# Generate each password separately on the command line with:
#   openssl rand -base64 32
# and paste the output in place of each <generated-secret> placeholder.

# Docker image: must match what CI pushed
HAPI_IMAGE=your-registry.dghs.gov.bd/bd-fhir-hapi:1.0.0

# FHIR database
FHIR_DB_NAME=fhirdb
FHIR_DB_SUPERUSER=postgres
FHIR_DB_SUPERUSER_PASSWORD=<generated-secret>
FHIR_DB_APP_USER=hapi_app
FHIR_DB_APP_PASSWORD=<generated-secret>

# Audit database
AUDIT_DB_NAME=auditdb
AUDIT_DB_SUPERUSER=postgres
AUDIT_DB_SUPERUSER_PASSWORD=<generated-secret>
AUDIT_DB_WRITER_USER=audit_writer_login
AUDIT_DB_WRITER_PASSWORD=<generated-secret>
AUDIT_DB_MAINTAINER_USER=audit_maintainer_login
AUDIT_DB_MAINTAINER_PASSWORD=<generated-secret>

# TLS certificate paths (absolute paths on this server)
TLS_CERT_PATH=/etc/ssl/dghs/fhir.dghs.gov.bd.crt
TLS_KEY_PATH=/etc/ssl/dghs/fhir.dghs.gov.bd.key
```

> **Security:** Never commit `.env` to version control. Store the filled
> `.env` in your secrets vault (HashiCorp Vault, AWS SSM, or encrypted backup).
> Verify permissions after creation: `ls -la .env` should show `-rw-------`.
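
Because `.env` holds literal values, each secret has to be generated first and then written into the file. The sketch below shows one way to do that and to check the file mode afterwards. The function names are hypothetical, and it uses `openssl rand -hex` rather than `-base64` as an assumption, since hex output avoids characters such as `/`, `+`, and `=` that sometimes need escaping in connection strings.

```bash
# Sketch only: helper names are illustrative, not part of the project.

# Print a 64-character hex secret.
gen_secret() {
  openssl rand -hex 32
}

# Set KEY to a freshly generated secret in the given env file,
# replacing an existing KEY= line or appending a new one.
set_secret() {
  local file="$1" key="$2" value
  value=$(gen_secret)
  if grep -q "^${key}=" "$file"; then
    sed -i "s|^${key}=.*|${key}=${value}|" "$file"
  else
    echo "${key}=${value}" >> "$file"
  fi
}

# Succeeds only when the file mode is exactly 600 (owner read/write).
env_perms_ok() {
  [ "$(stat -c '%a' "$1")" = "600" ]
}
```

For example, `set_secret .env FHIR_DB_APP_PASSWORD` followed by `env_perms_ok .env && echo "permissions OK"`.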

### 1.6 — Fix PostgreSQL init script password injection

The `postgres/audit/init.sql` file contains placeholder passwords.
The PostgreSQL Docker entrypoint runs `.sql` init files verbatim, with no
environment variable substitution; only `.sh` init scripts are run by a shell
and can read environment variables. Replace the init SQL with a shell script:

```bash
# Create shell-based init script for audit database
cat > /opt/bd-fhir-national/postgres/audit/init.sh <<'INITSCRIPT'
#!/bin/bash
set -e

# Load credentials from environment variables
# (These env vars are set in docker-compose.yml from .env)
# Abort if a password is missing rather than create a role with an empty one.
WRITER_USER="${AUDIT_DB_WRITER_USER:-audit_writer_login}"
WRITER_PASS="${AUDIT_DB_WRITER_PASSWORD:?AUDIT_DB_WRITER_PASSWORD is not set}"
MAINTAINER_USER="${AUDIT_DB_MAINTAINER_USER:-audit_maintainer_login}"
MAINTAINER_PASS="${AUDIT_DB_MAINTAINER_PASSWORD:?AUDIT_DB_MAINTAINER_PASSWORD is not set}"

psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
    -- Create writer login user
    DO \$\$
    BEGIN
        IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '${WRITER_USER}') THEN
            CREATE USER ${WRITER_USER}
                WITH NOSUPERUSER NOCREATEDB NOCREATEROLE NOINHERIT LOGIN
                CONNECTION LIMIT 20
                PASSWORD '${WRITER_PASS}';
        END IF;
    END
    \$\$;

    -- Create maintainer login user
    DO \$\$
    BEGIN
        IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '${MAINTAINER_USER}') THEN
            CREATE USER ${MAINTAINER_USER}
                WITH NOSUPERUSER NOCREATEDB NOCREATEROLE NOINHERIT LOGIN
                CONNECTION LIMIT 5
                PASSWORD '${MAINTAINER_PASS}';
        END IF;
    END
    \$\$;

    GRANT CONNECT ON DATABASE ${POSTGRES_DB} TO ${WRITER_USER};
    GRANT CONNECT ON DATABASE ${POSTGRES_DB} TO ${MAINTAINER_USER};
EOSQL
INITSCRIPT

chmod +x /opt/bd-fhir-national/postgres/audit/init.sh
```

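Update `docker-compose.yml` to mount `init.sh` instead of `init.sql` for
the `postgres-audit` service. Init scripts run only when the data directory
is empty, so make this change before the first start: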

```yaml
# In postgres-audit volumes: section, change:
# - ./postgres/audit/init.sql:/docker-entrypoint-initdb.d/init.sql:ro
# To:
- ./postgres/audit/init.sh:/docker-entrypoint-initdb.d/init.sh:ro
```

Also pass the audit user environment variables to `postgres-audit`:

```yaml
# In postgres-audit environment: section, add:
AUDIT_DB_WRITER_USER:       ${AUDIT_DB_WRITER_USER}
AUDIT_DB_WRITER_PASSWORD:   ${AUDIT_DB_WRITER_PASSWORD}
AUDIT_DB_MAINTAINER_USER:   ${AUDIT_DB_MAINTAINER_USER}
AUDIT_DB_MAINTAINER_PASSWORD: ${AUDIT_DB_MAINTAINER_PASSWORD}
```

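Similarly for `postgres-fhir`, create `init.sh`, then mount it and pass
`FHIR_DB_APP_USER` and `FHIR_DB_APP_PASSWORD` to the `postgres-fhir` service
in the same way: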

```bash
cat > /opt/bd-fhir-national/postgres/fhir/init.sh <<'INITSCRIPT'
#!/bin/bash
set -e

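APP_USER="${FHIR_DB_APP_USER:-hapi_app}"
# Abort if the password is missing rather than create a role with an empty one
APP_PASS="${FHIR_DB_APP_PASSWORD:?FHIR_DB_APP_PASSWORD is not set}"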

psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
    DO \$\$
    BEGIN
        IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = '${APP_USER}') THEN
            CREATE USER ${APP_USER}
                WITH NOSUPERUSER NOCREATEDB NOCREATEROLE NOINHERIT LOGIN
                CONNECTION LIMIT 30
                PASSWORD '${APP_PASS}';
        END IF;
    END
    \$\$;

    GRANT CONNECT ON DATABASE ${POSTGRES_DB} TO ${APP_USER};
    GRANT USAGE ON SCHEMA public TO ${APP_USER};
    ALTER DEFAULT PRIVILEGES IN SCHEMA public
        GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO ${APP_USER};
    ALTER DEFAULT PRIVILEGES IN SCHEMA public
        GRANT USAGE, SELECT ON SEQUENCES TO ${APP_USER};
EOSQL
INITSCRIPT

chmod +x /opt/bd-fhir-national/postgres/fhir/init.sh
```
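
Both init scripts interpolate role names and passwords straight into SQL text. Output from `openssl rand -base64` never contains a single quote, but a hand-typed value might, and a role name with unexpected characters would break the unquoted `CREATE USER` statement. The following is a hedged sketch of two guards that could be added near the top of each `init.sh`; the function names are hypothetical.

```bash
# Sketch only: guards for values interpolated into SQL by init.sh.

# Succeeds only for names that are safe as unquoted PostgreSQL identifiers:
# lowercase letters, digits, and underscores, not starting with a digit.
valid_pg_identifier() {
  case "$1" in
    ''|[!a-z_]*|*[!a-z0-9_]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Wrap a value as a SQL string literal, doubling embedded single quotes.
sql_quote_literal() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}
```

In the script, `valid_pg_identifier "$APP_USER" || exit 1` would stop initialisation on a bad name, and `PASSWORD $(sql_quote_literal "$APP_PASS")` would replace the hand-written quotes around the password.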

### 1.7 — Authenticate with private Docker registry

```bash
# Log in to your private registry
docker login your-registry.dghs.gov.bd \
  --username "${REGISTRY_USER}" \
  --password-stdin <<< "${REGISTRY_PASSWORD}"

# Verify the login persisted
jq '.auths | keys' ~/.docker/config.json
# Should include "your-registry.dghs.gov.bd"
```

---

## Part 2 — First deployment

### 2.1 — Pull images

```bash
cd /opt/bd-fhir-national

# Pull all images declared in docker-compose.yml
docker compose --env-file .env pull

# Verify images are present locally
docker images | grep -E "hapi|postgres|pgbouncer|nginx"
```

### 2.2 — Start infrastructure services first

Start databases before HAPI. HAPI's `depends_on` with `condition: service_healthy`
handles this automatically, but starting manually in stages helps isolate
any first-run issues.

```bash
# Start databases
docker compose --env-file .env up -d postgres-fhir postgres-audit

# Wait for health checks to pass (gives up after about 60 seconds per service)
wait_healthy() {
  local svc="$1" tries=20
  echo "Waiting for ${svc} to be ready..."
  # Match "(healthy)" exactly: a bare "healthy" also matches "(unhealthy)"
  until docker compose --env-file .env ps "$svc" | grep -q "(healthy)"; do
    tries=$((tries - 1))
    if [ "$tries" -le 0 ]; then
      echo "${svc}: not healthy after 60 seconds" >&2
      return 1
    fi
    sleep 3
  done
  echo "${svc}: healthy"
}

wait_healthy postgres-fhir
wait_healthy postgres-audit
```

### 2.3 — Verify PostgreSQL user creation

```bash
# Verify FHIR app user was created
docker exec bd-postgres-fhir psql -U postgres -d fhirdb -c \
  "SELECT rolname, rolcanlogin FROM pg_roles WHERE rolname = 'hapi_app';"
# Expected: hapi_app | t

# Verify audit writer user was created
docker exec bd-postgres-audit psql -U postgres -d auditdb -c \
  "SELECT rolname, rolcanlogin FROM pg_roles WHERE rolname = 'audit_writer_login';"
# Expected: audit_writer_login | t
```

### 2.4 — Start pgBouncer

```bash
docker compose --env-file .env up -d pgbouncer-fhir pgbouncer-audit

# Verify pgBouncer is healthy
# Match "(healthy)" exactly: a bare "healthy" also matches "(unhealthy)"
until docker compose --env-file .env ps pgbouncer-fhir \
    | grep -q "(healthy)"; do
  sleep 3
done
echo "pgbouncer-fhir: healthy"
```

### 2.5 — Start HAPI (first replica)

```bash
docker compose --env-file .env up -d hapi

# Follow startup logs — this takes 60-120 seconds on first run
# Watch for these key log events in order:
#   1. "Running FHIR Flyway migrations" — V1 schema creation
#   2. "Running Audit Flyway migrations" — V2 audit schema creation
#   3. "Advisory lock acquired" — IG package initialisation begins
#   4. "BD Core IG package loaded successfully" — IG loaded
#   5. "BdTerminologyValidationSupport initialised" — OCL integration ready
#   6. "KeycloakJwtInterceptor initialised" — JWT validation ready
#   7. "HAPI RestfulServer interceptors registered" — server ready
#   8. Spring Boot startup completion message with port 8080

docker compose --env-file .env logs -f hapi
# Press Ctrl+C when you see the startup completion message
```

**Expected startup log sequence (key lines only):**
```
INFO  o.f.core.internal.command.DbMigrate - Running FHIR Flyway migrations
INFO  o.f.core.internal.command.DbMigrate - Successfully applied 1 migration to schema "public"
INFO  o.f.core.internal.command.DbMigrate - Running Audit Flyway migrations
INFO  o.f.core.internal.command.DbMigrate - Successfully applied 1 migration to schema "audit"
INFO  b.g.d.f.init.IgPackageInitializer - Advisory lock acquired: lockKey=... waitedMs=...
INFO  b.g.d.f.init.IgPackageInitializer - BD Core IG package loaded successfully: version=0.2.1
INFO  b.g.d.f.t.BdTerminologyValidationSupport - BdTerminologyValidationSupport initialised
INFO  b.g.d.f.i.KeycloakJwtInterceptor - KeycloakJwtInterceptor initialised
INFO  b.g.d.f.c.SecurityConfig - HAPI RestfulServer interceptors registered
INFO  o.s.b.w.e.t.TomcatWebServer - Tomcat started on port(s): 8080
INFO  b.g.d.f.BdFhirApplication - Started BdFhirApplication in XX.XXX seconds
```
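
Instead of watching the log by eye, the key lines can be checked in order by a small filter. This is a sketch under the assumption that the marker strings match the log lines listed above exactly; the function name is hypothetical.

```bash
# Sketch only: reads log text on stdin and succeeds if every startup
# marker appears, in the order given in section 2.5.
check_startup_sequence() {
  awk '
    BEGIN {
      n = 0
      m[++n] = "Running FHIR Flyway migrations"
      m[++n] = "Running Audit Flyway migrations"
      m[++n] = "Advisory lock acquired"
      m[++n] = "BD Core IG package loaded successfully"
      m[++n] = "BdTerminologyValidationSupport initialised"
      m[++n] = "KeycloakJwtInterceptor initialised"
      m[++n] = "HAPI RestfulServer interceptors registered"
      m[++n] = "Started BdFhirApplication"
      i = 1
    }
    # Advance to the next marker each time the current one is seen
    i <= n && index($0, m[i]) { i++ }
    END {
      if (i > n) {
        print "startup sequence complete"
      } else {
        print "missing or out of order: " m[i]
        exit 1
      }
    }
  '
}
```

Usage: `docker compose --env-file .env logs hapi | check_startup_sequence`. On failure it names the first marker that was not found in sequence.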

### 2.6 — Start nginx

```bash
docker compose --env-file .env up -d nginx

# Verify nginx started without config errors
docker compose --env-file .env logs nginx | tail -20
# Should NOT contain: [emerg] or [crit] — only [notice] lines

# Verify nginx health
docker compose --env-file .env ps nginx
# Status should be: Up (healthy)
```

### 2.7 — Verify full stack health

```bash
# Internal health check (bypasses nginx, hits HAPI directly)
docker exec $(docker compose --env-file .env ps -q hapi | head -1) \
  curl -s http://localhost:8080/actuator/health | jq .

# Expected output:
# {
#   "status": "UP",
#   "components": {
#     "db": { "status": "UP" },
#     "auditDb": { "status": "UP" },
#     "ocl": { "status": "UP" },
#     "livenessState": { "status": "UP" },
#     "readinessState": { "status": "UP" }
#   }
# }

# External health check (through nginx + TLS)
curl -s https://fhir.dghs.gov.bd/actuator/health/liveness | jq .
# Expected: { "status": "UP" }

# FHIR metadata endpoint (unauthenticated)
curl -s https://fhir.dghs.gov.bd/fhir/metadata | jq '{
  resourceType,
  fhirVersion,
  software: .software,
  implementation: .implementation
}'
# Expected:
# {
#   "resourceType": "CapabilityStatement",
#   "fhirVersion": "4.0.1",
#   "software": { "name": "BD FHIR National Repository", "version": "0.2.1" }
# }
```

---

## Part 3 — Phase 2 acceptance tests

Run all nine tests before declaring the deployment production-ready.
Each test includes the expected HTTP status, expected response body shape,
and what to check in the audit log if the test fails.
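
When the tests are run repeatedly, comparing status codes by eye gets error-prone. A minimal sketch of two helpers follows; both names are hypothetical, and `http_status` simply passes its arguments through to `curl`.

```bash
# Sketch only: helper names are illustrative, not part of the project.

# Print only the HTTP status code of a request; arguments go to curl.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$@"
}

# Compare an actual status code with the expected one and print PASS/FAIL.
expect_status() {
  local name="$1" expected="$2" actual="$3"
  if [ "$actual" = "$expected" ]; then
    echo "PASS ${name}: HTTP ${actual}"
  else
    echo "FAIL ${name}: expected HTTP ${expected}, got ${actual}"
    return 1
  fi
}
```

For example, Test 5 could be scripted as `expect_status "Test 5" 401 "$(http_status -X POST https://fhir.dghs.gov.bd/fhir/Condition -H 'Content-Type: application/fhir+json' -d '{"resourceType": "Condition"}')"`.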

### Setup: obtain a vendor test token

```bash
VENDOR_TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-vendor-TEST-FAC-001" \
  -d "client_secret=${TEST_VENDOR_SECRET}" \
  | jq -r '.access_token')

echo "Token obtained: ${VENDOR_TOKEN:0:20}..."
```
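
If a later test returns an unexpected 401, it can help to look at the claims inside the token (expiry, roles, issuer). The sketch below decodes the payload segment of a JWT locally. The function name is hypothetical, and it does not verify the signature, so it is a debugging aid only.

```bash
# Sketch only: print the decoded JSON payload (claims) of a JWT.
# Does NOT verify the signature.
jwt_payload() {
  local payload
  # Take the middle segment and convert base64url to standard base64
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the padding that base64url encoding strips
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
}
```

Usage: `jwt_payload "$VENDOR_TOKEN" | jq .` shows the claims, including `exp` as a Unix timestamp.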

---

### Test 1 — Valid Condition with valid ICD-11 code → 201

Submits a BD Core IG-compliant `bd-condition` resource with a valid
ICD-11 Diagnosis-class code.

```bash
curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Authorization: Bearer ${VENDOR_TOKEN}" \
  -H "Content-Type: application/fhir+json" \
  -d '{
    "resourceType": "Condition",
    "meta": {
      "profile": ["https://fhir.dghs.gov.bd/core/StructureDefinition/bd-condition"]
    },
    "clinicalStatus": {
      "coding": [{
        "system": "http://terminology.hl7.org/CodeSystem/condition-clinical",
        "code": "active"
      }]
    },
    "verificationStatus": {
      "coding": [{
        "system": "http://terminology.hl7.org/CodeSystem/condition-ver-status",
        "code": "confirmed"
      }]
    },
    "code": {
      "coding": [{
        "system": "http://id.who.int/icd/release/11/mms",
        "code": "1C62.0",
        "display": "Typhoid fever"
      }]
    },
    "subject": {
      "reference": "Patient/test-patient-001"
    },
    "recordedDate": "2025-03-01"
  }'
```

**Expected:** `HTTP 201 Created` with `Location` header containing the new resource URL.

**If 422 instead:**
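- Check OCL connectivity (single quotes stop the shell from expanding `$validate-code` and `&`): `curl 'https://tr.ocl.dghs.gov.bd/api/fhir/CodeSystem/$validate-code?system=http://id.who.int/icd/release/11/mms&code=1C62.0'`
- Check the health endpoint from inside the HAPI container (as in 2.7): `curl http://localhost:8080/actuator/health`; the `ocl` component should be `UP`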
- Check HAPI logs for profile validation errors

---

### Test 2 — Invalid ICD-11 code → 422

```bash
curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Authorization: Bearer ${VENDOR_TOKEN}" \
  -H "Content-Type: application/fhir+json" \
  -d '{
    "resourceType": "Condition",
    "meta": {
      "profile": ["https://fhir.dghs.gov.bd/core/StructureDefinition/bd-condition"]
    },
    "clinicalStatus": {
      "coding": [{"system": "http://terminology.hl7.org/CodeSystem/condition-clinical", "code": "active"}]
    },
    "verificationStatus": {
      "coding": [{"system": "http://terminology.hl7.org/CodeSystem/condition-ver-status", "code": "confirmed"}]
    },
    "code": {
      "coding": [{
        "system": "http://id.who.int/icd/release/11/mms",
        "code": "INVALID-CODE-99999",
        "display": "This code does not exist"
      }]
    },
    "subject": {"reference": "Patient/test-patient-001"},
    "recordedDate": "2025-03-01"
  }'
```

**Expected:** `HTTP 422 Unprocessable Entity` with OperationOutcome containing:
- `issue[0].severity`: `error`
- `issue[0].diagnostics`: contains "INVALID-CODE-99999" and rejection reason
- `issue[0].expression`: contains `Condition.code`

**Verify in audit table:**
```sql
SELECT rejection_code, rejection_reason, invalid_code, element_path
FROM audit.fhir_rejected_submissions
ORDER BY submission_time DESC LIMIT 1;
-- Expected: TERMINOLOGY_INVALID_CODE | OCL rejected code... | INVALID-CODE-99999 | Condition.code...
```

---

### Test 3 — Device-class ICD-11 code in Condition.code → 422

Device-class codes are valid ICD-11 codes but are not in the
`bd-condition-icd11-diagnosis-valueset` (restricted to Diagnosis + Finding).

```bash
# XA7RE2 is an example Device-class code in ICD-11 MMS
# Verify it is Device-class in OCL before running this test
# (single quotes stop the shell from expanding $lookup):
# curl 'https://tr.ocl.dghs.gov.bd/api/fhir/CodeSystem/$lookup?system=http://id.who.int/icd/release/11/mms&code=XA7RE2'

curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Authorization: Bearer ${VENDOR_TOKEN}" \
  -H "Content-Type: application/fhir+json" \
  -d '{
    "resourceType": "Condition",
    "meta": {
      "profile": ["https://fhir.dghs.gov.bd/core/StructureDefinition/bd-condition"]
    },
    "clinicalStatus": {
      "coding": [{"system": "http://terminology.hl7.org/CodeSystem/condition-clinical", "code": "active"}]
    },
    "verificationStatus": {
      "coding": [{"system": "http://terminology.hl7.org/CodeSystem/condition-ver-status", "code": "confirmed"}]
    },
    "code": {
      "coding": [{
        "system": "http://id.who.int/icd/release/11/mms",
        "code": "XA7RE2",
        "display": "Device code — should be rejected"
      }]
    },
    "subject": {"reference": "Patient/test-patient-001"},
    "recordedDate": "2025-03-01"
  }'
```

**Expected:** `HTTP 422` with OperationOutcome.
Rejection code in audit: `TERMINOLOGY_INVALID_CLASS`.

**If 201 instead (code accepted):**
- OCL ValueSet class restriction is not enforcing correctly
- Verify the ValueSet collection in OCL has correct concept_class filter
- Run: `python version_upgrade.py --verify-class-restriction`

---

### Test 4 — Profile violation (missing required field) → 422

Submits a Condition without `clinicalStatus`, which the `bd-condition` profile requires.

```bash
curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Authorization: Bearer ${VENDOR_TOKEN}" \
  -H "Content-Type: application/fhir+json" \
  -d '{
    "resourceType": "Condition",
    "meta": {
      "profile": ["https://fhir.dghs.gov.bd/core/StructureDefinition/bd-condition"]
    },
    "code": {
      "coding": [{
        "system": "http://id.who.int/icd/release/11/mms",
        "code": "1C62.0"
      }]
    },
    "subject": {"reference": "Patient/test-patient-001"}
  }'
```

**Expected:** `HTTP 422` with OperationOutcome referencing missing `clinicalStatus`.

**If 201 instead:**
- BD Core IG is not loaded or profile is not enforcing `clinicalStatus` as required
- Check startup logs for IG load success
- Verify: `curl http://localhost:8080/fhir/StructureDefinition/bd-condition`

---

### Test 5 — No Bearer token → 401

```bash
curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Content-Type: application/fhir+json" \
  -d '{"resourceType": "Condition"}'
```

**Expected:** `HTTP 401` with `WWW-Authenticate` header and OperationOutcome.

```bash
# Verify WWW-Authenticate header is present
# -D - prints the response headers; -I would send HEAD and conflicts with -d
curl -s -o /dev/null -D - \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Content-Type: application/fhir+json" \
  -d '{"resourceType":"Condition"}' \
  | grep -i "www-authenticate"
# Expected: WWW-Authenticate: Bearer realm="BD FHIR National Repository"...
```

---

### Test 6 — Valid token but missing mci-api role → 401

Create a test client without the `mci-api` role in Keycloak for this test,
or use a token from a different realm.

```bash
# Token from a client without mci-api role
NO_ROLE_TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-test-no-role" \
  -d "client_secret=${TEST_NO_ROLE_SECRET}" \
  | jq -r '.access_token')

curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Authorization: Bearer ${NO_ROLE_TOKEN}" \
  -H "Content-Type: application/fhir+json" \
  -d '{"resourceType": "Condition"}'
```

**Expected:** `HTTP 401`.

**Verify in audit log:**
```sql
SELECT event_type, outcome_detail, client_id
FROM audit.audit_events
WHERE event_type = 'AUTH_FAILURE'
ORDER BY event_time DESC LIMIT 1;
-- Expected: AUTH_FAILURE | Required role 'mci-api' not present... | fhir-test-no-role
```

---

### Test 7 — Expired token → 401

```bash
# An expired token is one whose 'exp' claim is in the past.
# Easiest approach: obtain a token, wait for it to expire (default: 5 minutes),
# then use it.
#
# For automated testing, forge an expired token manually:
# (This requires knowing the signing key — use only in test environments)
#
# Alternative: Use a token from a deactivated Keycloak client
# (revoke the client's credentials, existing tokens become invalid)

# Or simply wait: obtain a token now and use it after it has expired
EXPIRED_TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-vendor-TEST-FAC-001" \
  -d "client_secret=${TEST_VENDOR_SECRET}" \
  | jq -r '.access_token')

echo "Waiting 6 minutes for token to expire..."
sleep 360  # Keycloak's default access token lifespan is 5 minutes

curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Authorization: Bearer ${EXPIRED_TOKEN}" \
  -H "Content-Type: application/fhir+json" \
  -d '{"resourceType": "Condition"}'
```

**Expected:** `HTTP 401` — "Token has expired".

---

### Test 8 — Cluster expression: raw postcoordinated code without extension → 422

```bash
curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X POST https://fhir.dghs.gov.bd/fhir/Condition \
  -H "Authorization: Bearer ${VENDOR_TOKEN}" \
  -H "Content-Type: application/fhir+json" \
  -d '{
    "resourceType": "Condition",
    "meta": {
      "profile": ["https://fhir.dghs.gov.bd/core/StructureDefinition/bd-condition"]
    },
    "clinicalStatus": {
      "coding": [{"system": "http://terminology.hl7.org/CodeSystem/condition-clinical", "code": "active"}]
    },
    "verificationStatus": {
      "coding": [{"system": "http://terminology.hl7.org/CodeSystem/condition-ver-status", "code": "confirmed"}]
    },
    "code": {
      "coding": [{
        "system": "http://id.who.int/icd/release/11/mms",
        "code": "1C62.0&has_severity=mild",
        "display": "Raw postcoordinated string — prohibited"
      }]
    },
    "subject": {"reference": "Patient/test-patient-001"},
    "recordedDate": "2025-03-01"
  }'
```

**Expected:** `HTTP 422` with OperationOutcome diagnosing:
`"ICD-11 postcoordinated expression in Condition.code.coding[0] must use the icd11-cluster-expression extension"`

Rejection code in audit: `CLUSTER_STEM_MISSING_EXTENSION`.

---

### Test 9 — Cache flush endpoint requires fhir-admin role

```bash
# Attempt with vendor token (mci-api only) — should be 403
curl -s -w "\n--- HTTP %{http_code} ---\n" \
  -X DELETE https://fhir.dghs.gov.bd/admin/terminology/cache \
  -H "Authorization: Bearer ${VENDOR_TOKEN}"
# Expected: 403 (blocked by nginx IP restriction OR TerminologyCacheManager role check)

# Attempt with fhir-admin token — should be 200
ADMIN_TOKEN=$(curl -s -X POST \
  "https://auth.dghs.gov.bd/realms/hris/protocol/openid-connect/token" \
  -d "grant_type=client_credentials" \
  -d "client_id=fhir-admin-pipeline" \
  -d "client_secret=${FHIR_ADMIN_CLIENT_SECRET}" \
  | jq -r '.access_token')

# Note: /admin/ is restricted to 172.20.0.0/16 in nginx.
# Run this from within the Docker network or from the server itself:
docker exec $(docker compose --env-file .env ps -q hapi | head -1) \
  curl -s -X DELETE \
    -H "Authorization: Bearer ${ADMIN_TOKEN}" \
    http://localhost:8080/admin/terminology/cache | jq .
# Expected: 200 with { "status": "flushed", "entriesEvicted": N, ... }
```

---

## Part 4 — Subsequent deployments (image upgrade)

When a new Docker image is built and pushed (new IG version, code changes):

```bash
cd /opt/bd-fhir-national

# 1. Update image tag in .env
nano .env
# Change: HAPI_IMAGE=your-registry.dghs.gov.bd/bd-fhir-hapi:1.0.0
# To:     HAPI_IMAGE=your-registry.dghs.gov.bd/bd-fhir-hapi:1.1.0

# 2. Pull new image
docker compose --env-file .env pull hapi

# 3. Restart HAPI on the new image. Choose ONE of 3a or 3b.

# 3a. At 1 replica (pilot): brief downtime expected (~30s)
docker compose --env-file .env up -d --no-deps hapi

# 3b. At 3 replicas (Phase 2): scale up, then scale back down
docker compose --env-file .env up -d --no-deps --scale hapi=4 hapi
# Confirm every replica reports (healthy) before scaling down.
# First-run startup took 60-120 seconds (see 2.5), so a fixed short sleep
# is not a reliable signal.
docker compose --env-file .env ps hapi
docker compose --env-file .env up -d --no-deps --scale hapi=3 hapi

# 4. Verify startup
docker compose --env-file .env logs --tail=50 hapi

# 5. Run acceptance tests (at minimum Tests 1, 2, 5)
```
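
The scale-up step is only safe once the expected number of replicas report healthy. The sketch below polls `docker compose ps` until that is the case or a timeout passes. The function names and the 180-second default are assumptions, not project settings.

```bash
# Sketch only: helper names and the default timeout are illustrative.

# Count lines on stdin that report "(healthy)".
# "(unhealthy)" and "(health: starting)" do not match.
count_healthy() {
  grep -c "(healthy)" || true
}

# Wait until at least WANT replicas of SERVICE are healthy.
# Usage: wait_for_healthy_replicas SERVICE WANT [TIMEOUT_SECONDS]
wait_for_healthy_replicas() {
  local service="$1" want="$2" timeout="${3:-180}" waited=0 have
  while :; do
    have=$(docker compose --env-file .env ps "$service" | count_healthy)
    if [ "$have" -ge "$want" ]; then
      echo "${service}: ${have} healthy"
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      echo "${service}: only ${have}/${want} healthy after ${timeout}s" >&2
      return 1
    fi
    sleep 5
    waited=$((waited + 5))
  done
}
```

For the 3-replica case, `wait_for_healthy_replicas hapi 4` would run between the scale-up and scale-down commands.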

---

## Part 5 — Operational runbook

### View logs

```bash
# All services
docker compose --env-file .env logs -f

# HAPI only (structured JSON, piped through jq)
# --no-log-prefix drops the "hapi-1  | " prefix so each line is valid JSON
docker compose --env-file .env logs -f --no-log-prefix hapi | jq -R 'try fromjson'

# nginx access log
docker compose --env-file .env logs -f nginx

# Filter for rejected submissions in HAPI logs
docker compose --env-file .env logs --no-log-prefix hapi | \
  jq -R 'try fromjson | select(.message | contains("rejected"))'
```

### Restart a specific service

```bash
docker compose --env-file .env restart hapi
docker compose --env-file .env restart nginx
```

### Emergency: full stack restart

```bash
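# Do NOT add -v / --volumes to "down": that flag deletes the named volumes,
# including the PostgreSQL data.
docker compose --env-file .env down
docker compose --env-file .env up -d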
```

### Query rejected submissions

```bash
docker exec bd-postgres-audit psql -U postgres -d auditdb -c "
SELECT
    submission_time,
    resource_type,
    rejection_code,
    LEFT(rejection_reason, 100) as reason,
    client_id
FROM audit.fhir_rejected_submissions
ORDER BY submission_time DESC
LIMIT 20;"
```

### Check pgBouncer pool status

```bash
# Connect to pgBouncer admin interface
docker exec -it bd-pgbouncer-fhir \
  psql -h localhost -p 5432 -U pgbouncer pgbouncer -c "SHOW POOLS;"
```

### Monitor disk usage

```bash
# PostgreSQL data volumes
docker system df -v | grep -E "postgres|audit"

# Log volume
docker system df -v | grep hapi-logs
```
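
For routine monitoring, a threshold check on the filesystem that holds Docker's data can be run from cron. This is a sketch only: the function names and the 80% default limit are assumptions, and `/var/lib/docker` is Docker's default data root, which may differ on this server.

```bash
# Sketch only: helper names and the default limit are illustrative.

# Print the used percentage (integer, no % sign) of the filesystem at PATH.
disk_usage_percent() {
  df --output=pcent "$1" | tail -1 | tr -dc '0-9'
}

# Warn and return non-zero when usage is at or above LIMIT percent.
# Usage: check_disk PATH [LIMIT]
check_disk() {
  local path="$1" limit="${2:-80}" used
  used=$(disk_usage_percent "$path")
  if [ "$used" -ge "$limit" ]; then
    echo "WARN ${path} at ${used}% (limit ${limit}%)"
    return 1
  fi
  echo "OK   ${path} at ${used}%"
}
```

Usage: `check_disk /var/lib/docker 80`.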