# BD FHIR National — Project Manifest & Pre-Flight Checklist

**Project:** BD Core FHIR National Repository and Validation Engine
**IG Version:** BD Core FHIR IG v0.2.1
**FHIR Version:** R4 (4.0.1)
**HAPI Version:** 7.2.0
**Published by:** DGHS/MoHFW Bangladesh
**Generated:** 2025

---

## Complete file manifest

### Build and orchestration

| File | Step | Purpose |
|------|------|---------|
| `pom.xml` | 1 | Parent Maven POM. HAPI 7.2.0 BOM, Spring Boot 3.2.5, all version pins. |
| `hapi-overlay/pom.xml` | 2 | Child module POM. All runtime dependencies. Fat JAR output: `bd-fhir-hapi.jar`. |
| `hapi-overlay/Dockerfile` | 4 | Multi-stage build: Maven builder + eclipse-temurin:17-jre runtime. tini as PID 1. |
| `docker-compose.yml` | 4 | Production orchestration: HAPI, 2× PostgreSQL, 2× pgBouncer, nginx. Scaling roadmap in comments. |
| `.env.example` | 4 | Environment variable template. Copy to `.env`, fill secrets, `chmod 600`. |

### Database

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/resources/db/migration/fhir/V1__hapi_schema.sql` | 3 | HAPI 7.2.0 JPA schema. All tables, sequences, indexes. Flyway-managed. Partition comments at 10M+ rows. |
| `hapi-overlay/src/main/resources/db/migration/audit/V2__audit_schema.sql` | 3 | Audit schema. Partitioned `audit_events` and `fhir_rejected_submissions` by month 2025-2027. INSERT-only role grants. `create_next_month_partitions()` maintenance function. |
| `postgres/fhir/postgresql.conf` | 4 | PostgreSQL 15 tuning for HAPI JPA workload. 2GB container. SSD-optimised. |
| `postgres/audit/postgresql.conf` | 4 | PostgreSQL 15 tuning for audit INSERT workload. 1GB container. |
| `postgres/fhir/init.sql` | 4 | Template: **replace with `init.sh`** per deployment-guide.md §1.6 before first deploy. |
| `postgres/audit/init.sql` | 4 | Template: **replace with `init.sh`** per deployment-guide.md §1.6 before first deploy. |

### Application configuration

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/resources/application.yaml` | 5 | Complete Spring Boot + HAPI configuration. Dual datasource, dual Flyway, HAPI R4, validation chain, actuator, structured logging. All secrets via env vars. |
| `hapi-overlay/src/main/resources/logback-spring.xml` | 5 | Structured JSON logging via logstash-logback-encoder. Async appenders. MDC field inclusion. |

### Java source — entry point

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/BdFhirApplication.java` | 12 | Spring Boot entry point. `@EnableAsync` activates audit async executor. |

### Java source — configuration

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/config/DataSourceConfig.java` | 6 | Dual datasource wiring. Primary FHIR datasource (HikariCP, pgBouncer session mode). Secondary audit datasource (INSERT-only). Dual Flyway instances. `auditDbHealthIndicator` using INSERT test. `oclHealthIndicator`. `entityManagerFactory` bound explicitly to FHIR datasource. |
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/config/FhirServerConfig.java` | 6 | Validation support chain (6 supports in dependency order). `NpmPackageValidationSupport` loading BD Core IG. `RequestValidatingInterceptor` with failOnSeverity=ERROR. `unvalidatedProfileTagInterceptor` for unknown resource types. Startup IG presence check. |
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/config/SecurityConfig.java` | 8 | Registers JWT, validation, and audit interceptors into HAPI RestfulServer in correct order. HTTPS enforcement filter. Security response headers filter. |

### Java source — initialisation

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/init/IgPackageInitializer.java` | 9 | `InitializingBean` that loads BD Core IG with PostgreSQL advisory lock. Prevents multi-replica NPM_PACKAGE race condition. djb2 hash for stable lock key. |

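
The `IgPackageInitializer` row mentions a djb2 hash used to derive a stable advisory-lock key. As a rough illustration only (the real implementation is Java and is not shown in this manifest), the classic djb2 recurrence `hash = hash * 33 + byte`, seeded with 5381, can be sketched in bash. The function name `djb2` is hypothetical, and the Java code may mask or truncate the value differently.

```bash
# Sketch of the classic djb2 string hash (hash = hash * 33 + byte, seed 5381).
# Assumptions: ASCII input; bash arithmetic is signed 64-bit and wraps on overflow.
djb2() {
  local s=$1 hash=5381 i c
  for (( i = 0; i < ${#s}; i++ )); do
    # A leading single quote makes printf emit the character's numeric code.
    printf -v c '%d' "'${s:i:1}"
    hash=$(( hash * 33 + c ))
  done
  printf '%s\n' "$hash"
}
```

A 64-bit integer of this kind is the sort of value PostgreSQL's `pg_advisory_lock(bigint)` accepts. Which string `IgPackageInitializer` hashes, and how it applies the lock, is defined in the Java source rather than here.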
### Java source — interceptors

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/interceptor/KeycloakJwtInterceptor.java` | 8 | Nimbus JOSE+JWT with `RemoteJWKSet` (1-hour TTL, kid-based refresh). Validates: signature, expiry, issuer, `mci-api` role. Extracts: `client_id`, `subject`, `sending_facility`. Sets all `REQUEST_ATTR_*` constants. MDC population and guaranteed cleanup. `GET /fhir/metadata` and actuator health exempt. |
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/interceptor/AuditEventInterceptor.java` | 9 | Three-hook interceptor: (1) cluster expression pre-validation, (2) accepted resource audit at `STORAGE_PRESTORAGE_*`, (3) rejected resource audit at `SERVER_HANDLE_EXCEPTION`. Routes to `AuditEventEmitter` and `RejectedSubmissionSink` asynchronously. |

### Java source — terminology

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/terminology/BdTerminologyValidationSupport.java` | 7 | Custom `IValidationSupport`. Forces `$validate-code` for ICD-11. Suppresses `$expand` via `isValueSetSupported()=false`. 24-hour `ConcurrentHashMap` cache with TTL eviction. Retry with exponential backoff. Fail-open on OCL outage. `flushCache()` called by `TerminologyCacheManager`. |
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/terminology/TerminologyCacheManager.java` | 7 | REST controller: `DELETE /admin/terminology/cache` and `GET /admin/terminology/cache/stats`. Requires `fhir-admin` role (read from `REQUEST_ATTR_IS_ADMIN`). Called by ICD-11 version upgrade pipeline. |

### Java source — validator

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/validator/ClusterExpressionValidator.java` | 7 | Detects `icd11-cluster-expression` extension on ICD-11 `Coding` elements. Rejects raw postcoordinated strings (contains `&`, `/`, `%` without extension) with 422. Calls `https://icd11.dghs.gov.bd/cluster/validate` for full expression validation. Fail-open on cluster validator outage. |

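
The raw-string rule in the `ClusterExpressionValidator` row can be illustrated with a small client-side pre-check. This is a sketch, not the server's Java logic: the function name `has_postcoord_chars` is hypothetical, and it only tests for the three characters named in the table (`&`, `/`, `%`).

```bash
# Returns 0 (true) when the code string contains a postcoordination character.
# Hypothetical helper for pre-submission checks; the server remains authoritative.
has_postcoord_chars() {
  case "$1" in
    *[\&/%]*) return 0 ;;   # contains &, / or %
    *)        return 1 ;;   # plain stem code
  esac
}
```

A submitting system could call `has_postcoord_chars "$code"` before building a `Coding`. According to the table, a code that matches must be carried via the `icd11-cluster-expression` extension, since sending it as a raw string is rejected with 422.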
### Java source — audit

| File | Step | Purpose |
|------|------|---------|
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/audit/AuditEventEmitter.java` | 9 | `@Async` INSERT to `audit.audit_events`. Immutable (INSERT only; the `audit_writer` role enforces this at DB level). Serialises `validationMessages` as JSONB. Truncates fields to column lengths. Logs ERROR on write failure (audit gap is a high-priority incident). |
| `hapi-overlay/src/main/java/bd/gov/dghs/fhir/audit/RejectedSubmissionSink.java` | 9 | `@Async` INSERT to `audit.fhir_rejected_submissions`. Stores full resource payload as TEXT (preserves exact bytes). 4MB payload cap (anti-DoS). Machine-readable `rejection_code` for programmatic analysis. |

### Infrastructure

| File | Step | Purpose |
|------|------|---------|
| `nginx/nginx.conf` | 10 | Reverse proxy. TLS 1.2/1.3 only. Rate limiting: FHIR 10r/s, admin 6r/m, metadata 5r/s. `/admin/` restricted to `172.20.0.0/16`. `/actuator/` restricted to internal network. `/fhir/metadata` unauthenticated. All other paths → HAPI. |
| `hapi-overlay/src/main/resources/packages/.gitkeep` | 12 | Marks the IG package directory for git. CI pipeline places `bd.gov.dghs.core-{version}.tgz` here before `docker build`. |

### Operations

| File | Step | Purpose |
|------|------|---------|
| `ops/keycloak-setup.md` | 10 | `fhir-admin` role creation. `fhir-admin-pipeline` client setup. Vendor client template. `sending_facility` mapper configuration. Token verification tests. |
| `ops/version-upgrade-integration.md` | 10 | ICD-11 upgrade pipeline integration. Pre-flush OCL verification. `get_fhir_admin_token()`, `flush_hapi_terminology_cache()`, `verify_hapi_validates_new_version()` Python functions. `post_ocl_import_hapi_integration()` call site. Rollback procedure. |
| `ops/scaling-roadmap.md` | 10 | Phase 1→2→3 thresholds and changes. Monthly partition maintenance cron. PostgreSQL monitoring queries. IG upgrade procedure. Key Prometheus metrics and alert thresholds. |
| `ops/deployment-guide.md` | 11 | Step-by-step Ubuntu 22.04 deployment. Docker install, daemon config, registry auth. PostgreSQL init script fix (critical). First-deploy sequence. Nine acceptance tests. Rolling upgrade procedure. Operational runbook. |

---

## Pre-flight checklist

Work through this list top to bottom before running `docker compose up`.
Each item is a documented failure mode from real HAPI deployments.
**Do not skip items marked CRITICAL.**

---

### CI machine (before docker build)

- [ ] **[CRITICAL]** `bd.gov.dghs.core-0.2.1.tgz` placed in `hapi-overlay/src/main/resources/packages/`
  *Symptom if missing: startup fails with `STARTUP FAILURE: BD Core IG package not found`. Container will not start.*

- [ ] `HAPI_IG_PACKAGE_CLASSPATH` in `docker-compose.yml` matches the `.tgz` filename exactly
  *Symptom if mismatch: same STARTUP FAILURE as above.*

- [ ] Docker image built with correct `--build-arg` values and pushed to private registry
  *Verify: `docker manifest inspect your-registry.dghs.gov.bd/bd-fhir-hapi:1.0.0`*

- [ ] Image tag in `.env.example` (and your `.env`) matches the pushed image tag
  *Symptom if mismatch: `docker compose pull` pulls wrong image or fails.*

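
The first two CI checks could be automated before `docker build`. The sketch below is one way to do it under stated assumptions: the function name `check_ig_package` is hypothetical, and it assumes the `.tgz` filename appears literally on the line that sets `HAPI_IG_PACKAGE_CLASSPATH` in `docker-compose.yml` (the exact value format is not shown in this manifest).

```bash
# Hypothetical pre-build check.
# Usage: check_ig_package <packages-dir> <tgz-filename> <compose-file>
# Example: check_ig_package hapi-overlay/src/main/resources/packages \
#            bd.gov.dghs.core-0.2.1.tgz docker-compose.yml
check_ig_package() {
  local dir=$1 tgz=$2 compose=$3
  if [ ! -f "$dir/$tgz" ]; then
    echo "missing IG package: $dir/$tgz" >&2
    return 1
  fi
  # Fixed-string match: the filename must appear on a HAPI_IG_PACKAGE_CLASSPATH line.
  if ! grep -F "HAPI_IG_PACKAGE_CLASSPATH" "$compose" | grep -qF -- "$tgz"; then
    echo "HAPI_IG_PACKAGE_CLASSPATH in $compose does not reference $tgz" >&2
    return 1
  fi
  echo "IG package OK: $tgz"
}
```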
---

### Production server (before docker compose up)

- [ ] **[CRITICAL]** `postgres/fhir/init.sql` replaced with `init.sh` (deployment-guide.md §1.6)
  *Symptom if skipped: `hapi_app` user is never created. Flyway migrations succeed but HAPI runtime fails with authentication error to postgres-fhir.*

- [ ] **[CRITICAL]** `postgres/audit/init.sql` replaced with `init.sh` (deployment-guide.md §1.6)
  *Symptom if skipped: `audit_writer_login` is never created. HAPI starts but all audit writes fail with `FATAL: password authentication failed for user "audit_writer_login"`.*

- [ ] `docker-compose.yml` `postgres-audit` service updated to mount `init.sh` (not `init.sql`) and to pass the `AUDIT_DB_WRITER_USER/PASSWORD/MAINTAINER_*` env vars
  *Follows from the init.sh fix above.*

- [ ] `.env` file created, all `<CHANGE_ME>` values replaced, `chmod 600 .env`
  *Verify: `grep CHANGE_ME .env` returns no output.*

- [ ] `TLS_CERT_PATH` and `TLS_KEY_PATH` in `.env` point to files that exist on the server
  *Verify: `ls -la $(grep TLS_CERT_PATH .env | cut -d= -f2)`, then repeat for `TLS_KEY_PATH`.*

- [ ] Server can reach all external services from within the Docker network:
  ```bash
  # Test from inside a temporary container on the Docker network
  docker run --rm --network bd-fhir-national_backend-fhir alpine sh -c \
    "apk add -q curl && curl -s -o /dev/null -w '%{http_code}' \
    https://auth.dghs.gov.bd/realms/hris/.well-known/openid-configuration"
  # Expected: 200
  ```
  *Symptom if unreachable: KeycloakJwtInterceptor fails to fetch JWKS on startup. All authenticated requests return 401 even with valid tokens.*

- [ ] `random_page_cost` in both `postgresql.conf` files matches your storage type: `1.1` for SSD (default in this project), `4.0` for spinning HDD
  *Symptom if wrong: query planner chooses sequential scans over indexes. FHIR search performance degrades at >100k resources.*

- [ ] Docker and Docker Compose v2 installed (`docker compose version`, not `docker-compose`)
  *Symptom if wrong: `docker-compose` (v1) does not support `deploy.replicas` or `condition: service_healthy`.*

- [ ] Private registry credentials stored in `~/.docker/config.json`
  *Verify: `docker login your-registry.dghs.gov.bd`*

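
The two `.env` checks in this section (placeholders replaced, mode `600`) can be combined into one function. This is a minimal sketch: `check_env_file` is a hypothetical name, and `stat -c` assumes GNU coreutils, as found on the Ubuntu 22.04 target named in the deployment guide entry.

```bash
# Hypothetical helper. Usage: check_env_file [path-to-env-file]   (defaults to .env)
check_env_file() {
  local f=${1:-.env} mode
  [ -f "$f" ] || { echo "$f not found" >&2; return 1; }
  # Any remaining placeholder means a secret was never filled in.
  if grep -q 'CHANGE_ME' "$f"; then
    echo "$f still contains CHANGE_ME placeholders" >&2
    return 1
  fi
  # GNU stat: %a prints the octal permission bits, e.g. 600.
  mode=$(stat -c '%a' "$f")
  if [ "$mode" != "600" ]; then
    echo "$f has mode $mode, expected 600" >&2
    return 1
  fi
  echo "$f OK"
}
```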
---

### Keycloak (before first vendor submission)

- [ ] **[CRITICAL]** `fhir-admin` realm role created in `hris` realm (keycloak-setup.md Part 1)
  *Symptom if missing: `fhir-admin-pipeline` service account has no role to assign. Cache flush endpoint returns 403 for all callers.*

- [ ] **[CRITICAL]** `fhir-admin-pipeline` client created with `fhir-admin` role assigned (keycloak-setup.md Part 2)
  *Symptom if missing: version upgrade pipeline cannot flush the cache. After an ICD-11 upgrade, stale cached results can cause codes to be wrongly accepted or rejected for up to 24 hours.*

- [ ] At least one vendor client created (`fhir-vendor-TEST-FAC-001` for acceptance testing) with `mci-api` role and `sending_facility` attribute mapper (keycloak-setup.md Parts 3-4)
  *Symptom if missing: acceptance Test 1 returns 401. All vendor submissions rejected.*

- [ ] Token from test vendor client decoded and verified to contain:
  - `iss`: `https://auth.dghs.gov.bd/realms/hris`
  - `azp`: `fhir-vendor-TEST-FAC-001`
  - `realm_access.roles`: contains `mci-api`
  - `sending_facility`: non-empty facility code

  *Verify with: `echo $TOKEN | cut -d. -f2 | base64 -d 2>/dev/null | jq .`*

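
JWT segments are base64url-encoded without padding, so plain `base64 -d` can fail or truncate on some tokens. A more tolerant decoder, sketched here under the hypothetical name `jwt_payload`, restores the standard alphabet and the padding before decoding. It only decodes the payload for inspection and does not verify the signature.

```bash
# Hypothetical helper. Usage: jwt_payload "$TOKEN" | jq .
jwt_payload() {
  local seg
  # Take the second dot-separated segment and map base64url to standard base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that JWT encoding strips.
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do
    seg="${seg}="
  done
  printf '%s' "$seg" | base64 -d
}
```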
---

### Post-startup verification

- [ ] All health indicators GREEN:
  ```bash
  curl -s http://localhost:8080/actuator/health | jq '.components | map_values(.status)'
  # Expected components: auditDb, db, livenessState, ocl, readinessState
  # All must show "UP"
  ```

- [ ] FHIR metadata accessible unauthenticated and shows correct IG version:
  ```bash
  curl -s https://fhir.dghs.gov.bd/fhir/metadata | jq '.software.version'
  # Expected: "0.2.1"
  ```

- [ ] Flyway migration history shows V1 and V2 applied cleanly:
  ```bash
  docker exec bd-postgres-fhir psql -U postgres -d fhirdb \
    -c "SELECT version, description, success FROM flyway_schema_history;"
  # Expected: 1 | hapi schema | t
  # (Flyway stores the version without the "V" prefix and the description with spaces)

  docker exec bd-postgres-audit psql -U postgres -d auditdb \
    -c "SELECT version, description, success FROM flyway_audit_schema_history;"
  # Expected: 2 | audit schema | t
  ```

- [ ] Audit tables accepting inserts (INSERT-only role working):
  ```bash
  docker exec bd-postgres-audit psql -U audit_writer_login -d auditdb -c \
    "INSERT INTO audit.health_check (check_id) VALUES (gen_random_uuid())
     ON CONFLICT DO NOTHING; SELECT 'audit insert ok';"
  # Expected: audit insert ok
  ```

- [ ] **Run all nine acceptance tests** from deployment-guide.md Part 3
  Tests 1-9 must all produce the expected HTTP status codes before the server is declared production-ready.

---

### Operational readiness (before announcing to vendors)

- [ ] Partition maintenance cron configured on audit database host (scaling-roadmap.md)
  *Run: `docker exec bd-postgres-audit psql -U postgres -d auditdb -c "SELECT audit.create_next_month_partitions();"` and verify that it creates next month's partitions without error.*

- [ ] Log shipping to ELK configured (or Filebeat agent installed and shipping `/app/logs/`)
  *Minimum: verify that `docker compose logs hapi` shows JSON-formatted log lines.*

- [ ] `FHIR_ADMIN_CLIENT_SECRET` stored in version upgrade pipeline's secrets vault
  *Required by `ops/version-upgrade-integration.md` before next ICD-11 release.*

- [ ] Next ICD-11 version upgrade date noted; the cache flush must be coordinated with OCL import completion
  *See `ops/version-upgrade-integration.md` for the 7-step procedure.*

- [ ] Vendor onboarding runbook prepared citing `ops/keycloak-setup.md` Parts 3-4
  *Each new vendor requires: Keycloak client, `mci-api` role, `sending_facility` mapper, credentials delivery.*

---

## Architecture decision record — key decisions frozen in this implementation

The following decisions were finalised through the pre-implementation challenge process
and are reflected throughout the codebase. They cannot be changed at runtime;
altering any of them requires code changes.

| Decision | Rationale | Where enforced |
|----------|-----------|---------------|
| PostgreSQL only, no H2 | National infrastructure requires production-grade persistence | `DataSourceConfig.java`, Flyway migrations, `docker-compose.yml` |
| Validation on ALL requests | No vendor exemptions: uniform HIE boundary | `RequestValidatingInterceptor` with `failOnSeverity=ERROR` |
| OCL is single terminology authority | No local ICD-11 copy; live validation | `BdTerminologyValidationSupport`, chain position 6 |
| `$expand` failures never cause rejection | Known OCL limitation | `isValueSetSupported()=false`, `expandValueSet()` returns null |
| Only `$validate-code` failures cause 422 | Distinguish expansion from validation | `BdTerminologyValidationSupport.validateCode()` |
| Keycloak `hris` realm, `mci-api` role, no basic auth | Single authentication authority | `KeycloakJwtInterceptor`, `SecurityConfig` |
| Audit log append-only, separate instance | Immutability, forensic separation | `postgres-audit` separate container, `audit_writer` INSERT-only role |
| Rejected payloads stored forensically | Vendor debugging, dispute resolution | `RejectedSubmissionSink`, `audit.fhir_rejected_submissions` |
| IG bundled in Docker image | Reproducible builds, no runtime URL dependency | `Dockerfile` COPY, `IgPackageInitializer` |
| Cluster expressions via extension, not raw code | BD Core IG decided pattern | `ClusterExpressionValidator`, `POSTCOORD_CHARS` rejection |
| Fail-open for OCL/cluster validator outages | Service continuity over perfect validation | `BdTerminologyValidationSupport` catch blocks, `ClusterExpressionValidator` catch blocks |
| `meta.tag = unvalidated-profile` for unknown types | FHIR-native, queryable, no schema changes | `unvalidatedProfileTagInterceptor` in `FhirServerConfig` |
| pgBouncer session mode | Hibernate prepared statement compatibility | `docker-compose.yml` `PGBOUNCER_POOL_MODE: session` |
| Flyway bypasses pgBouncer for migrations | DDL transaction safety | `SPRING_FLYWAY_URL` points to `postgres-fhir:5432` directly |
| Advisory lock for IG initialisation | Multi-replica startup race prevention | `IgPackageInitializer` djb2 lock key |
| Two MDC cleanup hooks | Thread pool MDC leak prevention | `KeycloakJwtInterceptor` `COMPLETED_NORMALLY` + `COMPLETED` |