FHIR Pipelines Control Panel
Last run failed! Please find the error logs here.

Run pipelines

Incremental pipeline is scheduled to run at 2026-03-20T12:00
Run Incremental Pipeline

This fetches the resources changed since the last run and merges them into the data-warehouse.

Run Full Pipeline

This fetches all the resources and creates a new data-warehouse snapshot.

Recreate Views

This reads the current data-warehouse snapshot and recreates the flat views in the sink database.
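
Under the hood this corresponds to running the batch pipeline in PARQUET fetch mode against an existing snapshot. The sketch below only assembles and prints the equivalent command line; the jar name (`batch-pipeline.jar`) and the local paths are illustrative assumptions, while the flags themselves are taken from the parameter tables in this document.

```python
# Illustrative sketch: the jar name and the local paths are assumptions;
# the flags are documented in the parameter tables of this page.
cmd = [
    "java", "-jar", "batch-pipeline.jar",  # hypothetical jar name
    "--fhirFetchMode=PARQUET",             # read an existing Parquet DWH snapshot
    "--parquetInputDwhRoot=/dwh/pipeline_DWH_TIMESTAMP_2026_03_20T10_08_02_021743767Z",
    "--viewDefinitionsDir=./views",        # must contain ALL ViewDefinitions
    "--sinkDbConfigPath=./sink-db-config.json",  # sink database config
]
print(" ".join(cmd))
# To actually execute it: subprocess.run(cmd, check=True)
```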

List of DWH snapshots


Latest: /dwh/pipeline_DWH_TIMESTAMP_2026_03_20T10_08_02_021743767Z/
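
Snapshot directories encode their creation time in the path, with the colons and dashes of the FHIR instant replaced by underscores. A small sketch, assuming the `TIMESTAMP_YYYY_MM_DDTHH_MM_SS_<nanos>Z` naming convention shown above, that recovers the timestamp from a snapshot path:

```python
import re
from datetime import datetime, timezone

def snapshot_time(dwh_path: str) -> datetime:
    """Extract the creation time from a DWH snapshot path such as
    /dwh/pipeline_DWH_TIMESTAMP_2026_03_20T10_08_02_021743767Z/
    (assumes the TIMESTAMP_YYYY_MM_DDTHH_MM_SS_<nanos>Z convention)."""
    m = re.search(
        r"TIMESTAMP_(\d{4})_(\d{2})_(\d{2})T(\d{2})_(\d{2})_(\d{2})_(\d+)Z", dwh_path)
    if not m:
        raise ValueError(f"No timestamp found in {dwh_path!r}")
    year, month, day, hour, minute, second = map(int, m.groups()[:6])
    # Truncate nanoseconds to the microseconds datetime supports.
    micros = int(m.group(7)[:6].ljust(6, "0"))
    return datetime(year, month, day, hour, minute, second, micros,
                    tzinfo=timezone.utc)

latest = snapshot_time("/dwh/pipeline_DWH_TIMESTAMP_2026_03_20T10_08_02_021743767Z/")
print(latest.isoformat())  # 2026-03-20T10:08:02.021743+00:00
```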

Configuration Settings

| Parameter | Value | Default Value | Description |
|---|---|---|---|
| fhirdata.fhirFetchMode | FHIR_SEARCH | | |
| fhirdata.fhirServerUrl | http://isanteplus:8080/openmrs/ws/fhir2/R4 | | |
| fhirdata.dwhRootPrefix | /dwh/pipeline_DWH | | |
| fhirdata.generateParquetFiles | false | | |
| fhirdata.incrementalSchedule | `0 0 * * * *` | | |
| fhirdata.purgeSchedule | `0 30 * * * *` | | |
| fhirdata.numOfDwhSnapshotsToRetain | 2 | | |
| fhirdata.resourceList | Patient,Encounter,Observation,Condition,AllergyIntolerance,MedicationRequest,Practitioner,Group | | |
| fhirdata.numThreads | 1 | | |
| fhirdata.dbConfig | | | |
| fhirdata.viewDefinitionsDir | | | |
| fhirdata.sinkDbConfigPath | | | |
| fhirdata.fhirSinkPath | http://openhim-core:5001/SHR/fhir | | |
| fhirdata.sinkUserName | shr-pipeline | | |
| fhirdata.sinkPassword | instant101 | | |
| fhirdata.structureDefinitionsPath | classpath:/r4-us-core-definitions | | |
| fhirdata.fhirVersion | R4 | | |
| fhirdata.rowGroupSizeForParquetFiles | 33554432 | | |
| fhirdata.recursiveDepth | 1 | | |
| fhirdata.createParquetViews | false | | |
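
These controller settings are typically supplied in the controller's application configuration under the `fhirdata` prefix. Below is a minimal sketch assuming a Spring-style `application.yaml` (the exact file layout may differ in your deployment). Note that the schedules are 6-field Spring cron expressions whose first field is seconds, so `0 0 * * * *` fires at the top of every hour:

```yaml
fhirdata:
  fhirFetchMode: "FHIR_SEARCH"
  fhirServerUrl: "http://isanteplus:8080/openmrs/ws/fhir2/R4"
  dwhRootPrefix: "/dwh/pipeline_DWH"
  # Spring cron: second minute hour day-of-month month day-of-week
  incrementalSchedule: "0 0 * * * *"   # hourly, at minute 0, second 0
  purgeSchedule: "0 30 * * * *"        # hourly, at minute 30
  numOfDwhSnapshotsToRetain: 2
  resourceList: "Patient,Encounter,Observation,Condition,AllergyIntolerance,MedicationRequest,Practitioner,Group"
  fhirVersion: "R4"
```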

| Parameter | Value | Default Value | Description |
|---|---|---|---|
| checkPatientEndpoint | false | true | Whether to check the /Patient endpoint with a count query at start-up. |
| fhirFetchMode | FHIR_SEARCH | null | Mode through which the FHIR resources are fetched from the source FHIR server. Supported modes are: FHIR_SEARCH: reads resources through the FHIR Search API; --fhirServerUrl should be set in this mode. PARQUET: reads resources from input Parquet files; this is only intended for regenerating flat views or syncing resources with an external FHIR server. JSON: reads resources from input JSON files; --sourceJsonFilePattern should be set in this mode. NDJSON: reads resources from input NDJSON files; --sourceNdjsonFilePattern should be set in this mode. BULK_EXPORT: reads resources through the FHIR Bulk Export API; --fhirServerUrl should be set in this mode. HAPI_JDBC: reads resources through a direct JDBC connection to the database of a HAPI server; --fhirDatabaseConfigPath should be set in this mode. |
| fhirServerPassword | Admin123 | | FHIR source server BasicAuth password. |
| fhirServerUrl | http://isanteplus:8080/openmrs/ws/fhir2/R4 | | FHIR source server URL, e.g., http://localhost:8091/fhir. |
| fhirServerUserName | admin | | FHIR source server BasicAuth username. |
| fhirSinkPath | http://openhim-core:5001/SHR/fhir | | The path to the target generic FHIR store, or a GCP FHIR store with the format `projects/[\w-]+/locations/[\w-]+/datasets/[\w-]+/fhirStores/[\w-]+`, e.g., `projects/my-project/locations/us-central1/datasets/fhir_test/fhirStores/test`. |
| fhirVersion | R4 | null | The FHIR version to be used for the FHIR Context APIs. |
| outputParquetPath | /dwh/pipeline_DWH_TIMESTAMP_2026_03_20T11_16_35_231714655Z | | The base name for output Parquet files; for each resource, one fileset will be created. |
| resourceList | Patient,Encounter,Observation,Condition,AllergyIntolerance,MedicationRequest,Practitioner,Group | Patient,Encounter,Observation | Comma-separated list of resources to fetch, e.g., 'Patient,Encounter,Observation'. |
| rowGroupSizeForParquetFiles | 33554432 | 0 | The approximate size (bytes) of the row groups in Parquet files. When this size is reached, the content is flushed to disk. A larger value means more data for one column can fit into one big column chunk, which means better compression and faster IO/queries; on the downside, more memory is needed to hold the data before writing to files. The default value of 0 means the default row-group size of the Parquet writers is used. |
| runner | class org.apache.beam.runners.flink.FlinkRunner | null | The pipeline runner that will be used to execute the pipeline. For registered runners the class name can be specified; otherwise the fully qualified name needs to be specified. |
| sinkPassword | instant101 | | Sink BasicAuth password. |
| sinkUserName | shr-pipeline | | Sink BasicAuth username. |
| structureDefinitionsPath | classpath:/r4-us-core-definitions | | Directory containing the structure-definition files for any custom profiles that need to be supported. If it starts with `classpath:` then the classpath is searched, and the path should always start with `/`. Do not use this if custom profiles are not needed. Example: `classpath:/r4-us-core-definitions` is the classpath name under the resources folder of the module `extension-structure-definitions`. |
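
The source-server credentials above are used for HTTP Basic Auth. A minimal sketch of how a client could form the Authorization header for a FHIR search against `fhirServerUrl` (the URL, username, and password are the values from the table; no request is actually sent here):

```python
import base64
import urllib.request

FHIR_SERVER_URL = "http://isanteplus:8080/openmrs/ws/fhir2/R4"
USER, PASSWORD = "admin", "Admin123"  # fhirServerUserName / fhirServerPassword

# Basic auth header value is base64("user:password").
token = base64.b64encode(f"{USER}:{PASSWORD}".encode()).decode()
req = urllib.request.Request(
    f"{FHIR_SERVER_URL}/Patient?_count=100",  # _count mirrors batchSize=100
    headers={"Authorization": f"Basic {token}",
             "Accept": "application/fhir+json"},
)
print(req.get_full_url())
# Sending it would be: urllib.request.urlopen(req)
```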

| Parameter | Value | Default Value | Description |
|---|---|---|---|
| activePeriod | | | The active period with format 'DATE1_DATE2' or 'DATE1'. The first form declares the first date-time (non-inclusive) and last date-time (inclusive); the second form declares the active period to be from the given date-time (non-inclusive) until now. Resources outside the active period are only fetched if they are associated with Patients in the active period; all requested resources in the active period are fetched. The date format follows the dateTime format in the FHIR standard, without time zone: https://www.hl7.org/fhir/datatypes.html#dateTime. For example: --activePeriod=2020-11-10T00:00:00_2020-11-20. Note this feature implies fetching Patient resources that were active in the given period. The default empty string disables this feature, i.e., all requested resources are fetched. |
| batchSize | 100 | 100 | The number of resources to be fetched in one API call. For the JDBC mode, passing > 170 could result in HTTP 400 Bad Request. Note that by default the maximum bundle size for the OpenMRS FHIR module is 100. |
| cacheBundleForParquetWrites | false | false | This is an experimental feature intended for the Dataflow runner only. The purpose is to cache output Parquet records for each Beam bundle such that the DoFn is idempotent, or more precisely, can be retried for an incomplete bundle (Beam's bundle, not FHIR) without corrupting the Parquet output. |
| fhirDatabaseConfigPath | ../utils/hapi-postgres-config.json | ../utils/hapi-postgres-config.json | Path to the FHIR database config for JDBC mode; the default file (hapi-postgres-config.json) is for a HAPI server with a PostgreSQL database. There is also a sample file for an OpenMRS server with a MySQL database (dbz_event_to_fhir_config.json); the Debezium config can be ignored for batch. |
| fhirServerOAuthClientId | | | The `client_id` to be used in the OAuth Client Credentials flow when interacting with the FHIR server; see `fhirServerOAuthTokenEndpoint`. |
| fhirServerOAuthClientSecret | | | The `client_secret` to be used in the OAuth Client Credentials flow when interacting with the FHIR server; see `fhirServerOAuthTokenEndpoint`. |
| fhirServerOAuthTokenEndpoint | | | The `token_endpoint` to be used in the OAuth Client Credentials flow when interacting with the FHIR server. If set, `fhirServerOAuthClientId` and `fhirServerOAuthClientSecret` should also be set; in that case, the Basic Auth username/password is ignored. |
| jdbcFetchSize | 1000 | 1000 | This flag is used in the JDBC mode. In the context of an OpenMRS source, this is the size of each ID chunk; in the context of a HAPI source, this is the size of each database query. Setting high values (~10000 for OpenMRS, ~1000 for HAPI) will yield faster query execution. |
| jdbcInitialPoolSize | 3 | 3 | DEPRECATED! This is ignored; by default 3 connections are used initially. |
| jdbcMaxPoolSize | 50 | 50 | JDBC maximum pool size. |
| outputParquetViewPath | | | If set, flat Parquet files corresponding to the input ViewDefinitions are created and written in this directory. Each view will be in a sub-directory based on its `name` field. |
| parquetInputDwhRoot | | | The path to the data-warehouse directory of Parquet files to be read. The content of this directory is expected to have the same structure used in the output data-warehouse, i.e., one directory per resource type. If this is enabled, --fhirServerUrl and --fhirDatabaseConfigPath should be disabled because input resources are read from Parquet files. This is for example useful for regenerating the views. [EXPERIMENTAL] |
| recreateSinkTables | false | false | If true, drops the old view tables first and recreates them; otherwise, creates tables only if they do not exist. |
| recursiveDepth | 1 | 1 | The maximum depth for traversing StructureDefinitions in Parquet schema generation (if it is non-positive, the default 1 will be used). Note that in most cases the default of 1 is sufficient, and increasing it can result in a significantly larger schema and more complexity. For details see: https://github.com/FHIR/sql-on-fhir/blob/master/sql-on-fhir.md#recursive-structures |
| secondsToFlushParquetFiles | 600 | 600 | The number of seconds after which records are flushed into Parquet/text files; use 0 to disable (note this may have undesired memory implications). |
| since | | | Fetch only FHIR resources that were updated after the given timestamp. The date format follows the dateTime format in the FHIR standard, without time zone: https://www.hl7.org/fhir/datatypes.html#dateTime. This feature is currently implemented only for the HAPI JDBC mode. |
| sinkDbConfigPath | | | Path to the sink database config; if not set, no sink DB is used. If viewDefinitionsDir is set, the output tables will be the generated views (the `name` field value will be used as the table name); if not, one table per resource type is created with the JSON content of a resource and its `id` column for each row. |
| sourceJsonFilePatternList | | | Comma-separated list of file patterns for input JSON files, e.g., 'PATH1/*,PATH2/*'. Each file should be one Bundle resource. [EXPERIMENTAL] |
| sourceNdjsonFilePatternList | | | Comma-separated list of file patterns for input NDJSON files, e.g., 'PATH1/*,PATH2/*'. Each file contains FHIR resources serialized with no whitespace, one resource per line. |
| tempLocation | null | null | A pipeline-level default location for storing temporary files. |
| viewDefinitionsDir | | | The directory from which SQL-on-FHIR-v2 ViewDefinition JSON files are read. Note: for the incremental run, this directory must contain all the ViewDefinitions used to create views in both data-warehouses! |
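
For reference, a viewDefinitionsDir entry is a JSON file following the SQL-on-FHIR-v2 ViewDefinition shape. The sketch below is a minimal illustrative example; the `patient_flat` name and the selected columns are hypothetical, not taken from this deployment:

```json
{
  "name": "patient_flat",
  "resource": "Patient",
  "select": [
    {
      "column": [
        {"name": "id", "path": "getResourceKey()"},
        {"name": "gender", "path": "gender"},
        {"name": "birth_date", "path": "birthDate"}
      ]
    }
  ]
}
```

With sinkDbConfigPath set, the `name` field ("patient_flat" here) becomes the table name in the sink database, as described in the table above.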

Pipeline Metrics

Fetch the latest pipeline metrics