Ingestion from Local File
Overview
File ingestion from a local source provides a reliable method for processing files that are always available.
Configuration Options
The default root path for ingesting local files is /import/.
Depending on the specific file to be ingested, an appropriate suffix is added to this path.
A complete list of ingestion directories can be found here.
If a file is ingested from an incorrect folder, it will be moved to the archive directory, and a Processing Skipped event will be raised.
If a corrupted file is ingested, the file will be skipped and moved to the failed directory.
A Processing Failed event will also be triggered.
The configuration property timestamp-archived-and-failed-files (see the Configuration Overriding section below) is set to true by default, so every file that ends up in the archive or failed directory has a timestamp appended to its file name (e.g., file_20251020_143530.xml). This prevents files with the same name from overwriting each other, for example when the same file is ingested multiple times. To disable the timestamp suffix, set the property to false.
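As a minimal sketch, the timestamp suffix could be switched off with an override like the one below. The full property path is an assumption: it reuses the ipf.csm-reachability.default-file-ingestion prefix that this document shows for files-directory, and it is not confirmed here for this property.

```hocon
# Assumed path: same prefix as the files-directory override in this document
ipf.csm-reachability.default-file-ingestion.timestamp-archived-and-failed-files = false
```

With the suffix disabled, a later file with the same name may overwrite an earlier one in the archive or failed directory.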
Configuration Overriding
The default configuration works without any changes.
However, the file-ingestion settings can be overridden if needed.
Example of data ingestion configuration:
default-file-ingestion {
  # path which should be overridden
  files-directory = "/import"
  initial-delay = 5s
  interval = 30s
  timestamp-archived-and-failed-files = true
}
Example of an overridden default value:
ipf.csm-reachability.default-file-ingestion.files-directory = /import/overriden-path
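HOCON treats a dotted path as shorthand for nested objects, so several settings can be overridden together in one block. The sketch below assumes that initial-delay and interval sit under the same prefix as files-directory; the delay and interval values are illustrative only.

```hocon
ipf.csm-reachability.default-file-ingestion {
  # illustrative values, not recommendations
  files-directory = "/import/overriden-path"
  initial-delay = 10s
  interval = 60s
}
```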
Deprecating directory mapping from MongoDB directory-mapping collection
From csm-reachability-data-ingestion:3.17.0 onwards, directory mapping from the MongoDB directory-mapping collection is deprecated. Directory mappings are instead defined in the ipf.file-ingestion.directory-mappings HOCON configuration.
From that version on, a disabled file ingester must not have a directoryId mapped to it.
Migration steps
- Back up all data from the Mongo directory-mapping collection.
- For each custom ingester, add the related Mongo document data from the directory-mappings collection to the ingester's .conf file.
  - HOCON example:
    # added directory mapping in case of participant-file-handling module usage
    ipf.file-ingestion.directory-mappings += {
      "directory-id": "TIPS", # (1)
      "job-name": "TIPS Participant" # (2)
    }
    ipf.file-ingestion.directory-mappings += {
      "directory-id": "RT1",
      "job-name": "RT1 Participant"
    }
    ipf.file-ingestion.directory-mappings += {
      "directory-id": "STEP2 SCT",
      "job-name": "STEP2 SCT Participant"
    }
    ipf.file-ingestion.directory-mappings += {
      "directory-id": "SIC",
      "job-name": "SIC Participant"
    }
    (1) directoryID and (2) jobName have to match the document for that directoryID in the Mongo directory-mapping collection. This example is for the participant-file-handling module; make sure to add the same mappings for other custom ingesters.
- Restart the application and check that the log contains no warning with the message:
  Missing required HOCON configuration: ipf.file-ingestion.directory-mappings.
- Make sure that the log doesn't contain warnings like:
  - Mongo directory-mappings documents value doesn’t exist in Hocon configuration.
  - Mismatch found for Mongo directory-mappings documents value and Hocon configuration.
- Delete the Mongo directory-mapping collection once the previous steps are fulfilled.
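For a custom ingester, the mapping added during migration might look like the sketch below. The directory ID and job name are hypothetical placeholders; the real values come from the corresponding document in the Mongo collection.

```hocon
# Hypothetical custom ingester mapping; replace both values with those
# stored in the corresponding Mongo document
ipf.file-ingestion.directory-mappings += {
  "directory-id": "MY-DIRECTORY"
  "job-name": "My Custom Job"
}
```

In HOCON, += appends an element to a list, so each ingester's .conf file should be able to contribute its own entries without redefining mappings added elsewhere.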