Core - Changes & Fixes

This page covers core changes and fixes provided in release IPF-2025.2.2.

IPF File Poller

Added

  • New configuration options at the polling job level:

    • ipf.file-poller.pollers[N].changed-file-job-reschedule-policy: Controls how "changed files" are handled, i.e. files with the same file path as a previously processed file but different content or metadata (for example, files that were still uploading during the previous poll). The available options are NEVER, ALWAYS, and IGNORE_TRIGGERED. The default option is ALWAYS.

    • ipf.file-poller.pollers[N].file-processing-parallelism: Sets the maximum number of retrieved files that can have their attributes mapped (reading file metadata and generating hashes) concurrently. The default value is 128.

    • ipf.file-poller.pollers[N].file-processing-buffer: Provides backpressure control by capping the number of retrieved files (after mapping) sent in each batch for the database lookup and processing that determines whether each file is new, changed, or already processed. The default value is 500.

    • ipf.file-poller.pollers[N].file-content-hash-buffer-bytes: Sets the buffer size (in bytes) for the BufferedInputStream used when hashing the content of a retrieved file. The default value is 8192.

  • jobStatus and lastUpdated fields added to SchedulerJobEntity. jobStatus indicates whether a job scheduled by the poller for a "changed" file has been cancelled under the "changed" file logic. This prevents a scenario observed when testing the previous logic: if a file was retrieved twice and determined to be a "changed" file the second time, more than one scheduler job entity could be created for it. Querying the collection then returned more than one entry, which broke the single-result return contract (Mono<>) and threw an exception.
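To illustrate the batch lookup that file-processing-buffer bounds, here is a minimal sketch of the new / changed / already-processed decision. It is not the IPF implementation: the class ChangedFileSketch, the FileState enum, and both methods are hypothetical names, the in-memory map stands in for the database lookup, and only the content hash is compared (the poller also considers metadata).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from IPF.
final class ChangedFileSketch {

    enum FileState { NEW, CHANGED, ALREADY_PROCESSED }

    private ChangedFileSketch() {}

    /**
     * Classifies a retrieved file against previously processed files.
     * The map (full file path -> content hash) stands in for the database.
     */
    static FileState classify(String filePath, String contentHash,
                              Map<String, String> processedHashesByPath) {
        String knownHash = processedHashesByPath.get(filePath);
        if (knownHash == null) {
            return FileState.NEW; // path never seen before
        }
        return knownHash.equals(contentHash)
                ? FileState.ALREADY_PROCESSED // same path, same content
                : FileState.CHANGED;          // same path, different content
    }

    /** Splits mapped files into batches of at most bufferSize items each. */
    static <T> List<List<T>> batches(List<T> items, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < items.size(); start += bufferSize) {
            int end = Math.min(start + bufferSize, items.size());
            result.add(new ArrayList<>(items.subList(start, end)));
        }
        return result;
    }
}
```

With a buffer of 500, a poll that retrieves 1,201 files would be looked up in three batches of 500, 500, and 201 files under this sketch.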

Changed

  • The content of retrieved files is now hashed using a buffered input stream with a configurable buffer size, rather than loading the entire file into memory.

  • The id field in the FileEntity collection now stores the full file path rather than just the file name. Previously, files with the same name in different folders (retrieved by different polling jobs, for example) were treated as a single file in the database, so the second file was wrongly marked as "changed". The @Id field in the FileEntity document entity has been renamed from fileName to filePath to reflect this update.
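The buffered hashing described in the first item above can be sketched as follows. This is an illustration under stated assumptions rather than the poller's actual code: the class BufferedFileHasher and its methods are hypothetical, and SHA-256 is assumed because the release notes do not name the hash algorithm. The bufferBytes parameter plays the role of file-content-hash-buffer-bytes.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; not part of the IPF API.
final class BufferedFileHasher {

    private BufferedFileHasher() {}

    /** Hashes a file by streaming it, so the whole file is never held in memory. */
    static String hashFile(Path file, int bufferBytes) {
        try (InputStream raw = Files.newInputStream(file)) {
            return hash(raw, bufferBytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hashes a stream in chunks of bufferBytes and returns the hex digest. */
    static String hash(InputStream source, int bufferBytes) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256"); // assumed algorithm
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        try {
            InputStream in = new BufferedInputStream(source, bufferBytes);
            byte[] chunk = new byte[bufferBytes];
            int read;
            // Feed the digest one chunk at a time until end of stream.
            while ((read = in.read(chunk)) != -1) {
                digest.update(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }
}
```

Calling BufferedFileHasher.hashFile(path, 8192) mirrors the default buffer size. The buffer size affects only memory use and I/O behaviour; the resulting hash is the same for any positive value.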

Fixed

  • FilePollerActor now recovers automatically if an exception is thrown while scheduling the poller jobs at startup.