Concepts
Introduction
At a high level, the Debulker transforms an incoming file into individual items in the Component Store. This process is started either by polling for the file or by receiving a notification that a new file is available for debulking. The file can then be streamed and pushed through the appropriate Splitter, which publishes a stream of events containing smaller chunks (components) that are saved in long-term storage (the Component Store).
Key Concepts
The following are key concepts, explored in more detail within the linked features section. They are explained here to show how these concepts and features relate to each other.
Debulking Configuration
Every type of bulk file to be processed requires a specific configuration. This tells the Debulker what file format to expect (e.g. XML, JSON) and, crucially, provides a component hierarchy describing the tree structure of that specific type of file. The hierarchy tells the Debulker how to break the file apart, splitting it into its component parts.
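The shape of such a configuration can be sketched as a file format paired with a tree of component names. This is a minimal illustrative model only; the type and field names are assumptions, not the actual Debulker configuration API.

```java
import java.util.List;

// Hypothetical sketch of a debulking configuration: a file format plus a
// component hierarchy. Names here are illustrative, not the real IPF types.
public class DebulkConfigSketch {

    // A node in the component hierarchy: an element name and its children.
    record ComponentNode(String name, List<ComponentNode> children) {
        static ComponentNode leaf(String name) {
            return new ComponentNode(name, List.of());
        }
    }

    // The configuration pairs the expected file format with the hierarchy
    // used to split files of that type.
    record DebulkingConfig(String fileFormat, ComponentNode hierarchy) {}

    public static void main(String[] args) {
        // e.g. an XML bulk file structured as File -> Batch -> Payment
        DebulkingConfig config = new DebulkingConfig(
                "XML",
                new ComponentNode("File",
                        List.of(new ComponentNode("Batch",
                                List.of(ComponentNode.leaf("Payment"))))));
        System.out.println(config.fileFormat() + " hierarchy root: "
                + config.hierarchy().name());
    }
}
```

Given this hierarchy, a splitter knows that each Payment element inside each Batch is an individual component to extract.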
File Notification
There are two ways the Debulker can learn that a file is ready for processing. The first is via a notification: an API is provided which is essentially a receive connector. The Debulker comes with a Kafka implementation of this receive connector, so another process or script can send a Kafka event to a specific topic to signal that a new file is ready for processing.
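Conceptually, the notifying process just publishes a small event naming the file. The sketch below only builds such a payload; the topic name and event fields are assumptions, not the actual Debulker notification schema, and the Kafka publish itself is indicated in a comment.

```java
// Hypothetical sketch of a file-ready notification event. The topic name and
// JSON fields are assumptions, not the real Debulker notification contract.
public class FileNotificationSketch {

    // Build a JSON payload announcing that a new file is ready for debulking.
    static String newFileEvent(String fileName, String location) {
        return "{\"fileName\":\"" + fileName + "\",\"location\":\"" + location + "\"}";
    }

    public static void main(String[] args) {
        String topic = "debulker-file-notifications"; // assumed topic name
        String event = newFileEvent("payments-2024-01.xml", "s3://bulk-files/in/");
        // A Kafka producer would publish this to the topic, along the lines of:
        //   producer.send(new ProducerRecord<>(topic, event));
        System.out.println(topic + " <- " + event);
    }
}
```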
File Polling
The second way for a file to be fed into Debulker processing is to configure a File Poller (provided as core IPF functionality), which polls at a defined frequency for new files. The File Poller can also be used to sweep up missed files: you could configure it to look for files not yet processed, which is useful where file notifications cannot be sent reliably.
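A single polling pass amounts to listing a directory and keeping only files not seen on earlier passes. This is a minimal sketch of that idea; the real File Poller is core IPF functionality with its own configuration and scheduling.

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.*;
import java.util.stream.*;

// Hypothetical file-poller sketch: each pass returns only files that have not
// been seen before. The real File Poller also runs on a configured schedule.
public class FilePollerSketch {

    private final Set<Path> processed = new HashSet<>();

    // One polling pass over the directory.
    List<Path> poll(Path directory) throws IOException {
        try (Stream<Path> files = Files.list(directory)) {
            return files.filter(Files::isRegularFile)
                        .filter(processed::add) // Set.add is false for already-seen files
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("bulk-in");
        Files.writeString(dir.resolve("file1.xml"), "<File/>");

        FilePollerSketch poller = new FilePollerSketch();
        System.out.println("first pass: " + poller.poll(dir).size());  // picks up file1.xml
        System.out.println("second pass: " + poller.poll(dir).size()); // nothing new
    }
}
```

In practice the pass would be driven by a scheduler at the configured frequency, and "already processed" would be tracked durably rather than in memory.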
Input Stream
The File Manager provides a pluggable component whose purpose is to take a FileDefinition and return an InputStream. This decouples the Debulker from the underlying details of file storage and allows a range of storage options (e.g. the local file system or an S3 bucket).
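The pluggable contract can be pictured as a one-method interface from a file definition to a stream. The interface and the FileDefinition type below are stand-ins for illustration, not the actual IPF types.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.*;

// Hypothetical sketch of the pluggable input-stream component. FileDefinition
// here is a stand-in record, not the actual IPF type.
public class InputStreamProviderSketch {

    record FileDefinition(String location) {}

    interface FileInputStreamProvider {
        InputStream open(FileDefinition definition) throws IOException;
    }

    // Local-filesystem implementation. An S3-backed implementation would
    // instead fetch the object and return its content stream, behind the
    // same interface.
    static class LocalFileProvider implements FileInputStreamProvider {
        public InputStream open(FileDefinition definition) throws IOException {
            return Files.newInputStream(Path.of(definition.location()));
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("bulk", ".xml");
        Files.writeString(file, "<File/>");

        FileInputStreamProvider provider = new LocalFileProvider();
        try (InputStream in = provider.open(new FileDefinition(file.toString()))) {
            System.out.println(new String(in.readAllBytes()));
        }
    }
}
```

Because downstream code only ever sees an InputStream, swapping local storage for S3 requires no changes to the splitting pipeline.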
File Processing Uniqueness
We typically want files to be processed once and only once, so the Debulker can be configured with a duplicate check. The check is based on the entire contents of the file and stops the processing of any file that has been seen before.
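A whole-contents duplicate check is commonly implemented by hashing the file and remembering the hashes already seen. The sketch below assumes a SHA-256 hash and in-memory tracking; the Debulker's actual hashing and storage details are not shown here.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.HexFormat;
import java.util.Set;

// Hypothetical duplicate-check sketch: hash the entire file contents and
// reject anything whose hash has been seen before. The real check's hashing
// and persistence details may differ.
public class DuplicateCheckSketch {

    private final Set<String> seenHashes = new HashSet<>();
    private final MessageDigest digest;

    DuplicateCheckSketch() throws NoSuchAlgorithmException {
        this.digest = MessageDigest.getInstance("SHA-256");
    }

    // Returns true the first time this exact content is seen, false after.
    boolean accept(byte[] fileContents) {
        String hash = HexFormat.of().formatHex(digest.digest(fileContents));
        return seenHashes.add(hash);
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        DuplicateCheckSketch check = new DuplicateCheckSketch();
        byte[] file = "<File><Payment/></File>".getBytes();
        System.out.println("first:  " + check.accept(file));  // process it
        System.out.println("second: " + check.accept(file));  // duplicate, skip
    }
}
```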
Splitter
A Splitter is a pluggable component where most of the Debulker's work is done. The Splitter takes a stream of data (content from a large file) and publishes a stream of events containing smaller chunks (components).
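The essential shape of a splitter is: read the input incrementally and emit one event per component, so the whole file never needs to fit in memory. The line-based splitting below is a deliberate simplification; a real Splitter is driven by the configured component hierarchy (e.g. XML or JSON structure), not lines.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;

// Hypothetical splitter sketch: stream the input and emit each component to a
// consumer as it is read. The real Splitter splits by the configured
// component hierarchy rather than by lines.
public class SplitterSketch {

    static int split(Reader input, Consumer<String> publish) throws Exception {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(input)) {
            String line;
            while ((line = reader.readLine()) != null) {
                publish.accept(line); // in the Debulker, an event per component
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        String bulkFile = "payment-1\npayment-2\npayment-3";
        int components = split(new StringReader(bulkFile),
                component -> System.out.println("published: " + component));
        System.out.println("components: " + components);
    }
}
```

The consumer callback stands in for publishing an event; in the Debulker those events carry the components that end up in the Component Store.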
Component Store
The File Component Store is a pluggable component and represents the 'place' where payment components are stored. Typically this will be a Mongo-backed store, but it could equally be swapped for another implementation.
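Pluggability means the rest of the system depends only on a store interface, with the backing technology hidden behind it. The interface and method names below are illustrative assumptions; a production implementation would typically be Mongo-backed behind the same kind of contract.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical component-store sketch: an interface the pipeline depends on,
// with an in-memory implementation. A Mongo-backed implementation would
// satisfy the same contract. Names here are assumptions, not IPF types.
public class ComponentStoreSketch {

    interface ComponentStore {
        void save(String componentId, String payload);
        Optional<String> find(String componentId);
    }

    static class InMemoryComponentStore implements ComponentStore {
        private final Map<String, String> components = new HashMap<>();

        public void save(String componentId, String payload) {
            components.put(componentId, payload);
        }

        public Optional<String> find(String componentId) {
            return Optional.ofNullable(components.get(componentId));
        }
    }

    public static void main(String[] args) {
        ComponentStore store = new InMemoryComponentStore();
        store.save("payment-1", "<Payment id=\"1\"/>");
        System.out.println(store.find("payment-1").orElse("missing"));
    }
}
```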
Client Processing Notification
This is enabled using a pluggable component which sends notifications to a client, indicating that components generated by the Debulker are ready for processing.
Housekeeping
Housekeeping functionality exists to remove components which have been processed by the client flows.
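The core of housekeeping is removing components whose processing status indicates the client flows are done with them. This sketch assumes a simple per-component status flag; the real retention rules are configuration-driven and not shown here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical housekeeping sketch: remove components the client flows have
// already processed. The status model here is an assumption for illustration.
public class HousekeepingSketch {

    enum Status { PENDING, PROCESSED }

    // Remove all processed components; return how many were removed.
    static int removeProcessed(Map<String, Status> components) {
        int before = components.size();
        components.values().removeIf(status -> status == Status.PROCESSED);
        return before - components.size();
    }

    public static void main(String[] args) {
        Map<String, Status> components = new HashMap<>();
        components.put("payment-1", Status.PROCESSED);
        components.put("payment-2", Status.PENDING);
        System.out.println("removed: " + removeProcessed(components));
        System.out.println("remaining: " + components.size());
    }
}
```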
Orchestration
The debulking process is managed via an MPS flow.
You can view a high-level set of documentation for this flow here.