Persistence
Supported Databases
Currently, ipf-ods supports only MongoDB, but it can also work with Azure Cosmos DB, since Cosmos DB implements the MongoDB wire protocol.
CosmosDB
ipf.mongodb.url = "<<cosmos-db-url>>"
//You may need this
spring.data.mongodb.database = "ipf"
//Or even the full url
spring.data.mongodb.uri = "<<cosmos-db-url>>"
Example Cosmos DB url
mongodb://ipfcosmosdbdemo:vooQt7iRpI0WyUVC5IIKbq7RT058xJCKdtnjdbOhFnQK2QDLiaWeCrN2TDIEzeBLwLFTEHT47iSSKUowEtrmWw==@ipfcosmosdbdemo.mongo.cosmos.azure.com:10255/ipf?ssl=true&replicaSet=globaldb&retrywrites=false&maxIdleTimeMS=120000&appName=@ipfcosmosdbdemo@
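As a quick sanity check, the options carried by a connection string like the one above can be inspected before it is passed to the application. The sketch below uses only `java.net.URI`. The class name `CosmosUriOptions` and the placeholder account and key are assumptions for illustration and are not part of ipf-ods.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

final class CosmosUriOptions {

    // Splits the query part of a connection string into option name/value pairs.
    static Map<String, String> queryOptions(String connectionString) {
        URI uri = URI.create(connectionString);
        Map<String, String> options = new LinkedHashMap<>();
        String query = uri.getRawQuery();
        if (query == null) {
            return options;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                options.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return options;
    }

    public static void main(String[] args) {
        // Placeholder account and key, not real credentials.
        String url = "mongodb://account:key@account.mongo.cosmos.azure.com:10255/ipf"
                + "?ssl=true&replicaSet=globaldb&retrywrites=false&maxIdleTimeMS=120000";
        Map<String, String> options = queryOptions(url);
        System.out.println("ssl=" + options.get("ssl"));
        System.out.println("retrywrites=" + options.get("retrywrites"));
    }
}
```

A check like this could flag a connection string that is missing `ssl=true` or `retrywrites=false`, both of which appear in the example URL above.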
Collections
| Name | Description |
|---|---|
|
Contains ISO 20022 payment objects. |
|
Contains IPF-defined and client-specific PDS types. |
|
Data related to the processing of a payment. These documents have an objectType to indicate the type of processing data, such as |
|
Client-specific data related to a payment. |
|
High-level type representing a payment, built from the raw data in the other collections. |
|
Meta-data for a given unitOfWorkId, built from ingested process flow events. |
|
A daily entry documenting the details of a purge execution. Created during ODS-Purging |
|
A daily entry documenting the number of unitOfWorks that should expire in n days. |
|
Tracks the archive candidate selection state. Created during Candidate Selection |
Indexing
By default, ingestion and inquiry applications will attempt to create all the required indexes on startup. To disable this behaviour and create the indexes manually, provide the following configuration:
ods.persistence.indexing.enabled = false
Although the wire protocols are the same, the underlying database engines differ between MongoDB and Cosmos DB, which affects the available indexing options. ODS applications therefore create a different set of indexes depending on which database they talk to, with MongoDB index creation mode being the default. To change the mode to CosmosDB, provide the following configuration:
ods.persistence.indexing.mode = cosmosdb
Indexes declared by the ODS Ingestion App
Indexes declared in MongoDB mode
| Index definition | Comments |
|---|---|
|
A single report exists per day, and is retrieved by the executionDate |
| Index definition | Comments |
|---|---|
|
Used in upserts in order to de-duplicate process objects as they’re ingested. |
| Index definition | Comments |
|---|---|
|
Used in upserts in order to de-duplicate payment objects as they’re ingested. |
| Index definition | Comments |
|---|---|
|
Used in upserts in order to de-duplicate custom objects as they’re ingested. |
| Index definition | Comments |
|---|---|
|
PDS objects are retrieved by unitOfWorkId directly |
|
Each PDS object version is stored as a single document, where sequenceNumber identifies its version. |
| Index definition | Comments |
|---|---|
|
Summaries are uniquely identified by unitOfWorkIds. |
|
Updated each time a summary changes, and used to determine candidates for purging |
| Index definition | Comments |
|---|---|
|
UnitOfWorks are uniquely identified by unitOfWorkIds. Used in upserts in order to de-duplicate UnitOfWork objects as they’re ingested |
|
UnitOfWorks startedAt is used to determine candidates for purging. |
|
UnitOfWorks are eligible and retrieved for archiving by the finishedAt field. UnitOfWorks finishedAt is used to determine candidates for purging. |
|
UnitOfWorks archivedAt is used to determine candidates for purging. |
|
Archiving can be configured to delete only units of work with a defined journey type |
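Several of the ingestion indexes above support upserts that de-duplicate objects as they are ingested. As a rough in-memory analogy only (no database or driver involved), the sketch below treats a `Map` keyed by `unitOfWorkId` as a stand-in for a collection with a unique index on that field. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class UpsertDedupSketch {

    // Stand-in for a collection with a unique index on unitOfWorkId.
    private final Map<String, String> unitOfWorksById = new HashMap<>();

    // Inserts the document if the id is new, otherwise replaces the existing one.
    void upsert(String unitOfWorkId, String document) {
        unitOfWorksById.put(unitOfWorkId, document);
    }

    int count() {
        return unitOfWorksById.size();
    }

    String find(String unitOfWorkId) {
        return unitOfWorksById.get(unitOfWorkId);
    }

    public static void main(String[] args) {
        UpsertDedupSketch store = new UpsertDedupSketch();
        store.upsert("uow-1", "{\"state\":\"started\"}");
        // Re-delivery of the same unit of work does not create a second entry.
        store.upsert("uow-1", "{\"state\":\"finished\"}");
        System.out.println(store.count());
        System.out.println(store.find("uow-1"));
    }
}
```

The point of the analogy is that repeated ingestion of the same key leaves a single entry, which is the behaviour a unique index combined with an upsert is meant to give.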
Indexes declared in CosmosDB mode
| Index definition | Comments |
|---|---|
|
A single report exists per day, and is retrieved by the executionDate |
| Index definition | Comments |
|---|---|
|
Used in upserts in order to de-duplicate process objects as they’re ingested. |
| Index definition | Comments |
|---|---|
|
Used in upserts in order to de-duplicate payment objects as they’re ingested. |
| Index definition | Comments |
|---|---|
|
Used in upserts in order to de-duplicate custom objects as they’re ingested. |
| Index definition | Comments |
|---|---|
|
PDS objects are retrieved by unitOfWorkId directly |
|
Each PDS object version is stored as a single document, where sequenceNumber identifies its version. |
| Index definition | Comments |
|---|---|
|
Summaries are uniquely identified by unitOfWorkIds. |
|
Updated each time a summary changes, and used to determine candidates for purging |
| Index definition | Comments |
|---|---|
|
UnitOfWorks are uniquely identified by unitOfWorkIds. Used in upserts in order to de-duplicate UnitOfWork objects as they’re ingested |
|
UnitOfWorks startedAt is used to determine candidates for purging. |
|
UnitOfWorks are eligible and retrieved for archiving by the finishedAt field. UnitOfWorks finishedAt is used to determine candidates for purging. |
|
UnitOfWorks archivedAt is used to determine candidates for purging. |
|
Archiving can be configured to delete only units of work with a defined journey type |
Indexes declared by the ODS Inquiry App
Indexes declared in MongoDB mode
| Index definition | Comments |
|---|---|
|
Highly selective, we would expect only a few documents for any given value. |
|
Likely unique and therefore highly selective, we would expect only a single document for any given value. |
|
Highly selective, we would expect only a few documents for any given value. |
|
Highly selective, we would expect only a few documents for any given value. |
|
Will be used in ranged queries, where the upper or lower bound could be missing. If both bounds are present the selectivity could be very high for small enough ranges. If the range is large, or a bound is missing then the selectivity could be very low. |
| Index definition | Comments |
|---|---|
|
Unique per unitOfWorkId and possibly also unique globally so offers high selectivity. Applicable to all process object types |
|
High selectivity: depending on the implementation, a value should identify tens to hundreds of objects. Applicable to all process object types |
|
High selectivity, similar to unitOfWorkId - will usually hold either a unitOfWorkId or a flow ID. Applicable to all process object types |
|
Likely unique and therefore highly selective, in most cases we would expect only a few objects for a given value. There will, however, be many process objects with the value 'UNKNOWN' which can give low selectivity if this is used as a search value. Applicable to all process object types |
|
Will be used in ranged queries, where the upper or lower bound could be missing. If both bounds are present the selectivity could be very high for small enough ranges. If the range is large, or a bound is missing then the selectivity could be very low. Applicable to all process object types |
|
Highly selective, there will exist a small number (depending on implementation, tens to low hundreds) of process flow events for an entityId. Only applicable to process flow event process objects |
| Index definition | Comments |
|---|---|
|
Likely unique and therefore highly selective, we would expect only a single document for any given value. May eventually be removed since the clientRequestId is also present in the alternativeIds.value index |
|
Links a summary to the parent/related summary. Many summaries could link to the same "parent" summary, e.g. many batches link to a single bulk, or many payments link to a single batch, which means a summary can have [0..1] parents and [0..n] children. |
|
Values should be highly selective; if there’s any duplication, results should be small enough to be sorted in-memory |
|
Will be used in ranged queries, where the upper or lower bound could be missing. If both bounds are present the selectivity could be very high for small enough ranges. If the range is large, or a bound is missing then the selectivity could be very low. |
|
Will be used for ODS purging, querying for values less than a given lastUpdated. Likely to be many summaries with a similar timestamp, low selectivity |
|
Likely unique and therefore highly selective, we would expect a single summary for a given value |
|
Likely unique and therefore highly selective, we would expect a single summary for a given value |
|
Likely unique and therefore highly selective, we would expect a single summary for a given value |
|
Likely unique and therefore highly selective, we would expect a single summary for a given value |
|
Likely unique and therefore highly selective, we would expect a single summary for a given value |
|
Likely unique and therefore highly selective, we would expect a single summary for a given value |
|
Likely unique and therefore highly selective, we would expect a single summary for a given value |
|
Will usually have a high selectivity (small number of payments for a single account for example) but in some cases (bulk) accounts may have a very large number of outgoing payments and therefore selectivity will be lower |
|
Would likely have low selectivity, for example many outgoing payments will have the same bic value |
|
Probably highly selective, but in some cases it may have low selectivity (bulk?) |
|
Will usually have a high selectivity (small number of payments for a single account for example) but in some cases (bulk) accounts may have a very large number of outgoing payments and therefore selectivity will be lower |
|
Would likely have low selectivity, for example many incoming payments will have the same bic value |
|
Probably highly selective, but in some cases it may have low selectivity (bulk?) |
|
For outbound recalls, this represents the client bank; for inbound recalls, this represents the other bank. Selectivity is probably on the low end |
|
For outbound recalls, this represents the other bank; for inbound recalls, this represents the client bank. Selectivity is probably on the low end |
|
Selectivity is probably on the low-end |
|
Selectivity is probably on the low-end |
|
Selectivity is probably on the low-end |
|
Will usually have a high selectivity (small number of payments for a single account for example) but in some cases (bulk) accounts may have a very large number of outgoing payments and therefore selectivity will be lower |
|
Will usually have a high selectivity (small number of payments for a single account for example) but in some cases (bulk) accounts may have a very large number of outgoing payments and therefore selectivity will be lower |
|
Will be used in ranged queries, where the upper or lower bound could be missing. If both bounds are present the selectivity could be very high for small enough ranges. If the range is large, or a bound is missing then the selectivity could be very low. |
|
Will be used in ranged queries, where the upper or lower bound could be missing. If both bounds are present the selectivity could be very high for small enough ranges. If the range is large, or a bound is missing then the selectivity could be very low. |
|
Will be used in ranged queries, where the upper or lower bound could be missing. If both bounds are present the selectivity could be very high for small enough ranges. If the range is large, or a bound is missing then the selectivity could be very low. |
| Index definition | Comments |
|---|---|
|
Likely unique and therefore highly selective, we would expect only a single document for any given value. |
|
For instant payments, likely to be highly selective, we would expect only a few documents for a given value. For bulk payments there could be tens of thousands of documents for some values. |
|
For instant payments, likely to be highly selective, we would expect only a few documents for a given value. For bulk payments there could be tens of thousands of documents for some values. |
|
Highly selective, we would expect only a few documents for any given value. |
|
Highly selective, we would expect only a few documents for any given value. |
|
Will be used in ranged queries, where the upper or lower bound could be missing. If both bounds are present the selectivity could be very high for small enough ranges. If the range is large, or a bound is missing then the selectivity could be very low. |
|
For instant payments, likely unique and therefore highly selective, we would expect only a single document for any given value. For bulk payments there could be tens of thousands of documents for a single instruction. |
|
Highly selective, we would expect only a few documents for any given value. |
|
For instant payments, likely unique and therefore highly selective, we would expect only a single document for any given value. For bulk payments there could be tens of thousands of documents for a single instruction. |
|
Likely unique and therefore highly selective, we would expect only a single document for any given value. |
|
Likely unique and therefore highly selective, we would expect only a single document for any given value. |
|
Likely unique and therefore highly selective, we would expect only a single document for any given value. |
|
Likely unique and therefore highly selective, we would expect only a single document for any given value. |
|
Likely unique and therefore highly selective, we would expect only a single document for any given value. |
|
Likely unique and therefore highly selective, we would expect only a single document for any given value. |
|
Likely unique and therefore highly selective, we would expect only a single document for any given value. |
|
Likely unique and therefore highly selective, we would expect only a single document for any given value. |
|
Likely unique and therefore highly selective, we would expect only a single document for any given value. |
|
For instant payments, probably highly selective, we would expect only a few documents for any given value. For bulk payments there could be tens of thousands of documents for a single instruction. Field value will be lower case, and searches will use a starts-with regex. |
|
For instant payments, probably highly selective, we would expect only a few documents for any given value. For bulk payments there could be tens of thousands of documents for a single instruction. Field value will be lower case, and searches will use a starts-with regex. |
|
For instant payments, probably highly selective, we would expect only a few documents for any given value. For bulk payments there could be tens of thousands of documents for a single instruction. Field value will be lower case, and searches will use a starts-with regex. |
|
For instant payments, probably highly selective, we would expect only a few documents for any given value. For bulk payments there could be tens of thousands of documents for a single instruction. Field value will be lower case, and searches will use a starts-with regex. |
|
For instant payments, probably highly selective, we would expect only a few documents for any given value. For bulk payments there could be tens of thousands of documents for a single instruction. Field value will be lower case, and searches will use a starts-with regex. |
|
For instant payments, probably highly selective, we would expect only a few documents for any given value. For bulk payments there could be tens of thousands of documents for a single instruction. Field value will be lower case, and searches will use a starts-with regex. |
|
For instant payments, probably highly selective, we would expect only a few documents for any given value. For bulk payments there could be tens of thousands of documents for a single instruction. Field value will be lower case, and searches will use a starts-with regex. |
|
For instant payments, probably highly selective, we would expect only a few documents for any given value. For bulk payments there could be tens of thousands of documents for a single instruction. Field value will be lower case, and searches will use a starts-with regex. |
|
For instant payments, probably highly selective, we would expect only a few documents for any given value. For bulk payments there could be tens of thousands of documents for a single instruction. Field value will be lower case, and searches will use a starts-with regex. |
|
For instant payments, probably highly selective, we would expect only a few documents for any given value. For bulk payments there could be tens of thousands of documents for a single instruction. Field value will be lower case, and searches will use a starts-with regex. |
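The comments in the tables above describe each index in terms of selectivity. As a minimal sketch of what those terms mean, the code below estimates selectivity for an equality lookup as the ratio of distinct values to documents, and shows how an open-ended range matches a larger share of documents than a bounded one. The class name, method names and sample values are illustrative assumptions and are not taken from ODS.

```java
import java.util.HashSet;
import java.util.List;

final class SelectivityEstimate {

    // Ratio of distinct values to total documents: close to 1.0 means highly selective.
    static double selectivity(List<String> fieldValues) {
        if (fieldValues.isEmpty()) {
            return 0.0;
        }
        return (double) new HashSet<>(fieldValues).size() / fieldValues.size();
    }

    // Fraction of values matched by a range query; a null bound means that side is open.
    static double matchedFraction(List<Long> values, Long lower, Long upper) {
        if (values.isEmpty()) {
            return 0.0;
        }
        long matched = values.stream()
                .filter(v -> (lower == null || v >= lower) && (upper == null || v <= upper))
                .count();
        return (double) matched / values.size();
    }

    public static void main(String[] args) {
        List<String> ids = List.of("uow-1", "uow-2", "uow-3", "uow-4");
        List<String> currencies = List.of("EUR", "EUR", "EUR", "GBP");
        System.out.println(selectivity(ids));        // 1.0
        System.out.println(selectivity(currencies)); // 0.5

        List<Long> timestamps = List.of(1L, 2L, 3L, 4L);
        System.out.println(matchedFraction(timestamps, 2L, 3L));   // 0.5
        System.out.println(matchedFraction(timestamps, 2L, null)); // 0.75
    }
}
```

The smaller the matched fraction, the more an index helps. A missing bound widens the match, which is why the range-query comments warn that selectivity can become very low.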
Indexes declared in CosmosDB mode
| Index definition | Comments |
|---|---|
|
Highly selective, we would expect only a few documents for any given value. |
|
Likely unique and therefore highly selective, we would expect only a single document for any given value. |
|
Highly selective, we would expect only a few documents for any given value. |
|
Very low selectivity, there will be a large number of documents for any given value. |
|
Highly selective, we would expect only a few documents for any given value. |
|
Will be used in ranged queries, where the upper or lower bound could be missing. If both bounds are present the selectivity could be very high for small enough ranges. If the range is large, or a bound is missing then the selectivity could be very low. |
|
Very low selectivity, there will be a large number of documents for any given value. |
| Index definition | Comments |
|---|---|
|
Unique per unitOfWorkId and possibly also unique globally so offers high selectivity. Applicable to all process object types |
|
High selectivity: depending on the implementation, a value should identify tens to hundreds of objects. Applicable to all process object types |
|
High selectivity, similar to unitOfWorkId - will usually hold either a unitOfWorkId or a flow ID. Applicable to all process object types |
|
Likely unique and therefore highly selective, in most cases we would expect only a few objects for a given value. There will, however, be many process objects with the value 'UNKNOWN' which can give low selectivity if this is used as a search value. Applicable to all process object types |
|
Very low selectivity, there will be a large number of documents for any given value. Applicable to all process object types |
|
Will be used in ranged queries, where the upper or lower bound could be missing. If both bounds are present the selectivity could be very high for small enough ranges. If the range is large, or a bound is missing then the selectivity could be very low. Applicable to all process object types |
|
Has low selectivity, we can expect a large number of documents with the same value. Applicable to all process object types |
|
Has low selectivity, we can expect a large number of documents with the same value. Applicable to all process object types |
|
Has low selectivity, we can expect a large number of documents with the same value. Only applicable to message log process objects |
|
Has low selectivity, we can expect a large number of documents with the same value. Only applicable to system event process objects |
|
Has low selectivity, we can expect a large number of documents with the same value. Only applicable to system event process objects |
|
Has low selectivity, we can expect a large number of documents with the same value. Only applicable to system event process objects |
|
Highly selective, there will exist a small number (depending on implementation, tens to low hundreds) of process flow events for an entityId. Only applicable to process flow event process objects |
| Index definition | Comments |
|---|---|
|
Latest PDS object search does a descending sort on sequenceNumber |
| Index definition | Comments |
|---|---|
|
Likely unique and therefore highly selective, we would expect a single summary for a clientRequestId. May eventually be removed since the clientRequestId is also present in the alternativeIds.value index |
|
Links a summary to the parent/related summary. Many summaries could link to the same "parent" summary, e.g. many batches link to a single bulk, or many payments link to a single batch, which means a summary can have [0..1] parents and [0..n] children. |
|
Values should be highly selective; if there’s any duplication, results should be small enough to be sorted in-memory |
|
Will be used in ranged queries, where the upper or lower bound could be missing. If both bounds are present the selectivity could be very high for small enough ranges. If the range is large, or a bound is missing then the selectivity could be very low. |
|
Has low selectivity, we can expect a large number of summaries with the same value. |
|
Has low selectivity, we can expect a large number of summaries with the same value. |
|
Has low selectivity, we can expect a large number of summaries with the same value. |
|
totalAmount will be used in ranged queries, where the upper or lower bound could be missing. If both bounds are present the selectivity could be very high for small enough ranges. If the range is large, or a bound is missing then the selectivity could be very low. |
|
totalAmountCurrency has low selectivity, we can expect a large number of summaries with the same currency |
|
Will be used for ODS purging, querying for values less than a given lastUpdated. Likely to be many summaries with a similar timestamp, low selectivity |
|
CosmosDB’s unique way of indexing all the fields within a document; will be a mix of high selectivity fields (uetr, ID fields, account numbers, creditor/debtor names etc.) and low selectivity fields (reason codes, amounts, currency, BICs) |
| Index definition | Comments |
|---|---|
|
Likely unique and therefore highly selective, we would expect only a single document for any given value. |
|
For instant payments, likely to be highly selective, we would expect only a few documents for a given value. For bulk payments there could be tens of thousands of documents for some values. |
|
For instant payments, likely to be highly selective, we would expect only a few documents for a given value. For bulk payments there could be tens of thousands of documents for some values. |
|
Highly selective, we would expect only a few documents for any given value. |
|
Highly selective, we would expect only a few documents for any given value. |
|
Very low selectivity, there will be a large number of documents for any given value. |
|
Usually very low selectivity, there will be a large number of documents for most lookups. High selectivity for rare ISO types like R-messages. |
|
Will be used in ranged queries, where the upper or lower bound could be missing. If both bounds are present the selectivity could be very high for small enough ranges. If the range is large, or a bound is missing then the selectivity could be very low. |
|
Latest payment object search does a descending sort on sequenceNumber |
|
CosmosDB’s unique way of indexing all the fields within a document; will be a mix of high selectivity fields (uetr, ID fields, account numbers, creditor/debtor names etc.) and low selectivity fields (reason codes, amounts, currency, BICs). Some ID fields may offer medium to low selectivity for bulk payments where a single instruction may create tens of thousands of payment objects. |
Akka Jackson Serialization
Within the ods-ingestion-core module, SummaryHandler utilises akka-jackson-serialization to handle serialization and deserialization of messages passed between actors running on different hosts.
SummaryCommand extends the CborSerializable interface, so classes that implement SummaryCommand are marked as classes that should be serialized/deserialized through akka-jackson-serialization.
@JsonInclude(JsonInclude.Include.NON_EMPTY)
public interface SummaryCommand extends CborSerializable {
}
In the configuration below, the IPF shared CborSerializable interface is bound to the default jackson-json serializer, so anything implementing this interface is also bound to this serializer.
akka {
  actor {
    provider = cluster
    serialization-bindings {
      "com.iconsolutions.ipf.core.shared.domain.CborSerializable" = jackson-json
    }
  }
  serialization.jackson {
    jackson-json {
      deserialization-features {
        ADJUST_DATES_TO_CONTEXT_TIME_ZONE = off
      }
    }
  }
}
Customizations
The deserialization-features.ADJUST_DATES_TO_CONTEXT_TIME_ZONE setting is overridden so that date-times provided with an offset (e.g. OffsetDateTime, ZonedDateTime…) are not adjusted to UTC when messages are deserialized between instances of the SummaryHandler actor. More information can be found in the Jackson docs.
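To illustrate what is at stake, the sketch below uses only `java.time` (not Jackson or Akka) to compare a date-time that keeps its original offset with the same instant adjusted to UTC. The class name `OffsetPreservationSketch` and the sample timestamp are assumptions for illustration.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

final class OffsetPreservationSketch {

    // What an adjustment to UTC does: same instant, but the original offset is replaced.
    static OffsetDateTime adjustedToUtc(OffsetDateTime value) {
        return value.withOffsetSameInstant(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        OffsetDateTime sent = OffsetDateTime.parse("2024-03-01T10:15:30+02:00");
        OffsetDateTime adjusted = adjustedToUtc(sent);

        System.out.println(sent);                    // 2024-03-01T10:15:30+02:00
        System.out.println(adjusted);                // 2024-03-01T08:15:30Z
        System.out.println(sent.isEqual(adjusted));  // true: same instant
        System.out.println(sent.equals(adjusted));   // false: offset differs
    }
}
```

Both values denote the same instant, but only the first retains the sender's offset, which is the information the override described above is intended to keep.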
The @JsonInclude(JsonInclude.Include.NON_EMPTY) annotation applied to SummaryCommand excludes null or empty values from serialized messages. More information can be found in the Jackson docs.