Failed and Late Executions
This page explains, in practical terms, how the IPF Persistent Scheduler decides when a job fails and how it treats executions that run later than planned.
When is a job considered FAILED?
A job is marked FAILED only when its runtime execution fails, i.e. when your SchedulingHelper does one of the following:
- throws an exception inside execute(…) or lateExecute(…), or
- returns a CompletionStage that completes exceptionally.
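The two failure modes can be illustrated with a minimal, self-contained sketch. The interfaces below are reduced, hypothetical stand-ins (the real SchedulingHelper and SchedulingCommand come from the scheduler API and also include lateExecute), and FailureModes.runFails is an illustrative helper, not part of the scheduler:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

// Hypothetical stand-ins so the sketch compiles on its own.
interface SchedulingCommand {}

interface SchedulingHelper {
    boolean supports(SchedulingCommand cmd);
    CompletionStage<Void> execute(String id, SchedulingCommand cmd);
}

final class RunInvoiceCommand implements SchedulingCommand {}

// Failure mode 1: execute(...) throws before any stage is returned.
final class ThrowingHelper implements SchedulingHelper {
    public boolean supports(SchedulingCommand cmd) { return cmd instanceof RunInvoiceCommand; }
    public CompletionStage<Void> execute(String id, SchedulingCommand cmd) {
        throw new IllegalStateException("cannot start job " + id);
    }
}

// Failure mode 2: the returned stage completes exceptionally.
final class ExceptionalStageHelper implements SchedulingHelper {
    public boolean supports(SchedulingCommand cmd) { return cmd instanceof RunInvoiceCommand; }
    public CompletionStage<Void> execute(String id, SchedulingCommand cmd) {
        return CompletableFuture.failedFuture(new IllegalStateException("job " + id + " failed downstream"));
    }
}

// A successful run, for contrast.
final class OkHelper implements SchedulingHelper {
    public boolean supports(SchedulingCommand cmd) { return cmd instanceof RunInvoiceCommand; }
    public CompletionStage<Void> execute(String id, SchedulingCommand cmd) {
        return CompletableFuture.completedFuture(null);
    }
}

final class FailureModes {
    // Illustrative only: returns true if the run ends in either failure mode.
    // join() blocks, which is acceptable in a demo but not how a scheduler would wait.
    static boolean runFails(SchedulingHelper helper, String id, SchedulingCommand cmd) {
        try {
            helper.execute(id, cmd).toCompletableFuture().join();
            return false;
        } catch (RuntimeException e) {
            // Covers both the synchronous throw and the CompletionException raised by join().
            return true;
        }
    }
}
```

Calling FailureModes.runFails with ThrowingHelper or ExceptionalStageHelper yields true; with OkHelper it yields false.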
There is no background component that marks jobs as failed for being late or for having missed a slot.
If the job specification provides a failure identifier and a failure command, the scheduler will invoke the SchedulingHelper that supports that command after a failure.
If those fields were not set on the job specification, no failure action is triggered.
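As a rough mental model of this rule, the sketch below shows one way the dispatch could look. It is an assumption about behaviour, not the scheduler's actual code: FailureDispatch.onFailure is a hypothetical name, and the interfaces are reduced stand-ins for the real API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

// Hypothetical stand-ins so the sketch compiles on its own.
interface SchedulingCommand {}

interface SchedulingHelper {
    boolean supports(SchedulingCommand cmd);
    CompletionStage<Void> execute(String id, SchedulingCommand cmd);
}

final class RunInvoiceCommandFailed implements SchedulingCommand {}

// Records the identifiers it was asked to handle.
final class BillingFailedHelper implements SchedulingHelper {
    final List<String> handled = new ArrayList<>();

    public boolean supports(SchedulingCommand cmd) { return cmd instanceof RunInvoiceCommandFailed; }

    public CompletionStage<Void> execute(String id, SchedulingCommand cmd) {
        handled.add(id);
        return CompletableFuture.completedFuture(null);
    }
}

final class FailureDispatch {
    // Returns true if a failure action was triggered, false otherwise.
    static boolean onFailure(String failureIdentifier,
                             SchedulingCommand failureCommand,
                             List<SchedulingHelper> helpers) {
        // Both fields must be present on the job specification.
        if (failureIdentifier == null || failureCommand == null) {
            return false;
        }
        // Invoke the first helper that supports the failure command.
        for (SchedulingHelper helper : helpers) {
            if (helper.supports(failureCommand)) {
                helper.execute(failureIdentifier, failureCommand);
                return true;
            }
        }
        return false;
    }
}
```

With a BillingFailedHelper registered, onFailure("order-123", new RunInvoiceCommandFailed(), helpers) invokes the helper once; passing null for either the identifier or the command triggers nothing.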
Late execution handling (normal vs lateExecute)
Late execution handling is opt-in: enable it by setting lateExecutionThreshold on the job specification.
When a threshold is set, the scheduler compares the current time with the latest allowed execution time. If that time has passed, it calls lateExecute instead of execute.
The latest allowed execution time is calculated differently for one-time and recurring jobs:
- For one-time jobs, the scheduler takes the desired execution time (specified in the job’s time zone) and adds the late threshold to it.
- For recurring (cron) jobs, the scheduler first determines when the job should have run by calculating the next two scheduled times and working backwards by one interval. Once it has the planned execution time, it adds the late threshold to determine the cutoff.
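The arithmetic behind both cases can be sketched with java.time alone. The class and method names below (LateCutoff, oneTimeCutoff, recurringCutoff, isLate) are hypothetical, and the two upcoming fire times are passed in as parameters rather than derived from a cron expression:

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

final class LateCutoff {
    // One-time job: interpret the desired time in the job's zone, then add the threshold.
    static ZonedDateTime oneTimeCutoff(LocalDateTime desired, ZoneId zone, Duration threshold) {
        return desired.atZone(zone).plus(threshold);
    }

    // Recurring job: next and nextAfter are the next two scheduled fire times.
    // Their gap is taken as the interval and subtracted from next to recover
    // the run that should already have happened.
    static ZonedDateTime recurringCutoff(ZonedDateTime next, ZonedDateTime nextAfter, Duration threshold) {
        Duration interval = Duration.between(next, nextAfter);
        ZonedDateTime planned = next.minus(interval);
        return planned.plus(threshold);
    }

    // The run counts as late once the current time is past the cutoff.
    static boolean isLate(ZonedDateTime now, ZonedDateTime cutoff) {
        return now.isAfter(cutoff);
    }
}
```

For a job that fires every 5 minutes with a 1-minute threshold, upcoming fire times of 10:05 and 10:10 give an interval of 5 minutes, a planned time of 10:00, and a cutoff of 10:01. The subtraction step is only correct when the gap between consecutive fire times is constant.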
Late execution handling of recurring jobs assumes that the scheduling specification is a fixed-interval cron expression.
If your cron expression results in variable intervals, for example, …
Minimal examples
1) One-time job with a late threshold
var job = JobSpecification.builder()
    .jobRequestor("invoice-service")
    .singleSchedule(LocalDateTime.now().plusSeconds(30))
    .zoneId(ZoneId.of("Europe/Madrid"))
    .triggerIdentifier("order-123")
    .triggerCommand(new RunInvoiceCommand())
    .lateExecutionThreshold(Duration.ofSeconds(10)) // after desired+10s we treat it as late
    .build();

schedulingModule.scheduleJob(JobSpecificationDto.fromJobSpecification(job));
2) Recurring job with a late threshold
var job = JobSpecificationDto.builder()
    .jobRequestor("billing")
    .schedulingSpecification("0 0/5 * ? * *") // every 5 minutes
    .triggerIdentifier("batch-1")
    .triggerCommand(new ReconcileCommand())
    .lateExecutionThreshold(Duration.ofMinutes(1)) // after planned+1min we treat it as late
    .build();

schedulingModule.scheduleJob(job);
3) Implementing SchedulingHelper with late and failure handling
Given a job:
var job = JobSpecificationDto.builder()
    .jobRequestor("invoice-service")
    .singleSchedule(LocalDateTime.now().plusSeconds(30)) // desired execution
    .zoneId(ZoneId.of("Europe/Madrid"))
    .triggerIdentifier("order-123")
    .triggerCommand(new RunInvoiceCommand())
    .lateExecutionThreshold(Duration.ofSeconds(10)) // after desired+10s we treat it as late
    .failureIdentifier("order-123")
    .failureCommand(new RunInvoiceCommandFailed())
    .build();

schedulingModule.scheduleJob(job);
Regular and late executions are handled by a SchedulingHelper that supports the specified triggerCommand:
public class BillingHelper implements SchedulingHelper {

    // Assumes an SLF4J logger; doWork(...) stands for your own business logic.
    private static final Logger log = LoggerFactory.getLogger(BillingHelper.class);

    @Override
    public boolean supports(SchedulingCommand cmd) {
        return cmd instanceof RunInvoiceCommand;
    }

    // timely execution
    @Override
    public CompletionStage<Void> execute(String id, SchedulingCommand cmd) {
        return CompletableFuture.runAsync(() -> doWork(id, cmd));
    }

    // late execution, triggered if execution happens after desired+10s
    @Override
    public CompletionStage<Void> lateExecute(String id, SchedulingCommand cmd, Duration overBy) {
        log.warn("Job {} is late by {}", id, overBy);
        // Optionally adjust behavior when the job is late
        return execute(id, cmd);
    }
}
If execute/lateExecute throws or completes exceptionally, the scheduler records a FAILED status for that run and triggers the failure command.
The failure command is in turn handled by a SchedulingHelper that supports it:
public class BillingFailedHelper implements SchedulingHelper {

    @Override
    public boolean supports(SchedulingCommand cmd) {
        return cmd instanceof RunInvoiceCommandFailed;
    }

    // handleFailure(...) stands for your own compensation or alerting logic
    @Override
    public CompletionStage<Void> execute(String id, SchedulingCommand cmd) {
        return CompletableFuture.runAsync(() -> handleFailure(id, cmd));
    }
}