Jobs

A job is a C# class that implements the interface IScheduledJob.

The simplest possible implementation of a job would look like this:

public class MyJob : IScheduledJob
{
    public string DefaultSchedule => CronSchedule.TimesPerMinute(10);

    public async Task<JobResults> ExecuteAsync(CancellationToken cancellationToken)
    {
        Console.WriteLine("The job ran");

        return JobResults.Completed();
    }
}

By default the name used as a key for the job in the database is its type/class name, so in the example above it would be MyJob. This means that if you rename the job class the metadata will be lost and the system will think it's a new job. The class name might also not be a great candidate for the display name of the job. To solve both problems you can use the [ScheduledJob] attribute like this:

[ScheduledJob("myjob", DisplayName = "My really great job")]
public class MyJob : IScheduledJob
{
}

Now myjob is used as the key in the database and the admin UI displays the job as My really great job. You can also rename the job class without losing the metadata.

IScheduledJob.DefaultSchedule

This property tells the job engine what the default schedule of the job is. In the example above we're scheduling it to run ten times per minute. DefaultSchedule is just a string that should contain a valid cron schedule (including seconds), so you can return any valid cron expression. CronSchedule is just a helper class for generating such schedules.

The engine is using Cronos to parse cron expressions, so any expression that is valid in Cronos will work.

The property is called DefaultSchedule rather than Schedule because the system only reads it once: the first time the engine registers the job. After that the schedule is saved to the database and can be changed through the UI and the API.

If you set DefaultSchedule to an empty string (or use CronSchedule.NotScheduled()) the job will get registered but the engine will never execute it unless you start it manually through the UI or the API.
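To check an expression before returning it from DefaultSchedule, you can parse it with Cronos directly. The following is a minimal sketch, assuming the Cronos NuGet package is referenced and that the engine parses expressions with the seconds field enabled, as described above. The expression */6 * * * * * is only an illustration of a ten-times-per-minute schedule; it is not necessarily what CronSchedule.TimesPerMinute(10) returns.

```csharp
using System;
using Cronos;

// Six fields: second, minute, hour, day of month, month, day of week.
// "*/6" in the seconds field means every sixth second, i.e. ten times per minute.
var expression = CronExpression.Parse("*/6 * * * * *", CronFormat.IncludeSeconds);

// Cronos requires the reference time to be UTC.
var from = new DateTime(2024, 1, 1, 0, 0, 0, DateTimeKind.Utc);
DateTime? next = expression.GetNextOccurrence(from);

// Prints the first occurrence strictly after the reference time.
Console.WriteLine(next?.ToString("O"));
```

An invalid expression makes CronExpression.Parse throw a CronFormatException, so a quick check like this can catch a broken schedule before the job is deployed.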

IScheduledJob.ExecuteAsync()

This method is where you'll place the code that should be performed by the job. The example just contains a simple Console.WriteLine() which won't be very useful in a real scenario.

The job instance is created using the .NET IServiceProvider so you can request any services you need through constructor injection.

Instantiation

Note that the job might be instantiated for other reasons than to call ExecuteAsync(), so don't do any work in the constructor. It should only set instance fields.

CancellationToken

The ExecuteAsync() method is passed a CancellationToken by the engine which the job should respect. If you've never worked with cancellation tokens before, they are a way for the outside to signal to the job that someone wants to cancel its execution.

It's up to every job to respect the cancellation token passed in and if you don't do that the job can't be cancelled. It'll either always run to completion or be forcefully killed if the job service is killed.

The cancellation token passed in is a combination of the general service cancellation token and a cancellation token for that specific job run. The service cancellation token can be triggered by a deploy or a machine restart, whereas the cancellation token for a specific job run is only triggered if someone manually cancels the job through the UI or the API. The difference is not something a job needs to think about; it only has to concern itself with the cancellation token passed to ExecuteAsync().
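If you want to see how a combined token behaves, the sketch below uses the standard CancellationTokenSource.CreateLinkedTokenSource() method. It is an illustration only: the two sources are hypothetical stand-ins, and this document doesn't state how the engine builds its combined token internally.

```csharp
using System;
using System.Threading;

// Hypothetical stand-ins for the two sources described above.
using var serviceShutdown = new CancellationTokenSource(); // e.g. a deploy or machine restart
using var jobRunCancel = new CancellationTokenSource();    // e.g. a manual cancel in the UI

// A linked source is cancelled as soon as either parent token is cancelled.
using var combined = CancellationTokenSource.CreateLinkedTokenSource(
    serviceShutdown.Token, jobRunCancel.Token);

var before = combined.Token.IsCancellationRequested;

// Cancelling only one of the parents is enough to cancel the combined token.
jobRunCancel.Cancel();
var after = combined.Token.IsCancellationRequested;

Console.WriteLine($"before: {before}, after: {after}"); // before: False, after: True
```

Code that receives the combined token can't tell which parent triggered it, which is why the job only needs to watch the single token it's given.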

Here's an example of how you would use the cancellation token:

public async Task<JobResults> ExecuteAsync(CancellationToken cancellationToken)
{
    foreach (var item in await _service.GetBigListOfThingsAsync())
    {
        cancellationToken.ThrowIfCancellationRequested();

        // Do stuff with the item
    }

    return JobResults.Completed();
}

You add calls to cancellationToken.ThrowIfCancellationRequested(); inside loops or between chunks of code that are time consuming. A general rule is that the job should never execute code for more than one second before calling cancellationToken.ThrowIfCancellationRequested(); again.

ThrowIfCancellationRequested() will throw an OperationCanceledException which the job engine will catch and report that job run as cancelled.
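Besides calling ThrowIfCancellationRequested() yourself, you can pass the token on to any async API that accepts one so that long waits are interrupted too. The sketch below is a standalone illustration using only standard .NET types. The try/catch stands in for the job engine; in a real job you would let the exception propagate. The item list and the simulated cancel are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

using var cts = new CancellationTokenSource();
var processed = new List<int>();
var wasCancelled = false;

try
{
    foreach (var item in new[] { 1, 2, 3, 4, 5 })
    {
        // Check between items so the loop stops promptly.
        cts.Token.ThrowIfCancellationRequested();

        // Pass the token on so the wait itself can be interrupted.
        await Task.Delay(10, cts.Token);
        processed.Add(item);

        // Simulate someone cancelling the run after the second item.
        if (item == 2) cts.Cancel();
    }
}
catch (OperationCanceledException)
{
    // Stand-in for the engine, which reports the run as cancelled.
    wasCancelled = true;
}

Console.WriteLine($"Processed {processed.Count} items, cancelled: {wasCancelled}");
// Processed 2 items, cancelled: True
```

Note that Task.Delay() throws a TaskCanceledException when cancelled, which derives from OperationCanceledException, so both paths end up in the same catch block.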

JobResults

The return type of a job is JobResults, which is a class that represents the outcome of a job run. JobResults contains static methods such as Completed() that tell the engine how the run went.

The Completed() method takes an optional string describing the outcome of the job, which is visible under the historical runs of the job in the UI, for example "Exported 5 orders, 2 of which failed".

JobResults.NothingToDo()

If your job starts up and you see that there's no work to do, you should return JobResults.NothingToDo() rather than JobResults.Completed(). Returning NothingToDo() signals to the engine that this run isn't interesting, so it won't get saved to the historical runs. This helps to not clutter up the job history in the UI with runs that didn't perform any interesting work.

JobResults.RetryLater()

Sometimes your job detects that it can't do what it's supposed to do right now. It might be that an external system is down or that the data it needs isn't available yet. For jobs that run less frequently this can be annoying. Let's say that you schedule a job to run every night, but an external system has downtime during the night. Now you have to wait a whole day for the job to run again.

Instead of waiting that whole day your job can return e.g. JobResults.RetryLater(TimeSpan.FromMinutes(10)) to tell the engine to schedule an extra run of that job in ten minutes. More on extra runs down below.

JobResults.Failed()

JobResults contains a method called Failed() which you normally don't need to call yourself, since Nexus catches any exceptions that occur and marks the run as failed. The method is useful if you want to return parameters to the next run of the job. Read more about jobs with parameters here.

JobResults.CompletedWithWarnings()

Sometimes your job does what it's supposed to do but detects that some things aren't really what they should be. Something might be taking longer than expected, or an import or export succeeds but generates warnings. In these cases you can return JobResults.CompletedWithWarnings() and pass a descriptive text of the warnings.

The job's health check state will now be Degraded (rather than Healthy or Unhealthy) and the job will show with an orange warning sign in the Admin UI. You should use standard logging inside your job to record additional details about the warnings, so that they can be resolved by looking at the job logs.

The recommendation is to use this for things that need to be looked into but aren't urgent or critical enough to use JobResults.Failed(). Don't use it for things that aren't actionable since it'll only create noise.

Max job duration

Most jobs have an average expected duration but can sometimes take much longer to complete than expected. For some jobs this doesn't matter and the duration can vary a lot, such as a queue job that sometimes needs to process a lot of messages.

But if you have jobs where you want to be notified if the job runs for too long you can set the max duration either in the Admin UI under the More button in the job details, or by setting MaxDuration on the [ScheduledJob] attribute like this:

[ScheduledJob("myjob", MaxDuration = "00:10")]
public class MyJob : IScheduledJob
{
    ...
}

The value should be any string that can be passed to TimeSpan.Parse(). In the above example the max duration is set to 10 minutes.
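The formats accepted by TimeSpan.Parse() can be surprising, so it's worth checking a value before putting it in the attribute. The sketch below uses only standard .NET and passes the invariant culture explicitly; whether Nexus parses with the invariant culture is an assumption.

```csharp
using System;
using System.Globalization;

var culture = CultureInfo.InvariantCulture;

// "hh:mm" is read as hours and minutes, so this is 10 minutes.
var tenMinutes = TimeSpan.Parse("00:10", culture);

// "hh:mm:ss" adds seconds, so this is 1 minute and 30 seconds.
var ninetySeconds = TimeSpan.Parse("00:01:30", culture);

// "d.hh:mm" puts days before the dot, so this is 30 days.
var thirtyDays = TimeSpan.Parse("30.00:00", culture);

// A bare number is read as days, not minutes or seconds.
var tenDays = TimeSpan.Parse("10", culture);

Console.WriteLine(tenMinutes.TotalMinutes);    // 10
Console.WriteLine(ninetySeconds.TotalSeconds); // 90
Console.WriteLine(thirtyDays.TotalDays);       // 30
Console.WriteLine(tenDays.TotalDays);          // 10
```

The last case is the one to watch out for: writing "10" when you mean ten minutes gives you ten days.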

When a job has been running for longer than MaxDuration the Admin UI will show a warning message about the job's duration, and the health check for the job will report a Degraded health status. If automatic cancellation is enabled the job will be cancelled instead. Read more about automatic cancellation below.

Automatic cancellation

Sometimes you might have jobs that aren't allowed to run for a longer period of time than the defined max duration. In such cases you can enable automatic cancellation either in the Admin UI under the More button in the job details, or through the job attribute like this:

[ScheduledJob("myjob", MaxDuration = "00:10", AutoCancel = true)]
public class MyJob : IScheduledJob
{
    ...
}

In the above case the job's cancellation token will be triggered after ten minutes.

Disabling a job

A job can be disabled using the Admin UI or the API. This means that the job won't start on its schedule or when queue processing is requested. It will only start if you manually start it through the UI or API. This is very useful when external systems that the job depends on are having issues, or when the job isn't behaving as it should.

In order to not forget about enabling important jobs again you can say that the health check for the job should get a Degraded status when it's disabled:

[ScheduledJob("myjob", DegradedWhenDisabled = true)]
public class MyJob : IScheduledJob
{
    ...
}

You can also create a new job directly in a disabled state. This can be useful when the job does sensitive work where you need to be in control of when it starts. Any new job will have its first run directly when it's deployed for the first time, which may not be ideal in all cases. You can set the job to be initially disabled like this:

[ScheduledJob("myjob", InitiallyDisabled = true)]
public class MyJob : IScheduledJob
{
    ...
}

Pausing on error

Sometimes a job can be very sensitive to errors and in those cases you might want to prevent a job from running again until the error has been resolved. Nexus lets you handle this by setting PauseOnError in the [ScheduledJob] attribute like this:

[ScheduledJob("myjob", PauseOnError = true)]
public class MyJob : IScheduledJob
{
    ...
}

Previously this could be done by implementing a bool StopProcessingOnError { get; } on queue jobs but that has been deprecated and will be removed in the next major version.

Note that this can be overridden in the Admin UI/API under the More button on the job details page.

How to register jobs

The job engine uses reflection to look for classes implementing the IScheduledJob or IScheduledJob<TParameters> interfaces and automatically registers them. This is only done automatically for the entry assembly, so if you have multiple .NET projects that contain job classes you need to register those assemblies with the service.

To scan additional assemblies for jobs, call AddNexus().AddScheduledJobs() on the IServiceCollection and add each assembly to the options like this:

builder.Services.AddNexus().AddScheduledJobs(options =>
{
    options.AssembliesToScanForJobs.Add(typeof(MyJob).Assembly);
});

If you have assemblies that contain some jobs you don't want to register for some reason you can use the options property ScheduledJobFilter like this:

builder.Services.AddNexus().AddScheduledJobs(options =>
{
    options.ScheduledJobFilter = jobType => jobType != typeof(JobThatIDontWant);
});

Log retention

By default the job history is stored for 30 days but you can configure for how long job history and logs should be stored like this:

builder.Services.AddNexus().AddScheduledJobs(options =>
{
    options.DefaultHistoricalRunsRetention = TimeSpan.FromDays(10);
});

You can also specify this per job with the [ScheduledJob] attribute like this:

[ScheduledJob("MyJob", HistoricalRunsRetention = "30.00:00")]
public class MyJob : IScheduledJob
{
    ...
}

The string you set is any string that can be passed to TimeSpan.Parse() or the special string forever to indicate that Nexus should never delete the job history.

Note that this value can be updated in the Admin UI under the More button on the job details page.

Extra job runs

A job typically starts either by its schedule saying that it should start or by someone manually starting the job. But a job can also have extra runs outside of its normal schedule.

An extra run can be scheduled either using the API endpoint POST /jobs/{jobName}/start?startAt=2022-08-19T12:00:00 (date should be in UTC) or using the service IScheduledJobMetaDataRepository.ScheduleExtraRunAsync().
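When calling the endpoint from code, the main thing to get right is the UTC timestamp. The sketch below only builds the request path using standard .NET. The job key myjob is a placeholder, and the base address and any authentication needed to actually send the request aren't covered here.

```csharp
using System;
using System.Globalization;

var jobName = "myjob"; // placeholder job key

// Always build the timestamp from a UTC value.
var startAt = new DateTime(2022, 8, 19, 12, 0, 0, DateTimeKind.Utc);

// Format explicitly with the invariant culture so the separators
// don't depend on the machine's regional settings.
var stamp = startAt.ToUniversalTime()
    .ToString("yyyy-MM-dd'T'HH:mm:ss", CultureInfo.InvariantCulture);

var path = $"/jobs/{Uri.EscapeDataString(jobName)}/start?startAt={stamp}";

Console.WriteLine(path); // /jobs/myjob/start?startAt=2022-08-19T12:00:00
```

If you start from a local time, calling ToUniversalTime() before formatting converts it to UTC; for a value that is already UTC it's a no-op.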

Starting a job programmatically

To explicitly start a job through code you have two options. You can either use the API endpoint POST /jobs/{jobName}/start or use the service IJobStartRequester.

When you do this an extra run of the job is scheduled to start as soon as possible. If the job is currently running another run will start immediately after.

Job parallelism and mutexes

The engine guarantees that a single job never executes multiple times in parallel. Even if it's scheduled to run e.g. ten times per minute, it will only run once per minute if each run takes 60 seconds to complete.

Only one instance is allowed to have the job running at any given time. This is done because many jobs communicate with external data sources and having the same job doing that in parallel can often cause bugs. If you wish to speed things up you should start multiple threads inside your job to parallelize the work.
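One way to parallelize work inside a single job run is Parallel.ForEachAsync(), available in .NET 6 and later. This is a sketch of one option, not a Nexus-specific API. The item list and the delay are placeholders for real work, and the token created here stands in for the one passed to ExecuteAsync().

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for the token passed to ExecuteAsync().
using var cts = new CancellationTokenSource();
var cancellationToken = cts.Token;

var items = Enumerable.Range(1, 20).ToArray();
var total = 0;

var options = new ParallelOptions
{
    // Cap the parallelism so the external system isn't overwhelmed.
    MaxDegreeOfParallelism = 4,
    // Cancelling the job stops the loop from starting new items.
    CancellationToken = cancellationToken,
};

await Parallel.ForEachAsync(items, options, async (item, ct) =>
{
    await Task.Delay(10, ct);         // placeholder for real work
    Interlocked.Add(ref total, item); // shared state needs thread-safe updates
});

Console.WriteLine(total); // 210
```

Since the work now runs concurrently, any state shared between items has to be updated in a thread-safe way, as with Interlocked.Add() above.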

If you have two or more jobs that you want to ensure never run at the same time you can use the [RequireJobMutex] attribute on your job. Here's an example:

[RequireJobMutex("SomeMutex")]
[RequireJobMutex("SomeOtherMutex")]
public class Job1 : IScheduledJob
{
    // Code omitted for brevity
}

[RequireJobMutex("SomeMutex")]
public class Job2 : IScheduledJob
{
    // Code omitted for brevity
}

[RequireJobMutex("SomeOtherMutex")]
public class Job3 : IScheduledJob
{
    // Code omitted for brevity
}

In this example Job1 can never run at the same time as Job2 or Job3, since it shares a mutex with each of them. Job2 and Job3 can run at the same time as each other since they need different mutexes.

In more advanced scenarios where you want multiple jobs to process the same queue with different filters you can acquire mutexes dynamically as part of the job execution instead:

public class MyJob(IScheduledJobMetaDataRepository jobMetaDataRepository) : IScheduledJob
{
    public async Task<JobResults> ExecuteAsync(CancellationToken cancellationToken)
    {
        await using var _ = await jobMetaDataRepository.AcquireMutexesAsync(["some-mutex-name"], cancellationToken: cancellationToken);

        // The mutex has been acquired

        return JobResults.Completed();
    }
}

The mutexes are decentralized and safe to use with multiple Nexus servers/instances running your jobs. If you're only using a single server you're probably better off using SemaphoreSlim to handle locking.
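For the single-server case, a minimal sketch of SemaphoreSlim-based locking could look like the following. It uses only standard .NET; the RunExclusiveAsync helper and the job names are made up for the example. In a real application the semaphore would need to be shared between the jobs, for instance as a static field or a singleton service.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// One permit means only one holder at a time, like a mutex.
var gate = new SemaphoreSlim(1, 1);
var log = new List<string>();

// Hypothetical helper: runs a piece of work while holding the gate.
async Task RunExclusiveAsync(string name, CancellationToken cancellationToken)
{
    await gate.WaitAsync(cancellationToken);
    try
    {
        log.Add($"{name} enter");
        await Task.Delay(20, cancellationToken); // placeholder for real work
        log.Add($"{name} exit");
    }
    finally
    {
        // Always release, even if the work throws or is cancelled.
        gate.Release();
    }
}

await Task.WhenAll(
    RunExclusiveAsync("Job1", CancellationToken.None),
    RunExclusiveAsync("Job2", CancellationToken.None));

// Each "enter" is followed by the matching "exit" before the next "enter".
Console.WriteLine(string.Join(", ", log));
```

Unlike the Nexus mutexes, a SemaphoreSlim only protects code running in the same process, so this doesn't help once you run more than one server.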

Job middlewares

In some cases you want to wrap the execution of a job. You might want to add additional log context properties or instrument the job execution in some way. To achieve this you can register one or more IScheduledJobMiddleware instances in the service collection. For example:

public class MyJobMiddleware(ILogger<MyJobMiddleware> logger) : IScheduledJobMiddleware
{
    public async Task<JobResults> ExecuteAsync(ScheduledJobDescriptor scheduledJobDescriptor, Func<Task<JobResults>> executeJob)
    {
        // Everything logged while the job executes gets the JobName property
        using var scope = logger.BeginScope(new Dictionary<string, object> { ["JobName"] = scheduledJobDescriptor.Name });
        return await executeJob();
    }
}

// Register the middleware
builder.Services.AddSingleton<IScheduledJobMiddleware, MyJobMiddleware>();

Organizing the Admin UI

If you have a lot of jobs you can group jobs in the Admin UI by a category, just like you can with queues. Set a category in the ScheduledJob attribute like this:

[ScheduledJob("myjob", Category = "My category")]
public class MyJob : IScheduledJob
{
}

Or set the category through the Admin UI under the More button on the job details page.