Hosting

The service is built to run either as a continuous background service or as a service that periodically starts up, executes outstanding jobs, and then exits.

The default is to run as a continuous background service, but you can configure it not to use the background service:

builder.Services
    .AddNexus()
    .AddFunctions(options =>
    {
        options.UseBackgroundService = false;
    })
    .AddScheduledJobs(options =>
    {
        options.UseBackgroundService = false;
    });

When you set UseBackgroundService to false, you need to call IScheduledJobExecutorService.RunAllJobsPendingSinceLastCallAsync() and INexusFunctionExecutorService.RunAllFunctionsPendingSinceLastCallAsync() yourself. Each method checks which jobs or functions have pending runs since the last time it was called, runs them, and then returns. This means you can host the system in, for example, an Azure Function or an Azure Container Instance that runs periodically.
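As a rough sketch of this run-and-exit mode, a short-lived console host could resolve both executor services, run them once, and then exit. The interface and method names are the ones mentioned above. The rest is assumption: that both methods can be called without arguments and return an awaitable Task, that the services can be resolved from a DI scope, and that no further Nexus setup is needed. The Nexus namespaces are left out because they are not named here.

```csharp
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
// Plus the Nexus namespaces (omitted, not named in this document).

using IHost host = Host.CreateDefaultBuilder(args)
    .ConfigureServices(services =>
    {
        services
            .AddNexus()
            .AddFunctions(options => options.UseBackgroundService = false)
            .AddScheduledJobs(options => options.UseBackgroundService = false);
    })
    .Build();

// Resolve from a scope in case the services are registered as scoped (assumption).
using IServiceScope scope = host.Services.CreateScope();
var jobs = scope.ServiceProvider.GetRequiredService<IScheduledJobExecutorService>();
var functions = scope.ServiceProvider.GetRequiredService<INexusFunctionExecutorService>();

// Assumed to return Task: each call runs whatever became due since the previous call.
await jobs.RunAllJobsPendingSinceLastCallAsync();
await functions.RunAllFunctionsPendingSinceLastCallAsync();
```

The process exits when the last call returns, which is what makes this shape suitable for a timer-triggered host.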

Running in production

Note that it's recommended to run the background service for the production environment since it offers more granular scheduling, health checks and the admin UI + API.

Application type

The typical use case is to host the service inside an ASP.NET Core application, but any type of .NET 6+ compatible application can be used to run the scheduler and queue system. The API, however, requires ASP.NET Core.

InstanceId

In a multi-server/instance environment Nexus keeps track of which server is doing what using INexusInstance.InstanceId. By default that id is the value of Dns.GetHostName() except if you're running in Azure App Service where a single VM can be running multiple instances of Nexus using deployment slots. In this case Nexus will calculate an instance id using the current process id and the environment variable WEBSITE_INSTANCE_ID that's set by the app service.

If you for some reason need to run multiple Nexus applications on the same VM, you need to implement your own version of INexusInstance where InstanceId is guaranteed to be unique at any given time. There should never be two applications running at the same time with the same instance id. As a last resort you can set it to a random value at startup, but that reduces Nexus's ability to track and heal. When the id is stable, Nexus resets all jobs and functions that claim to be running on that instance when it starts up. With a changing id, Nexus can't move jobs and functions to a non-running state after an application crash, because it is unaware that the specific instance id crashed, so you need to reset them yourself.
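A minimal sketch of such an implementation is shown below. It assumes that INexusInstance only requires a string InstanceId property and that Nexus picks up a replacement registered in the service collection; neither is confirmed here. NEXUS_APP_NAME is a hypothetical environment variable that you would set to a different value for each application on the VM.

```csharp
using System;
using System.Net;

// Assumption: INexusInstance exposes a single string InstanceId property.
public class NamedNexusInstance : INexusInstance
{
    public NamedNexusInstance()
    {
        // NEXUS_APP_NAME is hypothetical; it must differ per application on the same VM.
        var appName = Environment.GetEnvironmentVariable("NEXUS_APP_NAME")
            ?? throw new InvalidOperationException("NEXUS_APP_NAME is not set.");

        // Host name plus application name stays the same across restarts,
        // so jobs left in a running state can still be matched to this instance.
        InstanceId = $"{Dns.GetHostName()}-{appName}";
    }

    public string InstanceId { get; }
}
```

Registration might then look like builder.Services.AddSingleton<INexusInstance, NamedNexusInstance>(), again assuming Nexus resolves the instance from the container.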

Graceful shutdown

The importance of graceful shutdown isn't specific to Nexus, but you should give your application enough time to shut down gracefully. With a Web API that's typically not something you have to think about, because a shutdown only needs to finish the current HTTP requests and then the application can shut down without issues.

With background jobs and processing, however, a deploy triggers a shutdown while an important job might be running that shouldn't be killed in the middle of execution. All jobs are given a CancellationToken that is signaled when the application wants to shut down, so it's important that all your jobs respect it and exit gracefully. Read more about that here.
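To illustrate what respecting the token can look like, here is a small sketch that is not tied to any Nexus type. CancellationAwareWork and ProcessAsync are hypothetical names. The idea is to check the token between units of work so that the current unit finishes and no new one is started after shutdown has been signaled.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationAwareWork
{
    // Processes items one at a time and stops between items once cancellation
    // has been requested. Returns the number of items that were fully processed.
    public static async Task<int> ProcessAsync(
        IReadOnlyList<string> items,
        Func<string, Task> handleItem,
        CancellationToken cancellationToken)
    {
        var processed = 0;

        foreach (var item in items)
        {
            // Check before starting a new item, never in the middle of one.
            if (cancellationToken.IsCancellationRequested)
            {
                break;
            }

            await handleItem(item);
            processed++;
        }

        return processed;
    }
}
```

Stopping at an item boundary leaves the work in a known state, so the remaining items can be picked up by the next run.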

The default shutdown timeout in .NET 6 is 30 seconds: https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.Extensions.Hosting/src/HostOptions.cs

This is the time .NET waits, after signaling to all background services that a shutdown has been initiated, before the process terminates. If the process terminates in the middle of running a job, you won't know what the status was for that run. Was it done? Can it be restarted? If the process terminated on a server that is then shut down and replaced with another instance, Nexus won't automatically mark the job as not running and you'll need to reset it yourself through the UI or API.

The default timeout of 30 seconds is probably enough for you, but you need to consider it. It might also be that your hosting provider doesn't respect this timeout and enforces a shorter one. Verifying the timeout you actually have is quite easy. You can create a background service like this:

public class VerifyGracefulShutdownBackgroundService : BackgroundService
{
    private readonly ILogger<VerifyGracefulShutdownBackgroundService> _logger;

    public VerifyGracefulShutdownBackgroundService(ILogger<VerifyGracefulShutdownBackgroundService> logger)
    {
        _logger = logger;
    }

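    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        // Idle until the host signals shutdown. The token is deliberately not passed
        // to Task.Delay so that cancellation doesn't throw out of this method.
        while (!stoppingToken.IsCancellationRequested)
        {
            await Task.Delay(500);
        }

        _logger.LogInformation("Shutdown initiated");

        // Intentionally never exit: keep logging until the process is killed so the
        // last entry shows how long the shutdown window really is.
        while (true)
        {
            _logger.LogInformation("Still alive...");
            await Task.Delay(500);
        }
    }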
}

And then in Program.cs you do:

serviceCollection.AddHostedService<VerifyGracefulShutdownBackgroundService>();

Deploy the new code, and then deploy the same code again. After that, check the logs for the timestamps of the Shutdown initiated entry and the last Still alive... entry. If the period between those is long enough, you're good to go. What should be considered long enough is up to you, but 30 seconds is a good starting point.
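If your jobs need longer than the default and the limit comes from the .NET host rather than from the hosting provider, the timeout can be raised through HostOptions.ShutdownTimeout. The fragment below assumes a Program.cs that already has a builder variable, as in the registration example earlier on this page; 60 seconds is only an example value.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

// Give hosted services up to 60 seconds (example value) to stop
// before the host stops waiting for them.
builder.Services.Configure<HostOptions>(options =>
{
    options.ShutdownTimeout = TimeSpan.FromSeconds(60);
});
```

A hosting provider that enforces its own shorter limit will still kill the process earlier, so repeat the verification after changing the value.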

Graceful Nexus shutdown

For cases where you want Nexus to shut down before the application starts to shut down, you can use IGracefulShutdownService to tell Nexus to stop all background processes. Calling IGracefulShutdownService.InitiateShutdown() stops all Nexus background processes, and Nexus attempts to cancel all running jobs.

You can also initiate a graceful shutdown through the API using POST api/graceful-shutdown/initiate, but only if you've set AllowGracefulShutdownThroughApi to true when calling AddNexus().AddApi(). Note that in a multi-instance environment you need to call this explicitly for every instance. In a multi-server environment you might want to broadcast an event that all Nexus instances listen to and that makes each of them call IGracefulShutdownService.InitiateShutdown().
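Enabling the API route might look like the fragment below. It assumes that AddApi accepts an options lambda in the same style as AddFunctions and AddScheduledJobs; only the option name itself is taken from the text above.

```csharp
builder.Services
    .AddNexus()
    .AddApi(options =>
    {
        // Assumed shape: the option is set through a lambda, like the other Add* calls.
        options.AllowGracefulShutdownThroughApi = true;
    });
```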

Once you've initiated graceful shutdown there's no going back. You need to restart the application for the background processes to start again.

Soft graceful shutdown

In some cases you might want to let any jobs that have started finish before you deploy, while still preventing new jobs from starting. In that case you can use the soft graceful shutdown in Nexus, which does the same thing as a graceful shutdown except that it won't attempt to cancel running jobs or functions. Just like graceful shutdown, this is an unrecoverable state: you need to restart the application for jobs to start again.

You initiate a soft graceful shutdown either through the API with POST api/graceful-shutdown/initiate-soft or through code with IGracefulShutdownService.InitiateSoftShutdown().
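As a sketch, a deploy script or small tool could call the soft shutdown endpoint with HttpClient before starting the deployment. The base address below is a placeholder, the assumption is that the API is mounted at the root of the application, and any authentication your API requires is not shown.

```csharp
using System;
using System.Net.Http;

// Placeholder address of a single Nexus instance.
using var client = new HttpClient
{
    BaseAddress = new Uri("https://nexus.example.internal/")
};

// POST without a body to the route mentioned above.
using HttpResponseMessage response =
    await client.PostAsync("api/graceful-shutdown/initiate-soft", content: null);

// Throws if the instance did not accept the request.
response.EnsureSuccessStatusCode();
```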

Azure Web Apps

Azure Web Apps is a popular and easy way of deploying and managing a .NET application, and it works great together with Nexus. However, since Azure Web Apps is built for a web/API workload and not a worker workload, there are some considerations when using it with Nexus.

Something to be aware of is that a deployment starts your new application version on the same machines that the current version is running on, and it starts the new version before the old version is signaled to shut down.

For a Web API this doesn't really matter as the new application version won't receive traffic until the deploy swaps the active application and then all traffic goes to the new version. But background services will be running on both at the same time.

This isn't a problem in general but something you need to be aware of. Nexus will ensure that there's only one instance of a job running at the same time, but if you have dependencies between jobs it can affect you.

During the time of a deploy there can be two different versions of the same job or function running. Nexus will ensure that a single job never runs at the same time in different instances, but if you've scheduled a job to run every second then the old and new version of the job will compete to start running the job during the deployment.

Let's say that you have a Nexus function that looks like this:

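public class ExampleService
{
    private readonly ILogger<ExampleService> _logger;

    public ExampleService(ILogger<ExampleService> logger)
    {
        _logger = logger;
    }

    public void DoSomethingInteresting()
    {
        _logger.LogInformation("This is interesting");
    }
}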

And you have calls to that service frequently scheduled using await _nexusFunction.RunInBackgroundAsync<ExampleService>(x => x.DoSomethingInteresting());. Then you change the log to "This is very interesting!". During the deploy of that new implementation your log can look like this:

[20XX-XX-XX XX:XX:01] This is interesting
[20XX-XX-XX XX:XX:02] This is interesting
[20XX-XX-XX XX:XX:03] This is very interesting!
[20XX-XX-XX XX:XX:04] This is interesting
[20XX-XX-XX XX:XX:05] This is very interesting!
[20XX-XX-XX XX:XX:06] This is interesting
[20XX-XX-XX XX:XX:07] This is very interesting!
[20XX-XX-XX XX:XX:08] This is very interesting!

It's up to you to consider if this timing is an issue for you, or if it's fine that different versions of the function, job or enqueuing will run at the same time.

Azure Container Apps or Google Cloud Run

Since Azure App Service/Web Apps isn't optimized for a worker/background service workload, you should consider using Azure Container Apps or Google Cloud Run instead, which both have great support for worker as well as web API workloads.