Healthchecks

Nexus integrates with and extends Health checks in ASP.NET Core.

All functions, jobs, and queues automatically get health checks that signal their health. The only thing you need to do in an ASP.NET Core application is add this call to your Program.cs:

app.MapHealthChecks("/health");

The jobs, queues, and functions subsystems in Nexus initialize the health checks subsystem as part of their own initialization, so you don't have to do that explicitly. If you want to change any of the default options, you can initialize it directly like this:

builder.Services.AddNexusHealthChecks(options =>
{
    // Set this to true if you're not using ASP.NET. Then Nexus will periodically call the health checks for you
    // and publish a health report just like in ASP.NET.
    options.UseHealthCheckBackgroundService = false;
    // If you're not using ASP.NET you can configure how often the background service calls the health checks.
    options.HealthCheckBackgroundServicePollDelay = TimeSpan.FromSeconds(30);

    // How long Nexus should keep results of health checks to display historical status in the admin UI
    options.StatisticsRetention = TimeSpan.FromDays(30);

    // If you have health checks defined in other assemblies you can add them here to get Nexus to find them
    options.AssembliesToScanForHealthChecks.Add(typeof(MyType).Assembly);

    // This lets you dynamically exclude any health check types that you don't want Nexus to register
    options.HealthCheckFilter = healthCheckType => !healthCheckType.Name.Contains("HealthCheckIDontWant");

    // Sqlite, Postgres or SqlServer
    options.DatabaseEngine = DatabaseEngine.SQLite;
});

You can see the health checks included in the demo environment here:
https://commerce-mind-nexus.azurewebsites.net/admin/healthchecks

Slack integration

If you wish to get Slack notifications when health checks are failing you can use the Slack integration for Nexus.

When are health checks invoked?

If you're using ASP.NET, the health checks are called whenever someone pings the /health endpoint and whenever you open the health checks in the admin UI. If you've registered an IHealthCheckPublisher, ASP.NET also calls the health checks an additional two times per minute.

When you create a custom health check as described below you can choose to have a scheduled or unscheduled health check. If you create a scheduled health check Nexus will ensure that your health check is never executed more often than the schedule allows. Nexus will return the previous result for such a health check until it's time to call it again.

When not using ASP.NET

If you're not using ASP.NET you can still use the Nexus health checks. You should initialize the health check system like this:

builder.Services.AddNexusHealthChecks(options =>
{
    options.UseHealthCheckBackgroundService = true;
    options.HealthCheckBackgroundServicePollDelay = TimeSpan.FromSeconds(30);
});

You should also create your own IHealthCheckPublisher and register it with the IServiceCollection like this:

builder.Services.AddSingleton<IHealthCheckPublisher, MyHealthCheckPublisher>();

The Nexus background service will then call your publisher periodically, twice per minute with the 30-second poll delay above. Since the publisher is a singleton, you should store the last status and only ping someone when the status goes from healthy to unhealthy.
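A minimal sketch of such a publisher is shown here. It relies only on the standard IHealthCheckPublisher interface from Microsoft.Extensions.Diagnostics.HealthChecks. The NotificationsSent counter and the Console.WriteLine call are illustrative placeholders, and treating Degraded as "not healthy" is an assumption you may want to change.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical publisher: remembers the last overall status and only
// notifies when the status changes from Healthy to something else.
public sealed class MyHealthCheckPublisher : IHealthCheckPublisher
{
    private readonly object _lock = new();
    private HealthStatus? _lastStatus; // null until the first report arrives

    // Number of notifications sent; exposed only to make the sketch easy to verify.
    public int NotificationsSent { get; private set; }

    public Task PublishAsync(HealthReport report, CancellationToken cancellationToken)
    {
        bool shouldNotify;
        lock (_lock)
        {
            // Notify only on a Healthy -> not Healthy transition.
            shouldNotify = _lastStatus == HealthStatus.Healthy
                           && report.Status != HealthStatus.Healthy;
            _lastStatus = report.Status;
            if (shouldNotify)
            {
                NotificationsSent++;
            }
        }

        if (shouldNotify)
        {
            // Placeholder: replace with a real notification (email, chat webhook, pager).
            Console.WriteLine($"Health changed to {report.Status}");
        }

        return Task.CompletedTask;
    }
}
```

Note that this sketch stays silent if the very first report is already unhealthy, because there is no earlier healthy status to compare against. If you want a notification in that case too, treat a missing last status as healthy.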

Adding more health checks

There are two health check interfaces included: one that runs on demand, and one that runs on a schedule.

INexusHealthCheck

Use this interface for checks that you always want to run on demand. Their results are never cached, so they should respond quickly and be cheap to run.

For example:

public class ExampleHealthCheck : INexusHealthCheck
{
    public string Name => "MyHealthCheck"; // This needs to be unique
    public string DisplayName => "Health check running a lot";
    public string? Description => "A longer description of what the health check does"; // You can return null if you don't need a description
    public string? Category => "My health checks category"; // Used to group checks in the Admin UI
    public IEnumerable<string> Tags => Array.Empty<string>();
    public string? AdminUILinkUrl => null;

    // Nothing is awaited here, so return a completed task instead of marking the method async
    public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        return Task.FromResult(HealthCheckResult.Healthy("Everything is fine!"));
    }
}

INexusScheduledHealthCheck

In some cases you want a health check to run on a schedule rather than every time someone pings the health status endpoint. For this you can use the INexusScheduledHealthCheck interface, which has a cron schedule much like jobs do. When the service executes the health checks, it only executes a scheduled check if enough time has passed since it was last called. Otherwise the system returns the previous result for that check.
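The "run only when due, otherwise return the previous result" behaviour can be pictured with the following sketch. It is not Nexus's actual implementation; ScheduledResultCache is a hypothetical helper that uses the Cronos library to decide whether the next occurrence of the schedule has been reached.

```csharp
using System;
using Cronos;

// Illustrative only: caches the last result of a check and re-runs it
// only once the next occurrence of the cron schedule has been reached.
public sealed class ScheduledResultCache<T>
{
    private readonly CronExpression _schedule;
    private DateTime? _lastRunUtc;
    private T? _lastResult;

    public ScheduledResultCache(CronExpression schedule) => _schedule = schedule;

    // nowUtc must have DateTimeKind.Utc; Cronos rejects other kinds.
    public T GetOrRun(DateTime nowUtc, Func<T> runCheck)
    {
        if (_lastRunUtc is DateTime lastRun)
        {
            DateTime? nextDue = _schedule.GetNextOccurrence(lastRun);
            if (nextDue is null || nextDue > nowUtc)
            {
                return _lastResult!; // not due yet: reuse the previous result
            }
        }

        _lastResult = runCheck();
        _lastRunUtc = nowUtc;
        return _lastResult;
    }
}
```

With the schedule @every_minute and a first call at 12:00:10 UTC, a second call at 12:00:30 returns the cached result, because the next occurrence after 12:00:10 is 12:01:00. A call at 12:01:00 or later runs the check again.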

Another benefit of INexusScheduledHealthCheck is that the CheckHealthAsync() method is passed the previous result of that check. This means that you can determine health based on a previous run. An example:

public class ExampleScheduledHealthCheck : INexusScheduledHealthCheck
{
    public string Schedule => "@every_minute"; // See: https://github.com/HangfireIO/Cronos#macro
    public string Name => "MyScheduledHealthCheck"; // This needs to be unique
    public string DisplayName => "Health check running every minute";
    public string? Description => "A longer description of what the health check does"; // You can return null if you don't need a description
    public string? Category => "My health checks category"; // Used to group checks in the Admin UI
    public IEnumerable<string> Tags => Array.Empty<string>();
    public string? AdminUILinkUrl => null;

    // Nothing is awaited here, so return completed tasks instead of marking the method async
    public Task<HealthCheckResult> CheckHealthAsync(NexusHealthCheckResult? previousHealthCheckResult, ScheduledNexusHealthCheckInitiator initiator, HealthCheckContext context, CancellationToken cancellationToken)
    {
        // Null on the first run, when there is no previous result to compare with
        var previousCount = (int?)previousHealthCheckResult?.Result.Data?["count"];
        var currentCount = GetCurrentCount();

        // Store the current count so that the next run can compare against it
        var data = new Dictionary<string, object> { ["count"] = currentCount };

        // Unhealthy when the count has grown by fewer than 10 since the previous run.
        // On the first run the comparison is false and the check reports healthy.
        if (currentCount - previousCount < 10)
        {
            return Task.FromResult(HealthCheckResult.Unhealthy("Everything is NOT fine!", null, data));
        }

        // Pass the data along here too; otherwise the next run has no previous count
        return Task.FromResult(HealthCheckResult.Healthy("Everything is fine!", data));
    }

    private int GetCurrentCount()
    {
        // Look up value; 0 is only a placeholder so that the example compiles
        return 0;
    }
}

Here we're using the Data property on HealthCheckResult to store any values we need in the next run to determine the health. This can be used to detect, for example, that the number of pending messages in a certain queue is growing too much.
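Reading a value back from Data can fail if the key is missing or if the stored value is not the type you expect. The following sketch shows one defensive way to read an integer. HealthCheckData and TryGetInt are hypothetical helper names, and whether Nexus preserves the exact value type between runs is not something this page states, so the helper converts rather than casts.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class HealthCheckData
{
    // Returns the integer stored under 'key', or null when the dictionary is
    // missing, the key is absent, or the value cannot be converted to an int.
    public static int? TryGetInt(IReadOnlyDictionary<string, object>? data, string key)
    {
        if (data is null || !data.TryGetValue(key, out var value) || value is null)
        {
            return null;
        }

        try
        {
            // Handles a boxed int, a boxed long, and numeric strings alike.
            return Convert.ToInt32(value, CultureInfo.InvariantCulture);
        }
        catch (Exception ex) when (ex is FormatException or InvalidCastException or OverflowException)
        {
            return null;
        }
    }
}
```

In a scheduled check you could then write HealthCheckData.TryGetInt(previousHealthCheckResult?.Result.Data, "count") and treat a null result as "no previous value".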

Schedule

The Schedule property is a cron expression just like DefaultSchedule on jobs, and it can be any expression supported by Cronos. The above example uses one of the supported macros in Cronos.

Note that the DefaultSchedule property on IJob and the Schedule property on INexusScheduledHealthCheck behave differently. Nexus reads a job's default schedule only once, the first time the job is deployed and registered in the database. Using e.g. CronSchedule.EveryMinute() for a job generates a random second (e.g. 16) within the minute on which the job starts. This means that many jobs can use CronSchedule.EveryMinute() without all starting at exactly the same time. Instead they start on random seconds, which spreads out the load they generate.

A health check's Schedule property, on the other hand, is read every time Nexus checks whether it's time to invoke the check. So if you use CronSchedule.EveryMinute() for a health check, the check will sometimes run several times within a minute and sometimes less often than once a minute. E.g. if the time is 12:00:10 and the schedule says 9 * * * * *, the check is executed. If the schedule evaluates to 19 * * * * * the next time it is read, the check is called again only about 10 seconds later.
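The effect of a changing second can be checked directly with Cronos. This is a small sketch, assuming the schedule is a six-field expression with seconds first; ScheduleDriftExample and NextRun are hypothetical names used only for illustration.

```csharp
using System;
using Cronos;

public static class ScheduleDriftExample
{
    // Next time a six-field (seconds-first) cron expression fires after lastRunUtc.
    // lastRunUtc must have DateTimeKind.Utc.
    public static DateTime? NextRun(string cron, DateTime lastRunUtc) =>
        CronExpression.Parse(cron, CronFormat.IncludeSeconds).GetNextOccurrence(lastRunUtc);
}
```

For a last run at 12:00:09 UTC, NextRun("9 * * * * *", lastRun) gives 12:01:09, a full minute later. If the expression has changed to "19 * * * * *" by the next evaluation, NextRun returns 12:00:19, only 10 seconds after the last run.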

If you still want to generate a random second for a health check you can generate it when the application starts and then keep returning that cron schedule. Like this:

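public class ExampleScheduledHealthCheck : INexusScheduledHealthCheck
{
    // Generated once per process, so the random second stays the same until the application restarts
    private static readonly string _schedule = CronSchedule.EveryMinute();
    public string Schedule => _schedule;

    // Remaining members omitted for brevity
}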

This will make sure that the schedule is stable during the time that the application lives and only generate a new schedule when the application is restarted or a new version is deployed.

Registering custom health checks

If you create an implementation of INexusHealthCheck or INexusScheduledHealthCheck, it is registered automatically as long as the type lives in your entry/application assembly. If your checks live in a different assembly, you can tell Nexus which assemblies to scan:

builder.Services.AddNexusHealthChecks(options =>
{
    options.AssembliesToScanForHealthChecks.Add(typeof(MyHealthCheck).Assembly);
});

Call order

Note that you must call builder.Services.AddNexusHealthChecks() before builder.Services.AddNexusApi(). Otherwise the checks are not registered as default ASP.NET Core IHealthCheck instances and won't be run when /health is pinged.
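Put together, the relevant part of a Program.cs might look like the sketch below. It only uses calls mentioned on this page; the using directives for the Nexus namespaces and any other Nexus setup your application needs are assumed and left out.

```csharp
var builder = WebApplication.CreateBuilder(args);

// Register the Nexus health checks first so that they are added
// as regular ASP.NET Core IHealthCheck instances.
builder.Services.AddNexusHealthChecks();

// Only then register the Nexus API.
builder.Services.AddNexusApi();

var app = builder.Build();

// Expose the health checks on the /health endpoint.
app.MapHealthChecks("/health");

app.Run();
```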