Monitoring
The system integrates with and extends Health checks in ASP.NET Core.
All functions, jobs, and queues automatically get health checks that signal their health. The only thing you need to do in an ASP.NET Core application is call this in your Program.cs:
app.MapHealthChecks("/health");
That endpoint will now fail if any functions, jobs, or queues are failing or have errors. The health checks are also shown in the admin UI, which gives an overview of the system status.
If you're not using ASP.NET you can still run the health checks by calling this:
builder.Services.AddNexus().AddHealthChecks(options =>
{
    options.UseHealthCheckBackgroundService = true;
});
When not using ASP.NET and enabling UseHealthCheckBackgroundService, you can still register an implementation of IHealthCheckPublisher in IServiceCollection like this:
builder.Services.AddSingleton<IHealthCheckPublisher, MyHealthCheckPublisher>();
The Nexus background service will then call your publisher twice per minute. Since it's a singleton instance, you should store the last status and only notify someone when the status goes from healthy to unhealthy.
Only for non ASP.NET
If you're using ASP.NET, the IHealthCheckPublisher is still called even if nobody is pinging the /health endpoint.
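A publisher that remembers the last status, as suggested above, could be sketched roughly like this. IHealthCheckPublisher, HealthReport, and HealthStatus come from Microsoft.Extensions.Diagnostics.HealthChecks. The NotifyAsync method is a hypothetical placeholder for whatever alerting you use, and treating every transition into Unhealthy (including from Degraded) as worth a notification is an assumption, not something Nexus prescribes.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public class MyHealthCheckPublisher : IHealthCheckPublisher
{
    // Registered as a singleton, so this field survives between calls.
    // Assumes the publisher is not called concurrently.
    private HealthStatus _lastStatus = HealthStatus.Healthy;

    public async Task PublishAsync(HealthReport report, CancellationToken cancellationToken)
    {
        var previousStatus = _lastStatus;
        _lastStatus = report.Status;

        // Only notify when the status changes to Unhealthy, not on every call
        if (previousStatus != HealthStatus.Unhealthy && report.Status == HealthStatus.Unhealthy)
        {
            await NotifyAsync(report, cancellationToken);
        }
    }

    // Hypothetical hook: replace with an email, chat message, pager call, etc.
    protected virtual Task NotifyAsync(HealthReport report, CancellationToken cancellationToken)
    {
        Console.Error.WriteLine($"Health status changed to {report.Status}");
        return Task.CompletedTask;
    }
}
```

Because the notification is a virtual method, a subclass can swap in a real alerting channel without touching the transition logic.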
You can see the health checks included in the demo environment here:
https://commerce-mind-nexus.azurewebsites.net/admin/healthchecks
Adding more health checks
There are two health check interfaces included: one that runs on demand, and one that runs on a schedule.
INexusHealthCheck
Use this interface for checks that should always run on demand. Their results are never cached, so they should respond quickly and be cheap to run.
INexusScheduledHealthCheck
In some cases you want a health check to run on a schedule rather than firing every time someone pings the health status endpoint. For this you can use the INexusScheduledHealthCheck interface which, much like jobs, has a cron schedule. When the service executes the health checks, it only executes a scheduled check if enough time has passed since the last time it was called. Otherwise the system returns the previous result for that check.
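To illustrate the "enough time has passed" rule, here is a minimal sketch of how such a decision could be made with the Cronos library. This is not taken from the Nexus source; the ScheduleSketch class and the IsDue helper are hypothetical names, and the real implementation may differ.

```csharp
using System;
using Cronos;

public static class ScheduleSketch
{
    // Returns true when the schedule has an occurrence after the last run
    // that is not later than the current time. Both timestamps must be UTC.
    public static bool IsDue(string schedule, DateTime? lastRunUtc, DateTime nowUtc)
    {
        // A check that has never run is always due
        if (lastRunUtc is null)
        {
            return true;
        }

        // Six-field format, where the first field is the second
        var expression = CronExpression.Parse(schedule, CronFormat.IncludeSeconds);
        DateTime? next = expression.GetNextOccurrence(lastRunUtc.Value);

        return next is not null && next.Value <= nowUtc;
    }
}
```

When IsDue returns false, the caller would hand back the stored result from the previous run instead of invoking the check again.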
Another benefit of INexusScheduledHealthCheck is that the CheckHealthAsync() method is passed the previous result of that check. This means that you can determine health based on a previous run. An example:
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public class ExampleScheduledHealthCheck : INexusScheduledHealthCheck
{
    public string Schedule => "@every_minute";
    public string Name => "MyHealthCheck"; // This needs to be unique
    public string DisplayName => "Health check running every minute";
    public string? Description => "A longer description of what the health check does"; // You can return null if you don't need a description
    public string? Category => "My health checks category"; // Used to group checks in the Admin UI
    public IEnumerable<string> Tags => Array.Empty<string>();
    public string? AdminUILinkUrl => null;

    public Task<HealthCheckResult> CheckHealthAsync(NexusHealthCheckResult? previousHealthCheckResult, ScheduledNexusHealthCheckInitiator initiator, HealthCheckContext context, CancellationToken cancellationToken)
    {
        // Null on the first run, when there is no previous result to read from
        var previousCount = (int?)previousHealthCheckResult?.Result.Data?["count"];
        var currentCount = GetCurrentCount();

        // Always store the count, also when healthy, so the next run can compare against it
        var data = new Dictionary<string, object> { ["count"] = currentCount };

        // With no previous count the comparison is false and the check reports healthy
        if (currentCount - previousCount < 10)
        {
            return Task.FromResult(HealthCheckResult.Unhealthy("Everything is NOT fine!", null, data));
        }
        return Task.FromResult(HealthCheckResult.Healthy("Everything is fine!", data));
    }

    private int GetCurrentCount()
    {
        // Look up value
        throw new NotImplementedException();
    }
}
Here we're using the Data property on HealthCheckResult to store any values we need in the next run to determine the health. This can be used, for example, to detect that a counter such as the number of Pending messages in a certain queue is growing too quickly.
Schedule
The Schedule property is a cron expression just like DefaultSchedule on jobs, and it can be any expression supported by Cronos. The above example uses one of the macros supported by Cronos.
Note that there's a difference between the DefaultSchedule property on IJob and the Schedule property on INexusScheduledHealthCheck. The default schedule on a job is only read by Nexus the first time the job is deployed and registered in the database. Using e.g. CronSchedule.EveryMinute() for the job will pick a random second (e.g. 16) within the minute at which the job starts. This means that you can have many jobs using CronSchedule.EveryMinute() without them all starting at exactly the same time. Instead they start on random seconds, which spreads out the load they generate.
The Schedule property on a health check, on the other hand, is read every time Nexus checks whether it's time to invoke the check. So if you use CronSchedule.EveryMinute() for a health check, it will sometimes execute the check several times within a minute and sometimes less often than once a minute. E.g. if the clock is 12:00:10 and the schedule says 9 * * * * * the check is executed. If the schedule then evaluates to 19 * * * * * the next time it is read, there will only be about 10 seconds between the two calls.
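The arithmetic behind this can be checked directly with Cronos. The sketch below assumes the check last ran at 12:00:09 and compares the next occurrence for an unchanged schedule with the next occurrence for a schedule whose second has been re-randomized to 19. The ScheduleDriftSketch class and its method are hypothetical names used only for illustration.

```csharp
using System;
using Cronos;

public static class ScheduleDriftSketch
{
    // Seconds from the last run until the next occurrence of the schedule.
    // lastRunUtc must have DateTimeKind.Utc, which Cronos requires.
    public static double SecondsUntilNextRun(string schedule, DateTime lastRunUtc)
    {
        var expression = CronExpression.Parse(schedule, CronFormat.IncludeSeconds);
        DateTime? next = expression.GetNextOccurrence(lastRunUtc);
        if (next is null)
        {
            throw new InvalidOperationException("The schedule has no future occurrence.");
        }
        return (next.Value - lastRunUtc).TotalSeconds;
    }
}
```

With lastRun set to new DateTime(2024, 1, 1, 12, 0, 9, DateTimeKind.Utc), SecondsUntilNextRun("9 * * * * *", lastRun) gives 60, while SecondsUntilNextRun("19 * * * * *", lastRun) gives 10. The same drift can also work in the other direction and stretch the interval to well over a minute.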
If you still want to generate a random second for a health check you can generate it when the application starts and then keep returning that cron schedule. Like this:
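public class ExampleScheduledHealthCheck : INexusScheduledHealthCheck
{
    // Generated once when the type is first used, then reused for the lifetime of the process
    private static readonly string _schedule = CronSchedule.EveryMinute();
    public string Schedule => _schedule;

    // Remaining INexusScheduledHealthCheck members omitted
}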
This keeps the schedule stable for the lifetime of the application. A new schedule is only generated when the application is restarted or a new version is deployed.
Registering custom health checks
If you created an implementation of INexusHealthCheck or INexusScheduledHealthCheck, you register it by calling:
builder.Services.AddNexus().AddHealthCheck<ExampleScheduledHealthCheck>();