Overview

Discover the most advanced features of Slurm-web.

Dashboard

Slurm-web includes a dashboard with high-level metrics that give quick insight into the status and operation of HPC clusters:

screenshot dashboard tablet

Multi-Clusters

Slurm-web can be deployed on a central server to monitor all HPC clusters in your organization from a single interface.

slurm web multi clusters

From anywhere in the interface, you can jump to another cluster and easily compare their statuses:

screenshot cluster change

Jobs Status

screenshot job badges


Easily visualize job status with colored badges and quickly spot possible failures.

Slurm-web represents the status of each Slurm job with a colored badge, which makes the state of the job queue readable at a glance. Never miss errors when they occur!

Jobs Filters and Sorting

The job queue can be filtered by many criteria (job state, user, account, QOS, partition) and sorted by priority, ID, state, user, etc.

screenshot jobs filters

Filters can be applied and removed instantly with just a few clicks. This makes it easy to observe specific job flows and better understand Slurm scheduling.
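As an illustration of the kind of filtering and sorting described above, here is a minimal Python sketch. It is not Slurm-web's implementation: the `Job` record, the `filter_jobs` and `sort_jobs` helpers, and the sample data are all hypothetical and only show how several criteria can be combined and the result ordered.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Job:
    # Hypothetical, simplified job record; real Slurm jobs carry many more fields.
    job_id: int
    user: str
    account: str
    partition: str
    qos: str
    state: str
    priority: int


def filter_jobs(
    jobs: Iterable[Job],
    *,
    states: Optional[Set[str]] = None,
    users: Optional[Set[str]] = None,
    accounts: Optional[Set[str]] = None,
    qos: Optional[Set[str]] = None,
    partitions: Optional[Set[str]] = None,
) -> List[Job]:
    """Keep jobs matching every criterion that is set (None disables a criterion)."""
    criteria = {
        "state": states,
        "user": users,
        "account": accounts,
        "qos": qos,
        "partition": partitions,
    }
    active = {field: set(allowed) for field, allowed in criteria.items() if allowed is not None}
    return [
        job
        for job in jobs
        if all(getattr(job, field) in allowed for field, allowed in active.items())
    ]


def sort_jobs(jobs: Iterable[Job], key: str = "priority", descending: bool = True) -> List[Job]:
    """Sort jobs on a single attribute such as priority, job_id, state or user."""
    return sorted(jobs, key=lambda job: getattr(job, key), reverse=descending)


# Made-up sample queue used only for this example.
JOBS = [
    Job(101, "alice", "physics", "gpu", "normal", "RUNNING", 500),
    Job(102, "bob", "biology", "cpu", "high", "PENDING", 900),
    Job(103, "alice", "physics", "cpu", "normal", "FAILED", 300),
    Job(104, "carol", "biology", "gpu", "normal", "PENDING", 700),
]

# Pending jobs, highest priority first.
pending_by_priority = sort_jobs(filter_jobs(JOBS, states={"PENDING"}))
```

Criteria are combined with a logical AND, so adding a filter narrows the list and removing one widens it again.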

Live Jobs Status

Slurm-web lets you track specific jobs throughout their lifetime with live updates:

screenshot job status

Watch your jobs run with a visual representation of their progress.

Nodes Status

The live status of the compute nodes is displayed in a graphical representation of the racks. Move the mouse pointer over a node to see all its details:

screenshot nodes hovering

Filters can be applied to quickly identify nodes that are out of production:

screenshot nodes issues

The cluster status can be displayed in fullscreen to provide a constant overview of its health and activity.

slurm web nodes fullscreen

Advanced Reservations

Resources can be pre-allocated for a particular use in Slurm with advanced reservations. Slurm-web displays these reservations with their resources, duration, and authorized users and accounts:

screenshot reservations

QOS

Slurm supports QOS (Quality of Service) with many features and parameters. Slurm-web displays the defined QOS in a concise summary:

screenshot qos

This makes it easy to spot differences between QOS and change limits to adjust the scheduling policy. The user interface includes built-in help messages that explain the limits involved:

slurm web integrated help

Reactive

The Slurm-web interface is continuously updated in near real-time with fresh data fetched from the clusters. Tables and diagrams are updated atomically with the latest changes. You never need to reload pages.

Responsive

The Slurm-web interface is designed to be accessible on all devices, from smartphones to the largest desktop screens.

slurm web responsive

Enterprise Authentication

Slurm-web supports user authentication against an enterprise LDAP directory (FreeIPA, Active Directory, OpenLDAP, etc.).

screenshot auth

Access can be restricted to specific groups of users. Both the legacy NIS and RFC 2307bis schemas are fully supported.

Advanced RBAC Permissions

Administrators can define an advanced authorization policy based on roles (RBAC) and LDAP groups to control all user permissions in Slurm-web.

screenshot rbac
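To make the idea of a role-based policy driven by LDAP groups more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the role names, group names, action names and the `roles_for`, `permitted_actions` and `is_allowed` helpers are hypothetical and do not reflect Slurm-web's actual policy format or code.

```python
from typing import Dict, Iterable, Set

# Hypothetical policy: which LDAP groups grant each role.
ROLE_MEMBERS: Dict[str, Set[str]] = {
    "user": {"scientists", "admins"},
    "admin": {"admins"},
}

# Hypothetical policy: which actions each role is allowed to perform.
ROLE_ACTIONS: Dict[str, Set[str]] = {
    "user": {"view-jobs", "view-nodes"},
    "admin": {"view-jobs", "view-nodes", "view-qos", "view-reservations"},
}


def roles_for(groups: Iterable[str]) -> Set[str]:
    """Return the roles granted by membership in at least one matching LDAP group."""
    user_groups = set(groups)
    return {role for role, members in ROLE_MEMBERS.items() if members & user_groups}


def permitted_actions(groups: Iterable[str]) -> Set[str]:
    """Return the union of the actions allowed by every role the user holds."""
    actions: Set[str] = set()
    for role in roles_for(groups):
        actions |= ROLE_ACTIONS.get(role, set())
    return actions


def is_allowed(groups: Iterable[str], action: str) -> bool:
    """Check whether a user in the given LDAP groups may perform an action."""
    return action in permitted_actions(groups)
```

In this sketch a user with no matching group holds no role and is therefore denied every action, which is the usual deny-by-default choice for an authorization policy.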

Custom Service Messages

Integrate a custom service message directly into the Slurm-web interface to communicate efficiently with users:

screenshot login service message

Transparent Caching

Slurm-web can use the Redis in-memory database to cache Slurm status, in order to maximize performance and significantly reduce the load on the Slurm scheduler.

slurm web transparent cache

Users can track the job list in near real-time very efficiently. Finally drop the load generated by infinite loops of squeue!
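The effect of such a cache can be sketched in a few lines of Python. This example deliberately swaps Redis for a plain in-memory dictionary so that it runs without any external service; the `TTLCache` class, its `get_or_fetch` method and the 30-second lifetime are assumptions made for illustration, not Slurm-web's implementation or configuration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-memory stand-in for a Redis cache: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling fetch() only on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: the scheduler is not queried
        value = fetch()  # miss or expired: query the scheduler once
        self._entries[key] = (now + self._ttl, value)
        return value


# Hypothetical usage: many users asking for the job list within 30 seconds
# trigger a single call to the expensive fetch function.
jobs_cache = TTLCache(ttl_seconds=30)
```

However many clients request the same key during the lifetime of an entry, the scheduler is queried only once, which is where the load reduction comes from.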

Metrics

Slurm-web is designed to integrate with Prometheus (or any compatible solution) to manage many Slurm metrics.

slurm web metrics

Metrics on the status of computing resources and jobs are exported in the standard OpenMetrics format, designed to be collected by Prometheus and stored in a time series database. Slurm-web queries this database to produce charts from these metrics.

slurm web charts

These charts give you a clear view of how the state of your production HPC clusters evolves over time.
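To give an idea of what metrics in the OpenMetrics text format look like, here is a minimal Python sketch that renders gauge metrics by hand. The metric name `example_slurm_jobs`, its labels and values, and the `render_gauge` and `render_exposition` helpers are hypothetical; the names actually exported by Slurm-web may differ.

```python
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[Dict[str, str], float]


def escape_label(value: str) -> str:
    """Escape backslashes, double quotes and newlines inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name: str, help_text: str, samples: Iterable[Sample]) -> List[str]:
    """Render one gauge metric family as a list of text lines."""
    lines = [f"# TYPE {name} gauge", f"# HELP {name} {help_text}"]
    for labels, value in samples:
        rendered = ",".join(
            f'{key}="{escape_label(val)}"' for key, val in sorted(labels.items())
        )
        label_part = f"{{{rendered}}}" if rendered else ""
        lines.append(f"{name}{label_part} {value}")
    return lines


def render_exposition(families: Iterable[List[str]]) -> str:
    """Join metric families and end the exposition with the '# EOF' marker."""
    lines: List[str] = []
    for family in families:
        lines.extend(family)
    lines.append("# EOF")
    return "\n".join(lines) + "\n"


# Made-up values, for illustration only.
exposition = render_exposition(
    [
        render_gauge(
            "example_slurm_jobs",
            "Number of jobs by state",
            [({"state": "running"}, 12), ({"state": "pending"}, 3)],
        )
    ]
)
```

A collector such as Prometheus scrapes text of this kind at regular intervals and stores each sample with its timestamp, which is what makes charts over time possible.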