Quickstart

Requirements

  • Cluster with Slurm >= 23.11 [1] and accounting enabled

  • Host installed with one of the supported GNU/Linux distributions:

    • CentOS

    • AlmaLinux

    • Rocky Linux

    • Fedora

    • RHEL

    • Debian

    • Ubuntu

  • LDAP directory (for authentication)
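
The Slurm version requirement can be checked with a small shell helper. This is a minimal sketch, not part of Slurm-web: the function name slurm_version_ok is hypothetical, and it assumes GNU sort with the -V (version sort) option is available.

```shell
# Hypothetical helper: succeed if the version in $1 is at least 23.11.
# Assumes GNU sort, whose -V option orders version strings.
slurm_version_ok() {
  required="23.11"
  lowest=$(printf '%s\n%s\n' "$required" "$1" | sort -V | head -n 1)
  [ "$lowest" = "$required" ]
}

# Example with literal version strings. On a cluster, the value could be
# taken from the output of `sinfo --version`.
for v in 23.02.7 23.11.0 24.05.1; do
  if slurm_version_ok "$v"; then
    echo "$v: supported"
  else
    echo "$v: too old"
  fi
done
```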

Install slurmrestd

Slurm-web extracts Slurm information from the REST API provided by its slurmrestd daemon. This daemon must be installed on the host. The installation method depends on the origin of the Slurm packages deployed on the cluster:

  • SchedMD RPM packages

  • EPEL

  • SchedMD Deb packages

  • Debian

On clusters deployed with the official SchedMD RPM packages, install the slurmrestd daemon with this command:

# dnf install slurm-slurmrestd

Please refer to the official SchedMD Slurm installation guide for more help.

On clusters deployed with RPM packages from the EPEL community, install the slurmrestd daemon with this command:

# dnf install slurm-slurmrestd

On clusters deployed with the official SchedMD Deb packages, install the slurmrestd daemon with this command:

# apt install slurm-smd-slurmrestd

Please refer to the official SchedMD Slurm installation guide for more help.

On clusters deployed with Deb packages from the Debian community, install the slurmrestd daemon with this command:

# apt install slurmrestd

Setup slurmrestd

Create the drop-in configuration override /etc/systemd/system/slurmrestd.service.d/slurm-web.conf for the slurmrestd daemon:

[Service]
# Unset vendor unit ExecStart to avoid cumulative definition
ExecStart=
Environment=
# Disable slurm user security check
Environment=SLURMRESTD_SECURITY=disable_user_check
ExecStart=/usr/sbin/slurmrestd $SLURMRESTD_OPTIONS unix:/run/slurmrestd/slurmrestd.socket
RuntimeDirectory=slurmrestd
RuntimeDirectoryMode=0755
User=slurm
Group=slurm

This configuration file makes slurmrestd listen for incoming connections on a Unix socket accessible by Slurm-web. In this configuration, slurmrestd runs as the special slurm user to get more permissions on the Slurm cluster. slurmrestd normally refuses to run as this user unless the SLURMRESTD_SECURITY=disable_user_check environment variable is defined. This security measure is relevant in many use cases but not for Slurm-web, which has its own internal authorization policy to control user permissions and enforce security.

Make systemd reload the unit changes from disk:

# systemctl daemon-reload

Enable and start slurmrestd service:

# systemctl enable --now slurmrestd.service

To check that the slurmrestd daemon is running properly, run this command:

# curl --unix-socket /run/slurmrestd/slurmrestd.socket http://slurm/slurm/v0.0.40/diag
{
   "meta": {
     "plugin": {
      "type": "openapi\/slurmctld",
      "name": "Slurm OpenAPI slurmctld",
      "data_parser": "data_parser\/v0.0.40",
      "accounting_storage": "accounting_storage\/slurmdbd"
    },
   }
  …
}

In case of failure, please refer to the troubleshooting guide for help.
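
When scripting this check, the data_parser version reported in the diag output can be extracted and compared with the REST API version expected by Slurm-web. The following is a minimal sketch: the function name diag_parser_version is hypothetical, and it assumes the field appears on one line in the form shown in the output above, with the slash optionally escaped.

```shell
# Hypothetical helper: read slurmrestd diag JSON on stdin and print the
# version suffix of the "data_parser" field (for example v0.0.40).
# The slash may appear escaped as \/ in the JSON output.
diag_parser_version() {
  sed -n 's|.*"data_parser": *"data_parser\\\{0,1\}/\(v[0-9.]*\)".*|\1|p'
}

# Sample line copied from the output above. On a real host, pipe the curl
# command shown above into the function instead.
printf '%s\n' '      "data_parser": "data_parser\/v0.0.40",' | diag_parser_version
```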

Install Slurm-web

For simplicity, this quickstart guide provides a simple installation method with distribution system packages, compatible with most environments. Please refer to the complete Installation Guide for more detailed installation methods.

DNF

This procedure works on RHEL, CentOS, Rocky Linux, AlmaLinux OS and Fedora.

On RHEL, CentOS and Rocky Linux, some dependencies are missing from the standard distribution repositories. You must enable the EPEL repositories to get all requirements on these distributions:

# dnf install -y epel-release

Download and save the RPM repository keyring:

# curl https://pkgs.rackslab.io/keyring.asc --output /etc/pki/rpm-gpg/RPM-GPG-KEY-Rackslab

Create the DNF repository file /etc/yum.repos.d/rackslab.repo with the content matching your distribution.

RHEL 8 (these packages are also compatible with CentOS 8, Rocky Linux 8 and AlmaLinux OS 8):

[rackslab]
name=Rackslab
baseurl=https://pkgs.rackslab.io/rpm/el8/main/$basearch/
gpgcheck=1
enabled=1
countme=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-Rackslab

RHEL 9 (these packages are also compatible with CentOS 9, Rocky Linux 9 and AlmaLinux OS 9):

[rackslab]
name=Rackslab
baseurl=https://pkgs.rackslab.io/rpm/el9/main/$basearch/
gpgcheck=1
enabled=1
countme=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-Rackslab

Fedora 39:

[rackslab]
name=Rackslab
baseurl=https://pkgs.rackslab.io/rpm/fc39/main/$basearch/
gpgcheck=1
enabled=1
countme=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-Rackslab

Fedora 40:

[rackslab]
name=Rackslab
baseurl=https://pkgs.rackslab.io/rpm/fc40/main/$basearch/
gpgcheck=1
enabled=1
countme=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-Rackslab
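
The repository definitions above differ only in the release tag of the baseurl, so the file can be generated from that tag. This is a minimal sketch for illustration: the function name rackslab_repo is hypothetical, and it accepts only the four tags that appear above.

```shell
# Hypothetical helper: print the repository definition for one of the
# release tags used above (el8, el9, fc39, fc40).
rackslab_repo() {
  case "$1" in
    el8|el9|fc39|fc40) ;;
    *) echo "unsupported release tag: $1" >&2; return 1 ;;
  esac
  # \$basearch is escaped so that DNF, not the shell, expands it.
  cat <<EOF
[rackslab]
name=Rackslab
baseurl=https://pkgs.rackslab.io/rpm/$1/main/\$basearch/
gpgcheck=1
enabled=1
countme=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-Rackslab
EOF
}

# Print the definition for RHEL 9 and compatible distributions. Redirect
# the output to /etc/yum.repos.d/rackslab.repo as root to install it.
rackslab_repo el9
```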

Install Slurm-web agent and gateway packages:

# dnf install slurm-web-agent slurm-web-gateway

All dependencies are automatically installed.

APT

This procedure works on Debian and Ubuntu.

Download and install packages repository signing key:

# curl -sS https://pkgs.rackslab.io/keyring.asc | gpg --dearmor | tee /usr/share/keyrings/rackslab.gpg > /dev/null

Create the APT sources file /etc/apt/sources.list.d/rackslab.sources with the content matching your distribution.

Debian 12 « bookworm »:

Types: deb
URIs: https://pkgs.rackslab.io/deb
Suites: bookworm
Components: main
Architectures: amd64
Signed-By: /usr/share/keyrings/rackslab.gpg

Debian 13 « trixie »:

Types: deb
URIs: https://pkgs.rackslab.io/deb
Suites: trixie
Components: main
Architectures: amd64
Signed-By: /usr/share/keyrings/rackslab.gpg

Debian unstable « sid »:

Types: deb
URIs: https://pkgs.rackslab.io/deb
Suites: sid
Components: main
Architectures: amd64
Signed-By: /usr/share/keyrings/rackslab.gpg

Ubuntu 24.04 LTS:

Types: deb
URIs: https://pkgs.rackslab.io/deb
Suites: ubuntu24.04
Components: main
Architectures: amd64
Signed-By: /usr/share/keyrings/rackslab.gpg
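
Only the Suites field changes between the sources files above. As a minimal sketch, the suite can be derived from the ID and VERSION_ID values of /etc/os-release and the file generated accordingly. Both function names are hypothetical, and the mapping covers only the releases listed above (Debian unstable is not mapped; pass sid to rackslab_sources directly).

```shell
# Hypothetical helper: map os-release ID and VERSION_ID values to the
# suite names listed above.
rackslab_suite() {
  case "$1:$2" in
    debian:12) echo bookworm ;;
    debian:13) echo trixie ;;
    ubuntu:24.04) echo ubuntu24.04 ;;
    *) echo "no suite known for $1 $2" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print the sources file content for a given suite.
rackslab_sources() {
  cat <<EOF
Types: deb
URIs: https://pkgs.rackslab.io/deb
Suites: $1
Components: main
Architectures: amd64
Signed-By: /usr/share/keyrings/rackslab.gpg
EOF
}

# Example for Debian 12. On a real host, the arguments could come from
# sourcing /etc/os-release and passing "$ID" and "$VERSION_ID".
rackslab_sources "$(rackslab_suite debian 12)"
```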

Update packages repositories metadata:

# apt update

Install Slurm-web agent and gateway packages:

# apt install slurm-web-agent slurm-web-gateway

All dependencies are automatically installed.

Initial setup

Create agent configuration file /etc/slurm-web/agent.ini to set the cluster name, for example:

[service]
cluster=atlas

Create gateway configuration file /etc/slurm-web/gateway.ini with URL to the agent:

[agents]
url=http://localhost:5012

By default, the Slurm-web agent listens on TCP port 5012 of the loopback network interface. This can be changed with the port parameter in the service section of the agent configuration.
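
To verify these settings from a script, values can be read back from the INI files. The following is a minimal sketch, not a Slurm-web tool: the function name ini_get is hypothetical, and it assumes the key=value form without spaces used in this guide. The example writes to a temporary directory rather than /etc/slurm-web.

```shell
# Hypothetical helper: print the value of KEY in [SECTION] of an INI file.
# usage: ini_get FILE SECTION KEY
# Assumes "key=value" lines without spaces around "=".
ini_get() {
  awk -F= -v section="$2" -v key="$3" '
    /^\[.*\]$/ { current = substr($0, 2, length($0) - 2); next }
    current == section && $1 == key { sub(/^[^=]*=/, ""); print; exit }
  ' "$1"
}

# Recreate the two files from this section in a temporary directory.
conf=$(mktemp -d)
printf '[service]\ncluster=atlas\n' > "$conf/agent.ini"
printf '[agents]\nurl=http://localhost:5012\n' > "$conf/gateway.ini"

ini_get "$conf/agent.ini" service cluster
ini_get "$conf/gateway.ini" agents url
rm -r "$conf"
```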

JWT signing key

Slurm-web authenticates users with JSON Web Tokens (JWT) for communications between its components. A secret key is required to cryptographically sign the generated tokens. Run this command to generate this key:

# /usr/libexec/slurm-web/slurm-web-gen-jwt-key
INFO ⸬ Running slurm-web-gen-jwt-key
INFO ⸬ Generating JWT private key file /var/lib/slurm-web/jwt.key
INFO ⸬ Setting read permission on key for slurm-web user
INFO ⸬ Setting read permission on key for slurm user

RacksDB database

Slurm-web uses RacksDB to generate graphical representations of datacenter racks with the compute nodes. The RacksDB database must be defined with your HPC cluster infrastructure. This is quick and easy to do based on the examples provided.

Some requirements must be fulfilled in this database:

  • The infrastructure must have the same name as the cluster previously declared in agent configuration file.

  • The compute tag must be assigned to all compute nodes declared in Slurm configuration.

You can choose another tag name, but you will then have to declare this tag in the racksdb section of the agent configuration, for example:

[racksdb]
tags=blade

First Access

Slurm-web is now ready to start! Enable and start the agent and gateway native services:

# systemctl enable --now slurm-web-agent.service
# systemctl enable --now slurm-web-gateway.service

Connect your browser to the gateway on http://localhost:5011. You should see the configured cluster:

[Figure: slurm web clusters]

By default, the Slurm-web gateway native service listens for incoming network connections on TCP port 5011. This can be changed with the port parameter in the service section of the gateway configuration.

Also, the gateway native service is bound to the loopback network interface by default, which restricts access to localhost for security reasons. It is recommended to set up a production HTTP server for external access to Slurm-web. However, the bind address can be changed with the following lines in the gateway configuration file /etc/slurm-web/gateway.ini:

[service]
interface=HOSTNAME_OR_IP

[ui]
host=http://HOSTNAME_OR_IP:5011

Real values depend on the DNS hostname or the public IP address of the host.

Slurm-web is now available at: http://HOSTNAME_OR_IP:5011

Please refer to the gateway configuration reference documentation for more details.

In case of failure, please refer to the troubleshooting guide for help.

Setup authentication

To restrict access to the dashboard, you must enable authentication. Slurm-web supports LDAP authentication.

Add the following settings in the gateway configuration file /etc/slurm-web/gateway.ini:

[authentication]
enabled=yes

[ldap]
uri=ldap://SERVER
user_base=ou=PEOPLE,dc=EXAMPLE,dc=TLD
group_base=ou=GROUPS,dc=EXAMPLE,dc=TLD

SERVER and the user and group search bases must be adapted to match your LDAP server and directory tree.
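
Many directories derive their base DN from the DNS domain of the organization (one dc= component per label), although this is only a convention and your directory may differ. Under that assumption, the search bases can be composed as in this minimal sketch. The function name domain_to_dn and the ou names are hypothetical placeholders.

```shell
# Hypothetical helper: turn a DNS domain into dc= components,
# for example example.tld becomes dc=example,dc=tld.
domain_to_dn() {
  printf '%s\n' "$1" | sed 's/^/dc=/; s/\./,dc=/g'
}

# The ou names below are placeholders. Use the ones of your directory tree.
base=$(domain_to_dn example.tld)
echo "user_base=ou=people,$base"
echo "group_base=ou=groups,$base"
```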

Slurm-web also supports LDAPS (SSL/TLS) and STARTTLS secured protocols with LDAP servers.

The groups of users permitted to authenticate on Slurm-web can also be restricted with the restricted_groups parameter.

Please refer to the reference documentation of the ldap section in the gateway configuration for more details.

Restart gateway service to apply the new configuration:

# systemctl restart slurm-web-gateway.service

The authentication form is now presented on Slurm-web access:

[Figure: slurm web login]

Upon successful LDAP authentication, users have access to the clusters.

Having trouble setting up LDAP authentication? The slurm-web-ldap-check utility might help. More details are available in the Troubleshooting page.

Setup policy

At this stage, the agent is running with the default authorization policy. You can create the file /etc/slurm-web/policy.ini to define a custom fine-grained RBAC policy with specific roles.

Consider this example:

[roles]
user=@biology
admin=jwalls

[user]
actions=view-stats,view-jobs,view-nodes

[admin]
actions=view-partitions,view-qos,view-accounts,view-reservations

This policy defines two roles:

  • user for members of biology group. This role has permission on view-stats, view-jobs and view-nodes actions.

  • admin for jwalls user. This role has permission on view-partitions, view-qos, view-accounts and view-reservations actions.

Please refer to the Authorization policy reference documentation for a description of all available actions and the corresponding permissions granted in the user interface.

Restart the agent component to apply the new configuration:

# systemctl restart slurm-web-agent.service

User jwalls, who is also a member of the biology group, is granted both the user and admin roles and therefore has access to everything:

[Figure: slurm web policy admin]

Another user in biology group can only view jobs and resources:

[Figure: slurm web policy user]

Access to the cluster is denied to all other users:

[Figure: slurm web policy others]
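
The way roles combine in the three cases above can be illustrated with a small script that resolves the actions granted to a user. This is only a sketch of the behaviour described in this section, not the logic implemented by Slurm-web. The function name policy_actions and the users alice and bob are hypothetical, and the sketch assumes that the [roles] section comes first in the file and that @name denotes a group.

```shell
# Hypothetical helper: print the actions granted to a user by a policy file.
# usage: policy_actions POLICY_FILE USER [GROUP...]
policy_actions() {
  file=$1; user=$2; shift 2
  members="$user"
  for g in "$@"; do members="$members @$g"; done
  awk -F= -v members="$members" '
    BEGIN { n = split(members, m, " "); for (i = 1; i <= n; i++) is_member[m[i]] = 1 }
    /^\[.*\]$/ { section = substr($0, 2, length($0) - 2); next }
    NF < 2 { next }
    section == "roles" {
      # A role is granted if any of its members matches the user or a group.
      k = split($2, who, ",")
      for (i = 1; i <= k; i++) if (who[i] in is_member) granted[$1] = 1
      next
    }
    (section in granted) && $1 == "actions" {
      k = split($2, act, ",")
      for (i = 1; i <= k; i++) print act[i]
    }
  ' "$file" | sort -u
}

# Example policy from this section, written to a temporary file.
policy=$(mktemp)
cat > "$policy" <<'EOF'
[roles]
user=@biology
admin=jwalls

[user]
actions=view-stats,view-jobs,view-nodes

[admin]
actions=view-partitions,view-qos,view-accounts,view-reservations
EOF

echo "jwalls (biology):"; policy_actions "$policy" jwalls biology
echo "alice (biology):";  policy_actions "$policy" alice biology
echo "bob (physics):";    policy_actions "$policy" bob physics
rm "$policy"
```

The union of actions across all granted roles is why jwalls sees everything, while a user matching no role gets an empty list.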

Setup cache

Slurm-web has a transparent caching feature which can use the Redis in-memory database (or any compatible alternative) to cache Slurm responses.

It is highly recommended to set up the cache on the Slurm-web agent to significantly reduce the number of repetitive requests sent to Slurm and thus reduce its load.

Install Redis:

For Debian and Ubuntu, run this command:

# apt install redis-server

For RHEL, CentOS, Rocky Linux and AlmaLinux OS, run this command:

# dnf install redis

Start and enable the service:

# systemctl enable --now redis.service

Edit agent configuration file /etc/slurm-web/agent.ini to enable cache:

[cache]
enabled=yes

It is also possible to set up a remote Redis server, configure a password to access a server secured in protected mode, or adjust cache timeouts. See the cache section of the agent configuration file for more details.

Production HTTP server

At this stage, Slurm-web is served by a Python HTTP server that is not designed for production requirements: performance is not optimal and network communications are not secured with an SSL/TLS certificate. This setup is not recommended for production. Instead, set up a production HTTP server such as Nginx, Apache or Caddy to launch Slurm-web as a WSGI application.

For simplicity, this quickstart guide only gives the procedure to set up Nginx, but documentation is available for other supported HTTP servers.

Stop and disable Slurm-web native services:

# systemctl disable --now slurm-web-agent.service slurm-web-gateway.service

Edit gateway configuration file /etc/slurm-web/gateway.ini:

[ui]
host=http://DNS_HOSTNAME

[agents]
url=http://localhost/agent

Real values depend on the DNS hostname of the host.

The [service] section can now be safely removed, as it is ignored when Slurm-web components are launched as WSGI applications.

Copy examples of uWSGI services provided in Slurm-web packages and reload units:

# cp -v /usr/share/slurm-web/wsgi/agent/slurm-web-{gateway,agent}-uwsgi.service \
  /etc/systemd/system/
# systemctl daemon-reload

Start and enable these services:

# systemctl enable --now slurm-web-agent-uwsgi.service slurm-web-gateway-uwsgi.service

Edit Nginx site configuration to add these locations:

server {
  …

  location / {
    include uwsgi_params;
    uwsgi_pass unix:/run/slurm-web-gateway/uwsgi.sock;
  }

  location /agent/ {
    include uwsgi_params;
    rewrite ^/agent/(.*)$ /$1 break;
    uwsgi_pass unix:/run/slurm-web-agent/uwsgi.sock;
  }
}

Reload Nginx to apply new configuration:

# systemctl reload nginx.service

Slurm-web is now available at: http://DNS_HOSTNAME/

In case of failure, please refer to the troubleshooting guide for help.
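
The rewrite directive in the /agent/ location above strips the /agent prefix before the request reaches the agent, which is why the gateway can use http://localhost/agent as the agent URL. The mapping can be reproduced with sed as a minimal sketch. The function name agent_upstream_path and the example paths are hypothetical.

```shell
# Hypothetical helper: apply the same substitution as the Nginx directive
#   rewrite ^/agent/(.*)$ /$1 break;
# to a request path and print the path seen by the agent.
agent_upstream_path() {
  printf '%s\n' "$1" | sed -E 's|^/agent/(.*)$|/\1|'
}

# Example paths, chosen only for illustration.
agent_upstream_path /agent/version
agent_upstream_path /agent/
```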

Metrics (optional)

Slurm-web offers the possibility to export Slurm metrics in OpenMetrics format and integrate with Prometheus. This feature can be used to store metrics in timeseries databases and draw diagrams of historical data.

This feature is disabled by default. It can be enabled with the following lines in /etc/slurm-web/agent.ini:

[metrics]
enabled=yes

Multi-clusters

Slurm-web is designed to support a distributed setup with a central server and multiple clusters. Compared to the steps above, the following changes must be considered:

  1. Install and setup slurmrestd on all clusters.

  2. Install Slurm-web agent on all clusters, colocated on the same hosts as slurmrestd.

  3. Install Slurm-web gateway on the central server.

  4. Setup production HTTP servers with HTTPS (SSL/TLS) for all agents and the gateway.

  5. Set URL of all agents in agents section of gateway configuration.

  6. Generate JWT signing key on central server and deploy this key on all agents servers (same key must be shared by all agents and the gateway).

  7. Deploy RacksDB database on all agents servers.

  8. Deploy custom policy on all agents servers.

  9. Install Redis on all agents servers.

Et voilà!


1. Slurm-web 4.0.0 actually requires Slurm REST API v0.0.40 available in Slurm 23.11 and above. Please refer to Slurm REST API versions section for more details.