Hyperscalers Run:ai Appliance

  • Fully pre-integrated solution including hardware, software and support services
  • Kubernetes-based software platform
  • Fair-share scheduling to allow users to share clusters of GPUs easily and automatically
  • Fractional GPU allocation for interactive/training workloads
  • Simplified workflows for building, training and deployment of AI models
  • Visibility into workloads and resource utilization to improve user productivity
  • Control for cluster admin and ops teams, to align priorities to business goals
  • On-demand access to Multi-Instance GPU (MIG) instances for the A100 GPU

  • USD $46,750.00

    *RRP Pricing*




Hyperscalers Run:ai Appliance 

The Hyperscalers Run:ai Appliance offers a fully integrated hardware and software solution that supports deployment of common container-based AI applications designed for researchers, academics and business users.

AI infrastructure hardware is an expensive resource, yet most AI workload scheduling environments cannot utilise all GPU resources optimally - which means AI hardware infrastructure must often be significantly over-provisioned.

The Hyperscalers Run:ai Appliance stands out against this backdrop due to its ability to perform fine-grained sub-allocation of GPU resources between multiple simultaneous workloads. In addition, the appliance can dynamically re-allocate GPU resources against any workload while it is being processed, actively releasing GPU resources once they are no longer required or adding GPU resources as they become needed.
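To make this concrete, the sketch below shows what requesting a fraction of a GPU can look like from the user's side on a Run:ai-managed cluster, using the standard Kubernetes Python client. It is an illustration only: the gpu-fraction annotation and runai-scheduler scheduler name follow Run:ai's published conventions, and the runai-team-a project namespace and train.py entrypoint are hypothetical examples.

    # Illustrative sketch: request half of one GPU for a single pod on a
    # Run:ai-managed cluster. The "gpu-fraction" annotation and the
    # "runai-scheduler" scheduler name follow Run:ai conventions; the
    # namespace "runai-team-a" is a hypothetical project namespace.
    from kubernetes import client, config

    config.load_kube_config()  # use the appliance kubeconfig

    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(
            name="frac-train",
            namespace="runai-team-a",             # Run:ai projects map to namespaces
            annotations={"gpu-fraction": "0.5"},  # request 50% of one GPU
        ),
        spec=client.V1PodSpec(
            scheduler_name="runai-scheduler",     # hand scheduling to Run:ai
            restart_policy="Never",
            containers=[
                client.V1Container(
                    name="trainer",
                    image="nvcr.io/nvidia/pytorch:23.05-py3",  # example NGC image
                    command=["python", "train.py"],            # hypothetical entrypoint
                )
            ],
        ),
    )

    client.CoreV1Api().create_namespaced_pod(namespace="runai-team-a", body=pod)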

The Hyperscalers Run:ai Appliance is a fully pre-integrated solution including hardware, software and support services that can be deployed at small, medium or large scale to match your initial needs, then scaled up later to whatever size is required.

This kind of power puts an unprecedented level of AI agility and efficiency within the reach of any organisation.

The Hyperscalers Run:ai Appliance management console allows users without deep infrastructure expertise to schedule, monitor and manage their own workloads. This is truly disruptive: it puts NGC model execution within reach of your whole organisation, atop the most efficient AI infrastructure environment possible.

The following diagram illustrates the high-level relationship between NVIDIA NGC, 3rd-party MLOps and model-serving tools, and the Run:ai built-in tools and workflows, together with Run:ai's ability to operate within numerous well-known Kubernetes implementations on qualified, reliable, state-of-the-art equipment from Hyperscalers:


                                                      Figure 1 Hyperscalers Run:ai Appliance stack


Start Configuring using Hyperscalers Servers


High Performance Compute Run:ai Appliance


Balanced Performance Run:ai Appliance


Cost Efficient Run:ai Appliance

Download the full reference guide in the downloads tab. 


Assign the Right Amount of AI Compute Power to Users, Automatically

The Hyperscalers Run:ai Appliance is a Kubernetes-based software platform for orchestration of containerized AI workloads that enables GPU clusters to be utilized for different Deep Learning workloads dynamically - from building AI models, to training, to inference. With Run:ai, jobs at any stage can obtain access to the compute power they need, automatically.

Run:ai’s compute management platform speeds up data science initiatives by pooling available resources and then dynamically allocating them as needed. These powerful capabilities maximise AI compute power utilisation and therefore return on investment for your organisation.

Key Features

Fair-share scheduling to allow users to share clusters of GPUs easily and automatically

Fractional GPU allocation for interactive/training workloads

Simplified workflows for building, training (including multi-GPU and distributed training) and deployment of AI models

Visibility into workloads and resource utilization to improve user productivity

Control for cluster admin and ops teams, to align priorities to business goals

On-demand access to Multi-Instance GPU (MIG) instances for the A100 GPU

Key Benefits

Advanced Kubernetes-based Scheduling Eliminates Static GPU Allocation

The Run:ai Scheduler manages tasks in batches using multiple queues on top of Kubernetes, allowing system admins to define different rules, policies, and requirements for each queue based on business priorities. Combined with an over-quota system and configurable fairness policies, the allocation of resources can be automated and optimized to allow maximum utilization of cluster resources.

Because it was built as a plug-in to K8s, Run:ai’s scheduler requires no advanced setup, and is certified to integrate with any number of Kubernetes “flavors” including Red Hat OpenShift and HPE Ezmeral. 
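As a minimal sketch of this plug-in behaviour, the snippet below submits a stock Kubernetes batch Job and simply points it at the Run:ai scheduler via schedulerName, so no bespoke job API is required. The project label used to tie the job to a Run:ai queue and quota is an assumption for illustration; check the label keys your Run:ai version expects.

    # Minimal sketch: a standard Kubernetes batch Job steered to the Run:ai
    # scheduler by setting schedulerName. The "project" label binding the job
    # to a Run:ai queue/quota is an assumed convention for illustration.
    from kubernetes import client, config

    config.load_kube_config()

    job = client.V1Job(
        metadata=client.V1ObjectMeta(
            name="batch-train",
            namespace="runai-team-a",          # hypothetical project namespace
            labels={"project": "team-a"},      # assumed queue/project binding
        ),
        spec=client.V1JobSpec(
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={"project": "team-a"}),
                spec=client.V1PodSpec(
                    scheduler_name="runai-scheduler",  # the plug-in scheduler
                    restart_policy="Never",
                    containers=[
                        client.V1Container(
                            name="trainer",
                            image="nvcr.io/nvidia/pytorch:23.05-py3",
                            resources=client.V1ResourceRequirements(
                                limits={"nvidia.com/gpu": "1"}  # one whole GPU
                            ),
                        )
                    ],
                ),
            )
        ),
    )

    client.BatchV1Api().create_namespaced_job(namespace="runai-team-a", body=job)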

The following diagram illustrates key architecture and functional elements of the fully integrated Hyperscalers Run:ai Appliance:


                                                              Figure 2 Hyperscalers Run:ai Appliance Workflow Architecture


The following illustration shows Run:ai management dashboard monitoring UI capabilities supporting both high-level overview and detailed observation of AI workload status and resource utilisation: 


                                                       Figure 3 Overview of the Hyperscalers Run:ai Management Dashboard


No More Idle Resources

Run:ai’s over-quota system allows users to automatically access idle resources when available, based on configurable fairness policies. The platform allocates resources dynamically, for full utilisation of cluster resources. Our customers typically see utilisation improve from around 25% when we start working with them to over 75% once fully optimised.

Bridge Between HPC and AI

The Run:ai Scheduler allows users to easily make use of integer GPUs, multi-node/multi-GPU configurations, and even Multi-Instance GPU (MIG) instances for distributed training on Kubernetes. In this way, AI workloads run based on needs, not available capacity. Run:ai empowers you to combine the benefits and efficiency of High-Performance Computing with the simplicity of Kubernetes.
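For example, a MIG slice on an A100 node can be requested through the standard NVIDIA device-plugin resource names. The sketch below is an illustration under assumptions, not the only way Run:ai exposes MIG: it asks for one 1g.5gb slice, and the exact resource names depend on the MIG profiles configured on the node.

    # Sketch: request one MIG slice on an A100 node via the standard NVIDIA
    # device-plugin resource name (e.g. "nvidia.com/mig-1g.5gb"). Resource
    # names vary with the MIG profiles configured on the node.
    from kubernetes import client, config

    config.load_kube_config()

    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="mig-job", namespace="runai-team-a"),
        spec=client.V1PodSpec(
            scheduler_name="runai-scheduler",
            restart_policy="Never",
            containers=[
                client.V1Container(
                    name="worker",
                    image="nvcr.io/nvidia/cuda:12.2.0-base-ubuntu22.04",
                    command=["nvidia-smi", "-L"],  # list the visible MIG device
                    resources=client.V1ResourceRequirements(
                        limits={"nvidia.com/mig-1g.5gb": "1"}  # one 1g.5gb slice
                    ),
                )
            ],
        ),
    )

    client.CoreV1Api().create_namespaced_pod(namespace="runai-team-a", body=pod)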

Accelerate AI

By using Run:ai resource pooling, queueing, and prioritization mechanisms, researchers are shielded from infrastructure management hassles and can focus exclusively on data science. Many workloads can be run in parallel without compute bottlenecks. Run:ai delivers real time and historical views on all resources managed by the platform, such as jobs, deployments, projects, users, GPUs and clusters. 

The Hyperscalers Run:ai Appliance accelerates your time to productive AI results, particularly because it is delivered as a turn-key solution with all master, worker and storage nodes pre-configured and ready to start running your AI workloads.

Streamline AI

Run:ai can support all types of workloads required within the AI lifecycle (build, train, inference) to easily start experiments, run large-scale training jobs and take AI models to production without ever worrying about the underlying infrastructure. The Run:ai Atlas platform allows MLOps and AI Engineering teams to quickly operationalize AI pipelines at scale and run production machine learning models anywhere while using the built-in ML toolset or simply integrating their existing 3rd party toolset.

Productize AI

Run:ai’s unique GPU Abstraction capabilities effectively “virtualize” all available GPU resources to maximize infrastructure efficiency and increase ROI. The platform pools expensive compute resources and makes them accessible to researchers on-demand for a simplified, cloud-like experience. 

A distinction should be made between Run:ai and other GPU virtualisation products, which support neither sub-allocation (percentage-based splitting) of GPUs across multiple AI workloads, nor dynamic reclamation and re-assignment of GPU resources on the fly while an AI workload job is in progress.

Run:ai's dynamic resource allocation capabilities prevent GPU resources from sitting unusable by new and ongoing workloads once the workload they were originally allocated to no longer needs them. (Run:ai can disable dynamic allocation for specific workloads if required and fall back to static allocation.)

Run:ai helps organizations accelerate their AI journey, from fast entry into building initial models to scaling AI in production. Using Run:ai's Atlas software platform, companies streamline the development, management and scaling of AI applications. Researchers gain on-demand access to pooled resources for any AI workload, and an innovative AI operating system manages equipment resources from fractions of a GPU up to large-scale distributed training.



Contents Page - Download the Reference Guide in the Downloads Tab 

1 Introduction 4

Hyperscalers Run:ai Appliance 6

Audience and Purpose 9

Digital IP Appliance Design Process 10

Appliance Optimizer Utility (AOU) 10

Featured Hardware from Hyperscalers 11

Important Considerations 13

2 Base Product Deployment 14

Preinstallation Requirements 14

Installation Components 14

Hardware Deployment 14

Software Deployment 16

Installation of operating system 16

Installation of Kubernetes 16

Prerequisites 16

Run on All Nodes 16

Permanently disable swap on all nodes 19

Avoiding Accidental Upgrades 20

Run:ai Software Prerequisites 20

Install the Run:ai Control Plane (Backend) 22

3 Configure the Appliance 23

4 Updating the Appliance 29

Prerequisites for updating 29

5 Testing the Appliance 30

6 Integration 38

7 Maintenance 44

8 Addendum 47

9 References 48

List Of Figures 

Figure 1 Hyperscalers Run:ai Appliance stack 5

Figure 2 Hyperscalers Run:ai Appliance Workflow Architecture 7

Figure 3 Digital IP-Appliance Design Process 10

Figure 4 Front view of S5N 11

Figure 5 Rear view and Internals of S5N 11

Figure 6 Front view of HSGP1 12

Figure 7 Top view and Internals of HSGP1 12

Figure 8 Front Internal view of GZ2 12

Figure 9 SXM GPUs of GZ2 12

Figure 10 Run:ai Atlas Lab as a Service Appliance Architecture from Hyperscalers (can be changed on customer’s request) 15

Figure 11 Stages of Machine Learning Application development [15] 30

Figure 12 Assigning one complete GPU to a job 31

Figure 13 Deployment of the job with one complete GPU 31

Figure 14 Fractional GPU virtualization in Run:ai Atlas [15] 32

Figure 15 Allocating 10% of GPU to a job 33

Figure 16 Run:ai Atlas Dashboard training utilisation 34

Figure 17 Run:ai Atlas Dashboard interactive utilisation 34

Figure 18 Run:ai Atlas scheduler features [15] 35

Figure 19 Creating a training job for distributed training (multi-node) 36

Figure 20 Creating a training job for multi-GPU training 36

Figure 21 Run:ai Atlas Machine learning model deployment [15] 37

Figure 22 Autoscaling feature of Run:ai Atlas for any model deployment [15] 38

Figure 23 A sample screenshot to demonstrate clustered GPUs [15] 47

Downloads

White Paper
Title: Run:ai Introduction document | Version: 1 | Size: 787KB

Reference Architecture
Title: Run:ai Reference Document | Version: 1 | Size: 2MB
