Wednesday, May 31, 2023

Generate Random Number of Jobs for Various HPC Workloads, Constrained by Resource Availability

Overview

I am trying to use Python to generate a randomized distribution of resources for various HPC workloads, constrained by the resources available on the cluster.

For example:

  • Type A = 4 cores, 8GB RAM
  • Type B = 16 cores, 32GB RAM
  • Type C = 32 cores, 128GB RAM

The cluster contains, say, 600 cores and 3000GB of RAM. Let's assume I want to have 50 concurrent users (or close to it).

Example

One feasible situation would be:

n_users = 50
n_A = 30 # 120 cores, 240GB RAM
n_B = 10 # 160 cores, 320GB RAM
n_C = 10 # 320 cores, 1280GB RAM

# 120 + 160 + 320 <= 600 cores
# 240 + 320 + 1280 <= 3000GB RAM
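
The arithmetic in this example can be checked mechanically. The snippet below is a minimal sketch: the names `JOB_TYPES`, `usage`, and `is_feasible` are made up for illustration, and the limits are the 600 cores and 3000GB quoted above.

```python
# Per-job requirements from the question: (cores, RAM in GB).
JOB_TYPES = {"A": (4, 8), "B": (16, 32), "C": (32, 128)}


def usage(counts):
    """Return (total_cores, total_ram_gb) for a dict like {"A": 30, "B": 10, "C": 10}."""
    cores = sum(n * JOB_TYPES[t][0] for t, n in counts.items())
    ram = sum(n * JOB_TYPES[t][1] for t, n in counts.items())
    return cores, ram


def is_feasible(counts, max_cores=600, max_ram_gb=3000):
    """True if the job mix fits within both the core and the RAM limit."""
    cores, ram = usage(counts)
    return cores <= max_cores and ram <= max_ram_gb


example = {"A": 30, "B": 10, "C": 10}
print(usage(example))        # (600, 1840)
print(is_feasible(example))  # True
```

Note that this mix uses exactly 600 cores, so the core limit is the binding constraint here, while RAM still has plenty of headroom.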

Questions

  1. How can I generate randomized distributions of resource allocation per machine type (A, B, C) for n users without violating the available resources?

  2. How could I generate randomized distributions of resource allocation per machine type for a random number of users but with a fixed core count? I.e. the constraint is to use exactly 500 cores.
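
One possible way to approach both questions, given how small the numbers are, is to enumerate every job mix that fits the cluster and then pick one at random. The following is only a sketch under that assumption: `feasible_mixes` and its parameters are hypothetical names, and "random" is taken to mean uniform over distinct (n_A, n_B, n_C) combinations, which may or may not be the distribution wanted.

```python
import random

CORES = {"A": 4, "B": 16, "C": 32}
RAM_GB = {"A": 8, "B": 32, "C": 128}


def feasible_mixes(max_cores=600, max_ram_gb=3000, n_users=None, exact_cores=None):
    """List every (n_A, n_B, n_C) that fits within the core and RAM limits.

    n_users: if given, keep only mixes with exactly that many jobs.
    exact_cores: if given, keep only mixes using exactly that many cores.
    """
    mixes = []
    for n_c in range(max_cores // CORES["C"] + 1):
        left_c = max_cores - n_c * CORES["C"]
        for n_b in range(left_c // CORES["B"] + 1):
            left_b = left_c - n_b * CORES["B"]
            for n_a in range(left_b // CORES["A"] + 1):
                cores = n_a * CORES["A"] + n_b * CORES["B"] + n_c * CORES["C"]
                ram = n_a * RAM_GB["A"] + n_b * RAM_GB["B"] + n_c * RAM_GB["C"]
                if ram > max_ram_gb:
                    continue
                if n_users is not None and n_a + n_b + n_c != n_users:
                    continue
                if exact_cores is not None and cores != exact_cores:
                    continue
                mixes.append((n_a, n_b, n_c))
    return mixes


rng = random.Random(0)  # seeded only to make the draw repeatable

# Question 1: exactly 50 users, within 600 cores and 3000GB RAM.
mixes_50_users = feasible_mixes(n_users=50)
print(rng.choice(mixes_50_users))

# Question 2: any number of users, but exactly 500 cores in use.
mixes_500_cores = feasible_mixes(exact_cores=500)
print(rng.choice(mixes_500_cores))
```

Enumeration is cheap here because there are only three job types and a few hundred cores. With many more job types the triple loop would not scale, and a sampling approach would probably be a better fit.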

Thoughts

After thinking about this for a bit, my initial thought is to use a constrained stochastic optimizer to generate Pareto-optimal solutions (non-violating combinations of workloads) and use those as my random set, unless someone has a better idea.
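
A lighter-weight alternative to the optimizer idea might be plain rejection sampling: draw a random split of the users across the three types and keep it only if it fits. This is a sketch of that swapped-in technique, not of the optimizer approach itself. The function name `sample_mix` and the `max_tries` cutoff are assumptions for illustration.

```python
import random

CORES = (4, 16, 32)    # cores per job for types A, B, C
RAM_GB = (8, 32, 128)  # RAM per job for types A, B, C


def sample_mix(n_users, max_cores=600, max_ram_gb=3000, rng=random, max_tries=10_000):
    """Draw random (n_A, n_B, n_C) splits of n_users until one fits the cluster."""
    for _ in range(max_tries):
        # Stars and bars: placing two dividers among n_users + 2 slots gives
        # every split of n_users into three non-negative parts equal probability.
        p1, p2 = sorted(rng.sample(range(n_users + 2), 2))
        counts = (p1, p2 - p1 - 1, n_users + 1 - p2)
        cores = sum(n * c for n, c in zip(counts, CORES))
        ram = sum(n * r for n, r in zip(counts, RAM_GB))
        if cores <= max_cores and ram <= max_ram_gb:
            return counts
    raise RuntimeError("no feasible mix found; the constraints may be too tight")


rng = random.Random(1)  # seeded only to make the draw repeatable
print(sample_mix(50, rng=rng))
```

Because each candidate split is equally likely and infeasible ones are simply discarded, the accepted mixes should be uniform over the feasible set. The cost is wasted draws when the constraints are tight, which is where an optimizer-based approach might start to pay off.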



