
0x374 Virtualization

1. Scheduler

1.2. slurm

how to use slurm

scontrol

scontrol update nodename=<nodename> state=resume
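The command above returns a node that was marked down or drained back to service. When this is scripted, it can help to build the argument list in one place and validate the node name before anything is executed. The following is a minimal sketch, not part of Slurm itself: the helper name `build_resume_command` and the node name `node01` are hypothetical, and the function only constructs the command rather than running it.

```python
import shlex


def build_resume_command(nodename):
    """Return the argv for `scontrol update ... state=resume` (does not run it)."""
    # Reject empty names and names with whitespace, which would split into
    # extra arguments if the command were later passed through a shell.
    if not nodename or any(ch.isspace() for ch in nodename):
        raise ValueError("nodename must be a non-empty string without whitespace")
    return ["scontrol", "update", f"nodename={nodename}", "state=resume"]


if __name__ == "__main__":
    # "node01" is a placeholder node name for illustration only.
    argv = build_resume_command("node01")
    print(shlex.join(argv))
```

On a host with Slurm installed, the returned list could be passed to `subprocess.run(argv, check=True)`; that step is left out here so the sketch runs anywhere.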

1.3. YARN

ResourceManager

  • keeps the metadata of jobs
  • runs on a different host from the HDFS NameNode

NodeManager

  • runs on each node, co-located with the HDFS DataNode
  • manages the YARN containers on its node (resource allocation is done by the ResourceManager)
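The split of responsibilities between the two daemons can be sketched with a toy model. This is not the YARN API: the class and method names mirror the roles described above, and the first-fit placement by memory is an assumption made for illustration (real YARN uses pluggable schedulers).

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Per-node agent: launches and tracks containers on its own node."""
    host: str
    total_mem_mb: int
    containers: dict = field(default_factory=dict)  # container_id -> mem_mb

    def free_mem_mb(self):
        return self.total_mem_mb - sum(self.containers.values())

    def launch(self, container_id, mem_mb):
        # The NodeManager only manages the container; it does not decide placement.
        self.containers[container_id] = mem_mb


@dataclass
class ResourceManager:
    """Cluster-wide service: keeps job metadata and decides allocations."""
    nodes: list
    jobs: dict = field(default_factory=dict)  # job_id -> [(container_id, host)]
    _next_id: int = 0

    def allocate(self, job_id, mem_mb):
        # First-fit over nodes (an illustrative assumption, not YARN's policy).
        for node in self.nodes:
            if node.free_mem_mb() >= mem_mb:
                self._next_id += 1
                container_id = f"container_{self._next_id}"
                node.launch(container_id, mem_mb)
                self.jobs.setdefault(job_id, []).append((container_id, node.host))
                return container_id, node.host
        return None  # no node has enough free memory


if __name__ == "__main__":
    rm = ResourceManager([NodeManager("a", 2048), NodeManager("b", 1024)])
    print(rm.allocate("job1", 1536))
    print(rm.jobs)
```

The point of the sketch is that job metadata and placement decisions live only in the `ResourceManager`, while each `NodeManager` knows only about the containers on its own host.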

Orchestration (Kubernetes)

BORG

Borg is a container-management system at Google, built to manage both long-running services and batch jobs.

Kubernetes

Inspired by Borg, Kubernetes was built to manage long-running processes and is designed to orchestrate multiple microservices.

Unlike HPC schedulers such as Slurm, which assume a fixed system size and an effectively infinite workload, cloud orchestration assumes:

  • "infinite" resources are available
  • workload is finite
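The two assumptions lead to different questions being asked of the scheduler. A minimal sketch of the contrast, using hypothetical helper names and a deliberately simplified model in which every job needs exactly one slot:

```python
import math


def fixed_cluster_backlog(pending_jobs, cluster_slots):
    """HPC view: capacity is fixed, so excess work waits in a queue.

    Returns (running, queued).
    """
    running = min(pending_jobs, cluster_slots)
    return running, pending_jobs - running


def elastic_nodes_needed(pending_jobs, slots_per_node):
    """Cloud view: the workload is bounded, so capacity is sized to fit it.

    Returns the number of nodes to provision.
    """
    return math.ceil(pending_jobs / slots_per_node)


if __name__ == "__main__":
    print(fixed_cluster_backlog(10, 4))   # 4 jobs run, 6 wait
    print(elastic_nodes_needed(10, 4))    # 3 nodes cover all 10 jobs
```

In the fixed-size case the scheduler decides which jobs wait; in the elastic case it decides how much capacity to request.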


Reference

Borg, Omega, and Kubernetes: Lessons learned from three container-management systems over a decade