0x374 Virtualization

1. Scheduler

How to use Slurm

scontrol

scontrol update nodename=<nodename> state=resume
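A node that Slurm has marked down or drained stays out of scheduling until it is resumed with the command above. A minimal Python sketch that wraps it, assuming you want to build the command from a script; the helper name `resume_node_command` and the node name `node001` are hypothetical, and the command is only built and printed here, not executed:

```python
import re


def resume_node_command(nodename: str) -> list[str]:
    """Build the argv for `scontrol update nodename=<nodename> state=resume`."""
    # Reject empty names or names containing whitespace so the
    # nodename=... argument stays a single token.
    if not nodename or re.search(r"\s", nodename):
        raise ValueError(f"invalid node name: {nodename!r}")
    return ["scontrol", "update", f"nodename={nodename}", "state=resume"]


if __name__ == "__main__":
    # Print the command instead of running it. On a host with Slurm
    # installed and admin rights, the list could be passed to subprocess.run.
    print(" ".join(resume_node_command("node001")))
```

Returning a list rather than a single string keeps the node name from being re-split by a shell if the command is later run through `subprocess.run`.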

ResourceManager

NodeManager

Borg is a container management system at Google, built to manage long-running services and batch jobs.

Inspired by Borg, Kubernetes was built to manage long-running processes and is designed to orchestrate multiple microservices.

Unlike HPC systems such as Slurm, which assume a fixed system size and an infinite workload, cloud orchestration assumes

A Kubernetes cluster consists of a control plane and one or more worker nodes:

kubernetes

The control plane has the following components:

API server (kube-apiserver): front end of the Kubernetes control plane
etcd: key-value store for cluster data
scheduler (kube-scheduler): selects a node for each newly created Pod
controller manager (kube-controller-manager): runs controllers such as the node controller and the job controller
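The scheduler's node selection works in two phases: filter out nodes that cannot run the Pod, then score the remaining ones and pick the best. A toy Python sketch of that idea follows. The `Node` and `Pod` classes, their fields, and the scoring rule (most free CPU left after placement) are assumptions made for illustration, not the real kube-scheduler plugins:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    free_cpu: float  # free CPU cores
    free_mem: int    # free memory in MiB


@dataclass
class Pod:
    name: str
    cpu: float  # requested CPU cores
    mem: int    # requested memory in MiB


def schedule(pod: Pod, nodes: list[Node]) -> Optional[str]:
    """Return the name of the chosen node, or None if no node fits."""
    # Filter: keep only nodes with enough free CPU and memory.
    feasible = [n for n in nodes
                if n.free_cpu >= pod.cpu and n.free_mem >= pod.mem]
    if not feasible:
        return None  # in a real cluster the Pod would stay Pending
    # Score: hypothetical rule, prefer the most CPU left after placement.
    best = max(feasible, key=lambda n: n.free_cpu - pod.cpu)
    return best.name


if __name__ == "__main__":
    nodes = [Node("node-a", 2.0, 4096), Node("node-b", 8.0, 16384)]
    print(schedule(Pod("web", 1.0, 1024), nodes))
```

The scheduler only decides the placement; starting the containers on the chosen node is left to the kubelet there.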

Each worker node has the following components:

kubelet: agent that runs on each node and makes sure the containers described in a Pod are running
kube-proxy: network proxy that maintains network rules on each node
container runtime: responsible for managing the execution and lifecycle of containers.
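The kubelet's job can be pictured as a reconciliation step: compare the containers a Pod asks for with what the container runtime reports, then ask the runtime to close the gap. A minimal Python sketch of that comparison follows. The function name `reconcile` and the use of plain container names are assumptions for illustration; the real kubelet talks to the runtime through the Container Runtime Interface (CRI) and also handles restarts, probes, and much more:

```python
def reconcile(desired: set[str], running: set[str]) -> tuple[list[str], list[str]]:
    """Return (containers to start, containers to stop), each sorted by name."""
    # Wanted by the Pod spec but not currently running.
    to_start = sorted(desired - running)
    # Running on the node but no longer wanted.
    to_stop = sorted(running - desired)
    return to_start, to_stop


if __name__ == "__main__":
    desired = {"web", "sidecar"}
    running = {"web", "old-job"}
    start, stop = reconcile(desired, running)
    print("start:", start)
    print("stop:", stop)
```

Running this comparison repeatedly, rather than once, is what lets a node converge back to the desired state after a container crashes.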