0x352 Distributed Applications

This page summarizes protocols, common tools and software for distributed computing.

Overview

Concensus

Paxos

Paxos Made Simple

Scheduler

slurm

how to use slurm

scontrol

scontrol update nodename=<nodename> state=resume

Distributed File Systems

NFS

NFS (Network File System) is an open-protocol developed by Sun (only protocol, no implementation details). It was one of the first usage of distributed systems. It enables easy sharing of files across clients and enable centralized administration.

NFS’s protocol is stateless (to achieve crash recovery), each call contains all the necessary information. The protocol usually contains the file handle to identify the file, the file handler can be thought as tuple of (file system identifier, file identifier), the second one can be inode.

To improve the performance, NFS caches the blocks in client memory, but it also introduces the cache consistency issue. To solve the issue, NFS

  • flush-on-close: when closing file, its contents are forced into the server
  • attribute-cache: check modification by using a local attribute-cache, timer is set to invalidate it.

AFS

  • cache strategy: files are cached in client local disk

HDFS

GFS

  • cache strategy: no cache in chunk server and client server
  • paper

IPFS

Distributed Database/Storage

Foundation

CAP Theorem

Reference: https://www.quora.com/What-Is-CAP-Theorem-1

Kafka (Streaming)

Kafka is a general publish/subscribe messaging system, it can publish message from different sources into multiple topics, and let subscriber receive those topics

MySQL/PostgreSQL

Scaling Strategies

  • denormalization: precompute join queries (to prevent multiple disk seek)
  • caching: add memcache/redis above database (to prevent querying database)
  • master/slave replication: master read/write, slaves read
  • master/master replication: hard to achieve absolute consistency
  • sharding: split database into different ranges

HBase (BigTable)

HBase (BigTable) is a sparse multi-dimension map

$$key(row, column, timestamp) \rightarrow value $$

Interface:

  • No query language, just CRUD API (very fast)
  • HBase shell
  • Java API (python, Scala wrapper)
  • Spark
  • builtin REST service

Architecture

Reference: https://mapr.com/blog/in-depth-look-hbase-architecture/

Data Model

  • fast access to a row with an unique key
  • each row has a small number of column family
  • a column family can have multiple columns (can be very large number, can be sparse)
  • each cell can have multiple versions indexed by timestamps

Applications

The following table is a row example from Bigtable paper, the anchor column family is used to compute page ranks.

Reference: The Ultimate Hands-on Hadoop Chapter 6

Cassandra (Dynamo)

Cassandra (dynamo) has no master node

  • High Availability, High Partition Tolerence
  • Low Consistency (Eventual Consistency)

Interface

  • CQL (something like limited SQL)

Architecture

Reference: https://mapr.com/blog/database-comparison-an-in-depth-look-at-mapr-db
  • Replication: Consistent Hashing
  • membership: Gossip-based protocol

mongoDB

  • Aggregation: aggregation pipline, mapreduce

Distributed Cache

Distributed Network

CDN (Content Delivery Network)

CDN is generally a concept of geographically-aware caching

Load Balancing

LB can server 1M queries per second

DNS side Network Load Balancing (L3/L4)

HTTP Load Balancing (L7)

Systems

This section collects some articles about some actual systems deployed globally.

Netflix

Cloud

This section is some tips and my personal measurement of several cloud vendors

Google Cloud

  • ping latency (same zone us-centrala): 1 ms

Bandwidth

my measurements with iperf.

  • same zone: 1.94 Gbits/sec
  • same region: 1.93Gbits/sec
  • us-europe: 236Mbits/sec
  • us-asia: 143Mbits/sec

AWS

Azure

Reference