MIT 6.824 Distributed Systems L1: Introduction

程序员文章站 2022-03-15 16:59:01

Why distributed systems?

parallelism
fault tolerance
physical separation
security / isolation

Challenges:

concurrency
partial failure
performance

Course structure

lectures
papers
exams
labs
project (optional)

Lab 1 - MapReduce
Lab 2 - Raft for fault tolerance
Lab 3 - K/V server
Lab 4 - sharded K/V service

Infrastructure

storage
communication
computation

The goal is abstractions that make the system look like a non-distributed one from the outside.

RPC, threads, concurrency control (locks, etc.)
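
As a minimal sketch of the "threads + locks" item above: several threads update one shared counter, and a lock serializes the read-modify-write. The names Counter and run_workers are hypothetical and chosen only for this illustration; RPC is not shown.

```python
import threading


class Counter:
    """A shared counter whose updates are serialized by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        # The read-modify-write is a critical section: the lock lets
        # only one thread execute it at a time.
        with self._lock:
            self.value += 1


def run_workers(num_threads, increments_per_thread):
    """Start num_threads threads that all increment the same counter."""
    counter = Counter()

    def worker():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker before reading the result
    return counter.value


total = run_workers(4, 1000)
print(total)  # 4000
```

Without the lock, the increments from different threads could interleave and some updates could be lost; that is the kind of concurrency bug the course labs have to guard against.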

Performance

Scalability - 2x computers -> 2x throughput
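
A small worked calculation of what "2x computers -> 2x throughput" means. The request rates and the shared limit below are made-up numbers, and the idea of a shared component that caps throughput is an assumption added for contrast, not something stated in these notes.

```python
def ideal_throughput(num_computers, requests_per_second_each):
    """Perfect scalability: throughput grows linearly with machine count."""
    return num_computers * requests_per_second_each


def capped_throughput(num_computers, requests_per_second_each, shared_limit):
    """Same as above, but a shared component (hypothetical) caps the total."""
    return min(ideal_throughput(num_computers, requests_per_second_each),
               shared_limit)


print(ideal_throughput(10, 100))         # 1000
print(ideal_throughput(20, 100))         # 2000, i.e. 2x computers -> 2x throughput
print(capped_throughput(20, 100, 1500))  # 1500, scaling stops at the shared limit
```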

Fault Tolerance

Availability
Recoverability
NV (non-volatile) storage
Replication
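
Recoverability with non-volatile storage can be sketched as an append-only log that is replayed after a restart. DurableKV, its JSON-lines log format, and demo are hypothetical names chosen for illustration; replication and torn (partially written) log records are not handled here.

```python
import json
import os
import tempfile


class DurableKV:
    """Toy K/V table that logs every Put to disk before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.table = {}
        self._recover()

    def _recover(self):
        # Replay the log from the start to rebuild the in-memory table.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self.table[record["k"]] = record["v"]

    def put(self, key, value):
        # Write to non-volatile storage first, then update memory.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"k": key, "v": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())  # ask the OS to push the data to disk
        self.table[key] = value

    def get(self, key):
        return self.table.get(key)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "kv.log")
        store = DurableKV(path)
        store.put("a", "1")
        store.put("a", "2")
        del store  # simulate a crash: the in-memory table is gone
        recovered = DurableKV(path)  # a restart replays the log
        return recovered.get("a")


print(demo())  # 2
```

The synchronous disk write on every Put is what makes the data survive a crash, and it is also why non-volatile storage tends to be expensive to update.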

Topic: consistency

Put(k,v)
Get(k) -> v

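Because a distributed system holds multiple copies of the table, the copies can be inconsistent with one another.

Strong consistency: a Get is guaranteed to return the most recently written value.

Weak consistency: a Get is not guaranteed to return the most recently written value.

Weak consistency is a less demanding requirement and is more common in practice.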
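
The difference can be sketched with a toy table that has two copies. Everything here is an assumption made for illustration: the class ReplicatedKV, the lazy propagation of writes, and the way get_strong forces a sync before answering. It runs in one process, with no network and no failures.

```python
class ReplicatedKV:
    """Toy K/V table kept in several copies (hypothetical design)."""

    def __init__(self, num_replicas=2):
        self.replicas = [{} for _ in range(num_replicas)]
        self.pending = []  # writes not yet applied to the other copies

    def put(self, key, value):
        # The write lands on copy 0 at once and reaches the others later.
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def sync(self):
        # Apply every pending write, in order, to the remaining copies.
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()

    def get_weak(self, key, replica_index):
        # Weak: answer from one copy, which may be stale.
        return self.replicas[replica_index].get(key)

    def get_strong(self, key):
        # Strong: bring all copies up to date first, then answer.
        self.sync()
        return self.replicas[0].get(key)


kv = ReplicatedKV()
kv.put("k", "v1")
kv.sync()
kv.put("k", "v2")
print(kv.get_weak("k", 1))  # v1 (stale: copy 1 has not seen the new write)
print(kv.get_strong("k"))   # v2
```

The strong read pays for its guarantee with extra work before it can answer, which is one reason weaker guarantees are attractive in practice.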

MapReduce was designed so that more programmers can use the framework without having to understand the details of the distributed implementation.

  
Abstract view of a MapReduce job
  input is (already) split into M files
  Input1 -> Map -> a,1 b,1
  Input2 -> Map ->     b,1
  Input3 -> Map -> a,1     c,1
                    |   |   |
                    |   |   -> Reduce -> c,1
                    |   -----> Reduce -> b,2
                    ---------> Reduce -> a,2
  MR calls Map() for each input file, produces set of k2,v2
    "intermediate" data
    each Map() call is a "task"
  MR gathers all intermediate v2's for a given k2,
    and passes each key + values to a Reduce call
  final output is set of <k2,v3> pairs from Reduce()s
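
The gathering step in the diagram above (collect every v2 for a given k2, then hand each key and its values to one Reduce call) can be sketched as follows. group_by_key is a hypothetical helper, the data is the a/b/c example from the diagram, and summing the values stands in for the Reduce call.

```python
from collections import defaultdict


def group_by_key(intermediate_pairs):
    """Collect all v2 values that share the same k2."""
    grouped = defaultdict(list)
    for key, value in intermediate_pairs:
        grouped[key].append(value)
    return dict(grouped)


# One list of (k2, v2) pairs per Map task, as in the diagram.
map_outputs = [
    [("a", 1), ("b", 1)],  # Input1
    [("b", 1)],            # Input2
    [("a", 1), ("c", 1)],  # Input3
]

all_pairs = [pair for output in map_outputs for pair in output]
grouped = group_by_key(all_pairs)

# Each key and its list of values goes to one Reduce call (here: a sum).
reduced = {key: sum(values) for key, values in grouped.items()}
print(reduced)  # {'a': 2, 'b': 2, 'c': 1}
```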

Example: word count
  input is thousands of text files
  Map(k, v)
    split v into words
    for each word w
      emit(w, "1")
  Reduce(k, v)
    emit(len(v))
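
A runnable version of the word-count pseudocode, as a single-process sketch. The driver run_sequential, the file names, and the file contents are assumptions for illustration; a real MapReduce runs the Map and Reduce tasks on many machines.

```python
from collections import defaultdict


def map_fn(filename, contents):
    """Map: split the text into words and emit (word, "1") for each."""
    return [(word, "1") for word in contents.split()]


def reduce_fn(key, values):
    """Reduce: the count of a word is the number of "1"s collected for it."""
    return str(len(values))


def run_sequential(inputs, map_fn, reduce_fn):
    """Hypothetical driver: run every Map, group by key, run every Reduce."""
    intermediate = defaultdict(list)
    for filename, contents in inputs.items():
        for key, value in map_fn(filename, contents):
            intermediate[key].append(value)
    return {key: reduce_fn(key, values)
            for key, values in sorted(intermediate.items())}


inputs = {"f1.txt": "a b", "f2.txt": "b", "f3.txt": "a c"}
counts = run_sequential(inputs, map_fn, reduce_fn)
print(counts)  # {'a': '2', 'b': '2', 'c': '1'}
```

The programmer writes only map_fn and reduce_fn; everything in run_sequential is what the framework is meant to hide.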
