L7 Multicore 2
Computer Architecture
Winter 2015
[Figure: Interconnection Network, Memory, I/O]
Multiprocessors and You
Only path to performance is parallelism
Clock rates flat or declining
SIMD: 2X width every 3-4 years
- 128b wide now, 256b in 2011, 512b in 2014?, 1024b in 2018?
- Advanced Vector Extensions are 256-bits wide!
MIMD: Add 2 cores every 2 years: 2, 4, 6, 8, 10, …
Example: Sum Reduction
Second Phase:
After each processor has
computed its “local” sum
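half = 100;
repeat
  synch();
  /* Proc 0 sums extra element if there is one */
  if (half%2 != 0 && Pn == 0)
    sum[0] = sum[0] + sum[half-1];
  half = half/2; /* dividing line on who sums */
  if (Pn < half) sum[Pn] = sum[Pn] + sum[Pn+half];
until (half == 1);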
sum[P0] sum[P1] sum[P2] sum[P3] sum[P4] sum[P5] sum[P6] sum[P7] sum[P8] sum[P9]
P0 P1 P2 P3 P4 P5 P6 P7 P8 P9    half = 10
P0 P1 P2 P3 P4                   half = 5
P0 P1                            half = 2
P0                               half = 1
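The halving pattern in the diagram can be traced with a small sequential C sketch. This is a simulation under stated assumptions, not the slide's parallel code: the function name `tree_reduce` is hypothetical, the inner `for` loop over `pn` stands in for the processors that would run in parallel, and the comment marks where `synch()` would sit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: simulates the second phase of the sum reduction.
 * sum[0..nproc-1] holds each processor's local sum; on return, sum[0]
 * holds the total. The loop over pn stands in for processors that would
 * run in parallel between synch() barriers. */
long tree_reduce(long sum[], size_t nproc)
{
    assert(nproc >= 1);
    size_t half = nproc;
    while (half > 1) {
        /* synch() would go here: every processor has finished the last step */
        if (half % 2 != 0)                    /* odd count: Proc 0 takes the extra element */
            sum[0] = sum[0] + sum[half - 1];
        half = half / 2;                      /* dividing line on who sums */
        for (size_t pn = 0; pn < half; pn++)  /* "in parallel" for every Pn < half */
            sum[pn] = sum[pn] + sum[pn + half];
    }
    return sum[0];
}
```

With ten local sums 1 through 10, the steps use half = 10, 5, 2, 1 as in the diagram, and `sum[0]` ends up holding 55.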
Threads
Memory Model for Multi-threading
Process
Multithreading vs. Multicore
Data Races and Synchronization
Two memory accesses form a data race if they are from different
threads, to the same location, at least one is a write, and
they occur one after another
If there is a data race, the result of the program can vary
depending on chance (which thread ran first?)
Avoid data races by synchronizing writing and reading
to get deterministic behavior
Synchronization is done by user-level routines that rely on
hardware synchronization instructions
Question: Consider the following code
when executed concurrently by two threads.
What possible values can result in *($s0)?
# *($s0) = 100
lw $t0,0($s0)
addi $t0,$t0,1
sw $t0,0($s0)
☐ 101 or 102
☐ 100, 101, or 102
☐ 100 or 101
☐
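One way to reason about the question is to enumerate every interleaving of the two threads' three instructions. The C sketch below does that under simplifying assumptions: each instruction executes atomically, memory is sequentially consistent, and the names `Thread`, `step`, `explore`, and `race_outcomes` are hypothetical helpers, not part of the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the quiz code: each thread runs three steps
 * (lw, addi, sw) on a private register t0 and one shared memory word. */
enum { STEPS = 3 };

typedef struct { int pc; int t0; } Thread;

static void step(Thread *t, int *mem)
{
    switch (t->pc) {
    case 0: t->t0 = *mem;      break;   /* lw   $t0,0($s0) */
    case 1: t->t0 = t->t0 + 1; break;   /* addi $t0,$t0,1  */
    case 2: *mem = t->t0;      break;   /* sw   $t0,0($s0) */
    }
    t->pc++;
}

/* Recursively tries every interleaving; seen[v - 100] is set for each
 * final memory value v that some interleaving produces. */
static void explore(Thread a, Thread b, int mem, bool seen[3])
{
    if (a.pc == STEPS && b.pc == STEPS) {
        assert(mem >= 100 && mem <= 102);
        seen[mem - 100] = true;
        return;
    }
    if (a.pc < STEPS) {                 /* let thread A run its next step */
        Thread a2 = a; int m = mem;
        step(&a2, &m);
        explore(a2, b, m, seen);
    }
    if (b.pc < STEPS) {                 /* let thread B run its next step */
        Thread b2 = b; int m = mem;
        step(&b2, &m);
        explore(a, b2, m, seen);
    }
}

/* Fills seen[0..2] for final values 100, 101, 102, starting from *($s0) = 100. */
void race_outcomes(bool seen[3])
{
    Thread a = {0, 0}, b = {0, 0};
    seen[0] = seen[1] = seen[2] = false;
    explore(a, b, 100, seen);
}
```

Under this model only 101 and 102 are reachable: 102 when one thread's `sw` completes before the other's `lw`, and 101 when both threads load 100 before either stores. The value 100 cannot survive because every store writes at least 101.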
Lock and Unlock Synchronization
Lock used to create region (critical section) where only
one thread can operate
Given shared memory, use memory location as
synchronization point: lock, semaphore or mutex
Thread reads lock to see if it must wait, or OK to go into
critical section (and set to locked)
0 => lock is free / open / unlocked / lock off
1 => lock is set / closed / locked / lock on
[Figure: Set the lock -> Critical section (only one thread gets to execute this section of code at a time), e.g., change shared variables -> Unset the lock]
Possible Lock Implementation
Unlock
Unlock:
        sw $zero,0($s0)        # store 0 to the lock word: lock is free again
Possible Lock Problem
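Time   Thread 1                     Thread 2
  |    addiu $t1,$zero,1
  |    Loop: lw $t0,0($s0)
  |                                 addiu $t1,$zero,1
  |                                 Loop: lw $t0,0($s0)
  |    bne $t0,$zero,Loop
  |                                 bne $t0,$zero,Loop
  |    Lock: sw $t1,0($s0)
  v                                 Lock: sw $t1,0($s0)
Both threads load 0 from the lock before either stores 1, so both fall
through the bne and both enter the critical section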
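The failure of the load, branch, store sequence can be replayed with a small C sketch. It is a model under stated assumptions: the names `Thread`, `step`, and `holders_after` are hypothetical, the `addiu` that loads the constant 1 is folded into the store, and a schedule string such as "121212" says which thread executes its next instruction at each time step.

```c
#include <assert.h>

/* pc: 0 = lw, 1 = bne, 2 = sw, 3 = inside the critical section */
typedef struct { int pc; int t0; } Thread;

/* Executes one instruction of the naive lock sequence for thread t. */
static void step(Thread *t, int *lock)
{
    switch (t->pc) {
    case 0: t->t0 = *lock; t->pc = 1;        break;  /* Loop: lw  $t0,0($s0)     */
    case 1: t->pc = (t->t0 != 0) ? 0 : 2;    break;  /*       bne $t0,$zero,Loop */
    case 2: *lock = 1; t->pc = 3;            break;  /* Lock: sw  $t1,0($s0)     */
    default:                                 break;  /* already holds the lock   */
    }
}

/* Replays a schedule of '1' and '2' characters (which thread runs next)
 * and returns how many threads believe they hold the lock afterwards. */
int holders_after(const char *schedule)
{
    Thread t1 = {0, 0}, t2 = {0, 0};
    int lock = 0;                                    /* 0 => free, 1 => locked */
    for (const char *p = schedule; *p != '\0'; p++) {
        assert(*p == '1' || *p == '2');
        step(*p == '1' ? &t1 : &t2, &lock);
    }
    return (t1.pc == 3) + (t2.pc == 3);
}
```

In this model, `holders_after("121212")` follows the alternating schedule shown above and returns 2, while the serial schedule `holders_after("111222")` returns 1 because the second thread sees the lock already set and keeps spinning.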
Synchronization in MIPS
17
Synchronization in MIPS Example
18
Test-and-Set
        # critical section
        sw $zero,0($s1)        # unlock: store 0 to the lock word
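As a rough illustration of the test-and-set idea, the sketch below uses C11 `atomic_exchange` to read the old lock value and write 1 in a single indivisible step. This is an analogy, not the slide's MIPS code: the `SpinLock` type and the `spin_try_lock`, `spin_lock`, and `spin_unlock` names are hypothetical, and how the hardware makes the exchange atomic is left to the compiler and the machine.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_int locked; } SpinLock;   /* 0 => free, 1 => locked */

/* One test-and-set attempt: atomically writes 1 and returns true only if
 * the old value was 0, i.e. this caller is the one that took the lock. */
bool spin_try_lock(SpinLock *l)
{
    return atomic_exchange(&l->locked, 1) == 0;
}

/* Busy-waits until a test-and-set attempt succeeds. */
void spin_lock(SpinLock *l)
{
    while (!spin_try_lock(l)) {
        /* spin: another thread is in the critical section */
    }
}

/* Releases the lock; plays the same role as sw $zero,0($s1). */
void spin_unlock(SpinLock *l)
{
    assert(atomic_load(&l->locked) == 1);   /* unlocking a free lock is a bug */
    atomic_store(&l->locked, 0);
}
```

Because the read and the write happen as one atomic operation, two threads cannot both observe 0 and both proceed, which is the gap in the plain load, branch, store sequence. A typical use would be `spin_lock(&l);`, then the critical section, then `spin_unlock(&l);` with `SpinLock l = {0};` as the initial state.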
Summary