
Parallel Implementation of John Conway’s Game of Life

Behtsoudis Anestis - [email protected]

Abstract:

In the present work, parallel implementations of John Conway's Game of Life cellular automaton were developed for the shared-memory model. The code for the two parallel versions was written in C, using the POSIX Threads (Pthreads) and OpenMP APIs respectively. In order to evaluate the performance and the improvement of the implementations, execution-time measurements were taken on two systems: a personal laptop and the system provided to us by HPCLab (Xanthos). In the first part we present how we approached the parallelization of the cellular automaton and give the technical details of the two implementations. In the second part we present the measurements taken on the two systems and comment on the performance with the help of some auxiliary graphs.

Game Of Life Rules:

In the two-dimensional version of the cellular automaton we have an MxN array, where each position in the array is a cell. Each cell has 2 possible states, "dead" or "alive". During the initialization phase each cell is given an initial state and execution begins. In each round every cell checks the state of its 8 neighbors and, based on a set of rules, determines the state it will move to. The rules are:

• Every living cell with fewer than 2 living neighbors dies.
• Every living cell with more than 3 living neighbors dies.
• Every living cell with 2 or 3 living neighbors stays alive.
• Every dead cell with exactly 3 living neighbors comes to life.
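
For illustration, a minimal sketch in C of how these rules translate into a next-state decision for a single cell; the function name next_state and the convention that 1 means "alive" and 0 means "dead" are assumptions of this sketch, not taken from the accompanying sources.

/* Sketch: next state of one cell, given its current state and the
 * number of its living neighbors (1 = alive, 0 = dead). */
int next_state(int alive, int living_neighbors)
{
    if (alive) {
        /* A living cell survives only with 2 or 3 living neighbors. */
        return (living_neighbors == 2 || living_neighbors == 3);
    }
    /* A dead cell comes to life with exactly 3 living neighbors. */
    return (living_neighbors == 3);
}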

Part 1 - Implementation Details

1. Parallel implementation of Game of Life

• Separation of Parallel Areas

Game Of Life is an inherently data-parallel problem, where the next state of each element depends on the current state of its neighboring elements. So we have a table of dependent elements and processing that needs to be done on them. In the multi-threaded shared-memory model things are relatively simple: we have at our disposal a number of threads running pseudo-parallel, to each of which we can assign the processing of a part of the table. The problem is how to optimally partition the table. A first reasonable thought would be to simply divide the original table into equal sections, or as close to equal as possible when the dimensions do not divide exactly. Schematically, something like this:

The above separation at first glance seems quite correct and optimal, since it aims at an even distribution of work among the processing threads, which should give the best possible result. But a closer, more technical look shows that this is not the case.
We know that in all modern computer systems main memory is implemented with SDRAM technology. From the SDRAM implementation we know that it is organized in banks (usually 4), where access to each bank is given by providing a row and a column address. The bank selection comes from separate pins on the memory chip, while the row and column selection share the same pins. That is, we have a multiplexed address where the row address is sent first and then the column address. In a burst mode, where we want to read many memory locations in succession, it is advantageous to have them distributed along columns, because that way we have the smallest possible overhead. For more details refer to a related article on the implementation and access of SDRAM memories. So if we lay out the table by columns and divide the sections by rows, so that each section is accessed column-wise, we will have the best possible performance for the shared-memory model of the threads. The separation is then done as in the following image:

The above way of separating the parallel sections is followed in both the PThreads and the OpenMP implementation. The boundaries are defined as follows: as we know, each thread has a unique integer ID, with numbering starting from 0. So the part of the table that each thread processes results from the following relations:

Start = ID * height / NThreads     (top of the section)
Finish = Start + height / NThreads     (end of the section)

where NThreads is the number of threads and height is the height (number of rows) of the table.
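
In code, each thread could compute its boundaries roughly as follows; the variable names follow the relations above, while the handling of a non-exact division by giving the remaining rows to the last thread is an assumption of this sketch.

/* Sketch: each thread derives the rows of its section from its ID. */
void compute_bounds(int id, int height, int nthreads,
                    int *start, int *finish)
{
    *start  = id * height / nthreads;       /* top of the section */
    *finish = *start + height / nthreads;   /* end of the section */

    /* When height does not divide exactly, the last thread can
     * absorb the remaining rows (assumption of this sketch). */
    if (id == nthreads - 1)
        *finish = height;
}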

Finally, because we do not want the outline of the table (the peripheral cells) to participate in the game (their state remains the same as the initial one for the entire duration of the execution), we make the corresponding checks in our code (the points are obvious from the comments in the source files).

• Synchronization

The threads are scheduled by the operating system according to the policy it implements, and we cannot know the order in which the threads will be executed. For this reason we must ensure that the threads are synchronized, since some of the data managed by each thread, specifically the elements adjacent to the elements of the neighboring thread, create data dependencies between threads. To achieve this synchronization we use the barrier functions contained in the <pthread.h> library. PThread barriers work with the following logic: when a thread calls the pthread_barrier_wait() function, it suspends its execution until the number of threads that have also called pthread_barrier_wait() reaches the limit set during the initialization of the barrier. So it is reasonable to place a barrier at the end of each round, to ensure that no thread proceeds to the next round before all the threads have finished the current one. This way we avoid inconsistencies in our data and therefore have valid computations during processing.
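
A minimal sketch of this barrier usage, assuming a single global barrier that main() initializes with the total number of threads as its limit:

#include <pthread.h>

pthread_barrier_t barrier;   /* shared by all worker threads */

/* In main(), before creating the threads: the limit equals the
 * total number of threads that must reach the barrier. */
void setup_barrier(int nthreads)
{
    pthread_barrier_init(&barrier, NULL, nthreads);
}

/* Inside each thread, at the end of a round: block until every
 * thread has also called pthread_barrier_wait(). */
void end_of_round(void)
{
    pthread_barrier_wait(&barrier);
}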

2. Implementation Details

Our implementation is fully dynamic, both in terms of the table dimensions and the number of threads, while at the same time there are some default values (80x25 and 1 thread) if no values are set by the user. So in main, once we reach the point where the table dimensions are known, we allocate memory (malloc()) for 2 two-dimensional tables. We need 2 tables so that one contains the current state and the other stores the new state produced for each cell in each round. At the end of each round it would then be enough to copy the contents of the second table (next state) into the first (current state).
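
A minimal sketch of the allocation of one such two-dimensional table with malloc(); the helper name allocate_board is an assumption of this sketch, and the same allocation would be performed twice, once for each table.

#include <stdlib.h>

/* Sketch: allocate a height x width table of int cells. */
int **allocate_board(int height, int width)
{
    int **board = malloc(height * sizeof(int *));
    for (int i = 0; i < height; i++)
        board[i] = malloc(width * sizeof(int));
    return board;
}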

To avoid all this unnecessary reading and writing, we keep 2 pointers (current and next) that initially point to the 2 tables as we defined them, and at the end of each round we swap them in order to have the new values. This way we avoid the large overhead of accessing and copying so much data.
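
The swap at the end of a round can be sketched as follows; the helper name swap_boards is an assumption of this sketch.

int **current;   /* state read during the round    */
int **next;      /* state written during the round */

/* At the end of each round: instead of copying next into current,
 * simply exchange the two pointers. */
void swap_boards(void)
{
    int **tmp = current;
    current   = next;
    next      = tmp;
}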

• Operation Modes (Play – Bench) (Default – Non-Default)

Regarding the execution of Game Of Life, we have defined 2 modes for the overall execution and 2 modes for the type of table dimensions. For the overall execution we have the play and the bench mode. In play mode, the current state of the game and some accompanying information are displayed in the terminal. For a visual representation of the execution to make sense, the table dimensions must be within some reasonable limits; we set these limits to 100x50, so for larger table sizes the play mode is not allowed. In bench mode the execution time is measured using the gettimeofday() function. In this mode the state of the table is not printed; instead, statistics on the execution are presented (dimensions, number of threads, execution time in us).

Regarding the table dimensions, we have the default and the non-default mode. In default mode the dimensions are 80x25 and the initial state is given by an input file of the same size. We provide 2 indicative input files, Shapes and Random, where the first contains some recurring patterns and the second a random initial state. In non-default mode the user can specify the dimensions of the table, and the initial state is configured randomly using the rand() function.

To select the modes there are 2 flag variables, bench_flag and dflag, for the two mode choices mentioned respectively.
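
In bench mode the time measurement with gettimeofday() can be sketched roughly as follows; the wrapper function bench_run is an assumption of this sketch, not the structure of the actual sources.

#include <stdio.h>
#include <sys/time.h>

/* Sketch: measure the elapsed time of a run in microseconds. */
void bench_run(void (*run_game)(void))
{
    struct timeval t_start, t_end;

    gettimeofday(&t_start, NULL);
    run_game();                      /* the 100 rounds of the game */
    gettimeofday(&t_end, NULL);

    long elapsed_us = (t_end.tv_sec  - t_start.tv_sec) * 1000000L
                    + (t_end.tv_usec - t_start.tv_usec);
    printf("execution time: %ld us\n", elapsed_us);
}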

• Common Functions in PThread and OpenMP

Let us now look at some basic functions that have been implemented and are common to both implementations; later we will see how the two codes differ.

The functions are listed below with their arguments, return type and operation:

• initialize_board(int **current, int dflag) – void: If the default flag is 1, initializes the table with 0, otherwise with random 0/1 values.
• read_file(int **current, char *filename) – void: Reads the initial state from a file.
• copy_region(int **current, int **next) – void: Copies the periphery into the second table as well.
• adjacent_to(int **current, int i, int j) – int: Returns the number of living neighbors of the cell of current at position (i, j).
• play(int **current, int **next, int start, int finish) – void: Applies the rules of the game and updates the values in the next-state table (pointer next).
• print(int **current) – void: Prints the state of the current table.
• arg_check(int argc, char *argv[]) – int: Checks the parameters with which the executable was called. Returns 0 on a successful check, otherwise a non-zero int.
• print_help(void) – void: Shows the possible call parameters and their interpretation.
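
As an illustration, a sketch of how adjacent_to() could count the living neighbors of the cell at (i, j); the signature matches the list above, but the body is an assumption of this sketch rather than the actual source.

/* Sketch: count the living neighbors of the cell at (i, j).
 * Peripheral cells are never passed in, so all eight neighbors exist. */
int adjacent_to(int **current, int i, int j)
{
    int count = 0;

    for (int di = -1; di <= 1; di++)
        for (int dj = -1; dj <= 1; dj++) {
            if (di == 0 && dj == 0)
                continue;                    /* skip the cell itself */
            count += current[i + di][j + dj];
        }

    return count;
}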

2.1 PThreads Implementation



In main, once the number of threads is known (default = 1 or user-specified), the barrier is first initialized with a limit equal to the total number of threads. This is because we want all the threads that will be assigned a part of the table to participate in the synchronization mentioned above. Then the threads are created with entry_function() as their entry function. At the beginning of the entry function, the boundaries of the section that each thread will process are determined in the way described above. Then we have a main loop of 100 repetitions, which corresponds to the 100 rounds of Game Of Life.

At the beginning of each round the main play function is executed, with arguments the pointers to the 2 tables (current & next) and the limits of the section of the table to which the rules of the game will be applied. play() therefore applies the rules and updates the values in the table pointed to by the next pointer. Inside play there is the appropriate check so that the peripheral elements remain unaffected. After play is executed, a pthread_barrier_wait() follows, so that all the threads have applied the rules and updated the values in the next-state table. Then the thread with ID == 0 swaps the pointers, so that current in the next round points to the next of the current round, and therefore the updated values of the current round become the current values in the next round. The thread with ID == 0 also prints to the console if we are in play mode, in order to present the state of the table for the current round. After the swap of the pointers, another barrier follows so that we can ensure that no thread proceeds to the next round before the exchange of the pointers has been completed.

At the end of the 100 rounds the thread finishes its processing and returns.
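
Putting the above together, the worker thread could look roughly like the following sketch; the names entry_function, play, print and the two barriers follow the description above, while details such as passing the ID through the thread argument and the play_mode flag are assumptions of this sketch.

#include <pthread.h>

#define ROUNDS 100

extern int **current, **next;        /* the two state tables        */
extern int height, nthreads;         /* table height, thread count  */
extern int play_mode;                /* 1 if we are in play mode    */
extern pthread_barrier_t barrier;    /* initialized in main()       */

void play(int **cur, int **nxt, int start, int finish);
void print(int **cur);

void *entry_function(void *arg)
{
    int id     = *(int *)arg;                 /* unique thread ID   */
    int start  = id * height / nthreads;      /* top of its section */
    int finish = start + height / nthreads;   /* end of its section */

    for (int round = 0; round < ROUNDS; round++) {
        /* Apply the rules to this thread's section of the table. */
        play(current, next, start, finish);

        /* Wait until every thread has updated the next-state table. */
        pthread_barrier_wait(&barrier);

        if (id == 0) {
            /* Only thread 0 swaps the pointers and prints in play mode. */
            int **tmp = current;
            current = next;
            next = tmp;
            if (play_mode)
                print(current);
        }

        /* Make sure the swap has completed before the next round. */
        pthread_barrier_wait(&barrier);
    }
    return NULL;
}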

2.2 OpenMP Implementation

In the OpenMP implementation the differences are not so big. In main, when we are done with the checks and the initializations, we define a parallel region with the appropriate OpenMP directive. We define as shared variables the pointers to the 2 tables as well as the number of threads, and as private variables the thread ID, the processing limits and an auxiliary variable. Also, through the num_threads() clause we define with how many threads we want to execute the parallel region.

The philosophy is then the same as that of the PThreads version. Each thread sets its limits based on its ID (as mentioned above) and starts the main loop of 100 rounds. In each round, after the execution of play, we have a barrier (the corresponding directive) for synchronization. Here too the thread with ID == 0 swaps the pointers and displays the state of the table if we are in play mode. At the end of the round another barrier follows, here too to make sure that the swap has been done before we move to the next round. After the end of the 100 rounds we close the parallel region and display the corresponding messages before the program terminates.
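
A corresponding sketch of the OpenMP parallel region; the shared/private clauses follow the description above, while the exact variable names and helper functions are assumptions of this sketch.

#include <omp.h>

#define ROUNDS 100

extern int **current, **next;   /* the two state tables     */
extern int height;              /* number of rows           */
extern int play_mode;           /* 1 if we are in play mode */

void play(int **cur, int **nxt, int start, int finish);
void print(int **cur);

void game_of_life_omp(int nthreads)
{
    int id, start, finish, round;
    int **tmp;                  /* auxiliary variable for the swap */

    #pragma omp parallel num_threads(nthreads) default(none) \
            shared(current, next, height, nthreads, play_mode) \
            private(id, start, finish, round, tmp)
    {
        id     = omp_get_thread_num();
        start  = id * height / nthreads;
        finish = start + height / nthreads;

        for (round = 0; round < ROUNDS; round++) {
            play(current, next, start, finish);

            #pragma omp barrier           /* all sections computed     */

            if (id == 0) {                /* thread 0 swaps and prints */
                tmp = current;
                current = next;
                next = tmp;
                if (play_mode)
                    print(current);
            }

            #pragma omp barrier           /* swap visible to everyone  */
        }
    }
}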

Part 2 - Measurements

1. Execution Systems

As mentioned above, the measurements were performed on 2 systems with different architectures, in order to get a more complete picture of the behavior of the parallel versions. First, let's look at the basic characteristics of the 2 systems.

Characteristic      My System                  Xanthos
CPU Model           Intel Core 2 Duo T6400     AMD Opteron(tm) Processor 2210 HE
Frequency           2.00 GHz                   1.80 GHz
#Cores              2                          4
L2 Cache / Core     2048 KB                    1024 KB
#Threads            2                          4
Main Memory         3.00 GB                    3473 MB
Swap                20 GB                      20 GB

2. Results - Commentary

Detailed tables can be found in the accompanying Excel file. Here we will list only the parts necessary for the commentary we want to make.

First, let's look at some measurements for the PThreads implementation for various table sizes:

PThreads 80x25 – Random Input


Nthreads    My System Time (us)    Xanthos Time (us)    Diff My-Xanthos (us)
1           21348                  60926                -39578
2           14379                  40243                -25864
3           18402                  31597                -13195
4           18757                  43582                -24825
5           22359                  42599                -20240
6           22856                  42963                -20107
7           30019                  51860                -21841
8           30170                  55658                -25488

PThreads 200x50 – Random Input


Nthreads    My System Time (us)    Xanthos Time (us)    Diff My-Xanthos (us)
1           109006                 319621               -210615
2           59771                  111583               -51812
3           77924                  91070                -13146
4           67249                  128418               -61169
5           77135                  103729               -26594
6           69014                  101019               -32005
7           76460                  104758               -28298
8           74866                  116670               -41804

PThreads 2000x500 – Random Input


Nthreads    My System Time (us)    Xanthos Time (us)    Diff My-Xanthos (us)
1           10477616               19892058             -9414442
2           5397030                10298023             -4900993
3           5698842                6975674              -1276832
4           5670513                5481174              189339
5           5637657                7562911              -1925254
6           5545523                7050784              -1505261
7           5636890                6422875              -785985
8           5615205                5900633              -285428

PThreads 2000x2000 – Random Input


Nthreads    My System Time (us)    Xanthos Time (us)    Diff My-Xanthos (us)
1           41790751               78177419             -36386668
2           21575731               39150984             -17575253
3           22081852               26587961             -4506109
4           21830961               20241626             1589335
5           21619917               27239080             -5619163
6           21454325               23992932             -2538607
7           21558108               23212022             -1653914
8           21597983               20926997             670986

We observe that for small table sizes there is a limit to the improvement as we increase the number of threads. Beyond this limit the execution time increases instead of decreasing. This makes sense, because the overhead of switching the threads on the cores is an order of magnitude greater than the execution time of each thread's own piece of work. Also, for small table sizes we see that the improvement over the serial execution is minimal. In contrast, for large table sizes the improvement is significantly greater as the number of threads increases. We also notice that there is an upper limit to the improvement we can achieve: beyond that limit, no matter how much we increase the threads, the execution time remains the same or even grows, as we see for the small table sizes.

It is characteristic that we obtain the optimal time in both systems when the number of threads equals the number that can run simultaneously on each system (2 for ours and 4 for Xanthos respectively).

The above concerned the PThreads implementation; let's now look at the corresponding graphs for the OpenMP implementation. The tables with the numbers were not included in the report, but can be found in the accompanying Excel file.

Similar observations to those for PThreads can be made for OpenMP. We see that for small table sizes there is an improvement limit as we increase the number of threads; beyond this limit, if we increase the number of threads, the execution time increases instead of decreasing, for the reason mentioned above. Conversely, for large table sizes, after a certain number of threads the execution time tends to become constant as we increase the number of threads. Of course, here too, if we exceed a reasonable number of threads, the overhead of the switching will again be very large and the total execution time will increase. We also notice that in the PThreads implementation our 2-core system always presented better performance, while for large sizes the execution times of the 2 systems converge. In contrast, in the OpenMP implementation for large sizes the superiority of the Xanthos system is evident, which leads us to the conclusion that it makes better use of the Xanthos architecture.

In the next graph we see the efficiency of the PThreads implementation against the OpenMP one for our system:

So we see that the PThreads implementation performs better than OpenMP, which is completely justified since OpenMP is a higher-level API for implementing parallel applications, compared to Pthreads which is a lower-level approach. Of course, it is worth noting here the platform independence of OpenMP in relation to Posix Threads.

Let's now look at some measurements in terms of table sizes. We take 2 fixed values for the height, 50 and 1000, and for these we take measurements for different width values. We first observe that as the table sizes increase, the execution time increases linearly.

Looking more closely at the graph data, we see that for a large height (rows) we have a much steeper increase compared to small height values. This confirms what we said about the smaller overhead of transferring data from memory when it is accessed by columns. For this reason it is more efficient for the large dimension of the table to always be the width (columns) and the small one the height (rows).

Annex:
Accompanying files:
• gol_threads.c – Source file of the Pthreads implementation
• gol_openmp.c – Source file of the OpenMP implementation
• Makefile – Makefile to compile the sources
• Random – Random input board state file
• Shapes – Pattern input board state file
• Measures.xls – Excel sheet with the measurements
• Readme – Short txt file with details of the executable parameters
• GOLReport.pdf – Project report
