Research Methods Course
TECHNOLOGY, KUMASI
Publisher’s Information
© IDL, 2009
All rights reserved. No part of this book may be reproduced or utilized in any form or by any
means, electronic or mechanical, including photocopying, recording or by any information
storage and retrieval system, without the permission from the copyright holders.
Dean
Institute of Distance Learning
New Library Building
Kwame Nkrumah University of Science and Technology
Kumasi, Ghana
Phone: +233-51-60013
+233-51-61287
+233-51-60023
Fax: +233-51-60014
E-mail: [email protected]
[email protected]
[email protected]
[email protected]
Web: www.idl-knust.edu.gh
www.kvcit.org
ISBN:
Editors:
1. Icons: The following icons are used to give readers quick access to where similar
information may be found in the text of this course material. Writers may use them as and when
necessary in their writing. Facilitators and learners should take note of them.
Time For Activity
Self Assessment
Group Discussion
Read
New Terms
Answer Tips Note/Learning Tip
Pause
Online
Interactive CD
This course material is also available online at the virtual classroom (v-classroom) Learning
Management System. You may access it at www.kvcit.org
Course Writers
Abeeku BREW-HAMMOND
Associate Professor of Mechanical Engineering
Director of Energy Centre at KNUST
College of Engineering, KNUST
Owusu Amponsah
Lecturer
Department of Planning
College of Architecture and Planning, KNUST
Acknowledgement
The authors are indebted to Dr Gabriel Takyi, Lecturer in the Department of Mechanical
Engineering, for managing the whole of the MSc RETS e-Learning programme, including the
course materials development process.
Thanks also go to Mr Ebenezer Nyarko Kumi for invaluable assistance to Prof Abeeku Brew-
Hammond in the writing of the second half of this document.
Course Introduction
This course forms part of the Master of Science Degree Programme in Renewable Energy
Technologies via E-Learning. It is a 3 credit-hour course with 2 hours of teaching and 2 hours
tutorial per week. The programme is hosted by the Department of Mechanical Engineering
under the auspices of The Energy Center, KNUST.
COURSE OVERVIEW
Research methods in engineering and the physical sciences: design of experiments,
instrumentation, data acquisition and analysis, error analysis, mathematical modelling and
computer simulation, statistical analysis, interpretation and presentation of experimental results
and simulations. Research methodology in the social sciences: qualitative and quantitative
research, design of surveys and questionnaires, case study design, sampling and interview
techniques, analytical techniques (analysis of variance, analytic generalisation, etc.). Preparation
of research proposals, including thesis research design; reporting and publication of findings
(thesis writing, preparation of conference papers and journal articles, posters, etc.); critical
reviews of journal papers and other publications; oral presentations using PowerPoint; software
applications for data analysis (SPSS, STATA, etc.).
COURSE OBJECTIVES
By the end of the course the student should be able to do the following:
COURSE OUTLINE
Unit 1: Introduction to Research Proposals and Thesis Synopsis
GRADING
Continuous assessment: 30%
End of semester examination: 70%
RESOURCES
You will require a basic knowledge of engineering science and mathematics as well as access to
the internet and a computer for this course.
READING LIST
1. Journal Articles, Recommended Textbooks, etc.
Annabel, B.K. (2006). Using interviews as research instruments, Language Institute
Chulalongkorn University publications.
Beavon, J. R. (2009). The origins of experimental error. Retrieved August 5, 2010, from
http://home.clara.net/rod.beavon/err_orig.htm
Becker, H. S. and Pamela, R. (Eds) 1986. Writing for Social Scientist: How To Start And Finish
Your Thesis, Book, Or Article. London: University of Chicago Press Ltd.
Bell, J. (2004). Doing Your Research Project: A Guide for First-time Researchers in Education
and Social Science, 3rd edn. Berkshire, UK: Open University Press.
Bell, J. (2010). Doing Your Research Project: a Guide For First-time Researchers in Education
and Social Science. 5th edn. Maidenhead: Open University Press
Brian, Allison (Eds.) 1996, 1998, 2000. Research Skills for Students. London: Kogan Page
Limited.
Chapin, P. G. (2004). Research Projects and Research Proposals; A Guide to Scientists Seeking
Funding. UK: Cambridge University Press.
Denscombe, M. (2010) The good research guide. 4th edn. Maidenhead: Open University Press
Duane, D. (2000). Introduction to Measurements & Error Analysis. Retrieved February 12,
2012, from The University of North Carolina at Chapel Hill, Department of Physics and
Astronomy : http://www.physics.unc.edu/~deardorf/uncertainty/UNCguide.html
Eade, Deborah (Ed.) 2003. Development Methods and Approaches: Critical Reflections. Oxford;
OXFAM GB.
emathzone. (2012). Continuous Random Variable. Retrieved Feb 2012, from emathzone:
http://www.emathzone.com/tutorials/basic-statistics/continuous-random-variable.html
Gagnon, S. (Undated). How cold is liquid nitrogen? Retrieved from Jefferson Lab:
http://education.jlab.org/qa/liquidnitrogen_01.html
Ghanfoor A. (2006). Manual for synopsis and thesis preparation. University of Agriculture,
Faisalabad, Pakistan.
Hart, C. (1998) Doing a literature review: releasing the social science imagination. Thousand
Oaks, Sage
Harvey, G. (1998) Writing with sources: a guide for students. Indiana: Hackett Publishing
Ivan Iachine, Lars Korsholm, Henrik Støvring, Kirstin Vach, Werner Vac (2004). Stata Reference
Manual
James H. Stock and Mark W. Watson, (2003). Introduction to Econometrics
Julie Pallant, (2002). A step by step guide to data analysis using SPSS for Windows
School of Graduate Studies-KNUST. (undated). Manual for thesis preparation for Masters and
Doctoral degrees awarded by the Kwame Nkrumah University of Science and Technology.
School of Graduate Studies, KNUST, Kumasi, Ghana.
Kenneth L. Simons, (2010). Useful Stata Commands
Kumekpor, T.K.B. (2002). Research Methods and Techniques of Social Research, Accra,
SunLife Publications.
Lester, J. (2005) Writing research papers: a complete guide. 11th edn. New York, Longman
Narasimhan, B. (1996). The Normal Distribution. Retrieved Jan 30, 2012, from Stanford
University : http://www-stat.stanford.edu/~naras/jsm/NormalDensity/NormalDensity.html
Neale, P., Thapa, S. and Boyce, C. (2006). Preparing a Case Study: a guide for designing and
conducting a Case Study for Evaluation Input, pathfinder International, Watertown,
Massachusetts.
Neville C. (2010). The Complete Guide To Referencing And Avoiding Plagiarism, 2nd edition.
UK: Open University Press.
Nsowah-Nuamah, N.N.N. (2005). A Handbook of Descriptive Statistics for Social and Biological
Sciences. Accra: Acadec Press.
Ogden, T.E. and Goldberg, I. A. (2002). Research Proposals; A Guide To Success, 3rd Edition.
USA: Academic Press
Seawright, J. and Gerring, J. (2008). “Case Selection Techniques in Case Study Research : A
Menu of Qualitative and Quantitative Options”, Political Research Quarterly 2008 61: 294.
Singleton, R.A., Jr. Bruce C. S. and Straits, M.M. (1993). Approaches to Social Research.
Second Edition. Oxford University Press, New York.
Susan B. Gerber, Kristin Voelkl Finn, (1999). Using SPSS For Windows. New York: State
University of New York Graduate School of Education
Taylor, J. R. (2004). An Introduction to Error Analysis: the study of uncertainties in physical
measurements. CA: University Science Books.
The Health Communication Unit (1999). Conducting Survey Research, The Health
Communication Unit, at the Centre for Health Promotion, University of Toronto.
Urdan, T. C. (2010). Statistics in Plain English. New York: Taylor & Francis Group.
WWF (2005). Logical Framework Analysis. Retrieved on 1st February, 2012 from:
http://www.artemis-services.com/downloads/logical-framework.pdf
Zaidah, Z. (2007). Case study as a research method. Universiti Teknologi Malaysia, Jurnal
Kemanusiaan bil.9, Jun.
https://classshares.student.usp.ac.fj/EN400/2007%20Lecture%20Materials/Sections%201,%202,%20and%203%20EN400.pdf
http://www.engr.sjsu.edu/bjfurman/courses/ME120/me120pdf/UncertaintyAnal.pdf
http://www.sonoma.edu/aa/gs/guidelines/toc.shtml
http://www.mhhe.com/mayfieldpub/tsw/toc.htm
Table of Contents
Acknowledgement
Unit 1
Unit 2
2.1.1 Motivation for Research in Engineering and some basic concepts
2.1.2 Classification of Engineering Experiments
Unit 3
3.4.4 Competence
3.4.5 Privacy
Advantages
Disadvantages
3.3.2.2 Advantages
3.4.4 Competence
Unit 4
SESSION 5.2: JOURNAL ARTICLES AND CONFERENCE PAPER PREPARATION
SESSION 5.4: SESSION 5.3: ABSTRACTS AND SUMMARIES AND REFERENCING
List of Tables
Table 2.1: Determining the average, average deviation and standard deviation
Table 3.2: Advantages and Disadvantages of Open-ended and Close questions
List of Figures
Figure 2.1: Power output vs. insolation angle for polycrystalline silicon solar panel
Figure 2.2: Power output for fixed orientation and tracking polycrystalline silicon solar panel
Figure 2.3: Determining Instrumental Limits of Error and Least Count
Figure 2.11: Average difference between expected value and sample mean
Unit 1
This unit introduces students to the basic preparations that precede a research project. These
include the preparation of concept notes, which are mostly directed towards donor/funding
agencies; research proposals, which are full proposals stating the need for the research as
well as the expected results; and thesis synopses, which are essentially research proposals
written specifically for academic purposes.
Learning Objectives
UNIT CONTENT
The title of the research should be concise and should focus the reader’s attention on the
critical theme of the proposed research. It should be short, usually not more than one line
in length, and free of unnecessary punctuation and repeated words.
1.1.2.2 Background
This contains a review of the main research work and current issues specific to the
subject area. It should also state what is already known about the research subject. It is
important to note that the background is not the same as the literature review; the
latter is not necessary for concept notes. The background is usually about 200 words in length.
This section should outline the research problem to be investigated clearly and without
ambiguity. It should not be more than 200 words.
1.1.2.4 Objectives
The main objectives as well as the specific objectives of the proposed research should be
clearly outlined in this section.
1.1.2.5 Methodology
This section outlines clearly the proposed methodology to be used for the research work.
It spells out exactly how the research work will be carried out and the procedures
involved. It should usually be about 100 words in length.
The proposed site for the research, time frame for the completion of the research as well
as a summary of the budget should clearly be outlined in this section.
A research proposal may be written for any of the following reasons: to request funding
for a research project, as a task in tertiary education (in which case it is referred to as a
thesis synopsis), or as a condition for employment at a research institution.
1.2.2.1 Title
The title must be succinct and should give the reader an overview of what to expect in the
main document. It should be on the first page of the proposal, should not exceed 20
words, and should be free of unnecessary punctuation and repeated keywords.
1.2.2.2 Abstract
The abstract is a concise summary of the main points of the proposal and should be kept as
short as possible without leaving out any important point. It should be a maximum of 500
words.
1.2.2.3 Background
This contains a summary of the background information to the research problem and the
context within which the study will take place. It draws a relation between the study,
the research idea and the policy environment. It should state what is already known about
the research area and how the research will complement what is already known. It should be
a maximum of 1,000 words.
It is important to state clearly the research problem and the significance of the research to
the community. This section of the proposal should be able to answer questions such as:
What is going to be studied/investigated? Why is it important to study this subject? It
should not be more than 250 words in length.
It is important to outline the key objective(s) of the research, which spell out what the
researcher seeks to accomplish. A single principal objective with two or three specific
objectives is usually enough. They should be listed in order of importance and should be a
maximum of 200 words.
The literature review should provide a brief description of available literature in terms of
research works done, policy statements and their implications, as well as the identification
of shortfalls to be studied and complemented. This section should indicate how existing
literature contributes to the proposed research and how the proposed research is also going
to add to existing work. It should be a maximum of 3,000 words.
1.2.2.7 Methodology
This section gives a good indication of what is expected out of the research. It joins the
data analysis and possible outcomes to the theory and questions that have been raised. It
should include the following;
Scope of inference (i.e., to what extent are the results applicable to other locations,
times, or situations?)
Pitfalls that may be encountered
Limitations to proposed methods
the fourth column presents important assumptions that are beyond the direct control of the
project but need to be fulfilled in order to ensure a successful implementation of the
project. A logical framework can only be done after a thorough analysis of the problems,
objectives and strategies to be employed in the project. Table 1.1 shows a typical structure
of a logical framework.
Table 1.1: Typical Structure of a Logical Framework
Costs                                                      Unit         # of units  Unit rate  Costs
1. Human Resources
1.1 Salaries (gross salaries including social security
    charges and other related costs, local staff)
1.2 Salaries (gross salaries including social security
    charges and other related costs, expat/int. staff)     Per month
1.3 Per diems for missions/travel
Subtotal Human Resources
2. Travel
2.1 International travel                                   Per flight
2.2 Local transportation                                   Per month
Subtotal Travel
3. Equipment and supplies
3.1 Purchase or rent of vehicles                           Per vehicle
3.2 Furniture, computer equipment
3.3 Machines, tools…
3.4 Spare parts/equipment for machines, tools
3.5 Other (please specify)
Subtotal Equipment and supplies
4. Local office
4.1 Vehicle costs                                          Per month
4.2 Office rent                                            Per month
4.3 Consumables - office supplies                          Per month
4.4 Other services (tel/fax, electricity/heating,
    maintenance)                                           Per month
Subtotal Local office
5. Other costs, services
5.1 Publications
5.2 Studies, research
5.3 Expenditure verification
5.4 Evaluation costs
5.5 Translation, interpreters
5.6 Financial services (bank guarantee costs etc.)
5.7 Costs of conferences/seminars
5.8 Visibility actions
Subtotal Other costs, services
6. Other
Subtotal Other
7. Subtotal direct eligible costs of the Action (1-6) (excluding taxes)
8. Provision for contingency reserve (maximum 5% of 7, subtotal of direct
   eligible costs of the Action) (excluding taxes)
9. Total direct eligible costs of the Action (7+8) (excluding taxes)
10. Administrative costs (maximum 7% of 9, total direct eligible costs of
    the Action) (excluding taxes)
11. Total eligible costs (9+10) (excluding taxes)
12. Taxes
13. Total eligible/accepted costs of the Action (11+12)
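The roll-up in lines 7 to 13 of the budget template is simple arithmetic, and a short script can help check a draft budget. The sketch below is only illustrative: the function name `budget_totals` and the subtotal figures are assumptions, not part of the template, and the 5% and 7% ceilings are taken from lines 8 and 10 above.

```python
def budget_totals(subtotals, contingency_rate=0.05, admin_rate=0.07, taxes=0.0):
    """Roll up budget lines 7 to 13 from the subtotals of headings 1 to 6."""
    if not 0 <= contingency_rate <= 0.05:
        raise ValueError("contingency reserve may not exceed 5% of line 7")
    if not 0 <= admin_rate <= 0.07:
        raise ValueError("administrative costs may not exceed 7% of line 9")
    line7 = sum(subtotals)            # subtotal direct eligible costs (1-6)
    line8 = contingency_rate * line7  # contingency reserve
    line9 = line7 + line8             # total direct eligible costs
    line10 = admin_rate * line9       # administrative costs
    line11 = line9 + line10           # total eligible costs
    line13 = line11 + taxes           # total eligible/accepted costs
    return {7: line7, 8: line8, 9: line9, 10: line10, 11: line11, 12: taxes, 13: line13}


# Hypothetical subtotals for headings 1 to 6, for illustration only.
totals = budget_totals([40000, 8000, 12000, 6000, 3000, 1000])
for line, amount in totals.items():
    print(f"Line {line}: {amount:,.2f}")
```

Note that the administrative costs are computed on line 9 (which already includes the contingency reserve), not on line 7.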
SESSION 1.3: THESIS SYNOPSES
The title must be concise and should give the reader an overview of what to expect in the
main document. It should be on the first page and must be the same as the title of the
thesis. It should not exceed 20 words and should be free of unnecessary
punctuation and repeated keywords.
1.3.2.2 Introduction/Background
In this section, outline briefly the relevance of the research work to be presented in the
thesis. The introduction should be precise and include only relevant background
material in that particular field of study. It is important to provide information on past
work by other researchers, giving appropriate references. A maximum of one
page, preferably half a page, is allotted to this section.
1.3.2.3 Justification/Motivation
1.3.2.5 Methods
It is important to outline how you will approach your research topic. One should
demonstrate, in this section, that the chosen method or approach will serve to advance the
thesis. If you need to gather data, describe how you will go about this. This might involve
archival research, interviews with stakeholders, or various forms of fieldwork. There are
many established research methodologies. If your approach is experimental or
comparative, outline how this approach will yield results.
A project plan outlines in specific detail how a project will be conducted, who will work
on which part, and when and in what order each part will be accomplished. Develop this
section with some care, since it will provide you a means of measuring your progress in
relation to your allotted time. This section should detail the timing of specific activities to
be implemented towards the achievement of the specific objectives within a reasonable
It is important to present the full budget as well as the various resources available for the
research work in this section. This section should indicate any bibliographic, laboratory,
computing or other physical resources required to execute the study and a budget for
projected expenditures, including stipend/allowances where needed.
1.3.2.8 References
List the references in the same order as they are referred to in the synopsis, and make sure
all references listed here are properly cited in the text. It is best to get into the habit of
using a standard referencing system (preferably in conformity with the Harvard System)
so that material can be transferred into your thesis. Do not cite from memory without
referencing.
1.3.2.10 Signature(s)
Signatures are required to attest that your proposed Supervisor(s) is
(are) in agreement with your proposed study as elaborated in the synopsis.
Unit Summary
Concept notes, research proposals and thesis synopses are the first documents that come to mind
when one thinks of research work. These documents give various levels of information about the
research work and are mostly intended for different stakeholders. This unit introduces
students to the preparation of concept notes, which are mostly directed towards donor/funding
agencies; research proposals, which are full proposals stating the need for the research as
well as the expected results; and thesis synopses, which are essentially research proposals
written specifically for academic purposes. It is intended that at the end of this unit, the
student should be able to write a concept note capable of securing funding for a research
project, specifically the master's research project, and also prepare a research proposal as
well as a thesis synopsis for the master's thesis.
Unit Assignments 1
1. Prepare a zero-order draft of your thesis synopsis in PowerPoint format for
presentation to the class.
Unit 2
ENGINEERING RESEARCH DESIGN AND DATA ANALYSIS
Introduction
This Unit introduces the student to concepts and methods in engineering research. The first
section (2.1) presents various contexts in engineering practice which necessitate research and
classifies experiments that may be undertaken as part of the research. Procedures for the design
of experiments are also presented.
Section 2.2 is on experimental errors, and catalogues various sources from which errors can be
introduced into our experimental work. Students are also presented with tools for the analysis of
such errors and how they are propagated as measurements are repeated and computations are
done.
Section 2.3 looks at probability distributions and standard errors. In this section the student is
introduced to probability density functions and their common features. The normal distribution
(the most widely used) is then discussed along with the concept of standard scores and the
procedures for its application. The Poisson and Binomial probability distributions are
also briefly presented to conclude the section.
Section 2.4 is on standard errors and considers Errors of the Mean, the Central Limit Theorem
and the t-Distribution.
The Unit concludes with Section 2.5, on examples in normal distributions and their application to
engineering problems.
Learning Objectives
Unit content
Research in engineering is necessitated by factors such as an advantage that could
be realized by improving an existing technology (e.g. an existing drilling machine) or the
need to address a problem.
To guarantee the integrity of the research process and to obtain high-quality results and usable
conclusions, a number of practices are recommended below:
1
(US-NIST- National Institute of Standards and Technology)
2.1.2 Classification of Engineering Experiments
As part of research in engineering, experiments may be conducted for one or more of the
following reasons:
A theoretical relationship between two or more variables is already known (or at least
suspected) and an experiment is needed to verify or quantify this relationship.
A theoretical relationship between two or more variables is not available but rather
sought through an experiment.
A new product is being developed and a test is needed to confirm that it meets the design
specifications, before committing it to production.
The engineer is interested in "understanding" the process as a whole, in the sense that he/she
wishes (after design and analysis) to have in hand a list of the factors that affect the
process, ranked from most important to least important.
The engineer is interested in functionally modeling the process with the output being a good-
fitting (high predictive power) mathematical function, and to have good estimates (maximal
accuracy) of the coefficients in that function.
The engineer is interested in determining optimal settings of the process factors; that is, to
determine for each factor the level of the factor that optimizes the process response.
1. Scientific/Engineering Concept
2. Questions Posed
3. Equipment /Materials
4. Design of Procedure
5. Analysis of Results
6. Conclusions
The procedure prescribed above may be expanded further into a flow process for the design of
experiments as presented below:
Process Flow for Design of Experiments
1. Define the goals and objectives of the experiment. While the goal may be general, the
objectives need to be more specific and measurable, directly or indirectly;
2. Research any relevant theory and previously published data from similar experiments.
Performing computer simulations may also be part of this research, assuming that
appropriate software is available. The purpose of this step is to have an idea about what
to expect from the experiment;
3. Select the dependent and independent variable(s) to be measured;
4. Select appropriate methods for measuring these variables;
5. Choose appropriate equipment and instrumentation;
6. Select the proper range of the independent variable(s);
7. Determine an appropriate number of data points needed for each type of measurement;
8. Data analysis and reporting - qualitative analysis and quantitative analysis.
Additional Skills
In addition to the steps outlined above, the researcher must be careful to:
Analyzing and interpreting data constitutes an important component of research, and the
researcher should be able to:
EXAMPLE
A student was tasked to investigate and compare the power output of a solar panel with a fixed
orientation to that of a solar panel whose orientation tracked the sun. He also tried to verify that
the power output of a photovoltaic cell was a function of temperature.
METHODOLOGY
1. Define goals and objectives:
The goals and objectives for the experiment were to verify that:
A logarithmic relationship exists between angle of incidence of sunlight on a solar panel
and power output;
A tracking system increases power output by 20%; and
The power output of a solar cell is a function of temperature
2. Research relevant theory and previously published data:
The student chose a direct method for measuring the angle of the solar panel and measured
voltage and current to determine power output of the solar panel.
The student used a camera tripod, protractor, and plumb bob to orient and determine the angle of
the solar panel for the fixed panel measurements, and a sundial² rod to orient the panel normal to
the sun’s rays for the tracking measurements; a digital multimeter to measure current and
voltage; and a thermometer to measure the temperature of the solar panel.
The tripod allowed a 55-degree range of motion and this set the range for the angle of incidence.
For the tracking measurements, readings were taken from 6:45 am to 6:00 pm.
For investigating the effect of temperature, the student was limited by the available resources
to temperatures obtainable under ambient conditions and by cooling the solar panel with ice cubes.
2
A sundial is a device that determines the time of day by the position of the Sun.
To investigate the logarithmic relationship between angle of incidence and power output, the
student chose 5-degree increments, which resulted in 12 data points.
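The count of 12 data points follows directly from the 55-degree range and the 5-degree step. A minimal sketch of that count is given below; it assumes the sweep starts at 0 degrees and includes both end points, which the text does not state explicitly, and `angle_settings` is a hypothetical helper name.

```python
def angle_settings(start_deg, stop_deg, step_deg):
    """Return the angle settings from start to stop inclusive, in whole degrees."""
    return list(range(start_deg, stop_deg + 1, step_deg))


# Assumed sweep: 0 to 55 degrees in 5-degree increments.
angles = angle_settings(0, 55, 5)
print(len(angles), angles)
```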
For the tracking measurements, the student reoriented the panel to be normal to the sun’s rays
using the following schedule:
Figure 2.1: Power output vs. insolation angle for polycrystalline silicon solar panel
Figure 2.2: Power output for fixed orientation and tracking polycrystalline silicon solar panel
It is claimed that a 10% blend of biodiesel with conventional diesel improves the emissions characteristics
of engines. Design an experiment to investigate the veracity of this claim.
Instrument Limit of Error (ILE): Good measuring tools are calibrated against national and
international standards, e.g. those of ISO, IEC, the US National Institute of Standards and
Technology (NIST), the Ghana Standards Board, etc.
The Instrumental Limit of Error (ILE) is generally taken to be the least count or some fraction
(1/2, 1/5, 1/10) of the least count. For some devices the ILE is given as a tolerance or a
percentage.
Resistors may be specified as having a tolerance of 5%, implying that the ILE is 5% of the
resistor's value.
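The two ways of stating the ILE described above can be sketched as follows. The function names and the example instruments (a ruler with a 1 mm least count and a 470 ohm resistor) are assumptions chosen only for illustration.

```python
def ile_from_least_count(least_count, fraction=0.5):
    """ILE taken as a fraction (e.g. 1/2, 1/5, 1/10) of the least count."""
    return least_count * fraction


def ile_from_tolerance(nominal_value, tolerance_percent):
    """ILE taken as a percentage tolerance of the nominal value."""
    return nominal_value * tolerance_percent / 100.0


# Hypothetical ruler with a 1 mm least count, ILE taken as half the least count.
print(ile_from_least_count(1.0, 0.5), "mm")
# Hypothetical 470 ohm resistor with a 5% tolerance.
print(ile_from_tolerance(470, 5), "ohm")
```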
Figure 2.3: Determining Instrumental Limits of Error and Least Count
Table 2.3: Determining the average, average deviation and standard deviation
Time, t, sec    (t - <t>), sec     |t - <t>|, sec       (t - <t>)^2, sec^2
7.4             -0.2               0.2                  0.04
8.1              0.5               0.5                  0.25
7.9              0.3               0.3                  0.09
7.0             -0.6               0.6                  0.36
<t> = 7.6       <t - <t>> = 0.0    <|t - <t>|> = 0.4    <(t - <t>)^2> = 0.247
Average                            Average deviation    Standard dev = 0.50
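The entries in the table can be reproduced with a few lines of code. The sketch below uses the four timings from the table; the function name `summarize` is an assumption. The tabulated value of 0.247 for the mean squared deviation is matched when the sum of squares is divided by N - 1 rather than N, so that divisor is used here.

```python
import math


def summarize(readings):
    """Return the mean, average deviation and standard deviation of the readings."""
    n = len(readings)
    mean = sum(readings) / n
    deviations = [t - mean for t in readings]
    average_deviation = sum(abs(d) for d in deviations) / n
    # Divide by (n - 1), which reproduces the 0.247 shown in the table.
    variance = sum(d * d for d in deviations) / (n - 1)
    return mean, average_deviation, math.sqrt(variance)


mean, average_deviation, std_dev = summarize([7.4, 8.1, 7.9, 7.0])
print(f"mean = {mean:.1f} s, average deviation = {average_deviation:.1f} s, "
      f"standard deviation = {std_dev:.2f} s")
```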
The average (mean) value is usually taken as the best estimate, and is determined as:
\[ \text{Average } \langle t \rangle = \frac{t_1 + t_2 + t_3 + \cdots + t_N}{N} \]
where N is the number of observations or measurements.
A way to express the variation among the measurements is to use the average deviation. This
statistic tells us on average (with 50% confidence) how much the individual measurements vary
from the mean. As indicated above, the average deviation is calculated by summing the absolute
values of the deviation of measurements from the mean, and dividing by the number of
observations.
In the example above (section 3.2.2), the standard deviation of 0.5 implies that for the same
series of measurements, an additional measurement taken may be expected (with about 68%
confidence) to lie within ± 0.5 of the average value of 7.6 sec.
Fractional Uncertainty
When a reported value is determined by taking the average of a set of independent readings, the
fractional uncertainty is given by the ratio of the uncertainty divided by the average value.
\[ \text{Fractional Uncertainty} = \frac{\text{Uncertainty}}{\text{Average}} \]
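As a quick numerical illustration of this ratio, the sketch below uses the average of 7.6 sec from the timing example and, purely as an assumption for illustration, takes the standard deviation of 0.5 sec as the uncertainty. The function name is hypothetical.

```python
def fractional_uncertainty(uncertainty, average):
    """Ratio of the uncertainty to the average value (dimensionless)."""
    if average == 0:
        raise ValueError("fractional uncertainty is undefined for a zero average")
    return abs(uncertainty / average)


frac = fractional_uncertainty(0.5, 7.6)
print(f"{frac:.3f} ({frac:.1%})")
```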
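General theory:
For a function f(x, y, ...), a small change in f is given by
\[ df = \frac{\partial f}{\partial x}\,\delta x + \frac{\partial f}{\partial y}\,\delta y + \cdots \]
Taking the square of the above expression, we get the law of propagation of uncertainty:
\[ (df)^2 = \left(\frac{\partial f}{\partial x}\right)^2 (\delta x)^2 + \left(\frac{\partial f}{\partial y}\right)^2 (\delta y)^2 + 2 \left(\frac{\partial f}{\partial x}\right) \left(\frac{\partial f}{\partial y}\right) \delta x\,\delta y \]
If the measurements of x and y are uncorrelated, then \( \delta x\,\delta y = 0 \) and the error in the function f
may be approximated as:
\[ \Delta f = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 (\Delta x)^2 + \left(\frac{\partial f}{\partial y}\right)^2 (\Delta y)^2} \]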
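The uncorrelated form of the propagation law can also be evaluated numerically when the partial derivatives are tedious to work out by hand. The sketch below is one possible approach, not a prescribed method: it estimates each partial derivative with a central difference, and the function name `propagate`, the step size and the sample values are all assumptions.

```python
import math


def propagate(f, values, uncertainties, rel_step=1e-6):
    """Estimate the uncertainty in f(*values) for uncorrelated inputs.

    Each partial derivative is approximated by a central difference, then the
    contributions (df/dx_i * dx_i) are combined in quadrature.
    """
    total = 0.0
    for i, (v, dv) in enumerate(zip(values, uncertainties)):
        h = rel_step * (abs(v) if v != 0 else 1.0)
        up = list(values)
        down = list(values)
        up[i] = v + h
        down[i] = v - h
        partial = (f(*up) - f(*down)) / (2 * h)
        total += (partial * dv) ** 2
    return math.sqrt(total)


# Hypothetical measurements x = 2.0 ± 0.1 and y = 3.0 ± 0.2, with f = x * y.
delta_f = propagate(lambda x, y: x * y, [2.0, 3.0], [0.1, 0.2])
print(f"f = {2.0 * 3.0:.1f} ± {delta_f:.2f}")
```

For f = xy the analytic result is the square root of y²(Δx)² + x²(Δy)², which gives 0.5 for these sample values, so the numerical estimate can be checked against it.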
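Examples:
a) If
\[ f = x + y \]
\[ \frac{\partial f}{\partial x} = 1, \quad \frac{\partial f}{\partial y} = 1 \]
\[ \Delta f = \sqrt{(\Delta x)^2 + (\Delta y)^2} \]
b) If
\[ f = xy \]
\[ \frac{\partial f}{\partial x} = y, \quad \frac{\partial f}{\partial y} = x \]
\[ \Delta f = \sqrt{y^2 (\Delta x)^2 + x^2 (\Delta y)^2} \]
Dividing by the function f = xy, we obtain
\[ \frac{\Delta f}{f} = \sqrt{\left(\frac{\Delta x}{x}\right)^2 + \left(\frac{\Delta y}{y}\right)^2} \]
c) If
\[ f = x/y \]
\[ \frac{\partial f}{\partial x} = \frac{1}{y}, \quad \frac{\partial f}{\partial y} = -\frac{x}{y^2} \]
\[ \Delta f = \sqrt{\left(\frac{1}{y}\right)^2 (\Delta x)^2 + \left(\frac{x}{y^2}\right)^2 (\Delta y)^2} \]
\[ \frac{\Delta f}{f} = \sqrt{\left(\frac{\Delta x}{x}\right)^2 + \left(\frac{\Delta y}{y}\right)^2} \]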
Therefore the uncertainty in the function f is the same for both multiplication and division. Note
that unlike the sums, this is always written as fractional errors for dimensional consistency.
\[ f = x^m y^n \]
\[ \frac{\Delta f}{f} = \sqrt{\left(\frac{m\,\Delta x}{x}\right)^2 + \left(\frac{n\,\Delta y}{y}\right)^2} \]
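Addition S = A + B
\[ \Delta S = \sqrt{(\Delta A)^2 + (\Delta B)^2} \]
Subtraction D = A - B
\[ \Delta D = \sqrt{(\Delta A)^2 + (\Delta B)^2} \]
Multiplication P = A × B
\[ \frac{\Delta P}{P} = \sqrt{\left(\frac{\Delta A}{A}\right)^2 + \left(\frac{\Delta B}{B}\right)^2} \]
Division Q = A / B
\[ \frac{\Delta Q}{Q} = \sqrt{\left(\frac{\Delta A}{A}\right)^2 + \left(\frac{\Delta B}{B}\right)^2} \]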
For equations involving mixtures of multiplication, division, addition, subtraction, and powers;
the same basic rules are applied systematically to evaluate the error contained in the dependent
variable as a result of errors in the independent variables.
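A mixed expression can be handled one operation at a time with the rules for sums and for products. The sketch below does this for a hypothetical quantity q = (a + b) / c; the helper names and all numerical values are assumptions for illustration, and the step-by-step approach presumes each measured variable appears only once in the expression.

```python
import math


def combine_sum(*abs_uncertainties):
    """Sum or difference: absolute uncertainties add in quadrature."""
    return math.sqrt(sum(u * u for u in abs_uncertainties))


def combine_product(*terms):
    """Product, quotient or power: returns the fractional uncertainty.

    Each term is (value, uncertainty, exponent); use exponent -1 for a divisor.
    """
    return math.sqrt(sum((p * dx / x) ** 2 for x, dx, p in terms))


# Hypothetical readings, chosen only for illustration.
a, da = 10.0, 0.3
b, db = 5.0, 0.4
c, dc = 2.0, 0.1

s = a + b                    # step 1: the sum
ds = combine_sum(da, db)     # absolute uncertainty of the sum
q = s / c                    # step 2: the quotient
dq = q * combine_product((s, ds, 1), (c, dc, -1))
print(f"q = {q:.2f} ± {dq:.2f}")
```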
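Example
For the difference between two temperature readings, each with an uncertainty of ±0.2 °C:
T = (26.4 - 19.2) °C ± ΔT
= 7.2 °C ± ΔT
\[ \Delta T = \sqrt{(0.2)^2 + (0.2)^2} = 0.28\ ^{\circ}\mathrm{C} \]
T = (7.2 ± 0.28) °C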
1. \( z = x - 2.5y + w \) for x = (4.72 ± 0.12) m, w = (15.63 ± 0.16) m
2. \( z = w \times \frac{x}{y} \) for w = (14.42 ± 0.03) m/s², x = (3.61 ± 0.18) m, y = (650 ± 20) m/s
3. \( z = A \sin y \) for A = (1.602 ± 0.007) m/s, y = (0.774 ± 0.003) rad
A probability distribution is a function that describes the probability of a random variable [3]
taking certain values. In more precise definitions, a distinction is made between discrete and
continuous random variables.
A random variable is called continuous if it can assume all possible values in the possible
range of the random variable. Suppose the temperature in a certain city in the month of
June in the past many years has always been between 35 °C and 45 °C. The temperature can
take any value within the range 35 °C to 45 °C, however small the interval between two
values may be.
For a discrete random variable, the values of the variable are exact, like 0, 1 or 2 good bulbs.
(emathzone, 2012)
The probability function of a continuous random variable is called the probability density
function. It is denoted by 𝑓(𝑥), where 𝑓(𝑥)∆𝑥 is approximately the probability that the random
variable X takes a value between 𝑥 and 𝑥 + ∆𝑥, and ∆𝑥 is a very small change in X.
[3] A random variable is a numerical variable whose measured value can change from one replicate
experiment to another.
Figure 2.4: Plot of f(x) vs X
The probability that X is between a and b is determined as the integral of 𝑓(𝑥) from a to b, and
is expressed mathematically as:
\[ P(a < X < b) = \int_a^b f(x)\,dx \]
[4] The probability of a continuous random variable assuming a specific value is zero. This does
not necessarily mean that a particular value cannot occur. The interpretation is that the point
(event) is one of an infinite number of possible outcomes.
2.3.3 Mean and Variance
Important parameters in presenting probability distributions include the mean (arguably the most
popular statistical parameter), the variance and standard deviation. These parameters could be
based on the population (N) or on a sample of the population (n), see figure 2.5 below:
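Mean
\[ \mu = \frac{\sum X}{N} \quad \text{(based on the population, } N\text{)} \]
Or
\[ \bar{X} = \frac{\sum X}{n} \quad \text{(based on a sample, } n\text{)} \]
Variance
\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} \]
\[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \]
Where:
X = an individual value
N = number of values in the population
n = number of values in the sample
The standard deviation (σ or s) is the square root of the corresponding variance.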
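The population and sample formulas differ only in the divisor (N versus n - 1). A minimal sketch using Python's standard `statistics` module illustrates this; the six scores are hypothetical values chosen only to keep the arithmetic easy to follow.

```python
import statistics

scores = [2, 4, 4, 5, 7, 8]            # hypothetical data, not from the course text

mean = statistics.fmean(scores)         # sum(X) / number of values
pop_var = statistics.pvariance(scores)  # divides by N (population formula)
samp_var = statistics.variance(scores)  # divides by n - 1 (sample formula)
samp_sd = statistics.stdev(scores)      # square root of the sample variance

print(mean, pop_var, samp_var)          # 5.0 4.0 4.8
```

Treating the same six values as a sample rather than a whole population raises the variance from 4.0 to 4.8, because the sum of squared deviations (24) is divided by 5 instead of 6.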
2.3.4 The Normal Distribution (also called the bell-curve)
The normal distribution is the most widely used model for the distribution of random variables
and helps in determining the probability of something occurring in a given sample just due to
chance. It is also called the bell-curve because of its resemblance to the shape of a bell (see
below, Fig 2.6).
• Symmetrical - the upper half and the lower half of the distribution are mirror images of
each other.
• Unimodal - the mean, median, and mode are all in the same place, in the center of the
distribution (i.e., the top of the bell curve); and the normal distribution is highest in the
middle.
• Asymptotic - the upper and lower tails of the distribution never actually touch the
baseline, also known as the x-axis.
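In a normal distribution, a random variable X has a probability density function given by:
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \quad \text{for } -\infty < x < \infty \]
Where μ is the mean and σ is the standard deviation of the distribution.
The notation 𝑁(𝜇, 𝜎²) is often used to denote a normal distribution with mean μ and variance σ².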
When a sample of scores is not normally distributed, two terms, skew and kurtosis, are used to
characterise it.
If there are a few scores creating an elongated tail at the higher end of the distribution, it is said
to be positively skewed (see Fig 2.7). If the tail is pulled out toward the lower end of the
distribution, the shape is called negatively skewed (see Fig 2.8).
Figure 2.8: Negatively skewed distribution
Kurtosis refers to the shape of the distribution in terms of height, or flatness. When a distribution
has a peak that is higher than that found in a normal, bell-shaped distribution, it is called
leptokurtic. When a distribution is flatter than a normal distribution, it is called platykurtic.
Using the mean and the standard deviation, researchers are able to generate a standard score,
also called a z score to help them understand where an individual score falls in relation to other
scores in the distribution.
A standard normal random variable is defined as a random variable with μ=0 and σ2=1. It is
normally denoted as Z.
Through a process of standardization, researchers are also better able to compare individual
scores in the distributions of variables. Standardization is simply a process of converting each
score in a distribution to a z score.
A z score indicates how far above or below the mean a given score in the distribution is in
standard deviation units. Standardization is simply the process of converting individual raw
scores in the distribution into standard deviation units.
The z-score is computed as indicated below, in terms of the mean and standard deviation:
\[ z = \frac{\text{raw score} - \text{mean}}{\text{standard deviation}} = \frac{X - \mu}{\sigma} \quad \text{or} \quad \frac{X - \bar{X}}{s} \]
The 68-95-99.7% Rule
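All normal probability density curves satisfy the following properties (see figure 2.9):
• About 68% of the values lie within 1 standard deviation of the mean (μ ± σ);
• About 95% of the values lie within 2 standard deviations of the mean (μ ± 2σ);
• About 99.7% of the values lie within 3 standard deviations of the mean (μ ± 3σ).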
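The percentages in the 68-95-99.7% rule can be checked from the cumulative standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Share of the distribution lying within k standard deviations of the mean
shares = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```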
Interpreting z-Scores
• z scores tell researchers instantly how large or small an individual score is relative to
other scores in the distribution.
• For example, if a student got a z score of -1.5 in an exam, it can be inferred that the
student scored 1.5 standard deviations below the mean in that exam.
• If another student had a z score of 0.29, it can be inferred that the student scored 0.29
standard deviation units above the mean in the exam.
Self Assessment 2.3
Quick Questions:
• What does a z-score of 1.0 mean?
• What DOES it say?
• What does it NOT say?
Suppose that the average score of a student in an automobile engineering class is 517, with a
standard deviation of 100, and the distribution of scores is normal. What is the score that marks
the 90th percentile?
Remarks
Remember that the 90th percentile is 40 percentile points above the mean in a normal
distribution, so we are looking for the z score at which 40% of the distribution falls between the
mean and this z-score.
OR
The z score at which 10% of the distribution falls above, because the 90th percentile score
divides the distribution into sections with 90% of the score falling below this point and 10%
falling above
From traditional statistics tables [5], the z score that corresponds with the 90th percentile
(probability of 0.9) is 1.28. These tables are developed using the probability function of a
normal distribution.
So z = 1.28
\[ X = \mu + (z)(\sigma) \]
\[ X = 517 + (1.28)(100) = 517 + 128 = 645 \]
[5] For cumulative standard normal distribution
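The table lookup can be reproduced with the inverse of the cumulative normal distribution. This is a sketch using `statistics.NormalDist` from the Python standard library, with the mean of 517 and standard deviation of 100 taken from the example; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 517, 100

z90 = NormalDist().inv_cdf(0.90)            # z score with 90% of the area below it
x90 = mu + z90 * sigma                      # X = mu + z * sigma
same = NormalDist(mu, sigma).inv_cdf(0.90)  # the same result in one step

print(round(z90, 2), round(x90))            # 1.28 645
```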
The binomial distribution is used for reporting the outcomes of random experiments consisting
of n repeated trials such that:
o The trials are independent,
o Each trial results in only two possible outcomes, labeled as success and failure,
and
o The probability of a success on each trial, denoted as p, remains constant.
The random variable X that equals the number of trials that result in a success has a binomial
distribution with parameters p and n, where 0 < 𝑝 < 1 and 𝑛 = 1, 2, 3, …
The mean and variance are determined as:
\[ \mu = E(X) = np, \qquad \sigma^2 = V(X) = np(1 - p) \]
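As an illustration, the sketch below assumes the standard binomial probability function, f(x) = C(n, x) p^x (1 - p)^(n - x), and checks numerically that the mean is np and the variance is np(1 - p). The values n = 10 and p = 0.3 are hypothetical.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3  # hypothetical parameters
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * f for x, f in enumerate(pmf))
variance = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))
print(round(mean, 6), round(variance, 6))  # 3.0 2.1
```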
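The Poisson distribution is used to model the number of events over an interval, such as the
number of e-mails that arrive in an hour. Assume events occur at random throughout the
interval, and that the interval can be partitioned into subintervals of small enough length such
that:
o The probability of more than one event in a subinterval is zero,
o The probability of one event in a subinterval is the same for all subintervals and
proportional to the length of the subinterval, and
o The event in each subinterval is independent of other subintervals.
Then, if the mean number of counts in the interval is 𝜆 > 0, the random variable X which is the
number of counts in the interval has a Poisson distribution with parameter λ, and the
probability function is:
\[ f(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \dots \]
The mean and variance of X are both equal to λ:
\[ \mu = E(X) = \lambda, \qquad \sigma^2 = V(X) = \lambda \]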
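The Poisson probability function can be evaluated directly. In this sketch the rate of 4 e-mails per hour is a hypothetical value chosen for illustration, and the function name is an assumption.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x events when the mean count is lam."""
    return exp(-lam) * lam ** x / factorial(x)

lam = 4.0  # hypothetical mean number of e-mails per hour
p_none = poisson_pmf(0, lam)
p_at_most_2 = sum(poisson_pmf(x, lam) for x in range(3))
print(round(p_none, 4), round(p_at_most_2, 4))  # 0.0183 0.2381
```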
The standard error is the measure of how much random variation we would expect from samples
of equal size drawn from the same population.
When samples are drawn from a given population, say the scores by students in an examination,
the samples will be characterized by their own means (sample means). As an example, suppose
100 students score marks ranging from 2 to 10 in an examination in which 0 is the lowest and 10
the highest possible mark, and we draw 10 students at random from the population of 100. The
scores of these 10 students will yield a mean of, say, 5.5. If these 10 students are put back into
the population and another sample of 10 students is drawn, their scores may yield another mean
of, say, 6.0. If this process is continued, a distribution of the means of the samples will be
obtained, as indicated in figure 2.10 below.
The distribution of the means of the samples drawn also possesses the characteristics of other
probability distributions, i.e. the mean and standard deviation. The mean of the sampling
distribution is called the expected value of the mean, because it is the same as the population
mean. The associated standard deviation (of the sampling distribution) is called the standard
error.
The standard error of the mean refers to the average difference between the expected value
(e.g., the population mean) and an individual sample mean as shown in Figure 2.11 below.
Figure 2.11: Average difference between expected value and sample mean
The Central Limit Theorem simply states that as long as you have a reasonably large sample
size (e.g., n ≥ 30), the sampling distribution of the mean will be approximately normally
distributed, even if the distribution of scores in your sample is not.
This theorem says that even when you have a non-normal distribution in a population, the
sampling distribution of the mean will most likely approximate a normal, bell-shaped
distribution as long as you have at least 30 cases in your sample.
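A small simulation can make the sampling distribution, the standard error and the Central Limit Theorem concrete. The sketch below is built on stated assumptions: a strongly skewed (exponential) population of 10 000 values, a sample size of 30, 2 000 repeated samples and a fixed random seed are all illustrative choices, not values from the course text.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical skewed (non-normal) population
population = [random.expovariate(1.0) for _ in range(10_000)]

n = 30  # size of each sample
sample_means = [statistics.fmean(random.sample(population, n)) for _ in range(2_000)]

expected_value = statistics.fmean(sample_means)    # close to the population mean
standard_error = statistics.stdev(sample_means)    # spread of the sample means
predicted = statistics.pstdev(population) / n ** 0.5  # sigma / sqrt(n)

print(round(expected_value, 2), round(statistics.fmean(population), 2))
print(round(standard_error, 3), round(predicted, 3))
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is heavily skewed.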
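The t distribution is used for small samples, for which the standardized sample mean does not
follow the normal distribution exactly. With larger sample sizes (n ≥ 120) the t distribution is
virtually identical to the normal distribution.
Whenever the population standard deviation is not known and an estimate from a sample must be
used, it is wise to use the family of t distributions.
When σ is known:
\[ z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]
When σ is not known:
\[ t = \frac{\bar{X} - \mu}{s_{\bar{X}}}, \qquad s_{\bar{X}} = \frac{s}{\sqrt{n}} \]
Where:
μ = population mean
σ_X̄ = standard error using the population standard deviation
X̄ = sample mean
s_X̄ = sample estimate of the standard error
s = sample standard deviation
n = sample size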
With a known population mean, what is the probability of obtaining a sample with a particular
mean?
Example:
Suppose the average American man exercises for 60 minutes a week. Suppose, further, that I have
a random sample of 144 men and that this sample exercises for an average of 65 minutes per week
with a standard deviation of 10 minutes. If the actual population mean is 60, what is the
probability of getting, by chance, a random sample of this size with a mean of 65?
\[ t = \frac{65 - 60}{10/\sqrt{144}} = \frac{5}{0.83} \approx 6.0 \]
From t-tables, the probability of getting a t value of this size or larger by chance with a sample of
this size is less than 0.001
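The calculation in the example can be sketched as a short function; the function name and argument names are illustrative assumptions. Without intermediate rounding the statistic is 6.0 (rounding the standard error to 0.83 before dividing gives 6.02).

```python
import math

def t_statistic(sample_mean, population_mean, s, n):
    """t = (sample mean - population mean) / (s / sqrt(n))."""
    standard_error = s / math.sqrt(n)  # sample estimate of the standard error
    return (sample_mean - population_mean) / standard_error

t = t_statistic(sample_mean=65, population_mean=60, s=10, n=144)
print(round(t, 2))  # 6.0
```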
Self Assessment 2.4
An article in the Journal of Heat Transfer described a new method for measuring the thermal
conductivity of Armco iron. Using a temperature of 100 °F and a power input of 550 W, the
following 10 measurements of thermal conductivity were obtained (in Btu/hr-ft-°F):
41.60, 41.48, 42.34, 41.95, 41.86,
42.18, 41.72, 42.26, 41.81, 42.04
Determine the standard error of the sample mean.
2.5.1 Part 1 – Examples in Normal Distributions and z-Scores
Find the probability P and represent it on a normal distribution diagram, under the following
assumptions for the normalized score Z:
(7) Find the value of z such that 𝑃(−𝑧 < 𝑍 < 𝑧) = 0.99
Q1-SOLUTION
P(Z>1.26)
Q2-SOLUTION
P(Z<-0.86)
P(Z<-0.86)= 0.19490
Q3-SOLUTION
P(Z>-1.37)
Remember normal distributions are symmetrical!
P(Z<1.37)=0.91465
Q4-SOLUTION
P(-1.25<Z<0.37)
P(Z<0.37)-P(Z<-1.25)
P(Z<0.37)=0.64431;
P(Z<-1.25)= 0.10565
P(Z<0.37)-P(Z<-1.25)= 0.64431-0.10565=0.53866
Q5-SOLUTION
P(Z≤-4.6)
P(Z≤-4.6) is not available in the tables, but using the last score of -3.99; P(Z≤-3.99) = 0.00003
Q6-SOLUTION
The z in the inequality above is the value for which P(Z≤z)=0.95.
Search through the probabilities in the Tables for the value closest to 0.95.
𝒛 = 𝟏. 𝟔𝟓
Q7-SOLUTION
Using the symmetry concept, the area remaining in each tail outside the shaded region is
(1-0.99)/2=0.005. The value for z therefore corresponds to a cumulative probability of 0.995.
The nearest probability in the Tables is 0.99506, when z=2.58
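The table look-ups in the solutions above can be cross-checked with the cumulative standard normal distribution in Python's standard library. This is only a checking sketch; the variable names q1 to q7 are illustrative labels for the seven solutions, and small differences from the printed tables are due to rounding.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

q1 = 1 - Z.cdf(1.26)               # P(Z > 1.26)
q2 = Z.cdf(-0.86)                  # P(Z < -0.86)
q3 = 1 - Z.cdf(-1.37)              # P(Z > -1.37), equal to P(Z < 1.37) by symmetry
q4 = Z.cdf(0.37) - Z.cdf(-1.25)    # P(-1.25 < Z < 0.37)
q5 = Z.cdf(-4.6)                   # P(Z <= -4.6), practically zero
q6 = Z.inv_cdf(0.95)               # z with P(Z <= z) = 0.95
q7 = Z.inv_cdf(0.995)              # z with P(-z < Z < z) = 0.99

for name, value in [("Q1", q1), ("Q2", q2), ("Q3", q3), ("Q4", q4),
                    ("Q5", q5), ("Q6", q6), ("Q7", q7)]:
    print(name, f"{value:.5f}")
```

The exact values for Q6 and Q7 are about 1.645 and 2.576; the tables give 1.65 and 2.58 because they list z only to two decimal places.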
Questions
(ii) What is the probability that a sample’s strength is between 5800 kg/cm2 and 5900
kg/cm2?
(iii) What compressive strength is exceeded by 95% of the samples?
2. The fill volume of an automated filling machine used for filling cans of carbonated
beverage is normally distributed with a mean of 12.4 fluid ounces (fl oz) and a standard
deviation of 0.1 fluid ounce.
(i) What is the probability that a fill volume is less than 12 fluid ounces?
(ii) If all cans less than 12.1 or greater than 12.6 ounces are scrapped, what
proportion of cans is scrapped?
(iii) Determine the specifications that are symmetric about the mean that include 99%
of all cans.
Unit Summary
Key terms/ New Words in Unit
Least Count
Sampling error
Uncertainty
Random variable
Standard error
Expected value
Z-score
Standardization
Unit Assignments 2
Calculate z and Δz for questions 1 and 2:
a. What is the probability that a reaction requires more than 0.5 second?
b. What is the probability that a reaction requires between 0.4 and 0.5
second?
Unit 3
The purpose of this unit is to introduce participants to research designs and the array of research
methodologies within the social sciences. Emphasis is placed on the “empirical social science
research” which involves the design of data collection instruments and the collection,
management, simulation, analysis and presentation of data about people and their social contexts
by a range of methods.
Learning Objectives
After reading this unit you should be able to:
1. Identify the research approaches available to researchers in the social
sciences;
2. Know the factors that affect the effectiveness of these research designs
3. Operationalise these research approaches in ways that the weaknesses do
not limit the credibility of the research findings.
UNIT CONTENTS
SESSION 3.1: SURVEY RESEARCH
3.1.1 Introduction to Survey Research
3.1.2 Detailed Steps in Conducting Successful Surveys
SESSION 3.2 CASE STUDY RESEARCH
SESSION 3.1 SURVEY RESEARCH
3.1.1 Introduction to Surveys
A survey is a systematic method of collecting data from a population of interest. Generally,
survey research tends to be quantitative in nature and aims to collect information from a sample
such that the results are representative of the population within a certain degree of error.
Surveys gather quantitative information, usually through the use of a structured and
standardized questionnaire. They are appropriate for assessing perceptions, opinions, knowledge,
attitudes and behaviors using structured questionnaires, which are often close-ended.
There are about 12 steps in conducting a survey research. These steps are briefly described
below.
Self Assessment 3.1
You have observed that several children of school-going age are not in school in a
particular farming community. You want to investigate the causes of this phenomenon
through a survey.
Which sub groups would be of interest to you as a researcher?
When will you undertake the survey?
Answer Tips
List everyone who has a stake in children’s schooling in a locality.
Identify the reasons why timing is of the essence in data collection.
Further Insight
The budget line for a UNDP survey (which was to “address gaps in data on energy access for
rural and urban areas in Ghana”) undertaken by The Energy Center, KNUST was US$ 29 000.
The original intent was to survey 56 communities from the 10 administrative regions of Ghana.
After assessing the survey expenses, it became clear that the budget could cover only 15
communities from three administrative regions. If the planners had glossed over this important
stage of the research process, the survey would have landed on an impermeable rock.
The third step in a survey research is to decide on the methods to be used, that is, the most
appropriate method for the research work. Primarily, there are three methods for
collecting survey data:
• Face to face interviews;
• Mailed and e-mailed questionnaires; and
• Telephone and computerized telephone interviews.
In the exercise in 3.1, and with your appreciation of the pros and cons of the three
interview methods in Table 3.1, which of the methods will you use in your home district?
Briefly explain the factors you considered in your choice of the most appropriate
interview method.
Answer Tips
The methods you will use should be informed by how effective each of the
methods will be in your home-district.
Language and Wording
Proper wording of the questions is essential. The questions should be simple and straightforward
to ensure that respondents understand them correctly. Highly technical terms, slang,
abbreviations and words which may be considered insulting should be avoided. All the
questions should be available in the native language of the respondent.
Recall Bias
When formulating the questions, it is imperative to bear in mind that people tend to forget
events. The longer the recall period, the worse the accuracy tends to be. Therefore, recall of
events should be assisted by adding aids to the questionnaire and by ordering the questions. For
example, holidays and national festivals can be used as reference points, or the respondents can
use a calendar.
questionnaires may not be avoided. Here, the interviews could be phased out so that the
responses will not suffer.
Types of Questions
Two types of questions are used in questionnaires:
Structured Questions
These are questions that are followed by a list of possible alternative responses from which
respondents select the responses that best describe their situation. It is impossible to list all
possible alternative responses, so it is customary to provide additional space for a response
labelled “others {specify}”. This takes care of all other responses which do not fit in the list of
alternative responses provided. (Look at the examples provided as an attachment.)
Unstructured Questions
Unstructured or open-ended questions are those that are left open for respondents to provide
answers. Respondents have the freedom to provide answers that they think are appropriate
irrespective of what the researcher thinks. Individuals respond in their own ways and the length
of the response is determined by the kind of space provided for the response. E.g. where a little
space is provided, a short response is provided too. The reverse is also true. An instrument can
have both open and close ended questions based on the objective behind each question.
Merits of Unstructured (Open-ended) Questions
• Open-ended questions have the tendency to stimulate respondents to think about their
feelings or motives and to express what they consider appropriate or most important.
• Responses express respondents’ feelings about a particular issue.
• Responses say a lot about the respondents in terms of their background, hidden motivation,
decisions and interests.
• Open-ended questions are easier and simpler to formulate.
• They allow for a greater depth of response.
c) Contingency questions
Where certain questions are only applicable to certain groups, they are followed by other
questions, which are referred to as contingency questions or filter questions. These follow-up
questions, asked after the initial question, are required to get further information from the
relevant sub-groups. They are used to probe for more information.
An example of contingency questions is as follows:
3. Have you eaten today?
Responses:
d) Matrix Questions
These are questions in which a single set of response options is used to answer all the
questions. Likert scales are usually used for such responses, for example: extremely satisfied,
satisfied, dissatisfied, extremely dissatisfied. An example of this is shown below:
How satisfied are you with your research methodology lecturer who is not into
management or business administration?
How satisfied are you with the research methods lectures so far, as far as your
research work is concerned?
How satisfied are you with the length of time allocated for each lecture? Etc.
They rarely put off respondents
Easy to compare responses given to different items.
It facilitates easy determination of a trend in the response.
It is often abused because of the way it is easily constructed and provides responses.
It can easily influence a pattern of responses from respondents when they make up their mind not
to provide right responses.
Self Assessment 3.3
The purpose of this exercise is to enable students to understand the rules for writing questions
in surveys. Offer reasons why the following rules must be observed in drafting survey questions:
Avoid leading questions: E.g. “Wouldn’t you say that…”, “Isn’t it fair to say…”
Be specific. Avoid words like “regularly”, “often”, or “locally”.
Avoid jargon and colloquialisms.
Avoid double-barreled questions. E.g. “Will you like to use charcoal and LPG?”
Avoid double negatives. E.g. “Smoking in public places should not be abolished”.
Explain the rationale for asking very personal and probing questions.
Ensure options are mutually exclusive – e.g. “How many years have you worked in
academia: 0-5, 6-10, 11-15, over 15.” Not, “0-5, 5-10, 10-15”.
Answer Tips
Consider the kind of responses you will elicit from your respondents if the above
errors are not avoided.
Evaluate to know if the instrument takes a long time to administer;
Ascertain if the questions are obtaining responses for all the different response categories
or if the responses are all the same.
Have a fair knowledge of the kind of reactions to expect from respondents in order to
prepare to meet them in the main survey.
Deciding on the sample size is primarily driven by the budget line and the size of the subgroups
the researcher wishes to analyze. The researcher has to ensure that he has sampled enough people
to obtain an adequate number of respondents in his subgroups so he can accurately draw
conclusions about each group. If the target population is very small (say less than 100), the
researcher should consider doing a census (i.e. complete enumeration). However, if the target
population is very large (e.g. in the millions), the researcher will not appreciably improve the
accuracy of his results by interviewing more and more people, and covering everyone would be
very expensive.
Sample sizes are often determined statistically at a chosen significance level. The Miller and
Brewer (2003) model can be a useful tool for determining the sample size.
Formula:
\[ n = \frac{N}{1 + N(\alpha)^2} \]
Where n is the sample size; N is the sample frame (total number of objects in the target
population) and α is the margin of error the researcher is ready to accept (e.g. 0.05 at a
confidence level of 95%).
Working Example 1:
It has been observed that the performance of students in examinations has been declining over
the years. In a school with a population of 10 000 students, a researcher wants to know the
causes of the poor performance. How many students will you sample if your budget and time
will not permit a census?
Solution:
The question the researcher needs to answer first is the error he is ready to accept. Once this is
settled, he can go ahead and determine the sample size. Assuming the researcher wants the error
margin to be 5%, the sample size can be determined as follows:
\[ n = \frac{10\,000}{1 + 10\,000\,(0.05)^2} = \frac{10\,000}{26} \approx 385 \]
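The sample size formula is easy to wrap in a function so that different population sizes and error margins can be compared. This is a minimal sketch; the function name is an assumption, and rounding up to the next whole respondent is a choice made here.

```python
import math

def sample_size(N, margin_of_error):
    """n = N / (1 + N * e^2), rounded up to a whole number of respondents."""
    return math.ceil(N / (1 + N * margin_of_error ** 2))

n = sample_size(10_000, 0.05)
print(n)  # 385
```

Note that for very large populations the result levels off near 1/e² (400 for a 5% error margin), which is why interviewing ever more people adds little accuracy.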
Probability sampling requires that each member of the defined target population has a known,
and non-zero, chance of being included in the sample. It is not possible to determine whether a
non-probability sample is likely to provide very accurate or very inaccurate estimates of
population parameters. Consequently, these types of samples are not appropriate for dealing
objectively with issues concerning either the estimation of population parameters or the testing
of hypotheses.
The use of non-probability samples is often justified on the grounds that estimates derived from
the sample may be linked to some hypothetical universe of elements rather than to a real
population. In some circumstances, a probability sample design can be turned accidentally into a
non-probability sample design if subjective judgement is exercised at any stage during the
execution of the sample design.
a. Random sampling
The first statistical sampling method is simple random sampling. In this method, each item in the
population has the same probability of being selected as part of the sample as any other item.
Random sampling can be done with or without replacement. If it is done without replacement, an
item is not returned to the population after it is selected and thus can only occur once in the
sample.
Having determined a sample size of 385 in example 1, the simple random sampling technique
will be operationalised by assigning numbers to all the 10 000 units and drawing them out from a
basket 385 times. This approach is simple random sampling without replacement.
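Drawing numbered units from a basket can be mimicked with a random number generator. The sketch below assumes the 10 000 students are numbered 1 to 10 000; the seed value is arbitrary and is fixed only so that the draw can be repeated.

```python
import random

rng = random.Random(2009)          # arbitrary seed, for a repeatable draw
student_ids = range(1, 10_001)     # units numbered 1 .. 10 000

# random.sample draws without replacement: no student can appear twice
sample = rng.sample(student_ids, 385)
print(len(sample), len(set(sample)))  # 385 385
```

Sampling with replacement would instead use `rng.choices(student_ids, k=385)`, in which case the same student may be drawn more than once.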
Advantages
The selection procedure ensures that every sampling unit of the population has an equal
and known (non-zero) probability of being included in the sample.
Highly representative if all subjects participate; it is the ideal.
Disadvantages
Not possible without complete list of population members; potentially uneconomical to
achieve; can be disruptive to isolate members from a group; time-scale may be too long,
data/sample could change
b. Systematic sampling
This consists of selecting every Kth sampling unit of the population (K is called the sampling
interval) after the first sampling unit is selected at random. That is, an element of randomness
is introduced into this kind of sampling by using a random number, no larger than K, to pick the
unit with which to start. The sampling interval (K) is determined by dividing the sampling frame
(N) by the sample size (n).
E.g. With a sample size of 385 students, the sampling interval will be 10 000 / 385 ≈ 26.
Assuming the random number selected is 7, the next number to be selected will be 33 (i.e. 7 +
26), and the next will be 59 (i.e. 33 + 26). This process is continued till the sample size of 385
is reached.
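The selection rule can be sketched in a few lines. The function name is an assumption, and the starting unit is fixed at 7 to match the example; in practice it would be chosen at random, for instance with `random.randint(1, k)`.

```python
def systematic_sample(N, n, start):
    """Select every k-th unit, beginning at `start`, where k = N / n (rounded)."""
    k = round(N / n)  # sampling interval
    return [start + i * k for i in range(n) if start + i * k <= N]

chosen = systematic_sample(N=10_000, n=385, start=7)
print(chosen[:3], len(chosen), chosen[-1])  # [7, 33, 59] 385 9991
```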
Advantages
Systematic sampling is more convenient than random sample especially when
interviewers are untrained in sampling techniques-they can be instructed to select every
Kth person.
It is more convenient for the use with very large population or when large samples are to
be selected.
It is an easier and less costly method of sampling.
Each sampling unit in the population has a 1/K probability of being included in the
sample.
Disadvantages
It can prove to be an inefficient method when the population has a recurring pattern (e.g. a
production process that is defective at regular intervals), as the sample depends solely upon
the random starting position. In practice, this method can be used when lists of the
population are available and are of a considerable length.
The system may interact with some hidden pattern in the population, e.g. every third
house along the street might always be the middle one of a terrace of three.
c. Stratified Sampling
The stratified sampling method is used when each subgroup within the population needs to be
represented in the sample. The first step in stratified sampling is to divide
the population into subgroups (strata) based on mutually exclusive criteria. Random or
systematic samples are then taken from each subgroup. The sampling fraction for each subgroup
may be taken in the same proportion as the subgroup has in the population. For example, a
person conducting a customer satisfaction survey might select random customers from each
customer type in proportion to the number of customers of that type in the population. Stratified
sampling can also sample an equal number of items from each subgroup.
Steps involved in stratified sampling:
Define the population;
Determine the desired sample size;
Identify the variable and subgroups (strata) for which you want to guarantee appropriate
representation (either proportion or equal); and
Classify all members of the population as members of one of the identified subgroups.
Randomly select (using table of random numbers) an appropriate number of individuals
from subgroups.
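The steps above can be sketched for proportional allocation as follows. Everything in the example is hypothetical: the three strata, their sizes and the function name are assumptions made for illustration, with the overall sample size of 385 carried over from the earlier worked example.

```python
import random

def stratified_sample(strata, n, rng):
    """Draw a simple random sample from each stratum, proportional to its size."""
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        quota = round(n * len(members) / total)  # proportional allocation
        sample[name] = rng.sample(members, quota)
    return sample

# Hypothetical strata: 10 000 students split into three year groups
strata = {
    "year_1": list(range(0, 4000)),
    "year_2": list(range(4000, 7500)),
    "year_3": list(range(7500, 10000)),
}
picked = stratified_sample(strata, 385, random.Random(0))
print({name: len(members) for name, members in picked.items()})
# {'year_1': 154, 'year_2': 135, 'year_3': 96}
```

Because each quota is rounded separately, the quotas may not always add up exactly to the desired sample size; a real design would adjust the largest stratum by the difference.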
Advantages
Can ensure that specific groups are represented, even proportionally, in the sample(s)
(e.g., by gender), by selecting individuals from strata list
Disadvantages
More complex, requires greater effort than simple random; strata must be carefully
defined
d. Cluster Sampling
In cluster sampling, the population that is being sampled is divided into groups called clusters.
Instead of these subgroups being homogeneous based on a selected criterion as in stratified
sampling, a cluster is as heterogeneous as possible so as to match the population. A random
sample is then taken from within one or more selected clusters. Cluster sampling can tell us a
lot about
that particular cluster, but unless the clusters are selected randomly and a lot of clusters are
sampled, generalizations cannot always be made about the entire population.
Steps:
Define the population
Determine the desired sample size
Identify and define a logical cluster
Obtain, or make a list of all clusters in the population
Estimate the average number of population members per cluster
Determine the number of clusters needed by dividing the sample size by the estimated
size of the cluster
Randomly select the needed number of clusters (using a table of random numbers)
Include in the sample all population members in selected cluster
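The steps listed above can be sketched as follows. The clusters here are hypothetical (250 classes of 40 students each), and the function name is an assumption; the desired sample size of 385 is carried over from the earlier worked example.

```python
import math
import random

def cluster_sample(clusters, n, rng):
    """Randomly select whole clusters until about n members are covered."""
    average_size = sum(len(c) for c in clusters.values()) / len(clusters)
    needed = math.ceil(n / average_size)         # number of clusters needed
    chosen = rng.sample(sorted(clusters), needed)
    # All members of each selected cluster enter the sample
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters: 250 classes of 40 students each
clusters = {f"class_{i}": list(range(i * 40, (i + 1) * 40)) for i in range(250)}
selected = cluster_sample(clusters, 385, random.Random(0))
print(len(selected), sum(len(members) for members in selected.values()))  # 10 400
```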
Advantages:
Generating sampling frame for clusters is economical, and sampling frame is often
readily available at cluster level
Most economical form of sampling
Larger sample for a similar fixed cost
Less time for listing and implementation
Also suitable for survey of institutions
Disadvantages:
May not reflect the diversity of the community.
Other elements in the same cluster may share similar characteristics.
Provides less information per observation than an SRS of the same size (redundant
information: similar information from the others in the cluster).
Standard errors of the estimates are high, compared to other sampling designs with same
sample size
e. Multi-stage sampling
In many situations, there are natural divisions of the population into several different sizes of
units. For example, a forest management unit consists of several stands, each stand has several
cut blocks, and each cut block can be divided into plots. These divisions can be easily
accommodated in a survey through the use of multi-stage methods. Selection of units is done in
stages. For example, several stands could be selected from a management area; then several cut
blocks are selected in each of the chosen stands; then several plots are selected in each of the
chosen cut blocks. Note that in a multi-stage design, units at any stage are selected at random
only from those larger units selected in previous stages.
Example:
You have been asked to undertake a survey in a farming district in your home country. How
will you select respondents for interview?
Steps:
Note that not all the communities may be farming in the district. You need to
identify the farming communities. First stage.
The units to be sampled are the particular farming activities e.g. Food or Cash
Crop Production – Second stage.
The units to be sampled from the farming activities (food or cash crop) are the
farming households who undertake the particular farming activity considered. –
Third Stage.
a. Convenience sampling
A sample of convenience is the terminology used to describe a sample in which elements have
been selected from the target population on the basis of their accessibility or convenience to the
researcher. Convenience samples are sometimes referred to as ‘accidental samples’ for the
reason that elements may be drawn into the sample simply because they just happen to be
situated, spatially or administratively, near to where the researcher is conducting the data
collection.
Advantages
Convenience sampling is very easy to carry out with few rules governing how the sample
should be collected.
The relative cost and time required to carry out a convenience sample are small in
comparison to probability sampling techniques. This enables the researcher to achieve the
desired sample size in a relatively fast and inexpensive way.
Disadvantages
Convenience sampling can lead to the under-representation or over-representation of
particular groups within the sample.
The ability to make generalizations is undermined if the group of interest is
under-represented in the sample.
b. Quota sampling
It is sometimes misleadingly referred to as ‘representative sampling’ because numbers of
elements are drawn from various target population strata in proportion to the size of these strata.
The population is stratified by important variables and the required quota is obtained from each
stratum.
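As a rough illustration of how quotas proportional to stratum size might be worked out, consider the sketch below. The strata, their sizes and the function name `proportional_quotas` are assumptions made for the example; fractional quotas are rounded using the largest-remainder rule, which is only one of several reasonable choices.

```python
def proportional_quotas(stratum_sizes, sample_size):
    """Allocate sample_size across strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    exact = {name: sample_size * size / total
             for name, size in stratum_sizes.items()}
    # Start from the whole-number part of each exact quota.
    quotas = {name: int(value) for name, value in exact.items()}
    # Hand out any remaining places to the largest fractional remainders.
    shortfall = sample_size - sum(quotas.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - quotas[name],
                          reverse=True)
    for name in by_remainder[:shortfall]:
        quotas[name] += 1
    return quotas

# Hypothetical population of 10 000 people stratified by age group.
strata = {"under 30": 4200, "30 to 59": 3900, "60 and over": 1900}
quotas = proportional_quotas(strata, sample_size=200)
print(quotas)
```

The interviewer then fills each quota with whichever eligible people are available, which is why the resulting sample remains non-random even though the stratum proportions match the population.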
Advantages
Quick and cheap to organize
Disadvantages
Not as representative of the population as a whole as other sampling methods.
Because the sample is non-random, it is impossible to assess the possible sampling error.
c. Purposive Sampling
This is often referred to as judgment sampling. With this technique the researcher selects sampling
units subjectively in an attempt to obtain a sample that appears to be representative of the
population. That is, the chance that a particular sampling unit will be selected depends on the
subjective judgment of the researcher. The researcher’s selection may yield results favorable
to his/her point of view, introducing an element of bias into the entire study. However,
in the hands of an experienced researcher, the technique can yield tolerably reliable results.
Advantages
Ensures balance of group sizes when multiple groups are to be selected
Disadvantages
Samples are not easily defensible as being representative of populations due to potential
subjectivity of researcher
d. Snowball sampling
Researchers use this sampling method if the population of interest is very rare or is limited to a
very small subgroup of the population. This type of sampling technique works like chain referral.
After observing the initial subject, the researcher asks the subject to help
identify people with a similar trait of interest. The process of snowball sampling is much like
asking your subjects to nominate another person with the same trait as your next subject. The
researcher then observes the nominated subjects and continues in the same way until a
sufficient number of subjects has been obtained.
For example, if obtaining subjects for a study that wants to observe a rare disease, the researcher
may opt to use snowball sampling since it will be difficult to obtain subjects. It is also possible
that the patients with the same disease have a support group; being able to observe one of the
members as your initial subject will then lead you to more subjects for the study.
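The chain-referral process can be pictured with a short sketch. The referral data, the subject labels and the function `snowball_sample` are hypothetical; the sketch assumes that each recruited subject names further people and that recruitment stops once a target number of subjects is reached.

```python
from collections import deque

def snowball_sample(referrals, seeds, target_size):
    """Follow referrals from the initial subjects until target_size is reached."""
    recruited = []
    seen = set(seeds)          # people already nominated or recruited
    queue = deque(seeds)       # people waiting to be observed
    while queue and len(recruited) < target_size:
        subject = queue.popleft()
        recruited.append(subject)
        # Ask the subject to nominate others with the same trait.
        for nominee in referrals.get(subject, []):
            if nominee not in seen:
                seen.add(nominee)
                queue.append(nominee)
    return recruited

# Hypothetical nominations: who names whom.
referrals = {
    "P1": ["P2", "P3"],
    "P2": ["P4"],
    "P3": ["P4", "P5"],
    "P5": ["P6"],
}
sample = snowball_sample(referrals, seeds=["P1"], target_size=5)
print(sample)
```

Every subject in the sample is reachable from the initial subject, which illustrates why the sample may cover only one connected subgroup of the population.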
Advantages
The chain referral process allows the researcher to reach populations that are difficult to
sample when using other sampling methods.
The process is simple and cost-efficient.
This sampling technique needs little planning and a smaller workforce compared to
other sampling techniques.
Disadvantages
The researcher has little control over the sampling method. The subjects that the
researcher can obtain rely mainly on the previous subjects that were observed.
Representativeness of the sample is not guaranteed. The researcher has no idea of the true
distribution of the population and of the sample.
Sampling bias is also a concern when using this sampling technique. Initial
subjects tend to nominate people that they know well. Because of this, it is highly
possible that the subjects share the same traits and characteristics, so the sample
that the researcher obtains may represent only a small subgroup of the entire
population.
The advantages and disadvantages of the various sampling techniques are summarised in Table
3.3.
Technique: Stratified random
Description: Random sample from identifiable groups (strata), subgroups, etc.
Advantages: Can ensure that specific groups are represented, even proportionally, in the sample(s) (e.g., by gender), by selecting individuals from strata list
Disadvantages: More complex, requires greater effort than simple random; strata must be carefully defined

Technique: Cluster
Description: Random samples of successive clusters of subjects (e.g., by institution) until small groups are chosen as units
Advantages: Possible to select randomly when no single list of population members exists, but local lists do; data collected on groups may avoid introduction of confounding by isolating members
Disadvantages: Clusters in a level must be equivalent and some natural ones are not for essential characteristics (e.g., geographic: numbers equal, but unemployment rates differ)

Technique: Purposive
Description: Hand-pick subjects on the basis of specific characteristics
Advantages: Ensures balance of group sizes when multiple groups are to be selected
Disadvantages: Samples are not easily defensible as being representative of populations due to potential subjectivity of researcher

Technique: Quota
Description: Select individuals as they come to fill a quota by characteristics proportional to populations
Advantages: Ensures selection of adequate numbers of subjects with appropriate characteristics
Disadvantages: Not possible to prove that the sample is representative of designated population

Technique: Snowball
Description: Subjects with desired traits or characteristics give names of further appropriate subjects
Advantages: Possible to include members of groups where no lists or identifiable clusters even exist (e.g., drug abusers, criminals)
Disadvantages: No way of knowing whether the sample is representative of the population
As the Senior Research Officer of your organisation, you have been tasked to conduct a
household survey in a large town whose households fall into three income categories,
namely high income, middle income and low income earners. Determined to ensure that
the sample size takes care of the diversity in the target population, what sampling
technique will you use to select units and why?
Face-to-face Interviews
Select location(s) to conduct interviews: The most appropriate location to conduct a
face-to-face interview is a place that members of your population frequent and where
they are comfortable participating.
If you are randomly selecting respondents for a face to face intercept interview it is
important to utilize more than one location in order to ensure a better representation of
the population.
Train interviewers in how to conduct a structured questionnaire face to face and how to
intercept respondents if they are doing intercept interviews. It is quite difficult to ensure
the interviewers randomly select people to participate in intercept interviews. Interviewer
and respondent biases may influence the people who are selected to participate and those
who agree to. Interviewers should follow a standardized and systematic approach to
selecting people who pass by to be interviewed.
If you require a particular group for your survey you may have to develop a questionnaire
screener to find eligible respondents. A questionnaire screener is a
series of one or two questions (usually demographics like age or family status) which
help you to identify people who are in your target population before doing a full interview.
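One way to make the selection of passers-by standardized and systematic, as recommended above, is to approach every k-th person after a random start. The sketch below assumes this rule; the interval, the list of passers-by and the function `systematic_intercepts` are illustrative and not a prescribed procedure.

```python
import random

def systematic_intercepts(passers_by, interval, seed=None):
    """Pick a random start within the first interval, then every interval-th person."""
    start = random.Random(seed).randrange(interval)
    return passers_by[start::interval]

# Hypothetical stream of 30 people passing the interview location.
passers_by = [f"person{i}" for i in range(1, 31)]
chosen = systematic_intercepts(passers_by, interval=5, seed=1)
print(chosen)
```

Because the interviewer counts people rather than choosing whom to approach, personal preferences have less opportunity to influence who is selected.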
Using Mail Surveys
Send out the first mailing (usually results in a 40% response)
Send a reminder card 10 days after the 1st mailing to thank those participants who have
already responded and to remind those who have not of the importance of the study. The
card should also indicate where people can obtain another copy of the questionnaire if
they have mislaid their original copy.
Three to four weeks later, send a second mailing emphasizing the importance of receiving
responses. Also include a new questionnaire and return envelope.
The covering letter is one of the most important aspects of a mailed questionnaire. It will
determine whether the recipient reads the survey and the attitude with which respondents
complete the questionnaire.
The letter should explain why the study is important and why their responses are needed.
Data Entry
There are two common approaches to data entry:
Direct data entry. Interviewers complete the questionnaires, which are then coded and
entered into a computer for analysis.
Computer assisted telephone interviewing (CATI). Interviewers enter responses directly
into a computer, and the questions requiring coding are entered at a different time.
Both quantitative and qualitative methods are employed for data analysis. The qualitative inquiries
capture areas where in-depth information is required for better understanding of issues. The qualitative
data will also serve as a means of triangulating data gathered through the quantitative approach and
providing in-depth explanation to some of the quantitative data. The quantitative analysis is good for
generalization and numbers. The analysis can be done using SPSS, STATA or any other statistical
software which will be discussed in Unit 4.
For most surveys simple descriptive statistics (frequencies, means, ranges, etc) may be all that is
needed to be able to interpret the results. This involves determining how many of the respondents
answered a particular way for each of the questions. More complex analyses may be required
when comparisons are needed between subgroups of the population or for measurements taken at
different times.
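To make the idea of simple descriptive statistics concrete, the sketch below computes frequencies, a mean and a range from a handful of made-up survey records. The variables `fuel` and `household_size` and all the values are hypothetical; in practice the same summaries would normally be produced with a statistical package.

```python
from collections import Counter
from statistics import mean

# Hypothetical responses from six households.
responses = [
    {"fuel": "charcoal", "household_size": 5},
    {"fuel": "firewood", "household_size": 7},
    {"fuel": "LPG", "household_size": 3},
    {"fuel": "charcoal", "household_size": 4},
    {"fuel": "firewood", "household_size": 6},
    {"fuel": "charcoal", "household_size": 5},
]

# Frequencies: how many respondents gave each answer.
frequencies = Counter(r["fuel"] for r in responses)

# Mean and range for a numeric question.
sizes = [r["household_size"] for r in responses]
summary = {
    "mean": mean(sizes),
    "min": min(sizes),
    "max": max(sizes),
    "range": max(sizes) - min(sizes),
}

for answer, count in frequencies.most_common():
    print(f"{answer}: {count} ({100 * count / len(responses):.0f}%)")
print(summary)
```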
Statistical analysis aims to show that your results are not just due to chance or the ‘luck of the
draw’. It provides a way to determine the repeatability of any differences observed. If the same
outcome is found when a study is repeated over and over again, we really don’t need a statistical
analysis. Similarly when we study a ‘sample’ of the population, statistical analysis is used to help
us decide whether it is likely that these same differences would be found if we repeated the
experiment in multiple samples or in the entire population. Hypotheses can be tested with
common statistical tools such as the t-test (to compare results for continuous data) and the
Z-test or chi-square test (to compare results for categorical data).
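As an illustration of a test for categorical data, the sketch below computes the Pearson chi-square statistic for a 2x2 table of counts. The counts are invented, no continuity correction is applied, and the p-value uses the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable. It is a hand calculation for teaching purposes, not a substitute for a statistical package.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom)
    for a 2x2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi-square > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: rows are two groups of respondents, columns are yes/no answers.
table = [[30, 20],
         [15, 35]]
statistic, p_value = chi_square_2x2(table)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

A small p-value suggests that a difference this large between the two groups would rarely arise by chance alone if the groups really answered in the same way.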
Presenting Results
It is easy to become overwhelmed with too much information so focus on the research
questions and only present the information which answers those questions.
Choose a format which will highlight the key result.
Keep it simple
Pictures are worth a thousand words
A case study is an intensive study of a single unit for the purpose of understanding a larger class of
(similar) units. A case study is an in-depth investigation of an individual, group, institution or
phenomenon. Case studies are often based on the premise that studying one case is enough to draw
conclusions about other cases, since one case can typify other similar cases. A case being studied is
taken as an example of other similar things/situations.
As a means of overcoming shortcomings of quantitative research studies, case study research is
often undertaken to provide a holistic and in-depth investigation of social and behavioral problems
such as unemployment, poverty, drug addiction, governance, management and illiteracy.
Through case study methods, a researcher is able to go beyond the quantitative statistical results
and understand the behavioral conditions through the actor’s perspective. Whilst in quantitative
research certain peripheral but relevant information might be omitted and obscured, case study
research explains both the process and outcome of a phenomenon through complete observation,
reconstruction and analysis of the issue under study and thereby covers all relevant information.
Despite these advantages, case studies have received criticisms. There are primarily three types
of arguments against case study research.
Case studies are often accused of lack of rigor. Too many times, the case study
investigator has been sloppy, and has allowed equivocal evidence or biased views to
influence the direction of the findings and conclusions.
Case studies provide very little basis for scientific generalization since they use a small
number of subjects, some conducted with only one subject. The question commonly
raised is “How can you generalize from a single case?”
Case studies are often labeled as being too long, difficult to conduct and producing a
massive amount of documentation. In particular, case studies of ethnographic or
longitudinal nature can elicit a great deal of data over a period of time.
A common criticism of case study method is its dependency on a single case exploration
making it difficult to reach a generalizing conclusion.
Although a single-case design is generally limited by its inability to provide generalizing
conclusions, this drawback can be overcome by triangulating the study with other methods to
confirm the validity of the process. Multiple-case design, on the other hand, can be adopted with real-life
events that show numerous sources of evidence through replication rather than sampling logic to
enhance and support previous results. This helps raise the level of confidence in the strength of
the method adopted. For instance, whilst a study on the psychological impacts of the 1983
drought on children may be difficult to replicate and hence appropriate for a single case study,
the assessment of the sensing ability of deaf children is replicable and hence more appropriate
for a multiple case study. The design of a case study is therefore very important. A case study
method must be able to prove, through interviews or journal entries, that:
It is the only viable method to elicit implicit and explicit data from the subjects
It is appropriate to the research question
It follows the set of procedures with proper application
The scientific conventions used in social sciences are strictly followed
A ‘chain of evidence’, either quantitative or qualitative, is systematically recorded
and archived, particularly when interviews and direct observation by the researcher are the
main sources of data
The case study is linked to a theoretical framework.
The complex and multivariate cases can be explained by three rival theories: a knowledge-driven
theory, a problem-solving theory, and a social-interaction theory.
The knowledge-driven theory stipulates that eventual commercial products are the results of
ideas and discoveries from basic research. A similar notion applies to the problem-solving
theory. However, in this theory, products are derived from external sources rather than from
research. The social-interaction theory, on the other hand, suggests that overlapping professional
networks cause researchers and users to communicate frequently with each other.
An observational study involves observing a phenomenon. For example, instead of asking how the
Black Stars are likely to perform in the World Cup in Germany, you may observe them playing
prior to the trip to Germany. Observational research is also guided by clearly defined hypotheses
or objectives to make the research objective. The observations should be systematic rather than
opportunistic and disorderly.
There is a problem of the impact of the observer’s participation on the situation and the
subjects.
It could be very biased.
3.3.2.2 Advantages
Hypotheses or theories developed are grounded firmly in observational data gathered in a
naturalistic setting.
It provides a very vivid (lifelike) picture of the environment being studied.
The long period of study required in ethnographic research gives the research a
longitudinal perspective that cannot be achieved in many other types of research.
3.3.2.3 Disadvantages
Ethnographic research requires the skills of someone trained in observational techniques
to make results valid.
The outcome of the field data can easily be influenced by the observer’s bias.
Since the field reports are usually long handwritten notes, such field records are usually
difficult to quantify and interpret.
Ethnographic research goes on for a long period of time, which makes it very expensive.
A lot of time is first devoted to understanding the environment where the
study will be carried out, long before the study takes place, which adds to the expense.
The observer is forced to become an active participant in the society/environment being
studied, which could lead to role conflicts (e.g. one can easily forget the role he/she is
expected to play and disclose his/her real self) and this could reduce the validity of data
being collected.
It requires an observer who is alert and a fast writer who can also write clearly.
Historical research is also defined as “the discovery and analysis of records of previous events,
interpretation of trends in the attitudes or events of the past and generalizations from these past
events to help guide present or future behaviour. Historical research consists of locating,
integrating and evaluating evidence from physical relics, written records or documents in order
to establish facts or generalizations regarding past or present events, human characteristics or
other problems in question” (Compton and Hall, 1972). The historical researcher is interested in
understanding and analyzing the past. The search for evidence or facts is always guided by a
broad theory or interpretation relevant to the researcher’s interest and therefore the facts do not
speak for themselves.
Examples of historical sources of data, which could be either primary or secondary sources, include:
Official records which may include legal records, legal instruments such as contracts and
wills, court decisions, etc.
Eye witness accounts of events, which could be given orally or in written form.
Creative productions such as works of art, photographs, literature, museum pieces and
costumes.
Expressive documents, such as personal letters, life histories (from diaries or
autobiographies, etc.).
Historical research also attempts to interpret ideas or events that had previously seemed
unrelated. It emphasizes old data or merges old data with new historical facts that others have
discovered. Historical research is also used to reinterpret past events that have been studied.
Descriptions can be concrete or abstract. A relatively concrete description might describe the
ethnic mix of a community, the changing age profile of a population or the gender mix of a
workplace. Alternatively the description might ask more abstract questions such as ‘Is the level
of social inequality increasing or declining?’, ‘How secular is society?’ or ‘How much poverty is
there in this community?’ Accurate descriptions of the level of unemployment or poverty have
historically played a key role in social policy reforms (Marsh, 1982). By demonstrating the
existence of social problems, competent description can challenge accepted assumptions about
the way things are and can provoke action.
Good description provokes the ‘why’ questions of explanatory research. If we detect greater
social polarization over the last 20 years (i.e. the rich are getting richer and the poor are getting
poorer) we are forced to ask ‘Why is this happening?’ But before asking ‘why?’ we must be sure
about the fact and dimensions of the phenomenon of increasing polarization. It is all very well to
develop elaborate theories as to why society might be more polarized now than in the recent past,
but if the basic premise is wrong (i.e. society is not becoming more polarized) then attempts to
explain a non-existent phenomenon are silly.
Of course description can degenerate to mindless fact gathering or what C.W. Mills (1959) called
‘abstracted empiricism’. There are plenty of examples of unfocused surveys and case studies that
report trivial information and fail to provoke any ‘why’ questions or provide any basis for
generalization. However, this is a function of inconsequential descriptions rather than an
indictment of descriptive research itself.
Causal explanations argue that phenomenon Y (e.g. income level) is affected by factor X (e.g.
gender). Some causal explanations will be simple while others will be more complex. For
example, we might argue that there is a direct effect of gender on income (i.e. simple gender
discrimination). People often confuse correlation with causation. Simply because one event
follows another, or two factors co-vary, does not mean that one causes the other. The link
between two events may be coincidental rather than causal.
There is a correlation between the number of fire engines at a fire and the amount of damage
caused by the fire (the more fire engines the more damage). Is it therefore reasonable to conclude
that the number of fire engines causes the amount of damage? Clearly the number of fire engines
and the amount of damage will both be due to some third factor - such as the seriousness of the
fire.
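The fire engine example can be reproduced with a small simulation. In the sketch below all numbers are invented: the seriousness of each fire is drawn at random, and both the number of engines and the damage are made to depend only on that seriousness, never on each other. The engine count is kept as a continuous quantity for simplicity.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(42)
fires = []
for _ in range(5000):
    severity = rng.randint(1, 5)               # the third factor
    engines = severity + rng.gauss(0, 0.5)     # depends only on severity
    damage = 10 * severity + rng.gauss(0, 5)   # depends only on severity
    fires.append((severity, engines, damage))

# Correlation across all fires: strong, although engines never affect damage.
overall = pearson([e for _, e, _ in fires], [d for _, _, d in fires])

# Correlation among fires of the same seriousness: close to zero.
same = [(e, d) for s, e, d in fires if s == 3]
within = pearson([e for e, _ in same], [d for _, d in same])

print(f"all fires: r = {overall:.2f}; fires of equal seriousness: r = {within:.2f}")
```

Once the seriousness of the fire is held constant, the apparent relationship between engines and damage largely disappears, which is what we would expect if the third factor accounts for the correlation.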
Confusing causation with correlation also confuses prediction with causation and prediction with
explanation. Where two events or characteristics are correlated we can predict one from the
other. Knowing the type of school attended improves our capacity to predict academic
achievement. But this does not mean that the school type affects academic achievement.
Predicting performance on the basis of school type does not tell us why private school students
do better. Good prediction does not depend on causal relationships. Nor does the ability to
predict accurately demonstrate anything about causality.
Two types of Longitudinal Design
They share a similar design structure i.e. the data are collected in at least two waves on
the same variable on the same people.
They are both concerned with illuminating social change and improving the
understanding of causal influence over time- the causal influence implies that the
Longitudinal designs are somewhat better able to deal with the problem of ambiguity
about the direction of influence.
3.3.8.1 Advantages
Experiments enable researchers to exert a great deal of control over extrinsic and intrinsic
variables, strengthening the validity of causal inferences (internal validity).
Experiments enable researchers to control the introduction of the Independent variable so
they may determine the direction of causation.
3.3.8.2 Disadvantages
External validity is weak because experimental design does not allow researchers to
replicate real-life social situations.
Researchers must often rely on volunteer or self-selected subjects for their samples.
Therefore the sample may not be representative of the population of interest, preventing
researchers from generalizing to the population and limiting the scope of their findings.
There are no absolute answers to the above conflict but it is important to be aware of it and to
guard against it as much as possible, or be able to manage it. The values people attach to the benefits
or costs of conducting research are based on many factors including background, culture,
experience, convictions, etc. Some of the costs that the researcher may impose on the researched
are affronts to the dignity of the individual, embarrassment, loss of trust in social relations, loss of
self-esteem or self-confidence, etc. For the researcher the gains could be developing more theory
about the hidden agenda of people, potential advances of applied knowledge, etc. For the
researched, the gains could be the monetary benefits, satisfaction in contributing to knowledge,
etc. All ethical decisions have to be made individually.
3.4.4 Competence
It is important to know that it is not everyone who is competent enough to provide informed
responses to questions posed by the researcher. It is often assumed that adults are capable of
providing response of any kind while children are not. This could be true or untrue depending on
the research topic. In some cases children may be more competent in providing responses than
adults and vice versa. Ethically, competence must be taken into account in deciding on the
respondents. The freedom to decide whether or not to participate in research is left to those to
be researched, and so on ethical grounds participation is considered voluntary.
3.4.5 Privacy
Privacy as an ethical issue in research needs safeguarding. It is viewed from three angles:
a) Sensitivity of information
Sensitivity of information refers to how personal or potentially threatening the information is that
the researcher is interested in. The greater the sensitivity of the information, the more the
researcher needs to provide privacy to the respondent. People are often sensitive about issues
related to religion, income, sexual practices, racism and personal attributes such as honesty,
intelligence, etc.
b) Setting being observed
The setting could vary from the private (e.g. home) to the public place. The extent to which research in
either of these settings intrudes on people’s privacy is not certain, which could lead to
an ethical issue. An example is trying to interview a homosexual person in a public drinking place.
c) Dissemination of information
It should not be easy to match information with the people who provided it. Being able to do so
would mean not protecting the privacy of those who provided the information. This is easily
achieved by not putting names on the questionnaires or research instruments used.
Unit Summary
Social science research is always limited by the unpredictability of the human behaviour.
Premised on this, it has to be approached in a systematic manner devoid of biases. Depending
on the nature of the problem, several approaches can be used to address the research
problem. What is imperative is for the researcher to clarify the purpose of the research and
examine the suitability of a chosen research approach in addressing the research questions.
Noting that researchers use samples after which findings are generalised to represent the
population, it is imperative that units in the sample reflect nuances in the target population.
Another significant factor worth considering is the need to observe research ethics,
which include, but are not limited to, informed consent, competence and privacy. The
quality of data to be gathered depends primarily on the nature of instruments used. Hence,
the instruments are to be designed to gather the required data from respondents. The
questions should be unambiguous and free of technicalities and jargon so that the
enumerator and respondent understand them and the required data can be gathered.
Unit Assignment
The Government of your country wants to curtail the impact of a hydroelectric
power dam it is about to construct on the livelihood sources of people within the
proposed dam’s catchment area. The official record of the National Statistical
Service indicates that about 15 000 people are to be affected through inundation.
As a research fellow, you have been asked to carry out a preliminary assessment for
the implementers to evaluate the effects of their interventions and plan appropriately
to curtail the effects.
Unit 4
Introduction
Statistical analysis software packages such as SPSS and STATA provide a
comprehensive set of tools that can be used to produce various outputs and perform statistical
procedures, such as line plots, scatter plots, tables, regression analysis, bar charts, pie charts,
dot charts, multivariate analysis, time series analysis, survival analysis etc.
Learning Objectives
Unit content
SESSION 4.1: INTRODUCTION TO SPSS
There are two important limitations of SPSS that deserve mention at the outset:
o SPSS users have less control over statistical output than, for example, Stata or Gauss
users. For novice users, this hardly causes a problem. But, once a researcher wants
greater control over the equations or the output, she or he will need to either choose
o SPSS has problems with certain types of data manipulations, and it has some built in
quirks that seem to reflect its early creation. The best known limitation is its weak lag
functions, that is, how it transforms data across cases. For new users working off of
standard data sets, this is rarely a problem. But, once a researcher begins wanting to
significantly alter data sets, he or she will have to either learn a new package or develop
greater skills at manipulating SPSS.
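For readers unfamiliar with the term, a lag function gives each case the value that the previous case had on some variable. The sketch below shows the idea in Python; the `lag` helper and the income figures are hypothetical and say nothing about how SPSS itself implements lags.

```python
def lag(values, periods=1):
    """Shift values down by `periods` cases (periods must not exceed len(values)).
    The first cases have no predecessor and receive None."""
    return [None] * periods + values[:len(values) - periods]

# Hypothetical income recorded for four consecutive cases.
income = [100, 120, 90, 150]
previous = lag(income)

# Change from the previous case, undefined for the first case.
change = [None if prev is None else cur - prev
          for cur, prev in zip(income, previous)]
print(previous)
print(change)
```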
Data Editor Window. This window shows the contents of the current data file. A blank data
editor window, as shown in figure 4.1, automatically opens when you start SPSS for Windows;
only one data window can be opened at a time. From this window, you may create new data files
or modify existing ones.
Output Viewer Window. This window displays the results of any statistical procedures you run,
such as descriptive statistics or frequency distributions. All tables and charts are also displayed in
this window. The viewer window automatically opens when you create output. Figure 4.2 shows
an output viewer window.
Chart Editor Window. In this window, you can modify charts and plots. For instance, you can
rotate axes, change the colors of charts, select different fonts, and rotate three-dimensional
scatter plots.
Syntax Editor Window. You will use this window if you wish to use SPSS syntax to run
commands instead of clicking on the pull-down menus. An advantage to this method is that it
allows you to perform special features of SPSS that are not available through dialog boxes.
Syntax is also an excellent way to keep a record of your analyses.
Figure 4.2: SPSS Output Viewer Window
Pivot Table Editor. Output displayed in pivot tables can be modified in many ways with the
Pivot Table Editor. You can edit text, swap data in rows and columns, add colour, create
multidimensional tables, and selectively hide and show results.
Figure 4.3: SPSS Syntax Editor
Text Output Editor. Text output not displayed in pivot tables can be modified with the Text
Output Editor. You can edit the output and change font characteristics (type, style, color, size).
File. This menu is used to create new files, open existing files, read files that have been created
by other software (e.g., spreadsheets or databases), and print files.
Edit. This menu is used to modify or copy text from output or syntax windows.
View. This menu allows you to change the appearance of your screen. You can, for instance,
change fonts, customize toolbars, and display data using their value labels.
Data. Use this menu to make temporary changes in SPSS data files, such as merging files,
transposing variables and cases, and selecting subsets of cases for analyses. Changes are not
permanent unless you explicitly save the changes.
Transform. The transform menu makes changes to selected variables in the data file and
computes new variables based on values of existing variables. Transformations are not
permanent unless you explicitly save the changes.
Analyze. Use this menu to select a statistical procedure to be performed such as descriptive
statistics, correlations, analysis of variance, and cross-tabulations.
Graphs. This menu is used to create bar charts, pie charts, histograms, and scatter plots. Some
procedures under the Analyze menu also generate graphs.
Utilities. This menu is used to change fonts, display information on the contents of SPSS data
files, or open an index of SPSS commands.
Window. Use the window menu to arrange, select, and control the attributes of the SPSS
windows.
Help and add-on. These menus open a Microsoft Help window containing information on how to
use many SPSS features.
• *, multiplication
• **, exponentiation
• abs(x) returns the absolute value of x.
• exp(x) returns the exponential function of x.
• int(x) returns the integer by truncating x towards zero.
• ln(x), log(x) returns the natural logarithm of x if x>0.
• log10(x) returns the log base 10 of x if x>0.
• max(x1,...,xn) returns the maximum of x1, ..., xn.
• min(x1,...,xn) returns the minimum of x1, ..., xn.
• round(x) returns x rounded to the nearest whole number.
• round(x,y) returns x rounded to units of y.
• sign(x) returns -1 if x<0, 0 if x==0, 1 if x>0.
• sqrt(x) returns the square root of x if x>=0.
Logical Operators
& and
| or
! not
~ not
Relational Operators
> greater than
< less than
>= greater than or equal to
<= less than or equal to
= equal (for conditional statements)
!= not equal
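The behaviour of several of the functions listed above can be checked with a short Python sketch. The helper names `trunc_int`, `round_to_units` and `sign` are hypothetical stand-ins written for this illustration, and the handling of exact ties in rounding may differ from that of a statistical package.

```python
import math

def trunc_int(x):
    """int(x): drop the fractional part, truncating towards zero."""
    return math.trunc(x)

def round_to_units(x, y):
    """round(x, y): round x to the nearest multiple of y."""
    return round(x / y) * y

def sign(x):
    """sign(x): -1 if x < 0, 0 if x == 0, 1 if x > 0."""
    return (x > 0) - (x < 0)

print(2 ** 10)                      # exponentiation
print(abs(-3.5))                    # absolute value
print(trunc_int(-2.7))              # truncation towards zero
print(math.log(math.e))             # natural logarithm
print(math.log10(1000))             # log base 10
print(max(4, 9, 2), min(4, 9, 2))   # maximum and minimum
print(round_to_units(1234, 100))    # rounding to units of 100
print(sign(-4), sign(0), sign(7))   # sign
print(math.sqrt(16))                # square root
```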
4.1.2 Data Management
Data can be entered directly into SPSS, or it can be imported from a number of different sources.
The processes for reading data stored in SPSS data files, spreadsheet applications, such as
Microsoft Excel, database applications, such as Microsoft Access, and text files are all discussed
in this chapter.
Once you have entered data in the data editor, you may change or delete values. To change or
delete a value in a cell, simply click on the cell you wish to alter. You will notice that a dark
border appears around the selected cell, and the value in the cell appears at the top of the data
editor. If you are changing the value, simply type the new value and press enter.
To delete a variable:
1. Click on the variable name that you wish to delete.
2. Click on Edit from the menu bar.
3. Click on Clear. The selected variable will be deleted and all variables to the right of the
deleted variable will shift to the left. Variables can also be deleted using SPSS syntax
with the DROP and KEEP subcommands.
Defining Variables
By default, SPSS assigns names and formats to all variables in the SPSS data file: variables
are named VAR##### (prefix VAR followed by five digits) and all values are valid (blanks are
assigned system missing values). Most of the time, however, you will want to customize your
data file. For example, you may want to give your variables more meaningful names, provide
labels for specific values, change the variable formats, and assign specific values to be
regarded as “missing.”
To do any or all of these:
1. First, make sure that your data file window is the active window and click on the variable
name that you wish to change.
2. Click on the Variable View tab or else double-click on the variable name in the data
editor.
3. Type the name of the variable in the Name column. Variable names have to be unique,
begin with a letter, and cannot contain blank spaces.
4. If you wish to change the type or format of a variable, click the button in the Type cell to
open the Variable Type dialog box. By default, all variables are numeric, but you may
work with other types such as names, dates, and other non-numeric data.
5. Suppose you have a variable representing average cost of groceries per person that was
entered to the nearest cent (e.g., 32.24) and you want the average cost displayed as a
whole number (rounded to the nearest dollar, e.g., 32). Click in the Decimal places box
and set the number of decimal places to 0. To change the display width of the numeric
variable, click in the Width box.
6. If one of your variables is categorical, you can assign numbers to represent the categories
of the variable. For example, the variable sex will have two categories: male and female.
Males may be assigned the value “1” and females the value “2”. It is useful to have
descriptive labels assigned to the values of 1 and 2 so that it is easy to see which number
represents which category in your output files.
7. If there are specific values that you would like to be treated as missing values, click on
Missing to open the Missing Values dialog box. Click on Discrete Missing Values to tell
SPSS that you have specific values that are considered to be missing. Type the value(s) in
the boxes (you may have up to three values). If you have more than three missing values,
click on Range plus one optional discrete missing value and enter the lower and upper
bounds of the range of values to be treated as missing. Click OK when you have entered
all of your missing values.
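The effect of declaring missing values can be illustrated outside SPSS. The short Python sketch below is only an illustration and is not SPSS syntax; the codes 98 and 99 and the list of ages are hypothetical. It shows that values declared as missing are left out before a statistic such as the mean is computed.

```python
from statistics import mean

# Hypothetical codes declared as "missing" (e.g. 98 = refused, 99 = don't know).
DISCRETE_MISSING = {98, 99}

def valid_values(values, missing_codes=DISCRETE_MISSING):
    """Return only the values that are not declared missing (None = blank cell)."""
    return [v for v in values if v is not None and v not in missing_codes]

ages = [25, 31, 99, 40, 98, 24]          # hypothetical responses
valid = valid_values(ages)

print(len(valid), "valid cases,", len(ages) - len(valid), "missing cases")
print("Mean of valid cases:", mean(valid))
```

Without the declaration, the codes 98 and 99 would be averaged as if they were real ages, which is why the missing values should be defined before any analysis is run.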
Reading SPSS Data Files
We will illustrate how to read an existing SPSS data file. The reader may follow along using the
data accompanying this guide.
To open a data file:
1. Click on File from the menu bar.
2. Click on Open on the file pull-down menu.
3. Click on Data on the open pull-down menu. This opens the Open File dialog box as
shown in Figure 4.5.
4. Choose the correct directory from the Look in: box at the top of the screen.
5. Point the arrow to the data file you wish to open and click on it.
6. Click on Open.
Note: Most of the examples in the following chapters use the SPSS data files that are provided
with this manual. Unless you are required to enter data on your own into a new file, all
procedures assume that you have opened the SPSS data file before beginning any computations
or analyses.
To open data from a file such as an Excel spreadsheet, begin at the Data Editor window:
1. Click on File.
2. Click on Open and then click on Data.
3. Select the file format from the drop-down list of file types in the Files of type: box.
4. Choose the appropriate directory and file.
5. Click on Open.
3. Enter the name of the new variable (in the above illustration, total) in the Target Variable
box. (You also have the option to describe the nature and format of the new variable by
clicking on the Type & Label box.)
4. You will then need to perform a series of steps to construct the expression used to
compute your new variable. In this illustration, you would first select the daytime sleep
variable (“daysleep”) from the variable list box on the left-hand side of the dialog box and
move it to the Numeric Expression box using the right directional arrow.
5. Then click on the “+” on the calculator pad. You will notice that a plus sign is placed
in the Numeric Expression box after the variable name daysleep.
6. Complete the expression by selecting the nighttime variable (“nightsleep”) and moving it
to the Numeric Expression box, following the instructions in step (4) above.
7. When you have completed the expression, click on OK to close the Compute Variable
dialog box. Your new variable will be added to the end of your data file.
In addition to simple algebraic functions on the calculator pad (+, -, x, ÷), there are many other
arithmetic functions such as absolute value, truncate, round, square root, and statistical functions
including sum, mean, minimum, and maximum. These are displayed in the Function group box
to the right of the calculator pad. First, select a procedure in the Function group window, and
then select the specific function in the Functions and Specific Variables window.
SPSS syntax commands are typed into a command file using the SPSS syntax editor. Syntax files
have the extension “.sps”. There are several reasons why command syntax is useful, such as
when the user wants to: (1) have a record of the analyses conducted during a session; (2) repeat
long and complex analyses; (3) review how variables were created or transformed; and (4)
modify commands to run slightly different or customized statistics.
When working with syntax, the user must enter commands instructing the program what
procedures to conduct. You can enter syntax by either typing or pasting syntax into the syntax
editor. Because most users do not know the commands from memory, it is useful to refer to the
SPSS Syntax Reference Guide for a complete reference to the command syntax. Help is also
available by using the Help button on the toolbar in the syntax editor window. Pasting syntax
commands from dialog boxes is perhaps the easiest way to construct syntax commands. Rather
than typing the commands, you initiate a procedure using pull-down menus and then instruct
SPSS to provide the commands and paste them into the syntax editor.
Figure 4.6: Compute Variable Dialog Box
For example, suppose you want to open the sleep.sav data file, but you only want to read a subset
of variables — body weight, total sleep, and danger index.
The syntax command would be:
GET FILE = 'sleep.sav'
 /KEEP = BODYWT TOTSLEEP DANGER.
You can also run a procedure by pasting syntax from a dialog box. When you use the paste
button, SPSS creates the syntax commands to execute procedures requested from pull-down
menus. For example, to compute a new variable (total sleep hours) as shown in session 4.1.2.2,
follow steps 1–6. Instead of clicking on OK, click on the Paste button. The compute commands
will automatically be displayed in a syntax window. To run the syntax commands, click the
Right arrow button on the toolbar.
Once you have created a syntax file, you can save it using the same procedures described in
Session 4.1.2.1 of this chapter. The file can then be opened and edited for future modifications.
Make sure when you open, edit, and save a syntax file that you correctly identify it with the
“.sps” file type.
The basic features of any data can be presented in the form of:
Graphical displays
Tabular descriptions
Summary statistics
Linear regressions
Categorical variables are those that have qualitatively distinct categories as values. For example,
gender is a categorical variable with categories “male” and “female”.
Frequencies
One way to display data is in a frequency distribution, which lists the values of a variable (e.g.,
for the variable region: Accra, Kumasi, Volta, etc.) and the corresponding numbers and
percentages of people for each value. Let us begin by creating a simple frequency distribution of
Regions using the “sec7.sav” SPSS data file from the GLSS5 accompanying this manual. Follow
along by using SPSS to open the data file on your computer (using the procedure given in
Chapter 2). This data set was used in a study of the Housing Characteristics in Ghana.
Notice that the data view lists numbers as the values for all of the variables, even though the
variable is a categorical variable. To see the categories each of the values represents, you can
examine the contents of the data file (variable labels, variable type, and value labels) by clicking
on Utilities on the menu bar and clicking on Variables from the pull-down menu.
The frequency distribution produced by SPSS is shown in Figure 4.8. This figure shows the
content of the output — that which is in the right-hand frame of your Output Viewer. The
“Statistics” table in the output indicates the number of valid and missing values for this variable.
There are 8687 valid cases and no missing values. The “Region” table displays the frequency
distribution.
For example, there are 834 people in the Western region and 1257 people in the Greater Accra
region. The numbers in the “Percent” column represent the percentage of the total number of
cases that are in each region. These are obtained by dividing each frequency by the total number
of cases and multiplying by 100. For example, 18.1% of the people are in the Ashanti region.
Statistics
Region
N    Valid      8687
     Missing       0

Region
                        Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Western               834       9.6             9.6                  9.6
        central               689       7.9             7.9                 17.5
        greater accra        1257      14.5            14.5                 32.0
        volta                 720       8.3             8.3                 40.3
        eastern               914      10.5            10.5                 50.8
        ashanti              1574      18.1            18.1                 68.9
        brong ahafo           795       9.2             9.2                 78.1
        northern              795       9.2             9.2                 87.2
        upper east            600       6.9             6.9                 94.1
        upper west            509       5.9             5.9                100.0
        Total                8687     100.0           100.0
Figure 4.8: Frequency Distribution of number of people in the various regions of Ghana
The “Valid Percent” column is calculated using only the valid (non-missing) cases as the base.
In this case, there are no missing values, so the “Percent” and “Valid Percent” columns are the
same. The “Cumulative Percent” is the percentage of cases in the category together with all
categories listed before it in the table.
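The arithmetic behind the “Percent” and “Cumulative Percent” columns can be checked by hand. The Python sketch below is an illustration only (it is not part of SPSS); it takes the frequencies reported in Figure 4.8 and recomputes the two columns. The variable names are chosen here for the example.

```python
# Frequencies copied from Figure 4.8. There are no missing cases,
# so "Percent" and "Valid Percent" coincide.
frequencies = {
    "Western": 834, "central": 689, "greater accra": 1257, "volta": 720,
    "eastern": 914, "ashanti": 1574, "brong ahafo": 795, "northern": 795,
    "upper east": 600, "upper west": 509,
}

total = sum(frequencies.values())

rows = []
running = 0
for region, count in frequencies.items():
    running += count                      # cases in this and all earlier categories
    percent = 100 * count / total
    cumulative = 100 * running / total
    rows.append((region, count, round(percent, 1), round(cumulative, 1)))

for region, count, percent, cumulative in rows:
    print(f"{region:<14}{count:>6}{percent:>8.1f}{cumulative:>8.1f}")
print(f"{'Total':<14}{total:>6}")
```

Note that the cumulative percentage is computed from the running count rather than by adding the rounded percentages, which avoids accumulating rounding error.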
Worked example 2
Draw a table showing the variation in cooking fuel in the urban areas of the Greater Accra
Region. (Use the data in the file “sec7.sav”, from the GLSS5 accompanying this manual).
Solution:
1. Click on Data to open the Data pull-down menu.
2. Click on “Select Cases” to open the Select Cases dialog box.
3. Click on “If condition is satisfied”, then click on the “If…” button.
4. Type “region=3 & loc=1” and click OK.
5. Click on Analyze from the menu bar.
6. Click on Descriptive Statistics from the pull-down menu.
7. Click on Frequencies from the second pull-down menu to open the Frequencies dialog box.
8. Click on the label/name of the variable you wish to examine (“Main fuel used for
cooking”) in the left-hand box.
9. Click on the right arrow button to move the variable name into the Variable(s) box.
10. Click on OK.
                           Frequency   Percent   Valid Percent   Cumulative Percent
Valid   None, No Cooking           4       1.4             1.4                  1.4
        Wood                      61      20.7            20.7                 22.1
        Charcoal                 185      62.9            62.9                 85.0
        Gas                       39      13.3            13.3                 98.3
        Electricity                1       0.3             0.3                 98.6
        Kerosene                   4       1.4             1.4                100.0
        Total                    294     100.0           100.0
Figure 4.9: Frequency Distribution of Main fuel used for cooking in urban areas of the Greater Accra Region
scale (e.g., number of errors). This chapter demonstrates how to examine different types of data
through graphical representations.
A bar chart like that in Figure 4.11 should appear in your SPSS Viewer. The information
displayed in this chart is a graphical version of that shown in the frequency distribution in Figure
4.8. The region with the greatest number of people is the Ashanti region.
Worked example 1
Draw bar graphs to show the Rural – Urban correlation for the various Regions. (Use the data in
the file “sec7.sav”, from the GLSS5 accompanying this manual).
Solution:
1. Click on Graphs to open the Graphs pull-down menu.
2. Click on Bar charts to open the bar chart pop-up menu.
3. Click on “Clustered”, select “Summaries for groups of cases” and click on “Define”.
4. Select “% of cases”, then move the region variable to the category axis and the rural/urban
variable to the “Define clusters by:” box.
5. Click on OK to run the chart procedure
A bar chart like that in Figure 4.12 should appear in your SPSS Viewer.
[Bar chart: x-axis Region (Western, central, greater accra, volta, eastern, ashanti, brong ahafo, northern, upper east, upper west); y-axis Frequency, from 0 to 1,500]
Figure 4.11: Bar chart of number of people in the various regions of Ghana
Summarizing Numerical Data
There are two types of numerical variables — discrete and continuous. The values for discrete
variables are counting numbers. For example, an American football game is won by one, two, or
three points, not a quantity in between. Continuous variables, on the other hand, do not have such
indivisible units. Body temperature, for instance, can be measured to the nearest degree, half
degree, quarter-degree, and so on. For practical purposes in SPSS, there is no difference in
summarizing these two types of numerical data.
[Clustered bar chart: x-axis Region (Western, central, greater accra, volta, eastern, ashanti, brong ahafo, northern, upper east, upper west); y-axis Percent, from 0.0% to 30.0%; clusters urban/rural-corr: urban, rural]
Figure 4.12: Bar graphs showing the Rural – Urban correlation of the various regions in Ghana
4.1.3.5 Mean, Sum, Standard Deviation, Variance, Minimum Value, Maximum Value, and
Range
When generating these statistics, the Data Editor must be open with the appropriate data set
before continuing.
Worked Problem
Using the data in the file “sec7.sav”, determine the mean, sum, standard deviation, variance,
minimum value, maximum value, and range for s7fq6 only.
Solution
1. Repeat steps 1–2 of the Frequencies section, then select Descriptives. This will open the
Descriptives dialog box as shown in Figure 4.13.
2. In the variable list, select the variable Area in square meters. Click on the right arrow
button between the boxes to move this variable over to the Variable(s) box. To calculate
statistics for several variables at once, add all of them to the Variable(s) box.
3. Click on the Options button. This will open the Descriptives: Options dialog box.
4. Click on mean, sum, standard deviation, variance, minimum value, maximum value, and range.
5. Click on the Continue button when done.
6. Click OK. The Descriptives dialog box closes and SPSS activates the Output Navigator
to display the statistics.
Figure 4.14: Descriptives of areas in square meters for households in Ghana.
7. Click on Continue to close this dialog box.
8. Click on OK to close the Frequencies dialog box and execute the procedure.
Notice that the same method could be used to obtain the median, mean, sum, percentiles,
etc.
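To make the meaning of each statistic concrete, the following Python sketch computes the same quantities for a small list of values. It is an illustration only: the five areas are hypothetical and are not taken from “sec7.sav”. The standard deviation and variance are the sample versions (divisor n − 1).

```python
from statistics import mean, median, stdev, variance

# Hypothetical dwelling areas in square meters (not taken from sec7.sav).
areas = [20, 35, 50, 15, 30]

summary = {
    "N": len(areas),
    "Sum": sum(areas),
    "Mean": mean(areas),
    "Median": median(areas),
    "Variance": variance(areas),          # sample variance, divisor n - 1
    "Std. Deviation": stdev(areas),       # square root of the sample variance
    "Minimum": min(areas),
    "Maximum": max(areas),
    "Range": max(areas) - min(areas),
}

for name, value in summary.items():
    print(f"{name:<15}{value:>10.2f}")
```

The range is simply the maximum minus the minimum, and the standard deviation is the square root of the variance, so two of the seven statistics can always be checked from the others.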
Self Assessment 4.1
(Use the data in the file “sec7.sav”, from the GLSS5 accompanying this manual). Using SPSS;
4.1.1 With the aid of tables and bar charts, show how access to the different cooking fuels
varies between rural and urban areas for the Greater Accra Metropolitan Area
(GAMA).
4.1.2 Still using tables and bar charts, show how access to the different cooking fuels in
rural and urban areas for Accra compares with one other region of your choice.
4.1.3 With the aid of tables and pie charts, show the distribution of different cooking fuel
usage for all the regions of Ghana
4.1.4 Draw bar graphs to show the Rural – Urban correlation for the entire sample in
percentages and actual number of cases
4.2.1 The Stata Environment
When you start Stata for Windows you will see the following windows: the Command window,
where you type in your Stata commands; the Results window, where Stata results are displayed;
the Review window, where past Stata commands are displayed; and the Variables window, which
lists all the variables in the active data file, as shown in Figure 4.17. The data in the active
data file can be browsed (read-only) in the Browser window, which is activated from the menu
Data/Data browser or by
browse varlist
where varlist (e.g. income age) is a list of variables to be displayed.
The Editor window, as shown in Figure 4.18, allows you to edit data either by typing directly
into the editor window or by copying and pasting from spreadsheet software:
edit varlist
Stata has implemented every Stata command (except the programming commands) as a dialog
that can be accessed from the menus. This makes commands you are using for the first time
easier to learn as the proper syntax for the operation is displayed in the Review window.
Figure 4.17: Stata Environment
Figure 4.18: Stata Toolbar
4.2.1.3 Memory
To change the memory assigned to STATA:
set mem #k
where # is a number greater than the size of the dataset, and less than the total amount of
memory available on your system.
To check the size of the dataset, look in My Computer or your Explorer package. To check the
amount of memory (RAM) your system has available, go to the Start menu and click on
\Settings\Control Panel\System. The bottom line, under General tells you how many KB of RAM
you have available.
STATA 10 opens with a default memory of 10.00 MB. To increase the default memory: Right
click on the STATA icon and choose Properties\Shortcut
Edit the Target field to say: \\St-server5\stata8$\wsestata.exe /k#
Where k# is the number of kb you wish to assign to STATA.
Note: If you do not have enough memory available on your machine to read a whole dataset,
open a subset of the variables you need.
If you do not know the exact name of a command, you can search the Stata documentation
with:
search word
In both cases the result is written into the result window. Alternatively, you can display the result
in the Viewer window by issuing the command
view help command
or by calling the Stata online help in the menu bar: Help/Search...
• abs(x) returns the absolute value of x.
• exp(x) returns the exponential function of x.
• int(x) returns the integer by truncating x towards zero.
• ln(x), log(x) returns the natural logarithm of x if x>0.
• log10(x) returns the log base 10 of x if x>0.
• max(x1,...,xn) returns the maximum of x1, ..., xn.
• min(x1,...,xn) returns the minimum of x1, ..., xn.
• round(x) returns x rounded to the nearest whole number.
• round(x,y) returns x rounded to units of y.
• sign(x) returns -1 if x<0, 0 if x==0, 1 if x>0.
• sqrt(x) returns the square root of x if x>=0.
Logical Operators
& and
| or
! not
~ not
Relational Operators
> greater than
< less than
>= greater or equal
<= smaller or equal
== equal (for conditional statements)
!= not equal
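The behaviour of a few of these functions is easiest to see on concrete numbers. The Python sketch below mimics int(x), sign(x) and round(x,y) as they are described in the list above; it is an illustration rather than Stata code, the function names are chosen here, and the treatment of exact ties in rounding is an assumption of the sketch.

```python
import math

def trunc_int(x):
    """int(x): drop the fractional part, moving towards zero."""
    return math.trunc(x)

def sign(x):
    """sign(x): -1 if x < 0, 0 if x == 0, 1 if x > 0."""
    return (x > 0) - (x < 0)

def round_units(x, y=1):
    """round(x, y): x rounded to the nearest multiple of y.
    Assumption: exact ties are rounded upwards in this sketch."""
    return math.floor(x / y + 0.5) * y

print(trunc_int(2.7), trunc_int(-2.7))     # truncation, not rounding down
print(sign(-5), sign(0), sign(12))
print(round_units(32.24), round_units(1234, 100))
```

The point to notice is that truncation towards zero differs from rounding down for negative numbers: -2.7 becomes -2, not -3.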
4.2.2 Data Management
4.2.2.1 Data Entry and Importing Data in Stata
There are two ways of getting data into Stata. One is manual data entry, that is, inputting
data interactively from the keyboard. This method is useful for small datasets. For example, to
enter data on accident rates (ar) and speed limits (sl) directly into Stata, the syntax is:
input ar sl
1. 4 55
2. 1.5 60
3. 1 .
4. end
This data could also be entered manually by clicking on the data editor on the toolbar menu; note
that you can copy-and-paste into the data editor. The output is as shown in figure 4.19.
The second way, reading data from files produced by spreadsheets or data entry software, is the
most common way data are brought into Stata. (Note: Excel is not data entry software.)
cd drive:directory
See help memory if you encounter memory problems when loading a file.
4.2.2.3 Creating new variables
New variables are created with the following syntax:
generate newvar = expression [if expression]
where newvar is the name of the new variable and expression is a mathematical function of
existing variables. The if option applies the command only to the data specified by a logical
expression. The (system) missing value code ‘.’ is assigned to observations that take no value.
Some examples:
generate age2 = age^2
generate agewomen = age if women == 1
generate rich = 0 if wealth != .
replace rich = 1 if wealth >= 1000000
generate rich = wealth >= 1000000
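The last example is an alternative to the two lines before it, not a continuation of them, and the two approaches differ in how they treat missing wealth. Stata treats a missing value as larger than any number, so the one-line form sets rich to 1 when wealth is missing, whereas the two-step form leaves rich missing. The Python sketch below imitates this logic for illustration only; None stands in for Stata's missing value, and the function names are chosen here.

```python
MISSING = None  # stands in for Stata's system missing value "."

def rich_two_step(wealth):
    """generate rich = 0 if wealth != .  /  replace rich = 1 if wealth >= 1000000"""
    if wealth is MISSING:
        return MISSING                    # never assigned, so it stays missing
    return 1 if wealth >= 1_000_000 else 0

def rich_one_line(wealth):
    """generate rich = wealth >= 1000000 (missing compares as larger than any number)"""
    value = float("inf") if wealth is MISSING else wealth
    return int(value >= 1_000_000)

for w in [500_000, 1_000_000, 2_500_000, MISSING]:
    print(w, rich_two_step(w), rich_one_line(w))
```

If missing wealth should not be counted as rich, the one-line form needs the same qualifier as the first step (if wealth != .).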
Figure 4.20: Variable Properties
4.2.2.10 File Extensions
Data file         filename.dta
Do file           filename.do    (program file)
Dictionary file   filename.dct
Log file          filename.smcl  (only readable in Stata)
Log file          filename.log   (text file)
Just like in SPSS, the basic features of any data in Stata can be presented in the form of:
Graphical displays
Tabular descriptions
Summary statistics
Linear regressions
To draw a scatter plot of the variables yvar1 yvar2 ... (y-axis) against xvar (x-axis): the syntax is
scatter yvar1 yvar2 ... xvar
To draw a line graph, i.e. scatter with connected points
line yvar1 yvar2 ... xvar
To draw a histogram of the variable var
histogram var
To draw a scatter plot with regression line:
scatter yvar xvar || lfit yvar xvar
4.2.3.2 Summary Statistics
To display univariate summary statistics of the variables in varlist: type
summarize varlist
at the command prompt
To produce a two-way table of absolute and relative frequency counts along with Pearson's
chi-square statistic:
tabulate var1 var2, col chi2
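To show what the table and the chi-square statistic contain, the Python sketch below computes column percentages and Pearson's chi-square for a small two-way table of counts. It is an illustration only: the counts are hypothetical and the function names are chosen here. The statistic is the sum over all cells of (observed − expected)² / expected, where the expected count is row total × column total / N.

```python
def column_percentages(table):
    """Express each cell as a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[100 * cell / col_totals[j] for j, cell in enumerate(row)]
            for row in table]

def pearson_chi2(table):
    """Pearson's chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Hypothetical counts: rows are categories of var1, columns are categories of var2.
table = [[10, 20],
         [30, 40]]

print(column_percentages(table))
print(round(pearson_chi2(table), 4))
```

A large value of the statistic relative to its degrees of freedom, (rows − 1) × (columns − 1), indicates that the two variables are unlikely to be independent.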
To perform a two-sample t-test of the hypothesis that varname has the same mean within the two
groups defined by the dummy variable groupvar
ttest varname [if exp], by(groupvar) [unequal]
where the option unequal indicates that the two-sample data are not to be assumed to have equal
variances.
4.2.3.4 Regression
To regress a dependent variable depvar on a constant and one or more independent variables in
varlist use
regress depvar [varlist] [if exp] [, level(#) noconstant]
The if qualifier limits the estimation to a subsample specified by the logical expression exp.
The noconstant option suppresses the constant term.
level(#) specifies the confidence level, in percent, for confidence intervals of the coefficients. See
help regress for more options.
You can access the estimated parameters and their standard errors from the most recently
estimated model:
_coef[varname] contains the value of the coefficient on varname
_se[varname] contains the standard error of the coefficient
Stata calculates predictions from the previously estimated regression by
predict newvarname [, stdp]
The stdp option provides the standard error of the prediction.
[post-estimation commands: predict, cve, ...]
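For a single independent variable, the quantities that a regression reports can be reproduced by hand. The Python sketch below is an illustration only, with hypothetical data and a function name chosen here; it computes the least-squares intercept and slope and then the fitted values, which correspond to what a prediction step stores in a new variable.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]

intercept, slope = simple_ols(x, y)
predictions = [intercept + slope * xi for xi in x]   # fitted values

print("constant:", intercept, "coefficient on x:", slope)
print("predictions:", predictions)
```

With real data the points do not lie exactly on a line, so the fitted values differ from the observed ones and the coefficients carry standard errors; the sketch leaves those out for brevity.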
where filename is any name you wish to give the file. The append option simply adds more
information to an existing file, whereas the replace option erases anything that was already in the
file. Full logs are recorded in one of two formats: SMCL (Stata Markup and Control Language)
or text (meaning ASCII). The default is SMCL, but the option text can change that.
A command log contains only your commands
cmdlog using filename
Both types of log file can be viewed in the Viewer:
view filename
You can temporarily suspend, resume or stop the logging with the command:
log {on | off | close}
cmdlog {on | off | close}
4.2.3.5 Do-Files
A do-file is a set of commands just as you would type them in one by one during a regular Stata
session. Any command you use in Stata can be part of a do file. The default extension of do-files
is .do, which explains its name. Do-files allow you to run a long series of commands several
times with minor or no changes. Furthermore, do-files keep a record of the commands you used
to produce your results.
To edit a do-file, just click on the new do file icon in the toolbar. To run this file, save it in the
do-file editor and issue the command:
do mydofile
You can also click on the Do current file icon in the do-file editor to run the do-file you are
currently editing.
Comments are indicated by a * at the beginning of a line. Alternatively, what appears inside /* */
is ignored. The /* and */ comment delimiter has the advantage that it may be used in the middle
of a line. Appendix A shows some typical do files.
Worked Example 1
You are required to use Stata to analyze data from the 5th Ghana Living Standards Survey
(GLSS5) on the use of electricity for lighting and traditional fuels for cooking.
1. Generate a table showing total household income, main source of lighting and main fuel
used for cooking for all households covered in the GLSS5.
Solution
Command:
tabstat category1 totalincome s7dq13 s7dq11,by(hhid) by(loc2) columns(variables)
where totalincome is generated from the data using:
gen totalincome=totemp+agric1c+agric2c+nsfey1+nsfey2+nsfey3+import+remitinc+otherinc
and category1 holds the income quintiles, which are obtained with the commands
below:
pctile category= totalincome,nq(5)
xtile category1= totalincome,cut(category)
A table like that in figure 4.21 should appear in your result window.
Figure 4.21: A table of total household income, main source of lighting and main fuel used for
cooking, for all households in Ghana.
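The idea behind the income quintiles used in this example can be sketched in a few lines of Python. This is an illustration only: the incomes are hypothetical, the function name is chosen here, and the cut points come from Python's statistics.quantiles, whose interpolation rule may differ slightly from the one Stata uses. Placing a household whose income equals a cut point in the lower group is also an assumption of the sketch.

```python
from bisect import bisect_left
from statistics import quantiles

def quintile_groups(values):
    """Assign each value to a group 1-5 using four quintile cut points."""
    cuts = quantiles(values, n=5)         # four cut points splitting the data in five
    # Group number = 1 + number of cut points strictly below the value.
    return [bisect_left(cuts, v) + 1 for v in values]

# Hypothetical total household incomes.
incomes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

groups = quintile_groups(incomes)
for income, group in zip(incomes, groups):
    print(income, "-> Group", group)
```

Each group holds roughly one fifth of the households, which is what allows the lighting and cooking fuel variables to be compared across income levels.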
Worked Example 2
Generate a bar chart to show the % distribution by income quintile for households in Ghana who
use electricity as their main source of lighting.
Solution
Command:
graph bar (count) totalincome if s7dq11==1,over( category1) asyvars percentages
Figure 4.22 shows the content of the output.
[Bar chart: percentage of households in each income quintile (Group One, Group Two, Group Three, Group Four, Group Five). Source: Fifth Ghana Living Standards Survey]
Figure 4.22: A bar chart of the distribution of households using electricity as their main
source of lighting, for the various income quintiles in Ghana.
Use Stata to analyze the data from the 5th Ghana Living Standards Survey (GLSS5),
accompanying this manual.
4.2.1 Generate a bar chart to show the % distribution by income quintile for households in
Ghana which use traditional energy sources (Wood, Charcoal, Crop Residue/Sawdust,
Animal Waste and Other) as their main fuel for cooking.
4.2.2 With the aid of tables and bar charts, show how access to LPG varies between rural
and urban areas for the Greater Accra region (GAR).
4.2.3 Still using tables and bar charts, show how access to LPG in rural and urban areas for
GAR compares with those for two other regions of your choice, both of which should
not be in the same ecological zone.
Unit Summary
Statistical analysis software increases the accuracy and speed of data analysis, especially
for complex data. Sound planning and policy-making depend on the requisite data analysis
being done, and done correctly. SPSS and STATA are two of the common statistical
analysis packages that can be used to analyse data such as census data.
SPSS for Windows is a menu-driven program, i.e., most functions are performed by selecting
an option from one of the menus. Users have less control over statistical output than, for
example, Stata or Gauss users.
Key terms/ New Words in Unit
1. Toolbar
2. SPSS/STATA
3. Data editor
4. GLSS5
Unit Assignments 4
Use Stata to analyze the data accompanying this manual, from the 5th Ghana Living
Standards Survey (GLSS5).
If a question involves drawing table(s), submit your results and commands in a log
format.
For problems involving drawing graphs, write a do file to draw those graphs.
1. With the aid of tables and pie charts, show the variation between rural and
urban usage of LPG for the whole country and compare with those for
Greater Accra Region and any two regions of your choice.
2. With the aid of tables and bar charts, show how access to main source of
lighting varies between rural and urban areas of Ghana as a whole.
3. Again using tables and bar charts, show how the main source of lighting in
rural and urban areas of Ghana compare with those for any one region of
your choice.
4. Still with the aid of tables and bar charts, show how access to electricity as
main source of lighting varies between the region of your choice and Ghana
as a whole for each total household income quintile in Ghana.
Unit 5
Research findings are meant to be published so as to add to the body of knowledge in that
particular field of study. Research reports/papers, theses, journal articles and conference papers
are the widely used means of publishing research findings for the benefit of all interested parties.
This unit will guide students through the preparation of research reports/papers, theses, journal
articles and conference papers with proper referencing.
Learning Objectives
After reading this unit you should be able to:
1. Write journal articles and conference papers for publication.
2. Prepare a full research report or thesis with proper referencing.
UNIT CONTENT
SESSION 5.1: RESEARCH AND THESIS REPORTS
Title page
This is made up of the full title of the thesis, the name and previous qualification of the
author, the Department and Faculty to which it is being submitted, the degree for which it is
submitted in partial fulfilment of the requirements, and the month or year of presentation.
Abstract
An abstract is a brief summary of the thesis and the most likely part of the thesis to be
widely published and read. It should have a concise description of the problem addressed,
the methodology used, the results as well as conclusions. The abstract should usually be
composed as a single paragraph not exceeding 500 words.
Table of contents
This clearly outlines the chapters and subchapters, as well as the other contents of the
thesis, and the pages where they are located.
Prefatory matter
Material such as the preface, foreword and acknowledgement may be presented
in this section. The acknowledgement page is, however, mandatory.
Introduction
The introduction provides background information as well as the rationale for the
research work. It also provides information related to the need for the research and in the
process builds an argument for the research and presents research question(s) and aims.
The introduction should also give a detailed description of the various chapters as well as
their contents.
Literature Review
The literature review should provide a detailed account of research works done by other
researchers in the selected area of study, highlighting the merits as well as limitations.
Referencing is particularly important in this section because it consists mostly of works from
other researchers. This is where plagiarism becomes an issue. It is also important to discuss theory
which is directly relevant to your research in this section.
Methodology
This section of the thesis presents an understanding of the philosophical framework within
which the research will be carried out and gives the methodological approach as well as a
justification of the chosen methodology. This section should also clearly define the
boundaries of the research in terms of methodological approach and describe steps taken to
ensure ethical research practice.
This section draws all the important arguments and findings together and, in the process,
provides the reader with a strong sense that the work has been done satisfactorily and that it
was worthwhile. It provides summaries of the major findings and presents limitations as well
as the implications. It is important to end this section on a strong note by suggesting
directions for future research in the respective field.
References
This comprises a list of the major works (publications and authorities) consulted in the
course of writing the thesis. See the reference sections of these notes for more details of the
various referencing styles.
Appendices
An appendix provides a place for important information which, if placed in the main text,
would distract the reader from the flow of the argument. It may include raw data examples and
reorganised data (e.g. a table of interview quotes organised around themes). Appendices
may be named, lettered or numbered (decide early).
The title should be concise, attract attention, and highlight the main point of your paper.
It should be clear about the subject matter and devoid of abbreviations.
Abstract
The abstract is a concise summary of the paper and should be able to tell the reader
whether the paper is worth reading or not. It should therefore be as informative as
possible with respect to the objectives, methodology, results and conclusions. It
should generally not exceed 300 words.
Introduction
The introduction to a research paper should be as brief as possible and should touch on
the background of the research problem, give a clear justification of why the research is
being undertaken and state the underlying theory and hypothesis. It should contain a short
review of literature in the field of study and should be limited to a maximum of two
pages.
The conclusions drawn from the results of the research should be briefly and clearly
outlined and the importance of these conclusions should also be stated. All conclusions
should be supported by data presented in the research findings. This section should also
contain recommendations for future research in the respective field of study.
References
SESSION 5.2: JOURNAL ARTICLES AND CONFERENCE PAPER
PREPARATION
A journal article, sometimes referred to as a scientific article, a peer-reviewed article or a
scholarly research article, is the means by which a scholar presents the results of academic
research, or other information, to add to the body of knowledge in their field of study. It is
usually published in a journal. Conference papers are similar to journal articles, except that
they are delivered at conferences.
Guidelines for journal article/conference paper preparation vary from journal to journal and
from conference to conference, but there is a basic format that cuts across most journals and
conferences. It includes the title, the name(s) of the researcher(s) and their affiliation(s), an
abstract, and an introduction made up of background information, the problem statement, the
objectives and a justification of the research topic. The introduction should also give a general
overview of the whole paper.
5.2.1 Title
The title should be concise, attract attention, and highlight the main point of your paper. It should
be clear about the subject matter and devoid of abbreviations.
5.2.2 Authors
The list of authors with their institutional affiliation should be presented immediately after the
title. It should be ordered according to the level of contribution to the paper with the lead
contributor/principal author’s name listed first.
5.2.3 Abstract
It is important to provide an abstract of about 350 words which summarises the entire paper,
highlighting the most important information such as the purpose of the research, the
methodology used, the results and the conclusions.
5.2.4 Introduction
The introduction should provide a background to the research, state the problem briefly and
clearly outline the objectives of the research. It should
5.2.5 Methodology
The methodology tells how the research was conducted. It is important to describe in detail the
various processes involved in carrying out the research, with illustrations if possible.
5.2.6 Results and discussions
The results of the research should be presented in this section in the clearest form possible,
whether as text, figures or tables. It is also important to use text to provide essential
information on figures and tables, and to define all terms used in the text, figures and tables.
State directly and briefly your conclusions and the utility of these conclusions. All conclusions
should be supported by data presented in the paper. Present your recommendations also in this
section of the paper.
5.2.8 References
References should be listed in alphabetical order at the end of the text in this section.
SESSION 5.3: ABSTRACTS AND SUMMARIES AND
REFERENCING
Research data and results are mostly presented in tables and figures. Tables present lists of
numbers or text in columns, each column having a title or label, whereas figures are visual
presentations of results, including graphs, diagrams, photos, drawings, schematics, maps,
etc. Graphs are the most common type of figure and will be discussed in detail. When
figures and tables are used in a manuscript, they must be referred to from the text. It is
important to use sentences that draw the reader's attention to the major issues to be
highlighted by referring to the appropriate figure or table. Figures and tables must also be
properly captioned for clarity.
5.3.2 Referencing
A reference, as defined by De Montfort University, is the detailed bibliographic
description of an item from which information is gained. The basic idea behind
referencing is to support and identify the evidence you use in your research work. It helps
to direct readers of your work to the source of evidence. References can be presented in
two ways: in-text, where the source is briefly cited within the text, and in the reference list,
where it is given in full at the end of the work. Items read for background information
but not referred to in the text are usually also given in full at the end of the work, in a
list sometimes referred to as the bibliography. In short, references should:
• Enable the reader to locate the sources used for a research work
2. Handy (1996) argues that by the end of the twentieth century two broad approaches to
the management of people within organizations had emerged.
3. Some commentators, for example, Handy (1996), have argued that by the end of the
twentieth century two broad approaches to the management of people within
organizations had emerged.
4. It has been argued, (Handy 1996; see also Brown 1999 and Clark 2000), that two
approaches to the management of people within organizations had emerged by the
end of the twentieth century.
5. Charles Handy, amongst others, has argued that by the end of the twentieth century
two broad approaches to the management of people within organizations could be
observed (Handy 1996).
1. Book Reference
AUTHOR(S) (Year) Title. Edition – (if not the 1st). Place of publication: Publisher.
E.g.
o WILMORE, G.T.D. (2000). Alien plants of Yorkshire. Kendall: Yorkshire
Naturalists’ Union.
o LI, X. and CRANE, N.B. (1993) Electronic style: a guide to citing electronic
information. London: Meckler.
3. Chapters in books
AUTHOR(S) (Year) Title of chapter. In: AUTHOR(S)/EDITOR(S), ed(s). Book title.
Edition. Place of publication: Publisher, Pages (use p. or pp.)
e.g. TUCKMAN, A. (1999) Labour, skills and training. In: LEVITT, R. et al, (eds.)
The reorganised National Health Service. 6th ed. Cheltenham: Stanley Thornes, pp.
135-155.
5. Journal articles
AUTHOR(S) (Year) Title of article. Title of journal, Volume number. (Part
no./Issue/Month), Pages, use p. or pp.
RYAN, J. (2006) ‘Management accounting for developers’, Journal of advanced
accounting, Vol. 1, No 5: p.21-24
7. Electronic sources
AUTHOR(S) (Year) Title of document [Type of resource, e.g. CD-ROM, e-mail,
www] Organization responsible (optional). Available from: web address [Date
accessed].
e.g. UNIVERSITY OF SHEFFIELD LIBRARY (2001) Citing electronic sources of
information [WWW] University of Sheffield. Available from:
http://www.shef.ac.uk/library/libdocs/hsl-dvc1.pdf [Accessed 23/02/07].
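The book template above can be applied mechanically. The following is a minimal Python sketch of that idea, not part of any referencing package: the function names (format_authors, format_book, cite_in_text) and their parameters are hypothetical, and the punctuation simply follows the AUTHOR(S) (Year) Title. Edition. Place of publication: Publisher. pattern shown in these notes.

```python
def format_authors(authors):
    """Render (surname, initials) pairs as 'SURNAME, I. and SURNAME, I.'."""
    names = [f"{surname.upper()}, {initials}" for surname, initials in authors]
    if len(names) == 1:
        return names[0]
    return ", ".join(names[:-1]) + " and " + names[-1]


def format_book(authors, year, title, place, publisher, edition=None):
    """Build a book reference: AUTHOR(S) (Year) Title. Edition. Place: Publisher."""
    parts = [f"{format_authors(authors)} ({year}) {title}."]
    if edition:  # the edition is stated only when it is not the first
        parts.append(f"{edition}.")
    parts.append(f"{place}: {publisher}.")
    return " ".join(parts)


def cite_in_text(surname, year, narrative=True):
    """Short in-text form: 'Handy (1996)' or '(Handy 1996)'."""
    return f"{surname} ({year})" if narrative else f"({surname} {year})"


if __name__ == "__main__":
    print(format_book(
        [("Li", "X."), ("Crane", "N.B.")], 1993,
        "Electronic style: a guide to citing electronic information",
        "London", "Meckler",
    ))
    print(cite_in_text("Handy", 1996))
```

Keeping the in-text form and the full entry in separate functions mirrors the distinction drawn in this session between the brief citation in the text and the full reference at the end of the work.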
Consecutive numbering uses superscript numbers in the text that connect with references in
either footnotes or chapter endnotes (but usually the former). This system uses a different,
consecutive number for each reference in the text. A list of sources is included at the end of the
document, which lists all the works referred to in the notes (‘References’, ‘Works cited’).
(Neville, 2010)
Recurrent numbering style uses bracketed (or superscript) numbers in the text that connect
with a list of references at the end of the chapter/assignment. In this case, the same number
can recur if a source is mentioned more than once in the text. (Neville, 2010)
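The difference between the two numbering systems can be illustrated with a short Python sketch. It is only an illustration under simple assumptions: each citation is represented by a plain string, and the function names consecutive_numbering and recurrent_numbering are hypothetical.

```python
def consecutive_numbering(citations):
    """Give every citation a new number, even when a source is repeated."""
    return [(number, source) for number, source in enumerate(citations, start=1)]


def recurrent_numbering(citations):
    """Give each source the number of its first mention and reuse it afterwards."""
    numbers = {}   # source -> number assigned at first mention
    in_text = []   # the number printed at each citation point
    for source in citations:
        if source not in numbers:
            numbers[source] = len(numbers) + 1
        in_text.append(numbers[source])
    reference_list = [(number, source) for source, number in numbers.items()]
    return in_text, reference_list


if __name__ == "__main__":
    cited = ["Handy 1996", "Brown 1999", "Handy 1996"]
    print(consecutive_numbering(cited))
    print(recurrent_numbering(cited))
```

With the three citations in the usage example, consecutive numbering produces three notes (Handy appears as both 1 and 3), while recurrent numbering produces a two-entry reference list in which Handy keeps the number 1.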
5.3.4 Introduction to referencing software packages
There are numerous referencing software packages, but the commonest are EndNote, EndNote Web
and Reference Manager.
Endnote
EndNote is compatible with recent versions of Microsoft Word (Windows and Macintosh) and
installs an add-in for easy integration with your word processing software. It is used most
effectively from the start of a project, when information is being resourced, rather than when
writing up begins.
Endnote Web
EndNote Web is a simplified version of the full desktop EndNote product. It has only recently
been released and is still under development, but it can perform many common referencing tasks.
EndNote Web is compatible with recent versions of Microsoft Word (Windows and Macintosh).
One must download and install a plug-in to enable EndNote Web to work with Word. Once
registered for Endnote Web one can:
Reference Manager
Reference Manager is most commonly used by people who want to share a central database of
references and need to have multiple users adding and editing records at the same time. You can
specify whether users are allowed to make edits to the database. Reference Manager offers
different in-text citation templates for each Reference Type. It is however limited to Windows
operating systems only. Use Reference Manager to:
Reference Manager is used most effectively from the start of a project, when information is
being gathered, rather than when writing up begins.
Further details about the features of Reference Manager are available on the Reference
Manager website along with an online overview of the new features of Reference Manager 12
Learning Track Activities
Unit Summary
Communicating research findings to interested stakeholders is very important
since research works are usually carried out to address a specific issue in the
society. Journal articles and conference papers are among the commonest ways of
communicating research findings to stakeholders.
Bibliography
Endnote
Journal articles
Referencing
Summaries
Unit Assignments 5
COURSE SUMMARY
The course is organised under five units. Introduction to research proposal writing and thesis
synopsis development is treated in Unit 1, while engineering research design and data analysis is
treated in Unit 2. Unit 3 looks at social science research design and data analysis, with Unit 4
concentrating on statistical analysis using SPSS and STATA. Finally, Unit 5 introduces the
concept of journal article/conference paper writing and thesis report preparation.
Unit 1 sought to introduce students to the preliminary stages of research, which involve the
preparation of concept notes that give a brief idea of the nature of the research. It also
tackled the preparation of a full research proposal, including the logical framework
analysis and detailed budget preparation. The unit ended with an introduction to thesis
synopsis writing.
Unit 2 dealt with the rudiments of engineering research design and data analysis, covering issues
such as the various contexts in engineering practice which necessitate research, the classification
of experiments that may be undertaken as part of the research and procedures for the design of
experiments. It went on to treat error theory and the various sources of research errors. The unit
also treated the concept of probability theory.
Unit 3 talked about social science research design and data analysis where it looked at the
various research methodologies including survey research as well as case study research. The
unit also treated some basic research ethics including balancing cost and benefits in research.
Unit 4 introduced statistical analysis software packages and their importance in increasing the
accuracy and speed of analysis, especially of sophisticated data. It went on to indicate that
planning and good policy making can be done accurately only if the requisite data analysis is
done, and done correctly. SPSS and STATA are among the common statistical analysis software
packages that can be used to analyse data such as census data.
Unit 5 put together all the works done during the research into a document for dissemination.
This introduced the concept of journal articles/conference paper writing, research report/thesis
writing and abstracts/summaries. The unit ended with a brief discussion of the various
referencing styles and a more elaborate explanation of the Harvard way of referencing.
APPENDIX A1
KWAME NKRUMAH UNIVERSITY OF SCIENCE AND TECHNOLOGY
THESIS SYNOPSIS
……………………… …………………………… ………………………….
E. Y. OSEI DR. A. K. SUNNU PROF. BREW-HAMMOND
(CANDIDATE) (HEAD OF DEPARTMENT) (SUPERVISOR)
August 2009
BACKGROUND
Modern life would come to a halt without energy and this makes it simply impossible to
live without it. Studies have shown that simply harnessing the power of oxen in ancient times for
example increased the power available to the human being by a factor of 10 (World Energy
Council, 2000). The invention of the vertical water wheel increased human productivity by
another factor of 6 (WEC et al., 2000). The use of motor vehicles and airplanes has drastically
reduced journey times and increased the ability of humans to transport goods over wider
distances. Because energy is the foundation of industrial civilization and conventional fossil
sources are being depleted, the world must seek alternative sources to meet the increasing
demand.
Renewable energy sources are becoming increasingly attractive due to the limited fossil
reserves and the adverse effects associated with their use. They have the potential to provide
energy with zero or almost zero emissions of greenhouse gases and other air pollutants. The
renewable energy sources including solar, wave, wind, hydro, tidal, geothermal and bio-energy
are readily available and can provide complete energy security if their technologies are well
established (REN21, 2008).
Wind energy, first used by the Egyptians around the 4th century BC is a promising source
of electrical power because it has key advantages such as cleanliness, low cost, sustainability,
popularity, safety and abundance in most parts of the world. Studies in Ghana indicate that the
monthly average wind speed measurement at 12 m height above ground level lies in the range of
4.8 – 5.5 m/s (Akuffo et al., 2003). For wind speed of less than or equal to 4.4 m/s at a height of
10 m, the wind power density is less than or equal to 100 W/m² according to Li and Li (2005).
Despite this potential, the electrification rate in Ghana is 49.2% and 11.3 million people are
without electricity (IEA, 2006). The productivity of this large number of people is seriously
compromised and this constrains their opportunities for economic development and improved
living standards. This project seeks to assess the technical performance and determine the cost of
building a 50 MW wind power plant in Ghana.
JUSTIFICATION
The need to ensure electricity supply security first came to light in the 1980’s when
Ghana suffered a major drought resulting in reduced inflows to the Akosombo Dam. This
disrupted electricity supplies and adversely affected the performance of the economy. Today,
Ghana faces the challenge of providing reliable energy for the rapidly growing demand by all
sectors due to the expanding economy and growing population. It has been estimated that grid
electricity demand would grow from about 6,900 GWh in 2000 to about 18,000 GWh by 2015,
reaching about 24,000 GWh by 2020 (Energy Commission, 2006). The existing installed
electricity generating capacity of 1760 MW would have to be doubled by the year 2020 if Ghana
is to be assured of a secure, uninterrupted electricity supply (Energy Commission, 2006). To
become a wealthy country, Ghana's GDP needs to grow at a rate of 8 – 10%, and such growth
rates require a significant amount of electricity (Brew-Hammond et al., 2007).
Wind power use and development worldwide is growing rapidly, having doubled in the
three years between 2005 and 2008. The global wind industry installed close to 20,000 MW of
new capacity in 2007. This development, led by Spain, China and United States took the
worldwide total to 93,864 MW which was an increase of 31% compared with the 2006 market
and represented an overall increase in global installed capacity of about 27% (GWEC et al.,
2008). In 2008, it accounted for 19% of the electricity production in Denmark, 10% in Spain and
Portugal and 7% in Germany and the Republic of Ireland. At the end of that same year, the
worldwide nameplate capacity of wind-powered generators was 120.8 GW (Wikipedia, 2009).
These success stories attest to the efficacy of wind power technology as a viable option in
providing energy and reducing environmental pollution.
The installation of a 50 MW wind power plant in Ghana is intended to augment the existing
sources of electricity in the country, which are mainly thermal and hydro. This will, to some
extent, help to ease the worsening energy situation in the country. Wind energy, being a
renewable source, can provide energy in a sustainable manner and with virtually zero emission
of pollutants and greenhouse gases.
The Energy Commission of Ghana in 2003 conducted a study to gather and analyze wind
energy data in some areas of the country (Akuffo et al., 2003). This data would help determine
the wind turbine technology to use and the estimate of the cost required for installation.
OBJECTIVES
The main objective of this thesis is to conduct a feasibility study of generating 50 MW from
wind energy in the coastal areas of Ghana.
METHODOLOGY
Literature will be reviewed in order to get acquainted with the relevant work that has
been done in the field of wind power. The areas of interest will include wind flow
velocities around the world and particularly in Ghana, the energy situation in the country,
standard relationships between wind speed and the estimated power that can be generated per
square metre, the relevance of wind power in the country and wind turbine design technologies.
Sources of information will include the KNUST library, the internet, etc.
A prefeasibility study of a 50 MW wind power plant will be done using RETScreen with
in-built data and turbine specifications. The total initial cost will be determined as well as the
simple payback period. Greenhouse gas analysis will also be done.
Areas of the country best suited for wind power development will be selected based on
the recommendations of Solar and Wind Energy Resource Assessment compiled by the Energy
Commission of Ghana in 2003 and more recent data to be collected from them. The help of the
Ministry of Energy will be sought to approach private companies who have also made their own
measurements for coastal areas with the view to acquiring their data sets to be included with
those of the Energy Commission.
Wind turbine design technologies, their technical performance characteristics and their
costs will be collected from the manufacturers and reviewed, and the ones best suited to the
country’s situation determined. The comparison criteria will include merits and demerits,
technical considerations, applicability to the Ghanaian situation, etc. The technical assessment of
the whole plant will be carried out with the Wind Atlas Analysis and Application Program (WAsP)
designed by Risø National Laboratory.
The cost of building a 50 MW wind power plant in the areas of interest would again be
determined using Computer Model for Feasibility Analysis and Reporting (COMFAR) software
package designed by UNIDO for feasibility studies.
WORK PLAN
Period: March to November 2009, scheduled by week (Gantt chart; the activity bars are not
reproduced).
Activities:
Synopsis
Literature Review
Thesis Writing
Prefeasibility Study
Technical Assessment
Financial Analysis
Thesis wrap up
Submission of Draft Thesis
Total 4,300
REFERENCES
Akuffo F. O., Brew-Hammond A., Antonio J., Forson F., Edwin I. A., Sunnu A., Akwensivie F.,
Agbeko K. E., Ofori D. D., Appiah F. K. (2003). Solar and Wind Energy Resource Assessment
(SWERA). Department of Mechanical Engineering, KNUST.
Brew-Hammond A., Kemausuor F., Akuffo F. O., Akaba S., Braimah I., Edjekumhene I.,
Essandoh E., King R., Mensah-Kutin R., Momade F., Ofosu-Ahenkorah A. K., Sackey T. (2007).
Energy Crisis in Ghana: Drought, Technology or Policy? Kwame Nkrumah University of
Science and Technology, Kumasi, Ghana. ISBN: 9988-8377-2-0.
Energy Commission of Ghana (2003). Solar and Wind Energy Resource Assessment (SWERA).
Department of Mechanical Engineering, KNUST.
Energy Commission of Ghana (2006). Strategic National Energy Plan 2006 – 2020 and Ghana
Energy Policy. Main version.
Global Wind Energy Council, Greenpeace, Wind Power Works (2008). Global Wind Energy
Outlook 2008.
Meishen Li, Xianguo Li (2005). Investigation of wind characteristics and assessment of wind
energy potential for Waterloo region, Canada. Department of Mechanical Engineering,
University of Waterloo, 200 University Avenue West, Waterloo, Ont., Canada, N2L 3G1.
REN21 (2008). Renewables 2007 Global Status Report. Paris: REN21 Secretariat and
Washington, DC: Worldwatch Institute.
Resource Center for Energy Economics and Regulation (2005). Guide to Electric Power in
Ghana – First Edition. Institute of Statistical, Social and Economic Research, University of
Ghana, Legon.
Wikipedia (2009). Wind Power. http://en.wikipedia.org/wiki/Wind_energy (accessed: 23 March
2009).
World Energy Council, United Nations Development Programme, United Nations Department of
Economic and Social Affairs (2000). World Energy Assessment: energy and the challenge of
sustainability. New York, NY 10017. ISBN: 92-1-126126-0.
APPENDIX A2
KWAME NKRUMAH UNIVERSITY OF SCIENCE AND TECHNOLOGY
THESIS SYNOPSIS
NAME: FAISAL WAHIB ADAM
DEPARTMENT, KATH)
SIGNATURES:
BACKGROUND
The femur, or thigh bone, is the strongest, longest and heaviest bone in the body and is essential for
normal ambulation. It consists of three parts: the femoral shaft or diaphysis, the proximal metaphysis
and the distal metaphysis (Douglas et al., 2008).
Figure 1 (Wikipedia, 2010)
A femoral shaft fracture is a severe injury that generally occurs in high-speed motor vehicle collisions
and significant falls. These injuries are often one of several major injuries experienced by patients
(Jonathan, 2005). This type of fracture, like many other bony fractures, has become more common in
Ghana due to the exponential increase in the number of motor vehicle accidents.
The occurrence of fractures of the femoral shaft in the United States follows a bimodal
distribution, peaking at 25 and 65 years of age, with an overall incidence of approximately 1 per
10,000 people per year. Motor vehicle accidents are the most common cause, followed by pedestrian
versus automobile, falls from height, and gunshot injuries (Jesse, 2008). A similar study by Hinton
et al. (2000) revealed that the rate of femoral shaft fractures in children in Maryland was 19.5 per
100,000 per year, the same as the overall incidence in Finland. The most commonly occurring fracture
in children aged 6 to 9 years was caused when they were struck by cars. Once children reached driving
age, the most frequent cause was a motor vehicle accident. This variation gave rise to a bimodal
distribution with peaks at 2 and 17 years.
In the Department of Orthopaedic Surgery and Traumatology, Obafemi Awolowo University Teaching
Hospital, Ile-Ife, Osun State, Nigeria, a study of reported fractures indicated that the bones
involved were the humerus (10%), the femoral shaft (65%) and the tibia (25%) (Innocent et al., 2006).
Nowadays femoral shaft fractures in adults are usually treated operatively. As more and more
femoral shaft fractures are operated on, the number of complications has increased proportionately.
One such complication is implant failure. An implant is said to have failed if it is found to be
inadequate in performing the function expected of it.
The study of the causes of this failure for engineering purposes requires the quantification of many
factors. The surgeon is aware of most of them but cannot assess the requirements of a particular
situation quantitatively, as an engineer does. This is why an engineering analysis needs to be done
to find these causes.
JUSTIFICATION
A discussion with a section of orthopedic doctors and nurses at the orthopedic department (KATH)-
Kumasi-Ghana, has revealed that there is an alarming rate of femoral shaft implant failures, and this
calls for an objective assessment of the exact circumstances that lead to implant failure, as it is
necessary to prevent this complication in one of the major weight bearing bones of the body.
Failure of an implant is a condition that needs to be completely avoided in the human body because of
the devastating complications it can bring. For instance, a bend in the implant gradually removes the
thin film of oxide on its surface and hastens the corrosive process. If the metal is not removed, it
sheds continually, so that the surrounding soft tissue slowly becomes saturated with metal particles,
which may lead to aseptic inflammation many years after implantation (Charles et al., 1959). Another
complication is shortening of the femur, which leaves the patient with torsion on the pelvic girdle.
The causes of implant failure are complex to examine because they involve the engineer (designer), the
surgeon, the operating-room personnel and the patient, all of whom can contribute to the failure as
well as to the success of the implant. From the standpoint of Mechanical Engineering, every
device has points of weakness at which it will fail when the margin of safety is exceeded. It is the
designer's responsibility to provide an adequate minimum margin, and it is the surgeon's not to exceed
that margin (Cohen, 1964).
A lot of work has been done on the failure of femoral shaft implants in many countries, but to my
knowledge the causes of the failure of femoral shaft implants in operative orthopaedic practice have
not been reported for the Komfo Anokye Teaching Hospital, Ghana. Against this background, it has been
decided to study the causes of failure of femoral shaft implants from the Mechanical Engineering point
of view, by testing the mechanical properties of the implant to obtain the allowable stress and
comparing it with the stresses acting on the implant, so as to suggest guidelines to minimize further
failures.
OBJECTIVES
The objective of this work is to find the causes of failure of the femoral shaft implants at the Komfo
Anokye Teaching Hospital (KATH)
To find the mechanical properties of the femoral shaft plate implant that is used
-modulus of elasticity
METHODOLOGY
Two cases of healed femoral shaft implants and three failed ones, who presented at the department of
Orthopaedics KATH- Kumasi-Ghana, will be studied under the following headings;
Age
Sex
Body weight
Nature of primary injury
Anatomical site of the fracture
Type of primary fixation
Weight bearing
The X-ray of the fracture site will be taken together with the removed implant. The implant will be
taken to the mechanical engineering laboratory for the tensile test to be done. The X-ray will aid the
computer modeling, which will use the ANSYS software to do a progressive failure analysis and
predict the forces that could have caused that kind of failure.
Exclusion criteria
FACILITIES AVAILABLE
ANSYS Software
REFERENCES
1. Jonathan Cohen, Failure in Performance of Surgical Implants, Journal of Bone and Joint Surgery
http://www.jbjs.org. (Accessed 2010 February 6)
3. Alfred O. Ogbemudia, Phillip F.A. Umebee (2006). Implant Failure in Osteosynthesis of Fractures of
Long Bones. Journal of Medicine and Biomedical Research (College of Medical Sciences, University of
Benin, Nigeria).
10. RJ Brumback, S Uwagie-Ero, RP Lakatos, A Poka, GH Bathon and AR Burgess (1988). Intramedullary
Nailing of Femoral Shaft Fractures. Part II: Fracture Healing with Static Interlocking Fixation.
Journal of Bone and Joint Surge.
11. RW Buchoz, SE Ross and KL Lawrence (1987). Fatigue Fracture of the Interlocking Nail in the
Treatment of Fractures of the Distal Part of the Femoral Shaft. Journal of Bone and Joint Surgery.
Printing and Binding of Thesis 200 200
Total 3 800
WORKPLAN FOR COMPLETION OF PROJECT
Year: 2010, scheduled by week (Gantt chart; the activity bars are not reproduced).
Activities:
Literature review, Synopsis Writing, Sponsorship
Taking samples from hospital
Design of experiment
Chapters one and two
Testing of samples
Computer modeling
Chapter three
Analysis of results
Chapters four and five
Submission of draft thesis
APPENDIX B
/*#########################################################################*/
/* DO-FILES WRITTEN BY: FAISAL WAHIB ADAM */
/* MECHANICAL ENGINEERING DEPARTMENT */
/* KWAME NKRUMAH UNIVERSITY OF SCIENCE AND TECHNOLOGY */
/*#########################################################################*/
*DATE: [05-03-2011]
*****************************************************************************
//FIRST RESULT (2 tables)
use "C:/Documents and Settings/Administrator/Desktop/Stata 10.0/Faisal/finalgraphV3a.dta"
log using result1,replace text
describe
tabulate s7dq13 category1
tabulate s7dq13 category1,column nofreq
log close
exit
*****************************************************************************
*****************************************************************************
*****************************************************************************
label define region 7 "Eastn", modify
label define region 5 "Westn", modify
label define region 6 "Ashti", modify
label define region 7 "Accra", modify
describe
graph bar (count) hhid, over(s7dq13) over(category1) over(region) asyvars percentages stack ///
    title("DISTRIBUTION OF COOKING FUELS") ///
    subtitle("FOR THE VARIOUS INCOME QUINTILES AND REGIONS IN GHANA") ///
    ytitle("Percentage of households") ///
    note("Source: Fifth Ghana Living Standards Survey") ///
    legend(position(3) cols(1) order(8 7 6 5 4 3 2 1))
log close
exit
*****************************************************************************
Course Quiz/Exams
[Supply course quiz of this course here for the attention of the Institute’s examinations officer]
RESEARCH/PROJECT AREAS AND RELATED TOPICS
IN THIS COURSE
[Supply research/project areas and related topics in this course for use by students]
SOME CASE STUDIES IN THIS COURSE
MY PAGE
Name: _______________________________________ Learning Centre: _________________
_____________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________
Self-grading: self assessment questions score ______ % Unit Assignments scored ______ %
______________________________________________________________________________
______________________________________________________________________________
____________________________________________________ (may continue on reverse side)
= = = = = = = = = = = = = = = detach and return to IDL, KNUST = = = = = = = = = = = = = = =
Learner Feedback Form/[insert course code]
Dear Learner,
While studying the units in the course, you may have found certain portions of the text
difficult to comprehend. We wish to know your difficulties and suggestions, in order
to improve the course. Therefore, we request you to fill out and send the following
questionnaire, which pertains to this course. If you find the space provided
insufficient, kindly use a separate sheet.
1. How many hours did you need for studying the units
Unit no. 1 2 3 4 5 6
No. of hours
2. Please give your reactions to the following items based on your reading of the
course
Items                                        Excellent  Very good  Good  Poor  Give specific examples, if poor
Presentation quality
Language and style
Illustrations used (diagrams, tables, etc.)
Conceptual clarity
Self assessment
Feedback to SA
Unit 1: _______________________________________________________________
Unit 2: _______________________________________________________________
Unit 3: _______________________________________________________________
Unit 4: _______________________________________________________________
Unit 5: _______________________________________________________________
Unit 6: _______________________________________________________________