Research Data Processing
Contents
5.0 Objectives
5.1 Introduction
5.2 Data Processing
5.3 Editing of Data
5.4 Coding of Data
5.5 Preparing a Master Chart
5.6 Tabulation of Data
5.7 Classification of Data
5.8 Data Analysis and Interpretation
5.9 Use of Computer in Data Processing and Tabulation
5.10 Let Us Sum Up
5.11 Key Words
5.12 Suggested Readings
5.13 Answers to Check Your Progress
5.1 INTRODUCTION
In the previous Unit we discussed the methods and tools of data collection.
After data collection, the researcher turns his attention to its processing.
In this Unit, we will discuss one of the most important stages of the
research process, i.e. data processing and analysis.
5.2 DATA PROCESSING
Data processing refers to operations such as editing, coding, computing
of scores, preparation of master charts, etc. A researcher has to make
a plan for each and every stage of the research process. As such, a good
researcher plans the processing and analysis of data carefully. Some
researchers do not regard data processing and analysis as a very serious
activity. They often feel that data processing is a job for computer
assistants. As a result, the data are processed by assistants in ways that
may not help the researchers achieve their objectives. To avoid such
situations, it is essential that data processing be planned in advance and
the assistants instructed accordingly.
Column   Variable   Variable label                    Value label     Code
number   number
4        3          Establishment                     Public          1
                                                      Private         2
5        4          Level of Education                Graduate        1
                                                      Intermediate    2
                                                      High School     3
                                                      Middle School
                                                      Primary
                                                      Illiterate
                                                      Other
6        5          Marital Status                    Married         1
                                                      Unmarried       2
                                                      Widow           3
                                                      Divorce
36       35         Nature of work
                    Duration of work
                    Promotion                         Yes
                                                      No
43       42         Attitude of Employer:             Undecided
                    preference for male employees     Disagree
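The coding step can be sketched in a few lines of Python. This is only an illustration, not the procedure prescribed in this Unit: the code book below is limited to value labels and codes that are legible above, and the names CODE_BOOK, MISSING and code_response, as well as the use of 9 for a missing answer, are assumptions made for the example.

```python
# Hypothetical code book built from a few of the value labels listed above:
# each variable maps its value labels to numeric codes.
CODE_BOOK = {
    "establishment": {"Public": 1, "Private": 2},
    "education": {"Graduate": 1, "Intermediate": 2, "High School": 3},
    "marital_status": {"Married": 1, "Unmarried": 2, "Widow": 3},
}

MISSING = 9  # assumed code for blank or unrecognised answers


def code_response(response):
    """Convert one respondent's answers (labels) into numeric codes."""
    coded = {}
    for variable, labels in CODE_BOOK.items():
        answer = response.get(variable)
        coded[variable] = labels.get(answer, MISSING)
    return coded


# One made-up respondent; 'education' is left blank on purpose.
respondent = {"establishment": "Private", "marital_status": "Married"}
coded_row = code_response(respondent)
print(coded_row)
```

Each coded row of this kind would become one line of the master chart, with one column per variable.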
5.8 DATA ANALYSIS AND INTERPRETATION
The first step in data analysis is a critical examination of the processed data
in the form of frequency distributions and cross tabulations. This analysis is
made with a view to drawing meaningful inferences and generalisations.
Setting up the Analytic Model
Before we begin the analysis of data, we have to look back to the objectives
of our research study and set up analytic models. These models are diagrammatic
presentations of variables and their interrelationships.
Let us hypothesise that awareness about the Equal Remuneration Act affects
wage differentials.
H1: Awareness about the Act affects wage differentials
The two variables in the hypothesis are awareness about the Act and the
wage differentials. The relation between the two variables can be diagrammatically
presented as follows:
Awareness ---> Wage Differential
Regional development has been categorised into three levels, namely high, medium
and low. This can be described as follows:
REGIONAL DEVELOPMENT = MEDIUM
With the analytic model described above, the researcher can proceed to analyse
the data as discussed in the following sections.
Univariate Analysis
Univariate analysis refers to tables which give data relating to one variable.
Univariate tables, more commonly known as frequency distribution
tables, show how frequently an item repeats. Examples of frequency tables
are given below. The distribution may be symmetrical or asymmetrical.
Examining the percentages reveals the characteristics of the sample; further properties
of a distribution can be found through the various measures of central tendency.
However, the researcher has to decide which measure is best suited to the analysis.
To know how much variation there is, the researcher has to calculate measures
of dispersion.
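The measures mentioned above can be computed with Python's standard statistics module. The sketch below is illustrative only: the wage figures are made up and do not come from the study discussed in this Unit.

```python
import statistics

# Made-up monthly wages (in rupees) of nine respondents.
wages = [3000, 3200, 3200, 3500, 3600, 3600, 3600, 4000, 4700]

# Measures of central tendency
mean = statistics.mean(wages)      # arithmetic average
median = statistics.median(wages)  # middle value of the ordered data
mode = statistics.mode(wages)      # most frequent value

# Measures of dispersion
value_range = max(wages) - min(wages)  # distance between the extremes
std_dev = statistics.stdev(wages)      # sample standard deviation

print(mean, median, mode, value_range, round(std_dev, 1))
```

Which measure of central tendency to report depends on the level of measurement: the mode can be used with nominal data, the median needs at least ordinal data, and the mean needs interval or ratio data.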
Distribution of Respondents
Category    Number    Per cent
High        78        27.9
Medium      134       47.9
Low         68        24.2
Total       280       100.0
A frequency distribution of a single variable is the frequency of observations
in each category of that variable. For example, an examination of the pattern
of response to the variable 'awareness of the respondents' in Table 5.3 would
provide a description of the number of respondents who have high, medium
and low levels of awareness. In the case of nominal variables, categories can be
listed in any arbitrary order. Thus, the variable 'Religion' may be described
with the category 'Hindu' or the category 'Christian' listed first. However,
the categories of ordinal, interval and ratio variables are arranged in order.
Let us consider the frequency distributions (Tables 5.3, 5.4 and 5.5) which
describe the awareness, regional development and wage differentials of
respondents. Each table has four rows. The first three are the categories
of the variable, which appear in the left-hand column; the right-hand columns
show the number of observations in each category. The last row gives the total
of all frequencies in the table. To analyse the data, it is necessary to
convert the frequencies into figures that can be interpreted meaningfully. Frequencies
expressed in comparable numbers are called proportions or percentages. A
proportion is obtained by dividing the frequency of a category by the total
number of responses in the distribution. A proportion multiplied by 100
becomes a percentage. For example, the relative weight of the category 'High'
in Table 5.3 is expressed by the proportion 110/280 = 0.393 or by the percentage
110/280 x 100 = 39.3 per cent. These figures indicate that only about 40 out
of every 100 respondents in the group have a 'high' level of awareness about
the Act. Proportions and percentages permit the comparison of two or more
frequency distributions. For instance, the distribution of respondents by regional
development displayed in Table 5.4 clearly shows the predominance of respondents
from the 'high' development region, whereas the distribution of respondents by wage
differential in Table 5.5 indicates that the proportions of respondents with
'high' and 'low' wage differentials are almost equal.
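The conversion of frequencies into proportions and percentages can be sketched as follows. The function name to_percentages is hypothetical. Only the 'High' count of 110 out of 280 respondents is taken from the text; the split of the remaining 170 respondents between 'Medium' and 'Low' is assumed for the sake of the example.

```python
def to_percentages(frequencies):
    """Return {category: (proportion, percentage)} for a frequency table."""
    total = sum(frequencies.values())
    result = {}
    for category, count in frequencies.items():
        proportion = count / total  # frequency divided by total responses
        result[category] = (round(proportion, 3), round(proportion * 100, 1))
    return result


# 'High' = 110 out of 280 comes from the text; the Medium/Low split is assumed.
awareness = {"High": 110, "Medium": 100, "Low": 70}
table = to_percentages(awareness)
for category, (proportion, percentage) in table.items():
    print(f"{category:<8}{proportion:>7}{percentage:>7}")
```

Because every distribution is brought to the same base of 100, percentages from tables with different totals can be compared directly.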
Bivariate Analysis
A researcher might be interested in knowing the relationship between two
variables. To examine this relationship, the data pertaining
to the variables are cross tabulated. Hence, a bivariate table is also known
as a cross table. A bivariate table presents the data of two variables in rows
and columns simultaneously. An example of a bivariate table is given below:
The table presents data with regard to two variables, namely awareness about
the Equal Remuneration Act and the level of wage differential. The first row
presents data with regard to respondents who were aware of the Act. The
second row presents data about those who were not. Similarly, the first column
gives data pertaining to workers who have low wage differentials. The second
column presents data of workers whose differentials were medium and the
last column represents the respondents who felt high wage differentials. For
example, the first cell (in the left-hand corner) represents 94 respondents who
were fully aware of the Act and perceived high wage differentials.
The association between two variables can be explained by comparing the
percentages of respondents either column-wise or row-wise. The relationship between
the variables can also be examined by various statistical techniques, depending
upon the level of measurement of the data. The two variables appear to be
associated: more people who were aware of the Act perceived
low wage differentials than those who were not aware of it. Conversely, a comparatively
smaller percentage of people who were aware of the Act perceived high
wage differentials than of people who were not aware of the Act.
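A row-wise comparison of percentages can be sketched as below. The records are made up for illustration and are not the study data; ROWS, COLUMNS and row_percentages are hypothetical names.

```python
from collections import Counter

# Made-up (awareness, wage differential) pairs, one per respondent.
records = [
    ("Aware", "Low"), ("Aware", "Low"), ("Aware", "Low"), ("Aware", "Medium"),
    ("Aware", "High"), ("Not aware", "Low"), ("Not aware", "Medium"),
    ("Not aware", "High"), ("Not aware", "High"), ("Not aware", "High"),
]

ROWS = ["Aware", "Not aware"]
COLUMNS = ["Low", "Medium", "High"]

cell_counts = Counter(records)  # frequency of every (row, column) pair


def row_percentages(counts, rows, columns):
    """Express each cell as a percentage of its row total."""
    table = {}
    for row in rows:
        row_total = sum(counts[(row, col)] for col in columns)
        table[row] = {
            col: round(100 * counts[(row, col)] / row_total, 1) for col in columns
        }
    return table


percent_table = row_percentages(cell_counts, ROWS, COLUMNS)
for row in ROWS:
    print(row, percent_table[row])
```

Reading down a column of the printed table compares the aware and the not-aware groups on the same level of wage differential, which is the comparison described in the paragraph above.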
In bivariate analysis the researcher also explains the nature and the degree
of association, that is, whether the relationship is positive or negative and
whether its degree is high, moderate or low.
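This Unit does not name a particular statistic for the nature and degree of association. One measure often used when both variables are ordinal is Goodman and Kruskal's gamma, which ranges from -1 to +1: the sign shows whether the relationship is positive or negative and the magnitude shows its strength. The sketch below is a minimal illustration with made-up counts; the function name gamma is chosen for the example.

```python
def gamma(table):
    """Goodman and Kruskal's gamma for a table of counts.

    `table` is a list of rows; rows and columns must both run from the
    lowest to the highest category of their (ordinal) variable.
    """
    concordant = discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # only rows below row i
                for l in range(n_cols):
                    if l > j:                   # higher on both variables
                        concordant += table[i][j] * table[k][l]
                    elif l < j:                 # higher on one, lower on the other
                        discordant += table[i][j] * table[k][l]
    return (concordant - discordant) / (concordant + discordant)


# Made-up counts. Columns: wage differential Low, Medium, High.
counts = [
    [10, 20, 30],  # low awareness
    [30, 20, 10],  # high awareness
]
g = gamma(counts)
print(round(g, 3))
```

A negative value here would mean that higher awareness goes with lower perceived wage differentials in these made-up data; where to draw the line between a high, moderate and low degree of association is left to the researcher.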
Trivariate Analysis
Sometimes the researcher might be interested in knowing whether there is a third
variable which is affecting the relationship between two variables. In such
cases the researcher has to examine the bivariate relationship by controlling
the effects of the third variable. One way of controlling the effects of a third
variable is to prepare partial tables and examine the bivariate relationship.
Let us take an example. In the above table, if the researcher wants to examine
whether regional development has an effect on the bivariate relationship,
he may prepare three partial tables giving data relating to awareness of the
Act and wage differential for high, medium and low regional development.
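Regional Development = High (N = 142)
Awareness about     Wage Differentials
the Act             High         Medium       Low          Total
High                13 (40.6)       (33.3)
Medium              12 (37.5)    42 (53.8)    20 (62.5)    74
Total               32           78           32           142

Regional Development = Medium (N = 86)
Awareness about     Wage Differentials
the Act             High         Medium       Low          Total
High                 8 (36.4)    11 (25.6)     4 (19.0)    23
Medium               9 (40.9)    17 (39.5)     8 (38.1)    34
Total               22           43           21           86

Table 5.9: Regional Development = Low (N = 52)
Awareness about     Wage Differentials
the Act             High         Medium       Low          Total
Total               24           13           15           52

Figures in parentheses are column percentages.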
On examination of these three partial tables, if the researcher finds that
the bivariate relationship does not hold good, he may infer that it is the third
variable, regional development, which is affecting the bivariate relationship.
In the partial table for high regional development, among people
perceiving a high wage differential, a large proportion have a high level of awareness
about the Act. A similar trend can be noticed in the remaining two partial
tables, which means that regional development does not affect the bivariate relationship
between wage differential and awareness about the Act.
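The preparation of partial tables can be sketched as follows: the records are split by the category of the control variable and a separate cross table is counted for each group. The records and the function name partial_tables are made up for illustration and are not the study data.

```python
from collections import Counter, defaultdict

# Made-up records: (regional development, awareness, wage differential).
records = [
    ("High", "Aware", "Low"), ("High", "Aware", "Low"),
    ("High", "Not aware", "High"), ("High", "Not aware", "Low"),
    ("Low", "Aware", "Low"), ("Low", "Aware", "High"),
    ("Low", "Not aware", "High"), ("Low", "Not aware", "High"),
]


def partial_tables(rows):
    """Build one awareness x wage-differential table per control category."""
    tables = defaultdict(Counter)
    for control, awareness, wage_diff in rows:
        tables[control][(awareness, wage_diff)] += 1
    return tables


tables = partial_tables(records)
for region, counts in tables.items():
    print("Regional development =", region)
    for (awareness, wage_diff), n in sorted(counts.items()):
        print(f"  {awareness:<10}{wage_diff:<6}{n}")
```

If the pattern of counts is much the same in every partial table, the control variable is not altering the original relationship; if the pattern changes or disappears, the third variable is playing a part.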
5.9 USE OF COMPUTER IN DATA PROCESSING AND TABULATION
Research involves large amounts of data, which can be handled manually or
by computers. Computers provide the better alternative for more than one
reason. Besides their capacity to process large amounts of data, they can also analyse
data with the help of a number of statistical procedures. Computers carry
out processing and analysis of data accurately and at very high speed.
Statistical analysis that earlier took months now takes a few seconds or
a few minutes. The availability of statistical software and access to computers
have increased substantially over the last few years all over the world.
While there are many specialised software application packages for different
types of data analysis, Statistical Package for Social Sciences (SPSS) is one
such package that is often used by researchers for data processing and analysis.
It is a preferred choice for social work research analysis due to its
easy-to-use interface and comprehensive range of data manipulation and analytical
tools.
You can enter your data directly into the SPSS Data Editor. Before data analysis,
it is advisable to have a detailed plan of analysis so that you are
clear as to what analysis is to be performed. Select the procedure to work
on the data. All the variables are listed each time a dialog box is opened.
Select the variables to which you wish to apply a statistical procedure. After
completing the selection, execute the SPSS command. Most commands
are executed directly by clicking 'OK' in the dialog box. The processor
in the computer will execute the procedure and display the results on the
monitor as an 'output file'.