Data Preparation and SPSS Intro

The document discusses the process of data preparation and descriptive statistics for marketing research. It outlines the steps for data preparation including questionnaire checking, editing, coding, transcribing, and cleaning. Descriptive statistics are then used to summarize the data, including measures of central tendency like the mean, median, and mode as well as measures of dispersion like range, variance, and standard deviation.

Uploaded by

seif.yazhord

Marketing Research

Data Preparation & Descriptive Statistics
Overview

• Data Preparation Process

• Basic Descriptive Statistics

• Measures of Dispersion
Data Preparation Process

Preliminary Plan for Data Analysis

1. Questionnaire Checking
2. Editing
3. Coding
4. Transcribing
5. Data Cleaning

Data Analysis Strategy


Step 1: Questionnaire Checking

• What are the reasons for a questionnaire returned from the field to be unacceptable?

  • Parts of the questionnaire may be incomplete.
  • The pattern of responses may indicate that the respondent did not understand or follow the instructions.
  • The responses show little variance.
  • One or more pages are missing.
  • The questionnaire is received after the pre-established cut-off date.
  • The questionnaire is answered by someone who does not qualify for participation.
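
The acceptance criteria above can be expressed as a simple screening routine. The following is a minimal Python sketch, not a prescribed procedure: the field names (`answers`, `received`, `qualified`), the cut-off date, and the expected number of items are hypothetical choices made for illustration.

```python
from datetime import date

# Hypothetical settings; a real study would define its own.
CUTOFF = date(2024, 6, 30)   # assumed pre-established cut-off date
EXPECTED_ITEMS = 5           # assumed number of rating questions

def check_questionnaire(q):
    """Return the list of reasons a returned questionnaire is unacceptable."""
    problems = []
    answers = q["answers"]                       # list of ratings, None = blank
    given = [a for a in answers if a is not None]
    if len(answers) < EXPECTED_ITEMS:
        problems.append("pages or items missing")
    if len(given) < len(answers):
        problems.append("incomplete")
    if len(given) > 1 and len(set(given)) == 1:
        problems.append("little variance (same answer everywhere)")
    if q["received"] > CUTOFF:
        problems.append("received after cut-off date")
    if not q["qualified"]:
        problems.append("respondent does not qualify")
    return problems

ok = {"answers": [5, 6, 4, 7, 5], "received": date(2024, 6, 1), "qualified": True}
bad = {"answers": [4, 4, None, 4, 4], "received": date(2024, 7, 2), "qualified": True}
print(check_questionnaire(ok))
print(check_questionnaire(bad))
```

An empty list means the questionnaire passes these checks. Whether a respondent understood the instructions usually still needs a human reviewer.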
Step 2: Editing

• A review of the questionnaires with the objective of increasing accuracy & precision

• Treatment of Unsatisfactory Responses

  • Return to the field
  • Assign missing values
  • Discard unsatisfactory responses
Step 3: Coding

• The assignment of a code to represent a specific response to a specific question, along with the data record and column position that code will occupy

• A codebook contains coding instructions and the necessary information about variables in the data set

• Steps:
  1. Transforming responses to each question into a set of meaningful categories
  2. Assigning numerical codes to the categories
  3. Creating a data set suitable for computer analysis
1. Transforming Responses into Meaningful Categories

• A structured question is pre-categorized

• Responses to unstructured (open-ended) questions must be grouped into a meaningful and manageable set of categories
  • Example: "Other (please specify)"
2. Assigning Numerical Codes

• Assign appropriate numerical codes to responses that are not already in quantified form

• Numerical codes should be assigned in a way that facilitates computer manipulation and analysis of responses
Example

• A questionnaire was collected for a fast food chain research study, with 200 completed responses to the following questions:
  • Rate your preference to eat in a familiar restaurant (1 = Weak Preference, 7 = Strong Preference)
  • Rate the restaurant in terms of:
    • Quality of food (1 = Poor, 7 = Excellent)
    • Quantity of food (1 = Poor, 7 = Excellent)
    • Value for Money (1 = Poor, 7 = Excellent)
    • Service Quality (1 = Poor, 7 = Excellent)
  • Please indicate your household income:
    Less than 20,000 EGP (=1)       20,000 EGP - 34,999 EGP (=2)
    35,000 EGP - 49,999 EGP (=3)    50,000 EGP - 74,999 EGP (=4)
    75,000 EGP - 99,999 EGP (=5)    100,000 EGP or More (=6)
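
If income were collected as a raw amount rather than a circled category, the same brackets could be applied in code. This is a hedged Python sketch: the function name `income_code` is hypothetical, and it assumes whole-EGP amounts so that the brackets have no gaps.

```python
from bisect import bisect_right

# Lower bounds (EGP) of income categories 2..6, taken from the question above.
LOWER_BOUNDS = [20_000, 35_000, 50_000, 75_000, 100_000]

def income_code(income_egp):
    """Map a household income in EGP to its category code (1-6)."""
    if income_egp < 0:
        raise ValueError("income cannot be negative")
    # Count how many lower bounds the amount has reached, then shift to 1-based.
    return bisect_right(LOWER_BOUNDS, income_egp) + 1

for amount in (12_500, 20_000, 49_999, 100_000):
    print(amount, "->", income_code(amount))
```

Keeping the bracket boundaries in one list means the question wording and the coding rule are easy to compare.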
Codebook (In Appendix)

Column   Variable   Variable     Question   Coding
Number   Number     Name         Number     Instructions
1        1          ID                      1 to 200 as coded
2        2          Preference   1          Input the number circled. 1 = Weak Preference, 7 = Strong Preference
3        3          Quality      2          Input the number circled. 1 = Poor, 7 = Excellent
4        4          Quantity     3          Input the number circled. 1 = Poor, 7 = Excellent
5        5          Value        4          Input the number circled. 1 = Poor, 7 = Excellent
6        6          Service      5          Input the number circled. 1 = Poor, 7 = Excellent
7        7          Gender       6          Input the number selected. 1 = Female, 2 = Male
8        8          Income       7          Input the number circled.
                                            1 = Less than $20,000
                                            2 = $20,000 to 34,999
                                            3 = $35,000 to 49,999
                                            4 = $50,000 to 74,999
                                            5 = $75,000 to 99,999
                                            6 = $100,000 or more
Coding Multiple Response

• Which of the following countries have you visited during the past 12 months? (Mark all that apply)
________Canada
________England
________France
________Germany
________Japan
________Mexico

• How to code it: Need 6 variables, each relating to a specific country and having two possible values (Ex: 1 = “Yes” and 0 = “No”)
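
The six-variable scheme can be sketched in Python as follows. The function name `code_multiple_response` is a hypothetical helper; it assumes each respondent's marks arrive as a set of country names.

```python
COUNTRIES = ["Canada", "England", "France", "Germany", "Japan", "Mexico"]

def code_multiple_response(marked):
    """Return one 0/1 variable per country (1 = "Yes", 0 = "No")."""
    unknown = set(marked) - set(COUNTRIES)
    if unknown:
        raise ValueError(f"not on the list: {sorted(unknown)}")
    return {country: int(country in marked) for country in COUNTRIES}

row = code_multiple_response({"France", "Japan"})
print(row)
```

A respondent who marked nothing still gets six values, all 0, so every case keeps the same number of columns.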
Rank Order Question

• Please rank the following fast-food restaurants by placing a 1 beside the restaurant you think is best overall, a 2 beside the restaurant you think is second best, and so on.
__________Burger King
__________McDonald's
__________Wendy's
__________Hardy’s

• How to code it?
  This question requires as many variables (and columns) as there are objects to be ranked
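
As an illustration, one rank variable per restaurant could be produced like this. It is a minimal sketch with a hypothetical helper `code_ranking`; the validation rule (ranks 1 to n, no ties) is an assumption about how the question is meant to be answered.

```python
RESTAURANTS = ["Burger King", "McDonald's", "Wendy's", "Hardy's"]

def code_ranking(ranks):
    """Validate a ranking and return one rank variable per restaurant."""
    if set(ranks) != set(RESTAURANTS):
        raise ValueError("every restaurant needs exactly one rank")
    if sorted(ranks.values()) != list(range(1, len(RESTAURANTS) + 1)):
        raise ValueError("ranks must be 1..n with no ties or gaps")
    # Column order follows the questionnaire, not the rank order.
    return [ranks[name] for name in RESTAURANTS]

response = {"Burger King": 2, "McDonald's": 1, "Wendy's": 4, "Hardy's": 3}
print(code_ranking(response))
```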
3. Creating a Data Set

• A data set is an organized collection of data records

• Each sample unit within the data set is called a Case or Observation

• Structure of a Data Set
  • The number of observations is n
  • The total number of variables embedded in the questionnaire is m
  • The data set is then an n x m matrix of numbers
Structure of a Data Sheet

[Figure: data sheet, with a callout marking Respondent 1’s response to variable 1.]
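
The n x m layout can be illustrated with a small Python sketch. The variable names follow the codebook, but the three cases and their values are made up for illustration.

```python
# Each inner list is one case (row); each position is one variable (column).
VARIABLES = ["ID", "Preference", "Quality", "Quantity", "Value", "Service"]
data = [
    [1, 2, 2, 3, 1, 3],   # made-up values for illustration
    [2, 6, 5, 6, 5, 7],
    [3, 4, 4, 3, 4, 5],
]

n = len(data)        # number of observations
m = len(VARIABLES)   # number of variables
assert all(len(case) == m for case in data)  # every case fills every column

col = VARIABLES.index("Preference")
print(f"{n} x {m} data set")
print("Respondent 1, Preference:", data[0][col])
```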
Step 4: Transcribing

• Transcribing is transferring the coded data from the questionnaires into the computer.

• This step is unnecessary in most cases because data are entered directly into the computer.
Step 5: Data Cleaning

• Consistency Checks
  • A part of the data cleaning process that identifies data that are out of range or logically inconsistent, or that have extreme values
• Treatment of Missing Responses
  • Missing responses may result from:
    • A respondent's refusal to answer a question
    • An interviewer's failure to ask a question or record an answer, or a "don't know" that does not seem legitimate
  • Keeping them to a minimum requires sound questionnaire design & tight control over fieldwork
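
A range check and one common treatment of missing responses might look like the sketch below. The missing-value code 9 and the helper names are assumptions, and excluding missing values from the mean is only one of several possible treatments.

```python
MISSING = 9  # assumed missing-value code, outside the valid 1-7 range

def clean_rating(value):
    """Return a valid 1-7 rating, or MISSING for blanks and out-of-range values."""
    if value is None or not 1 <= value <= 7:
        return MISSING
    return value

def mean_excluding_missing(values):
    """Average the valid ratings only; return None if nothing is valid."""
    valid = [v for v in values if v != MISSING]
    return sum(valid) / len(valid) if valid else None

raw = [5, 8, None, 3, 7, 0, 4]   # hypothetical entries for one variable
cleaned = [clean_rating(v) for v in raw]
print(cleaned)
print(mean_excluding_missing(cleaned))
```

In practice an out-of-range value would first be checked against the original questionnaire before being set to missing.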
Data Preparation & Cleaning for Your Project

• Convert responses to codes (based on your codebook)
  • If a question requires more variables (columns), create them
  • Open ended: Other…
Descriptive statistics

• Summarizes/describes the characteristics of a data set.
• Consists of two basic categories of measures:
  1) Measures of central tendency: describe the center of a data set.
  2) Measures of dispersion: describe the variability/spread of data within the set.
Measures of Central Tendency

• Mode: The most frequently chosen category

• Median: The 50th percentile response

• Mean: The simple average of the various numbers

Example

• A sample of 100 students has been drawn to measure their perceptions of AUC’s online instruction.
• Calculate the mode, median and mean based on the results in the following table.
• How do you evaluate the results?

Choices          Code   # Students
Hate it          1      30
Don’t like it    2      25
Neutral          3      25
Like it          4      15
Love it          5      5
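
One way to check the three measures for this table is a short Python sketch using the standard `statistics` module. The counts come from the table above; expanding them into a list of 100 coded responses is just a convenient way to reuse the library functions.

```python
from statistics import mean, median, multimode

# Code -> number of students, from the table above.
counts = {1: 30, 2: 25, 3: 25, 4: 15, 5: 5}

# Expand the frequency table into the 100 individual coded responses.
responses = [code for code, n in counts.items() for _ in range(n)]

print("mode:", multimode(responses))
print("median:", median(responses))
print("mean:", mean(responses))
```

This gives a mode of 1 (Hate it), a median of 2 (Don't like it) and a mean of 2.4. All three sit below the neutral midpoint of 3, so perceptions lean negative.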
Measures of Dispersion

• Describe how the responses are clustered around the mean or a central value.

• Measures:
  • Range: The difference between the largest and smallest response value
  • Variance: The mean squared deviation from the mean (Normal distribution assumption). The variance can never be negative.
  • Standard Deviation: The square root of the variance. It measures dispersion around the mean
  • Coefficient of Variation: The ratio of the standard deviation to the mean, expressed as a percentage. It is a unitless measure of relative variability
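
The four measures can be computed for the student example (codes 1 to 5 with 30, 25, 25, 15 and 5 students) as sketched below. This assumes the population formulas, which divide by n; statistical packages often report the sample versions, which divide by n - 1 and give slightly larger values.

```python
from statistics import mean, pstdev, pvariance

counts = {1: 30, 2: 25, 3: 25, 4: 15, 5: 5}
responses = [code for code, n in counts.items() for _ in range(n)]

value_range = max(responses) - min(responses)   # largest minus smallest value
variance = pvariance(responses)                 # mean squared deviation from the mean
std_dev = pstdev(responses)                     # square root of the variance
cv = std_dev / mean(responses) * 100            # coefficient of variation, in percent

print("range:", value_range)
print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
print("coefficient of variation (%):", round(cv, 1))
```

Under these assumptions the range is 4, the variance 1.44, the standard deviation 1.2 and the coefficient of variation 50%.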
Measures of Central Tendency & Dispersion

Measurement level of data    Measure of            Measure of
pertaining to variable       central tendency      dispersion
Nominal                      Mode                  No measure
Ordinal                      Median                Range
Interval                     Mean                  Standard deviation
Ratio                        Mean                  Standard deviation

Why Averages May be Misleading

• Researchers tested a new sauce product & found:
  • Mean rating of the taste test was close to the middle of the scale, which had "very mild" and "very hot" as its bipolar adjectives
• Researcher’s conclusion
  • Consumers need neither really hot nor really mild sauce
• Deeper examination revealed
  • The existence of a large proportion of consumers who wanted the sauce to be mild and an equally large proportion who wanted it to be hot
• Moral of the story:
  • A clear understanding of the distribution of responses can help a researcher avoid erroneous inferences
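
The effect is easy to reproduce with made-up numbers. In the Python sketch below, the 100 ratings on a 1-7 scale (1 = very mild, 7 = very hot) are hypothetical and chosen to form two equally large groups.

```python
from collections import Counter
from statistics import mean

# Hypothetical ratings: half the sample wants mild, half wants hot.
ratings = [1] * 40 + [2] * 10 + [6] * 10 + [7] * 40

print("mean:", mean(ratings))

# Crude text histogram: one '#' per two respondents.
distribution = Counter(ratings)
for value in range(1, 8):
    print(value, "#" * (distribution[value] // 2))
```

The mean lands on the scale midpoint even though not a single respondent chose it, which is why the frequency distribution should be inspected before interpreting an average.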
SPSS
SPSS interface

Data View
• The place to enter data
• Columns: Variables
• Rows: Records

Variable View
• The place to enter variables
• List of all variables
• Characteristics of all variables
Data View on SPSS

[Screenshot: Data View. Callouts: variable name (set in Variable View), data value, respondent number (called case number).]

Variable View on SPSS

[Screenshot: Variable View. Callouts: variable name, variable type, question statement, possible values, values used for missing, scale.]
Exercise: Creating variables in SPSS

• Open SPSS

• Create a nominal variable, Gender
  • Create a variable named Gender
  • Put 1 for males and 2 for females

• Create a second variable called age group
  • Put <18 as 1
  • Put 18-25 as 2
  • Put 26-35 as 3
  • Put >35 as 4

• Create a continuous variable called Salary

Importing Data from Excel

• Select File>Open>Data
  • Choose Excel as file type
  • Select the file you want to import
  • Then click Open

Open Excel files in SPSS

[Screenshot: the imported file in SPSS. Variable names appear in the column headers as written in Excel.]
Frequency Distribution

• A mathematical distribution with the objective of obtaining a count of the number of responses associated with values of one variable and to express these counts in percentage terms

• One-way tabulation is a table showing the distribution of data pertaining to categories of a single variable
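
A one-way tabulation can be sketched in Python as follows, using the counts from the earlier student example. The helper name `one_way_table` and the cumulative-percent column are illustrative choices, not a reproduction of any particular package's output.

```python
from collections import Counter

LABELS = {1: "Hate it", 2: "Don't like it", 3: "Neutral", 4: "Like it", 5: "Love it"}

def one_way_table(responses):
    """Return (code, count, percent, cumulative percent) rows for one variable."""
    counts = Counter(responses)
    total = len(responses)
    rows, cumulative = [], 0
    for code in sorted(counts):
        cumulative += counts[code]
        rows.append((code, counts[code],
                     100 * counts[code] / total,
                     100 * cumulative / total))
    return rows

responses = [1] * 30 + [2] * 25 + [3] * 25 + [4] * 15 + [5] * 5
for code, count, pct, cum_pct in one_way_table(responses):
    print(f"{LABELS[code]:<14}{count:>4}{pct:>7.1f}{cum_pct:>7.1f}")
```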
Frequency Distribution in SPSS

Step 1: Choose the type of analysis: Analyze>Descriptive Statistics>Frequencies

Step 2: Select the variable for which you want to compute frequencies and press “OK”

Note: You can also choose to display a histogram (frequency distribution chart)

Step 3: Analyze the output (frequency table)
Measures of Central Tendency in SPSS

Step 2: Select the variable for which you want to compute the measures and click on “Statistics”

Step 3: Select the analysis you want to compute

Measures of Dispersion in SPSS

Step 3: Select the analysis you want to compute
