Research Methodology
DEMGN832
Edited by:
Dr. Lokesh Jasrai
CONTENTS
Objectives
Introduction
1.1 Characteristics of Research
1.2 Research Proposal
1.3 Creating a good research proposal
1.4 Research Paradigms
1.5 Research Ethics
Summary
Key Words
Self-Assessment
Review Questions
Answers: Self-Assessment
Further Readings
Objectives
After studying this unit, you will be able to:
understand the process of writing a research proposal.
explain the significance of the research proposal.
compare and contrast the major paradigms of evaluation.
outline the contribution of research towards theory.
assess the role of ethics in research.
Introduction
What is research?
A careful consideration of study concerning a specific concern or problem using scientific methods.
Research is also defined as the creation of new knowledge and/or the use of existing knowledge in
a new and creative way to generate new concepts, methodologies, and understandings. This could
include synthesis and analysis of previous research to the extent that it leads to new and creative
outcomes. Research also includes a careful consideration of study regarding a particular concern or
problem using scientific methods. According to Earl Robert Babbie, “Research is a systematic inquiry
to describe, explain, predict, and control the observed phenomenon. Research involves inductive and
deductive methods.” Inductive research methods are used to analyze an observed event. Deductive
methods are used to verify the observed event. Inductive approaches are associated with qualitative
research, and deductive methods are more commonly associated with quantitative research.
Research is conducted with a purpose to understand:
What do organizations or businesses want to find out?
What are the processes that need to be followed to chase the idea?
What are the arguments that need to be built around a concept?
What is the evidence that will be required for people to believe in the idea or concept?
Identifying Potential Topics
Sometimes, an instructor may provide a list of suggested topics. If so, a researcher may benefit from
identifying several possibilities before committing to one idea. Other times, an instructor lets
students decide where to begin when picking a topic. It is important to know how to narrow down
the ideas into a concise, manageable thesis. Discussing the ideas with the instructor will help ensure
that a researcher chooses a manageable topic that fits the requirements of the assignment.
Below are a couple of common approaches to developing and narrowing a topic.
Explore Topics
In a few sentences, describe broader topics or issues the article touches on. Beyond the specific
incident or event described in the article, what larger social problems or debates does the article
relate to? (Example: when reading an article on a specific topic like an increase in fuel prices, topics might
include international petro-prices, vehicles, government policies, taxes, etc.)
Explore Exigence
In a few sentences, explain why you are personally interested in or curious about the
incident reported in the article. If possible, connect it to your own experience. Based on this, what
topics or questions do you think you would like to research?
Explore Kairos
In a few sentences, identify the groups of people this incident or problem matters to
(beyond yourself) and why it matters to them now, thinking not only of those involved in the
incident itself but also of other people, entities, or institutions in society that might have a concern
regarding this incident or incidents like it. Based on what matters most about this incident, what
topics related to it might be worthy of research?
Explore Controversies
In a few sentences, explain what differences of opinion or debates may exist about this incident or
event and why you think those differences of opinion might exist. Based on this, which of
these controversies might be worthy of research?
which are best found on the open web? What search words will you use to search for your sources,
whether you do this using library resources or go online?
A. Research topic
C. Methodology
A tight fit between the aims of the study and the research strategy chosen must be evident.
Include statements on:
research strategy (e.g., qualitative, quantitative) and justification for approach;
research methods (e.g. survey, case study, ethnography, experimental);
tools of data collection (e.g., questionnaire, interviews, focus groups, documentary analysis);
location and availability of data;
methods of data analysis and interpretation;
ethical implications (if relevant); and
any problems that may be encountered in the conduct of the research.
A key part of a research application is the research proposal. Whether you are applying
for self-funded research or organization-funded research, research guidelines should be
followed.
E. Proposed timeline/milestones
A schedule indicating plans from commencement right through to submission needs to be
provided.
TECHNICAL EVALUATION OF RESEARCH PROPOSAL
Title ____________________________________________________________
________________________________________________________________
Reviewed by: ____________________________________________________
Date: __________________________
YES NO
A. TITLE
Does it include the
subject matter/scope?
type of study?
location/subjects?
intervention?
time/duration of observation?
B. SIGNIFICANCE
Is the study worth undertaking?
1. The problem
affects a large population
has serious consequences
related to on-going project
2. The answer
fills a gap in knowledge
has practical application
will improve professional practice/health service
Is the topic in priority list of DOST/DOH?
Is it within the institutional mission?
D. RESEARCH QUESTION
Is it clear and adequately formulated?
Is it researchable?
F. OBJECTIVES
Is the general objective appropriate for the research question?
2. Subjects
Is target population suitable to the objectives?
Is accessible population representative of target population?
3. Data Collection
Are the relevant variables operationally defined?
4. Data Processing
Is the data processing to be done by computerization?
H. SCHEDULE
Are all important activities scheduled?
I. BUDGET
Are all projected expenses included?
1.4 Research Paradigms
Once the researcher finalizes the research topic, the next thought is about the approach or
methodology to follow for the research. There are three questions that the researcher needs to ask
before beginning the actual research:
[Figure: Research paradigms, comprising the ontological inquiry, the epistemological inquiry, and the methodological inquiry]
1. The ontological inquiry: What is the reality that the researcher wants to explore and know?
2. The epistemological inquiry: What is it (the ontology) that is available to explore and how to
reach it?
3. The methodological inquiry: What are the methods and procedures that will make this inquiry
possible?
All three of the above questions are part of the paradigms of research. A paradigm is a worldview
about how to conduct research. A paradigm includes the methodology, approach, ontology, and
epistemology used to conduct the research. Within one paradigm there can be several methodologies, and the
researcher can follow any one of them. These methodologies are approaches to research that can
help the researcher conduct systematic research.
For example, if ontology asks “Does God exist?”, epistemology asks “How can one know that God
exists?”, and methodology focuses on what procedures and methods one can use to establish the
existence of God. There are, however, four or five internationally accepted paradigms,
depending on whether you are researching in the pure sciences or in the social sciences. For the new
researcher, the choice of the right paradigm and research methodology is a difficult task.
Researchers get a better understanding of the paradigm as they work on their research project. The
term paradigm was first used by Kuhn in his work The Structure of Scientific Revolutions; he defined
a research paradigm as “an integrated cluster of substantive concepts, variables, and problems
attached with corresponding methodological approaches and tools”.
A researcher will be curious to know the answers to the research questions. The research
questions could be answered informally, but the researcher would then be unable to tell readers
how he/she conducted the research. A researcher needs to provide a step-by-step guide to the readers
about how the research proceeded and how the researcher got the answers to the research question.
Every piece of research should have certain characteristics, and these characteristics give
the research meaning and value. Unless the researcher follows a well-defined path to conduct the
research, he/she cannot justify the findings to the readers. Additionally, other researchers cannot
replicate the study, nor can they learn from it. A paradigm provides the researcher with a guide to follow
throughout the research. The novice researcher often finds it difficult to understand research paradigms
and their importance in research.
Researchers devise a method for investigating a particular topic, and its contribution is to add
to existing knowledge about the methodology.
1. Honesty
Honestly report data, results, methods and procedures, and publication status. Do not fabricate,
falsify, or misrepresent data.
2. Objectivity
Strive to avoid bias in experimental design, data analysis, data interpretation, peer review,
personnel decisions, grant writing, expert testimony, and other aspects of research.
3. Integrity
Keep your promises and agreements; act with sincerity; strive for consistency of thought and
action.
4. Carefulness
Avoid careless errors and negligence; carefully and critically examine your work and the work of
your peers. Keep good records of research activities.
5. Openness
Share data, results, ideas, tools, resources. Be open to criticism and new ideas.
7. Confidentiality
Protect confidential communications, such as papers or grants submitted for publication, personnel
records, trade or military secrets, and patient records.
8. Responsible Publication
Publish in order to advance research, not to advance just your career. Avoid wasteful
and duplicative publication.
9. Responsible Mentoring
Help to educate, mentor, and advise students. Promote their welfare and allow them to make their
own decisions.
11. Social Responsibility
Strive to promote social good and prevent or mitigate social harms through research, public
education, and advocacy.
12. Non-Discrimination
Avoid discrimination against colleagues or students based on sex, race, ethnicity, or other factors
that are not related to their scientific competence and integrity.
13. Competence
Maintain and improve your own professional competence and expertise through lifelong education
and learning; take steps to promote competence in science.
14. Legality
Know and obey relevant laws and institutional and governmental policies.
15. Animal Care
Show proper respect and care for animals when using them in research. Do not conduct
unnecessary or poorly designed animal experiments.
Summary
Research is defined as the creation of new knowledge and/or the use of existing knowledge in a
new and creative way so as to generate new concepts, methodologies and understandings. This
could include synthesis and analysis of previous research to the extent that it leads to new and
creative outcomes.
Research proposal is required to be developed by the researcher to present his/her future course of
actions related to research. It includes a detailed description of the research process which a
researcher wants to undertake.
A paradigm is a worldview about how to conduct research. Paradigm includes the methodology,
approach, ontology, and epistemology to conduct the research. In one paradigm there can be
several methodologies and the researcher can follow any one of them. These methodologies are
approaches to research that can help the researcher conduct systematic research.
Key Words
The ontological inquiry: What is the reality that the researcher wants to explore and know?
The epistemological inquiry: What is it (the ontology) that is available to explore and how to reach it?
The methodological inquiry: What are the methods and procedures that will make this inquiry
possible?
Self-Assessment
10. Which of the following conveys the meaning of research?
(a) Finding a solution to the research problem (b) Searching again (c) A scientific way to search for truth (d)
None of these
11. Which of the following concerns the methods and procedures that make an inquiry possible?
(a) Methodology (b) Epistemology (c) Ontology (d) None of these
12. Which of the following is NOT required in a research proposal while writing the background and
context of the study?
(a) Assumptions (b) Significance of the study (c) Location and availability of data (d)
Problems associated with the issue
Review Questions
13. What do you mean by research?
14. Write down the various points to be considered while preparing a research proposal. Highlight the
importance of each point in detail.
15. What are the common research paradigms? Elaborate.
16. What do you understand by research ethics? Why are ethics necessary in research?
Answers: Self-Assessment
Further Readings
Business Research Methods by Naval Bajpai, Pearson
Research Methodology: Methods and Techniques by Kothari, C. R. & Garg, Gaurav,
New Age International.
Marketing Research by Naresh K Malhotra, Pearson
Introduction
Research can be referred to as a search for information. It can also be defined as the scientific and
systematic search for relevant information on a particular subject or topic. Research is the ability to
make scientific inquiries. Research is an original and systematic investigation undertaken to
increase existing knowledge and understanding of the unknown to establish facts and principles.
Some researchers consider research a voyage of discovery of new knowledge. It comprises the
creation of ideas and the generation of new knowledge that leads to new and improved insights
and the development of new materials, devices, products, and processes. It should have the
potential to produce results that are sufficiently relevant to increase and synthesize existing
knowledge or to correct and integrate previous knowledge. Research is a scientific approach to
answering a research question, solving a research problem, or generating new knowledge through
a systematic and orderly collection, organization, and analysis of data with the goal of making the
findings of research useful in decision-making. Any research endeavor is said to be scientific if:
Objectives of Research
The purpose of research is to explore answers to questions through the application of scientific
procedures. The main aim of research is to find out the truth which is hidden and which has
not been discovered as yet. Though each research study has its specific purpose, we may think
of research objectives as falling into the following broad groupings:
1. To gain familiarity with a phenomenon or to achieve new insights into it (studies with
this object in view are termed exploratory or formulative research studies);
2. To portray accurately the characteristics of a particular individual, situation, or group (studies
with this object in view are known as descriptive research studies);
3. To determine the frequency with which something occurs or with which it is associated with
something else (studies with this object in view are known as diagnostic research
studies);
4. To test a hypothesis of a causal relationship between variables (such studies are known
as hypothesis-testing research studies).
Characteristics of Research
1. The purpose of the research should be clearly defined, and common concepts should be used.
2. The research procedure used should be described in sufficient detail to permit another
researcher to repeat the research for further advancement, keeping the continuity of what has
already been attained.
3. The procedural design of the research should be carefully planned to yield results that are as
objective as possible.
4. The researcher should report with complete frankness, flaws in procedural design and estimate
their effects upon the findings.
5. The analysis of data should be sufficiently adequate to reveal its significance and the methods of
analysis used should be appropriate. The validity and reliability of the data should be checked
carefully.
6. Conclusions should be confined to those justified by the data of the research and limited to those
for which the data provide an adequate basis.
7. Greater confidence in research is warranted if the researcher is experienced, has a good
reputation in research, and is a person of integrity.
2. Desire to face the challenge in solving the unsolved problems, i.e., concern over
practical problems initiates research
3. Desire to get the intellectual joy of doing some creative work
4. Desire to be of service to society
5. Desire to get respectability
However, this is not an exhaustive list of factors motivating people to undertake research
studies. Many more factors such as directives of the government, employment conditions, curiosity
about new things, desire to understand causal relationships, social thinking and awakening, and
the like may as well motivate (or at times compel) people to perform research operations.
Research Process
There are a variety of approaches to research in any field of investigation, irrespective of whether it
is applied research or basic research. Each particular research study will be unique in some ways
because of the particular time, setting, environment, and place in which it is being undertaken. An
understanding of the research process is necessary to carry out research effectively and to sequence
the stages inherent in the process.
A research design is a detailed blueprint used to guide a research study towards its objective (Aaker et
al., 2000). In the introductory section, it has already been discussed that the steps in conducting a
research program are interlinked and interrelated. Good research is conducted in nine steps, which include
problem or opportunity identification, a meeting between the decision-maker and the business researcher to discuss the
problem and opportunity dimensions, definition of the management problem and subsequently the research
problem, and preparation of a formal research proposal.
Figure: Research Process
1. Problem or opportunity identification
2. Evaluate the cost of research
3. Collect information
4. Research design decision
5. Fieldwork and data collection
6. Data preparation and data entry
7. Performing data analysis
8. Interpretation of result and presentation of findings
9. Management decision and its implementation
Problem or opportunity identification
The process of business research starts with problem or opportunity identification.
The management of the company identifies the problem or opportunity in the organization or the
environment. The management can identify the symptoms or the effects of the problem, but to
understand the reasons for the problems, systematic research has to be adopted. This required
research should be executed either by a business research firm or by a business researcher.
One of the most frequently asked questions we get at Market Connections is “how much does
custom research cost?” That’s like walking into a car dealership and asking, “how much does a new
car cost?” Before answering either question, several factors need to be considered.
The same can be said when asking about the cost of research. Whether it’s focus groups, in-depth
interviews, or surveys, the price tag will depend on many factors, including what you want to
achieve through the research, who you want to ask the questions of, and how you plan to act on the
results. Like any major purchase, understanding your budget and priorities is important to help us
ensure we can properly scope the project to best meet your needs and priorities. This is not to
gouge you to the top of your range, but to maximize what we can provide, given any constraints.
The cost associated with research can be divided based on:
Qualitative Research
Quantitative Research
QUALITATIVE RESEARCH
Consider the price of a focus group study. Prices would vary depending on the number of groups,
seniority of participants, the narrowness of profession/expertise, or the location of groups. A
researcher may be able to secure a single, simple group of government IT professionals for $10
thousand or an eight-group study of mid- to senior-level professionals across multiple cities for $100
thousand. More typically, two groups of business or government participants can cost between $20 thousand
and $35 thousand, and four groups may cost $35-75 thousand.
Here are some other price-related issues affected by the target audience:
The budget depends on the number of audience types a researcher is targeting and
whether it makes sense to mix them into the same group or give each type its own group to
ensure an unbiased and more relevant discussion.
Where are the customers located? If they are scattered across the country or the globe, we
might very well drop the idea of an in-person group and recommend instead an online
focus group as more economical for you and more convenient for the participants.
Are you able to provide a contact list of the people you want us to recruit, or do you
want a firm to compile that list? This can affect the price dramatically, depending on
who the target is.
Is the target audience very senior, or a very specific and hard-to-reach segment? Do the
researchers intend to discuss highly complex or sensitive issues? Any of these conditions
may call for a change in strategy to more private one-on-one in-depth interviews.
While these are the most commonly asked questions, there may be additional factors that could
affect the cost of the project. The type of recording, analysis, reporting, participant incentives, and
travel can also impact the budget.
QUANTITATIVE RESEARCH
Conversely, the price for quantitative research can range widely, from $15 thousand to over $100
thousand, with most studies in the $30-$55 thousand range.
Collect Information
The main aim of collecting information is to find out which problems have already been
investigated and which need further investigation. It includes an extensive survey of
all available past studies relevant to the field of investigation. Its objective is to collect
background knowledge of the research topic. It also helps to identify the concepts relating
to it and the potential relationships between variables.
There are three main types of research design: Data collection, measurement, and analysis.
The type of research problem an organization is facing will determine the research design and not
vice-versa. The design phase of a study determines which tools to use and how they are used.
Experimental research design: Experimental research design establishes a relationship between the
cause and effect of a situation. It is a causal design where one observes the impact caused by the
independent variable on the dependent variable. For example, one monitors the influence of an
independent variable such as price on a dependent variable such as customer satisfaction or
brand loyalty. It is a highly practical research design method as it contributes to solving a problem
at hand. The independent variables are manipulated to monitor the change they produce in the dependent
variable. It is often used in social sciences to observe human behavior by analyzing two groups.
Researchers can have participants change their actions and study how the people around them
react to gain a better understanding of social psychology.
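The two-group comparison described above can be illustrated with a minimal Python sketch. The satisfaction scores, the variable names, and the choice of Welch's t statistic are assumptions made for illustration only; they are not taken from the text, and a real study would choose its own measures and tests.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic for the difference in means of two samples."""
    var_a, var_b = variance(group_a), variance(group_b)  # sample variances
    standard_error = sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical satisfaction scores (1-10) collected under two price levels.
# Price is the manipulated (independent) variable; satisfaction is the
# measured (dependent) variable.
regular_price = [7, 8, 6, 7, 9, 8, 7, 8]
raised_price = [6, 5, 7, 6, 5, 6, 7, 5]

effect = mean(regular_price) - mean(raised_price)
t_stat = welch_t(regular_price, raised_price)
print(f"mean difference = {effect:.2f}, Welch t = {t_stat:.2f}")
```

A larger absolute t value suggests that the difference between the groups is less likely to be due to chance alone, although interpreting it properly requires the degrees of freedom and a chosen significance level.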
Diagnostic research design: In diagnostic design, the researcher is looking to evaluate the
underlying cause of a specific topic or phenomenon. This method helps one learn more about the
factors that create troublesome situations.
Explanatory research design: Explanatory design uses a researcher’s ideas and thoughts on a
subject to further explore their theories. The research explains unexplored aspects of a subject and
details the what, how, and why of the research questions.
Data Analysis
Data analysis is defined as a process of cleaning, transforming, and modeling data to discover
useful information for business decision-making. The purpose of data analysis is to extract useful
information from data and to make decisions based on that analysis. A simple example of
data analysis occurs whenever we make a decision in day-to-day life by thinking about what
happened last time or what will happen if we choose a particular option. This is nothing but
analyzing our past or future and making decisions based on it.
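The three stages named in this definition (cleaning, transforming, and modeling) can be sketched in a few lines of Python. The sales records, the function names, and the stocking decision rule below are hypothetical and serve only to make the stages concrete.

```python
from statistics import mean

# Hypothetical raw records: (month, units sold as text); some entries are unusable.
raw_records = [("Jan", "120"), ("Feb", ""), ("Mar", "135"), ("Apr", "n/a"), ("May", "150")]


def clean(records):
    """Cleaning: keep only records whose value is a whole number."""
    return [(month, value) for month, value in records if value.isdigit()]


def transform(records):
    """Transforming: convert the text values to integers."""
    return [(month, int(value)) for month, value in records]


def model(records):
    """Modeling: summarize with the mean and the average change between kept records."""
    values = [value for _, value in records]
    changes = [later - earlier for earlier, later in zip(values, values[1:])]
    return {"mean_units": mean(values), "mean_change": mean(changes)}


summary = model(transform(clean(raw_records)))
# Decision step: a deliberately simple, assumed rule based on the trend.
decision = "increase stock" if summary["mean_change"] > 0 else "hold stock"
print(summary, decision)
```

Keeping each stage in its own function mirrors the definition and makes it easy to inspect the data after every step.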
Write down the steps needed to conduct research on “Attitude of Bank customers towards online
banking”.
Research Problem
Defining a research problem is the fuel that drives the scientific process and is the foundation of
any research method and experimental design, from true experiment to case study. It is one of the
first statements made in any research paper and, as well as defining the research area, should
include a quick synopsis of how the hypothesis was arrived at. Operationalization is then used to
give some indication of the exact definitions of the variables, and the type of scientific
measurements used.
This will lead to the proposal of a viable hypothesis. As an aside, when scientists are putting
forward proposals for research funds, the quality of their research problem often makes the
difference between success and failure.
As an example, a literature review and a study of previous experiments, and research, might throw
up some vague areas of interest. Many scientific researchers look at an area where a previous
researcher generated some interesting results but never followed up. It could be an interesting area
of research, which nobody else has fully explored. A scientist may even review a successful
experiment, disagree with the results, the tests used, or the methodology, and decide to refine the
research process, retesting the hypothesis.
This is called the conceptual definition and is an overall view of the problem. A science report will
generally begin with an overview of the previous research and real-world observations. The
researcher will then state how this led to defining a research problem.
What is happening?
Basic characteristics of research problem:
• Reflecting on important issues or needs
• Formulating a research problem is the first and most important step of the research
process.
• research instrument
• type of analysis
• Experts
• Observation
• Literature reviews
• Replication of studies
Research Design
The word ‘design’ has various meanings. In the context of research, however, it is a pattern or an outline
of the research project’s workings. It is the statement of the essential elements of a study that provides
basic guidelines for conducting the project. It is comparable to the blueprint for an architect’s work.
The research design is similar to a broad plan or model that states how the entire research project
would be conducted. It should be in written form and should be simple and clearly
stated. The real project is carried out as per the research design laid down in advance.
2. Type of data needed
7. Probable output or research outcomes and possible actions to be taken based on those outcomes
The exploratory research design is used to increase the familiarity of the analyst with a problem
under investigation. This is particularly true when the researcher is new in the area, or when a
problem is of a different type.
4. Developing hypotheses
Exploratory research design is characterized by the flexibility to gain insights and develop
hypotheses. It does not follow a planned questionnaire or sampling. It is based on a literature
survey, experimental survey, and analysis of selected cases. Unstructured interviews are used to
offer respondents a great deal of freedom. No research project is purely and solely based on this
design. It is used as complementary to descriptive design and causal design.
[Figure: Exploratory research methods: secondary data analysis, expert survey, focus group interview, depth interview, case analysis, and projective techniques]
Depth Interviews: While you may get a lot of information from public sources,
sometimes an in-person interview can give in-depth information on the subject being
studied. Such research is a qualitative research method. An interview with a subject
matter expert can give you meaningful insights that a generalized public source won’t
be able to provide. Interviews are carried out in person or by telephone and use
open-ended questions to get meaningful information about the topic.
For example, an interview with an employee can give you more insight into the
degree of job satisfaction, or an interview with a subject matter expert in quantum
theory can give you in-depth information on that topic.
Focus group Interviews: The focus group is yet another widely used method in
exploratory research. In this method, a group of people is chosen and allowed to
express their insights on the topic that is being studied. It is important, however, to
make sure that the individuals chosen for a focus group have a
common background and comparable experiences.
For example, a focus group helps researchers identify the opinions of consumers if they were to
buy a phone. Such research can help the researcher understand what the consumer values when
buying a phone. It may be screen size, brand value, or even the dimensions. Based on this, the
organization can understand consumer buying attitudes, consumer opinions, etc.
Case Analysis: A case study is an in-depth study of a particular situation rather than a
sweeping statistical survey. It is a method used to narrow down a very broad field of
research into one easily researchable topic. Whilst it will not answer a question
completely, it will give some indications and allow further elaboration
and hypothesis creation on a subject. The case study research design is also useful for
testing whether scientific theories and models work in the real world.
Third-person techniques
Online research: In today’s world, this is one of the fastest ways to gather information
on any topic. A lot of data is readily available on the internet and the researcher can
download it whenever he/she needs it. An important aspect to be noted for such research
is the genuineness and authenticity of the source websites from which the researcher is
gathering the information.
For example, a researcher needs to find out the percentage of people who prefer a specific
phone brand. The researcher just enters the information needed in a search engine and gets
multiple links with related information and statistics.
Literature research: Literature research is one of the least expensive methods used for
discovering a hypothesis. There is a tremendous amount of information available in
libraries, online sources, and even commercial databases. Sources can include
newspapers, magazines, library books, documents from government agencies,
topic-specific articles, literature, annual reports, published statistics from
research organizations, and so on.
However, a few things have to be kept in mind while researching from these sources. Government
agencies have authentic information, but it may sometimes come at a nominal cost. Also, research
from educational institutions is generally overlooked, but in fact, educational institutions carry out
more research than any other entities. Furthermore, commercial sources provide information on
major topics like political agendas, demographics, financial information, market trends, and so on.
For example, a company has low sales. Available statistics and market literature can readily be
explored to determine whether the problem is market-related or organization-related. Similarly, if the topic being
studied concerns the financial situation of the country, research data can be accessed
through government documents or commercial sources.
Descriptive research design is typically concerned with describing the problem and its solution. It is
a more specific and purposive study. Before rigorous attempts are made for descriptive study, the
well-defined problem must be on hand. The descriptive study rests on one or more hypotheses.
For example, “our brand is not much familiar,” “sales volume is stable,” etc. It is more precise and
specific. Unlike exploratory research, it is not flexible. Descriptive research requires clear
specification of who, why, what, when, where, and how of the research. Descriptive design is
directed to answer these problems. It is further classified into Cross-Sectional and Longitudinal
Study types.
Figure 2.3: Descriptive Research Classification (descriptive research is classified into cross-sectional studies and longitudinal studies)
independent variables (like price, products, advertising and selling efforts or marketing strategies
in general) on dependent variables (like sales volume, profits, and brand image and brand loyalty).
It has more practical value in resolving marketing problems. We can set and test hypotheses by
conducting experiments. Test marketing is the most suitable example of experimental marketing, in
which independent variables such as price, product, and promotional efforts are manipulated
(changed) to measure their impact on dependent variables such as sales, profits, brand loyalty,
competitive strength, product differentiation, and so on.
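As a minimal illustration of the test-marketing logic described above, the sketch below compares average sales (the dependent variable) under two price levels (the manipulated independent variable). The sales figures and the helper name `mean_effect` are hypothetical, chosen only to show the arithmetic; an actual study would also need a significance test and control over other variables.

```python
from statistics import mean


def mean_effect(control_sales, treatment_sales):
    """Return the difference in average sales between the two conditions."""
    return mean(treatment_sales) - mean(control_sales)


# Hypothetical weekly unit sales in comparable test markets (illustrative only).
sales_regular_price = [120, 115, 130, 125]
sales_reduced_price = [140, 150, 135, 145]

# Positive value: the reduced price is associated with higher average sales.
effect = mean_effect(sales_regular_price, sales_reduced_price)
print(f"Average change in weekly sales: {effect}")
```

A positive difference only suggests an effect; whether it is attributable to the price change depends on how well the test markets are matched.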
Review Questions
1 What are the steps in business research process design?
2 What is the difference between a management problem and a research problem?
3 What are the different types of research?
4 For what purposes is exploratory research used?
5 What is descriptive research, and when do researchers conduct it?
12. Which of the following are techniques of qualitative research?
(a) Depth interview
(b) Focus group
(c) Projective technique
(d) All of the above
Answers: Self-Assessment:
6. (d) 7. (c) 8. (b) 9. (a) 10. (d) 11. (c) 12. (d) 13. Qualitative and Quantitative 14. Secondary
15. Causal Research
Further Readings:
1. Business Research Methods by Naval Bajpai, Pearson
2. Research Methodology: Methods and Techniques by Kothari, C. R. & Garg,
Gaurav, New Age International.
3. Marketing Research by Naresh K Malhotra, Pearson
1. https://www.iedunote.com/research-process
2. www.wisdomjobs.com/e-university/research-methodology-tutorial-355/motivation-in-research
3. www.muet-crp.yolasite.com
4. www.callygood.medium.com
5. www.smallbusiness.chron.com
6. www.courses.lumenlearning.com
7. www.iedunote.com
Objectives
After studying this unit, you will be able to
Introduction
Research creates the need to draw boundaries around an idea, topic or subject area. It is helpful to
think about how and where information for the area under research is generated. For this, the
researcher needs to identify the branches of knowledge creation in a subject area.
Information does not exist in the environment like a raw material; instead, it is created by individuals
working in a specific field of knowledge who use specialized methods to generate new knowledge.
Disciplines use, produce and disseminate knowledge. Looking at a list of university courses
reveals the key to this disciplinary structure. Areas such as political science, biology, history and
mathematics are each unique disciplines, as is social work. Each has its own logic for how new
knowledge is introduced and made accessible. The researcher needs to be comfortable
identifying the disciplines that can provide information in any search. For example, think of
the disciplines that can provide information on a topic such as the role of sport in society. A researcher
tries to anticipate what kind of perspective each discipline has on the subject. The following types
of questions can examine what different disciplines can contribute.
In general, a literature review embraces a conceptual review, a theoretical review, and an empirical review.
In the first part of the review, the researcher generally needs to clarify the research topic by
explaining its key terminology. For this purpose, researchers define terms
and describe the research problem. In the theoretical review section, researchers discuss related
theories that back up the proposed study. By reviewing empirical studies, researchers briefly
summarize previous research and show the research gap(s) through critique. In doing so,
researchers "highlight the agreement and differences between the authors / theories and identify
unanswered questions or gaps" (Kumar, 1996, p.30). In any literature review, the goal of the
researcher is to identify the existing gap. A literature review is therefore a daunting task, but it is
essential if the research process is to be successful. Moreover, it “gives credibility and legitimacy to
research” (Cohen, Manion and Morrison, 2011, p. 112).
The purpose of a literature review is to convey to the reader the previous knowledge and facts established on
an issue, along with their strengths and weaknesses. A literature review brings the reader up to date with the
status of research in the field by summarizing, evaluating and comparing original research.
Some of these purposes are as follows:
• It shows what is known and not known about the field of research, to find out how
research can best contribute to that knowledge. A literature review also attempts to describe
the strengths and weaknesses of the designs, investigation methods and
instruments used in previous research work, and shows any gaps or inconsistencies in the
body of knowledge.
• It also helps in the development of research instruments, the identification of suitable designs for
research studies, and the choice of data collection methods.
It is important to recognize three layers of knowledge in each field. First, there are
primary studies that researchers conduct and publish. Second, there are reviews of those studies that summarize
and offer new interpretations, often extending beyond the original studies. Third, there are
assumptions, conclusions, opinions and interpretations that are shared informally and form part of
the current understanding in the field.
When designing a literature review, it is important to note that this third level of knowledge is
often cited as "true", although it frequently has only a loose relationship with the primary studies and
secondary literature reviews. Given this, while literature reviews are designed to provide an
overview and synthesis of the relevant sources a researcher has discovered, there are many
approaches to how they can be done, depending on the type of analysis a researcher is considering.
Argumentative Review
This form examines the literature selectively in order to support or refute an argument, deeply embedded assumption,
or philosophical problem already established in the literature. The aim is to develop a body of
literature that establishes a contrary perspective. Given the value-laden nature of some social
science research [e.g. educational reform; immigration control], an argumentative approach to
analysing the literature can be a genuine and important form of discourse. However, such reviews can also introduce
problems of bias when used to make summary claims of the sort found in systematic reviews.
Integrative Review
An integrative review is a form of research that reviews, critiques, and synthesizes
representative literature on a topic in an integrated way, generating new frameworks and perspectives. The body of literature
includes all studies that address related or similar hypotheses. A well-conducted integrative review
meets the same standards as primary research with regard to clarity, rigor, and replication.
Historical Review
Historical reviews focus on examining research throughout a period, often starting with the
first time an issue, concept, theory, or phenomenon emerged in the literature, then tracing its evolution
within the scope of a discipline. The purpose is to place research in a historical context to show
familiarity with state-of-the-art developments and to identify the likely directions for future
research.
Methodological Review
Reviews do not always focus on what someone said [the content], but on how they said it [the method of
analysis]. This approach provides a framework of understanding at different levels (e.g. theory,
substantive fields, research approaches, and data collection and analysis techniques), enabling
researchers to draw on a wide range of knowledge, from the conceptual level to practical documents. It
is useful in areas of inquiry related to ontological and epistemological considerations,
quantitative and qualitative integration, sampling, interviewing, data collection and data analysis,
and it helps highlight many ethical issues that a researcher should be aware of and consider
when conducting the study.
Systematic Review
This form of literature review contains an overview of the existing evidence related to a clearly
formulated research question, and uses pre-determined and standardized methods to identify and
critically evaluate the relevant research. It is further characterised by the collection, reporting, and
analysis of data from previous studies. Typically, it focuses on a very specific empirical question,
often posed in cause-and-effect form, such as "How much does A contribute to B?"
Theoretical Review
The purpose of this form of review is to examine the body of theory that has accumulated about an issue,
concept, theory, or event. A theoretical literature review helps to establish what theories already
exist, the relationships between them, and the degree to which existing theories have been examined,
and to develop new hypotheses to test. This form of literature review is often used to reveal
that current theories are inadequate for explaining new or emerging research problems.
The unit of analysis can be a theoretical concept or a whole theory or framework.
Sources of Literature
The literature refers to the collection of scholarly writings on a topic. This includes peer-reviewed
articles, books, dissertations and conference papers. Sources are considered primary, secondary, or
tertiary depending on the originality of the information presented and how
close they are to the original source of information. This distinction can differ between subjects and
disciplines.
In the current scenario, research findings may be communicated informally between researchers
through email, presented at conferences (primary source), and then, possibly, published as a
journal article or technical report (primary source). Once published, the information may be
commented on by other researchers (secondary sources), and/or professionally indexed in a
database (secondary sources). Later the information may be summarized into an encyclopaedic or
reference book format (tertiary sources).
Primary Sources
A primary source is a document or record that reports on a study, experiment, trial or research
project. Primary sources are usually written by the person(s) who did the research, conducted the
study, or ran the experiment, and include the hypothesis, methodology, and results. Examples include:
Pilot/prospective studies
Cohort studies
Survey research
Case studies
Lab notebooks
Clinical trials
Dissertations
Secondary Sources
Secondary sources contain descriptions of studies prepared by someone other
than the original researcher. Secondary sources list the primary information and studies.
They summarize, compare and evaluate them to present the current state of knowledge in the subject.
Secondary sources may include a bibliography that leads to the primary research. The various types
of secondary sources are:
• Electronic database
• Printed Sources
• Conference Papers
• Theses
• Encyclopaedia/Dictionary
• Research Reports
Electronic database
An electronic literature search is very useful, but it can sometimes be time-consuming and
unpredictable because the many websites and web pages can lead to information
overload and confusion. However, some online databases make it easy to
find the research published in online journals.
Printed Sources
Printed sources include journals, trade publications, and magazines. Printed sources serve
as sources of literature in various forms, which are listed below:
Magazines: A magazine is a collection of articles and images about diverse topics of popular
interest and current events.
Features of magazines:
Trade Publications: Trade publications or trade journals are periodicals directed to members of a
specific profession. They often have information about industry trends and practical information
for people working in the field.
Scholarly, Academic, and Scientific Publications: Scholarly, academic, and scientific publications
are collections of articles written by scholars in an academic or professional field. Most journals
are peer-reviewed or refereed, which means a panel of scholars reviews articles to decide if they
should be accepted into a specific publication. Journal articles are the main source of information
for researchers and for literature reviews.
Features of journals:
Primary, Secondary, and Tertiary Sources: Primary sources are those types of
information that come first. Examples include original research, such as data from an experiment, as
well as diaries, journals, photographs, and data from the census bureau or a survey.
There are different types of primary sources for different disciplines. In the discipline of history,
for example, a diary or transcript of a speech is a primary source. In education and nursing,
primary sources will generally be original research, including data sets.
Secondary sources are written about primary sources to interpret or analyze them. They are a step
or more removed from the primary event or item. Some examples of secondary sources are
commentaries on speeches; critiques of plays, journalism, or books; and journal articles that discuss a
primary source. Textbooks and biographies are further examples.
Conference Papers: Conference papers are articles written with the goal of being
accepted to a conference: typically an annual (or biannual) venue with a specific scope where you
can present your results to the community, usually as an oral presentation, a poster presentation, or
a tabled discussion.
Thesis: A thesis is a proposition based on the author's own ideas and research and presented in a logical
way. It is one of the most important concepts of college expository writing. It usually draws on
several pieces of original research that have already been carried out and seeks to establish a framework
for a strong opinion.
Thesis can be a source of literature and can be written at various levels as mentioned below:
a. Undergraduate Thesis
b. Masters Thesis
c. Doctoral Thesis
Research Reports: Research reports are recorded data prepared by researchers or statisticians after
analysing information gathered by conducting organized research, typically in the form of surveys
or qualitative methods.
A quality journal approves an article for publication only after having it reviewed by
various subject experts.
Planned and focused: answers the question and demonstrates an understanding of the
subject.
Structured: is coherent, written in a logical order, and brings together related points
and material.
Evidenced: demonstrates knowledge of the subject area, supports opinions and
arguments with evidence, and is referenced accurately.
Formal in tone and style: uses appropriate language and tenses, and is clear, concise
and balanced.
Citation
Broadly, a citation is a reference to a published or unpublished source (not always the original
source). More precisely, a citation is an abbreviated alphanumeric expression embedded in the body
of an intellectual work that denotes an entry in the bibliographic references section of the work for
the purpose of acknowledging its relevance. Generally, the combination of both the in-body citation
and the bibliographic entry constitutes what is commonly thought of as a citation.
Referencing
Academic writing relies on more than just the ideas and experience of one author. It also uses the
ideas and research of other sources: books, journal articles, websites, and so forth. Referencing is
used to tell the reader where ideas from other sources have been used in an assignment. It shows
the reader that you can find and use sources to create a solid argument. It properly credits the
originators of ideas, theories, and research findings. It shows the reader how your argument relates
to the big picture. Different academic disciplines have different priorities about what is important to the
subsequent reader of an academic paper, and different journals have differing rules about the
citation of sources.
Referencing Styles
APA
APA stands for "American Psychological Association" and comes from the association of the same
name. Although originally drawn up for use in psychological journals, the APA style is now widely
used in the social sciences, in education, in business, and numerous other disciplines.
Example: - Pinker, S. (1999). Words and rules: The ingredients of language. London: Phoenix
Chicago
Example: - Grazer, Brian, and Charles Fishman. A Curious Mind: The Secret to a Bigger Life. New
York: Simon & Schuster, 2015.
Vancouver
The Vancouver style originally came from the International Committee of Medical Journal Editors, which produced the
"Uniform Requirements for Manuscripts Submitted to Biomedical Journals" following a meeting
that was held in Vancouver in 1978 [Source: Jönköping University Library].
The Vancouver style is used mainly in the medical sciences.
Example: - Ramalho R, Helffrich G, Schmidt DN, Vance D. Tracers of uplift and subsidence in the
Harvard
The Harvard style is said to have come originally from "The Bluebook: A Uniform System of Citation" published by the
Harvard Law Review Association. The Harvard style and its many variations are used in law,
natural sciences, social and behavioural sciences, and medicine.
Example: - Neville, C 2010, The complete guide to referencing and avoiding plagiarism, Open
University Press, New York.
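The difference between styles is largely a matter of ordering and punctuation. The sketch below is a minimal illustration, not an implementation of any official style manual: it formats a single-author book record following the punctuation pattern of the APA and Harvard examples given in this section. The `Book` class and the function names `format_apa` and `format_harvard` are hypothetical, and multiple authors, editions, and other source types are not handled.

```python
from dataclasses import dataclass


@dataclass
class Book:
    surname: str
    initials: str  # author's initial(s) without full stops, e.g. "S"
    year: int
    title: str
    place: str
    publisher: str


def format_apa(b: Book) -> str:
    # Pattern of the APA example above: Surname, I. (Year). Title. Place: Publisher.
    return f"{b.surname}, {b.initials}. ({b.year}). {b.title}. {b.place}: {b.publisher}."


def format_harvard(b: Book) -> str:
    # Pattern of the Harvard example above: Surname, I Year, Title, Publisher, Place.
    return f"{b.surname}, {b.initials} {b.year}, {b.title}, {b.publisher}, {b.place}."


pinker = Book("Pinker", "S", 1999,
              "Words and rules: The ingredients of language", "London", "Phoenix")
neville = Book("Neville", "C", 2010,
               "The complete guide to referencing and avoiding plagiarism",
               "New York", "Open University Press")

print(format_apa(pinker))
print(format_harvard(neville))
```

In practice, reference managers apply these patterns automatically, and the current edition of the relevant style guide remains the authority on details.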
The researcher should select the referencing style with care, as failing to give credit can
seriously harm a researcher's reputation and is unethical as well.
In the course of a literature review, researchers need to follow a long procedure. For example,
researchers have to work out the key terminology that helps make the concept clear to
themselves and to other readers. For this, they may have to use primary or secondary
data sources. Similarly, they may have to strive hard to locate these sources and evaluate them critically. Some of
the sources may be useful and valid, and others may not be. On top of this, writing them up in
a prescribed format is a tedious procedure. For this, researchers need to follow numerous steps and
sub-steps. These steps are the following:
Select a topic
Survey the literature
Selection of Topic
• Types of sources that can be included: Books, Articles, Abstracts, Reviews, Dissertations
• Develop an understanding of the academic terminology for your field of study and
• Look for empirical and theoretical literature, and also include primary and secondary
sources.
• Identify important authors who are contributing to the development of the topic under
research and use a system to organize and manage material. Example: Mendeley, RefWorks
Types of Claims
• Fact
• Worth
• Policy
• Concept
• Interpretation
• Go back and skim the preface and introduction, try to identify main ideas contained in the
work
• Identify key parts of the article or key chapters in books
Stage 2: Highlight and Extract Key Elements
• Try to understand historical context and current state
• Identify themes, trends, patterns
• Also look for gaps and anomalies
Key questions related to literature:
• What are the origins and definitions of the topic?
• What are the key theories, concepts, and ideas?
• What are the major debates, arguments, and issues?
The key elements that all research studies should include:
• Problem
• Purpose
• Research questions
• Sample
• Methodology
• Key findings
• Conclusions
• Recommendations
3.3 Summary
A review of scholarly literature provides information that can be used to investigate a topic of
importance to learn what is known about that topic for its own sake (i.e., to improve teaching or
therapeutic practices) or as a basis for designing a research study. The formulation of a research
topic is enabled by reading about research that has already been conducted because the reader can
figure out what is already known as well as become acquainted with the strengths and weaknesses
of methods used in prior research. Multiple sources exist for the conduct of literature reviews,
including secondary sources that provide an overview of past research and primary sources that
report original research. Primary sources can be identified through several different electronic
means. A literature review is used to develop research questions of different types, such as
descriptive, correlational, or interventionist. Researchers can also benefit by looking outside of
published scholarly research to community members to provide a different perspective on what
needs to be studied and how it should be studied.
description, summary, and critical evaluation of these works in relation to the research
problem being investigated.
The purpose of a literature review: Place each work in the context of its contribution to
understanding the research problem being studied and describe the relationship of each
work to the others under consideration.
Types of Literature Reviews: Argumentative Review, Integrative Review, Historical Review,
Methodological Review.
APA style of referencing: This style of referencing is known as "American Psychological Association"
referencing.
Literature Research: It refers to "referring to the literature to develop a new hypothesis".
(a) Conducted after you have decided upon your research question
(b) Helps in the formulation of your research aim and research question
(c) Is the last thing to be written in your research report
(d) Is not part of a research proposal
10. Which is the most reliable source of information for your literature review?
(a) A TV documentary
(b) A newspaper article
(c) A peer reviewed research article
(d) A relevant chapter from a textbook
Objectives
After studying this unit, you will be able to:
Introduction
A business researcher has to tackle the problem of converting the management question into a
research question. To do this, the researcher must have some information readily available before
formally starting an experiment or research. This information is also important for understanding
the different dimensions of a management problem. Readily available data sources also provide an
opportunity to access the work of other researchers who faced similar kinds of problems. This allows
researchers to develop their research problems in a more comprehensive
manner. The available data sources are also important for identifying the relevant variables to be
included in the study and for framing the research questions properly. In the modern era, when
computers and Internet access are available everywhere, it is important for a researcher to
focus on the right source of data. This helps him or her concentrate on the relevant
source, so that research energy is not consumed in searching a practically unlimited pool of sources,
particularly web sources. The quantity of data will never be a problem for a researcher, but
the time and cost efficiency of obtaining it will be a matter of concern. The chapter begins with
a discussion of the difference between primary and secondary data.
Primary data collection methods are the different ways in which primary data can be collected.
Some of the tools used in collecting primary data are highlighted below:
Interviews
An interview is a method of data collection that involves two parties: the
interviewer (the researcher(s) asking questions and collecting data) and the
interviewee (the subject or respondent who is asked the questions). The questions and responses
during an interview may be oral or written, as the case may be. Interviews can be carried out in 2
ways, namely in-person interviews and telephonic interviews. An in-person interview requires an
interviewer or a group of interviewers to ask the interviewee questions face to face.
It can be direct or indirect, structured or unstructured, focused or unfocused, etc. Some of the
tools used in carrying out in-person interviews include a notepad or recording device to take note
of the conversation, which is very important because people forget.
Telephonic interviews, on the other hand, are carried out over the phone through ordinary voice
calls or video calls. The 2 parties involved may decide to use a video-calling service such as Skype to carry out
the interview. A mobile phone, laptop, tablet, or desktop computer with an internet connection is
required for this.
Pros
Cons
● It is more time-consuming.
● It is expensive.
● The interviewer may be biased.
Surveys and questionnaires are 2 similar tools used in collecting primary data. They are a group of
questions typed or written down and sent to the study sample for responses. After the
required responses are given, the survey is returned to the researcher for recording. It is advisable to conduct a
pilot study in which the questionnaires are filled in by experts in order to assess weaknesses in the
questions or techniques used.
There are 2 main types of surveys used for data collection, namely online and offline
surveys. Online surveys are carried out using internet-enabled devices like mobile phones, PCs,
tablets, etc. They can be shared with respondents through email, websites, or social media. Offline
surveys, on the other hand, do not require an internet connection to be carried out. The
most common type of offline survey is the paper-based survey. However, there are also offline
surveys that can be filled in on a mobile device without access to an internet connection.
These are called online-offline surveys because they can be filled in offline but require an
internet connection to be submitted.
Cons
Observation
The observation method is mostly used in studies related to behavioral science. The researcher uses
observation as a scientific tool and method of data collection. Observation as a data collection tool is
usually systematically planned and subjected to checks and controls. There are different approaches
to the observation method: structured or unstructured, controlled or uncontrolled, and
participant, non-participant, or disguised.
The evaluator may also decide to observe from outside the class, becoming a non-participant. An
evaluator may also be asked to stay in class disguised as a student, to carry out a disguised
observation.
Pros
Cons
Focus Groups
Focus Groups are a gathering of 2 or more people with similar characteristics or who possess
common traits. They seek open-ended thoughts and contributions from participants. A focus group
is a primary source of data collection because the data is collected directly from the participant. It is
commonly used for market research, where a group of market consumers engages in a discussion
with a research moderator.
It is slightly similar to interviews, but this involves discussions and interactions rather than
questions and answers. Focus groups are less formal and the participants are the ones who do most
of the talking, with moderators there to oversee the process.
● It incurs a low cost compared to interviews. This is because the interviewer does not have
to discuss with each participant individually.
● It takes less time too.
Cons
● Response bias is a problem in this case because a participant might be influenced by what
other people will think of a sincere opinion.
● Group thinking does not clearly mirror individual opinions.
Experiments
An experiment is a structured study in which researchers attempt to understand the causes,
effects, and processes involved in a particular phenomenon. This data collection method is usually
controlled by the researcher, who determines which subjects are used, how they are grouped, and the
treatment they receive.
During the first stage of the experiment, the researcher selects the subjects to be studied.
Some actions are then carried out on these subjects, while the primary data, consisting of the
actions and reactions, are recorded by the researcher. The data are then analyzed, and a
conclusion is drawn from the result of the analysis. Although experiments can be used to
collect different types of primary data, they are mostly used for data collection in the laboratory.
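One common way for a researcher to decide how subjects are grouped is random assignment, which the text does not prescribe but which fits the description above. The following is a minimal sketch under that assumption; the function name `assign_groups`, the subject labels, and the seed value are hypothetical.

```python
import random


def assign_groups(subjects, seed=0):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)   # fixed seed so the assignment can be reproduced
    shuffled = list(subjects)   # copy, so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical subject identifiers.
subjects = [f"S{i}" for i in range(1, 9)]
groups = assign_groups(subjects, seed=42)

print("Treatment group:", groups["treatment"])
print("Control group:  ", groups["control"])
```

Random assignment helps ensure that differences between the groups are due to the treatment rather than to how the subjects were selected.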
Pros
● It is usually objective since the data recorded are the results of a process.
● Non-response bias is eliminated.
Cons
Secondary data is the data that has already been collected through primary sources and made
readily available for researchers to use for their own research. It is a type of data that has already
been collected in the past. A researcher may have collected the data for a particular project, then
made it available to be used by another researcher. The data may also have been collected for
general use with no specific research purpose like in the case of the national census.
Data classified as secondary for particular research may be said to be primary for another research.
This is the case when data is being reused, making it primary data for the first research and
secondary data for the second research it is being used for.
Sources of secondary data include books, personal sources, journals, newspapers, websites,
government records, etc. Secondary data are known to be more readily available than primary
data. Using these sources requires very little research effort and manpower. With the advent
of electronic media and the internet, secondary data sources have become more easily accessible.
Some of these sources are highlighted below.
Books
Books are one of the most traditional ways of collecting data. Today, there are books available for
all topics you can think of. When carrying out research, all you have to do is look for a book on the
topic being researched on, then select from the available repository of books in that area. Books,
Published Sources
There are a variety of published sources available for different research topics. The authenticity of
the data generated from these sources depends majorly on the writer and publishing company.
Published sources may be printed or electronic as the case may be. They may be paid or free
depending on the writer and publishing company's decision.
This may not be readily available and easily accessible compared to the published sources. They
only become accessible if the researcher shares with another researcher who is not allowed to share
it with a third party. For example, the product management team of an organization may need data
on customer feedback to assess what customers think about their product and improvement
suggestions. They will need to collect the data from the customer service department, which
primarily collected the data to improve customer service.
Journal
Journals are gradually becoming more important than books these days where data collection is
concerned. This is because journals are updated regularly with new publications,
therefore giving up-to-date information. Also, journals are usually more specific when it comes to
research. For example, we can have a journal on "Secondary data collection for quantitative data",
while a book will simply be titled "Secondary data collection".
Newspapers
In most cases, the information passed through a newspaper is usually very reliable, making
it one of the most authentic sources of secondary data. The kind of data commonly
shared in newspapers is usually more political, economic, and educational than scientific.
Therefore, newspapers may not be the best source for scientific data collection.
Websites
The information shared on websites is mostly not regulated and as such may not be trusted
compared to other sources. However, some regulated websites only share authentic data and can
be trusted by researchers. Most of these websites are usually government websites or private
organizations that are paid data collectors.
Blogs
Blogs are one of the most common online sources for data and may even be less authentic than
websites. These days, practically everyone owns a blog, and a lot of people use these blogs to drive
traffic to their website or make money through paid ads. Therefore, they cannot always be trusted.
For example, a blogger may write good things about a product because he or she was paid to do so
by the manufacturer even though these things are not true.
Diaries
Diaries are personal records and as such are rarely used for data collection by researchers. They
are usually private, although these days some people share public diaries containing
specific events in their lives. A well-known example is Anne Frank's diary, which contains a
first-hand record of life in hiding during the Nazi occupation.
Government Records
Government records are a very important and authentic source of secondary data. They contain
information useful in marketing, management, humanities, and social science research. Some of
these records include census data, health records, education institute records, etc. They are usually
collected to aid proper planning, allocation of funds, and prioritizing of projects.
Podcasts
Podcasts are gradually becoming very common these days, and a lot of people listen to them as an
alternative to radio. They are more or less like online radio stations and are gaining
popularity. Information is usually shared during podcasts, and listeners can use it as a source of
data. Some other sources of data collection include:
● Letters
● Radio stations
● Public sector records.
Quantitative methods emphasize objective measurements and the statistical, mathematical, or numerical
analysis of data collected through polls, questionnaires, and surveys, or by manipulating pre-existing
statistical data using computational techniques. Quantitative research focuses on gathering numerical data
and generalizing it across groups of people, or using it to predict or explain a particular phenomenon.
Qualitative Research
Qualitative research is used to understand how people experience the world. While there are many
approaches to qualitative research, they tend to be flexible and focus on retaining rich meaning
when interpreting data. Common approaches include grounded theory, ethnography, action
research, phenomenological research, and narrative research. They share some similarities but
emphasize different aims and perspectives.
● Observations: recording what you have seen, heard, or encountered in detailed field
notes.
● Interviews: personally asking people questions in one-on-one conversations.
● Focus groups: asking questions and generating discussion among a group of people.
● Surveys: distributing questionnaires with open-ended questions.
● Secondary research: collecting existing data in the form of texts, images, audio or video
recordings, etc.
● Purposeful - study cases are selected because they are “information-rich” and illuminative.
That is, they offer useful manifestations of the phenomenon of interest; sampling is aimed
at insight about the phenomenon, not empirical generalization derived from a sample and
applied to a population.
● Data - observations yield detailed accounts and quotations about people’s perspectives and lived
experiences; these are often derived from carefully conducted case studies and review of material
culture.
● Personal experience and engagement - researcher has direct contact with and gets close to
the people, situation, and phenomenon under investigation; the researcher’s personal
experiences and insights are an important part of the inquiry and critical to understanding
the phenomenon.
Most types of qualitative data analysis share the same five steps:
Flexibility
The data collection and analysis process can be adapted as new ideas or patterns emerge. They are
not rigidly decided beforehand.
Natural settings
Data collection occurs in real-world contexts or in naturalistic ways.
Meaningful insights
Detailed descriptions of people’s experiences, feelings and perceptions can be used in designing,
testing or improving systems or products.
Unreliability
The real-world setting often makes qualitative research unreliable because of uncontrolled factors
that affect the data.
Subjectivity
Due to the researcher’s primary role in analyzing and interpreting data, qualitative research cannot
be replicated. The researcher decides what is important and what is irrelevant in data analysis, so
interpretations of the same data can vary greatly.
Limited generalizability
Small samples are often used to gather detailed data about specific contexts. Despite rigorous
analysis procedures, it is difficult to draw generalizable conclusions because the data may be biased
and unrepresentative of the wider population.
Labor-intensive
Although software can be used to manage and record large amounts of text, data analysis often has
to be checked or performed manually.
Quantitative Research
Quantitative research deals in numbers, logic, and an objective stance. Quantitative research
focuses on numeric and unchanging data and detailed, convergent reasoning rather than divergent
reasoning [i.e., the generation of a variety of ideas about a research problem in a spontaneous, free-
flowing manner].
Correlational and experimental research can both be used to formally test hypotheses, or
predictions, using statistics. The results may be generalized to broader populations based on
the sampling method used. To collect quantitative data, you will often need to
use operational definitions that translate abstract concepts (e.g., mood) into observable and
quantifiable measures (e.g., self-ratings of feelings and energy levels).
Descriptive statistics will give you a summary of your data and include measures of averages and
variability. You can also use graphs, scatter plots and frequency tables to visualize your data and
check for any trends or outliers.
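The summary measures mentioned above can be sketched in a few lines of Python. This is only an illustration: the `ratings` list is hypothetical data (self-ratings of energy on a 1-10 scale), and only the standard library is used.

```python
import statistics
from collections import Counter

# Hypothetical self-ratings of energy level on a 1-10 scale (illustrative data only).
ratings = [6, 7, 5, 8, 7, 6, 9, 7, 4, 7]

# Measures of average
mean = statistics.mean(ratings)
median = statistics.median(ratings)
mode = statistics.mode(ratings)

# Measures of variability
value_range = max(ratings) - min(ratings)
stdev = statistics.stdev(ratings)  # sample standard deviation

# Frequency table: how often each rating occurs, in ascending order of rating.
frequency_table = sorted(Counter(ratings).items())

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={value_range}, stdev={stdev:.2f}")
for value, count in frequency_table:
    # A crude text histogram helps to spot trends or outliers.
    print(f"{value:>2} | {'*' * count}")
```

The text histogram plays the role of the graphs and frequency tables described above; a plotting library could be substituted where one is available.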
Using inferential statistics, you can make predictions or generalizations based on your data. You
can test your hypothesis or use your sample data to estimate the population parameter.
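As a rough sketch of these two uses of inferential statistics, the code below estimates a population mean with a 95% confidence interval and tests a hypothesis about that mean. The `sample` values and the `hypothesized_mean` are hypothetical, and the normal approximation is an assumption made for simplicity; with small samples a t-distribution would normally be preferred.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample of measurements (illustrative data only).
sample = [52, 48, 55, 60, 47, 53, 58, 49, 51, 56, 54, 50]

n = len(sample)
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(n)

# Estimation: two-sided 95% confidence interval for the population mean
# (normal approximation, assumed here for simplicity).
z = NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

# Hypothesis test: H0 says the population mean equals hypothesized_mean.
hypothesized_mean = 50  # hypothetical value chosen for illustration
z_stat = (sample_mean - hypothesized_mean) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(f"95% CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
print(f"z = {z_stat:.2f}, two-sided p-value = {p_value:.4f}")
```

A small p-value would suggest that the sample is unlikely under the hypothesized mean; the threshold for "small" is a choice the researcher makes before looking at the data.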
Replication
Large samples
Data from large samples can be processed and analyzed using reliable and consistent procedures
through quantitative data analysis.
Hypothesis testing
Using formalized and established hypothesis testing procedures means that you have to carefully
consider and report your research variables, predictions, data collection and testing methods before
coming to a conclusion.
Data types
Categorical Numerical
Nominal Interval
Ordinal Ratio
Categorical Data
Categorical data represents characteristics. Therefore, it can represent things like a person’s gender,
language etc. Categorical data can also take on numerical values (Example: 1 for female and 0 for
male).
Nominal Data
Nominal values represent discrete units and are used to label variables that have no quantitative
value. Nominal data has no order: if a researcher changes the order of its values, the meaning
will not change.
Ordinal Data
Ordinal values represent discrete and ordered units. Ordinal data is therefore nearly the same as nominal data,
except that its ordering matters. The main limitation of ordinal data is that the differences between the
values are not really known. Ordinal scales are usually used to measure non-numeric features like
happiness, customer satisfaction etc.
Numerical Data
Continuous Data
Continuous data represents measurements; its values cannot be counted, but they can
be measured. Example: the height of a person.
Interval Data
Interval values represent ordered units that have the same difference. Therefore, interval data
represents numeric values that are ordered and where differences between the values are known
exactly.
Ratio Data
Ratio values are also ordered units that have the same difference. Ratio values are the same as
interval values, with the difference that they do have an absolute zero. Good examples are height,
weight, length etc.
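The four measurement levels differ in which summaries are meaningful. The sketch below illustrates this with hypothetical data; the variable names and the Celsius temperature example for interval data are assumptions added for illustration.

```python
import statistics

# Nominal: labels only, so the mode is the natural summary.
languages = ["Hindi", "English", "Punjabi", "Hindi", "Hindi"]
nominal_summary = statistics.mode(languages)

# Ordinal: ordered categories, so a median category is meaningful,
# but the distance between "low" and "medium" is unknown.
satisfaction_order = ["low", "medium", "high"]
responses = ["high", "low", "medium", "high", "medium"]
ranks = sorted(satisfaction_order.index(r) for r in responses)
ordinal_summary = satisfaction_order[ranks[len(ranks) // 2]]

# Interval: differences are meaningful, but there is no absolute zero
# (temperature in Celsius is an assumed example).
temperatures_c = [20.0, 25.0, 30.0]
interval_summary = temperatures_c[2] - temperatures_c[0]

# Ratio: an absolute zero exists, so ratios are meaningful as well.
heights_cm = [150.0, 180.0]
ratio_summary = heights_cm[1] / heights_cm[0]

print(nominal_summary, ordinal_summary, interval_summary, round(ratio_summary, 2))
```

Saying that 180 cm is 1.2 times 150 cm is valid, whereas saying that 30 degrees Celsius is 1.5 times as hot as 20 degrees is not, because the Celsius zero is arbitrary.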
Variable
A variable is any property, a characteristic, a number, or a quantity that increases or decreases over
time or can take on different values (as opposed to constants, such as n, that do not vary) in
different situations.
Independent Variable
The variable that is used to describe or measure the factor that is assumed to cause or at least to
influence the problem or outcome is called an independent variable. The definition implies that the
experimenter uses the independent variable to describe or explain the influence or effect of it on the
dependent variable. Variability in the dependent variable is presumed to depend on variability in
the independent variable.
Dependent Variable
The variable that is used to describe or measure the problem or outcome under study is called a
dependent variable. In a causal relationship, the cause is the independent variable, and the effect is
the dependent variable. If we hypothesize that smoking causes lung cancer, ‘smoking’ is the
independent variable and ‘lung cancer’ the dependent variable.
Background Variable
In almost every study, researchers collect information such as age, sex, educational attainment,
socioeconomic status, marital status, religion, place of birth, and the like. These variables are
referred to as background variables.
Moderating Variable
In any statement of relationships of variables, it is normally hypothesized that in some way, the
independent variable ‘causes’ the dependent variable to occur. In simple relationships, all other
variables are extraneous and are ignored. In actual study situations, such a simple one-to-one
relationship needs to be revised to take other variables into account to better explain the
relationship. This emphasizes the need to consider a second independent variable that is expected
to have a significant contributory or contingent effect on the originally stated dependent-
independent relationship. Such a variable is termed as a moderating variable.
Extraneous Variable
Qualitative Research
Researchers are often more comfortable with quantitative research. Quantitative methods deal with the
collection and processing of numerical data.
Quantitative research answers questions such as:
• How often? To what extent?
• How much? How many?
It cannot, however, answer questions such as: Why? How? In what way?
Ethnographic studies are qualitative procedures utilized to describe, analyze and interpret a
culture’s characteristics. Ethnography was developed in the 19th and 20th centuries and used by
anthropologists to explore primitive cultures different from their own; it originated from
Anthropology. Ethnography is used when a researcher wants to study a group of people to gain a
larger understanding of their lives or specific aspects of their lives. The primary data collection
method is through observation over an extended period of time. It would also be appropriate to
interview others who have studied the same cultures.
Grounded theory is a systematic procedure of data analysis, typically associated with qualitative
research, that allows researchers to develop a theory that explains a specific phenomenon.
Grounded theory was developed by Glaser and Strauss and is used to conceptualize phenomena
using research; grounded theory is not seen as a descriptive method and originates from sociology.
The unit of analysis in grounded theory is a specific phenomenon or incident, not individual
behaviors. The primary data collection method is through interviews of approximately 20 – 30
participants or until data achieves saturation.
Case studies are believed to have originated in 1829 with Frederic Le Play. Case studies are rooted in
several disciplines, including science, education, medicine, and law. Case studies are to be used
when (1) the researcher wants to focus on how and why, (2) the behavior is to be observed, not
manipulated, (3) to further understand a given phenomenon, and (4) if the boundaries between the
context and phenomena are not clear. Multiple methods can be used to gather data, including
interviews, observation, and historical documentation.
Keywords
Quantitative data can be counted, measured, and expressed using numbers. Qualitative data is
descriptive and conceptual. Qualitative data can be categorized based on traits and characteristics.
An experiment is a structured study where the researchers attempt to understand the causes, effects,
and processes involved in a particular process.
Data refers to distinct pieces of information, usually formatted and stored in a way that is in
accordance with a specific purpose.
Secondary research involves collecting existing data in the form of texts, images, audio or video
recordings, etc.
Self-Assessment
1. Secondary/existing data may include which of the following?
A. Official documents
B. Personal documents
C. Archived research data
D. All of the above
2. Which of the following terms best describes data that were originally collected at an earlier
time by a different person for a different purpose?
A. Primary data
B. Secondary data
C. Experimental data
D. Field notes
A. Questionnaires
B. Focus groups
C. Correlational method
D. Secondary data
4. Questionnaire is a _____
A. Research method
B. Measurement technique
C. Tool for data collection
D. Data analysis technique
A. Categories
B. Units
C. Individuals
D. None of the above
7. In qualitative research you talk to more people than in quantitative research. This
statement is:
A. True
B. False
9. Which of the following is NOT one of the research designs for qualitative studies?
A. Goals
B. Conceptual Framework
C. Survey
D. Conclusion
A. Observation
B. Experimentation
C. Postal questionnaire
D. Focus group
13. Quantitative studies emphasize the measurement and analysis of causal relationships
between variables, but not on……….
A. Results
B. Process
C. Introduction
D. Context
6. A 7. B 8. B 9. C 10. D
Review Questions
Further Reading
Business Research Methods by Naval Bajpai, Pearson
Research Methodology: Methods and Techniques by Kothari, C. R. & Garg,
Gaurav, New Age International.
Marketing Research by Naresh K Malhotra, Pearson
Online Link
https://library.fiu.edu/researchmethods/datatypes
https://www.djsresearch.co.uk/glossary/item/Qualitative-Research-Design
https://neltaeltforum.wordpress.com/2018/02/06/preview-of-literature-review-as-a-corner-stonresearch/
Objectives
After studying this unit, you will be able to:
Decide how to collect sample data
Gain insights into terminologies used in sampling
Overview steps involved in sampling design process
Identify the characteristics of good sampling design
State the different types of sampling design
Report about the probability and non-probability sampling
Explain the various types of errors in sampling
Introduction
Sampling is the process of selecting units (e.g., people, organizations) from a population of interest
so that by studying the sample we may fairly generalize our results back to the population from
which they were chosen. Each observation measures one or more properties (weight, location, etc.)
of observable entities that are distinguished as independent objects or individuals. Survey weights often need
to be applied to the data to adjust for the sample design. Results from probability theory and
statistical theory are employed to guide practice.
Example: You want to learn about scooter owners in a city. The RTO will be the frame,
which provides you names, addresses and the types of vehicles possessed.
3. Specify the sampling unit: Individuals who are to be contacted are the sampling units. If retailers
are to be contacted in a locality, they are the sampling units. Sampling unit may be husband or wife
in a family. The selection of sampling unit is very important. If interviews are to be held during
office timings, when the heads of families and other employed persons are away, interviewing
would under-represent employed persons, and over-represent elderly persons, housewives and the
unemployed.
5. Determine the sample size: This means we need to decide "how many elements of the target
population are to be chosen?" The sample size depends upon the type of study that is being
conducted. For example: If it is an exploratory research, the sample size will be generally small. For
conclusive research, such as descriptive research, the sample size will be large. The sample size also
depends upon the resources available with the company.
6. Specify the sampling plan: A sampling plan should clearly specify the target population.
Improper defining would lead to wrong data collection.
7. Select the sample: This is the final step in the sampling process.
1. Goal orientation: This suggests that a sample design "should be oriented to the research
objectives, tailored to the survey design, and fitted to the survey conditions". If this is done, it
should influence the choice of the population, the measurement as also the procedure of choosing a
sample.
2. Measurability: A sample design should enable the computation of valid estimates of its sampling
variability. Normally, this variability is expressed in the form of standard errors in surveys.
However, this is possible only in the case of probability sampling. In non-probability samples, such
as a quota sample, it is not possible to know the degree of precision of the survey results.
3. Practicality: This implies that the sample design can be followed properly in the survey, as
envisaged earlier. It is necessary that complete, correct, practical, and clear instructions should be
given to the interviewer so that no mistakes are made in the selection of sampling units and the
final selection in the field is not different from the original sample design. Practicality also refers to
simplicity of the design, i.e. it should be capable of being understood and followed in actual
operation of the field work.
4. Economy: Finally, economy implies that the objectives of the survey should be achieved with
minimum cost and effort. Survey objectives are generally spelt out in terms of precision, i.e. the
inverse of the variance of survey estimates. For a given degree of precision, the sample design
should give the minimum cost. Alternatively, for a given per unit cost, the sample design should
achieve maximum precision (minimum variance). It may be pointed out that these four criteria
come into conflict with each other in most of the cases.
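The ideas of measurability and economy can be made concrete with a small sketch. It assumes a simple random sample, for which the standard error of the mean is estimated as s / sqrt(n), and it takes precision to be the inverse of the variance of that estimate, as described above. The function names and the data are hypothetical.

```python
import math
import statistics

def standard_error(sample):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

def precision(sample):
    """Precision as the inverse of the variance of the estimate (1 / SE^2)."""
    return 1.0 / standard_error(sample) ** 2

# Hypothetical survey responses (illustrative data only).
small_sample = [12, 15, 11, 14, 13, 16]
large_sample = small_sample * 4  # similar spread, four times as many observations

for name, data in (("small", small_sample), ("large", large_sample)):
    print(f"{name}: n={len(data)}, SE={standard_error(data):.3f}, "
          f"precision={precision(data):.2f}")
```

The larger sample gives a smaller standard error and therefore higher precision, but it also costs more to collect, which is the trade-off the economy criterion refers to.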
Task: Carefully balance the conflicting criteria to select a good sample design.
Random Sampling
Simple random sampling is a process in which every item of the population has an equal probability
of being chosen.
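A minimal sketch of simple random sampling is given below. The population of 100 numbered units and the helper name `simple_random_sample` are hypothetical; `random.sample` from the Python standard library draws without replacement, so each unit has the same chance of selection.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size units without replacement, each equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, sample_size)

# Hypothetical sampling frame: units numbered 1 to 100.
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sorted(sample))
```

Fixing the seed is useful for documenting how a sample was drawn; in practice the seed itself should be chosen without reference to the data.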
Step 1: The above mentioned cluster sampling is similar to the first step of stratified random
sampling. But the two sampling methods are different. The key to cluster sampling is decided by
how homogeneous or heterogeneous the clusters are.
A major advantage of simple cluster sampling is the ease of sample selection. Suppose we have a
population of 20,000 units from which we wish to select 500 units. Choosing a sample of that size is
a very time-consuming process if we use a random numbers table. Suppose the entire population is
divided into 80 clusters of 250 units each, we can choose two sample clusters (2 × 250 = 500) easily
by using cluster sampling. The most difficult job is to form clusters. In marketing, the researcher
forms clusters so that he can deal with each cluster differently.
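The example above can be sketched as follows. The way units are assigned to clusters here (consecutive blocks of 250) is an assumption made purely for illustration; in practice the clusters would be natural groupings such as localities.

```python
import random

POPULATION_SIZE = 20_000
CLUSTER_SIZE = 250

# Hypothetical population of numbered units, split into consecutive clusters.
units = list(range(POPULATION_SIZE))
clusters = [units[i:i + CLUSTER_SIZE]
            for i in range(0, POPULATION_SIZE, CLUSTER_SIZE)]

rng = random.Random(7)
# Only two random selections are needed instead of 500 individual draws.
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [unit for cluster in chosen_clusters for unit in cluster]

print(f"clusters formed: {len(clusters)}")
print(f"units in sample: {len(cluster_sample)}")
```

The saving is in the number of random selections: two clusters are drawn rather than 500 separate units.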
Cross   Houses
1       X1   X2   X3   X4
2       X5   X6   X7   X8
3       X9   X10  X11  X12
4       X13  X14  X15  X16
We need to select eight houses. We can choose eight houses at random. Alternatively, two clusters,
each containing four houses, can be chosen. In this method, every house
would have a known probability of being chosen, i.e. a chance of one in two. We must remember
that in the cluster, each house has the same characteristics. With cluster sampling, it is impossible
for certain random samples to be selected. For example, in the cluster sampling process described
above, the following combination of houses could not occur: X1 X2 X5 X6 X9 X10 X13 X14. This is because
the original universe of 16 houses has been redefined as a universe of
four clusters. So only clusters can be chosen as a sample.
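The restriction described above can be checked by listing every sample that cluster sampling allows. The sketch assumes that the 16 houses are numbered consecutively across the four crosses (X1 to X4 on cross 1, and so on up to X13 to X16 on cross 4), which follows the pattern of the table.

```python
from itertools import combinations
from math import comb

# Assumed layout: four crosses (clusters) of four houses each.
clusters = {
    1: ["X1", "X2", "X3", "X4"],
    2: ["X5", "X6", "X7", "X8"],
    3: ["X9", "X10", "X11", "X12"],
    4: ["X13", "X14", "X15", "X16"],
}

# Every sample of eight houses obtainable by choosing two whole clusters.
possible_samples = [set(clusters[a]) | set(clusters[b])
                    for a, b in combinations(clusters, 2)]

# The combination mentioned in the text, which mixes houses from all crosses.
mixed_sample = {"X1", "X2", "X5", "X6", "X9", "X10", "X13", "X14"}
is_possible = mixed_sample in possible_samples

# Share of the possible cluster samples that contain house X1.
chance_x1 = sum("X1" in s for s in possible_samples) / len(possible_samples)

print(f"cluster samples: {len(possible_samples)} of {comb(16, 8)} random samples")
print(f"mixed sample possible: {is_possible}, chance for X1: {chance_x1}")
```

Only 6 of the 12,870 possible eight-house samples can occur under this design, yet each house still has a one-in-two chance of selection.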
Example: Suppose we want to have 7500 households from all over the country. In such a case, in
the first stage, say 30 districts out of 600 are selected from all over the country.
I Stage - Cities: Suppose 5 cities are selected from each of the 30 districts; and
II Stage - Wards/Localities: say 10 wards/localities are selected from each city
III Stage - Households: 50 households are selected from each ward/locality.
In stage I, we can employ stratified sampling
In stage II, we can use cluster sampling
In stage III, we can have simple random sampling
Multistage Sampling
The name implies that sampling is done in several stages. This is used with stratified/cluster
designs. An illustration of double sampling is as follows. The management of a newly-opened club
is soliciting new memberships. During the first round, all corporates were sent details so that those
who are interested may enroll. Once they have enrolled, the second round concentrates on how many are
interested in enrolling for the various entertainment activities that the club offers, such as billiards, indoor
sports, swimming, gym etc. After obtaining this information, you might stratify the interested
respondents. This will also tell you the reaction of new members to various activities. This
technique is considered to be scientific, since there is no possibility of ignoring the characteristics of
the universe.
Task: What are the advantages and disadvantages of multistage sampling? Enlist.
Area Sampling
This is a probability sampling, a special form of cluster sampling.
Example: If someone wants to measure the sales of toffee in retail stores, one might choose
city localities and then audit toffee sales in retail outlets in those localities. The main problem in area
sampling is the non-availability of lists of shops selling toffee in a particular area. Therefore, it
would be impossible to choose a probability sample from these outlets directly. Thus, the first job is
to choose a geographical area and then list out outlets selling toffee. Then follows the probability
sample for shops among the list prepared.
Example: You may like to choose shops which sell the brand Cadbury Dairy Milk. The disadvantage
of the area sampling is that it is expensive and time-consuming.
Example: The researcher may wish to compare the responses to two or more TV commercials for
two or more products. Mall samples can be informative for this kind of study. Mall samples
should not be used if the difference in effectiveness of the two
commercials varies with the frequency of mall shopping, with the demographic characteristics
of mall shoppers, or with any other characteristic. The success of this method depends on "How well the
sample is chosen".
Sequential Sampling
This is a method in which the sample is formed on the basis of a series of successive decisions. They
aim at answering the research question on the basis of accumulated evidence. Sometimes, a
researcher may want to take a modest sample and look at the results. Thereafter, s(he) will decide if
more information is required for which larger samples are considered. If the evidence is not
conclusive after a small sample, more samples are required. If the position is still inconclusive, still
larger samples are taken. At each stage, a decision is made about whether
more information should be collected or the evidence is now sufficient to permit a conclusion.
Example: Assume that a product needs to be evaluated. A small probability sample is taken
from among the current users. Suppose it is found that average annual usage is between 200 and 300
units. It is known that the product is economically viable only if the average consumption is 400
units. This information is sufficient to take a
decision to drop the product. On the other hand, if the initial sample shows a consumption level of
450 to 600 units, additional samples are needed for further study.
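The decision taken at each stage of this example can be written as a simple rule. The function below is a hypothetical sketch that encodes only the two outcomes described above; a full sequential plan would also state when the evidence is sufficient to stop and accept.

```python
def sequential_decision(usage_low, usage_high, viability_threshold):
    """Decide the next step from the usage range observed in the current sample.

    If even the upper end of the observed range falls short of the
    viability threshold, the evidence is taken as conclusive.
    Otherwise more data are collected before concluding.
    """
    if usage_high < viability_threshold:
        return "drop the product"
    return "take a larger sample"

# The two situations from the example, with a viability threshold of 400 units.
print(sequential_decision(200, 300, 400))
print(sequential_decision(450, 600, 400))
```

The rule would be applied again after each additional sample until a conclusion can be reached.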
Quota Sampling
Quota sampling is quite frequently used in marketing research. It involves the fixation of certain
quotas, which are to be fulfilled by the interviewers.
Suppose, 2,00,000 students are appearing for a competitive examination. We need to select 1% of
them based on quota sampling. The classification of quota may be as follows:
Example: Classification of Samples
Snowball Sampling
This is a non-probability sampling. In this method, the initial group of respondents is selected
randomly. Subsequent respondents are selected based on the opinions or referrals provided
by the initial respondents. Further referrals will lead to more referrals, thus leading to a snowball
sampling. The referrals will have demographic and psychographic characteristics that are relatively
similar to the person referring them.
Example: College students refer other students for a study on the consumption of Pepsi. The major
advantage of snowball sampling is that it monitors the desired characteristics in the population.
Panel Samples
Panel samples are frequently used in marketing research. To give an example, suppose that one is
interested in knowing the change in the consumption pattern of households. A sample of
households is drawn. These households are contacted to gather information on the pattern of
consumption. Subsequently, say after a period of six months, the same households are approached
once again and the necessary information on their consumption is collected.
Non-probability Sample
In this case, the likelihood of choosing a particular universe element is unknown. The sample
chosen in this method is based on aspects like convenience, quota etc.
When natural groupings are clear in a statistical population, the cluster sampling technique is used.
Stratified sampling, by contrast, is a method wherein the members of a population are grouped into relatively
homogeneous groups.
Cluster sampling can be chosen if the group consists of homogeneous members. On the other
hand, for heterogeneous members in the groups, stratified sampling is a good option.
The benefit of cluster sampling over other sampling methods is that it is cheaper.
The benefit of stratified sampling is that it ignores irrelevant subpopulations
and focuses on the vital ones. Another advantage of stratified random sampling
is that the researcher can opt for different sampling
techniques for different subpopulations. Stratified sampling also helps in improving the efficiency and accuracy
of the estimation and facilitates greater balancing of the statistical power of tests.
The major disadvantage of cluster sampling is that it introduces higher sampling error. This sampling
error may be represented as the design effect. The disadvantage of stratified random sampling
is that it calls for a choice of relevant stratification variables, which can be tough at times. When there
are homogeneous subgroups, the random sampling method is not very useful. The implementation of
random sampling is expensive, and if the researcher is not
provided with correct information about the population, an error may be introduced.
All strata are represented in the sample; but only a subset of clusters are in the sample.
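The last point can be illustrated with a short sketch. The population of three named groups with 100 units each is hypothetical; the same groups are treated once as strata and once as clusters to show the contrast.

```python
import random

rng = random.Random(1)

# Hypothetical population: three groups of 100 units each.
groups = {name: [f"{name}-{i}" for i in range(100)]
          for name in ("north", "central", "south")}

# Stratified sampling: draw a random subsample from EVERY group.
stratified_sample = [unit
                     for members in groups.values()
                     for unit in rng.sample(members, 10)]

# Cluster sampling: choose one group at random and take ALL of its units.
chosen_group = rng.choice(list(groups))
cluster_sample = list(groups[chosen_group])

groups_in_stratified = {unit.split("-")[0] for unit in stratified_sample}
groups_in_cluster = {unit.split("-")[0] for unit in cluster_sample}

print(f"stratified sample covers: {sorted(groups_in_stratified)}")
print(f"cluster sample covers:    {sorted(groups_in_cluster)}")
```

The stratified sample always contains units from all three groups, while the cluster sample contains units from the chosen group only.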
5.5 Fieldwork
The fieldwork consists of informal conversations as well as formal standardized interviews,
including projectives or questionnaires. Initially, a single person conducted the research. Changes
in society have shifted research for the most part into teamwork. However, a single person can still
conduct effective research. Traditionally, educational researchers began their research with a set of
hypotheses, whereas the fieldworker's hypothesis emerges through the fieldwork. Fieldwork in its
inception may seem to be disorganized. The notes may be scattered, and information comes from all
over the place. That is because the hypothesis has not yet emerged, even though at times the
hypothesis may become clear very rapidly. Once the hypothesis becomes evident, the fieldworker
maintains an open mind, thus allowing other hypotheses to emerge. Another important difference
between the types of research is the "nature of the proposition sought: his propositions are rarely of
the A causes B type, the usual causal interrelationships between two or more variables dealt with in
an experimental research".
Much of the naturalistic data is collected by using raw materials: notes stating the actual response
given. In order to be accurate, recorders are often used. Experienced researchers create their own
techniques and develop the ability to remember the information that needs to be recorded.
The fieldworker knows when the inquiry should finish by analyzing the data as it is gathered. The
end arrives when the fieldworker sees patterns and no new significant changes. Three important
points that must be included are:
1. The data can be subjected to quantitative analysis
2. Most practitioners of the method probably consider its products to have full status as
actual studies
3. Can be credible regardless of abstraction.
Example: If a study is done amongst Maruti car-owners in a city to find the average monthly
expenditure on the maintenance of the car, it can be done by including all Maruti car-owners. It
can also be done by choosing a sample without covering the entire population. There will be a
difference between the two methods with regard to monthly expenditure.
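The difference between the two methods in this example is what is usually called sampling error, and it can be simulated. The figures below are hypothetical: 5,000 car-owners with expenditures generated at random, from which a sample of 200 is drawn.

```python
import random
import statistics

rng = random.Random(2024)

# Hypothetical monthly maintenance expenditure for every car-owner in the city.
population = [round(rng.gauss(1500, 400), 2) for _ in range(5000)]
census_mean = statistics.mean(population)  # result of covering everyone

# Result of covering only a simple random sample of 200 owners.
sample = rng.sample(population, 200)
sample_mean = statistics.mean(sample)

# Sampling error: the gap between the sample estimate and the census value.
sampling_error = sample_mean - census_mean

print(f"census mean: {census_mean:.2f}")
print(f"sample mean: {sample_mean:.2f}")
print(f"sampling error: {sampling_error:.2f}")
```

Drawing a different sample would give a different error; its typical size is what the standard error describes.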
Non-sampling Error
One way of distinguishing between the sampling and the non-sampling error is that, while
sampling error relates to random variations which can be found out in the form of standard error,
non-sampling error occurs in some systematic way which is difficult to estimate.
Example:
1. An MNC bank wants to pick a sample from among its credit card holders. It can readily
get a complete list of credit card holders, which forms its data bank. From this frame, the desired
individuals can be chosen. In this example, the sample frame is identical to the ideal population, namely all
credit card holders. There is no sampling error in this case.
2. Assume that a bank wants to contact the people belonging to a particular profession over phone
(doctors, lawyers) to market a home loan product. The sampling frame in this case is the telephone
directory. This sampling frame may pose several problems: (1) People might have migrated. (2)
Numbers have changed. (3) Many numbers were not yet listed. The question is "Are the residents
who are included in the directory likely to differ from those who are not included"? The answer is
yes. Thus in this case, there will be a sampling error.
Non-response Error
This occurs when the planned sample and the final sample differ significantly.
Example: Marketers want to know about television viewing habits across the country.
They choose 500 households and mail the questionnaire. Assume that only 200 respondents reply.
This alone does not indicate a non-response error, which depends upon the discrepancy between
those who replied and those who were chosen. If the 200 who replied did not differ from the
chosen 500, there is no non-response error. Consider an alternative.
The people who responded are those who had plenty of leisure time, which implies that the
non-respondents did not have adequate leisure time. In this case, the final sample and the planned
sample differ. It was assumed that all 500 chosen households had leisure time, but in the final
analysis only the 200 who replied did. The final sample is therefore unrepresentative
with respect to leisure time, which leads to non-response error.
Data Error
This occurs during data collection, analysis, or interpretation. Respondents sometimes
give distorted answers unintentionally to questions that are difficult, or when a question is
exceptionally long and the respondent does not have an answer. Data errors can also occur depending
on the physical and social characteristics of the interviewer and the respondent. Things such as
tone of voice can affect the responses. Therefore, we can say that the characteristics of the
interviewer can also result in data error. Cheating on the part of the interviewer also leads to data
error. Data errors can also occur when answers to open-ended questions are
improperly recorded.
Failure of the Interviewer to Follow Instructions
The respondent must be briefed before the interview begins: what is expected, and to what
extent he or she should answer. The interviewer must also make sure that the respondent is familiar with
the subject. If these points are not made clear by the interviewer, errors will occur. Mistakes made
by editors in transferring the data from questionnaires to computers are another cause of error.
The respondent could also terminate his/her participation in data gathering if the questionnaire
feels too long and tedious.
Non-response error may be due to (1) failure to locate, (2) flat refusal.
Failure to locate: People move to new destinations. However, if the sample frames used are of
recent origin, this problem can be overcome.
Flat refusal: We do not know if those who did not respond hold different views or opinions
from those who responded.
This implies that those who don't respond should be motivated. It can be done in any one of the
following ways:
1. An advance letter informing the respondents that they will receive a questionnaire and
requesting their cooperation. This will generally increase the rate of response.
2. Monetary incentive or gift given to respondents will yield a larger response rate.
3. Proper follow-up is necessary after the potential respondent has received the questionnaire.
Example: Consider a normal population with mean μ and variance σ². Assume we repeatedly
take samples of a given size n from this population and calculate the arithmetic mean for each
sample – this statistic is called the sample mean. Each sample will have its own average value, and
the distribution of these averages is called the "sampling distribution of the sample mean".
This distribution is normal, N(μ, σ²/n), since the underlying population is normal. The
standard deviation of the sampling distribution of a statistic is referred to as the standard error of
that quantity.
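The standard error of the sample mean can be checked with a short simulation. This is a minimal sketch, assuming hypothetical values for the population mean (50), the standard deviation (10), and the sample size (25); none of these figures come from the text.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, repeats, seed=0):
    """Draw `repeats` samples of size n from N(mu, sigma^2) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(repeats)
    ]

# Hypothetical population parameters, chosen only for illustration.
MU, SIGMA, N = 50.0, 10.0, 25

means = simulate_sample_means(MU, SIGMA, N, repeats=5000)
empirical_se = statistics.stdev(means)   # spread of the simulated sample means
theoretical_se = SIGMA / N ** 0.5        # sigma / sqrt(n)

print(f"mean of sample means:       {statistics.fmean(means):.2f}")
print(f"empirical standard error:   {empirical_se:.2f}")
print(f"theoretical standard error: {theoretical_se:.2f}")
```

With enough repetitions, the spread of the simulated means should settle close to σ/√n, which is the square root of the variance σ²/n given above.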
Summary
A sample is a representative part of the population, while a census covers one hundred percent of the population.
The most important factors in deciding whether to choose a sample or a census are cost and time.
There are seven steps involved in selecting the sample.
There are two types of sampling, namely, probability sampling and non-probability sampling.
Probability sampling includes random sampling, stratified random sampling, systematic
sampling, cluster sampling, and multistage sampling.
Samples can be chosen either with equal probability or varying probability.
Random sampling can be systematic or stratified.
In systematic random sampling, only the first number is randomly selected. Then, by adding a
constant "K", the remaining numbers are generated.
In stratified sampling, random samples are drawn from several strata, each of which has more or less
the same characteristics.
In multistage sampling, the sample is drawn in several stages.
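The systematic sampling rule summarised above (a random first number, then a constant "K" added repeatedly) can be sketched as follows. The function name, the frame of 100 numbered units, and the sample size of 10 are hypothetical choices made for illustration.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick a random start within the first interval, then take every k-th unit."""
    k = len(population) // sample_size         # sampling interval, the constant "K"
    if k < 1:
        raise ValueError("sample_size cannot exceed the population size")
    start = random.Random(seed).randrange(k)   # only this first position is random
    return [population[start + i * k] for i in range(sample_size)]

# Hypothetical sampling frame of 100 numbered units.
frame = list(range(1, 101))
sample = systematic_sample(frame, sample_size=10, seed=1)
print(sample)
```

Every selected unit after the first is determined by the starting point, which is why only one random draw is needed.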
Keywords
Census: It refers to complete inclusion of all elements in the population. A sample is a sub-group of
the population.
Deliberate Sampling: The investigator uses his discretion in selecting sample observations from the
universe. As a result, there is an element of bias in the selection.
Multistage Sampling: The name implies that sampling is done in several stages.
Quota Sampling: Quota sampling is quite frequently used in marketing research. It involves the
fixation of certain quotas, which are to be fulfilled by the interviewers.
Random Sampling: Simple random sampling is a process in which every item of the population has
an equal probability of being chosen.
Sample Frame: The sampling frame is the list of elements from which the sample is actually drawn.
Stratified Random Sampling: A probability sampling procedure in which simple random sub-samples
are drawn from within different strata that are more or less equal on some characteristic.
Self Assessment
A. Data
B. Set
C. Distribution
D. Population
2. A statistical investigation in which the data are collected for each and every element/unit of
A. Census
B. Distribution
C. Population
D. Subset
A. All members
B. Few members
C. Proportionate members
D. None of these
4. Element represents
A. No Unit
B. One Unit
C. Multiple Units
D. None of these
A. Non-probability sampling
B. Probability sampling
C. Moving sampling
D. Unequal sampling
A. Optimum Size
B. Larger Size
C. Small Size
D. Medium Size
D. Previous research
A. Non-random selection
B. Random selection
C. Data
D. Researcher
A. Quota sampling
B. Convenience sampling
C. Snowball sampling
D. Stratified random sampling
13. In random sampling, the probability of selecting an item from the population is:
A. Unknown
B. Known
C. Un-decided
D. One
A. Population
B. Data
C. Set
D. Distribution
15. Increasing the sample size has the following effect upon the sampling error?
6. B 7. A 8. A 9. B 10. B
Review Questions
1. What do you see as the advantages and disadvantages of probability sampling?
2. Which method of sampling would you use in studies where the level of accuracy can vary
from the prescribed norms, and why?
3. Quota sampling does not require prior knowledge about the cell to which each
population unit belongs. Does this attribute serve as an advantage or disadvantage for
quota sampling?
4. What suggestions would you give to reduce non-sampling error?
5. One mobile phone user is asked to recruit another mobile phone user. What
sampling method is this known as, and why?
6. Sampling is a part of the population. True/False? Why/why not?
7. What do you see as the reason behind purposive sampling being known as
judgement sampling?
Further Readings
1. Cooper and Schindler, Business Research Methods, TMH.
2. C.R. Kothari, Research Methodology, Vishwa Prakashan.
3. David Luck and Ronald Rubin, Marketing Research, PHI.
4. Naresh Malhotra, Marketing Research, Pearson Education.
5. S.N. Murthy & U. Bhojanna, Business Research Methods, 3rd Edition, Excel Books.
6. William Zikmund, Business Research Methods, Thomson.
Objectives
After studying this unit, you will be able to:
Introduction
Measurement is the process of assigning numbers or other symbols to the properties of the things
being measured, according to given rules. A concept (or construct) is a broad idea that
describes a group of things, qualities, events, or processes. Age, gender, number of children,
education, and income are examples of relatively concrete constructs. Brand loyalty, personality,
channel power, and satisfaction are examples of relatively abstract
constructs.
The creation of a continuum on which measured things are placed is known as scaling. A scale is a
measuring device that consists of a collection of elements that are arranged in ascending order of
value or magnitude.
1) Nominal scale
2) Ordinal scale
3) Interval scale
4) Ratio scale
Nominal Scale
Numbers are used to identify things on this scale. Students' university registration numbers, for
example, are issued to them, as are the numbers on their jerseys. In this kind of scaling, the
objective of assigning numbers, labels, or other symbols is not to establish an order, but
rather simply to label and identify, and to count the objects and subjects. Individuals,
corporations, products, brands, and other entities are classified into categories using this
measurement scale, which has no implied order. It is frequently referred to as a categorical
scale. It is a classification system, not a continuum, in which the entity is placed. It requires a
basic count of the number of cases in each category, and if desired, numbers may be
nominally assigned to each category to label it.
Characteristics
1. It does not have an arithmetic origin.
2. It shows no relationship in terms of order or distance.
3. It categorises objects and groups them accordingly.
Use: This scale is commonly used in surveys and ex-post-facto research.
'Yes' is coded as 'One' and 'No' is coded as 'Two'. The numeric value assigned to the
responses has no relevance and serves just as a means of identification. The answers
supplied by respondents will not be affected if the numbers are changed to one for
'No' and two for 'Yes'. The numbers used in nominal scales serve only to label and
count categories. Telephone numbers are an example of a nominal scale, as each number
corresponds to a single subscriber. The purpose of employing a nominal scale is to
ensure that no two people or objects receive the same numerical value. Bus route
numbers are another example of a nominal scale.
We use a nominal scale in cases such as "What is the ID number of your card?" or the
arrangement of books in a library subject-wise or author-wise.
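The point that nominal codes can be swapped without affecting the answers can be checked directly. The sketch below uses five hypothetical yes/no responses and two codings; the helper name `category_counts` is made up for this illustration.

```python
from collections import Counter

# Hypothetical answers to a yes/no question.
answers = ["Yes", "No", "Yes", "Yes", "No"]

coding_a = {"Yes": 1, "No": 2}
coding_b = {"Yes": 2, "No": 1}   # the same labels with the numbers swapped

def category_counts(responses, coding):
    """Code the responses numerically, then count how many fall in each category."""
    decode = {code: label for label, code in coding.items()}
    coded = [coding[r] for r in responses]
    return Counter(decode[c] for c in coded)

counts_a = category_counts(answers, coding_a)
counts_b = category_counts(answers, coding_b)
print(counts_a)
print(counts_a == counts_b)
```

Only frequency counts are compared here, because the code numbers themselves carry no order or magnitude.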
When respondents are asked to rank items, for example soft drinks by flavour
or colour, the ordinal scale is used. The ordinal scale is a ranking scale.
Rank the following attributes on a 1-5 scale according to their importance in a mobile phone:
The ordinal scale is used to arrange things in order. In qualitative research, rank ordering is used to
rank characteristics or units from the highest to the lowest.
Characteristics
1. The ordinal scale ranks the things from the highest to the lowest.
2. Such scales are not expressed in absolute terms.
3. The difference between adjacent ranks is not always equal.
4. For measuring central tendency, median is used.
5. For measuring dispersion, percentile or quartile is used.
Scales involve the ranking of individuals, attitudes, or items along the continuum of the
characteristics being scaled.
In nominal scale numbers can be interchanged because it serves only for the purpose of counting.
Numbers in Ordinal scale have meaning, and it won't allow interchangeability.
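The median and quartiles recommended for ordinal data can be computed as in the sketch below. The nine ranks are hypothetical, and the quartiles use the default method of Python's `statistics.quantiles`, which is one convention among several.

```python
import statistics

# Hypothetical ranks (1 = most important, 5 = least important) from nine respondents.
ranks = [1, 2, 2, 3, 3, 3, 4, 4, 5]

median_rank = statistics.median(ranks)           # central tendency
q1, q2, q3 = statistics.quantiles(ranks, n=4)    # quartiles describe dispersion

print("median rank:", median_rank)
print("quartiles:", q1, q2, q3)
```

The arithmetic mean is deliberately not computed, since the distance between adjacent ranks is not assumed to be equal.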
Interval Scale
Interval scales are more powerful than nominal and ordinal scales. Equal distances on the scale
represent equal differences in the attribute being measured. The interval scale can inform us "how
far apart the items are in relation to an attribute." This indicates that the differences are comparable.
The difference between the numbers "1" and "2" is the same as the difference between the numbers
"2" and "3."
The idea of "equality of interval" is employed in the interval scale, which means that the units of
measurement are equal along the whole scale.
Researchers can only justify using the arithmetic mean as a measure of average when the data is
at least interval scaled. The interval or cardinal scale has equal units of measurement,
allowing you to interpret not only the order of the scale scores but also the distance
between them. The zero point on an interval scale, however, must be understood as arbitrary and
not a true zero. This has consequences for the kind of data manipulation and analysis we
may perform on data obtained in this manner. A constant can be added to or subtracted from all of
the scale values without changing the form of the scale, but the values cannot be meaningfully
multiplied or divided by one another.
Two respondents with scale positions 1 and 2 are as far apart as two respondents with scale positions
4 and 5; however, a person with a score of 10 does not feel twice as intensely as someone with a
score of 5. Temperature is measured in Centigrade or Fahrenheit on an interval scale. Because the
corresponding temperatures on the Centigrade scale, 10°C and -3.9°C, are not in the ratio 2:1, we cannot
say that 50°F is twice as hot as 25°F.
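The temperature example can be verified with the standard Fahrenheit-to-Celsius conversion. This is a small sketch; the function name is chosen here for illustration.

```python
def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

hot_f, cold_f = 50.0, 25.0
hot_c = fahrenheit_to_celsius(hot_f)
cold_c = fahrenheit_to_celsius(cold_f)

print("ratio in Fahrenheit:", hot_f / cold_f)
print("same temperatures in Celsius:", round(hot_c, 1), round(cold_c, 1))
print("ratio in Celsius:", round(hot_c / cold_c, 2))
```

The ratio changes when the unit changes, which shows that ratios are not meaningful on a scale whose zero point is arbitrary.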
Interval scales may be either numeric or semantic.
Characteristics
1. Interval scales have no absolute zero. It is set arbitrarily.
2. For measuring central tendency, mean is used.
3. For measuring dispersion, standard deviation is used.
4. For tests of significance, the t-test and F-test are used.
5. Scale is based on the equality of intervals.
Use: The majority of common statistical methods of analysis require only interval scales in order to
be applied. These aren't discussed here because they're so widespread and can be found in almost
every introductory statistics textbook.
If we would like to measure the rating of a refrigerator using an interval scale, it would
look as follows:
(a) Brand Name Poor …………………… Good
(b) Price High …………………….. Low
(c) Service after-sales Poor …………………… Good
(d) Utility Poor …………………….Good
The researcher cannot assume that a respondent who gives a rating of 6 is three times more positive
about a product under investigation than a respondent who gives a rating of 2.
Ratio Scale
A ratio scale is a type of interval scale with a meaningful zero point. This scale can be used to
measure length, weight, or distance. Using this scale, it is possible to describe how many times larger
or smaller one object is than another. Actual variables are measured using these
scales. A ratio scale is the highest level of measurement. It combines the characteristics of
an interval scale with a fixed origin or zero point. Weights, lengths, and times are all
examples of ratio-scaled variables. Ratio scales allow researchers to compare differences in scores as
well as the relative magnitude of the scores. The difference between 5 and 10 minutes, for
example, is the same as the difference between 10 and 15 minutes, and 10 minutes is twice as
long as 5. Given that sociological and managerial research rarely goes beyond the interval level of
measurement, this level of analysis is not given extra attention here. To
summarize, almost all statistical operations can be performed on ratio-scaled data.
Characteristics
1. This scale has a measurement of absolute zero.
2. Geometric and harmonic means can be used to determine central tendency.
For instance, this year's sales of product A are twice as high as last year's sales of the
identical product.
Statistical implications: This scale allows for the execution of all statistical operations.
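The geometric and harmonic means mentioned above are available in Python's standard `statistics` module. The sketch below uses hypothetical growth factors and speeds; only ratio-scaled data with a true zero make these averages meaningful.

```python
import statistics

# Hypothetical yearly sales growth factors (1.10 means sales grew by 10%).
growth_factors = [1.10, 1.20, 0.90]
average_growth = statistics.geometric_mean(growth_factors)

# Hypothetical speeds in km/h over two stretches of equal distance.
speeds = [40, 60]
average_speed = statistics.harmonic_mean(speeds)

print(f"average growth factor: {average_growth:.4f}")
print(f"average speed: {average_speed:.1f} km/h")
```

The geometric mean suits quantities that multiply, such as growth factors, while the harmonic mean suits rates measured over equal distances.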
1. Continuous rating scale: In this technique, respondents rate an item using a sequence of
numbers called scale points. Graphic rating scaling is another name for this method.
2. Likert scale: This method allows respondents to rate items on a five-to-seven-point scale
based on their level of agreement or disagreement with the item.
3. Semantic differential scale: Respondents are asked to rate the distinct qualities of an item
on a seven-point scale in this technique.
Figure: 5.1
In the figure above, we measure an individual's attitude by analysing his ideas about drinkers.
As you get further down, you'll see that people's attitudes and behaviours toward drinkers get
increasingly erratic. If a person agrees with one of the statements in the list, he is likely to
agree with all of the statements that follow it. As a result, in this case, the rule is a cumulative one. This is
referred to as scaling. Scaling is used to test a hypothesis during the research process. Scaling can be
used as part of probing inquiry in some cases.
For each pair of professors, please indicate the professor from whom you prefer to take
classes with a 1.
Rank the instructors listed below in order of preference. For the instructor you
prefer the most, assign a "1", assign a "2" to the instructor you prefer the 2nd most,
assign a "3" to the instructor that you prefer 3rd most, and assign a "4" to the
instructor that you prefer the least.
Four marketing professors are listed below, along with three features that students often
value. Please assign a number to each component that indicates how well you believe each
instructor performs in that area. The higher the number, the better the score. The sum of all
of the teachers' scores on a certain aspect should be 100.
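The constant sum rule in the instruction above (scores on one aspect must total 100) can be checked mechanically. The instructor labels and point values in this sketch are hypothetical.

```python
def is_valid_constant_sum(points, total=100):
    """Return True when the points allocated across instructors add up to `total`."""
    return sum(points.values()) == total

# Hypothetical allocation by one respondent for a single feature.
clarity_points = {
    "Instructor A": 40,
    "Instructor B": 30,
    "Instructor C": 20,
    "Instructor D": 10,
}

print(is_valid_constant_sum(clarity_points))

# Because the points share a fixed total, each allocation can be read as a proportion.
shares = {name: pts / 100 for name, pts in clarity_points.items()}
print(shares)
```

A response whose points do not add up to the required total would normally be returned to the respondent or excluded before analysis.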
Non-comparative Scale
Continuous Rating Scale
VERY POOR ………………………………………………............. VERY GOOD
10 20 30 40 50 60 70 80 90 100
Likert Scale
It is also known as a summated rating scale. It consists of a collection of statements about an attitude object.
Each statement is rated on a five-point scale running from agreement to disagreement.
Because the scores of the different items are added together to generate a total score for the respondent,
these scales are called summated scales. The Likert Scale is divided into two parts: item and
evaluation. The item part is usually a statement about a specific product, event, or attitude.
The evaluation part consists of a series of responses ranging from "strongly agree"
to "strongly disagree." In this case, a five-point scale is used. Numerals such as +2, +1, 0, –1, –2 are
used. Let's look at how a customer's attitude toward a shopping mall is measured using an
example.
The overall attitude of a respondent is assessed by adding up his or her numerical ratings on the
statements that make up the scale. Because some statements are favourable and others are unfavourable,
the ratings must be aligned before they are summed. In other words, "strongly
agree" on a favourable statement and "strongly disagree" on an unfavourable statement
must always be assigned the same number, such as +2. The success of the Likert Scale
depends on how well the statements are written. The more favourable the attitude, the higher
the respondent's score. For example, if there
are two shopping malls, ABC and XYZ, and the Likert Scale scores are 30 and 60, we can conclude
that people prefer XYZ to ABC.
There must be an equal number of positive and negative statements on the Likert Scale.
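The summation of Likert ratings can be sketched as follows. The +2 to –2 coding comes from the text; the middle label "neutral", the statement names, and the responses are hypothetical, and flipping the sign for negatively worded statements is one common way to make all items point in the same direction.

```python
# Five-point coding from +2 to -2; the "neutral" label is an assumption.
SCORES = {
    "strongly agree": 2,
    "agree": 1,
    "neutral": 0,
    "disagree": -1,
    "strongly disagree": -2,
}

def summated_score(responses, negative_items):
    """Add up item scores, flipping the sign for negatively worded statements."""
    total = 0
    for item, answer in responses.items():
        score = SCORES[answer]
        total += -score if item in negative_items else score
    return total

# Hypothetical responses about a shopping mall; s2 and s4 are negatively worded.
respondent = {
    "s1": "strongly agree",
    "s2": "disagree",
    "s3": "agree",
    "s4": "strongly disagree",
}
total = summated_score(respondent, negative_items={"s2", "s4"})
print(total)
```

A higher total indicates a more favourable attitude, so totals for different malls or respondents can be compared directly.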
The Semantic Differential Scale is very similar to the Likert Scale. It also includes a number of items for respondents to rate.
The key distinction between the Likert and Semantic Differential Scales is that the latter uses
bipolar adjectives and phrases. The Semantic Differential Scale has no statements. A
seven-point scale separates each pair of adjectives.
Semantic Differential Scale Items
Please rate the five real estate developers mentioned below on the given scales for each of the five
aspects. Developers are:
Respondents are asked to select the one of seven categories that best describes their attitude. The
calculation is carried out in the same manner as in the Likert Scale. Assume we're attempting to
assess the packaging of a specific product. The following is a seven-point scale:
"I feel …………..
1. Delighted
2. Pleased
3. Mostly satisfied
5. Mostly dissatisfied
6. Unhappy
7. Terrible.
Thurstone Scale
This is also called an equal-appearing interval scale. The following are the steps to
construct a Thurstone Scale:
Step 2: A group of judges, say 20 to 30, is given these statements (75 to 100) and asked to
classify them according to the degree of favorability and unfavorability.
Step 3: The judges must create 11 piles. The statements in the heaps range from "most
unfavourable" in pile 1 to "neutral" in pile 6 and "most favourable" in pile 11.
Step 4: Analyze the frequency distribution of ratings for each statement and delete any
statements with highly disparate ratings from various judges.
Step 5: For the final scale, choose one or two statements from each of the 11 piles. To make
the scale, arrange the statements in a random order.
Step 6: Those whose attitudes are to be scaled are given the set of statements and asked to
indicate whether they agree or disagree with each one. Some people may agree with
just one statement, while others may agree with several.
1) It is more important to live in the present than in the future. As a result, there is no need
to save.
2) There are numerous attractions where you can spend the money you have saved.
10) One should make an effort to save more money so that the majority of it can be
invested.
Conclusion: A respondent who agrees with statements 8, 9, and 11 is said to have a positive
attitude about saving and investing. A respondent who agrees with statements 2, 3, and 4 is
someone who has a negative attitude. Furthermore, a respondent's attitude is not
considered consistent if he chooses statements 1, 3, 7, and 9.
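Step 4 of the construction (dropping statements on which the judges disagree) can be sketched as follows. The judge ratings are hypothetical, the interquartile-range cut-off of 2 piles is an assumption rather than a rule from the text, and taking the median pile as a statement's scale value is a common convention that the text does not spell out.

```python
import statistics

def scale_statements(judge_piles, max_iqr=2.0):
    """Return {statement: median pile} for statements on which the judges broadly agree."""
    kept = {}
    for statement, piles in judge_piles.items():
        q1, _, q3 = statistics.quantiles(piles, n=4)
        if q3 - q1 <= max_iqr:                    # small spread: judges agree
            kept[statement] = statistics.median(piles)
    return kept

# Hypothetical pile numbers (1 = most unfavourable, 11 = most favourable) from eight judges.
ratings = {
    "A": [2, 2, 3, 2, 1, 2, 3, 2],         # agreement: unfavourable statement
    "B": [1, 11, 6, 3, 9, 2, 10, 5],       # highly disparate ratings
    "C": [10, 11, 10, 9, 10, 11, 10, 10],  # agreement: favourable statement
}

scale_values = scale_statements(ratings)
print(scale_values)
```

Statement B is dropped because the judges placed it in piles at both ends of the range, so it has no stable position on the scale.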
Multidimensional Scaling
This is used to research customer attitudes, namely perceptions and preferences. These
methods aid in determining which product qualities are most significant to customers and
1. What are the most important characteristics to consider when selecting a product (soft
drinks, modes of transportation)? (a) What characteristics do customers compare while
evaluating different product brands? Is it a matter of cost, quality, or availability, for
example?
2. According to the customer, what is the appropriate combination of attributes? (That is,
which two or more attributes will a consumer consider before making a purchase
decision.)
3. Which advertising messages are in line with the consumer's impressions of the brand?
There are two methods for gathering input data for perceptual mapping:
Stapel Scales
1. Modern versions of the Stapel scale use a single adjective as a substitute for the
semantic differential when it is difficult to create pairs of bipolar adjectives.
2. A Stapel scale's benefits and drawbacks, as well as its results, are very
similar to those of a semantic differential. The Stapel scale, however, is
generally easier to conduct and administer.
Summary
A nominal, ordinal, interval, or ratio scale can be used to measure something. Attitude scales describe
the degree of liking/disliking, agreement/disagreement, or belief in an object. Each scale has its own set
of statistical implications. In market research, there are four types of scales: paired comparison,
Likert, semantic differential, and Thurstone scale. The semantic differential scale is a seven-point
scale, whereas the Likert scale is a five-point scale. In the semantic differential scale, bipolar
adjectives are used. The Thurstone scale is intended to examine the respondents' attitudes about
any issue of public concern. Before the scale is used for measurement, its validity and reliability are
checked. "Does the scale measure what it claims to measure?" is the question. The sort of validity
required depends on "What is being measured." There are three methods for checking validity.
Keywords
Scaling: This refers to the assignment of objects to numbers according to a set of rules.
Likert Scale: This is a collection of statements about an attitude object. Each statement is rated on a
five-point scale running from agreement to disagreement.
Ordinal Scale: In most market research studies, the ordinal scale is applied for ranking.
Ratio Scale: A ratio scale is a special kind of interval scale that has a meaningful zero point.
Self Assessment
1. ....................... scale may tell us "How far the objects are apart with respect to an attribute?"
Which of the following is suitable to be filled-in in the given statement?
A. Interval
B. Nominal
C. Ratio
D. Ordinal
2. Ratio scale is a special kind of interval scale that has a meaningful ................................. Which
of the following is suitable to be filled-in in the given statement?
A. Zero point
B. Mid Point
C. Fraction
D. None of these
3. The salary of Ram is twice as much as the salary of Shyam – this is an example of:
A. Nominal scale measurement
B. Ordinal scale measurement
C. Interval scale measurement
D. Ratio scale measurement
4. The constant sum rating scale would result in which type of measurement?
A. Nominal scale
B. Ordinal scale
C. Interval scale
D. Ratio scale
D. Ratio scale
D. Ratio scale
7. In which of the following scales are the objects arranged according to their magnitude in an ordered
relationship?
A. Nominal scale
B. Ordinal scale
C. Interval scale
D. Ratio scale
8. In which of the following scales does difference in scores have meaningful interpretation?
A. Nominal
B. Ratio
C. Interval
D. Both B and C
10. In which of the following scales can all possible statistical techniques be applied?
A. Nominal
B. Ordinal
C. Ratio
D. Interval
A. False
B. True
14. We are not measuring the object but some characteristic of it when we measure the
perceptions, attitudes, and preferences of consumers.
A. False
B. True
15. Only a limited number of statistics, all of which are based on frequency counts, are
permissible on the numbers in a nominal scale.
A. False
B. True
6. D 7. B 8. D 9. B 10. C
Review Questions
1. What is measurement and scaling?
6. Justify your answer by identifying the type of scale you will use in ordinal, nominal, interval,
ratio scales?
Further Readings
C.R. Kothari, Research Methodology, Vishwa Prakashan.
David Luck and Ronald Rubin, Marketing Research, PHI.
G.C.Beri, Marketing Research, TMH.
Paneerselvam, R, Research Methods, PHI.
S.N. Murthy & U. Bhojanna, Business Research Methods, 3rd Edition, Excel Books.
Tull and Donalds, Marketing Research, MMIL.
Web Links
https://www.coursera.org/lecture/applying-data-analytics-business-in-marketing/lesson-1-3-2-measurements-and-scaling-techniques-primary-scales-of-measurement-nhWnx
https://www.uoguelph.ca/hftm/book/export/html/2106
https://conjointly.com/kb/scaling-in-measurement/
Objectives
After studying this unit, you will be able to:
Introduction
The data collected directly by the researcher, with respect to the problem under study, is
known as primary data. It is firsthand data collected by the researcher
for the immediate purpose of the study. This data is original in character and is often
generated by surveys. In experimental research, primary data is the information collected
during the course of the experiment. It can also be obtained through observation or
through direct communication with the persons associated with the selected subject by
performing surveys or descriptive research.
There are several methods of collecting the primary data, which are as follows:
Observation Method
Interview Method
Through Questionnaires
Through Schedules
Other methods such as warranty cards, distributor audits, pantry audits, consumer
panels, using mechanical devices, through projective techniques, depth interviews and
content analysis. Observation and questioning are two broad approaches available for
primary data collection. The major difference between the two approaches is that in the
Observation Method
In the observation method, only present/current behaviour can be studied. Therefore,
many researchers feel that this is a great disadvantage. A casual observation could
help the researcher to identify the problem, such as the length of the queue in front of
a food chain, or the price and advertising activity of a competitor. Observation is the least
expensive mode of data collection.
Suppose a Road Safety Week is observed in a city and the public is made aware
of advance precautions while walking on the road. After one week, an observer
can stand at a street corner and observe the number of people walking on the
footpath and those walking on the road during a given period of time. This will
tell him whether the campaign on safety is successful or unsuccessful.
Sometimes, observation will be the only method available to the researcher, for example when
studying the behaviour or attitudes of children, or of those who are inarticulate.
Structured-Unstructured Observation
A manager of a hotel wants to know how many of his customers visit the hotel
with their families and how many come as single customers. Here, the
observation is structured, since it is clear "what is to be observed". He may
instruct his waiters to record this. This information is required to decide
requirements of the chairs and tables and also the ambience.
Suppose the manager wants to know how single customers and those with
families behave and what their attitudes are like. This study is vague, and it needs
a non-structured observation.
The observation method is the only method applicable to study the growth of
plants and crops.
Disguised-Undisguised Observation
In disguised observation, the respondents do not know that they are being observed. In non-
disguised observation, the respondents understand they are being observed. In disguised
observation, observers often pose as shoppers. They are known as "mystery shoppers". They are
Direct-Indirect Observation
In direct observation, the actual behaviour or phenomenon of interest is observed. In indirect
observation, the results or consequences of the phenomenon are observed. Suppose a
researcher is interested in knowing about the soft drinks consumption of a student in a hostel room.
He may like to observe empty soft drink bottles dropped into the bin. Similarly, the observer may
seek the permission of the hotel owner to visit the kitchen or stores. He may carry out a
kitchen/stores audit, to find out the consumption of various brands of spice items being used by
the hotel. It may be noted that the success of an indirect observation largely depends on "how best
the observer is able to identify physical evidence of the problem under study".
Human-Mechanical Observation
Most of the studies in marketing research are based on human observation, wherein trained
observers are required to observe and record their observation. In some cases, mechanical devices
such as eye cameras are used for observation. One of the major advantages of electrical/
mechanical devices is that their recordings are free from any subjective bias.
What observation technique would you use to gather the following information?
Researcher needs to send a polite short cover note, especially with mailed questionnaires
and it should include the following:
Characteristics of Survey
Purpose of Survey
There are two purposes of survey, they are as follows:
1. Information gathering: It collects information for a specific purpose. For example, polls,
censuses, customer satisfaction, attitudes, etc.
2. Theory testing and building: Surveys are also used for the purpose of testing and building
theory. For example, personality and social psychology theories.
Disadvantages of Survey
1. Lack of control.
2. Data may be superficial.
Personal Interviews
An interview is called personal when the Interviewer asks the questions face-to-face with the
Interviewee. Personal interviews can take place at home, at a shopping mall, on the street, and so
on.
Advantages
1. The ability to let the Interviewee see, feel and/or taste a product.
2. The ability to find the target population. For example, you can find people who have
seen a film much more easily outside a theater in which it is playing than by calling
phone numbers at random.
3. Longer interviews are sometimes tolerated, particularly with in-home interviews that
have been arranged in advance. People may be willing to talk longer face-to-face than
to someone on the phone.
Disadvantages
1. Personal interviews usually cost more per interview than other methods.
2. A change in the characteristics of the population might make the sample
non-representative.
Telephone Surveys
A telephone survey is a process of collecting information from sample respondents by calling them on the telephone.
Surveying by telephone is the most popular interviewing method.
Advantages
People can usually be contacted faster over the telephone than with other methods.
You can dial random telephone numbers when you do not have the actual telephone
numbers of potential respondents.
Skilled interviewers can often invite longer or more complete answers than people will
give on their own to mail, e-mail surveys.
Disadvantages
Many telemarketers have given legitimate research a bad name by claiming to be doing
research when they start a sales call.
The growing number of working women often means that no one is at home during the
day. This limits calling time to a "window" of about 6-9 p.m. (when you can be sure to
interrupt dinner or a favorite TV program).
You cannot show sample products by phone.
Disadvantages
The interviewees must have access to a computer, or it must be provided for them.
As with mail surveys, computer direct interviews may have serious response rate
problems in populations with low literacy levels.
Advantages
Speed: An email questionnaire can gather several thousand responses within a day or two.
There is practically no cost involved once the setup has been completed.
Pictures and sound files can be attached.
The novelty element of an email survey often stimulates higher response levels than
ordinary mail surveys.
Disadvantages
Advantages
Web page surveys are extremely fast. A questionnaire posted on a popular Web site can
gather several thousand responses within a few hours. Many people who will respond to
an email invitation to take a Web survey will do so the first day, and most will do so
within a few days.
There is practically no cost involved once the setup has been completed.
Pictures can be shown. Some Web survey software can also show video and play sound.
Web page questionnaires can use complex question skipping logic, randomizations and
other features that are not possible with paper questionnaires. These features can help ensure
better data.
Web page questionnaires can use colors, fonts, and other formatting options not possible
in most email surveys.
A significant number of people will give more honest answers to questions about sensitive
topics, such as drug use or sex, when giving their answers to a computer, instead of to a
person or on paper.
On average, people give longer answers to open-ended questions on Web page
questionnaires than they do on other kinds of self-administered surveys.
Disadvantages
Current use of the Internet is far from universal. Internet surveys do not reflect the
population. This is true even if a sample of Internet users is selected to match the general
population in terms of age, gender, and other demographics.
People can easily quit in the middle of a questionnaire. They are not as likely to complete a
long questionnaire on the Web as they would be if talking with a good interviewer.
Depending on your software, there is often no control over people responding multiple
times to bias the results.
Mail Questionnaire
A mail questionnaire is a paper questionnaire that is sent to selected respondents, who fill it in and post
the completed questionnaire back to the researcher.
Advantages
1. Easier to reach a larger number of respondents throughout the country.
2. Since the interviewer is not present face to face, the influence of interviewer on the respondent is
eliminated.
3. This is the only kind of survey you can do if you have the names and addresses of the target
population, but not their telephone numbers.
4. Mail surveys allow the respondent to answer at their leisure, rather than at the often-inconvenient
moment they are contacted for a phone or personal interview. For this reason, they are
not considered as intrusive as other kinds of interviews.
5. Where the questions asked are such that they cannot be answered immediately and need some
thinking on the part of the respondent, the respondent can think them over at leisure and give the answer.
8. Personal and sensitive questions are well answered in this method.
9. The questionnaire can include pictures - something that is not possible over the phone.
Disadvantages
1. It is not suitable when questions are difficult and complicated. For example, "Do you believe in the
value-price relationship?"
2. When the researcher is interested in a spontaneous response, this method is unsuitable,
because the thinking time allowed to the respondent will influence the answer.
3. In case of a mail questionnaire, it is not possible to verify whether the respondent himself/
herself has filled the questionnaire. If the questionnaire is directed towards the housewife, say, to
Prorated discount, product profile, marginal rate, etc., may not be understood by
the respondents.
5. If the answers are not correct, the researcher cannot probe further.
6. Poor response (30%) - Not all will reply.
7. In populations with lower educational and literacy levels, response rates to mail surveys are often
too small to be useful.
Questionnaire
What is Questionnaire?
A questionnaire is a research instrument consisting of a series of questions and other prompts for
the purpose of gathering information from respondents. The questionnaire was invented by Sir
Francis Galton.
Characteristics of Questionnaire
1. It must be simple. The respondents should be able to understand the questions.
2. It must generate replies that can easily be recorded by the interviewer.
3. It should be specific, so as to allow the interviewer to keep the interview to the point.
4. It should be well arranged, to facilitate analysis and interpretation.
5. It must keep the respondent interested throughout.
"Which television programme did you see last Saturday?" This requires a reasonably
good memory, and the respondent may not remember. This is known as recall loss.
Therefore, questioning about the distant past should be avoided. Memory of events depends on
(1) the importance of the events, and (2) whether it is necessary for the respondent to
remember. In the above case, neither condition is fulfilled. Therefore, the respondent
does not remember. On the contrary, the birthday or wedding anniversary of an individual is
remembered without effort since the event is important. Therefore, the researcher should
be careful while asking questions about the past.
First, he must make sure that the respondent has the answer.
"Subjects attitude towards Cyber laws and the need for government legislation to
regulate it".
"Tell me your opinion about Mr. Ben's healing effect show conducted at
Bangalore?"
The respondent may have meant "basic pay" but interviewer may think that the respondent is
talking about "total pay including dearness allowance and incentive". Since both of them refer to
pay, it is impossible to separate two different frames.
Dichotomous Question
These questions have only two answers: 'yes' or 'no', 'true' or 'false', 'use' or 'don't use'.
Do you use toothpaste? Yes ……….. No …………
There is no third answer. However sometimes, there can be a third answer:
Dichotomous questions are the most convenient and easiest to answer. A major disadvantage of
a dichotomous question is that it limits the respondent's response. This may lead to
measurement error.
Close-Ended Questions
There are two basic formats in this type:
● Make one or more choices among the alternatives.
● Rate the alternatives.
Choice Among Alternatives
Which of the following words or phrases best describes the kind of person you feel would be most
likely to use this product, based on what you have seen in the commercial?
1. Young ………… old …………….
Single ………… Married ………..
Modern ………… Old fashioned ……………...
2. Rating Scale
(i) Please tell us your overall reaction to this commercial?
(a) A great commercial; would like to see again.
(b) Just so-so, like other commercials.
(c) Another bad commercial.
(d) Pretty good commercial.
(ii) Based on what you saw in the commercial, how interested do you feel you would be in buying the
products?
(a) Definitely
(b) Probably I would buy
(c) I may or may not buy
(d) Probably I would not buy
(e) Definitely I would not buy.
"Don't you think that Brazil played poorly in the FIFA cup?" The answer will
be 'yes'. Many respondents who do not have any idea about the game will also most
likely say 'yes'. If the question is worded in a slightly different manner, the
response will be different.
"Do you think that Brazil played poorly in the FIFA cup?" This is a
straightforward question. The answer could be 'yes', 'no' or 'don't know' depending
on the knowledge the respondents have about the game.
"Do you think anything should be done to make it easier for people to pay
their phone bill, electricity bill and water bill under one roof?"
"Don't you think something might be done to make it easier for people to pay their
phone bill, electricity bill, water bill under one roof"?
A change of just one word as above, can generate different responses by respondents.
Instead of using the word 'reasonably', 'usually', 'occasionally', 'generally', 'on the
whole'.
"How often do you go to a movie?" "Often, may be once a week, once a month, once
in two months or even more."
"Do you feel that firms today are employee-oriented and customer-oriented?"
"Are you happy with the price and quality of branded shampoo?" [yes] [no]
1. Leading Questions: A leading question is one that suggests the answer to the
respondent. The question itself will influence the answer when respondents get an
idea that the data is being collected by a company. The respondents have a tendency
to respond positively.
"How do you like the programme on 'Radio Mirchy'?" The answer is likely to be
'yes'. The unbiased way of asking is "Which is your favorite F.M. radio station?" The
answer could be any one of the four stations, namely (1) Radio City (2) Mirchy (3)
Rainbow (4) Radio-One.
Do you think that offshore drilling for oil is environmentally unsound? The most
probable response is 'yes'. The same question can be modified to eliminate the
leading factor.
What is your feeling about the environmental impact of offshore drilling for oil? Give choices as
follows:
(a) Offshore drilling is environmentally sound.
(b) Offshore drilling is environmentally unsound.
(c) No opinion.
2. Loaded Questions: A leading question is also known as a loaded question. In a loaded
question, special emphasis is given to a word or a phrase, which acts as a lead to the
respondent.
"Do you own a Kelvinator refrigerator?" A better question would be "what brand
of refrigerator do you own?" "Don't you think the civic body is 'incompetent'?"
Here the word incompetent is 'loaded'.
(a) Are the Questions Confusing? If a question is unclear or confusing, the
respondent becomes more biased rather than enlightened. Example: "Do you think
that the government publications are distributed effectively?" This is not the correct way,
since the respondent does not know what is meant by effective distribution.
This is confusing. The correct way of asking the question is "Do you think that the
government publications are readily available when you want to buy them?" Example: "Do you
think the value-price equation is attractive?" Here, respondents may not know the
meaning of value-price equation.
(b) Applicability: "Is the question applicable to all respondents?" Respondents may try to
answer a question even though they do not qualify to do so or lack any
meaningful opinion.
"Why do you use Ayurvedic soap"? One respondent might say "Ayurvedic soap is better
for skin care". Another may say "Because the dermatologist has recommended". A third
might say "It is a soap used by my entire family for several years". The first respondent
answers the reason for using it at present. The second respondent answers how he started
using it. The third respondent cites the family tradition of using it. As can be seen, different
reference frames are used.
The question may be balanced and rephrased
Complex Questions?
In which of the following do you like to park your liquid funds?
i. Debenture
ii. Preferential share
iii. Equity linked MF
iv. IPO
v. Fixed deposit
If this question is posed to the public, they may not know the meaning of liquid fund.
Most of the respondents will guess and tick one of them.
Are the Questions Too Long? Generally, as a rule of thumb, it is advisable to keep the number of
words in a question to no more than 20. The question given below is too long for the respondent to
comprehend, let alone answer.
Do you accept that the people whom you know, and associate yourself have been receiving
ESI and P.F. benefits from the government accept a reduction in those benefits, with a view
to cut down government expenditure, to provide more resources for infrastructural
development?
Yes................... No................... Can't say...................
The husband is asked a question "How much does your family spend on groceries
in a week"? Unless the respondent does the grocery shopping himself, he will not
know how much has been spent. In a situation like this, it will be helpful to ask a
'filtered question'. An example of a filtered question can be, "Who buys the
groceries in your family"?
"Do you have the information of Mr. Ben's visit to Bangalore"? Not only should
the individual have the information but also s(he) should remember the same. The
inability to remember the information is known as "recall loss".
Give one example for each of the following type of the questions:
1. Leading question
2. Double-barreled question
3. Close-ended question
4. Fixed alternative question
5. Split-ballot question
1. Basic information
2. Classification
3. Identification information.
Items such as age, sex, income, education, etc., are questioned in the classification section. The
identification part involves body of the questionnaire. Always move from general to specific
questions on the topic. This is known as funnel sequence. Sequencing of questions is illustrated
below:
(1) Which TV shows do you watch?
Sports................... News...................
(2) Which among the following are you most interested in?
Sports................... News...................
Music................... Cartoon...................
(3) Which show did you watch last week?
World Cup Football...................
Bournvita Quiz Contest...................
War News in the Middle East...................
Tom and Jerry cartoon show...................
The above three questions follow a funnel sequence. If we reverse the order of the questions and first ask
"Which show did you watch last week?", the answer may be biased. This example shows the
importance of sequencing.
Layout: How the questionnaire looks or appears.
Clear instructions, gaps between questions, answers and spaces are part of Layout. Two
different layouts are shown below:
Layout - 1 How old is your bike?
Summary
Primary data may pertain to lifestyle, income, awareness or any other attribute of
individuals or groups.
There are mainly two ways of collecting primary data namely: (a) Observation (b) By
questioning the appropriate sample.
Observation method has a limitation i.e., certain attitudes, knowledge, motivation, etc.
cannot be measured by this method. For this reason, researcher needs to communicate.
Communication method is classified based on whether it is structured or disguised.
A structured questionnaire is easy to administer. This type is most suited for descriptive
research. If the researcher wants to do an exploratory study, the unstructured method is better.
In the unstructured method, questions will have to be framed based on the answers given by the
respondent. A questionnaire can be administered in person, online, or by
mail. Each of these methods has advantages and disadvantages.
Questions in a questionnaire may be classified into (a) Open question (b) Close ended
questions (c) Dichotomous questions, etc.
While formulating questions, care has to be taken with respect to question wording and
vocabulary. Leading, loaded and confusing questions should be avoided. Further, it is
desirable that questions should be neither complex nor too long.
It is also implied that proper sequencing will enable the respondent to answer the question
easily. The researcher must maintain a balanced scale and must use a funnel approach.
Pretesting of the questionnaire is preferred before introducing to a large population.
Keywords
Computer Direct Interview: This is the method in which the respondents key in (enter) their
answers directly into a computer.
Dichotomous Question: These questions have only two answers, like 'Yes' or 'No'.
Disguised Observation: The observation under which the respondents do not know that they are
being observed.
Loaded Question: A question in which special emphasis is given to a word or a phrase, which acts
as a lead to respondent.
Non-disguised Observation: The observation in which the respondents are well aware that they
are being observed.
Self Assessment
1. The-------------------is the most frequently used primary method of data collection in any area
of business research. It involves a predetermined set of queries in a structured format.
A. In-depth interviews
B. Focus group discussions
C. Questionnaire
D. None of the above
6. The reason for the respondent’s inability to answer the questions in a questionnaire could be
because of:
A. The person might not have the required information
B. The person might not remember the answer
C. The person might not be able to articulate the answer
D. All of the above
7. The information that is of the most importance to the research project and should be
obtained first is _____
A. Qualifying information
B. Identification information
C. Basic information
D. Classification information
8. Consider the following question: Don’t you think the current government has an excellent
poverty alleviation programme? Yes/no
A. Is a leading question
B. Is a loaded question
C. Is a double-barrelled question
D. Is an interval scaled question
9. Do you think taking dowry is the right of every Indian male? Is an example of a
A. Forced choice
B. Open-ended
C. Dichotomous
D. Loaded question
11. A well-designed questionnaire can motivate the respondents and increase the response rate.
A. True
B. False
12. To conduct an e-mail survey, the survey is written within the body of the e-mail message.
A. True
B. False
13. Observing children playing with new toys is an example of unstructured observation
A. True
B. False
15. Which of the following is a disadvantage of the survey method of data collection?
A. The questionnaire is simple to administer.
B. The data obtained are reliable because the responses are limited to the alternatives stated.
C. Wording questions properly is not easy.
D. Coding, analysis, and interpretation of data are relatively simple.
6. D 7. C 8. A 9. D 10. B
Review Questions
1. What is primary data?
2. What are the various methods available for collecting primary data?
3. What are the advantages and disadvantages of a structured questionnaire?
4. What are the several methods used to collect data by observation method?
5. What are the advantages and limitations of collecting data by observation method?
6. What are the various methods of survey research?
7. What is a questionnaire? What are its importance and characteristics?
8. Explain the steps involved in designing a questionnaire.
9. Explain Open ended and Closed ended questions in a questionnaire.
10. One method of sequencing the question in a questionnaire is to proceed from general to specific.
What is the logical reason behind this?
Further Readings
Books
Abrams, M.A., Social Surveys and Social Action, London: Heinemann, 1951.
Arthur, Maurice, Philosophy of Scientific Investigation, Baltimore: Johns Hopkins
University Press, 1943.
Bernal, J.D., The Social Function of Science, London: George Routledge and Sons,
1939.
Chase, Stuart, The Proper Study of Mankind: An inquiry into the Science of Human
Relations, New York, Harper and Row Publishers, 1958.
S. N. Murthy and U. Bhojanna, Business Research Methods, Excel Books.
Web Links
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC420179/
http://www.fao.org/3/w3241e/w3241e05.htm
https://www.youtube.com/watch?v=ywCDy7IWwHw
Objectives
After studying this unit, you will be able to:
Introduction
Let’s take a look at the most basic form of statistics, known as descriptive statistics. This branch of
statistics lays the foundation for all statistical knowledge. Descriptive statistics are used to describe
the basic features of the data gathered from an experimental study in various ways. Descriptive
statistics are distinguished from inductive (inferential) statistics. They provide simple summaries about the
sample and the measures. Together with simple graphical analysis, they form the basis of virtually
every quantitative analysis of data. It is necessary to be familiar with the primary methods of
describing data in order to understand phenomena and make intelligent decisions.
There may be two objectives for formulating a summary statistic: (1) to choose a statistic that shows
how different units seem similar; statistical textbooks call one solution to this objective a measure
of central tendency; and (2) to choose another statistic that shows how they differ. This kind of
statistic is often called a measure of dispersion.
Functions of an Average
1. To present huge mass of data in a summarised form: It is quite difficult for the human
mind to comprehend a large body of numerical facts. To summarise such data into a single
figure that is easier to understand and remember, an average measure is used.
2. To facilitate comparison: Averages can be used to compare different sets of data.
Workers' earnings in two factories, for example, can be compared using the mean (or
average) wages of each factory's workers.
3. To help in decision-making: The majority of research, planning, and other decisions are
based on the average value of particular variables. For example, if a company's average
monthly sales are declining, the sales manager may need to take steps to rectify the
situation.
Characteristics of a Good Average
A good average should satisfy the following criteria:
where ∑ (called sigma) denotes the summation sign. The subscript of X, i.e., ‘i’, is a positive integer
which indicates the serial number of the observation. Since there are n observations, i
varies from 1 to n. This is indicated by writing the limits below and above ∑, as written earlier.
When there is no ambiguity in the range of summation, this indication can be skipped and we may
simply write $X_1 + X_2 + \dots + X_n = \sum X_i$.
Mean or Arithmetic Mean can be worked out using ungrouped and grouped data.
Mean: Ungrouped Data
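For ungrouped or raw data, the mean has the following formula:
$$\bar{X} = \frac{\sum X_i}{n}$$
where $\bar{X}$ = mean, $\sum X_i$ = sum of all observations, and $n$ = number of observations.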
Problem:
Mr. Sharma operates a Web site service that employs 8 people. Find the mean
age of his workers if the ages of the employees are as follows:
55, 63, 34, 59, 29, 46, 51, 41
Solution:
55 + 63 + 34 + 59 + 29 + 46 + 51 + 41 = 378
Mean = 378 / 8 = 47.25
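The calculation above can be checked with a short Python sketch. The function name `arithmetic_mean` is chosen here for illustration only and is not part of the original text.

```python
# Ages of the eight employees from the problem above.
ages = [55, 63, 34, 59, 29, 46, 51, 41]


def arithmetic_mean(values):
    """Return the sum of the observations divided by their count."""
    return sum(values) / len(values)


mean_age = arithmetic_mean(ages)
print(mean_age)  # 47.25
```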
A mean can also be determined for data that is grouped, or placed in intervals. Unlike
listed data, the individual values for grouped data are not available, and it is not possible
to calculate their sum. To calculate the mean of grouped data, the first step is to determine
the midpoint of each interval or class. These midpoints must then be multiplied by the
frequencies of the corresponding classes. The sum of the products divided by the total
number of values will be the value of the mean.
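As a minimal sketch of the grouped-data procedure just described, the Python snippet below multiplies each class midpoint by its frequency and divides the total by the number of values. The class intervals and frequencies are hypothetical and were invented only to illustrate the steps.

```python
# Hypothetical grouped data: (lower limit, upper limit, frequency).
classes = [(20, 30, 3), (30, 40, 5), (40, 50, 2)]


def grouped_mean(class_table):
    """Mean of grouped data: sum(midpoint * frequency) / sum(frequency)."""
    total_frequency = sum(freq for _, _, freq in class_table)
    weighted_sum = sum(((low + high) / 2) * freq for low, high, freq in class_table)
    return weighted_sum / total_frequency


# Midpoints are 25, 35 and 45, so the mean is (75 + 175 + 90) / 10.
print(grouped_mean(classes))  # 34.0
```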
Problem:
Solution:
Outliers
As a summary statistic, the mean is frequently employed. It is, however, affected by extreme
values (outliers). Outliers are numbers that are extraordinarily high or low. The mean isn't an
appropriate summary statistic when there are extreme values at one end of a data set.
Median
The median of a distribution is the value of the variate that splits it into two equal parts. In terms of
a frequency curve, the ordinate drawn at the median divides the area under the curve into two equal
portions. The median is a positional average because its value is determined by the item's
position rather than its size.
(b) The mean of the sizes of the $\frac{n}{2}$th and $\left(\frac{n}{2} + 1\right)$th observations, when n is even.
Example: Find the median of the following observations:
20, 15, 25, 28, 18, 16, 30.
Solution:
Writing the observations in ascending order, we get 15, 16, 18, 20, 25, 28, 30.
Since n = 7, i.e., odd, the median is the size of the (7 + 1)/2th, i.e., 4th observation.
Hence, the median, denoted by Md, is 20.
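The positional rule for the median can be sketched in Python as follows. The function name `median_value` and the second, even-sized data set are assumptions added for illustration.

```python
def median_value(values):
    """Median: middle item when n is odd, mean of the two middle items when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # the (n + 1)/2th observation
    return (ordered[mid - 1] + ordered[mid]) / 2  # n/2th and (n/2 + 1)th observations


observations = [20, 15, 25, 28, 18, 16, 30]
print(median_value(observations))  # 20

# Hypothetical even-sized case: the same data without the value 30.
print(median_value([20, 15, 25, 28, 18, 16]))  # 19.0
```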
The Mode
There will only be one mean and one median for the same collection of data. When describing the
mode of a data set, the term modal is frequently employed. The term "unimodal" refers to a data
collection that contains only one value that occurs most frequently. Bimodal data is defined as a set
of data with two values that occur with the same maximum frequency. The term "multimodal"
refers to a set of data that contains more than two values that occur with the same highest
frequency.
In the above data set, the number 79 appears twice, but all the other numbers appear only
once. Since 79 appears with the greatest frequency, it is the mode of the data values.
Example 2: The ages of 12 randomly selected customers at a local Best Buy are listed below:
23, 21, 29, 24, 31, 21, 27, 23, 24, 32, 33, 19
The above data set has three values that each occur with a frequency of 2. These values are
21, 23, and 24. All other values occur only once. Therefore, this set of data has three modes.
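A short Python sketch of the same count is given below. It uses `collections.Counter` from the standard library; the helper name `modes` is an assumption made for illustration.

```python
from collections import Counter

# Ages of the 12 customers from Example 2.
ages = [23, 21, 29, 24, 31, 21, 27, 23, 24, 32, 33, 19]


def modes(values):
    """Return every value that occurs with the highest frequency, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes(ages))  # [21, 23, 24]
```

A result with one value corresponds to unimodal data, two values to bimodal data, and more than two to multimodal data.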
Example 3: You begin to observe the color of clothing your employees wear. Your goal is
to find out what color is worn most frequently so that you can offer company shirts to your
employees.
The color blue was worn 11 times during the week. All other colors were worn with much
less frequency in comparison to the color blue.
1. Range
2. Quartile deviation
3. Mean Deviation
4. Standard Deviation
Range
Range is the simplest measure of dispersion, which is defined as the difference between the largest
value and the smallest value. Mathematically, the absolute and the relative measures of range can
be written as follows:
R = L – S
Coefficient of Range = (L – S) / (L + S), where R = range, L = largest value, S = smallest value
So, it can be said that the range is the difference between the lowest and highest values.
Example: In {4, 6, 9, 3, 7} the lowest value is 3, and the highest is 9.
So the range is 9 − 3 = 6.
The range can sometimes be misleading when there are extremely high or low values.
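The range and its coefficient for the example set can be computed with the sketch below. The function names are illustrative assumptions, not part of the original text.

```python
data = [4, 6, 9, 3, 7]


def value_range(values):
    """Absolute measure: largest value minus smallest value (R = L - S)."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Relative measure: (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


print(value_range(data))           # 6
print(coefficient_of_range(data))  # 0.5
```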
Quartile Deviation
The Quartile Deviation is a simple way to estimate the spread of a distribution about a measure of
its central tendency (usually the median).
It gives an idea about the range within which the central 50% of sample data lies.
The first quartile or the lower quartile or the 25th percentile, also denoted by Q1, corresponds to the
value that lies halfway between the median and the lowest value in the distribution (when it is
already sorted in the ascending order). Hence, it marks the region which encloses 25% of the initial
data.
Similarly, the third quartile or the upper quartile or 75th percentile, also denoted by Q3,
corresponds to the value that lies halfway between the median and the highest value in the
distribution (when it is already sorted in the ascending order). It, therefore, marks the region which
encloses the 75% of the initial data or 25% of the end data.
Example 1: The number of vehicles sold by a Toyota Showroom in a day was recorded for
10 working days. The data is given as:
Day Frequency
1 20
2 15
3 18
4 5
5 10
6 17
7 21
8 19
9 25
10 28
We first need to sort the frequency data before proceeding with the quartiles calculation
Sorted Data – 5, 10, 15, 17, 18, 19, 20, 21, 25, 28 n(number of data points) = 10
Now, to find the quartiles, we use the logic that the first quartile lies halfway between the
lowest value and the median; and the third quartile lies halfway between the median and
the largest value.
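$$Q_1 = \left(\frac{n + 1}{4}\right)\text{th term} = \left(\frac{10 + 1}{4}\right)\text{th term} = 2.75\text{th term}$$
$$Q_1 = \text{2nd term} + 0.75 \times (\text{3rd term} - \text{2nd term}) = 10 + 0.75 \times (15 - 10) = 10 + 3.75 = 13.75$$
$$Q_3 = \left(\frac{3(n + 1)}{4}\right)\text{th term} = \left(\frac{3(10 + 1)}{4}\right)\text{th term} = 8.25\text{th term}$$
$$Q_3 = \text{8th term} + 0.25 \times (\text{9th term} - \text{8th term}) = 21 + 0.25 \times (25 - 21) = 21 + 1 = 22$$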
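Using the values for Q1 and Q3, we can now calculate the Quartile Deviation and
its coefficient:
$$\text{Quartile Deviation} = \frac{Q_3 - Q_1}{2} = \frac{22 - 13.75}{2} = \frac{8.25}{2} = 4.125$$
$$\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100 = \frac{22 - 13.75}{22 + 13.75} \times 100 = \frac{8.25}{35.75} \times 100 \approx 23.08$$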
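The showroom example can be reproduced with the Python sketch below. It assumes the (n + 1)/4 positional rule with linear interpolation between neighbouring terms, as used in the worked solution; other quartile conventions exist and give slightly different values. The helper name `term_at` is hypothetical.

```python
def term_at(ordered, position):
    """Value at a 1-based, possibly fractional position, interpolating linearly."""
    whole = int(position)
    fraction = position - whole
    if whole < 1:
        return ordered[0]
    if whole >= len(ordered):
        return ordered[-1]
    lower, upper = ordered[whole - 1], ordered[whole]
    return lower + fraction * (upper - lower)


# Vehicles sold on each of the 10 working days.
sales = [20, 15, 18, 5, 10, 17, 21, 19, 25, 28]
ordered = sorted(sales)
n = len(ordered)

q1 = term_at(ordered, (n + 1) / 4)       # 2.75th term
q3 = term_at(ordered, 3 * (n + 1) / 4)   # 8.25th term
quartile_deviation = (q3 - q1) / 2
coefficient = (q3 - q1) / (q3 + q1) * 100

print(q1, q3)                 # 13.75 22.0
print(quartile_deviation)     # 4.125
print(round(coefficient, 2))  # 23.08
```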
Mean Deviation
The mean deviation is defined as a statistical measure which is used to calculate the average
deviation from the mean value of the given data set.
Step 1: Find the mean value for the given data values
Step 2: Now, subtract the mean value from each of the data values given (Note: Ignore the
minus symbol)
Step 3: Now, find the mean of those values obtained in step 2.
Mean Deviation = [Σ |X – µ|]/N
Σ represents the addition of values
X represents each value in the data set
µ represents the mean value of the data set
N represents the number of data values
Example:
Problem: Determine the mean deviation for the data values
5, 3, 7, 8, 4, 9.
Given data values are 5, 3, 7, 8, 4, 9.
We follow the procedure for calculating the mean deviation.
First, find the mean for the given data:
Mean, µ = (5 + 3 + 7 + 8 + 4 + 9)/6
µ = 36/6 = 6
Now, subtract the mean from each data value, and ignore the minus symbol if any
(ignore "-"):
|5 – 6| = 1
|3 – 6| = 3
|7 – 6| = 1
|8 – 6| = 2
|4 – 6| = 2
|9 – 6| = 3
Now, the obtained data set is 1, 3, 1, 2, 2, 3.
Finally, find the mean value for the obtained data set. Therefore, the mean
deviation is = (1+3 + 1+ 2+ 2+3) /6 = 12/6 = 2
Hence, the mean deviation for 5, 3, 7, 8, 4, 9 is 2.
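The three steps can be expressed as a short Python sketch. The function name `mean_deviation` is an illustrative assumption.

```python
def mean_deviation(values):
    """Average of the absolute deviations from the mean: sum(|X - mu|) / N."""
    mu = sum(values) / len(values)                   # Step 1: mean of the data
    deviations = [abs(x - mu) for x in values]       # Step 2: ignore the minus sign
    return sum(deviations) / len(deviations)         # Step 3: mean of the deviations


values = [5, 3, 7, 8, 4, 9]
print(mean_deviation(values))  # 2.0
```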
Standard Deviation
The Standard Deviation is a measure of how spread out numbers are.
Its symbol is σ (the Greek letter sigma).
Normal Distribution Curve
The bell curve (also known as a "normal distribution" by statisticians) is a prominent tool used in
statistics to understand standard deviation. In real life, a large amount of data follows the shape of
the normal distribution. The Greek letter µ, at the centre of the curve, represents the mean, or average.
Each segment (dark blue to light blue in colour) reflects a standard deviation from the mean.
The formula to find the standard deviation (s) when working with samples is:
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]
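As a rough illustration, the sample formula can be computed by hand and compared with Python's built-in `statistics.stdev`. Reusing the data set 5, 3, 7, 8, 4, 9 from the mean deviation example is an assumption made only for this sketch.

```python
import math
import statistics


def sample_std_dev(values):
    """Sample standard deviation with the (n - 1) divisor."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


data = [5, 3, 7, 8, 4, 9]  # assumed data, borrowed from the earlier example
s = sample_std_dev(data)
print(round(s, 3), round(statistics.stdev(data), 3))
```

Both values agree because `statistics.stdev` also divides by n − 1.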
Expressed in percentage
Relative measures or measures of net changes
Measure change over a period of time or in two or more places
Specialized average
Measuring changes which are not directly measurable
When calculating the cost-of-living index for instructors, for example, book costs will be given more
weight than when calculating the cost-of-living index for workers. Weights should be neutral and
chosen logically rather than arbitrarily.
6. Purpose of Index Numbers
Different index numbers are created for specific purposes, and no one index number can be
considered an "all-purpose" index number. It is critical to understand the function of an index
number before it is created.
7. Selection of Method
There are two methods of computing the index numbers:
(a) Simple index number:
Simple index number can be constructed either by -
(i) Simple aggregate method
OR
(ii) Simple average of price relatives method.
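The two simple methods can be sketched as below. The prices are hypothetical and the function names are illustrative; the formulas assumed are the usual ones, Σp1/Σp0 × 100 for the simple aggregate method and the arithmetic mean of (p1/p0 × 100) for the price relatives method.

```python
def simple_aggregate_index(base_prices, current_prices):
    """Simple aggregate method: ratio of total current to total base prices."""
    return sum(current_prices) / sum(base_prices) * 100


def simple_average_of_relatives(base_prices, current_prices):
    """Arithmetic mean of the individual price relatives."""
    relatives = [p1 / p0 * 100 for p0, p1 in zip(base_prices, current_prices)]
    return sum(relatives) / len(relatives)


p0 = [10, 20, 50]  # hypothetical base-year prices
p1 = [12, 25, 60]  # hypothetical current-year prices
aggregate_index = simple_aggregate_index(p0, p1)
relatives_index = simple_average_of_relatives(p0, p1)
print(round(aggregate_index, 2), round(relatives_index, 2))
```

The two methods generally give different answers because the aggregate method is dominated by high-priced items, while the price relatives method treats every item equally.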
Limited coverage
Index numbers are based on sample items.
Qualitative changes are ignored
Ignores changes in the consumption pattern
Limited applicability
Misleading results: index numbers may not be perfectly accurate.
A wrong base year, wrong formulae or wrong weightage may have been used, etc.
Based on averages
Economics Gross Domestic Product (GDP), Consumer Price Index (CPI), S&P 500
Index, and unemployment rates
A long period isn't defined by a definite length of time. Long lengths of time vary
depending on the situation. For example, in the case of population or output trends, the
long period could be ten years, whereas the daily demand trend for vegetables could be a
month. It should be observed, however, that the longer the period, the more important the
trend. Furthermore, the growth or drop in values does not have to continue in the same
direction throughout the duration. The statistics could show a rising (or falling) trend at
first, followed by a decreasing (or rising) trend, etc.
Researching the series' previous growth or decline. When short-term fluctuations are
ignored, trend describes the data's basic growth or fall tendency.
The trend curve can be projected into the future for predicting if the same behavior is
assumed to continue in the future.
To investigate the impact of other factors, the trend can be assessed first and then
subtracted from the observed values.
Two or more time series' trend values can be utilized to compare them.
2. Periodic Variations
These variations, also known as oscillatory movements, repeat themselves after a regular interval of
time. This time interval is known as the period of oscillation. These oscillations are shown in the
following Figure:
If the period of oscillation is less than one year, the oscillatory movements are called Seasonal
Variations, and if the period is larger than one year, they are called Cyclical Variations. Seasonal
and cyclical changes may be present in a time series where the time gap between subsequent
measurements is less than or equal to one year. Seasonal changes, on the other hand, are
nonexistent if the time period between successive measurements is more than a year. Although
periodic variations are more or less regular, they are not always uniformly periodic, meaning that
the pattern of their variations in different periods may or may not be the same in terms of time
period and amount of periodic variations. For example, if a cycle takes five years to complete, the
next cycle may take more or less than five years to complete.
1. Causes of Seasonal Variations: The main causes of seasonal variations are: (a) Climatic
Conditions and (b) Customs and Traditions
(a) Climatic Conditions: The changes in climatic conditions affect the value of time series variable
and the resulting changes are known as seasonal variations. For example, the sale of woolen
garments is generally at its peak in the month of November because of the beginning of winter
season. Similarly, timely rainfall may increase agricultural output, prices of agricultural
commodities are lowest during their harvesting season, etc., reflect the effect of climatic conditions
on the value of time series variable.
(b) Customs and Traditions: The customs and traditions of the people also give rise to the seasonal
variations in time series.
The sale of garments and jewelry may be highest during the wedding season, the
sale of sweets during Diwali, and so on, are examples of differences that are the
outcome of people's rituals and traditions. It's worth noting that both of the
aforementioned causes occur on a regular basis and are frequently repeated after a
gap of less than or equal to one year.
As the name suggests, these variations do not reveal any regular pattern of movements.
These variations are caused by random factors such as strikes, floods, fire, war, famines, etc.
Random variations are that component of a time series which cannot be explained in terms of
any of the components discussed so far. This component is obtained as a residue after the
elimination of trend, seasonal and cyclical components and hence is often termed as residual
component. Random variations are usually short-term variations but sometimes their effect
may be so intense that the value of trend may get permanently affected.
Forecasting Approaches
Quantitative forecasts use one or more mathematical models that rely on historical data
and/or causal variables to forecast demand.
Qualitative forecasts use factors such as the decision makers' intuition, emotions, personal
experiences, and value system.
TIME SERIES MODELS
1. Naive Approach
2. Moving Averages
3. Exponential Smoothing
4. Trend Projection
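The four time series models listed above can be sketched in a few lines each. Everything here is illustrative: the demand series is hypothetical, the exponential smoothing is seeded with the first observation (one common convention, assumed here), and the trend projection fits a straight line by least squares.

```python
def naive_forecast(series):
    """Next period equals the most recent observation."""
    return series[-1]


def moving_average_forecast(series, window):
    """Mean of the last `window` observations."""
    return sum(series[-window:]) / window


def exponential_smoothing_forecast(series, alpha):
    """New forecast = alpha * actual + (1 - alpha) * previous forecast."""
    forecast = series[0]  # assumption: seed with the first observation
    for actual in series:
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast


def trend_projection_forecast(series, steps_ahead=1):
    """Fit y = a + b*t by least squares (t = 1..n) and extend the line."""
    n = len(series)
    t_values = range(1, n + 1)
    t_mean = sum(t_values) / n
    y_mean = sum(series) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(t_values, series))
             / sum((t - t_mean) ** 2 for t in t_values))
    intercept = y_mean - slope * t_mean
    return intercept + slope * (n + steps_ahead)


demand = [10, 12, 14, 16]  # hypothetical demand history
naive = naive_forecast(demand)
moving_avg = moving_average_forecast(demand, window=3)
smoothed = exponential_smoothing_forecast(demand, alpha=0.5)
trend = trend_projection_forecast(demand)
print(naive, moving_avg, smoothed, trend)
```

On a steadily rising series such as this one, the naive, moving average and smoothing forecasts all lag behind the trend projection, which is why the choice of model depends on whether a trend is present.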
Summary
The variance, its square root (the standard deviation), the range, the interquartile range,
and the average absolute deviation (average deviation) are the most popular measures of
variability for quantitative data.
A time series is a collection of data points taken at different times and separated by time
intervals.
Time series analysis refers to approaches for attempting to comprehend time series,
usually to comprehend the underlying context of the data points or to create predictions.
Time series forecasting is when a model is used to predict future events based on known
previous events: anticipating future data points before they are measured.
Data collected on a quarterly, monthly, weekly, daily, or hourly basis is likely to show
seasonal fluctuations.
In two or more scenarios, an index number is a technique for comparing the general
magnitude of a group of distinct but linked variables.
Keywords
Average: It is a single value which can be taken as representative of the whole distribution.
Descriptive Statistics: Descriptive statistics are used to describe the basic features of the data in a
study.
Dispersion: It is the spread of the data in a distribution.
Median: It is that value of the variate which divides it into two equal parts.
Mode: It is that value of the variate which occurs maximum number of times in a distribution and
around which other items are densely distributed.
Base Year: The year from which comparisons are made is called the base year. It is commonly
denoted by writing ‘0’ as a subscript of the variable.
Consumer Price: It is the price at which the ultimate consumer purchases his goods and services
from the retailer.
Current Year: The year under consideration for which the comparisons are to be computed is called
the current year. It is commonly denoted by writing ‘1’ as a subscript of the variable.
Index Number: An index number is a statistical measure used to compare the average level of
magnitude of a group of distinct but related variables in two or more situations.
Mean Squared Error: It is the sum of the squared forecast errors for each of the observations
divided by the number of observations.
Period of Oscillation: The time interval between the variations is known as the period of
oscillation.
Periodic Variations: The variations that repeat themselves after a regular interval of time.
Random Variations: The variations that do not reveal any regular pattern of movements.
Secular Trend: It is the general tendency of the data to increase or decrease or stagnate over a long
period of time.
Review Questions
1. Show that if all observations of a series are added, subtracted, multiplied or divided by a
constant b, the mean is also added, subtracted, multiplied or divided by the same constant.
2. The heights of 15 students of a class were noted as shown below. Compute arithmetic
mean.
Self-Assessment
4. Find the median of the following data: 160, 180, 200, 280, 300, 320, 400
A. 140
B. 180
C. 300
D. 280
5. In a frequency distribution the last cumulative frequency is 300, Median shall lie in:
A. 130th
B. 140th
C. 160th
D. 150th
6. In a frequency distribution, the last cumulative frequency is 500. Q3 (Third Quartile) must
lie in.
A. 275th
B. 375th
C. 150th
D. 175th
7. The average monthly production of a factory for the first 8 months is 2,500 units, and for the
next 4 months, the production was 1,200 units. The average monthly production of the year
will be
A. 2066.55 units
B. 2085.55 units
C. 2075.55 units
D. none
8. In a week the prices of a bag of rice were 350, 280, 340, 290, 320, 310, 300. The range is
A. 100
B. 70
C. 60
D. 90
9. The most frequently occurring number in a set of values is called the ____.
A. Mean
B. Median
C. Mode
D. Range
10. When a set of numbers is heterogeneous, you can place more trust in the measure of central
tendency as representing the typical person or unit.
A. True
B. False
C. Median
D. Mean
13. The variation in two or more variables studied by the index is called:
A. Composite index
B. simple index
C. price index
D. none of these
6. B 7. A 8. B 9. C 10. B
Further Readings
Arthur, Maurice, Philosophy of Scientific Investigation, Baltimore: Johns Hopkins
University Press, 1943.
R.S. Bhardwaj, Business Statistics, Excel Books, New Delhi, 2008.
S.N. Murthy and U. Bhojanna, Business Research Methods, Excel Books, 2007.
Allan G. Bluman, Elementary Statistics: A Step by Step Approach, McGraw-Hill College,
June 2003.
Mario F. Triola, Elementary Statistics, Addison-Wesley, January 2006.
Mark L. Berenson, David M. Levine, Timothy C. Krehbiel, Basic Business Statistics:
Concepts & Applications, Prentice Hall, May 2005.
Web Links
https://www.youtube.com/watch?v=Rjwknl_LuKw
https://www.youtube.com/watch?v=98K7AG32qv8
https://www.aptech.com/blog/introduction-to-the-fundamentals-of-time-series-data-and-analysis/
https://www.vedantu.com/commerce/index-numbers
Objectives
After studying this unit, you will be able to:
Identify the Steps involved in Hypothesis Testing
Introduction
A statistical hypothesis test is a way of using experimental data to make statistical decisions. A
finding is statistically significant in statistics if it is unlikely to have occurred by chance.
In contrast to exploratory data analysis, hypothesis testing is sometimes referred to as confirmatory
data analysis. In frequency probability, these decisions are almost always made using null
hypothesis tests; that is, tests that answer the question: assuming the null hypothesis is true,
what is the probability of seeing a value for the test statistic that is at least as extreme as the
value that was actually seen? Hypothesis testing can be used to determine whether experimental
data contain enough information to call conventional knowledge into question.
Null hypothesis
Alternate hypothesis
Assume that the population mean is m0 and that the sample mean is x̄. Because we believe
the population has a mean of m0, this is our null hypothesis, written H0: m = m0, where H0
denotes the null hypothesis. The alternative hypothesis is Ha: m ≠ m0. If the null hypothesis
is rejected, this indicates that the population mean is not m0, which means that the
alternative hypothesis is supported.
Significance Level
Once the hypothesis has been formulated, the next stage is to test its validity at a specific
level of significance. The significance level determines the level of confidence with which a null
hypothesis is accepted or rejected. A significance level of 5%, for example, means that the
probability of rejecting a true null hypothesis is 5%: on about 5 out of 100 occasions, the
researcher will be wrong in rejecting a true hypothesis. A significance level of one percent means
that the researcher has a one-in-a-hundred chance of wrongly rejecting a true hypothesis. As a
result, a 1% significance level provides more confidence in a decision to reject than a 5%
significance level.
There are two types of tests.
In a right side test, the critical region lies entirely in the right tail of the sample distribution.
Whether the test is one-sided or two-sided – depends on alternate hypothesis.
Example 1: A tyre company claims that mean life of its new tyre is 15,000 km. Now
the researcher formulates the hypothesis that tyre life is = 15,000 km.
A two-tailed test is one in which the test statistics leading to rejection of null hypothesis
falls on both tails of the sampling distribution curve as shown. One-tailed test is used when
Example 2: "Is the current advertisement less effective than the proposed new
advertisement"?
A two-tailed test is appropriate, when the researcher has no reason to focus on one side of
the issue.
Example 3:"Are the two markets - Mumbai and Delhi different to test market a
product?"
H0: m1 = m2
Ha: m1 ≠ m2 (two-sided)
Degree of Freedom
It tells the researcher the number of elements that can be chosen freely.
Compute
Make Decisions
Accepting or rejecting of the null hypothesis depends on whether the computed value falls in the
region of rejection at a given level of significance.
Discuss when you would prefer two tailed test to one tailed test.
α is the probability of a Type 1 error and β is the probability of a Type 2 error. When α = 0.10,
a true hypothesis will be accepted on 90 out of 100 occasions. Thus, there is a risk of rejecting a
true hypothesis on 10 out of every 100 occasions. To reduce the risk, use α = 0.01, which implies
that we are prepared to take a 1% risk, i.e., the probability of rejecting a true hypothesis is 1%.
It is also possible that in hypothesis testing we may commit a Type 2 error (β), i.e., accepting a
null hypothesis which is false.
The only way to reduce both Type 1 and Type 2 errors at the same time is by increasing the sample size.
Type 1 and Type 2 error is presented as follows. Suppose a marketing company has 2
distributors (retailers) with varying capabilities. On the basis of capabilities, the company
has grouped them into two categories (1) Competent retailer (2) Incompetent retailer. Thus,
R1 is a competent retailer and R2 is an incompetent retailer. The firm wishes to award a
performance bonus (as a part of trade promotion) to encourage good retailer ship. Assume
that two actions A1 and A2 would represent whether the bonus or trade incentive is given
and not given. This is shown as follows:
When the firm has failed to reward a competent retailer, it has committed a Type 2 error. On
the other hand, when it has rewarded an incompetent retailer, it has committed a Type 1
error.
(a) Z-Test, (b) T-Test and (c) F-Test.
3. Observations must be independent, i.e., the selection of any one item should not affect the
chances of any other item being included in the sample.
Univariate
If we wish to analyse one variable at a time, this is called univariate analysis. Example: Effect of
price on sales. Here, price is the independent variable and sales is the dependent variable. Change
the price and measure the sales.
Bivariate
The relationship of two variables at a time is examined by means of bivariate data analysis.
If one is interested in a problem of detecting whether a parameter has either increased or decreased,
a two-sided test is appropriate.
Z Test
When sample size is > 30
1. P1 = Proportion in sample 1
P2 = Proportion in sample 2
You are working as a purchase manager for a company. The following information
has been supplied by two scooter tyre manufacturers.
Company A Company B
In the above, the sample size is 100, hence a Z-test may be used.
Testing the hypothesis about difference between two means: This can be used when two population
means are given and null hypothesis is Ho: P1 = P2.
In a city during the year 2000, 20% of households indicated that they read Femina
magazine. Three years later, the publisher had reasons to believe that circulation has gone
up. A survey was conducted to confirm this. A sample of 1,000 respondents were contacted
and it was found 210 respondents confirmed that they subscribe to the periodical 'Femina'.
From the above, can we conclude that there is a significant increase in the circulation of
'Femina'?
Solution:
We will set up null hypothesis and alternate hypothesis as follows:
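Null hypothesis H0: P = 20%
Alternate hypothesis Ha: P > 20%
\[ Z = \frac{p - P}{\sqrt{P(1 - P)/n}} = \frac{0.21 - 0.20}{\sqrt{0.20 \times 0.80 / 1000}} = \frac{0.01}{\sqrt{0.00016}} = \frac{0.01}{0.0126} \approx 0.79 \]
The table value of Z at the 0.05 level (one-tailed) is 1.64. The calculated value of Z (0.79) is
smaller, so it does not fall in the rejection region. We therefore do not reject the null
hypothesis, and we cannot conclude that the circulation of 'Femina' has increased
significantly.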
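The test statistic can be recomputed directly from the figures stated in the problem (210 subscribers out of 1,000 respondents, against a hypothesised proportion of 20%). This is a minimal sketch with an illustrative function name; the critical value 1.645 is the standard one-tailed 5% point of the normal distribution.

```python
import math


def one_sample_proportion_z(successes, n, p0):
    """Z statistic for testing a sample proportion against p0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


z = one_sample_proportion_z(successes=210, n=1000, p0=0.20)
critical_value = 1.645  # one-tailed, 5% level
print(round(z, 2), "reject H0" if z > critical_value else "do not reject H0")
```

With these inputs the statistic comes out near 0.79, well below the critical value, so a one-point rise in the sample proportion is not enough evidence of an increase at this sample size.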
Problem
1. A certain pesticide is packed into bags by a machine. A random sample of 10 bags
are drawn and their contents are found as follows: 50, 49, 52, 44, 45, 48, 46, 45, 49, 45.
Confirm whether the average packaging can be taken to be 50 kgs.
In this test, the sample size is less than 30 and the population standard deviations are
not known. Using this test, we can find out whether there is any significant difference
between the two means, i.e., whether the two population means are equal.
2. There are two nourishment programmes 'A' and 'B'. Two groups of children are
subjected to this. Their weight is measured after six months. The first group of
children subjected to the programme 'A' weighed 44, 37, 48, 60, 41 kgs. at the end of
programme. The second group of children were subjected to nourishment
programme 'B' and their weight was 42, 42, 58, 64, 64, 67, 62 kgs. at the end of the
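programme. From the above, can we conclude that nourishment programme 'B'
increased the weight of the children significantly, given a 5% level of significance?
\[ t = \frac{\bar{x} - \bar{y}}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \]
Here n_1 = 5 and n_2 = 7.
\[ \sum x = 230, \quad \sum y = 399 \]
\[ \sum (x - \bar{x})^2 = 310, \quad \sum (y - \bar{y})^2 = 674 \]
\[ \bar{x} = \frac{\sum x}{n_1} = \frac{230}{5} = 46 \]
\[ \bar{y} = \frac{\sum y}{n_2} = \frac{399}{7} = 57 \]
\[ s^2 = \frac{1}{n_1 + n_2 - 2}\left\{\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2\right\} \]
\[ \text{D.F.} = (n_1 + n_2 - 2) = (5 + 7 - 2) = 10 \]
\[ s^2 = \frac{1}{10}\{310 + 674\} = 98.4 \]
\[ t = \frac{46 - 57}{\sqrt{98.4 \times \left(\frac{1}{5} + \frac{1}{7}\right)}} = \frac{-11}{\sqrt{98.4 \times \frac{12}{35}}} = \frac{-11}{\sqrt{33.73}} = -\frac{11}{5.8} = -1.89 \]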
The table value of t at 10 d.f. at the 5% level is 1.81.
Since the absolute value of the calculated t (1.89) is greater than 1.81, it is significant. Hence HA is
accepted. Therefore, the two nutrition programmes differ significantly with respect to weight increase.
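The pooled-variance calculation can be reproduced from the raw weights given in the problem. The sketch below uses an illustrative function name and only the standard library.

```python
import math


def pooled_t_test(x, y):
    """Two-sample t statistic with a pooled variance estimate."""
    n1, n2 = len(x), len(y)
    mean_x, mean_y = sum(x) / n1, sum(y) / n2
    ss_x = sum((v - mean_x) ** 2 for v in x)  # sum of squared deviations, group A
    ss_y = sum((v - mean_y) ** 2 for v in y)  # sum of squared deviations, group B
    df = n1 + n2 - 2
    pooled_variance = (ss_x + ss_y) / df
    t = (mean_x - mean_y) / math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return t, df, pooled_variance


programme_a = [44, 37, 48, 60, 41]
programme_b = [42, 42, 58, 64, 64, 67, 62]
t_stat, df, s_squared = pooled_t_test(programme_a, programme_b)
print(round(t_stat, 2), df, round(s_squared, 1))
```

The sign of t is negative only because programme 'A' is listed first; the comparison with the table value uses its absolute value.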
\[ t = r\sqrt{\frac{n - 2}{1 - r^2}} \]
Problem
A study of weight of 18 pairs of male and female employees in a company shows that
coefficient of correlation is 0.52. Test the significance of correlation.
Solution:
Applying t test:
\[ t = r\sqrt{\frac{n - 2}{1 - r^2}} \]
r = 0.52, n = 18
\[ t = 0.52\sqrt{\frac{18 - 2}{1 - (0.52)^2}} = \frac{0.52 \times 4}{0.854} = 2.44 \]
v = (n − 2) = (18 − 2) = 16
For v = 16, t_{0.05} = 2.12
The calculated value of t is greater than the table value. The given value of r is therefore significant.
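A short check of this calculation, assuming the formula t = r√((n − 2)/(1 − r²)) used in the solution; the function name is illustrative and the table value 2.12 is taken from the text.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing whether a correlation coefficient differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


t_value = correlation_t_statistic(r=0.52, n=18)
table_value = 2.12  # t at 16 d.f., 5% level, as quoted in the solution
print(round(t_value, 2), t_value > table_value)
```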
F-Test
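Let there be two independent random samples of sizes n_1 and n_2 from two normal populations with
variances \sigma_1^2 and \sigma_2^2 respectively. Further, let
\[ s_1^2 = \frac{1}{n_1 - 1}\sum (X_{1i} - \bar{X}_1)^2 \quad \text{and} \quad s_2^2 = \frac{1}{n_2 - 1}\sum (X_{2i} - \bar{X}_2)^2 \]
be the variances of the first and the second samples respectively. Then the F-statistic is defined as
the ratio of two \chi^2-variates, each divided by its degrees of freedom. Thus, we can write
\[ F = \frac{\chi^2_{n_1 - 1}/(n_1 - 1)}{\chi^2_{n_2 - 1}/(n_2 - 1)} = \frac{\dfrac{(n_1 - 1)s_1^2}{\sigma_1^2}\Big/(n_1 - 1)}{\dfrac{(n_2 - 1)s_2^2}{\sigma_2^2}\Big/(n_2 - 1)} = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \]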
Features of F-distribution
2. The mean of an F-variate with v_1 and v_2 degrees of freedom is \frac{v_2}{v_2 - 2} and its standard error is
\[ \frac{v_2}{v_2 - 2}\sqrt{\frac{2(v_1 + v_2 - 2)}{v_1(v_2 - 4)}} \]
We note that the mean will exist if v_2 > 2 and the standard error will exist if v_2 > 4. Further, the mean
is > 1.
3. The random variate F can take only positive values from 0 to ∞. The curve is positively skewed.
4. For large values of v_1 and v_2, the distribution approaches the normal distribution.
5. If a random variate follows the t-distribution with v degrees of freedom, then its square follows the
F-distribution with 1 and v d.f., i.e., t_v^2 = F_{1, v}.
6. F and \chi^2 are also related as F_{v_1, v_2} = \frac{\chi^2_{v_1}}{v_1} as v_2 → ∞.
Chi-square Test
A chi-square test (also chi-squared or χ² test) is any statistical hypothesis test in which the
sampling distribution of the test statistic is a chi-square distribution when the null hypothesis
is true, or any in which this is asymptotically true, meaning that the sampling distribution (if
the null hypothesis is true) can be made to approximate a chi-square distribution as closely as desired.
One case where the distribution of the test statistic is an exact chi-square
distribution is the test that the variance of a normally distributed population has a
given value based on a sample variance. Such a test is uncommon in practice
because values of variances to test against are seldom known exactly.
1. Sample observations should be independent, i.e., no individual item should be included
twice in a sample.
2. The sample should contain at least 50 observations
or
total frequency should be greater than 50.
3. There should be a minimum of five observations in any cell. This is called cell frequency
constraint.
Persons
Under 20-40 20-40 41-50 51 &over
Is there any significant difference between the age group and preference for the car?
Problem:
on the part of the competitor to conclude that the claim made by the company does not
hold good at the 5% level of significance?
Solution:
\[ \chi^2 = \sum \frac{(O - E)^2}{E} = 2.381 \]
The table value at the 0.05 level of significance for 1 d.f. is 3.841. The calculated
value, 2.381, is lower. Therefore, we accept the hypothesis that 70% of the people
in that metro drink Wood Smoke branded tea.
One-way ANOVA
Following are the steps followed in ANOVA:
calculated value of F is more than the critical value of F, the difference in sample means is
considered as significant and the null hypothesis is rejected.
Problem:
In a company there are four shop floors. Productivity rate for three methods of
incentives and gain sharing in each shop floor is presented in the following table.
Analyze whether various methods of incentives and gain sharing differ significantly at
5% and 1% F-limits.
X1 X2 X3
1 5 4 4
2 6 4 3
3 2 2 2
4 7 6 3
Solution:
Step 1: Calculate mean of each of the three samples (i.e., x1, x2 and x3, i.e. different
methods of incentive gain sharing).
\[ \bar{X}_1 = \frac{5 + 6 + 2 + 7}{4} = 5 \]
\[ \bar{X}_2 = \frac{4 + 4 + 2 + 6}{4} = 4 \]
\[ \bar{X}_3 = \frac{4 + 3 + 2 + 3}{4} = 3 \]
\[ \text{ss within} = \sum (X_{1i} - \bar{X}_1)^2 + \sum (X_{2i} - \bar{X}_2)^2 + \sum (X_{3i} - \bar{X}_3)^2 \]
= {(5 − 5)² + (6 − 5)² + (2 − 5)² + (7 − 5)²} + {(4 − 4)² + (4 − 4)² + (2 − 4)² + (6 − 4)²} + {(4 − 3)² + (3 − 3)² + (2 − 3)² + (3 − 3)²}
= (0 + 1 + 9 + 4) + (0 + 0 + 4 + 4) + (1 + 0 + 1 + 0)
= 14 + 8 + 2
= 24
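Step 4: ss of total variance, which is equal to the total of ss between and ss within, is
\[ \text{ss total} = \sum (X_{ij} - \bar{\bar{X}})^2 \]
where i = 1, 2, 3, ...
For our example, with a grand mean of 4, total ss will thus be:
{(5 − 4)² + (6 − 4)² + (2 − 4)² + (7 − 4)²} + {(4 − 4)² + (4 − 4)² + (2 − 4)² + (6 − 4)²} + {(4 − 4)² + (3 − 4)² + (2 − 4)² + (3 − 4)²} = 18 + 8 + 6 = 32
We will, however, get the same value if we simply total the respective values of ss between
and ss within. For our example, ss between is 8 and ss within is 24, thus ss of total
variance is 32 (8 + 24).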
Step 5: Ascertain degrees of freedom and mean square (MS) between and within the
samples. Degrees of freedom (df) for between samples and within samples are
computed differently as follows.
For between samples, df is (k − 1), where 'k' represents the number of samples (for us it is 3).
For within samples, df is (n − k), where 'n' represents the total number of items in all the
samples (for us it is 12).
Mean squares (MS) between and within samples are computed by dividing the ss
between and ss within by respective degrees of freedom. Thus, for our example:
(i) MS between = ss between / (k − 1) = 8 / (3 − 1) = 4,
where (k − 1) is the df.
(ii) MS within = ss within / (n − k) = 24 / (12 − 3) = 2.67,
where (n − k) is the df.
Step 6: Now we will have to compute the F ratio by analysing our samples. The
formula for computing the 'F' ratio is:
F = MS between / MS within
Thus, for our example, F ratio = 4 / 2.67 = 1.5
Step 7: Now we will have to analyze whether various methods of incentives and gain
sharing differ significantly at 5% and 1% 'F' limits. For this, we need to compare
observed 'F' ratio with 'F' table values. When observed 'F' value at given degrees of
freedom is either equal to or less than the table value, difference is considered
insignificant. In reverse cases, i.e., when calculated 'F' value is higher than table-F value,
the difference is considered significant and accordingly we draw our conclusion. For
example, our observed 'F' ratio at degrees of freedom (v1 & v2, i.e., 2 and 9) is 1.5. The
table value of F at 5% level with df 2 and 9 (v1 = 2, v2 = 9) is 4.26. Since the table value is
higher than the observed value, difference in rate of productivity due to various
methods of incentives and gain sharing is considered insignificant. At 1% level with df 2
and 9, we get the table value of F as 8.02 and we draw the same conclusion.
We can now draw an ANOVA table as follows to show our entire observation.
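The quantities that go into such a table can be cross-checked with a short script. This is a sketch under the usual one-way ANOVA definitions; the function and dictionary key names are illustrative, and the data are the three columns X1, X2 and X3 from the problem.

```python
def one_way_anova(groups):
    """Sums of squares, degrees of freedom, mean squares and F for a one-way ANOVA."""
    k = len(groups)                         # number of samples
    n = sum(len(g) for g in groups)         # total number of items
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_between + ss_within,
        "df_between": k - 1,
        "df_within": n - k,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
    }


x1 = [5, 6, 2, 7]
x2 = [4, 4, 2, 6]
x3 = [4, 3, 2, 3]
result = one_way_anova([x1, x2, x3])
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```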
In a two-way ANOVA, the interaction term tells you if one of your independent factors has the
same influence on the dependent variable for all values of the other independent variable (and vice
versa). Is the effect of educational level (undergraduate/postgraduate) on test anxiety influenced
by gender (male/female)? If a statistically significant interaction is discovered, you must also
evaluate whether any "simple main effects" exist, and if so, what these effects are.
Problem:
Company ‘X’ wants its employees to undergo three different types of training
programme with a view to obtain improved productivity from them. After the
completion of the training programme, 16 new employees are assigned at random to
three training methods and the production performance were recorded.
The training managers’ problem is to find out if there are any differences in the
effectiveness of the training methods? The data recorded is as under:
Daily Output of New Employees
Method 1 15 18 19 22 11
Method 2 22 27 18 21 17
Method 3 18 24 19 16 22 15
10 Draw conclusions.
Solution:
Variance between columns:
\[ \hat{\sigma}^2_{\text{between}} = \frac{\sum n_j(\bar{x}_j - \bar{\bar{x}})^2}{k - 1} = \frac{40}{3 - 1} = 20 \]
Variance between columns = 20
4-Calculation of sample variance
Method 1 (mean 17)           Method 2 (mean 21)           Method 3 (mean 19)
x    x − 17   (x − 17)²      x    x − 21   (x − 21)²      x    x − 19   (x − 19)²
15   −2       4              22   1        1              18   −1       1
18   1        1              27   6        36             24   5        25
19   2        4              18   −3       9              19   0        0
22   5        25             21   0        0              16   −3       9
11   −6       36             17   −4       16             22   3        9
                                                          15   −4       16
Sum of squared deviations: Method 1 = 70, Method 2 = 62, Method 3 = 60
\[ \text{Sample variance: } s_j^2 = \frac{\sum (x - \bar{x}_j)^2}{n_j - 1} \]
\[ s_1^2 = \frac{70}{4} = 17.5, \quad s_2^2 = \frac{62}{4} = 15.5, \quad s_3^2 = \frac{60}{5} = 12 \]
\[ \text{Within column variance} = \sum \frac{n_j - 1}{n_T - k}\, s_j^2 = \frac{4}{13} \times 17.5 + \frac{4}{13} \times 15.5 + \frac{5}{13} \times 12 = \frac{192}{13} \approx 14.77 \]
7-d.f. of numerator = (3 − 1) = 2.
10-The table value of F is 3.81. This is the upper limit of the acceptance region. Since the calculated value
of F lies below this limit, the null hypothesis is accepted.
Conclusion: There is no significant difference in the effect of the three training methods.
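The same conclusion can be reached by following the solution's route through sample variances. The sketch below is illustrative: the function name is hypothetical, `statistics.variance` supplies each sample variance (with the n − 1 divisor), and the table value 3.81 is the one quoted in the solution.

```python
import statistics


def f_ratio_from_sample_variances(groups):
    """Between-column variance, within-column variance and their ratio."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [statistics.mean(g) for g in groups]
    between = sum(len(g) * (m - grand_mean) ** 2
                  for g, m in zip(groups, means)) / (k - 1)
    # Weighted average of the sample variances, weights (n_j - 1) / (n_T - k)
    within = sum((len(g) - 1) / (n_total - k) * statistics.variance(g)
                 for g in groups)
    return between, within, between / within


method_1 = [15, 18, 19, 22, 11]
method_2 = [22, 27, 18, 21, 17]
method_3 = [18, 24, 19, 16, 22, 15]
between, within, f_ratio = f_ratio_from_sample_variances([method_1, method_2, method_3])
table_value = 3.81  # F at (2, 13) d.f., 5% level, as quoted in the solution
print(round(between, 2), round(within, 2), round(f_ratio, 2), f_ratio < table_value)
```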
Non-parametric Test
Non-parametric tests are used to test the hypothesis with nominal and ordinal data.
1. We do not make assumptions about the shape of population distribution.
2. The hypothesis of non-parametric test is concerned with something other than the value of a
population parameter.
3. Easy to compute. There are several conditions, especially in marketing research, when
parametric tests' assumptions aren't valid. In a parametric test, for example, we assume that
the data is distributed normally. Non-parametric tests are employed in these situations.
Binomial test, Mann-Whitney U test, Sign test, and other non-parametric tests are examples.
When there are only two classes in a population, such as males and females, buyers and
non-buyers, success and failure, a binomial test is performed. Every observation in the
population must fall into one of these two classes. When the sample size is small, the binomial test
is employed.
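As a small illustration of the binomial test, the exact one-sided p-value can be computed from the binomial formula. The scenario is hypothetical: 8 buyers in a sample of 10, tested against a null proportion of 0.5; the function name is also illustrative.

```python
from math import comb


def binomial_upper_tail(successes, n, p0=0.5):
    """P(X >= successes) when X follows a binomial distribution with parameters n and p0."""
    return sum(comb(n, k) * p0 ** k * (1 - p0) ** (n - k)
               for k in range(successes, n + 1))


# Hypothetical sample: 8 buyers out of 10 respondents, H0: p = 0.5
p_value = binomial_upper_tail(successes=8, n=10)
print(round(p_value, 4), "reject H0" if p_value <= 0.05 else "do not reject H0")
```

No assumption about the shape of the population distribution is needed, which is what makes the test suitable for small samples of two-class data.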
Advantages
Disadvantage
Non-parametric test involves the greater risk of accepting a false hypothesis and thus committing a
Type 2 error.
Summary
Hypothesis testing is the use of statistics to determine the probability that a given hypothesis
is true.
The usual process of hypothesis testing consists of four steps.
Formulate the null hypothesis and the alternative hypothesis.
Identify a test statistic that can be used to assess the truth of the null hypothesis.
Compute the P-value, which is the probability that a test statistic at least as significant as the
one observed would be obtained assuming that the null hypothesis were true.
The smaller the P-value, the stronger the evidence against the null hypothesis.
Compare the P-value with an acceptable significance level α. If P ≤ α, the observed effect is
statistically significant, the null hypothesis is ruled out, and the alternative hypothesis is valid.
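The last two steps can be sketched for a test statistic that follows the standard normal distribution. The function names are illustrative, the z values are hypothetical, and the one-tailed branch assumes the observed direction matches the alternative hypothesis.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def p_value_from_z(z, two_tailed=False):
    """Probability of a statistic at least as extreme as z under H0."""
    upper_tail = 1 - normal_cdf(abs(z))
    return 2 * upper_tail if two_tailed else upper_tail


def decide(p_value, alpha=0.05):
    """Compare the P-value with the significance level."""
    return "reject H0" if p_value <= alpha else "do not reject H0"


for z in (2.5, 0.79):  # hypothetical test statistics
    p = p_value_from_z(z)
    print(z, round(p, 4), decide(p))
```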
Keywords
Alternate Hypothesis: An alternative hypothesis is one that specifies that the null hypothesis is not
true. The alternative hypothesis is false when the null hypothesis is true, and true when the null
hypothesis is false.
ANOVA: It is a statistical technique used to test the equality of three or more sample means.
Degree of Freedom:It is the consideration that tells the researcher the number of elements that can
be chosen freely.
Null Hypothesis:The null hypothesis is a hypothesis which the researcher tries to disprove, reject
or nullify.
Significance Level: Significance level is the criterion used for rejecting the null hypothesis.
Self Assessment
1-When the prediction does not specify a direction, the researcher has a:
A. One-tailed hypothesis
B. Two-tailed hypothesis
C. Null hypothesis
D. None of these
2-The significance level can be denoted by:
A. Alpha
B. Beta
C. Gamma
D. Hyphen
3-A hypothesis is a specific statement of
A. Estimates
B. Assessment
C. Accuracy
D. Prediction
A. One-tailed hypothesis
B. Two-tailed hypothesis
C. Multi-tailed hypothesis
D. None of these
A. population parameters.
B. sample parameters.
C. sample statistics.
D. it depends - sometimes population parameters and sometimes sample statistics
6-A statement made about a population for testing purpose is called?
A. Statistic
B. Hypothesis
C. Level of Significance
D. Test-Statistic
7-If the Critical region is evenly distributed then the test is referred as?
A. One tailed
B. Two tailed
C. Three tailed
D. Zero tailed
8-Alternative Hypothesis is also called as?
A. Composite Hypothesis
B. Research Hypothesis
C. Simple Hypothesis
D. Null Hypothesis
9-The statement “If there is sufficient evidence to reject a null hypothesis at the 10% significance
level, then there is sufficient evidence to reject it at the 5% significance level” is:
A. Always True
B. Never True
C. Sometimes true; the p-value for the statistical test needs to be provided for a conclusion.
D. Not Enough Information: this would depend on the type of statistical test used
10-Any statement whose validity is tested based on a sample is called
A. Null hypothesis
B. Alternative hypothesis
C. Statistical hypothesis
D. Simple hypothesis
A. Simple
B. Composite
C. Null
D. All of the above
12-A null hypothesis is rejected if the value of a test statistic lies in the
A. Rejection region
B. Acceptance region
C. Both
D. None
14- Which of the following distribution is useful for small sample while testing for population
means?
A. Z distribution
B. F distribution
C. Chi-square distribution
D. T distribution
15-The t distribution could be used
6. B 7. B 8. B 9. C 10. C
Review Questions
1. What hypothesis, test and procedure would you use when an automobile company has
manufacturing facility at two different geographical locations? Each location manufactures
two-wheelers of a different model. The customer wants to know if the mileage given by both
the models is the same or not. Samples of 45 numbers may be taken for this purpose.
2. What hypothesis, test and procedure would you use when a company has 22 sales
executives? They underwent a training programme. The test must evaluate whether the sales
performance is unchanged or improved after the training programme.
3. What hypothesis, test and procedure would you use when a company has three categories of
managers:
(a) With professional qualifications but without work experience.
(b)With professional qualifications accompanied by work experience.
(c) Without professional qualifications but with work experience.
4. Each person in a random sample of 50 was asked to state his/her sex and preferred colour.
The resulting frequencies are shown below.
Sex Female 15 6 4
A chi-square test is used to test the null hypothesis that sex and preferred colour are
independent. Will you reject the null hypothesis at the 0.005 level? Why/Why not?
5. In hypothesis testing, if β is the probability of committing an error of Type II, is the power of
the test, 1 − β, then the probability of rejecting H0 when HA is true or not? Why?
6. In a statistical test of hypothesis, what would happen to the rejection region if α, the level of
significance, is reduced?
7. During the pre-flight check, Pilot Mohan discovers a minor problem - a warning light
indicates that the fuel gauge may be broken. If Mohan decides to check the fuel level by hand,
it will delay the flight by 45 minutes. If he decides to ignore the warning, the aircraft may run
out of fuel before it gets to Mumbai. In this situation, what would be:
(a) the appropriate null hypothesis? and (b) a type I error?
8. Can the probability of a Type II error be controlled by the sample size? Why/ why not?
9. A research biologist has carried out an experiment on a random sample of 15 experimental
plots in a field. Following the collection of data, a test of significance was conducted under
appropriate null and alternative hypotheses and the P-value was determined to be
approximately .03. What does this indicate with respect to the hypothesis testing?
10. Two samples were drawn from a recent survey, each containing 500 hamlets. In the first
sample, the mean population per hamlet was found to be 100 with a S.D. of 20, while in the
second sample the mean population was 120 with a S.D. 15. Do you find the averages of the
samples to be statistically significant?
11. A simple random sample of size 100 has a mean of 15, the population variance being 25. Find
an interval estimate of the population mean with a confidence level of (i) 99% and (ii) 95%.
12. A population consists of five numbers 2, 3, 6, 8, 11. Consider all possible samples of size two
which can be drawn with replacement from this population. Calculate the S.E. of sample
means.
13. A certain drug is claimed to be effective in curing colds. In an experiment on patients with
colds, half of them were given the drug and half of them were given sugar pills.
The patients’ reactions to the treatment are recorded in the following table.
Drug 52 10 18
Sugar pills 44 10 26
Test the hypothesis that the drug is no better than the sugar pills for curing colds. (The 5% value
of χ² for ν = 2 is 5.991.)
Further Readings
Abrams, M.A, Social Surveys and Social Action, London: Heinemann, 1951.
Arthur, Maurice, Philosophy of Scientific Investigation, Baltimore: Johns Hopkins
University Press, 1943.
R.S. Bhardwaj, Business Statistics, Excel Books, New Delhi, 2008.
S.N. Murthy and U. Bhojanna, Business Research Methods, Excel Books, 2007.
Web Links
https://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/
https://machinelearningmastery.com/statistical-hypothesis-tests/
https://www.uth.tmc.edu/uth_orgs/educ_dev/oser/L2_2.HTM
https://online.stat.psu.edu/statprogram/reviews/statistical-concepts/hypothesis-testing
Objectives
After studying this unit, you will be able to:
Explain the concept of correlation
Judge the scope of correlation analysis
Define rank correlation
Introduction
Once best estimates are chosen, both from a statistical and epidemiologic perspective, hypotheses
about the estimated association between a single mean, proportion, or rate and a fixed value,
typically standard or goal, or about the estimated association between two or more means,
proportions, or rates can be tested.
The measures of association refer to a wide variety of coefficients that measure the strength of the
relationship between variables, a strength that has been described in several ways. In its narrower
sense, the term applies to situations in which at least one of the variables is dichotomous in
nature, generally nominal or ordinal. Several measures of association express the strength of the
relationship in terms of its degree of monotonicity, which is based on counting the various types of
pairs in the relationship.
10.1 Correlation
Various experts have defined correlation in their own words and their definitions, broadly
speaking, imply that correlation is the degree of association between two or more variables. Some
important definitions of correlation are given below:
— A.M. Tuttle
2. “When the relationship is of a quantitative nature, the appropriate statistical tool for discovering
and measuring the relationship and expressing it in a brief formula is known as correlation.”
— YaLun Chou
Correlation Coefficient: It is a numerical measure of the degree of association between two or
more variables.
Example: If we have data on price of wheat and its cost of production, the correlation
between them may be very high because higher price of wheat may attract farmers to
produce more wheat and more production of wheat may mean higher cost of production,
assuming that it is an increasing cost industry. Further, the higher cost of production may
in turn raise the price of wheat.
For the purpose of determining a relationship between the two variables in such situations,
we can take any one of them as independent variable.
3. The two variables may be acted upon by the outside influences: In this case we might get a
high value of correlation between the two variables, however, apparently no cause and effect
type relation seems to exist between them.
Example: The demands of the two commodities, say X and Y, may be positively correlated
because the incomes of the consumers are rising. Coefficient of correlation obtained in such
a situation is called a spurious or nonsense correlation.
4. A high value of the correlation coefficient may be obtained due to sheer coincidence (or pure
chance): This is another situation of spurious correlation. Given the data on any two
variables, one may obtain a high value of correlation coefficient when in fact they do not have
any relationship.
Example: A high value of correlation coefficient may be obtained between the size
of shoe and the income of persons of a locality.
Scatter Diagram
Let the bivariate data be denoted by $(X_i, Y_i)$, where $i = 1, 2, \ldots, n$. In order to have some idea about
the extent of association between variables $X$ and $Y$, each pair $(X_i, Y_i)$, $i = 1, 2, \ldots, n$, is plotted on a
graph. The diagram, thus obtained, is called a Scatter Diagram.
Each pair of values $(X_i, Y_i)$ is denoted by a point on the graph. The set of such points may cluster
around a straight line or a curve or may not show any tendency of association. Various possible
situations are shown with the help of following diagrams:
The sets of points in scatter diagram are known as dots of the diagram
If all the points or dots lie exactly on a straight line or a curve, the association between the variables
is said to be perfect. This is shown below:
A scatter diagram of the data helps in having a visual idea about the nature of association between
two variables. If the points cluster along a straight line, the association between variables is linear.
Further, if the points cluster along a curve, the corresponding association is non-linear or
curvilinear. Finally, if the points neither cluster along a straight line nor along a curve, there is
absence of any association between the variables.
It is also obvious from the above figure that when low (high) values of X are associated with low
(high) value of Y, the association between them is said to be positive. Contrary to this, when low
(high) values of X are associated with high (low) values of Y, the association between them is said
to be negative.
This unit deals only with linear association between the two variables X and Y. We shall measure
the degree of linear association by the Karl Pearson’s formula for the coefficient of linear
correlation.
Let $\bar{X}$ and $\bar{Y}$ be the arithmetic means of $X$ and $Y$ respectively. Draw two lines $X = \bar{X}$ and $Y = \bar{Y}$ on
the scatter diagram. These two lines intersect at the point $(\bar{X}, \bar{Y})$, are mutually perpendicular, and
divide the whole diagram into four parts, termed as I, II, III and IV quadrants, as shown.
As mentioned earlier, the correlation between $X$ and $Y$ will be positive if low (high) values of $X$ are
associated with low (high) values of $Y$. In terms of the above Figure, we can say that when values of
$X$ that are greater (less) than $\bar{X}$ are generally associated with values of $Y$ that are greater (less) than
$\bar{Y}$, the correlation between $X$ and $Y$ will be positive. This implies that there will be a general
tendency of points to concentrate in I and III quadrants. Similarly, when correlation between $X$ and
$Y$ is negative, the points of the scatter diagram will have a general tendency to concentrate in II and
IV quadrants.
Further, if we consider deviations of values from their means, i.e., $(X_i - \bar{X})$ and $(Y_i - \bar{Y})$, we note
that:
1 Both $(X_i - \bar{X})$ and $(Y_i - \bar{Y})$ will be positive for all points in quadrant I.
2 $(X_i - \bar{X})$ will be negative and $(Y_i - \bar{Y})$ will be positive for all points in quadrant II.
3 Both $(X_i - \bar{X})$ and $(Y_i - \bar{Y})$ will be negative for all points in quadrant III.
4 $(X_i - \bar{X})$ will be positive and $(Y_i - \bar{Y})$ will be negative for all points in quadrant IV.
It is obvious from the above that the product of deviations, i.e., $(X_i - \bar{X})(Y_i - \bar{Y})$, will be
positive for points in quadrants I and III and negative for points in quadrants II and IV.
Notes: Since, for positive correlation, the points will tend to concentrate more in I and
III quadrants than in II and IV, the sum of positive products of deviations will
outweigh the sum of negative products of deviations. Thus, $\sum (X_i - \bar{X})(Y_i - \bar{Y})$,
taken over all the $n$ observations, will be positive.
Similarly, when correlation is negative, the points will tend to concentrate more in II
and IV quadrants than in I and III. Thus, the sum of negative products of deviations
will outweigh the sum of positive products and hence $\sum (X_i - \bar{X})(Y_i - \bar{Y})$,
taken over all the $n$ observations, will be negative.
On the basis of the above, we can consider $\sum (X_i - \bar{X})(Y_i - \bar{Y})$ as an absolute measure of correlation. This
measure, like other absolute measures of dispersion, skewness, etc., will depend upon (i) the
number of observations and (ii) the units of measurement of the variables.
In order to avoid its dependence on the number of observations, we take its average, i.e.,
$\frac{1}{n}\sum (X_i - \bar{X})(Y_i - \bar{Y})$. This term is called covariance in statistics and is denoted as $\mathrm{Cov}(X, Y)$.
To eliminate the effect of units of measurement of the variables, the covariance term is divided by
the product of the standard deviation of $X$ and the standard deviation of $Y$. The resulting
expression is known as the Karl Pearson's coefficient of linear correlation or the product moment
correlation coefficient or simply the coefficient of correlation, between $X$ and $Y$.
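$$r_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} \tag{1}$$
or
$$r_{XY} = \frac{\frac{1}{n}\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\frac{1}{n}\sum (X_i - \bar{X})^2}\,\sqrt{\frac{1}{n}\sum (Y_i - \bar{Y})^2}} \tag{2}$$
or, on cancelling $\frac{1}{n}$,
$$r_{XY} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2}\,\sqrt{\sum (Y_i - \bar{Y})^2}} \tag{3}$$
Consider $\sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum (X_i - \bar{X})Y_i - \bar{Y}\sum (X_i - \bar{X})$
$= \sum X_i Y_i - \bar{X}\sum Y_i$ (second term is zero)
$= \sum X_i Y_i - n\bar{X}\bar{Y}$ (since $\sum Y_i = n\bar{Y}$)
Similarly, $\sum (X_i - \bar{X})^2 = \sum X_i^2 - n\bar{X}^2$ and $\sum (Y_i - \bar{Y})^2 = \sum Y_i^2 - n\bar{Y}^2$. Hence
$$r_{XY} = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sqrt{\sum X_i^2 - n\bar{X}^2}\,\sqrt{\sum Y_i^2 - n\bar{Y}^2}} \tag{4}$$
$$r_{XY} = \frac{\sum X_i Y_i - \frac{(\sum X_i)(\sum Y_i)}{n}}{\sqrt{\sum X_i^2 - \frac{(\sum X_i)^2}{n}}\,\sqrt{\sum Y_i^2 - \frac{(\sum Y_i)^2}{n}}} \tag{5}$$
$$r_{XY} = \frac{n\sum X_i Y_i - (\sum X_i)(\sum Y_i)}{\sqrt{n\sum X_i^2 - (\sum X_i)^2}\,\sqrt{n\sum Y_i^2 - (\sum Y_i)^2}} \tag{6}$$
Further, writing $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$ for the deviations from the means,
$$r_{XY} = \frac{\sum x_i y_i}{\sqrt{\sum x_i^2}\,\sqrt{\sum y_i^2}} \tag{7}$$
or
$$r_{XY} = \frac{\frac{1}{n}\sum x_i y_i}{\sqrt{\frac{\sum x_i^2}{n}}\,\sqrt{\frac{\sum y_i^2}{n}}} \tag{8}$$
or
$$r_{XY} = \frac{1}{n}\cdot\frac{\sum x_i y_i}{\sigma_X \sigma_Y} \tag{9}$$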
Equations (5) or (6) are often used for the calculation of correlation from raw data, while the use of
the remaining equations depends upon the forms in which the data are available. For example, if
standard deviations of $X$ and $Y$ are given, equation (9) may be appropriate.
Example: Calculate the Karl Pearson's coefficient of correlation from the following
pairs of values:
Values of Xi 12 9 8 10 11 13 7
Values of Yi 14 8 6 9 11 12 3
Solution
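The formula for Karl Pearson's coefficient of correlation is
$$r_{XY} = \frac{n\sum X_i Y_i - (\sum X_i)(\sum Y_i)}{\sqrt{n\sum X_i^2 - (\sum X_i)^2}\,\sqrt{n\sum Y_i^2 - (\sum Y_i)^2}}$$
The values of different terms, given in the formula, are calculated from the following
table:
Xi      Yi      XiYi    Xi²     Yi²
12      14      168     144     196
9       8       72      81      64
8       6       48      64      36
10      9       90      100     81
11      11      121     121     121
13      12      156     169     144
7       3       21      49      9
Total
70      63      676     728     651
$$r_{XY} = \frac{7 \times 676 - 70 \times 63}{\sqrt{7 \times 728 - (70)^2}\,\sqrt{7 \times 651 - (63)^2}} = 0.949$$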
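The arithmetic in this solution can be checked with a short script. The following is a minimal sketch in Python, not part of the original text; the function name `pearson_r` and the variable names are illustrative assumptions, and only the standard library is used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Karl Pearson's r from raw data, using the sums of X, Y, XY, X^2 and Y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# The seven pairs of values given in the example.
x_values = [12, 9, 8, 10, 11, 13, 7]
y_values = [14, 8, 6, 9, 11, 12, 3]

r_xy = pearson_r(x_values, y_values)
print(f"r = {r_xy:.3f}")
```

The result should agree, to three decimal places, with the value of 0.949 obtained by hand.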
Example: Calculate the correlation between Reading, (X) and Spelling (Y) for the 10
students whose scores are given below:
Solution:
$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N \sigma_X \sigma_Y} = \frac{-60.5}{(10)(2.872)(5.832)} = \frac{-60.5}{167.495} = -0.36$$
However, in real practice, we use the computational or raw score formula for the
correlation coefficient:
$$r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{N\sum X^2 - (\sum X)^2}\,\sqrt{N\sum Y^2 - (\sum Y)^2}}$$
Where:
(i) $N$ is the number of subjects
(ii) $\sum XY$ is the sum of each subject's $X$ score times the $Y$ score,
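Correlation between Reading and Spelling for the data given in example using
Computational Formula:
$$r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{N\sum X^2 - (\sum X)^2}\,\sqrt{N\sum Y^2 - (\sum Y)^2}}$$
$$= \frac{(10)(506) - (55)(103)}{\sqrt{(10)(385) - (55)^2}\,\sqrt{(10)(1401) - (103)^2}}$$
$$= \frac{5060 - 5665}{\sqrt{3850 - 3025}\,\sqrt{14010 - 10609}} = \frac{-605}{\sqrt{825}\,\sqrt{3401}}$$
$$= \frac{-605}{(28.723)(58.318)} = \frac{-605}{1675.0679} = -0.36$$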
Thus, the correlation is −0.36, indicating that there is a small negative correlation
between reading and spelling. The correlation coefficient is a number that can range
from −1 (perfect negative correlation) through 0 (no correlation) to +1 (perfect positive
correlation).
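When only the column totals are available, as in the reading and spelling example, the raw score formula can be evaluated directly from those totals. The sketch below is illustrative only; the function name `pearson_r_from_sums` is an assumption, and the totals (N = 10, ΣX = 55, ΣY = 103, ΣXY = 506, ΣX² = 385, ΣY² = 1401) are the ones used in the computation above.

```python
from math import sqrt


def pearson_r_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Raw score formula for r, evaluated from column totals only."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Totals taken from the reading (X) and spelling (Y) computation.
r_reading_spelling = pearson_r_from_sums(
    n=10, sum_x=55, sum_y=103, sum_xy=506, sum_x2=385, sum_y2=1401
)
print(f"r = {r_reading_spelling:.2f}")
```

The sign of the numerator alone (5060 − 5665 = −605) already shows the direction of the relationship; the denominator only rescales it into the range from −1 to +1.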
Task: The covariance between the length and weight of five items is 6 and their standard
deviations are 2.45 and 2.61 respectively. Find the coefficient of correlation between
length and weight.
The Karl Pearson's coefficient of correlation and covariance between two variables
$X$ and $Y$ are −0.85 and −15 respectively. If the variance of $Y$ is 9, find the standard deviation of
X.
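Write $X_i = A + h u_i$, where $A$ and $h$ are constants (the change of origin and scale), $\therefore \bar{X} = A + h\bar{u}$
Thus we have $X_i - \bar{X} = A + h u_i - A - h\bar{u} = h(u_i - \bar{u})$
Similarly, $Y_i = B + k v_i$, $\therefore \bar{Y} = B + k\bar{v}$
Thus $Y_i - \bar{Y} = B + k v_i - B - k\bar{v} = k(v_i - \bar{v})$
$$r_{XY} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2}\,\sqrt{\sum (Y_i - \bar{Y})^2}}$$
$$= \frac{\sum h(u_i - \bar{u})\,k(v_i - \bar{v})}{\sqrt{h^2\sum (u_i - \bar{u})^2}\,\sqrt{k^2\sum (v_i - \bar{v})^2}} = \frac{\sum (u_i - \bar{u})(v_i - \bar{v})}{\sqrt{\sum (u_i - \bar{u})^2}\,\sqrt{\sum (v_i - \bar{v})^2}}$$
(taking $h$ and $k$ to be positive)
$$\therefore r_{XY} = r_{uv}$$
This shows that correlation between $X$ and $Y$ is equal to correlation between $u$ and $v$, where $u$ and
$v$ are the variables obtained by change of origin and scale of the variables $X$ and $Y$ respectively.
This property is very useful in the simplification of computations of correlation. On the basis of this
property, we can write a short-cut formula for the computation of $r_{XY}$:
$$r_{XY} = \frac{n\sum u_i v_i - (\sum u_i)(\sum v_i)}{\sqrt{n\sum u_i^2 - (\sum u_i)^2}\,\sqrt{n\sum v_i^2 - (\sum v_i)^2}} \tag{10}$$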
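The independence of r from a change of origin and scale can also be checked numerically. The sketch below is only an illustration: the constants A, h, B and k are arbitrary values chosen here (with h and k positive), and the data are the seven pairs from the earlier worked example.

```python
from math import isclose, sqrt


def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


x_values = [12, 9, 8, 10, 11, 13, 7]
y_values = [14, 8, 6, 9, 11, 12, 3]

# Hypothetical origins (A, B) and positive scales (h, k), chosen for illustration.
A, h = 10, 2
B, k = 9, 3
u_values = [(x - A) / h for x in x_values]
v_values = [(y - B) / k for y in y_values]

r_xy = pearson_r(x_values, y_values)
r_uv = pearson_r(u_values, v_values)
print(isclose(r_xy, r_uv))
```

If h and k had opposite signs, the magnitude of r would be unchanged but its sign would flip, which is why the scales are taken to be positive here.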
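To show that $r_{XY}$ lies between $-1$ and $+1$, let $x_i' = \frac{X_i - \bar{X}}{\sigma_X}$ and $y_i' = \frac{Y_i - \bar{Y}}{\sigma_Y}$, so that
$$\frac{1}{n}\sum (x_i')^2 = \frac{\sum (X_i - \bar{X})^2}{n\sigma_X^2} = 1 \quad \text{and} \quad \frac{1}{n}\sum (y_i')^2 = \frac{\sum (Y_i - \bar{Y})^2}{n\sigma_Y^2} = 1$$
Also,
$$r_{XY} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{n\sigma_X\sigma_Y} = \frac{1}{n}\sum \frac{X_i - \bar{X}}{\sigma_X}\cdot\frac{Y_i - \bar{Y}}{\sigma_Y} = \frac{1}{n}\sum x_i' y_i'$$
Consider the sum $x_i' + y_i'$. The square of this sum is always a non-negative number,
i.e. $(x_i' + y_i')^2 \geq 0$
Taking sum over all the observations and dividing by $n$, we get
$$\frac{1}{n}\sum (x_i' + y_i')^2 \geq 0 \quad \text{or} \quad \frac{1}{n}\sum \left[(x_i')^2 + (y_i')^2 + 2x_i' y_i'\right] \geq 0$$
$$\text{or} \quad \frac{1}{n}\sum (x_i')^2 + \frac{1}{n}\sum (y_i')^2 + \frac{2}{n}\sum x_i' y_i' \geq 0$$
$$\text{or} \quad 1 + 1 + 2r_{XY} \geq 0 \quad \text{or} \quad 2 + 2r_{XY} \geq 0 \quad \text{or} \quad r_{XY} \geq -1 \tag{11}$$
Further, consider the difference $x_i' - y_i'$. The square of this difference is also non-negative, i.e.
$(x_i' - y_i')^2 \geq 0$
Taking sum over all the observations and dividing by $n$, we get
$$\frac{1}{n}\sum (x_i' - y_i')^2 \geq 0$$
$$\text{or} \quad \frac{1}{n}\sum \left[(x_i')^2 + (y_i')^2 - 2x_i' y_i'\right] \geq 0$$
$$\text{or} \quad \frac{1}{n}\sum (x_i')^2 + \frac{1}{n}\sum (y_i')^2 - \frac{2}{n}\sum x_i' y_i' \geq 0$$
$$\text{or} \quad 1 + 1 - 2r_{XY} \geq 0 \quad \text{or} \quad 2 - 2r_{XY} \geq 0 \quad \text{or} \quad r_{XY} \leq 1 \tag{12}$$
Combining the inequalities (11) and (12), we get $-1 \leq r_{XY} \leq 1$. Hence $r_{XY}$ lies between −1 and +1.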
3. If $X$ and $Y$ are independent they are uncorrelated, but the converse is not true.
If $X$ and $Y$ are independent, it implies that they do not reveal any tendency of simultaneous
movement either in same or in opposite directions. The dots of the scatter diagram will be
uniformly spread in all the four quadrants. Therefore, $\sum (X_i - \bar{X})(Y_i - \bar{Y})$ or $\mathrm{Cov}(X, Y)$ will be equal
to zero and hence, $r_{XY} = 0$. Thus, if $X$ and $Y$ are independent, they are uncorrelated.
The converse of this property implies that if $r_{XY} = 0$, then $X$ and $Y$ may not necessarily be
independent. To prove this, we consider the following data (for which $n = 7$, $\sum X_i = 28$, $\sum Y_i = 28$ and $\sum X_i Y_i = 112$):
$$\therefore \mathrm{Cov}(X, Y) = \frac{1}{7}\left[\sum X_i Y_i - \frac{(\sum X_i)(\sum Y_i)}{7}\right] = \frac{1}{7}\left[112 - \frac{28 \times 28}{7}\right] = 0. \quad \text{Thus, } r_{XY} = 0$$
A close examination of the given data would reveal that although $r_{XY} = 0$, $X$ and $Y$ are not
independent. In fact they are related by the mathematical relation $Y = (X - 4)^2$.
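A small numerical illustration of this point follows. The data table is not reproduced in this text, so the X values 1 to 7 used below are an assumption; they were chosen because, together with the relation Y = (X − 4)², they reproduce the totals 28, 28 and 112 that appear in the covariance calculation.

```python
# Assumed data: X = 1, 2, ..., 7 and Y = (X - 4)^2 (a perfect, but non-linear, relation).
x_values = list(range(1, 8))
y_values = [(x - 4) ** 2 for x in x_values]

n = len(x_values)
sum_x = sum(x_values)
sum_y = sum(y_values)
sum_xy = sum(x * y for x, y in zip(x_values, y_values))

# Covariance computed as (1/n) * [sum(XY) - sum(X) * sum(Y) / n].
covariance = (sum_xy - sum_x * sum_y / n) / n

print(sum_x, sum_y, sum_xy)
print(covariance)
```

The covariance, and therefore r, is zero even though Y is completely determined by X, because r measures only the linear part of a relationship.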
1. Coefficient of correlation r does not give any idea about the existence of cause and effect
relationship between the variables. It is possible that a high value of r is obtained although none of
them seem to be directly affecting the other. Hence, any interpretation of r should be done very
carefully.
2. It is only a measure of the degree of linear relationship between two variables. If the relationship
is not linear, the calculation of r does not have any meaning.
4. If the data are not uniformly spread in the relevant quadrants the value of r may give a
misleading interpretation of the degree of relationship between the two variables. For example, if
there are some values having concentration around a point in first quadrant and there is similar
type of concentration in third quadrant, the value of r will be very high although there may be no
linear relation between the variables.
5. As compared with other methods, to be discussed later in this unit, the computations of r are
cumbersome and time consuming.
This is a crude method of computing correlation between two characteristics. In this method,
various items are assigned ranks according to the two characteristics and a correlation is computed
between these ranks. This method is often used in the following circumstances:
1. When the quantitative measurements of the characteristics are not possible, e.g., the results of a
beauty contest where various individuals can only be ranked.
2. Even when the characteristic is measurable, it may be desirable to avoid such measurements due to
shortage of time, money, complexities of calculations due to large data, etc.
3. When the given data consist of some extreme observations, the value of Karl Pearson’s
coefficient is likely to be unduly affected. In such a situation the computation of the rank correlation
is preferred because it will give less importance to the extreme observations.
4. It is used as a measure of the degree of association in situations where the nature of population,
from which data are collected, is not known.
The coefficient of correlation obtained on the basis of ranks is called 'Spearman's Rank Correlation'
or simply the 'Rank Correlation'. This correlation is denoted by ρ (rho).
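Let $X_i$ be the rank of the $i$th individual according to the characteristic $X$ and $Y_i$ be its rank according to
the characteristic $Y$. If there are $n$ individuals, there would be $n$ pairs of ranks $(X_i, Y_i)$, $i = 1, 2, \ldots, n$.
We assume here that there are no ties, i.e., no two or more individuals are tied to a particular rank.
Thus, $X_i$ and $Y_i$ are simply integers from 1 to $n$, appearing in any order.
The means of $X$ and $Y$, i.e.,
$$\bar{X} = \bar{Y} = \frac{1 + 2 + \cdots + n}{n} = \frac{n(n+1)}{2n} = \frac{n+1}{2}$$
Also,
$$\sigma_X^2 = \sigma_Y^2 = \frac{1}{n}\left[1^2 + 2^2 + \cdots + n^2\right] - \left(\frac{n+1}{2}\right)^2 = \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4} = \frac{n^2 - 1}{12}$$
Let $d_i$ be the difference in ranks of the $i$th individual, i.e.
$$d_i = X_i - Y_i = (X_i - \bar{X}) - (Y_i - \bar{Y}) \quad (\because \bar{X} = \bar{Y})$$
Squaring both sides and taking sum over all the observations, we get
$$\sum d_i^2 = \sum \left[(X_i - \bar{X}) - (Y_i - \bar{Y})\right]^2 = \sum (X_i - \bar{X})^2 + \sum (Y_i - \bar{Y})^2 - 2\sum (X_i - \bar{X})(Y_i - \bar{Y})$$
Dividing both sides by $n$, we get
$$\frac{1}{n}\sum d_i^2 = \frac{1}{n}\sum (X_i - \bar{X})^2 + \frac{1}{n}\sum (Y_i - \bar{Y})^2 - \frac{2}{n}\sum (X_i - \bar{X})(Y_i - \bar{Y})$$
$$= \sigma_X^2 + \sigma_Y^2 - 2\,\mathrm{Cov}(X, Y) = 2\sigma_X^2 - 2\,\mathrm{Cov}(X, Y) \quad (\because \sigma_X^2 = \sigma_Y^2)$$
From this, since $\rho = \mathrm{Cov}(X, Y)/\sigma_X^2$, we can write
$$1 - \rho = \frac{1}{n}\sum d_i^2 \times \frac{1}{2\sigma_X^2}$$
or
$$\rho = 1 - \frac{1}{n}\sum d_i^2 \times \frac{1}{2\sigma_X^2} = 1 - \frac{1}{n}\sum d_i^2 \times \frac{1}{2} \times \frac{12}{n^2 - 1} = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$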
Question: Following is the list of marks scored by eleven students in mathematics and
English in their 12th standard examination.
Solution:
The sum of the squared differences in ranks (the sum of the entries in the D2 column) is
given by: 0+9+0+0+0+0+0+4+36+1+16 = 66. Using the Spearman rank-correlation
coefficient with n = 11 students, we obtain:
$$\rho = 1 - \frac{6 \times 66}{11(11 \times 11 - 1)} = 1 - \frac{396}{1320} = 0.70$$
The Spearman rank-correlation coefficient ranges from −1 to +1. The estimate of 0.70
suggests a strong positive relationship between rank performance in Maths and English.
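The rank-correlation formula is easy to evaluate once the squared rank differences are known. The sketch below is illustrative; the function name `spearman_rho` is an assumption, it applies only when there are no tied ranks, and it treats the eleven squared differences listed in the solution as the complete data set (so n = 11), since the marks table itself is not reproduced here.

```python
def spearman_rho(squared_differences):
    """Spearman's rank correlation, 1 - 6*sum(d^2) / (n*(n^2 - 1)), assuming no ties."""
    n = len(squared_differences)
    return 1 - 6 * sum(squared_differences) / (n * (n * n - 1))


# Squared rank differences (the D2 column) listed in the solution, one per student.
d_squared = [0, 9, 0, 0, 0, 0, 0, 4, 36, 1, 16]

rho = spearman_rho(d_squared)
print(len(d_squared), sum(d_squared), round(rho, 2))
```

With eleven students and a sum of squared differences of 66, the formula gives a coefficient of 0.70.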
Similarly, when $\rho = -1$, an individual that has been assigned 1st rank according to one
characteristic must be assigned $n$th rank according to the other and an individual that has been
assigned 2nd rank according to one characteristic must be assigned $(n - 1)$th rank according to the
other, etc.
Thus, the sum of ranks, assigned to every individual, is equal to $(n + 1)$, i.e., $X_i + Y_i = n + 1$ or
$Y_i = (n + 1) - X_i$, $i = 1, 2, \ldots, n$.
Further, $d_i = X_i - Y_i = X_i - (n + 1) + X_i = 2X_i - (n + 1)$
Squaring both sides, we have
$$d_i^2 = [2X_i - (n + 1)]^2 = 4X_i^2 + (n + 1)^2 - 4(n + 1)X_i$$
Taking sum over all the observations,
$$\sum d_i^2 = 4\sum X_i^2 + n(n + 1)^2 - 4(n + 1)\sum X_i = \frac{4n(n + 1)(2n + 1)}{6} + n(n + 1)^2 - \frac{4n(n + 1)^2}{2}$$
$$= \frac{2}{3}n(n + 1)(2n + 1) + n(n + 1)^2 - 2n(n + 1)^2 = \frac{n(n + 1)(n - 1)}{3} = \frac{n(n^2 - 1)}{3}$$
Substituting this value in the formula for rank correlation we have
$$\rho = 1 - \frac{6n(n^2 - 1)}{3} \times \frac{1}{n(n^2 - 1)} = -1$$
Hence, the Spearman's coefficient of correlation lies between −1 and +1.
Example: The following table gives the marks obtained by 10 students in commerce and
statistics. Calculate the rank correlation.
Solution:
Calculation Table
A chi-square test (also chi-squared or χ² test) is any statistical hypothesis test in which the
sampling distribution of the test statistic is a chi-square distribution when the null hypothesis is
true, or any in which this is asymptotically true, meaning that the sampling distribution (if the null
hypothesis is true) can be made to approximate a chi-square distribution as closely as desired.
Caution: One case where the distribution of the test statistic is an exact chi-square
distribution is the test that the variance of a normally-distributed population has a
given value based on a sample variance. Such a test is uncommon in practice
because values of variances to test against are seldom known exactly.
1. Sample observations should be independent, i.e. no individual item should be included
twice in a sample.
2. The sample should contain at least 50 observations
or
total frequency should be greater than 50.
3. There should be a minimum of five observations in any cell. This is called cell frequency
constraint.
For instance: Chi-square
Persons
Under 20-40 20-40 41-50 51 &Over
Is there any significant difference between the age group and preference for the car?
Problem:
Solution:
Hypothesis H0 – The proportion of people who drink Wood Smoke brand is 70%.
H1 – The proportion of people who drink Wood Smoke brand is not 70%.
If the hypothesis is true then the number of consumers who drink this particular brand is
200 × 0.7 = 140.
Those who do not drink that brand are 200 × 0.3 = 60.
Degrees of freedom = 2 − 1 = 1, since there are two groups.
$$\chi^2 = \sum \frac{(O - E)^2}{E} = 2.381$$
At the 0.05 level of significance, the critical value for 1 d.f. is 3.841 (from tables). The calculated
value, 2.381, is lower. Therefore, we accept the hypothesis that 70% of the people in
that metro drink Wood Smoke branded tea.
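The test statistic in this solution is the usual sum of (observed − expected)² / expected over the groups. The sketch below shows that computation; note that the observed counts are not reproduced in this text, so the values 130 and 70 are an assumption, chosen because a deviation of 10 from the expected counts of 140 and 60 reproduces the stated value of 2.381. The function name `chi_square_statistic` is likewise illustrative.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Expected counts under H0 for a sample of 200: 200 * 0.7 and 200 * 0.3.
expected = [140, 60]

# Assumed observed counts (not given in the text); they must also total 200.
observed = [130, 70]

chi_square = chi_square_statistic(observed, expected)
critical_value = 3.841  # 5% point of the chi-square distribution with 1 d.f.

print(round(chi_square, 3))
print("reject H0" if chi_square > critical_value else "do not reject H0")
```

Because the calculated value stays below the tabulated critical value, the null hypothesis is not rejected at the 5% level.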
Summary
Researchers sometimes put all the data together, as if they were one sample.
There are two simple ways to approach these types of data.
We can use the technique of correlation to test the statistical significance of the association.
In other cases we use regression analysis to describe the relationship precisely by means of an
equation that has predictive value.
Straight-line (linear) relationships are particularly important because a straight line is a
simple pattern that is quite common.
The correlation measures the direction and strength of the linear relationship.
Calculation of rank correlation coefficient in different situations
Keywords
Correlation: It is an analysis of covariation between two or more variables.
Observed Frequency: The frequency actually obtained from the performance of an experiment.
Contingency table: A table having rows and columns where in each row corresponds to a level of
one variable and each column to a level of another variable.
Self Assessment
3. When the values of two variables move in the opposite directions, correlation is said to be
............................
A. Linear
B. Non-linear
C. Positive
D. Negative
4. When the values of two variables move in the opposite directions, correlation is said to be
............................
A. Linear
B. Non-linear
C. Positive
D. Negative
A. Fisher
B. Spearman
C. Karl Pearson
D. Bowley
A. K
B. A
C. S
D. R
A. Partial correlation
B. Multiple correlation
C. Nonsense correlation
D. Simple correlation
D. Poisson Distribution
A. True
B. False
10. On account of simple calculation involved, χ2 test is very frequently used by the statistician.
A. True
B. False
12-If all the scatter of points on two variables lie on a negatively sloped straight line, the
correlation coefficient between the variables would be
A. +1
B. -1
C. Zero
D. None of the above
13-A positive and a negative relationship may have the same strength.
A. True
B. False
A. True
B. False
6. D 7. B 8. B 9. B 10. A
Review Questions
1. Show that the coefficient of correlation, r, is independent of change of origin and scale.
2.Prove that the coefficient of correlation lies between – 1 and + 1.
3. What is Spearman’s rank correlation? What are the advantages of the coefficient of rank
correlation over Karl Pearson’s coefficient of correlation?
4.What can you conclude on the basis of the fact that the correlation between body weight and
annual income were high and positive?
1. Suppose we have ranks of 8 students of B.Sc. in Statistics and Mathematics. On the basis of
rank we would like to know to what extent the knowledge of the student in Statistics
and Mathematics is related.
Rank in Statistics: 52 60 58 39 41 53 47 34
Rank in Mathematics: 40 46 43 54 49 55 48 57
Further Readings
Abrams, M.A., Social Surveys and Social Action, London: Heinemann, 1951.
Arthur, Maurice, Philosophy of Scientific Investigation, Baltimore: Johns Hopkins
University Press, 1943.
R.S. Bhardwaj, Business Statistics, Excel Books, New Delhi, 2008.
S.N. Murthy and U. Bhojanna, Business Research Methods, Excel Books, 2007.
Web Links
https://www.statisticshowto.com/probability-and-statistics/correlation-coefficient-formula/
https://conjointly.com/kb/correlation-statistic/
https://www.youtube.com/watch?v=4EXNedimDMs
https://statistics.laerd.com/statistical-guides/spearmans-rank-order-correlation-statistical-guide.php
Objectives
After studying this unit, you will be able to:
Explain the Concept of Analysis of Variance (ANOVA)
Discuss reliability and validity
Introduction
ANOVA stands for "analysis of variance," and it's a statistical technique for testing a hypothesis
about how the mean of a dependent variable differs across the groups defined by one or more
independent variables. ANOVA is a statistical test that compares the means of two or more groups to
see if there is a difference among them. It is an advanced technique for the experimental treatment of
testing differences among all of the means.
The ANOVA technique allows us to do this simultaneous test and is thus regarded as a valuable
analytical tool in the hands of a researcher. Using this method, one can estimate if the samples
were taken from populations with the same mean.
Regression analysis is a proven method for determining which variables have an impact on a
certain subject. Regression analysis allows you to confidently establish which elements are most
important, which factors may be ignored, and how these factors interact. Data is at the heart of
regression analysis. It aids businesses in comprehending the data they have and using it –
specifically, the correlations between data points – to make better decisions, ranging from sales
forecasting to inventory levels and supply and demand analysis. Regression analysis is frequently
referred to as one of the most important business analysis approaches.
1. One-way ANOVA
Following are the steps followed in ANOVA:
1. Calculate the variance between samples.
2. Calculate the variance within samples.
3. Calculate F ratio using the formula. F = Variance between the samples/Variance within the
sample
4. Compare the value of F obtained above in (3) with the critical value of F such as 5% level of
significance for the applicable degree of freedom.
5. The difference in sample means is not significant when the calculated value of F is less than the
table value of F, and the null hypothesis is accepted. When the estimated value of F is greater than
the critical value of F, on the other hand, the difference in sample means is regarded significant, and
the null hypothesis is rejected.
Application in Market Research Consider the following pricing experiment. For a new toffee box
introduced by Nutrine Company, three prices are explored. The price of three different types of
toffee boxes is 39, 44, and 49 dollars. The goal is to figure out how price levels affect sales. These
toffee boxes will be shown in five supermarkets. The sales are as follows:
What the manufacturer wants to know is: (1) whether the difference among the means is
significant? If the difference is not significant, then the sale must be due to chance. (2) Do the means
differ? (3) Can we conclude that the three samples are drawn from the same population or not?
Example: In a company there are four shop floors. Productivity rate for three methods of
incentives and gain sharing in each shop floor is presented in the following table. Analyze
whether various methods of incentives and gain sharing differ significantly at 5% and 1% F-
limits.
Solution:
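Step 1: Calculate mean of each of the three samples (i.e., $\bar{x}_1$, $\bar{x}_2$ and $\bar{x}_3$, i.e. different methods
of incentive gain sharing).
$$\bar{x}_1 = \frac{5 + 6 + 2 + 7}{4} = 5$$
$$\bar{x}_2 = \frac{4 + 4 + 2 + 6}{4} = 4$$
$$\bar{x}_3 = \frac{4 + 3 + 2 + 3}{4} = 3$$
Step 2: Calculate mean of sample means, i.e.,
$$\bar{\bar{x}} = \frac{\bar{x}_1 + \bar{x}_2 + \bar{x}_3}{k} = \frac{5 + 4 + 3}{3} = 4$$
Step 3: Calculate the sum of squares (ss) between and within the samples:
$$\text{ss between} = n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + n_3(\bar{x}_3 - \bar{\bar{x}})^2$$
$$\text{ss within} = \sum (x_{1i} - \bar{x}_1)^2 + \sum (x_{2i} - \bar{x}_2)^2 + \sum (x_{3i} - \bar{x}_3)^2$$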
The sum of squares (ss) for variance between samples is calculated by taking the deviations of
the sample means from the mean of sample means ($\bar{\bar{x}}$), squaring these deviations,
multiplying each by the number of items in the corresponding sample, and adding the
results. The sum of squares (ss) for variance within samples is calculated
by subtracting the respective sample mean from each sample item value, squaring the
deviations, and then adding them together. For our illustration then
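$$\text{ss between} = 4(5 - 4)^2 + 4(4 - 4)^2 + 4(3 - 4)^2 = 4 + 0 + 4 = 8$$
$$\text{ss within} = \underbrace{\{(5 - 5)^2 + (6 - 5)^2 + (2 - 5)^2 + (7 - 5)^2\}}_{\sum (x_{1i} - \bar{x}_1)^2} + \underbrace{\{(4 - 4)^2 + (4 - 4)^2 + (2 - 4)^2 + (6 - 4)^2\}}_{\sum (x_{2i} - \bar{x}_2)^2} + \underbrace{\{(4 - 3)^2 + (3 - 3)^2 + (2 - 3)^2 + (3 - 3)^2\}}_{\sum (x_{3i} - \bar{x}_3)^2}$$
$$= (0 + 1 + 9 + 4) + (0 + 0 + 4 + 4) + (1 + 0 + 1 + 0) = 14 + 8 + 2 = 24$$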
Step 4: ss of total variance, which is equal to the total of ss between and ss within, is
denoted by the formula as follows:
$$\text{ss total} = \sum_i \sum_j (x_{ij} - \bar{\bar{x}})^2$$
where $i = 1, 2, 3, \ldots$ and $j = 1, 2, 3, \ldots$
We will, however, get the same value if we simply total respective values of ss between and
ss within.
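The sums of squares in this illustration, and the F ratio described in the steps for ANOVA, can be sketched in a few lines of Python. Several things here are assumptions rather than part of the text: the function name `one_way_anova`, the grouping of the twelve values into three samples (read off from the deviations shown in the ss within calculation), and the within-samples degrees of freedom n − k, which is the standard choice for one-way ANOVA.

```python
def one_way_anova(groups):
    """Return (ss_between, ss_within, f_ratio) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # With equal sample sizes the grand mean equals the mean of the sample means.
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)       # df between = k - 1
    ms_within = ss_within / (n_total - k)   # df within = n - k (standard choice)
    return ss_between, ss_within, ms_between / ms_within


# Samples as they appear in the ss within calculation (assumed grouping).
samples = [
    [5, 6, 2, 7],  # method 1, mean 5
    [4, 4, 2, 6],  # method 2, mean 4
    [4, 3, 2, 3],  # method 3, mean 3
]

ss_between, ss_within, f_ratio = one_way_anova(samples)
print(ss_between, ss_within, ss_between + ss_within, round(f_ratio, 2))
```

Under these assumptions the F ratio is 1.5 on (2, 9) degrees of freedom, well below the 5% table value of roughly 4.26, so the difference between the methods would not be judged significant.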
For our example, ss between is 8 and ss within is 24, thus ss of total variance is 32 (8+24).
Step 5: Ascertain degrees of freedom and mean square (MS) between and within the
samples. Degrees of freedom (df) for between samples and within samples are computed
differently as follows. For between samples, df is (k-1), where 'k' represents number of
2. Two-way ANOVA
The approach for calculating variance is identical to that used for one-way classification. The
following is an example of ANOVA two-way classification: Assume a company has four different
types of machines: A, B, C, and D. It has placed four of its employees on each machine for a given
amount of time, such as one week. The average production of each worker on each type of machine
was calculated at the end of one week. These data are given below:
Average Production by the MachineType
Example: Company ‘X’ wants its employees to undergo three different types of
training programme with a view to obtain improved productivity from them. After
the completion of the training programme, 16 new employees are assigned at
random to three training methods and the production performance were recorded.
The training managers’ problem is to find out if there are any differences in the
effectiveness of the training methods? The data recorded is as under
Daily Output of New Employees
10 Draw conclusions.
Solution:
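2. Grand mean
$$\bar{\bar{x}} = \frac{15 + 18 + 19 + 22 + 11 + 22 + 27 + 18 + 21 + 17 + 24 + 19 + 16 + 22 + 15 + 18}{16} = \frac{304}{16} = 19$$
3. Variance between samples:
$$s_{\bar{x}}^2 = \frac{\sum n_j(\bar{x}_j - \bar{\bar{x}})^2}{k - 1} = \frac{40}{3 - 1} = 20$$
4. Calculation of sample variances:
$$s_j^2 = \frac{\sum (x - \bar{x}_j)^2}{n_j - 1}: \quad s_1^2 = \frac{70}{4} = 17.5, \quad s_2^2 = \frac{62}{4} = 15.5, \quad s_3^2 = \frac{60}{5} = 12$$
7. d.f. of Numerator = (3 − 1) = 2.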
Reliability Analysis
The degree to which the measurement method is error-free is referred to as reliability. Accuracy
and consistency are two aspects of reliability. If the scale produces the same findings when
repeated measurements are taken under the same conditions, it is said to be reliable.
Reliability can be ensured by using the same scale on the same set of respondents, using the same
method. However, in actual practice, this becomes difficult as:
1. Extent to which a scale produces consistent results
2. Test-retest Reliability: Respondents are administered scales at 2 different times under nearly
equivalent conditions
3. Alternative-form Reliability: 2 equivalent forms of a scale are constructed, then tested with the
same respondents at 2 different times
4. Internal Consistency Reliability:
(a) The consistency with which each item represents the construct of interest
(b) Used to assess the reliability of a summated scale
(c) Split-half Reliability: Items constituting the scale are divided into 2 halves, and the resulting
half scores are correlated
(d) Coefficient alpha (most common test of reliability): Average of all possible split-half
coefficients resulting from different splittings of the scale items.
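As a rough illustration of internal consistency, the sketch below computes a split-half correlation and coefficient alpha for a small summated scale. The response data are invented for illustration. Coefficient alpha is computed with the usual variance-based formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total score), which is an assumption on our part since the text describes alpha only in words. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical data: each row is one respondent, each column one item of a
# summated scale (ratings from 1 to 5). Invented for illustration only.
responses = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 2, 3, 2],
    [4, 4, 4, 3],
    [1, 2, 1, 2],
]


def pearson_r(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(rows):
    """Correlate the half score on items 1, 3, ... with that on items 2, 4, ..."""
    first_half = [sum(row[0::2]) for row in rows]
    second_half = [sum(row[1::2]) for row in rows]
    return pearson_r(first_half, second_half)


def coefficient_alpha(rows):
    """Cronbach's alpha from item variances and the total-score variance."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


print(f"split-half r = {split_half_reliability(responses):.3f}")
print(f"coefficient alpha = {coefficient_alpha(responses):.3f}")
```

Values close to 1 suggest that the items move together and the summated scale is internally consistent; a different way of splitting the items would generally give a somewhat different split-half value, which is why alpha is usually preferred.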
Validity Analysis
The paradigm of validity focuses on the question "Are we measuring what we think we are
measuring?" The success of a scale lies in measuring what it is intended to measure. Of the two
attributes of scaling, validity is the more important.
There are several methods to check the validity of the scale used for measurement:
1. Construct Validity: A sales manager feels that there is a direct link between job satisfaction and
the degree to which a person is an extrovert, as well as the sales force's performance. As a
result, those who have high job satisfaction and outgoing personalities should perform well. If
they don't, the measure's construct validity is called into question.
2. Content Validity: The researcher should define the problem clearly, determine the object to
be measured, and create a scale that is appropriate for this purpose. Even when all this is
done, the scale may be criticised for its lack of content validity. Face validity is often used as
another term for content validity. The launch of a new packaged food is one example. When a
new packaged food is introduced, it represents a significant change in flavour. Hundreds of
thousands of people may be urged to try it, and they may report that they liked the new
flavour overwhelmingly. Even with such a positive response, the product may nevertheless
fail when it is launched on a commercial basis. So, what's the issue? Perhaps a vital question
was overlooked.
3. Predictive Validity: This pertains to how well a researcher can predict future performance
from knowledge of the attitude score.
4. Criterion Validity:
(a) Examines whether measurement scale performs as expected in relation to other variables
selected as meaningful criteria, i.e., predicted and actual behavior should be similar.
(b) Addresses the question of what construct or characteristic the scale is actually measuring
5. Convergent Validity: Extent to which scale correlates positively with other measures of the
same construct.
6. Discriminant Validity: Extent to which a measure does not correlate with other constructs
from which it is supposed to differ.
7. Nomological Validity: Extent to which scale correlates in theoretically predicted ways with
measures of different but related constructs.
Bivariate Regression
Bivariate Regression, often known as simple regression analysis, is a technique for determining the
strength of a relationship between two variables. The two variables are commonly referred to as X
and Y, with one acting as an independent (or explanatory) variable and the other as a dependent
variable (or outcome variable).
Bivariate Regression Analysis employs a linear regression line (since the relationship between the
variables is considered to be linear) to help measure how the two variables change together in order
to establish the relation.
For a bivariate data (Xi, Yi), i = 1, 2, ...... n, we can have either X or Y as independent variable. If X is
independent variable then we can estimate the average values of Y for a given value of X. The
relation used for such estimation is called regression of Y on X. If on the other hand Y is used for
estimating the average values of X, the relation will be called regression of X on Y. For a bivariate
data, there will always be two lines of regression. It will be shown later that these two lines are
different, i.e., one cannot be derived from the other by mere transfer of terms, because the
derivation of each line is dependent on a different set of assumptions.
The general form of the line of regression of Y on X is YCi = a + bXi, where YCi denotes the average
or predicted or calculated value of Y for a given value of X = Xi. This line has two constants, a and
b. The constant a is defined as the average value of Y when X = 0. Geometrically, it is the intercept
of the line on Y-axis. Further, the constant b, gives the average rate of change of Y per unit change
in X, is known as the regression coefficient. The above line is known if the values of a and b are
known. These values are estimated from the observed data (Xi, Yi), i = 1, 2, ...... n.
Using the regression YCi = a + bXi, we can obtain YC1 , YC2 , ...... YCn corresponding to the X
values X1 , X2 , ...... Xn respectively. The difference between the observed and calculated value for a
particular value of X say Xi is called error in estimation of the ith observation on the assumption of
a particular line of regression. There will be similar type of errors for all the n observations. We
denote by ei = Yi – YCi (i = 1, 2, ..., n) the error in estimation of the ith observation. As is obvious
from Figure 9.4, ei will be positive if the observed point lies above the line and will be negative if
the observed point lies below the line. Since positive and negative errors would cancel out if they
were simply added, the ei's are squared and then added in order to obtain a figure of total error.
Let S denote the sum of squares of these errors, i.e.,
$$S = \sum e_i^2 = \sum (Y_i - Y_{Ci})^2$$
The regression line can, alternatively, be written as a deviation of Yi from YCi, i.e., Yi – YCi = ei or
Yi = YCi + ei or Yi = a + bXi + ei. The component a + bXi is known as the deterministic component
and ei is the random component. The value of S will be different for different lines of regression. A
different line of regression means a different pair of constants a and b. Thus, S is a function of a and
b. We want to find such values of a and b that S is a minimum. This method of finding the values
of a and b is known as the Method of Least Squares. Since YCi = a + bXi, the above equation can be
rewritten as S = Σ(Yi – a – bXi)².
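The necessary conditions for a minimum of S are ∂S/∂a = 0 and ∂S/∂b = 0, which give
$$\sum Y_i = na + b\sum X_i \qquad (1)$$
$$\sum X_iY_i = a\sum X_i + b\sum X_i^2 \qquad (2)$$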
Equations (1) and (2) are a system of two simultaneous equations in two unknowns a and b, which
can be solved for the values of these unknowns. These equations are also known as normal
equations for the estimation of a and b. Substituting these values of a and b in the regression
equation YCi = a + bXi, we get the estimated line of regression of Y on X
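The following is a minimal sketch of the Method of Least Squares for the regression of Y on X. It solves the two normal equations in their usual form, ΣY = na + bΣX and ΣXY = aΣX + bΣX², and then computes the errors ei and their sum of squares S. The observations and the function name `fit_y_on_x` are invented for illustration.

```python
# Hypothetical paired observations (Xi, Yi); invented for illustration.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]


def fit_y_on_x(x, y):
    """Solve the two normal equations for the constants a and b."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Eliminating a between the two normal equations gives b;
    # substituting b back into the first equation gives a.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


a, b = fit_y_on_x(X, Y)

# Errors e_i = Y_i - YC_i and their sum of squares S for the fitted line.
errors = [yi - (a + b * xi) for xi, yi in zip(X, Y)]
S = sum(e * e for e in errors)

print(f"YC = {a:.2f} + {b:.2f} X, S = {S:.2f}")
# prints: YC = 2.20 + 0.60 X, S = 2.40
```

For these data any other choice of a and b gives a larger S. The errors of the fitted line also add up to zero, which is a consequence of the first normal equation.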
Expressions for the Estimation of a and b. Dividing both sides of the equation (1) by n, we have
Remarks: It should be noted here that the two lines of regression are different because they have
been obtained in two entirely different ways. In the case of regression of Y on X, it is assumed that the
values of X are given and the values of Y are estimated by minimising Σ(Yi – YCi)², while in the case of
regression of X on Y, the values of Y are assumed to be given and the values of X are estimated by
minimising Σ(Xi – XCi)². Since these two lines have been estimated on the basis of different
assumptions, they are not reversible, i.e., it is not possible to obtain one line from the other by mere
transfer of terms. There is, however, one situation when these two lines will coincide. From the
study of correlation we may recall that when r = ± 1, there is perfect correlation between the
variables and all the points lie on a straight line. Therefore, both the lines of regression coincide and
hence they are also reversible in this case. By substituting r = ± 1 in equation (12) or (24) it can be
shown that the lines of regression in both the cases become
$$\frac{Y - \bar{Y}}{\sigma_Y} = \pm\frac{X - \bar{X}}{\sigma_X}$$
Further, when r = 0, equation (12) becomes $Y_C = \bar{Y}$ and equation (24) becomes $X_C = \bar{X}$. These are the
equations of lines parallel to the X-axis and the Y-axis respectively. These lines also intersect at the
point $(\bar{X}, \bar{Y})$ and are mutually perpendicular at this point, as shown in the figure.
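The remark can be illustrated numerically. The sketch below computes the two regression coefficients, written here as b_yx (Y on X) and b_xy (X on Y), together with r, using sums of deviations from the means. The data and the names are invented for illustration. It also uses the standard result that the product of the two regression coefficients equals r².

```python
from math import sqrt


def regression_lines(x, y):
    """Return (b_yx, b_xy, r) for paired observations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    b_yx = sxy / sxx           # slope of the regression of Y on X
    b_xy = sxy / syy           # slope of the regression of X on Y
    r = sxy / sqrt(sxx * syy)  # coefficient of correlation
    return b_yx, b_xy, r


# Hypothetical data with imperfect correlation: the two lines differ.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
b_yx, b_xy, r = regression_lines(X, Y)

# Drawn with X on the horizontal axis, the X-on-Y line has slope 1 / b_xy.
print(f"Y on X slope = {b_yx:.2f}, X on Y slope = {1 / b_xy:.2f}, r = {r:.3f}")

# Hypothetical data lying exactly on a straight line (r = 1): the lines coincide.
X_line = [1, 2, 3, 4]
Y_line = [3, 5, 7, 9]
p_yx, p_xy, p_r = regression_lines(X_line, Y_line)
print(f"Y on X slope = {p_yx:.2f}, X on Y slope = {1 / p_xy:.2f}, r = {p_r:.3f}")
```

In the first data set the two slopes differ, so one line cannot be obtained from the other by transfer of terms. In the second both slopes are equal and the two lines are the same.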
Multiple Regression Analysis is an extension of two-variable regression analysis. In this analysis,
two or more independent variables are used to estimate the values of a dependent variable, instead
of one independent variable.
The objectives of multiple regression analysis are:
1. To derive an equation which provides estimates of the dependent variable from values of
the two or more independent variables.
2. To obtain the measure of the error involved in using the regression equation as a basis of
estimation.
3. To obtain a measure of the proportion of variance in the dependent variable accounted for
or explained by the independent variables.
The multiple regression equation describes the average relationship between the given variables, and
this relationship is used to estimate the dependent variable. The regression equation is the equation
for estimating a dependent variable.
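With $X_1$ as the dependent variable and $X_2$, $X_3$ as the independent variables, the regression equation is
$$X_{1.23} = a_{1.23} + b_{12.3}X_2 + b_{13.2}X_3$$
Taking deviations from the respective means,
$$x_1 = (X_1 - \bar{X}_1), \quad x_2 = (X_2 - \bar{X}_2), \quad x_3 = (X_3 - \bar{X}_3)$$
the equation becomes
$$x_{1.23} = b_{12.3}x_2 + b_{13.2}x_3$$
and the normal equations are
$$\sum x_1x_2 = b_{12.3}\sum x_2^2 + b_{13.2}\sum x_2x_3$$
$$\sum x_1x_3 = b_{12.3}\sum x_2x_3 + b_{13.2}\sum x_3^2$$
In terms of the correlation coefficients and standard deviations, the regression equation can be written as
$$(X_1 - \bar{X}_1) = \frac{r_{12} - r_{13}r_{23}}{1 - r_{23}^2}\cdot\frac{\sigma_1}{\sigma_2}(X_2 - \bar{X}_2) + \frac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2}\cdot\frac{\sigma_1}{\sigma_3}(X_3 - \bar{X}_3)$$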
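A minimal sketch of a regression with two independent variables is given below. It takes deviations from the means and solves the two normal equations in deviation form, Σx1x2 = b2Σx2² + b3Σx2x3 and Σx1x3 = b2Σx2x3 + b3Σx3², where b2 and b3 stand for the two partial regression coefficients. It then reports the proportion of variance explained (R²). The data on demand, income and price are invented for illustration, and the function name is hypothetical.

```python
def fit_two_predictors(x1, x2, x3):
    """Regress X1 on X2 and X3; return (a, b2, b3, r_squared)."""
    n = len(x1)
    m1, m2, m3 = sum(x1) / n, sum(x2) / n, sum(x3) / n
    d1 = [v - m1 for v in x1]  # deviations from the respective means
    d2 = [v - m2 for v in x2]
    d3 = [v - m3 for v in x3]

    s12 = sum(p * q for p, q in zip(d1, d2))
    s13 = sum(p * q for p, q in zip(d1, d3))
    s23 = sum(p * q for p, q in zip(d2, d3))
    s22 = sum(p * p for p in d2)
    s33 = sum(p * p for p in d3)

    # Solve the two normal equations by Cramer's rule.
    det = s22 * s33 - s23 ** 2
    b2 = (s12 * s33 - s13 * s23) / det
    b3 = (s13 * s22 - s12 * s23) / det
    a = m1 - b2 * m2 - b3 * m3

    # Proportion of the variance of X1 explained by X2 and X3.
    fitted = [a + b2 * u + b3 * v for u, v in zip(x2, x3)]
    sse = sum((y - f) ** 2 for y, f in zip(x1, fitted))
    sst = sum(p * p for p in d1)
    return a, b2, b3, 1 - sse / sst


# Hypothetical data: demand for TV sets, household income and price.
demand = [40, 45, 50, 48, 60, 62, 70, 75]
income = [20, 22, 25, 26, 30, 32, 35, 38]
price = [12, 12, 11, 13, 10, 11, 9, 9]

a, b_income, b_price, r_squared = fit_two_predictors(demand, income, price)
print(f"demand = {a:.2f} + {b_income:.2f} income + {b_price:.2f} price")
print(f"R squared = {r_squared:.3f}")
```

The sketch assumes that the two independent variables are not perfectly correlated; if they were, the determinant in the code would be zero and the coefficients could not be separated.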
Summary
ANOVA is a statistical technique applied to test the equality of three or more
sample means.
It is an advanced technique for the experimental treatment of testing differences among all
of the means.
ANOVA allows these means to be tested simultaneously and is thus considered a valuable
analytical tool in the hands of a researcher.
Regression analysis helps to establish which factors are most important and which
factors may be ignored.
Regression is a term used for predicting the value of one variable from the other.
The method of least squares is used to fit the regression line.
Keywords
ANOVA: It is a statistical technique used to test the equality of three or more sample means.
Bivariate Regression: a technique for determining the strength of a relationship between two
variables.
Regression Equation: If the coefficient of correlation calculated for bivariate data (Xi, Yi), i = 1, 2, ..., n,
is reasonably high and a cause and effect type of relation is also believed to exist between
them, the next logical step is to obtain a functional relation between these variables. This functional
relation is known as the regression equation in statistics.
Reliability Analysis: the extent to which the measurement process is free from errors.
Internal Consistency in Reliability: The consistency with which each item represents the construct
of interest.
Self Assessment
1-Analysis of variance is a statistical method of comparing the ________ of several populations.
A. standard deviations
B. variances
C. means
D. proportions
3- ANOVAs cannot be used when testing data collected in educational research as it cannot be
applied to social science.
A. True
B. False
A. Correlational
B. Random
C. Experimental
D. Simple
6-In regression, the equation that describes how the response variable (y) is related to the
explanatory variable (x) is:
11- The ________ sum of squares measures the variability of the sample treatment means
around the overall mean.
A. treatment
B. error
C. interaction
D. total
12- Which of the following is an assumption of one-way ANOVA comparing samples from
three or more experimental treatments?
A. All the response variables within the k populations follow normal distributions.
B. The samples associated with each population are randomly selected and are independent
from all other samples.
C. The response variables within each of the k populations have equal variances.
D. All of the above.
6. C 7. A 8. D 9. A 10. A
Review Questions
1. What do you think is the reason behind the two lines of regression being different?
2. From the data given below:-
3-Obtain the equations of the two lines of regression for the data given below:
4-In the estimation of regression equation of two variables X and Y the following results were
obtained. X = 90, Y = 70, n = 10, Σx² = 6360; Σy² = 2860, Σxy = 3900. Obtain the two regression
equations.
5- A test was given to five students taken at random from the fifth class of three schools of a town.
The individual scores are
Further Readings
Abrams, M.A, Social Surveys and Social Action, London: Heinemann, 1951.
Arthur, Maurice, Philosophy of Scientific Investigation, Baltimore: Johns Hopkins
University Press, 1943.
R.S. Bhardwaj, Business Statistics, Excel Books, New Delhi, 2008.
S.N. Murthy and U. Bhojanna, Business Research Methods, Excel Books, 2007.
A Parasuraman, Dhruv Grewal, Marketing Research, Biztantra
Paneerselvam, R, Research Methods, PHI.
Web Links
https://www.youtube.com/watch?v=TKom54uOzXY
https://murraylax.org/rtutorials/regression_intro.html
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3049417/
https://sciencing.com/difference-between-bivariate-multivariate-analyses-8667797.html
Objectives
After studying this unit, you will be able to:
Explain the concept of multivariate analysis
Introduction
As the name indicates, multivariate analysis comprises a set of techniques dedicated to the analysis
of data sets with more than one variable. Several of these techniques were developed recently in
part because they require the computational capabilities of modern computers. Multivariate
analysis (MVA) is based on the statistical principle of multivariate statistics, which involves
observation and analysis of more than one statistical variable at a time. In design and analysis, the
technique is used to perform trade studies across multiple dimensions while taking into account
the effects of all variables on the responses of interest. Marketers sometimes come across
complex situations involving two or more variables. When exactly two variables are involved,
bivariate analysis is used; the chi-square test is an example of bivariate analysis. When more than
two variables are involved, multivariate analysis is required.
Example: The demand for television sets may depend not only on price, but also on the
income of households, advertising expenditure incurred by TV manufacturer and other
similar factors. To solve this type of problem, multivariate analysis is required.
12.2 Classification
Multivariate analysis can be classified under the following heads:
A. Factor Analysis
B. Cluster Analysis
C. Discriminant Analysis
D. Multidimensional Scaling
E. Conjoint Analysis
When the objective is to summarise information from a large set of variables into fewer factors,
principal component factor analysis is used. On the other hand, if the researcher wants to analyse
the components of the main factor, common factor analysis is used.
Example: Common factor – Inconvenience inside a car. The components may be:
1. Leg room
2. Seat arrangement
3. Comfort (C)
6. Price (F)
The questionnaire may be administered to 5,000 respondents and the opinions of the customers
gathered. Let us allot points 1 to 10 for each of the variables A to F, where 1 is the lowest and 10 is the
highest. Let us assume that application of factor analysis has led to grouping the variables as
follows:
A, B, D, E into factor-1
F into Factor -2
C into Factor - 3
Factor - 1 can be termed as Technical factor;
For future analysis, while conducting a study to obtain customers’ opinion, the three factors mentioned
above would be sufficient. One basic purpose of using factor analysis is to reduce the number of
independent variables in the study. With too many independent variables, the marketing research
(M.R.) study will suffer from the following disadvantages:
1. Time for data collection is very high due to several independent variables.
2. Expenditure increases due to the time factor.
3. Computation time is more, resulting in delay.
The results provide information which is similar in nature to those produced by Factor Analysis
techniques, and they allow one to explore the structure of categorical variables included in the
table. The most common kind of table of this type is the two-way frequency cross-tabulation table.
Example: Following are the data on the drinking habits of different employees in an
organization:
One may think of the 4 column values in each row of the table as coordinates in a
4-dimensional space, and one could compute the (Euclidean) distances between the 5
row points in the 4-dimensional space. The distances between the points in the
4-dimensional space summarize all information about the similarities between the
rows in the table above. Now suppose one could find a lower-dimensional space, in
which to position the row points in a manner that retains all, or almost all, of the
information about the differences between the rows. You could then present all
information about the similarities between the rows (types of employees in this
case) in a simple 1, 2, or 3-dimensional graph. While this may not appear to be
particularly useful for small tables like the one shown above, one can easily imagine
how the presentation and interpretation of very large tables (e.g., differential
preference for 10 consumer items among 100 groups of respondents in a consumer
survey) could greatly benefit from the simplification that can be achieved via
correspondence analysis (e.g., represent the 10 consumer items in a two-dimensional
space).
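The idea of treating each row of a cross-tabulation as a point in 4-dimensional space can be sketched as follows. The table itself is not reproduced in the text, so the row labels, column labels and frequencies below are invented for illustration. The sketch only computes the Euclidean distances between the 5 row points; it does not carry out the correspondence analysis itself.

```python
from math import dist

# Hypothetical two-way frequency table (invented for illustration):
# rows are types of employees, columns are drinking categories.
columns = ["none", "light", "medium", "heavy"]
table = {
    "senior managers": [4, 2, 3, 2],
    "junior managers": [4, 3, 7, 4],
    "senior employees": [25, 10, 12, 4],
    "junior employees": [18, 24, 33, 13],
    "secretaries": [10, 6, 7, 2],
}


def row_distances(rows):
    """Euclidean distance between every pair of row points."""
    names = list(rows)
    return {
        (p, q): dist(rows[p], rows[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


distances = row_distances(table)
for (p, q), d in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{p:17s} - {q:17s}: {d:6.2f}")
```

Pairs of rows with small distances have similar patterns of counts. A lower-dimensional display tries to preserve these distances as far as possible.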
simplicity as the number of zeros or near-zero entries in the factor loading matrix—the more zeros,
the simpler the structure. Rotation does not alter matrix C or U at all, but does transform the factor
loading matrix.
In the extreme case of simple structure, each X-variable will have only one large entry, so that all
the others can be ignored. But that would be a simpler structure than you would usually expect to
achieve; after all, in the real world each variable isn’t in general affected by only one other variable.
You then name the factors subjectively, based on an examination of their loadings.
In common factor analysis the procedure of rotation is in fact somewhat more abstract than has been
implied here, since you don’t actually know the individual scores of cases on factors. However, the
statistics of a multiple regression that are most relevant here (the multiple correlation and the
standardized regression slopes) can all be calculated just from the correlations of the variables and
factors involved. So we can base the calculations for rotation to simple structure on just those
correlations, without using any individual scores.
A rotation which requires the factors to remain uncorrelated is an orthogonal rotation, while
others are oblique rotations. Oblique rotations often achieve a simpler structure, though at
the cost that you also have to consider the matrix of factor intercorrelations when interpreting
results. Manuals are usually clear about which is which, but if there is ever any ambiguity, a simple
rule is that if there is any capability to print out a matrix of factor correlations, then the rotation is
oblique, as no such capability is needed for orthogonal rotations.
Cluster Analysis is a technique used for classifying objects into groups. This can be used to sort
data (a number of people, companies, cities, brands or any other objects) into homogeneous groups
based on their characteristics.
The result of Cluster Analysis is a grouping of the data into groups called clusters. The researcher
can analyse the clusters for their characteristics and give the clusters names based on these.
Where can Cluster Analysis be applied?
The marketing application of cluster analysis is in customer segmentation and estimation of
segment sizes. Industries, where this technique is useful include automobiles, retail stores,
insurance, B-to-B, durables and packaged goods. Some of the well-known frameworks in consumer
behaviour (like VALS) are based on value cluster analysis.
Cluster Analysis is applicable when:
1. An FMCG company wants to map the profile of its target audience in terms of life-style,
attitude and perceptions.
2. A consumer durable company wants to know the features and services a consumer takes into
account, when purchasing through catalogues.
3. A housing finance corporation wants to identify and cluster the basic characteristics, lifestyles
and mindset of persons who would be availing housing loans. Clustering can be done based
on parameters such as interest rates, documentation, processing fee, number of installments
etc.
Process
There are two ways in which Cluster Analysis can be carried out:
1. First, objects/respondents are segmented into a pre-decided number of clusters. In this case,
a method called non-hierarchical method can be used, which partitions data into the specified
number of clusters
The above two are basic approaches used in cluster analysis. This can be used to segment customer
groups for a brand or product category, or to segment retail stores into similar groups based on
selected variables.
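The non-hierarchical approach, in which the number of clusters is decided in advance, can be sketched with a simple k-means routine. The customer scores below are invented for illustration, and the starting centres are simply the first k points, which is a simplification; statistical packages choose initial centres more carefully.

```python
from math import dist

# Hypothetical customers described by two interval-scaled scores
# (for example, attitude to new products and interest in outdoor activities).
customers = [
    (1.0, 2.0), (1.5, 1.8), (1.2, 2.2),
    (8.0, 8.0), (8.5, 7.5), (7.8, 8.3),
]


def assign(points, centres):
    """Label each point with the index of its nearest centre."""
    return [min(range(len(centres)), key=lambda c: dist(p, centres[c])) for p in points]


def k_means(points, k, max_iterations=20):
    """Partition points into k clusters; returns (labels, centres)."""
    centres = list(points[:k])  # naive start: the first k points
    for _ in range(max_iterations):
        labels = assign(points, centres)
        new_centres = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                # New centre is the mean of the members on each variable.
                new_centres.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centres.append(centres[c])  # keep the centre of an empty cluster
        if new_centres == centres:
            break
        centres = new_centres
    return assign(points, centres), centres


labels, centres = k_means(customers, k=2)
sizes = {c: labels.count(c) for c in set(labels)}
print("cluster labels:", labels)
print("segment sizes:", sizes)
```

The number of objects in each cluster gives an estimate of the relative size of each segment, and the cluster centres describe the typical member of each segment.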
Interpretation of Results
Ideally, the variables should be measured on an interval or ratio scale. This is because the clustering
techniques use the distance measure to find the closest objects to group into a cluster. An example
of its use can be clustering of towns similar to each other which will help decide where to locate
new retail stores.
If clusters of customers are found based on their attitudes towards new products and interest in
different kinds of activities, an estimate of the segment size for each segment of the population can
be obtained, by looking at the number of objects in each cluster.
Marketing strategies for each segment are fine-tuned based on the segment characteristics. For
instance, a segment of customers who like sports cars may be given a special promotional offer
during a specific period.
Names can also be given to clusters to describe each one. For example, there can be a
cluster called “neo-rich”. Segments are prioritized based on their estimated size.
Example: Suppose there are five attributes, 1 to 5, on which we are judging two
objects A and B. The existence of an attribute may be indicated by 1 and its absence
by 0. In this way, two objects are viewed as similar if they share common attributes.
$$\text{Similarity coefficient} = \frac{a + d}{a + b + c + d}$$
Where
a = No. of attributes possessed by both brands A and B
b = No. of attributes possessed by brand A but not by brand B
c = No. of attributes possessed by brand B but not by brand A
d = No. of attributes possessed by neither brand.
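The coefficient can be computed directly from the 0/1 attribute profiles of the two objects. In the sketch below the profiles of A and B are invented for illustration, and the formula used is the simple matching form (a + d)/(a + b + c + d), with a, b, c and d counted as defined above.

```python
def matching_coefficient(profile_a, profile_b):
    """Share of attributes on which two 0/1 profiles agree."""
    pairs = list(zip(profile_a, profile_b))
    a = sum(1 for p, q in pairs if p == 1 and q == 1)  # possessed by both
    b = sum(1 for p, q in pairs if p == 1 and q == 0)  # A only
    c = sum(1 for p, q in pairs if p == 0 and q == 1)  # B only
    d = sum(1 for p, q in pairs if p == 0 and q == 0)  # possessed by neither
    return (a + d) / (a + b + c + d)


# Hypothetical presence (1) or absence (0) of attributes 1 to 5.
brand_a = [1, 0, 1, 1, 0]
brand_b = [1, 1, 1, 0, 0]

similarity = matching_coefficient(brand_a, brand_b)
print(f"similarity = {similarity:.2f}")
# prints: similarity = 0.60
```

Here a = 2, b = 1, c = 1 and d = 1, so the two brands agree on 3 of the 5 attributes.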
3. A dialogue box will appear. Select all the variables which are required to be used in cluster
analysis. This can be done by clicking on the right arrow to transfer them from the variable
list on the left.
4. Click on METHOD. The dialogue box will open. Choose "Between Groups Linkage" as the
CLUSTER METHOD.
6. Click STATISTICS on the main dialogue box. Choose "Agglomeration schedule" so that it will
appear in the final output, then click CONTINUE.
7. Choose DENDROGRAM. Then, in the box called ICICLE, choose "All Clusters" and "Vertical".
8. Click OK on the main dialogue box to get the output of the hierarchical cluster analysis.
Stage 2
This stage is used to know how many clusters are required. This stage is called K-MEANS
CLUSTERING.
1. Click CLASSIFY, followed by K-MEANS CLUSTER.
2. Fill in the desired number of clusters that has been identified from stage 1.
3. Click OPTIONS on the main dialogue box. Select "Initial Cluster Centers". Then click
CONTINUE to return to the main dialogue box.
4. Click OK on the main dialogue box to get the output which has final clusters.
3. Those who go to Food World to buy and those who buy in a Kirana shop.
Suppose there is a comparison between the groups mentioned as above along with demographic
and socio-economic factors, then discriminant analysis can be used. One way of doing this is to
proceed and calculate the income, age, educational level, so that the profile of each group could be
determined. Comparing the two groups based on one variable alone would be informative but it
would not indicate the relative importance of each variable in distinguishing the groups. This is
because several variables within the group will have some correlation which means that one
variable is not independent of the other.
If we are interested in segmenting the market using income and education, we would be interested
in the total effect of two variables in combinations, and not their effects separately. Further, we
would be interested in determining which of the variables are more important or had a greater
impact. To summarize, we can say, that Discriminant Analysis can be used when we want to
consider the variables simultaneously to take into account their interrelationship.
Like regression, the value of dependent variable is calculated by using the data of independent
variable.
x = Independent variable
As can be seen in the above, each independent variable is multiplied by its corresponding
weightage.
This results in a single composite discriminant score for each individual. By taking the average of
discriminant score of the individuals within a certain group, we create a group mean. This is
known as centroid. If the analysis involves two groups, there are two centroids. This is very similar
to multiple regression, except that different types of variables are involved.
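The sketch below illustrates discriminant scores and centroids for two groups and two independent variables. It assumes the usual linear form of the discriminant function, Z = w1·x1 + w2·x2, and estimates the weights with Fisher's two-group method (pooled within-group scatter), which is one common choice and not necessarily the procedure a given package follows. The data on marketing executives and all names are invented for illustration.

```python
from statistics import mean

# Hypothetical data: (calls made per month, distributor visits per month).
winners = [(42, 9), (38, 8), (45, 10), (40, 7), (44, 9)]
others = [(25, 4), (30, 5), (28, 3), (22, 4), (27, 5)]


def group_means(group):
    return mean(p[0] for p in group), mean(p[1] for p in group)


def discriminant_weights(g1, g2):
    """Fisher weights w = S^-1 (m1 - m2), with S the pooled within-group scatter."""
    s11 = s12 = s22 = 0.0
    for group in (g1, g2):
        m1, m2 = group_means(group)
        for x1, x2 in group:
            s11 += (x1 - m1) ** 2
            s12 += (x1 - m1) * (x2 - m2)
            s22 += (x2 - m2) ** 2
    (a1, a2), (b1, b2) = group_means(g1), group_means(g2)
    d1, d2 = a1 - b1, a2 - b2
    det = s11 * s22 - s12 ** 2  # determinant of the 2 x 2 scatter matrix
    return (s22 * d1 - s12 * d2) / det, (s11 * d2 - s12 * d1) / det


w1, w2 = discriminant_weights(winners, others)


def score(person):
    """Composite discriminant score: each variable times its weight."""
    return w1 * person[0] + w2 * person[1]


# The centroid of a group is the mean discriminant score of its members.
centroid_winners = mean(score(p) for p in winners)
centroid_others = mean(score(p) for p in others)
cutoff = (centroid_winners + centroid_others) / 2  # midpoint, equal group sizes

new_executive = (35, 6)
predicted = "winner" if score(new_executive) > cutoff else "non-winner"
print(f"centroids: {centroid_winners:.2f} and {centroid_others:.2f}, cutoff {cutoff:.2f}")
print(f"new executive score {score(new_executive):.2f}: {predicted}")
```

The relative sizes of the weights indicate which variable contributes more to separating the groups, provided the variables are measured on comparable scales. The midpoint cutoff is appropriate only when the two groups are of equal size.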
Application
A company manufacturing FMCG products introduces a sales contest among its marketing
executives to find out “How many distributors can be roped in to handle the company’s product”.
Assume that this contest runs for three months. Each marketing executive is given a target for the
number of new distributors and the sales they can generate during the period. This target is fixed
on the basis of their past sales, for which data are available in the company. It is
also announced that marketing executives who add 15 or more distributors will be given a Maruti
Omni van as a prize. Those who generate between 5 and 10 distributors will be given a two-wheeler
as the prize. Those who generate fewer than 5 distributors will get nothing. Now assume that 5
marketing executives won a Maruti van and 4 won a two-wheeler.
The company now wants to find out, “Which activities of the marketing executive made the
difference in terms of winning a prize and not winning the prize”. One can proceed in a number of
ways. The company could compare those who won the Maruti van against the others.
Alternatively, the company might compare those who won one of the two prizes against those
who won nothing. It might also compare each group against each of the other two.
Discriminant analysis will highlight the differences in the activities performed by the members of
each group to get the prize. The activities might include:
1. A greater number of calls made to the distributors.
3. A dialogue box will appear. Select the GROUPING VARIABLE. This can be done by clicking on
the right arrow to transfer it from the variable list on the left to the grouping variable box
on the right.
4. Define the range of values by clicking on DEFINE RANGE. Enter the Minimum and Maximum
values, then click CONTINUE.
5. Select all the independent variables for discriminant analysis from the variable list by clicking
on the arrow that transfers them to the box on the right.
6. Click on STATISTICS on the lower part of the main dialogue box. This will open up a smaller
dialogue box.
7. Click on CLASSIFY on the lower part of the main dialogue box. Select SUMMARY TABLE
under the heading DISPLAY in the small dialogue box that appears.
Types of MDS
In general, there are two types of MDS:
1. Metric
2. Non-metric
Metric MDS makes the assumption that the input data is either ratio or interval data, while the non-
metric model requires simply that the data be in the form of ranks. Therefore, the nonmetric model
has fewer restrictions than the metric model, but also less rigor. One technique to use if you are
unsure whether your data is ordinal or can be considered interval is to try both metric and non-
metric models. If the results are very close, the metric model may be used.
An advantage of the non-metric models is that they permit the researcher to categorize and
examine preference data, such as the kind obtained in marketing studies or other areas where
comparisons are useful.
Another technique, correspondence analysis, can work with categorical data, i.e., data at the
nominal level of measurement; however, that technique will not be described here.
Notes:
Similarities and Differences between Factor Analysis and MDS
We have already seen that MDS can accept more different measures of similarity and
dissimilarity than factor analysis techniques can. In addition, there are some
differences in terminology. These differences reflect the origin of MDS in the field of
psychology. The measures corresponding to factors are called either
dimensions or stimulus coordinates.
The output of MDS looks very similar to that of factor analysis and the determination
of the optimal number of dimensions is handled in much the same way.
3. Decision about the number of stimulus coordinates that represent the data
Example: Let us say that you have a matrix of distances between a number of
major cities, such as you might find on the back of a road map. These distances can
be used as the input data to derive an MDS solution. When the results are mapped
in two dimensions, the solution will reproduce a conventional map, except that the
MDS plot might need to be rotated so that the north-south and east-west
dimensions conform to expectations. However, once the rotation is completed,
the configuration of the cities will be spatially correct.
Example: An airline would like to know, which is the most desirable combination
of attributes to a frequent traveler: (a) Punctuality (b) Air fare (c) Quality of food
served on the flight and (d) Hospitality and empathy shown.
Conjoint Analysis is a multivariate technique that captures the exact levels of utility that an
individual customer places on various attributes of the product offering. Conjoint Analysis enables
a direct comparison,
Example: A comparison between the utility of a price level of 400 versus 500, a
delivery period of 1 week versus 2 weeks, or an after-sales response of 24 hours versus 48
hours.
Once we know the utility levels for each attribute (and at individual levels as well), we can combine
these to find the best combination of attributes that gives the customer the highest utility, the
second best combination that gives the second highest utility, and so on. This information is then
used to design a product or service offering.
Application
Conjoint Analysis is extremely versatile, and its range of applications includes virtually any
industry. New product or service design, including concepts at the pre-prototyping stage, can
specifically benefit from conjoint applications.
Some examples of other areas where this technique can be used are:
1. Designing an automobile loan or insurance plan in the insurance industry,
2. Designing a complex machine for business customers.
Process
Design attributes for a product are first identified. For a shirt manufacturer, these could be design,
such as designer shirts versus plain shirts, and price, such as 400 versus 800. The outlets can have
exclusive distribution or mass distribution. All possible combinations of these attribute levels are
then listed out. Each design combination will be ranked by customers and used as input data for
Conjoint Analysis. Then the utility of the products relative to price can be measured.
The output is a part-worth or utility for each level of each attribute. For example, the designer shirt
may get a utility level of 5 and the plain shirt 7.5. Similarly, exclusive distribution may have a part
utility of 2, and mass distribution 5.8. We then put together the part utilities and come up with a
total utility for any product combination we want to offer and compare that with the maximum
utility combination for this customer segment.
This process makes clear to the marketer which attributes of the product or service they
should focus on in the design.
If a retail store finds that the height of a shelf is an important attribute for selling at a particular
level, a well-designed shelf may result from this knowledge. Similarly, a designer of clocks will
benefit from knowing the utility attached by customers to the dial size, background colours, and
price range of the clocks.
Approach
From a discussion with the client, identify the design attributes to be studied and the levels at
which they can be offered. Then build a list of product concepts on offer. These product concepts
are then ranked by customers. Once this data is available, use Conjoint Analysis to derive the part
utilities of each attribute level. This is then used to predict the best product design for the given
customer segment. Use the SPSS Conjoint procedure to analyse the data.
There are three steps in conjoint analysis:
1. Identification of relevant products or service attributes.
2. Collection of data.
3. Estimation of worth for the attribute chosen.
For attribute selection, the market researcher can interview the customers directly.
2. Now 2 files namely DATA FILE 1 and DATA FILE 2 are created.
3. A third file, called the SYNTAX file, is to be opened by using the FILE, OPEN command followed
by Syntax.
4. Type the following - conjoint plan = DATA FILE 1 SAV/DATA' DATA FILE 2 SAV/
One combination (3 kg, 4 hours, Dell) clearly dominates, and 5 kg, 2 hours, Lenovo
is least preferred.
Let us now take the average rank for the 3 kg option = (4 + 3 + 2 + 1)/4 = 2.5
For the 5 kg option the average rank is (5 + 8 + 7 + 6)/4 = 6.5
For the 4 hour option (5 + 3 + 7 + 1)/4 = 4
For the 2 hour option (4 + 8 + 2 + 6)/4 = 5
For Dell (5 + 6 + 1 + 2)/4 = 3.5
For Lenovo 5.5
Looking at the difference in average ranks, the most important characteristic to
this respondent is weight (difference = 4), followed by brand name (2) and battery life (1).
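The same arithmetic can be checked with a short Python sketch. The full rank table is not reproduced here, so the eight rows below are an assumption: they are reconstructed so that each level receives exactly the ranks summed above (1 = most preferred).

```python
# Rank table reconstructed from the rank sums in the text (an assumption, 1 = most preferred).
# Each row: (weight, battery life, brand, rank given by the respondent).
ranks = [
    ("3 kg", "4 hours", "Dell", 1),
    ("3 kg", "2 hours", "Dell", 2),
    ("3 kg", "4 hours", "Lenovo", 3),
    ("3 kg", "2 hours", "Lenovo", 4),
    ("5 kg", "4 hours", "Dell", 5),
    ("5 kg", "2 hours", "Dell", 6),
    ("5 kg", "4 hours", "Lenovo", 7),
    ("5 kg", "2 hours", "Lenovo", 8),
]
ATTRIBUTES = {"weight": 0, "battery life": 1, "brand": 2}

def average_ranks(column):
    """Average rank of every level found in the given column of the table."""
    by_level = {}
    for row in ranks:
        by_level.setdefault(row[column], []).append(row[3])
    return {level: sum(values) / len(values) for level, values in by_level.items()}

# Importance of an attribute = gap between its best and worst average rank.
importance = {}
for name, column in ATTRIBUTES.items():
    averages = average_ranks(column)
    importance[name] = max(averages.values()) - min(averages.values())
    print(name, averages, "difference:", importance[name])
```

The larger the difference, the more the respondent's ranking changes when that attribute changes, which is why weight comes out as the most important characteristic.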
Summary
Multivariate analysis is used if there are more than 2 variables.
Some of the multivariate analysis techniques are discriminant analysis, factor analysis, cluster
analysis, conjoint analysis, and multi-dimensional scaling.
In discriminant analysis, it is verified whether the 2 groups differ from one another.
Factor analysis is used to reduce a large number of variables into fewer factors.
Cluster analysis is used for segmenting the market or identifying the target group.
Regression is a term used for predicting the value of one variable from the other.
MDS is a set of multivariate statistical methods for estimating the parameters in, and
assessing the fit of, various spatial distance models for proximity data.
The output of MDS looks very similar to that of factor analysis and the determination of the
optimal number of dimensions is handled in much the same way.
Keywords
Cluster Analysis: Cluster Analysis is a technique used for classifying objects into groups.
Conjoint Analysis: Conjoint analysis is concerned with the measurement of the joint effect of two
or more attributes that are important from the customers’ point of view.
Discriminant Analysis: In this analysis, two or more groups are compared. In the final analysis, we
need to find out whether the groups differ from one another.
Factor Analysis: Factor Analysis is an analysis whose main purpose is to group a large set of
variables into fewer factors.
Multivariate Analysis: In multivariate analysis, many variables are tackled at the same time.
Self Assessment
1-In discriminant analysis the averages for the independent variables for a group define the
A. centroid
B. median
C. mode
D. central tendency
2-_____ is a method for deriving the utility values that consumers attach to varying levels of a
product's attributes.
A. Regression
B. Conjoint analysis
C. Correlation
D. T test
3-The conjoint analysis procedure is based on trade-offs respondents make when evaluating
alternatives.
A. True
B. False
A. Researchers
B. Industries
C. Marketers
D. Consumers
A. Descriptive
B. Predictive
C. Inferential
D. None of These
9-Factor analysis is a(n) _____ in that the entire set of interdependent relationships is examined.
10-_____ are simple correlations between the variables and the factors.
A. Factor scores
B. Factor loadings
C. Correlation loadings
D. Both a and b are correct
A. personal choice
B. Kaiser’s rule
C. Scree test
D. Both Kaiser’s rule and Scree test
Review Questions
1. Which technique would you use to measure the joint effect of various attributes while
designing an automobile loan and why?
2. Do you think that the conjoint analysis will be useful in any manner for an airline? If yes how,
if no, give an example where you think the technique is of immense help.
3. In your opinion, what are the main advantages of cluster analysis?
4. Which analysis would you use in a situation when the objective is to summarize information
from a large set of variables into fewer factors? What will be the steps you would follow?
5. Which analysis would tell you whether it is possible to estimate the size of different groups?
6. Which analysis would you use to compare a good, bad and a mediocre doctor and why?
7. Analyse the weaknesses of principal component factor analysis.
8. Which multivariate analysis would you apply to identify specific customer segment for a
company’s brand and why?
9. Critically evaluate multidimensional scaling.
10. In your opinion what will be the disadvantages of having too many independent variables in
an MR study?
11. People have been rated on their suitability for an advanced training course in computer
programming on the basis of six ratings given by their manager (rated 1=low to 20=high):
(a) Intellect
The training department believes that these are really measuring only three things: intellect,
computer programming experience and loyalty, and wants you to carry out a factor analysis
to explore that hypothesis. Describe the decisions you would have to make in carrying out a
factor analysis and what the results would be likely to tell you.
12. Six observations on two variables are available, as shown in the following table:
Obs. X1 X2
a 3 2
b 4 1
c 2 5
d 5 2
e 1 6
f 4 2
(a) Plot the observations in a scatter diagram. How many groups would you say there are,
and what are their members?
(b) Apply the nearest neighbor method and the squared Euclidean distance as a measure
of dissimilarity. Use a dendrogram to arrive at the number of groups and their
membership.
13. Six observations on two variables are available, as shown in the following table:
Obs. X1 X2
a -1 -2
b 0 0
c 2 2
d -2 -2
e 1 -1
f 1 2
(a) Plot the observations in a scatter diagram. How many groups would you say there are,
and what are their members?
(b) Apply the nearest neighbor method and the Euclidean distance as a measure of
dissimilarity.
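As an aid for checking a hand-drawn dendrogram in questions 12 and 13, the nearest neighbor (single linkage) method can be sketched as follows. This is an illustrative implementation, not a prescribed one: the function names are chosen for the example, and when several pairs are equally close the pair listed first is merged, which is only one possible way of breaking ties.

```python
from itertools import combinations

def squared_euclidean(p, q):
    """Squared Euclidean distance between two observations."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def single_linkage(points, dist=squared_euclidean):
    """Merge the two nearest clusters repeatedly; return (cluster, cluster, distance) per step."""
    clusters = [frozenset([name]) for name in points]
    merges = []
    while len(clusters) > 1:
        def gap(pair):
            # Nearest neighbor rule: cluster distance = smallest distance between members.
            left, right = pair
            return min(dist(points[i], points[j]) for i in left for j in right)
        a, b = min(combinations(clusters, 2), key=gap)
        merges.append((sorted(a), sorted(b), gap((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

# Observations from question 12.
q12 = {"a": (3, 2), "b": (4, 1), "c": (2, 5), "d": (5, 2), "e": (1, 6), "f": (4, 2)}
for left, right, distance in single_linkage(q12):
    print(left, "+", right, "merged at distance", distance)
```

Each printed line corresponds to one join in the dendrogram, and a large jump in the merge distance suggests where to cut it. For question 13, pass that table's observations and a plain Euclidean distance function as `dist`.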
6. D 7. B 8. C 9. C 10. B
Further Readings
A Parasuraman, Dhruv Grewal, Marketing Research, Biztantra
Cisnal Peter, Marketing Research, MCGE.
Hague & Morgan, Marketing Research in Practice, Kogan page.
Paneerselvam, R, Research Methods, PHI.
Tull and Donalds, Marketing Research, MMIL
Web Links
https://www.qualtrics.com/au/experience-management/research/factor-analysis/
https://stats.idre.ucla.edu/spss/seminars/introduction-to-factor-analysis/a-practical-introduction-to-factor-analysis
https://www.qualtrics.com/au/experience-management/research/cluster-analysis/
https://www.statisticshowto.com/multidimensional-scaling/
https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/NCSS/Multidimensional_Scaling.pdf
Objectives
After studying this unit, you will be able to:
Introduction
A report is a formal document written for a number of objectives in the sciences, social sciences,
engineering, and business fields. The findings of a specific task are usually written up
in a report. It's worth noting that reports are seen as legal papers in the workplace; therefore they
must be exact, accurate, and difficult to misunderstand.
At its most fundamental level, report writing is defined by three characteristics: a set framework,
independent parts, and unbiased conclusions.
Predefined structure: Broadly, headings indicate the sections within a report, such
as an introduction, discussion, and conclusion.
Independent sections: Each section in a report is typically written as a stand-alone piece,
so the reader can selectively identify the report sections they are interested in, rather than
reading the whole report through in one go from start to finish.
Unbiased conclusions: A third element of report writing is that it is an unbiased and
objective form of writing.
We argue from observation during the inductive phase. We reason towards the observation during
the deductive phase. The success of the interpretation depends on the quality of the data analysis.
If the data have not been adequately analysed, the interpretation may be incorrect. For the
analysis to be correct, good data collection is required. Similarly, if the data is correct but the
analysis is incorrect, the interpretation or conclusion will be incorrect as well. Even with good data
and analysis, the data might sometimes lead to incorrect interpretation. The researcher's experience
and the methods used for interpretation also play a role.
The results may lead us to the conclusion that the second sales promotion method was
the most effective in developing sales. This may be adopted nationally to promote the
product. But one cannot say that the same method of sales promotion will be effective
in each and every city under study.
Make copies of your data and store the master copy away. Use the copy for making edits,
cutting and pasting, etc.
Tabulate the information, i.e., add up the number of ratings, rankings, yes's, no's for each
question.
For ratings and rankings, consider computing a mean, or average, for each question. For
example, "For question #1, the average ranking was 2.4". This is more meaningful than
indicating, e.g., how many respondents ranked 1, 2, or 3.
Consider conveying the range of answers, e.g., 20 people ranked "1", 30 ranked "2", and 20
people ranked "3".
Sort comments into categories based on their content, such as worries, suggestions,
strengths, shortcomings, comparable experiences, programme inputs, recommendations,
outputs, result indicators, and so on.
Label the categories or themes with words like "concerns," "suggestions," and so on.
Look for patterns, associations, and causal relationships in the themes, such as whether all
people who attended evening programmes had similar concerns, whether most people
were from the same geographic area, whether most people were in the same salary range,
what processes or events respondents experienced during the programme, and so on.
Save all comments for several years after they've been completed in case they're needed in
the future
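The tabulation steps above (counting the answers, computing an average, and conveying the range of answers) can be illustrated with a small Python sketch. The responses are hypothetical and simply reuse the counts from the example: 20 people ranked "1", 30 ranked "2", and 20 ranked "3".

```python
from collections import Counter
from statistics import mean

# Hypothetical answers to one question, using the counts from the example above.
responses = [1] * 20 + [2] * 30 + [3] * 20

tally = Counter(responses)   # how many respondents gave each ranking
average = mean(responses)    # one summary figure for the question

print("Average ranking:", round(average, 1))
for rank in sorted(tally):
    print(f'{tally[rank]} people ranked "{rank}"')
```

Reporting the average together with the tally gives both a single summary figure and a sense of how the answers are spread.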
Interpreting Information
Attempt to put the data into context, e.g., compare results to what you expected, promised
results; management or programme staff; any common standards for your products or
services; original goals (especially if you're conducting a programme evaluation);
indications or measures of achieving outcomes or results (especially if you're conducting
an outcomes or performance evaluation); and desirability (especially if you're conducting an
outcomes or performance evaluation).
Take into account suggestions to assist employees in improving the programme, product,
or service; judgments about programme operations or reaching targets, and so on.
Write out your conclusions and recommendations in a report, together with the
interpretations that support them.
Precautions
1. Keep the main objective of research in mind.
2. Analysis of data should start from simpler and more fundamental aspects.
3. It should not be confusing.
4. The sample size should be adequate.
5. Take care before generalizing from the sample studied.
6. Give due attention to significant questions.
Caution: In report writing, do not overlook the significance of some answers merely because
they come from very few respondents, such as "don't know" or "can't say".
Quantify when you have the data to do so. Avoid vague words such as "large" or "small"; instead,
say "50%" or "one in three".
Be precise and specific in your phrasing of findings.
Inform, not impress. Avoid exaggeration.
Use short sentences.
Use adverbs and adjectives sparingly.
Avoid the passive voice, if possible, as it creates vagueness (e.g., 'patients were
interviewed' leaves uncertainty as to who interviewed them) and repeated use makes
dull reading.
Aim to be logical and systematic in your presentation.
Caution: In report writing, be consistent in the use of tenses (past or present tense).
Oral Report
This form of reporting is required when the researchers are asked to give an oral presentation.
Giving an oral presentation is more challenging than writing a report because the presenter must
communicate directly with the audience. Any faltering during an oral presentation can leave the
listeners with a poor impression, which may in turn lower the presenter's self-confidence.
Communication is therefore crucial in an oral presentation. A lot of planning and thinking is
required to decide 'what to say', 'how to say it', and 'how much to say'. In addition, the presenter
may face many questions from the audience. An oral presentation needs much preparation; its
broad structure is as follows.
Opening: A brief statement can be made on the nature of discussion that will follow. The opening
statement should explain the nature of the project, how it came about and what was attempted.
Finding/Conclusion: Each conclusion should be stated and backed up by findings.
Recommendation: Each recommendation must be supported by a conclusion. At the end of the
presentation, a question-and-answer session with the audience should follow.
Method of presentation: Visuals can be used where something needs to be exhibited. Presenting
statistical information in tabular form helps the audience.
(a) What type of presentation to give is a root question: is it read from a script, memorized, or
spoken extemporaneously? Memorization is not advised because a slip could occur during the
presentation, and because it results in a speaker-centric approach. Reading from the manuscript is
not recommended either, because it becomes monotonous, uninteresting, and lifeless. The best
technique is to speak extempore from notes of the main points, which can then be expanded. The
points should follow a logical sequence.
Written Report
Following are the Various Types of Written Reports:
(A) Reports can be classified based on the time-interval such as:
1. Daily
2. Weekly
3. Monthly
4. Quarterly
5. Yearly
(B) Type of reports:
1. Short report
2. Long report
3. Formal report
4. Informal report
5. Government report
1. Short Report: Short reports are produced when the problem is very well defined and the
scope is limited, for example, a monthly sales report. Such a report runs to about five pages. It
reports the progress made with respect to a particular product in a clearly
specified geographical location.
2. Long Report: This could be both a technical report as well as non-technical report. This will
present the outcome of the research in detail.
(a) Technical Report: This will include the sources of data, research procedure, sample
design, tools used for gathering data, data analysis methods used, appendix, conclusion
and detailed recommendations with respect to specific findings. If any journal, paper or
periodical is referred, such references must be given for the benefit of reader.
(b) Non-technical Report: This report is meant for those who are not technically qualified,
e.g., the chief of the finance department, who may be interested only in financial implications
such as margins and volumes, and not in the methodology.
3. Formal Report: For example, the report prepared by the marketing manager for submission to
the Vice-President (Marketing) on quarterly performance, or reports on test marketing.
4. Informal Report: The report prepared by the supervisor by way of filling in the shift log book,
to be used by his colleagues.
Summary
A report is a formal document written for a number of objectives in the sciences, social
sciences, engineering, and business fields.
The most important thing to remember when writing a research report is to communicate
with the audience.
The report should be able to pique the readers' curiosity. As a result, the report should be
written with the reader in mind.
Accuracy and clarity are two other factors to consider while writing a report.
The following points should be kept in mind when giving an oral presentation: the
language used, time management, graph use, report purpose, and so on. The audience
must be able to understand the visuals used.
The presenter must ensure that the presentation is finished within the allowed time. It's a
good idea to set aside some time for questions and answers.
A written report may be classified as a short report or a long report, depending on whether it is
brief or extensive. It can also be divided into two categories: technical and non-technical
reports.
The report's style should be straightforward and to the point.
In report writing, there should not be an excessive amount of detail, and qualitative data
should not be overlooked.
Keywords
Appendix: The part of the report whose purpose is to provide a place for material which is not
absolutely essential to the body of the report.
Executive Summary: It is a condensed version of the whole report.
Informal Report: The report prepared by the supervisor by way of filling the shift log book, to be
used by his colleagues
Short Report: Short reports are the reports that are produced when the problem is very well
defined and if the scope is limited.
Self Assessment
1-Through interpretation, a researcher can ………… relations and processes that underlie his
findings.
A. Expose
B. Hide
C. Transfer
D. None of These
A. Answers
B. Extension
C. Problem
D. None of these
3-Which of the following does not represent the sequence from data to Knowledge?
A. Analysis
B. Interpretation
C. Findings
D. Results
A. false generalization
B. wrong interpretation of statistical measures
C. data with consistent homogeneity
D. identification of correlation with causation
A. Casual
B. Formal
C. Informal
D. Technical
A. Numerical
B. Technical
C. Specialized
D. Descriptive
A. Periodic Reports
B. Formal Reports
C. Long Reports
D. Business Reports
A. Structured Data
B. New Variables
C. Knowledge Gaps
D. None of These
A. Consistency
B. Regularity
C. Clarity
D. None
11- Aim must be logical and ……………in the report presentation
A. Organised
B. Systematic
C. Structured
D. None
13-Effective oral presentation techniques include all of the following except ________.
A. the use of visual aids displayed with a variety of media
B. allowing sufficient opportunity for questions, both during and after the presentation
C. not spending much time on the reason for the research and getting to the results quickly
D. constant eye contact and interaction with the audience
14-Effective oral presentation techniques include all of the following except ________.
A. the speaker should vary the volume, pitch, voice quality, articulation, and rate while
speaking
B. terminate the presentation with a strong closing
C. the presentation should be sponsored by a top-level manager in the client's organization
D. All of the above are correct.
A. Experiment
B. Investigation
C. Inquiry
D. All of these
6. B 7. A 8. D 9. C 10. A
Review Questions
1. What is a research report?
7. What are the various criteria used for classification of written report?
Further Readings
Abrams, M.A., Social Surveys and Social Action, London: Heinemann, 1951.
Arthur, Maurice, Philosophy of Scientific Investigation, Baltimore: Johns Hopkins
University Press, 1943.
Bernal, J.D., The Social Function of Science, London: George Routledge and Sons, 1939.
Chase, Stuart, The Proper Study of Mankind: An inquiry into the Science of Human
Relations, New York, Harper and Row Publishers, 1958.
S. N. Murthy and U. Bhojanna, Business Research Methods, Excel Books
Web Links
https://www.ets.org/Media/Research/pdf/RM-12-05.pdf
https://eduvoice.in/types-research-report-writing/
https://www.yourarticlelibrary.com/marketing/research-report-introduction-definition-and-report-format/48713
https://www.questionpro.com/blog/research-reports/
Objectives
After studying this unit, you will be able to:
Appraise the purpose of a research proposal.
Summarise the criteria for evaluation of a research proposal.
Introduction
In order to prepare for your dissertation, you may be expected to produce a brief proposal or plan
explaining your research idea and how you propose to carry it out. This is a good method to get
ready for your research and it will get you thinking about a lot of the topics addressed in the next
section. The proposal will ask you to demonstrate some understanding of the literature in your
chosen field, in addition to stating your intended research methodology and techniques, the topic
area in which your study will be conducted, and the research questions that you aim to answer.
A research proposal is a comprehensive plan, scheme, structure, and strategy for obtaining answers
to your research project's study questions or challenges. A research proposal should detail the steps
you propose to take to achieve your research goals, test hypotheses (if any), and answer your
research questions. It should also explain why you're conducting the research. In general, the major
purpose of a research proposal is to explain the operational plan for acquiring answers to your
study questions. As a result, the reader is assured of the methodology's validity in finding accurate
and objective responses to your research questions.
In order to achieve this function, a research proposal must tell you, your research supervisor and
reviewers the following information about your study:
What you are proposing to do;
How you plan to find answers to what you are proposing;
Why you selected the proposed strategies of investigation.
Preamble/introduction
The proposal should begin with an introduction, which should include some of the following
details. Keep in mind that some of the content mentioned in this area may not be applicable to all
studies, so choose just what is relevant to your research. The literature review is crucial when
drafting this section since it serves two purposes:
1. It acquaints you with the available literature in the area of your study, thereby broadening your
knowledge base.
2. It provides you with information on the methods and procedures other people have used in
similar situations and tells you what works and what does not.
The kind, scope, and quality of a literature review are largely determined by the academic level for
which the proposal is being written. The contents of this part may also vary significantly depending
on the study topic. Begin by taking a broad view of the main subject area before eventually focusing
on the central problem under consideration. Cover the following aspects of your research field
when doing so:
An overview of the main area under study;
A historical perspective (development, growth, etc.) pertinent to the study area;
Philosophical or ideological issues relating to the topic;
Trends in terms of prevalence, if appropriate;
Major theories, if any;
The main issues, problems and advances in the subject area under study;
Important theoretical and practical issues relating to the central problem under study;
The Problem
After giving a comprehensive overview of the topic, concentrate on topics related to the major
theme, noting some of the gaps in the existing body of knowledge. Determine some of the most
important unanswered questions. Some of the primary research questions that you'd like to answer
with your study should be stated here, along with a reason and relevance for each.
Knowledge gained from other studies and the literature about the issues you are proposing to
investigate should be an integral part of this section. Specifically, this section should:
-Identify the issues that are the basis of your study;
-Specify the various aspects of/perspectives on these issues;
-Identify the main gaps in the existing body of knowledge;
-Raise some of the main research questions that you want to answer through your study;
-Identify what knowledge is available concerning your questions, specifying the differences of
opinion in the literature regarding these questions if differences exist;
-Develop a rationale for your study with particular reference to how your study will fill the
identified gaps.
The Setting
Describe the organisation, agency, or community where you will conduct your research in a few
words. If the study is on a group of individuals, emphasize some of the group's most important
qualities (such as its history, size, makeup, and structure) and bring attention to any relevant
material that is accessible.
Include the following information in your description if your study is about an agency, office, or
organisation: the agency, office, or organization's core services; its administrative structure; the
types of clients served; information about the topics that are key to your research. If you're
researching a community, quickly describe some of its most important aspects, such as the
community's size, a social profile of the community (i.e. the make-up of the various groups within
it), and difficulties related to the study's major theme.
Measurement Procedures
This section should cover your instrument as well as the specifics of how you want to
operationalize your primary variables. To begin, defend your research tool selection by noting its
advantages and disadvantages. Then, make a list of the important components of your research
instrument and how they relate to the study's main goals. If you're using a standard instrument,
talk about the evidence for its reliability and validity briefly. Describe and explain any
modifications you make if you adapt or modify it in any manner.
You should also explain how you'll operationalize the main concepts. If you're going to
measure effectiveness, for example, be specific about how you'll do it. If you plan to measure the
self-esteem of a group of people, mention the key indicators of self-esteem and the techniques for
measuring them (e.g. the Likert or Thurstone scale, or any other procedure).
Ideally, for quantitative studies you should attach a copy of the research instrument to your
proposal.
Ethical Issues
Any ethical difficulties that research may have are taken seriously by all academic institutions. To
cope with them, every institution has an ethics policy in place. You must be familiar with the
policies of your institution. It is critical that you highlight any ethical difficulties in your proposal
and explain how you plan to address them. You must consider ethical considerations from the
perspective of your respondents, and you must specify the process in place to deal with any
potential 'harm', whether psychological or otherwise.
Sampling
Under this section of the proposal include the following:
the size of the sampling population (if known), as well as where and how this information will
be obtained;
the size of the sample you intend to choose, as well as your justifications for choosing it;
an explanation of the sampling design you want to apply in the sample selection (simple random
sampling, stratified random sampling, quota sampling, etc.).
Analysis of data
Describe in broad terms the data analysis strategy you propose to use. Indicate whether the data
will be analysed manually or by computer. Name the application you'll use to analyse the
data and, if necessary, the statistical procedures you'll apply. For quantitative studies, identify
the main variables for cross-tabulation.
Describe how you plan to analyse your interviews or observation notes in qualitative research to
derive meaning from what your respondents have stated regarding the issues covered or
observation notes taken. Identifying main themes by analysing the contents of the information
acquired in the field is one of the most prevalent ways. You must first select whether you want to
analyse this data manually or with the help of a computer programme.
Work Schedule/Timelines
Because you must do the research within a specified time range, you must give yourself dates. List
the many operational processes you'll need to perform, along with the deadlines for each.
Remember to set aside some time near the end as a "cushion" in case the research process does not
go as planned.
• A timeline is a very important part of a research/project proposal.
• It shows the chronological order of events that a researcher plans to do in his/her project.
• It is supposed to give the reader a broad overview of the project at a glance. It does not
have to be very detailed.
• A timeline gives a clear indication of the time frame for the project: it identifies the tasks,
the times when each activity of the project will be implemented, and the responsible member
of the team.
• A timeline is displayed most effectively in a graphic, table or spreadsheet, which helps
demonstrate the feasibility of the project in a very visible way.
Timeline-Phases
• Planning phase
• Implementation phase
• Follow-through phase
Timelines-Considerations
• Timelines need to be realistic and represent the entire duration of the project
• Show project timelines using most appropriate style, for example: Bar chart (Gantt chart).
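A bar-chart timeline of this kind can be drafted even in plain text. The sketch below is only an illustration: the three phases come from the list above, while the start weeks and durations are hypothetical values chosen for the example.

```python
# Hypothetical schedule: (phase, start week, duration in weeks).
schedule = [
    ("Planning", 1, 4),
    ("Implementation", 5, 8),
    ("Follow-through", 13, 3),
]

def gantt_rows(tasks):
    """Return one text row per task, with '#' marking the weeks in which the task runs."""
    total_weeks = max(start + length - 1 for _, start, length in tasks)
    rows = []
    for name, start, length in tasks:
        bar = " " * (start - 1) + "#" * length
        rows.append(f"{name:<15}|{bar:<{total_weeks}}|")
    return rows

print("\n".join(gantt_rows(schedule)))
```

Each row covers the entire duration of the project, so overlaps and gaps between phases are visible at a glance. A spreadsheet or charting tool would normally be used for the final proposal.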
Budgets
• Outlines the funds needed to conduct the proposed research effectively
• Outlines exactly what the researcher realistically needs from the funding agency to conduct
the research
• Budget should be realistic
• Aligns with agency suggested/required budget categories
• The budget should align with the activities proposed in research design
Budget categories
• Personnel (salary and benefits)
• Researcher (time, salary and benefits)
• Training
• Consultants and/or resource person (salary)
• Instruction
• Equipment
• Supplies (paper, tapes, film, batteries, printing costs, publication cost etc.)
• Communication (telephone/postage/Internet/ media)
• Materials preparation (software, medical supplies, copying and printing)
• Travel/subsistence
• Community liaison
• Rental of facilities
• Evaluation
• Indirect costs (costs that your organization requires you to include)
• Other expenses (lunches for Meetings, interviews etc.)
Budget Justification
• Justify each budget item
• Demonstrate how the budget items align with the activities to be undertaken in your
research design
• Provide details on additional sources of funding available to the organization or Principal
investigator
• If the funds will go to different institutions, indicate allocation of funds by site
Proposal Presentation
• A proposal presentation has a distinct audience and purpose, a researcher should assume
his/her audience to be:
Presentation Outline
• Brief research overview
• Sufficient background information for everyone to understand the proposal
• Statement of the research problem and goals
• Project details and methods
• Predicted outcomes if everything goes according to plan and if nothing does
• Needed resources to complete the work (budget proposal)
• Societal impact if all goes well
• Timetable of activities (Gantt Chart)
Proposal Defence
• The Proposal should be prepared in accordance with formatting style and guidelines.
• The Proposal should include chapters (Introduction, Literature Review, and Methodology)
and their traditional elements, the References, and appropriate Appendices (surveys,
assessments, measurement scales).
• The defense should begin with a description of the context or background for the research
question(s) in the study.
• It should also define key terms and variables and identify hypotheses.
• The Proposal defense serves as an opportunity for the researcher to present the proposed
study as a comprehensive and well-defined plan for the research.
• The format of the Proposal defense is a brief and succinct presentation followed by
questions from the review committee.
Points to defend
• Significance of the proposed research
• A summary of key points extracted from the literature on the topic
• A description of the conceptual framework and how the problem will be measured or
assessed
• A proposal for analysis and interpretation of data or evidence
• Following the presentation, each member of the evaluation committee will be given the
opportunity to put questions to the candidate.
• The questions aim to probe the candidate’s understanding of the Proposal and to clarify,
for both the candidate and Committee members, the information that has been presented.
• Committee members may also suggest changes in any aspect of the Proposal.
• Opinions may differ; should differences arise, the chair provides guidance.
• The Proposal defense requires demonstration of two main elements:
• The candidate, Chair, and Committee have thought deeply and carefully about
the Proposal; the “big picture” is defensible.
• The candidate is able to weigh the suggestions of the Committee and accept those
that will strengthen the study.
Appendix
As an appendix, in the case of quantitative studies, attach your research instrument.
Also, attach a list of references in the appendix of the proposal.
Summary
A research proposal details the operational plan for obtaining answers to research questions.
A research proposal must tell your supervisor and others what you propose to do, how
you plan to proceed and why the chosen strategy has been selected.
A research proposal thus assures readers of the validity of the methodology used to obtain
answers accurately and objectively.
The outline given in this unit provides only a framework within which a research proposal
for both quantitative and qualitative studies should be written, and it assumes that you are
reasonably well acquainted with research methodology and an academic style of writing.
Keywords
Title: A summary of the study's central idea; it should use the fewest exact words that effectively
convey the research's overall meaning.
Introduction: A brief background of the selected topic, including the objective, significance,
relevance, and applicability of the outcomes. It should clearly state the main points of the study. It
also comprises objectives, which define the researcher's goals for the investigation.
Review of Literature: An overview of the chosen issue based on essential writings from
other sources, such as authentic websites, government records, and academic journal articles.
Every source is described, summarized, and evaluated in a literature review.
Research Gap: The element missing from the existing literature in the field of research. This is an
area where the study should be unique in order to fill a gap in a certain field of research.
Theoretical and Conceptual Framework: The theoretical framework sets out the relationships that
the research will examine and supports the study's theory; the conceptual framework reflects the
overall architecture of the study and gives a visual representation of the relationship between
variables.
Hypothesis: A testable prediction of what is likely to happen in the research; a tentative
statement that emphasizes the link between the study's variables.
Methodology: A description of what type of research is conducted and how it is conducted, so that
readers can assess the research's validity and reliability. This chapter comprises several
sections, including sample design, data collection process, sample size, statistical techniques
to be used, and so on.
Conclusions: A brief recap of the entire research procedure that has been included in the synopsis
and will be carried out in the main research.
Timeline: Defines the time that will be required for each phase of the research, which
must correspond to the time limit set by the relevant authorities.
References: Gives credit to the authors who contributed ideas and words to the research work. For
references, use an appropriate format, such as APA, Harvard, or MLA.
Self Assessment
1. A good research proposal will always
C. Consider all possible research that had previously been done on the topic.
4. A review of the literature prior to formulating research questions allows the researcher to do
which of the following?
C. Concludes with a statement of the research questions and, for quantitative research, it
includes the research hypothesis.
9. The research participants are described in detail in which section of the research proposal?
A. Introduction.
B. Research Methodology.
C. Data Analysis.
D. Conclusion.
10. According to the text, which of the following orders is the one recommended in the flowchart
of the development of a research idea?
A. Research topic, research problem, research purpose, research question, and hypothesis.
B. Research topic, research purpose, research problem, research question, and hypothesis.
C. Research topic, research problem, research purpose, research question, and hypothesis.
B. Guidelines on ethics.
B. It looks authoritative.
C. It shows that you are knowledgeable about the literature that relates to your research topic.
13. Which section of the research proposal describes the purpose with a full statement of the
research question?
A. Introduction.
B. Research Methodology.
C. Literature review.
D. References.
A. I hope to
15. What helps to agree timings, agree resource allocation and also draws boundaries?
A. The questionnaire.
B. The Proposal.
Answers: Self-Assessment
1. B 2. D 3. B 4. D 5. D
6. B 7. D 8. B 9. B 10. A
Review Questions
1. Enumerate the contents of a research proposal.
2. How important is it to formulate objectives in the proposal?
3. Draft a sample research proposal on a topic of your choice.
4. Why is it important to write about limitations in a research proposal?
5. What are the considerations in presenting a research proposal?
6. Throw light on the research proposal defence.
7. State why it is important to have timelines in a research proposal.
Further Readings
Business Research Methods by Naval Bajpai, Pearson
Marketing Research: Text and Cases by Nargundkar, R., McGraw Hill Education
Web Links
https://www.monash.edu/rlo/graduate-research-writing/write-the-thesis/writing-a-research-proposal
https://www.youtube.com/watch?v=d8brslGIi10
https://www.youtube.com/watch?v=eALzUfkQJRU
https://www.youtube.com/watch?v=aj4H2nVuqNE
https://www.youtube.com/watch?v=NIUTwCoLIVo
https://www.birmingham.ac.uk/schools/law/courses/research/research-proposal.aspx