Research Methods –STA630 VU

_________________________________________________________________________________

Lesson 18
CRITERIA FOR GOOD MEASUREMENT
Now that we have seen how to operationally define variables, it is important to make sure that the
instrument we develop to measure a particular concept is indeed measuring that concept, and not
something else. This ensures that, in operationally defining perceptual and attitudinal variables, we
have neither overlooked important dimensions and elements nor included irrelevant ones. The scales
developed are often imperfect, and errors are prone to occur in the measurement of attitudinal
variables. The use of better instruments ensures more accurate results, which in turn enhances the
scientific quality of the research. Hence, in some way, we need to assess the “goodness” of the
measure developed.

What should be the characteristics of a good measurement? An intuitive answer to this question is
that the tool should be an accurate indicator of what we are interested in measuring. In addition, it
should be easy and efficient to use. There are three major criteria for evaluating a measurement tool:
validity, reliability, and sensitivity.

Validity
Validity is the ability of an instrument (for example, one measuring an attitude) to measure what it is
supposed to measure. That is, when we ask a set of questions (i.e. develop a measuring instrument)
with the hope of tapping the concept, how can we be reasonably certain that we are indeed measuring
the concept we set out to measure and not something else? There is no quick answer.

Researchers have attempted to assess validity in different ways, including asking questions such as “Is
there consensus among my colleagues that my attitude scale measures what it is supposed to
measure?” and “Does my measure correlate with others’ measures of the ‘same’ concept?” and “Does
the behavior expected from my measure predict the actual observed behavior?” Researchers expect
the answers to provide some evidence of a measure’s validity.

What is relevant depends on the nature of the research problem and the researcher’s judgment. One
way to approach this question is to organize the answer according to measure-relevant types of
validity. One widely accepted classification consists of three major types of validity: (1) content
validity, (2) criterion-related validity, and (3) construct validity.

(1) Content Validity: How well does the instrument cover all the aspects of the concept it is
supposed to measure?

The content validity of a measuring instrument (the composite of measurement scales) is the extent to
which it provides adequate coverage of the investigative questions guiding the study. If the
instrument contains a representative sample of the universe of subject matter of interest, then the
content validity is good. To evaluate the content validity of an instrument, one must first agree on
what dimensions and elements constitute adequate coverage. To put it differently, content validity is a
function of how well the dimensions and elements of a concept have been delineated. Consider the
concept of feminism, which implies a person’s commitment to a set of beliefs promoting full equality
between men and women in the arts, intellectual pursuits, family, work, politics, and authority
relations. Does this definition provide adequate coverage of the different dimensions of the concept?
Now suppose we have the following two questions to measure feminism:

1. Should men and women get equal pay for equal work?
2. Should men and women share household tasks?
These two questions do not provide coverage of all the dimensions delineated earlier, so the measure
definitely falls short of adequate content validity for measuring feminism.

A panel of persons can attest to the content validity of the instrument by judging how well it meets
the standard. For a performance test, the panel independently assesses the test items, judging each
item to be essential, useful but not essential, or not necessary for assessing performance of the
relevant behavior.

Face validity is considered a basic and very minimum index of content validity. Face validity
indicates that the items intended to measure a concept do, on the face of it, look like they measure the
concept. For example, few people would accept a measure of college students’ math ability that used
the question 2 + 2 = ?; on the face of it, this is not a valid measure of college-level math ability.
Face validity is thus a subjective agreement among professionals that a scale logically appears to
reflect accurately what it is supposed to measure. A measure has face validity when it appears evident
to experts that it provides adequate coverage of the concept.

(2) Criterion-Related Validity

Criterion validity uses some standard or criterion to indicate a construct accurately. The validity of an
indicator is verified by comparing it with another measure of the same construct in which the
researcher has confidence. There are two subtypes of this kind of validity.

Concurrent validity: To have concurrent validity, an indicator must be associated with a preexisting
indicator that is judged to be valid. For example, suppose we create a new test to measure intelligence.
For it to be concurrently valid, it should be highly associated with existing IQ tests (assuming the same
definition of intelligence is used). This means that most people who score high on the old measure
should also score high on the new one, and vice versa. The two measures may not be perfectly
associated, but if they measure the same or a similar construct, it is logical for them to yield similar
results.
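
To illustrate, concurrent validity is often summarized by correlating scores on the new test with
scores on the established test for the same respondents. The following is a minimal sketch in Python;
the variable names and scores are hypothetical, not data from the lesson.

import numpy as np

# Hypothetical scores for the same ten respondents on an established
# IQ test and on the new intelligence test.
old_iq = np.array([95, 110, 102, 88, 120, 105, 98, 115, 90, 108])
new_test = np.array([47, 58, 51, 40, 63, 54, 49, 60, 43, 55])

# Pearson correlation between the two measures; a high positive value
# is evidence of concurrent validity.
r = np.corrcoef(old_iq, new_test)[0, 1]
print(f"Concurrent validity (correlation with existing IQ test): r = {r:.2f}")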

Predictive validity: Criterion validity whereby an indicator predicts future events that are logically
related to a construct is called predictive validity. It cannot be used for all measures. The measure
and the action predicted must be distinct from but indicate the same construct. Predictive
measurement validity should not be confused with prediction in hypothesis testing, where one
variable predicts a different variable in the future. Consider the scholastic assessment tests given to
candidates seeking admission in different subjects. These are supposed to measure the scholastic
aptitude of the candidates, that is, the ability to perform in the institution as well as in the chosen
subject. If this test has high predictive validity, then candidates who get high test scores will
subsequently do well in their subjects. If students with high scores perform the same as students with
average or low scores, then the test has low predictive validity.
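
As a rough sketch, predictive validity can be checked by correlating test scores with the later outcome
they are meant to predict. The admission scores and first-year grades below are hypothetical.

import numpy as np

# Hypothetical admission test scores and later first-year grade averages
# for the same candidates, collected some time apart.
test_scores = np.array([72, 85, 60, 90, 55, 78, 66, 82])
first_year_gpa = np.array([2.8, 3.5, 2.4, 3.7, 2.1, 3.1, 2.6, 3.4])

# A strong positive correlation indicates high predictive validity:
# candidates with high test scores go on to perform well.
r = np.corrcoef(test_scores, first_year_gpa)[0, 1]
print(f"Predictive validity: r = {r:.2f}")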

(3) Construct Validity

Construct validity is for measures with multiple indicators. It addresses the question: If the measure is
valid, do the various indicators operate in a consistent manner? It requires a definition with clearly
specified conceptual boundaries. In order to evaluate construct validity, we consider both the theory
and the measuring instrument being used. Construct validity is assessed through convergent validity
and discriminant validity.

Convergent Validity: This kind of validity applies when multiple indicators converge or are
associated with one another. Convergent validity means that multiple measures of the same construct
hang together or operate in similar ways. For example, we measure the construct “education” by
asking people how much education they have completed, looking at their institutional records, and
asking them to complete a test of school-level knowledge. If the measures do not converge (i.e. people
who claim to have a college degree have no record of attending college, or those with a college degree
perform no better than high school dropouts on the test), then our measure has weak convergent
validity and we should not combine all three indicators into one measure.
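
In practice, convergence among indicators is often inspected through their correlation matrix. A
minimal sketch with hypothetical data for the three education indicators mentioned above:

import numpy as np

# Hypothetical data for eight respondents on three indicators of
# education: self-reported years completed, years in institutional
# records, and score on a school-level knowledge test.
self_report = np.array([12, 16, 10, 14, 16, 12, 18, 11])
records = np.array([12, 15, 10, 14, 16, 11, 18, 10])
test_score = np.array([55, 78, 42, 66, 80, 50, 88, 45])

# Correlation matrix of the three indicators; uniformly high positive
# correlations indicate convergent validity.
indicators = np.vstack([self_report, records, test_score])
print(np.round(np.corrcoef(indicators), 2))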

Discriminant Validity: Also called divergent validity, discriminant validity is the opposite of
convergent validity. It means that the indicators of one construct hang together or converge, but also
diverge or are negatively associated with opposing constructs. It says that if two constructs A and B
are very different, then measures of A and B should not be associated. For example, suppose we have
10 items that measure political conservatism, and people answer all 10 in similar ways. But we have
also put 5 questions in the same questionnaire that measure political liberalism. Our measure of
conservatism has discriminant validity if the 10 conservatism items hang together and are negatively
associated with the 5 liberalism ones.
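
A quick way to check this is to correlate the totals of the two opposing scales; discriminant validity is
supported when the association is negative (or absent). The totals below are hypothetical.

import numpy as np

# Hypothetical scale totals for eight respondents: sum of the 10
# conservatism items and sum of the 5 liberalism items.
conservatism_total = np.array([42, 18, 35, 27, 46, 22, 39, 30])
liberalism_total = np.array([8, 22, 11, 16, 6, 20, 9, 14])

# Discriminant validity is supported when the two opposing scales are
# negatively associated (here the correlation should be strongly negative).
r = np.corrcoef(conservatism_total, liberalism_total)[0, 1]
print(f"Conservatism vs. liberalism: r = {r:.2f}")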

Reliability
The reliability of a measure indicates the extent to which it is without bias (error free) and hence
ensures consistent measurement across time and across the various items in the instrument. In other
words, the reliability of a measure is an indication of the stability and consistency with which the
instrument measures the concept and helps to assess the “goodness” of a measure.

Stability of Measures

The ability of the measure to remain the same over time – despite uncontrollable testing conditions or
the state of the respondents themselves – is indicative of its stability and low vulnerability to changes
in the situation. This attests to its “goodness” because the concept is stably measured, no matter when
it is done. Two tests of stability are test-retest reliability and parallel-form reliability.

(1) Test-retest Reliability: The test-retest method of determining reliability involves administering the
same scale to the same respondents at two separate times to test for stability. If the measure is stable
over time, the test, administered under the same conditions each time, should obtain similar results.
For example, suppose a researcher measures job satisfaction and finds that 64 percent of the
population is satisfied with their jobs. If the study is repeated a few weeks later under similar
conditions, and the researcher again finds that 64 percent of the population is satisfied with their jobs,
it appears that the measure has repeatability. A high correlation, or consistency, between the measures
at time 1 and time 2 indicates a high degree of reliability. This was at the aggregate level; the same
exercise can be applied at the individual level. When the measuring instrument produces
unpredictable results from one testing to the next, the results are said to be unreliable because of error
in measurement.
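
At the individual level, test-retest reliability can be summarized by correlating each respondent’s
scores from the two administrations. A minimal sketch with hypothetical job-satisfaction scores:

import numpy as np

# Hypothetical job-satisfaction scores for the same six respondents,
# measured a few weeks apart under similar conditions.
time_1 = np.array([4.2, 3.1, 4.8, 2.5, 3.9, 4.5])
time_2 = np.array([4.0, 3.3, 4.7, 2.7, 3.8, 4.6])

# Test-retest reliability: the correlation between the two administrations.
# Values near 1 indicate a stable measure.
r = np.corrcoef(time_1, time_2)[0, 1]
print(f"Test-retest reliability: r = {r:.2f}")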

There are two problems with measures of test-retest reliability that are common to all longitudinal
studies. Firstly, the first measure may sensitize the respondents to their participation in a research
project and subsequently influence the results of the second measure. Further, if the time between the
measures is long, there may be attitude change or other maturation of the subjects. Thus it is possible
for a reliable measure to indicate low or moderate correlation between the first and the second
administration, but this low correlation may be due to an attitude change over time rather than to lack
of reliability.

(2) Parallel-Form Reliability: When responses on two comparable sets of measures tapping the same
construct are highly correlated, we have parallel-form reliability. It is also called equivalent-form
reliability. Both forms have similar items and the same response format, the only changes being the
wording and the order or sequence of the questions. What we try to establish here is the error
variability resulting from the wording and ordering of the questions. If two such comparable forms are
highly correlated, we may be fairly certain that the measures are reasonably reliable, with minimal
error variance caused by wording, ordering, or other factors.

Internal Consistency of Measures

Internal consistency of measures is indicative of the homogeneity of the items in the measure that tap
the construct. In other words, the items should ‘hang together as a set,’ and be capable of
independently measuring the same concept so that the respondents attach the same overall meaning to
each of the items. This can be seen by examining whether the items and the subsets of items in the
measuring instrument are highly correlated. Consistency can be examined through the inter-item
consistency reliability and split-half reliability.

(1) Inter-item Consistency Reliability: This is a test of the consistency of respondents’ answers to all the
items in a measure. To the degree that items are independent measures of the same concept, they will
be correlated with one another.
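
The lesson does not name a statistic for this, but a widely used index of inter-item consistency is
Cronbach’s alpha. A minimal sketch with hypothetical Likert responses:

import numpy as np

# Hypothetical responses (1-5 Likert) from eight people to four items
# intended to measure the same concept (rows = respondents, columns = items).
items = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
    [1, 2, 1, 2],
    [5, 5, 4, 5],
    [3, 2, 3, 3],
])

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals).
k = items.shape[1]
item_variances = items.var(axis=0, ddof=1)
total_variance = items.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(f"Cronbach's alpha: {alpha:.2f}")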

(2) Split-Half Reliability: Split-half reliability reflects the correlation between two halves of an
instrument. The estimates could vary depending on how the items in the measure are split into two
halves. The technique of splitting halves is the most basic method for checking internal consistency
when measures contain a large number of items. In the split-half method the researcher may take the
results obtained from one half of the scale items (e.g. odd-numbered items) and check them against
the results from the other half of the items (e.g. even-numbered items). A high correlation tells us
there is similarity (or homogeneity) among the items.
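
A minimal sketch of the split-half check, reusing the hypothetical item matrix from the previous
example; the Spearman-Brown correction at the end is not named in the lesson but is the usual way to
adjust the half-length correlation.

import numpy as np

# Hypothetical item responses (rows = respondents, columns = items).
items = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
    [1, 2, 1, 2],
    [5, 5, 4, 5],
    [3, 2, 3, 3],
])

# Split the items into odd-numbered and even-numbered halves and sum each.
odd_half = items[:, 0::2].sum(axis=1)
even_half = items[:, 1::2].sum(axis=1)

# Correlation between the two halves; a high value suggests homogeneity.
r_half = np.corrcoef(odd_half, even_half)[0, 1]

# Spearman-Brown correction: estimates full-length reliability from the
# half-length correlation (an assumption beyond the lesson text).
r_full = 2 * r_half / (1 + r_half)
print(f"Split-half r = {r_half:.2f}, Spearman-Brown corrected = {r_full:.2f}")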

It is important to note that reliability is a necessary but not sufficient condition for the goodness of a
measure. For example, one could reliably measure a concept, establishing high stability and
consistency, but it may not be the concept that one had set out to measure. Validity ensures the ability
of a scale to measure the intended concept.

Sensitivity: The sensitivity of a scale is an important measurement concept, particularly when
changes in attitudes or other hypothetical constructs are under investigation. Sensitivity refers to an
instrument’s ability to accurately measure variability in stimuli or responses. A dichotomous
response category, such as “agree or disagree,” does not allow the recording of subtle attitude
changes. A more sensitive measure, with numerous items on the scale, may be needed. For example
adding “strongly agree,” “mildly agree,” “neither agree nor disagree,” “mildly disagree,” and
“strongly disagree” as categories increases a scale’s sensitivity.

The sensitivity of a scale based on a single question or single item can also be increased by adding
more questions or items. In other words, because index measures allow for a greater range of possible
scores, they are more sensitive than a single-item measure.

Practicality: The scientific requirements of a project call for the measurement process to be reliable
and valid, while the operational requirements call for it to be practical. Practicality has been defined
as economy, convenience, and interpretability.
