Your opinion is not necessarily the reality of others
By Prof Dr Sohail Ansari. The worst blindness is straying from the straight path
after receiving true guidance.
The side one is on is not the only side
·
You cannot hold a view of your own unless you take a side; however, you must know that the side you are on
is not the only side.
·
Don't let someone else's opinion of you become
your reality. Les Brown (Leslie Calvin "Les" Brown is an American motivational speaker)
·
It is from a weakness and smallness of mind
that men are opinionated; and we are very loath to believe what we are not able
to comprehend. Francois de La
Rochefoucauld
·
Too often we... enjoy the comfort of opinion
without the discomfort of thought. John
F. Kennedy
Instrument, Validity, Reliability
(A generic term is a word or phrase used to describe some general or
vague group or class, rather than some specific thing.)
Part I: The Instrument
One of the most important components of
a research design is the research instrument, because instruments gather or collect
the data or information.
Instrument is the generic term that researchers use for a measurement
device (survey, test, questionnaire, etc.). To help distinguish between instrument and
instrumentation, consider that the instrument is
the device and instrumentation
is the course of action (the process of developing, testing, and
using the device).
Instruments fall into two broad categories, researcher-completed
and subject-completed, distinguished by whether the researcher administers the instrument or
the participants complete it themselves. Researchers choose which type of
instrument, or instruments, to use based on the research question. Examples are
listed below:
Researcher-completed Instruments | Subject-completed Instruments
Rating scales | Questionnaires
Interview schedules/guides | Self-checklists
Tally sheets | Attitude scales
Flowcharts | Personality inventories
Performance checklists | Achievement/aptitude tests
Time-and-motion logs | Projective devices
Observation forms | Sociometric devices
Usability
Usability refers to the ease with which an
instrument can be administered, interpreted by the participant, and
scored/interpreted by the researcher. Example usability problems include:
1. Students are asked to rate a lesson immediately after class, but there are only a few minutes before the next class begins (problem with administration).
2. Students are asked to keep self-checklists of their after-school activities, but the directions are complicated and the item descriptions confusing (problem with interpretation).
3. Teachers are asked about their attitudes regarding school policy, but some questions are worded poorly, which results in low completion rates (problem with scoring/interpretation).
Attending to validity and reliability concerns (discussed below) will help alleviate usability
issues. For now, we can identify five usability considerations:
1. How long will it take to administer?
2. Are the directions clear?
3. How easy is it to score?
4. Do equivalent forms exist?
5. Have any problems been reported by others who used it?
It is best to use an existing instrument, one that has been
developed and tested numerous times, such as can be found in the Mental
Measurements Yearbook. We will turn to why next.
Part II: Validity
Validity is the extent to which an
instrument measures what it is supposed to measure and performs as it is
designed to perform. It is rare, if not impossible, for an instrument to be 100% valid, so
validity is generally measured in degrees. As a process, validation
involves collecting and analyzing data to assess the accuracy of an instrument.
There
are numerous statistical tests and measures for assessing the validity of
quantitative instruments, a process that generally involves pilot testing. The
remainder of this discussion focuses on external validity and content validity.
External
validity is the extent to which the results of a study can be generalized from a sample
to a population. Establishing external validity for an instrument, then, follows
directly from sampling. Recall that a sample should be an accurate
representation of a population, because the total population may not be
available. An instrument that is externally valid helps obtain population
generalizability, or the degree to which a sample represents the population.
External validity is the
extent to which results of a study can be generalized to the world at large.
Because the goal of research is to tell us about the world, external validity is a very important part
of designing a study. ... This tension between control and generalization
(discussed below) is also known as the tension between internal and external validity.
External Validity
Sarah is a psychologist who teaches and does research at an
expensive, private college. She's interested in studying whether offering
specific praise after a task will boost people's self-esteem. If her
hypothesis is correct, then giving someone a
specific compliment on a job well done after a task will make them feel better
about themselves. And, if she can show that specific praise post-task boosts
self-esteem, then managers at companies everywhere will be able to boost their
employees' self-esteem by offering them specific praise.
But, here's a problem: the volunteers that Sarah gets for her
study are all college students, most of them are white, and most of them are
from privileged backgrounds. Sarah worries that her results might not
be applicable to people who are not in their late teens or early 20s, white,
and rich.
External validity is the extent to
which results of a study can be generalized to the world at large. Sarah is
worried that her study might have low external validity. Let's
look closer at external validity, including why it's important and the balance
between control and generalization that is required for external validity.
Importance
Let's go back to Sarah's concerns about her study for a moment.
She's worried because all of her subjects are similar to each other, and
they represent only a very small portion of the population. If Sarah's study
shows, for example, that specific praise after a task boosts self-esteem, what does
that mean for the real world?
If all of Sarah's subjects are young, white, and upper-class,
can she know for sure that this specific praise after a task will boost the
self-esteem of an older, minority, lower-class worker?
The goal of research is to generalize to the world at large.
Perhaps, like Sarah, the goal is to generalize to the population as a whole,
based on an experiment done on a small sample of the population. Or, perhaps
the goal is to generalize from a task done in a lab to a real-world setting,
like an office or a school. Either way, the goal is to make
inferences about the way things work in the real world based on the results of a
study.
But, without external validity, a researcher cannot make those
inferences. If external validity is low on a study, the results won't translate
well to other conditions. That means that the research done doesn't tell us
anything about the world outside of the study. That's a very limited
viewpoint!
Control vs. Generalization
OK, you might be thinking, so just make sure that every study
has a whole lot of external validity. What's the big deal?
Well, that's easier said than done. There's a balance in
research between control and generalization. Essentially, the
problem is this: in a lab study, the researcher has high control over the
variables, but external validity is low. That is, the results aren't always
applicable to the real world.
The other option is to do research in the field - that is, to
conduct research in the real world. Sarah, for example, could go to
an office or a factory and do her experiment there with real workers and
managers. Then she'd have very high external validity, but she would give up
much of her control over the other variables.
Internal Validity
Sean works for a large corporation, and they've hired someone to
figure out if more money will mean more productivity for their
workforce. In other words, they want to know: if they pay Sean a higher
salary, will he work more?
At first glance, the answer appears to be yes. After
all, the people who get paid the most at the company tend to be the ones that
come in early and stay late. They are the hardest working people in the
company. So, it stands to reason that the more a person gets paid, the harder
they will work, right?
Maybe, but it's actually a bit more complicated than
that. Maybe those people get paid the most because they were already hard workers. Maybe
they're motivated to work hard because they really like what they do and the
pay is incidental. Maybe they are hyper competitive and don't want to be the
first to leave the office.
How do we know what the cause of their hard work is? In
research, internal validity is the
extent to which you are able to say that no other variables except the one
you're studying caused the result. For example, if we are studying the variable
of pay and the result of hard work, we want to be able to say that no other
reason (not personality, not motivation, not competition) causes
the hard work. We want to say that pay and pay alone makes people like Sean
work harder.
Importance
You may be wondering why we should care
about internal validity. If people who work the hardest get paid the most, then
why not just say that's what happens and call it a day?
The purpose of most research is to study how one
thing (called the independent variable) affects another (called the dependent
variable). The strongest statement in research is one of causality. That
is, if we can say that the independent variable causes the dependent variable,
we have made the strongest statement there is in research.
But, that's not possible if an experiment has low
internal validity. Remember our example from above? How do we know that pay causes
harder work if there are other possibilities, like competition or motivation?
The answer is that we don't. That's why internal validity is so important.
The best experiments are designed to try to eliminate the
possibility that anything other than the independent variable caused the
changes in the dependent variable. In our experiment, we would try to
eliminate all other things that might be causing the hard work by the
workers. If we can do that, then we can show that higher pay
causes harder work.
Threats
But, designing a study that allows you to prove causality isn't
as easy as it might seem. That's because there are several common threats to
internal validity. These are things that make it difficult to prove that the
independent variable is causing the changes in the dependent variable.
One threat to internal validity is selection. This
is simply the fact that the people who are studied may not be typical. Do the
people at Sean's company who get paid the most work hard because they are paid
a lot, or do they get paid a lot because they are inherently hard
workers? By studying them, we might be studying only people who already work
hard; we have accidentally selected people whose experience does not mirror everyone
else's.
Another threat to internal validity is maturation. How do we
know that people wouldn't change during the study because they matured
instead of because of the effect of the independent variable? For example,
imagine that we look at Sean's productivity before and after he got a
raise and figure out that he is more productive after the raise.
But, what if he became a harder worker because he is aging
and becoming more responsible? What if he became more productive because
he's had more time at his job and has learned how to do it better? We
don't know if one of these is the reason or if the raise is the reason.
Likewise, if a one-time historical event happens that affects
Sean's productivity, it's the threat of history. Maybe Sean's wife had a
baby around the time he got a raise; being a dad has made him more responsible
and a harder worker.
Content
validity refers to the appropriateness of the content
of an instrument. In other words, do the measures (questions,
observation logs, etc.) accurately assess what you want to know? This is particularly
important with achievement tests. Consider that a test developer wants to
maximize the validity of a unit test for 7th grade mathematics. This would
involve taking representative questions from each of the sections of the unit and
evaluating them against the desired outcomes.
Content Validity Definition
When it comes to
developing measurement tools such as
intelligence tests, surveys, and self-report assessments, validity is important. A variety of types of
validity exist, each designed to ensure that specific aspects of a measurement tool accurately measure what
it is intended to measure
and that the results can be applied to real-world settings.
Before we move into
discussing content validity, it is important to understand that validity is a broad concept that encompasses many aspects of assessment. For example, face validity describes the degree to which an assessment measures
what it appears to measure, concurrent validity measures how well the results of
one assessment correlate with other assessments designed to measure the same
thing, and predictive validity measures how well the assessment results can
predict a relationship between the construct being measured and future behavior.
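To make the concurrent-validity idea concrete, here is a minimal sketch in Python. The scale names and scores are hypothetical, invented for illustration; the article itself does not supply any data.

```python
import numpy as np

# Hypothetical concurrent-validity check: scores from a new self-report
# scale and an established scale, collected from the same 6 respondents.
new_scale = np.array([12, 18, 9, 22, 15, 20])
established_scale = np.array([30, 44, 25, 52, 38, 47])

# Concurrent validity is commonly reported as the Pearson correlation
# between the two sets of scores: values near 1 mean the new tool
# tracks the established one closely.
r = np.corrcoef(new_scale, established_scale)[0, 1]
print("Concurrent validity coefficient:", round(r, 2))
```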
So, what about content
validity? Content validity refers to how accurately an
assessment or measurement tool taps into the various aspects of the specific
construct in question. In other words, do the questions really assess the construct
in question, or are the responses by the person answering the questions
influenced by other factors?
Content Validity Measurement
So how is content
validity measured? How do researchers know if an assessment has content
validity?
Content validity is most
often measured by relying on the knowledge of people who are familiar with the
construct being measured. These subject-matter experts are usually provided
with access to the measurement tool and are asked to provide feedback on how
well each question measures the construct in question. Their feedback is then analyzed, and informed decisions can be
made about the effectiveness of each question.
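That analysis of expert feedback can be quantified. One widely used index, assumed here for illustration since the article does not name a specific method, is Lawshe's content validity ratio (CVR), which compares, for each question, the number of panelists rating it "essential" against the size of the panel. A minimal sketch in Python, with hypothetical vote counts:

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR = (n_e - N/2) / (N/2), ranging from -1 to +1.

    n_essential: number of experts who rated the item 'essential'
    n_experts:   total number of experts on the panel
    """
    half = n_experts / 2
    return (n_essential - half) / half

# Hypothetical panel of 10 subject-matter experts rating three items.
essential_votes = {"item_1": 9, "item_2": 7, "item_3": 4}
for item, n_e in essential_votes.items():
    print(item, round(content_validity_ratio(n_e, 10), 2))
# item_1 0.8, item_2 0.4, item_3 -0.2: the negative CVR flags
# item_3 for revision or removal.
```

A question endorsed by most of the panel yields a CVR near +1, while one endorsed by fewer than half yields a negative value; this is the kind of per-question evidence developers use to alter or eliminate questions, as in the depression example below.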
Examples
To better illustrate the
significance of content validity, let's look at two examples. One example explains how content validity can be
helpful in a clinical setting, and the
other in a business setting. Assessment and measurement tools like surveys and
questionnaires are quite common in the social and behavioral sciences. Content
validity is a critical aspect of developing tools that can help practitioners
understand and treat behavioral and mental health conditions.
For example, if a particular assessment tool is designed to
measure the severity of symptoms of clinical depression, a group of psychiatrists would evaluate
each question and provide an
opinion or rating on how well the wording of each question taps into measuring
the severity of depression symptoms.
The independent ratings of each subject-matter expert are then
compared and analyzed to determine the degree of content validity that exists for each question. The assessment developers can
then use that information to make alterations to the questions in order to develop an assessment tool which
yields the highest degree of content validity possible. Perhaps the subject
matter experts report that one of the questions is really tapping into aspects of an anxiety disorder. While anxiety and
depression are linked and have some crossover in terms of symptomatology, the
assessment in question is interested in measuring depression, so that question would either be altered or
eliminated altogether.
Here is another example
that you might be more familiar with. If you have ever purchased anything
online, you have probably also received a follow-up email asking you all about your buying experience. Retailers
want to make sure they provide
good customer service, and one way is to go directly to the customers and ask
them how things went by surveying them about their experience.
Part III: Reliability
Reliability can be thought of as
consistency. Does the instrument consistently measure what it is intended to
measure? It is not possible to calculate reliability exactly; however, there are four
general estimators that you may encounter in reading research:
1. Inter-Rater/Observer Reliability: The degree to which different raters/observers give consistent answers or estimates.
2. Test-Retest Reliability: The consistency of a measure evaluated over time.
3. Parallel-Forms Reliability: The reliability of two tests constructed the same way, from the same content.
4. Internal Consistency Reliability: The consistency of results across items, often measured with Cronbach's Alpha (see the sketch after this list).
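To make the second and fourth estimators concrete, here is a minimal sketch in Python. The respondents, item scores, and retest totals are hypothetical; the Cronbach's Alpha formula itself is the standard one based on item and total-score variances.

```python
import numpy as np

def test_retest_reliability(scores_t1, scores_t2):
    """Pearson correlation between the same measure given at two times."""
    return np.corrcoef(scores_t1, scores_t2)[0, 1]

def cronbach_alpha(items):
    """Cronbach's Alpha for an (n_respondents x n_items) score matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # sample variance per item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 5 respondents answering a 4-item attitude scale,
# with total scores from a retest two weeks later.
time1 = np.array([[4, 5, 4, 4],
                  [2, 3, 2, 3],
                  [5, 5, 4, 5],
                  [3, 3, 3, 2],
                  [4, 4, 5, 4]])
retest_totals = np.array([17, 10, 19, 11, 16])

print("Cronbach's alpha:", round(cronbach_alpha(time1), 2))
print("Test-retest r:", round(test_retest_reliability(time1.sum(axis=1), retest_totals), 2))
```

Both statistics run from consistency near 1 down toward 0; by convention an alpha of about 0.7 or above is often treated as acceptable, though the appropriate threshold depends on the stakes of the measurement.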
Relating Reliability and Validity
Reliability is directly related to the validity of the measure.
There are several important principles. First, a test can be reliable
but not valid. Consider the SAT, used as a predictor of success in college. It is a
reliable test (scores are consistent across administrations), though only a
moderately valid indicator of success (because college lacks the structured
environment – class attendance, parent-regulated study, and sleeping habits –
each of which bears on success).
Second, validity is more important than reliability. Using
the above example, college admissions may consider the SAT a reliable test,
but not necessarily a valid measure of other qualities colleges seek,
such as leadership capability, altruism, and civic involvement. The combination
of these aspects, alongside the SAT, is a more valid measure of the applicant’s
potential for graduation, later social involvement, and generosity (alumni
giving) toward the alma mater.
Finally, the most useful instrument is both valid and reliable. Proponents of
the SAT argue that it is both. It is a moderately reliable predictor of
future success and a moderately valid measure of a student’s knowledge in
Mathematics, Critical Reading, and Writing.
Part IV: Validity and Reliability in Qualitative Research
Thus far, we have discussed instrumentation as related to mostly
quantitative measurement. Establishing validity and reliability in qualitative research
can be less precise, though participant/member checks, peer evaluation (another
researcher checks the researcher’s inferences based on the instrument), and
multiple methods (keyword: triangulation) are
convincingly used. Some qualitative researchers reject the concept of validity
due to the constructivist viewpoint that reality is unique to the individual
and cannot be generalized. These researchers argue for a different standard for judging
research quality, often discussed under the heading of trustworthiness.
The research instruments
These research instruments or tools are ways of gathering data.
Without them, data would be impossible to obtain.
QUESTIONNAIRE
The most common research instrument for obtaining data beyond the physical reach of the observer; it may, for example, be sent to respondents who are thousands of miles away or just around the corner.
INTERVIEW
It is, in a sense, an oral questionnaire. Instead of writing the response, the interviewee gives the needed information orally and face-to-face. With a skillful interviewer, the interview is often superior to other data-gathering devices.
TAPE RECORDED DATA
Observing through the ear as well as through the eye; a video tape recorder or radio cassette recorder may also be used.
OBSERVATION
Perceiving data through the senses: sight, hearing, taste, touch, and smell.
It is the most direct way of studying individual behavior.
PSYCHOLOGICAL TESTS
An instrument designed to describe and measure a sample of certain aspects of human behavior.
Examples include performance tests, paper-and-pencil tests, achievement inventories, personality inventories, and projective devices.
The methods of data collection vary according to the research instruments used.
Massey states that "instrument development requires a high degree of research expertise, as the instrument must be reliable and valid."
The type of instrument used by the researcher depends on the data collection method selected. Instruments facilitate the observation and measurement of variables; an instrument can be described as a device used to collect the data.
Guidelines for Developing an Instrument
The instrument must be based on the theoretical framework selected for the study.
The research tool will only be effective as it relates to its particular purpose.
The instrument must be suitable for its function.
The instrument should include items that directly address the hypothesis.
The content of the instrument must be appropriate to test the hypothesis or answer the question being studied.
The researcher may need to read extensively to identify which aspects of the theory are appropriate for investigation.
The instrument should not contain measures that function as hints for desired responses.
A good instrument is free of built-in clues.
The instrument should be free of bias.
The researcher, through the instrument, must be able to gather data that are appropriate to test the hypothesis or to answer the question under investigation.