Your opinion is not necessarily the reality of others
By Prof Dr Sohail Ansari. The worst blindness is straying from the straight path
after receiving true guidance.
The side one is on is not the only side
·
You cannot hold a view of your own unless you take a side; however, you must know that the side you are on
is not the only side.
·
Don't let someone else's opinion of you become
your reality. Les Brown (Leslie Calvin "Les" Brown is an American motivational speaker)
·
It is from a weakness and smallness of mind
that men are opinionated; and we are very loath to believe what we are not able
to comprehend. Francois de La
Rochefoucauld
·
Too often we... enjoy the comfort of opinion
without the discomfort of thought. John
F. Kennedy
Instrument, Validity, Reliability
(A generic term is a word or phrase used to describe some general or
vague group or class, rather than some specific thing.)
Part I: The Instrument
One of the most important components of
a research design is the research instrument, because instruments gather or collect
the data or information.
Instrument is the generic term that researchers use for a measurement
device (survey, test, questionnaire, etc.). To help distinguish between instrument and
instrumentation, consider that the instrument is
the device and instrumentation
is the course of action (the process of developing, testing, and
using the device).
Instruments fall into two broad categories, researcher-completed
and subject-completed, distinguished by whether the researcher administers the instrument or
the participants complete it themselves. Researchers choose which type of
instrument, or instruments, to use based on the research question. Examples are
listed below:
Researcher-completed Instruments | Subject-completed Instruments
Rating scales | Questionnaires
Interview schedules/guides | Self-checklists
Tally sheets | Attitude scales
Flowcharts | Personality inventories
Performance checklists | Achievement/aptitude tests
Time-and-motion logs | Projective devices
Observation forms | Sociometric devices
Usability
Usability refers to the ease with which an
instrument can be administered, interpreted by the participant, and
scored/interpreted by the researcher. Example usability problems include:
1. Students are asked to rate a lesson immediately after class, but there are only a few minutes before the next class begins (problem with administration).
2. Students are asked to keep self-checklists of their after-school activities, but the directions are complicated and the item descriptions confusing (problem with interpretation).
3. Teachers are asked about their attitudes regarding school policy, but some questions are worded poorly, which results in low completion rates (problem with scoring/interpretation).
Attending to validity and reliability concerns (discussed below) will help alleviate usability
issues. For now, we can identify five usability considerations:
1. How long will it take to administer?
2. Are the directions clear?
3. How easy is it to score?
4. Do equivalent forms exist?
5. Have any problems been reported by others who used it?
It is best to use an existing instrument, one that has been
developed and tested numerous times, such as can be found in the Mental
Measurements Yearbook. We will turn to why next.
Part II: Validity
Validity is the extent to which an
instrument measures what it is supposed to measure and performs as it is
designed to perform. It is rare, if not impossible, for an instrument to be 100% valid, so
validity is generally measured in degrees. As a process, validation
involves collecting and analyzing data to assess the accuracy of an instrument.
There
are numerous statistical tests and measures for assessing the validity of
quantitative instruments, a process that generally involves pilot testing. The
remainder of this discussion focuses on external validity and content validity.
External
validity is the extent to which the results of a study can be generalized from a sample
to a population. Establishing external validity for an instrument, then, follows
directly from sampling. Recall that a sample should be an accurate
representation of a population, because the total population may not be
available. An instrument that is externally valid helps obtain population
generalizability, or the degree to which a sample represents the population.
External validity is the
extent to which results of a study can be generalized to the world at large.
Because the goal of research is to tell us about the world, external validity is a very important part
of designing a study. ... This tension between control and generalization
(discussed below) is also known as the tension between internal and external validity.
External Validity
Sarah is a psychologist who teaches and does research at an
expensive, private college. She's interested in studying whether offering
specific praise after a task will boost people's self-esteem. If her
hypothesis is correct, then giving someone a
specific compliment on a job well done after a task will make them feel better
about themselves. And, if she can show that specific praise post-task boosts
self-esteem, then managers at companies everywhere will be able to boost their
employees' self-esteem by offering them specific praise.
But, here's a problem: the volunteers that Sarah gets for her
study are all college students, most of them are white, and most of them are
from privileged backgrounds. Sarah worries that her results might not
be applicable to people who are not in their late teens or early 20s, white,
and rich.
External validity is the extent to
which results of a study can be generalized to the world at large. Sarah is
worried that her study might have low external validity. Let's
look closer at external validity, including why it's important and the balance
between control and generalization that is required for external validity.
Importance
Let's go back to Sarah's concerns about her study for a moment.
She's worried because all of her subjects are similar to each other, and
they represent only a very small portion of the population. If Sarah's study
shows, for example, that specific praise after a task boosts self-esteem, what does
that mean for the real world?
If all of Sarah's subjects are young, white, and upper-class,
can she know for sure that this specific praise after a task will boost the
self-esteem of an older, minority, lower-class worker?
The goal of research is to generalize to the world at large.
Perhaps, like Sarah, the goal is to generalize to the population as a whole,
based on an experiment done on a small sample of the population. Or, perhaps
the goal is to generalize from a task done in a lab to a real-world setting,
like an office or a school. Either way, the goal is to make
inferences about the way things work in the real world based on the results of a
study.
But, without external validity, a researcher cannot make those
inferences. If external validity is low on a study, the results won't translate
well to other conditions. That means that the research done doesn't tell us
anything about the world outside of the study. That's a very limited
viewpoint!
Control vs. Generalization
OK, you might be thinking, so just make sure that every study
has a whole lot of external validity. What's the big deal?
Well, that's easier said than done. There's a balance in
research between control and generalization. Essentially, the
problem is this: in a lab study, the researcher has high control over the
variables, but external validity is low. That is, the results aren't always
applicable to the real world.
The other option is to do research in the field - that is, to
conduct research in the real world. Sarah, for example, could go to
an office or a factory and do her experiment there with real workers and
managers. Then she'd have very high external validity, but she would give up
much of her control over the other variables.
Internal Validity
Sean works for a large corporation, and they've hired someone to
figure out if more money will mean more productivity for their
workforce. In other words, they want to know: if they pay Sean a higher
salary, will he work more?
At first glance, the answer appears to be yes. After
all, the people who get paid the most at the company tend to be the ones that
come in early and stay late. They are the hardest working people in the
company. So, it stands to reason that the more a person gets paid, the harder
they will work, right?
Maybe, but it's actually a bit more complicated than
that. Maybe those people get paid the most because they were already hard workers. Maybe
they're motivated to work hard because they really like what they do and the
pay is incidental. Maybe they are hyper competitive and don't want to be the
first to leave the office.
How do we know what the cause of their hard work is? In
research, internal validity is the
extent to which you are able to say that no other variables except the one
you're studying caused the result. For example, if we are studying the variable
of pay and the result of hard work, we want to be able to say that no other
reason (not personality, not motivation, not competition) causes
the hard work. We want to say that pay and pay alone makes people like Sean
work harder.
Importance
You may be wondering why we should care
about internal validity. If people who work the hardest get paid the most, then
why not just say that's what happens and call it a day?
The purpose of most research is to study how one
thing (called the independent variable) affects another (called the dependent
variable). The strongest statement in research is one of causality. That
is, if we can say that the independent variable causes the dependent variable,
we have made the strongest statement there is in research.
But, that's not possible if an experiment has low
internal validity. Remember our example from above? How do we know that pay causes
harder work if there are other possibilities, like competition or motivation?
The answer is that we don't. That's why internal validity is so important.
The best experiments are designed to try to eliminate the
possibility that anything other than the independent variable caused the
changes in the dependent variable. In our experiment, we would try to
eliminate all other things that might be causing the hard work by the
workers. If we can do that, then we can show that higher pay
causes harder work.
Threats
But, designing a study that allows you to prove causality isn't
as easy as it might seem. That's because there are several common threats to
internal validity. These are things that make it difficult to prove that the
independent variable is causing the changes in the dependent variable.
One threat to internal validity is selection. This
is simply the fact that the people who are studied may not be typical. Do the
people at Sean's company who get paid the most work hard because they are paid
a lot, or do they get paid a lot because they are inherently hard
workers? By studying them, we might be studying only people who already work
hard; we have accidentally selected people whose experience does not mirror everyone
else's.
Another threat to internal validity is maturation. How do we
know that people wouldn't change during the study because they matured
instead of because of the effect of the independent variable? For example,
imagine that we look at Sean's productivity before and after he got a
raise and figure out that he is more productive after the raise.
But, what if he became a harder worker because he is aging
and becoming more responsible? What if he became more productive because
he's had more time at his job and has learned how to do it better? We
don't know if one of these is the reason or if the raise is the reason.
Likewise, if a one-time historical event happens that affects
Sean's productivity, it's the threat of history. Maybe Sean's wife had a
baby around the time he got a raise; being a dad has made him more responsible
and a harder worker.
Content
validity refers to the appropriateness of the content
of an instrument. In other words, do the measures (questions,
observation logs, etc.) accurately assess what you want to know? This is particularly
important with achievement tests. Consider that a test developer wants to
maximize the validity of a unit test for 7th grade mathematics. This would
involve taking representative questions from each of the sections of the unit and
evaluating them against the desired outcomes.
Content Validity Definition
When it comes to
developing measurement tools such as
intelligence tests, surveys, and self-report assessments, validity is important. A variety of types of
validity exist, each designed to ensure that specific aspects of a measurement tool accurately measure what
it is intended to measure
and that the results can be applied to real-world settings.
Before we move into
discussing content validity, it is important to understand that validity is a broad concept that encompasses many aspects of assessment. For example, face validity describes the degree to which an assessment measures
what it appears to measure, concurrent validity measures how well the results of
one assessment correlate with other assessments designed to measure the same
thing, and predictive validity measures how well the assessment results can
predict a relationship between the construct being measured and future behavior.
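To make the concurrent-validity idea concrete, here is a minimal sketch in Python. The scale names and scores are hypothetical, invented for illustration; the article itself does not supply any data.

```python
import numpy as np

# Hypothetical concurrent-validity check: scores from a new self-report
# scale and an established scale, collected from the same 6 respondents.
new_scale = np.array([12, 18, 9, 22, 15, 20])
established_scale = np.array([30, 44, 25, 52, 38, 47])

# Concurrent validity is commonly reported as the Pearson correlation
# between the two sets of scores: values near 1 mean the new tool
# tracks the established one closely.
r = np.corrcoef(new_scale, established_scale)[0, 1]
print("Concurrent validity coefficient:", round(r, 2))
```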
So, what about content
validity? Content validity refers to how accurately an
assessment or measurement tool taps into the various aspects of the specific
construct in question. In other words, do the questions really assess the construct
in question, or are the responses by the person answering the questions
influenced by other factors?
Content Validity Measurement
So how is content
validity measured? How do researchers know if an assessment has content
validity?
Content validity is most
often measured by relying on the knowledge of people who are familiar with the
construct being measured. These subject-matter experts are usually provided
with access to the measurement tool and are asked to provide feedback on how
well each question measures the construct in question. Their feedback is then analyzed, and informed decisions can be
made about the effectiveness of each question.
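That analysis of expert feedback can be quantified. One widely used index, assumed here for illustration since the article does not name a specific method, is Lawshe's content validity ratio (CVR), which compares, for each question, the number of panelists rating it "essential" against the size of the panel. A minimal sketch in Python, with hypothetical vote counts:

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR = (n_e - N/2) / (N/2), ranging from -1 to +1.

    n_essential: number of experts who rated the item 'essential'
    n_experts:   total number of experts on the panel
    """
    half = n_experts / 2
    return (n_essential - half) / half

# Hypothetical panel of 10 subject-matter experts rating three items.
essential_votes = {"item_1": 9, "item_2": 7, "item_3": 4}
for item, n_e in essential_votes.items():
    print(item, round(content_validity_ratio(n_e, 10), 2))
# item_1 0.8, item_2 0.4, item_3 -0.2: the negative CVR flags
# item_3 for revision or removal.
```

A question endorsed by most of the panel yields a CVR near +1, while one endorsed by fewer than half yields a negative value; this is the kind of per-question evidence developers use to alter or eliminate questions, as in the depression example below.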
Examples
To better illustrate the
significance of content validity, let's look at two examples. One example explains how content validity can be
helpful in a clinical setting, and the
other in a business setting. Assessment and measurement tools like surveys and
questionnaires are quite common in the social and behavioral sciences. Content
validity is a critical aspect of developing tools that can help practitioners
understand and treat behavioral and mental health conditions.
For example, if a particular assessment tool is designed to
measure the severity of symptoms of clinical depression, a group of psychiatrists would evaluate
each question and provide an
opinion or rating on how well the wording of each question taps into measuring
the severity of depression symptoms.
The independent ratings of each subject-matter expert are then
compared and analyzed to determine the degree of content validity that exists for each question. The assessment developers can
then use that information to make alterations to the questions in order to develop an assessment tool which
yields the highest degree of content validity possible. Perhaps the subject
matter experts report that one of the questions is really tapping into aspects of an anxiety disorder. While anxiety and
depression are linked and have some crossover in terms of symptomatology, the
assessment in question is interested in measuring depression, so that question would either be altered or
eliminated altogether.
Here is another example
that you might be more familiar with. If you have ever purchased anything
online, you have probably also received a follow-up email asking you all about your buying experience. Retailers
want to make sure they provide
good customer service, and one way is to go directly to the customers and ask
them how things went by surveying them about their experience.
Part III: Reliability
Reliability can be thought of as
consistency. Does the instrument consistently measure what it is intended to
measure? It is not possible to calculate reliability exactly; however, there are four
general estimators that you may encounter in reading research:
1. Inter-Rater/Observer Reliability: The degree to which different raters/observers give consistent answers or estimates.
2. Test-Retest Reliability: The consistency of a measure evaluated over time.
3. Parallel-Forms Reliability: The reliability of two tests constructed the same way, from the same content.
4. Internal Consistency Reliability: The consistency of results across items, often measured with Cronbach's Alpha (see the sketch after this list).
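To make the second and fourth estimators concrete, here is a minimal sketch in Python. The respondents, item scores, and retest totals are hypothetical; the Cronbach's Alpha formula itself is the standard one based on item and total-score variances.

```python
import numpy as np

def test_retest_reliability(scores_t1, scores_t2):
    """Pearson correlation between the same measure given at two times."""
    return np.corrcoef(scores_t1, scores_t2)[0, 1]

def cronbach_alpha(items):
    """Cronbach's Alpha for an (n_respondents x n_items) score matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # sample variance per item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 5 respondents answering a 4-item attitude scale,
# with total scores from a retest two weeks later.
time1 = np.array([[4, 5, 4, 4],
                  [2, 3, 2, 3],
                  [5, 5, 4, 5],
                  [3, 3, 3, 2],
                  [4, 4, 5, 4]])
retest_totals = np.array([17, 10, 19, 11, 16])

print("Cronbach's alpha:", round(cronbach_alpha(time1), 2))
print("Test-retest r:", round(test_retest_reliability(time1.sum(axis=1), retest_totals), 2))
```

Both statistics run from consistency near 1 down toward 0; by convention an alpha of about 0.7 or above is often treated as acceptable, though the appropriate threshold depends on the stakes of the measurement.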
Relating Reliability and Validity
Reliability is directly related to the validity of the measure.
There are several important principles. First, a test can be reliable
but not valid. Consider the SAT, used as a predictor of success in college. It is a
reliable test (scores are consistent across administrations), though only a
moderately valid indicator of success (because college lacks the structured
environment – class attendance, parent-regulated study, and sleeping habits –
each of which bears on success).
Second, validity is more important than reliability. Using
the above example, college admissions may consider the SAT a reliable test,
but not necessarily a valid measure of other qualities colleges seek,
such as leadership capability, altruism, and civic involvement. The combination
of these aspects, alongside the SAT, is a more valid measure of the applicant’s
potential for graduation, later social involvement, and generosity (alumni
giving) toward the alma mater.
Finally, the most useful instrument is both valid and reliable. Proponents of
the SAT argue that it is both. It is a moderately reliable predictor of
future success and a moderately valid measure of a student’s knowledge in
Mathematics, Critical Reading, and Writing.
Part IV: Validity and Reliability in Qualitative Research
Thus far, we have discussed instrumentation as related to mostly
quantitative measurement. Establishing validity and reliability in qualitative research
can be less precise, though participant/member checks, peer evaluation (another
researcher checks the researcher’s inferences based on the instrument), and
multiple methods (keyword: triangulation) are
convincingly used. Some qualitative researchers reject the concept of validity
due to the constructivist viewpoint that reality is unique to the individual
and cannot be generalized. These researchers argue for a different standard for judging
research quality, often discussed under the heading of trustworthiness.
The research instruments
These research instruments or tools are ways of gathering data.
Without them, data would be impossible to obtain.
QUESTIONNAIRE
The most common research instrument for obtaining data beyond the physical reach of the observer; it may, for example, be sent to respondents who are thousands of miles away or just around the corner.
INTERVIEW
It is, in a sense, an oral questionnaire. Instead of writing the response, the interviewee gives the needed information orally and face-to-face. With a skillful interviewer, the interview is often superior to other data-gathering devices.
TAPE RECORDED DATA
Observing through the ear as well as through the eye; a video tape recorder or radio cassette recorder may also be used.
OBSERVATION
Perceiving data through the senses: sight, hearing, taste, touch, and smell.
It is the most direct way of studying individual behavior.
PSYCHOLOGICAL TESTS
An instrument designed to describe and measure a sample of certain aspects of human behavior.
Examples include performance tests, paper-and-pencil tests, achievement inventories, personality inventories, and projective devices.
The methods of data collection vary according to the research instruments used.
Massey states that "instrument development requires a high degree of research expertise, as the instrument must be reliable and valid."
The type of instrument used by the researcher depends on the data collection method selected. Instruments facilitate the observation and measurement of variables; an instrument can be described as a device used to collect the data.
Guidelines for Developing an Instrument
The instrument must be based on the theoretical framework selected for the study.
The research tool will only be effective as it relates to its particular purpose.
The instrument must be suitable for its function.
The instrument should include items that directly address the hypothesis.
The content of the instrument must be appropriate to test the hypothesis or answer the question being studied.
The researcher may need to read extensively to identify which aspects of the theory are appropriate for investigation.
The instrument should not contain measures that function as hints for desired responses.
A good instrument is free of built-in clues.
The instrument should be free of bias.
The researcher, through the instrument, must be able to gather data that are appropriate to test the hypothesis or to answer the question under investigation.