What is internal validity? Broadly, the validity of a research study refers to how well its results reflect what is actually true, rather than being artifacts of how the study was conducted. Internal validity is the extent to which the results of a study reflect the reality of the population you are studying and are not due to methodological errors or confounding factors. This differs from external validity (how well your results generalize to people outside the study), which I cover in another article.
Internal validity is basically a measure of: did you do the study right? Said another way, are there confounding factors that make your results harder to believe? You want your results to actually reflect the reality of the population you’re studying, and there are all sorts of ways a study design can go wrong and interfere with that.
The key to maintaining internal validity is designing your study to control for as many outside or confounding factors that could influence the results as possible. Below, I’ll share the key insights I’ve gained about internal validity from 30 years in academia.
Internal Validity Example: What Not to Do
For example, let’s say you were studying whether consuming a certain sports drink could help people dunk a basketball. You first tested ten-year-olds’ dunking ability to establish a baseline and then had them consume that drink regularly for eight years. Then you tested them again and concluded that because there was an 87% increase in the ability to dunk a basketball, the sports drink must be the cause.
However, of course, there are a lot of other things that could have happened in that time – for example, they might have grown. Those results aren’t valid because the way the study was designed didn’t account for other variables.
How to Improve Internal Validity
The key to internal validity is study design. When you set it up properly in the beginning, there are fewer potential flaws. If you don’t set it up properly, it’s harder to explain away confounding factors.
Fear not: there are simple, concrete steps you can take to improve internal validity. Here are some of the most common.
Use Control Groups
Going back to our ten-year-olds, you would probably want to have a control group of ten-year-olds who didn’t drink the sports drink, and measure their dunking ability eight years later as well.
Using a control group essentially provides a baseline – it gives you an idea of what participants in the treatment group might have experienced if they had not been exposed to the treatment. Without a control group, it’s difficult to get an idea of which effects are due to the treatment and which are due to external factors that have nothing to do with your study.
If you use a control group in a study, it’s important to make sure that there is no interaction between the treatment group and the control group. For example, you might be trying to determine if a certain educational approach can cause differences in test scores. Your treatment and control groups can’t be allowed to tell each other about what they’re learning, because that would interfere with the results of the study.
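To make the idea concrete, here’s a minimal sketch in Python of the kind of comparison a control group makes possible. The numbers are invented purely for illustration; the point is that you judge the treatment by how much more the treatment group changed than the control group did, not by the treatment group’s change alone.

```python
# Hypothetical pre/post scores for a treatment group and a control group.
# All numbers are made up purely for illustration.
treatment_pre  = [2, 3, 1, 4, 2, 3]
treatment_post = [9, 8, 7, 9, 8, 9]
control_pre    = [2, 2, 3, 4, 3, 2]
control_post   = [7, 6, 7, 8, 7, 6]

def mean(values):
    return sum(values) / len(values)

# Naive estimate: how much did the treatment group improve?
treatment_change = mean(treatment_post) - mean(treatment_pre)

# How much did the control group improve with no treatment at all
# (growth, practice, time, and every other outside factor)?
control_change = mean(control_post) - mean(control_pre)

# A fairer estimate of the treatment effect: change beyond the change
# that would have happened anyway.
estimated_effect = treatment_change - control_change

print(f"Treatment group change:     {treatment_change:.2f}")
print(f"Control group change:       {control_change:.2f}")
print(f"Estimated treatment effect: {estimated_effect:.2f}")
```

In the sports drink example, most of the “improvement” would show up in the control group too, because both groups grew up, and the estimated effect of the drink would shrink toward zero.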
Avoid Selection Bias
It’s important to assign subjects to either the treatment or the control group without regard to their characteristics. If I put different demographics of people in the treatment group and the control group, or put people in the control group that I believe will not change and people in the treatment group that I believe will change, then I have biased my results.
For example, medical researchers often use double-blind studies, in which neither the subjects nor the researchers working with them know who is receiving the treatment and who is receiving a placebo. This ensures the investigator can’t be swayed by a good story or by what they hope will be a positive outcome.
Of course, if a demographic characteristic might itself be a potential confounding factor, matching control group members and treatment group members on those characteristics is an option (examples might be matching by ethnicity, age, handedness, etc., depending on what you’re studying). You’ll still need to randomly assign members, but you can pull those random assignments from each demographic group you’ve created.
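If you go that route, here’s a minimal sketch in Python of what that kind of stratified random assignment might look like. The strata (age bands) and participant names are invented for illustration; within each stratum, assignment is still random.

```python
import random
from collections import defaultdict

# Hypothetical participants, each tagged with the characteristic you want
# balanced across groups (here, an age band). Names and bands are invented.
participants = [
    ("Ana", "10-12"), ("Ben", "10-12"), ("Cam", "10-12"), ("Dee", "10-12"),
    ("Eli", "13-15"), ("Fay", "13-15"), ("Gus", "13-15"), ("Hal", "13-15"),
]

# Group participants by stratum.
strata = defaultdict(list)
for name, age_band in participants:
    strata[age_band].append(name)

# Within each stratum, shuffle and split evenly between groups, so every
# stratum is represented in both the treatment and the control group.
assignments = {}
for age_band, names in strata.items():
    random.shuffle(names)
    half = len(names) // 2
    for name in names[:half]:
        assignments[name] = "treatment"
    for name in names[half:]:
        assignments[name] = "control"

for name, group in sorted(assignments.items()):
    print(f"{name}: {group}")
```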
Use Consistent Measurement Tools
You want to use the same measurement tool at the beginning as you do at the end. If you use one test for psychological well-being as a pre-test and a different test for the post-test, or if you give your treatment group and control group different tests, then your study has no meaning, hence no validity.
There’s a concept in statistics called regression toward the mean: extreme scores tend to move back toward the average when you measure again. If you look at people who got low math scores, run an intervention, and then see their scores improve, is that because of the intervention or because of that natural movement over time? This is another place where the control group comes in handy.
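If you’d like to see regression toward the mean for yourself, here’s a small simulation sketch in Python. Every detail is invented: each simulated student has a fixed “true ability,” and each test score is that ability plus random noise. The lowest scorers on the first test improve on the second test even though nothing was done to them.

```python
import random

random.seed(1)  # fixed seed so the illustration is repeatable

# Each simulated student has a stable true ability; each test score is
# that ability plus random measurement noise.
true_ability = [random.gauss(70, 10) for _ in range(1000)]
test_1 = [ability + random.gauss(0, 10) for ability in true_ability]
test_2 = [ability + random.gauss(0, 10) for ability in true_ability]

# Take the 100 lowest scorers on the first test ("struggling" students).
lowest = sorted(range(1000), key=lambda i: test_1[i])[:100]

avg_test_1 = sum(test_1[i] for i in lowest) / len(lowest)
avg_test_2 = sum(test_2[i] for i in lowest) / len(lowest)

# The same students score noticeably higher the second time, with no
# intervention at all: that gain is pure regression toward the mean.
print(f"Lowest scorers, test 1 average: {avg_test_1:.1f}")
print(f"Same students,  test 2 average: {avg_test_2:.1f}")
```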
Keep Scoring Consistent
Have I changed the ways I evaluate over time? I saw someone in week 1 and interpreted their response one way, but by week 7, after seeing many people, will I code a similar response differently? In large studies, you’ll have multiple raters. Do they all have the same interpretation? You have to make sure you train them to have what’s known as inter-rater reliability: all raters giving the same rating for the same or similar behavior/response.
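Inter-rater reliability is usually checked with a statistic such as simple percent agreement or Cohen’s kappa, which corrects agreement for chance. Here’s a minimal sketch in Python with invented ratings; in a real project you might instead reach for an existing implementation such as scikit-learn’s cohen_kappa_score.

```python
from collections import Counter

# Hypothetical codes assigned to the same 10 responses by two raters.
rater_a = ["yes", "yes", "no", "yes", "no", "no", "yes", "yes", "no", "yes"]
rater_b = ["yes", "no",  "no", "yes", "no", "yes", "yes", "yes", "no", "yes"]

n = len(rater_a)

# Observed agreement: proportion of responses both raters coded the same way.
p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Agreement we'd expect by chance, based on how often each rater uses
# each category.
counts_a = Counter(rater_a)
counts_b = Counter(rater_b)
categories = set(rater_a) | set(rater_b)
p_expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)

# Cohen's kappa: agreement beyond chance, scaled so 1 is perfect agreement.
kappa = (p_observed - p_expected) / (1 - p_expected)

print(f"Percent agreement: {p_observed:.2f}")
print(f"Cohen's kappa:     {kappa:.2f}")
```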
Teachers face this all the time when grading papers. If I’ve got 70 papers to grade, am I consistently grading everybody the same way? Is it possible that after grading 45 papers, I get grouchy and start giving lower grades, or get bored and grade more leniently?
Another test of validity is whether or not the results can be replicated. Can somebody else do it and come up with similar results? Many graduate students do their first studies trying to replicate studies others have done, just to get used to doing research. If it can’t be replicated, then it calls your research into question, so you’ll want to keep that in mind and design your study so that future researchers will be able to repeat your work and add evidence to build on your results.
Don’t Provide Too Much Compensation
Many times, respondents are compensated for being part of a study. However, it’s important for the compensation to be relatively meaningless – low enough that people don’t try to please you by looking for what answers they think you want. A $5 cup of coffee is a lot different than a $300 Amazon gift certificate. For $5 they might be honest, for $300 they might think, “Okay, what does she want to hear?”
Utilize Member Checking
Another technique that can be used to increase internal validity is called member checking. After you conduct an interview, you transcribe the interview and then you send it back to the subject to confirm that that’s exactly what they said. This reduces the likelihood of transcription errors or misinterpretations.
Be Aware of External Events
If you have a study that occurs over a period of time, you cannot control for external events. Events like COVID-19, a hurricane, or war cannot be controlled for and will influence your participants and the data you collect. However, external events do not always destroy the validity of a study. If the event impacts everybody in the study (including the control group) in the same ways and to the same degree, then it may not be a problem. However, if it affects your treatment group vs your control group differently, then you will have issues with internal validity.
Analyze Attrition
Attrition, where participants leave the study before completing it, is not uncommon. The important thing to consider is where the attrition happened. Did it happen only in your treatment group, or only with one type of subject? If so, does this alter your results? If attrition alters your results, then internal validity is compromised.
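A simple first pass is just to compare dropout rates across groups (and, if your records allow, across types of participants). Here’s a sketch in Python with invented counts:

```python
# Hypothetical enrollment and completion counts, invented for illustration.
enrolled  = {"treatment": 120, "control": 118}
completed = {"treatment":  78, "control": 110}

for group in enrolled:
    dropped = enrolled[group] - completed[group]
    rate = dropped / enrolled[group]
    print(f"{group}: {dropped} of {enrolled[group]} dropped out ({rate:.0%})")

# A lopsided pattern like this one (far more dropouts in the treatment
# group) is a warning sign: ask who left, why they left, and whether the
# participants who remained still resemble the participants you enrolled.
```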
Avoid Bias & Priming
One of the things that can hurt research from the very beginning is having an agenda or looking for specific results. In that case, you may start asking certain questions or interpreting things according to what you want to see. It’s a very subtle way of getting the result you want.
For example, if you are interviewing someone and they give you answers that contain terms or ideas you are hoping to find, you might probe more deeply into those responses while neglecting to probe as deeply with subjects or responses that don’t match your desired results.
Avoid Repeated Testing
If you give people the same test over and over again, they may just get better at taking the test. They also might learn, either explicitly or by implication, what the “right” answer is, and give that instead of what’s true. If they get used to the test, they may answer to the test itself rather than telling you what you’re actually trying to measure. To prevent this, avoid repeated testing. If you must give someone the same questionnaire multiple times, space the administrations out over extended time intervals and keep the number of administrations to a minimum.
Final Thoughts on Internal Validity
Maintaining internal validity doesn’t have to be difficult – it’s just about taking the time to think about what could go wrong and control for it at the beginning. It’s worth it to spend the time thinking about and planning for internal validity at the beginning. I also strongly recommend asking others if they can see any problems with your study design – especially your chair and committee. Believe me, you don’t want to get to the end of your study and find out that your results lack internal validity.
Taking the simple steps outlined in this article (including running everything by your Chair) will greatly increase your chances of producing valid results with your study. Go forth and validate!