As previously discussed, epidemiological research studies are conducted to meet a number of objectives which include amongst others:
There are seven major types of epidemiological study design that vary in the strength of evidence they provide about a causal relationship between exposures and outcomes. You will often see these framed as the pyramid of evidence whereby the studies at the top are considered more powerful, but less frequently done due to the time and expense involved. Studies towards the bottom are usually quicker and cheaper to run so there tends to be a lot more of them in the literature even though they may provide weaker evidence.
We’ll start our study design review working from the bottom of the pyramid to the top.
In these types of studies, researchers solicit opinions from individuals who are considered to have practical knowledge and experience working with a disease. Since these studies seldom involve field data collection, control groups, or rigorous scientific methods, they cannot be used to definitively establish the cause for a disease outcome.
A case report describes some ‘newsworthy’ or rare clinical occurrence such as an unusual combination of clinical signs, experience with a novel treatment, or a sequence of events that may suggest previously unsuspected causal relationships. In contrast, a case series presents data from multiple cases. The goal of a case series is to describe the common features among multiple cases and to explain patterns of variability among them.
Case reports and case series are useful for investigating new or re-emerging diseases as they help in determining the case definition that will be used to detect disease in descriptive and analytical studies. However, with a small number of cases it is difficult to perform robust statistical analyses, and without a control group to compare against we cannot draw robust conclusions about risk factors. For example, let’s say we have a case series of 100 cows in New Zealand with the novel disease dropped hock syndrome (DHS), and 30 of the 100 were known to have trace element deficiencies. Unless we know how many unaffected cattle in the source herds also had trace element deficiencies, we cannot conclude that a deficiency places cattle at increased risk.
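To see why the comparison group matters, the sketch below takes the 30 of 100 DHS cases with trace element deficiencies from the example above and pairs them with two made-up scenarios for unaffected cattle. The counts for unaffected cattle are hypothetical and are there only to show that the same case series is compatible with very different conclusions. The odds ratio is used here as the measure of association.

```python
def exposure_odds_ratio(exposed_cases, total_cases, exposed_unaffected, total_unaffected):
    """Odds of exposure among cases divided by odds of exposure among unaffected animals."""
    unexposed_cases = total_cases - exposed_cases
    unexposed_unaffected = total_unaffected - exposed_unaffected
    return (exposed_cases * unexposed_unaffected) / (unexposed_cases * exposed_unaffected)


# From the text: 30 of 100 DHS cases had trace element deficiencies.
cases_deficient, cases_total = 30, 100

# Hypothetical scenario A: deficiency is just as common in unaffected cattle (30 of 100).
or_same = exposure_odds_ratio(cases_deficient, cases_total, 30, 100)

# Hypothetical scenario B: deficiency is much rarer in unaffected cattle (10 of 100).
or_rarer = exposure_odds_ratio(cases_deficient, cases_total, 10, 100)

print(f"Scenario A odds ratio: {or_same:.2f}")
print(f"Scenario B odds ratio: {or_rarer:.2f}")
```

In scenario A the odds ratio is 1 (no association), while in scenario B it is well above 1. The case series alone cannot tell these two situations apart.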
A cross-sectional study is often what people are talking about when they say ‘we did a survey’. A descriptive cross-sectional study simply aims to determine what proportion of the population has the characteristic of interest by selecting a sample from the population. These are often done when there is a new disease, in order to characterize it, quantify its frequency, and determine how it varies in relation to person, place, and time. These studies are usually conducted without a specific hypothesis, but may be used to generate hypotheses to test in future studies. An analytical cross-sectional study will often ask questions about exposures as well, so that risk factors for disease can be explored. Individuals are examined for the presence of disease and the presence of specified risk factors at the same time.
For some diseases, it may prove useful to repeat the cross-sectional study to determine if changes are occurring over time. To estimate the proportion that have the disease at a particular point in time, it is not necessary to sample every member of the population of interest, but it is important to ensure that the sample is representative of the population. This is best achieved by taking a random sample from the population of interest.
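As a rough illustration of estimating prevalence from a random sample rather than the whole population, the sketch below draws a simple random sample and reports the sample prevalence with an approximate 95% confidence interval. The population size, true prevalence, sample size, and the function name `estimate_prevalence` are all assumptions made for the example.

```python
import math
import random


def estimate_prevalence(population, sample_size, seed=None):
    """Estimate prevalence from a simple random sample, with an approximate 95% CI."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)  # sampling without replacement
    p = sum(sample) / sample_size
    # Normal-approximation (Wald) standard error; rough when the sample is small
    # or the prevalence is close to 0 or 1.
    se = math.sqrt(p * (1 - p) / sample_size)
    lower = max(0.0, p - 1.96 * se)
    upper = min(1.0, p + 1.96 * se)
    return p, lower, upper


# Hypothetical population: 10,000 animals, of which 1,200 are diseased (True),
# giving a true prevalence of 12%.
population = [True] * 1200 + [False] * 8800

p, lower, upper = estimate_prevalence(population, sample_size=400, seed=1)
print(f"Estimated prevalence: {p:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Because every animal has the same chance of being selected, the sample estimate should fall close to the true prevalence, with the interval narrowing as the sample size grows.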
The three main advantages to a cross-sectional study are that:
The main disadvantages of a cross-sectional study are that it is not well suited to studying rare diseases or diseases that last only a very short period of time, and that it can be difficult to tell whether the exposure came before or after the outcome since both are measured at the same time point.
Case-control studies allow us to answer the same questions without needing to sample as many individuals. A case-control study begins with the identification of those individuals with disease (namely, cases). A suitable control group is best thought of as a sample (ideally random) taken from the population that gave rise to the cases. For example, if we did a case-control study to determine if neutering prevented mammary cancers, in which cases had been selected from one of five veterinary clinics, then controls should be selected from the dogs whose owners would have brought them to one of those clinics for treatment had they developed the disease of interest. The owners of both the case and control dogs would then be asked if their dog had been neutered.
The quality of a case-control study is dependent on whether the controls are representative of the population from which the cases arose. This requires that the study base be well defined in time and place. An example of a case-control study with a good study base might be all cases of coughing horses that occurred at one race track between 1 January 2012 and 31 December 2014. The controls would then be chosen at random from horses at the same race track at the time the cases occurred. Unfortunately, most case-control studies have a poorly defined study base. The following are just some examples of case-control studies in which the study base is poorly defined:
Reading this you may be inclined to think that there is little point in conducting a case-control study. However, it is possible to design good case-control studies and they do have a number of advantages. Case-control studies are an efficient method for studying rare diseases and because subjects have already experienced the outcome of interest at the start of the study, case-control studies are quick to run and are considerably cheaper than other study types. Furthermore, when using a case-control study, it is possible to explore the relationship between several study factors and the outcome factor of interest. Consequently, it can be a useful design for generating hypotheses.
It should also be noted that case-control studies can only use the odds ratio as the measure of association. This is primarily because we are only measuring disease at one time point, so we cannot use incidence risk or incidence rate to estimate disease frequency. We also artificially set the prevalence of disease in the study sample by choosing a specific number of controls for each case, so we cannot use the prevalence risk ratio either. For example, choosing 1 control per case would make the prevalence 50%, and choosing 2 controls per case would make it 33%; this can result in biased estimates of the relative risk, particularly for small study samples.
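The point about the control ratio can be checked with a few lines of arithmetic. The sketch below first reproduces the 50% and 33% figures quoted above, then computes an odds ratio from a 2x2 table. The exposure counts are hypothetical, and the function names are chosen for this example only.

```python
def apparent_prevalence(n_cases, controls_per_case):
    """Proportion of diseased subjects in a case-control sample (set by design)."""
    n_controls = n_cases * controls_per_case
    return n_cases / (n_cases + n_controls)


def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)


# The 'prevalence' in the sample depends only on how many controls we chose.
for ratio in (1, 2, 4):
    prevalence = apparent_prevalence(100, ratio)
    print(f"1 case : {ratio} control(s) -> apparent prevalence {prevalence:.0%}")

# Hypothetical counts: 40 of 100 cases exposed; 20% of controls exposed.
or_one_to_one = odds_ratio(40, 20, 60, 80)    # 100 controls
or_one_to_two = odds_ratio(40, 40, 60, 160)   # 200 controls, same exposure proportion

print(f"Odds ratio with 1:1 controls: {or_one_to_one:.2f}")
print(f"Odds ratio with 1:2 controls: {or_one_to_two:.2f}")
```

The apparent prevalence changes with the number of controls chosen, but the odds ratio is the same in both designs as long as the controls have the same exposure proportion, which is why it is the measure of association used for case-control studies.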
Cohort studies (also called longitudinal or prospective studies) can be conducted in either a prospective or retrospective manner. A prospective cohort study begins with the selection of two groups of non-diseased animals, one exposed to a factor postulated to cause a disease and the other unexposed. The groups are followed over time and their change in disease status is recorded during the study period. A retrospective cohort study starts when all of the disease cases have been identified and then the history of each study participant is carefully evaluated for evidence of exposure to the agent under investigation. Clearly, a retrospective cohort study is dependent on there being good records of disease and exposure. Because good records of exposure and disease are frequently unavailable, most well-designed cohort studies collect information prospectively.
The advantage of a cohort study is that because subjects are monitored over time for disease occurrence, cohort studies provide estimates of the absolute incidence of disease in exposed and non-exposed individuals. By design, exposure status is recorded before disease has been identified. In most cases, this provides unambiguous information about whether exposure preceded disease. Cohort studies are well suited for studying rare exposures. This is because the relative number of exposed and non-exposed individuals in the study need not necessarily reflect true prevalence of the risk factor in the population at large.
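A minimal sketch of how the incidence estimates from a cohort study turn into a measure of association is given below. The cohort sizes and case counts are hypothetical; note that the exposed group is deliberately not sized to match how common the exposure is in the wider population.

```python
def incidence_risk(new_cases, at_risk):
    """Proportion of initially disease-free animals that develop disease during follow-up."""
    return new_cases / at_risk


def incidence_risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Incidence risk in the exposed group divided by incidence risk in the unexposed group."""
    return incidence_risk(cases_exposed, n_exposed) / incidence_risk(cases_unexposed, n_unexposed)


# Hypothetical cohort: 200 exposed and 400 unexposed animals, all disease-free at the start.
risk_exposed = incidence_risk(30, 200)
risk_unexposed = incidence_risk(20, 400)
risk_ratio = incidence_risk_ratio(30, 200, 20, 400)

print(f"Incidence risk, exposed:   {risk_exposed:.2f}")
print(f"Incidence risk, unexposed: {risk_unexposed:.2f}")
print(f"Incidence risk ratio:      {risk_ratio:.2f}")
```

With these made-up numbers, disease occurs three times as often in the exposed group as in the unexposed group over the follow-up period.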
The main disadvantage of cohort studies is that they can require a long follow-up period particularly for diseases that may take years to develop. In the case of rare diseases, large groups are necessary to ensure that enough animals in each exposure group will develop the disease to facilitate statistical comparison. Losses to follow-up can become an important problem and the studies are often expensive.
Randomized controlled trials (RCTs) involve randomly assigning each individual either to receive the intervention or not. RCTs are the closest study type to an experiment and, as such, provide the best evidence as to whether or not exposure will result in a particular outcome (or prevent it). Historically, RCTs have been used to study therapeutic drugs, but have also been expanded to include other types of interventions such as surgical procedures or client education programmes. The first step is to determine the source population. The potential study participants are then assessed to determine if they are eligible before being invited to participate in the study. After they consent, they will be randomly assigned to receive either the treatment or control and then followed over time to monitor for the development of the disease outcome.
The advantage of RCTs is that they provide the best evidence as to whether or not an exposure (typically a therapeutic or preventive intervention) is causally associated with the outcome. This is because an RCT is the epidemiologic design that most closely resembles a laboratory experiment and, as such, provides excellent control for bias. The randomisation makes it likely that the treatment and control groups do not differ systematically with respect to relevant characteristics like age, breed, or sex. The trial is ‘controlled’ because the effects of the intervention can only be judged in relation to what would happen in its absence.
Sometimes animals and people can improve from the placebo effect just by participating in the study. You will also encounter the terms single-blinded study, which means that the participant (or animal owner) does not know which experimental treatment they are receiving, and double-blinded study, which means that neither the participant nor the researcher knows which experimental treatment is being given. This is to prevent bias. For example, if we were doing a study to evaluate the effects of providing pain relief to calves for disbudding, and I knew that Group A was receiving the pain medication and Group B was the control group receiving no pain medication, I may be subconsciously more likely to report better clinical outcomes for Group A than Group B because I am expecting them to do better.
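The random assignment step can be sketched as follows. This is the simplest possible scheme (shuffle the enrolled participants and split the list in half), shown only to make the idea concrete; the calf identifiers and the function name `randomise` are made up for the example.

```python
import random


def randomise(participant_ids, seed=None):
    """Randomly split eligible, consenting participants into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)  # copy so the original order is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical identifiers for 20 enrolled calves.
calves = [f"calf_{i:02d}" for i in range(1, 21)]
groups = randomise(calves, seed=42)

print("Treatment:", groups["treatment"])
print("Control:  ", groups["control"])
```

Because chance alone decides which group each calf joins, neither the researcher nor the owner can steer healthier or sicker animals towards one group.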
The main disadvantage of an RCT is that it is not always ethical or feasible. For example, it would be impossible to design an RCT to evaluate the impact of things such as weather or population as the researcher cannot control these factors. If we were interested in exploring the effects of alcohol consumption on fetal development in humans, it would be considered extremely unethical to assign mothers into drinking and non-drinking experimental groups. Furthermore, RCTs are more expensive and impractical if long periods of follow-up are required. They are inefficient studies for rare diseases or diseases that take a long time to develop.
In systematic reviews and meta-analyses, researchers collect information from many different published research studies and aggregate the results to identify either similar patterns or sources of disagreement. If 20 studies in the literature have all found evidence of an association between an exposure and an outcome, that increases our confidence that there is a genuine link (remember back to Hill’s criteria).
The main advantage to this approach is that it considers all available evidence to help us decide how much we believe a particular hypothesis. The main disadvantages are that it can take considerable time and effort to find all relevant studies in the literature and we know that there is an underreporting in the literature of “negative studies” (studies where no associations were found).
When you are evaluating research articles, it is important to be able to correctly identify which study design the researchers used as this will impact how we interpret the findings and potential sources of bias. Always look at the actual study methodology rather than relying solely on the label the researchers assigned, because they can get it wrong. The following is a very basic algorithm for determining which study design the researchers used.
Basically, if there is no comparison/control group (i.e. they just looked at individuals with disease or they just looked at individuals with an exposure), then it is a case series or case report. Remember that we always need to have a baseline for comparison to make inferences about risk.
If they tracked any group of susceptible individuals over time to see if they developed disease, it’s either going to be a cohort study or an RCT. If they randomly assigned participants into exposure groups, then it becomes an RCT. The most appropriate measure of association is the incidence risk ratio or the incidence rate ratio.
If they only measured disease/exposure in an individual at one time period, you know it’s either going to be a cross-sectional study or a case-control study. If they started by identifying the cases and then picking a number of controls to match, then it is a case-control study. If they surveyed an entire group of individuals, tested them for disease, and used the test results to determine the ratio of positive vs negative, then it is a cross-sectional study. For case-control studies, we always use the odds ratio. For cross-sectional studies, we can use either the odds ratio or the prevalence risk ratio.
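The basic algorithm described above can be written down as a short decision function. This is only a sketch of the rules as stated in the text; the function name and the yes/no questions used as parameters are assumptions made for the example, and real papers will often need more judgement than four flags can capture.

```python
def classify_study_design(has_comparison_group, followed_over_time,
                          randomly_assigned=False, selected_by_disease_status=False):
    """Return (study design, usual measure of association) using the basic algorithm."""
    # No baseline for comparison: only cases (or only exposed individuals) were described.
    if not has_comparison_group:
        return "case report / case series", None

    # Susceptible individuals tracked over time to see whether disease develops.
    if followed_over_time:
        design = "randomised controlled trial" if randomly_assigned else "cohort study"
        return design, "incidence risk ratio or incidence rate ratio"

    # Disease and exposure measured at a single time point.
    if selected_by_disease_status:
        return "case-control study", "odds ratio"
    return "cross-sectional study", "odds ratio or prevalence risk ratio"


# Example: researchers identified cases first, then picked controls to match.
design, measure = classify_study_design(
    has_comparison_group=True,
    followed_over_time=False,
    selected_by_disease_status=True,
)
print(design, "->", measure)
```

Answering the questions from the paper's methods section, rather than from the label in its title, is what keeps the classification honest.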