What concepts you will be introduced to:
- risk, risk factor, risk association, relative risk, odds ratio, attributable risk.
You will be able to:
- differentiate between rates, ratios, and proportions.
- calculate relative risk, odds ratio, and risk reduction using contingency tables.
- interpret the results of the odds ratio, relative risk, and relative risk reduction.
- evaluate therapeutic interventions with measures such as: absolute risk reduction (ARR), number needed to treat (NNT), number needed to harm (NNH), etc.
You will explore common biases in research and how to overcome these biases.
We explored rates, ratios, and proportions when looking at Measures of Disease Occurrence in Epidemiology. These concepts are equally fundamental to understanding measures of risk.
Rates are numerical expressions that quantify the occurrence of events or phenomena (how fast) within a specific population over a specified period, typically expressed per unit of population or time. Rates such as incidence rates and prevalence rates are crucial in assessing the frequency and distribution of diseases within populations.
Ratios denote the relationship between two quantities and are used to compare frequencies or sizes of different groups or elements in a population. Ratios like the risk ratio (relative risk) and odds ratio are used to evaluate the association between exposure and disease occurrence, comparing the risk of a particular outcome between different groups.
Proportions represent the fraction of a specific characteristic within a population, often presented as a percentage. Proportions are used in calculating proportions of affected individuals within a population, such as disease prevalence or case-fatality rates, offering insights into the burden and impact of diseases on specific groups or populations.
These concepts directly relate to measures of risk. These metrics aid in understanding the magnitude and patterns of health-related events, contributing significantly to epidemiological research and informing public health interventions.
Measures of risk are used to allocate health resources (time, money, human resources, etc.), to infer the cause or source of disease, and to evaluate the role of an exposure as a cause of disease. Various measures are used to evaluate risk and bias in research:
Risk refers to the probability of an individual developing a particular disease or health condition over a specific period. It is often calculated as the ratio of the number of people affected by the disease to the total population at risk.
Risk Factor is any characteristic, exposure, or behavior of an individual or population that increases the likelihood of developing a disease or experiencing an adverse health outcome. These factors can be genetic, environmental, behavioral, or related to lifestyle.
Risk Association measures the strength and nature (magnitude and direction) of the relationship between a potential risk factor and the occurrence of a disease or health condition. It is often quantified using statistical methods to establish the likelihood or probability of the association being due to chance. It is prudent to note that the association between exposure and outcome variables does not necessarily indicate causality. Importantly, measures of association are used to determine the aetiology, treatment, and prevention of diseases.
Measures of risk and bias are critical as they help researchers identify potential causes or contributing factors to diseases, assess the strength of relationships, and determine the validity and reliability of study findings. Risk factors may be classified as modifiable or non-modifiable.
Table 7.1 Risk factors for diseases categorized into modifiable and non-modifiable factors.
Presenting Risk Data
Risk data are often presented in a standardized format, commonly utilizing contingency tables, also known as 2×2 tables. These tables serve to compare two key variables: an exposure and an outcome. The layout typically features the dependent variable (outcome) along the top row and the independent variable (predictor, exposure, or treatment such as smoking or alcohol) listed vertically. Both variables are represented as binary characteristics: exposed or not exposed, and disease present or absent. For visual clarity, an example of this table is provided below.

|             | Disease present | Disease absent | Total         |
|-------------|-----------------|----------------|---------------|
| Exposed     | a               | b              | a + b         |
| Not exposed | c               | d              | c + d         |
| Total       | a + c           | b + d          | a + b + c + d |

We will use this table in our computations of risk.
Absolute Risk
Absolute risk refers to the actual or absolute probability of an event occurring within a specific population over a defined period. It represents the proportion of individuals in a group who experience a particular event, such as developing a disease, suffering an adverse effect, or encountering an outcome of interest, regardless of other influencing factors. Absolute risk is usually expressed as a percentage or a rate within a specific time frame and is valuable in providing an understandable and straightforward measure of risk. Absolute risk may be calculated using this formula:
Absolute Risk = Number of individuals experiencing the event / Total population at risk
Table #7.2
Title: Ten-Year Analysis of Lung Cancer Incidence Among 2,000 Adults Aged 50 and Above, Stratified by Smoking Status.

|             | Lung cancer | No lung cancer | Total |
|-------------|-------------|----------------|-------|
| Smokers     | 100         | 900            | 1,000 |
| Non-smokers | 10          | 990            | 1,000 |
| Total       | 110         | 1,890          | 2,000 |
A total of 100 smokers out of 1,000 developed lung cancer. The absolute risk of lung cancer among smokers can be calculated as 100/1000 = 0.1 or 10%.
Among non-smokers, 10 out of 1,000 developed lung cancer. The absolute risk of lung cancer among non-smokers is 10/1000 = 0.01 or 1%.
The total number of individuals who developed lung cancer is 110 out of 2,000 individuals. If all conditions remain the same for this population, the absolute risk of developing lung cancer over a 10-year period is 110/2,000 * 1,000, which is 55 cases per 1,000 population.
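As a quick check, the absolute risks above can be reproduced with a few lines of code. The following is a minimal Python sketch using the figures from Table #7.2; the function and variable names are illustrative, not from the original study.

```python
# Absolute risk = number of events / total population at risk (Table #7.2).

def absolute_risk(events: int, population: int) -> float:
    """Return the proportion of the population experiencing the event."""
    return events / population

ar_smokers = absolute_risk(100, 1_000)       # 0.10, i.e. 10%
ar_non_smokers = absolute_risk(10, 1_000)    # 0.01, i.e. 1%
ar_overall = absolute_risk(110, 2_000)       # 0.055, i.e. 55 per 1,000

print(f"Smokers:     {ar_smokers:.1%}")
print(f"Non-smokers: {ar_non_smokers:.1%}")
print(f"Overall:     {ar_overall * 1_000:.0f} cases per 1,000 over 10 years")
```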
Attributable Risk
Upon analyzing the data from Table 7.2, it’s evident that lung cancer incidence in the population is not solely associated with smoking. Cases of lung cancer were observed in individuals who were not exposed to smoking. To discern the genuine risk of an event linked to exposure, we can calculate the attributable risk.
Attributable risk represents the disparity in the incidence of a condition, like lung cancer, between an exposed group (smokers) and an unexposed group (non-smokers). It indicates the proportion of risk directly attributed to the exposure variable (smoking).
Using the previous example:
- Absolute risk among smokers = 10%
- Absolute risk among non-smokers = 1%
Attributable risk = Absolute risk among smokers – Absolute risk among non-smokers
Attributable risk = 10% – 1% = 9%
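The same subtraction can be expressed in code. This short Python sketch continues the Table #7.2 example; the variable names are illustrative.

```python
# Attributable risk = absolute risk in exposed - absolute risk in unexposed.

risk_smokers = 100 / 1_000      # 10% absolute risk among smokers
risk_non_smokers = 10 / 1_000   # 1% absolute risk among non-smokers

attributable_risk = risk_smokers - risk_non_smokers
print(f"Attributable risk: {attributable_risk:.0%}")  # 9%
```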
Therefore, in this scenario, the attributable risk of developing lung cancer due to smoking is 9%. In other words, among smokers, an excess of 9 cases per 100 people over the 10-year period can be attributed to smoking itself rather than to the background risk seen in non-smokers. However, assuming that removing smoking from the population would instantly lead to an identical lung cancer incidence of 1% in both exposed and unexposed groups might oversimplify the outcome. It’s essential to recognize that this statement lacks consideration for potential residual effects, the persistence of risk factors, or other unaccounted influences that might persist even after eliminating smoking from the population. This highlights the need for a more comprehensive understanding of causal associations and residual effects when interpreting these measures to draw more accurate conclusions.
Absolute Risk Reduction
Absolute risk reduction (ARR) refers to the difference in the absolute risk of an adverse event between two groups, such as those receiving a treatment/intervention versus those who do not. It quantifies the actual reduction in the risk of an outcome occurring due to an intervention or exposure. ARR is calculated by subtracting the absolute risk in the control group (without the intervention) from the absolute risk in the treated group (with the intervention).
Absolute Risk Reduction (ARR) = Risk in Control Group – Risk in Treated Group.
ARR is an important measure used in clinical trials and epidemiological studies to evaluate the effectiveness of treatments or interventions in reducing the occurrence of an event or outcome. It provides a clearer understanding of the impact of an intervention by expressing the actual reduction in risk associated with the treatment, regardless of the baseline risk.
Table #7.3
Title: Contingency Table Showing Smoking Cessation Rates With and Without Varenicline Tartrate Treatment in an Adequately Powered Placebo-Controlled Study.

|                                | Quit smoking | Did not quit | Total |
|--------------------------------|--------------|--------------|-------|
| Varenicline Tartrate (treated) | 100          | 900          | 1,000 |
| Placebo (control)              | 50           | 950          | 1,000 |
Let us consider the hypothetical scenario in Table #7.3, where the study evaluates the effectiveness of Varenicline Tartrate, a medication for smoking cessation, in a population of 2,000 individuals over a six-month period.
Suppose in the placebo-controlled trial:
- In the group receiving Varenicline Tartrate (the treated group) consisting of 1,000 individuals, 100 individuals quit smoking.
- In the control group (without the treatment), also consisting of 1,000 individuals, 50 individuals quit smoking.
To calculate the Absolute Risk Reduction (ARR) we can use the following formula:
ARR = c/(c + d) – a/(a + b), where a and b come from the treated (exposed) row of the 2×2 table and c and d from the control (unexposed) row.
Risk in the Control Group = Number of successes (50 quitters) / Total population of the Control Group (1,000) = 50 / 1,000 = 0.05 or 5%
Risk in the Treated Group = Number of successes (100 quitters) / Total population of the Treated Group (1,000) = 100 / 1,000 = 0.1 or 10%
ARR = Risk in Control Group – Risk in Treated Group = 0.05 – 0.10 = –0.05
The magnitude of the Absolute Risk Reduction (ARR) in this scenario is 5%, or 5 percentage points. Because the outcome measured here, quitting smoking, is a desirable event rather than an adverse one, the negative sign indicates a benefit: the use of Varenicline Tartrate resulted in a 5% higher smoking cessation rate among the treated group compared to the placebo group within the six-month time frame.
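The ARR calculation, including the sign convention discussed above, can be sketched in Python as follows; the variable names are illustrative.

```python
# ARR for Table #7.3. The "event" (quitting smoking) is desirable here,
# so a negative difference signals a benefit of the treatment.

risk_control = 50 / 1_000    # 5% quit in the placebo group
risk_treated = 100 / 1_000   # 10% quit in the Varenicline group

arr = risk_control - risk_treated
print(f"ARR = {arr:+.2f} (magnitude {abs(arr):.0%})")  # -0.05, magnitude 5%
```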
Number Needed to Treat
The Number Needed to Treat (NNT) is a statistical measure used to determine the number of patients who need to receive a specific treatment over a defined period to prevent one additional adverse outcome or to achieve one positive outcome compared to a control group not receiving the treatment. It quantifies the effectiveness of a treatment by estimating how many patients need to be treated to observe a particular beneficial effect or to prevent a harmful event in one patient. We assessed the effectiveness of Varenicline Tartrate using the hypothetical data from Table #7.3; let us now determine the NNT to attain one positive endpoint (smoking cessation) in a six-month period.
NNT = 1/|ARR| = 1/0.05 = 20
This means that for every 20 individuals treated with Varenicline Tartrate, one additional positive outcome (smoking cessation) is achieved in the intervention group compared with those not treated or given placebo.
A lower NNT indicates that fewer individuals need to be treated to achieve a beneficial outcome, reflecting a more effective treatment.
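Continuing the Table #7.3 example, a minimal sketch of the NNT calculation (using the magnitude of the ARR and rounding up to a whole patient) might look like this:

```python
# NNT = 1 / |ARR|, conventionally rounded up to a whole number of patients.
import math

arr = 0.05 - 0.10                  # control risk minus treated risk
nnt = math.ceil(1 / abs(arr))
print(f"NNT = {nnt}")              # 20
```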
Relative Risk
Relative Risk (RR) is a measure that compares the risk of developing a certain disease between two groups: one exposed to a specific risk factor and another unexposed. In a cohort study, initially disease-free groups are followed over time to estimate the risk of disease developing in both groups. The incidence risk is calculated for each group, typically represented as a/(a + b) for the exposed and c/(c + d) for the unexposed. Here, ‘a’ denotes those who developed the disease among the exposed, ‘b’ represents those who did not develop the disease among the exposed, ‘c’ signifies those who developed the disease among the unexposed, and ‘d’ represents those who did not develop the disease among the unexposed.
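A generic helper for this calculation, using the a/b/c/d cell notation just defined, might look like the sketch below. The example counts passed to it are hypothetical, chosen only to illustrate the call.

```python
# Relative risk from the four cells of a 2x2 cohort table.

def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """RR = [a / (a + b)] / [c / (c + d)]."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed

# Hypothetical cell counts: 30/100 exposed vs 10/100 unexposed developed disease.
print(relative_risk(a=30, b=70, c=10, d=90))  # 3.0
```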
Table #7.4
Title: 2×2 table showing neonates who developed hyperbilirubinemia secondary to Sulfamethoxazole/Trimethoprim (SMX-TMP) exposure in utero.
Calculating relative risk:
RR = [a/(a + b)] / [c/(c + d)] = 3.24
Therefore, neonates exposed to SMX-TMP are 3.24 times more likely to have the disease outcome (hyperbilirubinemia) compared to those unexposed to SMX-TMP in utero.
Interpreting the results of relative risk
- If RR = 1, it signifies that the risk of developing the disease in the exposed group is equal to that in the unexposed group, indicating no association between the exposure and the disease occurrence.
- If RR > 1, it means that the risk of developing the disease in the exposed group is higher than that in the unexposed group, suggesting a positive relationship between the exposure and the disease occurrence, potentially indicating a causal link.
- If RR < 1, it indicates that the risk of developing the disease in the exposed group is lower than that in the unexposed group, suggesting a negative association. This might imply some form of protection within the exposed group against the occurrence of the disease.
It is essential to analyze RR results cautiously, considering various factors and potential biases (confounding biases, selection bias, etc.) that might influence the observed associations in cohort studies.
Note carefully that this section emphasizes the use of cohort study data to compute relative risk. This is due to challenges with relative risk estimation in case-control studies relating to the study design. These studies start with known cases and controls and hence lack direct incidence data. Selecting appropriate controls representative of the population presents a challenge, influencing RR estimation. Moreover, case-control studies do not establish temporal relationships between exposure and disease onset, hindering causal inference or RR estimation. Consequently, RR is not directly calculable in these studies, leading researchers to commonly use odds ratios as an approximation, especially when the disease is rare. However, the odds ratio may not precisely reflect RR, creating complexities in interpreting the results.
Relative Risk Reduction
We analyzed risk associations through relative risk calculations. Assessing the efficacy of an available intervention for disease outcomes involves examining relative risk reduction (RRR). RRR is a statistical measure used to assess the effectiveness of an intervention in reducing the risk of an outcome or event in a specific group compared to another. It quantifies the proportional reduction in the risk of an event in the treatment group relative to the control group. RRR is calculated as the difference between the control group’s event rate and the treatment group’s event rate, divided by the control group’s event rate, and then multiplied by 100 to express the reduction as a percentage:
RRR = [(Risk in Control Group – Risk in Treated Group) / Risk in Control Group] × 100%, or equivalently, RRR = (1 – RR) × 100%
This metric, commonly utilized in clinical trials and epidemiological research, evaluates the intervention’s effectiveness by comparing the relative decrease in the risk of an event between two groups.
Practice Example
Consider an adequately powered study (n = 2,000) assessing the effectiveness of an aromatase inhibitor (AI) X in preventing the occurrence of breast cancer. Breast cancer occurred in 64 of the 1,000 women in the exposure (treated) group, while 435 of the 1,000 women in the unexposed (untreated) group developed breast cancer. What is the relative risk reduction associated with this intervention?
Table #7.5
Title: Contingency Table Showing Relative Effectiveness of an AI X in Reducing Breast Cancer Incidence among 2,000 Women.

|                     | Breast cancer | No breast cancer | Total |
|---------------------|---------------|------------------|-------|
| AI X (treated)      | 64            | 936              | 1,000 |
| No AI X (untreated) | 435           | 565              | 1,000 |
RRR for this study can be calculated as follows:
- Risk in the group treated with AI X = a/(a + b) = 64/1,000 = 0.064
- Risk in the untreated group = c/(c + d) = 435/1,000 = 0.435
- Therefore, RRR = [(Risk in untreated – Risk in treated)/Risk in untreated] × 100%
- RRR = [(0.435 – 0.064)/0.435] × 100% ≈ 85.3%
In this context, the RRR of approximately 85.3% means that there is an 85.3% relative reduction in the risk of breast cancer for women using the AI X compared to the placebo or untreated group.
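Both forms of the RRR formula can be verified against the Table #7.5 figures with a short Python sketch; the variable names are illustrative.

```python
# RRR for Table #7.5, computed two equivalent ways.

risk_treated = 64 / 1_000      # breast cancer risk with AI X
risk_untreated = 435 / 1_000   # breast cancer risk without AI X

rrr = (risk_untreated - risk_treated) / risk_untreated * 100
rr = risk_treated / risk_untreated
print(f"RRR = {rrr:.1f}%")                         # 85.3%
print(f"(1 - RR) * 100 = {(1 - rr) * 100:.1f}%")   # same result
```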
ARR, NNT, RR, and RRR
We will use the following case to illustrate how metrics like ARR, NNT, RR, and RRR can complement each other in assessing health risks, determining intervention effectiveness, and providing guidance for policy direction.
Case: An adequately powered trial assessing the effectiveness of a 5-alpha reductase inhibitor (5-ARI) in reducing prostate cancer among men 40 years and older. Of the 50 men enrolled, 25 received the 5-ARI and 10 of them developed prostate cancer, while 15 of the 25 men in the control group developed prostate cancer.
- AR = 25/50 × 100 = 50%
- ARR = 0.6 – 0.4 = 0.2
- NNT = 1/0.2 = 5
- RR = 0.4/0.6 ≈ 0.67
- RRR = 0.2/0.6 ≈ 0.33
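A single Python sketch can reproduce all five values from the cell counts given above; the variable names are illustrative.

```python
# AR, ARR, NNT, RR, and RRR for the hypothetical 5-ARI trial.
import math

cases_treated, n_treated = 10, 25    # 5-ARI group
cases_control, n_control = 15, 25    # control group

risk_treated = cases_treated / n_treated    # 0.4
risk_control = cases_control / n_control    # 0.6

ar = (cases_treated + cases_control) / (n_treated + n_control)  # 0.5
arr = risk_control - risk_treated                               # 0.2
nnt = math.ceil(1 / arr)                                        # 5
rr = risk_treated / risk_control                                # ~0.67
rrr = arr / risk_control                                        # ~0.33

print(f"AR={ar:.0%}  ARR={arr:.1f}  NNT={nnt}  RR={rr:.2f}  RRR={rrr:.0%}")
```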
The absolute risk (AR) of prostate cancer in the studied population is 50% regardless of the intervention, indicating a substantial baseline risk. The ARR of 0.2 and the NNT of 5 suggest that for every 5 individuals treated with the 5-ARI, one case of prostate cancer could be prevented over the studied period.
The RR of approximately 0.67 indicates a reduced risk of prostate cancer among those receiving the intervention compared to those without it, albeit not a substantial reduction. Additionally, the RRR of approximately 0.33 demonstrates a 33% reduction in the relative risk of developing prostate cancer among those treated with the 5-ARI.
However, despite these calculations, it’s crucial to note that the effectiveness of the intervention might be modest, considering the absolute risk remains relatively high in the studied population. The 5-ARI appears to have a modest effect in reducing the risk of prostate cancer but might not be significantly impactful in substantially decreasing the overall risk in this specific demographic.
Further investigation and comprehensive analysis are warranted to understand the clinical significance and real-world implications of the intervention in mitigating prostate cancer risk among men aged 40 years and older. The researcher might consider other factors such as adverse effects, cost-benefit analysis, and long-term follow-up studies. These additional strategies would be pivotal in determining the holistic effectiveness and suitability of the 5-ARI as a preventive measure for prostate cancer in this population.
Number Needed to Harm
The Number Needed to Harm (NNH) is a statistical measure used in epidemiology and clinical research. It represents the number of individuals who need to be exposed to a risk factor or treatment for a specific period to cause harm in one individual who otherwise would not have been harmed if not exposed.
To calculate the NNH, use data from a study or analysis that compares the occurrence of harm between exposed (new intervention) and unexposed groups (traditional therapy, placebo or no intervention).
The formula is: NNH = 1/Absolute Risk Increase (ARI).
Table #7.7
Title: 2×2 Table comparing incidence of side effects for two drugs used in smoking cessation.

|             | Side effects | No side effects | Total |
|-------------|--------------|-----------------|-------|
| Varenicline | 130          | 870             | 1,000 |
| Wellbutrin  | 635          | 365             | 1,000 |
We will assess the risk of side effects associated with each treatment option:
Risk with Varenicline
- Risk of side effects with Varenicline = 130/1,000 = 0.130
- Risk of side effects with the comparator (Wellbutrin) = 635/1,000 = 0.635
- ARI = 0.130 – 0.635 = –0.505
- NNH (Varenicline) = 1/ARI = 1/(–0.505) ≈ –2 (invalid)
Risk with Wellbutrin
- Risk of side effects with Wellbutrin = 635/1,000 = 0.635
- Risk of side effects with the comparator (Varenicline) = 130/1,000 = 0.130
- ARI = 0.635 – 0.130 = 0.505
- NNH (Wellbutrin) = 1/ARI = 1/0.505 ≈ 2
Note, the Absolute Risk Increase (ARI) is the difference in the absolute risk of an adverse event between the exposed group and the comparison group; in this example, each drug is compared against the other.
The negative ARI value for Varenicline indicates a decrease in side-effect risk relative to the comparator rather than an increase. Negative values do not allow for meaningful interpretation in the context of NNH, as they imply a reduction in risk rather than an increase, which is what NNH measures.
The Absolute Risk Increase for Wellbutrin, a traditional treatment, is 0.505 relative to Varenicline. This value indicates an increase in the absolute risk of experiencing side effects among individuals using Wellbutrin compared with those using Varenicline. The NNH for Wellbutrin is approximately 2, meaning that for every 2 individuals treated with Wellbutrin rather than Varenicline, one additional person is likely to experience side effects.
Therefore, these results indicate an increased risk of side effects associated with Wellbutrin when compared with the alternative treatment.
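The ARI and NNH comparison between the two drugs can be sketched in Python as follows; the variable names are illustrative.

```python
# ARI and NNH from the side-effect risks in Table #7.7.
import math

risk_varenicline = 130 / 1_000   # 13.0% side-effect incidence
risk_wellbutrin = 635 / 1_000    # 63.5% side-effect incidence

# Wellbutrin relative to Varenicline: a positive ARI, so NNH is meaningful.
ari_wellbutrin = risk_wellbutrin - risk_varenicline   # 0.505
nnh_wellbutrin = math.ceil(1 / ari_wellbutrin)        # 2

# Varenicline relative to Wellbutrin: a negative ARI, so NNH is invalid.
ari_varenicline = risk_varenicline - risk_wellbutrin  # -0.505

print(f"ARI (Wellbutrin) = {ari_wellbutrin:.3f}, NNH = {nnh_wellbutrin}")
print(f"ARI (Varenicline) = {ari_varenicline:.3f} (NNH not meaningful)")
```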
The Number Needed to Harm provides valuable information about the potential risk associated with an exposure or intervention, assisting in decision-making and evaluating the balance between benefits and harms in clinical practice or public health interventions.
Odds Ratio
The odds ratio (OR) is a statistical measure used in research and epidemiology to quantify the strength and direction of the association between two variables in a case-control study. It compares the odds of an event or outcome occurring in one group to the odds in another. Mathematically, the odds ratio is calculated as the ratio of the odds of an event occurring in the exposed group to the odds of the event occurring in the unexposed group:
Odds Ratio = Odds of Exposure in Cases / Odds of Exposure in Controls, i.e., (a/b)/(c/d), which simplifies to (a × d)/(b × c)
Table #7.6
Title: A study among 2,000 individuals showing the prevalence of lung disease and exposure to smoking.

|             | Lung disease | No lung disease | Total |
|-------------|--------------|-----------------|-------|
| Smokers     | 650          | 350             | 1,000 |
| Non-smokers | 135          | 865             | 1,000 |
The Odds Ratio = (650 × 865)/(350 × 135) ≈ 12, meaning the odds of having the disease among smokers are about 12 times the odds of having the disease among non-smokers.
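The cross-product calculation can be checked with a short Python sketch using the Table #7.6 cell counts; the variable names are illustrative.

```python
# Odds ratio via the cross-product form OR = (a * d) / (b * c).

a, b = 650, 350   # smokers: disease present / absent
c, d = 135, 865   # non-smokers: disease present / absent

odds_ratio = (a * d) / (b * c)
print(f"OR = {odds_ratio:.1f}")   # ~11.9, i.e. about 12
```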
In case-control studies, the odds ratio helps in assessing the association between an exposure (such as a risk factor or treatment) and an outcome (such as a disease or condition) by comparing the odds of exposure among cases (those with the outcome) to the odds of exposure among controls (those without the outcome).
- An odds ratio equal to 1 suggests no association between the exposure and the outcome.
- An odds ratio greater than 1 suggests a positive association (increased odds of the outcome with exposure).
- An odds ratio less than 1 suggests a negative association (decreased odds of the outcome with exposure).
Biases in Research
Biases in research signify systematic errors or deviations from the true outcomes of a study and can arise across different research phases. Common biases encompass selection, measurement, confounding, and publication biases. These biases manifest as systematic errors in the study’s design, execution, or analysis and interpretation, resulting in an inaccurate estimation of the study’s findings.
The implications of research biases are far-reaching, as they profoundly impact the validity, reliability, and generalizability of study results. They jeopardize the accuracy and reliability of research outcomes, thereby undermining the scientific integrity of the study. Biases also wield influence over the interpretation of findings, leading to skewed conclusions or erroneous associations between variables. Consequently, researchers might draw flawed inferences from biased data.
Biases can curtail the generalizability of study findings, especially when the participant selection or methods introduce biases that fail to represent the wider population accurately. Moreover, biases raise ethical concerns by potentially misguiding decision-making in healthcare, policy, or other domains, possibly resulting in harm if erroneous conclusions are applied. Additionally, biases can pervade the publication process, favoring studies with positive or statistically significant results and overlooking negative or null findings, thereby fostering publication bias.
Types of Biases in Research
Selection bias occurs when certain individuals or groups are systematically included or excluded from a study in a way that skews the sample’s composition, leading to non-representative or biased results. This typically arises due to flaws in the study’s design, wherein the sample selection process introduces systematic differences between the chosen participants and the broader population. It may manifest when non-probability sampling methods like snowballing or convenience sampling are utilized, leading to a sample that does not accurately represent the entire population (compromised external validity) of interest.
Effective strategies to mitigate selection bias include:
Randomization- Employing randomization techniques helps distribute potential confounding factors equally among study groups. Random allocation helps ensure that participants are assigned to groups without any systematic bias, minimizing the impact of external variables that might influence the outcomes.
Matching- Matching involves selecting study participants in a way that balances important characteristics across different groups. It entails pairing individuals in different groups who have similar characteristics, allowing for a more balanced representation of factors that could affect the outcomes being measured.
Information bias encompasses various biases in research, including misclassification bias and recall bias. Misclassification bias occurs when individuals are inaccurately classified into certain categories (e.g., having or not having a disease) due to incomplete data or errors in interpreting results. On the other hand, recall bias involves individuals’ memory and the tendency to inaccurately recall past events or experiences. Interviewer bias may arise when the interviewer influences the respondent’s answers intentionally or unintentionally. Non-response bias occurs when certain individuals do not respond to questions, potentially affecting the representativeness of the sample. Wish bias involves respondents providing answers as if they never had the disease, possibly due to a desire for a particular outcome. Lastly, surrogate interview bias refers to a situation where information about an individual is provided by someone else, such as a spouse reporting on behalf of their partner, which might lead to biases in the information shared.
Strategies to overcome information or observer bias:
Implementing standardized protocols and procedures for data collection, interviews, or observations can help minimize biases. This includes using structured questionnaires, standardized assessment tools, or specific training for interviewers or observers to ensure consistency in data collection methods.
Employ blinding techniques, such as single-blind or double-blind methodologies, to minimize bias. Single-blind involves participants not knowing specific details about the study, while double-blind involves both participants and researchers being unaware of critical information. Blinding can reduce potential biases stemming from participants’ or researchers’ expectations or preferences.
In situations where observations or evaluations are required, using independent observers who are not aware of the study objectives or potential outcomes can mitigate bias. This approach reduces the chances of biased observations influenced by preconceived notions or expectations.
Surveillance bias, also known as detection bias, occurs when there is an increased detection or identification of disease cases due to active surveillance or heightened awareness, rather than a genuine increase in disease incidence. This can often result in an inflated perception of disease occurrence solely because of increased monitoring efforts, known as the Hawthorne Effect. The Hawthorne Effect and strategies to mitigate it were discussed earlier in this text.
Other mitigation strategies include:
Randomly allocating individuals to surveillance and control groups to eliminate biases caused by selective observation. This method helps in comparing outcomes without participants knowing they are part of a study.
Employing blinding methodologies where possible, such as single-blind or double-blind approaches, to prevent observation bias. Blinding prevents the observers or researchers from being influenced by prior knowledge or expectations, ensuring more objective observations.
Establishing standardized and consistent monitoring procedures. Using predefined criteria for disease identification and clear guidelines for surveillance can help minimize bias by ensuring uniformity in disease detection practices across different observation periods.
Confounding bias occurs when a third variable distorts the observed relationship between the exposure and the outcome, leading to incorrect associations.
To mitigate confounding bias:
Employ matching techniques in study design to create comparable groups based on potential confounders. Matching ensures that both exposed and unexposed groups are similar concerning potential confounders, reducing the impact of these variables on the association being studied.
Stratify the analysis based on potential confounders. This involves analyzing data separately within subgroups defined by the potential confounder. By examining data within strata, researchers can better understand how the confounder affects the exposure-outcome relationship.
Utilize multivariate statistical techniques like regression models that adjust for potential confounders. Incorporating these variables into the analysis allows researchers to control for their effects, isolating the relationship between the exposure and outcome more accurately. Covariate adjustment is a specific component of multivariate analysis. It involves incorporating potential confounding variables (covariates) into statistical models to estimate the effect of an exposure or treatment on an outcome while controlling for the influence of other variables. It is a technique used in regression models to account for the effects of confounding variables, enhancing the accuracy of estimating the relationship between the exposure and outcome.
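As one concrete illustration of stratified adjustment, the sketch below computes a Mantel-Haenszel pooled odds ratio across strata of a potential confounder. The two strata and their cell counts are hypothetical, invented only to show the technique.

```python
# Mantel-Haenszel pooled odds ratio across confounder strata.
# Each stratum is an (a, b, c, d) tuple from its own 2x2 table.

def mantel_haenszel_or(strata):
    """OR_MH = sum(a*d/n) / sum(b*c/n), summed over strata of size n."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

hypothetical_strata = [
    (40, 60, 20, 80),   # e.g. younger participants
    (25, 25, 15, 35),   # e.g. older participants
]
print(f"Adjusted OR = {mantel_haenszel_or(hypothetical_strata):.2f}")  # ~2.54
```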
Additional strategies to improve internal and external validity and to mitigate biases are discussed elsewhere in this text.
Summary
In research, various risks are associated with diseases, and specific measures are used to gauge these associations. The Relative Risk (RR) and Odds Ratio (OR) are key metrics utilized to assess the correlation between risk factors and the occurrence of a disease or condition. The RR compares the risk of developing the disease between individuals exposed and unexposed to a risk factor, while the OR compares the odds of exposure between those with and without the disease. Researchers endeavor to identify, mitigate, or control biases through robust study designs, appropriate methodologies, blinding techniques, randomization, and transparent reporting practices. Understanding biases is fundamental for conducting high-quality research, ensuring reliability, and bolstering the credibility of scientific findings.