Biostatistics Glossary

25 essential terms, because precise language is the foundation of clear thinking in biostatistics.


Alpha (Significance Level): The pre-specified probability threshold (commonly 0.05) for rejecting the null hypothesis. It represents the maximum acceptable probability of committing a Type I error.

Analysis of Variance (ANOVA): A statistical method for comparing means across three or more groups by partitioning the total variability in the data into between-group and within-group components.
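The partition into between-group and within-group variability can be made concrete with a small calculation. The following is a minimal sketch of a one-way layout in plain Python; the function name `one_way_anova` and the three groups of values are invented for illustration and do not come from any particular study.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (ss_between, ss_within, f_statistic) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_statistic

# Hypothetical measurements in three groups.
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
ss_b, ss_w, f_stat = one_way_anova(groups)
print(ss_b, ss_w, f_stat)  # 24.0 6.0 12.0
```

The two sums of squares add up to the total sum of squares (30 here), which is the partition the definition refers to. Turning the F statistic into a p-value requires the F distribution, which this sketch leaves out.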

Bayesian Statistics: A statistical framework that updates the probability of a hypothesis as new evidence is obtained, using Bayes' theorem to combine prior beliefs with observed data to produce posterior probabilities.
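A diagnostic test is a common way to illustrate the update from prior to posterior. The sketch below applies Bayes' theorem to a positive test result; the function name and the prevalence, sensitivity, and specificity values are assumptions chosen only for the example.

```python
def posterior_probability(prior, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem."""
    p_pos_given_disease = sensitivity
    p_pos_given_healthy = 1 - specificity
    # Total probability of a positive result (the "evidence").
    evidence = prior * p_pos_given_disease + (1 - prior) * p_pos_given_healthy
    return prior * p_pos_given_disease / evidence

# Hypothetical inputs: 1% prior (prevalence), 90% sensitivity, 95% specificity.
post = posterior_probability(prior=0.01, sensitivity=0.90, specificity=0.95)
print(round(post, 3))  # 0.154
```

Even with a fairly accurate test, the posterior stays modest because the prior is low, which is the kind of result the prior-plus-data combination is meant to expose.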

Bias: A systematic error that leads to an incorrect estimate of the association between exposure and outcome. Common types include selection bias, information bias, and confounding.

Blinding: A procedure in clinical trials where participants, investigators, or both are kept unaware of treatment assignments to prevent bias in treatment delivery and outcome assessment.

Censoring: A situation in survival analysis where the time to the event of interest is incompletely observed, often because the study ended or the participant was lost to follow-up.

Clinical Trial Phases: The sequential stages of testing a new therapy: Phase I (safety and dosage), Phase II (efficacy and side effects), Phase III (large-scale comparison with standard treatment), and Phase IV (post-marketing surveillance).

Confounding: Distortion of the estimated association between an exposure and an outcome caused by a third variable that is related to both.

Cox Proportional Hazards Model: A semi-parametric regression model used in survival analysis to estimate the effect of covariates on the hazard rate while assuming proportional hazards over time.

Cross-Sectional Study: An observational study that collects data on exposure and outcome at a single point in time. It measures prevalence but cannot establish temporal relationships or causation.

Effect Size: A quantitative measure of the magnitude of a phenomenon or treatment effect, independent of sample size. Common measures include Cohen's d, odds ratios, and relative risks.
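As one example of such a measure, Cohen's d expresses a difference in means in units of the pooled standard deviation. The sketch below is a minimal version assuming two independent samples; the function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(a, b):
    """Standardized mean difference using the pooled sample variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

# Hypothetical outcome scores in a treated and a control group.
treated = [6.0, 7.0, 8.0, 9.0, 10.0]
control = [4.0, 5.0, 6.0, 7.0, 8.0]
d = cohens_d(treated, control)
print(round(d, 2))  # 1.26
```

Because the difference is divided by a standard deviation rather than a standard error, d does not grow simply because more participants are enrolled.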

Epidemiology: The study of the distribution, determinants, and frequency of disease in human populations. Biostatistics provides the analytical tools used in epidemiological research.

False Discovery Rate (FDR): The expected proportion of false positives among all rejected null hypotheses. Controlling FDR is an alternative to controlling the family-wise error rate in multiple testing scenarios.
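One widely used way to control the FDR is the Benjamini-Hochberg step-up procedure. The definition above does not name a specific method, so the sketch below should be read as one possible illustration; the function name and the p-values are made up for the example.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the null is rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_k = rank
    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected

# Hypothetical p-values from ten tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p_values, q=0.05)
print(sum(rejected))  # 2
```

Testing each hypothesis at an unadjusted 0.05 would reject five of these ten nulls; the procedure keeps only the two smallest p-values.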

Incidence Rate: The number of new cases of a disease occurring per unit of person-time at risk in a defined population over a specified period.
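The person-time denominator is easiest to see with individual follow-up records. The sketch below assumes each participant contributes time at risk until the event or the end of follow-up; the function name and the data are hypothetical.

```python
def incidence_rate(follow_up_years, had_event):
    """New cases divided by total person-years at risk."""
    cases = sum(had_event)
    person_years = sum(follow_up_years)
    return cases / person_years

# Hypothetical cohort of eight people: years at risk and whether a new case occurred.
follow_up_years = [2.0, 5.0, 3.5, 5.0, 1.5, 4.0, 5.0, 4.0]
had_event = [True, False, True, False, True, False, False, False]

rate = incidence_rate(follow_up_years, had_event)
print(round(rate * 1000))  # 100 cases per 1,000 person-years
```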

Intention-to-Treat (ITT) Analysis: An analysis approach in randomized controlled trials (RCTs) where all participants are analyzed in the group to which they were randomized, regardless of adherence, to preserve the integrity of randomization.

Kaplan-Meier Curve: A step-function graph produced by the Kaplan-Meier estimator that displays the estimated probability of survival over time, accounting for censored observations.
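The estimator behind the curve multiplies, at each event time, the proportion of at-risk participants who survive that time. The sketch below is a minimal plain-Python version; the function name and the six follow-up records are invented for illustration, and participants censored at an event time are assumed to still be at risk at that time.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up time per participant.
    events: 1 if the event was observed, 0 if the observation was censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Events and censored observations both leave the risk set.
        n_at_risk -= removed
    return curve

# Hypothetical follow-up in months; 0 marks a censored observation.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
print([(t, round(s, 3)) for t, s in curve])
# [(1, 0.833), (2, 0.667), (3, 0.444), (5, 0.0)]
```

Censored participants never cause a drop in the curve, but they shrink the risk set, so later events produce larger steps.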

Likelihood Function: A function that measures how probable the observed data are for different values of a statistical parameter. Maximum likelihood estimation finds the parameter values that maximize this function.
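For a binomial outcome, the likelihood can be evaluated directly over a grid of candidate parameter values. The sketch below is a crude grid search rather than a production optimizer; the function name and the data (7 responders among 20 patients) are assumptions for illustration.

```python
from math import comb

def binomial_likelihood(p, successes, trials):
    """Probability of the observed count, viewed as a function of p."""
    return comb(trials, successes) * p ** successes * (1 - p) ** (trials - successes)

# Hypothetical data: 7 responders among 20 patients.
successes, trials = 7, 20

# Evaluate the likelihood on a grid of candidate response probabilities.
grid = [i / 100 for i in range(1, 100)]
mle = max(grid, key=lambda p: binomial_likelihood(p, successes, trials))
print(mle)  # 0.35
```

The maximizing value matches the sample proportion 7/20, which is the known closed-form maximum likelihood estimate for a binomial proportion.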

Meta-Analysis: A quantitative statistical technique that combines results from multiple independent studies to produce an overall summary estimate with greater precision and power.
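One simple way to combine study results is fixed-effect inverse-variance weighting, where more precise studies receive more weight. This is only one of several pooling models (random-effects models are a common alternative), and the function name, estimates, and standard errors below are hypothetical.

```python
from math import sqrt

def fixed_effect_pool(estimates, standard_errors):
    """Inverse-variance weighted average and its standard error."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    return pooled, pooled_se

# Hypothetical log odds ratios and standard errors from three studies.
estimates = [0.2, 0.4, 0.3]
standard_errors = [0.2, 0.1, 0.2]

pooled, pooled_se = fixed_effect_pool(estimates, standard_errors)
print(round(pooled, 3), round(pooled_se, 3))  # 0.35 0.082
```

The pooled standard error is smaller than that of any single study, which is the gain in precision the definition describes.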

Multicollinearity: A condition in regression analysis where two or more predictor variables are highly correlated, making it difficult to isolate the individual effect of each predictor.
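A common diagnostic is the variance inflation factor (VIF). With exactly two predictors it reduces to 1 / (1 - r^2), where r is their Pearson correlation. The sketch below covers only that two-predictor special case; the function names and the weight and BMI values are invented for illustration.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)

# Hypothetical predictors: body weight (kg) and body mass index.
weight_kg = [60, 65, 70, 75, 80]
bmi = [21, 23, 24, 27, 28]

vif = vif_two_predictors(weight_kg, bmi)
print(round(vif, 1))  # 41.5
```

Rules of thumb vary, but a VIF above roughly 5 to 10 is often treated as a warning sign; with more than two predictors, each VIF comes from regressing that predictor on all the others.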

Normal Distribution: A symmetric, bell-shaped probability distribution characterized by its mean and standard deviation. Many biological measurements approximate a normal distribution, and many statistical tests assume normality.
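Python's standard library includes `statistics.NormalDist`, which is enough to explore how the mean and standard deviation determine probabilities. The sketch below assumes a hypothetical measurement with mean 120 and standard deviation 15; those numbers are illustrative, not reference values.

```python
from statistics import NormalDist

# Hypothetical measurement: mean 120, standard deviation 15.
dist = NormalDist(mu=120, sigma=15)

# Probability of falling within 1.96 standard deviations of the mean.
within = dist.cdf(120 + 1.96 * 15) - dist.cdf(120 - 1.96 * 15)

# Value below which 97.5% of the distribution lies.
upper_limit = dist.inv_cdf(0.975)

print(round(within, 2), round(upper_limit, 1))  # 0.95 149.4
```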

Number Needed to Treat (NNT): The average number of patients who need to be treated to prevent one additional adverse outcome, calculated as the reciprocal of the absolute risk reduction.
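The reciprocal relationship is a one-line calculation once the event rates in each arm are known. The sketch below uses exact fractions to avoid floating-point rounding and rounds up to a whole patient, as is conventional; the function name and the trial counts are hypothetical.

```python
from fractions import Fraction
from math import ceil

def number_needed_to_treat(control_events, control_n, treated_events, treated_n):
    """Reciprocal of the absolute risk reduction, rounded up to a whole patient."""
    control_rate = Fraction(control_events, control_n)
    treated_rate = Fraction(treated_events, treated_n)
    arr = control_rate - treated_rate  # absolute risk reduction
    if arr <= 0:
        raise ValueError("No absolute risk reduction; NNT is undefined here.")
    return ceil(1 / arr)

# Hypothetical trial: 40/200 events under control, 30/200 under treatment.
nnt = number_needed_to_treat(40, 200, 30, 200)
print(nnt)  # 20
```

Here the absolute risk reduction is 20% minus 15%, or 5 percentage points, so about 20 patients would need treatment to prevent one event.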

Odds Ratio (OR): The ratio of the odds of an event occurring in one group to the odds in another. It is the primary measure of association in case-control studies.
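For a 2x2 table the ratio simplifies to a cross-product. The sketch below also adds an approximate 95% confidence interval using the log-based (Woolf) method, which is one common choice and not something the definition prescribes; the function name and the cell counts are hypothetical.

```python
from math import exp, log, sqrt

def odds_ratio(a, b, c, d):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a: exposed with event,   b: exposed without event
    c: unexposed with event, d: unexposed without event
    """
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf method).
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(estimate) - 1.96 * se_log)
    upper = exp(log(estimate) + 1.96 * se_log)
    return estimate, lower, upper

# Hypothetical counts.
or_estimate, ci_lower, ci_upper = odds_ratio(a=30, b=70, c=10, d=90)
print(round(or_estimate, 2))  # 3.86
```

The formula breaks down if any cell is zero; a continuity correction is a frequent workaround that this sketch omits.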

Per-Protocol Analysis: An analysis restricted to participants who completed the trial according to the protocol, in contrast to the intention-to-treat approach. It estimates the effect under ideal adherence but may introduce bias.

Prevalence: The proportion of a population that has a particular disease or condition at a specific point in time (point prevalence) or over a defined period (period prevalence).

Regression to the Mean: A statistical phenomenon in which extreme measurements tend to be followed by measurements closer to the average on subsequent occasions, regardless of any intervention.
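A short simulation makes the phenomenon visible without any intervention at all. The sketch below assumes each person has a stable true value plus independent measurement noise on each occasion; the means, standard deviations, sample size, and 10% selection threshold are arbitrary choices for illustration.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is reproducible

# Each person has a stable true value; each measurement adds independent noise.
true_values = [random.gauss(120, 10) for _ in range(10_000)]
first = [t + random.gauss(0, 10) for t in true_values]
second = [t + random.gauss(0, 10) for t in true_values]

# Select the people whose FIRST measurement was in the top 10%.
cutoff = sorted(first)[int(0.9 * len(first))]
selected = [i for i, value in enumerate(first) if value >= cutoff]

mean_first = mean(first[i] for i in selected)
mean_second = mean(second[i] for i in selected)
print(round(mean_first, 1), round(mean_second, 1))
```

The selected group's second average falls back toward the population mean of 120 even though nothing was done to them, which is why an uncontrolled before-and-after comparison in a group chosen for extreme values can mimic a treatment effect.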