When are colloquial uses of technical statistical terms OK?

As someone trained in mathematics, I tend to use words precisely, as I would in a mathematical proof. For example, the word “function” has a particular definition in mathematics that is related to but different from the common definition. Statistics has the same issue: words that have careful mathematical definitions also have colloquial definitions. In my experience working with collaborators, I’ve noticed that the colloquial and technical definitions are often used interchangeably. A common example is the use of “correlation” as a synonym for “association” or “relationship” rather than for a statistical correlation, e.g., Pearson’s correlation.

Here is a specific example: The word “risk” has the common definition of exposure to harm. I could say that if I smoke I’m putting myself at greater risk of lung cancer. In statistics, there are different tools for measuring what could be considered risk, e.g., the odds ratio, relative risk, or hazard ratio. Using this example, is it OK to report a hazard ratio but interpret it using the word risk?
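To make the point that these measures are not interchangeable numbers, here is a minimal sketch that computes a relative risk and an odds ratio from the same 2x2 table. The counts are hypothetical values chosen for illustration, not real smoking data, and the function names are my own.

```python
def risk_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Relative risk: ratio of event proportions in the two groups."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    return risk_exposed / risk_unexposed


def odds_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Odds ratio: ratio of event odds (events / non-events) in the two groups."""
    odds_exposed = events_exposed / (n_exposed - events_exposed)
    odds_unexposed = events_unexposed / (n_unexposed - events_unexposed)
    return odds_exposed / odds_unexposed


# Hypothetical counts (assumption): 30 of 100 exposed and 10 of 100 unexposed have the event.
rr = risk_ratio(30, 100, 10, 100)
oratio = odds_ratio(30, 100, 10, 100)
print(f"relative risk = {rr:.2f}, odds ratio = {oratio:.2f}")
```

Both numbers describe “risk” in the colloquial sense, yet they differ for the same data, which is why the reader needs to know which one is being reported.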

Adding specific values to the previous example, suppose I fit a Cox proportional hazards model to data and find a hazard ratio of 0.80 for treatment vs control (with control as the baseline). This value indicates a decrease in hazard, interpreted as “at any given time, a patient in the treatment group has a 20% lower hazard than a patient in the control group.” Is it OK to say in the discussion section that the treatment lowers the risk, or even that the treatment lowers the probability of the event?

Technically, hazard is not risk; this post explains well the issue of conflating relative risk with the hazard ratio. It concludes that it’s OK to interpret a hazard ratio as a relative risk since their meanings are sufficiently close.
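One way to see how close the two quantities are is a rough sketch under the proportional hazards assumption, where the treatment survival function equals the control survival function raised to the power of the hazard ratio. The code below converts a hazard ratio of 0.80 into the implied relative risk at a fixed time point for several control-group risks. The control risks are hypothetical values I picked for illustration, and the function name is my own.

```python
def relative_risk_from_hazard_ratio(hazard_ratio, control_risk):
    """Relative risk at a fixed time implied by a constant hazard ratio.

    Assumes proportional hazards, so S_treatment(t) = S_control(t) ** hazard_ratio.
    control_risk is the control group's cumulative event probability by time t.
    """
    control_survival = 1.0 - control_risk
    treatment_survival = control_survival ** hazard_ratio
    treatment_risk = 1.0 - treatment_survival
    return treatment_risk / control_risk


HAZARD_RATIO = 0.80  # value from the example in this post

# Hypothetical control-group risks (assumption), from a rare to a common event.
for control_risk in (0.05, 0.25, 0.50, 0.90):
    rr = relative_risk_from_hazard_ratio(HAZARD_RATIO, control_risk)
    print(f"control risk {control_risk:.2f} -> relative risk {rr:.3f}")
```

Under these assumptions the relative risk sits close to 0.80 when the event is rare and drifts toward 1 as the event becomes common, so how safe the looser wording is depends on how frequent the event is over the follow-up period.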

At this point I guess I’m OK with colloquial use of statistical terms in the reporting of statistical results as long as the precise values and terms are also reported (e.g., a table of parameter estimates) so the reader has a clear understanding of the parameter used. Maybe use statistical terms precisely in the Results section but then allow looser language in the Discussion? What do you think?
