{"id":209,"date":"2021-07-04T17:08:26","date_gmt":"2021-07-04T17:08:26","guid":{"rendered":"https:\/\/darrengemoets.me\/wp\/?p=209"},"modified":"2021-07-04T17:11:39","modified_gmt":"2021-07-04T17:11:39","slug":"p-values-are-not-transferable","status":"publish","type":"post","link":"https:\/\/darrengemoets.me\/wp\/2021\/07\/04\/p-values-are-not-transferable\/","title":{"rendered":"P values are not transferable"},"content":{"rendered":"<p>Of the many challenges I encounter in communicating statistical concepts to collaborators, a common one is that p-values are seen as \u201ctransferable.\u201d Here is a simple example to illustrate what I mean.<\/p>\n<p>Consider the following <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/darrengemoets.me\/wp\/wp-content\/ql-cache\/quicklatex.com-6fb4b11d19674b659ef303fc121f207b_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#50;&#32;&#92;&#116;&#105;&#109;&#101;&#115;&#32;&#51;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"40\" style=\"vertical-align: 0px;\"\/> table, where the rows are sex (Male and Female) and the columns are smoking status (Current, Former, Never). We are interested in determining if the sex variable is independent of the smoking status variable. 
The counts are the number of observations in each category followed by relative frequencies.<\/p>\n<pre><code class=\"r\">dat.synth1 &lt;- matrix(c(55,45,70,50,50,70),nrow=2)\nrownames(dat.synth1) &lt;- c(\"M\",\"F\")\ncolnames(dat.synth1) &lt;- c(\"Current\",\"Former\",\"Never\")\ndat.synth1\n<\/code><\/pre>\n<pre>##   Current Former Never\n## M      55     70    50\n## F      45     50    70\n<\/pre>\n<pre><code class=\"r\">dat.synth1\/sum(dat.synth1)\n<\/code><\/pre>\n<pre>##     Current    Former     Never\n## M 0.1617647 0.2058824 0.1470588\n## F 0.1323529 0.1470588 0.2058824\n<\/pre>\n<p>I perform a <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/darrengemoets.me\/wp\/wp-content\/ql-cache\/quicklatex.com-4cbe148bd9da75d2e668ddd0223dbf9f_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#99;&#104;&#105;&#94;&#50;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"18\" style=\"vertical-align: -4px;\"\/> Test of Independence, with p-value given below. I would conclude that the sex variable is not independent of the smoking variable.<\/p>\n<pre><code class=\"r\">chisq.test(dat.synth1)\n<\/code><\/pre>\n<pre>## \n##  Pearson's Chi-squared test\n## \n## data:  dat.synth1\n## X-squared = 7.3789, df = 2, p-value = 0.02499\n<\/pre>\n<p>This test for association does not \u201ctransfer\u201d to other hypotheses, e.g., marginal quantities. 
For example, in the sample 55 out of 175 males (<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/darrengemoets.me\/wp\/wp-content\/ql-cache\/quicklatex.com-4c5427ea5fb7ebae9eb00565bcb39791_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#97;&#112;&#112;&#114;&#111;&#120;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"13\" style=\"vertical-align: 1px;\"\/> 31.4%) and 45 out of 165 females (<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/darrengemoets.me\/wp\/wp-content\/ql-cache\/quicklatex.com-4c5427ea5fb7ebae9eb00565bcb39791_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#97;&#112;&#112;&#114;&#111;&#120;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"13\" style=\"vertical-align: 1px;\"\/> 27.3%) are current smokers. The following would be incorrect to report: \u201cWe find that smoking status is not independent of sex (p=0.025), thus the proportion of males that are current smokers is different from the proportion of females that are current smokers.\u201d<\/p>\n<p>We can&#8217;t use the p-value from the <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/darrengemoets.me\/wp\/wp-content\/ql-cache\/quicklatex.com-4cbe148bd9da75d2e668ddd0223dbf9f_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#99;&#104;&#105;&#94;&#50;\" title=\"Rendered by QuickLaTeX.com\" height=\"19\" width=\"18\" style=\"vertical-align: -4px;\"\/> Test of Independence for any \u201cpost-hoc\u201d type test on each proportion, as is done in the second part of that sentence. 
To assess the second claim, we test whether the proportion of males that are current smokers is different from the proportion of females that are current smokers.<\/p>\n<pre><code class=\"r\">prop.test(c(55,45),c(175,165))\n<\/code><\/pre>\n<pre>## \n##  2-sample test for equality of proportions with continuity correction\n## \n## data:  c(55, 45) out of c(175, 165)\n## X-squared = 0.5205, df = 1, p-value = 0.4706\n## alternative hypothesis: two.sided\n## 95 percent confidence interval:\n##  -0.06101684  0.14413372\n## sample estimates:\n##    prop 1    prop 2 \n## 0.3142857 0.2727273\n<\/pre>\n<p>So we can&#8217;t conclude a difference in the two proportions in the population. The main issue is that hypotheses should be clearly established a priori, and not generated and then tested using the same data. Or put another way, performing many tests after seeing the data is p-hacking.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Of the many challenges I encounter in communicating statistical concepts to collaborators, a common one is that p-values are seen as \u201ctransferable.\u201d Here is a simple example to illustrate what I mean. Consider the following table, where the rows are sex (Male and Female) and the columns are smoking status (Current, Former, Never). 
We are&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[28,26],"class_list":["post-209","post","type-post","status-publish","format-standard","hentry","category-statistics","tag-chi-square","tag-pvalues"],"_links":{"self":[{"href":"https:\/\/darrengemoets.me\/wp\/wp-json\/wp\/v2\/posts\/209","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/darrengemoets.me\/wp\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/darrengemoets.me\/wp\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/darrengemoets.me\/wp\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/darrengemoets.me\/wp\/wp-json\/wp\/v2\/comments?post=209"}],"version-history":[{"count":4,"href":"https:\/\/darrengemoets.me\/wp\/wp-json\/wp\/v2\/posts\/209\/revisions"}],"predecessor-version":[{"id":214,"href":"https:\/\/darrengemoets.me\/wp\/wp-json\/wp\/v2\/posts\/209\/revisions\/214"}],"wp:attachment":[{"href":"https:\/\/darrengemoets.me\/wp\/wp-json\/wp\/v2\/media?parent=209"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/darrengemoets.me\/wp\/wp-json\/wp\/v2\/categories?post=209"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/darrengemoets.me\/wp\/wp-json\/wp\/v2\/tags?post=209"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}