The table below is from a study by Hayat and colleagues. It illustrates one common trend regarding cancer: its incidence increases dramatically with age. With some exceptions, such as Hodgkin's lymphoma, there is a significant increase in risk, particularly after 50 years of age.
So I decided to get state data from the US Census web site on the percentage of seniors (age 65 or older) by state and on cancer diagnoses per 1,000 people. I was able to get some recent data, for 2011.
I analyzed the data with WarpPLS (version 4.0 has just been released), generating the types of coefficients that would normally be reported by researchers who wanted to make an effect appear very strong.
In this case, the effect would be essentially of population aging on cancer incidence (assessed indirectly), summarized in the graph below. The graph was generated by WarpPLS. The scales are standardized, and so are the coefficients of association in the two segments shown. As you can see, the coefficients of association increase as we move along the horizontal scale, because this is a nonlinear relationship. The overall coefficient of association, which is a weighted average of the two betas shown, is 0.84. The probability that this is a false positive is less than 1 percent.
A beta coefficient of 0.84 essentially means that a 1 standard deviation variation in the percentage of seniors in a state is associated with an overall variation of 0.84 standard deviations in cancer diagnoses; that is, an 84 percent increase, taking the standardized unit of the number of cancer diagnoses as the baseline. This sounds very strong and would usually be presented as an enormous effect. Since the standard deviation for the percentage of seniors in various states is 1.67, one could say that for each increment of 1.67 percentage points in the percentage of seniors in a state the number of cancer diagnoses goes up by 84 percent.
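To see what a standardized coefficient implies in the original units, it can be converted with the usual linear formula (slope = beta × SD of outcome / SD of predictor). The sketch below is only an illustration: the standard deviation of cancer diagnoses is not reported in this post, so `SD_DIAGNOSES` is a hypothetical placeholder, and the function name is my own. It also treats the overall beta as if the relationship were linear, which is only an approximation here.

```python
def unstandardized_slope(beta, sd_x, sd_y):
    """Convert a standardized coefficient into original units.

    Returns the change in y (original units) associated with a
    one-unit change in x (original units), assuming linearity.
    """
    if sd_x <= 0 or sd_y <= 0:
        raise ValueError("standard deviations must be positive")
    return beta * sd_y / sd_x


BETA = 0.84          # overall coefficient reported in the post
SD_SENIORS = 1.67    # SD of the percentage of seniors, from the post
SD_DIAGNOSES = 0.5   # HYPOTHETICAL placeholder: not reported in the post

# Change in diagnoses per 1,000 for a 1 SD change in seniors.
change_per_sd = BETA * SD_DIAGNOSES

# Change in diagnoses per 1,000 for each extra percentage point of seniors.
slope = unstandardized_slope(BETA, SD_SENIORS, SD_DIAGNOSES)

print(f"per 1 SD of seniors: {change_per_sd:.2f} diagnoses per 1,000")
print(f"per percentage point of seniors: {slope:.2f} diagnoses per 1,000")
```

Whatever the real standard deviation turns out to be, the point is the same: "84 percent" refers to a fraction of a standardized unit, not to an 84 percent rise in the number of people diagnosed.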
Effects expressed in percentages can sometimes give a very misleading picture. For example, let us consider an increase in mortality due to a disease from 1 to 2 cases for each 1 million people. This essentially is a 100 percent increase! Moreover, the closer the baseline is to zero, the more impressive the effect becomes, since the percentage increase is calculated by dividing the increment by the baseline number. As the baseline number approaches zero, the percentage increase from the baseline approaches infinity.
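The arithmetic behind this can be sketched in a few lines. The function names below are my own, and the shrinking baselines in the loop are illustrative values, not data.

```python
def relative_increase_pct(baseline, new):
    """Percentage increase from baseline to new."""
    if baseline == 0:
        raise ValueError("relative increase is undefined for a zero baseline")
    return (new - baseline) / baseline * 100.0


def absolute_increase(baseline, new):
    """Plain difference, in the same units as the inputs."""
    return new - baseline


# Mortality going from 1 to 2 cases per 1 million people.
print(relative_increase_pct(1, 2))  # 100.0
print(absolute_increase(1, 2))      # 1 extra case per million

# The same absolute increase of 1 looks larger and larger
# in percentage terms as the baseline shrinks toward zero.
for baseline in (10, 1, 0.1, 0.01):
    pct = relative_increase_pct(baseline, baseline + 1)
    print(f"baseline {baseline}: {pct:.0f} percent increase")
```

The absolute increase is identical in every row of the loop; only the denominator changes.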
Now let us take a look at the graph below, also generated by WarpPLS. Here the scales are unstandardized, which means that they refer to the original measures in their respective original scales. (Standardization makes the variables dimensionless, which is sometimes useful when the original measurement scales are not comparable – e.g., dollars vs. meters.) As you can see here, the number of cancer diagnoses per 1,000 people goes from a low of 3.74 in Utah to a high of 6.64 in Maine.
One may be tempted to explain the increase in cancer diagnoses that we see on this graph based on various factors (e.g., lifestyle), but the percentage of seniors in a state seems like a very good and reasonable predictor. You may say: This is very depressing. You may be even more depressed if I tell you that controlling for state obesity rates does not change this picture at all.
But look at what these numbers really mean. What we see here is an increase in cancer diagnoses per 1,000 people of less than 3. In other words, there is a minute increase of less than 3 diagnoses for each group of 1,000 people considered. It certainly feels terrible if you are one of the 3 diagnosed, but it is still a minute increase.
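A quick check of these numbers, using only the two extremes quoted above (3.74 in Utah and 6.64 in Maine), shows how the same gap reads in absolute and relative terms. The function and variable names are my own.

```python
UTAH_PER_1000 = 3.74   # lowest rate quoted in the post
MAINE_PER_1000 = 6.64  # highest rate quoted in the post


def summarize_gap(low, high):
    """Return the gap between two rates per 1,000 in three forms."""
    absolute = high - low                 # extra diagnoses per 1,000 people
    relative_pct = absolute / low * 100   # percentage increase over the low rate
    share_pct = absolute / 1000 * 100     # the gap as a percentage of the population
    return absolute, relative_pct, share_pct


absolute, relative_pct, share_pct = summarize_gap(UTAH_PER_1000, MAINE_PER_1000)
print(f"absolute gap: {absolute:.2f} per 1,000 people")
print(f"relative gap: {relative_pct:.1f} percent")
print(f"share of population: {share_pct:.2f} percent")
```

The relative gap is large, while the same gap amounts to well under half a percent of the population.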
Also note that one of the scales, for diagnoses, refers to increments of 1 in 1,000, while the other, for seniors, refers to increments of 1 in 100. This leads to an interesting effect. If you move from Alaska to Florida you will see a significant increase in the number of seniors around, as the difference in the percentage of seniors between these two states is about 10 percentage points. However, the difference in the number of cancer diagnoses will not be even close to the difference in the presence of seniors.
The situation above is very common in medical research. An effect that is fundamentally tiny is stated in such a way that the general public has the impression that the effect is enormous. Often the reason is not to promote a drug, but to attract media attention to a research group or organization.
When you look at the actual numbers, the magnitude of the effect is such that it would go unnoticed in real life. By real life I mean something like: "John, since we moved from Alaska to Maine I have been seeing a lot more people of my age being diagnosed with cancer." An effect of the order of 3 in 1,000 would not normally be noticed in real life by someone whose immediate circle of regular acquaintances included fewer than 333 people (about 1,000 divided by 3).
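The 333 figure comes from simple division, sketched below. The function names are my own, and the circle of 150 acquaintances is a hypothetical value chosen only for illustration.

```python
def circle_size_for_one_extra_case(extra_per_1000):
    """Number of people needed to expect one additional diagnosis."""
    if extra_per_1000 <= 0:
        raise ValueError("extra_per_1000 must be positive")
    return 1000 / extra_per_1000


def expected_extra_cases(circle_size, extra_per_1000):
    """Expected additional diagnoses within a circle of a given size."""
    return circle_size * extra_per_1000 / 1000


# An effect of the order of 3 extra diagnoses per 1,000 people.
print(round(circle_size_for_one_extra_case(3)))  # 333

# HYPOTHETICAL circle of 150 regular acquaintances.
print(expected_extra_cases(150, 3))  # 0.45
```

With a circle that size, the expected number of additional diagnoses is less than one, which is why the difference would normally go unnoticed.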
But thanks to Facebook, things are changing … to be fair, the traditional news media (particularly television) tends to increase perceived effects a lot more than social media, often in a very stressful way.