The media is abuzz about a recent study that found that the gender pay gap for newly trained physicians is widening compared to ten years ago. Adjusted for specialty and hours worked, new female physicians made an unexplained average of $16,819 less per year than new male physicians in 2008.
These adjustments counter the claim that the pay gap exists because women go into lower-paying specialties and tend to work fewer hours. That's good.
What's not so good is that the authors favor some hypotheses over others when explaining the gap. They speculate that women are intentionally choosing lower-paying jobs because these jobs provide greater flexibility and family-friendly benefits, such as not being on call after certain hours. Women may negotiate these conditions of employment, which come at the price of commensurately lower pay. This is certainly a fair hypothesis.
This is also the hypothesis that the media is picking up on. The subtext: the gap is intentional, not imposed; unfortunate, but not unjust.
But this theory is perhaps being accepted too widely.
Why do the authors prefer this explanation over others in the first place?
The authors state that they cannot rule out other theories, such as gender discrimination and women being worse negotiators than men. Their main reason for calling these theories inconsistent with the observed data hinges on a single, pivotal point: that, in 1999, using the same adjustments, starting salary differences between men and women were not statistically significant.
"...we are unwilling to accept the theory that women have become worse negotiators in recent years," the authors write.
"...it would be difficult to believe that discrimination, after a period of quiescence, has actually been on the rise in recent years," they also write.
"...by the late 1990s, women and men earned roughly equivalent salaries after observable factors were adjusted for," they add.
I think we need to look at 1999 more closely.
Time to get back to basic stats: what determines a significant difference? The usual answer is a p-value below 0.05, a threshold that is arbitrary but widely accepted. It means that if there were truly no difference, a result at least as large as the one observed would turn up less than 5% of the time through chance alone.
In 1999--without adjusting for specialty or work hours--new women physicians earned an average of $151,600 versus $173,400 for men (a 12.5% salary difference). About 17% of this difference ($3,600) remained after adjustments.
In 2008, women earned $174,000 compared to men's $209,300 (a 17% difference). Roughly half of this difference ($16,819) remained after adjustments. Clearly, the unexplained adjusted starting gap widened.
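The percentages above are easy to re-derive. Here is a minimal sketch of the arithmetic in Python; the function name `gap_summary` is made up for illustration, and expressing the raw gap as a share of the men's salary is an assumption that happens to reproduce the 12.5% and 17% figures quoted above.

```python
def gap_summary(women, men, adjusted_gap):
    """Summarize a pay gap from average salaries and the adjusted (unexplained) gap."""
    raw_gap = men - women
    return {
        "raw_gap": raw_gap,
        # Assumption: the raw gap is expressed relative to the men's average salary.
        "raw_gap_pct": 100 * raw_gap / men,
        # Share of the raw gap that remains after adjusting for specialty and hours.
        "unexplained_share_pct": 100 * adjusted_gap / raw_gap,
    }

# Salary figures as quoted in this post.
y1999 = gap_summary(women=151_600, men=173_400, adjusted_gap=3_600)
y2008 = gap_summary(women=174_000, men=209_300, adjusted_gap=16_819)

for year, s in (("1999", y1999), ("2008", y2008)):
    print(
        f"{year}: raw gap ${s['raw_gap']:,} "
        f"({s['raw_gap_pct']:.1f}% of men's salary), "
        f"{s['unexplained_share_pct']:.1f}% of it unexplained"
    )
```

The raw gap grew from $21,800 to $35,300, and the unexplained share of it grew from about a sixth to almost half.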
But the unexplained starting salary difference between men and women in 1999 was not found to be statistically significant. Why? The p-value was 0.08. In other words, if there were truly no gap, a difference at least this large would show up by chance only about 8% of the time. But, in the world of statistical significance, 8% is simply not 5%. (In contrast, p < 0.001 in 2008.)
So, if we ran this study 100 times in a world where new male and female physicians were actually paid the same, only about 8 of those runs would show a gap as large as the one found in 1999.
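A small simulation can make the meaning of a p-value cutoff concrete. The sketch below imagines a world with no true pay gap and counts how often chance alone produces a "significant" result. Everything in it is a made-up illustration, not the study's data or method: the sample size, average salary, salary spread, and the simple z-test are all assumptions.

```python
import random
from statistics import NormalDist, mean, stdev

def two_sided_p(a, b):
    """Two-sided p-value from a large-sample z-test on the difference in means."""
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def simulate_null(n_studies=2000, n_per_group=200, mu=170_000, sd=40_000, seed=0):
    """Run many hypothetical studies in which men and women are paid the same."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_studies):
        # Both groups are drawn from the same distribution: no true gap.
        men = [rng.gauss(mu, sd) for _ in range(n_per_group)]
        women = [rng.gauss(mu, sd) for _ in range(n_per_group)]
        p_values.append(two_sided_p(men, women))
    return p_values

p_values = simulate_null()
share_below_05 = sum(p < 0.05 for p in p_values) / len(p_values)
share_below_08 = sum(p < 0.08 for p in p_values) / len(p_values)
print(f"No true gap: p < 0.05 in {share_below_05:.1%} of studies")
print(f"No true gap: p < 0.08 in {share_below_08:.1%} of studies")
```

Under these assumptions the two shares should land close to 5% and 8%. That is all the cutoffs promise: they say how often chance alone would produce a result this extreme, not how likely it is that a gap exists.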
Is it misleading to state that the pay gap in 1999 was not statistically significant? No.
Is it misleading to state that "by the late 1990s, women and men earned roughly equivalent salaries after observable factors were adjusted for"? Yes, unless you think that a $3,600 difference (~17% of the unadjusted salary gap) with a p-value of 0.08 is "roughly equivalent."
On to the bigger question: how does the 1999 data affect the authors' conclusions for 2008?
The authors toss the sexism hypothesis mainly because, based on the 1999 data, they believe gender discrimination had been in a "period of quiescence." This is a far greater leap than the 1999 data actually supports.
The authors toss the women-are-worse-negotiators theory for the same reason.