US elections: Goldsmiths data science research links voting habits with sickness & death


A new dissertation by MSc Data Science student Caroline Butler highlights the relationship between health and politics in the USA.

Caroline has been investigating whether there is a county-level relationship between mortality among middle-aged white Americans, social and economic well-being, and the outcomes of the 2016 presidential primary elections.

Her research suggests that middle-aged white Americans living in counties with higher death rates are more cautious voters. That is, they are more likely to vote for a safe bet over a wildcard such as Trump.

After analysing data from the US Centers for Disease Control and Prevention’s WONDER tool, the United States Census Bureau’s County QuickFacts, and the 2016 US Election dataset on Kaggle, Caroline discovered a pattern connecting death rates to voting.

Contrary to expectations, a one-unit increase in the all-cause mortality rate increased the log odds of Hillary Clinton, rather than Bernie Sanders, winning that county’s Democratic presidential primary by 1.5693. However, this result could have been skewed by Bernie Sanders’ younger fan base.

To Caroline’s surprise, a one-unit increase in the all-cause mortality rate decreased the log odds of Donald Trump winning a county’s Republican primary by 1.4371.
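
Log odds can be hard to read at a glance, so the short Python sketch below converts the two reported coefficients into odds ratios by exponentiating them. It assumes the figures come from a logistic regression, which the log-odds wording implies but the article does not state, and the function name is purely illustrative. The article also does not say what "one unit" of the mortality rate is, so the size of the effect depends on that unreported scale.

```python
import math

def odds_ratio(log_odds_change):
    """Convert a change in log odds into a multiplicative change in odds."""
    return math.exp(log_odds_change)

# Coefficients as reported in the article (per one-unit rise in all-cause mortality)
clinton_vs_sanders = odds_ratio(1.5693)   # odds of a Clinton win over Sanders
trump_win = odds_ratio(-1.4371)           # odds of a Trump win in his primary

print(f"Clinton odds multiplier: {clinton_vs_sanders:.2f}")
print(f"Trump odds multiplier:   {trump_win:.2f}")
```

On this reading, each one-unit rise in the mortality rate multiplies the odds of a Clinton win by roughly 4.8, and multiplies the odds of a Trump win by roughly 0.24, which is a reduction of about three quarters.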

The project was inspired by recent evidence that drug and alcohol poisoning, suicide and chronic liver diseases have caused the mortality rate among middle-aged white people in the United States to increase. At the same time, anti-establishment candidates, such as Donald Trump and Bernie Sanders, have achieved unexpected success.

In a follow-up investigation, Caroline ran her data on each county’s mortality, socio-economic status and state through CHAID (chi-squared automatic interaction detection), a decision tree machine learning algorithm, and found that it could predict the winner of each party’s primary with 85-89% accuracy.
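
For readers unfamiliar with CHAID, the sketch below illustrates its central idea: at each step the tree splits on the predictor most strongly associated with the outcome, as measured by a chi-squared test. This is a simplified illustration, not Caroline’s analysis. The eight toy counties, the candidate labels "A" and "B", and the function names are all invented for the example. Full CHAID also merges similar categories and compares adjusted p-values, whereas this sketch compares raw chi-squared statistics, which is only fair because both toy predictors have the same number of categories.

```python
from collections import Counter

def chi_squared(pairs):
    """Pearson chi-squared statistic for a list of (category, outcome) pairs."""
    observed = Counter(pairs)
    category_totals = Counter(category for category, _ in pairs)
    outcome_totals = Counter(outcome for _, outcome in pairs)
    n = len(pairs)
    statistic = 0.0
    for category, category_n in category_totals.items():
        for outcome, outcome_n in outcome_totals.items():
            expected = category_n * outcome_n / n
            statistic += (observed[(category, outcome)] - expected) ** 2 / expected
    return statistic

def best_split(records, predictors, target):
    """Return the predictor with the largest chi-squared statistic, plus all scores."""
    scores = {
        p: chi_squared([(r[p], r[target]) for r in records]) for p in predictors
    }
    return max(scores, key=scores.get), scores

# Hypothetical toy counties, invented for illustration only
counties = [
    {"mortality": "high", "income": "low",  "winner": "A"},
    {"mortality": "high", "income": "low",  "winner": "A"},
    {"mortality": "high", "income": "low",  "winner": "A"},
    {"mortality": "high", "income": "high", "winner": "A"},
    {"mortality": "low",  "income": "low",  "winner": "B"},
    {"mortality": "low",  "income": "high", "winner": "B"},
    {"mortality": "low",  "income": "high", "winner": "B"},
    {"mortality": "low",  "income": "high", "winner": "B"},
]

best, scores = best_split(counties, ["mortality", "income"], "winner")
print(best, scores)
```

In this toy data the mortality band separates the winners perfectly, so it scores higher than income and would be chosen for the first split.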

Her results suggest that, for both white people and all races combined, the social and economic well-being of a county is as closely related to the outcomes of the 2016 primary elections as the mortality rates of middle-aged Americans are.

“Understanding whether mortality data for middle-aged white Americans is associated with political viewpoints is important not only from a political perspective, but also for purposes of developing appropriate public health directives,” Caroline explains.

“I was surprised to find that in areas with higher mortality rates, people were more likely to vote for Clinton over Sanders in the primaries – but I’d suggest this could be because Sanders had a high number of young, so generally more healthy, voters.

“A similar study should definitely be done for the United States Presidential Election so we can compare the voting patterns from the Democratic Party to the votes from the Republican Party.”

Adapted from a Goldsmiths news article by Sarah Cox