School of Public Health Research Leverages AI to Study Cardiovascular Disease Prevalence, Care Costs

Image is of binary code: the digits one and zero. The numerals are white and the background is black. In the middle, the digits are rendered in red, forming a heart shape.
Photo by Alexander Sinn

By Erin Frick

ALBANY, N.Y. (April 27, 2023) — An interdisciplinary research team composed predominantly of University at Albany collaborators has published a new study that harnesses the power of machine learning to examine the importance of contextual factors that influence cardiovascular disease prevalence and care costs among Medicare beneficiaries nationwide.

The team analyzed publicly available, county-level data on demographic composition, behavior, social vulnerability, racial and ethnic segregation and other contextual factors to determine the relative importance of these factors as predictors for cardiovascular health and care costs.

Their study, published in the Journal of the American Heart Association, is among the first in the United States to use machine learning to study contextual cardiovascular disease predictors and costs at the county level.

Machine Learning in Public Health

Individual-level risk factors for cardiovascular disease are well studied. Less is known about how broader contextual factors — such as traits of our surrounding communities — can impact disease risk and care costs. The research team employed machine learning to explore this question.

Machine learning is a form of artificial intelligence that holds great promise as an important research tool across many fields, including public health.

“Compared to traditional statistical methods, AI methods are more powerful in making predictions based on existing data, with more flexibility and fewer assumptions,” said senior co-author Kai Zhang, Empire Innovation associate professor at UAlbany’s School of Public Health. “AI methods can also yield deeper insights on the relative importance of the different factors contributing to a specific health outcome. In this study, we leveraged AI methods to identify the important predictors of county-level cardiovascular disease care costs and rank the relative importance of these factors.”

As is the case in addressing many of today’s pressing societal issues, demand is high and resources are limited. It is therefore critical that resources are allocated strategically and that they reach communities in need. This work can inform strategies for allocating resources with an eye to disease prevention and early interventions to help reduce cardiovascular disease and care costs.

“With an issue like cardiovascular disease, wherein many interacting factors shape risk and patient outcomes, it is critical that we use all available information to inform decisions around resource allocation,” said Zhang. “This is where machine learning approaches can help. Methods like the one we used in this study can take in high quantities of diverse data types, run them against each other and yield results at a highly granular level, in this case, all counties nationwide. This helps us identify patterns among co-occurring contextual risk factors that would otherwise be impossible to detect. It can also help us identify communities of high need that might otherwise be missed.”

Studying Context

To complete this analysis, the team gathered publicly available data from 3,137 U.S. counties. Information pertaining to both cardiovascular disease prevalence and inpatient, outpatient and total care costs was derived from the Centers for Medicare and Medicaid Services Chronic Conditions Data Warehouse; these figures all pertain to Medicare beneficiaries diagnosed with cardiovascular disease in 2017 and reflect average costs incurred among those individuals in that year.

Using a machine learning approach called “extreme gradient boosting,” the team analyzed county-level data on cardiovascular disease prevalence and cost against information on county demographics, education, income and urbanicity, as well as behavioral factors such as smoking, alcohol consumption, cholesterol levels and physical activity. Findings underscore the importance of contextual factors in shaping health care costs for cardiovascular disease.

Key results:

  • Overall, demographic composition, education and social vulnerability are consistently important predictors of cardiovascular disease and care costs across most scenarios.
  • Demographic composition and behavioral risk factors are among the most important predictors of cardiovascular disease prevalence and inpatient care costs.
  • Social vulnerability and racial and ethnic segregation are particularly important predictors for total and outpatient care costs.
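For readers curious how a boosted model can rank predictors by relative importance, the following is a minimal sketch, not the study's code. It uses plain gradient boosting with one-split decision stumps (a simplified relative of extreme gradient boosting, without its regularization or second-order terms), written with only the Python standard library. The data are synthetic, and the feature names (strong_signal, weak_signal, no_signal), sample size and settings are all assumptions made for illustration.

```python
import random

def fit_stump(X, residuals):
    """Find the one-feature threshold split that most reduces squared error."""
    n, n_features = len(X), len(X[0])
    total = sum(residuals)
    parent_score = total * total / n
    best = None  # (gain, feature index, threshold, left mean, right mean)
    for j in range(n_features):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            # Reduction in the sum of squared errors achieved by this split
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - parent_score
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best

def boost(X, y, n_rounds=100, learning_rate=0.1):
    """Fit stumps to successive residuals; accumulate each feature's total gain."""
    n = len(y)
    pred = [sum(y) / n] * n          # start from the mean outcome
    gains = [0.0] * len(X[0])
    for _ in range(n_rounds):
        residuals = [y[i] - pred[i] for i in range(n)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        gain, j, threshold, left_mean, right_mean = stump
        gains[j] += gain
        for i in range(n):
            step = left_mean if X[i][j] <= threshold else right_mean
            pred[i] += learning_rate * step
    total_gain = sum(gains) or 1.0
    return [g / total_gain for g in gains]  # importances sum to 1

# Synthetic "counties": the outcome depends strongly on the first feature,
# weakly on the second, and not at all on the third (illustrative only).
random.seed(0)
names = ["strong_signal", "weak_signal", "no_signal"]
X = [[random.random() for _ in names] for _ in range(300)]
y = [3.0 * row[0] + 1.0 * row[1] + random.gauss(0, 0.1) for row in X]

importance = boost(X, y)
ranking = sorted(zip(names, importance), key=lambda pair: pair[1], reverse=True)
for name, share in ranking:
    print(f"{name}: {share:.3f}")
```

In this sketch, a feature's importance is its share of the total error reduction across all boosting rounds. Production libraries offer several importance measures, and rankings can differ depending on which one is used.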

Identifying Disparities

An important finding from the work is that cardiovascular disease prevalence and care costs did not fall exclusively along socioeconomic lines, according to Zhang.

“We found that even for counties that are more advantaged, such as those with low poverty and low social vulnerability, racial and ethnic segregation is still one of the major contextual contributors affecting care costs for cardiovascular disease. This means that racial and ethnic segregation itself may have important impacts on cardiovascular disease care, independent from other socioeconomic disadvantages.”

This finding aligns with a growing body of evidence showing that the effects of racism and discrimination, both in the present and through intergenerational legacy effects, can have deep and lasting impacts on the physical well-being of people from historically minoritized populations, contributing to poor health outcomes such as cardiovascular disease.

The researchers assert that studies like this can help policymakers direct resources to where they are most needed, including low-poverty, less socially vulnerable communities that are nonetheless home to high-risk individuals.

Improving cardiovascular health among disproportionately affected populations is a complex challenge. The researchers hope that this work can inform training for policymakers and researchers, to help raise awareness of care and cost inequities across populations and geographic areas. For researchers, this includes embedding an ethos of diversity and inclusion within all trials, studies or evaluations of health care and its costs, including research on related contextual and behavioral factors.

Partners on this study included research collaborators from Mount Saint Vincent University, University of Texas Health Science Center at Houston, Washington University in St. Louis, Vanderbilt University and Northwestern University’s Feinberg School of Medicine.