How Can Big Data in Medical Research Eliminate Health Disparities?

  • August 8, 2023

1. Concepts: Health Disparity and Health Equity

Health disparities are potentially avoidable, systematic differences in health between disadvantaged and advantaged social groups, and they are mainly socially and/or economically determined [1]. This definition excludes voluntarily assumed risks (such as recreational skydiving), pure chance (such as a random genetic mutation), and life-stage differences (such as age differences). Health disparities imply a non-random, unequal distribution of health achievements and access to healthcare across factors such as gender, race/ethnicity, and/or socio-economic status [2]. Such disparities in health outcomes or health status are observed and measured clinically and statistically between groups [3]. The term health disparity is used instead of health difference because a difference becomes a disparity when some subgroups in a population are given access to healthcare resources while others are not [4].

Disadvantaged social groups here refer not only to groups of people who hold different social positions according to their power, wealth, and/or prestige, but also to people who are socio-economically disadvantaged as members of racial or ethnic minorities [1, 5]. Compared to advantaged social groups, disadvantaged social groups systematically experience worse health or worse access to healthcare [1]. This is unfair and unjust because worse health and well-being reduce the opportunities that people in disadvantaged positions have to escape those positions. Good health and well-being are therefore needed to overcome such disadvantaged positions in society [6].

Health policies can affect the extent of health disparity, which depends on many factors, such as population, socio-economic status, gender, and race and/or ethnicity. The guiding principle of health equity, however, is that “… ideally everyone should have a fair opportunity to attain their full health potential and, more pragmatically, that no one should be disadvantaged from achieving this potential, if it can be avoided” [1]. One way to eliminate health disparities arising from differences in socio-economic status, gender, wealth, and race/ethnicity is to increase health equity by measuring health disparities and taking the necessary actions in science, technology, and policy.

Achieving health equity and reducing health disparities require in-depth research, especially to determine the causes of health disparities. Health disparities may be caused by patient-related, provider-related, or system-related factors, such as differences in treatment, or by the societal inequities described above [3]. Once the true causes of health disparities in a society are identified, effective and efficient policies for reducing or preventing them become possible. Identifying those causes requires data that measure health outcomes for people who have been systematically and observably exposed to worse healthcare.
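To make the idea of measuring a disparity concrete, the sketch below computes two common summary measures for a pair of groups: the absolute gap (rate difference) and the relative gap (rate ratio). It is only an illustration. The group names, the counts, and the helper `disparity_measures` are hypothetical and do not come from the cited studies.

```python
from dataclasses import dataclass


@dataclass
class GroupOutcome:
    """Outcome counts for one social group (all numbers are made up)."""
    name: str
    cases: int        # people with the adverse health outcome
    population: int   # people in the group

    @property
    def rate(self) -> float:
        # Share of the group that experienced the outcome
        return self.cases / self.population


def disparity_measures(disadvantaged: GroupOutcome, advantaged: GroupOutcome) -> dict:
    """Return the absolute and relative gap between two groups."""
    return {
        "rate_difference": disadvantaged.rate - advantaged.rate,
        "rate_ratio": disadvantaged.rate / advantaged.rate,
    }


# Hypothetical example: 120 vs. 60 cases per 1,000 people
group_a = GroupOutcome("group_a", cases=120, population=1000)
group_b = GroupOutcome("group_b", cases=60, population=1000)

measures = disparity_measures(group_a, group_b)
print(measures)
```

A rate ratio above 1 or a rate difference above 0 only signals a difference between groups. Whether that difference is a disparity in the sense defined above still depends on whether it is systematic and avoidable.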

2. Addressing Health Disparity: Big Data Solutions

Scientific and technological advances have improved healthcare services, but disadvantaged groups continue to experience a disproportionate burden of acute and chronic diseases due to health disparities [7]. These disadvantaged groups include women, socio-economically disadvantaged people, and racial or ethnic minorities, and they are referred to as underrepresented groups.

They are called underrepresented groups because some subgroups are underrepresented from the very start, in data collection and in clinical trials. This lack of diversity in medical data may reduce the opportunities to discover effects that are potentially specific to those underrepresented groups [8, 9].
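One simple way to check for underrepresentation is to compare each group's share of a dataset with its share of the population the dataset is meant to describe. The sketch below does this with made-up enrollment counts and population shares. The function name `representation_ratios` and the 0.8 cut-off are assumptions chosen for illustration, not an established standard.

```python
def representation_ratios(sample_counts: dict, population_shares: dict) -> dict:
    """Ratio of each group's share in the sample to its share in the population.

    A value of 1.0 means proportional representation; values below 1.0
    mean the group is underrepresented in the sample.
    """
    total = sum(sample_counts.values())
    return {
        group: (count / total) / population_shares[group]
        for group, count in sample_counts.items()
    }


# Hypothetical clinical-trial enrollment and population shares
trial_enrollment = {"group_a": 800, "group_b": 150, "group_c": 50}
population_shares = {"group_a": 0.60, "group_b": 0.25, "group_c": 0.15}

ratios = representation_ratios(trial_enrollment, population_shares)

# Flag groups whose ratio falls below an assumed threshold of 0.8
underrepresented = sorted(g for g, r in ratios.items() if r < 0.8)
print(ratios)
print(underrepresented)
```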

Big data is promising for improving health equity because it has the potential to transform biomedical science. The quality and quantity of big data are steadily increasing, and it is much easier to collect data from various sources, e.g., from different countries and, most importantly, from measurements produced by analytical methods [7]. Therefore, big data can help reduce health disparities in the population by collecting data from underrepresented groups, increasing the generation of health records, and promoting different analytic methods.

Big data in the healthcare sector can be collected from various sources such as wearable sensor devices, mobile devices, electronic health records (EHRs), videos, clinical notes, radiology images, mobile apps, social media, blogs, health monitoring devices, genomic and pharmaceutical data, telemedicine, etc. [10, 11]. Such rich data sources enable scientists to analyze changes in patients’ behavior and related health outcomes, which helps identify high-risk patients and promotes precision medicine that accounts for the diversity of patients and their heterogeneous responses to treatments.

As defined in the literature, big data has six distinct characteristics: volume, velocity, variety, veracity, value, and valence. Volume refers to the tremendous amount of data generated every second of the day. Velocity refers to how fast data are generated and transferred from one point to another. Variety refers to the increasingly different forms of data, such as images, text, and video. Veracity refers to the varying quality of data. Value refers to extracting real-time value from the data. Lastly, valence refers to how well big data items can bond with one another and form connections between otherwise distinct datasets [10, 12].

With these characteristics, and because its enormous volume makes it possible to include everyone, big data can inform policies for reducing or eliminating health disparities and improving health equity. In this regard, big data offers many opportunities related to health equity, which are discussed below [7, 13, 14].

  • Although technology has improved enormously in the health sector in the 21st century, it has failed to protect disadvantaged social groups. Big data provides a great opportunity to improve health and healthcare for everyone, for example through clinical trial data, Electronic Medical Records (EMRs), and Electronic Health Records (EHRs) in all healthcare settings, combined with monitored digital health information.
  • It is not always possible to carry out randomized controlled trials (RCTs) due to scientific, ethical, and cost-related concerns. Big data helps to address this limitation through simulation modeling and systems science. Simulated data are especially useful for modeling health disparities and their systemic and ecological causes over the life course, thus helping to reduce health disparities affecting disadvantaged groups.
  • By combining big data and policy data, it is possible to identify areas with health disparities, the determinants of those disparities, and whether the disparities increase or decrease over time. This opens the way to improving healthcare policies at national and international levels.
  • While big data analysis methods examine big data to derive solutions, improved visualization techniques help to explore complex data by simplifying it. In addition, network analysis techniques make it possible to link community-level data with healthcare data, helping clinicians and healthcare specialists to allocate resources accordingly and to check whether those resources are distributed equally among different groups.
  • Big data such as social media data and geographic information system (GIS) data are especially useful for identifying the social determinants of health disparities and the social factors that lead to them. Such data can also help scientists predict public health trends, such as the spread of flu.
  • One of the most important benefits of big data is its support for evidence-based medicine. With big data, it is now possible to decide on a treatment based on the available scientific evidence in addition to the physician’s knowledge and capabilities.
  • Big data supports different analytic techniques, such as predictive, descriptive, and prescriptive analytics:
    1. Predictive analytics refers to forecasting what might happen in the future by using statistical approaches. This makes early detection and diagnosis possible before a patient has symptoms of a disease.
    2. Descriptive analytics refers to summarizing what has happened by using past and current data. It is useful because healthcare specialists can understand the past behaviors of patients and infer the effect of those behaviors on outcomes from the data they hold.
    3. Prescriptive analytics refers to prescribing actions for decision makers to act upon. It suggests optimal solutions for what to do in the future.
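
The three kinds of analytics listed above can be contrasted on a toy example. The sketch below uses a made-up series of yearly screening rates for a single clinic: it summarizes the past (descriptive), extrapolates a least-squares trend line one year ahead (predictive), and applies a simple decision rule to the forecast (prescriptive). All figures, the target rate, and the suggested actions are hypothetical, and real analytics pipelines are far more elaborate than this.

```python
from statistics import mean

# Hypothetical yearly screening rates (%) for one clinic; illustrative only
years = [2019, 2020, 2021, 2022]
screening_rate = [40.0, 44.0, 48.0, 52.0]

# Descriptive analytics: summarize what has already happened
average_rate = mean(screening_rate)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Predictive analytics: extrapolate the fitted trend to the next year
slope, intercept = fit_line(years, screening_rate)
forecast_2023 = slope * 2023 + intercept

# Prescriptive analytics: turn the forecast into a recommended action
# (the target and the two actions are assumptions made for this sketch)
target_rate = 60.0
if forecast_2023 < target_rate:
    action = "expand outreach"
else:
    action = "maintain current program"

print(average_rate, forecast_2023, action)
```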


  1. Braveman P. Health Disparities and Health Equity: Concepts and Measurement. Annual Review of Public Health. 2006; 27(1): 167–94.
  2. Kawachi I, Subramanian SV, Almeida-Filho N. A glossary for health inequalities. Journal of Epidemiology & Community Health. 2002; 56(9): 647–52.
  3. Kilbourne AM, Switzer G, Hyman K, Crowley-Matoka M, Fine MJ. Advancing Health Disparities Research Within the Health Care System: A Conceptual Framework. American Journal of Public Health [Internet]. 2006 Dec;96(12):2113–21. Available from:
  4. Warnecke RB, Oh A, Breen N, Gehlert S, Paskett E, Tucker KL, et al. Approaching Health Disparities from a Population Perspective: The National Institutes of Health Centers for Population Health and Health Disparities. American Journal of Public Health. 2008 Sep; 98(9): 1608–15.
  5. Ganesh S, Talukder AK. Formal Methods, Artificial Intelligence, Big-Data Analytics, and Knowledge Engineering in Medical Care to Reduce Disease Burden and Health Disparities. 2018 Dec 18; 307–21.
  6. Braveman P. What Are Health Disparities and Health Equity? We Need to Be Clear. Public Health Reports [Internet]. 2014 Jan; 129(2): 5–8. Available from:
  7. Zhang X, Pérez-Stable EJ, Bourne PE, Peprah E, Duru OK, Breen N, et al. Big Data Science: Opportunities and Challenges to Address Minority Health and Health Disparities in the 21st Century. Ethnicity & Disease [Internet]. 2017 Apr 20; 27(2): 95. Available from:
  8. Howerton MW, Gibbons MC, Baffi CR, Gary TL, Lai GY, Bolen S, et al. Provider roles in the recruitment of underrepresented populations to cancer clinical trials. Cancer. 2007; 109(3): 465–76.
  9. Ford JG, Howerton MW, Lai GY, Gary TL, Bolen S, Gibbons MC, et al. Barriers to recruiting underrepresented populations to cancer clinical trials: A systematic review. Cancer [Internet]. 2008; 112(2): 228–42. Available from:
  10. Mathew PS, Pillai AS. Big Data solutions in Healthcare: Problems and perspectives [Internet]. IEEE Xplore. 2015. p. 1–6. Available from:
  11. Pastorino R, De Vito C, Migliara G, Glocker K, Binenbaum I, Ricciardi W, et al. Benefits and challenges of big data in healthcare: An overview of the European initiatives. European Journal of Public Health [Internet]. 2019 Oct 1; 29(3): 23–7. Available from:
  12. Singh M, Bhatia V, Bhatia R. Big data analytics: Solution to healthcare. 2017 International Conference on Intelligent Communication and Computational Techniques (ICCT). 2017 Dec
  13. Yoo KH, Leung CK, Nasridinov A. Big Data Analysis and Visualization: Challenges and Solutions. Applied Sciences. 2022 Aug 18; 12(16): 8248.
  14. Wang Y, Hajli N. Exploring the path to big data analytics success in healthcare. Journal of Business Research. 2017 Jan; 70: 287–99.
