FEBIM2022BigDataEthics Bigdata

This document discusses the growth of big data in healthcare and the associated ethical issues. It introduces big data sources in healthcare and the benefits of big data analytics. However, it also highlights privacy, discrimination, and security as major ethical concerns. The paper aims to provide an in-depth analysis of these issues and their effects.


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/360209789

An Ethical Framework for Big Data and Smart Healthcare

Conference Paper · April 2022


DOI: 10.5220/0011030900003206


All content following this page was uploaded by Ben S Liu on 26 April 2022.



An Ethical Framework for Big Data and Smart Healthcare

Victor Chang1 a , Rahman Olamide Eniola2 b, Ben Shaw-Ching Liu3 c , and Mitra Arami4 a
1Department of Operations and Information Management, Aston Business School, Aston University, Birmingham, UK
2 Cybersecurity, Information Systems and AI Research Group, School of Computing, Engineering and Digital Technologies,
Teesside University, UK
3 Department of Marketing, Lender School of Business Center, Quinnipiac University Hamden, CT 06518, USA
4 Pardis Limited, London and EM Normandie Business School, France

[email protected]*, [email protected], [email protected]; [email protected]

Keywords: Ethics for AI and Data Science; Ethical framework; Ethics for smart healthcare.

Abstract: There has been significant growth in big data technology in healthcare in recent years. However, the potential
of big data analytics is affected by various ethical and security concerns, which have hampered the application
of big data analytics in healthcare. Recently, numerous studies have been conducted on the emerging big data
ethical issues in healthcare. While most of the literature reflects on privacy and security questions, it has not
objectively examined the possible discriminatory impact of big data analytics. This mixed-method
project aims to highlight various ethical problems in big data analytics while also providing an in-depth insight
into the biased results derivable from big data analytics and the effects of such outcomes.

1 INTRODUCTION

Higher healthcare investment in a nation can provide better health prospects that can enhance human capital and increase productivity, thus contributing to economic performance (Raghupathi and Raghupathi, 2020; Cutillo et al., 2020). However, the exponential growth in the world's population presents a critical threat to current medical and healthcare systems (Zhu et al., 2019). The change in population demographics, the increase in the number of aged people, and the drastic increase in the cost of in-hospital services all lead to realizing the value of effective healthcare systems (Demirkan, 2013). The professional-to-patient ratio is another factor that led to the rise in demand for an efficient healthcare system (Borodin et al., 2016).

With the explosive growth of disruptive technologies in recent years, the speed and quantity of digital data collected have expanded steadily and rapidly (Chang, Shi and Zhang, 2019). Correspondingly, the evolution of information technology and the introduction of digitized computer systems has resulted in the transition of conventional hard copy medical data to Electronic Health Records (EHR) and Electronic Medical Records (EMR) systems (Rehman, Naz and Razzak, 2021). These systems resulted in exponential data expansion (Razzak, Imran and Xu, 2020), which has contributed to the growth of big data analytics, especially in healthcare.

According to a 2021 Grand View Research, Inc. study, the worldwide healthcare analytics market was valued at USD 23.6 billion in 2020, projected to rise at a Compound Annual Growth Rate (CAGR) of 23.8 percent from 2021 to 2028 (‘Healthcare Analytics Market Size Industry Report: 2021-2028’, 2021). See Figure 1. This massive increase fulfills the growing need for improved healthcare, aided by innovative technology.
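To make the quoted growth rate concrete, the short sketch below compounds the 2020 market valuation forward at the stated CAGR. It is only an illustration of the arithmetic: the function name `project_value` is hypothetical, and it assumes that 2020 is the base year, so that 2028 lies eight compounding periods ahead. The cited report may define its projection window differently.

```python
def project_value(base_value: float, annual_rate: float, years: int) -> float:
    """Compound base_value forward by annual_rate for the given number of years."""
    return base_value * (1.0 + annual_rate) ** years


BASE_2020_USD_BN = 23.6  # 2020 valuation quoted in the text (USD billion)
CAGR = 0.238             # 23.8 percent, as quoted in the text

# Assumption: 2020 is the base year, so year - 2020 is the number of periods.
projection = {
    year: round(project_value(BASE_2020_USD_BN, CAGR, year - 2020), 1)
    for year in range(2020, 2029)
}

for year, value in projection.items():
    print(year, value)
```

Under these assumptions the valuation grows more than fivefold over the eight periods, which is the scale of increase the paragraph above refers to.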

a https://orcid.org/0000-0002-8012-5852
b https://orcid.org/0000-0001-9799-861X
c https://orcid.org/0000-0002-2950-9607
c https://orcid.org/0000-0001-6855-9888
Figure 1: Healthcare Analytics market in the USA, by end user, 2018 – 2028 (USD Million)

Administrative claim reports, hospital registries, electronic records of health, biometric data, patient data, the internet, medical imaging, biomarkers, prospective cohort studies, and clinical trials are possible medical big data sources in healthcare (Hermon and Williams, 2014; Luo et al., 2016). These sources are aggregated to produce fast and cost-effective prescriptive, descriptive, and diagnostic insights for the healthcare stakeholders. While strategically analyzing data for insightful analysis is crucial, the existence of different data types accessible from numerous sources makes big data management extremely difficult (Nair, 2020).

Despite the aforementioned benefits of big data technology, it is worth remembering that big data analytics has its drawbacks due to its intertwinement with people's sensitive personal information, daily behavioral patterns, and potential prospects. The most pressing concerns of big data analytics are privacy (Francis, 2014), confidentiality and informed consent (Ioannidis, 2013), epistemic hurdles (Floridi, 2012), and the analysis of monitoring in a growing datafication of the society (Ball et al., 2016). Indeed, the assurance of privacy and safety of subjects in the application of big data analytics is of significant importance and high priority. The study by IBM (IBM, 2019) in Figure 2 shows that the health sector has suffered an average overall cost of data breaches considerably higher than other sectors such as hospitality, media, and research. Healthcare data should be securely kept, and big data analytics performed ethically (Mittelstadt, 2019).

Figure 2: Trend of Industry Average Data Cost (IBM, 2019)

The possibility of potential discrimination is among the most alarming yet understudied issues of big data technology. There is no universally accepted definition of discrimination. The term generally refers to acts, practices, or policies that impose a relative disadvantage or treat a person or specific group of people differently, especially in a worse way than other people, because of their skin color, gender, sexuality, language, or other factors (Reinsch and Goltz, 2016).

The research by Obermeyer et al. (2019), which revealed pervasive racism in decision-making systems utilized by US clinics, is an excellent demonstration of discrimination in healthcare analytics. Participants who self-identified as black were assigned lower risk scores than equally ill white people in the study. Consequently, black individuals were less likely to be referred for more personalized medical care (Obermeyer et al., 2019).

The emergence of these instances explains why discrimination in big data analytics has become an emerging topic in a variety of fields, from data science and artificial intelligence to psychology, culminating in a dispersed and fractured interdisciplinary corpus that tends to make thoroughly accessing the foundation of the problem difficult (Favaretto et al., 2019).

This study summarizes big data and its use in healthcare, addressing current ethical and security issues relevant to big data application in healthcare. Moreover, we suggest several alternative solutions to compromise between the application and the ethical obligation.

2 LITERATURE REVIEW

Big data and big data analytics are arguably the pillars of other disruptive technologies, providing the necessary business insights for patients, experts, and government (Wong, Zhou, and Zhang, 2019). Big data analytics is the method of storing, processing, and analyzing vast collections of data to find trends and other valuable knowledge (Heyman et al., 2004). These massive and complex big data collections are manipulated and managed using various computational methods such as machine learning and artificial intelligence (Ward and Barker, 2013). The advent of advanced technology has provided conditions and procedures for voluminous databases to be compiled and processed, resulting in informed decision-making in addressing health problems (Raja et al., 2020).

Big data has emerged as a promising option with the potential to revolutionize the healthcare system by lowering costs and optimizing treatment process, delivery, and management (Patil and Seshadri, 2014). The application of big data comes with some ethical issues that demand careful consideration (Camilleri, 2020). Suresh and Guttag (2019) explain how bias problems occur, how they apply to specific applications, and how they inspire various solutions. They also present a framework for understanding analytical bias at a higher level of abstraction to facilitate constructive dialogue and solution development.

Notwithstanding the amount of data generated in healthcare, the underlying challenge remains the integration of structured and unstructured health data. According to Dridi et al. (2020), approximately 80% of clinical data is unstructured and widely underutilized once generated. Different clinical data formats, such as scanned medical documents, prescriptions, patient registries, and clinician notes, result in poor standardization of healthcare data, making it more difficult to handle by EHR systems and more prone to bias from data preprocessing (Cave et al., 2019; Dridi et al., 2020).

Patient privacy invasion is an emerging problem in big data analytics. Patients' behavior and sentiment data can be obtained from various online sources. For example, an online drug retailer may have recorded the purchase of a particular medication, a ride-hailing app may have recorded a visit to a clinic or lab, or a social media app may have recorded patients' interactions with a medical web page. Furthermore, patients' data can also be extracted unethically via health-care-specific applications and wearable devices.

We also studied several publications to better grasp the potential discriminatory effects and popular drivers of discrimination or inequality in big data analytics on subjects. Different writers arrived at different conclusions. Big data analytics may result in unintentional discrimination (Žliobaitė, 2017; Sonawane and Irabashetti, 2015). Žliobaitė (2017) established that discrimination is indirect, arising not by the analyst's intention but because of the structure and noise of experimental data. Such algorithms may systematically disfavor persons belonging to particular groups or categories, rather than depending purely on individual merits.

Conversely, other academic studies emphasized intentional discrimination (e.g., Kuempel, 2016; Sonawane and Irabashetti, 2015). According to Kuempel (2016), data brokers frequently combine raw components of personal data in a discriminatory way, leaving customers exposed to exploitative and distasteful marketing techniques. The effect of utilizing such a biased dataset with sensitive information is that such individuals or groups of people would face direct discrimination.

3 RESEARCH QUESTIONS AND BIG DATA ANALYTICS ARCHITECTURE

The first step of the research was to identify relevant research questions. The main research question is, "given the many applications and benefits of big data and big data analytics in healthcare, do the ethical risks overshadow the benefits?"

Figure 3: Power BI architecture

To answer this main question, we need to find answers to the following sub-questions:
1. What are the applications of big data in healthcare?
2. What are the current ethical issues of healthcare big data analytics?
3. What is the cause of discrimination in big data analytics?

The big data analytics framework utilized in this project is a blend of many steps that explains the big data analytics procedure (shown in Figure 3). The first phase in the framework is data preparation, which involves the ETL, i.e., Extraction, Transformation, and Loading of the data. Extraction is the process of determining the data type to be utilized and collecting it from different data sources, such as existing databases and repositories, APIs, and the cloud. Data transformation is the next step, in which data is transformed, aggregated, and loaded into the Power Business Intelligence (BI) dashboard. The transformation step ensures: (1) handling of inconsistencies and missing values in the data; (2) elimination of duplicate data; (3) removal of useless data; and (4) sorting of data into the appropriate type. Figure 3 illustrates the overview of the Power BI analytics procedure.

The visualization step involves taking the processed outputs and transforming them into meaningful insights by viewing the results in diagrams, KPIs, or other easy-to-understand formats. It is crucial to ensure that results can be interpreted by those with no previous experience or expertise. Unlike other tools, Power BI allows the integration of different programming languages. An advantage of Power BI is that Python and R functionalities can be applied alongside DAX and M-language formulas. It gives a better result due to the combined strengths of different programming languages.

4 APPLICATION AND BENEFITS OF BIG DATA ANALYTICS

4.1 Preventive Medicine

Preventive medicine is arguably the most innovative application of big data analysis. It employs cutting-edge data analytics methods for disease detection and classification, association analytics, and clustering, with the promise of efficiently discovering valuable patterns by analyzing large amounts of unstructured, heterogeneous, non-standard data (Razzak et al., 2020). Appropriate disease prevention involves identifying and treating at-risk patients. To increase therapeutic adherence, several preventative strategies are employed. Pertinent data, such as body temperature, pulse, and blood pressure, are electronically collected, enabling automated risk prediction. Consonantly, the increased usage has contributed significantly to the appropriateness of big data analytics in healthcare (Rehman et al., 2021). The aggregate of these data is analyzed to assist patients with diets, reminders of preventative care, personalized medical care, follow-up on prior consultations and medicines, and counseling (Razzak et al., 2020).

Due to the considerably broad customer base, relatively few regulatory obligations, and ease of access to wearable devices and medical apps, personalized medical care has significantly increased its market size, as shown in Figure 4.

Figure 4: Trend of Personalized Medicine (2012 -2022)

4.2 Evidence-Based Healthcare

Traditional healthcare is changing from expedient and discretionary decision-making to evidence-based medical practices (Piai and Claps, 2013). Evidence-based care is a healthcare practice in which decisions about patients' conditions are based on scientific proof. By consolidating data from various outlets, big data supports evidence-based treatment. The data trends and patterns would provide sufficient support for diagnosis and treatment (Piai and Claps, 2013).

4.3 Enhancement of Public Health Monitoring

The analysis of healthcare data with ground-breaking methods aids in analyzing epidemic trends and in monitoring disease outbreaks and the spread of disease. This approach improves public health monitoring, education, and reaction time. An excellent example is the Covid 19 pandemic surveillance system in the United Kingdom, which offers a daily update of a postcode district-based location with infection rates in that district, generates a risk score, and communicates it to the user. Furthermore, the app allows users to check into a specific place, recording their presence at that particular time and date. The app also stores an individual's check-ins with the name and IDs of such locations, which work with the test
and trace teams to inform users on association with a particular area at a given time. For example, suppose someone visits a local bar and later tests positive for Coronavirus. In that case, the app alerts everyone who has also checked in at the same place to self-isolate or quarantine.

4.4 Improves Interaction Between Healthcare Providers and Patients

Big data technology also improves collaboration between healthcare providers and patients. For example, on social media, people with common health conditions and healthcare professionals with similar specialties across the world can share information on the treatment and cure of some illnesses, thereby promoting interaction within health systems.

5 METHODOLOGY

5.1 Systematic Literature Review

The literature review covers papers and studies on 'big data in healthcare' published in scholarly journals and focuses on the following objectives:
• Understanding the concept of big data for healthcare.
• Recognizing tools and techniques for big data analytics in healthcare.
• Underlining the future benefits and uses of big data in healthcare.
• Reviewing emerging ethical concerns of big data systems in healthcare.
We obtained most of the pertinent papers used for this study from Research Gate, IEEE, and Google Scholar, which we searched for the set of specific articles related to the proposed research. We used inclusion criteria based on predefined keywords to choose big data and healthcare papers relevant to the research questions. Our aim is to support developing an emerging ethical framework for healthcare big data, as shown in Figure 5.

Figure 5: Emerging Ethical Issues in Healthcare Big data

5.2 The Diabetes Dataset (UCL repository)

Since millions of healthcare data points are created and shared daily, a central data repository that aggregates the entire dataset in one location is needed (Luo et al., 2016). We also need powerful tools to extract information rapidly and analyze the selected data effectively. While Power BI will give healthcare organizations visibility into their data and help them gather many insights, other more effective analytics tools should also be considered. Furthermore, even though the data has been de-identified, there are other ethical issues and concerns that we will discuss in the subsequent section of the article (Durcevic, 2020).

5.3 Research Surveys

In this project, we used primary data collected by the authors' team from randomly selected respondents and a pre-processed dataset originally obtained from the Health Information National Trends Survey (HINTS) 4 Cycle 1 (NCI, 2012).

5.3.1 Primary Research Survey

We used Sogo Survey to conduct the primary research questionnaire to extract respondents' concerns with big data analytics and ensure that the required data is retrievable intelligently. Unlike the traditional approach, an online survey makes retrieval and analysis of the relevant information more accessible. Power BI visualization
is appropriate because it can display complex data in an interactive and user-friendly manner. In total, 53 people completed the survey, and we address the results in the following segment.

5.3.2 Secondary Research Survey

We used the pre-processed first cycle HINTS 4 survey, conducted on 3959 responders between October 2011 and February 2012, with a response rate of 36.7 percent. Five questions, labeled A-E, were listed and are discussed further below.
A. Concerned about unauthorized access to their health records as they are transferred electronically between healthcare facilities.
B. Concerned about unauthorized access to their records as they are faxed between healthcare professionals.
C. Satisfied that protections were in place to shield their patient records from unwanted access.
D. Satisfied that they had a voice in collecting, using, and exchanging their medical records.
E. Hid details from a healthcare provider out of concern for the patient's safety.
We used the following concepts in this work.

6 ANALYSIS AND FINDINGS

6.1 Ethical Problems of Big Data Analytics in Smart Healthcare

As mentioned earlier, this project discusses some emerging ethical concerns of big data in healthcare, including discrimination, data breaches, and privacy issues, as delineated in the following. We also discuss further how some ethical issues could lead to potential discrimination.

6.1.1 Discrimination

Big data analytics can potentially exacerbate pre-existing demographic gaps in healthcare by presenting biased results from the algorithm used (Cahan et al., 2019; Cutillo et al., 2020). The data used to train these algorithms contributes more to such generalization or stereotypes against a group. Racial biases embedded in typically biased training datasets are more likely to yield racially discriminatory predictive models (Cutillo et al., 2020). For example, the predictive models derived from the Framingham Heart Study and precision medicine protocols centered on European ancestry (Paulus et al., 2018). The causes of discriminatory bias in a dataset could occur at different phases of an analytical pipeline (Suresh and Guttag, 2019).

As observed in Figures 6 & 7, the Diabetes readmission dataset used in this project is highly imbalanced. The dataset has an overrepresentation of the Caucasian race, leading to a false generalization. Additionally, there is an aggregation bias in the dataset, as it is hard to know which groups (races) are included in "others". While the gender feature is well represented, the LGBT populations can feel unfairly aggregated with the two genders.

Figure 6: Number of Readmissions by Race

Figure 7: Number of readmissions by Gender

On the other hand, we cannot say there is an underrepresentation based on age because it is rare for people below 20 years to be diabetic (as shown in Figure 8).

Figure 8: Number of Readmissions by Age
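The kind of representation imbalance discussed for the readmission data can be checked before any model is fitted. The following is a minimal sketch of such a check, not the procedure used in this study: the records are a hypothetical toy sample, the helper names `group_shares` and `underrepresented` are illustrative, and the 10 percent threshold is an arbitrary assumption rather than an accepted standard.

```python
from collections import Counter


def group_shares(records, key):
    """Return each group's share of the records for the given attribute."""
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def underrepresented(shares, threshold):
    """List groups whose share falls below the chosen threshold."""
    return sorted(group for group, share in shares.items() if share < threshold)


# Hypothetical toy sample; these are NOT counts from the diabetes dataset.
records = (
    [{"race": "Caucasian"}] * 15
    + [{"race": "AfricanAmerican"}] * 3
    + [{"race": "Hispanic"}] * 1
    + [{"race": "Other"}] * 1
)

shares = group_shares(records, "race")
flagged = underrepresented(shares, threshold=0.10)  # threshold is an assumption

for group, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{group}: {share:.0%}")
print("Below threshold:", flagged)
```

A check of this kind only reveals unequal representation. It cannot say what an aggregated category such as "Other" contains, which is the aggregation problem noted in the text.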
The model analysis with Power BI Key Influencer (shown in Figures 9 & 10) identified evaluation bias. It revealed that Asians, Hispanics, African Americans, and people weighing between 0 and 25kg are unlikely to be readmitted.

Figure 9: What factors Influences Readmission to be No

However, Caucasians and people weighing more than 200kg are more likely to be readmitted due to Diabetes. This finding could lead to a misleading generalization of readmitting Caucasian patients weighing more than 200kg even though they are fine.

Figure 10: What factors Influences Readmission to be Yes

Conversely, it might also lead to refusal of admission for patients who do not identify as White or do not weigh up to 200kg.

The aforementioned analytical outcome might lead to social exclusion, marginalization, and stigmatization. Because some persons may be picked out and excluded or included due to the bias, the revelation and application of this study may result in stigma and discrimination. The possible implication could be prioritizing hospital spaces for some patients or refusing to readmit other patients due to their racial identity or body weight. This finding is consistent with Obermeyer et al.'s (2019) research, identifying how big data analytics could be discriminatory, affecting patients' treatment plans.

6.1.2 Data Breach

Breach of protected health information (PHI) security substantially impacts individuals and healthcare institutions (Agaku et al., 2014). The annual cost of stolen or compromised PHI in the US healthcare sector is estimated to be up to $7 billion. According to research conducted by IBM Security (2019), healthcare data breaches are the most costly of all sectors, with continuous growth in the number of breaches. Figure 11 illustrates that healthcare has the highest average cost of a data breach, almost twice the global average.

Figure 11: Average cost of data breach by industry

Big data ethical challenges are not isolated issues, as data breaches could result in the disclosure of personal health information and financial or medical identity theft. In some cases, a breach can result in severe health consequences for patients (Agaku et al., 2014). Furthermore, a data breach may result in disclosing hitherto undetectable behavioral or psychographic tendencies (Winter, 2018). Data from seemingly insignificant daily routines is gradually being pooled and utilized to uncover behaviors or patterns, clustering or associating individuals into separate groups, resulting in unfair generalizations against such groups. Unauthorized access to private information or activities, such as medical data, could be used to discriminate against persons seeking immigration eligibility, medical treatment, education, banking, and jobs (Winter, 2018).

6.1.3 Privacy

Privacy is a fundamental human right that allows one to choose whether or not to expose oneself to others and the rest of the world (Chang, Shi and Zhang, 2019). From the primary survey results shown in Figures 12 & 13, most people agree that big data analytics technologies are functional in
healthcare. However, respondents are concerned about the sensitivity of healthcare data, which may jeopardize their privacy.

Figure 12: Big data and AI in Healthcare

Figure 13: Sensitive Data

The HINT (NCI, 2020) survey results (shown in Figure 14) indicate that, while the majority of respondents are concerned about unauthorized access to their health records, they have confidence that medical providers and institutions would value their voice and therefore keep their data secure.

Figure 14: HINT Survey Results

Personal data can be retrieved at different stages of analytics (Dev Mishra and Beer Singh, 2017). Since modern healthcare services require patients to provide private and sensitive information to access medical services, clients lose control over the confidentiality of their data when they hand over personal information to third parties and rely on the organization to safeguard its security. Such dependence increases the risk of information leakage if the trusted entity does not implement proper security measures to secure client data (Mariani, Mohammed, and Mohammed, 2015). Protection of patients should be prioritized by avoiding any type of surveillance or unauthorized identification.

7 EVALUATION AND DISCUSSION OF FINDINGS

7.1 Conclusion and Implications

Removing private information to increase patients' anonymity is a powerful method of protecting patient data. The difficulty lies in determining which highly sensitive features can be removed from the data. In cases such as the coronavirus pandemic, however, the use of sensitive patient data such as location may improve governments' and research institutions' ability to combat the threat more quickly through a surveillance system that provides location data used to curb the current crisis. The diabetes data, on the other hand, has features that could give discriminatory and stereotypical generalizations.

Data scientists must be mindful that utilizing these large amounts of data comes at the expense of human liberty and social autonomy. Efforts to lessen the risks of using these data must be monitored through established legislative measures, such as the General Data Protection Regulation (GDPR). The Human-Centered Design approach must guide the intent and goals of data usage, including its processing, analysis, warehousing, and dataset sharing.

The following are the main conclusions observed from these principles and criteria for operational use of data-driven healthcare analytics:

7.1.1 Data sensitivity is Relative

The description and decision of feature sensitivity vary from project to project, and they also depend on social values and regulations. For example, the outcome from diabetes data analytics is discriminatory and stereotypical. Can we say that Caucasian white women weighing more than 200 kg are more likely to be diabetic than other ethnic groups?

7.1.2 Discrimination is Just as Severe

Understanding the significance of data privacy and security is crucial. Most data science ethics journals are concerned with privacy and security and their implications. Notwithstanding, there are discriminatory and racist submissions arising from big data analytics, which also have grave
consequences. Furthermore, to ensure a fair model, we must weigh the discriminatory tendencies of analytics against their respective advantages.

7.1.3 Human-Centered Design (HCD) must be Ethically Compliant

Each phase in the Machine Learning and big data analytics design process should consider the data citizens impacted by models, methods, and algorithms developed by data scientists. Biases in defective datasets, algorithms, and human users are numerous and discussed in depth. We must not ignore that, owing to the vulnerability of data subjects and groups, the risk of discrimination is more severe.

Furthermore, data scientists are also data citizens: aside from developing big data insights, they are also affected by such techniques. As a result, maintaining ethically acceptable data processing and analytics is a win-win scenario for all parties involved.

ACKNOWLEDGEMENT

This work is partly supported by VC Research (VCR0000158) for Prof Chang.

REFERENCES

Agaku, I. T., Adisa, A.O., Ayo-Yusuf, O.A., and Connolly, G.N. (2014) ‘Concern about security and privacy, and perceived control over collection and use of health information are related to withholding of health information from healthcare providers’, Journal of the American Medical Informatics Association, 21(2), pp. 374–378.
Alzahrani, A. G. M., Alenezi, A., Mershed, A., Atlam, H., Mousa, F., and Wills, G. (2020) ‘A framework for data sharing between healthcare providers using blockchain’.
Ball, K., Di Domenico, M. L. and Nunan, D. (2016) ‘Big
algorithm in big data addressing personalized healthcare’, npj Digital Medicine, 2(1).
Camilleri, M. A. (2020) ‘The use of data-driven technologies for customer-centric marketing’, International Journal of Big Data Management, 1(1), 50-63.
Cave, A., Kurz, X., and Arlett, P. (2019) ‘Real-World Data for Regulatory Decision Making: Challenges and Possible Solutions for Europe’, Clinical Pharmacology and Therapeutics, 106(1), 36–39. https://doi.org/10.1002/cpt.1426
Chang, V., Shi, Y. and Zhang, Y. (2019) ‘The Contemporary Ethical and Privacy Issues of Smart Medical Fields’, International Journal of Strategic Engineering, 2(2), pp. 35–43. doi: 10.4018/ijose.2019070104.
Cutillo, C. M., Sharma, K.R., Foschini, L., Kundu, S., Mackintosh, M., Mandl, K.D., Beck, T., Collier, E., Colvis, C., Gersing, K., Gordon, V., Jensen, R., Shabestari, B. (2020) ‘Machine intelligence in healthcare—perspectives on trustworthiness, explainability, usability, and transparency’, npj Digital Medicine, 3(1), pp. 1–5.
Demirkan, H. (2013) ‘A Smart Healthcare Systems Framework’, IT Professional, 15(5), pp. 38-45. doi: 10.1109/MITP.2013.35. url: https://doi.ieeecomputersociety.org/10.1109/MITP.2013.35.
Dev Mishra, A. and Beer Singh, Y. (2017) ‘Big data analytics for security and privacy challenges’, Proceeding - IEEE International Conference on Computing, Communication and Automation, ICCCA 2016, pp. 50–53. doi: 10.1109/CCAA.2016.7813688.
Dridi, A., Sassi, S.B., Chbeir, R., and Faïz, S. (2020) ‘A Flexible Semantic Integration Framework for Fully-integrated EHR based on FHIR Standard’, ICAART.
Durcevic, S. (2020) ‘18 Examples Of Big Data Analytics In Healthcare That Can Save People’, Datapine. Available at: https://www.datapine.com/blog/big-data-examples-in-healthcare/.
European Union Agency for Fundamental Rights (FRA) (2018) ‘Big Data: Discrimination in data-supported decision making’, FRA Focus, p. 14. Available at: https://fra.europa.eu/sites/default/files/fra_uploads/fra-2018-focus-big-data_en.pdf.
Favaretto, M., De Clercq, E. and Elger, B. S. (2019) ‘Big Data and discrimination: perils, promises and solutions. A systematic review’, Journal of Big Data,
Data Surveillance and the Body-subject’, Body and 6(1). doi: 10.1186/s40537-019-0177-4.
Society, 22(2), pp. 58–81. doi: Floridi, L. (2012) ‘Big data and their epistemological
10.1177/1357034X15624973. challenge’, Philosophy and Technology, 25(4), pp.
Borodin, A. V., Lebedev, N.F., Vasilyev, A., Zavyalova, 435–437. doi: 10.1007/s13347-012-0093-4.
Y.V., and Korzun, D.G (2016) 'An experimental study Francis, J. G. (2014) ‘Introduction: Technology and New
of personalized mobile assistance service in healthcare Challenges for Privacy,” Journal of Social Philosophy
emergency situations', Pdfs.Semanticscholar.Org, (c), 45(3): 291-303, University of Utah College of Law
pp. 178–183. Available at: Research Paper No. 107. Available at:
https://pdfs.semanticscholar.org/64e7/c0da0b9f7fd403 https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2
831f6e1a8ade2688f216cc.pdf. 564314.
Cahan, E. M., Hernandez-Boussard, T., Thadaney-Israni, Grandview Research Inc., (2021) ‘Healthcare Healthcare
S. and Rubin, D.L. (2019) ‘Putting the data before the Analytics Market Size, Share & Trends Analysis
Report,’ Available at International Congress on Big Data, Big Data
https://www.grandviewresearch.com/industry- Congress 2014, pp. 762–765. doi:
analysis/healthcare-analytics-market, last accessed on 10.1109/BigData.Congress.2014.112.
12/8/2021. Paulus, J. K., Wessler, B. S., Lundquist, C. M., & Kent, D.
Healthcare Industry Insights (no date) ‘Healthcare M. (2018) ‘Effects of Race Are Rarely Included in
Analytics Market Is Estimated To Be Valued At $53’. Clinical Prediction Models for Cardiovascular
Available at: Disease’, Journal of General Internal Medicine, 33(9),
https://sites.google.com/site/healthcareindustryinsights pp. 1429–1430. doi: 10.1007/s11606-018-4475-x.
/healthcare-medical-analytics-market. Piai, S. and Claps, M. (2013) ‘Bigger data for better
Hermon, R. and Williams, P. (2014) ‘Big data in healthcare’, IDC Health Insights, pp.1-24.
healthcare: What is it used for?’, Proceedings of the Raghupathi, V. and Raghupathi, W. (2020) ‘Healthcare
3rd Australian eHealth Informatics and Security Expenditure and Economic Performance: Insights
Conference, pp. 40–49. doi: From the United States Data’, Frontiers in Public
10.4225/75/57982b9431b48. Health, 8(May), pp. 1–15. doi:
Heyman, D. L., and Rodier, G. (2004) 'Global 10.3389/fpubh.2020.00156.
Surveillance, National Surveillance, and SARS. Raja, R., Ali, S., Mukherjee, I., Sarkar, B.K. (2020) ‘A
Emerging Infectious Diseases.', PubMed, 10(2), pp. Systematic Review of Healthcare Big Data’, Scientific
173–175. doi.org/10.3201/eid1002.031038 Programming, 2020. doi: 10.1155/2020/5471849.
IBM (2019) ‘Cost of a data breach report’, IBM Security, Razzak, MI, Imran, M. & Xu, G. (2020) 'Big data
p. 76. Available at: analytics for preventive medicine', Neural Comput &
https://www.ibm.com/downloads/cas/ZBZLY7KL. Applic 32, 4417–4451. doi.org/10.1007/s00521-019-
Ioannidis J. P. (2013). 'Informed consent, big data, and the 04095-y
oxymoron of research that is not research', The Rehman, A., Naz, S. and Razzak, I. (2021) ‘Leveraging
American journal of bioethics: AJOB, 13(4), 40–42. big data analytics in healthcare enhancement: trends,
Journal, H. (2019) ‘Healthcare Data Breach Statistics’, challenges and opportunities’, Multimedia Systems.
Www.Hipaajournal.Com, pp. 1–13. Available at: doi: 10.1007/s00530-020-00736-8.
https://www.hipaajournal.com/healthcare-data-breach- Reinsch, R. W. and Goltz, S. (2016) ‘Big Data: Can the
statistics/. Attempt To Be More Discriminating Be More
Kuempel, A. (2016) ‘The invisible middlemen: A critique Discriminatory Instead?’, St. Louis University Law
and call for reform of the data broker industry’, Journal, 61(1), pp. 35–82.
Northwestern Journal of International Law and Selby-Bigge, L. A. (ed.) (1975). ‘Enquiries Concerning
Business, 36(1), pp. 207–234. Human Understanding and Concerning the Principles
Luo, J. Wu, M., Gopukumar, D., & Zhao, Y. (2016) ‘Big of Morals’. Oxford University Press.
Data Application in Biomedical Research and Health Sonawane, V. P. and Irabashetti, P. (2015) ‘Method for
Care: A Literature Review’, Biomedical Informatics preventing direct and indirect discrimination in data
Insights, 8, p. BII.S31559. doi: 10.4137/bii.s31559. mining’, Proceedings - 1st International Conference on
Mariani, D.M.R., Mohammed, S. and Mohammed, S., Computing, Communication, Control and Automation,
(2015) ‘Cybersecurity challenges and compliance ICCUBEA 2015, 25(7), pp. 353–357.
issues within the us healthcare sector’, International Suresh, H. and Guttag, J. V. (2019) ‘A framework for
Journal of Business and Social Research, 5(02). understanding unintended consequences of machine
Mittelstadt, B. (2019) ‘AI Ethics – Too principled to fail?’, learning’, arXiv preprint arXiv:1901.10002, 2.
arXiv, pp. 1–15. doi: 10.2139/ssrn.3391293. Ward, J. S. and Barker, A. (2013) ‘Undefined By Data: A
Nair, S. R. (2020). ‘A review on ethical concerns in big Survey of Big Data Definitions’. Available at:
data management’, International Journal of Big Data http://arxiv.org/abs/1309.5821.
Management, 1(1), 8-25. Winter, J. S. (2018) ‘Introduction to the Special Issue:
NCI (National Cancer Institute) (2020) ‘HINTS 5 cycle 4 Digital Inequalities and Discrimination in the Big Data
public codebook’, Hints. Available at: Era’, Journal of Information Policy, 8, 1–4.
https://hints.cancer.gov/data/download-data.aspx. https://doi.org/10.5325/jinfopoli.8.2018.0001.
NIH (2021) ‘Health Information National Trends Survey Wong, Z. S. Y., Zhou, J. and Zhang, Q. (2019) ‘Artificial
(HINTS)’, National Cancer Institute, p. Survey. Intelligence for infectious disease Big Data Analytics’,
Available at: Infection, Disease and Health, 24(1), pp. 44–48. doi:
https://hints.cancer.gov/%0Ahttp://hints.cancer.gov/do 10.1016/j.idh.2018.10.002.
cs/HINTS 2007 Annotated Mail Instrument.pdf. Zhu, H, Wu, C.K., Koo, C.H., Tsang, Y.T., Liu, Y., Chi,
Obermeyer, Z., Powers B., Vogeli C., Mullainathan S. H.R. and Tsang, K-F. (2019), 'Smart Healthcare in the
(2019) ‘Dissecting racial bias in an algorithm used to Era of Internet-of-Things', IEEE Consumer
manage the health of populations’, Science, Electronics Magazine, vol. 8, no. 5, 8822574, pp. 26-
366(6464), pp. 447–453. doi: 30. https://doi.org/10.1109/MCE.2019.2923929
10.1126/science.aax2342. Žliobaitė, I. (2017) ‘Measuring discrimination in
Patil, H. K. and Seshadri, R. (2014) ‘Big data security and algorithmic decision making’, Data Mining and
privacy issues in healthcare’, Proceedings - 2014 IEEE Knowledge Discovery, 31(4), pp. 1060–1089.
