
data science 1 (1)

The document discusses the importance of data science in various industries, particularly in e-commerce and cybersecurity, emphasizing its role in analyzing data to meet business needs and improve security measures. It outlines a plan for data preprocessing at Al Azhar University, focusing on identifying and addressing data corruption and missing values. Additionally, it compares two datasets, concluding that dataset x is superior due to its smaller range and lower variance.

Uploaded by

baderahed21
Copyright
© © All Rights Reserved

‫بدر عاهد الغصين‬

20201216
In the name of God, the Most Gracious, the Most Merciful

H.w #1

1) You are a computer engineer. Using the definition and techniques of
data science, explain in detail, with examples, how data science can
serve you in your future career.

Data science is essential to the success of a business because it
revolves around analyzing large amounts of data to work out what the
business needs in order to improve and where it should focus. All of
this is done with the principles and practices that data science
provides.

Data science is used in almost every industry because in today's world
everything is about data. For an e-commerce business, for example,
analyzing the data lets us learn what customers are looking for, see
which products sell the most, and show products that match each
customer's needs. It is all about analyzing customer behavior to
increase income.

As a cyber security student, data science is very important to me
because its practices and techniques help me detect common
vulnerabilities and discover new ones, and train devices such as
firewalls and intrusion detection systems to detect and block attacks
on the network for a better defense.

2) Al Azhar University has chosen you as a data scientist to carry out
data preprocessing. What is your plan to serve your university in this
field? Answer in detail and give examples.

We first need to figure out what problem Al Azhar University has with
its data. After investigating the data records, we can apply the
appropriate data preprocessing methods to clean the data. For example,
because of the incident that happened in Gaza, all the data records
related to the semester before the incident went missing. To work
around this problem we can fill the missing values with a global
constant, use mean/mode imputation, or apply another solution
depending on which data is missing. We also need to detect whether any
of the data has been corrupted and implement the necessary fixes.
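
The two filling strategies named above can be sketched in Python. This is only a minimal illustration under stated assumptions: the records, the column names (grade, major), and the helper functions fill_constant, fill_mean, and fill_mode are all hypothetical and do not come from any real university system.

```python
from statistics import mean, mode

# Hypothetical student records; None marks a value that went missing.
records = [
    {"student_id": 1, "grade": 85.0, "major": "Computer Engineering"},
    {"student_id": 2, "grade": None, "major": "Cyber Security"},
    {"student_id": 3, "grade": 70.0, "major": None},
    {"student_id": 4, "grade": 91.0, "major": "Cyber Security"},
]


def fill_constant(rows, column, constant):
    """Global-constant imputation: every missing value gets the same fixed value."""
    return [
        {**row, column: constant if row[column] is None else row[column]}
        for row in rows
    ]


def fill_mean(rows, column):
    """Mean imputation for a numeric column, using only the observed values."""
    observed = [row[column] for row in rows if row[column] is not None]
    return fill_constant(rows, column, mean(observed))


def fill_mode(rows, column):
    """Mode imputation for a categorical column, using only the observed values."""
    observed = [row[column] for row in rows if row[column] is not None]
    return fill_constant(rows, column, mode(observed))


# Fill the numeric column with its mean and the categorical column with its mode.
cleaned = fill_mode(fill_mean(records, "grade"), "major")
for row in cleaned:
    print(row)
```

Which strategy fits depends on the column: a global constant such as "Unknown" keeps the gap visible, while mean or mode imputation hides it but keeps the column usable for later calculations.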

3) Select only one dataset and give your opinion of the data,
justifying your answer:

a) x=(20,17,19,5,60,13,18,19,4,15,10,7,8)

b) y=(99,30,20,80,88,77,3,77,60,89,55,44)

Dataset x is the better dataset for several reasons.

If we calculate the mean, median, and range for both datasets, we find
that x is better.

Mean, median, and range for x:

mean = (20+17+19+5+60+13+18+19+4+15+10+7+8)/13 = 215/13 = 16.54

median = 4,5,7,8,10,13,15,17,18,19,19,20,60 = 15

range = 60-4 = 56

Mean, median, and range for y:

mean = (99+30+20+80+88+77+3+77+60+89+55+44)/12 = 722/12 = 60.17

median = 3,20,30,44,55,60,77,77,80,88,89,99 = (60+77)/2 = 68.5

range = 99-3 = 96
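
The hand calculations can be checked with a short Python sketch using only the standard library. The helper name summarize is an assumption made for this illustration.

```python
from statistics import mean, median

x = [20, 17, 19, 5, 60, 13, 18, 19, 4, 15, 10, 7, 8]
y = [99, 30, 20, 80, 88, 77, 3, 77, 60, 89, 55, 44]


def summarize(values):
    """Return the mean (rounded to 2 places), median, and range of a list."""
    return {
        "mean": round(mean(values), 2),
        "median": median(values),  # averages the two middle values when the count is even
        "range": max(values) - min(values),
    }


summary_x = summarize(x)
summary_y = summarize(y)
print("x:", summary_x)
print("y:", summary_y)
```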

After comparing the means, we find that dataset y has the higher mean,
and its median is also far higher than that of dataset x. The range of
y is larger as well, stretched by the extreme values 3 and 99, which
hurts consistency and causes greater variability.

For these reasons, dataset x is better: it has a smaller range and a lower variance.
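
The variance itself is not worked out above, so the following sketch computes it for both datasets. It assumes the population variance (dividing by n); using the sample variance (dividing by n - 1) would change the numbers but not which dataset comes out lower.

```python
from statistics import pstdev, pvariance

x = [20, 17, 19, 5, 60, 13, 18, 19, 4, 15, 10, 7, 8]
y = [99, 30, 20, 80, 88, 77, 3, 77, 60, 89, 55, 44]

# Population variance: the average squared distance from the mean.
var_x = pvariance(x)
var_y = pvariance(y)

# Standard deviation is the square root of the variance, in the data's own units.
std_x = pstdev(x)
std_y = pstdev(y)

print(f"x: variance = {var_x:.2f}, std dev = {std_x:.2f}")
print(f"y: variance = {var_y:.2f}, std dev = {std_y:.2f}")
print("x has the lower variance:", var_x < var_y)
```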
