data science assignment
Grade 11
-Data science is an interdisciplinary field that combines a blend of techniques, methods,
algorithms, and systems to extract insights and knowledge from structured and unstructured
data. It applies scientific methods and processes to analyze and interpret data, often with the
goal of informing decision-making, solving problems, or predicting future trends.
-The creation of data science was driven by the explosion of data generated by digitalization,
the need for organizations to make data-driven decisions, and advancements in computing
power and algorithms like machine learning. As businesses faced increasingly large and
complex datasets, traditional methods were no longer sufficient, prompting the development of
new techniques for analyzing, interpreting, and leveraging data for insights and competitive
advantage.
-Data science provides tools and techniques that organizations use to take advantage of the
vast amount of data that they own.
-Data science can be categorized into various types based on its applications, techniques, and
areas of focus. The main types of data science include:
1. Data Engineering
2. Statistics and Probability
3. Machine Learning
4. Data Visualization
5. Big Data
6. Artificial Intelligence (AI)
7. Data Mining
8. Business Analytics
9. Operations Research
10. Computational Science
-Data science practitioners use different tools at different stages in order to solve a problem.
-Big data refers to datasets that are so large, complex, and dynamic that traditional data
processing tools and techniques are insufficient to handle them effectively. It is characterized by
the three Vs:
1. Volume: The sheer amount of data being generated from various sources such as social
media, sensors, transactions, and more, ranging from terabytes to exabytes of data to process.
2. Velocity: The speed at which data is created, processed, and analyzed, often in real-time
or near-real-time.
3. Variety: The diverse types of data, including structured, semi-structured, and
unstructured data (e.g., text, images, video, sensor data).
-Big data requires specialized tools and technologies, such as Hadoop, Apache Spark, and
databases, to store, process, and analyze the vast and varied datasets to extract meaningful insights.
-The insights generated through data science tools and techniques apply to almost all fields. In
manufacturing, for example, data science can be used to forecast product demand, and that
forecast is then used to determine the precise amount of raw materials that needs to be ordered.
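The manufacturing example can be sketched in a few lines of Python. This is only a minimal illustration, not how a real factory would forecast: it assumes a simple moving average as the forecasting method, and the sales figures, the function names (forecast_demand, raw_material_order), and the material-per-unit value are all made up for the example.

```python
def forecast_demand(past_demand, window=3):
    """Forecast next period's demand as the average of the last `window` periods.

    A moving average is an assumption chosen for simplicity; real forecasts
    often use more advanced statistical or machine learning models.
    """
    if window <= 0 or len(past_demand) < window:
        raise ValueError("need at least `window` past observations")
    recent = past_demand[-window:]
    return sum(recent) / window


def raw_material_order(forecast_units, material_per_unit, stock_on_hand=0.0):
    """Raw material to order: what the forecast needs minus what is in stock."""
    needed = forecast_units * material_per_unit
    return max(0.0, needed - stock_on_hand)  # never order a negative amount


# Hypothetical data: units sold in each of the last six months.
monthly_units_sold = [120, 135, 150, 160, 155, 165]

forecast = forecast_demand(monthly_units_sold, window=3)
order_kg = raw_material_order(forecast, material_per_unit=2.5, stock_on_hand=100.0)

print(f"Forecast demand next month: {forecast} units")
print(f"Raw material to order: {order_kg} kg")
```

Here the forecast is the average of the last three months, and the order is the material that forecast requires minus the stock already on hand.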
-The personalization of information on social media is achieved through the use of data science
tools on the massive amount of data that social media companies collect from their users.
-Weather prediction in the agricultural sector, preventive medicine in health care, and risk
management in business are just a few examples of areas where data science has greatly
improved results.
- The application of data science on social networking sites and similar platforms involves the
collection of large amounts of user data. This sometimes creates tension with privacy,
especially in countries where there are strong privacy regulations.
- Data anonymization and data generalization are some of the ways suggested for tackling
issues of data protection and privacy.
-Data anonymization refers to removing personally identifiable information from data, while
data generalization is about grouping data into broad categories such as age groups and
geographical areas.
- More broadly, data generalization is the process of abstracting or simplifying data to make it
more manageable, interpretable, and useful for analysis, modeling, or decision-making.
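The two ideas above can be illustrated with a short Python sketch. It is a simplified example under stated assumptions: the record fields, the list of personally identifiable fields (PII_FIELDS), the ten-year age groups, and the city-to-region table are all hypothetical. Removing obvious identifiers like this does not by itself guarantee that a person can never be re-identified.

```python
# Assumed list of fields that directly identify a person.
PII_FIELDS = {"name", "email", "phone"}

# Hypothetical lookup table from city to a broader region.
CITY_TO_REGION = {"Springfield": "North", "Rivertown": "South"}


def anonymize(record):
    """Data anonymization: drop personally identifiable fields."""
    return {key: value for key, value in record.items() if key not in PII_FIELDS}


def generalize(record):
    """Data generalization: replace exact values with broad categories."""
    result = dict(record)  # copy so the original record is not modified
    age = result.pop("age")
    low = (age // 10) * 10  # start of the ten-year age group
    result["age_group"] = f"{low}-{low + 9}"
    city = result.pop("city")
    result["region"] = CITY_TO_REGION.get(city, "Other")
    return result


# Made-up user record for illustration.
user = {
    "name": "A. Student",
    "email": "a.student@example.com",
    "age": 17,
    "city": "Springfield",
    "posts": 42,
}

safe_record = generalize(anonymize(user))
print(safe_record)
```

After both steps the record keeps only the post count, an age group, and a region, so it can still be analyzed without the exact age, city, name, or email.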