BAE815 - Liu - 06 - Data Analysis For Scientific Research
Data Analysis for Scientific Research
BAE 815
Dr. Zifei Liu
[Figure: "In the old days" versus "Modern life"]
[Diagram labels: data requirement, sampling, experimental design; numerical and categorical data; inspecting, cleaning, transforming, integration; descriptive, exploratory, confirmatory, and predictive analysis; data mining; interpretation; visualization; decision making]
• Be objective
• Separate facts from opinion
• Support arguments with data
The goal is to obtain usable and useful information.
• To identify and understand patterns in data
• To identify relationships between variables
• To compare variables and identify the differences between them
• To explain cause-and-effect phenomena
• To forecast outcomes
Data collection
• Check the data for completeness and accuracy: handle missing values, undetected values, duplicates, and outliers, and correct errors.
• Code and clean the data.
• Initial data analysis: check and question the assumptions required by the subsequent data analysis and hypothesis testing.
– Linearity
– Normality
– Symmetry
– Effect of unusual observations
• Transform variables as needed.
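The screening steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the sample IDs, the use of `None` for a missing value, the 1.5 × IQR outlier rule, and the log transform are all hypothetical choices, not a procedure prescribed by the course.

```python
import math
import statistics

# Hypothetical raw records: (sample_id, concentration). None marks a missing value.
raw = [
    ("s1", 2.1), ("s2", 2.4), ("s2", 2.4),   # s2 was entered twice
    ("s3", None), ("s4", 1.9), ("s5", 2.2),
    ("s6", 2.0), ("s7", 2.3), ("s8", 25.0),  # s8 looks suspicious
]

def clean(records):
    """Drop records with missing values and exact duplicate records."""
    seen = set()
    kept = []
    for sample_id, value in records:
        if value is None or (sample_id, value) in seen:
            continue
        seen.add((sample_id, value))
        kept.append((sample_id, value))
    return kept

def flag_outliers(records, k=1.5):
    """Return records whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [v for _, v in records]
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(s, v) for s, v in records if v < low or v > high]

cleaned = clean(raw)
outliers = flag_outliers(cleaned)

# A log transform is one common choice for right-skewed, strictly positive data.
log_values = [math.log10(v) for _, v in cleaned]

print(len(cleaned), "records kept; flagged:", outliers)
```

A flagged value is a prompt to go back and question the observation (entry error, instrument fault, or a real extreme), not a license to delete it automatically.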
Data processing
• Descriptive: How can the data be summarized?
• Exploratory/Inferential: focuses on discovering new features in the data and suggesting new hypotheses. How can we draw inferences from the data?
• Confirmatory: focuses on confirming or falsifying existing hypotheses.
• Predictive: How can we build predictive models using the data available?
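As a small illustration of the descriptive and predictive questions, the sketch below summarizes one variable and fits a straight line by ordinary least squares. The data values and the names `describe` and `fit_line` are made up for this example; a straight line is assumed only for simplicity, and real data would need the linearity check first.

```python
import statistics

# Hypothetical paired observations (x: dose, y: response); the values are made up.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

def describe(values):
    """Descriptive analysis: summarize one variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

def fit_line(xs, ys):
    """Predictive analysis: least-squares fit of y = intercept + slope * x."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

summary = describe(y)
slope, intercept = fit_line(x, y)

# Use the fitted line to forecast the response at an unobserved x.
predicted_at_6 = intercept + slope * 6.0

print(summary)
print(f"y = {intercept:.2f} + {slope:.2f} x; prediction at x = 6: {predicted_at_6:.2f}")
```

The prediction at x = 6 lies outside the observed range, so it is an extrapolation and deserves more caution than a prediction inside it.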
• Probability in research
– Nothing is certain
– Scientific “truth” is usually a statement of what is most probable given the currently known data
Test of hypotheses
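One way to make "most probable given the data" concrete is an exact permutation test. The sketch below is an assumption-laden example, not the course's method: the two groups and their values are hypothetical, and the test statistic (absolute difference in means) is one choice among many. It asks how often a difference at least as large as the observed one would arise if the group labels were assigned by chance.

```python
from itertools import combinations
from statistics import mean

# Hypothetical measurements for two groups; the values are made up.
control = [4.0, 4.2, 4.5, 4.7, 4.9]
treated = [5.1, 5.3, 5.6, 5.8, 6.0]

def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference in group means.

    Enumerates every way of splitting the pooled data into groups of the
    original sizes and returns the fraction of splits whose absolute mean
    difference is at least as large as the observed one.
    """
    pooled = a + b
    observed = abs(mean(a) - mean(b))
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

p_value = permutation_p_value(control, treated)
print(f"p = {p_value:.4f}")
```

Here every treated value exceeds every control value, so only 2 of the 252 possible splits are as extreme as the observed one. A small p-value makes "no difference" implausible given these data; it does not make the alternative certain.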
• Statement 1: A is a human being. B is a gorilla. Between A and B there are many similarities, but A has many superior attributes when compared with B.
• Statement 2: The similarities show that both A and B had a common origin. The superiorities suggest that A evolved from B over millions of years.
• Statement 3: The similarities show that both A and B had a common origin: the creator God. The superior attributes of A show that God chose to create human beings in His own image, and this was not the case with the creation of animals.
Credit: Elaine Kennedy
Data interpretation
The duck-rabbit illusion
Data interpretation
• Justify the methodology; cite agreement with previous studies
• Offer an interpretation or explanation of the results
• Discuss limitations and point out discrepancies
• Comment on the data; state the implications and recommend further research
• Old friend: MS Excel
• Abaqus
• Ansys
• LAMMPS
• MATLAB
• Mathematica
• LabVIEW
• SAS, SPSS
• R (available free over the internet)
• Many more!
Final word
Blind men and an elephant
“A data scientist is someone who knows more statistics than a
computer scientist and more computer science than a statistician.”
- Josh Blumenstock