Beginners Guide For Business and Science
By Charles Jensen
Copyright © 2017
All rights reserved. No part of this book
may be reproduced in any form or by any
means without permission in writing
from the publisher, Charles Jensen.
Volume
It refers to the size of data. The range
lies from terabytes up to yottabytes.
Velocity
It refers to the speed of data. The
scope ranges from yearly up to real
time.
TYPES OF REGRESSION
Regressions range from simple equations
to complex ones.
Spearman’s rho
It is a form of statistical measure that
correlates the ranking of the variables.
For example, beauty pageants use this
measure to check how closely the
judges’ rankings of the contestants
agree.
The formula below is used to compute
the rank correlation. The coefficient,
ρ (rho), is the Greek letter for r.
$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$
d = difference between ranks; n =
number of categories
Statisticians interpret Spearman’s rho
with the same qualitative descriptions
used in interpreting r.
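To make the rank correlation concrete, here is a minimal Python sketch of the standard no-ties formula for Spearman’s rho. The function names and the two judges’ rankings are made up for illustration, and the shortcut formula assumes that no two contestants share the same rank.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward. Assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman's rho using 1 - 6*sum(d^2) / (n*(n^2 - 1)).

    d is the difference between paired ranks and n is the number of
    categories. This shortcut is only valid when there are no ties.
    """
    if len(x) != len(y):
        raise ValueError("both lists must have the same length")
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical rankings that two judges gave to five contestants
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

rho = spearman_rho(judge_a, judge_b)
print(round(rho, 2))  # 0.8
```

A value close to 1 suggests that the two judges ranked the contestants in nearly the same order.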
VARIABLES
In choosing an appropriate tree, you
base your decision upon the target
variable. There are two types.
1. Categorical Variable – also known
as a nominal variable. It has a finite
domain set, and its values can be
grouped into categories.
ALGORITHM
J. Ross Quinlan developed the Iterative
Dichotomiser 3 (ID3) algorithm in 1980.
It became the foundation for
constructing decision trees.
Note: decision trees use a top-down,
greedy search. The search starts with
a single node (the root) and then
divides the data into classes that
contain similar values. ID3 uses entropy
and information gain to choose each
division.
Entropy measures how mixed the classes
in a sample are. You can read it as the
average number of yes or no questions
needed to identify the class of an item.
A completely homogeneous sample has an
entropy of zero. If the sample at the
root is divided equally between two
classes, then the entropy is one.
Information gain signifies the decrease
in the entropy of the target variable
after a split.
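The short Python sketch below shows how entropy and information gain can be computed for a two-class target. It is an illustration only, not the full ID3 algorithm. The function names and the yes/no samples are assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(count / total) * log2(count / total)
               for count in Counter(labels).values())


def information_gain(parent, subsets):
    """Entropy of the parent minus the weighted entropy of its subsets."""
    total = len(parent)
    weighted = sum(len(subset) / total * entropy(subset) for subset in subsets)
    return entropy(parent) - weighted


# Hypothetical root sample, divided equally between two classes
parent = ["yes"] * 5 + ["no"] * 5

# Hypothetical split of the root into two branches
left = ["yes"] * 4 + ["no"]
right = ["yes"] + ["no"] * 4

print(entropy(["yes"] * 4))                             # 0.0 (all the same class)
print(entropy(parent))                                  # 1.0 (divided equally)
print(round(information_gain(parent, [left, right]), 3))  # 0.278
```

A greedy search would compare the information gain of every candidate split and keep the one with the highest value.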
PROCESS
There are five basic steps for you to
construct a classification tree.
1. Separate the data into homogenous
and non-homogenous variables.
Homogenous variables have low
entropy.
TREE PRUNING
Overfitting happens when a model is too
complex, for example when it involves
too many variables. As a result, the
decision tree fits the training data
closely but becomes less effective on
new data. On the other hand,
underfitting occurs when the model
cannot capture the trend in the given
data.
The following approaches can be
implemented to prevent data overfitting:
Pre-pruning
It uses the error estimate to end the
tree’s construction.
Post-pruning
It uses the chi-square test to
eliminate a sub-tree.
Pruned trees are smaller and easier to
interpret.
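As a rough illustration of how a chi-square test can flag a sub-tree for removal, the Python sketch below computes the Pearson chi-square statistic for one split. The helper names, the counts, and the pruning rule itself are assumptions made for this example. The threshold 3.841 is the standard critical value for one degree of freedom at the 5% level. The sketch also assumes that every row and column of the table has at least one observation.

```python
def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts.

    Rows are the branches of a split; columns are the target classes.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Critical value for 1 degree of freedom at the 5% significance level
CRITICAL_VALUE = 3.841


def should_prune(table, critical=CRITICAL_VALUE):
    """Assumed rule: prune when the split is not statistically significant."""
    return chi_square(table) < critical


# Hypothetical class counts (yes, no) in the two branches of a split
informative_split = [[18, 2], [3, 17]]
weak_split = [[10, 10], [9, 11]]

print(should_prune(informative_split))  # False, the split separates the classes
print(should_prune(weak_split))         # True, the split adds little
```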
Chapter 7: Social Network
Analysis
At the end of this section, you are
expected to:
Understand the definition of social
network analysis;
Differentiate social network analysis
from sentiment and decision-tree
analyses;
Discover the purpose of social
network analysis.
7.1 What is Social Network Analysis
(SNA)?
Unlike sentiment analysis, which
studies individual attributes, social
network analysis studies the
relationships between people and
groups. It answers the questions,
“How do the relationships form?”, and
more importantly, “What are the
consequences of these relationships?”
This type of analysis originated with
the sociologists Georg Simmel and Émile
Durkheim.
SNA rests on the idea that social
patterns shape the lives of individuals
and the people surrounding them. Others
claim that these patterns determine the
success or failure of institutions. For
them, the internal structure of a
company affects the overall growth of
the business.
7.1.1 Types of Social Network Analysis
There are two types of SNA. The
egocentric analysis examines an
individual’s personal network and its
effects while sociocentric analysis
assesses large groups of people.
Egocentric analysis relies on surveys.
The interviewer asks the respondents
about their interactions with other
people. This type of SNA is convenient
to implement.
Analysts consider sociocentric analysis
harder than egocentric analysis. As a
researcher, you have to collect all the
possible relationship sets among the
sample respondents.
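The difference between the two types can be sketched in Python. In this minimal example, the whole network stands for the sociocentric view, and the ego network of one person stands for the egocentric view. The names, the ties, and the helper functions are all hypothetical.

```python
# Hypothetical ties reported between five people
ties = [
    ("Ana", "Ben"),
    ("Ana", "Cara"),
    ("Ben", "Cara"),
    ("Cara", "Dev"),
    ("Dev", "Eli"),
]


def build_network(ties):
    """Sociocentric view: map every person to the set of people they are tied to."""
    network = {}
    for a, b in ties:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network


def ego_network(network, ego):
    """Egocentric view: the ego, the ego's direct contacts, and the ties among them."""
    members = {ego} | network[ego]
    return {person: network[person] & members for person in members}


network = build_network(ties)
ana = ego_network(network, "Ana")

print(len(network))  # 5 people in the whole network
print(sorted(ana))   # ['Ana', 'Ben', 'Cara']
```

The ego network only needs answers from one respondent, while the whole network needs the ties of every person in the group.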
7.1.2 Related Terms
Propinquity refers to the tendency of
people to form closer ties with those
who are physically near them than with
others.
Homophily refers to the tendency of
individuals to associate with people
who share the same traits. It is also
known as assortativity. Shared
characteristics, such as beliefs, value
systems, and personality, make
relationship formation easier.
The Social Comparison and Social
Identity theories support the idea of
homophily. They state that people choose
similar others for comparison and that
individuals define their identity
through the groups they belong to.
Moreover, nodes (points) refer to the
individual people within the network.
Lines, also called links or ties, refer
to the interactions connecting them. A
sociogram is the diagram used to
represent the analysis.
A dyad is a pair of actors and their
interaction. A triad is a group of
three actors and their relationships.
A subgroup is a subset of the actors
and their interactions.
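The terms above can be listed out for a small network. The Python sketch below is one possible way to enumerate dyads and triads. The actors and their ties are hypothetical, and the idea of a "closed" triad (all three pairs are tied) is added here only for illustration.

```python
from itertools import combinations

# Hypothetical ties; each tie is an unordered pair of actors
ties = {frozenset(pair) for pair in [
    ("Ana", "Ben"),
    ("Ana", "Cara"),
    ("Ben", "Cara"),
    ("Cara", "Dev"),
]}

# The nodes of the network are the actors that appear in at least one tie
actors = sorted({name for tie in ties for name in tie})

# Every possible pair of actors is a dyad; every group of three is a triad
dyads = list(combinations(actors, 2))
triads = list(combinations(actors, 3))


def is_tied(a, b):
    """True when a line connects the two actors."""
    return frozenset((a, b)) in ties


connected_dyads = [d for d in dyads if is_tied(*d)]
closed_triads = [t for t in triads
                 if all(is_tied(a, b) for a, b in combinations(t, 2))]

print(len(dyads), len(triads))  # 6 4
print(closed_triads)            # [('Ana', 'Ben', 'Cara')]
```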