Lec 1
FOUNDATIONS OF COMPUTER
SCIENCE
Introduction
What is R?
• R is a popular programming language used for
statistical computing and graphical presentation.
• Its most common use is to analyse and visualize
data.
• R is a scripting language (scripting languages are usually
interpreted rather than compiled).
• It was inspired by, and is mostly compatible with,
the statistical language S developed by AT&T.
• R was designed by Ross Ihaka and Robert Gentleman and
is now developed by the R Core Team.
Why Use R?
• It is a great resource for data analysis, data
visualization, data science and machine learning.
• It provides many statistical techniques (such as
statistical tests, classification, clustering and data
reduction)
• It is easy to draw graphs in R, such as pie charts,
histograms, box plots, scatter plots, etc.
• It works on different platforms (Windows, Mac,
Linux)
• It is open-source and free
• It has large community support
• It has many packages (libraries of functions) that
can be used to solve different problems.
R Features
• Statistical inference
• Data analysis
• Machine learning algorithms
How to Run R
What is CRAN?
• CRAN stands for the Comprehensive R
Archive Network. It provides binary
files for R; download the one for your
platform, follow the installation
instructions and accept all defaults.
• Download R from http://cran.r-project.org/.
After installation, the R Console window
appears inside the RGui (graphical user
interface).
The following figure shows a sample R GUI.
R Studio: R Studio is an Integrated Development Environment
(IDE) for the R language with a more advanced and user-friendly
GUI than the default R console. It is open-source (i.e., free) and
available at http://www.rstudio.com/.
The figure shows the GUI of R Studio. The R Studio screen has four
windows:
1. Console.
2. Workspace and history.
3. Files, plots, packages and help.
4. The R script(s) and data view.
• At the console:
R can be used as a calculator by typing commands
directly into the R Console. Launch R and type an
expression at the console, pressing <Enter> after
each command.
• R Sessions:-
• R is a case-sensitive, interpreted language. You can enter
commands one at a time at the command prompt (>) or run a
set of commands from a source file.
• There are a wide variety of data types, including vectors,
matrices, data frames (similar to datasets), and lists
(collections of objects).
• The standard assignment operator in R is <-. = can also be used,
but this is discouraged, as it does not work in some special
situations.
• A variable can be printed without any print statement by
typing the name of the variable.
> y <- 5
> y # print out y
[1] 5
> 4/3
[1] 1.333333
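The data structures named in the session notes above (vectors, matrices, data frames and lists) can be sketched as follows. The variable names and values are purely illustrative and do not come from the lecture.

```r
# Illustrative objects only; names and values are assumptions for this sketch.
v  <- c(1, 2, 3)                                      # numeric vector
m  <- matrix(1:6, nrow = 2)                           # matrix with 2 rows and 3 columns
df <- data.frame(id = 1:3, name = c("a", "b", "c"))   # data frame (similar to a dataset)
l  <- list(v, m, df)                                  # list: a collection of objects

is.vector(v)   # TRUE
is.matrix(m)   # TRUE
class(df)      # "data.frame"
length(l)      # 3
```

A list can hold objects of different kinds, which is why it can contain the vector, the matrix and the data frame at once.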
• R follows the basic order of operations: Parentheses, Exponents,
Multiplication, Division, Addition and Subtraction (PEMDAS). This
means that operations inside parentheses take priority over other
operations.
• Next on the priority list is exponentiation. After that, multiplication
and division are performed, followed by addition and subtraction.
Example:-
> 4 * (6 + 5)
[1] 44
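A few more expressions may make the precedence rules concrete. This is a small sketch; the variable names are hypothetical and only hold the results for inspection.

```r
# Each result follows from the PEMDAS ordering described above.
no_parens   <- 2 + 3 * 4     # multiplication first: 14
with_parens <- (2 + 3) * 4   # parentheses first: 20
power_first <- 2 * 3^2       # exponentiation before multiplication: 18
neg_power   <- -2^2          # exponentiation before the unary minus: -4

no_parens
with_parens
power_first
neg_power
```

The last line is a common surprise: to square a negative number, write (-2)^2.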
• Variables:- Variables are an integral part of
any programming language. R does not
require variable types to be declared. A
variable can take on any available data type.
It can hold any R object, such as a function,
the result of an analysis or a plot. A single
variable can at one point hold a number, then
later hold a character string and then later a
number again.
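The point that one variable can hold different kinds of objects over time can be sketched as below. The name v is hypothetical and chosen only for this illustration.

```r
# The same name is rebound to objects of different classes; no declaration is needed.
v <- 42
class(v)    # "numeric"

v <- "forty-two"
class(v)    # "character"

v <- mean   # a variable can even hold a function
class(v)    # "function"

v <- 7
class(v)    # "numeric" again
```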
• Variable Assignment:- There are a number of
ways to assign a value to a variable, and the
choice does not depend on the type of value
being assigned. There is no need to declare
the variable first.
Example:-
• > x <- 6 # assignment operator: a less-than character (<) and a hyphen (-) with no space between them
> x
[1] 6
• > y = 3 # the assignment operator = is used
> y
[1] 3
• > z <<- 9 # assignment to a global variable rather than a local variable
> z
[1] 9
• > 5 -> fun # the rightward assignment operator (->) assigns the value on its left to the name on its right
> fun
[1] 5
• > a <- b <- 7 # the same value can be assigned to several variables at once
> a
[1] 7
> b
[1] 7
• > assign("k", 12) # the assign() function can also be used
> k
[1] 12
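The difference between <- and <<- only becomes visible inside a function. The following is a minimal sketch; counter, bump_local and bump_global are hypothetical names introduced for illustration.

```r
counter <- 0

# <- inside a function creates or changes a local variable only.
bump_local <- function() {
  counter <- counter + 1
  invisible(NULL)
}

# <<- changes the variable in the enclosing environment
# (the global environment when this code is run at top level).
bump_global <- function() {
  counter <<- counter + 1
  invisible(NULL)
}

bump_local()
counter    # still 0

bump_global()
counter    # now 1
```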
Removing Variables:- The rm() function is used to remove
variables. This frees up memory so that R can store more
objects, although it does not necessarily free up memory for
the operating system.
There is no "undo": once a variable is removed, it cannot be
recovered.
Variable names are case sensitive.
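Removal and case sensitivity can both be checked with exists(). This is a sketch under the assumption of otherwise unused names; j and theVariable are hypothetical.

```r
j <- 4
exists("j", inherits = FALSE)             # TRUE: j is defined

rm(j)                                     # remove j; this cannot be undone
exists("j", inherits = FALSE)             # FALSE: j is gone

theVariable <- 17
exists("theVariable", inherits = FALSE)   # TRUE
exists("thevariable", inherits = FALSE)   # FALSE: names differing only in case are different variables
```

Typing a removed or misspelled name at the console produces an "object not found" error.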
> class(x) # class() reports the type of data held in a variable
[1] "numeric"
Numeric Data:- The most commonly used data type in R is
numeric. This is similar to float or double in other languages.
It handles integers and decimals, both positive and negative,
and also zero.
> i <- 5L # the L suffix stores the value as an integer
> i
[1] 5
> is.integer(i) # Testing whether a variable is integer or not
[1] TRUE
> is.numeric(i)
[1] TRUE
• R promotes integers to numeric when needed.
• Multiplying an integer by a numeric results in a numeric (decimal) value.
• Dividing an integer by a numeric results in a numeric (decimal) value.
• Dividing an integer by an integer also results in a numeric (decimal) value.
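The promotion rules listed above can be checked with class(). The literal values here are arbitrary examples, not taken from the lecture.

```r
class(4L)          # "integer"
class(2.8)         # "numeric"

class(4L * 2.8)    # "numeric": integer times numeric
class(5L / 2.5)    # "numeric": integer divided by numeric
class(5L / 2L)     # "numeric": integer divided by integer
5L / 2L            # 2.5
```

For whole-number division, R provides the separate operator %/%, so 5L %/% 2L stays an integer.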
> class(date1)
[1] "Date"
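The definition of date1 does not appear above. A minimal sketch, assuming it was created with as.Date() from a date string (the date used here is an arbitrary illustration, not the original value):

```r
# Assumption: date1 is created with as.Date(); the date string is illustrative only.
date1 <- as.Date("2024-01-15")
date1
class(date1)        # "Date"

# Internally a Date is stored as the number of days since 1970-01-01.
as.numeric(date1)
```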
> TRUE
[1] TRUE
> TRUE*5
[1] 5
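The result of TRUE*5 follows from the fact that logical values behave as 1 (TRUE) and 0 (FALSE) in arithmetic. A short sketch with illustrative expressions:

```r
class(TRUE)                  # "logical"
TRUE * 5                     # 5, because TRUE counts as 1
FALSE * 5                    # 0, because FALSE counts as 0
is.logical(2 == 3)           # TRUE: a comparison returns a logical value
sum(c(TRUE, FALSE, TRUE))    # 2: counts the TRUE values
```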
Mode vs Class: