Factors Factors: LM GLM

Factors are used to represent categorical data in R and can be either unordered or ordered. Factors are integer vectors that map each integer to a label. Factors are treated specially by modelling functions as they are self-describing using meaningful labels like "Male" and "Female" rather than arbitrary integers. The levels of a factor can be set to control the ordering and baseline level used in linear models.

Uploaded by

RustEd

Factors

Factors are used to represent categorical data. Factors can be unordered or ordered. One can think
of a factor as an integer vector where each integer has a label.
Factors are treated specially by modelling functions like lm() and glm(), which code each level through contrasts rather than treating the values as plain numbers.
Using factors with labels is better than using integers because factors are self-describing; having
a variable that has values Male and Female is better than a variable that has values 1 and 2.
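
As a rough sketch of the point about labels, the snippet below turns integer codes into a labelled factor and then builds an ordered factor. The variable names (codes, sex, size) and the data are made up for illustration and do not come from the slides.

```r
# Hypothetical data: sex recorded as integer codes 1 and 2
codes <- c(1, 2, 2, 1, 2)

# Attach meaningful labels to the codes
sex <- factor(codes, levels = c(1, 2), labels = c("Male", "Female"))
print(sex)
print(table(sex))

# An ordered factor: the levels carry a ranking, so comparisons work
size <- factor(c("low", "high", "medium", "low"),
               levels = c("low", "medium", "high"),
               ordered = TRUE)
print(size < "high")  # TRUE for "low" and "medium", FALSE for "high"
```

The labelled version keeps the same underlying integers but prints and tabulates with the names, which is what makes it self-describing.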


Factors
> x <- factor(c("yes", "yes", "no", "yes", "no"))
> x
[1] yes yes no yes no
Levels: no yes
> table(x)
x
 no yes
  2   3
> unclass(x)
[1] 2 2 1 2 1
attr(,"levels")
[1] "no" "yes"
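
The two parts that unclass() reveals can also be read one at a time. This is a small sketch using standard base R accessors on the same vector x as above; it is meant as an illustration, not as part of the original slides.

```r
# Same factor as in the slide
x <- factor(c("yes", "yes", "no", "yes", "no"))

print(levels(x))      # the labels, in level order: "no" "yes"
print(nlevels(x))     # number of distinct levels: 2
print(as.integer(x))  # the underlying integer codes: 2 2 1 2 1
```

By default the levels are sorted alphabetically, which is why "no" gets code 1 and "yes" gets code 2.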


Factors
The order of the levels can be set using the levels argument to factor(). This can be important
in linear modelling because the first level is used as the baseline level.
> x <- factor(c("yes", "yes", "no", "yes", "no"),
+             levels = c("yes", "no"))
> x
[1] yes yes no yes no
Levels: yes no
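
To see what "baseline level" means in practice, the sketch below compares the design matrix columns produced for the two level orderings. It assumes R's default treatment contrasts (the standard setting of options("contrasts")); the names x_default and x_yes_first are invented for this example. relevel() is shown as one alternative way of moving a level to the front.

```r
# Default ordering: levels sorted alphabetically, so "no" is the baseline
x_default <- factor(c("yes", "yes", "no", "yes", "no"))

# Move "yes" to the front so that it becomes the baseline instead
x_yes_first <- relevel(x_default, ref = "yes")
print(levels(x_yes_first))  # "yes" "no"

# Under treatment contrasts the baseline is absorbed into the intercept
# and every other level gets its own indicator column
print(colnames(model.matrix(~ x_default)))    # "(Intercept)" "x_defaultyes"
print(colnames(model.matrix(~ x_yes_first)))  # "(Intercept)" "x_yes_firstno"
```

The level that does not appear as a column is the baseline, so coefficients reported by lm() or glm() are differences relative to it.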

