Advanced R
UNIT-1
a) install.package()
b) install.packages()
c) install_lib()
d) library_install()
a) set_working_dir()
b) setwd()
c) setwd("path/to/directory")
d) change_dir()
a) ==
b) =
c) <-
d) ===
a) Assignment
b) Not equal
c) Greater than
d) Equal
a) &
b) &&
c) ||
d) |
a) vector()
b) c()
c) list()
d) data.frame()
7. Which function is used to sort a vector in R?
a) arrange()
b) sort()
c) order()
d) rank()
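Worked example for question 7, a minimal sketch using a made-up vector `x`. It contrasts the functions listed in options b) to d): sort() returns the ordered values, order() returns the positions that would sort the vector, and rank() returns each element's rank.

```r
# Hypothetical example vector (not from the question bank)
x <- c(30, 10, 20)

sorted      <- sort(x)                     # values in ascending order
sorted_desc <- sort(x, decreasing = TRUE)  # values in descending order
idx         <- order(x)                    # positions that would sort x
ranks       <- rank(x)                     # rank of each element within x

print(sorted)
print(idx)
print(ranks)
```

Indexing with the result of order(), as in x[order(x)], gives the same values as sort(x).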
a) repeat(1:10)
b) seq(1, 10)
c) generate_seq(1, 10)
d) list(1:10)
a) new.factor()
b) levels()
c) factor()
d) categorize()
a) rand()
b) random()
c) runif()
d) get_random()
a) check_class()
b) class()
c) type()
d) typeof()
a) Numeric
b) Array
c) Character
d) Logical
a) Always a list
b) Always a vector
c) A vector or matrix
d) A data frame
16. Which function can be used to apply a function over subsets of a vector in R?
a) lapply()
b) tapply()
c) apply()
d) sapply()
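Worked example for question 16, a minimal sketch. The vectors `scores` and `group` are invented for illustration; tapply() splits the first vector by the grouping vector and applies the function to each subset.

```r
# Hypothetical data: four scores belonging to two groups
scores <- c(80, 90, 70, 50)
group  <- c("a", "b", "a", "b")

# Mean score within each group
means <- tapply(scores, group, mean)
print(means)
```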
a) +
b) %%
c) :=
d) ^
20. Which command would you use to repeat the number 5, ten times?
a) function() = {...}
b) function_name <- function() {...}
c) def function() {...}
d) create_function() = {...}
a) 5
b) 6
c) 8
d) 9
a) x[2, ]
b) x[[2]]
c) x[2]
d) x{2}
a) merge(v1, v2)
b) sum(v1, v2)
c) c(v1, v2)
d) combine(v1, v2)
UNIT-2
1. Which function is used to read a CSV file into R?
a) import_csv()
b) readData()
c) read.csv()
d) load.csv()
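Worked example for question 1, a minimal sketch. It writes a small made-up CSV to a temporary file so the call to read.csv() can run without any external data; the column names and values are assumptions for illustration only.

```r
# Create a small temporary CSV file (hypothetical contents)
path <- tempfile(fileext = ".csv")
writeLines(c("id,name", "1,Ana", "2,Ben"), path)

# Read it back into a data frame
df <- read.csv(path, stringsAsFactors = FALSE)
str(df)

# Clean up the temporary file
unlink(path)
```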
a) write.csv()
b) export.csv()
c) save.csv()
d) write.csv(data_frame, "file.csv")
a) info()
b) str()
c) summary()
d) head()
4. Which function is used to view the first few rows of a data frame?
a) tail()
b) preview()
c) head()
d) glimpse()
a) save_xlsx()
b) write_xl()
c) write.xlsx()
d) save.excel()
6. Which function can be used to view the summary statistics of a data frame?
a) stats()
b) summary()
c) describe()
d) info()
a) replace_na()
b) is.na() <- value
c) replace()
d) fill_na()
a) unique()
b) remove_dups()
c) distinct()
d) deduplicate()
10. How can you add a new column to an existing data frame in R?
a) add_column()
b) data_frame$new_column <- values
c) insert_column()
d) append_column()
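Worked example for question 10, a minimal sketch with a made-up data frame. It shows the `$<-` assignment form from option b) and the equivalent `[[<-` form.

```r
# Hypothetical data frame
df <- data.frame(id = 1:3)

# Add a column with $ assignment
df$score <- c(10, 20, 30)

# Equivalent form using [[ ]], here deriving a column from an existing one
df[["passed"]] <- df$score >= 20

print(df)
```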
11. To add a new row to a data frame, which function can be used?
a) rowAdd()
b) rbind(data_frame, new_row)
c) append()
d) add_row()
12. How can you merge two data frames by a common column?
a) join()
b) stack()
c) merge(df1, df2, by = "column_name")
d) combine()
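Worked example for question 12, a minimal sketch. The data frames `df1` and `df2` are invented; merge() with `by` keeps only matching keys by default, and `all.x = TRUE` keeps every row of the first data frame.

```r
# Hypothetical data frames sharing an "id" column
df1 <- data.frame(id = c(1, 2, 3), name = c("Ana", "Ben", "Cy"))
df2 <- data.frame(id = c(2, 3, 4), score = c(88, 92, 75))

# Inner join: only ids present in both data frames
inner <- merge(df1, df2, by = "id")

# Left join: all rows of df1, with NA where df2 has no match
left <- merge(df1, df2, by = "id", all.x = TRUE)

print(inner)
print(left)
```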
13. Which function reshapes data from wide format to long format?
a) spread()
b) transpose()
c) gather()
d) pivot_longer()
14. Which function reshapes data from long format to wide format?
a) gather()
b) spread()
c) wide()
d) stack()
15. What is the function used to remove a column from a data frame?
a) del_column()
b) remove()
c) df$column_name <- NULL
d) clear_column()
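Worked example for question 15, a minimal sketch with a made-up data frame. Assigning NULL removes a column in place; selecting the remaining names is an alternative that returns a new data frame.

```r
# Hypothetical data frame
df <- data.frame(id = 1:3, tmp = c("x", "y", "z"), score = c(10, 20, 30))

# Remove a column in place by assigning NULL
df$tmp <- NULL

# Alternative: keep every column except "score" in a new data frame
df2 <- df[, setdiff(names(df), "score"), drop = FALSE]

print(names(df))
print(names(df2))
```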
16. Which function can be used to reorder the levels of a factor variable?
a) rearrange_levels()
b) change_factor()
c) factor() with levels argument
d) level_order()
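Worked example for question 16, a minimal sketch. The factor `sizes` is made up; calling factor() again with an explicit `levels` argument changes the level order without changing the values.

```r
# Hypothetical factor; default levels are sorted alphabetically
sizes <- factor(c("small", "large", "medium", "small"))
print(levels(sizes))

# Reorder the levels explicitly
sizes <- factor(sizes, levels = c("small", "medium", "large"))
print(levels(sizes))
```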
a) combine()
b) rbind(df1, df2)
c) cbind(df1, df2)
d) merge()
a) rbind()
b) merge()
c) cbind(df1, df2)
d) append()
19. Which function is used to filter rows in a data frame based on conditions?
a) filter_rows()
b) subset(data_frame, condition)
c) filter_data()
d) cond_subset()
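Worked example for question 19, a minimal sketch with an invented data frame. subset() evaluates the condition inside the data frame; logical indexing with `[` is shown as the equivalent base form.

```r
# Hypothetical data frame
df <- data.frame(name = c("Ana", "Ben", "Cy"), age = c(25, 35, 45))

# Keep rows where age is greater than 30
older <- subset(df, age > 30)

# Equivalent logical indexing
same <- df[df$age > 30, ]

print(older)
```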
a) find_na()
b) is.na()
c) na.detect()
d) check_na()
21. To change the names of the columns in a data frame, which function can be used?
22. Which function converts data frames into tibble (modern data frame)?
a) convert()
b) as_tibble()
c) to_df()
d) to_tibble()
a) Removes duplicates
b) Returns a logical vector indicating duplicate rows
c) Finds unique values
d) Orders a data frame
a) str_join()
b) string_concat()
c) paste()
d) glue()
a) rename_column()
b) names(data_frame)[names(data_frame) == "old_name"] <- "new_name"
c) change_name()
d) col_rename()
UNIT-3
a) Sys.time()
b) current_time()
c) now()
d) get_time()
a) POSIXlt
b) POSIXct
c) Date
d) time
a) parseDate()
b) as.POSIXct()
c) strptime()
d) dateParse()
a) as.Date("2024-09-26")
b) date("2024-09-26")
c) dateParse("2024-09-26")
d) parseDate("2024-09-26")
7. What is the purpose of the format argument in date parsing functions like as.POSIXct()?
a) extract_year()
b) format(date, "%Y")
c) year()
d) get_year()
10. Which function is used to calculate the difference between two dates in R?
a) subtract_dates()
b) time_diff()
c) difftime()
d) date_diff()
11. What is the class of the object returned by the difftime() function?
a) difftime
b) numeric
c) POSIXct
d) Date
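Worked example for questions 10 and 11, a minimal sketch. The two dates are arbitrary; the code computes their difference with difftime() and inspects the class of the result.

```r
# Two arbitrary dates
start <- as.Date("2024-01-01")
end   <- as.Date("2024-01-31")

# Difference expressed in days
d <- difftime(end, start, units = "days")

print(class(d))                       # class of the returned object
print(as.numeric(d, units = "days"))  # plain number of days
```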
a) add_days()
b) date + days
c) date.plus()
d) append_date()
13. How would you create a sequence of dates starting from "2024-01-01" with an increment of 1 day for 10 days?
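Worked example for question 13, a minimal sketch of one way to build such a sequence in base R, using seq() on a Date with `by = "day"` and `length.out = 10`.

```r
# Ten consecutive days starting on 2024-01-01
dates <- seq(as.Date("2024-01-01"), by = "day", length.out = 10)
print(dates)
```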
14. Which function is used to truncate a date-time object to the nearest hour?
a) trunc()
b) cut_time()
c) round_time()
d) truncate_hour()
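Worked example for question 14, a minimal sketch. The timestamp is arbitrary and the UTC time zone is an assumption made so the result does not depend on the local machine. trunc() drops the minutes and seconds rather than rounding.

```r
# Arbitrary timestamp, fixed to UTC for reproducibility
x <- as.POSIXct("2024-09-26 14:47:35", tz = "UTC")

# Truncate down to the start of the hour
h <- trunc(x, units = "hours")

print(format(h, "%Y-%m-%d %H:%M:%S"))
```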
a) get_tz()
b) timezone()
c) Sys.timezone()
d) timeZone()
17. How would you set the time zone of a date-time object in R?
a) time_zone(object, "timezone")
b) attr(date, "tzone") <- "timezone"
c) set_tz(date, "timezone")
d) tz.set(date, "timezone")
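Worked example for question 17, a minimal sketch. The timestamp and the target zone "Asia/Kolkata" are arbitrary choices for illustration. Setting the "tzone" attribute changes how the instant is displayed, not the instant itself.

```r
# Arbitrary instant, created in UTC
x <- as.POSIXct("2024-09-26 12:00:00", tz = "UTC")

# Copy it and change only the display time zone
y <- x
attr(y, "tzone") <- "Asia/Kolkata"

print(format(x, "%Y-%m-%d %H:%M %Z"))
print(format(y, "%Y-%m-%d %H:%M %Z"))

# The underlying number of seconds since the epoch is unchanged
print(as.numeric(x) == as.numeric(y))
```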
18. How do you calculate the time interval between two dates in R?
a) difftime(end_date, start_date)
b) interval(start_date, end_date)
c) time_difference(start_date, end_date)
d) diff_time(start_date, end_date)
a) days
b) hours
c) seconds
d) weeks
20. How can you generate a sequence of times every hour for 5 hours starting from a specific date-time in R?
a) time_sequence(start, "hour", 5)
b) seq.POSIXt(from = start_time, by = "hour", length.out = 5)
c) seq(from = start_time, by = "hour", length.out = 5)
d) seq_time(start_time, hours = 5)
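Worked example for question 20, a minimal sketch. The starting time and the UTC time zone are assumptions for illustration. Calling seq() on a POSIXct value dispatches to the date-time method, so `by = "hour"` is accepted.

```r
# Arbitrary starting date-time, fixed to UTC
start_time <- as.POSIXct("2024-09-26 08:00:00", tz = "UTC")

# Five time points, one hour apart
times <- seq(from = start_time, by = "hour", length.out = 5)
print(times)
```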
a) time_overlap(interval1, interval2)
b) lubridate::int_overlaps(interval1, interval2)
c) overlaps(interval1, interval2)
d) check_overlap(interval1, interval2)
22. Which package is commonly used in R for working with time intervals?
a) lubridate
b) chron
c) zoo
d) timeSeries
25. Which function would you use to round a date-time object to the nearest minute in R?
a) cut_time()
b) round.POSIXt()
c) trunc.POSIXt()
d) round_time()
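Worked example for question 25, a minimal sketch. The timestamp is arbitrary and fixed to UTC. The unit is passed positionally because the name of the second argument of the POSIXt rounding method differs between R versions.

```r
# Arbitrary timestamp, 35 seconds past the minute
x <- as.POSIXct("2024-09-26 14:47:35", tz = "UTC")

# Round to the nearest minute (generic round() dispatches on the class)
r <- round(x, "mins")

print(format(r, "%H:%M:%S"))
```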
UNIT-4
1. What is the purpose of a one-sample t-test?
A) Shapiro-Wilk test
B) Durbin-Watson test
C) t-test
D) Chi-squared test
12. What does robust standard error adjust for?
A) Boxplots
B) Histograms
C) Scatter plots with regression lines
D) Heatmaps
A) Linearity
B) Independence of errors
C) Homoscedasticity
D) The response variable is normally distributed
A) The expected value of the response variable when all predictors are zero
B) The change in response variable per unit change in predictor
C) The average of the response variable
D) None of the above
19. In a two-way ANOVA, the main effects are:
20. What does the term “interaction” refer to in the context of ANOVA?
24. Which of the following best describes the assumption of normality in regression?
A) Remove outliers
B) Normalize data
C) Robust estimation
D) Standardize data
4. What is the key difference between quantile regression and ordinary least squares regression (OLS)?
6. What type of diagnostic test is used to assess the fit of a quantile regression model?
A) Shapiro-Wilk test
B) Quantile residual analysis
C) Durbin-Watson test
D) AIC/BIC selection
7. In quantile regression, which of the following quantiles represents the median?
A) 0.25
B) 0.10
C) 0.50
D) 0.75
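Worked example for question 7, a minimal sketch. It uses plain sample quantiles on a made-up vector, not a quantile regression model, to show that the 0.50 quantile coincides with the median.

```r
# Hypothetical sample
x <- c(2, 4, 4, 5, 7, 9, 10)

# Lower quartile, median and upper quartile
q <- quantile(x, probs = c(0.25, 0.50, 0.75))
print(q)

# The 0.50 quantile equals the median
print(q[["50%"]] == median(x))
```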
8. When manipulating quantile data, which method is most effective for visualization?
A) Line charts
B) Histograms
C) Quantile-Quantile (Q-Q) plots
D) Box-and-whisker plots
9. Which of the following is a common technique for treating outliers in quantile regression?
10. A key advantage of quantile regression over traditional linear regression is:
A) Normally distributed
B) Skewed
C) Non-symmetric and centered around zero
D) Uniformly distributed
12. Logit and probit regression models are primarily used for:
A) Continuous data
B) Time series data
C) Binary classification
D) Clustering analysis
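Worked example for question 12, a minimal sketch on simulated data. The sample size, seed and coefficients are arbitrary assumptions. Both models are fitted with glm() and differ only in the link function.

```r
# Simulated binary outcome (all values are illustrative)
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, size = 1, prob = plogis(0.5 + 1.2 * x))

# Logit and probit models for the same binary response
logit_fit  <- glm(y ~ x, family = binomial(link = "logit"))
probit_fit <- glm(y ~ x, family = binomial(link = "probit"))

# Exponentiated logit coefficients are odds ratios
print(exp(coef(logit_fit)))

# Predicted probabilities from both models lie between 0 and 1
print(range(fitted(logit_fit)))
print(range(fitted(probit_fit)))
```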
13. What is the key difference between Logit and Probit models?
A) Durbin-Watson
B) Hosmer-Lemeshow test
C) ANOVA
D) Akaike Information Criterion
16. In a logistic regression model, the odds ratio can be interpreted as:
A) Exponential
B) Logistic
C) Linear
D) Cumulative normal distribution
19. When writing quantile data to a file, the most appropriate format is:
A) CSV
B) JSON
C) XML
D) YAML
20. What is the primary benefit of using quantile data visualization techniques like box plots?
A) Normally distributed
B) Not normally distributed
C) Uniformly distributed
D) Exponentially distributed
22. A key assumption in both Logit and Probit regression models is:
A) Homoscedasticity
B) Linearity between the log-odds and the predictors
C) Normality of residuals
D) Constant variance of error terms
23. In quantile regression, the primary measure of accuracy for outlier treatment is:
24. Which of the following tools is best suited for manipulating quantile data?
A) PCA
B) Linear regression
C) Quantile transformation
D) Lasso regression
25. When plotting a Q-Q plot for quantile data, what does a deviation from the line suggest?
A) Linear relationship
B) Homoscedasticity
C) Presence of outliers or non-normality
D) Strong correlation
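Worked example for question 25, a minimal sketch on simulated data. The seed and sample size are arbitrary. It computes the points of a normal Q-Q plot for a skewed sample without drawing them, then runs a Shapiro-Wilk test on the same sample.

```r
# Simulated right-skewed sample (illustrative only)
set.seed(42)
skewed_x <- rexp(200)

# Coordinates of the normal Q-Q plot, without drawing it
qq <- qqnorm(skewed_x, plot.it = FALSE)

# Theoretical quantiles are in qq$x, sample quantiles in qq$y
print(head(data.frame(theoretical = qq$x, sample = qq$y)))

# Formal normality check on the same sample
print(shapiro.test(skewed_x)$p.value)
```

To see the plot itself, call qqnorm(skewed_x) followed by qqline(skewed_x); points bending away from the line indicate departure from normality.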