Quiz
Question 1 (1 point)
The Derby DB warehouse directory is pointed to by "spark.sql.warehouse.dir". It creates the database sales_db if it does not exist; dropping it removes the sales metadata from the metastore and the data from the warehouse directory. (Hive has two parts, metadata and data: the metadata is in the metastore and the data is in the file system, so the actual data is in
"C:/Users/pedro/Documents/test_data/hive".)
Question 1 options:
True
False
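Question 1 describes configuring the warehouse directory and creating a database. A minimal sketch of that flow, assuming a local SparkSession with Hive support available (only the warehouse path and the sales_db name come from the question; everything else is illustrative):
import org.apache.spark.sql.SparkSession
// Warehouse location taken from the question; assumed to exist and be writable.
val spark = SparkSession.builder()
  .appName("WarehouseDirSketch")
  .master("local[*]")
  .config("spark.sql.warehouse.dir", "C:/Users/pedro/Documents/test_data/hive")
  .enableHiveSupport()
  .getOrCreate()
// Creates the database only if it is not already in the metastore.
spark.sql("CREATE DATABASE IF NOT EXISTS sales_db")
// Dropping it removes the metadata from the metastore; CASCADE also removes
// the table data stored under the warehouse directory.
spark.sql("DROP DATABASE IF EXISTS sales_db CASCADE")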
Question 2 (1 point)
In Spark SQL, high performance is achieved using the Catalyst switch.
Question 2 options:
True
False
Question 3 (1 point)
A DataFrame is a Dataset whose columns do not have names.
Question 3 options:
True
False
Question 4 (1 point)
Data sources have their own options that can be specified during the load process:
val salesRecords = spark.read.format("csv").option("sep", ";").option("inferSchema", "true").option("header", "false").load("/Users/hadoop-user/Documents/SalesJan2009.csv")
While reading a CSV file, the first row is always taken as the header if the header option is set to false, as
shown in the above example.
Question 4 options:
True
False
Question 5 (1 point)
In terms of reading Parquet files, if the base location of the table is specified as the path, the partitions are
not automatically discovered.
Question 5 options:
True
False
Question 6 (1 point)
DataFrames run on a specific engine of the Spark environment. That engine is called the Catalyst engine.
Question 6 options:
True
False
Question 7 (1 point)
Hive support needs to be enabled on the Spark session. The Hive warehouse directory needs to be set via
"spark.sql.warehouse.dir". Once the session is created, SQL statements can be issued using
sparkSession.sql("<sql_statement>"). Bucketing, sorting and partitioning can be done on the tables being
saved.
Question 7 options:
True
False
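A hedged sketch of the setup described in Question 7, assuming Hive support is available on the classpath; the table name sales_db.sales and the columns region and product_id are assumptions for illustration only:
import org.apache.spark.sql.SparkSession
val sparkSession = SparkSession.builder()
  .appName("HiveSupportSketch")
  .master("local[*]")
  .config("spark.sql.warehouse.dir", "C:/Users/pedro/Documents/test_data/hive")
  .enableHiveSupport()                              // Hive support enabled on the Spark session
  .getOrCreate()
// SQL statements issued through the session.
val sales = sparkSession.sql("SELECT * FROM sales_db.sales")   // assumes this table exists
// Partitioning, bucketing and sorting applied while saving a table.
sales.write
  .partitionBy("region")
  .bucketBy(8, "product_id")
  .sortBy("product_id")
  .saveAsTable("sales_db.sales_bucketed")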
Question 8 (1 point)
In Spark SQL, everything happens in memory. We do not have any DB; everything happens in
memory. Spark is an in-memory DB, and anything you can do in any DB you can do here too.
Question 8 options:
True
False
Question 9 (1 point)
When you create a temp view, it is created for this session only; however, when you create a global view, it is available
to all sessions.
Question 9 options:
True
False
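A small sketch of the distinction in Question 9, assuming a DataFrame named salesRecords like the one loaded in the other questions; the view names are illustrative:
// Session-scoped: visible only to the SparkSession that created it.
salesRecords.createOrReplaceTempView("sales_tmp")
spark.sql("SELECT COUNT(*) FROM sales_tmp").show()
// Application-scoped: registered under the global_temp database and visible to all sessions.
salesRecords.createGlobalTempView("sales_global")
spark.newSession().sql("SELECT COUNT(*) FROM global_temp.sales_global").show()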
Question 10 (1 point)
A global view is available to all sessions, but not once we close the current session.
Question 10 options:
True
False
Question 11 (1 point)
In terms of reading Parquet files, partition columns of numeric data types, date, timestamp and string types
are automatically inferred.
Question 11 options:
True
False
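A sketch of the partition discovery and type inference behaviour behind Questions 5, 11 and 18, assuming a hypothetical partitioned layout such as .../sales_parquet/year=2009/month=1/; the path and column names are assumptions:
// Reading from the base location lets Spark discover partition directories
// (e.g. year=2009/month=1) and expose them as columns.
val partitioned = spark.read.parquet("/Users/hadoop-user/Documents/sales_parquet")
partitioned.printSchema()    // year and month show up as inferred partition columns
// Type inference for partition columns is controlled by this property;
// when it is disabled, the partition columns are read back as strings.
spark.conf.set("spark.sql.sources.partitionColumnTypeInference.enabled", "false")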
Question 12 (1 point)
In RDDs we deal with structured and semi-structured data; however, in DataFrames we deal with structured data.
Question 12 options:
True
False
Question 13 (1 point)
createOrReplaceTempView will overwrite the existing view if it exists.
Question 13 options:
True
False
Question 14 (1 point)
An easier way to load various types of data, other than Parquet, is the following:
val salesRecords = spark.read.format("csv").load("/Users/hadoop-user/Documents/SalesJan2009.csv")
The formats are data sources and should be referred to using their fully qualified names, like
"org.apache.spark.sql.parquet".
Question 14 options:
True
False
Question 15 (2 points)
The write object is derived from the session object (Spark session). The reader object is derived from the DataFrame.
Question 15 options:
True
False
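For Question 15, a brief sketch of where the reader and writer objects come from in the Spark API; the input path is reused from Question 16, and the output path is an assumption:
// spark.read returns a DataFrameReader obtained from the SparkSession.
val df = spark.read.parquet("/Users/hadoop-user/Documents/SalesJan2009.parquet")
// df.write returns a DataFrameWriter obtained from the DataFrame.
df.write.mode("overwrite").parquet("/Users/hadoop-user/Documents/SalesJan2009_copy.parquet")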
Question 16 (1 point)
val salesRecords = spark.read.load("/Users/hadoop-user/Documents/SalesJan2009.parquet")
This loads a Parquet file by default. The default option is specified by the configuration property
"spark.sql.sources".
Question 16 options:
True
False
Question 17 (1 point)
We always apply SQL on the DataFrame.
Question 17 options:
True
False
Question 18 (2 points)
In terms of reading Parquet files, the columns are automatically inferred because the property
"spark.sql.sources.partitionColumnTypeInference.enabled" is set to false by default. If the above property is
set to true, all partition columns will be read as String.
Question 18 options:
True
False
Question 19 (1 point)
We use the filter function when there is a complicated condition where you cannot use the typical SQL constructs; that is why we
write the filter function. The function is passed the particular row, and then you call row methods to get the
values of that particular row.
Question 19 options:
True
False
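A hedged sketch of the filter-with-a-function idea in Question 19, assuming the salesRecords DataFrame from the earlier questions and that column 2 holds a numeric price stored as a string; the column position and threshold are assumptions:
import org.apache.spark.sql.Row
// The function is passed each Row; row.getString / row.getDouble etc. extract the values.
val expensive = salesRecords.filter((row: Row) => row.getString(2).toDouble > 1200.0)
expensive.show()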
Question 20 (1 point)
In Spark SQL, the unit of processing is the Dataset or DataFrame.
Question 20 options:
True
False
Question 21 (1 point)
A Dataset is a collection of records. Each record in the Hive table is a row. A Dataset is a resilient distributed
collection of rows, where a row is an object. Row is a Scala class.
Question 21 options:
True
False
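A tiny sketch for Question 21 showing that a DataFrame is a collection of Row objects and that Row comes from org.apache.spark.sql; the field access below is purely illustrative:
import org.apache.spark.sql.Row
val firstRecord: Row = salesRecords.first()   // each record in the Dataset is a Row object
println(firstRecord.length)                   // number of fields in the row
println(firstRecord.get(0))                   // positional access to a field's value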
...