Data Engineering Roadmap For Freshers & Resources
Resources
1. Programming Languages:
a. Python
b. Scala
c. Java
2. SQL Scripting:
a. Transactional Databases: MySQL, PostgreSQL
b. All types of joins
c. Nested Queries
d. Group By
e. Use of CASE WHEN Expressions
f. Window Functions
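
The SQL topics listed above (joins, nested queries, GROUP BY, CASE WHEN, window functions) can be practiced without installing a database server. The following is a minimal sketch using Python's built-in sqlite3 module in place of MySQL or PostgreSQL. The tables, column names, and data are hypothetical, and window functions assume the bundled SQLite is version 3.25 or newer.

```python
import sqlite3


def run_sql_demo():
    """Run one small query per SQL topic against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # Hypothetical sample schema and rows, for illustration only.
    cur.executescript("""
        CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
        CREATE TABLE employees (
            emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER, salary INTEGER
        );
        INSERT INTO departments VALUES (1, 'Data'), (2, 'Web');
        INSERT INTO employees VALUES
            (1, 'Asha', 1, 900),
            (2, 'Ben', 1, 700),
            (3, 'Chen', 2, 500),
            (4, 'Dev', NULL, 400);
    """)

    # Join: LEFT JOIN keeps employees that have no matching department.
    joined = cur.execute("""
        SELECT e.name, d.dept_name
        FROM employees e
        LEFT JOIN departments d ON e.dept_id = d.dept_id
        ORDER BY e.emp_id
    """).fetchall()

    # Nested query + GROUP BY: totals per department, excluding the
    # lowest-paid employee(s) found by the subquery.
    grouped = cur.execute("""
        SELECT dept_id, COUNT(*) AS n, SUM(salary) AS total
        FROM employees
        WHERE salary > (SELECT MIN(salary) FROM employees)
        GROUP BY dept_id
        ORDER BY dept_id
    """).fetchall()

    # CASE WHEN: derive a salary band per row.
    banded = cur.execute("""
        SELECT name,
               CASE WHEN salary >= 800 THEN 'high'
                    WHEN salary >= 500 THEN 'mid'
                    ELSE 'low' END AS band
        FROM employees
        ORDER BY emp_id
    """).fetchall()

    # Window function: rank employees by salary within each department.
    ranked = cur.execute("""
        SELECT name,
               RANK() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS rnk
        FROM employees
        WHERE dept_id IS NOT NULL
        ORDER BY emp_id
    """).fetchall()

    conn.close()
    return {"joined": joined, "grouped": grouped, "banded": banded, "ranked": ranked}


if __name__ == "__main__":
    for topic, rows in run_sql_demo().items():
        print(topic, rows)
```

The same queries should carry over to MySQL 8+ and PostgreSQL with little or no change, though each engine has its own dialect details worth checking.
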
3. Big Data Frameworks:
a. Apache Hadoop (understanding the architecture is most important)
i. HDFS
ii. MapReduce
iii. YARN
b. Apache Hive
i. How to load data in different file formats
ii. Internal Tables
iii. External Tables
iv. Querying table data stored in HDFS
v. Partitioning
vi. Bucketing
vii. Map-Side Join
viii. Sort-Merge-Bucket (SMB) Join
ix. UDFs in Hive
x. SerDe in Hive
c. Apache Spark (Most Important)
i. Spark Core
ii. Spark SQL
iii. Spark Streaming
d. Apache Sqoop
e. Apache NiFi
f. Apache Flume
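
To make the MapReduce model listed under Apache Hadoop more concrete, here is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to a word count. It is plain Python rather than the Hadoop API. The function names are hypothetical, and a real cluster would run the map and reduce steps in parallel across nodes, with data read from HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big frameworks"]))
```

Keeping the three stages as separate functions mirrors how the work is divided in Hadoop: the developer writes the map and reduce logic, and the framework handles the shuffle between them.
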