
Manoj Reddy
Frisco, TX | 408-890-7768 | [email protected]
Data Engineer | Azure Data Engineer | Big Data Engineer
linkedin.com/in/manoj-kumar-reddy

Manoj Kumar is a seasoned Data Engineer with 8 years of experience driving efficiency and innovation through Big Data and Cloud solutions. His expertise lies in building and optimizing data pipelines, migrating data warehouses to the cloud, and ensuring data quality and performance. A results-oriented Data Engineer with a passion for leveraging technology to solve complex problems, he has a proven track record of delivering efficient, scalable, and high-quality data solutions that makes him an asset to any team.

Skills

Amazon Athena, Amazon Elastic MapReduce, Amazon Redshift, Amazon Web Services, Analysis, Analytical Skill, Apache Airflow, Apache Hadoop, Apache HBase, Apache Hive, Apache HTTP Server, Apache Kafka, Apache Oozie, Apache Spark, Apache Sqoop, Application Programming Interface, AWS Lambda, Azure Data Factory, Azure DevOps Server, Azure SQL Data Warehouse, Azure Storage, Batch Processing, Big Data, Boto3, Cascading Style Sheets, Cloud Computing, Cosmos DB, Cost Reduction, Creativity, Cross-Functional Skills, Data Access, Data Aggregation, Data Analysis, Database Optimization, Data Export, Data Extraction, Data Integration, Data Integrity, Data Lake, Data Management, Data Modeling, Data Pipelines, Data Processing, Data Storage, Data Structure, Data Validation, Data Warehouse, DevOps Operations, Diligence, DynamoDB, Eclipse, Electronic Health Record, Energetic, Enterprise Java Beans, ETL Tool, Exception Handling, Hadoop Cluster, Hadoop Distributed File System, Hard Working, High Availability, Hypertext Markup Language, Java, JavaScript, JavaScript Object Notation, JavaServer Pages, Jenkins, jQuery, MapReduce, Microsoft Azure, Microsoft SQL Server, MySQL, NoSQL, Optimization Techniques, Organizational Skills, Performance Tuning, Power BI, Prioritization, PySpark, Python, Python Script, Relational Database Management System, Routine Inspection, Scalability, Scala Programming Language, Service Management, Shell Script, Snowflake Software, SnowSQL, Software Optimization, Spring Framework, Spring MVC, SQL, SQL Azure, Streamlining Process, Team Development, Team Player, Teradata Database, Test-Driven Development, Troubleshooting, Unit Testing, User Interface, Web Application
Highlights:
• Boosted Processing Efficiency by 50%: Designed and implemented data pipelines on Azure Data Factory, slashing processing times and accelerating valuable insights.
• Delivered Scalable Cloud Solutions: Migrated on-premises data warehouses to Azure SQL Data Warehouse, resulting in significant cost savings and improved scalability for future growth.
• Enhanced Data Integrity: Implemented automated data validation processes, ensuring data quality and driving more reliable business decisions (see the sketch after this list).
• Optimized Data Processing: Reduced Spark job latency and achieved a 40% improvement in query execution time through performance tuning techniques.
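For illustration only, a minimal PySpark sketch of the kind of automated data validation process referred to above; the table name, key column, and specific checks are hypothetical assumptions, not details taken from the resume.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("data-validation-sketch").getOrCreate()

# Hypothetical curated table produced by an upstream pipeline.
df = spark.table("analytics.orders")

# Basic quality checks: row volume, null business keys, duplicate business keys.
row_count = df.count()
null_keys = df.filter(F.col("order_id").isNull()).count()
duplicate_keys = row_count - df.select("order_id").distinct().count()

checks = {
    "non_empty": row_count > 0,
    "no_null_keys": null_keys == 0,
    "no_duplicate_keys": duplicate_keys == 0,
}

failed = [name for name, passed in checks.items() if not passed]
if failed:
    # Failing fast keeps bad data from reaching downstream reports and models.
    raise ValueError(f"Data validation failed: {failed}")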
Beyond the Core

• DevOps Integration: Experience with CI/CD tools to automate software delivery and streamline development processes.
• Web Development Background: Possesses a foundational understanding of web development technologies (HTML, CSS, JavaScript) from previous experience.
TECHNICAL SKILLS:

Programming languages: Java, C/C++, Python, Scala, SQL
Big Data components: MapReduce, HDFS, Spark, HBase, Hive, Sqoop, PySpark
Orchestration tools: Oozie, Airflow, ADF
AWS services: S3, Athena, Lambda, Redshift, DynamoDB, EMR, etc.
Azure: ADLS, ADF, SQL Server, Databricks, Synapse, Logic Apps, Cosmos DB, etc.
Data warehouse: Hive, Synapse, Snowflake
Work Experience
Tiger Analytics
Santa Clara, CA, USA • 06/2023 - 04/2024
Data Engineer

• Enhanced data extraction efficiency by 50% by creating Spark applications using PySpark and Spark-SQL, resulting in expedited data processing and cost savings.
• Designed and implemented end-to-end data pipelines on Azure Data Factory to extract data from various sources, transform it using Azure Databricks, and load it into Azure SQL Data Warehouse for analytics, resulting in a 50% reduction in processing time.
• Developed Python scripts and UDFs using both DataFrames and SQL in Spark for data aggregation (see the sketch after this section).
• Migrated on-premises data warehouses to Azure SQL Data Warehouse, achieving significant cost savings and improved scalability.
• Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
• Experienced in performance tuning of Spark applications: setting the right batch interval size, the correct level of parallelism, and memory tuning.
• Worked on data integration and storage technologies with Jupyter Notebook and MySQL.
• Collaborated with DevOps engineers to develop automated CI/CD and test-driven development pipelines using Azure, per client standards.
• Wrote extensive Spark SQL queries to transform data for use by downstream models.
• Enhanced the quality of data insights through the implementation of automated data validation processes.
• Migrated Teradata to Azure Delta Lake and created external tables in serverless Synapse.
• Performed debugging, data validation, and data clean-up analysis within large datasets.
• Optimized data storage, reducing storage costs by 20% and improving data retrieval speed by 15%, leading to more efficient data processing and cost savings for the company.
• Environments: Azure ADF, Databricks, Azure DevOps, Synapse, PySpark, Teradata, Snowflake, etc.
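A minimal sketch, with hypothetical table and column names, of the DataFrame/Spark SQL aggregation and UDF pattern described in the bullets above.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("aggregation-sketch").getOrCreate()

# Hypothetical source table landed earlier in the pipeline.
sales = spark.table("lake.sales_raw")

# Small UDF used alongside built-in functions (built-ins are preferred where they exist).
@F.udf(returnType=StringType())
def region_bucket(country):
    return "AMER" if country in ("US", "CA", "MX") else "INTL"

# DataFrame API aggregation.
daily = (
    sales.withColumn("region", region_bucket(F.col("country")))
         .groupBy("region", "order_date")
         .agg(F.sum("amount").alias("total_amount"),
              F.countDistinct("order_id").alias("orders"))
)

# Equivalent Spark SQL path for downstream models.
daily.createOrReplaceTempView("daily_sales")
spark.sql("SELECT region, SUM(total_amount) AS amount FROM daily_sales GROUP BY region").show()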
Bosch
06/2021 - 06/2023
Data Engineer

• Developed a scalable data warehouse using Azure Blob Storage, Data Lake, and Synapse to store and manage large volumes of data.
• Experience in migrating/ingesting other databases to Snowflake using Azure Data Factory as the ETL tool.
• Worked on Azure Cosmos DB as the final staging layer, which connects to Power BI.
• Has experience with SnowSQL and Snowpark.
• Monitored and optimized query performance in Snowflake.
• Worked on Azure SQL, data warehousing services such as Azure Synapse, and data modeling concepts.
• Used Python and shell scripting to build pipelines.
• Reduced the latency of Spark jobs by tweaking Spark configurations and applying other performance and optimization techniques (see the sketch after this section).
• Migrated Teradata to Azure Delta Lake and created external tables in serverless Synapse.
• Worked on data integration and storage technologies with Jupyter Notebook and MySQL.
• Performed extensive debugging, data validation, error handling, transformation types, and data clean-up analysis within large datasets.
• Conducted regular monitoring and troubleshooting of Azure data solutions, ensuring high availability and data reliability.
• Environments: AWS ETL, AWS Lambda, Azure ADF, Databricks, Azure DevOps, Teradata, Synapse, PySpark, Snowflake, Snowpark, etc.
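A minimal sketch of configuration-level Spark tuning of the kind described above; the specific values are illustrative assumptions, since real settings depend on cluster size and data volume.

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("tuning-sketch")
    # Match shuffle parallelism to the data volume instead of the default of 200.
    .config("spark.sql.shuffle.partitions", "64")
    # Give executors enough memory to avoid spilling during wide aggregations.
    .config("spark.executor.memory", "8g")
    .config("spark.executor.cores", "4")
    # Let adaptive query execution coalesce small shuffle partitions at runtime.
    .config("spark.sql.adaptive.enabled", "true")
    .getOrCreate()
)

# Caching a reused intermediate result avoids recomputing the same lineage twice.
events = spark.table("staging.events").filter("event_date >= '2023-01-01'").cache()
print(events.count())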
Mu Sigma
11/2019 - 04/2021
Big Data Engineer
• Experience in developing Spark applications using Spark-SQL/Scala in Databricks for data extraction, transformation, and aggregation.
• Developed and optimized CI build jobs with Jenkins.
• Performed performance tuning on PySpark/SQL queries and data processing jobs, resulting in a 40% reduction in query execution time.
• Worked on Apache Kafka, scheduled using UC4, to handle batch data.
• Developed Python scripts and UDFs using both DataFrames and SQL in Spark for data aggregation.
• Used Python and shell scripting to build pipelines.
• Created scripts in Python (Boto3) that integrated with Amazon APIs to control service operations.
• Involved in writing Spark and Hive scripts to run in EMR using the different operators available in Airflow and scheduling them.
• Worked on the integration of Spark Streaming and Apache Kafka.
• Explored Spark performance and optimization of existing Hadoop algorithms using SparkContext, Spark-SQL, DataFrames, pair RDDs, and Spark on YARN.
• Implemented AWS Lambda with integration of DynamoDB, S3, etc.
• Used Spark-SQL to load data into Hive tables and wrote queries to fetch data from these tables; implemented partitioning and bucketing in Hive (see the sketch after this section).
• Responsible for loading structured and semi-structured data into Hadoop by creating static and dynamic partitions.
• Worked on Spark user-defined functions (UDFs) using PySpark for external functions.
• Wrote extensive Hive queries to transform data for use by downstream models.
• Environments: Python, PySpark, Hive, Apache Airflow, AWS EMR, AWS Athena, AWS S3, AWS Lambda, Azure ADF, Databricks, Jenkins, Logic Apps, Scala, etc.
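A minimal PySpark sketch of loading data into a partitioned, bucketed Hive table and reading it back with Spark SQL, as the bullets above describe; the database, table, and column names are hypothetical.

from pyspark.sql import SparkSession

# Hive support registers the table in the metastore so other jobs can query it.
spark = SparkSession.builder.appName("hive-load-sketch").enableHiveSupport().getOrCreate()

df = spark.table("staging.clickstream_raw")

# Partition by date and bucket by user_id; bucketBy requires saveAsTable.
(
    df.write
      .mode("overwrite")
      .partitionBy("event_date")
      .bucketBy(16, "user_id")
      .sortBy("user_id")
      .format("parquet")
      .saveAsTable("analytics.clickstream")
)

# Fetch data back with Spark SQL for downstream models.
spark.sql(
    "SELECT event_date, COUNT(*) AS events "
    "FROM analytics.clickstream GROUP BY event_date"
).show()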

TCS
03/2018 - 08/2019
Hadoop Developer

Responsibilities:
• Used Sqoop to import/export data between various RDBMSs (Teradata, Oracle, etc.) and the Hadoop cluster.
• Developed and maintained data integration programs in Hadoop and RDBMS environments, working with both RDBMS and NoSQL data stores for data access and analysis.
• Worked on batch processing of data sources using Apache Spark with Java.
• Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Java.
• Implemented solutions for ingesting data from various sources and processing the data at rest utilizing Big Data technologies such as Hadoop, MapReduce frameworks, HBase, and Hive.
• Increased Hive performance and optimization by using bucketing, partitioning, and other techniques.
• Configured Oozie workflows to manage independent jobs and automate Shell, Hive, Sqoop, Spark, and other actions.
• Involved in populating the processed data from Spark/Hive into NoSQL HBase.
• Environments: Java, Spark (Java), Hive, Oozie, HBase, SQL, etc.

Value Labs
04/2016 - 02/2018
Java Developer

• Used HTML, CSS, JavaScript, and JSP pages for user interaction.
• Implemented and managed the SQL database used in the background.
• Created a web application with the Spring framework.
• Experience with using the Apache web server.
• Solved problems using a combination of JavaScript, JSON, and jQuery.
• Developed and implemented servlets and Java beans.
• Developed JSP pages and view- and controller-related files using the Spring MVC framework.
• Coordinated with the development team to identify automation opportunities and improve technical support for end users.
• Performed code optimization, conducted unit testing, and developed frameworks using object-oriented design principles.
• Worked effectively with cross-functional design teams to create software solutions that elevated the client-side experience.
• Environments: Java, Eclipse IDE, Spring MVC, SQL, etc.

Education

Bachelor of Technology (B.Tech) in Information Technology
National Institute of Technology, Srinagar, India