Internship

This document provides details about an internship conducted at Codegnan IT Solutions (OPC) Private Limited from June 2022 to July 2022. The internship focused on machine learning using Python. Over the course of the internship, topics covered included data science, artificial intelligence, data analysis, machine learning, Python modules and packages, data visualization, and a hands-on project applying machine learning concepts.

Uploaded by

Vyshnavi

“Machine Learning Using Python”

Submitted in partial fulfillment for the award of certificate of

BACHELOR OF TECHNOLOGY
IN COMPUTER SCIENCE AND ENGINEERING

By
Varshith Uddanti (208T1A05B9)

DHANEKULA INSTITUTE OF ENGINEERING & TECHNOLOGY
GANGURU, VIJAYAWADA - 521 139
Affiliated to JNTUK, Kakinada & Approved by AICTE, New Delhi
Certified by ISO 9001-2015, Accredited by NBA

DHANEKULA INSTITUTE OF ENGINEERING & TECHNOLOGY
GANGURU, VIJAYAWADA - 521 139
Affiliated to JNTUK, Kakinada & Approved by AICTE, New Delhi
Certified by ISO 9001-2015, Accredited by NBA

Department of Computer Science & Engineering

CERTIFICATE

This is to certify that the Summer Internship work entitled “Empowering Farming
Decisions: Exploring Machine Learning in Crop Recommendations” is a bonafide record
of internship work done by Varshith Uddanti (208T1A05B9) for the award of the
Summer Internship in Computer Science and Engineering by Jawaharlal Nehru
Technological University, Kakinada during the academic year 2022-2023.

Head of Department:
Dr. K. SOWMYA
Professor, HOD CSE

EXTERNAL EXAMINER

DHANEKULA INSTITUTE OF ENGINEERING & TECHNOLOGY
Department of Computer Science & Engineering
VISION – MISSION – PEOs

Institute Vision:
Pioneering Professional Education through Quality

Institute Mission:
* Providing Quality Education through state-of-art infrastructure,
laboratories and committed staff.
* Moulding Students as proficient, competent, and socially responsible
engineering personnel with ingenious intellect.
* Involving faculty members and students in research and development
works for betterment of society.

Department Vision:
To empower the budding talents and ensure them with probable
employability skills in addition to human values by optimizing the
resources.

Department Mission:
* To encourage students to become pioneers in the global
competition with problem-solving skills
* To make students become innovative with potential skills to
explore the employment opportunities and/or to become entrepreneurs
* To promote Research environment and inculcate corporate social
responsibility

Program Educational Objectives (PEOs):
Graduates of Computer Science & Engineering will:

PEO1: Excel in problem solving and designing new products for a
competitive and challenging business environment

PEO2: Contribute to technological innovation, research and society
through the application of information technology in a diversified
world.

PROGRAM OUTCOMES (POs)
1. Engineering knowledge: apply the knowledge of mathematics, science, engineering fundamentals, and an
engineering specialization to the solution of complex engineering problems.

2. Problem Analysis: identify, formulate, review research literature, and analyze complex engineering
problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering
sciences.

3. Design/Development Of Solutions: design solutions for complex engineering problems and design
system components or processes that meet the specified needs with appropriate consideration for the public health
and safety, and the cultural, societal, and environmental considerations.

4. Conduct Investigations Of Complex Problems: use research-based knowledge and research methods
including design of experiments, analysis and interpretation of data, and synthesis of the information to provide
valid conclusions.

5. Modern Tool Usage: create, select, and apply appropriate techniques, resources, and modern engineering
and IT tools including prediction and modelling to complex engineering activities with an understanding of the
limitations.

6. The Engineer And Society: apply reasoning informed by the contextual knowledge to assess societal,
health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering
practice.

7. Environment And Sustainability: understand the impact of the professional engineering solutions in
societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable development.

8. Ethics: apply ethical principles and commit to professional ethics and responsibilities and norms of the
engineering practice.

9. Individual And Team Work: function effectively as an individual, and as a member or a leader in diverse
teams, and in multidisciplinary settings.

10. Communication: communicate effectively on complex engineering activities with the engineering
community and with society at large, such as, being able to comprehend and write effective reports and design
documentation, make effective presentations, and give and receive clear instructions.

11. Project Management And Finance: demonstrate knowledge and understanding of the engineering and
management principles and apply these to one’s own work, as a member and leader in a team, to manage projects
and in multidisciplinary environments.

12. Life-Long Learning: recognize the need for, and have the preparation and ability to engage in
independent and life-long learning in the broadest context of technological change.

PROGRAM SPECIFIC OUTCOMES (PSOs)
PSO1: Designing and developing the Information Technology based systems with high professional skills.
PSO2: Qualify in national and international level competitive examinations for successful higher studies and
get employment in IT enabled industries.

Internship Mappings

Project Title                        PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2
Application to share and discover
photos and albums like Instagram.

Mapping Level    Mapping Description
1                Low Level Mapping with PO & PSO
2                Moderate Mapping with PO & PSO
3                High Level Mapping with PO & PSO

NAME: Varshith Uddanti
ROLL NO: 208T1A05B9
YEAR-SEMESTER: 3-2

Contents

Internship carried out organization details
Internship Log
Domain area of the Internship
Project Report

Internship Carried Out Organization Details

● Codegnan IT Solutions (OPC) Private Limited is a private One Person
Company incorporated on 20 August 2018.
● It is classified as a non-government company and is registered with the
Registrar of Companies, Vijayawada.
● It is involved in software publishing, consultancy, and supply.
Software publishing includes the production, supply, and
documentation of ready-made (non-customized) software,
operating systems software, business and other applications
software, and computer games software for all platforms.
● Custom software also includes made-to-order software based
on orders from specific users.
● No date is on record for the company's last Annual General
Meeting (AGM); it is listed as N/A. As per records from the
Ministry of Corporate Affairs (MCA), its balance sheet was
last filed on 31 March 2021.

Internship Log

Day 1:
● Introduction to Data Science and Artificial Intelligence
● Importance of Data in the 21st Century
● Types of Data and its usage
● Difference between Data Science and Artificial Intelligence
● Overview of Data Analysis and key steps involved
● How does Machine Learning work?
● Different stages of Machine Learning projects
● Understanding the Importance of Path and Installation of Jupyter Notebook
● Modules and Packages

Day 2:
● What is a Python Module? How is it different from a Python File?
● Creating Modules and Packages
● Importing Functions, Variables from different modules
● Python built-in modules
● Working on math, time, random modules

Day 3:
● Hands-On – Working on Python Built-in Modules and User-defined Custom Module
● Understanding Importance of Data Analysis
● Understanding Importance and Types of Data Analysis
● Understanding types of Data

Day 4:
● Understanding Importance of Visualization and Types of Graphs
● Understanding the Importance and Usage of Jupyter Notebook
● Understanding and Working on NumPy
● Introduction to NumPy
● Advantages of NumPy over lists
● Creating NumPy arrays - 1-D, 2-D, N-D arrays
● Data types for ndarrays

Day 5:
● Checking the attributes - shape, size, dimensions, dtype
● NumPy Arithmetic Operations
● NumPy universal functions
● Linear Algebra using NumPy
● Working on Appending and Concatenating

Day 6:
● Pandas for Data Analysis
● Getting Started with Pandas
● Introduction to Pandas Data Structures - Series, DataFrames
● Checking Attributes and Description

Day 7:
● Basic Essential functionality - Reindexing, Dropping entries from an axis
● Indexing, Selection, and Filtering
● Working on loc and iloc Functionalities
● Data Loading and Storage
● Reading and writing different file types (.txt, .xlsx, .csv files)

Day 8:
● Interacting with Web APIs
● Accessing data from databases
● Writing/Saving Files

Day 9:
● Data Cleaning and Preparation
● Handling Missing Data - Filtering out Missing Data, Filling in Missing data
● Removing Duplicates
● Computing Indicator/Dummy Variables

Day 10:
● Data Wrangling
● Concatenation - Adding Rows, Adding Columns, Concatenation with Different Indices
● Merging Data Frames

Day 11:
● Data Aggregation and Group Operations
● Pivot Tables and Cross-Tabulation
● Working on Time Series data
● Hands-on: Case Study Working on Titanic Dataset for Cleansing Data, HR Analytics Data

Day 12:
● Data Visualization Using Matplotlib
● Introduction to Matplotlib
● Setting Labels, Titles, xticks, and yticks
● Multiple Line Plots, adding legend
● Bar charts - What are they, When to use it
● Bar chart for comparing categorical data
● Histogram to check the distribution of numerical data

Day 13:
● Scatter Plots and their Usage
● Pie charts and their usage
● Subplots and their Usage
● Hands-on: Case Study Working on Titanic Dataset for Visualization

Day 14:
● Python Interactive Visualization using Plotly for Dashboards
● Introduction to Plotly and Cufflinks
● Loading Plotly and Cufflinks
● Loading the Data

Day 15:
● Quick Visualization with custom bar charts
● Interactive Bubble charts
● Understanding and Working on Choropleth Maps
● Hands-on: Analyzing Gapminder dataset

Day 16:
● Data - Wealth of the 21st Century - Web Scraping using Python
● Why Web Scraping and Understanding its importance
● Installing BeautifulSoup
● Understanding web structures
● Scraping data from the web using Beautiful Soup - Static & Dynamic websites
● Performing Data Visualization over the scraped data

Day 17:
● Machine Learning Fundamentals
● Data Transformation and Preprocessing
● Handling Numeric Features
● Feature Scaling
● Standardization and Normalization

Day 18:
● Handling Categorical Features
● One Hot Encoding, pandas get_dummies
● Label Encoding
● More on different encoding techniques
● Train, Test and Validation Split
● Simple Train and Test Split
● Drawbacks of train and test split
● K-fold cross-validation
● Time-based splitting

Day 19:
● Overfitting And Underfitting
● What is overfitting?
● What causes overfitting?
● What is Underfitting?
● What causes underfitting?
● What are bias and Variance?
● How to overcome overfitting and underfitting problems?

Day 20:
● Supervised Machine Learning Algorithms
● Regression and its Importance in real-world cases
● Introduction to Linear Regression
● Understanding How Linear Regression Works

Day 21:
● Maths behind Linear Regression
● Ordinary Least Squares
● Gradient Descent
● R-square
● Adjusted R-square

Day 22:
● Polynomial Regression
● Multiple Regression
● Performance Measures - MSE, RMSE, MAE
● Assumptions of Linear Regression
● Ridge and Lasso regression
● Hands-on: Algorithm implementation with real use case datasets

Day 23:
● Building and Deployment of Machine Learning model - Flask, Git, GitHub & PythonAnywhere
● Understanding steps in end-to-end ML projects
● Building a web service for Machine Learning Model
● Git Download and GitHub Usage
● Deploying the Final Trained Model on PythonAnywhere

Day 24:
● Understanding Classification Modelling Approach
● Introduction to the Classification problem
● Logistic Regression: why the name "Regression"? Implementation of the Sigmoid Function

Day 25:
● Working on a dataset for Logistic Regression
● Performance Metrics for Classification Algorithms
● Accuracy Score, Confusion Matrix, Precision, Recall, F1-Score, ROC Curve and AUC, Log Loss

Day 26:
● Decision Trees
● Introduction to Decision Tree
● Homogeneity and Entropy
● Gini Index
● Information Gain
● Advantages of Decision Tree

Day 27:
● Preventing Overfitting
● Plotting Decision Trees
● Plotting feature importance
● Regression using Decision Trees
● Hands-On - Decision Tree on US Adult income dataset

Day 28:
● Ensemble Learning
● Introduction to Ensemble Learning
● Bagging (Bootstrap Aggregation)
● Constructing random forests
● Runtime
● Case study on Bagging

Day 29:
● Tuning hyperparameters of random forest (GridSearch, RandomizedSearch)
● Measuring model performance

Day 30:
● Boosting
● Gradient Boosting
● AdaBoost and XGBoost
● Case study on boosting trees

Day 31:
● Hyperparameter tuning
● Evaluating performance
● Stacking Models
● Hands-On - TalkingData Ad Tracking Fraud Detection case study

Day 32:
● Naive Bayes
● Refresher on conditional Probability
● Bayes Theorem
● Examples of Bayes theorem
● Exercise problems on Naive Bayes
● Naive Bayes Algorithm

Day 33:
● Assumptions of Naive Bayes Algorithm
● Laplace Smoothing
● Naive Bayes for Multiclass classification
● Handling numeric features using Naive Bayes
● Measuring performance of Naive Bayes
● Hands-On - Working on Spam detection and Amazon Food Review dataset

Day 34:
● Support Vector Machines
● Introduction to SVM
● What are hyperplanes?
● Geometric intuition
● Maths behind SVM
● Loss Function
● Kernel trick
● Polynomial kernel, RBF, and linear kernels

Day 35:
● SVM Regression
● Tuning the parameter
● GridSearch and RandomizedSearch
● Hands-On - Case Study SVM on Social Network Ads

Day 36:
● K Nearest Neighbors
● Introduction to KNN
● Effectiveness of KNN
● Distance Metrics
● Accuracy of KNN

Day 37:
● Effect of an outlier on KNN
● Finding the k Value
● KNN on regression
● Where not to use KNN
● Hands-On - Case Study on ECommerce Recommendation

Day 38:
● Unsupervised Machine Learning Algorithms
● Introduction to Unsupervised Learning
● K Means Geometric intuition
● Maths Behind KMeans

Day 39:
● Determining the right k
● Evaluation metrics for KMeans
● Case study on K Means
● Introduction and Working on Hierarchical Clustering

Day 40:
● Dimensionality Reduction Techniques
● What are the dimensions?
● Why is high dimensionality a problem?
● Introduction to the MNIST dataset (784 dimensions)
● Intro to Dimensionality reduction techniques
● PCA (Principal Component Analysis) for dimensionality reduction
● Hands-on: Applying Dimensionality Reduction along with

Day 41:
● PythonAnywhere Deployment

Day 42:
● AWS Deployment

Day 43 - 60:
● Project Development

Domain Area Of Internship
Domain: Machine learning using Python

PROJECT TITLE: Empowering Farming Decisions: Exploring Machine
Learning in Crop Recommendations

• Python brings considerable power and versatility to machine learning
environments. The language's simple syntax simplifies data validation
and streamlines the scraping, processing, refining, cleaning, arranging, and
analyzing processes. Its readability also makes collaboration with other
programmers less of an obstacle.
• With Python's libraries, a machine learning model can be built in relatively
few lines of code.
• Python is one of the most widely used languages for AI and ML. It is frequently
ranked above Java in popularity and has many advantages, such as a rich library
ecosystem, good visualization options, a low entry barrier, community support,
flexibility, readability, and platform independence.
• The Python libraries used for machine learning are:
  ● numpy
  ● pandas
  ● matplotlib
  ● seaborn
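
As a rough illustration of how a model might be built with these libraries, the sketch below trains a small classifier on synthetic data. It is not the project's actual code: scikit-learn is an assumption (it is not in the list above), the feature names in FEATURES are hypothetical, and the cluster values are made up purely so the example runs.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical feature names; the report does not list the dataset's columns.
FEATURES = ["nitrogen", "ph", "rainfall_mm"]


def make_synthetic_data(n_per_crop=100, seed=0):
    """Build a toy dataset in which each crop has its own cluster of values."""
    rng = np.random.default_rng(seed)
    # Made-up cluster centres for illustration only (not agronomic facts).
    centres = {
        "rice": [80, 6.0, 220],
        "maize": [60, 6.5, 90],
        "chickpea": [30, 7.5, 60],
    }
    frames = []
    for crop, centre in centres.items():
        values = rng.normal(loc=centre, scale=[5, 0.2, 10],
                            size=(n_per_crop, len(FEATURES)))
        frame = pd.DataFrame(values, columns=FEATURES)
        frame["crop"] = crop
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


def train_model(data, seed=0):
    """Split the data, fit a random forest, and return it with its test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        data[FEATURES], data["crop"],
        test_size=0.25, random_state=seed, stratify=data["crop"],
    )
    model = RandomForestClassifier(n_estimators=50, random_state=seed)
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)


data = make_synthetic_data()
model, accuracy = train_model(data)

# Ask the model for a recommendation for one hypothetical field.
sample = pd.DataFrame([[82, 6.1, 215]], columns=FEATURES)
prediction = model.predict(sample)[0]
print(f"held-out accuracy: {accuracy:.2f}, recommended crop: {prediction}")
```

The accuracy on this toy data says nothing about real fields, because the clusters are deliberately well separated. With a real dataset, only make_synthetic_data would need to be swapped for a pandas read of the actual file.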

Project Report

SNAPSHOTS:

Home page: [screenshot not reproduced]

Alert for invalid inputs: [screenshot not reproduced]

OUTPUT:

Giving output: [screenshot not reproduced]
Conclusion:

The project's main objective is to optimize crop yields and profitability while promoting
sustainable agricultural practices. By analyzing historical crop yields, soil characteristics,
and weather patterns, machine learning algorithms can predict which crops are most
likely to succeed on a specific plot of land. Flask, as a web framework, provides a user-
friendly interface for farmers to input their field data and receive personalized crop
recommendations.
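
The input-and-recommendation flow described above might look roughly like the following Flask sketch. It is an assumption-laden outline, not the project's code: the /recommend route, the field names and ranges in FIELD_RANGES, and the recommend_crop rule are all hypothetical, and the rule is only a placeholder for a trained model's prediction.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical input fields and accepted ranges; the report does not specify them.
FIELD_RANGES = {
    "nitrogen": (0.0, 200.0),
    "ph": (0.0, 14.0),
    "rainfall_mm": (0.0, 500.0),
}


def validate(payload):
    """Return (clean_values, errors) for a submitted JSON payload."""
    values, errors = {}, []
    for name, (low, high) in FIELD_RANGES.items():
        try:
            number = float(payload.get(name))
        except (TypeError, ValueError):
            errors.append(f"{name} must be a number")
            continue
        if not low <= number <= high:
            errors.append(f"{name} must be between {low} and {high}")
        else:
            values[name] = number
    return values, errors


def recommend_crop(values):
    """Placeholder rule standing in for a trained model's predict() call."""
    return "rice" if values["rainfall_mm"] > 150 else "maize"


@app.route("/recommend", methods=["POST"])
def recommend():
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        payload = {}  # treat a missing or malformed body as empty input
    values, errors = validate(payload)
    if errors:
        # Invalid input: report every problem so the page can show an alert.
        return jsonify({"errors": errors}), 400
    return jsonify({"crop": recommend_crop(values)})
```

Saved to a file, the app can be started with Flask's own command-line runner and then sent a JSON body containing the three fields. Invalid or missing values produce a 400 response listing each problem, which is one way a front end could drive an alert for invalid inputs.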

Overall, the crop recommendation project using machine learning and Flask offers an
innovative solution to enhance decision-making in agriculture. By integrating data-driven
insights, it empowers farmers with the ability to optimize crop selection and management,
leading to higher efficiency, improved yields, and a more sustainable future for
agriculture.
