Internship
BACHELOR OF TECHNOLOGY
IN COMPUTER SCIENCE AND ENGINEERING
By
Varshith Uddanti (208T1A05B9)
DHANEKULA INSTITUTE OF ENGINEERING & TECHNOLOGY
GANGURU, VIJAYAWADA - 521 139
Affiliated to JNTUK, Kakinada & Approved by AICTE, New Delhi
Certified by ISO 9001-2015, Accredited by NBA
This is to certify that the Summer Internship work entitled “Empowering Farming
Decisions: Exploring Machine Learning in Crop Recommendations” is a bonafide record
Head of Department:
Dr. K. SOWMYA
Professor, HOD CSE
EXTERNAL EXAMINER
DHANEKULA INSTITUTE OF ENGINEERING & TECHNOLOGY
Department of Computer Science & Engineering
VISION – MISSION - PEOs
Department Vision: To empower the budding talents and ensure them with probable
employability skills in addition to human values by optimizing the resources.
PROGRAM OUTCOMES (POs)
1. Engineering knowledge: apply the knowledge of mathematics, science, engineering fundamentals, and an
engineering specialization to the solution of complex engineering problems.
2. Problem Analysis: identify, formulate, review research literature, and analyze complex engineering
problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering
sciences.
3. Design/Development Of Solutions: design solutions for complex engineering problems and design
system components or processes that meet the specified needs with appropriate consideration for the public health
and safety, and the cultural, societal, and environmental considerations.
4. Conduct Investigations Of Complex Problems: use research-based knowledge and research methods
including design of experiments, analysis and interpretation of data, and synthesis of the information to provide
valid conclusions.
5. Modern Tool Usage: create, select, and apply appropriate techniques, resources, and modern engineering
and IT tools including prediction and modelling to complex engineering activities with an understanding of the
limitations.
6. The Engineer And Society: apply reasoning informed by the contextual knowledge to assess societal,
health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering
practice.
7. Environment And Sustainability: understand the impact of the professional engineering solutions in
societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable development.
8. Ethics: apply ethical principles and commit to professional ethics and responsibilities and norms of the
engineering practice.
9. Individual And Team Work: function effectively as an individual, and as a member or a leader in diverse
teams, and in multidisciplinary settings.
10. Communication: communicate effectively on complex engineering activities with the engineering
community and with society at large, such as, being able to comprehend and write effective reports and design
documentation, make effective presentations, and give and receive clear instructions.
11. Project Management And Finance: demonstrate knowledge and understanding of the engineering and
management principles and apply these to one’s own work, as a member and leader in a team, to manage projects
and in multidisciplinary environments.
12. Life-long Learning: recognize the need for, and have the preparation and ability to engage in
independent and life-long learning in the broadest context of technological change.
PROGRAM SPECIFIC OUTCOMES (PSOs)
PSO1: Designing and developing the Information Technology based systems with high professional skills.
PSO2: Qualify in national and international level competitive examinations for successful higher studies and
get employment in IT-enabled industries.
Internship Mappings
Project Title: Application to share and discover photos and albums like Instagram.
Outcome columns: PO1, PO2, PO3, PO4, PO5, PO6, PO7, PO8, PO9, PO10, PO11, PO12, PSO1, PSO2
Contents
Internship Log
Project Report
Internship Carried Out Organization Details
Internship Log
Day 1:
● Introduction to Data Science and Artificial Intelligence
● Importance of Data in the 21st Century
● Types of Data and its usage
● Difference between Data Science and Artificial Intelligence
● Overview of Data Analysis and key steps involved
● How does Machine Learning work?
● Different stages of Machine Learning projects
● Understanding the Importance of Path and Installation of Jupyter Notebook
● Modules and Packages
Day 2:
● What is a Python Module? How is it different from a Python File?
● Creating Modules and Packages
● Importing Functions, Variables from different modules
● Python built-in modules
● Working on math, time, random modules
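The three built-in modules named above can be tried in a few lines. This is only an illustrative sketch: the variable names and the seed value are assumptions made for the example and are not taken from the internship material.

```python
import math
import random
import time

# math: numeric helpers from the standard library
hypotenuse = math.hypot(3, 4)            # length of the hypotenuse of a 3-4-5 triangle
circle_area = math.pi * 2 ** 2           # area of a circle with radius 2

# random: seeding makes repeated runs produce the same draws
random.seed(42)                          # seed value chosen arbitrarily
dice_roll = random.randint(1, 6)         # integer between 1 and 6, inclusive
shuffled = random.sample(range(5), k=5)  # a random permutation of 0..4

# time: measure how long a piece of work takes
start = time.perf_counter()
total = sum(i * i for i in range(100_000))
elapsed = time.perf_counter() - start

print(hypotenuse, round(circle_area, 2), dice_roll, shuffled, f"{elapsed:.4f}s")
```

A user-defined module works the same way: saving functions in a file such as my_tools.py (a hypothetical name) makes them available through import my_tools.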
Day 3:
● Hands-On – Working on Python Built-in Modules and User-defined Custom Module
● Understanding Importance of Data Analysis
● Understanding Importance and Types of Data Analysis
● Understanding types of Data
Day 4:
● Understanding Importance of Visualization and Types of Graphs
● Understanding the Importance and Usage of Jupyter Notebook
● Understanding and Working on NumPy
● Introduction to NumPy
● Advantages of NumPy over lists
● Creating NumPy arrays - 1-D, 2-D, N-D arrays
● Data types for ndarrays
Day 5:
● Checking the attributes - shape, size, dimensions, dtype
● NumPy Arithmetic Operations
● NumPy universal functions
● Linear Algebra using NumPy
● Working on Appending and Concatenating
Day 6:
● Pandas for Data Analysis
● Getting Started with Pandas
● Introduction to Pandas Data Structures - Series, DataFrames
● Checking Attributes and Description
Day 7:
● Basic Essential functionality - Reindexing, Dropping entries from an axis
● Indexing, Selection, and Filtering
● Working on loc and iloc Functionalities
● Data Loading and Storage
● Reading and writing different file types (.txt, .xlsx, .csv files)
Day 8:
● Interacting with Web APIs
● Accessing data from databases
● Writing/Saving Files
Day 9:
● Data Cleaning and Preparation
● Handling Missing Data - Filtering out Missing Data, Filling in Missing data
● Removing Duplicates
● Computing Indicator/Dummy Variables
Day 10:
● Data Wrangling
● Concatenation - Adding Rows, Adding Columns, Concatenation with Different Indices
● Merging Data Frames
Day 11:
● Data Aggregation and Group Operations
● Pivot Tables and Cross-Tabulation
● Working on Time Series data
● Hands-on: Case Study Working on Titanic Dataset for Cleansing Data, HR Analytics Data
Day 12:
● Data Visualization Using Matplotlib
● Introduction to Matplotlib
● Setting Labels, Titles, xticks, and yticks
● Multiple Line Plots, adding legend
● Bar charts - What are they, When to use it
● Bar chart for comparing categorical data
● Histogram to check the distribution of numerical data
Day 13:
● Scatter Plots and their Usage
● Pie charts and their usage
● Subplots and their Usage
● Hands-on: Case Study Working on Titanic Dataset for Visualization
Day 14:
● Python Interactive Visualization using Plotly for Dashboards
● Introduction to Plotly and Cufflinks
● Loading Plotly and Cufflinks
● Loading the Data
Day 15:
● Quick Visualization with custom bar charts
● Interactive Bubble charts
● Understanding and Working on Choropleth Maps
● Hands-on: Analyzing Gapminder dataset
Day 16:
● Data - Wealth of the 21st Century - Web Scraping using Python
● Why Web Scraping and Understanding its importance
● Installing BeautifulSoup
● Understanding web structures
● Scraping data from the web using Beautiful Soup - Static & Dynamic websites
● Performing Data Visualization over the scraped data
Day 17:
● Machine Learning Fundamentals
● Data Transformation and Preprocessing
● Handling Numeric Features
● Feature Scaling
● Standardization and Normalization
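The two feature-scaling methods listed above can be written out directly. The following is a minimal sketch using only the standard library; the function names and the sample rainfall values are made up for illustration.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score scaling: subtract the mean, divide by the population std. dev."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - mu) / sigma for v in values]


def normalize(values):
    """Min-max scaling into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


rainfall_mm = [50.0, 100.0, 150.0, 200.0]  # hypothetical values
print(standardize(rainfall_mm))  # result has mean 0 and standard deviation 1
print(normalize(rainfall_mm))    # smallest value maps to 0.0, largest to 1.0
```

Standardization is usually preferred when a feature has outliers or no natural bounds, while min-max normalization keeps every value inside a fixed range.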
Day 18:
● Handling Categorical Features
● One Hot Encoding, pandas get dummies
● Label Encoding
● More on different encoding techniques
● Train, Test and Validation Split
● Simple Train and Test Split
● Drawbacks of train and test split
● K-fold cross-validation
● Time-based splitting
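K-fold cross-validation, listed above, addresses the main drawback of a single train and test split: every sample is used for testing exactly once. The sketch below shows one way to generate the index splits by hand; the function name is an assumption, and the data is left unshuffled to keep the example easy to follow.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


for train, test in k_fold_indices(n_samples=10, k=3):
    print("train:", train, "test:", test)
```

For time-ordered data this scheme would leak future samples into training, which is why time-based splitting keeps every test fold later than its training data.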
Day 19:
● Overfitting And Underfitting
● What is overfitting?
● What causes overfitting?
● What is Underfitting?
● What causes underfitting?
● What are bias and Variance?
● How to overcome overfitting and underfitting problems?
Day 20:
● Supervised Machine Learning Algorithms
● Regression and its Importance in real-world cases
● Introduction to Linear Regression
● Understanding How Linear Regression Works
Day 21:
● Maths behind Linear Regression
● Ordinary Least Square
● Gradient Descent
● R-square
● Adjusted R-square
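The topics for this day fit together in one small example: ordinary least squares gives the best-fit line in closed form, gradient descent reaches the same line iteratively, and R-square measures how much variance the line explains. The code below is a sketch for a single feature; the function names, learning rate, step count, and data points are assumptions chosen for illustration.

```python
def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def gradient_descent(xs, ys, lr=0.01, steps=5000):
    """Minimise mean squared error with batch gradient descent."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        slope -= lr * (2 / n) * sum(e * x for e, x in zip(errors, xs))
        intercept -= lr * (2 / n) * sum(errors)
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """1 - (residual sum of squares / total sum of squares)."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_features):
    """Penalise R-square for the number of features used."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
gd_slope, gd_intercept = gradient_descent(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(slope, intercept, gd_slope, gd_intercept, r2, adjusted_r_squared(r2, len(xs), 1))
```

On this data both methods agree closely; gradient descent becomes the practical choice when the closed-form solution is too expensive to compute.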
Day 22:
● Polynomial Regression
● Multiple Regression
● Performance Measures - MSE, RMSE, MAE
● Assumption of Linear Regression
● Ridge and Lasso regression
● Hands-on: Algorithm implementation with real use case datasets
Day 23:
● Building and Deployment of Machine Learning Model - Flask, Git, GitHub & PythonAnywhere
● Understanding steps in end-to-end ML projects
● Building a web service for Machine Learning Model
● Git Download and GitHub Usage
● Deploying the Final Trained Model on PythonAnywhere
Day 24:
● Understanding Classification Modelling Approach
● Introduction to the Classification problem
● Logistic Regression: Why the name "Regression"? Implementation of the Sigmoid Function
Day 25:
● Working on a dataset for Logistic Regression
● Performance Metrics for Classification Algorithms
● Accuracy Score, Confusion Matrix, Precision, Recall, F1-Score, ROC Curve and AUC, Log Loss
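Several of the metrics listed above follow directly from the four cells of a binary confusion matrix. The sketch below computes them by hand; the function names and the label vectors are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1-score from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical ground-truth labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # hypothetical model predictions
print(confusion_counts(y_true, y_pred))
print(classification_metrics(y_true, y_pred))
```

Precision and recall matter most when classes are imbalanced, where accuracy alone can look high even if the positive class is rarely detected.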
Day 26:
● Decision Trees
● Introduction to Decision Tree
● Homogeneity and Entropy
● Gini Index
● Information Gain
● Advantages of Decision Tree
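Entropy, the Gini index, and information gain are the quantities a decision tree uses to choose a split. The following sketch computes them for a list of class labels; the function names and the example split are assumptions for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy in bits: sum of p * log2(1 / p) over the classes."""
    n = len(labels)
    return sum((c / n) * log2(n / c) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["yes"] * 5 + ["no"] * 5       # hypothetical node with a 50/50 class mix
left = ["yes"] * 4 + ["no"] * 1         # hypothetical split result
right = ["yes"] * 1 + ["no"] * 4

print(entropy(parent), gini(parent))
print(information_gain(parent, [left, right]))
```

A perfectly homogeneous node has entropy 0 and Gini index 0, so the tree prefers the split with the largest information gain (or the largest drop in Gini impurity).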
Day 27:
● Preventing Overfitting
● Plotting Decision Trees
● Plotting feature importance
● Regression using Decision Trees
● Hands-On - Decision Tree on US Adult income dataset
Day 28:
● Ensemble Learning
● Introduction to Ensemble Learning
● Bagging (Bootstrap Aggregation)
● Constructing random forests
● Runtime
● Case study on Bagging
Day 29:
● Tuning hyperparameters of random forest (GridSearch, RandomizedSearch)
● Measuring model performance
Day 30:
● Boosting
● Gradient Boosting
● Adaboost and XGBoost
● Case study on boosting trees
Day 31:
● Hyperparameter tuning
● Evaluating performance
● Stacking Models
● Hands-On - Talking Data Ad Tracking Fraud Detection case study
Day 32:
● Naive Bayes
● Refresher on conditional Probability
● Bayes Theorem
● Examples of Bayes theorem
● Exercise problems on Naive Bayes
● Naive Bayes Algorithm
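Bayes' theorem, the basis of the Naive Bayes algorithm, can be checked numerically with a short example. The sketch below uses the two-hypothesis form of the theorem; the function name and the probabilities are hypothetical.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) = P(E | H) * P(H) / P(E) for a hypothesis H and evidence E.

    prior               -- P(H)
    likelihood          -- P(E | H)
    false_positive_rate -- P(E | not H)
    """
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical spam filter: 1% of mail is spam, a keyword appears in 90% of
# spam messages and in 5% of legitimate messages.
p_spam_given_keyword = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print(round(p_spam_given_keyword, 4))
```

Even with a strong indicator, the posterior stays modest because the prior is small. Naive Bayes applies the same rule per class and multiplies the per-feature likelihoods under the assumption that features are independent.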
Day 33:
● Assumptions of Naive Bayes Algorithm
● Laplace Smoothing
● Naive Bayes for Multiclass classification
● Handling numeric features using Naive Bayes
● Measuring performance of Naive Bayes
● Hands-On - Working on Spam detection and Amazon Food Review dataset
Day 34:
● Support Vector Machines
● Introduction to SVM
● What are hyperplanes?
● Geometric intuition
● Maths behind SVM
● Loss Function
● Kernel trick
● Polynomial kernel, RBF, and linear kernels
Day 35:
● SVM Regression
● Tuning the parameter
● GridSearch and RandomizedSearch
● Hands-On - Case Study SVM on Social network ADs
Day 36:
● K Nearest Neighbors
● Introduction to KNN
● Effectiveness of KNN
● Distance Metrics
● Accuracy of KNN
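K Nearest Neighbors needs only a distance metric and a majority vote. The sketch below supports Euclidean distance (via math.dist, available from Python 3.8) and Manhattan distance; the function names and the sample points are assumptions for illustration.

```python
from collections import Counter
from math import dist  # Euclidean distance between two points


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_points, train_labels, query, k=3, metric=dist):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: metric(pair[0], query))
    top_labels = [label for _, label in ranked[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Hypothetical two-class training data
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_points, train_labels, (2, 2)))
print(knn_predict(train_points, train_labels, (6, 5), metric=manhattan))
```

Because every prediction scans the whole training set, this brute-force form becomes slow on large datasets, and features on different scales should be scaled first so that no single feature dominates the distance.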
Day 37:
● Effect of an outlier on KNN
● Finding the k Value
● KNN on regression
● Where not to use KNN
● Hands-On - Case Study on ECommerce Recommendation
Day 38:
● Unsupervised Machine Learning Algorithms
● Introduction to Unsupervised Learning
● K Means Geometric intuition
● Maths Behind KMeans
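The maths behind K Means comes down to two alternating steps: assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. The sketch below implements that loop (Lloyd's algorithm) with caller-supplied starting centroids so the result is deterministic; the function names and the points are assumptions for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centroids, max_iters=100):
    """Lloyd's algorithm, starting from the given initial centroids."""
    clusters = []
    for _ in range(max_iters):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]  # keep an empty cluster's centroid
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D points forming two visible groups
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.6), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[(1.0, 1.0), (8.0, 8.0)])
print(centroids)
print(clusters)
```

The result depends on the starting centroids, which is why practical implementations run several random initialisations and keep the one with the lowest within-cluster sum of squares.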
Day 39:
● Determining the right k
● Evaluation metrics for KMeans
● Case study on K Means
● Introduction and Working on Hierarchical Clustering
Day 40:
● Dimensionality Reduction Techniques
● What are the dimensions?
● Why is high dimensionality a problem?
● Introduction to MNIST dataset (784 dimensions)
● Intro to Dimensionality reduction techniques
● PCA (Principal Component Analysis) for dimensionality reduction
● Hands-on: Applying Dimensionality Reduction along with
Day 41:
● PythonAnywhere Deployment
Day 42:
● AWS Deployment
Day 43 - 60:
● Project Development
Domain Area Of Internship
Domain: Machine learning using Python
Project Report
SNAPSHOTS:
HOME PAGE:
OUTPUT:
Giving output:
Conclusion:
The project's main objective is to optimize crop yields and profitability while promoting
sustainable agricultural practices. By analyzing historical crop yields, soil characteristics,
and weather patterns, machine learning algorithms can predict which crops are most
likely to succeed on a specific plot of land. Flask, as a web framework, provides a user-
friendly interface for farmers to input their field data and receive personalized crop
recommendations.
Overall, the crop recommendation project using machine learning and Flask offers an
innovative solution to enhance decision-making in agriculture. By integrating data-driven
insights, it empowers farmers with the ability to optimize crop selection and management,
leading to higher efficiency, improved yields, and a more sustainable future for
agriculture.