Stats-with-R-project

The document presents an analysis of agricultural crops in India, focusing on various statistical representations such as boxplots, pie charts, and scatter plots for crops like rice, wheat, barley, and cotton. It includes data summaries and visualizations that highlight production areas, yields, and correlations among different crops across various states. The analysis emphasizes the significant concentration of rice production and the geographical factors influencing crop yields in different regions.

Uploaded by

Vedant Baiswar

Analysis of Agricultural Crops in India

Vedant, Dev and Sanidhya

Acknowledgements

We wish to express our sincere gratitude to Rajni Ma’am and Uday Sir, our project guide, and to our Principal,
Dr. John Varghese, for their guidance and support in completing our project on “Analysis of Agricultural
Crops in India”. Their cooperation and contributions were crucial to the project’s success.

Contents
• Boxplot for rice production
• Pie chart for depicting Rice area
• Scatter plot on barley yield
• Sugarcane production line graph
• Stacked bar chart for cotton yield
• Scatter plot on Wheat Yield
• Scatter plot on Correlation between Rice Production and Area
• Barplot on oilseed production in Rajasthan
• Line graph for area under cotton
• Boxplot on Maize Yield
• Analysing relations between different crop yields using pairs plot
• Maize Production pie chart
• Oilseeds boxplot (facet wrapped by states)
• Area under production for wheat
• Histogram and frequency polygon

data

library(dplyr)

##
## Attaching package: ’dplyr’

## The following objects are masked from ’package:stats’:

##
## filter, lag

## The following objects are masked from ’package:base’:

##
## intersect, setdiff, setequal, union

library(ggplot2)
library(readxl)
library(GGally)

## Registered S3 method overwritten by ’GGally’:

## method from
## +.gg ggplot2

getwd()

## [1] "/Users/vdb/Downloads"

C=(read_xlsx("ICRISAT-District Level Data (1).xlsx"))

#read.csv("ICRISAT-District Level Data (1).csv")
B=as.data.frame(C)

Dev

summary(B)

## Dist Code Year State Code State Name

## Min. : 7.0 Min. :2014 Min. : 2.000 Length:640
## 1st Qu.:123.8 1st Qu.:2015 1st Qu.: 6.000 Class :character
## Median :163.5 Median :2016 Median :10.000 Mode :character
## Mean :252.4 Mean :2016 Mean : 8.331
## 3rd Qu.:222.2 3rd Qu.:2016 3rd Qu.:12.000
## Max. :912.0 Max. :2017 Max. :13.000
## Dist Name RICE AREA (1000 ha) RICE PRODUCTION (1000 tons)
## Length:640 Min. : 0.0 Min. : 0.000
## Class :character 1st Qu.: 4.0 1st Qu.: 5.322
## Mode :character Median : 68.7 Median : 146.310
## Mean : 119.2 Mean : 302.700
## 3rd Qu.: 178.6 3rd Qu.: 437.375
## Max. :1154.2 Max. :3215.010
## RICE YIELD (Kg per ha) WHEAT AREA (1000 ha) WHEAT PRODUCTION (1000 tons)
## Min. : 0 Min. : 0.0 Min. : 0.0
## 1st Qu.:1268 1st Qu.: 75.4 1st Qu.: 180.5
## Median :2195 Median :149.1 Median : 425.5
## Mean :1919 Mean :156.6 Mean : 503.4
## 3rd Qu.:2626 3rd Qu.:216.7 3rd Qu.: 692.7
## Max. :5160 Max. :879.5 Max. :4169.4
## WHEAT YIELD (Kg per ha) MAIZE AREA (1000 ha) MAIZE PRODUCTION (1000 tons)
## Min. : 0 Min. : 0.000 Min. : 0.00
## 1st Qu.:2352 1st Qu.: 0.500 1st Qu.: 0.93
## Median :2975 Median : 6.825 Median : 12.00
## Mean :2961 Mean : 25.850 Mean : 69.30
## 3rd Qu.:3578 3rd Qu.: 33.233 3rd Qu.: 71.91
## Max. :5166 Max. :267.890 Max. :1510.95
## MAIZE YIELD (Kg per ha) BARLEY AREA (1000 ha) BARLEY PRODUCTION (1000 tons)
## Min. : 0 Min. : 0.000 Min. : 0.000
## 1st Qu.: 1381 1st Qu.: 0.000 1st Qu.: 0.000
## Median : 1920 Median : 0.590 Median : 1.080
## Mean : 2058 Mean : 3.558 Mean : 10.087
## 3rd Qu.: 2514 3rd Qu.: 3.775 3rd Qu.: 8.707
## Max. :21429 Max. :101.510 Max. :328.710
## BARLEY YIELD (Kg per ha) SUNFLOWER AREA (1000 ha)
## Min. : 0 Min. :0.0000

## 1st Qu.: 0 1st Qu.:0.0000
## Median :1836 Median :0.0000
## Mean :1747 Mean :0.1375
## 3rd Qu.:2823 3rd Qu.:0.0100
## Max. :5000 Max. :5.2700
## SUNFLOWER PRODUCTION (1000 tons) SUNFLOWER YIELD (Kg per ha)
## Min. :0.0000 Min. : 0.0
## 1st Qu.:0.0000 1st Qu.: 0.0
## Median :0.0000 Median : 0.0
## Mean :0.1882 Mean : 433.2
## 3rd Qu.:0.0200 3rd Qu.:1103.9
## Max. :6.4500 Max. :2142.9
## SOYABEAN AREA (1000 ha) SOYABEAN PRODUCTION (1000 tons)
## Min. : 0.00 Min. : 0.00
## 1st Qu.: 0.00 1st Qu.: 0.00
## Median : 0.00 Median : 0.00
## Mean : 41.38 Mean : 42.42
## 3rd Qu.: 17.12 3rd Qu.: 14.25
## Max. :468.78 Max. :673.00
## SOYABEAN YIELD (Kg per ha) OILSEEDS AREA (1000 ha)
## Min. : 0.0 Min. : 0.00
## 1st Qu.: 0.0 1st Qu.: 0.00
## Median : 0.0 Median : 0.00
## Mean : 435.1 Mean : 48.72
## 3rd Qu.: 854.3 3rd Qu.: 45.00
## Max. :2063.8 Max. :636.70
## OILSEEDS PRODUCTION (1000 tons) OILSEEDS YIELD (Kg per ha)
## Min. : 0.00 Min. : 0.0
## 1st Qu.: 0.00 1st Qu.: 0.0
## Median : 0.00 Median : 0.0
## Mean : 56.16 Mean : 403.5
## 3rd Qu.: 41.09 3rd Qu.: 837.4
## Max. :1101.11 Max. :3553.2
## SUGARCANE AREA (1000 ha) SUGARCANE PRODUCTION (1000 tons)
## Min. : 0.000 Min. : 0.00
## 1st Qu.: 0.010 1st Qu.: 0.03
## Median : 0.730 Median : 3.37
## Mean : 17.871 Mean : 122.37
## 3rd Qu.: 6.732 3rd Qu.: 34.10
## Max. :277.300 Max. :2326.51
## SUGARCANE YIELD (Kg per ha) COTTON AREA (1000 ha)
## Min. : 0.0 Min. : 0.00
## 1st Qu.: 506.3 1st Qu.: 0.00
## Median : 5864.0 Median : 0.00
## Mean : 4785.4 Mean : 27.41
## 3rd Qu.: 6970.6 3rd Qu.: 5.00
## Max. :17988.0 Max. :492.87
## COTTON PRODUCTION (1000 tons) COTTON YIELD (Kg per ha)
## Min. : 0.0000 Min. : 0.0
## 1st Qu.: 0.0000 1st Qu.: 0.0
## Median : 0.0000 Median : 0.0
## Mean : 12.4371 Mean : 126.9
## 3rd Qu.: 0.6425 3rd Qu.: 108.0
## Max. :348.7300 Max. :1009.5

## FRUITS AND VEGETABLES AREA (1000 ha)
## Min. : 0.00
## 1st Qu.: 0.00
## Median : 7.08
## Mean : 20.95
## 3rd Qu.: 23.91
## Max. :213.69

Boxplot for rice production

C%>%ggplot(aes(`RICE PRODUCTION (1000 tons)`))+

geom_boxplot(col=c("red"))+
facet_wrap(~Year)+
labs(title="CONCENTRATION OF RICE PROD. IN FOUR YEARS THROUGH BOXPLOTS")

[Figure: boxplots of RICE PRODUCTION (1000 tons), one panel per year (2014, 2015, 2016, 2017), titled "CONCENTRATION OF RICE PROD. IN FOUR YEARS THROUGH BOXPLOTS"]

The graph shows that rice production is concentrated below 1,000,000 tons in every year. The most extreme
outliers occur in 2014 and 2015, at more than 3,000,000 tons. The median production is around 300,000 to
400,000 tons across the districts of the states considered, except in 2015, when most districts produced
less than 500,000 tons and the median was lower still, at about 100,000 tons. The mild outliers lie below
3,000,000 tons. Overall, most districts produce up to about 500,000 tons of rice, while a few districts with
exceptionally favourable geography and a strong agricultural base produce more than 3,000,000 tons in a year.
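
The outliers discussed above are the points that fall outside the usual 1.5 × IQR fences of a boxplot. A minimal base R sketch of that rule follows; the production values are hypothetical and are not taken from the ICRISAT file.

```r
# Hypothetical district-level rice production (1000 tons); not the project data
prod <- c(5, 40, 120, 150, 180, 300, 420, 450, 900, 3200)

q <- quantile(prod, c(0.25, 0.5, 0.75))   # lower quartile, median, upper quartile
iqr <- q[["75%"]] - q[["25%"]]            # interquartile range
upper_fence <- q[["75%"]] + 1.5 * iqr     # points above this are drawn as outliers
lower_fence <- q[["25%"]] - 1.5 * iqr     # points below this are drawn as outliers

outliers <- prod[prod > upper_fence | prod < lower_fence]
print(q)
print(outliers)
```

Running the same calculation separately for each year (for example with `tapply`) would give the fences behind each panel of a faceted boxplot.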

Pie chart for depicting Rice area

E=filter(B,Year==2015)
# Total rice area per state for 2015; across() replaces the deprecated summarize_each()/funs()
AREA=E%>%group_by(`State Name`) %>%
summarise(across(`RICE AREA (1000 ha)`, ~sum(.x,na.rm=TRUE)))

F=AREA[1:7,2]
G=as.list(F)
H=c(3232.34 ,774.22,1353.10,2026.00 ,182.92 ,5876.26,5523.96 )

pie(H,labels = c("Bihar(61.34 %)","Gujarat(14.69%)","Haryana(25.67%)","MadhyaPradesh(38.45%)","Rajasthan

[Figure: pie chart titled "DISTRIBUTION OF TOTAL RICE AREA AMONG TOP PRODUCING STATES", with slices labelled Bihar (61.34%), Gujarat (14.69%), Haryana (25.67%), Madhya Pradesh (38.45%), Rajasthan (6.07%), Uttar Pradesh (111.522%) and West Bengal (104.83%)]

The chart shows that Uttar Pradesh has the largest area under rice among these seven states in 2015 (about
5,876 thousand ha), closely followed by West Bengal (about 5,524 thousand ha), where rice is the staple crop.
Bihar comes third. This can be attributed to the vast network of the Ganges and its tributaries and to the
alluvial soil of the plains. Rajasthan has the smallest rice area because it is an arid region, whereas rice
is an irrigation-intensive crop.
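
The percentages typed into the pie labels add up to far more than 100, so it may help to compute each state's share directly from the area vector. The sketch below reuses the seven values of `H`; the names `area`, `share` and `pie_labels`, and the chart title, are illustrative and not part of the original code.

```r
# Rice area (1000 ha) per state for 2015, copied from the vector H
area <- c(Bihar = 3232.34, Gujarat = 774.22, Haryana = 1353.10,
          `Madhya Pradesh` = 2026.00, Rajasthan = 182.92,
          `Uttar Pradesh` = 5876.26, `West Bengal` = 5523.96)

# Share of each state in the seven-state total, in percent
share <- round(100 * area / sum(area), 1)
pie_labels <- paste0(names(area), " (", share, "%)")

print(sort(share, decreasing = TRUE))
pie(area, labels = pie_labels,
    main = "Distribution of total rice area among states (2015)")
```

Because the labels are derived from the data, they stay consistent if the year or the set of states changes.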

Scatter plot on barley yield

# State Code 10 selects the Rajasthan districts
I=B%>%filter(`State Code`==10)
# Sum of the yearly barley yields per district; across() replaces the deprecated summarize_each()/funs()
RAJAREA=I%>%group_by(`Dist Name`) %>%
summarise(across(`BARLEY YIELD (Kg per ha)`, ~sum(.x,na.rm=TRUE)))

# labs() is added to the plot with + so that the title is actually drawn
ggplot(RAJAREA)+
geom_point(aes(`Dist Name`,`BARLEY YIELD (Kg per ha)`),colour="blue",size=2)+
theme(axis.text.x=element_text(angle=90))+
labs(title="BARLEY YIELD IN THE DISTRICTS OF RAJASTHAN")

[Figure: scatter plot of BARLEY YIELD (Kg per ha) against Dist Name for the districts of Rajasthan, with the district names along the x-axis]

The plot shows that Alwar has the highest barley yield, followed by Ganganagar and Sikar, while Bikaner has
the lowest. A likely reason is the better agricultural practices and equipment available to farmers in
districts such as Alwar, Ganganagar and Hanumangarh. These areas are also close to Punjab and Haryana and
have probably benefited from the Green Revolution. In addition, the Indira Gandhi Canal, which flows through
Rajasthan, has helped make the state one of the largest producers of barley.

Sugarcane production line graph

# Total sugarcane production per year; across() replaces the deprecated summarize_each()/funs()
SUGPROD=B%>%group_by(`Year`) %>%
summarise(across(`SUGARCANE PRODUCTION (1000 tons)`, ~sum(.x,na.rm=TRUE)));SUGPROD

## # A tibble: 4 x 2
## Year ‘SUGARCANE PRODUCTION (1000 tons)‘
## <dbl> <dbl>
## 1 2014 19436.
## 2 2015 17929.
## 3 2016 19955.
## 4 2017 20995

O=as.matrix(SUGPROD)
plot(O,type="l",cex=4,col="black",main="Production of SUGARCANE over the years")

[Figure: line graph titled "Production of SUGARCANE over the years"; x-axis: Year (2014 to 2017), y-axis: SUGARCANE PRODUCTION (1000 tons)]

The graph shows that sugarcane production was high in 2014 (about 19,436 thousand tons), fell sharply in
2015 (about 17,929 thousand tons) and then rose steadily to about 19,955 thousand tons in 2016 and 20,995
thousand tons in 2017. The dip may be attributed to unfavourable and erratic weather conditions for
sugarcane in 2015. In 2016 and 2017, government measures such as the price policy encouraged production.
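
The size of the dip and of the recovery can be read off as year-on-year percentage changes. This is a small sketch using the yearly totals as printed in the tibble (rounded); the names `sug` and `pct_change` are illustrative.

```r
# Yearly sugarcane totals (1000 tons), rounded as printed in the SUGPROD output
sug <- c(`2014` = 19436, `2015` = 17929, `2016` = 19955, `2017` = 20995)

# Percentage change relative to the previous year
pct_change <- 100 * diff(sug) / head(sug, -1)
print(round(pct_change, 1))
```

A fall followed by two successively smaller rises is what distinguishes a recovery from exponential growth, where the percentage change would stay roughly constant.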

Stacked bar chart for cotton yield

P=B%>%filter(.,`State Name`==c("Haryana","Gujarat",'Madhya Pradesh'))

## Warning: There was 1 warning in ‘filter()‘.

## i In argument: ‘‘State Name‘ == c("Haryana", "Gujarat", "Madhya Pradesh")‘.
## Caused by warning in ‘‘ ‘State Name‘ == c("Haryana", "Gujarat", "Madhya Pradesh") ‘‘:
## ! longer object length is not a multiple of shorter object length
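
The warning above arises because `==` compares the column with the three names position by position, recycling the shorter vector, rather than testing membership. The following toy sketch contrasts `==` with `%in%`, which is the usual way to keep every row belonging to any of several states; the `state` vector is hypothetical and not the project data.

```r
# Hypothetical state column; not the project data
state <- c("Haryana", "Gujarat", "Madhya Pradesh", "Gujarat", "Haryana",
           "Bihar", "Madhya Pradesh")
wanted <- c("Haryana", "Gujarat", "Madhya Pradesh")

# `==` recycles `wanted` along `state`, so each row is compared with only one
# of the three names, depending on its position
by_equals <- suppressWarnings(state == wanted)

# `%in%` asks, for every row, whether the state is any of the wanted names
by_in <- state %in% wanted

print(sum(by_equals))  # rows kept with ==
print(sum(by_in))      # rows kept with %in%
```

In a `filter()` call the same idea would read `` `State Name` %in% c("Haryana","Gujarat","Madhya Pradesh") ``; totals computed after such a filter would cover more rows than totals computed after the `==` version.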

# Sum of cotton yield per state and year; across() replaces the deprecated summarize_each()/funs()
COTY=P%>%group_by(`State Name`,`Year`) %>%
summarise(across(`COTTON YIELD (Kg per ha)`, ~sum(.x,na.rm=TRUE)),.groups="drop")

Y1=as.matrix(COTY)
Z=Y1[1:12,3]
Z1=as.vector(Z)
Z2=matrix(Z1,4,3,dimnames =list( c("2014","2015",'2016','2017'),c("Gujarat","Haryana","Madhya Pradesh")))
Z2

## Gujarat Haryana Madhya Pradesh

## 2014 "2951.12" "1374.29" " 728.10"
## 2015 "3236.01" " 423.83" " 522.12"
## 2016 "2675.83" " 541.35" " 289.09"
## 2017 "2999.47" "1433.48" " 851.53"

barplot(Z2,legend=T,col=c(1:12),ylim=c(0,15000))

[Figure: stacked bar chart of cotton yield for Gujarat, Haryana and Madhya Pradesh, with one stacked segment per year (2014 to 2017)]

The chart shows that Gujarat has the highest cotton yield of the three states, at a similar level in all
four years. Haryana, the second highest, had its maximum yield in 2017 and its lowest in 2015. Madhya
Pradesh, the lowest of the three, had its maximum in 2017 and its minimum in 2016. Gujarat lies in the black
soil belt of the Deccan Plateau, which helps explain its excellent yields in all four years, while Madhya
Pradesh, with only thin patches of black soil, had the lowest yield. Haryana does not lie in a black soil
belt, but heavy investment and modernised agriculture allow it to compete with the cotton belt states.
Haryana and Madhya Pradesh both recorded their highest yields in 2017, whereas Gujarat's highest was in
2015.

getwd()

## [1] "/Users/vdb/Downloads"

C=as.data.frame(read.csv("agricultural produce.csv"))# importing dataset

Vedant

Scatter plot on Wheat Yield

C%>%ggplot(aes(Year, WHEAT.YIELD..Kg.per.ha., col = WHEAT.YIELD..Kg.per.ha. > 3000)) +
geom_point() +
facet_wrap(~State.Name) +
# The colour variable is logical, so its levels are FALSE (<= 3000) and then TRUE (> 3000)
scale_color_manual(labels = c("<=3000", ">3000"), values = c("springgreen3", "red")) +
theme_minimal() +
theme(legend.position = "top",
legend.title = element_blank(),
axis.text = element_text(size = 10),
axis.title = element_text(size = 12, face = "bold"),
plot.title = element_text(size = 16, hjust = 0.5, face = "bold"))+
labs(title="Wheat Yield: 2014-2017",y="Yield")

[Figure: scatter plots titled "Wheat Yield: 2014-2017", one panel per state (Bihar, Gujarat, Haryana, Madhya Pradesh, Rajasthan, Uttar Pradesh, West Bengal); x-axis: Year, y-axis: Yield]

This plot shows wheat yield across the states in the dataset. The red points are observations with a yield
above 3000 kg per ha. Uttar Pradesh is the highest producer of wheat, and its yield exceeds 3000 kg per ha,
as the red points show. This is because it has fertile alluvial soil along with a dry, cool winter season,
which suits wheat best. Madhya Pradesh and Punjab are the other two states with a substantial share of wheat
production. West Bengal, the lowest producer of wheat among these states, has a tropical climate strongly
influenced by the Bay of Bengal. The state experiences mild winters and high temperatures, which are not
conducive to a winter crop such as wheat.

Scatter plot on Correlation between Rice Production and Area

C%>%ggplot(aes(RICE.PRODUCTION..1000.tons.,RICE.AREA..1000.ha.))+
geom_point(col="violetred3")+
geom_smooth(method=lm)+
labs(
title = "Correlation between Rice Production and Area",
x = "Rice Production (1000 tons)",
y = "Rice Area (1000 ha)")+
theme_minimal() +
theme(plot.title = element_text(size = 16, hjust = 0.5, face = "bold"),
axis.title = element_text(size = 12, face = "bold"),
axis.text = element_text(size = 10),
axis.text.x = element_text(angle = 45, hjust = 1))

## ‘geom_smooth()‘ using formula = ’y ~ x’

[Figure: scatter plot of Rice Area (1000 ha) against Rice Production (1000 tons), with a fitted straight line]

This scatter plot shows the correlation between the area under rice and rice production. There is a high
degree of correlation between the two variables, as the points do not deviate much from the line of best
fit. That line is shown in blue and is drawn with geom_smooth using the linear model (lm) method. The highest
rice production in the data is a little over 32 lakh tons, and the largest rice area is about 11.5 lakh ha.
In certain situations there might be a negative correlation between crop area and production because of the
limited availability of land.
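
The strength of the relationship seen in the plot can also be expressed as a number. The sketch below computes a Pearson correlation and a least-squares line on hypothetical values; it treats area as the predictor, which is the reverse of the plot's axes, and none of the numbers come from the dataset.

```r
# Hypothetical area (1000 ha) and production (1000 tons) for six districts
area <- c(50, 120, 200, 310, 450, 600)
production <- c(110, 300, 480, 800, 1150, 1500)

r <- cor(area, production)        # Pearson correlation coefficient
fit <- lm(production ~ area)      # least-squares straight line
slope <- coef(fit)[["area"]]      # extra production per extra 1000 ha

print(r)
print(slope)
print(summary(fit)$r.squared)     # equals r^2 when there is a single predictor
```

A value of `r` close to 1 corresponds to points lying close to the fitted line, which is the pattern described above.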

Barplot on oilseed production in Rajasthan

a=C%>%select(OILSEEDS.PRODUCTION..1000.tons.,State.Name,Dist.Name,Year)%>%
filter(State.Name=="Rajasthan"&Year=="2015")%>%group_by(OILSEEDS.PRODUCTION..1000.tons.)
d=c("584.19","541.04","473.17","432.08","398.6")
e=matrix(d,1,5)

colnames(e)=c("Ganganagar","Kota","Bikaner","Jodhpur","Bharatpur")
e=as.numeric(e)
barplot(e,col=colors()[31:42],names=c("Ganganagar","Kota","Bikaner","Jodhpur","Bharatpur"),xlab="Distric

[Figure: bar chart titled "Top 5 Oilseed producing districts in Rajasthan (2015)"; x-axis: Districts (Ganganagar, Kota, Bikaner, Jodhpur, Bharatpur), y-axis: OILSEEDS PRODUCTION]

This barplot shows the top 5 oilseed producing districts in Rajasthan in 2015. Ganganagar and Kota are the
highest oilseed producing districts, each with production above 5 lakh tons. The average production of the
five districts is about 4.9 lakh tons, and the lowest producer among them is Bharatpur, at around 4 lakh
tons. Rajasthan has an arid to semi-arid climate, which can suit certain oilseed crops such as mustard,
groundnut and soybean.
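
The five values in the chart were typed in by hand. As a hedged alternative, the top districts can be picked programmatically by sorting; the district names and values below are purely illustrative and do not come from the dataset.

```r
# Hypothetical district totals (1000 tons); names and values are illustrative only
oilseeds <- c(DistrictA = 120.5, DistrictB = 584.2, DistrictC = 40.1,
              DistrictD = 473.2, DistrictE = 398.6, DistrictF = 541.0,
              DistrictG = 432.1)

# Five largest values, in descending order, keeping the district names
top5 <- head(sort(oilseeds, decreasing = TRUE), 5)
print(top5)
print(mean(top5))   # average production of the top five

barplot(top5, xlab = "Districts", ylab = "Oilseeds production (1000 tons)")
```

Because `sort` keeps the names of a named vector, the bars are labelled without a separate `names` argument.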

Line graph for area under cotton

ve=C%>%filter(State.Name!="West Bengal"&Year=="2014")%>%select(COTTON.AREA..1000.ha.)%>%filter(COTTON.AR
v1=C%>%filter(State.Name!="West Bengal"&Year=="2015")%>%select(COTTON.AREA..1000.ha.)%>%filter(COTTON.AR
v2=C%>%filter(State.Name!="West Bengal"&Year=="2016")%>%select(COTTON.AREA..1000.ha.)%>%filter(COTTON.AR
v3=C%>%filter(State.Name!="West Bengal"&Year=="2017")%>%select(COTTON.AREA..1000.ha.)%>%filter(COTTON.AR
sum(ve)

## [1] 4660.64

sum(v1)

## [1] 4354.13

sum(v2)

## [1] 4038.95

sum(v3)

## [1] 4486.56

# Build the matrix from the totals computed above instead of retyping them by hand
vb=cbind(2014:2017,c(sum(ve),sum(v1),sum(v2),sum(v3)))
colnames(vb)=c("Year","Cotton area")
par(bg="seashell")
plot(vb,type = "b", col = "darkgreen", lwd = 2,
xlab = "Years", ylab = "Cotton Area (1000 ha)",
main = "Total area under cotton in India")
legend("topright", legend = c("Cotton area"), col = c("darkgreen", "red"), lty = c(1, 2), lwd = 2, cex =

[Figure: line graph titled "Total area under cotton in India"; x-axis: Years (2014 to 2017), y-axis: Cotton Area (1000 ha), legend: Cotton area]

This line chart shows the total area under cotton over four years in the states covered (West Bengal is
excluded). The area was highest in 2014, at around 46.6 lakh ha. It then declined steeply until 2016, mainly
because of deficit rainfall. After 2016 the area recovered to around 44.9 lakh ha in 2017. New varieties of
genetically modified or hybrid cotton seed are a major reason for this increase.

Boxplot on Maize Yield

C%>%filter(MAIZE.YIELD..Kg.per.ha.<15000&MAIZE.YIELD..Kg.per.ha.>0)%>%
ggplot(aes(x=reorder(Year,MAIZE.YIELD..Kg.per.ha.),y=MAIZE.YIELD..Kg.per.ha.))+
geom_boxplot(col="darkgreen",alpha=0.4,outlier.size=3,outlier.colour="red")+
coord_flip()+
labs(
title = "Maize Yield Distribution Over Years",
x = "Year",
y = "Maize Yield (Kg per ha)") +
stat_summary(fun=mean,geom="point",color="blue")

[Figure: horizontal boxplots titled "Maize Yield Distribution Over Years", one per year (2014 to 2017); x-axis: Maize Yield (Kg per ha), y-axis: Year]

The boxplots depict maize yield over four years. The median is close to 2200 kg per ha. In 2016 the 8000 kg
per ha mark was crossed, as the yield in Bihar was 8772 kg/ha, which is an extreme outlier. From 2016 to 2017
there was a substantial rise in yield, with values crossing the 6000 kg/ha mark. Interestingly, even though
2017 saw the highest yields overall, it also included yields close to 0. That is why its mean, shown by the
blue points drawn with stat_summary, is not substantially higher than in the other years.
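
The contrast between the blue mean points and the median lines comes from how differently the two statistics react to extreme values. A toy illustration follows; the yields are hypothetical, with one large value included only to mimic an extreme district.

```r
# Hypothetical maize yields (kg per ha) for one year; not from the dataset
yield <- c(1400, 1800, 2100, 2200, 2300, 2600, 8772)

print(median(yield))   # middle value, barely affected by the extreme district
print(mean(yield))     # pulled upward by the single large value

# Dropping the extreme value moves the mean far more than the median
typical <- yield[yield < 8000]
print(mean(typical))
print(median(typical))
```

This is why a boxplot, which is built on the median and quartiles, is usually read alongside the mean rather than replaced by it.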

Sanidhya

Analysing relations between different crop yields using pairs plot

P1=B%>%filter(`State Name`==c("Madhya Pradesh","Uttar Pradesh"))

b=select(P1,contains("YIELD") & starts_with(c("RICE","WHEAT","SUGARCANE","oilseeds")))
pairs(b, # data
main ="Analysing relations between different crop yields", # title
col="red", # colour
pch=1) # plotting character

[Figure: pairs plot titled "Analysing relations between different crop yields", with panels for RICE YIELD, WHEAT YIELD, OILSEEDS YIELD and SUGARCANE YIELD (Kg per ha)]

attach(b)
cor(`RICE YIELD (Kg per ha)`,`SUGARCANE YIELD (Kg per ha)`) # moderate degree of positive correlation

## [1] 0.4872056

cor(`WHEAT YIELD (Kg per ha)`,`SUGARCANE YIELD (Kg per ha)`)# negligible positive correlation

## [1] 0.1308622

cor(`WHEAT YIELD (Kg per ha)`,`RICE YIELD (Kg per ha)`) #low degree of positive correlation

## [1] 0.2611049

cor(`RICE YIELD (Kg per ha)`,`OILSEEDS YIELD (Kg per ha)`) # low degree of negative correlation

## [1] -0.3240037

cor(`WHEAT YIELD (Kg per ha)`,`OILSEEDS YIELD (Kg per ha)`)# negligible degree of negative correlation

## [1] -0.1466258

cor(`SUGARCANE YIELD (Kg per ha)`,`OILSEEDS YIELD (Kg per ha)`) # low degree of negative correlation

## [1] -0.3127461

detach(b)

Here we have analysed whether the yields of different crops in Madhya Pradesh and Uttar Pradesh are related,
using a pairs plot and the correlation function. The coefficients above give the degree of linear
correlation between each pair of yields, and the comments beside them interpret its strength and direction.
These correlations can suggest which crop combinations might suit farmers who adopt mixed cropping in these
states: on this reasoning, two crops whose yields are positively related would be suitable to grow together.
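
The six pairwise coefficients can also be obtained in a single call by passing a data frame to `cor`, which avoids `attach` and `detach`. The sketch below uses hypothetical yields and shortened column names, so its numbers will not match the project output.

```r
# Hypothetical yields (kg per ha); column names are shortened for illustration
yields <- data.frame(
  rice      = c(2100, 2500, 1800, 3000, 2700, 2200),
  wheat     = c(2900, 3100, 2600, 3500, 3300, 3000),
  sugarcane = c(5200, 6100, 4800, 7000, 6400, 5500),
  oilseeds  = c(900, 700, 1000, 400, 600, 850)
)

# All pairwise Pearson correlations as a symmetric matrix
cor_mat <- cor(yields)
print(round(cor_mat, 2))

# Crop pairs with a positive correlation (upper triangle only, to avoid duplicates)
pos <- which(cor_mat > 0 & upper.tri(cor_mat), arr.ind = TRUE)
print(data.frame(crop1 = rownames(cor_mat)[pos[, "row"]],
                 crop2 = colnames(cor_mat)[pos[, "col"]]))
```

With real data that contains missing values, `cor(yields, use = "complete.obs")` would restrict the calculation to complete rows.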

Maize Production pie chart

E=filter(B,Year=="2015")
PRODUCTION=E%>%group_by(`State Name`) %>%
summarize_each(list(~sum(.,na.rm=T)),`MAIZE PRODUCTION (1000 tons)` ) # using group by and summarize co

R1=(PRODUCTION[2]) # this data is extracted in the list format

Vec1=unlist(R1)
Vec1# using unlist to create numeric vector from the list

## MAIZE PRODUCTION (1000 tons)1 MAIZE PRODUCTION (1000 tons)2

## 2517.16 608.51
## MAIZE PRODUCTION (1000 tons)3 MAIZE PRODUCTION (1000 tons)4
## 18.00 2909.00
## MAIZE PRODUCTION (1000 tons)5 MAIZE PRODUCTION (1000 tons)6
## 1156.71 1304.66
## MAIZE PRODUCTION (1000 tons)7
## 662.44

pie(Vec1,
labels = c("Bihar(27.6%)","Gujarat(6.2%)","Haryana(0.3%)","Madhya Pradesh(31.9%)","Rajasthan(12.8%)"
col =c(3:10), # slice co
main="Contribution of each state in maize production", # title
radius=0.8, # radius
border="blue",lty=7 # border and line
)

[Figure: pie chart titled "Contribution of each state in maize production", with slices labelled Bihar (27.6%), Gujarat (6.2%), Haryana (0.3%), Madhya Pradesh (31.9%), Rajasthan (12.8%), Uttar Pradesh (14.1%) and West Bengal (7.1%)]

Madhya Pradesh and Bihar clearly contribute the most to maize production among these seven states. Uttar
Pradesh and Rajasthan each contribute about one eighth of the total. Gujarat and West Bengal contribute
less, while Haryana's production is negligible compared with the total. These differences reflect the
geographical conditions of the states, in particular the presence of well-drained loamy soil and moderate
temperatures.

Oilseeds boxplot(facet wrapped by states)

B%>%ggplot(aes(`OILSEEDS PRODUCTION (1000 tons)`))+ # data
geom_boxplot(col=c("red"),fill="black", #boxplot
outlier.colour = "dark green", # outlier color
outlier.shape = 6,
outlier.size = 1.2,
outlier.stroke = 1.1)+
facet_wrap(~`State Name`)+
labs(title="Boxplots for oilseed production in different states")+
theme_grey()

[Figure: boxplots titled "Boxplots for oilseed production in different states", one panel per state (Bihar, Gujarat, Haryana, Madhya Pradesh, Rajasthan, Uttar Pradesh, West Bengal); x-axis: OILSEEDS PRODUCTION (1000 tons)]

As the boxplots show, the median oilseed production in most states is 0. This is because oilseeds require
specific geographical conditions that are not found everywhere. Most states have many outliers, especially
Rajasthan and Gujarat, which produce most of the oilseeds in India. At least 75% of the districts of West
Bengal, Uttar Pradesh and Bihar do not produce oilseeds at all. Haryana differs from the other states, as
many of its districts contribute to oilseed production. The boxplots for Madhya Pradesh and Rajasthan are
similar, as geographical conditions in the two states are nearly the same apart from Rajasthan’s Thar
region.

Area under production for wheat

# Total wheat area per year; across() replaces the deprecated summarize_each()
Ar1=B%>%group_by(`Year`) %>%
summarise(across(`WHEAT AREA (1000 ha)`, ~sum(.x,na.rm=TRUE)))

f1=as.matrix(Ar1)
plot(f1,type="b",
cex=4,
col="orange",
main="Area under wheat production",
sub="Variation in area under production",
xlab="year",
ylab="area",
lwd=1.5,
lty=2,
pch=2)

[Figure: line graph titled "Area under wheat production" with subtitle "Variation in area under production"; x-axis: year (2014 to 2017), y-axis: area]

This line graph shows how the area under wheat changed over the years. The area fell in 2015 but returned to
its previous level in 2016, and it declined again in 2017. The data show no clear upward or downward trend.
One reason is that very little additional land is available, and what remains is not very fertile and
therefore not economical to cultivate. As a result, the changes have not been large.

Histogram and frequency polygon

f2=filter(B,Year=="2017"& `State Name`=="Uttar Pradesh")

ggplot(f2,aes(`RICE PRODUCTION (1000 tons)`))+
geom_histogram(binwidth=100,col="black",fill="yellow")+
geom_freqpoly(binwidth=100)+
theme_gray()

[Figure: histogram with frequency polygon; x-axis: RICE PRODUCTION (1000 tons), y-axis: count]

This chart shows how the districts of Uttar Pradesh are distributed by the amount of rice they produced in
2017. The width of each bin is 100, that is, each bin spans 100 thousand tons of rice. The height of a bin
shows the number of districts falling in that interval. The histogram has two peaks, one at 550-650 and one
between 50 and 250 thousand tons. The data suggest that almost all districts of Uttar Pradesh cultivate rice.
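
The counts behind a histogram can be reproduced without plotting, which makes the binning explicit. In this sketch the production values are hypothetical, and the bin edges are placed at multiples of 100 as an assumption; ggplot2 may position its bins differently (the 550-650 interval mentioned above suggests edges at 50, 150 and so on), so counts from the two approaches need not match.

```r
# Hypothetical district rice production (1000 tons); not the Uttar Pradesh data
prod <- c(20, 60, 110, 130, 180, 220, 240, 410, 560, 590, 610, 640, 1150)

# Bin edges 100 units apart, mirroring a bin width of 100
breaks <- seq(0, 1200, by = 100)
h <- hist(prod, breaks = breaks, plot = FALSE)

# Number of districts per bin, labelled by the bin's lower and upper edge
counts <- setNames(h$counts, paste0(head(breaks, -1), "-", tail(breaks, -1)))
print(counts[counts > 0])
```

The heights of the histogram bars and the vertices of the frequency polygon are these same counts, drawn in two different ways.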

Bibliography

https://mospi.gov.in/4-agricultural-statistics

Beginning R: Statistical Programming Language by Dr. Mark Gardener

No ratings yet
Musa L. (Musaceae) Comprises of 68-70 Species That Distributed All Over South East Countries
2 pages
List of FPOs in The State of Karnataka
No ratings yet
List of FPOs in The State of Karnataka
11 pages
JSS 1&2 PVS ATEQ
No ratings yet
JSS 1&2 PVS ATEQ
24 pages
AGRI TOURISM IN PAKISTAN PPT Group 5
No ratings yet
AGRI TOURISM IN PAKISTAN PPT Group 5
10 pages
Part 2 - Mrs Emilia Tri Setyowati - Cattle Dairy, Imrpoving Livelihood in Organic Dairy Product
No ratings yet
Part 2 - Mrs Emilia Tri Setyowati - Cattle Dairy, Imrpoving Livelihood in Organic Dairy Product
16 pages
RAZ-D 001 Grow, Vegetables, Grow
100% (1)
RAZ-D 001 Grow, Vegetables, Grow
10 pages
Key Driver Sales in Mahindra Tractors
No ratings yet
Key Driver Sales in Mahindra Tractors
112 pages
Inoculation PTC
No ratings yet
Inoculation PTC
3 pages
gulayan sa paaralan
No ratings yet
gulayan sa paaralan
3 pages
2021 5.ICT in Our Lives (01) - 천재 (이재영) (프리미엄) 영어Ⅱ 2학기 기말 (31문제) (Q)
No ratings yet
2021 5.ICT in Our Lives (01) - 천재 (이재영) (프리미엄) 영어Ⅱ 2학기 기말 (31문제) (Q)
12 pages
Malta Arriva Bus X-Route Itinerary & Timetable
No ratings yet
Malta Arriva Bus X-Route Itinerary & Timetable
24 pages
2024 - 09 - 20 17 - 49 Office Lens
No ratings yet
2024 - 09 - 20 17 - 49 Office Lens
4 pages
Protected Cultivation of Spinach Report 1
No ratings yet
Protected Cultivation of Spinach Report 1
24 pages
Bellfeeds Farm Products
No ratings yet
Bellfeeds Farm Products
2 pages
Week 6-7 ULOa. - 0
No ratings yet
Week 6-7 ULOa. - 0
9 pages
John Stubbendick Resume
No ratings yet
John Stubbendick Resume
2 pages
Essentials of Statistics for the Behavioral Sciences 3rd Edition Nolan Solutions Manual instant download
100% (3)
Essentials of Statistics for the Behavioral Sciences 3rd Edition Nolan Solutions Manual instant download
41 pages
Jurnal Pupuk Dan Pemupukan EVITA
No ratings yet
Jurnal Pupuk Dan Pemupukan EVITA
19 pages
Drainage System in Pakistan
No ratings yet
Drainage System in Pakistan
90 pages

Stats-with-R-project

Uploaded by

Stats-with-R-project

Uploaded by

Analysis of Agricultural Crops in India

Vedant, Dev and Sanidhya

## The following objects are masked from ’package:stats’:

## The following objects are masked from ’package:base’:

## Registered S3 method overwritten by ’GGally’:

library(readxl)  # provides read_xlsx()
C <- read_xlsx("ICRISAT-District Level Data (1).xlsx")

## Dist Code Year State Code State Name

Boxplot for rice production

C%>%ggplot(aes(`RICE PRODUCTION (1000 tons)`))+

[Figure: Concentration of rice production in four years through boxplots]

## Warning: ‘summarise_each_()‘ was deprecated in dplyr 0.7.0.

## Warning: ‘funs()‘ was deprecated in dplyr 0.8.0.
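The two deprecation warnings suggest that the summary step called `summarise_each()` together with `funs()`; the call itself is not visible here, so that is an assumption. A minimal sketch of the current dplyr idiom, `across()`, on made-up data (the `toy` data frame and its column names are hypothetical and are not taken from the ICRISAT dataset):

```r
library(dplyr)

# Toy data (made-up values, not from the ICRISAT dataset).
toy <- data.frame(
  state = c("A", "A", "B", "B"),
  rice_area = c(10, 20, 30, 40),
  rice_production = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Deprecated style that raises the warnings:
#   toy %>% group_by(state) %>% summarise_each(funs(sum))
# Current equivalent: across() applies sum() to every non-grouping column.
totals <- toy %>%
  group_by(state) %>%
  summarise(across(everything(), sum), .groups = "drop")

print(totals)
```

Inside a grouped `summarise()`, `everything()` skips the grouping column, so only the numeric columns are summed.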

pie(H,labels = c("Bihar(61.34 %)","Gujarat(14.69%)","Haryana(25.67%)","MadhyaPradesh(38.45%)","Rajasthan

Scatter plot on barley yield

## Warning: ‘funs()‘ was deprecated in dplyr 0.8.0.

labs(title="BARLEY YIELD IN THE DISTRICTS OF RAJASTHAN")

Sugarcane production line graph

## Warning: ‘funs()‘ was deprecated in dplyr 0.8.0.

[Figure: Production of sugarcane over the years; x-axis ticks run from 2014 to 2017]

Stacked bar chart for cotton yield

# %in% tests membership; `==` would recycle the vector and trigger a warning in filter()
P=B%>%filter(`State Name` %in% c("Haryana","Gujarat","Madhya Pradesh"))
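When rows for several states are wanted, `==` against a vector of names compares position by position against the recycled vector, whereas `%in%` tests membership in the set. A minimal base-R sketch of the difference, using a hypothetical five-row `toy` data frame rather than the project's data:

```r
# Toy data (hypothetical values), five rows so that recycling is visible.
toy <- data.frame(
  state = c("Gujarat", "Haryana", "Gujarat", "Haryana", "Madhya Pradesh"),
  yield = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)
wanted <- c("Haryana", "Gujarat", "Madhya Pradesh")

# `==` lines the rows up against the recycled vector, so a row is kept
# only when its state happens to match the name at the same position.
by_equals <- toy[suppressWarnings(toy$state == wanted), ]

# `%in%` asks, row by row, whether the state belongs to the set.
by_in <- toy[toy$state %in% wanted, ]

nrow(by_equals)
nrow(by_in)
```

Every row of `toy` belongs to one of the three states, yet the `==` version silently drops most of them, which is why the membership test is the safer choice for this kind of filter.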

COTY=P%>%group_by(`State Name`,`Year`) %>%

## Warning: ‘funs()‘ was deprecated in dplyr 0.8.0.

## Gujarat Haryana Madhya Pradesh


C <- read.csv("agricultural produce.csv")  # importing dataset (read.csv already returns a data frame)

Scatter plot on Wheat Yield

C%>%ggplot(aes(Year, WHEAT.YIELD..Kg.per.ha., col = WHEAT.YIELD..Kg.per.ha. > 3000)) +

[Figure: Wheat Yield: 2014−2017]

Scatter plot on Correlation between Rice Production and Area

## ‘geom_smooth()‘ using formula = ’y ~ x’

[Figure: Correlation between Rice Production and Area; axis label: Rice Production (1000 tons)]
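The strength of a relationship like the one plotted between rice production and area can also be expressed as a number. A minimal sketch, assuming a straight-line fit (the `geom_smooth()` message only confirms the formula `y ~ x`, not the smoothing method) and using made-up values rather than the dataset:

```r
# Toy data (made-up values, not from the dataset).
area <- c(10, 20, 30, 40, 50)        # hypothetical area figures
production <- c(22, 41, 58, 83, 99)  # hypothetical production figures

r <- cor(area, production)           # Pearson correlation coefficient
fit <- lm(production ~ area)         # straight-line fit of y on x
slope <- unname(coef(fit)["area"])   # change in production per unit of area

r
slope
```

A value of `r` close to 1 indicates that production rises almost linearly with area, and `slope` gives the rate of that rise.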

Barplot on oilseed production in Rajasthan

[Figure: Top 5 Oilseed producing districts in Rajasthan (2015): Ganganagar, Kota, Bikaner, Jodhpur, Bharatpur]

Line graph for area under cotton

[Figure: Total area under cotton in India; x-axis ticks run from 2014 to 2017]

[Figure: Maize Yield Distribution Over Years; axis ticks run from 0 to 7500]

Analysing relations between different crop yields using pair plots

P1=B%>%filter(`State Name` %in% c("Madhya Pradesh","Uttar Pradesh"))

[Figure: Analysing relations between different crop yields; panels include WHEAT YIELD (Kg per ha) and SUGARCANE YIELD (Kg per ha)]

Maize Production pie chart

R1=(PRODUCTION[2]) # this data is extracted in the list format
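The comment about "list format" reflects how single-bracket indexing works in R: `[` keeps the container, while `[[` returns the element inside it. A minimal sketch, assuming `PRODUCTION` is a list-like object; `PRODUCTION_toy` below is a hypothetical stand-in with made-up values:

```r
# Hypothetical stand-in for PRODUCTION: a named list of production figures.
PRODUCTION_toy <- list(
  rice = c(5, 7),
  maize = c(2, 3, 4)
)

as_list <- PRODUCTION_toy[2]      # single bracket: still a list of length 1
as_vector <- PRODUCTION_toy[[2]]  # double bracket: the numeric vector itself

is.list(as_list)
is.numeric(as_vector)

# pie() expects a numeric vector, so a list element has to be unwrapped
# first, e.g. pie(as_vector) or pie(unlist(as_list)).
```

Either `[[` or `unlist()` yields a vector that plotting functions such as `pie()` accept.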

## MAIZE PRODUCTION (1000 tons)1 MAIZE PRODUCTION (1000 tons)2

Oilseeds boxplot (facet wrapped by states)

B%>%ggplot(aes(`OILSEEDS PRODUCTION (1000 tons)`))+ # data

Area under production for wheat


Histogram and frequency polygon

f2 <- filter(B, Year == "2017" & `State Name` == "Uttar Pradesh")


https://mospi.gov.in/4-agricultural-statistics
Beginning R: Statistical Programming Language by Dr. Mark
