Prac - 6
AIM:
Write a program to implement simple graphs using Matplotlib in Python.
THEORY:
In this practical, we covered the essential concepts of data visualization using Matplotlib, a Python
library widely used for creating visualizations. We demonstrated six fundamental types of plots: line
graphs, bar graphs, scatter plots, pie charts, histograms, and stacked bar charts.
The objective was to introduce the key concepts of data visualization and to provide hands-on
experience with Matplotlib, using each plot type to communicate patterns and distributions in data.
Key concepts:
1. Data Representation:
Importance of Graphical Representation: Data is often complex and difficult to
understand in its raw form. Graphical representation helps in presenting data in a visual
format, making it easier to analyze and draw insights.
Continuous vs. Discrete Data: Recognizing whether the data is continuous (measurable)
or discrete (countable) is crucial, as it determines the appropriate visualization technique.
2. Line Graphs:
Definition: A line graph is a type of chart that displays information as a series of data points
(markers) connected by straight line segments.
Usage: It is used to illustrate trends or changes in data over time or continuous data points.
Continuous Data Trends: Line graphs are effective for displaying trends and patterns in
continuous data. For instance, they are useful for showing stock price trends over time.
Plotting X and Y Values: Understanding how to plot x-values against corresponding
y-values is fundamental to creating a meaningful line graph.
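As a rough illustration of plotting x-values against y-values, the sketch below draws a single line graph. The sample data (days and prices) and the function name draw_line_graph are assumptions made for this example, not values from the practical.

```python
import matplotlib.pyplot as plt

# Illustrative sample data (assumed values, not taken from the practical)
days = [1, 2, 3, 4, 5]
prices = [101.5, 102.3, 101.8, 103.6, 104.1]


def draw_line_graph(x, y):
    """Plot y against x as a marked line and return the figure."""
    fig = plt.figure(figsize=(6, 4))
    # Each (x, y) pair becomes a marker; consecutive markers are joined by straight segments
    plt.plot(x, y, marker="o", color="b", label="Price")
    plt.xlabel("Day")
    plt.ylabel("Price")
    plt.title("Line Graph")
    plt.legend()
    return fig
```

Calling draw_line_graph(days, prices) followed by plt.show() displays the graph.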
3. Bar Graphs:
Definition: A bar graph is a chart that represents data with rectangular bars, where the length
of each bar is proportional to the value it represents.
Usage: It's used to compare values across different categories.
Visualizing Discrete Data: Bar graphs are excellent for comparing discrete categories of
data. They represent each category as a bar, making comparisons intuitive and easy to
interpret.
Handling Multiple Categories: Learning how to plot and compare multiple categories
enhances the ability to represent complex data.
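One common way to compare multiple categories is a grouped bar chart, where the bars of each series are shifted sideways so they sit next to each other. The sketch below assumes two hypothetical series and a bar width of 0.35; none of these values come from the practical.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical categories and two series to compare (assumed values)
categories = ["A", "B", "C"]
series_1 = [5, 7, 3]
series_2 = [6, 4, 8]


def draw_grouped_bars(labels, first, second, width=0.35):
    """Draw two series side by side for each category and return the figure."""
    fig = plt.figure(figsize=(6, 4))
    positions = np.arange(len(labels))  # one integer slot per category
    # Shift each series half a bar width left or right of the slot centre
    plt.bar(positions - width / 2, first, width, label="Series 1", color="r")
    plt.bar(positions + width / 2, second, width, label="Series 2", color="b")
    plt.xticks(positions, labels)  # show category names instead of slot numbers
    plt.xlabel("Category")
    plt.ylabel("Value")
    plt.title("Grouped Bar Graph")
    plt.legend()
    return fig
```

Calling draw_grouped_bars(categories, series_1, series_2) followed by plt.show() displays the chart.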
4. Scatter Plots:
Definition: A scatter plot shows individual data points as dots positioned by the values of
two variables, which reveals the relationship between those variables.
Usage: It's used to identify patterns, relationships, or correlations between the variables.
Identifying Patterns: Scatter plots are valuable for identifying patterns and relationships
between variables. They help in judging whether the two variables are correlated or
not.
Versatility: Scatter plots find applications in various scenarios, from scientific research to
business analytics, due to their versatility.
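To make the idea of correlation concrete, the sketch below draws a scatter plot and reports the Pearson correlation coefficient computed with np.corrcoef. The data (study hours and scores) and the function name draw_scatter are assumptions for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical paired observations (assumed values)
hours = np.array([1, 2, 3, 4, 5, 6])
scores = np.array([52, 55, 61, 64, 70, 75])


def draw_scatter(x, y):
    """Draw a scatter plot of y against x; return the figure and Pearson's r."""
    # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r for (x, y)
    r = np.corrcoef(x, y)[0, 1]
    fig = plt.figure(figsize=(6, 4))
    plt.scatter(x, y, color="g", label=f"r = {r:.2f}")
    plt.xlabel("Hours studied")
    plt.ylabel("Score")
    plt.title("Scatter Plot")
    plt.legend()
    return fig, r
```

A value of r close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little or no linear relationship.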
5. Pie Charts:
Definition: A pie chart is a circular statistical chart that is divided into slices to illustrate
numerical proportions.
Usage: It's used to display the relative sizes of parts that make up a whole.
Proportional Representation: Pie charts provide a clear and simple way to represent
proportions or percentages of a whole. This is crucial for visualizing contributions of
different categories to a total.
Labels and Sizes: Effective use of labels and sizes helps in conveying the relative
significance of each category.
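A minimal sketch of labels and sizes in a pie chart follows. The category names and sizes are made up for this example; autopct='%1.1f%%' prints each slice's share of the total to one decimal place.

```python
import matplotlib.pyplot as plt

# Hypothetical categories and their sizes (assumed values)
labels = ["Rent", "Food", "Travel", "Other"]
sizes = [40, 30, 20, 10]


def draw_pie(sizes, labels):
    """Draw a pie chart with percentage annotations and return the figure."""
    fig = plt.figure(figsize=(5, 5))
    # Slice angles are proportional to each size divided by the sum of all sizes
    plt.pie(sizes, labels=labels, autopct="%1.1f%%")
    plt.title("Pie Chart")
    return fig
```

Calling draw_pie(sizes, labels) followed by plt.show() displays the chart.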
6. Histograms:
Definition: A histogram is a graphical representation of the distribution of a dataset,
showing the frequency of data points falling within specified intervals (bins).
Usage: It's used to understand the underlying frequency distribution of continuous or
discrete data.
Frequency Distribution: Histograms are essential for displaying the frequency
distribution of a dataset, which is critical in understanding the underlying distribution
pattern.
Bin Sizes: Choosing appropriate bin sizes is essential for presenting the data effectively
without losing important details.
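The effect of the bin choice can be seen by plotting the same data twice. The sketch below assumes 1000 normally distributed samples and compares 5 bins with 30 bins; the bin counts and the function name draw_histograms are illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)              # fixed seed so the sample is reproducible
data = np.random.randn(1000)   # 1000 samples from the standard normal distribution


def draw_histograms(data, bin_counts=(5, 30)):
    """Plot the same data with different numbers of bins, side by side."""
    fig = plt.figure(figsize=(10, 4))
    for i, bins in enumerate(bin_counts, start=1):
        plt.subplot(1, len(bin_counts), i)
        # Fewer bins smooth the shape; more bins show detail but also more noise
        plt.hist(data, bins=bins, color="purple", alpha=0.7)
        plt.xlabel("Value")
        plt.ylabel("Frequency")
        plt.title(f"{bins} bins")
    plt.tight_layout()
    return fig
```

With too few bins the bell shape is coarse, and with too many bins individual bars become noisy, so a moderate value is usually a reasonable starting point.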
Algorithm:
1. Import the necessary libraries (matplotlib.pyplot and numpy).
2. Set the figure size for all plots using plt.figure(figsize=(16, 8)).
3. Define the sample data (x-values and y-values) that will be used for plotting.
4. Create individual subplots for each type of plot (line graph, bar graph, scatter plot, pie chart,
histogram, stacked bar chart) using plt.subplot.
5. Plot a line graph using plt.plot for the first subplot.
Specify x-values and y-values.
Customize the plot with labels, title, and legend.
6. Plot a bar graph using plt.bar for the second subplot.
Specify x-values and y-values.
Customize the plot with labels, title, and legend.
7. Plot a scatter plot using plt.scatter for the third subplot.
Specify x-values and y-values.
Customize the plot with labels, title, and legend.
8. Plot a pie chart using plt.pie for the fourth subplot.
Specify labels and sizes for the pie slices.
Customize the plot with a title.
9. Generate random data for a histogram using np.random.randn for the fifth subplot.
10. Plot a histogram using plt.hist for the fifth subplot.
Specify the data and the number of bins.
Customize the plot with labels, title, and alpha value for transparency.
11. Plot a stacked bar chart using plt.bar for the sixth subplot.
Specify x-labels, y-values for two groups, and use the bottom parameter for stacking.
Customize the plot with labels, title, and legend.
12. Adjust the layout using plt.tight_layout() for proper spacing between subplots.
13. Display the plots using plt.show().
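The steps above can be sketched as a single function. This is one possible arrangement, not necessarily the exact program used in the practical: the sample data, pie labels, stacked-bar groups, axis labels, and the function name build_figure are all assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt


def build_figure():
    """Build a 2x3 grid containing the six plot types and return the figure."""
    fig = plt.figure(figsize=(16, 8))

    # Sample data (assumed values)
    x_values = [1, 2, 3, 4, 5]
    y_values = [2, 4, 1, 5, 3]

    # Subplot 1: line graph
    plt.subplot(2, 3, 1)
    plt.plot(x_values, y_values, label="Line Graph", marker="o", color="b")
    plt.xlabel("X")
    plt.ylabel("Y")
    plt.title("Line Graph")
    plt.legend()

    # Subplot 2: bar graph
    plt.subplot(2, 3, 2)
    plt.bar(x_values, y_values, label="Bar Graph", color="r")
    plt.xlabel("X")
    plt.ylabel("Y")
    plt.title("Bar Graph")
    plt.legend()

    # Subplot 3: scatter plot
    plt.subplot(2, 3, 3)
    plt.scatter(x_values, y_values, label="Scatter Plot", color="g")
    plt.xlabel("X")
    plt.ylabel("Y")
    plt.title("Scatter Plot")
    plt.legend()

    # Subplot 4: pie chart (labels and sizes are assumed)
    labels = ["A", "B", "C", "D"]
    sizes = [15, 30, 45, 10]
    plt.subplot(2, 3, 4)
    plt.pie(sizes, labels=labels, autopct="%1.1f%%")
    plt.title("Pie Chart")

    # Subplot 5: histogram of random data
    np.random.seed(0)
    data = np.random.randn(1000)
    plt.subplot(2, 3, 5)
    plt.hist(data, bins=30, color="purple", alpha=0.7)
    plt.xlabel("Value")
    plt.ylabel("Frequency")
    plt.title("Histogram")

    # Subplot 6: stacked bar chart (groups are assumed)
    x_labels = ["Q1", "Q2", "Q3"]
    group_1 = [3, 5, 2]
    group_2 = [4, 1, 6]
    plt.subplot(2, 3, 6)
    plt.bar(x_labels, group_1, label="Group 1")
    # bottom=group_1 starts each Group 2 bar where the Group 1 bar ends
    plt.bar(x_labels, group_2, bottom=group_1, label="Group 2")
    plt.xlabel("Category")
    plt.ylabel("Value")
    plt.title("Stacked Bar Chart")
    plt.legend()

    plt.tight_layout()
    return fig
```

Calling build_figure() followed by plt.show() displays all six plots in one window.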
Code Explanation:
1. Importing Libraries:
matplotlib.pyplot and numpy are imported for plotting and numerical operations, respectively.
2. Figure Size:
plt.figure(figsize=(16, 8)) creates a figure 16 inches wide and 8 inches tall that holds all the
subplots.
3. Sample Data:
x_values and y_values are defined to represent data points for plotting.
4. Line Graph:
plt.subplot(2, 3, 1) creates a subplot grid with 2 rows, 3 columns, and selects the 1st
position.
plt.plot(x_values, y_values, label='Line Graph', marker='o', color='b') plots a line graph
with specified attributes.
plt.xlabel(), plt.ylabel(), plt.title(), and plt.legend() add labels, a title, and a legend for
clarity.
5. Bar Graph:
plt.subplot(2, 3, 2) selects the 2nd position in the subplot grid.
plt.bar(x_values, y_values, label='Bar Graph', color='r') creates a bar graph with specified
attributes.
Relevant labeling and title assignments are made.
6. Scatter Plot:
plt.subplot(2, 3, 3) selects the 3rd position in the subplot grid.
plt.scatter(x_values, y_values, label='Scatter Plot', color='g') plots a scatter plot.
Relevant labeling and title assignments are made.
7. Pie Chart:
plt.subplot(2, 3, 4) selects the 4th position in the subplot grid.
plt.pie(sizes, labels=labels, autopct='%1.1f%%') creates a pie chart with specified
attributes.
A title is added for clarity.
8. Histogram:
np.random.seed(0) seeds the random number generator for reproducibility.
data = np.random.randn(1000) generates 1000 samples from the standard normal distribution.
plt.subplot(2, 3, 5) selects the 5th position in the subplot grid.
plt.hist(data, bins=30, color='purple', alpha=0.7) plots a histogram with specified attributes.
The parameters of functions such as plt.subplot(), plt.plot(), plt.bar(), plt.pie(), and plt.hist() are used to
customize the appearance and behavior of the plots. For example, color sets the color, label sets the
legend text, marker sets the point marker, autopct sets the percentage format on pie slices, bins sets
the number of bins, and alpha sets the transparency. Understanding and using these parameters allows
for flexible and detailed customization of the plots.
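The plt.subplot(2, 3, n) calls described above number the grid positions from 1, left to right and then top to bottom. The sketch below labels each cell with its index so the mapping is visible; the function name label_subplot_grid and the figure size are assumptions for this example.

```python
import matplotlib.pyplot as plt


def label_subplot_grid(nrows=2, ncols=3):
    """Create an nrows x ncols grid and write each subplot's index inside it."""
    fig = plt.figure(figsize=(9, 5))
    positions = {}
    for index in range(1, nrows * ncols + 1):
        # Indices count across a row first, then move down to the next row
        row, col = divmod(index - 1, ncols)
        plt.subplot(nrows, ncols, index)
        plt.text(0.5, 0.5, f"index {index}\nrow {row}, col {col}",
                 ha="center", va="center")
        plt.xticks([])  # hide tick marks so only the label is shown
        plt.yticks([])
        positions[index] = (row, col)
    return fig, positions
```

For a 2 by 3 grid, index 4 is therefore the first cell of the second row, which is where the pie chart is placed.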