Python Unit 4&5 Que
1. Write Python code to create a DataFrame from a dictionary, and explain how you can access its columns and rows.
import pandas as pd
# Sample dictionary
data = {
'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Age': [25, 30, 35, 40],
'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}
# Creating DataFrame
df = pd.DataFrame(data)
Accessing Columns:
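Columns in a DataFrame can be accessed using their labels. There are two primary methods: bracket notation (df['Name']), which works for any column label, and dot notation (df.Age), which works only when the label is a valid Python identifier that does not clash with an existing DataFrame attribute.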
Example:
# Accessing columns
print("Accessing columns:")
print(df['Name']) # Using bracket notation
print(df.Age) # Using dot notation (for valid identifiers)
Accessing Rows:
Rows can be accessed by integer position with iloc[] or by index label with loc[].
Example:
# Accessing rows
print("\nAccessing rows:")
print(df.iloc[0]) # Accessing the first row
print(df.loc[1]) # Accessing the row with index label 1 (the second row under the default index)
These are the fundamentals of creating a Pandas DataFrame and accessing its columns and rows. The same techniques can be adapted to specific requirements such as custom indexes or missing data.
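As a minimal sketch of the custom-index case mentioned above, the following rebuilds the same DataFrame and uses the Name column as the index (an illustrative assumption, not something the original example does). It contrasts label-based loc[] with position-based iloc[]:

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}

# Use 'Name' as a custom index (assumption made for this illustration)
df_named = pd.DataFrame(data).set_index('Name')

bob_by_label = df_named.loc['Bob']        # row selected by index label
bob_by_position = df_named.iloc[1]        # same row selected by integer position
age_and_city = df_named[['Age', 'City']]  # several columns at once (a DataFrame)
bob_age = df_named.loc['Bob', 'Age']      # single value: row label, column label

print(bob_by_label)
print(age_and_city)
print(bob_age)
```

With a custom index, loc[] takes the labels ('Bob') while iloc[] still takes positions (1), so the two are no longer interchangeable as they appear to be under the default 0..n-1 index.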
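The sample covariance between variables $X$ and $Y$ is:
$$\operatorname{cov}(X, Y) = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$
where $n$ is the number of data points, $X_i$ and $Y_i$ are individual data points, and $\bar{X}$ and $\bar{Y}$ are the means of $X$ and $Y$, respectively. (The population covariance divides by $n$ instead of $n - 1$.)
The formula for the Pearson correlation coefficient $\rho$ between variables $X$ and $Y$ is:
$$\rho = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y}$$
where $\sigma_X$ and $\sigma_Y$ are the standard deviations of $X$ and $Y$, respectively.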
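The two formulas can be checked numerically. The sketch below uses small made-up Series (hypothetical values) and compares a by-hand computation with pandas' built-in Series.cov() and Series.corr(), which use the sample (n - 1) convention and Pearson correlation by default:

```python
import pandas as pd

# Hypothetical sample data
x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
y = pd.Series([2.0, 4.0, 5.0, 4.0, 5.0])

n = len(x)

# Sample covariance: sum of products of deviations, divided by n - 1
cov_manual = ((x - x.mean()) * (y - y.mean())).sum() / (n - 1)

# Pearson correlation: covariance divided by the product of standard deviations
# (Series.std() also uses n - 1 by default, so the conventions match)
rho_manual = cov_manual / (x.std() * y.std())

cov_pandas = x.cov(y)
rho_pandas = x.corr(y)

print(cov_manual, cov_pandas)
print(rho_manual, rho_pandas)
```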
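In Pandas, there are several ways to find unique values in a DataFrame or Series:
1. Series.unique(): This method returns the unique values of a Series as a NumPy array, in order of first appearance.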
import pandas as pd
series = pd.Series([1, 2, 2, 3, 3, 3])
unique_values = series.unique()
print(unique_values) # Output: [1 2 3]
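2. DataFrame.drop_duplicates(): A DataFrame has no unique() method. To get its unique rows, use drop_duplicates(), which keeps the first occurrence of each row.
import pandas as pd
data = {'A': [1, 2, 2, 3], 'B': [4, 5, 5, 6]}
df = pd.DataFrame(data)
unique_rows = df.drop_duplicates()
print(unique_rows)
# Output:
#    A  B
# 0  1  4
# 1  2  5
# 3  3  6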
3. Series.value_counts(): This method provides a count of unique values in a Series,
sorted in descending order.
import pandas as pd
series = pd.Series([1, 2, 2, 3, 3, 3])
value_counts = series.value_counts()
print(value_counts)
# Output:
# 3 3
# 2 2
# 1 1
# dtype: int64
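4. value_counts() on a DataFrame column: Selecting a single column returns a Series, so value_counts() can be called on it directly.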
import pandas as pd
data = {'A': [1, 2, 2, 3], 'B': [4, 5, 5, 6]}
df = pd.DataFrame(data)
value_counts = df['A'].value_counts()
print(value_counts)
# Output:
# 2 2
# 3 1
# 1 1
# Name: A, dtype: int64
These methods are useful for exploring and understanding the distribution of data in Pandas
DataFrames and Series, providing insights into unique values and their frequencies.
3.Discuss the methods available for sorting and ranking data in Pandas. Provide examples
to demonstrate how to sort Data Frame columns and rank the data.
In Pandas, sorting and ranking are common operations that let you order the rows of a DataFrame, or assign positions to its values, based on specific criteria.
Sorting in Pandas can be performed on both rows and columns based on index labels or
column values.
import pandas as pd
# Sample DataFrame
data = {
'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Age': [25, 30, 35, 40],
'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}
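df = pd.DataFrame(data)
# Sorting by 'Age' column in ascending order (the default)
sorted_df = df.sort_values(by='Age')
print("Sorted by Age (ascending):")
print(sorted_df)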
Output:
Sorted by Age (ascending):
Name Age City
0 Alice 25 New York
1 Bob 30 Los Angeles
2 Charlie 35 Chicago
3 David 40 Houston
You can sort in descending order by setting the ascending parameter to False:
# Sorting by 'Age' column in descending order
sorted_df_desc = df.sort_values(by='Age', ascending=False)
print("\nSorted by Age (descending):")
print(sorted_df_desc)
Output:
Sorted by Age (descending):
Name Age City
3 David 40 Houston
2 Charlie 35 Chicago
1 Bob 30 Los Angeles
0 Alice 25 New York
To sort a DataFrame by its index labels, you can use the sort_index() method:
# Sorting by index labels in descending order
sorted_by_index = df.sort_index(ascending=False)
print("\nSorted by Index (descending):")
print(sorted_by_index)
Output:
Sorted by Index (descending):
Name Age City
3 David 40 Houston
2 Charlie 35 Chicago
1 Bob 30 Los Angeles
0 Alice 25 New York
Ranking assigns ranks to data elements in a DataFrame based on certain criteria, such as
ascending or descending order.
Ranking Values:
The rank() method computes numerical data ranks (1 through n) along a specified axis. It
handles ties by assigning the average rank:
# Ranking by 'Age'
df['Rank_Age'] = df['Age'].rank()
print("\nRanking by Age:")
print(df)
Output:
Ranking by Age:
Name Age City Rank_Age
0 Alice 25 New York 1.0
1 Bob 30 Los Angeles 2.0
2 Charlie 35 Chicago 3.0
3 David 40 Houston 4.0
You can customize the ranking behavior using parameters like ascending, method, and
na_option. For example, to rank in descending order:
# Ranking by 'Age' in descending order
df['Rank_Age_desc'] = df['Age'].rank(ascending=False)
print("\nRanking by Age (descending):")
print(df)
Output:
Ranking by Age (descending):
Name Age City Rank_Age Rank_Age_desc
0 Alice 25 New York 1.0 4.0
1 Bob 30 Los Angeles 2.0 3.0
2 Charlie 35 Chicago 3.0 2.0
3 David 40 Houston 4.0 1.0
In this example, Rank_Age_desc assigns rank 1 to the highest age, so higher ages receive numerically lower ranks.
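The ages above contain no ties, so the tie-handling behavior of rank() is not visible. The sketch below uses a small made-up Series (hypothetical scores) with one tie to show how the method parameter changes the result:

```python
import pandas as pd

# Hypothetical scores with a tie between the second and third values
scores = pd.Series([50, 70, 70, 90])

average_rank = scores.rank()                # default: tied values share the average rank
min_rank = scores.rank(method='min')        # tied values share the lowest rank
dense_rank = scores.rank(method='dense')    # like 'min', but no gaps after ties

print(average_rank.tolist())  # [1.0, 2.5, 2.5, 4.0]
print(min_rank.tolist())      # [1.0, 2.0, 2.0, 4.0]
print(dense_rank.tolist())    # [1.0, 2.0, 2.0, 3.0]
```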
Conclusion:
Sorting and ranking are fundamental operations in data analysis with Pandas. Sorting allows
you to arrange data based on column values or index labels, while ranking assigns numerical
ranks based on specified criteria. These operations are versatile and essential for various data
manipulation and analysis tasks in Python.
4.Explain how you can filter out missing data in a Data Frame. Provide a code example
demonstrating how to remove rows with missing values in Pandas.
Filtering out missing data in a Pandas DataFrame involves identifying and then removing or
replacing rows or columns that contain NaN (Not a Number) or None values. Here’s a step-
by-step explanation along with a code example demonstrating how to remove rows with
missing values in Pandas.
import pandas as pd
import numpy as np
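# Sample data with missing values
data = {
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, 7, 8],
    'C': ['x', np.nan, 'z', 'w']
}
df = pd.DataFrame(data)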
print("Original DataFrame:")
print(df)
Output:
Original DataFrame:
A B C
0 1.0 5.0 x
1 2.0 NaN NaN
2 NaN 7.0 z
3 4.0 8.0 w
To remove rows with any NaN values, you can use the dropna() method:
# Drop rows with any NaN values
cleaned_df = df.dropna()
print("\nDataFrame after dropping rows with NaN values:")
print(cleaned_df)
Output:
DataFrame after dropping rows with NaN values:
A B C
0 1.0 5.0 x
3 4.0 8.0 w
In this example:
The original DataFrame df contains rows with NaN values in columns A, B, and C.
dropna() removes rows where any NaN value is present by default (axis=0).
cleaned_df is the resulting DataFrame after removing rows with missing values.
The dropna() method accepts several parameters:
axis: Specifies whether to drop rows (axis=0) or columns (axis=1) with missing values.
how: Determines when a row or column is dropped. Options include:
o how='any' (default): Drops rows/columns if any NaN values are present.
o how='all': Drops rows/columns only if all values are NaN.
subset: Limits the check to the given labels on the other axis, for example a list of column names to inspect when dropping rows.
For example, to drop rows where all values are NaN (how='all'):
# Drop rows where all values are NaN
cleaned_df_all = df.dropna(how='all')
print("\nDataFrame after dropping rows where all values are NaN:")
print(cleaned_df_all)
Output:
DataFrame after dropping rows where all values are NaN:
A B C
0 1.0 5.0 x
1 2.0 NaN NaN
2 NaN 7.0 z
3 4.0 8.0 w
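The axis and subset parameters described earlier can be sketched in the same way. The DataFrame below repeats the sample values and adds a complete column D, which is an assumption made only so that the column-wise case has something left to keep:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, 7, 8],
    'C': ['x', np.nan, 'z', 'w'],
    'D': [10, 20, 30, 40],   # hypothetical column with no missing values
})

# Drop rows only when column 'A' is missing (row 2 is removed, row 1 is kept)
rows_with_a = df.dropna(subset=['A'])

# Drop every column that contains at least one NaN (only 'D' survives)
complete_columns = df.dropna(axis=1)

print(rows_with_a)
print(complete_columns)
```

dropna() returns a new DataFrame in both cases; the original df is left unchanged.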
Conclusion
Filtering out missing data in Pandas involves using dropna() to remove rows or columns
containing NaN values based on specified criteria. Handling missing data is essential for data
cleaning and preprocessing tasks to ensure the quality and reliability of subsequent data
analysis or machine learning models. Adjust the parameters of dropna() based on your
specific data cleaning requirements and the nature of missing data in your dataset.
Hierarchical indexing, also known as MultiIndexing, in Pandas allows you to have multiple
index levels on an axis (typically rows), providing a way to represent higher-dimensional data
in a familiar tabular structure. This is particularly useful when you have data that naturally
fits into multiple categories or subcategories.
Let's illustrate how to create a DataFrame with a hierarchical index and then access data
using this index.
import pandas as pd
df = pd.DataFrame(data)
Output:
DataFrame with Hierarchical Index:
Population
City Year
New York 2020 8623000
2021 8700000
Los Angeles 2020 3979000
2021 3988000
Chicago 2020 2693000
2021 2705000
In this example:
The DataFrame df has been created with a hierarchical index consisting of two
levels: 'City' and 'Year'.
The set_index() method is used to set these columns as the index, creating a
MultiIndex.
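The construction code is abbreviated above, so here is a minimal self-contained sketch of the steps just described. The column values are taken from the printed output; the variable name chicago is hypothetical:

```python
import pandas as pd

data = {
    'City': ['New York', 'New York', 'Los Angeles', 'Los Angeles', 'Chicago', 'Chicago'],
    'Year': [2020, 2021, 2020, 2021, 2020, 2021],
    'Population': [8623000, 8700000, 3979000, 3988000, 2693000, 2705000]
}

# Build a flat DataFrame, then promote 'City' and 'Year' to a two-level index
df = pd.DataFrame(data).set_index(['City', 'Year'])

print("DataFrame with Hierarchical Index:")
print(df)

# Selecting on the first level only returns all years for that city
chicago = df.loc['Chicago']
print(chicago)
```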
You can access data through the hierarchical index with loc[], passing the values for each level of the index as a tuple:
# Accessing data using hierarchical index
print("\nAccessing data for 'New York' in 2020:")
print(df.loc[('New York', 2020)])
Output:
Accessing data for 'New York' in 2020:
Population 8623000
Name: (New York, 2020), dtype: int64
Hierarchical indexing in Pandas is powerful for handling complex data structures and
facilitating efficient data manipulation and analysis, especially in multi-dimensional datasets.
Adjust the operations based on your specific data analysis needs and the structure of your
hierarchical data.
6. What are the advantages of the Seaborn package? Explain with real-time examples.
Seaborn provides a high-level interface for drawing attractive and informative statistical
graphics. It simplifies the process of creating complex visualizations that are commonly used
in exploratory data analysis and statistical modeling.
Example: Suppose you have a dataset with information about car sales across different
regions. Using Seaborn, you can quickly create a categorical plot to show how sales vary
across regions:
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
# Example data
data = pd.DataFrame({
'Region': ['North', 'South', 'East', 'West'],
'Sales': [350, 240, 400, 280]
})
# Plotting with Seaborn
sns.barplot(x='Region', y='Sales', data=data)
plt.title('Car Sales by Region')
plt.xlabel('Region')
plt.ylabel('Sales')
plt.show()
Seaborn has built-in themes and color palettes that make visualizations more visually
appealing and easier to interpret. It also includes several statistical plotting routines that show
data distributions and relationships.
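Example: Using Seaborn's histplot (the replacement for distplot, which has been deprecated since seaborn 0.11), you can visualize the distribution of a dataset along with a kernel density estimate (KDE):
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
# Example data
data = np.random.normal(loc=0, scale=1, size=1000)
# Histogram with a KDE curve overlaid
sns.histplot(data, kde=True)
plt.title('Distribution with KDE')
plt.show()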
Seaborn works seamlessly with pandas DataFrames and Series. It can directly use column
names from pandas objects to visualize data, making it convenient for working with datasets
loaded into pandas.
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
# Example data
data = pd.DataFrame({
'X': [1, 2, 3, 4, 5],
'Y': [3, 5, 4, 6, 2]
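})
# Column names are passed directly; a scatter plot is one illustrative choice
sns.scatterplot(x='X', y='Y', data=data)
plt.show()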
Seaborn offers a wide range of plot types and customization options. It supports complex
plots such as multi-plot grids, conditional plots, and regression plots out-of-the-box, allowing
for deeper insights into data relationships.
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
# Example data
data = pd.DataFrame({
'A': [1, 2, 3, 4, 5],
'B': [3, 5, 4, 6, 2],
'C': [2, 4, 3, 5, 1]
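})
# A pair plot (a multi-plot grid) is one illustrative choice for this data
sns.pairplot(data)
plt.show()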
Seaborn includes functions that automatically aggregate and summarize data for
visualization, such as box plots, violin plots, and bar plots. These plots reveal patterns and
distributions in the data that can be crucial for exploratory analysis and understanding
statistical relationships.
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
# Example data
data = pd.DataFrame({
'Category': ['A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
'Value': [10, 15, 20, 18, 25, 12, 30, 22]
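})
# Box plot summarizing Value for each Category (one illustrative choice)
sns.boxplot(x='Category', y='Value', data=data)
plt.show()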
Conclusion
Seaborn is a valuable tool for data visualization in Python due to its high-level interface,
attractive default styles, integration with pandas, and wide range of plot types. It simplifies
the creation of complex visualizations and facilitates deeper insights into data patterns and
relationships, making it popular among data scientists and analysts.
7.Explain how to create bar charts using the matplotlib package. Develop a sample code
to create a bar chart showing the frequency of different categories in a Data Frame
column.
Creating bar charts using the matplotlib package in Python is straightforward and can be
done in a few steps. Here's a guide along with a sample code to create a bar chart showing the
frequency of different categories in a DataFrame column using matplotlib.
Step-by-Step Guide:
1. Import Required Libraries: First, you need to import the necessary libraries.
matplotlib is used for plotting, and pandas is often used to work with
DataFrame structures.
import pandas as pd
import matplotlib.pyplot as plt
2. Prepare Data: You'll need a DataFrame with the data you want to plot. For example,
let's say you have a DataFrame df with a column Category that contains different
categories.
# Example DataFrame
data = {'Category': ['A', 'B', 'C', 'A', 'A', 'B', 'B',
'C', 'A', 'C']}
df = pd.DataFrame(data)
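3. Count Frequencies: Use the value_counts() method to count how many times each category appears in the Category column.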
category_counts = df['Category'].value_counts()
This will give you a pandas Series where index represents the unique categories and
values represent their frequencies.
4. Plotting the Bar Chart: Use matplotlib to create a bar chart based on the
frequencies obtained.
# Plotting
plt.figure(figsize=(8, 6)) # Specify figure size first
# Plot bars
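category_counts.plot(kind='bar', color='skyblue')
# Add title and axis labels
plt.title('Frequency of Categories')
plt.xlabel('Category')
plt.ylabel('Frequency')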
# Show plot
plt.show()
Example Output:
For the provided example data, the bar chart will show three bars representing the frequencies of categories A, B, and C (4, 3, and 3 occurrences, respectively).
This approach allows you to quickly visualize the distribution of categorical data using bar
charts in Python with matplotlib. Adjustments such as color, labels, and figure size can
be customized based on your specific requirements.
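The steps above draw the chart through the pandas .plot() wrapper. As an alternative sketch, the same chart can be drawn by calling matplotlib's bar() directly. The use of sort_index() to order the bars alphabetically is an assumption made for this illustration:

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'B', 'C', 'A', 'A', 'B', 'B', 'C', 'A', 'C']})

# Count each category, then order the bars alphabetically (A, B, C)
category_counts = df['Category'].value_counts().sort_index()

fig, ax = plt.subplots(figsize=(8, 6))
bars = ax.bar(category_counts.index, category_counts.values, color='skyblue')
ax.set_title('Frequency of Categories')
ax.set_xlabel('Category')
ax.set_ylabel('Frequency')
plt.show()
```

Working with the Axes object (ax) rather than the pyplot state machine makes it easier to place several charts in one figure later.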
Here's how you can create scatter plots and histograms using the matplotlib library in Python:
1. Scatter Plot
A scatter plot is used to visualize the relationship between two variables by displaying data
points as dots. Here’s a simple example:
import matplotlib.pyplot as plt
import numpy as np
The main arguments of plt.scatter() are:
x and y are arrays containing the x and y coordinates of the data points.
c specifies the color of each point. Here, colors is a sequence of numbers mapped
to colors using the 'viridis' colormap.
s specifies the size of each point. Larger values in sizes create larger points.
alpha controls the transparency of the points (0 for fully transparent, 1 for fully
opaque).
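The scatter-plot code is cut off after the imports above. A minimal sketch consistent with the parameter descriptions follows. The data is randomly generated and purely hypothetical, and the names rng and points are illustrative:

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical random data (seeded so the plot is reproducible)
rng = np.random.default_rng(seed=0)
x = rng.random(50)
y = rng.random(50)
colors = rng.random(50)          # numbers mapped to colors by the colormap
sizes = 300 * rng.random(50)     # marker areas in points^2

fig, ax = plt.subplots(figsize=(8, 6))
points = ax.scatter(x, y, c=colors, s=sizes, alpha=0.5, cmap='viridis')
fig.colorbar(points, ax=ax, label='Color value')
ax.set_title('Scatter Plot')
ax.set_xlabel('X')
ax.set_ylabel('Y')
plt.show()
```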
2. Histogram
A histogram represents the distribution of a continuous variable by dividing the data into bins
and displaying the frequency of observations in each bin.
import matplotlib.pyplot as plt
import numpy as np
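The histogram code is likewise cut off after the imports. A minimal sketch of the binning described above follows, assuming normally distributed sample data and 30 bins (both illustrative choices):

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sample: 1000 draws from a standard normal distribution
rng = np.random.default_rng(seed=0)
data = rng.normal(loc=0, scale=1, size=1000)

fig, ax = plt.subplots(figsize=(8, 6))
# hist() returns the count in each bin, the bin edges, and the drawn bars
counts, bin_edges, patches = ax.hist(data, bins=30, color='skyblue', edgecolor='black')
ax.set_title('Histogram')
ax.set_xlabel('Value')
ax.set_ylabel('Frequency')
plt.show()
```

With bins=30 there are 31 bin edges, and the counts add up to the number of observations.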
These examples demonstrate basic usage of matplotlib for creating scatter plots and
histograms. You can further customize these plots by adjusting colors, sizes, labels, titles, and
other properties to suit your specific data and visualization needs.
9.How can you customize colors, markers, and line styles in matplotlib plots? Provide
examples showing the use of different styles for a line chart.
In matplotlib, you can customize colors, markers, and line styles to enhance the
appearance of your plots. Here’s how you can do it with examples for each customization:
Customizing Colors:
You can specify colors in several ways in matplotlib. Here are some common methods:
1. Specifying Named Colors: You can use named colors such as 'blue', 'red',
'green', etc.
2. Using Hexadecimal Colors: Specify colors using hexadecimal strings like
'#FF5733'.
3. Using RGB or RGBA tuples: Provide colors using tuples specifying RGB or RGBA
values, e.g., (0.1, 0.2, 0.5) or (0.1, 0.2, 0.5, 0.3).
4. Using HTML color names: Some colors can also be specified using their
HTML/CSS names like 'skyblue', 'gold', etc.
Customizing Markers:
Markers are used to highlight individual data points on a plot. You can customize them with
different shapes and sizes.
1. Marker Styles: Use markers like 'o' (circle), 's' (square), '+' (plus sign), etc.
2. Marker Size: Adjust marker size using the markersize parameter.
3. Marker Color: Set marker color using the markerfacecolor parameter.
Customizing Line Styles:
Line styles control the appearance of the lines connecting data points.
1. Line Styles: Use styles like '-' (solid line), '--' (dashed line), ':' (dotted line),
etc.
2. Line Width: Adjust the width of the line using the linewidth parameter.
Example Code:
Here’s an example that demonstrates how to customize colors, markers, and line styles in a
line chart using matplotlib:
import matplotlib.pyplot as plt
import numpy as np
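Only the imports of the example survive above. The following is a minimal sketch of a line chart with two differently styled lines, using the parameters listed earlier. The sine and cosine data and the specific color, marker, and width values are illustrative assumptions:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 20)

fig, ax = plt.subplots(figsize=(8, 6))

# Line 1: named color, solid line, circle markers with a gold fill
(sine_line,) = ax.plot(x, np.sin(x), color='blue', linestyle='-', linewidth=2,
                       marker='o', markersize=6, markerfacecolor='gold',
                       label='sin(x)')

# Line 2: hexadecimal color, dashed line, square markers
(cosine_line,) = ax.plot(x, np.cos(x), color='#FF5733', linestyle='--',
                         linewidth=1.5, marker='s', markersize=5,
                         label='cos(x)')

ax.set_title('Customized Line Chart')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.legend()
plt.show()
```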
Explanation:
Result:
The resulting plot will display two lines with different styles and markers, each customized
with a different color. You can further customize the plot by adjusting parameters such as line
width (linewidth), marker size (markersize), and other styling options as needed.
Handling multiple exceptions in Python allows you to gracefully manage different types of
errors that may occur during the execution of your code. This is achieved using multiple
except blocks or by catching a tuple of exceptions in a single except block. Here's a
detailed explanation along with an example to illustrate how to catch different types of
exceptions:
You can use separate except blocks for each type of exception you want to handle. This
approach is useful when you need to perform different actions based on the type of exception
raised.
my_dict = {}
try:
    # Code that may raise exceptions; execution stops at the first one raised
    x = 10 / 0          # Raises ZeroDivisionError
    y = int('abc')      # Would raise ValueError if reached
    z = my_dict['key']  # Would raise KeyError if reached
except ZeroDivisionError:
    print("Error: Division by zero")
except ValueError:
    print("Error: Invalid conversion to integer")
except KeyError:
    print("Error: Key not found in dictionary")
except Exception as e:
    print("Error:", e)  # Catch any other exceptions
Explanation:
try Block: The code that potentially raises exceptions is placed inside the try
block.
except Blocks: Each except block handles a specific type of exception. In the
example:
o ZeroDivisionError: Handles division by zero errors.
o ValueError: Handles errors when converting a value to an integer fails.
o KeyError: Handles errors when accessing a dictionary key that doesn't exist.
o Exception as e: This catches any other exceptions that were not
explicitly handled by the previous except blocks. The exception instance is
stored in e, allowing you to print or handle it as needed.
Alternatively, you can catch multiple exceptions using a tuple in a single except block.
This is concise and useful when you want to perform the same action for several related
exceptions.
my_dict = {}
try:
    # Code that may raise exceptions; execution stops at the first one raised
    x = 10 / 0          # Raises ZeroDivisionError
    y = int('abc')      # Would raise ValueError if reached
    z = my_dict['key']  # Would raise KeyError if reached
except (ZeroDivisionError, ValueError, KeyError) as e:
    print("Error:", e)  # Handle all specified exceptions in one block
except Exception as e:
    print("Error:", e)  # Catch any other exceptions
Explanation:
Tuple in except Block: You can list multiple exception types within parentheses. If
any of these exceptions occur within the try block, the corresponding except
block will execute.
Common Exception (Exception): It's often a good practice to include a generic
except block (Exception as e) at the end to catch any unexpected exceptions
that were not explicitly handled. This helps in debugging and understanding
unforeseen errors during development or execution.
Example:
Here’s a complete example demonstrating how to handle multiple exceptions using separate except blocks:
try:
    x = 10 / 0  # ZeroDivisionError
    y = int('abc')  # ValueError
    my_dict = {'key': 'value'}
    z = my_dict['nonexistent_key']  # KeyError
except ZeroDivisionError:
    print("Error: Division by zero")
except ValueError:
    print("Error: Invalid conversion to integer")
except KeyError:
    print("Error: Key not found in dictionary")
except Exception as e:
    print("Error:", e)
Output:
Error: Division by zero
This is because the ZeroDivisionError occurred first, so the corresponding except block was executed and the remaining lines of the try block were skipped. If you comment out the x = 10 / 0 line, int('abc') runs next and you will see:
Error: Invalid conversion to integer
This demonstrates how Python sequentially checks each except block until it finds a
matching exception type or falls back to a more general Exception handler. Adjust the
code and exception types as per your specific requirements and the types of exceptions you
anticipate handling in your application.
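Because a try block stops at the first exception, each failure has to be triggered separately to see every handler run. The sketch below wraps the same handlers in a helper function. The name describe_failure and the lambda test cases are hypothetical, introduced only for this illustration:

```python
def describe_failure(operation):
    """Run a zero-argument callable and report which handler caught its error."""
    try:
        operation()
    except ZeroDivisionError:
        return "Error: Division by zero"
    except ValueError:
        return "Error: Invalid conversion to integer"
    except KeyError:
        return "Error: Key not found in dictionary"
    except Exception as e:
        # Fallback for anything not handled above (here: IndexError)
        return f"Error: {e}"
    return "No error"


results = [
    describe_failure(lambda: 10 / 0),
    describe_failure(lambda: int('abc')),
    describe_failure(lambda: {}['key']),
    describe_failure(lambda: [][0]),
    describe_failure(lambda: 1 + 1),
]

for line in results:
    print(line)
```

The fourth call raises an IndexError, which none of the specific handlers match, so it falls through to the generic Exception handler.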
12.Explain the methods finditer(), match(), and search() in the re package. Provide
examples of how each method can be used to find patterns in a given string.
In Python's re (regular expression) package, there are several methods for searching patterns
in strings. Here’s an explanation of three important methods: finditer(), match(), and
search(), along with examples demonstrating their usage.
1. finditer()
The finditer() method returns an iterator yielding match objects for all non-
overlapping matches of the regular expression pattern in the string. It's useful when you want
to iterate over all occurrences of a pattern in a string.
Syntax:
re.finditer(pattern, string, flags=0)
Example:
import re
pattern = r'Python'
Output:
Found 'Python' at index 0
Found 'Python' at index 28
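The example above is cut off after the pattern definition. A minimal self-contained sketch of the finditer() loop follows. The sample sentence is hypothetical, chosen only so that the pattern occurs twice:

```python
import re

pattern = r'Python'
# Hypothetical sample text containing the pattern twice
text = "Python is easy to learn and Python is fun to use."

# Each item yielded by finditer() is a match object
for match in re.finditer(pattern, text):
    print(f"Found '{match.group()}' at index {match.start()}")

# The same iterator can feed a list comprehension
starts = [m.start() for m in re.finditer(pattern, text)]
```

With this sentence the matches start at indices 0 and 28.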
2. match()
The match() method checks for a match only at the beginning of the string. If the pattern
matches at the beginning, it returns a match object. Otherwise, it returns None.
Syntax:
re.match(pattern, string, flags=0)
Example:
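import re
pattern = r'Python'
text = "Python is a fine language."  # sample text for illustration
# Using match to check for 'Python' at the beginning of the string
match = re.match(pattern, text)
if match:
    print(f"Found '{match.group()}' at index {match.start()}")
else:
    print("No match found at the beginning of the string.")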
Output:
Found 'Python' at index 0
3. search()
The search() method scans through the string, looking for any location where the pattern
matches. It returns a match object if the pattern is found, otherwise it returns None.
Syntax:
re.search(pattern, string, flags=0)
Example:
import re
pattern = r'language'
text = "Python is a fine language."  # sample text for illustration
# Using search to find 'language' anywhere in the string
match = re.search(pattern, text)
if match:
    print(f"Found '{match.group()}' at index {match.start()}")
else:
    print("No match found.")
Output:
Found 'language' at index 17
Explanation:
finditer():
o Iterates over all occurrences of the pattern in the string.
o Useful when you need to find all matches and perform actions on each match.
match():
o Checks if the pattern matches at the beginning of the string.
o Use it when you specifically want to match patterns starting from the
beginning of the string.
search():
o Scans through the string to find any location where the pattern matches.
o Suitable for finding patterns anywhere within the string.
Conclusion:
13.What does the findall() method do in the re package? Explain with example
The findall() method in the re package in Python is used to find all occurrences of a
pattern (regular expression) in a string and return them as a list of strings. It scans through the
input string, finds all matches of the pattern, and returns them as a list, with each element of
the list representing a matched string.
Syntax:
re.findall(pattern, string, flags=0)
Example:
import re
text = "Hello there! How are you? Can you tell me how to find all vowels in this text?"
# Match any single lowercase vowel
pattern = r'[aeiou]'
vowels = re.findall(pattern, text)
print("Found vowels:", vowels)
Output:
Found vowels: ['e', 'o', 'e', 'e', 'o', 'a', 'e', 'o', 'u', 'a', 'o', 'u', 'e', 'e', 'o', 'o', 'i', 'a', 'o', 'e', 'i', 'i', 'e']
Explanation:
Additional Notes:
Case Sensitivity: By default, regular expressions are case-sensitive. If you want to match both uppercase and lowercase vowels, you can change the pattern to [aeiouAEIOU] or use the re.IGNORECASE flag (re.I) to make the pattern case-insensitive.
Matching Subgroups: If your pattern contains a single capturing group (...), findall() returns a list of the strings matched by that group. With two or more groups, it returns a list of tuples, one string per group.
Empty Matches: findall() will return empty strings for matches of zero length if
they exist in the text.
Performance: findall() is useful when you want to collect all matches of a
pattern into a list. It's efficient for relatively small to medium-sized texts. For very
large texts or more complex matching scenarios, consider using finditer() for
better memory efficiency and the ability to process matches one at a time.
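The effect of capturing groups on findall() is easiest to see side by side. The sketch below uses a hypothetical string containing two dates and runs the same pattern with zero, one, and three groups:

```python
import re

# Hypothetical sample text
text = "Start: 2024-06-24, end: 2023-12-31"

# No groups: each element is the whole match
whole = re.findall(r'\d{4}-\d{2}-\d{2}', text)

# One group: each element is just the text captured by that group
years = re.findall(r'(\d{4})-\d{2}-\d{2}', text)

# Several groups: each element is a tuple with one string per group
parts = re.findall(r'(\d{4})-(\d{2})-(\d{2})', text)

print(whole)  # ['2024-06-24', '2023-12-31']
print(years)  # ['2024', '2023']
print(parts)  # [('2024', '06', '24'), ('2023', '12', '31')]
```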
In Python, formatting dates and times is achieved using the strftime() method (string
format time) available in the datetime module. This method allows you to specify a format
string to customize the representation of dates and times according to various patterns. Here’s
an overview of common format codes used in strftime():
You can combine these format codes with literal text and other characters to create a custom
format. Here’s an example:
from datetime import datetime
now = datetime.now()
formatted = now.strftime("%A, %d %B %Y %I:%M:%S %p")
print("Formatted date and time:", formatted)
Output (example):
Formatted date and time: Monday, 24 June 2024 11:30:15 AM
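To make the format codes concrete, the sketch below formats a fixed timestamp, so the result does not depend on when it is run. The chosen date and the variable names are illustrative. Day and month names depend on the active locale; under Python's default C locale they are in English:

```python
from datetime import datetime

# Fixed, hypothetical timestamp so the output is reproducible
moment = datetime(2024, 6, 24, 11, 30, 15)

# %A weekday name, %d day of month, %B month name, %Y four-digit year
# %I hour (12-hour clock), %M minute, %S second, %p AM/PM
long_form = moment.strftime("%A, %d %B %Y %I:%M:%S %p")

# %m month as a number, %H hour (24-hour clock)
iso_like = moment.strftime("%Y-%m-%d %H:%M:%S")

# strptime() uses the same codes to parse a string back into a datetime
parsed = datetime.strptime(iso_like, "%Y-%m-%d %H:%M:%S")

print(long_form)
print(iso_like)
print(parsed == moment)
```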
Python's locale module allows you to format dates and times according to the conventions
and formats of a specific locale (region or country). This is useful when you need to display
dates and times in formats that are customary for different locales. Here’s how you can use it:
1. Setting the Locale: You can set the desired locale using locale.setlocale()
function, specifying the category (LC_TIME for date and time formatting) and the
locale identifier.
import locale
from datetime import datetime
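now = datetime.now()
# Set the locale for date and time formatting (locale names are platform-dependent)
locale.setlocale(locale.LC_TIME, 'de_DE.UTF-8')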
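2. Formatting with the Locale: After the locale is set, strftime() uses that locale's names for days and months.
import locale
from datetime import datetime
now = datetime.now()
locale.setlocale(locale.LC_TIME, 'de_DE.UTF-8')  # must be installed on the system
formatted = now.strftime("%A, %d. %B %Y %H:%M:%S Uhr")
print("Formatted date and time (German locale):", formatted)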
Output (example):
Formatted date and time (German locale): Montag, 24. Juni 2024 11:30:15 Uhr
Conclusion:
These capabilities in Python are powerful for displaying dates and times in a way that is both
culturally appropriate and easily understood by users from different regions. When working
with dates and times in applications, choosing the right format and locale can greatly enhance
usability and user experience.
1. Compilation of Patterns:
o re.compile(pattern, flags=0): Compiles a regular expression
pattern into a regex object for reuse. This is useful when the pattern will be
used multiple times.
import re
pattern = re.compile(r'\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b', re.IGNORECASE)
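2. Searching for Patterns:
o re.search(pattern, string, flags=0): Scans the string and returns a match object for the first location where the pattern matches, or None if there is no match.
# Using re.search
result = re.search(r'apple', 'I like apples and oranges')
if result:
    print('Found:', result.group()) # Found: apple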
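3. Finding All Matches:
o re.findall(pattern, string, flags=0): Returns all non-overlapping matches of the pattern in the string as a list.
# Using re.findall
results = re.findall(r'\d+', 'There are 3 apples and 12 oranges')
print(results) # ['3', '12']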
4. Splitting by Pattern:
o re.split(pattern, string, maxsplit=0, flags=0): Splits
the string by occurrences of the pattern and returns a list of substrings.
# Using re.split
parts = re.split(r'\s+', 'Split this sentence into words')
print(parts) # ['Split', 'this', 'sentence', 'into', 'words']
5. Replacing Patterns:
o re.sub(pattern, replacement, string, count=0,
flags=0): Replaces occurrences of the pattern in the string with the
replacement string.
# Using re.sub
new_string = re.sub(r'\d+', 'X', 'There are 3 apples and 12 oranges')
print(new_string) # 'There are X apples and X oranges'
Python's re module also supports flags that modify the behavior of regex operations:
import re
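The list of flags is missing above. As a minimal sketch, the following demonstrates three commonly used flags from the re module on a hypothetical two-line string, and shows how flags combine with a compiled pattern:

```python
import re

# Hypothetical two-line sample text
text = "first line\nSecond LINE"

# re.IGNORECASE (re.I): letters match regardless of case
ignore_case = re.findall(r'line', text, flags=re.IGNORECASE)

# re.MULTILINE (re.M): ^ and $ also match at the start and end of each line
line_starts = re.findall(r'^\w+', text, flags=re.MULTILINE)

# re.DOTALL (re.S): '.' also matches newline characters
without_dotall = re.search(r'first.*LINE', text)
with_dotall = re.search(r'first.*LINE', text, flags=re.DOTALL)

# Flags are combined with | and can be stored in a compiled pattern for reuse
second_line = re.compile(r'^second line$', re.IGNORECASE | re.MULTILINE)
second_match = second_line.search(text)

print(ignore_case)            # ['line', 'LINE']
print(line_starts)            # ['first', 'Second']
print(without_dotall)         # None
print(with_dotall.group() == text)
print(second_match.group())   # Second LINE
```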
Conclusion
Python's re module provides a robust and efficient way to work with regular expressions,
enabling tasks such as searching, matching, splitting, and replacing strings based on complex
patterns. Understanding regular expressions and their usage in Python is essential for tasks
involving text processing, data validation, and pattern recognition. The re module's
flexibility and powerful features make it a valuable tool in the Python programmer's toolkit.