Sometimes, while working with Python records, we need to group elements that are equal on multiple keys and sum a particular field within each group. This kind of problem often occurs in data-processing applications. Let's discuss certain ways in which this task can be performed.
Input :
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best')]
grp_indx = [1, 2] [ Indices to group ]
sum_idx = [0] [ Index to sum ]
Output : [('M', 'Gfg', 12), ('H', 'Gfg', 23), ('M', 'Best', 13)]
Input :
test_list = [(12, 'M', 'Gfg'), (23, 'M', 'Gfg'), (13, 'M', 'Best')]
grp_indx = [1, 2] [ Indices to group ]
sum_idx = [0] [ Index to sum ]
Output : [('M', 'Gfg', 35), ('M', 'Best', 13)]
Method 1: Using loop + defaultdict() + list comprehension
The combination of the above functionalities can be used to solve this problem. In this, we perform the grouping and the summation using a loop over a defaultdict, and the result list is built using a list comprehension.
Approach:
- List of tuples test_list is initialized with some values.
- grp_indx is a list of grouping indices, indicating the positions of elements in each tuple that will be used for grouping.
- sum_idx is a list of summation indices, indicating the positions of elements in each tuple that will be used for summation.
- A defaultdict named temp is initialized to store the results.
- A loop iterates through each tuple in test_list.
- For each tuple, the elements at positions grp_indx[0] and grp_indx[1] are used to form a key for temp.
- The value at position sum_idx[0] in the tuple is added to the corresponding value in temp.
- Once all tuples have been processed, a list comprehension is used to create a new list res by iterating through each key-value pair in temp and creating a new tuple by concatenating the key and value.
- Finally, the grouped summation is printed.
Follow the below steps to implement the above idea:
Python3
# Python3 code to demonstrate working of
# Multiple Keys Grouped Summation
# Using loop + defaultdict() + list comprehension
from collections import defaultdict
# initializing list
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'),
             (13, 'M', 'Best'), (18, 'M', 'Gfg'),
             (2, 'H', 'Gfg'), (23, 'M', 'Best')]
# printing original list
print("The original list is : " + str(test_list))
# initializing grouping indices
grp_indx = [1, 2]
# initializing sum index
sum_idx = [0]
# Multiple Keys Grouped Summation
# Using loop + defaultdict() + list comprehension
temp = defaultdict(int)
for sub in test_list:
    temp[(sub[grp_indx[0]], sub[grp_indx[1]])] += sub[sum_idx[0]]
res = [key + (val, ) for key, val in temp.items()]
# printing result
print("The grouped summation : " + str(res))
Output :
The original list is : [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best'), (18, 'M', 'Gfg'), (2, 'H', 'Gfg'), (23, 'M', 'Best')]
The grouped summation : [('M', 'Gfg', 30), ('H', 'Gfg', 25), ('M', 'Best', 36)]
Time complexity: O(n), where n is the length of the input list.
Auxiliary space: O(m), where m is the number of distinct combinations of grouping indices.
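The code above hard-codes exactly two grouping indices (grp_indx[0] and grp_indx[1]). As a minimal sketch, the same loop can be wrapped in a function that builds the key from however many indices grp_indx contains. The function name grouped_sum is hypothetical and is not part of the original code.

```python
from collections import defaultdict


def grouped_sum(records, grp_indx, sum_idx):
    """Sum the element at sum_idx[0] per group (hypothetical helper)."""
    totals = defaultdict(int)
    for rec in records:
        # Build the key from every grouping index, not just the first two.
        key = tuple(rec[i] for i in grp_indx)
        totals[key] += rec[sum_idx[0]]
    # Append each total to its key tuple; dicts keep first-seen order.
    return [key + (total,) for key, total in totals.items()]


test_list = [(12, 'M', 'Gfg'), (23, 'M', 'Gfg'), (13, 'M', 'Best')]
print(grouped_sum(test_list, [1, 2], [0]))  # group on two indices
print(grouped_sum(test_list, [1], [0]))     # group on a single index
```

With grp_indx = [1, 2] this reproduces the second input/output pair shown at the top of the article.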
Method 2: Using itertools.groupby() and a lambda function for Multiple Keys Grouped Summation
In this method, we first sort the input list using the sorted() function and a lambda function that extracts the elements at the grouping indices. We then use itertools.groupby() to group the sorted list by the same key; the sort is needed because groupby() only groups consecutive elements with equal keys. Finally, a list comprehension iterates over each group, sums the values at index sum_idx[0] for the elements in the group, and creates a new tuple containing the grouping elements and the summed value.
Python3
from itertools import groupby
# Initializing list
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'),
             (13, 'M', 'Best'), (18, 'M', 'Gfg'),
             (2, 'H', 'Gfg'), (23, 'M', 'Best')]
# Printing original list
print("The original list is : " + str(test_list))
# Initializing grouping indices
grp_indx = [1, 2]
# Initializing sum index
sum_idx = [0]
# Multiple Keys Grouped Summation
# Using itertools.groupby() and a lambda function
res = [(key[0], key[1], sum(sub[sum_idx[0]] for sub in group))
       for key, group in groupby(sorted(test_list, key=lambda x: (x[grp_indx[0]], x[grp_indx[1]])),
                                 key=lambda x: (x[grp_indx[0]], x[grp_indx[1]]))]
# Printing result
print("The grouped summation : " + str(res))
Output :
The original list is : [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best'), (18, 'M', 'Gfg'), (2, 'H', 'Gfg'), (23, 'M', 'Best')]
The grouped summation : [('H', 'Gfg', 25), ('M', 'Best', 36), ('M', 'Gfg', 30)]
Time complexity: O(n log n) because of the sorting operation. The groupby function itself has a time complexity of O(n).
Auxiliary space: O(n).
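itertools.groupby() only merges consecutive items with equal keys, which is why the list is sorted first. The sketch below illustrates what happens when the sort is skipped. The helper names key_of and group_totals are hypothetical, and the grouping and sum indices are fixed at (1, 2) and 0 for brevity.

```python
from itertools import groupby

test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (18, 'M', 'Gfg')]


def key_of(rec):
    # Grouping key: the elements at indices 1 and 2.
    return (rec[1], rec[2])


def group_totals(records):
    # Hypothetical helper: sum index 0 for each run of equal keys.
    return [key + (sum(rec[0] for rec in grp),)
            for key, grp in groupby(records, key=key_of)]


unsorted_res = group_totals(test_list)
sorted_res = group_totals(sorted(test_list, key=key_of))

print(unsorted_res)  # ('M', 'Gfg') appears twice: its records are not adjacent
print(sorted_res)    # one entry per key
```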
Method 3: Using pandas library
Pandas is a powerful library in Python for data manipulation and analysis. It has a groupby function that can be used to group data by one or more keys and perform operations on the grouped data.
Python3
import pandas as pd
# initializing list
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'),
             (13, 'M', 'Best'), (18, 'M', 'Gfg'),
             (2, 'H', 'Gfg'), (23, 'M', 'Best')]
# creating a pandas DataFrame from the list
df = pd.DataFrame(test_list, columns=['value', 'key1', 'key2'])
# grouping by key1 and key2 and summing the values
grouped = df.groupby(['key1', 'key2'])['value'].sum()
# converting the result back to a list of tuples
# (int() turns the NumPy integer sums into plain Python ints)
res = [(key[0], key[1], int(value)) for key, value in grouped.items()]
# printing result
print("The grouped summation : " + str(res))
Output :
The grouped summation : [('H', 'Gfg', 25), ('M', 'Best', 36), ('M', 'Gfg', 30)]
Time complexity: O(n log n) in the worst case, since groupby() sorts the group keys by default (sort=True).
Auxiliary space: O(n) because pandas needs to create a DataFrame object to store the input data and perform the grouping operation.
Method 4: Using itertools.groupby() and operator.itemgetter()
Use the itertools.groupby() function and the operator.itemgetter() function to group the elements by their keys and sum the values.
Python3
import itertools
import operator
# Initializing list
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'),
             (13, 'M', 'Best'), (18, 'M', 'Gfg'),
             (2, 'H', 'Gfg'), (23, 'M', 'Best')]
# Initializing grouping indices
grp_indx = [1, 2]
# Initializing sum index
sum_idx = [0]
# Multiple Keys Grouped Summation
# Using itertools.groupby() and operator.itemgetter()
test_list.sort(key=operator.itemgetter(*grp_indx))
res = []
for k, g in itertools.groupby(test_list, key=operator.itemgetter(*grp_indx)):
    vals = [sub[sum_idx[0]] for sub in g]
    res.append(k + (sum(vals),))
# Printing result
print("The grouped summation : " + str(res))
Output :
The grouped summation : [('H', 'Gfg', 25), ('M', 'Best', 36), ('M', 'Gfg', 30)]
Time complexity: O(n log n) due to the in-place sorting of the input list using list.sort().
Auxiliary space: O(n) because the result list res and the temporary list vals both have a maximum size of n, where n is the number of elements in the input list.
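One caveat with operator.itemgetter(*grp_indx): when it is given a single index, it returns the bare element rather than a one-element tuple, so k + (sum(vals),) would fail if grp_indx had only one entry. The following is a minimal sketch of one way to guard against that. The helper name make_key is hypothetical, and the sum index is fixed at 0 for brevity.

```python
import itertools
import operator


def make_key(grp_indx):
    # Hypothetical helper: always return a tuple key, even for one index.
    getter = operator.itemgetter(*grp_indx)
    if len(grp_indx) == 1:
        return lambda rec: (getter(rec),)
    return getter


test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (18, 'M', 'Gfg')]
key = make_key([1])  # group by a single index

res = []
# sorted() leaves test_list itself unchanged, unlike list.sort().
for k, g in itertools.groupby(sorted(test_list, key=key), key=key):
    res.append(k + (sum(rec[0] for rec in g),))

print(res)
```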
Method 5: Using dictionary comprehension
- Initialize the input list, grouping indices, and sum index.
- Create a dictionary comprehension to initialize a dictionary with keys as tuples of grouping indices and values as 0.
- Traverse through each sub-list in the input list, and update the corresponding key value in the dictionary by adding the value at the sum index to the existing value.
- Convert the dictionary to a list of tuples where each tuple contains the grouping indices followed by the sum.
- Print the result.
Python3
# Initializing list
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best'),
             (18, 'M', 'Gfg'), (2, 'H', 'Gfg'), (23, 'M', 'Best')]
# Initializing grouping indices
grp_indx = [1, 2]
# Initializing sum index
sum_idx = [0]
# Multiple Keys Grouped Summation
# Using dictionary comprehension
temp = {(sub[grp_indx[0]], sub[grp_indx[1]]): 0 for sub in test_list}
for sub in test_list:
    temp[(sub[grp_indx[0]], sub[grp_indx[1]])] += sub[sum_idx[0]]
res = [key + (val,) for key, val in temp.items()]
# Printing result
print("The grouped summation: " + str(res))
Output :
The grouped summation: [('M', 'Gfg', 30), ('H', 'Gfg', 25), ('M', 'Best', 36)]
Time complexity: O(n), where n is the length of the input list.
Auxiliary Space: O(m), where m is the number of unique combinations of grouping indices.
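The code above walks the list twice: once to create the zeroed keys and once to accumulate. As a sketch of a single-pass variant (an alternative, not the method shown above), dict.get() with a default of 0 handles keys that have not been seen yet.

```python
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best'),
             (18, 'M', 'Gfg'), (2, 'H', 'Gfg'), (23, 'M', 'Best')]
grp_indx = [1, 2]
sum_idx = [0]

temp = {}
for sub in test_list:
    key = tuple(sub[i] for i in grp_indx)
    # get() returns 0 the first time a key is encountered.
    temp[key] = temp.get(key, 0) + sub[sum_idx[0]]

res = [key + (val,) for key, val in temp.items()]
print("The grouped summation: " + str(res))
```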
Method 6: Using the built-in function reduce() from the functools module
reduce() is a function from the functools module in Python that applies a function of two arguments cumulatively on a sequence of elements, in this case, our list of tuples.
Approach:
- Import the functools module
- Initialize grp_indx and sum_idx variables as before
- Define a lambda function that takes the accumulated dictionary and the next tuple as arguments and returns a new dictionary in which the value for the tuple's grouping key is increased by the element at the sum index. This function will be used by reduce() to perform the grouped summation.
- Use reduce() to apply the lambda function on the list of tuples. The initial value passed to reduce() is an empty dictionary.
- Convert the resulting dictionary to a list of tuples, where each tuple has the same first two elements as the keys of the dictionary and the third element is the value of the corresponding key.
- Print the result.
Below is the implementation of the above approach:
Python3
from functools import reduce
# Initializing list
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best'),
             (18, 'M', 'Gfg'), (2, 'H', 'Gfg'), (23, 'M', 'Best')]
# Initializing grouping indices
grp_indx = [1, 2]
# Initializing sum index
sum_idx = [0]
# Using reduce() for Multiple Keys Grouped Summation
res_dict = reduce(lambda d, t: {**d, (t[grp_indx[0]], t[grp_indx[1]]): d.get((t[grp_indx[0]], t[grp_indx[1]]), 0) + t[sum_idx[0]]}, test_list, {})
# Converting the dictionary to a list of tuples
res = [(k[0], k[1], v) for k, v in res_dict.items()]
# printing result
print("The grouped summation: " + str(res))
Output :
The grouped summation: [('M', 'Gfg', 30), ('H', 'Gfg', 25), ('M', 'Best', 36)]
Time Complexity: O(n * m), where n is the length of the input list and m is the number of distinct groups. reduce() calls the lambda n times, and each call copies the accumulated dictionary with {**d, ...}, which takes O(m) time.
Auxiliary Space: O(n) because of the use of a dictionary to store intermediate results.
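Because the lambda above unpacks the accumulated dictionary with {**d, ...} on every step, each step copies all keys collected so far. A sketch of a variant that updates and returns the same accumulator instead is shown below; the named function add_record is hypothetical and stands in for the lambda.

```python
from functools import reduce

test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best'),
             (18, 'M', 'Gfg'), (2, 'H', 'Gfg'), (23, 'M', 'Best')]
grp_indx = [1, 2]
sum_idx = [0]


def add_record(acc, rec):
    # Hypothetical reducer: update the running totals in place.
    key = tuple(rec[i] for i in grp_indx)
    acc[key] = acc.get(key, 0) + rec[sum_idx[0]]
    return acc  # reduce() passes this on to the next step


res_dict = reduce(add_record, test_list, {})
res = [key + (val,) for key, val in res_dict.items()]
print("The grouped summation: " + str(res))
```

Mutating the accumulator is less "functional" than the lambda version, but it avoids building a new dictionary for every input tuple.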
Method 7: Using NumPy
Steps:
- First, we import the NumPy library.
- We initialize the input list (test_list) and the grouping and sum indices (grp_indx and sum_idx, respectively).
- We convert the input list to a NumPy array using np.array().
- We extract the grouping and sum columns as separate arrays, grp_arr and sum_arr, using array indexing (arr[:, grp_indx] and arr[:, sum_idx], respectively).
- We convert the sum_arr to a numeric data type (such as int) using the astype() method, so that we can perform summation on it later.
- We use the np.unique() function to find the unique combinations of the grouping indices (grp_arr) and store them in unique_groups.
- We iterate over the unique combinations of grouping indices using a for loop.
- For each unique combination, we calculate the grouped summation by using the np.all() function to compare the grp_arr with the current group, and then summing the corresponding values in sum_arr.
- We append the results as tuples to a list called result.
- Finally, we print the result.
Python3
import numpy as np
# initializing list
test_list = [(12, 'M', 'Gfg'), (23, 'H', 'Gfg'), (13, 'M', 'Best'),
             (18, 'M', 'Gfg'), (2, 'H', 'Gfg'), (23, 'M', 'Best')]
# initializing grouping indices
grp_indx = [1, 2]
# initializing sum index
sum_idx = [0]
# convert the list to a NumPy array
arr = np.array(test_list)
# extract the grouping and sum indices as separate arrays
grp_arr = arr[:, grp_indx]
sum_arr = arr[:, sum_idx].astype(int) # convert to int for numeric summation
# use np.unique() to find the unique combinations of the grouping indices
unique_groups = np.unique(grp_arr, axis=0)
# iterate over the unique combinations and calculate the grouped summation
result = []
for group in unique_groups:
    group_sum = np.sum(sum_arr[np.all(grp_arr == group, axis=1)])
    # convert NumPy scalars to plain Python values so they print cleanly
    result.append((str(group[0]), str(group[1]), int(group_sum)))
# printing result
print("The grouped summation: " + str(result))
Output :
The grouped summation: [('H', 'Gfg', 25), ('M', 'Best', 36), ('M', 'Gfg', 30)]
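Time complexity: O(N log N) for np.unique(), where N is the number of elements in test_list, plus O(N * M) for the for loop, where M is the number of unique groups, since each iteration compares every row of grp_arr with the current group.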
Auxiliary Space: O(N) for the NumPy arrays and O(N) for the result list.