Python - Group records by Kth column in List
Last Updated : 18 Apr, 2023
Sometimes, while working with Python lists, we need to group records by a particular field. One common case is grouping a list of tuples by the element at index K (the Kth column). Let's discuss several ways in which this task can be performed.
Method #1 : Using loop + defaultdict()
A defaultdict(list) maps each Kth-column value to the list of tuples that share it. We iterate over the records once and append each tuple to the list stored under its key.
Python3
# Python3 code to demonstrate
# Group records by Kth column in List
# using loop + defaultdict()
from collections import defaultdict
# Initializing list
test_list = [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
# printing original list
print("The original list is : " + str(test_list))
# Initializing K
K = 0
# Group records by Kth column in List
# using loop + defaultdict()
temp = defaultdict(list)
for ele in test_list:
    temp[ele[K]].append(ele)
res = list(temp.values())
# printing result
print("The list after grouping : " + str(res))
Output : The original list is : [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
The list after grouping : [[('Gfg', 1), ('Gfg', 3)], [('is', 2), ('is', 4)], [('best', 5)]]
Time Complexity: O(n), where n is the number of elements in the list “test_list”, since each record is visited once and a dictionary insertion takes O(1) time on average.
Auxiliary Space: O(n) where n is the number of elements in the list “test_list”.
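The loop above can be wrapped in a reusable function. The following is a minimal sketch rather than part of the original example: the name group_by_column and its parameters are hypothetical. It returns the dictionary itself instead of only its values, so the key of each group stays available.

```python
from collections import defaultdict

def group_by_column(records, k):
    """Group records (tuples or lists) by the element at index k."""
    groups = defaultdict(list)
    for record in records:
        groups[record[k]].append(record)
    # Convert to a plain dict; groups appear in first-seen key order (Python 3.7+)
    return dict(groups)

records = [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
by_name = group_by_column(records, 0)
print(by_name['Gfg'])          # [('Gfg', 1), ('Gfg', 3)]
print(list(by_name.values()))  # the same nested list that Method #1 produces
```

Calling list(by_name.values()) gives the list-of-lists form used in the rest of this article.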
Method #2 : Using itemgetter() + groupby() + list comprehension
This task can also be performed using a combination of the above functions. Here, itemgetter() selects the Kth column, sorted() brings equal keys next to each other, groupby() groups them, and a list comprehension compiles the result. Because the list is sorted first, the groups come out in sorted key order.
Python3
# Python3 code to demonstrate
# Group records by Kth column in List
# using itemgetter() + list comprehension + groupby()
from operator import itemgetter
from itertools import groupby
# Initializing list
test_list = [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
# printing original list
print("The original list is : " + str(test_list))
# Initializing K
K = 0
# Group records by Kth column in List
# using itemgetter() + groupby() + list comprehension
temp = itemgetter(K)
res = [list(val) for key, val in groupby(sorted(test_list, key=temp), temp)]
# printing result
print("The list after grouping : " + str(res))
Output : The original list is : [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
The list after grouping : [[('Gfg', 1), ('Gfg', 3)], [('best', 5)], [('is', 2), ('is', 4)]]
The time complexity of the code is O(nlogn), where n is the length of the input list.
The space complexity of the code is O(n), where n is the length of the input list.
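groupby() only merges consecutive items with equal keys, which is why the list is sorted before grouping. A small sketch (the variable names are illustrative) of what happens with and without the sort:

```python
from itertools import groupby
from operator import itemgetter

records = [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
first = itemgetter(0)

# Without sorting, equal keys that are not adjacent end up in separate groups
unsorted_groups = [list(group) for _, group in groupby(records, first)]
print(len(unsorted_groups))  # 5

# After sorting by the same key, equal keys are adjacent and merge
sorted_groups = [list(group) for _, group in groupby(sorted(records, key=first), first)]
print(len(sorted_groups))    # 3
```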
Method #3 : Using numpy
One more approach to perform the grouping of records based on the Kth column in a list is using the numpy library.
Here's how it can be done:
Python3
import numpy as np
# Initializing list
test_list = [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
# printing original list
print("The original list is : " + str(test_list))
# Initializing K
K = 0
# Group records by Kth column in List using numpy
# Only the Kth column is handed to numpy, so the records keep their original types
# np.unique returns the distinct keys in sorted order
keys, inverse = np.unique([ele[K] for ele in test_list], return_inverse=True)
res = [[test_list[j] for j in np.flatnonzero(inverse == i)] for i in range(len(keys))]
# printing result
print("The list after grouping : " + str(res))
#This code is contributed by Edula Vinay Kumar Reddy
Output:
The original list is : [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
The list after grouping : [[('Gfg', 1), ('Gfg', 3)], [('best', 5)], [('is', 2), ('is', 4)]]
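Time Complexity: O(N log N + N*M), where N is the number of records and M is the number of distinct keys: np.unique sorts the keys, and each key then needs one pass over the inverse array.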
Space Complexity: O(N)
Method #4: Using a list comprehension with enumerate() function and a set:
- Prints the original list.
- Initializes a variable K with a value of 0.
- Builds a set of the values found in the Kth column, which removes duplicates.
- Uses enumerate() on that set to assign each distinct key a position, stored in a dictionary.
- Uses a list comprehension to create the list result with one empty list per distinct key.
- Loops over each tuple in the input list and appends it to the sub-list at the position of its key.
Python3
# Initializing list
test_list = [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
# printing original list
print("The original list is : " + str(test_list))
# Initializing K
K = 0
# Group records by Kth column in List
# using enumerate() function and a set
position = {key: i for i, key in enumerate(set(x[K] for x in test_list))}
result = [[] for _ in position]
for record in test_list:
    result[position[record[K]]].append(record)
# printing result
print("The list after grouping : " + str(result))
Output (the order of the groups may differ between runs, because a set has no guaranteed order):
The original list is : [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
The list after grouping : [[('Gfg', 1), ('Gfg', 3)], [('is', 2), ('is', 4)], [('best', 5)]]
Time Complexity: O(n), where n is the length of the input list test_list: building the set and the position dictionary takes O(n) time, and each of the n appends uses a dictionary lookup that takes O(1) time on average.
Space Complexity: O(n), as the result holds all n records and the dictionary holds at most n keys.
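A set does not keep insertion order, so the position of each group in the result is not guaranteed. If the groups should appear in the order in which their keys are first seen, one option is to swap the set for dict.fromkeys(). This is a sketch of that alternative, not the method described above, and the names records and position are illustrative:

```python
records = [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
K = 0

# dict.fromkeys() drops duplicate keys but keeps first-seen order
position = {key: i for i, key in enumerate(dict.fromkeys(r[K] for r in records))}
result = [[] for _ in position]
for record in records:
    result[position[record[K]]].append(record)

print(result)
# [[('Gfg', 1), ('Gfg', 3)], [('is', 2), ('is', 4)], [('best', 5)]]
```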
Method #5: Using setdefault on a dictionary
Step-by-step algorithm:
- Initialize the list of tuples containing records.
- Initialize the column number K for grouping by that column.
- Initialize an empty dictionary called groups.
- Iterate over each record in test_list.
a. Get the value of the Kth column in the current record.
b. If the key for this value does not exist in the groups dictionary, create a new empty list as the value for that key.
c. Append the current record to the list for the corresponding key in the groups dictionary.
- Convert the dictionary of groups to a list of lists containing the grouped records.
- Print the resulting list of lists.
Python3
#initializing list
test_list = [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
# printing original list
print("The original list is : " + str(test_list))
#The column number K is initialized to 0
K = 0
#An empty dictionary called groups is initialized
groups = {}
# Iterate over each record in test_list
for x in test_list:
    # setdefault() creates the empty list the first time a key is seen
    groups.setdefault(x[K], []).append(x)
groups = list(groups.values())
# The resulting list of lists is printed
print("The list after grouping: " + str(groups))
Output:
The original list is : [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
The list after grouping: [[('Gfg', 1), ('Gfg', 3)], [('is', 2), ('is', 4)], [('best', 5)]]
Time Complexity: O(n), where n is the number of records in the input list. This is because the algorithm iterates over each record in the input list once.
Auxiliary Space: O(n), where n is the number of records in the input list.
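The same setdefault() pattern extends to grouping on more than one column. A minimal sketch, assuming three-field records invented purely for illustration, uses operator.itemgetter with two indices so that the dictionary key is a tuple:

```python
from operator import itemgetter

# Hypothetical records: (name, year, score)
records = [('Gfg', 2022, 1), ('is', 2022, 2), ('Gfg', 2022, 3), ('Gfg', 2023, 4)]

# itemgetter with several indices returns a tuple, which is a valid dict key
key_of = itemgetter(0, 1)

groups = {}
for record in records:
    groups.setdefault(key_of(record), []).append(record)

print(groups[('Gfg', 2022)])  # [('Gfg', 2022, 1), ('Gfg', 2022, 3)]
print(len(groups))            # 3
```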
Method #6: Using itertools.groupby()
Step-by-step approach:
- We first import the groupby() function from the itertools module.
- We initialize the original list test_list and print it.
- We initialize the value of K to the index of the column we want to group by (in this case, the 0th column).
- We use the sort() method to sort the list in place based on the values in the Kth column. This step is required because groupby() only groups records that are next to each other.
- We use a list comprehension and the groupby() function to group the records based on the Kth column. The groupby() function splits the sorted list into runs of consecutive records that share the key returned by the lambda function (lambda x: x[K]), which is the Kth element of each tuple. We convert each group into a list, and the list comprehension collects these lists into res.
- Finally, we print the resulting list.
Below is the implementation of the above approach:
Python3
from itertools import groupby
# Initializing list
test_list = [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
# printing original list
print("The original list is: " + str(test_list))
# Initializing K
K = 0
# Group records by Kth column in List
# using itertools.groupby()
test_list.sort(key=lambda x: x[K])
res = [list(group) for key, group in groupby(test_list, lambda x: x[K])]
# printing result
print("The list after grouping: " + str(res))
Output:
The original list is: [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
The list after grouping: [[('Gfg', 1), ('Gfg', 3)], [('best', 5)], [('is', 2), ('is', 4)]]
Time complexity: O(n log n), where n is the length of the list.
Auxiliary space: O(n), where n is the length of the list.
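Note that test_list.sort() reorders the original list in place. If the caller still needs the original order, a sketch using sorted() (which returns a new list) and keeping each key next to its group could look like this. The names records, kth and grouped are illustrative:

```python
from itertools import groupby

records = [('Gfg', 1), ('is', 2), ('Gfg', 3), ('is', 4), ('best', 5)]
K = 0

def kth(record):
    """Return the Kth element of a record."""
    return record[K]

# sorted() leaves `records` untouched; groupby() then sees equal keys side by side
grouped = {key: list(group) for key, group in groupby(sorted(records, key=kth), kth)}

print(grouped)
print(records[1])  # ('is', 2): the original order is preserved
```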