Python | Group and count similar records
Last Updated :
24 Mar, 2023
Sometimes, while working with records, we can have a problem in which we need to collect and maintain the counter value inside records. This kind of application is important in the web development domain. Let's discuss certain ways in which this task can be performed.
Method #1 : Using loop + Counter() + set()
The combination of the above functionalities can be employed to achieve this task. In this, we run a loop to capture each tuple and add to set and check if it's already existing, then increase and add a counter value to it. The cumulative count is achieved by using Counter().
Python3
# Python3 code to demonstrate working of
# Group and count similar records
# using Counter() + loop + set()
from collections import Counter
# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
('is', ), ('for', ), ('geeks', )]
# printing original list
print("The original list : " + str(test_list))
# Group and count similar records
# using Counter() + loop + set()
res = []
temp = set()
counter = Counter(test_list)
for sub in test_list:
if sub not in temp:
res.append((counter[sub], ) + sub)
temp.add(sub)
# printing result
print("Grouped and counted list is : " + str(res))
OutputThe original list : [('gfg',), ('is',), ('best',), ('gfg',), ('is',), ('for',), ('geeks',)]
Grouped and counted list is : [(2, 'gfg'), (2, 'is'), (1, 'best'), (1, 'for'), (1, 'geeks')]
Time complexity: O(n)
Auxiliary space: O(n)
Method #2: Using Counter() + list comprehension + items()
This is a one-liner approach and is recommended to use in programming. The task of loops is handled by list comprehension and items() are used to access all the elements of the Counter converted dictionary to allow computations.
Python3
# Python3 code to demonstrate working of
# Group and count similar records
# using Counter() + list comprehension + items()
from collections import Counter
# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
('is', ), ('for', ), ('geeks', )]
# printing original list
print("The original list : " + str(test_list))
# Group and count similar records
# using Counter() + list comprehension + items()
res = [(counter, ) + ele for ele, counter in Counter(test_list).items()]
# printing result
print("Grouped and counted list is : " + str(res))
OutputThe original list : [('gfg',), ('is',), ('best',), ('gfg',), ('is',), ('for',), ('geeks',)]
Grouped and counted list is : [(2, 'gfg'), (2, 'is'), (1, 'best'), (1, 'for'), (1, 'geeks')]
Time complexity: O(n), where n is the number of elements in the list "test_list".
Auxiliary space: O(n), as it stores the result in a new list "res".
Method #3 : Using count(),join(),list() and set() methods
Python3
# Python3 code to demonstrate working of
# Group and count similar records
# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
('is', ), ('for', ), ('geeks', )]
# printing original list
print("The original list : " + str(test_list))
# Group and count similar records
res = []
x = list(set(test_list))
for i in x:
a = test_list.count(i)
b = "".join(i)
res.append((a, b))
# printing result
print("Grouped and counted list is : " + str(res))
OutputThe original list : [('gfg',), ('is',), ('best',), ('gfg',), ('is',), ('for',), ('geeks',)]
Grouped and counted list is : [(2, 'gfg'), (1, 'best'), (1, 'geeks'), (1, 'for'), (2, 'is')]
Time complexity: O(n^2) - due to the use of count() method within the for loop.
Auxiliary space: O(n) - used to store the result list.
Method 4: using operator.countOf() method
Python3
# Python3 code to demonstrate working of
# Group and count similar records
import operator as op
# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
('is', ), ('for', ), ('geeks', )]
# printing original list
print("The original list : " + str(test_list))
# Group and count similar records
res = []
x = list(set(test_list))
for i in x:
a = op.countOf(test_list, i)
b = "".join(i)
res.append((a, b))
# printing result
print("Grouped and counted list is : " + str(res))
OutputThe original list : [('gfg',), ('is',), ('best',), ('gfg',), ('is',), ('for',), ('geeks',)]
Grouped and counted list is : [(2, 'gfg'), (1, 'best'), (1, 'for'), (1, 'geeks'), (2, 'is')]
Time Complexity: O(N)
Auxiliary Space : O(N)
Method #5: Using defaultdict() and list comprehension
This method uses a defaultdict to count the occurrences of each sub-list in the given list. Then, it creates a new list using a list comprehension by unpacking each sub-list and adding the count value as the first element of the tuple.
Python3
from collections import defaultdict
# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
('is', ), ('for', ), ('geeks', )]
# printing original list
print("The original list : " + str(test_list))
# Group and count similar records
# using defaultdict() and list comprehension
counter = defaultdict(int)
for sub in test_list:
counter[sub] += 1
res = [(count,) + sub for sub, count in counter.items()]
# printing result
print("Grouped and counted list is : " + str(res))
OutputThe original list : [('gfg',), ('is',), ('best',), ('gfg',), ('is',), ('for',), ('geeks',)]
Grouped and counted list is : [(2, 'gfg'), (2, 'is'), (1, 'best'), (1, 'for'), (1, 'geeks')]
Time complexity: O(n), where n is the length of the input list test_list.
Auxiliary space: O(k), where k is the number of distinct elements in the input list.
Method #7: Using itertools.groupby() and len()
This approach uses itertools.groupby() to group the elements of the test_list by their value, and then applies len() to each group to get the count of similar records. It then creates a new list with the count and the element value.
Approach:
- Import itertools.
- Sort the test_list to ensure that similar records are grouped together.
- Use itertools.groupby() to group the elements of the test_list by their value.
- Use a list comprehension to create a list with the count and the element value for each group.
- Print the original list and the new list.
Python3
import itertools
# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
('is', ), ('for', ), ('geeks', )]
# sort the list
test_list.sort()
# group and count similar records
res = [(len(list(group)), key) for key, group in itertools.groupby(test_list)]
# print the original list and the new list
print("The original list : " + str(test_list))
print("Grouped and counted list is : " + str(res))
OutputThe original list : [('best',), ('for',), ('geeks',), ('gfg',), ('gfg',), ('is',), ('is',)]
Grouped and counted list is : [(1, ('best',)), (1, ('for',)), (1, ('geeks',)), (2, ('gfg',)), (2, ('is',))]
Time complexity: O(NlogN) due to sorting the list.
Auxiliary space: O(N) to store the new list.
Similar Reads
Python - Group Records on Similar index elements
Sometimes, while working with Python records, we can have a problem in which we need to perform grouping of particular index of tuple, on basis of similarity of rest of indices. This type of problems can have possible applications in Web development domain. Let's discuss certain ways in which this t
6 min read
Python | Record Similar tuple occurrences
Sometimes, while working with data, we can have a problem in which we need to check the occurrences of records that occur at similar times. This has applications in the web development domain. Let's discuss certain ways in which this task can be performed. Method #1 : Using map() + Counter() + sorte
9 min read
Python - Group records by Kth column in List
Sometimes, while working with Python lists, we can have a problem in which we need to perform grouping of records on basis of certain parameters. One such parameters can be on the Kth element of Tuple. Lets discuss certain ways in which this task can be performed. Method #1 : Using loop + defaultdic
8 min read
Python - Group Similar keys in dictionary
Sometimes while working with dictionary data, we can have problems in which we need to perform grouping based on the substring of keys and reform the data grouped on similar keys. This can have application in data preprocessing. Let us discuss certain ways in which this task can be performed. Method
5 min read
Python | Strings with similar front and rear character
Sometimes, while programming, we can have a problem in which we need to check for the front and rear characters of each string. We may require to extract the count of all strings with similar front and rear characters. Let's discuss certain ways in which this task can be performed. Method #1: Using
4 min read
Python - Count similar pair in Dual List
Sometimes, while working with data, we can have a problem in which we receive the dual elements pair and we intend to find pairs of similar elements and it's frequency. This kind of data is usually useful in data domains. Let's discuss certain ways in which this task can be performed.Method #1 : Usi
4 min read
Python - Retain records with N occurrences of K
Sometimes, while working with Python tuples list, we can have a problem in which we need to perform retention of all the records where occurrences of K is N times. This kind of problem can come in domains such as web development and day-day programming. Let's discuss certain ways in which this task
10 min read
Python - Count elements in record tuple
Sometimes, while working with data in form of records, we can have a problem in which we need to find the total element counts of all the records received. This is a very common application that can occur in Data Science domain. Letâs discuss certain ways in which this task can be performed. Method
5 min read
Python - Group similar elements into Matrix
Sometimes, while working with Python Matrix, we can have a problem in which we need to perform grouping of all the elements with are the same. This kind of problem can have applications in data domains. Let's discuss certain ways in which this task can be performed. Input : test_list = [1, 3, 4, 4,
8 min read
Python - Extract Similar pairs from List
Sometimes, while working with Python lists, we can have a problem in which we need to perform extraction of all the elements pairs in list. This kind of problem can have application in domains such as web development and day-day programming. Let's discuss certain ways in which this task can be perfo
5 min read