Python | Group and count similar records

Last Updated : 24 Mar, 2023

Sometimes, while working with records, we need to group identical records and attach to each one the number of times it occurs. For example, the list [('gfg', ), ('is', ), ('gfg', )] becomes [(2, 'gfg'), (1, 'is')]. This kind of task can come up in domains such as web development. Let's discuss certain ways in which this task can be performed.

Method #1 : Using loop + Counter() + set()

The combination of the above functionalities can be employed to achieve this task. Counter() first computes how many times each tuple occurs. We then loop over the list and use a set to track the tuples already seen: the first time a tuple appears, its count is prepended to it and the result is appended to the output list. This keeps the records in first-seen order.

Python3

# Python3 code to demonstrate working of
# Group and count similar records
# using Counter() + loop + set()
from collections import Counter

# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
             ('is', ), ('for', ), ('geeks', )]

# printing original list
print("The original list : " + str(test_list))

# Group and count similar records
# using Counter() + loop + set()
res = []
temp = set()
counter = Counter(test_list)
for sub in test_list:
    if sub not in temp:
        res.append((counter[sub], ) + sub)
        temp.add(sub)

# printing result
print("Grouped and counted list is : " + str(res))

Output

The original list : [('gfg',), ('is',), ('best',), ('gfg',), ('is',), ('for',), ('geeks',)]
Grouped and counted list is : [(2, 'gfg'), (2, 'is'), (1, 'best'), (1, 'for'), (1, 'geeks')]

Time complexity: O(n)
Auxiliary space: O(n)
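
Because (counter[sub], ) + sub is plain tuple concatenation, the same loop should also work for records with more than one field. The sketch below wraps the logic in a function; the name group_and_count and the two-field sample records are hypothetical and only for illustration.

```python
from collections import Counter


def group_and_count(records):
    """Return (count, *record) tuples in first-seen order (hypothetical helper)."""
    counts = Counter(records)
    seen = set()
    result = []
    for record in records:
        if record not in seen:
            # prepend the count to the record tuple
            result.append((counts[record], ) + record)
            seen.add(record)
    return result


# hypothetical two-field records
records = [('gfg', 1), ('is', 2), ('gfg', 1), ('gfg', 3)]
grouped = group_and_count(records)
print("Grouped and counted list is : " + str(grouped))
```

Records are compared as whole tuples, so ('gfg', 1) and ('gfg', 3) are counted separately.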

Method #2: Using Counter() + list comprehension + items()

This is a one-liner approach. The list comprehension replaces the explicit loop, and items() yields each distinct record together with its count from the Counter, so the count can be prepended to the record directly.

Python3

# Python3 code to demonstrate working of
# Group and count similar records
# using Counter() + list comprehension + items()
from collections import Counter

# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
             ('is', ), ('for', ), ('geeks', )]

# printing original list
print("The original list : " + str(test_list))

# Group and count similar records
# using Counter() + list comprehension + items()
res = [(counter, ) + ele for ele, counter in Counter(test_list).items()]

# printing result
print("Grouped and counted list is : " + str(res))

Output

The original list : [('gfg',), ('is',), ('best',), ('gfg',), ('is',), ('for',), ('geeks',)]
Grouped and counted list is : [(2, 'gfg'), (2, 'is'), (1, 'best'), (1, 'for'), (1, 'geeks')]

Time complexity: O(n), where n is the number of elements in the list "test_list".
Auxiliary space: O(n), as it stores the result in a new list "res".
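
If the grouped records should be ordered by frequency rather than by first appearance, Counter.most_common() can be used in place of items(). This is a minimal sketch, assuming Python 3.7 or later, where records with equal counts keep the order in which they were first encountered. The sample list is made up for illustration.

```python
from collections import Counter

# hypothetical sample list in which the first record is not the most frequent
test_list = [('best', ), ('gfg', ), ('is', ), ('gfg', ), ('is', ), ('for', )]

# items() keeps first-seen order
in_order = [(count, ) + ele for ele, count in Counter(test_list).items()]

# most_common() sorts by count, highest first
by_frequency = [(count, ) + ele for ele, count in Counter(test_list).most_common()]

print("First-seen order : " + str(in_order))
print("Frequency order : " + str(by_frequency))
```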

Method #3 : Using count(),join(),list() and set() methods

In this approach, set() extracts the distinct records, count() counts how often each one occurs in the original list, and join() converts each single-element tuple into a string. Since a set is unordered, the order of the result may differ from the order of the input and can vary between runs.

Python3

# Python3 code to demonstrate working of
# Group and count similar records

# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
             ('is', ), ('for', ), ('geeks', )]

# printing original list
print("The original list : " + str(test_list))

# Group and count similar records
res = []
x = list(set(test_list))
for i in x:
    a = test_list.count(i)
    b = "".join(i)
    res.append((a, b))
# printing result
print("Grouped and counted list is : " + str(res))

Output

The original list : [('gfg',), ('is',), ('best',), ('gfg',), ('is',), ('for',), ('geeks',)]
Grouped and counted list is : [(2, 'gfg'), (1, 'best'), (1, 'geeks'), (1, 'for'), (2, 'is')]

Time complexity: O(n^2) - due to the use of count() method within the for loop.
Auxiliary space: O(n) - used to store the result list.
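
The result order above depends on set iteration order. One possible variation, sketched below, deduplicates with dict.fromkeys(), which keeps first-seen order on Python 3.7 and later, and prepends the count with tuple concatenation so that the records do not have to be single-element tuples of strings. It is still quadratic in the worst case, because count() is called once per distinct record.

```python
# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
             ('is', ), ('for', ), ('geeks', )]

res = []
# dict.fromkeys() drops duplicates but keeps first-seen order
for record in dict.fromkeys(test_list):
    res.append((test_list.count(record), ) + record)

print("Grouped and counted list is : " + str(res))
```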

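Method #4 : Using operator.countOf() method

This approach is the same as Method #3, except that operator.countOf() is used instead of the list's count() method. As before, the order of the result depends on set iteration order.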

Python3

# Python3 code to demonstrate working of
# Group and count similar records
import operator as op
# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
             ('is', ), ('for', ), ('geeks', )]

# printing original list
print("The original list : " + str(test_list))

# Group and count similar records
res = []
x = list(set(test_list))
for i in x:
    a = op.countOf(test_list, i)
    b = "".join(i)
    res.append((a, b))
# printing result
print("Grouped and counted list is : " + str(res))

Output

The original list : [('gfg',), ('is',), ('best',), ('gfg',), ('is',), ('for',), ('geeks',)]
Grouped and counted list is : [(2, 'gfg'), (1, 'best'), (1, 'for'), (1, 'geeks'), (2, 'is')]

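Time complexity: O(n^2) in the worst case - operator.countOf() scans the whole list once for each distinct record.
Auxiliary space: O(n) - used to store the set of distinct records and the result list.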

Method #5: Using defaultdict() and list comprehension

This method uses a defaultdict to count the occurrences of each tuple in the given list. Then, a list comprehension builds the result by prepending each count to its tuple.

Python3

from collections import defaultdict

# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
             ('is', ), ('for', ), ('geeks', )]

# printing original list
print("The original list : " + str(test_list))

# Group and count similar records
# using defaultdict() and list comprehension
counter = defaultdict(int)
for sub in test_list:
    counter[sub] += 1
res = [(count,) + sub for sub, count in counter.items()]

# printing result
print("Grouped and counted list is : " + str(res))

Output

The original list : [('gfg',), ('is',), ('best',), ('gfg',), ('is',), ('for',), ('geeks',)]
Grouped and counted list is : [(2, 'gfg'), (2, 'is'), (1, 'best'), (1, 'for'), (1, 'geeks')]

Time complexity: O(n), where n is the length of the input list test_list.
Auxiliary space: O(k), where k is the number of distinct elements in the input list.
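
For comparison, the same counting can be done with a plain dict and get(), which supplies 0 for a record that has not been seen yet. This is only a sketch of an equivalent alternative, not a replacement for the method above.

```python
# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
             ('is', ), ('for', ), ('geeks', )]

counter = {}
for sub in test_list:
    # get() returns 0 the first time a record is seen
    counter[sub] = counter.get(sub, 0) + 1

res = [(count, ) + sub for sub, count in counter.items()]
print("Grouped and counted list is : " + str(res))
```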

Method #6: Using itertools.groupby() and len()

This approach uses itertools.groupby() to group the elements of the test_list by their value, and then applies len() to each group to get the count of similar records. It then creates a new list with the count and the element value.

Approach:

Import itertools.
Sort the test_list to ensure that similar records are grouped together.
Use itertools.groupby() to group the elements of the test_list by their value.
Use a list comprehension to create a list with the count and the element value for each group.
Print the original list and the new list.

Python3

import itertools

# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
             ('is', ), ('for', ), ('geeks', )]

# sort a copy so that equal records are adjacent
# and the original list is left unchanged
sorted_list = sorted(test_list)

# group and count similar records
res = [(len(list(group)), ) + key
       for key, group in itertools.groupby(sorted_list)]

# print the original list and the new list
print("The original list : " + str(test_list))
print("Grouped and counted list is : " + str(res))

Output

The original list : [('gfg',), ('is',), ('best',), ('gfg',), ('is',), ('for',), ('geeks',)]
Grouped and counted list is : [(1, 'best'), (1, 'for'), (1, 'geeks'), (2, 'gfg'), (2, 'is')]

Time complexity: O(NlogN) due to sorting the list.
Auxiliary space: O(N) to store the new list.
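
The sorting step matters because itertools.groupby() only merges records that sit next to each other. The sketch below runs the same comprehension on the unsorted and the sorted list to show the difference; the variable names are illustrative.

```python
import itertools

# initialize list
test_list = [('gfg', ), ('is', ), ('best', ), ('gfg', ),
             ('is', ), ('for', ), ('geeks', )]

# without sorting, no two equal records are adjacent,
# so every group has length 1 and duplicates are not merged
unsorted_groups = [(len(list(group)), ) + key
                   for key, group in itertools.groupby(test_list)]

# with sorting, equal records become adjacent and are counted together
sorted_groups = [(len(list(group)), ) + key
                 for key, group in itertools.groupby(sorted(test_list))]

print("Without sorting : " + str(unsorted_groups))
print("With sorting : " + str(sorted_groups))
```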

manjeet_04
