
Fetch Common Rows Between Two DataFrames in Python Pandas Using Concat
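To fetch the common rows between two DataFrames, stack them with the concat() function, group the combined rows by all columns, and keep the groups that contain more than one row. Let us create DataFrame1 with two columns −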
dataFrame1 = pd.DataFrame( { "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'], "Reg_Price": [1000, 1500, 1100, 800, 1100, 900] } )
Create DataFrame2 with two columns −
dataFrame2 = pd.DataFrame( { "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'], "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000] } )
Concatenate the two DataFrames with concat(), so that all rows sit in a single DataFrame −
dfRes = pd.concat([dataFrame1, dataFrame2])
Reset the index, so that every row gets a unique label −
dfRes = dfRes.reset_index(drop=True)
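The effect of the reset can be checked with a small sketch. It assumes the two DataFrames defined above; the names stacked, index_before and index_after are illustrative only and are not part of the original code.

```python
import pandas as pd

dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1000, 1500, 1100, 800, 1100, 900],
})
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000],
})

# concat() keeps the original labels of each input, so 0..5 appears twice
stacked = pd.concat([dataFrame1, dataFrame2])
index_before = list(stacked.index)

# reset_index(drop=True) replaces them with unique labels 0..11
dfRes = stacked.reset_index(drop=True)
index_after = list(dfRes.index)

print(index_before)
print(index_after)
```

Passing ignore_index=True to concat() should give the same unique labels in a single call, if a separate reset step is not wanted.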
Group the combined rows by all columns, so that identical rows fall into the same group −
dfGroup = dfRes.groupby(list(dfRes.columns))
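Get the number of row labels in each group. A group with more than one label holds a row that occurs more than once in the combined data, so it is a common row (this assumes neither DataFrame contains duplicate rows of its own). The first label of each such group is kept −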
res = [k[0] for k in dfGroup.groups.values() if len(k) > 1]
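To see what this step is counting, the following sketch prints the size of every group next to the labels that get selected. It reuses the DataFrames from above; the name sizes, the int() conversion and the sorted() call are additions for readable, deterministic output and are not part of the original code.

```python
import pandas as pd

dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1000, 1500, 1100, 800, 1100, 900],
})
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000],
})

dfRes = pd.concat([dataFrame1, dataFrame2]).reset_index(drop=True)
dfGroup = dfRes.groupby(list(dfRes.columns))

# Number of rows in each (Car, Reg_Price) group; 2 means the row is in both inputs
sizes = dfGroup.size()
print(sizes)

# First row label of every group with more than one member, in ascending order
res = sorted(int(k[0]) for k in dfGroup.groups.values() if len(k) > 1)
print(res)
```

With this data, three groups should have a size of 2, one for each row that the two DataFrames share.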
Example
Following is the code −
import pandas as pd

# Create DataFrame1
dataFrame1 = pd.DataFrame(
   {
      "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
      "Reg_Price": [1000, 1500, 1100, 800, 1100, 900]
   }
)

print("DataFrame1 ...")
print(dataFrame1)

# Create DataFrame2
dataFrame2 = pd.DataFrame(
   {
      "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
      "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000]
   }
)

print("\nDataFrame2 ...")
print(dataFrame2)

# stack both DataFrames into one
dfRes = pd.concat([dataFrame1, dataFrame2])

# reset index so that every row has a unique label
dfRes = dfRes.reset_index(drop=True)

# group identical rows together by grouping on all columns
dfGroup = dfRes.groupby(list(dfRes.columns))

# a group with more than one label holds a row present in both DataFrames;
# keep the first label of each such group
res = [k[0] for k in dfGroup.groups.values() if len(k) > 1]

print("\nCommon rows...")
print(dfRes.reindex(res))
Output
This will produce the following output (the order of the common rows may differ depending on the Python and pandas versions) −
DataFrame1 ...
       Car  Reg_Price
0      BMW       1000
1    Lexus       1500
2     Audi       1100
3    Tesla        800
4  Bentley       1100
5   Jaguar        900

DataFrame2 ...
       Car  Reg_Price
0      BMW       1200
1    Lexus       1500
2     Audi       1000
3    Tesla        800
4  Bentley       1100
5   Jaguar       1000

Common rows...
       Car  Reg_Price
3    Tesla        800
1    Lexus       1500
4  Bentley       1100
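As a cross-check on the rows listed above, an inner merge on all columns should return the same three rows. This is a different technique from the concat() approach shown in this article and is sketched here only for verification; the name common is illustrative.

```python
import pandas as pd

dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1000, 1500, 1100, 800, 1100, 900],
})
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
    "Reg_Price": [1200, 1500, 1000, 800, 1100, 1000],
})

# An inner join on every column keeps only rows present in both DataFrames
common = pd.merge(dataFrame1, dataFrame2, how="inner",
                  on=list(dataFrame1.columns))
print(common)
```

Unlike the concat() result, the merged DataFrame gets a fresh index, so the original row labels 1, 3 and 4 are not preserved.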