Dataframes - Jupyter Notebook
In [2]: # dataframe 1
        import pandas as pd

        # Data as a dictionary of lists (one list per column).
        data = {'city': ['delhi', 'mumbai', 'agra', 'goa'],
                'positive': [20, 21, 19, 18], 'neagtive': [120, 121, 119, 18]}

        # Create DataFrame
        df1 = pd.DataFrame(data)
In [3]: df1
Out[3]:
     city  positive  neagtive
0   delhi        20       120
1  mumbai        21       121
2    agra        19       119
3     goa        18        18
In [4]: # dataframe 2
        data = {'city': ['delhi', 'mumbai', 'agra', 'chennai'],
                'positive': [10, 21, 39, 18], 'neagtive': [12, 101, 129, 118]}

        # Create DataFrame
        df2 = pd.DataFrame(data)
        df2
Out[4]:
      city  positive  neagtive
0    delhi        10        12
1   mumbai        21       101
2     agra        39       129
3  chennai        18       118
In [99]: # concatenate: stack the two DataFrames one below the other.
         df3 = pd.concat([df1, df2])
8/9/24, 9:58 AM Dataframes - Jupyter Notebook
In [100]: df3
Out[100]:
      city  positive  neagtive
0    delhi        20       120
1   mumbai        21       121
2     agra        19       119
3      goa        18        18
0    delhi        10        12
1   mumbai        21       101
2     agra        39       129
3  chennai        18       118
In the result above, the index is not continuous (0, 1, 2, 3, 0, 1, 2, 3) because each
DataFrame keeps its own row labels. To renumber the rows as 0, 1, 2, 3, 4, … pass
ignore_index=True to pd.concat.
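A minimal sketch of that call, assuming cut-down versions of df1 and df2 that keep only the city and positive columns for brevity:

```python
import pandas as pd

# Cut-down copies of the two frames (illustrative; only two columns kept).
df1 = pd.DataFrame({"city": ["delhi", "mumbai", "agra", "goa"],
                    "positive": [20, 21, 19, 18]})
df2 = pd.DataFrame({"city": ["delhi", "mumbai", "agra", "chennai"],
                    "positive": [10, 21, 39, 18]})

# ignore_index=True discards the original row labels and renumbers from 0.
df3 = pd.concat([df1, df2], ignore_index=True)
print(df3.index.tolist())
```

The original labels are thrown away, so this option only makes sense when the index carries no meaning of its own.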
In [102]: df3
Out[102]:
      city  positive  neagtive
0    delhi        20       120
1   mumbai        21       121
2     agra        19       119
3      goa        18        18
4    delhi        10        12
5   mumbai        21       101
6     agra        39       129
7  chennai        18       118
In [104]: df3
Out[104]:
             city  positive  neagtive
first  0    delhi        20       120
       1   mumbai        21       121
       2     agra        19       119
       3      goa        18        18
second 0    delhi        10        12
       1   mumbai        21       101
       2     agra        39       129
       3  chennai        18       118
In [105]: df3.loc['first']
Out[105]:
     city  positive  neagtive
0   delhi        20       120
1  mumbai        21       121
2    agra        19       119
3     goa        18        18
In [106]: df3.loc['first', 0]
In [107]: df3.loc['second']
Out[107]:
      city  positive  neagtive
0    delhi        10        12
1   mumbai        21       101
2     agra        39       129
3  chennai        18       118
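The cell that built this labelled frame is not shown. The output is consistent with passing keys to pd.concat, so the sketch below assumes that call. The frames are cut down to two columns for brevity, and the tuple form of .loc is used for the single-row lookup because it is unambiguous.

```python
import pandas as pd

# Cut-down copies of the two frames (illustrative; only two columns kept).
df1 = pd.DataFrame({"city": ["delhi", "mumbai", "agra", "goa"],
                    "positive": [20, 21, 19, 18]})
df2 = pd.DataFrame({"city": ["delhi", "mumbai", "agra", "chennai"],
                    "positive": [10, 21, 39, 18]})

# Assumption: keys= is how the 'first' / 'second' labels were produced.
# It adds an outer index level that records which frame each row came from.
df3 = pd.concat([df1, df2], keys=["first", "second"])

first_block = df3.loc["first"]     # all rows that came from df1
one_row = df3.loc[("first", 0)]    # one row, addressed by (key, original label)

print(first_block)
print(one_row["city"])
```

Selecting by the outer key returns the original frame with its own 0..3 labels, which is why the outputs above look like df1 and df2 again.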
In [108]: # To combine the two DataFrames horizontally (side by side), pass axis=1
          df3 = pd.concat([df1, df2], axis=1)
          df3
Another example: create two DataFrames and concatenate them horizontally (axis=1).
In [5]: # dataframe 1
        import pandas as pd

        # Data as a dictionary of lists (one list per column).
        data = {'city': ['delhi', 'mumbai', 'agra', 'goa'],
                'temperature': [20, 21, 19, 18]}

        # Create DataFrame
        df1 = pd.DataFrame(data)
In [110]: df1
Out[110]:
     city  temperature
0   delhi           20
1  mumbai           21
2    agra           19
3     goa           18
In [6]: # dataframe 2
        import pandas as pd

        # Data as a dictionary of lists; note the cities are in a different order than in df1.
        data = {'city': ['agra', 'mumbai', 'goa', 'delhi'],
                'windspeed': [2, 2, 1, 1]}

        # Create DataFrame
        df2 = pd.DataFrame(data)
In [112]: df2
Out[112]:
     city  windspeed
0    agra          2
1  mumbai          2
2     goa          1
3   delhi          1
     city  temperature    city  windspeed
0   delhi           20    agra          2
1  mumbai           21  mumbai          2
2    agra           19     goa          1
3     goa           18   delhi          1
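In the output above, the rows do not hold records for the same city, because concat with
axis=1 matches rows by index label, not by city name. To fix this, we can pass an explicit
index when creating the DataFrames so that each city gets the same label in both.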
     city  temperature    city  windspeed
0   delhi           20   delhi          1
1  mumbai           21  mumbai          2
2    agra           19    agra          2
3     goa           18     goa          1
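The cell that produced the aligned result is not shown. One way to get it, sketched here as an assumption, is to give the second frame an index whose labels match the positions of the same cities in the first frame (delhi=0, mumbai=1, agra=2, goa=3):

```python
import pandas as pd

df1 = pd.DataFrame({"city": ["delhi", "mumbai", "agra", "goa"],
                    "temperature": [20, 21, 19, 18]})

# Assumption: the index values below are chosen by hand so that each city
# in df2 carries the same row label it has in df1.
df2 = pd.DataFrame({"city": ["agra", "mumbai", "goa", "delhi"],
                    "windspeed": [2, 2, 1, 1]},
                   index=[2, 1, 3, 0])

# axis=1 places the frames side by side, aligning rows on index labels.
df3 = pd.concat([df1, df2], axis=1)
print(df3)
```

Hand-written index lists become error-prone as the data grows; matching on the city column with a merge avoids having to build the labels manually.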
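Append:
The concat function can combine DataFrames along either rows or columns, while the
append method only combines DataFrames along rows. Note that DataFrame.append was
deprecated in pandas 1.4 and removed in pandas 2.0; on current versions use pd.concat.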
In [8]: # dataframe 1
        import pandas as pd

        # Data as a dictionary of lists (one list per column).
        data = {'city': ['delhi', 'mumbai', 'agra'],
                'positive': [20, 21, 19], 'neagtive': [120, 121, 119]}

        # Create DataFrame
        df1 = pd.DataFrame(data)
        df1
Out[8]:
     city  positive  neagtive
0   delhi        20       120
1  mumbai        21       121
2    agra        19       119
In [117]: # dataframe 2
          import pandas as pd

          # Data as a dictionary of lists (one list per column).
          data = {'city': ['delhi', 'mumbai', 'agra'],
                  'positive': [210, 211, 19], 'neagtive': [12, 121, 109]}

          # Create DataFrame
          df2 = pd.DataFrame(data)
          df2
Out[117]:
     city  positive  neagtive
0   delhi       210        12
1  mumbai       211       121
2    agra        19       109
In [119]: df3
Out[119]:
     city  positive  neagtive
0   delhi        20       120
1  mumbai        21       121
2    agra        19       119
0   delhi       210        12
1  mumbai       211       121
2    agra        19       109
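The cell that built df3 is not shown; presumably df2 was appended to df1. On pandas 2.0 and later, where DataFrame.append is no longer available, the same row-wise result can be sketched with pd.concat. The frames are cut down to two columns for brevity.

```python
import pandas as pd

# Cut-down copies of the two frames (illustrative; only two columns kept).
df1 = pd.DataFrame({"city": ["delhi", "mumbai", "agra"],
                    "positive": [20, 21, 19]})
df2 = pd.DataFrame({"city": ["delhi", "mumbai", "agra"],
                    "positive": [210, 211, 19]})

# Row-wise stacking: df2's rows are placed under df1's rows,
# and each frame keeps its own 0..2 row labels.
df3 = pd.concat([df1, df2])
print(df3)
```

Passing ignore_index=True here would renumber the rows 0..5 instead of repeating 0, 1, 2.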
We can join the DataFrames in different ways:
1) Inner join: only the records common to both DataFrames are returned.
2) Left join: all records of the left DataFrame, plus only the matching data from the right
DataFrame.
3) Right join: all records of the right DataFrame, plus only the matching data from the left
DataFrame.
4) Full outer join: all records from both the left and the right DataFrame; where there is no
match, the missing values are filled with NaN.
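The four join types above can be sketched with pd.merge and its how parameter. The two small frames below are hypothetical, chosen so that each side has a city the other lacks:

```python
import pandas as pd

# Hypothetical frames: goa and agra exist only on the left, chennai only on the right.
left = pd.DataFrame({"city": ["delhi", "mumbai", "agra", "goa"],
                     "temperature": [20, 21, 19, 18]})
right = pd.DataFrame({"city": ["delhi", "mumbai", "chennai"],
                      "windspeed": [1, 2, 3]})

cities = {}
for how in ("inner", "left", "right", "outer"):
    merged = pd.merge(left, right, on="city", how=how)
    cities[how] = sorted(merged["city"])
    print(how, cities[how])
```

With these frames, the inner join keeps only delhi and mumbai, while the outer join keeps all five cities and fills the unmatched temperature or windspeed cells with NaN.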
In [123]: # In the above output we can't see any difference, because all records were common
In [124]: # dataframe 1
          import pandas as pd

          # Data as a dictionary of lists (one list per column).
          data = {'city': ['delhi', 'mumbai', 'agra', 'goa'],
                  'positive': [20, 21, 19, 88], 'neagtive': [120, 121, 119, 133]}

          # Create DataFrame
          df1 = pd.DataFrame(data)
          df1
Out[124]:
     city  positive  neagtive
0   delhi        20       120
1  mumbai        21       121
2    agra        19       119
3     goa        88       133
In [9]: # dataframe 2
        import pandas as pd

        # Data as a dictionary of lists (one list per column).
        data = {'city': ['delhi', 'mumbai', 'agra'],
                'positive': [210, 211, 19], 'neagtive': [12, 121, 109]}

        # Create DataFrame
        df2 = pd.DataFrame(data)
        df2
Out[9]:
     city  positive  neagtive
0   delhi       210        12
1  mumbai       211       121
2    agra        19       109
In [127]: # The record for goa is missing because goa is not common to both DataFrames
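The merge cell itself is not shown, so the following is only a sketch of an inner join that drops goa. It assumes the frames are matched on the city column, keeps just the positive column for brevity, and uses illustrative suffixes to tell the two same-named columns apart.

```python
import pandas as pd

# Cut-down copies of the two frames (illustrative; only two columns kept).
df1 = pd.DataFrame({"city": ["delhi", "mumbai", "agra", "goa"],
                    "positive": [20, 21, 19, 88]})
df2 = pd.DataFrame({"city": ["delhi", "mumbai", "agra"],
                    "positive": [210, 211, 19]})

# how="inner" (the default) keeps only cities present in both frames.
# suffixes= renames the overlapping 'positive' columns; the names are illustrative.
df3 = pd.merge(df1, df2, on="city", how="inner", suffixes=("_df1", "_df2"))
print(df3)
```

Switching to how="left" or how="outer" would bring the goa row back, with NaN in the column that comes from df2.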
https://github.com/codebasics/py/blob/master/pandas/9_merge/pandas_merge.ipynb