Pandas Profiling Library For EDA

1) The document imports pandas and reads a CSV file called 'train.csv' into a dataframe.
2) It displays the head of the dataframe to show the column names and first 5 rows of data.
3) It installs the pandas-profiling package and uses it to generate a profile report on the dataframe, exporting the results to an HTML file called 'output.html'.

Uploaded by

Yajur Agarwal
Copyright
© All Rights Reserved

In [2]: import pandas as pd

In [3]: df = pd.read_csv('train.csv')

In [5]: df.head()

Out[5]:
   PassengerId  Survived  Pclass  Name                                               Sex     Age   SibSp  Parch  Ticket            Fare     Cabin  Embarked
0  1            0         3       Braund, Mr. Owen Harris                            male    22.0  1      0      A/5 21171         7.2500   NaN    S
1  2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0  1      0      PC 17599          71.2833  C85    C
2  3            1         3       Heikkinen, Miss. Laina                             female  26.0  0      0      STON/O2. 3101282  7.9250   NaN    S
3  4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0  1      0      113803            53.1000  C123   S
4  5            0         3       Allen, Mr. William Henry                           male    35.0  0      0      373450            8.0500   NaN    S

In [6]: !pip install pandas-profiling

Requirement already satisfied: pandas-profiling in c:\users\91842\anaconda3\lib\site-packages (2.11.0) 


Requirement already satisfied: seaborn>=0.10.1 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (0.10.1)
Requirement already satisfied: tqdm>=4.48.2 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (4.59.0)
Requirement already satisfied: scipy>=1.4.1 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (1.5.0)
Requirement already satisfied: htmlmin>=0.1.12 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (0.1.12)
Requirement already satisfied: tangled-up-in-unicode>=0.0.6 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (0.0.7)
Requirement already satisfied: missingno>=0.4.2 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (0.4.2)
Requirement already satisfied: pandas!=1.0.0,!=1.0.1,!=1.0.2,!=1.1.0,>=0.25.3 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (1.0.5)
Requirement already satisfied: confuse>=1.0.0 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (1.4.0)
Requirement already satisfied: jinja2>=2.11.1 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (2.11.2)
Requirement already satisfied: attrs>=19.3.0 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (19.3.0)
Requirement already satisfied: phik>=0.10.0 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (0.11.2)
Requirement already satisfied: visions[type_image_path]==0.6.0 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (0.6.0)
Requirement already satisfied: requests>=2.24.0 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (2.24.0)
Requirement already satisfied: numpy>=1.16.0 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (1.18.5)
Requirement already satisfied: ipywidgets>=7.5.1 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (7.5.1)
Requirement already satisfied: joblib in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (0.16.0)
Requirement already satisfied: matplotlib>=3.2.0 in c:\users\91842\anaconda3\lib\site-packages (from pandas-profiling) (3.2.2)
Requirement already satisfied: python-dateutil>=2.6.1 in c:\users\91842\anaconda3\lib\site-packages (from pandas!=1.0.0,!=1.0.1,!=1.0.2,!=1.1.0,>=0.25.3->pandas-profiling) (2.8.1)

In [7]: from pandas_profiling import ProfileReport
prof = ProfileReport(df)
prof.to_file(output_file='output.html')

Summarize dataset: 0%| | 0/25 [00:00<?, ?it/s]

Generate report structure: 0%| | 0/1 [00:00<?, ?it/s]

Render HTML: 0%| | 0/1 [00:00<?, ?it/s]

Export report to file: 0%| | 0/1 [00:00<?, ?it/s]
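
The report written to 'output.html' gives a per-column overview of the dataframe. As a rough illustration of the kind of figures such an overview contains, the sketch below rebuilds some columns of the five rows shown by df.head() and computes the dtype, missing-value count and distinct-value count for each column with plain pandas. This is a hand-rolled approximation, not how pandas-profiling computes its report; the helper name quick_overview and the variable sample are hypothetical and do not appear in the notebook.

```python
import pandas as pd

# A subset of the columns from the five rows displayed by df.head(),
# rebuilt in memory so the sketch runs without train.csv.
sample = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5],
    "Survived": [0, 1, 1, 1, 0],
    "Pclass": [3, 1, 3, 1, 3],
    "Sex": ["male", "female", "female", "female", "male"],
    "Age": [22.0, 38.0, 26.0, 35.0, 35.0],
    "Fare": [7.25, 71.2833, 7.925, 53.1, 8.05],
    "Cabin": [None, "C85", None, "C123", None],  # None marks the NaN cells
    "Embarked": ["S", "C", "S", "S", "S"],
})


def quick_overview(frame):
    """Hypothetical helper: one row per column of `frame`."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),   # column type as text
        "missing": frame.isna().sum(),       # number of missing cells
        "distinct": frame.nunique(),         # distinct non-missing values
    })


overview = quick_overview(sample)
print(overview)
```

On the full dataframe the same helper could be called as quick_overview(df); the HTML report goes well beyond these three figures.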
