Princeton Research
Data Management
Workshop 2020
Co-sponsored by the Center for Digital Humanities, the Center for Statistics and Machine Learning, the Office of
the Dean for Research, and Data-Driven Social Science Initiative
Organized by Princeton University Library’s Princeton Research
Data Service, Princeton Institute for Computational Science and
Engineering, and OIT Research Computing
Day Two:
Break-out Session:
Python, Numpy, Pandas
Python, Numpy, and Pandas
Henry Schreiner, PICSiE/PHY
henryfs@princeton.edu
2020 Research Data Management Workshop
Python for data science
● Second most popular
language on GitHub
● General purpose
● Only Data Science
language in top 10
● Over 200K PyPI
packages, 1.6 million
releases
Python for data science
● Another metric (PYPL, Google-based) has it #1
● Data Science languages shown below
● Python fastest growing
● R peaked around 2017
● Others also in decline
● Note the log scale!
Timeline
● 1994: Python 1.0 released
● 1995: First array package: Numeric
● 2003: Matplotlib
● 2005: Numeric and numarray merged into Numpy
● 2008: Pandas introduced
● 2012: The Anaconda Python distribution
Timeline
● 2012: Numba JIT compiler
● 2014: IPython becomes Jupyter project & notebook
● 2016: LIGO's discovery: Jupyter Notebook + Python
● 2017: Google releases TensorFlow 1.0 (Python)
● Now: Most major machine learning libraries are primarily or
exclusively used via Python
Why Python?
What makes Python
special?
● Great interactivity
● General purpose
● Weaknesses filled
by libraries and
services
Python: the language
● Simple
● Easy to
learn
● Flexible and
powerful
● Object
Oriented
def square(x):
    return x**2
print(square(4))
# Prints 16
IPython
● Adds interactive features to
Python
○ Timing chunks of code
○ Shell-like features
○ Fancy display system
%cd my_dir
%%timeit
run_long()
! ./program
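Outside IPython, the timing that `%%timeit` provides can be approximated with the standard-library `timeit` module. The following is a minimal sketch, not the magic's actual implementation; `run_long` here is a hypothetical stand-in for the function named on the slide.

```python
import timeit


def run_long():
    """Hypothetical stand-in for the slide's run_long(): sum 100,000 squares."""
    return sum(i * i for i in range(100_000))


# Call the function 10 times per measurement and take 3 measurements.
timings = timeit.repeat(run_long, number=10, repeat=3)

# The fastest measurement is usually the least disturbed by other processes.
best_per_call = min(timings) / 10
print(f"best of 3: {best_per_call * 1e3:.3f} ms per call")
```

Unlike this sketch, `%%timeit` chooses the number of loops automatically.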
Jupyter Notebooks
● Cell-based HTML
document
● Supports many
kernels (IPython was
first and is the most
popular)
● Interleave
documentation, code,
and output
JupyterLab
● Holds multiple
views of
○ Notebooks
○ Output
○ Editors
○ Terminals
JupyterHub
● Multiuser notebook or lab instances
● Available at mybinder.org or through Princeton Research
Computing
Example: Runge-Kutta static notebook, runnable mybinder
Libraries
PyPI
● The core service for
Python libraries
● Uses pip to install
● Environment
management separate
Anaconda
● Can package Python
and complex libraries
● Uses conda to install
● Environment manager
too (reproducible)
● conda-forge is
community effort
Numpy
● Adds an array type
● Fast computations
array-at-a-time
● Python and Numpy now
define a standard protocol
for arrays
● A library that replaces
languages like ADL
import numpy as np
v = np.array([1,2,3])
print(v**2)
# Prints [1 4 9]
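To illustrate what "array-at-a-time" means, the sketch below computes the same squares twice: once with an explicit Python loop and once as a single Numpy expression. The variable names are illustrative and not taken from the talk.

```python
import numpy as np

values = np.arange(1, 6)  # the integers 1 through 5

# Element-by-element: the loop runs in the Python interpreter.
squares_loop = [int(v) ** 2 for v in values]

# Array-at-a-time: one expression, with the loop running in compiled code.
squares_vec = values ** 2

print(squares_loop)       # [1, 4, 9, 16, 25]
print(squares_vec.sum())  # 55
```

On arrays this small the two approaches are indistinguishable in practice. The speed difference generally becomes noticeable as the array grows.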
Pandas
● Tabular data
○ A library that can replace tools like R and Excel
○ Designed with interactivity in mind
● Other libraries mimic Pandas’ API
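As a small taste of tabular work in Pandas, the sketch below builds a table, adds a derived column, and summarizes by group. The data and column names are made up purely for illustration.

```python
import pandas as pd

# Hypothetical measurements, invented for this example only.
df = pd.DataFrame(
    {
        "sample": ["a", "a", "b", "b"],
        "value": [1.0, 3.0, 2.0, 6.0],
    }
)

# Derived columns are computed array-at-a-time, as in Numpy.
df["doubled"] = df["value"] * 2

# Spreadsheet-style summary: mean value per sample.
means = df.groupby("sample")["value"].mean()

print(df)
print(means)
```

In a notebook, leaving `df` as the last expression in a cell renders it as a formatted table, which is part of what makes Pandas comfortable to use interactively.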
Numba
● Adds a full JIT (just-in-time) compiler to Python
● Compiles normal Python functions to machine code via LLVM
● Supports a growing subset of Python and Numpy
● Can approach the speed of compiled languages such as C
● Supports parallel computation, GPUs, and more
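A minimal sketch of the usual Numba pattern: decorate an ordinary function and let it compile on first call. The function `sum_of_squares` is a made-up example. The `ImportError` fallback is an assumption added only so the sketch still runs, uncompiled, where Numba is not installed.

```python
try:
    from numba import njit  # Numba's "nopython" JIT decorator
except ImportError:
    def njit(func):
        """Fallback for this sketch only: return the function uncompiled."""
        return func


@njit
def sum_of_squares(n):
    # An ordinary Python loop; with Numba it is compiled on the first call.
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(10))  # 285
```

The first call includes compilation time, so any timing comparison should measure a second call.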
Other libraries of note
● CuPy: CUDA with a Numpy interface
● TensorFlow/PyTorch: Machine learning libraries
● Matplotlib: The plotting library for Python
● PyQt/PySide: Bindings to Qt Graphical User Interface
● pybind11: Easy C++ bindings
Summary
● Python is wildly popular, simple to learn, and well
supported
● Python has an impressive collection of tools
○ Interactivity: IPython, Jupyter
○ Package delivery: PyPI (pip), Conda
○ Libraries: Numpy, Pandas, and many more
Demo
● The second half is devoted to a Pandas demo session
RDM 2020: Python, Numpy, and Pandas

  • 1. Princeton Research Data Management Workshop 2020 Co-sponsored by the Center for Digital Humanities, the Center for Statistics and Machine Learning, the Office of the Dean for Research, and Data-Driven Social Science Initiative Organized by Princeton University Library’s Princeton Research Data Service, Princeton Institute for Computational Science and Engineering, and OIT Research Computing Day Two: Break-out Session: Python, Numpy, Pandas
  • 2. Python, Numpy, and Pandas Henry Schreiner, PICSiE/PHY henryfs@princeton.edu 2020 Research Data Management Workshop
  • 3. Python for data science ● Second most popular language on GitHub ● General purpose ● Only Data Science language in top 10 ● Over 200K PyPI packages, 1.6 million releases
  • 4. Python for data science ● Another metric (PYPL, Google-based) has it #1 ● Data Science languages shown below ● Python fastest growing ● R peaked around 2017 ● Others also in decline ● Note the log scale!
  • 5. Timeline ● 1994: Python 1.0 released ● 1995: First array package: Numeric ● 2003: Matplotlib ● 2005: Numeric and numarray merged into Numpy ● 2008: Pandas introduced ● 2012: The Anaconda python distribution
  • 6. Timeline ● 2012: Numba JIT compiler ● 2014: IPython becomes Jupyter project & notebook ● 2016: LIGO's discovery: Jupyter Notebook + Python ● 2017: Google releases TensorFlow 1.0 (Python) ● Now: All Machine Learning libraries are primarily or exclusively used via Python
  • 7. Why Python? What makes Python special? ● Great interactivity ● General purpose ● Weaknesses filled by libraries and services
  • 8. Python: the language ● Simple ● Easy to learn ● Flexible and powerful ● Object Oriented def square(x): return x**2 print(square(4)) # Prints 16
  • 9. IPython ● Adds interactive features to Python ○ Timing chunks of code ○ Shell-like features ○ Fancy display system %cd my_dir %%timeit run_long() ! ./program
  • 10. Jupyter Notebooks ● Cell-based HTML document ● Supports many kernels (IPython was first and is the most popular) ● Interleave documentation, code, and output
  • 11. Jupyter Lab ● Holds multiple views of ○ Notebooks ○ Output ○ Editors ○ Terminals
  • 12. Jupyter Hub ● Multiuser notebook or lab instances ● Available at mybinder.org or through Princeton Research Computing Example: Runge-Kutta static notebook, runnable mybinder
  • 13. Libraries PyPI ● The core service for Python libraries ● Uses pip to install ● Environment management separate Anaconda ● Can package Python and complex libraries ● Uses conda to install ● Environment manager too (reproducible) ● conda-forge is community effort
  • 14. Numpy ● Adds an array type ● Fast computations array-at-a-time ● Python and Numpy now define a standard protocol for arrays ● A library that replaces languages like ADL import numpy as np v = np.array([1,2,3]) print(v**2) # Prints [1 4 9]
  • 15. Pandas ● Tabular data ○ A library that replaces languages like R and Excel ○ Designed with interactivity in mind ● Other libraries mimic Pandas’ API
  • 16. Numba ● Adds a full JIT (just-in-time) compiler to Python ● Compiles ordinary Python functions to machine code via LLVM ● Supports a growing subset of Python and Numpy ● Can be as fast as any compiled language ● Supports parallel computation, GPUs, and more
  • 17. Other libraries of note ● CuPy: CUDA with a Numpy interface ● TensorFlow/PyTorch: Machine learning libraries ● Matplotlib: The plotting library for Python ● PyQt/PySide: Bindings to the Qt Graphical User Interface toolkit ● pybind11: Easy C++ bindings
  • 18. Summary ● Python is wildly popular, simple to learn, and well supported ● Python has an impressive collection of tools ○ Interactivity: IPython, Jupyter ○ Package delivery: PyPI (pip), Conda ○ Libraries: Numpy, Pandas, and many more
  • 19. Demo ● The second half is devoted to a Pandas demo session
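
As a rough illustration of the "array-at-a-time" idea from the Numpy slide and the tabular data model from the Pandas slide, here is a minimal sketch. It is not taken from the demo session: the column names `group` and `value` and all of the numbers are made up for the example.

```python
import numpy as np
import pandas as pd

# Array-at-a-time: one expression squares every element, with no Python loop.
v = np.array([1, 2, 3])
squares = v**2
print(squares)  # [1 4 9]

# Tabular data: a small hypothetical table (values invented for illustration).
df = pd.DataFrame(
    {
        "group": ["a", "a", "b"],
        "value": [1.0, 3.0, 5.0],
    }
)

# Mean of "value" within each group; the result is a Series indexed by group.
means = df.groupby("group")["value"].mean()
print(means)
```

In both cases the operation is written once for the whole array or column, and the library runs the loop in compiled code. That is where the speed claimed on the Numpy slide comes from.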