3-PythonInstallation Sammatrix

The document discusses installing and using Anaconda and Conda for Python data analysis. It provides detailed instructions for installing Anaconda on Windows and confirming the installation. It also describes what packages are included in Anaconda and how to use Conda to manage packages and environments.

Uploaded by

Pranjal Tiwari
Copyright
© All Rights Reserved

Data Analysis Using Python

Samatrix Consulting Pvt Ltd


Python Installation
• Since everyone uses Python for different applications, there is no
single solution for setting up Python and required add-on packages.
• We have provided detailed instructions to get set up.
• We recommend using the free Anaconda distribution.
What is Anaconda?
• Anaconda is a free and open-source distribution of Python and R
programming languages for data science and machine learning.
• It is an easy-to-install collection of high-performance Python libraries
along with conda.
• Anaconda helps you simplify your Python deployment and later on
your package management.
• Anaconda comes with 1500 packages (including the package
management system conda) and a GUI named Anaconda Navigator.
• Anaconda Navigator also installs some applications by default, such
as Jupyter Notebook, Spyder IDE and RStudio (for R).
Why do we need Anaconda Installation?
• Several scientific packages need a specific version of Python.
• It is challenging to keep them up to date and prevent them from breaking.
• The Anaconda Distribution helps in management of multiple Python
versions on one computer and provides a large collection of highly
optimized, commonly used data science libraries to get you started
faster.
Installing Anaconda on Windows
• Download the Anaconda installer from the following link:
https://www.anaconda.com/products/individual#windows
• Double click the installer to launch.
• Note: If you encounter issues during installation, temporarily disable your
anti-virus software during the install, then re-enable it after the installation
concludes. If you installed for all users, uninstall Anaconda, re-install it
for your user only, and try again.
• Click Next.
• Read the licensing terms and click “I Agree”.
• Select an install for “Just Me” unless you’re installing for all users (which
requires Windows Administrator privileges) and click Next.
• Select a destination folder to install Anaconda and click the Next button.
• Choose whether to add
Anaconda to your PATH
environment variable. We
recommend not adding
Anaconda to the PATH
environment variable, since
this can interfere with other
software. Instead, use
Anaconda software by
opening Anaconda Navigator
or the Anaconda Prompt
from the Start Menu.
• Choose whether to register Anaconda as your default Python. Unless
you plan on installing and running multiple versions of Anaconda or
multiple versions of Python, accept the default and leave this box
checked.
• Click the Install button. If you want to watch the packages Anaconda
is installing, click Show Details.
• Click the Next button.
• Optional: To install PyCharm for Anaconda, click on the link to
https://www.anaconda.com/pycharm.
• Or to install Anaconda without PyCharm, click the Next button.
• After a successful installation you
will see the “Thanks for installing
Anaconda” dialog box:
• If you wish to read more about
Anaconda.org and how to get
started with Anaconda, check the
boxes “Anaconda Individual
Edition Tutorial” and “Learn
more about Anaconda”.
• Click the Finish button.
Confirm Anaconda Installation
• You can confirm that Anaconda is
installed and working with Anaconda
Navigator or conda.
• Anaconda Navigator is a graphical user
interface that is automatically installed
with Anaconda.
• Navigator will open if the installation
was successful.
• Windows: Click Start, search or select
Anaconda Navigator from the menu.
• If you prefer using a command line
interface (CLI), you can use conda to
verify the installation using Anaconda
Prompt on Windows or terminal on
Linux and macOS.
• To open Anaconda Prompt: Windows:
Click Start, search, or select Anaconda
Prompt from the menu.
• After opening Anaconda Prompt or the terminal, choose any of the
following methods to verify:
• Enter conda list. If Anaconda is installed and working, this will display a list of
installed packages and their versions.
• Enter the command python. This command runs the Python shell. If Anaconda
is installed and working, the version information it displays when it starts up
will include “Anaconda”. To exit the Python shell, enter the command quit().
• Open Anaconda Navigator with the command anaconda-navigator. If Anaconda is
installed properly, Anaconda Navigator will open.
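The checks above can also be sketched from Python itself. This is a minimal illustration, not an official Anaconda tool: the function name describe_interpreter is hypothetical, and the exact wording of the version banner varies between releases, so a missing "Anaconda" match is not proof of a broken install.

```python
import shutil
import sys


def describe_interpreter():
    """Collect a few facts about the running Python interpreter."""
    return {
        # Full version banner, the same text the Python shell prints on startup.
        "version": sys.version,
        # Path of the interpreter that is actually running this code.
        "executable": sys.executable,
        # True if the banner mentions Anaconda (the check described above).
        "mentions_anaconda": "anaconda" in sys.version.lower(),
        # Location of the conda command on PATH, or None if it is not found.
        "conda_on_path": shutil.which("conda"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Running this inside Anaconda Prompt should report an executable path inside the Anaconda installation folder.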
Packages included in Anaconda
The following packages are included in Anaconda 4.4+, or can be installed with
"conda install PACKAGENAME":
• NumPy (numpy.org): N-dimensional array for numerical computation
• SciPy (scipy.org): Scientific computing library for Python
• Matplotlib (matplotlib.org): 2D plotting library for Python
• Pandas (pandas.pydata.org): Powerful Python data structures and data
analysis toolkit
• Seaborn (seaborn.pydata.org/): Statistical graphics library for Python
• Bokeh (bokeh.pydata.org): Interactive web visualization library
• Scikit-Learn (scikit-learn.org/stable): Python modules for machine learning
and data mining
• NLTK (nltk.org): Natural language toolkit
• Jupyter Notebook (jupyter.org): Web app that allows you to create and share
documents that contain live code, equations, visualizations and explanatory text
• R essentials (https://docs.anaconda.com/anaconda/user-guide/tasks/use-r-language):
80+ of the most used R packages for data science can be installed with
“conda install r-essentials”
• R package list (https://docs.anaconda.com/anaconda/packages/r-language-pkg-docs)
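One way to see which of the Python packages listed above are present in the current environment is to ask Python whether each one can be found. The sketch below is an illustration only: the PACKAGES list and the find_missing helper are names chosen for this example, and the import names (for instance "sklearn" for Scikit-Learn and "notebook" for Jupyter Notebook) differ from the display names.

```python
import importlib.util

# Import names for the packages listed above; trim or extend as needed.
PACKAGES = ["numpy", "scipy", "matplotlib", "pandas", "seaborn",
            "bokeh", "sklearn", "nltk", "notebook"]


def find_missing(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(PACKAGES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All listed packages are available.")
```

Any package reported as not installed can then be added with conda install followed by the package name.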
What is Miniconda?
• Miniconda is a free minimal installer for conda.
• It is a small, bootstrap version of Anaconda that includes only conda,
Python, the packages they depend on, and a small number of other
useful packages, including pip, zlib and a few others.
• Use the conda install command to install 720+ additional conda
packages from the Anaconda repository.
What is Conda?
• Conda is an open-source package management system and
environment management system that runs on Windows, macOS and
Linux.
• Conda quickly installs, runs and updates packages and their
dependencies.
• Conda easily creates, saves, loads and switches between
environments on your local computer.
• It was created for Python programs, but it can package and distribute
software for any language.
• Conda as a package manager helps you find and install packages.
• If you need a package that requires a different version of Python, you
do not need to switch to a different environment manager, because
conda is also an environment manager.
• With just a few commands, you can set up a totally separate
environment to run that different version of Python, while continuing
to run your usual version of Python in your normal environment.
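As a rough sketch of what "a few commands" can look like, the helper below builds the two conda commands commonly used to create and then activate a separate environment. The function name, the environment name "py39" and the Python version are placeholders chosen for this example, not values from the slides; the commands are only built and printed, not executed.

```python
def environment_commands(name, python_version):
    """Return the conda commands that create and then activate an environment."""
    return [
        # --yes answers conda's confirmation prompt automatically.
        ["conda", "create", "--name", name, f"python={python_version}", "--yes"],
        ["conda", "activate", name],
    ]


if __name__ == "__main__":
    for command in environment_commands("py39", "3.9"):
        print(" ".join(command))
```

Typing the printed lines into Anaconda Prompt or a terminal would create the environment and switch to it.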
Starting Conda
• Starting Conda using Windows
• From the Start menu, search for and open “Anaconda Prompt.”
• On Windows, all commands are typed into the Anaconda
Prompt window.
• Starting Conda using macOS
• Open Launchpad, then click the terminal icon. On macOS, all
commands are typed into the terminal window.
• Starting Conda using Linux
• Open a terminal window. On Linux, all commands are typed
into the terminal window.
Managing Conda
• Verify that conda is installed and running on your system by typing:
conda --version
• Conda displays the number of the version that you have installed. You do not
need to navigate to the Anaconda directory.
conda 4.8.3
• Update conda to the current version. Type the following:
conda update conda
• Conda compares versions and then displays what is available to install. If a newer
version of conda is available, type y to update:
Proceed ([y]/n)? y
• It is recommended that you always keep conda updated to the latest version.
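The version check above can also be scripted. The following is a minimal sketch, assuming conda is reachable on PATH (it returns None otherwise); the helper names are hypothetical and the parsing assumes output of the form "conda 4.8.3" as shown above.

```python
import shutil
import subprocess


def parse_conda_version(output):
    """Turn text such as 'conda 4.8.3' into a tuple like (4, 8, 3)."""
    version_text = output.strip().split()[-1]
    return tuple(int(part) for part in version_text.split(".") if part.isdigit())


def installed_conda_version():
    """Return the conda version as a tuple, or None if conda is not on PATH."""
    conda_path = shutil.which("conda")
    if conda_path is None:
        return None
    result = subprocess.run([conda_path, "--version"], capture_output=True,
                            text=True, check=True)
    # Some older releases printed the version to stderr instead of stdout.
    return parse_conda_version(result.stdout or result.stderr)


if __name__ == "__main__":
    print(installed_conda_version())
```

Returning a tuple makes it easy to compare versions, for example to check whether the installed conda is at least (4, 8, 3).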
Managing Packages
• You can check which packages you have installed, check which are available, and look
for a specific package and install it.
• Check to see if a package you have not installed named "beautifulsoup4" is available from
the Anaconda repository (must be connected to the Internet):
conda search beautifulsoup4
• Conda displays a list of all packages with that name on the Anaconda repository, so we
know it is available.
• Install this package into the current environment:
conda install beautifulsoup4
• Check to see if the newly installed program is in this environment:
conda list
• To uninstall the package from the current environment:
conda uninstall beautifulsoup4
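The search, install, list and uninstall steps above can be wrapped in a small helper when they need to be repeated from a script. This is a sketch under stated assumptions: conda_command and run_conda are hypothetical names, and run_conda simply returns None when conda is not found on PATH.

```python
import shutil
import subprocess


def conda_command(action, package=None):
    """Build the argument list for one of the conda commands shown above."""
    allowed = {"search", "install", "uninstall", "list"}
    if action not in allowed:
        raise ValueError(f"unsupported conda action: {action}")
    command = ["conda", action]
    if package is not None:
        command.append(package)
    if action in {"install", "uninstall"}:
        # --yes answers the "Proceed ([y]/n)?" prompt automatically.
        command.append("--yes")
    return command


def run_conda(action, package=None):
    """Run the command if conda is on PATH and return its text output."""
    conda_path = shutil.which("conda")
    if conda_path is None:
        return None
    command = conda_command(action, package)
    command[0] = conda_path  # use the full path that was found on PATH
    result = subprocess.run(command, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    print(" ".join(conda_command("search", "beautifulsoup4")))
```

For example, run_conda("list") would return the same text that conda list prints in Anaconda Prompt.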
What is pip (Package Manager)?
• pip is the package installer for Python.
• You can use pip to install packages from the Python Package Index and
other indexes.
• Most distributions of Python come with pip preinstalled. Python 2.7.9 and
later (on the python2 series), and Python 3.4 and later include pip (pip3 for
Python 3) by default.
• To install a package using pip, you can execute the following command
pip install some-package-name
• You can uninstall the package using the following command
pip uninstall some-package-name
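When several Python installations exist on one computer, it can be unclear which one a bare pip command belongs to. A common way around this is to run pip as a module of a specific interpreter. The sketch below only builds such a command; pip_command is a hypothetical helper name, and some-package-name is the placeholder used above rather than a real package.

```python
import sys


def pip_command(action, package):
    """Build a pip command that targets the interpreter running this code."""
    if action not in {"install", "uninstall"}:
        raise ValueError(f"unsupported pip action: {action}")
    command = [sys.executable, "-m", "pip", action, package]
    if action == "uninstall":
        # --yes skips pip's confirmation prompt when removing a package.
        command.append("--yes")
    return command


if __name__ == "__main__":
    # The placeholder is not a real package, so the command is only printed.
    print(" ".join(pip_command("install", "some-package-name")))
```

The resulting list can be passed to subprocess.run once the placeholder is replaced with a real package name.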
Conda versus pip
• Conda and pip are often considered nearly identical. Although
some of the functionality of these two tools overlaps, they were designed
for, and should be used for, different purposes.
• Pip installs Python software packaged as wheels or source distributions.
The latter may require that the system have compatible compilers, and
possibly libraries, installed before pip is invoked.
• Conda is a cross platform package and environment manager that installs
and manages conda packages from the Anaconda repository as well as
from the Anaconda Cloud.
• Conda packages are binaries.
• There is never a need to have compilers available to install them.
• Additionally, conda packages are not limited to Python software. They may
also contain C or C++ libraries, R packages or any other software.
• Pip installs Python packages whereas conda installs packages which may
contain software written in any language.
• For example, before using pip, a Python interpreter must be installed via a
system package manager or by downloading and running an installer.
• Conda on the other hand can install Python packages as well as the Python
interpreter directly.
• Conda has the ability to create isolated environments that can contain
different versions of Python and/or the packages installed in them.
• This can be extremely useful when working with data science tools as
different tools may contain conflicting requirements which could prevent
them all being installed into a single environment.
• Pip has no built-in support for environments but rather depends on other
tools like virtualenv or venv to create isolated environments.
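To make the venv side of this comparison concrete, here is a minimal sketch that creates an isolated environment with the standard-library venv module. The helper name create_environment and the directory name "demo-env" are assumptions for this example; the environment is created in a temporary folder and removed again when the script ends.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(parent_dir, name="demo-env"):
    """Create an isolated environment with the standard-library venv module."""
    env_dir = Path(parent_dir) / name
    # with_pip=False keeps the example fast; pip is not installed into it.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = create_environment(tmp)
        # pyvenv.cfg marks the directory as a virtual environment.
        print((env_dir / "pyvenv.cfg").exists())
```

Unlike a conda environment, an environment made this way reuses the Python version that created it, so it cannot by itself provide a different Python version.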
• When installing packages, pip has historically installed dependencies in a
recursive, serial loop.
• In that mode, no effort is made to ensure that the dependencies of all
packages are fulfilled simultaneously. (pip 20.3 and later use a newer
resolver by default that checks for such conflicts.)
• Serial installation can lead to environments that are broken in subtle ways, if packages
installed earlier in the order have incompatible dependency versions
relative to packages installed later in the order.
• In contrast, conda uses a satisfiability (SAT) solver to verify that all
requirements of all packages installed in an environment are met.
• This check can take extra time but helps prevent the creation of broken
environments.
• As long as package metadata about dependencies is correct, conda will
predictably produce working environments.
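The difference between installing serially and solving all requirements together can be illustrated with a toy example. Everything below is hypothetical: the package names, versions and the INDEX structure are invented for illustration, the serial installer is deliberately simplified, and the "solver" is a brute-force search over version combinations rather than a real SAT solver such as the one conda uses.

```python
from itertools import product

# Hypothetical package index: name -> version -> {dependency: allowed versions}.
INDEX = {
    "numlib": {1: {}, 2: {}},
    "plotlib": {1: {"numlib": {1}}, 2: {"numlib": {2}}},
    "statlib": {1: {"numlib": {1}}},
}


def is_consistent(installed):
    """Check that every installed package has all of its dependencies met."""
    for name, version in installed.items():
        for dep, allowed in INDEX[name][version].items():
            if installed.get(dep) not in allowed:
                return False
    return True


def install_serially(requested):
    """Install packages one at a time, always taking the newest versions."""
    installed = {}
    for name in requested:
        version = max(INDEX[name])
        installed[name] = version
        for dep, allowed in INDEX[name][version].items():
            # Overwrite whatever is there without checking earlier packages.
            installed[dep] = max(allowed)
    return installed


def solve_together():
    """Search all version combinations for one that satisfies everything."""
    names = sorted(INDEX)
    # Try newer versions first so the first hit prefers recent releases.
    choices = [sorted(INDEX[name], reverse=True) for name in names]
    for combo in product(*choices):
        candidate = dict(zip(names, combo))
        if is_consistent(candidate):
            return candidate
    return None


if __name__ == "__main__":
    serial = install_serially(["plotlib", "statlib"])
    print("serial:", serial, "consistent:", is_consistent(serial))
    print("solved:", solve_together())
```

In this toy index, the serial approach ends with plotlib 2 installed next to numlib 1, which plotlib 2 does not accept, while the combined search settles on version 1 of all three packages.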
                      Conda                     Pip
Manages               binaries                  wheel or source
Requires compiler     No                        Yes
Package type          Any                       Python only
Creates environment   Yes, built in             No, requires virtualenv or venv
Checks dependencies   Yes                       No
Package source        Anaconda repo and cloud   PyPI
Anaconda Navigator
• Anaconda Navigator is a desktop graphical user interface (GUI)
included in Anaconda distribution that allows you to launch
applications and easily manage conda packages, environments, and
channels without using command-line commands.
• Navigator can search for packages on Anaconda.org or in a local
Anaconda Repository.
• It is available for Windows, macOS, and Linux.
Jupyter Notebook
• The Jupyter Notebook is an open-source web application that allows
you to create and share documents that contain live code, equations,
visualizations and narrative text.
• Uses include: data cleaning and transformation, numerical simulation,
statistical modeling, data visualization, machine learning, and much
more.
Features of Jupyter Notebook
• Language of choice: Jupyter supports over 40 programming
languages, including Python, R, Julia, and Scala.
• Share notebooks: Notebooks can be shared with others using email,
Dropbox, GitHub and the Jupyter Notebook Viewer.
• Interactive output: Your code can produce rich, interactive output:
HTML, images, videos, LaTeX, and custom MIME types.
• Big data integration: Leverage big data tools, such as Apache Spark,
from Python, R and Scala. Explore that same data with pandas,
scikit-learn, ggplot2 and TensorFlow.
Features of Jupyter Web Application
• In-browser editing for code, with automatic syntax highlighting,
indentation, and tab completion/introspection.
• The ability to execute code from the browser, with the results of
computations attached to the code which generated them.
• Displaying the result of computation using rich media representations, such
as HTML, LaTeX, PNG, SVG, etc. For example, publication-quality figures
rendered by the matplotlib library, can be included inline.
• In-browser editing for rich text using the Markdown markup language,
which can provide commentary for the code, is not limited to plain text.
• The ability to easily include mathematical notation within markdown cells
using LaTeX, and rendered natively by MathJax.
Notebook Document
• Notebook documents contain the inputs and outputs of an
interactive session as well as additional text that accompanies the
code but is not meant for execution.
• These documents are internally JSON files and are saved with the
.ipynb extension.
• Since JSON is a plain text format, they can be version-controlled and
shared with colleagues.
• Notebooks may be exported to a range of static formats, including
HTML (for example, for blog posts), reStructuredText, LaTeX, PDF, and
slide shows, via the nbconvert command.
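Because a notebook document is plain JSON, it can be written and read with Python's json module. The sketch below builds a minimal notebook by hand with one Markdown cell and one code cell; the file name example.ipynb and the helper save_and_reload are assumptions for this example, and in practice the notebook application writes these files for you.

```python
import json
import tempfile
from pathlib import Path

# A minimal notebook in version 4 of the notebook format: one Markdown cell
# and one code cell that has not been run yet (so it has no outputs).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# My first notebook"]},
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print('hello')"],
        },
    ],
}


def save_and_reload(nb, directory, name="example.ipynb"):
    """Write the notebook as JSON text, then read it back as a dictionary."""
    path = Path(directory) / name
    path.write_text(json.dumps(nb, indent=1), encoding="utf-8")
    return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        loaded = save_and_reload(notebook, tmp)
        print([cell["cell_type"] for cell in loaded["cells"]])
```

A file saved this way can be opened from the dashboard, or exported with a command such as jupyter nbconvert --to html example.ipynb.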
Starting Notebook Server
• You can start running a notebook server from the command line using the
following command:
jupyter notebook
• This will print some information about the notebook server in your
console, and open a web browser to the URL of the web application (by
default, http://127.0.0.1:8888).
• The landing page of the Jupyter notebook web application, the dashboard,
shows the notebooks currently available in the notebook directory (by
default, the directory from which the notebook server was started).
• You can create new notebooks from the dashboard with the New
Notebook button, or open existing ones by clicking on their name.
• When starting a notebook server from the command line, you can also
open a particular notebook directly, bypassing the dashboard, with jupyter
notebook my_notebook.ipynb. The .ipynb extension is assumed if no extension is
given.
• When you are inside an open notebook, the File | Open… menu option will
open the dashboard in a new browser tab, to allow you to open another
notebook from the notebook directory or to create a new notebook.
• You can start more than one notebook server at the same time, if you want
to work on notebooks in different directories.
• By default the first notebook server starts on port 8888, and later notebook
servers search for ports near that one. You can also manually specify the
port with the --port option.
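The idea of searching for a port near 8888 can be sketched as follows. This is an illustration of the concept only, not the notebook server's own logic; the helper names and the limit of 50 attempts are assumptions made for this example.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if the given port can currently be bound on this host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start=8888, attempts=50):
    """Return the first bindable port at or after start, or None."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    return None


if __name__ == "__main__":
    print(first_free_port())
```

A port found this way could then be passed on the command line, for example jupyter notebook --port 8890.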
Create New Notebook Document
• A new notebook may be created at any time, either from the
dashboard, or using the File ‣ New menu option from within an active
notebook.
• The new notebook is created within the same directory and will open
in a new browser tab.
• It will also be reflected as a new entry in the notebook list on the
dashboard.
Notebook User Interface
• When you create a new notebook document, you will be presented with
the notebook name, a menu bar, a toolbar and an empty code cell.
• Notebook name: The name displayed at the top of the page, next to the
Jupyter logo, reflects the name of the .ipynb file.
• Clicking on the notebook name brings up a dialog which allows you to rename it.
Thus, renaming a notebook from “Untitled0” to “My first notebook” in the browser,
renames the Untitled0.ipynb file to My first notebook.ipynb.
• Menu bar: The menu bar presents different options that may be used to
manipulate the way the notebook functions.
• Toolbar: The tool bar gives a quick way of performing the most-used
operations within the notebook, by clicking on an icon.
• Code cell: the default type of cell; read on for an explanation of cells.
Structure of Notebook Document
• The notebook consists of a sequence of cells.
• A cell is a multiline text input field, and its contents can be executed
by using Shift-Enter, or by clicking either the “Play” button in the
toolbar, or Cell ‣ Run in the menu bar.
• The execution behavior of a cell is determined by the cell’s type.
There are three types of cells: code cells, markdown cells, and raw
cells.
• Every cell starts off being a code cell, but its type can be changed by
using a drop-down on the toolbar (which will be “Code”, initially), or
via keyboard shortcuts.
Code Cell
• A code cell allows you to edit and write new code, with full syntax
highlighting and tab completion.
• The programming language you use depends on the kernel, and the default
kernel (IPython) runs Python code.
• When a code cell is executed, code that it contains is sent to the kernel
associated with the notebook.
• The results that are returned from this computation are then displayed in
the notebook as the cell’s output.
• The output is not limited to text; many other forms of output
are also possible, including matplotlib figures and HTML tables (as used, for
example, in the pandas data analysis package).
• This is known as IPython’s rich display capability.
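As a small illustration of rich display, an object can offer an HTML version of itself by defining a _repr_html_ method, which IPython uses when the object is the result of a code cell. The ScoreTable class below is hypothetical and exists only for this example.

```python
import html


class ScoreTable:
    """Hypothetical class that renders itself as an HTML table in a notebook."""

    def __init__(self, rows):
        self.rows = rows  # list of (name, score) pairs

    def _repr_html_(self):
        # IPython calls this method, when present, to get an HTML representation.
        body = "".join(
            f"<tr><td>{html.escape(str(name))}</td>"
            f"<td>{html.escape(str(score))}</td></tr>"
            for name, score in self.rows
        )
        return f"<table><tr><th>Name</th><th>Score</th></tr>{body}</table>"

    def __repr__(self):
        # Plain-text fallback used outside rich front ends.
        return f"ScoreTable({self.rows!r})"


if __name__ == "__main__":
    print(ScoreTable([("Asha", 91), ("Ravi", 84)])._repr_html_())
```

In a notebook, ending a code cell with ScoreTable([("Asha", 91)]) would show a formatted table, while a plain Python shell would show the text from __repr__.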
Markdown Cell
• You can document the computational process in a literate way, alternating descriptive
text with code, using rich text.
• In IPython this is accomplished by marking up text with the Markdown language.
• The corresponding cells are called Markdown cells. The Markdown language provides a
simple way to perform this text markup, that is, to specify which parts of the text should
be emphasized (italics), bold, formatted as lists, etc.
• If you want to provide structure for your document, you can use markdown headings.
Markdown headings consist of 1 to 6 hash signs (#) followed by a space and the title of
your section.
• The markdown heading will be converted to a clickable link for a section of the notebook.
It is also used as a hint when exporting to other document formats, like PDF.
• When a Markdown cell is executed, the Markdown code is converted into the
corresponding formatted rich text. Markdown allows arbitrary HTML code for formatting.
Raw Cell
• Raw cells provide a place in which you can write output directly. Raw
cells are not evaluated by the notebook.
Spyder IDE
• Spyder is a free and open-source scientific environment written in
Python, for Python, and designed by and for scientists, engineers and
data analysts.
• It features a unique combination of the advanced editing, analysis,
debugging, and profiling functionality of a comprehensive
development tool with the data exploration, interactive execution,
deep inspection, and beautiful visualization capabilities of a scientific
package.
Spyder Editor
• Spyder’s multi-language Editor
integrates a number of
powerful tools right out of the
box for an easy to use,
efficient editing experience.
• The Editor’s key features
include syntax highlighting;
real-time code and style
analysis; on-demand
completion, calltips and go-to-
definition features; a
function/class browser,
horizontal and vertical
splitting, and much more.
IPython Console
• The IPython Console
allows you to execute
commands and enter,
interact with and visualize
data inside any number
of fully featured IPython
interpreters.
Spyder Variable Explorer
• The Variable Explorer allows
you to interactively browse
and manage the objects
generated by running your
code.
• It shows the namespace
contents (including all
global objects, variables,
class instances and
more) of the currently
selected IPython
Console session, and
allows you to add,
remove, and edit their
values through a variety
of GUI-based editors.
• The Variable Explorer
gives you information on
the name, size, type and
value of each object. To
modify a scalar variable,
like a number, string or
boolean, simply double
click it in the pane and
type its new value.
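The kind of table the Variable Explorer presents can be approximated in plain Python. This is only a rough sketch and not how Spyder is implemented: summarize_namespace is a hypothetical helper, and "size" here is taken to be len() where available and 1 otherwise, which is an assumption about what the column means.

```python
def summarize_namespace(namespace):
    """List (name, type, size, value preview) for each public name."""
    rows = []
    for name, value in sorted(namespace.items()):
        if name.startswith("_"):
            continue  # skip private and dunder names
        size = len(value) if hasattr(value, "__len__") else 1
        rows.append((name, type(value).__name__, size, repr(value)[:40]))
    return rows


if __name__ == "__main__":
    example = {"count": 3, "label": "demo", "scores": [91, 84]}
    for row in summarize_namespace(example):
        print(row)
```

Passing globals() instead of the example dictionary would summarize the variables of the current session.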
Spyder Help Pane
• You can use the Help pane
to find, render and display
rich documentation for
any object with a
docstring, including
modules, classes,
functions and methods.
This allows you to access
documentation easily
directly from Spyder,
without having to
interrupt your workflow.
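The documentation shown in the Help pane comes from docstrings, so a well-written docstring is what makes an object's help useful. Below is a minimal example of a function with a docstring and one way to read it back in code; the rectangle_area function is invented for illustration.

```python
import inspect


def rectangle_area(width, height):
    """Return the area of a rectangle.

    Both arguments are expected to be numbers in the same unit.
    """
    return width * height


if __name__ == "__main__":
    # inspect.getdoc returns the docstring with its indentation cleaned up.
    print(inspect.getdoc(rectangle_area))
```

The same text is what the built-in help(rectangle_area) call displays in a console.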
Spyder Plots
• The Plots pane shows the
static figures and images
created during your
session.
• It will show you plots from
the IPython Console,
produced by your code in
the Editor or generated by
the Variable Explorer,
allowing you to interact
with them in several ways.
• The figures shown in the
Plots pane are those
associated with the
currently active Console
tab; if you switch consoles,
the list of plots displayed
(or none at all, if a new
console) will change
accordingly.
Spyder Files
• The Files pane is a
filesystem and directory
browser built right into
Spyder.
• You can view and filter
files according to their
type and extension,
open them with the
Editor or an external
tool, and perform many
common operations.
Spyder History Pane
• With the History pane,
you can view all the
commands you’ve
entered into any IPython
Console, along with their
timestamp.
Thanks
Samatrix Consulting Pvt Ltd
