Unit 3

Uploaded by

ramnathshenoy777
Business Analytics-1

Unit-3
Tools Used for Data Analytics
Introduction to Data Analytics Software
Data analytics software refers to tools and platforms that allow businesses, researchers, and analysts to
process, analyse, and interpret large datasets to gain insights, identify patterns, and make informed decisions.
These software solutions are essential in various industries, including finance, marketing, healthcare, retail,
and more, as they help in transforming raw data into actionable insights.
Data analytics software typically offers a range of functionalities, including data collection, cleaning,
visualization, statistical analysis, predictive modelling, and reporting. These tools help users to analyse
trends, perform complex calculations, build models, and visualize data to facilitate decision-making and
problem-solving.
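The cleaning, analysis, and reporting steps named above can be sketched in a few lines of Python. This is only an illustrative outline using the standard library: the records, the field names (region, sales), and the helper functions clean and summarize are hypothetical and do not come from any particular analytics product.

```python
from collections import defaultdict

# Hypothetical raw records, as they might arrive from a data collection step.
raw_records = [
    {"region": "North", "sales": "120"},
    {"region": "South", "sales": ""},       # missing value
    {"region": "north ", "sales": "80"},    # inconsistent label
    {"region": "South", "sales": "50"},
]


def clean(records):
    """Drop rows with missing sales and normalise labels and types."""
    cleaned = []
    for rec in records:
        if not rec["sales"].strip():
            continue  # skip rows where the measure is missing
        cleaned.append(
            {"region": rec["region"].strip().title(), "sales": float(rec["sales"])}
        )
    return cleaned


def summarize(records):
    """Total the sales measure for each region."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["region"]] += rec["sales"]
    return dict(totals)


summary = summarize(clean(raw_records))

# A minimal text "report" of the result.
for region, total in sorted(summary.items()):
    print(f"{region}: {total:.1f}")
```

Dedicated analytics tools wrap the same stages in graphical interfaces or richer libraries, but the order of the steps (clean first, then aggregate, then report) is the same.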

Types of Data Analytics Software


Data analytics software can be categorized into several types:
• Business Intelligence (BI) Tools: These focus on data visualization and reporting, making it easier for
non-technical users to explore data and generate insights.
• Statistical Analysis Software: Specialized tools for advanced statistical analysis and hypothesis testing,
often used by researchers and statisticians.
• Data Mining and Machine Learning Tools: These enable users to build predictive models and discover
patterns in large datasets.
• Big Data Analytics Platforms: Designed to handle massive volumes of data and perform distributed
processing, often using technologies like Hadoop and Spark.
• Data Preparation Tools: Help users clean and transform data into a suitable format for analysis.
• Dashboard and Visualization Tools: Emphasize creating interactive and visually appealing reports
and dashboards.

Open-Source Data Analytics Software


• Meaning: Open-source data analytics software is freely available for use, modification, and
distribution. The source code is open and accessible, allowing users to make changes to the software
as per their requirements.
• Key Features:
o Free to use: Most open-source tools are free, with no licensing fees.
o Customizable: Users can modify the software to suit their specific needs.
o Community-driven: Development is often driven by a community of developers who
contribute improvements and new features.
o Support: Support is primarily available through community forums, online documentation,
and user groups.
Proprietary Data Analytics Software
• Meaning: Proprietary data analytics software is developed, owned, and sold by a specific company
or vendor. These tools are typically available for purchase through licensing agreements, and the
source code is not accessible to the public.
• Key Features:
o Paid Licensing: Requires users to purchase licenses, which may be a one-time fee or
subscription based.
o User-friendly: Often designed with intuitive user interfaces, making them easier to use for
non-technical users.
o Integrated Support: Vendors provide professional customer service and support to help users
with troubleshooting, maintenance, and updates.
o Regular Updates and New Features: Vendors typically offer ongoing updates that improve
performance and security and introduce new features.

Key Differences Between Open-Source and Proprietary Data Analytics Software

Aspect        | Open Source                                   | Proprietary
--------------|-----------------------------------------------|----------------------------------------
Cost          | Free, no licensing fees                       | Paid (licensing/subscription required)
Customization | Highly customizable (source code is open)     | Limited customization options
Support       | Community-based support                       | Professional, vendor-provided support
Updates       | Community-driven updates                      | Regular updates from the vendor
Ease of Use   | Can be complex, requires technical expertise  | User-friendly with intuitive interfaces
Transparency  | Full transparency (source code accessible)    | No transparency (source code is closed)
Security      | Dependent on community and user management    | Vendor-managed security and updates
Examples      | R, Python, KNIME, Apache Hadoop               | Tableau, Power BI, SAS, IBM SPSS

Open-Source Software
Meaning of Tableau
Tableau is a leading data visualization software tool used for business intelligence (BI). Although it is
discussed here alongside the open-source tools, Tableau itself is a proprietary, commercially licensed
product, as the comparison table above shows. It helps users transform raw data into interactive and
easily understandable visual reports, dashboards, and charts. Known for its user-friendly interface,
Tableau empowers businesses to gain insights from their data without the need for extensive technical
knowledge.
Tableau provides powerful data analytics capabilities and enables users to create data visualizations that are
both rich in detail and easy to understand. It is often used for exploring, analysing, and presenting large
volumes of data.
Key Features of Tableau
1. Data Visualization
Tableau is renowned for its ability to transform data into intuitive, interactive visualizations like bar charts,
pie charts, heat maps, scatter plots, and geographical maps.
2. Drag-and-Drop Interface
The intuitive, drag-and-drop functionality enables even non-technical users to create reports, dashboards,
and visualizations without any coding.
3. Real-time Data Analysis
Tableau allows users to connect to real-time data sources, enabling immediate analysis and decision-
making.
4. Data Connectivity
Tableau can connect to a wide range of data sources, including relational databases (SQL Server, MySQL,
etc.), cloud-based data sources (Google Analytics, Salesforce), spreadsheets, and even big data platforms
like Hadoop.
5. Dashboards and Storytelling
Tableau provides a feature called "Story," which allows users to arrange a sequence of visualizations
to convey a compelling narrative with data.
Dashboards enable users to combine multiple views of data on a single screen, making it easier to analyse
and present data in one place.
6. Sharing and Collaboration
Tableau allows users to share reports and dashboards via Tableau Server or Tableau Cloud (formerly
Tableau Online), or export them as PDF, image files, or interactive web content.
Users can collaborate in real-time, providing feedback and insights directly on shared dashboards.
7. Advanced Analytics
Tableau offers built-in statistical capabilities such as trend lines, forecasting, and clustering. You can also
use custom calculated fields to create advanced data models.
8. Data Blending and Joins
Tableau provides powerful data blending capabilities, allowing you to combine data from different sources
into a single visualization, even if the data is stored in different formats or databases.
9. Mobile Support
Tableau’s mobile app enables users to access their dashboards and reports on smartphones and tablets,
making it easy to view, share, and interact with data on the go.
10. Security Features
Tableau offers role-based security, data encryption, and permissions management to ensure that sensitive
data is protected when accessed by multiple users.
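The idea behind combining data from two sources (feature 8 above) can be illustrated with a small Python sketch. It is a rough analogy only, not Tableau's implementation: the two sources, the field names (region, target, sales), and the blend function are hypothetical. The sketch totals the secondary source by the shared field and then attaches those totals to each row of the primary source, much like a left join.

```python
from collections import defaultdict

# Hypothetical primary source, e.g. targets kept in a spreadsheet.
primary = [
    {"region": "East", "target": 100},
    {"region": "West", "target": 150},
    {"region": "North", "target": 90},
]

# Hypothetical secondary source, e.g. transactions stored in a database.
secondary = [
    {"region": "East", "sales": 40},
    {"region": "East", "sales": 70},
    {"region": "West", "sales": 120},
]


def blend(primary_rows, secondary_rows, key, measure):
    """Aggregate the secondary rows by key, then attach the totals to the primary rows."""
    totals = defaultdict(float)
    for row in secondary_rows:
        totals[row[key]] += row[measure]

    blended = []
    for row in primary_rows:
        combined = dict(row)
        # Regions with no matching secondary rows get None rather than being dropped.
        combined[measure] = totals.get(row[key])
        blended.append(combined)
    return blended


for row in blend(primary, secondary, key="region", measure="sales"):
    print(row)
```

Every primary row is kept, and a region with no matching records in the secondary source simply shows an empty measure.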

R Programming: Meaning and Key Features


Meaning of R Programming
R is a powerful, open-source programming language and environment designed for statistical computing
and data analysis. It is widely used by statisticians, data analysts, and researchers for conducting complex
data analysis, statistical modelling, and creating data visualizations. R has a rich ecosystem of packages
and libraries that make it ideal for data manipulation, exploration, and visualization.
R is highly extensible, meaning users can develop their own functions and packages. It is also known for
its ability to handle large datasets and perform a wide range of statistical analysis, from basic descriptive
statistics to advanced machine learning and predictive modelling.
Key Features of R Programming
1. Comprehensive Statistical Analysis
R provides a vast range of statistical functions and methods for performing complex analyses such as
regression analysis, hypothesis testing, time series analysis, multivariate analysis, and more. It also includes
functions for working with probability distributions, statistical tests, and models.
2. Data Visualization
One of R’s key strengths is its powerful data visualization capabilities. It has several libraries, such as
ggplot2, lattice, and plotly, for creating static and interactive visualizations like bar charts, line graphs,
histograms, and scatter plots.
R allows users to create customizable and publication-quality visualizations, which makes it especially
useful for data presentation and reporting.
3. Rich Ecosystem of Libraries and Packages
R has a vast collection of over 16,000 packages available on CRAN (Comprehensive R Archive Network),
which extend its functionality in various areas such as data manipulation (e.g., dplyr, tidyr), machine
learning (e.g., caret, randomForest), and visualization (e.g., ggplot2, plotly).
These packages make R highly versatile and allow users to perform almost any type of data analysis and
modelling.
4. Data Handling and Manipulation
R offers powerful data handling capabilities, especially when it comes to manipulating and cleaning data.
Functions from packages like dplyr and tidyr allow users to filter, group, transform, and summarize data
effectively.
R also supports various data structures, including data frames, matrices, and lists, and can read and
write data in common file formats (CSV, Excel, etc.) as well as query SQL databases.
5. Reproducibility and Documentation
R supports reproducible research by enabling users to write scripts and code that document the steps of an
analysis. RStudio, a popular IDE for R, also offers features like R Markdown, which allows users to
combine code and narrative in a single document, producing dynamic reports that can be easily shared and
reproduced.
6. Integration with Other Languages and Tools
R can be integrated with other programming languages like Python, C++, and Java, allowing for more
flexible and powerful data analysis workflows.
It also integrates well with data sources such as SQL databases, web APIs, and big data platforms, making
it suitable for a wide range of data environments.
7. Machine Learning and Predictive Modelling
R is widely used in machine learning for tasks such as classification, regression, clustering, and model
evaluation. Libraries like caret, randomForest, and xgboost provide tools for building and evaluating
machine learning models.
R supports both supervised and unsupervised learning and has tools for hyperparameter tuning, model
selection, and performance evaluation.
8. Extensibility and Customization
R is highly customizable and extensible. Users can develop custom functions and packages to suit their
specific needs, making it a flexible tool for researchers and analysts in any domain.
With the help of the devtools package, users can create, test, and share their own packages, further
enhancing the language’s capabilities.
9. Open Source and Active Community
R is free and open-source, which means that anyone can use, modify, and distribute the software. It also
has a large and active community of users and developers who contribute to its ongoing development,
ensuring that it continues to evolve and stay up to date with the latest advancements in statistics and data
science.
There are numerous online forums, tutorials, and documentation available to help new users get started
with R programming.
10. Cross-Platform Support
R can be run on multiple platforms, including Windows, macOS, and Linux. This makes it versatile for use
across different operating systems, ensuring it can be deployed in various environments.

Meaning of Python Programming


Python is a high-level, interpreted, and general-purpose programming language known for its readability
and simplicity. It emphasizes code readability and allows developers to write clear, logical code for small-
and large-scale projects. Python is widely used for web development, data analysis, machine learning,
artificial intelligence, scientific computing, automation, and more.
Python is an open-source language, meaning it is freely available and supported by a large and active
community. Its extensive standard library and rich ecosystem of third-party packages make it a versatile
tool for developers and data scientists alike.
Key Features of Python Programming
1. Easy to Learn and Read
Python is designed to be easy to understand, even for beginners. Its syntax is clear and straightforward,
resembling pseudo-code, which allows new programmers to focus on learning programming concepts
rather than complex syntax.
2. Interpreted Language
Python is an interpreted language: source code is compiled to bytecode and executed by the interpreter
at run time, statement by statement, so an error surfaces at the line that triggers it, which makes
debugging easier. There is no separate compile-and-link step before execution, which speeds up the
development process.
3. Dynamically Typed
In Python, you do not need to declare the data type of a variable explicitly. The type is determined at
run time from the value assigned (e.g., an integer, string, or float), and the same variable name can
later be bound to a value of a different type.
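A short example may make dynamic typing concrete. The variable name value is arbitrary; the point is that the type travels with the value, not with the name.

```python
# The same name is bound to values of three different types in turn.
value = 42
first_type = type(value).__name__

value = "forty-two"
second_type = type(value).__name__

value = 42.0
third_type = type(value).__name__

print(first_type, second_type, third_type)  # int str float
```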
4. Extensive Standard Library
Python has a vast standard library that provides modules and packages to handle common programming
tasks such as file I/O, regular expressions, system calls, and more. This reduces the need to write custom
code for routine operations.
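As a small illustration, the sketch below parses CSV text and filters it with a regular expression using only standard-library modules (csv, io, re). The sample data and the order_id and amount fields are made up for the example.

```python
import csv
import io
import re

# Hypothetical sample data standing in for the contents of a file on disk.
raw_text = "order_id,amount\nA-101,250\nB-202,100\nA-303,75\n"

# csv: parse the text into dictionaries keyed by the header row.
rows = list(csv.DictReader(io.StringIO(raw_text)))

# re: keep only orders whose ID is "A-" followed by digits.
pattern = re.compile(r"^A-\d+$")
a_orders = [row for row in rows if pattern.match(row["order_id"])]

total = sum(int(row["amount"]) for row in a_orders)
print(total)  # 325
```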
5. Cross-Platform Compatibility
Python is cross-platform, meaning that Python programs can run on various operating systems, including
Windows, macOS, and Linux, without requiring modifications.
6. Large Ecosystem of Libraries and Frameworks
Python boasts an extensive ecosystem of third-party libraries and frameworks for different domains.
Libraries such as NumPy, pandas, and Matplotlib are popular for data analysis and visualization, while
Django and Flask are commonly used for web development.
For machine learning, popular libraries like TensorFlow, scikit-learn, and PyTorch provide tools for
building and training models.
7. Object-Oriented Programming (OOP)
Python supports object-oriented programming, which means it allows developers to create classes and
objects, enabling code reusability and easier maintenance. This helps in building scalable and modular
applications.
8. Support for Multiple Paradigms
Python supports multiple programming paradigms, including object-oriented programming, functional
programming, and procedural programming. This allows developers to choose the best approach based on
the problem at hand.
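To make the three paradigms concrete, here is one small task (summing the squares of the even numbers in a list) written in each style. The function and class names are illustrative only.

```python
numbers = [1, 2, 3, 4, 5, 6]


# Procedural style: an explicit loop updating an accumulator.
def sum_even_squares_procedural(values):
    total = 0
    for v in values:
        if v % 2 == 0:
            total += v * v
    return total


# Functional style: compose filter, map, and sum without mutable state.
def sum_even_squares_functional(values):
    evens = filter(lambda v: v % 2 == 0, values)
    return sum(map(lambda v: v * v, evens))


# Object-oriented style: the data and the behaviour live together in a class.
class NumberSet:
    def __init__(self, values):
        self.values = list(values)

    def sum_even_squares(self):
        return sum(v * v for v in self.values if v % 2 == 0)


results = (
    sum_even_squares_procedural(numbers),
    sum_even_squares_functional(numbers),
    NumberSet(numbers).sum_even_squares(),
)
print(results)  # (56, 56, 56)
```

All three produce the same answer; which style reads best depends on the size and shape of the surrounding program.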
9. Automatic Memory Management
Python uses automatic memory management, meaning it handles memory allocation and deallocation
automatically, using a garbage collector to manage unused objects. This reduces the complexity of memory
management for developers.
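The following sketch shows the two mechanisms at work in CPython, the standard Python implementation: reference counting, which frees an object as soon as nothing refers to it, and the cyclic garbage collector, which cleans up objects that refer to each other. The Node class is hypothetical and exists only so that object lifetimes can be observed; other Python implementations may free objects at different times.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only to observe when objects are freed."""

    def __init__(self, name):
        self.name = name
        self.partner = None


gc.disable()  # pause the cyclic collector so each step can be observed

# Case 1: no reference cycle. In CPython, reference counting frees the
# object as soon as its last reference is deleted.
single = Node("single")
single_ref = weakref.ref(single)
del single
freed_without_gc = single_ref() is None

# Case 2: two objects that refer to each other. Their reference counts
# never reach zero, so the pair survives until the collector runs.
left, right = Node("left"), Node("right")
left.partner, right.partner = right, left
left_ref = weakref.ref(left)
del left, right
alive_before_collect = left_ref() is not None

gc.collect()  # run the cyclic garbage collector explicitly
freed_after_collect = left_ref() is None

gc.enable()
print(freed_without_gc, alive_before_collect, freed_after_collect)
```

On CPython all three flags are expected to be True. In everyday code the collector runs automatically, so none of these calls are normally needed.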
10. Strong Community Support
Python has a large and active community of developers, researchers, and data scientists who contribute to
the development of libraries, frameworks, and tutorials. There is also a wealth of online resources, forums,
and documentation that can help new learners and experienced professionals alike.
11. Integration Capabilities
Python can easily integrate with other languages such as C, C++, and Java, making it useful in multi-
language applications. It also supports integration with databases (SQL, NoSQL) and external APIs.
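As one example of database integration, the standard-library sqlite3 module lets a Python script run SQL against a SQLite database. The sketch below uses an in-memory database; the sales table and its columns are invented for illustration.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 30.0)],
)

# Let the database do the aggregation, then fetch the result into Python.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(totals)  # [('North', 150.0), ('South', 80.0)]
```

Other database systems are typically reached through third-party driver packages that follow a similar connect, execute, fetch pattern.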
12. Versatility
Python is used across a wide variety of applications, from simple automation scripts to complex systems.
Whether it's web development, scientific computing, data analysis, artificial intelligence, or automation,
Python can handle it all.

Meaning of JAMOVI:
JAMOVI is a free, open-source statistical package with an intuitive graphical interface. It simplifies
common statistical analyses, such as descriptive statistics, hypothesis testing, and regression, without
requiring advanced coding skills. JAMOVI is especially popular in academic research because of its
user-friendly interface and its integration with R for more complex tasks.

Key Features of JAMOVI:


1. User-Friendly Interface: Designed for both beginners and advanced users, providing a graphical interface
to perform statistical analysis without the need for coding.
2. Real-Time Results: Automatically displays results as data is manipulated or analysis settings are adjusted,
making it easy to interpret outcomes.
3. Wide Range of Statistical Tests: Supports various statistical methods, including descriptive statistics, t-
tests, ANOVA, regression analysis, factor analysis, and non-parametric tests.
4. Data Visualization: Offers tools to create a variety of charts and graphs, such as histograms, scatter plots,
and box plots, to visualize data and results.
5. R Integration: Allows advanced users to write and execute R scripts within JAMOVI, providing additional
flexibility for complex analyses.
6. Modules for Specialized Analysis: Offers pre-packaged modules for specific types of analysis, like
Bayesian analysis and mixed models, which can be installed directly within the software.
7. Open-Source and Free: JAMOVI is free to use, with an open-source license, meaning anyone can
contribute to or modify the software.
8. Cross-Platform Support: Available for Windows, Mac, and Linux, ensuring accessibility on multiple
operating systems.
9. Spreadsheets for Data Entry: Users can input and manipulate data directly in spreadsheet-like tables,
similar to Excel, making data entry straightforward.
10. Interactive Output: The software provides interactive results, which users can modify and update instantly
by adjusting their analysis parameters.

Meaning of GRETL:
GRETL (GNU Regression, Econometrics, and Time-series Library) is an open-source statistical package
designed specifically for econometrics and time-series analysis. It is widely used for regression
analysis, econometric modelling, and data exploration. GRETL provides a user-friendly interface that
makes it accessible to both beginners and advanced users, and it supports a wide range of statistical
tests and methods.

Key Features of GRETL:


1. User-Friendly Interface: GRETL provides a simple, easy-to-navigate graphical user interface (GUI) for
users, allowing them to perform econometric analysis without needing to write complex code.
2. Wide Range of Econometric Tools: GRETL supports multiple regression techniques (e.g., OLS, GLS,
IV), time-series models, and panel data analysis, making it ideal for econometric research.
3. Support for Time-Series Analysis: It has built-in functions for time-series data analysis, such as ARIMA,
GARCH, and VAR models, along with tools for handling time-series data.
4. Data Import and Export: GRETL supports various data formats, allowing users to import data from CSV,
Excel, and other formats. It also supports exporting results for further analysis in other tools.
5. Regression Analysis: GRETL offers a wide array of regression techniques, including linear and non-linear
models, as well as panel data regressions and limited dependent variable models.
6. Advanced Statistical Methods: GRETL includes advanced statistical methods like Generalized Least
Squares (GLS), instrumental variables (IV) estimation, and various tests for model diagnostics.
7. Scripting Capabilities: GRETL supports its own scripting language, allowing users to automate complex
analyses and extend the software’s capabilities.
8. Graphical Output: GRETL provides high-quality plots and visualizations, such as scatter plots,
histograms, and time-series plots, to help users understand the results and relationships in the data.
9. Cross-Platform: GRETL is available for Windows, Mac, and Linux, ensuring it can be used across
different operating systems.
10. Open-Source and Free: GRETL is free to download and use, with an open-source license that allows
anyone to contribute to or modify the software.
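To show what the simplest of these techniques, ordinary least squares (OLS) with one explanatory variable, actually computes, here is a minimal Python sketch. It is not GRETL code or GRETL's algorithm; the ols_fit function and the data are illustrative assumptions. The slope is the covariance of x and y divided by the variance of x, and the intercept makes the fitted line pass through the point of means.

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of cross-products and squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data generated exactly from y = 1 + 2x.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

intercept, slope = ols_fit(x, y)
print(intercept, slope)  # 1.0 2.0
```

Econometric packages add standard errors, diagnostic tests, and support for many explanatory variables on top of this basic calculation.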

Proprietary Software
Meaning of SPSS:
SPSS (Statistical Package for the Social Sciences) is a powerful, widely used statistical software
package developed by IBM. It is designed for performing complex data analysis and statistical
operations, especially in the fields of social sciences, market research, and health research. SPSS is
known for its user-friendly interface, making it accessible to users without advanced programming skills.
Key Features of SPSS:
1. User-Friendly Interface: SPSS offers an intuitive, spreadsheet-like interface that makes it easy to
input, manipulate, and analyse data without requiring complex coding.
2. Comprehensive Statistical Analysis: SPSS supports a wide range of statistical techniques, including:
o Descriptive statistics (mean, median, mode, etc.)
o T-tests, ANOVA, Chi-square tests
o Regression analysis (linear, logistic)
o Correlation analysis
o Factor analysis, reliability analysis, and more.
3. Data Management: SPSS provides powerful tools for data cleaning, data transformation, and data
manipulation, allowing users to prepare their datasets for analysis efficiently.
4. Advanced Analytics: SPSS includes advanced statistical tools such as multivariate analysis, time-
series forecasting, cluster analysis, and survival analysis.
5. Graphics and Visualization: SPSS allows users to create a variety of charts and graphs (e.g., bar
charts, histograms, scatter plots, etc.) to visualize their data and statistical results.
6. Syntax Editor: In addition to its graphical interface, SPSS provides a syntax editor where users can
write and run commands using its proprietary syntax language. This is especially useful for
automating repetitive tasks.
7. Data Import and Export: SPSS supports a wide range of file formats, including Excel, CSV, and text
files, enabling users to import and export data seamlessly.
8. Extension and Customization: SPSS allows users to extend its functionality by installing additional
plug-ins or writing custom macros to automate tasks.
9. Reporting and Output: SPSS generates detailed output that can be easily interpreted and exported into
different formats like PDF, Word, or Excel for reporting purposes.
10. Cross-Platform Support: SPSS is available for both Windows and Mac, ensuring compatibility across
different operating systems.
11. Widely Used in Academia and Industry: SPSS is extensively used in academic research, market
research, healthcare studies, and by professionals who require in-depth statistical analysis and data
management.
12. Premium Software: Unlike open-source alternatives, SPSS is paid software with various licensing
options, though it offers a free trial for users to test out its features.
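The descriptive statistics listed under feature 2 above can be computed in a few lines, which may help clarify what a package like SPSS reports. The sketch uses Python's standard statistics module rather than SPSS syntax, and the scores are made-up sample data.

```python
import statistics

# Hypothetical sample of eight survey scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)        # arithmetic average
median = statistics.median(scores)    # middle value of the sorted data
mode = statistics.mode(scores)        # most frequent value
spread = statistics.pstdev(scores)    # population standard deviation

print(mean, median, mode, spread)
```

For this data the mean is 5, the median 4.5, the mode 4, and the population standard deviation 2.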

E-Views:
E-Views (officially styled EViews) is a statistical software package used primarily for econometric
analysis and time-series forecasting. It is popular among economists, researchers, and analysts for
modelling economic data. E-Views lets users perform advanced statistical operations through an
easy-to-use interface that simplifies tasks such as regression analysis, hypothesis testing, and data
visualization.

Key Features of E-Views:


1. Time-Series Analysis: E-Views is known for its comprehensive tools for time-series analysis,
including unit root tests, cointegration analysis, vector autoregressions (VAR), and forecasting
models.
2. Econometric Modelling: E-Views offers advanced econometric models such as autoregressive
conditional heteroskedasticity (ARCH), generalized method of moments (GMM), and simultaneous
equations models, making it ideal for economic research.
3. Data Import and Export: E-Views supports a wide variety of data formats (Excel, CSV, SAS, etc.)
and can import data from databases or external files. It also allows for exporting analysis results into
formats like Excel, Word, and PDF.
4. Regression Analysis: E-Views provides tools for linear and nonlinear regression analysis, including
ordinary least squares (OLS), generalized least squares (GLS), and logistic regression.
5. Statistical Tests: The software includes a range of hypothesis tests, such as t-tests and F-tests,
and diagnostics for checking model validity, such as autocorrelation and heteroskedasticity
tests.
6. Forecasting Tools: E-Views provides robust tools for forecasting, including exponential smoothing,
ARIMA models, and Monte Carlo simulations, often used in economic and financial forecasting.
7. Graphical Visualization: E-Views offers advanced graphing capabilities for visualizing data and
statistical outputs, such as line graphs, histograms, scatter plots, and time-series plots, to aid in
understanding trends and patterns.
8. Batch Processing and Automation: Users can automate repetitive tasks and analyses through
scripting in E-Views. The software supports a command-driven interface that enables users to
automate models, regression tests, and other complex tasks.
9. Panel Data Analysis: E-Views supports advanced techniques for panel data analysis, including fixed-
effects and random-effects models, which are useful for handling datasets with both cross-sectional
and time-series data.
10. User-Friendly Interface: While offering powerful statistical tools, E-Views provides a simple, user-
friendly interface, making it accessible to both beginner and advanced users in economics and
business analysis.
11. Comprehensive Documentation: E-Views offers detailed documentation and support, which
includes tutorials, guides, and a knowledge base to help users effectively utilize the software for
various analyses.
12. Cross-Platform Compatibility: E-Views runs on both Windows and Mac OS, ensuring broad
accessibility for users across different systems.
13. Economic and Financial Modelling: It is especially useful in financial and economic modelling,
offering a broad range of tools for analysing economic indicators, market data, and financial models.
14. Customizable Outputs: E-Views allows users to create customized reports and presentations,
making it easier to share the analysis with colleagues, clients, or stakeholders.
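Of the forecasting tools listed under feature 6 above, simple exponential smoothing is the easiest to show in code. The Python sketch below is an illustration of the general technique, not E-Views' implementation: the function name, the demand series, and the smoothing weight alpha = 0.5 are assumptions. Each smoothed level is a weighted average of the latest observation and the previous level, and the last level serves as the forecast for the next period.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation in the series."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")

    levels = [series[0]]  # start the level at the first observation
    for observation in series[1:]:
        levels.append(alpha * observation + (1 - alpha) * levels[-1])
    return levels


# Hypothetical demand figures for four periods.
demand = [10, 12, 14, 16]

levels = simple_exponential_smoothing(demand, alpha=0.5)
forecast = levels[-1]  # one-step-ahead forecast for the next period

print(levels)    # [10, 11.0, 12.5, 14.25]
print(forecast)  # 14.25
```

A larger alpha makes the forecast react faster to recent changes, while a smaller alpha gives a smoother series that responds more slowly.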

Meaning of Power BI:


Power BI (Business Intelligence) is a suite of business analytics tools developed by Microsoft. It is used
for transforming raw data into meaningful insights through interactive reports, dashboards, and data
visualizations. Power BI helps users make data-driven decisions by providing advanced analytics
capabilities and integrating data from a wide range of sources into one unified platform. It is widely used
in businesses for reporting, data visualization, and performance tracking.

Key Features of Power BI:


1. Data Visualization: Power BI offers a wide range of customizable visualizations such as bar charts,
line charts, pie charts, scatter plots, and geographic maps to present data in an easily digestible format.
Users can create interactive dashboards to monitor key metrics and trends.
2. Data Integration: Power BI allows users to connect and import data from a variety of sources,
including Excel, SQL Server, Google Analytics, Salesforce, and cloud-based data sources like Azure.
It also supports custom connectors for additional data sources.
3. Interactive Reports and Dashboards: Users can create interactive reports and dashboards that allow
for real-time exploration of the data. These reports can be filtered, drilled down, or sliced to view
different segments of the data.
4. Advanced Analytics: Power BI uses DAX (Data Analysis Expressions) for calculations and data
modelling and Power Query for data transformation. It also supports machine learning and
AI-driven insights, enabling users to analyse data at a deeper level.
5. Real-Time Data Monitoring: Power BI enables real-time data updates and streaming, which allows
users to monitor performance metrics and other KPIs as the data changes.
6. Natural Language Query: Power BI's Q&A feature allows users to ask questions in natural language
(e.g., "What were the sales last quarter?"), and it will generate the appropriate visualizations based
on the query.
7. Collaboration and Sharing: Reports and dashboards created in Power BI can be shared with team
members and stakeholders via email, embedded on websites, or through the Power BI Service,
enabling collaboration and decision-making across teams.
8. Mobile Access: Power BI has mobile applications for iOS, Android, and Windows that allow users
to access reports and dashboards from anywhere, making it easy to stay informed on the go.
9. Security and Access Control: Power BI includes robust security features, such as role-based access
control, data encryption, and row-level security, ensuring that sensitive data is protected and only
accessible to authorized users.
10. Custom Visualizations: In addition to the built-in visualizations, users can create custom visuals or
import third-party visuals from the Power BI marketplace, enabling further flexibility in data
representation.
11. Data Refresh: Power BI allows users to schedule data refreshes, ensuring that reports and dashboards
are always up-to-date with the latest data. This can be done automatically or manually, depending on
the user's preference.
12. Integration with Other Microsoft Tools: Power BI integrates seamlessly with other Microsoft tools
like Excel, SharePoint, and Teams, making it easier to share and collaborate on reports within
organizations that already use Microsoft products.
13. Cloud and On-Premises Deployment: Power BI offers both cloud-based services (Power BI
Service) and on-premises solutions (Power BI Report Server), allowing businesses to choose the
deployment model that suits their needs.
14. Customizable Dashboards: Users can design personalized dashboards by pinning important
visualizations and reports, providing a quick overview of critical data and metrics.
15. Power BI Desktop and Service: Power BI Desktop is a free application that allows users to create
reports and data models on their local machine, while Power BI Service is a cloud-based platform for
sharing, collaborating, and publishing reports.
16. AI-Powered Insights: Power BI uses artificial intelligence to provide automated insights, such as
identifying trends, forecasting, and anomaly detection, making it easier for users to extract value from
complex data sets.