4 Design and Development
Fig. 4.1.1 represents the workflow of an online web scraper built using Python. The
system extracts data from multiple websites and stores it either in a database or in a file.
The components involved in this process include:
● Web Sites (Web Site 1, Web Site 2, Web Site 3): These are the target websites
from which data is to be scraped. The scraper will access these websites to gather
the necessary information.
● Scraping Script: The core component of the system, the scraping script, is written
in Python. It is responsible for sending HTTP requests to the target websites,
parsing the HTML content of the web pages to extract the required data, handling
any errors or exceptions that occur during scraping, and transforming the
extracted data into a structured format; a minimal sketch of such a script is shown
after this list.
● Database: The extracted data can be stored in a database for easy retrieval and
querying. This can be implemented with database management systems such as
MySQL, PostgreSQL, or MongoDB.
● File: Alternatively, the data can be saved in a file, such as a CSV or JSON file.
This is useful for smaller datasets or when a database is not necessary.
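As an illustration of the scraping script described above, the following is a minimal sketch using the requests and BeautifulSoup libraries. The target URLs, CSS selector, and field names are placeholders, since the report does not specify the actual sites or page structure.

```python
import requests
from bs4 import BeautifulSoup

# Placeholder URLs; the real Web Site 1-3 addresses are project-specific.
TARGET_URLS = [
    "https://example.com/site1",
    "https://example.com/site2",
    "https://example.com/site3",
]

def scrape_site(url):
    """Download one page and return the extracted records as dictionaries."""
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # treat HTTP error codes as failures
    except requests.RequestException as exc:
        # Error handling: log the problem and skip this site.
        print(f"Failed to fetch {url}: {exc}")
        return []

    soup = BeautifulSoup(response.text, "html.parser")
    records = []
    # The tag and class name below are assumptions; adapt them to the real pages.
    for item in soup.select("div.item"):
        records.append({
            "title": item.get_text(strip=True),
            "source": url,
        })
    return records

if __name__ == "__main__":
    all_records = []
    for url in TARGET_URLS:
        all_records.extend(scrape_site(url))
    print(f"Extracted {len(all_records)} records")
```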
The workflow starts with the scraping script sending requests to the specified
websites (Web Site 1, Web Site 2, Web Site 3). The HTML content of these websites is
downloaded and parsed by the scraping script. The relevant data is extracted and
processed. The processed data is then saved either to a database, which suits complex
queries and large datasets, or to a file for simpler use cases and smaller datasets. The
system can be extended as requirements change, for example by adding more target
websites or introducing more sophisticated data processing or storage mechanisms.
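To make the two storage options concrete, the following is a minimal sketch that writes the extracted records either to a SQLite database (standing in for MySQL, PostgreSQL, or another DBMS) or to a CSV/JSON file. The table name, column names, and file names are illustrative assumptions that follow the record layout used in the scraping sketch.

```python
import csv
import json
import sqlite3

def save_to_database(records, db_path="scraped_data.db"):
    """Store records in a SQLite table for retrieval and querying."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS items (title TEXT, source TEXT)")
    conn.executemany(
        "INSERT INTO items (title, source) VALUES (:title, :source)", records
    )
    conn.commit()
    conn.close()

def save_to_file(records, path="scraped_data.csv"):
    """Store records in a CSV or JSON file, chosen by the file extension."""
    if path.endswith(".json"):
        with open(path, "w", encoding="utf-8") as f:
            json.dump(records, f, indent=2)
    else:
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=["title", "source"])
            writer.writeheader()
            writer.writerows(records)
```

In practice the choice between the two functions mirrors the trade-off described above: the database path supports querying and larger volumes, while the file path keeps small extractions simple.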
This flowchart (Fig. 4.1.2) illustrates the sequential steps involved in the operation of
an online web scraper using Python. The process includes downloading the contents of
web pages, extracting the necessary data, storing the data, and analyzing the data. The
detailed steps are as follows:
● Downloading the Contents: The first step involves sending HTTP requests to the
target websites and downloading the HTML content of the web pages. This is the
initial step where the scraper accesses the web pages to gather the required data.
● Extracting the Data: Once the HTML content is downloaded, the next step is to
parse this content and extract the relevant data. This involves identifying and
extracting specific pieces of information from the web pages based on the defined
requirements.
● Storing the Data: After extracting the data, it needs to be stored in a structured
format. The data can be saved in a database for easy retrieval and complex
queries, or in a file (such as CSV or JSON) for simpler use cases and smaller
datasets.
● Analyzing the Data: The final step involves analyzing the stored data to derive
meaningful insights, as illustrated in the sketch below.
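As a simple illustration of this analysis step, the following sketch assumes the CSV file and columns ("title", "source") produced by the storage sketch above and uses pandas to derive basic summaries.

```python
import pandas as pd

# Load the CSV written in the storage step (file name and columns are assumptions).
df = pd.read_csv("scraped_data.csv")

# How many records were extracted from each source website.
print(df["source"].value_counts())

# A basic derived insight: average title length per source.
df["title_length"] = df["title"].str.len()
print(df.groupby("source")["title_length"].mean())
```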
This flowchart provides a clear visual representation of the entire web scraping
process, from initial data acquisition to final data analysis. Each step is crucial to ensure
the accuracy and usefulness of the extracted data.