The Apify Web Scraping Guide for E-commerce

Uploaded by

rvcdax
Copyright
© All Rights Reserved
The Apify web scraping guide for e-commerce

Why your business needs better data and how to get it

2022

Table of contents
1. What is web scraping?
2. Is web scraping legal?
3. What is scraped data?
4. Case study: price monitoring
5. Case study: product tracking
6. Case study: marketing research
7. Case study: brand sentiment
8. In-house vs. outsource?
10. Ready to get started?
1. What is web scraping?
Web scraping is a way of automatically extracting information from web pages. If you've ever copied text from a website and pasted it into a document, you were extracting that data. Web scraping uses bots to do the same thing, but much faster and more efficiently.

Web scrapers can extract huge amounts of information in seconds. And the data is delivered in machine-readable formats so that it can easily be used in spreadsheets, applications, and databases.

"Every time you open a web page and copy something from it into an Excel document or into notes, that counts as scraping. Web scraping is just another name for extracting publicly accessible information from websites."
Ondra Urban, Apify Head of Growth

Page 1 - The Apify web scraping guide for e-commerce


Unlock the potential of data on the web
The web is the greatest repository of knowledge and data in the history of humanity. But that information was designed to be read by human beings, not machines. Web scraping enables you to create rules for computers to access that data in an efficient and machine-readable way.

It is already impossible for humans to process even a fraction of the data on the web. We need machines to read that data for us so that we can use it in business, conservation, protecting human rights, fighting crime, and any number of projects that can benefit from the kind of data that the Internet is so good at accumulating.

To ignore the potential of web scraping is to ignore the potential of the web.

Web scraping allows you to collect structured data. Structured data just means that the information is easy for computers to read or add to a database.

Instead of relying on humans to read or process web pages, computers can rapidly use that data in lots of unexpected and useful ways. To illustrate the difference, imagine how long it might take you to manually copy and paste text from 100 web pages. A machine could do it in less than a second if you give it the correct instructions. It can also do it repeatedly, tirelessly, and at any scale.

Forget about 100 pages. A computer could deal with 1,000,000 pages in the time it would take you to open just the first few.
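
To make "structured data" concrete, here is a minimal Python sketch that takes a few scraped records in JSON and turns them into CSV that a spreadsheet can open. The records, field names, and URLs are invented for illustration and do not come from any real scraper output.

```python
import csv
import io
import json

# Hypothetical scraped records; the field names are assumptions for illustration.
raw = """[
  {"name": "Blue mug", "price": 7.5, "url": "https://example.com/p/1"},
  {"name": "Red mug", "price": 8.0, "url": "https://example.com/p/2"}
]"""


def json_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat objects into CSV text."""
    records = json.loads(json_text)
    # Use the keys of the first record as the column headers.
    fieldnames = list(records[0].keys())
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = json_to_csv(raw)
print(csv_text)
```

Because every record has the same named fields, the same few lines work whether there are two records or two million, which is the practical meaning of "easy for computers to read".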

2. Is web scraping legal?
Contrary to popular belief, there’s nothing shady or illicit about web scraping. Web scraping is legal.

That does not mean that any kind of web scraping is legal: there are boundaries. The most important boundaries are personal data and intellectual property regulations, but other factors, such as the website’s terms of service, can play a role as well.

Remember to always respect your target websites and use empathy to create ethical scrapers.

"The World Resources Institute asked Omdena to help identify economic and financial incentives for forest and landscape restoration in Latin America. One of the tasks was to get 1,000s of PDFs from over 100 government websites. The Omdena team solved the problem by using Apify to scrape Google Search Engine Results Pages (SERPs) and was able to download the entire 740,000 files in less than 14 hours."
Leo Sanchez, Head of Tech Partnerships at Omdena

Ethical web scraping as a force for good
Scraping personal data
Private personal data is always protected by law and should never be scraped. You only have lawful access to publicly available data. So always consider whether the information was publicly available and not behind a password authentication barrier.

Scraping copyrighted content
The scraping of copyrighted content is only permitted for the purposes of generating information. For example, you can scrape a web page to extract prices from it or books for natural language analysis, but you cannot scrape news articles and then republish them on your own website.

4 rules for ethical scraping
1. Scrape only public data.
2. Do not infringe on rights by scraping and using copyrighted data.
3. Do not seek to overburden the targeted website.
4. Do not use the information to steal market share.

Web scraping is legal
So, is web scraping legal? Yes, it is, but, like everything else, it can be used ethically or unethically. We believe in web scraping as a force for good. Apify helps rescue trafficked children, find lost dogs and even restore forests with web scraping. So it can't be all bad, can it?

3. What is scraped data?
Scraped data is any information you have retrieved from a web browser or user interface. This information could be anything from product items and price lists to photos and videos.

"I don’t steal personal data or trade secrets, I extract terabytes of public data and translate them into a language that machines can understand."
Ondra Urban, Apify Head of Growth

Scrapers extract data in structured formats such as XML, JSON, and Excel, so you can easily use that data in spreadsheets, databases, reports or in other apps.

Let's explore 4 different ways to use scraped data for business.

4 ways you can leverage scraped data
Competitor price monitoring
Online shopping makes it easy for potential customers to compare prices before paying for a product or service. With web scraping technology, you can quickly and efficiently research and price your products and services appropriately to maximize conversions.

Product tracking
Retailers and e-commerce enterprises use web scraping to track listings, products, and sales data from various online stores to see how different items and categories are performing for you and your competitors. You can then adjust your product range or use that data to target new audiences.

Market research
Scraped data can be used to audit popular e-commerce platforms and mobile apps for images, products, related keywords, and more to stay ahead of emerging trends and consumer habits. Learn what the market wants before your competitors.

Brand sentiment
Web scraping allows you to engage with your customers in real time so you can understand their side of the story. By scraping customer reviews and social media chats, you can create a data-driven strategy for understanding how to better serve your clients, or see where you need to do better.

4. Price monitoring
In the e-commerce industry, price monitoring is essential if you want to stay ahead of the competition.

If you are a retailer, price monitoring will give you greater control over the market and will give you an indicator of how competitive your products are.

If you are a brand, price monitoring will let you know how your products or services are positioned on the market.

"I really like the flexibility and power behind the Apify platform. It has everything you need to build customized web scrapers that work on even the most complicated websites. Support is excellent, with the team always on hand to answer any queries and great help site documentation."
Bryce Davies, Head of Growth at SeedLever

Why automate price monitoring?
E-commerce is a rapidly growing industry. There are now tens of millions of e-commerce websites around the world, all competing for customers. Price monitoring is critical for companies in this industry, and it is not surprising that successful e-commerce is becoming fully automated.

Web scraping makes it easy to simultaneously monitor millions of e-commerce websites and products in real-time. With web scraping technology you can find e-commerce products, explore social selling sites, automate research, and track competitors. This way you can streamline the monitoring of competitor prices and products so you can adjust your own pricing strategy and optimize your performance.

To put it simply, price scraping is what retailers and e-commerce businesses need to do in order to maximize their profits without pricing themselves out of the market.

Price scraping over extended periods of time can help uncover key strategies such as special offers during holiday seasons, know when to reduce prices and absorb losses on popular products to attract more customers, and set up pricing algorithms based on the selling history of a product.

Through price scraping, a business can create more revenue by setting up strategic pricing techniques to attract customers while keeping the average profit margins on your products the same.
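
As a rough illustration of what "adjusting your own pricing strategy" can start from, the Python sketch below compares one of your prices with scraped competitor prices for the same product. The function name, the sample prices, and the choice of metrics are assumptions made for this example, not part of any Apify tool.

```python
from statistics import median


def price_position(own_price: float, competitor_prices: list[float]) -> dict:
    """Summarize where our price sits relative to scraped competitor prices."""
    lowest = min(competitor_prices)
    mid = median(competitor_prices)
    return {
        "lowest_competitor": lowest,
        "median_competitor": mid,
        # Positive means we are more expensive than the cheapest competitor.
        "gap_to_lowest_pct": round((own_price - lowest) / lowest * 100, 1),
        "cheaper_than_median": own_price < mid,
    }


# Hypothetical prices for a single product, for illustration only.
report = price_position(21.0, [20.0, 22.0, 25.0])
print(report)
```

Run on a schedule against fresh scraped prices, a summary like this shows at a glance whether a product is drifting out of line with the market.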

CASE STUDY

• Nasdaq-listed
• E-commerce retailer
• Active in 150+ countries
• 10k+ product categories

Operating in a fiercely competitive market, a Nasdaq-listed e-commerce retailer looked to Apify to help give them a competitive edge. Selling across thousands of product categories, the retailer needed a way to collect product price data at scale to understand how they were performing against their competitors in real time.

The challenges
The retailer monitored the competition regularly but struggled to keep track of how they were performing across multiple product categories. Competitor websites were blocking web crawlers at high volumes, so data extraction would fail or become corrupted as data volume increased. This was a huge problem for the retailer, as their sales strategy depended on tracking the listing price of thousands of products across their markets.

Our approach
With no way to purchase market intelligence data, the retailer used Apify to collect product descriptions and real-time market pricing. Apify's crawlers were designed to behave like real users on the target site and proxies allowed them to scale their data extraction. The extracted data was served via API so that the data team could easily integrate it into their internal reporting systems.

The results
Apify helped the client extract product data from competing websites in real time and at scale, enabling them to monitor prices, availability and other marketplace data. By collecting data on thousands of products across the globe, the retailer was able to track how competitors were pricing their products and identify areas of improvement. This data now fuels their data team with critical pricing data, enabling them to create powerful insights and predictions for management.

Click these ready-to-use scrapers to try for free
Amazon Best Sellers Scraper


Scrapes the Amazon Best Sellers categories and extracts the top 100 most
popular items on Amazon. Download product name, price, URL, and
thumbnail image. Works on .com, .co.uk, .de, .fr, .es, and .it domains.
Download your data as HTML table, JSON, CSV, Excel, XML, and RSS feed.

eBay Scraper
Unofficial eBay API to extract data from eBay based on keywords or
categories. Scrape reviews, prices, product descriptions, images, location,
availability, brand, and more. Download extracted data in structured
format and use it in reports, spreadsheets, databases, and applications.

5. Product tracking
Product tracking is all about extracting product details such as price, description, images, and unique product codes from websites, and monitoring competitor performance in real time.

In the constantly changing and competitive e-commerce market, product trends can directly affect sales strategies.

You can use product tracking to identify new opportunities, optimize your online store, or just keep up with the competition.

"The Apify team was great to work with and they rolled out our scraper without any issues. The team is there whenever you contact them and their scraping solution delivers exactly the large amounts of precise data we need."
Zeyn Khan, E-Commerce Specialist, AutoMarine Online

Why automate product tracking?
Online businesses continuously change strategies and that can make it difficult to monitor your competitors and products. When it comes to staying ahead of the curve, the role retail data plays in the retail ecosystem has been increasing like never before. These days, data has become a multipurpose tool, assisting retailers big and small in specifying anything from ways to improve their product offerings to fresh ideas for creating new ones.

Web scraping makes it easy to monitor products on millions of sites in real-time and simultaneously. You can track listings, items, and sales data from various online stores to see how different products and categories are performing for you or your competitors.

Automated monitoring helps to benchmark the performance of your products at all times. It allows you to measure product performance throughout the day on a weekly, monthly, quarterly, or even an annual basis.

Scraping data makes it easy to find e-commerce products, explore social selling sites, automate research, and track competitors.

Web scraping gives you the power to automate product customization. You can adjust the range of products you offer, enter new niches, make sure that retailers are describing your products correctly, and understand exactly how your products fit into the e-commerce landscape.

Apify advantage 1
Collect data from any website
Start extracting unlimited amounts of structured data right away with our ready-to-use scraping tools or work with us to solve your unique use case. Fast, accurate results you can rely on.

CASE STUDY

• Nasdaq-listed
• E-commerce retailer
• Millions of customers worldwide
• Selling across thousands of product categories

In this constantly changing and competitive market, the e-commerce retailer enlisted Apify to help collect product descriptions across multiple websites including industry giants Google and Amazon. The data that the client needed to track included price, product descriptions, images and unique product codes.

The challenges
The retailer enlisted Apify to help collect product descriptions across multiple websites including industry giants Google and Amazon. They needed to extract a high quantity of data in the market. However, they found difficulties with extracting data using their previous solution. The two main sources (Google and Amazon) were updated regularly and needed constant maintenance to collect data at scale to understand how they were performing. The client's sales strategy depended on tracking thousands of new products that appear in the market and monitoring product trends.

Our approach
The client migrated from their previous AI-based solution, which was failing to accurately extract product data. Apify built bespoke web scrapers which automatically collect product data from Google and Amazon, monitoring competitor performance in real time. Products were matched via unique product codes to compare like products across platforms.

The results
Apify successfully built and implemented Google and Amazon scrapers which delivered product information at scale and with much higher accuracy and higher-quality data. Using this data, the client was able to understand which products were performing well. Apify tracked both the products as well as which competitors were selling them. Using the data collected, the client gathered insights and tracked trends related to new product development, informing company strategy and future product development.
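
The idea of matching products via unique product codes can be sketched in a few lines of Python. This is not the client's or Apify's actual implementation: the field name "gtin", the sample listings, and the output shape are all assumptions chosen for illustration.

```python
def match_by_code(source_a: list[dict], source_b: list[dict],
                  code_field: str = "gtin") -> list[dict]:
    """Pair up listings from two sources that share the same product code."""
    # Index the second source by product code, skipping listings without one.
    index_b = {item[code_field]: item for item in source_b if item.get(code_field)}
    matches = []
    for item in source_a:
        code = item.get(code_field)
        if code in index_b:
            other = index_b[code]
            matches.append({
                "code": code,
                "price_a": item["price"],
                "price_b": other["price"],
                "price_diff": round(item["price"] - other["price"], 2),
            })
    return matches


# Hypothetical listings scraped from two platforms.
platform_a = [{"gtin": "0001", "price": 19.99}, {"gtin": "0002", "price": 5.00}]
platform_b = [{"gtin": "0001", "price": 18.49}, {"gtin": "0003", "price": 9.00}]

matches = match_by_code(platform_a, platform_b)
print(matches)
```

Matching on a shared code rather than on product titles avoids false pairs when two platforms describe the same item differently.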

Click these ready-to-use scrapers to try for free
AliExpress Scraper
Scrape data from AliExpress. Extract descriptions, image, feedback,
questions, prices, and all other product details. You can specify country,
language, and region for shipping.

Google Shopping Scraper


Google Shopping Scraper extracts data from Google Shopping web site, in
any country domain using Google SERP. It scrapes the results on the first
result page.

6. Market research
Accurate and relevant information is the foundation of successful business ventures.

With web scraping, market researchers can get invaluable insights into market trends, competitor monitoring, research and development, and content development and analysis.

Use the vast amount of data on the web to gather information about existing and prospective customers, the competition, and your industry at large.

"Apify is a key critical component of our product. Their solution fit just right with our workflow, enabling us to provide better service for our customers. The best way to reach Apify is to use their chat."
CTO of AI-powered content agency

4 market insights from web scraping
Market trends
A term that has its roots in the financial sector, 'market trends' are crucial for any industry. A market trend is anything that alters the market in which a company operates. A trend could be as simple as customer preferences, or as far-reaching as artificial intelligence technology. Such changes move fast, so keeping up-to-date with industry trends is not easy. Web scraping gives you all the information you need in real time to ensure you don't fall behind.

Competitor monitoring
With scraping, it's easy to monitor the competition in real time. Use this information to create a comparison site or adjust your strategy.

Research and development
A well-structured cycle of research can reduce product post-launch problems. High-quality web data opens up new possibilities for research in every aspect of the cycle. This is why companies use web scraping to uncover new trends, train AI models, and reveal new knowledge.

Content development and analysis
With web scraping you can build new online services that collect a large amount of data from selected websites, forums, and news articles, and synthesize it in a digestible format. You can send instant notifications to users about high-value updates, stay ahead of emerging trends and aggregate useful content at scale.

Apify advantage 2
Automate any online process
Scale processes, robotize tedious tasks, and speed up workflows with flexible automation software. Automation that lets you work faster and smarter than your competitors with less effort.

CASE STUDY

• Human Coders
• IT training
• Based in Paris, France
• 85+ online courses

Human Coders is a group of passionate information technology professionals who offer a variety of IT courses, mostly geared towards developers who need to sharpen their skills as well as anyone looking to get started with a programming language/framework. The courses, held in French, mainly focus on the tools of the trade.

The challenges
When launching a new training course, Human Coders is always aware that it is not the only company in France covering that particular topic. Living up to its name, the company set out to make an automated, low maintenance way to keep an eye on its competitors' offerings. This would allow it to constantly adapt its own courses based on the data collected (duration of the course, SEO ranking of the competitors, prices, topics, etc.).

Our approach
Scraping the required information with Apify was very simple. Apify made it possible to focus on the essentials and not risk wasting time on sysadmin tasks. The scraping was fast enough for Human Coders to parse several thousand pages each week while keeping the resources within the bounds of the free account.

The results
Human Coders set up a series of serverless apps to run weekly on the Apify platform. These apps extract available data about all related courses offered in France. They target both their competitors as well as their own websites on a weekly basis. By using an online scraping and visualization tool, Human Coders doesn't have to spend time maintaining servers. Using Apify, it can instead focus on the implementation of core functionality.

Click these ready-to-use scrapers to try for free
Google Search Results Scraper


Contact Details Scraper


7. Brand sentiment
Web scraping can help you understand how people feel about your brand, what they say about it, and what you need to change so that you can improve engagement and loyalty. Social media has created new opportunities, but also increased the risks. Negative PR or sentiment online can erode your brand if you don't monitor sentiment.

Keeping an eye on how people are talking about your company is now a fundamental part of marketing strategizing. Web scraping can help you by automating the data collection process.

"Detecio uses various technologies and Apify is one of the most important and reliable. The stability of the Apify platform and well-documented interface enables easy integration with our internal systems."
Daniel Řezníček, Detecio co-founder

Measuring brand sentiment
Web scraping solutions allow you to monitor thousands of product pages and routinely extract product details, including ratings and reviews. Data can be collected regularly during the day, requiring extractions to run on schedule.

This process can be deeply integrated into your company’s internal tech stack so that you can quantify brand sentiment for a truly data-driven marketing strategy. Brand intelligence gives you a foundation upon which to build a targeted marketing strategy in order to generate new leads and sales. Web scraping gives you a chance to do it in real-time so you can act on your customers’ perceptions straight away.

There are lots of different reasons to use web scraping to track brand sentiment. Here are 3 good ones:
• Customer insight: understanding your customers, what they like and dislike about your product and your brand is key to building targeted marketing strategies.
• Competition monitoring: it’s good to keep an eye on what your competitors are doing and how they are relating to their customers.
• Real-time response: reacting promptly to feedback shows that you are present and that you care.

Apify advantage 3
Integrate with any system
Export your datasets in machine-readable formats like JSON and CSV. Apify gives you APIs to let you seamlessly integrate with databases and web apps such as Zapier and Make.
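
One simple way to start quantifying brand sentiment is to aggregate scraped star ratings per product. The Python sketch below assumes each review is a record with hypothetical "product" and "rating" fields, and treats ratings of 2 stars or fewer as negative. Both the field names and the threshold are illustrative choices, not a standard.

```python
from collections import defaultdict


def summarize_ratings(reviews: list[dict], negative_threshold: int = 2) -> dict:
    """Compute the average rating and share of negative reviews per product."""
    grouped = defaultdict(list)
    for review in reviews:
        grouped[review["product"]].append(review["rating"])

    summary = {}
    for product, ratings in grouped.items():
        # A review counts as negative if its rating is at or below the threshold.
        negative = sum(1 for r in ratings if r <= negative_threshold)
        summary[product] = {
            "reviews": len(ratings),
            "average_rating": round(sum(ratings) / len(ratings), 2),
            "negative_share": round(negative / len(ratings), 2),
        }
    return summary


# Hypothetical scraped reviews, for illustration only.
reviews = [
    {"product": "mug", "rating": 5},
    {"product": "mug", "rating": 4},
    {"product": "mug", "rating": 1},
    {"product": "kettle", "rating": 2},
    {"product": "kettle", "rating": 2},
]

summary = summarize_ratings(reviews)
print(summary)
```

Star ratings are only a proxy for sentiment. Analyzing the review text itself would need a language model or sentiment library, which is beyond this sketch.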

CASE STUDY

• AI-powered content agency
• Software marketing
• Based in the US & Canada
• 200+ employees

The client is a B2B content analysis agency with 200+ clients all over the world. Our client's solution is aimed at businesses who want to analyze user engagement on their websites. After the user behavior analysis, their mission was to figure out an algorithm to further recommend accurate, customized content of interest to these website visitors.

The challenges
The challenge before using the Apify platform was to get data from customer websites fast – thousands of web pages – and to be able to effortlessly integrate scraped results into the database. A smooth workflow like this would enable them to improve their understanding of the user behavior and bring maximum customer success to their work. Apify turned out to be the perfect solution to help this company advance their data extraction and scale up their analysis processes.

Our approach
It might come as a surprise, but their workflow with Apify APIs starts quite like that of any other user on the platform: choose a URL, edit the input parameters, run a task. Once all the necessary data is scraped, it gets effortlessly integrated into their content intelligence NoSQL database. The last and the best bit is that Apify APIs didn’t require any additional work to perform data integration.

The results
The features that turned out to be the most useful for our client's work are the webhooks that send the data over to the database. Eventually, they started needing more data than the free plan was able to provide and since their experience was positive, the company upgraded to a more advanced subscription plan. Overall, our client's use of and interest in the Apify platform has been growing over time. We’re now looking forward to a new level of cooperation, including developing a custom solution as well as new ways to help them scrape the web and amplify their content analysis services.

Click these ready-to-use scrapers to try for free

Twitter Scraper
Scrape any Twitter user profile. Creates an unofficial Twitter API to extract
tweets, retweets, replies, favorites, and conversation threads with no
Twitter API limits. Download your data as HTML table, JSON, CSV, Excel,
XML, and use it in spreadsheets, applications, reports, and databases.

YouTube Scraper

8. In-house vs. outsource
Many companies recognize the value of big data, and they are investing in web scraping and data extraction tools – and so should you. But should you build an engineering team dedicated to web scraping, or should you outsource it to specialists, such as Apify?

Having an in-house team of developers dedicated to web scraping might become a burden on you and your business if you haven’t made the proper considerations. Read on to find out about the risks and how you can do both with the Apify platform.

"After switching to Apify, we have time back to build our core product, develop exciting features for customers. The Apify team has been flexible and innovative in accommodating our unique data collection needs and maintains the high quality and consistency of data at scale."
Bria King, Senior Product Manager at Thorn

Know the risks before you choose
Ask yourself these questions before you decide on whether to do your web scraping in-house or outsource:
• How much web scraping experience do I currently have in my team?
• How important is uninterrupted data extraction to my business?
• How would web scraping downtime affect my business?
• Do I have a dedicated team or am I planning to hire one?
• What is the future scope and scale of the team's web scraping activities?
• How integral is web scraping to my business?

And then make sure you're familiar with the main risks involved with doing it all in-house.
• The volume of work tends to increase with time and your team might struggle to keep up with maintenance.
• So your engineers might have little time left to analyze the data, or work on new scrapers.
• Scrapers exist across different platforms and languages.
• Your engineering team might end up being unable to satisfy your business analytics team’s demand.
• Scaling web data extraction can quickly become complex and expensive.

Apify advantage 4
Scrapers that never get blocked
Smart proxy rotation of datacenter and residential proxies combine with industry-leading browser fingerprinting research to make our bots indistinguishable from humans.

Best of both with the Apify platform
There are advantages to doing your web scraping in-house, especially if you already have a team of experienced devs or an existing scraping solution. But there are parts of the process that you can outsource, while still developing and maintaining your own scrapers.

The Apify platform gives you the freedom to create in-house scrapers, but run them on a robust, scalable, and reliable scraping platform.

You will own your code so what you scrape and how you scrape is up to you. We'll just take care of the vital infrastructure.

Alternatively, you can get a complete end-to-end scraping solution from Apify's world-class experts. With a full web scraping solution from Apify, you get:
• Service-level agreement
• Your own Apify technical manager
• Dedicated development team
• Ongoing monitoring system
• Long-term maintenance
• Rich API integrations
• Data delivery direct to your systems

So create your own web scraping solution or rely on us to create and maintain it. Apify can help you with both.

Apify advantage 5
Rich developer ecosystem
Apify is built on solid open-source tools so don’t worry about vendor lock-in. And you can take advantage of a thriving and skilled community of Apify Freelancers and partners.

10. Ready to get started?
Explore hundreds of ready-to-use scrapers on Apify Store or submit your project if you're ready to get started.

"To make automation possible and strive towards the goal of the open web as a public good and a basic right for everyone, people need tools to extract structured data from the web and automate workflows on it."
Jan Curn, Apify CEO
