CH 6 Web Mining and Other Data Mining
1. Finds Patterns
o Data mining techniques help web mining uncover hidden
patterns in web data (such as the terms users search for most often).
2. Web Personalization
o News or shopping sites change what you see based on
your interests and past behavior.
3. Clickstream Analysis
o Tracks how users click through a website to improve
design and navigation.
4. Online Marketing
o Helps show targeted ads by analyzing user preferences
and browsing history.
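
As an illustration of clickstream analysis, the sketch below counts how often visitors move from one page to the next. The session data and the function name count_transitions are hypothetical, made up for this example; real clickstreams would come from logs.

```python
from collections import Counter

def count_transitions(sessions):
    """Count how often users move from one page to the next."""
    transitions = Counter()
    for pages in sessions:
        # Pair each page with the page visited immediately after it.
        for src, dst in zip(pages, pages[1:]):
            transitions[(src, dst)] += 1
    return transitions

# Hypothetical sessions: each list is one visitor's ordered page views.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products", "/home"],
    ["/home", "/search", "/products", "/cart"],
]

transitions = count_transitions(sessions)
for (src, dst), n in transitions.most_common(3):
    print(f"{src} -> {dst}: {n}")
```

Frequent transitions hint at the paths users actually follow, which is the kind of evidence used to improve design and navigation.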
2. Content Recommendation
o Websites like YouTube or Amazon use content mining to
recommend videos or products based on user behavior.
3. News Aggregation
o News websites use content mining to gather and display
the latest articles and reports on various topics.
3. Types of Links:
o Intra-page links (links that point to another part of the same page)
o Inter-page links (links that connect one page to a different page)
4. Purpose:
o To understand how web pages are connected
o To find important pages or popular websites
5. Techniques Used:
o Graph theory (web as a graph of nodes and links)
o PageRank algorithm (used by Google to rank pages)
6. Applications:
o Search engine optimization (SEO)
o Finding hubs and authorities on the web
o Improving website navigation
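
The graph view and the PageRank idea listed above can be sketched as follows. This is a simplified power-iteration version, not Google's production algorithm; the four-page link graph, the damping factor of 0.85, and the iteration count are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages in a link graph with simple power iteration."""
    pages = list(links)
    n = len(pages)
    # Start with rank spread evenly over all pages.
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page passes its rank equally to the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical site: keys are pages (nodes), values are the pages they link to.
links = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home", "contact"],
    "contact": ["home"],
}

ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

Pages that receive links from many other pages, or from highly ranked pages, end up with higher scores.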
3. Main Purpose:
o a) Improve user experience
o b) Provide personalized content
o c) Optimize website structure and services
4. Data Sources:
o a) Web server logs
o b) Browser cookies
o c) Web application logs
6. Applications:
o a) Product recommendations (e.g., Amazon)
o b) Targeted advertising
o c) Detecting suspicious or fraudulent activity
o d) Website improvement and design
o URL requested
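
To show how fields such as the requested URL can be pulled out of a web server log, here is a minimal sketch. It assumes the log uses the Common Log Format; the sample line and the function name parse_log_line are made up for illustration, and real servers may log in other formats.

```python
import re

# Common Log Format: host ident user [time] "method url protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# Hypothetical log entry (the IP address is from a documentation-only range).
sample = '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /products/42 HTTP/1.1" 200 2326'

entry = parse_log_line(sample)
print(entry["host"], entry["url"], entry["status"])
```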
1. Large Size:
2. Noisy Data:
3. User Identification:
4. Session Identification:
5. Privacy Concerns:
o Logs may contain sensitive user data, which must be
handled carefully to avoid privacy issues.
6. Time Synchronization:
7. Incomplete Data:
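
Of the challenges listed above, session identification is often approached with an inactivity timeout: a long gap between two requests starts a new session. The sketch below assumes a 30-minute timeout, which is a common convention rather than a fixed rule; the timestamps and the function name split_sessions are hypothetical.

```python
from datetime import datetime, timedelta

def split_sessions(timestamps, timeout=timedelta(minutes=30)):
    """Group one user's request times into sessions using an inactivity timeout."""
    sessions = []
    for ts in sorted(timestamps):
        # Continue the current session if the gap since the last request is small.
        if sessions and ts - sessions[-1][-1] <= timeout:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions

# Hypothetical request times for a single user on one day.
requests = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 9, 10),
    datetime(2024, 1, 1, 9, 25),
    datetime(2024, 1, 1, 11, 0),
    datetime(2024, 1, 1, 11, 5),
]

user_sessions = split_sessions(requests)
print(len(user_sessions), [len(s) for s in user_sessions])
```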
1. Definition:
Temporal Mining is the process of finding patterns or trends in
data that change over time.
2. Keyword:
The word "temporal" means related to time.
3. Purpose:
To discover time-based patterns like:
4. Examples:
6. Applications:
o Weather prediction
7. Tools/Techniques Used:
o Data mining algorithms
o Time-series analysis
o Pattern recognition
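
As a small example of time-series analysis, a moving average smooths short-term noise so that a trend over time is easier to see. The sales figures and the window size of 3 below are made-up values for illustration.

```python
def moving_average(values, window):
    """Return the average of each consecutive run of `window` values."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical daily sales for one week.
daily_sales = [10, 12, 11, 15, 18, 17, 21]

trend = moving_average(daily_sales, 3)
print([round(x, 2) for x in trend])
```

If the smoothed values rise steadily, the data shows an upward trend even though individual days go up and down.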
1. Definition:
Spatial Data Mining is the process of finding patterns or
knowledge from data that is related to geographical or spatial
locations.
2. Keyword:
The word "spatial" means related to space or location (like
maps, GPS data).
3. Purpose:
To discover interesting patterns, relationships, or trends in
data that involve location or distance.
4. Examples:
6. Applications:
o Urban planning
o Disaster management
o Location-based marketing
o Environmental monitoring
7. Techniques Used:
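
Because spatial mining works with location and distance, one basic building block is the distance between two points on a map. The sketch below uses the haversine formula; it assumes coordinates are given as latitude and longitude in degrees and treats the Earth as a sphere of radius 6371 km, which is an approximation.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

# Two points on the equator, 90 degrees of longitude apart.
distance = haversine_km(0.0, 0.0, 0.0, 90.0)
print(f"{distance:.1f} km")
```

Distances like this can feed tasks such as finding nearby customers or grouping locations that lie close together.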
2. Keyword:
The word "multimedia" means multiple types of media (not
just text).
3. Purpose:
To understand, organize, and make use of non-text data in
large multimedia databases.
5. Examples:
6. Applications:
7. Techniques Used:
o Pattern recognition
o Image processing
2. Real-Time Processing
3. Fraud Detection
5. Healthcare Systems
6. Telecommunication
8. Cybersecurity
2. Open Source
3. Scalable
4. Fault Tolerant
o If one computer (node) fails, Hadoop recovers the data
automatically from replicated copies stored on other nodes.
5. Distributed Processing
8. Cost-Effective
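
Hadoop's distributed processing follows the MapReduce model: work is split into a map step, a grouping (shuffle) step, and a reduce step. The sketch below only imitates those three steps in plain Python on one machine to show the idea. It does not use Hadoop's API, and the function names and input text are hypothetical.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce: add up the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

# Hypothetical input, split into chunks as if stored on different machines.
chunks = ["big data needs big storage", "hadoop stores big data"]

mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = reduce_phase(shuffle(mapped))
print(counts)
```

In a real cluster each chunk would be mapped on the machine that stores it, which is what lets the work scale across many computers.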