Classn 439
K. R. Suneetha, R. Krishnamoorthi
ABSTRACT
Many industries are shifting their focus toward customer orientation in order to retain
regular, frequent visitors and improve customer relationship management. Studying
interested web users provides valuable information that lets web designers respond
quickly to their individual needs. Instead of tracking the behavior of all users in order
to redesign a web site, the model proposed in this paper extracts only a focused group of
interested users. The proposed model consists of two phases. In the first phase, the web
server log data is preprocessed to extract useful data from the raw web log. In the second
phase, the data is classified using an enhanced version of the C4.5 decision tree
algorithm. Experiments on NASA web server data show reduced execution time and
memory utilization with high accuracy.
2 RELATED WORK
Attributes  Description
A1          total session time: 15-30 min.
A2          total time a user stays at the site: > 30 secs
A3          total number of accessed pages during the whole session: > 5 pages
A4          access methods used to interact with the site: GET, POST
A5          Depth Wise Access from particular page (DWA)

We subjectively identify users who have some

Figure 2: Decision Tree Generation
IU: Interested Users
NIU: Not Interested Users

Decision Rules:
Rule1: If (Time spent <30 and No. of pages referred <5 and Method used GET and DepthwiseReference=’NO’) = “NIU”
Rule2: If (Time spent <30 and No. of pages referred <5 and Method used GET and DepthwiseReference=’YES’) = “IU”
Rule3: If (Time spent <30 and No. of pages referred <5 and Method used POST) = “IU”
Rule4: If (Time spent <30 and No. of pages referred >5 and Method used GET and DepthwiseReference=’NO’) = “NIU”
Rule5: If (Time spent <30 and No. of pages referred >5 and Method used GET and DepthwiseReference=’YES’) = “IU”
Rule6: If (Time spent <30 and No. of pages referred >5
Rule7: If (Time spent >30 and No. of pages referred <5

users and interested users with Data Base Size. Fig. 5 indicates the variation between
the entries of unique and interested users with sessions. The percentage reduction in
Data Base size of interested users relative to unique users is shown in Fig. 6. This shows
that a small training dataset (the interested user group) achieves high accuracy for the
most important class of users, those with purchase interest.

[Figure: Entries and IP Address Ratio (Entries Vs IP Address)]
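As an illustration only, the five decision rules that appear in full above (Rule1 to Rule5) can be sketched in Python. The `Session` class, its field names and the function `classify_short_session` are hypothetical names introduced for this sketch and do not come from the paper. Rule6 and Rule7 are cut off in the text, so the sketch does not guess their outcomes and returns `None` for any case the complete rules do not cover, including exactly 5 pages, which the rules leave unspecified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    # Hypothetical container; field names are assumptions for this sketch.
    time_spent: float          # same unit as the "Time spent" threshold of 30
    pages_referred: int        # number of pages referred in the session
    method: str                # access method: "GET" or "POST"
    depthwise_reference: bool  # True for DepthwiseReference='YES'


def classify_short_session(s: Session) -> Optional[str]:
    """Apply Rule1 to Rule5 only; return "IU", "NIU", or None if uncovered."""
    if s.time_spent >= 30:
        return None  # Rule1-Rule5 all require Time spent < 30
    if s.pages_referred == 5:
        return None  # the rules state <5 or >5; exactly 5 is not specified
    if s.method == "GET":
        # Rule1/Rule4 (DepthwiseReference='NO') -> NIU,
        # Rule2/Rule5 (DepthwiseReference='YES') -> IU
        return "IU" if s.depthwise_reference else "NIU"
    if s.method == "POST" and s.pages_referred < 5:
        return "IU"  # Rule3
    return None  # e.g. POST with >5 pages: Rule6 is truncated in the text


if __name__ == "__main__":
    print(classify_short_session(Session(12, 3, "GET", False)))
    print(classify_short_session(Session(12, 8, "GET", True)))
```

Under Rule1 to Rule5, the outcome for GET sessions depends only on the depth-wise reference, which is why the sketch collapses those four rules into a single branch.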