Experiments
Dataset Description
Our partner company provided the dataset we used.
The data comes from an e-commerce website that sells
household goods. The dataset contains users' pageview,
purchase, and shopping cart data. We discarded sessions
that consist only of pageviews. The dataset contains
490817 interactions (pageviews, purchases, and shopping
cart events) across 38117 user sessions, as summarized in
Table 1. Both purchase and shopping cart sessions contain
Figure 3: Model Architecture
Training Procedures
We split both the purchase data and the shopping cart
data with respect to time. We used the first 14 months of
data for training, the next 2 months for validation, and
the last 2 months for testing. During training, validation,
and testing, we predicted the last item in each user
session from the previous items. We used shopping cart
sessions only in training and validation, not in testing.
We used purchase sessions in training, validation, and
testing. We tuned the hidden dimension of the
transformer encoder over the values [8, 16, 32, 64,
128, 256] and the L2 regularization penalty over the
values [0.1, 0.001, 0.0001, 0.00001]. We set both the
number of transformer blocks and the number of attention
heads to 2 for a fair comparison with the benchmarks
(BERT4Rec, SASRec).
For the benchmarks (BERT4Rec, SASRec), we tuned the
hyperparameters according to the descriptions in the
respective papers, or we used the recommended
parameters. Results for all models are reported under
their best hyperparameter settings.
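
The time-based split, the last-item prediction target, and the search grid described in this section can be sketched roughly as follows. This is a minimal illustration, not the implementation used in the experiments. The boundary dates `TRAIN_END` and `VALID_END`, the function names, and the choice to assign a session by its start time are all assumptions, since the text gives only the 14/2/2-month proportions.

```python
from datetime import datetime
from itertools import product

# Hypothetical boundaries: the text states a 14/2/2-month split but not the
# calendar dates, so these values are placeholders.
TRAIN_END = datetime(2020, 3, 1)   # end of the first 14 months (assumed date)
VALID_END = datetime(2020, 5, 1)   # end of the next 2 months (assumed date)


def assign_split(session_start, kind):
    """Return 'train', 'valid', 'test', or None for one session.

    kind is 'purchase' or 'cart'. Shopping cart sessions are used only for
    training and validation, so a cart session in the test period gets None.
    Assigning by session start time is an assumption of this sketch.
    """
    if session_start < TRAIN_END:
        return "train"
    if session_start < VALID_END:
        return "valid"
    return "test" if kind == "purchase" else None


def last_item_example(items):
    """Split a session into (history, target): the last item is the target."""
    if len(items) < 2:
        raise ValueError("a session needs at least two items")
    return items[:-1], items[-1]


# Search grid as reported in the text; blocks and heads are fixed at 2.
HIDDEN_DIMS = [8, 16, 32, 64, 128, 256]
L2_PENALTIES = [0.1, 0.001, 0.0001, 0.00001]
GRID = [
    {"hidden_dim": d, "l2": l2, "num_blocks": 2, "num_heads": 2}
    for d, l2 in product(HIDDEN_DIMS, L2_PENALTIES)
]
```

Under these assumptions, each session is routed to a split by `assign_split`, turned into a (history, target) pair by `last_item_example`, and every configuration in `GRID` is trained and then compared on the validation split.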