International Conference on
Advanced Computer Science and Information System
(ICACSIS 2014)
Hotel Ambhara, Jakarta
October 18th - 19th, 2014
Copyright
Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or
promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any
copyrighted component of this work in other works must be obtained from Faculty of Computer Science, Universitas
Indonesia, Indonesia.
Contacts
ICACSIS Committee
Email: [email protected]
Phone: +62 21 786 3419 ext. 3225
Committee
Honorary Chairs
A. Jain, Fellow IEEE, Michigan State University, US
T. Fukuda, Fellow IEEE, Nagoya-Meijo University, JP
M. Adriani, Universitas Indonesia, ID
General Chairs
E. K. Budiardjo, Universitas Indonesia, ID
D.I. Sensuse, Universitas Indonesia, ID
Z.A. Hasibuan, Universitas Indonesia, ID
Program Chairs
H.B. Santoso, Universitas Indonesia, ID
W. Jatmiko, Universitas Indonesia, ID
A. Buono, Institut Pertanian Bogor, ID
D.E. Herwindiati, Universitas Tarumanagara, ID
Section Chairs
K. Wastuwibowo, IEEE Indonesia Section, ID
Publication Chairs
A. Wibisono, Universitas Indonesia, ID
Program Committees
A. Azurat, Universitas Indonesia, ID
A. Fanar, Lembaga Ilmu Pengetahuan Indonesia, ID
A. Kistijantoro, Institut Teknologi Bandung, ID
A. Purwarianti, Institut Teknologi Bandung, ID
A. Nugroho, PTIK BPPT, ID
A. Srivihok, Kasetsart University, TH
A. Arifin, Institut Teknologi Sepuluh Nopember, ID
A.M. Arymurthy, Universitas Indonesia, ID
A.N. Hidayanto, Universitas Indonesia, ID
B. Wijaya, Universitas Indonesia, ID
B. Yuwono, Universitas Indonesia, ID
B. Hardian, Universitas Indonesia, ID
B. Purwandari, Universitas Indonesia, ID
B.A. Nazief, Universitas Indonesia, ID
B.H. Widjaja, Universitas Indonesia, ID
Denny, Universitas Indonesia, ID
D. Jana, Computer Society of India, IN
E. Gaura, Coventry University, UK
E. Seo, Sungkyunkwan University, KR
F. Gaol, IEEE Indonesia Section, ID
H. Manurung, Universitas Indonesia, ID
H. Suhartanto, Universitas Indonesia, ID
H. Sukoco, Institut Pertanian Bogor, ID
I. Budi, Universitas Indonesia, ID
I. Sitanggang, Institut Pertanian Bogor, ID
I. Wasito, Universitas Indonesia, ID
K. Sekiyama, Nagoya University, JP
L. Stefanus, Universitas Indonesia, ID
Marimin, Institut Pertanian Bogor, ID
M.T. Suarez, De La Salle University, PH
M. Fanany, Universitas Indonesia, ID
M. Kyas, Freie Universität Berlin, DE
M. Nakajima, Nagoya University, JP
M. Widyanto, Universitas Indonesia, ID
M. Widjaja, PTIK BPPT, ID
N. Maulidevi, Institut Teknologi Bandung, ID
O. Sidek, Universiti Sains Malaysia, MY
O. Lawanto, Utah State University, US
P. Hitzler, Wright State University, US
P. Mursanto, Universitas Indonesia, ID
S. Bressan, National University of Singapore, SG
S. Kuswadi, Institut Teknologi Sepuluh Nopember, ID
S. Nomura, Nagaoka University of Technology, JP
S. Yazid, Universitas Indonesia, ID
T. Basaruddin, Universitas Indonesia, ID
T. Hardjono, Massachusetts Institute of Technology, US
T. Gunawan, Int. Islamic University Malaysia, MY
T.A. Masoem, Universitas Indonesia, ID
V. Allan, Utah State University, US
W. Chutimaskul, King Mongkut’s Univ. of Technology, TH
W. Molnar, Public Research Center Henri Tudor, LU
W. Nugroho, Universitas Indonesia, ID
W. Prasetya, Universiteit Utrecht, NL
W. Sediono, Int. Islamic University Malaysia, MY
W. Susilo, University of Wollongong, AU
W. Wibowo, Universitas Indonesia, ID
X. Li, The University of Queensland, AU
Y. Isal, Universitas Indonesia, ID
Y. Sucahyo, Universitas Indonesia, ID
Faculty of Computer Science - Universitas Indonesia ©2014
Evaluation on People Aspect in Knowledge Management System Implementation: A Case Study of Bank Indonesia
Denny
Page(s): 10-15
Multicore Computation of Tactical Integration System in the Maritime Patrol Aircraft using Intel Threading Building Block
Government Knowledge Management System Analysis: Case Study Badan Kepegawaian Negara
Forecasting the Length of the Rainy Season Using Time Delay Neural Network
Hybrid Sampling for Multiclass Imbalanced Problem: Case Study of Students' Performance Prediction
Porawat Visutsak
Page(s): 52-56
Pierre Sauvage
Page(s): 64-71
Andika Yudha Utomo, Afifa Amriani, Alham Fikri Aji, Fatin Rohmah Nur Wahidah, Kasiyah M. Junus
Page(s): 72-78
Model Prediction for Accreditation of Public Junior High School in Bogor Using Spatial Decision Tree
Application of Decision Tree Classifier for Single Nucleotide Polymorphism Discovery from Next-Generation Sequencing Data
Quality Evaluation of Airline’s E-Commerce Website, A Case Study of AirAsia and Lion Air Websites
A comparative study of sound sources separation by Independent Component Analysis and Binaural Model
Enhancing Reliability of Feature Modeling with Transforming Representation into Abstract Behavioral Specification (ABS)
Classification of Campus E-Complaint Documents using Directed Acyclic Graph Multi-Class SVM Based on Analytic Hierarchy Process
Knowledge Management System Development with Evaluation Method in Lesson Study Activity
Implementation of Steganography using LSB with Encrypted and Compressed Text using TEA-LZW on Android
Ledya Novamizanti
Page(s): 138-143
Hotspot Clustering Using DBSCAN Algorithm and Shiny Web Framework
Framework Model of Sustainable Supply Chain Risk for Dairy Agroindustry Based on Knowledge Base
Winnie Septiani
Page(s): 148-154
A
Achmad Benny Mutiara 467-471
Achmad Nizar Hidayanto 425-430
Adhi Kusnadi 171-176
Aditia Ginantaka 354-360
Afifa Amriani 72-78
Agus Buono 29-34
Agus Widodo 256-261
Ahmad Eries Antares 171-176
Ahmad Nizar Hidayanto 295-300
Ahmad Tamimi Fadhilah 269-276
Aini Suri Talita 467-471
Ajeng Anugrah Lestari 301-306
Ala Ali Abdulrazeg 124-129
Albertus Sulaiman 415-419
Alexander Agung Santoso Gunawan 237-240
Alfan Presekal 312-317
Alham Fikri Aji 72-78
Amalia Fitranty Almira 29-34
Anang Kurnia 342-347
Andika Yudha Utomo 72-78
Andreas Febrian 492-497
Aniati Murni Arymurthy 79-84, 216-221, 425-430
Anthony J.H. Simons 231-236
Anto S Nugroho 177-181
Arief Ramadhan 289-294
Arin Karlina 204-209
Ario Sunar Baskoro 227-230
Armein Z.R. Langi 118-123
Audrey Bona 41-46
Ayu Purwarianti 371-375
Aziz Rahmad 182-186
Azizah Abdul Rahman 130-137
Azrifirwan 388-393
B
Bagus Tris Atmaja 94-98
Bambang Sridadi 16-21
Bayu Distiawan Trisedya 57-63
Belawati Widjaja 256-261
Belladini Lovely 318-323
Bob Hardian 410-414
Boudraa Bachir 47-51
C
Chanin Wongyai 210-215
Cliffen Allen 376-381
D
Dana Indra Sensuse 22-28, 289-294
Darius Andana Haris 376-381, 438-445
Darmawan Baginda Napitupulu 420-424
Dean Apriana Ramadhan 382-387
Denny 10-15
Devi Fitrianah 425-430
Diah E. Herwindiati 431-437
Dwi Hendratmo Widyantoro 324-329
Dyah E. Herwindiati 450-454
E
Elfira Febriani 262-268
Elin Cahyaningsih 22-28
Endang Purnama Giri 79-84, 216-221
Enrico Budianto 492-497
Eri Prasetio Wibowo 467-471
Eric Punzalan 155-160
F
Fadhilah Syafria 336-341
Fajar Munichputranto 262-268
Fajri Koto 193-197
Farah Shafira Effendi 90-93
Faris Al Afif 484-491
Fatin Rohmah Nur Wahidah 72-78
Febriana Misdianti 330-335
Firman Ardiansyah 204-209
G
Gladhi Guarddin 312-317
H
Hamidillah Ajie 251-255
Harish Muhammad Nazief 312-317
Harry Budi Santoso 402-409
Hemis Mustapha 47-51
Herman Tolle 472-477
Heru Sukoco 367-370
Husnul Khotimah 461-466
I
I Made Tasma 85-89
Ida Bagus Putu Peradnya Dinata 410-414
Ika Alfina 90-93
Ikhsanul Habibie 361-366, 492-497
Ikhwana Elfitri 307-311
Imaduddin Amin 324-329
Imam Cholissodin 105-111
Imas Sukaesih Sitanggang 166-170
Indra Budi 256-261
Indriati 105-111
Irsyad Satria 342-347
Issa Arwani 105-111
Ito Wasito 446-449
Iwan Aang Soenandi 283-288
J
Janson Hendryli 431-437
Jean-Marc Salotti 41-46
Jeanny Pragantha 376-381
Joel Ilao 155-160
John Derrick 231-236
Junaidy Budi Sanger 367-370
K
Karlina Khiyarin Nisa 144-147
Kasiyah M. Junus 72-78
Kyoko Ito 112-117
L
Lailan Sahrina Hasibuan 222-226
Ledya Novamizanti 138-143
M
M Anwar Ma'sum 394-401
M. Anwar Ma'sum 484-491, 492-497
M. Iqbal Tawakal 484-491
Maria Ulfah Siregar 231-236
Maya Kurniawati 105-111
Meidy Layooari 177-181
Mira Suryani 402-409
Mohammad Uliniansyah 177-181
Muhammad Abrar Istiadi 85-89
Muhammad Asyhar Agmalaro 29-34
Muhammad Faris Fathoni 16-21
Muhammad Iqbal 467-471
Muhammad Irfan Fadhillah 99-104
Muhammad Octaviano Pratama 289-294
Muhammad Rifki Shihab 295-300, 301-306, 330-335
Muhammad Sakti Alvissalim 198-203
Murein Miksa Mardhia 118-123
N
Ni Made Satvika Iswari 171-176
Nina Hairiyah 262-268
Nuanwan Soonthornphisaj 35-40
Nursidik Heru Praptono 425-430
P
Pauzi Ibrahim Nainggolan 161-165
Pierre Sauvage 64-71
Porawat Visutsak 52-56
Prane Mariel Ong 155-160
Prasetia Putra 251-255
Putu Satwika 492-497
Putu Wuri Handayani 1-9
R
Ralph Vincent Javellana Regalado 246-250
Ravika Hafizi 130-137
Reggio N Hartono 177-181
Riva Aktivia 455-460
Roger Luis Uy 155-160
S
Sani M. Isa 431-437, 450-454
Satyanto Saptomo 367-370
Setia Damawan Afandi 187-192
Shogo Nishida 112-117
Sigit Prasetyo 348-353
Siobhan North 231-236
Sri Tiatri 498-504
Sri Wahyuni 295-300
Stanley Karouw 277-282
Stewart Sentanoe 177-181
Suraya Miskon 130-137
Syandra 478-483
T
Taufik Djatna 262-268, 283-288, 318-323, 354-360, 388-393, 455-460, 461-466
Teny Handayani 446-449
Tji beng Jap 498-504
Tonny Adhi Sabastian 312-317
V
Vina Ayumi 289-294
W
Wanthanee Prachuabsupakij 35-40
Widodo Widodo 251-255
Wilson Fonda 371-375
Wina 450-454
Winnie Septiani 148-154
Wisnu Ananta Kusuma 85-89
Wisnu Jatmiko 484-491
Y
YB Dwi Setianto 241-245
Yani Nurhadryani 342-347, 455-460, 461-466
Yasutaka Kishi 112-117
Yaumil Miss Khoiriyah 166-170
Yoanes Bandung 118-123
Yudho Giri Sucahyo 348-353
Yustina Retno W. Utami 241-245
Z
Zainal A. Hasibuan 402-409
lukman - 22-28
Abstract-- Intentionally or not, social media users are likely to share recommendations with others about many things, including tourism activities. In this paper we propose a technique that structures the joint recommendations of composite social media and extracts them into knowledge about tourist sites by deploying the vector space model. We include an advice-seeking technique so that the calculation covers not only recommendations obtained from the user’s own profile but also recommendations from other social network users. This is a potential solution to the sparsity problem that usually appears in conventional recommender systems. We further formulate an approach to normalize the unstructured text data of social media in order to obtain appropriate recommendations. We experimented with real-world data from various social media sources using the R language. We evaluated our results with Spearman’s rank correlation and showed that our formulation yields more diverse recommendations that remain positively correlated with the user’s profile.

Keywords--social media, vector space model, composite extraction

I. INTRODUCTION

Currently, social media is one of the tools people use to search for travel destinations and information. Travelers’ need for online information gives tourism organizations an opportunity to extend their promotion through social media [1]. Social media has changed the way travelers plan their trips, because they can combine information from various social media. For instance, we like to search for comments or testimonies by other people in social media about a tourism site. Users assess these and match them against their preferences. They repeat these activities manually until they find the tourism site that best matches their preferences.

Many tourism recommender systems (RSs) have been built to ease the selection of a tourism destination. In [2], [3] tourism RSs were developed on static data that represent the characteristics of a tourism site. In the real world, tourism characteristics are not compatible with static data whose values are crisp (0 or 1). For instance, in [3] the travel goal is defined as one of the attributes in the tourism context and is divided into 9 values such as cultural experience, scenic/landscape, and education. One tourism site may have more than one value, and if a tourism site is assigned one category such as cultural experience, it does not mean that its value for cultural experience is one and its value for the other categories is zero. This method is not suited to the current user behavior described in the previous paragraph.

In this paper, we propose a method for tourism RSs based on social media extraction, so that it does not depend on inflexible static data. Information about a tourism site is provided in several social media and grows dynamically as people share their experiences through comments, testimonies, posts, etc.

Our motivation was to overcome the drawback of static identifiers for tourism sites in RSs by using text data in social media. Using text data from social media raises some challenges. Data in social media is very unstructured: users are free to post with emoticons, word abbreviations, links, or any non-standardized text. To address these challenges, we present a text mining method with some additional normalization processes to obtain a well-formed identifier for each tourism site.

In this paper we did not use historical data about users and items as conventional RSs do. Our method provides recommendations based on the user’s postings in social media. A problem potentially appears if users rarely post in their social media. This can cause a sparsity problem, which in conventional RSs is caused by sparse rating data [4], [5]. To avoid this problem, we utilized the social network data that link users to one another, such as friends in Facebook or followers in Twitter. Our assumption is that the items that might be preferred by a user’s friends directly influence the user’s choice. This assumption is also known as advice-seeking [6]. We utilize the user connections in the social network to carry out this assumption.

We thus formulated the mechanism of RSs from the extraction process in the user’s social media and the tourism
site’s social media through text mining processes, which produce an identifier for each object. Every tourism site has a set of terms as its identifier. The occurrences of a tourism site’s identifier are projected into a vector space model whose dimensions are defined by the collection of terms. The user’s identifier is filtered against the tourism corpus and placed in the tourism vector space model to calculate the proximity between the user and each tourism site.

The objective of this paper was to formulate a recommendation model based on composite social media and to extract unstructured composite social media text data in the Indonesian language. The rest of this paper is organized as follows. Section II briefly reviews state-of-the-art RSs based on social media. Section III briefly illustrates how we formulate the recommendation. Section IV presents our proposed method, and Section V reports the relevant experiment with real-world data. The conclusions are drawn in Section VI.

II. RECOMMENDER SYSTEMS BASED ON SOCIAL MEDIA

In social media, many aspects can support the recommendation process. Social media has become a need for people, which keeps the data growing naturally and makes it a potential resource for RSs. This potentially makes RS performance more dynamic, but it also brings more challenges.

Some studies have explored the benefit of social media data in RSs. In [7], two approaches for a recommendation framework based on social media are presented, namely interest-oriented and influenced-oriented, which focus on content recommendation in social media. In [8], a field experiment based on interviews demonstrates the benefit of social recommendation, trust-aware recommendation, and advice-seeking recommendation for improving the performance of RSs, as these have a mechanism similar to real-world recommendation.

In practice, one person may have more than one social media account. In [9], recommendation with composite social media is presented: it acquires the friend list and analyzes which friends have an impact on the user’s decision making in order to generate personalized recommendations. The drawback of the RS in [9] is that it still uses static identifiers given manually by the user. Most research on object identifier extraction, such as sentiment analysis and opinion mining, is based on text mining methods [10], [11], [12].

Based on this review, we identified a gap in social media data extraction for object identifiers to be used in the
process of recommendation. We further complete the recommendation by using relevant content from composite social media to enrich our findings.

III. FORMULATION OF RECOMMENDATION BASED ON SOCIAL MEDIA DATA

We briefly present our proposed method with the illustration in Fig. 1. The information from various social media is identified as the characteristic, or identifier, of each tourism site. For each site, we extract information from various social media with text mining to bring out its characteristics. The extraction result is then stored in a database. The collection of extraction results from the various tourism sites is defined as the tourism corpus.

In this research, we combined the recommendation process with an advice-seeking technique related to social recommenders. We assumed that items that might be liked by other users who have a strong relation with the user also contribute to the user’s choice. We then identified the users in the system who have a strong relation with the user who will receive the recommendation; in this research we call those users socialize users. We also projected the socialize users into the vector space model to calculate each user’s proximity to the tourism sites. Then, the score for each tourism site is aggregated based on the level of trust (λ). As the last step, we rank the tourism sites by the score of the aggregation function to generate the top-N recommendation.

IV. PROPOSED APPROACH

A. Tourism Identifier Extraction

Nowadays tourism information is supported by various social media. The collection of social media for each tourism site is treated as a corpus. In our approach, we defined the identifier of each site by term occurrence in the text-based features of social media.

The object identifier of each site was generated from its corpus in the Indonesian language with text mining methods, namely tokenization, normalization, term compression, and term weighting. We normalized the unstructured social media data with the following steps:

Step 1: remove punctuation and numbers.
Step 2: normalize word abbreviations. This addresses another challenge of text data in social media: when posting, people like to abbreviate words. For instance, ‘sepeda’ (in English: bicycle) may be abbreviated as ‘spd’. In this normalization process, we use the abbreviation list for the Indonesian language from [13].
Step 3: stem all words based on the Indonesian language [14].
Step 4: transform unstructured word forms using regular expressions. One challenge in social media text is that people freely post unstructured words such as ‘gunuuung’ for ‘gunung’ (in English: mountain). The regular expression transforms repeated letters into a single letter. The transformation is formulated as:
[a]+ → [a]
[b]+ → [b]
:
[z]+ → [z]
where [a]+ means that any run of repeated ‘a’ characters (more than one ‘a’ character in a row) is transformed into a single ‘a’ character.
Step 5: re-normalization. In this step, we restore words that were changed by the process in Step 4; for example, ‘tanggal’ (in English: date) is changed to ‘tangal’ and must be re-normalized to its original form.

To avoid generating an overabundance of terms, we use term compression based on a compression rate. The collection of identifiers from all sites is defined as the tourism terms. For each term we calculate the term frequency (tf) and then normalize it with sublinear tf scaling. We assign tf_{t,s} according to the number of occurrences of term t in site s. We use the normalized sublinear tf scaling in [15] as follows:

\[ wf_{t,s} = \begin{cases} 1 + \log tf_{t,s} & \text{if } tf_{t,s} > 0 \\ 0 & \text{otherwise} \end{cases} \qquad (1) \]

B. User Profiling

In this paper, the occurrence of tourism terms in the user’s social media content guides us in generating tourism site recommendations. First of all, a user might have more than one account in various social media. The collection of the user’s social media can be seen as the corpus of that user. To obtain the tourism terms in the user’s corpus, the corpus is then processed with text mining steps similar to those in the previous section on tourism identifier extraction.

C. Vector Space Model

The proximity between a user and each tourism site is calculated with the cosine similarity function in the vector space model:

\[ score(u_i, s_j) = \frac{\vec{v}(u_i) \cdot \vec{v}(s_j)}{|\vec{v}(u_i)| \, |\vec{v}(s_j)|} \qquad (2) \]

In addition, social media provide links that define the connections between users. We identified the advice-seeking process through these connections. We assumed that if user 1 and user 2 are friends with each other in social media, user 1 contributes to the recommendation for user 2, and vice versa. If we want to give a recommendation to user 1 as the main user, we must identify the list of user 1’s friends; in this example we detect user 2 as user 1’s friend. In our formulation, we denote the collection of the main user’s friends by (f_1, f_2, ..., f_z).
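The normalization steps of Section IV.A can be sketched as follows. This is only an illustrative Python sketch (the experiments in the paper were run in R), and it makes several assumptions: the two dictionaries hold hypothetical sample entries standing in for the abbreviation list of [13] and for a re-normalization list, hyphens are kept so that reduplicated words such as 'anak-anak' survive Step 1, and Step 3 (stemming) is left out because it needs a full Indonesian stemmer.

```python
import re

# Hypothetical sample entries; the paper uses the abbreviation list from [13].
ABBREVIATIONS = {"spd": "sepeda"}
# Hypothetical repair list for words damaged by Step 4 ('tanggal' -> 'tangal').
RENORMALIZE = {"tangal": "tanggal"}


def normalize(text):
    """Return the normalized tokens of one social media post."""
    # Step 1: remove punctuation and numbers (letters, spaces and hyphens stay).
    text = re.sub(r"[^a-z\s-]", " ", text.lower())
    tokens = [tok for tok in text.split() if tok.strip("-")]
    # Step 2: expand known abbreviations by dictionary lookup.
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in tokens]
    # Step 3 (stemming) is omitted in this sketch.
    # Step 4: collapse every run of a repeated letter into a single letter.
    tokens = [re.sub(r"([a-z])\1+", r"\1", tok) for tok in tokens]
    # Step 5: restore words whose legitimate double letters were collapsed.
    tokens = [RENORMALIZE.get(tok, tok) for tok in tokens]
    return tokens


example = normalize("Naik spd ke gunuuung, tanggal 17!!!")
# example == ['naik', 'sepeda', 'ke', 'gunung', 'tanggal']
```

The example shows why Step 5 is needed: the same rule that repairs 'gunuuung' also shortens 'tanggal', so a lookup has to undo the damage.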
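Equations (1) and (2) can be illustrated with a short Python sketch (again not the R code used in the paper). Two points are assumptions of this sketch rather than statements from the paper: the natural logarithm is used because the base is not specified, and a profile without any tourism term is given a proximity of zero to avoid dividing by zero.

```python
import math


def sublinear_tf(tf):
    """Equation (1): 1 + log(tf) when tf > 0, otherwise 0 (natural log assumed)."""
    return 1.0 + math.log(tf) if tf > 0 else 0.0


def cosine_score(user_tf, site_tf):
    """Equation (2) on weighted vectors built from raw counts in the same term order."""
    u = [sublinear_tf(x) for x in user_tf]
    s = [sublinear_tf(x) for x in site_tf]
    dot = sum(a * b for a, b in zip(u, s))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_s = math.sqrt(sum(b * b for b in s))
    if norm_u == 0 or norm_s == 0:
        return 0.0  # assumption: an empty vector has zero proximity
    return dot / (norm_u * norm_s)
```

Because the weighting adds 1 before taking the logarithm, the choice of base changes the weights by more than a constant factor, so the cosine scores depend on it.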
We aggregated the scores of the main user and the main user’s friends based on a level of trust in the range 0 ≤ λ ≤ 1. The value of λ is a weight that expresses how much we trust the recommendation from our friends relative to the recommendation from our own profile. A larger value of λ means that we trust the recommendation from our friends more than that from our own profile. We formulated an equation derived from the weighted mean aggregation function in [16] for each site (s_1, s_2, ..., s_j, ..., s_n):

\[ finalScore_j = (1 - \lambda) \, score(u, s_j) + \lambda \, \frac{\sum_{i=1}^{z} score(f_i, s_j)}{z} \qquad (3) \]

Then, we ranked the tourism sites by the final score and kept the top-N.

We evaluated our formulation with Spearman’s rank correlation coefficient in [17]:

\[ \rho = \frac{\sum (u - \bar{u})(v - \bar{v})}{n \cdot stdev(u) \, stdev(v)} \qquad (4) \]

where u and v denote the two rankings being compared, \bar{u} and \bar{v} their means, and n the number of ranked items.

The objective of our evaluation was to identify the effect of our assumption that the items that might be liked by the user’s friends influence the user’s choice. With Spearman’s rank correlation coefficient, we compare the ranking of recommendations based on the user’s profile alone (without aggregating the friends’ recommendations) with the ranking that includes the friends’ recommendations.

V. EXPERIMENTAL RESULT

In this section, we present the results of our experiment with real-world data and hypothetical data. We used R 3.1.0 to retrieve data from social media and to assist the text mining process.

A. Experimental data

In this paper, we collected data about the tourism sites in Table 1 from various social media and composed the data per tourism site. Each tourism site then takes the role of a document, and together they build the tourism corpus.

We retrieved data from 4 different sources for each tourism site: a Twitter search (with the tourism site’s name as the query), the Facebook page of the tourism site (if any), its Twitter account (if any), and its Wikipedia webpage.

The data retrieval was assisted by the R packages RFacebook and twitteR; for the Wikipedia data source, we first saved the HTML file and converted the HTML to text with the XML package. To use twitteR and RFacebook, we must obtain an access token through API registration at https://developers.facebook.com/ and https://dev.twitter.com/.

B. Data Extraction

First, each data source was processed independently, because the data from Facebook and Twitter are mostly unstructured whereas the data from Wikipedia are fully structured. Twitter data might contain user names. For example, in the Twitter post ‘RT @poo Taman Safari belajar keanekaragaman fauna’, we can detect ‘@poo’ as a user name because the ‘@’ mark precedes user names in Twitter. In the normalization process for Twitter data, we removed words of this form. For the unstructured Facebook and Twitter data, we performed text mining with the normalization process for abbreviated words. This process matches the words against a dictionary of abbreviations and replaces them with the word in its normal form. In [13], the Levenshtein distance function [18] was also tested for normalizing abbreviations in the Indonesian language, and the result showed that matching with a dictionary was more accurate. In this experiment, we used the dictionary from [13].

A further challenge for text mining in the Indonesian language is stemming. The Porter algorithm [19] and the Nazief & Adriani algorithm [14] are two popular algorithms for stemming a corpus in the Indonesian language. The comparison of these two algorithms in [20] showed that the Nazief & Adriani algorithm was more accurate than the Porter algorithm, although the Porter algorithm was faster. We therefore implemented the Nazief & Adriani algorithm in the R environment, with a MySQL database to store the base words of the Indonesian language.
TABLE 1
LIST OF TOURISM SITE
Data resources: Wikipedia, Query of Twitter Search, Facebook Account, Twitter Account
Tourism Site (Tourism index) | Wikipedia | Query of Twitter Search | Facebook Account | Twitter Account
Bogor Botanical Garden (s1) | √ | kebun raya bogor | - | @kebunrayabogor
Safari Garden, Cisarua (s2) | √ | taman safari, cisarua | Taman Safari | @TSI_Bogor
Taman Mekarsari (s3) | √ | taman mekarsari | - | @TamanMekarsari
Kebun raya cibodas (s4) | √ | kebun raya cibodas | Kebun Raya Cibodas (KRC) | @KRCibodas19
Museum Fatahillah (s5) | √ | museum fatahillah | - | @Fatahillah_MSJ
Trans Studio Bandung (s6) | √ | transstudio bandung | - | @TransStudioBdng
Sea World (s7) | √ | sea world | - | @SEAWORLDANCOL
Monumen Nasional (s8) | √ | monumen nasional | Monumen Nasional - Monas | @Tugu_Monas
Taman Mini Indonesia Indah (s9) | √ | taman mini indonesia indah | - | @ilovetamanmini
Taman Impian Jaya Ancol (s10) | √ | taman impian jaya ancol | - | @ancoltmnimpian
TABLE 2
EXAMPLE OF TERM-DOCUMENT MATRIX
Term frequency per tourism site
Tourism Site | air | alam | anak-anak | baca | buah | ..... | wisata
Bogor Botanical Garden | 1 | 2 | 0 | 1 | 0 | ..... | 8
Safari Garden, Cisarua | 1 | 5 | 0 | 2 | 3 | ..... | 9
Taman Mekarsari | 2 | 1 | 1 | 2 | 96 | ..... | 34
Kebun raya cibodas | 6 | 2 | 1 | 1 | 1 | ..... | 39
Museum Fatahillah | 8 | 1 | 2 | 1 | 10 | ..... | 6
Trans Studio Bandung | 0 | 3 | 1 | 1 | 1 | ..... | 6
Sea World | 12 | 6 | 4 | 1 | 2 | ..... | 1
Monumen Nasional | 3 | 0 | 0 | 1 | 5 | ..... | 2
Taman Mini Indonesia Indah | 6 | 2 | 4 | 1 | 1 | ..... | 9
Taman Impian Jaya Ancol | 4 | 4 | 1 | 1 | 7 | ..... | 17
Stop word lists in information retrieval depend on the context of the field. For instance, in the field of computing we may treat ‘swim’ as a stop word, but not in the field of tourism. In our experiment, we used only 126 stop words consisting of conjunctions such as ‘yang’, ‘ke’, ‘pada’. Fig. 2 shows that there are words that are unrepresentative of the tourism context. We obtained 5018 terms, which contain 81% sparse terms. We then reduced the tourism terms with a compression rate of 40%, which means that we filtered out terms that did not appear in at least 4 documents. Term compression reduced the tourism terms to 229 terms with 12% sparse terms. Table 2 shows an example of the term-document matrix.

C. Recommendation Process

We acquired data from two users who have Twitter and Facebook accounts and who are friends both in the real world and in the social network. User 1 is the main user who receives the recommendation, and user 2 is a friend of user 1. In this process, we matched the occurrences of tourism terms in each user’s social media posts. Table 3 shows the term occurrences for user 1 and user 2 after the user profiling process. User 1 likes to travel to natural sites and user 2 likes to travel to historical sites. We then normalized the term frequencies with equation (1) and calculated the cosine similarity for user 1 and user 2. Fig. 3 illustrates the vector space model projection of the users and the tourism sites: the tourism terms take the role of vectors, and the cosine similarity defines the proximity between a user and a tourism site.

In this experiment we used a level of trust of λ = 0.4 as the weight of the recommendation score based on the friend’s profile. Table 4 shows the result of our experiment: the final score is calculated by equation (3) and is the aggregated value of the cosine similarities of user 1 and user 2. The result shows how the social recommender affects the recommendation of historical sites for user 1. The top-3 recommendations are Sea World (s7), Safari Garden (s2), and Monumen Nasional (s8).

Then, we evaluated our result with Spearman’s rank correlation. The evaluation yielded a score of 0.78 when comparing the ranking of recommendations based on the user’s social media content alone with the ranking from our formulation (combined with the friend’s social media content). This shows that our recommendation is positively correlated with the user’s profile. Furthermore, our formulation yields more diverse recommendations. If we generate a recommendation based on the profile of user 1 alone, the top-5 recommendation (s7, s2, s4, s1, s3)

TABLE 3
TERM-DOCUMENT OF USER MATRIX
Tourism Term | User1 (u1) | User2 (u2)
air | 2 | 7
alam | 1 | 3
anak-anak | 6 | 0
... | ... | ...
wisata | 3 | 3
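The term compression rule described in this section (a compression rate of 40% over 10 tourism sites, so a term must occur in at least 4 documents) can be sketched in Python as a document-frequency filter. The counts for 'air' and 'anak-anak' are the visible columns of Table 2; the term 'langka' and its counts are hypothetical and were added only so that the filter has something to remove. Reading the compression rate as a minimum share of documents is an assumption based on the sentence above.

```python
import math


def compress_terms(term_counts, rate):
    """Keep the terms that occur in at least ceil(rate * n_docs) documents."""
    n_docs = len(next(iter(term_counts.values())))
    # round() guards against floating-point noise such as 0.4 * 10 != 4 exactly.
    min_docs = math.ceil(round(rate * n_docs, 9))
    return {term: counts for term, counts in term_counts.items()
            if sum(1 for c in counts if c > 0) >= min_docs}


# One count per tourism site s1..s10.
term_counts = {
    "air": [1, 1, 2, 6, 8, 0, 12, 3, 6, 4],        # from Table 2, in 9 documents
    "anak-anak": [0, 0, 1, 1, 2, 1, 4, 0, 4, 1],   # from Table 2, in 7 documents
    "langka": [0, 0, 1, 0, 0, 0, 0, 0, 0, 2],      # hypothetical, in 2 documents
}

kept = compress_terms(term_counts, 0.4)
# list(kept) == ['air', 'anak-anak']
```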
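The aggregation in equation (3), the top-N ranking and the evaluation in equation (4) can be sketched in Python as follows. All scores below are made-up numbers chosen for the illustration and are not results from the paper. The sketch assumes that stdev in equation (4) is the population standard deviation (dividing by n), which makes the expression equal to the Pearson correlation of the two rank vectors, and that tied scores receive the average of their rank positions.

```python
import math


def final_scores(user_scores, friends_scores, trust):
    """Equation (3): (1 - trust) * own score + trust * mean of the friends' scores."""
    z = len(friends_scores)
    result = []
    for j, own in enumerate(user_scores):
        friend_mean = sum(friend[j] for friend in friends_scores) / z
        result.append((1 - trust) * own + trust * friend_mean)
    return result


def top_n(scores, n):
    """0-based indices of the n best-scoring sites, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:n]


def ranks(scores):
    """Rank 1 for the highest score; tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        k = i
        while k + 1 < len(order) and scores[order[k + 1]] == scores[order[i]]:
            k += 1
        for m in range(i, k + 1):
            result[order[m]] = (i + k) / 2 + 1
        i = k + 1
    return result


def spearman(scores_a, scores_b):
    """Equation (4) applied to the rank vectors of two score lists."""
    u, v = ranks(scores_a), ranks(scores_b)
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = math.sqrt(sum((a - mean_u) ** 2 for a in u) / n)
    sd_v = math.sqrt(sum((b - mean_v) ** 2 for b in v) / n)
    return cov / (n * sd_u * sd_v)  # undefined if one ranking is all ties


# Made-up cosine scores for three sites: one main user, one friend.
own = [0.2, 0.8, 0.5]
friends = [[0.6, 0.4, 0.1]]
combined = final_scores(own, friends, trust=0.4)  # about [0.36, 0.64, 0.34]
best_two = top_n(combined, 2)                     # [1, 0]
rho = spearman(own, combined)                     # about 0.5
```

In this toy case the friend's scores move the first site ahead of the third, so the combined ranking differs from the profile-only ranking while the correlation between the two stays positive.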