2015 International Conference on Information Technology Systems and Innovation (ICITSI)

Bandung - Bali, November 16 - 19, 2015

ISBN: 978-1-4673-6664-9

TF-IDF Method in Ranking Keywords of Instagram Users' Image Captions

Bernardus Ari Kuncoro
Master of Information Technology
Bina Nusantara University
Jakarta, Indonesia
Email: b.kuncoro [at] binus.ac.id

Bambang Heru Iswanto
Department of Physics
Jakarta State University
Jakarta, Indonesia
Email: bhi [at] unj.ac.id

Abstract: Instagram is one of the popular social media applications used by a wide range of people around the world. The significant growth of active Instagram users affects the size of Instagram data: the more users there are, the larger and more varied the posted data becomes. In line with its popularity, in recent years many researchers have begun to study and analyze it for various purposes, such as detecting event photos based on location, clustering photo content, and devising advertising strategies based on user types. Currently, three types of data are available on Instagram: text, image, and video. In this paper we propose the Term Frequency and Inverse Document Frequency (TF-IDF) method to rank keywords of the top twenty most followed Instagram users based on their image captions. The objective of this research is to automatically identify the main idea of each Instagram user based on the 50 most recent image captions posted. In our experiments, TF-IDF was successfully implemented to reveal a set of keywords with their rankings. The highest-ranked keyword is indeed the main topic of a user, as indicated by its TF-IDF value. The results indicate that the TF-IDF method is very useful for finding and ranking the keywords of Instagram users' image captions. In future research, the ranked keywords are needed as extracted features for solving classification and clustering tasks.

Keywords: Instagram; text mining; Term Frequency and Inverse Document Frequency; social media

978-1-4673-6664-9/15/$31.00 ©2015 IEEE

I. INTRODUCTION

Instagram is one of the popular social media platforms that provides users a quick way to capture and share their life moments with followers through a series of filter-manipulated photos and videos. It is more popular among a younger demographic: over 35% of Instagram users are between 18 and 29 years old [1]. From its launch in October 2010 until the time of writing, the number of active Instagram users has grown significantly. According to an update posted by the official Instagram account in September 2015 [2], Instagram has 400 million registered users, which is 25% higher than the number of registered users in December 2014. Another interesting fact is that users upload more than 80 million photos per day on average.

Despite Instagram's popularity, the number of studies focused on Instagram is very low. In 2014, Hu, Manikonda, and Kambhampati [3] wrote that their work is believed to be the first study to conduct a deep analysis of photo content, user activities, and user types on Instagram. In their study, computer vision and identification by clustering were successfully applied, revealing eight popular categories of photos and five distinct types of Instagram users. A dissertation related to Instagram was reported by McCune, who investigated people's motivations for using Instagram through a survey study of 23 Instagram users [4]. In 2013, Silva, Vaz de Melo, Almeida, Salles, and Loureiro applied visualization and cultural analytics to Instagram photos from different cities in the world to trace their social and cultural differences [5].

Instagram has three types of data: text, image, and video. To narrow the scope of this study, only text data was used. The text data used in this study was the image caption, which represents the description of the image. As illustrated in Fig. 1, the image caption is located under the image that was posted by the user.

Fig. 1. Example of Cristiano's and Instagram's Posts with Image Captions

The research question of this study is "How to find keywords and the rankings of an Instagram account based on the image caption data posted?". To answer this question, a text mining method (TF-IDF) is used. The output of this study is the keywords with their ranking values. The higher the ranking, the more relevant the keyword is to the captions that the user posted.

The significance of this study is as follows. First, the ranked keywords of a username's image captions can be used as features in more advanced research such as clustering, classification, and profiling of Instagram usernames. Second, this study adds to the diversity of Instagram data research with a different approach, namely text mining. Third, the method can help researchers retrieve the significant words of users faster, because retrieval is done automatically rather than manually by keeping an eye on the captions the users post.

II. DATASET

TABLE I
TOP 20 INSTAGRAM PROFILES

Rank  Username         Media  Followers    Following
1     instagram        2,509  103,226,690  182
2     taylorswift      732    49,451,242   77
3     kimkardashian    3,167  48,014,416   96
4     beyonce          1,172  47,173,577   0
5     selenagomez      1,028  45,858,936   173
6     arianagrande     1,869  44,598,791   952
7     justinbieber     2,508  40,228,982   73
8     kendalljenner    2,343  38,055,799   170
9     kyliejenner      3,338  38,075,231   186
10    nickiminaj       3,387  35,185,711   382
11    khloekardashian  2,935  33,091,863   149
12    natgeo           8,432  32,835,979   94
13    neymarjr         3,018  32,555,977   1,023
14    cristiano        602    31,865,306   198
15    mileycyrus       4,280  29,539,569   384
16    katyperry        366    28,891,826   217
17    therock          1,343  28,778,259   64
18    jlo              1,185  27,824,613   966
19    badgalriri       3,267  26,976,059   1,166
20    kourtneykardash  2,021  25,875,453   72
The dataset was crawled using the Instagram API. First, the top 20 most followed Instagram usernames were collected. The list is based on http://socialblade.com/instagram/top/100/followers, accessed on October 7, 2015 [6], and can be seen in Table I. Second, in order to capture the most up-to-date keywords of the users, only the 50 most recent image captions were used. Each username is treated as one document containing a bag of words; hence there are 20 documents in total.

The following are characteristics of Instagram image caption data. Note that these may change in the future without prior notice due to Instagram updates.

1) Image caption character limit: captions on a photo and subsequent comments are each capped at 2200 characters. A user is also allowed not to write a caption at all.
2) Hashtag limit: the limit is 30 hashtags per caption.
3) Symbol characters: some users use the symbol characters provided by the smartphone keyboard.
4) Writing technique: because Instagram allows up to 2200 characters, nonstandard spelling and cyber slang are used less often in image captions than in tweets on Twitter.
5) Availability: the amount of data available is extremely large. According to the official Instagram release in September 2015, 80 million photos are uploaded daily. The Instagram API facilitates the collection of image captions as well as the URL of each image.
6) Topics: Instagram users post photos and videos on a wide variety of topics. Previous research observed eight main categories of photos posted on Instagram: friends, food, gadget, captioned photo, pet, activity, selfie, and fashion [3].
7) Weekend on-peak: users tend to post photos and videos during weekends and at the end of the day [7].

III. METHODOLOGY

The methodology of this study is illustrated in Fig. 2. There are three modules: retrieval, preprocessing, and ranking. In the retrieval module, each of the usernames was used to request the 50 most recent image captions via the Instagram API. The output of this retrieval is a group of text files. Since 20 usernames were used, the output of this module is 20 text files. This methodology is inspired by the work of Kumar and Sebastian in 2012 [8].

The next module is preprocessing. It is needed to pass the important words and filter out irrelevant words and characters in each document. The first preprocessing step is the removal of HTML and symbol characters. It is important because users commonly write symbol characters that carry little meaning or are not keywords. The second step is the removal of punctuation, #tags, @tags, and stopwords. The main goal of this step is to retrieve essential words and to eliminate words that have little significance for the documents, such as "the", "is", "are", "an", "of", "to", etc. It also reduces the index file size, improving efficiency and effectiveness. The third step is standardizing words. For example, a user sometimes writes 'go hooooome'; the output of this step is 'go home'. The last preprocessing step is URL removal, since a URL is clearly not useful for revealing keywords.

Upon finishing preprocessing the data, the ranking process is applied. The first step of this module is tokenization. Its objective in this case is to break the text up into words or other meaningful elements called tokens. The tokens, commonly referred to as terms, are then used to form the vector space model.
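The preprocessing and tokenization steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `preprocess`, the tiny `STOPWORDS` set, and every regular expression are assumptions made for illustration, and the paper does not state which stopword list or standardization rule was used. The sketch also removes URLs before punctuation (the paper lists URL removal last) so that a URL is not split into word fragments first.

```python
import html
import re
from typing import List

# Hypothetical, deliberately small stopword list (the paper does not name one).
STOPWORDS = {"the", "is", "are", "an", "of", "to", "a", "and", "in"}


def preprocess(caption: str) -> List[str]:
    """Clean one image caption and return its tokens (terms)."""
    text = html.unescape(caption)                       # decode entities such as &amp;
    text = re.sub(r"<[^>]+>", " ", text)                # remove HTML tags
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove URLs (done early, see lead-in)
    text = re.sub(r"[#@]\w+", " ", text)                # remove #tags and @tags
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)                # remove punctuation and symbol characters
    # Heuristic standardization: collapse 3+ repeats of a character to one,
    # e.g. 'hooooome' -> 'home'. This can over-correct ('coool' -> 'col').
    text = re.sub(r"(\w)\1{2,}", r"\1", text)
    # Tokenize on whitespace and drop stopwords.
    return [token for token in text.split() if token not in STOPWORDS]


tokens = preprocess("Go hooooome!!! #weekend @friend http://t.co/abc &amp; enjoy the <b>view</b>")
```

Applying `preprocess` to each of a user's 50 captions and concatenating the resulting token lists would give the bag of words for that user's document.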
Fig. 2. Methodology

TF-IDF stands for "Term Frequency, Inverse Document Frequency". It is a way to score the importance of words (or "terms") in a document based on how frequently they appear across multiple documents. In addition, it is the most common weighting method used to describe documents in the Vector Space Model (VSM), particularly in Information Retrieval problems. TF-IDF is a relatively old method proposed by Salton and Buckley in 1988 [9]. Despite its age, it is simple and effective, making it a popular starting point compared to more recent algorithms. TF and IDF are described as follows.

1) TF measures how many times the term t is present in each document d. It is defined as

\[ \mathrm{TF}(t, d) = \sum_{x \in d} \mathrm{fr}(x, t) \tag{1} \]

where fr(x, t) is a simple function defined as

\[ \mathrm{fr}(x, t) = \begin{cases} 1, & \text{if } x = t \\ 0, & \text{otherwise} \end{cases} \]

Hence, TF(t, d) returns how many times the term t is present in the document d.

2) IDF is defined with the following formula:

\[ \mathrm{IDF}(t) = \log \frac{|D|}{1 + |\{d : t \in d\}|} \tag{2} \]

where |D| is the total number of documents and |{d : t ∈ d}| is the number of documents in which the term t appears, that is, the documents for which TF(t, d) ≠ 0. The 1 is added to the denominator only to avoid division by zero.

3) The TF-IDF formula is defined as follows:

\[ \text{TF-IDF}(t, d) = \mathrm{TF}(t, d) \times \mathrm{IDF}(t) \tag{3} \]

The TF-IDF value increases proportionally to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to adjust for the fact that some words appear more frequently in general. Thus, the more often a word appears in a document relative to the rest of the corpus, the more significant it is estimated to be in that document.

IV. RESULT AND DISCUSSION

The proposed method was applied to the top twenty most followed Instagram usernames as input. The number of ranked keywords can be varied; in this study it was limited to 10. Thus the result is 20 usernames, each with 10 ranked keywords. Three samples of the results, showing the top 10 words and their TF-IDF values for individual Instagram users, are illustrated as bar charts in Figs. 3, 4, and 5. The first bar is the highest-ranked word, i.e., the most relevant word for a specific user. According to those figures, the most relevant words for @instagram, @taylorswift, and @cristiano are 'weekend', 'toronto', and 'drive', respectively.

Fig. 3. Top 10 Words of @instagram Account
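Equations (1) to (3) and the selection of the top-ranked keywords per username can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the function name `tf_idf_ranking`, the toy usernames, and the toy token lists are hypothetical, and the natural logarithm is an assumption because the paper does not state the base of the logarithm.

```python
import math
from collections import Counter


def tf_idf_ranking(docs, top_k=10):
    """Rank terms per document.

    docs maps a username to its list of tokens (one document per username).
    Returns, per username, the top_k (term, score) pairs, highest score first.
    """
    n_docs = len(docs)  # |D|
    # Document frequency |{d : t in d}|: count each term once per document.
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    ranking = {}
    for user, tokens in docs.items():
        tf = Counter(tokens)  # Eq. (1): raw count of each term in this document
        scores = {
            term: count * math.log(n_docs / (1 + df[term]))  # Eq. (2) and Eq. (3)
            for term, count in tf.items()
        }
        # Sort by descending score; ties are broken alphabetically for determinism.
        ranking[user] = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return ranking


# Hypothetical toy corpus with three "usernames".
docs = {
    "user_a": ["weekend", "weekend", "photo"],
    "user_b": ["concert", "photo"],
    "user_c": ["drive", "photo", "drive", "drive"],
}
ranking = tf_idf_ranking(docs, top_k=2)
```

In this toy corpus 'photo' occurs in every document, so with the +1 in the denominator of Eq. (2) its IDF is log(3/4), which is negative, and it drops to the bottom of each ranking. Terms confined to a single document, such as 'weekend' for `user_a` and 'drive' for `user_c`, come out on top.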
Fig. 4. Top 10 Words of @taylorswift Instagram Account

Fig. 5. Top 10 Words of @cristiano Instagram Account

Looking more closely at the highest-ranked keyword of each username, it turns out that each has a different reason for ranking highest. The term 'weekend' is the highest-ranked keyword of the @instagram account because, at the time the data was crawled, @instagram was holding its Weekend Hashtag Project. The username @taylorswift has 'toronto' as her highest-ranked keyword because she had just shared several photos of her concert in Toronto, Canada. The term 'drive' is the highest-ranked keyword of @cristiano because he was endorsing his new sports drink product, named CR7Drive.

Figures 6 and 7 illustrate the keyword ranking results for the remaining 17 usernames. They are arranged from the highest rank to the lowest. For example, the highest-ranked keyword of @arianagrande is 'focus', followed by 'babes', 'andrea', and so on down to the lowest-ranked term, 'tulsa'. Other than that, some of the terms in the result are not easily understood because they are slang terms (e.g. yall, j4, poo, etc.), usernames of other Instagram users (e.g. ronyalwin), numbers (mostly dates), or non-English words. This needs to be improved in future work.

Fig. 6. Result of Keyword Ranking for Remaining Usernames - part 1

Fig. 7. Result of Keyword Ranking for Remaining Usernames - part 2

V. CONCLUSION

A set of ranked keywords has been successfully revealed from the image captions of the top 20 most followed Instagram users. The proposed method, which implements TF-IDF, is simple and effective in revealing the keywords of a given user and their rankings. The results show that the highest-ranked keyword is indeed the main topic of a user, as indicated by its TF-IDF value. The higher the TF-IDF value, the more relevant the keyword is to the specific Instagram username. However, this work still needs to be improved in terms of understanding slang words and non-English language, adding keyword features based on image annotations, and so on.

REFERENCES

[1] J. Golbeck, Introduction to Social Media Investigation: A Hands-On Approach, 1st ed. Massachusetts: Syngress, 2015.
[2] Instagram, “Instagram 400,000,000,” 2015. [Online]. Available: https://instagram.com/p/78n-7MBQU8/
[3] Y. Hu, L. Manikonda, and S. Kambhampati, “What we Instagram: a first analysis of Instagram photo content and user types,” Proceedings of the Eighth International AAAI Conference on Weblogs and Social Media, pp. 595–598, 2014.
[4] Z. McCune and J. Thompson, “Consumer Production in Social Media Networks: A Case Study of the Instagram iPhone App,” Ph.D. dissertation, University of Cambridge, 2011.
[5] T. H. Silva, P. O. S. V. D. Melo, J. M. Almeida, J. Salles, and A. A. F. Loureiro, “A picture of Instagram is worth more than a thousand words: Workload characterization and application,” Proceedings - IEEE International Conference on Distributed Computing in Sensor Systems, DCoSS 2013, no. i, pp. 123–132, 2013.
[6] Socialblade, “Top 100 Instagram Users by Followers,” 2015. [Online]. Available: http://socialblade.com/instagram/top/100/followers
[7] C. S. Araujo, L. P. D. Correa, A. P. C. D. Silva, R. O. Prates, and W. Meira, “It is Not Just a Picture: Revealing Some User Practices in Instagram,” 2014 9th Latin American Web Congress, no. May, pp. 19–23, 2014. [Online]. Available: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=7000167
[8] A. Kumar and T. M. Sebastian, “Sentiment Analysis on Twitter,” International Journal of Computer Science Issues, vol. 9, no. 4, pp. 372–378, 2012.
[9] G. Salton and C. Buckley, “Term-weighting approaches in automatic text retrieval,” Information Processing & Management, vol. 24, no. 5, pp. 513–523, 1988.
