Psosm

The document provides an overview of different types of online social media and popular social networks like Facebook, Twitter, YouTube and others. It discusses some key aspects of social networks including the type of content generated, features like posts, likes and shares on Facebook, and followers/followings and retweets on Twitter.

Uploaded by

Reethika Selvam
Copyright
© © All Rights Reserved

M.C.A.

- PRIVACY AND SECURITY IN ONLINE SOCIAL MEDIA


QUICK SUMMARY

Overview of Online Social Media:

Social media comes in different types, and different types of content are generated
on it. One popular type is the social network: Facebook, Twitter, LinkedIn and
similar services fall into this category. Content is generated in many other ways as
well. For example, publishing is a category of social media that includes Wikipedia
and other crowd-sourced ways of creating content. There are also social games and
virtual games, and different types of content are generated through each of these
social media services.

Here are some examples of popular social networks in different categories, where a
category means the type of content that is generated on the network. YouTube is one
of the most popular video sharing services; Flickr is for images; Foursquare is
mostly for location-based services; LinkedIn is for professional networking;
Facebook is a combination of many different types of content; Instagram is for
images; Twitter is a micro blog of short content, combined with other types of
content. Each of these services creates content of a particular category. In
Foursquare, for example, the atomic unit of information is the location, and the
network is built on locations. In LinkedIn, the network is built on the professional
connections that we have or would like to have. These are the more traditional
social networks.

Of late, other types of social networks have also become popular. Pinterest, one of
the fastest growing social networks, has images as its base. Vine, Tumblr, Tinder,
Whisper, Snapchat and WeChat are among the many others. Some of these can be
categorized as ephemeral social networks, where the content that is posted destroys
itself after some time. Others are anonymous social networks, like Whisper, where
the content is posted anonymously and it is difficult to find out who posted it.
These networks are also getting more and more popular. This is not a comprehensive
list of social media services; there are about 215 to 220 popular social network
services available now. The ones listed here are only some of the more popular
ones, chosen to show the different types of content generated on social media and
the popular services in each category.

V's of Online Social Media:


The V's of big data also apply to online social media.
The first V is Velocity: the speed at which data is generated on these networks.
The second V is Variety: the examples of different types of social networks shown
earlier illustrate the variety of content that is generated on social networks.
The third V is Veracity: confirming whether a piece of information posted on social
media is legitimate or not, which is actually very hard.
The fourth V is Volume: the size of the content that is generated, which is closely
connected to velocity. For example, 400 hours of video are uploaded to YouTube every
60 seconds, and all of that content occupies storage space; that is what volume
refers to.
The fifth V, which people have started talking about more recently, is Value: we can
have volume, velocity, variety and veracity, but if the content does not have value,
none of it makes sense. So, the 4 V's to start with are volume, velocity, variety
and veracity, and the 5th V is value.
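To get a feel for how velocity turns into volume, the YouTube figure quoted above can
be scaled up with simple arithmetic. This is only a back-of-the-envelope sketch: it
assumes the rate of 400 hours per minute stays constant, and the function name is
made up for illustration.

```python
# Figure quoted in the text: 400 hours of video uploaded every 60 seconds.
HOURS_UPLOADED_PER_MINUTE = 400


def hours_uploaded(minutes, rate_per_minute=HOURS_UPLOADED_PER_MINUTE):
    """Hours of video uploaded over `minutes` minutes at a constant rate."""
    return rate_per_minute * minutes


per_hour = hours_uploaded(60)        # 24,000 hours of video every hour
per_day = hours_uploaded(60 * 24)    # 576,000 hours of video every day
print(per_hour, per_day)
```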
Different Social Networks:
Now, let us look at the different social networks that are available out there. We
are only going to look at some of the popular ones.
1. Facebook:
These are the interesting social networks that are out there. First, let us look at
Facebook.

This is how Facebook looks. This is my account (the account of Mr. Ponnurangam
Kumaraguru, PK). The first basic building block of Facebook is the feed that I get
on my Wall, where my friends are posting their content. Let me go to my account.
This is my account, and here is a post that I did sometime yesterday, which talks
about 1500 registrations for the online course.
Along with the text that I posted there is also an image. There are likes for the
post, and you can comment on the post and share it. Some of these are very basic, so
I will run through them quickly; we will reuse them as we go ahead in the course.
So, the basic building blocks are the posts that you make, which can be text, an
image or a video that you upload, along with likes, comments and shares. Facebook
stores all the content that is produced on it in a graph format.
As a user, I am a node in the graph, and the content that I create is also stored as
nodes, with edges between these nodes. The other component of Facebook is friends,
the people that you are connected with. There is an edge between two users, which is
the friendship relationship, and Facebook is a bidirectional network.
What does bidirectional mean? In this case, if I want to be friends with Amitabh
Bachchan, I have to send a request and he has to accept it. That is bidirectional:
there is a relationship between two people only if both of them agree to be friends.
I will show you how this differs in other networks. So, that is Facebook. Of course,
there are many other features in Facebook, which I am not going to go into in
detail. My point here was only to cover some basic building blocks that we will use
later. There are also pages in Facebook, which we will look at later in the course:
the pages that you are part of, or that you have liked.
In this case I am part of some pages, such as Antara Prerana, which is the
entrepreneurship cell at IIIT Delhi.
I am also part of some groups; some groups can be public, and some can be private.
These are simple things that we will look at later in the course. Again, just to
refresh: Facebook is bidirectional; the content produced here is text, images and
videos that can be uploaded; and likes, comments and shares, pages and groups are
the basic building blocks of Facebook.
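The bidirectional friendship described above, where one person sends a request and the
other has to accept it, can be sketched as a small data structure. This is only an
illustration of the idea, not how Facebook implements it; the class and method names
are hypothetical.

```python
class FriendGraph:
    """Toy model of a bidirectional network: an edge needs both users to agree."""

    def __init__(self):
        self.pending = set()   # (sender, receiver) requests not yet accepted
        self.friends = {}      # user -> set of that user's friends

    def send_request(self, sender, receiver):
        self.pending.add((sender, receiver))

    def accept_request(self, sender, receiver):
        if (sender, receiver) not in self.pending:
            raise ValueError("no pending request from sender to receiver")
        self.pending.remove((sender, receiver))
        # The friendship edge is stored in both directions.
        self.friends.setdefault(sender, set()).add(receiver)
        self.friends.setdefault(receiver, set()).add(sender)

    def are_friends(self, a, b):
        return b in self.friends.get(a, set())


graph = FriendGraph()
graph.send_request("pk", "amitabh")
print(graph.are_friends("pk", "amitabh"))   # False until the request is accepted
graph.accept_request("pk", "amitabh")
print(graph.are_friends("amitabh", "pk"))   # True, and true in both directions
```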
2. Twitter:
Now, let us go to Twitter.

Twitter is a unidirectional network; that is, I can follow anybody on Twitter. I can
follow Amitabh Bachchan, and I then get to know what he is posting. Twitter is a
micro blog website. What does that mean? It means that the content generated on
Twitter is limited to 140 characters. It cannot be more than that, whereas on
Facebook a post can be a large text.
In Facebook, we saw friends. Here it is followers and followings. This is my
account. I have 1156 followers, which is the people who are following me; when I
post some content, about 1100 people are going to see it. The 176 followings are the
people whom I am following; if any of these 176 people posts some content, I am
going to get it. So, until now I have 1470 tweets, 176 followings and 1156
followers.
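In contrast to the friendship model, a unidirectional network can be sketched as a
directed graph in which a follow needs no acceptance. The sketch below is a toy model
with hypothetical names; it ignores real-world details such as protected accounts.

```python
from collections import defaultdict


class FollowGraph:
    """Toy model of a unidirectional network: following needs no approval."""

    def __init__(self):
        self.following = defaultdict(set)   # user -> accounts the user follows
        self.followers = defaultdict(set)   # user -> accounts following the user

    def follow(self, follower, followee):
        # A single directed edge; the followee does not follow back.
        self.following[follower].add(followee)
        self.followers[followee].add(follower)

    def timeline_sources(self, user):
        """Accounts whose posts would show up on this user's timeline."""
        return set(self.following[user])


follows = FollowGraph()
follows.follow("pk", "amitabh")
print(follows.timeline_sources("pk"))        # pk sees posts from amitabh
print(follows.timeline_sources("amitabh"))   # empty: the edge is one-way
```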

Here, this is my timeline, which has posts from any of the 176 people. Twitter also
shows a lot of promotional posts here. There are also notifications, which cover any
activity that relates to you; for example, if your tweet is retweeted or liked, or
if anybody starts following you, all of this information shows up in notifications.
There are also hashtags. You saw one of the posts that I did on Facebook, which had
the hashtag psosmonnptel; that is the hashtag I am using for this class as well. If
anybody in the class is posting on social networks, please add the hashtag
psosmonnptel. I plan to collect this data and look at how the hashtag is being used,
if at all you are talking about the course on social networks. Next, trends: trends
are something that Twitter made very popular. Here you see Delhi trends; I have set
it to Delhi, and you can change it to other locations that you may be interested in.
In this case, Mr. RobotOnInfinity is something that is trending now, as is
ModiForeignAchievement. These are all hashtags that are trending; words that are not
hashtags can also trend. These trends are for Delhi: they are the words or hashtags
that are popular as of now. That is what trending is. In Facebook, we talked about
likes, comments and shares; in the Twitter world it is retweet, reply and like.
Twitter used to call this favorite earlier, but now it is like.
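As a rough illustration of what "popular as of now" means, hashtags can be pulled out
of a batch of posts and counted. This is a deliberately simplified sketch: how Twitter
actually computes trends is not described here, and the sample posts and function name
are made up.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r"#(\w+)")


def top_hashtags(posts, n=3):
    """Return the n most frequent hashtags in posts, ignoring letter case."""
    counts = Counter()
    for text in posts:
        counts.update(tag.lower() for tag in HASHTAG_RE.findall(text))
    return counts.most_common(n)


sample_posts = [
    "Traffic is bad today #Delhi",
    "Enrolled in the course #privacy #delhi",
    "Reading about #Privacy on social media",
    "Good morning #DELHI",
]
print(top_hashtags(sample_posts, n=2))
```

Lower-casing the tags treats #Delhi and #delhi as the same trend, which is a design
choice of this sketch rather than a statement about any real service.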
Again, I am going through some basic things that you will look at while collecting
and analyzing data. So, Twitter is a unidirectional network and a micro blogging
site. Users can post text, images, videos, links and things like that, and the
interactions are reply, retweet and like. We also talked about hashtags. Now, let us
look at the third social network. First we did Facebook, which is more about
interactions with friends and is a combination of many types of content. Twitter is
micro blogging. Now we look at LinkedIn, which is more of a professional network.

So, this is how LinkedIn looks. LinkedIn basically works on connections. In Facebook
the term is friends, and in Twitter it is followers and followings; in LinkedIn it
is connections. You will rarely find people posting on LinkedIn to say "I am on
vacation in Kerala". That is not the kind of post people make here; people here are
more serious. They talk about their job activities and look for people to hire, and
a lot of recruitment goes on through LinkedIn these days.
Foursquare:

Foursquare is primarily a location-based social network; the network is based on
locations. Users go to specific locations and do something called a check-in.
Check-in is the basic function of Foursquare, like checking in to a hotel or an
airport. When you check in, the Foursquare system understands that you are at that
location, and your friends get to know that you are there. This can be used in
multiple ways. For example, the search that I have on the screen now is for food in
New Delhi; it shows me some restaurants that I can go to, with ratings, and you can
check in to a restaurant. You can also leave a tip at a location, saying that the
food is good and things like that. So, that is Foursquare; again, the building block
is location. Commenting is also built into it, in a form that is like comments in
other social networks.
As I said, we are looking only at the popular ones: Facebook, Twitter, LinkedIn,
Foursquare, and now Google Plus. I did the same post here that I showed on Facebook.
Interestingly, in Google Plus friends are organized into circles: you add people to
your circle, and you get added to their circles. So, that is Google Plus. Again, as
in other networks, you can add text, images and a link to a video. Here you +1 a
post, which is similar to a like in Facebook and similar to a retweet in Twitter. In
LinkedIn, too, you can 'like' a post and comment on it.
Terminologies of twitter:

A mention is when you refer to a person in a post that you make. Let me go to my
Twitter account. If I want to mention the Prime Minister, I can write @narendramodi
in the post. This sends a notification to the handle narendramodi, saying that a
post has been made with his handle.
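When analyzing collected tweets, the handles that a post notifies can be pulled out of
the text with a regular expression. This is a minimal sketch: the pattern is an
assumption about what counts as a handle, and the function name and sample text are
made up for illustration.

```python
import re

# "@" followed by word characters, but not when the "@" is glued to a preceding
# word character (which would make it look like part of an email address).
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")


def extract_mentions(text):
    """Return the handles mentioned in a post, without the leading @."""
    return MENTION_RE.findall(text)


print(extract_mentions("Thank you @narendramodi, write to me at pk@example.com"))
```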

3. YouTube

YouTube is another popular video sharing website. Given that the majority of you
would have used YouTube, I am not going to do a detailed review of it. Here again,
you can upload a video, and you can like or share a video. You can create a channel,
people can subscribe to your channel, you can report a video as spam or malicious
content, and you can post comments on a video. So, essentially, many of the things
that we talked about on Facebook and other social networks can also be done on
YouTube.
4. Pinterest

Pinterest is another popular social network, and was one of the fastest growing
social networks when it started; its first one million users came in quickly. The
basic building block of Pinterest is an image: I take a picture and post it on
Pinterest, and my friends get to see it. The pictures are all attached to boards;
boards are the basic way by which the images are connected.
5. Periscope

Periscope is another interesting social network, where the basic building block is
live streaming of video. On YouTube, you upload a video and it gets stored there.
Here, the video is uploaded in real time; it is live. Here is a simple example of
Periscope.
6. Tinder

On Tinder, the main activities are the left swipe and the right swipe, which people
use to connect with others in a particular location. Say I am travelling to Chennai
and I want to find and meet people in Chennai whose profiles are similar to mine. I
start looking at them on Tinder and I connect with them. That is the basic idea of
Tinder. It is one of the most popular social networks among youngsters now.
7. Whisper

Whisper is one of the anonymous social networks: the content that is uploaded is
anonymous, and it is hard to find out who has posted it. Whisper collates content
under categories such as popular, latest, LOL, confessions, relationships, OMG,
military and other topics. The idea is that you upload content, but the post is
anonymous.
For materials related to Ubuntu and Python basics, refer to tutorial 1 part 1 and
tutorial 1 part 2 respectively.
Frameworks/Platforms:
First, the API, which stands for Application Programming Interface. This enables you
to interact with online social media programmatically and collect data from it. What
does this mean? You can think of it as a tunnel between your program and the social
media service: your program asks for some data, and the service responds with the
data that you asked for.
In our case, we will look at the APIs for Facebook and Twitter, which will help you
collect data from those services. There are other APIs as well; all, or at least the
majority of, social media services provide an API. One important thing to keep in
mind is the rate limit. When you collect data from a social media service, you
cannot collect everything that is available on it, because the companies do not want
to give you all the data. Every social media service sets rate limits on each type
of data that you can collect from it. We will look at rate limits in the tutorials,
for each of the social media APIs.
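One common way for a data collection program to stay within a rate limit is to count
its own requests and pause when the allowance for the current time window is used up.
The sketch below illustrates that idea only. The limit of 2 requests per 10 seconds is
a made-up number, not the limit of any real service, and the class is hypothetical
rather than part of any API library.

```python
import time


class RateLimiter:
    """Allow at most max_calls calls in any window of window_seconds."""

    def __init__(self, max_calls, window_seconds,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window = window_seconds
        self.clock = clock    # injectable so the demo below runs instantly
        self.sleep = sleep
        self.calls = []       # timestamps of calls still inside the window

    def _forget_old_calls(self, now):
        self.calls = [t for t in self.calls if now - t < self.window]

    def wait(self):
        """Block until one more request is allowed, then record it."""
        now = self.clock()
        self._forget_old_calls(now)
        if len(self.calls) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self.sleep(self.window - (now - self.calls[0]))
            now = self.clock()
            self._forget_old_calls(now)
        self.calls.append(now)


# Demo with a fake clock so nothing actually sleeps.
fake_now = [0.0]
naps = []


def fake_sleep(seconds):
    naps.append(seconds)
    fake_now[0] += seconds


limiter = RateLimiter(2, 10, clock=lambda: fake_now[0], sleep=fake_sleep)
for _ in range(3):
    limiter.wait()    # a real program would send its API request right after this
print(naps)           # the third request had to wait for the window to pass
```

In real use the default clock and sleep would be kept, and the limits would be taken
from the documentation of the service being queried.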
Next, Python. Since you have already done a tutorial on Python, I will keep this
really short. It is a programming language that is used to collect data, and it is
one of the popular languages currently used for writing API requests to social media
services.
It also has a lot of libraries for reading URLs, parsing data, interacting with
APIs, understanding JSON objects, and things like that.

Next, the data format. When you send a request to the Facebook API saying 'please
give me all the data about the friends that PK has', or about PK's date of birth, or
about my friends' network, the data comes back in some format. One of these formats
is JSON; we will see briefly what this format means and how to interpret the data
that comes back from Facebook or Twitter. XML, which stands for Extensible Markup
Language, is another format that some social media services return, and JSON is a
little bit like XML.
So, here is what JSON means. JSON stands for JavaScript Object Notation, and it is
the format of the data that you get back from social media services. The example on
this slide shows the JSON object that is returned when you ask for the id and name
of a particular user. This is the Graph API Explorer, which you will see in more
detail in the tutorial. It is essentially a way to look, through the browser, at the
JSON objects for your own Facebook data, or for whatever else the Facebook API
allows.
So, again, to emphasize: JSON is JavaScript Object Notation. When you send a request
through the Facebook or Twitter API saying 'give me this data about PK', the data is
returned in JSON format. It is the format that most social media services use
today.
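To make this concrete, here is a minimal sketch of reading such a response in Python
with the standard json module. The response text is a made-up example shaped like an
id-and-name reply; the values are not real data.

```python
import json

# Hypothetical response body for a request that asked for a user's id and name.
response_text = '{"id": "1234567890", "name": "Example User"}'

user = json.loads(response_text)   # parse the JSON text into a Python dict
print(user["id"])
print(user["name"])
```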

When you want to interpret the JSON data that comes back from Facebook or Twitter,
you can use the JSON viewer at jsonviewer.stack.hu. This is only for you to see
visually what data is coming back. The data that comes back from Facebook is
generally one large block, so you can copy it, paste it into the JSON viewer, and
see what fields it contains.
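The same kind of inspection can also be done locally. The sketch below pretty-prints
a JSON block and lists the fields it contains; the sample data and the helper name
list_fields are made up for illustration.

```python
import json

# Hypothetical block of JSON, as it might arrive on a single line.
raw = ('{"id":"1234567890","name":"Example User",'
       '"friends":{"data":[{"id":"42","name":"Friend One"}]}}')


def list_fields(obj, prefix=""):
    """Yield a dotted path for every field in a parsed JSON value."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            yield path
            yield from list_fields(value, path)
    elif isinstance(obj, list):
        for index, item in enumerate(obj):
            yield from list_fields(item, f"{prefix}[{index}]")


parsed = json.loads(raw)
print(json.dumps(parsed, indent=2))   # indented view of the whole block
fields = list(list_fields(parsed))
print(fields)
```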

So, the API is the way by which you collect the data, and the data comes back as
JSON. When you collect the data, you have to store it in some form. Most of the time
the data is stored in MySQL, which is a relational database: data is stored in rows
and columns, and you can use simple queries to get the data out.
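Here is a small sketch of the rows-and-columns idea. Note that it uses SQLite, through
Python's built-in sqlite3 module, as a stand-in so that it runs without a database
server; it is not MySQL itself, although the SQL shown is of the same simple kind. The
table layout and sample rows are made up.

```python
import sqlite3

# In-memory SQLite database standing in for a MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweets ("
    " id INTEGER PRIMARY KEY,"
    " author TEXT NOT NULL,"
    " text TEXT NOT NULL)"
)

# Each collected tweet becomes one row; each field becomes one column.
rows = [
    (1, "pk", "Welcome to the course"),
    (2, "pk", "Week 1 is out"),
    (3, "student", "Enjoying week 1"),
]
conn.executemany("INSERT INTO tweets (id, author, text) VALUES (?, ?, ?)", rows)
conn.commit()

# A simple query: how many tweets did each author post?
tweets_per_author = conn.execute(
    "SELECT author, COUNT(*) FROM tweets GROUP BY author ORDER BY author"
).fetchall()
print(tweets_per_author)
```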

MongoDB is another popular database, which people have started using more recently.
It is another way in which the data collected from Facebook can be stored.
So, again, to emphasize: there is the API; then there is the programming language;
then there is the MySQL database or MongoDB. The data comes through the API, and is
collected and dumped into MySQL or MongoDB. Now, we also need a way to look at the
data that has been stored. One tool you could use is phpMyAdmin, which allows you to
look at the data that you have in your own database.

So, phpMyAdmin lets you look at the data in MySQL, and RoboMongo lets you look at
the data in MongoDB. Essentially, these are the ways by which you can collect the
data, store the data and look at the data that is available with you.
This is another view of RoboMongo, which shows the different fields that are
available and the data stored in those fields.

All content on Facebook is stored in a graph format. The user, the friends that I
have, the pictures and videos that I upload, and the status updates that I post are
all nodes in the graph. Every interaction, such as comments, likes and things like
that, becomes an edge in this graph. Facebook stores all the data and interactions
that it has in this graph format; that is why its API is called the Graph API.
Here is another view of the same message: all objects are stored as nodes in the
graph; connections like friendships and likes are edges; and all nodes, which are
users, pages and posts, have a unique numeric ID. We will be talking mostly about
users; later we shall also talk about pages, which are one of the ways by which
content can be generated on Facebook.
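The node-and-edge picture above can be sketched with two plain Python structures: a
dictionary of nodes keyed by numeric ID, and a list of typed edges. This only
illustrates the data model; the IDs, field names and edge names are made up and do not
come from the Graph API.

```python
# Every object is a node with a unique numeric ID.
nodes = {
    1001: {"type": "user", "name": "User A"},
    1002: {"type": "user", "name": "User B"},
    2001: {"type": "post", "message": "Hello"},
    3001: {"type": "page", "name": "Some Page"},
}

# Every connection or interaction is an edge: (source ID, kind, target ID).
edges = [
    (1001, "friend", 1002),
    (1002, "friend", 1001),   # friendship is stored in both directions
    (1001, "created", 2001),
    (1002, "likes", 2001),
    (1002, "likes", 3001),
]


def neighbours(node_id, kind):
    """IDs of the nodes reached from node_id over edges of the given kind."""
    return [dst for src, edge_kind, dst in edges
            if src == node_id and edge_kind == kind]


print(neighbours(1002, "likes"))   # what User B has liked
print([nodes[n]["name"] for n in neighbours(1001, "friend")])
```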
For material related to the Facebook API, refer to tutorial 2, parts 1 & 2.
Trust and Credibility on OSM
Many incidents related to misinformation tweets are illustrated with various
examples in lecture 9. The gist is presented here; refer to lecture 9 for a detailed
review.
It was shown there that truthful information comes into social media more slowly
than rumours, and that there are multiple techniques by which you can attack this
problem.

More specifically, I was trying to give you an intuition about what analysis can be
done, in particular who, when, where, what, why and how. These are the kinds of
analysis that you should be interested in doing while looking at social media
content.

We later looked at the different features that are available in tweets, particularly
user features and tweet features. We discussed in detail what these features mean
and how they can be put together to create a classifier that looks at a tweet and
says whether it is legitimate or fake.
This is one slide where I talked about how the data can be represented and what data
has been collected for doing these kinds of analysis.
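As an illustration of how user features and tweet features can be put together, the
sketch below turns one tweet and its author into a single feature vector that a
classifier could be trained on. The specific features and input field names are
assumptions chosen for the example; they are not the feature set from the lecture.

```python
import re


def tweet_features(tweet):
    """Features computed from the tweet text itself."""
    text = tweet["text"]
    return {
        "length": len(text),
        "num_hashtags": len(re.findall(r"#\w+", text)),
        "num_urls": len(re.findall(r"https?://\S+", text)),
        "num_mentions": len(re.findall(r"(?<!\w)@\w+", text)),
    }


def user_features(user):
    """Features computed from the account that posted the tweet."""
    followers = user["followers"]
    following = user["following"]
    return {
        "followers": followers,
        "following": following,
        # max(..., 1) avoids dividing by zero for accounts following nobody.
        "follower_ratio": followers / max(following, 1),
    }


def feature_vector(tweet, user):
    """Combine both groups into (names, values) with a fixed column order."""
    combined = {**tweet_features(tweet), **user_features(user)}
    names = sorted(combined)
    return names, [combined[name] for name in names]


example_tweet = {"text": "Breaking #news http://example.com via @someone"}
example_user = {"followers": 1156, "following": 176}
names, values = feature_vector(example_tweet, example_user)
print(names)
print(values)
```

Sorting the feature names keeps the column order identical for every tweet, which is
what a classifier expects when the vectors are stacked into a training matrix.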

These are geo-located tweets, where each dot is a tweet that has a geotag attached
to it. Such plots can be helpful in saying where the tweets are coming from.
We also talked about how the community of users who post this fake content, and who
created fake accounts, are connected. Interestingly, they are all connected very
closely, and it is a closed community.

I also walked you through multiple architecture diagrams covering how data is
collected from social media, what kind of annotations are done, how you verify that
the data you annotated is of high quality, what kind of feature generation can be
done, what model we developed and what model one can develop, and what evaluation
metrics can be used to find out whether the technique we applied and the model we
created are actually good.
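The text does not say which measures were used to evaluate the model, so the following
is only a sketch of three measures that are commonly used for a fake-versus-legitimate
classifier: precision, recall and F1. The labels and the sample predictions are made
up.

```python
def precision_recall_f1(y_true, y_pred, positive="fake"):
    """Precision, recall and F1 for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Annotated labels (ground truth) versus what a model predicted.
truth = ["fake", "fake", "real", "real"]
predicted = ["fake", "real", "fake", "real"]
print(precision_recall_f1(truth, predicted))
```

Precision asks how many of the tweets flagged as fake really were fake, while recall
asks how many of the fake tweets were caught; F1 combines the two into one number.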
So, this is an architecture that I described in detail, talking about each block and
explaining how all these blocks help in creating some interesting solutions for
problems in the trust and credibility space.
Then I showed you a plugin called TweetCred. I hope some of you have played around
with the TweetCred plugin to see how tweets are evaluated and how a value of x out
of 7 is presented to the users.
Difference between Facebook & Twitter
Facebook is a bidirectional network and Twitter is a unidirectional network, so the
structure itself is very different. The features available to study in these two
social networks are also different: in Twitter it is followers and followings, and
in Facebook it is friends. The information that these networks provide through
their APIs is also very different. In particular, I wanted to highlight friendship
in Facebook: the connections are more personal, so if a post from your friend shows
up, there is some tendency for it to be more truthful, and you tend to believe that
your friend's post is more truthful than a random person's post. That is one of the
differences between the Facebook and Twitter networks, keeping trust and credibility
as the space of discussion. Given this difference, we should also look at how the
model that we have understood for Twitter can be applied to Facebook.
The architecture, if you look at it, is almost the same; in this case it is just
presented slightly differently. FBI stands for Facebook Inspector, a tool similar to
TweetCred. It takes Facebook posts from the Facebook Graph API, looks at them, and
makes a judgement on whether the posts are malicious or not, credible or not,
trustworthy or not.
It is the same architecture: it takes the posts, does some feature extraction,
establishes ground truth for the posts, creates feature vectors out of them, and
builds a model. In this case it is a supervised learning model, because we have
ground-truth data from the posts that we are collecting. It then exposes a RESTful
API through which you can find out whether a post is malicious or not.
One thing that was also mentioned in the TweetCred, or Twitter trust and
credibility, slides is Web of Trust, called WOT. I thought I would briefly mention
what it means. It takes a domain as input and produces a score as output, similar to
TweetCred and other services that you may have seen, which you can use to judge
whether the domain is malicious or not. In the past I also mentioned features such
as how long the domain has been registered, who registered it, and things like
that. These features can be used to make the judgement.
Web of Trust gives you a value of excellent, good, satisfactory, poor or very poor.
If you give it a domain, say iiit dot edu dot in, it comes back with a rating and a
confidence value. We use this in Facebook Inspector because Facebook Inspector also
looks at the URL, or more particularly the domain, as a feature of the post that we
are analyzing.
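Before a domain can be looked up in a reputation service, it has to be pulled out of
the post. The sketch below shows one way to do that with Python's standard library. It
does not call Web of Trust or any other service; the function name and sample post are
made up, and the URL pattern is a simple assumption that will miss links written
without http or https.

```python
import re
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://\S+")


def domains_in_post(text):
    """Return the domain of every http(s) URL found in a post, in order."""
    domains = []
    for url in URL_RE.findall(text):
        host = urlparse(url).hostname   # lower-cased host name, or None
        if host:
            domains.append(host)
    return domains


post = "Results are out at http://iiit.edu.in/news and https://Example.com/page?x=1"
print(domains_in_post(post))
```

Each extracted domain could then be passed to whichever reputation lookup is being
used, and the returned rating added to the post's feature vector.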

So, here is the pointer to the plugin. It will be interesting if you can download it
and play around with it. These are the links to the Chrome extension and to the
Firefox add-on. Facebook Inspector is available both as a Chrome browser plugin and
as a Firefox add-on, which you can use in your browser. So, that is the way you
could think about taking the understanding from Twitter, where we studied how to
build techniques that use features to judge whether a post is credible or not, and
applying it to Facebook.
Privacy & Social Media
How do we define privacy? Privacy is actually very hard to define: what privacy is
for you and what privacy is for me may be very different. The privacy you have in
class sitting with your friends, the privacy you have at home, the privacy you have
while watching a movie, and the privacy you have when going out with friends are all
very different. Every context has different privacy expectations.
In the research domain, Prof. Alan Westin looked at what privacy means and what
kinds of people have what kinds of privacy preferences. He studied this for more
than 30 years and is very well known in the field of understanding the privacy
perceptions of citizens. He asked the same questions on specific topics every year
for about 30 years, and classified US citizens into 3 categories: fundamentalists,
pragmatists and unconcerned.
Fundamentalists are about 25 percent, pragmatists about 60 percent and unconcerned
about 15 percent. Fundamentalists are people who do not give away any personal
information. Pragmatists make decisions about privacy keeping the situation in
mind. The unconcerned are people who readily give away personal information; they
are about 15 percent in the US.
Tutorial 3, Part 1 describes collecting data using the Twitter API. Part 2 describes
MySQL for handling the collected data. Part 3 describes MongoDB, another database
for storing the collected data. For details, go through the respective tutorials.
Privacy Issues:

What kinds of privacy issues do you have on Facebook and Twitter? How do you define
privacy?
One of the definitions of privacy given earlier was: “Privacy is a value so complex,
so entangled in competing and contradictory dimensions, so engorged with various and
distinct meanings, that I sometimes despair whether it can be usefully addressed at
all.” That was Robert Post talking about privacy in ‘Three Concepts of Privacy.’

Fundamentally, privacy has always been talked about as control over information.
Here are two definitions that Alan Westin gave in his book ‘Privacy and Freedom’ in
1967. “Privacy is the claim of individuals, groups or institutions to determine for
themselves when, how and to what extent information about them is communicated to
others.”
So it is basically about determining for myself how much of my information I share
with others. “Each individual is continually engaged in a personal adjustment
process in which he balances the desire for privacy with the desire for disclosure
and communication.” How much do I want to reveal about myself, and how much do I
want to anonymize? That is the way the word privacy is defined: it is the way by
which you control the information that you are spreading.
So, I am sure you get the idea that privacy is very hard to define, and that it
is also very difficult to come up with a list of privacy expectations for any
individual in all given contexts. These definitions strictly convey that privacy
is about control over information. Sometimes it could also be information about
a group: in a more collective society we generally talk about the privacy of a
group instead of individual privacy, whereas in an individualistic society the
privacy of individuals is more protected than the privacy of the group.
Different forms of privacy

Some forms of privacy that people have come up with are information privacy,
communication privacy, territorial privacy and bodily privacy. Most of the time
when we talk about privacy, particularly in courses like this one, we mean
information privacy and in particular internet privacy.
Communication privacy covers telephones and other forms of communication.
Territorial privacy is about my living space, my home, my city, my country and
the topics around that. Bodily privacy is about the self: information about my
own physical presence is also discussed under privacy. A CCTV camera is one
example where bodily privacy can be attacked.
Companies like Facebook, Microsoft, Google and Apple have acquired a lot of
face recognition companies in the last few years, to study, understand and use
these technologies to identify faces in pictures that are uploaded on social
networks or online services. It has become very important to apply techniques
like machine learning and deep learning to these images to study what is
happening on online social networks.
If you look at what is going on currently with the pictures that are uploaded
and the privacy of individuals, public self-disclosure through online social
networks is increasing. I take a selfie standing near one of the important
spots in Delhi and upload it, and you know that I am in Delhi; or I take a
picture next to the Taj Mahal and upload it on my Facebook account, and you
know that I am traveling to the Taj Mahal now. That is self-disclosure through
online social networks, and many issues arise because of self-disclosure of
information on Twitter, Facebook, Instagram and other networks.
On one side, this increase in public information is going on. In parallel,
there is also an increase in face recognition accuracy. Earlier the accuracy
was lower; now the techniques and technologies have improved. On networks like
Facebook in particular the accuracy is pretty high, because the search space
for a particular face in the picture that you upload is only your friends:
most of the time you are going to be taking pictures with friends to whom you
are already connected, or who are probably one to two hours away from you.
So, that is happening on one side. There is also the whole idea of the cloud:
storing information on the cloud and computing easily, with the cost of
computing becoming lower and lower for any of these analyses. On the fourth
dimension, the problem is that identification of users, who they are and what
kind of information they value, is also getting better. Concepts like
k-anonymity came in 15 or 20 years ago, but many further and more advanced
techniques have since been developed to identify users, to identify faces, to
identify information about users, and to re-identify people across social
networks and other networks.
Those are four different things that are coming together: increasing
self-disclosure, improving accuracy of face recognition techniques, the whole
idea of cloud and ubiquitous computing, and techniques for re-identification
of users getting better and better.

One important and interesting question that people could ask is: can one
combine publicly available online social network data with off-the-shelf face
recognition technology, which is already available, to re-identify individuals
and find potentially sensitive information?
Here is the goal. Take un-identified sources, which is any website that you can
think of: match dot com, shaadi dot com, photos from Flickr, CCTV feeds and
things like that, where it is impossible or very hard to identify the users
because they are not disclosing who they are. They may use pseudonyms or names
that you cannot link back to a particular person. Can we take these sources and
connect them to identified sources? On Facebook I reveal that I am so and so;
on LinkedIn I say that I am so and so; the same holds for government websites
and other services. So there are un-identified sources like shaadi dot com, and
identified sources where I disclose who I am and upload a picture, and my
account is ponnurangam.kumaraguru. Can we put these two together to get some
sensitive information about the individual, for example gender orientation,
Social Security Number, Aadhaar card number and information like that?
That is the idea that was built on to create something called k-anonymity. The
problem Latanya Sweeney highlighted was that by bringing together two different
sets of data, independent medical data and voter data, you could re-identify
users uniquely.
Similarly, various examples are discussed in this topic.
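The linking of medical data with voter data can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the records are
made up, and the field names (zip, birth_date, sex) and the helper names
(key, k_anonymity, link) are hypothetical choices for this sketch, not taken
from the lecture.

```python
from collections import Counter

# Made-up records for illustration only.
medical = [  # "un-identified" source: sensitive data, no names
    {"zip": "110020", "birth_date": "1990-01-15", "sex": "F", "diagnosis": "asthma"},
    {"zip": "110020", "birth_date": "1985-07-02", "sex": "M", "diagnosis": "diabetes"},
    {"zip": "560001", "birth_date": "1990-01-15", "sex": "F", "diagnosis": "flu"},
]
voters = [  # "identified" source: names are present
    {"name": "Asha", "zip": "110020", "birth_date": "1990-01-15", "sex": "F"},
    {"name": "Ravi", "zip": "110020", "birth_date": "1985-07-02", "sex": "M"},
    {"name": "Meena", "zip": "400001", "birth_date": "1979-03-30", "sex": "F"},
]

# Assumed quasi-identifiers: fields that appear in both data sets.
QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")


def key(record):
    """Quasi-identifier tuple used to join the two data sets."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def k_anonymity(records):
    """Smallest number of records sharing the same quasi-identifiers."""
    return min(Counter(key(r) for r in records).values())


def link(unidentified, identified):
    """Pair a name with a sensitive value when exactly one person matches."""
    index = {}
    for person in identified:
        index.setdefault(key(person), []).append(person)
    matches = []
    for record in unidentified:
        candidates = index.get(key(record), [])
        if len(candidates) == 1:
            matches.append((candidates[0]["name"], record["diagnosis"]))
    return matches


reidentified = link(medical, voters)
print(reidentified)          # [('Asha', 'asthma'), ('Ravi', 'diabetes')]
print(k_anonymity(medical))  # 1: every record is unique on the quasi-identifiers
```

A data set is k-anonymous when every combination of quasi-identifiers is shared
by at least k records; with k equal to 1, as here, a join like this can single
out individuals.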
Study related to Foursquare:
Here is one study that researchers have done to show that data from
Foursquare, one of the very popular location-based social networks, can be
used to find out where you live.
Two pieces of research, done back to back, infer the home location from the
check-ins that you do on Foursquare. I am not going to get into the details of
the studies, which is why I put pointers to the papers, but I will tell you in
general how this was done, how this could be done, and how you can look at
some of the data yourself.
Foursquare is one of the popular online social networks built just around
locations; it is called a location-based social network. The main concepts in
Foursquare are check-ins, tips and mayorships. Just as you check in to a hotel
or an airport, you check in to a location on Foursquare. You can also leave a
tip: say I go to Saravana Bhavan, Connaught Place in Delhi, and have food
there; I can leave a tip saying that the food was pretty good. You can also
become a mayor on Foursquare: if I visit a location the most number of times
in the last 60 days, I become the mayor of that location.
The mayor information can be pretty useful. Today organizations are monetizing
check-ins and mayorships on Foursquare; for example, some provide you a free
parking spot if you are the mayor of that location for the week. The check-in
information can also be used to find out your location, including your home
location. People have studied other things too, such as whether your home
location can be found from the pictures that you upload. This work
specifically focused on finding the home location from social networks like
Foursquare, and it found the home location with high confidence from the
check-ins that people had done.
Another conclusion was that the mobility of people is not that high: people do
not move a lot from their current location. With information like check-ins
and mayorships, the researchers were able to find the home location of a
person with high confidence, within a few kilometers of error.
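To get a feel for how check-ins can leak a home location, here is a minimal
sketch. It is not the method used in the papers: it assumes a simple heuristic
(home is the most visited grid cell among night-time check-ins), and the
check-in values, the infer_home function and its parameters are all
hypothetical.

```python
from collections import Counter

# Made-up check-ins: (latitude, longitude, hour of day in 24h format).
checkins = [
    (28.5459, 77.2732, 23),
    (28.5461, 77.2729, 22),
    (28.6315, 77.2167, 13),
    (28.5457, 77.2735, 7),
    (28.6129, 77.2295, 18),
]


def infer_home(checkins, cell_size=0.01, night=(21, 8)):
    """Guess home as the most visited grid cell among night-time check-ins.

    cell_size is in degrees (0.01 degree of latitude is roughly a kilometer);
    night is an assumed (start_hour, end_hour) window that wraps past midnight.
    """
    start, end = night
    counts = Counter()
    for lat, lon, hour in checkins:
        if hour >= start or hour < end:
            # Snap the coordinates to a coarse grid cell.
            cell = (round(lat / cell_size), round(lon / cell_size))
            counts[cell] += 1
    if not counts:
        return None
    (cell_lat, cell_lon), _ = counts.most_common(1)[0]
    return (cell_lat * cell_size, cell_lon * cell_size)


home = infer_home(checkins)
print(home)  # centre of the grid cell with the most night-time check-ins
```

Even this crude heuristic only needs a handful of late-evening and early-morning
check-ins, which is consistent with the observation that people do not move a
lot from their current location.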
To summarize social network data privacy: initially we saw some surveys where
people talked about their information on social networks. Then we looked at
studies where pictures uploaded on social media and pictures uploaded on
publicly available websites, which were called un-identified sources, can be
used to uniquely identify an individual. And here we saw that your location
can also be inferred from social networks like Foursquare.

Policing and Online Social Media:


How many of you are friends with the police on your social network? By police,
I mean police organizations. How often do you use a social network to post
comments to or interact with the police? The questions that I am going to try
to address in this week's content are: what have the police been using online
social media for, what can they do, and how can people using social media,
like you and me, participate with the police on online social networks?
Here is the general way in which police organizations are using social media
services. This is the Facebook page of Bangalore City Police. In India now,
Bangalore City Police, Bangalore Traffic Police and Delhi Traffic Police are
very popular handles. That is why teaching this particular course is pretty
exciting for me: we are looking at topics which are very relevant. Just open
your Facebook now and look at Bangalore City Police, and you will see some of
the things I am talking about. Bangalore City Police is a verified page; you
can see a blue tick next to the name at the top left.
The page has thousands and thousands of likes. This picture was taken some
time back, so I am sure the number of likes has changed now. At the bottom,
the picture shows an example of a typical post that comes from these kinds of
police pages. The post says ‘we are taking up traffic signals synchronization
on 10 corridors in the city for smooth traffic flow’ and it comes from the
handle AddICPTraffic. So the idea is that citizens can be informed about the
decisions that the police are making.
So, let us discuss the specifics of how data from police organizations can be
collected, and what kind of analysis can be done to find some interesting
things.
Here is one research question that you can think about, the objective of the
study: can online social media support police in getting actionable
information about crime, and residents’ opinions about policing activities in
urban cities? That is the goal. So, let us see if we can reach this objective
by studying some data from Facebook and Twitter and making some useful
inferences.
So, let me break this objective into pieces. First, can we use Facebook to
support police in getting actionable information? Actionable information is
something like: do this, can you get this done. I am having a problem in the
street, there are traffic issues on the road, there is a pothole on this
street, a car has broken down. These are pieces of actionable information that
police organizations can take from posts and that are useful for decision
making. Second, residents' opinion: what people think and say about the police
is also useful information for police organizations.
Recall that re-identification of a particular individual is the concept of
taking some un-identified data and some identified data, putting them
together and identifying the users. In the policing context, we did this work
on social networks for police and residents in India, exploring online
communications.
Multiple police organizations have adopted Facebook and Twitter for sharing
information and interacting with citizens, and that is the topic we saw in the
context of policing. A specific question we saw was how we can use this data
from social media to collect actionable information.
E-Crime:
E-crime or cyber crime is anything around electronic crime, but here we focus
only on the social media context. Crimes happen all over the internet and the
web, but we will focus only on the kinds of crimes that happen on social
media.
The first one is phishing. Again, these are not arranged in any particular
order and the list is not comprehensive. Phishing on social media services is
the act of tricking someone into handing over login details. In the
traditional email setting, you get emails which say please click on this link
to update your password, or your account has expired, click on this link to
activate your account.
When you click on the link you are taken to a fake website, which sometimes
looks like a legitimate website, but does not always need to. There it asks
for a username and password, and when you give them, you are basically sharing
your credentials with someone else. Spear phishing is a way by which you
target a specific set of people.
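One simple defensive check against such links is to compare the host name in
the link with the domain you expect. The sketch below is only an illustration:
the function name looks_suspicious and the example URLs are made up, and real
phishing detection uses many more signals than the host name.

```python
from urllib.parse import urlparse


def looks_suspicious(url, legitimate_domain):
    """Return True when the link's host is not the expected domain
    or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    domain = legitimate_domain.lower()
    return not (host == domain or host.endswith("." + domain))


# Hypothetical links of the kind a phishing message might carry.
print(looks_suspicious("https://www.facebook.com/login", "facebook.com"))
# False: www.facebook.com is a subdomain of facebook.com
print(looks_suspicious("http://facebook.com.login-update.example/reset", "facebook.com"))
# True: the real host is login-update.example, only prefixed with a familiar name
```

The second URL shows why looking only at the beginning of a link is not enough:
the part that decides where you land is the end of the host name.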
The second one is fake comments on popular posts. Some posts become very
popular: if the Prime Minister or Obama is talking about something, those
posts become very popular, and they also attract a lot of comments. For
example, if you look now at the Olympics Facebook page, Twitter handle or
hashtag, people are talking a lot about what is going on in the Olympics.
The third one is fake live streaming videos. Particularly around the Olympics,
cricket matches, world cups and things like that, people tend to look for
these matches live. Here is an example where a post claims to offer live video
of a match. If you are interested in watching it on your laptop or phone, you
tend to look at these pages and links which talk about the game, and you end
up being taken to a fake website.
The next one is fake online discounts. Scammers take a real organization, in
this case Netflix, but it could be Facebook, Flipkart or any real
organization. They create fake accounts that look like the real business and
carry out business through these fake listings while offering you discounts.
For example, a fake Netflix page could say that there is a 10 percent discount
on a Netflix account that you open now, or a 40 percent discount for the next
6 months if you open the account right now. These kinds of posts can lure
people into using these fake accounts, fake pages and fake services.
The next type is fake online surveys and contests. These kinds of scams have
been around for a long time: the criminals or scammers get you to fill in a
survey to get some money or some information. For example, take a personality
test to find out your personality and find other people who have the same
personality, and things like that. There were also contests: win 1000 Rupees
for filling in this survey. These have existed in traditional forms, and now
they have moved on to social media services.

Social reputation has become a big deal now. Everybody talks about having 2.5
million followers, and the number of likes that you have on your page is
becoming the way that people measure your influence in society. It does not
have to be celebrities or politicians; even among friends you are more and
more curious about how many friends others have. Social status is now being
measured by presence on social media: by the number of likes that you get on
posts and the number of friends that you have.
One example that I had put in here is Flipkart: social reputation can be
manipulated by writing good reviews about a product. Reviews are a big way by
which the social reputation of the product, of the company and of the seller
can all be manipulated. It is a very big problem when studying Amazon’s or
Flipkart’s product reviews.

Here is another problem in terms of crimes on social media: clickbaiting. You
go to read a particular page of news or something, and there you are presented
with information, sometimes relevant and sometimes not, that takes you to a
fake website. In this case too, the link was presented on one of the social
media services and it took users to a fake website. Clickbaiting is getting
you to click on links which are not legitimate.

Hashtag hijacking is also becoming a big issue these days. I assume all of you
know what a hashtag is. A hashtag is the way by which a particular set of
tweets is tied to a topic: if you want to talk about the Olympics now, you use
hashtag Rio 2016. By using hashtag Rio 2016, you are saying that the content I
am posting is connected to this topic, so Twitter can bring together all the
posts which have hashtag Rio 2016 and show them to people who are interested
in it. That is the logic behind using a hashtag.
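The grouping of posts by hashtag can be sketched as follows. This is a minimal
illustration with made-up tweets; the regular expression is a simplification
and is not how Twitter itself parses hashtags.

```python
import re
from collections import defaultdict

# Made-up tweets for illustration.
tweets = [
    "What a race! #Rio2016",
    "Medal ceremony starts soon #rio2016 #Olympics",
    "Traffic is terrible today",
]

# Simplified pattern: a '#' followed by letters, digits or underscores.
HASHTAG = re.compile(r"#(\w+)")

by_hashtag = defaultdict(list)
for tweet in tweets:
    for tag in HASHTAG.findall(tweet):
        # Lowercase so that #Rio2016 and #rio2016 land in the same bucket.
        by_hashtag[tag.lower()].append(tweet)

for tag, posts in by_hashtag.items():
    print(tag, len(posts))
```

Because anyone can attach any hashtag to any post, nothing in this grouping
checks that the content is really about the topic.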
In this example, Coca-Cola has posted a tweet which says, ‘Time for a Royal
Celebration hashtag Royalbaby’. Here Coke is using a hashtag which is popular
for unrelated reasons to sell its product. That is hijacking: the royal baby
has nothing to do with Coke, and they are using the hashtag to promote their
products. So that is one way of hijacking a hashtag.
Compromised accounts: I showed this particular tweet in my trust and
credibility section, but I brought it back to explain the problem in the
context of e-crimes. The Associated Press is a verified account, and this
account was compromised for some time, meaning somebody else had access to it.
The tweet was ‘Breaking: Two Explosions in the White House and Barack Obama is
injured’. I am sure you can all imagine the effect this tweet must have had.
An account is compromised when somebody else gets access to your account
because of a leak of the username and password, and then misuses the account.

Impersonation is another problem. For any of you in the class, I can take some
details that I know, such as pictures, your city and other information that I
can collate from online sources, and use them to create an account which looks
as though it is you. Here is a complaint that Kiran Rao filed saying that a
fake account had been created, and there are many fake accounts like this. If
you remember the policing section, I also showed you fake accounts of police
organizations. So it is not just about individuals; fake accounts are created
for organizations too.

Here is another interesting problem: the work from home scam. Again, these
things have existed in traditional forms. For example, if you are driving and
stop at a signal, you will see a poster which says ‘want to earn 1000 Rupees a
day sitting at home? Please call this number’. Here is an example of a scam
that became popular on Pinterest, where this image was floating around: ‘want
to make an extra salary simply by filling out surveys for major companies?
Here is a website to go to. You get paid 5 to 40 dollars per survey.’ This is
a work from home scam. There are a lot of similar scams, in different
versions, that are very popular on social networks.
Link Farming
Search engines rank websites using the PageRank idea, where pages link to
other pages and the PageRank of a page increases depending on the links it
receives from other pages. Essentially, a high in-degree helps in increasing
the PageRank.
Google works on this technology. Say you create a website and link it to
IIITD’s website, and IIITD links back to you; then your PageRank increases
heavily. That is the simple idea of PageRank. Link farming on the web is the
practice where websites exchange reciprocal links with other sites to improve
their rank. Creating links between websites which would not otherwise be
there, or inflating the links from websites to other websites, is link
farming.
The two ideas are connected: PageRank assumes the links that you create are
benign or legitimate, while link farming creates links which are not benign.

Here is a simple diagram to show what link farming is. A link farm is a form
of spamming the index of a search engine by increasing the links between
different websites, for example websites A and B. If you start creating links
between these websites where none existed, that is called link farming.
Sometimes it is also called spamdexing. Link farming is a way in which
non-legitimate links are created between websites. The reason for doing this
is that when the in-degree, which is the number of links coming into the
website, increases, the PageRank of the website automatically increases.
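The effect of extra incoming links on PageRank can be seen with a small power
iteration. This is a textbook-style sketch, not Google's actual ranking
system; the page names, the damping factor of 0.85 and the two toy link graphs
are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to.
    Every linked page must also appear as a key."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small base share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Page S before link farming: nobody links to it.
honest = {"A": ["B"], "B": ["C"], "C": ["A"], "S": ["A"]}
# Page S after link farming: three farm pages F1..F3 exist only to link to S.
farmed = {
    "A": ["B"], "B": ["C"], "C": ["A"], "S": ["A"],
    "F1": ["S"], "F2": ["S"], "F3": ["S"],
}

honest_rank = pagerank(honest)
farmed_rank = pagerank(farmed)
print(round(honest_rank["S"], 4), round(farmed_rank["S"], 4))
```

In this toy graph the rank of S roughly doubles once the farm pages point to
it, even though the total rank is now shared among more pages.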
On Twitter, a large amount of data is generated and real-time information
spreads there; when users search for a topic, the results are presented
depending on a rank that is based on the follow links between users. Link
farming on Twitter means that spammers follow other users and attempt to get
them to follow back. How do they increase their in-degree? If I am a spammer,
I start following thousands and thousands of people, and there is a
probability that the people I follow will follow me back. Multiple researchers
have shown this reciprocity: there is a high probability that if I follow you,
you will follow me back. With that effect the link farm keeps growing, and
therefore Twitter can be used for link farming.
Here is a slide to show the differences and similarities of link farming on
the web and on Twitter. On the web, increasing my in-degree increases my
PageRank and my probability of showing up in search results. On Twitter,
increasing the in-degree similarly increases the chance of my tweets showing
up in search results.
On the web, it is spammers who use link farming. On Twitter, spammers do link
farming too, but it is also done by legitimate and popular users: you increase
your in-degree by making your number of followers high, and therefore your
content can be presented to a large set of users. Also, on the web, it is not
necessarily true that if I link to your website you are going to link back to
mine; hyperlinks are not created that way.
On Twitter, in contrast, if I follow one of the students who are taking this
class, there is a high probability that the student is going to follow me
back, and in the same way if I follow a professor there is a high probability
that the professor will follow me back.
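Follow-back behaviour can be measured as the reciprocity of the follow graph.
Here is a minimal sketch with a made-up set of follow edges; the account names
and the helper functions are hypothetical.

```python
# Each pair (a, b) means "a follows b". Made-up data for illustration.
follows = {
    ("spammer", "u1"),
    ("spammer", "u2"),
    ("spammer", "u3"),
    ("u1", "spammer"),
    ("u3", "spammer"),
}


def reciprocity(follows):
    """Fraction of follow edges whose reverse edge also exists."""
    reciprocated = sum(1 for a, b in follows if (b, a) in follows)
    return reciprocated / len(follows)


def in_degree(follows, user):
    """Number of followers of the given user."""
    return sum(1 for _, b in follows if b == user)


print(reciprocity(follows))           # 0.8
print(in_degree(follows, "spammer"))  # 2
```

Here the spammer followed three accounts and two followed back, so the
spammer's in-degree went from 0 to 2 without posting anything useful.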
Social Network Analysis

This is an example of my Twitter followees network graph. Nodes are my
followees and an edge signifies that one node is following the other node.
A graph is a data structure which consists of a finite set of nodes and edges.
Nodes represent the entities of a social network, like users, pages or groups.
Edges define the relationships between nodes. For instance, a directed edge
from user a to user b can mean that a follows b, and an edge between user a
and a page can mean that the user likes that particular page.

However, how would a computer understand such a node-edge format? There are
various ways to represent a node-edge graph; some of the most widely used are
the adjacency matrix, the GraphML format and CSV files. Let us look at what
they are.
An adjacency matrix is a 2-dimensional square matrix whose size is equal to
the number of nodes in the graph. In this particular example, since the graph
has 6 nodes, the size of the corresponding adjacency matrix is 6 x 6. The cell
at the intersection of the ith row and jth column is 1 if an edge exists from
node i to node j, otherwise 0. In this example, there are edges from node 1 to
nodes 2 and 3. Therefore, the cell at the intersection of the first row and
second column gets a 1.
Similarly, the cell at the intersection of the first row and third column also
gets a 1. However, there is no edge between node 1 and node 4. Therefore, the
cell at the intersection of the first row and fourth column remains 0. The
rest of the adjacency matrix is filled in a similar manner. An adjacency
matrix is very easy to construct using an array data structure in any
programming language; however, if the input graph has a high number of nodes
and few edges, the resulting adjacency matrix is very sparse and wastes space.
Therefore, let us look at another way to represent a graph, the GraphML
format.
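Before moving on, the adjacency matrix described above can be built in a few
lines of Python. Only the edges from node 1 to nodes 2 and 3 come from the
example; the remaining edges are hypothetical, added just to fill the 6 x 6
matrix.

```python
NUM_NODES = 6

# (source, target) pairs. 1->2 and 1->3 are from the example in the text;
# the other edges are made up for illustration.
edges = [(1, 2), (1, 3), (2, 4), (3, 5), (5, 6)]

# Start with a 6 x 6 matrix of zeros.
matrix = [[0] * NUM_NODES for _ in range(NUM_NODES)]
for source, target in edges:
    # Nodes are numbered from 1, list indices start at 0.
    matrix[source - 1][target - 1] = 1

for row in matrix:
    print(row)

# A sparse alternative: store only the neighbours that exist.
adjacency_list = {node: [] for node in range(1, NUM_NODES + 1)}
for source, target in edges:
    adjacency_list[source].append(target)
print(adjacency_list[1])  # [2, 3]
```

The matrix always holds 36 cells for 6 nodes, while the adjacency list holds
only one entry per edge, which is why sparse graphs are usually stored in a
list or file format instead.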
GraphML is an XML file format for graphs. It consists of an XML file
containing a graph element, within which is an unordered sequence of node and
edge elements. Each node element should have a distinct id attribute, and each
edge element has source and target attributes that identify the end points of
the edge. In this example, we have a graph with 11 nodes, with node ids n0 to
n10. The first edge element signifies that there exists an edge between nodes
n0 and n2. Now, we will learn how to collect your own Twitter following
network graph in GraphML format.
We will be using twecoll, a command line tool, to get Twitter data in GraphML
format. Using twecoll, we will collect information about our followees, who
are also called friends, and about friends of friends. An example is discussed
in Lecture 21; refer to it for more details.
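To see what a GraphML file looks like and how a program can read it, here is a
small sketch using Python's standard XML parser. The three-node document is a
made-up fragment (the example in the lecture has 11 nodes); only the node and
edge elements and the edge from n0 to n2 follow the description above. It does
not show twecoll itself.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written GraphML document for illustration.
GRAPHML = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="directed">
    <node id="n0"/>
    <node id="n1"/>
    <node id="n2"/>
    <edge source="n0" target="n2"/>
    <edge source="n1" target="n2"/>
  </graph>
</graphml>
"""

# GraphML elements live in this XML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

root = ET.fromstring(GRAPHML)
nodes = [node.get("id") for node in root.findall("g:graph/g:node", NS)]
edges = [
    (edge.get("source"), edge.get("target"))
    for edge in root.findall("g:graph/g:edge", NS)
]

print(nodes)  # ['n0', 'n1', 'n2']
print(edges)  # [('n0', 'n2'), ('n1', 'n2')]
```

Tools such as Gephi read the same node and edge elements, so a file collected
in GraphML format can be opened there directly.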
SNA Metrics:

The most commonly used SNA metric is degree. In a directed graph, the
in-degree is equal to the number of edges entering a node. In this example,
for node 2, one edge enters it from node 1 and there is a self loop, therefore
its in-degree is 2. The out-degree equals the number of edges leaving a node.
In this example, edges go from node 2 to node 4 and from node 2 to node 5, and
node 2 also has a self loop. Therefore, the out-degree of node 2 is 3. The
total degree of a node is calculated by summing its in-degree and out-degree.
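The degree counts for node 2 can be checked with a short Python sketch. The
edge list contains only the edges mentioned in the example (1 to 2, the self
loop on 2, 2 to 4 and 2 to 5); any other edges in the original figure are not
included here.

```python
from collections import Counter

# (source, target) pairs taken from the example in the text.
edges = [(1, 2), (2, 2), (2, 4), (2, 5)]

in_degree = Counter(target for _, target in edges)
out_degree = Counter(source for source, _ in edges)

node = 2
print(in_degree[node])                     # 2: from node 1 and the self loop
print(out_degree[node])                    # 3: to 4, to 5 and the self loop
print(in_degree[node] + out_degree[node])  # 5: total degree of node 2
```

Note that the self loop is counted once as an incoming edge and once as an
outgoing edge.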
Another useful SNA metric is centrality, that is, finding out which is the
most central or important node. There are various ways to define centrality;
let us look at them one by one. In-degree centrality finds the node with the
highest in-degree. It can signify the most influential node or, in the case of
a Twitter follower graph, the user with the highest number of followers.
Out-degree centrality helps in locating the node whose out-degree is the
highest. Other ways to measure centrality are betweenness and closeness. The
betweenness centrality of a node is the number of shortest paths between all
pairs of other vertices that pass through that node. Closeness centrality
helps to find the node with the lowest total distance from all other nodes.
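As a small illustration of closeness, the sketch below finds the node with the
lowest total distance to all other nodes using breadth-first search. The
five-node undirected graph is made up, and the code assumes the graph is
connected; betweenness is left out because it needs a longer algorithm.

```python
from collections import deque

# Made-up undirected graph: a path a-b-c-d with an extra node e attached to b.
graph = {
    "a": ["b"],
    "b": ["a", "c", "e"],
    "c": ["b", "d"],
    "d": ["c"],
    "e": ["b"],
}


def total_distance(graph, start):
    """Sum of shortest-path lengths from start to every other node (BFS)."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return sum(distance.values())


totals = {node: total_distance(graph, node) for node in graph}
most_central = min(totals, key=totals.get)
highest_degree = max(graph, key=lambda node: len(graph[node]))

print(totals)          # {'a': 8, 'b': 5, 'c': 6, 'd': 9, 'e': 8}
print(most_central)    # b
print(highest_degree)  # b
```

In this graph the same node wins on both degree and closeness, but on larger
networks the different centrality measures often pick different nodes.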

Let us also briefly look at communities in a graph. A community is a group of
similar or strongly connected nodes. The measure of the strength of a
community structure is modularity, which is based on the fraction of edges
that fall within the given groups, compared with the fraction expected if the
edges were placed at random. Now, we will look at a tool called Gephi for
graph visualization.
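Going back to modularity for a moment, here is a minimal sketch of how it can
be computed for an undirected graph. It uses the standard definition, in which
the fraction of edges inside each group is compared with the fraction expected
from the degrees of its nodes; the two-triangle graph and the function name
are made up for illustration.

```python
def modularity(edges, communities):
    """Modularity of a partition of an undirected graph.

    For each community: (edges inside / m) - (degree sum / 2m) squared,
    where m is the total number of edges.
    """
    m = len(edges)
    q = 0.0
    for community in communities:
        inside = sum(1 for u, v in edges if u in community and v in community)
        # Each edge end that lies in the community adds one to its degree sum.
        degree_sum = sum((u in community) + (v in community) for u, v in edges)
        q += inside / m - (degree_sum / (2 * m)) ** 2
    return q


# Two triangles joined by a single edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]

good_split = [{1, 2, 3}, {4, 5, 6}]
one_group = [{1, 2, 3, 4, 5, 6}]

print(modularity(edges, good_split))  # about 0.357
print(modularity(edges, one_group))   # 0.0
```

Putting every node in one group gives a modularity of 0, while splitting the
graph along the single bridge edge gives a clearly positive value.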
Semantic attacks: Spear phishing

Semantic attacks are attacks where humans are targeted. Bruce Schneier, a
well-known security expert, classified the different types of attacks as
physical, syntactic and semantic. Physical attacks were the ones happening
15-20 years ago, where the attackers would get physical access to the
machine.
Syntactic attacks are attacks on the programs and systems that are built, such
as denial of service attacks, buffer overflow attacks and attacks like that.
Semantic attacks target the way we as humans assign meaning to content. The
specific semantic attack that we will be talking about is phishing.
Here is the broad categorization. Security attacks are physical, syntactic
and semantic, which is Bruce Schneier's classification. Within semantic
attacks there are multiple categories: phishing, mules, Nigerian 419 scams and
attacks like that. Within phishing there are again multiple categories. Update
your information: a bank, say ICICI Bank, sends you a message saying please
update your information within the next 24 hours or your account will be
closed. Verification: a message saying that we want to verify whether it is
really you, please click and verify. Security alert, Microsoft is updating the latest
version of MacOS, there is an update.
Here is the link, please go and update. Mortgage information: your mortgage
payment is coming due, please click this link and do something. All of these
categories of attacks are called phishing attacks, and almost all companies
today are probably victims of phishing. Even academic institutes are probably
victims of phishing attacks.
Here are some kinds of phishing attacks; we briefly mentioned them in the
past, so I will go over them quickly. Phishing is the classical one that I
showed you. Context-aware phishing is like the email that I talked about
sending to the students taking this course. Whaling is an attack sent to the
chief executive officers of companies. Vishing is over the phone, and SMiShing
is over SMS. So, what is social phishing? That is the topic of discussion for
the rest of this week.

Social phishing is nothing but looking at information from the social
context and then using that to actually phish. It is not only about finding out
whether you are taking a course; it could be many other pieces of information that
I find from a Facebook page, from a Facebook account, things that you have done and
things like that. So, the topics that we have seen until now use older data: we saw
Latanya Sweeney's work using medical health data, and again Latanya Sweeney's work,
using pictures from FB and voter data.
Profile Linking on Online Social Media:

Here, the top one is my Facebook profile and the one at the bottom is my
Twitter profile.
The next slide is my LinkedIn profile, the public LinkedIn profile. So, if you
look at these three images, you can think about some features that you can use for
deciding whether these 3 profiles are mine. For example, you can look at my profile
picture on Facebook and Twitter, which seems to be the same; you can look at some
of the friends that I have on Facebook and the people who are following me or the
accounts that I am following on Twitter; you can look at some of these features to
make the decision. Unfortunately, in the public profile that I have on LinkedIn
there is no profile picture, but there are details like associate professor at IIIT
Delhi, Data Security Council of India, Carnegie Mellon University and connections
like that.
For example, the personal website listed on one profile may actually be linked
to my website at IIIT Delhi. So, you can make all these connections to find out
whether the same details appear on both Facebook and Twitter. I am sure many of you
who are listening to this lecture also have multiple accounts. So, the question
that you can ask yourself is, how do you put your own accounts together to find out
whether they belong to the same person or not? That is the problem that we look at.
So, the problem is tracking social footprints, or identities, across different
social networks, which is finding out whether accounts belong to the same person.
People have multiple accounts on social media, and instead of sending information
to all of those accounts, you want to send information to each person only once.
That is one goal, but there are many use cases for this problem.

The approach that we can take is to list out the common attributes. Facebook
has my gender, my age, the university that I work at and the places that I got my
degrees from. Twitter has my followers, my profile again, the website that I am
connected to, the place that I work, all that information. We can list out all the
common attributes and compare them; in the example that I showed you, the profile
picture is the same on Twitter and on Facebook, and we can actually compare that.
Compare attribute values using syntactic, semantic or graph based methods. The
syntactic and semantic comparisons look at what I am typing on the social networks,
what content I am posting and what the details in my profiles are; the graph is
basically my networks - my friends on Facebook, my followers and followings on
Twitter.
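As a minimal sketch of two of these comparisons, a syntactic score can be taken from a string similarity ratio and a graph based score from the overlap between connection lists. The profile dictionaries, field names and both function names are hypothetical; the sketch also assumes that connected accounts already carry a shared identifier across the two networks, which in practice is itself part of the linking problem. Semantic comparison is not covered here.

```python
from difflib import SequenceMatcher


def syntactic_similarity(a, b):
    """String similarity in [0, 1] between two attribute values, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def graph_similarity(neighbours_a, neighbours_b):
    """Jaccard overlap in [0, 1] between two collections of connected accounts."""
    a, b = set(neighbours_a), set(neighbours_b)
    if not a and not b:
        return 0.0  # no connections on either side: treat as no evidence
    return len(a & b) / len(a | b)


# Hypothetical profiles, made up for illustration only.
facebook = {"name": "Alice Example", "friends": {"bob", "carol", "dave"}}
twitter = {"name": "alice example", "following": {"bob", "carol", "erin"}}

name_score = syntactic_similarity(facebook["name"], twitter["name"])
network_score = graph_similarity(facebook["friends"], twitter["following"])
```

Keeping one score per attribute, rather than a single combined number, makes it easier to see which attribute is driving a match.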
And then look for high similarity. In the example that I showed you, it is
exactly the same profile picture in both places. If things are like that, it is
most likely the same person. One thing that I will talk about a few slides later is
that you do not have to look only at the details as they are now; you can also look
at details from the past. That is, you do not have to look only at the posts that I
did now, or the profile picture that I have now, or the handle that I have now. You
can actually go back in time, look at the posts that I have done, and derive some
information even from that. Details regarding this are available in lecture 26.
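The two signals mentioned here, an identical profile picture and details from the past such as older handles, can be sketched as follows. The function names and the decision rule are assumptions for illustration; comparing hashes only detects byte-for-byte identical image files, so a resized or re-encoded picture would not match.

```python
import hashlib


def same_picture(image_bytes_a, image_bytes_b):
    """True only when the two image files are byte-for-byte identical."""
    return hashlib.sha256(image_bytes_a).digest() == hashlib.sha256(image_bytes_b).digest()


def handle_history_overlap(handles_a, handles_b):
    """Handles (past or present) that appear on both accounts, ignoring case."""
    return {h.lower() for h in handles_a} & {h.lower() for h in handles_b}


def likely_same_person(picture_match, handles_a, handles_b):
    """Rough, hypothetical rule: identical picture, or any handle ever shared."""
    return picture_match or bool(handle_history_overlap(handles_a, handles_b))


# Hypothetical handle histories: the first account used "alice" in the past.
shared = handle_history_overlap(["Alice_2012", "alice"], ["ALICE", "al1ce"])
decision = likely_same_person(False, ["Alice_2012", "alice"], ["ALICE", "al1ce"])
```

Including past handles means two accounts can be linked even when their current usernames no longer resemble each other.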
Anonymous Networks:

So, until now, the social networks that we have been seeing are generally
popular networks like Facebook and Twitter; these are called online social networks.
In particular we have also looked at Foursquare, which is a location based social
network. Then I think we have also briefly talked about ephemeral social networks,
which are networks where the content that is generated is removed after some period
of time. The content is ephemeral, like on Snapchat, where you post some content
and after some time that content gets deleted.
Anonymous networks are networks where it is not clearly visible, or it is not
possible to find out, who is actually posting the content. Some examples of
anonymous networks are 4chan, Whisper, Secret, Yik Yak and Wickr. There are many
such networks available; this is only a small list.

Why do you need an anonymous social network? We already have Facebook, we
already have Twitter, so why might you need a network in the category of anonymous
networks, or a network that provides the facility for anonymity? One reason is the
increasing awareness of privacy: people are getting to know more and more about
privacy, and people want to have more privacy on online social networks. Therefore
people are looking for networks that will give more anonymity. There were also
incidents like the Snowden disclosures and projects like PRISM, where information
that is publicly available, or information that is available to these
organizations, can be used for other purposes also.
Terminologies related to Anonymous Networks:

So, to understand the rest of the lecture, we need to understand some
terminologies. Whispers are the posts. Replies are when I do a post and you reply
to it, like a comment on Facebook or a reply on Twitter. The posts are anonymous;
you really do not get to see that it is ponguru. I may have an account which is
called professor, teaching computer science or anything that I want to keep as the
username. Interestingly, as you have probably seen in the video also, Whisper
allows you to change the username to be as anonymous as you want, and as many times
as you want. That makes it much more difficult to go back and find the person who
posted the content.
And Whisper does not associate any personal information with the user ID; it
is not collecting any such information and does not archive any user history, or at
least that is what they claim. It does not support persistent social links between
users: the links to the person who hearts a whisper or the person who replies to it
are not kept. In contrast, if you remember the homework and the questions that you
have seen in the past in the context of Facebook or Twitter, the relationships
between the users are stored as a graph, and you can retrieve that graph from
Twitter or Facebook, analyze it and use it to make some inferences. You can also
heart a message anonymously. A heart is basically the one that I showed you in the
slide; it is like the like on Facebook. Private messages, which the video I played
a few minutes before also showed, are supported as well: you can actually send
private messages between users.
Gephi Network Visualization:
Gephi is an open source network analysis and visualization software. Gephi
is widely used in a number of research projects in academia, journalism, digital
humanities, etc. Gephi can also take data from social networks like Facebook and
Twitter as input and generate graphs and clusters. Refer to lecture 28 for more
details.
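One simple way to get social network data into Gephi is a CSV edge list. The sketch below writes one with `Source` and `Target` column headers, which is the layout Gephi's spreadsheet import expects for an edges table as far as I know; the follower pairs and the function name `edges_to_gephi_csv` are made up for illustration.

```python
import csv
import io


def edges_to_gephi_csv(edges):
    """Serialise (source, target) pairs as CSV text with Source/Target headers."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()


# Hypothetical "who follows whom" pairs.
follows = [("alice", "bob"), ("bob", "carol"), ("carol", "alice")]
csv_text = edges_to_gephi_csv(follows)

# To load it in Gephi, save the text to a file first, for example:
# with open("edges.csv", "w", newline="") as f:
#     f.write(csv_text)
```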
Location Based Services on Online Social Media:
There are many location based services on online social media, and some are
very popular, like Foursquare, Yelp, Gowalla, Facebook and Twitter. For example, in
Foursquare you could see where the next, let us say, petrol pump is in the
direction that you are travelling. In Yelp you can look at the restaurants or
places that you are interested in and what kind of reviews they have. In Facebook
you can check in to a location. Similarly, you can share geolocation information
on Twitter. Some research papers on this topic of privacy in location based social
networks are analysed. For more details refer to lectures 29 and 30.
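A "where is the nearest petrol pump" query can be sketched with the haversine great-circle distance. This is a minimal sketch, not how any of the services named above actually implement it: the place names and coordinates are hypothetical, and the direction of travel is ignored.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_place(user_lat, user_lon, places):
    """Return the name of the closest place; places are (name, lat, lon) tuples."""
    return min(places, key=lambda p: haversine_km(user_lat, user_lon, p[1], p[2]))[0]


# Hypothetical petrol pumps and user position.
pumps = [("Pump A", 28.60, 77.20), ("Pump B", 28.51, 77.21)]
closest = nearest_place(28.50, 77.20, pumps)
```

The same coordinates that make such a query possible are what make location sharing privacy sensitive: every check-in reveals where the user was.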
