
SQL

Yammer Case Study


With @Pranjal Meshram

The case can be found at: https://mode.com/sql-tutorial/sql-business-analytics-training#analytics-cases-yammer
Yammer
• Yammer is a social network for communicating with coworkers.
Individuals share documents, updates, and ideas by posting them in
groups. Yammer is free to use indefinitely, but companies must pay
license fees if they want access to administrative controls, including
integration with user management systems like ActiveDirectory.
• Yammer has a centralized Analytics team, which sits in the
Engineering organization. Their primary goal is to drive better product
and business decisions using data. They do this partially by providing
tools and education that make other teams within Yammer more
effective at using data to make better decisions. They also perform
ad-hoc analysis to support specific decisions.
The case - A Drop in Engagement
• Engagement dips—you figure out the source of the problem.
• As a Yammer Analyst, I am responsible for triaging product and
business problems as they come up. In many cases, these problems
surface through key metric dashboards that execs and managers
check daily.
Overview
• The problem
• Getting oriented
• Digging in
• Summary
• Making a recommendation
• Answers
The problem
• As of September 2, 2014, the dashboard chart shows the number of engaged users each week.
• Yammer defines engagement as having
made some type of server call by
interacting with the product (shown in the
data as events of type "engagement").
• Any point in this chart can be interpreted
as "the number of users who logged at
least one engagement event during the
week starting on that date."
I am responsible for determining what caused the
dip at the end of the chart shown above and
recommending solutions for the problem.
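
The metric in this chart can be reproduced with a query along these lines. This is a minimal sketch: the table name tutorial.yammer_events follows the Mode tutorial, and it truncates to calendar weeks rather than using the rolling 7-day periods the actual dashboard is built on (see the rollup table later).

SELECT DATE_TRUNC('week', occurred_at) AS week,
       COUNT(DISTINCT user_id)         AS weekly_engaged_users
FROM tutorial.yammer_events
WHERE event_type = 'engagement'
GROUP BY 1
ORDER BY 1;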
Getting oriented
Before I get to the data, I have come up with a list of possible causes (hypotheses) for the dip -

INTERNAL CAUSES
• A collaboration with a large company, or several companies, may have expired.
• Technical problems with accessing Yammer, such as site failure at login.
• It's possible that the code that logs events is itself broken.
• Shutdown of a particular marketing campaign or ad that was driving users with a blitz.
• Reduced effectiveness of email notifications.

EXTERNAL CAUSES
• The market may have entered a recession, causing massive layoffs of workers and hence a drop in users.
• Users may have switched to a competitor platform or a new internal communication channel in their company.
• Holiday or lockdown period.
• For a website that receives a lot of traffic, changes in the way search engines index it could cause big swings in traffic.
• A shift in how users interact with the platform: less logging in, sharing, and messaging.
Digging in – Data Available (1)
Table 1: Users
This table includes one row per user, with descriptive information about that user's account.

user_id: A unique ID per user. Can be joined to user_id in either of the other tables.
created_at: The time the user was created (first signed up)
state: The state of the user (active or pending)
activated_at: The time the user was activated, if they are active
company_id: The ID of the user's company
language: The chosen language of the user
Digging in – Data Available (2)
Table 2: Events
This table includes one row per event, where an event is an action that a user has taken on Yammer. These
events include login events, messaging events, search events, events logged as users progress through a signup
funnel, and events around received emails.
user_id: The ID of the user logging the event. Can be joined to user_id in either of the other tables.

occurred_at: The time the event occurred.


event_type: The general event type. There are two values in this dataset: "signup_flow", which refers to anything occurring during
the process of a user's authentication, and "engagement", which refers to general product usage after the user has
signed up for the first time.
event_name: The specific action the user took. Possible values include:
  create_user: User is added to Yammer's database during the signup process
  enter_email: User begins the signup process by entering her email address
  enter_info: User enters her name and personal information during the signup process
  complete_signup: User completes the entire signup/authentication process
  home_page: User loads the home page
  like_message: User likes another user's message
  login: User logs into Yammer
  search_autocomplete: User selects a search result from the autocomplete list
  search_run: User runs a search query and is taken to the search results page
  search_click_result_X: User clicks search result X on the results page, where X is a number from 1 through 10
  send_message: User posts a message
  view_inbox: User views messages in her inbox
location: The country from which the event was logged (collected through IP address).

device: The type of device used to log the event.


Digging in – Data Available (3)
Table 3: Email Events
This table contains events specific to the sending of emails. It is similar in structure to the events table above.

user_id: The ID of the user to whom the event relates. Can be joined to user_id in either of
the other tables.
occurred_at: The time the event occurred.
action: The name of the event that occurred. "sent_weekly_digest" means that the user
was delivered a digest email showing relevant conversations from the previous
day. "email_open" means that the user opened the email. "email_clickthrough"
means that the user clicked a link in the email.
Digging in – Data Available (4)
Table 4: Rollup Periods
The final table is a lookup table that is used to create rolling time periods.

period_id: This identifies the type of rollup period. The above dashboard uses period 1007, which is rolling 7-day
periods.
time_id: This is the identifier for any given data point; it's what you would put on a chart axis. If time_id is
2014-08-01, it represents the rolling 7-day period leading up to 2014-08-01.

pst_start: The start time of the period in PST. For 2014-08-01, you'll notice that this is 2014-07-25 — one week prior.
Use this to join events to the table.
pst_end: The end time of the period in PST. For 2014-08-01, the end time is 2014-08-01. You can see how this is
used in conjunction with pst_start to join events to this table in the query that produces the above chart.

utc_start: The same as pst_start, but in UTC time.


utc_end: The same as pst_end, but in UTC time.
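
For illustration, a rolling 7-day engagement count can be produced by joining events to this lookup table. This is a sketch assuming the Mode tutorial names benn.dimension_rollup_periods and tutorial.yammer_events:

SELECT r.time_id,
       COUNT(DISTINCT e.user_id) AS rolling_7d_engaged_users
FROM benn.dimension_rollup_periods r
JOIN tutorial.yammer_events e
  ON e.occurred_at >= r.pst_start
 AND e.occurred_at <  r.pst_end
WHERE r.period_id = 1007          -- rolling 7-day periods
  AND e.event_type = 'engagement'
GROUP BY 1
ORDER BY 1;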
Digging in – Analysis
Engagement has indeed dropped significantly compared to the previous month (July).

Note, however, that July itself shows rather inflated engagement relative to earlier months, which exaggerates the sharpness of the decline in August 2014.
Digging in – New users
First, we will look at user-level data to identify any changes in new user sign-ups.

• The number of new active users in fact seems to be increasing over our four-month window.
Digging in – New users
• When we increase the granularity
to daily number of new users, we
see that nothing has really changed
about the growth rate—it
continues to be high during the
week and low on weekends.
• So we can rule out the external-factor hypotheses that the market entered a recession or that a holiday period is responsible.

(Chart: daily new users, plotted for all users and for active users.)
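
A sketch of the daily sign-up query, assuming the tutorial table tutorial.yammer_users and treating users with a non-null activated_at as active:

SELECT DATE_TRUNC('day', created_at) AS day,
       COUNT(*)                      AS all_users,
       COUNT(CASE WHEN activated_at IS NOT NULL THEN 1 END) AS active_users
FROM tutorial.yammer_users
GROUP BY 1
ORDER BY 1;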
Digging in – Existing user engagement
Next, we will see whether the change is driven by existing users, by grouping users into cohorts based on when they signed up for Yammer.
• This chart shows a decrease in engagement among existing users who signed up more than 10 weeks prior.
• A deeper pattern is also visible: engagement decreases as the age of a user's account increases.
• Hence there probably isn't any restriction affecting new traffic to the site, or any change in search engine rankings.
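
One way to build these cohorts is to bucket each engagement event by the age of the user's account at the time of the event. This is a sketch; bucketing by whole weeks is my choice here, not necessarily what the original chart used:

SELECT DATE_TRUNC('week', e.occurred_at) AS week,
       FLOOR(EXTRACT(EPOCH FROM (e.occurred_at - u.activated_at))
             / (7 * 86400))              AS account_age_weeks,
       COUNT(DISTINCT e.user_id)         AS engaged_users
FROM tutorial.yammer_events e
JOIN tutorial.yammer_users u
  ON u.user_id = e.user_id
WHERE e.event_type = 'engagement'
GROUP BY 1, 2
ORDER BY 1, 2;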
Digging in – Engagement per user
With the query sketched below we can see whether the drop was due to a decline in the number of users, a decline in the number of engagements per user, or both.
• As established earlier, we can now clearly note a drop of 8.6% in the number of total users from July 2014 to August 2014.
• The number of events per user also fell from 30 to 26, a decrease of 13.3%, which is also fairly significant.
• We will now shift our focus to event-related data and investigate whether a particular type of event is triggering the drop-off.
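
The query itself is not shown on the slide; a minimal reconstruction might look like this (table name assumed as before):

SELECT DATE_TRUNC('month', occurred_at)          AS month,
       COUNT(DISTINCT user_id)                   AS total_users,
       COUNT(*)::FLOAT / COUNT(DISTINCT user_id) AS events_per_user
FROM tutorial.yammer_events
WHERE event_type = 'engagement'
GROUP BY 1
ORDER BY 1;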
Digging in – Engagement event details
There are 17 event names classified as engagement-generating events. I calculated the month-on-month change for each to find whether a particular feature is broken or no longer used by users.
• The monthly aggregate of all 17 event names shows a sharp decline in August 2014.
• The dip in engagement was largely attributed to home_page, like_message, view_inbox, send_message, and login.
• Hence, it seems the drops in all of these events are related, again pointing to users logging in less or dropping off.
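
A sketch of the per-event breakdown; the monthly totals can then be compared month on month, either downstream or with window functions:

SELECT DATE_TRUNC('month', occurred_at) AS month,
       event_name,
       COUNT(*) AS events
FROM tutorial.yammer_events
WHERE event_type = 'engagement'
GROUP BY 1, 2
ORDER BY 2, 1;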
Digging in – Regional variation
Let's quickly rule out the possibility that the drop in engagement is attributable to regional variation in users.
• The regional composition shows that the drop in engagement has been observed consistently across countries.
• Hence it is not a regional problem and may be related to the engagement events themselves.
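
A sketch of the regional breakdown, using the location recorded on each event:

SELECT DATE_TRUNC('week', occurred_at) AS week,
       location,
       COUNT(DISTINCT user_id) AS engaged_users
FROM tutorial.yammer_events
WHERE event_type = 'engagement'
GROUP BY 1, 2
ORDER BY 1, 2;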
Digging in – Email action
I wanted to take an aggregated look at
the email activity to see if there was a
change in emails sent, click-through
rates (CTRs), or something else that
may have caused a reduction in active
users.
• Notice that actions on re-engagement emails increased steadily, while clickthroughs, weekly digest sends, and email opens decreased.
• Compared to July 2014, however, only the clickthrough rate shows a decline (27%); other activity has remained consistent.
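
A sketch of the weekly email-activity rollup, assuming the tutorial table tutorial.yammer_emails. The table description above names only sent_weekly_digest, email_open, and email_clickthrough, so the re-engagement action value used here is an assumption:

SELECT DATE_TRUNC('week', occurred_at) AS week,
       COUNT(CASE WHEN action = 'sent_weekly_digest' THEN 1 END)      AS weekly_digests,
       COUNT(CASE WHEN action = 'sent_reengagement_email' THEN 1 END) AS reengagement_emails, -- assumed action value
       COUNT(CASE WHEN action = 'email_open' THEN 1 END)              AS opens,
       COUNT(CASE WHEN action = 'email_clickthrough' THEN 1 END)      AS clickthroughs
FROM tutorial.yammer_emails
GROUP BY 1
ORDER BY 1;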
Digging in – Email action
To gather more information, I want to see whether we can narrow the problem further: is it an email-open issue or a clickthrough issue, and for which email type?
• It's evident that the drop in clicks is due to the weekly digest email and not the re-engagement email.
• The drop is specifically attributable to a fall in the clickthrough rate rather than the email open rate.
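
One way to compute open and clickthrough rates per email type is to pair each send with any open or click by the same user shortly afterward. This is a sketch only: the one-day attribution window is my assumption, and counting distinct users approximates per-email rates:

SELECT DATE_TRUNC('week', s.occurred_at) AS week,
       COUNT(DISTINCT o.user_id)::FLOAT / COUNT(DISTINCT s.user_id) AS open_rate,
       COUNT(DISTINCT c.user_id)::FLOAT / COUNT(DISTINCT s.user_id) AS clickthrough_rate
FROM tutorial.yammer_emails s
LEFT JOIN tutorial.yammer_emails o
  ON o.user_id = s.user_id
 AND o.action = 'email_open'
 AND o.occurred_at BETWEEN s.occurred_at AND s.occurred_at + INTERVAL '1 day'
LEFT JOIN tutorial.yammer_emails c
  ON c.user_id = s.user_id
 AND c.action = 'email_clickthrough'
 AND c.occurred_at BETWEEN s.occurred_at AND s.occurred_at + INTERVAL '1 day'
WHERE s.action = 'sent_weekly_digest' -- swap in the re-engagement action to compare
GROUP BY 1
ORDER BY 1;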
Digging in – Device
I wanted to take a deeper look into the clickthrough rates to see if the decline had anything to do with devices. It could relate to the operating system (iOS vs. Android) or to mobile vs. desktop.
• We see that clickthrough rates on laptops and desktop computers were stable from July to August, but not on tablets and phones.
Digging in – Device
By categorizing the device names into 'mobile', 'tablet', and 'laptop', it is possible to confirm that this is the case.
• The drop in clickthrough rates was attributed specifically to mobile devices and tablets.
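
The categorization itself is a CASE expression over the device column of the events table. This is a sketch: the device name lists below are illustrative rather than the dataset's full set, and because the emails table has no device column, computing clickthrough rates by device group additionally requires joining email clicks to these events:

SELECT DATE_TRUNC('month', occurred_at) AS month,
       CASE WHEN device IN ('iphone 5', 'nexus 5', 'samsung galaxy s4') THEN 'mobile' -- illustrative lists
            WHEN device IN ('ipad air', 'kindle fire', 'nexus 7')       THEN 'tablet'
            ELSE 'laptop'
       END AS device_group,
       COUNT(DISTINCT user_id) AS engaged_users
FROM tutorial.yammer_events
WHERE event_type = 'engagement'
GROUP BY 1, 2
ORDER BY 1, 2;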
Summary of the analysis
• The drop in engagement was mainly attributed to a drop in five engagement
events (home_page, like_message, view_inbox, send_message, and login).
• The decrease in events was caused by a reduction in total active users MoM,
as well as a decrease in engagement per user.
• An aggregated look at the emails table showed a significant MoM decrease in
click-through rates for the weekly digest from July to August. The decline is
not caused by the re-engagement email.
• By segmenting the clickthrough rates by device type (mobile, tablet, laptop),
the drop in clickthrough rates was attributed to mobile and tablet devices.
Recommendation
• Immediately take a deeper look into the weekly digest emails
specifically for mobile devices and tablets.
• It's possible that there's a technical problem making it difficult for
users to click links in the email, or simply a UX problem where the content
and layout of the email do not entice users to click.
• A good first step would be to see what changes were made from July to
August and work backward.
• Run a quality check on the code that logs digest-email data to verify
that the data pipelines are working correctly.
• The engagement dips were more pronounced among longer-term
users, indicating a potential need for re-engagement strategies.
Further Analysis
Some other things that could've been looked at include the
following:
• Engagement in July was considerably higher than usual; identify what caused this.
• Cohort analysis revealed a short user lifecycle; further detailed investigation is needed.
• Detailed analysis by language.
• Detailed analysis by geography.
