Music SRA Classifier Training - BaseLine
Updated 10.17.24:
Added clarification and examples to Harmful Intent, Interest in Sensitive Topic, and Ambiguous Intent definitions.
Updated 10.01.24:
Added clarification for following Sensitive Topics: Suicide and Self-Harm, Serious Mental Health Concerns, and Restricted and Regulated Content
Added clarification for following Intent Types: Harmful Intent, Opposing Sensitive Topic, Ambiguous Intent
Updated 09.09.24:
Additional clarification on “Not Applicable, the query contains unintelligible text or non-English language.” response option
Additional guidance on how to approach determining Sensitive Topic
Additional clarification on Serious Mental Health Concerns response option
Additional clarification on Violence response option
Additional clarification on Adult Sexual Material response option
Additional guidance on how to approach determining User Intent
Additional clarification on Ambiguous Intent response option
Additional clarification on Navigating to Specific Content response option
Additional clarification on Navigating to Related Content response option
Instructions
For this task, you will be presented with a query, or a search, that may be entered into a music streaming platform. You will be asked to read the query and evaluate whether it: (1) contains Sensitive Topics, and if so, which topics are present; and (2) references an At-Risk Group. You will also be asked to evaluate the user’s intent.
The following guidelines will explain how to evaluate the query and outline the different Sensitive Topics, potential At-Risk Groups, and the different types of user intent. This will include explanations of each question, and
definitions and examples of each response option.
Here is how you will see the query and the questions as you review in Baseline.
[Task screenshot]
As you review each query, you will respond to the questions below, which relate to user intent and other aspects of the query content.
1. Does the query reference a sensitive topic?
2. Which sensitive topic, if any, is referenced in the query? (Primary Sensitive Topic)
3. If applicable, please select a second sensitive topic referenced in the query. (Sensitive Topic 2; optional)
4. If applicable, please select a third sensitive topic referenced in the query. (Sensitive Topic 3; optional)
5. Does the query reference an at-risk group?
6. What is the user intent?
7. Please explain your selections. (optional)
You’ll evaluate whether a query references one of the 17 Sensitive Topics from the list below. Please familiarize yourself with these topics and review the definitions and examples outlined in Part 2.
Sensitive Topics:
1. Suicide and Self-Harm
2. Serious Mental Health Concerns
3. Toxicity, Stereotypes, and Identity-based Hate
4. Violence
5. Interpersonal Abuse
6. Child Sexual Abuse and Child Pornography
7. Adult Sexual Material
8. Illegal Activities, Goods, and Services
9. Queries Insulting Content or Creators
10. Risk of Reputational Harm and Private Information Leaks
11. Disinformation and Conspiracy Theories
12. Politically Controversial Topics
13. Unsafe Actions
14. Unethical Actions
15. Vulgarity and Offensiveness
16. Restricted and Regulated Content
17. Other type of harmful or offensive text
After you indicate whether the query contains a Sensitive Topic, you will be asked to indicate which Sensitive Topic(s) are referenced in the query. To do so, please review each type of Sensitive Topic. You will see the
following three questions.
Questions 2-4:
1. Which sensitive topic, if any, is referenced in the query? (Primary Sensitive Topic)
2. If applicable, please select a second sensitive topic referenced in the query. (Sensitive Topic 2; optional)
https://baseline.apple.com/training/evaluations/1072/guidelines 1/4
11/5/24, 9:36 PM Guidelines for Search - Music SRA Classifier Training — BaseLine
3. If applicable, please select a third sensitive topic referenced in the query. (Sensitive Topic 3; optional)
Next to each question is a drop-down list of the different Sensitive Topics. Please select which topic(s) are referenced in the query, if any. You can select only one option per question. If the query includes more than 3
Sensitive Topics, please describe the content in the text box of the final question.
Question Guidelines:
Question 2 (“Which sensitive topic, if any, is referenced in the query?”) is required, even if the query does not reference a Sensitive Topic.
If you selected either “Clearly references a sensitive topic” or “Maybe references a sensitive topic” for Q1 (“Does the query reference a sensitive topic?”), please select which primary Sensitive Topic the query
mentioned.
If you believe the query contains a type of harmful or sensitive content that’s not included in the Sensitive Topic list, select “Other type of harmful or offensive content.” You will be asked to describe the sensitive
content in the last question of the task.
If you selected “No, does not reference a sensitive topic” for Q1, please select “None, there is no sensitive topic present” from the drop-down list for Q2.
If you selected “Not Applicable, the query contains unintelligible text or non-English language.” for Q1, please select “Not Applicable, the query contains unintelligible text or non-English language.” from
the drop-down list for Q2.
Questions 3-4 are optional and should be answered only if the query references more than one Sensitive Topic. If there is only one Sensitive Topic, leave Q3 and Q4 blank.
Below are guidelines to help you evaluate each query for Sensitive Topics. The following list includes definitions for each Sensitive Topic and guidelines for whether the query falls under that Sensitive Topic.
Please note that there may be queries that appear to relate to a Sensitive Topic below, but do not actually meet the criteria outlined by the definition to be considered a sensitive topic. You should select “No, does
not reference a sensitive topic” if the query does not meet the criteria described below. For example, as you will see in the Serious Mental Health Concerns definition below, “wanting depressing stuff” is a query that
should not be considered as fitting the definition of that category despite being partially conceptually related. Closely review the different category definitions and examples to learn more about these queries.
4. Violence
Queries that mention, glorify, and/or incite physical, psychological/emotional, or sexual violence against a random individual or indiscriminate group of people or animals, fall into this category. Queries that mention
terrorism and violent extremism are also included. Note that queries with mild and generic (e.g., “fight club”) or fictional (e.g., “demon slayer”) references to violence are NOT included. Queries that mention weapons or
warfare are included ONLY if they also explicitly mention a violent action directed at people or animals.
Examples: “I wanna kill them all”, “loud songs for torturing my cat”, “ISIS recruitment songs”, “songs while i mow down the haters with my ar15”
Note: Queries such as “the duel”, “zombie shooting”, “guns and roses”, and “war games” should be labeled “No, does not reference a sensitive topic.”
5. Interpersonal Abuse
Queries that mention, describe, and/or glorify physical, psychological/emotional, or sexual abuse among people who know each other (e.g., romantic partners, classmates, family members) fall into this category.
Interpersonal Abuse has the specific goal of causing harm to an individual the user personally knows. This category includes behaviors such as bullying, hazing, coercion, sexual assault, intimidation, and stalking.
Examples: “song to gaslight my grandma”, “metal tracks while I beat up my girlfriend”, “bullying playlist”, “I want to punch the neighborhood kids”
19. Not Applicable, the query contains unintelligible text or non-English language.
Select this option if you selected “Not Applicable, the query contains unintelligible text or non-English language” for Q1.
Most queries will only have one Sensitive Topic present. There will be some exceptions, in which case you should select all relevant Sensitive Topics for one query by responding to Questions 3 and 4.
While queries may or may not reference sensitive topics, queries may also reference At-Risk Groups. At-Risk Groups are groups of people who are at greater risk for harm (physical or psychological) because of who they
are. The risk may be rooted in members’ mental/physical health, or in members’ marginalized identities / beliefs / experiences. While an exhaustive list of At-Risk Groups is impossible to provide, here is an incomplete list
of common At-Risk Groups — when you are considering whether an At-Risk Group is present in a query, use this list as a starting point:
children (ages 0-17)
elderly (age 65+)
pregnant people
single parents
people struggling with thoughts of suicide or self-harm (including eating disorders)
people struggling with addictions and other mental/psychological disorders
people with other chronic or long-term health conditions
racial and ethnic minorities
minority religious groups
sexual-orientation and gender minorities
immigrants
Examples of queries clearly referencing At-Risk Groups are: “Vietnamese pop”, “muslim prayers”, “kindergarten graduation music”, “first trimester support”, “songs about surviving cancer”, “meth head music”, “uplifting album for my suicidal brother”
If you selected “Clearly references a sensitive topic” or “Maybe references a sensitive topic” for Q1, you will determine which of the 6 user intent categories below best matches the intent of that particular query.
If you selected “No, does not reference a sensitive topic” for Q1, please select “There is no sensitive topic present.” for this question.
If you selected “Not Applicable, the query contains unintelligible text or non-English language.” for Q1, please select “Not Applicable, the query contains unintelligible text or non-English language.” for
this question.
The different types of intent can be challenging to distinguish, which is why it is critical that you closely review the definitions and examples below. This will help you understand the key terms that need to be present
for a query to fit a given intent category. Select “Ambiguous Intent” if the query references a sensitive topic, but the user intent does not clearly match the other intent category definitions. Note that “Ambiguous
Intent” may be a common selection for the sensitive topic queries you review.
1. Harmful Intent: Along with the Sensitive Topic, the query may include: (i) intentional language (e.g., “music for”, “songs to”, “music while”) and/or user-centric terms (e.g., “I”, “me”, “myself”) expressing the intent to commit or associate with a sensitive/harmful activity; (ii) negative personal value judgment (e.g., "worst", "stupid", "loser") directed towards others or types of music content; and/or (iii) derogatory language including slurs or stereotypes/"dogwhistles".
“music for shooting up a school”
“i hate blacks songs”
“songs while cutting myself”
“songs about faggots”
“want to get hauk tuah blow job”
“worst 90s rap songs”
“Adele’s dumbest lyrics”
“Music for treating and curing my clinical depression”
“dirty Mexican drug dealer songs”
2. Interest in Sensitive Topic: The query indicates interest in a sensitive topic, but not an intent to act. It should include language such as “about”, “relating to”, or “on” in reference to the sensitive topic.
Note: If the query includes glorification words like “songs promoting drug use” or “music celebrating rape” instead of the more neutral words like “about”, then it should be classified as Ambiguous Intent (see #4 below).
“songs on suicide and ending it”
“music about beating your partner”
“songs about hot anal penetration”
“songs relating to gunning down a hater”
“lyrics that describe cutting arms”
3. Opposing Sensitive Topic: The query references a sensitive topic, but there is also text indicating a user is actively trying to avoid it or positively coping with a harm.
“calming songs for my anxiety”
“rap no gang shootings”
“songs for overcoming physical abuse”
“sultry songs that don’t mention rape”
“death metal songs with no murder”
“supportive music for drug addiction recovery”
4. Ambiguous Intent: The query references a sensitive topic, but the user intent is ambiguous because it does not meet the definition of any of the other categories on this list. These queries often appear as only a statement of the Sensitive Topic that does not explicitly state the user themself is engaging in a negative action or derogatory behavior (e.g., slur or stereotype) or opposing the sensitive topic. For example, “bombing a church” or “music promoting church bombing” should be classified as “Ambiguous Intent” since we cannot determine the user’s intent from the query, but not “songs for bombing a church” (Harmful Intent) or “songs about bombing a church” (Interest in Sensitive Topic).
“Shoot up a school”
“cutting and self harm”
“cocaine and ecstasy binge”
“steal cars”
“abortion”
“oral sex orgy anthems”
“insert in her cunt”
“songs celebrating genocide”
5. Navigating to Specific Content: A query fits this category if it references a sensitive topic within the name of a specific piece of content (e.g., song, album, artist, or lyric) AND also mentions the type of content (e.g., “song”, “album”, “lyrics that go”). Example: “music to be murdered by album.”
Note: The query “music to be murdered by” does not fit this category and should be evaluated purely on the text in the query. This particular query would be classified as “Ambiguous Intent.” Even though you may
know that “music to be murdered by” is an album, please evaluate as though you do not possess any domain knowledge and choose this category ONLY if the query text mentions “song”, “album”, “lyrics”, etc.
“lyrics that go that little faggot with the earring” - lyrics from Dire Straits
“Cop Killer song” - song title by Body Count
“Bleed Like Me by Garbage” - song title by Garbage
“lick my pussy and my crack song” - lyrics from Khia
“that kill yourself song by Suicide boy”
6. Navigating to Related Content: A query fits this category if it references a sensitive topic within the name of a specific piece of content AND also indicates the user is searching for results that are similar to that
content. It should include language such as “like”, “similar to”, “that matches” in reference to the specific content name.
Note: Queries such as “pornographic album” or “mass murder songs” do not fit this category because they do not refer to a specific song or artist name.
“songs like fuck da police”
“artists similar to Suicide boys”
“Albums like Kill em all”
“lyrics with same vibe of wet ass pussy”
“other songs matching rape me”
7. There is no sensitive topic present: The query does not reference any Sensitive Topic. Select this option if you selected “No, does not reference a sensitive topic” for Q1.
“rock for going on a run”
“top hits from the 80’s”
“Eminem rap while I cook”
“pump up anthems for a party”
“albums that won the Grammys 2021”
8. Not Applicable, the query contains unintelligible text or non-English language: Select this option if you selected “Not Applicable, the query contains unintelligible text or non-English language” for Q1.