Outlier
Introduction
Having completed the onboarding courses, you're now prepared to take on assessment
tasks known as Benchmarks. These tasks will look different from what you
encountered during onboarding. While onboarding tasks were structured to guide you
through learning, Benchmark tasks are designed to assess your ability to apply those
skills independently.
The goal of this guide is to help you recognize the differences between onboarding and
Benchmark tasks and to remind you that the principles and skills you learned during
1. Write a Prompt to Test the Model: Start by crafting a prompt that aligns with specific
requirements to observe how the model responds, aiming to create a scenario where
one model might not perform as expected.
2. Rate Responses and Identify Issues: Evaluate both model responses, noting any
issues or areas where they fall short. This step helps you practice critical
assessment skills.
3. Side-by-Side Ranking: Rank the original and improved responses in a side-by-side
comparison, providing justifications for why one response is better. This encourages
another prompt that extends or deepens the conversation, practicing continuity and
context management.
Production/Project Tasks.
Languages: Preference Ranking and Rewrites - Next Steps
Key Differences Between Production/Project and Assessment Tasks
Assessment tasks will look different. Here are some of the differences you’ll encounter:
Assessment Tasks contain prefilled prompts and responses. Review these carefully to
understand how they relate to the task requirements.
Key Actions
• Read the prefilled prompt to ensure it aligns with the task's instructions.
• Review the prefilled response for accuracy, tone, and completeness.
generalized fields.
Key Actions
In this step, explain why you gave each rating. This is your opportunity to clarify your
thought process and ensure the reviewer understands your assessment.
Key Actions
• Describe the reasons behind each rating.
• Point out any issues or positive aspects that influenced your decision.
You now need to rank the two responses based on specific characteristics: Instruction
Following, Writing Quality, and Truthfulness. Use a preference ranking score and follow
these guidelines: if the difference between the responses is mainly subjective or varies
based on opinion, choose one of the middle three preference scores, as these reflect a
Key Actions
subjective or minor.
process and why you believe the selected response is better than the other. Be detailed
and concrete in your reasoning.
Key Actions
• Provide specific examples from each response to support your ranking.
• Write your justification in English, ensuring clarity and coherence in your explanation.
Key Actions
• Make necessary adjustments to the response, focusing on any areas that didn’t meet
• Confirm that the rewrite aligns closely with the task’s objectives and expectations.