BETTER SEARCH
           ENGINE TESTING




     STPCON 2011 | EPUGH@O19S.COM | @DEP4B

                                             1




     WHY AM I QUALIFIED TO BE
           UP HERE?
•   President of OpenSource
    Connections

• Contributor  to CruiseControl
    and Continuum CI projects

• Member    of Apache Software
    Foundation

• Presenter at conferences
    (OSCON, ApacheCON, jTDS,
    ExpoQA, STPcon 2009!)

                                             2
AUTHOR




         3




WRITER




         4
FATHER




           5




AGILISTA




           6
AGENDA
 Why is Search Becoming More
          Important?


   What is a Search Engine?

    Techniques for Testing

          Wrap Up



                               7




WHY IS SEARCH
BECOMING MORE
  IMPORTANT?


                               8
INFORMATION IS EXPLODING

   “information workers ... are each bombarded with 1.6
  gigabytes of information on average every day through
    emails, reports, blogs, text messages, calls and more”.

  • http://online.wsj.com/article/SB124252211780027326.html




                                                               9




               UNSTRUCTURED



• emails, spreadsheets, documents, presentations, images,
 databases

  • 75%   unstructured to 25% structured




                                                              10
MANAGING DATA IS
              EXPENSIVE


• 1 GB costs $0.20 to store

• 1 GB costs $3,500 to manage




                                                               11




 WHAT DOES $3,500 BUY YOU?


• 69% of respondents felt 50% or less of data could be found
 online

• Knowledge  workers spend 25% of their time engaged in
 search-related activities.




                                                               12
WHY NOT JUST USE GOOGLE


 • We don’t want 44 million results, we want 1

 • We want “the” answer, not “an” answer

 • We tolerate inefficiency in Internet search

 • We are happy to “satisfice”

As John Allen Paulos puts it: “The Internet is
the world's largest library. It's just that all
the books are on the floor.”
                                                     13




        WHAT IS A SEARCH
           ENGINE?




                                                     14




THREE STAGES OF SEARCH




                         20
CONTENT INDEXING
• Creating an index by crawling the content directories,
 databases, and other repositories using an automated process
 (either pushing or pulling changes)

• Create an index, which is a searchable key to a collection.

• In Enterprise Search, the indexing mechanism should be able
 to access company private data (with access privileges
 maintained)

• Control the indexing schedule: index rapidly changing
 content quickly, other content more slowly.

• rather   than having the bot look for the data.
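
To make "a searchable key to a collection" concrete, here is a minimal sketch of an inverted index over a toy in-memory collection. The documents and the names `build_index` and `search` are illustrative assumptions, not part of any particular search engine.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents that contain every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical three-document collection.
docs = {
    1: "lamp shades for table lamps",
    2: "floor lamp with shade",
    3: "window shades and blinds",
}
index = build_index(docs)
print(search(index, "lamp shades"))
```

Real engines add analysis steps (stemming, stop words, scoring) on top of this, which is exactly where many testing surprises come from: here "shade" and "shades" are different terms.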
                                                                      21




            CONTENT INDEXING

• Indexing    may also support

  • Metadata     extraction

  • Auto-summarization, which analyzes the collection and
       groups its content into categories or clusters.

• Metadata in turn becomes facets that can be used to tune the
 query to put emphasis on that category.


                                                                      22
CONTENT INDEXING
                   23




   QUERYING




                   24
FORMATTING
                                            25




   FACETING
Faceted or "guided navigation"
leverages metadata fields and
values to provide users with visible
options for narrowing or refining
their query.
- Peter Morville, Search Patterns
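
A small sketch of what faceting does with metadata, assuming results are plain dictionaries; the field names (`type`, `stage`) and the helper names are made up for illustration.

```python
from collections import Counter

def facet_counts(results, field):
    """Count how many results carry each value of a metadata field."""
    return Counter(doc[field] for doc in results if field in doc)

def refine(results, field, value):
    """Narrow the result list to documents whose field equals the chosen value."""
    return [doc for doc in results if doc.get(field) == value]

# Hypothetical result set for a query such as "stroller".
results = [
    {"title": "Choosing a stroller", "type": "article", "stage": "newborn"},
    {"title": "Stroller recall notice", "type": "news", "stage": "newborn"},
    {"title": "Toddler stroller tips", "type": "article", "stage": "toddler"},
]

print(facet_counts(results, "type"))
print([doc["title"] for doc in refine(results, "stage", "newborn")])
```

A useful test here is that the count shown next to a facet value matches the number of results the user actually gets after clicking it.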




                                       26
Search Stack



          User Interface

          Search Engine

               Data

                           27




  HOW DO WE TEST?

                           28
HOW DO WE TEST?


• Querying

• Formatting

• Content   Indexing

• Performance



                               29




         WHO SHOULD TEST?




                               30
CHALLENGES
• Competing   business stakeholders:

 • Tester: When I search for “lamp shades”, I used to see these
   documents, now I see a different set.

 • Business  Owner: How do I know that the new search
   engine is better?

 • User: My   pet feature “search within these results” works
   differently.

 • Marketing Guy: I want to control the results so the current
   marketing push for toilet paper brand X always shows up at
   the top.
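
The tester's complaint ("I used to see these documents, now I see a different set") can be turned into a number. This is a sketch under the assumption that each engine version returns an ordered list of document ids; `compare_results` and the ids are hypothetical.

```python
def compare_results(old_ids, new_ids):
    """Report what changed between two result lists for the same query."""
    old_set, new_set = set(old_ids), set(new_ids)
    union = old_set | new_set
    return {
        # Documents the old engine returned that the new one no longer does.
        "dropped": [d for d in old_ids if d not in new_set],
        # Documents that only the new engine returns.
        "added": [d for d in new_ids if d not in old_set],
        # Jaccard overlap: shared documents divided by all documents seen.
        "overlap": len(old_set & new_set) / len(union) if union else 1.0,
    }

# Hypothetical top results for "lamp shades" from engine v1 and v2.
v1 = ["doc1", "doc2", "doc3", "doc4"]
v2 = ["doc2", "doc1", "doc5", "doc6"]
report = compare_results(v1, v2)
print(report)
```

A low overlap is not automatically a bug; it is a prompt for the stakeholders to look at the dropped and added documents and decide which set is better.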
                                                                  31




                  CHALLENGES


 • Stakeholders want a better search implementation, but
   perversely often want it all to work “the exact same way”.
   Getting all the stakeholders to agree on the project
   vision and on the metrics is a challenge.




                                                                  32
PERFECT SEARCH TESTER
         WOULD BE ALL OF
• Mathematician                    • Business Analyst

• Librarian                        • Systems   Engineer

• UX   Expert                      • Geographer!

• Writer                           • Psychologist

• Programmer




                                                                   33




       KNOWLEDGE TRANSFER

• If you don’t have the perfect team already, bring in experts and
  do domain knowledge transfer.

• Learn the vocabulary of search to better communicate
  together

  • “auto   complete” vs “auto suggest”

• Do “Search    for Content Team” brownbag sessions!


                                                                   34
QUERY TESTING



• Often    called “relevancy testing”




                                        35




              TWO SCHOOLS OF
                THOUGHT


• “One True Answer”

• “I   know it when I see it”




                                        36
“ONE TRUE ANSWER”

• Absolute Truth   / Matrix / Grid / TREC / Relevancy Assertions

 • The    correct answers for each search are known ahead of
   time

 • Human judges often decide these correct answers, stored
   as Relevancy Assertions

 • Can be labor intensive to set up

• A “Numerical Grade” is produced for comparison
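
One way to picture relevancy assertions and the resulting numerical grade is sketched below. The grading rule (fraction of expected documents found in the top N, averaged over queries) is an assumption chosen for simplicity, and `grade_engine`, the queries, and the canned results are all hypothetical.

```python
def grade_engine(assertions, run_search, top_n=10):
    """Average, over all queries, of the fraction of expected docs in the top_n results."""
    scores = []
    for query, expected in assertions.items():
        returned = set(run_search(query)[:top_n])
        scores.append(len(expected & returned) / len(expected))
    return sum(scores) / len(scores)

# Relevancy assertions: for each query, the documents judges agreed must come back.
assertions = {
    "lamp shades": {"d1", "d2"},
    "floor lamp": {"d3"},
}

# Stand-in for a real engine call; returns canned ranked results per query.
canned = {
    "lamp shades": ["d1", "d9", "d2"],
    "floor lamp": ["d7", "d8"],
}

def fake_engine(query):
    return canned.get(query, [])

grade = grade_engine(assertions, fake_engine)
print(grade)
```

Running the same assertions against engine v1 and v2 gives two grades that can be compared release over release.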

                                                                   37




           PROBLEMS WITH THIS
               APPROACH
• Open  to gaming. TREC competition is swamped by
 “academic” search engine efforts that don’t work in the real
 world.

• Need a well understood data set with generally accepted
 answers.

• Is it better to have an engine that gives modestly relevant
 results almost all the time, or an engine that gives really
 good answers sometimes, better on average than the other
 engine, but sometimes gives back complete garbage?
                                                                   38
A/B TESTING
                                              Engine version 1 and
                                              version 2!

• Tracks   explicit or implicit preferences between engines A/B

• Often    dispenses with the notion of the "correct" answer

• Can be easier to set up, but some fear the best answers will be
 missed by both engines
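
A sketch of tallying explicit A/B preferences, assuming each tester records "A", "B", or "tie" per query after seeing both result lists side by side. The vote labels and `tally_preferences` are illustrative, and no statistical significance test is attempted here.

```python
from collections import Counter

def tally_preferences(votes):
    """Return (wins for A, wins for B, ties, B's share of decided votes)."""
    counts = Counter(votes)
    decided = counts["A"] + counts["B"]
    share_b = counts["B"] / decided if decided else None
    return counts["A"], counts["B"], counts["tie"], share_b

# Hypothetical judgments: engine A is version 1, engine B is version 2.
votes = ["A", "B", "B", "tie", "B", "A", "B"]
wins_a, wins_b, ties, share_b = tally_preferences(votes)
print(wins_a, wins_b, ties, share_b)
```

Note that nothing here says whether either engine returned the "correct" answer, which is exactly the trade-off the slide describes.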



                                                                     39




                       RELEVANCY



• Do   we have any defined relevancy metrics?

• Relevancy   is like porn.....




                                                                     40
I KNOW IT WHEN I SEE IT!




            http://en.wikipedia.org/wiki/Les_Amants

                                                                  41




   BEYOND PRECISION AND
  RECALL: HOW ENGINES ARE
• Binary   vs. Non-Binary Grading Systems

  • Early TREC had binary judgements, only Yes/No on whether
   each doc was related to a test search

  • More choices were later added

  • A system can use letter grades (A, B, C, D and F) or numeric
   grades

  • Another style asks testers to sort documents in their
   preferred order
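
Letter grades can be turned into a single score per query. The sketch below uses discounted cumulative gain (DCG), a common rank-aware formulation that is my choice for illustration and is not named in the slides; the grade-to-gain mapping is likewise an assumption.

```python
import math

# Assumed mapping from letter grade to numeric gain.
GAIN = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

def dcg(grades):
    """Sum of gains, each discounted by log2 of (position + 1)."""
    return sum(GAIN[g] / math.log2(pos + 1) for pos, g in enumerate(grades, start=1))

def ndcg(grades):
    """DCG divided by the DCG of the same grades in the best possible order."""
    ideal = dcg(sorted(grades, key=GAIN.get, reverse=True))
    return dcg(grades) / ideal if ideal > 0 else 0.0

# Grades a tester gave to the top three results of two engines.
print(ndcg(["A", "B", "F"]))
print(ndcg(["F", "B", "A"]))
```

The same three documents score lower when the best one is buried, which a binary Yes/No judgment cannot express.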
                                                                  42
CLASSIC MEASUREMENTS OF
    SEARCH RELEVANCY


• Recall: "Did I find all the documents I expected to get back?
 What percent?"

• Precision: "Did the system bring back other documents that
 weren't relevant? What percent were on target?"
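
The two definitions above, written out as a sketch. It assumes the set of relevant documents for the query is known ahead of time; the function name and document ids are illustrative.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for one query."""
    returned_set, relevant_set = set(returned), set(relevant)
    hits = returned_set & relevant_set
    # Precision: what share of the returned documents were on target?
    precision = len(hits) / len(returned_set) if returned_set else 0.0
    # Recall: what share of the expected documents did we get back?
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall

returned = ["d1", "d2", "d3", "d4"]   # what the engine brought back
relevant = {"d1", "d3", "d7"}         # what the judges expected
precision, recall = precision_recall(returned, relevant)
print(precision, recall)
```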




                                                              43




                    NEWER IDEAS


• Rank: The   order of documents that were returned

  • Generally a 1 in 20 match in the #1 spot is better than a
   50% rate where all matches are on the second page.
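
The comparison above can be checked with a rank-aware measure. This sketch assumes 10 results per page and uses reciprocal rank (1 divided by the position of the first relevant hit), which is my choice of metric for illustration rather than one named in the slides.

```python
def precision(flags):
    """Share of returned results that are relevant."""
    return sum(flags) / len(flags)

def reciprocal_rank(flags):
    """1 / position of the first relevant result, or 0.0 if there is none."""
    for pos, hit in enumerate(flags, start=1):
        if hit:
            return 1 / pos
    return 0.0

# 20 results each; True marks a relevant document.
top_hit = [True] + [False] * 19            # 1 in 20, but in the #1 spot
second_page = [False] * 10 + [True] * 10   # 50% relevant, all on page two

print(precision(top_hit), reciprocal_rank(top_hit))
print(precision(second_page), reciprocal_rank(second_page))
```

Precision alone favors the second list; the rank-aware score favors the first, matching what a user who never clicks to page two experiences.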




                                                              44
INTERACTIVITY: WHAT
      NAVIGATORS OR
 VISUALIZATION WERE GIVEN
• Facets   and sorting: Clickable filters and sort options

• Unsupervised     Clustering: Related terms or phrases, or related
 searches

• Spelling   and thesaurus suggestions



                                                                      45




   SUBJECT DISAMBIGUATION,
   SENTIMENT, CONFLICTING
    INFORMATION, CROWD
            HINTS
• kidney   bean or kidney cell

• "best   football team in the UK"




                                                                      46
47




SOURCES OF VARIANCE, AKA
      "PROBLEMS"
                 Note: this is talking
                 about comparing search
                 engine A to search engine
                 B. But I am thinking
                 more in the context of
                 search engine v1 to v2!




                                             48
DIFFERENT GOALS



• Perfect/Human   vs. Best vs. Acceptable vs. Better than X

• Constrained   vs. Unconstrained Resources (time, cpu, storage)




                                                                   49




                     SAMPLE SIZE


• Amount   of Data

 • Fixed   set or growing over time

• Number    of Testers (AB or Relevancy Judgments)

• Number    of Searches



                                                                   50
VERTICAL VS. HORIZONTAL
          CONTENT


• One extreme: Specific demo may cover just one discipline, for
 example Medical Journals

• Other   extreme: Internet covers vastly disparate domains




                                                                51




                             USERS
• Experienced     vs. New Searcher

• Subject   Expert vs. Novice

• Spelling, typing   and computer proficiency

• Interface Medium (large visual display, small text display,
 audible, Braille, etc)

• Amount      of Effort to understand Search

• Willingness   to Iterate

• Searching    for specific answer vs. General Exploration
                                                                52
TYPE OF SEARCHES

• Length    / 1 or 2 words

• Full   question

• Sample     text

• Internet   Boolean

• Advanced     Boolean / Syntax / Proximity

  • Wildcard, Regex, etc.

                                              53




                    PUNCTUATION


• Chemical

• Source    Code

• Units   of Measure

• Literal   vs. Search Operator



                                              54
NOT EVEN GETTING INTO
    MULTI LINGUAL SEARCH



• How   do I test in languages I don’t understand?




                                                     55




          GROK YOUR RESULTS
                                                     56
FORMATTING TESTING



• Directly   builds on most of our existing test skills.




                                                           57




     PERSONAS & SCENARIOS




                                                           58
Persona 1: Going to be a mom (Needy)

“Oh my God I’m actually Pregnant!”

“What’s next? What am I supposed to do? Guidance please!”

To interact with people going through the same thing.

• Scenarios that typify - planned to get pregnant, but hasn't done any research

• Catch phrases - Nervous but excited, giddy, Where do I start?

• Tag lines - Wants to share, has a million questions

• Likely to say - Guide me, help me get off to a good start

Narrative

Self Introduction
Hi all, I'm very new to this but i couldn't help but share my
excitement. I have just found out today that i am pregnant. It
wasn't planned, me and my partner of a year and a half were
going to wait until we had our own place and were married
first but it looks like we have done it the other way round.

My only concern is that i don't really know how my boyfriend
feels about it. I know we need to discuss the options but i
have really already made up my mind about what i want to do.
There is so much to consider, money, a decent place to live,
being ready but i know i am ready and have been for a long
time ( I get extremely broody when i see my friends kids)

Should i just tell him how i feel or go with how he feels
because i don't want to lose him. He is a loving partner who
would stand by me through anything i just don’t want him to
feel like i am tying him down!!! I suppose i am feeling very
happy but also very confused at the same time!!!

http://forum.sofeminine.co.uk/forum/maternite1/__f468_maternite1-Oh-my-god-i-m-pregnant.html




                                                                                                                                                    59




Persona 2: New Mom (Demanding)

“Are my kids sick or is this condition normal? How do I…?”

“How do I ensure my baby is latching on correctly?”

“What type of stroller should I buy? What brand of car seat is best?”

Scenarios that Typify

• Likely to say - Are my kids sick or is this condition normal?

• Describes herself - wants to be a good mother, looking for expert
  advice, wants to get ideas from other moms

• Narrative - could be working mom, could be stay at home mom

• Questions likely to ask - sometimes wants to ask questions/get expert
  advice

Photo caption: This picture captures my life perfectly: an adult beverage
sitting on a book about underpants.

Narrative
I have been hearing about women who claim that thier 2, 2 1/2 or 3 year
old is not ready for the potty. They claim its a nightmare and are waiting
for their children to come around.

Maybe I grew up in the twilight zone, but I had always assumed that
potty training was something that is just done. Its done when:
   a) The child in question can sleep through the night and stay dry.
   b) The child in question can speak to you, in full sentences. like,
   "apple juice, please" or "wanna go to the park" or "momma I wanna
   hold you..."
   c) The child in question knows they are soiled and can ask to be
   changed.

Barring any of those things, a child is ready to be placed on the potty.
using the potty was never negotiable in my family. When we hit the
above milestones my mother trained us. We just did it. If we complained
she never put diapers on us, she just kept directing us back to the potty.

Her methods of redirection may be controversial (she told my brother
that unless he was a big boy he would not get a happy meal. Boys who
pooed on themselves got sad meals... lol!!! He straightened up and
started using the potty at 2 1/2) but she was never abusive or anything
she just DIDNT ASK US. it was time to potty and that was it.

The reasoning was that I used to drink from a bottle, and sleep with my
mother, and such, now I don't. I also used to crap my pants, and that is
no longer allowed after a certain point.

My question is this: why ask children if they are ready to use the potty,
after they are clearly ready to use it (with language tools and bladder
control)? Why is it treated like something that is negotiable or that the
child has a choice of either coming around to it or not? I understand that
children are sensitive and you have to follow their lead, at times. But
allowing them to shit
                                                                                                                                                    60
Scenario 1: Find old answer

“I know I went through this before with my first child,
but cannot recall the answer”

Preamble
Experienced mom has a déjà vu moment about a
previous problematic experience with her first child. She
has a partial recollection of a piece of information
related to the answer she seeks but she needs help in
pulling

Success Factors
• Speed of Comprehension
• Directness to destination
• Reduced:
   • Number of queries
   • Number of results
• Indirect Knowledge Transfer

Thinking aloud in the Family Room

Panel 1
Mother: “Hmm, I know I had the same issue with Josh, but what the
heck did I do?”
Baby: wwwaaaaaaaa wwwwaaaaaa . . . ggg wwwaaaaaaaa . . .
After hours of frustration mother home alone has a
partial epiphany as to her child’s problem.

Panel 2
Mother: “Josh had not started to cry non-stop for 3 hours when
it finally dawned on me that he had not had a movement for
3 days . . . Let’s try querying that . . . “no poop” . . . Not likely . . .
Uumm . . . “constipation”? Oh, might help to specify who as
well . . . “baby” . . .”
Mother starts to type in query but suggest-as-you-
type search box hints to her to be more specific.

Panel 3
Mother: “Very nice – lists out related concerns for constipation.
Let’s see: ‘symptoms’, ‘cures’, ‘when to call the doctor’, ‘what
other moms are saying’, ‘topic overview’. Ok – I’ll take ‘cures’,
Alex, for 300 points and my personal sanity! Water . . . fruit
juice . . . high-fiber baby foods - Ahhh prune juice . . . prune
juice! Now why didn’t I remember that!”
Structured results quickly tip off the mother to the
assorted aspects of constipation. She focuses in on
one of the aspects and has total recollection of her
previous experience.




                                                                                                                                                                                     61




Scenario 2: Urgent Question

“It’s 2am and I don’t know who to ask?”

Preamble
Mother of twins finds herself panicked in the early
morning hours with a new situation.

Success Factors
• Speed of Comprehension
• Directness to answer

Crying in the Kitchen

Panel 1
Babies: wwwaaaaaaaa wwwwaaaaaa . . . ggg wwwaaaaaaaaa . . .
Mother: “Crap! Who am I supposed to ask at this hour! Why is it
nobody is open when I need them?!”
In the middle of the night, a mother of twins finds
herself alone, overwhelmed, and in dire need of an
answer.

Panel 2
Mother: “I don’t have to read hundreds of pages on the
internet . . . I just need a quick concise answer . . . at what
temperature do I need to be worried . . ?! Please [BabyCenter]
show me the answer . . !”
Mother starts to type in a query but notices the
suggest-as-you-type search box lets her narrow her
question, boosting her confidence she is going to get
the answer she needs.

Panel 3
Mother: “‘102’ . . . thank god! We’re safe. Ahhh . . . that’s helpful -
other conditions to know about . . . That’s thorough: ‘What will
the doctor do?’ Interesting: ‘If fever is a defense against
infection, is it really a good idea to try to bring it down?’ Let me
bookmark this for later.”
The mother zooms in on the specific answer she
seeks. But then she notices collateral knowledge
she takes note of for later reading.




                                                                                                                                                                                     62
CONTENT INDEXING
               TESTING


• Leverages our normal testing skills. And typically what it really
 means is “Performance Testing”.

 • Lots of “integration” testing.




                                                                      63




     PERFORMANCE TESTING




                                                                      64
LEVELS OF SCALING
• Scale    High

  • You quickly hit a point of diminishing returns!

• Scale Wide

  • The    safety valve for lots of load.

• Scale    Deep

  • Scaling Deep? You are doing some crazy stuff with huge
    indexes!!
                                                               65




            SCALE WIDE (SLAVES)

• Too   many inbound queries!

• Slaves poll master for
 changes

• index and config files
 transferred

• ALL   JAVA!

                                                               66
SCALE WIDE (SHARDING)
• Too     large of an index to query

• Split index over multiple Search
 servers

  • A -> M: Server 1, N -> Z: Server 2

  • uniqueId.hash    % numServers

• Relevancy     typically balanced shards

• Request split across shards, results
 aggregated to single response
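
A sketch of the two sharding mechanics on this slide: choosing a shard with `uniqueId.hash % numServers`, and aggregating per-shard results into one response. CRC32 is used as the hash only because Python's built-in `hash()` for strings varies between processes; the function names, scores, and document ids are assumptions for illustration, not Solr's actual implementation.

```python
import heapq
import zlib

def shard_for(unique_id, num_servers):
    """Stable shard choice: hash of the unique id modulo the number of servers."""
    return zlib.crc32(unique_id.encode("utf-8")) % num_servers

def merge_results(per_shard_results, limit=10):
    """Merge per-shard (score, doc_id) lists, each sorted best first, into one list."""
    merged = heapq.merge(*per_shard_results, key=lambda hit: hit[0], reverse=True)
    return list(merged)[:limit]

# Indexing side: every document lands on exactly one of two servers.
print(shard_for("doc-42", 2))

# Query side: each shard answers with its own ranked hits.
shard1 = [(0.9, "a"), (0.4, "c")]
shard2 = [(0.7, "b"), (0.1, "d")]
print(merge_results([shard1, shard2]))
```

For testers, the interesting cases are the ones this sketch ignores: scores computed on different shards may not be directly comparable, and a slow or missing shard changes the aggregated response.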
                                             67




                       SCALE DEEP


• Combine  both scaling wide
 to handle number of queries
 with sharding to handle size
 of indexes!




                                             68
WRAP UP




                                                                                69




Methodology

User Interface | Search Engine | Data

Concurrent Streams of Work

• Operationalize Solr
  Iteration 2 Story: Deploy Solr into BabyCenter Test Environment

• Search Analysis
  Iteration 2 Story: Integrate Solr into Community UI, A/B Testing

• Search Experience
  Iteration 2 Story: Conceptual Model (Personas, etc) & Mockups




 OSC APPROACH TO SEARCH
                                                                                70
OSC APPROACH TO SEARCH
                                                                     71




                     RESOURCES

• http://www.scribd.com/doc/17563004/Why-You-Cant-Just-Google-for-Enterprise-Knowledge

• http://www.searchtools.com/info/user-interface.html

• http://www.alistapart.com/articles/testing-search-for-relevancy-and-precision/



                                                                     72
SEARCHPATTERNS.ORG
                          73




                 THANK YOU!



• twitter:   dep4b

• speakerrate: http://www.speakerrate.com/epugh/

• email:   epugh@opensourceconnections.com

                          74

