lab lookups subsearches
Overview
Welcome to the Splunk Education lab environment. These lab exercises will test your knowledge of lookup
commands and subsearches.
Scenario
You will use data from the international video game company, Buttercup Games. A list of source types is
provided below.
NOTE: This is a lab environment driven by data generators with obvious limitations. This is not a
production environment. Screenshots approximate what you should see, not the exact output.
Index: network
Description: Email security data
Source type: cisco_esa
Fields: dcid, icid, mailfrom, mailto, mid
© 2023 Splunk Inc. All rights reserved. Leveraging Lookups and Subsearches 20 September 2023 1
Common Commands and Functions
These commands and statistical functions are commonly used in searches but may not have been explicitly
discussed in the course. Please use this table for quick reference. Click on the hyperlinked SPL (Search
Processing Language) to be taken to the Search Manual for that command or function.
SPL: sort
Type: command
Description: Sorts results in descending or ascending order by a specified field. Can limit results to a specific number.
Example: Sort the first 100 src_ip values in descending order
| sort 100 -src_ip

SPL: where
Type: command
Description: Filters search results using eval-expressions.
Example: Return events with a count value greater than 30
| where count > 30

SPL: rename
Type: command
Description: Renames one or more fields.
Example: Rename SESSIONID to 'The session ID'
| rename SESSIONID as "The session ID"

SPL: count or count()
Type: statistical function
Description: Returns the number of occurrences of all events or a specific field. Can be used with stats, timechart, and chart commands.
Example: Count all events as "events" and count all events that contain a value for action as "action"
| stats count as events, count(action) as action
Refer to the Search Reference Manual for a full list of commands and functions.
Lab Exercise 1 – Using Lookup Commands
Description
Configure the lab environment user account. Then, use inputlookup, lookup, and outputlookup commands
to call on and create lookups in search.
Steps
Log into Splunk and change the account name and time zone.
Set up your lab environment to fit your time zone. This allows the instructor
to track your progress and assist you if necessary.
Log into your Splunk lab environment using the username and
password provided to you.
You may see a pop-up window welcoming you to the lab environment.
You can click Continue to Tour but this is not required. Click Skip to
dismiss the window.
Click on the username you logged in with (at the top of the screen) and
then choose Account Settings from the drop-down menu.
In the Full name box, enter your first and last name.
Click Save.
Reload your browser to reflect the recent changes to the interface.
After you complete step 6, you will see your name in the web interface. (This area of the web interface will be referred to as user name.)
NOTE: Sometimes there is a delay in executing an action like saving in the UI or returning results of a search.
Please allow the UI a few minutes to execute your action.
Scenario: You provided your knowledge manager with a CSV containing HTTP statuses, status
descriptions, and status types. Your knowledge manager just informed you that the lookup
was uploaded.
Your lab environment is configured to take you to the Search & Reporting app within Splunk. (Also called
the “search” app.) Confirm you are in the correct app by clicking Apps in the top left corner. You should
see Search & Reporting highlighted. If you do not, click on Search & Reporting.
Your knowledge manager provided you with the following information about the
status_definitions.csv lookup. Use the inputlookup command to verify that the file-based lookup
has been correctly uploaded.
filename: status_definitions.csv
definition name: status_definitions_lookup
lookup type: file-based
Your recently saved L1S1 report will be visible in the Reports tab.
The following search needs to find events from the online sales data that do not have a status of 200.
(This represents unsuccessful events and is written as status!=200.) However, the search is missing the
lookup command.
a. Run the search above over the Last 24 hours. You should receive an error or No results found.
b. Use the lookup command to add the status_description and status_type fields from the
status_definitions_lookup. Then, run the search again.
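Conceptually, the lookup command behaves like a left join: each event keeps its own fields and gains the lookup's output fields when its status value matches a row. The Python sketch below models only that behavior. It is not Splunk code, and the sample rows and descriptions are made-up placeholders rather than the contents of status_definitions.csv.

```python
# Illustrative model of what `lookup` does to events (not Splunk code).
# The rows below are made-up placeholders, not the real lookup contents.
STATUS_ROWS = [
    {"status": "404", "status_description": "Not Found", "status_type": "Client Error"},
    {"status": "503", "status_description": "Service Unavailable", "status_type": "Server Error"},
]

def lookup(events, rows, match_field, output_fields):
    """Add output_fields to each event whose match_field value has a lookup row."""
    index = {row[match_field]: row for row in rows}
    enriched = []
    for event in events:
        new_event = dict(event)  # leave the original event untouched
        row = index.get(event.get(match_field))
        if row is not None:
            for field in output_fields:
                new_event[field] = row[field]
        enriched.append(new_event)
    return enriched

events = [
    {"clientip": "10.0.0.1", "status": "404"},
    {"clientip": "10.0.0.2", "status": "418"},  # no matching row in the sample lookup
]
result = lookup(events, STATUS_ROWS, "status", ["status_description", "status_type"])
```

In this model the first event gains both output fields, while the second passes through unchanged because its status has no matching row.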
Save your search as a report with the name L1S2.
Scenario: SecOps wants a report of known users who have been browsing "Uncategorized URLs" over
the last 24 hours.
This task uses information from the status_definitions.csv lookup and the knownusers.csv lookup.
Your knowledge manager has provided you with the following information about the knownusers.csv
lookup. Use the inputlookup command to explore the knownusers.csv file-based lookup.
filename: knownusers.csv
definition name: none
lookup type: file-based
HINT: You may find it helpful to have both lookups available to reference for this task. Right-click on
Search in the application bar (next to Datasets, Reports, etc.) and click "Open Link in New
Tab." Run an inputlookup search on status_definitions.csv. Repeat these steps
for knownusers.csv.
a. The first lookup should use knownusers.csv to retrieve user values for all matching username
values in the events. (Hint: You will need to use user as username for your lookup. This tells
Splunk to match the values of user from the lookup against the values of username from the
event data.)
b. The second lookup should use status_definitions.csv to retrieve status_description
values for all matching status values in the events.
c. Run the search over the Last 24 hours.
Scenario: Sales would like a map of retail sales in Canada by province over the previous week.
Use the geospatial lookup file, canada.kml, to return a choropleth map of Canadian retail sales
by province during the previous week.
The knowledge manager has uploaded and defined the canada.kml geospatial lookup and provided you
with the following information. Use this info to create a search that will display the contents of the lookup.
You should see the geospatial lookup output displayed as a table with the following fields: count,
featureCollection, featureId, and geom.
filename: canada.kml
definition name: canada_prov
lookup type: geospatial
Open a second search browser window. The following search calculates total sales from Canada in
Canadian dollars. Complete the <missing> portions of the geom command so that the results of this
search are correlated with the canada.kml lookup. (Hint: The geom command must use the geospatial
lookup definition name and the featureIdField should be a field with values that are present in the
events and in the lookup.)
Output of the first 4 lines of this search.
Run the search over the Previous week and confirm that your output looks like the table below.
Click the Visualization tab and change the visualization to Choropleth Map.
Under the Format tab:
a. Set Latitude to 53.
b. Set Longitude to -92.
c. Set Zoom to 4.
d. Set Color Mode to Sequential.
e. Set Maximum Color to 006D9C.
Save your search as a report with the name L1S4.
Scenario: TechOps wants to be able to search for web server errors coming from
www.buttercupgames.com that are associated with unsuccessful purchases.
Troubleshoot this search and then output results to a lookup with the outputlookup
command.
This search is not returning the desired results. Troubleshoot the lookup command expression.
Output the results of this search to a lookup called BCG_web_server_errors.csv. Make sure the lookup
is created in the same app in which the search is run. Run the search over the Last 24 hours.
Confirm that the BCG_web_server_errors.csv lookup has been created by clicking on Job.
Save your search as a report with the name L1S5.
Scenario: SecOps is finding an increase in penetration attempts. Find unknown users with more than 3
failed logins within the last 24 hours.
Complete the <missing> portion of the lookup expression in this search. The search should exclude
known users from the final results. Keep a few things in mind:
a. Both the linux_secure data and the knownusers.csv lookup file use the same field name for
user. Therefore, the user field from the linux_secure data has been renamed to
user_from_events before using the lookup and search commands.
b. The search command filters search results. The <missing> portion of the search expression is a
field name.
c. The remainder of the search performs statistical aggregations on the results and further
manipulates the data to achieve the scenario goal.
d. The search should be run over the Last 24 hours.
index=security sourcetype=linux_secure fail*
| rename user as user_from_events
| lookup <missing>
| search NOT <missing>=*
| stats count by user_from_events, src_ip
| stats values(src_ip) as Attacker_IP, sum(count) as Failed_Attempts by user_from_events
| rename user_from_events as Attacker
| search Failed_Attempts > 3
| sort -Failed_Attempts
Example of final output.
Save your search as a report with the name L1X.
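The exclusion step in this exercise relies on one property of lookups: events with no matching lookup row never receive the lookup's output field, so a filter of the form NOT <field>=* keeps exactly those events. A rough Python model of that idea follows. The field name known_flag and the sample users are hypothetical, chosen so the sketch does not stand in for the <missing> values.

```python
# Model of "lookup, then keep only events the lookup did NOT match" (not Splunk code).
KNOWN_USERS = {"alice", "bob"}  # hypothetical stand-in for a known-users lookup

def mark_known(events, event_field, output_field="known_flag"):
    """Copy each event, adding output_field only when the user is in the lookup."""
    marked = []
    for event in events:
        new_event = dict(event)
        if event.get(event_field) in KNOWN_USERS:
            new_event[output_field] = "yes"
        marked.append(new_event)
    return marked

def without_field(events, field):
    """Model of `search NOT field=*`: keep events where the field is absent."""
    return [event for event in events if field not in event]

events = [
    {"user_from_events": "alice", "src_ip": "10.1.1.1"},
    {"user_from_events": "mallory", "src_ip": "10.9.9.9"},
]
unknown = without_field(mark_known(events, "user_from_events"), "known_flag")
```

Only the event for the unlisted user survives the filter, which is the set the later stats commands then aggregate.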
Lab Exercise 2 – Adding a Subsearch
Description
Create subsearches to manipulate search input.
Steps
Scenario: Marketing and Sales would like to know how many times multiplayer games were "viewed"
on the website during the "Multiplayer Madness" event this past Saturday.
Your knowledge manager provided you with the following information. These lookups contain information
about the products sold by Buttercup Games. This task requires events from the following games: SIM
Cubicle, Dream Crusher, Mediocre Kingdoms, Puppies vs. Zombies, Manganiello Bros., Final Sequel,
Benign Space Debris, and Curling 2014. Use the inputlookup command to find the correct lookup and
verify its contents.
filename: products.csv
definition name: product_lookup
description: code, category ID, price, product ID, and sale price of all products
lookup type: file-based
filename: sp_products.csv
definition name: none
description: list of single player games
lookup type: file-based
filename: mp_products.csv
definition name: none
description: list of multiplayer games
lookup type: file-based
This search is looking back to Saturday (earliest=@w6 latest=@w7) for all events involving a "view"
action in the web sales index. Then, the search transforms and sorts the data to show which game was
viewed the most. Replace the <missing> portion of the basic search with a subsearch so that only events
involving multiplayer games are returned.
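A subsearch runs first, and its results are handed to the outer search as a search expression. As a hedged illustration, the Python sketch below models how result rows could be turned into an OR-ed filter string. The field name product_name and the quoting style are assumptions for illustration, and the exact text Splunk generates may differ.

```python
# Model of how subsearch results become a filter for the outer search (not Splunk code).
def rows_to_filter(rows):
    """Render rows as ( ( k="v" AND ... ) OR ( ... ) ), one group per row."""
    groups = []
    for row in rows:
        pairs = " AND ".join(f'{key}="{value}"' for key, value in row.items())
        groups.append(f"( {pairs} )")
    return "( " + " OR ".join(groups) + " )"

# product_name is a hypothetical field name; the game titles come from the task text.
rows = [{"product_name": "Dream Crusher"}, {"product_name": "Final Sequel"}]
filter_text = rows_to_filter(rows)
print(filter_text)
```

Because the filter is built from the field names in the subsearch results, those names need to match fields that exist in the outer search's events.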
Combine two searches into a single search. The resulting search should find the average
and median sales totals for clientips who have experienced problems making a purchase
(action=purchase status>=400) but still managed to complete a successful web order
during the previous week.
This Venn diagram represents the components of this search: the results of the outer search (green), the
results of the inner search (blue), and the results of the outer search filtered by the results of the inner
search (grey).
Based on the task description, we want to perform statistical transformations on the data represented by
the grey inner section: the customers who experienced problems with a purchase (action=purchase
status>=400) yet still completed a successful online sales order (action=purchase status=200) over
the previous week.
Answer these questions about the inner and outer searches:
a. TRUE or FALSE: The inner search (blue) will look for customers who did not experience issues
with their online purchase.
b. TRUE or FALSE: The outer search (green) will look for successful purchase events but only return
events from customers that appeared in the results of the inner search.
Which of these searches provides the desired results of the inner search?
Search 2
index=web sourcetype=access_combined status=200 OR status=400 action=purchase
| stats sum(sale_price) as sales_sum by clientip
| stats avg(sales_sum) as avg_sales, median(sales_sum) as median_sales
Search 3
search index=web sourcetype=access_combined status>=400 action=purchase
| stats values(clientip) as clientip
Which of these searches provides the desired results of the outer search? (Note: If you run these
searches, remove the [<subsearch>] placeholder, otherwise you will receive an error.)
Search 1
index=web sourcetype=access_combined status=200 action=purchase
[<subsearch>]
| stats sum(sale_price) as sales_sum by clientip
| stats avg(sales_sum) as avg_sales, median(sales_sum) as median_sales
Search 2
index=web sourcetype=access_combined status=200 OR status=400 action=purchase
[<subsearch>]
| stats sum(sale_price) as sales_sum by clientip
| stats avg(sales_sum) as avg_sales, median(sales_sum) as median_sales
Search 3
search index=web sourcetype=access_combined status>=400 action=purchase
[<subsearch>]
| stats values(clientip) as clientip
Combine the inner and outer search to create your final search. Run this search over the Previous week.
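The inner/outer relationship in this task can also be modeled outside Splunk. The Python sketch below is only an illustration with made-up events: it collects the clientip values that hit a purchase error, keeps successful purchases from those same clients, and then computes the average and median of the per-client totals.

```python
# Model of "outer search filtered by inner search" using made-up events (not Splunk code).
from statistics import mean, median

events = [
    {"clientip": "10.0.0.1", "action": "purchase", "status": 503, "sale_price": 0.0},
    {"clientip": "10.0.0.1", "action": "purchase", "status": 200, "sale_price": 20.0},
    {"clientip": "10.0.0.1", "action": "purchase", "status": 200, "sale_price": 10.0},
    {"clientip": "10.0.0.2", "action": "purchase", "status": 200, "sale_price": 99.0},
    {"clientip": "10.0.0.3", "action": "purchase", "status": 404, "sale_price": 0.0},
    {"clientip": "10.0.0.3", "action": "purchase", "status": 200, "sale_price": 50.0},
]

# Inner search: clients that hit an error while purchasing.
troubled = {
    e["clientip"] for e in events
    if e["action"] == "purchase" and e["status"] >= 400
}

# Outer search: successful purchases, restricted to the clients found above.
sales_sum = {}
for e in events:
    if e["action"] == "purchase" and e["status"] == 200 and e["clientip"] in troubled:
        sales_sum[e["clientip"]] = sales_sum.get(e["clientip"], 0.0) + e["sale_price"]

# Final aggregation across the per-client totals.
avg_sales = mean(sales_sum.values())
median_sales = median(sales_sum.values())
```

In this sample, the client 10.0.0.2 is left out even though it bought successfully, because it never appears in the inner result.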
Lab Exercise 3 – Using the return Command
Description
Use the return command to control output from a search and a subsearch.
Steps
Return search results as key value pairs.
A coworker has asked you to help create a subsearch for a report. You have created a search that
normalizes username and Username values in the network data and finds the top 5 most active users.
Complete the <missing> portion of the search so that User values are returned as key-value pairs. Run
the search over the Last 24 hours.
index=network
| eval User=coalesce(username,Username)
| stats count by User
| sort 5 -count
| <missing>
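For orientation, the return command hands back a limited number of rows as field=value pairs joined with OR, and it returns only the first row unless a count is given. The Python function below is a simplified model of that behavior with made-up users. It is not Splunk's implementation, and the exact quoting in Splunk's output may differ.

```python
# Simplified model of `return <count> <field>` (not Splunk code).
def return_pairs(rows, field, count=1):
    """Render the first `count` rows as (field="value") terms joined with OR."""
    terms = [f'({field}="{row[field]}")' for row in rows[:count] if field in row]
    return " OR ".join(terms)

# Made-up rows, already sorted by count as the search above would leave them.
rows = [
    {"User": "acurry", "count": 42},
    {"User": "bsmith", "count": 17},
    {"User": "cdiaz", "count": 9},
]

print(return_pairs(rows, "User"))           # default count of 1: first row only
print(return_pairs(rows, "User", count=3))  # first three rows
```

The default of one row is the detail to watch: without an explicit count, only the top result would reach the outer search.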
Scenario: SecOps wants to know which employees have entered invalid passwords over the last
7 days.
Filter search input by returning key-value pairs from the employees.csv lookup. Count
instances of "failed password" by employee usernames.
Your knowledge manager has provided you with the following information about the employees.csv
lookup. Create a search that will open employees.csv and return all USERNAME values as key-value pairs.
(Hint: Use the inputlookup command with the employees.csv lookup to find out how many rows of data
exist in the lookup file. The number of rows will match the number of results returned. Then, use this
number with the return command.)
filename: employees.csv
definition name: employee_lookup
lookup type: file-based
This search looks for "failed password" events in the security index. Filter the search input by adding the
subsearch you created in the previous step. Then, run the search over the Last 7 days. What results
were returned?
The search is not working because the subsearch is returning USERNAME values while the outer search is
aggregating on user values. Fix the search and run over the Last 7 days. (Hint: No additional pipes need
to be added to the search.)
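The mismatch described here can be seen in a small model: key-value pairs only filter events when the key names a field that the events actually carry. The Python sketch below uses made-up usernames and a hypothetical rename_to parameter to show the difference. It illustrates the idea rather than the SPL needed for the fix.

```python
# Model of why the returned field name must match the events' field name (not Splunk code).
def return_pairs(rows, field, count, rename_to=None):
    """Return (key, value) pairs from the first `count` rows, optionally renaming the key."""
    key = rename_to or field
    return [(key, row[field]) for row in rows[:count]]

def filter_events(events, pairs):
    """Keep events that match at least one (key, value) pair."""
    return [e for e in events if any(e.get(k) == v for k, v in pairs)]

lookup_rows = [{"USERNAME": "acurry"}, {"USERNAME": "bsmith"}]  # made-up lookup rows
events = [{"user": "acurry"}, {"user": "zzz"}]                  # made-up events

# Pairs keyed on USERNAME match nothing, because the events only carry `user`.
mismatched = filter_events(events, return_pairs(lookup_rows, "USERNAME", 2))

# Renaming the key to `user` lets the same values match.
matched = filter_events(events, return_pairs(lookup_rows, "USERNAME", 2, rename_to="user"))
```

With the original key, no events pass the filter. Once the key is renamed to the field the events use, the matching event is returned.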