Open Source Investigation Handbook
Introduction
By Phil Rees, Director of Investigative Journalism, Al Jazeera
Chapter 1
What are Open Source Investigations?
Chapter 2
Planning and Carrying Out an Investigation
Chapter 3
Ethics and Safety
Chapter 4
Tracking Ships and Planes
Chapter 5
How to Identify Weapons
Chapter 6
Finding Out Who Owns a Corporation
Chapter 7
Analysing Satellite Imagery
Chapter 8
Tools and Networks
Open Source Investigation
Introduction
Phil Rees
Director of Investigative Journalism, Al Jazeera
Many of us know the scene in All the President’s Men, Hollywood’s interpretation of the Watergate scandal, when “Deep Throat” is standing in a car park basement in Washington DC. The man we now know to be the former Associate Director of the FBI, Mark Felt, was the secret informant who gave clues such as “follow the money” to the Washington Post journalist, Bob Woodward.

Finding the evidence that brought down US President Richard Nixon was a watershed event in investigative journalism and has rightly entered its folklore. To break investigative stories in the 1970s, you needed to develop sources. The skill sets needed for success were once described by the late Nick Tomalin as “ratlike cunning, a plausible manner, and a little literary ability”. Tomalin was killed while reporting the Arab-Israeli war in 1973, a year after Woodward broke the Watergate story.

[...] in restaurants, coffee shops or late at night in bars. They persuaded whistle blowers to do the right thing; they gained their trust so that the identity of a source would not be revealed. Managing a source was a critical skill for an investigator. HUMINT - or human intelligence - was the cornerstone of investigative journalism and obtaining information that no one else had was essential for an exclusive.

A journalist usually carried only a notebook and a tape recorder. When I started in journalism, there were no mobile phones. There was little methodology to investigative journalism. Success depended on who you knew and how effectively you exploited them.
The roots of OSINT lie in Computer Assisted Research. CAR began by exploring and analysing databases. In doing this we can discover patterns, trends and anomalies that may be useful in producing new information. The practical use of this methodology emerged with the Freedom of Information Act in the United States, which was introduced in the 1960s to open the workings of government to public scrutiny.

Philip Meyer, a pioneer of CAR, called it “precision journalism”. It was inspired by the methodology of social sciences where a journalist used evidence to prove his assertion.

A methodology was born that inspired a distinct storytelling style that distinguishes investigative from conventional journalism.

Conventional journalism is reactive and observational. It describes the world as it is seen, usually after something has happened. Investigative journalism, by contrast, seeks to prove that some aspect of what the public thinks it knows about the world is wrong. Like a policeman or prosecutor, an investigator will discover a lead or obtain prima facie evidence that supports a hypothesis that “X is lying” or “X is corrupt”. The investigation will aim to prove this supposition. If it can’t, the investigation is dropped.

This investigative methodology, known as hypothesis-based narrative, replaced conventional character-based or travelogue storytelling. Evidence gathering became the glue that holds the narrative together.

Decades ago, Philip Meyer made the prophetic statement: “When information was scarce, most of our efforts were devoted to hunting and gathering. Now that information is abundant, processing is more important.”

In the last decade, open-source intelligence (OSINT) has emerged as a journalistic science, as the vast resource of data collected from social networks and internet-connected devices is mined for information beyond just databases.
The volume of data created, captured and consumed globally is projected to be around 200 billion gigabytes a year in 2025 (up from 70 billion in 2020). Every minute on Facebook, around half a million comments are posted, and 150,000 photos are uploaded. More than four million hours of content is uploaded to YouTube every day. Add to that 700 million tweets per day.

Investigative journalism will increasingly rely on tapping these sources. We are not discovering truths that are strictly hidden from us - they are not confidential - but we are assembling information in a fashion that reveals new truths. We are unpicking the resources available online to tell the story behind the picture, the story that the metadata provides, or the story that shipping or flight data tells us about an event.

For filmmakers dealing with investigative content, there are new challenges. There will be more use of computer-generated imagery to tell the story and less use of video. There will be a need to harmonise different sources, such as vertical aspect ratio imagery, publicly generated and low-definition content with professional standards. Graphic designers, data scientists and filmmakers will need to work together in ways that presently rarely exist. The model of television production needs to adapt to a new method of storytelling.

With the amount of information in the public domain, investigative journalists of the future may be less concerned with obtaining secret data than finding ways to make sense of public data, and tell stories based on that. More complex computer-based tools, such as data mining programmes, geographic information systems, demographic databases and so forth can be used to identify patterns, anomalies and discrepancies in data. Much of the new technology surrounding open source intelligence will involve machine learning, that is when a computer model is trained to analyse data much faster than a human being. In effect, you train a computer to do the hunting for you.

It means that investigative journalists no longer need to only learn how to write and turn on a tape recorder or camera. They will need to learn the tools of the internet. While computer scientists will write the programmes, journalists will need to understand the science of OSINT.

OSINT is not a substitute but a complement for HUMINT. For most investigations, journalists need to use human sources as well as data. Investigative journalists still need “ratlike cunning, a plausible manner, and a little literary ability”. But they also need to understand how to get value from the abundance of information on the Internet.

This handbook provides an invaluable guide to achieve this.
Chapter 1
WHAT ARE OPEN SOURCE INVESTIGATIONS?
An open source investigation (OSINT) uses intelligence gathering techniques and technologies including satellite imagery, social media posts and user-generated content to uncover the invisible. In recent years, open source investigations have become one of journalism’s most valuable tools, largely due to their ability to tap into vast amounts of publicly available online information to reveal otherwise untold stories.

Collecting and analysing publicly available data and information from across the internet can include anything from analysing an IP address all the way through to interrogating public governmental records.

What is OSINT? Open source intelligence is the application of intelligence gathering techniques and technology to investigations that make use of open source data.

Security adjunct professor at Columbia University Mark M Lowenthal defines OSINT as “any and all information that can be obtained from the overt collection: all media types, government reports, and other files, scientific research and reports, business information providers, the Internet, etc”.

The learning process of how to use open source tools is constantly evolving. This handbook provides core elements and tools for journalists who are interested in conducting open-source investigations. It introduces a framework and outlines ethical approaches, while examining case studies, to analyse the fundamentals of online search and research techniques for investigations.

Whether it involves using search engines to gather documentation, examining videos and satellite imagery to collect critical evidence, or evaluating data gathered from an online database, this handbook offers journalists the necessary skills to acquire and verify documentation.
From early conflict and environmental monitoring to high-profile investigations such as Anatomy of a Killing,1 using advanced open source techniques has quickly developed to become a crucial practice for journalists in both long-form investigations and breaking news. Open source techniques involve researching, selecting, archiving and analysing information from publicly available sources.

[...]tion without compromising the safety of your subject matter or those involved in investigating the story.

Third, you should develop the right strategies to validate your findings. Collaboration is an important consideration here.
1
https://www.youtube.com/watch?v=XbnLkc6r3yc
Benefits
● Wide array of information to collect
● No or low barriers to access
● Easy-to-locate publicly available data
Risks
● Identity exposure
● Counterattacks from online adversaries
● Collection of misinformation
Chapter 2
PLANNING AND CARRYING OUT AN INVESTIGATION
Journalists can follow these four steps to start their open source investigations:

Step one: Planning
2: What are the key questions that need to be answered?
3: Which tools and platforms can help gather the required information?
Always evaluate any potential data storage risks and keep evidence and doc-
umentation safe by using encrypted storage. Also, don’t forget to take precau-
tions to ensure your identity remains secure.
ARCHIVING
Raw information gathered must be analysed and processed before any useful
or actionable conclusions can be drawn. This includes contacting people and
verifying findings across multiple sources. Verification is an iterative process
that involves three main phases:
Verifying the source - Where did you get the information from?
Verifying the content - Is the information actually what it claims to be?
Verifying its relevance - Does this information fit into your investigation?
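
The three verification phases above can be tracked for each item of evidence. What follows is a minimal sketch only: the `EvidenceItem` class, its field names and the example URL are hypothetical and do not come from any tool named in this handbook.

```python
from dataclasses import dataclass, field


@dataclass
class EvidenceItem:
    """One piece of collected material and its verification status (hypothetical schema)."""
    url: str
    source_verified: bool = False      # Where did the information come from?
    content_verified: bool = False     # Is it actually what it claims to be?
    relevance_verified: bool = False   # Does it fit the investigation?
    notes: list[str] = field(default_factory=list)

    def pending_phases(self) -> list[str]:
        """Return the verification phases that still need work."""
        checks = {
            "source": self.source_verified,
            "content": self.content_verified,
            "relevance": self.relevance_verified,
        }
        return [name for name, done in checks.items() if not done]

    def is_verified(self) -> bool:
        """True only when all three phases have been completed."""
        return not self.pending_phases()


# Illustrative usage with a made-up item.
item = EvidenceItem(url="https://example.org/video-123")
item.source_verified = True
item.notes.append("Uploader account traced to original poster")
print(item.pending_phases())
```

Because verification is iterative, a record like this can be revisited and updated as new sources confirm or contradict earlier findings.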
Chapter 3
ETHICS AND SAFETY
Open Source Investigation carries important ethical concerns, as well as legal compliance. Information might be publicly available but personal data may be subject to data privacy regulations to varying degrees. Do not forget to consider the issues below when using open source investigative techniques:

The origin and the intent of your sources: Make sure that all your searches are targeted and that you are collecting only the information that is relevant to your investigation.

Secondary trauma refers to a range of trauma-related stress reactions and symptoms that may result from exposure to graphic details of another individual’s traumatic experience.

As content from open source investigations is often very graphic, knowing yourself, and knowing what images affect you the most, is important to consider. Another factor in preventing secondary trauma is understanding your personal connection to the work you are investigating.
2
“Safer Viewing: A Study of Secondary Trauma Mitigation Techniques in Open Source Investigations”
https://www.hhrjournal.org/2020/05/safer-viewing-a-study-of-secondary-trauma-mitigation-techniques-in-open-source-investigations/
Once the material is archived, our investigative team sorts the content and iden-
tifies crucial pieces to be verified. The verification process involves determining
the source of the video, the location where it was filmed, the time of day and
date on which the incident happened, and any other relevant context.
Once many videos have been verified from the same event, our team can begin
piecing together the truth of what occurred on that day. We use a standardised
data tagging process to ensure every researcher is using the same tools and
drawing the same conclusions, and we share those methods with our readers -
an important piece of accountability is transparency in this process.
Our most recent investigation is a large dataset called the Coup Files, which
aims to verify documentation of violent incidents at any protests that have oc-
curred in opposition to the 2021 coup. In this dataset, our teams tag each in-
vestigated piece of documentation with identifiers that help us conclude who
was the perpetrator of the violence. This includes tags focused on identifiable
weapons, uniforms, vehicles and other indicators of those perpetrator groups.
As well, we identify any protest characteristics that could help us prove there
were indicators of excessive force or unlawful use of crowd control techniques.
Examples include videos of tear gas canisters thrown directly into a
dense crowd of people, or photos of live bullets at a protest involving the pres-
ence of students and children.
We publish incident reports focused on the protest days, grouping together vi-
olent incidents or the presence of security forces that we can confirm using this
open source documentation. We also publish the data, set on a map, to help
human rights advocates find the information they need - including by sorting for
verified documentation of specific types of incidents or possible perpetrators.
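
The tagging and sorting workflow described above can be sketched in a few lines. This is an illustration only: the tag names, dates, incident records and helper functions are hypothetical and do not reflect the project's actual schema or data.

```python
from collections import defaultdict

# Hypothetical incident records; a real tag vocabulary would be agreed by the team.
incidents = [
    {"id": "inc-001", "date": "2021-03-03", "tags": {"tear_gas", "police_uniform"}},
    {"id": "inc-002", "date": "2021-03-03", "tags": {"live_ammunition", "military_vehicle"}},
    {"id": "inc-003", "date": "2021-03-14", "tags": {"tear_gas", "children_present"}},
]


def filter_by_tags(records, required):
    """Return the records that carry every tag in `required`."""
    required = set(required)
    return [r for r in records if required <= r["tags"]]


def group_by_date(records):
    """Group incident ids by protest day, as an incident report would."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r["date"]].append(r["id"])
    return dict(grouped)


print([r["id"] for r in filter_by_tags(incidents, ["tear_gas"])])
print(group_by_date(incidents))
```

Keeping tags in a fixed, shared vocabulary is what lets different researchers reach comparable conclusions from the same material.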
Chapter 4
Tracking Ships and Planes
Tracking the movement of ships and planes is an increasingly valuable technique in open source investigations. In this chapter we present how these techniques can be used to investigate the movement of sanctioned goods, follow the travel paths of government officials and track illegal fishing or forced labour.

Tracking Ships
How to get started:

1. Choose a ship-locating website.
Some go-to platforms for journalists looking for real-time shipping data include:
- MarineTraffic
- VesselFinder
- FleetMon
2
https://www.icij.org/investigations/paradise-papers/offshore-gurus-help-rich-avoid-taxes-jets-yachts/
RadarBox24
A flight tracker with live maps and
search function.
Freedar
A flight tracker that includes military
aircraft. It also has monitoring of air
traffic control audio.
OpenSky Network
A non-profit association based in
Switzerland that provides open ac-
cess to flight tracking control data.
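
As an illustration of what open flight data looks like, the sketch below reads aircraft "state vectors" from the OpenSky Network REST endpoint. It is a hedged example: the endpoint URL, the bounding-box parameter names and the field positions are assumptions based on OpenSky's public documentation and should be checked before use; the sample response is made up.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for live state vectors; verify against the current OpenSky docs.
API_URL = "https://opensky-network.org/api/states/all"

# Positions of selected fields in each state vector (assumed from the public docs).
FIELDS = {"icao24": 0, "callsign": 1, "origin_country": 2,
          "longitude": 5, "latitude": 6, "baro_altitude": 7, "on_ground": 8}


def parse_states(payload):
    """Turn a decoded states response into a list of dicts."""
    flights = []
    for state in payload.get("states") or []:
        flight = {name: state[idx] for name, idx in FIELDS.items()}
        if flight["callsign"]:
            flight["callsign"] = flight["callsign"].strip()
        flights.append(flight)
    return flights


def fetch_states(lamin, lomin, lamax, lomax, timeout=30):
    """Query live state vectors inside a bounding box (needs network access)."""
    query = urllib.parse.urlencode(
        {"lamin": lamin, "lomin": lomin, "lamax": lamax, "lomax": lomax})
    with urllib.request.urlopen(f"{API_URL}?{query}", timeout=timeout) as resp:
        return parse_states(json.load(resp))


# Offline example with a made-up response in the same shape.
sample = {"time": 1700000000, "states": [
    ["abc123", "TEST123 ", "Switzerland", 1700000000, 1700000000,
     8.55, 47.45, 1200.0, False, 90.0, 180.0, 0.0, None, 1250.0, None, False, 0],
]}
print(parse_states(sample))
```

The `icao24` value is the aircraft's transponder address, which is usually the most reliable key for following one airframe across flights and callsigns.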
Icarus Flights
https://icarus.flights/
Chapter 5
How to Identify Weapons
Since the conflict in Yemen began in 2015, it has become harder for international rights organisations, UN bodies and journalists to document violations committed by all parties to the conflict.

Investigators have to work very hard to identify and verify the details of possible unlawful attacks, mainly using intelligence gathering techniques and technologies. One of these techniques involves analysing photos and videos to verify the types of weapons being used.

Investigators can study the shape of a crater left behind after a missile strike, watch footage of air raids to classify the types of missiles used, or analyse weapons trade data to understand ownerships of these munitions.

Here are a few steps to help you identify weapons

1. Determine the weapon’s class.

Broadly speaking, there are three main classes of weapons: small arms, light weapons, and heavy weapons.
● Small arms include pistols, rifles, light machine guns and other weapons that can be carried and operated by one person.
● Light weapons include larger machine guns, rocket-propelled grenades (RPGs), man-portable air-defence systems (MANPADS), mortars and other weapons that require a small crew to operate.
● Heavy weapons systems include tanks, helicopters, fighter planes, submarines and warships.
In the Crisis Evidence Lab at Amnesty International, we use digital 3D models to both generate new findings (evidentiary) and to communicate existing findings (demonstrative).

Digital models can be all these things too, either separately or all at once. They can contain many different layers of data that can be turned on and off or overlaid. They can be zoomed almost infinitely, enabling 3D and 2D elements to be viewed together at different scales.
Chapter 6
Find Out Who Owns a Corporation
If you would like to investigate the world’s largest companies and reveal who owns offshore companies and trusts, free databases are your starting point. There are other ways to research companies; you can find official and court records, and search on subscription databases or corporate websites.

Whether you are investigating global money-laundering cases or bribery, you can use OpenCorporates to try to identify who is who and who is transacting with whom. The database can provide the company’s incorporation date, its registered addresses, and the names of directors and officers. You can search connections between companies, or work out which companies are run by the same CEO and even do more specific searches focusing on particular countries. Similarly, journalists trying to ‘follow the money’ across borders can use the Investigative Dashboard, created by the Organised Crime and Corruption Reporting Project (OCCRP), which allows access to hundreds of databases that detail company records and online and offline court records from nearly every country in the world.
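
Working out which companies share a director or officer, as described above, amounts to grouping company records by officer name. A minimal sketch follows, assuming the records have already been exported from a registry search; the company names, officer names and record layout are invented for illustration and are not the data format of OpenCorporates or any other database.

```python
from collections import defaultdict

# Invented records for illustration only.
companies = [
    {"name": "Alpha Holdings Ltd", "jurisdiction": "gb", "officers": ["Jane Doe", "John Roe"]},
    {"name": "Beta Shipping SA", "jurisdiction": "pa", "officers": ["Jane  DOE"]},
    {"name": "Gamma Trading LLC", "jurisdiction": "us_de", "officers": ["Max Mustermann"]},
]


def normalise(name):
    """Crude normalisation: lower-case and collapse whitespace.

    Real matching needs more care (aliases, transliteration, namesakes).
    """
    return " ".join(name.lower().split())


def shared_officers(records):
    """Map each officer who appears in two or more companies to those companies."""
    by_officer = defaultdict(set)
    for company in records:
        for officer in company["officers"]:
            by_officer[normalise(officer)].add(company["name"])
    return {o: sorted(c) for o, c in by_officer.items() if len(c) > 1}


print(shared_officers(companies))
```

A shared name is a lead to verify, not proof of a connection: two different people can have the same name, so matches should be checked against dates of birth, addresses or filings.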
If you are trying to expose organised crime and corruption around the world, the Offshore Leaks Database, developed by the International Consortium of Investigative Journalists (ICIJ), can help you to find information and documents on persons of interest and their business connections. The database contains leaked documents about nearly 785,000 offshore companies and trusts.

If you are interested in covering oil, gas and mining and you would like to discover the connection between the companies that own and operate oil rigs, and how they are incorporated as companies in, or working through, maritime tax havens, check the portal Double Offshore developed by Code for Africa. The same organisation developed the project the Miners of Mozambique, to discover the individuals behind the mining industry in Mozambique and their connections.
MORE:
● ResourceContracts: A portal that
houses over a thousand mining and
oil contracts.
● Resourceprojects.org: A repository
of extractives projects
BOX #5
How did a complex network of shell companies trade Syrian phosphates despite sanctions?
Bashar Deeb
investigator at Lighthouse Reports

Whether it’s a warzone in the Middle East, Ukraine or Africa, or borders between Greece and Turkey, it’s often very difficult to send journalists to inquire about things in such places. During a joint investigation between Lighthouse Reports, the Organised Crime and Corruption Reporting Project (OCCRP), and Syrian Investigative Reporting for Accountability Journalism (SIRAJ), we were looking at the exports of Syrian phosphates to see where they were ending up. The exports started off from the Mediterranean port of Tartous, which is controlled by Russia. This port is not a place where journalists can visit and speak to people, so OSINT was critical to this work.

In theory, finding the ships that were loading the phosphates in the port was the main challenge in figuring out where the phosphate was going, because then we could track these ships on commercial AIS services to see their final destinations. During the investigation, my colleagues noticed that the port’s Facebook page was publishing a daily list of the working ships. The list also contained the type of cargo for each ship and the pier where each of them was docked.

But this didn’t last long: the page deleted most of its recent lists and stopped publishing new ones after June 2020. Using Google dorking techniques - search strings that use advanced search operators to find information that is not easily available - we managed to find other pages which had copy-pasted these lists and reconstructed the timeline of the working ships in Tartous port. This allowed us to track the ones carrying phosphates to their European destinations. Of course, we did extra traditional verification work. In some cases we asked for bills of lading for these ships to make sure our analysis was correct; in other cases we obtained customs records that verified that these ships were indeed moving phosphates.

But we still needed to find out where the new shipments were going. Here, the challenge was to figure out a way to identify which ships were moving phosphates after June 2020. Based on our observations of these lists but also on reading old online articles about the structure of the port, we noticed that all the phosphate-carrying vessels would dock in berths 18-19. This pier was built specifically to handle phosphates, with a crane operating between it and a dedicated phosphate storage area. We used satellite imagery to verify. Having this information allowed us to look through photos taken by port workers or photos from official visits to the port and identify a few ships that were docked at this pier.
Chapter 7
Analysing Satellite Imagery
When you have a story, but still need to tie up loose ends to answer where or when a particular event occurred, analysing satellite imagery can point you in the right direction.

So how do you get started?

Analysing satellite imagery can be useful in providing geographical context, reconstructing events, or even verifying if a particular event happened at all.

The use of satellite imagery has become an indispensable tool for investigative journalists to report on conflicts, environmental destruction, developments in military infrastructure and natural disasters. Satellite imagery has also become a compelling centrepiece for visual storytelling, and a window into remote or restricted locations. Investigative journalists can use satellite imagery to make visible what governments or institutions want hidden out of sight.

SATELLITE IMAGERY PROVIDERS

Over the past few years, several free and subscription-based earth imaging companies have emerged allowing anyone to access high-resolution satellite imagery from all over the world. Some of these services include:

Free services
● Google Earth
● NASA’s Worldview
● The European Space Agency
● World Imagery Wayback Tool
● Zoom Earth

Subscription services
● Maxar Technologies
● Planet Labs
● Sentinel Hub
● SI Imaging Services
● Spaceknow
However, just having access to these services is not always enough. For satellite image analysis to be effective in your investigation you will need to ensure that the recency of the images as well as the satellite image resolution are adequate to match your needs.

4. Analyse the direction of the shadows and colour of the terrain to help you determine the date and time a particular image was captured.
5. Consider your prior knowledge of a location to see if anything stands out in the environment.
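
The shadow analysis in step 4 rests on simple geometry: a shadow points directly away from the sun, and its length relative to the height of the object casting it gives the sun's elevation. Below is a minimal sketch, assuming flat ground, a vertical object and a north-up image; the function names and example measurements are illustrative. Turning the resulting angles into a date and time requires a solar-position calculator, which is not shown here.

```python
import math


def sun_azimuth_from_shadow(shadow_bearing_deg):
    """Sun azimuth (degrees clockwise from north) given the bearing the shadow points to."""
    return (shadow_bearing_deg + 180.0) % 360.0


def sun_elevation(object_height_m, shadow_length_m):
    """Sun elevation angle in degrees from an object's height and its shadow length."""
    if object_height_m <= 0 or shadow_length_m <= 0:
        raise ValueError("height and shadow length must be positive")
    return math.degrees(math.atan2(object_height_m, shadow_length_m))


# Example: a 10 m mast casts a 10 m shadow pointing north-west (bearing 315 degrees).
print(sun_azimuth_from_shadow(315.0))        # the sun is in the opposite direction
print(round(sun_elevation(10.0, 10.0), 1))   # equal height and length means 45 degrees
```

Short shadows indicate a high sun (around midday, or summer at higher latitudes), while long shadows indicate early morning, late afternoon or winter, which helps narrow down when an image was captured.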
MAPPING ENVIRONMENTAL
IMPACT
CASE STUDIES
In 2018, Shawn Zhang, a Chinese law student in Canada, began scouring Google Earth for evidence of detentions in Xinjiang, an official autonomous region in China.

Since then, several organisations including the Australian Strategic Policy Institute (ASPI) have used satellite imagery, witness accounts, media reports and official construction tender documents to classify the detention facilities into four tiers depending on the existence of security features such as high perimeter walls, watchtowers and internal fencing.

ASPI says they have identified more than 380 “suspected detention facilities” in the region, where the United Nations says more than one million Uighurs and other mostly Muslim Turkic-speaking residents have been held in recent years.
● Attacks on Hospitals
and Medical Staff in Sudan
● Massacre in Tigray
Chapter 8
Tools and Networks
● Bellingcat’s Online Investigative Toolkit
● First Steps to Getting Started in Open Source
Research
● OSINTcurio.us features weekly podcasts, web-
casts and “10 minute tips” on video covering many
aspects of doing open source investigations. It’s a
community project begun in late 2018 by about 10
contributing experts
● The Open Source Intelligence Framework has a
very detailed and ever-growing list of digital inves-
tigative tools
● Exposing the Invisible Kit by Tactical Tech
● Open Source Intelligence Techniques by Michael
Bazzell
● Online research tools by Global Investigative
Journalism Network
● The OSINT Framework
NETWORKS
By
Sara Creta
Edited by:
Muhammad Al khamaiseh
Nina Montagu-Smith
Designed by:
Ahmad Fattah