Downloading species occurrence data using the GBIF web-service API

Presentation · September 2016


DOI: 10.13140/RG.2.2.19572.35202


1 author:

Dag Terje Filip Endresen


University of Oslo

All content following this page was uploaded by Dag Terje Filip Endresen on 04 October 2016.



NINA, Trondheim, 15th September 2016

GBIF data use

Dag Endresen
GBIF Norway
UiO Natural History Museum in Oslo
University of Oslo

Thursday, September 15th, 2016

Slides: CC-BY-4.0, GBIF.no


Status
14th September 2016

GBIF enables free and open access to biodiversity data online.

We are an international government-initiated and -funded initiative focused on making biodiversity data available to all and anyone, for scientific research, conservation and sustainable development.

GBIF provides a data discovery system that is dependent on resolvable stable identifiers for efficient functionality.

global registry data portal

GBIF and GEO
Intergovernmental group on earth observations

GEO BON
Biodiversity observation network

Data Integration & Interoperability


GBIF provides the infrastructure delivering species occurrence
data in GEO.
GBIF BY THE NUMBERS

649,054,525 species occurrence records
32,440 datasets
813 data-publishing institutions

http://www.gbif.org | 06 JUN 2016


GBIF BY THE NUMBERS: MAY 2016

+3,818,408 species occurrence records
+4,267 datasets
+4 data-publishing institutions

http://www.gbif.org | 6 JUN 2016


data mobilization
DATA PUBLISHED THROUGH GBIF.ORG
[Chart: occurrence records published through GBIF.org, in millions; y-axis from 100 to 700]

http://www.gbif.org | 6 JUN 2016

participation
MAP OF GBIF COUNTRY PARTICIPANTS

[Map of GBIF country participants, with annotations: Asia (lack of data); Africa (lack of data)]

August 2016
data publishing
DATA, BY GBIF PARTICIPANT (status May 2016)

[Pie charts of records by publishing country. Labels: United States, Denmark, Germany, Norway, Spain, Belgium, Netherlands, Australia, Costa Rica, South Africa, Other]

Number of new records published, top 10 participant countries (1 to 31 May 2016)

 1. United States    3,348,499
 2. Denmark          2,972,094
 3. Germany          2,868,240
 4. Norway           2,322,797
 5. Spain            2,238,363
 6. Belgium          1,620,423
 7. Netherlands      1,094,804
 8. Australia          859,896
 9. Costa Rica         810,035
10. South Africa       436,236

Total number of records published, top 10 participant countries (as of 31 May 2016)

 1. United States    271,901,500
 2. Sweden            53,776,182
 3. United Kingdom    49,786,646
 4. France            39,896,982
 5. Australia         37,489,401
 6. Netherlands       24,241,092
 7. Norway            23,811,863
 8. Germany           22,151,479
 9. Finland           16,612,735
10. Spain             13,630,866

NOTE: Datasets are assigned to countries according to the location of the publishing institution, including aggregated datasets with contributors from many other countries.

http://www.gbif.org | 09 JUN 2016
use of gbif.org
DATA DOWNLOAD REQUESTS, BY COUNTRY
1 January to 31 May 2016

Total of 37,552 requests from 5,131 users in 127 countries, islands and territories.

 1. United States    7,128
 2. Mexico           5,526
 3. Brazil           3,079
 4. United Kingdom   2,670
 5. Spain            2,478
 6. Colombia         2,235
 7. Italy            1,319
 8. China            1,263
 9. France             949
10. Australia          858

Norwegian scientists generally use Artskart…

Requests for download do not necessarily result in data actually being downloaded. Based on country indicated by user login. | 06 JUN 2016
data use
CITATIONS IN PEER-REVIEWED RESEARCH
[Chart: annual number of peer-reviewed publications using GBIF-mediated data]

9 JUN 2016
research use
USE CITATIONS, BY COUNTRY OF AUTHORS

May 2016

 1. United States    15
 2. Germany           9
 3. China             5
 3. France            5
 3. Spain             5
 3. United Kingdom    5
 7. Australia         4
 7. Brazil            4
 9. Canada            3
 9. Netherlands       3
 9. South Africa      3

Number of research publications in May 2016 citing use of GBIF-mediated data, ranked by country according to affiliation of author. Top 11 countries shown.

Total 2016

 1. United States    49
 2. Germany          22
 3. France           18
 4. China            17
 5. Brazil           14
 5. Mexico           14
 5. United Kingdom   14
 8. Australia        11
 8. Spain            11
10. Canada           10

Number of research publications in 2016 citing use of GBIF-mediated data, ranked by country according to affiliation of author. Top 10 countries shown.

[Slide annotation: Norway]

10 JUN 2016
research use
RESEARCH EXAMPLES (FOR NORWAY)
•  Araújo R, Assis J, Aguillar R, Airoldi L, Bárbara I, Bartsch I, Bekkby T et al. (2016)
Status, trends and drivers of kelp forests in Europe: an expert assessment.
Biodiversity and Conservation 25(7) 1319-1348.

•  Jb N (2016)
Some interesting lichenized fungi from old Fraxinus excelsior and Ulmus glabra in
Norway, including four new country records. Graphis Scripta 28(1-2) 17-21.

•  Newbold T, Hudson L, Hill S, Contu S, Gray C, Scharlemann J, Sheil D et al. (2016)
Global patterns of terrestrial assemblage turnover within and among land uses.
Ecography.

http://www.gbif.org/country/NO/publications
A complete archive of research citing use of GBIF can be accessed at http://www.mendeley.com/groups/1068301/gbif-public-library
10 JUN 2016
GBIF portal:

22.0 million occurrences with locations in Norway.
Published from 31 countries worldwide.

Updated 5 September 2016


GBIF portal:

21.5 million occurrences from Norwegian institutions.
Coverage 219 countries worldwide.

Updated 5 September 2016



STATUS FOR NORDIC GBIF NODES (DATA HOSTED BY…)

http://www.gbif.org/country/NO
[Panels: Denmark, Finland, Iceland, Norway, Sweden]

Sept 2016    Datasets    Occurrences
Denmark      66 + 1      10 905 213
Finland      54           3 611 729
Iceland      4              458 705
Norway       112 + 2     21 684 727
Sweden       42          53 787 704
Download data
GBIF DATA PORTAL
SPECIES SEARCH
Portal API
webservices
GBIF DATA PORTAL API

An interface to access
data published through
the GBIF network using
web services.

PORTAL API
GBIF Data Portal API:
http://api.gbif.org/v1/ (+parameters)

Summary and information:
http://www.gbif.org/developer/summary

The RESTful API takes search parameters as key=value pairs and responds with JSON content.

RESTful query format
JSON response type
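
The key=value query format can be sketched in a few lines of Python. This is a minimal illustration added for clarity, not part of the original slides: the helper name build_url is hypothetical, and the base URL is the one quoted on the slide.

```python
from urllib.parse import urlencode

API_BASE = "http://api.gbif.org/v1/"  # base URL as given on the slide


def build_url(path, **params):
    """Join an API path and key=value search parameters into a request URL."""
    query = urlencode(params)  # percent-encodes values; a space becomes '+'
    url = API_BASE + path.lstrip("/")
    return url + "?" + query if query else url


print(build_url("dataset/search", publishingCountry="NO"))
print(build_url("species", name="Beta vulgaris"))
```

The resulting URL can then be fetched with any HTTP client and the body decoded as JSON, for example with urllib.request.urlopen and json.load.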
GBIF API SECTIONS

•  Registry
information about the datasets, organizations (e.g. data publishers), networks and the means to access them (technical endpoints)
•  Species
information about species and higher taxa, and utility services for interpreting names and looking up the identifiers (access to all published checklists in the GBIF checklist bank)
•  Occurrence
occurrence information crawled and indexed by GBIF, search services for real-time paged search, and asynchronous download services for large batch downloads
•  Maps
simple services to show the maps of GBIF mobilized content
API EXAMPLE : DATASET

Search for datasets by publishing country:
http://api.gbif.org/v1/dataset/search?publishingCountry=NO

Dataset information (UiO NHM Lichens):
http://api.gbif.org/v1/dataset/7948250c-6958-4a29-a670-ed1015b26252

Contacts for a dataset:
http://api.gbif.org/v1/dataset/7948250c-6958-4a29-a670-ed1015b26252/contact

Dataset endpoint (get the download URL):
http://api.gbif.org/v1/dataset/7948250c-6958-4a29-a670-ed1015b26252/endpoint

http://www.gbif.org/developer/registry
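
The three per-dataset URLs above share one pattern: the dataset key (a UUID) followed by an optional sub-resource. A small Python sketch of that pattern follows; the function name dataset_urls and the dictionary labels are hypothetical, and only the three paths shown on the slide are assumed to exist.

```python
import uuid

API_BASE = "http://api.gbif.org/v1/"  # base URL as given on the slides


def dataset_urls(dataset_key):
    """Return the registry URLs for one dataset, keyed by what they describe."""
    key = str(uuid.UUID(dataset_key))  # raises ValueError for a malformed key
    root = API_BASE + "dataset/" + key
    return {
        "information": root,
        "contacts": root + "/contact",
        "endpoints": root + "/endpoint",
    }


# Dataset key of the UiO NHM Lichens dataset, taken from the slide.
urls = dataset_urls("7948250c-6958-4a29-a670-ed1015b26252")
for label, url in urls.items():
    print(label, url)
```

Validating the key first means a key that was broken across lines during copy and paste fails early instead of producing a URL that returns an error.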
API EXAMPLE : SPECIES
List all name usages (across all checklists):
http://api.gbif.org/v1/species?name=Beta%20vulgaris

Name usage across checklists (Beta vulgaris, 5383920):
http://api.gbif.org/v1/species/5383920/related

Name parsed into epithets and author etc.:
http://api.gbif.org/v1/parser/name?name=Abies%20alba%20Mill.%20sec.%20Markus%20D.
{"scientificName": "Abies alba Mill. sec. Markus D.",
"type": "SCINAME",
"genusOrAbove": "Abies",
"specificEpithet": "alba",
"authorsParsed": true,
"authorship": "Mill.",
"sensu": "sec. Markus D.",
"canonicalName": "Abies alba",
"canonicalNameWithMarker": "Abies alba",
"canonicalNameComplete": "Abies alba Mill."
}

http://www.gbif.org/developer/species
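
To show how the parsed name can be used once received, here is a short Python sketch that decodes the JSON shown on the slide. The response text is copied from the slide; the helper first_parsed_name is hypothetical, and it accepts either a single object (as on the slide) or a list of objects, on the assumption that the live service may wrap results in a list.

```python
import json

# Response text as shown on the slide for "Abies alba Mill. sec. Markus D."
response_text = """
{"scientificName": "Abies alba Mill. sec. Markus D.",
 "type": "SCINAME",
 "genusOrAbove": "Abies",
 "specificEpithet": "alba",
 "authorsParsed": true,
 "authorship": "Mill.",
 "sensu": "sec. Markus D.",
 "canonicalName": "Abies alba",
 "canonicalNameWithMarker": "Abies alba",
 "canonicalNameComplete": "Abies alba Mill."}
"""


def first_parsed_name(text):
    """Decode a parser response; accept one object or a list of objects."""
    data = json.loads(text)
    return data[0] if isinstance(data, list) else data


name = first_parsed_name(response_text)
print(name["canonicalName"])  # genus and epithet without authorship
print(name["authorship"])     # author string split off by the parser
print(name["authorsParsed"])  # JSON true is decoded to Python True
```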
API EXAMPLE : OCCURRENCE

List occurrences of Beta vulgaris (look up the taxonKey first, then search):
http://api.gbif.org/v1/species/match?name=Beta+vulgaris => taxonKey
http://api.gbif.org/v1/occurrence/search?taxonKey=5383920

List occurrences published from Norway (all taxa, then Beta vulgaris only):
http://api.gbif.org/v1/occurrence/search?publishingCountry=NO
http://api.gbif.org/v1/occurrence/search?publishingCountry=NO&taxonKey=5383920

Information about a single occurrence record:
http://api.gbif.org/v1/occurrence/1040970640
http://api.gbif.org/v1/occurrence/1040970640/fragment
http://api.gbif.org/v1/occurrence/1040970640/verbatim

List occurrence counts for datasets of country (or taxon):
http://api.gbif.org/v1/occurrence/counts/datasets?country=NO

http://www.gbif.org/developer/occurrence
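
The two-step pattern above (match a name to a key, then search occurrences by that key) can be sketched offline in Python. The match response below is an illustrative, heavily trimmed stand-in: only the usageKey value 5383920 comes from the slides, and the function name occurrence_search_url is hypothetical.

```python
import json
from urllib.parse import urlencode

API_BASE = "http://api.gbif.org/v1/"  # base URL as given on the slides

# Illustrative stand-in for a species/match response. Only usageKey is taken
# from the slides; a real response carries many more fields.
match_response = json.loads('{"usageKey": 5383920}')


def occurrence_search_url(taxon_key, publishing_country=None):
    """Build an occurrence search URL for a taxon, optionally by publishing country."""
    params = {"taxonKey": taxon_key}
    if publishing_country:
        params["publishingCountry"] = publishing_country
    return API_BASE + "occurrence/search?" + urlencode(params)


taxon_key = match_response["usageKey"]
print(occurrence_search_url(taxon_key))
print(occurrence_search_url(taxon_key, publishing_country="NO"))
```

Parameter order in the query string does not matter to the service, so the second URL is equivalent to the one on the slide even though taxonKey comes first.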
API EXAMPLE : DOWNLOAD DATA
Look up the speciesKey (1) and download occurrences (2):

http://api.gbif.org/v1/species/match?verbose=false&kingdom=Plantae&name=Beta+vulgaris
=> usageKey/speciesKey = 5383920

http://api.gbif.org/v1/occurrence/search?taxonKey=5383920 [&limit=1000&offset=0]
=> notice: count = 25 513
=> then: page through results… (using offset & limit)

http://api.gbif.org/v1/occurrence/download/request
[POST] => downloadKey (see next slide)
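
Paging through results with offset and limit amounts to simple arithmetic, sketched below in Python. The function name page_offsets is hypothetical; the count of 25 513 and the limit of 1000 are the values from the slide. Note, as an assumption to verify against the current API documentation, that the server may cap limit at a smaller page size than the one requested, in which case the limit echoed back in each response should drive the loop.

```python
def page_offsets(count, limit):
    """Yield (offset, limit) pairs that together cover `count` records."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    for offset in range(0, count, limit):
        # The last page is shortened so the pairs never overshoot count.
        yield offset, min(limit, count - offset)


pages = list(page_offsets(25513, 1000))
print(len(pages))   # number of requests needed
print(pages[0])     # first page
print(pages[-1])    # last, partial page
```

Each pair maps directly onto the &limit= and &offset= parameters of the search URL. For result sets of this size and larger, the asynchronous download service on the following slides is usually the better fit.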
API EXAMPLE : ASYNCHRONOUS (1)
Request asynchronous download:

$ curl -i --user yourGbifUserName:yourGbifPassword \
    -H "Content-Type: application/json" -H "Accept: application/json" \
    -X POST -d @filter.json \
    http://api.gbif.org/v1/occurrence/download/request >> log.txt

Search parameters in a JSON text file, filter.json (in the current directory, or give curl the path to the file):
{
  "creator": "yourGbifUserName",
  "notification_address": ["[email protected]"],
  "predicate":
  {
    "type": "and",
    "predicates":
    [{"type":"equals","key":"HAS_COORDINATE","value":"false"},
     {"type":"equals","key":"TAXON_KEY","value":"5383920"}]
  }
}
DOWNLOADS ARE AVAILABLE IN THE
PORTAL (FROM YOUR USER PROFILE)
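
Writing filter.json by hand invites quoting mistakes, so it can help to generate it. The Python sketch below builds the same structure as the slide; the function name download_request and the e-mail address you@example.org are hypothetical placeholders, while the field names and predicate keys are the ones shown above.

```python
import json


def download_request(creator, email, taxon_key, has_coordinate):
    """Build the request body for an occurrence download, as on the slide."""
    return {
        "creator": creator,
        "notification_address": [email],
        "predicate": {
            "type": "and",
            "predicates": [
                {"type": "equals", "key": "HAS_COORDINATE",
                 "value": "true" if has_coordinate else "false"},
                {"type": "equals", "key": "TAXON_KEY", "value": str(taxon_key)},
            ],
        },
    }


# you@example.org is a placeholder address, not taken from the slides.
body = download_request("yourGbifUserName", "you@example.org", 5383920,
                        has_coordinate=False)
filter_json = json.dumps(body, indent=2)
print(filter_json)
```

Saving filter_json to a file named filter.json gives curl the input it expects for -d @filter.json.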
API EXAMPLE : ASYNCHRONOUS (2A)

Request asynchronous download:

# usage: gbifapi <taxonKey> "<scientific name>"
function gbifapi {
  # POST the download request; the taxonKey ($1) is inserted into the predicate
  curl -i --user yourGbifUserName:yourGbifPassword \
    -H "Content-Type: application/json" -H "Accept: application/json" \
    -X POST -d "{\"creator\": \"yourGbifUserName\", \"notification_address\": [\"[email protected]\"], \"predicate\": {\"type\":\"and\", \"predicates\": [{\"type\":\"equals\",\"key\":\"HAS_COORDINATE\",\"value\":\"true\"}, {\"type\":\"equals\",\"key\":\"TAXON_KEY\",\"value\":\"$1\"}]}}" \
    http://api.gbif.org/v1/occurrence/download/request >> log.txt
  # append the taxonKey and name so each response in log.txt can be identified
  echo -e "\r\n$1 $2\r\n\r\n----------------\r\n\r\n" >> log.txt
}

$ gbifapi 4140730 "Aciachne acicularis"
$ gbifapi 4140704 "Aciachne flagellifera"
$ gbifapi 5289784 "Aegilops comosa"

API EXAMPLE : ASYNCHRONOUS (2B)

(…clean log.txt with the downloadKeys using regular expressions…)

# usage: gbifwget <downloadKey> <taxonKey> "<scientific name>"
# the ./dwca directory must already exist
function gbifwget {
  echo -e "\n\n----------------\n$1 $2 $3\n" >> log_wget.txt
  # fetch the zipped download, showing progress on the terminal and logging it
  wget http://api.gbif.org/v1/occurrence/download/request/$1.zip 2>&1 | tee /dev/tty >> log_wget.txt
  # rename the archive after the taxonKey
  mv "$1.zip" "./dwca/$2.zip" 2>&1 | tee /dev/tty >> log_wget.txt
}

$ gbifwget 0006050-141024112412452 4140730 "Aciachne acicularis"
$ gbifwget 0006053-141024112412452 4140704 "Aciachne flagellifera"
$ gbifwget 0006056-141024112412452 5289784 "Aegilops comosa"

(work in progress…)
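
The "clean log.txt with regular expressions" step is not spelled out on the slide. One possible sketch in Python is given below. It assumes that download keys always look like the three examples above (seven digits, a hyphen, fifteen digits), which is inferred from the slide rather than from a specification. The sample log text is made up for illustration, and the function names are hypothetical.

```python
import re

# Pattern inferred from the keys on the slide, e.g. 0006050-141024112412452.
KEY_PATTERN = re.compile(r"\b\d{7}-\d{15}\b")


def extract_download_keys(log_text):
    """Return the download keys found in the log, in order, without repeats."""
    keys = []
    for key in KEY_PATTERN.findall(log_text):
        if key not in keys:
            keys.append(key)
    return keys


def download_url(key):
    """URL of the zipped download, following the wget call on the slide."""
    return "http://api.gbif.org/v1/occurrence/download/request/" + key + ".zip"


# Made-up log excerpt; a real log.txt would also contain HTTP headers.
sample_log = """0006050-141024112412452
4140730 Aciachne acicularis
----------------
0006053-141024112412452
4140704 Aciachne flagellifera
----------------
"""

keys = extract_download_keys(sample_log)
for key in keys:
    print(key, download_url(key))
```

The taxon keys in the log (such as 4140730) are not matched because they lack the hyphenated shape, so the list holds download keys only.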
Slide by Daniel Amariles, 2013

MAPPING API V1.0


You can easily overlay GBIF content on
your own maps.
http://www.gbif.org/developer/maps
Slide by Daniel Amariles, 2013

MAPPING API V1.0

This service is intended for use with commonly used map clients such as the Google Maps API, the Leaflet JS library or the Modest Maps JS library.

http://leafletjs.com/

http://modestmaps.com/

These libraries allow the GBIF layers to be visualized with other content, such as those coming from Web Map Service (WMS) providers. It should be noted that the mapping API is not a WMS service, nor does it support WFS capabilities.
USEFUL TOOLS (JSON & REST)

REST client …
JSON client/parser …
JSONView (Firefox, Chrome, …)
http://jsonview.com/
Display formatted JSON in browser

R CRAN : jsonlite
http://cran.r-project.org/web/packages/jsonlite/
E.g. read json into a dataframe [link]

OpenRefine
http://openrefine.org/
ROPENSCI : RGBIF
library(rgbif)
key <- name_backbone(name='Hepatica nobilis', kingdom='Plantae')$speciesKey
sp <- occ_search(taxonKey=key, return='data', hasCoordinate=TRUE, limit=1000)
gbifmap(sp)
R CRAN

rOpenSci provides programmatic access to scientific data with R (rgbif, taxize, EML, geonames, …).

https://github.com/ropensci
http://ropensci.org/packages/

http://ropensci.org/tutorials/rgbif_tutorial.html
http://ropensci.org/tutorials/taxize_tutorial.html
RASTER : WORLDCLIM, BIOCLIM LAYERS
# bv: a data frame of GBIF occurrences, e.g. the occ_search() result (sp) from the previous slide
library(raster)
xy <- cbind('lon'=bv$decimalLongitude, 'lat'=bv$decimalLatitude);
env <- getData('worldclim', var='bio', res=10) # bioclim (pkg raster)
plot(env, 1) # plot the first bioclim layer
points(xy[,'lon'], xy[,'lat'], col='red') # plot points
bio <- extract(env, xy); # extract environment to points (pkg raster)
bv_bio <- cbind(bv, bio); # column-bind GBIF-data and bioclim
ROPENSCI : RWBCLIMATE
library(rWBclimate)
library(ggplot2) # library() attaches one package per call
country_dat <- get_historical_temp(c("NOR", "SWE", "DNK", "FIN"), "year")
ggplot(country_dat, aes(x = year, y = data, group = locator)) +
theme_bw(base_size=18) + geom_point() + geom_path() +
labs(y="Average annual temperature of Nordic countries", x="Year") +
stat_smooth(se = F, colour = "black") +
facet_wrap(~locator, scale = "free")
RESOLVE TAXONOMIC NAMES
library(taxize) # rOpenSci Taxize
gnr <- gnr_resolve(names = "Beta vuulgariss") # Misspelled name
gnr$results # display suggested names
  submitted_name   matched_name                         data_source_title         score
1 Beta vuulgariss  Beta vulgaris L.                     Catalogue of Life         0.75
2 Beta vuulgariss  Beta vulgaris L.                     ITIS                      0.75
3 Beta vuulgariss  Beta vulgaris                        NCBI                      0.75
4 Beta vuulgariss  Beta vulgaris var.-gr. crassa Alef.  GRIN Taxonomy for Plants  0.75

specieslist <- c("Beta vulgaris", "Phleum pratensis", "Nicotiana glauca")
classification(specieslist, db = 'itis') # lookup higher taxonomy
# alternative sources shown on the slide: db = 'itis', db = 'col'

Global Names Resolver: http://resolver.globalnames.org/
rOpenSci Taxize: http://ropensci.org/tutorials/taxize_tutorial.html
ROPENSCI : EML

library(EML)
library(rfigshare) # library() attaches one package per call

description <- "My dataset published in GBIF"

eml_write(dat = dat, meta, title = "My Dataset",
          description = description,
          creator = "Your Name <[email protected]>", file = "dataset.xml")

eml_publish("dataset.xml", description = description,
            categories = "Ecology", tags = "biodiversity",
            destination = "figshare", visibility = "public")

meta <- eml_read("eml_example.xml")


GBIF API SUPPORT

Subscribe to the mailing-list for help and information messages:
[email protected]
Node team at NHM, University of Oslo
Dag Endresen, Node manager
Christian Svindseth, Database manager

Fridtjof Mehlum, Research director
Einar Timdal, Associate professor
Geir Søli, Associate professor
Vidar Bakken, Consultant

Artsdatabanken, Trondheim
Wouter Koch
Nils Valland

NTNU University Museum
Anders Finstad, GBIF Science committee

Research Council of Norway
Per Backe-Hansen, Head of delegation

Contact us at: [email protected]

