Data Sources

GBIF Data: Accessing 3+ Billion Biodiversity Records

March 9, 2026 10 min read All Levels

The Global Biodiversity Information Facility (GBIF) is the world's largest open-access biodiversity data infrastructure, providing free access to more than three billion species occurrence records from around the globe. For researchers in phylogenetics, biogeography, and conservation, GBIF is an invaluable resource for understanding species distributions.

3B+ Occurrence Records
85K+ Datasets
2,000+ Data Publishers
Free Open Access

What is GBIF?

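GBIF is an international network and research infrastructure funded by governments worldwide. Established in 2001, it aggregates biodiversity data from sources such as natural history museums, herbaria, research institutions, and citizen science platforms.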

Types of Data in GBIF

Occurrence Records

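The primary data type: records of a species being observed or collected at a specific location and time. Each record typically includes a scientific name, geographic coordinates, a date, and the basis of record (for example, a preserved specimen or a human observation).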

Species Checklists

Lists of species known from specific regions or taxonomic groups, useful for biodiversity assessments.

Sampling Events

Structured survey data that includes both presences and absences, enabling more sophisticated analyses.

Accessing GBIF Data

GBIF.org Portal

The easiest way to explore GBIF data is through the GBIF.org website, which offers occurrence search with filters, interactive maps, and downloadable datasets.

GBIF API

For automated data access, GBIF provides RESTful APIs:

# Search for species occurrences (the search API returns at most 300 records per page)
GET https://api.gbif.org/v1/occurrence/search?scientificName=Panthera%20tigris&hasCoordinate=true&limit=300

# Get species information
GET https://api.gbif.org/v1/species/match?name=Panthera%20tigris
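
The requests above can also be assembled programmatically. The following is a minimal Python sketch, standard library only, that builds the same occurrence search URL. The helper name build_occurrence_search_url is hypothetical; the endpoint and the scientificName, hasCoordinate, limit, and offset parameters are those of the GBIF API.

```python
from urllib.parse import urlencode, quote

GBIF_API = "https://api.gbif.org/v1"


def build_occurrence_search_url(scientific_name, limit=300, offset=0, has_coordinate=True):
    """Return an occurrence search URL (hypothetical helper, not part of any GBIF client)."""
    params = {
        "scientificName": scientific_name,
        "hasCoordinate": str(has_coordinate).lower(),  # the API expects true/false
        "limit": limit,
        "offset": offset,
    }
    # quote_via=quote encodes spaces as %20 rather than '+'
    return f"{GBIF_API}/occurrence/search?{urlencode(params, quote_via=quote)}"


url = build_occurrence_search_url("Panthera tigris")
print(url)
```

To fetch the records, the resulting URL can be passed to urllib.request.urlopen and the response decoded with json.load.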

R Package (rgbif)

The rgbif package provides convenient R functions:

library(rgbif)

# Search for tiger occurrences
tigers <- occ_search(
    scientificName = "Panthera tigris",
    hasCoordinate = TRUE,
    limit = 5000
)

# View results
head(tigers$data)
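
rgbif handles paging for you when limit exceeds a single page of results. For readers calling the API directly, the sketch below shows the same idea in Python. It assumes the search response is a JSON object with results and endOfRecords fields, and it takes the page-fetching function as an argument so the loop can be tried without network access. The names fetch_all and fake_fetch are hypothetical.

```python
def fetch_all(fetch_page, max_records=5000, page_size=300):
    """Collect records page by page until the source is exhausted or max_records is reached.

    fetch_page(offset, limit) must return a dict shaped like a GBIF search
    response: {"results": [...], "endOfRecords": bool}.
    """
    records = []
    offset = 0
    while len(records) < max_records:
        limit = min(page_size, max_records - len(records))
        page = fetch_page(offset, limit)
        records.extend(page["results"])
        if page.get("endOfRecords", True) or not page["results"]:
            break
        offset += limit
    return records


# Stand-in for a real HTTP call: serves 700 made-up records.
def fake_fetch(offset, limit):
    data = [{"key": i} for i in range(700)]
    chunk = data[offset:offset + limit]
    return {"results": chunk, "endOfRecords": offset + limit >= len(data)}


all_records = fetch_all(fake_fetch)
print(len(all_records))
```

In real use, fetch_page would request the occurrence search endpoint with the given offset and limit and return the decoded JSON.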

Data Quality Considerations

Not all GBIF records are equally reliable. Common quality issues include:

Coordinate Issues

Taxonomic Issues

Quality Filtering Best Practices

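Always filter GBIF data before analysis. Use filters such as hasCoordinate=true and hasGeospatialIssue=false, restrict coordinateUncertaintyInMeters to an acceptable range (in the API's range syntax, for example coordinateUncertaintyInMeters=0,10000), and check for outliers that fall outside known species ranges.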
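
The same checks can be applied to records after they have been retrieved. The sketch below is one possible version in Python, assuming each record is a dict carrying the decimalLatitude, decimalLongitude, and coordinateUncertaintyInMeters fields returned by the occurrence API. The function names, the sample records, and the bounding box are made up for illustration; a real analysis would use a proper range map rather than a rectangle.

```python
def has_usable_coordinates(record, max_uncertainty_m=10000):
    """True if the record has non-zero coordinates and acceptable uncertainty."""
    lat = record.get("decimalLatitude")
    lon = record.get("decimalLongitude")
    if lat is None or lon is None:
        return False
    if lat == 0 and lon == 0:  # (0, 0) is almost always a data-entry default
        return False
    uncertainty = record.get("coordinateUncertaintyInMeters")
    # Records with no stated uncertainty are kept here; stricter pipelines may drop them.
    return uncertainty is None or uncertainty <= max_uncertainty_m


def within_bbox(record, min_lon, min_lat, max_lon, max_lat):
    """Crude outlier check against a rectangular extent."""
    return (min_lat <= record["decimalLatitude"] <= max_lat
            and min_lon <= record["decimalLongitude"] <= max_lon)


# Made-up records for illustration only.
records = [
    {"key": 1, "decimalLatitude": 22.3, "decimalLongitude": 80.6, "coordinateUncertaintyInMeters": 500},
    {"key": 2, "decimalLatitude": 0.0, "decimalLongitude": 0.0},
    {"key": 3},
    {"key": 4, "decimalLatitude": 27.5, "decimalLongitude": 84.3, "coordinateUncertaintyInMeters": 50000},
    {"key": 5, "decimalLatitude": -10.0, "decimalLongitude": -55.0, "coordinateUncertaintyInMeters": 100},
]
ASIA_BBOX = (60.0, -10.0, 150.0, 60.0)  # illustrative extent, not an actual species range

clean = [r for r in records if has_usable_coordinates(r) and within_bbox(r, *ASIA_BBOX)]
print([r["key"] for r in clean])
```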

GBIF Data Quality Flags

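GBIF automatically flags potential issues in records. Key flags include ZERO_COORDINATE, COUNTRY_COORDINATE_MISMATCH, COORDINATE_ROUNDED, TAXON_MATCH_FUZZY, and RECORDED_DATE_INVALID.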
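
In API responses these flags appear as a list of strings in each record's issues field, so they can be used as a filter. The following Python sketch drops any record carrying a flag from an exclusion set. The choice of which flags to exclude is an assumption made for the example and should be adapted to the analysis; drop_flagged and the sample records are hypothetical.

```python
# Flags treated as disqualifying in this example; adjust to your own needs.
EXCLUDED_ISSUES = {
    "ZERO_COORDINATE",
    "COORDINATE_INVALID",
    "COORDINATE_OUT_OF_RANGE",
    "COUNTRY_COORDINATE_MISMATCH",
    "TAXON_MATCH_FUZZY",
}


def drop_flagged(records, excluded=EXCLUDED_ISSUES):
    """Keep only records whose issues list shares no flag with the excluded set."""
    return [r for r in records if not excluded.intersection(r.get("issues", []))]


# Made-up records for illustration only.
sample = [
    {"key": 1, "issues": []},
    {"key": 2, "issues": ["COORDINATE_ROUNDED"]},
    {"key": 3, "issues": ["ZERO_COORDINATE", "COORDINATE_ROUNDED"]},
    {"key": 4, "issues": ["TAXON_MATCH_FUZZY"]},
    {"key": 5},
]
kept = drop_flagged(sample)
print([r["key"] for r in kept])
```

Here COORDINATE_ROUNDED is tolerated because it usually signals a formatting change rather than a wrong location, but that is a judgment call.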

Using GBIF Data for Research

Species Distribution Modeling

GBIF occurrence data is widely used for building species distribution models (SDMs) with algorithms such as MaxEnt and Bioclim, and with tuning and evaluation tools such as ENMeval. Combine filtered occurrences with environmental layers to predict suitable habitat.

Biogeographic Analysis

Map species distributions onto phylogenetic trees to infer ancestral ranges, detect dispersal events, and test biogeographic hypotheses.

Conservation Prioritization

Identify areas of high species richness, locate populations of threatened species, and assess habitat connectivity.
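
As a rough illustration of the richness step, the Python sketch below counts distinct species per grid cell. It assumes each record is a dict with the species, decimalLatitude, and decimalLongitude fields found in occurrence responses. The function name richness_by_cell, the one-degree cell size, and the sample records are assumptions for the example. Raw counts like these reflect sampling effort as much as true richness, so treat them as a first look only.

```python
import math
from collections import defaultdict


def richness_by_cell(records, cell_deg=1.0):
    """Map (lat_index, lon_index) grid cells to the number of distinct species seen there."""
    cells = defaultdict(set)
    for r in records:
        cell = (
            math.floor(r["decimalLatitude"] / cell_deg),
            math.floor(r["decimalLongitude"] / cell_deg),
        )
        cells[cell].add(r["species"])
    return {cell: len(species) for cell, species in cells.items()}


# Made-up records for illustration only.
observations = [
    {"species": "Panthera tigris", "decimalLatitude": 22.3, "decimalLongitude": 80.6},
    {"species": "Panthera pardus", "decimalLatitude": 22.7, "decimalLongitude": 80.1},
    {"species": "Panthera tigris", "decimalLatitude": 22.9, "decimalLongitude": 80.9},
    {"species": "Panthera onca", "decimalLatitude": -10.5, "decimalLongitude": -55.2},
]
richness = richness_by_cell(observations)
print(richness)
```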

Climate Change Research

Use historical occurrence records to detect range shifts over time and project future distributions under climate scenarios.
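
A very simple first look at range shifts is to compare the mean latitude of records across decades. The Python sketch below assumes each record is a dict with the year and decimalLatitude fields from the occurrence API; mean_latitude_by_decade and the sample values are hypothetical. Changes in where and how intensively people sample can produce the same pattern, so this is an exploratory summary, not evidence of a shift on its own.

```python
from collections import defaultdict
from statistics import mean


def mean_latitude_by_decade(records):
    """Return {decade: mean latitude}, skipping records without a year or latitude."""
    by_decade = defaultdict(list)
    for r in records:
        if r.get("year") is None or r.get("decimalLatitude") is None:
            continue
        decade = (r["year"] // 10) * 10
        by_decade[decade].append(r["decimalLatitude"])
    return {decade: mean(lats) for decade, lats in sorted(by_decade.items())}


# Made-up records for illustration only.
history = [
    {"year": 1965, "decimalLatitude": 40.0},
    {"year": 1968, "decimalLatitude": 42.0},
    {"year": 2011, "decimalLatitude": 44.0},
    {"year": 2015, "decimalLatitude": 46.0},
    {"year": None, "decimalLatitude": 50.0},
]
trend = mean_latitude_by_decade(history)
print(trend)
```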

Citing GBIF Data

When using GBIF data in publications, proper citation is essential. Each occurrence download is assigned its own DOI, which should be cited in this form:

GBIF.org (09 March 2026) GBIF Occurrence Download
https://doi.org/10.15468/dl.xxxxx
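
If you generate many downloads, the citation string can be produced in code. This Python sketch reproduces the format shown above; format_gbif_citation is a hypothetical helper, and the DOI is the same placeholder used in the example citation, to be replaced with the DOI of your own download.

```python
from datetime import date


def format_gbif_citation(download_date, doi):
    """Build a citation line for a GBIF occurrence download (hypothetical helper)."""
    # %B gives the English month name under Python's default locale.
    return (
        f"GBIF.org ({download_date.strftime('%d %B %Y')}) "
        f"GBIF Occurrence Download https://doi.org/{doi}"
    )


# Placeholder DOI copied from the example citation, not a real download.
citation = format_gbif_citation(date(2026, 3, 9), "10.15468/dl.xxxxx")
print(citation)
```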

Search GBIF with PhyloVerse

Access GBIF occurrence data directly within PhyloVerse. Search by taxon name, visualize distributions on interactive maps, and integrate with your phylogenetic analyses.


Beyond GBIF: Other Data Sources

While GBIF is the largest aggregator, other valuable biodiversity data sources include iNaturalist, eBird, OBIS, and iDigBio.

Conclusion

GBIF has transformed biodiversity research by making occurrence data freely available to anyone. Whether you're modeling species distributions, testing biogeographic hypotheses, or planning conservation actions, GBIF provides the foundational data you need. By understanding data quality issues and applying appropriate filters, you can leverage this remarkable resource for rigorous scientific research.