--- title: "Introduction to lifeR" author: "Jeffrey C. Oliver" date: "1 June, 2022" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Introduction to lifeR} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup, echo = FALSE} library(lifeR) ``` # Summary lifeR is an R package for identifying locations to visit in order to increase your bird species list count. The package relies on the [eBird API](https://documenter.getpostman.com/view/664302/S1ENwy59) to query for recent observations and compare them to a user's species list. The lists can be life lists, year lists, etc. The primary output is a report of sites to visit in a region where species _not_ on the user's list have been reported recently. # Installation You can install lifeR from CRAN via: ```{r install-cran, eval = FALSE} install.packages("lifeR") ``` Alternatively, you can download the development version from GitHub with the help of the remotes package: ```{r install-github, eval = FALSE} install.packages("remotes") remotes::install_github(repo = "jcoliver/lifeR") ``` If you install from GitHub and want to have the introductory vignette included in the installation, then pass `build_vignettes = TRUE` in the call to `install_github()`: ```{r install-vignette, eval = FALSE} remotes::install_github(repo = "jcoliver/lifeR", build_vignettes = TRUE) ``` # Using lifeR ## Requirements You will need a minimum of three pieces of information to use lifeR: 1. An eBird API key 2. A list of species you have seen (life list, year list, etc.) 3. A pair of latitude and longitude coordinates to center the search for potential locations. ### 1. eBird API key Because lifeR relies on the public eBird API, you will need to have your own unique API key to use this package. Visit [https://ebird.org/api/keygen](https://ebird.org/api/keygen) to get an eBird API key (note you will be prompted to log into your eBird account). The key will be a random string of letters and numbers. I suggest you copy the key and save it to a plain text file on your computer; you can use a program like Notepad or TextEdit to create plain text files. Save this file somewhere you will remember (like "My Documents") and name it something that is recognizable (e.g. "ebird-api-key.txt"). We will use this file later to access the eBird API. ### 2. Your species list This should be a list of the species you have already seen, but it can vary based on the list you would like to add species to. If you want to find sites where birds you have never seen have been observed recently, you would use your life list. If you are working on a Big Year, you would use your year list. Trying to break 300 species for your Arizona year list? Then yes, use your Arizona species list for the current year. These can all be downloaded from your eBird account by going to [https://ebird.org/lifelist/](https://ebird.org/lifelist/), using the dropdown menus near the top of the page to select the region and time range of your list, then downloading the list through the "Download (csv)" link near the upper-right corner of the page. Save the file somewhere where you will remember where it is. ### 3. Latitude and longitude Finally, lifeR is designed to identify sites in a region for you to visit. Therefore, you will need to provide information on _where_ that region is. The package uses a latitude and longitude coordinates to serve as the "center" of the region of interest and finds sites within a certain radius of that center. The maximum radius is 50 kilometers - if you want to search a larger area, you will just need to define two centers (information on doing this is provided below). If you do not know the latitude and longitude of your "center" I recommend Google Maps, which allows you to right- or control-click on a location on a map, and the latitude and longitude coordinates will be displayed. Note lifeR requires decimal degree formatted data (e.g. -110.92, _not_ 110° 55' 12"). ## Create the report Now that you have those three pieces of information, you can build a report of sites to visit. ### A minimalist example This example creates a report from a single center, near McCall, Idaho, USA. It uses one function from lifeR: + `SitesReport` a function that handles all the querying of eBird, comparing lists, and building the report ```{r example-1, eval = FALSE} library(lifeR) # To use the sample list included in this package list_file <- system.file("extdata", "example-list.csv", package = "lifeR") # If you have your list file downloaded, replace the line above with one that # indicates the location of your list file, e.g. # list_file <- "~/Desktop/ebird_world_year_2021_list.csv" # Read the list of species into memory user_list <- read.csv(file = list_file) # Extract the common names of species from your list my_species <- user_list$Common.Name # Read in eBird API key from a text file; replace the argument to file with # the actual location of your eBird key file key <- scan(file = "ebird-api-key.txt", what = "character") # A single center requires vector of coordinates # Change these, unless you really want to go birding near McCall, Idaho locs <- c(45, -116) SitesReport(centers = locs, ebird_key = key, species_seen = my_species) ``` The resulting report will be called Goals-Report.html and you can open it in your favorite web browser. NB: If the list file you are using was downloaded prior to 2023, the format was different - the common and scientific names were combined in a single column (called `Species`) that needed to be split. If your list is one of these older downloads, you will need to replace this line: ```{r, current-species, eval = FALSE} my_species <- user_list$Common ``` with this: ```{r, older-species, eval = FALSE} my_species <- SplitNames(x = user_list$Species)$Common ``` The line above will use `SplitNames()`, a helper function to deal with the format of species names returned from eBird for older downloads. Note this is only necessary if the _download_ was prior to 2023. If you are using your list from 2015 that you downloaded yesterday, you do not need to use the `SplitNames()` function (i.e. just use `my_species <- user_list$Common`). ### Multiple sites, with names The example above only uses one center on which to base searches and does not provide a name for that center (it will be given the default name of "Center 1" in the report). In the example below, we provide two starting points. Note that a list of nearby sites _for each starting point_ will be included in the results. That is, the report will include the top five sites near the first set of latitude and longitude coordinates and the top five sites near the second set of latitude and longitude coordinates. The example assumes you have loaded in your species list and the location of your eBird API key file as in the first example, above. ```{r example-2, eval = FALSE} # For more than one location, centers can be a matrix or a data frame, here # we use a matrix of two sites loc_mat <- matrix(data = c(39.5, -118.8, 39, -119.1), nrow = 2, byrow = TRUE) # Instead of default "Center 1" and "Center 2", we can use custom names loc_names <- c("Fallon", "Yerington") # Sites report now uses loc_names in the output SitesReport(centers = loc_mat, ebird_key = key, species_seen = my_species, center_names = loc_names) ``` ### Multiple sites from a data frame, narrower range, and informative report file name The `SitesReport` can also accommodate a data frame of latitude and longitude coordinates. You can also give your report a different name (with `report_filename`) and specify where it is saved (with `report_dir`). ```{r example-3, eval = FALSE} loc_df <- data.frame(latitude = c(39.5, 39, 40), longitude = c(-118.8, -119.1, -118.6)) loc_names <- c("Fallon", "Yerington", "Humbolt Wildlife") # We can set the area to search by passing values to the dist argument SitesReport(centers = loc_df, ebird_key = key, dist = 25, species_seen = my_species, center_names = loc_names, report_filename = "Nevada-sites", report_dir = "~/Desktop") # Saves report to desktop ``` ### Helpful tips + This should go without saying, but you are very unlikely to see _all_ of the species that have been identified as being recently observed at a given site. There might be recent observations that result in a [Northern Goshawk](https://ebird.org/species/norgos) showing up in the output report, but lifeR makes no guarantees that you will actually see that godforsaken bird. + Along similar lines, this report makes no claims on the veracity of observations on which the top sites are identified. Difficult to identify species or species only detected by call or song often find their way onto these reports. We leave it to individual users to decide if it is worth the trip to a site where someone claims to have identified a Chihuahuan Raven preening its pale neck feathers amongst an unkindness of Common Ravens. + The longer the list you use (e.g. your life list, your year list, etc), the fewer sites/species will be identified in the report. If you find you are running `SitesReport` and few or no sites are being returned, try: + A shorter, more restrictive list (e.g. use your year list instead of your life list) + A larger radius to search (up to a maximum of 50km) + Additional centers to start your search from (i.e. one that is 100km from your home base) ## How does it work? The main goal, identifying sites to visit to find species you have yet to see, uses eBird's API to query for recent local observations. Specifically, the `SitesReport` function: 1. Queries eBird to find all species seen within a certain radius of the latitude and longitude coordinates you provide, 2. Compares the list of recently observed species to the list you provide (e.g. your year list), to identify the unseen species, 3. For each unseen species, queries eBird again, this time to find all the nearby _sites_ at which that species was recently seen, 4. Identifies the sites with the most unseen species, 5. Creates maps, tables, and the final report. Perhaps you are asking yourself "what is the difference between steps 1 and 3?" The two separate queries are required because the first only returns the most recent observation of each species for a given region. That is, even though the [Black-bellied Whistling Duck](https://ebird.org/species/bbwduc) has been seen at 14 nearby sites, only the most recent observation (and therefore locality information) is returned by eBird in step 1. It is not until step 3 where we retrieve _all_ the sites at which that species was recently seen. ## What doesn't the package do? + **Connect to your eBird account**. The eBird API is designed to query publicly available data about observations. It does not provide a way to query about individual users' accounts. It is for this reason that you are responsible for downloading your species list and feeding it to `SitesReport()`. + **Include additional measures of confidence**. It would be nice if lifeR could include some estimate of the likelihood of seeing each species based on how many people have seen it or the average number of individuals observed. It _would_ be nice, but those data are not available through the eBird API. The only information that is returned by eBird is details for the most recent observation of a particular species at a particular site; that is, there is no means of determining if a species was seen by 100% of the observers on a particular day, or by 5% of the observers.