-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path01_DataAccess.qmd
114 lines (76 loc) · 6.19 KB
/
01_DataAccess.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
---
title: "DataAccess"
format: html
editor: visual
editor_options:
chunk_output_type: console
---
# Data Access and Cleaning
Before you get started, run the `setup.R` script to install and load the necessary packages and functions for this analysis.
```{r setup, include=FALSE}
source("01_Setup.R")
```
## 1. British Columbia Coastal Waterbird Survey (BCCWS)
### 1.1 Protocol
BCCWS data collection protocol can be found online [here](https://www.birdscanada.org/bird-science/british-columbia-coastal-waterbird-survey/bccws_resources).
In short, surveys have been conducted by volunteers using a standardized protocol and data collection [sheets](https://birdscanada.b-cdn.net/wp-content/uploads/2021/02/BCCWS_Datasheet.pdf). Shore-based counts are completed monthly on or near the second Sunday of each month from September to April. Surveys are complete within approximately 2 hours of high tide to maximize the opportunity for close observation. All waterbirds observed to a distance of 1 km from the high tide line are counted, except those that fly through without stopping. In the case of larger flocks, numbers are estimated by counting individuals and species in groups and scaling up (see [Training Module for Volunteers](https://birdscanada.b-cdn.net/wp-content/uploads/2020/02/BCCWS-Training-Module.pdf)). Data are entered through a customized online data entry system available on the Birds Canada website, [NatureCounts](https://birdscanada.github.io/www.birdscanada.%20org/birdmon/default/main.jsp). Observations are processed using the eBird data filters to flag rare species and high counts during observer data entry, and records are manually reviewed for form accuracy.
The data are collected using a standardize protocol, by trained citizen-science volunteers. This standardization is a strength of this data set for making inferences about coastal waterbirds in the Canadian Salish Sea.
### 1.2 Data Collected
Observation counts of waterbirds and raptor seen during a survey are compiled at the scale of the route (i.e., the maximum count per species) on each monthly survey. These observations are divided into inland, near shore (shoreline to 500m out from high tide), off shore (beyond 500m), and total counts. The dataset is not zero-filled.
Auxiliary Data Collected:
- Observer information: observer ID
- Survey information: time observation started, time observation ended, duration in hours
- Survey condition: precipitation, % cloud, sea condition, tide state, tide movement, visibility, survey equipment, human activity (all categorical)
**Duration in Hours** of each survey is an important variable to consider when analyzing trends. Longer surveys may detect more birds. There is substantial variation in survey duration for the BCCWS and the relationship for a lot of species looks non-linear. I.E., There are diminishing returns to survey duration. We therefore suggest that this is modeled as a covariate in the model and not an offset.
### 1.2 Data Access
Data can be freely accessed through the NatureCounts data [download](https://naturecounts.ca/nc/default/searchquery.jsp) portal or directly through the naturecounts R package. The BCCWS is Access Level 4 dataset, meaning a data request form must be submitted. This is not meant to be a barrier, rather a means of keeping track of who is using the data and for what purposes.
Data are formatted using a standardized schema that is a core standard of the [Avian Knowledge Network](https://avianknowledge.net/) and which feeds into [GBIF](https://www.gbif.org/). This format is called the Bird Monitoring Data Exchange ([BMDE](https://naturecounts.ca/nc/default/nc_bmde.jsp)), which includes 169 core fields for capturing all metric and descriptors associated with bird observations.
```{r BCCWS Data Download}
#sample code to access BCCWS data from NatureCounts
BCCWS<-nc_data_dl(collection="BCCWS", username = "YOUR USERNAME", info="MY REASON", fields_set = "core", request_id = 12345) #you will be prometed for your password
#Write to Data folder in working directory
write.csv(BCCWS, "Data/BCCWS.csv", row.names = FALSE)
#read from Data folder in working directory
BCCWS<-read.csv("Data/BCCWS.csv")
```
### 1.3 Data Clear
```{r}
#Manually Specify the start and end year of the analysis
#Keep in mind that this is the winter year (wyear) which is the start year of the survey, #The survey straddles two calendar years
Y1 = 1999 #start year of survey
Y2 = 2023
#Run the BCCWS cleaning scripts.
#Your output will include the data you need for an analysis of trends.
source("02_BCCWSClean.R")
# To write to local Data directory for analysis
write.csv(in.BCCWS, "Data/in.BCCWS.csv", row.names = FALSE)
write.csv(event.BCCWS, "Data/event.BCCWS.csv", row.names = FALSE)
```
## 2. Sampling Events Plot
Now we will plot the distribution of sampling events over the extent of the Salish Sea. This will be facets by year (wyear) so that changes in sampling effort can be spatailly visualized. Each survey program will be given a different colout.
```{r}
#Convert the data to a spatial object
events_sf <- st_as_sf(event.BCCWS, coords = c("DecimalLongitude", "DecimalLatitude"), crs = 4326)
p<-ggplot(data = events_sf) +
# Select a basemap
annotation_map_tile(type = "cartolight", zoom = NULL, progress = "none") +
# Plot the points, color-coded by survey_year
geom_sf(aes(color = as.factor(wyear)), size = 1) +
# Facet by survey_year to create the multi-paneled map
facet_wrap(~ wyear) +
# Add a theme with a minimal design and change the font styles, to your preference
theme_minimal() +
#theme(legend.position = "bottom") +
# To make the points in the legend larger without affecting map points
guides(color = guide_legend(override.aes = list(size = 3))) +
#make the text on the x-axis vertical
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
# Define the title and axis names
labs(title = "Coastal Waterbird Survey Events in the Salish Sea",
x = "Longitude",
y = "Latitude")+
#Define the legend title
scale_color_discrete(name = "Winter Year")
ggsave("output/Plots/BCCWSEventsMap.jpeg", plot = p, width = 10, height = 10, units = "in", dpi = 300)
```
Once your data cleaning is complete you are ready to move onto 02_DataAnalysis.