Example data files are included with the BuzzardsBay package to facilitate code development and testing. You can also use this example data to play with the package functions independent of a pre-existing or important data archive.
All the example files
To copy all of the package's example data into one large example data store with many files:
paths <- setup_example_dir()
By default it will be set up in a path created by tempdir(), but you can also use the parent_dir argument to specify where you want it created.
A subset of example files
You can use combinations of the filter arguments to specify a subset of the data you want to include in the example data store. This will be quicker and use less disk space than including all the files.
For example, to get a single deployment:
paths <- setup_example_dir(site_filter = "RB1", deployment_filter = "2023-06-09")
print(paths$deployments)
qc_deployment(paths$deployments[1])
Or to get all the deployments for site WH1X:
paths <- setup_example_dir(site_filter = "WH1X", delete_old = TRUE)
print(paths$deployments)
delete_old = TRUE was added to clear out the prior example data and is useful if you want to reset the example, perhaps clearing out previously run reports and QC files.
Available example datasets
These are grouped into examples for the analysis module and the QC module. The formats of the two sets of files are different, so each example deployment will only work with one of the two sets of functions.
Aggregation and Analysis Module
For these deployments the test data consists of the final QC CSV and YAML files but NOT the calibration files. They are to test report_site(), check_site(), and stitch_site(). They don’t work with qc_deployment().
QC Module
These files work with qc_deployment(). There are a ton of files here to support the three different import types and to demonstrate issues that have since been resolved.
Import Type 0
This was the last import type added, but I assigned it a 0 as it’s a simple fallback import. It expects one CSV and one YAML file.
Import Type 1 - U24 and U26
This was the original import type supported by the package. These are based on two separate DO and Cond. loggers. For each logger there is a CSV with the data and a details text file with metadata from the calibration.
OB1 2024-05-21, 2024-05-31
These two were added to resolve an issue where a sensor was swapped and placements.csv was updated to indicate the swap, BUT qc_deployment() was still throwing errors.
OB1 2024-07-30
The conductivity meter had issues, so a fixed (average) salinity was used for calibration, which broke the QC code.
SB2X 2024-05-15, 2024-06-10
Conductivity was calibrated with a single point calibration on these two dates instead of the standard 2-point calibration.
BD1 2024-06-21
This sensor produced a weird column name:
From Kristin:
I was working through BD1 and got the error message:
`Error in qc_deployment(deployment_dir) : The calibration data data is
missing some expected columns: "High_Range"`
I believe it comes down to the fact that whenever I export .csv files from
this particular sensor, it names the column "HIgh Range High Range (?S/cm)"
as opposed to the normal "High Range (?S/cm)." If I try to manually adjust
this, it screws up the dates saved in the .csv and I think spawns some other
issues with the code. I am going to try to figure out why this particular
data file exports that heading slightly differently, but could the code be
adjusted to still recognize it if "High Range" is repeated?
If you want to look at the data I'm working with, it's BD1 2024-06-21.
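One way to tolerate the repeated heading described above is to collapse it before checking for the expected columns. The following is a minimal sketch in base R, not the package's actual fix; the col_names vector is made-up illustration data and only the "High Range" heading comes from the message above.

```r
# Hypothetical column names as they might come out of the exported CSV
col_names <- c("Date Time", "HIgh Range High Range (?S/cm)", "Temp")

# Collapse "High Range" repeated one or more times (any letter case)
# into a single, consistently capitalized "High Range"
fixed <- gsub("(high range)(\\s+high range)+", "High Range",
              col_names, ignore.case = TRUE)

print(fixed)
```

Names that do not contain the repeated phrase pass through unchanged, so the same step could be applied to every file regardless of which sensor produced it.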
OB9 2024-07-23
Added to resolve a plot y axis limit that expanded to include -888.88 instead of treating that data as NA and thus not plotting it.
"OB9 for 2024-07-23 is a really good example of a dataset that has a sensor
malfunction written in, so the graphs are difficult to read."
(the sensor error code -888.88 is getting plotted resulting in a massive
yrange on the plot).
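The idea of treating the error code as missing data can be sketched in base R as follows. This is only an illustration under assumed names: do_mg_l is a made-up vector, and the package's own handling may differ.

```r
# Hypothetical readings; -888.88 is the sensor error code mentioned above
do_mg_l <- c(7.9, 8.1, -888.88, 8.3, -888.88, 7.6)
error_code <- -888.88

# With the error code left in, the lower limit is dragged down to -888.88
raw_limits <- range(do_mg_l)

# Recode the error code as NA (compared with a small tolerance)
do_clean <- do_mg_l
do_clean[abs(do_clean - error_code) < 1e-6] <- NA

# Limits computed from the remaining values only
y_limits <- range(do_clean, na.rm = TRUE)
print(y_limits)
```

Once the values are NA, base plot() skips them and derives its default axis limits from the finite values alone.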
Import Type 2 - MX801
BBC 2025-01-04, 2025-01-26, 2025-01-27
Example output from the new MX801 logger, which has a different format from the U24 and U26 loggers and logs both conductivity and DO on one device. There are several possible calibration configurations:
- 2025-01-04 (2, 2) - two-point calibration for both DO and Cond. I think this is the most common type.
- 2025-01-26 (2, 3) - two points for DO and three points for Cond. calibration.
- 2025-01-27 (2, 1) - one point for DO and two points for Cond. calibration.
Note, I think BBC is a fake site. These are real data from short tests of the loggers.
See the BBC README.md file for more details.
Tide Rider
WFH 2024-04-09
This is the first example of tide rider output I received. The TR_WFH_20240409_TRSX01.csv file was from Michael Jakuba and was emailed to me by Kristin Huizenga on 2024-12-05.
It contained high-frequency observations that I resampled at 10-minute intervals. The other calibrated data files are hacked copies from a separate deployment.
See data-raw/make_fake_tide_rider_calibration_dir.R for how these files were generated.
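For readers curious what resampling to 10-minute intervals can look like, here is a minimal base R sketch. It is an assumption, not the method in the script referenced above: it keeps only the readings that fall exactly on a 10-minute mark, and the obs data frame is made-up illustration data.

```r
# Hypothetical high-frequency series: one reading every 30 seconds for an hour
start <- as.POSIXct("2024-04-09 00:00:00", tz = "UTC")
obs <- data.frame(
  time  = start + seq(0, by = 30, length.out = 120),
  value = seq_len(120)
)

# Keep rows whose timestamp is a whole multiple of 600 seconds (10 minutes)
on_mark <- as.numeric(obs$time) %% 600 == 0
resampled <- obs[on_mark, ]

print(resampled)
```

Subsampling like this preserves the original readings. Averaging within each 10-minute window is an alternative that smooths out noise at the cost of altering the values.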
I decided to wait until I had real logger and tide rider data collected at the same time before implementing the tide rider module.