This function generates and saves data about a dataset's Particle Size Distribution (PSD) from Imaging FlowCytobot (IFCB) feature and hdr files, which can be used for data quality assurance and quality control.
Usage
ifcb_psd(
  feature_folder,
  hdr_folder,
  save_data = FALSE,
  output_file = NULL,
  plot_folder = NULL,
  use_marker = FALSE,
  start_fit = 10,
  r_sqr = 0.5,
  beads = NULL,
  bubbles = NULL,
  incomplete = NULL,
  missing_cells = NULL,
  biomass = NULL,
  bloom = NULL,
  humidity = NULL,
  micron_factor = 1/3.4
)
Arguments
- feature_folder
The absolute path to a directory containing all of the v2 feature files for the dataset.
- hdr_folder
The absolute path to a directory containing all of the hdr files for the dataset.
- save_data
A boolean indicating whether to save data to CSV files. Default is FALSE.
- output_file
A string with the base file name for the .csv to use (including path). Set to NULL to not save data (default).
- plot_folder
The folder where graph images for each file will be saved. Set to NULL to not save graphs (default).
- use_marker
A boolean indicating whether to show markers on the plot. Default is FALSE.
- start_fit
An integer indicating the ESD value at which the curve fit starts. Default is 10.
- r_sqr
The lower limit of acceptable R^2 values (any curves below it will be flagged). Default is 0.5.
- beads
The maximum multiplier for the curve fit. Any files with higher curve fit multipliers will be flagged as bead runs. If this argument is included, files with "runBeads" marked as TRUE in the header file will also be flagged as a bead run. Optional.
- bubbles
The minimum difference between the starting ESD and the ESD with the most targets. Any files with a difference higher than this threshold will be flagged as mostly bubbles. Optional.
- incomplete
A numeric vector of length two, such as c(1500, 3), giving the minimum volume of cells (in c/L) and the minimum volume analyzed (in mL) for a complete run. Any files with values below these thresholds will be flagged as incomplete. Optional.
- missing_cells
The minimum ratio of image count to trigger count. Any files with a lower ratio will be flagged as missing cells. Optional.
- biomass
The minimum number of targets in the most populated ESD bin for any given run. Any files with fewer targets will be flagged as having low biomass. Optional.
- bloom
The minimum difference between the starting ESD and the ESD with the most targets. Any files with a difference less than this threshold will be flagged as a bloom. Will likely be lower than the bubbles threshold. Optional.
- humidity
The maximum percent humidity. Any files with higher values will be flagged as high humidity. Optional.
- micron_factor
The conversion factor to microns. Default is 1/3.4.
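The threshold arguments above can be read as simple comparisons on per-file summary values. The following is a minimal sketch in base R, not the PSD implementation: the data frame, its column names, and all values are hypothetical and serve only to illustrate how missing_cells, humidity, and micron_factor might be interpreted. It also assumes that micron_factor is expressed in microns per pixel.

```r
# Hypothetical per-file summary values (not an object returned by ifcb_psd)
runs <- data.frame(
  file          = c("sample_A", "sample_B", "sample_C"),
  image_count   = c(900, 300, 1200),
  trigger_count = c(1000, 1000, 1250),
  humidity      = c(45, 72, 50),
  stringsAsFactors = FALSE
)

# Thresholds in the spirit of the arguments described above
missing_cells <- 0.7  # minimum image-to-trigger ratio
humidity_max  <- 60   # maximum percent humidity (illustrative value)

# A file is flagged when its ratio falls below the minimum
runs$ratio              <- runs$image_count / runs$trigger_count
runs$flag_missing_cells <- runs$ratio < missing_cells

# A file is flagged when its humidity exceeds the maximum
runs$flag_humidity <- runs$humidity > humidity_max

# Assumption: micron_factor converts a length in pixels to microns
micron_factor <- 1 / 3.4
diameter_px   <- c(34, 68, 340)
diameter_um   <- diameter_px * micron_factor

print(runs[, c("file", "ratio", "flag_missing_cells", "flag_humidity")])
print(diameter_um)
```

In this made-up data only sample_B is flagged, both for a low image-to-trigger ratio and for high humidity.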
Details
The PSD function originates from the PSD Python repository (Hayashi et al. in prep), which can be found at https://github.com/kudelalab/PSD.
This function requires a Python interpreter to be installed. The required Python packages can be installed in a virtual environment using ifcb_py_install.
The function requires v2 features generated by the ifcb-analysis MATLAB package (Sosik and Olson 2007) found at https://github.com/hsosik/ifcb-analysis.
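To give a feel for what the start_fit and r_sqr arguments control, the sketch below fits a curve to a synthetic size distribution and compares its R^2 with a threshold. This page does not specify the model used by the PSD repository, so the power-law form, the log-log linear regression, and all data here are assumptions made for illustration only.

```r
# Synthetic size distribution: counts per ESD bin following roughly a
# power law, with a small deterministic perturbation (made-up data)
esd    <- 1:50
counts <- 5000 * esd^(-2) * (1 + 0.05 * sin(esd))

# Only bins at or above start_fit are used for the fit
start_fit <- 10
keep      <- esd >= start_fit
log_esd   <- log(esd[keep])
log_count <- log(counts[keep])

# Assumed model: counts = multiplier * esd^exponent,
# fitted as a straight line in log-log space
fit        <- lm(log_count ~ log_esd)
multiplier <- exp(coef(fit)[[1]])
exponent   <- coef(fit)[[2]]
r_squared  <- summary(fit)$r.squared

# A fit with R^2 below r_sqr would be flagged
r_sqr       <- 0.5
flag_low_r2 <- r_squared < r_sqr

print(c(multiplier = multiplier, exponent = exponent, r_squared = r_squared))
print(flag_low_r2)
```

Under these assumptions, the fitted multiplier is the kind of quantity that a threshold such as beads would be compared against.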
References
Hayashi, K., Walton, J., Lie, A., Smith, J. and Kudela, M. Using particle size distribution (PSD) to automate imaging flow cytobot (IFCB) data quality in coastal California, USA. In prep.
Sosik, H. M. and Olson, R. J. (2007), Automated taxonomic classification of phytoplankton sampled with imaging-in-flow cytometry. Limnol. Oceanogr: Methods 5, 204–216.
Examples
if (FALSE) { # \dontrun{
# Initialize the python session if not already set up
ifcb_py_install()

ifcb_psd(
  feature_folder = 'path/to/features',
  hdr_folder = 'path/to/hdr_data',
  save_data = TRUE,
  output_file = 'psd/svea_2021',
  plot_folder = 'psd/plots',
  use_marker = FALSE,
  start_fit = 13,
  r_sqr = 0.5,
  beads = 10^9,
  bubbles = 150,
  incomplete = c(1500, 3),
  missing_cells = 0.7,
  biomass = 1000,
  bloom = 5,
  humidity = NULL,
  micron_factor = 1/3.0
)
} # }
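After a run with save_data = TRUE, the saved CSV files use the base name given in output_file. The exact file name suffixes are not listed on this page, so the sketch below only assumes that each saved file starts with that base name and ends in .csv. The helper find_psd_csv and the demo file names are hypothetical and not part of the package.

```r
# Hypothetical helper: list CSV files that share the output_file base name
find_psd_csv <- function(output_file) {
  files <- list.files(dirname(output_file), pattern = "\\.csv$",
                      full.names = TRUE)
  files[startsWith(basename(files), basename(output_file))]
}

# Demonstration with empty placeholder files in a temporary directory
demo_dir <- file.path(tempdir(), "psd_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("svea_2021_example.csv", "other_run.csv")))

found <- find_psd_csv(file.path(demo_dir, "svea_2021"))
print(basename(found))
```

Each path in found could then be passed to read.csv to inspect the saved results.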