
This function calculates aggregated biovolumes and carbon content from Imaging FlowCytobot (IFCB) samples based on biovolume information from feature files. Images are grouped into classes based on MATLAB classification files, manual annotation files, or a user-supplied list of images and their corresponding class labels (e.g. from a CNN model).

Usage

ifcb_summarize_biovolumes(
  feature_folder,
  mat_folder = NULL,
  class2use_file = NULL,
  hdr_folder = NULL,
  custom_images = NULL,
  custom_classes = NULL,
  micron_factor = 1/3.4,
  diatom_class = "Bacillariophyceae",
  marine_only = FALSE,
  threshold = "opt",
  feature_recursive = TRUE,
  mat_recursive = TRUE,
  hdr_recursive = TRUE,
  use_python = FALSE,
  verbose = TRUE
)

Arguments

feature_folder

Path to the folder containing feature files (e.g., CSV format).

mat_folder

(Optional) Path to the folder containing MATLAB classification or manual annotation files.

class2use_file

(Optional) A character string specifying the path to the file containing the class2use variable (default NULL). Only needed when summarizing manual MATLAB results.

hdr_folder

(Optional) Path to the folder containing HDR files. Needed for calculating cell, biovolume and carbon concentration per liter.

custom_images

(Optional) A character vector of image filenames in the format DYYYYMMDDTHHMMSS_IFCBXXX_ZZZZZ, where "XXX" represents the IFCB number and "ZZZZZ" represents the ROI number. These filenames should match the roi_number assignment in the feature_files and can be used as a substitute for MATLAB files.

custom_classes

(Optional) A character vector of corresponding class labels for custom_images.

micron_factor

Conversion factor from pixels to microns, expressed in microns per pixel (default: 1/3.4).

diatom_class

A character vector of diatom class names in the World Register of Marine Species (WoRMS). Default is "Bacillariophyceae".

marine_only

Logical. If TRUE, restricts the WoRMS search to marine taxa only. Default is FALSE.

threshold

Threshold for classification (default: "opt").

feature_recursive

Logical. If TRUE, the function will search for feature files recursively within the feature_folder. Default is TRUE.

mat_recursive

Logical. If TRUE, the function will search for MATLAB files recursively within the mat_folder. Default is TRUE.

hdr_recursive

Logical. If TRUE, the function will search for HDR files recursively within the hdr_folder (if provided). Default is TRUE.

use_python

Logical. If TRUE, attempts to read the .mat file using a Python-based method. Default is FALSE.

verbose

A logical indicating whether to print progress messages. Default is TRUE.

Value

A data frame summarizing aggregated biovolume and carbon content per class per sample. Columns include 'sample', 'classifier', 'class', 'biovolume_mm3', 'carbon_ug', 'ml_analyzed', 'biovolume_mm3_per_liter', and 'carbon_ug_per_liter'.

Details

This function performs the following steps:

  1. Extracts biovolumes and carbon content from feature and classification results using ifcb_extract_biovolumes.

  2. Optionally incorporates volume data from HDR files to calculate volume analyzed per sample.

  3. Computes biovolume and carbon content per liter of sample analyzed.
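The aggregation and per-liter steps above can be sketched in base R. This is only an illustration of the arithmetic, not the package's internal code: the `rois` and `volumes` data frames and their values are hypothetical, and only the column names are borrowed from the Value section.

```r
# Hypothetical per-ROI results (values are made up for illustration)
rois <- data.frame(
  sample = rep("D20220522T003051_IFCB134", 3),
  class = c("Mesodinium_rubrum", "Mesodinium_rubrum", "Skeletonema_marinoi"),
  biovolume_mm3 = c(2e-6, 3e-6, 1e-6),
  carbon_ug = c(2e-4, 3e-4, 1e-4)
)

# Hypothetical volume analyzed per sample, as would be derived from HDR files
volumes <- data.frame(
  sample = "D20220522T003051_IFCB134",
  ml_analyzed = 4
)

# Sum biovolume and carbon per sample and class
totals <- aggregate(cbind(biovolume_mm3, carbon_ug) ~ sample + class,
                    data = rois, FUN = sum)

# Attach the analyzed volume and scale to one liter (1000 mL)
summary_tbl <- merge(totals, volumes, by = "sample")
liters <- summary_tbl$ml_analyzed / 1000
summary_tbl$biovolume_mm3_per_liter <- summary_tbl$biovolume_mm3 / liters
summary_tbl$carbon_ug_per_liter <- summary_tbl$carbon_ug / liters

print(summary_tbl)
```

Without volume data (no hdr_folder), only the per-sample totals can be computed, since the per-liter columns depend on ml_analyzed.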

The MATLAB classification or manual annotation files are generated by the ifcb-analysis repository (Sosik and Olson 2007). Users can optionally provide a custom classification by supplying a vector of image filenames (custom_images) along with corresponding class labels (custom_classes). This allows summarization of biovolume and carbon content without requiring MATLAB classification or manual annotation files (e.g. results from a CNN model).
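As a rough sketch of how custom_images and custom_classes might be prepared from a classifier's output, the snippet below strips the path and extension from image filenames and checks them against the documented naming format. The `predictions` data frame, its column names, and the `.png` extension are assumptions made for illustration; actual CNN output will differ.

```r
# Hypothetical CNN output: one row per image (column names are assumptions)
predictions <- data.frame(
  file = c("png/D20220522T003051_IFCB134_00002.png",
           "png/D20220522T003051_IFCB134_00003.png"),
  label = c("Mesodinium_rubrum", "Mesodinium_rubrum"),
  stringsAsFactors = FALSE
)

# Drop directory and extension to get DYYYYMMDDTHHMMSS_IFCBXXX_ZZZZZ names
images <- tools::file_path_sans_ext(basename(predictions$file))
classes <- predictions$label

# Sanity checks: documented filename format and matching vector lengths
pattern <- "^D[0-9]{8}T[0-9]{6}_IFCB[0-9]+_[0-9]{5}$"
stopifnot(all(grepl(pattern, images)))
stopifnot(length(images) == length(classes))
```

The resulting vectors could then be passed as custom_images = images and custom_classes = classes.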

Biovolumes are converted to carbon according to Menden-Deuer and Lessard 2000 for individual regions of interest (ROI), applying different conversion factors to diatoms and non-diatom protists. If provided, the function also incorporates sample volume data from HDR files to compute biovolume and carbon content per liter of sample.
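The conversion chain can be illustrated with a short sketch. Several things here are assumptions rather than documented behavior: that pixel biovolume is scaled by the cube of micron_factor, that the regression coefficients below (quoted from the published log-log relationships in Menden-Deuer and Lessard 2000 and worth verifying against the paper) are the ones applied, and that the large-diatom relationship is used only above 3000 µm³. The helper `vol2c_sketch()` is hypothetical and not part of the package.

```r
micron_factor <- 1 / 3.4  # microns per pixel (function default)

# Hypothetical per-ROI biovolumes in cubic pixels, with a diatom flag
biovolume_px <- c(50000, 200000, 800000)
is_diatom <- c(TRUE, FALSE, TRUE)

# Assumed scaling: a volume scales with the cube of the linear factor
biovolume_um3 <- biovolume_px * micron_factor^3

# Hypothetical helper: carbon (pg per cell) from volume (cubic microns),
# using log10(C) = a + b * log10(V) with assumed coefficients
vol2c_sketch <- function(volume_um3, is_diatom) {
  large_diatom <- is_diatom & volume_um3 > 3000
  log_c <- ifelse(
    large_diatom,
    -0.933 + 0.881 * log10(volume_um3),  # diatoms > 3000 cubic microns
    -0.665 + 0.939 * log10(volume_um3)   # other protists (and small diatoms here)
  )
  10^log_c
}

carbon_pg <- vol2c_sketch(biovolume_um3, is_diatom)

# Unit conversions matching the output column names
biovolume_mm3 <- biovolume_um3 * 1e-9  # 1 mm^3 = 1e9 cubic microns
carbon_ug <- carbon_pg * 1e-6          # 1 ug = 1e6 pg
```

Under these assumptions, a large diatom is assigned less carbon than a non-diatom protist of the same volume, which is why the class-to-diatom assignment (diatom_class) affects the carbon totals.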

If use_python = TRUE, the function tries to read the .mat file using ifcb_read_mat(), which relies on SciPy. This approach may be faster than the default approach using R.matlab::readMat(), especially for large .mat files. To enable this functionality, ensure Python is properly configured with the required dependencies. You can initialize the Python environment and install necessary packages using ifcb_py_install().

References

Menden-Deuer, S. and Lessard, E. J. (2000), Carbon to volume relationships for dinoflagellates, diatoms, and other protist plankton. Limnol. Oceanogr. 45(3), 569–579, doi: 10.4319/lo.2000.45.3.0569.

Sosik, H. M. and Olson, R. J. (2007), Automated taxonomic classification of phytoplankton sampled with imaging-in-flow cytometry. Limnol. Oceanogr: Methods 5, 204–216.

Examples

if (FALSE) { # \dontrun{
# Example usage:
ifcb_summarize_biovolumes("path/to/features", "path/to/mat", hdr_folder = "path/to/hdr")

# Using custom classification result:
images <- c("D20220522T003051_IFCB134_00002",
            "D20220522T003051_IFCB134_00003")
classes <- c("Mesodinium_rubrum",
             "Mesodinium_rubrum")

ifcb_summarize_biovolumes(feature_folder = "path/to/features",
                          hdr_folder = "path/to/hdr",
                          custom_images = images,
                          custom_classes = classes)
} # }