
This function reads biovolume data from feature files generated by the ifcb-analysis repository (Sosik and Olson 2007) and matches them with the corresponding classification results or manual annotations. It calculates biovolume in cubic micrometers and uses the World Register of Marine Species (WoRMS) to determine whether each class is a diatom. Carbon content is then computed for each region of interest (ROI) using the conversion functions of Menden-Deuer and Lessard (2000), with the choice of function depending on whether the class is identified as a diatom.
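As a rough illustration of the carbon step, the sketch below applies log-linear volume-to-carbon regressions of the form log10(C) = a + b * log10(V). The coefficients and the 3000 cubic micrometer cutoff for large diatoms are assumed here from the regressions reported by Menden-Deuer and Lessard (2000). The helper name `volume_to_carbon()` is hypothetical, and the package's internal implementation may differ.

```r
# Hypothetical helper, not part of the package API.
# biovolume_um3: biovolume in cubic micrometers
# is_diatom:     logical vector of the same length
volume_to_carbon <- function(biovolume_um3, is_diatom) {
  log_v <- log10(biovolume_um3)

  # Assumed regressions (Menden-Deuer and Lessard 2000):
  #   large diatoms (> 3000 um^3): log10(C) = -0.933 + 0.881 * log10(V)
  #   all other cells:             log10(C) = -0.665 + 0.939 * log10(V)
  large_diatom <- is_diatom & biovolume_um3 > 3000

  log_c <- ifelse(large_diatom,
                  -0.933 + 0.881 * log_v,
                  -0.665 + 0.939 * log_v)

  # Carbon per cell in picograms
  10^log_c
}

carbon_pg <- volume_to_carbon(c(1000, 10000, 10000),
                              c(FALSE, TRUE, FALSE))
```

In this sketch the diatom flag only changes the result for cells above the size cutoff. Smaller diatoms fall back to the general regression.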

Usage

ifcb_extract_biovolumes(
  feature_files,
  class_files = NULL,
  custom_images = NULL,
  custom_classes = NULL,
  class2use_file = NULL,
  micron_factor = 1/3.4,
  diatom_class = "Bacillariophyceae",
  diatom_include = NULL,
  marine_only = FALSE,
  threshold = "opt",
  multiblob = FALSE,
  feature_recursive = TRUE,
  class_recursive = TRUE,
  drop_zero_volume = FALSE,
  feature_version = NULL,
  use_python = FALSE,
  verbose = TRUE,
  mat_folder = deprecated(),
  mat_files = deprecated(),
  mat_recursive = deprecated()
)

Arguments

feature_files

A path to a folder containing feature files or a character vector of file paths.

class_files

(Optional) A character vector of full paths to classification or manual annotation files (.mat, .h5, or .csv), or a single path to a folder containing such files.

custom_images

(Optional) A character vector of image filenames in the format DYYYYMMDDTHHMMSS_IFCBXXX_ZZZZZ(.png), where "XXX" represents the IFCB number and "ZZZZZ" represents the ROI number. These filenames should match the roi_number assignment in the feature_files and can be used as a substitute for classification files.
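To show how such filenames relate to a sample name and ROI number, the following base R sketch splits two names that follow the documented pattern. The variable names are illustrative only, and the function's own parsing may be implemented differently.

```r
# Example image names following the documented pattern
# (the ".png" extension is optional).
images <- c("D20220522T003051_IFCB134_00002.png",
            "D20220522T003051_IFCB134_00003")

# Drop the optional ".png" extension.
stem <- sub("\\.png$", "", images)

# The sample name is everything before the final "_ZZZZZ" ROI suffix.
sample <- sub("_[0-9]+$", "", stem)

# The ROI number is the trailing run of digits, converted to an integer.
roi_number <- as.integer(sub("^.*_([0-9]+)$", "\\1", stem))

parsed <- data.frame(sample = sample, roi_number = roi_number)
```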

custom_classes

(Optional) A character vector of corresponding class labels for custom_images.

class2use_file

(Optional) A character string specifying the path to the file containing the class2use variable. Only required for manual results (default: NULL).

micron_factor

Conversion factor from pixels to microns, expressed in microns per pixel (default: 1/3.4).
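A minimal sketch of the unit conversion, assuming the feature files store biovolume in cubic pixels. Because micron_factor is a linear factor, volumes scale with its cube. The input values are made up for illustration.

```r
micron_factor <- 1 / 3.4            # microns per pixel (documented default)
biovolume_px3 <- c(39304, 0, 1000)  # hypothetical biovolumes in cubic pixels

# A linear factor converts lengths, so volumes scale with its cube.
biovolume_um3 <- biovolume_px3 * micron_factor^3
```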

diatom_class

A character vector specifying diatom class names in WoRMS. Default: "Bacillariophyceae".

diatom_include

Optional character vector of class names that should always be treated as diatoms, overriding the boolean result of ifcb_is_diatom. Default: NULL.

marine_only

Logical. If TRUE, restricts the WoRMS search to marine taxa only. Default: FALSE.

threshold

A character string controlling which classification to use. "opt" (default) uses the threshold-applied classification, where predictions below the per-class optimal threshold are labeled "unclassified". Any other value (e.g. "all") uses the raw winning class without any threshold applied.
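The difference between the two modes can be sketched with a small score matrix. The scores, class names, and thresholds below are hypothetical, and comparing with `>=` is an assumption. The classification files read by the function may already store both labels rather than computing them this way.

```r
# Hypothetical classifier scores: one row per ROI, one column per class.
scores <- matrix(c(0.90, 0.10,
                   0.55, 0.45,
                   0.20, 0.80),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(NULL, c("Skeletonema", "Mesodinium_rubrum")))

# Hypothetical per-class optimal thresholds.
class_threshold <- c(Skeletonema = 0.60, Mesodinium_rubrum = 0.70)

# Raw winning class: the highest-scoring class for each ROI.
winner <- colnames(scores)[max.col(scores, ties.method = "first")]
winner_score <- apply(scores, 1, max)

# Threshold-applied class: winners scoring below their own class
# threshold are labeled "unclassified".
class_opt <- unname(ifelse(winner_score >= class_threshold[winner],
                           winner, "unclassified"))
```

Here `winner` corresponds to the raw result, while `class_opt` corresponds to what threshold = "opt" describes.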

multiblob

Logical. If TRUE, includes multiblob features. Default: FALSE.

feature_recursive

Logical. If TRUE, searches recursively for feature files when feature_files is a folder. Default: TRUE.

class_recursive

Logical. If TRUE and class_files is a folder, searches recursively for classification files. Default: TRUE.

drop_zero_volume

Logical. If TRUE, rows where Biovolume equals zero (e.g., artifacts such as smudges on the flow cell) are removed. Default: FALSE.

feature_version

Optional numeric or character version to filter feature files by (e.g. 2 for "_v2"). Default is NULL (no filtering).
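A sketch of what filtering by version could look like, assuming feature file names end in a "_v<version>" suffix before the extension. The file names and the pattern are illustrative, not the package's actual matching logic.

```r
# Hypothetical feature file names.
files <- c("D20220522T003051_IFCB134_fea_v2.csv",
           "D20220522T003051_IFCB134_fea_v4.csv",
           "D20220523T010203_IFCB134_fea_v2.csv")

feature_version <- 2

# Keep files whose name ends in "_v<version>" followed by the extension.
pattern <- paste0("_v", feature_version, "\\.[^.]+$")
selected <- files[grepl(pattern, basename(files))]
```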

use_python

Logical. If TRUE, attempts to read .mat files using a Python-based method (SciPy). Default: FALSE.

verbose

Logical. If TRUE, prints progress messages. Default: TRUE.

mat_folder

[Deprecated] Use class_files instead.

mat_files

[Deprecated] Use class_files instead.

mat_recursive

[Deprecated] Use class_recursive instead.

Value

A data frame containing:

  • sample: The sample name.

  • classifier: The classifier used (if applicable).

  • roi_number: The region of interest (ROI) number.

  • class: The identified taxonomic class.

  • biovolume_um3: Computed biovolume in cubic micrometers.

  • carbon_pg: Estimated carbon content in picograms.
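Because the result has one row per ROI, a common next step is to summarise it. The sketch below builds a mock data frame with the documented columns (all values are made up) and totals biovolume and carbon per sample and class using base R.

```r
# Mock data frame with the documented columns (values are made up).
biovolume_df <- data.frame(
  sample = rep("D20220522T003051_IFCB134", 3),
  classifier = "hypothetical_classifier",
  roi_number = c(2L, 3L, 5L),
  class = c("Mesodinium_rubrum", "Mesodinium_rubrum", "Skeletonema"),
  biovolume_um3 = c(1200, 800, 300),
  carbon_pg = c(150, 110, 40)
)

# Total biovolume and carbon per sample and class.
totals <- aggregate(cbind(biovolume_um3, carbon_pg) ~ sample + class,
                    data = biovolume_df, FUN = sum)
```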

Details

  • Classification Data Handling:

    • If class_files is provided, the function reads class annotations from .mat, .h5, or .csv files.

    • If custom_images and custom_classes are supplied without class_files, they are used in place of classification file data (e.g. class labels from a CNN model).

    • If both class_files and custom_images/custom_classes are given, class_files takes precedence.

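  • MAT File Processing:

    • If use_python = TRUE, the function attempts to read .mat files using a Python-based method (SciPy).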

References

Menden-Deuer, S. and Lessard, E. J. (2000), Carbon to volume relationships for dinoflagellates, diatoms, and other protist plankton. Limnol. Oceanogr. 45(3), 569–579, doi: 10.4319/lo.2000.45.3.0569.

Sosik, H. M. and Olson, R. J. (2007), Automated taxonomic classification of phytoplankton sampled with imaging-in-flow cytometry. Limnol. Oceanogr.: Methods 5, 204–216.

Examples

if (FALSE) { # \dontrun{
# Using classification results:
feature_files <- "data/features"
class_files <- "data/classified"

biovolume_df <- ifcb_extract_biovolumes(feature_files,
                                        class_files)

print(biovolume_df)

# Using custom classification result:
classes <- c("Mesodinium_rubrum",
             "Mesodinium_rubrum")
images <- c("D20220522T003051_IFCB134_00002",
            "D20220522T003051_IFCB134_00003")

biovolume_df_custom <- ifcb_extract_biovolumes(feature_files,
                                               custom_images = images,
                                               custom_classes = classes)

print(biovolume_df_custom)
} # }