Read and Summarize Classified IFCB Data

This function reads a MATLAB .mat file containing aggregated and classified IFCB (Imaging FlowCytobot) data generated by the countcells_allTBnew_user_training function from the ifcb-analysis repository (Sosik and Olson 2007), or a list of classified data generated by ifcb_summarize_class_counts. It returns a data frame with species counts and, optionally, biovolume information, computed for the specified threshold type.

Usage

ifcb_read_summary(
  summary,
  hdr_directory = NULL,
  biovolume = FALSE,
  threshold = "opt",
  use_python = FALSE
)

Arguments

summary: A character string specifying the path to the .mat summary file or a list generated by ifcb_summarize_class_counts.
hdr_directory: A character string specifying the path to the directory containing header (.hdr) files. Default is NULL.
biovolume: A logical indicating whether the file contains biovolume data. Default is FALSE.
threshold: A character string specifying the threshold type for counts and biovolume. Options are "opt" (default), "adhoc", and "none".
use_python: Logical. If TRUE, attempts to read the .mat file using a Python-based method. Default is FALSE.

Value

A data frame containing the summary information including file list, volume analyzed, species counts, optionally biovolume, and other metadata.

Details

If use_python = TRUE, the function tries to read the .mat file using ifcb_read_mat(), which relies on SciPy. This may be faster than the default reader, R.matlab::readMat(), especially for large .mat files. It requires a Python installation configured with the necessary dependencies; ifcb_py_install() initializes the Python environment and installs the required packages.

If use_python = FALSE or if SciPy is not available, the function falls back to using R.matlab::readMat().
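
The following is a minimal sketch of requesting the Python-based reader. It assumes iRfcb is installed and reuses the example file shipped with the package; the object name summary_py is only illustrative. The one-time setup call is left commented out because it modifies the local Python environment. Per the description above, the call should still return a result through R.matlab::readMat() when SciPy is not available.

```r
# Sketch only: runs when iRfcb is installed, otherwise does nothing.
if (requireNamespace("iRfcb", quietly = TRUE)) {
  # One-time setup of the Python environment (uncomment to run):
  # iRfcb::ifcb_py_install()

  # Example summary file bundled with the package
  mat_file <- system.file("exdata/example_summary.mat", package = "iRfcb")

  # Request the SciPy-based reader; the documented fallback is R.matlab::readMat()
  summary_py <- iRfcb::ifcb_read_summary(mat_file, use_python = TRUE)
}
```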

References

Sosik, H. M. and Olson, R. J. (2007), Automated taxonomic classification of phytoplankton sampled with imaging-in-flow cytometry. Limnol. Oceanogr.: Methods 5, 204–216.

Examples

mat_file <- system.file("exdata/example_summary.mat", package = "iRfcb")

summary_data <- ifcb_read_summary(mat_file, biovolume = FALSE, threshold = "opt")
print(summary_data)
#> # A tibble: 23 × 12
#>    sample  timestamp           date        year month   day time     ifcb_number
#>    <chr>   <dttm>              <date>     <dbl> <dbl> <int> <time>   <chr>      
#>  1 D20230… 2023-08-10 11:30:59 2023-08-10  2023     8    10 11:30:59 IFCB134    
#>  2 D20230… 2023-08-10 11:30:59 2023-08-10  2023     8    10 11:30:59 IFCB134    
#>  3 D20230… 2023-08-10 11:30:59 2023-08-10  2023     8    10 11:30:59 IFCB134    
#>  4 D20230… 2023-08-10 11:30:59 2023-08-10  2023     8    10 11:30:59 IFCB134    
#>  5 D20230… 2023-08-10 11:30:59 2023-08-10  2023     8    10 11:30:59 IFCB134    
#>  6 D20230… 2023-08-10 11:30:59 2023-08-10  2023     8    10 11:30:59 IFCB134    
#>  7 D20230… 2023-08-10 11:30:59 2023-08-10  2023     8    10 11:30:59 IFCB134    
#>  8 D20230… 2023-08-10 11:30:59 2023-08-10  2023     8    10 11:30:59 IFCB134    
#>  9 D20230… 2023-08-10 11:30:59 2023-08-10  2023     8    10 11:30:59 IFCB134    
#> 10 D20230… 2023-08-10 11:30:59 2023-08-10  2023     8    10 11:30:59 IFCB134    
#> # ℹ 13 more rows
#> # ℹ 4 more variables: ml_analyzed <dbl>, species <chr>, counts <dbl>,
#> #   counts_per_liter <dbl>
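
The returned data frame has one row per sample and species, so a common next step is to total the counts per species. The sketch below illustrates this with base R on a small stand-in data frame. The sample names, species names, and count values are invented for illustration and only mirror the sample, species, and counts columns shown in the output above; with real data, the object returned by ifcb_read_summary() could be used in its place.

```r
# Toy stand-in for the result of ifcb_read_summary(); all values are invented.
toy_summary <- data.frame(
  sample  = c("sample_A", "sample_A", "sample_B", "sample_B"),
  species = c("Species_1", "Species_2", "Species_1", "Species_2"),
  counts  = c(10, 4, 6, 0),
  stringsAsFactors = FALSE
)

# Sum counts across samples for each species
species_totals <- aggregate(counts ~ species, data = toy_summary, FUN = sum)

# Order from most to least abundant
species_totals <- species_totals[order(-species_totals$counts), ]
print(species_totals)
```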