Frequently Asked Questions
General
Q: What IFCB data formats are supported?
A: The app reads standard IFCB files:
- ROI files (.roi) - Image data
- ADC files (.adc) - Metadata (dimensions, timestamps)
- HDR files (.hdr) - Header information
Q: Can I use this for non-IFCB data?
A: The app is specifically designed for IFCB data format. For other imaging systems, you would need to adapt the data loading functions.
Q: Is my data modified?
A: No. The app only reads your original files. All output is written to separate folders.
Q: How can I review all images of a specific class?
A: Use Class Review mode. Switch to “Class Review” using the mode toggle in the sidebar, select Database as the source, select a class from the dropdown, and click Load. This loads all images annotated as that class from every sample in the database. You can then reclassify any mistakes using the normal relabeling tools.
Q: Can I reclassify images across multiple samples at once?
A: Yes. Class Review mode (Database source) loads images from all samples and saves changes as row-level updates to the database. This means only the images you reclassify are updated — other annotations in those samples remain untouched.
Q: Can I sort a folder of PNG images into class subfolders?
A: Yes. In Class Review mode, select External PNG Folder as the source, browse to your PNG folder, and click Load Folder. All images are loaded with an initial class label (defaulting to the folder name). Relabel images in the gallery, then set an export folder and click Export Split Folders to copy images into class-name subfolders. This works independently of the database.
Q: Can I classify images without pre-computed classifier files?
A: Yes. Configure a Gradio API URL and model in Settings > Live Prediction, then click the Predict button in the sidebar after loading a sample. This sends images to a remote CNN model and applies the predictions directly. See the User Guide for details.
Installation Issues
Q: I see “Python not available” warning
A: This warning only appears when your storage format includes
.mat files. Python is not needed for the
default SQLite storage.
If you see this warning and don’t need .mat files,
switch to SQLite in Settings > Annotation Storage. Otherwise, to
enable .mat support:
library(iRfcb)
ifcb_py_install() # Creates venv in current working directoryThen restart the app.
Q: Do I need Python to use ClassiPyR?
A: No. The default storage format is SQLite, which works out of the
box with no Python dependency. Python is only needed if you want to
export .mat files for ifcb-analysis
compatibility.
Q: Where is the Python virtual environment created?
A: By default, ifcb_py_install() creates a virtual
environment at ~/.virtualenvs/iRfcb. You can specify a
different location:
ifcb_py_install("/path/to/your/venv")You can also specify the venv path when launching the app:
run_app(venv_path = "/path/to/your/venv")Q: How is the Python virtual environment path resolved?
A: The app uses the following priority order:
-
venv_pathargument passed torun_app()(highest priority) -
Saved settings from a previous session (stored in
settings.json) -
Default
./venvin the working directory
When you specify run_app(venv_path = "/path/to/venv"),
that path is used for Python initialization and pre-filled in the
Settings dialog, overriding any previously saved path.
Q: Package installation fails
A: Make sure you have remotes installed and try:
if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")
remotes::install_github("EuropeanIFCBGroup/ClassiPyR")Q: “Could not find app directory” error
A: Try reinstalling the package:
remotes::install_github("EuropeanIFCBGroup/ClassiPyR")Q: iRfcb won’t install
A: iRfcb is
the core dependency for ClassiPyR and is installed
automatically. If you encounter issues:
install.packages("iRfcb")Data Loading
Q: No samples appear in the dropdown
A: Check that:
- ROI/PNG Data Folder points to your data
- Either ROI files exist, or extracted PNG sample folders exist (for
example
.../D20230313T004021_IFCB134/D20230313T004021_IFCB134_00002.png) - Click the Sync button (circular arrow icon) to rescan folders if you recently added new data
Q: “ROI file not found” error
A: The app scans the ROI/PNG Data Folder recursively, so any subfolder layout works (including flat). Check that:
- The ROI/PNG Data Folder path is correct
- If using ROI files, each
.roifile has a matching.adcfile in the same directory - If using extracted PNGs, sample folders are named
DYYYYMMDDTHHMMSS_IFCBNNNand image files followsample_#####.png - Click the Sync button to rescan if you recently moved or added files
Q: Classifications not loading
A: For CSV files:
- Must have columns named
file_nameandclass_name(exact names required) - Optionally include
scoreandclass_name_autocolumns - The CSV file must be named after the sample (e.g.,
D20230101T120000_IFCB134.csv) - File should be in the Classification Folder (indexed via file cache; click Sync to refresh)
For H5 files:
- Must match pattern
*_class*.h5 - Requires the
hdf5rpackage (install.packages("hdf5r"))
For MAT files:
- Must match pattern
*_class*.mat - Must contain
roinumandTBclassvariables
CSV Format
Q: What should my classification CSV look like?
A: The CSV must have columns named file_name and
class_name. The file must be named after the sample (e.g.,
D20230101T120000_IFCB134.csv).
Minimal example:
file_name,class_name
D20230101T120000_IFCB134_00001.png,Diatom
D20230101T120000_IFCB134_00002.png,Ciliate
With optional score and class_name_auto
columns:
file_name,class_name,class_name_auto,score
D20230101T120000_IFCB134_00001.png,unclassified,Diatom,0.45
D20230101T120000_IFCB134_00002.png,Ciliate,Ciliate,0.87
D20230101T120000_IFCB134_00003.png,Dinoflagellate,Dinoflagellate,0.72
The class_name_auto column contains the raw prediction
without threshold. When “Apply classification threshold” is disabled in
Settings, ClassiPyR uses class_name_auto instead of
class_name.
Q: My CNN classifier outputs different column names
A: The column names must be exactly file_name and
class_name. If your classifier uses different names, rename
the columns before loading. For example in R:
df <- read.csv("my_classifications.csv")
names(df)[names(df) == "predicted_class"] <- "class_name"
names(df)[names(df) == "filename"] <- "file_name"
write.csv(df, "D20230101T120000_IFCB134.csv", row.names = FALSE)Annotation
Q: Drag-select isn’t working
A: Make sure you’re:
- Starting the drag in the gallery area (not on an image)
- Using left mouse button
- Dragging far enough (>5 pixels)
Q: Images show as “Not found”
A: The ROI might be empty (no actual image data). These are filtered out automatically on load.
Q: Changes aren’t being saved
A: Check that:
- Output folder is writable
- If using MAT format: Python is available (not needed for default SQLite storage)
- Click “Save Annotations” before closing
Class List
Q: Do I need to upload a class list file to start annotating?
A: No! You can create a class list from scratch directly in the app:
- Open Settings → Edit Class List (no file upload needed)
- Add classes one at a time or paste multiple classes in the text area
- Click Apply Changes - a temporary file is created automatically
- Start annotating immediately
With SQLite storage (default), your class list is auto-saved to the
database and restored on next startup. You can still export as
.mat or .txt for sharing or backup.
Q: How do I create a class list from scratch?
A:
- Open Settings → Edit Class List
- Add classes using “Add new class” field, or type/paste classes in the text area
- Click Apply Changes
- With SQLite storage, the class list persists automatically.
Optionally save as
.mator.txtfor portability.
Q: Is my class list saved automatically?
A: Yes, when using SQLite storage (the default). Every change —
adding classes, renaming, applying WoRMS matches, uploading a file — is
auto-saved to the global_class_list table in the SQLite
database. On next startup, the class list is restored from the database.
If you don’t use SQLite storage, save your class list as a
.txt or .mat file to preserve it.
Q: Can I import a class list from MATLAB?
A: Yes, load your existing class2use.mat file via
Settings.
Q: My class names look different after loading
A: The app may truncate trailing numbers from class names (e.g., “Diatom_01” → “Diatom”). This matches iRfcb behavior.
Q: Why are some characters in my class names changed?
A: Class names are used as folder names when exporting PNGs, so certain characters must be sanitized:
| Character | Reason | Action |
|---|---|---|
/ or \
|
Path separators | Replaced with _
|
< > " ' & |
HTML/security risks | Removed |
: * ? \| |
Windows filesystem | Removed |
.. |
Path traversal risk | Removed |
Example: Snowella/Woronichinia becomes
Snowella_Woronichinia
Common taxonomic characters like hyphens (-),
underscores (_), periods (.), and spaces are
preserved.
Q: How do I match class names to WoRMS AphiaID values?
A: In Settings → Edit Class List, click Match WoRMS AphiaID.
- The app sanitizes class names before querying WoRMS
- Long names (>80 characters) are skipped automatically in auto-match
- Unmatched/skipped classes can be rematched manually by editing query fields and clicking Rematch Unmatched
- Click Apply AphiaID Matches to persist results
Matched AphiaIDs are shown in the class list as
[AphiaID: ...].
Q: Where are AphiaID mappings stored?
A: In the SQLite database (annotations.sqlite), table
class_taxonomy, in the configured Database
Folder. They are not embedded in
class2use.mat/.txt files.
Output
Q: Where are my annotations saved?
A: Annotations are split across two locations:
- SQLite database (default): stored in the Database Folder (a local directory)
- MAT files and statistics: stored in the Output Folder (can be on a network drive)
db_folder/ ← local drive (Database Folder)
└── annotations.sqlite ← single database for ALL samples
output_folder/ ← can be a network drive (Output Folder)
├── D20230101T120000_IFCB134.mat ← only if storage format includes "MAT"
├── D20230202T080000_IFCB134.mat
└── validation_statistics/
├── ..._validation_stats.csv
└── ..._validation_detailed.csv
By default, the database is stored in a persistent local directory
(tools::R_user_dir("ClassiPyR", "data")). Back up
annotations.sqlite to preserve your work.
Q: Where is the default database location?
A: The default Database Folder is a platform-specific local directory:
-
Linux:
~/.local/share/R/ClassiPyR/ -
macOS:
~/Library/Application Support/org.R-project.R/R/ClassiPyR/ -
Windows:
%LOCALAPPDATA%/R/data/R/ClassiPyR/
You can find the exact path with:
ClassiPyR::get_default_db_dir()You can change it in Settings > Database Folder, but it should always be a local drive.
Q: Can I put the database on a network drive?
A: No. SQLite databases are not safe on network filesystems (NFS, SMB/CIFS) because network file locking is unreliable, which can lead to database corruption. Always keep the Database Folder on a local drive. The Output Folder (for MAT files and statistics) can safely be on a network drive.
Q: How do I transfer my annotations to another computer?
A: Since the SQLite database is stored locally, you cannot simply
share it over a network drive. Instead, use .mat files as
the interchange format:
- Export from the source computer (requires Python with scipy):
library(ClassiPyR)
db_path <- get_db_path(get_default_db_dir())
# Export all annotations to .mat files in a shared output folder
result <- export_all_db_to_mat(db_path, "/shared/network/manual")
cat(result$success, "exported\n")Or use the Export SQLite → .mat button in Settings.
- Import on the target computer:
library(ClassiPyR)
db_path <- get_db_path(get_default_db_dir())
# Import .mat files from the shared folder into the local database
result <- import_all_mat_to_db("/shared/network/manual", db_path)
cat(result$success, "imported,", result$skipped, "skipped\n")Or use the Import .mat → SQLite button in Settings. Already-imported samples are skipped automatically.
You can also simply copy the annotations.sqlite file
directly between machines if you prefer.
Q: Can I import annotations back to MATLAB?
A: Yes, if you save with the “MAT file” or “Both” storage format, the
MAT files are compatible with the ifcb-analysis toolbox
(Sosik & Olson, 2007). Use the list in startMC, or load
the list in MATLAB using:
Note: Python with scipy must be installed to save .mat
files. Change the storage format in Settings > Annotation
Storage.
Q: Can I migrate existing .mat annotations to the SQLite database?
A: Yes. The easiest way is the Import .mat → SQLite
button in Settings > Annotation Storage, which bulk-imports all
.mat files in your output folder.
You can also import programmatically — a single file:
library(ClassiPyR)
import_mat_to_db(
mat_path = "/data/manual/D20230101T120000_IFCB134.mat",
db_path = get_db_path(get_default_db_dir()),
sample_name = "D20230101T120000_IFCB134"
)Or bulk-import all .mat files in a folder:
result <- import_all_mat_to_db("/data/manual", get_db_path(get_default_db_dir()))
cat(result$success, "imported,", result$failed, "failed,", result$skipped, "skipped\n")Q: Can I export SQLite annotations back to .mat files?
A: Yes. Use the Export SQLite → .mat button in Settings > Annotation Storage to export all annotated samples at once. This requires Python with scipy.
You can also export programmatically:
# Single sample
export_db_to_mat(get_db_path(get_default_db_dir()), "D20230101T120000_IFCB134", "/data/manual")
# All samples
result <- export_all_db_to_mat(get_db_path(get_default_db_dir()), "/data/manual")
cat(result$success, "exported,", result$failed, "failed\n")Q: Can I create a MATLAB-format ZIP archive for sharing?
A: Yes. Use the Export SQLite → MATLAB ZIP button in
Settings > Export from SQLite. This bundles .mat
annotation files, feature CSVs, a class2use.mat config
file, optional raw data, and README files into a distributable ZIP
archive via iRfcb::ifcb_zip_matlab(). You need to provide a
features folder; a data folder with raw IFCB files is optional.
If your storage format is SQLite-only, annotations are automatically
converted to temporary .mat files (requires Python with
scipy). See the iRfcb
image export tutorial for more details on the archive format.
Q: Can I import images classified in another tool?
A: Yes. Organize your PNG images into subfolders named after each
class (e.g., Diatom/, Ciliate_002/). Then use
Import PNG → SQLite in Settings > Import / Export.
The app strips trailing _NNN suffixes from folder names
(following the iRfcb convention) and maps images to class names based on
which subfolder they are in.
If your folder class names don’t match the app’s current class list, a mapping dialog will appear letting you remap them to existing classes or add them as new classes.
Note: ROI files are needed for viewing images and re-exporting. Without ROI files, annotations are stored in the database but images cannot be displayed in the gallery.
You can also import programmatically:
library(ClassiPyR)
result <- import_png_folder_to_db(
"/data/png_export",
get_db_path(get_default_db_dir()),
class2use = c("Diatom", "Ciliate", "Dinoflagellate"),
annotator = "Jane"
)
cat(result$success, "imported,", result$failed, "failed\n")Q: Can I change the annotator name for existing annotations?
A: Yes. Use update_annotator() from the R console:
library(ClassiPyR)
db_path <- get_db_path(get_default_db_dir())
# Update a single sample
update_annotator(db_path, "D20230101T120000_IFCB134", "Jane")
# Update several samples at once
update_annotator(db_path, c("sample_A", "sample_B"), "Jane")
# Update all annotated samples (e.g. after a bulk import)
all_samples <- list_annotated_samples_db(db_path)
update_annotator(db_path, all_samples, "Jane")The function returns a named vector showing how many annotation rows were updated per sample (0 means the sample was not found in the database).
Q: What’s in the statistics CSV?
A: Two files per sample:
-
*_validation_stats.csv- Summary statistics -
*_validation_detailed.csv- Per-image details
Measurement
Q: How do I measure images?
A: Click the Measure button (ruler icon) in the toolbar to activate measure mode. Then click and drag on any image to draw a measurement line. The distance is displayed in micrometers and pixels.
Q: How do I change the scale calibration?
A: Open Settings and find “Pixels per Micrometer”. The default is 3.4 (standard for IFCB). Enter your instrument’s calibration value.
Settings & Persistence
Q: Are my settings saved between sessions?
A: Yes! Settings are stored in a configuration file:
-
Linux:
~/.config/R/ClassiPyR/settings.json -
macOS:
~/Library/Preferences/org.R-project.R/R/ClassiPyR/settings.json -
Windows:
%APPDATA%/R/config/R/ClassiPyR/settings.json
Folder paths, class list location, and Python venv path are automatically restored when you restart the app.
Q: How do I reset all settings to defaults?
A: Use the reset_settings argument when launching the
app:
run_app(reset_settings = TRUE)This deletes the saved settings.json file and starts the
app with default values. All folder paths, the class list reference, and
the Python venv path are cleared, so you will need to reconfigure them.
The class list file itself (class2use_saved.*) is not
deleted from the config directory but will not be loaded until you
re-upload it. This is useful if:
- The app fails to start due to invalid saved paths
- Folder paths point to locations that no longer exist
- You want a clean slate after changing your data layout
You can also combine it with other arguments:
# Reset settings and specify a new Python environment
run_app(reset_settings = TRUE, venv_path = "/path/to/your/venv")Q: What’s the yellow warning on some classes?
A: Classes marked with a warning are in your classification data but not in your class list. This can happen when:
- Loading classifications from a classifier trained with a different class list
- The class list has been modified since the classification was created
You can still work with these classes, but consider adding them to your class list for consistency.
Q: A sample shows both annotation and classification markers - what does that mean?
A: This sample has both manual annotations AND auto-classifications available. When you load it, the app defaults to manual annotation mode. A toggle in the header lets you switch between annotation and validation mode.
Q: Can I switch to annotation mode for a sample that only has auto-classifications?
A: Yes. Any sample with auto-classification data (✓ or ✎✓) shows a mode toggle in the header. Clicking → Manual creates blank “unclassified” annotations if no manual annotations exist yet, so you can annotate from scratch while keeping the option to switch back to the auto-classifications.
File Index Cache
Q: What is the file index cache?
A: The file index cache stores the locations of all IFCB files (ROI, classification, annotation) found in your configured folders. It’s saved to disk so the app doesn’t need to re-scan your entire folder hierarchy every time it starts. This significantly speeds up startup for large datasets.
Q: How do I refresh the file cache?
A: Click the Sync button (circular arrow icon) in the sidebar, next to the sample navigation buttons. The cache age indicator below shows when the last scan occurred.
Q: New samples I added aren’t showing up
A: The app loads from the cached file index. Click the Sync button to rescan your folders and pick up new files.
Q: Can I update the cache without opening the app?
A: Yes. Use rescan_file_index() from the R console or a
scheduled script:
ClassiPyR::rescan_file_index()This reads folder paths from your saved settings and rebuilds the cache. You can also pass paths explicitly:
ClassiPyR::rescan_file_index(
roi_folder = "/data/ifcb/raw",
csv_folder = "/data/ifcb/classified",
output_folder = "/data/ifcb/manual"
)Q: Where is the cache file stored?
A: In the same config directory as your settings:
-
Linux:
~/.config/R/ClassiPyR/file_index.json -
macOS:
~/Library/Preferences/org.R-project.R/R/ClassiPyR/file_index.json -
Windows:
%APPDATA%/R/config/R/ClassiPyR/file_index.json
Error Messages
| Error | Solution |
|---|---|
| “ROI file not found” | Check ROI/PNG Data Folder path. If no ROI exists, provide extracted PNG sample folders named by sample and click Sync |
| “ADC file not found” | ADC file must be alongside ROI file |
| “Python not available” | Only affects .mat export. Switch to SQLite in Settings,
or run iRfcb::ifcb_py_install()
|
| “Error loading class list” | Check file format (.mat or .txt) |
| “No samples found” | Check ROI/PNG Data Folder configuration and naming |
| App fails to start | Try run_app(reset_settings = TRUE) to clear saved
settings |
Performance Tips
File index cache - The app caches folder scan results for fast startup. Click Sync only when you’ve added new data.
Use pagination - Lower images per page for faster loading
Filter by class - Reduces rendering load
Close other apps - Image extraction uses memory
SSD storage - Faster file access
Scheduled rescans - On servers with regularly arriving data, use
ClassiPyR::rescan_file_index()in a cron job to keep the cache current without manual interventionDashboard: specify a dataset - Large Dashboard instances (e.g.,
habon-ifcb.whoi.eduwith 900,000+ samples) will be very slow to load without a dataset filter. Always include?dataset=in the URL, e.g.https://habon-ifcb.whoi.edu/timeline?dataset=tangosund
Getting Help
- GitHub Issues - Report bugs
- iRfcb Documentation - Core dependency for IFCB data handling