This function fills in missing values for selected parameters in a dataset using a Ferrybox dataset. It rounds each timestamp to a specified unit, joins the Ferrybox data on the rounded timestamp, and replaces missing values with the matching Ferrybox values.

Usage

handle_missing_ferrybox_data(
  data,
  ferrybox_data,
  parameters,
  rounding_function
)

Arguments

data

A data frame containing the main dataset with a timestamp column and several parameter columns that might have missing data.

ferrybox_data

A data frame containing the Ferrybox dataset. This dataset should have a timestamp column and columns corresponding to the parameters specified.

parameters

A character vector of column names (parameters) in data that should be checked for missing values and potentially filled using the Ferrybox data.

rounding_function

A function that rounds the timestamp to a specified unit (e.g., minute). This function should take a timestamp column and a unit argument.
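
As an illustration of the expected shape, a rounding function could look like the sketch below. The name `round_timestamp`, the supported unit names, and the choice to return UTC are assumptions made for this example, not part of the package. Any function with the same two arguments, such as `lubridate::round_date()`, should fit the same slot.

```r
# Hypothetical helper (not part of the package): rounds POSIXct timestamps
# to the nearest multiple of the given unit. The result is returned in UTC.
round_timestamp <- function(timestamp, unit = "minute") {
  # Length of one unit in seconds; only a few units are handled in this sketch.
  secs <- switch(unit,
                 second = 1,
                 minute = 60,
                 hour = 3600,
                 stop("unsupported unit: ", unit))
  # Round the seconds since the epoch to the nearest multiple of `secs`.
  as.POSIXct(round(as.numeric(timestamp) / secs) * secs,
             origin = "1970-01-01", tz = "UTC")
}

x <- as.POSIXct(c("2024-01-01 12:00:40", "2024-01-01 12:00:20"), tz = "UTC")
rounded <- round_timestamp(x, "minute")
```

Here the first timestamp is rounded up to 12:01:00 and the second down to 12:00:00.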

Value

A data frame similar to data, but with missing values in the specified parameters filled in using the Ferrybox data. The output includes only the timestamp and the specified parameter columns.

Details

The function performs the following steps:

  • Renames the specified parameter columns in ferrybox_data by appending _ferrybox to their names.

  • Filters data to the rows that have a missing value in any of the specified parameters.

  • Rounds the timestamp to the nearest specified unit using the rounding_function.

  • Joins the ferrybox_data to the main data based on the rounded timestamp.

  • Uses the coalesce function to fill in missing values in the specified parameters with the corresponding values from the Ferrybox data.

  • Returns a cleaned dataset containing only the timestamp and the specified parameter columns.
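
The steps above can be sketched with dplyr roughly as follows. This is an illustrative reconstruction, not the package's source: the function name `fill_from_ferrybox_sketch`, the column name `timestamp`, the hard-coded `"minute"` unit, and the helper `round_minute` are all assumptions. It also assumes the Ferrybox timestamps already fall on whole minutes.

```r
library(dplyr)

# Hypothetical re-implementation of the listed steps (assumed names throughout).
fill_from_ferrybox_sketch <- function(data, ferrybox_data, parameters,
                                      rounding_function) {
  # 1. Append "_ferrybox" to the parameter columns of the Ferrybox data.
  ferrybox_renamed <- ferrybox_data %>%
    select(timestamp, all_of(parameters)) %>%
    rename_with(~ paste0(.x, "_ferrybox"), all_of(parameters))

  # 2-4. Keep rows with any missing parameter, round their timestamps,
  #      and join the Ferrybox values on the rounded timestamp.
  filled <- data %>%
    filter(if_any(all_of(parameters), is.na)) %>%
    mutate(timestamp_rounded = rounding_function(timestamp, "minute")) %>%
    left_join(ferrybox_renamed, by = c("timestamp_rounded" = "timestamp"))

  # 5. Fill each missing value with the matching Ferrybox value.
  for (p in parameters) {
    filled[[p]] <- coalesce(filled[[p]], filled[[paste0(p, "_ferrybox")]])
  }

  # 6. Return only the timestamp and the parameter columns.
  filled %>% select(timestamp, all_of(parameters))
}

# Hypothetical rounding helper used only for this toy example.
round_minute <- function(timestamp, unit = "minute") {
  stopifnot(unit == "minute")
  as.POSIXct(round(as.numeric(timestamp) / 60) * 60,
             origin = "1970-01-01", tz = "UTC")
}

# Toy data: the first row is missing parameter "8002".
data <- data.frame(
  timestamp = as.POSIXct(c("2024-01-01 12:00:10", "2024-01-01 12:01:05"), tz = "UTC"),
  `8002` = c(NA, 7.5),
  check.names = FALSE
)
ferrybox_data <- data.frame(
  timestamp = as.POSIXct(c("2024-01-01 12:00:00", "2024-01-01 12:01:00"), tz = "UTC"),
  `8002` = c(6.9, 7.4),
  check.names = FALSE
)

result <- fill_from_ferrybox_sketch(data, ferrybox_data, "8002", round_minute)
```

In this sketch only the rows that had a missing value are returned, because the filtering step comes before the join. The original timestamp is kept in the output, while the rounded timestamp serves only as the join key.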

Examples

if (FALSE) { # \dontrun{
# Assuming you have a data frame `data` with missing values, a Ferrybox data frame `ferrybox_data`,
# and a rounding function `round_timestamp`.
filled_data <- handle_missing_ferrybox_data(data,
                                            ferrybox_data,
                                            c("8002", "8003", "8172"),
                                            round_timestamp)
} # }