The drop_summary function summarizes missing data (NA values) for each column of a dataframe. For each column it reports the number of dropout participants, the number of sections of consecutive NAs, the mode length of those sections, the number of isolated NAs, the total number of NAs, and the proportion of complete cases.

Usage

drop_summary(data)

Arguments

data

A dataframe for which to analyze missing data.

Value

A dataframe containing the following columns:

  • column: The name of each column in the input dataframe.

  • drop: The number of dropped rows (missing values) for that column.

  • sec_na: The number of sections of consecutive NAs for that column.

  • sec_length: The mode (most frequent length) of sections of consecutive NAs for that column.

  • single_na: The number of single NA values (isolated missing values) for that column.

  • na: The total number of missing (NA) values for that column.

  • complete: The proportion of complete rows for that column, where a value of 1 means no missing data, and values closer to 0 mean more missing data.
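To make the run-based columns more concrete, here is a minimal base R sketch that describes the NA runs in a single vector using rle(). The helper na_runs and its output names are hypothetical and not part of the package. The sketch assumes that a "section" is a run of two or more consecutive NAs, that a single NA is a run of length one, and that a trailing NA run can be read as dropout. drop_summary itself may partition these cases differently, for example by excluding the trailing run from the section count.

```r
# Hypothetical helper, not part of the package: describe the NA runs in one vector.
na_runs <- function(x) {
  runs <- rle(is.na(x))                  # run-length encode the NA mask
  na_len <- runs$lengths[runs$values]    # lengths of the NA runs only
  n_runs <- length(runs$lengths)

  # Assumption: an NA run that reaches the end of the vector is read as dropout.
  trailing <- if (n_runs > 0 && runs$values[n_runs]) runs$lengths[n_runs] else 0L

  # Assumption: a section is a run of two or more NAs (the trailing run included).
  sections <- na_len[na_len > 1]

  # Most frequent section length; ties resolve to the smallest length.
  mode_len <- if (length(sections) > 0) {
    counts <- table(sections)
    as.integer(names(counts)[which.max(counts)])
  } else {
    NA_integer_
  }

  list(
    trailing_na = trailing,
    sections = length(sections),
    section_mode_length = mode_len,
    single_na = sum(na_len == 1)
  )
}

x <- c(1, NA, 3, NA, NA, 6, NA, NA, 9, NA, NA, NA)
res <- na_runs(x)
str(res)
```

Run-length encoding keeps the logic short because every metric here is a simple function of the run lengths, so no explicit loop over the rows is needed.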

Details

The function calls a C API to compute some metrics, which are then processed and returned as a summary dataframe.

Examples

if (FALSE) {
# Example usage with the 'flying' dataframe
summary_result <- drop_summary(flying)
print(summary_result)
}
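For readers without the 'flying' dataframe at hand, the na and complete columns can be cross-checked with base R on a small toy dataframe. The data below is invented for illustration, and the sketch assumes that complete is the share of non-missing values in each column.

```r
# Toy data, invented for illustration; this is not the package's 'flying' dataset.
toy <- data.frame(
  q1 = c(1, 2, 3, 4),
  q2 = c(1, NA, 3, NA),
  q3 = c(1, NA, NA, NA)
)

na_count <- colSums(is.na(toy))    # total number of NAs per column
complete <- colMeans(!is.na(toy))  # share of non-missing values per column

check <- data.frame(
  column = names(toy),
  na = na_count,
  complete = complete,
  row.names = NULL
)
print(check)
```

If the assumption holds, calling drop_summary(toy) should report the same na and complete values for each column.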