Introduction
Survey data often suffer from dropout: participants stop answering partway through and leave the remaining sections incomplete. Handling dropout carefully is key to preserving data quality and to understanding participants' response patterns. The dropout package addresses this by providing tools to analyze and interpret participant behavior over the course of a survey.
Use cases of the dropout package
- Identifying the specific survey points where participants tend to stop completing the survey.
- Detecting sections that are frequently skipped by respondents.
- Quantifying the extent and locations of dropouts within the survey.
- Estimating the proportion of missing values attributed to dropouts in each column.
- Profiling respondents who discontinued the survey and pinpointing their dropout points.
library(dropout)
#> dropout package (v2.2.1) includes significant updates to the codebase, aimed at reducing unexpected behavior and minimizing dependencies.
#> If these changes cause issues with your existing code, you can access a previous version of the package from the archive.
#> For more information, visit:
#> https://github.com/hendr1km/dropout
Quantifying Dropout with drop_summary
The drop_summary function provides an overview of where and to what extent participants tend to stop answering questions. It highlights patterns of missing values, such as whether participants are skipping specific questions or entire sections of the survey.
drop_summary(flying)
#>                         column drop sec_na sec_length single_na  na complete
#>  1               respondent_id    0      0          0         0   0     1.00
#>  2            travel_frequency    0      0          0         0   0     1.00
#>  3                seat_recline   18    164         20         0 182     0.82
#>  4                      height    0    164          0        12 194     0.81
#>  5           children_under_18    1    164          0         6 189     0.82
#>  6                two_armrests    1    164          0         0 184     0.82
#>  7              middle_armrest    0    164          0         0 184     0.82
#>  8                window_shade    0    164          0         0 184     0.82
#>  9       moving_to_unsold_seat    1    164          0         0 185     0.82
#> 10         talking_to_seatmate    0    164          0         0 185     0.82
#> 11 getting_up_on_6_hour_flight    0    164          0         0 185     0.82
#> 12 obligation_to_reclined_seat    1    164          0         0 186     0.82
#> 13       recline_seat_rudeness    0    164          0         0 186     0.82
#> 14   eliminate_reclining_seats    0    164          0         0 186     0.82
#> 15          switch_for_friends    4    164          0         0 190     0.82
#> 16           switch_for_family    0    164          0         0 190     0.82
#> 17     wake_passenger_bathroom    0    164          0         0 190     0.82
#> 18         wake_passenger_walk    0    164          0         0 190     0.82
#> 19               baby_on_plane    1    164          0         0 191     0.82
#> 20             unruly_children    0    164          0         0 191     0.82
#> 21       electronics_violation    0    164          0         0 191     0.82
#> 22           smoking_violation    0    164          0         0 191     0.82
#> 23                      gender    6      0          0         0  33     0.97
#> 24                         age    0      0          0         0  33     0.97
#> 25            household_income    0      4          2       177 214     0.79
#> 26                   education    0      4          0         2  39     0.96
#> 27      location_census_region    9      0          0         0  42     0.96
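To make the na and complete columns easier to read, here is a minimal base R sketch of how such per-column figures could be computed by hand. It assumes that na counts the missing values in a column and that complete is the share of non-missing values; this reading is consistent with the output above but is not taken from the package source. The toy data frame and its column names are hypothetical.

```r
# Hypothetical toy survey, not the flying dataset
toy <- data.frame(
  id = 1:5,
  q1 = c("a", "b", NA, "a", "b"),
  q2 = c(1, NA, NA, 4, 5),
  q3 = c(NA, NA, NA, 2, 1)
)

# Number of missing values in each column
na_count <- colSums(is.na(toy))

# Assumed definition: share of non-missing values, rounded to two digits
complete <- round(1 - na_count / nrow(toy), 2)

manual_summary <- data.frame(
  column = names(toy),
  na = unname(na_count),
  complete = unname(complete)
)
print(manual_summary)
```

Unlike drop_summary, this sketch does not separate dropout from skipped sections or single missing answers; it only reproduces the overall missingness per column.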
Detecting Specific Dropouts with drop_detect
For a more detailed analysis, the drop_detect function identifies individual participants who dropped out of the survey. It returns the index of the participant and the column where the dropout occurred, helping you focus on the critical dropout points.
drop_detect(flying) |>
  head()
#>    drop drop_index       column
#> 1  TRUE          3 seat_recline
#> 2 FALSE         NA         <NA>
#> 3 FALSE         NA         <NA>
#> 4 FALSE         NA         <NA>
#> 5 FALSE         NA         <NA>
#> 6 FALSE         NA         <NA>
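The idea behind a row-wise dropout index can be illustrated with a short base R sketch. It treats dropout as an unbroken run of missing values that extends to the last column and reports the column where that run starts. This is a simplified assumption for illustration, not the package's implementation, and the helper find_drop_index and the toy data are hypothetical.

```r
# Hypothetical toy survey, not the flying dataset
toy <- data.frame(
  id = 1:4,
  q1 = c("a", "b", NA, "a"),
  q2 = c(1, NA, 3, 4),
  q3 = c(2, NA, 1, NA),
  q4 = c(5, NA, 2, NA)
)

# For one row's missingness vector, return the index of the first column
# in the trailing run of NAs, or NA if the last column was answered.
find_drop_index <- function(is_missing) {
  n <- length(is_missing)
  if (!is_missing[n]) return(NA_integer_)
  answered <- which(!is_missing)
  if (length(answered) == 0) return(1L)  # nothing answered at all
  max(answered) + 1L
}

drop_index <- unname(apply(is.na(toy), 1, find_drop_index))

result <- data.frame(
  drop = !is.na(drop_index),
  drop_index = drop_index,
  column = names(toy)[drop_index]
)
print(result)
```

In this toy data, row 3 skips q1 but answers everything after it, so it is not counted as a dropout. Note that the sketch flags even a single missing final item as dropout, a case the package may handle differently.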