Guide to automated quality checks

This article is aimed at helping users make the most out of our automated quality checks feature.

Some data quality concerns can be addressed by reviewing individual submissions, but others only become visible when you look at the distribution of the data as a whole. This is why monitoring data in real time is so important: it lets you detect data problems and act on them in a timely manner. SurveyCTO includes a set of features that make this task easier, including automated quality checks.

In this guide, we will walk you through automated quality checks by illustrating some examples, explaining how to read and interpret quality checks reports, and, finally, explaining how to combine this feature with other SurveyCTO monitoring tools.

Examples on how to use automated quality checks

Automated quality checks can be set up and used in many ways, depending on the needs of each project. In this article, we will explore three specific examples that can be easily adapted to your own forms.

If you would like to test some of these quality checks on your own, and you don’t have a form or data, take a look at how to deploy a sample form with sample data.

Example 1: Are there forms with a duration value considerably different from others?

Assessing the duration of the survey is a common way to evaluate the data quality of a form. While designing, testing, and piloting your form, you will likely get a good idea of how long the questionnaire takes to complete. For submissions where the duration is considerably lower or higher than for others, you might want to review the data to see if there were any anomalies during the survey.

The "Value is an outlier" quality check is the appropriate one for this purpose. Here are the steps to follow:

Click here to open in Google Slides.
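
To make the idea of an "outlier" more concrete, here is a minimal sketch of one common outlier rule, based on the interquartile range (IQR). It is an illustration only: we are assuming an IQR-style rule and a multiplier of 1.5, the function name iqr_outliers and the sample durations are hypothetical, and the exact rule applied by the feature is described in the product documentation.

```python
from statistics import quantiles


def iqr_outliers(durations, multiplier=1.5):
    """Return the positions of values outside [Q1 - m*IQR, Q3 + m*IQR].

    Assumption: a standard IQR rule, used here purely for illustration.
    """
    q1, _, q3 = quantiles(durations, n=4, method="inclusive")
    iqr = q3 - q1
    low = q1 - multiplier * iqr
    high = q3 + multiplier * iqr
    return [i for i, d in enumerate(durations) if d < low or d > high]


# Hypothetical survey durations, in seconds.
durations = [1800, 1750, 1900, 1820, 300, 1780, 1850, 5400]

# Positions of the suspiciously short (300) and long (5400) submissions.
print(iqr_outliers(durations))  # [4, 7]
```

A smaller multiplier flags more submissions, while a larger one flags only the most extreme durations.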

Example 2: Are there too many “Don’t know” answers across submissions?

Given the sensitive or specific nature of some questions, it is common to allow a "Don't know" option in a number of fields. While this is a valid option, it is important to keep track of how often it is selected and to make sure enumerators are putting in appropriate effort to collect important data.

The "Value is too frequent" quality check is the appropriate one for this purpose. Here are the steps to follow:

Click here to open in Google Slides.
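
The logic behind this kind of check can be sketched in a few lines: compute the share of submissions that contain a given value and compare it with a threshold. This is only an illustration under our own assumptions: the code -99 for "Don't know", the 25% threshold, and the function names are hypothetical and not taken from the feature itself.

```python
from collections import Counter


def value_share(answers, value):
    """Return the share of answers equal to `value` (0.0 for no answers)."""
    if not answers:
        return 0.0
    return Counter(answers)[value] / len(answers)


def is_too_frequent(answers, value, max_share):
    """True when `value` makes up more than `max_share` of all answers."""
    return value_share(answers, value) > max_share


# Hypothetical answers to one field; "-99" stands for "Don't know".
answers = ["yes", "-99", "no", "-99", "yes", "-99", "no", "yes", "-99", "yes"]

print(value_share(answers, "-99"))            # 0.4
print(is_too_frequent(answers, "-99", 0.25))  # True
```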

Example 3: Are some enumerators recording consent refusals more often than others?

If you have a "consent" field in your survey, it is good practice to assess whether there is an unusually high number of submissions where respondents refused to consent to the survey. To explore this further, you might also assess whether some enumerators select the "No" option for the "consent" field more often than others.

The "Group distribution is different" quality check is the appropriate one for this purpose. Here are the steps to follow:

Click here to open in Google Slides.
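
As a rough illustration of what a group-based comparison looks for, the sketch below computes the refusal rate per enumerator and flags anyone whose rate is far from the overall rate. This is a deliberately simple comparison of shares, not the statistical test the feature may apply; the 0.2 gap, the function names, and the sample data are all hypothetical.

```python
from collections import defaultdict


def refusal_rates(rows):
    """Return {enumerator: share of submissions with consent == 0}."""
    totals = defaultdict(int)
    refusals = defaultdict(int)
    for enumerator, consent in rows:
        totals[enumerator] += 1
        if consent == 0:
            refusals[enumerator] += 1
    return {e: refusals[e] / totals[e] for e in totals}


def flag_enumerators(rows, max_gap=0.2):
    """Return enumerators whose refusal rate differs from the overall
    rate by more than `max_gap` (a simple, assumed rule)."""
    rows = list(rows)
    overall = sum(1 for _, consent in rows if consent == 0) / len(rows)
    rates = refusal_rates(rows)
    return sorted(e for e, r in rates.items() if abs(r - overall) > max_gap)


# Hypothetical data: (enumerator, consent), with 1 = "Yes" and 0 = "No".
rows = (
    [("A", 1)] * 9 + [("A", 0)] * 1
    + [("B", 1)] * 9 + [("B", 0)] * 1
    + [("C", 1)] * 4 + [("C", 0)] * 6
)

print(flag_enumerators(rows))  # ['C']
```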

All automated quality checks take into account all submissions for that form. Take a look at our product documentation to learn more.

Understanding quality checks reports

After setting up automated quality checks for your form, it is key to keep an eye on warnings and understand how to act on them. All data quality warnings are logged in a quality check report that you can run and download from the Monitor tab at any time. We recommend enabling two options to make this easier (under the Automated quality checks section, click on Options next to your form title):

  1. Run all checks nightly: this setting will automatically run the report every night.
  2. Send email summary of quality-check reports to emails specified below?: whenever a report is run, it will be sent to the specified email addresses.

Both will help you notice warnings at the right time.

The quality checks reports can be downloaded in CSV format (under the Automated quality checks section, click on Report next to your form title), and the columns can be summarized as below:

  1. Unique identifiers: The columns id, dataset-id, warning-id, row-id and group-id display the unique identifier of the quality check, the form, the quality check type, the submission (KEY) and the group associated with the quality check, respectively.
  2. Quality Check characteristics: The columns field, group-field, value, and critical will let you know more about the quality check configuration related to the warning:
    1. field: Form field related to the quality check.
    2. group-field: Form field that defines the group (only for group-based quality checks).
    3. value: Value set for thresholds.
    4. critical: 1 if critical, 0 if not (determined when creating the quality check).
  3. Data-quality warning information: The columns last-reported and warning will describe the warning in detail, mentioning the values that triggered the quality checks.
    1. last-reported: Date and time the warning was last reported, that is, when the quality checks were last run and the issue was flagged.
    2. warning: Details about the fields, groups and field values that triggered the quality check.
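
If you want to process the report outside of SurveyCTO, the columns described above can be read with any CSV tool. Here is a minimal sketch that lists the submissions (row-id) with at least one critical warning. The column names come from the list above, but the sample rows, the identifier and date formats, and the function name critical_submissions are invented for illustration, so check them against a report downloaded from your own server.

```python
import csv
import io

# Invented sample report; real values and formats may differ.
SAMPLE = """id,dataset-id,warning-id,row-id,group-id,field,group-field,value,critical,last-reported,warning
1,my_form,outlier,uuid:aaa,,duration,,1.5,1,2024-01-01 02:00:00,duration is an outlier
2,my_form,too_frequent,,,q5,,0.25,0,2024-01-01 02:00:00,value -99 is too frequent
3,my_form,outlier,uuid:bbb,,duration,,1.5,1,2024-01-01 02:00:00,duration is an outlier
"""


def critical_submissions(report_file):
    """Return the row-id of every critical warning tied to a submission."""
    keys = []
    for row in csv.DictReader(report_file):
        # Some warnings have no row-id; skip those.
        if row["critical"] == "1" and row["row-id"]:
            keys.append(row["row-id"])
    return keys


print(critical_submissions(io.StringIO(SAMPLE)))  # ['uuid:aaa', 'uuid:bbb']
```

To use it with a downloaded report, open the file (for example with open("report.csv", newline="", encoding="utf-8"), where the file name is a placeholder) and pass the file object to the function.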

Combining data monitoring features

Automated quality checks can also complement other SurveyCTO monitoring tools:

  1. If you are using the Data Explorer to track and share the most important metrics for your organization, automated quality check warnings will appear under relevant graphs and/or tables, allowing you to explore them.


  2. If you are using the Review and Correction workflow, automated quality checks can determine which submissions to put on hold for review, instead of just reviewing a random subset of submissions or all submissions.

