Guide to dataset publishing

This support article is accompanied by the two sample workflows in this folder. For help deploying them to your server, check out our support article Deploying form definitions and server datasets.

In SurveyCTO, you can use server datasets to store and organize data from both your SurveyCTO forms and sources outside the platform. That data can then be used as pre-load data. In this article, we will go over the features available when publishing form data to a server dataset.

1. Overview

A server dataset is a repository of data on your SurveyCTO server, separate from your form data. Server dataset data is stored in a table, just like a CSV file or spreadsheet, so it has rows and columns. This data can be used for both pre-loading and dataset publishing.

You can set up a form so that when your server receives a submission for that form, its data is published to the server dataset. This can be used for consolidating data from multiple forms in a central location, using data from one form in another form, and more.

When a form's data is published to a server dataset:

  1. Data can be published to one row in wide format, or to multiple rows in long format (see 2. Publishing format).
  2. Data can either be added as a new row, or it can update an existing row (see 3. Updating server dataset rows).
  3. Data can be published conditionally (see 4. Conditional updates).
  4. Data collected in SurveyCTO Collect can be pre-loaded into forms, even while offline (see 5. Offline dataset publishing).

Check out these videos to learn how to set up dataset publishing:

  1. Create a server dataset
  2. Set up dataset publishing

2. Publishing format

You can publish form data to a server dataset in two formats: wide format and long format.

2.1 Wide format publishing

In wide format publishing, each form submission adds or updates one row of the server dataset.


For example, let’s say you are tracking the temperature of a group of patients by visiting them frequently and collecting health data, including their temperature. In this case, you fill out a form for each patient and, every time the form is submitted, it adds or updates a single patient's row in the server dataset. To update another patient, the form will have to be filled out again.

When repeated data is published in wide format, it will all be published to the same row. Each repeat instance of each repeated field will publish to a different column. The column header will be the field name, followed by an underscore, followed by the repeat index (e.g. 'name_1', 'name_2', 'name_3', etc). New columns will be generated as needed.
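The column naming described above can be illustrated with a short Python sketch. This is not SurveyCTO code: the `flatten_wide` helper and the sample field names are hypothetical, and the sketch only mimics the "field name, underscore, repeat index" rule.

```python
def flatten_wide(submission):
    """Flatten one submission into a single wide-format row (illustration only)."""
    row = {}
    for field, value in submission.items():
        if isinstance(value, list):
            # Repeated field: one column per repeat instance, indexed from 1.
            for index, item in enumerate(value, start=1):
                row[f"{field}_{index}"] = item
        else:
            # Non-repeated field: a single column named after the field.
            row[field] = value
    return row


# Hypothetical submission with a repeated field "name".
submission = {"household_id": "H001", "name": ["Ana", "Ben", "Chi"]}
wide_row = flatten_wide(submission)
print(wide_row)
```

A submission with a fourth repeat instance would simply produce an additional 'name_4' column.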

2.2 Long format publishing

In long format publishing, each form submission can update multiple rows of the server dataset using a repeat group. Each instance of the repeat group will add or update a different row of the server dataset.

For example, if you are collecting details about a list of students using a repeat group in a form, where each repeat instance will correspond to a different student, you would be able to update multiple rows of a server dataset where each row corresponds to a different student.


Long format publishing has many other uses, such as when updating household members’ details from a household survey.
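As a rough illustration of long format, the following Python sketch turns each repeat instance of a submission into its own row. The `long_rows` helper, the "students" repeat group, and the field names are assumptions made for this example; they are not part of SurveyCTO.

```python
def long_rows(submission, repeat_group):
    """Return one row per instance of the given repeat group (illustration only)."""
    return [dict(instance) for instance in submission[repeat_group]]


# Hypothetical submission with a repeat group called "students".
submission = {
    "school_id": "S01",
    "students": [
        {"student_id": "ST1", "grade": 4},
        {"student_id": "ST2", "grade": 5},
    ],
}

student_rows = long_rows(submission, "students")
for row in student_rows:
    print(row)
```

Here one submission yields two rows, one per student, where wide format would have produced a single row.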

3. Updating server dataset rows

3.1 Specifying which row to update

When you specify a form field using Form field to identify unique records, the field mapping for that field will be used to determine which server dataset row should be updated. If the value of that form field exists in the server dataset column it publishes to, it will update that row; if it doesn't exist, it will add a new row instead.

If you don't specify a Form field to identify unique records, a new row will be added for every submission (or, in the case of long format publishing, a new row for every repeat instance).
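The add-or-update rule can be sketched in Python as follows. This is a simplified model under stated assumptions, not the server's implementation: the dataset is a list of dictionaries, `publish` is a hypothetical helper, and matching rows are simply overwritten (the update modes are covered in 3.2 Field mapping).

```python
def publish(dataset, row, unique_column=None):
    """Add `row` to `dataset`, or update the matching row (illustration only)."""
    if unique_column is not None:
        for existing in dataset:
            if existing.get(unique_column) == row.get(unique_column):
                # The identifying value already exists: update that row.
                existing.update(row)
                return dataset
    # No unique field specified, or no matching value: add a new row.
    dataset.append(dict(row))
    return dataset


dataset = [{"patient_id": "P1", "temperature": "37.5"}]
publish(dataset, {"patient_id": "P1", "temperature": "36.9"}, unique_column="patient_id")
publish(dataset, {"patient_id": "P2", "temperature": "38.0"}, unique_column="patient_id")
print(dataset)
```

After these two calls the dataset holds two rows: the row for P1 was updated, and a new row was added for P2.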


3.2 Field mapping

Field mapping is used to determine which form fields will update which columns of the server dataset. Add a field mapping using the Add button.

When setting up a field mapping, you can specify one of three ways to update a server dataset column: replace, add, or concatenate. This is applied whenever a server dataset row is updated (instead of a new row being added).


Replace the existing server dataset value with the new value from the form submission.

Let's take a look at the patient temperature tracker as an example. Let's say we want the server dataset to store whether the patient currently has a healthy temperature (between 36.1 and 37.2 inclusive). Whenever there is a new submission, the 'healthy_temp' value of that patient in the server dataset will always be replaced.


That way, we can easily check the latest health status of each of our patients.
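A minimal Python sketch of the replace behavior in this example follows. The `healthy_temp` function and the 1/0 encoding are assumptions made for illustration; in the actual workflow the value would be calculated in the form itself.

```python
def healthy_temp(temperature):
    """Return 1 if the temperature is between 36.1 and 37.2 inclusive, else 0."""
    return 1 if 36.1 <= temperature <= 37.2 else 0


# Existing server dataset row for one patient (hypothetical values).
row = {"patient_id": "P1", "healthy_temp": 0}

# Replace: the new submission's value overwrites whatever was stored before.
row["healthy_temp"] = healthy_temp(36.8)
print(row)
```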



Add a form submission's numeric value to a server dataset numeric value, and publish the sum to the server dataset.

In our patient temperature tracker, it is also important to track the number of visits to each patient. To count the number of visits, we will always add 1 (field "add_visit") to the current value of the server dataset column 'num_visits' whenever a new form is submitted.


That way, every time there is a new submission for that patient, their 'num_visits' increases by 1, so 'num_visits' will store the total number of visits. This works even if there are multiple submissions at once.
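The add behavior amounts to a running sum, as in this Python sketch. The `apply_add` helper is hypothetical, and treating a missing or empty stored value as 0 is an assumption of the sketch rather than documented behavior.

```python
def apply_add(existing, submitted):
    """Return the stored value plus the submitted value (illustration only)."""
    # Assumption: a missing or empty stored value counts as 0.
    return int(existing or 0) + int(submitted)


row = {"patient_id": "P1", "num_visits": 3}

# Three submissions arrive, each with add_visit = 1.
for add_visit in (1, 1, 1):
    row["num_visits"] = apply_add(row["num_visits"], add_visit)

print(row["num_visits"])
```

Each submission contributes its own increment, so the total reflects every visit.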



Concatenate the form submission value to the beginning or end of the server dataset value, so that the column holds a list of all values. By default, the list separator is a pipe (|), but this can be customized.

To track our patients’ temperature progress, it is helpful to see all the temperatures registered so far. Instead of storing them in separate columns, we can create a list of temperature values in a single server dataset column 'temperatures'. Every time there is a new submission, the value of the field "temperature" will be appended to the beginning of the list in the server dataset column 'temperatures', separated by a pipe (|).


For example, if the column 'temperatures' has a value of 37.5|37.9, and then a form is submitted where the field "temperature" has a value of 37, then the new value of 'temperatures' will become 37|37.5|37.9.
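The same example can be reproduced with a small Python sketch. The `concatenate` helper and its parameters are hypothetical, and returning the new value alone when the column is empty is an assumption of the sketch.

```python
def concatenate(existing, new, separator="|", prepend=True):
    """Join a new value onto a separator-delimited list (illustration only)."""
    if not existing:
        # Assumption: an empty column simply takes the new value.
        return new
    parts = [new, existing] if prepend else [existing, new]
    return separator.join(parts)


temperatures = concatenate("37.5|37.9", "37")
print(temperatures)  # 37|37.5|37.9
```

Passing `prepend=False` would place the new value at the end of the list instead.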


4. Conditional updates

If you specify Include form submissions whenever this field is 1, then the form submission data will only be published to the server dataset if that field's value is equal to 1.

In the patient temperature tracker workflow, there are times when the enumerator is not able to measure or collect the patient's temperature. To account for this, the field "got_temperature" asks the enumerator, "Were you able to get the patient's temperature?". If "Yes", then this field will have a value of 1, and the submission's data will be published to the server dataset. Otherwise, its data will not be published.
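The condition can be thought of as a simple filter applied before publishing, as in this Python sketch. The `should_publish` helper is hypothetical, and comparing values as text is an assumption made so that both 1 and "1" match.

```python
def should_publish(submission, include_field=None):
    """Decide whether a submission is published (illustration only)."""
    if include_field is None:
        # No condition configured: every submission is published.
        return True
    # Publish only when the chosen field's value equals 1.
    return str(submission.get(include_field)) == "1"


measured = {"patient_id": "P1", "got_temperature": 1, "temperature": 37.0}
not_measured = {"patient_id": "P2", "got_temperature": 0}

print(should_publish(measured, "got_temperature"))      # True
print(should_publish(not_measured, "got_temperature"))  # False
```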


Conditional updates can also be done on a field-by-field basis using relevance in the form. A field will only be published to the server dataset if it is relevant.

5. Offline dataset publishing

Advanced offline feature
This is an advanced offline feature. Advanced offline features are not part of a standard subscription. Get in touch to activate advanced offline functionality.

You can turn on offline dataset publishing from the server dataset's Settings on the Design tab. When Allow offline updates is turned on, SurveyCTO can publish form data to an offline version of the server dataset without submitting it to the server first. That way, the data is available for pre-loading immediately. You can learn how to set this up in this video, and you can learn more in our documentation Offline dataset publishing.
