Using data from another form – SurveyCTO Support Center

This support article is accompanied by the sample workflows in this folder. If you are a beginner, start with the beginner workflow. For help deploying the workflow, check out our support article on deploying form definitions and server datasets.

There are times where two forms are completed separately, but the second form needs data submitted by the first. For example, what if you are completing a midline survey, and you need the list of household members gathered during the baseline survey?

Using both server dataset publishing and pre-loading, you can publish data from one form to a server dataset, and then pull the data from the server dataset into a second form. In this support article, we will guide you and link you to resources so you can learn:

How to set up server dataset publishing.
How to set up pre-loading from a server dataset.
How to keep your collection devices up-to-date with the latest pre-load data.
1. Make form data available after syncing with server
2. Make form data available for pre-loading immediately, even when offline
Demonstrate a full working example that illustrates the above.

In the above chart, you can see that the enumerator fills out Form A, and when it is completed, its data is published to the server Dataset. Then, the data in the server dataset is pulled by Form B and Form C using the pulldata() function.

Note that form A could hypothetically pre-load data from the server dataset too. These features are very flexible.

1. Publishing data to the server dataset

First, you need to publish the data from the first form into a server dataset. If you are not familiar with dataset publishing, check out these resources, then come back here:

How-to video: How to create a server dataset
How-to video: How to set up dataset publishing
Support article: Guide to dataset publishing
Documentation: Introduction to advanced dataset usage
Documentation: Publishing form data into server datasets
Webinar: Build smarter forms with server dataset workflows

Data stored in a server dataset can be used by any other form. So, once server dataset publishing is set up, data published from that first form can be used by other forms.

Tip: It is a good idea for the server dataset column headers to have the same names as the form fields that publish to them. For example, you can set up the "name" field of the form to publish to the 'name' column of the server dataset, set up the "phone_number" field to publish to the 'phone_number' column, etc. This will make it easier to keep track of which fields publish to which columns.

2. Pulling data from the server dataset

Once the device has been updated with the latest server dataset data (which we'll discuss in section 3), the pulldata() function is used in a subsequent form to pull data from that server dataset. If you are not familiar with pre-loading and pulling data, check out these resources:

Documentation: Pre-loading data into a form
Webinar: Learn how to save time and enhance data quality by pre-loading data

Using the pulldata() function, you can pull server dataset data published by the first form, into the second form.

3. Device setup (including automatic updates)

3.1 Without offline dataset publishing

This section is for servers that don't have offline server dataset updates enabled. To learn how to use data from another form immediately, without needing a connection to the internet, check out the next section.

It is important to keep in mind that in order to pull data from the first form, the first form instance has to be submitted to the server and published to the server dataset. Then, once server dataset publishing is complete (this can take up to five minutes), SurveyCTO Collect will need to download and install the latest version of the second form, which will include the updated server dataset. That way, the second form will be able to use the latest data submitted by the first form.

Form instances completed using the first form (the publishing form) should be submitted to the server as soon as possible. If they are not submitted, their data will not be published to the server dataset, and the second form will not be able to use that new data.

Similarly, the second form will need to be updated every time the server dataset is updated (e.g. every time new data is published to it) by downloading and installing the latest form definition. If devices are not updated with the latest form and dataset data, they will only be able to pull old, outdated data. Device data updates are required even if the second form (which does the pre-loading) will be completed on the same device that submitted the data.

It is possible to do all of this manually, but this involves making sure all enumerators submit their completed form instances in a timely manner, and also installing the latest form definition every time there is even a tiny change to the server dataset it pulls from. Instead, you can also use SurveyCTO Collect settings to make sure this all happens automatically.

Follow these steps to set up automatic submissions and updates:

From the main menu, tap the three-dot menu in the upper-right.
Tap General Settings.
Checkmark all of the settings under Auto Send/Receive. These are:

Auto send with Wi-Fi
Auto send with network
Auto send with Wi-Fi
Auto download with network
Auto download on demand
Auto install downloaded updates
Display send/receive status (optional, but recommended)

With these settings enabled, whenever a form is marked as finalized, it will be submitted to the server as soon as the device is online. Also, whenever an enumerator opens a form while online, SurveyCTO Collect will check for any updates, and then install them if any are found. That way, enumerators will have the latest baseline data attached to the midline form.

Tips:

If mobile internet service will be limited or expensive, you may want to disable the "network" Auto Send/Receive settings so you don't accidentally go over your data limit.
If any of those settings are not available, you can enable them from the Admin Settings.

You can apply these settings manually on each device, but you can also use quick setup so when the user logs into the device, these settings are automatically applied. To learn more, check out our documentation on managing device settings (section Default device configurations and quick setup), as well as our best practices for configuring and securing enumerator devices.

3.2 With offline dataset publishing

Advanced offline feature

This is an Advanced offline feature. Upgrade your price plan to include these features or contact our Sales team to discuss an enterprise agreement.

You can set up your server dataset so that published form data is available for pre-loading immediately, without submitting it to the server first. For example, you can complete form A, which publishes that data to server dataset 1, and then immediately open form B, which pre-loads data from server dataset 1; that way, form B can use data just entered using form A, even though it wasn't submitted to the server yet.

Setting this up is easy. Check out this video to learn how to enable offline server dataset updates

And that's it! When you fill out that form using SurveyCTO Collect, you will be able to use form data for pre-loading immediately.

If you downloaded the server dataset BEFORE offline updates were enabled (by downloading a form that server dataset is attached to), you will need to re-retrieve that server dataset using Get Blank Form (again, by downloading a form that server dataset is attached to) to retrieve the version that allows offline updates.

You can learn more in our documentation Fully offline data collection workflows.

4. Examples

These examples include sample workflows. For assistance deploying them to your own server, check out our support article on deploying form definitions and server datasets.

The sample forms ask about the crops grown by the household, the household size, and each household member's name and annual income. When the first form, the baseline form, is submitted to the server, its data is published to a server dataset. Then, the second form, the midline form, uses the pulldata() function to pull that data from the server dataset.

Basic workflow

This example is accompanied by the beginner setup in this folder.

In this example, in the household midline form, take a look at the calculate field "pull_crops" on row 13. It uses this calculation to pull the space-separated list of crops grown:

pulldata('beginner_hh_dataset', 'crops_grown', 'hh_key', ${hh_id})

The value of that field is then used in different properties of the field "crops_produced" on row 18:

The choice_filter expression uses the field "pull_crops" to make sure that only crops that were selected in the baseline form appear.
The relevance expression uses the field "pull_crops" so if "None of the above" had been selected, then that field does not appear at all.

A similar process is used to pull other pieces of data from the CSV file. For example:

On row 14, the name of the head of the household is pulled in the field "head_name" using this calculation:
```
pulldata('beginner_hh_dataset', 'hh_name', 'hh_key', ${hh_id})
```
On row 15, the last income reported is pulled in the field “last_income” using this calculation:
```
pulldata('beginner_hh_dataset', 'income', 'hh_key', ${hh_id})
```

The only difference in between these calculations is the second parameter; we are pulling from the same server dataset and row each time, but we are pulling from different columns ('hh_name' and ‘income’, respectively). That way, the enumerator knows the name of the household head so they know whom to ask for when they arrive on site, and the form can calculate the "income_diff" using the last income reported.

This sample only pulls a handful of fields from the server dataset, but you can pre-load as many fields as you need.

You cannot publish data from encrypted fields. If a form is encrypted, and you would like a field to be published, make sure the field has a publishable value of yes.

Intermediate workflow: Pulling repeated data

This example is accompanied by the intermediate setup in this folder.

When repeated data is published to a server dataset, it is stored in wide format. It takes the original field name, concatenates (adds on) an underscore _ to the end, followed by the repeat index number. For example, take a look at this part of the form definition:

The field "name" is a repeated field. So, the first instance of the field "name" will publish to the 'name_1' column, the second instance of "name" will publish to the 'name_2' column, and so on. Here is an example of what that data may look like for households with the 'id' values of 101 and 102:

id	name_1	name_2	name_3	name_4
101	Adnan	Bhavna	Charles	Dana
102	Aiden	Bailey	Calvin	Delilah

This has been simplified to only show the 'id' and 'name_#' columns. In practice, there will also be 'age_#' columns, 'income_#' columns, and more.

Learn more about the format of exported (and published) data in the product documentation.

With the above in mind, you can construct an expression that can pre-load repeated data stored in wide format. First, you will need to construct the column name you would like to pull from. In the midline form, the field "hh_index" has a calculation of simply index(). So, the first instance of that field will return 1, the second instance will return 2, and so on. Next, to construct the column name, you will need to combine the original field name with the index using the concat() function. Take a look at this expression:

concat('income_', ${hh_index})

That expression takes the name of the field from the baseline form, "income", and concatenates an underscore to the end of it, followed by the repeat index. That way, the first repeat instance of that field will return "income_1", the second will return 'income_2', and so on. If you look at the server dataset data, you'll recognize these as column headers.

Finally, you can use that generated column header name in a pulldata() function. For example, the calculate field "last_income" on row 21 has this calculation:

pulldata('intermediate_hh_dataset', concat('income_', ${hh_index}), 'hh_key', ${hh_id})

The second parameter (concat('income_', ${hh_index})), instead of using the literal column header name, uses that concat() expression to dynamically build the column header name based on the repeat index number. That way, the first repeat instance of the calculation of the field "last_income" pulls from the 'income_1' column, the second repeat instance pulls from the 'income_2' column, and so on. In other words, the first repeat instance will pull the income of household member 1, the second repeat instance will pull the income of household member 2, and so on. That way, the income difference can be calculated for each individual household member.

Advanced workflow

If you would like to try a more advanced version of publishing and pre-loading data, check out the files in this folder. In that system, it will ask which household members from the baseline form are still household members in the midline form, ask what happened to previous household members, and allow the adding of new household members.

Do you have thoughts on this support article? We'd love to hear them! Feel free to fill out this feedback form.