This series describes the technical implementation steps to generate and retrieve a dataset. The dataset generation process is a series of steps using Data API endpoints. Once generated, the dataset can be downloaded, or used with Reports API to render a completed report, depending on the dataset type.
A dataset is a set of data files (sometimes many thousands of files) containing aggregated results and analysis in JSON and CSV formats. The exact contents and format of a dataset will vary depending on the dataset type. The dataset is generated asynchronously using Data API. Once completed, the dataset can be retrieved via Data API or rendered in-browser using Reports API, if the dataset type supports it.
Dataset generation is performed asynchronously as follows:
1. Initialize a new dataset using Data API: Create and configure a new dataset. You must supply a dataset_id in UUID format when you do this. The endpoint returns one or more URLs to which the input files must be uploaded. If no input files need to be uploaded, the endpoint instead returns the dataset_type and a job_reference, and you can proceed directly to step 3 below.
2. Upload input files and start the job:
a. Upload input files: Upload the input files required for the dataset. Each file is uploaded to the corresponding URL returned in step 1.
b. Commence dataset generation job using Data API: Notify Data API that the input files have been uploaded and start the dataset generation job.
3. Poll for job completion using Data API: The status of the dataset generation job can be obtained by polling the
4. Retrieve results via Data API or Reports API (if applicable): On completion, raw report data can be retrieved by your server application using Data API. If the dataset type has a corresponding report type in Reports API, the report can render the new dataset, and raw data can also be accessed from Reports API's public methods.
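The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the actual Data API client: the method names (initialize, upload, commence, job_status), the upload_urls response key, and the status strings "completed" and "failed" are all hypothetical stand-ins for whatever the real endpoints return. Only dataset_id, dataset_type, and job_reference come from the description above. The in-memory FakeClient exists only so the flow can be exercised without a network.

```python
import time
import uuid
from typing import Callable, Protocol, Sequence


class DataApiClient(Protocol):
    """Hypothetical client interface; every method name here is an assumption."""

    def initialize(self, dataset_id: str, dataset_type: str) -> dict: ...
    def upload(self, url: str, payload: bytes) -> None: ...
    def commence(self, dataset_id: str) -> str: ...
    def job_status(self, job_reference: str) -> str: ...


def generate_dataset(
    client: DataApiClient,
    dataset_type: str,
    input_files: Sequence[bytes],
    poll_interval: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run steps 1 to 3 and return the dataset_id once the job completes."""
    # Step 1: the caller chooses the dataset_id, in UUID format.
    dataset_id = str(uuid.uuid4())
    init = client.initialize(dataset_id, dataset_type)

    # "upload_urls" is an assumed key name for the returned upload URLs.
    upload_urls = init.get("upload_urls", [])
    if upload_urls:
        if len(upload_urls) != len(input_files):
            raise ValueError("number of input files does not match upload URLs")
        # Step 2a: upload each input file to its URL.
        for url, payload in zip(upload_urls, input_files):
            client.upload(url, payload)
        # Step 2b: tell the API the uploads are done and start the job.
        job_reference = client.commence(dataset_id)
    else:
        # No uploads required: a job_reference is returned directly.
        job_reference = init["job_reference"]

    # Step 3: poll until the job finishes (status strings are assumptions).
    for _ in range(max_polls):
        status = client.job_status(job_reference)
        if status == "completed":
            return dataset_id
        if status == "failed":
            raise RuntimeError(f"dataset generation failed for {dataset_id}")
        sleep(poll_interval)
    raise TimeoutError(f"job {job_reference} did not finish in time")


class FakeClient:
    """In-memory stand-in used only to exercise the flow without a network."""

    def __init__(self, needs_upload: bool) -> None:
        self.needs_upload = needs_upload
        self.uploaded: list = []
        self.polls = 0

    def initialize(self, dataset_id: str, dataset_type: str) -> dict:
        if self.needs_upload:
            return {"upload_urls": ["https://example.invalid/upload/1"]}
        return {"dataset_type": dataset_type, "job_reference": "job-1"}

    def upload(self, url: str, payload: bytes) -> None:
        self.uploaded.append((url, payload))

    def commence(self, dataset_id: str) -> str:
        return "job-1"

    def job_status(self, job_reference: str) -> str:
        # Report completion on the third poll.
        self.polls += 1
        return "completed" if self.polls >= 3 else "in_progress"
```

Passing the sleep function as a parameter keeps the polling loop testable; for example, generate_dataset(FakeClient(needs_upload=True), "example-type", [b"data"], sleep=lambda s: None) runs the whole flow instantly. Step 4 (retrieving or rendering the results) is left out because it depends on the dataset type.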