Datasets: activity-analysis-by-question Reference

Use this dataset to access raw data from Learnosity for your own custom data analysis and reporting.

See Implementing and Creating Datasets for Download or Reporting for more information about datasets.

Usage

Requests to the Data API use the following syntax:

https://data.learnosity.com/[LTS_VERSION]/reports/datasets

For example, to use the 2023.3.LTS version and the 'Initialize Dataset' endpoint, you would create a request like so:

https://data.learnosity.com/v2023.3.LTS/reports/datasets
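
As a minimal sketch, the URL pattern above can be assembled from the version segment. The helper name `datasets_endpoint` is hypothetical and is not part of any Learnosity SDK; it only concatenates the pieces shown in this section.

```python
BASE_URL = "https://data.learnosity.com"


def datasets_endpoint(lts_version: str) -> str:
    """Return the reports/datasets URL for a version segment such as 'v2023.3.LTS'."""
    # Hypothetical helper: joins the base URL, the version segment and the path.
    return f"{BASE_URL}/{lts_version}/reports/datasets"


print(datasets_endpoint("v2023.3.LTS"))
```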

Initialize dataset

Initialize an "activity-analysis-by-question" dataset by sending a request with the set action to the reports/datasets endpoint.

To control which session data is used for the analysis, choose one of the following:

Specify file_count: 1 (or more) to upload an explicit list of sessions that should be considered for analysis. In this case, the endpoint returns URL(s) to which each input file must be uploaded before the data compilation job can commence. Read about the upload process for details.
Alternatively, specify file_count: 0 and provide any desired filter parameters. In this case, the data compilation job is queued immediately based on the given filters, and the endpoint response includes a corresponding job_reference for polling the job's progress.
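
The two options above differ only in the request body. The following sketch builds both variants as plain dictionaries; `build_initialize_request` is a hypothetical helper name, and signing and sending the request are deliberately left out because they are not covered in this section.

```python
from typing import Any, Optional

DATASET_TYPE = "activity-analysis-by-question"


def build_initialize_request(
    file_count: int,
    filters: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble the JSON body for the initialize (set) request."""
    if file_count < 0:
        raise ValueError("file_count must be 0 or greater")
    request: dict[str, Any] = {
        "dataset_type": DATASET_TYPE,
        "file_count": file_count,
    }
    if filters:
        request["filters"] = filters
    return request


# Option 1: upload an explicit list of sessions in a single input file.
upload_request = build_initialize_request(file_count=1)

# Option 2: select sessions with filters; no upload step is needed.
filter_request = build_initialize_request(
    file_count=0,
    filters={"activity_id": ["20230106b_ELA_comprehension"]},
)
```

With the first variant, the response is expected to contain upload URLs; with the second, a job_reference for polling.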

Other parameters for the initialize request are described below:

Endpoint	/[LTS_VERSION]/reports/datasets
HTTP Method	POST
Action Type	set
dataset_type string	Specifies the type of dataset to initialize. Must be `activity-analysis-by-question`.
file_count int	Number of input data files to upload containing specific session IDs to be analyzed. The response object will contain this many file URLs. Specify `0` to skip this step and analyze all sessions, or to filter sessions using the `filters` object instead. Specify `1` to upload all the session IDs in one file. Specify a higher number if your report is based on a large number of sessions and you prefer to split your data into several smaller files. Note: You must provide either input files or session filters in the `filters` object below.
filters object	Specifies a set of filters for the report. The sessions matching your filter must all contain the same Questions. Any sessions that contain different Questions will be discarded before analysis. See the `sessions-log` output file for details on which sessions matched your filter, and which of those sessions were actually analyzed. Note: You must provide either session filters or input files. See `file_count` for input files guidance.
filters.activity_id array[string]	An array of string `activity_id` values. Up to 100 `activity_id` values may be provided.
filters.mintime string	Filter to exclude sessions submitted before `mintime`. Values can be provided in either RFC3339 section 5.6 “yyyy-mm-ddThh:mm:ssZ+” or Unix timestamp format.
filters.maxtime string	Filter to exclude sessions submitted after `maxtime`. Values can be provided in either RFC3339 section 5.6 “yyyy-mm-ddThh:mm:ssZ+” or Unix timestamp format.
options object	Object containing report configuration parameters, see child attributes detailed below.
options.default_sort_field string	Define the default field used for sorting all group data in the dataset. Note: the named field must be present in `fields`.
options.default_sort_order string	Choose ascending or descending order for sorting group rows. Possible values: `"desc"`: descending order. `"asc"`: ascending order. Default: `"desc"`
options.fields array[string]	The fields to be calculated for all Questions in the dataset. See the list of available fields.
options.summary_fields array[string]	The summary fields to be calculated. These fields will appear in the dataset summary file. See the list of available fields.
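
A few of the constraints in the table above can be checked on the client before the request is sent. This is a sketch under the assumption that the request is held as a Python dictionary; `validate_request` is a hypothetical helper, and it covers only the rules stated in the table, not everything the API may enforce.

```python
from typing import Any

MAX_ACTIVITY_IDS = 100
SORT_ORDERS = {"asc", "desc"}


def validate_request(request: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems: list[str] = []
    if request.get("dataset_type") != "activity-analysis-by-question":
        problems.append("dataset_type must be 'activity-analysis-by-question'")

    activity_ids = request.get("filters", {}).get("activity_id", [])
    if len(activity_ids) > MAX_ACTIVITY_IDS:
        problems.append("filters.activity_id accepts up to 100 values")

    options = request.get("options", {})
    sort_field = options.get("default_sort_field")
    if sort_field is not None and sort_field not in options.get("fields", []):
        problems.append("options.default_sort_field must be listed in options.fields")

    # The table gives "desc" as the default sort order.
    sort_order = options.get("default_sort_order", "desc")
    if sort_order not in SORT_ORDERS:
        problems.append("options.default_sort_order must be 'asc' or 'desc'")
    return problems


example = {
    "dataset_type": "activity-analysis-by-question",
    "file_count": 0,
    "options": {"default_sort_field": "p_value", "fields": ["item_position"]},
}
print(validate_request(example))
```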

Example

{
    "dataset_type": "activity-analysis-by-question",
    "filters": {
        "activity_id": ["20230106b_ELA_comprehension"],
        "mintime": "2023-01-01",
        "maxtime": "2023-12-31"
    },
    "file_count": 0,
    "options": {
        "default_sort_field": "item_position",
        "default_sort_order": "asc",
        "fields": [
            "organisation_id",
            "item_position",
            "item_reference",
            "question_number",
            "question_reference",
            "count_sessions",
            "count_attempted",
            "p_value",
            "p_value_if_attempted",
            "stddev_p_value",
            "stddev_p_value_if_attempted",
            "discrimination_index"
        ],
        "summary_fields": [
            "count_questions",
            "count_sessions",
            "count_sessions_discarded",
            "count_sessions_analyzed",
            "count_sessions_top_27p",
            "count_sessions_bottom_27p"
        ]
    }
}

/* Example response:
{
    "meta": {},
    "data": {
        "dataset_id": "686b89d5-2911-4721-a6fc-7b43f57772aa",
        "job_reference": "2c2e3d6d-d446-4850-8704-ccbf41e4d91c"
    }
}
*/

Input file

If specifying an explicit list of sessions for the "activity-analysis-by-question" dataset, upload one or more NDJSON input files containing user_id and session_id properties. Responses for each user's set of session_id values will be combined together as if they were a single Activity. The datasets implementation guide provides details on the upload process.

{"user_id":"ANONYMIZED_USER_aeee19f1", "session_id": ["f77b6600-dfd7-4a58-a461-4e5d9e667e43"]}
{"user_id":"ANONYMIZED_USER_aeee19f2", "session_id": ["46b0dc55-bb79-45c3-8d09-2c99cc4e7b39"]}
{"user_id":"ANONYMIZED_USER_aeee19f3", "session_id": ["3bd372ae-b302-4078-88e6-39436b7c5bad"]}
{"user_id":"ANONYMIZED_USER_aeee19f4", "session_id": ["3457ec0f-7126-4e9c-8ccb-9f955353e14f"]}
{"user_id":"ANONYMIZED_USER_aeee19f5", "session_id": ["ae92be44-8769-4a70-9503-8954c1a7bca8"]}
{"user_id":"ANONYMIZED_USER_aeee19f6", "session_id": ["1ee32517-2ec2-4690-a9b0-cc69b9b0e30c"]}
{"user_id":"ANONYMIZED_USER_aeee19f7", "session_id": ["6cb15b97-2757-4cc8-bfa0-75200b7534df"]}
{"user_id":"ANONYMIZED_USER_aeee19f8", "session_id": ["fc870d98-719b-4af3-825a-5a27cd5993ee"]}
{"user_id":"ANONYMIZED_USER_aeee19f9", "session_id": ["0d51d05a-00b5-4406-902a-4ae4f8e813b0"]}
{"user_id":"ANONYMIZED_USER_aeee19fa", "session_id": ["be914960-c6d6-4970-8037-2dcb0bcd1444"]}
{"user_id":"ANONYMIZED_USER_aeee19fb", "session_id": ["2c0b9483-2838-4b4a-b151-e15ca1adb2b4"]}
{"user_id":"ANONYMIZED_USER_aeee19fc", "session_id": ["d771d689-0278-4649-84f6-d8add3fa74de"]}
{"user_id":"ANONYMIZED_USER_aeee19fd", "session_id": ["ffa7bf43-c400-4b3b-ac98-7d463f960fae"]}
{"user_id":"ANONYMIZED_USER_aeee19fe", "session_id": ["d2c63728-2876-4ae8-bc9d-2be2781126f2"]}
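
An input file in this shape can be produced with the standard `json` module. The sketch below assumes the sessions are already grouped per user in a dictionary; `to_ndjson` is a hypothetical helper name, and uploading the result is not shown.

```python
import json


def to_ndjson(sessions_by_user: dict[str, list[str]]) -> str:
    """Serialize one JSON object per line: a user_id and its session_id list."""
    lines = [
        json.dumps({"user_id": user_id, "session_id": session_ids})
        for user_id, session_ids in sessions_by_user.items()
    ]
    # NDJSON separates records with newlines; end the file with one as well.
    return "\n".join(lines) + "\n"


ndjson = to_ndjson(
    {"ANONYMIZED_USER_aeee19f1": ["f77b6600-dfd7-4a58-a461-4e5d9e667e43"]}
)
```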

Get dataset

Retrieves the URLs of the raw data files for a specified dataset. The endpoint returns an array of output file objects containing pre-signed URLs. Send an HTTP GET request to each pre-signed URL to retrieve the file.

Endpoint	/[LTS_VERSION]/reports/datasets
HTTP Method	POST
Action Type	get
dataset_id string	The dataset to retrieve.
dataset_type string	The type of dataset identified by `dataset_id`. Must be `"activity-analysis-by-question"`.

Response

data[].type string	The output type. Can be `"dataset"` or `"sessions-log"`.
data[].format string	The file format of the output file. Can be `"json"` or `"csv.gz"` (compressed CSV).
data[].files array	List of pre-signed URLs for downloading the files. In most cases there will be a single URL, but we allow for particularly large datasets to be split across multiple files.

Example

{
    "dataset_type": "activity-analysis-by-question",
    "dataset_id": "686b89d5-2911-4721-a6fc-7b43f57772aa"
}

/* Example response:
{
    "meta": {},
    "data": [
        {
            "type": "dataset",
            "format": "json",
            "files": [
                "https://learnosity-reportdatasets-va.s3.amazonaws.com/reports/0034/activity-analysis-by-question/686b89d5-2911-4721-a6fc-7b43f57772aa/2017-08-02_aabq_686b89d5_summary.json?AWSAccessKeyId=AKIAxxx"
            ]
        },
        {
            "type": "dataset",
            "format": "csv.gz",
            "files": [
                "https://learnosity-reportdatasets-va.s3.amazonaws.com/reports/0034/activity-analysis-by-question/686b89d5-2911-4721-a6fc-7b43f57772aa/2017-08-02_aabq_686b89d5_result_1.csv.gz?AWSAccessKeyId=AKIAxxx",
                "https://learnosity-reportdatasets-va.s3.amazonaws.com/reports/0034/activity-analysis-by-question/686b89d5-2911-4721-a6fc-7b43f57772aa/2017-08-02_aabq_686b89d5_result_2.csv.gz?AWSAccessKeyId=AKIAxxx"
            ]
        },
        {
            "type": "sessions-log",
            "format": "csv.gz",
            "files": [
                "https://learnosity-reportdatasets-va.s3.amazonaws.com/reports/0034/activity-analysis-by-question/686b89d5-2911-4721-a6fc-7b43f57772aa/2017-08-02_aabq_686b89d5_sessions-log.csv.gz?AWSAccessKeyId=AKIAxxx"
            ]
        }
    ]
}
*/
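
Because a single output may be split across several files, it can help to group the URLs by output type and format before downloading. The following is a sketch that assumes the response has already been parsed into a dictionary; `files_by_output` is a hypothetical helper, and the URLs are shortened placeholders rather than real pre-signed URLs.

```python
from typing import Any


def files_by_output(response: dict[str, Any]) -> dict[tuple[str, str], list[str]]:
    """Map each (type, format) pair to its list of pre-signed URLs."""
    grouped: dict[tuple[str, str], list[str]] = {}
    for output in response["data"]:
        key = (output["type"], output["format"])
        grouped.setdefault(key, []).extend(output["files"])
    return grouped


# Placeholder URLs stand in for the pre-signed URLs returned by the endpoint.
sample = {
    "meta": {},
    "data": [
        {"type": "dataset", "format": "json",
         "files": ["https://example.com/summary.json"]},
        {"type": "dataset", "format": "csv.gz",
         "files": ["https://example.com/result_1.csv.gz",
                   "https://example.com/result_2.csv.gz"]},
        {"type": "sessions-log", "format": "csv.gz",
         "files": ["https://example.com/sessions-log.csv.gz"]},
    ],
}
grouped = files_by_output(sample)
```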

Output files

The contents of the different kinds of output files are described below.

Dataset output file - JSON format

Type	Format	Description
`dataset`	`json`	The full dataset containing `summary_fields` data, and `fields` data for all Questions.

{
    "summary": {
        "count_questions": 10,
        "count_sessions": 21436,
        "count_sessions_discarded": 1865,
        "count_sessions_analyzed": 19571,
        "count_sessions_top_27p": 5901,
        "count_sessions_bottom_27p": 5936
    },
    "fields": [
        "organisation_id",
        "item_position",
        "item_reference",
        "question_number",
        "question_reference",
        "count_sessions",
        "count_attempted",
        "p_value",
        "p_value_if_attempted",
        "stddev_p_value",
        "stddev_p_value_if_attempted",
        "discrimination_index"
    ],
    "sort_field": "item_position",
    "sort_order": "asc",
    "rows": [
        [1, 1, "item_1", 1, "question_1", 18740, 18736, 0.74, 0.77, 0.44, 0.77, 0.66],
        [1, 2, "item_2", 1, "question_2", 18670, 18668, 0.58, 0.61, 0.49, 0.61, 0.84],
        [1, 3, "item_3", 1, "question_3", 18640, 18624, 0.66, 0.69, 0.47, 0.69, 0.81],
        [1, 4, "item_4", 1, "question_4", 18595, 18583, 0.84, 0.88, 0.37, 0.88, 0.48],
        [1, 5, "item_5", 1, "question_5", 18585, 18578, 0.67, 0.71, 0.47, 0.71, 0.81],
        [1, 6, "item_6", 1, "question_6", 18598, 18575, 0.68, 0.71, 0.47, 0.71, 0.75],
        [1, 7, "item_7", 1, "question_7", 18574, 18559, 0.53, 0.56, 0.50, 0.56, 0.91],
        [1, 8, "item_8", 1, "question_8", 18583, 18552, 0.59, 0.63, 0.49, 0.63, 0.87],
        [1, 9, "item_9", 1, "question_9", 18549, 18532, 0.87, 0.92, 0.34, 0.92, 0.39],
        [1, 10, "item_10", 1, "question_10", 18570, 18545, 0.79, 0.84, 0.40, 0.84, 0.51]
    ]
}
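
In the JSON output, each entry in "rows" is positional and lines up with the names in "fields". A minimal sketch of pairing the two, using a reduced set of columns from the example above (`rows_as_dicts` is a hypothetical helper name):

```python
from typing import Any


def rows_as_dicts(dataset: dict[str, Any]) -> list[dict[str, Any]]:
    """Pair each positional row with the names listed in "fields"."""
    fields = dataset["fields"]
    return [dict(zip(fields, row)) for row in dataset["rows"]]


dataset = {
    "fields": ["item_reference", "question_reference", "p_value", "discrimination_index"],
    "rows": [
        ["item_1", "question_1", 0.74, 0.66],
        ["item_2", "question_2", 0.58, 0.84],
    ],
}
records = rows_as_dicts(dataset)

# Example lookup: the Question with the lowest p_value in this sample.
lowest_p = min(records, key=lambda record: record["p_value"])
```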

Dataset output file - CSV.GZ format

Type	Format	Description
`dataset`	`csv.gz`	The full dataset containing `fields` data for all Questions.

organisation_id,item_position,item_reference,question_number,question_reference,count_sessions,count_attempted,p_value,p_value_if_attempted,stddev_p_value,stddev_p_value_if_attempted,discrimination_index
1,1,item_1,1,question_1,18740,18736,0.74,0.77,0.44,0.77,0.66
1,2,item_2,1,question_2,18670,18668,0.58,0.61,0.49,0.61,0.84
1,3,item_3,1,question_3,18640,18624,0.66,0.69,0.47,0.69,0.81
1,4,item_4,1,question_4,18595,18583,0.84,0.88,0.37,0.88,0.48
1,5,item_5,1,question_5,18585,18578,0.67,0.71,0.47,0.71,0.81
1,6,item_6,1,question_6,18598,18575,0.68,0.71,0.47,0.71,0.75
1,7,item_7,1,question_7,18574,18559,0.53,0.56,0.50,0.56,0.91
1,8,item_8,1,question_8,18583,18552,0.59,0.63,0.49,0.63,0.87
1,9,item_9,1,question_9,18549,18532,0.87,0.92,0.34,0.92,0.39
1,10,item_10,1,question_10,18570,18545,0.79,0.84,0.40,0.84,0.51
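
A downloaded csv.gz file can be read with Python's standard `gzip` and `csv` modules. The sketch below assumes the file content is UTF-8 encoded and already held in memory as bytes; `read_csv_gz` is a hypothetical helper, and the sample data is a reduced version of the rows above.

```python
import csv
import gzip
import io


def read_csv_gz(payload: bytes) -> list[dict[str, str]]:
    """Decompress a csv.gz payload and return one dict per data row."""
    # Assumption: the decompressed CSV is UTF-8 text with a header row.
    text = gzip.decompress(payload).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


# Build a tiny in-memory csv.gz so the sketch runs without a download.
raw = (
    "item_reference,question_reference,p_value\n"
    "item_1,question_1,0.74\n"
    "item_2,question_2,0.58\n"
)
rows = read_csv_gz(gzip.compress(raw.encode("utf-8")))

# csv returns every value as a string, so convert numeric columns explicitly.
p_values = [float(row["p_value"]) for row in rows]
```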

Sessions log output file - CSV.GZ format

Type	Format	Description
`sessions-log`	`csv.gz`	A list of the session IDs that matched your filter. The `analyzed` flag indicates whether the Items in that session matched the Activity being analyzed. Sessions that did not match are discarded.

user_id,session_ids,analyzed
user-d0b684e7d918,19e1a2dc-df85-4bac-99bf-d0b684e7d918,0
user-5dec9b2e6710,640de1b1-3e18-4c4c-bb3e-5dec9b2e6710,1
user-9f7b826fe3a8,72dd2663-c851-4b97-a659-9f7b826fe3a8,0
user-33436211e541,869ac4f3-5e48-44a0-9a3c-eae8cd3fc0fd,1
user-dca654df64d8,e41d45ad-c26a-4734-b00f-dca654df64d8,0
user-30c61b1866b2,68bc2c81-f8ac-400f-aff3-30c61b1866b2,1
user-cc85152d5621,691ea0ce-9caf-4fed-bb95-cc85152d5621,0
user-9fc0f77e2c35,c1ac6605-a05c-443c-9bf7-9fc0f77e2c35,0
user-ac4fbfed6990,fd7b9240-bc72-4bde-80a8-ac4fbfed6990,0
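
To see how many matched sessions were actually analyzed, the `analyzed` column can be tallied. This sketch assumes the sessions log has already been decompressed to text and that the flag is `1` for analyzed sessions, as in the sample above; `count_analyzed` is a hypothetical helper name.

```python
import csv
import io


def count_analyzed(log_text: str) -> tuple[int, int]:
    """Return (analyzed, discarded) counts from decompressed sessions-log text."""
    analyzed = discarded = 0
    for row in csv.DictReader(io.StringIO(log_text)):
        if row["analyzed"] == "1":
            analyzed += 1
        else:
            discarded += 1
    return analyzed, discarded


log_text = (
    "user_id,session_ids,analyzed\n"
    "user-d0b684e7d918,19e1a2dc-df85-4bac-99bf-d0b684e7d918,0\n"
    "user-5dec9b2e6710,640de1b1-3e18-4c4c-bb3e-5dec9b2e6710,1\n"
    "user-9f7b826fe3a8,72dd2663-c851-4b97-a659-9f7b826fe3a8,0\n"
)
analyzed, discarded = count_analyzed(log_text)
```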
