PreEmptive Analytics Workbench User Guide

Data Query Spec

The Data Query API allows client applications to retrieve data that has been received and indexed by the Workbench, filtered and aggregated in a number of different ways. The Data Query API exposes individual queries by domain and name (i.e. the ID), filters and aggregates the results according to the request arguments, and returns the data in tabular format.

Accessing the Data Query API

The data query can be accessed at 127.0.0.1:88/Interaction/query. (You may need to change the host and port.) Note that these queries won't work from a browser - they require an HTTP POST with an appropriate (JSON-encoded) body.

Required Headers

HTTP requests to the Data Query API must set the following headers:

  • Content-Type: application/json; charset=utf-8
  • Accept: application/json

Query Arguments

All data queries must be requested as an HTTP POST, with an appropriate JSON-encoded body. Parameters passed as part of the URL (e.g. ?somearg=somevalue) are not supported.

Top-level object

The body must contain a JSON object as follows:

{
    "domain": <query domain>,
    "name": <query name>,
    "fields": <array of field objects>,                 // which fields should be returned
    "aggregate_on": <array of aggregation objects>      // which fields should be used for aggregation
}

All properties are required. See below for details.

Field objects

Each field object must have the following structure:

{
    "field": <field name>,
    "include_data": <boolean>,                  // optional, see below
    "filter": <array of filter objects>
}

The include_data property is optional, and should only be set for fields that are also included in the aggregate_on array (see below). The purpose of this property is to allow the query to specify that a field should be used for aggregation, but that it should not be included as a column in the query results. The default value for this property is true.

The filter array is only necessary if you wish to filter the results based on that field. See below for details.

Note: the AppId_Version and Time fields (for all queries) must always specify a filter. If you don't wish to limit the data retrieved, you can pass back the filter object that was returned by the metadata query (for the corresponding field), which has the effect of requesting "all available data".

Filter objects

Like the filter objects in the metadata query, filter objects passed as part of a data query have different properties depending on the type of the filter. All filter objects will have at least the following structure:

{
    "type": <string>,
    "__type": <string>,
    <other properties as described below>
}

The value of the type property determines which additional properties will be included in the filter object. This value must match the value provided by the metadata query. Please see below for information about each possible type value. Additional filter types may be introduced in future releases of the Workbench.

The value of the __type property is specific to the server-side implementation of each type, and may change from release to release. That value is important, however - any query that uses a filter must pass the appropriate __type value back to the server, as part of the query request.

Filter type: Application

The Application filter type specifies a list of application objects, in a values array, each of which should be allowed in the results. Each application object defines a specific company, application, and version - so in most cases a single real-world application will have multiple application objects, one for each version of the application.

So an application filter typically looks like this:

{
    "type": "Application"
    "__type": "AggregationDTO.PickFilterDTO, AggregationDTO"
    "values": <array of application metadata objects>
}
Application objects

Each application object should have the following structure:

{
    "value": <string>,
    "__type": <string>
}

The value property should be set to the unique ID of the application (company/application/version) that's being requested, as provided in the metadata query.

The value of the __type property is specific to the server-side implementation of each type, and may change from release to release. That value is important, however - any query that uses an Application filter must pass the __type value back to the server, as part of each Application object in the filter.

None of the other properties from the metadata query application object needs to be passed to the server as part of a query request, but it will work to pass the entire object as received from the metadata query, which is often easier to implement.

Filter type: DateRange

The DateRange filter type specifies min and max dates (timestamps) for inclusion. A typical date filter might look like this:

{
    "type": "DateRange",
    "__type": "AggregationDTO.RangeFilterDTO, AggregationDTO",
    "min": 630822996000000000,          // a value in Ticks, see below for details
    "inclusive_min": true,
    "max": 635337740683512287,          // a value in Ticks, see below for details
    "inclusive_max": false
}

The min and max values are specified as Ticks. They indicate the earliest and latest timestamps for which data should be retrieved.

The inclusive_min and inclusive_max values are booleans that specify whether events that exactly match the timestamp should be included in the results. This is typically used to make sure that the start time is exactly midnight and the stop time is up to midnight (non-inclusive), but any possible configuration is supported.

Filter type: Many

The Many filter type specifies a list of individual values, in the values array, that should be allowed in the results. A typical many filter might look like this:

{
    "type": "Many",
    "__type": "AggregationDTO.PickFilterDTO, AggregationDTO",
    "values": [
        "Windows 7",
        "Windows 8"
    ]
}

Aggregation objects

The aggregate_on property in the top level object specifies a set of fields that the server should use for aggregating the results, much like a "GROUP BY" clause in SQL. The aggregate_on property provides an array of aggregation objects, each with this structure:

{
    "field": <field name>,
    "options": <string>             // optional, see below
}

The options property is not required except for Time fields, in which case it must be set to any of: hour, hour_of_day, day, week, or month. Each value tells the server what time window to use for time-based aggregation.

The Data Query Response

The query, if successful, will return a JSON object as follows:

{
    "table": {
        "title": <string>,                      // the "name" from the query request
        "columns": <array of column objects>,   // see below for details
        "rows": <array of row objects>          // see below for details
    }
}

There may be other properties in this object, which should be ignored.

Column objects

Each column object describes an individual column in the response table:

{
    "label": <string>,                  // the column's unique ID
    "formatted_label": <string>,        // a human-readable default name for the column
    "column_type": <string>             // see below for details
}

The column_type value can be any of the standard field types, with the modification that the timestamp type (used for Time columns), will have the aggregation window appended to it, after an underscore. For example:

{
    "label": "Time",
    "formatted_label": "Date Range",
    "column_type": "timestamp_day"
}, {
    "label": "StartCount",
    "formatted_label": "Started Sessions",
    "column_type": "number"
}

Row objects

Each row object provides data for the individual "cells" in that row of the table:

{
    "values": <array of values>
}

Each value will have a type according to the column_type specified in the column object corresponding to the value's offset in the values array.

Most values will be simple numbers or strings, but for the formatted field type the value will be an object. This is typically the case for the AppId_Version field, which will provide an application object as the field value.



Workbench Version 1.2.0. Copyright © 2016 PreEmptive Solutions, LLC