Managing User Data

User data can be managed using the /entity API:

Endpoint              Description
/entity               retrieves a single user record
/entity.find          retrieves a set of user records, determined by the filter applied
/entity.create        creates a single user record
/entity.bulkCreate    creates multiple user records in a single API call
/entity.update        updates only the specified attributes for an existing user record
/entity.replace       replaces all attributes for an existing user record; any attributes not specified will be replaced with null values
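
The difference between /entity.update and /entity.replace is easiest to see with a request. The minimal sketch below (Python, using the same placeholder application URL and credentials as the sample later in this section) sends the same payload to both endpoints. The parameter names (uuid, value) and the givenName/familyName attributes are assumptions for illustration only; confirm the exact parameters against the full API reference and your own schema.


import requests

BASE = 'https://YOUR_APP.janraincapture.com'
HEADERS = {'Authorization': 'Basic aW1fYV...NfbXk='}

# Only the attribute we want to change is included in the payload.
# The uuid value and the givenName/familyName attributes are
# illustrative placeholders; substitute values from your own schema.
payload = {
    'type_name': 'user',
    'uuid': 'REPLACE_WITH_UUID',
    'value': '{"givenName": "Ada"}',
}

# /entity.update: only givenName changes; familyName and every other
# attribute keeps its current value.
requests.post(BASE + '/entity.update', headers=HEADERS, data=payload)

# /entity.replace: givenName is set, but every attribute missing from
# the payload (familyName, email, and so on) is replaced with null.
requests.post(BASE + '/entity.replace', headers=HEADERS, data=payload)
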

Querying Large Data Sets

When using the entity.find endpoint to iterate over large sets of data (more than 100,000 records), do not use the first_result and show_total_count convenience parameters. Instead, optimize your queries by relying on natural database sorting: sort on the id attribute and page through the results by filtering on the last id returned. This has two benefits:

  • Records created between the time iteration begins and the time it ends are included in the results.
  • Querying and loading each page of results performs efficiently and consistently.

The following tips will help you optimize your queries:

  • Use the attributes parameter to limit the number of attributes returned for each record to minimize the size of the HTTP payload.
  • Experiment with the max_results parameter to optimize for responses under 10 seconds.
  • Include the timeout parameter (up to 60 seconds) if, and only if, you are unable to keep responses under 10 seconds using the max_results parameter.
  • Do not use the first_result parameter for large data sets as it will incur an incremental performance cost with each additional page of results.
  • Do not use the show_total_count parameter for large data sets as it will incur an additional query (count) with every request.

The sample code below (written in Python) shows how to iterate over every record updated since January 1, 2016. Only the id, uuid, and email attributes are returned for each record, and each request returns up to 100 records.


import requests
import json
last_id = 0
while True:
    response = requests.get(
        'https://YOUR_APP.janraincapture.com/entity.find',
        headers={
            'Authorization': 'Basic aW1fYV...NfbXk='
        },
        # send the query as URL parameters rather than a request body
        params={
            'type_name': 'user',
            'max_results': '100',
            'attributes': '["id", "uuid", "email"]',
            'sort_on': '["id"]',
            'filter': "id > {} and lastUpdated >= '2016-01-01'".format(last_id),
        }
    )
    json_resp = json.loads(response.text)
    if json_resp['stat'] == 'ok' and json_resp.get('result_count', 0) > 0:
        for record in json_resp['results']:
            # do something with record
            print(record)
            # update last_id with the id of the last record in this page of results
            last_id = record['id']
    else:
        # stop iterating when there are no more results
        break 
        

Bulk Data Imports

If you need to import user records from an existing data store into the Janrain platform, the /entity.bulkCreate API can be used to bulk load data. The Janrain Data Loader is an example script that utilizes this API; you may use it to perform your own data migrations.

If you are considering using this script, we recommend that you consult with Janrain Professional Services on setting appropriate arguments for batch size and rate limit. Always alert Janrain to the date and time you plan to run any bulk data event by submitting a Traffic Event request through the Support Portal.
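
For reference, a single call to /entity.bulkCreate might look like the minimal sketch below. The all_attributes parameter name and the two sample records are illustrative assumptions, not a prescribed import format; confirm the exact parameters, batch size, and rate limit with Janrain Professional Services before running a real import.


import json
import requests

# A small illustrative batch; a real import would send much larger,
# carefully sized batches at an agreed rate limit.
records = [
    {'email': 'first.user@example.com', 'displayName': 'First User'},
    {'email': 'second.user@example.com', 'displayName': 'Second User'},
]

response = requests.post(
    'https://YOUR_APP.janraincapture.com/entity.bulkCreate',
    headers={'Authorization': 'Basic aW1fYV...NfbXk='},
    data={
        'type_name': 'user',
        # assumed parameter name: a JSON array with one object per record
        'all_attributes': json.dumps(records),
    }
)
print(response.json())
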