Updating a Dataset

You'll need to supply an API key in the Authorization header to read a private dataset or to make any edits.

A dataset is a JSON document. It contains object keys, such as `resources`, and string keys, such as `title` and `description`. The fields `id` and `readonly` are immutable.

Read a dataset

GET https://data.website/api/dataset/:slug

Returns the dataset as a JSON object. The response includes a `readonly` field, which cannot be patched and provides supporting database information.

Path Parameters

Name  Type    Description
slug  string  URL slug of the dataset. Also accepts the UUID.
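
For example, reading a dataset with Python requests (a minimal sketch; the Authorization header can be omitted for public datasets):

import requests
import os

site = 'https://data.london.gov.uk'
dataset = 'planning-permissions-on-the-london-development-database--ldd-'

r = requests.get('%s/api/dataset/%s' % (site, dataset),
                 headers={'Authorization': os.environ['API_KEY']})
r.raise_for_status()

# The response is the full dataset JSON, as shown below
print(r.json()['title'])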

{
  "id": "3f741e00-111f-4134-b04f-8b103f78bbe7",
  "slug": "planning-permissions-on-the-london-development-database--ldd-",
  "title": "Planning permissions on the London Development Database (LDD)",
  "parent": "5b858cb2-5c92-4b2b-8c1d-141cfc330d9c",
  "state": "active",
  "sharing": "public",
  "tags": [
    "planning",
    "planning-application"
  ],
  "author": "Greater London Authority",
  "topics": [
    "67b1cea4-806d-4b63-90d7-155cf3ac3c03",
    "b781eefa-ff44-44b8-be63-fc3acd37547c"
  ],
  "licence": "ogl-v3",
  "createdAt": "2017-05-02T13:10:08",
  "updatedAt": "2019-01-11T15:14:11.944Z",
  "maintainer": "London Development Database",
  "description": "<p>The London Development Database (LDD) records significant planning permissions in London.</p>\r\n<p>The data is entered by London's planning authorities, and is checked by the GLA to ensure consistency across London. The LDD records any planning consent that permits one or more of the following:</p>\r\n<ul>\r\n<li>any new build residential units</li>\r\n<li>any loss or gain of residential units through change of use or conversion of existing dwellings</li>\r\n<li>creation of seven or more new bedrooms for use as either a hotel, a hostel, student housing or for residential care through new build or change of use</li>\r\n<li>1,000m2 or more of floor space changing from one use class to another or created through new build or extension for any other use</li>\r\n<li>the loss or gain or change of use of open space.</li>\r\n</ul>\r\n<p>New permissions are added to the database on a monthly basis within three months of the end of the month in which they were granted. Information on scheme starts and completions is updated annually by September the following year.</p>\r\n<p>The spreadsheet <em>LDD - Planning permissions</em> includes details of all permissions either currently recorded as live (not started or under construction) or completed since 01/04/2006. <em>LDD - Non-residential floorspace</em> provides additional details of the non-residential floor space for those permissions with a non-residential component and <em>LDD - non-residential bedrooms</em> provides additional details of the non-C3 bedrooms over the same time period.</p>",
  "author_email": "mayor@london.gov.uk",
  "licence_notes": "",
  "odi-certificate": "",
  "maintainer_email": "",
  "update_frequency": "Monthly",
  "london_bounding_box": "",
  "london_smallest_geography": "Point Location",
  "shares": {
    "users": {},
    "orgs": {}
  },
  "resources": {
    "eb050c40-3e94-4384-8e59-1b8c49dbdf36": {
      "url": "https://airdrive-secure.s3-eu-west-1.amazonaws.com/london/dataset/planning-permissions-on-the-london-development-database--ldd-/2019-01-11T15%3A13%3A38/LDD%20Permissions%20for%20Datastore.xlsx",
      "order": 0,
      "title": "LDD planning permissions",
      "format": "spreadsheet",
      "check_hash": "1982c439d56a3c3b56614d356d15069c",
      "check_size": 19328820,
      "description": "WARNING: Large file size",
      "check_mimetype": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
      "check_timestamp": "2019-01-11T15:13:42.556Z",
      "check_http_status": 200,
      "london_release_date": null,
      "temporal_coverage_to": null,
      "temporal_coverage_from": null
    }
  },
  "readonly": {
    "auth": {
      "package_update": false
    },
    "shares": {
      "users": {},
      "orgs": {}
    },
    "licence": {
      "title": "UK Open Government Licence (OGL v3)",
      "url": "http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/",
      "is_okd_compliant": true
    },
    "parent": {
      ...
    },
    "rootOrg": {
      ...
    },
    "topics": {
      "67b1cea4-806d-4b63-90d7-155cf3ac3c03": {
        "title": "Housing",
        "img": "https://airdrive-images.s3-eu-west-1.amazonaws.com/london/img/topic/2018-11-01T18%3A45%3A36.93/housing.png",
        "slug": "housing",
        "description": ""
      },
      "b781eefa-ff44-44b8-be63-fc3acd37547c": {
        "title": "Planning",
        "img": "https://airdrive-images.s3-eu-west-1.amazonaws.com/london/img/topic/2018-11-01T18%3A46%3A07.60/planning.png",
        "slug": "planning",
        "description": ""
      }
    }
  }
}

Update a dataset

PATCH https://data.website/api/dataset/:slug

Accepts a JSON patch (http://jsonpatch.com) to update the dataset. Can edit or add any valid field, except the `readonly` field.

Path Parameters

Name  Type    Description
slug  string  URL slug of the dataset. Also accepts the UUID.

Headers

Name          Type    Description
Content-Type  string  application/json

Request Body

Name        Type    Description
JSON Patch  string  Request body must be a well-formed JSON patch.
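
For example, a well-formed patch document that renames the dataset:

[
  {
    "op": "replace",
    "path": "/title",
    "value": "Dataset has a new title!"
  }
]

The full dataset JSON is returned, with the applied patch echoed under `readonly.updates`: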

{
  "id": "3f741e00-111f-4134-b04f-8b103f78bbe7",
  "slug": "planning-permissions-on-the-london-development-database--ldd-",
  "title": "Dataset has a new title!",
  ... the full dataset JSON is returned ...
  "readonly": {
    "updates": [
      {
        "op": "replace",
        "path": "/title",
        "value": "Dataset has a new title!"
      }
    ]
  }
}

Patching a dataset title using Python requests

import requests
import json
import os

site = 'https://data.london.gov.uk'
dataset = 'my-dataset-slug'

patch = [{
  'op': 'add',
  'path': '/title',
  'value': 'Dataset has a new title!',
}]

# Requests 2.4.2 onward also accepts the keyword argument `json=patch`
requests.patch('%s/api/dataset/%s' % (site, dataset),
               headers={
                   'Authorization': os.environ['API_KEY'],
                   'Content-Type': 'application/json',
               },
               data=json.dumps(patch))

Create a new dataset

POST https://data.website/api/dataset

Submit a JSON document to create a new dataset. The API will return HTTP 400 with a list of errors if it fails validation.

Request Body

Name    Type    Description
title   string  Dataset title
slug    string  Dataset URL slug
parent  string  ID of the parent team

{
  "state": "active",
  "sharing": "public",
  "maintainer": null,
  "maintainer_email": null,
  "update_frequency": null,
  "author": null,
  "author_email": null,
  "licence": null,
  "resources": {},
  "description": "",
  "licence_notes": "",
  "tags": [],
  "topics": [],
  "shares": {
    "orgs": {},
    "users": {}
  },
  "createdAt": "2019-01-15T17:44:04.938Z",
  "updatedAt": "2019-01-15T17:44:04.938Z",
  "slug": "my-new-dataset",
  "title": "My new dataset",
  "parent": "215702e3-0b5f-4a09-b332-0eac760b48ad",
  "id": "5346a0b7-fbf7-43a1-9134-b9530001bf6b"
}

Creating a dataset using Python requests

import requests
import json
import os

site = 'https://data.london.gov.uk'

# Look up team IDs on /api/orgs
team_gla = '5b858cb2-5c92-4b2b-8c1d-141cfc330d9c'

my_new_dataset = {
    # Required to create a dataset:
    'title': 'Created via the API',
    'slug': 'api-test-1',
    'parent': team_gla,

    # Optional:
    'sharing': 'private',
    'description': '...',
}

# Requests 2.4.2 onward also accepts the keyword argument `json=my_new_dataset`
requests.post('%s/api/dataset' % site,
              headers={
                  'Authorization': os.environ['API_KEY'],
                  'Content-Type': 'application/json',
              },
              data=json.dumps(my_new_dataset))
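
The example above fires and forgets. A sketch of capturing the response to handle the HTTP 400 validation errors mentioned above, reusing the same `site`, headers, and `my_new_dataset` (the error body format is an assumption; on success the new dataset's JSON is returned, as shown in the example response):

r = requests.post('%s/api/dataset' % site,
                  headers={
                      'Authorization': os.environ['API_KEY'],
                      'Content-Type': 'application/json',
                  },
                  data=json.dumps(my_new_dataset))
if r.status_code == 400:
    # Validation failed; the response body lists the errors
    print(r.text)
else:
    r.raise_for_status()
    print('Created dataset with id', r.json()['id'])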

File Uploads

The file upload endpoints accept multipart/form-data; all other API endpoints deal purely in application/json.

These endpoints work for files up to 100MB. To upload larger files, see the next section.

Append a file to a dataset

POST https://data.website/api/dataset/:slug/resources

Use a regular HTTP form to upload a file to a dataset.

Path Parameters

Name  Type    Description
slug  string  URL slug of the dataset. Also accepts the UUID.

Request Body

Name         Type    Description
file         object  HTTP file upload
title        string  Title of the new resource
description  string  Descriptive text of the new resource

{
  ... normal dataset JSON here...
  "readonly": {
    "updates": [
      {
        "op": "add",
        "path": "/resources/d617ea6b-6e0a-4434-b106-c720651674dd",
        "value": {
          "url": "https://airdrive-secure.s3-eu-west-1.amazonaws.com/london/dataset/test/2019-01-15T14%3A58%3A05/book5.jpg",
          "check_hash": "3b421811fa74c93b2a594478e0ada065",
          "check_timestamp": "2019-01-15T14:58:05.945Z",
          "check_http_status": 200,
          "title": "book5.jpg",
          "order": 24
        }
      }
    ]
  }
}

Uploading a file from the command line, using curl

SITE=https://data.london.gov.uk
DATASET=your-slug-here
API_KEY=..
SRC_FILE="/path/to/src/file"

URL="${SITE}/api/dataset/${DATASET}/resources"
curl "${URL}" -H "Authorization: ${API_KEY}" -F "file=@${SRC_FILE}" 

Uploading a file using Python requests

import requests
import os

site = 'https://data.london.gov.uk'
dataset = 'your-slug-here'
api_key = os.environ['API_KEY']
src_file = '/path/to/src/file'
title = 'Title of resource'

url = '%s/api/dataset/%s/resources' % (site, dataset)
with open(src_file, 'rb') as f:
    requests.post(url,
                  files={'file': f},
                  headers={'Authorization': api_key},
                  data={'title': title})

Replace a file in a dataset

POST https://data.website/api/dataset/:slug/resources/:id

Upload a file to replace an existing resource, identified by its UUID. The title and description can optionally be overwritten at the same time.

Path Parameters

Name  Type    Description
slug  string  URL slug of the dataset. Also accepts the UUID.
id    string  UUID of the resource

Request Body

Name         Type    Description
file         object  HTTP file upload
title        string  Optionally overwrite the title
description  string  Optionally overwrite the description

{
  ...normal dataset JSON here...
  "readonly": {
    "updates": [
      {
        "op": "replace",
        "path": "/resources/bde00af7-7783-43d4-97b9-3bc18b8da7cc/url",
        "value": "https://airdrive-secure.s3-eu-west-1.amazonaws.com/london/dataset/try-out/2019-01-15T17%3A22%3A10/book1.jpg"
      },
      {
        "op": "replace",
        "path": "/resources/bde00af7-7783-43d4-97b9-3bc18b8da7cc/check_hash",
        "value": "ffe99ae64bc6fbab377535a6f7694650"
      },
      {
        "op": "replace",
        "path": "/resources/bde00af7-7783-43d4-97b9-3bc18b8da7cc/check_timestamp",
        "value": "2019-01-15T17:22:11.016Z"
      },
      {
        "op": "replace",
        "path": "/resources/bde00af7-7783-43d4-97b9-3bc18b8da7cc/title",
        "value": "new image of a book"
      }
    ]
  }
}
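
Replacing a file looks much like appending one, with the resource UUID added to the URL. A minimal sketch using Python requests (the slug, UUID, and file path are placeholders):

import requests
import os

site = 'https://data.london.gov.uk'
dataset = 'your-slug-here'
resource = 'bde00af7-7783-43d4-97b9-3bc18b8da7cc'  # UUID of the resource to replace
src_file = '/path/to/new/file'

url = '%s/api/dataset/%s/resources/%s' % (site, dataset, resource)
with open(src_file, 'rb') as f:
    requests.post(url,
                  files={'file': f},
                  headers={'Authorization': os.environ['API_KEY']},
                  data={'title': 'new image of a book'})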

File Uploads Over 100MB

You can upload larger files directly to the storage backend by fetching a presigned link. This is how the web interface handles file uploads.

This offers much higher upload speeds, but it takes three requests:

  1. POST to the API to fetch a presigned link.

  2. POST to the storage backend to upload the file.

  3. PATCH the dataset API to attach the file.

import requests
import os
from urllib.request import pathname2url
from datetime import datetime

# Path to a 100MB+ file:
file_to_upload = './files/Schedule 36 part 1.pdf'
# Website I am working on:
host = 'https://open.barnet.gov.uk'
# Dataset ID:
dataset = '2k11d'
# Don't forget to set your API key:
headers = {
    'Authorization': os.environ['API_KEY'],
}

# -------------------------------------------------

# [Request 1/3] POST /api/dataset/:dataset/presign/:filename
filename = os.path.basename(file_to_upload)
url = '%s/api/dataset/%s/presign/%s' % (host, dataset, pathname2url(filename))

print('Presigning upload... \tPOST', url)
r1 = requests.post(url, headers=headers)
r1.raise_for_status()
presigned_response = r1.json()

# [Request 2/3]: Upload the file itself
with open(file_to_upload, 'rb') as f:
    url = presigned_response['url']
    print('Uploading file... \tPOST', url)
    files = presigned_response['fields']
    files['file'] = f
    r2 = requests.post(url, files=files)
    r2.raise_for_status()

# [Request 3/3]: Update the website with a JSON patch (RFC 6902)
patch = [
    {
        'op': 'add',
        # The ID will be overwritten :-)
        'path': '/resources/any_id_here',
        'value': {
            'origin': presigned_response['fields']['key'],
            'title': filename,
            # Place the file at the top of the list:
            'order': -1,
            # (omit 'order' to append to the bottom)
        },
    },
    {
        # Update the page timestamps
        'op': 'replace',
        'path': '/updatedAt',
        'value': datetime.now().isoformat(),
    },
]

url = '%s/api/dataset/%s' % (host, dataset)
print('Updating website... \tPATCH', url)
r3 = requests.patch(url, json=patch, headers=headers)
r3.raise_for_status()

print('Successfully uploaded file:', filename)
