Updating a Dataset
A dataset is a JSON document. It contains object keys, such as `resources`, and string keys, such as `title` and `description`. The `id` and `readonly` fields are immutable.
Read a dataset
GET
https://data.website/api/dataset/:slug
Read a dataset as a JSON object. The response includes a `readonly` field containing supporting database information; this field cannot be patched.
Path Parameters
slug
string
URL slug of the dataset. Also accepts the UUID.
{
"id": "3f741e00-111f-4134-b04f-8b103f78bbe7",
"slug": "planning-permissions-on-the-london-development-database--ldd-",
"title": "Planning permissions on the London Development Database (LDD)",
"parent": "5b858cb2-5c92-4b2b-8c1d-141cfc330d9c",
"state": "active",
"sharing": "public",
"tags": [
"planning",
"planning-application"
],
"author": "Greater London Authority",
"topics": [
"67b1cea4-806d-4b63-90d7-155cf3ac3c03",
"b781eefa-ff44-44b8-be63-fc3acd37547c"
],
"licence": "ogl-v3",
"createdAt": "2017-05-02T13:10:08",
"updatedAt": "2019-01-11T15:14:11.944Z",
"maintainer": "London Development Database",
"description": "<p>The London Development Database (LDD) records significant planning permissions in London.</p>\r\n<p>The data is entered by London's planning authorities, and is checked by the GLA to ensure consistency across London. The LDD records any planning consent that permits one or more of the following:</p>\r\n<ul>\r\n<li>any new build residential units</li>\r\n<li>any loss or gain of residential units through change of use or conversion of existing dwellings</li>\r\n<li>creation of seven or more new bedrooms for use as either a hotel, a hostel, student housing or for residential care through new build or change of use</li>\r\n<li>1,000m2 or more of floor space changing from one use class to another or created through new build or extension for any other use</li>\r\n<li>the loss or gain or change of use of open space.</li>\r\n</ul>\r\n<p>New permissions are added to the database on a monthly basis within three months of the end of the month in which they were granted. Information on scheme starts and completions is updated annually by September the following year.</p>\r\n<p>The spreadsheet <em>LDD - Planning permissions</em> includes details of all permissions either currently recorded as live (not started or under construction) or completed since 01/04/2006. <em>LDD - Non-residential floorspace</em> provides additional details of the non-residential floor space for those permissions with a non-residential component and <em>LDD - non-residential bedrooms</em> provides additional details of the non-C3 bedrooms over the same time period.</p>",
"author_email": "mayor@london.gov.uk",
"licence_notes": "",
"odi-certificate": "",
"maintainer_email": "",
"update_frequency": "Monthly",
"london_bounding_box": "",
"london_smallest_geography": "Point Location",
"shares": {
"users": {},
"orgs": {}
},
"resources": {
"eb050c40-3e94-4384-8e59-1b8c49dbdf36": {
"url": "https://airdrive-secure.s3-eu-west-1.amazonaws.com/london/dataset/planning-permissions-on-the-london-development-database--ldd-/2019-01-11T15%3A13%3A38/LDD%20Permissions%20for%20Datastore.xlsx",
"order": 0,
"title": "LDD planning permissions",
"format": "spreadsheet",
"check_hash": "1982c439d56a3c3b56614d356d15069c",
"check_size": 19328820,
"description": "WARNING: Large file size",
"check_mimetype": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
"check_timestamp": "2019-01-11T15:13:42.556Z",
"check_http_status": 200,
"london_release_date": null,
"temporal_coverage_to": null,
"temporal_coverage_from": null
}
},
"readonly": {
"auth": {
"package_update": false
},
"shares": {
"users": {},
"orgs": {}
},
"licence": {
"title": "UK Open Government Licence (OGL v3)",
"url": "http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/",
"is_okd_compliant": true
},
"parent": {
...
},
"rootOrg": {
...
},
"topics": {
"67b1cea4-806d-4b63-90d7-155cf3ac3c03": {
"title": "Housing",
"img": "https://airdrive-images.s3-eu-west-1.amazonaws.com/london/img/topic/2018-11-01T18%3A45%3A36.93/housing.png",
"slug": "housing",
"description": ""
},
"b781eefa-ff44-44b8-be63-fc3acd37547c": {
"title": "Planning",
"img": "https://airdrive-images.s3-eu-west-1.amazonaws.com/london/img/topic/2018-11-01T18%3A46%3A07.60/planning.png",
"slug": "planning",
"description": ""
}
}
}
}
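The write examples later on this page use Python Requests; reading a dataset can be sketched the same way. This is a minimal illustration, assuming the Requests library is installed and an `API_KEY` environment variable is set (the slug below is a placeholder; public datasets can be read without a key):

```python
import os
import requests

site = 'https://data.london.gov.uk'
dataset = 'my-dataset-slug'  # URL slug, or the dataset UUID

url = '%s/api/dataset/%s' % (site, dataset)

def read_dataset():
    # GET returns the full dataset document, including `readonly`
    r = requests.get(url, headers={'Authorization': os.environ.get('API_KEY', '')})
    r.raise_for_status()
    return r.json()
```

Calling `read_dataset()` returns the parsed JSON document shown above, e.g. `read_dataset()['title']` gives the dataset title.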
Update a dataset
PATCH
https://data.website/api/dataset/:slug
Accepts a JSON Patch (RFC 6902, see http://jsonpatch.com) to update the dataset. Any valid field can be edited or added, except the `readonly` field.
Path Parameters
slug
string
URL slug of the dataset. Also accepts the UUID.
Headers
Content-Type
string
application/json
Request Body
JSON Patch
string
The request body must be a well-formed JSON Patch document
{
"id": "3f741e00-111f-4134-b04f-8b103f78bbe7",
"slug": "planning-permissions-on-the-london-development-database--ldd-",
"title": "Dataset has a new title!",
... the full dataset JSON is returned...
"readonly": {
"updates": [
{
"op": "replace",
"path": "/title",
"value": "Dataset has a new title!"
}
]
}
Patching a dataset title using Python Requests
import requests
import json
import os
site = 'https://data.london.gov.uk'
dataset='my-dataset-slug'
patch = [{
'op': 'add',
'path': '/title',
'value': 'Dataset has a new title!',
}]
# Requests 2.4.2 onward supports the keyword `json=patch`
requests.patch('%s/api/dataset/%s' % (site, dataset),
headers={
'Authorization': os.environ['API_KEY'],
'Content-type': 'application/json',
},
data=json.dumps(patch))
Create a new dataset
POST
https://data.website/api/dataset
Submit a JSON document to create a new dataset. If the document fails validation, the API returns HTTP 400 with a list of errors.
Request Body
title
string
Dataset title
slug
string
Dataset URL slug
parent
string
ID of the parent team
{
"state": "active",
"sharing": "public",
"maintainer": null,
"maintainer_email": null,
"update_frequency": null,
"author": null,
"author_email": null,
"licence": null,
"resources": {},
"description": "",
"licence_notes": "",
"tags": [],
"topics": [],
"shares": {
"orgs": {},
"users": {}
},
"createdAt": "2019-01-15T17:44:04.938Z",
"updatedAt": "2019-01-15T17:44:04.938Z",
"slug": "my-new-dataset",
"title": "My new dataset",
"parent": "215702e3-0b5f-4a09-b332-0eac760b48ad",
"id": "5346a0b7-fbf7-43a1-9134-b9530001bf6b"
}
Creating a dataset using Python Requests
import requests
import json
import os
site = 'https://data.london.gov.uk'
# Look up team IDs on /api/orgs
team_gla = '5b858cb2-5c92-4b2b-8c1d-141cfc330d9c'
my_new_dataset = {
# Required to create a dataset:
'title': 'Created via the API',
'slug': 'api-test-1',
'parent': team_gla,
# Optional:
'sharing': 'private',
'description': '...',
}
# Requests 2.4.2 onward supports the keyword `json=my_new_dataset`
requests.post('%s/api/dataset' % site,
headers={
'Authorization': os.environ['API_KEY'],
'Content-type': 'application/json',
},
data=json.dumps(my_new_dataset))
File Uploads
Append a file to a dataset
POST
https://data.website/api/dataset/:slug/resources
Use a regular HTTP form to upload a file to a dataset.
Path Parameters
slug
string
URL slug of the dataset. Also accepts the UUID.
Request Body
file
object
HTTP file upload
title
string
Title of the new resource
description
string
Descriptive text of the new resource
{
... normal dataset JSON here...
"readonly": {
"updates": [
{
"op": "add",
"path": "/resources/d617ea6b-6e0a-4434-b106-c720651674dd",
"value": {
"url": "https://airdrive-secure.s3-eu-west-1.amazonaws.com/london/dataset/test/2019-01-15T14%3A58%3A05/book5.jpg",
"check_hash": "3b421811fa74c93b2a594478e0ada065",
"check_timestamp": "2019-01-15T14:58:05.945Z",
"check_http_status": 200,
"title": "book5.jpg",
"order": 24
}
}
]
}
}
Uploading a file from the command line, using curl
SITE=https://data.london.gov.uk
DATASET=your-slug-here
API_KEY=..
SRC_FILE="/path/to/src/file"
URL="${SITE}/api/dataset/${DATASET}/resources"
curl "${URL}" -H "Authorization: ${API_KEY}" -F "file=@${SRC_FILE}"
Uploading a file using Python Requests
import requests
import os
site='https://data.london.gov.uk'
dataset='your-slug-here'
api_key=os.environ['API_KEY']
src_file='/path/to/src/file'
title='Title of resource'
url='%s/api/dataset/%s/resources' % (site, dataset)
requests.post(url,
files={'file': open(src_file, 'rb')},
headers={'Authorization': api_key},
data={'title': title})
Replace a file in a dataset
POST
https://data.website/api/dataset/:slug/resources/:id
Path Parameters
slug
string
URL slug of the dataset. Also accepts the UUID.
id
string
UUID of the resource
Request Body
file
object
HTTP file upload
title
string
Optionally overwrite the title
description
string
Optionally overwrite the description
{
...normal dataset JSON here...
"readonly": {
"updates": [
{
"op": "replace",
"path": "/resources/bde00af7-7783-43d4-97b9-3bc18b8da7cc/url",
"value": "https://airdrive-secure.s3-eu-west-1.amazonaws.com/london/dataset/try-out/2019-01-15T17%3A22%3A10/book1.jpg"
},
{
"op": "replace",
"path": "/resources/bde00af7-7783-43d4-97b9-3bc18b8da7cc/check_hash",
"value": "ffe99ae64bc6fbab377535a6f7694650"
},
{
"op": "replace",
"path": "/resources/bde00af7-7783-43d4-97b9-3bc18b8da7cc/check_timestamp",
"value": "2019-01-15T17:22:11.016Z"
},
{
"op": "replace",
"path": "/resources/bde00af7-7783-43d4-97b9-3bc18b8da7cc/title",
"value": "new image of a book"
}
]
}
}
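By analogy with the append example above, replacing a file can be sketched with Python Requests. This is an illustrative sketch, not a verbatim sample from the API docs: the slug, resource UUID, file path, and title below are placeholders, and the request is wrapped in a function so it is only sent when called:

```python
import os
import requests

site = 'https://data.london.gov.uk'
dataset = 'your-slug-here'
resource = 'bde00af7-7783-43d4-97b9-3bc18b8da7cc'  # UUID of the resource to replace
src_file = '/path/to/src/file'

url = '%s/api/dataset/%s/resources/%s' % (site, dataset, resource)

def replace_resource():
    with open(src_file, 'rb') as f:
        r = requests.post(url,
            files={'file': f},
            headers={'Authorization': os.environ.get('API_KEY', '')},
            data={'title': 'new image of a book'})  # optionally overwrite the title
    r.raise_for_status()
    return r.json()
```

As with the other endpoints, the response is the full dataset JSON, with `readonly.updates` listing the replace operations the upload triggered.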
File Uploads Over 100MB
You can upload larger files directly to the storage backend by fetching a presigned link. This is how the web interface handles file uploads.
This offers much higher upload speeds, but it takes three requests:
1. POST to the API to fetch a presigned link.
2. POST to the storage backend to upload the file.
3. PATCH the dataset API to attach the file.
import requests
import os
from urllib.request import pathname2url
from datetime import datetime
# Path to a 100MB+ file:
file_to_upload = './files/Schedule 36 part 1.pdf'
# Website I am working on:
host = 'https://open.barnet.gov.uk'
# Dataset ID:
dataset = '2k11d'
# Don't forget to set your API key:
headers = {
'Authorization': os.environ['API_KEY'],
}
# -------------------------------------------------
# [Request 1/3] POST /api/dataset/:dataset/presign/:filename
filename = os.path.basename(file_to_upload)
url = '%s/api/dataset/%s/presign/%s' % (host, dataset, pathname2url(filename))
print('Presigning upload... \tPOST', url)
r1 = requests.post(url, headers=headers)
r1.raise_for_status()
presigned_response = r1.json()
# [Request 2/3]: Upload the file itself
with open(file_to_upload, 'rb') as f:
url = presigned_response['url']
print('Uploading file... \tPOST', url)
files = presigned_response['fields']
files['file'] = f
r2 = requests.post(url, files=files)
r2.raise_for_status()
# [Request 3/3]: Update the website with a JSON patch (RFC 6902)
patch = [
{
'op': 'add',
# The ID will be overwritten :-)
'path': '/resources/any_id_here',
'value': {
'origin': presigned_response['fields']['key'],
'title': filename,
# Place the file at the top of the list:
'order': -1,
# (omit 'order' to append to the bottom)
},
},
{
# Update the page timestamps
'op': 'replace',
'path': '/updatedAt',
'value': datetime.now().isoformat(),
},
]
url = '%s/api/dataset/%s' % (host, dataset)
print('Updating website... \tPATCH', url)
r3 = requests.patch(url, json=patch, headers=headers)
r3.raise_for_status()
print('Successfully uploaded file:', filename)