Eclips Web Technical Specification v4.2
eClips Web
Technical Specification V4.2
Introduction
Product description
Service overview
  Articles & Versions
Payload Orchestrator
  Payload Orchestrator Requests
    Base URL
    Method Parameters
    Complete Request
  Payload Orchestrator Output
Redirector
  Redirector Requests
    Base URL
    Method Parameters
    Complete Request
  Redirector Output
Article Orchestrator
  Article Orchestrator Output
Authentication Engine
  User management
  User identification
  Licences
  Entitlements
  Embargoes
  Restrictions
Appendix A – Important Properties
Appendix B – Article Version Status
Appendix C – Errors
Appendix D – Glossary
Document Control
Introduction
This document provides a comprehensive technical description of the eClips Web service.
It is intended for use by organisations wishing to receive the eClips Web service, including
media monitoring organisations and content aggregators. It includes both high-level and
detailed technical information.
Product description
eClips Web is part of the eClips product, an online service that provides media monitoring
organisations (MMOs) and content aggregators with digital news content. eClips Web
specifically delivers content from UK newspaper websites in a timely and accurate manner.
The content in eClips Web is collected directly from publishers’ content management
systems (CMS), cleansed, standardised, and archived in a consistent data structure.
Collection and processing of content is carried out close to real-time and excludes non-
article content such as adverts and navigational pages.
This assurance allows MMOs and aggregators using eClips Web to deliver high quality
monitoring solutions to their customers (end users). Elements of the service exposed to end
users are specifically designed to support their information needs whilst minimising the
technical complexity to which they are exposed.
Service overview
eClips Web uses a service-oriented architecture (SOA) with four functional components.
Component              Function
Payload Orchestrator   Returns article details in standardised XML
Redirector             Determines the appropriate method for viewing an article
Article Orchestrator   Renders the article in HTML or PDF
Authentication Engine  Authenticates the user for access to the specified content
All components must be accessed over a secure SSL connection and therefore use the
HTTPS protocol.
Articles & Versions
An article version is a representation of the properties and contents of that article in any
given update (including its creation).
Users interacting with eClips Web will receive article versions but may use the concept of an
article to group article versions together.
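The grouping of article versions under an article can be sketched as follows. This is a minimal illustration only: the record keys `article_id` and `version` are assumptions made for the example and are not field names defined by this specification.

```python
from collections import defaultdict


def group_versions(versions):
    """Group article-version records under their parent article.

    Each record is a dict with the HYPOTHETICAL keys 'article_id' and
    'version'; neither name comes from this specification.
    """
    grouped = defaultdict(list)
    for record in versions:
        grouped[record["article_id"]].append(record["version"])
    # Sort so that the versions of each article appear in update order.
    return {article: sorted(numbers) for article, numbers in grouped.items()}


sample = [
    {"article_id": "A1", "version": 2},
    {"article_id": "B7", "version": 1},
    {"article_id": "A1", "version": 1},
]
grouped = group_versions(sample)
```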
Payload Orchestrator
Payload Orchestrator is a RESTful API, which can be queried manually or programmatically
using a set of defined parameters. It is accessible by MMOs and content aggregators and is
designed to support machine interrogation and interpretation.
All Payload Orchestrator requests are routed through the Authentication Engine in order to
confirm the user’s right to access the specified content, based on either cookies or
credentials supplied in the request. Licence restrictions will also apply.
Base URL
All Payload Orchestrator requests start with the following base URL:
www.nla-eclipsweb.com/service/api/payload.xml
This indicates that the request is for eclipsweb content, that it is a payload request, and
that the data should be returned in XML format.
Method Parameters
Payload Orchestrator requests must use one of two methods.
• Index continuation
• Date-time
Of these, the NLA recommends Index continuation as the preferred method. This is because it is
optimal for frequent requests, ensures that article versions are not skipped, and delivers
reproducible results with high performance.
In contrast, Date-time carries a risk of different article versions being returned when the
same request is made at a later time, due to the natural delay between the actual publication
of an article version and its processing in eClips Web (eCW). It is also a slower request and
may deliver very high numbers of article versions in a single payload.
Note that a single payload request cannot use both methods, so any request using
parameters for more than one method will be unsuccessful.
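A client may wish to check this rule before sending a request. The sketch below is one possible way to do so. The parameter names `start` and `end` are taken from the Date-time example in this document, whereas `fromIndex` is a purely hypothetical placeholder for an Index continuation parameter and should be replaced with the real parameter names.

```python
# 'start' and 'end' appear in the Date-time example in this document.
DATE_TIME_PARAMS = {"start", "end"}
# HYPOTHETICAL placeholder: the real Index continuation parameter names
# are not assumed here and must be substituted.
INDEX_PARAMS = {"fromIndex"}


def detect_method(params):
    """Return the method a request uses, rejecting mixed or missing methods."""
    names = set(params)
    uses_index = bool(names & INDEX_PARAMS)
    uses_date_time = bool(names & DATE_TIME_PARAMS)
    if uses_index and uses_date_time:
        raise ValueError("a payload request cannot use both methods")
    if not (uses_index or uses_date_time):
        raise ValueError("a payload request must use one of the two methods")
    return "index continuation" if uses_index else "date-time"


method = detect_method({"start": "11/02/2015", "end": "11/02/2015"})
```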
Index continuation
Each article version in eCW has a unique index value, which increments by one for each
new article version processed by the NLA. Not all article versions will be published to MMOs,
but this value will always increase in sequence. This method uses this value to return article
versions in order of their receipt and is independent of the accuracy of the article version
metadata.
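On the client side, the consequence of this behaviour is that the index values received will always increase but may contain gaps. The sketch below shows one way a client might keep track of the highest index it has processed. The function name and the idea of a stored checkpoint are assumptions for illustration and do not describe an eClips Web feature.

```python
def update_checkpoint(checkpoint, received_indices):
    """Return the highest index processed so far.

    Gaps between indices are accepted, because not all article versions
    are published to MMOs. Indices that fail to increase are rejected,
    since article versions are returned in order of receipt.
    """
    previous = checkpoint
    for index in received_indices:
        if index <= previous:
            raise ValueError(f"index {index} does not follow {previous}")
        previous = index
    return previous


# 102 is missing from this payload, which is not an error.
checkpoint = update_checkpoint(100, [101, 103, 104])
```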
Index continuation Payload Orchestrator requests use the following method parameters:
Date-time
Each article in eCW has a publication timestamp, which is provided by the publisher. This
method returns article versions where the timestamp is within a range specified in the
request, including where the article version was restricted during that time period.
Title Filter
Payload Orchestrator requests can specify from which title(s) results should be returned.
The full list of titles and codes is available at blog.nla.co.uk/ecwdocs/.
Payload Orchestrator title filtering details are supplied by the following parameters:
User Credentials
Payload Orchestrator requires user credentials. On the first request, these must be supplied
in the query string. Subsequently, these can be provided by a cookie for up to 365 days.
Complete Request
The above elements and parameters are combined to form a payload request, as shown in
the examples below.
Date-time (basic)
www.nla-eclipsweb.com/service/api/payload.xml?start=11/02/2015&end=11/02/2015
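A basic Date-time request such as the one above could be assembled programmatically as sketched below. Two assumptions are made: that the dates are written day first (dd/mm/yyyy), and that the `https://` scheme is prefixed because all components must be accessed over HTTPS. User credentials are left out of this sketch.

```python
from datetime import date
from urllib.parse import urlencode

# Base URL from this document, with the HTTPS scheme added.
PAYLOAD_BASE = "https://www.nla-eclipsweb.com/service/api/payload.xml"


def build_date_time_request(start, end):
    """Build a basic Date-time payload request URL.

    ASSUMPTION: dates are formatted day first (dd/mm/yyyy).
    """
    query = urlencode(
        {"start": start.strftime("%d/%m/%Y"), "end": end.strftime("%d/%m/%Y")},
        safe="/",  # keep the slashes in the dates unescaped, as in the example
    )
    return f"{PAYLOAD_BASE}?{query}"


url = build_date_time_request(date(2015, 2, 11), date(2015, 2, 11))
```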
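Payload Orchestrator Output
[Figure: Payload Orchestrator output structure, showing the elements newsMessage, header, itemSet, packageItem, itemMeta, newsItem, contentMeta, and contentSet]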
More than one packageItem can appear within one itemSet, but only one itemSet can
appear within one newsMessage.
Within the contentSet element, the article’s text fields adhere to the NITF specification.
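The containment rule for itemSet and packageItem can be checked when reading a payload, as in the minimal sketch below. The sample XML is invented for illustration and is not real eClips Web output, and the use of the standard IPTC NewsML-G2 namespace is an assumption about the payload.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: the payload uses the standard IPTC NewsML-G2 namespace.
NS = {"nar": "http://iptc.org/std/nar/2006-10-01/"}

# Invented sample, reduced to the elements needed for the check.
SAMPLE = """<newsMessage xmlns="http://iptc.org/std/nar/2006-10-01/">
  <header/>
  <itemSet>
    <packageItem guid="example-1"/>
    <packageItem guid="example-2"/>
  </itemSet>
</newsMessage>"""


def package_items(xml_text):
    """Return the packageItem elements of a newsMessage."""
    root = ET.fromstring(xml_text)
    item_sets = root.findall("nar:itemSet", NS)
    if len(item_sets) > 1:
        raise ValueError("only one itemSet can appear within one newsMessage")
    if not item_sets:
        return []
    return item_sets[0].findall("nar:packageItem", NS)


guids = [item.get("guid") for item in package_items(SAMPLE)]
```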
Redirector
Redirector is a RESTful API, which can be queried manually or programmatically. It is
accessible by all users.
Redirector checks the availability of a web news article, and reroutes if available, thereby
allowing a user to access a live article in preference to an archived version.
Authentication is not required to access the live version of a web news article, although
articles behind publisher paywalls may not be fully accessible without the appropriate
subscriptions. However, if the live version is not available, the Authentication Engine
requests user details before the archived version of the article can be returned.
Note that any request for a PDF of an article, or a specific version of an article, will always
cause Redirector to make an Article Orchestrator request even if the article is still live.
Figure 3: Redirector overview
[Flow diagram: USER submits Redirector request (html); REDIRECTOR checks whether live article is still available; if available, REDIRECTOR returns live article; the diagram also shows an unavailable branch]
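The routing behaviour described in this section can be summarised in a small decision function. This is a sketch of the documented behaviour only, not of the Redirector implementation, and the function and argument names are hypothetical.

```python
def route(live_available, wants_pdf=False, version_specified=False, multiple=False):
    """Return (target, needs_authentication) for a Redirector request.

    target is "live" for the article on the source webpage, or
    "article_orchestrator" for the archived rendering.
    """
    # PDF requests and requests for a specific version always go to
    # Article Orchestrator, even if the article is still live. Multiple
    # article requests do the same (see Multiple articles).
    if wants_pdf or version_specified or multiple:
        return ("article_orchestrator", True)
    if live_available:
        # No authentication is required for the live version.
        return ("live", False)
    # The archived version is returned only after authentication.
    return ("article_orchestrator", True)


decision = route(live_available=True)
```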
Redirector Requests
All Redirector requests are structured as a ‘GET’ method HTTP request with a number of
component parts.
Base URL
All Redirector requests start with the following base URL:
www.nla-eclipsweb.com/service/redirector/article/
This indicates that the request is for eclipsweb content and that it is a redirector request
for article data.
Method Parameters
Redirector requests target one or more specific articles. The method for requesting multiple
articles is different from that for requesting a single article.
Single article
Multiple articles
Multiple article Redirector requests use the following method parameters, where
articleID:version groups are comma separated. The result will always be an Article
Orchestrator call.
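The comma-separated articleID:version groups can be produced and read back as sketched below. The identifiers used are invented, and the sketch assumes that neither article IDs nor versions contain a colon or a comma. Only the value string is built here, because the parameter name that carries it is not assumed.

```python
def format_groups(pairs):
    """Join (article_id, version) pairs into 'id:version,id:version'."""
    return ",".join(f"{article_id}:{version}" for article_id, version in pairs)


def parse_groups(text):
    """Split 'id:version,id:version' back into (article_id, version) pairs."""
    groups = []
    for chunk in text.split(","):
        article_id, separator, version = chunk.partition(":")
        if not separator or not article_id or not version:
            raise ValueError(f"malformed articleID:version group: {chunk!r}")
        groups.append((article_id, version))
    return groups


# Invented identifiers, for illustration only.
value = format_groups([("1001", "2"), ("1002", "1")])
```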
Supplier details
User Credentials
Redirector does not require user credentials. However, if valid credentials are provided in the
query string, and the request requires a subsequent background Article Orchestrator
request, no further authentication will be required.
As with Payload Orchestrator, credentials provided in the query string will place a cookie, if
possible, which will then authenticate the user for Article Orchestrator for up to 365 days.
Complete Request
The above elements and parameters are combined to form a redirector request, as shown in
the examples below.
Redirector Output
All successful Redirector requests return either:
• The live article on the source webpage
• The article in Article Orchestrator format
The format of the live article on the source webpage is not controlled by NLA or eClips Web.
The format of Article Orchestrator is detailed below.
Article Orchestrator
Article Orchestrator is a component which renders one or more eCW articles into a
human-readable format.
As described above, Redirector requests for which the live article is unavailable, or where
certain parameters are present in the request, and where authentication is successful, will
result in a background request to Article Orchestrator.
The figure below outlines the structure of an Article Orchestrator document. The structure
shown is shared by HTML and PDF documents.
Figure 4: Article Orchestrator output structure
Authentication Engine
Authentication Engine is the mechanism by which users attempting to access any part of
eClips Web are assessed and then allowed or denied access to content and features.
[Figure: Authentication Engine overview, with the labels Articles, Embargoes, Restrictions, Organisation, Licences, Entitlements, Users, Identification, and User management]
User management
Users of eClips Web are managed through the eClips User Management Interface (UMI).
This is available at https://www.nla-eclips.com/manage/.
Once an organisation has been set up in eClips Web by the NLA, the organisation will have
the appropriate licences assigned to it, as well as at least one user.
If a user has Admin permissions, they will be able to create and manage other users for their
organisation through the UMI.
User identification
On each request for eClips Web content, the requesting user must be identified by providing
a username and password. The mechanisms for providing these credentials are as follows.
For each of these mechanisms, the first successful authentication will generate a user- and
device-specific cookie. This avoids the need for further identification for 365 days, or until the
cookie is removed. For this to work, cookies must be allowed on the device.
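The lifetime of the cookie can be expressed as a simple check. The sketch below is illustrative only: the function name is hypothetical, and treating the cookie as expired exactly 365 days after the first successful authentication is an assumption about the boundary.

```python
from datetime import datetime, timedelta

COOKIE_LIFETIME = timedelta(days=365)


def cookie_valid(first_authenticated, now, removed=False):
    """Return True while the cookie still identifies the user.

    ASSUMPTION: the cookie expires exactly 365 days after the first
    successful authentication.
    """
    return not removed and now < first_authenticated + COOKIE_LIFETIME


issued = datetime(2015, 2, 11, 9, 0)
still_valid = cookie_valid(issued, issued + timedelta(days=364))
```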
Licences
Organisations are set up with licences which define to which components and titles within
eClips Web they have access. For example, MMO organisations have access to the Payload
Orchestrator component but client organisations do not.
Licences are set up by the NLA based on the agreements made with individual
organisations.
Entitlements
An organisation’s licence for a given title is accompanied by an entitlement. This is the
period of time after the publication of an article during which users in that organisation will
have access to that article and is component specific.
Most licences are set up with 7-day entitlements for Payload Orchestrator and 100-day
entitlements for Article Orchestrator.
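An entitlement can be thought of as a time window that opens at publication. The sketch below uses the typical 7-day and 100-day values given above. The component keys `payload` and `article` are hypothetical labels, and treating the final moment of the window as still entitled is an assumption.

```python
from datetime import datetime, timedelta

# Typical entitlement periods stated above; individual licences may differ.
DEFAULT_ENTITLEMENT_DAYS = {"payload": 7, "article": 100}


def within_entitlement(published, now, component, days=DEFAULT_ENTITLEMENT_DAYS):
    """Return True if 'now' falls inside the entitlement window.

    'component' is a HYPOTHETICAL key: "payload" for Payload Orchestrator
    or "article" for Article Orchestrator.
    """
    window = timedelta(days=days[component])
    return published <= now <= published + window


published = datetime(2015, 2, 11)
after_five_days = within_entitlement(published, datetime(2015, 2, 16), "payload")
```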
Embargoes
For some articles, the publisher will apply an embargo to the article’s availability in
eClips Web.
In this case, the article will not be available in eClips Web feeds until the embargo has
passed.
Restrictions
For some articles, the publisher will apply a restriction to the article’s availability in
eClips Web. A restriction indicates the level of permission a user must have to
continue to have access to the article.
In this case, once the restriction has been applied, the article will no longer be available
in eClips Web feeds to users whose permission level is lower than that required to access
the restricted article.
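The combined effect of embargoes and restrictions on availability can be sketched as below. Modelling permission levels as integers, where a higher number means more permission, is an assumption made for this example. The argument names are also hypothetical.

```python
from datetime import datetime


def article_available(now, user_level, embargo_until=None, required_level=None):
    """Return True if the article is available to the user in feeds.

    ASSUMPTION: permission levels are integers, and a higher value means
    a higher level of permission.
    """
    # An embargoed article is unavailable until the embargo has passed.
    if embargo_until is not None and now < embargo_until:
        return False
    # A restricted article is unavailable to users below the required level.
    if required_level is not None and user_level < required_level:
        return False
    return True


now = datetime(2015, 2, 11, 12, 0)
embargoed = article_available(now, user_level=1, embargo_until=datetime(2015, 2, 12))
```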
Appendix A – Important Properties
Title properties
ABCe data
Appendix B – Article Version Status
By default, article versions in eClips Web are usable (pubStatus=usable), which means
that they can be processed, viewed, and stored according to usage agreements.
However, on occasion, the publisher of an article will choose to restrict, or withdraw, access
to one or more versions of a published article.
Note that the date-time at which a new status was applied will also be considered in any
Payload Orchestrator (date-time) requests covering that date-time. At the same time, NLA
will issue a restriction notice by email to all MMOs who could have received the affected
article versions.
When an article version’s status is withdrawn, all organisations and users that have received
it are obligated to remove all instances of this article version from all stored and shared locations.
Where multiple versions of the same article are withdrawn, the obligation applies to all
instances of all affected article versions.
These article versions will now no longer be available through Redirector and Article
Orchestrator. If later versions of the restricted article are unrestricted, these will still be
available for use.
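A receiving organisation might apply the removal obligation to a local store along the lines of the sketch below. The store layout is hypothetical, and the literal status string `withdrawn` is an assumption, since only `usable` is quoted as a pubStatus value in this section.

```python
def purge_withdrawn(store):
    """Return a copy of the store without withdrawn article versions.

    'store' is a HYPOTHETICAL local structure mapping
    (article_id, version) to a record dict with a 'pubStatus' key.
    ASSUMPTION: a withdrawn version carries the status "withdrawn".
    """
    return {
        key: record
        for key, record in store.items()
        if record["pubStatus"] != "withdrawn"
    }


store = {
    ("A1", 1): {"pubStatus": "withdrawn"},
    ("A1", 2): {"pubStatus": "usable"},  # a later version that is still usable
    ("B7", 1): {"pubStatus": "usable"},
}
remaining = purge_withdrawn(store)
```

The same removal would also need to be applied to any shared copies, which are outside the scope of this sketch.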
Appendix C – Errors
The following HTTP response codes may be returned from a Payload Orchestrator
request:
Appendix D – Glossary
Term       Meaning
ABCe       Audit Bureau of Circulations (ABC) is the industry body for media
           measurement. They supply domain-level access statistics. ABCe indicates the
           branch of the ABC that deals with electronic publications (although this
           terminology is no longer used by the ABC, it is helpful in distinguishing the
           information source within eClips). More information can be found at
           www.abc.org.uk/.
NewsML-G2  An XML standard for news content metadata.
           More information can be found at iptc.org/standards/newsml-g2/.
NITF       News Industry Text Format: an XML standard for news content structure.
           More information can be found at iptc.org/standards/nitf/.
RESTful    An architectural style for an API which uses Representational State Transfer.
           More information can be found at ibm.com/developerworks/library/ws-restful/.
Document Control
Version  Date        Updated by    Updates
1.4      12-02-2015  Tessa Radwan  Document adapted from existing spec v1.4, to update contents and branding.
1.5      12-03-2015  Tessa Radwan  Updated following feedback from MA
1.6      02-04-2015  Tessa Radwan  Updated with Redirector, Article Orchestrator, and Authentication sections
1.7      22-05-2015  Tessa Radwan  Updated with feedback from ISE
2.0      03-06-2015  Tessa Radwan  Prepared for publication