4. Jobs
This section covers the Job API resource. Jobs typically take an uploaded document as input, so be sure you have read the section on uploaded documents first.
The Job Resource
Jobs are the primary resource for the API. They represent the work being done as inputs get transformed.
Jobs are created through the endpoint POST /jobs. Once created, each job has a unique ID (UUID format) as well as an owner (the user who created it). By default, jobs can only be seen, modified, and removed by their owner.
A list of accessible jobs can be retrieved from GET /jobs. The details of a particular job can be retrieved from GET /jobs/:jobId.
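As a rough sketch, the three endpoints above could be wrapped as shown here. Only the methods and paths come from this section; the base URL https://api.example.com, the bearer-token header, and the helper names are assumptions made for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Hypothetical base URL; substitute the real API host.
BASE_URL = "https://api.example.com"


def build_request(method, path, token, body=None):
    """Build (but do not send) an HTTP request for the Job API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(BASE_URL + path, data=data, method=method)
    # Bearer-token authentication is an assumption, not documented here.
    request.add_header("Authorization", "Bearer " + token)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


def create_job_request(token, job):
    """POST /jobs: create a new job from a JSON-serializable dict."""
    return build_request("POST", "/jobs", token, body=job)


def list_jobs_request(token):
    """GET /jobs: list the jobs accessible to the caller."""
    return build_request("GET", "/jobs", token)


def get_job_request(token, job_id):
    """GET /jobs/:jobId: fetch the details of one job."""
    return build_request("GET", "/jobs/" + quote(job_id, safe=""), token)
```

Each helper returns a request object that can be passed to urllib.request.urlopen to actually send it.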
Job Type
Each job has a type field, which indicates the type of the job. The type of the job uniquely identifies the backend process that will be used for the job.
Some example job types:
The full list of job types is available at the endpoint GET /job-types. The details of a particular job type can be retrieved at GET /job-types/:jobTypeId.
Job Status
Each job has a status field, which indicates the current state of the job.
Here are some example job statuses:
- running: the job is actively being processed by the system.
- blocked: the job has been paused for human review.
- completed: the job is complete and the output is ready for retrieval.
- cancelled: the job was deliberately cancelled prematurely. The contents of the job are likely incomplete or invalid.
- failed: the job cannot continue and has been marked as failed. The contents of the job are likely incomplete or invalid.
Job Contents
Each job includes a description of its input and output. For all jobs, the input document is detailed in the input_content field. Once a job completes, the output_content field contains a description of the output, including a URI for retrieving the output.
Jobs also contain other types of content, determined by their type, their status, and their progress through the workflow. For example, jobs of type data-point-extraction contain content of type mapping-taxonomy-json. For more information on retrieving content other than input and output, see Retrieving Job Contents.
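A minimal sketch of reading the output location from a job object once it has completed. The field names status and output_content come from this section; the key name "uri" inside output_content is an assumption, so check it against an actual response.

```python
def output_uri(job):
    """Return the output URI of a completed job, or None if not available."""
    # Output is only described once the job has completed.
    if job.get("status") != "completed":
        return None
    output = job.get("output_content") or {}
    # The key name "uri" is hypothetical; the real response may use another name.
    return output.get("uri")
```

Returning None for any status other than completed avoids reading a partial output description from a cancelled or failed job.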
Job Collaboration
By default, only a job’s owner can view, edit, and work on a job. To allow other users to collaboratively work on the job, the field collaboration is used.
Collaboration can be controlled through either teams or directly through users, as the collaboration field contains both of these as sub-fields.
- Adding collaborators by team requires either the name of the team or the ID of the team.
- Adding collaborators directly requires either the email of the user or the ID of the user.
Each collaboration contains a list of strings called steps. These indicate which roles the collaborator is allowed to access. Sending "*" for steps indicates that the collaborator can access all roles.
Example
An example of a job with a single team collaborating on all steps:
{
  ...
  "collaboration": {
    "teams": [
      {
        "id": "c383d5a5-4cff-473f-b820-b53bb70abb78",
        "steps": ["*"]
      }
    ],
    "users": []
  },
  ...
}
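The collaboration structure can also be assembled programmatically. The sketch below is one possible set of helpers; the keys "teams", "users", "id", and "steps" appear in the example above, while the keys "name" and "email" and the rule of accepting exactly one identifier per collaborator are assumptions.

```python
def team_collaborator(team_id=None, name=None, steps=("*",)):
    """Describe a team collaborator by ID or by name (exactly one of the two)."""
    if (team_id is None) == (name is None):
        raise ValueError("provide exactly one of team_id or name")
    # The "name" key is an assumption; only "id" appears in the example.
    entry = {"id": team_id} if team_id is not None else {"name": name}
    entry["steps"] = list(steps)
    return entry


def user_collaborator(user_id=None, email=None, steps=("*",)):
    """Describe a user collaborator by ID or by email (exactly one of the two)."""
    if (user_id is None) == (email is None):
        raise ValueError("provide exactly one of user_id or email")
    # The "email" key is an assumption; only "id" appears in the example.
    entry = {"id": user_id} if user_id is not None else {"email": email}
    entry["steps"] = list(steps)
    return entry


def build_collaboration(teams=(), users=()):
    """Assemble the value of a job's collaboration field."""
    return {"teams": list(teams), "users": list(users)}
```

Calling build_collaboration(teams=[team_collaborator(team_id="c383d5a5-4cff-473f-b820-b53bb70abb78")]) reproduces the collaboration object from the example.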
Job Metadata
Each job has a metadata field, which is an object of key/value pairs. Clients are free to use this field to add any additional client-specific information about the job. When creating a job, any unknown fields are automatically added to the metadata. The values can be any valid JSON type.
Most job types require specific metadata fields. For each job type, read the documentation specific to that type.
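To illustrate the handling of unknown fields, the sketch below models the described behavior on the client side. It is not the server's implementation: the set of recognized top-level fields is an assumption, and so is the choice to let an unknown top-level field overwrite a metadata key of the same name.

```python
# Assumed set of top-level fields the API recognizes; the real list may differ.
KNOWN_FIELDS = frozenset({"type", "input_content", "collaboration", "metadata"})


def fold_unknown_fields(payload, known_fields=KNOWN_FIELDS):
    """Return a copy of payload with unrecognized fields moved into metadata."""
    job = {"metadata": dict(payload.get("metadata", {}))}
    for key, value in payload.items():
        if key == "metadata":
            continue  # already copied above
        if key in known_fields:
            job[key] = value
        else:
            # Unknown fields end up under metadata (collision rule is assumed).
            job["metadata"][key] = value
    return job
```

For example, a creation payload containing a hypothetical customer_ref field would be stored with customer_ref under metadata rather than at the top level.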
Monitoring Job Status
Once a job has been created, the client must monitor the status of the job and track its progress. This is done through the endpoint GET /jobs/:jobId.
The progress of the job is reported in the fields progress.current and progress.total, which indicate the current step and the total number of steps respectively. These numbers are only estimates and should not be taken too literally.
Once the job has reached a final status of completed, cancelled, or failed, the client should stop monitoring the job.
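The monitoring described above could be implemented as a simple polling loop. This is a sketch under assumptions: fetch_job is a hypothetical caller-supplied function that performs GET /jobs/:jobId and returns the parsed job as a dict, and the polling interval and poll limit are arbitrary choices, not values recommended by the API.

```python
import time

# Final statuses named in this section; polling stops once one is reached.
TERMINAL_STATUSES = frozenset({"completed", "cancelled", "failed"})


def wait_for_job(fetch_job, job_id, interval=5.0, max_polls=None, sleep=time.sleep):
    """Poll a job until it reaches a final status and return the final job dict.

    fetch_job(job_id) must return the job as a dict (hypothetical helper).
    Raises TimeoutError if max_polls is set and the job is still not final.
    """
    polls = 0
    while True:
        job = fetch_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        polls += 1
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError("job %s not finished after %d polls" % (job_id, polls))
        # Non-final statuses such as running or blocked keep the loop going.
        sleep(interval)


def progress_fraction(job):
    """Return progress.current / progress.total as a rough estimate, or None."""
    progress = job.get("progress") or {}
    current, total = progress.get("current"), progress.get("total")
    if not isinstance(current, (int, float)) or not isinstance(total, (int, float)) or total <= 0:
        return None
    return current / total
```

Because a blocked job waits on human review, a caller may want a generous max_polls or a longer interval for job types that involve review steps.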