Skip to content

travis-ci/job-board

Repository files navigation

job-board

Build Status Code Climate Test Coverage

Job placement for everyone!

job-board Is intended to be responsible for job delivery to worker over HTTP as a replacement for RabbitMQ/AMQP. A separate image-specific API is also provided, which has been historically used as a test bed for per-job HTTP querying across various worker infrastructures.

For a detailed explanation of the APIs provided by job-board, and the behaviors expected from consumers, please see the API section below.

Status

Actively running in production, albeit with mixed levels of API adoption across infrastructures.

How does it fit into the rest of the system

  • Deployment: Heroku
  • scheduler (github): scheduler sends job payloads to job-board over HTTP.
  • worker (github): worker repeatedly sends "heartbeat" requests to job-board via HTTP for job delivery and job claim renewal. Once worker internally begins processing a job, an additional HTTP request fetches the full job payload from job-board. Some of the backend providers in worker may be configured to select images via HTTP queries to job-board, although this API is slated for eventual removal as the job delivery API provides the same data in the job payload.
  • gcloud-cleanup (github): gcloud-cleanup queries job-board over HTTP, and deletes images from GCE that are no longer registered with job-board.

API

Job Delivery API

Unique source identifier (${UNIQUE_ID})

The ${UNIQUE_ID} string used in the From: header is intended to be a unique identifier for purposes of a given Worker Processor communicating with Job Board. The scope of uniqueness we need is limited to "one Worker Processor", which is used on the Job Board side as a way to track job IDs claimed by Worker Processors.

${UUID}@${PID}.${HOSTNAME}

Delivery workflow

Each Worker Processor has an HTTP Job Queue that is responsible for repeatedly placing POST /jobs/pop requests to Job Board for purposes of fetching newly available job ids.

POST /jobs/pop{?queue}

As shown below, the queue query param is required.

Requests

If the Worker Processor is "waiting" or "new", such as when first initializing:

POST /jobs/pop?queue=flah
Content-Type: application/json
Travis-Site: ${SITE}
Authorization: basic ${BASE64_BASIC_AUTH}
From: ${UNIQUE_ID}

{
  "job_id": "${JOB_ID}"
}
Responses

If there is a job ID available:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "job_id": "${JOB_ID}",
  "@queue": "${QUEUE}"
}

If the queue param is omitted or empty:

HTTP/1.1 400 Bad Request
Content-Type: application/json

{
  "@type": "error",
  "error": "missing queue param"
}

If no job IDs are available for claim:

HTTP/1.1 204 No Content
Content-Type: application/json

If the Authorization header is missing, 401.

If the Authorization header is invalid, 403.

POST /jobs/add

This resource is intended to be used by scheduler to add to the pool of jobs available for delivery. It is the rough equivalent of scheduler "enqueueing" a job to RabbitMQ. The reason why this isn't a PUT is because the scheduler representation of a job has significantly less information in it than the representation returned by GET /jobs/{job_id}.

Request
Authorization: basic ${BASE64_BASIC_AUTH}
Travis-Site: ${SITE}
Content-Type: application/json

{
  "@type": "job",
  "id": "${JOB_ID}"
  "config": {
    // … other job representation bits
  }
}
Responses

If request is valid and the job does not already exist:

HTTP/1.1 201 Created
Content-Length: 0

If the request is valid but the job already exists, the request acts as an update, and will alter all persisted fields except for the job id:

HTTP/1.1 204 No Content

If the request is invalid:

HTTP/1.1 400 Bad Request
Content-Type: application/json

{
  "@type": "error",
  "error": "${ERROR_MESSAGE}"
}

If the Travis-Site header is missing, 412.

If the Authorization header is missing, 401.

If the Authorization header is invalid, 403.

POST /jobs/{job_id}/claim

TODO

Request

TODO

Responses

TODO

GET /jobs/{job_id}

This resource is intended to be used by a given Worker Processor after the above "heartbeat" request has succeeded as a way to retrieve the ful job representation as needed by Worker, which includes the job script and URI templates used for communicating with the Job State and Log Parts APIs.

Request
Travis-Site: ${SITE}
Travis-Infrastructure: ${INFRASTRUCTURE}
Authorization: basic ${BASE64_BASIC_AUTH}
From: ${UNIQUE_ID}
Responses

If the request and auth are valid:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "@type": "job",
  "id": "${JOB_ID}",
  "data": {
    "queue": "${QUEUE}",
    "config": {
      "os": "linux",
      "dist": "trusty",
      // ... other config bits
    },
  },
  // … other job representation bits
  "job_script": {
    "name": "main",
    "encoding": "base64",
    "content": "MjBiZGQzOTk0Mjc...(base64-encoded build.sh)"
  },
  "job_state_url": "${JOB_STATE_URL}",
  "log_parts_url": "${LOG_PARTS_URL}",
  "jwt": "${JOB_SPECIFIC_JWT}",
  "image_name": "${IMAGE_NAME}"
}

If the Travis-Site header is missing, 412.

If the Authorization header is missing, 401.

If the Authorization header is invalid, 403.

DELETE /jobs/{job_id}

This resource is intended to be used by Worker as a way to mark the job as "completed" with Job Board. This is the rough equivalent of what Worker does when communicating via AMQP by sending an "ACK" for the job message. The Authorization should use the JWT that was issued by Job Board specifically for this job.

Request
Travis-Site: ${SITE}
Authorization: Bearer ${JWT}
From: ${UNIQUE_ID}
Responses

If the request and auth are valid:

HTTP/1.1 204 No Content

If the Travis-Site header is missing, 412.

If the Authorization header is missing, 401.

If the Authorization header is invalid, 403.

Image API

One of the things that's queryable in job-board is image names on various infrastructures.

POST /images{?infra,tags,name,is_default}

Creates an image record.

Request
POST /images?infra=gce&tags=org:true&name=floo-flah-trusty-1438203722&limit=1
Authorization: basic ${BASE64_BASIC_AUTH}
Responses

If the request and auth are valid:

HTTP/1.1 201 Created
Content-Type: application/json

{
  "data": [
    {
      "id": "${IMAGE_ID}"
      // ... other stuff
    }
  ],
}

If the request is invalid, 400.

If the Authorization header is missing, 401.

If the Authorization header is invalid, 403.

PUT /images{?infra,tags,name,is_default}

Updates an image record.

Request
PUT /images?infra=gce&tags=production:true,org:true&name=floo-flah-trusty-1438203722
Authorization: basic ${BASE64_BASIC_AUTH}
Responses

If the request and auth are valid:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "data": [
    {
      "id": "${IMAGE_ID}"
      // ... other stuff
    }
  ],
}

If the request is invalid, 400.

If the Authorization header is missing, 401.

If the Authorization header is invalid, 403.

GET /images{?infra,tags,name,is_default,limit}

Query for an image record.

Request
GET /images?infra=gce&tags=production:true,org:true
Authorization: basic ${BASE64_BASIC_AUTH}
Responses

If the request and auth are valid:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "data": [
    {
      "id": "${IMAGE_ID}"
      // ... other stuff
    }
  ],
}

If the request is invalid, 400.

If the Authorization header is missing, 401.

If the Authorization header is invalid, 403.

POST /images/search

Perform the equivalent of multiple GET /images{?infra,tags,name,is_default,limit} requests within one request via newline-delimited application/x-www-form-urlencoded queries.

Request
Content-Type: application/x-www-form-urlencoded; boundary=NL
Authorization: basic ${BASE64_BASIC_AUTH}

infra=gce&tags=os:linux,group:stable,language_ruby:true&is_default=false&fields[images]=name&limit=1
infra=gce&tags=group:stable,language_ruby:true&is_default=false&fields[images]=name&limit=1
infra=gce&tags=language_ruby:true&is_default=false&fields[images]=name&limit=1
infra=gce&tags=os:linux&is_default=true&fields[images]=name&limit=1
Responses

If the request and auth are valid, noting that the above example includes fields[images]=name, which has the potential to reduce the response body size substantially.

HTTP/1.1 200 OK
Content-Type: application/json

{
  "data": [
    {
      "name": "floo-flah-trusty-1438203722"
    }
  ]
  "meta": {
    "limit": 1,
    "matching_query": {
      "infra": "gce",
      "tags": {
        "group": "stable",
        "language_ruby": "true",
      },
      "is_default": false,
      "limit": 1
    }
  }
}

If the request is invalid, 400.

If the Authorization header is missing, 401.

If the Authorization header is invalid, 403.