Grr doc 648 #19

Open: wants to merge 5 commits into base: master
107 changes: 46 additions & 61 deletions docs/user-guide/storage-reference.md
@@ -1,56 +1,48 @@
---
---
title: Object Storage Reference
markdown2extras: wiki-tables, code-friendly
---

# Object Storage Reference


The Joyent Manta Storage Service uses a REST API to read, write, and delete objects.
You should be familiar with HTTP-based REST systems when you read this section.
In particular, you should be familiar with HTTP requests, responses, status codes,
and headers.
This document assumes that you are familiar with HTTP-based REST systems, including
HTTP requests, responses, status codes, and headers.

If you want to learn the basics, see [Getting Started](index.html).

Unless otherwise
specified, the semantics described here are stable, which means that you can
expect that future updates will not change the documented behavior. You should
avoid relying on behavior not specified here.
If you want to start with basic information on Manta object storage, read [Getting Started](index.html).

Unless otherwise specified, the semantics described here are stable, which means that you can expect that future updates will not change the
documented behavior. You should avoid relying on behavior not specified here.


# Storage Overview

The storage service is based on three concepts: objects, directories, and SnapLinks.

* **Objects** consist of data and metadata you can read, write, and delete from
the storage service. The data portion is opaque. The metadata is a set of
the storage service. The data portion is opaque. The metadata is a set of
HTTP headers that describe the object, such as `Content-Type` and
`Content-MD5`. An object is identified by a name.
* Objects can be grouped into **directories**, as on traditional file systems.
`Content-MD5`. An object is identified by a name.
* **Directories** are named groups of objects, as on traditional file systems.
Every object belongs to a directory.
The private storage directory, `/:login/stor`, functions as the top level, or
root directory.
* **SnapLinks** allow you to create a point-in-time reference to the data and
* **SnapLinks** create a point-in-time reference to the data and
metadata that constitutes another object.
Unlike hard links or symbolic links in Unix, when the source object changes,
the SnapLink does not.
You can use SnapLinks to create arbitrary versioning schemes.




# Objects

Objects are the primary entity you store in Joyent Manta Storage Service.
Objects can be of any size, including zero bytes.
Objects consist of your raw, uninterpreted data,
as well as the metadata (HTTP headers) returned when you retrieve an object.

There are several headers for objects that control HTTP semantics


# Headers
There are several headers for objects that control HTTP semantics in Manta.

## Content Length

@@ -66,7 +58,6 @@ This estimate is only an upper bound.
The system will record how much data you *actually* transferred and record that.
Subsequent GET requests will return the actual size of the object.


## 100-continue Request Header

You can, but are not required to, use the
@@ -83,14 +74,14 @@ which will be stored and returned back (HTTP content-negotiation will be handled
If you do not specify a content type, the default is `application/octet-stream`.

If you specify a `Content-MD5` header, the system validates that the content
uploaded matches the value of the header.
uploaded matches the value of the header. You must encode MD5 headers in Base64,
as described in [RFC 1864](https://www.ietf.org/rfc/rfc1864.txt).
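The Base64 requirement is easy to get wrong, because many tools print MD5 digests as hexadecimal text. The following is a minimal Python sketch of computing a `Content-MD5` value from the raw digest bytes. The function name `content_md5` is hypothetical and is not part of any Manta client library.

```python
import base64
import hashlib


def content_md5(data: bytes) -> str:
    """Return the Base64-encoded MD5 digest of data, in the form RFC 1864 describes.

    Hypothetical helper for illustration; not part of a Manta SDK.
    """
    # Use the raw 16-byte digest, not the 32-character hex string.
    digest = hashlib.md5(data).digest()
    return base64.b64encode(digest).decode("ascii")
```

For example, `content_md5(b"foo\n")` produces a 24-character value that you could send as the `Content-MD5` header when uploading the same bytes.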


The `durability-level` header is a value from 1 to 6
that specifies how many copies of an object the system stores.
If you do not specify a durability level, the default is 2.


## Conditional Request Headers

The system honors the standard HTTP conditional requests such as
@@ -108,7 +99,7 @@ it can be a comma separated
list of `origin` values.
When a request is sent with the `origin` header,
the *list* of values of the stored `access-control-allow-origin` header is processed
and only the *matching* value is returned, if any. For example:
and only the *matching* value is returned, if any. For example:

$ echo "foo" | \
mput -q -H 'access-control-allow-origin: foo.com,bar.com' /:login/public/foo
@@ -128,17 +119,17 @@ and only the *matching* value is returned, if any. For example:
x-response-time: 7
x-server-name: fef8c5b8-3483-458f-95dc-7d9172ecefd1

If no `origin` header is sent, the request is assumed to not be originating from
a browser, and the original list of values is echoed back.
Note this is non-conforming to the CORS specification,
but allows you to administratively see
If no `origin` header is sent, the system assumes that the request did not originate from
a browser and the original list of values is echoed back.
While this behavior does not conform to the CORS specification,
it does allow you to administratively see
what is stored on your object.
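The matching rule described above can be modeled in a few lines of Python. This is only a sketch of the documented behavior, not the service's implementation. The function name `allow_origin_response` is hypothetical, and the sketch assumes exact string comparison with no wildcard handling.

```python
from typing import Optional


def allow_origin_response(stored: str, origin: Optional[str]) -> Optional[str]:
    """Model which access-control-allow-origin value would be returned.

    Illustrative only: assumes exact string matching, no wildcards.
    """
    allowed = [value.strip() for value in stored.split(",") if value.strip()]
    if origin is None:
        # No origin header: the stored list is echoed back unchanged.
        return stored
    # Only the matching value is returned; None models "no header returned".
    return origin if origin in allowed else None
```

With a stored value of `foo.com,bar.com`, a request with `origin: foo.com` yields `foo.com`, a request with an unlisted origin yields nothing, and a request with no `origin` header yields the full list.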

`access-control-expose-headers` is supported as a list of HTTP headers that a
browser will expose. This list is not interpreted by the system.
browser will expose. This list is not interpreted by the system.

`access-control-allow-methods` is supported as a list of HTTP methods that the system
will honor for this request. You can only specify HTTP operations the system
will honor for this request. You can only specify HTTP operations the system
supports: HEAD, GET, PUT, DELETE.

`access-control-max-age` is supported and uninterpreted by the system.
@@ -173,7 +164,7 @@ All objects are stored at the top level or subdirectory one of the following dir
As noted above, `/:login/stor` functions as the top level, or root, directory where you store
objects and create directories.
Only you can read, write, and delete data here.
You can create any number of directories, objects and SpanLinks in this directory.
You can create any number of directories, objects and SnapLinks in this directory.

While the system does not yet support discretionary
access controls on objects or directories, you can grant access to individual
@@ -207,13 +198,10 @@ Once a jobs is archived, listing a job directory would return this.
stor/

The contents of a job's directory are a complete snapshot of all data available over the jobs API.
You can clean this data up using `mrm -r` or a command, such as `mfind` that generates
a list of objects in the directory.
You can clean this data up using `mrm -r`. You can also use `mfind` to generate a list of objects in the directory.

Only you or jobs you create can read, write, and delete data in this directory.



## Jobs Storage (/:login/jobs/:id/stor)

By default, `/:login/jobs/:id/stor` contains data created during job execution.
@@ -227,11 +215,11 @@ Note that only data emitted during the last phase of a job will have data here.

`/:login/reports` is the location where the system delivers aggregated usage reports
and raw HTTP access logs.
You can learn more about the data in these directories in the [Reports Reference](reports.html).
Learn more about the reports directory in the [Reports Reference](reports.html) section.
Only you can manage data under `/:login/reports`.


## Working with Directories

You create a directory the same way that you create an object,
but you use the special header `Content-Type: application/json; type=directory`.
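As a rough illustration of the header described above, the following Python sketch builds, but does not send, a PUT request that would create a directory. The host name, the login `jdoe`, and the helper name `build_mkdir_request` are assumptions made for this example. Authentication headers are omitted, so the request would not succeed as written.

```python
import urllib.request

# Content type that marks a PUT as a directory creation, per the text above.
DIRECTORY_CONTENT_TYPE = "application/json; type=directory"


def build_mkdir_request(base_url: str, path: str) -> urllib.request.Request:
    """Build (without sending) a PUT request that would create a directory.

    Hypothetical helper: the endpoint is a placeholder and no
    authentication is attached.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        method="PUT",
        headers={"Content-Type": DIRECTORY_CONTENT_TYPE},
    )


# Placeholder endpoint and login, for illustration only.
request = build_mkdir_request("https://manta.example.com", "/jdoe/stor/photos")
```

The only difference from an object upload in this sketch is the `Content-Type` value and the absence of a request body.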

@@ -276,21 +264,21 @@ until the total number of entries you have processed matches `result-set-size`.


You can store CORS, `cache-control` and `m-` headers on directories, as you can
on objects. Currently, no data is supported on directories.
on objects. Currently, no data is supported on directories.

# SnapLinks

SnapLinks allow you to create an alternate name for a point-in-time reference
to an object. SnapLinks do not consume any extra bytes in your usage, as they
do not create a new copy of data. They simply create an extra name that points
do not create a new copy of data. They simply create an extra name that points
at existing object data.

SnapLinks are useful for creating arbitrary versioning schemes in client
applications. You can create SnapLinks across directories.
applications. You can create SnapLinks across directories.
You can use SnapLinks to build any form of snapshotting mechanism desired.

Because objects in the system are copy-on-write, when the object that was the target
of a SnapLink changes, the SnapLink does not change. Conceptually, SnapLinks
of a SnapLink changes, the SnapLink does not change. Conceptually, SnapLinks
are like a Unix hard link that is copy on write.
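The copy-on-write behavior can be pictured with a toy in-memory model in Python. This is a conceptual sketch only: the class `Namespace` and its methods are hypothetical and do not reflect how the service stores data.

```python
from typing import Dict


class Namespace:
    """Toy model: names that point at immutable object data."""

    def __init__(self) -> None:
        self._names: Dict[str, bytes] = {}

    def put(self, name: str, data: bytes) -> None:
        # A write never mutates existing data; it rebinds the name to new data.
        self._names[name] = data

    def snaplink(self, source: str, link: str) -> None:
        # The link points at the source's current data; nothing is copied.
        self._names[link] = self._names[source]

    def get(self, name: str) -> bytes:
        return self._names[name]
```

In this model, calling `put` on the source name after `snaplink` leaves the link pointing at the earlier data, which mirrors the behavior described above.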

As an example from the getting started guide:
@@ -323,23 +311,22 @@ Several principles guide the design of the service:
It chooses to be strongly consistent, at
the risk of more HTTP 500 errors than an eventually consistent system.
This system is engineered to minimize errors in the event of network or system
failures, and to recover as quickly as possible, but more errors will occur than
in an eventually consistent system. However, you can always read your writes
immediately, and the distinction between a HTTP 404 response and a HTTP 500 response is very clear.
failures and to recover as quickly as possible, but more errors will occur than
in an eventually consistent system. However, you can read your writes
immediately. The distinction between an HTTP 404 response and an HTTP 500 response is very clear:
A 404 response *really* means your data isn't there.
A 500 response means that it might be, but there is some sort of outage.
* When the system responds with an HTTP 200, you can be certain your data is
durably stored on the number of servers you requested. The system is designed to
durably stored on the number of servers you requested. The system is designed to
*never* allow data loss or corruption.
* The system is designed to be secure. All writes must be performed over a
secure channel (TLS). Most reads will be as well, unless you are specifically
secure channel (TLS). Most reads will be as well, unless you are specifically
requesting to bypass TLS for browser/web channels.


## System scale

Joyent Manta Storage Service is designed to support an arbitrarily large number of objects and an
arbitrarily large number of directories. However, it bounds the number of
arbitrarily large number of directories. However, it bounds the number of
objects in a single directory so that list operations can be performed
efficiently.

@@ -350,32 +337,30 @@ terabytes, but network transfer times make object sizes of that magnitude
unreasonable anyway.

There is no default API rate limit imposed upon you; however, the system reserves the
right to throttle requests if necessary to protect the system. For high-volume
right to throttle requests if necessary to protect the system. For high-volume
web assets, you should use it as a content delivery network (CDN) origin.

All REST APIs are modeled as streams. They are designed to let you iterate
through result sets without consuming too much memory. For example, listing
through result sets without consuming too much memory. For example, listing
a directory returns newline separated JSON objects as opposed to an array or
large XML document.
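Because each line of a listing is a complete JSON document, a client can decode entries one at a time instead of buffering the whole response. The following Python sketch shows one way to do that. The function name `iter_entries` is hypothetical, and the field names in the sample data are illustrative assumptions rather than a specification of the listing format.

```python
import json
from typing import Iterable, Iterator


def iter_entries(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded entry per non-empty line of a listing response."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines between entries
            yield json.loads(line)


# Illustrative sample only; real listings may carry different fields.
sample = [
    '{"name": "foo", "type": "object"}',
    '{"name": "photos", "type": "directory"}',
]
names = [entry["name"] for entry in iter_entries(sample)]
```

Since `iter_entries` is a generator, it can be fed from a streaming response body so that memory use stays proportional to a single entry.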

## Durability

By default, the system stores two copies of your object.
These two copies are placed in two different datacenters.
The system relies on ZFS RAID-Z to store your objects,
so your durability is actually greater than two would imply.
Your data is erasure encoded across a large number of disks on physically separate machines.
These two copies are placed in two different data centers.
The system relies on ZFS RAID-Z to store your objects, so the durability is actually greater than two would imply.

You are billed for exactly the number of bytes you consume in the system.
For example, if you write a 1MB object with the default number of copies (2),
you will be billed for 2MB of storage each month.
You can store anywhere from 1 to 6 copies.
When the number of copies requested is greater than one,
the system ensures that *at least* two copies are placed in two different
datacenters,
and then stripes the other copies across datacenters.
If any given datacenter is down at the time,
you may have copies unbalanced with extra replicas in fewer datacenters,
but there will always be at least two datacenters with your copy of data.
the system ensures that *at least* two copies are placed in two different
data centers,
and then stripes the other copies across data centers.
If any given data center is down at the time,
you may have copies unbalanced with extra replicas in fewer data centers,
but there will always be at least two data centers with your copy of data.
This allows you to still access your data in the event
of any one datacenter's failure.
of any one data center failure.
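The billing arithmetic described in this section is the object size multiplied by the number of copies. The following Python sketch makes that explicit, along with the 1 to 6 range for copies. The function name `billed_bytes` is hypothetical, and the sketch ignores any pricing details not stated here.

```python
def billed_bytes(object_size: int, copies: int = 2) -> int:
    """Return the bytes billed for one object stored with the given copies.

    Illustrative only: the default of 2 copies and the 1-6 range come
    from the text above.
    """
    if object_size < 0:
        raise ValueError("object_size must not be negative")
    if not 1 <= copies <= 6:
        raise ValueError("copies must be between 1 and 6")
    return object_size * copies
```

For instance, `billed_bytes(1_000_000)` returns 2,000,000 bytes for a 1MB object stored with the default two copies, matching the 2MB example above.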