Add deployers, general code cleanup (#160)

cloudposse · May 12, 2021 · 7342c2e
1 parent 2efb943
commit 7342c2e
Showing 12 changed files with 498 additions and 154 deletions.
diff --git a/README.md b/README.md
diff --git a/README.yaml b/README.yaml
@@ -63,20 +63,26 @@ usage: |-
 
   For automated tests of the complete example using [bats](https://github.com/bats-core/bats-core) and [Terratest](https://github.com/gruntwork-io/terratest) (which tests and deploys the example on AWS), see [test](test).
 
-  This will create a new s3 bucket `eg-prod-app` for a cloudfront cdn.
+  This will create a new s3 bucket `eg-prod-app` for a cloudfront cdn, and allow `principal1` to upload to
+  `prefix1` and `prefix2`, while allowing `principal2` to manage the whole bucket.
 
   ```hcl
   module "cdn" {
     source = "cloudposse/cloudfront-s3-cdn/aws"
     # Cloud Posse recommends pinning every module to a specific version
-    # version     = "x.x.x"
+    # version = "x.x.x"
 
     namespace         = "eg"
     stage             = "prod"
     name              = "app"
     aliases           = ["assets.cloudposse.com"]
     dns_alias_enabled = true
     parent_zone_name  = "cloudposse.com"
+
+    deployment_arns = {
+      "arn:aws:s3:::principal1" = ["/prefix1", "/prefix2"]
+      "arn:aws:s3:::principal2" = [""]
+    }
   }
   ```
 
@@ -86,7 +92,7 @@ usage: |-
   module "cdn" {
     source = "cloudposse/cloudfront-s3-cdn/aws"
     # Cloud Posse recommends pinning every module to a specific version
-    # version     = "x.x.x"
+    # version = "x.x.x"
 
     origin_bucket     = "eg-prod-app"
     aliases           = ["assets.cloudposse.com"]
@@ -95,22 +101,110 @@ usage: |-
   }
   ```
 
-  ### Using an S3 Static Website Origin
-
-  When variable `website_enabled` is set to `true`, the S3 origin is configured
-  as a static website. The S3 static website has the advantage of redirecting
-  URL `subdir/` to `subdir/index.html` without requiring a
-  [Lambda@Edge function to perform the redirection](https://aws.amazon.com/blogs/compute/implementing-default-directory-indexes-in-amazon-s3-backed-amazon-cloudfront-origins-using-lambdaedge/).
-  The S3 static website responds only to CloudFront, preventing direct access to
-  S3.
-
+  ### Background on CDNs, "Origins", S3 Buckets, and Web Servers
+  
+  #### CDNs and Origin Servers
+  
+  There are some settings you need to be aware of when using this module. In order to understand the settings,
+  you need to understand some of the basics of CDNs and web servers, so we are providing this _highly simplified_
+  explanation of how they work in order for you to understand the implications of the settings you are providing.
+  
+  A "**CDN**" ([Content Distribution Network](https://www.cloudflare.com/learning/cdn/what-is-a-cdn/)) is a collection of
+  servers scattered around the internet with the aim of making it faster for people to retrieve content from a website.
+  The details of why that is wanted/needed are beyond the scope of this document, as are most of the details of how
+  a CDN is implemented. For this discussion, we will simply treat a CDN as a set of web servers all serving
+  the same content to different users.
+  
+  In a normal web server (again, greatly simplified), you place files on the server and the web server software receives
+  requests from browsers and responds with the contents of the files.
+  
+  For a variety of reasons, the web servers in a CDN do not work the way normal web servers work. Instead of getting
+  their content from files on the local server, the CDN web servers get their content by acting like web browsers
+  (proxies). When they get a request from a browser, they make the same request to what is called an "**Origin Server**".
+  It is called an origin server because it _serves_ the original content of the web site, and thus is the _origin_
+  of the content.
+  
+  As a web site publisher, you put content on an Origin Server (which users usually should be prevented from accessing)
+  and configure your CDN to use your Origin Server. Then you direct users to a URL hosted by your CDN provider, the
+  users' browsers connect to the CDN, the CDN gets the content from your Origin Server, your Origin Server gets the
+  content from a file on the server, and the data gets sent back hop by hop to the user. (The reason this ends up
+  being a good idea is that the CDN can cache the content for a while, serving multiple users the same content while
+  only contacting the origin server once.)
+  
+  #### S3 Buckets: file storage and web server
+  
+  S3 buckets were originally designed just to store files, and they are still most often used for that. They have a lot
+  of access controls to make it possible to strictly limit who can read what files in the bucket, so that companies
+  can store sensitive information there. You may have heard of a number of "data breaches" being caused by misconfigured
+  permissions on S3 buckets, making them publicly accessible. As a result of that, Amazon has some extra settings on
+  top of everything else to keep S3 buckets from being publicly accessible, which is usually a good thing.
+  
+  However, at some point someone realized that since these files were in the cloud, and Amazon already had these web servers
+  running to provide access to the files in the cloud, it was only a tiny leap to turn an S3 bucket into a web server.
+  So now S3 buckets [can be published as web sites](https://docs.aws.amazon.com/AmazonS3/latest/userguide/EnableWebsiteHosting.html)
+  with a few configuration settings, including making the contents publicly accessible.
+  
+  #### Web servers, files, and the different modes of S3 buckets
+  
+  In the simplest web sites, the URL "path" (the part after the site name) corresponds directly to the path (under
+  a special directory we will call `/webroot`) and name
+  of a file on the web server. So if the web server gets a request for "http://example.com/foo/bar/baz.html" it will
+  look for a file `/webroot/foo/bar/baz.html`. If it exists, the server will return its contents, and if it does not exist,
+  the server will return a `Not Found` error. An S3 bucket, whether configured as a file store or a web site, will
+  always do both of these things.
+  
+  Web servers, however, do some helpful extra things. To name a few:
+  - If the URL ends with a `/`, as in `http://example.com/foo/bar/`, the web server (depending on how it is configured)
+  will either return a list of files in the directory or it will return the contents of a file in the directory with
+  a special name (by default, `index.html`) if it exists.
+  - If the URL does not end with a `/` but the last part, instead of being a file name, is a directory name, the web
+  server will redirect the user to the URL with the `/` at the end instead of saying the file was `Not Found`. This
+  redirect will get you to the `index.html` file we just talked about. Given the way people pass URLs around, this
+  turns out to be quite helpful.
+  - If the URL does not point to a directory or a file, instead of just sending back a cryptic `Not Found` error code,
+  it can return the contents of a special file called an "error document".
+  
+  #### Your Critical Decision: S3 bucket or website?
+  
+  All of this background is to help you decide how to set `website_enabled` and `s3_website_password_enabled`.
+  The default for `website_enabled` is `false` which is the easiest to configure and the most secure, and with
+  this setting, `s3_website_password_enabled` is ignored.
+  
+  S3 buckets, in file storage mode (`website_enabled = false`), do none of these extra things that web servers do.
+  If the URL points to a file, it will return the file, and if it does not _exactly_ match a file, it will return
+  `Not Found`. One big advantage, though, is that the S3 bucket can remain private (not publicly accessible). A second,
+  related advantage is that you can limit the website to a portion of the S3 bucket (everything under a certain prefix)
+  and keep the contents under the other prefixes private.
+  
+  S3 buckets configured as static websites (`website_enabled = true`), however, have these extra web server features like redirects, `index.html`,
+  and error documents. The disadvantage is that you have to make the entire bucket public (although you can still
+  restrict access to some portions of the bucket).
+  
+  Another feature or drawback (depending on your point of view) of S3 buckets configured as static web sites is that
+  they are directly accessible via their [website endpoint](https://docs.aws.amazon.com/AmazonS3/latest/userguide/WebsiteEndpoints.html)
+  as well as through Cloudfront. This module has a feature, `s3_website_password_enabled`, that requires a password
+  be passed in the HTTP request header and configures the CDN to do that, which will make it much harder to access
+  the S3 website directly. So set `s3_website_password_enabled = true` to limit direct access to the S3 website
+  or set it to false if you want to be able to bypass Cloudfront when you want to.
+  
   In addition to setting `website_enabled=true`, you must also:
 
   * Specify at least one `aliases`, like `["example.com"]` or
     `["example.com", "www.example.com"]`
   * Specify an ACM certificate
 
-  ### Generating ACM Certificate
+  ### Custom Domain Names and Generating a TLS Certificate with ACM
+  
+  When you set up Cloudfront, Amazon will generate a domain name for your website. You almost certainly will not
+  want to publish that. Instead, you will want to use a custom domain name. This module refers to them as "aliases".
+  
+  To use the custom domain names, you need to
+  - Pass them in as `aliases` so that Cloudfront will respond to them with your content
+  - Create CNAMEs for the aliases to point to the Cloudfront domain name. If your alias domains are hosted by
+  Route53 and you have IAM permissions to modify them, this module will set that up for you if you set `dns_alias_enabled = true`.
+  - Generate a TLS Certificate via ACM that includes all the aliases and pass the ARN for the
+  certificate in `acm_certificate_arn`. Note that for Cloudfront, the certificate has to be provisioned in the
+  `us-east-1` region regardless of where any other resources are.
 
   ```hcl
   # For cloudfront, the acm has to be created in us-east-1 or it will not work