Some aws_security_group_rules are not added to tfstate file. #2584

Closed
gtmtech opened this issue Jul 1, 2015 · 43 comments


gtmtech commented Jul 1, 2015

We thought #2366 in Terraform 0.6.0 might fix this issue, but we have tested on 0.5.3 and 0.6.0 and it is still broken.

We have a fairly large configuration full of aws_security_group_rule resources, sometimes with multiple security groups applied to aws_instances. We prefer individual rules to lots of inline rules in an aws_security_group, because they can be labelled, and previous bugs with aws_security_group when changing inline rules pushed us down the aws_security_group_rule route. (We like managing the rules independently of the security group.)

The problem is that on a fresh terraform apply, terraform reports that all aws_security_group_rule resources have been created, but some of them (a random selection each time) are not added to the tfstate file. This means a further terraform plan yields further rules to be created, but because the rules do already exist in Amazon, a further terraform apply fails with "duplicate rule".

I ran the whole thing in TF_LOG=debug mode, so I have captured everything and have tried to show the relevant bits here (as I don't want to share the entire config of what I'm doing). The key fact is that each time a fresh terraform apply (from nothing) is done, a random set of rules fails to make it into the tfstate file.

I will shortly update this with the relevant snippets of logs/code etc.


gtmtech commented Jul 1, 2015

Here are some non-exhaustive logs extracted from terraform for a failing rule called "egress_nat_http_to_all":

* INITIAL STATE IS NOT PROVISIONED IN AWS *

$ terraform plan

177:aws_security_group_rule.egress_nat_http_to_all
5459:  aws_security_group_rule.egress_nat_http_to_all
5990:  aws_security_group_rule.egress_nat_http_to_all (destroy tainted)
6000:  aws_security_group_rule.egress_nat_http_to_all (destroy)
10008:2015/06/22 18:41:09 [DEBUG] vertex root.aws_security_group_rule.egress_nat_http_to_all: walking
10009:2015/06/22 18:41:09 [DEBUG] vertex root.aws_security_group_rule.egress_nat_http_to_all: evaluating
10308:2015/06/22 18:41:09 [DEBUG] vertex root.aws_security_group_rule.egress_nat_http_to_all: expanding/walking dynamic subgraph
34759:2015/06/22 18:41:09 [DEBUG] refresh: aws_security_group_rule.egress_nat_http_to_all: no state, not refreshing
58014:+ aws_security_group_rule.egress_nat_http_to_all   <-- shows it would create

terraform apply

58686:aws_security_group_rule.egress_nat_http_to_all
63968:  aws_security_group_rule.egress_nat_http_to_all
64499:  aws_security_group_rule.egress_nat_http_to_all (destroy tainted)
64509:  aws_security_group_rule.egress_nat_http_to_all (destroy)
65753:aws_security_group_rule.egress_nat_http_to_all (destroy tainted)
65755:aws_security_group_rule.egress_nat_http_to_all (destroy)
68732:2015/06/22 18:41:18 [DEBUG] vertex root.aws_security_group_rule.egress_nat_http_to_all: walking
68733:2015/06/22 18:41:18 [DEBUG] vertex root.aws_security_group_rule.egress_nat_http_to_all: evaluating
68738:2015/06/22 18:41:18 [DEBUG] vertex root.aws_security_group_rule.egress_nat_http_to_all: expanding/walking dynamic subgraph
93071:2015/06/22 18:41:19 [DEBUG] refresh: aws_security_group_rule.egress_nat_http_to_all: no state, not refreshing
127680:2015/06/22 18:41:23 [DEBUG] apply: aws_security_group_rule.egress_nat_http_to_all: executing Apply
127691:aws_security_group_rule.egress_nat_http_to_all: Creating...
127922:aws_security_group_rule.egress_nat_http_to_all: Creation complete

terraform plan

132072:aws_security_group_rule.egress_nat_http_to_all
137052:  aws_security_group_rule.egress_nat_http_to_all
137583:  aws_security_group_rule.egress_nat_http_to_all (destroy tainted)
137593:  aws_security_group_rule.egress_nat_http_to_all (destroy)
137681:aws_security_group_rule.egress_nat_http_to_all (destroy tainted)
137683:aws_security_group_rule.egress_nat_http_to_all (destroy)
141803:2015/06/22 18:44:18 [DEBUG] vertex root.aws_security_group_rule.egress_nat_http_to_all: walking
141804:2015/06/22 18:44:18 [DEBUG] vertex root.aws_security_group_rule.egress_nat_http_to_all: evaluating
141812:2015/06/22 18:44:18 [DEBUG] vertex root.aws_security_group_rule.egress_nat_http_to_all: expanding/walking dynamic subgraph
166752:2015/06/22 18:44:21 [DEBUG] refresh: aws_security_group_rule.egress_nat_http_to_all: no state, not refreshing

^^-- problem - not in state file, even after creating

188868:+ aws_security_group_rule.egress_nat_http_to_all

^^-- and so it thinks that it needs to re-add...

terraform apply

191965:aws_security_group_rule.egress_nat_http_to_all: Creating...
191975:aws_security_group_rule.egress_nat_http_to_all: Error: 1 error(s) occurred:

^-- which it won't allow, because it already exists in Amazon


gtmtech commented Jul 1, 2015

Some more details about the security group rule that failed to get added to tfstate above:

aws_security_group_rule.egress_nat_http_to_all: Creating...
  cidr_blocks.#:            "" => "1"
  cidr_blocks.0:            "" => "0.0.0.0/0"
  from_port:                "" => "80"
  protocol:                 "" => "tcp"
  security_group_id:        "" => "sg-xxxxxxxx"
  self:                     "" => "0"
  source_security_group_id: "" => "<computed>"
  to_port:                  "" => "80"
  type:                     "" => "egress"
aws_security_group_rule.egress_nat_http_to_all: Error: 1 error(s) occurred:

* Error authorizing security group rules rules: InvalidPermission.Duplicate: the specified rule "peer: 0.0.0.0/0, TCP, from port: 80, to port: 80, ALLOW" already exists
Error applying plan:

1 error(s) occurred:


gtmtech commented Jul 1, 2015

Here is the rule as configured in terraform:

resource "aws_security_group_rule" "egress_nat_http_to_all" {
    type        = "egress"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]

    security_group_id = "${aws_security_group.nat.id}"
}

And the security group it relates to:

resource "aws_security_group" "nat" {
    name = "${var.environment_name}-nat"
    vpc_id = "${aws_vpc.default.id}"
    tags {
        Name        = "${var.environment_name}-nat-security-group"
        Environment = "${var.environment_name}"
    }
}


gtmtech commented Jul 1, 2015

If you would like to see anything else, please let me know.

@johnrengelman (Contributor):

I'm seeing this exact same issue when using security_group_rule to add an ingress/egress rule to a security group that is not managed by Terraform. It never stores a resource entry in the state file.


gtmtech commented Jul 2, 2015

In our case, it's worth noting that the security group is managed by Terraform. But I'm glad someone else is seeing this, as it is causing us a huge slowdown and preventing us from spinning up new infrastructure regularly right now.

@georgebashi:

I just bumped into the same thing, with a security group not managed by terraform.


phinze commented Jul 9, 2015

Sorry for the trouble here folks. Tagging and we'll get this looked at and fixed before 0.6.1.


gtmtech commented Jul 9, 2015

I've been starting to learn Go and building my own version of terraform to try to help you guys tackle this. Based on commit ab0a7d8, which I think is very new, I now have a minimal set of security group rules. When I randomly destroy and create, sometimes the tfstate file is fine and a single terraform apply works; sometimes it gets itself in a twist.

I have 24 aws_security_group_rules applied across 6 aws_security_groups. I added some debug logging for the result of ipPermissionIDHash on each of these newly created resources. I noticed that out of the 24 aws_security_group_rules, 2 have the same ipPermissionIDHash, but only on this run (last time they were all unique). Not sure what's going on there, as I haven't modified the rules in any way. Maybe it's a race condition?

In the state file, 2 of the rules have now ended up with the same ipPermissionIDHash as follows:

                "aws_security_group_rule.ingress_base_ssh_from_public_subnet": {
                    "type": "aws_security_group_rule",
                    "depends_on": [
                        "aws_security_group.base"
                    ],
                    "primary": {
                        "id": "sg-1437707844",
                        "attributes": {
                            "cidr_blocks.#": "1",
                            "cidr_blocks.0": "10.3.0.0/24",
                            "from_port": "22",
                            "id": "sg-1437707844",
                            "protocol": "tcp",
                            "security_group_id": "sg-eff3ff8a",
                            "self": "false",
                            "to_port": "22",
                            "type": "ingress"
                        },
                        "meta": {
                            "schema_version": "1"
                        }
                    }
                },


...


                "aws_security_group_rule.ingress_nat_ssh_from_public_subnet": {
                    "type": "aws_security_group_rule",
                    "depends_on": [
                        "aws_security_group.nat"
                    ],
                    "primary": {
                        "id": "sg-1437707844",
                        "attributes": {
                            "cidr_blocks.#": "1",
                            "cidr_blocks.0": "10.3.0.0/24",
                            "from_port": "22",
                            "id": "sg-1437707844",
                            "protocol": "tcp",
                            "security_group_id": "sg-ecf3ff89",
                            "self": "false",
                            "to_port": "22",
                            "type": "ingress"
                        },
                        "meta": {
                            "schema_version": "1"
                        }
                    }
                },

Not sure if this is helpful, but it might be a start. You'll notice they depend on different security groups, but have ended up with the same resource ID. Maybe that's it?

This isn't quite the same as what I experienced at first, which was that the security group rules were omitted from the tfstate file altogether, but I am hunting for all the problems with this.


gtmtech commented Jul 9, 2015

So problem 1: as you can see, the security group ID is not considered as part of the ipPermissionIDHash, so the same rule on different security groups will get the same resource ID.

2015/07/09 17:58:36 terraform-provider-aws: 2015/07/09 17:58:36 [DEBUG] bufString=22-22-tcp-ingress-10.3.0.0/24-, hashed=%!s(int=1437707844)

In the latest run, however, I got 8 missing rules in tfstate. I am now analysing problem 2.
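[Editor's note: the problem-1 collision can be sketched in Go. This is a hypothetical, simplified stand-in for the provider's ipPermissionIDHash; the crc32-style checksum and the exact buffer format are assumptions based on the bufString debug output above, not the provider's actual code.]

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// ruleHash is a hypothetical, simplified stand-in for ipPermissionIDHash.
// The parent security_group_id is NOT part of the hashed buffer, so
// identical rules attached to different groups collide on the same ID.
func ruleHash(fromPort, toPort int, protocol, ruleType, cidr string) string {
	buf := fmt.Sprintf("%d-%d-%s-%s-%s-", fromPort, toPort, protocol, ruleType, cidr)
	return fmt.Sprintf("sg-%d", crc32.ChecksumIEEE([]byte(buf)))
}

func main() {
	// The two SSH rules from the state snippets above, which sit on
	// different security groups (sg-eff3ff8a vs sg-ecf3ff89):
	base := ruleHash(22, 22, "tcp", "ingress", "10.3.0.0/24")
	nat := ruleHash(22, 22, "tcp", "ingress", "10.3.0.0/24")
	fmt.Println(base == nat) // true: one state entry can overwrite the other
}
```

Under this sketch, both rules get one resource ID, which would explain two state entries sharing "id": "sg-1437707844".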


gtmtech commented Jul 9, 2015

Problem 2 is the one described by this ticket, and you can see it in an example run here. All 24 ipPermissionIDHashes were generated; only 2 matched because of problem 1 above, but 8 didn't make it into the tfstate file, as follows:

vagrant@vagrant:/opt/gopath/src/github.com/hashicorp/terraform/infra$ cat apply1.log | grep ipPermissionIDHash | sed -e 's/.*=//g'
 sg-3215525433
 sg-2083258091
 sg-518594090
 sg-1437707844
 sg-2142135502
 sg-3719823962
 sg-1741309070
 sg-3302475947
 sg-4274037645
 sg-293616859
 sg-1504117660
 sg-4018185139
 sg-3790921162
 sg-2148483349
 sg-64381385
 sg-419800598
 sg-144932262
 sg-911236652
 sg-4168774301
 sg-2098824017
 sg-53243631
 sg-3047568218
 sg-1437707844
 sg-4241184053
vagrant@vagrant:/opt/gopath/src/github.com/hashicorp/terraform/infra$ cat state_postapply1.log | grep '^                        "id": "sg-[0-9][0-9][0-9][0-9]' 
                        "id": "sg-144932262",
                        "id": "sg-2083258091",
                        "id": "sg-518594090",
                        "id": "sg-1437707844",
                        "id": "sg-419800598",
                        "id": "sg-1741309070",
                        "id": "sg-4168774301",
                        "id": "sg-2098824017",
                        "id": "sg-3790921162",
                        "id": "sg-4274037645",
                        "id": "sg-4241184053",
                        "id": "sg-911236652",
                        "id": "sg-3719823962",
                        "id": "sg-3047568218",
                        "id": "sg-1504117660",
                        "id": "sg-53243631",

@mitchellh (Contributor):

Thanks for all this context. This will all help but we're still going to need a reproduction Terraform config I can run locally in order to get this fixed. Can you make one? Thanks!

@jszwedko (Contributor):

@mitchellh we've seen this occasionally too, I'll take a crack at creating a small config that reproduces now.

@jszwedko (Contributor):

I've had a difficult time trimming our configuration down while still reproducing. If you have a smaller config, could you post yours, @gtmtech?

@georgebashi:

With this code I can reproduce the bug by cycling terraform apply / terraform destroy a few times. I verified via the AWS console that everything is fully created and destroyed on each run, but not all rules are saved to the state file. It occurs on about a third of terraform applies.

@jszwedko (Contributor):

I was able to reproduce this with the following config, which is pretty small:

./providers.tf
provider "aws" {
  region = "us-west-1"
}
./mesos/inputs.tf
variable "vpc_id" { }
./mesos/outputs.tf
output "mesos_worker_security_group_id" { value = "${aws_security_group.mesos_worker.id}" }
./mesos/main.tf
resource "aws_security_group" "mesos_worker" {
  vpc_id      = "${var.vpc_id}"
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = [ "0.0.0.0/0" ]
  }
}
./cassandra/inputs.tf
variable "vpc_id"                { description = "VPC to put nodes in" }
./cassandra/outputs.tf
output "cassandra_client_security_group_id" { value = "${aws_security_group.cassandra_internal.id}" }
./cassandra/main.tf
resource "aws_security_group" "cassandra" {
  name = "foo"
  vpc_id = "${var.vpc_id}"
  ingress {
    from_port = 7001
    to_port = 7001
    protocol = "tcp"
    self = true
  }
  egress {
    from_port = 0
    to_port = 0
    protocol = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group_rule" "thrift" {
  security_group_id        = "${aws_security_group.cassandra_internal.id}"

  type                     = "ingress"
  protocol                 = "tcp"
  from_port                = "9160"
  to_port                  = "9160"
  source_security_group_id = "${aws_security_group.opscenter.id}"
}

resource "aws_security_group_rule" "opscenter_agent" {
  security_group_id        = "${aws_security_group.cassandra_internal.id}"

  type                     = "ingress"
  protocol                 = "tcp"
  from_port                = "61621"
  to_port                  = "61621"
  source_security_group_id = "${aws_security_group.opscenter.id}"
}

resource "aws_security_group" "cassandra_internal" {
  name = "cassandra_internal"
  description = "Expose ports for opscenter and client communication."
  vpc_id = "${var.vpc_id}"
}

resource "aws_security_group" "opscenter" {
  name = "opscenter"
  description = "OpsCenter"
  vpc_id = "${var.vpc_id}"
  egress {
    from_port = 0
    to_port = 0
    protocol = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
./main.tf
resource "aws_vpc" "default" {
  cidr_block = "10.0.0.0/19"
}

module "cassandra" {
  source = "./cassandra"
  vpc_id = "${aws_vpc.default.id}"
}

module "mesos" {
  source = "./mesos"
  vpc_id = "${aws_vpc.default.id}"
}

module "mesos_worker_to_cassandra_permissions" {
  source = "./mesos_worker_to_cassandra_permissions"
  cassandra_client_security_group_id = "${module.cassandra.cassandra_client_security_group_id}"
  mesos_worker_security_group_id     = "${module.mesos.mesos_worker_security_group_id}"
}
./mesos_worker_to_cassandra_permissions/main.tf
resource "aws_security_group_rule" "cql" {
  security_group_id = "${var.cassandra_client_security_group_id}"

  type      = "ingress"
  from_port = 9042
  to_port   = 9042
  protocol  = "tcp"
  source_security_group_id = "${var.mesos_worker_security_group_id}"
}

resource "aws_security_group_rule" "thrift" {
  security_group_id = "${var.cassandra_client_security_group_id}"

  type      = "ingress"
  from_port = 9160
  to_port   = 9160
  protocol  = "tcp"
  source_security_group_id = "${var.mesos_worker_security_group_id}"
}
./mesos_worker_to_cassandra_permissions/variables.tf
variable "cassandra_client_security_group_id" {}
variable "mesos_worker_security_group_id" {}

I ran `while terraform plan -out terraform.plan && terraform apply terraform.plan && terraform apply ; do terraform destroy -force ; sleep 5 ; done`, which cycled the config until the second apply failed.

Error message:

Error applying plan:

1 error(s) occurred:

* [WARN] A duplicate Security Group rule was found. This may be
a side effect of a now-fixed Terraform issue causing two security groups with
identical attributes but different source_security_group_ids to overwrite each
other in the state. See https://github.com/hashicorp/terraform/pull/2376 for more
information and instructions for recovery. Error message: the specified rule "peer: sg-19d8487c, TCP, from port: 9160, to port: 9160, ALLOW" already exists


gtmtech commented Jul 23, 2015

Do you still need an example from me?


sairez commented Jul 28, 2015

👍 seeing this same issue in 0.6.1.


mahileeb commented Aug 3, 2015

+1, this is a serious issue. We are hitting cases where rules that are created successfully do not make it into the state file, and subsequent runs fail as Terraform tries to create the existing rules.


phinze commented Aug 12, 2015

Thanks for the example @jszwedko - we'll use this to reproduce on our side, investigate, and follow up with what we find.


catsby commented Aug 13, 2015

Thanks for all the additional info everyone. I'm taking a fresh look at this now, sorry for all the trouble 😦


catsby commented Aug 13, 2015

Hey @jszwedko – I believe that specific issue is a race condition with the AWS API. I reconstructed your example in this repo:

I ran the snippet you shared (the while....done) and only after 6+ successful cycles did I encounter that error. Subsequent re-runs were never consistent, sometimes it would go 10+ create/destroy cycles before the error showed up.

Here's a gist of at least 3 successful runs:

If you could take a look at the repo above, maybe you can spot something I missed?

That all said, I'm still looking into this issue. I believe there is a lingering issue with Security Group Rules that I've yet to nail down.

Thanks!


gtmtech commented Aug 13, 2015

I thought that originally, @catsby, but the rules end up created fine in AWS; they just don't appear in the terraform tfstate file, meaning terraform has an inconsistent view of what's actually there in AWS.

So we do a single run of terraform, creating a bunch of rules. What gets created is EXACTLY what we specified in terraform, but what's in the tfstate file is missing some of the security group rules. So I don't believe it can possibly be an AWS problem; I think it's a terraform problem.

Our security groups and security group rules are now big enough that this happens on EVERY terraform run. It's a serious issue for us.


catsby commented Aug 13, 2015

@gtmtech I believe the issue you reported originally and the issue @jszwedko reported later are not the same issue. I've reproduced something similar to what you're describing and am looking into resolutions.

@jszwedko (Contributor):

@catsby hmm, the symptoms are the same (the rules are created in AWS, just not recorded in the state file). I'm happy to open this as a separate issue if you think that is different though.


catsby commented Aug 17, 2015

I believe both issues will be addressed in an upcoming pull request that I'm working on. Security Group Rules are getting an update on how they're found and read from the API. It's a bit tricky because of how AWS groups the CIDR blocks, but I'm working it out.


catsby commented Aug 17, 2015

Hey @gtmtech – do you have a stripped down example of rules not being saved to the state file?
I have a patch in progress and I'm working on testing various scenarios.

Thanks!


catsby commented Aug 18, 2015

I just sent #3019 as a patch for some of the issues reported here.

As mentioned above, there is a legitimate issue where Security Group Rules would fail to save correctly, both for rules on the same Security Group and for the same rule applied to multiple Groups. Those issues should be fixed in #3019, but I need help reviewing and vetting that.

There is another issue demonstrated here, which appears to be a race condition arising from the AWS API and its eventually consistent nature. That one I do not attempt to fix in #3019, and I don't believe there's much we can do about it.

If possible, please check out the PR and let me know! Thanks all for the help here, and sorry again for the delay.
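[Editor's note: one natural direction for this kind of fix is to fold the parent security_group_id into the hashed buffer, so the same rule on two groups yields two distinct state IDs. The sketch below is an illustration under that assumption; the function name, buffer format, and "sgrule-" prefix are hypothetical, not a claim about what #3019 actually implements.]

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// ruleHashWithGroup is a hypothetical variant of the rule hash that also
// includes the parent security group ID, so identical rules attached to
// different groups no longer collide in the state file.
func ruleHashWithGroup(groupID string, fromPort, toPort int, protocol, ruleType, cidr string) string {
	buf := fmt.Sprintf("%s-%d-%d-%s-%s-%s-", groupID, fromPort, toPort, protocol, ruleType, cidr)
	return fmt.Sprintf("sgrule-%d", crc32.ChecksumIEEE([]byte(buf)))
}

func main() {
	// The same SSH rule attached to the two groups from the earlier snippets:
	a := ruleHashWithGroup("sg-eff3ff8a", 22, 22, "tcp", "ingress", "10.3.0.0/24")
	b := ruleHashWithGroup("sg-ecf3ff89", 22, 22, "tcp", "ingress", "10.3.0.0/24")
	fmt.Println(a != b) // true: distinct IDs per group, so neither state entry is overwritten
}
```

A change of this shape would also explain why the PR needs a schema migration: existing state entries carry IDs computed under the old scheme.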


gtmtech commented Aug 20, 2015

Sorry, I've been sending GitHub mails to junk and didn't see the progress on this. Do you still want help testing?


catsby commented Aug 20, 2015

@gtmtech if you can, yes! Note that #3019 has a schema migration, so if you have a test case setup that reproduces those issues, please do.


gtmtech commented Aug 23, 2015

Sorry to say this isn't fixed. It did actually look a bit better: I usually get 20 rules that miss going into the state file, and on the latest run I got only 1. However, as it's a race condition, I can't be sure; I can only say it's still not totally fixed.

$ terraform version
Terraform v0.6.4-dev (720562e)


catsby commented Aug 24, 2015

Thanks @gtmtech , I'm glad it seems to have at least improved. I do have some questions, if you don't mind:

  • Do you have the specific error message(s) handy?
  • Is that one (or more) rules that are not in the state file still added to AWS?
  • Does this happen every time, most of the time, or seemingly random?
  • Is it the same rule(s), or seemingly random?
  • After the issue happens, what does terraform plan say? I assume it says it needs to add 1 or more rules.
  • After terraform plan, what does terraform apply do/say? Does it add the rule successfully, or does it return an error from the API indicating there's a duplicate?

Sorry for the barrage of questions, I'm still trying to hammer all the bugs out here :/
So far it seems to be an occasional race on the API side (unless you come back and say it happens every time or most of the time).


gtmtech commented Aug 24, 2015

Do you have the specific error message(s) handy?

Yep. To answer your question, the error message is exactly the same as before. There are no errors on the first apply; all rules get created in AWS. But on the second apply, there are one or two rules that attempt to apply (as they didn't get saved into the tfstate file), and as they are already in AWS, the error is the duplicate-rule error that I first referenced at the top of this issue, i.e.:

  • Error authorizing security group rules rules: InvalidPermission.Duplicate: the specified rule "peer: 0.0.0.0/0, TCP, from port: 80, to port: 80, ALLOW" already exists

Is that one (or more) rules that are not in the state file still added to AWS?

Yes, they get added to AWS (hence them erroring as duplicates on the next apply).

Does this happen every time?

Every time I've re-terraformed my environment, I have gotten the problem, yes.

Is it the same rule(s), or seemingly random?

Every time it is random which rules don't get saved: sometimes one, sometimes two, and different rules each time, it seems to me.

After the issue happens, what does terraform plan say? I assume it says it needs to add 1 or more rules..

Correct: terraform plan thinks it needs to add the one or two rules that didn't make it into the tfstate file (but are in AWS, hence on apply it errors with Duplicate).

After terraform plan, what does terraform apply do/say? Does it add the rule successfully, or does it return an error from the API indicating there's a duplicate?

It errors with the above error message saying there are duplicates.

@joslynesser:
For those with flexibility in their overall security group design, here's a workaround while this bug is worked on. It requires a specific security group design, but the benefits of the design are:

  • circular dependencies with nested rules are removed entirely, which removes the need for the aws_security_group_rule resource type; instead, you can define all of your rules in aws_security_group resources.
  • security groups have an explicit purpose, making it easier to comprehend what they're used for

Every instance gets at least the following two security groups, each with a different purpose:

  • <role>-ingress - This specifies all ingress rules
  • <role>-egress - This specifies all egress rules (usually only 1 rule in simpler configurations: outbound internet access). More importantly, this is used as a source within -ingress security group rules.

Here's a contrived example with a circular dependency. A docker registry machine provides a redis machine with access to pull the private redis docker image, while the redis machine provides access to the docker registry for caching image layers in redis. No need for any aws_security_group_rule resources, all ingress group rules use egress groups as sources.

# docker registry

resource "aws_security_group" "dockerregistry-ingress" {
  name = "dockerregistry-ingress"
  description = "Docker Registry (Ingress)"
  vpc_id = "${var.vpc_id}"

  ingress {
    from_port = 443
    to_port = 443
    protocol = "tcp"
    security_groups = ["${aws_security_group.redis-egress.id}"]
  }
}

resource "aws_security_group" "dockerregistry-egress" {
  name = "dockerregistry-egress"
  description = "Docker Registry (Egress)"
  vpc_id = "${var.vpc_id}"

  egress {
      from_port = 0
      to_port = 0
      protocol = "-1"
      cidr_blocks = ["0.0.0.0/0"]
  }
}

# redis

resource "aws_security_group" "redis-ingress" {
  name = "redis-ingress"
  description = "Redis (Ingress)"
  vpc_id = "${var.vpc_id}"

  ingress {
    from_port = 6379
    to_port = 6379
    protocol = "tcp"
    security_groups = ["${aws_security_group.dockerregistry-egress.id}"]
  }
}

resource "aws_security_group" "redis-egress" {
  name = "redis-egress"
  description = "Redis (Egress)"
  vpc_id = "${var.vpc_id}"

  egress {
      from_port = 0
      to_port = 0
      protocol = "-1"
      cidr_blocks = ["0.0.0.0/0"]
  }
}

Since moving to this design, it has become much easier for us to identify which rules belong to which security groups, with fewer lines of code. Avoiding this bug was just a lucky bonus :)

@joelmoss (Contributor):

I created a new issue at #3498 but after more investigation, my issue appears to be the same as this.

Every time, without fail, exactly the same 23 of my resources (I have more than that) show as needing to be created in the plan. This is really impacting my ability to use Terraform now.

Does anyone know if any progress has been made on this please?


catsby commented Oct 14, 2015

Hey @joelmoss – this issue with security groups and rules should be fixed in master. #3019 should resolve these issues, but if anyone is still hitting them on master please let me know.

thanks!

@catsby catsby closed this as completed Oct 14, 2015
@joelmoss (Contributor):

@catsby Awesome. Will give it a go later. When can we expect the next release?


catsby commented Oct 14, 2015

@joelmoss "soon" 😄

@joelmoss (Contributor):

How did I know you'd say that? ;)

@joelmoss (Contributor):

OK, so I just tried this on master and all good! Thanks loads.


cmlad commented Apr 14, 2016

We are still hitting the issue quite hard where security group rules are not getting recorded during creation. We see about a 40-50% failure rate (of the most recent 5 runs, 3 had the issue).

This is with the newest version, 0.6.14. I'm attaching three state files: one from a successful build and two from bad builds. You can see that many rules are missing.

Should I file a full bug report, or is that enough? If you point me in the right direction I can try to fix this myself, though I've never dabbled with Go.

Good creation:
terraform.tfstate.good.txt

Bad creations:
terraform.tfstate.bad1.txt
terraform.tfstate.bad2.txt

@benglewis:

I'm still getting this issue. Let me know what I can provide that would be useful for you to fix this :)


ghost commented Apr 18, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 18, 2020