Skip to content

Commit

Permalink
ec2_instance: wait on instance state, rather than status check (ansib…
Browse files Browse the repository at this point in the history
…le-collections#481)

ec2_instance: wait on instance state, rather than status check

Attempt to revive ansible-collections#461
(Thanks @overhacked)
SUMMARY
The await_instances() function was calling module.client('ec2').get_waiter('instance_status_ok') by default, which causes boto3 to wait on the DescribeInstanceStatus api to return "Ok". The status API is linked to EC2's instance status checks, which lag the instance's real state by 3-5 minutes.
It's unlikely that most Ansible users who set wait: true want their playbooks to delay until EC2's status checks catch up. The more efficient approach is to use wait_for or wait_for_connection. What is useful, however, is to wait the short period of time necessary for EC2 to move an instance from pending to running.
Before this change, the ec2_instance module would wait for an "Ok" status check for newly-created instances before waiting for any other state change. This change makes the default state: present wait for instance_exists, and state: running wait for instance_running. state: started (previously just an alias for state: running) now will wait for instance_status_ok if wait: true.
The only, IMHO unlikely, scenario that this is a breaking change for is if an Ansible user had combined state: present/running/started and wait: true and is actually depending on the play to halt until EC2's status checks return OK. Based on my own informal survey of playbooks on GitHub, I did not see anyone using ec2_instance in this way. Most playbooks follow ec2_instance with wait_for or wait_for_connection.
ISSUE TYPE

Bugfix Pull Request

COMPONENT NAME
ec2_instance
ADDITIONAL INFORMATION

Reviewed-by: Jill R <None>
Reviewed-by: Alina Buzachis <None>
Reviewed-by: None <None>
  • Loading branch information
tremble authored Sep 8, 2021
1 parent d047a90 commit ad7a0eb
Show file tree
Hide file tree
Showing 25 changed files with 344 additions and 323 deletions.
13 changes: 13 additions & 0 deletions changelogs/fragments/481-ec2_instance-wait_sanity.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
breaking_changes:
- >-
ec2_instance - instance wait for state behaviour has changed. If plays
require the old behavior of waiting for the instance monitoring status to
become ``OK`` when launching a new instance, the action will need to specify
``state: started``
(https://github.com/ansible-collections/amazon.aws/pull/481).
bugfixes:
- >-
ec2_instance - ``ec2_instance`` was waiting on EC2 instance monitoring status
to be ``OK`` when launching a new instance. This could cause a play to wait
multiple minutes for AWS's monitoring to complete status checks
(https://github.com/ansible-collections/amazon.aws/pull/481).
427 changes: 242 additions & 185 deletions plugins/modules/ec2_instance.py

Large diffs are not rendered by default.

5 changes: 3 additions & 2 deletions tests/integration/targets/ec2_instance/aliases
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
disabled
# Takes about 10-15 minutes
slow

cloud/aws
ec2_instance_info
slow
5 changes: 2 additions & 3 deletions tests/integration/targets/ec2_instance/inventory
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
[tests]
# Sorted fastest to slowest
checkmode_tests
termination_protection
ebs_optimized
block_devices
cpu_options
Expand All @@ -8,9 +9,7 @@ default_vpc_tests
external_resource_attach
instance_no_wait
iam_instance_role
termination_protection
tags_and_vpc_settings
checkmode_tests
security_group
state_config_updates

Expand Down
5 changes: 1 addition & 4 deletions tests/integration/targets/ec2_instance/main.yml
Original file line number Diff line number Diff line change
Expand Up @@ -19,9 +19,6 @@
# running an 'include' that was run_once
setup_run_once: yes
block:
- include_role:
name: 'ec2_instance'
tasks_from: find_ami.yml
- include_role:
name: 'ec2_instance'
tasks_from: env_setup.yml
Expand All @@ -38,6 +35,6 @@
- hosts: all
gather_facts: no
strategy: free
#serial: 10
serial: 3
roles:
- ec2_instance
5 changes: 2 additions & 3 deletions tests/integration/targets/ec2_instance/meta/main.yml
Original file line number Diff line number Diff line change
@@ -1,4 +1,3 @@
dependencies:
- prepare_tests
- setup_ec2
- setup_remote_tmp_dir
- prepare_tests
- setup_ec2_facts
Original file line number Diff line number Diff line change
Expand Up @@ -3,15 +3,18 @@
ec2_instance_owner: 'integration-run-{{ resource_prefix }}'
ec2_instance_type: 't3.micro'
ec2_instance_tag_TestId: '{{ resource_prefix }}-{{ inventory_hostname }}'
ec2_ami_name: 'amzn2-ami-hvm-2.*-x86_64-gp2'

vpc_name: '{{ resource_prefix }}-vpc'
vpc_seed: '{{ resource_prefix }}'
vpc_cidr: '10.{{ 256 | random(seed=vpc_seed) }}.0.0/16'

subnet_a_az: '{{ ec2_availability_zone_names[0] }}'
subnet_a_cidr: '10.{{ 256 | random(seed=vpc_seed) }}.32.0/24'
subnet_a_startswith: '10.{{ 256 | random(seed=vpc_seed) }}.32.'
subnet_b_az: '{{ ec2_availability_zone_names[1] }}'
subnet_b_cidr: '10.{{ 256 | random(seed=vpc_seed) }}.33.0/24'
subnet_b_startswith: '10.{{ 256 | random(seed=vpc_seed) }}.33.'

first_iam_role: "ansible-test-sts-{{ resource_prefix | hash('md5') }}-test-policy"
second_iam_role: "ansible-test-sts-{{ resource_prefix | hash('md5') }}-test-policy-2"
# Zuul resource prefixes are very long, and IAM roles can only be 64 characters
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
dependencies:
- prepare_tests
- setup_ec2
- prepare_tests
- setup_ec2_facts
collections:
- amazon.aws
- amazon.aws
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
- block:
- name: "New instance with an extra block device"
ec2_instance:
state: present
state: running
name: "{{ resource_prefix }}-test-ebs-vols"
image_id: "{{ ec2_ami_image }}"
image_id: "{{ ec2_ami_id }}"
vpc_subnet_id: "{{ testing_subnet_b.subnet.id }}"
volumes:
- device_name: /dev/sdb
Expand Down Expand Up @@ -35,7 +35,7 @@
ec2_instance:
state: present
name: "{{ resource_prefix }}-test-ebs-vols-checkmode"
image_id: "{{ ec2_ami_image }}"
image_id: "{{ ec2_ami_id }}"
vpc_subnet_id: "{{ testing_subnet_b.subnet.id }}"
volumes:
- device_name: /dev/sdb
Expand All @@ -52,7 +52,6 @@
ec2_instance_info:
filters:
"tag:Name": "{{ resource_prefix }}-test-ebs-vols"
"instance-state-name": "running"
register: presented_instance_fact

- name: "fact checkmode ec2 instance"
Expand All @@ -61,22 +60,26 @@
"tag:Name": "{{ resource_prefix }}-test-ebs-vols-checkmode"
register: checkmode_instance_fact

- name: "Confirm whether the check mode is working normally."
- name: "Confirm instance was created without check mode"
assert:
that:
- "{{ presented_instance_fact.instances | length }} > 0"

- name: "Confirm instance was not created with check mode"
assert:
that:
- "{{ checkmode_instance_fact.instances | length }} == 0"

- name: "Terminate instances"
ec2_instance:
state: absent
instance_ids: "{{ block_device_instances.instance_ids }}"

- name: "New instance with an extra block device - gp3 volume_type and throughput"
ec2_instance:
state: present
state: running
name: "{{ resource_prefix }}-test-ebs-vols-gp3"
image_id: "{{ ec2_ami_image }}"
image_id: "{{ ec2_ami_id }}"
vpc_subnet_id: "{{ testing_subnet_b.subnet.id }}"
volumes:
- device_name: /dev/sdb
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
ec2_instance:
state: present
name: "{{ resource_prefix }}-checkmode-comparison"
image_id: "{{ ec2_ami_image }}"
image_id: "{{ ec2_ami_id }}"
security_groups: "{{ sg.group_id }}"
instance_type: "{{ ec2_instance_type }}"
vpc_subnet_id: "{{ testing_subnet_a.subnet.id }}"
Expand All @@ -17,7 +17,7 @@
ec2_instance:
state: present
name: "{{ resource_prefix }}-checkmode-comparison-checkmode"
image_id: "{{ ec2_ami_image }}"
image_id: "{{ ec2_ami_id }}"
security_groups: "{{ sg.group_id }}"
instance_type: "{{ ec2_instance_type }}"
vpc_subnet_id: "{{ testing_subnet_b.subnet.id }}"
Expand Down Expand Up @@ -63,7 +63,7 @@
- name: "Verify that it was not stopped."
assert:
that:
- '"{{ confirm_checkmode_stopinstance_fact.instances[0].state.name }}" != "stopped"'
- confirm_checkmode_stopinstance_fact.instances[0].state.name not in ["stopped", "stopping"]

- name: "Stop instance."
ec2_instance:
Expand All @@ -73,9 +73,8 @@
tags:
TestId: "{{ ec2_instance_tag_TestId }}"
TestTag: "Some Value"
wait: true
register: instance_stop
until: not instance_stop.failed
retries: 10

- name: "fact stopped ec2 instance"
ec2_instance_info:
Expand All @@ -86,7 +85,7 @@
- name: "Verify that it was stopped."
assert:
that:
- '"{{ confirm_stopinstance_fact.instances[0].state.name }}" in ["stopped", "stopping"]'
- confirm_stopinstance_fact.instances[0].state.name in ["stopped", "stopping"]

- name: "Running instance in check mode."
ec2_instance:
Expand Down Expand Up @@ -158,6 +157,7 @@
tags:
TestId: "{{ ec2_instance_tag_TestId }}"
TestTag: "Some Value"
wait: True
check_mode: yes

- name: "fact ec2 instance"
Expand All @@ -179,6 +179,7 @@
tags:
TestId: "{{ ec2_instance_tag_TestId }}"
TestTag: "Some Value"
wait: True

- name: "fact ec2 instance"
ec2_instance_info:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,15 +3,15 @@
ec2_instance:
state: present
name: "{{ resource_prefix }}-test-t3nano-1-threads-per-core"
image_id: "{{ ec2_ami_image }}"
image_id: "{{ ec2_ami_id }}"
tags:
TestId: "{{ ec2_instance_tag_TestId }}"
vpc_subnet_id: "{{ testing_subnet_a.subnet.id }}"
instance_type: t3.nano
cpu_options:
core_count: 1
threads_per_core: 1
wait: false
wait: true
register: instance_creation

- name: "instance with cpu_options created with the right options"
Expand All @@ -24,15 +24,15 @@
ec2_instance:
state: present
name: "{{ resource_prefix }}-test-t3nano-1-threads-per-core"
image_id: "{{ ec2_ami_image }}"
image_id: "{{ ec2_ami_id }}"
tags:
TestId: "{{ ec2_instance_tag_TestId }}"
vpc_subnet_id: "{{ testing_subnet_a.subnet.id }}"
instance_type: t3.nano
cpu_options:
core_count: 1
threads_per_core: 2
wait: false
wait: true
register: cpu_options_update
ignore_errors: yes

Expand All @@ -54,15 +54,17 @@

- name: "create t3.nano instance with cpu_options(check mode)"
ec2_instance:
state: running
name: "{{ resource_prefix }}-test-t3nano-1-threads-per-core-checkmode"
image_id: "{{ ec2_ami_image }}"
image_id: "{{ ec2_ami_id }}"
tags:
TestId: "{{ ec2_instance_tag_TestId }}"
vpc_subnet_id: "{{ testing_subnet_a.subnet.id }}"
instance_type: t3.nano
cpu_options:
core_count: 1
threads_per_core: 1
wait: true
check_mode: yes

- name: "fact checkmode ec2 instance"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
ec2_instance:
state: present
name: "{{ resource_prefix }}-test-default-vpc"
image_id: "{{ ec2_ami_image }}"
image_id: "{{ ec2_ami_id }}"
tags:
TestId: "{{ ec2_instance_tag_TestId }}"
security_group: "default"
Expand All @@ -15,7 +15,7 @@
ec2_instance:
state: present
name: "{{ resource_prefix }}-test-default-vpc-checkmode"
image_id: "{{ ec2_ami_image }}"
image_id: "{{ ec2_ami_id }}"
tags:
TestId: "{{ ec2_instance_tag_TestId }}"
security_group: "default"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
ec2_instance:
state: present
name: "{{ resource_prefix }}-test-ebs-optimized-instance-in-vpc"
image_id: "{{ ec2_ami_image }}"
image_id: "{{ ec2_ami_id }}"
tags:
TestId: "{{ ec2_instance_tag_TestId }}"
security_groups: "{{ sg.group_id }}"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
ec2_eni:
state: absent
eni_id: "{{ item.id }}"
register: removed
until: removed is not failed
with_items: "{{ enis.network_interfaces }}"
ignore_errors: yes
Expand Down Expand Up @@ -44,32 +45,6 @@
ignore_errors: yes
retries: 10

- name: "remove routing rules"
ec2_vpc_route_table:
state: absent
vpc_id: "{{ testing_vpc.vpc.id }}"
tags:
created: "{{ resource_prefix }}-route"
routes:
- dest: 0.0.0.0/0
gateway_id: "{{ igw.gateway_id }}"
subnets:
- "{{ testing_subnet_a.subnet.id }}"
- "{{ testing_subnet_b.subnet.id }}"
register: removed
until: removed is not failed
ignore_errors: yes
retries: 10

- name: "remove internet gateway"
ec2_vpc_igw:
state: absent
vpc_id: "{{ testing_vpc.vpc.id }}"
register: removed
until: removed is not failed
ignore_errors: yes
retries: 10

- name: "remove subnet A"
ec2_vpc_subnet:
state: absent
Expand Down
Original file line number Diff line number Diff line change
@@ -1,17 +1,5 @@
- run_once: '{{ setup_run_once | default("no") | bool }}'
block:
- name: "fetch AZ availability"
aws_az_info:
register: az_info
- name: "Assert that we have multiple AZs available to us"
assert:
that: az_info.availability_zones | length >= 2

- name: "pick AZs"
set_fact:
subnet_a_az: '{{ az_info.availability_zones[0].zone_name }}'
subnet_b_az: '{{ az_info.availability_zones[1].zone_name }}'

- name: "Create VPC for use in testing"
ec2_vpc_net:
state: present
Expand All @@ -22,12 +10,6 @@
tenancy: default
register: testing_vpc

- name: "Create internet gateway for use in testing"
ec2_vpc_igw:
state: present
vpc_id: "{{ testing_vpc.vpc.id }}"
register: igw

- name: "Create default subnet in zone A"
ec2_vpc_subnet:
state: present
Expand All @@ -48,19 +30,6 @@
Name: "{{ resource_prefix }}-subnet-b"
register: testing_subnet_b

- name: "create routing rules"
ec2_vpc_route_table:
state: present
vpc_id: "{{ testing_vpc.vpc.id }}"
tags:
created: "{{ resource_prefix }}-route"
routes:
- dest: 0.0.0.0/0
gateway_id: "{{ igw.gateway_id }}"
subnets:
- "{{ testing_subnet_a.subnet.id }}"
- "{{ testing_subnet_b.subnet.id }}"

- name: "create a security group with the vpc"
ec2_group:
state: present
Expand All @@ -77,7 +46,7 @@
to_port: 80
cidr_ip: 0.0.0.0/0
register: sg

- name: "create secondary security group with the vpc"
ec2_group:
name: "{{ resource_prefix }}-sg-2"
Expand Down
Loading

0 comments on commit ad7a0eb

Please sign in to comment.