Latest image from docker hub doesn't work on CentOS 7 #23

Open
claflico opened this issue Sep 19, 2018 · 3 comments

@claflico

I spun up some new load-balancer Docker hosts last night and attempted to migrate the keepalived service to them, but the VIP never came up.

This is a snippet of the logs:

9/19/2018 1:56:29 PMWed Sep 19 13:56:29 2018: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(8,9)]
9/19/2018 1:56:29 PMWed Sep 19 13:56:29 2018: Script `chk_haproxy` now returning 2
9/19/2018 1:56:29 PMWed Sep 19 13:56:29 2018: VRRP_Script(chk_haproxy) failed (exited with status 2)
9/19/2018 1:56:29 PMWed Sep 19 13:56:29 2018: (lb-vips) Entering FAULT STATE
9/19/2018 1:56:29 PMWed Sep 19 13:56:29 2018: Kernel/system configuration issue causing multicast packets to be received but IP_MULTICAST_ALL unset
9/19/2018 1:56:31 PMDisplaying resulting /etc/keepalived/keepalived.conf contents...
9/19/2018 1:56:31 PMWed Sep 19 13:56:31 2018: Starting Keepalived v2.0.4 (06/24,2018), git commit v3.8.0_rc8-47-g5ec10636b6
9/19/2018 1:56:31 PMWed Sep 19 13:56:31 2018: WARNING - keepalived was build for newer Linux 4.4.6, running on Linux 3.10.0-862.11.6.el7.x86_64 #1 SMP Tue Aug 14 21:49:04 UTC 2018
9/19/2018 1:56:31 PMWed Sep 19 13:56:31 2018: Opening file '/etc/keepalived/keepalived.conf'.
9/19/2018 1:56:31 PM    global_defs {
9/19/2018 1:56:31 PM        #Hostname will be used by default
9/19/2018 1:56:31 PM        #router_id your_name
9/19/2018 1:56:31 PM        vrrp_version 2
9/19/2018 1:56:31 PM        vrrp_garp_master_delay 1
9/19/2018 1:56:31 PM        vrrp_garp_master_refresh 60
9/19/2018 1:56:31 PM        #Uncomment the next line if you'd like to use unique multicast groups
9/19/2018 1:56:31 PM        #vrrp_mcast_group4 224.0.0.12
9/19/2018 1:56:31 PM        script_user root
9/19/2018 1:56:31 PM    }
9/19/2018 1:56:31 PM
9/19/2018 1:56:31 PM    vrrp_script chk_haproxy {
9/19/2018 1:56:31 PM        script       "iptables -t nat -nL CATTLE_PREROUTING | grep ':80'"
9/19/2018 1:56:31 PM        timeout 1
9/19/2018 1:56:31 PM        interval 1   # check every 1 second
9/19/2018 1:56:31 PM        fall 2       # require 2 failures for KO
9/19/2018 1:56:31 PM        rise 2       # require 2 successes for OK
9/19/2018 1:56:31 PM    }
9/19/2018 1:56:31 PM
9/19/2018 1:56:31 PM    vrrp_instance lb-vips {
9/19/2018 1:56:31 PM        state BACKUP
9/19/2018 1:56:31 PM        interface eth0
9/19/2018 1:56:31 PM        virtual_router_id 12
9/19/2018 1:56:31 PM        priority 100
9/19/2018 1:56:31 PM        advert_int 1
9/19/2018 1:56:31 PM        nopreempt #Prevent fail-back
9/19/2018 1:56:31 PM        track_script {
9/19/2018 1:56:31 PM            chk_haproxy
9/19/2018 1:56:31 PM        }
9/19/2018 1:56:31 PM        authentication {
9/19/2018 1:56:31 PM            auth_type PASS
9/19/2018 1:56:31 PM            auth_pass blahblah
9/19/2018 1:56:31 PM        }
9/19/2018 1:56:31 PM        virtual_ipaddress {
9/19/2018 1:56:31 PM            10.XX.XX.12/24 dev eth0
9/19/2018 1:56:31 PM        }
9/19/2018 1:56:31 PM    }
9/19/2018 1:56:31 PMStarting Keepalived in the background...
9/19/2018 1:56:31 PMWed Sep 19 13:56:31 2018: daemon is already running
9/19/2018 1:56:31 PM/usr/bin/keepalived.sh: line 101: wait: pid 19 is not a child of this shell

I saw that the new hosts were using an image that was created 5 weeks ago. I went to a previous host that still had the image created 13 months ago, tagged it, and pushed it to our Docker image server. I configured the service to use that tagged image and the VIP came up on the new hosts, so the problem must be something in the new image, since that is the only thing that changed.

Also, the check-port script should probably be changed from grep ':${CHECK_PORT}' to grep 'dpt:${CHECK_PORT} ' (note the trailing space), because ':80' also matches ':8000'. Otherwise the script can report a false positive when something is also running on port 8000 (e.g. Traefik) on that host:

iptables -t nat -nL CATTLE_PREROUTING | grep ':80'
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:80 to:10.XX.XX.45:80
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:80 ADDRTYPE match dst-type LOCAL to:10.XX.XX.45:80
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:8000 to:10.XX.XX.45:8000
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:8000 ADDRTYPE match dst-type LOCAL to:10.XX.XX.45:8000
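
The false positive described above can be reproduced without iptables. The following is a minimal sketch, assuming a host where only the port-8000 rules exist; the `rules`, `loose`, and `strict` variable names are made up for illustration, and the rule text is copied from the output above.

```shell
#!/bin/sh
# Sample rules copied from the output above (addresses masked as in the report).
# Assumption for this sketch: only the port-8000 rules are present, i.e. nothing
# is forwarding port 80 on this host.
rules='DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:8000 to:10.XX.XX.45:8000
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:8000 ADDRTYPE match dst-type LOCAL to:10.XX.XX.45:8000'

CHECK_PORT=80

# Loose pattern: ':80' is a prefix of ':8000', so it matches even though
# no rule for port 80 exists.
loose=$(printf '%s\n' "$rules" | grep -c ":${CHECK_PORT}")

# Stricter pattern: 'dpt:80 ' (with the trailing space) only matches a rule
# whose destination port is exactly 80.
strict=$(printf '%s\n' "$rules" | grep -c "dpt:${CHECK_PORT} ")

echo "loose matches:  $loose"
echo "strict matches: $strict"
```

With these two sample rules, the loose pattern counts both lines while the stricter one counts none, so only the stricter check would correctly report that port 80 is not being forwarded.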
@sjiveson
Collaborator

Hey, it looks like the problem is the use of the wait command. I remember having issues with this a while back on a different OS. I will update tomorrow.
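
For context, the "pid 19 is not a child of this shell" message in the log is what the shell's `wait` builtin reports when it is given a PID that the current shell did not start itself. The sketch below reproduces that behaviour in isolation; it is not taken from keepalived.sh, and the use of `sleep` as a stand-in for a self-daemonizing process is an assumption made for illustration.

```shell
#!/bin/sh
# Start a background process inside a command substitution (a subshell), so it
# is NOT a child of the current shell, similar to a daemon that forks itself
# into the background. Its output is redirected so the substitution returns
# immediately instead of waiting for the process to finish.
pid=$( (sleep 2 >/dev/null 2>&1 & echo "$!") )

# wait can only wait for children of the current shell. For any other PID it
# fails straight away rather than blocking until the process exits.
wait "$pid" 2>/dev/null
wait_status=$?

echo "wait exit status: $wait_status"
```

Because `wait` returns immediately with a non-zero status here (127 per POSIX for a PID the shell does not know), a script that relies on it to keep PID 1 alive would fall through and exit, which is one plausible reason for polling the process instead.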

@sjiveson
Collaborator

sjiveson commented Sep 24, 2018

Thanks for your patience. Any chance you can try replacing lines 100-103 in the keepalived.sh file with what follows, rebuilding the container and seeing if that works better:

while true; do

  # Check whether Keepalived is still running by recording its PID (if it is not running, $pid will be empty):
  pid=$(pidof keepalived)
  # If it is not, let's end our PID 1 process (this script) by breaking out of this while loop.
  # This ensures Docker 'sees' the failure and handles it as necessary.
  if [ -z "$pid" ]; then
    echo "Keepalived is no longer running, exiting so Docker can restart the container..."
    break
  fi

  # If it is, give the CPU a rest
  sleep 0.5

done

I can do so myself and test accordingly but it might be a couple of days.

@sjiveson
Collaborator

sjiveson commented Oct 3, 2018

Hey Cory, thanks again for your patience. I've made the necessary changes. Please rebuild, test as appropriate, and let me know if you have any further issues. I've tested it and it works for me.
