RHEL interfaces sometimes come up with the wrong IP #1577

Closed
wants to merge 2 commits into from

chorankates-sfdc
Contributor

in our Vagrantfile, we've specified that this host should have the IP address of 192.168.1.46, and while vagrant is doing the right thing in /etc/sysconfig/network-scripts/ifcfg-eth1, we're still coming up to the wrong (192.168.1.180) address. i can provide the Vagrantfile if you'd like, but effectively, we're calling:
agent.vm.network :hostonly, "192.168.1.46"

by creating the proper content in ifcfg-eth1 before calling ifdown, i can no longer reproduce this issue. i don't fully understand this interaction (or why it seems to only happen under 'some' circumstances), but a strace shows ifdown definitely reading that file. i can reliably reproduce this issue (with 1.0.5 and 1.1.0) without the change i'm proposing, and can't reproduce it with the change -- and on the machines that didn't see this problem originally, my change doesn't introduce any issues.

we've only seen this with RHEL boxes (5.5 and 6.2), so i'm keeping the change as narrowly scoped as possible.


choran-kates@chorankates-wsl3:~/git/piab[isd/piab|unique-ips|e697b1f|U]
6:22.12 $ vagrant ssh app
[vagrant@piab1-app1-1-piab ~]$ cat /etc/sysconfig/network-scripts/ifcfg-eth1

#VAGRANT-BEGIN
# The contents below are automatically generated by Vagrant. Do not modify.
BOOTPROTO=static
IPADDR=192.168.1.46
NETMASK=255.255.255.0
DEVICE=eth1
#VAGRANT-END

[vagrant@piab1-app1-1-piab ~]$ /sbin/ifconfig | grep -i inet
inet addr:10.0.2.15 Bcast:10.0.2.255 Mask:255.255.255.0
inet addr:192.168.1.180 Bcast:192.168.1.255 Mask:255.255.255.0
inet addr:127.0.0.1 Mask:255.0.0.0
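
The mismatch shown above (ifcfg-eth1 says 192.168.1.46, the interface reports 192.168.1.180) can be checked mechanically. The following is only a minimal sketch: `configured_ip` and `active_ips` are hypothetical helper names, not part of Vagrant, and the regexes assume the old net-tools `inet addr:` output format seen in this report.

```ruby
# Hypothetical helpers (not Vagrant APIs) that compare the address written
# to an ifcfg file with the addresses ifconfig actually reports.

# Returns the IPADDR value from ifcfg-style text, or nil if none is set.
def configured_ip(ifcfg_text)
  ifcfg_text[/^IPADDR=(\S+)/, 1]
end

# Returns every IPv4 address found in net-tools style ifconfig output.
def active_ips(ifconfig_text)
  ifconfig_text.scan(/inet addr:(\d{1,3}(?:\.\d{1,3}){3})/).flatten
end

ifcfg = <<~IFCFG
  BOOTPROTO=static
  IPADDR=192.168.1.46
  NETMASK=255.255.255.0
  DEVICE=eth1
IFCFG

ifconfig = <<~IFCONFIG
  inet addr:10.0.2.15 Bcast:10.0.2.255 Mask:255.255.255.0
  inet addr:192.168.1.180 Bcast:192.168.1.255 Mask:255.255.255.0
  inet addr:127.0.0.1 Mask:255.0.0.0
IFCONFIG

wanted = configured_ip(ifcfg)
actual = active_ips(ifconfig)
mismatch = !actual.include?(wanted)
puts "configured #{wanted}, active #{actual.join(', ')}, mismatch: #{mismatch}"
```

In a real run the two strings would come from `cat /etc/sysconfig/network-scripts/ifcfg-eth1` and `/sbin/ifconfig` on the guest.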

preparing a fix for a weird redhat ifdown issue
…ist before ifdown is called, when ifup is called, the interface comes up to the wrong IP address
@mitchellh
Contributor

That is pretty odd... is there any way we can get more info on this? Or hopefully some RHEL user can chime in.

@mbadran

mbadran commented Apr 18, 2013

I'm seeing similar issues with CentOS. Not sure if they're related or not.

I'm using the CentOS 6.4 VM from vagrantbox.es:

CentOS 6.4 x86_64 Minimal VMware Fusion (VMware Tools, Chef 11.4.0, Puppet 3.1.1)

In my Vagrantfile I have:

CentOS 6.4 x86_64 Minimal VMware Fusion (VMware Tools, Chef 11.4.0, Puppet 3.1.1)

When I bring up the VM, I get the following output:

> vagrant up oig1 --provider vmware_fusion
Bringing machine 'oig1' up with 'vmware_fusion' provider...
[oig1] Cloning VMware VM: 'centos_64'. This can take some time...
[oig1] Verifying vmnet devices are healthy...
[oig1] Preparing network adapters...
[oig1] Starting the VMware VM...
[oig1] Waiting for the VM to finish booting...
[oig1] The machine is booted and ready!
[oig1] Forwarding ports...
[oig1] -- 22 => 2222
[oig1] Configuring network adapters within the VM...
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

/sbin/ifup eth1 2> /dev/null

In spite of this, I can ssh in, and sure enough, the eth1 interface is not up, although the contents of the eth1 configuration have been configured by Vagrant. eth0 is up and has a different IP over which the ssh connection occurs.

This is what happens when I manually run ifup on the instance:

[vagrant@vagrant-centos-6 network-scripts]$ sudo ifup eth1
Device eth1 does not seem to be present, delaying initialization.

As an aside, is there any straightforward way to create images for other providers such as vmware_fusion? I remember you mentioned something about that feature coming some time around v1.2.

Thanks.

@chorankates-sfdc
Contributor Author

@mbadran: we've seen the 'Vagrant assumes that this means the command failed! /sbin/ifup eth1 2> /dev/null' error as well, but it only happens when the wrong IP that the interface comes up with is already occupied. running sudo ifdown eth1 followed by sudo ifup eth1 fixes it for us, i believe because by this point /etc/sysconfig/network-scripts/ifcfg-eth1 has the right IP specified

@kid-icarus

I'm using a CentOS 6.4 x86_64 guest on Mac OS X 10.8 host with a VirtualBox 4.2.12 provider and Vagrant 1.2.1, experiencing the exact same issue as @mbadran.

@kid-icarus

Here's a log: https://gist.github.com/kid-icarus/5430446

@chorankates-sfdc
Contributor Author

@mbadran @kid-icarus with this proposed patch, do you still see the same issues?

@kid-icarus

Hey @chorankates-sfdc it looks like the patch didn't fix it for me, I still experienced the same behavior :-\

@jistanidiot

I'm having this problem. I believe it is related to these RedHat/Fedora bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=513692
https://bugzilla.redhat.com/show_bug.cgi?id=491432

Basically the files in /etc/sysconfig/network-scripts require the HWADDR which vagrant does not put in. There are workarounds but none work for me. Vagrant should figure out the newly assigned MAC for each interface and add it to the correct file.

@michael-harrison

I'm also having this problem. I've been detailing my journey in #921. For details see #921 (comment) for my setup and https://gist.github.com/michael-harrison/5746092 for logs highlighting the issue.

@michael-harrison

I managed to find a workaround. You can check it out on #921 (comment)

@onejli
Contributor

onejli commented Oct 14, 2013

Added some debugging lines to vagrant-1.2.2/plugins/guests/redhat/cap/configure_networks.rb and noticed the following when running with VirtualBox 4.2.12.

Given a Vagrantfile that looks like this:

Vagrant.configure("2") do |config|
  config.vm.define :web do |web_config|
    web_config.vm.box = "rhel6_u2"
    web_config.vm.network "private_network", ip: "172.16.1.100"
    web_config.vm.provider :virtualbox do |vb|
      vb.customize ['modifyvm', :id, '--name', "web"]
    end
  end

  config.vm.define :db do |db_config|
    db_config.vm.box = "rhel6_u2"
    db_config.vm.network "private_network", ip: "172.16.1.101"
    db_config.vm.provider :virtualbox do |vb|
      vb.customize ['modifyvm', :id, '--name', "db"]
    end
  end
end

On the first host to be vagrant up-ed (db)

ifconfig before line 26 looks like this:

eth1      Link encap:Ethernet  HWaddr 08:00:27:DF:FB:A3  
          inet addr:192.168.1.161  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fedf:fba3/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:900 (900.0 b)

As lines 27 through 29 remove all vagrant info in ifcfg-eth1 provisioned by network_static.erb (it ends up being a blank file), ifdown on line 49 will always fail:

ERROR    : [ipv6_test_device_status] Missing parameter 'device' (arg 1)

This means that ifconfig between lines 49 and 50 will still show eth1 as up and unchanged:

eth1      Link encap:Ethernet  HWaddr 08:00:27:DF:FB:A3  
          inet addr:192.168.1.161  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fedf:fba3/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:900 (900.0 b)

The ifup on line 51 then fails because the eth1 interface is already up:

RTNETLINK answers: File exists

As such, ifconfig between lines 51 and 52 will again be unchanged:

eth1      Link encap:Ethernet  HWaddr 08:00:27:DF:FB:A3  
          inet addr:192.168.1.161  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fedf:fba3/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:19 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:1110 (1.0 KiB)

/etc/sysconfig/network-scripts/ifcfg-eth1 looks like this:

#VAGRANT-BEGIN
# The contents below are automatically generated by Vagrant. Do not modify.
BOOTPROTO=none
IPADDR=172.16.1.101
NETMASK=255.255.255.0
DEVICE=eth1
PEERDNS=no
#VAGRANT-END

On the second host to be vagrant up-ed (web)

ifconfig before line 26 looks like this:

eth1      Link encap:Ethernet  HWaddr 08:00:27:89:40:B1  
          inet6 addr: fe80::a00:27ff:fe89:40b1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:60 (60.0 b)  TX bytes:620 (620.0 b)

As with the first host, ifdown on line 49 will always fail:

ERROR    : [ipv6_test_device_status] Missing parameter 'device' (arg 1)

This means that ifconfig between lines 49 and 50 will be unchanged:

eth1      Link encap:Ethernet  HWaddr 08:00:27:89:40:B1  
          inet6 addr: fe80::a00:27ff:fe89:40b1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:60 (60.0 b)  TX bytes:620 (620.0 b)

However, in this case, ifup on line 51 succeeds.

The resulting ifconfig between lines 51 and 52 now shows the correct ip address when ifup-ing the ifcfg-eth#{network[:interface]} "deployed" on line 50:

eth1      Link encap:Ethernet  HWaddr 08:00:27:89:40:B1  
          inet addr:172.16.1.100  Bcast:172.16.1.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe89:40b1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:60 (60.0 b)  TX bytes:830 (830.0 b)

/etc/sysconfig/network-scripts/ifcfg-eth1 looks like this:

#VAGRANT-BEGIN
# The contents below are automatically generated by Vagrant. Do not modify.
BOOTPROTO=none
IPADDR=172.16.1.100
NETMASK=255.255.255.0
DEVICE=eth1
PEERDNS=no
#VAGRANT-END

Issues

I'd like to debug a bit more, but I could use a nudge in the right direction regarding network_static.erb. I'm a little fuzzy as to what is deploying and populating this template.
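
For reference, the blanking step described above (the Vagrant-generated block being removed so that ifcfg-eth1 ends up empty) can be sketched as follows. This is only an illustration under assumptions: `strip_vagrant_block` and `parse_ifcfg` are hypothetical helpers, not code from configure_networks.rb. It shows that once the block is gone no DEVICE setting is left, which is consistent with the "Missing parameter 'device'" error from ifdown.

```ruby
# Hypothetical helpers (not Vagrant code) illustrating why a file that held
# only the Vagrant-generated block is blank after that block is removed.

# Removes everything from the #VAGRANT-BEGIN line through the #VAGRANT-END line.
def strip_vagrant_block(text)
  text.gsub(/^#VAGRANT-BEGIN.*?^#VAGRANT-END\n?/m, "")
end

# Parses KEY=value lines into a hash, skipping blank lines and comments.
def parse_ifcfg(text)
  text.lines
      .map(&:strip)
      .reject { |line| line.empty? || line.start_with?("#") }
      .map { |line| line.split("=", 2) }
      .to_h
end

before = <<~IFCFG
  #VAGRANT-BEGIN
  # The contents below are automatically generated by Vagrant. Do not modify.
  BOOTPROTO=none
  IPADDR=172.16.1.101
  NETMASK=255.255.255.0
  DEVICE=eth1
  PEERDNS=no
  #VAGRANT-END
IFCFG

after = strip_vagrant_block(before)

puts "DEVICE before: #{parse_ifcfg(before)['DEVICE'].inspect}"
puts "DEVICE after:  #{parse_ifcfg(after)['DEVICE'].inspect}"
```

Settings outside the marker lines would survive the strip, so a file that also carried hand-written entries would not end up blank.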

@chorankates

@onejli, in 1.2.2 at least, network_static.erb is being populated by line 33 in <vagrant_gem_dir>/plugins/guests/redhat/cap/configure_networks.rb:

entry = TemplateRenderer.render("guests/redhat/network_#{network[:type]}", :options => network)

that file lives in <vagrant_gem_dir>/templates/guests/redhat/

given the input from @jistanidiot above, it might be worth adding an HWADDR value, assuming the options hash contains that data
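
a rough sketch of that HWADDR idea, using plain ERB from the Ruby standard library rather than Vagrant's TemplateRenderer. everything here is an assumption for illustration: the template text is a stand-in for network_static.erb, the option keys are guesses, and `:mac` is a hypothetical key that the real options hash may not contain.

```ruby
require "erb"

# Assumed stand-in for templates/guests/redhat/network_static.erb.
# The HWADDR lines and the :mac option are hypothetical additions.
TEMPLATE = <<~ERB_TEMPLATE
  #VAGRANT-BEGIN
  # The contents below are automatically generated by Vagrant. Do not modify.
  BOOTPROTO=none
  IPADDR=<%= options[:ip] %>
  NETMASK=<%= options[:netmask] %>
  DEVICE=eth<%= options[:interface] %>
  <%- if options[:mac] -%>
  HWADDR=<%= options[:mac] %>
  <%- end -%>
  PEERDNS=no
  #VAGRANT-END
ERB_TEMPLATE

# Renders the ifcfg entry; HWADDR is emitted only when a MAC is supplied.
def render_ifcfg(options)
  ERB.new(TEMPLATE, trim_mode: "-").result_with_hash(options: options)
end

base = { ip: "172.16.1.101", netmask: "255.255.255.0", interface: 1 }
with_mac    = render_ifcfg(base.merge(mac: "08:00:27:DF:FB:A3"))
without_mac = render_ifcfg(base)

puts with_mac
```

making the line conditional keeps the output unchanged for callers that don't know the MAC, so the idea could be tried without affecting boxes that already work.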

@onejli
Contributor

onejli commented Oct 21, 2013

Tracked down (part of) my issue to a crufty base box. I had an ifcfg-eth1 left over from the creation of my base box...

#VAGRANT-BEGIN
# The contents below are automatically generated by Vagrant. Do not modify.
BOOTPROTO=static
IPADDR=192.168.1.161
NETMASK=255.255.255.0
DEVICE=eth1
#VAGRANT-END

First host to be up-ed

eth1 comes up with the system init scripts with the 192 address as (inadvertently) specified by the base box's ifcfg-eth1

/var/log/boot looks like this:

Calling the system activity data collector (sadc): 
WARNING: Deprecated config file /etc/modprobe.conf, all config files belong into /etc/modprobe.d/.
Bringing up loopback interface:  ESC[60G[ESC[0;32m  OK  ESC[0;39m]^M
Bringing up interface eth0:  
Determining IP information for eth0... done.
ESC[60G[ESC[0;32m  OK  ESC[0;39m]^M
Bringing up interface eth1:  ESC[60G[ESC[0;32m  OK  ESC[0;39m]^M
Starting system logger: ESC[60G[ESC[0;32m  OK  ESC[0;39m]^M
Starting irqbalance: ESC[60G[ESC[0;32m  OK  ESC[0;39m]^M

Second host to be up-ed

eth1 cannot come up with the system init scripts with the 192 address as (inadvertently) specified by the base box's ifcfg-eth1 due to an ip collision

/var/log/boot looks like this:

Calling the system activity data collector (sadc): 
WARNING: Deprecated config file /etc/modprobe.conf, all config files belong into /etc/modprobe.d/.
Bringing up loopback interface:  ESC[60G[ESC[0;32m  OK  ESC[0;39m]^M
Bringing up interface eth0:  
Determining IP information for eth0... done.
ESC[60G[ESC[0;32m  OK  ESC[0;39m]^M
Bringing up interface eth1:  Error, some other host already uses address 192.168.1.161.
ESC[60G[ESC[0;31mFAILEDESC[0;39m]^M
Starting system logger: ESC[60G[ESC[0;32m  OK  ESC[0;39m]^M
Starting irqbalance: ESC[60G[ESC[0;32m  OK  ESC[0;39m]^M

The ifup on line 51 eventually "fixes" this issue after a new ifcfg-eth1 has been deployed.

/var/log/messages looks like this:

/var/log/messages
Oct 21 23:13:21 hostfoo ntpd[1187]: ntpd [email protected] Thu May 13 14:38:25 UTC 2010 (1)
Oct 21 23:13:21 hostfoo ntpd[1188]: precision = 0.047 usec
Oct 21 23:13:21 hostfoo ntpd[1188]: Listening on interface #0 wildcard, 0.0.0.0#123 Disabled
Oct 21 23:13:21 hostfoo ntpd[1188]: Listening on interface #1 wildcard, ::#123 Disabled
Oct 21 23:13:21 hostfoo ntpd[1188]: Listening on interface #2 lo, ::1#123 Enabled
Oct 21 23:13:21 hostfoo ntpd[1188]: Listening on interface #3 lo, 127.0.0.1#123 Enabled
Oct 21 23:13:21 hostfoo ntpd[1188]: Listening on interface #4 eth0, 10.0.2.15#123 Enabled
Oct 21 23:13:21 hostfoo ntpd[1188]: Listening on routing socket on fd #21 for interface updates
Oct 21 23:13:21 hostfoo ntpd[1188]: kernel time sync status 2040
Oct 21 23:13:21 hostfoo abrtd: dbus error: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
Oct 21 23:13:21 hostfoo abrtd: Error requesting DBus name com.redhat.abrt, possible reasons: abrt run by non-root; dbus config is incorrect; or dbus daemon needs to be restarted to reload dbus config
Oct 21 23:13:31 hostfoo ntpd[1188]: frequency initialized 0.000 PPM from /var/lib/ntp/ntp.drift
Oct 21 23:13:33 hostfoo ntpd[1188]: Listening on interface #5 eth1, fe80::a00:27ff:fedc:8b6a#123 Enabled
Oct 21 23:13:33 hostfoo ntpd[1188]: Listening on interface #6 eth0, fe80::a00:27ff:fe42:f532#123 Enabled
Oct 21 23:13:39 hostfoo ntpd[1188]: Listening on interface #7 eth1, 172.16.1.100#123 Enabled

Summary

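This explains why, starting with the second box to be upped, there is no ipv4 address for eth1 before line 26: the init scripts refuse to bring eth1 up on the 192.168.1.161 address that the first box already holds. Since eth1 never received an ipv4 address, the ifup on line 51 presumably avoids the "RTNETLINK answers: File exists" error and succeeds.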

I'll comment again after I've had a chance to retry with a clean base box.

@onejli
Contributor

onejli commented Nov 2, 2013

For a temporary solution, I've confirmed that removing ifcfg-eth1 from my base box and letting vagrant provision it works without issue. eth1 comes up with the correct address as specified by my Vagrantfile.

I think the real solution will require modifying the logic in configure_networks.rb a bit. Before removing/modifying any existing ifcfg file on line 24, vagrant should first attempt to ifdown the interface (similar to line 49). Pull req #2450 contains my proposed patch.
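
The reordering idea can be sketched as a list of commands. This is not the patch in #2450 and not the actual contents of configure_networks.rb. `network_commands` is a hypothetical helper, and the sed expression and the /tmp path are assumptions made for illustration. Only the ifdown and ifup invocations mirror the command quoted earlier in this thread.

```ruby
# Hypothetical helper: returns the shell commands for one interface in the
# order they would run. Paths and the sed expression are assumptions.
def network_commands(interface, ifdown_first:)
  ifcfg  = "/etc/sysconfig/network-scripts/ifcfg-eth#{interface}"
  ifdown = "/sbin/ifdown eth#{interface} 2> /dev/null"
  strip  = "sed -i -e '/^#VAGRANT-BEGIN/,/^#VAGRANT-END/ d' #{ifcfg}"
  deploy = "cat /tmp/vagrant-network-entry_#{interface} >> #{ifcfg}"
  ifup   = "/sbin/ifup eth#{interface} 2> /dev/null"

  if ifdown_first
    # Proposed order: take the interface down while the old file still
    # describes it, then rewrite the file and bring the interface back up.
    [ifdown, strip, deploy, ifup]
  else
    # Order described in the debugging notes: the file is emptied first,
    # so ifdown no longer has a usable configuration to read.
    [strip, ifdown, deploy, ifup]
  end
end

puts network_commands(1, ifdown_first: true)
```

Keeping the ordering in one place like this makes it easy to compare the two sequences side by side before touching the guest.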

@mitchellh
Contributor

Fixed by #2450. Thanks!

@mitchellh mitchellh closed this Nov 23, 2013
@ghost ghost locked and limited conversation to collaborators Apr 12, 2020