This repository has been archived by the owner on Dec 15, 2020. It is now read-only.

Missing 'loss' field in pingbeat output #28

Open
project-poodle opened this issue Apr 30, 2017 · 8 comments

Comments

@project-poodle

In 1.0-beta, there was a 'loss:boolean' field that captured packet loss. This field seems to be absent in 5.4.

This field is quite useful when detecting network instability. Could this field be added back?

@joshuar
Owner

joshuar commented Apr 30, 2017

Hi @zach929, this field should still be there. Rather than adding loss: false to every good ping, pingbeat just adds loss: true to failed pings (along with the reason field which will eventually report what ICMP/network error was the cause). Are you no longer seeing loss where you previously saw it?
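
For anyone consuming these events downstream, the convention described here means a missing `loss` key should be read as a successful ping. A minimal Python sketch, assuming events are JSON objects using the field names pingbeat prints to the console; the helper name `is_loss` and the `"timeout"` reason value are made up for illustration:

```python
import json

def is_loss(event: dict) -> bool:
    """Treat an absent 'loss' key as a successful ping (loss = False)."""
    return bool(event.get("loss", False))

# A good ping carries 'rtt' and no 'loss' key at all.
good = json.loads('{"rtt": 1.389245, "target.addr": "8.8.8.8", "type": "pingbeat"}')
# A failed ping carries 'loss': true; the reason text here is hypothetical.
bad = json.loads(
    '{"loss": true, "reason": "timeout", '
    '"target.addr": "100.100.100.100", "type": "pingbeat"}'
)

print(is_loss(good), is_loss(bad))  # prints: False True
```

Queries and dashboards need the same default: filtering on `loss: false` will match nothing, because successful pings never set the field.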

@project-poodle
Author

Hi @joshuar, thanks for the reply. The 'loss: true' event was not generated in 5.4 during my test. The following is my pingbeat.yml:

pingbeat:
  # Defines how often a ping is sent to a target
  period: "5s"
  # Whether to send pings over IPv4
  useipv4: true
  # Whether to send pings over IPv6
  useipv6: false
  # How long to wait for a target to respond to a ping request
  timeout: "10s"
  targets:
    - name: "100.100.100.100"
    - name: "8.8.8.8"

output:
  console:
    pretty: true

The following is the output of the program:

[es5]$ sudo /usr/bin/pingbeat -e -c pingbeat.yml -d publish
2017/04/30 23:36:02.094281 beat.go:285: INFO Home path: [/usr/bin] Config path: [/usr/bin] Data path: [/usr/bin/data] Logs path: [/usr/bin/logs]
2017/04/30 23:36:02.094362 beat.go:186: INFO Setup Beat: pingbeat; Version: 5.4.0
2017/04/30 23:36:02.094456 outputs.go:108: INFO Activated console as output plugin.
2017/04/30 23:36:02.094493 publish.go:238: DBG  Create output worker
2017/04/30 23:36:02.094658 publish.go:280: DBG  No output is defined to store the topology. The server fields might not be filled.
2017/04/30 23:36:02.094745 publish.go:295: INFO Publisher name: es5
2017/04/30 23:36:02.094920 metrics.go:23: INFO Metrics logging every 30s
2017/04/30 23:36:02.095120 async.go:63: INFO Flush Interval set to: 1s
2017/04/30 23:36:02.095153 async.go:64: INFO Max Bulk Size set to: 2048
2017/04/30 23:36:02.095175 async.go:72: DBG  create bulk processing worker (interval=1s, bulk size=2048)
2017/04/30 23:36:02.095592 beat.go:221: INFO pingbeat start running.
2017/04/30 23:36:02.095616 pingbeat.go:71: INFO pingbeat is running! Hit CTRL-C to stop it.
2017/04/30 23:36:02.096336 pingbeat.go:97: INFO Using ip4:icmp connection
2017/04/30 23:36:07.098339 client.go:214: DBG  Publish: {
  "@timestamp": "2017-04-30T23:36:07.097Z",
  "beat": {
    "hostname": "es5",
    "name": "es5",
    "version": "5.4.0"
  },
  "rtt": 1.389245,
  "target.addr": "8.8.8.8",
  "target.name": "8.8.8.8",
  "target.tags": null,
  "type": "pingbeat"
}
2017/04/30 23:36:08.096155 output.go:109: DBG  output worker: publish 1 events
{
  "@timestamp": "2017-04-30T23:36:07.097Z",
  "beat": {
    "hostname": "es5",
    "name": "es5",
    "version": "5.4.0"
  },
  "rtt": 1.389245,
  "target.addr": "8.8.8.8",
  "target.name": "8.8.8.8",
  "target.tags": null,
  "type": "pingbeat"
}
2017/04/30 23:36:12.097673 client.go:214: DBG  Publish: {
  "@timestamp": "2017-04-30T23:36:12.097Z",
  "beat": {
    "hostname": "es5",
    "name": "es5",
    "version": "5.4.0"
  },
  "rtt": 1.210513,
  "target.addr": "8.8.8.8",
  "target.name": "8.8.8.8",
  "target.tags": null,
  "type": "pingbeat"
}
2017/04/30 23:36:13.095671 output.go:109: DBG  output worker: publish 1 events
{
  "@timestamp": "2017-04-30T23:36:12.097Z",
  "beat": {
    "hostname": "es5",
    "name": "es5",
    "version": "5.4.0"
  },
  "rtt": 1.210513,
  "target.addr": "8.8.8.8",
  "target.name": "8.8.8.8",
  "target.tags": null,
  "type": "pingbeat"
}
2017/04/30 23:36:17.098026 client.go:214: DBG  Publish: {
  "@timestamp": "2017-04-30T23:36:17.097Z",
  "beat": {
    "hostname": "es5",
    "name": "es5",
    "version": "5.4.0"
  },
  "rtt": 1.462431,
  "target.addr": "8.8.8.8",
  "target.name": "8.8.8.8",
  "target.tags": null,
  "type": "pingbeat"
}

100.100.100.100 is an obviously non-pingable address. In the output, only the 8.8.8.8 address generates ping events. No 'loss: true' event was generated for '100.100.100.100'.

@joshuar
Owner

joshuar commented May 2, 2017

Hi @zach929 okay, the loss processing is still there, but some refactoring of the code meant that some "loss" conditions were no longer being recorded. With 4f9c249:

  • Destination Unreachable errors are properly recorded as loss.
  • Where pingbeat fails to get a reply within a timeout, it is (again) treated as loss.

The default timeout is relatively low (10 x interval), simply because I originally wanted to keep memory usage low when a large number of targets is defined and a low interval is used. The timeout parameter can be set in the config as needed, and I may opt for a higher timeout.
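
To illustrate the trade-off between timeout and memory, here is a rough Python sketch of timeout-based loss tracking. It is not pingbeat's actual implementation (pingbeat is written in Go); the class `PendingPings`, its method names, and the `"timeout"` reason value are all hypothetical:

```python
import time

class PendingPings:
    """Track outstanding echo requests and expire them as loss after a timeout."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self._sent = {}  # (target, seq) -> send time in seconds

    def sent(self, target, seq, now=None):
        """Record that an echo request was sent."""
        self._sent[(target, seq)] = time.monotonic() if now is None else now

    def reply(self, target, seq, now=None):
        """Return the RTT in seconds, or None if the request is unknown or expired."""
        start = self._sent.pop((target, seq), None)
        if start is None:
            return None
        return (time.monotonic() if now is None else now) - start

    def expire(self, now=None):
        """Drop requests older than the timeout and return them as loss events."""
        now = time.monotonic() if now is None else now
        expired = [key for key, t in self._sent.items() if now - t >= self.timeout_s]
        for key in expired:
            del self._sent[key]
        return [
            {"target.addr": target, "loss": True, "reason": "timeout"}
            for target, _seq in expired
        ]

# Explicit timestamps keep the example deterministic.
pending = PendingPings(timeout_s=10.0)
pending.sent("8.8.8.8", 1, now=0.0)
pending.sent("100.100.100.100", 1, now=0.0)
rtt = pending.reply("8.8.8.8", 1, now=0.0014)
lost = pending.expire(now=10.0)
```

In a scheme like this, the pending table holds roughly one entry per target for every interval that fits inside the timeout, which is why a long timeout combined with many targets and a short interval costs memory.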

Can you try the master branch and see if it is better?

@jegade

jegade commented May 3, 2017

Hi @joshuar, can you build a dev release? I'm getting no 'loss: true' with the latest version.

@jegade

jegade commented May 3, 2017

I'm getting a lot of errors for unreachable hosts with the latest release:

2017/05/03 16:20:30.548142 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548147 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548151 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548457 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548510 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548521 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548531 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548541 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548560 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548570 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548580 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548589 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548599 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548611 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548620 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548629 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548638 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548649 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548660 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548670 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548680 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548690 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548707 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
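
To gauge how quickly these errors pile up, the lines can be grouped by second and message. A minimal Python sketch, assuming the log layout shown above (date, time with fractional seconds, source location, level, message); the regex and the function name `summarize` are illustrative only:

```python
import re
from collections import Counter

# date time.fraction source: LEVEL message
LINE = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\.\d+ (\S+): (\w+)\s+(.*)$")

def summarize(lines):
    """Count log lines per (second, level, message); unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match:
            second, _source, level, message = match.groups()
            counts[(second, level, message)] += 1
    return counts

sample = [
    "2017/05/03 16:20:30.548142 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout",
    "2017/05/03 16:20:30.548147 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout",
    "2017/05/03 16:20:30.548151 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout",
]
counts = summarize(sample)
for (second, level, message), n in counts.items():
    print(f"{second} {level} x{n}: {message}")
```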

@atomicom

I'm also experiencing a continuous stream of 'ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout' when an IP is unreachable - it filled up 7 log files in a second.

Running at release v5.4.0

@joshuar
Owner

joshuar commented May 18, 2017

@atomicom @jegade @zach929 looks like I really made a mess of that last release. Can you try 5.4.1: https://github.com/joshuar/pingbeat/releases/tag/v5.4.1

This should fix both tracking of loss and also stop any unnecessary error messages.

@jegade

jegade commented May 18, 2017

@joshuar much better, the losses are now tracked. Thank you.
