Crashing 5.0.0-alpha1 #1466

Closed
LoungeFlyZ opened this issue Apr 23, 2016 · 8 comments

@LoungeFlyZ

LoungeFlyZ commented Apr 23, 2016

I am hitting a problem where Filebeat 5.0.0-alpha1 crashes like this:

panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0xa0 pc=0x4bb6a2]

goroutine 25 [running]:
panic(0x9b5320, 0xc820018080)
/usr/local/go/src/runtime/panic.go:464 +0x3e6
github.com/elastic/beats/libbeat/common.ConvertToGenericEvent(0xc8204012c0, 0x7)
/go/src/github.com/elastic/beats/libbeat/common/event.go:47 +0x252
github.com/elastic/beats/libbeat/publisher.(*client).filterEvent(0xc820148bc0, 0xc8204012c0, 0xc820026e20)
/go/src/github.com/elastic/beats/libbeat/publisher/client.go:184 +0x30
github.com/elastic/beats/libbeat/publisher.(*client).PublishEvents(0xc820148bc0, 0xc8203c0c00, 0x7d, 0x7d, 0xc820419490, 0x2, 0x2, 0x0)
/go/src/github.com/elastic/beats/libbeat/publisher/client.go:134 +0xc6
github.com/elastic/beats/filebeat/beater.(*syncLogPublisher).Start.func1(0xc820149140)
/go/src/github.com/elastic/beats/filebeat/beater/publish.go:103 +0x3bb
created by github.com/elastic/beats/filebeat/beater.(*syncLogPublisher).Start
/go/src/github.com/elastic/beats/filebeat/beater/publish.go:113 +0x58

I am playing with 5.0.0-alpha1 for the JSON support. I am using the same filebeat.yml file as with 1.2 except with the added JSON section:

filebeat:

  prospectors:
    -
      paths:
        - "logs/*.*.log.*"

      document_type: antrea

      json:
        message_key: message
        keys_under_root: true
        add_error_key: true

output:
  elasticsearch:

    hosts: ["http://REMOVED:9200"]

    username: "REMOVED"
    password: "REMOVED"

    index: "filebeat"

logging:
  level: debug

Reproduced on OS X and Linux (debian:jessie).

Here are the log files I am testing with that reproduce this issue:
https://dl.dropboxusercontent.com/u/27238389/logs.zip

Any ideas?

@ruflin
Contributor

ruflin commented Apr 25, 2016

@ruflin ruflin added the libbeat label Apr 25, 2016
@ruflin
Contributor

ruflin commented Apr 25, 2016

@LoungeFlyZ Thanks for the report. Could you post the full panic you get? Sometimes there is more information further down about the exact line where it happens.

@monicasarbu Based on the panic I would think the error is filter related, but it seems like no filters are configured?

@monicasarbu
Contributor

The panic happens before the event is passed to the generic filtering when it's trying to convert the fields of the event to basic types. It might be due to a strange type that is found in the json object.
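As an illustration of how a "strange type" can arise: Go's `encoding/json` decodes a JSON `null` into a nil `interface{}` when the target is a `map[string]interface{}`, and `reflect.TypeOf` returns a nil `reflect.Type` for such a value. The sketch below only demonstrates that standard-library behaviour. `kindOf` and `decode` are hypothetical helpers, not the libbeat code at `event.go:47`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// kindOf is a hypothetical helper: it reports the reflect.Kind name of a
// decoded JSON value, guarding against the nil Type returned for JSON null.
func kindOf(v interface{}) string {
	t := reflect.TypeOf(v)
	if t == nil {
		// Without this guard, t.Kind() would be a method call on a nil
		// interface and would panic with a nil pointer dereference.
		return "nil"
	}
	return t.Kind().String()
}

// decode parses one JSON log line into a generic map.
func decode(line string) (map[string]interface{}, error) {
	var m map[string]interface{}
	err := json.Unmarshal([]byte(line), &m)
	return m, err
}

func main() {
	m, err := decode(`{"res": null, "level": "info", "response-time": 682}`)
	if err != nil {
		panic(err)
	}
	for _, k := range []string{"res", "level", "response-time"} {
		fmt.Printf("%s: %s\n", k, kindOf(m[k]))
	}
}
```

Here `res` is reported as `nil`, `level` as `string`, and `response-time` as `float64`, since JSON numbers decode to `float64` in a generic map.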

@LoungeFlyZ
Author

LoungeFlyZ commented Apr 25, 2016

@ruflin that's the full panic. There isn't anything logged after this. Before the panic, it logs entries like this:

2016/04/22 23:28:26.584227 client.go:193: DBG Publish: { "@timestamp": "2016-04-22T23:28:21.575Z", "beat": { "hostname": "antrea-filebeat-1", "name": "antrea-filebeat-1" }, "input_type": "log", "level": "info", "message": "Got auth result", "offset": 22325314, "source": "/antrea/logs/api.antrea-api-1.log.2016-04-22", "timestamp": "2016-04-22T22:58:30.931Z", "type": "antrea" }
2016-04-22T23:28:26.597244800Z panic: runtime error: invalid memory address or nil pointer dereference

Then the rest of the panic is listed.

If I understood correctly, you are right: I'm not using any filters.

@tsg tsg self-assigned this Apr 26, 2016
@tsg
Contributor

tsg commented Apr 26, 2016

@LoungeFlyZ thanks for the good bug report. This is an example line causing the panic:

{"method":"GET","path":"/auth/authorize","headers":{"content-type":"application/json","content-length":4,"location":"http://devhost:3000/auth/return?backend=http%3A%2F%2Fdevhost%3A4000?code=2f4818fd-a396-4150-a96b-3ee430ebef17","access-control-allow-origin":"*","access-control-allow-headers":"Accept, Accept-Version, Content-Length, Content-MD5, Content-Type, Date, Api-Version, Response-Time","access-control-allow-methods":"GET","access-control-expose-headers":"Api-Version, Request-Id, Response-Time","connection":"Keep-Alive","content-md5":"N6YlnMDB2uKZp4Zkid/wvQ==","date":"Fri, 04 Mar 2016 00:34:16 GMT","server":"hypeapi","api-version":"1.0.0","request-id":"0f4b264f-c259-4fc3-bd23-facf4021c0be","response-time":682},"res":null,"level":"info","message":"Sent response: ","timestamp":"2016-03-04T00:34:16.846Z"}

More precisely, it's the "res": null key that we need to handle better.
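One way a conversion step could tolerate a null value is to handle it as an explicit case instead of inspecting its type. The following is a minimal sketch under that assumption. `normalize` is a hypothetical function written for this example and is not the fix that was merged into libbeat.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalize is a hypothetical stand-in for a "convert to basic types" step.
// It walks a decoded JSON value and passes nil (JSON null) through untouched.
func normalize(v interface{}) (interface{}, error) {
	switch x := v.(type) {
	case nil:
		// JSON null: nothing to convert, keep it as nil.
		return nil, nil
	case string, bool, float64:
		return x, nil
	case map[string]interface{}:
		out := make(map[string]interface{}, len(x))
		for k, val := range x {
			n, err := normalize(val)
			if err != nil {
				return nil, fmt.Errorf("key %q: %v", k, err)
			}
			out[k] = n
		}
		return out, nil
	case []interface{}:
		out := make([]interface{}, len(x))
		for i, val := range x {
			n, err := normalize(val)
			if err != nil {
				return nil, fmt.Errorf("index %d: %v", i, err)
			}
			out[i] = n
		}
		return out, nil
	default:
		// Anything else is reported as an error rather than crashing.
		return nil, fmt.Errorf("unsupported type %T", v)
	}
}

func main() {
	// Shortened version of the offending line from this issue.
	line := `{"method":"GET","headers":{"content-length":4},"res":null,"level":"info"}`
	var event interface{}
	if err := json.Unmarshal([]byte(line), &event); err != nil {
		panic(err)
	}
	out, err := normalize(event)
	fmt.Println(out, err)
}
```

Returning an error for unsupported types, rather than panicking, lets the caller decide whether to drop the event or attach an error field to it.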

tsg pushed a commit to tsg/beats that referenced this issue Apr 26, 2016
The cause was a nil value in the incoming JSON, which the generic
filtering code didn't expect.

I considered adding a `recover` to generic
filtering so that things like this don't crash the whole process, but decided
against it. One reason is that it's better to discover these things while we're
still in alpha/beta. Second is that if we recover here, there could still be a
crash later in filtering or outputs.

Also took the opportunity to add a couple of system tests that combine json
and generic filtering.
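
For context on the `recover` option that the commit message weighs and rejects, here is a minimal sketch of what a per-event recover could look like. `safeConvert` and `fragileConvert` are hypothetical names made up for this example, and nothing here is taken from the libbeat source.

```go
package main

import (
	"fmt"
	"reflect"
)

// fragileConvert is a hypothetical conversion step that panics on a nil
// value, because reflect.TypeOf(nil) returns a nil Type.
func fragileConvert(event map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(event))
	for k, v := range event {
		out[k] = reflect.TypeOf(v).Kind().String()
	}
	return out
}

// safeConvert runs convert on one event and turns a panic into an error,
// so a single bad event does not take down the whole process.
func safeConvert(
	convert func(map[string]interface{}) map[string]interface{},
	event map[string]interface{},
) (out map[string]interface{}, err error) {
	defer func() {
		if r := recover(); r != nil {
			out = nil
			err = fmt.Errorf("event dropped, conversion panicked: %v", r)
		}
	}()
	return convert(event), nil
}

func main() {
	_, err := safeConvert(fragileConvert, map[string]interface{}{"res": nil})
	fmt.Println("error:", err)
}
```

The sketch also shows the trade-off named above: the panic is contained for this one stage only, the underlying bug stays hidden, and a later stage that receives the same nil value can still crash.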
ruflin pushed a commit that referenced this issue Apr 26, 2016
@ruflin
Contributor

ruflin commented Apr 27, 2016

@tsg I assume this can be closed as #1489 was merged?

@tsg
Contributor

tsg commented Apr 27, 2016

That's right, thanks.

@tsg tsg closed this as completed Apr 27, 2016
@LoungeFlyZ
Author

Thanks everyone for your help with this! I have the nightly running fine now. Much appreciated!
