High latencies #346

Closed
ecktom opened this issue Jan 22, 2020 · 4 comments · Fixed by #348

ecktom (Contributor) commented Jan 22, 2020

Describe the bug

Started using Oathkeeper and was quite surprised about its somewhat high latencies. I'm using a simple forward rule (AllowAuthorizer / NoopAuthenticator / NoopMutator):

    - id: some-rule
      match:
        url: http://some-url/api/<.*>
        methods:
        - POST
        - GET
        - PATCH
        - PUT
      authenticators:
      - handler: noop
        config: {}
      authorizer:
        handler: allow
        config: {}
      mutators:
      - handler: noop
        config: {}

It takes around ~80ms for a request to be processed. I did some CPU profiling to check what happens. It turns out that https://github.com/ory/oathkeeper/blame/master/driver/configuration/provider_viper.go#L200 is called for every request, once for each authenticator, authorizer, and mutator. This calls into the viper stack to compute the config, which then gets validated against the schema.

I did some basic three-line caching there to prove my point and brought latency down to ~5ms immediately (with all the schema validation still in place).
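
To illustrate the idea, here is a hypothetical sketch (made-up names, not my actual patch to provider_viper.go): a naive memoization of the merged and validated pipeline config that never invalidates, so it is a proof of concept only.

    package configuration

    import (
        "encoding/json"
        "sync"
    )

    var (
        pipelineCacheMu sync.RWMutex
        pipelineCache   = map[string]json.RawMessage{}
    )

    // cachedPipelineConfig memoizes the expensive "merge via viper + validate
    // against the JSON schema" step per (prefix, id, override). compute stands
    // in for the original, expensive code path.
    func cachedPipelineConfig(prefix, id string, override json.RawMessage, compute func() (json.RawMessage, error)) (json.RawMessage, error) {
        key := prefix + "." + id + "." + string(override)

        pipelineCacheMu.RLock()
        raw, ok := pipelineCache[key]
        pipelineCacheMu.RUnlock()
        if ok {
            return raw, nil // cache hit: no viper merge, no schema validation
        }

        raw, err := compute() // the expensive path that shows up in the CPU profile
        if err != nil {
            return nil, err
        }

        pipelineCacheMu.Lock()
        pipelineCache[key] = raw
        pipelineCacheMu.Unlock()
        return raw, nil
    }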

CPU profile

[pprof screenshot attachment: pprof001]

Server configuration

    log:
      level: debug
      format: json
    profiling: cpu
    access_rules:
      repositories:
      - file:///etc/oathkeeper/rules.yaml
    authenticators:
      jwt:
        enabled: true
        config:
          jwks_urls:
          - https://some-url/jwks.json
      noop:
        enabled: true
        config: {}
    authorizers:
      allow:
        enabled: true
        config: {}
      deny:
        enabled: true
        config: {}
      keto_engine_acp_ory:
        enabled: true
        config:
          base_url: http://keto-url:4466/
          required_action: ""
          required_resource: ""
    mutators:
      noop:
        enabled: true
        config: {}
      header:
        enabled: true
        config:
          headers:
            X-SOME-HEADER: '{{ print .Subject }}'

Expected behavior

I would have expected some kind of caching here ;) I like the Prometheus approach where you have to call an API endpoint in order to reload the config.

Environment

  • Version: v0.35.1
  • Environment: Docker -> K8s

aeneasr (Member) commented Jan 22, 2020

Thank you for this amazing write-up! We've heard complaints in recent days about slow performance when little CPU (e.g. 0.1 units) is available. This is most likely the root cause of that!

This should indeed be addressed with a caching mechanism. I think we should add a method to viper which returns the timestamp (nanoseconds) of the most recent config update. Based on that, we can invalidate and recompute the caches. We'd appreciate it if you would be able to help out there, seeing as you've already started working on this. Would you be open to that? If so, I can give some pointers and hints as to what needs to be changed in our viper fork.

aeneasr added this to the v0.35.2 milestone Jan 22, 2020
aeneasr added the "bug" label Jan 22, 2020
aeneasr self-assigned this Jan 22, 2020

ecktom (Contributor, Author) commented Jan 22, 2020

Sure ;) Some pointers and hints would be highly appreciated, and I will try to find some spare time for that.

aeneasr (Member) commented Jan 22, 2020

Ok, so generally speaking we want to cache the calls made here:

https://github.com/ory/oathkeeper/blob/master/driver/configuration/provider_viper.go#L201

We can do that by computing a hash (CRC32 is probably enough, but I would feel more comfortable with a 64, 128, or 160 bit hash) of append([]byte{}, prefix, id, override, viper.ConfigTimestamp().UnixNano()) and storing the result in a map[string]json.RawMessage. Alternatively we could use a map[string]interface{}, although that would be quite risky: it would mean overriding the dest value with the same pointer, so some handler could modify the destination of that pointer and mess everything up. I therefore think map[string]json.RawMessage is the safer bet here. Don't forget to add a mutex lock to prevent concurrent reads/writes.
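
A rough sketch of that cache (illustrative names only, not actual oathkeeper or viper code; the config timestamp is passed in as a plain time.Time because the viper getter isn't settled yet):

    package configuration

    import (
        "crypto/sha1"
        "encoding/hex"
        "encoding/json"
        "strconv"
        "sync"
        "time"
    )

    // pipelineConfigCache stores the merged and schema-validated pipeline
    // config keyed by a hash of (prefix, id, override, config timestamp).
    // Because the timestamp is part of the key, a config reload automatically
    // makes all old entries unreachable (evicting them on reload would avoid
    // the small memory leak).
    type pipelineConfigCache struct {
        sync.RWMutex
        entries map[string]json.RawMessage
    }

    // cacheKey builds a 160-bit (SHA-1) key; CRC32 or a 64-bit hash would
    // work just as well.
    func cacheKey(prefix, id string, override json.RawMessage, configChangedAt time.Time) string {
        h := sha1.New()
        h.Write([]byte(prefix))
        h.Write([]byte(id))
        h.Write(override)
        h.Write([]byte(strconv.FormatInt(configChangedAt.UnixNano(), 10)))
        return hex.EncodeToString(h.Sum(nil))
    }

    func (c *pipelineConfigCache) get(key string) (json.RawMessage, bool) {
        c.RLock()
        defer c.RUnlock()
        raw, ok := c.entries[key]
        return raw, ok
    }

    func (c *pipelineConfigCache) set(key string, raw json.RawMessage) {
        c.Lock()
        defer c.Unlock()
        if c.entries == nil {
            c.entries = map[string]json.RawMessage{}
        }
        c.entries[key] = raw
    }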

As far as our viper fork is concerned, adding a configChangedAt time.Time field to the Viper struct is probably the way to go. It would be initialized in the New() call and updated somewhere in WatchConfig or (probably a better idea) in ReadInConfig.

It would make a lot of sense to add a small test for that behavior in viper itself, checking if the timestamp actually updates when you update the config file.

It would also make sense to add a function that exposes the configChangedAt value (e.g. ConfigChangedAt() time.Time).
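
In sketch form, the viper side could look like this (heavily abbreviated and hypothetical; the real Viper struct, New() and ReadInConfig() obviously contain much more):

    package viper

    import "time"

    // Viper is abbreviated here to the one new field discussed above.
    type Viper struct {
        // ... existing fields elided ...
        configChangedAt time.Time
    }

    func New() *Viper {
        v := &Viper{}
        // ... existing initialization elided ...
        v.configChangedAt = time.Now()
        return v
    }

    // ConfigChangedAt returns the time of the most recent successful config load.
    func (v *Viper) ConfigChangedAt() time.Time {
        return v.configChangedAt
    }

    func (v *Viper) ReadInConfig() error {
        // ... existing config file reading and unmarshalling elided ...
        v.configChangedAt = time.Now() // update on every (re)load
        return nil
    }

A test in viper itself could then load the config, rewrite the config file, call ReadInConfig() again, and assert that ConfigChangedAt() moved forward.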

As far as oathkeeper goes, a test in ./driver/configuration would be a good idea - it should test that the caching works as expected. If you're really cool, it would be great to add a small benchmark test as well!
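
A benchmark sketch for ./driver/configuration could look like this (the pipelineConfigurator interface and the exact PipelineConfig signature are assumptions inferred from this thread, not copied from the code):

    package configuration

    import (
        "encoding/json"
        "testing"
    )

    // pipelineConfigurator is the minimal surface the benchmark needs; the
    // real provider is assumed to expose something close to this.
    type pipelineConfigurator interface {
        PipelineConfig(prefix, id string, override json.RawMessage, dest interface{}) error
    }

    // benchmarkPipelineConfig measures the per-request cost of resolving a
    // pipeline config. Run it once against the uncached provider and once
    // against the cached one to quantify the win.
    func benchmarkPipelineConfig(b *testing.B, p pipelineConfigurator) {
        var dest map[string]interface{}
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            if err := p.PipelineConfig("authenticators", "noop", nil, &dest); err != nil {
                b.Fatal(err)
            }
        }
    }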

aeneasr (Member) commented Jan 22, 2020

Actually, I think it might be possible to create a copy of the memory the map[string]interface{} points at; then it might be safe to set the pointer of dest to that copy. This should probably be tested by calling PipelineConfig twice, modifying dest after the first call, and expecting to see the original value after the second call. This will most likely not work well with raw pointer copies (e.g. *a = *b) because any pointers within the struct will still point to the same memory address. Maybe something like DeepCopy could be of help here. It would be interesting to see whether it's faster to deep copy the dest or to decode the JSON value (I assume deep copy is faster).
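
The aliasing test could be sketched like this (again hypothetical, reusing the assumed pipelineConfigurator interface from the benchmark sketch above):

    package configuration

    import "testing"

    // testPipelineConfigDoesNotAlias mutates the config handed to the first
    // caller and asserts that a second caller still sees the original value.
    // A cache that hands out the same map (or a shallow *a = *b copy of a
    // struct containing it) fails this test, because the underlying memory
    // is shared.
    func testPipelineConfigDoesNotAlias(t *testing.T, p pipelineConfigurator) {
        var first map[string]interface{}
        if err := p.PipelineConfig("mutators", "header", nil, &first); err != nil {
            t.Fatal(err)
        }

        // Tamper with the value handed to the first caller.
        first["headers"] = "tampered"

        var second map[string]interface{}
        if err := p.PipelineConfig("mutators", "header", nil, &second); err != nil {
            t.Fatal(err)
        }

        if v, ok := second["headers"].(string); ok && v == "tampered" {
            t.Fatal("cached pipeline config is aliased between callers")
        }
    }

For the deep-copy-versus-redecode question, a small benchmark comparing both approaches on a representative pipeline config would settle it.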
