Allow registering events to timelines using regexes #4375

tadeboro · 2018-07-29T19:18:37Z

Up until now, in order to register event type to the timeline, we needed to list all of the event types in settings yaml.

This commit adds another option: specify event type to group/level mapping using regular expressions, which should make mapping definitions much shorter for providers with many different event types
that follow a certain pattern.

Changes were made in two areas: in database querying and in event to category assignment.

When creating database condition for events, we now use three sources:

a set of event types that should be included in result,
a set of event types that should not be present in result and
a set of regular expressions that event types can match.

An event is part of the result if its type is in include set or matches any of the regular expressions and is not part of the exclude set. For example, take this fragment of settings file:

:ems:
  :ems_dummy:
    :event_handling:
      :event_groups:
        :addition:
          :critical:
          - !ruby/regexp /^dummy_add_.+$/
        :update:
          :critical:
          - dummy_123_update
          - !ruby/regexp /^dummy_.+_update$/
          :detail:
          - dummy_abc_update
          - dummy_def_update
          - !ruby/regexp /^dummy_detail_.+_update$/

This translates into following three sources for :update category, assuming user does not want to display details:

include_set: dummy_123_update
exclude_set: dummy_abc_update, dummy_def_update
regexes: ^dummy_.+_updates$

In this case, events of type dummy_abc_update will not be displayed, since they match details and are thus put in exclude set. On the other hand, if user does want to see the details, things are a bit
different:

include_set: dummy_123_update, dummy_abc_update, dummy_def_update
exclude_set:
regexes: ^dummy_.+_updates$, ^dummy_detail_.+_update$

All this is done in order to preserve the existing behavior.

Category assignment is done in two phases. First, we try to assign category only looking at the full name of event. If this yields no category, we try to assign one by matching against regular expressions. If we continue with categories from example above, we would get those mappings:

dummy_123_update is mapped to :update. It also matches :addition category, but since explicit names have higher priority, that regular expression is not checked at all.
dummy_nil_update is mapped to :update because it matches one of the :critical expressions.
dummy_add_event is mapped to :addition.

This pull depends on ManageIQ/manageiq#17772.

@miq-bot add_label wip

/cc @miha-plesko @gberginc

miha-plesko · 2018-07-30T06:05:10Z

BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1562414

miha-plesko

Generally looks great, only two small suggestions

miha-plesko · 2018-07-30T07:13:55Z

app/controllers/application_controller/timelines.rb

-      params2 = where_clause.slice(1, where_clause.length - 1)
-      params = params.concat(params2)
-      @report.where_clause = [cond, *params]
+      condition = "(" + conditions.reduce { |a, c| a + ") and (" + c } + ")"


A matter of preference, but I prefer meaningful argument names to a, b, c as well as string interpolation to suming up strings. For example:

.reduce { |cond, res| "(#{cond}) and #{res}" }

I think giving variables a longer names would be counter-productive here (like naming iterator variable in C for loop iterator), since their whole scope is a single line. Besides, using a for accumulator seems to be a standard practice in such short reduce statements.

As for the implementation of reduce operation, I can use interpolation. I have no preference whatsoever.

miha-plesko · 2018-07-30T07:25:13Z

app/controllers/application_controller/timelines.rb

-                        to_dt]
-      end
-      cond << where_clause[0]
+      tl_add_event_type_conditions(conditions, parameters)


We could rename method to tl_add_event_type_conditions! to emphasise it modifies arguments inplace

I did not feel the need to add ! at the end of the method name, since there is no alternative method that would not modify its arguments. But I can add it if that would make things better.

miq-bot · 2018-07-30T19:07:46Z

This pull request is not mergeable. Please rebase and repush.

tadeboro · 2018-08-10T14:26:16Z

@miq-bot remove_label wip

Fryguy · 2018-08-10T15:12:12Z

app/controllers/application_controller/timelines.rb

@@ -102,42 +102,54 @@ def tl_build_timeline_report_options
      end

      temp_clause = @tl_record.event_where_clause(@tl_options.evt_type)
+      conditions = [temp_clause[0], "timestamp >= ?", "timestamp <= ?"]
+      parameters = temp_clause[1..-1] + [from_dt, to_dt]


Minor, but instead of temp_clause[0] and temp_clause[1], for readability let's give them names, and set them with decomposition, so maybe...

conditions, *parameters = @tl_record.event_where_clause(@tl_options.evt_type) conditions += ["timestamp >= ?", "timestamp <= ?"] parameters += [from_dt, to_dt]

Fryguy · 2018-08-10T15:13:46Z

app/controllers/application_controller/timelines.rb

-      params2 = where_clause.slice(1, where_clause.length - 1)
-      params = params.concat(params2)
-      @report.where_clause = [cond, *params]
+      condition = "(" + conditions.reduce { |a, c| "#{a}) and (#{c}" } + ")"


I really don't like the SQL building...there has to be a way to leverage ActiveRecord relation chains.

Though I do recognize that the original code also did some level of building SQL as well, which is also not good.

I will dig deeper and determine what report does with that SQL where clause.

I looked and report is just an MiqReport instance... MiqReport#where_clause is a string column that must get used later. I'm fine with this for now, considering your PR gets the SQL building down to basically just 2 lines. I still think there;s gotta be a way to build it using ActveRecord though :)

Getting closer... 😄

query = @report.db.constantize.where("timestamp >= ?", from_dt).where("timestamp <= ?", to_dt) query.arel.where_sql # => "WHERE (timestamp >= '2018-08-05 15:55:43.491733') AND (timestamp <= '2018-08-10 15:55:43.492248')"

Fryguy · 2018-08-10T15:16:14Z

app/controllers/application_controller/timelines.rb

+  def tl_add_event_type_inclusions(conditions, parameters)
+    expressions = []
+    @tl_options.get_set(:regexes).each do |regex|
+      expressions << "event_type ~ ?"


This assumes that all Ruby regexps can be SQL regexps, which I don't believe is true. But maybe that's acceptable?

PostgreSQL's support is quite powerful. I will compare the capabilities of the Ruby's vs. PostgreSQL's regular expressions and report back. If nothing else, we can document the limitations somewhere and inform settings authors what can and what cannot be used.

I just checked this and regular expressions that PostgreSQL accepts are a subset of those that Ruby accepts. Maybe simply documenting that regular expressions in settings.yml must be valid PostgreSQL expressions would be enough?

Two things...

If possible, we may be able to ensure this via specs. That is, we can create a spec in core that goes through all event_groups definitions, and 1-by-1 and makes a query. I would expect it to blow up for invalid regexes. Then we can inform the developer that they've built a regex that cannot be used via SQL. So somethig like...

describe EventStream do describe ".event_groups" do it "regexes must be usable in SQL queries" do ... strings, regexes = group_data[lvl].partition { |typ| typ.kind_of?(String) } ... regexes.each do |r| expect { EventStream.where("event_type ~ ?", r.source).to_a }.to_not raise_error end end end end

In writing the above, I found that you cannot just embed a regex directly as a parameter, so the code as is will likely not work. That is, this code does not work:

EventStream.where("event_type ~ ?", /\w+/).first # => TypeError: can't quote Regexp

Instead, ActiveRecord says you have to give a String. I found that calling .source gives the guts of the Regexp but ignores the modifiers. I think ignoring the modifiers might be ok for a first pass, but we may have to write our own Ruby Regexp to PostgreSQL regexp converter to deal with the modifiers. In particular, case insensitivity might be important (though PostgreSQL can also handle that with an alternate operator": ~*). So, anyway, this works:

EventStream.where("event_type ~ ?", /^\w+/.source).first # EventStream Load (0.7ms) SELECT "event_streams".* FROM "event_streams" WHERE (event_type ~ '^\w+') ORDER BY "event_streams"."id" ASC LIMIT $1 [["LIMIT", 1]] # EventStream Inst Including Associations (68.0ms - 1rows) # => #<MiqEvent:0x00007f83bbbb8118 id: 21000000000001, event_type: "login_failed", ... >

Where should I put the specs: in the core repo where the EventStream lives or in the UI repo?

As for the regexp modifiers, Ruby currently supports four, but the only one that would be of any use in our case is i that controls case sensitivity. And adding support for this option is quite easy, we simply need to use proper PostgreSQL operator as you already stated.

I think the "check the event_groups spec" should be in the core repo. Ultimately it should be in the provider repos so that provider authors get a failure when they add it, but I'm not sure how to do that. Maybe we can have the spec in core, but run it in the provider repos somehow?

Fryguy · 2018-08-10T15:36:29Z

app/controllers/application_controller/timelines.rb

+      parameters << includes
+    end
+
+    condition = expressions.reduce { |a, e| a + ") or (" + e }


Minor, but isn't this just expressions.join(") or (")?

Same for the above reduce with ") and ("

Fryguy · 2018-08-10T15:37:25Z

app/controllers/application_controller/timelines.rb

+    end
+
+    condition = expressions.reduce { |a, e| a + ") or (" + e }
+    conditions << "(" + condition + ")" unless condition.nil?


Prefer interpolation... conditions << "(#{condition})" if condition

Fryguy · 2018-08-10T15:39:07Z

app/controllers/application_controller/timelines/options.rb

+        }
+
+        group_levels.each do |lvl|
+          parts = group_data[lvl]&.partition { |typ| typ.kind_of?(String) }


Give the parts names, so that the later parts[0], parts[1] are readable...

strings, regexes = group_data[lvl]&.partition { |typ| typ.kind_of?(String) }

I thought a, b = nil would break, but I was wrong.

Fryguy · 2018-08-10T15:40:18Z

app/controllers/application_controller/timelines/options.rb

+    end
+
+    def get_set(name)
+      categories.values.reduce([]) { |a, v| a.push(*v[name]) }


Prefer categories.values.flat_map { |v| v["name"] }

Fryguy · 2018-08-10T15:40:56Z

lib/report_formatter/timeline.rb

+          end&.last.try(:[], :display_name)
+
+          bucket_name ||= mri.rpt_options[:categories].find do |_, options|
+            options[:regexes].find { |regex| regex.match(event.event_type) }


As mentioned in the other PR, prefer detect.

Fryguy · 2018-08-14T15:20:29Z

app/controllers/application_controller/timelines.rb

+  def tl_add_event_type_inclusions(conditions, parameters)
+    expressions = []
+    @tl_options.get_set(:regexes).each do |regex|
+      expressions << "event_type ~ ?"


Two things...

If possible, we may be able to ensure this via specs. That is, we can create a spec in core that goes through all event_groups definitions, and 1-by-1 and makes a query. I would expect it to blow up for invalid regexes. Then we can inform the developer that they've built a regex that cannot be used via SQL. So somethig like...

describe EventStream do describe ".event_groups" do it "regexes must be usable in SQL queries" do ... strings, regexes = group_data[lvl].partition { |typ| typ.kind_of?(String) } ... regexes.each do |r| expect { EventStream.where("event_type ~ ?", r.source).to_a }.to_not raise_error end end end end

In writing the above, I found that you cannot just embed a regex directly as a parameter, so the code as is will likely not work. That is, this code does not work:

EventStream.where("event_type ~ ?", /\w+/).first # => TypeError: can't quote Regexp

Instead, ActiveRecord says you have to give a String. I found that calling .source gives the guts of the Regexp but ignores the modifiers. I think ignoring the modifiers might be ok for a first pass, but we may have to write our own Ruby Regexp to PostgreSQL regexp converter to deal with the modifiers. In particular, case insensitivity might be important (though PostgreSQL can also handle that with an alternate operator": ~*). So, anyway, this works:

EventStream.where("event_type ~ ?", /^\w+/.source).first # EventStream Load (0.7ms) SELECT "event_streams".* FROM "event_streams" WHERE (event_type ~ '^\w+') ORDER BY "event_streams"."id" ASC LIMIT $1 [["LIMIT", 1]] # EventStream Inst Including Associations (68.0ms - 1rows) # => #<MiqEvent:0x00007f83bbbb8118 id: 21000000000001, event_type: "login_failed", ... >

tadeboro · 2018-08-15T21:32:15Z

I created new pull request ManageIQ/manageiq#17867 that adds tests for event type mapping. Tests should catch most of the issues that provider authors introduce into codebase. Unfortunately, those tests are blacklisted in provider repos, so the provider authors will need to manually run those tests for now.

tadeboro · 2018-08-15T21:39:49Z

As for the regular expression options, I added support for the case insensitive expressions, since those may be useful for provider authors and they do not introduce much complexity. Support for case sensitivity option is already present in event type mapping tests.

Fryguy · 2018-08-16T19:01:30Z

LGTM, need someone from UI team to review.

skateman · 2018-08-17T06:31:39Z

@tadeboro as far as I see you're doing some refactoring before adding the regexp support. Could you please split this up into at least two separate commits so I can untangle it a little bit easier? Thanks...

We construct filter in two steps: 1. we collect the conditions and 2. join them using and operator. This reduces the amount of string manipulation and conditional statements considerably.

Up until now, in order to register event type to the timeline, we needed to list all of the event types in settings yaml. This commit adds another option: specify event type to group/level mapping using regular expressions, which should make mapping definitions much shorter for providers with many different event types that follow a certain pattern. Changes were made in to areas: in database querying and in event to category assignment. When creating database condition for events, we now use three sources: 1. a set of event types that should be included in result, 2. a set of event types that should not be present in result and 3. a set of regular expressions that event types can match. An event is part of the result if its type is in include set or matches any of the regular expressions and is not part of the exclude set. For example, take this fragment of settings file: :ems: :ems_dummy: :event_handling: :event_groups: :addition: :critical: - !ruby/regexp /^dummy_add_.+$/ :update: :critical: - dummy_123_update - !ruby/regexp /^dummy_.+_update$/i :detail: - dummy_abc_update - dummy_def_update - !ruby/regexp /^dummy_detail_.+_update$/ This translates into following three sources for :update category, assuming user selected :critical level in UI: * include_set: dummy_123_update * exclude_set: dummy_abc_update, dummy_def_update * regexes: ^dummy_.+_updates$ In this case, events of type dummy_abc_update will not be displayed, since they match details and are thus put in exclude set. On the other hand, if the user selected :critical and :detail levels, things are a bit different: * include_set: dummy_123_update, dummy_abc_update, dummy_def_update * exclude_set: * regexes: ^dummy_.+_updates$, ^dummy_detail_.+_update$ Regular expressions should be accepted by the PostreSQL DBMS and should not contain Ruby-specific functionality. The only exception to this rule is case sensitivity, which can make regular expressions a lot simpler in certain situations. Category assignment is done in two phases. First, we try to assign category only looking at the full name of event. If this yields no category, we try to assign one by matching against regular expressions. If we continue with categories from example above, we would get those mappings: * dummy_123_update is mapped to :update. It also matches :addition category, but since explicit names have higher priority, that regular expression is not checked at all. * dummy_nil_update is mapped to :update because it matches one of the :critical expressions. * dummy_add_event is mapped to :addition. All this is done in order to give fully spelled-out event type mappings higher priority than regular expression ones and to be consistent with how EventStream classifies events into groups and levels.

miq-bot · 2018-08-17T08:09:03Z

Some comments on commits xlab-si/manageiq-ui-classic@5e54ac2~...d961141

spec/lib/report_formater/timeline_spec.rb

⚠️ - 238 - Detected allow_any_instance_of. This RSpec method is highly discouraged, please only use when absolutely necessary.
⚠️ - 246 - Detected allow_any_instance_of. This RSpec method is highly discouraged, please only use when absolutely necessary.
⚠️ - 253 - Detected allow_any_instance_of. This RSpec method is highly discouraged, please only use when absolutely necessary.

miq-bot · 2018-08-17T08:09:14Z

Checked commits xlab-si/manageiq-ui-classic@5e54ac2~...d961141 with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
5 files checked, 0 offenses detected
Everything looks fine. 🍪

tadeboro · 2018-08-17T08:15:51Z

@skateman Any refactoring was just a side-effect of trying to get the thing working;) But you are right, the changes indeed do look like they should be in two different commits.

skateman · 2018-08-17T12:32:44Z

@tadeboro can we schedule a meeting where you show me this in action?

skateman

miq-bot changed the title ~~Allow registering events to timelines using regexes~~ [WIP] Allow registering events to timelines using regexes Jul 29, 2018

miq-bot added the wip label Jul 29, 2018

tadeboro mentioned this pull request Jul 29, 2018

Assign Nuage event types to timeline categories ManageIQ/manageiq-providers-nuage#119

Merged

tadeboro force-pushed the regex-timelines branch 2 times, most recently from 4c2c2c6 to 3bfbed0 Compare July 29, 2018 20:34

miha-plesko reviewed Jul 30, 2018

View reviewed changes

miq-bot added the unmergeable label Jul 30, 2018

tadeboro force-pushed the regex-timelines branch from 3bfbed0 to c2489d9 Compare August 6, 2018 07:02

tadeboro mentioned this pull request Aug 6, 2018

[PROPOSAL] Per-provider Event Timelines With Regex Support ManageIQ/manageiq#17295

Closed

tadeboro force-pushed the regex-timelines branch from c2489d9 to 61bd71c Compare August 6, 2018 08:11

miq-bot removed the unmergeable label Aug 6, 2018

tadeboro force-pushed the regex-timelines branch 3 times, most recently from 4d054b4 to 8d59450 Compare August 10, 2018 04:11

miq-bot changed the title ~~[WIP] Allow registering events to timelines using regexes~~ Allow registering events to timelines using regexes Aug 10, 2018

miq-bot removed the wip label Aug 10, 2018

Fryguy reviewed Aug 10, 2018

View reviewed changes

tadeboro force-pushed the regex-timelines branch 3 times, most recently from b44f2a3 to f6a6c75 Compare August 13, 2018 11:45

Fryguy requested changes Aug 14, 2018

View reviewed changes

tadeboro force-pushed the regex-timelines branch from f6a6c75 to 8707749 Compare August 15, 2018 21:11

tadeboro force-pushed the regex-timelines branch from 8707749 to 18b2fa7 Compare August 15, 2018 21:36

tadeboro force-pushed the regex-timelines branch from 18b2fa7 to 2a2fffa Compare August 16, 2018 06:24

Fryguy approved these changes Aug 16, 2018

View reviewed changes

dclarizio added enhancement timelines gaprindashvili/no labels Aug 16, 2018

dclarizio requested a review from skateman August 16, 2018 21:16

Tadej Borovšak added 2 commits August 17, 2018 10:02

Refactor event type SQL filter construction

5e54ac2

We construct filter in two steps: 1. we collect the conditions and 2. join them using and operator. This reduces the amount of string manipulation and conditional statements considerably.

tadeboro force-pushed the regex-timelines branch from 2a2fffa to d961141 Compare August 17, 2018 08:06

skateman approved these changes Aug 20, 2018

View reviewed changes

mzazrivec self-assigned this Aug 20, 2018

mzazrivec added this to the Sprint 93 Ending Aug 27, 2018 milestone Aug 20, 2018

mzazrivec merged commit 8c82b71 into ManageIQ:master Aug 20, 2018

tadeboro deleted the regex-timelines branch August 29, 2018 13:13

Allow registering events to timelines using regexes #4375

Allow registering events to timelines using regexes #4375

Conversation

tadeboro commented Jul 29, 2018 • edited Loading

miha-plesko commented Jul 30, 2018

miha-plesko left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

miq-bot commented Jul 30, 2018

tadeboro commented Aug 10, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Fryguy Aug 10, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Fryguy Aug 10, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tadeboro commented Aug 15, 2018

tadeboro commented Aug 15, 2018

Fryguy commented Aug 16, 2018

skateman commented Aug 17, 2018 • edited Loading

miq-bot commented Aug 17, 2018

miq-bot commented Aug 17, 2018

tadeboro commented Aug 17, 2018

skateman commented Aug 17, 2018

skateman left a comment

Choose a reason for hiding this comment

tadeboro commented Jul 29, 2018 •

edited

Loading

Fryguy Aug 10, 2018 •

edited

Loading

Fryguy Aug 10, 2018 •

edited

Loading

skateman commented Aug 17, 2018 •

edited

Loading