Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add exception callback to handle async errors #318

Closed
wants to merge 6 commits into from

Conversation

csokol
Copy link
Contributor

@csokol csokol commented May 4, 2017

This enables the client to do something (other than logging) if a problem happens asynchronously. In our use case, we want to send the error to honeybadger.io to be notified of influxdb issues.

@@ -41,6 +42,25 @@ public void testSchedulerExceptionHandling() throws InterruptedException, IOExce
verify(mockInfluxDB, times(2)).write(any(BatchPoints.class));
}

@Test
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like the tests are not formatted with the same settings as the production code. I'm not sure if that's on purpose or accidental but I decided to use the same formatting as in the main code.

@majst01
Copy link
Collaborator

majst01 commented May 4, 2017

Build is failing because of formatting.

@codecov-io
Copy link

codecov-io commented May 4, 2017

Codecov Report

Merging #318 into master will decrease coverage by 0.17%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #318      +/-   ##
============================================
- Coverage     78.62%   78.44%   -0.18%     
+ Complexity      134      133       -1     
============================================
  Files            11       11              
  Lines           711      719       +8     
  Branches         77       77              
============================================
+ Hits            559      564       +5     
- Misses          109      112       +3     
  Partials         43       43
Impacted Files Coverage Δ Complexity Δ
src/main/java/org/influxdb/InfluxDB.java 100% <ø> (ø) 0 <0> (ø) ⬇️
...rc/main/java/org/influxdb/impl/BatchProcessor.java 98.07% <100%> (+0.11%) 17 <0> (ø) ⬇️
src/main/java/org/influxdb/impl/InfluxDBImpl.java 79.71% <100%> (-1.27%) 40 <0> (-1)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cc35288...1261031. Read the comment docs.

@csokol
Copy link
Contributor Author

csokol commented May 4, 2017

Fixed, @majst01

@majst01
Copy link
Collaborator

majst01 commented May 5, 2017

Addresses #275

@majst01
Copy link
Collaborator

majst01 commented May 5, 2017

I like your simple approach, if you mind adding some usage information's to README.md i would like to merge this.

@csokol
Copy link
Contributor Author

csokol commented May 8, 2017

@majst01 cool!

Now that I'm writing the documentation, I wondering if I should add another overloaded method enableBatch to InfluxDBImpl, something like:

void enableBatch(int actions, int flushDuration, 
  TimeUnit flushDurationTimeUnit, Consumer<Throwable> exceptionHandler)

What do you think? Is there another way to enable batching? Or maybe I should just add void enableBatch(BatchProcessor batchProcessor), maybe this allows more flexibility on the caller side.

@majst01
Copy link
Collaborator

majst01 commented May 8, 2017

Yes, would be a good Idea to have a second variation of enableBatch.

@csokol
Copy link
Contributor Author

csokol commented May 8, 2017

which one? enableBatch(BatchProcessor batchProcessor) or enableBatch(int actions, int flushDuration, TimeUnit flushDurationTimeUnit, Consumer<Throwable> exceptionHandler)?

I prefer the first one.

@majst01 majst01 mentioned this pull request May 9, 2017
@majst01
Copy link
Collaborator

majst01 commented May 9, 2017

related to: #289

@kkarski
Copy link

kkarski commented May 9, 2017

Per my similar changes referenced in #289 which I commented on, my use case requires retrying to persist the failed batch of points so the exception handling method needs to reference the failed points in the batch so they can be re-tried.

Also, flip side to that, when the batch succeeds, my use case also requires other parts of the system to be notified so I added a onSuccess event as well as part of the same interface.

Overall, our solutions are very similar.

@majst01
Copy link
Collaborator

majst01 commented May 9, 2017

Having a list of failed points available in the ExceptionHandler seems a useful addition. But i would like to get the first step done, then we can add further improvements.

@csokol what do you think ?
@csokol Checkstyle is failing again

@kkarski
Copy link

kkarski commented May 9, 2017

The only thing I'd point out then is that the Consumer functional interface chosen for this change only accepts one input, in this case the Throwable. Should you want to pass more, Consumer may not be the best choice.

@csokol
Copy link
Contributor Author

csokol commented May 10, 2017

@majst01 agree. The formatting is fixed.

@kkarski passing the failed points to the callback sounds good, but I think that for your use case (retrying failed writes) it would be better to have that as a library feature. Maybe the user should be able to define a retry policy like "retry five times with exponential backoff increasing time"

@kkarski
Copy link

kkarski commented May 10, 2017

@csokol I agree an automated retry would be a good feature and would love to see that. It doesn't solve the issue 100% however which is why I still think it would be good to have a reference to the failed points in the batch in order to attempt some sort of last ditch effort to no loose the data. Alternatively, if the Throwable was a custom type, the points could be referenced in the exception itself rather than as a second parameter.

As I said earlier however, I also have an interest in successful batches.

@majst01
Copy link
Collaborator

majst01 commented May 10, 2017

Should we stick with the current approach and try to implement a automatic recovery of failed batch writes later ?

@kkarski
Copy link

kkarski commented May 10, 2017

I think the automated recovery is optional but getting a reference to the failed points is necessary. Just my opinion...or at least I'll have to continue using my fork if the official version doesn't support it.

@majst01
Copy link
Collaborator

majst01 commented May 11, 2017

The introduction of a InfluxDBExceptionHandler which implements Consumer and holds a List of failed Batchpoints will do the trick, bonus we will get a fork user back. @csokol would you like to modify your PR this way ?

@csokol
Copy link
Contributor Author

csokol commented May 11, 2017

The InfluxDBExceptionHandler would have to hold a mutable field with the list of batchpoints, is that what you mean? This doesn't look good as it might cause concurrency issues. Another way to do it is use something like BiConsumer<List<Batchpoint>, Throwable>. Then we can keep the implementation immutable and more thread safe. WDYT, @majst01?

@majst01
Copy link
Collaborator

majst01 commented May 11, 2017

Sounds good, tbh i am still a bit oldschool and not very familiar with all the glories of the function interfaces in java8 :-)

If you mind modifying this PR, or create a separate one to see the differences in implementation.

@csokol
Copy link
Contributor Author

csokol commented May 11, 2017

I'll do it in a separate branch to compare it. Looking at the code, I think I won't be able to get only the failed points (I'd have to keep separate entries for udp/http points and keep removing them from a list or something). Is it ok if in this initial implementation I pass the whole batch to the consumer instead of just the ones that failed?

@csokol
Copy link
Contributor Author

csokol commented May 11, 2017

Just noticed that #289 was actually proposing the same thing (the whole batch, not only the failed points) so I'll go for that.

@majst01
Copy link
Collaborator

majst01 commented May 11, 2017

Replaced by a better implementation in #319

@majst01 majst01 closed this May 11, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants