
Data retention enhancement #108

Closed
wants to merge 21 commits into from

Conversation

andrewdodd
Contributor

These are enhancements to allow data retention in the batch processor. Please see the commits for more info.

andrewdodd and others added 15 commits October 21, 2015 10:31
The excessive use of 'this.' is confusing, as it implies access to an
object attribute/method that needs disambiguation.
The non-abbreviated field and method names are clearer.
These changes introduce two helper functions for writing batched and
unmatched points separately. This helps by:
 - Removing the need to wrap 'unbatched' points in a 'BatchPoints' object
 - Locating the updates of the counters in a sensible location
These changes:
 - Add the ConsistencyLevel parameter to both the single and batched
   write methods. This was previously hidden from the user (it defaulted
   to ConsistencyLevel.ONE).
 - Add a batched write method that takes a List<Point> parameter,
   rather than requiring a BatchPoints object.
 - Deprecate the BatchPoints object and both of the previous write()
   methods.
 - Provide implementations of the deprecated interface that use the new
   interface.
 - Update the tests to use the new interface, though the old one will
   still work.
 - Fix the BatchProcessor to use all of the common fields on the
   BatchEntry objects when performing the write (i.e. the current
   implementation keys the Map<String, BatchPoints> object only on
   the database name, but it should use all of the common fields).
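The deprecation pattern in that commit list can be sketched roughly as follows. This is an illustrative stand-in, not the PR's actual code: the interface, method names, and the use of plain Strings for points and consistency levels are all assumptions made for the sketch.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the deprecation pattern the commits describe:
// the old write() entry point is kept, but delegates to the new method
// that takes a point list and an explicit consistency level. All names
// and signatures here are assumptions, not the PR's actual API.
public class WriteShimSketch {
    interface Writer {
        // New interface: explicit consistency level, plain list of points.
        int write(String database, String consistencyLevel, List<String> points);

        // Old interface: kept for compatibility, forwards to the new one
        // with the previously hidden default of ConsistencyLevel.ONE.
        @Deprecated
        default int write(String database, List<String> points) {
            return write(database, "ONE", points);
        }
    }

    // Trivial implementation that just reports how many points it "wrote".
    static final Writer WRITER = (db, level, points) -> points.size();

    public static void main(String[] args) {
        List<String> points = Arrays.asList("cpu idle=90", "cpu idle=85");
        System.out.println(WRITER.write("mydb", points)); // deprecated path still works
    }
}
```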

Justification

1. The current interface forces the user to create a BatchPoints object,
   even if they already have a list of Point objects that they want to
   send.

2. The BatchPoints object does not offer much convenience. The one
   feature that has not been reproduced is the ability to put the same
   tag on all points within a batch. I do not believe this is necessary
   because:
     - This tag information should be created and stored on the Point
       itself; and
     - It is pretty easy to add to the Point class a function that
       applies the same tag to a collection of points.
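The helper suggested above might look roughly like this. 'SketchPoint' is a stand-in type, since the real library builds immutable Points through a builder, where such a helper would have to live instead.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of the suggested helper: apply one tag to every point
// in a collection. SketchPoint is an illustrative stand-in, not the
// library's actual Point class.
public class TagAllSketch {
    static class SketchPoint {
        final Map<String, String> tags = new HashMap<>();
    }

    // Add the same tag key/value to every point in the collection.
    static void tagAll(List<SketchPoint> points, String key, String value) {
        for (SketchPoint p : points) {
            p.tags.put(key, value);
        }
    }
}
```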
Forgot to include the ConsistencyLevel.ONE parameter.
These changes add the configuration options and functionality to allow
the internal BatchProcessor's data retention behaviour to be configured.

The new configurable items are:

 - maxBatchWriteSize:
    The maximum number of points to attempt in one batch.

 - discardOnFailedWrite:
    Whether the BatchProcessor should discard batched points when it
    fails to write them (i.e. the current behaviour is to do this with
    buffered points).

 - BufferFailBehaviour:
    The behaviour the BatchProcessor should exhibit when adding to the
    queue fails (due to capacity problems, either implicit or explicit).
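How the BufferFailBehaviour option might be consulted when a point is offered to a full buffer can be sketched as below. The enum constant names are assumed from the commit message, not taken from the PR's code.

```java
// Illustrative sketch of the failure-behaviour decision when the internal
// buffer is full. Constant names are assumptions, not the PR's actual API.
public class BufferPolicySketch {
    enum BufferFailBehaviour { THROW_EXCEPTION, DROP_CURRENT, DROP_OLDEST }

    /** Returns true if the new point should be enqueued (possibly after eviction). */
    static boolean shouldEnqueueWhenFull(BufferFailBehaviour behaviour) {
        switch (behaviour) {
            case THROW_EXCEPTION:
                throw new IllegalStateException("buffer full");
            case DROP_CURRENT:
                return false;  // silently discard the incoming point
            case DROP_OLDEST:
                return true;   // caller evicts the head first, then adds
            default:
                throw new AssertionError("unreachable");
        }
    }
}
```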
Fixing the public API to include the required enums and configuration methods.
Cleaning up the attribute definitions in BatchProcessor.
Adding 'interrogation' functions to the InfluxDB interface, for investigating
the state of the underlying buffer. NB: This now requires synchronisation, eek!
This library could really do with some logging.
These changes make the 'write()' call happen in the worker thread, rather than
in the 'put()' thread, even when the 'actions' trigger is reached.
These changes introduce an exponential backoff for the flush based retry for
when batched writes fail in the scheduled attempts.

The backoff doubles the flushIntervalMin value, up until the newly added
flushIntervalMax value.

The other changes are to expose the new configuration items.
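The backoff arithmetic described in that commit can be sketched as follows. The doubling-up-to-a-cap rule is from the commit message; the method name and the example interval values are assumptions.

```java
// Sketch of the exponential backoff described above: the flush retry
// interval doubles from flushIntervalMin until it is capped at
// flushIntervalMax. The arithmetic is from the commit message; the
// method name and units are assumptions.
public class BackoffSketch {
    static int nextFlushInterval(int currentInterval, int flushIntervalMax) {
        // Double the interval, capped at the configured maximum.
        return Math.min(currentInterval * 2, flushIntervalMax);
    }

    public static void main(String[] args) {
        int interval = 100; // flushIntervalMin, e.g. milliseconds
        int max = 1600;     // flushIntervalMax
        for (int failedWrites = 0; failedWrites < 6; failedWrites++) {
            System.out.print(interval + " ");
            interval = nextFlushInterval(interval, max);
        }
        // prints: 100 200 400 800 1600 1600
    }
}
```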
@Kindrat

Kindrat commented Oct 23, 2015

Consider making a fork.

@majst01
Collaborator

majst01 commented Oct 23, 2015

I like your approach, but I need some time during the weekend to review it carefully.
Do you think it is worth pulling it in now, or should we make a 2.1 version instead?

@andrewdodd
Contributor Author

Hey,

First, I think it is up to you. You are the primary maintainer, so in most ways it is your choice. I don't think I have enough of a 'different direction' to really warrant making a fork. If people really like my additions they can go to andrewdodd/influxdb-java, right?

I have been thinking about this set of commits, especially with respect to the existing library. In some ways, my additions are expanding the scope of the API quite a lot, and designing for the general case they are addressing will be hard. (i.e. at least the existing library 'forces' its users to think about data retention explicitly). However, I have to admit, I'm not a big fan of the 'auto-batching' option in the current library, precisely because it silently drops the data it is claiming to buffer and batch.

Regarding these commits, I had a few goes at getting to the design I wanted (as you can see from the commit log), so please look at the end state more than the intervening states. (I was trying to get too cute by picking the right underlying queue etc, but in the end I just resorted to the one queue and some helper functions).

If you are interested in incorporating this stuff I can include some more comments/etc, I'm just a bit busy trying to deploy this with my work atm.

Also, I'm not sure what is in the 2.0 / 2.1 release roadmap, so I guess you would have to decide. Maybe if other people also review the changes and think they are a good idea?

Cheers,
Andrew

PS: I should include a comment on the addAndDropIfNecessary() method that it is a copy of the add() method from Guava's EvictingQueue.
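The method mentioned in the postscript can be sketched as below, mirroring the add() logic of Guava's EvictingQueue: when the bounded queue is at capacity, evict the head before adding the new element. The generic signature and queue type are illustrative assumptions.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Sketch of addAndDropIfNecessary(), mirroring the add() behaviour of
// Guava's EvictingQueue: when the buffer is at capacity, drop the oldest
// element to make room. The signature here is an assumption.
public class EvictingAddSketch {
    static <E> boolean addAndDropIfNecessary(Queue<E> queue, int maxSize, E element) {
        if (queue.size() >= maxSize) {
            queue.remove(); // drop the oldest buffered point
        }
        return queue.add(element);
    }

    public static void main(String[] args) {
        Queue<Integer> buffer = new ArrayDeque<>();
        for (int i = 1; i <= 3; i++) {
            addAndDropIfNecessary(buffer, 2, i);
        }
        System.out.println(buffer); // [2, 3]
    }
}
```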

@andrewdodd andrewdodd closed this Dec 14, 2015
@andrewdodd andrewdodd reopened this Jan 17, 2016
…to DataRetentionEnhancement

Conflicts:
	src/main/java/org/influxdb/InfluxDB.java
	src/main/java/org/influxdb/impl/InfluxDBImpl.java
	src/test/java/org/influxdb/dto/PointTest.java
# Conflicts:
#	src/main/java/org/influxdb/impl/BatchProcessor.java
#	src/test/java/org/influxdb/InfluxDBTest.java
#	src/test/java/org/influxdb/PerformanceTests.java
#	src/test/java/org/influxdb/TicketTests.java
#	src/test/java/org/influxdb/dto/PointTest.java
#	src/test/java/org/influxdb/impl/BatchProcessorTest.java
final int flushIntervalMax,
final TimeUnit flushIntervalTimeUnit) {

enableBatch(0,

"Capacity should be > 0 or NULL", change it to null.

@andrewdodd
Contributor Author

Despite this being very useful for me (and perhaps some other people), I think this 'auto-batching' feature is actually something that should be handled by the clients of this library.

I think the 'batch' mode should just be another write option, and we should not magically batch in the background. I think this will avoid (if not solve) a number of issues, such as:

  • what to do on failure
  • how to support async
