Points loss using batch processing #163
Comments
Yes, I can test it. No problem. Tomorrow I will test the PR and let you know the result of it. |
Ha! Great news. I've been hoping someone (other than me) can see if it works before I commit to merging it. |
I ran my application using your patch and there is still data loss. |
@FlavioF thanks for the comment, you are right about what the default there should be. Can I enquire: are you running with a capacity limit on your batch processor, or not? (I have really only used the capacity-limited form, as I am uncomfortable with the unbounded capacity / capacity-based-on-JVM-memory-limits option.) |
I ran the tests without a capacity limit. Do you want me to run them with a limit? |
It would be great if you could. (Just so we are on the same page: if you are in a situation where you hit the JVM memory limit, you are going to lose data somewhere. There is nothing we can do about that.) |
Ok, I get it. I will retry the test with a limit tomorrow morning. |
Sorry, will test it only on next Tuesday. |
Did you guys make any progress on this last week? I was away for a few days. |
I am in a good spot to do some testing with this over the next few days. How is the patch looking? |
No progress here, sorry. I had no time for more tests in the past few days. I will get back to it ASAP. |
@larrywalker I have been running this patch for 9 months or so, but have been waiting for someone else to see if it works for them. My use case is configured with a limited buffering capacity (100,000 points I think, or maybe 500k) and will discard the 'oldest' buffered points when the limit is reached. I'm almost about to merge this PR into the master branch anyway (I'm sick of having it unmerged, and I don't think it is any worse than the existing behaviour; but it could be a little more dangerous, hence my delay). Any feedback would be great. @FlavioF thanks! |
@andrewdodd are you the maintainer now? To be clear, does PR #108 implement a circular buffer to retain a certain number of data points when the connection is lost? How do you configure the number of points to hang onto? |
Hi @jazdw, I am 'also' a maintainer now. I guess one of the config options from PR #108 turns the buffer into a circular buffer (DROP_OLDEST). Unfortunately the javadoc is not as clear as it could be. In general, there are a few options for configuration: whether the buffer has a capacity limit at all (or is bounded only by JVM memory), and what to do when a capacity-limited buffer fills (for example, drop the oldest buffered points).
There is also a slightly confusing config option. An example config, and the behaviour it would result in, is sketched after this comment.
|
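As a rough illustration only (not necessarily the exact options from PR #108), the capacity-limited, drop-oldest setup described above could be expressed with the `BatchOptions` builder that later influxdb-java releases provide; the specific numbers and the exception handler below are assumptions:

```java
import org.influxdb.BatchOptions;
import org.influxdb.InfluxDB;
import org.influxdb.InfluxDBFactory;

public class BatchConfigSketch {
    public static void main(String[] args) {
        InfluxDB influx = InfluxDBFactory.connect("http://127.0.0.1:8086", "root", "");

        // Assumed values for illustration: flush every 2000 points or every second,
        // and buffer at most 100,000 points while writes are failing.
        // Assumption: with a buffer limit larger than the action count, the client keeps
        // a retry buffer and drops the oldest points on overflow, matching the
        // DROP_OLDEST behaviour discussed above.
        BatchOptions options = BatchOptions.DEFAULTS
                .actions(2000)          // flush after this many buffered points
                .flushDuration(1000)    // ...or after this many milliseconds
                .bufferLimit(100_000)   // capacity limit for points awaiting retry
                .exceptionHandler((points, throwable) ->
                        System.err.println("Batch could not be written: " + throwable));

        influx.enableBatch(options);
        try {
            // write points as usual; they are buffered and flushed in batches
        } finally {
            influx.close(); // flushes any remaining buffered points
        }
    }
}
```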
Hi, |
Any plans on working on this? I've created a simple main to check if batching is suitable for our use, and it doesn't seem reliable:

```java
import static java.util.concurrent.TimeUnit.MILLISECONDS;
import static java.util.concurrent.TimeUnit.SECONDS;

import org.influxdb.InfluxDB;
import org.influxdb.InfluxDBFactory;
import org.influxdb.dto.Point;

public class Foo {
    public static void main(String[] args) throws InterruptedException {
        InfluxDB influx = InfluxDBFactory.connect("http://127.0.0.1:8086", "root", "");
        // Batch up to 2000 points, flushing at least once per second
        influx.enableBatch(2000, 1, SECONDS);
        for (int i = 0; i < 10000; i++) {
            // Points written within the same millisecond share a timestamp
            Point point = Point.measurement("bug")
                    .time(System.currentTimeMillis(), MILLISECONDS)
                    .addField("value", 1)
                    .build();
            influx.write("test", "autogen", point);
        }
        Thread.sleep(2000);
        influx.close();
    }
}
```

When I check the database after running this code, I see that a lot of data is missing.

I'm considering not using this feature in production and instead creating my own implementation of batching. What do you think? Is this feature production ready? Maybe it should be deprecated, or at least carry a big warning in the readme saying that it is not reliable. |
My example didn't make any sense, sorry. Since it was using the current timestamp, most of the measurements obviously had the same timestamp, so they were overwriting each other. Here's a realistic example which is working as expected:

```java
import static java.util.concurrent.TimeUnit.MILLISECONDS;
import static java.util.concurrent.TimeUnit.SECONDS;

import org.influxdb.InfluxDB;
import org.influxdb.InfluxDBFactory;
import org.influxdb.dto.Point;

public class Foo {
    public static void main(String[] args) throws InterruptedException {
        InfluxDB influx = InfluxDBFactory.connect("http://127.0.0.1:8086", "root", "");
        // Batch up to 100 points, flushing at least once per second
        influx.enableBatch(100, 1, SECONDS);
        for (int i = 0; i < 10000; i++) {
            // Use the loop index as the timestamp so every point is unique
            Point point = Point.measurement("bug")
                    .time(i, MILLISECONDS)
                    .addField("value", 1)
                    .build();
            influx.write("test", "autogen", point);
        }
        Thread.sleep(2000);
        influx.close();
    }
}
```
|
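One way to double-check the stored count from the same client is a simple count query. This is only a sketch; the class name is made up, and the measurement and database names are taken from the examples above:

```java
import org.influxdb.InfluxDB;
import org.influxdb.InfluxDBFactory;
import org.influxdb.dto.Query;
import org.influxdb.dto.QueryResult;

public class CountCheck {
    public static void main(String[] args) {
        InfluxDB influx = InfluxDBFactory.connect("http://127.0.0.1:8086", "root", "");
        // Count the points written by the example above; 10000 is expected if nothing was lost
        Query query = new Query("SELECT count(\"value\") FROM \"bug\"", "test");
        QueryResult result = influx.query(query);
        System.out.println(result.getResults());
        influx.close();
    }
}
```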
Nice to see. I will close this again. |
We are using influxdb java client 2.2 (this happens in 2.1 too) and we are experiencing some data loss.
In the above example, when we send (let's say) 1k points of the same measurement, we get 1k log entries, however sometimes we only have 999 points for that measurement in InfluxDB.
There aren't any errors in the InfluxDB log.
Could it be anything related to the Java client's batch processing?
I am happy to help debug the problem or even fix it if needed.
Thank you in advance.