Leaking file descriptors when generating too many http requests #3279
The fix in #3285 is most likely going to fix this issue too, as the number of file reads is reduced. ToDo: Reproduce this issue using repo https://github.com/samswen/lambda-emfiles when the fix from #3285 is out. |
No, I don't think so. |
This issue (http requests) is also more important for you @ffxsam, right? |
Yup, I'm also not sure and this request will require a deep dive.
Looking at the bug description, it appears to be caused by readFile, for the following reasons:
The reason I asked to create a separate request is for better tracking. |
fyi: the tests I did were without extra readFile calls, so these are all coming from http requests. I stubbed the shared ini file loader like this:
import sharedIniFileLoader from '@aws-sdk/shared-ini-file-loader'
Object.assign(sharedIniFileLoader, { loadSharedConfigFiles: async () => ({ configFile: {}, credentialsFile: {} }) })
That's why my proposal is to have a very low socketTimeout by default. |
This clearly shows that most of the EMFILE errors are caused by file reads, which are being fixed in #3285.
The default socketTimeout for the SDK will be applicable to all environments. |
Also facing this. Thanks for the deep dive @adrai. |
I'm facing the same issue as well. I was doing 10000 DynamoDB updates in 1 chunk and benchmarked the performance with 3.33.x; with 3.49 I ran into these EMFILE issues. The only options were to reduce the chunk size to 800 (even 1000 didn't work), or roll back to 3.46. For many in production (like me) this is a breaking change. Could we have some updates soon? @adrai thank you for the detailed description |
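For context, a minimal sketch of the kind of chunked Promise.all workload described above. Table name, key shape, region, and attribute names are placeholders, and the chunk size just mirrors the ~800 that reportedly still worked; this is not the commenter's actual code.
import { DynamoDBClient, UpdateItemCommand } from '@aws-sdk/client-dynamodb'

const client = new DynamoDBClient({ region: 'eu-west-1' })

// hypothetical items and table
const items = Array.from({ length: 10000 }, (_, i) => ({ id: String(i) }))
const chunkSize = 800

const updateItem = (item) =>
  client.send(new UpdateItemCommand({
    TableName: 'some-table',
    Key: { id: { S: item.id } },
    UpdateExpression: 'SET #touched = :now',
    ExpressionAttributeNames: { '#touched': 'touched' },
    ExpressionAttributeValues: { ':now': { N: String(Date.now()) } }
  }))

for (let i = 0; i < items.length; i += chunkSize) {
  // each chunk is sent in parallel; every in-flight request holds an open socket (i.e. an open file descriptor)
  await Promise.all(items.slice(i, i + chunkSize).map(updateItem))
}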
best advice from my side is really to stay at 3.46 and/or set a lower socketTimeout |
@Muthuveerappanv we've reduced readFile calls in #3285 which was released in v3.51.0. Can you test with >3.51.0 and share your results? |
fyi: with 3.51.0 in my setup, I still get EMFILE errors, too many open sockets... because of the socketTimeout... it seems to be a little bit better, but still not good |
> @Muthuveerappanv we've reduced readFile calls in #3285 which was released in v3.51.0.
My load test results still haven't improved: it didn't pass even for 1000 parallel DynamoDB updates (Promise.all), not much improvement [had 10000 updates working fine in 3.33.x, now my chunk size is 800] |
I concur, I tried with version 3.52.0 |
@Muthuveerappanv is the issue in 3.52.0 worse than that in 3.46.0? |
tl;dr: the amount of EMFILEs generated by readFile is nothing compared to the amount generated by the http requests... that's why you will not notice a decrease of 10 EMFILEs when there are hundreds or thousands of other EMFILEs caused by the http requests |
I wouldn't say worse, but it definitely hasn't gotten better either. The SDK isn't coping well with the burst of http requests made, as it did up to 3.46.0 |
@trivikr - any ETA on this bug? |
There's no ETA. Marking this issue as workaround-available as customers can always reduce socketTimeout during client creation in their high traffic lambdas:
import { S3 } from "@aws-sdk/client-s3";
import { NodeHttpHandler } from "@aws-sdk/node-http-handler"; // needed for the custom requestHandler
const client = new S3({
  requestHandler: new NodeHttpHandler({
    socketTimeout: 10000 // <- this decreases the emfiles count, the Node.js default is 120000
  })
}); |
Marking this as guidance, as the issue is specific to high traffic lambdas. |
The question is: Why was this workaround NOT necessary in <= v3.46.0? |
We suspect that v3.47.0 added support for defaults mode in #3192, which increased the number of readFile calls. I do not see any other change which could cause it in the https://github.com/aws/aws-sdk-js-v3/releases/tag/v3.47.0 notes. If you can provide concrete data for comparisons between these versions, it will help us debug this further:
By concrete data, I mean at what number of concurrent requests on Lambda the EMFILE errors occur in these versions. The previous responses were generic, like the one below:
|
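For anyone wanting to produce that kind of number, here is a rough sketch of a concurrency ramp that bumps the number of parallel requests until an EMFILE surfaces. Bucket, key, region, and step sizes are made up, and it assumes the failure shows up as an error with code 'EMFILE'; this is not an official reproduction script.
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3'

const client = new S3Client({ region: 'eu-west-1' })

const burst = async (concurrency) => {
  const results = await Promise.allSettled(
    Array.from({ length: concurrency }, () =>
      client.send(new GetObjectCommand({ Bucket: 'some-bucket', Key: 'some-key' }))
    )
  )
  // count requests that failed at the socket level with EMFILE
  return results.filter(r => r.status === 'rejected' && r.reason?.code === 'EMFILE').length
}

for (const concurrency of [100, 250, 500, 1000, 2000]) {
  const emfiles = await burst(concurrency)
  console.log(`concurrency=${concurrency} emfiles=${emfiles}`)
  if (emfiles > 0) break // first level at which EMFILE shows up
}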
Here I compared v3.46.0 with v3.49.0: #3019 (comment) |
Also @Muthuveerappanv has some insights: #3279 (comment) |
@trivikr I think I found what is causing all these extra open EMFILEs... It's probably exactly what @AllanZhengYP commented here: https://github.com/aws/aws-sdk-js-v3/blame/main/packages/node-http-handler/src/node-http-handler.ts#L75
When doing all these hundreds of concurrent requests, the code is not waiting for this.config to be ready, and will initialize a loooot of new http(s) clients here: https://github.com/aws/aws-sdk-js-v3/blame/main/packages/node-http-handler/src/node-http-handler.ts#L64
All this was introduced in v3.47.0 with this commit: 9152e21
I tested with this little hack, and it seems to work much better like this:
btw: to generate some concurrent requests, it is enough to do something like this:
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3'
const s3Client = new S3Client({ region: 'eu-west-1' })
for (let index = 0; index < 1000; index++) {
  s3Client.send(new GetObjectCommand({
    Bucket: 'some-bucket',
    Key: 'some-key'
  })).then(() => {}).catch(() => {})
} |
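To make the race described above concrete, here is a stripped-down sketch (not the SDK's actual source, and all class names are invented): a handler whose config comes from an async provider and that lazily builds its http agent will build one agent per request that arrives before the first await resolves.
import { Agent } from 'https'

// Simplified stand-in for a request handler whose configuration is resolved asynchronously.
class LazyHandler {
  constructor(configProvider) {
    this.configProvider = configProvider // async () => ({ socketTimeout, ... })
    this.config = undefined
  }

  async handle(request) {
    if (!this.config) {
      // Every call that arrives while this first await is still pending also sees
      // this.config === undefined, so each of them resolves the provider and
      // constructs its own keep-alive agent with its own socket pool.
      this.config = await this.configProvider()
      this.agent = new Agent({ keepAlive: true })
    }
    // ...issue `request` over this.agent
  }
}

// One way out: memoize the in-flight promise so concurrent callers share one
// resolution (and therefore one agent).
class MemoizedHandler {
  constructor(configProvider) {
    this.configProvider = configProvider
    this.ready = undefined
  }

  async handle(request) {
    if (!this.ready) {
      this.ready = this.configProvider().then(config => ({
        config,
        agent: new Agent({ keepAlive: true })
      }))
    }
    const { agent } = await this.ready
    // ...issue `request` over the single shared agent
  }
}
With 1000 concurrent handle() calls, the first variant can end up with up to 1000 agents (and their sockets); the second ends up with exactly one. Whether the actual fix memoizes the promise exactly like this is up to the maintainers; the point is only that the agent construction must not sit behind an unsynchronized await.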
Ooo nice find @adrai!! |
@trivikr any updates on this? |
@AllanZhengYP will take a look at this issue. |
@alexforsyth @trivikr any update on this? |
hi @adrai,
In my test lambda using the code snippet with the looped |
Your code changes look good... thank you. |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread. |
This issue is extracted from the original issue #3019, as requested: #3019 (comment)
#3019 concentrates on the readFile (leaking file descriptor) "issue" and this issue concentrates on the network (leaking file descriptor) "issue".
It seems lambda is not "waiting" for the file descriptors to be closed.
This can be observed especially when having warm lambda executions with a lot of SDK calls, e.g. for DynamoDB or S3, etc...
Each http request opens a network socket, which results in an open file descriptor.
Since by default in Node.js the socket timeout is set to 120000ms (2 minutes), it may be that the lambda has already finished but the sockets are still open. When "restarting" the lambda for the next invocations, those file descriptors may still be open.
This leads to EMFILE errors like this:
These basic tests show the emfile count (and the leaks):
Tests originally done in this issue here: #3019 (comment)
Details
compared to these tests: #3019 (comment)
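For reference, a minimal sketch of how such an emfile/file-descriptor count can be observed on Linux (including the Lambda runtime), where every open descriptor appears under /proc/self/fd. Bucket, key, region, and burst size are placeholders, and this is not the exact harness used for the numbers above.
import { readdir } from 'fs/promises'
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3'

// On Linux, every open file descriptor of the current process shows up in /proc/self/fd.
const countOpenFds = async () => (await readdir('/proc/self/fd')).length

const client = new S3Client({ region: 'eu-west-1' })

console.log('fds before burst:', await countOpenFds())

await Promise.allSettled(
  Array.from({ length: 500 }, () =>
    client.send(new GetObjectCommand({ Bucket: 'some-bucket', Key: 'some-key' }))
  )
)

// Sockets kept open by the (long) socketTimeout are still counted here,
// even though all the requests have already settled.
console.log('fds after burst:', await countOpenFds())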
Defining a custom requestHandler with a very low socketTimeout drastically reduces the emfiles count:
Details
That's why I suggest setting a low socketTimeout by default, as proposed here: #3019 (comment)
proposal:
and probably also here?
PS. btw. it seems it got worse (more file descriptors) when updating from v3.46.0 to v3.49.0