Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Technical Question] Possible DNS issue - File Upload limiting connetion lifetime #3263

Closed
jmasek opened this issue Apr 12, 2023 · 14 comments
Closed
Assignees
Labels
bug Something isn't working. fix-checked-in Fix checked into main or preview, but not yet released. IoTSDK Tracks all IoT SDK issues across the board question Further information is requested. v2 This issue is relevant or will only be addressed in v2.0.0+

Comments

@jmasek
Copy link

jmasek commented Apr 12, 2023

Hi,
we are using your SDK on approx 20 thousand devices all around the world. However, we are seeing that some devices (approx 1000 devices out of 20k) have problems with File Upload (they got stuck on the method DeviceClient.GetFileUploadSasUriAsync ). After some digging, we suspect that the DNS change of IoT Hub occurred but the underlying HttpClient keeps using the old IP address.
I discovered that in your SDK, there are two classes HttpClientHelper. The one in the namespace Microsoft.Azure.Devices allows specifying connectionLeaseTimeoutMilliseconds by passing proper options to the caller but the one that is used for file uploads - in the namespace Microsoft.Azure.Devices.Client.Transport does not have such a parameter and is created with default PooledConnectionLifetime set to infinite. Therefore, I have the following questions:

  • Are my findings correct or am I missing something?
  • Is it possible to limit the connection lifetime to force the connection to refresh the IP Address?

Thank you,
Jan

@jmasek jmasek added the question Further information is requested. label Apr 12, 2023
@github-actions github-actions bot added the IoTSDK Tracks all IoT SDK issues across the board label Apr 12, 2023
@timtay-microsoft
Copy link
Member

Hi @jmasek, this does appear to be a bug in the SDK. We'll get a fix out soon that sets the default connection lease timeout to 5 minutes on the HTTP client used for file upload operations.

@timtay-microsoft timtay-microsoft added the bug Something isn't working. label Apr 13, 2023
@timtay-microsoft timtay-microsoft self-assigned this Apr 13, 2023
@timtay-microsoft
Copy link
Member

#3266 should fix this issue

@timtay-microsoft timtay-microsoft added the fix-checked-in Fix checked into main or preview, but not yet released. label Apr 13, 2023
@jmasek
Copy link
Author

jmasek commented Apr 17, 2023

Hi @timtay-microsoft, thanks for the quick fix!
One last question, do you happen to have any estimate of when the fix could be released?

@timtay-microsoft
Copy link
Member

It may be a week or two before we put out a release. If you need this fix now, I recommend making the same change that the SDK made here in your code. This change uses a static class that all HttpClients will reference, so it is safe to make this change outside of the SDK.

ServicePoint servicePoint = ServicePointManager.FindServicePoint("<your hub name>.azure-devices.net");
servicePoint.ConnectionLeaseTimeout = <your preferred timeout>;

@jmasek
Copy link
Author

jmasek commented Apr 18, 2023

Hi @timtay-microsoft,
we have further investigated your fix and I am worried that your fix will only work in .NET Framework 4.6 and not in .NET Core and .NET5,6,7 (we are using .NET 6) because ServicePointManager is deprecated. I found only limited resources on this topic but this conversation dotnet/runtime#26048 suggests that ServicePointManager is a no-op.
In addition, you are referencing Azure SDK (https://github.com/Azure/azure-sdk-for-net/blob/7e3cf643977591e9041f4c628fd4d28237398e0b/sdk/core/Azure.Core/src/Pipeline/ServicePointHelpers.cs#L29) code where it seems that this issue is targeted using pragma statements.
Can you please check it, please?

Thanks

@timtay-microsoft
Copy link
Member

Good catch. #3279 will make it so that the SDK handles this timeout on all the target frameworks correctly.

@timtay-microsoft
Copy link
Member

And #3280 will do the same for our soon-to-be-released v2 packages

timtay-microsoft added a commit that referenced this issue Apr 18, 2023
…work (#3279)

Previous fix only worked for .NET framework, not .NET core

#3263
@jmasek
Copy link
Author

jmasek commented Apr 20, 2023

Hi Tim,
I am sorry but I think the first fix won't work.
#3279
In ServicePointHelpers, you added a case for SocketsHttpHandler but it will never be sent there because HttpClientHelper:106 uses by default HttpClientHandler and the constructor parameter of HttpClientHandler is not provided by InternalClient when created via HttpTransportHandler. So all the time is used HttpClientHandler and the no-op service point method for .NET Core/.NET5.
#3280
Looks good so far - only you are inconsistent with class azure-iot-sdk-csharp/iothub/service/src /ServicePointHelpers.cs where Microsoft uses #if NETCOREAPP2_1_OR_GREATER || NET5_0_OR_GREATER while you are using #if NETCOREAPP. As I understand it, the meaning is very similar but it can cause confusion and questions about why you are using different pragmas for similar code.

@timtay-microsoft
Copy link
Member

Good points. After some further thought, I don't think there is a way for this SDK to set the connection lease timeout for you by default if you are using a .NET core app. Our device SDK does not target .NET core, so we can't use preprocessor directives to create a SocketsHttpHandler on your behalf with the timeout.

While this is unfortunate, we will make it so that users can pass in an HttpMessageHandler instance that we will construct the HttpClient with. This allows .NET core applications to construct the SocketsHttpHandler with the timeouts set before giving it to this library.

Given that we aim to only target netstandard 2.0 in our v2 client, users will need to do the same thing in that package as well.

I'll leave #3293 open for your comments @jmasek and we'll review it internally as well.

@timtay-microsoft timtay-microsoft removed the fix-checked-in Fix checked into main or preview, but not yet released. label Apr 20, 2023
@jmasek
Copy link
Author

jmasek commented Apr 21, 2023

Hi, #3293 looks good 👍 I think it is a good approach to allow users to pass the HttpMessageHandler.

@timtay-microsoft
Copy link
Member

Sounds good. We've decided to go just a bit further and allow you to pass in the HttpClient instead of just the HttpMessageHandler in case there are other timeout values you'd like to set. Thanks for the help on this!

@timtay-microsoft timtay-microsoft added the v2 This issue is relevant or will only be addressed in v2.0.0+ label Apr 24, 2023
@timtay-microsoft
Copy link
Member

We are working to put out a v1 release with this fix within the next day or two, FYI

@jmasek
Copy link
Author

jmasek commented May 3, 2023

great, thanks. Keep me posted 😉

@timtay-microsoft timtay-microsoft added the fix-checked-in Fix checked into main or preview, but not yet released. label May 3, 2023
@timtay-microsoft
Copy link
Member

We have done a main release and a preview release that both contain this fix, so I'm closing this issue.

For anyone reading this thread looking for a tl;dr, the SDK doesn't handle this scenario for you on older frameworks, but it does on newer frameworks. Older framework users must provide the HTTP client with a connection lease timeout set.

timstewartm pushed a commit to timstewartm/azure-iot-sdk-csharp that referenced this issue May 30, 2024
…work (Azure#3279)

Previous fix only worked for .NET framework, not .NET core

Azure#3263
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working. fix-checked-in Fix checked into main or preview, but not yet released. IoTSDK Tracks all IoT SDK issues across the board question Further information is requested. v2 This issue is relevant or will only be addressed in v2.0.0+
Projects
None yet
Development

No branches or pull requests

2 participants