Support Hedged Requests for Azure #751

Closed
joe-elliott opened this issue Jun 9, 2021 · 5 comments

Comments

@joe-elliott
Member

Is your feature request related to a problem? Please describe.
Hedged requests were added for GCS/S3 in #750.

Let's find a way to get it in Azure.

@pedrosaraiva
Contributor

Azure SDK provides access to the HTTP Client using the PipelineOptions. Not sure if it requires upgrading the SDK to the latest version first.

See https://pkg.go.dev/github.com/Azure/azure-storage-blob-go/azblob#PipelineOptions for reference. I can take a look tomorrow.

@annanay25
Contributor

Those PipelineOptions look interesting. The RetryOptions include the following:

	// MaxTries specifies the maximum number of attempts an operation will be tried before producing an error (0=default).
	// A value of zero means that you accept our default policy. A value of 1 means 1 try and no retries.
	MaxTries int32

	// TryTimeout indicates the maximum time allowed for any single try of an HTTP request.
	// A value of zero means that you accept our default timeout. NOTE: When transferring large amounts
	// of data, the default TryTimeout will probably not be sufficient. You should override this value
	// based on the bandwidth available to the host machine and proximity to the Storage service. A good
	// starting point may be something like (60 seconds per MB of anticipated-payload-size).
	TryTimeout time.Duration

which seems to be exactly what we're looking to do with hedged requests. Would be worth a try!

@joe-elliott
Member Author

TryTimeout sounds a little different to me. My guess is that this is a standard client timeout after which it just issues the new request (and cancels the old), but I'm not 100% sure.

Azure SDK provides access to the HTTP Client using the PipelineOptions

Not seeing the HTTP client here.

@pedrosaraiva
Contributor

pedrosaraiva commented Jun 10, 2021

I am not 100% sure, but I would say it's the HTTPSender.

type PipelineOptions struct {
	// Log configures the pipeline's logging infrastructure indicating what information is logged and where.
	Log pipeline.LogOptions

	// Retry configures the built-in retry policy behavior.
	Retry RetryOptions

	// RequestLog configures the built-in request logging policy.
	RequestLog RequestLogOptions

	// Telemetry configures the built-in telemetry policy behavior.
	Telemetry TelemetryOptions

	// HTTPSender configures the sender of HTTP requests
	HTTPSender pipeline.Factory
} 

I looked into the SDK example below and made a quick test with Tempo and Azurite, and it looks like it can be implemented more or less in the same way as in the other backends.
I imagine the logic needs to change a bit since the Hedged Requests are only for Reads, and the writes will stay the same.

	// Create/configure a request pipeline options object.
	// All PipelineOptions' fields are optional; reasonable defaults are set for anything you do not specify
	po := PipelineOptions{
		// Set RetryOptions to control how HTTP requests are retried when retryable failures occur
		Retry: RetryOptions{
			Policy:        RetryPolicyExponential, // Use exponential backoff as opposed to linear
			MaxTries:      3,                      // Try at most 3 times to perform the operation (set to 1 to disable retries)
			TryTimeout:    time.Second * 3,        // Maximum time allowed for any single try
			RetryDelay:    time.Second * 1,        // Backoff amount for each retry (exponential or linear)
			MaxRetryDelay: time.Second * 3,        // Max delay between retries
		},

		// Set RequestLogOptions to control how each HTTP request & its response is logged
		RequestLog: RequestLogOptions{
			LogWarningIfTryOverThreshold: time.Millisecond * 200, // A successful response taking more than this time to arrive is logged as a warning
		},

		// Set LogOptions to control what & where all pipeline log events go
		Log: pipeline.LogOptions{
			Log: func(s pipeline.LogLevel, m string) { // This func is called to log each event
				// This method is not called for filtered-out severities.
				logger.Output(2, m) // This example uses Go's standard logger
			},
			ShouldLog: func(level pipeline.LogLevel) bool {
				return level <= pipeline.LogWarning // Log all events from warning to more severe
			},
		},

		// Set HTTPSender to override the default HTTP Sender that sends the request over the network
		HTTPSender: pipeline.FactoryFunc(func(next pipeline.Policy, po *pipeline.PolicyOptions) pipeline.PolicyFunc {
			return func(ctx context.Context, request pipeline.Request) (pipeline.Response, error) {
				// Implement the HTTP client that will override the default sender.
				// For example, below HTTP client uses a transport that is different from http.DefaultTransport
				client := http.Client{
					Transport: &http.Transport{
						Proxy: nil,
						DialContext: (&net.Dialer{
							Timeout:   30 * time.Second,
							KeepAlive: 30 * time.Second,
							DualStack: true,
						}).DialContext,
						MaxIdleConns:          100,
						IdleConnTimeout:       180 * time.Second,
						TLSHandshakeTimeout:   10 * time.Second,
						ExpectContinueTimeout: 1 * time.Second,
					},
				}

				// Send the request over the network
				resp, err := client.Do(request.WithContext(ctx))

				return pipeline.NewHTTPResponse(resp), err
			}
		}),
	}

@joe-elliott
Member Author

Thank you for the help!

Azure support is now in #750
