Blob-triggered functions execute several times even though maxDequeueCount is set to 1 #8542

Closed
mdddev opened this issue Jul 12, 2022 · 7 comments

mdddev commented Jul 12, 2022

Please provide the following:

  • Timestamp: 2022-07-12T15:58:41.180Z
  • Function App version: 4
  • Function App name: cannot disclose publicly
  • Function name(s) (as appropriate): cannot disclose publicly
  • Invocation ID: efc37408-a985-4f8f-b280-0885248a4d57
  • Region: West-Europe

Repro steps

Blob-triggered functions use queues under the hood. host.json has a section for setting the maximum dequeue count. That setting is respected for queue-triggered functions but not for blob-triggered ones. No equivalent section exists for the blobs extension in host.json.

{
    "version": "2.0",
    "extensions": {
        "queues": {
            "maxDequeueCount": 1,
            "maxPollingInterval": "00:00:10"
        }
    }
}

Expected behavior

The function triggers once and completes, either successfully or unsuccessfully, after a single execution. No retries.

Actual behavior

In this case the function failed. There were 4 additional retries.

2022-07-12T15:58:41.180 [Information] Trigger Details: MessageId: 7191b747-4718-45c9-9c0e-90a687a37136, DequeueCount: 1, InsertedOn: 2022-07-12T15:58:39.000+00:00, BlobCreated: 2022-07-12T15:55:35.000+00:00, BlobLastModified: 2022-07-12T15:58:38.000+00:00

2022-07-12T15:58:45.937 [Information] Trigger Details: MessageId: 7191b747-4718-45c9-9c0e-90a687a37136, DequeueCount: 2, InsertedOn: 2022-07-12T15:58:39.000+00:00, BlobCreated: 2022-07-12T15:55:35.000+00:00, BlobLastModified: 2022-07-12T15:58:38.000+00:00

2022-07-12T15:58:46.620 [Information] Trigger Details: MessageId: 7191b747-4718-45c9-9c0e-90a687a37136, DequeueCount: 3, InsertedOn: 2022-07-12T15:58:39.000+00:00, BlobCreated: 2022-07-12T15:55:35.000+00:00, BlobLastModified: 2022-07-12T15:58:38.000+00:00

2022-07-12T15:58:47.332 [Information] Trigger Details: MessageId: 7191b747-4718-45c9-9c0e-90a687a37136, DequeueCount: 4, InsertedOn: 2022-07-12T15:58:39.000+00:00, BlobCreated: 2022-07-12T15:55:35.000+00:00, BlobLastModified: 2022-07-12T15:58:38.000+00:00

2022-07-12T15:58:48.040 [Information] Trigger Details: MessageId: 7191b747-4718-45c9-9c0e-90a687a37136, DequeueCount: 5, InsertedOn: 2022-07-12T15:58:39.000+00:00, BlobCreated: 2022-07-12T15:55:35.000+00:00, BlobLastModified: 2022-07-12T15:58:38.000+00:00

Executed 'cannot_disclose_publicly' (Failed, Id=2751b523-bd95-4752-a75a-c7d6af2c28d5, Duration=330ms)
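
The retry count can be read directly from the DequeueCount field in the log. As a minimal sketch (the helper names are hypothetical, and the sample lines are the log lines above truncated after the DequeueCount field), the following extracts the values and counts the executions beyond the first:

```python
import re

# Matches the "DequeueCount: N" field of a "Trigger Details" log line.
DEQUEUE_RE = re.compile(r"DequeueCount:\s*(\d+)")


def dequeue_counts(log_lines):
    """Return the DequeueCount values found in the log lines, in order."""
    counts = []
    for line in log_lines:
        match = DEQUEUE_RE.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts


def retries(log_lines):
    """Number of executions beyond the first one (0 if nothing matched)."""
    counts = dequeue_counts(log_lines)
    return max(counts) - 1 if counts else 0


# The five "Trigger Details" lines from this issue, truncated for brevity.
PREFIX = "[Information] Trigger Details: MessageId: 7191b747-4718-45c9-9c0e-90a687a37136, "
sample = [
    "2022-07-12T15:58:41.180 " + PREFIX + "DequeueCount: 1,",
    "2022-07-12T15:58:45.937 " + PREFIX + "DequeueCount: 2,",
    "2022-07-12T15:58:46.620 " + PREFIX + "DequeueCount: 3,",
    "2022-07-12T15:58:47.332 " + PREFIX + "DequeueCount: 4,",
    "2022-07-12T15:58:48.040 " + PREFIX + "DequeueCount: 5,",
]

print(dequeue_counts(sample), retries(sample))
```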

Known workarounds

None.

@ghost ghost assigned kshyju Jul 12, 2022

mdddev commented Aug 1, 2022

I imagine a workaround could look like this:

  • blob-triggered function A triggers on new blob in container C
  • function A pushes the blob event's metadata as a message onto a storage queue Q
  • queue-triggered function B listens for new messages on Q
  • since now the business logic is handled by function B, any failures should respect the maxDequeueCount property in host.json

Tradeoffs:

  • Function A must pass on metadata of blob payload to function B
  • Function A must be authenticated to push to Q
  • Function B must be authenticated to read from C

This obviously is hacky.

mdddev commented Aug 8, 2022

Another workaround

  • go to the hosting storage account's Events section
  • create an Event Grid subscription that listens for Blob Created events
  • have these delivered to a queue of your liking

Obviously, the queue-triggered function must be authorized for blob access to do anything with its payload. This way, however, you don't need a separate function as the handler.
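
A minimal sketch of what the queue-triggered function would have to do with such a message, assuming the subscription uses the Event Grid event schema (not CloudEvents) and that the message body has already been decoded to JSON text (depending on queue and trigger settings it may arrive base64-encoded). The helper name and the sample event values are hypothetical:

```python
import json

BLOB_CREATED = "Microsoft.Storage.BlobCreated"
SUBJECT_PREFIX = "/blobServices/default/containers/"


def blob_from_event(body):
    """Return (container, blob_name) for a Blob Created event, else None.

    The subject of a storage event has the form
    /blobServices/default/containers/<container>/blobs/<blob name>.
    """
    event = json.loads(body)
    if event.get("eventType") != BLOB_CREATED:
        return None  # not a Blob Created event, nothing to process
    subject = event["subject"]
    if not subject.startswith(SUBJECT_PREFIX):
        raise ValueError(f"unexpected subject: {subject!r}")
    # Container names cannot contain '/', so the first '/blobs/' is the separator.
    container, sep, blob_name = subject[len(SUBJECT_PREFIX):].partition("/blobs/")
    if not sep or not blob_name:
        raise ValueError(f"unexpected subject: {subject!r}")
    return container, blob_name


# Made-up event with only the fields this sketch reads.
sample_event = json.dumps(
    {
        "eventType": "Microsoft.Storage.BlobCreated",
        "subject": "/blobServices/default/containers/uploads/blobs/reports/data.csv",
        "data": {"api": "PutBlob"},
    }
)

print(blob_from_event(sample_event))
```

Raising on an unexpected subject makes the invocation fail, so the message is handled according to maxDequeueCount like any other failed queue message.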

garashov commented Oct 10, 2022

Hi @mdddev, @kshyju ,
I am experiencing the same issue.

The workarounds mentioned seem a bit tricky to me.
Have you found an easier way to handle the issue?

mdddev commented Oct 11, 2022

Unfortunately not. Of the options above, I think the Event Grid one is still the ~okayest.

@aborremans

I am running into the same issue using the configuration below in my host.json file:

{
  "version": "2.0",
  "functionTimeout": "00:10:00",
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[3.3.0, 4.0.0)"
  },
  "extensions": {
    "queues": {
      "maxDequeueCount": 1,
      "batchSize": 1
    }
  }
}

mdddev changed the title from "Blob-triggered functions execute several times even thoug maxDequeueCount is set to 1" to "Blob-triggered functions execute several times even though maxDequeueCount is set to 1" on Oct 25, 2022

mdddev commented Jun 15, 2023

Closing this now, since the Event Grid route has proven not only stable and effective but also more cost-efficient in my scenario. My company deployed mandatory Advanced Threat Protection (ATP) policies via Azure Defender, so the transactions and traffic between the blob-triggered function and the storage account holding the blobs became ATP-billable. Because the blob trigger polls excessively behind the scenes, my cost skyrocketed, at its peak adding almost 15 EUR/day in incremental cost, and that for only a handful of blob-triggered functions. Switching to the Event Grid method cut that cost by 98.8%. The cost for storage transactions (no polling anymore) also went down by 97%.

@moritzkoerber

There is a separate section in the configuration specifically for blob triggers:

{
    "version": "2.0",
    ...
    "extensions": {
        "blobs": {
            "poisonBlobThreshold": 1
        }
    }
}

Source: https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-blob?tabs=isolated-process%2Cextensionv5%2Cextensionv3&pivots=programming-language-python#hostjson-settings
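
For an existing host.json, the setting can be added without touching the other sections. A minimal sketch, assuming the file has already been loaded into a dict; the helper name is hypothetical, and whether the setting is honored may depend on the Storage extension version in use (see the linked page):

```python
import json


def with_poison_blob_threshold(host_config, threshold=1):
    """Return a copy of a host.json dict with
    extensions.blobs.poisonBlobThreshold set to `threshold`."""
    # Round-tripping through JSON gives a deep copy of JSON-compatible data.
    updated = json.loads(json.dumps(host_config))
    blobs = updated.setdefault("extensions", {}).setdefault("blobs", {})
    blobs["poisonBlobThreshold"] = threshold
    return updated


# The queue settings from the original report are kept as they are.
host = {
    "version": "2.0",
    "extensions": {
        "queues": {"maxDequeueCount": 1, "maxPollingInterval": "00:00:10"}
    },
}

print(json.dumps(with_poison_blob_threshold(host), indent=4))
```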
