Azure webjob logs - looking for logs that detail the SDK's handling of queue triggered items

Question

As shown in my Stack Overflow question Azure webjob not appearing to respect MaxDequeueCount property, I'm troubleshooting a problem where despite my setting of the MaxDequeueCount property to 1, some items are dequeued many times before they're poisoned (in fact some items may never be poisoned at all and just dequeue, fail and are retried and fail endlessly).

The Webjobs SDK handles the retries and poisoning of queue triggered messages automatically and I'm looking for logs that contain details of that handling.

For example, I can see that my function has detected a new queue item by viewing the webjob's log through the SCM at https://myappengine.scm.azurewebsites.net/vfs/data/jobs/continuous/StuffProcessor/job_log.txt (By the way if I've enabled detailed logging to Azure storage on the web app can I get this same info in a Blob?).

[02/22/2017 01:47:22 > ec8d0f: INFO] Executing: 'StuffProcessor.ProcessQueueMessage' - Reason: 'New queue message detected on 'stuff-processor'.'
[02/22/2017 01:47:26 > ec8d0f: INFO] Executed: 'StuffProcessor.ProcessQueueMessage' (Succeeded)
[02/22/2017 01:47:26 > ec8d0f: INFO] Executing: 'StuffProcessor.ProcessQueueMessage' - Reason: 'New queue message detected on 'stuff-processor'.'

I can also get some information on an item's dequeue count by viewing the logs in the azure-jobs-host-archive container once I've enabled detailed logging to Azure storage on the web app:

{
      "Type": "FunctionCompleted",
      "EndTime": "2017-02-22T00:07:40.8133081 00:00",
      "Failure": {
        "ExceptionType": "Microsoft.Azure.WebJobs.Host.FunctionInvocationException",
        "ExceptionDetails": "Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: ItemProcessor.ProcessQueueMessage ---> MyApp.Exceptions.MySpecialAppExceptionType: Exception of type 'MyApp.Exceptions.MySpecialAppExceptionType' was thrown.
      },
      "ParameterLogs": {},
      "FunctionInstanceId": "1ffac7b0-1290-4343-8ee1-2af0d39ae2c9",
      "Function": {
        "Id": "MyApp.Processors.ItemProcessor.ProcessQueueMessage",
        "FullName": "MyApp.Processors.ItemProcessor.ProcessQueueMessage",
        "ShortName": "ItemProcessor.ProcessQueueMessage",
        "Parameters": [
          {
            "Type": "QueueTrigger",
            "AccountName": "MyStorageAccount",
            "QueueName": "stuff-processor",
            "Name": "sourceFeedItemQueueItem"
          },
          {
            "Type": "BindingData",
            "Name": "dequeueCount"
          },
          {
            "Type": "ParameterDescriptor",
            "Name": "logger"
          }
        ]
      },
      "Arguments": {
        "sourceFeedItemQueueItem": "{\"SourceFeedUpdateID\":437530,\"PodcastFeedID\":\"2d48D2sf2\"}",
        "dequeueCount": "96",
        "logger": null
      },
      "Reason": "AutomaticTrigger",
      "ReasonDetails": "New queue message detected on 'stuff-processor'.",
      "StartTime": "2017-02-22T00:07:40.6017341 00:00",
      "OutputBlob": {
        "ContainerName": "azure-webjobs-hosts",
        "BlobName": "output-logs/1ffd3c7b012c043438ed12af0d39ae2c9.txt"
      },
      "ParameterLogBlob": {
        "ContainerName": "azure-webjobs-hosts",
        "BlobName": "output-logs/1cf2c1b012sa0d3438ee12daf0d39ae2c9.params.txt"
      },
      "LogLevel": "Info",
      "HostInstanceId": "d1825bdb-d92a-4657-81a4-36253e01ea5e",
      "HostDisplayName": "ItemProcessor",
      "SharedQueueName": "azure-webjobs-host-490daea03c70316f8aa2509438afe8ef",
      "InstanceQueueName": "azure-webjobs-host-d18252sdbd92a4657d1a436253e01ea5e",
      "Heartbeat": {
        "SharedContainerName": "azure-webjobs-hosts",
        "SharedDirectoryName": "heartbeats/490baea03cfdfd0416f8aa25aqr438afe8ef",
        "InstanceBlobName": "zd1825bdbdsdgga465781a436q53e01ea5e",
        "ExpirationInSeconds": 45
      },
      "WebJobRunIdentifier": {
        "WebSiteName": "myappengine",
        "JobType": "Continuous",
        "JobName": "ItemProcessor",
        "RunId": ""
      }
    }

What I can't find though are logs which show detail for a particular queue item where processing fails due to an exception and is placed in the poison queue. I'm looking for these in an attempt to further troubleshoot the apparent ignoring of the MaxDequeueCount property. Is this logged?

UPDATE: I found the post Azure WebJobs with Storage Queue and Exceptions and it contained the following screenshot:

Webjob console

That screenshot shows the standard "New queue message detected..." messages (which I see both on Azure and running locally in the emulator) and it also shows "Message has reached MaxDequeueCount of X...Moving message to queue 'xyz-poison'", which I only see locally in the emulator. Does the Azure-based runtime not show this info for some reason? I never see the poisoning-related messages like this when running locally in the console window or via the webjobs dashboard or when running on Azure.

Solution

Are you using Azure Storage version 8.* ? It contains a breaking change.

When a queue message object gets inserted into a message queue in Azure Storage 8, the Message Id of the original queue message object is overwritten with the unique message id of the newly inserted message.

This means that it looses its reference to the poison message and can no longer remove it.

https://github.com/Azure/azure-webjobs-sdk/issues/985

As for logging, 2 options:

  1. Create your own QueueProcessorFactory and QueueProcessor - you can override the method that handles poison messages to produce more debug output. There you can also fix the issue altogether if you'd like to stay on Azure Storage 8.*.

  2. Register a trace monitor. https://github.com/Azure/azure-webjobs-sdk-extensions/wiki/Error-Monitoring#tracemonitor

    var traceMonitor = new TraceMonitor()
    .Filter(p => true, "Trace Handler")
    .Subscribe(TraceHandler.Process);
    
    config.Tracing.Tracers.Add(traceMonitor);