Download file from Azure Blob Storage hangs when retrieving final 1-2 blocks #729

@mfontes

Description

Investigative information

Please provide the following:

  • Timestamp: 2024-01-18 22:59:00.641
  • Invocation ID: f3822fe-54f2-470a-9d82-64e057ae075e
  • Region: West US 2

Note: I am providing a particular instance where this failed, but this is 100% reproducible for me and my team members when running the FunctionApp locally.

Repro steps

  1. Create a BlockBlobClient using package @azure/storage-blob
  2. Call downloadToBuffer() on the blockBlobClient and store the result in a variable

Expected behavior

  1. Expect variable to have file contents

Actual behavior

  1. File download hangs for an unacceptable length of time: over 3 minutes for a 1.3 MB file
  2. After the ~3 minute hang, the download completes and the data is stored in the variable
  3. The hang duration seems to scale with file size: a 2 MB file takes over 5 minutes

Known workarounds

The download does not hang for smaller files: 16 KB and 32 KB files download without issue. Files larger than about 1.3 MB hang.

Related information

I realize this smells like a bug in blob storage; however, I am filing it here because the behavior is only reproducible when downloading the blob from the function host. I have tried downloading the blob in a few ways:

  1. Using the NodeJS SDK for Azure blob storage, calling BlockBlobClient.download()
  2. Using the SDK, calling BlockBlobClient.downloadToBuffer() using various blockSizes
  3. Using a simple http.get request to download the file using a SAS URL
  4. Using an HTTP HEAD request to get the file's content length, followed by a series of HTTP GET byte-range requests to retrieve the blob in chunks (again using a SAS URL)

In each case, the blob download completes in around 1 second when run from a simple NodeJS script. However, they all seem to hang when run from a function host. In cases 1, 2, and 4, I was able to write progress to my console (using context.log) and consistently saw that the whole file was retrieved except the last 1 or 2 chunks of data, at which point the download would hang.
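
For reference, approach 4 can be sketched roughly as follows. This is a minimal sketch, not the exact code used in the tests above: `planByteRanges` and `downloadInChunks` are hypothetical helper names, `sasUrl` is a placeholder for a blob SAS URL, and it uses the global `fetch` available in Node 18 rather than `http.get`.

```javascript
// Hypothetical helper: split a blob of `totalBytes` into inclusive byte ranges
// of at most `chunkSize` bytes, formatted as HTTP Range header values.
function planByteRanges(totalBytes, chunkSize) {
  if (!Number.isInteger(totalBytes) || totalBytes < 0) {
    throw new RangeError("totalBytes must be a non-negative integer");
  }
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const ranges = [];
  for (let start = 0; start < totalBytes; start += chunkSize) {
    // Range ends are inclusive, so the last byte of a chunk is start + size - 1.
    const end = Math.min(start + chunkSize, totalBytes) - 1;
    ranges.push(`bytes=${start}-${end}`);
  }
  return ranges;
}

// Sketch of the download loop (assumption: `sasUrl` grants read access).
// Chunks are fetched one at a time so the log shows exactly where a stall occurs.
async function downloadInChunks(sasUrl, chunkSize, log = console.log) {
  const head = await fetch(sasUrl, { method: "HEAD" });
  const totalBytes = Number(head.headers.get("content-length"));
  const parts = [];
  let loadedBytes = 0;
  for (const range of planByteRanges(totalBytes, chunkSize)) {
    const res = await fetch(sasUrl, { headers: { Range: range } });
    if (!res.ok) {
      throw new Error(`Request for ${range} failed with status ${res.status}`);
    }
    const part = Buffer.from(await res.arrayBuffer());
    loadedBytes += part.length;
    log(`progress! ${loadedBytes} of ${totalBytes}`);
    parts.push(part);
  }
  return Buffer.concat(parts);
}
```

Inside a function handler, `context.log` can be passed as the third argument so progress shows up in the host's console.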

For a 1.3 MB file, it would hang for 3.5 minutes before completing successfully. For a 2.1 MB file, the hang was longer than my 5 minute function timeout, so I could not see whether it would eventually complete.

  • Programming language used - NodeJS 18.15.0
  • Bindings used - Service Bus Queue trigger
  • Using @azure/functions: ^4.1.0 programming model

Sample output given the download attempted using the code from Source below:

progress! 262144 of 1311223
progress! 262647 of 1311223
progress! 524791 of 1311223
progress! 786935 of 1311223
progress! 1049079 of 1311223
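
The numbers in this output can be checked with a little arithmetic. The sketch below uses `analyzeProgress`, a hypothetical helper introduced only for illustration, to compute how many bytes each progress event added and how many were still outstanding when the log stops. With a 256 KiB block size, the 1311223-byte file splits into 5 full blocks plus a 503-byte tail. The logged values are consistent with the tail block arriving second and exactly one full block (262144 bytes) never being reported before the hang.

```javascript
// Hypothetical helper: given the cumulative values passed to onProgress,
// return the bytes added by each event and the bytes still outstanding.
function analyzeProgress(loadedValues, totalBytes) {
  const deltas = loadedValues.map(
    (value, i) => value - (i === 0 ? 0 : loadedValues[i - 1])
  );
  const lastLoaded = loadedValues.length
    ? loadedValues[loadedValues.length - 1]
    : 0;
  return { deltas, remaining: totalBytes - lastLoaded };
}

const totalBytes = 1311223;
const blockSize = 256 * 1024; // 262144, the blockSize used in the source below
const logged = [262144, 262647, 524791, 786935, 1049079];

// Block layout: 5 full blocks and a 503-byte tail.
console.log(Math.floor(totalBytes / blockSize), totalBytes % blockSize);

// Per-event sizes and what is missing when the output stops.
const { deltas, remaining } = analyzeProgress(logged, totalBytes);
console.log(deltas);    // [ 262144, 503, 262144, 262144, 262144 ]
console.log(remaining); // 262144
```
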

Source

Download file:

  // Assumes `account`, `key`, `containerName`, `fileName`, and the function's
  // `context` are already in scope (CommonJS shown; use `import` for ESM).
  const { BlobServiceClient, StorageSharedKeyCredential } = require("@azure/storage-blob");

  const blobStorage = new BlobServiceClient(
    `https://${account}.blob.core.windows.net`,
    new StorageSharedKeyCredential(account, key)
  );

  const blobClient = blobStorage
    .getContainerClient(containerName)
    .getBlockBlobClient(fileName);

  // Fetch the total size up front so progress can be logged against it.
  const { contentLength } = await blobClient.getProperties();
  const fileBuffer = await blobClient.downloadToBuffer(0, undefined, {
    // I've tried several block sizes (1k, 16k, 128k, 256k, 1mb, 2mb, and 4mb);
    // all produce the same issue.
    blockSize: 256 * 1024,
    onProgress: function ({ loadedBytes }) {
      context.log(`progress! ${loadedBytes} of ${contentLength}`);
    },
  });

  // This point is reached after 3.5 minutes for a 1.3 MB file
  const asString = fileBuffer.toString();
  return JSON.parse(asString);
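
To pin down exactly where the time goes, the `onProgress` callback can also record the gap between events. The following is a minimal diagnostic sketch, not part of the original repro: `createStallTracker` is a hypothetical helper, and its `now` parameter exists only so the clock can be replaced in tests.

```javascript
// Hypothetical helper: builds an onProgress callback that logs how many bytes
// each event added and how long it took to arrive since the previous event.
function createStallTracker(log, totalBytes, now = Date.now) {
  let lastTime = now();
  let lastLoaded = 0;
  return {
    // Same shape as the SDK's onProgress callback: receives { loadedBytes }.
    onProgress({ loadedBytes }) {
      const currentTime = now();
      log(
        `progress! ${loadedBytes} of ${totalBytes} ` +
          `(+${loadedBytes - lastLoaded} bytes after ${currentTime - lastTime} ms)`
      );
      lastTime = currentTime;
      lastLoaded = loadedBytes;
    },
    // Milliseconds since the last progress event (or since creation).
    idleMs() {
      return now() - lastTime;
    },
  };
}

// Usage with the snippet above (assumes `context` and `contentLength` exist):
//   const tracker = createStallTracker((msg) => context.log(msg), contentLength);
//   await blobClient.downloadToBuffer(0, undefined, {
//     blockSize: 256 * 1024,
//     onProgress: tracker.onProgress,
//   });
```

The callbacks close over their state instead of using `this`, so `tracker.onProgress` can be passed directly as an option without binding.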

host.json:

{
  "aggregator": {
    "batchSize": 1000,
    "flushTimeout": "00:00:30"
  },
  "extensions": {
    "serviceBus": {
      "maxConcurrentCalls": 1
    }
  },
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[4.0.0, 5.0.0)"
  },
  "functionTimeout": "00:05:00",
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": false
      }
    }
  },
  "version": "2.0"
}
