Azure Blob Upload Speed – Don’t use OpenWriteAsync()

Uploading to Azure Storage with the .NET Client can often require some customisation to ensure acceptable performance. First, let’s look at some options in BlobRequestOptions:

SingleBlobUploadThresholdInBytes

The size a blob must exceed before it’ll be uploaded in ‘chunks’ (blocks) rather than in a single request. This only applies to the UploadFrom…() methods, not to a stream opened with OpenWrite() or OpenWriteAsync().

Minimum: 1MB, or 1,048,576 bytes.

ParallelOperationThreadCount

The maximum number of upload operations to perform in parallel, for a single blob.

There’s also a useful property on the Blob itself:

StreamWriteSizeInBytes

The size of each block to upload. So, for example, if you set it to 1MB, a 4MB file will be chunked into 4 separate 1MB blocks.

Default value: 4MB, or 4,194,304 bytes.
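To make the block arithmetic concrete, here is a minimal sketch of how many blocks a blob of a given size would need for a given StreamWriteSizeInBytes. BlockMath and BlockCount are hypothetical names used for illustration only; they are not part of the Azure Storage client.

```csharp
using System;

public static class BlockMath
{
    // Hypothetical helper (not part of the Azure SDK): the number of blocks
    // needed to upload blobSizeInBytes when each block holds at most
    // blockSizeInBytes (i.e. the StreamWriteSizeInBytes value).
    public static long BlockCount(long blobSizeInBytes, int blockSizeInBytes)
    {
        if (blobSizeInBytes < 0)
            throw new ArgumentOutOfRangeException(nameof(blobSizeInBytes));
        if (blockSizeInBytes <= 0)
            throw new ArgumentOutOfRangeException(nameof(blockSizeInBytes));

        // Ceiling division: a partial final block still counts as one block.
        return (blobSizeInBytes + blockSizeInBytes - 1) / blockSizeInBytes;
    }
}
```

For the example above, BlockMath.BlockCount(4L * 1024 * 1024, 1024 * 1024) gives 4. A file one byte larger would need a fifth, nearly empty block.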

// Options
var options = new BlobRequestOptions
{
    SingleBlobUploadThresholdInBytes = 1024 * 1024, //1MB, the minimum
    ParallelOperationThreadCount = 1
};

client.DefaultRequestOptions = options;

// Blob stream write
blob.StreamWriteSizeInBytes = 1024 * 1024;

A more thorough explanation is available here: https://www.simple-talk.com/cloud/platform-as-a-service/azure-blob-storage-part-4-uploading-large-blobs/

When it’s all ignored – OpenWriteAsync()

You can set all of these options, but if you use blob.OpenWriteAsync(), the client seems to ignore them and upload in 5KB chunks as you write to the stream. This will absolutely destroy performance if you’re uploading larger files or a lot of files. Instead, you’ll need to use the blob.UploadFromStreamAsync() method:

// Buffer to a memory stream so that the client uploads in one chunk instead of multiple
// By default the client seems to upload in 5KB chunks
using (var memStream = new MemoryStream())
{
    // Save to memory stream
    await saveActionAsync(memStream);

    // Upload to Azure
    memStream.Seek(0, SeekOrigin.Begin);
    await blob.UploadFromStreamAsync(memStream);
}

If you use the UploadFromStreamAsync() method, the settings you set will be honoured and blobs will be uploaded in a much more efficient manner.
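If you’d rather not change the client-wide defaults, the options can also be passed per call. The following is only a sketch, assuming the WindowsAzure.Storage package and its UploadFromStreamAsync(Stream, AccessCondition, BlobRequestOptions, OperationContext) overload. BlobUploadSketch and UploadWithOptionsAsync are hypothetical names, and the values shown are illustrative rather than tuned recommendations.

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Blob;

public static class BlobUploadSketch
{
    // Hypothetical helper (not part of the SDK): uploads an already-buffered,
    // seekable stream with per-call options instead of client defaults.
    public static async Task UploadWithOptionsAsync(CloudBlockBlob blob, Stream source)
    {
        var options = new BlobRequestOptions
        {
            // Blobs above this size are split into blocks (1MB is the minimum).
            SingleBlobUploadThresholdInBytes = 1024 * 1024,
            // Illustrative value: upload up to 4 blocks of this blob at once.
            ParallelOperationThreadCount = 4
        };

        // Size of each uploaded block.
        blob.StreamWriteSizeInBytes = 1024 * 1024;

        // No access condition or operation context is supplied here.
        await blob.UploadFromStreamAsync(source, null, options, null);
    }
}
```

Passing the options per call keeps the tuning next to the upload it affects, at the cost of repeating it wherever blobs are uploaded.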

2 comments

I need to download a blob from a container and then encrypt and upload it to a different container. To do that I can use DownloadToStreamAsync and pass into that method the “upload stream” which is the stream object returned from the destination blob’s call to OpenWriteAsync. The OpenWriteAsync stream is then passed into a CryptoStream. The end result is that my call to DownloadToStreamAsync pulls down data, encrypts it, and then immediately uploads it to blob storage. The memory use is trivial because I’m never holding the entire blob in memory at any given point.

I’m not sure how I could accomplish this with UploadFromStreamAsync without an intermediate step where I save the blob to disk and then upload it by reading from that file stream. Any thoughts?

From my testing, OpenWrite() worked correctly. You lose the async, but memory usage should be much lower. That said, there have been a few Azure SDK updates and the whole OpenWriteAsync() issue might be fixed.
