17 Mar 2021

Download SFTP file and send to Azure Blob Storage with minimal memory usage


In Azure we use the Renci SSH.NET library to download a file from an SFTP server and put it into Blob Storage.

Our first approach was simple: download the file into memory in its entirety, then upload it to Blob Storage.

using (var memStream = new MemoryStream())
{
    // Download the whole file into memory
    sftpClient.DownloadFile("sftpFilePath", memStream);
    // Rewind so the upload reads from the start of the stream
    memStream.Position = 0;
    var container = cloudBlobClient.GetContainerReference("blobContainerName");
    var blob = container.GetBlockBlobReference("blobName");
    await blob.UploadFromStreamAsync(memStream);
}

However, we soon realized that this approach consumes as much memory as the size of the file. We fixed this by opening a writeable stream to the blob instead, and passing it directly to the SFTP client's DownloadFile method:

var container = cloudBlobClient.GetContainerReference("blobContainerName");
var blob = container.GetBlockBlobReference("blobName");
using (var blobStream = blob.OpenWrite())
    sftpClient.DownloadFile("sftpFilePath", blobStream);

Note, however, that with this approach you may run into the limit of 100,000 uncommitted blocks that Azure Blob Storage enforces. To stay under it, you have to choose the block size, which is the size of each "chunk" of the blob that the SDK uploads. You do this by setting the blob's StreamWriteSizeInBytes property, which can be between 16 KB and 100 MB. So for 4 MB blocks, do:

blob.StreamWriteSizeInBytes = 4 * 1024 * 1024;
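
To get a feel for how the block size interacts with the 100,000 uncommitted block limit, here is a small sketch of the arithmetic. The BlockSizing class and its method names are hypothetical helpers made up for illustration; they are not part of the Azure SDK.

```csharp
using System;

// Hypothetical helper (not part of any SDK) for block-size arithmetic.
public static class BlockSizing
{
    // Uncommitted block limit per blob, as described above.
    public const long MaxUncommittedBlocks = 100_000;

    // Largest number of bytes that fits within the limit for a given block size.
    public static long MaxStreamableBytes(long blockSizeBytes) =>
        blockSizeBytes * MaxUncommittedBlocks;

    // Smallest block size (in bytes, rounded up) that keeps a file of the
    // given size within the limit.
    public static long MinBlockSizeBytes(long fileSizeBytes) =>
        (fileSizeBytes + MaxUncommittedBlocks - 1) / MaxUncommittedBlocks;
}
```

For example, BlockSizing.MaxStreamableBytes(4 * 1024 * 1024) is 419,430,400,000 bytes, so 4 MB blocks leave room for a file of roughly 390 GiB before the block limit becomes a problem.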

However, in the version of the Azure Storage library we were using, we noticed that this setting seemed to be ignored when using OpenWrite, and the actual block size was something like 5 KB! This meant we hit the uncommitted block error for files over 450 MB in size (100,000 blocks of roughly 5 KB add up to less than 500 MB). So, to handle very large file uploads, we had to revert to using UploadFromStreamAsync (which does honour the StreamWriteSizeInBytes property), and do some extra work to surface a readable stream to pass to it.
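
As a rough sketch of what that can look like: one possible way to get a readable stream is SSH.NET's SftpClient.OpenRead, which returns a stream over the remote file. This is an assumption about one workable approach, not necessarily the exact code we ended up with. The SftpToBlob class, the CopyAsync method and its parameter names are hypothetical, and the namespace assumes the Microsoft.Azure.Storage.Blob package (older projects use Microsoft.WindowsAzure.Storage.Blob instead).

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Storage.Blob; // assumption: adjust to the storage package you use
using Renci.SshNet;

// Hypothetical helper for illustration only.
public static class SftpToBlob
{
    public static async Task CopyAsync(
        SftpClient sftpClient,
        CloudBlobClient cloudBlobClient,
        string sftpFilePath,
        string containerName,
        string blobName)
    {
        var container = cloudBlobClient.GetContainerReference(containerName);
        var blob = container.GetBlockBlobReference(blobName);

        // Upload in 4 MB blocks to stay well under the block limit
        blob.StreamWriteSizeInBytes = 4 * 1024 * 1024;

        // Open a readable stream over the remote file instead of
        // downloading it into memory first
        using (var sftpStream = sftpClient.OpenRead(sftpFilePath))
        {
            await blob.UploadFromStreamAsync(sftpStream);
        }
    }
}
```

The SFTP client needs to be connected before calling this, and it is worth checking memory usage and block sizes against your own library versions, since the behaviour described above varied between them.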

1 Feb 2021

Find and kill a Windows 10 process on a given port


FIND the naughty process PID (it is the last column of the output):

netstat -a -n -o | findstr *PORT_#_HERE*

KILL the naughty process:

taskkill /f /pid *PID_#_HERE*
