Sync Between Azure File Share and Azure Blob Container

Updated – 28/07/2021 – The automation tool was updated to take into consideration the container soft delete feature which is enabled by default for Blob storage with a 7 day retention period. Please check this section for more details.

In this article, I will share with you how to sync and copy between an Azure file share and an Azure blob container.

Introduction

A while ago, I wrote about how to copy data from one Azure storage account in one subscription to another storage account in a different Azure subscription. I also shared how to sync between Azure Blob Storage containers and between Azure file shares.

Suppose you store data in an Azure file share, and you have a line-of-business (LOB) application that can read only from a blob container and not from an SMB file share. In another scenario, you use Azure File Sync (AFS) to sync on-premises servers to an Azure file share, and you need the same data in an Azure blob container. If you have other scenarios, please leave a comment below.

For these kinds of scenarios, you have a couple of options. At the time of this writing, you could use Azure Data Box Gateway, which can sync with blobs. Other tools such as AzCopy, Azure Batch, and Azure Data Factory can also help you move data back and forth. However, these tools come with some fidelity loss that you should be aware of: permissions are lost, and timestamps such as the last modified time are lost or changed.

For the purpose of this article, I will use AzCopy, a command-line utility that you can use to copy or sync blobs and files to or from a storage account. I will run AzCopy inside Azure Container Instances, and an Azure Automation Runbook will create and start the container. In this way, we can run the container on a simple schedule to copy the data and only get billed for the time the container was used.

If you are new to the AzCopy tool, make sure to check the getting started document from Microsoft here. The good news is that Microsoft added sync support to AzCopy starting with version 10.3.0. However, at the time of this writing, sync must happen between a source and destination of the same type, e.g. either file <-> file, or directory/container <-> directory/container, but not between a file share and a blob container. Thus, I will use the copy command of AzCopy to copy data from an Azure file share to an Azure blob container.
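To make this limitation concrete, here is a minimal sketch of how a script could decide between azcopy sync and azcopy copy for a given pair of URLs. The function name Get-AzCopyVerb and the sample URLs are hypothetical. The rule it encodes is only the one described above (same service type on both sides, as of this writing); it is not part of AzCopy itself.

```powershell
# Hypothetical helper: returns "sync" when source and destination are the same
# storage service type (the limitation described above), otherwise "copy".
function Get-AzCopyVerb {
    param (
        [Parameter(Mandatory = $true)][string] $SourceUrl,
        [Parameter(Mandatory = $true)][string] $DestinationUrl
    )

    # Classify a URL by its host name (*.file.core.windows.net or *.blob.core.windows.net)
    $getService = {
        param ([string] $Url)
        $hostName = ([System.Uri] $Url).Host
        if ($hostName -like '*.file.core.windows.net') { return 'file' }
        if ($hostName -like '*.blob.core.windows.net') { return 'blob' }
        return 'other'
    }

    $sourceService = & $getService $SourceUrl
    $destinationService = & $getService $DestinationUrl

    if ($sourceService -eq $destinationService -and $sourceService -ne 'other') { 'sync' } else { 'copy' }
}

# File share -> blob container: the types differ, so the helper falls back to "copy"
Get-AzCopyVerb -SourceUrl 'https://mystorage.file.core.windows.net/myshare' `
               -DestinationUrl 'https://mystorage.blob.core.windows.net/mycontainer'
```

For the file share to blob container case covered in this article, the helper returns copy, which is why the Runbook below uses azcopy copy.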

Prerequisites

To follow this article, you need to have the following:

1) Azure subscription – If you don’t have an Azure subscription, you can create a free one here.

2) One or two storage accounts, in the same region and subscription or in different regions and subscriptions.

3) At least one container in blob storage and one Azure file share, either in the same storage account or across two different storage accounts.

4) Some files in the Azure file share. Alternatively, you can sync on-premises servers to the Azure file share with Azure File Sync.

Get started

First, we need to create an Azure Automation account that will automate the copy process without user interaction. It also keeps access to your storage account secure, because the access keys are never exposed to users.

Create Automation Account

In this step, I will create an Azure automation resource with a Run As account. Run As accounts in Azure Automation are used to provide authentication for managing resources in Azure with the Azure cmdlets. When you create a Run As account, it creates a new service principal user in Azure Active Directory (Azure AD) and assigns the Contributor role to the service principal at the subscription level.

Open the Azure Portal, click All services found in the upper left-hand corner. In the list of resources, type Automation. As you begin typing, the list filters based on your input. Select Automation Accounts.

Click +Add. Enter the automation account name, choose the right subscription, resource group, location, and then click Create.

Import modules from Gallery

In the next step, you need to import the required modules from the Modules gallery. In your list of Automation Accounts, select the account that you created in the previous step. Then from your automation account, select Modules under Shared Resources. Click the Browse Gallery button to open the Browse Gallery page. You need to import the following modules from the Modules gallery in the order given below:

  1. Az.Accounts
  2. Az.ContainerInstance
  3. Az.Storage

At the time of this writing, AzCopy is still not available in Azure Automation Runbooks. For this reason, I will create an Azure Container Instance from an image that includes AzCopy, so we can automate the entire copy process.

What you should know

If you are using the Az.ContainerInstance PowerShell module version 2.0 or later, you might be facing the following issue:

New-AzContainerGroup : Cannot process argument transformation on parameter ‘ImageRegistryCredential’. Cannot convert value “peterdavehello/azcopy:latest” to type “Microsoft.Azure.PowerShell.Cmdlets.ContainerInstance.Models.Api20210301.IImageRegistryCredential[]”.

After a long investigation, I found that this is a known issue that appeared after Microsoft updated the Az.ContainerInstance PowerShell module to version 2.0. As a workaround, use version 1.0.3 of the module:

Browse to the following page for the Az.ContainerInstance PowerShell module version 1.0.3, click the Azure Automation tab, and then select Deploy to Azure Automation. After that, the script described in this article should work.

If you have already imported the Az.ContainerInstance PowerShell module version 2.0 or later, delete it from the Modules section of your Azure Automation account first, and then import version 1.0.3.
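If you prefer to script the workaround instead of using the portal, the following is a rough sketch that uses the Az.Automation cmdlets from an authenticated PowerShell session. The resource group and automation account names are placeholders, and I am assuming the usual PowerShell Gallery package URL format for version 1.0.3, so please adjust both to your environment.

```powershell
# Sketch only: the resource group and account names below are placeholders (assumptions).
$resourceGroupName     = 'my-automation-rg'
$automationAccountName = 'my-automation-account'

# Remove the 2.x module if it was already imported into the automation account
Remove-AzAutomationModule -ResourceGroupName $resourceGroupName `
    -AutomationAccountName $automationAccountName `
    -Name 'Az.ContainerInstance' -Force

# Import version 1.0.3 from the PowerShell Gallery package feed
# (the URL format is an assumption based on the public gallery feed)
New-AzAutomationModule -ResourceGroupName $resourceGroupName `
    -AutomationAccountName $automationAccountName `
    -Name 'Az.ContainerInstance' `
    -ContentLinkUri 'https://www.powershellgallery.com/api/v2/package/Az.ContainerInstance/1.0.3'
```

The import runs asynchronously, so check the Modules section of the automation account until the module shows as available before you test the Runbook.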

I will be updating the script to work with the latest Az.ContainerInstance PowerShell module once the bug is fixed by Microsoft.

Create PowerShell Runbook

In this step, you can create multiple Runbooks based on which set of Azure file shares you want to sync/copy to the Azure blob container. PowerShell Runbooks are based on Windows PowerShell. You directly edit the code of the Runbook using the text editor in the Azure portal. You can also use any offline text editor such as Visual Studio Code and import the Runbook into Azure Automation.

From your automation account, select Runbooks under Process Automation. Click the “+ Create a runbook” button to open the Create a runbook blade.

In this example, I will create a Runbook to copy all file and directory changes from a specific Azure file share to a specific blob container. You can be as creative as you want and cover multiple Azure file shares, blob containers, directories, and so on.

Edit the Runbook

Once you have created the Runbook, edit it and add the script that selects which Azure file share to copy to the Azure blob container. Of course, you can write scripts that suit your environment.

As mentioned earlier, in this example I will create a Runbook that reads all the files and directories in a specific Azure file share and copies the data over to a specific blob container. To maintain a high level of security, I will NOT pass the storage account keys to AzCopy. Instead, the script reads the current key at run time only to create a time-limited SAS token URI for each service individually (file share and blob container), and each SAS token expires automatically after 60 minutes. Because the key is read on every run, the automation process won’t break if you regenerate your storage account keys in the future.

If you have soft delete enabled on blob storage (which is the default now), you must add the “--overwrite=ifSourceNewer” option to the “azcopy” command. Otherwise, AzCopy overwrites identical/unchanged files by default, each overwrite leaves a soft-deleted copy behind for the retention period, and your storage costs balloon rapidly. The script was updated to take the container soft delete feature into consideration.

Please note that you can also update the parameter section and copy between storage accounts across different subscriptions.

The script is as follows:

<#
.DESCRIPTION
A Runbook example that checks for file and directory changes recursively
in a specific Azure File Share and then copies the data to a blob container by using the AzCopy tool,
which runs in a container in Azure Container Instances, using a Service Principal in Azure AD.

.NOTES
Filename : Copy-FileShareToBlobContainer
Author   : Charbel Nemnom
Version  : 1.5
Date     : 13-January-2021
Updated  : 24-September-2021

.LINK
To provide feedback or for further assistance please visit:
https://charbelnemnom.com 
#>

Param (
    [Parameter(Mandatory = $true)][ValidateNotNullOrEmpty()]
    [String] $AzureSubscriptionId,
    [Parameter(Mandatory = $true)][ValidateNotNullOrEmpty()]
    [String] $storageAccountRG,
    [Parameter(Mandatory = $true)][ValidateNotNullOrEmpty()]
    [String] $storageAccountName,    
    [Parameter(Mandatory = $true)][ValidateNotNullOrEmpty()]
    [String] $storageContainerName,
    [Parameter(Mandatory = $true)][ValidateNotNullOrEmpty()]
    [String] $storageFileShareName
)

$connectionName = "AzureRunAsConnection"

Try {
    # Get the connection "AzureRunAsConnection"
    $servicePrincipalConnection = Get-AutomationConnection -Name $connectionName
    Write-Output "Logging in to Azure..."
    Connect-AzAccount -ServicePrincipal `
        -TenantId $servicePrincipalConnection.TenantId `
        -ApplicationId $servicePrincipalConnection.ApplicationId `
        -CertificateThumbprint $servicePrincipalConnection.CertificateThumbprint
}
Catch {
    If (!$servicePrincipalConnection) {
        $ErrorMessage = "Connection $connectionName not found..."
        throw $ErrorMessage
    }
    Else {
        Write-Error -Message $_.Exception
        throw $_.Exception
    }
}

Select-AzSubscription -SubscriptionId $AzureSubscriptionId

# Get Storage Account Key
$storageAccountKey = (Get-AzStorageAccountKey -ResourceGroupName $storageAccountRG -AccountName $storageAccountName).Value[0]

# Set AzStorageContext
$destinationContext = New-AzStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $storageAccountKey

# Generate Container SAS URI Token which is valid for 60 minutes ONLY with read and write permission
$blobContainerSASURI = New-AzStorageContainerSASToken -Context $destinationContext `
    -ExpiryTime (Get-Date).AddSeconds(3600) -FullUri -Name $storageContainerName -Permission rw

# Generate File Share SAS URI Token which is valid for 60 minutes ONLY with read and list permission
$fileShareSASURI = New-AzStorageShareSASToken -Context $destinationContext `
    -ExpiryTime (Get-Date).AddSeconds(3600) -FullUri -ShareName $storageFileShareName -Permission rl

# Create the AzCopy command (each SAS URI is wrapped in single quotes)
$containerSASURI = "'" + $blobContainerSASURI + "'"
$shareSASURI = "'" + $fileShareSASURI + "'"
$command = "azcopy copy " + $shareSASURI + " " + $containerSASURI + " --recursive=true --overwrite=ifSourceNewer"

$jobName = $storageAccountName + "-" + $storageFileShareName + "-azcopy-job"

# Set the AZCOPY_BUFFER_GB value at 2 GB which would prevent the container from crashing.
$envVars = @{'AZCOPY_BUFFER_GB'='2'}

# Create Azure Container Instance and run the AzCopy job
# The container image (peterdavehello/azcopy:latest) is publicly available on Docker Hub and has the latest AzCopy version installed
# You could also create your own private container image and use it instead
# When you create a new container instance, the default compute resources are set to 1vCPU and 1.5GB RAM
# We recommend starting with 2 vCPU and 4 GB memory for large file shares (E.g. 3TB)
# You may need to adjust the CPU and memory based on the size and churn of your file share
New-AzContainerGroup -ResourceGroupName $storageAccountRG `
    -Name $jobName -Image peterdavehello/azcopy:latest -OsType Linux `
    -Cpu 2 -MemoryInGB 4 -Command $command `
    -RestartPolicy Never -EnvironmentVariable $envVars

Write-Output ("")

Save the script in the Runbook editor.

Then test the script using the “Test pane” to verify it’s working as intended before you publish it.

Once the test is completed successfully, publish the Runbook by clicking Publish. This is a very important step, because only a published Runbook can be linked to a schedule.

Schedule the Runbook

In the final step, you need to schedule the Runbook to run based on your desired time to copy the changes from Azure file share to Azure blob container.

Within the same Runbook that you created in the previous step, select Schedules and then click + Add schedule.

For example, to run the Runbook every 3 hours, create a schedule with Recur every 3 Hours and Set expiration set to No. You can also run the Runbook on demand if you wish.

While scheduling the Runbook, you can pass on the parameters required for the PowerShell Script. In my example, I need to specify the Azure Subscription ID, Resource Group Name, Storage Account Name, Azure Blob Container Name, and the Azure File Share Name that I want to copy over. The sample script takes those parameters as input.

Once done, click OK twice.
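The same schedule can also be created from PowerShell instead of the portal. The following is a minimal sketch using the Az.Automation cmdlets; every name, the subscription ID, and the schedule name are placeholders that I made up for illustration, and the Runbook name assumes you named it after the script file above.

```powershell
# Sketch only: all names and IDs below are placeholders (assumptions).
$resourceGroupName     = 'my-automation-rg'
$automationAccountName = 'my-automation-account'
$runbookName           = 'Copy-FileShareToBlobContainer'
$scheduleName          = 'Every-3-Hours'

# Create a schedule that recurs every 3 hours; no expiry time is set.
# The start time is placed a few minutes in the future.
New-AzAutomationSchedule -ResourceGroupName $resourceGroupName `
    -AutomationAccountName $automationAccountName `
    -Name $scheduleName `
    -StartTime (Get-Date).AddMinutes(10) `
    -HourInterval 3

# Runbook parameters, matching the Param block of the script above
$runbookParameters = @{
    AzureSubscriptionId  = '00000000-0000-0000-0000-000000000000'
    storageAccountRG     = 'my-storage-rg'
    storageAccountName   = 'mystorageaccount'
    storageContainerName = 'mycontainer'
    storageFileShareName = 'myshare'
}

# Link the schedule to the published Runbook and pass the parameters
Register-AzAutomationScheduledRunbook -ResourceGroupName $resourceGroupName `
    -AutomationAccountName $automationAccountName `
    -RunbookName $runbookName `
    -ScheduleName $scheduleName `
    -Parameters $runbookParameters
```

Keeping the parameters in a hashtable makes it easy to register the same Runbook several times with different file share and container pairs.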

Test the Runbook

In this quick demo, I will test the Runbook and request an on-demand run to copy the data from an Azure file share to an Azure blob container. This scenario simulates a user adding or modifying files directly in the Azure file share and/or through Azure File Sync, after which the data is copied to the Azure blob container automatically.

Monitor the Runbook

You can monitor the success or failure of these schedules using the “Jobs” tab of Runbooks under Resources. You can also see the next scheduled run. In my example, the Runbook runs every 3 hours.
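If you would rather check the job history from PowerShell, here is a short sketch using the Az.Automation cmdlets. The resource group, automation account, and Runbook names are placeholders, so treat this as a starting point rather than a finished monitoring solution.

```powershell
# Sketch only: names are placeholders (assumptions).
$resourceGroupName     = 'my-automation-rg'
$automationAccountName = 'my-automation-account'
$runbookName           = 'Copy-FileShareToBlobContainer'

# Get the five most recent jobs of the Runbook
$jobs = @(Get-AzAutomationJob -ResourceGroupName $resourceGroupName `
        -AutomationAccountName $automationAccountName `
        -RunbookName $runbookName |
    Sort-Object -Property CreationTime -Descending |
    Select-Object -First 5)

# Show their status and timings
$jobs | Format-Table JobId, Status, StartTime, EndTime

# Show the output streams of the latest job, if there is one
if ($jobs.Count -gt 0) {
    Get-AzAutomationJobOutput -ResourceGroupName $resourceGroupName `
        -AutomationAccountName $automationAccountName `
        -Id $jobs[0].JobId -Stream Any
}
```

Note that a completed Runbook job only tells you that the container was created and started; the result of the AzCopy job itself is in the container logs.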

That’s it, there you have it!

The script is currently at version 1.5. If you have any feedback or changes that everyone should benefit from, please feel free to leave a comment below.

How it works…

When the runbook runs for the first time, a new container will be created and then terminated. The container will perform the copy batch job, which is not meant to run for a long time. So this container runs to complete the copy command and then stops. To prevent constant restart on completion, I have added the “-RestartPolicy Never” on the container which means it doesn’t restart when finished. This is a great way to run a batch copy job.

When the runbook runs for the second time and so on, it will start the existing container instead of creating a new container, and then run the command, which includes fresh file share and blob container SAS URI tokens that are valid for 60 minutes only. In this way, you get billed only for the time the container was used, and you don’t expose your storage account access keys.

Please note that you may need to increase the SAS URI Token expiry time based on the amount of data you have to copy. The SAS must be valid throughout the whole job duration since we need it to interact with the service. I would suggest padding the expiration a bit just to be safe.
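As a rough illustration of padding the expiry, the sketch below computes an expiry time from an estimated job duration plus a safety margin. The function name Get-SasExpiryTime and the default 30-minute padding are my own assumptions for this example; pick values that match your data size.

```powershell
# Hypothetical helper: returns a UTC expiry time for a SAS token based on an
# estimated job duration plus a safety margin (both values are assumptions).
function Get-SasExpiryTime {
    param (
        [Parameter(Mandatory = $true)][ValidateRange(1, 10080)]
        [int] $EstimatedJobMinutes,
        [ValidateRange(0, 1440)]
        [int] $PaddingMinutes = 30
    )

    (Get-Date).ToUniversalTime().AddMinutes($EstimatedJobMinutes + $PaddingMinutes)
}

# Example: a copy job expected to take about 90 minutes, padded by 30 minutes
$expiryTime = Get-SasExpiryTime -EstimatedJobMinutes 90
$expiryTime
```

You could then pass $expiryTime to the -ExpiryTime parameter of New-AzStorageContainerSASToken and New-AzStorageShareSASToken in the Runbook instead of the fixed 3600 seconds.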

There’s more…

The previous steps described how to automate the copy from an Azure file share to an Azure blob container using the AzCopy tool. Another useful scenario is to reverse this process. In other words, you could copy from Azure Blob storage to Azure Files instead.

  • Azure Blob (SAS or public) -> Azure Files (SAS)

To make this happen, you need to specify the source as a Blob URL and the destination as a File URL as shown in the following example:

azcopy copy "https://[storageaccount].blob.core.windows.net/[container]/[path/to/directory]?[SAS]" "https://[storageaccount].file.core.windows.net/[filesharename]/[path/to/directory]?[SAS]" --recursive

Summary

In this article, I showed you how to copy from an Azure file share to an Azure blob container using the AzCopy tool running in a container. In this way, we can run the container with the copy job on a simple schedule and only get billed for the time the container is used.

At the time of this writing, if you deleted some files from the Azure file share, they won’t be deleted from the blob container automatically. This is a copy job and not a synchronization solution. I hope that Microsoft will update the AzCopy tool to include sync functionality so we can maintain the status between file share and blob container.

The sync command differs from the copy command in several ways as follows:

  • By default, the recursive flag is true and sync copies all subdirectories. Sync only copies the top-level files inside a directory if the recursive flag is false.
  • If the --delete-destination flag is set to true or prompt, then sync will delete files and blobs at the destination that are not present at the source.
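For a source and destination of the same type, for example two blob containers, these flags could be combined as in the sketch below. The helper name New-AzCopySyncCommand and the URLs are hypothetical; the function only builds the command string in the same style as the Runbook above and does not run AzCopy.

```powershell
# Hypothetical helper: builds an "azcopy sync" command string for a same-type
# source/destination pair. It does not execute anything.
function New-AzCopySyncCommand {
    param (
        [Parameter(Mandatory = $true)][string] $SourceSasUri,
        [Parameter(Mandatory = $true)][string] $DestinationSasUri,
        [switch] $DeleteDestination
    )

    # Wrap each URI in single quotes, as the Runbook script above does
    $command = "azcopy sync '$SourceSasUri' '$DestinationSasUri' --recursive=true"

    # Optionally delete destination items that no longer exist at the source
    if ($DeleteDestination) {
        $command += ' --delete-destination=true'
    }

    $command
}

# Example with placeholder URLs; replace <SAS> with real SAS tokens
New-AzCopySyncCommand -SourceSasUri 'https://src.blob.core.windows.net/c1?<SAS>' `
                      -DestinationSasUri 'https://dst.blob.core.windows.net/c2?<SAS>' `
                      -DeleteDestination
```

Be careful with --delete-destination=true: it removes data at the destination, so test it against a non-production container first.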

Do you want to learn more about Azure Storage including Azure Blobs and Azure File Shares? Make sure to check my recently published online course here: Azure Storage Essential Training.

I hope you find this guide useful.

Thank you for reading my blog.

If you have any questions or feedback, please leave a comment.

-Charbel Nemnom-

39 thoughts on “Sync Between Azure File Share and Azure Blob Container”

  1. Hello Ngoug, thank you for reporting this issue.
    Could you please download the ContainerInstance version 1.0.3 module from here and then try to import it manually in Azure Automation?
    Once you have the az.containerinstance.1.0.3.nupkg package, browse to your automation account, under Modules click + Add a module, select the .nupkg package, enter the module name (Az.ContainerInstance) and then click OK.
    Hope this helps!

  2. Thank you, Charbel,

    I did it, but I got the same error.

  3. Hello Ngoug, could you please browse to this page, then choose Azure Automation tab and then select Deploy to Azure Automation.
    Hope this helps!

  4. Thanks very much, It’s working now.

  5. I am glad it works for you now!

  6. Thank you very much.

    I scheduled the job but it fails when I’m not connected to the Azure Portal.

    The message is around authentication failure.

    How can I pass the credential to run the script? How credential is managed when scheduling the script?

  7. Yes, I create an automation account as described.

    The problem is about the timezone. The time in the schedule menu differs from 1hour that the time assigned to the SAS URL of the container (or storage account).

    So when the schedule is triggered, the permission has expired.

    When I run the script manually, it’s ok.

    I check the timezone on the schedule menu and storage account, it’s the same.

    I don’t know where to change the timezone.

  8. Hello Ngoug, please note that the script by default will generate a container and file share SAS URI Token which is valid for only 30 minutes.
    I have updated the script here to generate a 60 minutes URI token instead of 30 minutes, please check if it works for you.
    You can always increase the time if you want. Please read the comments that I have included in the script for more details.
    Thanks!

  9. I already updated the script to 60min.

    But the problem still remains. For example, when the schedule begins, it marks 03:00 pm but the real-time is 04:00 pm.

    We get an error that authentication on the storage account fails.

    Time zone not to be the same.

  10. Hello Ngoug, what Time Zone did you choose when creating (adding) the new schedule in Azure automation?
    Please make sure the Time zone is set correctly as shown in the figure below.
    Add a schedule in Azure automation account
    Hope this helps!

  11. I check, it’s the same.

    But when the scheduled script starts, the start time differs from the actual time.

  12. Hello Ngoug, sorry I can’t give further support through the comments section. This requires looking into your environment.
    If you want to investigate further, please feel free to fill the contact form here.
    Thank you for understanding!

  13. Hello-

    I’m trying to copy files from a storage container in one resource group to a storage container in a different resource group. Just a one-way copy. Would this script work for what I need to accomplish?

  14. Hello KB-
    As described in this article, this script will copy from Azure File Share to a storage container (one-way) and NOT between two storage containers.
    Check this article that describes how to copy between two storage containers, you need to update the syntax to match your environment. The storage account could be in the same resource group or different resource group and could be in a different Azure subscription as well.

  15. I think I posted twice, sorry – but wanted to say THANK YOU for this. Very great write-up and approach to this problem. Using the container is a big win as well because it funnels traffic through the cloud vs. running azcopy from a traditional datacenter or workstation where it is going to have to download/process/re-upload the data.

    I did need to modify the dynamic naming of the container cause it errors out for some reason. I set it to a static name and it worked. The error was “The container name ‘–azcopy-job’ in container group ‘–azcopy-job’ is invalid”. The issue was the variable $jobName. Instead of using variables to create container name, I set it manually in the script to something that works for us, and the job ran.

  16. Hello John, thanks for the comment and feedback!
    Yes, I had a typo in the $jobName variable. I have corrected the syntax and the dynamic naming of the container should work now without any issue.
    Thank You!

  17. If you delete items from the source, it doesn’t appear to delete them in the destination when re-run... currently troubleshooting. But this may be expected behavior for “copy”, whereas “sync” does seem to be the only method to accomplish this. Since sync is not supported for files –> blob, this may be a sticking point... will try to post back if I find otherwise.

  18. Hello John,
    Yes, you are right! with “copy”, the replicated files that you deleted from the source do not get deleted on the destination.
    If you want to delete the files, then you want to add the --delete-destination true flag to the command, but this works with azcopy sync and NOT with azcopy copy.
    The --delete-destination defines whether to delete extra files from the destination that are not present at the source. Check this article for more information.
    Hope this helps!
