Updated – 01/02/2022 – Starting with AzCopy version 10.13.0, Microsoft added sync support between Azure Blob <-> Azure Files, in addition to copy. The automation tool was updated to support this new scenario.
Updated – 27/12/2021 – The automation tool was updated and tested with the latest Az.ContainerInstance module version 2.1 and above.
Updated – 28/07/2021 – The automation tool was updated to take into consideration the container soft delete feature which is enabled by default for Blob storage with a 7-day retention period. Please check this section for more details.
In this article, we will share with you how to sync and copy between an Azure file share and an Azure blob container.
A while ago, we wrote about how to copy data from one Azure storage account in one subscription to another storage account in a different Azure subscription, and how to sync between Azure Blob Storage containers and between Azure File Shares.
You are storing data in an Azure file share, and you have a line-of-business (LOB) application that can read only from a blob container and not from an SMB file share. In another scenario, you are leveraging Azure File Sync (AFS), which is synced to an Azure file share, and you need to have the data stored in an Azure blob container. You might also have other scenarios; please leave a comment below.
For these kinds of scenarios, you have a couple of options. At the time of this writing, you could use Azure Data Box Gateway, which can sync with blobs. Other tools such as AzCopy, Azure Batch, and Azure Data Factory can also help you move data back and forth. However, these tools come with some fidelity loss that you should be aware of: permissions and timestamps, such as the last modified time, will be lost or changed.
For the purpose of this article, we will use AzCopy, a command-line utility that you can use to copy/sync blobs or files to/from a storage account. I will use Azure Container Instances to simplify and automate AzCopy: an Azure Automation Runbook starts a container that runs AzCopy. In this way, we can run the container on a simple schedule to copy the data and only get billed for the time the container was used.
If you are new to the AzCopy tool, then make sure to check the get started document from Microsoft here. The good news is that, starting with version 10.13.0, AzCopy supports sync between Azure Files and Azure Blob. Thus, we will use sync, rather than copy, to move data from an Azure file share to an Azure blob container.
To follow this article, you need to have the following:
1) Azure subscription – If you don’t have an Azure subscription, you can create a free one here.
2) You need to have one or two different storage accounts either in the same region, same subscription or in different regions and subscriptions.
3) You also need to create at least one container in the blob storage and one Azure File Share in the same storage account, or across two different storage accounts.
4) Last, you need to have some files in Azure file share, or you can sync on-premises servers with Azure File Sync to Azure file share.
First, we need to create an Azure automation account that will help you to automate the synchronization and the copy process without user interaction. This will also make sure to respect the security access of your storage account without exposing access keys to users.
Create Automation Account
In this step, I will create an Azure automation resource with a Run As account. Run As accounts in Azure Automation are used to provide authentication for managing resources in Azure with the Azure cmdlets. When you create a Run As account, it creates a new service principal user in Azure Active Directory (Azure AD) and assigns the Contributor role to the service principal at the subscription level.
Open the Azure Portal, and click All services found in the upper left-hand corner. In the list of resources, type Automation. As you begin typing, the list filters based on your input. Select Automation Accounts.
Click +Add. Enter the automation account name, choose the right subscription, resource group, and location, and then click Create.
Updated – 12/11/2021 – You can now create an Azure automation account with a Managed Identity. Microsoft recommends using Managed Identities for Automation accounts instead of Run As accounts. A managed identity is more secure and easier to use since it doesn’t require any credentials to be stored. Azure Automation support for Managed Identities is now generally available. When you create a new Automation Account, the system-assigned identity is enabled. We highly recommend moving away from the Run As account to Managed Identity; both are documented in this article for reference.
When you create an Automation Account with Managed Identity, it creates a new service principal user in Azure Active Directory (Azure AD) by default. Next, you must assign the appropriate (Azure RBAC) Contributor role to allow access to Azure Storage for the service principal at the resource group level. If you have two different subscriptions and two different resource groups, then you must assign the RBAC Contributor role for the service principal on the source and target resource group.
Always keep in mind and follow the principle of least privilege and carefully assign permissions only required to execute your runbook.
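The role assignment described above can also be scripted. The following is a minimal sketch, not part of the original tool: the function name Grant-AutomationIdentityAccess and all of its parameter names are hypothetical, and it assumes a recent Az.Automation module in which the account object exposes Identity.PrincipalId.

```powershell
# Sketch only: grants the Automation account's system-assigned identity a role
# on ONE resource group. Function and parameter names are hypothetical.
function Grant-AutomationIdentityAccess {
    param (
        [Parameter(Mandatory = $true)][string] $AutomationAccountRG,
        [Parameter(Mandatory = $true)][string] $AutomationAccountName,
        [Parameter(Mandatory = $true)][string] $TargetResourceGroup,
        [string] $RoleName = "Contributor"
    )

    # Assumption: recent Az.Automation versions expose Identity.PrincipalId
    $account = Get-AzAutomationAccount -ResourceGroupName $AutomationAccountRG `
        -Name $AutomationAccountName
    $principalId = $account.Identity.PrincipalId
    if (-not $principalId) {
        throw "No system-assigned identity found on '$AutomationAccountName'."
    }

    # Scope the assignment to the resource group, not the whole subscription
    New-AzRoleAssignment -ObjectId $principalId `
        -RoleDefinitionName $RoleName `
        -ResourceGroupName $TargetResourceGroup
}
```

If the source and target live in two different resource groups, you would call the function once per resource group.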
Import modules from Gallery
In the next step, you need to import the required modules from the Modules gallery. In your list of Automation Accounts, select the account that you created in the previous step. Then from your automation account, select Modules under Shared Resources. Click the Browse Gallery button to open the Browse Gallery page. You need to import the following modules from the Modules gallery in the order given below:
Updated – 12/11/2021 – Starting in September 2021, automation accounts will now have Az modules by default installed. You don’t need to import the modules from the gallery as shown in the figure above. Please note that you can also update the modules to the latest Az version from the modules blade as shown in the figure below.
At the time of this writing, AzCopy is still not part of the Azure Automation Runbook. For this reason, we will be creating an Azure Container instance with AzCopy as part of the container so we can automate the entire synchronization and copy process.
What you should know
If you are using the Az.ContainerInstance PowerShell module version 2.0 or later, you might be facing the following issue:
New-AzContainerGroup : Cannot process argument transformation on parameter ‘ImageRegistryCredential’. Cannot convert value “peterdavehello/azcopy:latest” to type “Microsoft.Azure.PowerShell.Cmdlets.ContainerInstance.Models.Api20210301.IImageRegistryCredential”.
After a long investigation, we found that this is a known issue introduced when Microsoft updated the Az.ContainerInstance PowerShell module to version 2.0.
Updated – 27/12/2021 – The script below has been updated and tested with the latest Az.ContainerInstance module version 2.1 and above.
Create PowerShell Runbook
In this step, you can create multiple Runbooks based on which set of Azure file shares you want to sync/copy to the Azure blob container. PowerShell Runbooks are based on Windows PowerShell. You directly edit the code of the Runbook using the text editor in the Azure portal. You can also use any offline text editor such as Visual Studio Code and import the Runbook into Azure Automation.
From your automation account, select Runbooks under Process Automation. Click the ‘+ Create a runbook‘ button to open the Create a runbook blade.
In this example, we will create a Runbook to copy all the files and directories changes from a specific Azure file share to a specific blob container. You can also be creative as much as you want and cover multiple Azure File Shares / Blob Containers / Directories, etc.
Edit the Runbook
Once you have the Runbook created, you need to edit the Runbook, then write or add the script to choose which Azure File Share you want to sync and copy data to the Azure blob container. Of course, you can create scripts that suit your environment.
As mentioned earlier, in this example, I will create a Runbook to read and check all the files and directories in a specific Azure File Share, and then copy the data over to a specific blob container. To maintain a high level of security, I will NOT pass the storage account keys to AzCopy. Instead, the script creates a time-limited SAS token URI for each service individually (file share and blob container), and each SAS token expires automatically after 60 minutes. Because the script retrieves the current key at run time to sign the tokens, the automation process won’t break if you regenerate your storage account keys in the future.
If you have soft delete enabled on blob storage (which is the default now), you must add the "--overwrite=ifSourceNewer" option to the "azcopy copy" command. Otherwise, it overwrites identical/unchanged files by default, which rapidly balloons your storage costs. The script was updated to take into consideration the container soft delete feature and the latest Az.ContainerInstance PowerShell module version 2.1 and above.
Please note that you can also update the parameter section and copy between storage accounts across different subscriptions.
The script is as follows:
<#
.DESCRIPTION
A Runbook example which continuously checks for file and directory changes in recursive mode
for a specific Azure File Share and then syncs the data to a blob container by leveraging the AzCopy tool,
which is running in a Container inside Azure Container Instances using a Service Principal in Azure AD.

.NOTES
Filename : Sync-FileShareToBlobContainer
Author   : Charbel Nemnom
Version  : 2.2
Date     : 13-January-2021
Updated  : 20-April-2022
Tested   : Az.ContainerInstance PowerShell module version 2.1 and above

.LINK
To provide feedback or for further assistance please visit:
https://charbelnemnom.com
#>

Param (
    [Parameter(Mandatory = $true)][ValidateNotNullOrEmpty()]
    [String] $AzureSubscriptionId,
    [Parameter(Mandatory = $true)][ValidateNotNullOrEmpty()]
    [String] $storageAccountRG,
    [Parameter(Mandatory = $true)][ValidateNotNullOrEmpty()]
    [String] $storageAccountName,
    [Parameter(Mandatory = $true)][ValidateNotNullOrEmpty()]
    [String] $storageContainerName,
    [Parameter(Mandatory = $true)][ValidateNotNullOrEmpty()]
    [String] $storageFileShareName
)

# Ensures you do not inherit an AzContext in your runbook
Disable-AzContextAutosave -Scope Process

# Connect to Azure with system-assigned managed identity (automation account)
Connect-AzAccount -Identity

# SOURCE Azure Subscription
Set-AzContext -Subscription $AzureSubscriptionId

# Get Storage Account Key
# Get-AzStorageAccountKey returns both keys; use the first one to sign the SAS tokens
$storageAccountKey = (Get-AzStorageAccountKey -ResourceGroupName $storageAccountRG -AccountName $storageAccountName).Value[0]

# Set AzStorageContext
$destinationContext = New-AzStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $storageAccountKey

# Generate Container SAS URI Token which is valid for 60 minutes ONLY with read, write, list, delete, and create permission
# If you want to change the target (BlobContainer -> AzureFileShare), then make sure to update the -Permission parameter to (rl)
$blobContainerSASURI = New-AzStorageContainerSASToken -Context $destinationContext `
    -ExpiryTime (Get-Date).AddSeconds(3600) -FullUri -Name $storageContainerName -Permission rwldc

# Generate File Share SAS URI Token which is valid for 60 minutes ONLY with read and list permission
# If you want to change the target (BlobContainer -> AzureFileShare), then make sure to add (rwldc) to the -Permission parameter
$fileShareSASURI = New-AzStorageShareSASToken -Context $destinationContext `
    -ExpiryTime (Get-Date).AddSeconds(3600) -FullUri -ShareName $storageFileShareName -Permission rl

# Choose the following syntax if you want to Sync instead of Copy
$command = "azcopy","sync",$fileShareSASURI,$blobContainerSASURI,"--recursive=true","--delete-destination=true"

# Choose the following syntax if you want to Copy only
# The copy command consumes less memory and incurs less billing costs because a copy operation doesn't have to index the source or destination prior to moving files.
# $command = "azcopy","copy",$fileShareSASURI,$blobContainerSASURI,"--recursive=true","--overwrite=ifSourceNewer"

# Container Group Name
$jobName = $storageAccountName + "-" + $storageFileShareName + "-azcopy-job"

# Set the AZCOPY_BUFFER_GB value at 2 GB which would prevent the container from crashing.
$envVars = New-AzContainerInstanceEnvironmentVariableObject -Name "AZCOPY_BUFFER_GB" -Value "2"

# Create Azure Container Instance Object and run the AzCopy job
# The container image (peterdavehello/azcopy:latest) is publicly available on Docker Hub and has the latest AzCopy version installed
# You could also create your own private container image and use it instead
# When you create a new container instance, the default compute resources are set to 1 vCPU and 1.5 GB RAM
# We recommend starting with 2 vCPU and 4 GB memory for large file shares (e.g. 3 TB)
# You may need to adjust the CPU and memory based on the size and churn of your file share
$container = New-AzContainerInstanceObject -Name $jobName -Image "peterdavehello/azcopy:latest" `
    -RequestCpu 2 -RequestMemoryInGb 4 -Command $command -EnvironmentVariable $envVars

# The container will be created in the $location variable, which is taken from the resource group of the storage account. Adjust if needed.
$location = (Get-AzResourceGroup -Name $storageAccountRG).location
$containerGroup = New-AzContainerGroup -ResourceGroupName $storageAccountRG -Name $jobName `
    -Container $container -OsType Linux -Location $location -RestartPolicy never

Write-Output ("")
Save the script in the CMDLETS pane as shown in the figure below.
Then test the script using the “Test pane” to verify it’s working as intended before you publish it.
Once the test is completed successfully, publish the Runbook by clicking Publish. This is a very important step.
Schedule the Runbook
In the final step, you need to schedule the Runbook to run based on your desired time to copy the changes from Azure file share to Azure blob container.
Within the same Runbook that you created in the previous step, select Schedules and then click + Add schedule.
So, if you need to schedule the Runbook to run every three hours, then you need to create the following schedule with Recur every 3 Hours with Set expiration to No. You can also run it on-demand if you wish to do so.
While scheduling the Runbook, you can configure and pass the required parameters for the PowerShell Script.
In this example, we need to specify the Azure Subscription ID, Resource Group Name, Storage Account Name, Azure Blob Container Name, and the Azure File Share Name that I want to copy over. The sample script takes those parameters as input.
Once done, click OK twice.
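If you prefer to script the schedule instead of clicking through the portal, the steps above can be sketched with the Az.Automation cmdlets. This is an illustration under assumptions: the function name Register-SyncSchedule, the schedule name, and every placeholder value in angle brackets are hypothetical. Only the five runbook parameter names come from the script in this article.

```powershell
# Sketch only: creates a recurring schedule and links it to the runbook.
# Function name, schedule name, and placeholder values are hypothetical.
function Register-SyncSchedule {
    param (
        [Parameter(Mandatory = $true)][string] $AutomationAccountRG,
        [Parameter(Mandatory = $true)][string] $AutomationAccountName,
        [Parameter(Mandatory = $true)][string] $RunbookName,
        [Parameter(Mandatory = $true)][hashtable] $RunbookParameters,
        [string] $ScheduleName = "Every3Hours",
        [int] $HourInterval = 3
    )

    # The first run has to start a few minutes in the future
    $startTime = (Get-Date).AddMinutes(10)

    New-AzAutomationSchedule -ResourceGroupName $AutomationAccountRG `
        -AutomationAccountName $AutomationAccountName `
        -Name $ScheduleName -StartTime $startTime `
        -HourInterval $HourInterval | Out-Null

    # Link the schedule to the runbook and pass the runbook parameters
    Register-AzAutomationScheduledRunbook -ResourceGroupName $AutomationAccountRG `
        -AutomationAccountName $AutomationAccountName `
        -RunbookName $RunbookName -ScheduleName $ScheduleName `
        -Parameters $RunbookParameters
}

# Keys mirror the Param block of the runbook; values are placeholders
$runbookParameters = @{
    AzureSubscriptionId  = "<subscription-id>"
    storageAccountRG     = "<resource-group-name>"
    storageAccountName   = "<storage-account-name>"
    storageContainerName = "<blob-container-name>"
    storageFileShareName = "<file-share-name>"
}
```

You would then call Register-SyncSchedule with your own Automation account details and the $runbookParameters hashtable.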
Test the Runbook
In this quick demo, I will test the Runbook and request on-demand storage sync to copy the data from an Azure file share to an Azure blob container. This scenario simulates when a user adds or modifies files directly in Azure File Share and/or Azure File Sync, and then copies the data to the Azure blob container automatically.
Monitor the Runbook
You can monitor the success or failure of these schedules using the “Jobs” page of Runbooks under Resources. You can also see the next run schedule, in my example, the Runbook will run every 3 hours, and so forth…
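The on-demand run and the job monitoring described above can also be done from PowerShell. The sketch below is only an illustration: the function names Start-OnDemandSync and Get-RecentSyncJob are hypothetical, and the runbook name you pass in is whatever you chose when creating the runbook.

```powershell
# Sketch only: start the runbook on demand and list its most recent jobs.
# Function names are hypothetical; the cmdlets come from Az.Automation.
function Start-OnDemandSync {
    param (
        [Parameter(Mandatory = $true)][string] $AutomationAccountRG,
        [Parameter(Mandatory = $true)][string] $AutomationAccountName,
        [Parameter(Mandatory = $true)][string] $RunbookName,
        [Parameter(Mandatory = $true)][hashtable] $RunbookParameters
    )

    # Returns a job object that you can use to follow the run
    Start-AzAutomationRunbook -ResourceGroupName $AutomationAccountRG `
        -AutomationAccountName $AutomationAccountName `
        -Name $RunbookName -Parameters $RunbookParameters
}

function Get-RecentSyncJob {
    param (
        [Parameter(Mandatory = $true)][string] $AutomationAccountRG,
        [Parameter(Mandatory = $true)][string] $AutomationAccountName,
        [Parameter(Mandatory = $true)][string] $RunbookName,
        [int] $Count = 5
    )

    # Newest jobs first, with their status and timing
    Get-AzAutomationJob -ResourceGroupName $AutomationAccountRG `
        -AutomationAccountName $AutomationAccountName `
        -RunbookName $RunbookName |
        Sort-Object -Property CreationTime -Descending |
        Select-Object -First $Count -Property JobId, Status, StartTime, EndTime
}
```

Note that a job reported as completed only tells you the runbook finished; the AzCopy work itself happens inside the container.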
That’s it there you have it!
The script is currently at version 2.2. If you have any feedback or changes that everyone should receive, please feel free to leave a comment below.
How it works…
When the runbook runs for the first time, a new container will be created and then terminated. The container will perform the copy batch job, which is not meant to run for a long time. So this container runs to complete the copy command and then stops. To prevent constant restart on completion, I have added the “-RestartPolicy Never” on the container which means it doesn’t restart when finished. This is a great way to run a batch copy job.
When the runbook runs for the second time and so on, it will start the existing container instead of creating a new one, and then run the command, which includes the updated file share and blob container SAS URI tokens that are valid for 60 minutes only. In this way, you get billed only for the time the container was used, and you don’t expose your storage account access keys.
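Because the runbook itself only starts the container, it can be useful to look at the container group and the AzCopy output directly. The following sketch is an assumption-laden illustration: the helper names Get-AzCopyJobName and Get-AzCopyJobStatus are hypothetical, while the naming pattern is the one the script in this article uses for the container group and the container.

```powershell
# Sketch only: helper names are hypothetical.
# Rebuilds the container group name the same way the runbook script does.
function Get-AzCopyJobName {
    param (
        [Parameter(Mandatory = $true)][string] $StorageAccountName,
        [Parameter(Mandatory = $true)][string] $FileShareName
    )
    return $StorageAccountName + "-" + $FileShareName + "-azcopy-job"
}

# Shows the state of the container group and the AzCopy log output.
function Get-AzCopyJobStatus {
    param (
        [Parameter(Mandatory = $true)][string] $ResourceGroup,
        [Parameter(Mandatory = $true)][string] $StorageAccountName,
        [Parameter(Mandatory = $true)][string] $FileShareName
    )

    $jobName = Get-AzCopyJobName -StorageAccountName $StorageAccountName `
        -FileShareName $FileShareName

    $group = Get-AzContainerGroup -ResourceGroupName $ResourceGroup -Name $jobName
    Write-Output ("Provisioning state: " + $group.ProvisioningState)

    # In the runbook script the container and its group share the same name
    Get-AzContainerInstanceLog -ResourceGroupName $ResourceGroup `
        -ContainerGroupName $jobName -ContainerName $jobName
}
```

The log is where AzCopy reports how many files were transferred, skipped, or failed, which helps when a job completes but no files appear at the destination.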
Please note that you may need to increase the SAS URI Token expiry time based on the amount of data you have to copy. The SAS must be valid throughout the whole job duration since we need it to interact with the service. I would suggest padding the expiration a bit just to be safe.
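One way to pad the expiration is to derive it from an estimate of the job duration. The helper below is a small sketch, not part of the original script: the name Get-SasExpiryTime, its parameters, and the default padding factor of 1.5 are all assumptions that you should tune for your data size.

```powershell
# Sketch only: function name, parameters, and default factor are assumptions.
# Returns an expiry time of (estimated minutes x padding factor) from $From.
function Get-SasExpiryTime {
    param (
        [Parameter(Mandatory = $true)][ValidateRange(1, 10080)]
        [int] $EstimatedJobMinutes,
        [ValidateRange(1.0, 10.0)]
        [double] $PaddingFactor = 1.5,
        [datetime] $From = (Get-Date)
    )

    # Round up so the padded duration is never shorter than intended
    $minutes = [math]::Ceiling($EstimatedJobMinutes * $PaddingFactor)
    return $From.AddMinutes($minutes)
}
```

In the runbook, you could then pass the result to the -ExpiryTime parameter of both SAS cmdlets, for example -ExpiryTime (Get-SasExpiryTime -EstimatedJobMinutes 120), in place of (Get-Date).AddSeconds(3600).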
The previous steps described how to automate the sync from an Azure file share to an Azure blob container using the AzCopy tool. Another useful scenario is to reverse this process. In other words, you could copy or sync from Azure Blob storage to Azure file shares and vice versa.
- Azure Blob (SAS or public) <-> Azure Files (SAS)
To make this happen, you need to specify the source as a Blob URL and the destination as a File URL as shown in the following example:
azcopy sync "https://[storageaccount].blob.core.windows.net/[container]/[path/to/directory]?[SAS]" "https://[storageaccount].file.core.windows.net/[filesharename]/[path/to/directory]?[SAS]" --recursive
For more details, please check the following step-by-step guide on how to copy from Azure Blob storage to Azure file share.
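When you reverse the direction in the runbook script, the SAS permissions have to be swapped as well, as noted in the script comments: the source needs read and list (rl), and the destination needs rwldc. The lookup below is a sketch to make that explicit. The function name Get-SasPermissionPlan and the direction labels are hypothetical.

```powershell
# Sketch only: function name and direction labels are hypothetical.
# Returns the -Permission values to use for each SAS token per direction.
function Get-SasPermissionPlan {
    param (
        [Parameter(Mandatory = $true)]
        [ValidateSet("FileShareToBlob", "BlobToFileShare")]
        [string] $Direction
    )

    if ($Direction -eq "FileShareToBlob") {
        # Source: file share (read, list). Destination: blob container.
        return @{ FileSharePermission = "rl"; ContainerPermission = "rwldc" }
    }

    # Source: blob container (read, list). Destination: file share.
    return @{ FileSharePermission = "rwldc"; ContainerPermission = "rl" }
}
```

For the reverse direction, you would also swap the order of the two SAS URIs in $command, so that $blobContainerSASURI comes first and $fileShareSASURI second.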
In this article, we showed you how to sync from an Azure file share to an Azure blob container using the AzCopy tool running in a container. In this way, we can run the container with sync and copy jobs on a simple schedule and only get billed for the time the container is used.
When this article was first published, files deleted from the Azure file share were not deleted from the blob container automatically, because the tool ran a copy job and not a synchronization job. At that time, AzCopy did not offer sync between Azure Files and Azure Blob.
Starting with AzCopy version 10.13.0, Microsoft added sync support between Azure Blob <-> Azure Files, in addition to copy. The automation tool was updated to support that scenario. So if you delete some files from the Azure file share, they will be deleted from the blob container automatically too.
The sync command differs from the copy command in several ways as follows:
- By default, the recursive flag is true and sync copies all subdirectories. Sync only copies the top-level files inside a directory if the recursive flag is false.
- If the deleteDestination flag is set to true or prompt, then the sync will delete files and blobs at the destination that are not present at the source.
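To make the difference between the two modes concrete, the following sketch builds the $command array for either one. The function name New-AzCopyCommand and its parameters are hypothetical. The flags are the ones used in the runbook script: "--overwrite=ifSourceNewer" for copy and "--delete-destination=true" for sync.

```powershell
# Sketch only: function and parameter names are hypothetical.
# Builds the command array that the runbook passes to the container.
function New-AzCopyCommand {
    param (
        [Parameter(Mandatory = $true)][ValidateSet("sync", "copy")]
        [string] $Mode,
        [Parameter(Mandatory = $true)][string] $SourceUri,
        [Parameter(Mandatory = $true)][string] $DestinationUri,
        [switch] $DeleteDestination
    )

    $command = @("azcopy", $Mode, $SourceUri, $DestinationUri, "--recursive=true")

    if ($Mode -eq "copy") {
        # Avoid rewriting unchanged files, which matters with soft delete enabled
        $command += "--overwrite=ifSourceNewer"
    }
    elseif ($DeleteDestination) {
        # Sync only: remove destination files that no longer exist at the source
        $command += "--delete-destination=true"
    }

    return $command
}
```

For example, New-AzCopyCommand -Mode sync -SourceUri $fileShareSASURI -DestinationUri $blobContainerSASURI -DeleteDestination produces the same array as the sync line in the script.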
Do you want to learn more about Azure Storage including Azure Blobs and Azure File Shares? Make sure to check my recently published online course here: Azure Storage Essential Training.
We hope you find this guide useful.
Thank you for reading my blog.
If you have any questions or feedback, please leave a comment.
96 thoughts on “Sync Between Azure File Share and Azure Blob Container”
Hello José, thanks for the comment and feedback!
Could you please try the updated script (27/12/2021) on PowerShell 5.1 instead of PowerShell 7.1 (preview)?
I have tested the updated script on PowerShell 7.1 (preview) and it’s working for me.
Let me know if it works for you.
This is a great article, and I have it working well in our environment.
However, we are finding that the “content-type” isn’t being set as part of the copy. Is this a known issue? I know AzCopy v10 is meant to automatically detect and set the content-type, but perhaps the docker image doesn’t have the required mime type files.
Is this a known issue and is there a way to resolve it?
Thank you again for sharing your knowledge.
Hello Mark, thanks for the comment and feedback!
Yes, you are right, AzCopy v10 will automatically detect and set the content type.
If “Content-Type” is not specified with the "--content-type" parameter, AzCopy will set each blob’s content type according to its file extension. To set the same content type for all the blobs, we must explicitly specify a value for “Content-Type”, for example, "--content-type video/mp4".
I am wondering if the issue is because you are copying from Azure File Shares to Blob Containers and not between the same storage service (file share TO file Share) or (blob container TO blob container).
One point worth trying is to delete the container instance and let the next run pull the latest image of the docker image. The docker image has the latest AzCopy version installed.
Hope this helps!
Now that AzCopy supports sync, can I just change that line of code from copy to sync, and then add the flag --delete-destination=true?
Hello Casey, thanks for the comment!
Yes, starting with AzCopy version 10.13.0, which was released last month, Microsoft added sync support between Azure Blob <-> Azure Files, in addition to copy.
I have updated the automation tool to support this new scenario.
Yes, you need to use the sync syntax shown in the script above instead. You don’t need to use "--overwrite=ifSourceNewer" with the Sync option.
Let me know if it works for you.
I needed to change the blob container sas permission from rw to rwl in order to make it work.
I did use sync and AzCopy v10.13.0.
Thank you Jukka for the confirmation, the guide was updated!
Your update says Azure Blob <-> Azure Files, which implies that the sync works both ways. I am struggling to get it to work syncing Blob to a File Share. Am I missing something?
Hello Sean, thanks for the comment!
Are you encountering any specific errors?
The syntax is the same for all targets, you just need to put the right URL.
Example: azcopy sync [blob container url] [file share url]
I can only get the script to work one way (AzureFileShare -> BlobContainer) regardless of whether I use the sync or copy commands.
The Logs say the script completes successfully but it still doesn’t copy the files from BlobStorage to AzureFileShare.
Here is what I have in the PowerShell:
# Choose the following syntax if you want to Sync instead of Copy
$command = "azcopy","sync",$blobContainerSASURI,$fileShareSASURI,"--recursive=true","--delete-destination=true"
– If I switch the source and destination around in the command, it copies from Azure FileShare to BlobContainer OK
# Choose the following syntax if you want to Copy only
# The copy command consumes less memory and incurs less billing costs because a copy operation doesn’t have to index the source or destination prior to moving files.
#$command = "azcopy","copy",$fileShareSASURI,$blobContainerSASURI,"--recursive=true","--overwrite=ifSourceNewer"
– If I use the copy command instead, it copies from AzureFileShare to BlobContainer OK.
– If I switch the azcopy source and destination around so it copied from BlobContainer To AzureFileShare it doesn’t work.
Hello Troy, thanks for sharing the details!
If I understood well, the issue is not around the new Sync support between (AzureFileShare -> BlobContainer) and vice versa.
I believe that I found the issue. Look at the below syntax.
If you want to copy or sync from (BlobContainer -> AzureFileShare), then you need to change the permission from rl to rwldc.
Check the comments that I added before the Generate File Share SAS URI Token and Generate Container SAS URI Token. Please adjust the permission and try again.
Let me know if it works for you now.
Hello Stavros, thanks for the comment!
Yes, this is completely possible. Please check my step-by-step guide and see how to automatically move files between different Azure file share tiers and optimize storage costs.
Hope it helps!
Is there any way to make sure that the files are copied only once, and new files are continuously copied? Asking since we would like to delete the files from the share once they have been processed.
Also, what would happen if a file that has been previously copied and processed (and deleted from the share) is then updated? Would that be copied over again?
Hello Chris, thanks for the comment!
Yes sure, as noted in this article, there are two options, you could either copy or sync.
Starting with the AZCopy version 10.13.0 and later, Microsoft added sync support between Azure Blob <-> Azure Files instead of only copy. The automation tool was updated to support this new scenario.
By using the Sync option, the files are copied only once, and new or updated files are continuously synced and copied over.
The syntax for the command is the following:
$command = "azcopy","sync",$fileShareSASURI,$blobContainerSASURI,"--recursive=true","--delete-destination=true"
If the same file name and type that has been previously copied and processed (and then deleted from the source) is then updated, the updated file will be synced to the target and replaced.
In the syntax noted above, we are using the flag "--delete-destination=true". This flag removes files from the destination that no longer exist in the source.
Please make sure to remove this flag, since you don’t want to delete the files from the destination, you want to delete them from the source only.
Hope it helps!
I have just managed to get it working across different Storage Accounts and Resource groups but unfortunately deleted files on the destination are copied over again once I delete them and add new files to the source, even with the sync parameter.
Hello Chris, I am happy to hear that you managed to get it working across different Storage Accounts and Resource groups.
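Now in regards to the files that are getting deleted on the destination, please remove the parameter "--delete-destination=true" from the $command, like this:
$command = "azcopy","sync",$fileShareSASURI,$blobContainerSASURI,"--recursive=true"
If it does not work as expected, then change the $command to copy instead of sync and set the "--overwrite" parameter to ifSourceNewer, like this:
$command = "azcopy","copy",$fileShareSASURI,$blobContainerSASURI,"--recursive=true","--overwrite=ifSourceNewer"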
Hope it helps!
I run the cmd, but PowerShell can’t get the subscription ID.
Hello Wayne, thanks for the comment!
Which command did you run in PowerShell that you can’t get the subscription ID?
You need first to set the Azure Context for the desired subscription by running: Select-AzSubscription -SubscriptionId $AzureSubscriptionId.
It’s working perfectly. When I select the network in the storage networking. The sync doesn’t work anymore. Do you know how we can fix this?
Hello Xaviri, thanks for the comment, and glad to hear it’s working for you!
Please note that Azure Automation Accounts lacks service endpoint support to Azure Storage when you want to lock down the storage account for private access only.
If you look at the Networking section for the storage account under Resource instance (Resource type), we don’t see Automation Accounts so we can allow it and have access to the storage account based on the system-assigned managed identity.
As a workaround what you could do is, you can allow the Public IPs for the Azure Automation service only from a specific Azure Region.
You can download the list of public IPs by service and region from here.
Hopefully, Microsoft will add support for Automation Accounts natively to Azure storage.
Hope it helps!
Thank you for this great article. I’ve adapted your script to sync data between two folders in two file shares and it works perfectly :-)
Because the source folder is very big with a huge amount of files it is running very long so I was asking myself: Is there a possibility to detect changes in the source folder? Is it possible to run the script not on a fixed schedule but only if there have been changes detected on the source folder? I mean some kind of trigger.
Could you answer this question?
Hello Richard, thanks for the comment and feedback!
Yes, this is possible, however, the design and architecture need to be changed.
In your case, you need to use the Azure Blob storage trigger for Azure Functions instead of Azure Automation Accounts.
When a file is added to the source blob storage container, the function will trigger and copy the data to the file share.
The Blob storage trigger starts a function when a new or updated blob is detected. Check this article for more information.
Hope it helps!
Hi Charbel. The file transfer from Container to File Share works well, thanks. There is one modification I want to do is that currently the file gets saved in the File Share root path and I want it to be saved in a specific path under File Share. For example, this is my File Share – “MyFileShare” and inside this, I have different folders, one of which is Folder1, and then inside this folder, I have Folder A where I want the file to be moved.
Hello Ravi, thanks for the comment and feedback!
Please note that to accomplish your scenario, you need to add the entire path and the folder name to the $fileShareSASURI or to $blobContainerSASURI based on your target destination (file share or blob container).
You need to construct your command in a way to add the entire path. For example:
After you generate the file Share SAS URI Token, you need to update the $fileShareSASURI variable to add the entire folder structure.
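As an illustration of that step, the sketch below inserts a folder path between the share (or container) URL and the SAS query string. The function name Add-PathToSasUri is hypothetical, and it assumes the variable holds a full SAS URI (generated with -FullUri) whose query string starts at the first "?". Each path segment is URL-encoded so that a folder name with a space, such as "Folder A", stays valid.

```powershell
# Sketch only: function name is hypothetical.
# Inserts a relative folder path into a full SAS URI, before the "?" token part.
function Add-PathToSasUri {
    param (
        [Parameter(Mandatory = $true)][string] $SasUri,
        [Parameter(Mandatory = $true)][string] $RelativePath
    )

    # Split into the resource URL and the SAS query string
    $parts = $SasUri -split '\?', 2
    if ($parts.Count -ne 2) {
        throw "Expected a full SAS URI containing a '?' query string."
    }

    # Encode each folder name separately so the "/" separators are preserved
    $segments = $RelativePath.Trim('/') -split '/' |
        ForEach-Object { [uri]::EscapeDataString($_) }
    $encodedPath = $segments -join '/'

    return $parts[0].TrimEnd('/') + "/" + $encodedPath + "?" + $parts[1]
}
```

In the runbook, you could then update the variable right after the token is generated, for example $fileShareSASURI = Add-PathToSasUri -SasUri $fileShareSASURI -RelativePath "Folder1/Folder A".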
Hope it helps!
Hi Charbel, this really looks well and I want to use it, but if I run the runbook nothing happens. It even runs without error, but no files were synchronized.
I removed the actual Ids, any ideas why the synchronization doesn’t work?
I tried blob -> file, file -> blob, and file <-> blob for 5.1 and 7.1
With best regards,