By admin • November 7, 2024
In our recent Sitecore project, we integrated Azure Blob Storage to hold the media and data files used by the backend. The setup handled content storage and retrieval well, but over time we needed a way to keep the storage structure tidy. Specifically, we needed a way to:
- Clean up old files from storage that were no longer required.
- Move files from one folder to another, adjusting the structure to reflect changes in our content hierarchy.
Here’s a closer look at the challenges we faced and how we addressed them.
The Challenge: Managing Files Without Filenames
In our Azure Blob Storage container, each blob name includes both a file ID and a filename. However, our use case only provided us with the file IDs, not the associated filenames. Without the full name we could not address a blob directly, which made batch deletions and moves between folders cumbersome.
To solve this, we implemented custom methods that operate on file prefixes (in this case, the file ID) rather than full filenames. These methods enabled us to manage our files in Azure Blob Storage effectively.
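To make the prefix idea concrete, here is a minimal sketch of the matching logic on plain strings, with no Azure calls involved. The class and method names are hypothetical, and the blob names in the usage note are invented for illustration; the only detail taken from our setup is the "video/short-videos/6245387_" style of prefix, where the ID is followed by an underscore.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the Azure SDK.
public static class BlobPrefixMatching
{
    // Joins a virtual folder and an ID prefix into a full blob-name prefix,
    // e.g. ("video/short-videos", "6245387_") -> "video/short-videos/6245387_".
    public static string BuildPathPrefix(string path, string idPrefix) =>
        path.TrimEnd('/') + "/" + idPrefix;

    // Returns the names that start with the prefix, using an ordinal
    // (case-sensitive, culture-independent) comparison.
    public static List<string> FilterByPrefix(IEnumerable<string> blobNames, string pathPrefix) =>
        blobNames.Where(name => name.StartsWith(pathPrefix, StringComparison.Ordinal)).ToList();
}
```

Including the separator after the ID matters: with the prefix "video/short-videos/6245387_", a name such as "video/short-videos/62453871_other.mp4" is not matched, whereas the bare ID "6245387" would match it as well.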
Solution Part 1: Deleting Files by Prefix
The first task was to delete files that matched a specific prefix, which allowed us to remove old, unused files without needing the exact filenames. Using the following method, DeleteBlobByPrefixAsync, we were able to search for files by their ID prefix and delete them in bulk.
using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

// connectionString and containerName are assumed to be static fields of the containing class.
public static async Task DeleteBlobByPrefixAsync(string prefix, string path)
{
    // Create a BlobServiceClient to connect to the storage account
    BlobServiceClient blobServiceClient = new BlobServiceClient(connectionString);
    // Get a reference to the container
    BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient(containerName);
    // Build the full name prefix to search for
    string pathPrefix = path + "/" + prefix; // e.g., "video/short-videos/6245387_"
    // Let the service filter by the full prefix, so only matching blobs are listed
    await foreach (BlobItem blobItem in containerClient.GetBlobsAsync(prefix: pathPrefix))
    {
        // Get a reference to the blob
        BlobClient blobClient = containerClient.GetBlobClient(blobItem.Name);
        // Delete the blob
        await blobClient.DeleteIfExistsAsync();
        Console.WriteLine($"Deleted blob: {blobItem.Name}");
    }
}
This approach provided a flexible way to remove files based on their ID prefix, helping us streamline storage without unnecessary accumulation of outdated files.
Solution Part 2: Moving Files by Prefix
Our second task involved moving files from one folder to another based on their prefix, reflecting updates to our content organization. With the MoveBlobByPrefixAsync method, we could locate files by their prefix and transfer them to a new storage path within Azure Blob Storage.
public static async Task MoveBlobByPrefixAsync(string prefix, string path, string destinationPath)
{
    // Create a BlobServiceClient to connect to the storage account
    BlobServiceClient blobServiceClient = new BlobServiceClient(connectionString);
    // Get a reference to the container
    BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient(containerName);
    // Build the full name prefix to search for
    string pathPrefix = path + "/" + prefix;
    // Let the service filter by the full prefix, so only matching blobs are listed
    await foreach (BlobItem blobItem in containerClient.GetBlobsAsync(prefix: pathPrefix))
    {
        // Get a reference to the original blob
        BlobClient sourceBlobClient = containerClient.GetBlobClient(blobItem.Name);
        // Keep the original file name (the part after the last '/')
        string fileName = blobItem.Name.Substring(blobItem.Name.LastIndexOf('/') + 1);
        string newBlobName = $"{destinationPath}/{fileName}";
        // Get a reference to the destination blob
        BlobClient destinationBlobClient = containerClient.GetBlobClient(newBlobName);
        // Start the server-side copy and wait until it has finished
        var copyOperation = await destinationBlobClient.StartCopyFromUriAsync(sourceBlobClient.Uri);
        await copyOperation.WaitForCompletionAsync();
        // Delete the original only after the copy has completed
        await sourceBlobClient.DeleteIfExistsAsync();
        Console.WriteLine($"Moved blob: {blobItem.Name} to {newBlobName}");
    }
}
This method was instrumental in helping us restructure our storage, making it easier to manage and locate files within our content hierarchy.
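One detail of the move is worth isolating: the destination name keeps only the part of the source name after the last slash. The sketch below shows that mapping on plain strings; the class and method names are hypothetical, and the paths in the usage note are made up for illustration.

```csharp
using System;

// Hypothetical helper, not part of the Azure SDK.
public static class BlobMoveNaming
{
    // Keeps only the last segment of the source blob name and places it
    // directly under destinationPath.
    public static string BuildDestinationName(string sourceBlobName, string destinationPath)
    {
        int lastSlash = sourceBlobName.LastIndexOf('/');
        // When there is no '/', LastIndexOf returns -1 and the whole name is kept.
        string fileName = sourceBlobName.Substring(lastSlash + 1);
        return destinationPath.TrimEnd('/') + "/" + fileName;
    }
}
```

For example, BuildDestinationName("video/short-videos/6245387_intro.mp4", "video/archive") gives "video/archive/6245387_intro.mp4". Because only the last segment survives, any subfolders between the source path and the file are flattened, which could cause name collisions if two subfolders hold files with the same name.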
Automating the Process with a Console App
To make this process more efficient, I created a console application that reads a CSV file containing a list of IDs and statuses. For each row in the CSV file, the console app performs the corresponding action, calling either DeleteBlobByPrefixAsync to remove files or MoveBlobByPrefixAsync to relocate them, based on the status specified in that row. This console app allowed us to process file management tasks in bulk, minimizing manual work and ensuring consistency across our storage environment.
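The dispatch logic of such a console app can be sketched roughly as follows. This is not the original application: the two-column "id,status" layout, the status values "delete" and "move", and every name in the block are assumptions made for illustration. The delete and move operations are passed in as delegates so the sketch stays independent of the Azure SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Assumed status values; the real CSV may use different ones.
public enum BlobAction { Delete, Move }

public static class CsvBlobJobs
{
    // Parses lines of the assumed form "id,status" into (id, action) pairs.
    // Blank rows, rows with an empty ID, and rows with an unknown status
    // (including a header row such as "id,status") are skipped.
    public static List<(string Id, BlobAction Action)> Parse(IEnumerable<string> lines)
    {
        var jobs = new List<(string Id, BlobAction Action)>();
        foreach (string line in lines)
        {
            string[] parts = line.Split(',');
            if (parts.Length < 2) continue;
            string id = parts[0].Trim();
            if (id.Length > 0
                && Enum.TryParse(parts[1].Trim(), true, out BlobAction action)
                && Enum.IsDefined(typeof(BlobAction), action))
            {
                jobs.Add((id, action));
            }
        }
        return jobs;
    }

    // Runs the jobs one at a time, calling the supplied delegate for each action.
    public static async Task RunAsync(
        IEnumerable<(string Id, BlobAction Action)> jobs,
        Func<string, Task> deleteAsync,
        Func<string, Task> moveAsync)
    {
        foreach (var (id, action) in jobs)
        {
            if (action == BlobAction.Delete) await deleteAsync(id);
            else await moveAsync(id);
        }
    }
}
```

In a real run, the lines would come from something like File.ReadLines on the CSV file, and the delegates would wrap DeleteBlobByPrefixAsync and MoveBlobByPrefixAsync with the appropriate source and destination paths. Processing rows sequentially keeps the console output easy to follow; running them in parallel is possible but was not part of this sketch.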