How to Batch Anonymize a Folder of DICOM Studies

How to Batch Anonymize a Folder of DICOM Studies

This tutorial demonstrates how to batch anonymize multiple DICOM files from a folder using C#. When working with hundreds or thousands of medical imaging files, batch processing is essential for efficiency and consistency.

Benefits of Batch Anonymization

  1. Efficiency:
    • Process entire studies or archives in a single operation.
  2. Consistency:
    • Apply the same anonymization rules to all files.
  3. Automation:
    • Integrate into automated workflows and pipelines.

Prerequisites: Preparing the Environment

  1. Set up Visual Studio or any compatible .NET IDE.
  2. Create a new .NET 8 console application project.
  3. Install Aspose.Medical from the NuGet Package Manager.
  4. Prepare input folder with DICOM files and create an output folder.

Step-by-Step Guide to Batch Anonymize DICOM Files

Step 1: Install Aspose.Medical

Add the Aspose.Medical library to your project using NuGet.

Install-Package Aspose.Medical

Step 2: Include Necessary Namespaces

Add references to the required namespaces in your code.

using Aspose.Medical.Dicom;
using Aspose.Medical.Dicom.Anonymization;

Step 3: Set Up Directory Paths

Define your input and output directory paths.

string inputDirectory = @"C:\DicomStudies\Input";
string outputDirectory = @"C:\DicomStudies\Anonymized";

// Ensure output directory exists
Directory.CreateDirectory(outputDirectory);

Step 4: Enumerate DICOM Files

Get all DICOM files from the input directory, including subdirectories.

string[] dicomFiles = Directory.GetFiles(
    inputDirectory, 
    "*.dcm", 
    SearchOption.AllDirectories
);

Console.WriteLine($"Found {dicomFiles.Length} DICOM files to process.");

Step 5: Create the Anonymizer

Create an Anonymizer instance with your desired profile.

Anonymizer anonymizer = new();
// Or with a custom profile:
// ConfidentialityProfile profile = ConfidentialityProfile.CreateDefault(
//     ConfidentialityProfileOptions.BasicProfile
// );
// Anonymizer anonymizer = new(profile);

Step 6: Process Each File

Loop through each file, anonymize it, and save to the output directory.

int successCount = 0;
int failCount = 0;

foreach (string filePath in dicomFiles)
{
    try
    {
        // Load DICOM file
        DicomFile dcm = DicomFile.Open(filePath);
        
        // Anonymize
        DicomFile anonymizedDcm = anonymizer.Anonymize(dcm);
        
        // Preserve relative directory structure
        string relativePath = Path.GetRelativePath(inputDirectory, filePath);
        string outputPath = Path.Combine(outputDirectory, relativePath);
        
        // Ensure subdirectory exists
        Directory.CreateDirectory(Path.GetDirectoryName(outputPath)!);
        
        // Save anonymized file
        anonymizedDcm.Save(outputPath);
        
        successCount++;
        Console.WriteLine($"✓ Processed: {relativePath}");
    }
    catch (Exception ex)
    {
        failCount++;
        Console.WriteLine($"✗ Failed: {filePath} - {ex.Message}");
    }
}

Console.WriteLine($"\nCompleted: {successCount} succeeded, {failCount} failed.");

Complete Code Example for Batch Anonymization

Here is a complete example demonstrating batch anonymization with progress reporting:

using Aspose.Medical.Dicom;
using Aspose.Medical.Dicom.Anonymization;

string inputDirectory = @"C:\DicomStudies\Input";
string outputDirectory = @"C:\DicomStudies\Anonymized";

// Ensure output directory exists
Directory.CreateDirectory(outputDirectory);

// Get all DICOM files
string[] dicomFiles = Directory.GetFiles(inputDirectory, "*.dcm", SearchOption.AllDirectories);
Console.WriteLine($"Found {dicomFiles.Length} DICOM files to process.\n");

// Create anonymizer
Anonymizer anonymizer = new();

int successCount = 0;
int failCount = 0;
int total = dicomFiles.Length;

foreach (string filePath in dicomFiles)
{
    try
    {
        DicomFile dcm = DicomFile.Open(filePath);
        DicomFile anonymizedDcm = anonymizer.Anonymize(dcm);
        
        string relativePath = Path.GetRelativePath(inputDirectory, filePath);
        string outputPath = Path.Combine(outputDirectory, relativePath);
        Directory.CreateDirectory(Path.GetDirectoryName(outputPath)!);
        
        anonymizedDcm.Save(outputPath);
        successCount++;
        
        // Progress reporting
        double progress = (double)(successCount + failCount) / total * 100;
        Console.WriteLine($"[{progress:F1}%] ✓ {relativePath}");
    }
    catch (Exception ex)
    {
        failCount++;
        Console.WriteLine($"✗ Failed: {Path.GetFileName(filePath)} - {ex.Message}");
    }
}

Console.WriteLine($"\n========================================");
Console.WriteLine($"Batch Anonymization Complete");
Console.WriteLine($"Succeeded: {successCount}");
Console.WriteLine($"Failed: {failCount}");
Console.WriteLine($"========================================");

Parallel Processing for Better Performance

For large datasets, use parallel processing to leverage multiple CPU cores:

using Aspose.Medical.Dicom;
using Aspose.Medical.Dicom.Anonymization;
using System.Collections.Concurrent;

string inputDirectory = @"C:\DicomStudies\Input";
string outputDirectory = @"C:\DicomStudies\Anonymized";

Directory.CreateDirectory(outputDirectory);

string[] dicomFiles = Directory.GetFiles(inputDirectory, "*.dcm", SearchOption.AllDirectories);
Console.WriteLine($"Found {dicomFiles.Length} DICOM files to process.\n");

// Thread-safe counters
int successCount = 0;
int failCount = 0;
ConcurrentBag<string> failedFiles = new();

// Process in parallel
Parallel.ForEach(dicomFiles, new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount }, filePath =>
{
    try
    {
        // Each thread gets its own anonymizer instance
        Anonymizer anonymizer = new();
        
        DicomFile dcm = DicomFile.Open(filePath);
        DicomFile anonymizedDcm = anonymizer.Anonymize(dcm);
        
        string relativePath = Path.GetRelativePath(inputDirectory, filePath);
        string outputPath = Path.Combine(outputDirectory, relativePath);
        
        lock (outputDirectory) // Ensure directory creation is thread-safe
        {
            Directory.CreateDirectory(Path.GetDirectoryName(outputPath)!);
        }
        
        anonymizedDcm.Save(outputPath);
        Interlocked.Increment(ref successCount);
    }
    catch (Exception ex)
    {
        Interlocked.Increment(ref failCount);
        failedFiles.Add($"{filePath}: {ex.Message}");
    }
});

Console.WriteLine($"\nBatch Anonymization Complete");
Console.WriteLine($"Succeeded: {successCount}");
Console.WriteLine($"Failed: {failCount}");

if (failedFiles.Any())
{
    Console.WriteLine($"\nFailed files:");
    foreach (string failure in failedFiles)
    {
        Console.WriteLine($"  - {failure}");
    }
}

Enhanced Version with Logging

For production environments, implement proper logging:

using Aspose.Medical.Dicom;
using Aspose.Medical.Dicom.Anonymization;

string inputDirectory = @"C:\DicomStudies\Input";
string outputDirectory = @"C:\DicomStudies\Anonymized";
string logFile = Path.Combine(outputDirectory, "anonymization_log.txt");

Directory.CreateDirectory(outputDirectory);

string[] dicomFiles = Directory.GetFiles(inputDirectory, "*.dcm", SearchOption.AllDirectories);

Anonymizer anonymizer = new();
List<string> logEntries = new();

logEntries.Add($"Anonymization started: {DateTime.Now}");
logEntries.Add($"Input directory: {inputDirectory}");
logEntries.Add($"Output directory: {outputDirectory}");
logEntries.Add($"Total files to process: {dicomFiles.Length}");
logEntries.Add("---");

int successCount = 0;
int failCount = 0;

foreach (string filePath in dicomFiles)
{
    string relativePath = Path.GetRelativePath(inputDirectory, filePath);
    
    try
    {
        DicomFile dcm = DicomFile.Open(filePath);
        DicomFile anonymizedDcm = anonymizer.Anonymize(dcm);
        
        string outputPath = Path.Combine(outputDirectory, relativePath);
        Directory.CreateDirectory(Path.GetDirectoryName(outputPath)!);
        anonymizedDcm.Save(outputPath);
        
        successCount++;
        logEntries.Add($"SUCCESS: {relativePath}");
    }
    catch (Exception ex)
    {
        failCount++;
        logEntries.Add($"FAILED: {relativePath} - {ex.Message}");
    }
}

logEntries.Add("---");
logEntries.Add($"Anonymization completed: {DateTime.Now}");
logEntries.Add($"Succeeded: {successCount}");
logEntries.Add($"Failed: {failCount}");

// Write log file
File.WriteAllLines(logFile, logEntries);

Console.WriteLine($"Processing complete. Log saved to: {logFile}");

Best Practices for Batch Processing

  1. Run on Copy of Data: Always process a copy of your original data, not the originals themselves.
  2. Log All Operations: Keep detailed logs of which files were processed and any errors encountered.
  3. Test First: Run on a small sample before processing the entire dataset.
  4. Monitor Disk Space: Ensure sufficient disk space for output files.
  5. Handle Interruptions: Consider implementing checkpoint/resume functionality for very large batches.

Troubleshooting

Memory Issues with Large Datasets

For very large files or datasets, process files one at a time without keeping references:

foreach (string filePath in dicomFiles)
{
    using (DicomFile dcm = DicomFile.Open(filePath))
    {
        DicomFile anonymizedDcm = anonymizer.Anonymize(dcm);
        anonymizedDcm.Save(outputPath);
    }
    // Files are disposed after each iteration
}

Permission Errors

Ensure your application has read access to the input directory and write access to the output directory.

Conclusion

This tutorial has demonstrated how to batch anonymize DICOM files in C# using Aspose.Medical. Whether processing hundreds or thousands of files, the batch approach ensures consistent anonymization across your entire dataset while providing progress tracking and error handling for production reliability.

 English