The Magick of ImageMagick and Python for Automatic Photo Processing and Uploading to Azure

One thing that I hate is resizing photos. There are pages online where you upload an image and get a smaller photo back, or you can open it in Paint and resize it there. No matter what, it takes forever and takes too many clicks. Sounds like the perfect time to put Python to work and automate it.

To break this into smaller steps:

  • Search a directory for any pictures
  • Search the output directory to see if they exist
  • If they do not exist, convert using ImageMagick for smaller and more compressed pictures
  • Using AzCopy, do an rsync-style sync between the output folder and the remote blob container in the Storage Account

Program Prerequisites

For my setup, my laptop is set up with Fedora, and my workstation is set up with Arch Linux. Since Python 3 is largely OS agnostic, I am willing to bet the program will work on either.

The next prerequisite is AzCopy (the azcopy command-line tool). This can be installed from here.

Lastly, you need ImageMagick itself. You will need both the OS program and a Python binding. In this case I am using Wand.

That’s it for pre-reqs.

Starting to Code

Just a few imports to kick off the start of the code.

import argparse
import os
import subprocess
import ruploader_conf as conf
from pathlib import Path, PurePath
from wand.image import Image

For some reason I don’t remember, I called this program ruploader. Going with that, all of the Azure configs are saved in ruploader_conf.

Next up is the main body of code.

def main():
##    '''Parser object to pull in:
##    Input Directory,
##    Output Directory
##    '''
##    parser = argparse.ArgumentParser()
##    parser.add_argument('-i', '--input')
##    parser.add_argument('-o', '--output')
    '''Read current directory for input folder'''
    current_directory = Path.cwd()
    if not Path.exists(current_directory / 'input'):
        raise FileNotFoundError('No input folder')
    input_folder = current_directory / 'input'
    output_folder = current_directory / 'output'
    output_folder.mkdir(exist_ok = True)
    os.chdir(input_folder)
    filetypes = ['.jpg', '.JPG']
    for picture in Path('.').glob('**/*'):

        output_file = (output_folder/picture)
        ##print(picture.suffix)
        if output_file.with_suffix('.jpg').exists():
            ##print(output_file)
            pass
        elif not any(x == picture.suffix for x in filetypes):
            pass
        else:
            ##print(picture)
            ##print('process', output_file)
            Path(PurePath(output_file).parent).mkdir(parents = True,
                                                     exist_ok = True)
            with Image(filename=picture) as img:
                ##print('width = ', img.width)
                img.transform(resize="1000x1000")
                img.compression_quality = 80
                img.format = 'jpeg'
                img.save(filename = ((output_folder/picture).with_suffix('.jpg')))

    '''Now can sync with the cloud'''
    command = ['azcopy',
               'sync',
               str(output_folder)+ '/',
               str(conf.ACCOUNT_URL
                   + conf.ACCOUNT_CONTAINER
                   + conf.ACCOUNT_SAS_TOKEN),
               '--recursive',
               '--delete-destination=false']
    # text=True decodes azcopy's output so it prints as readable text
    result = subprocess.run(command, capture_output=True, text=True)
    print(result.stdout)


if __name__ == '__main__':
    main()

Now to explain a bit of this hodge-podge of code. The program works from the current directory it is run in, which in my case is the folder where this file is located. Inside that folder are an input and an output folder. Any folder structure that is placed in the input folder will be copied over to the output, along with the modified pictures.

Next, because my camera writes files with a .JPG extension and my phone writes them with .jpg, I have the program searching for both and writing everything out as .jpg.

Using the glob method of pathlib's Path, it will look through all folders and subfolders for all files. For each file it will check whether a converted copy is already in the output folder and whether the file is a picture at all. If it is already in the output folder, or it is not a picture, it will bypass it and move to the next file.
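The search-and-skip step can be pulled out into a small function of its own. This is only a sketch: the name find_pending is made up for illustration, and unlike the program above it compares the suffix case-insensitively (an assumption on my part) instead of listing '.jpg' and '.JPG' separately.

```python
import tempfile
from pathlib import Path


def find_pending(input_dir, output_dir, suffix='.jpg'):
    """Return input pictures (as relative paths) that have no converted copy yet."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    pending = []
    for picture in sorted(input_dir.glob('**/*')):
        # Skip directories and anything that is not a JPEG by extension.
        if not picture.is_file() or picture.suffix.lower() != suffix:
            continue
        relative = picture.relative_to(input_dir)
        # Converted copies are always written with a lowercase .jpg suffix.
        if not (output_dir / relative).with_suffix(suffix).exists():
            pending.append(relative)
    return pending


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'input' / 'trip').mkdir(parents=True)
    (root / 'output' / 'trip').mkdir(parents=True)
    (root / 'input' / 'trip' / '006.JPG').touch()
    (root / 'input' / 'trip' / 'carts.jpg').touch()
    (root / 'input' / 'trip' / 'notes.txt').touch()
    (root / 'output' / 'trip' / 'carts.jpg').touch()  # already converted
    demo_pending = find_pending(root / 'input', root / 'output')
    print(demo_pending)
```

In the demonstration only 006.JPG is reported as pending: carts.jpg already has a converted copy and notes.txt is not a picture.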

If the file is not in the output folder, it's time for ImageMagick to do its work. The first step is to change the size of the picture with the transform method. Asking for 1000x1000 pixels does not change the aspect ratio. Instead, the picture is scaled to fit inside a 1000 by 1000 box, so neither pixel dimension ends up greater than 1000. The next step sets the JPEG compression quality to 80, which shrinks the file further. This is not a hard number, just a number that I chose and that worked well. Setting the format attribute forces the file type to JPEG. ImageMagick has the ability to change file types, so this needs to be set to what you want. And finally, it saves the converted photo in the new location.
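To see what the 1000x1000 geometry means for the final dimensions, here is the arithmetic as a small sketch. The function fit_within is hypothetical and not part of Wand or ImageMagick, and ImageMagick's own rounding may differ by a pixel. A plain geometry like this also scales smaller pictures up to fit the box; ImageMagick's geometry syntax has a > suffix (as in 1000x1000>) that only shrinks, which the only_shrink flag below imitates.

```python
def fit_within(width, height, box=1000, only_shrink=False):
    """Approximate the size a 'BOXxBOX' resize geometry would produce."""
    # One scale factor for both sides keeps the aspect ratio intact.
    scale = min(box / width, box / height)
    if only_shrink and scale >= 1:
        # Imitates the '>' geometry suffix: leave small pictures alone.
        return width, height
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(6000, 4000))                    # landscape camera photo
print(fit_within(3000, 4000))                    # portrait phone photo
print(fit_within(800, 600))                      # small picture gets enlarged
print(fit_within(800, 600, only_shrink=True))    # small picture left as-is
```

A 6000 by 4000 photo comes out at 1000 by 667, and a 3000 by 4000 portrait at 750 by 1000, so the longer side always lands on 1000.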

Now to get these files to Azure storage, as documented in my last post, I am using the azcopy tool. It handles most file transfer operations to a storage account. When I first made this program, I was overwriting files every time. As my output folder got larger and larger, it was taking longer and longer. To overcome that, I used the sync command. It behaves much like rsync: it compares the files on either side (by default by name and last-modified time rather than by hash) and only copies the ones that are new or have changed.
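The upload step can also be split into two small helpers, one that assembles the command and one that runs it and complains if azcopy fails. This is a sketch, not the code I actually run: build_sync_command and run_sync are made-up names, the flags are the same two used above, and adding a missing ? in front of the SAS token is my own assumption about how the token was copied.

```python
import subprocess


def build_sync_command(local_dir, account_url, container, sas_token):
    """Assemble the azcopy sync argument list used in this post."""
    # Assumption: the SAS token may or may not include its leading '?'.
    query = sas_token if sas_token.startswith('?') else '?' + sas_token
    destination = account_url.rstrip('/') + '/' + container.strip('/') + query
    return ['azcopy', 'sync', str(local_dir).rstrip('/') + '/', destination,
            '--recursive', '--delete-destination=false']


def run_sync(command):
    """Run azcopy and raise if it exits with a non-zero status."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError('azcopy failed:\n' + result.stdout + result.stderr)
    return result.stdout


# Only build the command here; running it needs azcopy and a real token.
demo_command = build_sync_command('output',
                                  'https://fill_in_here.blob.core.windows.net/',
                                  'wp-media/',
                                  'biglongsastokenstring')
print(demo_command[3])
```

Passing the command as a list, as the main program already does, avoids any shell quoting problems with the & characters in a real SAS token.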

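I have a simple config file, ruploader_conf.py, that holds my Azure SAS token for uploading and the storage account I am uploading to. It's just 3 lines, so go ahead and fill in your own information as needed. The three values are simply joined together into one URL, so the token needs to start with its ? for the result to be valid.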

ACCOUNT_URL = 'https://fill_in_here.blob.core.windows.net/'
ACCOUNT_CONTAINER = 'wp-media/'
ACCOUNT_SAS_TOKEN = 'biglongsastokenstring'

Size Reduction Analysis

So, how well does this process work? Let's take a look at a few photos from my travel blog at https://www.edutraveling.com/new-zealand/first-day-in-rotorua/

File Name       Size Before (bytes)  Size Before (kB)  Size After Conversion (kB)  Reduction
006.JPG                     7191603              7192                         218     96.97%
017.JPG                     7323369              7323                         280     96.18%
035.JPG                     7085330              7085                         274     96.14%
046.JPG                     7026969              7027                         221     96.85%
061.JPG                     7586551              7587                         229     96.98%
064.JPG                     7050109              7050                         229     96.75%
075.JPG                     6257649              6258                          85     98.63%
080.JPG                     4909520              4910                         110     97.76%
086.JPG                     4013663              4014                          81     97.98%
101.JPG                     5580257              5580                         136     97.57%
106.JPG                     4626933              4627                         135     97.07%
107.JPG                     4850359              4850                         123     97.47%
carts.jpg                    382912               383                         127     66.79%
pond_steam.jpg              2874793              2875                         145     94.94%

As you can see here, I have saved TONS of space by having ImageMagick resize all of the photos: most files shrink by more than 96%. I have not done any code profiling, but processing the files takes only a few seconds.
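For anyone who wants to check the reduction numbers, the calculation is just one minus the ratio of the new size to the old size. Here is a minimal sketch using the rounded kB values of two of the photos; reduction_percent is a name I made up for illustration, and because the kB values are rounded, some rows can come out a few hundredths of a percent off from the published figures.

```python
def reduction_percent(before_kb, after_kb):
    """Percentage of the original file size that was saved."""
    return (1 - after_kb / before_kb) * 100


# (file name, size before in kB, size after in kB)
rows = [('006.JPG', 7192, 218), ('080.JPG', 4910, 110)]
for name, before, after in rows:
    print(f'{name}: {reduction_percent(before, after):.2f}%')
```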

Closing Thoughts

I was not sure what to expect when I started this code project, but it has far exceeded all of my expectations. I can browse my old photo directories and copy and rename photos into my folder structure at my leisure. Then I hit process, and before I know it they are uploaded to my CDN for use wherever I need them.

One note: if you want to reprocess some photos, you need to delete their converted copies from the output folder (or delete the whole output folder) so the program converts them again.
