How To: Download files from Amazon S3 using perl and Amazon::S3

After having such a hard time finding (m)any good examples online, this article is going to explain how to use perl to connect to Amazon's S3 service to download files. I am by no means a perl expert, so take the below code with a pound of salt. This example assumes that you have already a working Amazon S3 service, and the appropriate keys.

This code is used to download files which our customer's upload to our support site. The files are downloaded and stored in a directory by date (YYYYMMDD). We have a number of different users and as such only want to download the files for a specific user ($prefix).

The first thing that you will need to do is to install the Amazon::S3 CPAN module with a command similar to:

cpan -i Amazon::S3

Now for the code:

#!/usr/bin/perl</p>
<p>use strict;
use warnings;
use Amazon::S3;</p>
<p>## Amazon S3 Stuff ##
my $aws_access_key_id     = 'aws_access_key_id';
my $aws_secret_access_key = 'aws_secret_access_key';
my $bucket = 'bucket-name';  # The bucket name that files are uploaded to.
my $prefix = 'uploads/bucket-name@domain.tld/uuid';  # We only want to get the files that were uploaded for a specific user.
my $http_prefix = 'https://uploads.domain.tld/download/uuid/';  # I use this string to build the URL
## Amazon S3 Stuff ##</p>
<p>## Script Variables ##
my $directory = '/zstore/uploads/';  # Top level of where to save files to
my $logfile = $directory . &quot;scrape.log&quot;;  # Where to log the file downloads to.
## Script Variables ##</p>
<p>open(my $fh, &quot;+&gt;&gt;&quot;, $logfile);  # Open the file handle for writing logs</p>
<p>my $s3 = Amazon::S3-&gt;new({  # This sets up the Amazon S3 connection.
 aws_access_key_id      =&gt; $aws_access_key_id,
 aws_secret_access_key  =&gt; $aws_secret_access_key
});</p>
<p>print $fh localtime() . &quot;: Getting file list...\n&quot;;</p>
<p>my $files = $s3-&gt;list_bucket({  # This gets the entire list of files under the $prefix in the $bucket, or dies with an error.  $files is a multidimensional hash.
 bucket =&gt; $bucket,
 prefix =&gt; $prefix
}) or die $s3-&gt;err . &quot;: &quot; . $s3-&gt;errstr;</p>
<p>print $fh localtime() . &quot;: Got file list...\n&quot;;                 # If we have made it this far, we now have the entire list of files.</p>
<p>foreach my $file ( @{ $files-&gt;{keys} } ) {                      # $file is a hash of arrays of hashes, AKA multidimensional hash.
 my $file_name = substr($file-&gt;{key}, 69);                      # 20110224110053-file.tar.gz
 my $file_true_name = substr($file-&gt;{key}, 84);                 # file.tar.gz
 my $file_date = substr($file-&gt;{key}, 69, 8);                   # 20110224
 my $sub_directory = $directory . $file_date;                   # /zstore/uploads/20110224
 my $file_path = $sub_directory . '/' . $file_true_name;        # /zstore/uploads/20110224/file.tar.gz
 my $url = $http_prefix . $file_name;                           # $url is the actual url to the file</p>
<p> unless (-d $sub_directory){                                    # Unless the directory is already created
  print $fh localtime() . &quot;: Creating $sub_directory...\n&quot;;     # Log that we are creating the directory
  mkdir($sub_directory);                                        # And create that directory
 }</p>
<p> unless (-e $file_path) {                                       # Unless the file has already been downloaded
  print $fh localtime() . &quot;: Fetching $file_name...\n&quot;;         # Log that we are fetching the file
  my $line = system(&quot;fetch -mq -o '$file_path' '$url'&quot;);        # Actually download the file, assigning the return code of `fetch` to $line
  if ($line == 0) {                                             # If the file is successfully downloaded
   print $fh localtime() . &quot;: Succesfully got $file_path...\n&quot;; # Log it
  } else {                                                      # Else
   print $fh localtime() . &quot;: Problem getting $file_path...\n&quot;; # Could not fetch it for whatever reason.  Move on.
  }
 }
}
print $fh localtime() . &quot;: Exiting!\n\n&quot;;                       # We've went through the entire file list.  Log it.
close($fh);                                                     # Close the file handle.

If you spot any errors or have any questions, comments or feedback, please post a comment!