While working on the new Harvest Help Center, our team had a chance to look at some common web-app issues with fresh eyes. Asset upload is a requirement of almost any modern document-based web site, usually for images or general downloadable files. There are a few pre-built Rails plugins that address asset upload (most notably Paperclip), but they require database tables and are often designed with server-side storage in mind. A robust server-side solution for assets has many benefits, but it was unnecessary for the simple workflow we had in mind.

We worked at finding something simpler and came up smiling.

Why the traditional model doesn’t fit

Harvest uses Textile to format most of our documents. In Textile, an image is specified as follows:

!http://www.getharvest.com/images/this_image.jpg(alt text here)!

And linking to an asset is similar:

"Download a PDF":http://www.getharvest.com/assets/document.pdf

In both cases the asset URL is written into the document itself. Even if we tracked the assets in a database, little could be done to move them without breaking existing URLs. For instance, starting with server-side image storage and later switching to a content delivery network (CDN) isn't an option: with URLs hard-coded into our documents, we would have to update every document before the CDN could benefit us.

Speaking of CDNs, we knew we wanted to use one from the start, so the added infrastructure needed for server-side asset storage would never be useful to us.

Once we recognized that storing assets on our own server and tracking them in a database offered little for our particular requirements, it was time to find a simpler solution.

We came up with a short list of requirements for simple asset handling:

  • No database tables.
  • Use Amazon S3 for storage, and support using Amazon CloudFront for a CDN.
  • Use HTML5 for asynchronous uploading.

The solution we arrived at uses the excellent Plupload, the AWS::S3 gem and some simple Rails logic.

Configuring AWS::S3 and Rails

First, register for an Amazon S3 account if you don’t already have one. Then use an S3 client like S3Hub to access your account and create buckets for your project:

  • project-development
  • project-staging
  • project-production
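
If you'd rather script this step, the AWS::S3 gem we add below can create buckets from a console as well. A minimal sketch, using the same placeholder credentials as the configuration further down:

require 'aws/s3'

# Placeholder credentials -- substitute your real key pair.
AWS::S3::Base.establish_connection!(
  :access_key_id     => 'AAAAAA_your-key-here',
  :secret_access_key => '4rpsi235js_your-secret-here'
)

%w(project-development project-staging project-production).each do |name|
  AWS::S3::Bucket.create(name)
end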

Add the AWS::S3 Gem to your Gemfile:

gem 'aws-s3', :require => 'aws/s3'

Run bundle install and we should be ready to start using S3. Configure AWS::S3 by adding a YAML file and initializer script. In config/initializers/s3_credentials.rb:

# Load AWS::S3 configuration values
#
S3_CREDENTIALS = \
  YAML.load_file(File.join(Rails.root, 'config/s3_credentials.yml'))[Rails.env]

# Set the AWS::S3 configuration
#
AWS::S3::Base.establish_connection! S3_CREDENTIALS['connection']

In the actual configuration file config/s3_credentials.yml:

development: &defaults
  connection:
    :access_key_id: AAAAAA_your-key-here
    :secret_access_key: 4rpsi235js_your-secret-here
    :use_ssl: true
    # :persistent: true
  bucket: project-development
  max_file_size: 10485760
  acl: public-read

test:
  <<: *defaults
  bucket: project-development

staging:
  <<: *defaults
  bucket: project-staging

production:
  <<: *defaults
  # prefix is optional. This is where you would put your CloudFront Domain
  # Name or your CloudFront CNAME if you have one configured.
  prefix: "http://project.s3.mydomain.com"
  bucket: project-production

Now you can interact with Amazon S3 from Rails.
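
A quick sanity check from a Rails console will confirm the credentials and bucket are wired up correctly. A sketch (your bucket listing will vary):

# In a Rails console
AWS::S3::Service.buckets.map(&:name)
# => ["project-development", "project-staging", "project-production"]

AWS::S3::Bucket.find(S3_CREDENTIALS['bucket']).objects.size
# => 0 for a fresh bucket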

Building a controller to handle uploads

For this simple uploader, we have a limited set of requirements for the server-side logic. Uploads should be routed to an action where the asset is pushed to S3, and a URL is returned to the uploading request. Start by creating a controller:

script/rails g controller uploads

And be sure your new controller is routed in config/routes.rb:

resources :uploads, :only => :create

The AWS::S3 upload code can go into the create action:

class UploadsController < ApplicationController
  # Maybe you have some filters, like :authenticate_admin!

  def create
    # Push the uploaded file to S3 under its original filename. Note
    # that a second upload with the same name overwrites the first.
    AWS::S3::S3Object.store \
      params[:file].original_filename,
      params[:file].tempfile,
      S3_CREDENTIALS['bucket'],
      :content_type => params[:file].content_type,
      :access => :public_read
    render :json => {
      :url => public_s3_url(params[:file].original_filename)
    }
  end

  private

  # Prefer the CDN prefix from config/s3_credentials.yml when present;
  # otherwise fall back to the plain S3 URL for the bucket.
  def public_s3_url(filename)
    if S3_CREDENTIALS['prefix'].present?
      "#{S3_CREDENTIALS['prefix']}/#{filename}"
    else
      request.protocol +
        AWS::S3::Base.connections['AWS::S3::Base'].options[:server] +
        "/#{S3_CREDENTIALS['bucket']}/#{filename}"
    end
  end
end

The method public_s3_url prepends the prefix defined in config/s3_credentials.yml, allowing us to serve uploaded assets through a CDN instead of the URL AWS::S3 would otherwise generate. With the production prefix above, document.pdf comes back as http://project.s3.mydomain.com/document.pdf; without a prefix, it falls back to a plain S3 URL like https://s3.amazonaws.com/project-development/document.pdf.
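
One caveat: the max_file_size value in config/s3_credentials.yml isn't enforced by the controller as written, and Plupload's matching max_file_size option (below) only guards the client side. If you want a server-side check too, a minimal sketch might look like this at the top of create:

def create
  # Sketch of a server-side size guard using the configured limit.
  # Not part of the original controller above.
  if params[:file].tempfile.size > S3_CREDENTIALS['max_file_size'].to_i
    render :json => { :error => 'File is too large' }, :status => 413
    return
  end
  # ... store to S3 and render the URL as before ...
end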

HTML5 uploads with Plupload

Plupload is a great Swiss Army knife for uploading assets. It intelligently falls back from one upload strategy to another when a browser doesn't support the first choice. For our own internal tools at Harvest, we only worry about modern browsers that support HTML5 uploads, which simplifies our code.

To use Plupload, you first need a container DOM element for the upload system:

<div style="margin: 2em 0;" id="upload_container">
  <div id="filelist"></div>
  <a id="pickfiles" href="#">[Select files]</a>
  <a id="uploadfiles" href="#">[Upload files]</a>
</div>
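
This assumes the Plupload script itself is loaded on the page. If you've vendored it into public/javascripts, a line like this in your layout will do (plupload.full.js is the bundled build that ships with Plupload; adjust to whatever your download provides):

<%= javascript_include_tag 'plupload.full' %>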

Plupload needs to be told how to accomplish a few tasks:

  • What the URL for processing uploads is.
  • How to show selected files before upload.
  • How to show progress during upload.
  • Most importantly, how to parse the returned JSON from our Rails logic and display that URL.

In your application.js, create a new instance of plupload.Uploader if the container element is present. We’re using jQuery as well as Plupload in this example:

$(function() {
  if( $("#filelist").length ){
    var uploader = new plupload.Uploader({
      runtimes : 'html5',
      browse_button : 'pickfiles',
      max_file_size : '10mb',
      url : '/uploads',
      multiple_queues : true
    });

    // When the user selects files for upload, show them on the page
    //
    uploader.bind('FilesAdded', function(up, files) {
      $.each(files, function(i, file) {
        $('#filelist').append(
          '<div id="' + file.id + '">' +
          file.name + ' (' + plupload.formatSize(file.size) + ') <b></b>' +
          '</div>'
        );
      });
    });

    // When the file is uploaded, parse the response JSON and show that URL.
    //
    uploader.bind('FileUploaded', function(up, file, response){
      var url = JSON.parse( response.response ).url;
      $("#"+file.id).addClass("uploaded").html( url );
    });

    // Show upload progress over time. With the HTML5 runtime this
    // often jumps straight from 0 to 100.
    //
    uploader.bind('UploadProgress', function(up, file) {
      $('#' + file.id + " b").html(file.percent + "%");
    });

    // When the upload button is clicked, upload!
    //
    $('#uploadfiles').click(function(e) {
      uploader.start();
      e.preventDefault();
    });

    uploader.init();
  }
});
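
One caveat on the Rails side: if your app uses protect_from_forgery, Rails may reject Plupload's POST because it carries no authenticity token. Assuming another filter already guards the action, one option is to skip CSRF verification for this controller. A sketch:

class UploadsController < ApplicationController
  # Plupload's multipart POST doesn't include the Rails authenticity
  # token, so skip CSRF verification here -- make sure another filter
  # (such as :authenticate_admin!) still protects the action.
  skip_before_filter :verify_authenticity_token

  # ... create action as before ...
end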

To upload an asset, the user can click “[Select files]” or simply drag files onto that link. They can choose one file, or several. Next they click “[Upload files]” and wait for the assets to be sent to S3. After the assets are uploaded, they copy-paste the resulting URLs into Textile markup. That’s a simple flow with less maintenance and complexity than many other solutions, and it provides all the functionality we need for asset handling.

This solution isn’t ideal for all apps in all situations, but for many of our own internal projects at Harvest it’s a simple and powerful strategy. We hope you find it useful!

If you think solving common problems in new and imaginative ways is something you do well, be sure to check out our Harvest Careers page. We’re hiring smart people and would love to talk to you.