Wednesday, September 19, 2012

The trouble with downloading files

Imagine this scenario. QA and some customers are complaining that one of your file download links is slow. You look at the code and see that the link serves a dynamic file. The basic use case is that the user wants to generate a file on demand and save it to their local disk.

But you look again at the code and see that the same code serves both dynamic and static files. Anyone who has worked with HTTP file downloads knows that the way to tell a browser to download a file is to send a special header:

Content-Disposition: attachment; filename=yourfile

This tells the browser that the response is an attachment and gives it a hint about what the file should be named. The user gets a nice prompt asking them to save or open the file. This is great and what you want. But when you look at the code, most of the time is spent generating the file.
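For illustration, here is a minimal sketch of building that header value in JavaScript (say, in a Node-based server). The attachmentHeader helper is hypothetical and not from my actual code. It quotes the filename and, as I understand RFC 6266 and RFC 5987, adds a filename* form when the name contains non-ASCII characters.

```javascript
// Hypothetical helper (not from the original code): builds a
// Content-Disposition value that asks the browser to save the response.
function attachmentHeader(filename) {
  // Quoted fallback: replace anything outside printable ASCII with "_",
  // then backslash-escape double quotes and backslashes.
  const fallback = filename
    .replace(/[^\x20-\x7e]/g, "_")
    .replace(/["\\]/g, "\\$&");
  let value = `attachment; filename="${fallback}"`;

  // If the name has non-ASCII characters, also send the extended form
  // (percent-encoded UTF-8) for browsers that understand filename*.
  if (/[^\x20-\x7e]/.test(filename)) {
    const encoded = encodeURIComponent(filename).replace(
      /['()*]/g,
      (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
    );
    value += `; filename*=UTF-8''${encoded}`;
  }
  return value;
}
```

For a plain name like yourfile.pdf this just produces the header value shown above, with the filename in quotes.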

So how do you fix this? Lots of people say, "Well, we don't need to generate the file on demand; we could pre-generate the file on the server."

This is a possible solution, but in some applications disk space is precious and should not be wasted. So what do you do when you must generate the file on demand?

What I tried was sending the Content-Disposition header to the browser early, so that the user would be prompted to save as soon as possible and the file would continue to download once they clicked save/open.

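Here is a mod_perl example of accomplishing this with a flush operation:
# Assumes $request is the Apache request object; headers_out returns a table, so set the header on it
$request->headers_out->set('Content-Disposition' => 'attachment; filename=yourfile.pdf');
$request->content_type('application/pdf');
$request->rflush;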

# continue with your possibly long operation...

On Apache with mod_perl, this worked for me in Firefox, but didn't work in IE or even Chrome :-(

If the user clicked the link, they would have to wait a while before any feedback about the download appeared (a save/open dialog or any indication that something was happening).

This led to someone pointing out that if you click the link many times (because you think nothing is happening), you end up with a hosed server at 100% CPU utilization, with every click kicking off its own dynamic file generation.

Doh!

So how can you give the user immediate feedback that something is actually downloading and that a save/open dialog will eventually appear? Product managers and even other developers started suggesting, "Hey, why don't you just disable the link with JavaScript until the file downloads?" Other suggestions involved showing an AJAX loading image in place of the link.
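To make the "disable the link" suggestion concrete, a minimal sketch might look like the following. The createClickGuard name and the 10-second cooldown are made up for illustration. It uses a timed cooldown rather than waiting for a "download started" signal, which is an assumption of this sketch and not something any library promises.

```javascript
// Hypothetical sketch: ignore repeat clicks on a download link for a
// cooldown period. The `now` parameter exists only so the clock can be
// swapped out in tests.
function createClickGuard(cooldownMs = 10000, now = Date.now) {
  let lastAccepted = -Infinity;
  // Returns true if the click should go through, false if it should be ignored.
  return function shouldAllowClick() {
    const t = now();
    if (t - lastAccepted < cooldownMs) {
      return false;
    }
    lastAccepted = t;
    return true;
  };
}

// In a page, the wiring could look like this (element id is made up):
//   const allow = createClickGuard();
//   document.getElementById("download-link").addEventListener("click", (e) => {
//     if (!allow()) e.preventDefault(); // swallow the repeat click
//   });
```

This only protects against impatient clicking from one page; it does nothing for the server if several users request the same expensive file at once.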

So, I humored them. I went down the rabbit hole of implementing AJAX file downloading. And from what I can see, it doesn't exist. I found many crappy solutions, and most of them didn't give the correct user experience across all browsers. The jQuery solution I found worked by creating a hidden iframe, but its onSuccess function didn't fire at the same point in every browser.
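The hidden iframe trick itself is small. Here is a rough sketch of the idea in plain JavaScript; it is not the plugin's actual code, and startHiddenIframeDownload is a name I made up. The optional doc parameter only exists so the function can be exercised outside a browser.

```javascript
// Hypothetical sketch of the hidden-iframe approach: point an invisible
// iframe at the download URL so the current page stays where it is.
function startHiddenIframeDownload(url, doc = document) {
  const frame = doc.createElement("iframe");
  frame.style.display = "none"; // the user never sees the frame
  frame.src = url;              // the browser requests the file in the frame
  doc.body.appendChild(frame);  // the request starts once the frame is attached
  return frame;
}

// Example call in a page:
//   startHiddenIframeDownload("/my_dynamic_download?id=123");
```

Note that nothing here tells the page when the save/open dialog has appeared, which is exactly the part that behaved differently from browser to browser for me.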

My workflow was this:
1. On click of the anchor, replace the anchor with an AJAX loading image and start the jQuery file download.
2. In onSuccess, show the anchor again.

This worked in Firefox, but in IE and Chrome onSuccess fired too early. It didn't stop the user from clicking the link many times, but it did give them the idea that something had happened. It felt like we traded one problem for another.

My product manager suggested we open a new window showing the AJAX loading GIF, then close the window when the user is prompted to download the file (or the browser is showing download progress).

We tried this out using the jquery.Download module, but when we tried closing the window in the onSuccess method, we ran into cross-browser issues. In Firefox, window.close() was a no-op. In Chrome, window.close() worked. In IE, window.close() asked the user whether they wanted to close the window.

This was a crappy solution since it flat out didn't work in Firefox.

Back at the drawing board, one developer suggested we open the download in a new tab using an anchor target:
<a href="/my_dynamic_download?id=123" target="_blank">Download file!</a>

This did work cross-browser. It opened a new tab in each browser. The tab would be blank, but the browser would show some sort of loading indicator in the tab title.
And when the file was ready to prompt the user to save/open, the tab would close automatically!

So the most elegant solution, the one that overcame many different obstacles, was actually a one-line fix.
It's not perfect, but you don't have to write lots of hacky JavaScript.

Just know that if you ever have to deal with downloading dynamic files, you will get lots of opinions. There are lots of crappy solutions out there that aren't cross-browser. You have to really think about how the user will react to no feedback. I don't think my solution is perfect, but I appreciate its simplicity and how it kept me from checking in lots of crappy code that was a major JavaScript hack (many of those solutions even made my Perl code hacky as well).

As a side note, I really believe that if I had more control over the HTTP connection on the server side, just flushing the Content-Disposition header to the browser immediately would have solved everything. But since mod_perl's rflush didn't work as advertised for me in IE and Chrome, I had to go down this crazy rabbit hole.