Joe on Software

Wednesday, September 19, 2012

The trouble with downloading files

Imagine this scenario. You have QA and some customers complaining that you have this link where a file download is slow. You look at the code and see that the link is serving a dynamic file. The basic use case is that the user wants to generate a file on demand and save it to their local disk.

But you look again at the code and you see that we are using the same code to serve a dynamic file as we do to serve a static file. Anyone that has worked with HTTP and downloading a file knows that the way to tell a user to download a file is by sending special headers to a browser:

Content-Disposition: attachment; filename=yourfile

What this does is tell the browser that the link you are downloading is an attachment and gives the browser a hint on what the file should be named. The user will get a nice message asking them to save or open the file. This is great and what you want. But when you look at the code, most of the time is wasted generating the file.
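As a rough sketch of how that header value might be assembled, here is a small JavaScript helper. The name attachmentDisposition is hypothetical, and the decision to replace quotes, backslashes, and line breaks with underscores is an assumption for illustration, not the behavior of any particular framework.

```javascript
// Hypothetical helper: builds a Content-Disposition value for a download.
function attachmentDisposition(filename) {
  // Assumption: replace characters that would break a quoted header value.
  var safe = String(filename).replace(/["\\\r\n]/g, '_');
  return 'attachment; filename="' + safe + '"';
}

console.log(attachmentDisposition('yourfile.pdf'));
// attachment; filename="yourfile.pdf"
```

Quoting the filename keeps names with spaces in one piece.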

So how do you fix this? Lots of people say, "Well, we don't need to generate the file on demand, we could pre-generate the file on the server"

This is a possible solution, but in some applications disk space is scarce and should not be wasted. So what do you do when you must generate the file on demand?

The thing that I tried to do was send the content-disposition header early to the web browser so that the user would be prompted to save as soon as possible, and when they click save/open the file continues to download.

Here is a mod_perl example of accomplishing this with a flush operation:

$request->headers_out->set('Content-Disposition' => 'attachment; filename=yourfile.pdf');
$request->content_type('application/pdf');
$request->rflush;

# continue with your possibly long operation...

On apache with mod_perl, this worked for me on firefox, but didn't work with IE and even chrome :-(

If the user clicked on the link, they would have to wait awhile until feedback about the file download showed (a save/open dialog or any indication that something was happening).

This led to someone pointing out that if you click this link many times (because you think nothing is happening), you get a hosed server at 100% cpu utilization processing your dynamic file generation.

Doh!

So how can you solve the problem of giving the user immediate feedback that something is actually downloading and eventually going to have a save/open dialog? Product managers and even other developers started suggesting, "Hey, why don't you just disable the link with javascript until the file downloads?". Other solutions involved showing an AJAX loading image in place of the link.
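The "disable the link until the download finishes" idea boils down to a guard like the one below. This is only a sketch: createClickGuard and the done callback are hypothetical names, and the fake downloader stands in for whatever actually requests the file.

```javascript
// Hypothetical guard: wraps a download starter so repeat clicks are ignored
// while one request is still in flight.
function createClickGuard(startDownload) {
  var inFlight = false;
  return function onClick() {
    if (inFlight) {
      return false; // ignore the click; a download is already running
    }
    inFlight = true;
    // startDownload must call done() once the download has been handed off.
    startDownload(function done() {
      inFlight = false;
    });
    return true;
  };
}

// Usage with a fake downloader that finishes only when we say so.
var finish;
var started = 0;
var onClick = createClickGuard(function (done) {
  started += 1;
  finish = done;
});

onClick();  // starts the download
onClick();  // ignored, still in flight
finish();   // download finished, link is usable again
onClick();  // starts a second download
console.log(started); // 2
```

The guard is only as good as the signal that calls done(). Working out when a download has actually started is the hard part.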

So, I humored them. I went down the rabbit hole of implementing AJAX file downloading. And from what I can see, it doesn't exist. I found many crappy solutions. Most of them didn't give the correct user experience across all browsers. The jquery solution I found was implemented by creating a hidden iframe. But the onSuccess function didn't get fired at the same place for all browsers.

My workflow was this:
1. onclick for the anchor, replace the anchor with an AJAX loading image and start the jquery file download
2. onSuccess show the anchor again.

This worked in firefox, but in IE and chrome the onsuccess fired too early. This didn't solve the problem of preventing the user from clicking the link many times, but it did give the user the idea that something did happen. It felt like we traded one problem for another.

My product manager suggested we open a new window and show the ajax loading gif. Then close the window when the user is prompted to download the file (or the browser is showing download progress).

We tried this out, using the jquery.Download module, but when we tried closing the window in the onsuccess method, we ran into cross browser issues. Firefox window.close() was a no-op. Chrome window.close() worked. IE window.close() asked the user if they wanted to close the window.

This was a crappy solution since it flat out didn't work in firefox.

Back to the drawing board, one developer suggested we open in a new tab using an anchor target:
<a href="/my_dynamic_download?id=123" target="_blank" >Download file!</a>

This did work cross browser. It opened a new tab in each browser. The tab would be blank, but the browser would show some sort of loading indicator next to the tab name. And when the file was ready to prompt the user to save/open, the tab would close automatically!

So the most elegant solution that overcame many different obstacles was actually a one line fix.
It's not perfect, but you don't have to write lots of hacky javascript.

Just know that if you ever have to deal with downloading dynamic files you will get lots of opinions. There will be lots of crappy solutions out there that aren't cross browser. You have to really think about how the user will react to no feedback. I don't think my solution is perfect but I appreciate its simplicity and how it prevented me from checking in lots of crappy code that was a major javascript hack (and the many solutions even made my perl code hacky as well).

As a side note, I do feel that if I had more control over the HTTP connection on the server side, I really believe that just flushing the content disposition to the web browser immediately should have solved everything. But since mod_perl's rflush didn't work as advertised, I had to resort to this crazy rabbit hole.

Wednesday, August 8, 2012

Interesting perl foreach behavior

My co-workers were surprised by this perl behavior and shared this problem so I thought it would be a fun post.

use Data::Dumper;

my %hash = (
   a => 1,
   b => 2,
);

foreach my $v (values %hash) {
    $v = 99;
}

warn Data::Dumper->Dump([\%hash], [qw(hash)]);

# This was the output:
$hash = {
  a => 99,
  b => 99
};

This code is trying to edit a hash, and normally, you would iterate over the keys and then change the looked up value somewhat like this:

foreach my $k (keys %hash) {
    $hash{$k} = 99;
}

You can change an array's items the same way:

my @test = qw(a b c);
foreach my $v (@test) {
  $v = 'zzz';
}
warn Data::Dumper->Dump([\@test], [qw(test)]);

# This was the output:
$test = [
  'zzz',
  'zzz',
  'zzz'
];

What is happening here is that the for-loop variable is an alias for each list item. Since it's an alias, Perl does not copy the list item's value just to hand $v to you (it's efficient).

The hash values get modified in the first example because values() returns aliases to the hash's actual values rather than copies. The code below wouldn't change the hash, because it introduces an intermediate copy:

my %hash = (
  a => 1,
  b => 2,
);
my @hash_values = values %hash; # make copies of hash value aliases
foreach my $v (@hash_values) {
  $v = 99;
}

And the following example dies at runtime with "Modification of a read-only value attempted", because the loop variable is aliased to constant list items:

foreach my $v (qw[ a b c ]) {
  $v = 'zzz';
}

So now you know the power of the foreach loop.

Saturday, June 16, 2012

SSH bookmarklet

When developing server software for a company, you might be troubleshooting an issue on a development server other than your own and you want to login to that server. Specifically I'm talking about when you are looking at say a QA person's issue in a web browser and you want to login to their box and see what's wrong.

I found out recently that my MacBook already has the ssh protocol registered: if you click on a link with the scheme ssh://user@host.name, it executes ssh user@host.name. I wanted to figure out a way to use this in a bookmarklet. I also wanted it to be easy to enter a different username, like root or the QA person's username, if you have certain access restrictions on that server.

Here is my attempt to create that bookmarklet. Just drag it to your browser's toolbar:

Run SSH

Here is my breakdown of what happens:

// prompt for a user to ssh as, save that in the window for next time
window.sshuser = prompt('Login to ' + window.location.hostname + ' as', window.sshuser || '');

// construct the user@ string
var user = (window.sshuser ? window.sshuser + '@' : '');

// construct the url and execute it
window.location = 'ssh://' + user + window.location.hostname;
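The URL construction can be pulled into a plain function so it can be tried outside the browser. This is a sketch only: buildSshUrl and the example hostname are made up, and running the username through encodeURIComponent is my own assumption, not part of the original bookmarklet.

```javascript
// Hypothetical helper: the same URL construction as the breakdown above,
// without touching window or prompt.
function buildSshUrl(user, hostname) {
  // prompt() returns null on cancel and '' on empty input; both mean "no user".
  var prefix = user ? encodeURIComponent(user) + '@' : '';
  return 'ssh://' + prefix + hostname;
}

console.log(buildSshUrl('root', 'qa-box.example.com')); // ssh://root@qa-box.example.com
console.log(buildSshUrl(null, 'qa-box.example.com'));   // ssh://qa-box.example.com
```

In the bookmarklet itself, the two arguments would come from prompt() and window.location.hostname.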

This works great, but I really want to figure out how to register the ssh:// protocol on any operating system or browser, something like what this link explains: http://kb.mozillazine.org/Register_protocol

On Windows it is pretty easy to create a batch script that adds registry keys so that the ssh protocol runs putty (or even cygwin ssh). I would like to know how to do the same on Ubuntu (or any Linux). I'm pretty stoked that the MacBook already has it. I don't know if it's related to me installing the mac developer tools, but I'm pretty satisfied.

Anyway, hope this bookmarklet helps!

Weird jQuery bug

My software was recently upgraded from jQuery 1.5.2 to 1.7.2.

Luckily we had a bunch of automated tests for our ui code and we found a lot of issues with our code that toggles checkboxes or the disabled state of a tag. Previously you could say

$('#my_checkbox').attr('checked', 'checked'); // check the checkbox
$('#my_checkbox').attr('checked', ''); // uncheck the checkbox

$('#my_input').attr('disabled', 'disabled'); // disable the input
$('#my_input').attr('disabled', ''); // enable the input

We had code all over the place that did this. But after the upgrade, passing an empty string to attr no longer enabled or unchecked the elements.

The fixes we found were to either use a boolean value instead of a string, or you could use the removeAttr function:

$('#my_checkbox').attr('checked', true); // check the checkbox
$('#my_checkbox').attr('checked', false); // uncheck the checkbox
$('#my_checkbox').removeAttr('checked'); // alternative to uncheck the checkbox

$('#my_input').attr('disabled', true); // disable the input
$('#my_input').attr('disabled', false); // enable the input
$('#my_input').removeAttr('disabled'); // alternative to enable the input

We needed to figure out the best way to change our code with little impact and it looked like using removeAttr would cause more code since you would have to litter your code with if/else statements to figure out whether to use the attr or the removeAttr case.

So because of this we tried to enforce using booleans. But you could run into issues where you didn't know the type of the second parameter to attr.

var checked = $('#my_checkbox').attr('checked'); // get the checkbox state:
//   'checked' or ''
//...
$('#other_checkbox').attr('checked', checked); // set another checkbox to the same state
// same thing applies to disabled

I did some digging and found that jQuery 1.6.2 had a fix to change attr('checked') to return the actual html value of checked instead of a boolean value:

<input id="my_checkbox" type="checkbox" checked="checked" />
...
var checkState = $('#my_checkbox').attr('checked'); // returns 'checked' as of jQuery 1.6.2 instead of true.  When it is unchecked, you get undefined returned.
// See this link for a list of more quirks: 
//     http://blog.jquery.com/2011/06/30/jquery-162-released/#comment-526605

So, to remedy this, we decided to always ensure you pass a boolean to the attr function for disabled or checked. Don't use empty string to uncheck or enable your element. If you still want to rely on the return string of the attr function, you can booleanize it like this:

var check_state = $('#my_checkbox').attr('checked'); // get the checkbox state:
//    'checked' or undefined
//...
$('#other_checkbox').attr('checked', !!check_state); // set another checkbox to the same state using a boolean
// !!undefined yields false, 
// !!'checked' yields true
// same thing applies to disabled
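The copy-with-a-boolean pattern above can be wrapped in one small function. This is a sketch under assumptions: copyFlag is a hypothetical name, and fakeElement is a stand-in for a jQuery object that only implements attr(), so the example runs without jQuery or a page.

```javascript
// Hypothetical helper: copy a boolean attribute ('checked' or 'disabled')
// from one jQuery-style object to another, always passing a real boolean.
function copyFlag($from, $to, name) {
  var raw = $from.attr(name); // may be 'checked', '', true, or undefined
  var flag = !!raw;
  $to.attr(name, flag);
  return flag;
}

// Stand-in for a jQuery object; NOT jQuery, just enough attr() for the demo.
function fakeElement(initial) {
  var attrs = initial || {};
  return {
    attr: function (name, value) {
      if (arguments.length === 1) {
        return attrs[name];
      }
      attrs[name] = value;
      return this;
    }
  };
}

var source = fakeElement({ checked: 'checked' });
var target = fakeElement();
copyFlag(source, target, 'checked');
console.log(target.attr('checked')); // true
```

jQuery 1.6 and later also provide .prop(), which reads and writes checked and disabled as booleans, so it is another way to avoid guessing what type attr() will hand back.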

Now you know, and knowing is half the battle.

<rant>
I have to say that I'm pretty unimpressed with jQuery since no one caught this. Changing apis that are so important, like attr, is bad mojo and will cause many developers to curse your name late at night as they debug these unexpected api upgrade bugs. jQuery is supposed to be so awesome because it is easy going and does what you expect. This is a step backwards for all of us.
</rant>

Wednesday, April 11, 2012

dev=1

Sometimes you want to be able to toggle dev=1 on a url. This can easily be done with javascript. You can create a link on a page with href="javascript:..." then right-click->save this link as a bookmark in your bookmark toolbar.

The following link is my attempt to write this, you can click it now, or bookmark it --> Toggle Dev

Let's read it line by line:

var alocation = window.location.search.toString(); // save the ?query string

// determine if when we add dev=1 if we need a ? or an &
var questoramp = alocation.indexOf('?') != -1 ? '&' : '?'; 

// if dev=1 exists, remove it, otherwise add it
var next = alocation.match(/[?&]dev=1(?=&|$)/) ? alocation.replace(/[?&]dev=1(?=&|$)/,'') : alocation + questoramp + 'dev=1'; 

// handle the special case where dev=1 is the first parameter and others follow it
next = alocation.match(/\?dev=1&/) ? alocation.replace(/dev=1&/,'') : next; 

// update the window location
window.location.href=window.location.pathname + next + window.location.hash;

You have to make sure you don't forget the window.location.hash, or else you will get the dev=1 in the wrong part of the url.
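For comparison, here is a sketch of the same toggle written as a plain function that splits the query string into parameters instead of using regular expressions. The name toggleDev is hypothetical. It takes a string shaped like window.location.search and returns the new query string.

```javascript
// Hypothetical helper: toggle dev=1 in a query string like '?a=1&b=2'.
function toggleDev(search) {
  // Drop the leading '?', split into 'key=value' pairs, ignore empty pieces.
  var pairs = search.replace(/^\?/, '').split('&').filter(function (p) {
    return p !== '';
  });
  var without = pairs.filter(function (p) {
    return p !== 'dev=1';
  });
  if (without.length === pairs.length) {
    without.push('dev=1'); // dev=1 was not present, so add it
  }
  return without.length ? '?' + without.join('&') : '';
}

console.log(toggleDev(''));             // ?dev=1
console.log(toggleDev('?dev=1&foo=2')); // ?foo=2
console.log(toggleDev('?foo=2'));       // ?foo=2&dev=1
```

In a bookmarklet this would be used as window.location.href = window.location.pathname + toggleDev(window.location.search) + window.location.hash. Comparing whole parameters means dev=1 needs no special case when it comes first, and a parameter such as dev=10 is left alone.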

=====================

I also wanted a way to toggle my web server from live apache(https) to a single threaded dev server (8080)

Here is my attempt: Toggle 8080

Again, here it is line by line

// save the whole url
var alocation = window.location.href.toString(); 

// if the port is 8080, remove it and change the protocol to https, otherwise switch to http on port 8080
var next = window.location.port == '8080' ? alocation.replace(/:8080/,'').replace(/^http:/,'https:') : 'http://' + window.location.hostname + ':8080' + window.location.pathname + window.location.search + window.location.hash; 

// update the window location
window.location = next;
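The same decision can be sketched as a function over an object shaped like window.location, which makes it easy to try both directions. togglePort and the example hostname are hypothetical. The sketch rebuilds the URL from its parts instead of editing the full href.

```javascript
// Hypothetical helper: takes an object shaped like window.location and
// returns the URL of the "other" server.
function togglePort(loc) {
  var rest = loc.pathname + loc.search + loc.hash;
  if (loc.port === '8080') {
    // dev server -> live apache over https on the default port
    return 'https://' + loc.hostname + rest;
  }
  // live server -> single threaded dev server on 8080
  return 'http://' + loc.hostname + ':8080' + rest;
}

console.log(togglePort({
  hostname: 'dev.example.com', port: '',
  pathname: '/app', search: '?x=1', hash: '#top'
}));
// http://dev.example.com:8080/app?x=1#top
```

Using hostname rather than host keeps any existing port number out of the rebuilt URL.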

Hope this helps.

Friday, March 9, 2012

Logic chains in javascript and python

Let's talk about javascript. What happens when you have a function like this:

function(x) {
    x = x || 'default'
    ...
}

This is useful when you want to have a function where x is optional. When you don't pass a value to your function x is treated as undefined and the statement is read as x = undefined || 'default'. Whenever you have:

somevariable = statement || statement || statement ... || statement

the variable will get assigned the first true value from left to right or the last false value.
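That rule can be written out as a loop to make it concrete. This is a sketch: orChain is a hypothetical helper, and unlike the real || operator it receives all of its values up front instead of short-circuiting.

```javascript
// Hypothetical helper: the same selection rule as a || b || c,
// written as a loop over already-evaluated values.
function orChain(values) {
  for (var i = 0; i < values.length; i++) {
    if (values[i]) {
      return values[i]; // first truthy value wins
    }
  }
  return values[values.length - 1]; // nothing truthy: last value
}

console.log(orChain([undefined, '', 'default', 'ignored'])); // default
console.log(orChain([undefined, 0, null]));                  // null
console.log(undefined || 0 || null);                         // null
```

One consequence worth remembering: x = x || 'default' also replaces legitimate falsy arguments such as 0 or the empty string.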

But what happens when you do this?

x = [] || 'kittens'

You would think that [] evaluates as a false value, but in javascript an empty array is truthy, so x becomes [].

Let's try this in python:

x = [] or 'kittens'

This evaluates the way you think, x becomes 'kittens'.

Now, I'm not saying anyone would have statements like this, but I just wanted you to think about the little differences that languages have and how they can cause unforeseen bugs.

The same thing applies to empty objects:

in javascript:

x = {} || 'yarg' /* yields {} */

in python

x = {} or 'yarg'  # yields 'yarg'
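If you want the python behavior in javascript, you have to check for emptiness yourself. A minimal sketch follows. The names isTruthyLikePython and orLikePython are hypothetical, and the helper is only meant for arrays and plain objects (other object types, such as Date, would count as empty here).

```javascript
// Hypothetical helper: python-style truthiness for arrays and plain objects.
function isTruthyLikePython(value) {
  if (Array.isArray(value)) {
    return value.length > 0; // [] is false in python
  }
  if (value !== null && typeof value === 'object') {
    return Object.keys(value).length > 0; // {} is false in python
  }
  return !!value; // everything else: normal javascript rules
}

// Hypothetical two-argument version of python's "a or b".
function orLikePython(a, b) {
  return isTruthyLikePython(a) ? a : b;
}

console.log(orLikePython([], 'kittens')); // kittens
console.log(orLikePython({}, 'yarg'));    // yarg
console.log(orLikePython([1], 'kittens')); // [ 1 ]
```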

Complementary to the 'or' examples, 'and' chains behave the opposite way. If you have:

somevariable = statement && statement && ... && statement

your variable will get the first false statement or the last true statement.

This works the same way in javascript and python:

x = 'lol' and 'heh' and 'doh'   # python: x is 'doh'
x = 'lol' && 'heh' && 'doh'     // javascript: x is 'doh'

but again, try this:

// javascript
x = 'lol' && [] && 'doh' /* yields 'doh' */

# python
x = 'lol' and [] and 'doh'  # yields []

Now you know; and knowing is half the battle.

Thursday, March 1, 2012

Fun Emacs Prank

I noticed while using emacs you can use this handy program called emacsclient to perhaps open a file or run a command on emacs. I took this as an opportunity to have fun with my co-workers.

They like to run emacs with X-Forwarding on remote hosts that are the development hosts. It's easy to log into their remote host and do something like this:

emacsclient --no-wait ~/prank-file

This would open the file on their currently running emacs. Sounds great right? You could make some ascii picture of something nsfw and then just run good old emacsclient to turn their day around!

Anywho, this wasn't enough for me. I wanted to make it so the file would change and maybe scroll across the screen saying something like "LOL Kittens!". Or even you could try and make it invert the ascii art and then invert it back. Somewhat like you are flashing the text.

I first tried to get it so you could scroll the art across the screen (from left to right)
Example:
first iteration
LOL
CAT

second iteration
OL
AT

final iteration
L
T
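The iterations above can be sketched as a small function that produces each frame by dropping one more leading column. This is only an illustration in javascript, with a hypothetical name (scrollFrames). It is not the shell approach tried next.

```javascript
// Hypothetical helper: given the lines of the ascii art, return every
// frame of the scroll, each frame missing one more leading column.
function scrollFrames(lines) {
  var width = 0;
  lines.forEach(function (line) {
    width = Math.max(width, line.length);
  });
  var frames = [];
  for (var start = 0; start < width; start++) {
    frames.push(lines.map(function (line) {
      return line.slice(start);
    }));
  }
  return frames;
}

console.log(scrollFrames(['LOL', 'CAT']));
// [ [ 'LOL', 'CAT' ], [ 'OL', 'AT' ], [ 'L', 'T' ] ]
```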

I started out trying this:

columns=`head -n 1 /tmp/ascii-art-file | wc -c`; # get the number of columns in the text file

for x in $(seq 1 $columns); do
    cut -c $x-$columns /tmp/ascii-art-file > /tmp/scroll;
    emacsclient --no-wait /tmp/scroll; 
    sleep 5;
done;

This didn't work as expected. Emacs kept asking if I wanted to refresh the modified buffers.
I did some digging and it didn't seem like an easy solution. I wanted this to be something that would just appear on my co-worker's screen and cause mayhem.

After some more digging I found the handy emacs command: zone.
It does all sorts of fun stuff, like scrolling your current buffer or even dripping the text like in the movie The Matrix.

I was delighted to find out that emacsclient comes with --eval as an option. This allows you to run lisp on your emacs server. Checkmate.
I created this script:

/tmp/zone
#!/bin/sh

emacsclient --no-wait "$1"
emacsclient -e "(with-current-buffer \"$(basename "$1")\" (zone))"
sleep 5

This script ran fine. But sometimes emacsclient blocked when you ran zone. This wasn't ideal. So I added this to the beginning to kill the emacsclient every time you run zone:

ps -ef | grep emacsclient | grep -v grep | awk '{print $2}' | xargs --no-run-if-empty kill -9

Finally, to get this to run over and over, just put zone in a while loop:

while true; do /tmp/zone /tmp/ascii-art-file; done;

Happy pranking!