A Better File Upload Progress Bar using Python, Ajax Prototype, & JSON
March 31st, 2006 Author: Lindsey Simon
Update: Aug 22, 2006: Download the Code and a working Demo!
Also, read our documentation on the inner workings of the Python CGI.
The Python File Input CGI
About four months ago, Christopher Bottaro, another developer here at FineTooth, worked on a Python CGI script to handle file uploads and it is mighty ingenious. The CGI uses standard input as a stream, which PHP cannot do. The CGI writes the stream data as it arrives, in blocks, to a file buffer. This enables us to use another PHP server-side script to poll the size of the file buffer while it’s being written to. With this, we can calculate the parameters necessary for a progress bar user interface (time remaining, speed of upload). We integrated the whole kit and caboodle into our PHP Form class so that if a form had a file input, the progress bar was all set up to pop up automagically, and the form’s action was set to point to the CGI. Upon file upload completion, the CGI would perform a redirect to the ‘original’ action URI for data processing.
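To make the block-by-block idea concrete, here is a minimal sketch of how a CGI might spool its standard input to a file buffer. It is not the FineTooth script: the function name `spool_upload`, the block size, and the buffer path are assumptions made for illustration. In a real CGI the stream would be `sys.stdin.buffer` and the length would come from the `CONTENT_LENGTH` environment variable.

```python
import io
import os
import tempfile

BLOCK_SIZE = 8192  # assumed block size; the post does not state one


def spool_upload(stream, buffer_path, content_length, block_size=BLOCK_SIZE):
    """Copy up to content_length bytes from stream to buffer_path in blocks.

    Returns the number of bytes actually written.
    """
    written = 0
    with open(buffer_path, "wb") as buf:
        while written < content_length:
            block = stream.read(min(block_size, content_length - written))
            if not block:  # the client stopped sending early
                break
            buf.write(block)
            buf.flush()  # push data to the OS so a poller sees the new size
            written += len(block)
    return written


if __name__ == "__main__":
    # io.BytesIO stands in for the request body arriving on standard input.
    body = b"x" * 20000
    path = os.path.join(tempfile.mkdtemp(), "upload.buffer")
    copied = spool_upload(io.BytesIO(body), path, len(body))
    print(copied, os.path.getsize(path))
```

Flushing after every block is what lets a separate polling script read a meaningful file size while the upload is still in flight.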
The Problem Case – File Inputs & JavaScript
As we’ve moved more and more towards using JavaScript to control our application’s form processing (as opposed to using the standard form action attribute as a URL), one problem situation has been caused by the file input. Since the file input is essentially an OS level object, JavaScript can’t access the information (thank goodness!) in order to send it programmatically to a handler. So what we’ve done in the past has been to use iframes for forms that needed file inputs. I’ve read that the Dojo Toolkit basically uses this technique: it detects whether the form has a file input on it, and if so, it uses an iframe mechanism to submit the form. For a while this was a good workaround. However, for aesthetic and programmatic design reasons, I wanted to come up with another solution.
Why not just use Iframes for those forms?
For one thing, iframes are just kind of annoying. If you want to access your JavaScript framework, you have to program in the scope of window.parent. Hopefully, the browser is caching your stylesheets and scripts, but I’ve sometimes seen otherwise while watching the Apache logs, so I don’t believe the iframe solution is always efficient. More importantly, though, using the classic form post technique and then rewriting output from the server to the iframe’s window is just not how I want to process forms in our application. I prefer to use Prototype’s Form.serialize and then Ajax.Request to deal with form transactions. Our application doesn’t have page refreshes overall, so introducing them into little iframes only when a form needs a file input just feels so cowardly. And all that *really* needs to post is the file input data. So suddenly, the brain dinger started going off.
Embedding N-Iframes
Instead of using an iframe for the entirety of the form, I wanted to try embedding iframes individually for each file input that needed to be on the form. I could have an iframe that loaded up its own form object into its document body with only the file input and some session information. Then, via the controlling form, we can use JavaScript to submit each embedded iframe’s form programmatically. Once the poller running in the main form detects that all of the files are safely on the server, we can submit the rest of the form data using Ajax.Request along with some logic to capture the information about the freshly uploaded files. It sounded funky, but lo and behold, it works and it scales.
A diagram might help make more sense of this approach:
Form
* text input
* text input
* Iframe
    * form
        * file input
* text input
etc...
What’s key here is that instead of having a single CGI try to deal with multiple file inputs and all the other inputs on the form, our CGI is doing only what we originally wanted it for in the first place – the file upload.
Here’s the form processing logic:
* User chooses file(s) to upload in the form.
* User presses submit.
* JavaScript validation runs against the form.
* JavaScript looks inside the form for any embedded iframes.
* If it finds them, it submits the embedded form to the CGI.
* An interval runs again and again until all the files are reporting 100% upload.
* The rest of the form data is at last sent using Ajax.Request.
A few other niceties
Since our CGI is writing data in chunks, and since the juiciest information about the file is in the first little bit of data, the CGI can do a little regular expression matching early on in the process to capture the file’s mimetype and name. Our poller can then read this information and we can show a purty little icon for the filetype and also reference the actual file name in the poller window – making it look pretty slick!
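As a rough illustration of that early matching, the sketch below pulls the file name and mimetype out of the first block of a multipart/form-data body. The patterns and the function name `sniff_file_info` are my own assumptions, not the expressions the actual CGI uses; they rely only on the standard part headers (Content-Disposition and Content-Type) that browsers send ahead of the file data.

```python
import re

# Each multipart part begins with headers such as:
#   Content-Disposition: form-data; name="upload"; filename="song.mp3"
#   Content-Type: audio/mpeg
FILENAME_RE = re.compile(
    rb'Content-Disposition:[^\r\n]*filename="([^"\r\n]*)"', re.IGNORECASE
)
MIMETYPE_RE = re.compile(rb'Content-Type:[ \t]*([^\r\n;]+)', re.IGNORECASE)


def sniff_file_info(first_block):
    """Return (filename, mimetype) found in the first block of the body.

    Either value is None when the corresponding header is not present.
    """
    name = FILENAME_RE.search(first_block)
    if name is None:
        return None, None
    # Search for Content-Type only after the filename, so that headers of
    # earlier, non-file parts are not picked up by mistake.
    mime = MIMETYPE_RE.search(first_block, name.end())
    filename = name.group(1).decode("utf-8", "replace")
    mimetype = mime.group(1).decode("ascii", "replace").strip() if mime else None
    return filename, mimetype
```

A CGI could write the two values to a small metadata file next to the buffer, which is one way the poller could pick them up; how the original script passes them along is not described here.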
The Poller
The return data from the poller comes back to the browser as JSON and looks like this:
({"success":true,
"percent_done":3,
"kps":3,
"timeleft":"21 mins 8 secs",
"filename":"01 Mistakes.mp3",
"mimetype":"audio\/mpeg",
"mimetype_iconsrc":"music.png",
"current_size":126676,
"total_size":4044948})
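The numeric fields above can be derived from just three values: the current size of the file buffer, the total size announced by the browser, and the time elapsed since the upload started. Our poller is a PHP script; the sketch below shows the same arithmetic in Python, with the function names and the rounding choices being assumptions on my part.

```python
import json


def format_timeleft(seconds):
    """Format a number of seconds in the style of the sample response."""
    minutes, secs = divmod(int(seconds), 60)
    return "%d mins %d secs" % (minutes, secs)


def poll_status(current_size, total_size, elapsed_seconds):
    """Build the progress fields from buffer size, total size and elapsed time."""
    if current_size <= 0 or total_size <= 0 or elapsed_seconds <= 0:
        # Nothing has arrived yet, so no speed or estimate can be computed.
        return {"success": True, "percent_done": 0, "kps": 0,
                "timeleft": "unknown",
                "current_size": max(current_size, 0), "total_size": total_size}
    bytes_per_second = current_size / elapsed_seconds
    remaining = max(total_size - current_size, 0)
    return {
        "success": True,
        "percent_done": int(current_size * 100 / total_size),
        "kps": int(bytes_per_second / 1024),  # kilobytes per second
        "timeleft": format_timeleft(remaining / bytes_per_second),
        "current_size": current_size,
        "total_size": total_size,
    }


if __name__ == "__main__":
    # Sizes taken from the sample response; 41 seconds elapsed is an assumption.
    print(json.dumps(poll_status(126676, 4044948, 41)))
```

With the sizes from the sample and an assumed 41 seconds of elapsed time, this yields percent_done 3, kps 3 and a timeleft of "21 mins 8 secs", matching the response shown above. In a real poller, the current size would come from checking the buffer file on disk.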
So Why can’t we do multiple files all at once?
Originally I wanted a display that resembled Firefox’s Download Manager. One problem I haven’t found an adequate answer to is why it seems that I can’t send multiple CGI requests from the embedded iframes at once. I can see that Apache seems to be blocking until all but the last CGI are left in terms of sending responses back to the browser. Maybe it is the browser doing that; I did find one mention of that being the case back in old versions of Netscape, but I can’t seem to find any other validation of this limitation anywhere. So until that issue is resolved, files will be sent and polled sequentially in this version.
Entry Filed under: User Interface
21 Comments
1. SystemSam | May 8th, 2006 at 6:09 pm
One of the cleanest file upload implementations I’ve seen. Most have been clumsy pop-ups or simple tables with obvious refresh issues. Is this going to be available for download? I would love to implement something this clean on some of our sites
http://uber-uploader.sourceforge.net/
That might help you when it comes to multiple files. It has a Perl side but might give you some insight/ideas on the logic.
2. Tobie Langel | May 29th, 2006 at 4:39 pm
A great script!
No Safari support is a shame, though… planned?
3. myra | May 30th, 2006 at 2:18 am
Good work!
But is it possible to get the source code?!
Thanks!
4. Marc | May 31st, 2006 at 3:23 pm
Fantastic script.
Do you plan to release the source code?
5. Lindsey Simon | June 1st, 2006 at 11:58 am
We are still working on releasing the source code – we want to be able to do so in a way that makes it easy to implement, so stay in touch if you’re interested in it!
6. Chad | June 1st, 2006 at 6:05 pm
Hey – I’ve been working on something very similar (with the same original goal of doing a Firefox-esque “Upload Manager”). I wrote it all this weekend (I did an HttpHandler in .NET for the upload handling and status exposure elements). Today I plumbed everything together and was frustrated to discover that Firefox was blocking (and thus preventing Prototype from polling) when the form is submitted, even though it’s inside of an iframe. It seems that yours is doing a little better, but still kinda blocking (i.e. hourglass). Have you guys come up with any ideas to solve this?
7. Lindsey Simon | June 2nd, 2006 at 4:01 pm
Hmm.. I’m not sure what you mean by “blocking” here, Chad. Firefox won’t allow more than 2 concurrent connections. Is this something that you’re only experiencing in Firefox?
8. Lindsey Simon | June 2nd, 2006 at 4:04 pm
Do you maybe mean that period of time after submit and before you get a response back from the server?
9. Huw Jeffries | June 8th, 2006 at 2:24 am
Hi There,
Would it be possible to grab the python source code for the uploader?
Many thanks,
Huw
10. Aaron McMahon | June 9th, 2006 at 9:50 pm
Hey Lindsey, I don’t know for sure, but I think your blocking problem could be a browser issue. Most web browsers will only allow X connections to a single server. Any further connection requests have to wait until the number of open connections drops below X. Browsers do this because they don’t want to hammer a server. Again, don’t quote me on this, but I believe the connection limit is around 3 or 4 total connections for a single server. If every browser had a limit of 50, it would bring many unprepared websites to their knees.
The easiest way to get around this is to use different subdomains which would count as a “separate” server with a new set of connection limits. For example, in your script, you could have it go to “www1.server.com/upload.cgi” and ALSO to “www2.server.com/upload.cgi”. Your web server would have to be setup to respond to those subdomains and to point to the same “upload.cgi” script, but that’s easy enough. Then, you can get around the blocking connection problem.
11. Lindsey Simon | June 12th, 2006 at 3:57 pm
It turns out that by default, Internet Explorer and Firefox will only download two resources from a single domain at once when using persistent connections (as suggested in the HTTP 1.1 spec, section 8.1.4). Cal Henderson pointed this out in his recent post about delivering JavaScript. Mystery solved!
12. Hugues | July 7th, 2006 at 1:33 am
Fantastic!
I’m looking for the source to implement this solution in my office. I would prefer a Python solution to the Perl ones (uber-uploader).
Many thanks.
13. kaiser | July 12th, 2006 at 11:01 pm
Fantastic!
I have been looking for this for a long time. I love the Python solution, thank you very much.
14. Al | October 4th, 2006 at 6:58 pm
If you want to use stdin and php it might be worth looking into the special file wrapper php://input (http://www.php.net/manual/en/wrappers.php.php)
It may not be what you’re looking for (I don’t really know much about it), and you may be happy with the python script, I just thought that maybe if you fancied migrating it all to php, there’s your chance.
Other than that this looks like a great tool, I haven’t tested the source code out yet but the demo was fun.
15. Harsh | October 12th, 2006 at 2:57 am
I have downloaded the code you gave. But when I run the script, it’s giving me a JavaScript error: json has no properties.
So, do I need to change anything after putting this folder into the root dir, or do I need to change some settings?
16. Lindsey Simon | October 14th, 2006 at 11:49 am
You need to check your server-side handler to make sure it’s sending JSON.
17. Julien | October 17th, 2006 at 2:47 am
Hello, I had the same problem : exception raised because of a null json object.
I figured out the problem: some classes use PHP 5 class syntax (ArrayToJSON.php and MimeTypeIcons.php) while my server runs PHP 4.x. Slightly modifying the code fixed the problem.
Now I am stuck: the upload begins (temp directories/meta-files are created) on the server side, but it seems that the data is not transferred, so the loader keeps printing “connecting to server…”
18. max donnelly | October 17th, 2006 at 10:24 am
Very cool script. The shared server we use doesn’t support Perl, so I’m hoping your upload tool will save us. However, I’m having the same problem as Harsh, json has no properties.
How do I check server side handler to make sure it’s sending json?
19. DWR and AJAX » Nest… | December 3rd, 2006 at 5:05 pm
[...] I haven’t used this myself, but this AJAX file uploader – http://development.finetooth.com/?p=11 – seems very cool. Unfortunately it seems to use Python in addition to PHP, however in the comments I’ve mentioned that PHP actually does allow standard input to be used, so maybe if you can translate the Python file upload script to PHP it’ll be good. The demo works perfectly for me, and it’s a very nice way of doing it I thought. But yes, it still uses a nested form – but not as bad as it could be. The form has an iframe with the new form, so the main form’s submit button just tells the iframe’s form to submit first, and only when the files are uploaded does it submit the main form. So yeah, you should at least check this out… [...]
20. Shadda | May 18th, 2007 at 1:23 am
I’ve been documenting the new php upload progress hooks in php5.2, and I’ve encountered some of the same problems you did.
Also, it’s not impossible to read the raw post data in php. By means of a hack, if you set post_max_size to 0, php won’t attempt to compile it into the $_POST variable, instead it rejects the data and you can retroactively read and parse the data from STDIN. My example of this can be found at http://mycrap.carbonix.us/uploadProgress/
My newest example is at http://carbonix.us/uploadProgress/ (Firefox 1.5+). I’m trying to get this method to work in IE however the problem I’ve encountered is that Internet Explorer blocks any Http Requests while form data is being submitted (even to an iframe), whereas other browsers do not.
Basically, it waits until all form data has been sent to the server before allowing any subrequests to take place. The solution is to keep the form itself in an iframe, however like you said this is incredibly annoying and harder to “plug in”. Would be very interested in hearing any methods you’ve found to bypass this problem.
21. Frode | July 18th, 2007 at 11:56 pm
Hi there,
This would be really cool if it could be easily integrated into a framework, like Pylons for example.
I will try to integrate it into my Pylons app, but it would probably be best if the original programmer(s) submitted it, as you understand it MUCH better than I do.
Cheers,
Frode