Archive for the ‘development’ Category

Abusing the HTML5 History API for fun (and chaos)

Monday, March 7th, 2011

Disclaimer: this post is intended to be humorous and the techniques described in this post are not recommend for use for any website that wishes to keep visitors.

I’ve been waiting for a chance to play with the HTML5 History APIs since seeing it on Google’s After reading Dive into HTML5’s take on the History API today (thanks to a pointer by Mark Trapp), I finally came up with a great way to play around with the API.

Teaser: it turns out that this API can be used for evil on WebKit browsers. I’ll get to that after the fun part.

The web of the 90’s was a much more “interesting” place. Animated GIF icons were the rage, loud pages and <blink> were the norm and our friend, the <marquee> tag used to give us impossible-to-read scrolling messages.

If you can see this moving, your browser is rocking the marquee tag

Over time, the web grew up and these animations died out. Marquees made a short comeback as animated <title>s, but thankfully those never caught on.

The Modern Browser

The world of the 2011 browser is much different than the browser of the 90’s. Early browsers had a tiny, cramped location bar with terrible usability. In modern browsers, the location, function and usability bar is now the one of the primary focuses of the UI.

Given that, it seems like this large browser UI is ripe for some exploitation. What if we could resurrect marquee, but give it all of the screen real-estate of today’s large, modern location bar?

With that in mind, I give you THE MARQUEE OF TOMORROW:

function beginEyeBleed() {
        var i = 0;
        var message = "Welcome to the marquee of tomorrow!   Bringing the marquee tag back in HTML5!   ";
        setInterval(function() {
                var scroller = message.substring(i) + message.substring(0, i);
                scroller = scroller.replace(/ /g, "-");
                window.history.replaceState('', 'scroller', '/' + scroller);
                i %= message.length;
        }, 100);

(if you’re viewing this on my blog in an HTML5-capable browser)

And now for the evil part.

It turns out that some browsers weren’t designed to handle this level of awesomeness. Trying to navigate away from a page that turns this on can be tricky. Firefox 4 works well: just type something into the location bar and the bar stop animating. WebKit-based browsers are a different story.

On Safari, every replaceState call removes what you’ve typed into the location bar and replaces it with the new data. This makes it impossible to navigate away by typing something else into the bar. Clicking one of your bookmarks will take you away from the page, however.

On Chrome, the same thing happens, but it’s even worse. Every replaceState call not only wipes out the location bar, but it cancels navigate events too. Clicking on bookmarks has no effect. The only recourse is to close the tab at this point.

Edit: comments on this post have noted that you can refresh, or go back to get off the page. You can also click a link on the page to navigate away.

Edit: the location bar seems to be find in Chrome on non-Mac platforms. Ubuntu is reportedly OK and Windows 7 performed OK in my test just now. The bookmark navigation issue is still a problem on the other platforms.

Overall, the new HTML5 history marquee is a pretty effective way of drawing eyeballs and, in some cases, forcing longer pageview times.

I filed issue 75195 for Chromium on this. I’ll file one for Safari as well.

Follow me (@mmastrac) on Twitter and let us know what you think of!

View/vote on HackerNews

(this is Thing A Week #7)


GWT 2.2 and java.lang.IncompatibleClassChangeError

Thursday, March 3rd, 2011

I’ve been updating to the latest versions of the various libraries we use and ran into some trouble while attempting to update GWT to 2.2. There were some incompatible bytecode changes made: the classes in the package were converted to interfaces. Java has a special invoke opcode for interface types, so the classes would fail to load with this obscure error when the verifier tried to load the classes:

java.lang.IncompatibleClassChangeError: Found interface, but class was expected

There are updated libraries available for some third-party GWT libraries, but others (like the gwt-google-apis) haven’t been updated yet.

The solution suggested by others was to manually recompile these against the latest GWT libraries to fix the problem. I thought I’d try a different approach: bytecode rewriting. The theory is pretty simple. Scan the opcodes of every method in a jar and replace the opcode used to invoke a method on a class, INVOKEVIRTUAL, with the opcode for invoking a method on interfaces, INVOKEINTERFACE. More info on Java opcodes is available here.

The ASM library makes this sort of transformation trivial to write. Compile the following code along with the asm-all-3.3.1.jar and plug in the file you want to transform in the main method. It’ll spit out a version of the library that should be compatible with GWT 2.2. You might need to add extra classes to the method rewriter, depending on the library you are trying to rewrite:

package com.gripeapp.bytecoderewrite;


import org.objectweb.asm.ClassAdapter;
import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodAdapter;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

public class BytecodeRewrite {
  static class ClassVisitorImplementation extends ClassAdapter {
    public ClassVisitorImplementation(ClassVisitor cv) {

    public MethodVisitor visitMethod(int access, String name, String desc,
              String signature, String[] exceptions) {
        MethodVisitor mv;
        mv = cv.visitMethod(access, name, desc, signature, exceptions);
        if (mv != null) {
          mv = new MethodVisitorImplementation(mv);
        return mv;

  static class MethodVisitorImplementation extends MethodAdapter {

    public MethodVisitorImplementation(MethodVisitor mv) {

    public void visitMethodInsn(int opcode, String owner, String name,
        String desc) {
      if (owner.equals("com/google/gwt/core/ext/typeinfo/JClassType")
          || owner.equals("com/google/gwt/core/ext/typeinfo/JPackage")
          || owner.equals("com/google/gwt/core/ext/typeinfo/JMethod")
          || owner.equals("com/google/gwt/core/ext/typeinfo/JType")
          || owner.equals("com/google/gwt/core/ext/typeinfo/JParameterizedType")
          || owner.equals("com/google/gwt/core/ext/typeinfo/JParameter")) {
        super.visitMethodInsn(Opcodes.INVOKEINTERFACE, owner, name, desc);
      } else
        super.visitMethodInsn(opcode, owner, name, desc);

  public static void main(String[] args) throws IOException {
    ZipInputStream file = new ZipInputStream(new FileInputStream("/tmp/gwt-maps-1.1.0/gwt-maps.jar"));
    ZipEntry ze;

    ZipOutputStream output = new ZipOutputStream(new FileOutputStream(("/tmp/gwt-maps.jar")));

    while ((ze = file.getNextEntry()) != null) {
      output.putNextEntry(new ZipEntry(ze.getName()));
      if (ze.getName().endsWith(".class")) {
        ClassReader reader = new ClassReader(file);
        ClassWriter cw = new ClassWriter(0);
        reader.accept(new ClassVisitorImplementation(cw), ClassReader.SKIP_FRAMES);
      } else {
        byte[] data = new byte[16*1024];
        while (true) {
          int r =;
          if (r == -1)
          output.write(data, 0, r);


AppEngine vs. EC2 (an attempt to compare apples to oranges)

Tuesday, March 1st, 2011

This is an expanded version of my answer on Quora to the question “You’ve used both AWS and GAE for startups: do they scale equally well in terms of availability and transaction volume?”:

I’ve had an opportunity to spend time building products in both EC2 (three years at DotSpots and the last year working on At DotSpots, we started using EC2 in the early days, back when it was just a few services: EC2, S3, SQS and SDB. It grew a great deal in those years, tacking on a number of useful services, some which we used (<3 CloudFront) and a number that we didn’t. Last year, around April, we switched over to building on AppEngine and I’ve come to appreciate how great this platform is and how much more time I spend building product rather than infrastructure. Other developers enjoy building custom infrastructure, but I’m happy to outsource it to Google.

Given these two technologies, it’s difficult to directly compare the them because they are two different beasts: EC2 is a general purpose virtual machine host, while AppEngine is a very sophisticated application host. AppEngine comes with a number of services out-of-the-box. The Amazon Web Services suite tacks on a number of various utilities to EC2 that give you access to structured, query-able storage, automatic scaling at the VM level, monitoring and other goodies too numerous to mention here.

Transaction Volume

When dealing with AppEngine, the limit to your scaling is effectively determined by how well you program to the AppEngine environment. This means that you must be aware of how transactions are processed at the entity group level and make judicious use of the AppEngine memcache service. If you program well against the architecture and best practices of AppEngine, you have the potential of scaling as well as some of Google’s own properties.

Here’s an example of one of the more expensive pages we render at Our persistence code automatically keeps entities hot in memcache to avoid hitting the datastore more than a few times while rendering a page:

On EC2, scaling is entirely in your hands. If you are a MySQL wizard, you can scale that part of the stack well. Scaling is half limited only by the constraints of the boxes you rent from Amazon behalf by your skill and the other half by creativity in making them perform well. At DotSpots, we spent a fair bit of time scaling MySQL up for our web crawling activities. We built out custom infrastructure for serving key/value data fast. There was infrastructure all over the place just to keep up with what we wanted to do. Our team put a lot of work into it, but at the end of the day, it was fast.

It’s my opinion that your potential for scaling on AppEngine is much higher for a given set of resources, if your application fits within the constraints of the “ideal application set” for AppEngine. There are some upcoming technologies that I’m not allowed to talk about right now that will expand this set of ideal applications for AppEngine dramatically.

Reliability and availability

As for reliability and availability, it’s not an exact comparison here again. On EC2, an instance will fail from time-to-time for no reason. In some cases it’s just a router failure and the instance comes back in a few minutes to a few hours later. Other times, the instance just dies, taking its ephemeral state with it. In our large DotSpots fleet, we’d see a machine lock up or disappear somewhere around once a 1 month or so. The overall failure rate here was pretty low, but enough that you need to keep on your toes for monitoring and backups. We did have a catastrophic failure while using Elastic Block Store that effectively wiped out all of the data we were storing on it – that was the last time we used EBS (in fairness, that was in the early days of EBS and this probably not as likely to happen again).

On AppEngine, availability is a bit of a different story. Up until the new High-replication datastore, you’d be forced to go down every time the Datastore went into maintenance – a few hours a month. With the new HR datastore, this downtime is effectively gone, in exchange for slightly higher transaction processing fees on Datastore operations. These fees are negligible overall and definitely worth the tradeoff for increased reliability.

AppEngine had some rough patches for Datastore reliability around last September, but these have pretty much disappeared for us. Google’s AppEngine team has been working hard behind the scenes to keep it ticking well. There are some mysterious failures in application publishing from time-to-time on AppEngine. They happen for a few minutes to a few hours at a time, then get resolved as someone internally at Google fixes it. These publish failures don’t affect your running code – just your ability to publish new code. We’re doing continuous deployment on AppEngine, so this affects us more than others.

If you measure reliability in terms of the stress imposed on developers keeping the application running, AppEngine is a clear winner in my mind. If you measure it by the time that your application isn’t unavailable from forces beyond your control, EC2 wins (but only by a small amount, and by a much smaller margin when comparing the HR datastore).

Follow me (@mmastrac) on Twitter and let us know what you think of!

View/vote on HackerNews

(this is Thing A Week #6)

Solving the Carwoo code-breaking challenge

Sunday, December 19th, 2010

Erik at Carwoo posted an interesting codebreaking challenge to the Carwoo blog earlier today.

This particular challenge piqued my curiosity. One of my hobbies is rooting Android phones and I enjoy all sorts reverse-engineering problems. More importantly, this one happened to match up with some downtime I had this afternoon.

I fired up Python and quickly got to work. At my day job,, we use Java. I love Java as a development language, but I find that Python’s interactive mode is best for quickly experimenting with data in unknown formats. There’s a great deal of power in the standard Python libraries and it’s well-suited for quick byte and bit manipulation.

The first thing I did was to take a look at the individual bytes of the base64 decoded data:

>> min(s), max(s)

The range of 32 (‘ ‘) to 90 (‘Z’) suggested that the bytes had been coerced to a printable range from their natural, though binary-uncomfortable, base-59 representation. I worked from this part on assuming that this was true as it held for the challenge itself, as well as the various sample encodings:

>>> qbf
[1, 24, 38, 18, 30, 16, 43, 40, 15, 39, 35, 29, 28, 46, 48, 32, 33, 7, 19, 12, 45, 2, 1, 39, 4, 45, 45, 53, 23, 56, 41, 14, 47, 45, 23, 37, 47, 49, 34, 53, 43, 44, 13, 48, 54, 34, 30, 7, 22, 32, 52, 0, 27, 10, 13, 35, 51, 14, 21, 34, 5, 15, 27, 2, 44, 7, 47, 49, 31, 14, 6, 49, 3, 49, 24, 1, 25, 33, 15, 16, 55, 21, 46, 50, 20, 40, 6, 52, 19, 28, 32, 51, 20, 47, 14, 16, 12, 48, 15, 15, 42, 1, 21, 21, 17, 35, 21, 22, 7, 15, 22, 13, 49, 9, 23, 40, 9, 42, 27, 23, 20, 1, 25, 35, 57, 55, 34, 53, 16, 49, 55, 21, 17, 33, 55, 20, 18, 27, 13, 36, 7, 27, 47, 33, 40, 34, 29, 57, 55, 45, 38, 43, 46, 45, 23, 43, 22, 34, 30, 30, 30, 39, 0, 27, 42, 43, 54, 4, 41, 54, 20, 3, 42, 49, 29, 27, 56, 53, 6, 29, 11, 8, 57, 52, 32, 10, 7, 9, 11, 5, 2, 9, 2, 23, 12, 3, 39, 4, 56, 30, 18, 42, 52, 40, 2, 32, 1, 51, 12, 30, 53, 11, 52, 33, 54, 31, 22, 5, 9, 20, 54, 53, 18, 11, 28, 3, 9, 55, 18, 5, 29, 53, 49, 42, 40, 50, 23, 58, 34, 23, 43, 32, 31, 33, 19, 1, 2, 41, 14, 31, 42, 37, 17, 50, 42, 34, 12, 56, 5, 1, 38, 7, 31, 30, 44, 31, 29, 56, 56, 14, 7, 4, 50, 7, 56, 56, 2, 50, 8, 52, 33, 53, 14, 53, 16, 26, 23, 20, 26, 14, 2, 32, 25, 45, 25, 37, 47, 41, 39, 39, 26, 35, 25, 18, 31, 43, 58, 9, 34, 18, 57, 6, 30, 29, 55, 19, 42, 13, 42, 12, 6, 8, 53, 24, 7, 15, 40, 7, 23, 25, 58, 33, 53, 35, 53, 37, 28, 49, 32, 35, 43, 39, 24, 9, 56, 12, 25, 5, 28, 27, 31, 47, 48, 47, 43, 40, 51, 28, 16, 2, 58, 36, 33, 33, 53, 50, 25, 18, 1, 10, 28, 33, 39, 38, 45, 23, 10, 49, 23, 28, 12, 17, 50]

It didn’t appear to be a simple encoding as the number of bits wasn’t a nice, round number:

>>> len(qbf)/43.

I explored the possibility that it was a base-59 encoded version of something as well, but nothing seemed to pop out (none of the base-59 numbers had promising base-2, 10 or 16 representations):

>>> reduce(lambda x,y: x*59+y, qbf)
>>> hex(reduce(lambda x,y: x*59+y, qbf))
>>> bin(reduce(lambda x,y: x*59+y, qbf))

I also used numpy’s histogram function to see if there were signs of a simple substitution. While there were spikes of various values, nothing seemed to be significant when you compared the histograms of the various encodings of “THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG”.

The fact that more than eight output positions on average represented a single input position strongly suggested that there was some sort of reduction involved in decoding. I tried a number of different approaches with bit counting and different groupings, but none yielded any promising results.

The breakthrough came about a minute after Erik posted his last hint:

If you base64 decode this


The first 5 characters are “!8F2>”

Those 5 characters represent the first “T”. Another “T” won’t necessarily result in the same code. Actually, the odds of ever seeing that code again for a “T” is low.

I ran through a few quick tests and discovered the first part of the solution: the sum of the first five characters, modulo 59, equalled the first character of the output!

>>> sum(qbf[:5])
>>> sum(qbf[:5])%59
>>> ord('T')-32

I went though the other versions of the sample encoding and saw that the same held for the first five digits. I checked the next five and didn’t see the correlation there. It also didn’t hold for the next six. The pattern *did* hold for the next seven. Even more tantalizing was that the challenge itself appeared to follow the same pattern:

>>> chr(sum(orig[:5])%59+32)
>>> chr(sum(orig[5:12])%59+32)

At this point I figured that I’d write a quick program to brute force the rest of the string.

from base64 import *
qbf = [ord(x)-32 for x in b64decode("""IThGMj4wS0gvR0M9PE5QQEEnMyxNIiFHJE1NVTdYSS5PTTdFT1FCVUtMLVBW Qj4nNkBUIDsqLUNTLjVCJS87IkwnT1E/LiZRI1E4ITlBLzBXNU5SNEgmVDM8 QFM0Ty4wLFAvL0ohNTUxQzU2Jy82LVEpN0gpSjs3NCE5Q1lXQlUwUVc1MUFX

orig = [ord(x)-32 for x in b64decode("""QCgyPSghK1kwTlRJO0MrVTtZLUxWLCg3MypEUCNQJjNGTlMuIyU8TzwpSDNP ND0tRFg7RkUoJjotJzQwIz8pJUFCODtQI0AyPCFaTkNTUyQ/PUgqN0VKLCEi VUxWRjo2SUYnSiAiMkdGVCpGSCw+NDAqVyhUIEAtWTVMRiQzMi8jIiMuPyFP

output = ""

i = 0
start = 0
text_i = 0

while True:
        c = chr(sum(qbf[start:i])%59+32)
        print sum(qbf[start:i]), c, text[text_i]
        if c == text[text_i]:
                print "Found at %d-%d %d" % (start,i,i-start)
                output += chr(sum(orig[start:i])%59+32)
                start = i
                i = i + 4
                text_i = text_i + 1
                print output

        i = i + 1

The output from this test program yielded most of the solution:

0   T
1 ! T
25 9 T
63 $ T
81 6 T
111 T T
Found at 0-5 5
153 C H
188 + H
217 H H
Found at 5-12 7
187 * E
194 1 E
213 D E
225 P E
270 B E
272 D E
273 E E
Found at 12-23 11

... [a few dozen lines snipped] ...

Found at 347-360 13
213 D D
Found at 360-365 5
104 M O
132 . O
165 O O

The solution wasn’t perfect at this point, but when you look at the output, the pattern becomes pretty clear:

Found at 0-5 5
Found at 5-12 7
Found at 12-23 11
Found at 23-36 13
Found at 36-41 5
Found at 41-48 7
Found at 48-59 11
Found at 59-72 13
Found at 72-77 5
Found at 77-84 7
Found at 84-95 11
Found at 95-108 13

From the pattern, I quickly whipped up a solution. It’s not the most elegant bit of Python, but the results are the most important part:

from base64 import *


def decode(code):
        values = [ord(a)-32 for a in b64decode(code)]
        offsets = [5,7,11,13]*len(code) # note: won't need all of these, but guaranteed more than enough
        decoded = ""
        while len(values):
                decoded += chr(sum(values[:offsets[0]])%59+32)
                values = values[offsets[0]:]
                offsets = offsets[1:]
        return decoded

print decode(qbf)
print decode(orig)

The output, which I posted quickly to the blog and the HN story:


And thanks to Mark Trapp, the classic Seinfeld scene:

AppEngine: “Peace-of-mind scalability”

Thursday, November 25th, 2010

I had a lot of great feedback on my AppEngine post the other day. We put our trust in the AppEngine team to keep things running smoothly while we work on our app and live the rest of our lives. Today is pretty quiet around the Gripe virtual offices (aka Skype channels): it’s Thanksgiving in US and I’m getting hit by a cold too hard to do much today besides write a short post.

We had a great surprise this morning. The View episode we were featured in was in re-runs this morning and we had a huge traffic spike. Nobody noticed until about 30 minutes in, since  everything was scaling automatically and the site was still serving up as quickly as ever:

This is a whole new way to build a startup: no surprise infrastructure worries, no pager duty, no getting crushed by surprise traffic. It’s peace-of-mind scalability.

Now back to fighting my cold, without the worry of our site’s load hanging over my head. For those of you in the US, enjoy your thanksgiving and watch out for Turkey drops!

WKRP Turkey Drop from Mitch Cohen on Vimeo.

Why we’re really happy with AppEngine (and not going anywhere else)

Tuesday, November 23rd, 2010

There’s been a handful of articles critical of Google’s AppEngine that have bubbled up to the top of Hacker News lately. I’d like to throw our story into the ring as well, but as a story of a happy customer rather than a switcher.

We’ve been building out our product,, since last May, after pivoting from our previous project, DotSpots. The only resource available to develop Gripe in the early days was myself. I’d been playing with AppEngine on and off, creating a few small applications to get a feel for AppEngine’s strengths and weaknesses since its initial Python release. We were also invited to the early Java pre-release at DotSpots, but it would have been too much effort to make the switch from our Spring/MySQL platform to AppEngine.

My early experiments on AppEngine near its first release showed that it was promising, but was still an early release product. Earlier this year, I started work on a small personal-use aggregator that I’ve always wanted to write. I targeted AppEngine again and I was pleasantly surprised at how far the platform had matured. It was ready for us to test further if we wanted to tackle projects with it.

Shortly after that last experiment, one of our interns at DotSpots came to us with an interesting idea. A social, mobile complaint application that we eventually named Gri,pe. We picked AppEngine as our target platform for the new product, given the platforms new maturity. It also helped that as the sole developer on the first part of the project, I wanted to focus on building the application rather than spending time building out EC2 infrastructure and ops work that goes along with productizing your startup idea. I prototyped it on the side for a few months with our designer and once we determined that it was a viable product, we decided to focus more of the company’s effort on it.

There were a number of great bonuses to choosing AppEngine as well. We’ve been wanting to get out of the release-cycle treadmill that was killing us at DotSpots and move to a continuous deployment environment. AppEngine’s one-liner deployment and automated versioning made this a snap (I hope to detail our Hudson integration another blog post). The new task queue functionality in AppEngine let us do stuff asynchronously as we always wanted to do at DotSpots, but found to be awkward to automate with existing tools like Quartz. The AppEngine Blobstore does the grunt work of dealing with our image attachments without us having to worry about signing S3 requests (in fairness, we’re using S3 signed requests for our new video upload feature, but the Blobstore let us launch image attachments with a single day’s work).

When it came time for us to launch at TechCrunch 50 this year, I was a bit concerned about how AppEngine would deal with the onslaught of traffic. The folks on the AppEngine team assured me that as long as we weren’t doing anything to cause a lot of write contention on single entity groups, we’d scale just fine. And scale we did:

In the days after our launch, AppEngine hit a severe bit of datastore turbulence. There was the occasional latency spike on AppEngine while we were developing, but late September/early October was much rougher. Simple queries that normally took 50ms would skyrocket up to 5s. Occasionally they would even time out. Our application was still available, but we were seeing significant error rates all over. We considered our options at that point, and decided to stick it out.

Shortly after the rough period started, the AppEngine team fixed the issues. And shortly after that, a bit of datastore maintenance chopped the already good latencies down even further. It’s been smooth sailing since then and the AppEngine team has been committed to improving the datastore situation even more as time goes on:

We didn’t jump ship on AppEngine for one rough week because we knew that their team was committed to fixing things. We’ve also had our rough weeks with the other cloud providers. In 2009, Amazon lost one our EBS drives while we were prototyping DotSpots infrastructure on it. Not just a crash with dataloss, but actually lost. The whole EBS volume was no longer available at all. We’ve also had weeks where EC2 instances had random routing issues between instances, instances lock up or get wedged with no indication of problems on Amazon’s side. Slicehost had problems with our virtual machines losing connectivity to various parts of the globe.

“Every cloud provider has problems” isn’t an excuse for any provider to get sloppy. It’s an understanding I have as the CTO of a company that bases its future on any platform. No matter which provider we choose, we are putting some faith in a third-party that they will solve the problems as they come up. As a small startup, it makes more sense for us to outsource management of IT issues than to spend 1/2 of an engineer’s time dealing with this. We’ve effectively hired them to deal with managing the hard parts of our scaleability infrastructure and they are working for a fraction of what it would cost to do this ourselves.

Putting more control in the hands of a third party means that you have to give up the feeling of being in control of every aspect of your startup. If your self-managed colo machine dies, you might be down for hours while you get your hands dirty fixing, reinstalling or repairing it. When you hand this off to Google (or Amazon, or Slicehost, or Heroku), you give up the ability to work through a problem yourself. It took some time for me to get used to this feeling, but the AppEngine team has done amazing work in gaining the trust of our organization.

Since that rough week in September, we’ve had fantastic service from AppEngine. Our application has been available 100% of the time and our page rendering times are way down. We’re committing 100% of our development resources to banging out new features for and not having to worry about infrastructure.

On top of the great steady-state service, we were mentioned on ABC’s The View and had a massive surge in traffic which was handled flawlessly by AppEngine. It transparently scaled us up to 20+ instances that handled all of the traffic without a sweat. In the end, this surge cost us less than $1.00:

There’s a bunch of great features for AppEngine in the pipeline, some of which you can see in the 1.4 prerelease SDK and others that aren’t publicly available yet but address many of the issues and shortcomings of the AppEngine platform.

If you haven’t given AppEngine a shot, now is a great time.

Post-script: We do still use EC2 as part of our infrastructure. We have a small nginx instance set up to redirect our naked domain from to and deal with some other minor items. We also have EC2 boxes that run Solr to deal with some search infrastructure. We talk to Solr from GAE using standard HTTP. As mentioned in the article above, we also use S3 for video uploads and transcoding.

Follow @mmastrac on Twitter and let us know what you think of!

More on Cross-Domain Transports

Monday, January 4th, 2010

I previously discussed using as a transport last year, but there wasn’t enough information and theory in the previous post to implement your own transport. I’ll fill in the gaps a bit more here to help developers looking to implement this themselves. You can also see the implementation we’re using at DotSpots in the gwt-rpc-plus project: Cross­Domain­Frame­ and Cross­Domain­Frame­Transport­

The process of making a request is described below. Note that you can inspect the HTML and HTTP traffic on my blog to see how our DotSpots plugin performs the cross-domain requests.

  1. Create the iframe and form to post. In IE, we use ActiveXObject as previously discussed to work around the machine-gun navigation sounds. You’ll need to stash a reference to the ActiveXObject somewhere where IE won’t garbage collect it.
    if ("ActiveXObject" in window) {
        var doc = new ActiveXObject("htmlfile");;
        doc.write("<html><body><iframe id='iframe'></iframe></body></html>");
        var iframe = doc.getElementById('iframe');
    } else {
        var doc = document;
        var iframe = doc.createElement('iframe');
    var form = doc.createElement('form');
  2. Set the the iframe’s name to a globally unique request ID. Using a unique request ID is very important, as two frames with the same name in Safari will result in one of them being set to a random name. = requestId; = requestId;
  3. POST the form to the endpoint, including the unique request ID and URL on the host domain. You can usually use /favicon.ico as a local path to return to. This is what gets posted by the DotSpots plugin on my blog:
    data=... (truncated JSON data)
  4. At this point, the iframe is now in a different security domain, so onload events won’t fire and window name is inaccessible.
  5. The server generates code to set the window name and redirect back to the original domain. Since the iframe is in a different security domain, the parent page won’t be able to tell that it was loaded. We redirect this iframe back to the URL requested once we’ve set the name.
    <script>'wnr-0-8908858(truncated JSON data)';
  6. The browser redirects back. When it gets back to the other domain’s favicon, onload will successfully fire and the previously set by the server will be accessible. As multiple onload and onreadystatechange events might occur during the request/response process, you should always check the for the appropriate prefix.
  7. Clean up the iframe and form. Remove the iframe and form after the request completes, but do this after a short delay. If you remove the iframe immediately after onload fires, the Firefox loading throbber will continue to animate indefinitely.

If you are considering using a transport like this for your own project, you should also consider using postMessage for uplevel browsers that support it. There are two HTTP requests required for, adding a fair bit of latency to each cross-domain request. Using postMessage reduces the number of requests to one, cutting the round-trip latency by 50%.

Your server responses when using postMessage should look like this instead:

window.parent.postMessage('wnr-0-8908858(truncated JSON data)', '*');

If you’re using GWT, all of this work is already done for you in our open-source library, gwt-rpc-plus. We’ve tested it against all of the current desktop browsers (all the way back to IE6) and the mobile browsers too (Android and iPhone).

GWT 2.0 on OSX and the ExternalBrowser RunStyle

Saturday, December 12th, 2009

GWT 2.0 ships with a new HtmlUnit-based testing system that makes testing faster, but isn’t full web-browser under the hood. It can’t perform layout, so any tests that rely on layout measurements won’t succeed.

There are a number of alternate run styles that are useful, however. Manual mode, which outputs a URL that you can connect any browser to, works well for quick testing, but is a pain when you need to keep running it over and over. RemoteWeb and Selenium automate the process of starting and stopping browsers, but require you to start an external server before testing.

There’s another run-style that isn’t well-documented, but I’ve found to be the most useful for testing locally: ExternalBrowser. It requires an executable name in the command-line, the name of the browser to start. On OSX, you can specify the fully-qualified name of the executable beneath the application bundle, or you can use a shell script to launch the browser of your choice.

Place the following script under /usr/bin named ‘safari’ and make it chmod a+x. This allows you (and ExternalBrowser) to launch Safari from the command-line’s PATH. The argument to “open” is the name of any browser that lives under your /Applications/ directory, including subdirectories.

open -W -a Safari $1

Add a Java system property “gwt.args” to your testing launch configuration in Eclipse, Ant script or other IDE. You can then specify the run style like so:

-Dgwt.args="-runStyle ExternalBrowser:safari"

Now, when you run your unit tests, you’ll see GWT launch Safari and navigate to the appropriate URL on its own. Tests will leave windows up on your screen after they complete, but you can safely close them after the run terminates.

My Presentation at Google Campfire One

Saturday, December 12th, 2009

I went down to Mountain View last week to show off the company we’ve been building, DotSpots, plug our brand-new Chrome extension and demo some of the great new features of GWT 2.0.

See it directly on YouTube here:

Kudos to the Google team for putting together such a great event. You don’t realize how much planning and preparation goes into an hour-long event like that.

Absolutizing URLs in Javascript

Tuesday, November 17th, 2009

Occasionally it becomes useful to get the absolute form path of a URL from a relative one. You might be dynamically changing links on a page, or loading scripts from the same location as scripts that have already been loaded.

Computing an absolute URL by hand is problematic: you need to deal with any <base> elements in the document and properly implement the relative path canonicalization that browsers already do.

It turns out that on standards-compliant browsers based on Gecko, WebKit or Opera, the href property of anchor elements is magical. If you assign a URL fragment to it, it comes back as a fully-qualified URL when you read it back.

Ok, that’s great, but what about IE?  Well, it turns out that IE will fully-qualify URLs returned by the href property, but only if that anchor tag was created by the parser. Using createElement(‘a’) and setting href won’t trigger this code path.  There’s a trick we can use to work around this limitation, however.  You can force IE to use the parser to create elements by assigning innerHTML of another element.  This runs the element creation through the magic construction path that correctly sets up the attribute/property mapping.

Here’s a snippet of a function that will correctly canonicalize any URL you pass to it:

function canonicalize(url) {
    var div = document.createElement('div');
    div.innerHTML = "<a></a>";
    div.firstChild.href = url; // Ensures that the href is properly escaped
    div.innerHTML = div.innerHTML; // Run the current innerHTML back through the parser
    return div.firstChild.href;

You can use this script for other interesting purposes, like determining the base path for the current page (returns “/” for “/foo” and “/foo/” for “/foo/”). It gets the relative URL for the path “_#”, which removes any anchors, query strings or filenames from the current URL.

function getBasePath() {
    return canonicalize("_#").slice(0,-2);

The above code was tested on Safari, Chrome, Firefox 3.5 and IE6.