2011 March

AppEngine vs. EC2 (an attempt to compare apples to oranges)

This is an expanded version of my answer on Quora to the question “You’ve used both AWS and GAE for startups: do they scale equally well in terms of availability and transaction volume?”:

I’ve had the opportunity to spend time building products on both EC2 (three years at DotSpots) and AppEngine (the last year working on Gri.pe). At DotSpots, we started using EC2 in the early days, back when it was just a few services: EC2, S3, SQS and SDB. It grew a great deal in those years, tacking on a number of useful services, some of which we used (<3 CloudFront) and a number that we didn’t. Last year, around April, we switched over to building Gri.pe on AppEngine and I’ve come to appreciate how great this platform is and how much more time I spend building product rather than infrastructure. Other developers enjoy building custom infrastructure, but I’m happy to outsource it to Google.

Given these two technologies, it’s difficult to compare them directly because they are two different beasts: EC2 is a general purpose virtual machine host, while AppEngine is a very sophisticated application host. AppEngine comes with a number of services out-of-the-box. The Amazon Web Services suite adds a number of utilities to EC2 that give you access to structured, query-able storage, automatic scaling at the VM level, monitoring and other goodies too numerous to mention here.

Transaction Volume

When dealing with AppEngine, the limit to your scaling is effectively determined by how well you program to the AppEngine environment. This means that you must be aware of how transactions are processed at the entity group level and make judicious use of the AppEngine memcache service. If you program well against the architecture and best practices of AppEngine, you have the potential of scaling as well as some of Google’s own properties.
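
As a rough illustration of the caching side of this, here is a minimal read-through cache sketch. It is not Gri.pe’s persistence code and it does not use the real AppEngine APIs: `ReadThroughStore`, `slowStore` and `cache` are hypothetical names, and plain in-memory maps stand in for the datastore and memcache so that the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: plain maps stand in for the datastore and memcache.
class ReadThroughStore {
  private final Map<String, String> slowStore = new HashMap<>(); // stand-in for the datastore
  private final Map<String, String> cache = new HashMap<>();     // stand-in for memcache
  private int slowReads = 0; // counts how often a read falls through to the slow store

  // Writes go to the store first, then refresh the cache so hot entities stay hot.
  void put(String key, String value) {
    slowStore.put(key, value);
    cache.put(key, value);
  }

  // Reads try the cache first and only hit the slow store on a miss.
  String get(String key) {
    String value = cache.get(key);
    if (value == null) {
      slowReads++;
      value = slowStore.get(key);
      if (value != null) {
        cache.put(key, value); // repopulate so the next read is cheap
      }
    }
    return value;
  }

  // Simulates the cache dropping an entry, which a cache may do at any time.
  void evict(String key) {
    cache.remove(key);
  }

  int slowReads() {
    return slowReads;
  }
}
```

With something along these lines, a page that reads the same entity several times falls through to the slow store at most once after each eviction.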

Here’s an example of one of the more expensive pages we render at Gri.pe. Our persistence code automatically keeps entities hot in memcache to avoid hitting the datastore more than a few times while rendering a page:

On EC2, scaling is entirely in your hands. If you are a MySQL wizard, you can scale that part of the stack well. Scaling is limited half by the constraints of the boxes you rent from Amazon and half by your skill and creativity in making them perform well. At DotSpots, we spent a fair bit of time scaling MySQL up for our web crawling activities. We built out custom infrastructure for serving key/value data fast. There was infrastructure all over the place just to keep up with what we wanted to do. Our team put a lot of work into it, but at the end of the day, it was fast.

It’s my opinion that your potential for scaling on AppEngine is much higher for a given set of resources, if your application fits within the constraints of the “ideal application set” for AppEngine. There are some upcoming technologies that I’m not allowed to talk about right now that will expand this set of ideal applications for AppEngine dramatically.

Reliability and availability

As for reliability and availability, it’s not an exact comparison here again. On EC2, an instance will fail from time-to-time for no apparent reason. In some cases it’s just a router failure and the instance comes back a few minutes to a few hours later. Other times, the instance just dies, taking its ephemeral state with it. In our large DotSpots fleet, we’d see a machine lock up or disappear somewhere around once a month or so. The overall failure rate here was pretty low, but enough that you need to stay on your toes with monitoring and backups. We did have a catastrophic failure while using Elastic Block Store that effectively wiped out all of the data we were storing on it - that was the last time we used EBS (in fairness, that was in the early days of EBS and this is probably not as likely to happen again).

On AppEngine, availability is a bit of a different story. Up until the new High-replication datastore, you’d be forced to go down every time the Datastore went into maintenance - a few hours a month. With the new HR datastore, this downtime is effectively gone, in exchange for slightly higher transaction processing fees on Datastore operations. These fees are negligible overall and definitely worth the tradeoff for increased reliability.

AppEngine had some rough patches for Datastore reliability around last September, but these have pretty much disappeared for us. Google’s AppEngine team has been working hard behind the scenes to keep it ticking well. There are some mysterious failures in application publishing from time-to-time on AppEngine. They happen for a few minutes to a few hours at a time, then get resolved as someone internally at Google fixes it. These publish failures don’t affect your running code - just your ability to publish new code. We’re doing continuous deployment on AppEngine, so this affects us more than others.

If you measure reliability in terms of the stress imposed on developers keeping the application running, AppEngine is a clear winner in my mind. If you measure it by the time that your application is unavailable due to forces beyond your control, EC2 wins (but only by a small amount, and by a much smaller margin when comparing against the HR datastore).

Follow me (@mmastrac) on Twitter.

GWT 2.2 and java.lang.IncompatibleClassChangeError

I’ve been updating Gri.pe to the latest versions of the various libraries we use and ran into some trouble while attempting to update GWT to 2.2. There were some incompatible bytecode changes made: the classes in the com.google.gwt.core.ext.typeinfo package were converted to interfaces. Java has a special invoke opcode for interface types, so libraries compiled against the older GWT would fail to load with this obscure error when the verifier tried to load their classes:

java.lang.IncompatibleClassChangeError: Found interface com.google.gwt.core.ext.typeinfo.JClassType, but class was expected

There are updated libraries available for some third-party GWT libraries, but others (like the gwt-google-apis) haven’t been updated yet.

The solution suggested by others was to manually recompile these against the latest GWT libraries to fix the problem. I thought I’d try a different approach: bytecode rewriting. The theory is pretty simple. Scan the opcodes of every method in a jar and replace the opcode used to invoke a method on a class, INVOKEVIRTUAL, with the opcode for invoking a method on interfaces, INVOKEINTERFACE. More info on Java opcodes is available here.

The ASM library makes this sort of transformation trivial to write. Compile the following code along with the asm-all-3.3.1.jar and plug in the file you want to transform in the main method. It’ll spit out a version of the library that should be compatible with GWT 2.2. You might need to add extra classes to the method rewriter, depending on the library you are trying to rewrite:

package com.gripeapp.bytecoderewrite;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

import org.objectweb.asm.ClassAdapter;
import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodAdapter;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

public class BytecodeRewrite {
  static class ClassVisitorImplementation extends ClassAdapter {
    public ClassVisitorImplementation(ClassVisitor cv) {
      super(cv);
    }

    @Override
    public MethodVisitor visitMethod(int access, String name, String desc,
        String signature, String[] exceptions) {
      // Wrap each method's visitor so its invoke instructions can be rewritten.
      MethodVisitor mv = cv.visitMethod(access, name, desc, signature, exceptions);
      if (mv != null) {
        mv = new MethodVisitorImplementation(mv);
      }
      return mv;
    }
  }

  static class MethodVisitorImplementation extends MethodAdapter {

    public MethodVisitorImplementation(MethodVisitor mv) {
      super(mv);
    }

    @Override
    public void visitMethodInsn(int opcode, String owner, String name,
        String desc) {
      // Owners that were converted from classes to interfaces in GWT 2.2.
      boolean convertedOwner = owner.equals("com/google/gwt/core/ext/typeinfo/JClassType")
          || owner.equals("com/google/gwt/core/ext/typeinfo/JPackage")
          || owner.equals("com/google/gwt/core/ext/typeinfo/JMethod")
          || owner.equals("com/google/gwt/core/ext/typeinfo/JType")
          || owner.equals("com/google/gwt/core/ext/typeinfo/JParameterizedType")
          || owner.equals("com/google/gwt/core/ext/typeinfo/JParameter");
      // Only INVOKEVIRTUAL calls are swapped; other invoke opcodes pass through untouched.
      if (opcode == Opcodes.INVOKEVIRTUAL && convertedOwner) {
        super.visitMethodInsn(Opcodes.INVOKEINTERFACE, owner, name, desc);
      } else {
        super.visitMethodInsn(opcode, owner, name, desc);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    // Plug in the jar to transform and the destination for the rewritten copy.
    ZipInputStream file = new ZipInputStream(new FileInputStream("/tmp/gwt-maps-1.1.0/gwt-maps.jar"));
    ZipOutputStream output = new ZipOutputStream(new FileOutputStream("/tmp/gwt-maps.jar"));
    ZipEntry ze;

    while ((ze = file.getNextEntry()) != null) {
      output.putNextEntry(new ZipEntry(ze.getName()));
      if (ze.getName().endsWith(".class")) {
        // Run each class through the rewriting visitors.
        ClassReader reader = new ClassReader(file);
        ClassWriter cw = new ClassWriter(0);
        reader.accept(new ClassVisitorImplementation(cw), ClassReader.SKIP_FRAMES);
        output.write(cw.toByteArray());
      } else {
        // Copy non-class entries through unchanged.
        byte[] data = new byte[16*1024];
        while (true) {
          int r = file.read(data);
          if (r == -1)
            break;
          output.write(data, 0, r);
        }
      }
      output.closeEntry();
    }

    file.close();
    output.close();
  }
}
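
If a library needs more typeinfo classes added to the rewriter, the chain of `owner.equals(...)` checks gets unwieldy. One possible way to organize it is a set lookup, sketched here without ASM so that it stands alone. `OpcodeRemapper` and `remap` are hypothetical names; the values 182 and 185 are the JVM’s `INVOKEVIRTUAL` and `INVOKEINTERFACE` opcodes, the same values ASM exposes through `Opcodes`. Note that this sketch only remaps `INVOKEVIRTUAL` calls and leaves every other invoke opcode alone.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: decides which opcode a method call should use after the rewrite.
class OpcodeRemapper {
  static final int INVOKEVIRTUAL = 182;   // 0xb6 in the JVM specification
  static final int INVOKEINTERFACE = 185; // 0xb9 in the JVM specification

  private static final String TYPEINFO = "com/google/gwt/core/ext/typeinfo/";

  // Owners taken from the rewriter above; extend this list for other libraries.
  private static final Set<String> CONVERTED = new HashSet<>(Arrays.asList(
      TYPEINFO + "JClassType",
      TYPEINFO + "JPackage",
      TYPEINFO + "JMethod",
      TYPEINFO + "JType",
      TYPEINFO + "JParameterizedType",
      TYPEINFO + "JParameter"));

  // Returns INVOKEINTERFACE for virtual calls on converted owners, else the opcode unchanged.
  static int remap(int opcode, String owner) {
    if (opcode == INVOKEVIRTUAL && CONVERTED.contains(owner)) {
      return INVOKEINTERFACE;
    }
    return opcode;
  }
}
```

Inside `visitMethodInsn`, the call would then become `super.visitMethodInsn(OpcodeRemapper.remap(opcode, owner), owner, name, desc)`.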

Abusing the HTML5 History API for fun (and chaos)

Disclaimer: this post is intended to be humorous and the techniques described in it are not recommended for use on any website that wishes to keep visitors.

I’ve been waiting for a chance to play with the HTML5 History API since seeing it on Google’s 20thingsilearned.com. After reading Dive into HTML5’s take on the History API today (thanks to a pointer by Mark Trapp), I finally came up with a great way to play around with the API.

Teaser: it turns out that this API can be used for evil on WebKit browsers. I’ll get to that after the fun part.

The web of the 90’s was a much more “interesting” place. Animated GIF icons were the rage, loud pages and <blink> were the norm, and our friend the <marquee> tag gave us impossible-to-read scrolling messages.

If you can see this moving, your browser is rocking the marquee tag

Over time, the web grew up and these animations died out. Marquees made a short comeback as animated <title>s, but thankfully those never caught on.

The Modern Browser

The world of the 2011 browser is much different from that of the 90’s browser. Early browsers had a tiny, cramped location bar with terrible usability. In modern browsers, the location bar, in both function and usability, is now one of the primary focuses of the UI.

Given that, it seems like this large browser UI is ripe for some exploitation. What if we could resurrect marquee, but give it all of the screen real-estate of today’s large, modern location bar?

With that in mind, I give you THE MARQUEE OF TOMORROW:

<script>
function beginEyeBleed() {
        var i = 0;
        var message = "Welcome to the marquee of tomorrow!   Bringing the marquee tag back in HTML5!   ";
        setInterval(function() {
                var scroller = message.substring(i) + message.substring(0, i);
                scroller = scroller.replace(/ /g, "-");
                window.history.replaceState('', 'scroller', '/' + scroller);
                i++;
                i %= message.length;
        }, 100);
}
</script>

(if you’re viewing this on my blog in an HTML5-capable browser)

And now for the evil part.

It turns out that some browsers weren’t designed to handle this level of awesomeness. Trying to navigate away from a page that turns this on can be tricky. Firefox 4 works well: just type something into the location bar and the bar stops animating. WebKit-based browsers are a different story.

On Safari, every replaceState call removes what you’ve typed into the location bar and replaces it with the new data. This makes it impossible to navigate away by typing something else into the bar. Clicking one of your bookmarks will take you away from the page, however.

On Chrome, the same thing happens, but it’s even worse. Every replaceState call not only wipes out the location bar, but it cancels navigate events too. Clicking on bookmarks has no effect. The only recourse is to close the tab at this point.

Edit: comments on this post have noted that you can refresh, or go back to get off the page. You can also click a link on the page to navigate away.

Edit: the location bar seems to be fine in Chrome on non-Mac platforms. Ubuntu is reportedly OK and Windows 7 performed OK in my test just now. The bookmark navigation issue is still a problem on the other platforms.

Overall, the new HTML5 history marquee is a pretty effective way of drawing eyeballs and, in some cases, forcing longer pageview times.

I filed issue 75195 for Chromium on this. I’ll file one for Safari as well.
