Tag Archive for 'google app engine'

The problem with keys (and datastore portability)

Google App Engine, we need to have a talk about your datastore keys.

Your keys can have names but they cannot start with a number. Keys can also have IDs, which are numeric. We can read those, but can't set them.

When I put an entity into the datastore for the first time, you assign it a numeric ID. I'd love to be able to create an entity on a different instance with the same ID you've assigned it but I can't. I'd love to be able to create an entity with the same key that you've assigned it (again, on a separate instance -- say the local SDK or a different app), but I can't set keys directly.

The only option seems to be the most convoluted one. To keep lists of keys from breaking when data is restored to another instance, you have to (a) avoid lists of type db.Key, i.e., ListProperty(db.Key) is out, and (b) always store your own id in a separate field (in other words, two puts per entity instead of one).
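Here is a minimal sketch of that workaround in plain Python, not against the real datastore API. FakeDatastore, new_entity, portable_id and friend_ids are all hypothetical names, and the sketch swaps in a random UUID as the self-managed id rather than copying the numeric ID the datastore assigns (which is what would cost the second put). It only illustrates why references stored as your own ids survive a restore while the assigned numeric IDs do not.

```python
import itertools
import uuid


class FakeDatastore(object):
    """Hypothetical stand-in for a datastore instance that assigns numeric IDs."""

    def __init__(self, first_id):
        self._ids = itertools.count(first_id)
        self.entities = {}

    def put(self, entity):
        # The store picks the numeric ID; the caller cannot choose it.
        numeric_id = next(self._ids)
        self.entities[numeric_id] = entity
        return numeric_id

    def get_by_portable_id(self, portable_id):
        # Look entities up by the id we manage ourselves.
        for entity in self.entities.values():
            if entity['portable_id'] == portable_id:
                return entity
        return None


def new_entity(name, friend_ids=()):
    # friend_ids holds plain strings (our own ids), not datastore keys.
    return {'portable_id': uuid.uuid4().hex,
            'name': name,
            'friend_ids': list(friend_ids)}


production = FakeDatastore(first_id=1)
local_sdk = FakeDatastore(first_id=5000)

alice = new_entity('alice')
bob = new_entity('bob', friend_ids=[alice['portable_id']])

# "Back up" from production and "restore" into the local SDK.
prod_ids = [production.put(e) for e in (alice, bob)]
local_ids = [local_sdk.put(dict(e)) for e in (alice, bob)]

# The numeric IDs differ between instances, but our own ids still resolve.
restored_bob = local_sdk.get_by_portable_id(bob['portable_id'])
friend = local_sdk.get_by_portable_id(restored_bob['friend_ids'][0])
```

The linear scan in get_by_portable_id is only there to keep the sketch short; a real app would query on the id field instead.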

Not good.

Ways you could make my life easier:

  1. Instead of setting a numeric ID, set the key_name automatically for new puts.
  2. Barring that, let us set IDs too

Otherwise, porting datastore entities over that use lists of keys is going to be a real bitch (references are easy to handle).

Dear App Engine, I know you're busy these days but I hope you're listening. Much thanks!

Your friend (even though you sometimes treat me cruel),
Aral

Python, the learn-at-home language

One of the things that I love about Python is that it has all the documentation you ever need (all right, almost) in the code itself. Many moons ago, the very first framework I wrote in ActionScript (Flash 5) used the same technique by placing documentation on the activation objects of functions (and it would be cool to see that practice make a comeback in AS3.)

In Python, to find out what properties an object has, you just ask for a listing. The following, for example, shows you all properties and methods on the os module.

import os
dir(os)

In fact, I was doing this just today as I wanted to find out which functions are available for working with folders for the automatic restore feature I'm building for my Google App Engine backup solution.

Of course, that brings back a lot of items.

But have no fear, because we have regular expressions. I've grown to loooooove regular expressions thanks to being finally forced to learn them and use them daily in Django. I probably don't write the most concise ones, but I've gotten to the point where I use them all the time and they simplify my life to no end.

So, to see just the methods that have "dir" in them:

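import re
# Match any name that contains "dir" somewhere in it.
r = r'(^.*?dir.*?$)'
rc = re.compile(r)
matches = map(rc.match, dir(os))
# Keep only the names that matched (match() returns None otherwise).
[x.groups() for x in matches if x is not None]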

And Bob's your uncle.

(Oh yeah, and list comprehensions rock too!)
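For a plain substring check like this one, the regular expression isn't strictly needed. A minimal sketch of the same filter as a bare list comprehension, plus a look at a docstring to show the documentation living in the code (inspect.getdoc is part of the standard library; dir_names is just an illustrative name):

```python
import inspect
import os

# Every name on the os module that contains "dir".
dir_names = [name for name in dir(os) if 'dir' in name]
print(dir_names)

# The documentation travels with the code: read a function's docstring directly.
print(inspect.getdoc(os.mkdir))
```

The regex version earns its keep once the pattern gets more involved than "contains this text".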

And the coolest thing is that since Python is interpreted, I'm doing all this in the excellent IPython shell and using the language to learn the language and as a reference.

The more I use Python, the more I love it. They really did everything right. And that success, no doubt, is firmly planted in the unerring focus and the core values instilled in the language by Guido -- you're a freakin' genius, dude, and it gives me no end of confidence in Google App Engine that you're on the project! :)

The cloud must be decentralized

Amazon S3 has been down for several hours now and, as Jonathan Boutelle (who, by the way, is speaking at the Singularity Web Conference) writes, when S3 goes down, the Internet goes down. Along with images on Twitter, SlideShare, SmugMug, and a host of other sites, images on Pistach.io ads are also down. We've also had intermittent issues with Amazon's SimpleDB. I feel that these issues only serve to highlight the age-old danger of having all your eggs in one basket. Especially a proprietary basket. This is why I applaud Google for releasing the Google App Engine SDK as open source.

The launch of services like Amazon's EC2, S3, and SimpleDB and Google's App Engine heralds the birth of the Commodity Web, wherein web infrastructure is infinitely available and metered just like electricity, water, and gas. For the most part, we don't think about the limits or availability of commodities (as the impending ecological nightmare we've woven for ourselves would attest to, if nothing else.) But when we do try those limits, or when the systems of delivery break down (as in blackouts, for example), the extent of our reliance on these utilities becomes painfully clear.

Planning for contingencies is an important part of running a successful business. Hence, for example, we have UPS units and generators to provide (at least short term) redundancy for the general power grid. Unfortunately, the same cannot be said currently for many applications that rely on the variously-available "cloud" computing solutions for their hosting.

The problem is that we lack standards and interoperability.

If tied in to a proprietary system, I cannot change my hosting provider as easily as I can, say, switch from the power grid to using my own generator. They're not the same voltage. Heck, they're not even the same frequency.

This is why having the Google App Engine SDK be open source is of utmost importance and, I believe, a spark of genius on Google's part. They've created a de facto standard for the Commodity Web. It is only a matter of time before other vendors will offer compatible and scalable application hosting based on Google's standard. And that's how it should be.

The cloud cannot and should not be a centralized behemoth that creates a single point of failure for the Internet. The cloud must be decentralized and needs standards, even if only de facto, and I feel that Google App Engine is a step in the right direction.

If you're interested in the Commodity Web, I'd highly urge you to check out the writings of another Singularity speaker, Simon Wardley, who writes and speaks extensively on the topic.

Smells like Singularity

Godin Needs Singularity

I can't believe I missed this when it first came out (it was probably because I'd just started on my crazy two-month trip to simultaneously learn Google App Engine, brush up my Python, and build the new Singularity web site):

Brooks Andrus from Techsmith, whom I always end up having a lovely conversation with whenever I'm at a geek conference, wrote about the Singularity web conference a few months back in response to a post by Seth Godin titled The new standard for meetings and conferences. (Brooks, I hope you don't mind that I stole your excellent graphic for the post.)

In his post, Seth states:

If oil is $130 a barrel and if security adds two or three hours to a trip and if people are doing more and more business with those far afield...

and if we need to bring together more people from more places when we get together...

and if the alternatives, like video conferencing or threaded online conversations continue to get better and better, then...

I think the standard for a great meeting or a terrific conference has changed.

In other words, "I flew all the way here for this?" is going to be far more common than it used to be.

I love Brooks's reply:

Seth Godin meet Aral Balkan and welcome to Singularity.

Looking at the comments for the post, I did see a common misconception voiced by several people that Singularity is an online conference. I can see how this came to be, as early on, I was calling it that too. But, as it has begun to take shape, I realize now that it's not an online conference. Singularity is a global web conference.

What's the difference? Here's a quote from the comment I left on Brooks's post:

I just have to clarify that Singularity is not an online conference, it’s a _global_ conference. The big difference here is that we have local conference hubs around the world, some being organized by venue sponsors like Yahoo! and the BBC and others — community hubs — being organized by community groups. People meet up _locally_ as part of a global event.

We definitely use the Internet but it’s our communication medium. It’s what ties all the local groups together. Sure there will be people experiencing and interacting with the conference from the comfort of their own rooms — and some speakers will even be presenting from whichever hotel room they happen to be at the moment — but we are concentrating heavily on having a good speaker and audience presence at the various local hubs. I feel this is essential to the character of the conference.

I truly feel that we are traversing some uncharted terrain here, building the first Conference 2.0, as it were. And I hope that other conferences follow suit because the type of conference we’re creating is environmentally friendly.

Thank you, Brooks, for writing up such a cool post on Singularity.

Ticket sales for the Singularity web conference started yesterday with the launch of our new site on Google App Engine. Tickets during the early bird discount are just $99 (inc. VAT). You can also micro-sponsor the conference for just $199 (inc. VAT). So what are you waiting for? Join us in making history with the world's first global web conference.

jsontime for Google App Engine

From the Simon Willison/Natalie Downe dream team comes yet another very useful service for Google App Engine developers: jsontime.

I have been meaning to blog about this for some time (no pun intended!)

Simon and Natalie have uploaded the 500 or so files that are in the pytz library so you don't have to and built a JSON API that you can call from your Google App Engine apps.

To get the current time in London, for example, you would call the following URL:

http://json-time.appspot.com/time.json?tz=Europe/London
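A minimal sketch of building that URL for any timezone name, written for Python 3's standard library. JSONTIME_ENDPOINT and jsontime_url are names I've made up for the example; only the URL itself comes from the service. I'm not showing how to parse the response because the fields it returns aren't described here.

```python
from urllib.parse import urlencode

JSONTIME_ENDPOINT = 'http://json-time.appspot.com/time.json'


def jsontime_url(timezone):
    """Builds the jsontime URL for a pytz-style timezone name."""
    # safe='/' keeps the slash in names like Europe/London unescaped.
    return JSONTIME_ENDPOINT + '?' + urlencode({'tz': timezone}, safe='/')


print(jsontime_url('Europe/London'))
```

From a Google App Engine app you would then hand the resulting URL to the urlfetch API.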

Check out jsontime.

Google App Engine: no support, quotas, throttling… Help!

Update: Paul McDonald, Product Manager on Google App Engine, emailed me to offer their support. Thanks, Paul, I really appreciate it.

I wrote to Google on April 19th, 2008 to ask for the quotas to be lifted on the upcoming new web site for the Singularity Web Conference.

I still haven't heard back from them.

On April 23rd, I wrote to Kevin Marks, whom I'd met at the excellent LIFT conference, to tell him how excited I was about Google App Engine and that we're using it to host the conference web site:

I'm also _very_ excited about Google App Engine. So much so that I've decided to build the Singularity web application on it. In fact, in line with the underlying philosophy of openness around the conference, I launched an open source project called The GAE SWF Project (http://gaeswf.appspot.com) to share the framework I'm building for the conference with the community (along with knowledge and examples on best practices, etc.) It's currently the featured app on the app gallery.

I'd love to talk to someone at Google about raising our quotas etc. as I want to launch the Singularity site/app rather soon and we will start seeing quite a bit of use on it. I'd love to see if Google would be interested in sponsoring the conference too -- it's going to be very relevant. (And, finally, I'd love to have a session at least on Google App Engine.)

No reply.

I know the Google App Engine team is really busy and Marzia has been responding to my various emails on the forums and personally, but I just don't feel any love towards the conference from Google, which is sad because there's definitely a lot of love from the conference towards Google App Engine.

Seriously, though, Google App Engine's technology is awesome. Their intense focus, no doubt the same as that fostered by Guido that makes Python such a joy to work with, has resulted in a truly simple system (and that's a big, big compliment coming from a guy who worships simplicity.)

However, there is one glaring weak point and that is support.

Non-existent support

Any developer who has been developing web sites and applications for any length of time knows that the first thing you look for in a hosting provider is support. Not hardware specs, not technology, none of those. Support.

It's not a question of whether things will go wrong. Things _will_ go wrong. You _will_ have unique requirements. No amount of clever technology is going to replace effective and efficient support personnel. And currently, I don't believe Google App Engine has spared thought for support.

We have the forums, which Google App Engine engineers frequent, and we have the issue tracker, but there isn't anything more personal than that. Speaking to engineers about general issues is great, but the core engineers are busy people and we need support people with whom we can discuss issues that are specific to our applications.

What Google App Engine lacks in terms of support, however, it tries to make up for with various quotas and "clever" automated systems.

Quotas are a bad idea

Let me state it for the record: Quotas are the dumbest part of Google App Engine. Here's why:

Let's say that you're a huge multinational Internet company called Google and you want to revolutionize how web applications are built and distributed. You make all the right decisions. You make your SDK open source, thereby creating an immediate de facto standard for commodity cloud computing. You choose an elegant and simple language (Python) and support a similarly elegant framework (Django). Then, you open it up so that developers can play with it and build beautiful things.

And they do.

They build beautiful things.

And people take notice. They take notice of the most interesting and beautiful ones. And they tell their friends.

And their friends want to see what Google App Engine is capable of so they look at these interesting and beautiful examples.

And they see:

Over quota.

Isn't Google App Engine supposed to scale? I thought that was the whole idea... As an application developer, I couldn't care less whether the system as a whole scales or not. I need to know that my application will scale.

So we get left with a system where the most popular sites and apps built with Google App Engine all look strikingly similar:

Over quota.

(Or, even worse, exhibit weird behavior and errors because bits of them, like the urlfetch API or mail API are over quota.)

Case in point: Simon and Natalie were over at ours last week and Simon wanted to show me their Liquid Fold project, which is hosted on Google App Engine. Liquid Fold lets you track the actual browser window dimensions of your site's users, instead of the common metric of tracking their screen resolutions. It sounded really cool and I was excited to see it. But instead we saw:

Over quota.

You get the idea.

The canned response to this will be that the system is pre-release and that the quotas are temporary. I think the problem runs deeper than that. It's in the philosophy behind having a quota system that penalizes your application instead of optimistically upgrading its limits.

Teacher Knows Best vs. Customer Knows Best

The philosophy behind a penalizing quota system is the old-school mindset that Teacher Knows Best. If you've ever been schooled in the Turkish educational system, you'll know it well. Teacher sets out the rules. If you follow the rules, you are rewarded with a lack of penalty. If you don't follow the rules, or question the rules, you are penalized. Teacher Knows Best, your needs be damned.

It's not difficult to see where Google gets this philosophy from. It has worked really well for them in their search business. But search and hosting are very different creatures.

In this case, Google sets quotas and your application is penalized if it goes over those quotas. Penalized? My goodness, what an interesting word to be using for a hosting service!

Good hosts don't penalize you, they charge you for what you consume. If you consume more, they charge you more. Call me crazy, but I'm happy with that.

Instead of a system of quotas, Google App Engine should, with your prior consent of course, automatically scale you up and charge you for it. No one wants to see over quota messages at the very moment that their site or application becomes most popular. That completely defeats the purpose of being hosted on the cloud.

Although Google is going to be offering commercial plans when the service launches in earnest, I have a sneaking suspicion that, unless something changes, the Teacher Knows Best philosophy will still prevail. Instead, Google must adopt a Customer Knows Best attitude and, with your prior consent, preemptively scale your application and charge you for it.

It comes down to this:

No one should have to see an Over Quota message if they are willing to pay for the privilege of not seeing one.

It's not too much to ask for. In fact, I'd argue that it is standard practice by any web host worth her salt.

It's not the quota, it's the "clever" throttling

You may be wondering why the quota system has me worried. After all, the quotas offered even in the pre-release period are not that stingy. Most of them wouldn't even be an issue for the Singularity web site if it wasn't for "clever" throttling.

You get a certain amount of quota for various services per 24 hours but Google App Engine doesn't just wait until you've reached that limit and then cut you off. Instead, in perfect adherence to Teacher Knows Best, it tries to throttle your application so that it spreads your quota over the 24 hour period. This means that if your app gets hit particularly hard for a period of time (say you've been slashdotted or dugg), Google will start showing Over Quota errors so that you, the silly student, don't end up using all of your quota for the day.
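To make the difference concrete, here is a toy comparison of the two policies. This is not Google's actual throttling algorithm, which I don't know. The function names, the 10,000-request daily quota and the 1,000-request burst are all made-up numbers for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def served_with_daily_cutoff(requests, daily_quota):
    """Serve every request until the whole day's quota is used up."""
    return min(requests, daily_quota)


def served_with_even_spread(requests, daily_quota, window_seconds):
    """Serve only the slice of the daily quota that 'belongs' to this window."""
    window_budget = daily_quota * window_seconds // SECONDS_PER_DAY
    return min(requests, window_budget)


daily_quota = 10000   # hypothetical quota per 24 hours
burst = 1000          # hypothetical requests arriving within one busy hour

cutoff_served = served_with_daily_cutoff(burst, daily_quota)
spread_served = served_with_even_spread(burst, daily_quota, 60 * 60)

print(cutoff_served)  # 1000: the whole burst is served
print(spread_served)  # 416: the rest of the burst sees Over Quota
```

Under these assumptions, the burst uses only a tenth of the day's quota, yet the evenly spread policy turns away more than half of it.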

A few hours later, when people have realized that your site is down and don't care anymore, your site will display properly.

Does this sound completely backwards to you too?

The idea behind Google App Engine is to have a scalable platform for your applications that can take the weight when hit hard. Instead, it forges ahead with its Teacher Knows Best philosophy and tries to spread your quota somewhat evenly over a 24 hour period.

No doubt this provided an interesting programming challenge for the engineers but it's an example of how Teacher Knows Best can result in too-clever solutions that are disconnected from the needs of your customers.

And it's not the only example of Teacher Knows Best gone wrong at Google. Every time I connect to the Net via my T-Mobile USB stick and head to gmail.com, I am informed that they cannot serve me from that domain in Germany. I also get the Google pages localized in German. But, wait a second, I'm not in Germany! I'm in the UK! But, heck, that doesn't matter because I have no way to tell them that I'm not in Germany, because Teacher Knows Best.

My needs are simple

As a customer, my needs are simple.

I need a stable, scalable system that never, ever, ever opts to display an error message to my users or cripple my application if it can help it.

I don't want my quota spread evenly over 24 hours via some amazingly clever algorithm. I want Google App Engine to serve each and every request even when I'm getting hit hardest and charge me for it.

Instead of trying to be too clever, be dumb but meet my needs.

After all, if you want to send 50 emails out (because, say, someone just bought 50 tickets to attend your conference and wants to assign them to a list of people), then you should be able to if you're happy with paying for the privilege. As it stands, however, there is email throttling in effect which means that, to quote Marzia Niccolai from Google, "you should not really be able to send more than about 2 emails per minute at your application's peak."

In other words, Teacher Knows Best, my needs be damned.

Help!

I would be lying if I said that I wasn't worried about the lack of response from Google to my request to have the quotas lifted for the new Singularity web site.

Singularity is a very public affair with some excellent speakers and my decision to host the site on Google App Engine was, at least on my own humble scale, a show of confidence in Google App Engine. (I still love the platform, I truly do, and, apart from these few issues, it is an absolute joy to develop on.)

I can't see how having the Singularity web site on Google App Engine is a bad thing for Google. In fact, I would have thought that they would have been supportive and would have jumped at the chance to get such public and visible support for their platform.

(And guys, if anyone from Google is reading this, my invitation to have a session on Google App Engine at the conference stands. Just leave me a comment.)

I just don't understand the silence and lack of interest on Google's part.

The silver lining

Regardless, with or without quotas, I'm confident that I've developed the new Singularity web site so that it will work without issues. As I stated earlier, our needs, at least initially, are not that great. Because of the quotas, however, I'm having to build systems (like a fake cron that pings the app every minute to send email from a queue) that I would otherwise not have had to build with my rather modest requirements.
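A rough sketch of the queue side of that fake cron, in plain Python rather than against the App Engine mail API. drain_queue, EMAILS_PER_PING and the example.com addresses are hypothetical; the limit of two per ping is an assumption taken from the "about 2 emails per minute" figure quoted earlier.

```python
from collections import deque

EMAILS_PER_PING = 2  # assumption, based on the "about 2 emails per minute" figure


def drain_queue(queue, send, limit=EMAILS_PER_PING):
    """Sends at most `limit` queued messages and returns how many were sent."""
    sent = 0
    while queue and sent < limit:
        send(queue.popleft())
        sent += 1
    return sent


# 50 hypothetical ticket notifications waiting to go out.
queue = deque('attendee%d@example.com' % n for n in range(50))
outbox = []   # stands in for the real mail call
pings = 0

# Each loop iteration plays the part of one ping from the fake cron.
while queue:
    drain_queue(queue, outbox.append)
    pings += 1

print(pings)  # 25
```

At one ping a minute, that is 25 minutes to deliver the emails for a single 50-ticket order.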

Google could have saved me (and still can save me) a lot of work by lifting the quotas for the Singularity web conference web site -- especially the email quotas.

I really hope that I get some sort of response from Google soon and I really hope that they address the issue of support with Google App Engine. Teacher Knows Best is a great attitude for search, but a horrible one for hosting.

If a public event like the world's first global web conference isn't important enough to warrant a few minutes from the Product Managers on the Google App Engine team to give me a definite "yes" or "no" answer to a support request, I'd hate to think what level of support an independent developer with a small application and no publicity is going to get from Google.

Winning at the shell game: iPython on Google App Engine

iPython is an awesome extended Python shell that gives you goodies like tab completion for instances, history tracing (so you can easily copy interactive sessions as doctests), etc. And, if you install it, your Django project on Google App Engine will automatically start using it instead of the regular python shell when you use ./manage.py shell.

To install iPython on OS X Tiger (yes, my Leopard discs are still safely in their box since I downgraded and I don't see any reason to bring them back out yet), I took the following steps:

  1. Download the latest iPython from the iPython distributions page (ipython-0.8.4.tar.gz)
  2. Untar it, cd into the folder
  3. As per the instructions on the iPython download page:
    python setup.py build
    sudo python setup.py install
  4. To test it out on my Google App Engine/Django project, from my project folder: ./manage.py shell

(Note: The docs mention that you need to have readline installed on Mac OS X in order to use some of the features like tab completion and syntax highlighting. It just worked out of the box for me on OS X Tiger 10.4.11 -- I'm not sure if I had installed readline at some point or whether it was just there. Check out these instructions if you're having trouble.)

Once you have it installed, try out the cool code completion:

from my_app import models
models.

Press ⇥, and you'll see a list of all your models. models.my_model. ⇥ will show you the properties for that model and so on.

To create doctests, simply enter your test instructions in the shell and then type hist -n to get a dump of your history without line numbers that you can copy and paste into your doctest.

You can press ⌃ P and ⌃ ⇧ P to interactively bring up the previous and next commands in the history. If you've typed a bit of the command before doing this, it will filter to show you only those commands from your history that match the text you've entered.

You can also access the system shell without leaving iPython by preceding system calls with an exclamation mark. !ls, for example, will show you a listing of the current working directory.

And there's much more you can do that you can read about on the iPython documentation (or just type ? in the iPython shell itself and browse the docs interactively).

Check out iPython, it's yummy!

I found out about iPython from an excellent blog post by AkH on useful tips and good practices for Django projects. Thanks, dude!

Singularity web conference: register your interest and help me test with the new teaser on Google App Engine

Singularity web conference new web site teaser

Take a sneak peek at the new Singularity web site on Google App Engine. You can sign up for the site, which will go live in July, and register your interest in the conference and help me test out the deployment environment. (And yes, Singularity is inverting its colors for the second half of the year.)

The current teaser is a glorified "coming soon" page but with one important difference: You can pre-register for the new site using your Google account.

If you have a moment, please sign up for the new site and help me test the deployment environment before we open up the actual site in July along with ticket sales.

I would really appreciate your feedback if you notice any issues (especially quota-related errors). I hope that Google will be removing the quotas for our application but I haven't heard anything definitive back from them yet.

The registration process currently asks you for your name, your email address (if you'd like to use a different one to the one on your Google account), and for your location.

The location data is really important for us as we create the schedule for the conference. The conference is going to run for 48 hours straight, over three days and be attended by people from around the world so the more location and timezone data we have, the better we can create a schedule that meets your needs. You can give as little or as much detail as you like for your location (country/city/town/even postal code) and we do our best to calculate your timezone based on that and on what your computer tells us your timezone is. (I'm using the Yahoo GeoPlanet API for much of this and I am _very_ impressed by it.)

Check out the new Singularity web site teaser and sign up to join the Singularity community.

Update: Please star Issue 404 to bring OpenID to Google App Engine

Issue 404 is the key to getting OpenID working on Google App Engine but currently it only has 10 stars.

Please star Issue 404 to get OpenID support on Google App Engine.

(Please add a star, _not_ a "me too" comment which will get emailed to everyone who has starred the issue and may cause other people to unstar it.)

Ryan Barrett from the Google App Engine team has built both OpenID provider and OpenID consumer libraries (open source) that you can reach at http://openid-provider.appspot.com/ and http://openid-consumer.appspot.com/ (uses version 2.1.1 of JanRain's OpenID library and is compatible with OpenID 2.0 providers).

Once this bug is fixed, those existing libraries will just work and we will have OpenID support on Google App Engine.

Please star Issue 404 so OpenID doesn't stay 404 on Google App Engine.

Running doctests from TextMate for Google App Engine modules

Developer emptor: I just lost a couple of hours to this. Make sure you disable the Google App Engine doctest import in your apps when you're done testing a module, lest you encounter _weird_ errors. When I forgot to remove the doctest import, users.create_login_url() started returning an incorrect login URL that forwarded to https://www.google.com/accounts/Login?continue=. Check out my forum post on it here.

I love Python's doctests. Basically, you test out your functions in the interactive shell and copy the results into the comments for a function. That's it! So simple.

Example:

def http_request(self, url, data, method=urlfetch.POST):
	"""
	Makes an API call to Triggermail and returns the response.
 
	>>> client = TriggerMail()
	>>> client.http_request('email', {'email':'EMAIL_REMOVED'}, urlfetch.GET)
	{u'blacklist': u'0', u'templates': {u'test2': 0}, u'verified': u'0', u'vars': {u'first_name': u'Aral', u'last_name': u'Balkan'}, u'optout': u'0'}
	"""

To run the doctests, you just need a main method in your module that looks like this:

if __name__ == "__main__":
	import doctest
	doctest.testmod()

And, if you're working with TextMate, you can run the current script and its doctests by pressing ⌘ R.
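If you want to try the pattern without any App Engine dependencies, here is a small self-contained module that should run as is. ticket_total is a made-up function for the example, with a default price borrowed from the early bird tickets.

```python
import doctest


def ticket_total(quantity, unit_price=99):
    """
    Returns the total price for a number of tickets.

    >>> ticket_total(1)
    99
    >>> ticket_total(50)
    4950
    """
    return quantity * unit_price


if __name__ == "__main__":
    # Silent when every example passes; prints a report for any that fail.
    doctest.testmod()
```

Change one of the expected values in the docstring and run it again to see what a failing doctest looks like.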

Sweet!

However, when working with Google App Engine, this doesn't work out of the box.

If you try it, you'll get an error similar to the following:

ImportError: No module named google.appengine.api

This is because the local GAE environment isn't set up properly. The same goes when trying to test your apps from the Python interactive shell.

(If you're using Django for your app, you're in luck, all you have to do is ./manage.py shell and you're up and running with an interactive shell that's configured for your GAE project.)

Thankfully, Duncan over at the GAE forums went to the trouble of finding out exactly which imports are necessary to get you up and running.

His code listing actually goes beyond setting up the environment to finding your modules and running the tests. For my purposes, I just want to be able to hit ⌘ R in TextMate and run the tests for my current module while developing it, so I took the top bit of his code and put it into a module called gae_doctests.py.

It looks like this:

# To enable doctests to run from TextMate, import this module
# (Use only when testing, then comment out.)
# From: http://groups.google.com/group/google-appengine/browse_thread/thread/fa81f6abd95aa8b9/efed988b302aafb4?lnk=gst&q=duncan+doctests#efed988b302aafb4
 
import sys
import os
sys.path = sys.path + ['/usr/local/google_appengine', '/usr/local/google_appengine/lib/django', '/usr/local/google_appengine/lib/webob', '/usr/local/google_appengine/lib/yaml/lib', '/usr/local/google_appengine/google/appengine','/Users/aral/singularity/']
 
from google.appengine.api import apiproxy_stub_map
from google.appengine.api import datastore_file_stub
from google.appengine.api import mail_stub
from google.appengine.api import urlfetch_stub
from google.appengine.api import user_service_stub
 
APP_ID = u'test_app'
AUTH_DOMAIN = 'gmail.com'
LOGGED_IN_USER = 't... {at} example(.)com'  # set to '' for no logged in user
 
# Start with a fresh api proxy.
apiproxy_stub_map.apiproxy = apiproxy_stub_map.APIProxyStubMap()
 
# Use a fresh stub datastore.
stub = datastore_file_stub.DatastoreFileStub(APP_ID, '/dev/null', '/dev/null')
apiproxy_stub_map.apiproxy.RegisterStub('datastore_v3', stub)
 
# Use a fresh stub UserService.
apiproxy_stub_map.apiproxy.RegisterStub(
    'user', user_service_stub.UserServiceStub())
os.environ['AUTH_DOMAIN'] = AUTH_DOMAIN
os.environ['USER_EMAIL'] = LOGGED_IN_USER
 
# Use a fresh urlfetch stub.
apiproxy_stub_map.apiproxy.RegisterStub(
    'urlfetch', urlfetch_stub.URLFetchServiceStub())
 
# Use a fresh mail stub.
apiproxy_stub_map.apiproxy.RegisterStub(
  'mail', mail_stub.MailServiceStub())
 

(Either replace the /usr/local/ bit with the actual path to your GAE install or use Duncan's code, which is neater -- I was lazy and copied the contents of the sys.path list from the Django interactive shell.)
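If you'd rather not hard-code the install location, one option is to build the SDK entries from a single root. A minimal sketch, assuming a GAE_ROOT environment variable (my own invention, not something the SDK defines) and the same subfolders as the listing above. Your project folder would still need to be added separately.

```python
import os
import sys

# GAE_ROOT is a hypothetical environment variable; fall back to the path used above.
GAE_ROOT = os.environ.get('GAE_ROOT', '/usr/local/google_appengine')


def gae_sys_paths(root):
    """Returns the SDK directories from the listing above, relative to `root`."""
    return [
        root,
        os.path.join(root, 'lib', 'django'),
        os.path.join(root, 'lib', 'webob'),
        os.path.join(root, 'lib', 'yaml', 'lib'),
        os.path.join(root, 'google', 'appengine'),
    ]


sys.path = sys.path + gae_sys_paths(GAE_ROOT)
```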

To use it, simply:

import gae_doctests

And hit ⌘ R in TextMate. Sweet!

Once I'm done hacking away on a module, I simply comment out the import.
