Good morning I/O-ers and welcome to day two! We hope you got a chance to see all of the great sessions on day one and, if you missed our big announcements from yesterday, take a look at all the great features in 1.5.0 including Backends and our plans for App Engine to leave preview later this year.
We’ve got a great set of sessions lined up for day two including updates on our progress with Full-text Search and MapReduce. We’ve also got two great sessions on a subject close to developers’ hearts: reliability.
Under the Covers with the High Replication Datastore - At 10:45 am, we jump into the internals of the High Replication Datastore (HRD), explaining how it works, how it differs from the Master/Slave configuration, and why developers love it. Now that HRD is the default configuration and cheaper to use, come find out what you’re missing.
Life in App Engine Production - At 3:00 pm, come meet the team wearing the pagers so you don’t have to, App Engine’s Site Reliability Engineers (SREs). In this session, they’ll give you a view of what life behind the scenes is like and why you should concentrate on your application and let the SREs take care of keeping the lights on. Learn how building on App Engine means your application gets its very own Devops team.
For those of you that couldn't make it to I/O this year, don’t stress. While we wish you were here, the I/O video team will soon have videos of all our sessions available so you can catch up from the comfort of your own home. We’ve even captured a few of our Developer Sandbox companies so you’ll get the full experience!
Posted by the App Engine team
The App Engine team has been working furiously in preparation for Google I/O time and today, we are excited to announce the release of App Engine 1.5.0, complete with a bunch of new features. This release brings a whole new dimension to App Engine Applications with the introduction of Backends, some big improvements to Task Queues, a completely new, experimental runtime for the Go language, High Replication Datastore as the new default configuration (and a lower price!), and even more tweaks and bug fixes.
This post is part of Who's at Google I/O, a series of guest blog posts written by developers who are appearing in the Developer Sandbox at Google I/O. It's also cross-posted to the Google Code blog which has similar posts for all sorts of Google developer products.
Evite is one of the web's oldest social planning services. Since its launch in 1998, Evite has delivered over a billion party invitations. Although the original platform served us well, after ten years of operation it became necessary to replace it and position Evite for another decade of successful party planning.
The reengineering effort took well over a year. Our development team replaced Java with Python and saw a significant increase in developer productivity. However, dealing with a ten-year-old Oracle database remained a challenge. To overcome the scalability limitations and inflexibility of a large relational DBMS, we designed and implemented a proprietary data store. We conducted a series of A/B tests and gradually started migrating production traffic to the new system.
Our initial motivation to use Google App Engine was fast provisioning. We introduced App Engine as a temporary solution while we increased scale and optimized our proprietary data store. However, once we imported user profile data and put App Engine backend services in production, we never looked back.
Importing a large data set of user profiles from Oracle RAC into App Engine was challenging at first because we had to perform a bulk import and then keep the data synchronized between the two datastores. As we developed our data synchronization tools we gained better understanding of the API and performance characteristics of the App Engine datastore.
Once we synchronized profile data and enabled production traffic, App Engine really started to impress. We would watch our daily traffic grow and observe App Engine’s automatic scaling in action. Additional server instances would come online to meet increased demand and disappear as traffic lowered, without any sysadmin intervention. Despite our proficiency in the use of cloud computing resources and system automation, it was never this easy to provision new servers. As Grig, Evite’s devops guru, likes to say "It's not in production until it's monitored and graphed." App Engine’s dashboard automatically manages instances and graphs system usage data.
Following our positive experience with Evite profiles, we have continued to use App Engine for other services. Occasional frustrations with elevated error rates on the Master/Slave Datastore disappeared as we switched to the High Replication Datastore. At this point we see no technical obstacles to using App Engine more extensively. Architectural decisions and best practices implemented by the App Engine team are very well aligned with the choices we made in designing Evite’s new platform. This makes it easy to deploy additional services and data sets to App Engine.
The real obstacle to using App Engine exclusively for most Evite services is risk management. As reliable and cost-effective as App Engine has been for us, it is difficult to depend completely on a single vendor or service provider. To ensure availability of our application we use multiple cloud providers. Unfortunately, this strategy prevents us from using certain App Engine features and APIs because we do not have adequate replacements for them in other deployment environments.
Evite’s new platform built on Python, NoSQL and App Engine has been a success. Our new web application has been well received by users, and our first iPhone app is receiving positive reviews. (An Android version is in the works.) We look forward to continued use of App Engine for data warehousing, rapid launches of new services and other projects.
Come see Evite in the Developer Sandbox at Google I/O on May 10-11.
Dan Mesh is a Vice President of Engineering at Evite. His extended responsibilities include espresso deliveries and release day food orders.
Posted by Scott Knaster, Editor
This is part of our ongoing series of blog posts from guest authors highlighting success stories about applications and services built on App Engine or aimed at App Engine developers. Today, we have a post from Bruno Morency of Context.IO. Bruno has been involved in startups since graduating from McGill Engineering in 2001. And since being introduced to Pine on UNIX terminals, he’s had a love-hate relationship with email.
Email mailboxes contain years of important conversations and business information yet there are no easy ways for App Engine developers to find and use that information. This is what Context.IO does. It’s the missing email API that turns mailboxes into a data source developers can leverage. This is particularly interesting for Google App Engine developers since it makes content of Gmail accounts (and any other IMAP accessible email account) available through a set of HTTP calls.
In this post, we’re going to walk you through using Context.IO to build an app that lets users search for contacts from their inbox then easily get the history of recent emails and attachments exchanged with these contacts. A working demo is available at http://contextio-demo.appspot.com. The code for this application is available on Context.IO’s GitHub account.
Our demo is quite simple in its functionality: there’s a search box used to find contacts, and once a contact has been found, we list recent emails and attachments associated with the contact. To do this, the application offers 3 urls that are called by the JavaScript running in the browser to obtain the data: search.json, messages.json and files.json.
The actual UI formatting is all implemented in the JavaScript running in the browser, but our focus here is how we get that data out of the mailbox and return it to the browser.
Let’s see how we respond to the request to get message history for a given contact. This is done by calling /messages.json which accepts an email address as a GET parameter. Note, this functionality requires an authentication step not shown here. The code behind that call is as follows:
class MessagesHandler(webapp.RequestHandler):
    def get(self):
        current_user = users.get_current_user()
        current_email = current_user.email()
        emailAddr = self.request.get('email')
        contextIO = ContextIO(api_key=settings.CONTEXTIO_OAUTH_KEY,
                              api_secret=settings.CONTEXTIO_OAUTH_SECRET,
                              api_url=settings.CONTEXTIO_API_URL)
        response = contextIO.contactmessages(emailAddr, account=current_email)
        self.response.out.write(simplejson.dumps(response.get_data()))

The code simply uses the contactmessages.json API call and returns, in JSON format, all the messages exchanged with the contact, including the subject, other recipients, thread ID, and even attachments. Check the documentation for a complete breakdown of the response.
You can see how we handle the other cases similarly in handlers.py. To get a complete overview of other calls offered by Context.IO, please refer to the documentation available here http://context.io/docs/latest.
The complete code for this demo application has been made available by the Context.IO team on our GitHub account. Feel free to use it as the base for your own application. To get started, you’ll need to set up your application with your own Google API authorization and Context.IO API keys.
If you have questions about using Context.IO to embed your users’ email in your app, feel free to contact us through http://support.context.io.
Posted by Bruno Morency, Co-founder of Context.IO
Elastic Path develops a very flexible enterprise ecommerce platform. Many global brands rely on the Elastic Path platform to power their ecommerce solutions.
Many ecommerce sites are actually complex web applications. Catalog management, shopping cart functionality, promotion engine, order fulfillment, and backend integrations are just some of the challenges involved in running a full-fledged online store.
Since 2008, our Java-based platform has been the ecommerce backbone of a couple of online stores that are being migrated to run on App Engine. Like many complex web applications, these stores used to run in a multi-server environment (Apache Tomcat with a MySQL database) hosted in a colocation center.
As the diagram above shows, our goal is to have Elastic Path running entirely on the App Engine cloud. The storefronts have already been migrated, and the database and remaining parts of the Elastic Path platform will be fully on the cloud soon.
Why are we doing this? There are many benefits to being on App Engine:
Our migration’s high-level approach was to move everything except the persistence layer onto App Engine, and then resolve issues with the technical limitations such as the class whitelist and request length. We also had to modify some third-party libraries to work around App Engine’s restrictions on operations such as class loading, threads, and sockets.
We didn’t migrate the persistence layer because Elastic Path uses a relational database; converting our entire object graph to the Datastore is not feasible now. We are working closely with Google on alternatives. In the interim, we are still using a MySQL database and have kept our persistence layer running within a Tomcat application in the colo. We implemented a creative solution: the non-persistence layers of Elastic Path run on App Engine and communicate with the Tomcat-hosted persistence services via Spring Remoting. The back-and-forth remoting was expensive and impacted the performance of our application so we implemented some data caching. For this, we turned to App Engine’s Memcache, which improved performance by an order of magnitude (less than 2 seconds average response times vs. 2 minutes or more without Memcache).
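The caching idea described above can be sketched as a read-through cache placed in front of an expensive remote call. This is only an illustration in Python, not Elastic Path's Java implementation: `ReadThroughCache`, `load_product_remotely`, and the TTL value are hypothetical names and numbers, and an in-memory dictionary stands in for Memcache so the example runs anywhere.

```python
import time


class ReadThroughCache:
    """In-memory stand-in for a cache such as Memcache (hypothetical helper)."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expiry_time, value)

    def get_or_load(self, key, loader):
        """Return the cached value for key, calling loader(key) only on a miss."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no remote call is made
        value = loader(key)  # cache miss or expired entry: do the expensive call
        self._entries[key] = (now + self._ttl, value)
        return value


remote_calls = []  # records every simulated remote call


def load_product_remotely(product_id):
    """Placeholder for an expensive remoting call to the persistence service."""
    remote_calls.append(product_id)
    return {"id": product_id, "name": "Product %s" % product_id}


cache = ReadThroughCache(ttl_seconds=300)
first = cache.get_or_load("sku-1", load_product_remotely)
second = cache.get_or_load("sku-1", load_product_remotely)  # served from cache
```

The second lookup never reaches the remote service, which is where the savings come from when remoting round trips dominate response time.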
Other App Engine technologies we use heavily include AppStats for performance tuning, URL Fetch for the Spring Remoting described above, and the fantastic Maven GAE plugin that we use for packaging and automated deployments. As we continue to push our platform up to the cloud, we hope to utilize more of App Engine’s cool features. If you’d like to learn more about Elastic Path, how we are migrating our Java platform to run on the cloud, and how you might be able to migrate your application to App Engine, drop by our booth in the App Engine section of the Developer Sandbox. See you there!
Come see Elastic Path Software in the Developer Sandbox at Google I/O on May 10-11.
Eddie Chan is an ecommerce developer at Elastic Path Software in beautiful Vancouver, Canada. He and his brilliant team work closely with Google and are currently focused on migrating existing online stores to App Engine.
Mojo Helpdesk from Metadot is an RDBMS-based Rails application for ticket tracking and management that can handle millions of tickets. We are migrating this application to run on Google App Engine (GAE), Java, and Google Web Toolkit (GWT). We were motivated to make this move because of the application’s need for scalability in data management and request handling, the benefits from access to GAE’s services and administrative tools, and GWT’s support for easy development of a rich application front-end.
In this post, we focus on GAE and share some techniques that have been useful in the migration process.
Task failure management
Our application makes heavy use of the Task Queue service, and must detect and manage tasks that are being retried multiple times but aren’t succeeding. To do this, we extended Deferred, which allows easy task definition and deployment. We defined a new Task abstraction, which implements an extended Deferrable and requires that every Task implement an onFailure method. Our extension of Deferred then terminates a Task permanently if it exceeds a threshold on retries, and calls its onFailure method.
This allows permanent task failure to be reliably exposed as an application-level event, and handled appropriately. (Similar techniques could be used to extend the new official Deferred API).
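As a rough illustration of the retry-threshold idea, here is a minimal Python sketch. It is not the Mojo Helpdesk code, which is Java and built on Deferred: the `Task` base class, `execute`, `drive`, and `max_retries` are hypothetical names, and a simple loop stands in for the Task Queue's retry behavior.

```python
class Task:
    """Base class: every task must define run() and on_failure()."""

    max_retries = 3  # assumed threshold; the real value is application-specific

    def run(self):
        raise NotImplementedError

    def on_failure(self, error):
        raise NotImplementedError


def execute(task, retry_count):
    """Run a task once.

    Returns "done" on success, "retry" if the queue should try again, and
    "failed" once the retry threshold is reached and on_failure() has run.
    """
    try:
        task.run()
        return "done"
    except Exception as error:
        if retry_count >= task.max_retries:
            task.on_failure(error)  # surface the permanent failure to the app
            return "failed"
        return "retry"


def drive(task):
    """Simulate a queue that keeps retrying until the task is done or failed."""
    retry_count = 0
    while True:
        outcome = execute(task, retry_count)
        if outcome != "retry":
            return outcome
        retry_count += 1


class AlwaysFails(Task):
    max_retries = 2

    def __init__(self):
        self.attempts = 0
        self.failure = None

    def run(self):
        self.attempts += 1
        raise RuntimeError("backend unavailable")

    def on_failure(self, error):
        self.failure = str(error)
```

Driving an `AlwaysFails` instance runs it once plus two retries, then calls `on_failure` exactly once instead of retrying forever.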
Appengine-mapreduce
Mojo Helpdesk needs to run many types of batch jobs, and appengine-mapreduce is of great utility. However, we often want to map over a filtered subset of Datastore entities, and our map implementations are JDO-based (to enforce consistent application semantics), so we don’t need low-level Entities prefetched. So, we made two extensions to the mapper libraries. First, we support the specification of filters on the mapper’s Datastore sharding and fetch queries, so that a job need not iterate over all the entities of a Kind. Second, our mapper fetch does a keys-only Datastore query; only the keys are provided to the map method, then the full data objects are obtained via JDO. These changes let us run large JDO-based mapreduce jobs with much greater efficiency.
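The two mapper extensions can be pictured with a small in-memory Python sketch. Everything here is a stand-in: `DATASTORE`, `keys_only_query`, `load_object`, and `run_mapper` are hypothetical names, the dictionary plays the role of the Datastore, and `load_object` plays the role of the JDO fetch. The real appengine-mapreduce library also shards the work, which this sketch ignores.

```python
# Toy "Datastore": key -> entity properties.
DATASTORE = {
    1: {"status": "open", "subject": "Printer jam"},
    2: {"status": "closed", "subject": "Password reset"},
    3: {"status": "open", "subject": "VPN down"},
}


def keys_only_query(filters):
    """Yield only the keys of entities whose properties match every filter."""
    for key, entity in sorted(DATASTORE.items()):
        if all(entity.get(name) == value for name, value in filters.items()):
            yield key


def load_object(key):
    """Stand-in for loading the full data object (for example, via JDO)."""
    return dict(DATASTORE[key], id=key)


def run_mapper(filters, map_fn):
    """Map over the filtered subset: query keys first, then load each object."""
    return [map_fn(load_object(key)) for key in keys_only_query(filters)]


subjects = run_mapper({"status": "open"}, lambda ticket: ticket["subject"])
```

Only the two open tickets are ever loaded; the closed one is excluded by the filter before any full object is fetched.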
Supporting transaction semantics
The Datastore supports transactions only on entities in the same entity group. Often, operations on multiple entities must be performed atomically, but grouping is infeasible due to the contention that would result. We make heavy use of transactional tasks to circumvent this restriction. (If a task is launched within a transaction, it will be run if and only if the transaction commits). A group of activities performed in this manner – the initiating method and its transactional tasks – can be viewed as a “transactional unit” with shared semantics.
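The "run if and only if the transaction commits" behavior can be sketched in plain Python. This is a toy model rather than the Datastore or Task Queue API: the `Transaction` class and its `defer` method are hypothetical, and a list stands in for the task queue.

```python
class Transaction:
    """Toy transaction that releases deferred tasks only on a clean commit."""

    def __init__(self, queue):
        self._queue = queue
        self._pending = []

    def defer(self, task):
        """Remember a task; it is not visible to the queue yet."""
        self._pending.append(task)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Commit: hand every pending task to the queue.
            self._queue.extend(self._pending)
        # On rollback the pending tasks are simply dropped.
        self._pending = []
        return False  # never swallow the exception that caused the rollback


queue = []

with Transaction(queue) as txn:
    txn.defer("update-tag-counts")  # enqueued because the block succeeds

try:
    with Transaction(queue) as txn:
        txn.defer("notify-assignee")
        raise ValueError("write conflict")  # forces a rollback
except ValueError:
    pass  # the task deferred in the failed transaction was never enqueued
```

After both blocks run, the queue holds only the task from the transaction that committed.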
We have made this concept explicit by creating a framework to support definition, invocation, and automatic logging of transactional units. (The Task abstraction above is used to identify cases where a transactional task does not succeed). All Datastore-related application actions – both in RPC methods and "offline" activities like mapreduce – use this framework. This approach has helped to make our application robust, by enforcing application-wide consistency in transaction semantics, and in the process, standardizing the events and logging which feed the app’s workflow systems.
Entity Design
To support join-like functionality, we can exploit multi-valued Entity properties (list properties) and the query support they provide. For example, a Ticket includes a list of associated Tag IDs, and Tag objects include a list of Ticket IDs they’re used with. This lets us very efficiently fetch, for example, all Tickets tagged with a conjunction of keywords, or any Tags that a set of tickets has in common. (We have found the use of "index entities" to be effective in this context). We also store derived counts and categorizations in order to sidestep Datastore restrictions on query formulation.
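A small in-memory Python sketch shows the kind of lookups that list properties make possible. The data and the function names (`tickets_with_all_tags`, `common_tags`) are made up for illustration, and Python sets stand in for the Datastore's list-property queries.

```python
# Each ticket carries a list of the tag IDs attached to it.
tickets = {
    "t1": {"tag_ids": ["billing", "urgent"]},
    "t2": {"tag_ids": ["billing"]},
    "t3": {"tag_ids": ["billing", "urgent", "vip"]},
}


def tickets_with_all_tags(required):
    """Return IDs of tickets whose tag list contains every required tag."""
    required = set(required)
    return sorted(
        ticket_id
        for ticket_id, ticket in tickets.items()
        if required <= set(ticket["tag_ids"])
    )


def common_tags(ticket_ids):
    """Return the tags shared by all of the given tickets."""
    tag_sets = [set(tickets[ticket_id]["tag_ids"]) for ticket_id in ticket_ids]
    return sorted(set.intersection(*tag_sets)) if tag_sets else []
```

For example, `tickets_with_all_tags(["billing", "urgent"])` selects `t1` and `t3`, and `common_tags(["t1", "t2", "t3"])` narrows down to the single shared tag `billing`.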
These patterns have helped us build an app whose components run efficiently and robustly, interacting in a loosely coupled manner.
Come see Mojo Helpdesk in the Developer Sandbox at Google I/O on May 10-11.
Amy (@amygdala) has recently co-authored (with Daniel Guermeur) a book on Google App Engine and GWT application development. She has worked at several startups, in academia, and in industrial R&D labs; consults and does technical training and course development in web technologies; and is a contributor to the @thinkupapp open source project.