



Google Cloud Platform gets a performance boost today with the much anticipated public beta of NVIDIA Tesla K80 GPUs. You can now spin up NVIDIA GPU-based VMs in three GCP regions: us-east1, asia-east1 and europe-west1, using the gcloud command-line tool. Support for creating GPU VMs through the Cloud Console is coming next week.



If you need extra computational power for deep learning, you can attach up to eight GPUs (4 K80 boards) to any custom Google Compute Engine virtual machine. GPUs can accelerate many types of computing and analysis, including video and image transcoding, seismic analysis, molecular modeling, genomics, computational finance, simulations, high performance data analysis, computational chemistry, fluid dynamics and visualization.






NVIDIA K80 GPU Accelerator Board



Rather than constructing a GPU cluster in your own datacenter, just add GPUs to virtual machines running in our cloud. GPUs on Google Compute Engine are attached directly to the VM, providing bare-metal performance. Each NVIDIA GPU in a K80 has 2,496 stream processors with 12 GB of GDDR5 memory. You can shape your instances for optimal performance by flexibly attaching 1, 2, 4 or 8 NVIDIA GPUs to custom machine shapes.
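To make the instance-shaping arithmetic concrete, here is a small Python sketch that totals the resources for a given attachment count. The helper is illustrative only (it is not part of any Google Cloud SDK); the per-GPU figures come from the K80 specs above.

```python
# Aggregate GPU resources for a Compute Engine instance shape.
# Per-GPU figures are from the K80 specs; the helper itself is
# illustrative, not part of any Google Cloud tooling.

VALID_GPU_COUNTS = {1, 2, 4, 8}
CORES_PER_K80_GPU = 2496      # stream processors per GPU
MEM_GB_PER_K80_GPU = 12       # GDDR5 memory per GPU, in GB

def gpu_totals(count):
    """Return (total stream processors, total GPU memory in GB)."""
    if count not in VALID_GPU_COUNTS:
        raise ValueError("GPUs must be attached in counts of 1, 2, 4 or 8")
    return count * CORES_PER_K80_GPU, count * MEM_GB_PER_K80_GPU
```

At the top end, eight attached GPUs give you 19,968 stream processors and 96 GB of GDDR5 memory on a single VM.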






Google Cloud supports as many as 8 GPUs attached to custom VMs, allowing you to optimize the performance of your applications.



These instances support popular machine learning and deep learning frameworks such as TensorFlow, Theano, Torch, MXNet and Caffe, as well as NVIDIA’s popular CUDA software for building GPU-accelerated applications.


Pricing


Like the rest of our infrastructure, the GPUs are priced competitively and are billed per minute (10 minute minimum). In the US, each K80 GPU attached to a VM is priced at $0.700 per hour per GPU and in Asia and Europe, $0.770 per hour per GPU. As always, you only pay for what you use. This frees you to spin up a large cluster of GPU machines for rapid deep learning and machine learning training with zero capital investment.
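The per-minute billing rule above is easy to sketch in code. This Python helper is illustrative only; it uses the listed hourly rates and the 10-minute minimum:

```python
# Per-minute GPU billing with a 10-minute minimum, using the listed
# K80 rates. Rates are per GPU per hour; billing granularity is minutes.

HOURLY_RATE_PER_GPU = {"us": 0.700, "europe": 0.770, "asia": 0.770}
MINIMUM_MINUTES = 10

def gpu_cost(region, gpus, minutes):
    """Cost in USD for attaching `gpus` K80 GPUs for `minutes` minutes."""
    billed = max(minutes, MINIMUM_MINUTES)  # minimum applies per use
    return billed * gpus * HOURLY_RATE_PER_GPU[region] / 60
```

For example, an hour of training on a single K80 GPU in the US comes to $0.70, while a five-minute run is still billed for the 10-minute minimum.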






Supercharge machine learning


The new Google Cloud GPUs are tightly integrated with Google Cloud Machine Learning (Cloud ML), helping you slash the time it takes to train machine learning models at scale using the TensorFlow framework. Now, instead of taking several days to train an image classifier on a large image dataset on a single machine, you can run distributed training with multiple GPU workers on Cloud ML, dramatically shortening your development cycle and letting you iterate quickly on the model.



Cloud ML is a fully managed service that provides an end-to-end training and prediction workflow, integrating with cloud computing tools such as Google Cloud Dataflow, Google BigQuery, Google Cloud Storage and Google Cloud Datalab.



Start small and train a TensorFlow model locally on a small dataset. Then, kick off a larger Cloud ML training job against a full dataset in the cloud to take advantage of the scale and performance of Google Cloud GPUs. For more on Cloud ML, please see the Quickstart guide to get started, or this document to dive into using GPUs.




Next steps


Register for Cloud NEXT, sign up for the CloudML Bootcamp and learn how to Supercharge performance using GPUs in the cloud. You can use the gcloud command-line tool to create a VM today and start experimenting with TensorFlow-accelerated machine learning. Detailed documentation is available on our website.








Here at Google Cloud, we employ a small army of developer advocates, DAs for short, who are out on the front lines at conferences, on customer premises, or on social media, explaining our technologies and communicating back to people like me and our product teams about your needs as a member of a development community.







DAs take the responsibility of advocating for developers seriously, and have spent time poring over the extensive Google Cloud Next '17 session catalog, bookmarking the talks that will benefit you. To wit:


  • If you’re a developer working in Ruby, you know to turn to Aja Hammerly for all things Ruby/Google Cloud Platform (GCP)-related. Aja’s top pick for Rubyists at Next is Google Cloud Platform <3 Ruby with Google Developer Program Engineer Remi Taylor, but there are other noteworthy mentions on her personal blog.

  • Mete Atamel is your go-to DA for all things Windows on GCP. Selfishly, his top Next session is his own session about running ASP.NET apps on GCP, but he has plenty more suggestions for you to choose from.

  • Groovy nut Guillaume Laforge is going to be one busy guy at Next, jumping between sessions about PaaS, serverless and containers, to name a few. Here’s the full list of his must-see sessions.

  • If you’re a game developer, let Mark Mandel be your guide. Besides co-presenting with Rob Whitehead, CTO of Improbable, Mark has bookmarked sessions about location-based gaming, using GPUs and game analytics. Mosey on over to his personal blog for the full list.

  • In the past year, Google Apps Script has opened the door to building amazing customizations for G Suite, our communication and collaboration platform. In this G Suite Developers blog post, Wesley Chun walks you through some of the cool Apps Script sessions, as well as sessions about App Maker and some nifty G Suite APIs. 

  • Want to attend sessions that teach you about our machine learning services? That’s where you’ll find our hands-on ML expert Sara Robinson, who in addition to recommending her favorite Next sessions, also examines her talk from last year’s event using Cloud Natural Language API. 


For my part, I’m really looking forward to Day 3, which we’re modeling after my favorite open source conferences thanks to Sarah Novotny’s leadership. We’ll have a carefully assembled set of open talks on Kubernetes, TensorFlow and Apache Beam that cover the technologies, how to contribute, the ecosystems around them and small group discussions with the developers. For a full list of keynotes, bootcamps and breakout sessions, check out the schedule and reserve your spot.









At Waze, our mission of outsmarting traffic, together, forces us to be mindful of one of our users' most precious possessions: their time. Our cloud-based service saves users time by helping them find the optimal route based on crowdsourced data.



But a cloud service is only as good as it is available. At Waze, we use multiple cloud providers to improve the resiliency of our production systems. By running an active-active architecture across Google Cloud Platform (GCP) and AWS, we’re in a better position to survive a DNS DDoS attack, a regional failure, or even the global failure of an entire cloud provider.



Sometimes, though, a bug in routing or ETA calculations makes it to production undetected, and we need the ability to roll that code back or fix it as fast as possible; velocity is key. That’s easier said than done in a multi-cloud environment. For example, our realtime routing service spans more than 80 autoscaling groups/managed instance groups across two cloud providers and multiple regions.



This is where continuous delivery helps out. Specifically, we use Spinnaker, an open source continuous delivery platform for releasing software changes with high velocity and confidence. Spinnaker has handled 100% of our production deployments for the past year, regardless of target platform.




Spinnaker and continuous delivery FTW


Large-scale cloud deployments can be complex. Fortunately, Spinnaker abstracts many of the particulars of each cloud provider, allowing our developers to focus on making Waze even better for our users instead of concerning themselves with the low-level details of multiple cloud providers. All the while, we’re able to maintain important continuous delivery concepts like canaries, immutable infrastructure and fast rollbacks.



Jenkins is a first-class citizen in Spinnaker, so once code is committed to git and Jenkins builds a package, that same package triggers the main deployment pipeline for that particular microservice. That pipeline bakes the package into an immutable machine image on multiple cloud providers in parallel and continues to run any automated testing stages. The deployment proceeds to staging using a blue/green deployment strategy, and finally to production, without having to get deep into the details of each platform. Note that Spinnaker automatically resolves the correct image IDs per cloud provider, so that each cloud’s deployment processes happen automatically and correctly without the need for any configuration.



Example of a multi-cloud pipeline




*Jenkins icon from Jenkins project.




Multi-Cloud blue/green deployment using Spinnaker

Thanks to Spinnaker, developers can focus on developing business logic, rather than becoming experts on each cloud platform. Teams can track the lifecycle of a deployment using several notification mechanisms including email, Slack and SMS, allowing them to coordinate handoffs between developer and QA teams. Support for tools like canary analysis and fast rollbacks allows developers to make informed decisions about the state of their deployment. Since Spinnaker is designed from the ground up to be a self-service tool, developers can do all of this with minimal involvement from the Ops team.



At Waze, we strive to release new features and bug fixes to our users as quickly as possible. Spinnaker allows us to do just that while also helping keep multi-cloud deployments and rollbacks simple, easy and reliable.





If this sounds like something your organization would benefit from, check out Spinnaker. And don't miss our talks at GCP Next 2017.












Google Cloud acquired Orbitera last fall, and already we’ve hit a key milestone: completing the migration of the multi-cloud commerce platform from AWS to Google Container Engine, our managed Kubernetes environment.



This is a real testament to the maturity of Kubernetes and Container Engine, which, in less than two years, has emerged as the most popular managed container orchestration service on the market, powering Google Cloud’s own services as well as customers such as Niantic with Pokémon Go.



Founded in 2012, Orbitera originally built its service on an extensive array of AWS services including EC2, S3, RDS and RedShift, and weighed the pros and cons of migrating to various Google Cloud compute platforms: Google Compute Engine, Google App Engine or Container Engine. Ultimately, migrating to Container Engine presented Orbitera with the best opportunity to modernize its service and move from a monolithic architecture running on virtual machines to a microservices-based architecture based on containers.



The resulting service allows Orbitera ISV partners to easily sell their software in the cloud, by managing the testing, provisioning and ongoing billing management of their applications across an array of public cloud providers.



Running on Container Engine is proving to be a DevOps management boon for the Orbitera service. With Container Engine, Orbitera now runs in multiple zones with three replicas in each zone, for better availability. Container Engine also makes it easy to scale component microservices up and down on demand as well as deploy to new regions or zones. Operators roll out new application builds regularly to individual Kubernetes pods, and easily roll them back if the new build behaves unexpectedly.



Meanwhile, as a native Google Cloud service, Container Engine integrates with multiple Google Cloud Platform (GCP) services such as Cloud SQL, load balancers and Google Stackdriver, whose alerts and dashboards allow Orbitera to respond to issues more quickly and more efficiently.



Going forward, running on Container Engine positions Orbitera to take advantage of more modular microservices and APIs and rapidly build out new services and capabilities for customers. That’s a win for enterprises who depend on Orbitera to provide a seamless, consistent and easy way to manage software running across multiple clouds, including Amazon Web Services and Microsoft Azure.



Stay tuned for a technical deep dive from the engineering team that performed the migration, where they'll share lessons learned, and tips and tricks for performing a successful migration to Container Engine yourself. In the meantime, if you have questions about Container Engine, you can find us on our Slack channel.

















Today, we’re excited to announce the public beta for Cloud Spanner, a globally distributed relational database service that lets customers have their cake and eat it too: ACID transactions and SQL semantics, without giving up horizontal scaling and high availability.



When building cloud applications, database administrators and developers have been forced to choose between traditional databases that guarantee transactional consistency, or NoSQL databases that offer simple, horizontal scaling and data distribution. Cloud Spanner breaks that dichotomy, offering both of these critical capabilities in a single, fully managed service.


“Cloud Spanner presents tremendous value for our customers, who are retailers, manufacturers and wholesale distributors around the world. With its ease of provisioning and scalability, it will accelerate our ability to bring cloud-based omni-channel supply chain solutions to our users around the world.” - John Sarvari, Group Vice President of Technology, JDA

JDA, a retail and supply chain software leader, has used Google Cloud Platform (GCP) as the basis of its new application development and delivery since 2015 and was an early user of Cloud Spanner. The company saw its potential to handle the explosion of data coming from new information sources such as IoT, while providing the consistency and high availability needed when using this data.



Cloud Spanner rounds out our portfolio of database services on GCP, alongside Cloud SQL, Cloud Datastore and Cloud Bigtable.



As a managed service, Cloud Spanner provides key benefits to DBAs:


  • Focus on your application logic instead of spending valuable time managing hardware and software

  • Scale out your RDBMS solutions without complex sharding or clustering

  • Gain horizontal scaling without migration from relational to NoSQL databases

  • Maintain high availability and protect against disaster without needing to engineer a complex replication and failover infrastructure

  • Gain integrated security with data-layer encryption, identity and access management and audit logging




With Cloud Spanner, your database scales up and down as needed, and you'll only pay for what you use. It features a simple pricing model that charges for compute node-hours, actual storage consumption (no pre-provisioning) and external network access.



Cloud Spanner keeps application development simple by supporting standard tools and languages in a familiar relational database environment. It’s ideal for operational workloads supported by traditional relational databases, including inventory management, financial transactions and control systems, that are outgrowing those systems. It supports distributed transactions, schemas and DDL statements, SQL queries and JDBC drivers and offers client libraries for the most popular languages, including Java, Go, Python and Node.js.




More Cloud Spanner customers share feedback


Quizlet, an online learning tool that supports more than 20 million students and teachers each month, uses MySQL as its primary database; database performance and stability are critical to the business. But with users growing at roughly 50% a year, Quizlet has been forced to scale its database many times to handle this load. By splitting tables into their own databases (vertical sharding), and moving query load to replicas, it’s been able to increase query capacity, but this technique is reaching its limits quickly, as the tables themselves are outgrowing what a single MySQL shard can support. In its search for a more scalable architecture, Quizlet discovered Cloud Spanner, which will allow it to easily scale its relational database and simplify its application:


“Based on our experience and performance testing, Cloud Spanner is the most compelling option we’ve seen to power a high-scale relational query workload. It has the performance and scalability of a NoSQL database, but can execute SQL so it’s a viable alternative to sharded MySQL. It’s an impressive technology and could dramatically simplify how we manage our databases.” - Peter Bakkum, Platform Lead, Quizlet
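The vertical sharding Quizlet describes, splitting tables into their own databases and routing queries by table name, can be sketched in a few lines of Python. The table names and connection strings below are hypothetical, purely for illustration:

```python
# A minimal sketch of vertical sharding: each table lives in its own
# database, and application code routes queries by table name.
# Shard names and DSNs here are hypothetical examples.

TABLE_TO_SHARD = {
    "users": "mysql://shard-users/app",
    "study_sets": "mysql://shard-study-sets/app",
    "terms": "mysql://shard-terms/app",
}

def shard_for(table):
    """Return the connection string for the shard holding `table`."""
    try:
        return TABLE_TO_SHARD[table]
    except KeyError:
        raise LookupError(f"no shard mapped for table {table!r}")
```

The limitation is exactly the one Quizlet hit: once a single table outgrows the one shard that holds it, this routing scheme has nothing left to split, which is what makes a horizontally scaling relational database attractive.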


The history of Spanner


For decades, developers have relied on traditional databases with a relational data model and SQL semantics to build applications that meet business needs. Meanwhile, NoSQL solutions emerged that were great for scale and fast, efficient data-processing, but they didn’t meet the need for strong consistency. Faced with these two sub-optimal choices, in 2007 a team of systems researchers and engineers at Google set out to develop a globally distributed database that could bridge this gap. In 2012, we published the Spanner research paper that described many of these innovations. The result was a database that offers the best of both worlds:















Remarkably, Cloud Spanner achieves this combination of features without violating the CAP Theorem. To understand how, read this post by the author of the CAP Theorem and Google Vice President of Infrastructure, Eric Brewer.



Over the years, we’ve battle-tested Spanner internally with hundreds of different applications and petabytes of data across data centers around the world. At Google, Spanner supports tens of millions of queries per second and runs some of our most critical services, including AdWords and Google Play.



If you have a MySQL or PostgreSQL system that's bursting at the seams, or are struggling with hand-rolled transactions on top of an eventually-consistent database, Cloud Spanner could be the solution you're looking for. Visit the Cloud Spanner page to learn more and get started building applications on our next-generation database service.





Building systems that manage globally distributed data, provide data consistency and are also highly available is really hard. The beauty of the cloud is that someone else can build that for you.







The CAP theorem says that a database can only have two of the three following desirable properties:



  • C: consistency, which implies a single value for shared data

  • A: 100% availability, for both reads and updates

  • P: tolerance to network partitions



This leads to three kinds of systems: CA, CP and AP, based on what letter you leave out. Designers are not entitled to two of the three, and many systems have zero or one of the properties.



For distributed systems over a “wide area,” it's generally viewed that partitions are inevitable, although not necessarily common. If you believe that partitions are inevitable, any distributed system must be prepared to forfeit either consistency (AP) or availability (CP), which is not a choice anyone wants to make. In fact, the original point of the CAP theorem was to get designers to take this tradeoff seriously. But there are two important caveats: First, you only need to forfeit consistency or availability during an actual partition, and even then there are many mitigations. Second, the actual theorem is about 100% availability; a more interesting discussion is about the tradeoffs involved to achieve realistic high availability.



Spanner joins Google Cloud

Today, Google is releasing Cloud Spanner for use by Google Cloud Platform (GCP) customers. Spanner is Google’s highly available, global SQL database. It manages replicated data at great scale, both in terms of size of data and volume of transactions. It assigns globally consistent real-time timestamps to every datum written to it, and clients can do globally consistent reads across the entire database without locking.



In terms of CAP, Spanner claims to be both consistent and highly available despite operating over a wide area, which many find surprising or even unlikely. The claim thus merits some discussion. Does this mean that Spanner is a CA system as defined by CAP? The short answer is “no” technically, but “yes” in effect and its users can and do assume CA.



The purist answer is “no” because partitions can happen and in fact have happened at Google, and during some partitions, Spanner chooses C and forfeits A. It is technically a CP system.



However, no system provides 100% availability, so the pragmatic question is whether or not Spanner delivers availability that is so high that most users don't worry about its outages. For example, given there are many sources of outages for an application, if Spanner is an insignificant contributor to its downtime, then users are correct to not worry about it.



In practice, we find that Spanner does meet this bar, with more than five 9s of availability (less than one failure in 10⁵). Given this, the target for multi-region Cloud Spanner will be right at five 9s, as it has some additional new pieces that will be higher risk for a while.
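To make "five 9s" concrete, a quick calculation converts an availability figure into tolerated downtime per year:

```python
# Convert an availability fraction into tolerated downtime, making
# "five 9s" concrete: 99.999% availability allows only a little over
# five minutes of downtime per year.

def downtime_minutes_per_year(availability):
    year_minutes = 365 * 24 * 60  # 525,600 minutes (non-leap year)
    return (1 - availability) * year_minutes

five_nines = downtime_minutes_per_year(0.99999)  # about 5.26 minutes/year
```

By comparison, three 9s (99.9%) permits nearly nine hours of downtime a year, which is why each additional 9 matters so much for a critical database.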



Inside Spanner 



The next question is, how is Spanner able to achieve this?



There are several factors, but the most important one is that Spanner runs on Google’s private network. Unlike most wide-area networks, and especially the public internet, Google controls the entire network and thus can ensure redundancy of hardware and paths, and can also control upgrades and operations in general. Fibers will still be cut, and equipment will fail, but the overall system remains quite robust.



It also took years of operational improvements to get to this point. For much of the last decade, Google has improved its redundancy, its fault containment and, above all, its processes for evolution. We found that the network contributed less than 10% of Spanner’s already rare outages.



Building systems that can manage data that spans the globe, provide data consistency and are also highly available is possible; it’s just really hard. The beauty of the cloud is that someone else can build that for you, and you can focus on innovation core to your service or application.



Next steps



For a significantly deeper dive into the details, see the white paper also released today. It covers Spanner, consistency and availability in depth (including new data). It also looks at the role played by Google’s TrueTime system, which provides a globally synchronized clock. We intend to release TrueTime for direct use by Cloud customers in the future.



Furthermore, look for the addition of new Cloud Spanner-related sessions at Google Cloud Next ‘17 in San Francisco next month. Register soon, because seats are limited.









Today we're excited to announce general availability and full support for Google Cloud Endpoints, a truly distributed API gateway. It features a server-local proxy (the Extensible Service Proxy) and is built on the same services that Google uses to power its own APIs. For developers building applications and microservices on Google Cloud Platform (GCP), Cloud Endpoints is the best-suited modern API gateway that helps secure and monitor their APIs.



APIs are a critical part of mobile apps, modern web applications and microservices. With the increased focus on APIs comes increased responsibility: the top features you need to take care of your API are authorization, monitoring and logging. In other words, “Help make my API safer” and “Tell me how my API is doing.” And, above all: “Help make sure it is highly performant!”





Cloud Endpoints helps you with all of that. Through integrations with Firebase and Auth0, Cloud Endpoints authenticates each call to your API, so you know who's using your mobile and web apps. Cloud Endpoints also validates service-to-service calls, helping to keep your microservices more secure. You can create API keys via the Google Cloud Console, just like we do for APIs such as the Google Translation API and the Google Maps APIs. It logs API calls to Stackdriver Logging and displays a monitoring dashboard in the console, giving you critical status information about the health and performance of your API.
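For illustration, here is a minimal sketch of the kind of per-request API-key check a gateway performs on your behalf. This stands in for, and is not, the actual Cloud Endpoints implementation; the key store below is hypothetical:

```python
# Illustrative only: a constant-time API-key check such as a gateway
# might perform per request. Cloud Endpoints handles this for you;
# the in-memory key store here is a hypothetical stand-in.
import hmac

ISSUED_KEYS = {"demo-client": "s3cr3t-key"}  # hypothetical key store

def authorized(client_id, presented_key):
    """Return True only if the presented key matches the issued one."""
    expected = ISSUED_KEYS.get(client_id)
    if expected is None:
        return False
    # compare_digest avoids leaking key contents via timing differences
    return hmac.compare_digest(expected, presented_key)
```

The point of delegating this to the gateway is that every API behind it gets the same hardened check, plus logging and monitoring, without each service re-implementing it.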



Cloud Endpoints is tightly integrated with the GCP ecosystem and is easy to use, especially when running containerized workloads. The API proxy is built into Google App Engine flexible environment and can be added to any Kubernetes or Google Container Engine deployment with a couple of lines of YAML. Deployment occurs through gcloud, and the monitoring and logging is all in Cloud Console. The proxy functionality is also built into the Endpoints API Frameworks (see below) for use on App Engine standard environment.



Cloud Endpoints is ideal for GCP customers who need a fast, scalable API gateway. Enterprise customers can also take advantage of Apigee Edge, an industry leading API management platform that we acquired this fall, for full-featured API management that works across on-premises, cloud and hybrid deployment models.



Endpoints has been in beta for three months and our early adopters are already doing amazing things with it. We’ve seen people going from initial evaluation to production in days, and scalability has been terrific: one customer’s API peaked at over 11,000 requests per second, and another customer served nearly 50 million requests in a day.


“When migrating our workloads to Google Cloud Platform, we needed to securely communicate between multiple data centres. Traditional methods like firewalls and ad hoc authentication were unsustainable, quickly leading to a jumbled mess of ACLs. Endpoints, on the other hand, gives us a standardised authentication system, backed by Google's security pedigree.” - Laurie Clark-Michalek, Infrastructure Engineer at Qubit


Frameworks


For developers working on App Engine standard environment, we're also announcing general availability and full support for our Java and Python frameworks. The Endpoints Frameworks are available to help developers quickly get started serving an API from App Engine. Think of the Endpoints Frameworks as lightweight alternatives to Python Flask or Java Jersey. The Endpoints Frameworks come with built-in integration to the Google Service Control API, meaning that back-ends built with the Endpoints Frameworks do not need to run behind the Extensible Service Proxy.




Pricing


We want everyone to be able to benefit from Cloud Endpoints, so we created a free tier: your first two million API calls per month are available at no charge. Once your service is popular enough to go beyond the free tier, the pricing is a simple $3.00 per million requests.
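Under this model, a monthly bill is simple to compute; the following sketch just applies the two numbers above:

```python
# Monthly Cloud Endpoints bill under the pricing described above:
# the first two million calls are free, then $3.00 per million requests.

FREE_CALLS = 2_000_000
RATE_PER_MILLION = 3.00

def monthly_cost(calls):
    """Cost in USD for `calls` API requests in one month."""
    billable = max(0, calls - FREE_CALLS)
    return billable / 1_000_000 * RATE_PER_MILLION
```

So a service handling five million requests a month would pay $9.00: the first two million are free, and the remaining three million are billed at $3.00 per million.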



The Endpoints Frameworks can optionally expose all of the features of the Cloud Endpoints API gateway. The Endpoints Frameworks can be used at no cost, but the Endpoints API gateway features turned on with the Frameworks (API keys, monitoring and logging) are charged at the standard rate.



Go ahead, read the documentation, try our walkthroughs for App Engine (standard or flexible environment), Container Engine or Compute Engine and join our Google Cloud Endpoints Google Group. To hear more about Google’s vision on comprehensive API technologies for enterprise customers and developers, including more about Apigee Edge, please join us next month at Google Cloud Next '17 in San Francisco.