A few weeks ago we announced PerfKit to make it easy for you to benchmark popular workloads on the cloud. As we mentioned, it’s a living benchmark, and we are evolving it to include a new tool to measure the impact on latency as you grow the number of servers that power your application.



We call the new performance benchmark Online Data Intensive Simulator, or OLDISIM, written in collaboration with the Multiscale Architecture and Systems Team (MAST) at Stanford. It models the distributed, fan-out nature of many modern applications with tight tail latency requirements, such as Google Search and some NoSQL database applications.



We use OLDISIM internally to measure the impact of both hardware and software improvements on our scale out workloads and to analyze their scaling efficiency. Scale out efficiency allows us to meet new user demand by adding the fewest servers possible while maintaining a great user experience. The fewer servers we add, the more energy efficient we are, and the cheaper the solution is. Predicting how a service will scale out is usually very hard under laboratory conditions, but experiments show that OLDISIM results strongly correlate with the scaling efficiency of Google Search in production, as the chart below demonstrates.



Our needs within Google are similar in many ways to other scale out Internet workloads, so we're making a version of OLDISIM available to the open source community through PerfKit Benchmarker, under the Apache v2 license. With OLDISIM, you can more easily model and simulate most applications that follow a fan-out/synthesis pattern, including Hadoop and several NoSQL products. You can specify which workload to plug in to each leaf node, then measure the scaling efficiency and tail latency of your application.
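To see why fan-out makes tail latency hard, here is a toy Python simulation (all numbers and distributions are invented for illustration; this is not OLDISIM code): the root waits for the slowest of its leaf nodes, so a rare straggler on any leaf dominates the request's latency as fan-out grows.

```python
import random

def leaf_latency(rng):
    # Hypothetical leaf behavior: usually fast, occasionally a slow straggler.
    return rng.uniform(1.0, 2.0) if rng.random() < 0.99 else rng.uniform(20.0, 30.0)

def request_latency(rng, fanout):
    # A fan-out request completes only when the slowest leaf responds.
    return max(leaf_latency(rng) for _ in range(fanout))

def p99(samples):
    return sorted(samples)[int(len(samples) * 0.99)]

rng = random.Random(42)
for fanout in (1, 16, 64):
    samples = [request_latency(rng, fanout) for _ in range(2000)]
    print("fanout=%d p99=%.1f" % (fanout, p99(samples)))
```

With a 1% chance of a slow leaf, a 64-way fan-out hits at least one straggler on roughly half of all requests, which is why scaling efficiency and tail latency have to be measured together.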



You can run OLDISIM by itself by following the instructions on GitHub, or use PerfKit Benchmarker to run it on many of the most popular cloud providers. The command line is as simple as “pkb.py --benchmarks=oldisim”.



Both the OLDISIM and PerfKit Benchmarker teams take feedback through GitHub. We’d love to hear what you think, so please send us your suggestions and issue reports.



Happy Benchmarking!



Posted by Ivan Santa Maria Filho on behalf of the Cloud and Platforms Performance Teams

Imagine being away from your desk and receiving automatic alerts when an issue occurs in your Google App Engine app. Or waiting at the airport and stopping your test VMs before leaving for vacation. With the beta launch of Cloud Console for Android, managing Google Cloud Platform from your phone or tablet is possible (and yes, an iOS version is in the works).



With just a few taps you can quickly glance at the status of your solution, set up alerts, manage your Cloud Platform resources and access Google Cloud Monitoring performance and health graphs.



Get it now from the Google Play Store.




Quickly view app status


Want to determine the status of your solution with just a quick glance? Customize the home page with your personal selection of monitoring graphs, a billing estimate or Cloud Platform service status information.








Get alerts and manage incidents


Want to be told when something goes wrong, for example when your Google Compute Engine instances rise above their expected load of 50% CPU for one hour? Cloud Console for Android integrates with Cloud Monitoring, enabling automated incident tracking when system metrics deviate. You can configure alerts to display directly in the Android notification drawer, and you can comment so that your team knows you’re working on the issue.
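The threshold-plus-duration rule described above (sustained load over 50% CPU for an hour) can be sketched in a few lines of Python; this is a hypothetical illustration of the logic, not the Cloud Monitoring implementation:

```python
def sustained_breach(samples, threshold, min_consecutive):
    """True if `samples` stays above `threshold` for at least
    `min_consecutive` consecutive readings."""
    run = 0
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False

# Readings every 5 minutes: 12 consecutive breaches means an hour above 50% CPU.
cpu = [40, 55, 60] + [70] * 12 + [30]
print(sustained_breach(cpu, threshold=50, min_consecutive=12))  # True
```

Requiring the breach to persist for a minimum duration is what keeps short CPU spikes from paging you unnecessarily.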






View App Engine and Compute Engine properties and make quick changes


When investigating an issue, you often need to check the health and properties of your resources, such as running state, zone or IP. The app supports viewing details and monitoring graphs for App Engine and Compute Engine instances. You can also invoke a number of core operations, such as changing the App Engine version or starting/stopping a Compute Engine instance.




Get started now


The app is available in the Google Play Store; just search for Cloud Console. For tips on how to configure the app, take a look at this quick guide.




Feedback wanted


Cloud Console for Android is currently in beta, and an iOS app is expected to launch later this year. Over the coming months, we’ll continue to add new features and resolve issues. To influence our future work, please send your feedback, ideas and suggestions to android-cloud-console@google.com!



- Posted by Michael Thomsen, Product Manager

Google App Engine is a great place to run your applications, but for some workloads you may want more fine-grained control of the environment your app runs in. You may need to fine-tune how or when scaling occurs, customize the load balancer or code in a language that Google App Engine doesn’t support.



Today we’re excited to introduce a solution paper and tutorial for Scalable and Resilient Web Applications to help you – you guessed it – build a scalable and resilient web application on Google Cloud Platform. The solution includes a technical paper that discusses the application architecture and key design decisions as well as a functional, open source application and tutorial hosted on GitHub that you can deploy or even use as a starting point for your own applications.



You may have read our previous post about Google Compute Engine Load Balancer easily handling 1,000,000 requests per second or watched the live demo where Compute Engine Autoscaler added enough instances to handle over 1,500,000 requests per second and wondered, how exactly did they do that?



The sample implementation uses Cloud Deployment Manager to provision a load balancer as well as multi-zone auto-scaled Compute Engine instances to serve the Redmine project management web app. The architecture uses Google Cloud SQL and Google Cloud Storage to reliably and scalably store the app’s data. Here’s an overview of the complete architecture:







You’ll also learn how to use Chef and Google Compute Engine startup-scripts to configure and install software on instances at boot time. There’s a lot of technical content we think you’ll find useful – check out the article, then head over to the GitHub project page (where you’ll also find the tutorial and can ask questions or make suggestions in the issues section) and start building more scalable and resilient apps.



-Posted by Evan Brown, Solutions Architect

One of the most compelling benefits of building and deploying solutions on public cloud platforms is the speed at which you can move from idea to running applications. We offer you a continuum of compute options – from high performance VMs and container-based services to managed PaaS – so you can choose the most suitable option.



For those of you who need a VM-based solution, deploying an application requires that all underlying runtime components and packages be in place and configured correctly. This often becomes a labor-intensive, time-consuming task. Developers should spend most of their time on design and writing code. Time spent finding and deploying libraries, fixing dependencies, resolving versioning issues and configuring tooling is time away from that work.



Today, we're introducing Google Cloud Launcher, where you can launch more than 120 popular open source packages that have been configured by Bitnami or Google Click to Deploy. Deployment is incredibly straightforward: users simply select a package from the library, specify a few parameters and the package is up and running in a few clicks. Cloud Launcher is designed to make developers more efficient, removing operational deployment and configuration tasks so developers can focus on what matters – their application and their users.



Cloud Launcher includes developer tools and stacks such as Apache Solr, Django, GitLab, Jenkins, LAMP, Node.js, Ruby on Rails, and Tomcat. It also includes popular databases like MongoDB, MySQL, PostgreSQL and popular applications like WordPress, Drupal, JasperReports, Joomla and SugarCRM. Many of these packages have been specifically built and performance-tuned for Google Cloud Platform, and we’re actively working to ensure these packages are well integrated with Google Cloud Monitoring so you can review health and performance metrics, create custom dashboards and set alerts for your cloud infrastructure and software packages in one place. This will roll out to all supported packages on Cloud Launcher this spring.



When you visit Cloud Launcher, you can search for your desired package, or filter and browse categories such as Database, CRM or CMS.







“We are excited to partner with Google to simplify the deployment and configuration of servers and applications and look forward to continue to expand our integration with Google Compute Engine. Delivering an exceptional user experience is important to us, and Compute Engine gives Bitnami users another great way to deploy their favorite app in just few clicks,” said Erica Brescia, COO at Bitnami Inc.



You can get started with Cloud Launcher today to launch your favorite software package on Google Cloud Platform in a matter of minutes. And do remember to give us feedback via the links in Cloud Launcher or join our mailing list for updates and discussions. Enjoy building!



-Posted by Varun Talwar, Product Manager, Google Cloud Platform

Currently, the way that doctors and clinicians approach medical treatment is to look at a patient’s symptoms, determine a prognosis, and assign the appropriate treatment. While sensible, this reactive approach leaves a lot open for interpretation and may not home in on critical clues such as predisposition to genetic mutation or length of time an illness lingered before symptoms appeared. With added insights about genetic makeup, environment, socioeconomic factors and family medical history, doctors and clinicians gain the ability to better tailor and individualize medical treatment.



Doctors need new technologies in order to provide this individualized care. Researchers devoted to personalized medicine can now use big data tools to analyze clinical records, genomic sequences, and laboratory data. All of this valuable data may reveal how differences in an individual’s genetics, lifestyle, and environment influence reactions to disease. And ultimately, it may show us that customized treatments can improve outcomes. To get there, we first need to overcome the challenge of data inundation. Vast health datasets create significant impediments to storage, computation, analysis, and data visualization. The raw information for a single human genome is over 100 GB spanning over 20,000 genes, and the doctors’ handwritten notes are hard for computers (and people) to make sense of. There just aren’t enough tools and data scientists available to leverage large scale health data.



At Northrop Grumman, we’ve prototyped a personalized health analytics platform, using Google Cloud Platform and Google Genomics, to improve knowledge extraction from health data and facilitate personalized medicine research. With our personalized health analytics platform, a genomics researcher would be able to evaluate diseases across a set of patients with genomic and health information. In the past, a simple question about which genetic variants are linked to a medical condition might take hours, or even days, to answer. By leveraging Google Cloud Platform, in combination with our own algorithms, the analysis of 1,000 patients’ genomic data, across 218 diseases, generates near real-time results.



Northrop Grumman’s analytics platform would provide multiple benefits to researchers. With Google Genomics and Google BigQuery, terabytes of genomics information can be analyzed in only a few seconds, so researchers would see faster research results. This increase in the speed of discovery deepens our understanding of how genetic variations contribute to health and disease. In addition, the scalable storage and analysis tools provided by Google Cloud Platform and Google Genomics reduce costs and increase security when compared against in-house IT systems. And lastly, our platform aims to improve patient health by expanding the knowledge base for personalized medicine with discovery of complex hidden patterns across long time periods and among large study populations.




The Architecture


To make personalized medicine research easier, we architected our health analytics platform in layers. Here they are starting from the base layer, progressing upward:




  1. Massive Data Storage: A storage layer leverages Google Genomics to efficiently store and access genomic data on the petabyte scale and Northrop Grumman knowledge engines and framework to efficiently process and store electronic health records (EHR) data.

  2. Annotation Layer: The annotation layer provides tools to extract clinical knowledge from structured and unstructured EHR data sources. It also includes a database containing aggregated phenotypic and disease associations from public sources. These enable improved functional annotation of the genomic data.

  3. Analytics Layer: The analytics layer is built on top of Google BigQuery and Google Compute Engine to provide high-performance modeling and analytics tools. With these, we can demonstrate genomic risk modeling with analysis time scales of only several seconds.

  4. Visualization & Collaboration Layer: The visualization and collaboration layer provides a framework for high-level analytics, visualization, and collaboration tools.





The system architecture for Northrop Grumman’s personalized health analytics platform. A layered approach is designed to provide an integrated research environment with greater access to storage infrastructure, improved information extraction and annotation tools, more powerful computational platforms and improved collaboration and visualization tools. 






New Breakthroughs in Personalized Medicine


Today our personalized health analytics platform is a prototype, but the results are promising. Our health analytics platform may improve a researcher’s speed of discovery, lower the costs of storing massive amounts of health data, offer better security than in-house IT systems and ultimately lead to breakthroughs in personalized medicine and treatment. If you're interested in learning more, please email Northrop Grumman at PHC@ngc.com.



- Posted by Leon Li, Future Technical Leader and Systems Engineer at Northrop Grumman Corporation

More and more organizations have learned, through experimentation, how much latent value exists in large scale data and how it can be unearthed via parallelized data processing. Bringing these practices into production requires faster, easier and more reliable data processing pipelines.



Google Cloud Dataflow is designed to meet these requirements. It’s a fully managed, highly scalable, strongly consistent processing service for both batch and stream processing. It merges batch and stream into a unified programming model that offers programming simplicity, powerful semantics and operational robustness. The first two of these benefits are properties of the Dataflow programming model itself, which Google has released as an open source SDK and which is not tied to running on Google Cloud Platform.
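The heart of the unified model is that a transform is written once and applied to bounded or unbounded input alike. The toy Python generator below illustrates only the idea; it is not the Dataflow SDK:

```python
import itertools

def running_totals(events):
    # One transform for both worlds: `events` may be a finite list (batch)
    # or an endless generator (stream); results are emitted incrementally.
    total = 0
    for value in events:
        total += value
        yield total

# Batch: drain the bounded input and keep the final answer.
print(list(running_totals([3, 1, 4]))[-1])  # 8

# Stream: consume an unbounded source one element at a time.
stream = running_totals(itertools.count(1))
print(next(stream), next(stream), next(stream))  # 1 3 6
```

The same pipeline code producing a final result over batch data and a continuously updated result over streaming data is what the Dataflow model generalizes, with windowing and triggers deciding when results are emitted.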



Today, we’re announcing another deployment option for your Dataflow processing pipelines. The team behind the fast-growing Apache Flink project has released a Cloud Dataflow runner for Flink, allowing any Dataflow program to execute on a Flink cluster. Apache Flink is a new Apache Top-Level project that offers APIs and a distributed processing engine for batch and stream data processing.



By running on Flink, Dataflow pipelines benefit not only from the power of the Dataflow programming model, but also from the portability, performance and flexibility of the Flink runtime. It provides a robust execution engine with custom memory management and a cost-based optimizer. And best of all, you have the assurance that your Dataflow pipelines are portable beyond Google Cloud Dataflow: via the Flink runner, your pipelines can execute both on-premise (virtualized or bare-metal) or in the cloud (on VMs).



This brings the number of production-ready deployment runtimes for your Dataflow pipelines to three and gives you the flexibility to choose the right platform and the right runtime for your jobs, and keep your options open as the big data landscape continues to evolve. Available Dataflow runners include:




  • The fully managed Google Cloud Dataflow service on Google Cloud Platform

  • The Apache Spark runner, made available by Cloudera

  • The Apache Flink runner, made available by data Artisans

For more information, see the blog post by data Artisans, who created the Google Cloud Dataflow runner for Flink.



We’re thrilled by the growth of deployment options for the portable Dataflow programming model. No matter where you deploy your Dataflow jobs, join us using the “google-cloud-dataflow” tag on StackOverflow and let us know if you have any questions.



-Posted by William Vambenepe, Product Manager


Your new website is growing exponentially. After a few rounds of high fives, you start scaling to meet this unexpected demand. While you can always add more front-end servers, eventually your database becomes a bottleneck, which leads you to . . .




  • Add more replicas for better read throughput and data durability

  • Introduce sharding to scale your write throughput and let your data set grow beyond a single machine

  • Create separate replica pools for batch jobs and backups, to isolate them from live traffic

  • Clone the whole deployment into multiple datacenters worldwide for disaster recovery and lower latency




At YouTube, we went on that journey as we scaled our MySQL deployment, which today handles the metadata for billions of daily video views and 300 hours of new video uploads per minute. To do this, we developed the Vitess platform, which addresses scaling challenges while hiding the associated complexity from the application layer.
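The essence of sharding is deterministic routing: hash each row's sharding key and map the hash into a key range owned by one shard. The Python sketch below is a hypothetical illustration of Vitess-style key ranges; Vitess itself routes through its own vindex functions, not this MD5 scheme:

```python
import hashlib

# Hypothetical key ranges named by their hex bounds, as in Vitess shard names.
SHARDS = [("-40", 0x40), ("40-80", 0x80), ("80-c0", 0xC0), ("c0-", 0x100)]

def pick_shard(user_id):
    # Hash the sharding key so rows spread evenly, then route by range.
    first_byte = hashlib.md5(str(user_id).encode()).digest()[0]
    for name, upper in SHARDS:
        if first_byte < upper:
            return name

for uid in (17, 42, 1001):
    print(uid, "->", pick_shard(uid))
```

Because routing depends only on the key, any front-end can compute the destination shard, and resharding becomes a matter of splitting key ranges rather than rewriting the application.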



Vitess is available as an open-source project and runs best in a containerized environment. With Kubernetes and Google Container Engine as your container cluster manager, it's now a lot easier to get started. We’ve created a single deployment configuration for Vitess that works on any platform that Kubernetes supports.



In addition to being easy to deploy in a container cluster, Vitess also takes full advantage of the benefits offered by a container cluster manager, in particular:




  • Horizontal scaling – add capacity by launching additional nodes rather than making one huge node

  • Dynamic placement – let the cluster manager schedule Vitess containers wherever it wants

  • Declarative specification – describe your desired end state, and let the cluster manager create it

  • Self-healing components – recover automatically from machine failures




In this environment, Vitess provides a MySQL storage layer with improved durability, scalability, and manageability.



We're just getting started with this integration, but you can already run Vitess on Kubernetes yourself. For more on Vitess, check out our website, ask questions on our forum, or join us on GitHub. In particular, take a look at our overview to understand the trade-offs of Vitess versus NoSQL solutions and fully-managed MySQL solutions like Google Cloud SQL.



-Posted by Anthony Yeh, Software Engineer, YouTube

Businesses generate a staggering amount of log data that contains rich information on systems, applications, user requests, and administrative actions. When managed effectively, this treasure trove of data can help you investigate and debug system issues, gain operational and business insights and meet security and compliance needs.



But log management is challenging. You need to manage very high volumes of streaming data, provision resources to handle peak loads, scale fast and efficiently and have the capability to analyze data in real-time.



Starting today, Google Cloud Logging is available in beta to help you manage all of your Google Compute Engine and Google App Engine logs in one place, and collect, view, analyze and export them. By combining Google Cloud Monitoring with Cloud Logging, you gain a powerful set of tools for managing operations and increasing business insights.



The Cloud Logging service allows you to:




  • Ingest and view the log data, so that you can see all your logs in one place

  • Search the log data in real-time, so that you can resolve operational issues

  • Analyze the log data in real-time, so that you can glean actionable insights

  • Archive logs data for longer periods, to meet backup and compliance requirements




Several customers are already using the features for logs viewing and analysis. Here’s what Wix has to say about Cloud Logging.


At Wix we use BigQuery to analyze logs of Compute Engine auto-scaled deployments. We get a large volume of syslog data that we send to BigQuery to get insights on system health state and error rates. We generate time series data and integrate it with Google Cloud Monitoring to monitor system performance and business metrics. This provides us with essential insight for the running of our operations.  - Dmitry Shestak, Engineer@Infrastructure team, Wix


Ingest and view the log data


We understand that it’s important for you to keep all your logs in one place so that you can easily analyze and correlate the data. Cloud Logging solves this problem in several ways:




  • Compute Engine VM logs can be automatically collected for about two dozen log types through the Google packaged fluentd agent, with additional logs possible through custom configuration.

  • Compute Engine Activity logs, which record all system actions and API calls, are enabled by default, with no agent installation required.

  • App Engine logs that include syslog, request logs and application logs are automatically enabled for all App Engine projects, including applications using Managed VM runtimes.




You can view the logs in the Logs Viewer (shown below) in the Google Developers Console by clicking on the “Logs” link under “Monitoring.”




When viewing logs in the Logs Viewer, you can filter results using filter text or drop-downs




Search the log data in real-time


The Logs Viewer lets you quickly investigate and debug issues, correlate logs between different services and find the root cause of an outage. You can filter logs using the drop-down menu and the filter bar, stream logs in real-time ("tail -f") and navigate through your log timeline without awkward next/previous page buttons.



Here’s an example that shows how you can filter Compute Engine logs to see only Compute Engine “Firewall” service logs, pick a particular firewall resource to see the logs and do this for a particular log level.




A filtered view of logs data using the Logs Viewer






Analyze the log data in real-time


Many scenarios will require complex querying of the logs data in real-time. Cloud Logging allows you to easily stream logs to Google BigQuery as they arrive, letting you search, aggregate and view your data using SQL-like queries. To learn how to configure BigQuery export, visit the Exports tab of the Logs Viewer, or see the detailed documentation.



Once you enable BigQuery export, you can stream logs to BigQuery in real-time, and view them there in seconds.




Log data in the BigQuery tables

Let’s explore a couple of examples of how this data and the analysis capability can be really useful to you.




  • Monitoring Code Performance: Some log entries indicate that something unexpected happened or that a problem is imminent, e.g. “disk space low.” With Compute Engine log data in BigQuery, you can generate a time series and monitor logs with a particular severity. It’s simple: just query metadata.severity = “WARNING” in the relevant tables. E.g.


     SELECT COUNT(*) AS total, DATE(metadata.timestamp) AS time
     FROM (TABLE_DATE_RANGE(TABLE ID, TIMESTAMP('2015-03-01'), TIMESTAMP('2015-03-12')))
     WHERE metadata.severity = 'WARNING'
     GROUP BY time
     ORDER BY total;

     
  • Monitoring Request Latency: High latency leads to poor user experience and failed requests, which can lead to frustrated users and lost revenue. With App Engine log data in BigQuery, you can create time series of latency data by aggregating and charting the “protoPayload.latency” field. You can see unusual latencies in real-time and take steps to resolve the issue.
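Either aggregation is easy to prototype locally before writing the BigQuery version. The Python sketch below groups hypothetical parsed log records (all field names and values are invented) into per-minute latency buckets:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical parsed records: (minute bucket, request latency in ms).
records = [("12:00", 80), ("12:00", 95), ("12:00", 900),
           ("12:01", 70), ("12:01", 85), ("12:01", 90)]

def latency_by_minute(rows):
    buckets = defaultdict(list)
    for minute, latency in rows:
        buckets[minute].append(latency)
    # Mean and worst case per bucket; a spike in max flags unusual latency.
    return {m: (mean(v), max(v)) for m, v in buckets.items()}

for minute, (avg, worst) in sorted(latency_by_minute(records).items()):
    print(minute, "mean=%.0f max=%d" % (avg, worst))
```

The BigQuery equivalent replaces the dictionary with a GROUP BY over the exported log tables, but the shape of the analysis is the same.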






Archive logs data for longer periods


Cloud Logging retains logs in the Logs Viewer for 30 days, but in some scenarios you need to store log data for a longer period. With the click of a button, you can configure export to Google Cloud Storage. From there, you can move the data into BigQuery, Google Cloud Dataflow or any Hadoop solution for further processing and analysis. This makes it easier to meet your business or compliance requirements. And with the recent launch of Google Cloud Storage Nearline, long-term log storage becomes even more affordable.




Getting Started


If you’re a current Google Cloud Platform user, Cloud Logging is available to you at no additional charge. Applicable charges for using Google Cloud Platform services (such as BigQuery and Cloud Storage) will still apply. For more information, visit the Cloud Logging documentation page and share your feedback.



- Posted by Deepak Tiwari, Product Manager

Today, we’re making it even easier to deploy Open Source Puppet on Google Compute Engine with Click to Deploy. Now you can quickly set up a Puppet master configured with node_gce and gce_compute modules to provision and manage resources on Compute Engine.



Whether you’re managing one virtual machine or thousands, Puppet can help make system configuration easier. Puppet is a declarative language for expressing system configuration, coupled with an agent/master framework for distributing and enforcing the configuration. In a typical web server deployment, for example, you can define how you’d like Apache configured, and then deploy it to multiple virtual machines easily. Puppet is used by more than 22,000 companies around the world and Puppet Forge has more than 3,100 modules for provisioning and managing a wide variety of system resources.



"We're excited to see Google recognize the benefits and pervasiveness of Puppet, making it the first IT automation tool available as a Click to Deploy solution. This solution lowers the time required to get a functional Puppet master up and running and is the first step toward fully automating the management of projects in Google Compute Engine," said Nigel Kersten, CIO of Puppet Labs.



Learn more about running Open Source Puppet on Google Compute Engine and deploy a Puppet master today. Please feel free to let us know what you think about this feature. You can also contact Puppet Labs for professional services, premium support, or training. Deploy away!



-Posted by Pratul Dublish, Technical Program Manager

Today we hear from Tute Genomics on how they're using cloud-based technology and big data tools to support the scientific community and help advance genomics research. Tute has made its 8.5 billion record genetic annotation database publicly available to users of Google Genomics through Google BigQuery.



Human genome sequencing is fast becoming standard practice for both clinicians and researchers now that the cost to read all 3 billion letters of a person’s DNA has dropped to just a few thousand dollars. But what does all that information mean for medicine and health?



At Tute Genomics we’re answering that question by building a comprehensive database of all known genetic variants and what they mean for disease risk, drug response, and basic research. Because the database contains 9 billion records, it’s been a challenge to work with it on a local computer or servers. That’s why we were excited to discover Google BigQuery.



With BigQuery, scientists can run sophisticated queries against the Tute database to link an individual’s genome to the wealth of information about genetic variants in general. The background of information can even include large public datasets like the 1000 Genomes Project, which is already hosted on Google Cloud Platform.



With so much data to analyze, data analysis tools like BigQuery are essential. Running queries using standard computers or VMs takes significantly longer. For example, we're able to rapidly count variants from the 17 Platinum Genomes by function. Even with 88GB of input data, we're able to see results in 30 seconds for less than $1, whereas it would have taken many minutes or even hours without BigQuery.



Our initial database version includes annotations on 8.5 billion genetic variants. Sources include clinical annotations from ClinVar and GWAS catalog, population frequencies from the 1000 Genomes Project, gene and transcript model annotations – such as amino acid and protein substitutions – and the functional consequence of exonic variants. Additionally, the database includes conservation scores, evolutionary scores, and predictions of whether genetic variants are likely to be associated with Mendelian phenotypes.



As genome sequencing becomes a more common part of clinical care as well as basic research, accurate and comprehensive genetic variant databases will be essential to help make sense of genetic information. We find that detailed annotations of genetic variants are a natural match for big data processing with Google BigQuery. We believe in this so strongly that we’ve donated an unprecedented database to the genomics community, made available through Google Cloud Platform.



Reach out to us on the Tute Genomics discussion group with any specific questions.



Posted by Bryce Daines, Reid Robison, Chris London, Brendon Beebe, David Mittelman, and Kai Wang of Tute Genomics

Today’s guest blog comes from Brett Renfer, Director of Experience Design, at COLLINS, a New York City-based brand consultancy that uses design thinking and convergence of design, music, film, and technology to create unforgettable brand experiences.



When we discussed plans for Azealia Banks’ new video, “Wallace,” we knew we had to do something as innovative as she is – something that allowed her to make a big splash and give her fans, eager to see her new work, an experience that only Azealia could introduce. Believing in the power of converging music, film, technology and design, we challenged ourselves to create a first-of-its-kind interactive music video for Azealia. The video would bring Azealia’s fans on stage with her by empowering the video version of Azealia to mirror their movements in real-time. We knew that if we could pull this off, fans could watch the video from anywhere and feel like Azealia stood directly in front of them, singing not just to them, but with them. Our Executive Creative Director, Lee Maschmeyer, asked us to push beyond what we now think is possible, and we did.



We watched dozens of music videos as we dove into development, and could not find another music video that manipulates actual video files and pixels in the way we aimed to. We built a unique app that combined WebGL, HTML5 video, and web camera interaction. By building the app on Google Cloud Platform, we were able to focus on creating this unique design and experience rather than on engineering and optimization, and could do so quickly with a very small team.



The COLLINS development team consisted of two developers, one in New York and one in Texas. We started prototyping the app in Chrome, and then pulled everything into Google App Engine to continue prototyping on local web servers. This allowed us to test locally, quickly deploy live, and share with collaborators around the globe. We built the entire app in just two months. Given Azealia’s huge fan base, we anticipated that we’d need to scale quickly, and App Engine let us do that without any additional work on our end.



We learned that we could handle really high-resolution video and high quality sound in WebGL in Chrome, but needed some seriously powerful content hosting that provided a high level of configurability. Google Cloud Storage turned out to be the perfect fit; by using Cloud Storage, we could deliver high-quality content out to all users – including those who used up to 300 MB bandwidth – and load and manipulate that content in WebGL.



Azealia’s creative input throughout the process was invaluable. She’s a huge fan of technology and looks for ways to interact and engage more closely with her fans. She worked closely with us from start to finish to make sure that the video experience embodies her unique artistry.



As a creative company, we’re known for powerful storytelling, and our team did a remarkable job finding the emotional heart of Azealia’s song and leveraging Google technology to make it uncommonly real and responsive.



See the full story behind how our small team with a big idea brings Azealia closer to her fans.




Do you have a hungry application, one that has an insatiable appetite for compute power? Great, because we have a solution to satisfy that appetite.



Today, we're making the beta of our new 32-vCPU virtual machines available for all of your compute-heavy apps. You can choose from three new machine types, each with 32 vCPUs and memory ranging from ~30GB to over 200GB.



The table below shows the configuration and pricing details, which demonstrate our belief that cloud pricing should track to Moore's Law.



It’s been fascinating to watch you, our customers, build amazing things on Google Cloud Platform. The landscape of applications often includes some very compute- and memory-intensive components. From huge MySQL instances behind popular mobile applications and games, to the visual effects rendering software used in the production of most movies today, some components of your cloud application just run better with more compute or more memory (often both).



Fredrik Averpil, Technical Director at Industriromantik, a digital production company and early tester of our 32-vCPU machine types, said: "We get a solid 97-98% speedup from the 16-vCPU machine types on all jobs so far. This is the best efficiency in scalability I've ever seen when talking about 3D rendering."
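A back-of-the-envelope way to interpret that quote: scaling efficiency is the measured speedup divided by the ideal linear speedup from the added cores. This small sketch uses the ~1.97x figure from the quote above; the definition itself is standard, not specific to Compute Engine:

```python
def scaling_efficiency(speedup, resource_ratio):
    """Measured speedup divided by the ideal (linear) speedup."""
    return speedup / resource_ratio

# A ~1.97x speedup when doubling from 16 to 32 vCPUs:
eff = scaling_efficiency(1.97, 32 / 16)
print(round(eff, 3))  # 0.985, i.e. ~98.5% of perfect linear scaling
```

An efficiency near 1.0 means the workload parallelizes almost perfectly across the extra cores, which is unusual for 3D rendering at this scale.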



To get started, visit the Developers Console and create a new instance with one of these 32-vCPU machine types, or check out our docs for instructions on creating new virtual machines using our gcloud command line tool. Please note that our 32-vCPU machine types are only available in our Ivy Bridge and Haswell zones.



Test out your most CPU- and RAM-intensive workloads, and let us know how they work for you. We’d love to have you fire up one of these new machines, open the throttle, and see how expanded memory and compute power give greater steam to your applications. You can contact us via our feedback channels.



- Posted by Scott Van Woudenberg, Product Manager for Google Compute Engine

The amount of data being produced around the world is staggering and continues to grow at an exponential rate. Given this growing volume of data, it's critical that you store it in the right way: keeping frequently accessed data easily accessible, keeping cold data available when needed, and making it easy to move between the two. Organizations can no longer afford to throw data away, as it’s critical to conducting analysis and gaining market intelligence. But they also can’t afford to overpay for growing volumes of storage.



Today, we're excited to introduce Google Cloud Storage Nearline, a simple, low-cost, fast-response storage service with quick data backup, retrieval and access. Many of you operate a tiered data storage and archival process, in which data moves from expensive online storage to offline cold storage. We know the value of having access to all of your data on demand, so Nearline enables you to easily backup and store limitless amounts of data at a very low cost and access it at any time in a matter of seconds.



Unlike traditional solutions, Nearline provides unmatched price and performance for long-term storage:




  • Fast Performance: all the benefits of cold storage while making the data immediately available. Unlike its competitors, Nearline enables ~3 second response times for data retrieval and improves SLAs.

  • Low-cost: capacity pricing is extremely low at 1c per GB for data at rest.

  • Security: redundant storage at multiple physical locations protects data. OAuth and granular access controls form strong, configurable security.

  • Integrated: fully integrated with other Google Cloud Storage services, providing a consistent method of access across the entire Google Cloud Storage service line.

  • Simple: no need to adopt new programming models – data manipulation behavior remains the same across Google Cloud Storage services.
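To put the 1c per GB figure in perspective, here is a back-of-the-envelope sketch of the at-rest cost; retrieval and per-operation charges, which are billed separately, are deliberately ignored:

```python
def nearline_monthly_cost(gb_stored, usd_per_gb=0.01):
    """Monthly at-rest cost at Nearline's 1 cent/GB price.

    Ignores retrieval and per-operation charges, which are billed
    separately.
    """
    return gb_stored * usd_per_gb

# 100 TB of backups kept in Nearline:
print(f"${nearline_monthly_cost(100 * 1024):,.2f}/month")  # $1,024.00/month
```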




We know that changing your storage platforms and processes isn’t easy. This is why we’re working with some of the leading backup and storage providers to make adopting Nearline a seamless experience. Starting today, these partners, and others across the ecosystem, are available to help you take a new approach to data storage in your own environment:




  • Veritas/Symantec: NetBackup is the market leader in enterprise backup software and will support Google Cloud Storage Nearline in version 7.7. NetBackup integration with Google Cloud Storage Nearline will allow enterprises to increase business agility and information availability. Customers can seamlessly manage the lifecycle of their backups and keep a central catalog and recovery point for all their protected information, regardless of where it's stored.

  • NetApp: SteelStore is an on-premises appliance that de-duplicates, encrypts, and compresses data before rapidly streaming it to Google Cloud Storage Nearline. SteelStore reduces data volumes by up to 30x and speeds data transport times by 400%—making onramp of your data to the cloud fast and easy. SteelStore currently supports Google Cloud Storage and will support Nearline in the second half of 2015.

  • Iron Mountain: We’re working with Iron Mountain to design an on-ramp to the cloud via an offline ingestion service that builds on Iron Mountain's security, logistics and data management capabilities. If you have massive amounts of data and limited network connectivity, you’ll be able to simply package up and send your disks to Iron Mountain, where they’ll be uploaded directly into Google Cloud Storage Nearline.

  • Geminare: Get Disaster Recovery as a Service (DRaaS) solutions running on Google Compute Engine and Google Cloud Storage Nearline. Geminare lets you use the cloud as your secondary data center location, enabling both DR modernization and, for the first time, cost-effective replication of your current data centers.





Comparing online data storage solutions against Nearline and traditional cold storage offerings



With Google Cloud Storage Nearline, you can now benefit from a very low-cost, highly-durable storage that can be used to store limitless amounts of data and have access to that data at any time. Our primary focus is to help you bring new use cases to life, and this is why we’ve worked with some of the leading backup and storage providers and are focused on growing this ecosystem. We look forward to seeing the great, innovative ways you’ll use this distinctive new storage option. To learn more about how your organization can benefit from Google Cloud Storage Nearline, sign up for our webinar on April 8 2015. To get started, visit the Nearline site and documentation!



-Posted by Avtandil Garakanidze, Product Manager

How did you first learn to code? This question always brings out smiles and stories whenever a group of coders gather. We all started somewhere: building a computer from the microprocessor hobby kit; tweaking some BASIC video game to give you unlimited lives; writing a flashcard program to study for a vocabulary quiz; or, putting up your first web page. When you see “Hello world” show up on the screen for the first time, you realize that you can now make the computer say anything. It’s like discovering at age 11 that you are, in fact, a wizard. But in today’s world of ubiquitous high-polish, high-powered technology, I wonder about the next generation of coders, builders and hackers. What will give them that first, fresh experience of being a producer, not just a consumer, of computing technology?



Young coders were once a select community, made up of top math kids or curious video game fans. Now you can see an emerging belief that coding is the new second language. Schools are responding by participating in Hour of Code. Entrepreneurs are creating toys to groom kiddie coders, and employers are seeking people to fill programming positions whose numbers are growing at a rate twice the national average. But it’s those first experiences of creative power that will excite kids and hook them on coding.



My favorite part about being a father is sharing my passions with my three daughters, so I look for ways to give them an understanding of coding and computer science. By the time my older girls were exploring Python for kids and creating a turtle-graphics typewriter, I was on to teaching my six-year-old daughter the basic concept of coding with a picture-based guide. She’d read aloud a brief program listing — still not automatic in first grade — and I’d enter the lines for her. It was amazing to see how quickly she broke free from the pre-set examples: “Can we make it green?” “How about 100 stars?” “Can we make them squares?” As we located which part of the program to change to make her wish come true, I smiled knowing she had never questioned this basic tenet: people just like her write the code that drives all the gizmos and gadgets and systems surrounding her. Nothing magic — or rather, a magic that any 6 year old can learn to do.




Two of David's three daughters, ages 6 and 12 here, working on examples from the Pencil Code guide

While my daughters have a live-in software engineer to guide them, becoming a coder shouldn’t be an accident of birth. One day last fall, I read in the town newsletter that a local teen had founded a Girls Who Code club at the library near my home, in Newton, Massachusetts. All they needed to get started was a computer science instructor. I signed up and was greeted at the first meeting by 16 girls, eager to see what coding was all about. By our third week, word had spread and there were 37 girls filling the room! We had to split into two sections and recruit two (women) co-instructors to accommodate everyone. So many girls wanting to learn about computer science, curious to learn how things worked, happy to start testing things out for themselves and with each other. Now, they’re connecting beyond the club with guest speakers and field trips, talking with other women programmers in an industry filled with men. Having both women and men to look up to and learn from assures girls that they can build their dreams into reality.




David helping his daughter, now 13, who is also in Girls Who Code, on her latest coding project  

That’s why Google Cloud Platform is offering tutorials from Khan Academy and Codecademy, for anyone – girls, boys, teens, students, curious friends, even your grandparents – to learn how to code. In honor of International Women’s Day and Women’s History Month, we encourage you to think about the important girls in your life who could benefit from learning more about computer science. Whether she is your daughter, niece, sister or neighbor, she will gain a skill and an outlook that will empower her, and learn the value of computer science and coding literacy. We’re also spotlighting reflections from female developers from around the world — new and seasoned — to inspire girls and show them that an interest in coding can start from anywhere. Check back throughout March to take part in the conversation. The next Grace Hopper, Ada Lovelace, Frances Allen, or Anita Borg is out there, but she has to start somewhere. Let’s inspire her to become a part of #FutureCoders.



-- Posted by David Miller, Software Engineer

Many applications need to talk to other varied, and often, distributed systems reliably and in real-time. To make sure things aren't lost in translation, you need a flexible communication model to get messages between multiple systems simultaneously.



That’s why we’re making the beta release of Google Cloud Pub/Sub available today, as a way to connect applications and services, whether they're hosted on Google Cloud Platform or on-premises. The Google Cloud Pub/Sub API provides:




  • Scale: offering all customers, by default, up to 10,000 topics and 10,000 messages per second



  • Global deployment: dedicated resources in every Google Cloud Platform region enhance availability without increasing latency



  • Performance: sub-second notification even when tested at over 1 million messages per second




We designed Google Cloud Pub/Sub to deliver real-time and reliable messaging, in one global, managed service that helps developers create simpler, more reliable, and more flexible applications. It's been tested extensively, supporting critical applications like Google Cloud Monitoring and Snapchat's new Discover feature. Some common use cases include:




  • Integrated messaging between components of an application; for example, when processing an office transfer in an HR system, developers need to control the distribution of updates to the company directory, to security badging, to the moving company, to payroll, and many other services.



  • Robust data collection from smart devices, such as mobile device endpoints: providing developers with the ability to integrate sensor data from the endpoints with real-time data analysis pipelines, automatically routing the data streams to an application.
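The office-transfer example above is classic publish/subscribe fan-out: one message in, an independent copy delivered to every interested subscriber. The sketch below illustrates that model in a few lines of plain Python; it is not the Cloud Pub/Sub client API, just the pattern the service implements at scale:

```python
class Topic:
    """A toy topic: every subscription receives its own copy of each message."""
    def __init__(self):
        self._queues = {}

    def subscribe(self, name):
        self._queues[name] = []

    def publish(self, message):
        # Fan out: each subscription gets an independent copy.
        for queue in self._queues.values():
            queue.append(message)

    def pull(self, name):
        return self._queues[name].pop(0)

# One office-transfer event fans out to every downstream system at once.
transfers = Topic()
for system in ("directory", "badging", "payroll"):
    transfers.subscribe(system)
transfers.publish({"employee": 1234, "new_office": "NYC"})
print(transfers.pull("badging"))  # {'employee': 1234, 'new_office': 'NYC'}
```

The key property is that the publisher never enumerates its consumers; adding a new downstream system is just another subscription, not a code change in the publisher.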




You can activate Google Cloud Pub/Sub today from the APIs & auth section of the Google Developers Console and monitor key metrics with Google Cloud Monitoring dashboards. Please share your feedback directly or join our mailing list for updates and discussions.



-Posted by Rohit Khare, Product Manager



Introduction


What’s remarkable about April 7th, 2014 isn’t what happened that day. It’s what didn’t.



That was the day the Heartbleed bug was revealed, and people around the globe scrambled to patch their systems against this zero-day issue, which came with already-proven exploits. In other public cloud platforms, customers were impacted by rolling restarts due to a requirement to reboot VMs. At Google, we quickly rolled out the fix to all our servers, including those that host Google Compute Engine. And none of you, our customers, noticed. Here’s why.



We introduced transparent maintenance for Google Compute Engine in December 2013, and since then we’ve kept customer VMs up and running as we rolled out software updates, fixed hardware problems, and recovered from some unexpected issues that have arisen. Through a combination of datacenter topology innovations and live migration technology, we now move our customers’ running VMs out of the way of planned hardware and software maintenance events, so we can keep the infrastructure protected and reliable—without your VMs, applications or workloads noticing that anything happened.




The Benefits of Transparent Maintenance


Our goal for live migration is to keep hardware and software updated across all our datacenters without restarting customers' VMs. Many of these maintenance events are disruptive. They require us to reboot the host machine, which, in the absence of transparent maintenance, would mean impacting customers’ VMs.



Here are a few of the issues we expected to address with live migration, all of which we have since encountered:




  • Regular infrastructure maintenance and upgrades

  • Network and power grid maintenance in the data centers

  • Bricked memory, disk drives, and machines

  • Host OS and BIOS upgrades

  • Security-related updates, with the need to respond quickly

  • System configuration changes, including changing the size of the host root partition, for storage of the host image and packages




We were pleasantly surprised to discover that live migration helped us deliver a better customer experience in the face of a much broader array of issues. In fact, our Site Reliability Engineers started using migration as a tool even before it was generally enabled; they found they could easily work around or mitigate potential breakages occurring in production.



Here are some of the unexpected issues that we encountered and worked around with live migration without impacting the running guests:




  • Flapping network cards — Network cards were intermittently failing. We were able to repeatedly try the VM migrations and successfully migrate them. This even worked with partially-failing NICs.

  • Cascading battery/power supply issues — Overheating batteries were heating up neighboring machines. We were able to migrate the VMs away before bringing down the machines to swap out their batteries.

  • A buggy update was pushed to production — We halted the rollout, but not before it reached some of our production machines (it didn't manifest in our canary environment). The buggy software would’ve crashed VMs within a week. Instead, we migrated the VMs on the affected machines to other hosts that didn’t have the buggy software.

  • Unexpected host memory consumption — One of our backend components consumed more memory than we had allocated and threatened to OOM (out of memory) the VMs. We migrated some VMs away from the overloaded machines and avoided the OOM failures while patching the backend system to ensure it could not overrun its allocation.





Transparent Maintenance in Action


We’ve done hundreds of thousands of migrations since introducing this functionality. Many VMs have been up since migration was introduced and all of them have been migrated multiple times.



The response from our customers has been very positive. During the early testing for migration, we engaged with RightScale to see the impact of migrations. After we migrated all their VMs twice, they reported:




“We took a look at our log files and all the data in the database and we saw…nothing unusual. In other words, if Google hadn’t told us that our instances had been migrated, we would have never known. All our logs and data looked normal, and we saw no changes in the RightScale Cloud Management dashboard to any of our resources, including the zone, instance sizes, and IP addresses.”



We worked with David Mytton at ServerDensity to live migrate a replicated MongoDB deployment. When the migration was done, David tweeted:




“Just tested @googlecloud live migration of a @MongoDB replica set - no impact. None of the nodes noticed the primary was moved!”



In fact, Google has performed host kernel upgrades and security patches across its entire fleet without losing a single VM. This is quite a feat, given the number of components involved and factoring in that any one of them or their dependencies can fail or disappear at any point. During the migration, many of the components that comprise the VM (the disks, network, management software and so on) are duplicated on the source and target host machines. If any one of them fails at any point in the migration, either actively (e.g. by crashing) or passively (e.g. by disappearing), we back out of the migration cleanly without affecting the running VM.




How it works


When migrating a running VM from one host to another, you need to move all the state from the source to the destination in a way that is transparent to the guest VM and anyone communicating with it. There are many components involved in making this work seamlessly, but the high-level steps are illustrated here:





The process begins with a notification that VMs need to be evicted from their current host machine. The notification might start with a file change (e.g. a release engineer indicating that a new BIOS is available), Hardware Operations scheduling maintenance, an automatic signal of an impending hardware failure, and so on.



Our cluster management software constantly watches for such events and schedules them based on policies controlling the data centers (e.g. capacity utilization rates) and jobs (e.g. number of VMs for a single customer that could be migrated at once).



Once a VM is selected for migration, we provide a notification to the guest that a migration is imminent. After a waiting period, a target host is selected and the host is asked to set up a new, empty “target” VM to receive the migrating “source” VM. Authentication is used to establish a connection between the source and target.



There are three stages involved in the VM’s migration:




  1. During pre-migration brownout, the VM is still executing on the source, while most state is sent from the source to the target. For instance, we copy all the guest memory to the target, while tracking the pages that have been re-dirtied on the source. The time spent in pre-migration brownout is a function of the size of the guest memory and the rate at which pages are being dirtied.

  2. During blackout, which is a very brief moment when the VM is not running anywhere, it is paused, and all the remaining state required to begin running the VM on the target is sent. We go into blackout when sending state during pre-migration brownout reaches a point of diminishing returns. We use an algorithm that balances numbers of bytes of memory being sent against the rate at which the guest VM is dirtying pages, amongst other things.

  3. During post-migration brownout, the VM executes on the target. The source VM is present, and may be providing supporting functionality for the target. For instance, until the network fabric has caught up with the new location of the VM, the source VM provides forwarding services for packets to and from the target VM.
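The trade-off that ends brownout and triggers blackout can be sketched as a simple loop: keep copying while the guest runs, and pause it once the remaining dirty set is small enough to send in one shot. This toy model shows only the stopping logic; the numbers and thresholds are illustrative, not Google's actual algorithm:

```python
def simulate_migration(total_pages, dirty_rate, pages_per_pass):
    """Toy model of the brownout/blackout trade-off.

    total_pages:    guest memory in pages (all initially unsent)
    dirty_rate:     pages the guest re-dirties during one copy pass
    pages_per_pass: pages we can send to the target per pass

    Returns (brownout_passes, blackout_pages): copy passes made while
    the guest keeps running, and the residue sent while it is paused.
    """
    dirty = total_pages
    passes = 0
    # Keep copying while the VM runs (pre-migration brownout). Once the
    # remaining dirty set fits in a single pass, further brownout copying
    # hits diminishing returns, so we pause the guest (blackout) and send
    # the rest. The guard on dirty_rate avoids looping forever when the
    # guest dirties memory as fast as we can send it.
    while dirty > pages_per_pass and pages_per_pass > dirty_rate:
        dirty = dirty - pages_per_pass + dirty_rate
        passes += 1
    return passes, dirty

print(simulate_migration(total_pages=1_000_000,
                         dirty_rate=5_000,
                         pages_per_pass=100_000))  # (10, 50000)
```

Note how the guest's dirtying rate bounds the benefit of brownout: a mostly idle VM converges to a tiny blackout, while a write-heavy VM forces an earlier, larger blackout transfer.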




Finally, the migration is complete, and the system deletes the source VM. Customers can see that the migration took place in their logs.



Our goal for every transparent maintenance event is that not a single VM is killed. In order to meet that bar, we test live migration with a very high level of rigor. We’re using fault-injection to trigger failures at all the interesting points in the migration algorithm. We generate both active and passive failures for each component. At the peak of development testing (for months) we were doing tens of thousands of migrations every day.



Achieving this complex, multi-faceted process requires deep integration throughout the infrastructure and a powerful set of scheduling, orchestration and automation processes.




Conclusion


Live migration technology lets us maintain our infrastructure in top shape without impacting our guest VMs. One of our reviewers even claimed we’ve granted VMs immortality. We’re able to keep our VMs running for long periods of time in the face of regular and unplanned maintenance requirements and in spite of the many issues that arise requiring the reboot of physical machines.



We’re fortunate that some of the recent security issues that have affected other cloud providers haven’t affected us, but if and when a new vulnerability affects our stack, we’ll be able to help keep Compute Engine protected without affecting our customers’ VMs.



-Posted by Miche Baker-Harvey, Tech Lead/Manager, VM Migration


Back in November, at Google Cloud Platform Live, we released the beta of Google Cloud Debugger with support for Managed VM based projects. Today, we’re expanding support for Google Compute Engine based projects. Now you can simply set a snapshot on a line of code and Cloud Debugger will return local variables and a full stack trace from the next request that executes that line. Say goodbye to littering your code with logging statements.



Setting up Cloud Debugger on Compute Engine is easy using the Cloud Debugger agent and bootstrap script – try it for yourself. You’ll need the following:








Cloud Debugger is available on both production and staging instances of your application and adds zero overhead to services that aren’t being actively debugged. The debugger adds less than 10ms to request latency when capturing application state and doesn’t block or halt execution of your application.



Stay tuned for support for other programming languages and environments. As always, we’d love direct feedback and will be monitoring Stack Overflow for issues and suggestions.



-Posted by Keith Smith, Product Manager