Earlier this year, we announced the beta of the Google Cloud Logging service, which included the capability to:










Today we’re expanding Cloud Logging capabilities with the beta of the Cloud Logging Connector, which allows you to stream logs to Google Cloud Pub/Sub.  With this capability you can stream log data to your own endpoints and further expand how you make big data useful.  For example, you can now transform and enrich the data in Google Cloud Dataflow before sending it to BigQuery for analysis.  Furthermore, this provides easy real-time access to all your log data, so you can export it to your private cloud or to any third-party application.





Cloud Pub/Sub


Google Cloud Pub/Sub is designed to deliver real-time and reliable messaging in one global, managed service that helps you create simpler, more reliable, and more flexible applications. By providing many-to-many, asynchronous messaging that decouples senders and receivers, it allows for secure and highly available communication between independently written applications.  With Cloud Pub/Sub, you can push your log events to another Webhook, or pull them as they happen.  For more information, check out our Google Cloud Pub/Sub documentation.
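To make the pull model concrete, here is a minimal sketch using the current Python Cloud Pub/Sub client library (which postdates this post); the project and subscription names are hypothetical stand-ins for the subscription attached to your log export topic.

    from google.cloud import pubsub_v1

    # Hypothetical names: substitute your own project and the subscription
    # attached to the topic your logs are exported to.
    subscription = 'projects/my-project/subscriptions/exported-logs'

    def handle_log_entry(message):
        # Each message body carries one log entry; process or forward it,
        # then acknowledge it so Pub/Sub does not redeliver it.
        print(message.data)
        message.ack()

    subscriber = pubsub_v1.SubscriberClient()
    future = subscriber.subscribe(subscription, callback=handle_log_entry)
    try:
        future.result(timeout=60)  # listen for a minute in this sketch
    except Exception:
        future.cancel()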








High-Level Pub/Sub Schema


Configuring Export to Cloud Pub/Sub

Configuring export of logs to Cloud Pub/Sub is easy and can be done from the Logs Viewer user interface.  To get to the export configuration UI, start in the Google Developers Console, go to Logs under Monitoring, and then click Exports in the top menu.  Export configuration is currently supported for Google App Engine and Google Compute Engine logs.








One Click Export Configuration in the Developers Console






Transforming Log Data in Dataflow

Google Cloud Dataflow allows you to build, deploy, and run data processing pipelines at a global scale.  It enables reliable execution for large-scale data processing scenarios such as ETL and analytics, and allows pipelines to execute in either streaming or batch mode. You choose.  






You can use the Cloud Pub/Sub export mechanism to stream your log data to Cloud Dataflow and dynamically generate fields, combine different log tables for correlation, and parse and enrich the data for custom needs.  Here are a few examples of what you can achieve with log data in Cloud Dataflow:







  • Sometimes it is useful to see the data only for the key applications for top customers.  In Cloud Dataflow, you can group logs by Customer ID or Application ID, filter out specific logs, and then apply some aggregation of system level or application level metrics.



  • On the flip side, sometimes you want to enrich the log data to make it easier to analyze, for example by appending marketing campaign information to customer interaction logs, or other user profile info. Cloud Dataflow lets you do this on the fly.



  • In addition to preparing the data for further analysis, Cloud Dataflow also lets you perform analysis in real time. So you can look for anomalies, detect security intrusions, generate alerts, keep a real-time dashboard updated, etc.








Cloud Dataflow can stream the processed data to BigQuery, so you can analyze your enriched data.  For more details, please see the Google Cloud Dataflow documentation.
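To make that flow concrete, here is a minimal pipeline sketch in the Apache Beam Python model that grew out of Cloud Dataflow (the Dataflow SDK available at beta time was Java). The subscription, table, field names and campaign lookup below are all hypothetical.

    import json

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    KEY_APPS = {'app-1', 'app-2'}  # hypothetical "key applications" to keep

    def parse_entry(message):
        # Each Pub/Sub message body is assumed to hold one JSON log entry.
        return json.loads(message.decode('utf-8'))

    def enrich(entry, campaigns):
        # Append side information (e.g. a marketing campaign) keyed by customer.
        entry['campaign'] = campaigns.get(entry.get('customer_id'), 'none')
        return entry

    campaign_table = {'cust-42': 'spring-launch'}  # stand-in for a real lookup

    with beam.Pipeline(options=PipelineOptions(streaming=True)) as p:
        (p
         | 'ReadLogs' >> beam.io.ReadFromPubSub(
               subscription='projects/my-project/subscriptions/exported-logs')
         | 'Parse' >> beam.Map(parse_entry)
         | 'KeyAppsOnly' >> beam.Filter(lambda e: e.get('app_id') in KEY_APPS)
         | 'Enrich' >> beam.Map(enrich, campaign_table)
         | 'Project' >> beam.Map(lambda e: {
               'app_id': e.get('app_id'),
               'customer_id': e.get('customer_id'),
               'campaign': e.get('campaign')})
         | 'ToBigQuery' >> beam.io.WriteToBigQuery(
               'my-project:logs.enriched_entries',
               schema='app_id:STRING,customer_id:STRING,campaign:STRING'))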






Getting Started

If you’re a current Google Cloud Platform user, the capability to stream logs to Cloud Pub/Sub is available to you at no additional charge.  Standard charges for using Cloud Pub/Sub and Cloud Dataflow will apply. For more information, visit the Cloud Logging documentation page and share your feedback.




-Posted by Deepak Tiwari, Product Manager







































When Streak — CRM in your inbox — launched in March 2012, our userbase grew 30% every week for four consecutive months. Today, Streak supports millions of users with only 1.5 back-end engineers. We chose Google App Engine to power our application because it enabled our team to build features fast and scaled with user growth. Plus, we didn’t have to worry about infrastructure.










Streak’s data growth




Here are six tips we’ve learned building on App Engine, and if you’d like even more detail – including an overview of our app’s architecture and 15 minutes of Q&A – you can check out my webinar.  




1. Keep user-facing GET requests fast


This tip isn’t specific to App Engine, as it really applies to most web applications. User-facing GET requests should be quick. App Engine has a 60 second timeout on all requests; frankly, if the total latency after a user interaction is taking longer than 200ms, users will perceive your app as slow. To keep requests fast, you should do your heavyweight processing – such as calculations or complex queries –  either in the background or at write time. That way, when the user requests data (read time), it’s already precalculated and ready to go.
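Here is a sketch of that pattern on the App Engine Python runtime; the entity, handler paths and task URL are hypothetical, not our actual code. The write path enqueues the heavy recalculation on a task queue, so the user-facing GET is reduced to a single lookup of a precomputed result.

    import webapp2
    from google.appengine.api import taskqueue
    from google.appengine.ext import ndb

    class PipelineStats(ndb.Model):
        # Precomputed, ready-to-serve aggregate for one user.
        total_deals = ndb.IntegerProperty(default=0)
        updated = ndb.DateTimeProperty(auto_now=True)

    class WriteHandler(webapp2.RequestHandler):
        def post(self):
            user_id = self.request.get('user_id')
            # Persist the change quickly, then push the expensive
            # recalculation onto a background task queue.
            taskqueue.add(url='/tasks/recompute_stats',
                          params={'user_id': user_id})
            self.response.write('ok')

    class ReadHandler(webapp2.RequestHandler):
        def get(self):
            # The user-facing GET is a single key lookup, no heavy work.
            user_id = self.request.get('user_id')
            stats = PipelineStats.get_by_id(user_id)
            self.response.write(stats.total_deals if stats else 0)

    app = webapp2.WSGIApplication([
        ('/deals', WriteHandler),
        ('/stats', ReadHandler),
    ])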




2.  Take advantage of Managed VMs


So, what are Managed VMs? Managed VMs are a new hosting environment for App Engine, enabling you to take advantage of beefier compute resources and run your own custom runtimes. For example, we host our back-end data processing modules on n1-standard-1 machines (1 CPU and 3.75 GB mem), rather than App Engine frontend instances. This provides better performance and cost savings, due to sustained use discounts. Yes, Managed VMs take a little longer to boot up than an App Engine frontend instance, but they're perfect for our background processing needs.




3. Denormalize for faster reads


Google Cloud Datastore is a NoSQL database so if you’re coming from the RDBMS world, it requires a different approach to data modeling. You have to be comfortable denormalizing and duplicating data, since SQL joins don’t exist. While data duplication might feel uncomfortable, by doing so, your reads will be very fast.  
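Here is a small, hypothetical NDB example of what that duplication can look like: the deal entity carries copies of the contact fields its list page needs, so rendering the list takes one query instead of a join-like second lookup per row.

    from google.appengine.ext import ndb

    class Contact(ndb.Model):
        name = ndb.StringProperty()
        company = ndb.StringProperty()

    class Deal(ndb.Model):
        contact_key = ndb.KeyProperty(kind=Contact)
        # Denormalized copies of the contact fields the deal list displays.
        contact_name = ndb.StringProperty()
        contact_company = ndb.StringProperty()
        amount = ndb.IntegerProperty()

    def deals_for_pipeline(pipeline_id):
        # One query returns fully displayable rows; no extra lookup per deal.
        # The cost is updating the copies whenever the Contact changes.
        return Deal.query(ancestor=ndb.Key('Pipeline', pipeline_id)).fetch(100)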




4.  Break your application into modules


Modules make it easy for you to break your App Engine app into different components. For example, you could have a module for your user-facing traffic and one for background processing. Each module has its own yaml file, so you can set parameters such as instance size, version number, runtime language used, and more. As mentioned above, our backend modules take advantage of Managed VMs for performance/cost benefits, while our frontend module uses App Engine frontend instances that scale quicker. The documentation discusses best practices on how you should structure your app.   




5. Deploy aggressively and use traffic splitting


At Streak, we do continuous deployments because versioning, deployment and rollout are easy with App Engine. In fact, sometimes we deploy up to 20 times per day to get changes into the hands of customers. We aggressively deploy to many production versions of our app and then selectively turn on new features for our users. As we slowly ramp up the traffic to these new versions via traffic splitting, we catch issues early and often. These are usually really easy to deal with because each of our new code deploys has a small set of functionality, so it’s easy to find the relevant issues in the code base. We also use Google Cloud Monitoring and our own homegrown system (based on #6 below) to monitor these deploys for changes.




6. Use BigQuery to analyze your log files


Application and request logs can give you valuable insights into performance and help you make product improvements. If you’re just starting out, the log viewer’s list of recent requests will be just fine, but once you’ve reached scale you’ll want to do analysis on aggregate data or a specific user’s requests. We’ve built custom code to export our logs to Google BigQuery, but you can now stream your logs directly from the Developers Console. With these insights, my team can build a better user experience.
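For example, once the logs are in BigQuery, a per-user latency breakdown is a single query away. The sketch below uses the current google-cloud-bigquery Python client rather than our custom export code, and the table and field names are hypothetical.

    from google.cloud import bigquery

    client = bigquery.Client()

    # Hypothetical table and fields; adjust to match how your exported
    # request logs are laid out in BigQuery.
    query = """
        SELECT resource, COUNT(*) AS hits, AVG(latency) AS avg_latency
        FROM `my_project.app_logs.requests`
        WHERE user_id = @user_id
        GROUP BY resource
        ORDER BY avg_latency DESC
        LIMIT 20
    """
    job_config = bigquery.QueryJobConfig(query_parameters=[
        bigquery.ScalarQueryParameter('user_id', 'STRING', 'user-123'),
    ])
    for row in client.query(query, job_config=job_config):
        print(row.resource, row.hits, row.avg_latency)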




Watch the webinar


App Engine has been critical to our success. As our application has scaled, so has App Engine and we’ve been able to focus on building features for our customers, rather than ops. To learn more tips about App Engine – including an overview of our architecture and 15 minutes of Q&A – watch the full webinar.




-Posted by Aleem Mawani, CEO and co-founder, Streak

Understanding is good, but trying it for yourself is better. Today, we’ve made that a bit easier. Google would like you to experience the performance of Local SSD firsthand. To make it even easier to try out this feature, we're giving our customers a discounted trial. For the next month (April 21, 2015 to May 21, 2015), Local SSD will be priced at $0.055/GB/month, a 75% discount. After that time, the price will return to its normal $0.218/GB/month. The analysis below is built on our long-term pricing, so during the promotion this month, you'll see 75% savings on these numbers. There’s never been a better time to “kick the tires” on Local SSD, so don’t wait.



Since publishing our Understanding Cloud Pricing blog post, one of the most frequent follow up requests (please keep your great questions and ideas coming!) has been a closer look at storage costs and performance, especially in areas where our products work a little differently from other cloud services.



Solid State Disk (SSD) is an incredible technology, but it’s really important to realize that the wide variety of devices, configurations, connectors, and drivers can create order-of-magnitude or larger differences in performance. Not all Solid State Disks are created equal.



Additionally, different cloud providers deliver SSD in very different packages. Once again, rather than just reciting the stats and leaving the real-world analysis to you, we're going to provide a clear example of a system that uses local SSD, and analyze the difference between running it on Google Cloud Platform and AWS.



Let’s imagine that we’re going to deploy a NoSQL key-value store backend for a web-scale application, similar to what we used in our first example. We’ll use conservative best practices and deploy a three-node cluster, hosting data on local SSD for maximum performance.



On Google Compute Engine, we’ll use n1-highmem-8 instances, with four attached local SSD volumes, which is almost identical in CPU, RAM, and SSD storage volume to the AWS i2.2xlarge instance. We’ll be set up to deliver at least 75,000 IOPS, blazing fast queries here we come!



Please note that we completed these calculations on April 3, 2015, and have included the output prices in this post. Any discrepancies are likely due to pricing or calculator changes following the publishing of this post.



Here's the output of the pricing calculators:



Google Cloud Platform estimate:

Monthly: $1883.04



Amazon Web Services estimate:

Monthly: $3744.18



You’ll notice that Google Cloud Platform comes in quite a bit cheaper. Some of that’s due to our automatic Sustained Use Discounts, but even without those, we’re still 39% less expensive. Here are all the details, by the numbers:




  • i2.2xlarge advantages:


    • 17% more memory

    • 7% more SSD space


  • n1-highmem-8 with 4 attached SSD partitions advantages:


    • 39% less expensive

    • 807% more read IOPS

    • 380% more write IOPS





Did you catch that? 807% more read IOPS! Over nine times faster, at nearly ½ the cost, is not a small difference.



So what impact does this have for our NoSQL workload? Assuming a read-bound workload growing over time (many are, like reporting and analytics systems), as read capacity on the SSD in our instances gets exhausted, we’ll need to scale out our cluster by adding additional nodes. Let’s imagine read traffic multiplies by six (product success is a good problem to have).



Here's the output of the pricing calculators:



Google Cloud Platform estimate:

Monthly: $1883.04 (yup, exactly the same as above)



Amazon Web Services estimate:

Monthly: $22465.08



In order to equal the read throughput of our SSD, on AWS you’d need to step up to the next larger size instance (i2.4xlarge), and run three times as many of them. The extra read performance that Google Cloud Platform SSD provides means you not only keep the same, simple three-node system (saving you *real money* in admin/ops costs), but you keep the same low price. If you have a write-bound workload, you’d enjoy a similar advantage in picking Google; we’re nearly 4x the write performance, so you’d need to bump up your configuration similarly to keep pace.



What if you’re trying to get started smaller than where we started? Not every app needs 680k IOPS! This is one of the most important differences between Google Cloud Platform’s SSD implementation and the AWS instances: You can add SSD to standard, highmem, and highcpu instances in 375GB increments. This means that you can start on highly efficient SSD and scale more linearly. It’s important to note that AWS does include some small single-copy SSD on instances for use as an efficient scratch disk; these aren’t designed for heavy data usage and AWS does not provide a documented performance specification.



Because SSD is available on all of our primary instances, you can easily configure a much smaller instance type and still keep the power of local SSD. Let’s go down to the smallest three-node configurations we can get on each provider that still give us access to full performance SSD. For us, that’d be n1-standard-1 instances with 1x375GB local SSD, for AWS that’d be i2.xlarge instances with 1x800GB local SSD.



Here’s the output of the pricing calculators:



Google Cloud Platform estimate:

Monthly: $341.90



Amazon Web Services estimate:

Monthly: $1873.20



That’s a huge discrepancy. On Google, this system is so cost efficient, you can run it for 3 weeks and stay within our free trial with room for lots more experimentation!




Comparing Prices for SSD specifically


With local SSD, it’s been a bit of a challenge to compare prices between clouds directly because AWS bundles compute and local storage into a single SKU, whereas Google Compute Engine decouples them, giving customers more freedom to rightsize and optimize their deployments.



However, using publicly published AWS documentation, it’s possible to derive a price for EC2’s local SSD by comparing configurations and prices of similar instance types that differ only in price and amount of SSD. All configuration information comes from the EC2 instance type web page and all pricing information comes from the EC2 instance pricing page. In all cases, we use the on-demand prices in Northern Virginia.



The methodology is basically to compare r3 (memory optimized) and i2 (storage optimized) instance types. By grouping them into pairs that have the same amount of CPU and memory but different amounts of SSD, and dividing the difference in price by the difference in SSD capacity, you can derive the per-GB local SSD price that AWS charges its customers. Each of the four r3/i2 pair comparisons yields a local SSD price of $0.0007/GB/hour.



By comparison, we sell Local SSD in 375GB chunks for $0.218/GB/month. Normalizing that to hourly pricing, we get $0.0003/GB/hour. So there’s the bottom line: we charge 57% less for local SSD that’s at least 4.8x faster than AWS.
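For the arithmetic-minded, here is the normalization spelled out, using only the numbers above and the common 730-hours-per-month convention:

    # Reproducing the comparison from the post's own figures.
    HOURS_PER_MONTH = 730  # assumed monthly-to-hourly normalization convention

    aws_ssd_per_gb_hour = 0.0007            # derived from the r3/i2 pairs above
    gcp_ssd_per_gb_month = 0.218            # published Local SSD price
    gcp_ssd_per_gb_hour = gcp_ssd_per_gb_month / HOURS_PER_MONTH

    print(round(gcp_ssd_per_gb_hour, 4))    # ~0.0003 $/GB/hour
    print(round(1 - gcp_ssd_per_gb_hour / aws_ssd_per_gb_hour, 2))  # ~0.57, i.e. 57% less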



We think pricing is a critical consideration as you try to make the best decision you can about infrastructure systems design. I’d love to hear your thoughts and what matters to you in cloud pricing. What areas are confusing, hard to analyze, or hard to predict? What ideas do you have? Reach out to us on Stack Overflow if there’s anything we can do to add more value.



-Posted by Miles Ward, Global Head of Solutions, Google Cloud Platform

Yesterday, we announced that Google Cloud Platform big data services are taking a big step forward by allowing everyone to use big data the cloud way. Google BigQuery has many new features and is now available in European zones. These improvements were designed to extend BigQuery's performance and capabilities to give you greater peace-of-mind and control over your data.




European Data Location Control


You now have the option to store your BigQuery data in European locations while continuing to benefit from a fully managed service, giving you geographic control over your data without low-level cluster maintenance headaches. Feel free to contact the Google Cloud Platform technical support team for details on how to set this up.




Streaming Inserts


One of BigQuery's most popular features is the ability to stream data into the service for real-time analysis. To allow such low-latency analysis on very high-volume streams, we've increased the default insert-rate limit from 10,000 rows per second, per table, to 100,000 rows per second, per table. In addition, the row-size limit has increased from 20 KB to 1 MB, and pricing will move from a per-row model to a per-byte model for better flexibility and scale.
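For reference, streaming rows into a table is a single API call; the rows become available for querying within seconds. The sketch below uses the current Python client library, and the table and fields are hypothetical.

    from google.cloud import bigquery

    client = bigquery.Client()

    # Hypothetical table; it must already exist with a matching schema.
    table_id = 'my_project.telemetry.events'
    rows = [
        {'event_time': '2015-04-16T12:00:00', 'user_id': 'u123', 'action': 'click'},
        {'event_time': '2015-04-16T12:00:01', 'user_id': 'u456', 'action': 'view'},
    ]
    errors = client.insert_rows_json(table_id, rows)
    if errors:
        print('Insert errors:', errors)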




Security Features


BigQuery can now tackle a wider range of enterprise applications with the addition of data expiration controls and row-level permissions. Row-level permissions eliminate the need to create different views for different users, allowing secure shared access to systems such as finance or HR. This ensures that you get the information that’s relevant to you. In addition, data in BigQuery will be encrypted at rest.




Google Cloud Platform Logging Integration


Google Cloud Logging provides a powerful set of tools for managing your operations and understanding the systems powering your business; now, it also lets your Google App Engine and Google Compute Engine applications stream their logs into BigQuery. This allows you to perform real-time analysis on your log data and gain insight into how your system is performing and how your users are behaving. By joining application logs with your marketing and partnership data, you can rapidly evaluate the effectiveness of your outreach, or apply context from user profile info into your application logs to quickly assess what behavior resulted from specific customer interactions, providing easy and immediate value to both system administrators and business analysts.




Frequently requested features


Additionally, we’ve implemented a number of new features you’ve been asking for.


For a full list of features, take a look at the release notes.




Unprecedented scale


BigQuery continues to provide exceptional scale and performance without requiring you to deploy, augment or update your own clusters. Instead, you can focus on getting meaningful insights from massive amounts of data. For example:


  • BigQuery absorbs real-time streams of customer data totaling more than 100 TB per day, which you can query immediately. All this data is in addition to the hundreds of terabytes loaded daily from other sources. If you have fast-moving, large-scale applications such as IoT, you can now make quick, accurate decisions against data that’s still in flight.

  • We have customers currently running queries that scan multiple petabytes of data or tens of trillions of rows using a simple SQL query, without ever having to worry about system provisioning, maintenance, fault-tolerance or performance tuning.


With BigQuery’s new features, you can analyze even more data and access it faster than before, in brand new ways. To get started, learn more about BigQuery, read the documentation, and try it out for yourself.



-Posted by Andrew Kowal, Product Manager

Big data applications can provide extremely valuable insights, but extracting that value often demands high overhead – including significant deployment, tuning, and operational effort – diverse systems, and programming models. As a result, work other than the actual programming and data analysis dominates the time needed to build and maintain a big data application. The industry has come to accept these pains and inefficiencies as an unavoidable cost of doing business. We believe you deserve better.



In Google’s systems infrastructure team, we’ve been tackling challenging big data problems for more than a decade and are well aware of the difference that simple yet powerful data processing tools make. We have translated our experience from MapReduce, FlumeJava, and MillWheel into a single product, Google Cloud Dataflow. It's designed to reduce operational overhead and make programming and data analysis your only job, whether you’re a data scientist, data analyst or data-centric software developer. Along with other Google Cloud Platform big data services, Cloud Dataflow embodies the kind of highly productive and fully managed services designed to use big data, the cloud way.



Today we’re pleased to make Google Cloud Dataflow available in beta, for use by anyone on Google Cloud Platform. With Cloud Dataflow, you can:




  • Merge your batch and stream processing pipelines thanks to a unified and convenient programming model. The model and the underlying managed service let you easily express data processing pipelines, make powerful decisions, obtain insights and eliminate the switching cost between batch and continuous stream processing (see the sketch after this list).

  • Finely tune the desired correctness model for your data processing needs through powerful API primitives for handling late arriving data. You can process data based on event time as well as clock time and gracefully deal with upstream data latency when processing data from unbounded sources.

  • Leverage a fully-managed service, complete with dynamically adaptive auto-scaling and auto-tuning, that offers attractive performance out of the box. Whether you’re a developer or systems operator, you no longer need to invest time worrying about resource provisioning or attempting to optimize resource usage. Automation, a fully managed service, and the programming model work together to significantly lower both CAPEX and OPEX.

  • Enjoy reduced complexity of managing and debugging highly parallelized processes with a simplified monitoring interface that’s logically mapped to your processing logic as opposed to how your code’s mapped to the underlying execution plane.

  • Benefit from integrated processing of data across the Google Cloud Platform with optimized support for services such as Google Cloud Storage, Google Cloud Datastore, Google Cloud Pub/Sub, and Google BigQuery.
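To show what the unified model looks like in practice, here is a minimal sketch in the Apache Beam Python model that grew out of Cloud Dataflow: the same parse, event-time window and aggregate chain is applied to a bounded Cloud Storage input in batch mode and to an unbounded Pub/Sub input in streaming mode. The paths, subscription and record format are hypothetical.

    import apache_beam as beam
    from apache_beam import window
    from apache_beam.options.pipeline_options import PipelineOptions

    def parse(line):
        # Hypothetical record format: "team,score,event_time_epoch_seconds".
        if isinstance(line, bytes):
            line = line.decode('utf-8')
        team, score, ts = line.strip().split(',')
        # Attach the event time so windowing reflects when the event happened,
        # not when it reached the pipeline.
        return window.TimestampedValue((team, int(score)), float(ts))

    def apply_transforms(lines):
        # The same transform chain serves both the batch and the streaming run.
        return (lines
                | 'Parse' >> beam.Map(parse)
                | 'EventTimeWindows' >> beam.WindowInto(window.FixedWindows(60))
                | 'SumPerTeam' >> beam.CombinePerKey(sum))

    # Batch mode: bounded input from Cloud Storage.
    with beam.Pipeline(options=PipelineOptions()) as p:
        apply_transforms(
            p | 'ReadFiles' >> beam.io.ReadFromText('gs://my-bucket/scores-*.csv'))

    # Streaming mode: unbounded input from Pub/Sub, same transforms, no rewrite.
    with beam.Pipeline(options=PipelineOptions(streaming=True)) as p:
        apply_transforms(
            p | 'ReadStream' >> beam.io.ReadFromPubSub(
                subscription='projects/my-project/subscriptions/scores'))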






We’re also working with major open source contributors on maturing the Cloud Dataflow ecosystem. For example, we recently announced collaborations with Data Artisans for runtime support for Apache Flink and with Cloudera for runtime support for Apache Spark.



We’d like to thank our alpha users for their numerous suggestions, reports and support along this journey. Their input has certainly made Cloud Dataflow a better product. Now, during beta, everyone can use Cloud Dataflow and we continue to welcome questions and feedback on Stack Overflow. We hope that you’ll give Cloud Dataflow a try and enjoy big data made easy.



-Posted by Grzegorz Czajkowski, Director of Engineering

The promise of big data is faster and better insight into your business. Yet it often turns into an infrastructure project. Why? For example, you might be collecting a deluge of information and then correlating, enriching and attempting to extract real-time insights. Should you expect such feats, by their very nature, to involve a large amount of resource management and system administration? You shouldn’t. Not in the cloud. Not if you’re using big data the cloud way.



Big data the cloud way means being more productive when building applications, with faster and better insights, without having to worry about the underlying infrastructure. More specifically, it includes:




  • NoOps: Your cloud provider should worry about deploying, managing and upgrading infrastructure to make it scalable and reliable. “NoOps” means the platform handles such tasks and optimizations for you, freeing you up to focus on understanding and exploiting the value in your data.

  • Cost effectiveness: In addition to increased ease of use and agility, a “NoOps” solution provides clear cost benefits via the removal of operations work; but the cost benefits of big data the cloud way go even further: the platform auto-scales and optimizes your infrastructure consumption, and eliminates unused resources like idle clusters. You manage your costs by dialing up or down the number of queries and the latency of your processing based on your cost/benefit analysis. You should never have to re-architect your system to adjust your costs.

  • Safe and easy collaboration: You can share datasets from files in Google Cloud Storage or tables in Google BigQuery with collaborators inside or outside of your organization without the need to make copies or grant database access. There’s one version of the data – which you control – and authorized users can access it (at no cost to you) without affecting the performance of your jobs.






Google has been blazing the big data trail for the rest of the industry, so when you use Google Cloud Platform, big data the cloud way also means:




  • Cutting-edge features: Google Cloud Dataflow provides reliable, event-time-based stream processing, available by default with no extra work. But making stream processing easy and reliable doesn’t mean removing the option of running in batch. The same pipeline can execute in batch mode, which you can use to lower costs or analyze historical data. Now, consistently processing streaming data at large scale doesn’t have to be a complex and brittle endeavor that’s reserved for the most critical scenarios.




Google Cloud Platform delivers these characteristics by making data analysis quick, affordable and easy. Today, at the Hadoop Summit in Brussels, we announced that our big data services are taking a big step forward – allowing everyone to use big data the cloud way.




Google Cloud Dataflow now available in beta


Today, nothing stands between you and the satisfaction of seeing your processing logic, applied in your choice of streaming or batch mode, executed via a fully managed processing service. Just write a program, submit it, and Cloud Dataflow will do the rest. No clusters to manage – Cloud Dataflow will start the needed resources, autoscale them (within the bounds you choose), and terminate them as soon as the work is done. You can get started right now.




Google BigQuery has many new features and is now available in European zones


BigQuery, the quintessential cloud-native, API-driven service for SQL analytics, has new security and performance features. For example, the introduction of row-level permissions makes data sharing even easier and more flexible. With its ease of ingestion (we’ve raised the default ingestion limit to 100,000 rows per second per table), virtually unlimited storage, and fantastic query performance even for huge datasets, BigQuery is the ideal platform for storing, analyzing and sharing structured data. It also supports repeated records and querying inside JSON objects for loosely structured data. In addition, starting today, BigQuery now offers the option to store your data in Google Cloud Platform European zones. You can contact Google technical support today to use this option.




A comprehensive set of big data services


Google Cloud Pub/Sub is designed to provide scalable, reliable and fast event delivery as a fully managed service. Along with BigQuery streaming ingestion and Cloud Dataflow stream processing, it completes the platform’s end-to-end support for low-latency data processing. Whether you’re processing customer actions, application logs or IoT events, Google Cloud Platform allows you to handle them in real time, the cloud way. Leave Google Cloud Platform in charge of all the scaling and administration tasks so you can focus on what needs to happen, not how.



Using big data the cloud way doesn’t mean that Hadoop, Spark, Flink and other open source tools originally created for on-premises can’t be used in the cloud. We’ve ensured that you can benefit from the richness of the open source big data ecosystem via native connectors to Google Cloud Storage and BigQuery along with an automated Hadoop/Spark cluster deployment.



Google BigQuery customer zulily joined us recently for a big data webinar to share their experience using big data the cloud way and how it helped them increase revenue and overall business visibility while decreasing their operating costs. If you’re interested in exploring these types of benefits for your own company, you can easily get started today by running your first query on a public dataset or uploading your own data.



Here’s a simplified illustration of how Google Cloud Platform data processing services relate to each other and support all stages of the data lifecycle:









Scuba equipment helps humans operate under water, but divers still fall hopelessly short of the efficiency and agility of marine creatures. When it comes to big data in the cloud, be a dolphin, not a scuba diver. Google Cloud Platform offers a set of powerful, scalable, easy to use and efficient big data services built for the cloud. Embrace big data, the cloud way, by taking advantage of them today.



Learn more about Google Cloud Platform’s big data solutions or get started with Dataflow and BigQuery today. We can’t wait to see what you achieve when you use big data the cloud way.



-Posted by William Vambenepe, Product Manager

A feature-length animated movie takes up to 100 million compute hours to render. 100 million.



When you hear the two words “Google” and “media,” what pops into your mind? YouTube, right? Well, as I’m excited to explain, media means much more than “YouTube” at Google. The media and entertainment industry is a key area of focus for Google Cloud Platform. As I’ll be sharing in my keynote address for the cloud conference at the 97,000-attendee NAB Show in Las Vegas on Tuesday, we’re rapidly expanding our platform and our partner ecosystem to uniquely solve media-specific challenges. In addition to my keynote, Patrick McGregor and Todd Prives from my team are participating in panel sessions on cloud security and cloud rendering. And as part of the recent Virtual NAB conference, Jeff Kember and Miles Ward from Google shared their insights.



We’re witnessing massive changes in the ways media companies are creating, transforming, archiving and delivering content, using the power of the cloud.



We recognize that Google Cloud Platform best supports the media industry when we deliver capabilities that are tailored to specific workflow patterns. Great examples of these capabilities are our services for visual effects rendering. Aside from the skilled work that an artist puts into modeling, animating and compositing a realistic scene, the compute demands required to produce these images are often staggering. Even a relatively simple visual effects shot or animation can take several hours to render the 24 individual frames that make up one second of video.



Google Cloud Platform can greatly accelerate and simplify rendering while charging only for the processor cycles and bits that are consumed. For customers looking for an end-to-end rendering solution, we offer Google Zync Render. Launched in beta in the first quarter of 2015, Zync is a turnkey service for small and medium-sized studios. It integrates directly with existing on-premises software workflows to feel as natural and responsive as a local render farm. Also, through our collaborations with The Foundry and others, Google Cloud Platform provides tools used in the creation of some of the highest-grossing movies.




Zync Render Workflow







By using Google Cloud Platform’s cost-efficient compute and storage, studios can seamlessly extend their rendering pipelines to handle burst capacity needs and remove the bottlenecks typically associated with production deadlines. We’re already seeing great successes from media customers like Framestore, RodeoFX, iStreamPlanet, Panda, and Industriromantik.



We’ve also built compelling general platform capabilities that help media companies with all stages of workflow and the content lifecycle. One example is Google Cloud Storage Nearline, which is a service that allows a virtually unlimited volume of data to be stored at very low costs with retrieval times on the order of seconds – not hours as you would experience with tape. This is ideal for media content archiving. We also recently launched 32-core VM instances for compute-intensive workloads that crunch large volumes of content. And, yesterday, we announced a collaboration with Avere Systems that enables us to bridge cloud storage and on-premises storage without impacting performance. This opens huge opportunities for creative collaboration and content production.



Please join us this week at NAB; we hope to see you in Las Vegas!



-Posted by Brian Stevens, VP for Google Cloud Platform

Today’s enterprise must focus on running key workloads both locally on-premises and remotely in the cloud. There is simultaneously the need to keep the quality of service high for end-users in terms of network latency and reliability, and the need to ensure efficiency and security for your company’s hybrid workloads – particularly workloads that are bandwidth-intensive or latency-sensitive. Raw performance, reliability and security have been major focus areas for Google from the start, and our goal with Google Cloud Platform is to share the benefits of continuous networking innovation with our customers.



We have four announcements today in support of two major technical goals. The first is to use Google’s global network footprint – over 70 points of presence across 33 countries – to serve users close to where they are, ensuring the same low latency and responsiveness customers can expect from Google’s own services. The second goal relates to enabling enterprises to run mission-critical workloads by connecting their on-premises infrastructure to Google’s network with enterprise-grade encryption.



Today we're announcing:







Managed DNS


With Cloud DNS – our high performance, managed DNS solution for user-facing applications and services – you can host millions of zones and records and handle SLA-backed name-serving queries. For customers with more than 10,000 zones, our new pricing tier lowers the cost of ownership for large organizations operating DNS infrastructure at scale.








Global Load Balancing


Today’s connected user is accustomed to fast and responsive application services, be they web services accessed from a browser or apps on a mobile device. Latency (“lag”) is noticeable immediately, especially as users switch from a fast, optimized service to a slow one. With the expansion of Google’s load balancing solution to 12 additional locations, your workloads running on Google Cloud Platform are closer in proximity to your users who are making service requests from all over the globe.




Additional Carrier Interconnect service providers and VPN Beta


We continue to build on our goal of enabling enterprises to connect their on-premises infrastructure to Google’s network over encrypted channels to run data-intensive, latency-sensitive workloads. In addition to announcing the beta for Cloud VPN, we’re pleased to introduce 11 additional Carrier Interconnect service providers. Our growing list of technology partners extends our reach to customer locations globally while providing tailored connectivity and choice.







iStreamPlanet is one such customer who has taken advantage of our infrastructure breadth to make high-quality connections into the Google network. iStreamPlanet recently launched Aventus, its SaaS-based product that enables content owners to serve high-quality live video with simplicity to viewers across devices. Running on Google Cloud Platform, iStreamPlanet is able to create live video events for its customers in minutes rather than days, and has lowered bandwidth costs by more than 40 percent using Google Cloud Platform’s Direct Peering offering.



We’d also like to welcome CloudFlare as a Google Cloud Platform Technology Partner. CloudFlare provides website speed optimization, security and DDOS protection, as well as caching solutions over its globally distributed network. With nearly no setup required, CloudFlare reports speed optimizations that result in content loading twice as fast on average for visitors.



Google’s network, built out over the past 15 years, is a key enabler behind the services relied upon every day by our customers and our users – from Search to Maps, YouTube to Cloud Platform. We invite you to contact us to explore how we can make Google’s network an extension of your own, or to learn about your specific needs around serving your users wherever they may be globally. You can read more about Google Cloud Networking.



-Posted by Morgan Dollard, Cloud Networking Product Management Lead








Today’s guest post comes from Ed Byrne, Director at Panda – a cloud-based video transcoding platform. To learn more about how Panda uses Google Cloud Platform, watch their case study video.



Panda makes it easy for video producers to encode their video in multiple formats for different mobile device screen sizes. But delivering blazing fast, high-quality videos to customers is no easy task – especially when your engineers are also dealing with infrastructure. Google Cloud Platform features like Live Migration and Autoscaler have allowed us to cut our infrastructure maintenance load to only half of a developer.



With more resources to direct at innovation, we can put our focus on our customers, making their experience better with new and improved features in Panda. In fact, since relying on Google Cloud Platform for underlying infrastructure, we’ve developed our frame rate conversion by motion compensation technology. Our customers love the video quality they get using this feature, and we’re so excited about it, we agreed to give you the low down on how it works.




Introduction to motion compensation


Motion compensation is a technique that was originally used for video compression, and now it’s used in virtually every video codec. Its inventors noticed that adjacent frames usually don’t differ too much (except for scene changes), and then used that fact to develop a better encoding scheme than compressing each frame separately. In short, motion-compensation-powered compression tries to detect movement that happens between frames and then use that information for more efficient encoding. Imagine two frames:




Panda on the left...




aaaand on the right

Now, a motion compensating algorithm would detect the fact that it’s the same panda in both frames, just in different locations:




First stage of motion compensation: motion detection

We’re still thinking about compression, so why would we want to store the same panda twice? Yep, that’s what motion-compensation-powered compression does – it stores the moving panda just once (usually, it would store the whole frame #1), but it adds information about movement. Then the decompressor uses this information to reconstruct the remaining frames (frame #2 based on frame #1).



That’s the general idea, but in practice it’s not as smooth and easy as in the example. The objects are rarely the same, and usually some distortions and non-linear transformations creep in. Scanning for movements is very expensive computationally, so we have to limit the search space and optimize the code, even resorting to hand-written assembly.
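To make the idea concrete, here is a deliberately naive NumPy sketch of block matching over a bounded search window. It is nothing like the hand-optimized code in a real encoder, and the block size and search radius are arbitrary illustrative choices.

    import numpy as np

    def estimate_motion(prev, curr, block=16, search=8):
        # For each block-sized tile of `prev`, find the offset (dy, dx) within
        # +/- `search` pixels of `curr` that minimizes the sum of absolute
        # differences (SAD). Returns {(block_y, block_x): (dy, dx)}.
        h, w = prev.shape
        vectors = {}
        for by in range(0, h - block + 1, block):
            for bx in range(0, w - block + 1, block):
                ref = prev[by:by + block, bx:bx + block].astype(np.int32)
                best_sad, best_vec = None, (0, 0)
                # The search window is bounded: scanning the whole frame for
                # every block would be far too expensive.
                for dy in range(-search, search + 1):
                    for dx in range(-search, search + 1):
                        y, x = by + dy, bx + dx
                        if y < 0 or x < 0 or y + block > h or x + block > w:
                            continue
                        cand = curr[y:y + block, x:x + block].astype(np.int32)
                        sad = int(np.abs(ref - cand).sum())
                        if best_sad is None or sad < best_sad:
                            best_sad, best_vec = sad, (dy, dx)
                vectors[(by, bx)] = best_vec
        return vectors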




Frame rate conversion by motion compensation


Motion compensation can be used for frame rate conversion too, often with really impressive results.



For illustration, let’s go back to the moving panda example. Let’s assume we want to change the frame rate from two frames per second (FPS) to three FPS. In order to maintain the video speed, each frame will be on screen for a shorter amount of time (.5 sec vs .33 sec).



One way to increase the number of frames is to duplicate a frame, resulting in three FPS, but the quality will suffer. As you can see, frame #1 has been duplicated:




Converting from 2 FPS to 3 FPS by duplicating frames

Yes, the output has three frames and the input has two, but the effect isn’t visually appealing. We need a bit of magic to create a frame that humans would see as naturally fitting between the two initial frames – the panda has to be in the middle. That’s a task motion compensation can deal with – detect the motion, but instead of using it for compression, create a new frame based on the gathered information. Here’s how it should work:




Converting from 2 FPS to 3 FPS by motion compensation: Panda's in the middle!

Notice that by creating a new frame, we keep our panda hero at the center.
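Continuing the sketch above, the interpolation step reuses those motion vectors: each block is placed half-way along its vector and blended with its match, which is what keeps the panda in the middle. Again, this is an illustrative toy, not our production interpolator (real implementations also have to fill holes and handle overlapping blocks).

    import numpy as np

    def interpolate_midframe(prev, curr, vectors, block=16):
        # Synthesize the in-between frame: move each block of `prev` half-way
        # along its motion vector and blend it with the matching block of `curr`.
        mid = np.zeros_like(prev, dtype=np.float64)
        h, w = prev.shape
        for (by, bx), (dy, dx) in vectors.items():
            my = min(max(by + dy // 2, 0), h - block)  # half-way position,
            mx = min(max(bx + dx // 2, 0), w - block)  # clamped to the frame
            src = prev[by:by + block, bx:bx + block].astype(np.float64)
            dst = curr[by + dy:by + dy + block,
                       bx + dx:bx + dx + block].astype(np.float64)
            mid[my:my + block, mx:mx + block] = (src + dst) / 2.0
        return mid.astype(prev.dtype)

    # Usage with the estimate_motion() sketch above (2 FPS to 3 FPS):
    # vectors = estimate_motion(frame1, frame2)
    # frame_in_between = interpolate_midframe(frame1, frame2, vectors)
    # output = [frame1, frame_in_between, frame2]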



Now for video examples, taken straight from a Panda encoder. Here’s what frame duplication (the bad guy) looks like in action (for better illustration, after converting FPS, we slowed down the video):












While the video on the left is very smooth, the frame duplicated version on the right is jittery. Not great. Now, what happens when we use motion compensation (the good guy):












The movement is smooth and, aside from slight noise, we don’t catch a glimpse of any video artifacts.



There are other types of footage that fool the algorithm more easily. Motion compensation assumes simple, linear movement, so other kinds of image transformations can produce heavier artifacts that may or may not be acceptable, depending on the use case. Occlusions, refractions (you see these in water bubbles) and very quick movements, where too much happens between frames, are the most common examples of image transformations that can produce lower visual quality. Here’s a video full of occlusions and water:










Now let’s slow it down and see frame duplication and motion compensation side-by-side.









Motion compensation produces clear artifacts (those fake electric discharges), but still maintains higher visual quality than frame duplication.



The unanimous verdict of a short survey we shared in our office: motion compensation produces much better imaging than frame duplication.



Google Cloud Platform products like Google Compute Engine allowed us to improve performance in encoding by 30%, as well as shift our energy from focusing on underlying infrastructure to innovating for our customers. We’ve also been able to take advantage of sustained use discounts, which have helped lower our infrastructure costs, without the need to sign contracts or reserve capacity. Google’s network performance is also a huge asset for us, given video files are so large and we need to move them frequently. To learn more about how we’re using Cloud Platform, watch our video.



Panda is excited to be at this year’s NAB Show, one of the world’s largest gatherings of technologists and digital content providers. The team will be in the StudioXperience area with Filepicker in the South Upper Hall, SU621.