Google Cloud Platform Blog
Take your logs data to new places with streaming export to Cloud Pub/Sub
Thursday, April 30, 2015
Earlier this year, we announced the beta of the Google Cloud Logging service, which included the capability to:
Stream logs in real time to Google BigQuery, so you can analyze log data and get immediate insights.
Export logs to Google Cloud Storage (including Google Cloud Storage Nearline), so you can archive log data for longer periods to meet backup and compliance requirements.
Today we’re expanding Cloud Logging capabilities with the beta of the Cloud Logging Connector, which allows you to stream logs to Google Cloud Pub/Sub. With this capability you can stream log data to your own endpoints and further expand how you make big data useful. For example, you can now transform and enrich the data in Google Cloud Dataflow before sending it to BigQuery for analysis. This also provides easy real-time access to all your log data, so you can export it to your private cloud or any third-party application.
Cloud Pub/Sub
Google Cloud Pub/Sub is designed to deliver real-time, reliable messaging in one global, managed service that helps you create simpler, more reliable, and more flexible applications. By providing many-to-many, asynchronous messaging that decouples senders and receivers, it allows for secure and highly available communication between independently written applications. With Cloud Pub/Sub, you can push your log events to a webhook, or pull them as they happen. For more information, check out our Google Cloud Pub/Sub documentation.
High-Level Pub/Sub Schema
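To give a concrete sense of the pull model, here is a minimal sketch (not from the original post) that reads exported log entries from a subscription with the google-cloud-pubsub Python client. The project ID, subscription name, and the assumption that a log export sink already publishes to the topic are all hypothetical, and the client library shown postdates the 2015 API.

```python
# Hypothetical sketch: pull exported log entries from a Pub/Sub subscription.
# Assumes a Cloud Logging export already publishes to the topic behind "logs-sub".
import json
from google.cloud import pubsub_v1

project_id = "my-project"      # hypothetical project
subscription_id = "logs-sub"   # hypothetical subscription fed by the log export

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path(project_id, subscription_id)

response = subscriber.pull(
    request={"subscription": subscription_path, "max_messages": 10})

ack_ids = []
for received in response.received_messages:
    entry = json.loads(received.message.data)   # assumes each payload is a JSON log entry
    print(entry.get("timestamp"), entry.get("logName"))
    ack_ids.append(received.ack_id)

if ack_ids:
    subscriber.acknowledge(
        request={"subscription": subscription_path, "ack_ids": ack_ids})
```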
Configuring Export to Cloud Pub/Sub
Configuring export of logs to Cloud Pub/Sub is easy and can be done from the Logs Viewer user interface. To get to the export configuration UI, start in the Google Developers Console, go to Logs under Monitoring, and then click Exports in the top menu. Export configuration is currently supported for Google App Engine and Google Compute Engine logs.
One Click Export Configuration in the Developers Console
Transforming Log Data in Dataflow
Google Cloud Dataflow allows you to build, deploy, and run data processing pipelines at global scale. It enables reliable execution for large-scale data processing scenarios such as ETL and analytics, and allows pipelines to execute in either streaming or batch mode. You choose.
You can use the Cloud Pub/Sub export mechanism to stream your log data to Cloud Dataflow and dynamically generate fields, combine different log tables for correlation, and parse and enrich the data for custom needs. Here are a few examples of what you can achieve with log data in Cloud Dataflow:
Sometimes it is useful to see the data only for the key applications for top customers. In Cloud Dataflow, you can group logs by Customer ID or Application ID, filter out specific logs, and then apply some aggregation of system level or application level metrics.
On the flip side, sometimes you want to enrich the log data to make it easier to analyze, for example by appending marketing campaign information to customer interaction logs, or other user profile info. Cloud Dataflow lets you do this on the fly.
In addition to preparing the data for further analysis, Cloud Dataflow also lets you perform analysis in real time. So you can look for anomalies, detect security intrusions, generate alerts, keep a real-time dashboard updated, etc.
Cloud Dataflow can stream the processed data to BigQuery, so you can analyze your enriched data. For more details, please see the Google Cloud Dataflow documentation.
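As a rough illustration of the filter-and-enrich pattern described above, the sketch below uses the open-source Apache Beam Python SDK, which implements the same programming model Cloud Dataflow runs (the original Cloud Dataflow SDK was Java). The subscription, BigQuery table, field names, and campaign lookup are all hypothetical.

```python
# Hypothetical sketch: read log entries from Pub/Sub, keep errors, enrich them,
# and stream the result into BigQuery. Names and fields are assumptions.
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

CAMPAIGNS = {"user-1": "spring-promo"}   # hypothetical enrichment lookup

def enrich(entry):
    """Attach campaign info and project only the fields the output table expects."""
    entry["campaign"] = CAMPAIGNS.get(entry.get("userId"), "none")
    return {k: entry.get(k) for k in ("timestamp", "severity", "userId", "campaign")}

options = PipelineOptions(streaming=True)
with beam.Pipeline(options=options) as pipeline:
    (pipeline
     | "ReadLogs" >> beam.io.ReadFromPubSub(
           subscription="projects/my-project/subscriptions/logs-sub")
     | "Parse" >> beam.Map(json.loads)
     | "KeepErrors" >> beam.Filter(lambda e: e.get("severity") == "ERROR")
     | "Enrich" >> beam.Map(enrich)
     | "ToBigQuery" >> beam.io.WriteToBigQuery(
           "my-project:logs.enriched_errors",
           schema="timestamp:TIMESTAMP,severity:STRING,userId:STRING,campaign:STRING"))
```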
Getting Started
If you’re a current Google Cloud Platform user, the capability to stream logs to Cloud Pub/Sub is available to you at no additional charge; standard charges for using Cloud Pub/Sub and Cloud Dataflow still apply. For more information, visit the Cloud Logging documentation page and share your feedback.
-Posted by Deepak Tiwari, Product Manager
Streak’s Top 6 Tips for App Engine
Tuesday, April 28, 2015
When Streak — CRM in your inbox — launched in March 2012, our user base grew 30% every week for four consecutive months. Today, Streak supports millions of users with only 1.5 back-end engineers. We chose Google App Engine to power our application because it enabled our team to build features fast and scaled with user growth. Plus, we didn’t have to worry about infrastructure.
Streak’s data growth
Here are six tips we’ve learned building on App Engine. If you’d like even more detail – including an overview of our app’s architecture and 15 minutes of Q&A – you can check out my webinar.
1. Keep user-facing GET requests fast
This tip isn’t specific to App Engine; it really applies to most web applications. User-facing GET requests should be quick. App Engine has a 60 second timeout on all requests, but frankly, if the total latency after a user interaction is longer than 200ms, users will perceive your app as slow. To keep requests fast, do your heavyweight processing – such as calculations or complex queries – either in the background or at write time. That way, when the user requests data (read time), it’s already precalculated and ready to go.
2. Take advantage of Managed VMs
So, what are Managed VMs? Managed VMs are a new hosting environment for App Engine, enabling you to take advantage of beefier compute resources and run your own custom runtimes. For example, we host our back-end data processing modules on n1-standard-1 machines (1 CPU and 3.75 GB memory), rather than App Engine frontend instances. This provides better performance and cost savings, thanks to sustained use discounts. Yes, Managed VMs take a little longer to boot up than an App Engine frontend instance, but they’re perfect for our background processing needs.
3. Denormalize for faster reads
Google Cloud Datastore is a NoSQL database, so if you’re coming from the RDBMS world it requires a different approach to data modeling. You have to be comfortable denormalizing and duplicating data, since SQL joins don’t exist. Data duplication might feel uncomfortable, but it makes your reads very fast.
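Here is a minimal sketch of what that denormalization can look like, assuming a hypothetical Deal/Customer schema and using the google-cloud-datastore Python client (Streak’s actual stack is Java on App Engine, so this is illustrative only): the customer’s name is copied onto each deal at write time so list views never need a join or a second lookup.

```python
# Hypothetical sketch of denormalized writes with google-cloud-datastore.
from google.cloud import datastore

client = datastore.Client()

def create_deal(deal_id, customer_key, title):
    customer = client.get(customer_key)               # one extra read at write time
    deal = datastore.Entity(client.key("Deal", deal_id))
    deal.update({
        "title": title,
        "customer": customer_key,                      # reference, kept for updates
        "customerName": customer["name"],              # duplicated for fast reads
    })
    client.put(deal)

def list_deals_for_display():
    # Reads stay fast: every field the UI needs lives on the Deal entity itself.
    return [(d["title"], d["customerName"])
            for d in client.query(kind="Deal").fetch(limit=20)]
```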
4. Break your application into modules
Modules make it easy to break your App Engine app into different components. For example, you could have a module for user-facing traffic and one for background processing. Each module has its own yaml file, so you can set parameters such as instance size, version number, runtime language, and more. As mentioned above, our backend modules take advantage of Managed VMs for performance and cost benefits, while our frontend module uses App Engine frontend instances that scale more quickly. The documentation discusses best practices on how you should structure your app.
5. Deploy aggressively and use traffic splitting
At Streak, we do continuous deployments because versioning, deployment, and rollout are easy with App Engine. In fact, sometimes we deploy up to 20 times per day to get changes into the hands of customers. We aggressively deploy to many production versions of our app and then selectively turn on new features for our users. As we slowly ramp up traffic to these new versions via traffic splitting, we catch issues early and often. These are usually easy to deal with, because each new deploy contains a small set of functionality, so it’s easy to find the relevant issues in the code base. We also use Google Cloud Monitoring and our own homegrown system (based on #6 below) to monitor these deploys for changes.
6. Use BigQuery to analyze your log files
Application and request logs can give you valuable insights into performance and help you make product improvements. If you’re just starting out, the log viewer’s list of recent requests will be just fine, but once you’ve reached scale you’ll want to analyze aggregate data or a specific user’s requests. We built custom code to export our logs to Google BigQuery, but you can now stream your logs directly from the Developers Console. With these insights, my team can build a better user experience.
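For illustration only, here is a hedged sketch of the kind of aggregate analysis you might run once request logs land in BigQuery, using the google-cloud-bigquery Python client. The dataset, table, and column names are assumptions, not Streak’s actual schema.

```python
# Hypothetical sketch: per-path request counts and average latency over the last day.
from google.cloud import bigquery

client = bigquery.Client()
query = """
    SELECT resource AS path,
           COUNT(*) AS requests,
           AVG(latency) AS avg_latency_s
    FROM `my-project.logs.appengine_request_log`   -- assumed table of exported logs
    WHERE ts > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
    GROUP BY path
    ORDER BY avg_latency_s DESC
    LIMIT 10
"""
for row in client.query(query):   # iterating the job waits for and yields results
    print(row.path, row.requests, round(row.avg_latency_s, 3))
```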
Watch the webinar
App Engine has been critical to our success. As our application has scaled, so has App Engine, and we’ve been able to focus on building features for our customers rather than ops. To learn more tips about App Engine – including an overview of our architecture and 15 minutes of Q&A – watch the full webinar.
-Posted by Aleem Mawani, CEO and co-founder, Streak
Understanding Cloud Pricing Part 2 - Local SSD
Tuesday, April 21, 2015
Understanding is good, but trying it for yourself is better. Today, we’ve made that a bit easier.
Google would like you to experience the performance of Local SSD firsthand. To make it even easier to try out this feature, we're giving our customers a discounted trial. For the next month (April 21, 2015 to May 21, 2015), Local SSD will be priced at $0.055/GB/month, a 75% discount. After that time, the price will return to its normal $0.218/GB/month. The analysis below is built on our long-term pricing, so during the promotion this month, you'll see 75% savings on these numbers. There’s never been a better time to “kick the tires” on Local SSD, so don’t wait.
Since publishing our Understanding Cloud Pricing blog post, one of the most frequent follow-up requests (please keep your great questions and ideas coming!) has been a closer look at storage costs and performance, especially in areas where our products work a little differently from other cloud services.
Solid State Disk (SSD) is an incredible technology, but it’s really important to realize that the wide variety of devices, configurations, connectors, and drivers can create order-of-magnitude or larger differences in performance. Not all Solid State Disks are created equal.
Additionally, different cloud providers deliver SSD in very different packages. Once again, rather than just reciting the stats and leaving the real-world analysis to you, we're going to provide a clear example of a system that uses local SSD, and analyze the difference between running it on Google Cloud Platform and AWS.
Let’s imagine that we’re going to deploy a NoSQL key-value store backend for a web-scale application, similar to what we used in our first example. We’ll use conservative best practices and deploy a three-node cluster, hosting data on local SSD for maximum performance.
On Google Compute Engine, we’ll use n1-highmem-8 instances with four attached local SSD volumes, which is almost identical in CPU, RAM, and SSD storage volume to the AWS i2.2xlarge instance. We’ll be set up to deliver at least 75,000 IOPS. Blazing fast queries, here we come!
Please note that we completed these calculations on April 3, 2015, and have included the output prices in this post. Any discrepancies are likely due to pricing or calculator changes following the publishing of this post.
Here's the output of the pricing calculators:
Google Cloud Platform estimate: $1883.04 monthly
Amazon Web Services estimate: $3744.18 monthly
You’ll notice that Google Cloud Platform comes in quite a bit cheaper. Some of that’s due to our automatic Sustained Use Discounts, but even without those, we’re still 39% less expensive. Here are all the details, by the numbers:
i2.2xlarge advantages:
17% more memory
7% more SSD space
n1-highmem-8 with 4 attached SSD partitions advantages:
39% less expensive
807% more read IOPS
380% more write IOPS
Did you catch that? 807% more read IOPS! Over nine times the read performance, at nearly half the cost, is not a small difference.
So what impact does this have for our NoSQL workload? Assuming a read-bound workload growing over time (many are, like reporting and analytics systems), as read capacity on the SSD in our instances gets exhausted, we’ll need to scale out our cluster by adding additional nodes. Let’s imagine read traffic multiplies by six (product success is a good problem to have).
Here's the output of the pricing calculators:
Google Cloud Platform estimate: $1883.04 monthly (yup, exactly the same as above)
Amazon Web Services estimate: $22465.08 monthly
In order to equal the read throughput of our SSD on AWS, you’d need to step up to the next larger instance size (i2.4xlarge) and run three times as many of them. The extra read performance that Google Cloud Platform SSD provides means you not only keep the same simple three-node system (saving you *real money* in admin/ops costs), but you keep the same low price. If you have a write-bound workload, you’d enjoy a similar advantage in picking Google; we offer nearly 4x the write performance, so you’d need to bump up your AWS configuration similarly to keep pace.
What if you’re trying to get started smaller than where we started? Not every app needs 680k IOPS! This is one of the most important differences between Google Cloud Platform’s SSD implementation and the AWS instances: you can add SSD to standard, highmem, and highcpu instances in 375GB increments. This means that you can start on highly efficient SSD and scale more linearly. It’s important to note that AWS does include some small single-copy SSD on instances for use as an efficient scratch disk; these aren’t designed for heavy data usage, and AWS does not provide a documented performance specification.
Because SSD is available on all of our primary instances, you can easily configure a much smaller instance type and still keep the power of local SSD. Let’s go down to the smallest three-node configurations we can get on each provider that still give us access to full performance SSD. For us, that’d be n1-standard-1 instances with 1x375GB local SSD, for AWS that’d be i2.xlarge instances with 1x800GB local SSD.
Here’s the output of the pricing calculators:
Google Cloud Platform estimate: $341.90 monthly
Amazon Web Services estimate: $1873.20 monthly
That’s a huge discrepancy.
On Google, this system is so cost efficient that you can run it for 3 weeks and stay within our free trial, with room for lots more experimentation!
Comparing Prices for SSD specifically
With local SSD, it’s been a bit of a challenge to compare prices between clouds directly, because AWS bundles compute and local storage into a single SKU, whereas Google Compute Engine decouples them, giving customers more freedom to rightsize and optimize their deployments.
However, using publicly published AWS documentation, it’s possible to derive a price for EC2’s local SSD by comparing configurations and prices of similar instance types that differ only in price and amount of SSD. All configuration information comes from the EC2 instance type web page, and all pricing information comes from the EC2 instance pricing page. In all cases, we use the on-demand prices in Northern Virginia.
The methodology is basically to compare r3 (memory optimized) and i2 (storage optimized) instance types. By grouping them together in pairs that have the same amount of CPU and memory but different amounts of SSD and different prices, and dividing the difference in price by the difference in SSD capacity, you can derive the per-GB local SSD price that AWS charges its customers. Each of the four r3/i2 pair comparisons yields a local SSD price of $0.0007/GB/hour.
By comparison, we sell Local SSD in 375GB chunks for $0.218/GB/month. Normalizing that to hourly pricing, we get $0.0003/GB/hour. So there’s the bottom line: we charge 57% less for local SSD that’s at least 4.8x faster than AWS.
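As a sanity check on that arithmetic, here is a small Python sketch of the methodology: pair instances with identical CPU and RAM, divide the price delta by the SSD delta, and normalize Local SSD’s monthly price to an hourly rate. The AWS figure used below is simply the post’s derived $0.0007/GB/hour; actual r3/i2 prices should come from the AWS pricing page.

```python
# Back-of-the-envelope check of the derivation above.
HOURS_PER_MONTH = 730   # average hours per month, used to normalize prices

def implied_ssd_price_per_gb_hour(i2_hourly, r3_hourly, i2_ssd_gb, r3_ssd_gb):
    """Per-GB/hour SSD price implied by an r3/i2 pair with identical CPU and RAM."""
    return (i2_hourly - r3_hourly) / (i2_ssd_gb - r3_ssd_gb)

# The post's result for each of the four r3/i2 pairs: roughly $0.0007/GB/hour.
aws_gb_hour = 0.0007
# GCE Local SSD: $0.218/GB/month, normalized to an hourly rate (~$0.0003/GB/hour).
gce_gb_hour = 0.218 / HOURS_PER_MONTH

print(f"GCE ~${gce_gb_hour:.4f}/GB/hr vs AWS ~${aws_gb_hour:.4f}/GB/hr")
print(f"GCE is {1 - gce_gb_hour / aws_gb_hour:.0%} less expensive per GB-hour")  # ~57%
```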
We think pricing is a critical consideration as you try to make the best decision you can about infrastructure systems design. I’d love to hear your thoughts and what matters to you in cloud pricing. What areas are confusing, hard to analyze, or hard to predict? What ideas do you have?
Reach out to us on Stack Overflow if there’s anything we can do to add more value.
-Posted by Miles Ward, Global Head of Solutions, Google Cloud Platform
Take your big data to new places with Google BigQuery
Friday, April 17, 2015
Yesterday, we announced that Google Cloud Platform big data services are taking a big step forward by allowing everyone to use big data the cloud way. Google BigQuery has many new features and is now available in European zones. These improvements were designed to extend BigQuery’s performance and capabilities, giving you greater peace of mind and control over your data.
European Data Location Control
You now have the option to store your BigQuery data in European locations while continuing to benefit from a fully managed service, with geographic data control and without low-level cluster maintenance headaches. Feel free to contact the Google Cloud Platform technical support team for details on how to set this up.
Streaming Inserts
One of BigQuery's most popular features is the ability to stream data into the service for real-time analysis. To allow such low-latency analysis on very high-volume streams, we've increased the default insert-rate limit from 10,000 rows per second, per table, to 100,000 rows per second, per table. In addition, the row-size limit has increased from 20 KB to 1 MB, and pricing will move from a per-row model to a per-byte model for better flexibility and scale.
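For a feel of what streaming ingestion looks like from code, here is a minimal sketch using the google-cloud-bigquery Python client (which postdates this post); the table ID and row fields are hypothetical. Streamed rows become queryable within seconds.

```python
# Hypothetical sketch: stream a few rows into a BigQuery table.
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.analytics.events"   # hypothetical table

rows = [
    {"event": "page_view", "user_id": "u-123", "ts": "2015-04-17T12:00:00Z"},
    {"event": "purchase",  "user_id": "u-456", "ts": "2015-04-17T12:00:01Z"},
]

errors = client.insert_rows_json(table_id, rows)   # returns [] on success
if errors:
    print("Insert errors:", errors)
```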
Security Features
BigQuery can now tackle a wider range of enterprise applications with the addition of data expiration controls and row-level permissions. Row-level permissions eliminate the need to create different views for different users, allowing secure shared access to systems such as finance or HR. This ensures that you get the information that’s relevant to you. In addition, data in BigQuery will be encrypted at rest.
Google Cloud Platform Logging Integration
Google Cloud Logging provides a powerful set of tools for managing your operations and understanding the systems powering your business; now, it also lets your Google App Engine and Google Compute Engine applications stream their logs into BigQuery. This allows you to perform real-time analysis on your log data and gain insight into how your system is performing and how your users are behaving. By joining application logs with your marketing and partnership data, you can rapidly evaluate the effectiveness of your outreach, or apply context from user profile information to your application logs to quickly assess what behavior resulted from specific customer interactions. This provides easy and immediate value to both system administrators and business analysts.
Frequently requested features
Additionally, we’ve implemented a number of new features you’ve been asking for. You can now:
load content from Google Cloud Datastore
nest query results
leverage FULL and RIGHT OUTER JOINS
roll up your aggregations to include subtotals
recover data from recently deleted tables
take advantage of a number of improvements to the web UI
For a full list of features, take a look at the release notes.
Unprecedented scale
BigQuery continues to provide exceptional scale and performance without requiring you to deploy, augment or update your own clusters. Instead, you can focus on getting meaningful insights from massive amounts of data. For example:
BigQuery absorbs real-time streams of customer data totaling more than 100 TB per day, which you can query immediately. All this data is in addition to the hundreds of terabytes loaded daily from other sources. If you have fast-moving, large-scale applications such as IoT, you can now make quick, accurate decisions against in-flight applications.
We have customers currently running queries that scan multiple petabytes of data or tens of trillions of rows using a simple SQL query, without ever having to worry about system provisioning, maintenance, fault-tolerance or performance tuning.
With BigQuery’s new features, you can analyze even more data and access it faster than before, in brand new ways. To get started, learn more about BigQuery, read the documentation, and try it out for yourself.
-Posted by Andrew Kowal, Product Manager
Big data is easier than ever with Google Cloud Dataflow
Thursday, April 16, 2015
Big data applications can provide extremely valuable insights, but extracting that value often demands high overhead: significant deployment, tuning, and operational effort across diverse systems and programming models. As a result, work other than the actual programming and data analysis dominates the time needed to build and maintain a big data application. The industry has come to accept these pains and inefficiencies as an unavoidable cost of doing business. We believe you deserve better.
In Google’s systems infrastructure team, we’ve been tackling challenging big data problems for more than a decade and are well aware of the difference that simple yet powerful data processing tools make. We have translated our experience from MapReduce, FlumeJava, and MillWheel into a single product, Google Cloud Dataflow. It’s designed to reduce operational overhead and make programming and data analysis your only job, whether you’re a data scientist, data analyst, or data-centric software developer. Along with other Google Cloud Platform big data services, Cloud Dataflow embodies the kind of highly productive and fully managed services designed to use big data, the cloud way.
Today we’re pleased to make Google Cloud Dataflow available in beta, for use by anyone on Google Cloud Platform. With Cloud Dataflow, you can:
Merge your batch and stream processing pipelines thanks to a unified and convenient programming model. The model and the underlying managed service let you easily express data processing pipelines, make powerful decisions, obtain insights, and eliminate the switching cost between batch and continuous stream processing.
Finely tune the desired correctness model for your data processing needs through powerful API primitives for handling late-arriving data. You can process data based on event time as well as clock time, and gracefully deal with upstream data latency when processing data from unbounded sources (a small windowing sketch follows this list).
Leverage a fully managed service, complete with dynamically adaptive auto-scaling and auto-tuning, that offers attractive performance out of the box. Whether you’re a developer or a systems operator, you no longer need to invest time worrying about resource provisioning or attempting to optimize resource usage. Automation, a fully managed service, and the programming model work together to significantly lower both CAPEX and OPEX.
Enjoy reduced complexity when managing and debugging highly parallelized processes, with a simplified monitoring interface that’s logically mapped to your processing logic rather than to how your code is mapped to the underlying execution plane.
Benefit from integrated processing of data across Google Cloud Platform, with optimized support for services such as Google Cloud Storage, Google Cloud Datastore, Google Cloud Pub/Sub, and Google BigQuery.
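Below is a small, hypothetical sketch of the event-time windowing idea mentioned in the list above, written with the open-source Apache Beam Python SDK, which implements the same model Cloud Dataflow runs (the original Cloud Dataflow SDK was Java). The topic name and the epoch_seconds/user_id fields are assumptions.

```python
# Hypothetical sketch: group events into one-minute windows by their embedded
# event timestamp (event time), not by when they arrive (clock time).
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms import window

def with_event_time(raw):
    """Parse a JSON event and stamp it with its own event timestamp."""
    event = json.loads(raw)
    return window.TimestampedValue(event, event["epoch_seconds"])

options = PipelineOptions(streaming=True)
with beam.Pipeline(options=options) as pipeline:
    (pipeline
     | beam.io.ReadFromPubSub(topic="projects/my-project/topics/events")
     | beam.Map(with_event_time)                   # switch to event time
     | beam.WindowInto(window.FixedWindows(60))    # one-minute event-time windows
     | beam.Map(lambda e: (e["user_id"], 1))
     | beam.CombinePerKey(sum)                     # events per user, per window
     | beam.Map(print))
```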
We’re also working with major open source contributors on maturing the Cloud Dataflow ecosystem. For example, we recently announced collaborations with Data Artisans for runtime support for Apache Flink, and with Cloudera for runtime support for Apache Spark.
We’d like to thank our alpha users for their numerous suggestions, reports, and support along this journey. Their input has certainly made Cloud Dataflow a better product. Now, during beta, everyone can use Cloud Dataflow, and we continue to welcome questions and feedback on Stack Overflow. We hope that you’ll give Cloud Dataflow a try and enjoy big data made easy.
-Posted by Grzegorz Czajkowski, Director of Engineering
Big data, the cloud way
Thursday, April 16, 2015
The promise of big data is faster and better insight into your business. Yet it often turns into an infrastructure project. Why? For example, you might be collecting a deluge of information and then correlating, enriching and attempting to extract real-time insights. Should you expect such feats, by their very nature, to involve a large amount of resource management and system administration? You shouldn’t. Not in the cloud. Not if you’re using big data the cloud way.
Big data the cloud way means being more productive when building applications, with faster and better insights, without having to worry about the underlying infrastructure. More specifically, it includes:
NoOps: Your cloud provider should worry about deploying, managing and upgrading infrastructure to make it scalable and reliable. “NoOps” means the platform handles such tasks and optimizations for you, freeing you up to focus on understanding and exploiting the value in your data.
Cost effectiveness: In addition to increased ease of use and agility, a “NoOps” solution provides clear cost benefits by removing operations work. But the cost benefits of big data the cloud way go even further: the platform auto-scales and optimizes your infrastructure consumption, and eliminates unused resources like idle clusters. You manage your costs by dialing up or down the number of queries and the latency of your processing based on your cost/benefit analysis. You should never have to re-architect your system to adjust your costs.
Safe and easy collaboration: You can share datasets from files in Google Cloud Storage or tables in Google BigQuery with collaborators inside or outside of your organization, without the need to make copies or grant database access. There’s one version of the data – which you control – and authorized users can access it (at no cost to you) without affecting the performance of your jobs.
Google has been blazing the big data trail for the rest of the industry, so when you use Google Cloud Platform, big data the cloud way also means:
Cutting-edge features: Google Cloud Dataflow provides reliable, event-time-based stream processing, available by default with no extra work. But making stream processing easy and reliable doesn’t mean removing the option of running in batch. The same pipeline can execute in batch mode, which you can use to lower costs or analyze historical data. Now, consistently processing streaming data at large scale doesn’t have to be a complex and brittle endeavor reserved for the most critical scenarios.
Google Cloud Platform delivers these characteristics by making data analysis quick, affordable and easy. Today, at the Hadoop Summit in Brussels, we announced that our big data services are taking a big step forward – allowing everyone to use big data the cloud way.
Google Cloud Dataflow now available in beta
Today, nothing stands between you and the satisfaction of seeing your processing logic, applied in your choice of streaming or batch mode, executed via a fully managed processing service. Just write a program, submit it, and Cloud Dataflow will do the rest. No clusters to manage – Cloud Dataflow will start the needed resources, autoscale them (within the bounds you choose), and terminate them as soon as the work is done. You can get started right now.
Google BigQuery has many new features and is now available in European zones
BigQuery, the quintessential cloud-native, API-driven service for SQL analytics, has new security and performance features. For example, the introduction of row-level permissions makes data sharing even easier and more flexible. With its ease of ingestion (we’ve raised the default ingestion limit to 100,000 rows per second per table), virtually unlimited storage, and fantastic query performance even for huge datasets, BigQuery is the ideal platform for storing, analyzing and sharing structured data. It also supports repeated records and querying inside JSON objects for loosely structured data. In addition, starting today, BigQuery offers the option to store your data in Google Cloud Platform European zones. You can contact Google technical support today to use this option.
A comprehensive set of big data services
Google Cloud Pub/Sub is designed to provide scalable, reliable and fast event delivery as a fully managed service. Along with BigQuery streaming ingestion and Cloud Dataflow stream processing, it completes the platform’s end-to-end support for low-latency data processing. Whether you’re processing customer actions, application logs or IoT events, Google Cloud Platform allows you to handle them in real time, the cloud way. Leave Google Cloud Platform in charge of all the scaling and administration tasks so you can focus on what needs to happen, not how.
Using big data the cloud way doesn’t mean that Hadoop, Spark, Flink and other open source tools originally created for on-premises use can’t be used in the cloud. We’ve ensured that you can benefit from the richness of the open source big data ecosystem via native connectors to Google Cloud Storage and BigQuery, along with automated Hadoop/Spark cluster deployment.
Google BigQuery customer zulily joined us recently for a big data webinar to share their experience using big data the cloud way, and how it helped them increase revenue and overall business visibility while decreasing their operating costs. If you’re interested in exploring these types of benefits for your own company, you can easily get started today by running your first query on a public dataset or uploading your own data.
Here’s a simplified illustration of how Google Cloud Platform data processing services relate to each other and support all stages of the data lifecycle:
Scuba equipment helps humans operate under water, but divers still fall hopelessly short of the efficiency and agility of marine creatures. When it comes to big data in the cloud, be a dolphin, not a scuba diver. Google Cloud Platform offers a set of powerful, scalable, easy-to-use and efficient big data services built for the cloud. Embrace big data, the cloud way, by taking advantage of them today.
Learn more about Google Cloud Platform’s big data solutions, or get started with Dataflow and BigQuery today. We can’t wait to see what you achieve when you use big data the cloud way.
-Posted by William Vambenepe, Product Manager
Media Management with Google Cloud Platform - Live from NAB in Las Vegas
Tuesday, April 14, 2015
A feature-length animated movie takes up to 100 million compute hours to render. 100 million.
When you hear the two words “Google” and “media,” what pops into your mind? YouTube, right? Well, as I’m excited to explain, media means much more than “YouTube” at Google. The media and entertainment industry is a key area of focus for Google Cloud Platform. As I’ll be sharing in my keynote address for the cloud conference at the 97,000-attendee NAB Show in Las Vegas on Tuesday, we’re rapidly expanding our platform and our partner ecosystem to uniquely solve media-specific challenges. In addition to my keynote, Patrick McGregor and Todd Prives from my team are participating in panel sessions on cloud security and cloud rendering. And as part of the recent Virtual NAB conference, Jeff Kember and Miles Ward from Google shared their insights.
We’re witnessing massive changes in the ways media companies are creating, transforming, archiving and delivering content, using the power of the cloud.
We recognize that Google Cloud Platform best supports the media industry when we deliver capabilities that are tailored to specific workflow patterns. Great examples of these capabilities are our services for visual effects rendering. Aside from the skilled work that an artist puts into modeling, animating and compositing a realistic scene, the compute demands required to produce these images are often staggering. Even a relatively simple visual effects shot or animation can take several hours to render the 24 individual frames that make up one second of video.
Google Cloud Platform can greatly accelerate and simplify rendering while charging only for the processor cycles and bits that are consumed. For customers looking for an end-to-end rendering solution, we offer Google Zync Render. Launched in beta in the first quarter of 2015, Zync is a turnkey service for small and medium-sized studios. It integrates directly with existing on-premises software workflows to feel as natural and responsive as a local render farm. Also, through our collaborations with The Foundry and others, Google Cloud Platform provides tools used in the creation of some of the highest-grossing movies.
Zync Render Workflow
By using Google Cloud Platform’s cost-efficient compute and storage, studios can seamlessly extend their rendering pipelines to handle burst capacity needs and remove the bottlenecks typically associated with production deadlines. We’re already seeing great successes from media customers like Framestore, RodeoFX, iStreamPlanet, Panda, and Industriromantik.
We’ve also built compelling general platform capabilities that help media companies with all stages of workflow and the content lifecycle. One example is Google Cloud Storage Nearline, a service that allows a virtually unlimited volume of data to be stored at very low cost, with retrieval times on the order of seconds – not hours, as you would experience with tape. This is ideal for media content archiving. We also recently launched 32-core VM instances for compute-intensive workloads that crunch large volumes of content. And yesterday, we announced a collaboration with Avere Systems that enables us to bridge cloud storage and on-premises storage without impacting performance. This opens huge opportunities for creative collaboration and content production.
Please join us this week at NAB; we hope to see you in Las Vegas!
-Posted by Brian Stevens, VP for Google Cloud Platform
Google’s network edge: presence, connectivity and choice for today’s enterprise
Monday, April 13, 2015
Today’s enterprise must run key workloads both on-premises and remotely in the cloud. There is simultaneously the need to keep quality of service high for end users in terms of network latency and reliability, and the need to ensure efficiency and security for your company’s hybrid workloads – particularly workloads that are bandwidth-intensive or latency-sensitive. Raw performance, reliability and security have been major focus areas for Google from the start, and our goal with Google Cloud Platform is to share the benefits of continuous networking innovation with our customers.
We have four announcements today in support of two major technical goals. The first is to use Google’s global network footprint – over 70 points of presence across 33 countries – to serve users close to where they are, ensuring the same low latency and responsiveness customers can expect from Google’s own services. The second goal relates to enabling enterprises to run mission-critical workloads by connecting their on-premises infrastructure to Google’s network with enterprise-grade encryption.
Today we're announcing:
General availability of Google Cloud DNS
Expansion of Google Cloud Load Balancing solutions to 12 additional points of presence globally (Los Angeles, San Francisco, Chicago, Seattle, Dallas, Miami, London, Paris, Stockholm, Munich, Madrid, Lisbon)
Beta of Cloud VPN
11 additional Google Cloud Interconnect service providers
Managed DNS
With Cloud DNS – our high-performance, managed DNS solution for user-facing applications and services – you can host millions of zones and records and handle SLA-backed name-serving queries. For customers with more than 10,000 zones, our new pricing tier lowers the cost of ownership for large organizations operating DNS infrastructure at scale.
Global Load Balancing
Today’s connected user is accustomed to fast and responsive application services, be they web services accessed from a browser or apps on a mobile device. Latency (“lag”) is noticeable immediately, especially as users switch from a fast, optimized service to a slow one. With the expansion of Google’s load balancing solution to 12 additional locations, your workloads running on Google Cloud Platform are closer in proximity to your users who are making service requests from all over the globe.
Additional Carrier Interconnect service providers and VPN Beta
We continue to build on our goal of enabling enterprises to connect their on-premises infrastructure to Google’s network over encrypted channels to run data-intensive, latency-sensitive workloads. In addition to announcing the beta of Cloud VPN, we’re pleased to introduce 11 additional Carrier Interconnect service providers. Our growing list of technology partners extends our reach to customer locations globally while providing tailored connectivity and choice.
iStreamPlanet is one customer that has taken advantage of our infrastructure breadth to make high-quality connections into the Google network. iStreamPlanet recently launched Aventus, its SaaS-based product that enables content owners to serve high-quality live video with simplicity to viewers across devices. Running on Google Cloud Platform, iStreamPlanet is able to create live video events for its customers in minutes rather than days, and has lowered bandwidth costs by more than 40 percent using Google Cloud Platform’s Direct Peering offering.
We’d also like to welcome CloudFlare as a Google Cloud Platform Technology Partner. CloudFlare provides website speed optimization, security and DDoS protection, as well as caching solutions over its globally distributed network. With nearly no setup required, CloudFlare reports speed optimizations that result in content loading twice as fast on average for visitors.
Google’s network, built out over the past 15 years, is a key enabler behind the services relied upon every day by our customers and users – from Search to Maps, YouTube to Cloud Platform. We invite you to contact us to explore how we can make Google’s network an extension of your own, or to discuss your specific needs around serving your users wherever they may be globally. You can read more about Google Cloud Networking.
-Posted by Morgan Dollard, Cloud Networking Product Management Lead
Panda achieves greater video quality using motion compensation for frame rate conversion
Thursday, April 9, 2015
Today’s guest post comes from Ed Byrne, Director at Panda – a cloud-based video transcoding platform. To learn more about how Panda uses Google Cloud Platform, watch their case study video.
Panda makes it easy for video producers to encode their video in multiple formats for different mobile device screen sizes. But delivering blazing fast, high-quality videos to customers is no easy task – especially when your engineers are also dealing with infrastructure. Google Cloud Platform features like Live Migration and Autoscaler have allowed us to cut our infrastructure maintenance load to only half of one developer’s time.
With more resources to direct at innovation, we can put our focus on our customers, making their experience better with new and improved features in Panda. In fact, since relying on Google Cloud Platform for underlying infrastructure, we’ve developed our frame rate conversion by motion compensation technology. Our customers love the video quality they get using this feature, and we’re so excited about it that we agreed to give you the lowdown on how it works.
Introduction to motion compensation
Motion compensation is a technique that was originally used for video compression, and now it’s used in virtually every video codec. Its inventors noticed that adjacent frames usually don’t differ too much (except for scene changes), and then used that fact to develop a better encoding scheme than compressing each frame separately. In short, motion-compensation-powered compression tries to detect movement that happens between frames and then use that information for more efficient encoding. Imagine two frames:
Panda on the left...
aaaand on the right
Now, a motion compensating algorithm would detect the fact that it’s the same panda in both frames, just in different locations:
First stage of motion compensation: motion detection
We’re still thinking about compression, so why would we want to store the same panda twice? Yep, that’s what motion-compensation-powered compression does – it stores the moving panda just once (usually, it would store the whole frame #1), but it adds information about movement. Then the decompressor uses this information to construct remaining information (frame #2 based on frame #1).
That’s the general idea, but in practice it’s not as smooth and easy as in the example. The objects are rarely the same, and usually some distortions and non-linear transformations creep in. Scanning for movements is very expensive computationally, so we have to limit the search space and optimize the code, even resorting to hand-written assembly.
Frame rate conversion by motion compensation
Motion compensation can be used for frame rate conversion too, often with really impressive results.
For illustration, let’s go back to the moving panda example. Let’s assume we want to change the frame rate from two frames per second (FPS) to three FPS. In order to maintain the video speed, each frame will be on screen for a shorter amount of time (.5 sec vs .33 sec).
One way to increase the number of frames is to duplicate a frame, resulting in three FPS, but the quality will suffer. As you can see, frame #1 has been duplicated:
Converting from 2 FPS to 3 FPS by duplicating frames
Yes, the output has three frames and the input has two, but the effect isn’t visually appealing. We need a bit of magic to create a frame that humans would see as naturally fitting between the two initial frames – panda has to be in the middle. That’s a task motion compensation could deal with – detect the motion, but instead of using it for compression, create a new frame based on the gathered information. Here’s how it should work:
Converting from 2 FPS to 3 FPS by motion compensation: Panda's in the middle!
Notice that by creating a new frame, we keep our panda hero at the center.
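To make the idea concrete, here is a deliberately simplified sketch of motion-compensated interpolation using exhaustive block matching with NumPy. This is not Panda’s production code: a real interpolator would also fill the holes left by moved blocks, blend both source frames, and use the hand-optimized search mentioned earlier.

```python
# Simplified sketch: build a frame halfway between two grayscale frames by
# block matching (sum of absolute differences) and shifting blocks half-way
# along their detected motion vectors.
import numpy as np

def interpolate_frame(frame1, frame2, block=16, search=8):
    """Synthesize a frame halfway between two grayscale frames (H x W arrays)."""
    h, w = frame1.shape
    mid = np.zeros((h, w), dtype=np.float64)
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            ref = frame1[y:y + block, x:x + block].astype(np.float64)
            best_cost, best_dy, best_dx = np.inf, 0, 0
            # Exhaustively search a small window in frame2 for the best match.
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy <= h - block and 0 <= xx <= w - block:
                        cand = frame2[yy:yy + block, xx:xx + block].astype(np.float64)
                        cost = np.abs(ref - cand).sum()   # sum of absolute differences
                        if cost < best_cost:
                            best_cost, best_dy, best_dx = cost, dy, dx
            # Paste the block halfway along its motion vector (clamped to the frame).
            my = min(max(y + best_dy // 2, 0), h - block)
            mx = min(max(x + best_dx // 2, 0), w - block)
            mid[my:my + block, mx:mx + block] = ref
    return mid.astype(frame1.dtype)
```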
Now for video examples, taken straight from a Panda encoder. Here’s what frame duplication (the bad guy) looks like in action (for better illustration, after converting FPS, we slowed down the video):
While the video on the left is very smooth, the frame duplicated version on the right is jittery. Not great. Now, what happens when we use motion compensation (the good guy):
The movement’s smooth, and outside of slight noise, we don’t catch a glimpse of any video artifacts.
There are other types of footage that fool the algorithm more easily. Motion compensation assumes simple, linear movement, so other kinds of image transformations can produce heavier artifacts that may or may not be acceptable, depending on the use case. Occlusions, refractions – you see these in water bubbles – and very quick movements, where too much happens between frames, are the most common examples of image transformations that can produce lower visual quality. Here’s a video full of occlusions and water:
Now let’s slow it down and see frame duplication and motion compensation side-by-side.
Motion compensation produces clear artifacts (those fake electric discharges), but still maintains higher visual quality than frame duplication.
The unanimous verdict of a short survey we shared in our office: motion compensation produces much better imaging than frame duplication.
Google Cloud Platform products like Google Compute Engine have allowed us to improve encoding performance by 30%, as well as shift our energy from focusing on underlying infrastructure to innovating for our customers. We’ve also been able to take advantage of sustained use discounts, which have helped lower our infrastructure costs without the need to sign contracts or reserve capacity. Google’s network performance is also a huge asset for us, given that video files are so large and we need to move them frequently. To learn more about how we’re using Cloud Platform, watch our video.
Panda’s excited to be at this year’s NAB Show, one of the world’s largest gatherings of technologists and digital content providers. They’ll be in the StudioXperience area with Filepicker in the South Upper Hall, SU621.