Google Cloud Platform Blog
Take your logs data to new places with streaming export to Cloud Pub/Sub
Thursday, April 30, 2015
Earlier this year, we announced the beta of the Google Cloud Logging service, which included the capability to:
Stream logs in real time to Google BigQuery, so you can analyze log data and get immediate insights.
Export logs to Google Cloud Storage (including Google Cloud Storage Nearline), so you can archive log data for longer periods to meet backup and compliance requirements.
Today we’re expanding Cloud Logging capabilities with the beta of the Cloud Logging Connector, which allows you to stream logs to Google Cloud Pub/Sub. With this capability you can stream log data to your own endpoints and further expand how you make big data useful. For example, you can now transform and enrich the data in Google Cloud Dataflow before sending it to BigQuery for analysis. This also provides easy real-time access to all your log data, so you can export it to your private cloud or any third-party application.
Cloud Pub/Sub
Google Cloud Pub/Sub is designed to deliver real-time, reliable messaging in one global, managed service that helps you create simpler, more reliable, and more flexible applications. By providing many-to-many, asynchronous messaging that decouples senders and receivers, it allows for secure and highly available communication between independently written applications. With Cloud Pub/Sub, you can push your log events to a webhook, or pull them as they happen. For more information, check out our Google Cloud Pub/Sub documentation.
High-Level Pub/Sub Schema
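To give a concrete sense of the pull model, here is a minimal sketch (not from the original post) that reads exported log entries from a subscription with the google-cloud-pubsub Python client. The project ID, subscription name, and the assumption that a log export sink already publishes to the topic are all hypothetical, and the client library shown postdates the 2015 API.

```python
# Hypothetical sketch: pull exported log entries from a Pub/Sub subscription.
# Assumes a Cloud Logging export already publishes to the topic behind "logs-sub".
import json
from google.cloud import pubsub_v1

project_id = "my-project"      # hypothetical project
subscription_id = "logs-sub"   # hypothetical subscription fed by the log export

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path(project_id, subscription_id)

response = subscriber.pull(
    request={"subscription": subscription_path, "max_messages": 10})

ack_ids = []
for received in response.received_messages:
    entry = json.loads(received.message.data)   # assumes each payload is a JSON log entry
    print(entry.get("timestamp"), entry.get("logName"))
    ack_ids.append(received.ack_id)

if ack_ids:
    subscriber.acknowledge(
        request={"subscription": subscription_path, "ack_ids": ack_ids})
```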
Configuring Export to Cloud Pub/Sub
Configuring export of logs to Cloud Pub/Sub is easy and can be done from the Logs Viewer user interface. To get to the export configuration UI, start in the Google Developers Console, go to Logs under Monitoring, and then click Exports in the top menu. Export configuration is currently supported for Google App Engine and Google Compute Engine logs.
One Click Export Configuration in the Developers Console
Transforming Log Data in Dataflow
Google Cloud Dataflow allows you to build, deploy, and run data processing pipelines at global scale. It enables reliable execution for large-scale data processing scenarios such as ETL and analytics, and allows pipelines to execute in either streaming or batch mode. You choose.
You can use the Cloud Pub/Sub export mechanism to stream your log data to Cloud Dataflow and dynamically generate fields, combine different log tables for correlation, and parse and enrich the data for custom needs. Here are a few examples of what you can achieve with log data in Cloud Dataflow:
Sometimes it is useful to see the data only for the key applications for top customers. In Cloud Dataflow, you can group logs by Customer ID or Application ID, filter out specific logs, and then apply some aggregation of system level or application level metrics.
On the flip side, sometimes you want to enrich the log data to make it easier to analyze, for example by appending marketing campaign information to customer interaction logs, or other user profile info. Cloud Dataflow lets you do this on the fly.
In addition to preparing the data for further analysis, Cloud Dataflow also lets you perform analysis in real time. So you can look for anomalies, detect security intrusions, generate alerts, keep a real-time dashboard updated, etc.
Cloud Dataflow can stream the processed data to BigQuery, so you can analyze your enriched data. For more details, please see the Google Cloud Dataflow documentation.
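As a rough illustration of the filter-and-enrich pattern described above, the sketch below uses the open-source Apache Beam Python SDK, which implements the same programming model Cloud Dataflow runs (the original Cloud Dataflow SDK was Java). The subscription, BigQuery table, field names, and campaign lookup are all hypothetical.

```python
# Hypothetical sketch: read log entries from Pub/Sub, keep errors, enrich them,
# and stream the result into BigQuery. Names and fields are assumptions.
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

CAMPAIGNS = {"user-1": "spring-promo"}   # hypothetical enrichment lookup

def enrich(entry):
    """Attach campaign info and project only the fields the output table expects."""
    entry["campaign"] = CAMPAIGNS.get(entry.get("userId"), "none")
    return {k: entry.get(k) for k in ("timestamp", "severity", "userId", "campaign")}

options = PipelineOptions(streaming=True)
with beam.Pipeline(options=options) as pipeline:
    (pipeline
     | "ReadLogs" >> beam.io.ReadFromPubSub(
           subscription="projects/my-project/subscriptions/logs-sub")
     | "Parse" >> beam.Map(json.loads)
     | "KeepErrors" >> beam.Filter(lambda e: e.get("severity") == "ERROR")
     | "Enrich" >> beam.Map(enrich)
     | "ToBigQuery" >> beam.io.WriteToBigQuery(
           "my-project:logs.enriched_errors",
           schema="timestamp:TIMESTAMP,severity:STRING,userId:STRING,campaign:STRING"))
```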
Getting Started
If you’re a current Google Cloud Platform user, the capability to stream logs to Cloud Pub/Sub is available to you at no additional charge; standard charges for using Cloud Pub/Sub and Cloud Dataflow still apply. For more information, visit the Cloud Logging documentation page and share your feedback.
-Posted by Deepak Tiwari, Product Manager
Streak’s Top 6 Tips for App Engine
Tuesday, April 28, 2015
When Streak — CRM in your inbox — launched in March 2012, our user base grew 30% every week for four consecutive months. Today, Streak supports millions of users with only 1.5 back-end engineers. We chose Google App Engine to power our application because it enabled our team to build features fast and scaled with user growth. Plus, we didn’t have to worry about infrastructure.
Streak’s data growth
Here are six tips we’ve learned building on App Engine. If you’d like even more detail – including an overview of our app’s architecture and 15 minutes of Q&A – you can check out my webinar.
1. Keep user-facing GET requests fast
This tip isn’t specific to App Engine; it really applies to most web applications. User-facing GET requests should be quick. App Engine has a 60 second timeout on all requests, but frankly, if the total latency after a user interaction is longer than 200ms, users will perceive your app as slow. To keep requests fast, do your heavyweight processing – such as calculations or complex queries – either in the background or at write time. That way, when the user requests data (read time), it’s already precalculated and ready to go.
2. Take advantage of Managed VMs
So, what are Managed VMs? Managed VMs are a new hosting environment for App Engine, enabling you to take advantage of beefier compute resources and run your own custom runtimes. For example, we host our back-end data processing modules on n1-standard-1 machines (1 CPU and 3.75 GB memory), rather than App Engine frontend instances. This provides better performance and cost savings, thanks to sustained use discounts. Yes, Managed VMs take a little longer to boot up than an App Engine frontend instance, but they’re perfect for our background processing needs.
3. Denormalize for faster reads
Google Cloud Datastore is a NoSQL database, so if you’re coming from the RDBMS world it requires a different approach to data modeling. You have to be comfortable denormalizing and duplicating data, since SQL joins don’t exist. Data duplication might feel uncomfortable, but it makes your reads very fast.
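Here is a minimal sketch of what that denormalization can look like, assuming a hypothetical Deal/Customer schema and using the google-cloud-datastore Python client (Streak’s actual stack is Java on App Engine, so this is illustrative only): the customer’s name is copied onto each deal at write time so list views never need a join or a second lookup.

```python
# Hypothetical sketch of denormalized writes with google-cloud-datastore.
from google.cloud import datastore

client = datastore.Client()

def create_deal(deal_id, customer_key, title):
    customer = client.get(customer_key)               # one extra read at write time
    deal = datastore.Entity(client.key("Deal", deal_id))
    deal.update({
        "title": title,
        "customer": customer_key,                      # reference, kept for updates
        "customerName": customer["name"],              # duplicated for fast reads
    })
    client.put(deal)

def list_deals_for_display():
    # Reads stay fast: every field the UI needs lives on the Deal entity itself.
    return [(d["title"], d["customerName"])
            for d in client.query(kind="Deal").fetch(limit=20)]
```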
4. Break your application into modules
Modules make it easy to break your App Engine app into different components. For example, you could have a module for user-facing traffic and one for background processing. Each module has its own yaml file, so you can set parameters such as instance size, version number, runtime language, and more. As mentioned above, our backend modules take advantage of Managed VMs for performance and cost benefits, while our frontend module uses App Engine frontend instances that scale more quickly. The documentation discusses best practices on how you should structure your app.
5. Deploy aggressively and use traffic splitting
At Streak, we do continuous deployments because versioning, deployment, and rollout are easy with App Engine. In fact, sometimes we deploy up to 20 times per day to get changes into the hands of customers. We aggressively deploy to many production versions of our app and then selectively turn on new features for our users. As we slowly ramp up traffic to these new versions via traffic splitting, we catch issues early and often. These are usually easy to deal with, because each new deploy contains a small set of functionality, so it’s easy to find the relevant issues in the code base. We also use Google Cloud Monitoring and our own homegrown system (based on #6 below) to monitor these deploys for changes.
6. Use BigQuery to analyze your log files
Application and request logs can give you valuable insights into performance and help you make product improvements. If you’re just starting out, the log viewer’s list of recent requests will be just fine, but once you’ve reached scale you’ll want to analyze aggregate data or a specific user’s requests. We built custom code to export our logs to Google BigQuery, but you can now stream your logs directly from the Developers Console. With these insights, my team can build a better user experience.
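For illustration only, here is a hedged sketch of the kind of aggregate analysis you might run once request logs land in BigQuery, using the google-cloud-bigquery Python client. The dataset, table, and column names are assumptions, not Streak’s actual schema.

```python
# Hypothetical sketch: per-path request counts and average latency over the last day.
from google.cloud import bigquery

client = bigquery.Client()
query = """
    SELECT resource AS path,
           COUNT(*) AS requests,
           AVG(latency) AS avg_latency_s
    FROM `my-project.logs.appengine_request_log`   -- assumed table of exported logs
    WHERE ts > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
    GROUP BY path
    ORDER BY avg_latency_s DESC
    LIMIT 10
"""
for row in client.query(query):   # iterating the job waits for and yields results
    print(row.path, row.requests, round(row.avg_latency_s, 3))
```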
Watch the webinar
App Engine has been critical to our success. As our application has scaled, so has App Engine, and we’ve been able to focus on building features for our customers rather than ops. To learn more tips about App Engine – including an overview of our architecture and 15 minutes of Q&A – watch the full webinar.
-Posted by Aleem Mawani, CEO and co-founder, Streak
Understanding Cloud Pricing Part 2 - Local SSD
Tuesday, April 21, 2015
Understanding is good, but trying it for yourself is better. Today, we’ve made that a bit easier.
Google would like you to experience the performance of Local SSD firsthand. To make it even easier to try out this feature, we're giving our customers a discounted trial. For the next month (April 21, 2015 to May 21, 2015), Local SSD will be priced at $0.055/GB/month, a 75% discount. After that time, the price will return to its normal $0.218/GB/month. The analysis below is built on our long-term pricing, so during the promotion this month, you'll see 75% savings on these numbers. There’s never been a better time to “kick the tires” on Local SSD, so don’t wait.
Since publishing our Understanding Cloud Pricing blog post, one of the most frequent follow-up requests (please keep your great questions and ideas coming!) has been a closer look at storage costs and performance, especially in areas where our products work a little differently from other cloud services.
Solid State Disk (SSD) is an incredible technology, but it’s really important to realize that the wide variety of devices, configurations, connectors, and drivers can create order-of-magnitude or larger differences in performance. Not all Solid State Disks are created equal.
Additionally, different cloud providers deliver SSD in very different packages. Once again, rather than just reciting the stats and leaving the real-world analysis to you, we're going to provide a clear example of a system that uses local SSD, and analyze the difference between running it on Google Cloud Platform and AWS.
Let’s imagine that we’re going to deploy a NoSQL key-value store backend for a web-scale application, similar to what we used in our first example. We’ll use conservative best practices and deploy a three-node cluster, hosting data on local SSD for maximum performance.
On Google Compute Engine, we’ll use n1-highmem-8 instances with four attached local SSD volumes, which is almost identical in CPU, RAM, and SSD storage volume to the AWS i2.2xlarge instance. We’ll be set up to deliver at least 75,000 IOPS. Blazing fast queries, here we come!
Please note that we completed these calculations on April 3, 2015, and have included the output prices in this post. Any discrepancies are likely due to pricing or calculator changes following the publishing of this post.
Here's the output of the pricing calculators:
Google Cloud Platform estimate: $1883.04 monthly
Amazon Web Services estimate: $3744.18 monthly
You’ll notice that Google Cloud Platform comes in quite a bit cheaper. Some of that’s due to our automatic Sustained Use Discounts, but even without those, we’re still 39% less expensive. Here are all the details, by the numbers:
i2.2xlarge advantages:
17% more memory
7% more SSD space
n1-highmem-8 with 4 attached SSD partitions advantages:
39% less expensive
807% more read IOPS
380% more write IOPS
Did you catch that? 807% more read IOPS! Over nine times the read performance, at nearly half the cost, is not a small difference.
So what impact does this have for our NoSQL workload? Assuming a read-bound workload growing over time (many are, like reporting and analytics systems), as read capacity on the SSD in our instances gets exhausted, we’ll need to scale out our cluster by adding additional nodes. Let’s imagine read traffic multiplies by six (product success is a good problem to have).
Here's the output of the pricing calculators:
Google Cloud Platform estimate: $1883.04 monthly (yup, exactly the same as above)
Amazon Web Services estimate: $22465.08 monthly
In order to equal the read throughput of our SSD on AWS, you’d need to step up to the next larger instance size (i2.4xlarge) and run three times as many of them. The extra read performance that Google Cloud Platform SSD provides means you not only keep the same simple three-node system (saving you *real money* in admin/ops costs), but you keep the same low price. If you have a write-bound workload, you’d enjoy a similar advantage in picking Google; we offer nearly 4x the write performance, so you’d need to bump up your AWS configuration similarly to keep pace.
What if you’re trying to get started smaller than where we started? Not every app needs 680k IOPS! This is one of the most important differences between Google Cloud Platform’s SSD implementation and the AWS instances: you can add SSD to standard, highmem, and highcpu instances in 375GB increments. This means that you can start on highly efficient SSD and scale more linearly. It’s important to note that AWS does include some small single-copy SSD on instances for use as an efficient scratch disk; these aren’t designed for heavy data usage, and AWS does not provide a documented performance specification.
Because SSD is available on all of our primary instances, you can easily configure a much smaller instance type and still keep the power of local SSD. Let’s go down to the smallest three-node configurations we can get on each provider that still give us access to full performance SSD. For us, that’d be n1-standard-1 instances with 1x375GB local SSD, for AWS that’d be i2.xlarge instances with 1x800GB local SSD.
Here’s the output of the pricing calculators:
Google Cloud Platform estimate: $341.90 monthly
Amazon Web Services estimate: $1873.20 monthly
That’s a huge discrepancy.
On Google, this system is so cost efficient that you can run it for 3 weeks and stay within our free trial, with room for lots more experimentation!
Comparing Prices for SSD specifically
With local SSD, it’s been a bit of a challenge to compare prices between clouds directly, because AWS bundles compute and local storage into a single SKU, whereas Google Compute Engine decouples them, giving customers more freedom to rightsize and optimize their deployments.
However, using publicly published AWS documentation, it’s possible to derive a price for EC2’s local SSD by comparing configurations and prices of similar instance types that differ only in price and amount of SSD. All configuration information comes from the EC2 instance type web page, and all pricing information comes from the EC2 instance pricing page. In all cases, we use the on-demand prices in Northern Virginia.
The methodology is basically to compare r3 (memory optimized) and i2 (storage optimized) instance types. By grouping them together in pairs that have the same amount of CPU and memory but different amounts of SSD and different prices, and dividing the difference in price by the difference in SSD capacity, you can derive the per-GB local SSD price that AWS charges its customers. Each of the four r3/i2 pair comparisons yields a local SSD price of $0.0007/GB/hour.
By comparison, we sell Local SSD in 375GB chunks for $0.218/GB/month. Normalizing that to hourly pricing, we get $0.0003/GB/hour. So there’s the bottom line: we charge 57% less for local SSD that’s at least 4.8x faster than AWS.
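As a sanity check on that arithmetic, here is a small Python sketch of the methodology: pair instances with identical CPU and RAM, divide the price delta by the SSD delta, and normalize Local SSD’s monthly price to an hourly rate. The AWS figure used below is simply the post’s derived $0.0007/GB/hour; actual r3/i2 prices should come from the AWS pricing page.

```python
# Back-of-the-envelope check of the derivation above.
HOURS_PER_MONTH = 730   # average hours per month, used to normalize prices

def implied_ssd_price_per_gb_hour(i2_hourly, r3_hourly, i2_ssd_gb, r3_ssd_gb):
    """Per-GB/hour SSD price implied by an r3/i2 pair with identical CPU and RAM."""
    return (i2_hourly - r3_hourly) / (i2_ssd_gb - r3_ssd_gb)

# The post's result for each of the four r3/i2 pairs: roughly $0.0007/GB/hour.
aws_gb_hour = 0.0007
# GCE Local SSD: $0.218/GB/month, normalized to an hourly rate (~$0.0003/GB/hour).
gce_gb_hour = 0.218 / HOURS_PER_MONTH

print(f"GCE ~${gce_gb_hour:.4f}/GB/hr vs AWS ~${aws_gb_hour:.4f}/GB/hr")
print(f"GCE is {1 - gce_gb_hour / aws_gb_hour:.0%} less expensive per GB-hour")  # ~57%
```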
We think pricing is a critical consideration as you try to make the best decision you can about infrastructure systems design. I’d love to hear your thoughts and what matters to you in cloud pricing. What areas are confusing, hard to analyze, or hard to predict? What ideas do you have?
Reach out to us on Stack Overflow if there’s anything we can do to add more value.
-Posted by Miles Ward, Global Head of Solutions, Google Cloud Platform
Take your big data to new places with Google BigQuery
Friday, April 17, 2015
Yesterday, we announced that Google Cloud Platform big data services are taking a big step forward by allowing everyone to use big data the cloud way. Google BigQuery has many new features and is now available in European zones. These improvements were designed to extend BigQuery’s performance and capabilities, giving you greater peace of mind and control over your data.
European Data Location Control
You now have the option to store your BigQuery data in European locations while continuing to benefit from a fully managed service, with geographic data control and without low-level cluster maintenance headaches. Feel free to contact the Google Cloud Platform technical support team for details on how to set this up.
Streaming Inserts
One of BigQuery's most popular features is the ability to stream data into the service for real-time analysis. To allow such low-latency analysis on very high-volume streams, we've increased the default insert-rate limit from 10,000 rows per second, per table, to 100,000 rows per second, per table. In addition, the row-size limit has increased from 20 KB to 1 MB, and pricing will move from a per-row model to a per-byte model for better flexibility and scale.
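For a feel of what streaming ingestion looks like from code, here is a minimal sketch using the google-cloud-bigquery Python client (which postdates this post); the table ID and row fields are hypothetical. Streamed rows become queryable within seconds.

```python
# Hypothetical sketch: stream a few rows into a BigQuery table.
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.analytics.events"   # hypothetical table

rows = [
    {"event": "page_view", "user_id": "u-123", "ts": "2015-04-17T12:00:00Z"},
    {"event": "purchase",  "user_id": "u-456", "ts": "2015-04-17T12:00:01Z"},
]

errors = client.insert_rows_json(table_id, rows)   # returns [] on success
if errors:
    print("Insert errors:", errors)
```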
Security Features
BigQuery can now tackle a wider range of enterprise applications with the addition of data expiration controls and row-level permissions. Row-level permissions eliminate the need to create different views for different users, allowing secure shared access to systems such as finance or HR. This ensures that you get the information that’s relevant to you. In addition, data in BigQuery will be encrypted at rest.
Google Cloud Platform Logging Integration
Google Cloud Logging provides a powerful set of tools for managing your operations and understanding the systems powering your business; now, it also lets your Google App Engine and Google Compute Engine applications stream their logs into BigQuery. This allows you to perform real-time analysis on your log data and gain insight into how your system is performing and how your users are behaving. By joining application logs with your marketing and partnership data, you can rapidly evaluate the effectiveness of your outreach, or apply context from user profile information to your application logs to quickly assess what behavior resulted from specific customer interactions. This provides easy and immediate value to both system administrators and business analysts.
Frequently requested features
Additionally, we’ve implemented a number of new features you’ve been asking for. You can now:
load content from Google Cloud Datastore
nest query results
leverage FULL and RIGHT OUTER JOINS
roll up your aggregations to include subtotals
recover data from recently deleted tables
take advantage of a number of improvements to the web UI
For a full list of features, take a look at the release notes.
Unprecedented scale
BigQuery continues to provide exceptional scale and performance without requiring you to deploy, augment or update your own clusters. Instead, you can focus on getting meaningful insights from massive amounts of data. For example:
BigQuery absorbs real-time streams of customer data totaling more than 100 TB per day, which you can query immediately. All this data is in addition to the hundreds of terabytes loaded daily from other sources. If you have fast-moving, large-scale applications such as IoT, you can now make quick, accurate decisions against in-flight applications.
We have customers currently running queries that scan multiple petabytes of data or tens of trillions of rows using a simple SQL query, without ever having to worry about system provisioning, maintenance, fault-tolerance or performance tuning.
With BigQuery’s new features, you can analyze even more data and access it faster than before, in brand new ways. To get started, learn more about BigQuery, read the documentation, and try it out for yourself.
-Posted by Andrew Kowal, Product Manager
Big data is easier than ever with Google Cloud Dataflow
Thursday, April 16, 2015
Big data applications can provide extremely valuable insights, but extracting that value often demands high overhead: significant deployment, tuning, and operational effort across diverse systems and programming models. As a result, work other than the actual programming and data analysis dominates the time needed to build and maintain a big data application. The industry has come to accept these pains and inefficiencies as an unavoidable cost of doing business. We believe you deserve better.
In Google’s systems infrastructure team, we’ve been tackling challenging big data problems for more than a decade and are well aware of the difference that simple yet powerful data processing tools make. We have translated our experience from MapReduce, FlumeJava, and MillWheel into a single product, Google Cloud Dataflow. It’s designed to reduce operational overhead and make programming and data analysis your only job, whether you’re a data scientist, data analyst, or data-centric software developer. Along with other Google Cloud Platform big data services, Cloud Dataflow embodies the kind of highly productive and fully managed services designed to use big data, the cloud way.
Today we’re pleased to make Google Cloud Dataflow available in beta, for use by anyone on Google Cloud Platform. With Cloud Dataflow, you can:
Merge your batch and stream processing pipelines thanks to a unified and convenient programming model. The model and the underlying managed service let you easily express data processing pipelines, make powerful decisions, obtain insights, and eliminate the switching cost between batch and continuous stream processing.
Finely tune the desired correctness model for your data processing needs through powerful API primitives for handling late-arriving data. You can process data based on event time as well as clock time, and gracefully deal with upstream data latency when processing data from unbounded sources (a small windowing sketch follows this list).
Leverage a fully managed service, complete with dynamically adaptive auto-scaling and auto-tuning, that offers attractive performance out of the box. Whether you’re a developer or a systems operator, you no longer need to invest time worrying about resource provisioning or attempting to optimize resource usage. Automation, a fully managed service, and the programming model work together to significantly lower both CAPEX and OPEX.
Enjoy reduced complexity when managing and debugging highly parallelized processes, with a simplified monitoring interface that’s logically mapped to your processing logic rather than to how your code is mapped to the underlying execution plane.
Benefit from integrated processing of data across Google Cloud Platform, with optimized support for services such as Google Cloud Storage, Google Cloud Datastore, Google Cloud Pub/Sub, and Google BigQuery.
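Below is a small, hypothetical sketch of the event-time windowing idea mentioned in the list above, written with the open-source Apache Beam Python SDK, which implements the same model Cloud Dataflow runs (the original Cloud Dataflow SDK was Java). The topic name and the epoch_seconds/user_id fields are assumptions.

```python
# Hypothetical sketch: group events into one-minute windows by their embedded
# event timestamp (event time), not by when they arrive (clock time).
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms import window

def with_event_time(raw):
    """Parse a JSON event and stamp it with its own event timestamp."""
    event = json.loads(raw)
    return window.TimestampedValue(event, event["epoch_seconds"])

options = PipelineOptions(streaming=True)
with beam.Pipeline(options=options) as pipeline:
    (pipeline
     | beam.io.ReadFromPubSub(topic="projects/my-project/topics/events")
     | beam.Map(with_event_time)                   # switch to event time
     | beam.WindowInto(window.FixedWindows(60))    # one-minute event-time windows
     | beam.Map(lambda e: (e["user_id"], 1))
     | beam.CombinePerKey(sum)                     # events per user, per window
     | beam.Map(print))
```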
We’re also working with major open source contributors on maturing the Cloud Dataflow ecosystem. For example, we recently announced collaborations with Data Artisans for runtime support for Apache Flink, and with Cloudera for runtime support for Apache Spark.
We’d like to thank our alpha users for their numerous suggestions, reports, and support along this journey. Their input has certainly made Cloud Dataflow a better product. Now, during beta, everyone can use Cloud Dataflow, and we continue to welcome questions and feedback on Stack Overflow. We hope that you’ll give Cloud Dataflow a try and enjoy big data made easy.
-Posted by Grzegorz Czajkowski, Director of Engineering
Big data, the cloud way
Thursday, April 16, 2015
The promise of big data is faster and better insight into your business. Yet it often turns into an infrastructure project. Why? For example, you might be collecting a deluge of information and then correlating, enriching and attempting to extract real-time insights. Should you expect such feats, by their very nature, to involve a large amount of resource management and system administration? You shouldn’t. Not in the cloud. Not if you’re using big data the cloud way.
Big data the cloud way means being more productive when building applications, with faster and better insights, without having to worry about the underlying infrastructure. More specifically, it includes:
NoOps: Your cloud provider should worry about deploying, managing and upgrading infrastructure to make it scalable and reliable. “NoOps” means the platform handles such tasks and optimizations for you, freeing you up to focus on understanding and exploiting the value in your data.
Cost effectiveness: In addition to increased ease of use and agility, a “NoOps” solution provides clear cost benefits by removing operations work. But the cost benefits of big data the cloud way go even further: the platform auto-scales and optimizes your infrastructure consumption, and eliminates unused resources like idle clusters. You manage your costs by dialing up or down the number of queries and the latency of your processing based on your cost/benefit analysis. You should never have to re-architect your system to adjust your costs.
Safe and easy collaboration: You can share datasets from files in Google Cloud Storage or tables in Google BigQuery with collaborators inside or outside of your organization, without the need to make copies or grant database access. There’s one version of the data – which you control – and authorized users can access it (at no cost to you) without affecting the performance of your jobs.
Google has been blazing the big data trail for the rest of the industry, so when you use Google Cloud Platform, big data the cloud way also means:
Cutting-edge features: Google Cloud Dataflow provides reliable, event-time-based stream processing, available by default with no extra work. But making stream processing easy and reliable doesn’t mean removing the option of running in batch. The same pipeline can execute in batch mode, which you can use to lower costs or analyze historical data. Now, consistently processing streaming data at large scale doesn’t have to be a complex and brittle endeavor reserved for the most critical scenarios.
Google Cloud Platform delivers these characteristics by making data analysis quick, affordable and easy. Today, at the Hadoop Summit in Brussels, we announced that our big data services are taking a big step forward – allowing everyone to use big data the cloud way.
Google Cloud Dataflow now available in beta
Today, nothing stands between you and the satisfaction of seeing your processing logic, applied in your choice of streaming or batch mode, executed via a fully managed processing service. Just write a program, submit it, and Cloud Dataflow will do the rest. No clusters to manage – Cloud Dataflow will start the needed resources, autoscale them (within the bounds you choose), and terminate them as soon as the work is done. You can get started right now.
Google BigQuery has many new features and is now available in European zones
BigQuery, the quintessential cloud-native, API-driven service for SQL analytics, has new security and performance features. For example, the introduction of row-level permissions makes data sharing even easier and more flexible. With its ease of ingestion (we’ve raised the default ingestion limit to 100,000 rows per second per table), virtually unlimited storage, and fantastic query performance even for huge datasets, BigQuery is the ideal platform for storing, analyzing and sharing structured data. It also supports repeated records and querying inside JSON objects for loosely structured data. In addition, starting today, BigQuery offers the option to store your data in Google Cloud Platform European zones. You can contact Google technical support today to use this option.
A comprehensive set of big data services
Google Cloud Pub/Sub is designed to provide scalable, reliable and fast event delivery as a fully managed service. Along with BigQuery streaming ingestion and Cloud Dataflow stream processing, it completes the platform’s end-to-end support for low-latency data processing. Whether you’re processing customer actions, application logs or IoT events, Google Cloud Platform allows you to handle them in real time, the cloud way. Leave Google Cloud Platform in charge of all the scaling and administration tasks so you can focus on what needs to happen, not how.
Using big data the cloud way doesn’t mean that Hadoop, Spark, Flink and other open source tools originally created for on-premises use can’t be used in the cloud. We’ve ensured that you can benefit from the richness of the open source big data ecosystem via native connectors to Google Cloud Storage and BigQuery, along with automated Hadoop/Spark cluster deployment.
Google BigQuery customer zulily joined us recently for a big data webinar to share their experience using big data the cloud way, and how it helped them increase revenue and overall business visibility while decreasing their operating costs. If you’re interested in exploring these types of benefits for your own company, you can easily get started today by running your first query on a public dataset or uploading your own data.
Here’s a simplified illustration of how Google Cloud Platform data processing services relate to each other and support all stages of the data lifecycle:
Scuba equipment helps humans operate under water, but divers still fall hopelessly short of the efficiency and agility of marine creatures. When it comes to big data in the cloud, be a dolphin, not a scuba diver. Google Cloud Platform offers a set of powerful, scalable, easy-to-use and efficient big data services built for the cloud. Embrace big data, the cloud way, by taking advantage of them today.
Learn more about Google Cloud Platform’s big data solutions, or get started with Dataflow and BigQuery today. We can’t wait to see what you achieve when you use big data the cloud way.
-Posted by William Vambenepe, Product Manager
Media Management with Google Cloud Platform - Live from NAB in Las Vegas
Tuesday, April 14, 2015
A feature-length animated movie takes up to 100 million compute hours to render. 100 million.
When you hear the two words “Google” and “media,” what pops into your mind? YouTube, right? Well, as I’m excited to explain, media means much more than “YouTube” at Google. The media and entertainment industry is a key area of focus for Google Cloud Platform. As I’ll be sharing in my keynote address for the cloud conference at the 97,000-attendee NAB Show in Las Vegas on Tuesday, we’re rapidly expanding our platform and our partner ecosystem to uniquely solve media-specific challenges. In addition to my keynote, Patrick McGregor and Todd Prives from my team are participating in panel sessions on cloud security and cloud rendering. And as part of the recent Virtual NAB conference, Jeff Kember and Miles Ward from Google shared their insights.
We’re witnessing massive changes in the ways media companies are creating, transforming, archiving and delivering content, using the power of the cloud.
We recognize that Google Cloud Platform best supports the media industry when we deliver capabilities that are tailored to specific workflow patterns. Great examples of these capabilities are our services for visual effects rendering. Aside from the skilled work that an artist puts into modeling, animating and compositing a realistic scene, the compute demands required to produce these images are often staggering. Even a relatively simple visual effects shot or animation can take several hours to render the 24 individual frames that make up one second of video.
Google Cloud Platform can greatly accelerate and simplify rendering while charging only for the processor cycles and bits that are consumed. For customers looking for an end-to-end rendering solution, we offer Google Zync Render. Launched in beta in the first quarter of 2015, Zync is a turnkey service for small and medium-sized studios. It integrates directly with existing on-premises software workflows to feel as natural and responsive as a local render farm. Also, through our collaborations with The Foundry and others, Google Cloud Platform provides tools used in the creation of some of the highest-grossing movies.
Zync Render Workflow
By using Google Cloud Platform’s cost-efficient compute and storage, studios can seamlessly extend their rendering pipelines to handle burst capacity needs and remove the bottlenecks typically associated with production deadlines. We’re already seeing great successes from media customers like Framestore, RodeoFX, iStreamPlanet, Panda, and Industriromantik.
We’ve also built compelling general platform capabilities that help media companies with all stages of workflow and the content lifecycle. One example is Google Cloud Storage Nearline, a service that allows a virtually unlimited volume of data to be stored at very low cost, with retrieval times on the order of seconds – not hours, as you would experience with tape. This is ideal for media content archiving. We also recently launched 32-core VM instances for compute-intensive workloads that crunch large volumes of content. And yesterday, we announced a collaboration with Avere Systems that enables us to bridge cloud storage and on-premises storage without impacting performance. This opens huge opportunities for creative collaboration and content production.
Please join us this week at NAB; we hope to see you in Las Vegas!
-Posted by Brian Stevens, VP for Google Cloud Platform
Google’s network edge: presence, connectivity and choice for today’s enterprise
Monday, April 13, 2015
Today’s enterprise must run key workloads both on-premises and remotely in the cloud. There is simultaneously the need to keep quality of service high for end users in terms of network latency and reliability, and the need to ensure efficiency and security for your company’s hybrid workloads – particularly workloads that are bandwidth-intensive or latency-sensitive. Raw performance, reliability and security have been major focus areas for Google from the start, and our goal with Google Cloud Platform is to share the benefits of continuous networking innovation with our customers.
We have four announcements today in support of two major technical goals. The first is to use Google’s global network footprint – over 70 points of presence across 33 countries – to serve users close to where they are, ensuring the same low latency and responsiveness customers can expect from Google’s own services. The second goal relates to enabling enterprises to run mission-critical workloads by connecting their on-premises infrastructure to Google’s network with enterprise-grade encryption.
Today we're announcing:
General availability of Google Cloud DNS
Expansion of Google Cloud Load Balancing solutions to 12 additional points of presence globally (Los Angeles, San Francisco, Chicago, Seattle, Dallas, Miami, London, Paris, Stockholm, Munich, Madrid, Lisbon)
Beta of Cloud VPN
11 additional Google Cloud Interconnect service providers
Managed DNS
With Cloud DNS – our high-performance, managed DNS solution for user-facing applications and services – you can host millions of zones and records and handle SLA-backed name-serving queries. For customers with more than 10,000 zones, our new pricing tier lowers the cost of ownership for large organizations operating DNS infrastructure at scale.
Global Load Balancing
Today’s connected user is accustomed to fast and responsive application services, be they web services accessed from a browser or apps on a mobile device. Latency (“lag”) is noticeable immediately, especially as users switch from a fast, optimized service to a slow one. With the expansion of Google’s load balancing solution to 12 additional locations, your workloads running on Google Cloud Platform are closer in proximity to your users who are making service requests from all over the globe.
Additional Carrier Interconnect service providers and VPN Beta
We continue to build on our goal of enabling enterprises to connect their on-premises infrastructure to Google’s network over encrypted channels to run data-intensive, latency-sensitive workloads. In addition to announcing the beta of Cloud VPN, we’re pleased to introduce 11 additional Carrier Interconnect service providers. Our growing list of technology partners extends our reach to customer locations globally while providing tailored connectivity and choice.
iStreamPlanet is one customer that has taken advantage of our infrastructure breadth to make high-quality connections into the Google network. iStreamPlanet recently launched Aventus, its SaaS-based product that enables content owners to serve high-quality live video with simplicity to viewers across devices. Running on Google Cloud Platform, iStreamPlanet is able to create live video events for its customers in minutes rather than days, and has lowered bandwidth costs by more than 40 percent using Google Cloud Platform’s Direct Peering offering.
We’d also like to welcome CloudFlare as a Google Cloud Platform Technology Partner. CloudFlare provides website speed optimization, security and DDoS protection, as well as caching solutions over its globally distributed network. With nearly no setup required, CloudFlare reports speed optimizations that result in content loading twice as fast on average for visitors.
Google’s network, built out over the past 15 years, is a key enabler behind the services relied upon every day by our customers and users – from Search to Maps, YouTube to Cloud Platform. We invite you to contact us to explore how we can make Google’s network an extension of your own, or to discuss your specific needs around serving your users wherever they may be globally. You can read more about Google Cloud Networking.
-Posted by Morgan Dollard, Cloud Networking Product Management Lead
Panda achieves greater video quality using motion compensation for frame rate conversion
Thursday, April 9, 2015
Today’s guest post comes from Ed Byrne, Director at Panda – a cloud-based video transcoding platform. To learn more about how Panda uses Google Cloud Platform, watch their case study video.
Panda makes it easy for video producers to encode their video in multiple formats for different mobile device screen sizes. But delivering blazing fast, high-quality videos to customers is no easy task – especially when your engineers are also dealing with infrastructure. Google Cloud Platform features like Live Migration and Autoscaler have allowed us to cut our infrastructure maintenance load to only half of one developer’s time.
With more resources to direct at innovation, we can put our focus on our customers, making their experience better with new and improved features in Panda. In fact, since relying on Google Cloud Platform for underlying infrastructure, we’ve developed our frame rate conversion by motion compensation technology. Our customers love the video quality they get using this feature, and we’re so excited about it that we agreed to give you the lowdown on how it works.
Introduction to motion compensation
Motion compensation is a technique that was originally used for video compression, and now it’s used in virtually every video codec. Its inventors noticed that adjacent frames usually don’t differ too much (except for scene changes), and then used that fact to develop a better encoding scheme than compressing each frame separately. In short, motion-compensation-powered compression tries to detect movement that happens between frames and then use that information for more efficient encoding. Imagine two frames:
Panda on the left...
aaaand on the right
Now, a motion compensating algorithm would detect the fact that it’s the same panda in both frames, just in different locations:
First stage of motion compensation: motion detection
We’re still thinking about compression, so why would we want to store the same panda twice? Yep, that’s what motion-compensation-powered compression does – it stores the moving panda just once (usually, it would store the whole frame #1), but it adds information about movement. Then the decompressor uses this information to construct remaining information (frame #2 based on frame #1).
That’s the general idea, but in practice it’s not as smooth and easy as in the example. The objects are rarely the same, and usually some distortions and non-linear transformations creep in. Scanning for movements is very expensive computationally, so we have to limit the search space and optimize the code, even resorting to hand-written assembly.
Frame rate conversion by motion compensation
Motion compensation can be used for frame rate conversion too, often with really impressive results.
For illustration, let’s go back to the moving panda example. Let’s assume we want to change the frame rate from two frames per second (FPS) to three FPS. In order to maintain the video speed, each frame will be on screen for a shorter amount of time (.5 sec vs .33 sec).
One way to increase the number of frames is to duplicate a frame, resulting in three FPS, but the quality will suffer. As you can see, frame #1 has been duplicated:
Converting from 2 FPS to 3 FPS by duplicating frames
Yes, the output has three frames and the input has two, but the effect isn’t visually appealing. We need a bit of magic to create a frame that humans would see as naturally fitting between the two initial frames – panda has to be in the middle. That’s a task motion compensation could deal with – detect the motion, but instead of using it for compression, create a new frame based on the gathered information. Here’s how it should work:
Converting from 2 FPS to 3 FPS by motion compensation: Panda's in the middle!
Notice that by creating a new frame, we keep our panda hero at the center.
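To make the idea concrete, here is a deliberately simplified sketch of motion-compensated interpolation using exhaustive block matching with NumPy. This is not Panda’s production code: a real interpolator would also fill the holes left by moved blocks, blend both source frames, and use the hand-optimized search mentioned earlier.

```python
# Simplified sketch: build a frame halfway between two grayscale frames by
# block matching (sum of absolute differences) and shifting blocks half-way
# along their detected motion vectors.
import numpy as np

def interpolate_frame(frame1, frame2, block=16, search=8):
    """Synthesize a frame halfway between two grayscale frames (H x W arrays)."""
    h, w = frame1.shape
    mid = np.zeros((h, w), dtype=np.float64)
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            ref = frame1[y:y + block, x:x + block].astype(np.float64)
            best_cost, best_dy, best_dx = np.inf, 0, 0
            # Exhaustively search a small window in frame2 for the best match.
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy <= h - block and 0 <= xx <= w - block:
                        cand = frame2[yy:yy + block, xx:xx + block].astype(np.float64)
                        cost = np.abs(ref - cand).sum()   # sum of absolute differences
                        if cost < best_cost:
                            best_cost, best_dy, best_dx = cost, dy, dx
            # Paste the block halfway along its motion vector (clamped to the frame).
            my = min(max(y + best_dy // 2, 0), h - block)
            mx = min(max(x + best_dx // 2, 0), w - block)
            mid[my:my + block, mx:mx + block] = ref
    return mid.astype(frame1.dtype)
```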
Now for video examples, taken straight from a Panda encoder. Here’s what frame duplication (the bad guy) looks like in action (for better illustration, after converting FPS, we slowed down the video):
While the video on the left is very smooth, the frame duplicated version on the right is jittery. Not great. Now, what happens when we use motion compensation (the good guy):
The movement’s smooth, and outside of slight noise, we don’t catch a glimpse of any video artifacts.
There are other types of footage that fool the algorithm more easily. Motion compensation assumes simple, linear movement, so other kinds of image transformations can produce heavier artifacts that may or may not be acceptable, depending on the use case. Occlusions, refractions – you see these in water bubbles – and very quick movements, where too much happens between frames, are the most common examples of image transformations that can produce lower visual quality. Here’s a video full of occlusions and water:
Now let’s slow it down and see frame duplication and motion compensation side-by-side.
Motion compensation produces clear artifacts (those fake electric discharges), but still maintains higher visual quality than frame duplication.
The unanimous verdict of a short survey we shared in our office: motion compensation produces much better imaging than frame duplication.
Google Cloud Platform products like Google Compute Engine have allowed us to improve encoding performance by 30%, as well as shift our energy from focusing on underlying infrastructure to innovating for our customers. We’ve also been able to take advantage of sustained use discounts, which have helped lower our infrastructure costs without the need to sign contracts or reserve capacity. Google’s network performance is also a huge asset for us, given that video files are so large and we need to move them frequently. To learn more about how we’re using Cloud Platform, watch our video.
Panda’s excited to be at this year’s NAB Show, one of the world’s largest gatherings of technologists and digital content providers. They’ll be in the StudioXperience area with Filepicker in the South Upper Hall, SU621.