Last week on CRE life lessons, we discussed how to come up with a precise numerical target for system availability. We term this target the Service Level Objective (SLO) of our system. Any discussion we have in future about whether the system is running sufficiently reliably and what design or architectural changes we should make to it must be framed in terms of our system continuing to meet this SLO.



We also have a direct measurement of SLO conformance: the frequency of successful probes of our system. This is a Service Level Indicator (SLI). When we evaluate whether our system has been running within SLO for the past week, we look at the SLI to get the service availability percentage. If it goes below the specified SLO, we have a problem and may need to make the system more available in some way, such as running a second instance of the service in a different city and load balancing between the two.
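
To make this concrete, here's a minimal sketch in Node.js of checking a probe-based SLI against an SLO target. The probe counts and the 99.9% objective below are hypothetical; in practice, the counts would come from your monitoring system.

// Sketch: compare a probe-based SLI against an SLO target.
// Counts are hypothetical; real numbers come from monitoring.
const probesSucceeded = 100423;
const probesTotal = 100515;
const sloTarget = 0.999; // 99.9% availability objective

const sli = probesSucceeded / probesTotal; // the indicator
console.log(`SLI: ${(sli * 100).toFixed(3)}%`);
console.log(sli >= sloTarget
  ? 'Within SLO.'
  : 'Out of SLO: consider reliability work, such as a second instance.');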




Why have an SLO at all?


Suppose that we decide that running our aforementioned Shakespeare service against a formally defined SLO is too rigid for our tastes; we decide to throw the SLO out of the window and make the service “as available as is reasonable.” This makes things easier, no? You simply don’t mind if the system goes down for an hour now and then. Indeed, perhaps downtime is normal during a new release and the attending stop-and-restart.



Unfortunately for you, customers don't know that. All they see is that Shakespeare searches that were previously succeeding have suddenly started to return errors. They raise a high-priority ticket with support, who confirm the elevated error rate and escalate to you. Your on-call engineer investigates, confirms this is a known issue, and responds to the customer with "this happens now and again, you don't have to escalate." Without an SLO, your team has no principled way of saying what level of downtime is acceptable; there's no way to measure whether or not this is a significant issue with the service, and you cannot terminate the escalation early with "the Shakespeare search service is currently operating within SLO." As our colleague Perry Lorier likes to say, "if you have no SLOs, toil is your job."




The SLO you run at becomes the SLO everyone expects




A common pattern is to start your system off at a low SLO, because that's easy to meet: you don't want to run a 24/7 rotation, and your initial customers are OK with a few hours of downtime, so you target at least 99% availability, which allows up to 1.68 hours of downtime per week. But in fact, your system is fairly resilient and for six months operates at 99.99% availability, down for only a few minutes per month.



But then one week, something breaks in your system and it's down for a few hours. All hell breaks loose. Customers page your on-call engineers, complaining that your system has been returning 500s for hours. These pages go unnoticed because on-call leaves their pagers on their desks overnight; after all, your SLO only specifies support during office hours.



The problem is, customers have become accustomed to your service being always available. They've started to build it into their business systems on the assumption that it's always available. When it's been continually available for six months and then goes down for a few hours, something is clearly seriously wrong. Your excessive availability has become a problem because now it's the expectation. Thus the expression, "An SLO is a target from above and below": don't make your system very reliable if you don't intend to commit to it being that reliable.



Within Google, we implement periodic downtime in some services to prevent a service from being overly available. In the SRE Book, our colleague Marc Alvidrez tells a story about our internal lock service, Chubby. Then there's the set of test front-end servers for internal services to use in testing, allowing those services to be accessible externally. These front-end servers are convenient but are explicitly not intended for use by real services; they have a one-business-day support SLA, and so can be down for 48 hours before the support team is even obligated to think about fixing them. Over time, experimental services that used those front-ends started to become critical; when we finally had a few hours of downtime on the front-ends, it caused widespread consternation.



Now we run a quarterly planned-downtime exercise with these front-ends. The front-end owners send out a warning, then block all services on the front-ends except for a small whitelist. They keep this up for several hours, or until a major problem with the blockage appears; the blockage can be quickly reversed in that case. At the end of the exercise the front-end owners receive a list of services that use the front-ends inappropriately, and work with the service owners to move them to somewhere more suitable. This downtime exercise keeps the front-end availability suitably low, and detects inappropriate dependencies in time to get them fixed.




Your SLA is not your SLO




At Google, we distinguish between a Service-Level Agreement (SLA) and a Service-Level Objective (SLO). An SLA normally involves a promise to someone using your service that its availability should meet a certain level over a certain period, and if it fails to do so then some kind of penalty will be paid. This might be a partial refund of the service subscription fee paid by customers for that period, or additional subscription time added for free. The concept is that going out of SLA is going to hurt the service team, so they'll push hard to keep it within SLA.



Because of this, and because of the principle that availability shouldn't be much better than the SLO, the SLA is normally a looser objective than the SLO. This might be expressed in availability numbers: for instance, an availability SLA of 99.9% over one month, with an internal availability SLO of 99.95%. Alternatively, the SLA might only specify a subset of the metrics that make up the SLO.



For example, with our Shakespeare search service, we might decide to provide it as an API to paying customers, where a customer pays us $10K per month for the right to send up to one million searches per day. Now that money is involved, we need to specify in the contract how available they can expect the service to be, and what happens if we breach that agreement. We might say that we'll provide the service at a minimum of 99% availability, following the definition of successful queries given previously. If the service drops below 99% availability in a month, we'll refund $2K; if it drops below 80%, we'll refund $5K.
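
As a sketch of how those contract terms might be encoded (the refund tiers mirror the example above; the function itself is ours, for illustration only):

// Hypothetical refund schedule for the Shakespeare API SLA:
// below 99% monthly availability -> $2K refund; below 80% -> $5K.
function monthlyRefundUSD(availability) {
  if (availability < 0.80) return 5000;
  if (availability < 0.99) return 2000;
  return 0;
}

console.log(monthlyRefundUSD(0.995)); // 0: within SLA
console.log(monthlyRefundUSD(0.95));  // 2000: breached the 99% tier
console.log(monthlyRefundUSD(0.75));  // 5000: breached the 80% tier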



If you have an SLA that's different from your SLO, as it almost always is, it's important for your monitoring to measure SLA compliance explicitly. You want to be able to view your system's availability over the SLA calendar period, and easily see if it appears to be in danger of going out of SLA. You'll also need a precise measurement of compliance, usually from logs analysis. Since we have an extra set of obligations (in the form of our SLA) to paying customers, we need to measure queries received from them separately from other queries (we might not mind dropping queries from non-paying users if we have to start load shedding, but we really care about any query from the paying customer that we fail to handle properly). That's another benefit of establishing an SLA: it's an unambiguous way to prioritize traffic.



When you define your SLA, you need to be extra-careful about which queries you count as legitimate. For example, suppose that you give each of three major customers (whose traffic dominates your service) a quota of one million queries per day. One of your customers releases a buggy version of their mobile client, and issues two million queries per day for two days before they revert the change. Over a 30-day period you’ve issued approximately 90 million good responses, and two million errors; that gives you a 97.8% success rate. You probably don’t want to give all your customers a refund as a result of this; two customers had all their queries succeed, and the customer for whom two million out of 32 million queries were rejected brought this upon themselves. So perhaps you should exclude all “out of quota” response codes from your SLA accounting.
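
One way to express that exclusion, sketched below with hypothetical response categories: treat "out of quota" rejections as neither successes nor failures when computing the SLA success rate.

// Sketch: SLA accounting that excludes out-of-quota rejections.
// Counts mirror the example above: 90M good responses, 2M rejections
// caused by a customer exceeding its quota.
const counts = {
  ok: 90e6,         // successful responses: count toward the SLA
  serverError: 0,   // our failures: count against the SLA
  outOfQuota: 2e6,  // customer exceeded quota: excluded entirely
};

const slaEligible = counts.ok + counts.serverError;
const successRate = counts.ok / slaEligible;
console.log(`SLA success rate: ${(successRate * 100).toFixed(1)}%`); // 100.0%
// Without the exclusion: 90e6 / 92e6 = 97.8%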



On the other hand, suppose you accidentally push an empty quota specification file to your service before going home for the evening. All customers receive a default 1000 queries per day quota. Your three top customers get served constant "out of quota" errors for 12 hours until you notice the problem when you come into work in the morning, and revert the change. You're now showing 1.5 million rejected queries out of 90 million for the month, a 98.3% success rate. This is all your fault: excluding these "out of quota" errors and reporting 100% success for the remaining 88.5 million queries would miss the point of the SLA entirely.




Conclusion




SLIs, SLOs and SLAs aren’t just useful abstractions. Without them you cannot know if your system is reliable, available, or even useful. If they don’t tie explicitly back to your business objectives then you have no idea if the choices you make are helping or hurting your business. You also can’t make honest promises to your customers.



If you’re building a system from scratch, make sure that SLIs, SLOs and SLAs are part of your system requirements. If you already have a production system but don’t have them clearly defined then that’s your highest priority work.



To summarize:


  • If you want to have a reliable service, you must first define “reliability.” In most cases that actually translates to availability.

  • If you want to know how reliable your service is, you must be able to measure the rates of successful and unsuccessful queries; these will form the basis of your SLIs.

  • The more reliable the service, the more it costs to operate. Define the lowest level of reliability that you can get away with, and state that as your Service Level Objective (SLO).

  • Without an SLO, your team and your stakeholders cannot make principled judgements about whether your service needs to be made more reliable (increasing cost and slowing development) or less reliable (allowing greater velocity of development).

  • If you’re charging your customers money you'll probably need an SLA, and it should be a little bit looser than your SLO.




As an SRE (or DevOps professional), it's your responsibility to understand how your systems serve the business in meeting those objectives, and, as much as possible, to control for risks that threaten the high-level objective. Any measure of system availability that ignores business objectives is worse than worthless, because it obscures the actual availability and leads to all sorts of dangerous scenarios, false senses of security and failure.



For those of you who wrote us thoughtful comments and questions from our last article, we hope this post has been helpful. Keep the feedback coming!



N.B. Google Cloud Next '17 is fewer than seven weeks away. Register now to join Google Cloud SVP Diane Greene, Google CEO Sundar Pichai, and other luminaries for three days of keynotes, code labs, certification programs, and over 200 technical sessions. And for the first time ever, Next '17 will have a dedicated space for attendees to interact with Google experts in Site Reliability Engineering and Developer Operations.






[Editor’s note: Today we hear from Agosto, a Google Cloud Premier Partner that has been building products and delivering services on Google Cloud Platform (GCP) since 2012, including Internet of Things applications. Read on to learn about Agosto’s work to build an MQTT service broker for Google Cloud Pub/Sub, and how you can incorporate it into your own IoT applications.]



One of our key practice areas is Internet of Things (IoT). Using the many components of GCP, we’ve helped customers rapidly move their ideas from product concept to launch.



Along the way, we evaluated several IoT platforms and repeatedly came to the conclusion that we’d be better off staying on the GCP stack than a single IoT platform with costly licensing hooks and closed-source practices. Our clients also like being able to build scalable, functional prototypes using pre-existing and standard reference architectures and tools.



One of the many challenges we faced along the way was picking an efficient transport for two-way messaging between “things” and GCP. After evaluating a number of emerging and mature protocols, we settled on Message Queuing Telemetry Transport (MQTT). Originated in 1999 by Andy Stanford-Clark and Arlen Nipper, MQTT is now an ISO standard; it's lightweight, has solid documentation and has tens of thousands of production deployments. Furthermore, many existing pre-IoT or “machine to machine” projects already use MQTT as their transport from embedded device to the back office. With MQTT, we've been able to increase velocity and reduce complexity for our IoT products and services.



MQTT is a great transport protocol, but it can be challenging to manage at scale, particularly when it comes to scaling message storage and delivery systems. As one of the earliest Google partners to develop a set of reusable tools, reference architectures and methods for accelerating IoT products to market, we’ve been impressed with Google Cloud Pub/Sub, a durable, low-latency and scalable service for handling many-to-many asynchronous messaging. But Cloud Pub/Sub uses HTTPS to transfer data. Over numerous small requests, all those HTTP headers add up to a lot of extra data: a no-go when you’re dealing with a constrained device that communicates over a mobile network, and where you pay for each byte in mobile data charges, battery usage, or both.



We needed to bridge the gap between IoT-connected devices and Cloud Pub/Sub, and began investigating ways to connect MQTT to Cloud Pub/Sub using and extending RabbitMQ.



After initial load tests showed this approach was viable, Google asked Agosto to develop an open-source, highly performant MQTT connection broker that integrates with Cloud Pub/Sub. With low network overhead (Agosto has seen up to 10x less compared to HTTPS in scenarios we've tested) and high throughput, MQTT is a natural fit for many scenarios.



The resulting message broker integrates messaging between connected devices using an MQTT client and Cloud Pub/Sub; RabbitMQ performs the protocol conversion for two-way messaging between the device and Cloud Pub/Sub. This means administrators of the RabbitMQ compute infrastructure don't have to concern themselves with managing the durability of the data, or scaling storage.
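
For a sense of what the device side looks like, here's a minimal publish using the open-source mqtt.js client. The broker host, topic and payload below are placeholders for illustration; consult the gcp-iot-adapter documentation for the actual connection details.

// Sketch: publish one telemetry reading to an MQTT broker that
// bridges to Cloud Pub/Sub. Host and topic are placeholders.
const mqtt = require('mqtt');

const client = mqtt.connect('mqtt://broker.example.com:1883');

client.on('connect', () => {
  const payload = JSON.stringify({ deviceId: 'sensor-42', tempC: 21.5 });
  client.publish('devices/sensor-42/telemetry', payload, () => client.end());
});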



Our message broker can support both small and very large GCP projects. For example, with smaller projects and IoT prototypes, you can rapidly deploy a single node of Agosto’s MQTT to Pub/Sub Connection Broker supporting up to 120,000 messages per minute for as little as $25/month for the compute costs. Larger production deployments with load-balanced brokers can support millions of concurrent connections and much higher throughput.



Download the broker, follow the instructions and learn more about leveraging MQTT and GCP for your IoT project.

GitHub: https://github.com/Agosto/gcp-iot-adapter



And if you're looking for a more customized implementation of our MQTT to Pub/Sub Connection broker, visit our website to learn more about our offerings.




Eclipse is one of the most popular IDEs for Java developers. Today, we're launching the beta version of Cloud Tools for Eclipse, a plugin that extends Eclipse to Google Cloud Platform (GCP). Based on Google Cloud SDK, the initial feature set targets App Engine standard environment, including support for creating applications, running and debugging them inside the IDE with the Eclipse Web Tools Platform tooling and deploying them to production.



You may be wondering how this plugin relates to the Google Plugin for Eclipse, which was launched in 2009. The older plugin is focused on a broader set of technologies than just GCP. Moreover, support for the Eclipse Web Tools Platform and Maven is spotty at best. Moving forward, we'll invest in building more cloud-related tooling in Cloud Tools for Eclipse.



Cloud Tools for Eclipse is available for Eclipse 4.5 (Mars) and Eclipse 4.6 (Neon) and can be installed through the Eclipse Update Manager. The plugin source code is available on GitHub, and we welcome contributions and reports of issues from the community.



First, install the Cloud Tools for Eclipse plugin. To verify that the plugin has installed correctly, launch Eclipse and look at the bottom right-hand side of the window -- you should see a Google “G” icon. Click on this icon to log in to your Google account.



Now we'll demonstrate how to create and deploy a simple Maven-based "Hello World" App Engine standard environment application. First, create a new App Engine project from Cloud Console. (If this is your first time using GCP, we recommend signing up for our Free Trial first.) When you see this card, click Create a project:



You should then land on the following cards:



Every GCP project has a unique project ID. You’ll need this string later, so let’s grab that. On the left hand nav, click on Home and copy the project ID as shown below.





Now that you have an App Engine project, you're ready to deploy a simple Hello World application. Open Eclipse and click on File > New > Project and type “Maven-based Google” in the Wizards section, then select the following:



Fill in the Maven group ID and artifact ID and click Next:



In the next page, select the Hello World template and click Finish.



Now, right-click on your project in the Project Explorer and select Run As > App Engine. You should shortly see your application running locally on localhost. In the output terminal in Eclipse, the correct URL is hyperlinked.



Once you've finished running the application locally, you can deploy it to the cloud. Right-click on your application in the Eclipse Project Explorer and select Deploy to App Engine Standard. You'll see the following dialog if you're logging in for the first time. Click on the Account drop-down and proceed with the web browser UI to link the plugin for your GCP Account.



Once signed in, enter the Project ID of the application you created in Cloud Console and leave the rest as is. This is the ID you wrote down earlier.



Click Deploy to upload the finished project to App Engine. Status updates appear in the Eclipse console as files are uploaded. When the deployment finishes, the URL of the deployed application is shown in the Eclipse console. That’s it!



You can check the status of your application in the Cloud Console by heading to the App Engine tab and clicking on Instances to see the underlying infrastructure of your application.



We'll continue to add support for more GCP services to the plugin, so stay tuned for update notifications in the IDE. If you have specific feature requests, please submit them in the GitHub issue tracker.



To learn more about Java on GCP, visit the GCP Java developers portal, where you can find all the information you need to run your Java applications on GCP.



Happy Coding!



P.S. IntelliJ users, see here for the Cloud Tools for IntelliJ plugin.





In our last installment of the CRE life lessons series, we discussed how to survive a "success disaster" with load-shedding techniques. We got a lot of great feedback from that post, including several questions about how to tie measurements to business objectives. So, in this post, we decided to go back to first principles, and investigate what “success” means in the first place, and how to know if your system is “succeeding” at all.



A prerequisite to success is availability. A system that's unavailable cannot perform its function and will fail by default. But what is "availability"? We must define our terms:



Availability defines whether a system is able to fulfill its intended function at a point in time. In addition to being used as a reporting tool, the historical availability measurement can also describe the probability that your system will perform as expected in the future. Sometimes availability is measured by using a count of requests rather than time directly. In either case, the structure of the formula is the same: successful units / total units. For example, you might measure uptime / (uptime + downtime), or successful requests / (successful requests + failed requests). Regardless of the particular unit used, the result is a percentage like 99.9% or 99.999%, sometimes referred to as “three nines” or “five nines.”
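
Expressed as code, both forms reduce to the same "successful units / total units" structure; here's a sketch with made-up inputs:

// Sketch: the two common availability formulas, hypothetical inputs.

// Time-based: uptime / (uptime + downtime), in minutes over 30 days
const uptimeMin = 43156.8;
const downtimeMin = 43.2;
console.log(uptimeMin / (uptimeMin + downtimeMin)); // 0.999, "three nines"

// Request-based: successes / (successes + failures)
const okRequests = 999000;
const failedRequests = 1000;
console.log(okRequests / (okRequests + failedRequests)); // 0.999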



Achieving high availability is best approached by focusing on the unsuccessful component (e.g., downtime or failed requests). Taking a time-based availability metric as an example: given a fixed period of time (e.g., 30 days, 43200 minutes) and an availability target of 99.9% (three nines), simple arithmetic shows that the system must not be down for more than 43.2 minutes over the 30 days. This 43.2 minute figure provides a very concrete target to plan around, and is often referred to as the error budget. If you exceed 43.2 minutes of downtime over 30 days, you'll not meet your availability goal.



Two further concepts are often used to help understand and plan the error budget:



Mean Time Between Failures (MTBF): total uptime / # of failures. This is the average time between failures.



Mean Time to Repair (MTTR): total downtime / # of failures. This is the average time taken to recover from a failure.



These metrics can be computed historically (e.g., over the past three months, or the past year) and combined as (Total Period / MTBF) * MTTR to give an expected downtime value. Continuing with the above example, if the historical MTBF is calculated to be 10 days, and the historical MTTR is calculated to be 20 minutes, then you would expect to see 60 minutes of downtime ((30 days / 10 days) * 20 minutes), clearly outside the 43.2-minute error budget for a three-nines availability target. To meet the target would require increasing the MTBF (say, to a failure every 20 days) or decreasing the MTTR (say, to 10 minutes), or a combination of both.
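
The arithmetic in the last two paragraphs is easy to script; here's a sketch using the same example figures:

// Sketch: error budget vs. expected downtime, using the figures
// above (30-day period, 99.9% target, MTBF 10 days, MTTR 20 min).
const periodDays = 30;
const periodMinutes = periodDays * 24 * 60;             // 43200
const sloTarget = 0.999;
const errorBudgetMin = periodMinutes * (1 - sloTarget); // 43.2 minutes

const mtbfDays = 10;    // historical mean time between failures
const mttrMinutes = 20; // historical mean time to repair
const expectedDowntimeMin = (periodDays / mtbfDays) * mttrMinutes; // 60

console.log(`Error budget: ${errorBudgetMin.toFixed(1)} min`);
console.log(`Expected downtime: ${expectedDowntimeMin} min`);
console.log(expectedDowntimeMin > errorBudgetMin
  ? 'Over budget: increase MTBF and/or decrease MTTR.'
  : 'Within budget.');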



Keeping the concepts of error budget, MTBF and MTTR in mind when defining an availability target helps to provide justification for why the target is set where it is. Rather than simply describing the target as a fixed number of nines, it's possible to relate the numeric target to the user experience in terms of total allowable downtime, frequency and duration of failure.



Next, we'll look at how to ensure this focus on user experience is maintained when measuring availability.



Measuring availability




How do you know whether a system is available? Consider a fictitious "Shakespeare" service, which allows users to find mentions of a particular word or phrase in Shakespeare’s texts. This is a canonical example, used frequently within Google for training purposes, and mentioned throughout the SRE book.



Let's try working the scientific method to determine the availability of the hypothetical Shakespeare system.


  1. Question: how often is the system available?

  2. Observation: when you visit shakespeare.com, you normally get back the "200 OK" status code and an HTML blob. Very rarely, you see a 500 Internal Server error or a connection failure.

  3. Hypothesis: if "availability" is the percentage of requests per day that return 200 OK, the system will be 99.9% available.

  4. Measure: "tail" the response logs of the Shakespeare service’s web servers and dump them into a logs-processing system.

  5. Analyze: take a daily availability measurement as the percentage of 200 OK responses vs. the total number of requests.

  6. Interpret: after seven days, the minimum daily availability observed is 99.7%.
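
Here's a sketch of the analysis step, with hypothetical log-derived counts:

// Sketch: daily availability as the fraction of 200 OK responses,
// per the analysis step above. Counts are hypothetical.
const dailyCounts = [
  { day: '2017-01-09', ok: 86390, total: 86400 },
  { day: '2017-01-10', ok: 997, total: 1000 },
];

for (const { day, ok, total } of dailyCounts) {
  console.log(`${day}: ${((ok / total) * 100).toFixed(1)}% available`);
}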




Happily, you report these availability numbers to your boss (Dave), and go home. A job well done.



The next day Dave draws your attention to the support forum. Users are complaining that all their searches at shakespeare.com return no results. Dave asks why the availability dashboard shows 99.7% availability for the last day, when there clearly is a problem.



You check the logs and notice that the web server has received just 1000 requests in the last 24 hours, and they're all 200 OKs except for three 500s. Given that you expect at least 100 queries per second, that explains why users are complaining in the forums, although the dashboard looks fine.



You've made the classic mistake of basing your definition of availability on a measurement that does not match user expectations or business objectives.




Redefining availability in terms of the user experience with black-box monitoring




After fixing the critical issue (a typo in a configuration file) that prevented the Shakespeare frontend service from reaching the backend, we take a step back to think about what it means for our system to be available.



If the "rate of 200 OK logs for shakespeare.com" is not an appropriate availability measurement, then how should we measure availability?



Dave wants to understand the availability as observed by users. When does the user feel that shakespeare.com is available? After some lively back-and-forth, we agree that the system is available when a user can visit shakespeare.com, enter a query and get a result for that query within five seconds, 100% of the time.



So you write a black-box "prober" (black-box because it makes no assumptions about the implementation of the Shakespeare service; see the SRE Book, Chapter 6) to emulate a full range of client devices (mobile, desktop). For each type of client, you visit shakespeare.com, enter the query "to be or not to be," and check that the result contains the expected link to Hamlet. You run the prober for a week, then recalculate the minimum daily availability measure: 80% of queries return Hamlet within five seconds, 18% of queries take longer, 1% time out and another 1% return errors. A full 20% of queries fail our definition of availability!
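
A heavily simplified sketch of one such probe follows. A real prober emulates many client types and runs continuously; the URL and five-second budget mirror the agreed definition, and the result-checking logic here is illustrative.

// Sketch: a single black-box probe of the availability definition:
// "enter a query and get a result within five seconds."
const https = require('https');

function probe() {
  const started = Date.now();
  const url = 'https://shakespeare.com/search?q=to+be+or+not+to+be';
  https.get(url, (res) => {
    let body = '';
    res.on('data', (chunk) => { body += chunk; });
    res.on('end', () => {
      const elapsedMs = Date.now() - started;
      const success = res.statusCode === 200 &&
          body.includes('Hamlet') &&
          elapsedMs <= 5000;
      console.log(success ? 'probe ok' : 'probe failed', `${elapsedMs}ms`);
    });
  }).on('error', () => console.log('probe failed: connection error'));
}

probe();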




Choosing an availability target according to business goals




After getting over his shock, Dave asks a simple question: “Why can't we have 100% returning within 5 seconds?”



You explain all the usual reasons why: power outages, fiber cuts, etc. After an hour or so, Dave is willing to admit that 100% query response in under five seconds is truly impossible.



Which leads Dave to ask, “What availability can we have, then?”



You turn the question around on him: “What availability is required for us to meet our business goals?”



Dave's eyes light up. The business has set a revenue target of $25 million per year, and we make on average $0.01 per query result. At 100 queries per second * 31,536,000 seconds per year * 80% success rate * $0.01 per query, we'll earn $25.23 million. In other words, even with a 20% failure rate, we'll still hit our revenue targets!



Still, a 20% failure rate is pretty ugly. Even if we think we'll meet our revenue targets, it's not a good user experience and we might have some attrition as a result. Should we fix it, and if so, what should our availability objective be?




Evaluating cost/benefit tradeoffs, opportunity costs




Suppose the rate of queries returning in greater than five seconds can be reduced to 0.5% if an engineer works on the problem for six months. How should we decide whether or not to do this?



We can start by estimating how much the 20% failure rate is going to cost us in missed revenue (accounting for users who give up on retrying) over the life of the product. We know roughly how much it will cost to fix the problem. Naively, we may decide that since the revenue lost due to the error rate exceeds the cost of fixing the issue, then we should fix it.



But this ignores a crucial factor: the opportunity cost of fixing the problem. What other things could an engineer have done with that time instead?



Hypothetically, there’s a new search algorithm that increases the relevance of Shakespeare search results, and putting it into production might drive a 20% increase in search traffic, even as availability remains constant. This increase in traffic could easily offset any lost revenue due to poor availability.



An oft-heard SRE saying is that you should “design a system to be as available as is required, but not much more.” At Google, when designing a system, we generally target a given availability figure (e.g., 99.9%), rather than particular MTBF or MTTR figures. Once we’ve achieved that availability metric, we optimize our operations for "fast fix," e.g., MTTR over MTBF, accepting that failure is inevitable, and that “spes consilium non est” (Hope is not a strategy). SREs are often able to mitigate the user visible impact of huge problems in minutes, allowing our engineering teams to achieve high development velocity, while simultaneously earning Google a reputation for great availability.



Ultimately, the tradeoff made between availability and development velocity belongs to the business. Precisely defining availability in product terms allows us to have a principled discussion and to make choices we can be proud of.








Google Cloud uses the open-source KVM hypervisor that has been validated by scores of researchers as the foundation of Google Compute Engine and Google Container Engine, and invests in additional security hardening and protection based on our research and testing experience. Then we contribute back our changes to the KVM project, benefiting the overall open-source community.



What follows is a list of the main ways we security-harden KVM to help improve the safety and security of your applications.




  1. Proactive vulnerability search: There are multiple layers of security and isolation built into Google’s KVM (Kernel-based Virtual Machine), and we’re always working to strengthen them. Google’s cloud security staff includes some of the world’s foremost experts in KVM security, and has uncovered multiple vulnerabilities in KVM, Xen and VMware hypervisors over the years. The Google team has historically found and fixed nine vulnerabilities in KVM. During the same time period, the open source community discovered zero vulnerabilities in KVM that impacted Google Cloud Platform (GCP).



  2. Reduced attack surface area: Google has helped to improve KVM security by removing unused components (e.g., a legacy mouse driver and interrupt controllers) and limiting the set of emulated instructions. This presents a reduced attack and patch surface area for potential adversaries to exploit. We also modify the remaining components for enhanced security.



  3. Non-QEMU implementation: Google does not use QEMU, the user-space virtual machine monitor and hardware emulator. Instead, we wrote our own user-space virtual machine monitor, which has the following security advantages over QEMU:



    Simple host and guest architecture support matrix. QEMU supports a large matrix of host and guest architectures, along with different modes and devices that significantly increase complexity. Because we support a single architecture and a relatively small number of devices, our emulator is much simpler. We don’t currently support cross-architecture host/guest combinations, which helps avoid additional complexity and potential exploits.

    Strong emphasis on simplicity and testability. Google’s virtual machine monitor is composed of individual components, and unit testing leads to fewer bugs in a complex system. QEMU code lacks unit tests and has many interdependencies that would make unit testing extremely difficult.

    No history of security problems. QEMU, by contrast, has a long track record of security bugs, such as VENOM, and it's unclear what vulnerabilities may still be lurking in the code.



  4. Boot and Jobs communication: The code provenance processes that we implement help ensure that machines boot to a known good state. Each KVM host uses a peer-to-peer cryptographic key-sharing system with the jobs running on that host, helping to make sure that all communication between jobs running on the host is explicitly authenticated and authorized.



  5. Code Provenance: We run a custom binary and configuration verification system that was developed and integrated with our development processes to track what source code is running in KVM, how it was built, how it was configured and how it was deployed. We verify code integrity at every level, from the boot-loader to KVM to the customers’ guest VMs.



  6. Rapid and graceful vulnerability response: We've defined strict internal SLAs and processes to patch KVM in the event of a critical security vulnerability. However, in the three years since we released Compute Engine in beta, our KVM implementation has required zero critical security patches. Non-KVM vulnerabilities are rapidly patched through Google's internal infrastructure to help maximize security protection and meet all applicable compliance requirements, and are typically resolved without impact to customers. We notify customers of updates as required by contractual and legal obligations.



  7. Carefully controlled releases: We have stringent rollout policies and processes for KVM updates driven by compliance requirements and Google Cloud security controls. Only a small team of Google employees has access to the KVM build system and release management control.




There’s a lot more to learn about KVM security at Google. Click the links below for more information.






And of course, KVM is just one infrastructure component used to build Google Cloud. We take security very seriously, and hope you’ll entrust your workloads to us.  


FAQ


Should I worry about side channel attacks?



We rarely see attempted side channel attacks: attacks based on information gained from the physical implementation of a cryptosystem (such as timing or memory access patterns), rather than on brute force or theoretical weaknesses in the algorithms. A large shared infrastructure the size of Compute Engine makes such attacks very impractical. To mount one, the target VM and the attacker VM have to be colocated on the same physical host, and for any practical attack the attacker has to have some ability to induce execution of the cryptosystem being targeted. One common use for side channel attacks is against cryptographic keys. Side channel attacks that leak information are usually addressed quickly by cryptographic library developers, so we recommend that Google Cloud customers ensure that their cryptographic libraries are supported and always up to date.



What about Venom? 



Venom affects QEMU. Compute Engine and Container Engine are unaffected because neither uses QEMU.



What about Rowhammer? 



The Google Project Zero team led the way in discovering practical Rowhammer attacks against client platforms. Google production machines use double refresh rate to reduce errors, and ECC RAM that detects and corrects Rowhammer-induced errors. 1-bit errors are automatically corrected, and 2-bit errors are detected and cause any potentially offending guest VMs to be terminated. Alerts are generated for any projects that cause an unusual number of Rowhammer errors. Undetectable 3-bit errors are theoretically possible, but extremely improbable. A Rowhammer attack would cause a very large number of alerts for 2-bit and 3-bit errors and would be detected.



A recent paper describes a way to mount a Rowhammer attack using a KSM KVM module. KSM, the Linux implementation of memory de-duplication, uses a kernel thread that periodically scans memory to find memory pages with the same contents mapped from multiple VMs that are candidates for merging. Memory “de-duping” with KSM can help to locate the area to “hammer” the physical transistors underlying those bits of data, and can target the identical bits on someone else’s VM running on the same physical host. Compute Engine and Container Engine are not vulnerable to this kind of attack, since they do not use KSM. However, if a similar attack is attempted via a different mechanism, we have mitigations in place to detect it.



What is Google doing to reduce the impact of KVM vulnerabilities? 



We have evaluated the sources of vulnerabilities discovered to date within KVM. Most of the vulnerabilities have been in code areas that remain in the kernel for historical reasons, but that can now be removed without a significant performance impact when running modern operating systems on modern hardware. We’re working on relocating this in-kernel emulation functionality outside of the kernel.



How does the Google security team identify KVM vulnerabilities in their early stage? 



We have built an extensive set of proprietary fuzzing tools for KVM. We also do a thorough code review looking specifically for security issues each time we adopt a new feature or version of KVM. As a result, we've found many vulnerabilities in KVM over the past three years. About half of our discoveries come from code review and about half come from our fuzzers.





Google Cloud Platform (GCP) customers need an easy way to centrally manage and control GCP resources, projects and billing accounts that belong to their organization. As companies grow, it becomes progressively difficult to keep track of an ever-increasing number of projects, created by multiple users, with different access control policies and linked to a variety of billing instruments. Google Cloud Resource Manager allows you to group resource containers under the Organization resource, providing full visibility, centralized ownership and unified management of your company’s assets on GCP.



The Organization resource is now automatically assigned to all GCP users who have G Suite accounts, without any additional steps on their part. All you need to do is create a project within your company’s domain to unlock the Organization resource and all its benefits!



Since it was introduced in October 2016, hundreds of customers have successfully deployed Cloud Resource Manager’s Organization resource, and have provided positive feedback.

"At Qubit, we love the flexibility of GCP resource containers including Organizations and Projects. We use the Organization resource to maintain centralized visibility of our projects and GCP IAM policies to ensure consistent access controls throughout the company. This gives our developers the capabilities they need to put security at the forefront throughout our migration to the cloud."  Laurie Clark-Michalek, Infrastructure Engineer at Qubit.

Understanding the Cloud Resource Manager Organization resource

The Cloud Resource Manager Organization resource is the root of the GCP resource hierarchy and is a critical component for all enterprise use cases, from social media to financial services, from gaming to e-commerce, to name a few. Here are a few benefits offered by the Organization resource:

  • Tie ownership of GCP projects to your company, so they remain available when a user leaves the organization.

  • Allow GCP admins to define IAM policies that apply horizontally across the entire organization.

  • Provide central visibility and control over billing for effective cost allocation and reporting.

  • Enable new policies and features for improved security.



The diagram below illustrates the GCP resource hierarchy and its link with the G Suite account.

G Suite, our set of intelligent productivity apps, is currently a prerequisite to access the Cloud Resource Manager Organization resource in GCP. It represents your company by providing ownership, lifecycle control, identities and a recovery mechanism. If you don’t already have a G Suite account, you can sign up to obtain one here. (You can request a GCP account that does not require G Suite to use the Cloud Resource Manager Organization resource. For more information, contact your sales representative.)





Getting started with the Cloud Resource Manager Organization resource



Unlocking the benefits of the Cloud Resource Manager Organization resource is easy; it's automatically provisioned for your organization the first time a GCP user in your domain creates a GCP project or billing account. The Organization resource display name is automatically synchronized with your G Suite organization name and is visible in the Cloud Console UI picker, as shown in the picture below. The Organization resource is also accessible via gcloud and the Cloud Resource Manager API.

Because of the ownership and lifecycle implications explained above, the G Suite super admin is granted full control over GCP by default. Usually, different departments in an organization manage G Suite and GCP. Thus, the first and most important step for the G Suite super admin overseeing a GCP account is to identify and assign the IAM Organization Admin role to the relevant users in their domain. Once assigned, the Organization Admins can manage IAM policies, project ownership and billing centrally, via Cloud Console, gcloud or the Cloud Resource Manager API.



All new GCP projects and billing accounts will belong to the Cloud Resource Manager Organization resource by default, and it’s easy to migrate existing GCP Projects there too. Existing projects that have not migrated under the Organization resource are visible under the “No Organization” hierarchy.



How to manage your Cloud Resource Manager Organization resource with gcloud



The following script summarizes the steps to start using the Cloud Resource Manager Organization resource.



# Query your Organization ID
> gcloud organizations list
DISPLAY_NAME    ID         DIRECTORY_CUSTOMER_ID
MyOrganization  123456789  C03ryezon

# Access Organization details
> gcloud organizations describe [ORGANIZATION_ID]
creationTime: '2016-11-15T04:42:33.042Z'
displayName: MyOrganization
lifecycleState: ACTIVE
name: organizations/123456789
owner:
  directoryCustomerId: C03ryezon

# How to assign the Organization Admin role
# (requires Organization Admin or Super Admin permissions)
> gcloud organizations add-iam-policy-binding [ORGANIZATION_ID] \
  --member=[MEMBER_ID] --role=roles/resourcemanager.organizationAdmin

# How to migrate an existing project into the Organization
> gcloud alpha projects move [PROJECT_ID] --organization [ORGANIZATION_ID]

# How to list all projects in the Organization
> gcloud projects list --filter 'parent.id=[ORGANIZATION_ID] AND parent.type=organization'




What’s next



The Cloud Resource Manager Organization resource is the root of the GCP hierarchy and is key to centralized control, management and improving security. By assigning the CRM Organization resource to all G Suite users, we're setting the stage for more innovation. Stay tuned for new capabilities that rely on the Cloud Resource Manager Organization resource as they become available in 2017. And for a deep dive into the Cloud Resource Manager and the latest in GCP security, join us at a security bootcamp at Next ’17 in San Francisco this March.





IT organizations want to realize the cost and speed benefits of cloud, but can’t afford to throw away years of investment in tools, talent and governance processes they’ve built on-prem. Hybrid models of application management have emerged as a way to get the best of both worlds.



Development and test (dev/test) environments help teams create different environments to support the development, testing, staging and production of enterprise applications. Working with CloudBolt Software, we’ve prepared a full tutorial guide that describes how to quickly provision these environments in a self-service capacity, while maintaining full control over governance and policies required by enterprise IT.



CloudBolt isn’t just limited to dev/test workloads, but anything your team runs on VMs. As a cloud management platform that integrates your on-prem virtualization and private cloud resources with the public cloud, CloudBolt serves as a bridge between your existing infrastructure and Google Cloud Platform (GCP). Developers within your organization can provision the resources they need through an intuitive self-service portal, while IT maintains full control over how these provisioned environments are configured, helping them reap the cost and agility benefits of GCP using the development tools and processes they’ve built up over the years. It’s also an elegant way to rein in VM sprawl, helping organizations manage the ad-hoc environments that spring up with new projects. CloudBolt even provides a way to automatically scan and discover VMs in both on-prem and cloud environments.



Teams can get started immediately with this self-service tutorial. Or join us for our upcoming webinar featuring CloudBolt’s CTO Bernard Sanders and Google’s Product Management lead for Developer Tools on January 26th. Don’t hesitate to reach out to us to explore which enterprise workloads make the most sense for your cloud initiatives.





Google Cloud Audit Logging helps you to determine who did what, where and when on Google Cloud Platform (GCP). This fall, Cloud Audit Logging became generally available for a number of products. Today, we’re significantly expanding the set of products integrated with Cloud Audit Logging:

The above integrations are all currently in beta.



We’re also pleased to announce that audit logging for Google Cloud Dataflow, Stackdriver Debugger and Stackdriver Logging is now generally available.



Cloud Audit Logging provides log streams for each integrated product. The primary log stream is the admin activity log that contains entries for actions that modify the service, individual resources or associated metadata. Some services also generate a data access log that contains entries for actions that read metadata as well as API calls that access or modify user-provided data managed by the service. Right now only Google BigQuery generates a data access log, but that will change soon.



Interacting with audit logs in Cloud Console

You can see a high-level overview of all your audit logs on the Cloud Console Activity page. Click on any entry to display a detailed view of that event, as shown below.



By default, data access logs are not displayed in this feed. To enable them from the Filter configuration panel, select the “Data Access” field under Categories. (Please note, you also need to have the Private Logs Viewer IAM permission in order to see data access logs). You can also filter the results displayed in the feed by user, resource type and date/time.



Interacting with audit logs in Stackdriver

You can also interact with the audit logs just like any other log in the Stackdriver Logs Viewer. With Logs Viewer, you can filter or perform free text search on the logs, as well as select logs by resource type and log name (“activity” for the admin activity logs and “data_access” for the data access logs).



Here are some log entries in their JSON format, with a few important fields highlighted.

In addition to viewing your logs, you can also export them to Cloud Storage for long-term archival, to BigQuery for analysis, and/or Google Cloud Pub/Sub for integration with other tools. Check out this tutorial on how to export your BigQuery audit logs back into BigQuery to analyze your BigQuery spending over a specified period of time.

"Google Cloud Audit Logs couldn't be simpler to use; exported to BigQuery it provides us with a powerful way to monitor all our applications from one place.Darren Cibis, Shine Solutions

Partner integrations

We understand that there are many tools for log analysis out there. For that reason, we’ve partnered with companies like Splunk, Netskope, and Tenable Network Security. If you don’t see your preferred provider on our partners page, let us know and we can try to make it happen.



Alerting using Stackdriver logs-based metrics

Stackdriver Logging provides the ability to create logs-based metrics that can be monitored and used to trigger Stackdriver alerting policies. Here’s an example of how to set up your metrics and policies to generate an alert every time an IAM policy is changed.



The first step is to go to the Logs Viewer and create a filter that describes the logs for which you want to be alerted. Be sure that the scope of the filter is set correctly to search the logs corresponding to the resource in which you are interested. In this case, let’s generate an alert whenever a call to SetIamPolicy is made.
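
For example, an advanced filter along these lines matches IAM policy changes (you may want to scope it further to a particular resource type or project):

protoPayload.methodName="SetIamPolicy"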



Once you're satisfied that the filter captures the correct events, create a logs-based metric by clicking on the "Create Metric" option at the top of the screen.



Now, choose a name and description for the metric and click "Create Metric." You should then receive a confirmation that the metric was saved.

Next, select “Logs-based Metrics” from the side panel. You should see your new metric listed there under “User Defined Metrics.” Click on the dots to the right of your metric and choose "Create alert from metric."



Now, create a condition to trigger an alert if any log entries match the previously specified filter. To do that, set the threshold to "above 0" in order to catch this occurrence. Logs-based metrics count the number of entries seen per minute. With that in mind, set the duration to one minute as the duration specifies how long this per-minute rate needs to be sustained in order to trigger an alert. For example, if the duration were set to five minutes, there would have to be at least one alert per minute for a five-minute period in order to trigger the alert.



Finally, choose “Save Condition” and specify the desired notification mechanisms (e.g., email, SMS, PagerDuty, etc.). You can test the alerting policy by giving yourself a new permission via the IAM console.



Responding to audit logs using Cloud Functions



Cloud Functions is a lightweight, event-based, asynchronous compute solution that allows you to execute small, single-purpose functions in response to events such as specific log entries. Cloud functions are written in JavaScript, execute in a standard Node.js environment, and can be triggered by events from Cloud Storage or Cloud Pub/Sub. In this case, we'll trigger cloud functions when logs are exported to a Cloud Pub/Sub topic. Cloud Functions is currently in alpha; please sign up to request enablement for your project.



Let’s look at firewall rules as an example. Whenever a firewall rule is created, modified or deleted, a Compute Engine audit log entry is written. The firewall configuration information is captured in the request field of the audit log entry. The following function inspects the configuration of a new firewall rule and deletes it if that configuration is of concern (in this case, if it opens up any port besides port 22). This function could easily be extended to look at update operations as well.



/**
 * Copyright 2017 Google Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

'use strict';

// Triggered by audit log entries exported to a Cloud Pub/Sub topic.
exports.processFirewallAuditLogs = (event) => {
  // The Pub/Sub message payload is the base64-encoded audit log entry.
  const msg = JSON.parse(Buffer.from(event.data.data, 'base64').toString());
  const logEntry = msg.protoPayload;
  if (logEntry &&
      logEntry.request &&
      logEntry.methodName === 'v1.compute.firewalls.insert') {
    let cancelFirewall = false;
    const allowed = logEntry.request.alloweds;
    if (allowed) {
      // Flag the rule if it opens any port other than 22.
      for (let key in allowed) {
        const entry = allowed[key];
        for (let port in entry.ports) {
          if (parseInt(entry.ports[port], 10) !== 22) {
            cancelFirewall = true;
            break;
          }
        }
      }
    }
    if (cancelFirewall) {
      // Delete the offending firewall rule by its short name.
      const resourceArray = logEntry.resourceName.split('/');
      const resourceName = resourceArray[resourceArray.length - 1];
      const compute = require('@google-cloud/compute')();
      return compute.firewall(resourceName).delete();
    }
  }
  return true;
};


As the function above uses the gcloud Node.js module, be sure to include that as a dependency in the package.json file that accompanies the index.js file specifying your source code:

{
  "name": "audit-log-monitoring",
  "version": "1.0.0",
  "description": "monitor my audit logs",
  "main": "index.js",
  "dependencies": {
    "@google-cloud/compute": "^0.4.1"
  }
}


In the image below, you can see what happened to a new firewall rule (“bad-idea-firewall”) that did not meet the acceptable criteria as determined by the cloud function. It's important to note that this cloud function is not applied retroactively, so existing firewall rules that allow traffic on ports 80 and 443 are preserved.



This is just one example of many showing how you can leverage the power of Cloud Functions to respond to changes on GCP.



Conclusion



Cloud Audit Logging offers enterprises a simple way to track activity in applications built on top of GCP, and integrate logs with monitoring and logs analysis tools. To learn more and get trained on audit logging as well as the latest in GCP security, sign up for a Google Cloud Next ‘17 technical bootcamp in San Francisco this March.





Google Stackdriver Monitoring allows users to create charts and alerts on monitoring metrics gathered across their Google Cloud Platform (GCP) and Amazon Web Services environments. Stackdriver users who want to drill deeper into their monitoring data can use Cloud Datalab, an easy-to-use tool for large-scale data exploration, analysis and visualization. Based on Jupyter (formerly IPython), Cloud Datalab gives you access to a thriving ecosystem, including Google BigQuery and Google Cloud Storage, plus many statistics and machine learning packages, including TensorFlow. We include notebooks with detailed tutorials to help you get started with your Stackdriver data, and the vibrant Jupyter community is a great source for more published notebooks and tips.



Libraries from the Jupyter community open up a variety of visualization options. For example, a heatmap is a compact representation of data, often used to visually highlight patterns. With a few lines of code included in the sample notebook, Getting Started.ipynb, we can visualize utilization across different instances to look for opportunities to reduce spend.
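
To give a flavor of what that looks like, here's a minimal sketch of such a heatmap using synthetic per-instance utilization data. In Datalab, the DataFrame would be populated from a Stackdriver Monitoring query rather than random numbers, and the instance names here are made up:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic CPU utilization: one row per instance, one column per hour.
hours = pd.date_range('2017-02-01', periods=24, freq='H')
instances = ['web-1', 'web-2', 'worker-1', 'worker-2']
util = pd.DataFrame(np.random.rand(len(instances), len(hours)),
                    index=instances, columns=hours)

# Render the matrix as a heatmap; hot cells flag busy instances, while
# large cool regions suggest capacity that could be trimmed.
fig, ax = plt.subplots(figsize=(10, 3))
im = ax.imshow(util.values, aspect='auto', cmap='YlOrRd', vmin=0, vmax=1)
ax.set_yticks(range(len(instances)))
ax.set_yticklabels(instances)
ax.set_xlabel('hour of day')
fig.colorbar(im, label='CPU utilization')
plt.show()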



The Datalab environment also makes it possible to do advanced analytics. For example, in the included notebook, Time-shifted data.ipynb, we walk through shifting the data by day to compare today's metrics against historical data. This powerful analysis lets you identify anomalies in your system metrics at a glance by visualizing how they deviate from their historical values.
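
At its core, the technique is a time-shift plus a join. Here's a minimal pandas sketch, with a synthetic series standing in for the Stackdriver query results the notebook uses:

import pandas as pd

# Hourly CPU utilization for the past week (synthetic stand-in data).
idx = pd.date_range('2017-02-01', periods=7 * 24, freq='H')
cpu = pd.Series(range(len(idx)), index=idx, dtype=float)

# Re-label last week's samples seven days forward so they line up with
# today's timestamps, then compare the two series point by point.
week_ago = cpu.shift(7, freq='D')
both = pd.concat({'today': cpu, 'week_ago': week_ago}, axis=1)
deviation = (both['today'] - both['week_ago']).abs()  # anomaly signal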


Compare today’s CPU utilization to the weekly average by zone





Stackdriver metrics, viewed with Cloud Datalab




Get started




The first step is to sign up for a 30-day free trial of Stackdriver Premium, which can monitor workloads on GCP and AWS. It takes two minutes to set up. Next, set up Cloud Datalab, which can be easily configured to run on Docker with this Quickstart. Sample code and notebooks for exploring trends in your data, analyzing group performance and heat map visualizations are included in the Datalab container.



Let us know what you think, and we’ll do our best to address your feedback and make analysis of your monitoring data even simpler for you.






Trust in the cloud is paramount to any business that is thinking about using it to power critical applications, deliver new customer experiences and house its most sensitive data. Today, we're issuing a white paper by our security team that details how security is designed into our infrastructure from the ground up.



Google Cloud’s global infrastructure provides security through the entire information processing lifecycle. This infrastructure provides secure deployment of services, secure storage of data with end-user privacy safeguards, secure communication between services, secure and private communication with customers over the internet, and safe operation by administrators.



Google uses this infrastructure to build its internet services, including both consumer services such as Search, Gmail, and Photos, and our Google Cloud enterprise services.



The paper describes the security of this infrastructure in progressive layers starting from the physical security of our data centers, continuing on to how the hardware and software that underlie the infrastructure are secured, and finally, describing the technical constraints and processes in place to support operational security.



In a final section, the paper highlights how our public cloud infrastructure, Google Cloud Platform (GCP), benefits from the security of the underlying infrastructure. We take Google Compute Engine as an example service and describe in detail the service-specific security improvements that we build on top of the infrastructure.



For more information, please take a look at the paper: https://cloud.google.com/security/security-design



We're also pleased to announce the addition of regular, security-focused content on this blog under the Security & Identity label, which includes posts on topics like virtual machine security, identity and access management, platform integrity and the practical applications of encryption. Watch this space!





Google has long supported efforts to encrypt customer data on the internet, including using HTTPS everywhere. In the enterprise space, we're pleased to broaden the continuum of encryption options available on Google Cloud Platform (GCP) with Cloud Key Management Service (KMS), now in beta in select countries.


"With the launch of Cloud KMS, Google has addressed the full continuum of encryption and key management use cases for GCP customers. Cloud KMS fills a gap by providing customers with the ability to manage their encryption keys in a multi-tenant cloud service, without the need to maintain an on-premise key management system or HSM.” Garrett Bekker, Principal Security Analyst at 451 Research

Customers in regulated industries, such as financial services and healthcare, value hosted key management services for the ease of use and peace of mind that they provide. Cloud KMS offers a cloud-based root of trust that you can monitor and audit. As an alternative to custom-built or ad-hoc key management systems, which are difficult to scale and maintain, Cloud KMS makes it easy to keep your keys safe.



With Cloud KMS, you can manage symmetric encryption keys in a cloud-hosted solution, whether they’re used to protect data stored in GCP or in another environment. You can create, use, rotate and destroy keys via our Cloud KMS API, including as part of a secret management or envelope encryption solution. It’s directly integrated with Cloud Identity and Access Management and Cloud Audit Logging, for greater control over your keys.
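
To make the envelope encryption idea concrete, here's a hedged sketch in Python: a locally generated data encryption key (DEK) protects the payload, and Cloud KMS encrypts only the DEK. The project, key ring and key names are hypothetical and token acquisition is elided; the :encrypt method shown is part of the Cloud KMS REST API (v1-style path shown here; check the current API reference for the exact version):

import base64
import os
import requests
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Hypothetical key resource name; the key ring and key must already exist.
KEY = ('projects/my-project/locations/global/'
       'keyRings/my-ring/cryptoKeys/my-key')

def encrypt_envelope(plaintext, access_token):
    # 1. Generate a local data encryption key (DEK) and encrypt the payload.
    dek = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)
    ciphertext = AESGCM(dek).encrypt(nonce, plaintext, None)

    # 2. Wrap the DEK with the key encryption key held in Cloud KMS.
    resp = requests.post(
        'https://cloudkms.googleapis.com/v1/%s:encrypt' % KEY,
        headers={'Authorization': 'Bearer ' + access_token},
        json={'plaintext': base64.b64encode(dek).decode()})
    resp.raise_for_status()

    # 3. Store the wrapped DEK alongside the ciphertext; the data cannot
    #    be read again without a KMS :decrypt call to unwrap the DEK.
    return {'wrapped_dek': resp.json()['ciphertext'],
            'nonce': base64.b64encode(nonce).decode(),
            'data': base64.b64encode(ciphertext).decode()}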



Forward-thinking cloud companies must lead by example and follow best practices. For example, Ravelin, a fraud detection provider, encrypts small secrets, such as configurations and authentication credentials, that are needed as part of customer transactions, and uses separate keys to ensure that each customer's data is cryptographically isolated. Ravelin also encrypts secrets used for internal systems and automated processes.


“Google is transparent about how it does its encryption by default, and Cloud KMS makes it easy to implement best practices. Features like automatic key rotation let us rotate our keys frequently with zero overhead and stay in line with our internal compliance demands. Cloud KMS’ low latency allows us to use it for frequently performed operations. This allows us to expand the scope of the data we choose to encrypt from sensitive data to operational data that does not need to be indexed.” - Leonard Austin, CTO at Ravelin

At launch, Cloud KMS uses the Advanced Encryption Standard (AES) in Galois/Counter Mode (GCM), the same encryption used internally at Google to encrypt data in Google Cloud Storage. This AES GCM implementation comes from the BoringSSL library, which Google maintains and continually checks for weaknesses using several tools, including ones similar to the recently open-sourced cryptographic test suite Project Wycheproof.




The GCP encryption continuum




With the introduction of Cloud KMS, GCP now offers a full range of encryption key management options, allowing you to choose the right security solution for your use case based on the nature of your data (e.g., financial, personal health, private individual, military, government, or otherwise confidential or sensitive data) and on whether you want to store keys in the cloud or on-premises.



By default, Cloud Storage manages server-side encryption keys on your behalf. If you prefer to manage your cloud-based keys yourself, select "Cloud Key Management Service"; if you’d like to manage keys on-premises, select "Customer Supplied Encryption Keys" (available for Google Cloud Storage and Google Compute Engine). See the diagram below for a use case decision tree:








Your data is yours


While we’re on the topic of data protection and data privacy, it might be useful to point out how we think about GCP customer data: Google will not access or use GCP customer data except as necessary to provide the GCP services. You can learn more about our encryption policy by reading our whitepaper, “Encryption at Rest in Google Cloud Platform.”



Safe computing!







Today we’re sharing the first episode of our Pivotal Cloud Foundry on Google Cloud Platform (GCP) mini video series, featuring engineers from the Pivotal and Google Cloud Graphite teams who've been working hard to make this open-source platform run great on GCP. Google’s Cloud Graphite team works exclusively on open source projects in collaboration with project maintainers and customers. We’ll have more videos and blog posts this year, just like this one, highlighting that work.



In 2016 we began working with Pivotal, and announced back in October that customers can deploy and operate Pivotal Cloud Foundry on GCP. Thanks to this partnership, companies in industries like manufacturing, healthcare and retail can accelerate their digital transformation and run cloud-native applications on the same kind of infrastructure that powers Gmail, YouTube, Google Maps and more.


“The chemistry between the two engineering teams was remarkable, as if we had been working together for years. The Cloud Foundry community is already benefiting from this work. It’s simple to deploy Cloud Foundry atop Google’s infrastructure, and developers can easily extend their apps with Google’s analytics and machine learning services. We look forward to working with Google in the future to advance our shared vision for multi-cloud choice and flexibility.” - Joshua McKenty, Head of Platform Ecosystem, Pivotal

Specifically, together with Pivotal, we have:


  • Brought BOSH to GCP, adding support for Google’s global networking and load balancer, quick VM boot times, live migration and preemptible VM pricing

  • Built a service broker to let Cloud Foundry developers easily use Google services such as Google BigQuery, Google Cloud SQL and Google Cloud Machine Learning in their apps

  • Developed the stackdriver-tools BOSH release to give operators and developers access to health and diagnostics information in Stackdriver Logging and Stackdriver Monitoring






In the first episode of the video series, Dan Wendorf of Pivotal and I talk about deploying BOSH and Cloud Foundry to GCP, using the tutorial you can follow along with on GitHub.



Join us on YouTube to watch other episodes that will cover topics like setting up and consuming Google services with our Service Broker, or how to monitor and debug Cloud Foundry applications with Stackdriver. Just follow Google Cloud on YouTube, or @GoogleCloud on Twitter to find out when new videos are published. And stay tuned for more blog posts and videos about the work we’re doing with Puppet, Chef, HashiCorp, Red Hat, SaltStack and others.





If you work in cloud security, you might be planning a trip to San Francisco next month for the RSA Conference. If so, please stop by our San Francisco office for a series of 20 security talks. Our office is a 12-minute walk up Howard Street from Moscone Center, where the RSA Conference is held.



Google Cloud takes security seriously, and we’re excited to share more about some of the interesting and difficult problems we’re working on day-to-day. In our security talks, you’ll hear about our efforts in cloud identity, vulnerability trends from Project Zero, DDoS mitigation, container security and more!



See below for the full agenda of security talks we’ll be hosting. To learn more and RSVP, visit https://cloudplatformonline.com/rsa





We’re also excited that Googlers will be giving talks at the RSA Conference itself.


Hope to see you at RSA!