Google Cloud Platform Blog
Understanding Cloud Pricing Part 5 - NoSQL Databases
Monday, August 31, 2015
We’ve had a lot of great responses and feedback (keep ‘em coming!) about our cloud pricing posts (Local SSDs, Virtual Machines, Data Warehouses) and today we’re back to talk about running NoSQL databases in the cloud. Specifically, we want to give you the information you need to understand how to estimate the cost of running NoSQL workloads on Google Cloud Platform.
NoSQL Databases
The NoSQL database market has experienced massive growth for the last few years and NoSQL databases have been instrumental in solving many distributed data and scaling challenges, which have opened the door for new and innovative applications and solutions. “NoSQL” is an umbrella term that encompasses any data store that fits the notion of “not only SQL” and many products offer a high degree of tunability around the standard relational database concepts of atomicity, consistency, isolation, and durability (see ACID for more information) and the distributed systems concepts of consistency, availability, and partition tolerance (see CAP theorem for more information). And every NoSQL database offers something different when it comes to how data is modeled and stored - including, but not limited to - JSON document, key-value, wide-column, and blob storage.
As expected, there are several different self-managed options available such as MongoDB, Apache Cassandra, Riak, Apache CouchDB, Couchbase and many more. Today we’re going to focus on how to estimate pricing when running MongoDB. MongoDB is a document-based, highly-scalable NoSQL database that provides dynamic JSON schemas along with a powerful query language. There are a variety of use cases for MongoDB, such as a 360-degree view of the customer, real-time analytics, internet of things applications, and content management (to name a few).
However, when looking at the pricing data for MongoDB, we noticed something interesting. We had planned a separate blog post to talk about pricing Cassandra on Google Cloud Platform as well. But the hardware (virtual or real) requirements are very similar and neither requires a license to be purchased, so the costs are very similar. It didn’t make sense to have another post stating more or less the same thing, just replacing the name of the database, so we are going to include Cassandra here as well.
Cassandra, unlike MongoDB, is a key-value store. Cassandra was written at Facebook with much of the data model inspired by Google's Bigtable white paper and the availability design inspired by Amazon's Dynamo white paper. Cassandra was designed for high availability, performance, and tunable consistency. Cassandra has no leader or master node, but rather all the nodes in a cluster exist in a ring, where data is replicated a configurable number of times. Availability comes from having a headless cluster storing your data; tunable consistency comes from how much effort you want your cluster to spend to return your queries. Cassandra and MongoDB are two of the most used NoSQL databases that we see our customers using.
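To make the ring and replication idea a little more concrete, here is a minimal sketch in Python. It is a simplification and not Cassandra's actual partitioner or replication strategy: the hash function, the node names, and the helpers `token`, `replicas_for`, and `quorum` are all hypothetical and chosen only for illustration. It shows how a key can map to a position on a ring, how the next few nodes clockwise can hold its copies, and how many replicas make up a majority when you trade consistency against effort per query.

```python
import hashlib
from bisect import bisect_left


def token(value: str) -> int:
    """Map a string to a position on the ring (illustrative hash only)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def replicas_for(key: str, nodes: list[str], replication_factor: int) -> list[str]:
    """Return the nodes holding `key`: the first node at or after the key's
    token, then the following nodes clockwise, wrapping around the ring."""
    ring = sorted((token(node), node) for node in nodes)
    tokens = [t for t, _ in ring]
    # Wrap to the first node when the key's token is past the last node.
    start = bisect_left(tokens, token(key)) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


# Hypothetical five-node cluster with three copies of every key.
nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
owners = replicas_for("customer:42", nodes, replication_factor=3)
print(owners, "quorum =", quorum(3))
```

With three copies, a majority is two replicas, so a request that waits for two acknowledgements does more work than one that waits for a single replica, but it can tolerate one replica being unavailable or out of date.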
Starting Point
So how do you estimate pricing given multiple use cases and different possible query and traffic patterns? To get started with MongoDB, we’re going to narrow the scope a bit and estimate the costs of the resources used in existing benchmarks. There are several benchmarks that have been published about MongoDB performance and we’ll focus in on two of them, one published by MongoDB and another from United Software Associates. Both benchmarks reach roughly the same throughput and latency conclusions so this is a reasonable model to build upon.
While the benchmarks from United Software Associates used a single MongoDB node for testing, the benchmarks published by MongoDB used a 3-node replica set. Replica sets are a redundant, highly-available deployment of MongoDB and they are strongly recommended for all production workloads (at a minimum). The smallest possible replica set consists of three nodes, each configured with matching specifications, so we’ll include that configuration in our pricing breakdown below. The on-prem reference hardware specs used in the benchmarks were as follows (MongoDB, like most databases, tends to favor more RAM and storage IOPS where possible):
| Benchmark | MongoDB | United Software Associates |
|---|---|---|
| CPU | Dual 10-core Xeon 3.0 GHz | Dual 6-core Xeon 3.06 GHz |
| RAM | 128 GB | 96 GB |
| Storage | 2 x 960 GB SSD | 2 x 960 GB SSD |
| Monthly Price (single node) | $1,525.00* (estimate) | Unavailable** |
| Monthly Price (3-node replica set) | $4,575.00* (estimate) | Unavailable** |
Now if we map that back to Google Compute Engine instances and storage offerings we would have the following 2 closely matching configurations along with pricing:
| Instance Type | n1-highmem-16 | n1-standard-32 |
|---|---|---|
| CPU | 16 Xeon vCPU | 32 Xeon vCPU |
| RAM | 104 GB | 120 GB |
| Storage | 4 x 375 GB Local SSD | 4 x 375 GB Local SSD |
| Monthly Price (single node) | $843.60 | $1,146.10 |
| Monthly Price (3-node replica set) | $2,530.76 (estimate) | $3,438.30 (estimate) |
| Monthly Price Difference | 44% | 24% |
| Annual Savings vs. On-Premise | $24,530.88 | $13,640.40 |
The cost breakdown above shows the pricing for a single node and for a 3-node replica set, which is a typical production deployment of MongoDB as stated above. We selected Local SSD for the storage layer in order to support the IOPS required for the throughput metrics achieved in the benchmark reports. As shown in this disk type comparison, Local SSD can support up to 280,000 write IOPS per instance. Local SSD is ephemeral storage, meaning that its lifecycle is tied to the virtual machine to which it is mounted, which is another reason why we chose to estimate pricing for the highly available MongoDB 3-node replica set option. Finally, the prices shown above include Google Cloud Platform sustained use discounts, which total about a 30% discount over the course of the month.
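As a rough illustration of where a discount of about 30% can come from, the sketch below assumes a tiered schedule in which each successive quarter of the month is billed at 100%, 80%, 60%, and 40% of the base rate. That schedule is an assumption that is not stated in this post, and the names `TIERS` and `effective_rate` are hypothetical; check the sustained use discount documentation for the rates that apply to your instances.

```python
from fractions import Fraction

# Assumed tier schedule (not stated in this post): each quarter of the month
# is billed at a decreasing fraction of the base rate.
TIERS = [Fraction(100, 100), Fraction(80, 100), Fraction(60, 100), Fraction(40, 100)]


def effective_rate(usage_fraction: Fraction) -> Fraction:
    """Average fraction of the base rate paid when an instance runs for
    `usage_fraction` of the month (0 < usage_fraction <= 1)."""
    if not 0 < usage_fraction <= 1:
        raise ValueError("usage_fraction must be in (0, 1]")
    tier_width = Fraction(1, len(TIERS))
    billed = Fraction(0)
    remaining = usage_fraction
    for multiplier in TIERS:
        used = min(remaining, tier_width)  # portion of usage falling in this tier
        billed += used * multiplier
        remaining -= used
        if remaining == 0:
            break
    return billed / usage_fraction


full_month_discount = 1 - effective_rate(Fraction(1))
half_month_discount = 1 - effective_rate(Fraction(1, 2))
print(f"full month: {float(full_month_discount):.0%}, "
      f"half month: {float(half_month_discount):.0%}")
```

Under these assumed tiers, an instance that runs all month pays 70% of the base rate on average, while one that runs for only half the month sees a much smaller discount. That is worth keeping in mind when estimating costs for short-lived test clusters.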
The pricing for Cassandra is pretty similar to MongoDB. They both benefit from Local SSD in terms of performance. And the trade-off between more memory (n1-highmem-16) and more compute (n1-standard-32) is the type of choice that DBAs will have to make when designing a typical Cassandra cluster. Of course, this is just guidance on pricing to get you started; you won't know what's best for your application until you actually run tests yourself.
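The savings figures in the comparison above can be reproduced with a few lines of arithmetic. The sketch below uses only the 3-node monthly prices quoted in this post; the function and variable names are illustrative. The computed percentage differences come out slightly higher than the whole-number values shown in the table, which appear to have been truncated.

```python
from decimal import Decimal

# 3-node replica set monthly prices quoted in this post.
ON_PREM_REPLICA_SET = Decimal("4575.00")  # bare metal estimate
CLOUD_REPLICA_SETS = {
    "n1-highmem-16": Decimal("2530.76"),
    "n1-standard-32": Decimal("3438.30"),
}


def annual_savings(on_prem_monthly: Decimal, cloud_monthly: Decimal) -> Decimal:
    """Twelve months of the monthly price difference."""
    return (on_prem_monthly - cloud_monthly) * 12


def percent_difference(on_prem_monthly: Decimal, cloud_monthly: Decimal) -> Decimal:
    """Monthly difference as a percentage of the on-prem price."""
    return (on_prem_monthly - cloud_monthly) / on_prem_monthly * 100


for name, monthly in CLOUD_REPLICA_SETS.items():
    savings = annual_savings(ON_PREM_REPLICA_SET, monthly)
    difference = percent_difference(ON_PREM_REPLICA_SET, monthly)
    print(f"{name}: ${savings:,.2f} per year, {difference:.1f}% lower per month")
```

Swapping in your own on-prem quote or a different instance type is a quick way to see how sensitive the comparison is to either side of the estimate.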
Running Your Own Tests
As with any benchmarks, your mileage may vary when testing your particular workloads. Isolated tests run during benchmarks don’t always equate to real world performance so it is important that you run your own tests and assess read-write performance for a workload that closely matches your usage. Take a look at PerfKit and use it to profile your own proposed deployments, including mixing and matching workloads or worker counts.
Pricing NoSQL workloads can be somewhat challenging but hopefully we’ve given you a way to get started in estimating your costs. If you’re interested in learning more about compute and storage on Google Cloud Platform, check out Google Compute Engine or take a look at the documentation. Feedback is always welcome so if you’ve got comments or questions, don’t hesitate to let us know in the comments.
We’ve gotten a lot of great feedback about this post, and we wanted to let you know that we will also be posting about cloud pricing for Google Cloud Platform's managed NoSQL options in the near future. In forthcoming blog posts, we’ll talk about how to understand the pricing around Google Cloud Bigtable and Google Cloud Datastore and compare those to other popular managed offerings. Thanks for the questions and comments, keep ‘em coming!
- Posted by Sandeep Parikh and Peter-Mark Verwoerd, Solutions Architects
* - Price was taken from a configure-to-order bare metal server at Softlayer
** - Configuration was unavailable to estimate the monthly price
Google Cloud Storage now available through VMware vCloud Air
Monday, August 31, 2015
Earlier this year, we teamed up with VMware to offer enterprise grade Google Cloud Platform services to VMware customers through VMware vCloud Air. Today we are excited to announce that vCloud Air Object Storage Service, powered by Google Cloud Platform, is generally available to all customers.
With the availability of Google Cloud Storage through vCloud Air, VMware customers will have access to a durable and highly available object storage service powered by Google Cloud Platform.
Google Cloud Storage enables enterprises to store data on Google's infrastructure with very high reliability, performance and availability. It provides a simple HTTP-based API accessible from applications written in any modern programming language, which enables customers to take advantage of Google's own reliable and fast networking infrastructure to perform data operations in a cost effective manner. When you need to expand, you benefit from the scalability provided by Google's infrastructure.
VMware customers will have access to all three classes of object storage offered by Google:
Standard storage offers our highest performance storage, with very high availability.
Durable Reduced Availability storage provides a lower cost option for data that doesn’t require immediate and uninterrupted access. Cost savings are made by reducing replicas. Durable Reduced Availability storage offers the same durability as Standard storage.
Nearline storage, our newest storage service, offers customers a simple, low-cost, fast-response storage service with quick data backup and access, for storage charges of 1 cent per GB of data per month.
Today’s announcement marks the launch of the first of many Google Cloud Platform services that will be offered to VMware customers through vCloud Air. We’re excited to extend Google Cloud Platform to the VMware vCloud Air customer base.
To learn more, contact your VMware sales team or Google Cloud Platform Sales.
- Posted by Adam Massey - Director, Global Partner Business
Help us build a better Google Cloud Platform
Friday, August 28, 2015
Google Cloud Platform improves as a result of extensive collaboration, including collaboration with users. In particular, user research studies help us improve our cloud platform by allowing us to get feedback directly from cloud and IT administrators around the world.
We’d like to invite you today to join our growing pool of critical contributors. Simply fill out our form and we’ll get in touch as user research study opportunities arise.
During a study, we may present you with and gather your feedback on Google Cloud Platform, a new feature we’re developing, or even prototypes. We may also interview you about particular daily habits or ask you to keep a log of certain activity types over a given period of time. Study sessions can happen at a Google office, in your home or business, or online through your computer or mobile device:
Usability study at a Google office: for those that live local to one of our offices. Typically, you’ll come visit us and meet 1-on-1 with a Google researcher. They’ll ask you some questions, have you use a product, and then gather your feedback on it. The product could be something you’re rather familiar with or some never-before-seen prototype.
Remote usability study: Rather than have you visit our offices, a Google researcher will harness the power of the Internet to conduct the study. Basically, they’ll call you on the phone and set up a screen sharing session with you on your own computer. You can be almost anywhere in the world, but need to have a high-speed Internet connection.
Field study: Google researchers hit the road and come visit you. We won't just show up at your door though – we’ll always check in with you first, talk to you about the details of the study and make a proper appointment.
Experiential sampling study: These studies would require a small amount of activity every day over the course of several days or weeks. Google researchers will ask you to respond to questions about a product, or make entries in a diary document about your use of a product, using your mobile phone, tablet, or laptop to complete the study questions or activities.
After the study, you'll receive a token of our appreciation for your cooperation, such as a gift card. Sharing your experiences with us helps inform our product planning and moves us closer to our goal of building a cloud platform that you'll love.
More questions? Check out our FAQs page to learn more about our user research studies.
- Posted by Google UX Research Infrastructure Team
Stress Testing with Energyworx
Friday, August 28, 2015
Founded in 2012, Energyworx offers big data aggregation and analytics cloud-software services for the energy and utilities industry. Their products and services include grid optimization and reliability, meter-data management, consumer engagement, energy trading and environmental-impact reduction. They are based in the Netherlands. To learn more, visit www.energyworx.org.
Getting all cloudy gives you a tremendous amount: agility, scalability, cost savings and more. The scales weigh heavily in favor of embracing cloud goodness. However, on the other side of that scale, getting all cloudy means giving up a degree of control. You don’t control the infrastructure and, in certain cases, you don’t know the implementation behind APIs you rely on. This is especially true of managed services such as databases and message queues, and those APIs and associated SLAs are central to the operation of your systems. There’s nothing surprising, bad or wrong about this situation; as stated previously, there are far more pros than cons with the cloud. But as engineers whose reputations (and need for a night’s sleep uninterrupted by a 3am wake-up call) rely on the stability and scalability of the systems we build, what do we do? We follow the age-old maxim, trust but verify, and verify by testing!
Testing comes in many forms but broadly there are two types: functional and stress testing. Functional tests check for correctness. When I register for your service, does my email address get encrypted and correctly persisted? Stress tests check for robustness. Does your service handle 100,000 users registering in the fifteen minutes after it’s mentioned in the news? As an aside, I was tempted as I wrote this post to phrase everything in terms of “we all know this…” and “of course we all do that…” when it comes to testing, because we do all know it’s a good thing to do and we all do it to one extent or another. But the number of scalability issues good engineers face is proof that the importance of stress testing isn’t a universally held truth, or at least not a universally practiced one. The remainder of this post focuses on a set of best practices we distilled from a stress testing exercise we did in Google Cloud Platform with Energyworx as part of their go-live.
Energyworx and Google Cloud Platform leveraged existing Energyworx REST APIs together with Grinder to stress test the system. Grinder allows the calls to the REST APIs to be scaled up and down as required depending on the type and degree of stress to be applied. Test scenarios were based around scaling the number of smart meters uploading data, the amount of work performed by the meters and the physical locations of the meters. For example, we knew a single meter worked correctly, so let’s try several hundred thousand meters working at the same time, or let’s have meters running in Europe access the system in the US, or let’s have thousands of meters do an end-of-day upload at the same time. Following these best practices, Energyworx ran extended 200-core tests for approximately $10 a time and proved that their system was ready for millions of meters flooding the grid daily with billions of values. We were right, and the Energyworx launch went off without a hitch. Stress testing is a blast…
First best practice is to leverage Google Cloud Platform to provide the resources to stress test. To simulate hundreds of thousands of smart meters (or users, or game sessions, or other stimuli) takes resources and Google Cloud Platform allows you to spin these up on demand, in very little time and pay by the minute for them. That’s a great deal for stress testing.
Second best practice is that systems are often complex, with different tiers and services interacting and it can be tough to predict how they will behave under stress, so use stress testing to probe the behavior of your system and the infrastructure and services your system relies upon. Be creative with your scenarios and you’ll learn a lot about your system’s behavior.
Third best practice is that you should test the rate of change of the load you apply as well as the maximum load. What that means is that it’s great to know your system can handle a load of 100K transactions per second but it’s still not a useful system if it can only handle these in batches of 10K increases each minute for 10 minutes when a single news article from the right expert can bring you that much traffic in the web equivalent of the blink of an eye.
Fourth best practice is that you should test regularly. If you release each Friday and bugfix on demand, you don’t need to stress test every time you release but you should stress test the entire system every 2-4 weeks to ensure that performance is not degrading over time.
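The third practice, testing the rate of change of the load as well as its peak, can be sketched as two load profiles that reach the same maximum. This is a minimal illustration rather than the harness used with Energyworx: the helpers `stepped_ramp`, `spike`, and `max_increase_per_minute` are hypothetical names, and wiring the profiles into Grinder or another load tool is left out.

```python
def stepped_ramp(peak_tps: int, steps: int) -> list[int]:
    """Target load per minute when the load rises in equal steps up to the peak."""
    if steps <= 0 or peak_tps % steps:
        raise ValueError("peak_tps must divide evenly into a positive number of steps")
    step = peak_tps // steps
    return [step * (i + 1) for i in range(steps)]


def spike(peak_tps: int, minutes: int) -> list[int]:
    """Target load per minute when the full peak arrives in the first minute."""
    return [peak_tps] * minutes


def max_increase_per_minute(profile: list[int]) -> int:
    """Largest minute-over-minute jump, counting the jump from idle to the first value."""
    previous = 0
    largest = 0
    for target in profile:
        largest = max(largest, target - previous)
        previous = target
    return largest


# Both profiles peak at 100K transactions per second over 10 minutes.
gradual = stepped_ramp(100_000, steps=10)
sudden = spike(100_000, minutes=10)
print("gradual:", max(gradual), "peak,", max_increase_per_minute(gradual), "max jump")
print("sudden: ", max(sudden), "peak,", max_increase_per_minute(sudden), "max jump")
```

A system that passes the gradual profile has only been shown to absorb 10K of new load per minute. The sudden profile is the one that resembles traffic arriving from a single news article.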
- Posted by Corrie Elston, Solutions Architect
Reselling Option now available for Google Cloud Platform Partners
Wednesday, August 26, 2015
From bringing people together at the World Cup, to improving the way employees talk to each other, Google Cloud Platform Services Partners help customers unlock the full potential of our products.
To help our partners focus more on their customers’ experiences, we are pleased to announce that we’re now accepting applications for a reselling option from eligible, existing Google Cloud Platform services partners and we anticipate expanding to new partner program applicants in early fall.
As a reseller of Google Cloud Platform, partners will be able to provision and manage their customers via the new Cloud Platform reseller console. Google Cloud Platform resellers will:
Fully manage their customers’ Google Cloud Platform experience, from onboarding through implementation
Provide the first line of support and be responsible for customer problem resolution
Provide customers with a billing service that matches their specific requirements and in local currency
The ability to resell will be especially beneficial to partners aiming to bundle multiple Cloud Platform services and present one consolidated bill to their customers.
“The reseller console showcases deep insights into our customers' engagement with the platform, allowing us to make informed recommendations in terms of best practices and opportunities available to our customers. As a trusted solutions partner, it's paramount for us to provide white glove services to make their transition to the cloud as seamless as possible.”
-- Tony Safoian, Sada Systems CEO
If you’re an existing services partner and want to learn more about your organization's eligibility for reselling, visit our application page on Google for Work Connect. And if you’re new to Google Cloud Platform and interested in becoming a services partner, visit our site at cloud.google.com/partners.
- Posted by Adam Massey - Director, Global Partner Business