Google Cloud Platform Blog
Stress Testing with Energyworx
Friday, August 28, 2015
Founded in 2012, Energyworx offers big data aggregation and analytics cloud-software services for the energy and utilities industry. Their products and services include grid optimization and reliability, meter-data management, consumer engagement, energy trading and environmental-impact reduction. They are based in the Netherlands. To learn more, visit www.energyworx.org.
Getting all cloudy gives you a tremendous amount: agility, scalability, cost savings and more. The scales weigh heavily in favor of embracing cloud goodness. However, on the other side of that scale, getting all cloudy means giving up a degree of control. You don’t control the infrastructure and, in certain cases, you don’t know the implementation behind APIs you rely on. This is especially true of managed services such as databases and message queues, and those APIs and their associated SLAs are central to the operation of your systems. There’s nothing surprising, bad or wrong about this situation; as stated previously, there are far more pros than cons with the cloud. But as engineers whose reputations (and whose nights of sleep uninterrupted by a 3am wake-up call) rely on the stability and scalability of the systems we build, what do we do? We follow the age-old maxim, trust but verify, and verify by testing!
Testing comes in many forms, but broadly there are two types: functional testing and stress testing. Functional tests check for correctness. When I register for your service, does my email address get encrypted and correctly persisted? Stress tests check for robustness. Does your service handle 100,000 users registering in the fifteen minutes after it’s mentioned in the news? As an aside, I was tempted as I wrote this post to phrase everything in terms of “we all know this…” and “of course we all do that…” when it comes to testing, because we do all know it’s a good thing to do and we all do it to one extent or another. But the number of scalability issues good engineers face is proof that the importance of stress testing isn’t a universally held truth, or at least not a universally practiced one. The remainder of this post focuses on a set of best practices we distilled from a stress testing exercise we did in Google Cloud Platform with Energyworx as part of their go-live.
Energyworx and Google Cloud Platform leveraged existing Energyworx REST APIs together with Grinder to stress test the system. Grinder allows the calls to the REST APIs to be scaled up and down as required, depending on the type and degree of stress to be applied. Test scenarios were based around scaling the number of smart meters uploading data, the amount of work performed by the meters and the physical locations of the meters. For example, we knew a single meter worked correctly, so let’s try several hundred thousand meters working at the same time, or let’s have meters running in Europe access the system in the US, or let’s have thousands of meters do an end-of-day upload at the same time. Following the best practices described below, Energyworx ran extended 200-core tests for approximately $10 a time and proved that their system was ready for millions of meters flooding the grid daily with billions of values. We were right, and the Energyworx launch went off without a hitch. Stress testing is a blast…
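To make the "many meters at once" scenario concrete, here is a minimal Python sketch of the idea. It is not the Grinder setup used in the exercise, and everything in it is an assumption for illustration: `upload_reading` is a hypothetical stand-in for one REST call, and the meter counts and worker count are arbitrary.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def upload_reading(meter_id, value):
    """Hypothetical stand-in for one REST call; swap in a real HTTP request."""
    time.sleep(0.001)  # pretend network latency
    return 200         # pretend HTTP status code


def run_scenario(num_meters, readings_per_meter, max_workers=50):
    """Simulate many meters uploading concurrently.

    Returns a flat list of (status, latency_seconds) tuples, one per call.
    """
    def one_meter(meter_id):
        results = []
        for _ in range(readings_per_meter):
            start = time.perf_counter()
            status = upload_reading(meter_id, random.random())
            results.append((status, time.perf_counter() - start))
        return results

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_meter = list(pool.map(one_meter, range(num_meters)))
    return [result for meter in per_meter for result in meter]


if __name__ == "__main__":
    results = run_scenario(num_meters=200, readings_per_meter=3)
    failures = sum(1 for status, _ in results if status != 200)
    print(len(results), "calls,", failures, "failures")
```

Scaling `num_meters` and `readings_per_meter` up and down mirrors the first two scenario dimensions above. Meter location would have to come from where the load generators themselves run, which this sketch does not cover.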
The first best practice is to leverage Google Cloud Platform to provide the resources to stress test. Simulating hundreds of thousands of smart meters (or users, or game sessions, or other stimuli) takes resources, and Google Cloud Platform allows you to spin these up on demand, in very little time, and pay by the minute for them. That’s a great deal for stress testing.
The second best practice is to use stress testing to probe the behavior of your system and of the infrastructure and services it relies upon. Systems are often complex, with different tiers and services interacting, and it can be tough to predict how they will behave under stress. Be creative with your scenarios and you’ll learn a lot about your system’s behavior.
The third best practice is to test the rate of change of the load you apply as well as the maximum load. It’s great to know your system can handle a load of 100K transactions per second, but it’s still not a useful system if it can only reach that load in increases of 10K each minute for 10 minutes, when a single news article from the right expert can bring you that much traffic in the web equivalent of the blink of an eye.
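The difference between those two shapes of load is easy to see when written down as schedules. The sketch below is illustrative only: the function names and the `(start_second, target_tps)` representation are assumptions, not part of any tool mentioned in this post.

```python
def ramp_profile(peak_tps, steps, step_seconds):
    """Target rate rises in equal increments, one increment per step."""
    increment = peak_tps // steps
    return [(i * step_seconds, increment * (i + 1)) for i in range(steps)]


def spike_profile(peak_tps):
    """Target rate jumps straight to the peak at time zero."""
    return [(0, peak_tps)]


def largest_jump(profile):
    """Biggest single increase in target rate, starting from an idle system."""
    previous, largest = 0, 0
    for _, tps in profile:
        largest = max(largest, tps - previous)
        previous = tps
    return largest


if __name__ == "__main__":
    ramp = ramp_profile(peak_tps=100_000, steps=10, step_seconds=60)
    spike = spike_profile(peak_tps=100_000)
    print(largest_jump(ramp))   # 10000
    print(largest_jump(spike))  # 100000
```

Both profiles end at the same maximum, but the spike asks the system to absorb ten times the jump in one go. A stress test that only ever runs the ramp says nothing about how the system copes with the spike.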
The fourth best practice is to test regularly. If you release each Friday and bugfix on demand, you don’t need to stress test every time you release, but you should stress test the entire system every 2-4 weeks to ensure that performance is not degrading over time.
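One way to turn "not degrading over time" into something checkable is to compare each run against a stored baseline. The following is a rough sketch under stated assumptions: the choice of the 95th percentile, the 10% tolerance and the function names are all illustrative, not something prescribed by the exercise described here.

```python
def percentile(samples, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def has_degraded(baseline, current, pct=95, tolerance=0.10):
    """True if the current run's percentile latency exceeds the baseline's
    by more than the given tolerance (0.10 means 10%)."""
    return percentile(current, pct) > percentile(baseline, pct) * (1 + tolerance)


if __name__ == "__main__":
    baseline_ms = list(range(1, 101))          # latencies from an earlier run
    slower_ms = [x * 1.2 for x in baseline_ms]  # same shape, 20% slower
    print(has_degraded(baseline_ms, baseline_ms))  # False
    print(has_degraded(baseline_ms, slower_ms))    # True
```

Keeping the latencies from each 2-4 week run makes this comparison cheap, and a failing check is a prompt to look at what changed since the last run.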
- Posted by Corrie Elston, Solutions Architect