Google Cloud Platform Blog
Leverage Google Cloud Platform features for your Hadoop deployments
Thursday, February 27, 2014
Accenture recently
demonstrated
Google Compute Engine as an excellent place to run
Hadoop
for performance and price. They demonstrated both familiar HDFS-based deployments as well as performance and cost advantages of using the Google Cloud Storage
connector
.
If you use
Apache Hive
with Hadoop, we want to show you another great feature of the Google Cloud Platform that will save you time and money. The default Hive installation uses a local single-session database as the Hive Metastore to store Hive table and column names, column types, and so on. This local database is fine for single-user/single-session usage, but to use Hive with multiple concurrent connections, you need to install and maintain a relational database such as MySQL.
Google Cloud SQL
is managed MySQL in the cloud, which
now supports the MySQL wire protocol
, so you can connect to Google Cloud SQL using common tools from anywhere. Why not use it to run your MySQL database in the cloud too, and save yourself the time and cost of having to install and maintain MySQL? Let Google take care of it for you.
We provide a
sample application
to help you get started running Apache Hive and
Pig
on Google Compute Engine. The
README
includes detailed instructions for using Cloud SQL for the Hive Metastore.
The rest of this blog post hits the high points for what you need to do to use Google Cloud SQL for the Hive Metastore.
Create a Google Cloud SQL instance
In the
Google Cloud Console
, create a
Google Cloud SQL instance
. Make sure to:
Choose the Compute Engine zone of the Compute Engine instance running Hive as the preferred location for the Cloud SQL instance.
Select "Assign an IP Address" to the Cloud SQL instance.
Add the external IP address of the Compute Engine instance as an Authorized IP Address. Be sure to append "/32" to the IP address entered in this field.
Define a root password for MySQL (it's good practice).
Install MySQL client and MySQL JDBC Driver
Connect to the Google Compute Engine instance that is running Hive and install both the MySQL client and the MySQL JDBC driver. Add the JDBC driver JAR file to Hive's CLASSPATH. The simplest method is to copy the file to the hive/lib/directory.
Create the hivemeta database and the user to access it
Use the mysql command line tool to:
Connect to the Google Cloud SQL instance
Create a database called hivemeta (Hive requires the database to use latin1 character encoding, which you'll need to specify explicitly).
Create the database user (for example hdpuser).
Issue grants on the hivemeta database to the database user.
Configure Hive to use Cloud SQL
Update your hive/conf/hive-site.xml file to connect to the Google Cloud SQL database. Add the following configuration, replacing cloud-sql-ip-addr with the assigned IP address of the Cloud SQL instance, and replacing password with the hdpuser database user password:
Hive is now configured to use the Google Cloud SQL database as its metastore; there's no need for you to maintain a MySQL installation.
Other software that uses MySQL can similarly use Google Cloud SQL. Since Cloud SQL is MySQL, you won't need to migrate code - just establish connectivity and you are off and running.
-Posted by Matt Bookman, Cloud Solutions Architect
No comments :
Post a Comment
Don't Miss Next '17
Use promo code NEXT1720 to save $300 off general admission
REGISTER NOW
Free Trial
GCP Blogs
Big Data & Machine Learning
Kubernetes
GCP Japan Blog
Labels
Announcements
56
Big Data & Machine Learning
91
Compute
156
Containers & Kubernetes
36
CRE
7
Customers
90
Developer Tools & Insights
80
Events
34
Infrastructure
24
Management Tools
39
Networking
18
Open Source
105
Partners
63
Pricing
24
Security & Identity
23
Solutions
16
Stackdriver
19
Storage & Databases
111
Weekly Roundups
16
Archive
2017
Feb
Jan
2016
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2015
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2014
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2013
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2012
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2011
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2010
Dec
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2009
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2008
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Feed
Subscribe by email
Technical questions? Check us out on
Stack Overflow
.
Subscribe to
our monthly newsletter
.
Google
on
Follow @googlecloud
Follow
Follow
No comments :
Post a Comment