Google Cloud Platform Blog
Cracking the GitHub code: this week on Google Cloud Platform
Friday, July 15, 2016
Posted by Alex Barrett, Editor, Google Cloud Platform Blog
It’s been a couple of weeks since GitHub announced that it was making 3+ TB of its open source library available on BigQuery, and the Google Cloud Platform community has been busy ever since.
Google Developer Advocate Felipe Hoffa showed the world how it was done in “GitHub on BigQuery: Analyze all the open source code,” and fellow DA Francesc Campoy followed suit with a post analyzing GitHub Go packages. Along the way, he discovered that he could create even more nuanced queries by using BigQuery User Defined Functions.
Then, one of Google’s newest DAs, Guillaume Laforge, informs us that there are 743,070 Groovy files on GitHub with 16,464,376 lines of code, while CloudFlare’s Filippo Valsorda (the “Heartbleed guy”) analyzes how the Go ecosystem “does vendoring.”
Meanwhile, over on Medium, Google program manager for big data and machine learning Lak Lakshmanan uses BigQuery to discover which popular Java projects need the most help by searching for tagged comments such as FIXME and TODO. The post also shows how to use Google Cloud Dataflow to build a pipeline that starts from BigQuery and processes the data step by step in Java.
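To make the idea of ranking projects by tagged comments concrete, here is a minimal local sketch in Python. It is not the query or pipeline from Lak's post: the function names, the sample data and the (project, file content) input shape are all assumptions made for illustration, and it simply counts FIXME and TODO markers in strings held in memory rather than in BigQuery.

```python
import re
from collections import Counter

# The post names only FIXME and TODO; any other marker would be an assumption.
# Note: this matches the markers anywhere in the text, not only inside comments.
TAG_PATTERN = re.compile(r"\b(FIXME|TODO)\b")


def count_tagged_comments(files):
    """Count FIXME/TODO markers per project.

    `files` is an iterable of (project_name, file_content) pairs,
    a hypothetical stand-in for rows read from a table of source files.
    """
    counts = Counter()
    for project, content in files:
        counts[project] += len(TAG_PATTERN.findall(content))
    return counts


def rank_projects(files):
    """Return (project, marker_count) pairs, highest count first."""
    return count_tagged_comments(files).most_common()


# Made-up sample data, for illustration only.
sample = [
    ("alpha", "// TODO: handle nulls\nint x = 0; // FIXME overflow\n"),
    ("beta", "// nothing to see here\n"),
    ("alpha", "/* TODO refactor */\n"),
]
ranking = rank_projects(sample)
print(ranking)
```

Projects at the top of the ranking carry the most open markers, which is the rough signal the post uses for "needs the most help."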
Or check out Robert Kozikowski’s blog for a treasure trove of GitHub data analysis: posts on visualizing relationships between Python packages, and on the top pandas, NumPy and SciPy functions, Emacs packages and Angular directives.
And if that’s still not enough BigQuery on GitHub for you, here’s a Changelog podcast on the topic for your drive home!