Google Cloud Platform Blog
Cloud Spin, Part 3: processing video using Google Cloud Platform services
Friday, October 2, 2015
Cloud Spin,
Part 1
and
Part 2
introduced the Google Cloud Spin project, an exciting demo built for
Google Cloud Platform Next
, and how we built the mobile applications that orchestrated 19 Android phones to record simultaneous video.
And now the last step is to retrieve the videos from each phone, find the frame corresponding to an audio cue in each video, and compile those images into a 180-degree animated GIF. This post explains the design decisions we made for the Cloud Spin back-end processing and how we built it.
The following figure shows a high level view of the back-end design:
1. A mobile app running on each phone uploads the raw video to a
Google Cloud Storage
bucket.
Video taken by one of the cameras
2. A
n
extractor
process running on a
Google Compute Engine
instance finds and extracts the single frame corresponding to the audio cue.
3. A
stitcher
process running on an
App Engine Managed VM
combines the individual frames into a video that pans across an 180-degree view of an instant in time, and then generates a corresponding animated GIF.
How we built the Cloud Spin back-end services
After we’d designed what the back end services would do, there were several challenges to solve as we built them. We had to figure how to:
Store large amounts of video, frame, and GIF data
Extract the frame in each video that corresponds to the audio cue
Merge frames into an animated GIF
Make the video processing run quickly
Storing video, frame, and GIF data
We decided the best place to store incoming raw videos, extracted frames, and the resulting animated GIFs was
Google Cloud Storage
.
I
t’s easy to use, integrates well with mobile devices, provides strong consistency, and automatically scales to handle large amounts of traffic, should our demo become popular.
We also configured the Cloud Storage buckets with
Object Change Notifications
that kicked off the back-end video processing when the Recording app uploaded new video from the phones.
Extracting a frame that corresponds to an audio cue
Finding the frame corresponding to the audio cue or beep poses challenges. Audio and video are recorded with different qualities and at different sample rates, so it takes some work to match them up.. We needed to find the frame that matched the noisiest section of the audio.To do so, we grouped the audio frames into frame intervals, each interval containing the audio that roughly corresponded to a single video frame. We computed the average noise of each interval by calculating the average of the squared amplitude of the samples. Once we identified the interval with the largest average noise, we extracted the corresponding video frame as a PNG file.
We wrote the extractor process in Python and used
MoviePy
, a module for video editing that uses the
FFmpeg
framework to handle video encoding and decoding.
Merging frames into an animated GIF
The process of generating an animated GIF from a set of video frames can be done with only four FFmpeg commands, run by a Bash script. First we generate a video by stitching all the frames together in order, extract a color palette, and then use it to generate a lower-resolution GIF to upload to Twitter.
Making the video processing run quickly
Processing the videos one-by-one on a single machine would take longer than we wanted. Next participants to have to wait to see their animated GIF. Cloud Spin takes 19 videos for each demo, one from each phone in the 180-degree arc.If extracting the synchronized frame from each video takes 5 seconds, and merging the frames takes 10 seconds, with serial processing the time between taking the demo shot and the final animated video would be (19 * 5s + 10s = 110s), almost two minutes!
We can make the process faster by parallelizing the frame extraction. If we use 19 virtual machines, one to process each video, the time between the demo shot and the animated GIF is only 15 seconds. To make this improvement work, we had to modify our design to handle synchronization of multiple machines.
Parallelizing the workload
We developed the extraction and stitching process as independent applications. This made it easy to parallelize the frame extraction. We can run 19 extractors and one stitcher, each as a
Docker
container on
Google Compute Engine
.
But how do we make sure that each video is processed by one, and only one, extractor?
Google Cloud Pub/Sub
is
a messaging system that solves this problem in a performant and scalable way. Using Cloud Pub/Sub, we can create a communication channel that is loosely coupled across subscribers.This means that the extractor and stitcher applications interact through Cloud Pub/Sub, with no assumptions about the underlying implementation of either application. This makes future evolutions of the infrastructure easier to implement.
The preceding diagram shows two Cloud Pub/Sub topics that act as processing queues for the extractor and the stitcher applications. Each time a mobile app uploads a new video, Cloud Pub/Sub publishes a message on the videos topic. The extractors subscribe to the videos topic. When a new message is published, the first extractor to pull it down has a lease on the message during which it processes the video in order to extract the frame corresponding to the audio cue. If the processing completes successfully, the extractor acknowledges the videos message to Cloud Pub/Sub, which causes Cloud Pub/Sub to publish a new message to the frames topic. If the extractor process fails, the lease on the videos message expires and Cloud Pub/Sub republishes the message, where it can be handled by another extractor.
When a message is published on the frames topic the stitcher pulls it down and waits until all of the frames of a single session are ready to be stitched together into an animated GIF. In order for the stitcher application to detect when it has all the frames, it needs a way to check the real-time status of all of the frames in a session.
Managing frame status with Firebase
Part 2 discussed how we managed orchestrating the phones to take simultaneous video using a
Firebase
database that provides real-time synchronization.
We also used Firebase to track the process of extracting the frame from each camera that corresponds to the audio cue. To do so, we added a status field to each extracted frame in the session as shown in the following screenshot.
When the Android phone takes the video it sets this status to RECORDING and then to UPLOADING when it uploads the video to Cloud Storage. The extractor process sets the status to READY when the frame matching the audio cue has been extracted. When all of the frames in a session are set to READY the stitcher process combines the extracted frames into an animated GIF and stores the GIF in Cloud Storage and its path on Firebase.
Having the status stored in Firebase made it possible for us to create a dashboard that showed, in real time, each step of the processing of the video taken by the Android phones and the resulting animated GIF.
We finished development of the Cloud Spin backend in time for the
Google Cloud Platform Next
events, and together with the mobile apps that captured the video, ran a successful demo.
You can find more Cloud Spin animations on our Twitter feed:
@googlecloudspin
. We plan to release the complete demo code. Watch
Cloudspin on GitHub
for updates.
This is the final post in the three-part series on Google Cloud Spin. I hope you had as much fun discovering this demo as we did building it. Our goal was to demonstrate the possibilities of Google Cloud Platform in a fun way, and inspire you build something awesome!
-
Posted by Francesc Campoy Flores, Google Cloud Platform
No comments :
Post a Comment
Don't Miss Next '17
Use promo code NEXT1720 to save $300 off general admission
REGISTER NOW
Free Trial
GCP Blogs
Big Data & Machine Learning
Kubernetes
GCP Japan Blog
Labels
Announcements
56
Big Data & Machine Learning
91
Compute
156
Containers & Kubernetes
36
CRE
7
Customers
90
Developer Tools & Insights
80
Events
34
Infrastructure
24
Management Tools
39
Networking
18
Open Source
105
Partners
63
Pricing
24
Security & Identity
23
Solutions
16
Stackdriver
19
Storage & Databases
111
Weekly Roundups
16
Archive
2017
Feb
Jan
2016
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2015
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2014
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2013
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2012
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2011
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2010
Dec
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2009
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
2008
Dec
Nov
Oct
Sep
Aug
Jul
Jun
May
Apr
Feed
Subscribe by email
Technical questions? Check us out on
Stack Overflow
.
Subscribe to
our monthly newsletter
.
Google
on
Follow @googlecloud
Follow
Follow
No comments :
Post a Comment