Google Compute Engine vs Amazon EC2 Part 2: Synthetic CPU and Memory Benchmarks

by Sean Murphy

Testing Assumptions

In the last article, I examined pricing and feature differentiation between Google Compute Engine and Amazon EC2 instance types. Now it is time to see whether that article’s key assumption, that Google Compute Engine Units are equivalent to Amazon EC2 Compute Units, is correct. The results may surprise you.

The Competitors

In the Google Compute Engine corner is the n1-standard-4, both with and without ephemeral storage. In the other, relatively crowded corner are three contenders from Amazon: the second-generation m3.xlarge, the classic m1.xlarge, and the hi1.4xlarge. Per the benchmarking software:

| Instance Type | n1-standard-4 | m3.xlarge | m1.xlarge | hi1.4xlarge |
|---|---|---|---|---|
| Provider | Google | Amazon | Amazon | Amazon |
| Cost per Hour | $0.48 | $0.58 | $0.52 | $3.10 |
| Processor | Intel Xeon | Intel Xeon E5-2670 | Intel Xeon E5645 | Intel Xeon E5620 |
| Speed | 2.60 GHz | 2.60 GHz | 2.00 GHz | 2.40 GHz |
| Number of Cores | 4 | 4 | 4 | 16 |
| RAM | 15,360 MB | 15,360 MB | 15,360 MB | 61,440 MB |
| Boot Disk | 11 GB | 8 GB | 8 GB | 8 GB |
| OS | GCEL 12.04 | Ubuntu 12.04 | Ubuntu 12.04 | Ubuntu 12.04 |
| Kernel | 2.6.39-gcg-201210301000 (x86_64) | 3.2.0-31-virtual (x86_64) | 3.2.0-31-virtual (x86_64) | 3.2.0-31-virtual (x86_64) |
| Compiler | GCC 4.6 | GCC 4.6 | GCC 4.6 | GCC 4.6 |
| File System | ext4 | ext4 | ext4 | ext4 |

Note that GCE instances use a Google-compiled and modified Linux kernel, but otherwise the distribution looks like Ubuntu 12.04. Also, all instances used identical Java versions:

java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.5) (6b24-1.11.5-0ubuntu1~12.04.1)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)

Benchmark Software

Two different benchmark suites were used, focusing on CPU and memory performance: components of the Phoronix Test Suite, and SciMark (both SciMark v2.1.1.1 and Java SciMark v2.0). SciMark consists of five computational kernels: FFT, Gauss-Seidel relaxation, sparse matrix multiply, Monte Carlo integration, and dense LU factorization.
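To give a feel for what one of these kernels does, here is a minimal Python sketch of a Monte Carlo integration kernel that estimates pi by sampling points in the unit square. It is an illustration only, not SciMark's actual code: the function names, the seed, the sample count, and the "samples per second" score are all assumptions made for this example.

```python
import random
import time


def monte_carlo_pi(num_samples: int, seed: int = 113) -> float:
    """Estimate pi from the fraction of random points in the unit square
    that fall inside the quarter circle of radius 1."""
    rng = random.Random(seed)  # fixed seed so repeated runs do the same work
    inside = 0
    for _ in range(num_samples):
        x = rng.random()
        y = rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / num_samples


def score(num_samples: int = 200_000) -> float:
    """Return a throughput-style score (samples per second); higher is better.

    This metric is a stand-in chosen for the sketch, not SciMark's own score.
    """
    start = time.perf_counter()
    monte_carlo_pi(num_samples)
    elapsed = time.perf_counter() - start
    return num_samples / elapsed


if __name__ == "__main__":
    print(f"pi estimate: {monte_carlo_pi(200_000):.4f}")
    print(f"samples/sec: {score():,.0f}")
```

A kernel like this is almost pure arithmetic with a tiny working set, so it mostly exercises the CPU rather than memory.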

CPU Benchmark Results

Both the Java and non-Java SciMark benchmarks tell a similar story (higher scores indicate better performance).  In all tests, the n1-standard-4 and the m3.xlarge top the charts, trading the performance crown back and forth by small margins.  The m1.xlarge trails the pack by a very significant margin.

[Figure: SciMark1]

[Figures: scimark2, scimark3]

Curiously, the GCE instance wins all Java SciMark 2.0 regular tests but the m3.xlarge wins the same suite of tests when larger data sizes, designed to exceed the CPU cache size, were used.

From the Phoronix Suite, three separate computationally intensive tests were chosen: the LAMMPS molecular dynamics simulation v1.0, the parallel BZIP2 compression 1.1.6, and a ray tracer (POV-Ray 3.6.1). Each test measured performance in seconds. To simplify comparison and visualization, all values were normalized by the longest run time for that test (universally the m1.xlarge). Thus, values are shown not as seconds but as percentages of that longest run time, and the lowest value is best.
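The normalization described above can be sketched in a few lines of Python. The function name is hypothetical and the run times are made-up placeholders for illustration, not the measured results from these tests.

```python
def normalize_by_slowest(run_times):
    """Express each run time as a percentage of the slowest (longest) run time.

    The slowest entry maps to 100.0; lower percentages are better.
    """
    slowest = max(run_times.values())
    return {name: 100.0 * seconds / slowest for name, seconds in run_times.items()}


# Made-up run times in seconds, for illustration only (not measured results).
example_times = {"instance_a": 30.0, "instance_b": 45.0, "instance_c": 60.0}
normalized = normalize_by_slowest(example_times)

for name, percent in normalized.items():
    print(f"{name}: {percent:.1f}% of the slowest run time")
```

Normalizing this way puts tests with very different absolute run times on one chart, at the cost of hiding the raw seconds.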

The n1-standard-4 edged out the m3.xlarge in both the POV-Ray and LAMMPS tests but was defeated by the m3.xlarge in the BZIP2 compression test. Not surprisingly, the hi1.4xlarge with 16 cores destroyed all comers in the parallel BZIP2 test.

[Figure: miscBenchmarks, normalized run times for the LAMMPS, parallel BZIP2, and POV-Ray tests]

Memory Benchmark Results

Memory benchmark results show that more robust metrics will be necessary to truly compare cloud computing capabilities. The line plot below shows memory speed benchmark results for the n1-standard-4 GCE instance and the m3.xlarge, m1.xlarge, and hi1.4xlarge Amazon instances.

[Figure: MemorySpeed, line plot of memory speed benchmark results]

Each benchmark was run four times on both the n1-standard-4 and the m3.xlarge, and this is where the real story lies. Notice that the GCE instance shows little performance variability across tests. In contrast, the m3.xlarge comes close to matching the GCE instance in most (but not all) tests but shows performance drops of up to 40%. It would seem that there is some validity to Google’s claims that GCE offers more consistent performance than competitors. Interestingly, this benchmark took the longest wall clock time to run of the synthetic tests.
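One way to put numbers on run-to-run consistency is to compute the spread across repeated runs of the same test. The sketch below is only an illustration: the function name, the choice of metrics (coefficient of variation, and worst run measured against the best run), and the throughput values are all assumptions, not the data behind the plot.

```python
import statistics


def variability(throughputs):
    """Summarize run-to-run variability for repeated runs of one benchmark.

    throughputs: list of scores where higher is better (e.g. MB/s).
    """
    mean = statistics.mean(throughputs)
    stdev = statistics.stdev(throughputs)  # sample standard deviation
    best = max(throughputs)
    worst = min(throughputs)
    return {
        "mean": mean,
        "cv_percent": 100.0 * stdev / mean,  # coefficient of variation
        "worst_drop_percent": 100.0 * (best - worst) / best,  # worst vs. best run
    }


# Made-up throughput values for four runs each, for illustration only.
steady_runs = [1000.0, 990.0, 1005.0, 995.0]
erratic_runs = [1000.0, 600.0, 980.0, 950.0]

print("steady: ", variability(steady_runs))
print("erratic:", variability(erratic_runs))
```

With only four runs per test, these statistics are rough; more repetitions spread over a longer period would give a more trustworthy picture.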

Conclusion

In terms of short-term number crunching, the m3.xlarge and the n1-standard-4 seem similarly capable, trading small wins across the numerous benchmarks. In terms of memory speed, a very different story emerges: the GCE instance holds a small but consistent lead in memory speed and a large margin of victory in consistency of performance. For lengthy processor-intensive tasks, this differential could be significant.

As neither of these services is free, let’s return to pricing as it would seem that not all compute units are the same. The GCE n1-standard-4 is either $0.48 per hour without ephemeral storage or $0.552 per hour with storage. In comparison, the m1.xlarge costs $0.520 per hour while the m3.xlarge costs $0.58 per hour and is only available without storage. Note that all prices were current as of 1/20/2013.

At these price points, the original m1.xlarge looks significantly overpriced. One must wonder when Amazon will either phase this option out or drastically alter its pricing. Even though the second-generation m3 Amazon instances were launched only on 11/1/2012, the story is similar. The comparable GCE instance offers approximately the same number-crunching performance and better and, more importantly, more consistent memory performance, at a discount of roughly 17% ($0.48 versus $0.58 per hour; equivalently, the m3.xlarge costs about 21% more).
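The size of the price gap depends on which price is taken as the baseline. A quick check in Python, using the hourly prices quoted above (the function and constant names are my own, purely illustrative):

```python
def percent_cheaper(price, reference):
    """How much cheaper `price` is than `reference`, as a percent of `reference`."""
    return 100.0 * (reference - price) / reference


def percent_more_expensive(price, reference):
    """How much more `price` costs than `reference`, as a percent of `reference`."""
    return 100.0 * (price - reference) / reference


# Hourly prices in USD as quoted in this article (current as of 1/20/2013).
GCE_N1_STANDARD_4 = 0.48  # without ephemeral storage
EC2_M3_XLARGE = 0.58
EC2_M1_XLARGE = 0.52

discount_vs_m3 = percent_cheaper(GCE_N1_STANDARD_4, EC2_M3_XLARGE)
premium_of_m3 = percent_more_expensive(EC2_M3_XLARGE, GCE_N1_STANDARD_4)
discount_vs_m1 = percent_cheaper(GCE_N1_STANDARD_4, EC2_M1_XLARGE)

print(f"n1-standard-4 is {discount_vs_m3:.1f}% cheaper than the m3.xlarge")
print(f"m3.xlarge costs {premium_of_m3:.1f}% more than the n1-standard-4")
print(f"n1-standard-4 is {discount_vs_m1:.1f}% cheaper than the m1.xlarge")
```

Measured against the m3.xlarge price the gap is about 17%, while measured against the GCE price it is about 21%, so a figure near 20% is a fair shorthand for either.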

The question that needs to be asked is what happens if computational performance is measured not just for a few seconds or minutes, but for hours or days at a time, a common situation in high performance computing and big data. Here is where I believe that Google may have a very significant advantage and I look forward to investigating this in my next article.

Notes

Anecdotally speaking, GCE instances are ready for use **much** faster than EC2 instances in my experience. The time difference was quite noticeable, but I did not bother to quantify this characteristic.


Sean Murphy

Senior Scientist and Data Science Consultant at JHU
Sean Patrick Murphy, with degrees in math, electrical engineering, and biomedical engineering and an MBA from Oxford, has served as a senior scientist at Johns Hopkins University for over a decade, advises several startups, and provides learning analytics consulting for EverFi. Previously, he served as the Chief Data Scientist at a series A funded health care analytics firm, and the Director of Research at a boutique graduate educational company. He has also cofounded a big data startup and Data Community DC, a 2,000 member organization of data professionals. Find him on LinkedIn and Twitter.


  • datacommunitydc

    Turns out this post was quite prophetic. From Amazon:

    Price reduction for Amazon EC2
    We are reducing Linux On Demand prices for First Generation Standard (M1) instances, Second Generation Standard (M3) instances, High Memory (M2) instances and High CPU (C1) instances in all regions. All prices are effective from February 1, 2013. These reductions vary by instance type and region, but typically average 10-20% price drops. For complete pricing details, please visit the Amazon EC2 pricing page.

  • http://www.facebook.com/ivan.balashov Ivan Balashov

    Such a difference in pricing is also influenced by the fact that the m3.xlarge does not support ephemeral disks, unlike the m1.xlarge.
