Mar 21, 2014
 

I continue to receive requests for OpenSourceMark source code. You can download the source code here: https://drive.google.com/folderview?id=0B_M9uysbCa–dEJoM01kZmgwNlk&usp=sharing

I have also made public a sophisticated benchmark harness that my company developed that can be used to create application level benchmarks like SysMark. You can access the source code here: https://github.com/van-smith/OPBM

I am also making public the miniBench source code: https://github.com/van-smith/miniBench

OpenSourceMark and miniBench have both been very effective at providing CPU performance insights for new product development, validation, flag mining and product positioning.

Nov 16, 2013
 

Internet shopping giant Amazon is currently selling the Seiki Digital 39-Inch 4K Ultra HD 120Hz LED TV for only $519.99 with free two day shipping if you are an Amazon Prime member. Additionally, the item qualifies for 6-month interest free financing. If you can bump up your order to over $600 with qualifying items (which seems to be almost anything from Amazon) then you can apply for 12-month interest free financing.

As a coder, the 3840 x 2160 resolution — “4k” because 3,840 is nearly 4,000 — offered by a 4K monitor can boost my productivity significantly since I can have many coding tool windows open and visible at one time without overlap. With smaller monitors, digging through open windows can take lots of time over the course of a day and each time I have to shuffle through windows is an opportunity to become distracted.

The 39-inch Seiki television monitor is limited to 30Hz refresh rates at 4k resolutions due to its HDMI 1.4 interface, but since it uses an LCD panel, flicker is reportedly not a problem. Its low refresh rate will reduce its attractiveness to hardcore gamers. $520 for a 4k monitor right now is almost unbelievably inexpensive. While 39″ might seem large for a desktop monitor, keep in mind that a single 4k monitor has the same number of pixels as four 1080p monitors.
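For anyone who wants to check the "four 1080p monitors" claim, here is a quick sketch of the arithmetic. The 1920 x 1080 figure for 1080p is the standard resolution rather than something quoted above, and the function name is just an illustration.

```python
# Pixel-count arithmetic for the resolutions discussed above.
UHD_4K = (3840, 2160)    # resolution quoted in the post
FULL_HD = (1920, 1080)   # standard 1080p resolution (assumed, not quoted in the post)

def pixel_count(resolution):
    """Return the total number of pixels for a (width, height) pair."""
    width, height = resolution
    return width * height

uhd_pixels = pixel_count(UHD_4K)
fhd_pixels = pixel_count(FULL_HD)
ratio = uhd_pixels / fhd_pixels

print(f"4K: {uhd_pixels:,} pixels, 1080p: {fhd_pixels:,} pixels, ratio: {ratio:.0f}x")
```

Both dimensions double, so the pixel count quadruples.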

My Seiki monitor will arrive next week and I will report my experience with it soon afterwards.

Jul 18, 2012
 

After announcing shipping delays, Google’s first tablet computer began arriving in the hands of owners this week and anecdotal reports suggest that the defect rate for the Asus manufactured device is high.

On the XDA Developers website, 140 users have responded to a poll asking where their defective Nexus 7 came from.  That message board thread currently has 35 pages of responses.  Most of the reported defects involve problems with the 7-inch IPS display: poor image quality, light leaks, dead pixels, improperly seated display and image ghosting.  Many users report substandard manufacturing and quality control issues.

UPS delivered our 16GB Nexus 7 yesterday with an unworkable touch screen defect.  The tablet often does not register screen touches, but when it does, phantom touches occur constantly, often triggering an avalanche of unwanted actions.  After a great deal of effort, I was able to install Multitouch Tester, an application that will display all currently registering screen touches.  The application revealed that a ghost touch is being registered at a fixed location near the center of the screen.

When we attempted to contact Google to report our defective tablet, we were greeted with an automated response discouraging us from staying on the line due to current call volume.  We did not hang up, but it took 50 minutes before our call was answered by a customer service representative.  The lengthy hold times further suggest that Google is struggling through a rough product launch of the Nexus 7.

UPDATE: Our Nexus 7 touch screen problem has worsened enough so that we can no longer unlock the device from the initial Android lock screen.  Google’s device return procedures are somewhat Byzantine and buggy, but apparently a replacement device will eventually be sent to us.  However, we have not received a shipping notification yet.

 

Feb 28, 2012
 

I updated Skype recently to version 5.8.0.156 and I have had three straight calls that were dropped after 30 minutes, 24 seconds. This did not happen with previous Skype versions where I could maintain calls for hours.

Microsoft purchased Skype last year for $8.5 billion and the service has been balky on several occasions since then.

May 21, 2011
 

Only a month or two after it was published, a detailed report that I wrote was wiped out during a BrightSideOfNews* hard drive crash. That exhaustive report, praised by many throughout the industry as the finest of its kind yet produced, examined the emerging and inevitable ARM versus x86 clash.

It took a little while and cost BSN* a lot of money to recover the data on the hard drive, but that report is now back up and can be read here.

I’m currently working on a followup to that bit of analysis that will include even more hardware than the initial report.  I’m still waiting on a vendor or two, so I can’t promise an ETA yet, but one thing I can state is that the new report will be very interesting.

The computing landscape is changing rapidly and the war between x86 and ARM microprocessors is now underway. The competitors have dramatically different strengths and weaknesses, making for a particularly exciting confrontation.

Most importantly, the results of this war will have profound effects well beyond the CPU market, where several companies will possibly see their fortunes upended.  One thing is absolutely certain: computing will never be the same again.

Feb 06, 2011
 

The popular Internet conspiracy discussion site, GodLikeProductions (GLP), is back up after about 24 dramatic hours of down time.

As we mentioned last night here, GLP closed operations yesterday, leaving only a vague, enigmatic statement as a farewell.

According to the site’s owner, who posts under the handle “^TrInItY^”, the outage was due to “internet stalkers” who he alleges committed “libel, defamation of character, cyber stalking, criminal harassment, hacking, ddos attacks, and other offenses…” Furthermore, ^TrInItY^ claims to have notified the police and has taken legal action against the alleged perpetrators: “I’ve got the ball rolling with the police and with civil charges against the individuals associated with this activity.”

However, the Internet adapts quickly and the 24-hour outage has impacted GLP’s credibility and momentum. Many of the message board’s orphaned members found refuge on sites like LunaticOutPost and AboveTopSecret.

GLP is attempting to remedy this hit by offering free membership extensions and free one month trial memberships. But at least some GLP members are viewing the offers skeptically and wonder if the outage was nothing more than a marketing plot.

Will GLP survive this hit? Probably. The site is too big and the outage too short to cripple it for long. But a repeat of the last 24 hours might not turn out so well for GLP, as many members have already established “refuges” on other websites in case of future, abrupt GLP closures.

Feb 06, 2011
 

One of the most popular Internet message boards in the world appears to have suddenly and unexpectedly shut down. GodLikeProductions, a nexus of Internet conspiracy discussion, has disappeared and in its place is a somewhat mysterious message:

The Kingdom of God is inside you, and all around you, not in mansions of wood and stone. Split a piece of wood and I am there; lift a stone and you will find me.

For the book has been closed, The names have been written, Judgement cometh.

And it was known to them in those days that there would come upon them the judgement of the children of God, who had lived among them as men and learned of their ways and tested their hearts.

While much of the material on “GLP,” as it is commonly called, borders on lunacy, the site has been invaluable for news because it is empowered by thousands of worried, obsessive and, ultimately, sharing people who constantly scour the Internet for substantive news, trying to find meaning and understanding within the many millions of digital voices that compose the Web.

The closure appears to be permanent because the current GLP message ends with a link to a “goodbye” recording.

If GLP is in fact gone, we lose one of the most important places on the Internet to gather and discuss current events anonymously.  GLP has been a much better place for news than CNN, MSNBC, FoxNews or any other mainstream news site.

On the other hand, the current, strange outage could be the result of hackers. GLP has built up many enemies over the years. It might even be a publicity stunt. There is no doubt that the site is crawling with paid disinformation agents including a number from within our own government.  Some critics even claim that GLP is an intelligence operation.  Nevertheless, discourse has remained mostly free and open there.

In any case, here’s hoping GLP’s farewell is only temporary.

In the meantime, it is safe to assume that thousands of paranoid GLP regulars are panicking right now.  End-of-the-world-doom was a favorite, obsessed upon topic of discussion there.

Dec 21, 2010
 

After many months of trying to wring something out of NVIDIA, I have finally obtained a Tegra 2-based device.  It is in the form of the ViewSonic G Tablet, a 10″ Android 2.2 (Froyo) based slate computer.  We bought it from Sears, of all places.  Oddly enough, Sears has one of the largest selections of tablet devices you can find.

Despite its complete lack of refinement, this thing is awesome, but only if you don’t mind wiping out the stock ROM.  The G Tablet’s shipping GUI looks like it was designed for toothless nursing home residents with computer-phobia and a lot more patience than I possess.  Obviously, ViewSonic wanted the device to be embraced by mainstream consumers so they dumbed-down the Android 2.2 interface with a sluggish, buggy and artificially limited mess of an overlay.  To make matters worse, the Android Market is nowhere to be found.

ViewSonic really shot themselves in the foot with the G Tablet.  They were first to the U.S. market with a dual-core Tegra 2-based device.  All they had to do was slap on a standard Froyo installation with a full Android Market and the device would have been a runaway hit for them this Christmas season.  But nooooooooooo!  ViewSonic had to get all greedy with visions of iPad’s success with mainstream buyers.  The resulting, lousy Tap ‘n Tap interface is like pouring a pound of aspartame over a steak dinner.  Sprinkled with bugs, the unsavory kind.

Fortunately, if you are a computer geek then it is not too difficult to flash the G Tablet’s firmware with a proper Android environment.  It’s also a fairly safe process since someone at ViewSonic had the foresight to make the device relatively brick-proof.  I’ve been using TnT Lite 3.0, but there are other options as well.  Yes, there will be headaches along the way, but geeks like me enjoy hacking a new device.

And, frankly, I have not been this excited about a new genre of computing device in many years.  The promise of the iPad was immediately evident to me when we bought one last spring.  However, a properly prepared G Tablet runs circles around the iPad.  Android-based tablets are going to dominate the marketplace by this time next year.

Of course, we bought the device for our business to benchmark and analyze. Tegra-2 appears to be even faster than I anticipated. The G Tablet finished under 2.5 seconds on SunSpider using Firefox Mobile 4.0 beta 2. When I wrote my ARM versus x86 treatise last spring, the 800MHz Cortex-A8 took over 14 seconds on SunSpider while the 1GHz Intel Atom needed over 8 seconds, about what my updated iPad takes today. Note, however, that Firefox’s JavaScript performance has improved enormously over that time. On the other hand, remember that JavaScript is still single-threaded, so half of the Tegra-2’s performance is left untapped on SunSpider.
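To put those SunSpider numbers in perspective, here is a rough sketch of the implied speedups. The quoted times are only bounds ("under 2.5", "over 14", "over 8"), so the ratios are lower bounds, and they mix browser versions as well as hardware, so they are not a clean CPU comparison. The constant and function names are mine.

```python
# SunSpider times quoted in the post, in seconds (lower is better).
# These are bounds, not exact measurements.
CORTEX_A8_800MHZ_AT_LEAST_S = 14.0   # "over 14 seconds"
ATOM_1GHZ_AT_LEAST_S = 8.0           # "over 8 seconds"
TEGRA2_AT_MOST_S = 2.5               # "under 2.5 seconds"

def min_speedup(slower_at_least_s, faster_at_most_s):
    """Lower bound on speedup when the slow time is a floor and the fast time is a ceiling."""
    return slower_at_least_s / faster_at_most_s

vs_a8 = min_speedup(CORTEX_A8_800MHZ_AT_LEAST_S, TEGRA2_AT_MOST_S)
vs_atom = min_speedup(ATOM_1GHZ_AT_LEAST_S, TEGRA2_AT_MOST_S)

print(f"Tegra 2 vs 800MHz Cortex-A8: at least {vs_a8:.1f}x")
print(f"Tegra 2 vs 1GHz Atom:        at least {vs_atom:.1f}x")
```

Since SunSpider runs on a single thread, these ratios reflect one Cortex-A9 core plus a much newer JavaScript engine, not both Tegra 2 cores.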

There’s been weeping, moaning and gnashing of teeth over the quality of the G Tablet’s display. Truth be told, those people are crybabies. Yes, it’s not as good as the IPS screens on the iPad or the B&N NOOKcolor, but it’s not awful either (its biggest problem is blinding glare, not its relatively limited viewing angles compared to IPS displays). However, I was expecting more from ViewSonic, a company best known for its outstanding history as a computer monitor vendor. But given the general unrefinement of the device, I was not too surprised. I mean, one look at the dingy, off-white G Tablet box shows that the challenges of marketing a tablet computer are currently beyond ViewSonic. I had to take out a ViewSonic monitor box to confirm my suspicion that the monitor and tablet marketing folks at ViewSonic apparently never speak to one another.

Anyhow, it’s not too late for ViewSonic.  They need to ditch tepid Tap ‘n Tap for a real, full Android experience, enable a complete Android Market, push device driver updates to the tablet and recognize the G Tablet for what it is: a Grade A geek toy.  In fact, it appears that ViewSonic decided to take a step in this direction today by promising to push out a new firmware edition before Christmas that will not only improve Tap ‘n Tap, but will also give the user the option to boot into a stock Android interface.

Penetrating the mainstream marketplace will require hardware tweaking like adding an IPS screen, improving the lame webcam, rubberizing the case and bezel, adding mechanical Android buttons and dramatically rethinking the case ink and finish.  If they want a nearly perfect tablet, ViewSonic can add a digital compass, GPS and rear-facing camera.

We’ll be testing the ViewSonic G Tablet and writing benchmarks specifically for this purpose.  Hopefully, we’ll have results to report soon.

Dec 20, 2010
 

If you are Jonesing over the Apple iPad but can’t afford half-a-grand for a baseline 16GB WiFi version, the $250 Barnes & Noble NOOKcolor is a good alternative.

Sporting twice the iPad’s RAM (512MB) along with 8GB of flash memory (expandable to 40GB through a microSD slot), the well-designed NOOKcolor has hardware features that are very competitive with the iPad’s. The excellent 7″, capacitive, multiTouch, IPS LCD at 1024×600 resolution comes close to the iPad’s 1024×768 pixel count but at a much higher pixel density. The NOOKcolor’s 800MHz ARM Cortex-A8 is almost as fast as the iPad’s 1GHz A8. WiFi connectivity appears to be better on the NOOKcolor than on the iPad. From my own experience, battery life is very good and approaches the iPad’s.
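The pixel density comparison is easy to sketch. The NOOKcolor figures come from the paragraph above; the iPad figures (1024×768 on a 9.7-inch diagonal) are the original iPad's published specifications, which I am supplying as an assumption. The function name is hypothetical.

```python
import math

def pixels_per_inch(width_px, height_px, diagonal_in):
    """Pixel density: length of the diagonal in pixels divided by its length in inches."""
    return math.hypot(width_px, height_px) / diagonal_in

# NOOKcolor: figures from the post. iPad: assumed original-iPad specs (not from the post).
nook_ppi = pixels_per_inch(1024, 600, 7.0)
ipad_ppi = pixels_per_inch(1024, 768, 9.7)
pixel_share = (1024 * 600) / (1024 * 768)

print(f"NOOKcolor: {nook_ppi:.0f} ppi, iPad: {ipad_ppi:.0f} ppi")
print(f"NOOKcolor has {pixel_share:.0%} of the iPad's pixels")
```

Under those assumptions the NOOKcolor packs roughly 170 pixels per inch against the iPad's roughly 132, with about 78% of the total pixel count.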

The only features missing on the NOOKcolor that are present on the iPad are a microphone, Bluetooth and compass.  The NOOKcolor might also lack an ambient light sensor; although there appears to be a place for one to the left of the home button, there does not appear to be any software support for a light sensor yet.

Arguably the best eReader currently available, the NOOKcolor is easily rooted using Auto-Nooter by following the directions here.  You will also need to install LauncherPro or some other similar program to access apps installed using the Android Market.  Once rooted, the NOOKcolor becomes the best Android-based tablet for the money.

Of course, the iPad’s App Store is unmatched, but the Android Market is nothing to sneeze at.  Previous purchases from the Android Market will automatically become available for installation on the rooted NOOKcolor, and this is a big advantage for Android-based devices.  While iPad users might enjoy a greater variety of software choices, the Android universe is exploding with new devices and all of your Android Market software purchases will carry over to any of them (although some programs might not run properly on every platform).

Rooting the NOOKcolor is not without problems.  The process is still young and requires perseverance, patience and a modicum of technical acumen.  Rooting the NOOKcolor might also void the device’s warranty and could possibly interfere with future official B&N operating system upgrades including the upgrade to Android 2.2 (Froyo) planned for January.

Having used both devices, I find the rooted NOOKcolor responsive and fun. Games like AngryBirds are every bit as good on the NOOKcolor as on the iPad. The outstanding email client is equal to the iPad’s and the Dolphin Browser, available through the Android Market, is a better browser than the iPad’s limited version of Safari. While the iPad’s touch screen might currently be a little more responsive and accurate, this could change in the NOOKcolor’s favor when the Froyo upgrade becomes available.

None of the NOOKcolor’s original functionality is affected by rooting, but it is satisfying to load the Amazon Kindle Android application onto the NOOKcolor so that you can read all of your Kindle books in a better format than any Kindle delivers.

So if you are feeling a little daring, the NOOKcolor is a terrific eReader that makes a great tablet when rooted.  The rooted NOOKcolor is a very strong, inexpensive alternative to the WiFi Apple iPad.

Dec 15, 2010
 

An open source operating system lauded for its security features appears to have been infiltrated by the FBI over a decade ago resulting in the covert injection of eavesdropping code allowing the U.S. Government to snoop certain types of commonly used encrypted network traffic.

Security expert and OpenBSD leader Theo de Raadt forwarded to the openbsd-tech mailing list an email from an old associate who claims the Federal Bureau of Investigation hired his company to create secret backdoors into the OpenBSD Crypto Framework IPsec networking stack about ten years ago. According to de Raadt, “large parts of the [IPsec] code are now found in many other projects/products.” Indeed, IPsec is the basic toolset for securing Internet Protocol (IP) communications across the Internet, and the OpenBSD implementation of IPsec is widely used.

Apparently experiencing a change of conscience since then, de Raadt’s former associate, Gregory Perry, further suggests that OpenBSD lost its DARPA funding because DARPA was aware of these covertly implemented security vulnerabilities.  Theo de Raadt writes:

I refuse to become part of such a conspiracy, and
will not be talking to Gregory Perry about this.  Therefore I am
making it public so that
(a) those who use the code can audit it for these problems,
(b) those that are angry at the story can take other actions,
(c) if it is not true, those who are being accused can defend themselves.

If true, this successful FBI “conspiracy” represents a serious blow to the credibility of open source security efforts.  Up until now, OpenBSD has been widely viewed as one of the most secure operating systems available.

The full text of the email, which came from here, is below.

List:       openbsd-tech
Subject:    Allegations regarding OpenBSD IPSEC
From:       Theo de Raadt <deraadt () cvs ! openbsd ! org>
Date:       2010-12-14 22:24:39
Message-ID: 201012142224.oBEMOdWM031222 () cvs ! openbsd ! org

I have received a mail regarding the early development of the OpenBSD
IPSEC stack.  It is alleged that some ex-developers (and the company
they worked for) accepted US government money to put backdoors into
our network stack, in particular the IPSEC stack.  Around 2000-2001.

Since we had the first IPSEC stack available for free, large parts of
the code are now found in many other projects/products.  Over 10
years, the IPSEC code has gone through many changes and fixes, so it
is unclear what the true impact of these allegations are.

The mail came in privately from a person I have not talked to for
nearly 10 years.  I refuse to become part of such a conspiracy, and
will not be talking to Gregory Perry about this.  Therefore I am
making it public so that
(a) those who use the code can audit it for these problems,
(b) those that are angry at the story can take other actions,
(c) if it is not true, those who are being accused can defend themselves.

Of course I don’t like it when my private mail is forwarded.  However
the “little ethic” of a private mail being forwarded is much smaller
than the “big ethic” of government paying companies to pay open source
developers (a member of a community-of-friends) to insert
privacy-invading holes in software.

—-

From: Gregory Perry <Gregory.Perry@GoVirtual.tv>
To: "deraadt@openbsd.org" <deraadt@openbsd.org>
Subject: OpenBSD Crypto Framework
Thread-Topic: OpenBSD Crypto Framework
Thread-Index: AcuZjuF6cT4gcSmqQv+Fo3/+2m80eg==
Date: Sat, 11 Dec 2010 23:55:25 +0000
Message-ID: <8D3222F9EB68474DA381831A120B1023019AC034@mbx021-e2-nj-5.exch021.domain.local>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Status: RO

Hello Theo,

Long time no talk.  If you will recall, a while back I was the CTO at
NETSEC and arranged funding and donations for the OpenBSD Crypto
Framework.  At that same time I also did some consulting for the FBI,
for their GSA Technical Support Center, which was a cryptologic
reverse engineering project aimed at backdooring and implementing key
escrow mechanisms for smart card and other hardware-based computing
technologies.

My NDA with the FBI has recently expired, and I wanted to make you
aware of the fact that the FBI implemented a number of backdoors and
side channel key leaking mechanisms into the OCF, for the express
purpose of monitoring the site to site VPN encryption system
implemented by EOUSA, the parent organization to the FBI.  Jason
Wright and several other developers were responsible for those
backdoors, and you would be well advised to review any and all code
commits by Wright as well as the other developers he worked with
originating from NETSEC.

This is also probably the reason why you lost your DARPA funding, they
more than likely caught wind of the fact that those backdoors were
present and didn’t want to create any derivative products based upon
the same.

This is also why several inside FBI folks have been recently
advocating the use of OpenBSD for VPN and firewalling implementations
in virtualized environments, for example Scott Lowe is a well
respected author in virtualization circles who also happens top be on
the FBI payroll, and who has also recently published several tutorials
for the use of OpenBSD VMs in enterprise VMware vSphere deployments.

Merry Christmas…

Gregory Perry
Chief Executive Officer
GoVirtual Education

“VMware Training Products & Services”

540-645-6955 x111 (local)
866-354-7369 x111 (toll free)
540-931-9099 (mobile)
877-648-0555 (fax)

http://www.facebook.com/GregoryVPerry

http://www.facebook.com/GoVirtual

Aug 20, 2010
 

Note: This report was originally published at Bright Side of News* on April 8, 2010.  After their server crashed, BSN* has not yet been able to recover the article after several weeks.  We are reposting the report here to serve as a mirror of the original article.  There are likely to be minor editing differences with the BSN* article.

Note 2: Only a month or two after it was published, a detailed report that I wrote was wiped out during a BrightSideOfNews* hard drive crash. That exhaustive report, praised by many throughout the industry as the finest of its kind yet produced, examined the emerging and inevitable ARM versus x86 clash.

It took a little while and cost BSN* a lot of money to recover the data on the hard drive, but that report is now back up and can be read here.

I’m currently working on a followup to that bit of analysis that will include even more hardware than the initial report. I’m still waiting on a vendor or two, so I can’t promise an ETA yet, but one thing I can state is that the new report will be very interesting.

The computing landscape is changing rapidly and the war between x86 and ARM microprocessors is now underway. The competitors have dramatically different strengths and weaknesses, making for a particularly exciting confrontation.

Most importantly, the results of this war will have profound effects well beyond the CPU market, where several companies will possibly see their fortunes upended. One thing is absolutely certain: computing will never be the same again.

Introduction

In this report we will discuss the emerging competition between ARM and x86 microprocessors. Led by the Intel Atom, x86 chips are quickly migrating downwards into embedded, low-power environments, while ARM CPUs are beginning to flood upwards into the more sophisticated and demanding market spaces currently owned by x86 processors. The central focus of this report will be an extensive compute performance comparison of the ARM Cortex-A8 against the new Intel Atom N450, the new VIA Nano L3050 and, for historical perspective, an old AMD Mobile Athlon based upon the Barton core. The Apple iPad A4 system-on-chip (SoC) is reportedly equipped with a 1GHz ARM Cortex-A8.

The Coming War: ARM versus x86

Over the last few years a war has been brewing. Two armies have been massing troops in their respective strongholds. Inside desktops, notebooks, servers and now even reaching into mainframes and supercomputers, the x86 family of microprocessors has mercilessly driven all competitors to extinction.

The “x86” moniker refers to the descendants of the 16-bit Intel 8086. Its little brother, the Intel 8088, which paired the same 16-bit core with an 8-bit external data bus, was the chip that powered the first IBM PC back in August 1981. Shockingly primitive by today’s standards, the 8088 spoke a computer dialect that is still understood by the most modern, powerful and successful CPUs from Intel, AMD and VIA.

The roll call of those vanquished by the x86 family includes microprocessors from IBM, DEC, Motorola, HP, Sun, Silicon Graphics, Commodore and even rivals from within Intel itself. Resistance has been futile. Eventually, even persistent holdout Apple succumbed to the relentless performance advances of the x86 juggernaut, dumping IBM’s Power architecture for the safety and reliability of x86 microprocessor advancements. Almost like clockwork, x86 CPUs double in capability every 18 months while prices continue to slowly decline.


x86 microprocessors, including AMD x86-64 and Intel EM64T, have taken over supercomputing. [Image taken from: http://en.wikipedia.org/wiki/File:Processor_families_in_TOP500_supercomputers.svg]

Yet, almost silently, a stealthy opponent has built up forces within the modest confines of PDAs, calculators, routers, media players, printers, GPS units and a plethora of other embedded devices, most notably mobile phones. Based in Cambridge, England, ARM Holdings dominates 32-bit microprocessor sales despite its very low profile. While AMD, a microprocessor vendor that commands about one-fifth of the x86 market, celebrated the sale of its 500-millionth CPU last July in its 40th year of operation, there were nearly 3 billion ARM chips shipped in 2009 alone.

The history of ARM microprocessors is almost as long as that for x86 CPUs. Sometimes called the “British Apple,” Acorn Computers began in 1978 and created a number of PCs that were very successful in the United Kingdom including the Acorn Electron, the Acorn Archimedes and the computer that dominated the British educational market for many years, the BBC Micro.

As the Commodore-produced, 2 MHz MOS Tech 6502 microprocessor that powered the BBC Micro grew long in the tooth, Acorn realized it needed a new chip architecture to compete in business markets against the IBM PC. Inspired by the Berkeley RISC project, which demonstrated that a lean, competitive, 32-bit processor design could be produced by a handful of engineers, Acorn decided to design its own RISC CPU sharing some of the most desirable attributes of the simple MOS Tech 6502.

Officially begun in October, 1983, the Acorn RISC Machine project resulted in first silicon on April 26, 1985. Known as the ARM1, the chip worked on this first attempt. The first production product, the ARM2, shipped only a year later.

In 1990, Acorn spun off its CPU design team in a joint venture with Apple and VLSI under a new company named Advanced RISC Machines Ltd, which is now an alternative expansion of the original “ARM” acronym. While Acorn Computers effectively folded over ten years ago, its progeny, ARM Holdings, is stronger than ever and dominates the market for mobile phone microprocessors.

Unlike x86 chipmakers Intel, AMD and VIA, ARM Ltd does not sell CPUs, but rather licenses its processor designs to other companies. These companies include NVIDIA, IBM, Texas Instruments, Intel, Nintendo, Samsung, Freescale, Qualcomm and VIA Technologies. Late last year, AMD spin-off GlobalFoundries announced a partnership with ARM to produce 28-nanometer versions of ARM-based system-on-chip designs.

The ARM Cortex-A8 versus x86

Like the Intel Atom, the ARM Cortex-A8 is a superscalar, in-order design. In other words, the Cortex-A8 is able to execute multiple instructions (up to two, as with the Atom) during each clock tick, but can only execute instructions in the order they arrive, unlike the VIA Nano and all current AMD and Intel chips besides the Atom. The Nano, for instance, can shuffle instructions around and execute them out of order, improving processing efficiency by about 20-30% over superscalar in-order chips.

The immediate predecessor of the Cortex-A8 is the ARM11 which found a home in the original Apple iPhone and countless other smartphones. The ARM11 is a simple, scalar, in-order microprocessor, so the best it can ever do is execute one instruction per clock cycle. As the Cortex-A8 is roughly equivalent to the Intel Atom, the ARM11 is somewhat similar to the VIA C7.

In-order chips suffer a performance hit because processing can come to a screeching halt when an instruction is encountered that takes a long time to complete. On the other hand, out-of-order chips can shuffle instructions around so that forward progress can usually be made while a lengthy instruction is simultaneously processed.
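A toy model makes the difference concrete. The sketch below is my own simplification, not a model of the Cortex-A8, Atom or Nano pipelines: it issues at most one instruction per cycle, ignores superscalar width and finite reorder windows, and the instruction names and latencies are invented for illustration. It assumes every dependency names an earlier instruction in the program.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Instr:
    name: str
    latency: int        # cycles from issue until the result is available
    deps: tuple = ()    # names of earlier instructions whose results are needed

def run_in_order(program):
    """Single-issue, in-order: an instruction that is not ready blocks everything behind it."""
    done_at = {}        # name -> cycle at which the result becomes available
    cycle = 0
    for ins in program:
        ready = max((done_at[d] for d in ins.deps), default=0)
        issue = max(cycle, ready)
        done_at[ins.name] = issue + ins.latency
        cycle = issue + 1
    return max(done_at.values())

def run_out_of_order(program):
    """Single-issue, out-of-order: each cycle, issue the oldest instruction whose inputs are ready."""
    done_at = {}
    pending = list(program)
    cycle = 0
    while pending:
        for ins in pending:
            if all(d in done_at and done_at[d] <= cycle for d in ins.deps):
                done_at[ins.name] = cycle + ins.latency
                pending.remove(ins)
                break
        cycle += 1
    return max(done_at.values())

# A slow load, an instruction that needs it, then three unrelated quick instructions.
PROGRAM = [
    Instr("load", 10),
    Instr("use", 1, ("load",)),
    Instr("a", 1),
    Instr("b", 1, ("a",)),
    Instr("c", 1, ("b",)),
]

print("in-order cycles:    ", run_in_order(PROGRAM))
print("out-of-order cycles:", run_out_of_order(PROGRAM))
```

In this example the in-order model sits idle behind the instruction waiting on the load, while the out-of-order model finishes the three unrelated instructions during the wait.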

The Intel Atom manages to partially overcome this problem by implementing HyperThreading, Intel’s brand name for its version of simultaneous multithreading (SMT). As with a few other Intel CPUs (and the three IBM PowerPC-based cores in the Xbox 360’s Xenon), the operating system (OS) views the Atom as if it has more processing cores than it actually does. In the case of the single core Atom N450, the OS sees two “virtual” cores. The operating system will accordingly distribute a thread (independently running task or program) to each core at once. Consequently, the Atom often churns through two unrelated instruction streams simultaneously, so even if one gets blocked by a slow, “high latency” instruction, the other thread can usually still be processed.

While HyperThreading doesn’t help much on single threaded tasks – and a vast amount of modern computing remains single-threaded – HyperThreading helps a great deal with slow input/output (I/O) intensive instruction streams since I/O operations can take an eternity from the CPU’s vantage point and can block even an out-of-order core. For instance, the Atom boots Windows 7 relatively quickly compared with even superscalar, out-of-order, single-core chips like the VIA Nano because the Atom can continue processing a second thread and does not have to frequently stop and wait on the vast number of I/O operations encountered during boot-up.
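The latency-hiding effect can be sketched with another toy model. This is an illustration under heavy assumptions, not Intel's implementation: one shared issue slot, each thread's instructions form a strict dependency chain, and the latencies are made up. A "thread" is just a list of instruction latencies.

```python
def run_single_thread(latencies):
    """In-order, single-issue, each instruction needs the previous result: time is the sum."""
    return sum(latencies)

def run_smt(thread_a, thread_b):
    """Two hardware threads share one issue slot; a stalled thread lets the other one issue."""
    threads = [list(thread_a), list(thread_b)]
    ready_at = [0, 0]    # cycle at which each thread's next instruction may issue
    next_idx = [0, 0]    # index of each thread's next instruction
    cycle = 0
    finish = 0
    prefer = 0           # round-robin priority between the two threads
    while any(next_idx[t] < len(threads[t]) for t in (0, 1)):
        for t in (prefer, 1 - prefer):
            if next_idx[t] < len(threads[t]) and ready_at[t] <= cycle:
                latency = threads[t][next_idx[t]]
                ready_at[t] = cycle + latency
                finish = max(finish, ready_at[t])
                next_idx[t] += 1
                prefer = 1 - t
                break
        cycle += 1
    return finish

THREAD_A = [1, 20, 1, 1]   # contains one slow, I/O-like operation (hypothetical latencies)
THREAD_B = [1] * 8         # eight quick instructions

back_to_back = run_single_thread(THREAD_A) + run_single_thread(THREAD_B)
interleaved = run_smt(THREAD_A, THREAD_B)

print("one thread after the other:", back_to_back, "cycles")
print("two SMT threads:           ", interleaved, "cycles")
```

The second thread's work fits almost entirely inside the first thread's stall, which is the kind of gain an I/O-heavy workload such as booting can see.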

Intel chose to equip the Atom with HyperThreading instead of making the chip out-of-order because HyperThreading is simpler and consumes less power. Intel’s Austin design team created the Atom especially for low-power environments.

However, the benefits of HyperThreading diminish when multiple cores are available. The newer ARM Cortex-A9 MPCore is designed to be deployed in two or more cores, so SMT is not as important under multi-core conditions. For instance, the new NVIDIA Tegra 2 boasts two ARM Cortex-A9 MPCore processors. Moreover, the A9 is superscalar, and out-of-order with speculative execution, putting it on equal footing with the newer x86 chips, at least superficially.

Keep in mind that modern x86 microprocessors tend to be very rich in execution units and, after decades of development, are extremely refined in terms of low instruction latencies and feature sets. Perhaps most importantly, the supporting x86 “ecosystems” are unmatched. “Ecosystem” is the current buzzword that refers to the surrounding chipset, memory, I/O, interconnect and peripheral infrastructure.

Moreover, ARM chips are RISC cores which have reduced instruction sets. In fact, RISC is an acronym for “Reduced Instruction Set Computer” and ARM CPUs typify this genre in many ways.

In general, RISC chips are leaner and usually support fewer instructions than CISC or “Complex Instruction Set Computer” microprocessors. While today’s x86 CPUs wield a decidedly CISC-style instruction set, the underlying hardware has absorbed most of the advantages of RISC while implementing many complex instructions in microcode. For instance, the VIA C3 bolted a CISC x86 frontend over a very MIPS-like RISC core.

An issue to watch out for when comparing ARM CPUs against x86 microprocessors is the size of binary files. In the past, RISC machines have produced larger executables because more instructions are often necessary than with CISC-derived systems. If binary sizes differ significantly, this places greater pressure on cache sizes, RAM size and memory bandwidth. With today’s terabyte-scale mass storage devices, increased binary bloat is not significant since the vast majority of drive space is consumed by video and other multimedia data.

Binary size comparison
Benchmark | ARM | x86
STREAM | 112.3% | 100.0%
miniBench | 115.1% | 100.0%
CoreMark | 107.3% | 100.0%

The table above shows that ARM Cortex binaries are indeed larger than x86 binaries, but the difference is only about 10-15 percent. If this sampling is representative for both platforms, binary size differences will rarely matter. ARM L1i and L2 caches should minimally be as large as those found on x86 microprocessors, but that is not currently the case, as will be discussed shortly.

ARM representatives responded with the following:

The binary size of the ARM benchmarks is significantly lowered with the Thumb-2 hybrid instruction set. Expected results are 20-30% lower code size at equivalent or better performance. The 10.0x version of Ubuntu Linux has been optimized for Thumb-2. (The version as tested was Ubuntu 9.04.)

Of course, the real story in the battle between ARM and x86 is how they measure up against each other in the performance arena. In this report, we’ll take a close look at competitive performance across a broad range of tests and also take a peek at power usage.

Benchmarking considerations

Normally it is a primary prerequisite to ensure that all systems under test have been configured identically prior to benchmarking. Unfortunately, this is impossible to achieve in this report given the highly integrated nature and grossly dissimilar “ecosystems” of ARM versus x86 microprocessors. For instance, the Freescale i.MX515 system that we used in our tests only supports DDR2-200 32-bit memory, much slower than the VIA Nano L3050 system’s DDR2-800 64-bit memory. Worse, the i.MX515’s integrated video solution is far more limited than the graphics solutions on any of the x86 systems, maxing out at 1024×768 at 16-bit color depth.

Given this rigidly set, uneven playing field, we deployed a battery of benchmarks that run primarily within the CPU’s caches. In other words, we attempted to measure only CPU-bound performance.

We verified the CPU sensitivity of each test by increasing the clock speed of the VIA Nano from 800MHz to 1800MHz. Tests should scale closely to the clock speed ratio of 225 percent.

Benchmark scaling
Hardinfo 226%
Peacekeeper 209%
Google V8 272%
SunSpider 228%
miniBench 220%
CoreMark 225%
stream add 108%

As shown in the table above, all of the benchmarks scaled appropriately with the exception of Google V8 and Stream Add, a memory bandwidth test that is constrained by memory performance and is included here as a counterexample. Benchmarks that scale superlinearly (superlinear: to increase at a rate greater than can be described with a straight line) like Google V8 usually are not good benchmarks. Indeed, Google V8 also demonstrated very large run-to-run variations on several tests like EarlyBoyer and RegExp. Nevertheless, we have included full Google V8 results since it remains a popular JavaScript benchmark.
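The scaling check can be sketched in a few lines of Python. The scaling figures are the ones from the table; the 10 percent tolerance band and the label names are my own assumptions for illustration, not a threshold we applied formally.

```python
# Expected scaling when the Nano goes from 800MHz to 1800MHz.
EXPECTED_SCALING = 1800 / 800  # 2.25, i.e. 225 percent

# Measured scaling from the table above, expressed as ratios.
measured_scaling = {
    "Hardinfo": 2.26,
    "Peacekeeper": 2.09,
    "Google V8": 2.72,
    "SunSpider": 2.28,
    "miniBench": 2.20,
    "CoreMark": 2.25,
    "stream add": 1.08,
}


def classify(scaling, expected=EXPECTED_SCALING, tolerance=0.10):
    """Label a benchmark by how closely it tracks the clock speed ratio.

    The tolerance is a hypothetical cutoff chosen for this sketch.
    """
    relative = scaling / expected
    if relative > 1 + tolerance:
        return "superlinear"
    if relative < 1 - tolerance:
        return "sublinear"
    return "tracks clock"


for name, scaling in measured_scaling.items():
    print(f"{name:12s} {scaling:.2f}x  {classify(scaling)}")
```

With this cutoff only Google V8 and Stream Add fall outside the band, which matches the reading of the table given above.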

Speaking of run-to-run variation, we ran each test at least three times and calculated the coefficient of variation (CV) to ensure result validity.
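For readers who want to apply the same sanity check, the CV is simply the standard deviation divided by the mean. The sketch below uses the sample standard deviation from Python's standard library; the three scores are made-up numbers, not results from this report.

```python
import statistics


def coefficient_of_variation(samples):
    """Return the sample standard deviation divided by the mean."""
    return statistics.stdev(samples) / statistics.mean(samples)


# Hypothetical scores from three runs of the same test.
runs = [100.0, 102.0, 98.0]
cv = coefficient_of_variation(runs)
print(f"CV = {cv:.1%}")
```

A small CV suggests the runs agree with each other; what counts as small enough is a judgment call that depends on the benchmark.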

For this report, we placed four CPUs under test: the 800MHz Freescale i.MX515 which is based upon the ARM Cortex-A8, the new VIA Nano L3050 downclocked to 800MHz, the new Intel Pineview-based Atom N450 downclocked to 1GHz and, for historical perspective, an 800MHz Mobile Athlon (Barton core).

Unfortunately, it was impossible to downclock the 1.67GHz Atom N450 below 1GHz, but, as you will see, the results we obtained are still very interesting. The Atom N450 introduces an on-die GPU which significantly reduces overall platform power consumption compared with the older Silverthorne-based Atom platforms.

I purchased a Gateway LT2104u netbook from Best Buy for this report in order to test the Intel Atom N450. The Gateway is a very well executed netbook design with a solid feel, attractive appearance, excellent battery life and good feature set.

The VIA Nano L3050 is the new, second generation, “CNB” Nano that boosts performance by 20-30 percent over the original “CNA” Nano, while also reducing power demands by similar amounts. The CNB-based Nano is still built on the same 65nm Fujitsu process used for the original CNA-based VIA Nano. Despite these improvements, the CNB Nano die size is almost identical to its predecessor’s at around 62-64 square millimeters.

The table below summarizes relevant system details.

System | Freescale i.MX515 (ARM Cortex-A8) | Mobile Athlon (Barton) | VIA Nano L3050 | Intel Atom N450
L1i | 32 kB | 64 kB | 64 kB | 32 kB
L1d | 32 kB | 64 kB | 64 kB | 24 kB
L2 | 256 kB | 512 kB | 1,024 kB | 512 kB
frequency | 800 MHz | 800 MHz | 800 MHz | 1,000 MHz
memory speed | DDR2-200 MHz (32-bit) | DDR-800 MHz | DDR2-800 MHz | DDR2-667 MHz
operating system | Ubuntu 9.04 | Ubuntu 9.04 | Ubuntu 9.04 | Jolicloud (Ubuntu 9.04)
gcc | 4.3.3 | 4.3.3 | 4.3.3 | 4.3.3
Firefox | 3.5.7 | 3.5.7 | 3.5.7 | 3.5.7

All systems ran Ubuntu Linux Version 9.04 with the exception of the Atom netbook where we had to install Jolicloud Linux because of video driver issues. However, Jolicloud is based upon Ubuntu 9.04, so programs installed from the Ubuntu repositories were identical.

We chose Ubuntu 9.04 because the ARM-based Pegatron nettop we used in this report came with Ubuntu 9.04 preinstalled. An attempt to upgrade that box to the latest version of Ubuntu failed due to insufficient disk space. The Pegatron device was equipped with a 4GB flash drive.

We underclocked the 1.8GHz VIA Nano L3050 to 800MHz by using the CPU multiplier setting in the Centaur reference system’s BIOS. We verified the proper clock speed by reading MSR 0x198. For the Atom N450 Gateway netbook, we underclocked the Atom to 1GHz using the Gnome CPU Frequency Monitor taskbar applet. This handy applet does not support the VIA Nano yet.


We used the Gnome CPU Frequency Monitor applet to set the Atom’s clock speed to 1GHz.

For JavaScript tests, all systems ran Firefox version 3.5.7. It is very important to use the same browser version for JavaScript tests because performance can vary tremendously from browser to browser or even version to version of the same browser.

We thank C.J. Holthaus and Glenn Henry from Centaur Technology for the VIA Nano L3050 reference board, and Katie Traut and Phillipe Robin from ARM for the tiny, Ubuntu-based Pegatron prototype system built around the Freescale i.MX515.

The Pegatron “nettop” is only slightly larger than a CD case yet it boasts a full complement of features including 512MB of DDR2-200MHz memory (32-bit interface), a VGA connector, wireless “N” networking, Bluetooth 2.1 + EDR, a flash memory card reader, and audio, headphone, Ethernet and USB ports. Total system power usage rarely rises much above 6 Watts.

Unless specified otherwise, all benchmark results are reported so that larger numbers correspond to better performance. Many tests have been “normalized” against the ARM Cortex-A8 so that results are reported in terms of the performance ratio with the Cortex-A8. For instance, if the Atom is twice as fast as the Cortex-A8 on a certain test, it will score 2.00.
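The normalization is easy to reproduce. The sketch below is one way to do it, with a hypothetical `higher_is_better` switch for tests that report elapsed time instead of a score; the raw numbers are invented for illustration.

```python
def normalize(raw_results, baseline="Cortex-A8", higher_is_better=True):
    """Express each result as a performance ratio against the baseline CPU."""
    base = raw_results[baseline]
    if higher_is_better:
        return {cpu: value / base for cpu, value in raw_results.items()}
    # For timings, a shorter time means better performance, so invert the ratio.
    return {cpu: base / value for cpu, value in raw_results.items()}


# Hypothetical raw scores (higher is better).
scores = {"Cortex-A8": 500.0, "Atom N450": 1000.0}
print(normalize(scores))

# Hypothetical run times in seconds (lower is better).
times = {"Cortex-A8": 20.0, "Atom N450": 10.0}
print(normalize(times, higher_is_better=False))
```

Either way, the baseline chip always lands at 1.00 and a chip that is twice as fast lands at 2.00.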

A gander at memory subsystem performance

As mentioned earlier, the memory subsystems vary significantly among these dissimilarly configured systems. The ARM Cortex-A8 struggles with its very weak DDR2-200MHz, 32-bit memory.

Nevertheless, memory bandwidth results are important because they underscore a handicap that ARM must eventually address. ARM systems have typically been optimized for extreme low-power environments while x86 systems have been aggressively optimized for performance. A sacrifice made in the Freescale i.MX515 is memory speed exchanged for low power usage, but this absolutely destroys performance on many types of tasks as exemplified by our STREAM results.

As can be seen in the graph above, the ARM Cortex-A8 as part of the Freescale i.MX515 struggles against even the ancient AMD Athlon and is creamed by the VIA Nano and the Intel Atom. While part of the problem is its pokey memory, another component is the ARM chip’s meager 32-bit memory interface, half the width used for single-channel memory access by x86 chips. If the Cortex-A8 were equipped to access DDR2-800 memory through a 64-bit interface, it might very well keep up with its x86 rivals in terms of memory bandwidth.
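A back-of-the-envelope calculation shows how large the gap is on paper. Peak theoretical bandwidth is the transfer rate multiplied by the bus width in bytes. The sketch below assumes the “200” and “800” in the memory names are transfer rates in millions of transfers per second; if the i.MX515 figure is instead a 200MHz clock with two transfers per cycle, its number doubles. These are theoretical ceilings, not STREAM results.

```python
def peak_bandwidth_mb_s(mega_transfers_per_s, bus_width_bits):
    """Theoretical peak bandwidth in MB/s: transfers per second times bytes per transfer."""
    return mega_transfers_per_s * bus_width_bits / 8


# Assumption: the numbers in the memory names are MT/s.
imx515_peak = peak_bandwidth_mb_s(200, 32)  # DDR2-200, 32-bit interface
nano_peak = peak_bandwidth_mb_s(800, 64)    # DDR2-800, 64-bit interface

print(f"i.MX515 peak: {imx515_peak:.0f} MB/s")
print(f"Nano peak:    {nano_peak:.0f} MB/s")
print(f"ratio:        {nano_peak / imx515_peak:.1f}x")
```

Under that assumption the narrower bus accounts for a factor of two and the slower memory for a factor of four.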

For this report, ARM representatives explained the design decisions behind the Freescale i.MX515 used in our Pegatron prototype:

The ARM ecosystem is centered on a “right-sized” computing philosophy. ARM Partners design their SoCs to a particular set of applications, enabling the best tradeoff for power, cost and performance for a given application.   The Freescale i.MX51 was designed for a particular application class, with the memory subsystem designed for the needs of these applications. It is understandable that the performance of this memory subsystem will be different from platforms targeted at general purpose computing applications.

Incidentally, the VIA Nano can also be configured to support 32-bit memory access. This is desirable in severely space constrained environments where trace and pin counts adversely impact package and PCB implementation size.

Integer Performance

Although it might not always appear to be the case, all computing is the processing of numbers. From the words of a love letter, to the glistening dew drops on a rose, to Johnny Cash’s rumbling, anguished, repentant voice, to Gordon Freeman’s apocalyptic universe, to the ruby slippers on Dorothy’s feet, all are simply numbers to a computer.

For most chores, the only numbers that matter are integers. Integers are the natural counting numbers like 1, 2, 3 and their negative counterparts plus zero. With the exclusion of 3D gaming and some types of video and still image rendering, encoding and manipulation, the vast bulk of day-to-day computing is integer-based. The integer test results we look at here can give us insight into typical system performance across chores like word processing and web browsing.

The Embedded Microprocessor Benchmark Consortium (EEMBC) recently released a benchmark that is freely available to anyone. Dubbed “CoreMark,” this test provides a quick way to compare CPU performance across entirely different processor architectures.

We compiled CoreMark on each platform using GCC version 4.3.3 and the following flags:

-O3 -DMULTITHREAD=4 -DUSE_FORK=1 -DPERFORMANCE_RUN=1  -lrt

We chose to generate four threads to ensure scaling across a variety of systems featuring multiple cores and/or HyperThreading like the Intel Atom.

As you can see from the graph above, the ARM Cortex-A8 is very competitive on EEMBC CoreMark, running almost as fast as the Athlon and Nano. The Atom pulled ahead thanks to HyperThreading combined with its 25 percent clock speed advantage over the other chips. Unfortunately, there aren’t many more overall wins for the Atom ahead; please note, however, that most of the remaining tests are single-threaded.

“miniBench” is a diverse benchmark that I’ve been working on for several years. It’s part of my OpenSourceMark benchmarking project. miniBench contains a wide variety of popular tests and runs quickly from the command-line. I also have a GUI-based version that I wanted to use for this report but could not do so because the Qt tool chain would not install completely on the ARM system. Instead, I used the excellent and relatively lightweight Code::Blocks IDE to create and manage the necessary C++ project files for a command-line binary.

You can download the x86 Code::Blocks project here. An x86 Linux binary compiled with static libraries is here. A similar ARM Cortex-A8 Linux binary is here. Both the x86 Linux project and the ARM Cortex-A8 project will eventually be uploaded to the OpenSourceMark SourceForge page, along with GUI adaptations of these benchmarks.

The ARM Cortex-A8 struggles on three of the five tests in this first miniBench chart. Heap Sort is the worst result for the A8 and this is almost certainly because the test appears to be significantly impacted by memory bandwidth. The i.MX515 system is saddled with very poor bandwidth as already demonstrated in this report. Integer Matrix Multiplication is another memory bandwidth sensitive test where the ARM chip comes up short.

However, the ARM Cortex-A8 is extremely impressive on the Integer Arithmetic test, blowing away the Athlon and doubling the Atom’s performance. The Integer Arithmetic test does exactly what you’d expect it to do: it performs a large number of very simple integer arithmetic calculations.

Also notice that the 800MHz ARM Cortex-A8 beats the 1GHz Intel Atom N450 on the ubiquitous Dhrystone benchmark despite the fact that the ARM chip spots the Atom a 25 percent clock speed advantage.  ARM advertises that we should be able to get 1,600 Dhrystone MIPS from an 800MHz Cortex-A8.  On our tests, the 800MHz ARM Cortex-A8 achieved 1,680 Dhrystone MIPS.
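Dhrystone MIPS figures are usually quoted per megahertz, so the comparison with ARM's claim can be sketched as below. The divisor of 1,757 Dhrystones per second is the conventional VAX 11/780 reference score used to convert raw Dhrystone results to DMIPS; the function names are my own.

```python
VAX_11_780_DHRYSTONES_PER_SECOND = 1757  # conventional 1 MIPS reference machine


def dmips(dhrystones_per_second):
    """Convert a raw Dhrystones-per-second result to Dhrystone MIPS."""
    return dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_SECOND


def dmips_per_mhz(dmips_score, clock_mhz):
    """Clock-independent Dhrystone figure of merit."""
    return dmips_score / clock_mhz


advertised = dmips_per_mhz(1600, 800)  # ARM's advertised figure at 800MHz
measured = dmips_per_mhz(1680, 800)    # our measured figure at 800MHz

print(f"advertised: {advertised:.2f} DMIPS/MHz")
print(f"measured:   {measured:.2f} DMIPS/MHz")
```

In other words, our sample came in about five percent above the advertised figure.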

It’s clear that the ARM Cortex-A8 is aggressively optimized for Dhrystone performance, which is borne out by how prominently ARM touts the chip’s Dhrystone throughput.

On the second set of miniBench integer tests, the ARM Cortex-A8 holds its own against the brawnier x86 CPUs. The ARM Cortex-A8 even beat the VIA Nano L3050 on the Sieve test.  More remarkably, the Cortex-A8 is very close to parity with the Atom across all of these tests, save for one, if the Atom’s 25 percent clock speed advantage is considered.

Notice, though, that the ARM chip could not run the String Concatenation test. This is an important indication of the relatively immature state of ARM’s Linux/GNU software support. Ubuntu as a whole was often flaky. Doubtlessly, this will improve with time.

The VIA Nano L3050 obliterates all of the competition on the hashing tests because the Nano features hardware support for these important security functions.

However, the 800MHz ARM Cortex-A8 is amazingly good at hashing and thoroughly beats the 1GHz Atom on both tests and is only slightly slower than the Athlon.

The VIA Nano L3050 enjoys its biggest triumph on the miniBench cryptography tests because the Nano is equipped with robust hardware support for AES ECB encryption and decryption.

Again, the ARM Cortex-A8 remains very close to the Intel Atom if the Atom’s 25 percent clock speed advantage is considered.
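Adjusting for clock speed is a one-line calculation: divide the relative score by the clock ratio. The sketch below uses the roughly 30 percent AES advantage for the Atom that is quoted in the power section of this report; the function name and the assumption that performance scales linearly with clock speed are mine.

```python
def clock_adjusted(relative_score, clock_mhz, baseline_mhz=800):
    """Scale a relative score down to what it would be at the baseline clock.

    Assumes performance scales linearly with clock speed, which is only
    approximately true for cache-resident tests.
    """
    return relative_score / (clock_mhz / baseline_mhz)


# Atom N450 at 1GHz, about 1.30x the Cortex-A8 on the AES tests.
atom_aes_adjusted = clock_adjusted(1.30, clock_mhz=1000)
print(f"Atom AES result per clock, relative to the Cortex-A8: {atom_aes_adjusted:.2f}")
```

Once the 25 percent clock advantage is removed, the two chips are within a few percent of each other on this test.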

HardInfo is one of the few CPU benchmarks available from within Ubuntu’s repositories.

The ARM Cortex-A8 doesn’t perform quite as well on HardInfo as it did on miniBench, possibly because I used very aggressive optimization flags for both platforms when compiling miniBench. Nevertheless, the ARM Cortex-A8 stays within spitting distance of the x86 CPUs except on the FPU Raytracing test which is not an integer test but rather a floating-point test.

Floating-point performance is the ARM Cortex-A8’s Achilles’ heel as we will see in the next section.

Floating-point performance

Gaming, scientific computing, certain spreadsheet workloads such as financial simulations, and some image and video manipulation tasks involve fractional and irrational numbers. Called “floating-point” because the decimal or radix point can float around among the significant digits of a number, floating-point performance has become increasingly important in modern computing.

However, good floating-point performance is relatively hard to engineer and requires a substantial number of additional transistors.  Of course, this drives up power usage. Typically, floating-point intensive operations consume more power than pure integer tasks. In fact, miniBench’s LinPack test was the worst case power consumer on the VIA Nano.  Centaur discovered this while I worked there as head of benchmarking.  However, this does not include “thermal virus” programs like the absolute worst case program developed by Glenn Henry, Centaur’s president.

Integrated floating-point (FP) hardware is a fairly new addition to ARM processors and even though the Freescale i.MX515 ARM Cortex-A8 features two dedicated floating-point units, there are still severe limitations. The faster of the two FP units is the “Neon” SIMD engine, but it only supports 32-bit single-precision (SP) numbers. Single-precision numbers are too imprecise for many types of calculations.
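The precision limit of 32-bit numbers is easy to demonstrate. The sketch below rounds Python's native double-precision values to single precision with the standard `struct` module; it illustrates the number format only and says nothing about Neon itself.

```python
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# Single precision has a 24-bit significand, so integers above 2**24
# can no longer all be represented exactly.
big = 2.0 ** 24 + 1
print(big, "->", to_float32(big))

# Common decimal fractions also pick up a larger error than in double precision.
print(0.1, "->", to_float32(0.1))
```

Errors of this size accumulate quickly in long-running sums, which is why many scientific and financial calculations insist on double precision.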

Hardware support for 64-bit, double-precision, floating-point calculations is provided by the “Vector Floating-Point” (VFP) unit, a pretty weak coprocessor. And despite being called a “vector” unit, the VFP can only really operate on scalar data (one at a time), although it does support SIMD instructions which helps improve code density.

Oddly enough, during our performance optimization experiments, Neon generated the same level of double-precision performance as the VFP, while doubling the VFP’s single-precision performance.  When we asked ARM about this, company representatives replied, “NEON improves FP performance significantly. The compiler should be directed to use NEON over the VFP.”

We therefore compiled miniBench to leverage Neon for this report. Note that while the Neon compiler flag was used for the ARM chip, none of the tests are explicitly SIMD optimized – the x86 version of miniBench used in this report does not include hand-coded SSE or SSE2 routines and the ARM Cortex-A8 version of miniBench does not include similar Neon code.

In the miniBench MFLOPS tests, the ARM Cortex-A8 looks pretty bad except on division.

While the VIA Nano has the best DP (double-precision) performance, note how well the Intel Atom  N450 handles SP calculations.

It is also worthwhile to recognize the very good floating-point division performance of the ARM Cortex-A8’s Neon.  Unlike all of the x86 chips that I have ever tested, the Cortex-A8 delivers identical throughput for both floating-point division and multiplication.  Division is much slower on x86 processors than multiplication.  Consequently, the Cortex-A8 keeps up very well with the x86 CPUs in this report on DP division, more than doubling the Atom’s performance when the Atom’s clock speed advantage is considered. In single-precision division, the ARM Cortex-A8 beats ALL of the x86 microprocessors it’s pitted against here.

The ARM Cortex-A8 continues to languish on the remaining miniBench floating-point tests with two notable exceptions. The Cortex-A8 is fairly strong on FFT calculations, an extraordinarily important algorithm for many, many tasks. The ARM chip is also competitive with the Atom on the Double Arithmetic test.

Observe how the old Barton-core Mobile Athlon demolishes all of the other chips on Trig. AMD has historically provided industry leading performance on transcendental calculations, while the same area has always been a big weakness for VIA’s CPUs.  ARM really needs to bolster their chips’ performance on transcendental operations like the trigonometry functions exercised in this test.

The takeaway from this section is that the ARM Cortex-A8 does not deliver acceptable floating-point performance for netbooks, notebooks or desktops compared with x86 CPUs. This is an area ARM must address if the company plans to compete toe-to-toe with x86 microprocessors.

JavaScript performance

JavaScript performance has become very important as cloud-based computing has finally begun to take hold with the appearance of solutions like Google Apps, Zoho Office, Adobe’s Acrobat.com, Aviary and many more applications. The Google Android operating system largely forgoes native applications and leverages Web-based JavaScript programs. Jolicloud Linux takes a similar but less aggressive tack, allowing native and cloud-based applications to seamlessly co-exist.

There are several widely used JavaScript tests that run across all of the CPUs examined in this report. However, it is very important to run these tests on the same browser across all platforms. The specific browser version matters as well because JavaScript performance varies wildly from browser to browser and version to version as web browser developers push each other in a mad race to provide the fastest JavaScript engines.

Thankfully, Firefox 3.5.x is available for each system included in this report and we used it for these tests.

FutureMark, the maker of PCMark and 3DMark, has introduced its own JavaScript benchmark called Peacekeeper. FutureMark Peacekeeper is hands down the most elaborate JavaScript benchmark currently available, although it is difficult to assess its validity. Peacekeeper is the only JavaScript test in our roundup that has complex graphical components.

The Freescale i.MX515 ARM system fared poorly against its x86 rivals across all Peacekeeper tests. This might be partially attributable to the slow main memory subsystem that saddles the Cortex-A8. The i.MX515 Cortex-A8 only has 256kB of L2 cache compared to 512kB for the Athlon and the Atom and 1,024kB for the Nano, so it is much easier for a benchmark to spill out of the Cortex-A8’s L2 cache and into its extremely slow main memory.

ARM representatives agreed that the Cortex-A8’s poor showing on FutureMark Peacekeeper is most likely due to its L2 handicap, perhaps making Peacekeeper, in the context of this report, more of a comparison of memory subsystems, not processors.

Note also that the ARM system failed to complete the Peacekeeper complex graphics test.

The VIA Nano L3050 was the clear winner of FutureMark’s Peacekeeper, besting all of its rivals on every test. Even though the Intel Atom N450 was far behind the two other x86 chips, its overall score was nearly twice that of the ARM system. Again, keep in mind that the Atom also ran with a 25 percent clock speed advantage over the other chips in this comparison. Also be aware that JavaScript is not threaded, so the Atom’s HyperThreading engine won’t help it much on JavaScript tests.

With Google in the lead of cloud-based computing efforts, it should not be surprising that the search engine giant also provides its own JavaScript benchmark. Unfortunately, the Google V8 benchmark does not behave like a very good benchmark at this point, demonstrating large run-to-run variation and superlinear scaling. Nevertheless, Google V8 is a popular JavaScript benchmark, so we included it here.

The Google V8 benchmark closely reproduced FutureMark Peacekeeper’s results. VIA’s Nano L3050 won every test by significant margins again. The Atom trailed the other x86 processors badly, but still nearly doubled the ARM Cortex-A8’s showing.

Our final JavaScript benchmark is SunSpider, perhaps the most popular JavaScript test in use today.

Again, the ARM Cortex-A8 does not look good, faring only slightly better than on the other two JavaScript benchmarks.

The VIA Nano L3050 barely pulls out an overall win, its score hurt by very poor performance on bit level operations.  The ARM Cortex-A8 beats the Nano on two of these tests.

Despite its age, the AMD Mobile Athlon based on the Barton core has delivered competitive performance across nearly all tests.

I must state at this point that the JavaScript results do seem to reflect the relative, subjective, overall feel of the four systems. Despite its strong showing on many integer tests, the Freescale i.MX515-based Pegatron system feels much more sluggish than all three of the x86 systems; the Pegatron’s extremely slow memory subsystem doubtlessly contributes to this issue. The Atom N450 is also clearly more lethargic than either the AMD Mobile Athlon or the VIA Nano L3050 systems. The AMD and VIA systems are essentially indistinguishable during normal usage.

2D graphics performance

Take the following chart with a grain of salt because the video subsystems across the four systems are very dissimilar. The VIA, Intel and Freescale systems all used integrated graphics while the AMD system was equipped with a discrete NVIDIA NX6200 AGP card.

Even though the three x86 systems ran at 24-bit color depth, they were all two to three times faster than the ARM system that ran at only 16-bit color depth. We tested all systems at 1024×768 (XGA) resolution except the Atom, which we tested at the native panel resolution of 1024×600.

Power consumption

While the x86 microprocessors in this comparison enjoy a clear overall performance advantage, ARM CPUs are renowned for their power usage thriftiness. It is very difficult to compare power usage among the four CPUs under test for this report. The AMD and VIA systems are inappropriate for power comparisons because they are based on desktop hardware.

The chart below contrasts power consumption between the Intel Atom N450 and the ARM Cortex-A8 while running miniBench. The power curves were generated from system power usage adjusted downwards so that idle system power was discarded. For the Atom, idle power was 13.7W with the Gateway netbook’s integrated panel disabled while the idle power for the Pegatron system was only 5.4W.

Be aware that the Pegatron prototype does not implement many power management features.  ARM representatives note:

The Pegatron development board was designed as a software development tool and does not have a commercial production software build so it does not have many of the power management features found in ARM-based mobile devices. Production systems would expect to have aggressive power management implemented, lowering the ARM power consumption.

Given this information, the results we show here likely represent an energy consumption condition considerably worse than would be encountered with a similarly configured, commercial, ARM Cortex-A8-based system.

Subtracting idle power usage should isolate the curves to the power necessary for running miniBench. Note that the Atom reached minimum power usage shortly after startup and never reached that level again. Idle power beyond that point is about 1 Watt higher. Even taking that into consideration, the Atom consumes at least three times the power of the ARM Cortex A8 on the same tests.

It’s particularly interesting to see how power usage compares on the AES tests where both CPUs deliver comparable performance. The first major hump on the Atom curve shows the power consumed on the AES tests. Compared with the ARM Cortex-A8, the Intel Atom N450 required about four times more power while delivering only about 30 percent additional performance – and this is with a 25 percent clock speed advantage.

The sharp peak in Atom power usage occurred on the miniBench floating-point memory bandwidth tests.

The Atom completes miniBench in about one-half the time needed by the ARM Cortex-A8 due to the ARM processor’s very poor floating-point performance. The first major dips in both curves (at 1000s and 2000s) indicate where the two systems complete the benchmark.

Even though floating-point hardware can draw a lot of power, FP units usually deliver significant energy savings because floating-point operations take much less time to complete with accelerated hardware support. Energy consumed for a task is: E = P * t, where “E” is for “Energy,” “P” is for “Power” and “t” is for “time.” Good floating point hardware might drive up power demands, but the time to complete FP operations is reduced enough to dramatically reduce the total energy needed for those operations.
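To see how E = P * t plays out, the sketch below integrates power samples over time after subtracting idle power, which is how the power curves in this section were prepared. The idle figures are the measured ones from this report, but the load samples are hypothetical round numbers chosen only to mirror the rough shape of our results (the Atom drawing several times more power for about half as long); they are not measured data.

```python
def net_energy_joules(samples, idle_watts):
    """Integrate (power - idle) over time using the trapezoidal rule.

    samples is a list of (time_in_seconds, system_power_in_watts) pairs
    in increasing time order.
    """
    energy = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        average_net_power = ((p0 - idle_watts) + (p1 - idle_watts)) / 2
        energy += average_net_power * (t1 - t0)
    return energy


ATOM_IDLE_WATTS = 13.7  # measured idle power, panel disabled
ARM_IDLE_WATTS = 5.4    # measured idle power

# Hypothetical load profiles: constant draw for the length of the run.
atom_samples = [(0, 17.7), (1000, 17.7)]  # 4W above idle for 1,000s
arm_samples = [(0, 6.4), (2000, 6.4)]     # 1W above idle for 2,000s

atom_energy = net_energy_joules(atom_samples, ATOM_IDLE_WATTS)
arm_energy = net_energy_joules(arm_samples, ARM_IDLE_WATTS)
print(f"Atom: {atom_energy:.0f} J, ARM: {arm_energy:.0f} J")
```

With these illustrative numbers, finishing in half the time cuts a fourfold power gap down to a twofold energy gap, which is why faster hardware can come out ahead on energy even when it draws more power.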

Despite the fact that the ARM Cortex-A8 blows away the Intel Atom in power thriftiness, don’t belittle the Atom. It is a resounding success in terms of reducing the power demands of x86 microprocessors. The Intel Atom is currently the only realistic x86 system-on-chip (SoC) design ready to migrate downwards into smartphones.

Doubtlessly inspired by the VIA C7, the Intel Atom delivers acceptable performance while sipping power at levels far lower than usually seen in the x86 world. That inspiration explains why Intel set up shop in Austin, the same town where VIA’s Centaur design team is headquartered (in fact, a few ex-“Centaurians” worked on the Atom). Right now, there is no competing, low-power x86 CPU, let alone SoC, that can match the Atom in terms of performance per Watt, especially on multithreaded applications.

Conclusion

The ARM Cortex-A8 achieves surprisingly competitive performance across many integer-based benchmarks while consuming power at levels far below the most energy miserly x86 CPU, the Intel Atom. In fact, the ARM Cortex-A8 matched or even beat the Intel Atom N450 across a significant number of our integer-based tests, especially when compensating for the Atom’s 25 percent clock speed advantage.

However, the ARM Cortex-A8 sample that we tested in the form of the Freescale i.MX515 lived in an ecosystem that was not competitive with the x86 rivals in this comparison. The video subsystem is very limited.  Memory support is a very slow 32-bit, DDR2-200MHz.

Languishing across all of the JavaScript benchmarks, the ARM Cortex-A8 was only one-third to one-half as fast as the x86 competition. However, this might partially be a result of the very slow memory subsystem that burdened the ARM core.

More troubling is the unacceptably poor double-precision floating-point throughput of the ARM Cortex-A8. While floating-point performance isn’t important to all tasks and is certainly not as important as integer performance, it cannot be ignored if ARM wants its products to successfully migrate upwards into traditional x86-dominated market spaces.

However, new ARM-based products like the NVIDIA Tegra 2 address many of the performance deficiencies of the Freescale i.MX515. Incorporating two ARM Cortex-A9 cores (more specifically, two ARM Cortex-A9 MPCore processors), a vastly more powerful GPU and support for DDR2-667 (although still constrained to 32-bit access), the Tegra 2 will doubtlessly prove to be highly performance competitive with the Intel Atom, at least on integer-based tests. Regarding the Cortex-A8’s biggest weakness, ARM representatives told us its successor, the Cortex-A9, “has substantially improved floating-point performance.” NVIDIA’s CUDA will eventually also help boost floating-point processing speed on certain chores.

Unmatched software support has always been the “ace in the hole” for the x86 contingent. However, with the success of Linux and the maturity of its underlying and critical GNU development toolset, Linux/GNU support could be the great equalizer that allows ARM to finally overcome the x86 stranglehold in netbooks and even notebooks and desktops. Maturing Linux support might also help ARM chips make further incursions into gaming devices.

I didn’t expect it, but the emerging war between ARM and x86 microprocessors is turning out to be much more competitive and interesting than I ever imagined.

In addition to the main ARM versus x86 focus of this report, there is also a subplot pitting the new Intel Atom N450 against the new VIA Nano L3050. The Intel Atom N450 is a remarkable product in that it is the first x86 SoC (system-on-chip) that is suitable for smartphones and other ultra-low power environments. As such, the Atom promises to dramatically improve the sophistication and performance levels of those market spaces.

While the various Atom models currently dominate the booming netbook market, it is evident from our JavaScript tests that the VIA Nano L3050 is much more desirable if JavaScript performance is important at all. Across our JavaScript benchmark results, the 800MHz VIA Nano L3050 is about 50 percent faster than the 1GHz Intel Atom N450.
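As a rough illustration of what those figures imply per clock cycle, and assuming that performance scales linearly with clock frequency (an assumption our tests do not establish), take the Atom's score at 1GHz as 1.0 and the Nano's score at 800MHz as 1.5. The Nano's per-GHz advantage then works out to:

$$
\frac{1.5 / 0.8\,\text{GHz}}{1.0 / 1.0\,\text{GHz}} = 1.875
$$

In other words, under that assumption the Nano delivers a little under twice the JavaScript throughput of the Atom per clock.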

However, VIA still lags Intel in terms of suitability for low power consumption environments, largely because Intel leverages its outstanding 45nm fabrication technologies with the Atom, while VIA still produces the Nano L3050 in the relatively elderly 65nm Fujitsu process node. The Atom is also strong on multithreaded tasks as demonstrated by its CoreMark victory. HyperThreading will also benefit Atom in I/O intensive environments where the single-core Nano will be hard-pressed to keep up.

Lastly, the AMD Mobile Athlon in this comparison gives us important insight into how the new chips from Intel, VIA and ARM stack up historically. Overall, across all of our performance tests, the ancient Barton core-based Athlon came in a very close second behind the VIA Nano L3050. This suggests AMD could easily produce a competitive low power CPU if the chipmaker did nothing else but shrink one of its older core designs while adding a few power saving tweaks.

In summary, ARM is positioned very well to engage in battles with the Intel Atom as that x86 chip advances into smartphones. The ARM Cortex-A8 appears to use much less power than the Atom, while often delivering comparable integer performance. Nevertheless, the Atom is significantly faster overall when considering holistic system performance, but that performance will be accompanied with a battery life penalty and significantly more heat production. Heat is a serious problem within the tight confines of mobile phones.

New chips based upon ARM Cortex-A9 derivatives, like the NVIDIA Tegra 2, address many of the performance weaknesses we encountered with the Freescale i.MX515. If ARM is to achieve sustained victories in the netbook space – let alone in the more performance demanding notebook and desktop spaces – ARM must substantially improve floating-point throughput.

While the dedicated functional block approach used by ARM and its legions of licensees to provide image manipulation, video decoding/encoding, security and Java acceleration is still valid, it is not a substitute for double-precision floating-point performance.

ARM representatives told us for this report that the Cortex-A9 “has substantially improved floating-point performance.”  It will take a big jump forward to catch their x86 rivals, but if ARM pulls it off, Intel, AMD and VIA are going to have a big, bloody war on their hands.  It is conceivable the x86 empire might finally see the boundaries of its swelling, vast territories begin to retract in the near future under an army ant-like assault of tiny, fast, cheap, multi-core ARM microprocessors coming at them from dozens of different companies.

ARM’s success might also have a negative impact on Microsoft, since Linux will almost certainly play a major role in ARM’s ability to storm the netbook, “nettop,” notebook and even desktop spaces.

Whatever the outcome, it’s time to pay attention to ARM. Our results clearly demonstrate how it was possible for an ARM chip to steal the Apple iPad away from Intel’s Atom. The Apple iPad might represent merely the first of many ARM victories in its escalating war against the x86 world.

We thank Katie Traut and Phillipe Robin from ARM for the impressively tiny but full featured Freescale i.MX515-powered Pegatron prototype Ubuntu system. We also thank C.J. Holthaus and Glenn Henry from Centaur Technology for the VIA Nano L3050 reference board.

Last summer after eight years there, Van Smith left his job at Centaur Technology to form the company Cossatot Analytics Laboratories. Van was head of benchmarking for Centaur and represented VIA Technologies within the BAPCo benchmark consortium. Van has written a number of computer benchmarks including OpenSourceMark and miniBench and he has influenced or directly contributed to many others. For instance, Van wrote the cryptography tests in SiSoftware Sandra.

Nearly ten years ago, Van departed Tom’s Hardware Guide as Senior Editor to form his own website, Van’s Hardware Journal (VHJ). Van was recently interviewed and quoted in a CNN article based upon his investigative journalism published at VHJ. Van also served as Senior Analyst for InQuest Market Research.



Jul 092010
 

For many people, Apple Computer’s co-founder, Steve Jobs, exists on a higher plane than the one mere mortals occupy. It’s a peculiarity of our modern world that a gadget designer can find himself at the center of a personality cult.

Certainly, Steve Jobs’ uncompromising direction has played a huge role in Apple’s success.  However, out of necessity, Apple employees take a more pragmatic view of the world’s most famous Steve.

The collective company lore, primarily involving Mr. Jobs, is often passed down to new Apple initiates through a series of tales.  One such bit of verbal history unfolds as follows.

A young man entered an elevator at Apple’s 1 Infinite Loop headquarters located smack in the middle of California’s Silicon Valley.  Satisfied after waiting a few seconds that no one else was boarding, he pressed the button for the first floor.  Just before the elevator closed completely, a hand sliced through the narrowing gap, activating the infrared switches and separating the metal doors like Moses parting the Red Sea.  In stepped a purposeful man wearing a black St. Croix turtleneck, Levi 501 blue jeans and a pair of New Balance sneakers.  Preoccupied with thoughts of returning home after a hard day’s work, it took the young man a few seconds to realize that the older man standing next to him, alone in the descending elevator, was the iconic Steve Jobs.

Clearing his throat and attempting to be friendly, the young man chirped a common salutation.  “Hi, Mr. Jobs, how are you?” he queried in a quivering voice sounding about an octave higher than normal.  After a few awkward seconds of silence elapsed, the young man continued, “It’s a beautiful day today, isn’t it?”

Seeming slightly perturbed, the fruity messiah shot back, “So what have you done for Apple lately?”

Temporarily flummoxed by how his innocent elevator ride suddenly turned into a confrontation with one of the world’s most influential businessmen, the young man became distraught and distracted by how his own body could issue forth what seemed like a bucket of sweat instantaneously.  As the elevator bell dinged their arrival on the first floor, the young man suddenly stammered, “Well, I bought an iPod for my little daughter a couple of months ago.”  Smiling meekly at his quick thinking, he attempted to step towards the opening doors but was blocked by the spry Apple president.

Greatly feared throughout his Apple kingdom for his merciless, mercurial temper, Steve Jobs vibrated with anger while his face reddened to a ripe Macintosh hue.  Blood veins began to swell in his neck and forehead as if Mr. Jobs were transforming into a crimson Hulk.

Suddenly and inevitably Steve Jobs erupted, “Is that it? Is that the best thing you can come up with?”

The young man began to quickly realize that he was not handling this encounter well.  The small drop of aerosolized, Jobsian spittle landing in his right eye, blurring his vision slightly, was particularly distracting.

“Yes, I think so,” the young man embarrassingly admitted as he managed to maneuver around the increasingly irate CEO and into the lobby.

“Well, you’re fired!” Jobs shouted, halting the young man in his tracks.  Stepping out towards the young man, Jobs continued, “Go pack up all of your stuff and leave,” he said pointing towards the elevator.

“But you can’t fire me,” the young man insisted.

“Do you know who I am?  I’m Steve Jobs!  I run this company and I can fire anyone I want,” the Apple executive screamed.

“But you can’t fire me,” the young man asserted as a spirit of calmness began to fill him.

“Look, I don’t know who you are, but no one around here is too important to fire besides me,” Steve Jobs angrily asserted, “And you’re fired!”

“No, I’m not,” the young man retorted matter-of-factly.

Now only inches away from the young man’s face, Steve Jobs screamed, “And why not?”

The young man reached into his back pocket and pulled out a wallet.  Opening it, he pointed to an item held inside a clear, plastic flap.  “Because I don’t work here,” the young man stated quietly as he extracted his business card, “I was just here to fix a copier on the fourth floor.”

Jun 272010
 

A sure sign that a company is on its way out is when it starts needlessly defeaturing its products to create artificial market segments rather than innovating new features to add value.  Windows 7 Starter is a good example of this sad phenomenon.

Microsoft originally planned to impose a three application limit to Windows 7 Starter so that no more than three applications could be run at once.  Of course, this outrageous, artificial and completely unjustified limitation would have actually cost Microsoft time and money to implement.  With the exception of a handful of confused apologists, the three app limitation earned the Redmond software giant widespread derision.  Eventually, public scorn caused Microsoft to drop this bone-headed idea before releasing Windows 7 Starter edition last fall.

However, there are other stupid and disingenuous ways Microsoft castrated Windows 7 Starter.  Not only is there an artificially imposed 2GB memory limit, and not only is the fantabulous new snipping tool gone, but Microsoft also stripped Internet Connection Sharing (ICS) from Win7 Starter.  Of course, there are absolutely no technical reasons for killing ICS, a feature that for many years has enabled users to easily set up Windows systems to serve the Internet to home networks.

The Redmond software beast wants you to fork over your hard earned dough through “Windows 7 Anytime Upgrade” to buy back ICS, a particularly vital feature for netbooks that come with integrated broadband cellular modems.  Worse still, netbooks are particularly well-suited for home servers since they use very little power and have a built-in UPS.  So killing ICS from Win7 Starter was a particularly ungreen move for the spawn of Bill Gates.

Well, it’s very easy to overcome all of the limitations your Microsoft overlord imposes.  You’ll need to download Jolicloud Linux here.  A good application to burn the Jolicloud installation image to disk is ISO Recorder which you can download here.   If you don’t have a USB CD-ROM or DVD-ROM drive, you can create a bootable USB key drive by following the instructions here.  You can even install Jolicloud from within Windows if you download Jolicloud Express from here.  Once you’ve created your boot disk/USB key, boot from it.  You can install Jolicloud so that you can choose between Win7 Starter and Jolicloud at boot time, or you can completely expunge Win7 and replace it with Jolicloud.

Boot into Jolicloud and connect your netbook to the Internet.  An added benefit Jolicloud brings is preinstalled broadband cellular modem drivers along with proper settings for many carriers.  When Jolicloud detects a new broadband cellular modem, the Network Manager menu, activated by clicking on the appropriate Gnome Panel icon in the top of the screen, will list a new broadband device.  Clicking on that menu entry will bring up a simple wizard so that you can select your carrier and ensure that the correct number is dialed.  It literally takes about ten seconds to set up a new cellular modem connection in Jolicloud Linux.

To share your Internet connection, whether cellular or otherwise, right-click on the same Network Manager icon and select “Edit Connections…”.  Click the “Add” button on either the Wired or Wireless tab, depending on which way you plan to share your Internet connection.  Give the new connection a descriptive name like “Shared Internet Connection”.  On the IPv4 tab, select “Shared to other computers” as the Method.  Click “Apply”.

Reboot your netbook.  After you sign in, activate the Internet connection in the Network Manager menu if it is not automatically activated.  It might also be necessary to manually activate your “Shared Internet Connection” by clicking on the corresponding Network Manager menu entry.

You should now be actively sharing your Internet connection with your home network.

It’s humorous to note that Microsoft did a predictably sloppy job disabling ICS in Windows 7 Starter.  In fact, it is still available, but only if you want to share your active Internet connection over an ad-hoc wireless network.  In other words, other computers will have to connect to your netbook wirelessly to see the Internet.  To set up this type of Internet connection sharing configuration, simply type “adhoc” in the Windows 7 Start menu search box.  This will filter down to a wizard that enables you to set up an ad hoc ICS network.

Apr 162010
 

A central component of my ARM versus x86 report published last week on Bright Side of News* was my miniBench benchmark.  For that analysis, I ported miniBench to Linux for both x86 and ARM ISAs.

The ARM project can now be downloaded here.  You will need the Code::Blocks IDE to manage the GCC C++ project.  Code::Blocks is available from the Ubuntu repositories so it is very easy to install.  Of course, you will also need to minimally install gcc, g++ and gdb.

The miniBench x86 project is available here.  The x86 Linux binary is here and the ARM binary is here.

I will merge the x86 and ARM projects and upload them to SourceForge shortly.

Apr 132010
 

BSN* has posted another of my articles. In it, I compare iPad browsing performance against an Intel Atom N450 netbook using the four most popular web browsers. The Atom-based system pretty much trounces the iPad except when using Microsoft Internet Explorer 8. IE8 on the Atom makes the iPad look fast.

Additionally, Opera 10.51 beats all comers.

Major kudos are also in order for Opera. Already a big player in the mobile browser market, Opera 10.51 manhandled the competition across all three benchmarks. Safari and Chrome finished a distant second and third respectively. Firefox 3.6.3 kept pace with those two except on the quirky Google V8 benchmark where it stumbled badly.

Apr 082010
 

The popular technology website Bright Side of News* has published an in-depth report I authored comparing an ARM Cortex-A8 microprocessor, used in the Apple iPad’s A4 chip, against a trifecta of x86 CPUs typically found in netbooks, small notebooks and embedded devices.  My report particularly focuses on compute performance.

A major component of that comparison is miniBench, an open source benchmark that I wrote in C++.  I ported miniBench to Linux for both ARM and x86 platforms enabling, for the first time, objective, head-to-head performance comparisons across a wide range of meaningful tests like Dhrystone, Whetstone, FFT, LinPack, MFLOPS, AES, SHA1, SHA256 and many others.
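To give a feel for how a compute microbenchmark of this general kind measures throughput, here is a minimal C++ sketch. It is not taken from miniBench: the kernel `sum_of_squares`, the `Timing` struct and the helpers `time_kernel` and `mflops` are all hypothetical names invented for illustration, and the best-of-N timing policy is simply one common choice.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in kernel (not one of miniBench's tests):
// sums i*i for i = 1..n using double-precision arithmetic.
double sum_of_squares(std::uint64_t n) {
    double acc = 0.0;
    for (std::uint64_t i = 1; i <= n; ++i) {
        const double x = static_cast<double>(i);
        acc += x * x;
    }
    return acc;
}

// Result of timing a kernel: fastest wall-clock run and the kernel's output.
struct Timing {
    double seconds;
    double result;
};

// Runs `kernel` `runs` times and keeps the fastest run, which reduces
// noise from the scheduler and other background activity.
template <typename Kernel>
Timing time_kernel(Kernel kernel, int runs) {
    assert(runs > 0);
    using Clock = std::chrono::steady_clock;
    Timing best{0.0, 0.0};
    for (int r = 0; r < runs; ++r) {
        const auto start = Clock::now();
        const double result = kernel();
        const std::chrono::duration<double> elapsed = Clock::now() - start;
        if (r == 0 || elapsed.count() < best.seconds) {
            best.seconds = elapsed.count();
        }
        best.result = result;
    }
    return best;
}

// Converts an operation count and elapsed time into millions of
// floating-point operations per second.
double mflops(double operations, double seconds) {
    return operations / seconds / 1e6;
}
```

A caller would time the kernel with something like `time_kernel([] { return sum_of_squares(10000000); }, 5)` and then pass twice the iteration count (one multiply and one add per iteration) together with the measured seconds to `mflops`. Because the same source compiles for both ARM and x86 with GCC, the resulting figures can be compared directly across the two architectures.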

While I worked as head of benchmarking for Centaur Technology, we used miniBench to help isolate performance problems in our microprocessors so that we could optimize out the weakest attributes of our chip designs.

In the BSN* report, I also compare performance across a number of popular JavaScript benchmarks and a few other native tests.

The results surprised me.  It is also worth examining the relative compute performance between the new Intel Atom N450, the new VIA Nano L3050 and an old AMD Mobile Athlon based upon the Barton core.

You can read my full report here.

I noticed that Theo generously gave me credit for the recall of the 1.13GHz Intel Pentium III.  I want to make it clear that Tom Pabst discovered the speed path defect that manifested when he was trying to compile the Linux kernel.  My part in the recall was representing Tom’s Hardware at Intel.  The representative for the giant chipmaker initially laughed at the issue until I threatened him with a united call to yank the defective part involving a number of major computer hardware enthusiast websites.