
Part 2: AMD Athlon XP 2000+ Review

By Van Smith and Joel Hruska

Date: January 7, 2002

As promised, here is part two of our review of the AMD Athlon XP 2000+.  In this article we focus on benchmark results.  Part 1 discusses in detail the dramatic differences between AMD and Intel processor architectures.

Note:  This article was originally scheduled to pit the new Athlon XP 2000+ against Intel’s 2.2GHz Northwood.  Unfortunately, Intel was unable to supply us with the necessary chip in time.  The Intel spokesman stressed, however, that this problem was due neither to the widely reported production problems the chipmaker has been having nor to alleged Northwood shortages.

===================================

Benchmark Setup

Component          Athlon XP system             Pentium 4 system
processor          AMD Athlon XP 2000+          Intel Pentium 4, 2GHz, Socket 478
motherboard        Asus A7V266-E, VIA KT266A    MSI 850 Pro 5, i850
memory             512 MB PC2100                512 MB PC800
video card         GeForce 3 Ti 500             GeForce 3 Ti 500
operating system   Windows XP Professional      Windows XP Professional

===================================

Memory Bandwidth

As CPU clock speeds increase relative to their memory interfaces, the memory subsystem becomes an acute bottleneck for tasks whose code or data exceed the capacity of the caches.

There are two major performance metrics for main memory subsystems.  The first is "bandwidth," which designates the rate of data flow.  The second is "latency," the time it takes data to arrive after a request is made.  Even more important is the distinction between regular, predictable memory accesses and random, unpredictable access patterns.  Regular accesses can be accelerated, while random accesses represent the worst-case pattern for a memory subsystem.
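For context on the bandwidth side, the theoretical peak rates of the two memory types in the test systems can be worked out from their standard specifications.  The short sketch below does that arithmetic; the function name is ours, and the figures are nominal module ratings rather than anything measured in this review.

```python
# Theoretical peak bandwidth = clock x transfers per clock x bus width x channels.
# These are nominal specification figures, not measurements from this review.

def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, bus_bytes, channels=1):
    """Return theoretical peak bandwidth in MB/s (1 MB = 10**6 bytes)."""
    return clock_mhz * transfers_per_clock * bus_bytes * channels

# PC2100 DDR SDRAM: 133.33 MHz clock, double data rate, 64-bit (8-byte) bus.
pc2100 = peak_bandwidth_mb_s(133.33, 2, 8)

# PC800 RDRAM: 400 MHz clock, double data rate, 16-bit (2-byte) channel.
# The i850 chipset runs two such channels in parallel.
pc800_dual = peak_bandwidth_mb_s(400, 2, 2, channels=2)

print(f"PC2100 DDR:         {pc2100:.0f} MB/s")
print(f"PC800 RDRAM (dual): {pc800_dual:.0f} MB/s")
print(f"ratio:              {pc800_dual / pc2100:.2f}x")
```

On paper the Pentium 4 platform has roughly half again as much raw bandwidth available, which is worth keeping in mind when reading the bandwidth results.  Peak figures say nothing about latency.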

In our first test below, we examine memory bandwidth.  As you scan the graph from left to right, the bandwidth of L1, L2 and finally main memory is depicted.


The Pentium 4 has the lead, but block prefetching helps the Athlon XP make up ground.

The Intel Pentium 4 performs impressively on this test with significantly higher bandwidth than the Athlon XP.  However, the Athlon XP data series in red enjoys a simple technique called "block prefetching" which significantly closes the gap with the P4.  We will discuss block prefetching in detail in an upcoming article.

Although the bandwidth of the Pentium 4 is impressive in the graph above, this chip's bandwidth performance is not so cut and dried.  By simply adding a single 32-bit integer multiplication to the assignment statement, the P4's throughput crashes.


Note that the cache subsystem makes no difference at all for the Pentium 4 in this test.

Not only does the P4's bandwidth crash, but it falls to such low levels that the cache subsystem is totally neutralized.  Although the Pentium 4 has great bandwidth potential, the Athlon XP is a much better balanced solution.
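The benchmark source is not reproduced here, so the following is only a sketch of the two kinds of inner loop being compared: a plain assignment, and the same assignment with one 32-bit integer multiplication added.  The function names and the `factor` parameter are hypothetical, and Python timings would not reflect the hardware behavior discussed above; the point is just to show how small the change to the loop body is.

```python
from array import array

MASK32 = 0xFFFFFFFF  # truncate products to 32 bits, as a C unsigned int would


def copy_plain(src, dst):
    """Bandwidth kernel: a straight element-by-element assignment."""
    for i in range(len(src)):
        dst[i] = src[i]


def copy_with_multiply(src, dst, factor):
    """Same kernel with one 32-bit integer multiply per assignment.

    'factor' is a hypothetical constant; the article does not say what
    value the real test multiplies by.
    """
    for i in range(len(src)):
        dst[i] = (src[i] * factor) & MASK32


n = 1024
src = array("I", range(n))      # "I" = unsigned int, 4 bytes on common platforms
dst = array("I", [0] * n)

copy_plain(src, dst)
print("plain copy matches source:", dst == src)

copy_with_multiply(src, dst, 3)
print("element 5 after multiply by 3:", dst[5])
```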

===================================

Memory Latencies

In this test, we also vary data set size, but our assignment statements are randomized.  A random element out of a source array is copied to a random target in another array.  Not only is this a very latency intensive test, but it also defeats stride predictors.
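As a rough illustration of the access pattern just described (not the actual test code), the sketch below copies a randomly chosen source element to a randomly chosen target.  The function name, the seed, and the array sizes are our own choices for the example.

```python
import random
from array import array


def random_copy(src, dst, assignments, rng):
    """Copy a random element of src to a random slot of dst, repeatedly.

    Because both indices are random, consecutive accesses have no fixed
    stride for a hardware predictor to lock onto.
    """
    n_src, n_dst = len(src), len(dst)
    for _ in range(assignments):
        dst[rng.randrange(n_dst)] = src[rng.randrange(n_src)]


rng = random.Random(2002)            # fixed seed so runs are repeatable
size = 4096                          # hypothetical data set size, in elements
src = array("I", range(1, size + 1)) # nonzero values so writes are visible
dst = array("I", [0] * size)

random_copy(src, dst, assignments=10_000, rng=rng)

touched = sum(1 for v in dst if v != 0)
print(f"{touched} of {size} targets written")
```

Varying `size` so that the arrays fit in L1, then L2, then only in main memory is what turns a loop like this into a latency curve.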


Notice that the Athlon XP 1900+ outperforms the XP 2000+ in main memory.

The Athlon XP completely dominates this test.  Here, the Pentium 4's deep pipeline is a severe hindrance.  Note, however, the interesting quirk where the Athlon XP 1900+ outperforms the Athlon XP 2000+ in main memory.  The Epox motherboard appears to have a faster memory subsystem than the Asus board used in the reference system.

Although the Athlon XP is king of this test by large margins, the Athlon was not.  Below is a version of the same test that uses an eight megabyte dataset.  In this case, the probability that any two assignments fall within cache is minimized.


While the Athlon XP shines on this test, the Athlon is a poor performer.

The huge difference between the Athlon XP and Athlon is likely due to the XP's hardware data prefetch engine.  Note that the Pentium 4 falls almost exactly between the Athlon XP and the Athlon.  Also note that the Epox motherboard beats the Asus product once again.

===================================

SiSoftware's Sandra2002

SiSoftware has gone a long way toward correcting the problems we mentioned in an earlier article.  The benchmark now uses the most appropriate optimizations for both Intel and AMD platforms.  The memory benchmark even uses the block prefetch technique used in BandwidthBurn and reports estimated bandwidth efficiencies.  The difference between the Athlon XP and the Pentium 4 has narrowed greatly, while both benefit from the new optimizations.


Other than memory bandwidth, it's a clean sweep for the Athlon XP.

Perhaps the most startling aspect besides the much higher memory bandwidth scores is the disparity in FPU performance.  Many hardware reviewers incorrectly post the SSE2 performance of the Pentium 4 instead of the actual FPU score.  If I have a remaining bone to pick with SiSoftware, it is to make the distinction between SSE2 and true FPU performance.

===================================

Various CPU Tests

Below are a number of different CPU tests.


The Athlon XP easily beats the Pentium 4 in both Science Mark and CPUMark99.

It is startling how closely Dr. Tim Wilkens' Science Mark, a high-end, floating-point-intensive scientific computing benchmark, tracks CPUMark99.  In both tests, the Athlon XP easily trumps the Pentium 4.

FPU WinMark99 shows similar results.


The Athlon XP wins big again.

Finally, here are results from some of our own tests.


The Intel Pentium 4 trails in every test.

"Spin" is a floating point intensive simulation of the rotation of a non-symmetric rigid body about its intermediate inertia tensor.

===================================

3D Graphics Tests

The first graph below shows results from running the popular 3D first person shooter, Serious Sam, at 800x600x16.  The "Memphis" demo is used.  Patch 100C has been applied.


Athlon XP over the Pentium 4 by a wide margin.

The Athlon XP wins yet again -- and by a very wide margin.

AMD quietly produces its own 3D test called nBench, although it is not being pushed by the chipmaker.


The Athlon XP trumps P4 by a greater degree than in Serious Sam.

Here is a table with complete nBench results.  The data is lopsided and needs no further comment.

Test             Athlon XP 2000+ / Asus    Pentium 4 / 2GHz / i850
nbench           8666                      5491
Game1 (Low)      1499                      982
Game1 (High)     1524                      995
Game2 (Low)      12073                     9111
Game2 (High)     6699                      6240
Game3 (Low)      18639                     8612
Game3 (High)     13336                     8308
The Surface      5544                      3398
2D Graphics      4585                      3801
Space Fighter    11438                     6708
Your Fighter     11323                     6757
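
To put a number on "lopsided," the scores can be turned into Athlon XP to Pentium 4 ratios.  The snippet below is just that arithmetic on the table's figures; the variable names are ours.

```python
# Scores copied from the nBench table: (Athlon XP 2000+, Pentium 4 2GHz).
scores = {
    "nbench":        (8666, 5491),
    "Game1 (Low)":   (1499, 982),
    "Game1 (High)":  (1524, 995),
    "Game2 (Low)":   (12073, 9111),
    "Game2 (High)":  (6699, 6240),
    "Game3 (Low)":   (18639, 8612),
    "Game3 (High)":  (13336, 8308),
    "The Surface":   (5544, 3398),
    "2D Graphics":   (4585, 3801),
    "Space Fighter": (11438, 6708),
    "Your Fighter":  (11323, 6757),
}

# Ratio above 1.0 means the Athlon XP scored higher on that test.
ratios = {name: xp / p4 for name, (xp, p4) in scores.items()}

# Print from the widest Athlon XP lead to the narrowest.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<14} {ratio:.2f}x")
```

The Athlon XP leads on every line; the widest gap is Game3 (Low) and the narrowest is Game2 (High).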

===================================

BAPCo

Extrapolating from the very one-sided set of results that we have obtained so far, it would seem safe to predict that the Athlon XP will dominate application level tests as well.

Although SysMark2000 would not run to completion in Windows XP, we were able to run selected application tests.  Their results are displayed below.


Other than WME which is SSE enhanced, but does not recognize the Athlon XP, the results look familiar.

And, true to form, the Athlon XP dominates in familiar fashion.  The only exception is Windows Media Encoder which has SSE optimizations, but doesn't enable them for the Athlon XP.

We have outlined our objections to BAPCo tests many times.  BAPCo, an organization that resides at Intel's headquarters, is widely recognized as a front for Intel.

Below are results for SysMark2001, a test that just so happens to greatly improve the Pentium 4's relative performance.  This is accomplished by stressing bandwidth intensive operations and exploiting processor specific optimizations.

SysMark2001 ships with a version of Windows Media Encoder that must be installed before the benchmark can run.  Although current versions of WME correctly recognize the SSE capability of the Athlon XP, the version shipped with SysMark2001 does not and the impact is profound.  WME makes such a difference because it cranks through tasks in the background, hogging CPU cycles from other applications.


WME: the difference between winning and losing.

As is readily apparent, WME makes the difference between the Athlon XP winning or losing the Internet Content Creation test.

Aside from these questionable issues, SysMark2001 is simply a bad benchmark.  For one, it obfuscates even further than SysMark2000 what is actually being tested.  SysMark2001 has become simply a black box test where the user pushes a button and the program later spits out three meaningless numbers.

===================================

Benchmark Observations

The results we have looked at overwhelmingly underscore the Athlon XP's superiority on a broad range of tasks.  Even in BAPCo's SysMark2001, the Athlon XP dominates its Intel rival and slips by reported scores for the 2.2GHz P4.  Although there are tasks where the Pentium 4 may pull ahead -- namely those that stress raw bandwidth or possess SSE2 optimizations -- by and large the Athlon XP will come out on top much more often than not.

===================================

Part 3, the conclusion, will follow later this week.
