Thread: Memory bandwidth tests... any real differences (PC5300 vs. PC8888)

  1. #1

    Common sense tells you that higher memory bandwidth should mean faster results, right? I set out to put this thought to the test by looking at just two different memory dividers on my o/c'ed Q6600 system. At a FSB of 333 MHz, the slowest and fastest dividers I could run are:

    1:1 a.k.a. PC5300 (667 MHz)
    3:5 a.k.a. PC8888 (1,111 MHz)



    Just for reference, as they relate to DDR2 memory:
    Code:
    PC4300=533 MHz
    PC5300=667 MHz
    PC6400=800 MHz
    PC7100=900 MHz
    PC8000=1,000 MHz
    PC8500=1,066 MHz
    PC8888=1,111 MHz
    PC10600=1,333 MHz
    The highest divider is 1:2 aka PC10600 (1,333 MHz) and it just wasn't stable with my hardware @ 333 MHz.
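
    For anyone wondering where those numbers come from, here is a rough Python sketch (not part of the original tests; the function names are made up for illustration). It assumes the divider is written FSB:DRAM, that DDR2 transfers data twice per memory clock, and that the "PC" rating is roughly the peak MB/s of one 64-bit module (8 bytes per transfer).

```python
FSB_MHZ = 333.34  # FSB used in this thread


def ddr2_rate_mhz(fsb_mhz, fsb_part, dram_part):
    """Effective DDR2 data rate for an FSB:DRAM divider (e.g. 3:5)."""
    dram_clock_mhz = fsb_mhz * dram_part / fsb_part
    return 2.0 * dram_clock_mhz  # double data rate: two transfers per clock


def peak_mb_per_s(rate_mhz):
    """Theoretical peak of one 64-bit module: 8 bytes per transfer."""
    return rate_mhz * 8.0


if __name__ == "__main__":
    for fsb_part, dram_part in [(1, 1), (3, 5), (1, 2)]:
        rate = ddr2_rate_mhz(FSB_MHZ, fsb_part, dram_part)
        print(f"{fsb_part}:{dram_part} -> DDR2-{rate:.0f}, "
              f"about {peak_mb_per_s(rate):.0f} MB/s peak")
```

    The MB/s figures land close to the PC5300 / PC8888 / PC10600 labels, which are rounded marketing names rather than exact values.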

    All other BIOS settings were held constant:
    FSB = 333.34 MHz and multiplier = 9.0 which gives an overall core rate of 3.0 GHz.
    DRAM voltage was 2.25V and timings were 5-5-5-15-4-30-10-10-10-11.

    You can think of memory bandwidth as the diameter (size) of your memory's pipe. Quite often, the pipe's diameter isn't the bottleneck for a modern Intel-based system; it is usually much larger than the information flow to/from the processor. Think of it this way: if you can only flush your toilet twice per minute, it doesn't matter if the drain pipe connecting your home to the sewer is 3 inches around, or 8 inches around, or 18 inches around. The rate-limiting step in removing water from your home is the toilet flushing/recycling and the pull of gravity, not the size of your drain line. The same is true for memory bandwidth.

    After seeing the data I generated on a quad core @ 3.0 GHz, I concluded that this toilet analogy is pretty true: the higher memory bandwidth gave more or less no appreciable difference for real world applications. Shocked? I was.

    Further, I should point out that in order for my system to run stable in PC8888 mode @ a FSB of 333, I had to boost my NB vcore two notches and raise my ICH voltage to the max (both of which the BIOS colored red, meaning "high risk"). The increased voltage means more heat production and greater power consumption -- not worth it for the small gains realized, in my opinion. Anyway, the test details and results are below if you want to read on.



    Relevant test hardware:

    Motherboard: Asus P5B-Deluxe (BIOS 1215)
    CPU: Intel C2Q - Q6600 (B3 revision)
    Memory: Ballistix DDR2-1066 (PC2-8500)

    "Real-World" Application Based Tests

    I chose the following apps: lameenc, x264, WinRAR, and the trial version of Photoshop CS3. I ran these tests on a freshly installed Windows XP Pro SP2 machine.

    Lame version 3.97 – Encoded the same test file (about 60 MB wav) with these commandline options:
    Code:
    lame -V 2 --vbr-new test.wav
    (which is equivalent to the old --alt-preset fast standard) a total of 8 times and averaged the play/CPU ratio as the benchmark.

    x264 version 0.55.663 – Ran a 2-pass encode on the same MPEG-2 (720x480 DVD source) file 5 times in total and averaged the results. Without getting into too much detail, the benchmark clip is 1,749 frames @ 23.976 fps. Based on these numbers, I reported the time it would take to encode 215,784 frames (which is your average 2.5 h of video @ 23.976 fps). Why did I do this? The differences over just 1,749 frames were too small to be meaningful.
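
    The extrapolation is plain linear scaling by frame count. A minimal Python sketch, assuming encode time is proportional to the number of frames (the function names and the 60-second sample time are hypothetical, not measured values):

```python
BENCH_FRAMES = 1_749      # frames in the benchmark clip
TARGET_FRAMES = 215_784   # frames in 2.5 h of video
FPS = 23.976


def extrapolate_seconds(bench_seconds,
                        bench_frames=BENCH_FRAMES,
                        target_frames=TARGET_FRAMES):
    """Scale a measured benchmark time up to a longer clip, linearly."""
    return bench_seconds * target_frames / bench_frames


def hms(seconds):
    """Format a duration in seconds as HH:MM:SS."""
    total = round(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    # Sanity check: the target frame count corresponds to 2.5 hours.
    print(TARGET_FRAMES / FPS / 3600)
    # Hypothetical benchmark time of 60 s, for illustration only.
    print(hms(extrapolate_seconds(60.0)))
```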

    Shameless promotion --> you can read more about the x264 Benchmark at this URL which contains results for hundreds of systems. You can also download the benchmark and test your own machine.

    RAR version 3.62 – rar.exe ran my standard backup batch file, which generated about 1.09 GB of RAR files from 1,654 source files. Here is the commandline used:
    Code:
    rar a -u -m0 -md2048 -v51200 -rv5 -msjpg;mp3;tif;avi;zip;rar;gpg;jpg  "E:\Backups\Backup.rar" @list.txt
    where list.txt is a list of all the dirs I want it to back up. Benchmark results are an average of two runs timed with a stopwatch.

    Trial of Photoshop CS3 – The batch function in PSCS3 was used to do three things to a total of twenty-nine, 10.1 MP jpeg files:

    1) bicubic resize 10.1 MP to 2.2 MP (3872x2592 --> 1800x1200) which is the perfect size for a 4x6 print @ 300 dpi.
    2) unsharp mask filter (60 %, 0.8 px radius, threshold 12)
    3) saved the resulting files as a quality 8 jpg.

    Benchmark results are an average of two runs timed with a stopwatch.

    "Synthetic" Application Based Tests

    Just two of these were chosen to illustrate a point about theoretical gains vs. real world gains. Actually, I did SuperPI for the hell of it. WinRAR served to illustrate that point.

    SuperPI / mod1.5 XS – The 16M test was run twice, and the average of the two is the benchmark.

    WinRAR version 3.62 – If you hit alt-B in WinRAR, it'll run a synthetic benchmark. This was run twice (stopped after 100 MB) and is the average of two runs.

    Raw Data - "Real-World" Apps
    Lameenc play/cpu (average 8 runs) @ PC5300: 30.7935
    Lameenc play/cpu (average 8 runs) @ PC8888: 30.8045
    Result: PC8888 is 0.04 % faster (30.8045 / 30.7935 = 1.0004)

    x264 time to encode 2.5 h DVD @ PC5300: 01:48:54
    x264 time to encode 2.5 h DVD @ PC8888: 01:46:14
    Result: PC8888 is 2.5 % faster

    rar.exe back-up (average 2 runs) @ PC5300: 45 sec
    rar.exe back-up (average 2 runs) @ PC8888: 44 sec
    Result: PC8888 is 2.2 % faster

    Photoshop CS3 Trial batch (average 2 runs) @ PC5300: 33 sec
    Photoshop CS3 Trial batch (average 2 runs) @ PC8888: 33 sec
    Result: PC8888 is 0.0 % faster

    So stop right here and ask yourself if a 2-3 % gain is worth the higher voltage and heat.

    Raw Data - "Synthetic" Apps

    SuperPI/16M test (average 2 runs) @ PC5300: 8 m 8.546 s
    SuperPI/16M test (average 2 runs) @ PC8888: 7 m 33.328 s
    Result: PC8888 is 7.8 % faster

    Winrar internal benchmark (average 2 runs) @ PC5300: 1,515 KB/s
    Winrar internal benchmark (average 2 runs) @ PC8888: 2,079 KB/s
    Result: PC8888 is 37.2 % faster

    ...but who uses their system exclusively for running internal and synthetic benchmarks? Recall that for my 1.09 GB backup, I only gained about 2 % doing "real work" by using the higher divider. Hard drives are notorious bottlenecks that serve to nullify any memory bandwidth increases. In this case the 37 % theoretical increase translated into only a 2 % "real world" increase, likely limited by how fast the hard drive/rar could read and write the data. Again, this seems kinda wasteful to me.
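
    If you want to check the percentages yourself, here is a short Python sketch using the raw numbers above. It assumes "X % faster" means the ratio of speeds minus one (slow time divided by fast time for timed runs, fast rate divided by slow rate for KB/s figures); that definition is my assumption, and the helper names are made up.

```python
def seconds(h=0, m=0, s=0.0):
    """Convert hours/minutes/seconds to seconds."""
    return h * 3600 + m * 60 + s


def pct_faster_from_times(slow_s, fast_s):
    """Speedup in percent when the inputs are run times (lower is better)."""
    return (slow_s / fast_s - 1.0) * 100.0


def pct_faster_from_rates(slow_rate, fast_rate):
    """Speedup in percent when the inputs are rates (higher is better)."""
    return (fast_rate / slow_rate - 1.0) * 100.0


x264 = pct_faster_from_times(seconds(1, 48, 54), seconds(1, 46, 14))
superpi = pct_faster_from_times(seconds(m=8, s=8.546), seconds(m=7, s=33.328))
winrar = pct_faster_from_rates(1515, 2079)

print(f"x264: {x264:.1f} %  SuperPI: {superpi:.1f} %  WinRAR: {winrar:.1f} %")
# prints: x264: 2.5 %  SuperPI: 7.8 %  WinRAR: 37.2 %
```

    The stopwatch-timed runs are left out on purpose: with whole-second timing on 33 to 45 second runs, the last digit of the percentage doesn't mean much.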

    I will admit that there might be special cases where running at high memory dividers produces more substantial gains: apps such as folding@home or seti@home may benefit from the higher memory bandwidth since they lean heavily on system memory and rely much less on the hard drive. I have no data to back this up, though. Also lacking in my experiments are any game data. I'd be interested in knowing if the higher bandwidth can be leveraged by game engines such as UT3, Crysis, etc., but I didn't look at these here either.

    Finally, since I held everything else constant, I didn't look at the tighter timings in 1:1 mode that people can often use which may give additional gains. For example, I can get away with 3-3-3-9 @ 1:1 vs. the slower 5-5-5-15 @ 3:5 with this memory.
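
    To put those timings on a common scale, here is a back-of-the-envelope Python sketch (not something I benchmarked; the function name is made up). It assumes CAS latency is counted in memory clock cycles and that the memory clock is half the DDR2 data rate. It only covers the first-word CAS delay, not the other timings or sustained bandwidth.

```python
def cas_latency_ns(cas_cycles, data_rate_mhz):
    """CAS latency in nanoseconds for a given DDR2 data rate."""
    clock_mhz = data_rate_mhz / 2.0       # DDR: two transfers per clock
    return cas_cycles / clock_mhz * 1000.0


if __name__ == "__main__":
    print(f"CAS 3 @ DDR2-667  (1:1): {cas_latency_ns(3, 667):.1f} ns")
    print(f"CAS 5 @ DDR2-1111 (3:5): {cas_latency_ns(5, 1111):.1f} ns")
```

    Under those assumptions both settings come out at roughly 9 ns, so the tighter 1:1 timings make up in latency what the lower clock loses.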

    Anyway, I hope you found this useful and maybe this will inspire someone else to look at the gaps pointed out above (and the gaps I haven't thought of too!)

  2. #2

    Awesome work again Graysky. I saw this thread title and got excited because I knew you would come to the table with some great data.

    Stick this one to the top.

  3. #3

    I would be interested to see 1:1 with the tightest timings you can run stable vs 3:5 with the tightest timings you can run stable so that we could compare the two. It would make a difference in RAM purchasing decisions if we found that lower-speed RAM with tighter timings (which is probably cheaper) is equal to or faster than higher-speed RAM with looser timings.

    Also, doing these tests for "real world" gaming apps with something like Fraps while running Crysis, UT3, EverQuest 2, and other games would be interesting, as would benchmarks like 3DMark06 and the new Vista-based 3DMark product.

    I'm a gamer at heart and eventually plan to buy the "best of the best" for running in 1920x1200 full AA etc.

    Finally, I'm interested in what it would look like if you ran x264 at 1080 resolution (progressive and interlaced) to see the differences there.
    Intel E8500 4.0ghz (450x9.0) @1.275v (bios) 1.21v load
    Asus Rampage Formula X48
    EVGA gtx260 [756/1512/1248(2484)]
    4g (2x2G) PC2-8500 Corsair Dominator @1081mhz
    Antec Armor + Case
    Ultra X3 1000W PSU
    T.R.U.E. 120

  4. #4

    Thanks for the kind words and the sticky, jebo.
    @gmpotu - I agree, but I don't do a whole lot of gaming so I don't own any of those titles except the demo of UT3. Plus I returned the memory since my DDR2-800 Ballistix were more or less equally fast. If you have all those games and some flexible memory, give it a shot and let us know!

    As to your question about the x264 encode to 1080i -- x264 scales in a highly linear fashion. In other words, take the time it took your machine to do the 720x480 clip and find the ratio of the area difference to your target resolution. 1080p and 1080i = 1920x1080... the catch is that you also have to account for differences in framerates. 480p is typically 23.976 fps whereas 1080i and p can be 24, 25, 30, 50 or 60 fps for example.

    If the framerates are held constant, it's just:

    720x480 = 345,600 pixels
    1920x1080 = 2,073,600 pixels

    a/345600 = c/2073600

    where a is encode time for 480p
    solve the equation for c (cross multiply and divide)

    Example with my machine @ 9x333 (a = 52 sec):

    c = 312 sec
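
    The same cross-multiplication as a small Python sketch, assuming encode time scales linearly with pixel count as described above (the function name and the fps_ratio parameter are just for illustration; fps_ratio is target fps divided by source fps and stays at 1.0 when framerates are equal):

```python
def scale_encode_time(seconds, src=(720, 480), dst=(1920, 1080), fps_ratio=1.0):
    """Estimate encode time at a new resolution by scaling with pixel area."""
    src_pixels = src[0] * src[1]   # 345,600 for 720x480
    dst_pixels = dst[0] * dst[1]   # 2,073,600 for 1920x1080
    return seconds * dst_pixels / src_pixels * fps_ratio


if __name__ == "__main__":
    print(scale_encode_time(52))   # prints 312.0
```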

  5. #5

    So for 1080 you save ~16 minutes instead of 2min 40s.

    As for me testing: my comp is a POS because I never have the $ to spend on cool new parts. I do have about 8 different MMORPGs, though, if I ever did get a new system.

  6. #6
    Skip Da Shu

    Quote:
    Think of it this way, if you can only flush your toilet twice per minute, it doesn't matter if the drain pipe connecting your home to the sewer is 3 inches around, or 8 inches around, or 18 inches around: the rate limiting step in removing water from your home is the toilet flushing/recycling and the pull of gravity, not the size of your drain line.

    Yea Bubba, but wait till yar ol' woman uses half a roll of TP and then wonders why the toilet overflows and the thing will hardly flush for a week. Then you'll wish you had 6" pipe instead of 4" out to the main! :cursing:

    Sorry, I finally found a way / place to bitch about this. :blushing:


    [now trying to redeem himself]
    I have a B3 Q6600 cruncher running at 333 with some DDR2-800 in it. I may try to see what happens to the BOINC benchmarks running 667 vs its current 833 at the 4:5 ratio (my only 2 options). The latency timings are already 4-4-4-10. I haven't tried pushing it down to CAS3 at 667, but it's worth a shot.

    Last time I looked (on an AMD machine), anything that caused memtest to show greater memory bandwidth seemed to have a minor effect on the integer (Dhrystone) side of the benchmark but next to nothing on the floating point (Whetstone) side.

    I'll see if I can locate an updated version of their old "standard work unit" to run. If not I'll just get the BOINC benchmark numbers and post them this weekend.
    - da shu @ the BOINC farm, SkipsJunk, Guru Mountain, Crunchers
    "Free software" is a matter of liberty, not price. To understand the concept, you should think of "free" as in "free speech", not as in "free beer".

  7. #7

    @SDS - Sounds good. Looking forward to the results.
