Running a Java 1.5 SAX XML parser on the UltraSparc T1 is five times slower than running it on my Athlon 2800+.
That's about right. Five times slower. Now when I took delivery of my new T2000, I thought, “Hey, this baby's gonna rock!” Then weird stuff happened.
First, I ran the performance tests for my Java CSV Component. Twice as fast. Okaaay. Well, fair enough, it is faster. Would have expected more though.
Then, I ran the performance tests for my Java XML Component. Five times slower. Eek!
Alright, I thought, slow disks! Nope. Doing the tests in memory led to the same performance. And the disks were slow, in case you're wondering.
OK. Maybe my code is to blame. Let's do a pure and simple SAX parse, with no actions. Still five times slower! So now I was getting depressed. My lovely new T2000. Broken! Or worse: Crap!
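For the curious, a do-nothing SAX parse looks roughly like the sketch below. This is not my actual test harness: the class name `NoOpSaxParse`, the helper methods, and the synthetic document are all made up for illustration, and it uses current Java syntax rather than 1.5. The point is only that the handler ignores every event and the input lives in memory, so the timing covers the parser and nothing else.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class NoOpSaxParse {

    // Builds a synthetic document in memory so disk speed is out of the picture.
    // (Hypothetical test data, not the document used in the post.)
    static byte[] buildSampleXml(int rows) {
        StringBuilder sb = new StringBuilder("<rows>");
        for (int i = 0; i < rows; i++) {
            sb.append("<row id=\"").append(i).append("\">value ").append(i).append("</row>");
        }
        sb.append("</rows>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Parses the bytes with a handler that ignores every event.
    // Returns the elapsed time in nanoseconds.
    static long timeParse(byte[] xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            long start = System.nanoTime();
            // DefaultHandler's callbacks are all empty, so this is a "no actions" parse.
            parser.parse(new ByteArrayInputStream(xml), new DefaultHandler());
            return System.nanoTime() - start;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = buildSampleXml(100_000);
        timeParse(xml); // warm-up pass so JIT compilation does not dominate the number
        long nanos = timeParse(xml);
        System.out.println("No-op SAX parse: " + nanos / 1_000_000 + " ms");
    }
}
```

Run the same class on both machines and compare the printed times. One warm-up pass is a crude way to deal with the JIT, so treat the result as a ballpark figure.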
So I went out for lunch.
And then it hit me. Duh! 32 hardware threads over 4 CPUs (and that's just the minimum config). And I was only exercising one of them! My performance test only runs in one thread. You see, the T2000 is designed for throughput. Serving lots of web requests all at the same time. That sort of thing.
[Update: I wrote the original post from memory; there are only 16 hardware threads. But I'll leave the rest of this post the way it is, so you can have a few laughs. The final conclusion still holds, as you can get higher-spec T2000s with more CPUs.]
So I ran a really brain-dead test. I opened up a whole load of consoles and started the test on all of them, all at the same time.
Not a wince. Not a whine. Cool as a breeze. Same speed on all consoles! The T2000 just laughed at me. A few samples from mpstat, and I was a happy camper again. You see, each thread may be a bit on the slow side, but you do have 32 of 'em. So if each one is five times slower, you need five threads to do the same work. But: five into 32 total threads gives you six or so. Which means:
Six Times Faster! Baby!
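If you'd rather not open a pile of consoles, the same experiment can be sketched inside one JVM with a thread pool. Again, this is an illustration and not what I actually ran: `ConcurrentSaxParse`, `runConcurrently`, `relativeThroughput` and the sample document are hypothetical names, and the estimate assumes every hardware thread scales perfectly, which is the same back-of-the-envelope assumption as the arithmetic above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class ConcurrentSaxParse {

    // Back-of-the-envelope estimate: total throughput relative to the reference
    // machine, assuming perfect scaling across all hardware threads.
    static double relativeThroughput(int hardwareThreads, double slowdownPerThread) {
        return hardwareThreads / slowdownPerThread;
    }

    // Synthetic in-memory document (hypothetical test data).
    static byte[] sampleXml(int rows) {
        StringBuilder sb = new StringBuilder("<rows>");
        for (int i = 0; i < rows; i++) {
            sb.append("<row>").append(i).append("</row>");
        }
        return sb.append("</rows>").toString().getBytes(StandardCharsets.UTF_8);
    }

    // Starts `workers` no-op parses at the same time and returns the elapsed
    // milliseconds measured by each worker.
    static List<Long> runConcurrently(byte[] xml, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                tasks.add(() -> {
                    long start = System.nanoTime();
                    // SAX parsers are not thread-safe, so each task creates its own.
                    SAXParserFactory.newInstance().newSAXParser()
                            .parse(new ByteArrayInputStream(xml), new DefaultHandler());
                    return (System.nanoTime() - start) / 1_000_000;
                });
            }
            List<Long> millis = new ArrayList<>();
            for (Future<Long> result : pool.invokeAll(tasks)) {
                millis.add(result.get());
            }
            return millis;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        byte[] xml = sampleXml(100_000);
        for (int workers : new int[] {1, 2, 4, 8, 16, 32}) {
            System.out.println(workers + " workers, ms each: " + runConcurrently(xml, workers));
        }
        System.out.println("Estimate with 32 threads: " + relativeThroughput(32, 5.0) + "x");
        System.out.println("Estimate with 16 threads: " + relativeThroughput(16, 5.0) + "x");
    }
}
```

On a throughput box you'd hope the per-worker times stay roughly flat as the worker count climbs toward the number of hardware threads. On a single-core desktop they should grow more or less in step with the worker count.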
So what you get with the T2000 is a big old scalability lever, and a fairly small performance lever. Same old story really. There's always a trade-off.
Now maybe I was doing something wrong. I don't know. You tell me. I've been out of the game for a while when it comes to Solaris.
So I don't know what this means for my report on the T2000. The single thread numbers suck, but the concurrency rocks. Guess I'll need some new charts!