Monday, May 24, 2010

One thing fast is better than many things slower.



Yes, it's been a while, but I'm going to make up for it right now. Not only did we move our production account from 7.7.2 to 4.0.5, we also switched CPUs, and we're now running on Red Hat. Why? Money and performance, but mostly performance.

Before:

7.7.2
Sun V880 - 8 x 1.2 GHz
RAM - 16GB
Hosting: Database, webserver, DGW test and prod.
Performance: 15-20 second audits.


After:

4.0.5
2 x Dell T100 - Intel Quad 3.0 GHz
RAM - 4 GB
Hosting: Prod account only
Performance: 1-3 second audits.

Yes, that is a HUGE jump. My final grade batches went from 3 days to less than one day.
I could run them faster, but 4.0.5 has some DGWCPUCOUNT issues that prevent me from running
simultaneous batches. I've got an SR open for it.

The best part of this migration is that my new Dell servers only cost me $1000 EACH!!

What happened to the Sun T5220s? I was pretty disappointed. I was hoping to take advantage of their award-winning multi-threading, but the problem was that DegreeWorks is not a multi-threaded app. I tried EVERYTHING to make the T5220 work for us. (That's why I hadn't posted in a while.) The ONLY advantage of the T5220 at 1.6 GHz is that you can batch faster by setting DGWCPUCOUNT to something like 10 and having 10 batch jobs running at the same time.

Listen closely now because this is a proven theory.

CPU speed is the key to increasing performance.

There. I said it. Yes, throw money at it, but not a lot. Since DW is single-threaded, it will never take advantage of multi-threaded processors when it comes to dynamic web audits. So why have a server that does multi-threading at a slow speed (SPARC 1.6 GHz) when the app will never take advantage of it? So I bought a server that does one thread really, really fast (Intel 3.0 GHz).

*(Note: Sungard also recommends a processor at 3.0 GHz or higher.)*
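
The reasoning above can be sketched as a toy model. This is a rough illustration only: it assumes each audit is a fixed amount of single-threaded work and ignores I/O, the database, and differences in per-clock efficiency between SPARC and Intel (so the clock ratio alone would not explain the full speedup we measured). The function names, the `WORK` figure, and the worker counts are made up for the example.

```python
import math

def audit_latency(work_gigacycles, clock_ghz):
    """Seconds for one single-threaded audit: work divided by clock speed."""
    return work_gigacycles / clock_ghz

def batch_minutes(n_audits, work_gigacycles, clock_ghz, workers):
    """Minutes to finish a batch when `workers` audits run side by side."""
    rounds = math.ceil(n_audits / workers)
    return rounds * audit_latency(work_gigacycles, clock_ghz) / 60

WORK = 24.0  # hypothetical gigacycles per audit, chosen only for illustration

# (label, clock in GHz, number of simultaneous batch workers) -- all hypothetical
servers = [("slow clock, 10 workers", 1.6, 10), ("fast clock, 4 workers", 3.0, 4)]

for label, ghz, workers in servers:
    print(f"{label}: {audit_latency(WORK, ghz):.1f} s per web audit, "
          f"{batch_minutes(1000, WORK, ghz, workers):.1f} min per 1000-audit batch")
```

In this model the extra workers only help the batch total. The time a single student waits for a web audit depends on the clock alone, which is the whole point.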

I know about the WEB09M03 and, in my opinion, it does absolutely nothing. In fact, I think it makes things worse. It basically uses one thread to jump between several requests. This is an old configuration which helped with managing COBOL licenses in the old versions (7.7.2 and earlier).

Those T5220s didn't go to waste. One is used as the primary database for DegreeWorks and the other is used for our test environment.

Note that separating the database from the app also made a huge difference. There is a lot of I/O going on. Now Oracle (a major resource hog) and DW don't have to compete for resources any more. This is one thing every site should do.

This layout is perfect for us right now. I'm on the verge of releasing what-if to students because it's performing so well. We never could give it to the students because 20,000 students hitting what-if at a rate of 20-second audits on a single thread would give me nightmares.
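
To put a number on that nightmare, here is a back-of-the-envelope sketch. It assumes the worst case, where every one of the 20,000 students requests one what-if audit and the requests are served strictly one after another. The function name is just for illustration.

```python
def serial_hours(n_audits, seconds_per_audit):
    """Hours needed to serve n_audits one at a time on a single thread."""
    return n_audits * seconds_per_audit / 3600

STUDENTS = 20_000

for label, secs in [("old, 20 s audits", 20), ("new, 3 s audits", 3), ("new, 1 s audits", 1)]:
    print(f"{label}: {serial_hours(STUDENTS, secs):.1f} hours")
```

That works out to roughly 111 hours at 20 seconds per audit, versus about 6 to 17 hours at 1 to 3 seconds.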

I could go on and on about this but this is enough to chew on for a while. More posts to come. I promise.

4 comments:

  1. Hey Derrek, Thanks for the information!! We are currently looking into upgrading to a new box and this information is exactly what I've been looking for!! We are currently running on an UltraSPARC T2 T5200, with 8 cores/64 Threads and 1.60 GHz. I decided to run a test last week by opening up the thread counts to try to speed up our batch audits and came up with the following results:

    Running batches of the same students totaling - 1,104.
    Number of DAP09s and WEB04s set to the same number as the DGWCPUCOUNT at each level.

    DGWCPUCOUNT set to 8
    Total Time to Run: 93 Minutes
    # Audits per Core: 138
    Time per Audit: 40 Seconds

    DGWCPUCOUNT set to 16
    Total Time to Run: 51 Minutes
    # Audits per Core: 69
    Time per Audit: 45 Seconds

    DGWCPUCOUNT set to 32
    Total Time to Run: 34 Minutes
    # Audits per Core: 35
    Time per Audit: 58 Seconds

    Results are pretty interesting, especially the sharp increase in Time per Audit between DGWCPUCOUNT 16 and 32. We had already figured that it would be best to run on faster processors instead of more cores, or maybe a combination of both, especially when we have 43,000+ students loaded into DegreeWorks and growing!! Right now though, it's taking anywhere from 45 seconds to over a minute to "Process New" a new audit, a number that we definitely need to bring down!!

  2. Almost forgot, in regards to your WEB09M03 file. Blow it away!! Delete it and be done with it once and for all. You should use the UCXCFG020-WebParms record to set your number of DAP09s, WEB04s and UTL79s. I did this and haven't looked back since!!

  3. Thanks, Rick. I'll play with that UCX table. I'll try anything. We recently decided to retain more students in DW and our system now has 70,000+ students.

    45-second audits! Wow! They still exist. You're not alone in this department. Many other sites have it worse.

    Keep me posted!

  4. Hi Derrek,

    Happy New Year!

    You may not remember me, but we met at the DW symposium a couple years ago. I'm a colleague of Bob Cloutier and Melissa Brown at Boston University. We are making the move from 7.7.2 to 4.0X this year. A question for you: How did you manage the migration of audit exceptions from 7.7.2 to 4.0X? Any tips or recollections you have regarding the migration of exceptions would be very helpful. (or anyone from the blog!) Thank you very much.

    Brad Peloquin

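
The batch figures in the first comment can be rechecked with a few lines of Python. This is only a sketch: it assumes the 1,104 audits were split evenly across the DGWCPUCOUNT workers, and the function names are made up for the example.

```python
TOTAL_AUDITS = 1104

# (DGWCPUCOUNT, total run time in minutes), as reported in the first comment
runs = [(8, 93), (16, 51), (32, 34)]

def seconds_per_audit(workers, minutes, total=TOTAL_AUDITS):
    """Average wall-clock seconds each worker spends on one audit."""
    audits_per_worker = total / workers
    return minutes * 60 / audits_per_worker

def audits_per_minute(minutes, total=TOTAL_AUDITS):
    """Overall batch throughput, regardless of how many workers produced it."""
    return total / minutes

for workers, minutes in runs:
    print(f"DGWCPUCOUNT={workers}: "
          f"{seconds_per_audit(workers, minutes):.0f} s per audit, "
          f"{audits_per_minute(minutes):.1f} audits per minute")
```

The per-audit times come out within a second or two of the reported ones, so the differences look like rounding. Going from 8 to 32 workers cuts the total run time by a factor of about 2.7 rather than 4, and every individual audit gets slower along the way.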