Making a scientific application run faster

Siavash · Apr 21, 2013

Hello, we are using a scientific application (ETABS) for structural analysis at work. Most scientific applications have code bases dating back to the single-core CPU era, when a higher clock speed meant faster execution.

Here is the problem: we have Core 2 Quad boxes (Q9550) with each core clocked at 2.83 GHz, but the program uses only one CPU core at full blast, so it is limited by the CPU clock rate.

Is it possible to force an application to distribute its workload over all CPU cores? If not, can it somehow be made to run faster without overclocking the CPUs? Is upgrading to Core 2 / Core iX Extreme Edition CPUs with higher clocks the only option?

cliffordcooley · Apr 21, 2013

I'm not sure if what you ask is possible. I do know that the Folding@Home project I was part of used a single client per core. The client needed an update to take advantage of multiple cores/threads.

Rage_3K_Moiz · Apr 21, 2013

Have you attempted launching multiple instances, then using the Process Manager to assign each instance a different CPU affinity?
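
For anyone who wants to script this instead of clicking through the affinity dialog by hand, here is a rough sketch in Python. It assumes Windows, and it leans on the built-in `start /affinity` command, which takes a hexadecimal bitmask of allowed cores. The helper names (`affinity_mask`, `build_command`, `launch_pinned`) are made up for illustration, and whether the program accepts a model file as a command-line argument is an assumption to check against its documentation.

```python
import subprocess
import sys


def affinity_mask(core):
    """Return the hex bitmask (as text) that allows only the given core."""
    if core < 0:
        raise ValueError("core index must be non-negative")
    return format(1 << core, "X")


def build_command(program, model_file, core):
    """Build a `start /affinity` command that pins one instance to one core.

    `start` treats its first quoted argument as a window title, so an empty
    title is passed before the switches.
    """
    return ["cmd", "/c", "start", "", "/affinity", affinity_mask(core),
            program, model_file]


def launch_pinned(program, model_files, n_cores=4):
    """Start one instance per model file, cycling through the cores.

    Assumption: the program takes the model file as its first argument.
    """
    if sys.platform != "win32":
        raise RuntimeError("`start /affinity` is only available on Windows")
    for i, model in enumerate(model_files):
        subprocess.Popen(build_command(program, model, i % n_cores))
```

Calling `launch_pinned` with the path to the executable and a list of four model files would start four instances, one per core of a quad-core box. This only helps when there are several independent models to analyze; a single run is still bound to one core.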

Siavash · Apr 21, 2013

cliffordcooley said:
I'm not sure if what you ask is possible. I do know that the Folding@Home project I was part of used a single client per core. The client needed an update to take advantage of multiple cores/threads.

One of my friends suggested changing the power plan from Balanced to High performance. That shaved off a few minutes, but it is still not a real solution.

Rage_3K_Moiz said:
Have you attempted launching multiple instances, then using the Process Manager to assign each instance a different CPU affinity?

Great idea, this post deserves +100 likes. This saves a lot of time and power.

Thank you very much!

DelJo63 · Apr 21, 2013

The correct solution is to recode the program using threads and then perform the work in each thread, but unless you have a skilled programmer on staff, forget it.

Another approach, especially when the program processes lots of external data read from disk, is:

1. Divide the original data into N input groups.
2. Process each group in its own process, writing the results back to disk as N outputs.
3. Sort or post-process the N outputs to get one final result.
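
The three steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the data is a plain list of numbers, the per-group "work" is a placeholder (squaring each value), and the file names and function names are invented for the example. A real analysis package would need its own way of splitting and merging.

```python
import multiprocessing as mp
from pathlib import Path


def split_input(values, n_groups, workdir):
    """Step 1: write the data out as n_groups input files (round-robin)."""
    paths = []
    for i in range(n_groups):
        path = Path(workdir) / f"input_{i}.txt"
        path.write_text("\n".join(str(v) for v in values[i::n_groups]))
        paths.append(path)
    return paths


def process_group(in_path):
    """Step 2: read one input file, do the work, write one output file."""
    in_path = Path(in_path)
    numbers = [float(tok) for tok in in_path.read_text().split()]
    out_path = in_path.with_name(in_path.name.replace("input", "output"))
    # Placeholder work: square each value.
    out_path.write_text("\n".join(str(x * x) for x in numbers))
    return out_path


def merge_outputs(out_paths):
    """Step 3: combine the N output files into one sorted final result."""
    merged = []
    for path in out_paths:
        merged.extend(float(tok) for tok in Path(path).read_text().split())
    return sorted(merged)


def run(values, n_groups, workdir):
    """Run all three steps, with one worker process per group."""
    in_paths = split_input(values, n_groups, workdir)
    with mp.Pool(processes=n_groups) as pool:
        out_paths = pool.map(process_group, in_paths)
    return merge_outputs(out_paths)
```

On Windows, the call to `run(...)` has to sit under an `if __name__ == "__main__":` guard, because each worker process re-imports the script. The merge here is easy only because sorting a concatenation of independent results is well defined.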

This technique was developed for true multi-processing on Tandem machines.

The problem is the last step, as outputs like charts, graphs, or statistical analyses don't lend themselves to the divide-and-conquer technique.
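
A small Python illustration of why the merge step can be the hard part, using made-up numbers: a mean can be rebuilt exactly from per-group sums and counts, but a median generally cannot be rebuilt from per-group medians. The function names are hypothetical and exist only for this example.

```python
from statistics import median


def combine_means(groups):
    """Exact overall mean from per-group (sum, count) partial results."""
    partials = [(sum(g), len(g)) for g in groups]
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


def naive_merged_median(groups):
    """Median of the per-group medians: NOT the overall median in general."""
    return median(median(g) for g in groups)


def true_median(groups):
    """Overall median, which needs all the data in one place."""
    return median(x for g in groups for x in g)


groups = [[1, 2, 3], [4, 100]]
print(combine_means(groups))        # merges cleanly from partial results
print(naive_merged_median(groups))  # differs from the true median
print(true_median(groups))
```

For statistics like the median, the final step has to see all the raw outputs (or use a purpose-built approximate method), which eats into the time saved by splitting the work.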
