After noticing that the current benchmarking infrastructure produces inconsistent results when executing multi-threaded benchmarks, I have implemented a new benchmarking library for the derby. You can start using this library for benchmarking now.
Don't worry, the implementation decisions that you arrived at using the current (old) benchmarking infrastructure are still valid. I have only noticed inconsistencies when running multiple threads in the current system.
You can continue to use the current (old) library (lib6035.a), but note that it has problems with multi-threaded benchmarks.
The new library is named libderby.a and is located in /u/mgordon/6035/lib64. The new library interface is similar to the old interface. There are two calls:
start_caliper() and end_caliper(). Wrap these around the code you would like to benchmark. The only difference from the old library is that your code is benchmarked only if it is wrapped in a caliper; without the two calls, nothing is timed. As before, you can place them in a callout in your decaf source code.
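For intuition, here is a minimal sketch in C of how a caliper-style timer could work. This is not the libderby source. The names sketch_start_caliper, sketch_end_caliper, and elapsed_usecs are hypothetical, and the use of gettimeofday() for wall-clock time is my assumption for illustration only.

```c
/* Hypothetical stand-in for a caliper-style timer; NOT the libderby source. */
#include <stdio.h>
#include <sys/time.h>

/* Wall-clock time recorded by sketch_start_caliper(). */
static struct timeval sketch_start;

/* Microseconds elapsed between two timestamps. */
long elapsed_usecs(struct timeval begin, struct timeval end)
{
    return (long)(end.tv_sec - begin.tv_sec) * 1000000L
         + (long)(end.tv_usec - begin.tv_usec);
}

/* Record the starting time (assumed behavior, for illustration). */
void sketch_start_caliper(void)
{
    gettimeofday(&sketch_start, NULL);
}

/* Print and return the time elapsed since sketch_start_caliper(). */
long sketch_end_caliper(void)
{
    struct timeval now;
    long usecs;

    gettimeofday(&now, NULL);
    usecs = elapsed_usecs(sketch_start, now);
    printf("Timer: %ld usecs\n", usecs);
    return usecs;
}
```

The point of the sketch is that wall-clock time is taken once before and once after the wrapped region, so work done by every thread inside the region counts toward a single elapsed time.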
Using the new library, the assemble command is simpler, for example:
gcc4 emboss.s -pthread -lderby -L/u/mgordon/6035/lib64 -o emboss
The PAPI library from before is no longer needed. When you execute your code (assuming that you have added calls to start_caliper() and end_caliper()), a brief message will print out, for example:
$ emboss
Timer: 276864 usecs
$
This tells you how many usecs (microseconds) it took for the code wrapped in the caliper to execute. We will use this library during the derby to determine the ranking. The calls to start_caliper() and end_caliper() will already be in the derby program when it is distributed on Monday.
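If you want to compare several runs, the timer line is easy to parse. The following is a small sketch in C, assuming the output line has exactly the "Timer: N usecs" form shown above. The function names parse_timer_usecs and usecs_to_msecs are hypothetical helpers, not part of libderby.

```c
#include <stdio.h>

/* Returns the microsecond count from a line such as "Timer: 276864 usecs",
 * or -1 if the line does not match that (assumed) format. */
long parse_timer_usecs(const char *line)
{
    long usecs;

    if (sscanf(line, "Timer: %ld usecs", &usecs) != 1)
        return -1;
    return usecs;
}

/* Converts microseconds to milliseconds for easier reading. */
double usecs_to_msecs(long usecs)
{
    return usecs / 1000.0;
}
```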
Sorry about this change so late in the game, but it should not be too much of a hassle to switch to the new library if you are harnessing data parallelism. I just want to make sure that the derby results are as accurate as possible.