With help from our sys-admin (thanks Mike!), I have completed the new, much much more accurate benchmarking infrastructure. You have to make some changes to use it.
First, set your LD_LIBRARY_PATH environment variable to include /u/mgordon/6035/lib64.
For bash:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/u/mgordon/6035/lib64
For csh (the braces keep csh from reading the ":" as a variable modifier):
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/u/mgordon/6035/lib64
I have placed a new version of the 6035 library in /u/mgordon/6035/lib64; you don't need to copy it from there.
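If you put the bash line in a startup file, the path grows every time the file is sourced, and an empty LD_LIBRARY_PATH leaves a stray leading colon. Here is a rough sketch of one way around that. The function name add_lib_dir is made up for this example; it is not part of the 6035 tools.

```shell
# add_lib_dir is a hypothetical helper, not something shipped with lib6035.
# It appends a directory to LD_LIBRARY_PATH only if it is not already listed.
add_lib_dir() {
    dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*)
            # Already present: leave the variable untouched.
            ;;
        *)
            # Add a ":" separator only when the variable is non-empty.
            LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$dir"
            ;;
    esac
    export LD_LIBRARY_PATH
}

add_lib_dir /u/mgordon/6035/lib64
```

Calling add_lib_dir a second time with the same directory should leave LD_LIBRARY_PATH unchanged.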
Now you have to compile things a bit differently. The new compile command:
gcc4 program.s -pthread -lpapiex -l6035 -L/u/mgordon/6035/lib64
In case you are curious, papiex is the library on which I built our benchmarking infrastructure. It gives us access to the performance counters on the CPU, so the counts are very accurate.
----
You can use /u/mgordon/6035/bin/benchmark to get the cycle count information for your program after you have compiled it.
Use 'benchmark program' to get the cycle count for program.
Here is some sample output:
loop
libmonitor debug: (P20071,T0x0) monitor_fini_process()
PapiEx Version: 0.99rc2
Executable: /u/mgordon/6.035/example/loop
Processor: AMD K8 Revision C
Clockrate: 1993.465942
Parent Process ID: 20064
Process ID: 20071
Hostname: tyner
Options: PAPI_TOT_CYC,NO_WRITE
Domain: User
Real usecs: 125
Real cycles: 241852
Proc usecs: 122
Proc cycles: 239896
PAPI_TOT_CYC: 135670
Event descriptions:
  Event: PAPI_TOT_CYC
  Derived: No
  Short Description: Total cycles
  Long Description: Total cycles
  Developer's Notes:
Start: Wed Nov 15 23:44:20 2006
Finish: Wed Nov 15 23:44:20 2006
libmonitor debug: (P20071,T0x0) monitor_fini_library()
------
What to notice:
* The first line gives the program you ran (in this case "loop").
* The line you are interested in:
PAPI_TOT_CYC: 135670
This is the number of user cycles that your program took to run.
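If you only want the number, a small filter can pull it out of the report. This is only a sketch. It assumes benchmark prints each field on its own line and that the whole-program count is the first PAPI_TOT_CYC line in the report. I have not checked whether the report goes to stdout or stderr, so the usage comment merges both. The function name total_cycles is made up for this example.

```shell
# total_cycles is a hypothetical helper: it reads a benchmark report on stdin
# and prints the value from the first line whose first field is "PAPI_TOT_CYC:".
# Lines such as "Options: PAPI_TOT_CYC,NO_WRITE" and "Event: PAPI_TOT_CYC"
# are skipped because their first field is different.
total_cycles() {
    awk '$1 == "PAPI_TOT_CYC:" { print $2; exit }'
}

# Assumed usage (merging stderr into stdout in case the report is on stderr):
#   /u/mgordon/6035/bin/benchmark program 2>&1 | total_cycles
```

Stopping at the first match means any later per-section PAPI_TOT_CYC lines in the report are ignored.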
----
You can also define a single "caliper" in the code. This is a section of your code that you would like detailed information about. Use
start_caliper() to define the beginning of the section and
end_caliper() to define the end of the section. These functions are defined in lib6035.a, so you can use a callout for each in your decaf code or place the calls directly in your assembly code (adhering to the calling convention, of course).
With a caliper defined, the output would look like:
loop
libmonitor debug: (P20167,T0x0) monitor_fini_process()
PapiEx Version: 0.99rc2
Executable: /u/mgordon/6.035/example/loop
Processor: AMD K8 Revision C
Clockrate: 1993.465942
Parent Process ID: 20156
Process ID: 20167
Hostname: tyner
Options: PAPI_TOT_CYC,NO_WRITE
Domain: User
Real usecs: 435
Real cycles: 860899
Proc usecs: 127
Proc cycles: 249792
PAPI_TOT_CYC: 150002
Caliper 1:
  Executions: 1
  Real usecs: 16
  Real cycles: 32333
  Proc usecs: 16
  Proc cycles: 32328
  PAPI_TOT_CYC: 32226 ***This is the cycle count for your caliper
Event descriptions:
  Event: PAPI_TOT_CYC
  Derived: No
  Short Description: Total cycles
  Long Description: Total cycles
  Developer's Notes:
Start: Wed Nov 15 23:51:48 2006
Finish: Wed Nov 15 23:51:48 2006
libmonitor debug: (P20167,T0x0) monitor_fini_library()

Let me know if there are any problems. Actually, let me know if it works for you! It is somewhat untested for anyone but me.
Mike