Performance-Monitoring Counters Library, for Intel/AMD Processors and Linux
This example introduces the complete list of compiler options and
the complete list of compile-time constants and symbols.
Important!
If the library is compiled with -DPMC_READ_INLINE then the user code must
also be compiled with -DPMC_READ_INLINE. The same applies to most of the
other options described here. The easy way to do this is
gcc `pmc_options` prog.c -lpmc
pmc_options is a command that should be installed as part of this package.
Compiler Options Effects and Defaults (if not undefined)
Processor Architecture
-DPMC_P5 (P5: Pentium, Pentium/MMX)
-DPMC_P6 (P6: Pentium Pro/II/III/4)
-DPMC_K7 (K7: AMD Athlon)
One, and only one, of these must be supplied.
There is no default. Uniqueness is verified.
-DPMC_MMX Use the Intel MMX extensions.
(Pentium/MMX, Pentium II/III/4)
-DPMC_SSE Use the Intel Streaming SIMD extensions.
Implies -DPMC_MMX.
(Pentium III/4)
-DPMC_SSE2 Use the Intel Streaming SIMD extensions, second
generation. Implies -DPMC_SSE.
(Pentium 4)
Processor Specifics
-DPMC_MHZ=800 Processor MHz rating.
Use 'rabbit -mhz 0' to find the proper value
and to report the default as configured.
-DPMC_PICOSEC=1250 Cycle counter, picoseconds per cycle.
default: (1000000 / PMC_MHZ)
[for future use with SGI systems]
-DPMC_CACHE_LINE=32 Cache line size, bytes; Intel 32, AMD 64.
Used for alignment of data structures.
System Specifics
-DPMC_BUS_MHZ=133 System bus MHz rating. [future]
Derived from the operating system
-DPMC_MAX_RATE=100 Maximum sample rate for pmc_run_command()
and pmc_begin_sampling().
Cautious Implementation
-DPMC_STRING=256 String length in pmc_control_t struct, for
the fields input_file[], output_directory[],
label[] and labels[i][].
-DPMC_READ_INLINE Use inline version of pmc_read(). This is
faster, but code size increases.
-DPMC_READ_SERIAL Use serializing instructions in pmc_read().
-DPMC_READ_KERNEL_MODE Read the counters from kernel mode instead of
user mode. Enforced with the original Pentium,
which does not implement the rdpmc instruction.
Some systems may not allow rdtsc or rdpmc to be
executed from user mode.
-DPMC_LIST, -DPMC_VERBOSE, -DPMC_READ_KERNEL_MODE
These options are not implemented with the
inline version of pmc_read(), and will cause
PMC_READ_INLINE to become undefined.
-malign-double Align double precision variables to 8 bytes.
Local variables are only aligned to an 8-byte
offset from the stack pointer; this is a
"feature" of gcc.
Incautious Implementation
-DPMC_UNALIGNED Do not request alignment of certain data
structures to appropriate boundaries.
-DPMC_ALLOW_CACHE_DISABLE
Enable pmc_configure() to disable the cache.
The system may crash at a later time.
-DPMC_SERIAL_NUMBER Allow the 96-bit serial number to be printed.
Linux can be configured to disable this feature
(CONFIG_X86_PN_OFF).
Linux Version and Module
-DPMC_MINOR=240 Minor device number for /dev/pmc; see also
pmc_dev.h and the Makefile.
The major device number is 10 (non-serial mice, misc features).
The minor device number should be in the range 240-255, which
is reserved for local use. 240 is the default if PMC_MINOR is
not given. Be sure to coordinate this value with the Makefile.
See /usr/src/linux/Documentation/devices.txt for a proper minor
device number. The exact location of this file depends on your
Linux distribution and installation. The current version is
http://www.kernel.org/pub/linux/docs/device-list/devices.txt
or see http://www.lanana.org/.
-DPMC_PROC Enable /proc/pmc when the module is loaded.
Highly recommended.
-D__KERNEL__ -DMODULE Required for the module.
-D__SMP__ Required for use with SMP kernel.
The current implementation is unsafe on a multiprocessor
(see the bug reports).
Reserved for developers and the insatiably curious
-DPMC_VERSION Version number from Makefile.
-DPMC_LIST Print cycle and event counters as obtained.
-DPMC_VERBOSE Enable the -Verbose option.
-DPMC_MEASURE Check the cost of pmc_interval_handler().
-DPMC_CHECK_ALIGNMENT=1 Examine the data structure declarations.
-DPMC_CHECK_ALIGNMENT=2 Not yet functional.
-DPMC_RESTORE Not yet functional.
-DPMC_TEST For use with pmc_test.c only.
-DPMC_ALLOW_SYSTEM_TEST For use with pmc_test.c only.
-DPMC_UNLOCKED For easier debugging of the module.
For a Pentium processor:
-DPMC_P5
For a Pentium Processor with MMX Technology:
-DPMC_P5 -DPMC_MMX
For a Pentium Pro processor:
-DPMC_P6
For a Pentium II processor:
-DPMC_P6 -DPMC_MMX
For a Pentium III processor:
-DPMC_P6 -DPMC_SSE
For an AMD Athlon processor:
-DPMC_K7
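The rules stated in the table (exactly one architecture symbol; -DPMC_SSE2
implies -DPMC_SSE, which implies -DPMC_MMX) could be enforced with the
preprocessor roughly as follows. This is a minimal sketch, not the library's
own header: the fallback definition of PMC_P6 and the helper demo_arch_name()
are hypothetical and exist only so that the fragment compiles by itself.

```c
/* Demo only: pick an architecture when none was given on the command
   line, so this fragment compiles alone. The library has no default. */
#if !defined(PMC_P5) && !defined(PMC_P6) && !defined(PMC_K7)
#define PMC_P6
#endif

/* Exactly one of the architecture symbols may be defined. */
#if (defined(PMC_P5) + defined(PMC_P6) + defined(PMC_K7)) != 1
#error "Define exactly one of PMC_P5, PMC_P6, PMC_K7"
#endif

/* Implied options: SSE2 implies SSE, and SSE implies MMX. */
#if defined(PMC_SSE2) && !defined(PMC_SSE)
#define PMC_SSE
#endif
#if defined(PMC_SSE) && !defined(PMC_MMX)
#define PMC_MMX
#endif

/* Hypothetical helper: report which architecture was selected. */
static const char *demo_arch_name(void)
{
#if defined(PMC_P5)
    return "P5";
#elif defined(PMC_P6)
    return "P6";
#else
    return "K7";
#endif
}
```

Compiled with none of the three symbols on the command line, the demo
fallback selects P6; compiled with two of them, the #error stops the build.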
The processor MHz rate should be supplied; if it is not, the default is 800 MHz.
'rabbit -mhz 0' will guess the MHz rate and report the installed default rate.
If the program has been compiled with one MHz rate, another can be supplied
from the command line by 'rabbit -mhz 550'.
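The relationship between the MHz rating and the documented PMC_PICOSEC
default, and the conversion of a cycle count to elapsed time, can be sketched
as follows. The function names are hypothetical and are not part of pmc_lib.h;
the MHz value is passed as a parameter, much as 'rabbit -mhz 550' overrides
the compiled-in rate.

```c
#include <stdint.h>

/* Picoseconds per cycle, following the documented default
   (1000000 / PMC_MHZ). Integer division truncates. */
static uint32_t demo_picosec_per_cycle(uint32_t mhz)
{
    return 1000000u / mhz;
}

/* Convert a cycle count to seconds for a processor rated at mhz MHz. */
static double demo_cycles_to_seconds(uint64_t cycles, uint32_t mhz)
{
    return (double)cycles / ((double)mhz * 1.0e6);
}
```

With the default rating of 800 MHz, demo_picosec_per_cycle(800) gives 1250,
which matches the -DPMC_PICOSEC=1250 example in the table.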
rabbit uses an interval timer to sample the cycle and event counters; the
default and maximum rate is 100 samples per second. rabbit can override this
value, for example by 'rabbit -s 5' for 5 samples per second. As the Intel event
counters are only 40 bits wide, rabbit should not sample at rates less than once
per second. As the Linux scheduler works in time slices of 0.01 seconds,
attempts to sample at rates higher than 100 per second are defeated.
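A narrow counter can wrap between two readings, which is one reason not to
let the sampling interval grow too long. A minimal sketch of a wrap-tolerant
difference follows, assuming the counter wrapped at most once between the two
samples. The name demo_counter_delta() is hypothetical; the 40-bit width is
taken from the text above.

```c
#include <stdint.h>

/* Difference between two readings of a counter that is `bits` wide
   (1 <= bits <= 63). Correct if the counter wrapped at most once
   between the earlier and the later reading. */
static uint64_t demo_counter_delta(uint64_t earlier, uint64_t later,
                                   unsigned bits)
{
    uint64_t mask = (UINT64_C(1) << bits) - 1;

    /* Unsigned subtraction is modular, so masking to the counter
       width yields the true increment even across a wrap. */
    return (later - earlier) & mask;
}
```

For example, a 40-bit counter read as 2^40 - 3 and then as 5 has advanced by
8 events, and the function returns 8.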
The Pentium Pro/II/III and Athlon are able to execute instructions out-of-order,
when there is no effect on the results and all the data is available. Since
the cycle and event counters are normally read from user mode (that is, if
PMC_READ_KERNEL_MODE was not selected), it is possible that the programmed
sequence
rdtsc ; read time stamp counter (cycle counter)
instr 1 ; some other kind of instruction
instr 2
rdtsc
would actually be executed as
instr 1
rdtsc
instr 2
rdtsc
which would invalidate claims about measuring instr 1 and instr 2. The option
-DPMC_READ_SERIAL will insert one additional instruction (cpuid) before each
rdtsc to prevent this behavior. If the use of -DPMC_READ_SERIAL seems crucial
to your results, you are probably expecting too much. -DPMC_READ_SERIAL is
not necessary with the P5 family, which does not support out-of-order
execution.
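The effect of -DPMC_READ_SERIAL can be illustrated with gcc inline assembly.
This is a sketch only, and not the library's PMC_ASM_SERIALIZE or
PMC_ASM_READ_TSC macros: the function name is hypothetical, it assumes gcc
on an x86 processor where rdtsc is permitted from user mode, and on any
other target it simply returns 0.

```c
#include <stdint.h>

/* Read the time stamp counter with a serializing cpuid in front, so
   that earlier instructions complete before the counter is read.
   On a non-x86 build this demo returns 0. */
static uint64_t demo_read_tsc_serialized(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t eax = 0, ebx, ecx = 0, edx;
    uint32_t lo, hi;

    /* cpuid leaf 0: used here only for its serializing effect. */
    __asm__ __volatile__("cpuid"
                         : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                         :
                         : "memory");

    /* rdtsc returns the 64-bit counter in edx:eax. */
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));

    (void)ebx;
    (void)edx;
    return ((uint64_t)hi << 32) | lo;
#else
    return 0;
#endif
}
```

The cpuid instruction is itself expensive, so the serialized read costs more
cycles than a bare rdtsc; that overhead becomes part of whatever is measured.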
The options -DPMC_LIST -DPMC_VERBOSE -DPMC_MEASURE -DPMC_CHECK_ALIGNMENT=[12]
-DPMC_UNLOCKED are intended for debugging. None of these is recommended for
normal use.
The option -DPMC_LIST will print the counters as they are read; it is not
recommended with high sample rates. Additional output goes to stderr or the
output directory.
The option -DPMC_CHECK_ALIGNMENT will print some additional information about
the addresses of some important internal variables.
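The kind of check that an alignment report performs can be sketched as
follows. The structure, the buffer and demo_is_aligned() are hypothetical and
do not reproduce the library's internal variables; the value 32 for
PMC_CACHE_LINE is the documented example, and the alignment attribute is gcc
syntax.

```c
#include <stddef.h>
#include <stdint.h>

#ifndef PMC_CACHE_LINE
#define PMC_CACHE_LINE 32   /* documented example value; AMD would use 64 */
#endif

/* Hypothetical counter buffer, aligned to a cache line (gcc syntax). */
struct demo_counters {
    uint64_t cycles;
    uint64_t events[2];
} __attribute__((aligned(PMC_CACHE_LINE)));

static struct demo_counters demo_buf;

/* Nonzero if p sits on a multiple of `boundary` bytes. */
static int demo_is_aligned(const void *p, size_t boundary)
{
    return ((uintptr_t)p % boundary) == 0;
}
```

A call such as demo_is_aligned(&demo_buf, PMC_CACHE_LINE) then confirms that
the buffer starts on a cache-line boundary.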
Compile-time constants, available to the application programmer, in pmc_lib.h
TRUE
FALSE
Processor-dependent sizes
pmc_event_counters
pmc_cycle_bits
pmc_event_bits
pmc_cycles_bits
pmc_events_bits
Event symbols
See the file pmc_events.h, which is included by pmc_lib.h. Direct use of
the event codes is allowed, but it is often better not to encode them in
the program.
Defined Symbols, for use with pmc_configure()
Feature
PMC_CONFIGURE_CACHE
PMC_CONFIGURE_SYSTEM_ALIGNMENT_CHECKING
PMC_CONFIGURE_PROCESS_ALIGNMENT_CHECKING
Action
PMC_CONFIGURE_QUERY
PMC_CONFIGURE_OFF
PMC_CONFIGURE_ON
PMC_CONFIGURE_CLEAR
PMC_CONFIGURE_SET
PMC_CONFIGURE_FLUSH
rabbit exit codes
RABBIT_SUCCESS 0
RABBIT_FAILURE 1
RABBIT_NO_STATUS 2
RABBIT_DID_NOT_FORK 126
RABBIT_DID_NOT_RUN 126
RABBIT_DID_NOT_FIND 127
These are intended to be consistent with the GNU time program.
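GNU time exits with 127 when the command cannot be found and with 126 when
it is found but cannot be invoked. A sketch of how a caller might map an exec
failure onto the codes above follows. The helper name is hypothetical, the
constants are redefined locally so the fragment stands alone, and the use of
ENOENT to separate the two cases is an assumption rather than rabbit's
documented behavior.

```c
#include <errno.h>

/* Values as listed above; defined locally so the sketch stands alone. */
enum {
    RABBIT_SUCCESS      = 0,
    RABBIT_FAILURE      = 1,
    RABBIT_NO_STATUS    = 2,
    RABBIT_DID_NOT_FORK = 126,
    RABBIT_DID_NOT_RUN  = 126,
    RABBIT_DID_NOT_FIND = 127
};

/* Hypothetical helper: choose an exit code after exec fails with err. */
static int demo_exit_code_for_exec_failure(int err)
{
    return (err == ENOENT) ? RABBIT_DID_NOT_FIND : RABBIT_DID_NOT_RUN;
}
```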
Compile-time constants and symbols, of no use to the application programmer
(to be avoided in the interests of portability and future development)
PMC_ARCH_H
PMC_ASM_H
PMC_DEV_H
PMC_EVENTS_H
PMC_LIB_H
PMC_PRIVATE_H
PMC_SYS_H
PMC_0
PMC_1
PMC_2
PMC_3
PMC_ALIGN(x)
PMC_APIC_BASE_MSR
PMC_ARCH
PMC_ASM(instructions,N,buf)
PMC_ASM_READ_ALL(buf)
PMC_ASM_READ_CLOCK(buf)
PMC_ASM_READ_CR(N,buf)
PMC_ASM_READ_MSR(N,buf)
PMC_ASM_READ_PMC(N,buf)
PMC_ASM_READ_TSC(buf)
PMC_ASM_SERIALIZE
PMC_ASM_WRITE_MSR(N,buf)
PMC_BUS_FREQUENCY_MSR
PMC_ASM_CLEAN
PMC_CONTROL_0
PMC_CONTROL_1
PMC_CONTROL_2
PMC_CONTROL_3
PMC_DEVICE_HARDWARE_TYPE
PMC_DEVICE_SOFTWARE_VERSION
PMC_DEVICE_SOFTWARE_VERSION_KERNEL
PMC_DEVICE_SOFTWARE_VERSION_SMP
PMC_DISABLE_ALIGNMENT_CHECKING
PMC_DISABLE_CACHE
PMC_DISABLE_RDPMC
PMC_DISABLE_RDTSC
PMC_ENABLE_ALIGNMENT_CHECKING
PMC_ENABLE_CACHE
PMC_ENABLE_RDPMC
PMC_ENABLE_RDTSC
PMC_EVENT_MAX
PMC_FILLER
PMC_FLUSH_CACHE
PMC_MESI_LIMIT
PMC_MISC_ENABLE_MSR
PMC_OPTIONS
PMC_QUERY_ALIGNMENT_CHECKING
PMC_QUERY_CACHE
PMC_QUERY_RDPMC
PMC_QUERY_RDTSC
PMC_READ
PMC_READ_0
PMC_READ_1
PMC_READ_2
PMC_READ_3
PMC_READ_APICBASE
PMC_READ_APIC_SPACE
PMC_READ_CONTROL
PMC_READ_CONTROL_0
PMC_READ_CONTROL_1
PMC_READ_CONTROL_2
PMC_READ_CONTROL_3
PMC_READ_CR0
PMC_READ_CR2
PMC_READ_CR3
PMC_READ_CR4
PMC_READ_MISC_ENABLE
PMC_READ_TSC
PMC_SELECT
PMC_TEST_OVERHEAD
PMC_TEST_RDALL
PMC_TEST_RDMSR
PMC_TEST_RDPMC_0
PMC_TEST_RDPMC_1
PMC_TEST_RDTSC
PMC_TEST_RDWRMSR
PMC_TEST_STORE_AND_RELOAD
PMC_TSC
PMC_VERBOSE_(func)
PMC_VERBOSE_IN(func)
PMC_VERBOSE_MSG(msg)
PMC_VERBOSE_OUT(func)
PMC_WRITE_0
PMC_WRITE_1
PMC_WRITE_2
PMC_WRITE_3
PMC_WRITE_CONTROL
PMC_WRITE_CONTROL_0
PMC_WRITE_CONTROL_1
PMC_WRITE_CONTROL_2
PMC_WRITE_CONTROL_3
PMC_WRITE_TSC
Author: Don Heller, dheller@scl.ameslab.gov
Last revised: 1 February 2001