Running the Tests
=================

All the tests are executed using the "Run" script in the top-level directory.

The simplest way to generate results is with the command:
    ./Run

This will run a standard "index" test (see "The BYTE Index" below), and
save the report in the "results" directory, with a filename like
    hostname-2007-09-23-01
An HTML version is also saved.

If you want to generate both the basic system index and the graphics index,
then do:
    ./Run gindex

If your system has more than one CPU, the tests will be run twice -- once
with a single copy of each test running at once, and once with N copies,
where N is the number of CPUs.  Some categories of tests, however (currently
the graphics tests), will only run with a single copy.

Since the tests are based on constant time (variable work), a "system"
run usually takes about 29 minutes; the "graphics" part takes about 18
minutes.  A "gindex" run on a dual-core machine will do 2 "system" passes
(single- and dual-processing) and one "graphics" run, for a total of around
an hour and a quarter.

============================================================================

Detailed Usage
==============

The Run script takes a number of options which you can use to customise a
test, and you can specify the names of the tests to run.  The full usage
is:

    Run [ -q | -v ] [ -i <count> ] [ -c <n> [ -c <n> ... ]] [ test ... ]

The option flags are:

    -q           Run in quiet mode.
    -v           Run in verbose mode.
    -i <count>   Run <count> iterations for each test -- slower tests use
                 <count> / 3, but at least 1.  Defaults to 10 (3 for slow
                 tests).
    -c <n>       Run <n> copies of each test in parallel.

The -c option can be given multiple times; for example:

    ./Run -c 1 -c 4

will run a single-streamed pass, then a 4-streamed pass.  Note that some
tests (currently the graphics tests) will only run in a single-streamed pass.

The remaining non-flag arguments are taken to be the names of tests to run.
The default is to run "index".  See "Tests" below.
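
For example, a quiet run of just the filesystem tests, with 5 iterations
each and 2 copies in parallel, might look like this (illustrative
invocation):

    ./Run -q -i 5 -c 2 fs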

When running the tests, I do *not* recommend switching to single-user mode
("init 1").  This seems to change the results in ways I don't understand,
and it's not realistic (unless your system will actually be running in this
mode, of course).  However, if using a windowing system, you may want to
switch to a minimal window setup (for example, log in to a "twm" session),
so that randomly-churning background processes don't randomise the results
too much.  This is particularly true for the graphics tests.


============================================================================

Tests
=====

The available tests are organised into categories; when generating index
scores (see "The BYTE Index" below) the results for each category are
produced separately.  The categories are:

    system       The original Unix system tests (not all are actually
                 in the index)
    2d           2D graphics tests (not all are actually in the index)
    3d           3D graphics tests
    misc         Various non-indexed tests

The following individual tests are available:

  system:
    dhry2reg          Dhrystone 2 using register variables
    whetstone-double  Double-Precision Whetstone
    syscall           System Call Overhead
    pipe              Pipe Throughput
    context1          Pipe-based Context Switching
    spawn             Process Creation
    execl             Execl Throughput
    fstime-w          File Write 1024 bufsize 2000 maxblocks
    fstime-r          File Read 1024 bufsize 2000 maxblocks
    fstime            File Copy 1024 bufsize 2000 maxblocks
    fsbuffer-w        File Write 256 bufsize 500 maxblocks
    fsbuffer-r        File Read 256 bufsize 500 maxblocks
    fsbuffer          File Copy 256 bufsize 500 maxblocks
    fsdisk-w          File Write 4096 bufsize 8000 maxblocks
    fsdisk-r          File Read 4096 bufsize 8000 maxblocks
    fsdisk            File Copy 4096 bufsize 8000 maxblocks
    shell1            Shell Scripts (1 concurrent) (runs "looper 60 multi.sh 1")
    shell8            Shell Scripts (8 concurrent) (runs "looper 60 multi.sh 8")
    shell16           Shell Scripts (16 concurrent) (runs "looper 60 multi.sh 16")

  2d:
    2d-rects          2D graphics: rectangles
    2d-lines          2D graphics: lines
    2d-circle         2D graphics: circles
    2d-ellipse        2D graphics: ellipses
    2d-shapes         2D graphics: polygons
    2d-aashapes       2D graphics: antialiased polygons
    2d-polys          2D graphics: complex polygons
    2d-text           2D graphics: text
    2d-blit           2D graphics: images and blits
    2d-window         2D graphics: windows

  3d:
    ubgears           3D graphics: gears

  misc:
    C                 C Compiler Throughput ("looper 60 $cCompiler cctest.c")
    arithoh           Arithoh (arithmetic loop overhead)
    short             Arithmetic Test (short) (this is arith.c configured for
                      "short" variables; ditto for the ones below)
    int               Arithmetic Test (int)
    long              Arithmetic Test (long)
    float             Arithmetic Test (float)
    double            Arithmetic Test (double)
    dc                Dc: sqrt(2) to 99 decimal places (runs
                      "looper 30 dc < dc.dat", using your system's copy of "dc")
    hanoi             Recursion Test -- Tower of Hanoi
    grep              Grep for a string in a large file, using your system's
                      copy of "grep"
    sysexec           Exercise fork() and exec().

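Individual tests can be run by giving their names to Run; for example, to
run just the Dhrystone and Whetstone tests (illustrative invocation):

    ./Run dhry2reg whetstone-double
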
The following pseudo-test names are aliases for combinations of other
tests:

    arithmetic    Runs arithoh, short, int, long, float, double,
                  and whetstone-double
    dhry          Alias for dhry2reg
    dhrystone     Alias for dhry2reg
    whets         Alias for whetstone-double
    whetstone     Alias for whetstone-double
    load          Runs shell1, shell8, and shell16
    misc          Runs C, dc, and hanoi
    speed         Runs the arithmetic and system groups
    oldsystem     Runs execl, fstime, fsbuffer, fsdisk, pipe, context1,
                  spawn, and syscall
    system        Runs oldsystem plus shell1, shell8, and shell16
    fs            Runs fstime-w, fstime-r, fstime, fsbuffer-w,
                  fsbuffer-r, fsbuffer, fsdisk-w, fsdisk-r, and fsdisk
    shell         Runs shell1, shell8, and shell16

    index         Runs the tests which constitute the official index:
                  the oldsystem group, plus dhry2reg, whetstone-double,
                  shell1, and shell8
                  See "The BYTE Index" below for more information.
    graphics      Runs the tests which constitute the graphics index:
                  2d-rects, 2d-ellipse, 2d-aashapes, 2d-text, 2d-blit,
                  2d-window, and ubgears
    gindex        Runs the index and graphics groups, to generate both
                  sets of index results

    all           Runs all tests
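
For example, to run the filesystem group and the shell-script tests in a
single pass (illustrative invocation):

    ./Run fs shell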


============================================================================

The BYTE Index
==============

The purpose of this test is to provide a basic indicator of the performance
of a Unix-like system; hence, multiple tests are used to exercise various
aspects of the system's performance.  These test results are then compared
to the scores from a baseline system to produce an index value, which is
generally easier to handle than the raw scores.  The entire set of index
values is then combined to make an overall index for the system.

Since 1995, the baseline system has been "George", a SPARCstation 20-61
with 128 MB RAM, a SPARC Storage Array, and Solaris 2.3, whose ratings
were set at 10.0.  (So a system which scores 520 is 52 times faster than
this machine.)  Since the numbers are really only useful in a relative
sense, there's no particular reason to update the base system, so for the
sake of consistency it's probably best to leave it alone.  George's scores
are in the file "pgms/index.base"; this file is used to calculate the
index scores for any particular run.
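
To make the calculation concrete, with made-up numbers: if George's
baseline result for a test was 40.0, and a run on your system achieves
2080.0 on the same test, then the index value for that test is

    2080.0 / 40.0 * 10 = 520.0

i.e. 52 times the baseline, as in the example above.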

Over the years, various changes have been made to the set of tests in the
index.  Although there is a desire for a consistent baseline, various tests
have been determined to be misleading, and have been removed; and a few
alternatives have been added.  These changes are detailed in the README,
and should be borne in mind when looking at old scores.

A number of tests are included in the benchmark suite which are not part of
the index, for various reasons; these tests can of course be run manually.
See "Tests" above.


============================================================================

Graphics Tests
==============

As of version 5.1, UnixBench contains some graphics benchmarks.  These
are intended to give a rough idea of the general graphics performance of
a system.

The graphics tests are in categories "2d" and "3d", so the index scores
for these tests are separate from the basic system index.  This seems
like a sensible division, since the graphics performance of a system
depends largely on the graphics adaptor.

The tests currently consist of some 2D "x11perf" tests and "ubgears".

  * The 2D tests are a selection of the x11perf tests, using the host
    system's x11perf command (which must be installed and in the search
    path).  Only a few of the x11perf tests are used, in the interests
    of completing a test run in a reasonable time; if you want to do
    detailed diagnosis of an X server or graphics chip, then use x11perf
    directly.

  * The 3D test is "ubgears", a modified version of the familiar "glxgears".
    This version runs for 5 seconds to "warm up", then performs a timed
    run and displays the average frames-per-second.

On multi-CPU systems, the graphics tests will only run in single-processing
mode.  This is because the meaning of running two copies of a test at once
is dubious; and the test windows tend to overlay each other, meaning that
the window behind isn't actually doing any work.


============================================================================

Multiple CPUs
=============

If your system has multiple CPUs, the default behaviour is to run the selected
tests twice -- once with one copy of each test program running at a time,
and once with N copies, where N is the number of CPUs.  (You can override
this with the "-c" option; see "Detailed Usage" above.)  This is designed to
allow you to assess:

  - the performance of your system when running a single task
  - the performance of your system when running multiple tasks
  - the gain from your system's implementation of parallel processing
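
For example, on a four-CPU system you could compare one-, four- and
eight-copy passes of the index tests with an invocation like this
(illustrative):

    ./Run -c 1 -c 4 -c 8 index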

The results, however, need to be handled with care.  Here are the results
of two runs on a dual-processor system, one in single-processing mode, one
dual-processing:

  Test                    Single     Dual   Gain
  --------------------    ------   ------   ----
  Dhrystone 2              562.5   1110.3    97%
  Double Whetstone         320.0    640.4   100%
  Execl Throughput         450.4    880.3    95%
  File Copy 1024           759.4    595.9   -22%
  File Copy 256            535.8    438.8   -18%
  File Copy 4096          1261.8   1043.4   -17%
  Pipe Throughput          481.0    979.3   104%
  Pipe-based Switching     326.8   1229.0   276%
  Process Creation         917.2   1714.1    87%
  Shell Scripts (1)       1064.9   1566.3    47%
  Shell Scripts (8)       1567.7   1709.9     9%
  System Call Overhead     944.2   1445.5    53%
  --------------------    ------   ------   ----
  Index Score:             678.2   1026.2    51%

As expected, the heavily CPU-dependent tasks -- dhrystone, whetstone,
execl, pipe throughput, process creation -- show close to 100% gain when
running 2 copies in parallel.

The Pipe-based Context Switching test measures context switching overhead
by sending messages back and forth between 2 processes.  I don't know why
it shows such a huge gain with 2 copies (ie. 4 processes total) running,
but it seems to be consistent on my system.  I think this may be an issue
with the SMP implementation.

The System Call Overhead shows a lesser gain, presumably because it uses a
lot of CPU time in single-threaded kernel code.  The shell scripts test with
8 concurrent processes shows no gain -- because the test itself runs 8
scripts in parallel, it's already using both CPUs, even when the benchmark
is run in single-stream mode.  The same test with one process per copy
shows a real gain.

The filesystem throughput tests show a loss, instead of a gain, when
multi-processing.  That there's no gain is to be expected, since the tests
are presumably constrained by the throughput of the I/O subsystem and the
disk drive itself; the drop in performance is presumably down to the
increased contention for resources, and perhaps greater disk head movement.

So what tests should you use, how many copies should you run, and how should
you interpret the results?  Well, that's up to you, since it depends on
what it is you're trying to measure.

Implementation
--------------

The multi-processing mode is implemented at the level of test iterations.
During each iteration of a test, N slave processes are started using fork().
Each of these slaves executes the test program using fork() and exec(),
reads and stores the entire output, times the run, and prints all the
results to a pipe.  The Run script reads the pipes for each of the slaves
in turn to get the results and times.  The scores are added, and the times
averaged.
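
As a rough sketch of the idea in shell (illustrative only -- this is not
the actual Run code, and the test binary and its duration argument are
just examples):

    # Start N copies of a test program in parallel, then wait for them all;
    # they finish at about the same time because each runs for a fixed time.
    N=4                                 # normally the number of CPUs
    for i in $(seq 1 $N); do
        ./pgms/dhry2reg 10 > out.$i &   # one copy of the test program
    done
    wait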

The result is that each test program has N copies running at once.  They
should all finish at around the same time, since they run for constant time.

If a test program itself starts off K multiple processes (as with the shell8
test), then the effect will be that there are N * K processes running at
once.  This is probably not very useful for testing multi-CPU performance.


============================================================================

The Language Setting
====================

The $LANG environment variable determines how programs and library
routines interpret text.  This can have a big impact on the test results.

If $LANG is set to POSIX, or is left unset, text is treated as ASCII; if
it is set to en_US.UTF-8, for example, then text is treated as being
encoded in UTF-8, which is more complex and therefore slower.  Setting
it to other languages can have varying results.

To ensure consistency between test runs, the Run script now (as of version
5.1.1) sets $LANG to "en_US.utf8".

This setting is configured with the variable "$language".  You should not
change this if you want to share your results to allow comparisons between
systems; however, you may want to change it to see how different language
settings affect performance.

Each test report now includes the language settings in use.  The reported
language is what is set in $LANG, and is not necessarily supported by the
system; but we also report the character mapping and collation order which
are actually in use (as reported by "locale").
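
For example, you can check the character mapping and collation order that
a given $LANG value will actually give you by asking the standard "locale"
command (illustrative):

    LANG=en_US.utf8 locale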


============================================================================

Interpreting the Results
========================

Interpreting the results of these tests is tricky, and totally depends on
what you're trying to measure.

For example, are you trying to measure how fast your CPU is?  Or how good
your compiler is?  Because these tests are all recompiled using your host
system's compiler, the performance of the compiler will inevitably impact
the performance of the tests.  Is this a problem?  If you're choosing a
system, you probably care about its overall speed, which may well depend
on how good its compiler is; so including that in the test results may be
the right answer.  But you may want to ensure that the right compiler is
used to build the tests.

On the other hand, with the vast majority of Unix systems being x86 / PC
compatibles, running Linux and the GNU C compiler, the results will tend
to be more dependent on the hardware; but the versions of the compiler and
OS can make a big difference.  (I measured a 50% gain between SUSE 10.1
and OpenSUSE 10.2 on the same machine.)  So you may want to make sure that
all your test systems are running the same version of the OS; or at least
publish the OS and compiler versions with your results.  Then again, it may
be compiler performance that you're interested in.

The C test is very dubious -- it tests the speed of compilation.  If you're
running the exact same compiler on each system, OK; but otherwise, the
results should probably be discarded.  A slower compilation doesn't say
anything about the speed of your system, since the compiler may simply be
spending more time to super-optimise the code, which would actually make
the compiled code faster.

This will be particularly true on architectures like IA-64 (Itanium etc.)
where the compiler spends huge amounts of effort scheduling instructions
to run in parallel, with a resultant significant gain in execution speed.

Some tests are even more dubious in terms of host-dependency -- for example,
the "dc" test uses the host's version of dc (a calculator program).  The
version of this which is available can make a huge difference to the score,
which is why it's not in the index group.  Read through the release notes
for more on these kinds of issues.

Another age-old issue is that of the benchmarks being too trivial to be
meaningful.  With compilers getting ever smarter, and performing more
wide-ranging flow path analyses, the danger of parts of the benchmarks
simply being optimised out of existence is always present.

All in all, the "index" and "gindex" tests (see above) are designed to
give a reasonable measure of overall system performance; but the results
of any test run should always be used with care.