K&K Software Lynxviewer

Ergebnis für URL: http://www.gnu.org/software/gsl/design/gsl-design.html#SEC44
                        GNU Scientific Library -- Design document


    Mark Galassi


    James Theiler


    Brian Gough
     ____________________________________________________________________________

                                      [1]Motivation

   There is a need for scientists and engineers to have a numerical library that:
     * is free (in the sense of freedom, not in the sense of gratis; see the GNU
       General Public License), so that people can use that library, redistribute
       it, modify it ...
     * is written in C using modern coding conventions, calling conventions, scoping
       ...
     * is clearly and pedagogically documented; preferably with TeXinfo, so as to
       allow online info, WWW and TeX output.
     * uses top quality state-of-the-art algorithms.
     * is portable and configurable using autoconf and automake.
     * basically, is GNUlitically correct.

   There are strengths and weaknesses with existing libraries:

   Netlib (http://www.netlib.org/) is probably the most advanced set of numerical
   algorithms available on the net, maintained by AT&T. Unfortunately most of the
   software is written in Fortran, with strange calling conventions in many places.
   It is also not very well collected, so it is a lot of work to get started with
   netlib.

   GAMS (http://gams.nist.gov/) is an extremely well organized set of pointers to
   scientific software, but like netlib, the individual routines vary in their
   quality and their level of documentation.

   Numerical Recipes (http://www.nr.com, http://cfata2.harvard.edu/nr/) is an
   excellent book: it explains the algorithms in a very clear way. Unfortunately the
   authors released the source code under a license which allows you to use it, but
   prevents you from re-distributing it. Thus Numerical Recipes is not free in the
   sense of freedom. On top of that, the implementation suffers from fortranitis and
   other limitations. [http://www.lysator.liu.se/c/num-recipes-in-c.html]

   SLATEC is a large public domain collection of numerical routines in Fortran
   written under a Department of Energy program in the 1970's. The routines are well
   tested and have a reasonable overall design (given the limitations of that era).
   GSL should aim to be a modern version of SLATEC.

   NSWC is the Naval Surface Warfare Center numerical library. It is a large
   public-domain Fortran library, containing a lot of high-quality code.
   Documentation for the library is hard to find, only a few photocopies of the
   printed manual are still in circulation.

   NAG and IMSL both sell high-quality libraries which are proprietary. The NAG
   library is more advanced and has wider scope than IMSL. The IMSL library leans
   more towards ease-of-use and makes extensive use of variable length argument
   lists to emulate "default arguments".

   ESSL and SCSL are proprietary libraries from IBM and SGI.

   Forth Scientific Library [see the URL http://www.taygeta.com/fsl/sciforth.html].
   Mainly of interest to Forth users.

   Numerical Algorithms with C G. Engeln-Mullges, F. Uhlig. A nice numerical library
   written in ANSI C with an accompanying textbook. Source code is available but the
   library is not free software.

   NUMAL A C version of the NUMAL library has been written by H.T. Lau and is
   published as a book and disk with the title "A Numerical Library in C for
   Scientists and Engineers". Source code is available but the library is not free
   software.

   C Mathematical Function Handbook by Louis Baker. A library of function
   approximations and methods corresponding to those in the "Handbook of
   Mathematical Functions" by Abramowitz and Stegun. Source code is available but
   the library is not free software.

   CCMATH by Daniel A. Atkinson. A C numerical library covering similar areas to
   GSL. The code is quite terse. Earlier versions were under the GPL but
   unfortunately it has changed to the LGPL in recent versions.

   CEPHES A useful collection of high-quality special functions written in C. Not
   GPL'ed.

   WNLIB A small collection of numerical routines written in C by Will Naylor.
   Public domain.

   MESHACH A comprehensive matrix-vector linear algebra library written in C. Freely
   available but not GPL'ed (non-commercial license).

   CERNLIB is a large high-quality Fortran library developed at CERN over many
   years. It was originally non-free software but has recently been released under
   the GPL.

   COLT is a free numerical library in Java developed at CERN by Wolfgang Hoschek.
   It is under a BSD-style license.

   The long-term goal will be to provide a framework to which the real numerical
   experts (or their graduate students) will contribute.

                                     [2]Contributing

   The GSL team welcomes new contributions to enhance the functionality of the
   library. Much emphasis is placed on ensuring the stability of the existing
   functions, library consistency, and fixing any reported bugs. Potential
   contributors are encouraged to gain familiarity with the library by investigating
   and fixing known problems listed in the bug tracker on the GSL savannah page.

   Adding large amounts of new code is difficult because it leads to differences in
   the maturity of different parts of the library. To maintain stability, any new
   functionality is encouraged as packages, built on top of GSL and maintained
   independently by the author, as in other free software projects (such as the Perl
   CPAN archive and TeX CTAN archive, etc).

[3]Packages

   The design of GSL permits extensions to be used alongside the existing library
   easily by simple linking. For example, additional random number generators can be
   provided in a separate library:
$ tar xvfz rngextra-0.1.tar.gz
$ cd rngextra-0.1
$ ./configure; make; make check; make install
$ ...
$ gcc -Wall main.c -lrngextra -lgsl -lgslcblas -lm

   The points below summarise the package design guidelines. These are intended to
   ensure that packages are consistent with GSL itself, to make life easier for the
   end-user and make it possible to distribute popular well-tested packages as part
   of the core GSL in future.
     * Follow the GSL and GNU coding standards described in this document This means
       using the standard GNU packaging tools, such as Automake, providing
       documentation in Texinfo format, and a test suite. The test suite should run
       using 'make check', and use the test functions provided in GSL to produce the
       output with PASS:/FAIL: lines. It is not essential to use libtool since
       packages are likely to be small, a static library is sufficient and simpler
       to build.
     * Use a new unique prefix for the package (do not use 'gsl_' -- this is
       reserved for internal use). For example, a package of additional random
       number generators might use the prefix rngextra.
#include 

gsl_rng * r = gsl_rng_alloc (rngextra_lsfr32);

     * Use a meaningful version number which reflects the state of development
       Generally, 0.x are alpha versions, which provide no guarantees. Following
       that, 0.9.x are beta versions, which should be essentially complete, subject
       only to minor changes and bug fixes. The first major release is 1.0. Any
       version number of 1.0 or higher should be suitable for production use with a
       well-defined API. The API must not change in a major release and should be
       backwards-compatible in its behavior (excluding actual bug-fixes), so that
       existing code do not have to be modified. Note that the API includes all
       exported definitions, including data-structures defined with struct. If you
       need to change the API in a package, it requires a new major release (e.g.
       2.0).
     * Use the GNU General Public License (GPL) Follow the normal procedures of
       obtaining a copyright disclaimer if you would like to have the package
       considered for inclusion in GSL itself in the future (see section [4]Legal
       issues).

   Post announcements of your package releases to gsl-discuss at sources.redhat.com
   so that information about them can be added to the GSL webpages.

   An example package 'rngextra' containing two additional random number generators
   can be found at [5]http://www.network-theory.co.uk/download/rngextra/.

                                        [6]Design

[7]Language for implementation

   One language only (C)

   Advantages: simpler, compiler available and quite universal.

[8]Interface to other languages

   Wrapper packages are supplied as "extra" packages; not as part of the "core".
   They are maintained separately by independent contributors.

   Use standard tools to make wrappers: swig, g-wrap

[9]What routines are implemented

   Anything which is in any of the existing libraries. Obviously it makes sense to
   prioritize and write code for the most important areas first.

[10]What routines are not implemented

     * anything which already exists as a high-quality GPL'ed package.
     * anything which is too big -- i.e. an application in its own right rather than
       a subroutine For example, partial differential equation solvers are often
       huge and very specialized applications (since there are so many types of
       PDEs, types of solution, types of grid, etc). This sort of thing should
       remain separate. It is better to point people to the good applications which
       exist.
     * anything which is independent and useful separately. Arguably functions for
       manipulating date and time, or financial functions might be included in a
       "scientific" library. However, these sorts of modules could equally well be
       used independently in other programs, so it makes sense for them to be
       separate libraries.

[11]Design of Numerical Libraries

   In writing a numerical library there is a unavoidable conflict between
   completeness and simplicity. Completeness refers to the ability to perform
   operations on different objects so that the group is "closed". In mathematics
   objects can be combined and operated on in an infinite number of ways. For
   example, I can take the derivative of a scalar field with respect to a vector and
   the derivative of a vector field wrt a scalar (along a path).

   There is a definite tendency to unconsciously try to reproduce all these
   possibilities in a numerical library, by adding new features one by one. After
   all, it is always easy enough to support just one more feature.... so why not?

   Looking at the big picture, no-one would start out by saying "I want to be able
   to represent every possible mathematical object and operation using C structs" --
   this is a strategy which is doomed to fail. There is a limited amount of
   complexity which can be represented in a programming language like C. Attempts to
   reproduce the complexity of mathematics within such a language would just lead to
   a morass of unmaintainable code. However, it's easy to go down that road if you
   don't think about it ahead of time.

   It is better to choose simplicity over completeness. In designing new parts of
   the library keep modules independent where possible. If interdependencies between
   modules are introduced be sure about where you are going to draw the line.

[12]Code Reuse

   It is useful if people can grab a single source file and include it in their own
   programs without needing the whole library. Try to allow standalone files like
   this whenever it is reasonable. Obviously the user might need to define a few
   macros, such as GSL_ERROR, to compile the file but that is ok. Examples where
   this can be done: grabbing a single random number generator.

[13]Standards and conventions

   The people who kick off this project should set the coding standards and
   conventions. In order of precedence the standards that we follow are,
     * We follow the GNU Coding Standards.
     * We follow the conventions of the ANSI Standard C Library.
     * We follow the conventions of the GNU C Library.
     * We follow the conventions of the glib GTK support Library.

   The references for these standards are the GNU Coding Standards document,
   Harbison and Steele C: A Reference Manual, the GNU C Library Manual (version 2),
   and the Glib source code.

   For mathematical formulas, always follow the conventions in Abramowitz & Stegun,
   the Handbook of Mathematical Functions, since it is the definitive reference and
   also in the public domain.

   If the project has a philosophy it is to "Think in C". Since we are working in C
   we should only do what is natural in C, rather than trying to simulate features
   of other languages. If there is something which is unnatural in C and has to be
   simulated then we avoid using it. If this means leaving something out of the
   library, or only offering a limited version then so be it. It is not worthwhile
   making the library over-complicated. There are numerical libraries in other
   languages, and if people need the features of those languages it would be
   sensible for them to use the corresponding libraries, rather than coercing a C
   library into doing that job.

   It should be borne in mind at all time that C is a macro-assembler. If you are in
   doubt about something being too complicated ask yourself the question "Would I
   try to write this in macro-assembler?" If the answer is obviously "No" then do
   not try to include it in GSL. [BJG]

   It will be useful to read the following paper,
     * Kiem-Phong Vo, "The Discipline and Method Architecture for Reusable
       Libraries", Software - Practice & Experience, v.30, pp.107-128, 2000.

   It is available from [14]http://www.research.att.com/sw/tools/sfio/dm-spe.ps or
   the earlier technical report Kiem-Phong Vo, "An Architecture for Reusable
   Libraries" [15]http://citeseer.nj.nec.com/48973.html.

   There are associated papers on Vmalloc, SFIO, and CDT which are also relevant to
   the design of portable C libraries.
     * Kiem-Phong Vo, "Vmalloc: A General and Efficient Memory Allocator". Software
       Practice & Experience, 26:1--18, 1996.
       [16]http://www.research.att.com/sw/tools/vmalloc/vmalloc.ps
     * Kiem-Phong Vo. "Cdt: A Container Data Type Library". Soft. Prac. & Exp.,
       27:1177--1197, 1997 [17]http://www.research.att.com/sw/tools/cdt/cdt.ps
     * David G. Korn and Kiem-Phong Vo, "Sfio: Safe/Fast String/File IO",
       Proceedings of the Summer '91 Usenix Conference, pp. 235-256, 1991.
       [18]http://citeseer.nj.nec.com/korn91sfio.html

   Source code should be indented according to the GNU Coding Standards, with spaces
   not tabs. For example, by using the indent command:
indent -gnu -nut *.c *.h

   The -nut option converts tabs into spaces.

[19]Background and Preparation

   Before implementing something be sure to research the subject thoroughly! This
   will save a lot of time in the long-run. The two most important steps are,
    1. to determine whether there is already a free library (GPL or GPL-compatible)
       which does the job. If so, there is no need to reimplement it. Carry out a
       search on Netlib, GAMs, na-net, sci.math.num-analysis and the web in general.
       This should also provide you with a list of existing proprietary libraries
       which are relevant, keep a note of these for future reference in step 2.
    2. make a comparative survey of existing implementations in the commercial/free
       libraries. Examine the typical APIs, methods of communication between program
       and subroutine, and classify them so that you are familiar with the key
       concepts or features that an implementation may or may not have, depending on
       the relevant tradeoffs chosen. Be sure to review the documentation of
       existing libraries for useful references.
    3. read up on the subject and determine the state-of-the-art. Find the latest
       review papers. A search of the following journals should be undertaken.
          + ACM Transactions on Mathematical Software
          + Numerische Mathematik
          + Journal of Computation and Applied Mathematics
          + Computer Physics Communications
          + SIAM Journal of Numerical Analysis
          + SIAM Journal of Scientific Computing

   Keep in mind that GSL is not a research project. Making a good implementation is
   difficult enough, without also needing to invent new algorithms. We want to
   implement existing algorithms whenever possible. Making minor improvements is ok,
   but don't let it be a time-sink.

[20]Choice of Algorithms

   Whenever possible choose algorithms which scale well and always remember to
   handle asymptotic cases. This is particularly relevant for functions with integer
   arguments. It is tempting to implement these using the simple O(n) algorithms
   used to define the functions, such as the many recurrence relations found in
   Abramowitz and Stegun. While such methods might be acceptable for n=O(10-100)
   they will not be satisfactory for a user who needs to compute the same function
   for n=1000000.

   Similarly, do not make the implicit assumption that multivariate data has been
   scaled to have components of the same size or O(1). Algorithms should take care
   of any necessary scaling or balancing internally, and use appropriate norms (e.g.
   |Dx| where D is a diagonal scaling matrix, rather than |x|).

[21]Documentation

   Documentation: the project leaders should give examples of how things are to be
   documented. High quality documentation is absolutely mandatory, so documentation
   should introduce the topic, and give careful reference for the provided
   functions. The priority is to provide reference documentation for each function.
   It is not necessary to provide tutorial documentation.

   Use free software, such as GNU Plotutils, to produce the graphs in the manual.

   Some of the graphs have been made with gnuplot which is not truly free (or GNU)
   software, and some have been made with proprietary programs. These should be
   replaced with output from GNU plotutils.

   When citing references be sure to use the standard, definitive and best reference
   books in the field, rather than lesser known text-books or introductory books
   which happen to be available (e.g. from undergraduate studies). For example,
   references concerning algorithms should be to Knuth, references concerning
   statistics should be to Kendall & Stuart, references concerning special functions
   should be to Abramowitz & Stegun (Handbook of Mathematical Functions AMS-55),
   etc.

   The standard references have a better chance of being available in an accessible
   library for the user. If they are not available and the user decides to buy a
   copy in order to look up the reference then this also gives them the best quality
   book which should also cover the largest number of other references in the GSL
   Manual. If many different books were to be referenced this would be an expensive
   and inefficient use of resources for a user who needs to look up the details of
   the algorithms. Reference books also stay in print much longer than text books,
   which are often out-of-print after a few years.

   Similarly, cite original papers wherever possible. Be sure to keep copies of
   these for your own reference (e.g. when dealing with bug reports) or to pass on
   to future maintainers.

   If you need help in tracking down references, ask on the gsl-discuss mailing
   list. There is a group of volunteers with access to good libraries who have
   offered to help GSL developers get copies of papers.

   [JT section: written by James Theiler

   And we furthermore promise to try as hard as possible to document the software:
   this will ideally involve discussion of why you might want to use it, what
   precisely it does, how precisely to invoke it, how more-or-less it works, and
   where we learned about the algorithm, and (unless we wrote it from scratch) where
   we got the code. We do not plan to write this entire package from scratch, but to
   cannibalize existing mathematical freeware, just as we expect our own software to
   be cannibalized.]

[22]Namespace

   Use gsl_ as a prefix for all exported functions and variables.

   Use GSL_ as a prefix for all exported macros.

   All exported header files should have a filename with the prefix gsl_.

   All installed libraries should have a name like libgslhistogram.a

   Any installed executables (utility programs etc) should have the prefix gsl-
   (with a hyphen, not an underscore).

   All function names, variables, etc should be in lower case. Macros and
   preprocessor variables should be in upper case.

[23]Header files

   Installed header files should be idempotent, i.e. surround them by the
   preprocessor conditionals like the following,
#ifndef __GSL_HISTOGRAM_H__
#define __GSL_HISTOGRAM_H__
...
#endif /* __GSL_HISTOGRAM_H__ */

[24]Target system

   The target system is ANSI C, with a full Standard C Library, and IEEE arithmetic.

[25]Function Names

   Each module has a name, which prefixes any function names in that module, e.g.
   the module gsl_fft has function names like gsl_fft_init. The modules correspond
   to subdirectories of the library source tree.

[26]Object-orientation

   The algorithms should be object oriented, but only to the extent that is easy in
   portable ANSI C. The use of casting or other tricks to simulate inheritance is
   not desirable, and the user should not have to be aware of anything like that.
   This means many types of patterns are ruled out. However, this is not considered
   a problem -- they are too complicated for the library.

   Note: it is possible to define an abstract base class easily in C, using function
   pointers. See the rng directory for an example.

   When reimplementing public domain fortran code, please try to introduce the
   appropriate object concepts as structs, rather than translating the code
   literally in terms of arrays. The structs can be useful just within the file, you
   don't need to export them to the user.

   For example, if a fortran program repeatedly uses a subroutine like,
SUBROUTINE  RESIZE (X, K, ND, K1)

   where X(K,D) represents a grid to be resized to X(K1,D) you can make this more
   readable by introducing a struct,
struct grid {
    int nd;  /* number of dimensions */
    int k;   /* number of bins */
    double * x;   /* partition of axes, array of size x[k][nd] */
}

void
resize_grid (struct grid * g, int k_new)
{
...
}

   Similarly, if you have a frequently recurring code fragment within a single file
   you can define a static or static inline function for it. This is typesafe and
   saves writing out everything in full.

[27]Comments

   Follow the GNU Coding Standards. A relevant quote is,

   "Please write complete sentences and capitalize the first word. If a lower-case
   identifier comes at the beginning of a sentence, don't capitalize it! Changing
   the spelling makes it a different identifier. If you don't like starting a
   sentence with a lower case letter, write the sentence differently (e.g., "The
   identifier lower-case is ..".)".

[28]Minimal structs

   We prefer to make structs which are minimal. For example, if a certain type of
   problem can be solved by several classes of algorithm (e.g. with and without
   derivative information) it is better to make separate types of struct to handle
   those cases. i.e. run time type identification is not desirable.

[29]Algorithm decomposition

   Iterative algorithms should be decomposed into an INITIALIZE, ITERATE, TEST form,
   so that the user can control the progress of the iteration and print out
   intermediate results. This is better than using call-backs or using flags to
   control whether the function prints out intermediate results. In fact, call-backs
   should not be used -- if they seem necessary then it's a sign that the algorithm
   should be broken down further into individual components so that the user has
   complete control over them.

   For example, when solving a differential equation the user may need to be able to
   advance the solution by individual steps, while tracking a realtime process. This
   is only possible if the algorithm is broken down into step-level components.
   Higher level decompositions would not give sufficient flexibility.

[30]Memory allocation and ownership

   Functions which allocate memory on the heap should end in _alloc (e.g.
   gsl_foo_alloc) and be deallocated by a corresponding _free function
   (gsl_foo_free).

   Be sure to free any memory allocated by your function if you have to return an
   error in a partially initialized object.

   Don't allocate memory 'temporarily' inside a function and then free it before the
   function returns. This prevents the user from controlling memory allocation. All
   memory should be allocated and freed through separate functions and passed around
   as a "workspace" argument. This allows memory allocation to be factored out of
   tight loops.

[31]Memory layout

   We use flat blocks of memory to store matrices and vectors, not C-style
   pointer-to-pointer arrays. The matrices are stored in row-major order -- i.e. the
   column index (second index) moves continuously through memory.

[32]Linear Algebra Levels

   Functions using linear algebra are divided into two levels:

   For purely "1d" functions we use the C-style arguments (double *, stride, size)
   so that it is simpler to use the functions in a normal C program, without needing
   to invoke all the gsl_vector machinery.

   The philosophy here is to minimize the learning curve. If someone only needs to
   use one function, like an fft, they can do so without having to learn about
   gsl_vector.

   This leads to the question of why we don't do the same for matrices. In that case
   the argument list gets too long and confusing, with (size1, size2, tda) for each
   matrix and potential ambiguities over row vs column ordering. In this case, it
   makes sense to use gsl_vector and gsl_matrix, which take care of this for the
   user.

   So really the library has two levels -- a lower level based on C types for 1d
   operations, and a higher level based on gsl_matrix and gsl_vector for general
   linear algebra.

   Of course, it would be possible to define a vector version of the lower level
   functions too. So far we have not done that because it was not essential -- it
   could be done but it is easy enough to get by using the C arguments, by typing
   v->data, v->stride, v->size instead. A gsl_vector version of low-level functions
   would mainly be a convenience.

   Please use BLAS routines internally within the library whenever possible for
   efficiency.

[33]Exceptions and Error handling

   The basic error handling procedure is the return code (see gsl_errno.h for a list
   of allowed values). Use the GSL_ERROR macro to mark an error. The current
   definition of this macro is not ideal but it can be changed at compile time.

   You should always use the GSL_ERROR macro to indicate an error, rather than just
   returning an error code. The macro allows the user to trap errors using the
   debugger (by setting a breakpoint on the function gsl_error).

   The only circumstances where GSL_ERROR should not be used are where the return
   value is "indicative" rather than an error -- for example, the iterative routines
   use the return code to indicate the success or failure of an iteration. By the
   nature of an iterative algorithm "failure" (a return code of GSL_CONTINUE) is a
   normal occurrence and there is no need to use GSL_ERROR there.

   Be sure to free any memory allocated by your function if you return an error (in
   particular for errors in partially initialized objects).

[34]Persistence

   If you make an object foo which uses blocks of memory (e.g. vector, matrix,
   histogram) you can provide functions for reading and writing those blocks,
int gsl_foo_fread (FILE * stream, gsl_foo * v);
int gsl_foo_fwrite (FILE * stream, const gsl_foo * v);
int gsl_foo_fscanf (FILE * stream, gsl_foo * v);
int gsl_foo_fprintf (FILE * stream, const gsl_foo * v, const char *format);

   Only dump out the blocks of memory, not any associated parameters such as
   lengths. The idea is for the user to build higher level input/output facilities
   using the functions the library provides. The fprintf/fscanf versions should be
   portable between architectures, while the binary versions should be the "raw"
   version of the data. Use the functions
int gsl_block_fread (FILE * stream, gsl_block * b);
int gsl_block_fwrite (FILE * stream, const gsl_block * b);
int gsl_block_fscanf (FILE * stream, gsl_block * b);
int gsl_block_fprintf (FILE * stream, const gsl_block * b, const char *format);

   or
int gsl_block_raw_fread (FILE * stream, double * b, size_t n, size_t stride);
int gsl_block_raw_fwrite (FILE * stream, const double * b, size_t n, size_t stri
de);
int gsl_block_raw_fscanf (FILE * stream, double * b, size_t n, size_t stride);
int gsl_block_raw_fprintf (FILE * stream, const double * b, size_t n, size_t str
ide, const char *format);

   to do the actual reading and writing.

[35]Using Return Values

   Always assign a return value to a variable before using it. This allows easier
   debugging of the function, and inspection and modification of the return value.
   If the variable is only needed temporarily then enclose it in a suitable scope.

   For example, instead of writing,
a = f(g(h(x,y)))

   use temporary variables to store the intermediate values,
{
  double u = h(x,y);
  double v = g(u);
  a = f(v);
}

   These can then be inspected more easily in the debugger, and breakpoints can be
   placed more precisely. The compiler will eliminate the temporary variables
   automatically when the program is compiled with optimization.

[36]Variable Names

   Try to follow existing conventions for variable names,
   dim
          number of dimensions
   w
          pointer to workspace
   state
          pointer to state variable (use s if you need to save characters)
   result
          pointer to result (output variable)
   abserr
          absolute error
   relerr
          relative error
   epsabs
          absolute tolerance
   epsrel
          relative tolerance
   size
          the size of an array or vector e.g. double array[size]
   stride
          the stride of a vector
   size1
          the number of rows in a matrix
   size2
          the number of columns in a matrix
   n
          general integer number, e.g. number of elements of array, in fft, etc
   r
          random number generator (gsl_rng)

[37]Datatype widths

   Be aware that in ANSI C the type int is only guaranteed to provide 16-bits. It
   may provide more, but is not guaranteed to. Therefore if you require 32 bits you
   must use long int, which will have 32 bits or more. Of course, on many platforms
   the type int does have 32 bits instead of 16 bits but we have to code to the ANSI
   standard rather than a specific platform.

[38]size_t

   All objects (blocks of memory, etc) should be measured in terms of a size_t type.
   Therefore any iterations (e.g. for(i=0; i 0 && i--;) { ... }

   to avoid problems with wrap-around at i=0.

   If you really want to avoid confusion use a separate variable to invert the loop
   order,
for (i = 0; i < N; i++) { j = N - i; ... }

[39]Arrays vs Pointers

   A function can be declared with either pointer arguments or array arguments. The
   C standard considers these to be equivalent. However, it is useful to distinguish
   between the case of a pointer, representing a single object which is being
   modified, and an array which represents a set of objects with unit stride (that
   are modified or not depending on the presence of const). For vectors, where the
   stride is not required to be unity, the pointer form is preferred.
/* real value, set on output */
int foo (double * x);

/* real vector, modified */
int foo (double * x, size_t stride, size_t n);

/* constant real vector */
int foo (const double * x, size_t stride, size_t n);

/* real array, modified */
int bar (double x[], size_t n);

/* real array, not modified */
int baz (const double x[], size_t n);

[40]Pointers

   Avoid dereferencing pointers on the right-hand side of an expression where
   possible. It's better to introduce a temporary variable. This is easier for the
   compiler to optimise and also more readable since it avoids confusion between the
   use of * for multiplication and dereferencing.
while (fabs (f) < 0.5)
{
  *e = *e - 1;
  f  *= 2;
}

   is better written as,
{
  int p = *e;

  while (fabs(f) < 0.5)
    {
     p--;
     f *= 2;
    }

  *e = p;
}

[41]Constness

   Use const in function prototypes wherever an object pointed to by a pointer is
   constant (obviously). For variables which are meaningfully constant within a
   function/scope use const also. This prevents you from accidentally modifying a
   variable which should be constant (e.g. length of an array, etc). It can also
   help the compiler do optimization. These comments also apply to arguments passed
   by value which should be made const when that is meaningful.

[42]Pseudo-templates

   There are some pseudo-template macros available in 'templates_on.h' and
   'templates_off.h'. See a directory link 'block' for details on how to use them.
   Use sparingly, they are a bit of a nightmare, but unavoidable in places.

   In particular, the convention is: templates are used for operations on "data"
   only (vectors, matrices, statistics, sorting). This is intended to cover the case
   where the program must interface with an external data-source which produces a
   fixed type. e.g. a big array of char's produced by an 8-bit counter.

   All other functions can use double, for floating point, or the appropriate
   integer type for integers (e.g. unsigned long int for random numbers). It is not
   the intention to provide a fully templated version of the library.

   That would be "putting a quart into a pint pot". To summarize, almost everything
   should be in a "natural type" which is appropriate for typical usage, and
   templates are there to handle a few cases where it is unavoidable that other
   data-types will be encountered.

   For floating point work "double" is considered a "natural type". This sort of
   idea is a part of the C language.

[43]Arbitrary Constants

   Avoid arbitrary constants.

   For example, don't hard code "small" values like '1e-30', '1e-100' or
   10*GSL_DBL_EPSILON into the routines. This is not appropriate for a general
   purpose library.

   Compute values accurately using IEEE arithmetic. If errors are potentially
   significant then error terms should be estimated reliably and returned to the
   user, by analytically deriving an error propagation formula, not using guesswork.

   A careful consideration of the algorithm usually shows that arbitrary constants
   are unnecessary, and represent an important parameter which should be accessible
   to the user.

   For example, consider the following code:
if (residual < 1e-30) {
   return 0.0;  /* residual is zero within round-off error */
}

   This should be rewritten as,
   return residual;

   in order to allow the user to determine whether the residual is significant or
   not.

   The only place where it is acceptable to use constants like GSL_DBL_EPSILON is in
   function approximations, (e.g. taylor series, asymptotic expansions, etc). In
   these cases it is not an arbitrary constant, but an inherent part of the
   algorithm.

[44]Test suites

   The implementor of each module should provide a reasonable test suite for the
   routines.

   The test suite should be a program that uses the library and checks the result
   against known results, or invokes the library several times and does a
   statistical analysis on the results (for example in the case of random number
   generators).

   Ideally the one test program per directory should aim for 100% path coverage of
   the code. Obviously it would be a lot of work to really achieve this, so
   prioritize testing on the critical parts and use inspection for the rest. Test
   all the error conditions by explicitly provoking them, because we consider it a
   serious defect if the function does not return an error for an invalid parameter.
   N.B. Don't bother to test for null pointers -- it's sufficient for the library to
   segfault if the user provides an invalid pointer.

   The tests should be deterministic. Use the gsl_test functions provided to perform
   separate tests for each feature with a separate output PASS/FAIL line, so that
   any failure can be uniquely identified.

   Use realistic test cases with 'high entropy'. Tests on simple values such as 1 or
   0 may not reveal bugs. For example, a test using a value of x=1 will not pick up
   a missing factor of x in the code. Similarly, a test using a value of x=0 will
   not pick any missing terms involving x in the code. Use values like 2.385 to
   avoid silent failures.

   If your test uses multiple values make sure there are no simple relations between
   them that could allow bugs to be missed through silent cancellations.

   If you need some random floats to put in the test programs use od -f /dev/random
   as a source of inspiration.

   Don't use sprintf to create output strings in the tests. It can cause hard to
   find bugs in the test programs themselves. The functions gsl_test_... support
   format string arguments so use these instead.

[45]Compilation

   Make sure everything compiles cleanly. Use the strict compilation options for
   extra checking.
make CFLAGS="-ansi -pedantic -Werror -W -Wall -Wtraditional -Wconversion
  -Wshadow -Wpointer-arith -Wcast-qual -Wcast-align -Wwrite-strings
  -Wstrict-prototypes -fshort-enums -fno-common -Wmissing-prototypes
  -Wnested-externs -Dinline= -g -O4"

   Also use checkergcc to check for memory problems on the stack and the heap. It's
   the best memory checking tool. If checkergcc isn't available then Electric Fence
   will check the heap, which is better than no checking.

   There is a new tool valgrind for checking memory access. Test the code with this
   as well.

   Make sure that the library will also compile with C++ compilers (g++). This
   should not be too much of a problem if you have been writing in ANSI C.

[46]Thread-safety

   The library should be usable in thread-safe programs. All the functions should be
   thread-safe, in the sense that they shouldn't use static variables.

   We don't require everything to be completely thread safe, but anything that isn't
   should be obvious. For example, some global variables are used to control the
   overall behavior of the library (range-checking on/off, function to call on fatal
   error, etc). Since these are accessed directly by the user it is obvious to the
   multi-threaded programmer that they shouldn't be modified by different threads.

   There is no need to provide any explicit support for threads (e.g. locking
   mechanisms etc), just to avoid anything which would make it impossible for
   someone to call a GSL routine from a multithreaded program.

[47]Legal issues

     * Each contributor must make sure her code is under the GNU General Public
       License (GPL). This means getting a disclaimer from your employer.
     * We must clearly understand ownership of existing code and algorithms.
     * Each contributor can retain ownership of their code, or sign it over to FSF
       as they prefer. There is a standard disclaimer in the GPL (take a look at
       it). The more specific you make your disclaimer the more likely it is that it
       will be accepted by an employer. For example,
Yoyodyne, Inc., hereby disclaims all copyright interest in the software
`GNU Scientific Library - Legendre Functions' (routines for computing
legendre functions numerically in C) written by James Hacker.

, 1 April 1989
Ty Coon, President of Vice

     * Obviously: don't use or translate non-free code. In particular don't copy or
       translate code from Numerical Recipes or ACM TOMS. Numerical Recipes is under
       a strict license and is not free software. The publishers Cambridge
       University Press claim copyright on all aspects of the book and the code,
       including function names, variable names and ordering of mathematical
       subexpressions. Routines in GSL should not refer to Numerical Recipes or be
       based on it in any way. The ACM algorithms published in TOMS (Transactions on
       Mathematical Software) are not public domain, even though they are
       distributed on the internet -- the ACM uses a special non-commercial license
       which is not compatible with the GPL. The details of this license can be
       found on the cover page of ACM Transactions on Mathematical Software or on
       the ACM Website. Only use code which is explicitly under a free license: GPL
       or Public Domain. If there is no license on the code then this does not mean
       it is public domain -- an explicit statement is required. If in doubt check
       with the author.
     * I think one can reference algorithms from classic books on numerical analysis
       (BJG: yes, provided the code is an independent implementation and not copied
       from any existing software).

[48]Non-UNIX portability

   There is good reason to make this library work on non-UNIX systems. It is
   probably safe to ignore DOS and only worry about windows95/windowsNT portability
   (so filenames can be long, I think).

   On the other hand, nobody should be forced to use non-UNIX systems for
   development.

   The best solution is probably to issue guidelines for portability, like saying
   "don't use XYZ unless you absolutely have to". Then the Windows people will be
   able to do their porting.

[49]Compatibility with other libraries

   We do not regard compatibility with other numerical libraries as a priority.

   However, other libraries, such as Numerical Recipes, are widely used. If somebody
   writes the code to allow drop-in replacement of these libraries it would be
   useful to people. If it is done, it would be as a separate wrapper that can be
   maintained and shipped separately.

   There is a separate issue of system libraries, such as BSD math library and
   functions like expm1, log1p, hypot. The functions in this library are available
   on nearly every platform (but not all).

   In this case, it is best to write code in terms of these native functions to take
   advantage of the vendor-supplied system library (for example log1p is a machine
   instruction on the Intel x86). The library also provides portable implementations
   e.g. gsl_hypot which are used as an automatic fall back via autoconf when
   necessary. See the usage of hypot in 'gsl/complex/math.c', the implementation of
   gsl_hypot and the corresponding parts of files 'configure.in' and 'config.h.in'
   as an example.

[50]Parallelism

   We don't intend to provide support for parallelism within the library itself. A
   parallel library would require a completely different design and would carry
   overhead that other applications do not need.

[51]Precision

   For algorithms which use cutoffs or other precision-related terms please express
   these in terms of GSL_DBL_EPSILON and GSL_DBL_MIN, or powers or combinations of
   these. This makes it easier to port the routines to different precisions.

[52]Miscellaneous

   Don't use the letter l as a variable name -- it is difficult to distinguish from
   the number 1. (This seems to be a favorite in old Fortran programs).

   Final tip: one perfect routine is better than any number of routines containing
   errors.

                                       [53]Copying

   The subroutines and source code in the GNU Scientific Library package are "free";
   this means that everyone is free to use them and free to redistribute them on a
   free basis. The GNU Scientific Library-related programs are not in the public
   domain; they are copyrighted and there are restrictions on their distribution,
   but these restrictions are designed to permit everything that a good cooperating
   citizen would want to do. What is not allowed is to try to prevent others from
   further sharing any version of these programs that they might get from you.

   Specifically, we want to make sure that you have the right to give away copies of
   the programs that relate to GNU Scientific Library, that you receive source code
   or else can get it if you want it, that you can change these programs or use
   pieces of them in new free programs, and that you know you can do these things.

   To make sure that everyone has such rights, we have to forbid you to deprive
   anyone else of these rights. For example, if you distribute copies of the GNU
   Scientific Library-related code, you must give the recipients all the rights that
   you have. You must make sure that they, too, receive or can get the source code.
   And you must tell them their rights.

   Also, for our own protection, we must make certain that everyone finds out that
   there is no warranty for the programs that relate to GNU Scientific Library. If
   these programs are modified by someone else and passed on, we want their
   recipients to know that what they have is not what we distributed, so that any
   problems introduced by others will not reflect on our reputation.

   The precise conditions of the licenses for the programs currently being
   distributed that relate to GNU Scientific Library are found in the General Public
   Licenses that accompany them.
     ____________________________________________________________________________

   This document was generated using the texi2html translator version 1.54.

References

   1. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC1
   2. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC2
   3. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC3
   4. http://www.gnu.org/software/gsl/design/gsl-design.html#SEC40
   5. http://www.network-theory.co.uk/download/rngextra/
   6. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC4
   7. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC5
   8. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC6
   9. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC7
  10. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC8
  11. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC9
  12. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC10
  13. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC11
  14. http://www.research.att.com/sw/tools/sfio/dm-spe.ps
  15. http://citeseer.nj.nec.com/48973.html
  16. http://www.research.att.com/sw/tools/vmalloc/vmalloc.ps
  17. http://www.research.att.com/sw/tools/cdt/cdt.ps
  18. http://citeseer.nj.nec.com/korn91sfio.html
  19. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC12
  20. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC13
  21. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC14
  22. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC15
  23. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC16
  24. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC17
  25. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC18
  26. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC19
  27. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC20
  28. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC21
  29. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC22
  30. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC23
  31. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC24
  32. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC25
  33. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC26
  34. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC27
  35. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC28
  36. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC29
  37. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC30
  38. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC31
  39. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC32
  40. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC33
  41. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC34
  42. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC35
  43. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC36
  44. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC37
  45. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC38
  46. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC39
  47. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC40
  48. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC41
  49. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC42
  50. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC43
  51. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC44
  52. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC45
  53. http://www.gnu.org/software/gsl/design/gsl-design_toc.html#TOC46
Usage: http://www.kk-software.de/kklynxview/get/URL
e.g. http://www.kk-software.de/kklynxview/get/http://www.kk-software.de
Errormessages are in German, sorry ;-)