User's Guide

The GHC Team
Department of Computing Science
University of Glasgow
Glasgow, Scotland
G12 8QQ
Email: glasgow-haskell-{bugs,users}-request@dcs.gla.ac.uk



1 Introduction to GHC

This is a guide to using the Glasgow Haskell compilation (GHC) system. It is a batch compiler for the Haskell 1.4 language, with support for various Glasgow-only extensions.

Many people will use GHC very simply: compile some modules -- `ghc -c -O Foo.hs Bar.hs'; and link them -- `ghc -o wiggle -O Foo.o Bar.o'.

But if you need to do something more complicated, GHC can do that, too:

ghc -c -O -fno-foldr-build -dcore-lint -fvia-C -ddump-simpl Foo.lhs

Stay tuned -- all will be revealed!

In this document, we assume that GHC has been installed at your site as `ghc'. If you are unfamiliar with the conventions of UNIX compilers, the material in section 18 (Tutorial material about this compilation system) may help.

2 Knowing us, knowing you: Web sites, mailing lists, etc.

On the World-Wide Web, there are several URLs of likely interest:

Haskell home page -- `http://haskell.org/'
GHC home page -- `http://www.dcs.gla.ac.uk/fp/software/ghc/'
Glasgow FP group page -- `http://www.dcs.gla.ac.uk/fp/'
comp.lang.functional FAQ -- `http://www.cs.nott.ac.uk/Department/Staff/mpj/faq.html'

We run two mailing lists about Glasgow Haskell. We encourage you to join, as you feel is appropriate.

glasgow-haskell-users:
This list is for GHC users to chat among themselves. Subscribe by sending mail to `majordomo@dcs.gla.ac.uk', with a message body (not header) like this:
subscribe glasgow-haskell-users My Name <m.y.self@bigbucks.com>
(The last bit is your all-important e-mail address, of course.) To communicate with your fellow users, send mail to `glasgow-haskell-users'. To contact the list administrator, send mail to `glasgow-haskell-users-request'. An archive of the list is available on the Web at:
http://www.dcs.gla.ac.uk/mail-www/glasgow-haskell-users

glasgow-haskell-bugs:
Send bug reports for GHC to this address! The sad and lonely people who subscribe to this list will muse upon what's wrong and what you might do about it. Subscribe via `majordomo@dcs.gla.ac.uk' with:
subscribe glasgow-haskell-bugs My Name <m.y.self@hackers.r.us>
Again, you may contact the list administrator at `glasgow-haskell-bugs-request'. And, yes, an archive of the list is available on the Web at:
http://www.dcs.gla.ac.uk/mail-www/glasgow-haskell-bugs

There is also the general Haskell mailing list. Subscribe by sending email to `majordomo@dcs.gla.ac.uk', with the usual message body:

subscribe haskell My Name <m.y.self@fp.rules.ok.org>

Some Haskell-related discussion takes place in the Usenet newsgroup `comp.lang.functional'. (But note: news is basically dead at Glasgow. That's one reason Glaswegians aren't too active in c.l.f.)

The main anonymous-FTP site for Glasgow Haskell is `ftp.dcs.gla.ac.uk', in `pub/haskell/glasgow/'. "Important" bits are mirrored at other Haskell archive sites (and we have their stuff, too).

3 Release notes

3.1 Release notes for version 2.09 -- 11/97

Changes made since 2.08:

In 2.09, we've made extensive changes to the libraries that provide access to various GHC extensions. Many of these extensions are now provided by Hugs, and as a result we are now using a common set of extension libraries.

Briefly, the changes are as follows (for the definition of the new libraries, see the section on GHC/Hugs libraries in the User Guide).

3.2 Release notes for version 2.08 -- 10/97

GHC 2.08 is a minor upgrade of GHC 2.07, and is released as source only. It represents work done through September '97.

Here's a list of the changes made since GHC 2.07:

The following changes were made to Happy:

3.3 Release notes for version 2.06 -- 9/97

GHC 2.06 is a minor upgrade of GHC 2.05, and is released as source only. It represents work done through August '97.

Here's a list of the changes made since GHC 2.05:

3.4 Release notes for version 2.04 -- 6/97

GHC 2.04 is a minor upgrade of GHC 2.02 (and the bugfix release, 2.03), representing work done through May '97. This new release adds the following:

In addition to the above, we've made the following minor changes to the GHC libraries/our implementation of the Haskell standard prelude:

3.5 Release notes for version 2.03 -- 4/97

GHC 2.03 is a minor upgrade of GHC 2.02. It differs in the following ways:

None of the bugs in GHC 2.02 are show-stoppers, so you'll only want to upgrade if you are a build-from-source junkie and the bugs are irritating you.

Apart from bug-fixes, which are itemized elsewhere, the exact language and standard-library changes are as follows:

In addition, we've made the following changes to GHC libraries/GHC's implementation of the Haskell standard prelude:

3.6 Release notes for version 2.02 -- 3/97

Release 2.02 is the first release of Glasgow Haskell for Haskell 1.4.

The announcement for this release is distributed as `ANNOUNCE-2.02' in the top-level directory. It contains very important caveats about 2.02, which we do not repeat here!

Information about "what's ported to which machine" is in the Installation Guide. Since 2.01, we've added support for Win32 (Windows NT and Windows 95).

3.6.1 New configuration things in 2.02

3.6.2 User-visible changes in 2.02, including incompatibilities

GHC 2.02 is a compiler for Haskell 1.4 and, as such, introduces a bunch of user-visible changes. The GHC user's guide has a section to help you upgrade your programs to Haskell 1.4 from 1.2; all user-visible changes are described there (and not repeated here).

3.6.3 New or changed GHC command-line options

3.6.4 New in the compiler proper

3.6.5 In the libraries

The libraries have been completely reorganised. There's a description in section 10 System libraries.

3.6.6 In "hslibs" libraries

3.6.7 In the runtime system

3.6.8 Other new stuff

2.02 is released together with Green Card, a foreign-language interface generator for Haskell. More details elsewhere...

3.6.9 Known troublespots

The 2.02 compiler has the following known deficiencies:

native code generator, x86:
The native code generator for x86 platforms is switched off by default, as the code the compiler produces with it enabled was discovered just before releasing to be wonky. Rather than delay the release further, GHC on x86 platforms relies on `GCC' as its backend for now. Hopefully fixed soon.

Simplifier looping:
The simplifier (Glasgow-speak for optimiser) has been observed to get into a loop in one or two cases. If you should observe this, please report it as a bug; the workaround is to turn off optimisation.

Undefined `*_vap_info' symbols:
If the linker complains about undefined `*_vap_info' symbols when linking 2.02-compiled programs (very unlikely), fix this by compiling the module the references come from with `-fno-lambda-lifting'.

3.7 Old release notes

We used to include the release notes back to the dawn of time in this document. Made for a nice long document, but it wasn't that interesting. If you would like to see old release notes, just ask; we've still got 'em.

4 Invoking GHC: Command-line options

Command-line arguments are either options or file names.

Command-line options begin with `-'. They may not be grouped: `-vO' is different from `-v -O'. Options need not precede filenames: e.g., `ghc *.o -o foo'. All options are processed and then applied to all files; you cannot, for example, invoke `ghc -c -O1 Foo.hs -O2 Bar.hs' to apply different optimisation levels to the files `Foo.hs' and `Bar.hs'. For conflicting options, e.g., `-c -S', we reserve the right to do anything we want. (Usually, the last one applies.)

Options related to profiling, Glasgow extensions to Haskell (e.g., unboxed values), and Concurrent and Parallel Haskell are described in section 7 Profiling Haskell programs, section 8 Glasgow extensions to Haskell, and section 11 Concurrent and Parallel Haskell, respectively.

4.1 Meaningful file suffixes

File names with "meaningful" suffixes (e.g., `.lhs' or `.o') cause the "right thing" to happen to those files.

`.lhs':
A "literate Haskell" module.

`.hs':
A not-so-literate Haskell module.

`.hi':
A Haskell interface file, probably compiler-generated.

`.hc':
Intermediate C file produced by the Haskell compiler.

`.c':
A C file not produced by the Haskell compiler.

`.s':
An assembly-language source file, usually produced by the compiler.

`.o':
An object file, produced by an assembler.

Files with other suffixes (or without suffixes) are passed straight to the linker.

4.2 Help and verbosity options

A good option to start with is the `-help' (or `-?') option.

GHC spews a long message to standard output and then exits.

The `-v' option makes GHC verbose: it reports its version number and shows (on stderr) exactly how it invokes each phase of the compilation system. Moreover, it passes the `-v' flag to most phases; each reports its version number (and possibly some other information).

Please, oh please, use the `-v' option when reporting bugs! Knowing that you ran the right bits in the right order is always the first thing we want to verify.

If you're just interested in the compiler version number, the `--version' option prints out a one-line string containing the requested info.

4.3 Running the right phases in the right order

The basic task of the `ghc' driver is to run each input file through the right phases (parsing, linking, etc.).

The first phase to run is determined by the input-file suffix, and the last phase is determined by a flag. If no relevant flag is present, then go all the way through linking. This table summarises:

phase of the            suffix saying  flag saying   (suffix of)  
compilation system      "start here"   "stop after"  output file  
----------------------------------------------------------------
literate pre-processor  .lhs           -             -            
C pre-processor (opt.)  -              -             -            
Haskell compiler        .hs            -C, -S        .hc, .s      
C compiler (opt.)       .hc or .c      -S            .s           
assembler               .s             -c            .o           
linker                  other          -             a.out        

Thus, a common invocation would be: `ghc -c Foo.hs'

Note: What the Haskell compiler proper produces depends on whether a native-code generator is used (producing assembly language) or not (producing C).
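
The table above can be read as a small function from suffix and flags to a list of phases. What follows is only a sketch of that reading, not the driver's actual code: the names `Phase', `startPhase', `lastPhase' and `phasesFor' are made up for illustration, the optional C pre-processor is left out, and `-S' is modelled as stopping after the C compiler (i.e., the via-C route).

```haskell
import Data.List (isSuffixOf)

-- Phases of the compilation system, in the order the table lists them.
-- (Hypothetical type; the optional C pre-processor is not modelled.)
data Phase = Unlit | HsCompiler | CCompiler | Assembler | Linker
  deriving (Show, Eq, Ord, Enum)

-- First phase to run, chosen from the input-file suffix.
-- Anything unrecognised is handed straight to the linker.
startPhase :: FilePath -> Phase
startPhase file
  | ".lhs" `isSuffixOf` file = Unlit
  | ".hs"  `isSuffixOf` file = HsCompiler
  | ".hc"  `isSuffixOf` file = CCompiler
  | ".c"   `isSuffixOf` file = CCompiler
  | ".s"   `isSuffixOf` file = Assembler
  | otherwise                = Linker

-- Last phase to run, chosen from the "stop after" flags.
-- With no relevant flag we go all the way through linking.
lastPhase :: [String] -> Phase
lastPhase flags
  | "-C" `elem` flags = HsCompiler
  | "-S" `elem` flags = CCompiler
  | "-c" `elem` flags = Assembler
  | otherwise         = Linker

-- Every phase from the starting one to the stopping one, inclusive.
-- The list is empty if the file already starts past the stopping point.
phasesFor :: [String] -> FilePath -> [Phase]
phasesFor flags file = [startPhase file .. lastPhase flags]
```

Under these assumptions, `phasesFor ["-c"] "Foo.lhs"' runs from the literate pre-processor through the assembler, and an object file given without flags goes to the linker only.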

The option `-cpp' must be given for the C pre-processor phase to be run, that is, the pre-processor will be run over your Haskell source file before continuing.

The option `-E' runs just the pre-processing passes of the compiler, outputting the result on stdout before stopping. If used in conjunction with `-cpp', the output is the code blocks of the original (literate) source after having put it through the grinder that is the C pre-processor. Sans `-cpp', the output is the de-litted version of the original source.

The option `-optcpp-E' runs just the pre-processing stage of the C-compiling phase, sending the result to stdout. (For debugging or obfuscation contests, usually.)

4.4 Optimisation (code improvement)

The `-O*' options specify convenient "packages" of optimisation flags; the `-f*' options described later on specify individual optimisations to be turned on/off; the `-m*' options specify machine-specific optimisations to be turned on/off.

4.4.1 `-O*': convenient "packages" of optimisation flags.

There are many options that affect the quality of code produced by GHC. Most people only have a general goal, something like "Compile quickly" or "Make my program run like greased lightning." The following "packages" of optimisations (or lack thereof) should suffice.

Once you choose a `-O*' "package," stick with it -- don't chop and change. Modules' interfaces will change with a shift to a new `-O*' option, and you may have to recompile a large chunk of all importing modules before your program can again be run safely (see section 15.2 The recompilation checker).

No `-O*'-type option specified:
This is taken to mean: "Please compile quickly; I'm not over-bothered about compiled-code quality." So, for example: `ghc -c Foo.hs'

`-O' or `-O1':
Means: "Generate good-quality code without taking too long about it." Thus, for example: `ghc -c -O Main.lhs'

`-O2':
Means: "Apply every non-dangerous optimisation, even if it means significantly longer compile times." The avoided "dangerous" optimisations are those that can make runtime or space worse if you're unlucky. They are normally turned on or off individually. At the moment, `-O2' is unlikely to produce better code than `-O'.

`-fvia-C':
Compile via C, and don't use the native-code generator. (There are many cases when GHC does this on its own.) You might pick up a little bit of speed by compiling via C. If you use `_ccall_'s or `_casm_'s, you probably have to use `-fvia-C'. The lower-case incantation, `-fvia-c', is synonymous.

`-O2-for-C':
Says to run GCC with `-O2', which may be worth a few percent in execution speed. Don't forget `-fvia-C', lest you use the native-code generator and bypass GCC altogether!

`-Onot':
This option will make GHC "forget" any -Oish options it has seen so far. Sometimes useful; for example: `make all EXTRA_HC_OPTS=-Onot'.

`-Ofile <file>':
For those who need absolute control over exactly what options are used (e.g., compiler writers, sometimes :-), a list of options can be put in a file and then slurped in with `-Ofile'. In that file, comments are of the `#'-to-end-of-line variety; blank lines and most whitespace are ignored. Please ask if you are baffled and would like an example of `-Ofile'!

At Glasgow, we don't use a `-O*' flag for day-to-day work. We use `-O' to get respectable speed; e.g., when we want to measure something. When we want to go for broke, we tend to use `-O -fvia-C -O2-for-C' (and we go for lots of coffee breaks).

The easiest way to see what `-O' (etc) "really mean" is to run with `-v', then stand back in amazement. Alternatively, just look at the `@HsC_minus<blah>' lists in the `ghc' driver script.

4.4.2 `-f*': platform-independent flags

Flags can be turned off individually. (NB: I hope you have a good reason for doing this....) To turn off the `-ffoo' flag, just use the `-fno-foo' flag. So, for example, you can say `-O2 -fno-strictness', which will then drop out any running of the strictness analyser.

The options you are most likely to want to turn off are: `-fno-strictness' (strictness analyser [because it is sometimes slow]), `-fno-specialise' (automatic specialisation of overloaded functions [because it makes your code bigger]) [US spelling also accepted], and `-fno-foldr-build'.

Should you wish to turn individual flags on, you are advised to use the `-Ofile' option, described above. Because the order in which optimisation passes are run is sometimes crucial, it's quite hard to do with command-line options.

Here are some "dangerous" optimisations you might want to try:

`-funfolding-creation-threshold<n>':
(Default: 30) By raising or lowering this number, you can raise or lower the amount of pragmatic junk that gets spewed into interface files. (An unfolding has a "size" that reflects the cost in terms of "code bloat" of expanding that unfolding in another module. A bigger Core expression would be assigned a bigger cost.)

`-funfolding-use-threshold<n>':
(Default: 3) By raising or lowering this number, you can make the compiler more or less keen to expand unfoldings. OK, folks, these magic numbers `30' and `3' are mildly arbitrary; they are of the "seem to be OK" variety. The `3' is the more critical one; it's what determines how eager GHC is about expanding unfoldings.

`-funfolding-override-threshold<n>':
(Default: 8) [Pretty obscure] When deciding what unfoldings from a module should be made available to the rest of the world (via this module's interface), the compiler normally likes "small" expressions. For example, if it sees `foo = bar', it will decide that the very small expression `bar' is a great unfolding for `foo'. But if `bar' turns out to be `(True,False,True)', we would probably prefer that for the unfolding for `foo'. Should we "override" the initial small unfolding from `foo=bar' with the bigger-but-better one? Yes, if the bigger one's "size" is still under the "override threshold." You can use this flag to adjust this threshold (why, I'm not sure).

`-fsemi-tagging':
This option (which does not work with the native-code generator) tells the compiler to add extra code to test for already-evaluated values. You win if you have lots of such values during a run of your program, you lose otherwise. (And you pay in extra code space.) We have not played with `-fsemi-tagging' enough to recommend it. (For all we know, it doesn't even work anymore... Sigh.)

4.4.3 `-m*': platform-specific flags

Some flags only make sense for particular target platforms.

`-mv8':
(SPARC machines) Means to pass the like-named option to GCC; it says to use the Version 8 SPARC instructions, notably integer multiply and divide. The similar `-m*' GCC options for SPARC also work, actually.

`-mlong-calls':
(HPPA machines) Means to pass the like-named option to GCC. Required for Very Big modules, maybe. (Probably means you're in trouble...)

`-monly-[32]-regs':
(iX86 machines) GHC tries to "steal" four registers from GCC, for performance reasons; it almost always works. However, when GCC is compiling some modules with four stolen registers, it will crash, probably saying:
Foo.hc:533: fixed or forbidden register was spilled.
This may be due to a compiler bug or to impossible asm
statements or clauses.
Just give some registers back with `-monly-N-regs'. Try `3' first, then `2'. If `2' doesn't work, please report the bug to us.

4.4.4 Code improvement by the C compiler.

The C compiler (GCC) is run with `-O' turned on. (It has to be, actually.)

If you want to run GCC with `-O2' -- which may be worth a few percent in execution speed -- you can give a `-O2-for-C' option.

4.5 Warnings and sanity-checking

GHC has a selection of options that select which types of non-fatal error messages, otherwise known as warnings, can be generated during compilation. By default, you get a standard set of warnings which are generally likely to indicate bugs in your program. These are: `-fwarn-overlapping-patterns' and `-fwarn-missing-methods'. The following flags are simple ways to select standard "packages" of warnings:

`-Wnot':
Turns off all warnings, including the standard ones.

`-W':
Provides the standard warnings plus `-fwarn-incomplete-patterns' and `-fwarn-unused-names'.

`-Wall':
Turns on all warning options.

The full set of warning options is described below. To turn off any warning, simply give the corresponding `-fno-warn-...' option on the command line.
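
One way to picture how the "packages" and the individual flags combine is as a fold over the command line. The sketch below is a model only: the function names are invented, and it assumes options take effect left to right, which the text above does not actually promise.

```haskell
import Data.List (nub, stripPrefix)

-- Warning names, written without the "-fwarn-" prefix.
standard, extra, everything :: [String]
standard   = ["overlapping-patterns", "missing-methods"]
extra      = ["incomplete-patterns", "unused-names"]
everything = standard ++ extra ++ ["name-shadowing"]

-- Apply one option to the set of warnings enabled so far.
-- (Hypothetical helper; left-to-right processing is an assumption.)
applyWarnOpt :: [String] -> String -> [String]
applyWarnOpt ws opt
  | opt == "-Wnot" = []
  | opt == "-W"    = nub (ws ++ standard ++ extra)
  | opt == "-Wall" = everything
  | Just w <- stripPrefix "-fno-warn-" opt = filter (/= w) ws
  | Just w <- stripPrefix "-fwarn-" opt    = nub (ws ++ [w])
  | otherwise      = ws

-- Start from the standard set and apply each option in turn.
enabledWarnings :: [String] -> [String]
enabledWarnings = foldl applyWarnOpt standard
```

So, in this model, `enabledWarnings ["-W", "-fno-warn-unused-names"]' leaves the two standard warnings plus `incomplete-patterns'.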

`-fwarn-name-shadowing':
This option causes a warning to be emitted whenever an inner-scope value has the same name as an outer-scope value, i.e. the inner value shadows the outer one. This can catch typographical errors that turn into hard-to-find bugs, e.g., in the inadvertent cyclic definition `let x = ... x ... in'. Consequently, this option will also complain about intentional cyclic definitions of that form.

`-fwarn-overlapping-patterns':
The compiler can warn you if a set of patterns is either incomplete (i.e., you're only matching on a subset of an algebraic data type's constructors) or overlapping, e.g.,
f :: String -> Int
f []     = 0
f (_:xs) = 1
f "2"    = 2

g [] = 2
where the last pattern match in `f' won't ever be reached, as the second pattern overlaps it. More often than not, redundant patterns are a programmer mistake, so this option is enabled by default.

`-fwarn-incomplete-patterns':
Similarly for incomplete patterns, the function `g' will fail when applied to non-empty lists, so the compiler will emit a warning about this when this option is enabled.

`-fwarn-missing-methods':
This option is on by default, and warns you whenever an instance declaration is missing one or more methods, and the corresponding class declaration has no default declaration for them.

`-fwarn-unused-names':
Have the renamer report which locally defined names are not used/exported. This option is not currently supported.
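
To make the warnings above a bit more concrete, here is a small module that would provoke three of them, next to versions that would not. Everything in it (`bumpBroken', `bump', `fFixed', `gFixed', `Shape', `Square') is made up for illustration, and the "fixed" functions show just one possible repair.

```haskell
-- -fwarn-name-shadowing: the inner x shadows the argument x, so the
-- binding refers to itself and never yields a value when evaluated.
bumpBroken :: Int -> Int
bumpBroken x = let x = x + 1 in x

-- What was probably meant: a fresh name for the inner binding.
bump :: Int -> Int
bump x = let x' = x + 1 in x'

-- -fwarn-overlapping-patterns: putting the specific pattern before
-- the general one makes every equation reachable.
fFixed :: String -> Int
fFixed []    = 0
fFixed "2"   = 2
fFixed (_:_) = 1

-- -fwarn-incomplete-patterns: a catch-all equation covers non-empty
-- lists. The result 0 is an arbitrary choice for this example.
gFixed :: [a] -> Int
gFixed [] = 2
gFixed _  = 0

-- -fwarn-missing-methods: the instance below gives no definition for
-- perimeter, and the class has no default for it.
class Shape a where
  area      :: a -> Double
  perimeter :: a -> Double

newtype Square = Square Double

instance Shape Square where
  area (Square s) = s * s
```

The module still compiles; the warnings only point at places where a call such as `perimeter (Square 1)' or `bumpBroken 1' would go wrong at run time.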

If you would like GHC to check that every top-level value has a type signature, use the `-fsignatures-required' option.

If you're feeling really paranoid, the `-dcore-lint' option is a good choice. It turns on heavyweight intra-pass sanity-checking within GHC. (It checks GHC's sanity, not yours.)

4.6 Re-directing the compilation output(s)

When compiling a Haskell module, GHC may produce several files of output (usually two).

One file is usually an interface file. If compiling `bar/Foo.hs', the interface file would normally be `bar/Foo.hi'. The interface output may be directed to another file `bar2/Wurble.iface' with the option `-ohi bar2/Wurble.iface' (not recommended).

To avoid generating an interface file at all, use a `-nohi' option.

The compiler does not overwrite an existing `.hi' interface file if the new one is byte-for-byte the same as the old one; this is friendly to `make'. When an interface does change, it is often enlightening to be informed. The `-hi-diffs' option will make `ghc' run `diff' on the old and new `.hi' files. You can also record the difference in the interface file itself; the `-keep-hi-diffs' option takes care of that.

The `.hi' files from GHC 2.xx contain "usage" information which changes often and uninterestingly. If you really want to see these changes reported, you need to use the `-hi-diffs-with-usages' option.

GHC's non-interface output normally goes into a `.hc', `.o', etc., file, depending on the last-run compilation phase. The option `-o foo' re-directs the output of that last-run phase to file `foo'.

Note: this "feature" can be counterintuitive: `ghc -C -o foo.o foo.hs' will put the intermediate C code in the file `foo.o', name notwithstanding!

EXOTICA: But the `-o' option isn't of much use if you have several input files... Non-interface output files are normally put in the same directory as their corresponding input file came from. You may specify that they be put in another directory using the `-odir <dir>' (the "Oh, dear" option). For example:

% ghc -c parse/Foo.hs parse/Bar.hs gurgle/Bumble.hs -odir `arch`

The output files, `Foo.o', `Bar.o', and `Bumble.o' would be put into a subdirectory named after the architecture of the executing machine (`sun4', `mips', etc). The directory must already exist; it won't be created.

Note that the `-odir' option does not affect where the interface files are put. In the above example, they would still be put in `parse/Foo.hi', `parse/Bar.hi', and `gurgle/Bumble.hi'.

MORE EXOTICA: The `-osuf <suffix>' option will change the `.o' file suffix for object files to whatever you specify. (We use this in compiling the prelude.)

Similarly, the `-hisuf <suffix>' option will change the `.hi' file suffix for non-system interface files. This can be useful when you are trying to compile a program several ways, all in the same directory. The suffix given is used for all interface files written, and for all non-system interface files that you read.

The `-hisuf'/`-osuf' game is useful if you want to compile a program with both GHC and HBC (say) in the same directory. Let HBC use the standard `.hi'/`.o' suffixes; add `-hisuf g_hi -osuf g_o' to your `make' rule for GHC compiling...

NB: A change from 0.26 and before: Before, you might have said `-hisuf _g.hi -osuf _g.o'; now, the `.' is assumed and you specify what comes after it. (This is a more portable solution for the long term.)
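
The naming rules for `-odir', `-osuf' and `-hisuf' can be summed up as a pair of path computations. This is a sketch of how we read the text above, not the driver's own logic; the function names are invented, and it assumes UNIX-style `/' separators and that `-odir' drops the input file's own directory.

```haskell
-- Split a path into its directory (with trailing '/') and file name.
splitDir :: FilePath -> (FilePath, FilePath)
splitDir path = (reverse revDir, reverse revFile)
  where (revFile, revDir) = break (== '/') (reverse path)

-- Drop everything from the last '.' onwards, if there is one.
dropSuffix :: FilePath -> FilePath
dropSuffix file = case break (== '.') (reverse file) of
  (_, _ : revStem) -> reverse revStem
  (_, [])          -> file

-- Object file for an input: -odir replaces the directory,
-- -osuf replaces the default "o" suffix (given without the dot).
objectPath :: Maybe FilePath -> Maybe String -> FilePath -> FilePath
objectPath odir osuf input = outDir ++ dropSuffix file ++ "." ++ suffix
  where
    (inDir, file) = splitDir input
    outDir        = maybe inDir (++ "/") odir
    suffix        = maybe "o" id osuf

-- Interface file for an input: -odir has no effect here,
-- -hisuf replaces the default "hi" suffix.
interfacePath :: Maybe String -> FilePath -> FilePath
interfacePath hisuf input = inDir ++ dropSuffix file ++ "." ++ suffix
  where
    (inDir, file) = splitDir input
    suffix        = maybe "hi" id hisuf
```

For instance, `objectPath (Just "sun4") Nothing "gurgle/Bumble.hs"' gives `sun4/Bumble.o', while `interfacePath Nothing "gurgle/Bumble.hs"' stays at `gurgle/Bumble.hi'.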

FURTHER EXOTICA: If you are doing a normal `.hs'-to-`.o' compilation but would like to hang onto the intermediate `.hc' C file, just throw in a `-keep-hc-file-too' option. If you would like to look at the assembler output, toss in a `-keep-s-file-too', too.

SAVING GHC STDERR OUTPUT: Sometimes, you may cause GHC to be rather chatty on standard error; with `-fshow-import-specs', for example. You can instruct GHC to append this output to a particular log file with a `-odump <blah>' option.

TEMPORARY FILES: If you have trouble because of running out of space in `/tmp/' (or wherever your installation thinks temporary files should go), you may use the `-tmpdir <dir>' option to specify an alternate directory. For example, `-tmpdir .' says to put temporary files in the current working directory.

BETTER IDEA FOR TEMPORARY FILES: Use your `TMPDIR' environment variable. Set it to the name of the directory where temporary files should be put. GCC and other programs will honour the `TMPDIR' variable as well.

EVEN BETTER IDEA: Set the `TMPDIR' variable when building GHC, and never worry about `TMPDIR' again. (See the build documentation.)

4.7 For finding interface files, etc.

In your program, you import a module `Foo' by saying `import Foo'. GHC goes looking for an interface file, `Foo.hi'. It has a builtin list of directories (notably including `.') where it looks.

The `-i<dirs>' option prepends a colon-separated list of `dirs' to the "import directories" list.

A plain `-i' resets the "import directories" list back to nothing.
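
The effect of a series of `-i' options can be modelled as a fold over the import-directories list. The sketch below is illustrative only: `applyImportOpt' and `importDirs' are invented names, the builtin list is passed in as a parameter because its real contents are not spelled out here, and we assume later `-i<dirs>' options go in front of earlier ones.

```haskell
-- Split a colon-separated list, e.g. "src:lib" into ["src", "lib"].
splitColons :: String -> [String]
splitColons s = case break (== ':') s of
  (dir, [])       -> [dir]
  (dir, _ : rest) -> dir : splitColons rest

-- Apply one option to the current import-directories list.
applyImportOpt :: [FilePath] -> String -> [FilePath]
applyImportOpt dirs opt = case opt of
  "-i"                -> []                         -- plain -i: reset to nothing
  ('-' : 'i' : given) -> splitColons given ++ dirs  -- -i<dirs>: prepend
  _                   -> dirs                       -- not an import option

-- Start from the builtin list and apply each option in turn.
importDirs :: [FilePath] -> [String] -> [FilePath]
importDirs = foldl applyImportOpt
```

With a builtin list of just `.', `importDirs ["."] ["-isrc:lib"]' searches `src', `lib', then `.'; adding a plain `-i' anywhere forgets everything that came before it.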

GHC normally imports `Prelude.hi' files for you. If you'd rather it didn't, then give it a `-fno-implicit-prelude' option. You are unlikely to get very far without a Prelude, but, hey, it's a free country.

If you are using a system-supplied non-Prelude library (e.g., the HBC library), just use a `-syslib hbc' option (for example). The right interface files should then be available.

Once a Haskell module has been compiled to C (`.hc' file), you may wish to specify where GHC tells the C compiler to look for `.h' files. (Or, if you are using the `-cpp' option, where it tells the C pre-processor to look...) For this purpose, use a `-I<dir>' in the usual C-ish way.

Pragmas: Interface files are normally jammed full of compiler-produced pragmas, which record arities, strictness info, etc. If you think these pragmas are messing you up (or you are doing some kind of weird experiment), you can tell GHC to ignore them with the `-fignore-interface-pragmas' option.

When compiling without optimisations on, the compiler is extra-careful about not slurping in data constructors and instance declarations that it will not need. If you believe it is getting it wrong and not importing stuff which you think it should, this optimisation can be turned off with `-fno-prune-tydecls' and `-fno-prune-instdecls'.

See also section 4.10 Linking and consistency-checking, which describes how the linker finds standard Haskell libraries.

4.8 Related to the C pre-processor

The C pre-processor `cpp' is run over your Haskell code only if the `-cpp' option is given. Unless you are building a large system with significant doses of conditional compilation, you really shouldn't need it.

`-D<foo>':
Define macro `<foo>' in the usual way. NB: does not affect `-D' macros passed to the C compiler when compiling via C! For those, use the `-optc-Dfoo' hack...

`-U<foo>':
Undefine macro `<foo>' in the usual way.

`-I<dir>':
Specify a directory in which to look for `#include' files, in the usual C way.

The `ghc' driver pre-defines several macros:

`__HASKELL1__':
If defined to n, that means GHC supports the Haskell language defined in the Haskell report version 1.n. Currently 4. NB: This macro is set both when pre-processing Haskell source and when pre-processing generated C (`.hc') files.

`__GLASGOW_HASKELL__':
For version n of the GHC system, this will be `#define'd to 100 * n. So, for version 2.02, it is 202. This macro is only set when pre-processing Haskell source. (Not when pre-processing generated C.) With any luck, `__GLASGOW_HASKELL__' will be undefined in all other implementations that support C-style pre-processing. (For reference: the comparable symbols for other systems are: `__HUGS__' for Hugs and `__HBC__' for Chalmers.)

`__CONCURRENT_HASKELL__':
Only defined when `-concurrent' is in use! This symbol is defined when pre-processing Haskell (input) and pre-processing C (GHC output).

`__PARALLEL_HASKELL__':
Only defined when `-parallel' is in use! This symbol is defined when pre-processing Haskell (input) and pre-processing C (GHC output).

Options other than the above can be forced through to the C pre-processor with the `-opt' flags (see section 4.13.2 Forcing options to a particular phase.).
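
As an illustration of the pre-defined macros, here is a module that picks a definition depending on which system is compiling it. Take it as a sketch: `compilerName' is a made-up name, and the `LANGUAGE CPP' pragma on the first line is how later GHC releases switch the pre-processor on from inside the file; with the compiler described in this guide you would pass `-cpp' on the command line instead.

```haskell
{-# LANGUAGE CPP #-}

-- Describe the system that compiled this module, using the
-- pre-defined macros mentioned above.
compilerName :: String
#if defined(__GLASGOW_HASKELL__)
compilerName = "GHC, version macro " ++ show (__GLASGOW_HASKELL__ :: Int)
#elif defined(__HUGS__)
compilerName = "Hugs"
#elif defined(__HBC__)
compilerName = "HBC"
#else
compilerName = "some other Haskell system"
#endif
```

Under GHC 2.02 the first branch would be chosen and the macro would expand to 202, following the 100 * n rule.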

A small word of warning: `-cpp' is not friendly to "string gaps". In other words, strings such as the following:

        strmod = "\
        \ p \
        \ "

don't work with `-cpp'; `/usr/bin/cpp' elides the backslash-newline pairs.

However, it appears that if you add a space at the end of the line, then `cpp' (at least GNU `cpp' and possibly other `cpp's) leaves the backslash-space pairs alone and the string gap works as expected.
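
If you would rather not depend on how a particular `cpp' treats trailing spaces, one alternative (our suggestion, not something the driver does for you) is to avoid the gap and build the string with `++'. The sketch below shows the gapped string from above and an equivalent without any backslash-newline pairs; the names `withGap' and `withoutGap' are invented.

```haskell
-- The string gap from the text. Without -cpp, the backslash,
-- the whitespace and the next backslash simply disappear.
withGap :: String
withGap = "\
        \ p \
        \ "

-- The same value with no backslash-newline pairs for cpp to elide.
withoutGap :: String
withoutGap = " p " ++
             " "
```

Both definitions denote the same four-character string, so the second can stand in for the first in a module compiled with `-cpp'.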

4.9 Options affecting the C compiler (if applicable)

At the moment, quite a few common C-compiler options are passed on quietly to the C compilation of Haskell-compiler-generated C files. THIS MAY CHANGE. Meanwhile, options so sent are:

`-Wall'       get all warnings from GCC                  
`-ansi'       do ANSI C (not K&R)                        
`-pedantic'   be so                                      
`-dgcc-lint'  (hack) short for "make GCC very paranoid"  

If you are compiling with lots of `ccalls', etc., you may need to tell the C compiler about some `#include' files. There is no real pretty way to do this, but you can use this hack from the command-line:

% ghc -c '-#include <X/Xlib.h>' Xstuff.lhs

4.10 Linking and consistency-checking

GHC has to link your code with various libraries, possibly including: user-supplied, GHC-supplied, and system-supplied (`-lm' math library, for example).

`-l<FOO>':
Link in a library named `lib<FOO>.a' which resides somewhere on the library directories path. Because of the sad state of most UNIX linkers, the order of such options does matter. Thus: `ghc -lbar *.o' is almost certainly wrong, because it will search `libbar.a' before it has collected unresolved symbols from the `*.o' files. `ghc *.o -lbar' is probably better. The linker will of course be informed about some GHC-supplied libraries automatically; these are:
-l equivalent     description                                  
-------------------------------------------------------------
-lHSrts,-lHSclib  basic runtime libraries                      
-lHS              standard Prelude library                     
-lHS_cbits        C support code for standard Prelude library  
-lgmp             GNU multi-precision library (for Integers)   

`-syslib <name>':
If you are using a Haskell "system library" (e.g., the HBC library), just use the `-syslib hbc' option, and the correct code should be linked in.

`-L<dir>':
Where to find user-supplied libraries... Prepend the directory `<dir>' to the library directories path.

`-static':
Tell the linker to avoid shared libraries.

`-no-link-chk' and `-link-chk':
By default, immediately after linking an executable, GHC verifies that the pieces that went into it were compiled with compatible flags; a "consistency check". (This is to avoid mysterious failures caused by non-meshing of incompatibly-compiled programs; e.g., if one `.o' file was compiled for a parallel machine and the others weren't.) You may turn off this check with `-no-link-chk'. You can turn it (back) on with `-link-chk' (the default).

4.11 For the compiler's RTS: heap, stack sizes, etc.

The compiler is itself a Haskell program, so it has a tweakable runtime-system (RTS), just like any other Haskell program.

`-H<size>' or `-Rmax-heapsize <size>':
Don't use more than `<size>' bytes for heap space. If more than one of these arguments is given, the largest will be taken. A size of zero can be used to reset the heap size downwards. For example, to run GHC with a heap of 250KB (the default is 6MB), do `-H0 -H250k'.

`-K<size>' or `-Rmax-stksize <size>':
Set the stack space to `<size>' bytes. If you have to set it very high [a megabyte or two, say], the compiler is probably looping, which is a BUG (please report). A size of zero can be used to reset the stack size downwards, as above.

`-Rscale-sizes<factor>':
Multiply the given (or default) heap and stack sizes by `<factor>'. For example, on a DEC Alpha (a 64-bit machine), you might want to double those space sizes; just use `-Rscale-sizes2'. A non-integral factor is OK, too: `-Rscale-sizes1.2'.

`-Rghc-timing':
Reports a useful one-line collection of time and space statistics for a module's compilation.

`-Rgc-stats':
Report garbage-collection statistics. It will create a `<foo>.stat' file, in some obvious place (I hope). Alternatively, if you'd rather the GC stats went straight to standard error, you can "cheat" by using, instead: `-optCrts-Sstderr'.

For all `<size>'s: If the last character of `size' is a K, multiply by 1000; if an M, by 1,000,000; if a G, by 1,000,000,000. Sizes are always in bytes, not words. Good luck on the G's (I think the counter is still only 32-bits [WDP])!

4.12 For Concurrent and Parallel Haskell

For the full story on using GHC for concurrent & parallel Haskell programming, please see Section 11 Concurrent and Parallel Haskell.

4.13 For debugging the compiler

HACKER TERRITORY. HACKER TERRITORY. (You were warned.)

4.13.1 Replacing the program for one or more phases.

You may specify that a different program be used for one of the phases of the compilation system, in place of whatever the driver `ghc' has wired into it. For example, you might want to try a different assembler. The `-pgm<phase-code><program-name>' option to `ghc' will cause it to use <program-name> for phase <phase-code>, where the codes to indicate the phases are:

code  phase                           
------------------------------------
L     literate pre-processor          
P     C pre-processor (if -cpp only)  
C     Haskell compiler                
c     C compiler                      
a     assembler                       
l     linker                          

4.13.2 Forcing options to a particular phase.

The preceding sections describe driver options that are mostly applicable to one particular phase. You may also force a specific option `<option>' to be passed to a particular phase `<phase-code>' by feeding the driver the option `-opt<phase-code><option>'. The codes to indicate the phases are the same as in the previous section.

So, for example, to force an `-Ewurble' option to the assembler, you would tell the driver `-opta-Ewurble' (the dash before the E is required).

Besides getting options to the Haskell compiler with `-optC<blah>', you can get options through to its runtime system with `-optCrts<blah>'.
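As a rough illustration of how such an argument decomposes, here is a small sketch that splits a `-opt...' argument into its destination and the option to be passed on. It is not the driver's code: the type `Routed', its constructors and the function `routeOpt' are hypothetical names, and treating the `Crts' prefix before the plain phase codes is an assumption based on the description above.

```haskell
import Data.List (stripPrefix)

-- Where a forced option ends up (constructor names are hypothetical).
data Routed
  = ToPhase Char String   -- phase code and the option handed to that phase
  | ToCompilerRts String  -- option for the Haskell compiler's own RTS
  deriving (Eq, Show)

-- Decode a driver argument of the form -opt<phase-code><option>.
routeOpt :: String -> Maybe Routed
routeOpt arg =
  case stripPrefix "-opt" arg of
    Just ('C' : 'r' : 't' : 's' : option) -> Just (ToCompilerRts option)
    Just (code : option)
      | code `elem` "LPCcal"              -> Just (ToPhase code option)
    _                                     -> Nothing
```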

So, for example: when I want to use my normal driver but with my profiled compiler binary, I use this script:

#! /bin/sh
exec /local/grasp_tmp3/simonpj/ghc-BUILDS/working-alpha/ghc/driver/ghc \
     -pgmC/local/grasp_tmp3/simonpj/ghc-BUILDS/working-hsc-prof/hsc \
     -optCrts-i0.5 \
     -optCrts-PT \
     "$@"

4.13.3 Dumping out compiler intermediate structures

`-noC':
Don't bother generating C output or an interface file. Usually used in conjunction with one or more of the `-ddump-*' options; for example: `ghc -noC -ddump-simpl Foo.hs'

`-hi':
Do generate an interface file. This would normally be used in conjunction with `-noC', which turns off interface generation; thus: `-noC -hi'.

`-dshow-passes':
Prints a message to stderr as each pass starts. Gives a warm but undoubtedly misleading feeling that GHC is telling you what's happening.

`-ddump-<pass>':
Make a debugging dump after pass `<pass>' (may be common enough to need a short form...). Some of the most useful ones are:
`-ddump-rdr'         reader output (earliest stuff in the compiler)    
`-ddump-rn'          renamer output                                    
`-ddump-tc'          typechecker output                                
`-ddump-deriv'       derived instances                                 
`-ddump-ds'          desugarer output                                  
`-ddump-simpl'       simplifier output (Core-to-Core passes)           
`-ddump-stranal'     strictness analyser output                        
`-ddump-occur-anal'  `occurrence analysis' output                      
`-ddump-spec'        dump specialisation info                          
`-ddump-stg'         output of STG-to-STG passes                       
`-ddump-absC'        unflattened Abstract C                     
`-ddump-flatC'       flattened Abstract C                       
`-ddump-realC'       same as what goes to the C compiler               
`-ddump-asm'         assembly language from the native-code generator  

`-dverbose-simpl' and `-dverbose-stg':
Show the output of the intermediate Core-to-Core and STG-to-STG passes, respectively. (Lots of output!) So: when we're really desperate:
% ghc -noC -O -ddump-simpl -dverbose-simpl -dcore-lint Foo.hs

`-dppr-{user,debug,all}':
Debugging output is in one of several "styles." Take the printing of types, for example. In the "user" style, the compiler's internal ideas about types are presented in Haskell source-level syntax, insofar as possible. In the "debug" style (which is the default for debugging output), the types are printed in the most-often-desired form, with explicit foralls, etc. In the "show all" style, very verbose information about the types (e.g., the Uniques on the individual type variables) is displayed.

`-ddump-raw-asm':
Dump out the assembly-language stuff, before the "mangler" gets it.

`-ddump-rn-trace':
Make the renamer be *real* chatty about what it is up to.

`-dshow-rn-stats':
Print out a summary of what kind of information the renamer had to bring in.

`-dshow-unused-imports':
Have the renamer report which imports do not contribute anything.

4.13.4 How to read Core syntax (from some `-ddump-*' flags)

Let's do this by commenting an example. It's from doing `-ddump-ds' on this code:

skip2 m = m : skip2 (m+2)

Before we jump in, a word about names of things. Within GHC, variables, type constructors, etc., are identified by their "Uniques." These are of the form `letter' plus `number' (both loosely interpreted). The `letter' gives some idea of where the Unique came from; e.g., `_' means "built-in type variable"; `t' means "from the typechecker"; `s' means "from the simplifier"; and so on. The `number' is printed fairly compactly in a `base-62' format, which everyone hates except me (WDP).
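If the `base-62' business is unfamiliar, the following sketch shows the general idea of printing a number compactly with 62 digits. The digit alphabet chosen here (0-9, then a-z, then A-Z) and the name `toBase62' are assumptions made for illustration; the compiler's own printing routine may order its digits differently.

```haskell
-- Render a non-negative number in base 62.
-- The digit order (0-9, a-z, A-Z) is an assumption, not taken from GHC.
toBase62 :: Int -> String
toBase62 n
  | n < 0     = error "toBase62: negative input"
  | n < 62    = [digit n]
  | otherwise = toBase62 (n `div` 62) ++ [digit (n `mod` 62)]
  where
    digit i  = alphabet !! i
    alphabet = ['0' .. '9'] ++ ['a' .. 'z'] ++ ['A' .. 'Z']
```

With this alphabet, 61 prints as `Z' and 62 as `10', so even a number in the millions needs only four characters.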

Remember, everything has a "Unique" and it is usually printed out when debugging, in some form or another. So here we go...

Desugared:
Main.skip2{-r1L6-} :: _forall_ a$_4 =>{{Num a$_4}} -> a$_4 -> [a$_4]

--# `r1L6' is the Unique for Main.skip2;
--# `_4' is the Unique for the type-variable (template) `a'
--# `{{Num a$_4}}' is a dictionary argument

_NI_

--# `_NI_' means "no (pragmatic) information" yet; it will later
--# evolve into the GHC_PRAGMA info that goes into interface files.

Main.skip2{-r1L6-} =
    /\ _4 -> \ d.Num.t4Gt ->
        let {
          {- CoRec -}
          +.t4Hg :: _4 -> _4 -> _4
          _NI_
          +.t4Hg = (+{-r3JH-} _4) d.Num.t4Gt

          fromInt.t4GS :: Int{-2i-} -> _4
          _NI_
          fromInt.t4GS = (fromInt{-r3JX-} _4) d.Num.t4Gt

--# The `+' class method (Unique: r3JH) selects the addition code
--# from a `Num' dictionary (now an explicit lambda'd argument).
--# Because Core is 2nd-order lambda-calculus, type applications
--# and lambdas (/\) are explicit.  So `+' is first applied to a
--# type (`_4'), then to a dictionary, yielding the actual addition
--# function that we will use subsequently...

--# We play the exact same game with the (non-standard) class method
--# `fromInt'.  Unsurprisingly, the type `Int' is wired into the
--# compiler.

          lit.t4Hb :: _4
          _NI_
          lit.t4Hb =
              let {
                ds.d4Qz :: Int{-2i-}
                _NI_
                ds.d4Qz = I#! 2#
              } in  fromInt.t4GS ds.d4Qz

--# `I# 2#' is just the literal Int `2'; it reflects the fact that
--# GHC defines `data Int = I# Int#', where Int# is the primitive
--# unboxed type.  (see relevant info about unboxed types elsewhere...)

--# The `!' after `I#' indicates that this is a *saturated*
--# application of the `I#' data constructor (i.e., not partially
--# applied).

          skip2.t3Ja :: _4 -> [_4]
          _NI_
          skip2.t3Ja =
              \ m.r1H4 ->
                  let { ds.d4QQ :: [_4]
                        _NI_
                        ds.d4QQ =
                    let {
                      ds.d4QY :: _4
                      _NI_
                      ds.d4QY = +.t4Hg m.r1H4 lit.t4Hb
                    } in  skip2.t3Ja ds.d4QY
                  } in
                  :! _4 m.r1H4 ds.d4QQ

          {- end CoRec -}
        } in  skip2.t3Ja

("It's just a simple functional language" is an unregisterised trademark of Peyton Jones Enterprises, plc.)

4.13.5 Command line options in source files

Sometimes it is useful to make the connection between a source file and the command-line options it requires quite tight. For instance, if a (Glasgow) Haskell source file uses `casm's, the C back-end often needs to be told about which header files to include. Rather than maintaining the list of files the source depends on in a `Makefile' (using the `-#include' command-line option), it is possible to do this directly in the source file using the `OPTIONS' pragma:

{-# OPTIONS -#include "foo.h" #-}
module X where

...

`OPTIONS' pragmas are only looked for at the top of your source files, up to the first (non-literate, non-empty) line not containing `OPTIONS'. Multiple `OPTIONS' pragmas are recognised. Note that your command shell never sees the source-file options; they are just included literally in the array of command-line arguments the compiler driver maintains internally, so you'll be desperately disappointed if you try to glob etc. inside `OPTIONS'.

NOTE: the contents of OPTIONS are prepended to the command-line options, so you *do* have the ability to override OPTIONS settings via the command line.
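The scanning and prepending rules can be pictured with a small sketch. This is a simplified model rather than the driver's implementation: the names `pragmaOptions', `optionWords' and `effectiveArgs' are hypothetical, literate sources are ignored, each pragma is assumed to sit on one line, and options are split on whitespace only (so the quotes around a header name stay part of the word).

```haskell
import Data.Char (isSpace)
import Data.List (stripPrefix)

-- Options named by the OPTIONS pragmas at the top of a (non-literate) source text.
-- Scanning stops at the first non-empty line that is not an OPTIONS pragma.
pragmaOptions :: String -> [String]
pragmaOptions source =
  concat (takeWhileJust (map optionWords (filter (not . all isSpace) (lines source))))
  where
    takeWhileJust (Just x : rest) = x : takeWhileJust rest
    takeWhileJust _               = []

-- The words of one OPTIONS pragma line, or Nothing if the line is not one.
optionWords :: String -> Maybe [String]
optionWords line = do
  rest <- stripPrefix "{-# OPTIONS" (dropWhile isSpace line)
  Just (takeWhile (/= "#-}") (words rest))

-- Pragma options are prepended, so command-line options come later in the list.
effectiveArgs :: String -> [String] -> [String]
effectiveArgs source commandLine = pragmaOptions source ++ commandLine
```

Because the command-line options end up after the pragma options in the combined list, a setting given on the command line is seen last, which is what lets it override the same setting from an `OPTIONS' pragma.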

It is not recommended to move all the contents of your Makefiles into your source files, but in some circumstances, the `OPTIONS' pragma is the Right Thing. (If you use `-keep-hc-file-too' and have `OPTIONS' flags in your module, the `OPTIONS' will get put into the generated `.hc' file.)

4.13.6 How to compile mutually recursive modules

Currently, the compiler does not have proper support for dealing with mutually recursive modules:

module A where

import B

newtype A = A Int

f :: B -> A
f (B x) = A x
--------
module B where

import A

data B = B !Int

g :: A -> B
g (A x) = B x

When compiling either module A or B, the compiler will try (in vain) to look for the interface file of the other. So, to get mutually recursive modules off the ground, you need to hand-write an interface file for A or B, so as to break the loop. For the example at hand, the boot interface file for A would look like the following:

_interface_ A 1
_exports_
A A(A) f;
_declarations_
1 newtype A = A PrelBase.Int ;
1 f _:_ B.B -> A.A ;;

To make sure you get the syntax right, tailoring an existing interface file is a Good Idea.

Note: This is all a temporary solution; a version of the compiler that handles mutually recursive modules properly, without the manual construction of interface files, is in the works.

5 Controlling the run-time behaviour of your programs

To make an executable program, the GHC system compiles your code and then links it with a non-trivial runtime system (RTS), which handles storage management, profiling, etc.

You have some control over the behaviour of the RTS, by giving special command-line arguments to your program.

When your Haskell program starts up, its RTS extracts command-line arguments bracketed between `+RTS' and `-RTS' as its own. For example:

% ./a.out -f +RTS -pT -S -RTS -h foo bar

The RTS will snaffle `-pT -S' for itself, and the remaining arguments `-f -h foo bar' will be handed to your program if/when it calls `System.getArgs'.

No `-RTS' option is required if the runtime-system options extend to the end of the command line, as in this example:

% hls -ltr /usr/etc +RTS -H5m

If you absolutely positively want all the rest of the options in a command line to go to the program (and not the RTS), use a `--RTS'.
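The bracketing rules above can be summarised in a few lines of Haskell. This is a sketch of the behaviour as described in this section, not the RTS's own argument-handling code; the name `splitRtsArgs' is made up, and the treatment of a stray `-RTS' outside a bracket (handed to the program) is an assumption.

```haskell
-- Split a command line into (RTS arguments, program arguments).
splitRtsArgs :: [String] -> ([String], [String])
splitRtsArgs = go False
  where
    -- The Bool records whether we are inside a +RTS ... -RTS bracket.
    go _     []             = ([], [])
    go _     ("--RTS" : xs) = ([], xs)      -- all the rest goes to the program
    go False ("+RTS"  : xs) = go True xs    -- open a bracket
    go True  ("-RTS"  : xs) = go False xs   -- close it again
    go True  (x : xs)       = let (rts, prog) = go True xs  in (x : rts, prog)
    go False (x : xs)       = let (rts, prog) = go False xs in (rts, x : prog)
```

Applied to the first example above, `splitRtsArgs ["-f", "+RTS", "-pT", "-S", "-RTS", "-h", "foo", "bar"]' yields `(["-pT", "-S"], ["-f", "-h", "foo", "bar"])'.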

As always, for RTS options that take `<size>'s: If the last character of `size' is a K or k, multiply by 1000; if an M or m, by 1,000,000; if a G or g, by 1,000,000,000. (And any wraparound in the counters is your fault!)
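To make the suffix arithmetic concrete, here is a minimal sketch of a size parser that follows the rule just stated. The function name `parseSize' is hypothetical, and it uses an unbounded `Integer', so it deliberately does not model any wraparound in the real counters.

```haskell
import Data.Char (isDigit, toUpper)

-- Convert a <size> such as "250k" or "5m" into a number of bytes.
-- Multipliers are decimal (1000, not 1024), as the text specifies.
parseSize :: String -> Maybe Integer
parseSize s =
  case span isDigit s of
    ("", _)     -> Nothing                                  -- no digits at all
    (ds, "")    -> Just (read ds)                           -- plain byte count
    (ds, [c])   -> (read ds *) <$> multiplier (toUpper c)   -- one-letter suffix
    _           -> Nothing                                  -- anything else is rejected
  where
    multiplier 'K' = Just 1000
    multiplier 'M' = Just 1000000
    multiplier 'G' = Just 1000000000
    multiplier _   = Nothing
```

So `parseSize "5m"' is `Just 5000000', and `parseSize "64K"' is `Just 64000' rather than 65536.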

Giving a `+RTS -f' option will print out the RTS options actually available in your program (which vary, depending on how you compiled).

5.1 Generally-available RTS options

The most important RTS options are:

`-H<size>':
Set the heap size to <size> bytes [default: 4M].

`-K<size>':
Set the stack size to <size> bytes [default: 64K]. For concurrent/parallel programs, it is the stack size of the main thread; generally speaking, c/p stacks are in heap. Note: if your program seems to be consuming infinite stack space, it is probably in a loop :-) Of course, if stacks are in the heap, make that infinite heap space...

`-s<file>' or `-S<file>':
Write modest (`-s') or verbose (`-S') garbage-collector statistics into file <file>. The default <file> is <program>`.stat'. The <file> `stderr' is treated specially, with the output really being sent to `stderr'. The amount of heap allocation will typically increase as the total heap size is reduced. The reason for this odd behaviour is that updates of promoted-to-old-generation objects may require the extra allocation of a new-generation object to ensure that there are never any pointers from the old generation to the new generation. For some garbage collectors (not including the default one, sadly), you can convert the `-S' output into a residency graph (in PostScript), using the `stat2resid' utility in the GHC distribution (`ghc/utils/stat2resid').

`-N':
Normally, the garbage collector black-holes closures which are being evaluated, as a space-saving measure. That's exactly what you want for ordinary Haskell programs. When signal handlers are present, however, a computation may be abandoned prematurely, leaving black holes behind. If the signal handler shares one of these black-holed closures, disaster can result. Use the `-N' option to prevent black-holing by the garbage collector if you suspect that your signal handlers may share any subexpressions with the top-level computation. Expect your heap usage to increase, since the lifetimes of some closures may be extended.

5.2 RTS options to control the garbage-collector

Besides the `-H' (set heap size) and `-S'/`-s' (GC stats) RTS options, there are several options to give you precise control over garbage collection.

`-M<n>':
Minimum % <n> of heap which must be available for allocation. The default is 3%.

`-A<size>':
Sets a limit on the size of the allocation area for generational garbage collection to <size> bytes (`-A' gives default of 64k). If a negative size is given the size of the allocation is fixed to -<size>. For non-generational collectors, it fixes the minimum heap which must be available after a collection, overriding the `-M<n>' RTS option.

`-G<size>':
Sets the percentage of free space to be promoted before a major collection is invoked to <size>%. The default is 66%. If a negative size is given it fixes the size of major generation threshold to -<size> bytes.

`-F2s':
Forces a program compiled for generational GC to use two-space copying collection. The two-space collector may outperform the generational collector for programs which have a very low heap residency. It can also be used to generate a statistics file from which a basic heap residency profile can be produced (see Section 7.5.2 `stat2resid' -- residency info from GC stats). There will still be a small execution overhead imposed by the generational compilation as the test for old generation updates will still be executed (of course none will actually happen). This overhead is typically less than 1%.

`-j<size>':
Force a major garbage collection every <size> bytes. (Normally used because you're keen on getting major-GC stats, notably heap residency info.)

5.3 RTS options for profiling and Concurrent/Parallel Haskell

The RTS options related to profiling are described in Section 7.3 How to control your profiled program at runtime; and those for concurrent/parallel stuff, in Section 11.2.7 RTS options for Concurrent/Parallel Haskell.

5.4 RTS options for hackers, debuggers, and over-interested souls

These RTS options might be used (a) to avoid a GHC bug, (b) to see "what's really happening", or (c) because you feel like it. Not recommended for everyday use!

`-B':
Sound the bell at the start of each (major) garbage collection. Oddly enough, people really do use this option! Our pal in Durham (England), Paul Callaghan, writes: "Some people here use it for a variety of purposes -- honestly! -- e.g., confirmation that the code/machine is doing something, infinite loop detection, gauging cost of recently added code. Certain people can even tell what stage [the program] is in by the beep pattern. But the major use is for annoying others in the same office..."

`-r<file>':
Produce "ticky-ticky" statistics at the end of the program run. The `<file>' business works just like on the `-S' RTS option (above). "Ticky-ticky" statistics are counts of various program actions (updates, enters, etc.). The program must have been compiled using `-fstg-reduction-counts' (a.k.a. "ticky-ticky profiling"), and, for it to be really useful, linked with suitable system libraries. Not a trivial undertaking: consult the installation guide on how to set things up for easy "ticky-ticky" profiling.

`-T<num>':
An RTS debugging flag; varying quantities of output depending on which bits are set in <num>.

`-Z':
Turn off "update-frame squeezing" at garbage-collection time. (There's no particularly good reason to turn it off.)

5.5 "Hooks" to change RTS behaviour

GHC lets you exercise rudimentary control over the RTS settings for any given program, by compiling in a "hook" that is called by the run-time system. The RTS contains stub definitions for all these hooks, but by writing your own version and linking it on the GHC command line, you can override the defaults.

The function `defaultsHook' lets you change various RTS options. The commonest use for this is to give your program a default heap and/or stack size that is greater than the default. For example, to set `-H8m -K1m':

#include "rtsdefs.h"
void defaultsHook (void) {
   RTSflags.GcFlags.stksSize =  1000002 / sizeof(W_);
   RTSflags.GcFlags.heapSize =  8000002 / sizeof(W_);
}

Don't use powers of two for heap/stack sizes: these are more likely to interact badly with direct-mapped caches. The full set of flags is defined in `ghc/includes/RtsFlags.lh' in the GHC source tree.

You can also change the messages printed when the runtime system "blows up," e.g., on stack overflow. The hooks for these are as follows:

`void ErrorHdrHook (FILE *)':
What's printed out before the message from `error'.

`void OutOfHeapHook (unsigned long, unsigned long)':
The heap-overflow message.

`void StackOverflowHook (long int)':
The stack-overflow message.

`void MallocFailHook (long int)':
The message printed if `malloc' fails.

`void PatErrorHdrHook (FILE *)':
The message printed if a pattern-match fails (the failures that were not handled by the Haskell programmer).

`void PreTraceHook (FILE *)':
What's printed out before a `trace' message.

`void PostTraceHook (FILE *)':
What's printed out after a `trace' message.

For example, here is the "hooks" code used by GHC itself:

#include <stdio.h>
#define W_ unsigned long int
#define I_ long int

void
ErrorHdrHook (FILE *where)
{
    fprintf(where, "\n"); /* no "Fail: " */
}

void
OutOfHeapHook (W_ request_size, W_ heap_size) /* both sizes in bytes */
{
    fprintf(stderr, "GHC's heap exhausted;\n"
        "while trying to allocate %lu bytes in a %lu-byte heap;\n"
        "use the `-H<size>' option to increase the total heap size.\n",
        request_size,
        heap_size);
}

void
StackOverflowHook (I_ stack_size)    /* in bytes */
{
    fprintf(stderr, "GHC stack-space overflow: current size %ld bytes.\n"
        "Use the `-K<size>' option to increase it.\n",
        stack_size);
}

void
PatErrorHdrHook (FILE *where)
{
    fprintf(where, "\n*** Pattern-matching error within GHC!\n\n"
        "This is a compiler bug; please report it to "
        "glasgow-haskell-bugs@dcs.gla.ac.uk.\n\nFail: ");
}

void
PreTraceHook (FILE *where)
{
    fprintf(where, "\n"); /* not "Trace On" */
}

void
PostTraceHook (FILE *where)
{
    fprintf(where, "\n"); /* not "Trace Off" */
}

6 Advice on: sooner, faster, smaller, stingier

Please advise us of other "helpful hints" that should go here!

6.1 Sooner: producing a program more quickly

Don't use `-O' or (especially) `-O2':
By using them, you are telling GHC that you are willing to suffer longer compilation times for better-quality code. GHC is surprisingly zippy for normal compilations without `-O'!

Use more memory:
Within reason, more memory for heap space means less garbage collection for GHC, which means less compilation time. If you use the `-Rgc-stats' option, you'll get a garbage-collector report. (Again, you can use the cheap-and-nasty `-optCrts-Sstderr' option to send the GC stats straight to standard error.) If it says you're using more than 20% of total time in garbage collecting, then more memory would help. You ask for more heap with the `-H<size>' option; e.g.: `ghc -c -O -H16m Foo.hs'. If GHC persists in being a bad memory citizen, please report it as a bug.

Don't use too much memory!
As soon as GHC plus its "fellow citizens" (other processes on your machine) start using more than the real memory on your machine, and the machine starts "thrashing," the party is over. Compile times will be worse than terrible! Use something like the csh-builtin `time' command to get a report on how many page faults you're getting. If you don't know what virtual memory, thrashing, and page faults are, or you don't know the memory configuration of your machine, don't try to be clever about memory use: you'll just make your life a misery (and for other people, too, probably).

Try to use local disks when linking:
Because Haskell objects and libraries tend to be large, it can take many real seconds to slurp the bits to/from an NFS filesystem (say). It would be quite sensible to compile on a fast machine using remotely-mounted disks; then link on a slow machine that had your disks directly mounted.

Don't derive/use `Read' unnecessarily:
It's ugly and slow.

GHC compiles some program constructs slowly:
Deeply-nested list comprehensions seem to be one such; in the past, very large constant tables were bad, too. We'd rather you reported such behaviour as a bug, so that we can try to correct it. The parts of the compiler that seem most prone to wandering off for a long time are the abstract interpreters (strictness and update analysers). You can turn these off individually with `-fno-strictness' and `-fno-update-analysis'. If `-ddump-simpl' produces output after a reasonable time, but `-ddump-stg' doesn't, then it's probably the update analyser slowing you down. If your module has big wads of constant data, GHC may produce a huge basic block that will cause the native-code generator's register allocator to founder. If `-ddump-absC' produces output after a reasonable time, but nothing after that -- it's probably the native-code generator. Bring on `-fvia-C' (not that GCC will be that quick about it, either).

Avoid the consistency-check on linking:
Use `-no-link-chk'; saves effort. This is probably safe in an I-only-compile-things-one-way setup.

Explicit `import' declarations:
Instead of saying `import Foo', say `import Foo (...stuff I want...)'. Truthfully, the reduction in compilation time will be very small. However, judicious use of `import' declarations can make a program easier to understand, so it may be a good idea anyway.

6.2 Faster: producing a program that runs quicker

The key tools to use in making your Haskell program run faster are GHC's profiling facilities, described separately in section 7 Profiling Haskell programs. There is no substitute for finding where your program's time/space is really going, as opposed to where you imagine it is going.

Another point to bear in mind: By far the best way to improve a program's performance dramatically is to use better algorithms. Once profiling has thrown the spotlight on the guilty time-consumer(s), it may be better to re-think your program than to try all the tweaks listed below.

Another extremely efficient way to make your program snappy is to use library code that has been Seriously Tuned By Someone Else. You might be able to write a better quicksort than the one in the HBC library, but it will take you much longer than typing `import QSort'. (Incidentally, it doesn't hurt if the Someone Else is Lennart Augustsson.)

Please report any overly-slow GHC-compiled programs. The current definition of "overly-slow" is "the HBC-compiled version ran faster"...

Optimise, using `-O' or `-O2':
This is the most basic way to make your program go faster. Compilation will be slower, especially with `-O2'. At present, `-O2' is nearly indistinguishable from `-O'.

Compile via C and crank up GCC:
Even with `-O', GHC tries to use a native-code generator, if available. But the native code-generator is designed to be quick, not mind-bogglingly clever. Better to let GCC have a go, as it tries much harder on register allocation, etc. So, when we want very fast code, we use: `-O -fvia-C -O2-for-C'.

Overloaded functions are not your friend:
Haskell's overloading (using type classes) is elegant, neat, etc., etc., but it is death to performance if left to linger in an inner loop. How can you squash it?

Give explicit type signatures:
Signatures are the basic trick; putting them on exported, top-level functions is good software-engineering practice, anyway. The automatic specialisation of overloaded functions should take care of overloaded local and/or unexported functions.

Use `SPECIALIZE' pragmas:
(UK spelling also accepted.) For key overloaded functions, you can create extra versions (NB: more code space) specialised to particular types. Thus, if you have an overloaded function:
hammeredLookup :: Ord key => [(key, value)] -> key -> value
If it is heavily used on lists with `Widget' keys, you could specialise it as follows:
{-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
To get very fancy, you can also specify a named function to use for the specialised value, by adding `= blah', as in:
{-# SPECIALIZE hammeredLookup :: ...as before... = blah #-}
It's Your Responsibility to make sure that `blah' really behaves as a specialised version of `hammeredLookup'!!! An example in which the `= blah' form will Win Big:
toDouble :: Real a => a -> Double
toDouble = fromRational . toRational

{-# SPECIALIZE toDouble :: Int -> Double = i2d #-}
i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
The `i2d' function is virtually one machine instruction; the default conversion -- via an intermediate `Rational' -- is obscenely expensive by comparison. By using the US spelling, your `SPECIALIZE' pragma will work with HBC, too. Note that HBC doesn't support the `= blah' form. A `SPECIALIZE' pragma for a function can be put anywhere its type signature could be put.
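For readers who want something they can compile, here is a complete (if tiny) module fragment using the plain form of the pragma. The body given for `hammeredLookup' and the definition of `Widget' are invented for illustration; only the signature and the pragma come from the text above. The `= blah' form is not shown.

```haskell
-- A stand-in key type; the real Widget is whatever your program uses.
newtype Widget = Widget Int
  deriving (Eq, Ord)

-- An overloaded lookup (illustrative body: first match in an association list).
hammeredLookup :: Ord key => [(key, value)] -> key -> value
hammeredLookup table k =
  case [v | (k', v) <- table, k' == k] of
    (v : _) -> v
    []      -> error "hammeredLookup: key not found"

-- Ask for an extra copy of the function specialised to Widget keys.
{-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
```

The pragma changes nothing about what the function computes; callers at type `Widget' may simply end up using the specialised copy instead of passing an `Ord' dictionary around.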

Use `SPECIALIZE instance' pragmas:
Same idea, except for instance declarations. For example:
instance (Eq a) => Eq (Foo a) where { ... usual stuff ... }

{-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-}
Compatible with HBC, by the way. See also: overlapping instances, in Section 8.6 "HBC-ish" extensions implemented by GHC. They are to `SPECIALIZE instance' pragmas what `= blah' hacks are to `SPECIALIZE' (value) pragmas...

"How do I know what's happening with specialisations?":
The `-fshow-specialisations' option will show the specialisations that actually take place. The `-fshow-import-specs' option will show the specialisations that GHC wished were available, but were not. You can add the relevant pragmas to your code if you wish. You're a bit stuck if the desired specialisation is of a Prelude function. If it's Really Important, you can just snap a copy of the Prelude code, rename it, and then SPECIALIZE that to your heart's content.

"But how do I know where overloading is creeping in?":
A low-tech way: grep (search) your interface files for overloaded type signatures; e.g.:
% egrep '^[a-z].*::.*=>' *.hi

Strict functions are your dear friends:
and, among other things, lazy pattern-matching is your enemy. (If you don't know what a "strict function" is, please consult a functional-programming textbook. A sentence or two of explanation here probably would not do much good.) Consider these two code fragments:
f (Wibble x y) =  ... -- strict

f arg = let { (Wibble x y) = arg } in ... -- lazy
The former will result in far better code. A less contrived example shows the use of `cases' instead of `lets' to get stricter code (a good thing):
f (Wibble x y)  -- beautiful but slow
  = let
        (a1, b1, c1) = unpackFoo x
        (a2, b2, c2) = unpackFoo y
    in ...

f (Wibble x y)  -- ugly, and proud of it
  = case (unpackFoo x) of { (a1, b1, c1) ->
    case (unpackFoo y) of { (a2, b2, c2) ->
    ...
    }}
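The difference between the strict and lazy fragments is observable, not just a matter of code quality. The following sketch makes it visible by applying each version to an undefined argument; the `Wibble' type, the function bodies and the helper `forcesArgument' are all invented for the demonstration.

```haskell
import Control.Exception (SomeException, evaluate, try)

data Wibble = Wibble Int Int

-- Matching in the argument position forces the argument straight away.
strictF :: Wibble -> Bool -> Int
strictF (Wibble x y) useIt = if useIt then x + y else 0

-- A pattern bound by `let' is only forced if x or y is actually needed.
lazyF :: Wibble -> Bool -> Int
lazyF arg useIt = let { (Wibble x y) = arg } in if useIt then x + y else 0

-- True if calling the function on an undefined Wibble raises an error,
-- even though the fields are never used (the flag is False).
forcesArgument :: (Wibble -> Bool -> Int) -> IO Bool
forcesArgument f = do
  r <- try (evaluate (f undefined False)) :: IO (Either SomeException Int)
  return (either (const True) (const False) r)
```

Here `forcesArgument strictF' reports that the argument was forced, while `forcesArgument lazyF' reports that it was not. Knowing that an argument is always evaluated is what lets the compiler avoid building a suspension for it.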

GHC loves single-constructor data-types:
It's all the better if a function is strict in a single-constructor type (a type with only one data-constructor; for example, tuples are single-constructor types).

"How do I find out a function's strictness?":
Don't guess -- look it up. Look for your function in the interface file, then for the third field in the pragma; it should say `_S_ <string>'. The `<string>' gives the strictness of the function's arguments. `L' is lazy (bad), `S' and `E' are strict (good), `P' is "primitive" (good), `U(...)' is strict and "unpackable" (very good), and `A' is absent (very good). For an "unpackable" `U(...)' argument, the info inside tells the strictness of its components. So, if the argument is a pair, and it says `U(AU(LSS))', that means "the first component of the pair isn't used; the second component is itself unpackable, with three components (lazy in the first, strict in the second & third)." If the function isn't exported, just compile with the extra flag `-ddump-simpl'; next to the signature for any binder, it will print the self-same pragmatic information as would be put in an interface file. (Besides, Core syntax is fun to look at!)
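If you find the strictness strings hard to read by eye, a few lines of Haskell will turn one into a tree. This is a sketch based only on the letters listed above; the type `Demand', its constructor names and the function `parseDemands' are hypothetical and are not part of GHC.

```haskell
-- One constructor per letter described in the text (names are made up).
data Demand
  = Lazy                 -- 'L'
  | StrictS              -- 'S'
  | StrictE              -- 'E'
  | Primitive            -- 'P'
  | Absent               -- 'A'
  | Unpackable [Demand]  -- 'U(...)', with the demands on the components
  deriving (Eq, Show)

-- Parse a whole strictness string; Nothing if it is malformed.
parseDemands :: String -> Maybe [Demand]
parseDemands s =
  case go s of
    Just (ds, "") -> Just ds
    _             -> Nothing
  where
    -- Parse as many demands as possible; return them with the leftover input.
    go :: String -> Maybe ([Demand], String)
    go ('U' : '(' : rest) = do
      (inner, rest') <- go rest
      case rest' of
        ')' : rest'' -> do
          (more, end) <- go rest''
          Just (Unpackable inner : more, end)
        _ -> Nothing
    go (c : rest)
      | Just d <- simple c = do
          (more, end) <- go rest
          Just (d : more, end)
    go rest = Just ([], rest)

    simple 'L' = Just Lazy
    simple 'S' = Just StrictS
    simple 'E' = Just StrictE
    simple 'P' = Just Primitive
    simple 'A' = Just Absent
    simple _   = Nothing
```

For the pair example in the text, `parseDemands "U(AU(LSS))"' gives `Just [Unpackable [Absent, Unpackable [Lazy, StrictS, StrictS]]]', which matches the reading spelled out above.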

Force key functions to be `INLINE'd (esp. monads):
GHC (with `-O', as always) tries to inline (or "unfold") functions/values that are "small enough," thus avoiding the call overhead and possibly exposing other more-wonderful optimisations. You will probably see these unfoldings (in Core syntax) in your interface files. Normally, if GHC decides a function is "too expensive" to inline, it will not do so, nor will it export that unfolding for other modules to use. The sledgehammer you can bring to bear is the `INLINE' pragma, used thusly:
key_function :: Int -> String -> (Bool, Double) 

#ifdef __GLASGOW_HASKELL__
{-# INLINE key_function #-}
#endif
(You don't need to do the C pre-processor carry-on unless you're going to stick the code through HBC -- it doesn't like `INLINE' pragmas.) The major effect of an `INLINE' pragma is to declare a function's "cost" to be very low. The normal unfolding machinery will then be very keen to inline it. An `INLINE' pragma for a function can be put anywhere its type signature could be put. `INLINE' pragmas are a particularly good idea for the `then'/`return' (or `bind'/`unit') functions in a monad. For example, in GHC's own `UniqueSupply' monad code, we have:
#ifdef __GLASGOW_HASKELL__
{-# INLINE thenUs #-}
{-# INLINE returnUs #-}
#endif
GHC reserves the right to disallow any unfolding, even if you explicitly asked for one. That's because a function's body may become unexportable, because it mentions a non-exported value, to which any importing module would have no access. If you want to see why candidate unfoldings are rejected, use the `-freport-disallowed-unfoldings' option.
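As a rough illustration of the monad case, here is a cut-down unique-supply monad with its `then'/`return' functions marked `INLINE'. It is a hypothetical sketch, not GHC's real `UniqueSupply' code: the representation (a plain `Int' counter) and the names `UniqSM', `getUnique' and `twoUniques' are assumptions made up for the example.

```haskell
-- Hypothetical, simplified unique-supply monad (NOT GHC's real code):
-- a computation takes the next free unique and returns a result
-- together with the updated supply.
type UniqSM a = Int -> (a, Int)

{-# INLINE thenUs #-}
thenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b
thenUs m k us = case m us of
                  (r, us') -> k r us'

{-# INLINE returnUs #-}
returnUs :: a -> UniqSM a
returnUs r us = (r, us)

-- Hand out the next unique and bump the supply.
getUnique :: UniqSM Int
getUnique us = (us, us + 1)

-- Two uniques in sequence; every step goes through thenUs/returnUs,
-- which is why it pays to have them unfolded at each use.
twoUniques :: UniqSM (Int, Int)
twoUniques = getUnique `thenUs` \ a ->
             getUnique `thenUs` \ b ->
             returnUs (a, b)
```

Applying `twoUniques' to an initial supply of 0 hands back the pair of uniques and the next free one.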

Explicit `export' list:
If you do not have an explicit export list in a module, GHC must assume that everything in that module will be exported. This has various pessimising effects. For example, if a bit of code is actually unused (perhaps because of unfolding effects), GHC will not be able to throw it away, because it is exported and some other module may be relying on its existence. GHC can be quite a bit more aggressive with pieces of code if it knows they are not exported.

Look at the Core syntax!
(The form in which GHC manipulates your code.) Just run your compilation with `-ddump-simpl' (don't forget the `-O'). If profiling has pointed the finger at particular functions, look at their Core code. `lets' are bad, `cases' are good, dictionaries (`d.<Class>.<Unique>') [or anything overloading-ish] are bad, nested lambdas are bad, explicit data constructors are good, primitive operations (e.g., `eqInt#') are good, ...

Use unboxed types (a GHC extension):
When you are really desperate for speed, and you want to get right down to the "raw bits." Please see section 8.1 Unboxed types for some information about using unboxed types.

Use `_ccall_s' (a GHC extension) to plug into fast libraries:
This may take real work, but... There exist piles of massively-tuned library code, and the best thing is not to compete with it, but link with it. Section 8.5 Calling C directly from Haskell says a little about how to use C calls.

Don't use `Float's:
We don't provide specialisations of Prelude functions for `Float' (but we do for `Double'). If you end up executing overloaded code, you will lose on performance, perhaps badly. `Floats' (probably 32-bits) are almost always a bad idea, anyway, unless you Really Know What You Are Doing. Use Doubles. There's rarely a speed disadvantage -- modern machines will use the same floating-point unit for both. With `Doubles', you are much less likely to hang yourself with numerical errors.
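To see the numerical-error point for yourself, a small experiment along the following lines may help. It is only a sketch: the function names are made up, and the size of the error you observe will depend on how many terms you add.

```haskell
import Data.List (foldl')

-- Add 0.1 to itself n times, once in each precision.
sumFloat :: Int -> Float
sumFloat n = foldl' (+) 0 (replicate n 0.1)

sumDouble :: Int -> Double
sumDouble n = foldl' (+) 0 (replicate n 0.1)

-- Absolute error of each running total against the exact answer n/10.
errFloat, errDouble :: Int -> Double
errFloat  n = abs (realToFrac (sumFloat n) - fromIntegral n / 10)
errDouble n = abs (sumDouble n - fromIntegral n / 10)
```

Comparing `errFloat 100000' with `errDouble 100000' shows the `Float' total drifting much further from the exact answer than the `Double' one.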

Use a bigger heap!
If your program's GC stats (`-S' RTS option) indicate that it's doing lots of garbage-collection (say, more than 20% of execution time), more memory might help -- with the `-H<size>' RTS option.

Use a smaller heap!
Some programs with a very small heap residency (toy programs, usually) actually benefit from running the heap size way down. The `-H<size>' RTS option, as above.

Use a smaller "allocation area":
If you can get the garbage-collector's youngest generation to fit entirely in your machine's cache, it may make quite a difference. The effect is very machine dependent. But, for example, a `+RTS -A128k' option on one of our DEC Alphas was worth an immediate 5% performance boost.

6.3 Smaller: producing a program that is smaller

Decrease the "go-for-it" threshold for unfolding smallish expressions. Give a `-funfolding-use-threshold0' option for the extreme case. ("Only unfoldings with zero cost should proceed.")

(Note: I have not been too successful at producing code smaller than that which comes out with `-O'. WDP 94/12)

Avoid `Read'.

Use `strip' on your executables.

6.4 Stingier: producing a program that gobbles less heap space

"I think I have a space leak..." Re-run your program with `+RTS -Sstderr', and remove all doubt! (You'll see the heap usage get bigger and bigger...) [Hmmm... this might be even easier with the `-F2s' RTS option; so... `./a.out +RTS -Sstderr -F2s'...]

Once again, the profiling facilities (section 7 Profiling Haskell programs) are the basic tool for demystifying the space behaviour of your program.

Strict functions are good for space usage, as they are for time, as discussed in the previous section. Strict functions get right down to business, rather than filling up the heap with closures (the system's notes to itself about how to evaluate something, should it eventually be required).
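A minimal sketch of the difference, using made-up function names. The first version leaves it to the compiler to notice that the accumulator is needed; the second says so explicitly with `seq'. (With `-O' the strictness analyser may well make the two behave alike; the explicit version does not rely on that.)

```haskell
-- Lazy accumulator: without help from the strictness analyser, each
-- step may allocate a closure for (acc + x) instead of adding right away.
sumLazy :: [Int] -> Int
sumLazy = go 0
  where
    go acc []     = acc
    go acc (x:xs) = go (acc + x) xs

-- Strict accumulator: `seq` forces the running total before recursing,
-- so no chain of pending additions builds up in the heap.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go acc []     = acc
    go acc (x:xs) = let acc' = acc + x
                    in  acc' `seq` go acc' xs
```

Both functions return the same answer; the difference, if any, shows up in the `+RTS -Sstderr' figures.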

If you have a true blue "space leak" (your program keeps gobbling up memory and never "lets go"), then 7 times out of 10 the problem is related to a CAF (constant applicative form). Real people call them "top-level values that aren't functions." Thus, for example:

x = (1 :: Int)
f y = x
ones = [ 1, (1 :: Float) .. ]

`x' and `ones' are CAFs; `f' is not.

The GHC garbage collectors are not clever about CAFs. The part of the heap reachable from a CAF is never collected. In the case of `ones' in the example above, it's disastrous. For this reason, the GHC "simplifier" tries hard to avoid creating CAFs, but it cannot subvert the will of a determined CAF-writing programmer (as in the case above).

7 Profiling Haskell programs

Glasgow Haskell comes with a time and space profiling system. Its purpose is to help you improve your understanding of your program's execution behaviour, so you can improve it.

Any comments, suggestions and/or improvements you have are welcome. Recommended "profiling tricks" would be especially cool!

7.1 How to profile a Haskell program

The GHC approach to profiling is very simple: annotate the expressions you consider "interesting" with cost centre labels (strings); so, for example, you might have:

f x y
  = let
        output1 = _scc_ "Pass1" ( pass1 x )
        output2 = _scc_ "Pass2" ( pass2 output1 y )
        output3 = _scc_ "Pass3" ( pass3 (output2 `zip` [1 .. ]) )
    in concat output3

The costs of evaluating the expressions bound to `output1', `output2' and `output3' will be attributed to the "cost centres" `Pass1', `Pass2' and `Pass3', respectively.

The costs of evaluating other expressions, e.g., `concat output3', will be inherited by the scope which referenced the function `f'.

You can put in cost-centres via `_scc_' constructs by hand, as in the example above. Perfectly cool. That's probably what you would do if your program divided into obvious "passes" or "phases", or whatever.

If your program is large or you have no clue what might be gobbling all the time, you can get GHC to mark all functions with `_scc_' constructs, automagically. Add an `-auto' compilation flag to the usual `-prof' option.

Once you start homing in on the Guilty Suspects, you may well switch from automagically-inserted cost-centres to a few well-chosen ones of your own.

To use profiling, you must compile and run with special options. (We usually forget the "run" magic! -- Do as we say, not as we do...) Details follow.

If you're serious about this profiling game, you should probably read one or more of the Sansom/Peyton Jones papers about the GHC profiling system. Just visit the Glasgow FP Web page...

7.2 Compiling programs for profiling

To make use of the cost centre profiling system all modules must be compiled and linked with the `-prof' option. Any `_scc_' constructs you've put in your source will spring to life.

Without a `-prof' option, your `_scc_'s are ignored; so you can compile `_scc_'-laden code without changing it.

There are a few other profiling-related compilation options. Use them in addition to `-prof'. These do not have to be used consistently for all modules in a program.

`-auto':
GHC will automatically add `_scc_' constructs for all top-level, exported functions.

`-auto-all':
All top-level functions, exported or not, will be automatically `_scc_''d.

`-caf-all':
The costs of all CAFs in a module are usually attributed to one "big" CAF cost-centre. With this option, all CAFs get their own cost-centre. An "if all else fails" option...

`-ignore-scc':
Ignore any `_scc_' constructs, so a module which already has `_scc_'s can be compiled for profiling with the annotations ignored.

`-G<group>':
Specifies the <group> to be attached to all the cost-centres declared in the module. If no group is specified it defaults to the module name.

In addition to the `-prof' option, your system might be set up to enable you to compile and link with the `-prof-details' option instead. This enables additional detailed counts to be reported with the `-P' RTS option.

7.3 How to control your profiled program at runtime

It isn't enough to compile your program for profiling with `-prof'!

When you run your profiled program, you must tell the runtime system (RTS) what you want to profile (e.g., time and/or space), and how you wish the collected data to be reported. You also may wish to set the sampling interval used in time profiling.

Executive summary: `./a.out +RTS -pT' produces a time profile in `a.out.prof'; `./a.out +RTS -hC' produces space-profiling info which can be mangled by `hp2ps' and viewed with `ghostview' (or equivalent).

Profiling runtime flags are passed to your program between the usual `+RTS' and `-RTS' options.

`-p<sort>' or `-P<sort>':
The `-p?' option produces a standard time profile report. It is written into the file <program>`.prof'. The `-P?' option produces a more detailed report containing the actual time and allocation data as well. (Not used much.) The `-P?' option also produces serial time-profiling information, in the file <program>`.time'. This can be converted into a (somewhat unsatisfactory) PostScript graph using `hp2ps' (see Section 7.5.1 `hp2ps'--heap profile to PostScript). (ToDo: is `-F2s' needed for the serial time profile?) The <sort> indicates how the cost centres are to be sorted in the report. Valid <sort> options are:

`T':
by time, largest first (the default);

`A':
by bytes allocated, largest first;

`C':
alphabetically by group, module and cost centre.

`-i<secs>':
Set the profiling (sampling) interval to `<secs>' seconds (the default is 1 second).

`-h<break-down>':
Produce a detailed space profile of the heap occupied by live closures. The profile is written to the file <program>`.hp' from which a PostScript graph can be produced using `hp2ps' (see Section 7.5.1 `hp2ps'--heap profile to PostScript). The heap space profile may be broken down by different criteria:

`-hC':
cost centre which produced the closure (the default).

`-hM':
cost centre module which produced the closure.

`-hG':
cost centre group which produced the closure.

`-hD':
closure description -- a string describing the closure.

`-hY':
closure type -- a string describing the closure's type.
By default all live closures in the heap are profiled, but particular closures of interest can be selected (see below).

Heap (space) profiling uses hash tables. If these tables should fill, the run will abort. The `-z<tbl><size>' option is used to increase the size of the relevant hash table (`C', `M', `G', `D' or `Y', defined as for <break-down> above). The actual size used is the next largest power of 2.
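The rounding of the table size can be pictured as follows. This is a sketch of the arithmetic only, not the RTS code; it assumes that a request which is already a power of 2 is left unchanged, which the text above does not actually say. The name `tableSize' is made up.

```haskell
-- Smallest power of 2 that is at least the requested size.
-- ASSUMPTION: an exact power of 2 is kept as it is.
tableSize :: Int -> Int
tableSize n = head [ p | p <- iterate (* 2) 1, p >= n ]
```

Under that assumption, asking for a table of 1000 entries gets you one of 1024.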

The heap profile can be restricted to particular closures of interest. The closures of interest can be selected by the attached cost centre (module:label, module and group), closure category (description, type, and kind) and closure age using the following options:

`-c{<mod>:<lab>,<mod>:<lab>...}':
Selects individual cost centre(s).

`-m{<mod>,<mod>...}':
Selects all cost centres from the module(s) specified.

`-g{<grp>,<grp>...}':
Selects all cost centres from the group(s) specified.

`-d{<des>,<des>...}':
Selects closures which have one of the specified descriptions.

`-y{<typ>,<typ>...}':
Selects closures which have one of the specified type descriptions.

`-k{<knd>,<knd>...}':
Selects closures which are of one of the specified closure kinds. Valid closure kinds are `CON' (constructor), `FN' (manifest function), `PAP' (partial application), `BH' (black hole) and `THK' (thunk).

The space occupied by a closure will be reported in the heap profile if the closure satisfies the following logical expression:

([-c] or [-m] or [-g]) and ([-d] or [-y] or [-k]) 

where a particular option is true if the closure (or its attached cost centre) is selected by the option (or the option is not specified).

7.4 What's in a profiling report?

When you run your profiled program with the `-p' RTS option, you get the following information about your "cost centres":

`COST CENTRE':
The cost-centre's name.

`MODULE':
The module associated with the cost-centre; important mostly if you have identically-named cost-centres in different modules.

`scc':
How many times this cost-centre was entered; think of it as "I got to the `_scc_' construct this many times..."

`%time':
What part of the time was spent in this cost-centre (see also "ticks," below).

`%alloc':
What part of the memory allocation was done in this cost-centre (see also "bytes," below).

`inner':
How many times this cost-centre "passed control" to an inner cost-centre; for example, `scc=4' plus `subscc=8' means "This `_scc_' was entered four times, but went out to other `_scc_s' eight times."

`cafs':
How many CAFs this cost centre evaluated.

`dicts':
How many dictionaries this cost centre evaluated.

In addition you can use the `-P' RTS option to get the following additional information:

`ticks':
The raw number of time "ticks" which were attributed to this cost-centre; from this, we get the `%time' figure mentioned above.

`bytes':
Number of bytes allocated in the heap while in this cost-centre; again, this is the raw number from which we get the `%alloc' figure mentioned above.

Finally if you built your program with `-prof-details' the `-P' RTS option will also produce the following information:

`closures':
How many heap objects were allocated; these objects may be of varying size. If you divide the number of bytes (mentioned above) by this number of "closures", then you will get the average object size. (Not too interesting, but still...)

`thunks':
How many times we entered (evaluated) a thunk -- an unevaluated object in the heap -- while we were in this cost-centre.

`funcs':
How many times we entered (evaluated) a function while we were in this cost-centre. (In Haskell, functions are first-class values and may be passed as arguments, returned as results, evaluated, and generally manipulated just like data values.)

`PAPs':
How many times we entered (evaluated) a partial application (PAP), i.e., a function applied to fewer arguments than it needs. For example, `Int' addition applied to one argument would be a PAP. A PAP is really just a particular form for a function.
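The derived figures in the report are simple ratios of the raw counts. The sketch below shows the arithmetic with made-up function names; it assumes that `%time' and `%alloc' are percentages of the program-wide totals of `ticks' and `bytes', which is how the descriptions above read.

```haskell
-- Share of a program-wide total, as a percentage
-- (ASSUMED to be how `%time' and `%alloc' are obtained from the raw
-- `ticks' and `bytes' counts).
percentOf :: Int -> Int -> Double
percentOf part total = 100 * fromIntegral part / fromIntegral total

-- Average object size: `bytes' divided by `closures'.
averageObjectSize :: Int -> Int -> Double
averageObjectSize bytes closures = fromIntegral bytes / fromIntegral closures
```

For instance, a cost centre charged with 25 of a program's 200 ticks would show a `%time' of 12.5 under this reading.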

7.5 Producing graphical heap profiles

Utility programs which produce graphical profiles.

7.5.1 `hp2ps'--heap profile to PostScript

USAGE: `hp2ps' [flags] [<file>[`.hp']]

The program `hp2ps' converts a heap profile as produced by the `-h<break-down>' runtime option into a PostScript graph of the heap profile. By convention, the file to be processed by `hp2ps' has a `.hp' extension. The PostScript output is written to <file>`.ps'. If <file> is omitted entirely, then the program behaves as a filter.

`hp2ps' is distributed in `ghc/utils/hp2ps'. It was originally developed by Dave Wakeling as part of the HBC/LML heap profiler.

The flags are:

`-d'
In order to make graphs more readable, `hp2ps' sorts the shaded bands for each identifier. The default sort ordering is for the bands with the largest area to be stacked on top of the smaller ones. The `-d' option causes rougher bands (those representing series of values with the largest standard deviations) to be stacked on top of smoother ones.

`-b'
Normally, `hp2ps' puts the title of the graph in a small box at the top of the page. However, if the JOB string is too long to fit in a small box (more than 35 characters), then `hp2ps' will choose to use a big box instead. The `-b' option forces `hp2ps' to use a big box.

`-e<float>[in|mm|pt]'
Generate encapsulated PostScript suitable for inclusion in LaTeX documents. Usually, the PostScript graph is drawn in landscape mode in an area 9 inches wide by 6 inches high, and `hp2ps' arranges for this area to be approximately centred on a sheet of A4 paper. This format is convenient for studying the graph in detail, but it is unsuitable for inclusion in LaTeX documents. The `-e' option causes the graph to be drawn in portrait mode, with float specifying the width in inches, millimetres or points (the default). The resulting PostScript file conforms to the Encapsulated PostScript (EPS) convention, and it can be included in a LaTeX document using Rokicki's dvi-to-PostScript converter `dvips'.

`-g'
Create output suitable for the `gs' PostScript previewer (or similar). In this case the graph is printed in portrait mode without scaling. The output is unsuitable for a laser printer.

`-l'
Normally a profile is limited to 20 bands with additional identifiers being grouped into an `OTHER' band. The `-l' flag removes this 20-band limit, producing as many bands as necessary. No key is produced as it won't fit! It is useful for creation time profiles with many bands.

`-m<int>'
Normally a profile is limited to 20 bands with additional identifiers being grouped into an `OTHER' band. The `-m' flag specifies an alternative band limit (the maximum is 20). `-m0' requests the band limit to be removed. As many bands as necessary are produced. However no key is produced as it won't fit! It is useful for displaying creation time profiles with many bands.

`-p'
Use previous parameters. By default, the PostScript graph is automatically scaled both horizontally and vertically so that it fills the page. However, when preparing a series of graphs for use in a presentation, it is often useful to draw a new graph using the same scale, shading and ordering as a previous one. The `-p' flag causes the graph to be drawn using the parameters determined by a previous run of `hp2ps' on file. These are extracted from file`.aux'.

`-s'
Use a small box for the title.

`-t<float>'
Normally trace elements which sum to a total of less than 1% of the profile are removed from the profile. The `-t' option allows this percentage to be modified (maximum 5%). `-t0' requests no trace elements to be removed from the profile, ensuring that all the data will be displayed.

`-?'
Print out usage information.

7.5.2 `stat2resid' -- residency info from GC stats

USAGE: `stat2resid' [<file>[`.stat'] [<outfile>]]

The program `stat2resid' converts a detailed garbage collection statistics file produced by the `-S' runtime option into a PostScript heap residency graph. The garbage collection statistics file can be produced without compiling your program for profiling.

By convention, the file to be processed by `stat2resid' has a `.stat' extension. If the <outfile> is not specified the PostScript will be written to <file>`.resid.ps'. If <file> is omitted entirely, then the program behaves as a filter.

The plot cannot be produced from the statistics file for a generational collector, though a suitable stats file can be produced using the `-F2s' runtime option when the program has been compiled for generational garbage collection (the default).

`stat2resid' is distributed in `ghc/utils/stat2resid'.

8 Glasgow extensions to Haskell

As with all known Haskell systems, GHC implements some extensions to the language. To use them, you'll need to give a `-fglasgow-exts' option.

Virtually all of the Glasgow extensions serve to give you access to the underlying facilities with which we implement Haskell. Thus, you can get at the Raw Iron, if you are willing to write some non-standard code at a more primitive level. You need not be "stuck" on performance because of the implementation costs of Haskell's "high-level" features -- you can always code "under" them. In an extreme case, you can write all your time-critical code in C, and then just glue it together with Haskell!

Executive summary of our extensions:

Unboxed types and primitive operations:
You can get right down to the raw machine types and operations; included in this are "primitive arrays" (direct access to Big Wads of Bytes). Please see Section 8.1 Unboxed types and following.

Calling out to C:
Just what it sounds like. We provide lots of rope that you can dangle around your neck. Please see Section 8.5 Calling C directly from Haskell.

"HBC-ish" extensions:
Extensions implemented because people said, "HBC does Y. Could you teach GHC to do the same?" Please see Section 8.6 "HBC-ish" extensions implemented by GHC for a quick list.

Before you get too carried away working at the lowest level (e.g., sloshing `MutableByteArray#'s around your program), you may wish to check if there are system libraries that provide a "Haskellised veneer" over the features you want. See Section 10 System libraries.

Section 9 The GHC prelude and libraries is the definitive guide for many of the low-level facilities in GHC.

8.1 Unboxed types

These types correspond to the "raw machine" types you would use in C: `Int#' (long int), `Double#' (double), `Addr#' (void *), etc. The primitive operations (PrimOps) on these types are what you might expect; e.g., `(+#)' is addition on `Int#'s, and is the machine-addition that we all know and love -- usually one instruction.

A numerically-intensive program using unboxed types can go a lot faster than its "standard" counterpart -- we saw a threefold speedup on one example.
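To give the flavour, here is a tiny function written directly on `Int#'s, with a boxed wrapper around it. Treat it as a sketch under assumptions: it is written for a recent GHC, where the primitives are imported from `GHC.Exts' and the `MagicHash' language extension admits the `#' suffix, rather than for the `-fglasgow-exts' and module `GHC' arrangement described in this guide. The function names are made up.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#, (+#), (*#))

-- Works purely on raw machine integers: no boxes, no thunks.
sumSquares# :: Int# -> Int# -> Int#
sumSquares# x y = (x *# x) +# (y *# y)

-- Boxed wrapper: unpack the arguments, do the work unboxed,
-- and box the result once at the end.
sumSquares :: Int -> Int -> Int
sumSquares (I# x) (I# y) = I# (sumSquares# x y)
```

The wrapper is the only place where boxing happens; everything inside `sumSquares#' is plain machine arithmetic.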

Please see Section 9.2 The module `GHC': really primitive stuff for the details of unboxed types and the operations on them.

8.2 Primitive state-transformer monad

This monad underlies our implementation of arrays, mutable and immutable, and our implementation of I/O, including "C calls".

You probably won't use the monad directly, but you might use all those other things!

More information on the state-related types can be found in Section 9 The GHC prelude and libraries.

8.3 Primitive arrays, mutable and otherwise

GHC knows about quite a few flavours of Large Swathes of Bytes.

First, GHC distinguishes between primitive arrays of (boxed) Haskell objects (type `Array# obj') and primitive arrays of bytes (type `ByteArray#').

Second, it distinguishes between...

Immutable:
Arrays that do not change (as with "standard" Haskell arrays); you can only read from them. Obviously, they do not need the care and attention of the state-transformer monad.

Mutable:
Arrays that may be changed or "mutated." All the operations on them live within the state-transformer monad and the updates happen in-place.

"Static" (in C land):
A C routine may pass an `Addr#' pointer back into Haskell land. There are then primitive operations with which you may merrily grab values over in C land, by indexing off the "static" pointer.

"Stable" pointers:
If, for some reason, you wish to hand a Haskell pointer (i.e., not an unboxed value) to a C routine, you first make the pointer "stable," so that the garbage collector won't forget that it exists. That is, GHC provides a safe way to pass Haskell pointers to C. Please see Section 8.5.3 Subverting automatic unboxing with "stable pointers" for more details.

"Foreign objects":
A "foreign object" is a safe way to pass an external object (a C allocated pointer, say) to Haskell and have Haskell do the Right Thing when it no longer references the object. So, for example, C could pass a large bitmap over to Haskell and say "please free this memory when you're done with it." Please see Section 8.5.4 Pointing outside the Haskell heap for more details.

The libraries section (Section 9 The GHC prelude and libraries) gives more details on all these "primitive array" types and the operations on them.
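The mutable/immutable distinction can be seen in a small example. This one is a sketch only: it is written against the `Data.Array.ST' and `Data.Array.Unboxed' library interfaces of later GHC distributions (an assumption; they are a veneer over primitives of the kind described here, not the primitives themselves), and the function name `histogram' is made up.

```haskell
import Control.Monad (forM_)
import Data.Array.ST (newArray, readArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, elems)

-- Count occurrences of each value in the range 0 .. n-1.
-- The array is updated in place inside the state-transformer monad,
-- then frozen and handed back as an ordinary immutable array.
-- Every element of xs is ASSUMED to lie within 0 .. n-1.
histogram :: Int -> [Int] -> UArray Int Int
histogram n xs = runSTUArray $ do
  counts <- newArray (0, n - 1) 0
  forM_ xs $ \x -> do
    c <- readArray counts x
    writeArray counts x (c + 1)
  return counts
```

From the outside `histogram' is a pure function; the mutation is confined to the state-transformer computation that builds the result.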

8.4 Using your own `mainIO'

Normally, the GHC runtime system begins things by calling an internal function

        mainIO :: IO ()

which, in turn, fires up your `Main.main'. The standard definition of `mainIO' looks like this:

        mainIO = catch Main.main 
                   (\err -> error ("I/O error: " ++ 
                                        showsPrec 0 err "\n"))

that is, all it does is run `Main.main', catching any I/O errors that occur and displaying them on standard error before exiting the program.

To subvert the above process, you need only provide a `mainIO' of your own (in a module named `GHCmain').

Here's a little example, stolen from Alastair Reid:

        module GHCmain ( mainIO ) where
        
        import GlaExts
        
        mainIO :: IO ()
        mainIO = do
                 sleep 5
                 _ccall_ printf "%d\n" (14::Int)
        
        sleep :: Int -> IO ()
        sleep t = _ccall_ sleep t

8.5 Calling C directly from Haskell

GOOD ADVICE: Because this stuff is not Entirely Stable as far as names and things go, you would be well-advised to keep your C-callery corralled in a few modules, rather than sprinkled all over your code. It will then be quite easy to update later on.

WARNING AS OF 2.03: Yes, the `_ccall_' stuff probably will change, to something better, of course! One step in that direction is Green Card, a foreign function interface pre-processor for Haskell ("Glasgow" Haskell in particular) -- check out

ftp://ftp.dcs.gla.ac.uk/pub/haskell/glasgow/green-card.ANNOUNCE
ftp://ftp.dcs.gla.ac.uk/pub/haskell/glasgow/green-card-src.tar.gz

8.5.1 `_ccall_' and `_casm_': an introduction

The simplest way to use a simple C function

double fooC( FILE *in, char c, int i, double d, unsigned int u )

is to provide a Haskell wrapper:

fooH :: Char -> Int -> Double -> Word -> IO Double
fooH c i d w = _ccall_ fooC ("stdin"::Addr) c i d w

The function `fooH' will unbox all of its arguments, call the C function `fooC' and box the corresponding result.
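For comparison, the same wrapper pattern can be sketched in the `foreign import' notation of later GHC releases. That notation is an assumption as far as this guide is concerned (it is not the `_ccall_' syntax described here), and the names `c_cos' and `cosH' are made up; the C function called is the standard `cos' from `math.h'.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble (..))

-- Bind the C library's cos(); the call lives in the IO monad,
-- just as a `_ccall_' does.
foreign import ccall unsafe "math.h cos" c_cos :: CDouble -> IO CDouble

-- Haskell-side wrapper: convert the argument to the C type,
-- make the call, and convert the result back.
cosH :: Double -> IO Double
cosH d = do
  r <- c_cos (realToFrac d)
  return (realToFrac r)
```

As with `fooH', the wrapper is the one place that knows about C types; callers see an ordinary `Double -> IO Double'.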

One of the annoyances about `_ccall_'s is when the C types don't quite match the Haskell compiler's ideas. For this, the `_casm_' variant may be just the ticket (NB: no chance of such code going through a native-code generator):

oldGetEnv name
  = _casm_ "%r = getenv((char *) %0);" name >>= \ litstring@(A# str#) ->
    return (
        if (litstring == "NULL") then
            Left ("Fail:oldGetEnv:"++name)
        else
            Right (unpackCString# str#)
    )

The first literal-literal argument to a `_casm_' is like a `printf' format: `%r' is replaced with the "result," `%0'--`%n-1' are replaced with the 1st--nth arguments. As you can see above, it is an easy way to do simple C casting. Everything said about `_ccall_' goes for `_casm_' as well.

8.5.2 Using function headers

When generating C (using the `-fvia-C' directive), one can assist the C compiler in detecting type errors by using the `-#include' directive to provide `.h' files containing function headers.

For example,

typedef unsigned long *StgForeignObj;
typedef long StgInt;

void          initialiseEFS (StgInt size);
StgInt        terminateEFS (void);
StgForeignObj emptyEFS(void);
StgForeignObj updateEFS (StgForeignObj a, StgInt i, StgInt x);
StgInt        lookupEFS (StgForeignObj a, StgInt i);

You can find appropriate definitions for `StgInt', `StgForeignObj', etc using `gcc' on your architecture by consulting `ghc/includes/StgTypes.lh'. The following table summarises the relationship between Haskell types and C types.

C type name      Haskell type
-------------------------------------
`StgChar'        `Char#'
`StgInt'         `Int#'
`StgWord'        `Word#'
`StgAddr'        `Addr#'
`StgFloat'       `Float#'
`StgDouble'      `Double#'
`StgArray'       `Array#'
`StgByteArray'   `ByteArray#'
`StgArray'       `MutableArray#'
`StgByteArray'   `MutableByteArray#'
`StgStablePtr'   `StablePtr#'
`StgForeignObj'  `ForeignObj#'

Note that this approach is only essential for returning `float's (or if `sizeof(int) != sizeof(int *)' on your architecture) but is a Good Thing for anyone who cares about writing solid code. You're crazy not to do it.

8.5.3 Subverting automatic unboxing with "stable pointers"

The arguments of a `_ccall_' are automatically unboxed before the call. There are two reasons why this is usually the Right Thing to do:

It is possible to subvert the unboxing process by creating a "stable pointer" to a value and passing the stable pointer instead. For example, to pass/return an integer lazily to C functions `storeC' and `fetchC', one might write:

storeH :: Int -> IO ()
storeH x = makeStablePtr x              >>= \ stable_x ->
           _ccall_ storeC stable_x

fetchH :: IO Int
fetchH   = _ccall_ fetchC               >>= \ stable_x ->
           deRefStablePtr stable_x      >>= \ x ->
           freeStablePtr stable_x       >>
           return x

The garbage collector will refrain from throwing a stable pointer away until you explicitly call one of the following from C or Haskell.

void freeStablePointer( StgStablePtr stablePtrToToss )
freeStablePtr :: StablePtr a -> IO ()

As with the use of `free' in C programs, GREAT CARE SHOULD BE EXERCISED to ensure these functions are called at the right time: too early and you get dangling references (and, if you're lucky, an error message from the runtime system); too late and you get space leaks.
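The make/dereference/free life cycle can be tried out entirely in Haskell, without any C code. The sketch below assumes the later `Foreign.StablePtr' library, in which the operation this guide calls `makeStablePtr' is taken to correspond to `newStablePtr'; the function name `roundTrip' is made up.

```haskell
import Foreign.StablePtr (deRefStablePtr, freeStablePtr, newStablePtr)

-- Make a stable pointer to a (possibly unevaluated) value, read the
-- value back through it, and release it again.
roundTrip :: Int -> IO Int
roundTrip x = do
  stable_x <- newStablePtr x        -- the collector now keeps x alive
  y        <- deRefStablePtr stable_x
  freeStablePtr stable_x            -- too early: dangling; never: leak
  return y
```

In real use the stable pointer would be handed to C between the first and second steps, as in `storeH' and `fetchH' above.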

And to force evaluation of the argument within `fooC', one would call one of the following C functions (according to type of argument).

void     performIO  ( StgStablePtr stableIndex /* StablePtr s (IO ()) */ );
StgInt   enterInt   ( StgStablePtr stableIndex /* StablePtr s Int */ );
StgFloat enterFloat ( StgStablePtr stableIndex /* StablePtr s Float */ );

Nota Bene: `_ccall_GC_' must be used if any of these functions are used.

8.5.4 Pointing outside the Haskell heap

There are two types that `ghc' programs can use to reference (heap-allocated) objects outside the Haskell world: `Addr' and `ForeignObj'.

If you use `Addr', it is up to you, the programmer, to arrange allocation and deallocation of the objects.

If you use `ForeignObj', `ghc''s garbage collector will call upon the user-supplied finaliser function to free the object when the Haskell world no longer can access the object. (An object is associated with a finaliser function when the abstract Haskell type `ForeignObj' is created). The finaliser function is expressed in C, and is passed as argument the object:

void foreignFinaliser ( StgForeignObj fo )

when the Haskell world can no longer access the object. Since `ForeignObj's only get released when a garbage collection occurs, we provide ways of triggering a garbage collection from within C and from within Haskell.

void StgPerformGarbageCollection()
performGC :: IO ()

More information on the programmers' interface to `ForeignObj' can be found in Section 9.4.5 Foreign objects.
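The idea of tying a finaliser to an external object can be sketched as follows. The example assumes the `ForeignPtr' interface of later GHC libraries, which plays a role similar to the `ForeignObj' type described here but is not the same interface; the names `newCell' and `readCell' are made up. The finaliser used is the C library's `free'.

```haskell
import Foreign.C.Types (CInt)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, malloc)
import Foreign.Storable (peek, poke)

-- Allocate one C int outside the Haskell heap and attach free() as its
-- finaliser, so the collector releases it once Haskell drops the reference.
newCell :: CInt -> IO (ForeignPtr CInt)
newCell v = do
  p <- malloc
  poke p v
  newForeignPtr finalizerFree p

-- Read the value back; the object is kept alive for the duration.
readCell :: ForeignPtr CInt -> IO CInt
readCell fp = withForeignPtr fp peek
```

Nothing in the Haskell code frees the memory explicitly; that job is left to the finaliser, which runs at some garbage collection after the last reference has gone.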

8.5.5 Avoiding monads

The `_ccall_' construct is part of the `IO' monad because 9 out of 10 uses will be to call imperative functions with side effects such as `printf'. Use of the monad ensures that these operations happen in a predictable order in spite of laziness and compiler optimisations.

To avoid having to be in the monad to call a C function, it is possible to use `unsafePerformIO', which is available from the `IOExts' module. There are three situations where one might like to call a C function from outside the IO world:

8.5.6 C-calling "gotchas" checklist

And some advice, too.