5. Preprocessor

Before the Pike code is sent to the compiler it is fed through the preprocessor. The preprocessor converts the source code from its character encoding into the Pike internal representation, performs some simple normalizations and consistency checks and executes the "preprocessor directives" that the programmer may have put into the file. The preprocessor directives are like a very simple programming language that allows for simple code generation and manipulation. The code preprocessor can be called from within Pike with the cpp call.
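As a minimal sketch of calling the preprocessor from Pike code, the following passes a small source string to cpp and prints the result. The function name preprocess_example and the file name "example.pike" are made up for illustration; the exact layout of the output (for instance any line markers) may differ between Pike versions, so none is shown here.

```pike
// Run the preprocessor on a small source string and return the result.
// preprocess_example is a hypothetical helper name.
string preprocess_example()
{
  string src =
    "#define CYCLES 20\n"
    "int n = CYCLES;\n";
  // The second argument names the "file" being preprocessed;
  // "example.pike" is an arbitrary placeholder.
  return cpp(src, "example.pike");
}

int main()
{
  write("%s", preprocess_example());
  return 0;
}
```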

5.1. Charset Heuristics

Pike code is Unicode enabled, so the first thing the preprocessor has to do is to try to determine the character encoding of the file. It will first look at the first two bytes of the file and interpret them according to this chart.

Byte 0   Byte 1   Interpretation
0        0        32bit wide string.
0        >0       16bit Unicode string.
>0       0        16bit Unicode string in reverse byte order.
0xfe     0xff     16bit Unicode string.
0xff     0xfe     16bit Unicode string in reverse byte order.
0x7b     0x83     EBCDIC-US ("#c").
0x7b     0x40     EBCDIC-US ("# ").
0x7b     0x09     EBCDIC-US ("#\t").
  • With any other combination of bytes the preprocessor will assume iso-8859-1 encoding until a #charset directive has been found.
  • The file size must be a multiple of 4 or 2 bytes in order to be correctly decoded as a 32bit or 16bit wide string, respectively.
  • It's an error for a program written in EBCDIC not to start with a #charset directive.
  • For obfuscation it is possible to encode the #charset directive in a different charset than the charset stated in the #charset directive.
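
The chart can be read as a decision procedure on the first two bytes. The sketch below restates it as Pike code; guess_encoding is a hypothetical helper written for this example, not the preprocessor's actual implementation, and it ignores the file size requirement and any later #charset directive.

```pike
// Hypothetical helper that mirrors the chart; not Pike's real code.
string guess_encoding(string data)
{
  // With fewer than two bytes there is nothing to inspect. Falling back
  // to the default here is an assumption; the text does not cover it.
  if (sizeof(data) < 2) return "iso-8859-1";

  int b0 = data[0], b1 = data[1];

  if (!b0 && !b1) return "32bit wide string";
  if (!b0) return "16bit Unicode string";
  if (!b1) return "16bit Unicode string, reverse byte order";
  if (b0 == 0xfe && b1 == 0xff) return "16bit Unicode string";
  if (b0 == 0xff && b1 == 0xfe)
    return "16bit Unicode string, reverse byte order";
  // Per the chart, 0x7b followed by 0x83, 0x40 or 0x09 means EBCDIC-US.
  if (b0 == 0x7b && (< 0x83, 0x40, 0x09 >)[b1]) return "EBCDIC-US";

  // Any other combination is assumed to be iso-8859-1.
  return "iso-8859-1";
}

int main()
{
  write("%s\n", guess_encoding("#charset utf-8\n"));
  write("%s\n", guess_encoding("\0\0\0#"));
  return 0;
}
```

The order of the tests matters: the zero-byte checks come first, so the later comparisons only ever see two non-zero bytes.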

5.2. Code Normalization

The preprocessor collapses all consecutive white space characters outside of strings, except for newlines, to single space characters. All // and /**/ comments are removed, as are #! lines. Pike considers ANSI/DEC escape sequences as white space. Supported formats are <ESC>[\040-\077]+[\100-\177] and <CSI>[\040-\077]*[\100-\177]. Note that this means that it is possible to do color markup in the actual source file.

The preprocessor will treat seven consecutive < characters outside of a string as a version control conflict error and will return "Merge conflict detected."

5.3. Defines and Macros

Defining macros or constants is one of the most used preprocessor features. It enables you to make abstractions on a code generation level as well as altering constants cross-application. The simplest use of the #define directive however is to declare a "define" as present.

#define DO_OVERSAMPLING

The existence of this definition can now be used by e.g. #ifdef and #ifndef to activate or deactivate blocks of program code.

#ifdef DO_OVERSAMPLING
  // This code is not always run.
  img->render(size*4)->shrink(4);
#endif

Note that defines can be given to pike at execution time. In order to set DO_OVERSAMPLING from a command line, the option -DDO_OVERSAMPLING is added before the name of the pike program. E.g. pike -DDO_OVERSAMPLING my_program.pike.

A define can also be given a specific value, which will be inserted everywhere the define is placed in the source code.

#define CYCLES 20

void do_stuff() {
  for(int i; i<CYCLES; i++) do_other_stuff();
}

Defines can be given specific values on the command line too; just be sure to quote them as required by your shell.

~% pike '-DTEXT="Hello world!"' -e 'write("%s\n", TEXT);'
Hello world!

Finally, #define can also be used to define macros. Macros are just text expansion with arguments, but they are often very useful for making code look cleaner and for writing less of it.

#define VAR(X) id->misc->variable[X]
#define ROL(X,Y) ((((X)<<(Y))&255)|((X)>>(8-(Y))))
#define PLACEHOLDER(X) void X(mixed ... args) { \
  error("Method " #X " is not implemented yet.\n"); }
#define ERROR(X,Y ...) werror("MyClass" X "\n", Y)
#define NEW_CONSTANTS(X) do{ int i=sizeof(all_constants()); \
    X \
    werror("Constant diff is %d\n", sizeof(all_constants())-i); \
  }while(0)
#define MY_FUNC(X,Y) void my##X##Y()
  • A macro can have up to 254 arguments.
  • It can be wise to put extra parentheses around the arguments expanded since it is a purely textual expansion. E.g. if the macro DOUBLE(X) is defined as X*2, then DOUBLE(2+3) will produce 2+3*2, probably producing a hard to track down bug.
  • Since the preprocessor works with textual expansion, it will not evaluate its arguments. Using one argument several times in the macro will thus cause it to be evaluated several times during execution. E.g. #define MSG(X) werror("The value "+(X)+" can differ from "+(X)+"\n") will usually print two different values when called with MSG(random(1000));.
  • A backslash (\) at the end of the line can be used to make the definition span several lines.
  • It is possible to define macros with a variable list of arguments by using the ... syntax.
  • Macros are often formulated so that a semicolon after them is appropriate, for improved code readability.
  • In Pike code macros and defines are most often written in all caps.
  • If a macro expands into several statements, you are well advised to group them together in a containing block, such as do { BODY } while(0). If you do not, your macro could produce other hard to track down bugs when it is used as the body of a loop or an if statement without surrounding curly braces.
  • A hash (#) in front of a macro variable "casts" it to a string.
  • A double hash (##) in front of a macro variable concatenates it with the text before it.
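
Several of the points above can be tried out in a short program. This is a minimal sketch: DOUBLE_BAD, DOUBLE_OK, TWICE, NAME_OF, next_value and myfoobar are hypothetical names chosen for the example, and the MY_FUNC variant used here returns int rather than void so that its result can be printed.

```pike
// All macro and function names below are made up for illustration.

// No parentheses: DOUBLE_BAD(2+3) expands to 2+3*2.
#define DOUBLE_BAD(X) X*2
// Parenthesized argument and result: DOUBLE_OK(2+3) expands to ((2+3)*2).
#define DOUBLE_OK(X) ((X)*2)
// The argument appears twice, so it is evaluated twice at run time.
#define TWICE(X) ((X)+(X))
// A hash in front of the argument turns it into a string.
#define NAME_OF(X) #X
// A double hash pastes the argument onto the text before it.
#define MY_FUNC(X,Y) int my##X##Y()

int calls = 0;

// Counts how many times it has been called.
int next_value() { return ++calls; }

// Expands to: int myfoobar() { return 42; }
MY_FUNC(foo,bar) { return 42; }

int main()
{
  write("%d\n", DOUBLE_BAD(2+3));       // 2+3*2 is 8, not 10
  write("%d\n", DOUBLE_OK(2+3));        // (2+3)*2 is 10
  write("%d\n", TWICE(next_value()));   // next_value() runs twice: 1+2
  write("%d\n", calls);                 // 2, not 1
  write("%s\n", NAME_OF(hello));
  write("%d\n", myfoobar());
  return 0;
}
```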

5.4. Preprocessor Directives

All of the preprocessor directives except the string-related (#string and #"") should have the hash character (#) as the first character of the line. Even though it is currently allowed to be indented, it is likely that this will generate warnings or errors in the future. Note that it is however allowed to put white-space between the hash character and the rest of the directive to create indentation in code.
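
A small sketch of the recommended layout follows: the hash stays in the first column and the nesting is shown by white space after it. DEBUG_LEVEL and TRACE_ENABLED are hypothetical names used only for this example.

```pike
// DEBUG_LEVEL and TRACE_ENABLED are made-up names for illustration.
#define DEBUG_LEVEL 2

// The hash is the first character of each line; the indentation
// is placed between the hash and the directive name.
#if DEBUG_LEVEL > 1
#  define TRACE_ENABLED 1
#else
#  define TRACE_ENABLED 0
#endif

int main()
{
  write("trace enabled: %d\n", TRACE_ENABLED);
  return 0;
}
```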

5.5. Predefined defines


Constant __VERSION__

constant __VERSION__

Description

This define contains the current Pike version as a float. If another Pike version is emulated, this define is updated accordingly.

See also

__REAL_VERSION__


Constant __MAJOR__

constant __MAJOR__

Description

This define contains the major part of the current Pike version, represented as an integer. If another Pike version is emulated, this define is updated accordingly.

See also

__REAL_MAJOR__


Constant __MINOR__

constant __MINOR__

Description

This define contains the minor part of the current Pike version, represented as an integer. If another Pike version is emulated, this define is updated accordingly.

See also

__REAL_MINOR__


Constant __BUILD__

constant __BUILD__

Description

This constant contains the build number of the current Pike version, represented as an integer. If another Pike version is emulated, this constant remains unaltered.

See also

__REAL_BUILD__


Constant __REAL_VERSION__

constant __REAL_VERSION__

Description

This define always contains the version of the current Pike, represented as a float.

See also

__VERSION__


Constant __REAL_MAJOR__

constant __REAL_MAJOR__

Description

This define always contains the major part of the version of the current Pike, represented as an integer.

See also

__MAJOR__


Constant __REAL_MINOR__

constant __REAL_MINOR__

Description

This define always contains the minor part of the version of the current Pike, represented as an integer.

See also

__MINOR__


Constant __REAL_BUILD__

constant __REAL_BUILD__

Description

This define always contains the build number of the current Pike, represented as an integer.

See also

__BUILD__


Constant __DATE__

constant __DATE__

Description

This define contains the current date at the time of compilation, e.g. "Jul 28 2001".


Constant __TIME__

constant __TIME__

Description

This define contains the current time at the time of compilation, e.g. "12:20:51".


Constant __FILE__

constant __FILE__

Description

This define contains the file path and name of the source file.


Constant __DIR__

constant __DIR__

Description

This define contains the directory path of the source file.


Constant __LINE__

constant __LINE__

Description

This define contains the current line number, represented as an integer, in the source file.


Constant __AUTO_BIGNUM__

constant __AUTO_BIGNUM__

Description

This define is defined when automatic bignum conversion is enabled. When enabled all integers will automatically be converted to bignums when they get bigger than what can be represented by an integer, hampering performance slightly instead of crashing the program.


Constant __NT__

constant __NT__

Description

This define is defined when Pike is running on a Microsoft Windows OS, not just Microsoft Windows NT, as the name implies.


Constant __PIKE__

constant __PIKE__

Description

This define is always true.


Constant __amigaos__

constant __amigaos__

Description

This define is defined when Pike is running on Amiga OS.
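
As a rough illustration of how the predefined defines can be used, the sketch below prints a few of them. The function name describe_build is hypothetical, and the printed values depend on the Pike installation and on when and where the file is compiled, so no output is shown.

```pike
// describe_build is a made-up helper name for this example.
string describe_build()
{
  // __VERSION__ is a float; the other version parts are integers.
  return sprintf("Pike %.1f (major %d, minor %d, build %d)",
                 __VERSION__, __MAJOR__, __MINOR__, __BUILD__);
}

int main()
{
  write("%s\n", describe_build());
  // __DATE__, __TIME__ and __FILE__ are strings, __LINE__ is an integer.
  write("Compiled %s %s from %s, line %d\n",
        __DATE__, __TIME__, __FILE__, __LINE__);
#ifdef __NT__
  write("Running on a Microsoft Windows OS.\n");
#endif
  return 0;
}
```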

5.6. Test functions

These functions are used in #if et al. expressions to test for the presence of symbols.