Macros have a deservedly bad reputation in C++.
#define f(a,b) a * b int main() { int x = 5, y = 10; int z; z = f(2, x + y); // Expands to 2 * x + y }
#define f(a,b) (a) * (b) Now f(2, x + y) is expanded to (2) * (x + y) but 10.0/f(2,x + y) is expanded to 10.0/(2)*(x + y)
#define f(a,b) ((a) * (b)) Now 10.0/f(2,x + y) is expanded to 10.0/((2)*(x + y))
C++ macros have a feature that is not usually noted in C++ texts.
Namely, macro expansion can insert quotes around any macro parameter, effectively converting any parameter expression to a string.
#define f(a) #a f(x+y) would be expanded to "x + y"
This quoting feature of macros can be used to provide a rudimentary, but quite effective way of introducing some automation in testing functions and methods in C++ programs.
The technique is quite simple:
1 #define test_(a) do_it(a, #a, __LINE__) 2 3 void do_it(bool cond, const string& expr, int lineno) 4 { 5 if ( cond ) { 6 cout << "passed: " << expr << "[ line: " << lineno << " ]" << endl; 7 } else { 8 cout << "failed: " << expr << "[ line: " << lineno << " ]" << endl; 9 } 10 11 }
Code Example
passed: lst.size() == 0 [ line: 32] passed: lst.empty() == true [ line: 33] failed: lst.size() == 1 [ line: 34] passed: lst.size() == 1 [ line: 39] passed: lst.front() == 5 [ line: 40] passed: lst.back() == 5 [ line: 41] passed: lst.front() == 5 [ line: 47] passed: lst.back() == 10 [ line: 48] passed: lst.size() == 2 [ line: 49]
The Intel processor has registers which are denoted differently by different assembly languages.
Two such registers are the frame pointer and the stack pointer.
The GNU assembler denotes these by %ebp and %esp, respectively.
The frame pointer contains the address of the bottom entry of the activation record; the stack pointer holds the address of the top entry of the same activation record.
pushl x pushes the effective operand x onto the stack adjusts %esp to point to the new top popl y copies the top stack value into effective operand y adjusts %esp to logically remove this value from the stack movl a,b copy a to b call label push the code return address onto the stack (this adjusts the %esp register) ret pops the return address off the stack and puts it into the program counter register leave movl %ebp, %esp ; resets stack ptr to frame beginning popl %ebp ; copies old ebp value back into %ebp (These reset the stack and frame pointers back to the caller's stack frame.)
1 int f(); 2 3 void g() 4 { 5 int x = 5, y; 6 y = f(); 7 x = x + y; 8 } pushl %ebp // 1 Save g's caller's frame pointer movl %esp, %ebp // 1 Set up g's frame pointer subl $8, %esp // 1 Allocate space for g's activation record movl $5, -4(%ebp) // 2 Initialize x call __Z1fv // 2 call f (pushes addr of movl on stack) movl %eax, -8(%ebp) // 2 return value in eax copied into y movl -8(%ebp), %edx // 3 copying y before adding leal -4(%ebp), %eax // 3 getting the address of x addl %edx, (%eax) // 3 x + y stored at x's address leave // Restore g's caller's activation record ret // return execution to g's caller
Most languages allow some nesting of structures in which names can be bound.
C++ allows nested blocks where variables can be declared that are local to that block.
C++ and Java allow nested classes.
ML allows nested functions definitions.
Most of these situations are still simple for determining binding since they generally use a static binding rule, namely the most-closely nested rule.
Typically languages use this combination: dynamic activation records and static scope binding.
Recursion forces dynamic activation records.
Sanity forces static scope binding.
The run time dynamic allocation of activation records presents a messy problem for compiler writers to be able to implement the static binding that we programmers want.
Imperative languages like C, Perl, C++ (imperative + oo), Java (imperative + oo) use stacks for activation records.
Pascal, Modula, and Ada also use stacks for activation records, but in addition are are block structured languages that allow procedure definitions to be nested.
Here is a potential problem for such block structured languages:
M { int x; P() { int y = 5; print x + y; } R() { int z = 10; print x + z; P(); } x = 2; P(); R(); }
When M calls P
Stack P [ y = 5 ] P accesses its own y and M's x and M [ x = 2 ] printx 7
When M calls R, and then R calls P
Stack P [ y = 5 ] P should access its y and M's x for 7 again R [ z = 10 ] R printed 12 (x + z) M [ x = 2 ]
The problem is how does P find M's x at run time?
M's activation record is at least on the stack, but how far away?
It depends on who called P - M or R.
P [ y = 5 ] P [ y = 5 ] M [ x = 2 ] R [ z = 10 ] M [ x = 2 ]
One solution is to use static links (or access links).
The activation record for a function F will have in its constant part:
P [ y = 5; sl=?; level=1 ] P [ y = 5; sl=?; level=1 ] M [ x = 2; sl=null; level=0] R [ z = 10; sl=M; level=1 ] M [ x = 2; sl=null; level=0 ]
If a function is at nesting level k, then the function that immediately contains it is at level k-1.
Consider the following pseudo code in a language that allows function definitions to be nested.
The language uses stack based activation records and static scope binding using the most-closely nested rule.
Using this static binding (lexical binding), the occurrence x3 is bound to the declaration at x2. But which one?
There will be multiple activation records on the call stack for function p since p and q are calling each other repeatedly.
1 void Main() 2 { 3 int x; // x1 4 void p() 5 { 6 int x; // x2 7 void q() 8 { 9 y = x + 1; // x3 10 if ( y < 10 ) { 11 p(); 12 } 13 return; 14 } 15 x = r(); // x4 16 q(); 17 return; 18 } 19 int r() 20 { 21 ... 22 } 23 x = 1; // x5 (Start here) 24 p(); 25 }
The assembly code example suggests how parameters can be passed.
The calling function can push values onto the stack.
The called function can then access these parameters as offsets from the frame pointer (in the opposite direction from the space allocated for its own local variables).
However, there are still several variations on what to push; how the caller uses what is pushed, and how this affects the callers values.
We will briefly consider these variations:
The r-value of the expression or variable is pushed on the stack.
The called function uses this copy during execution.
Nothing is copied back to the caller's argument variable; the called function doesn't even know the location of this variable.
The l-value of the variable is pushed on the stack. That is, its address is pushed, not the contents of the address.
The called function uses this address to indirectly access the callers variable (location or contents, as needed) during execution.
Nothing is copied back to the caller's argument variable; the called function does have the location of this variable, but the called function has already made any changes to the caller's argument indirectly each time its code modifies the parameter.
The l-value of the expression or variable is pushed on the stack, but separate storage is also allocated to hold a copy of the r-value.
The called function uses this copy during execution. This means it doesn't have to keep using the address to indirectly access the value.
At the return the final value of the local copy is written back to the caller's argument variable; the called function can do this because it has the location of this variable on the stack.
Next time ...
Strictly speaking this is not a calling mechanism as described in terms of activation records.
However, syntactically call a macro defined function looks like an ordinary function call.
As we have seen each call to a macro is replaced by the body of the macro definition at the point of call in the caller's code.
This means effectively any names which do not have local bindings in the macro definition will be bound by looking in the caller's code. That is, as it would be using dynamic scope binding.
Next time ...