
3.2.2.3 Data Segments

Cogsys runs in MS-DOS, and MS-DOS is a real mode environment. This unfortunate fact is the key to the system's real time capabilities but is also the chief source of the developer's headaches. Intel (and compatible) processors run in one of two states: real or protected. Real mode does not support multitasking and is therefore almost never used by modern operating systems; it survives only for backwards compatibility and system boot. Protected mode fully exploits the processor's capabilities, and allows for much smoother and faster execution. Protected mode operating systems are always multitasking, however, and so by nature are poor candidates for real time systems.

This section documents a very ugly real mode issue that you must understand in order to write working extensions. Your extension will not work unless you incorporate the changes this section describes.

The biggest drawback of real mode is the segmented memory model that comes with it. Unlike modern operating systems, which have a flat memory model, real mode requires the programmer to access memory with a two-part address: a 16-bit segment followed by a 16-bit offset. A real mode address is usually written as two two-byte quantities separated by a colon, for example: 162E:023A. The processor forms the actual address by multiplying the segment by 16 and adding the offset. Most unfortunately, the highest memory address is not a 32-bit quantity (4 GB, the protected, flat model limit) but rather a 20-bit one, 1 MB. This means that the segments overlap, which in turn implies that there is more than one way (in fact, there can be as many as 4,096) to talk about the same address: 1001:0000 and 1000:0010 are the same location.

To help improve speed, real mode makes two requirements. First, a function must fit entirely in one segment (64 kilobytes). Then, once inside a function, an internal processor register called the data segment register (abbreviated DS in assembly language) is set to the current function's data segment. The data segment is a segment of memory where some of this function's data resides, including certain constants, local array structures and many other variables specific to the function. The value of the data segment register is used whenever the function needs a value from its data segment.
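
The address arithmetic described above can be sketched in a few lines of portable C. This is only an illustration, not part of Cogsys: the function names linear_address and show_address are made up for this example, and the code builds with any C compiler rather than requiring Turbo C.

```c
#include <stdio.h>

/* Convert a real mode segment:offset pair to the linear address it
   refers to: the segment is multiplied by 16 (shifted left four bits)
   and the offset is added.  Hypothetical helper, for illustration. */
unsigned long linear_address(unsigned int segment, unsigned int offset)
{
    return ((unsigned long)(segment & 0xFFFFu) << 4)
           + (unsigned long)(offset & 0xFFFFu);
}

/* Print a pair next to the linear address it maps to. */
void show_address(unsigned int segment, unsigned int offset)
{
    printf("%04X:%04X -> %05lX\n", segment, offset,
           linear_address(segment, offset));
}
```

Calling show_address(0x1001, 0x0000) and show_address(0x1000, 0x0010) reports the same linear address for both pairs, which is the overlap the text describes.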

So what does all this data segment assembly language garbage have to do with writing extensions? Unfortunately, the block of memory that the extension routine runs in is allocated dynamically at run time and is therefore, by definition, in a completely different segment than everything else. It is thus the awful responsibility of the extension writer to modify the data segment register manually, or code in the extension will not be able to access its data properly.

Fortunately, this is not as hard as it sounds. Borland Turbo C/C++ 3.0 has an asm keyword which allows the C programmer to directly insert a line of assembly language. What's more, this line can directly reference C code identifiers. In assembly language, the mov instruction means move (copy) data from one location (a register or memory) to another. So changing the value of the data segment register is as simple as this:

 
  asm mov ds, new-data-segment-value;

Because we are in real mode, the processor registers are all 16-bit quantities. We use variables of type unsigned short to store register values. So here's a bit of code that saves the current value of the data segment register, changes it, and then restores it.

 
  unsigned short oldseg, newseg;
  ...
  asm mov oldseg, ds;
  asm mov ds, newseg;
  ...
  asm mov ds, oldseg;

So now that we know why and how the data segment register should be changed, the only questions remaining are when it must be reset and what it must be reset to.

The "what" question is easy: as far as the extension writer is concerned, the data segment register should either have its original value (that is, the value it had when the function was entered), or it should be set to the data segment of the block of memory where the extension was loaded. These are referred to below as the original segment value and the extension segment value.

You get the original segment by saving the contents of the data segment register as soon as your extension is called. The second value, the extension segment, is passed to your extension as the first two bytes of the arg buffer.

So the following lines need to be added to the top of the extension routine in our `shape.c' file:

 
  unsigned short oldseg, myseg;
  ...
  myseg = *((unsigned short *) arg); /* get extension segment value */
  asm mov oldseg,ds  ;               /* save current value of ds */
  asm mov ds,myseg;                  /* reset ds to our location */
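
The cast used to fetch the extension segment can also be written out byte by byte, which makes it clearer what "the first two bytes of the arg buffer" means. The following is a minimal sketch, not code from Cogsys: the helper name segment_from_arg is made up, and it assumes the value is stored low byte first, as x86 processors store 16-bit quantities.

```c
/* Read the 16-bit value held in the first two bytes of a buffer.
   Assumes low byte first (x86 byte order).  Hypothetical helper
   for illustration; on a real mode x86 compiler it should yield the
   same value as *((unsigned short *) arg). */
unsigned short segment_from_arg(const void *arg)
{
    const unsigned char *bytes = (const unsigned char *) arg;

    return (unsigned short)(bytes[0] | ((unsigned int) bytes[1] << 8));
}
```

For a buffer whose first two bytes are 2E and 16 (hexadecimal), this returns the segment value 162E.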

At the end of the function, restore the data segment register to the original value:

 
  asm mov ds,oldseg ;

The final twist is that the answer to the "when" question is not so simple as "change to extension segment at beginning, restore to original segment at end". Many main program objects that are referenced from the extension (for instance, gVars[] or sin() from the examples above) need to have the data segment restored to the original segment before they are called. The word "many" is used because, exasperatingly, not all main program objects need to have the data segment register set. And no, you can't just always change the data segment before all main object references; depending on how your statement makes use of local variables, that may not work either.

So what to do? First, don't panic. In all cases where extensions were not behaving properly, the solution was either to set or reset the value of the data segment register before a single line of C code, and then undo that setting after it. So in the very worst possible case, a little bit of trial and error will solve your problem (surprisingly, this has become a very effective tactic).

Secondly, you can make use of the following rules:

So, in a nutshell, here's the answer to the "when" question:

  1. If you need to access gVar[] or any other main program data object, copy what you need to local variables right at the beginning.
  2. Now, save and reset the data segment register.
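  3. If you need to use sin(), cos(), or rand(), do this: restore the data segment register to the original segment, make the call, and then set it back to the extension segment. This can be encapsulated nicely in a C macro. For example: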
     
    #define getrandom() asm mov ds, oldseg; r= rand(); asm mov ds, myseg
    
    lets you call getrandom(); by itself. Subsequently, the local variable r will be set to a new random number.
  4. Restore the data segment register to original value.
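
The ordering of the four steps above can be sketched as follows. This is a simulation under stated assumptions, not the real shape extension: an ordinary variable (sim_ds) stands in for the DS register so that the sketch builds with any C compiler, and the names GET_DS, SET_DS, main_value, ds_seen_by_rand and extension_sketch are all made up for this example. Under Turbo C the two macros would presumably be replaced by the asm lines shown earlier.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the DS register (assumption: 0x1000 plays the role of
   the original segment).  Not real register access. */
static unsigned short sim_ds = 0x1000;
#define GET_DS(var) ((var) = sim_ds)   /* text's form: asm mov var, ds */
#define SET_DS(var) (sim_ds = (var))   /* text's form: asm mov ds, var */

static int main_value = 7;             /* stand-in for main program data */
static unsigned short ds_seen_by_rand; /* records "DS" during rand()     */

int extension_sketch(const unsigned char *arg)
{
    unsigned short oldseg, myseg;
    int local_copy, r;

    local_copy = main_value;           /* 1. copy main program data first */

    memcpy(&myseg, arg, sizeof myseg); /* extension segment from arg      */
    GET_DS(oldseg);                    /* 2. save the original value ...  */
    SET_DS(myseg);                     /*    ... and switch to our own    */

    SET_DS(oldseg);                    /* 3. original segment around the  */
    ds_seen_by_rand = sim_ds;          /*    library call, then back      */
    r = rand();
    SET_DS(myseg);

    SET_DS(oldseg);                    /* 4. restore before returning     */
    return local_copy + (r & 1);
}
```

The point of the sketch is only the order of operations: main program data is copied before the switch, the library call is bracketed by a restore and a reset, and the original value is back in place when the routine returns.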

It is presently unclear whether all this could be avoided with a smarter dynamic loader in Cogsys. Because the dependence on data segments is hard to re-create after compilation, attempts to solve the problem that way have not been reliable. As with the limitations on types of external objects called, it is most probable that the correct solution to the data segment register headache is to move Cogsys to a platform which natively supports dynamically linked libraries (see section New Directions). It is also worth noting that while these modifications are hideously inelegant and can be terribly confusing, in practice they are little more than a minor irritation.

Here is the revised code for the shape extension. This one will build cleanly following the instructions at the beginning of this chapter, and the resulting `SHAPE.CXR' is a real, live extension that can be used in the field.

 

The next section is the Extension Reference, which fully describes, for all included extensions, every option available to the user and every line of source for the developer. By following the rules detailed here and using examples from the reference section, the fearless user will be able to create some very powerful extensions.

Be sure to send me screenshots. :)



This document was generated by Usman Muzaffar on June, 28 2000 using texi2html