Relocatable Code: How do you do it?

Posted by Cliff on Saturday September 25, 1999 @11:45AM from the nomadic-code dept.

Aras Vaichas asks: "I am designing the kernel software for a radio modem. The processor is an AMD186ED - 1MB address space split evenly between DRAM and FLASH/SRAM. Initially I was going to port FreeBSD to it but after scouring the FreeBSD kernel code discovered that there was too much 80386 code there (memory manager, etc). This makes life a little more difficult so I am going with another commercially available realtime operating system. My problem is that I want to implement a Flash File System and be able to execute programs (EXE style) from it. How does the BIOS in a PC relocate the EXE file? This is done via a BIOS call and therefore the code is in the BIOS. Does anyone have or know where I can get the (any flavour of) BIOS source code? "

Check out ecos from cygnus by Anonymous Coward · 1999-09-25 10:29 · Score: 2

I'll get to your relocatable code question, but as an aside, you might want to check out eCos from Cygnus. I haven't dealt with ROM-based code for several years now, so I don't know much about eCos. But it comes from Cygnus, is free source (not GPL; their variation on the Mozilla license), and who knows, it may even be ported to your setup.

In general, processors have two kinds of addresses for data and instructions: absolute and relative. Absolute is faster since no arithmetic is involved. A .o file has the symbol name; a static linker resolves it at link time, and a dynamic linker resolves it at load/run time.

Relative branches are slower because of the arithmetic, but on some machines the instructions can be quite a bit shorter because a full address is not necessary. Of course, this itself may make enough of a speed difference to cancel the arithmetic, and the code size reduction may be worthwhile too. The linker probably has to compute relative distances at link time.

None of this helps directly with your request for BIOS code, but if you want that just to figure out how it copies code, you may not need it. If your processor supports relative addressing, you probably don't. Make sure you can do everything you want with relative addressing. Some processors are troublesome this way, at least memory says so, but I spent 7 years on just m68k, so I'm a bit hazy on others now.

Of course, you may still need absolute addressing if system subroutines are at absolute locations in low or high memory, and if you have memory mapped I/O.

--
Infuriate left and right
Other Techniques by Detritus · 1999-09-25 21:05 · Score: 2

Does it have to be an EXE file?
There is an old trick that I've used on 8-bit CPUs. Link the program as an absolute memory image at two different load addresses. Write a program that compares the two images and outputs a relocation list to a file. Then you can use a simple loader that takes an absolute memory image and the relocation list to load the program at any desired address. You should be able to do this with COM files on the 80x86.
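The two-image trick can be sketched roughly as follows. This is only an illustration, assuming 16-bit little-endian addresses and images of equal length; the function names (build_reloc_list, apply_relocs) are made up for this example and do not come from any particular toolchain.

```c
#include <stddef.h>
#include <stdint.h>

/* Read and write little-endian 16-bit words in a byte buffer. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static void write_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

/*
 * Compare two absolute images of the same program, linked at base_a and
 * base_b. Any 16-bit word that differs by exactly (base_b - base_a) is
 * taken to be an address that needs relocating. Offsets are stored in
 * relocs[] (up to max_relocs); the total number found is returned.
 */
size_t build_reloc_list(const uint8_t *img_a, const uint8_t *img_b,
                        size_t len, uint16_t base_a, uint16_t base_b,
                        uint16_t *relocs, size_t max_relocs)
{
    uint16_t delta = (uint16_t)(base_b - base_a);
    size_t count = 0;
    size_t i = 0;

    while (i + 1 < len) {
        uint16_t wa = read_le16(img_a + i);
        uint16_t wb = read_le16(img_b + i);
        if (wa != wb && (uint16_t)(wb - wa) == delta) {
            if (count < max_relocs)
                relocs[count] = (uint16_t)i;
            count++;
            i += 2;             /* skip past the address word */
        } else {
            i++;
        }
    }
    return count;
}

/*
 * Loader side: patch an image that was linked at link_base so that it
 * runs at load_base, using the relocation list built above.
 */
void apply_relocs(uint8_t *img, const uint16_t *relocs, size_t count,
                  uint16_t link_base, uint16_t load_base)
{
    uint16_t delta = (uint16_t)(load_base - link_base);
    size_t i;

    for (i = 0; i < count; i++) {
        uint8_t *p = img + relocs[i];
        write_le16(p, (uint16_t)(read_le16(p) + delta));
    }
}
```

Picking two link addresses whose difference is unlikely to show up by accident helps keep ordinary data from being mistaken for an address.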

--
Mea navis aericumbens anguillis abundat
relocatable code on x86 by dutky · 1999-09-25 12:38 · Score: 5

You are pretty well screwed if you want to do position independent code on iAPX 86 processors prior to the 80386, because of the glorious fun we call Segmented Memory.

On the 8088/8086/80186/80286 and compatibles, you have essentially two kinds of jump instructions: near jumps that add an 8-bit or 16-bit offset to the IP register but do not affect the CS register (remember, your full instruction counter is the 20-bit sum IP+16*CS), and far jumps that replace both the IP and the CS with new values, but do not allow relative addressing.
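To make the address arithmetic concrete, here is a small model in C. It is a sketch only: the cpu_regs struct and the function names are invented for illustration, and the 20-bit wraparound assumes a real-mode part with 20 address lines, such as the 8086 or 80186.

```c
#include <stdint.h>

/* Hypothetical model of the two registers that form the program counter. */
typedef struct {
    uint16_t cs;    /* code segment */
    uint16_t ip;    /* instruction pointer (offset within the segment) */
} cpu_regs;

/* Physical address = segment * 16 + offset, truncated to 20 bits. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}

/*
 * IP-relative jump: only IP changes, and it wraps within the current
 * 64k segment. CS is untouched, so the code does not care where the
 * segment itself sits in memory.
 */
void relative_jump(cpu_regs *r, int16_t displacement)
{
    r->ip = (uint16_t)(r->ip + displacement);
}

/*
 * Far jump: both CS and IP are replaced with absolute values, so the
 * instruction has to carry a segment number that depends on where the
 * program was loaded.
 */
void far_jump(cpu_regs *r, uint16_t new_cs, uint16_t new_ip)
{
    r->cs = new_cs;
    r->ip = new_ip;
}
```

The segment operand passed to far_jump is exactly the sort of value a loader has to patch when the program lands somewhere other than where it was linked.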

So you see, you can either have position independent code (jumps addressed from the current IP) or you can have more than 64k of code, but not both. The way around this is to use far jumps in your executable files and modify the target addresses when you load the code into memory at runtime.

I rather doubt that there is a BIOS call to relocate the executable file. I've written a multitasking OS for 8086s, and I don't remember there being such a BIOS call. In fact, I had to write a relocating linking loader myself, so I feel pretty comfortable saying there is no such beast in the motherboard BIOS. You are probably confusing the MS-DOS BIOS (loaded from the file IO.SYS on the boot disk) with the motherboard BIOS (contained in ROM). The basics of an .EXE relocating linking loader are, however, pretty simple:

In their header, .EXE files have a table of the offsets of all absolute addresses in the program that are in need of relocation. All you do is, once you have loaded the .EXE into its new home in RAM, go through the table and add the new base address of the program in RAM (as a segment number) to each indicated address in the program image in memory.

There is a reasonable explanation of the .EXE file format and the relocation process on pages 83-85 of the Microsoft MS-DOS Programmer's Reference (Microsoft Press, ISBN 1-55615-546-8, US$27.95) and on page 307 of Dissecting DOS (Michael Podanoffsky, Addison Wesley, ISBN 0-201-62687-X, US$39.95).

The Podanoffsky book will provide you with real code to perform this bit of sorcery, but I will quote from the M$ book, since it should be considered more authoritative:

The relocation table is an array of relocation pointers, each of which points to a relocatable-segment address in the program image. The exRelocItems field in the file header specifies the number of pointers in the array, and the exRelocTable field specifies the file offset at which the relocation table starts. Each relocation pointer consists of two 16-bit values: an offset and a segment number.

To load an .EXE program, MS-DOS first reads the file header to determine the .EXE signature and calculate the size of the program image. It then attempts to allocate memory. ... [if there isn't enough memory for the program + any extra memory the .EXE file specifies + the OS's bookkeeping data structures (in MS-DOS called the Program Segment Prefix, or PSP) then MS-DOS returns an error, otherwise it allocates the memory -- JSD]

After allocating memory, MS-DOS determines the segment address, called the start-segment address, at which to load the program image. If the value in both the exMinAlloc and exMaxAlloc fields is zero, MS-DOS loads the image as high as possible in memory. Otherwise, it loads the image immediately above the area reserved for the PSP.

Next, MS-DOS reads the items in the relocation table and adjusts all segment addresses specified by the relocation pointers. For each pointer in the relocation table, MS-DOS finds the corresponding relocatable-segment address in the program image and adds the start-segment address to it. Once adjusted, the segment address points to the segments in memory where the program's code and data are loaded.
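The relocation step described in the excerpt might look something like this in C. Treat it as a sketch under assumptions: the program image is already in a byte buffer, each relocation pointer is an offset/segment pair measured from the start of that image, and the names exe_reloc and relocate_image are invented for this example rather than taken from MS-DOS.

```c
#include <stddef.h>
#include <stdint.h>

/* One relocation pointer: offset and segment, relative to the image start. */
typedef struct {
    uint16_t offset;
    uint16_t segment;
} exe_reloc;

/*
 * Add start_seg (the segment at which the image was loaded) to every
 * segment word named by the relocation table. Returns 0 on success, or
 * -1 if an entry points outside the image, in which case the image may
 * be left partially patched.
 */
int relocate_image(uint8_t *image, size_t image_len,
                   const exe_reloc *table, size_t count,
                   uint16_t start_seg)
{
    size_t i;

    for (i = 0; i < count; i++) {
        /* Position of the 16-bit segment word inside the image. */
        uint32_t pos = ((uint32_t)table[i].segment << 4) + table[i].offset;
        uint16_t word;

        if ((size_t)pos + 2 > image_len)
            return -1;

        /* The word is stored little-endian; patch it in place. */
        word = (uint16_t)(image[pos] | (image[pos + 1] << 8));
        word = (uint16_t)(word + start_seg);
        image[pos] = (uint8_t)(word & 0xFF);
        image[pos + 1] = (uint8_t)(word >> 8);
    }
    return 0;
}
```

On the real target the image would sit at physical address start_seg * 16, so the buffer index used here corresponds to an offset from that address.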

The preceding excerpt refers to a memory structure that contains the contents of the .EXE file header. In C structure notation the header looks like this: (in this case, int is 16-bits)
struct EXEHEADER {
    int exSignature;   /* .EXE signature = 0x5a4d */
    int exExtraBytes;  /* number of bytes used in last (partial) page of file */
    int exPages;       /* size of file in 512-byte pages */
    int exRelocItems;  /* number of pointers in relocation table */
    int exHeaderSize;  /* header size in paragraphs (16-bytes/paragraph) */
    int exMinAlloc;    /* minimum extra memory required (paragraphs) */
    int exMaxAlloc;    /* extra memory requested (paragraphs) */
    int exInitSS;      /* initial stack segment value */
    int exInitSP;      /* initial stack pointer value */
    int exCheckSum;    /* checksum, unused, usually zero */
    int exInitIP;      /* initial instruction pointer value: entry point offset */
    int exInitCS;      /* initial code segment value: entry point's segment */
    int exRelocTable;  /* byte offset to relocation table */
    int exOverlay;     /* overlay number, 0 for resident part */
};
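As a rough illustration of how a loader could use these fields to size the program image, here is a sketch that works on the raw 28 header bytes. It assumes the usual reading of the format, in which the second header word counts the bytes used in the final 512-byte page (zero meaning the last page is full); the function name exe_image_size is made up for this example.

```c
#include <stdint.h>

/* Read a little-endian 16-bit field at a byte offset in the header. */
static uint16_t hdr_le16(const uint8_t *hdr, unsigned off)
{
    return (uint16_t)(hdr[off] | (hdr[off + 1] << 8));
}

/*
 * Return the size in bytes of the program image that follows the header,
 * or -1 if the header does not look sane. hdr must hold at least the
 * first 28 bytes of the file.
 */
long exe_image_size(const uint8_t *hdr)
{
    uint16_t signature    = hdr_le16(hdr, 0);  /* 0x5a4d, i.e. "MZ" */
    uint16_t extra_bytes  = hdr_le16(hdr, 2);  /* bytes used in last page */
    uint16_t pages        = hdr_le16(hdr, 4);  /* 512-byte pages in file */
    uint16_t header_paras = hdr_le16(hdr, 8);  /* header size, paragraphs */
    uint32_t file_bytes;
    uint32_t header_bytes;

    if (signature != 0x5A4D)
        return -1;

    file_bytes = (uint32_t)pages * 512u;
    if (extra_bytes != 0) {
        if (pages == 0 || extra_bytes > 512)
            return -1;
        /* The last page is only partly used; drop the unused tail. */
        file_bytes -= 512u - extra_bytes;
    }

    header_bytes = (uint32_t)header_paras * 16u;
    if (file_bytes < header_bytes)
        return -1;

    return (long)(file_bytes - header_bytes);
}
```

Reading the fields byte by byte avoids relying on the compiler's int size or struct padding, which matters if the loader is built with something other than a 16-bit compiler.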
DISCLAIMER: If the OS you are using uses a different executable file format, all of the above is inaccurate, but may be helpful for purely theoretical purposes, since anything running in x86 real mode needs to be able to solve this bit of kludgery.

-- Jeff Dutky