Domain: gamers.org
Stories and comments across the archive that link to gamers.org.
Comments · 56
-
Compilers don't write better code than humans
I have heard the claim that a compiler can generate better code than a programmer a thousand times, but repeating it does not make it any more true: it is false, at least until compilers are able to understand the program. In every program there are things the compiler simply doesn't know. For example, in x86 assembler it is possible to save some cycles (and memory accesses) by using 16-bit or even 8-bit registers instead of 32-bit registers. You can double the number of registers by doing this. To use these tricks you need to be sure that the values in the registers are below 65536 or 256, and the programmer can know this, but the compiler can't. A really advanced compiler might profile the possible value range (I have never heard of any compiler actually doing this), but it cannot be sure, so it must at least check the values first in case they are out of range.
As long as the programmer has more knowledge than the compiler, he will always find tricks to save an instruction here or there and outperform the compiler this way. You can find a great example of such programming tricks in the PCGPE's article about texture mapping inner loops here.
-
Re:Is the Metaverse nearing practicality?
Basically, for the Metaverse to work, you need a massive, distributed, dynamically load-balanced database. You need near-zero latency between servers to handle synchronization. You need to be able to have servers dynamically hand off clients to one another without the user being able to perceive it happening. You need to be able to support the one guy wandering off by him/herself in the "frontiers" of the metaverse.
I may be over-simplifying things here, but I don't see why this should be such a problematic issue.
If you want to build a realistic universe, you need to provide a way for the players to move freely from one area to another without even noticing that the two areas of the universe are handled by different servers.
You could of course include teleporters, subways, or special doors to move the player from one server to another (IIRC, that idea was mentioned by John Carmack and also by John Romero in the pre-Quake days). But that would not really give the feeling of a single large universe because switching from one server to another would still require a specific action. So the areas would only be semi-connected.
Note that even this simple scheme contains some interesting problems:
- Server A must check that server B is on-line before attempting to transfer the player. Server B must be reachable by server A but also by the user.
- Server A must also check that server B is ready to accept a new player. What if B is full? Will the player be stuck? Will a new server be spawned automatically?
- The transfer must be performed as an atomic operation, so that the player does not get lost or duplicated if there is a failure in one server, in the client or in the network.
- The transfer should also be performed in a somewhat secure way (depending on the application, you may want to prevent spoofing, replay attacks, etc.)
- The servers must trust each other to some extent, and maybe trust the client too (depending on the application). For example, if you use the metaverse concept for a game, you do not want someone to insert a new server that will modify some players and send them back into the game with unfair advantages or disadvantages.
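As an illustration of the first three points above (availability check, capacity check, atomic transfer), here is a minimal in-process sketch of a two-phase handoff. The `Server` class, its fields and the `transfer` function are hypothetical names invented for this example; a real implementation would run over a network and would also need a durable log, timeouts and the authentication mentioned in the last two points.

```python
class Server:
    """Hypothetical world server that owns a set of players."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.online = True
        self.players = {}    # player_id -> state, players owned by this server
        self.reserved = {}   # player_id -> state, incoming transfers not yet committed

    def prepare(self, player_id, state):
        """Phase 1: reserve a slot. Refuse if the server is offline or full."""
        if not self.online:
            return False
        if len(self.players) + len(self.reserved) >= self.capacity:
            return False
        self.reserved[player_id] = dict(state)
        return True

    def commit(self, player_id):
        """Phase 2: the reserved player becomes owned by this server."""
        self.players[player_id] = self.reserved.pop(player_id)

    def abort(self, player_id):
        """Drop a reservation that will not be committed."""
        self.reserved.pop(player_id, None)


def transfer(src, dst, player_id):
    """Move a player from src to dst and return True on success.

    On failure the player stays on src, so the player is never lost.
    """
    state = src.players[player_id]
    if not dst.prepare(player_id, state):
        return False
    try:
        dst.commit(player_id)
    except Exception:
        dst.abort(player_id)
        return False
    # Only remove the player from the source once the destination owns it.
    del src.players[player_id]
    return True
```

If B is full or offline, `transfer` returns False and the caller has to decide what to do next (retry, spawn a new server, or keep the player where it is).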
But the real challenge is to implement a seamless world, in which people can move around as if everything was part of a single huge map. The players should not be able to see that they are moving to a different server. In addition to the problems mentioned above, you get a lot of problems in the "frontiers" of the metaverse, as mentioned by the original poster.
For example, how will you be able to see an area that is handled by a different server than the one you are connected to? One solution would be to replicate (cache) the visible parts of the "foreign" areas on each server, but that would not work for players or any other moving objects. So the servers must exchange some information whenever something changes near the frontiers. But there are some latency problems. If you are familiar with the QuakeWorld/QuakeII/QuakeIII problems regarding lag and movement prediction, you can imagine what will happen if more than one server is involved.
Anyway, I have been playing around with these ideas for a while and I think that I have found some solutions for building a fully distributed world (taking input from Quake, CrossFire and my personal experience about building distributed and redundant systems). Maybe I will write them down someday, and maybe even build a library that implements the necessary network protocols. Maybe...
-
Your Development environment
Hi,
Back in the day, you used to do some development under NEXTSTEP. I think QuakeEd was the last app you wrote before moving to Windows/OpenGL. You said that you were moving because NEXTSTEP's Display PostScript system was not a good fit for your apps. That was about three years ago, and since then, Apple bought NeXT and will be releasing a new operating system, MacOS X, with updated (Openstep/Cocoa) development tools, and a new windowing system, referred to as Quartz, that allows for hardware acceleration with support for OpenGL. Have you considered moving to a MacOS X development environment?
-
Just like science
Reverse engineering is just like science. You pose hypotheses about the system you are reverse engineering, then you find ways to test those hypotheses.
Like Raphael, I have been involved in reverse engineering a number of fun systems -- Quake network protocols, Quake map formats, OpenGL programs, LEGO Mindstorms. All of these systems required the same general strategy but different tools and background knowledge.
My experience has been that the hardest step of reverse engineering a system is getting started. You typically find yourself needing some tool to analyze a system that you just don't have.
For the Quake network protocol, that tool was a UDP proxy that dumped data in a format I could understand. For OpenGL programs, getting a tracing infrastructure set up was required before meaningful analysis of how programs use OpenGL could proceed.
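To give an idea of what such a UDP proxy can look like, here is a minimal sketch. It is not the tool described above: the function names, the single-client assumption and the dump format are all choices made for this example.

```python
import socket


def hexdump(data, width=16):
    """Format bytes as offset, hex columns and printable ASCII."""
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk).ljust(width * 3 - 1)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{off:04x}  {hexpart}  {text}")
    return "\n".join(lines)


def run_proxy(listen_port, server_host, server_port):
    """Relay datagrams between one client and one server, dumping each one.

    Assumption: a single client; the last non-server sender is treated
    as the client that replies should go to.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", listen_port))
    server = (socket.gethostbyname(server_host), server_port)
    client = None
    while True:
        data, addr = sock.recvfrom(65535)
        if addr == server:
            direction, dest = "S->C", client
        else:
            client = addr
            direction, dest = "C->S", server
        print(f"{direction} {len(data)} bytes")
        print(hexdump(data))
        if dest is not None:
            sock.sendto(data, dest)
```

You would point the game client at the proxy's port instead of the real server and read the dump as the packets go by.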
For LEGO Mindstorms, the hardest part would have been figuring out the baud and bit encoding of a serial stream, since I didn't have easy access to an oscilloscope at the time, and I do not like trial and error when something unrelated -- like my serial port setup -- could go wrong. However, somebody had figured out the serial encoding already, and the starting hump ended up being obtaining a serial line data analyzer. (I ended up using an SGI Indy as a serial proxy.) Later Mindstorms reverse engineering required a disassembler/assembler/compiler tool suite.
Quake map files were easy; the tools were a hexdump program, a program to factor numbers to find strides, an HP calculator, and some programs to convert number formats.
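The "factor numbers to find strides" step can be sketched as follows: if a chunk of the file is an array of fixed-size records, the record size must divide the chunk length evenly. The function name and the size limits are assumptions made for this example.

```python
def candidate_strides(length, min_size=2, max_size=256):
    """Return every record size in [min_size, max_size] that divides length."""
    upper = min(max_size, length)
    return [n for n in range(min_size, upper + 1) if length % n == 0]
```

For an 84-byte chunk this leaves only a handful of record sizes to try (2, 3, 4, 6, 7, 12, ...), which is a lot less guessing than trying every size.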
The second part of reverse engineering something is finding useful ways to sort through the data that gets collected or generated. A lot of times I found that this boiled down to writing a program to analyze and print out the data, which I could then look over and study.
For example, the Quake 2 network protocol included some compressed information whose presence or absence was indicated by a bit vector. To figure out which bits mapped to which data, I used a program that tabulated and printed out the data in a really wide format; I then looked for patterns in the compressed data across many, many packets. By lining up columns of numbers that were clearly the same data, it was possible to infer which bits mapped to that data. Kind of like playing a really long game of Mastermind where somebody else gets to choose most of the guesses.
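Here is a rough sketch of that kind of tabulation. The packet layout is made up for the example (one mask byte followed by the optional fields); it is not the real Quake 2 format, and both function names are hypothetical.

```python
def tabulate(packets, mask_bits=8):
    """One row per packet: mask bits, payload length, payload bytes in hex."""
    rows = []
    for pkt in packets:
        mask, payload = pkt[0], pkt[1:]
        bits = format(mask, f"0{mask_bits}b")
        rows.append(f"{bits}  {len(payload):3d}  {payload.hex(' ')}")
    return rows


def infer_field_sizes(packets):
    """Guess how many bytes each mask bit contributes.

    Compares payload lengths of packets whose masks differ in exactly one
    bit. Assumes every field has a fixed size, which may not hold for a
    real protocol.
    """
    length_by_mask = {pkt[0]: len(pkt) - 1 for pkt in packets}
    sizes = {}
    for mask, length in length_by_mask.items():
        for bit in range(8):
            flag = 1 << bit
            if mask & flag and (mask ^ flag) in length_by_mask:
                sizes[bit] = length - length_by_mask[mask ^ flag]
    return sizes
```

Printing the rows sorted by mask makes the columns line up, and the size guesses tell you where to cut the payload before staring at the numbers.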
For Quake map files, after figuring out the basic layout of the records in the file (which hasn't changed much from version to version), the important part was figuring out the meaning of all the data. Early on, a useful tool was one that started at a given offset and printed out the range of numbers located at a particular stride from the starting point; this helped associate records of different types to one another. Later, the most useful tool by far for analyzing Quake map files was a level renderer used to verify the meaning of the map data. Related tools verified not only the meaning of certain data structures, but also high-level aspects of the algorithms that used these data structures, e.g. collision detection.
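The offset-and-stride tool mentioned above can be sketched like this. The function name and the default field type (little-endian signed 16-bit) are assumptions for the example, not a description of the original tool.

```python
import struct


def stride_range(data, offset, stride, fmt="<h"):
    """Return (min, max) of the values found every `stride` bytes from `offset`.

    fmt is a struct format for a single value, e.g. "<h" or "<I".
    """
    size = struct.calcsize(fmt)
    values = [
        struct.unpack_from(fmt, data, pos)[0]
        for pos in range(offset, len(data) - size + 1, stride)
    ]
    return min(values), max(values)
```

If a field's range turns out to be 0 to N-1 and some other table has exactly N records, that field is very likely an index into that table.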
A single-stepping, single-buffered OpenGL trace player helps enormously when trying to figure out what algorithms an OpenGL program uses.
In any event, along with these common aspects of reverse engineering (getting started, developing the right tools), the general strategy of posing hypotheses and testing them holds throughout. Once you think you have figured out something new, you need to come up with a way of testing and verifying (or rejecting) the new idea. Unverified knowledge is just a guess; it's not really valid until you have confirmed it with at least one test. The more independent tests the better, as this leads to more confidence in both the new and the established knowledge. Hacking is of the essence here; the faster you can test an idea, the faster you can move on to testing new ones. Not only that, but the results of testing one new idea often open up more questions and lead to further progress, at least early on.
This is just like science. The only difference is that when you are reverse engineering something, presumably the underlying mechanisms are already known by others -- the original engineers.
Since the original poster was interested in graphics, I will add that for OpenGL programs, I use a "DLL proxy" replacement for SGI's OpenGL Stream Codec based on ideas from a program called gltrace. The proxy dumps a trace of OpenGL/GLX/WGL calls that can later be replayed, single-stepped, run through a simulator, etc.
-Kekoa
-
Use trial and error, compare input and output.
There are at least two things that you can do when attempting to reverse engineer a piece of software. The first one (not legal in several countries) is to decompile the code: take a debugger or decompiler and check what instructions are executed. The second one (legal in most countries) is the "blackbox" approach: consider the software as something that produces some output(s) depending on its input(s), and try to guess what is inside.
This second approach is the "real" reverse engineering. By carefully crafting some inputs and observing the outputs, you can often draw some conclusions about how the software behaves. With some patience and a lot of trial and error on simple inputs, you can find some patterns in the software: stuff that does not change, stuff that changes depending only on one of the inputs, and so on.
In the good old days (well, five years ago), I was the author of DEU (Doom Editing Utilities), the first program that was able to create new levels for Doom. I also contributed to Matt Fell's Unofficial Doom Specs and Olivier Montanuy's Unofficial Quake Specs, the documents that describe the WAD and PAK file formats and other internal details about Doom and Quake. Almost everything in the Unofficial Doom Specs was gathered by reverse-engineering. It was only later (with the release of Doom II) that id Software released some information to the community, presumably after they saw that editing Doom levels was a very popular activity. I am grateful for id Software's support of the editing community in their later games, but the first information about Doom had to be found the hard way.
Most of my efforts in decoding Doom's WAD file format (and later Quake's PAK file format) involved a hex editor for viewing and editing the raw files, and custom tools that I built along the way for making editing easier (or tools that I received from other people, like DEU 3.0 from Brendon Wyber). A key thing is also to share as much information as possible with other people who are progressing on the same front, because you often get more in return than what you found by yourself. For WAD files, it was easy to find that the file was organized a bit like a tar archive: a header, a directory containing names of objects and offsets within the file, and the data for the objects. Then the trial and error starts: try to guess what an object might be, modify a few bytes, run the game and see what happens. If your changes produced something useful, write it down and share the info with others. If the game crashed, try again. Repeat until you have understood everything.
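For reference, the header-plus-directory layout can be read with a few lines of code. This sketch follows the layout documented in the Unofficial Doom Specs (a 12-byte header, then 16-byte directory entries); the function name is made up for the example and the error handling is minimal.

```python
import struct


def read_wad_directory(data):
    """Return a list of (name, offset, size) for each object in a WAD image."""
    # Header: 4-byte identifier, number of entries, offset of the directory.
    ident, num_lumps, dir_offset = struct.unpack_from("<4sii", data, 0)
    if ident not in (b"IWAD", b"PWAD"):
        raise ValueError("not a WAD file")
    entries = []
    for i in range(num_lumps):
        # Directory entry: data offset, data size, 8-byte NUL-padded name.
        filepos, size, raw_name = struct.unpack_from(
            "<ii8s", data, dir_offset + 16 * i
        )
        name = raw_name.split(b"\0", 1)[0].decode("ascii", errors="replace")
        entries.append((name, filepos, size))
    return entries
```

With the directory in hand, the "modify a few bytes and see what happens" loop becomes much easier, because you know exactly where each object starts and ends.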
Sometimes, you will find data structures that you do not understand. That was the case for Doom's NODES, SEGS and SSECTORS data. If you share enough information with others, maybe someone will have an idea and find that the data structures are related to something that they know. This is exactly what happened for Doom: Alistair Brown and a group of students from Bradford suggested that the unknown data might be a BSP tree. After reading some papers on that topic (I didn't know anything about BSP trees), I was able to implement a first BSP builder in DEU. And then it became possible to create brand new levels for Doom, instead of only changing the textures and location of the monsters as we did in the first few months. Releasing the source code for the tools has probably helped a lot. Other people were able to create their own tools based on that, and then the next reverse-engineering steps became much easier when the other games based on the same engine were released (Doom II, Heretic, Hexen, Strife, ...).
Ah well... The good old times... Sigh!