You have me up to the last one. Parsers are not interpreters. If you mean that you have a bunch of scripts to be interpreted concurrently, then the interpreter, not the parser, needs to be reentrant. If I've understood you correctly, this problem is solved fairly easily without requiring reentrant code on the parser side: the parser has an input queue containing bundles of raw source code. The parser does its trick, then spits the resulting parse trees out to an output queue where a bunch of worker threads are waiting to handle the interpreting. Considering that parsing is an inherently IO-bound problem, I seriously doubt that you'll see enough of a performance difference to warrant the misery of trying to deal with the nasty mess that bison spits out when you ask it to use its reentrant skeleton. Of course, I may well be misunderstanding you.
I learned the hard way (again, in my compiler class) that this is infeasible for a truly compiled language, for a couple of simple reasons:
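For what it's worth, here is a rough sketch of that queue arrangement in Python. Everything in it is an assumption made for illustration: the names (`parser_stage`, `interpreter_worker`, `run_pipeline`), the worker count, and the use of Python's own `ast.parse` as a stand-in for whatever parser you actually have. The point is only that one non-reentrant parser feeds a pool of interpreter threads.

```python
import ast
import queue
import threading

NUM_WORKERS = 3   # assumed pool size, pick whatever suits your load
STOP = object()   # sentinel telling a thread to shut down


def parser_stage(source_q, tree_q):
    """The single parser: drains raw source bundles, emits parse trees."""
    while True:
        item = source_q.get()
        if item is STOP:
            for _ in range(NUM_WORKERS):
                tree_q.put(STOP)  # one sentinel per worker
            return
        name, source = item
        # ast.parse stands in for the real (non-reentrant) parser.
        tree_q.put((name, ast.parse(source, mode="eval")))


def interpreter_worker(tree_q, results, lock):
    """Worker thread: interprets trees that have already been parsed."""
    while True:
        item = tree_q.get()
        if item is STOP:
            return
        name, tree = item
        value = eval(compile(tree, name, "eval"), {"__builtins__": {}})
        with lock:
            results[name] = value


def run_pipeline(scripts):
    """Push {name: source} through the parser and the worker pool."""
    source_q, tree_q = queue.Queue(), queue.Queue()
    results, lock = {}, threading.Lock()
    threads = [threading.Thread(target=parser_stage, args=(source_q, tree_q))]
    threads += [
        threading.Thread(target=interpreter_worker, args=(tree_q, results, lock))
        for _ in range(NUM_WORKERS)
    ]
    for t in threads:
        t.start()
    for item in scripts.items():
        source_q.put(item)
    source_q.put(STOP)
    for t in threads:
        t.join()
    return results
```

Calling `run_pipeline({"a": "1 + 2", "b": "6 * 7"})` hands back a dict of results keyed by script name. Only the interpreter side ever runs concurrently; the parser never sees more than one input at a time.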
1) Which symbols get stored in your symbol table? Most people don't export the entire symbol table -- just the global symbols (which usually means only global variables shared across translation units and function declarations). And even then, you aren't guaranteed to get the global symbols -- if you've linked statically or run strip over your executable to make it smaller, you're out of luck.
2) Loops? Assembly-like languages don't have loops; they have jumps and logical tests. So how exactly are you planning on matching loop structures, given that they all look essentially identical in assembly?
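To make the loop point concrete, here is a toy sketch in Python. The instruction set (`set`, `add`, `inc`, `jge`, `jmp`) and the two lowering functions are made up for illustration and do not correspond to any real compiler or CPU. It shows a `while` loop and a counted `for` loop being lowered to exactly the same jump-and-test sequence, which is why the original construct can't be told apart afterwards.

```python
def lower_while(var, limit, body):
    """Lower `var = 0; while var < limit: body; var += 1` to jumps."""
    exit_pc = 4 + len(body)           # first address past the loop
    return [
        ("set", var, 0),
        ("jge", var, limit, exit_pc),  # leave once var >= limit
        *body,
        ("inc", var),
        ("jmp", 1),                    # back to the test
    ]


def lower_for(var, limit, body):
    """Lower `for var in range(limit): body` to jumps."""
    program = [("set", var, 0)]
    test_pc = len(program)
    exit_pc = test_pc + 1 + len(body) + 2
    program.append(("jge", var, limit, exit_pc))
    program.extend(body)
    program.append(("inc", var))
    program.append(("jmp", test_pc))
    return program


def run(program):
    """Tiny interpreter for the toy instruction set; returns the registers."""
    regs, pc = {}, 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "set":
            regs[args[0]] = args[1]
        elif op == "inc":
            regs[args[0]] += 1
        elif op == "add":                       # dest += src
            regs[args[0]] = regs.get(args[0], 0) + regs[args[1]]
        elif op == "jge" and regs[args[0]] >= args[1]:
            pc = args[2]
            continue
        elif op == "jmp":
            pc = args[0]
            continue
        pc += 1
    return regs


body = [("add", "total", "i")]
same = lower_while("i", 4, body) == lower_for("i", 4, body)
```

Here `same` comes out true: once lowered, nothing in the instruction stream records which kind of loop the programmer wrote.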
There are other, equally hairy problems to be dealt with. If you're talking about an interpreted language, you may have more hope.
The best you can really hope for is to grab a disassembler and dust off your assembly skillz.
Bison has been made to be reentrant (using the hairy skeleton) -- but this is not something for the faint of heart. Just a technical point ;-) As for ANTLR, I believe that it is by design all nice and encapsulated & reentrant and stuff. Although, I really do have to wonder -- who in their right mind would want a multi-threaded parser? Parsing is an inherently sequential process, so unless you're having to parse multiple streams simultaneously, I don't know why this would be of any use to you. Even then, perhaps you should rethink your design -- like doing batch processing of your input and then dispatching a thread to work on the already parsed input.
You've got a couple of choices -- finding yourself a good regular expression library seems like a good start ;-) If you're looking to do something a little more interesting than just lexical analysis, check out the red dragon book (formally known as Compilers: Principles, Techniques, and Tools by Aho, Sethi & Ullman). I used it in my compiler course and I can tell you that they hit all the various parsing techniques (recursive descent, LR, LALR, SLR, etc.) very well, along with some other stuff. They concentrate on Lex/Yacc as tools -- you may prefer to check out ANTLR, Terence Parr's parser generator. It can be targeted at a bunch of languages and can also produce tree walkers for when it comes time to use your parsed data.
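If the regular expression route is all you need, a lexer can be as small as the following Python sketch. The token names and patterns (`NUMBER`, `IDENT`, `OP`) are assumptions picked for the example, not anything from the dragon book or from Lex; the standard `re` module does the actual work.

```python
import re

# Assumed token set for a small expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=()]"),
    ("SKIP", r"\s+"),       # whitespace is matched, then discarded
    ("MISMATCH", r"."),     # anything else is an error
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Split text into (kind, lexeme) pairs using one combined regex."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at {match.start()}"
            )
        tokens.append((kind, match.group()))
    return tokens
```

Feeding it `"x = 3.5 * (y + 2)"` yields a flat list of tagged tokens. That is as far as regular expressions take you; nesting and precedence are the parser's job.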
However, in the case of Free/free software, my wallet is not the victim. Think about it: almost every other industry is held to certain minimum standards. Civil engineers are required to build bridges that don't crumble or they have hell to pay otherwise. Even car manufacturers are required to meet minimum quality and safety requirements, even though they've got you by the short and curlies for as long as you plan on maintaining your car. Asking for similar accountability in software is not unreasonable, especially given the sums of money involved.
Unfortunately, that is not the case in the US. The UK still believes in niceties such as basic consumer protection and the principle of fair use as established by the Statute of Anne and is willing to protect its citizens' inviolate rights. Unfortunately, the US government, for whatever reasons, has decided to whore out the rights of its citizens in exchange for increased corporate revenues and, as a result, increased taxes (not to mention all those nice PAC campaign contributions). Hate to break it to you, but the folks in the US are phuqed. The DMCA et al have pretty much given big companies carte blanche to milk consumers for all they're worth.
And the computer industry in general has demonstrated that the concept of ethics no longer applies when there is money at stake. Read the average EULA: you have to surrender fundamental rights, such as fair use. Worse than that, the developers generally absolve themselves of any responsibility or liability whatsoever -- they won't even guarantee that the software that you have just bought will do what they claim it does!
What we're seeing is the culmination of an unfortunate trend. The creators of a piece of software, for as long as they control it, have a monopoly -- anyone committed to using their product is pretty much at their mercy. And that means money -- lots of money.
I've had an opportunity to help develop quite a few RDBMS-based applications for various customers, ranging from the backbone for a distributed chat system to NSI's BARS (Billing And Receivables System), and every single one of them was implemented using Oracle. Why? Our customers run systems that see millions of transactions in a day -- something that most other RDBMSs out there simply can't handle. Oracle is reliable and is a hell of a lot more scalable than SQL Server, but that's beside the point: our customers use Oracle because they trust it. Would you implement a billing system that doesn't implement some kind of transactional integrity? Most sane people would answer no, because they understand that 'low cost' solutions are not necessarily inexpensive.
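In case "transactional integrity" sounds abstract, here is a minimal sketch of what it buys a billing system. It uses Python's bundled `sqlite3` module purely as a stand-in (this is not Oracle, and nothing here comes from BARS); the `accounts` table, the names, and the `post_payment` helper are all hypothetical.

```python
import sqlite3


def post_payment(conn, payer, payee, amount):
    """Credit the payee and debit the payer as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if anything raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, payee),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, payer),
            )
    except sqlite3.IntegrityError:
        return False  # the debit would overdraw, so the credit was undone too
    return True


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )
    conn.commit()
    first = post_payment(conn, "alice", "bob", 60)   # fits within the balance
    second = post_payment(conn, "alice", "bob", 60)  # would overdraw alice
    balances = dict(conn.execute("SELECT name, balance FROM accounts"))
    conn.close()
    return first, second, balances
```

The second payment fails halfway through, after bob has already been credited, and the rollback takes that credit back. Without the transaction you would have just minted money.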
If you're going to insist on academics, you have before you the perfect opportunity to mold a good human being out of this child. Java, ASM and all that bullshit can wait -- hell, if he's half as smart as you say he is, he'll pick it up in about a month. What he needs to learn is what has been lacking in his education, which I am willing to wager a significant amount is lacking severely in art, humanities, music, philosophy etc. Challenge him and give him an opportunity to see the beauty in Bach and Debussy; show him that Frost and Lorca defy the rational quantization that has been so firmly drilled into him. Let him have his breath stolen by the sheer grandeur of a Bierstadt or the aching power of a Matisse. In short: let him live, because God knows that Java and assembly and the rest of the technological litany people like to spout as if it were some holy ward will still be there when he is ready for them.
At the APCS level, most kids should be proficient enough in C++ to attempt a simple recursive-descent compiler/interpreter. I recommend Ronald Mak's Writing Compilers and Interpreters: An Applied Approach Using C++. The book is a very gentle (some will say too gentle) introduction to writing compilers and interpreters. Ronald Mak has done a particularly good job of introducing more advanced programming techniques (inheritance, binary file manipulation, etc.) that are not covered in the APCS curriculum. The examples in the book give a strong sense of accomplishment; every chapter has at least one utility, so it is easy to see progress. Needless to say, this is not for every student, but there's something very cool about being able to say that you've written your own compiler...
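To give a feel for the scale of such a project, here is a bare-bones recursive-descent interpreter for arithmetic. It is a sketch in Python rather than C++, it is not taken from Mak's book, and the grammar and the names (`Parser`, `evaluate`) are assumptions chosen for the example. Each grammar rule becomes one method, which is the whole trick.

```python
import re


class Parser:
    """Assumed grammar:
        expr   := term (('+' | '-') term)*
        term   := factor (('*' | '/') factor)*
        factor := NUMBER | '(' expr ')' | '-' factor
    """

    def __init__(self, text):
        # Crude lexer: characters outside this pattern are silently skipped.
        self.tokens = re.findall(r"\d+(?:\.\d+)?|[-+*/()]", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        tok = self.advance()
        if tok == "-":
            return -self.factor()
        if tok == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is None or not tok[0].isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        return float(tok)


def evaluate(text):
    """Parse and evaluate in one pass; reject trailing junk."""
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

`evaluate("2 + 3 * (4 - 1)")` respects precedence and parentheses because `expr` calls `term`, which calls `factor`, which loops back to `expr`. Swap the arithmetic for code emission and you have the skeleton of a compiler.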
or any number of crud free channels... whoops -- guess i'm stealing from them 'cos i opt to watch their competition.
okay, now the rest of you can move on with your business ;-)
Their MAJC5200 processor already does SMT, although they call it something like spatial computing. Check it out here