Which is precisely why Portable.NET's C# compiler is written in C, not in C#. It is a lot harder for someone to put a trojan in - if you trust your version of gcc, and then inspect the pnet code for problems, you can trust pnet. With a C# compiler written in C#, there is no way to be 100% certain.
Languages that are implemented in themselves suffer from the "magic binary" problem. You have to trust a magic binary version of the compiler to bootstrap. Since inspecting binaries for trojans is hard, this gives unscrupulous individuals an avenue of attack.
DotGNU Portable.NET has a functional VM right now too. I haven't bothered with a JIT yet, because the primary goals are portability, stability, and completeness. I try to do things in order. A JIT would take a few weeks once the rest of the infrastructure is in place. No point doing it until the basics are there.
DotGNU Portable.NET has a fraction of classes written, but only because we have 1 very busy developer working full time on this, and my time is split across the entire project. Contributions are always welcome.
DotGNU Portable.NET has garbage collection now, using the Hans Boehm collector that is used by gcc.
DotGNU has less public relations only because the media have bought into the Mono hype and haven't bothered to talk to DotGNU to get the other side of the story. Even when we seek them out to correct blatant mistakes in their articles.
The actual count is indeed 250,000+. You forgot treecc and pnetlib, which are part of pnet, but distributed separately. You also forgot the *.tc input files for treecc, and the *.html and *.texi documentation files.
I was very careful with the count. I did a complete "make distclean" to remove automatically generated files. I removed imported packages like libffi and libgc which I didn't write. And then I counted up all the lines in all three packages.
I'm an obsessive commenter, but any good programmer should be. Comments help explain the code to yourself six months from now, when you've completely forgotten why you did something a particular way. For a multi-year project like this, obsessive comments are vital.
250,000 lines is a conservative estimate. I've probably thrown away 50,000+ lines of code in the process of building this. Some experiments just didn't work and had to be redone.
Although this can be a fairly malicious worm, it is very unlikely to infect many servers due to the fact that the majority of Microsoft SQL servers have administrator passwords.
This has to go down in the "famous last words" category. Since when have administrators of Microsoft servers ever demonstrated basic common sense?
The focus of Mono and .NET is slightly different: we are writing as much as we can in a high level language like C#, and writing reusable pieces of software out of it. Portable.NET is being written in C.
Just because a system is written in C doesn't mean that it isn't reusable. GNOME was written in C, and it is reusable, just as Miguel said in the interview. The core of Portable.NET is also very reusable.
Portable.NET has a different focus to Mono. Writing the compiler in C has two benefits: speed and bootstrapping. A well-crafted compiler in C will always be faster than one written in a garbage collected language, no matter how good the JIT is.
Bootstrapping is also easier with a C compiler: anyone with gcc can install Portable.NET and get it to run on their system. To bootstrap Mono, you have to have Microsoft's system installed.
There are many people who don't have Windows or don't want Windows. They then have to install the binary version of Mono. This introduces a security problem: you have to trust that the binary is correct, because you cannot guarantee that the published source matches the binary.
With Portable.NET, if you trust your copy of gcc, and you can't find any backdoors in the code, you can trust your copy of Portable.NET.
In reality, it comes down to preference: I prefer to write compilers in C, because I believe that is the best language for writing compilers. Miguel has a different preference.
Only the claims carry legal weight. The rest of the text is explanatory information supplied as comments to help the patent clerk and the public understand the patent.
The claims make no mention of databases, word processing, etc. Patent owners want you to believe that everything in the application has legal force. Only the claims do.
Read the claims: they cover applying automatic virus scanner and security updates on a user's computer. Everything else is McAfee's wishful thinking.
Someone else mentioned Symantec's HealthyPC, which existed 1 year prior to the filing date. This patent should have been dead on arrival at the USPTO. It's yet another example of why the patent system must be reformed ASAP.
Personally, I believe that a phone should not be made into a browser. A phone is well.... a PHONE
I'm with you on this. However, a mobile phone with full IP support would be a godsend in some areas if you could plug your laptop/PDA into it and access the Internet. Many of the objections to small screens, etc, would go away if the IP functionality was usable by other devices the user had to hand.
The mobile phone manufacturers completely missed the potential of the mobile IP market because of this. The phones are digital already, so put a USB port on it and make it useful!
One of the reasons WAP was a disaster, IMHO, was that they invented a completely new set of protocols to do things that IP/UDP/TCP could already do. Presumably the justification was that high latency, low bandwidth wireless links couldn't handle regular Internet protocols.
This was always a nonsense claim, since people were running IP over 2400 baud modems 10+ years ago, which is about as high latency, low bandwidth as you can get. IP protocol stacks typically have trouble keeping up with high bandwidth links such as fibre, not the low end.
The real reason was control: anything interoperable with the regular Internet would have been impossible to charge a premium for. This resulted in a separate WAP-Internet that didn't have the same level of content as the regular Internet. Users stayed away in droves.
Let's hope the new "wireless Internet" is based on existing standards this time, instead of something they made up out of thin air.
Taking Easter eggs, about boxes, and other forms of personal identification out of a product is just plain wrong, regardless of what it is trying to achieve.
At the end of every movie, everyone involved in the "product" gets mentioned in the credits, even the lowly payroll officer and dog minder. Is there a chance of poaching? Yes. But take one of those names off the list and watch the court cases mount up.
Everyone deserves to be honoured for their work, no matter how trivial it may be. The commercial software industry is the only Copyright-based industry that intentionally denies recognition to the authors of the works it produces. This state of affairs is a disgrace.
I can definitely relate to much of this article. Especially working longer, harder, faster for little extra recognition. Confronting management usually backfired. Health-wise, I was a mess, just like the author of this piece.
I eventually escaped, and now am blessedly self-employed. But it took me six months to deflate and become (sort of) social again.
Managers and VC-crazed CEOs have a lot to answer for over the last 5 years. Unfortunately, they seem to get away scot-free when their bad decisions hurt the employees.
There really does need to be a "tech guild" of some sort, or dare I say it "union", but I despair of ever coming up with a way to stop the union turning into the problem it is trying to solve.
"Equal work for equal pay", protectionism, and other traditional union bureaucratic nonsense is not what we need. But we definitely need the right to strike to protest irrational decision making. "You want a global e-commerce system in 3 days? Sorry, you'll have to take that nonsense up with the union".
The most important thing in elections is *trust*. The populace must trust that the outcome cannot be rigged one way or the other. If a black box is doing the counting, trust becomes harder.
In Queensland, Australia, we're going through a painful process whereby the electoral rolls were found to be corrupted by one of the political parties to skew pre-selection ballots. The public are (understandably) concerned that holding a fair election will be very difficult until the rolls are cleaned up. One can imagine the political mischief that would be possible with closed-source election software.
The software for running an election would seem to be ideal for the open source community to tackle, because its peer review process will implicitly find bugs and loopholes earlier rather than later.
The last thing we want is for a major security flaw to be found on election day, when it is too late to fix it.
Because there is an election happening somewhere in the world every other month, plenty of testing can be performed leading up to a major country election. The voting systems are different in different countries, but adding such flexibility won't be difficult. As an added bonus, poorer countries will be able to run computer-counted elections a lot cheaper than if they had to buy a commercial package.
Don't take the above as Microsoft-bashing: I believe that *no* commercial entity should be allowed to write the software. They simply don't have the peer review processes in place to ensure that the code is right for election day. We do.
As a former H1-B, who got trapped at a company too long, it's about time they improved job portability. I eventually said "stuff it" and went home to Australia.
The next thing they need to fix is the green card process. Having the employer sponsor the application is bad news. There's no incentive for the employer to fast-track the application, because they lose the person faster. There's no incentive for the employer's lawyer to fast-track the application, because they'll get paid more for delay.
The green card process should be solely between the INS and the H1-B holder. If a person can prove that someone will hire them on a H1-B, why do they need to prove again that they'll be employable with a green card?
I like the provision to limit INS stuff to six months, but you have to get it to the INS first. That's where the employer and their lawyer can slow things down. I went for 1.5 years before the damn lawyer submitted the green card paperwork, because of nit-picking on things he already had but lost.
One thing's for sure - they won't fool me a second time. The whole process will need to be fixed first, including the green card, before I set foot in the US again.
I'm pretty sure you're being sarcastic... but whatever.
Autogenerating code... anything that transforms one language into another (e.g. Eiffel to C, TeX to PostScript, etc). Since they have so many instructions (16-bit opcodes), it is conceivable that they started with a pseudo-language that describes the operations to be performed in groups, and then generates C code from the output.
e.g. There are likely to be large numbers of "move reg to reg" type instructions. Instead of coding every one, build a macro-like language that describes how one works, and then write a translator to convert to C. Sure, C's pre-processor can do most of that, but sometimes the transformation is algorithmic rather than straight text substitution.
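As a rough sketch of that kind of autogeneration: the function below emits one C `case` per register-to-register move, which could then be pasted or `#include`d into an interpreter. The opcode names (`OP_MOV_Rd_Rs`), the `r[]` register array, and `emit_move_cases` itself are all made up for illustration; nothing here is taken from the Internet C++ sources, and their real generator (if there is one) may look nothing like this.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical generator: writes C source text for every
 * "move register to register" opcode into out[].
 * Returns the number of characters written, or 0 if the
 * buffer is too small. All names are illustrative. */
size_t emit_move_cases(char *out, size_t size, int nregs)
{
    size_t used = 0;

    assert(out != NULL && size > 0);
    out[0] = '\0';

    for (int dst = 0; dst < nregs; dst++) {
        for (int src = 0; src < nregs; src++) {
            if (dst == src)
                continue;   /* moving a register to itself is a no-op */

            int n = snprintf(out + used, size - used,
                             "case OP_MOV_R%d_R%d: r[%d] = r[%d]; break;\n",
                             dst, src, dst, src);
            if (n < 0 || (size_t)n >= size - used)
                return 0;   /* buffer too small: report failure */
            used += (size_t)n;
        }
    }
    return used;
}
```

The skip for `dst == src` is the sort of small algorithmic decision that is awkward to express with straight pre-processor text substitution.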
40,000 lines. Yes Mr Wizard, it is possible to build something as heavy duty as a VM in 40,000 lines. IF YOU PICK THE RIGHT DESIGN. In fact, I've been working on VMs myself for quite some time, so I absolutely do know what I'm talking about when it comes to VM design.
Did you ever wonder where the bytecode verifier and the GC came from in Java? Really came from. Not just what Sun says. It's due to a bug in their instruction set. You can push an integer on the stack and pop it as a pointer - instant security issue. So the verifier and GC are necessary to prevent that from being a problem.
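To make the "push an integer, pop it as a pointer" hazard concrete, here is a minimal sketch of an operand stack that tags every slot and refuses the mismatched pop. This is only an illustration of the problem: it is a run-time check, not how Java's verifier works (that is a static analysis), and it is not the alternative instruction set design hinted at in this comment. The names `Stack`, `push_int`, `push_ptr` and `pop_ptr` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative tagged operand stack: each slot remembers
 * whether it holds an integer or a pointer. */
typedef enum { TAG_INT, TAG_PTR } Tag;

typedef struct {
    Tag tag;
    union { int32_t i; void *p; } v;
} Slot;

#define STACK_MAX 64

typedef struct {
    Slot   slots[STACK_MAX];
    size_t top;               /* number of slots in use */
} Stack;

void stack_init(Stack *s)
{
    assert(s != NULL);
    s->top = 0;
}

/* Both push functions return 1 on success, 0 on overflow. */
int push_int(Stack *s, int32_t value)
{
    if (s->top >= STACK_MAX)
        return 0;
    s->slots[s->top].tag = TAG_INT;
    s->slots[s->top].v.i = value;
    s->top++;
    return 1;
}

int push_ptr(Stack *s, void *p)
{
    if (s->top >= STACK_MAX)
        return 0;
    s->slots[s->top].tag = TAG_PTR;
    s->slots[s->top].v.p = p;
    s->top++;
    return 1;
}

/* Pops a pointer. Returns 0 and leaves the stack and *out
 * untouched if the stack is empty or the top slot is not a
 * pointer, so an integer can never be reinterpreted as one. */
int pop_ptr(Stack *s, void **out)
{
    if (s->top == 0 || s->slots[s->top - 1].tag != TAG_PTR)
        return 0;
    s->top--;
    *out = s->slots[s->top].v.p;
    return 1;
}
```

With an untagged stack the equivalent of `push_int(0x1234)` followed by `pop_ptr` would hand the program a forged address; here the pop simply fails.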
But, here's the kicker: what if you could design an instruction set that didn't have that bug? No verifier. No GC. No type checking. No complicated reflection guff. No single-language dependencies. But still 100% secure, and less than 40,000 lines!!
Sun made a mistake, which Microsoft copied blindly. But they didn't realize it was a mistake until it was too late to fix it, so they made up various marketing reasons for why "C++ can't possibly be secure because we were too dumb to figure it out".
The Internet C++ VM is what I normally describe as "Bob's First VM" or "Graduate Student VM" - obvious opportunities for optimization being missed (switch vs dispatch functions), and code that is pretty much impossible to read.
Have a nice day, and pick on someone clueless next time, instead of someone who's actually built a VM from the ground up.
His heart is in the right place, but the implementation leaves a lot to
be desired. 200000+ lines of code in one source file! Yikes! It looks
like some of it may have been auto-generated. If so, then the starting
files for the autogen process should have been distributed - not the output.
From my shallow understanding of the code (I've spent 30 mins on it so far),
here are the problems I see:
1. Not thread-safe. The VM register state is kept in global variables.
2. If this is their "fast" VM, then I'd hate to see their "slow" one.
Dispatch functions are used for opcodes. i.e. every instruction
involves a table lookup and a function call overhead. Tip for
VM writers: use a switch statement! All of the VM state
can be kept in local variables, which C compilers can optimize
very heavily. If you split it across multiple functions, then
you are defeating the C compiler's optimization of the interpreter!
3. Doesn't appear to be pointer safe, so this cannot be used for
"download and run" applications, which destroys its usefulness
as a real Java replacement. (Yes kids, C can be made pointer
safe - see EiC - it's a question of how much overhead you want
to tolerate).
4. X11 access appears to be more or less raw to the X server. This
would allow lots of fun and games to be played: full X11 gives
you the ability to mess with other applications' windows, inspect
the clipboard, the user's resource settings, and generally get up
to insecure mischief.
5. No instruction set documentation so that a better VM could be
written by someone knowledgeable about such matters.
6. Too much code. Another tip for VM designers: if you get 10,000
lines of code into a VM, and you've barely scratched the surface,
then your design is wrong and you should start again. The target
size for VMs should be around 40,000 lines max. Both Java and C#
get this wrong, BTW.
Nice try, but we'll have to wait for someone else to come up with a
better VM design before we can use C/C++ on the Internet. It's possible,
but not this way.
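For anyone wondering what the switch-based interpreter from point 2 looks like in practice, here is a minimal sketch. The four-opcode stack machine is invented purely for illustration and has nothing to do with the Internet C++ instruction set. The point is that `pc`, `sp` and the stack are all locals of one function, so the C compiler is free to keep them in registers across instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for a tiny stack machine. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define VM_STACK_MAX 32

/* Runs the program and returns the value on top of the stack
 * at OP_HALT (or at end of code). Returns 0 on any malformed
 * program: bad opcode, stack overflow or underflow. */
int32_t run(const uint8_t *code, size_t len)
{
    int32_t stack[VM_STACK_MAX];  /* all VM state is local */
    size_t  sp = 0;               /* number of values on the stack */
    size_t  pc = 0;               /* index of the next instruction */

    assert(code != NULL);

    while (pc < len) {
        switch (code[pc++]) {
        case OP_PUSH:             /* operand: one signed byte */
            if (pc >= len || sp >= VM_STACK_MAX)
                return 0;
            stack[sp++] = (int8_t)code[pc++];
            break;
        case OP_ADD:
            if (sp < 2)
                return 0;
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            if (sp < 2)
                return 0;
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            return sp > 0 ? stack[sp - 1] : 0;
        default:
            return 0;             /* unknown opcode */
        }
    }
    return sp > 0 ? stack[sp - 1] : 0;
}
```

Feeding it the byte sequence `OP_PUSH 2, OP_PUSH 3, OP_ADD, OP_PUSH 4, OP_MUL, OP_HALT` computes (2 + 3) * 4. A dispatch-function design would need the same state in globals or behind a pointer passed to every handler, which is where the extra overhead comes from.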
The DotGNU Project is building a Free (capital F) CLR, based on the Portable.NET code.
Mono is not the only game in town.
http://www.dotgnu.org/
http://www.southern-storm.com.au/portable_net.html
Only because Microsoft completely changed the metadata format between Beta 1 and Beta 2, for no discernable good reason.
Rhys Weatherley - author of Portable.NET
http://www.southern-storm.com.au/portable_net.html
Man, all this time I thought volume 4 was an urban legend.
Hmmm - on second thoughts - maybe you weren't being sarcastic. Apologies if so. Need more caffeine to jump-start the brain.