GCC Compiler Finally Supplanted by PCC?
Sunnz writes "The leaner, lighter, faster, and most importantly, BSD Licensed, Compiler PCC has been imported into OpenBSD's CVS and NetBSD's pkgsrc. The compiler is based on the original Portable C Compiler by S. C. Johnson, written in the late 70's. Even though much of the compiler has been rewritten, some of the basics still remain. It is currently not bug-free, but it compiles on the x86 platform, and work is being done on it to take on GCC's job."
I notice that TFS doesn't say that anyone is actually able to compile anything (other than PCC) with it. The BSD folks would love to have a BSD-licensed drop-in replacement for GCC; but it doesn't sound like this is it. Not yet at least.
Wake me up when you're able to use PCC instead of GCC to do a 'make world' (or ./build.sh or whatever).
Kind of depends on who you ask, doesn't it?
OK, so it compiles C on x86. What do I use when I want to compile Objective-C on my microwave?
Someone relicense it under the GPL!
"NetBSD's" pkgsrc is really everyone's pkgsrc. Try it on what you're running right now.
It's my primary package manager on Interix, Mac OS X, Linux, and NetBSD.
GCC Compiler Finally Supplanted by PCC?
No. Next question.
CDE open sourced! https://sourceforge.net/projects/cdesktopenv/
I really don't see any point in implementing a new C compiler under the BSD license. There's no reason to duplicate effort: it's not like the compiled binaries would be under the GPL. And any GPL libraries you link to, you wouldn't need to distribute (thus avoiding the GPL). So, really, there's no point in duplicating effort on a BSD licensed compiler. Correct me if I'm wrong.
Seriously. Let's duplicate the wheel twice: once for GPL, once for BSD, and then bicker amongst ourselves. Stuff like this stands in the way of actual progress being made. Neither side is right, I don't have a solution, but this is just dumb.
twitter.com/gravitronic
call me when pcc does something useful, like, say, working.
Not really; you already have the Sun and Intel compilers for Linux (I've been told that the Intel compiler has even been tweaked so you can build a bzImage with it).
But you're still stuck with using glibc if you want to be able to compile anything. You do have different libcs floating around, uclibc, etc; but they're all gnu and they're all meant for embedded market. I doubt you'd be able to recompile the linux kernel with any of them.
pcc will take YEARS to get the functionality and optimizations that gcc has, even if gcc compiles slowly and sometimes generates dumb code.
Either way, they'd be much, much better off if they imported LLVM and redirected their compiler brain power to clang.
PCC is interesting, but it's based on technology from the 70's, doesn't support a lot of interesting architectures, and has no optimizer to speak of.
If you're interested in advanced compiler technology, check out LLVM, which is a ground-up redesign of an optimizer and retargetable code generator. LLVM supports interprocedural cross-file optimizations, can be used for JIT compilation (or not, at your choice) and has many other capabilities. The LLVM optimizer/code generator can already beat the performance of GCC compiled code in many cases, sometimes substantially.
For front-ends, LLVM supports two major ones for C family of languages: 1) llvm-gcc, which uses the GCC front-end to compile C/C++/ObjC code. This gives LLVM full compatibility with a broad range of crazy GNU extensions as well as full support for C++ and ObjC. 2) clang, which is a ground-up rewrite of a C/ObjC frontend (C++ will come later) that provides many advantages over GCC, including dramatically faster compilation and better warning/error information.
While LLVM is technologically ahead of both PCC and GCC, the biggest thing it has going is both size of community and the commercial contributors that are sponsoring work on the project.
-Chris
"So, really, there's no point in duplicating effort on a BSD lisenced compiler. Correct me if I'm wrong."
From the discussion of TFA:
The licence is just the top of the iceberg
Let me get this straight. A compiler that has been production-quality for over 15 years, compiles everything on every architecture, and has been continuously improved every minute of its existence needs to be replaced by ... Son of pcc? Because of a license?
Sure, I prefer BSD-style licenses, and so do some other people, but what drives gcc development is the GNU license. I think I'll stick to the compiler that's debugged. Oh, that's right, I forgot, it comes with a debugger too. If you like that sort of thing.
Even if a compiler generates miserably inefficient code, it is valuable if the code is correct. It is a valuable tool to use for the verification of other compilers. It can also be used as part of a compiler bootstrapping process. Since its code size is probably a small fraction of GCC's, it may make a better teaching tool. If people are actually going to use it, given that it must coexist in a world with much more mature compilers, it will itself probably become much more mature in a relatively short period of time. GCC currently has no competitors in the free realm and has suffered from neglect in the past. A little competition may keep the developers on their toes and prevent another egcs.
GCC has been continuously changed not continuously improved. With each new chip that it optimizes for it seems to drop support for an older one. Plus it is dog slow.
The reason they look increasingly extremist is because the FSF tends to make up policies and rules which bind GCC development in order to avoid the theoretical risk of making GPL violations easier. As compiler technology advances these restrictions have become increasingly burdensome, in particular, several of the technical advantages of LLVM are things the GCC team would have liked to do but RMS nixed because it would have made it too easy to circumvent the license.
Please. I think RMS would prefer, GNU/NetBSD != GNU/Linux. Obey the beard!
- Must compile C code (GCC does this).
- Must support all of the platforms OpenBSD targets (GCC has a habit of dropping support for various platforms).
- Must be easy to add new backends for new architectures (GCC makes this really hard).
- Must be easy to audit for security (GCC is a tangled mess).
Maybe a few I've missed. GCC is like Linux; it's a fairly good solution for a lot of problems, but it's rarely the best solution for any given problem. PCC is a better fit for the needs of the OpenBSD base system.
I am TheRaven on Soylent News
> It's like that phony debate between "great taste" and "more filling"
That's "tastes great" and "LESS filling". Clearly you're trying to push a "tastes great" agenda by deliberately misrepresenting the opposing viewpoint. Typical tactics for the tasteistas.
(imagine Daffy Duck saying all that)
Done with slashdot, done with nerds, getting a life.
The biggest reason for the new compiler (despite the jackass article submitter's position) is that GCC does *NOT* support every architecture. GCC drops architectures frequently as the core contributors lose interest, which hurts OSes like NetBSD that try to support more than the mainstream architectures. NetBSD relies on a combination of GCC 2, 3, and 4 to compile the OS on all of the architectures it supports.
The idea with PCC is not that it will be BSD licensed (nobody really gives a fuck what license the compiler is under), but that it will be supported directly by the BSD community, including the NetBSD hackers who have their bazillion architectures to support.
First: PCC has not YET supplanted GCC. The BSDs are hoping it will in the future.
Second: The biggest attraction of PCC is NOT the license. The article submitter who stated otherwise is a jackass.
Third: There are technical reasons why GCC is actually unusable by some BSDs, such as NetBSD, which aims to support many architectures that GCC has dropped. NetBSD uses a combination of GCC 2, 3, and 4 to compile all of its different architectures. The NetBSD developers would rather have a single compiler that handles them all. PCC is nowhere near that level yet, of course.
Fourth: GCC politics are a pain in the ass for many BSD developers who just want to submit patches to a compiler without the overhead of GNU's policies and GCC's management.
Fifth: GCC produces crappy code more often than anyone would like. GCC bugs are far from unheard of, performance of generated code is often unpredictable between releases, and in many less commonly used architectures or sources GCC will produce incorrect code. Yes, these cases are very rare, but the BSD folks have hit the problem often enough for it to be a concern. PCC, being simpler and less bloated with cruft from multiple rewrites of the internals will hopefully produce correct and predictable code more often than GCC.
Sixth: PCC actually works today. It can compile most of the NetBSD userspace, as I recall, and the kernel will be ready to roll soon after some inline assembler problems are fixed. This isn't some theoretical hacky project - it works right now. It's not ready to replace GCC just yet, by any means, but it's a lot more than some Slashdotters seem to think it is.
While the person you're responding to *is* a troll, I guess it's worth pointing out that GCC and other highly optimizing compilers will "break" some apps that a simpler compiler won't break. Why?
Many optimizations rely on careful reading of the standard, and explicitly taking the liberties that the standard lets you take. For instance, the following loop terminates on a simple compiler, but becomes infinite on some optimizing compilers:
    int i = 1;
    while (i > 0)
        i = i * 2;
The ANSI C standard leaves signed integer overflow undefined (not merely implementation defined), so the compiler is allowed to assume it never happens. If you were expecting 'i' to go negative after 31 iterations (with a 32-bit int), and for that to terminate the loop, you could be in for a nasty surprise.
Suppose an application relied on this behavior, and now it misbehaves when compiled with GCC. Did GCC "break" that application? In some sense, yes: The app functions correctly with compiler (a) but not with compiler (b), so the app must be compiled with compiler (a). The breakage, however, happened because the application is not strictly conforming. It uses compiler-dependent semantics, and that's hardly GCC's fault.
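For what it's worth, here is a minimal sketch of how the doubling loop can be written so that it does not depend on overflow at all. The function names are made up for illustration. The first version checks the bound before multiplying; the second uses unsigned arithmetic, which the standard does define to wrap.

```c
#include <limits.h>

/* Hypothetical helper: counts how many times 1 can be doubled before
 * the next doubling would exceed INT_MAX. The guard is evaluated
 * before the multiplication, so signed overflow never occurs. */
int count_doublings(void)
{
    int i = 1;
    int n = 0;

    while (i <= INT_MAX / 2) {
        i = i * 2;
        n++;
    }
    return n;
}

/* Same idea with unsigned arithmetic, which wraps modulo
 * UINT_MAX + 1 by definition: the loop ends when u wraps to 0. */
int count_unsigned_doublings(void)
{
    unsigned int u = 1;
    int n = 0;

    while (u != 0) {
        u = u * 2u;
        n++;
    }
    return n;
}
```

With a 32-bit int these return 30 and 32 respectively, and no optimizer is entitled to turn either loop into an infinite one.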
Simpler compilers also don't reorder code as much, and don't optimize away as much "dead code." Stuff that really should have memory barriers, explicit synchronization and perhaps the volatile keyword applied to them run just fine without all those things when compiled with a simple compiler and run on a scalar, in-order CPU. The source code is also easier to read, because in the end the semantics are much more restricted--meaning the compiled output more closely resembles the source input. Give that code to a highly optimizing compiler, though, and run it on a super-scalar, out-of-order machine, and it'll break left, right and center. Is it the compiler's fault? Is it the CPU's fault? It's really the gap between the semantics the programmer thought he had (and happened to have in the simpler environment), and what C actually guarantees.
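As a rough illustration of "explicit synchronization", here is a sketch of handing a value from one thread to another through a flag. It assumes C11's <stdatomic.h>, which is newer than this discussion; the names payload, ready, publish and try_consume are invented for the example. With a plain int flag and no ordering, an optimizing compiler or an out-of-order CPU may reorder the two stores.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;          /* ordinary data being handed over */
static atomic_bool ready;    /* file-scope object, starts out false */

/* Producer side: write the data, then publish the flag with release
 * ordering so the payload store cannot move after the flag store. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer side: the acquire load pairs with the release store above.
 * Returns true and fills *out once the producer has published. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

A consumer thread would call try_consume in a loop until it returns true; the acquire/release pair is what guarantees it then sees the value the producer wrote.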
Simpler compilers implement simpler semantics that are easier to understand, but only because they're compiling a very restricted form of C that offers way more implicit guarantees than the C standard actually does. Personally, unless that's made explicit (and therefore truly guaranteed forevermore by the compiler), I suspect it's actually a recipe for disaster. If nothing else, it could lead to code that's significantly harder to move to different platforms, since it'll start to rely on these simpler, "easier" semantics. Of course, then again, super-scalar out-of-order CPUs still strip a bunch of that away, so who knows, it might not be that bad.
--Joe
Program Intellivision!
Seriously, if you're writing code for a living, especially performance-critical code, isn't hardware/platform optimization for the end-use binary far more important than speed of compilation? Particularly if that binary gets blown out to hundreds or (of?) thousands of boxes. If I had to choose between a slow, but hand-tuned GCC for my platform or a quick other compiler that made correct but mediocre-performing (no SSE?/3DNow/VMX/VIS or whatever) binary code, I'd say GCC no contest.
And frankly, slower compilers mean secksier hardware requirements for workstations.. ("Yes, GCC4 is slow, that's why I need that dual quad-core Xeon with 4GB RAM!!")
Meh.
Then the answer is no. I may be alone in the world, but I'm perfectly happy with the gcc compiler and have been for years. It does what it's supposed to, it is FREE, it is cross-platform (MinGW), and it annoys the BSD guys.
Clear Winner. GCC
It has been pointed out here that people who choose a compiler based on its license are idiots. Well, if I'm working on Windows I use MinGW specifically because of its license. If I'm working in Linux, and I usually am, I choose GPL above all others. Count me as an idiot if you like, but you can shove the alternatives. I know what I am getting and have a reasonable expectation of what is coming in the future, and if I need to modify it (heaven forbid) I can. BSD is a fine license for people who NEED it. I don't. When given the choice I choose GPL. GCC slower? Maybe so. Code works and I get paid. If it takes 3 hours for Qt to compile, I bill for 3 hours.
Sorry but, I'm a pragmatist in all things except freedom. I've been burned enough. (Admittedly, I've personally never been burned by BSD code, unless you count Windows.)
OSGGFG - Open Source Gamers Guide to Free Games
Think about it. Getting a new compiler into free UNIX and the open source community is going to be as hard as getting a new platform on the desktop to compete with Windows. And for similar reasons.
You're not going to supplant GCC until you get all the code that depends on GCC-specific features modified to be standard portable C. That's a barrier to entry as steep as Microsoft's application barrier to entry. Now it's not as bad as it was in the early '90s when GCC was sprouting new C extensions everywhere (like the ability to have declarations not at the start of blocks, or the ability to leave the second element out of the trinary conditional operator, or things like alloca), and a lot of those features have now become common and even standardized (and others, like the shortcut trinary, have been deprecated). But it's not as easy as just having a good compiler, or even a good language translating ecosystem like Tendra... the playing field is anything but level.
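To make the porting cost concrete, here is a small sketch of what rewriting two of those GCC-isms into standard C can look like. The function names are invented for illustration. The first replaces the shortcut conditional (`name ?: "anonymous"` in GNU C) by spelling the operand out twice; the second avoids alloca by using malloc and keeps its declarations at the start of the block.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* GNU C accepts `name ?: "anonymous"`; standard C needs the middle
 * operand written explicitly. */
const char *display_name(const char *name)
{
    return name ? name : "anonymous";
}

/* Instead of an alloca() scratch buffer, allocate with malloc() and
 * hand ownership to the caller, who must free() the result.
 * Returns NULL if the allocation fails. */
char *copy_upper(const char *s)
{
    size_t n = strlen(s);
    size_t i;
    char *buf = malloc(n + 1);

    if (buf == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        buf[i] = (char)toupper((unsigned char)s[i]);
    buf[n] = '\0';
    return buf;
}
```

Neither change is hard on its own; the barrier is that a large code base contains thousands of them.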
The license doesn't explicitly say you can't modify it because it goes without saying that you are not allowed to modify it. If the default was for licenses to be freely modifiable by recipients then they would all be worthless. Also, it would be asinine for the default to be that you can freely modify the license because then every single license would have to have a standard clause that says you can't modify it.
The same rule applies to copyright notices. You are not allowed to modify the copyright notice on a work even though it doesn't explicitly say such modifications are forbidden. The BSD license reminds recipients that they have to keep the copyright intact, but this is done as a courtesy and is not required.
As for your Wikipedia quote, I already gave a detailed explanation of how BSD licensed code can be distributed along with code that has a more restrictive license because this might be what has caused the widespread misunderstanding that you are still suffering from.
The rule is trivially simple: unless you are given explicit permission from the original author, you can't change the license or copyright notice on someone else's work. Period.
We don't see the world as it is, we see it as we are.
-- Anais Nin
It is extremely difficult and next to pointless to write code that is strictly conforming. It is in fact quite useful, for instance, to use unions to re-interpret bit patterns. (Note that "portable" is something rather different than "strictly conforming," or even "conforming." Many non-conforming programs are still highly portable because of commonalities among implementations.)
For example, suppose you want to bit-reverse an entire array in memory. That is, bit 0 of the first element in the array swaps locations with bit N-1 of the last element of the array, bit 1 swaps locations with bit N-2, etc., all the way down to the middle of the array. How would you implement that as a strictly conforming ANSI C program? It turns out to be rather difficult to do correctly. (Why would you want to? Well, it's a handy way to flip bitmaps for one thing.)
First, you have to know how many bits are in each unsigned char. There could be from 8 to who-knows-how-many bits in an unsigned char. (Yes, 256-bit unsigned char are legal in ANSI C, as long as there's at least as many bits in a short int, int, long int and long long int.) So, you can't rely on any fast, cute implementations such as this ever-popular word-reversal routine:
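The listing is not reproduced in this copy of the comment. As a stand-in, here is a sketch of the usual mask-and-shift form of such a routine. The name reverse_bits32 is made up, and the code assumes unsigned int is exactly 32 bits wide.

```c
/* Reverses the bit order of a 32-bit word by swapping progressively
 * larger groups. Assumes unsigned int is exactly 32 bits; the masks
 * and shift counts are wrong for any other width. */
unsigned int reverse_bits32(unsigned int x)
{
    x = ((x & 0x55555555u) << 1) | ((x >> 1) & 0x55555555u); /* adjacent bits */
    x = ((x & 0x33333333u) << 2) | ((x >> 2) & 0x33333333u); /* bit pairs     */
    x = ((x & 0x0F0F0F0Fu) << 4) | ((x >> 4) & 0x0F0F0F0Fu); /* nibbles       */
    x = ((x & 0x00FF00FFu) << 8) | ((x >> 8) & 0x00FF00FFu); /* bytes         */
    x = (x << 16) | (x >> 16);                               /* 16-bit halves */
    return x;
}
```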
That code is implementation defined. It cannot be part of a strictly conforming program. It can be part of a conforming program, though it only works as expected on machines whose unsigned int is 32 bits. (That happens to be over 90% of the PCs and *NIX boxes people work with these days, but that wasn't true as recently as 10 years ago.)
What about other undefined things? Well, sometimes an implementation defines them usefully. For example, consider this bit of code:
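The snippet itself is missing from this copy. A sketch of the kind of code usually meant here, reading a union member other than the one last written, might look like the following. The name is_little_endian is invented, and the code assumes a 32-bit unsigned int and 8-bit char.

```c
/* Returns 1 on a little-endian machine, 0 otherwise. Assumes a 32-bit
 * unsigned int and 8-bit char. Stores through one union member and
 * reads back through another to inspect the byte order in memory. */
int is_little_endian(void)
{
    union {
        unsigned int  word;
        unsigned char bytes[4];
    } probe;

    probe.word = 0x01020304u;
    return probe.bytes[0] == 0x04;
}
```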
This is useful code. Chances are nearly every compiler you meet (at least, which offers 32-bit ints) will handle this correctly and tell you the endianness of the machine. That means it's reasonably portable. It also happens to be quite undefined.
Sure, it fails miserably on oddball machines with non-standard word sizes, but most programs only care to be portable amongst the vast majority of machines that have 8-bit char, 16-bit short, 32-bit int. (This is part of the reason why LP64 machines are more popular than ILP64 machines.)
In general, compilers implement a superset of the standard by providing reasonable semantics to expressions that the standard leaves undefined. For instance, on most compilers, signed arithmetic wraps around the same as unsigned arithmetic, and the values you get are exactly what you'd expect from two's complement arithmetic, despite the fact that the standard leaves those results undefined. Heck, until the adoption of C9x, C++ style comments were not technically legal in C programs, but most compilers accepted them.
--Joe
Program Intellivision!
Unfortunately, Microsoft made a decision not to use the boundary protection in their new operating system called "Windows". They ignored most of the work that Intel did providing support in the silicon for a decent micro operating system. The boundary protection could have been built into the programming language and runtime. Things would have been much better in the long term.
16 bit Windows did use it, but just not for protection.
Originally 8086s had a simple segmentation mode without protection. Each address was built up as (seg << 4) + offset. Both segment and offset were 16 bit, but with the 4-bit shift the result was a 20-bit address, which limited the address space to 1MB. Famously, the designers of the IBM PC reserved the upper 384K for IO, and this is where the 640K limit came from.
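As a quick sketch of the arithmetic: in real mode the 16-bit segment value is multiplied by 16 and added to the 16-bit offset. The function names below are made up for illustration.

```c
#include <stdint.h>

/* Real-mode address formation: the 16-bit segment is shifted left by
 * 4 bits (multiplied by 16) and added to the 16-bit offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* The 8086 has only 20 address lines, so the sum wraps at 1MB. */
uint32_t real_mode_linear_8086(uint16_t segment, uint16_t offset)
{
    return real_mode_linear(segment, offset) & 0xFFFFFu;
}
```

For example, segment 0xA000 with offset 0 gives 0xA0000, which is exactly the 640K mark where the reserved upper 384K begins.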
Later, on the 80286, protected mode was supported, where the value you loaded into a segment register was a selector, an index into a table of segments. The CPU supported different privilege levels called rings, with only the highly privileged ones allowed to create entries in this table.
16 bit Windows did use 286 protected mode - it had to, to get access to memory above 1MB. When it loaded it would install a DOS extender which would switch the PC into protected mode and allow 16 bit protected mode tasks to run on top of DOS. Since the 286 didn't allow you to switch back, each call into DOS used a triple fault to reset the processor and some BIOS code to jump back into Windows.
It was even possible to allocate buffers bigger than 64K. In that case, Windows would set up an array of selectors, one for each 64K chunk. If C code wanted to access an arbitrary location in the buffer, the compiler would work out which 64K chunk it was in, load a segment register with the correct selector (this was a very slow operation since microcode in the 286 had to check permissions), calculate the offset and load it into one of the normal registers and then do the segment read. This was an incredibly slow process.
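The address calculation described above can be sketched roughly as follows. This is only a model of the arithmetic, not Windows or compiler code; the struct and function names are hypothetical.

```c
#include <stdint.h>

#define CHUNK_SIZE 0x10000UL   /* 64K bytes per segment */

/* Hypothetical model of a buffer larger than 64K: one selector per
 * 64K chunk, as described in the text. */
struct huge_buffer {
    const uint16_t *selectors; /* selector for each 64K chunk */
    uint32_t size;             /* total size in bytes */
};

struct far_address {
    uint16_t selector;         /* value to load into a segment register */
    uint16_t offset;           /* 16-bit offset within that segment */
};

/* Split a linear byte index into (selector, offset).
 * The caller must pass index < buf->size. */
struct far_address huge_locate(const struct huge_buffer *buf, uint32_t index)
{
    struct far_address a;

    a.selector = buf->selectors[index / CHUNK_SIZE];
    a.offset = (uint16_t)(index % CHUNK_SIZE);
    return a;
}
```

Every access that might cross into a different chunk pays for this lookup plus the segment register load, which is where the slowness came from.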
It's also worth pointing out that although 16 bit Windows used protected mode to get access to more memory, like DOS the OS didn't protect itself from being damaged by third party applications. And it didn't stop third party applications damaging each other. As Walter Oney put it, the philosophy was that it's a personal computer after all - if you're a programmer you can do what you like to it, just like you're free to run your car without oil until it seizes up.
Once the 386 came out and allowed offsets into segments to be bigger than 64K, Windows would even set the limit on the first selector to allow you to access the whole buffer with operand size overrides. The 386 also supported Virtual 8086 mode, so protected mode could jump into DOS and segments would work like an 8086. But loading segment registers was still a very slow operation, and Windows NT and Linux, which are both designed to stop applications corrupting each other or the system, both used page tables to do it instead. But page tables don't protect against buffer overruns.
Mind you, segmentation only protects against buffer overruns if you malloc the buffers. Automatic variables on the stack are not protected. And allocating a stack variable is just a subtract instruction - it is orders of magnitude faster than calling into the OS, switching to Ring 0, allocating the memory, filling in the segment table and then returning to the caller, who would load the selector into a segment register. Worse, there are very few segment registers, and each time you access any buffer you need to reload one. On a 286 there is CS, DS and ES. CS and DS are needed for near code and data, so only ES is free for far pointers. The 386 has FS and GS too, but three registers with very slow loads is not a recipe for a speedy machine.
So Microsoft tried it and it was slow. If they had used it enough to avoid buffer overruns - i.e. malloc every buffer rather than allocating some on the stack - it would have been really slow. And so all modern OSs rely on the page table for protection instead of segments so they can run on multiple processors. In x64 mode, segment limits aren't even checked by hardware anymore.
echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;