Celebrating 30th Anniversary of the First C++ Compiler: Let's Find Bugs In It
New submitter Andrey_Karpov writes: Cfront is a C++ compiler which came into existence in 1983 and was developed by Bjarne Stroustrup ("30 YEARS OF C++"). At that time it was known as "C with Classes". Cfront had a complete parser, symbol tables, and built a tree for each class, function, etc. Cfront was based on CPre. Cfront defined the language until circa 1990. Many of the obscure corner cases in C++ are related to the Cfront implementation limitations. The reason is that Cfront performed translation from C++ to C. In short, Cfront is a sacred artifact for a C++ programmer. So I just couldn't help checking such a project [for bugs].
I'm sure I got a PDP-8 somewhere in my back closet.
"Sacred Artifact"? Are you kidding?
I will happily agree that the language and compilers were both pretty awful back then. The worst warts have been beaten out now, and it's a pretty capable language in the main but this is now. 1990s C++ was horrible.
This "new submitter" is the "science adviser" of the company who wrote that blog post, of which the main point is to sell you their product.
Before C++ came around, wasn't C just a glorified macro assembler?
Only if you know what you are doing...
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
Without C++ there would be no Java. Since my community college couldn't afford a new Microsoft site license, I had to learn every flavor of Java for all my programming courses. Meh...
If you know what you're doing, you probably are already using another language.
Such as? I doubt that there's anyone not relying on C++. Never mind that every major browser (and a few minor ones) are written in C++. Many language implementations are in C++. And for those that are in C, well, all the major C compilers (LLVM, VisualStudio and now GCC of course) are written in C++.
Personally, I like the language. Oh I can whine about it all day and it has many warts, but the combination of efficiency, expressiveness, flexibility and static typing fit well with me.
SJW n. One who posts facts.
1979 - Work on "C with classes" started
At that time, "Object Oriented programming was considered too slow, too special purpose, and too difficult for ordinary mortals."
Well, I'm glad C++ fixed that last problem!
But it's small and fast when you use it right..
C++ is but a tool we use to get our jobs done. Every tool has its limitations, provisos, flaws and a purpose. When a tool is placed in the hands of a skilled craftsman using it for its intended purpose it produces excellent results, but if it's used by somebody without the necessary skills, or for a task it is not designed to do, the results can be horrid. C++ may be dated, but for some problem domains it remains the tool of choice. You don't write device drivers in Java for a reason.
The wise programmer keeps as many tools in his tool box as possible. He sharpens them and maintains them and keeps adding the new and useful tools he finds. He knows the intended purpose and best use of his tools and selects the appropriate tools for the job at hand. All too often, young bucks show up to work armed with only ONE tool, shiny and new, thinking that it is the only tool they need for any job. They deride the old salts who use the "old" tools well and make fun of the well-worn, old tools they use. "Use this new shiny tool," they say, "It's the only tool you need." Meanwhile the old experienced guys chuckle and shake their heads, remembering when they too said similar things, and try to teach the younger ones that there is value in putting tools in the box...
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
C++ can have, and has had, bugs.
The standard for C++ could specify something impossible, for example. Like, make std::string impossible. This is similar to having a program that fails to compile, but as the standard is interpreted, it only causes problems when someone actually interprets it.
It could also have mistakes (as opposed to bugs), like mandating that your sort routine take at least O(n^3) time.
You can examine the defects and fixes in the C++ standard; they are published publicly. A defect is usually fixed by compilers even when compiling in prior version modes.
Translating to C would not impose a limitation on the language features of C++; it's possible to generate whatever C code you need to support C++ features. Using an LALR parser would introduce limitations on language design, however. This was once very common.
CFront wasn't a compiler, it was a preprocessor that spat out C code that was subsequently compiled by whatever C compiler you happened to have.
Looking at CFront output was the best way to understand how C++ actually worked at the time, since it was all mapped to pretty straightforward C constructs. I don't think anyone around today knows what a vtable and ptable is, but back then it was how you could tell the programmers who really dug in to the language from those that didn't.
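For anyone who never saw that kind of output, here is a rough sketch of the idea behind a vtable, written by hand in the C-like subset of C++. It is not real CFront output, and every name in it (Shape, ShapeVtbl, call_area, demo_vtable) is made up for illustration.

```cpp
#include <cassert>

// Hypothetical hand-written lowering of a class with one virtual function.
// This is NOT real CFront output, only an illustration of the vtable idea.
struct Shape;

struct ShapeVtbl {
    int (*area)(const Shape* self);   // one slot per virtual function
};

struct Shape {
    const ShapeVtbl* vptr;            // hidden pointer a translator would add
    int w, h;
};

static int rect_area(const Shape* s)     { return s->w * s->h; }
static int triangle_area(const Shape* s) { return s->w * s->h / 2; }

static const ShapeVtbl rect_vtbl     = { rect_area };
static const ShapeVtbl triangle_vtbl = { triangle_area };

// A virtual call such as s->area() becomes an indirect call through the table.
int call_area(const Shape* s) { return s->vptr->area(s); }

int demo_vtable() {
    Shape r = { &rect_vtbl, 4, 3 };
    Shape t = { &triangle_vtbl, 4, 3 };
    assert(call_area(&r) == 12);
    return call_area(&r) + call_area(&t);   // rectangle area plus triangle area
}
```

Calling demo_vtable() from main exercises both "classes" through the same call site, which is all dynamic dispatch amounts to at this level.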
it is an unmanaged language that gives low level access without adequate tools to guard that access.
Uh, have you heard of smart pointers? Or RAII? Have you used C++ in the last 5-10 years... at all, actually?
Security was never the primary design goal for C, UNIX or the Internet. The technology for the last half-century got "foisted on this world" because it worked and worked quite well.
I remember when it first came out. Back in the days of BBS's
The old joke was
"Have you seen the new C+++"
I wonder how many will still get the joke?
Without C++ there would be no Java. Since my community college couldn't afford a new Microsoft site license, I had to learn every flavor of Java for all my programming courses. Meh...
GCC was free the last time I checked and a free graphical frontend should be possible to find as well. Notepad++ would do (not 100% sure of the license for a college though). Using Java instead of C++ for license issues sounds more like an excuse to me, partly because in my university days Microsoft donated licenses to the students and I got free Windows and stuff. After all they wanted the new engineering students to be used to Microsoft products when they joined the software workforce.
I learned Java as well, not due to costs, but because the cross platform code made it "the only language you will ever need in the future". Everybody hated it and preferred C or C++ (C was actually surprisingly popular compared to C++) and to this day I haven't had the need to use Java even once since I graduated. I use C++ for performance and scripting languages for whatever uni figured I should use Java for.
Nope, C++ is still a thing when you need to create really large, complex programs, and when efficiency still really matters. Here in the videogame industry, C++ absolutely reigns supreme. Nothing else even comes close. Large applications like MS Office are still written in C++, from what I'm told, as are *many* large applications. It's not just legacy stuff either.
C++ has the native performance of C, but is able to use powerful zero-cost abstractions that allow programs to scale up safely. For instance, if you write modern C++, it's almost impossible to write code that will stomp on random memory or leak resources, a real issue with C or older style C++ programs, yet that protection is completely optimized away and costs *nothing* at run-time (which I think is something many programmers don't properly appreciate).
An easy language to master? Absolutely not. It's a language that takes a long time to learn well, and it can be rather unforgiving at times, but it's great for what it does. C++ 11/14 has also really breathed new life into things as well, IMO. It's really amazing how much the language feels almost like it's using managed memory (e.g. garbage collection) now that I'm using smart pointers ubiquitously.
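To make the smart pointer point concrete, here is a minimal sketch, assuming a C++14 compiler. Texture and demo_smart_pointers are hypothetical names; the instance counter is only there so the cleanup can be observed.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource type that counts live instances.
struct Texture {
    static int live;
    Texture()  { ++live; }
    ~Texture() { --live; }
};
int Texture::live = 0;

int demo_smart_pointers() {
    {
        std::unique_ptr<Texture> owner = std::make_unique<Texture>(); // sole owner
        std::shared_ptr<Texture> a = std::make_shared<Texture>();     // ref count 1
        std::shared_ptr<Texture> b = a;                               // ref count 2
        assert(Texture::live == 2);
        assert(a.use_count() == 2);
    }   // both Textures are destroyed here; no delete is written anywhere
    return Texture::live;
}
```

There is no garbage collector involved: the objects go away at a known point, the closing brace, which is why it only feels like managed memory.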
C++ is incredibly portable as well. My game engine works across several platforms, only a smallish percentage of the code is different between platforms, mostly for low-level graphics, audio, windowing, or other system calls.
Its stability as a language is legendary as well, and that's important for real-world projects that depend on it. You can probably still compile most of the earliest C++ code on a modern compiler and expect it to still work, not to mention most C code as well.
I'd never claim C++ is the end-all, be-all of languages (I sound like I'm gushing, but I have plenty of complaints as well), but it most assuredly has a very long future with us, and for some very good reasons.
Irony: Agile development has too much inertia to be abandoned now.
While Microsoft may donate to the universities, it required payment from the community college that I attended. The dean taught C/C++ and GCC in the Linux Admin classes. The administration refused to go with open source programming because Microsoft Visual Studio was the industry norm and Java was the next industry norm on the horizon. When the funding issue did get resolved, none of the classroom computers could run VS .NET as the hardware was too old. I never used Java after graduating from school. These days I use Python and occasionally Cython (C compiler for Python).
This sounds like the people over that particular program at your community college didn't really understand what options were available.
.NET has been free for quite some time for personal or academic use (starting with Visual Studio 2005). Even before that, the Microsoft C++ command-line compiler has been available as a free download as a part of the Windows SDK/Platform SDK since the release of Windows XP (if not before).
The Express versions of Visual Studio
It is true that the Windows SDK does not include an IDE of any kind but it includes the compiler and linker, so all you really need is a text editor to use it to develop programs.
I don't think Notepad (or Notepad++) has the same cachet as Microsoft Visual Studio on a resume. Getting a job is the whole point of learning computer programming at a community college. If you know C/C++ but don't know MSVS, you're not going to get a job. The same thing for Java if you don't know Eclipse or Netbeans.
I don't know if it was CFront, but the first C++ programs I wrote were for homework assignments in the late 1980's. The compiler output was unreadable, of course, because it wasn't compiling my code. It was compiling a C++ to C translation of my code. All I really knew was that there was a problem with my program *somewhere* at the line, or above it, where the first error message appeared. It was an exercise in agony management.
I remember thinking, "what an awful language!" at the time. Of course that changed eventually; I remember Borland's C++ compiler being especially good at producing helpful error messages.
I am going to bet you that the 3B2 was the primary computer architecture for cfront.
However, it does appear that cfront was extremely portable:
Which is why you'll never get one through a code review at my company. std::unique_ptr does what auto_ptr did, only much more elegantly. std::shared_ptr is also useful.
I'm not sure you really do understand RAII, if you refer to it as "automatic finalization". It's far more useful than finalization in the languages I've seen that use it. It's a unified resource manager that can handle every type of resource uniformly.
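As an illustration of RAII managing something other than memory, here is a small sketch. The "slot pool" and the names acquire_slot, release_slot, SlotGuard and risky are all invented for the example; a real program would wrap a file handle, lock, socket and so on in the same way.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical non-memory resource: a pool of numbered slots.
static int slots_in_use = 0;
int  acquire_slot()    { return ++slots_in_use; }
void release_slot(int) { --slots_in_use; }

// RAII wrapper: acquire in the constructor, release in the destructor.
class SlotGuard {
public:
    SlotGuard() : id_(acquire_slot()) {}
    ~SlotGuard() { release_slot(id_); }
    SlotGuard(const SlotGuard&) = delete;             // one owner per slot
    SlotGuard& operator=(const SlotGuard&) = delete;
private:
    int id_;
};

void risky(bool fail) {
    SlotGuard guard;
    if (fail) throw std::runtime_error("unexpected control flow");
    // normal path: guard is released at the closing brace
}

int demo_raii() {
    risky(false);
    assert(slots_in_use == 0);
    try { risky(true); } catch (const std::runtime_error&) {}
    return slots_in_use;   // released on both the normal and the throwing path
}
```

The release happens at a deterministic point on every exit path, which is the difference from finalizers that run whenever a collector gets around to it.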
"When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
Tell that to the HR drone who writes up the job description. I wouldn't be surprised if Microsoft Visual Studios is listed as a requirement for the job applicants you want to interview.
I don't remember what the ptable was for anymore either.
I do remember name mangling, which I suppose doesn't happen anymore?
nonsense, plenty of large complex projects are written in C. Open source kernels or apache for example. Any C++ data structure or algorithm has a fairly straightforward C equivalent, "objects" in pure C are easy. No real reason to use C++ for anything, just introduces difficulty in debugging and maintenance.
Yeah, 20 years experience in c++ development here
The Internet was created by DARPA. The civilian application was a byproduct.
https://en.wikipedia.org/wiki/DARPA
If you know C/C++ but don't know MSVS, you're not going to get a job. The same thing for Java if you don't know Eclipse or Netbeans.
That is nonsense. No one cares what IDE I use.
And I switch to an unknown IDE immediately.
What the fuck should be the difference between them?
In Java there is basically none at all and if we use C++ all IDEs generate a makefile.
If you believe you did not get the latest job because you did not know the IDE you have a serious problem.
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
If you believe you did not get the latest job because you did not know the IDE you have a serious problem.
You have obviously never dealt with a recruiter or HR drone following a checklist. If you didn't have what's on the checklist, you didn't get the interview or job. I was once turned down for an interview because I didn't know "Red Hat GUI" for a Linux job that was 99% command line work, which I thought I was perfect for because I use the Linux command line 99% of the time. Since I didn't know "Red Hat GUI" and that box couldn't be checked off, the recruiter hung up on me.
I don't get called by drones like that.
When I have a phone interview I'm usually hired after it.
Next time simply cheat and face palm when you are sure it is just a drone.
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
And what has C++ ever given us in return?
...And smart pointers.
Zero-cost abstraction?
What?
Zero-cost abstraction?
Oh. Yeah yeah. It did give us that, that is true.
Oh yeah. Smart pointers Reg. Remember what programming used to be like?
Yeah, all right, I'll grant you, zero-cost abstraction and smart pointers are two things that C++ has done.
And the portability.
Well yeah obviously the portability, I mean portability goes without saying, doesn't it? But apart from the zero-cost abstraction, smart pointers and portability...
Language stability?
Happy people make bad consumers.
nonsense, plenty of large complex projects are written in C.
Where did I say they weren't? Obviously a lot of great stuff is written in C as well.
Any C++ data structure or algorithm has a fairly straightforward C equivalent, "objects" in pure C are easy.
Oh? I'd love to hear how you'd implement a ref-counted smart pointer in C. Or perhaps a scope based auto-lock/unlock mechanism. Tell me how you implement a reusable, generic container like a key-value map while remaining typesafe? How about any way at all to guarantee that a resource won't leak if you miss a function call due to some unexpected control flow? How about a way to avoid clobbering memory if you forget that you need to account for strlen() + 1?
Look, there's nothing wrong with C, but don't pretend it can do what C++ can. C is manual mode programming - the language itself is wonderfully simple and straightforward, but you have to do everything, including managing raw memory and properly cleaning up after yourself. C++ enables the compiler to do a lot of automatic cleanup and safety checks, but has a cost in language complexity and obfuscation.
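For reference, two of the items on that list look roughly like this with the standard library: a scope-based lock and a typesafe key-value map. The names table, table_mutex, add_score and demo_lock_and_map are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

std::mutex table_mutex;
std::map<std::string, int> table;   // key and value types are checked at compile time

void add_score(const std::string& name, int points) {
    std::lock_guard<std::mutex> lock(table_mutex);  // locks here
    table[name] += points;
}   // unlocks here, whether we return normally or an exception propagates

int demo_lock_and_map() {
    add_score("alice", 3);
    add_score("alice", 4);
    // table["alice"] = "seven";   // would not compile: wrong value type
    assert(table.size() == 1);     // single-threaded demo, so no lock needed here
    return table["alice"];
}
```

Nothing in add_score has to remember to unlock, so an early return or a throw from the map insertion cannot leave the mutex held.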
Yeah, 20 years experience in c++ development here
Congrats. 23 years C++ experience here.
Irony: Agile development has too much inertia to be abandoned now.
Recruiters generally have no idea what they're talking about. You kind of have to pad your resume, and the language you use about yourself and your skills, to account for that.
This is OLD advice (I haven't had to make a resume in 15 years), but the idea was to list every possible interpretation of your skillset on the resume consumed by headhunters. Example:
- JDBC
- Java Database Connections
- C++
- Microsoft Visual C++
- Unix
- HPUX
- Solaris
- Linux
- PHP
- PHP4
- PHP5
The headhunters are just scanning and checking boxes. If you feel "icky" about saying you know MSVC++ when you haven't used it just grab the freebie compilers and throw some stuff at it. There... legit.
If an Indian recruiter with an incomprehensible accent calls, I answer "yes" to every question on their script. If they get confused by a "yes" answer and repeat the question, I answer "no" for that question and "yes" to the remaining questions. Surprisingly, this works every time. I've gotten a few interviews this way.
My five-page resume that I put on the job search websites is loaded with buzzwords from the last 20 years. Recruiters love this version. I give the hiring managers my two-page resume that summarizes the last three positions and/or last three years. For my current I.T. job, I had to provide my long resume because 10 to 15 years of I.T. experience was a requirement.
Tell me how you implement a reusable, generic container like a key-value map while remaining typesafe?
oooh! I know this one.
Please note, I'm not suggesting that it wouldn't be simpler and easier to do it in C++ with templates. Naturally it would, but it's *possible* in C, and in fact I believe templates were part of the mechanism to formalise what people were doing in C anyway.
So basically, create a header: generic_vector.h
Do *not* give the header include guards.
Then do this:
#define VECTYPE float
#include "generic_vector.h"
#undef VECTYPE
#define VECTYPE int
#include "generic_vector.h"
And the header file looks like this:
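#include <stddef.h>
/* ## only works inside a macro; two levels so VECTYPE is expanded before pasting */
#define GV_CAT_(a, b) a##b
#define GV_CAT(a, b) GV_CAT_(a, b)
#define GV_T GV_CAT(generic_vector_, VECTYPE)
typedef struct {
    VECTYPE* data;
    size_t size;   /* allocated capacity */
    size_t nelem;  /* elements in use */
} GV_T;
static int GV_CAT(GV_T, _insert)(GV_T* vec, VECTYPE v)
{
    /* Logic goes here... etc */
    return 0;
}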
An alternative is to have the header file define a colossal macro and then do:
GEN_VEC(float)
GEN_VEC(int)
but I generally prefer the former since the debugging is much easier, and it's the least nasty option of the two.
So there you go, it is possible to make a generic, typesafe container in C by severely abusing the preprocessor. Obviously a key/value one would be more complex, and have multiple #defines, one each for the two types and then more for the comparator function for ordering.
I invite any C fan to claim that lot would be easier to read than the equivalent C++ code...
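For comparison, a minimal sketch of what the template version might look like. GenericVector is a hypothetical name (real code would just use std::vector), and this toy version assumes T is default-constructible and copy-assignable.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical growable array; one definition serves every element type.
template <typename T>
class GenericVector {
public:
    void insert(const T& v) {
        if (nelem_ == size_) grow();
        data_[nelem_++] = v;
    }
    std::size_t nelem() const { return nelem_; }
    const T& operator[](std::size_t i) const { return data_[i]; }

private:
    void grow() {
        std::size_t new_size = size_ ? size_ * 2 : 4;
        std::unique_ptr<T[]> bigger(new T[new_size]);
        for (std::size_t i = 0; i < nelem_; ++i) bigger[i] = data_[i];
        data_ = std::move(bigger);   // old storage is freed automatically
        size_ = new_size;
    }
    std::unique_ptr<T[]> data_;
    std::size_t size_ = 0;    // allocated capacity
    std::size_t nelem_ = 0;   // elements in use
};

double demo_generic_vector() {
    GenericVector<float> f;   // replaces #define VECTYPE float + #include
    GenericVector<int> n;     // replaces #define VECTYPE int + #include
    for (int i = 1; i <= 10; ++i) { f.insert(i * 0.5f); n.insert(i); }
    assert(n.nelem() == 10 && n[9] == 10);
    return f[9] + n[0];
}
```

Same field names as the C header (size, nelem, data), but the instantiation per type is done by the compiler and the storage is released without an explicit free.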
Look, there's nothing wrong with C, but don't pretend it can do what C++ can. C is manual mode programming - the language itself is wonderfully simple and straightforward, but you have to do everything, including managing raw memory and properly cleaning up after yourself.
This is one of the things I don't understand. If you're programming in C you're *programming*. IOW automating things that could be done by hand. Why would you not want to automate away half of the work?
SJW n. One who posts facts.
It's a pre-processor that translates C++ to C, which then has to be compiled by a C compiler. Back in the day, I used the Glockenspiel implementation of CFront, which then used the Microsoft C Compiler (this was before Microsoft had a C++ compiler) to compile the resulting C code to .obj files to be linked by Microsoft's LINKer.
Exactly. Modern C++ should mostly read like a high-level managed language would, not C.
A successful API design takes a mixture of software design and pedagogy.
That's correct. The oft overlooked aspect of RAII is that C++ has strictly defined order of object destruction. It forms the core language semantics and modern C++ would be useless without it. Calling it "automatic finalization" is completely missing the point.
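A tiny sketch of that ordering guarantee: locals are destroyed in the reverse order of their construction. Tracer and demo_destruction_order are hypothetical names used only to record the order.

```cpp
#include <cassert>
#include <string>

// Hypothetical tracer that appends its name to a log when destroyed.
struct Tracer {
    std::string& log;
    char name;
    Tracer(std::string& l, char n) : log(l), name(n) {}
    ~Tracer() { log += name; }
};

std::string demo_destruction_order() {
    std::string log;
    {
        Tracer a(log, 'a');
        Tracer b(log, 'b');
        Tracer c(log, 'c');
    }   // locals are destroyed in reverse order of construction
    assert(log.size() == 3);
    return log;
}
```

Because the order is fixed by the language, a later object can safely depend on an earlier one (a lock taken before a file is opened, say) and the teardown unwinds correctly.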
A successful API design takes a mixture of software design and pedagogy.
Congrats, you just sketched out a non working version of std::vector in 20 lines...
That was kinda the point: to show it's possible to do such stuff in C.
SJW n. One who posts facts.
You are so funny, all those things you mention exist in the libraries of major scripting languages with constructs made in pure C
You're making an argument for why those that love needless complexity shouldn't be allowed to design languages
Where's the automatic resource cleanup?
Those who do not learn from commit history are doomed to regress it.
There isn't, but I think the OP asked for type safety and generality, not resource cleanup.
Seriously, why is everyone going full pedant overload on this? Generic programming is perfectly possible in C, if extremely unpleasant. Templates were in no small part designed to allow people to use something more pleasant and obvious than macro fuckery.
SJW n. One who posts facts.
Resource cleanup is definitely part of generality. I would argue even part of type safety. Just because it's overlooked doesn't mean it isn't a crucial part of it. Clean generic programming is only really possible with value semantics, and value semantics means resource cleanup for non-trivial types.
Those who do not learn from commit history are doomed to regress it.
Resource cleanup is definitely part of generality. I would argue even part of type safety
No, I disagree. The generality and type safety was really about generic programming, as in can it be used on a variety of types without messing around with void*. Type safety is something quite specific and doesn't in any definition I've seen include resource cleanup.
Just because it's overlooked doesn't mean it isn't a crucial part of it.
I don't think it's overlooked but it's not a part of type safety. Type safety is about whether you can have type violations, not whether you run out of memory because you forgot to free something.
Clean generic programming is only really possible with value semantics, and value semantics means resource cleanup for non-trivial types.
You can emulate that all in C if you like with abuse of macros. But C idiomatically tends to work with pointers more, which are really reference semantics. I mean sure you might leak if you're not super-super careful but you can get type safety and genericity in C.
If I had to choose between losing templates from C++ and losing destructors, I'd almost certainly lose the former, since you can emulate templates badly and with more pain but nonetheless reasonably completely with macros, but there's nothing you can do about destructors.
Obviously that's a silly hypothetical.
SJW n. One who posts facts.
Seriously, this is getting old.
Type safety is something quite specific and doesn't in any definition I've seen include resource cleanup.
Which is my point. Just because it's overlooked such that it doesn't figure into other people's definitions doesn't mean it isn't necessary. We have a lot of experience with generic programming, and resource cleanup has been found to be a requirement.
not whether you run out of memory because you forgot to free something.
Memory cleanup is only a small part of resource cleanup. This point has been repeated to death.
But C idiomatically tends to work with pointers more, which are really reference semantics. I mean sure you might leak if you're not super-super careful but you can get type safety and genericity in C.
Which is why actual generic programming isn't possible in C, because it can't emulate value semantics for complex types. If you have to be super-super careful, then it's not generic.
Those who do not learn from commit history are doomed to regress it.
Which is my point. Just because it's overlooked such that it doesn't figure into other people's definitions doesn't mean it isn't necessary. We have a lot of experience with generic programming, and resource cleanup has been found to be a requirement.
Resource cleanup is useful both with and without generic programming. You can write typesafe (this has a specific definition), generic code with manual cleanup or with automatic cleanup. For example, a std::vector<T*> is typesafe, in that you can't have type violations (unless you screw with casting---C++ isn't the best example), generic in that it works for all T, but it won't clean up whatever those T*s point to.
Resource cleanup is simply an orthogonal concept to type safety and genericity.
You can write templates with incomplete resource cleanup (above) and it's certainly generic (which means works on more than one type) and typesafe (which means you can't store an ofstream* in a vector).
Memory cleanup is only a small part of resource cleanup. This point has been repeated to death.
Picking one example resource out does not make my example incorrect. So well done for ignoring the point and making a tangential, irrelevant criticism rather than addressing the point I was making.
Which is why actual generic programming isn't possible in C,
Except it is, using the definitions of "generic" that everyone else uses.
because it can't emulate value semantics for complex types.
Plenty of languages implement generic programming with reference semantics just fine. A vector> is essentially reference semantics, on Ts, and that's certainly an example of generic programming.
If you have to be super-super careful, then it's not generic.
I don't think you understand what generic programming is. It's not about care it's about parameterization over types. And you can certainly parameterize over types in C, with sufficient effort.
SJW n. One who posts facts.
Resource cleanup is useful both with and without generic programming. You can write typesafe (this has a specific definition), generic code with manual cleanup or with automatic cleanup.
But generic programming is not useful without resource cleanup. Generic programming algorithms, as I keep saying, rely on value semantics. If the algorithms are working on things that, say, have mutable state, then the generic algorithm breaks. Value semantics means type safe automatic resource cleanup becomes a necessity and thus should be included in the definition. People only resist it because it would mean their language can't be generic.
For example, a std::vector<T*> is typesafe, in that you can't have type violations (unless you screw with casting---C++ isn't the best example), generic in that it works for all T, but it won't clean up whatever those T*s point to.
No, that's why you store objects in vectors, not raw pointers. If you store objects in vectors, it will clean up all Ts, including smart pointers.
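The difference being argued over can be sketched like this, assuming C++14. Widget and demo_vector_cleanup are hypothetical; the live counter just makes the cleanup visible.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Widget {
    static int live;
    Widget()  { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

// Returns {live count after the raw-pointer vector dies, live count at the end}.
std::pair<int, int> demo_vector_cleanup() {
    Widget* w = new Widget;
    {
        std::vector<Widget*> raw;
        raw.push_back(w);
    }   // the vector frees its own storage, but not the Widget it pointed to
    int after_raw = Widget::live;
    delete w;   // with raw pointers, cleanup stays the caller's job

    {
        std::vector<std::unique_ptr<Widget>> owned;
        owned.push_back(std::make_unique<Widget>());
        std::vector<Widget> values(2);   // objects stored by value
        assert(Widget::live == 3);
    }   // all three Widgets are destroyed along with their vectors
    return std::make_pair(after_raw, Widget::live);
}
```

Both sides of the argument show up here: the vector of raw pointers is still generic and typesafe, it just does not own what it points to.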
Plenty of languages implement generic programming with reference semantics just fine.
No, they don't. eg Java's generics aren't generic. They're just syntactic sugar for type casting.
Except it is, using the definitions of "generic" that everyone else uses.
"Everyone else" uses incomplete definitions of generic. It's incomplete given the experience of generic programming, namely in C++. Your "generic vector" does not allow generic algorithms to operate on them because of reference semantics and non-generic resource cleanup.
It's not about care it's about parameterization over types. And you can certainly parameterize over types in C, with sufficient effort.
The parametrization is incomplete if it doesn't include type-aware cleanup (and copy and move and even swap).
Picking one example resource out does not make my example incorrect. So well done for ignoring the point and making a tangential, irrelevant criticism rather than addressing the point I was making.
Really? But you thought nothing of reducing "resource management" down to a leaked memory strawman completely irrelevant to the larger picture of value semantics. I love when people accuse me of apparently doing something when they were the ones to actually do it first.
Those who do not learn from commit history are doomed to regress it.
But generic programming is not useful without resource cleanup
That's trivially false. std::sort does not cause any allocations or deletions of the type T objects.
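A short sketch of that claim: the same algorithm applied to two unrelated element types, sorting in place over storage the caller owns. Player and demo_sort are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

struct Player { const char* name; int score; };   // hypothetical element type

bool demo_sort() {
    int nums[] = {3, 1, 2};
    Player team[] = {{"bob", 7}, {"amy", 9}, {"cal", 5}};

    std::sort(std::begin(nums), std::end(nums));   // same algorithm...
    std::sort(std::begin(team), std::end(team),    // ...different element type
              [](const Player& a, const Player& b) { return a.score > b.score; });

    assert(nums[0] == 1);
    return std::is_sorted(std::begin(nums), std::end(nums)) && team[0].score == 9;
}
```

The algorithm only rearranges elements through the iterators it is given; ownership of the arrays never changes hands.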
Generic programming algorithms, as I keep saying, rely on value semantics
You can keep saying that and you'll keep being wrong. The definition of "generic programming" is parameterisation of types:
https://en.wikipedia.org/wiki/...
No, that's why you store objects in vectors, not raw pointers. If you store objects in vectors, it will clean up all Ts, including smart pointers.
That's great unless you don't want it to, e.g., if you're storing iterators in a std::vector, aka pointers, i.e. reference semantics.
No, they don't. eg Java's generics aren't generic. They're just syntactic sugar for type casting.
Who cares how it's implemented underneath? You can make a class<T> containing a vector<any> and wrap it so that operator[] always returns the T boxed up by the any. That's still generic code. Java still allows you to parameterize stuff over types. Not the world's best generics sure, but it's still type parameterization. C# is similar but goes rather further because it has generics in the VM so it doesn't need to do type erasure like Java does.
Nonetheless you now have to modify your silly semantic fiddling yet again because C# now fits your definition (not type erasure and runtime casting) but is reference based.
"Everyone else" uses incomplete definitions of generic. It's incomplete given the experience of generic programming, namely in C++.
You know, except not. If literally everyone else is using a different definition from you, then it is you that are wrong not them. Definitions are simply language and language is ultimately defined by how it is used. If you perversely insist on different definitions of terms from everyone else then you're always going to be talking at cross purposes.
Your "generic vector" does not allow generic algorithms to operate on them because of reference semantics and non-generic resource cleanup.
Tell me, what cleanup does std::sort require? The thing you're talking about is containers which is literally a subset of generic programming. Furthermore containers will work perfectly on all structs without pointers internally, which includes a lot of useful cases. The only time they don't work perfectly is if those pointers (and other resource handles) need cleanup: at that point you have to do the clean up yourself.
Of course you could always make the C style std::vector call a user-specified cleanup function. By making it one of the #defines rather than abusing void* it's even type safe.
All you're arguing is that without automatic resource cleanup, certain generic containers in certain cases require more effort.
The parametrization is incomplete if it doesn't include type-aware cleanup (and copy and move and even swap).
Reference heavy languages don't have the concept of move, because they don't need it. You can move any C struct with memmove or memcpy. As long as you don't clean up the sets of handles in both copies you're fine. That is also the way D works in a lot of cases. I invite you to tell me why D doesn't have generics...
Really?
Yes really, because it's easier to say "run out of memory" than "run out of memory or file handles or semaphores or thread handles or shared memory segment keys or inodes and so on". The point is precisely the same. It's also not a straw man because it's actually a real problem (running out of memory) that will really happen and almost certainly sooner than anything else if you never free. You well know this, but for some reason you want to engage in idiotic pedantry and fiddling with weird corner cases of definitions rather than have an actual discussion.
SJW n. One who posts facts.