Interviews: Ask Alexander Stepanov and Daniel E. Rose a Question
An anonymous reader writes "Alexander Stepanov studied mathematics at Moscow State University and has been programming since 1972. His work on foundations of programming has been supported by GE, Brooklyn Polytechnic, AT&T, HP, SGI, and, since 2002, Adobe. In 1995 he received the Dr. Dobb's Journal Excellence in Programming Award for the design of the C++ Standard Template Library. Currently, he is the Senior Principal Engineer at A9.com. Daniel E. Rose is a programmer and research scientist who has held management positions at Apple, AltaVista, Xigo, Yahoo, and is the Chief Scientist for Search at A9.com. His research focuses on all aspects of search technology, ranging from low-level algorithms for index compression to human-computer interaction issues in web search. Rose led the team at Apple that created desktop search for the Macintosh. In addition to working together, the pair have recently written a book, From Mathematics to Generic Programming. Alexander and Daniel have agreed to answer any questions you may have about their book, their work, or programming in general. As usual, ask as many as you'd like, but please, one per post."
Alexander Stepanov, I have never had a chance to ask someone as qualified as you about this topic. I grew up on the opposite side of the Iron Curtain and have constantly wondered if (surely there must have been) alternative computing solutions developed in the USSR prior to Elbrus and SPARC. So my question is whether or not you know of any hardware or instruction set alternatives that died on the vine or were never mass fabricated in Soviet times? I don't expect you to reveal some super advanced or future predicting instruction set but it has always disturbed me that these things aren't documented somewhere -- as you likely know failures can provide more fruit than successes. Failing that, could you offer us any tales of early computing that only seem to run in Russian circles?
If you can suggest references (preferably in English) I would be most appreciative. I know of only one book and it seems to be a singular point of view.
My work here is dung.
This is more for Daniel Rose, but to what do you attribute the seeming decline in the quality of search results? I used Digital's Alta Vista search engine when it was fairly new and it seemed revolutionary and seemed to provide me with exactly what I wanted. Over time that declined and Alta Vista as it was ceased to be, and Google initially also seemed to provide me with exactly what I wanted. Now it seems like I have to put a whole lot of thought into faking Google into performing a somewhat-boolean-style search for me, and normal boolean expressions themselves no longer seem to work.
Is this the result of attempting to dumb-down the interface for tailored results, or something else or more insidious? Obviously the amount of content on the Internet is growing, but the computing power to process through all of it is growing too, so I would expect it wouldn't be getting this much worse, this quickly.
Do not look into laser with remaining eye.
Have you had regret nightmares since unleashing STL?
I'm a huge fan of the STL, and I think the design has stood the test of time amazingly well.
That said, you now have a bunch of hindsight. What would you do differently knowing what you know now?
Also if you were doing it today and using today's languages, how do you think it would differ?
SJW n. One who posts facts.
My question is similar: when will programming evolve to use subject-predicate syntax, rather than function-argument?
Function-argument goes back (at least) to Frege, and his prejudices against subject-predicate syntax (which dominates natural languages). But isn't changePassword(a,b) more ambiguous than "change the password from a to b"? Don't we get an "information gain" effect from using a syntax we are familiar with outside of programming? When you first come to a function-argument command such as (in Oz, which is used in the Paradigms of Computer Programming MOOC) {Push S X}, there is maximum entropy as to whether S is pushed, or pushed onto. "Push X onto S" has no entropy; you know immediately, from the syntax alone, what is pushed onto what.
I remember reading you settled on C++ back in the 90s to implement STL.
Why the choice for an existing language and not craft your own language?
Some "real world" programming languages already have that feature. The foremost is Objective C, widely used for programming Apple products, which I believe inherited it from Smalltalk. Compare this Objective C fragment:
myColor = [UIColor colorWithRed:127/255.0 green:127/255.0 blue:127/255.0 alpha:1.0];
with the equivalent Java code:
myColor = new Color(127, 127, 127, 255);
A singular feature of Objective C, as compared with other languages, is that the method signature is composed of all the intermediate words (adverbs or prepositions) in that particular order. For example the method above is referred to as colorWithRed:green:blue:alpha:. You must use them in the same order, otherwise you get a compilation error, because you might be invoking an entirely different method. This is consistent with subject-predicate usage in natural languages, where "tell X by Y to Z" may be different from "tell X to Y by Z".
Other languages, mainly dynamic ones (Python, Groovy, etc.) take a hybrid approach, where you can pass named parameters, but their presence or ordering is not taken into account when choosing which method to invoke. Some of them (Groovy) allow the programmer to give meaning to the position of the arguments, when needed, while others don't.
How do I achieve excellence in programming and design? I have been working as a programmer for a year now, but I don't know what I am doing.
Russian programmers and hackers are famous for being extremely smart. This is also a reputation your fellow countrymen hold from chess tournaments and other competitions of the mind.
In your opinion:
Thanks
Alex: I regard my first encounter with the STL (very shortly after its first public release) as one of the great eye-opening moments in my software development career. Unfortunately, as I'm sure you well know, quality of implementation issues in compiler support for the C++ template idiom cultified (i.e. made cult-like) the deeper principles for at least five (if not ten) years thereafter.
GotW #50
I've long regarded the criticism of vector[bool] (I'm not going to fugger with angle brace entities) for not being a container as misguided. Of course, it *must* be a container for reasons of sanity, but to portray the problem as a standardization committee brain fart seems to miss the main point.
Just as STL introduced a hierarchy of iterator potency (that was the main technical innovation behind the STL, was it not?) one could likewise introduce a hierarchy of container potency. The container we ended up with returns iterators which promise a dereference operator returning an lvalue (it's been a long time since I've used this terminology), which is why the following statement from the linked discussion is expected to work:
typename T::value_type* p2 = &*t.begin();
But actually, of all the uses of containers found in the wild, I highly doubt that more than a small percentage (potentially a very small percentage) exploit the property that iterator dereference returns an lvalue rather than an rvalue.
The net effect is that the standard containers promise us a potency we rarely exploit, yet the burden of this potency is universal. Forsake it in even the smallest way, and you'll be shouted out of the room for non-containerhood.
We could have handled vector[bool] by changing the standard container to not promise IDLV (container iterators dereference to lvalue). In cases where the programmer goes ahead and tries to do this, he or she obtains a simple syntax error (ha ha ha) and knows to either reformulate the algorithm to not require this property or to go back and add a specification override to the container setting the IDLV property to true.
With IDLV set, vector[bool] does not specialize.
With IDLV unset, vector[bool] will specialize.
Problem solved, except for the language overhead of introducing (and managing) a container strength hierarchy.
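For anyone who hasn't run into this, here is a minimal sketch of what the missing IDLV property looks like in practice. The names iterators_dereference_to_lvalue and flip_first are made up for illustration (they are not in the standard library); the check only asks whether dereferencing a container's iterator yields a genuine value_type&.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical trait-style check (not part of the standard library):
// true when *it is a real value_type&, i.e. when
//   typename C::value_type* p = &*c.begin();
// would compile for container type C.
template <typename C>
constexpr bool iterators_dereference_to_lvalue() {
    using deref_type = decltype(*std::declval<typename C::iterator&>());
    return std::is_same<deref_type, typename C::value_type&>::value;
}

// vector<int> keeps the promise; vector<bool> hands back a proxy object.
static_assert(iterators_dereference_to_lvalue<std::vector<int>>(),
              "vector<int> iterators dereference to int&");
static_assert(!iterators_dereference_to_lvalue<std::vector<bool>>(),
              "vector<bool> iterators dereference to a proxy, not bool&");

// The proxy still supports reading and writing a single bit.
// Precondition: v is non-empty.
inline bool flip_first(std::vector<bool>& v) {
    auto bit = *v.begin();  // a proxy object, not a bool&
    bit = !bit;             // writes through to the packed storage
    return v[0];
}
```

So most code that only reads and writes elements keeps working; it is only the address-of-element idiom that breaks.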
But instead, Herb Sutter decides to write this:
Doesn't that attitude make you want to pound your head upon a table somewhere? Seriously, if one repeats that remark 1000 times, we could almost make the entire STL go away (and return to the world we would have had instead had the STL not rescued us from parsimony mass produced.)
Clearly, there was enough of a pain point in the C++ standardization effort around iterators that the STL gained traction exceedingly quickly (and very late in the day), yet the C++ community is also extremely hidebound about minor pain points, as evidenced by Sutter's explanatory tack.
Obviously, there were some advantages in demonstrating that the STL approach could achieve performance comparable to C (and in some cases, better than C) in proving that the STL was not just another abstraction gained at the expense of runtime overhead (which all looks fine until five or ten different runtime overheads—however small each of these appears in isolation—begin to interact adversely).
But very quickly, the initial quality of implementation issues and the quirky (to be extremely kind) limitations of the C++ template mechanisms threw up some major walls in pursuing the underlying ideas behind the STL more extensively.
So, my question is this, more or less: in retrospect, was the early victory with C++ worth it (it's extremely easy to underestimate the value of having a good idea noticed at all), or does the eternal puberty of the C++ STL continue to grate?
How much of your time do you dedicate to computing vs doing other things; what are your other hobbies or is the work you do also your play time?
How difficult would it be to eliminate heap overflows, buffer overruns & stack exploits on the current x86 PC compatible architecture?
javac references imported classes from the .class(es) rather than, say, a plaintext header file. Java bytecode doesn't typically preserve parameter names*. Hence the only information the compiler sees for Color is a constructor with 4 int parameters.
* (apparently if you compile with the -g option, for debugging, that info is preserved. Or with -parameters with Java 8)
We see many programming languages with at least some support for generics, but usually as a second class citizen, and often added as an afterthought in later releases, and subordinate to some other programming paradigm. Java is primarily OO, with generics added later. C# is also primarily OO, though with generic support. It took C++ several iterations to get generics, and C++ is "multi paradigm". Go doesn't have generics, and doesn't seem like it will for a while.
It seems to me like generic programming is sufficiently powerful as a paradigm to not need other paradigms like OO in the same language. In fact, in many ways, OO, which ties together data and algorithms, seems antithetical to generic programming.
So, do you see a possibility of a programming language whose primary paradigm is generic programming? Why do language designers not get generics into the first releases of their languages, even now, when the issues would seem to be well known? What would such a language look like?
Alex, I saw some of the interesting programming lecture videos from the A9 lab. What is the outcome of your teaching? Do you think the audience applied your methodologies in their daily programming work?
Often times I search for something and I want to search for a regular expression. Is there any technique that allows for indices on such?
Is there a domain that is better for generic programming than others? When Alexander talks about efficiency in "Elements of Programming", it seems that generic programming is specifically targeted for writing libraries. So, are there areas that benefit more from generic programming (like libraries) and areas that benefit less, such as user interfaces?
What is your current view on Dijkstra's classic claim:
"go to statement considered harmful"
in Communications of the ACM / March 1968
Anyone can tell you Boost is a mixed bag. It was begun as an incubator for future std libraries, and it's succeeded: most of the new libraries in C++11 were Boost libraries first. Shared_ptr, chrono, random, regex... all good libraries. That's not to say all the Boost libraries should be in the standard, or are fit for that.
The STL is about three decades old. In that time, we've seen both OS and hardware evolution. What is the impact of these changes on how the STL should be used? How would the STL be different if it were implemented targeting modern environments?
Mr Stepanov, do you follow the ongoing efforts to standardise Concepts as a language feature in ISO C++? If so, how do you think it's going? Would you do anything differently? To what extent do you think the standard library should provide Concepts? I recall learning about Concepts from the SGI STL doc many years ago and being very surprised to find that things like ForwardIterator weren't part of the ISO spec.
STL was a pretty radical departure from the way classes and libraries were designed pre-STL. I am very keen to know a bit about the history of STL’s inclusion into the standard.
When you originally proposed STL for adoption into the C++ standard, how receptive / enthusiastic was the C++ committee towards STL? What design decisions / compromises did you have to make to get it accepted? How much resistance did you face?
For example, you have noted that it took a major effort to convince the committee that vector must be contiguous. Were such instances common?
STL has been wildly successful and has pretty much completely changed the way libraries are designed not just in C++ but also in other languages. Most mainstream languages have added facilities to write generic code.
When designing and proposing STL for inclusion into the standard, did you expect it to be this successful? Why do you think it has been this successful?
In your book "Elements of Programming", you spend a lot of time on concepts. The paper "A Concept Design for the STL", the basis of the latest concept design for C++, references your book extensively. You of course co-authored that paper. I am therefore quite keen to hear your views on C++ Concepts.
Do you think that language support for concepts (or equivalent constructs like Haskell typeclasses) is important for writing generic code? How deeply are you involved in the effort to get concepts into the C++ standard?
As we all know, C++ is far from perfect. There are several features which you discuss in your books and papers, like concepts and UNDERLYING_TYPE, which C++ is currently missing but which have been proposed for C++17 (e.g. destructive move). However, there are things you have criticized before, like the memory allocation interface, which are still as they were 25 years back.
What do you dislike the most about C++? What would you change or add to the language to make it better?
The iterator based approach of STL works very elegantly for 1 dimensional data structures but fails to generalize cleanly for higher dimensional structures. For example, there is no easily defined way of iterating over a 2d array or a graph. Also, the notion of regular types, discussed in your book Elements of Programming, also fails to generalize for 2 or higher dimensional types, like complex numbers and matrices. They lack the total ordering property.
Of course, you can artificially define an ordering, say force a row-by-row iteration over a 2D-Array or a breadth-first iterator over a tree or an artificial ordering on complex numbers, but such constructs feel artificial. Do you think this limitation is fundamental to the iterator based design approach?
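To make the "artificial ordering" concrete, here is a minimal sketch of one such imposed order on complex numbers: lexicographic comparison on (real, imaginary). The names LexicographicLess and sorted_lexicographically are invented for this illustration. The order is a valid strict weak ordering for std::sort when no NaNs are involved, but it has no arithmetic meaning (it is not compatible with multiplication, for instance), which is the sense in which it feels bolted on.

```cpp
#include <algorithm>
#include <complex>
#include <vector>

// Hypothetical comparator: orders complex numbers by real part,
// breaking ties by imaginary part. Assumes no NaN components.
struct LexicographicLess {
    bool operator()(const std::complex<double>& a,
                    const std::complex<double>& b) const {
        if (a.real() != b.real()) return a.real() < b.real();
        return a.imag() < b.imag();
    }
};

// Returns a copy of zs sorted under the imposed ordering.
inline std::vector<std::complex<double>> sorted_lexicographically(
        std::vector<std::complex<double>> zs) {
    std::sort(zs.begin(), zs.end(), LexicographicLess{});
    return zs;
}
```

std::complex deliberately provides no operator<, so the comparator has to be passed explicitly wherever an ordering is required.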
In your book you mention Euclid's Elements. It is the oldest continuously used textbook in history, but is it really relevant to a computer science education?
The STL iterator approach has proven very successful for data structures like lists/vectors/trees/etc. Yet for data structures that have a segmented structure such as deque, hash tables, sub-matrices, B-trees, STL has a high abstraction cost for iterators, forcing you to make extra unnecessary checks when comparing two iterators. (see for example Matthew H. Austern's paper "Segmented Iterators and Hierarchical Algorithms": http://lafstern.org/matt/segme...) Do you believe STL should be extended with an additional iterator abstraction to handle such data structures (like Austern's proposed segmented iterators) or do you believe it's too difficult to handle such data structures in an optimal generic way?
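A rough sketch of the hierarchical idea, for readers who haven't seen it. This is not Austern's interface; Segmented, fill_segmented and count_segmented are hypothetical names, and a vector of vectors stands in for a real segmented container such as a deque. The point is only that the outer loop walks segments while the inner algorithm runs over a plain contiguous range, so no per-element segment-boundary check is needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a segmented container: a sequence of chunks,
// each of which is contiguous on its own.
using Segmented = std::vector<std::vector<int>>;

// Hierarchical fill: iterate over segments, then delegate to std::fill
// on each contiguous chunk.
inline void fill_segmented(Segmented& s, int value) {
    for (auto& segment : s) {
        std::fill(segment.begin(), segment.end(), value);
    }
}

// Hierarchical count, structured the same way.
inline std::size_t count_segmented(const Segmented& s, int value) {
    std::size_t total = 0;
    for (const auto& segment : s) {
        total += static_cast<std::size_t>(
            std::count(segment.begin(), segment.end(), value));
    }
    return total;
}
```

A flat iterator over the same structure would have to test "am I at the end of this chunk?" on every increment, which is the overhead the question refers to.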
Eric Niebler recently proposed a major TS (https://github.com/ericniebler/range-v3) that aims to refactor STL to introduce ranges. It also breaks compatibility in minor ways (e.g. by allowing for two iterators specifying first and last positions to be of different types), so it will likely be introduced as a new additional library to eventually replace the existing STL, while being very close to compatible. Given that such a rewrite is open to making some changes that break compatibility, are there any other modifications you'd like to see? And what else would you like to see in a C++17 version of STL?
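As a small illustration of the "first and last of different types" point, here is a sketch that is not taken from range-v3; NullTerminator and find_until are hypothetical names. The end of the range is described by a sentinel type that compares equal to a char pointer when that pointer reaches the terminating '\0', so the algorithm never needs to know the string's length up front.

```cpp
// Hypothetical sentinel type marking the end of a null-terminated string.
struct NullTerminator {};

// A const char* "reaches" the sentinel when it points at '\0'.
inline bool operator==(const char* p, NullTerminator) { return *p == '\0'; }
inline bool operator!=(const char* p, NullTerminator s) { return !(p == s); }

// A find-like algorithm where the end marker may have a different type
// from the iterator. Returns the position of the first match, or the
// position at which the sentinel was reached.
template <typename Iterator, typename Sentinel, typename T>
Iterator find_until(Iterator first, Sentinel last, const T& value) {
    while (first != last && !(*first == value)) {
        ++first;
    }
    return first;
}
```

With the classic two-iterators-of-one-type interface, the same search would first need a pass over the string (strlen or similar) just to construct the end iterator.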