Hejlsberg Talks About Generics in C# and Java
An anonymous reader writes "artima.com has a very interesting interview with Anders Hejlsberg - the Borland guy now at Microsoft who can best be defined as Mr. C# - doing all the stuff that Borland wouldn't let him do. He discusses generics in C#, Java (1.5) and C++. Naturally there is the chance of bias but he does raise some interesting points against Java's generics. Specifically that Java's genericised collections will have to box all primitive types as full objects, whereas C# does not. This is a big performance plus for C#. Java created the primitive types in the first place to address performance concerns but appears to be stepping sideways here. I can't help wondering if Sun has taken this approach to get the syntactic sugar in the language without requiring a bytecode change, but perhaps a future VM version will allow primitive generics (obviously forcing a bytecode regeneration)?"
" Specifically that Java's genericised collections will have to box all primitive types as full objects, whereas C# does not. This is a big performance plus for C#."
Do you have any references to back that up, or is this just "conventional wisdom"?
I'm not saying it's incorrect, but I'd wait until there are two implementations to compare.
Primitive types are boxed in C#, just automatically wrapped and unwrapped as required. But what he seems to fail to realise is that Java 1.5 is introducing this too, so that I will be able to define method(Object obj) and type method(12) and will receive a boxed Integer type. This should work fairly for generics too (I hope).
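For anyone who hasn't seen the 1.5 autoboxing yet, here is a minimal sketch of the method(Object obj) case described above. The class and method names (AutoboxingSketch, describe) are made up for illustration; the only assumption is a Java 1.5 or later compiler.

```java
final class AutoboxingSketch {
    // Accepts any reference type; a primitive argument has to be boxed first.
    static String describe(Object obj) {
        return obj.getClass().getSimpleName() + ":" + obj;
    }

    public static void main(String[] args) {
        // The int literal 12 is boxed to an Integer before the call is made.
        System.out.println(describe(12));

        // Unboxing goes the other way when an Integer is assigned to an int.
        Integer boxed = 12;
        int plain = boxed;
        System.out.println(plain + 1);
    }
}
```

The call site looks like it passes a primitive, but the method still receives a heap-allocated Integer, which is the point the summary is making about generic collections.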
Specifically that Java's genericised collections will have to box all primitive types as full objects, whereas C# does not. This is a big performance plus for C#.
The Java collections operate this way right now (in JDK 1.4) and AFAIK have operated this way since the beginning. You can't add a primitive type to a collection, genericised or not, it has to be a first class Object. In terms of performance, whether or not the conversion is done explicitly or automatically shouldn't matter. If anything, the automatic conversion with autoboxing should be faster because it can be optimized "under the covers" in the compiler or the JVM.
I don't see how this performance issue is related to generics in C# vs Java.
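To make the explicit-versus-automatic point concrete, here is a rough sketch of both styles side by side. The names (BoxedCollectionSketch, sumExplicit, sumAutoboxed) are hypothetical, and Integer.valueOf stands in for whatever wrapping the older code would have done by hand.

```java
import java.util.ArrayList;
import java.util.List;

final class BoxedCollectionSketch {
    // Pre-generics style: wrap by hand, then cast and unwrap by hand.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int sumExplicit(int[] values) {
        List list = new ArrayList();
        for (int i = 0; i < values.length; i++) {
            list.add(Integer.valueOf(values[i]));
        }
        int total = 0;
        for (int i = 0; i < list.size(); i++) {
            total += ((Integer) list.get(i)).intValue();
        }
        return total;
    }

    // Generic style: the compiler inserts the boxing, the cast and the unboxing.
    static int sumAutoboxed(int[] values) {
        List<Integer> list = new ArrayList<Integer>();
        for (int value : values) {
            list.add(value);
        }
        int total = 0;
        for (int i = 0; i < list.size(); i++) {
            total += list.get(i);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        System.out.println(sumExplicit(data) == sumAutoboxed(data));
    }
}
```

Either way every element ends up as an Integer object on the heap; the generic version only hides the ceremony.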
I'm a Java developer of 4 years and I'm unimpressed by generics. Why have all those <>'s dirtying up my code, only to enforce strong typing on my collections? If strong typing is really important, I can create my own strongly-typed collection. Otherwise, there's something called GOOD CODING, along with runtime exceptions, which enforce it. I don't see the need for all that extra ugly syntax just to enforce it at compile time.
I've been waiting for a long time for generics to get released for C#. At least now we know they're doing them properly. Sounds like they will outperform java significantly... at least with value types.
Not having to "box primitives" is certainly an advantage in that it makes the language cleaner. So I think C# is better because of this. But I find it hard to believe that this "improves performance". I would think the whole reason for treating primitives differently is to improve performance, as they can go through optimized code rather than the path taken by normal user-defined objects. So a collection of boxed primitives in Java may be slower than non-boxed primitives in C#, but this was done so that a collection of normal objects is faster.
Though really, I want to know why these languages have to be written so "primitives" are special at all. I would really like to be able to subclass an int or other built-in, and having "methods" on an int would be nice (even if you can't define new ones). The main reason is so you don't have to rewrite if you change your mind about whether type A is a primitive or something you define. Can't the byte compiler do a bit of work so any fast-path for the primitive can be used, without building in such restrictions to the syntax?
People complain about the dumbest things. I always laugh when someone complains about the performance of java. Guess what, if your app needs to be high performance, you shouldn't be doing it in java. If it doesn't need to be high performance but it needs to run on every os and computing device under the sun, then you should use java.
Programming languages are tools. Hammers for nails, screwdrivers for screws, C for hardcore big stuff and python/perl for everyday fun. Java was never intended to be high performance. It was intended to let you run a program on every single device and operating system there is, and have it run almost identically. The fact that java is as high performance as it is now is amazing. If you need performance and java isn't cutting it maybe you should ask yourself "why am I using a screwdriver to put a nail in this piece of wood?"
As for the specific issue of genericising and java's collections. I think java's collections rule. Sure you can't make a collection of primitive types, big deal. You can make an array of primitive types. And you can even make your own class, which inherits from collection and is in fact, a collection of primitive types. But just about everything in java is an Object anyway. So an ArrayList or a LinkedList are good enough 99% of the time. You keep on coding in C#, and when the world doesn't use windows anymore we'll see if your app is still around.
I read the article to see what the numbers were on performance differences. Turns out there were none, what a surprise! So as usual, in theory, perhaps maybe there might be a performance difference and maybe it might be significant or not, or both. Until real comparisons are made, safe to ignore.
What's more important is his other problem with Java generics which is that in runtime it's not possible to tell what type a collection is compiled as, this is to retain bytecode compatibility. So reflection will have no clue on any of this generics stuff, which isn't a deal breaker, it's just a downside I hadn't heard about before.
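The runtime side of this is easy to see for yourself. A small sketch (ErasureSketch and sameRuntimeClass are just illustrative names) that asks two differently parameterized lists for their runtime class:

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureSketch {
    // Compares the runtime classes of two objects.
    static boolean sameRuntimeClass(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();

        // Both lists are plain java.util.ArrayList instances at runtime;
        // the element type is not recorded on the object.
        System.out.println(sameRuntimeClass(strings, numbers));
        System.out.println(strings.getClass().getName());
    }
}
```

The type argument exists only for the compiler, so code that inspects the list object through reflection has nothing to find.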
Because Java is the best and C# is wrong and if this really is a performance difference then performance doesn't matter today but it probably isn't really a performance difference and what matters is clean syntax and open portability VM I bytecode everybody knows bean he needs to read blah, grouty, blah.
It seems many Slashdot readers not only fail to read the articles or others' comments, but can't even be bothered to read the whole little summary blurb before commenting.
Half the comments on this story are pretty much just template responses to anything that would compare Java and C# - and a good portion of the rest don't seem to have grasped what is stated in the summary.
My theory: posters realize that after the first 50 comments, they'll pretty much be talking to a black hole - so they comment earlier than their comprehension might dictate.
Let's not stir that bag of worms...
where it makes sense, but we are also very conscious about not sharing where you want the performance.
Welcome to the Microsoft business model. :)
I've been reading through the various segments of the interview, and I tend to buy most of what Hejlsberg says on various Java vs. C# issues.
But I keep coming back to the idea that the changes (or improvements) aren't enough. If you accept that all of the changes are improvements, that they make things better, they're still not enough to justify getting locked into a single vendor, or in learning new libraries.
C# cleans up some of Java's annoyances, which is great, but the annoyances just aren't big enough to make the shift worthwhile. That's the problem.
I think the libraries problem is huge for Microsoft. The java libraries are just getting to be so big, complex, and rich that it will be very hard to get people to move away from them.
I don't think that anyone says there aren't annoying things in Java, parts of it that wouldn't be done differently if the language could be redesigned from scratch. But those annoyances are liveable -- for the most part, you can deal with them.
Java has those libraries, though, and one of the reasons the libraries are so rich is that Sun opened up the process to other companies. MS is huge, they have a lot of smart guys, but I just don't think they can compete with Java's comparative openness.
That's the thing -- you can read about checked exceptions, and agree that it would be nice if java handled things more like C#, but it's not even close to being enough to overcome the value of java's openness vs. Microsoft's closed approach.
In the end, it really comes down to the business model.
The types of performance problems you're talking about are orders of magnitude away from the performance problems that users perceive when using Java applications though. The same problem exists in C#, and users will perceive it as slow too as soon as average and below average programmers start using C# for applications that people actually use.
When somebody says that a java application is painfully slow, the problem usually stems from the use of stock objects and the programmer's lack of understanding of the internals of these objects. There's a rich library of complex and convenient objects available for Java that allow Java programmers to quickly implement features that take much longer to implement in other languages, and programmers use them freely. Unfortunately many of the operations on these objects have a high order of complexity that is hidden by operators like a simple '+'. This fools the unknowing application writer into thinking this is a fast operation, even if it's not. When Java applications are written by skilled programmers who take the cost of object operations into account when they write their software, they are actually quite snappy, and on modern machines the performance gained through different primitive implementations is only visible in benchmarks and scientific applications.
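One common case of a cost hiding behind '+' is String concatenation in a loop. That this is the case the poster had in mind is my assumption, and the names below (ConcatCostSketch, joinWithPlus, joinWithBuilder) are made up for the sketch.

```java
final class ConcatCostSketch {
    // Strings are immutable, so each += copies everything built so far.
    // For n similar-sized pieces that is roughly quadratic work.
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // StringBuilder appends into a growable buffer, so the total work
    // stays roughly linear in the length of the output.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        // Same result either way; only the cost differs.
        System.out.println(joinWithPlus(parts).equals(joinWithBuilder(parts)));
    }
}
```

Both methods return the same string, which is exactly why the slow one goes unnoticed until the input gets large.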
Over and over and over again...
Bwahaha. Nice troll. Ask any serious programmer, and he/she will tell you that good coding practice calls for as many checks as possible during compilation, while it's still in the lab. "Whew, it got caught during runtime!" is a shitty excuse when the run time in question was the demo in front of the customer, or during actual use on the flight deck, etc.
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
I've never used generics. From what I understand, it's a way to keep a group of different objects in a collection without having to worry about what type of object they are when you retrieve them.
I can't think of any useful application where I need a collection of different types of objects. In some case where I might (a shopping cart) the items can just implement an interface to guarantee me the operations I need to perform on different objects.
Why would I need to collect a String, an int and some user-defined object all in one place? Sounds sort of like the junk drawer in the kitchen that everybody has.
It's simple: I demand prosecution for torture.
The types of performance problems you're talking about are orders of magnitude away from the performance problems that users perceive when using Java applications though.
The .NET framework uses native code libraries for the GUI. It will never have the same perception of "slowness" that java has.
I disagree here.
Think of Moore's law, processor speeds, etc... Java is a fast enough language for doing just about anything a user needs to do. Even if java were only 25% as fast as native code, that would be 2 cycles of Moore's law, 36 months, 3 years ago. (And, java is much faster than 25% of C, check here)
3 years ago, users were all doing the same things they are doing today.
A couple of exceptions apply, of course: scientific computing, games, etc, tax the hardware pretty heavily.
But, the primary reason that Java is perceived as slow by users is the terrible speed of the GUI.
All the widgets are implemented in Java directly. This is almost like the same exception as game software, since all this rendering code involves moving around lots of memory, etc...
The GUI matters more than anything to user perception of slowness.
An old 14mhz 68000 amiga often "felt" faster than a 50mhz 386, because the amiga's os/gui were very responsive, while the 386 was running win3.1
Look at the recent developments with the linux kernel. Compare X responsiveness with a preemptible low-latency kernel, and how the whole machine "feels" better.
By going with preemption and low-latency, the overall throughput of the machine is actually slightly slower. But it feels loads better.
C# and the
In the end, what I like most about java isn't the language itself, but the culture of being verbose and clear about the design and intent. C# is just an updated C++, which is nice, but it retains the same culture. What culture is that? To put it bluntly, it is do whatever you can to get the code done as fast as possible and don't worry about the design because we will rewrite it anyways. Guess what? From first-hand experience, that type of culture leads to software that costs 3x more to maintain than to develop. In the end, you might save a day or two doing it the Microsoft way, but you end up having to spend the same amount of time to fix bugs and add features. How is that supposed to be sustainable or better?
Just what is it about Denmark? One small little Scandinavian country is responsible for both C++ (Bjarne Stroustrup) and C# (Anders Hejlsberg). Do they have a particularly good education system when it comes to Computer Science or what?
What I'm saying is it *doesn't* make the C# implementation faster. If C# is written intelligently, the best it can do is make it the *same* speed.
Now C# could in general be written better and do the *same* job faster. And I agree that this ability makes the language *better*. But saying this better feature of the language is the *reason* it does something faster is bogus. I am pretty certain the artificial differentiation between primitives and objects in both languages is in order to make them more efficient, and removing that restriction by definition will slow them down, or at least not make them faster.
One other area where java gets a bad rap for being slow is in startup time. I write a lot of small java apps and continually loading the jvm adds about a second to the startup time on my machine. I've gotten around this by creating a RAM disk at startup and copying the jvm executable and jars over. This gets rid of the noticeable startup time. JVMs should come with the option of loading into memory at startup...it would help dispel the myth that java, by nature, is too slow.
But you're right about the GUI...that's the single biggest fuck up that Sun made. IBM has done better as SWT comes pretty close to being fully responsive (though it still makes it possible to be unresponsive if you write bad code.)
I would agree with you if it were impossible to make a responsive GUI in Java. It isn't. Most unresponsive Java GUI's are slow due to poor choice of object operations when responding to events, not because it's using 'non-native' GUI calls, which at worst add a layer of library indirection.
The slowness we're talking about isn't typically milliseconds, it's whole seconds. You're talking about graphical sluggishness, which is a whole different issue, and one that most users don't even notice.
<oversimplification>So for example, an int[10] takes 40 contiguous bytes (4 bytes per int) on the heap. That block contains the actual values of the ints. An Integer[10] also takes 40 contiguous bytes on the heap, but now those are just references (4 bytes per object ref, on x86 at least) to 10 4-byte blocks, potentially scattered around the heap.</oversimplification>
Imagine how much longer it takes to compare two Integer[]'s than two int[]'s. The former involves following a reference to get each value, the latter is just pointer addition. Furthermore, primitives allow the JIT compiler to create far more optimized native code. (Your hypothetical "bit of work" would have to fall back any time a public method takes an int argument, and ripple from there on inward whenever that reference is passed.)
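A rough sketch of the two comparisons described above, just to show where the extra indirection sits. The names (ArrayCompareSketch, sameInts, sameBoxed) are illustrative, and the boxed version assumes the arrays contain no null elements.

```java
final class ArrayCompareSketch {
    // Values sit inline in the array, so each step is an indexed load and a compare.
    static boolean sameInts(int[] a, int[] b) {
        if (a.length != b.length) return false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    // The array holds references, so each step first follows a reference
    // to an Integer object and only then compares the wrapped values.
    static boolean sameBoxed(Integer[] a, Integer[] b) {
        if (a.length != b.length) return false;
        for (int i = 0; i < a.length; i++) {
            if (!a[i].equals(b[i])) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameInts(new int[] {1, 2, 3}, new int[] {1, 2, 3}));
        System.out.println(sameBoxed(new Integer[] {1000, 2}, new Integer[] {1000, 2}));
    }
}
```

Note that the boxed loop has to call equals: comparing two Integer references with == checks object identity, not the wrapped value.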
Not to say there isn't a class of problems/programmers for which subclassable "primitives" may be appropriate and valuable. If you truly find yourself in that situation, it might be time to consider Python or Ruby.
Allowing primitive type arguments in generics would not necessarily require any changes to the VM or bytecode.
The Java 1.5 implementation of generics is substantially based on the Pizza compiler, which allowed primitive type arguments without requiring boxing (links: the pizza compiler, the GJ compiler it evolved into, some academic papers about the compilers).
If I remember correctly, the pizza compiler generates separate classes for the different primitive types. It needs a different class loader, but generates classes for an ordinary VM.
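If that recollection is right, the effect is roughly what you get by writing a per-primitive collection class by hand. The sketch below is hand-written, not Pizza output, and the IntList name and its methods are hypothetical.

```java
import java.util.Arrays;

// A list specialized for int: values are stored unboxed in an int[].
final class IntList {
    private int[] items = new int[8];
    private int size;

    void add(int value) {
        // Double the backing array when it is full.
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2);
        }
        items[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return items[index];
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        IntList list = new IntList();
        for (int i = 0; i < 20; i++) {
            list.add(i * i);
        }
        System.out.println(list.get(19) + " of " + list.size());
    }
}
```

A class like this runs on an ordinary VM with no bytecode changes; the cost is one such class per primitive type instead of a single shared one.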
Time for a snack.
On the other hand, IIUC, MS intends to write large sections of Longhorn in C#.
Maybe that's why it's not due until 2006: they have to wait for commodity hardware to get fast enough to run it responsively.
Sounds like they will outperform java significantly... at least with value types.
I know it's too much to ask for on slashdot... But I guess if you had actually read the fucking article (which was interesting) you would know that you are an idiot.
The article clearly states that there is a performance benefit for both value types (which don't have to be boxed) and reference types (which don't have to be downcasted).
There.
I'm assuming that you read the article, and are therefore just dumb. Let me spell it out. The article says (all in theory here) that C# generics are faster for both primitives and reference types.
You wrote: Yes, but this is apparently because it is *better written*, not "because it can do primitives"
Wrong. C# generics are (in theory) faster than Java's because
A) primitives don't have to be boxed because generics are supported by the run-time.
B) references don't have to be downcast at runtime.
Did you *read* the article???
I'll skip the rest of your moronic post.
I mean, how many value types are there already? So you get a couple instantiations of the class instead of one. And if you are really worried about footprint, you can use only reference types in your list.
But Matlab has a cool way of using Java objects, and Java objects have a less-fuss connection to Matlab than MEX-C++. And for tight loops in pure numerical code, Java oddly enough is actually pretty efficient.
JNI also lets you create and operate on Java objects just like Matlab does -- you have to load a VM into your process, but hey, that is what invoking "java myclass" at the command line does. As a vehicle for creating plugins for a C++ program, Java .class files are much more compact than .dll files which tend to link in so much extra . . . stuff.
The JNI is really meant for Java calling into C/C++ to get at some low-level system stuff, but it allows C++ to call Java -- it has to have this capability for the low-level C++ system routine to get stuff back from the Java environment.
Does Mono support C++ calling a C# module? I know there is a MS/VS.NET solution for this, and it is a lot cleaner than JNI. All you need to do is use VS.NET to build an "unmanaged" (i.e. ordinary Windows app) C++ program, and then you need to use some kind of GCHandle template class to take care of the reference into managed code, and then you just invoke methods off that handle as if it were managed C++.
Given that the Mono project doesn't quite have the ambition to make everyone use C# everywhere for every last thing, and given cross-language development (Python-C++, etc) is getting a lot of attention, is there a Mono solution to this? You know, if you could call C# from C++, you could just as well call C# from Python by having a C++ module in the middle. Is there any interest in this capability?
You really can't say anything positive about Microsoft here. I go and say that C# may be "better written" and people jump like crazy to call me an idiot. Hey: there are real intelligent people working at Microsoft! I don't like what they do all the time, but they are not stupid.
Face it, Java does not do primitives in the containers. It does not do them "slower", it just DOES NOT DO THEM!!!
Therefore C# is not "faster". It is instead, better designed.
And I would like it if somebody came up with a language that did not have these artificial restrictions in it. C# is closer but it still treats primitives different all over the place.
Wrong. A C# generic container of reference objects (aka non-primitives) is in theory faster than a non-generic C# container or a generic Java container or a non-generic Java container - because there is no need to do a checked downcast when accessing an object in the container.
Can I spell that out any more clearly for you? Any big words that you didn't quite get?
Man, I think I'm being trolled here - are you really this dumb?
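The checked downcast on the Java side can be made visible with a little heap pollution. This is only a sketch with made-up names (CheckedCastSketch, readFailsWithCast); it shows that a generic Java list still casts on the way out, and says nothing about how large the cost is.

```java
import java.util.ArrayList;
import java.util.List;

final class CheckedCastSketch {
    // Returns true if reading from the generic list fails with a ClassCastException.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean readFailsWithCast() {
        List<String> strings = new ArrayList<String>();

        // A raw reference to the same list lets an Integer slip in.
        // Nothing is checked at runtime when the element is added.
        List raw = strings;
        raw.add(Integer.valueOf(42));

        try {
            // The compiler inserted a cast to String on this read,
            // and that cast is what fails.
            int length = strings.get(0).length();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readFailsWithCast());
    }
}
```

The failure shows up at the read, not at the add, because the list itself holds plain Object references and the cast lives in the calling code.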
There are two types of collections: 1) collections of value types, and 2) collections of reference types.
Collections of value types are usually faster because they don't require: 1) individual heap storage for each element, and 2) dereferencing references to elements.
Java doesn't have generic collections of value types.
C# will potentially on-demand JIT compile many different versions of a generic collection, one for all reference types, and one for each value type.
The performance of Java's generic collections will be similar to C#'s 'reference type' instantiation of a generic collection. But C#'s 'value-type' instantiations will usually be faster than both C#'s and Java's 'reference-type' versions, in some cases significantly faster.
Most java programmers aren't that sharp.
That's why java doesn't have multiple inheritance, makes you check every exception, doesn't allow operator overloading, doesn't have call-with-current-continuation or any other power expression in it.
It's the Fisher-Price of the programming languages. It's the plastic knife of gourmet cutlery. All the sharp edges have been rounded, and it's terrible to use, the results aren't satisfying, but it's safe, and you only need a minimal amount of ability to use it. And you can pay rock bottom rates for that kind of work, and offshore it to India as well.
It's like frets on a violin. Yeah sure, you can get music out of such a thing, but does it really make you want to pick it up and play it on your own time?
Contrary to what the interview claims, there are. Please refer to Bjarne Stroustrup's page: http://www.research.att.com/~bs/bs_faq2.html#constraints
In addition, "The compiler checks it, but you could also be doing it at runtime with reflection, and then the system checks it". It is a waste to double-check at runtime what has been guaranteed at compile time.
BUT he is lying, it is actually the other way around. ;)
I don't understand what Mr. Hejlsberg is saying here. Does he imply that C# code is checked before you compile it, and the programmer is prevented from typing the wrong thing, using their super-secret-mind-reader-thingy, or what? No, what he is saying is that if the code compiles, C# guarantees that the syntax is correct. Well, that's a new one, I wonder when Microsoft will patent that idea? Blast, I didn't think there would be any prior art to prevent this patent, but there is: any/most computer languages that compile code also do this!
C++ does not guarantee anything - until you compile the code! (And even if you compile it any function could throw an exception.)
One will not create boost::spirit, blitz++ or anything else but some container classes with that. What he is saying is that C# generics is really inheritance/implementation of one or more interfaces. With C++ templates the 'constraints' are implicitly created for each operation you use on the template parameter, you don't need to state anything, it just works.
There are constraints, but they're not by type... They're by supplied function signatures. Any function called on the template parameter "T" must exist in the datatype passed in the <>s.
So, you can't restrict by base class at compile time like you can with C#.
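For comparison, here is what restricting by base type at compile time looks like, sketched in Java with a bounded type parameter (the rough counterpart of a C# where clause). Every name in it (BoundedTypeSketch, HasBar, Holder, Widget) is hypothetical.

```java
final class BoundedTypeSketch {
    // The constraint is a named type rather than an implied function signature.
    interface HasBar {
        String bar();
    }

    static final class Widget implements HasBar {
        public String bar() {
            return "Widget.bar";
        }
    }

    // T must implement HasBar, so calling bar() is checked once, at the declaration.
    static final class Holder<T extends HasBar> {
        private final T value;

        Holder(T value) {
            this.value = value;
        }

        String callBar() {
            return value.bar();
        }
    }

    public static void main(String[] args) {
        Holder<Widget> holder = new Holder<Widget>(new Widget());
        System.out.println(holder.callBar());
        // Holder<String> would be rejected by the compiler: String does not implement HasBar.
    }
}
```

The trade-off is the one discussed above: the bound has to be spelled out and implemented explicitly, where a C++ template only requires that the operations it uses happen to exist.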
T
---- It puts the lotion on its skin or else it gets the hose again. It does this whenever it's told.
I love that .sig! I'd've thought figuring out endian-ness would require much more code than that.
Sorry, mods. I'll take my off-topic medicine like a man.
Or will I?,Anonymous Coward
I'm a die-hard Java guy, but I couldn't help but cede the point Hejlsberg made about reflection. Does it bother anyone else that Java is incapable of giving you the type parameter via reflection? It bothers me...
I admit, it's a small thing, but reflection amounts to meta-programming, and that can get really messy if you start allowing inelegancies to creep in. Is this an unsupported statement that I'm making off-the-cuff, with little or no experience to back it? Well sure, but it does seem right, doesn't it? :-)
sev
My ex-gf was flakier than a French pastry crust perched on De Gaulle's shoulder.
but have you considered the following argument: shut up.
Hejlsberg points out that C++'s strong typing can be circumvented using templates, whereas C# still does strong type-checking on generic types. I wrote a little useless program to prove this to myself:
Here, you can have an A of B as well as an A of C. You can have an A of any type as long as the type has a const function called "bar" that takes no arguments.
I was so excited about generics and then the MSFT genius comes out and ruins it for me but pointing out the flaws. I thought Java was perfect! But I guess I was wrong.
Actually, he makes some good points about the benefits of C#'s generics. Especially about how in Java you can't get the type by reflection. That bites! Now our debuggers won't be able to show us what type each variable is unless it compiles its own metadata.
Boo!