Sun's JVM is good, but not great. There are lots of people working on JVMs out there -- there is *no lack* of open source JVMs. There must be at least thirty JVMs out there, not counting variations produced by a single company. AFAIK, IBM's JVM is the highest-performance thing out there (for Linux at least) and if we're demanding that something be open-sourced based on the fact that it's really good, I'd like to see IBM open-source theirs.
IBM's code is based on Sun's code, so IBM can't change the license unless Sun says so. Kaffe is unmaintained and so pretty worthless, AIUI. gcj isn't bad at what it can do, and there is the Java -> mono converter. But generally everyone uses either the Sun or IBM JVMs.
It works fine. We have had no problems with the current system.
Of course we don't have any problems, no one is using it. Sure, internal company code is often using J2EE, but that is heavily influenced by coming from Solaris boxes. The only core application using it is Eclipse, which is heavily backed by IBM pushing their view of the world (and might be available in FC2)... even the Java in OpenOffice is just causing all the distributions to patch out all the code again.
First off, it helps your argument to state that the C++ spec was standardized in 1998, not 1999.
You're right. For some reason I thought both C and C++ had come out in the same year.
Standardization always precedes implementation.
That is not true; the original C std. was basically a statement of what was already implemented... thus the strncpy() and puts() warts.
So how long is too long?
One year? Two? Six months? It took the STLPort folks 2-3 years for a working implementation after the official spec was published.
STLPort is not a complete environment; if your compiler doesn't support templates properly (or at all) then STLPort isn't a solution. Also very few implementors do "ports" by taking the entire runtime with them from platform to platform... mainly because everything you use (i.e. third party libraries) now also needs to use the new runtime.
Personally I've dealt with C++ code that's been "in production" for 8-10 years or more.
If I use std::string or std::vector in my code, it compiles and it works no matter the implementation.
Do you want a NULL-terminated string? Use .c_str().
So now you agree? What you originally said isn't true. std::string implementations on Solaris/NT seem to do the same thing for .data() and .c_str()... so when moving to Linux, say, you can have bugs (it also doesn't help that RogueWave's string type has .data() as the .c_str() equivalent).
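To make the .data()/.c_str() difference concrete, here is a minimal sketch of the portable pairing. It assumes the 1998 rules, where only .c_str() promises a terminating NUL and .data() is only safe alongside .size(). The helper names c_length() and raw_bytes() are made up for illustration and don't come from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: length of the text as a C API would see it.
// c_str() is NUL-terminated on every conforming implementation.
std::size_t c_length(const std::string &s)
{
    return std::strlen(s.c_str());
}

// Hypothetical helper: copy the raw bytes without assuming a terminator.
// data() is only ever paired with size(), never handed to strlen().
std::vector<char> raw_bytes(const std::string &s)
{
    std::vector<char> out(s.data(), s.data() + s.size());
    assert(out.size() == s.size());
    return out;
}
```

Code that hands .data() to strlen() or printf("%s") is the kind that happens to work on one runtime and breaks on the next.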
Use of the other two is deprecated per the 1998 standard.
You couldn't read that part of the spec in the six years that it's been published?
I never said "I do", and it doesn't matter if the code still has to run on platforms that still use the older behaviour.
Bah. GCC 3.x has had a ISO-98 standard C++ library for the last two years.
And it was first released in a real product, RHEL 3, in Oct. 2003. You seem to be confused between the real world, and "writing toy apps. in my bedroom". If you think C++ is wonderful, that's fine, have your delusions... and if you don't care that it's been 10 years in the making and that only now is there starting to be something closer to C++ compilers, that delusion is also fine. But to suggest that the implementation of the runtime is as similar across platforms as Java, .net or even python and perl is just an obvious and easily refuted untruth.
If I use std::string or std::vector in my code, it compiles and it works no matter the implementation. You are confusing warts with fundamental deficiencies.
Well unless you use .data() instead of .c_str(), where NT/Solaris use the same implementation for both and libstdc++ doesn't... or you want to use <strstream.h> er I mean <strstream>... errr I mean <sstream>
Put another way, I would class it as fundamentally deficient that C++ only got its first std. in 1999 and the first time you could get something that comes close to a std. C++ implementation is within the last few months.
Funny you should mention this, because I was going to write something about pipes. Getting pipes right with good error semantics is hard.
You should separate the syntax of pipes from the implementation of calling pipe(). It should be trivial for the shell to transform "foo | bar" into "foo -> __internal; bar -< __internal;", it also seems like it shouldn't be that hard to use the pipe() on the backend but they should at least get the syntax right.
I'm also not convinced that you wouldn't want a more high level view, so you can move the script around machines... and possibly add dependencies between machines (like Tivoli). And, unsurprisingly, the throw/catch type error detection is just as buggy as real error detection.
but $45 for tens of millions of lines of code that is the single most important element of the PC (how great is that PC minus software)? Whoa, that's just unacceptable!
How great is that PC without RAM, or a hard disk? (you going to run XP from floppies?)
It's also very misleading to call it "tens of millions of lines of code", most people didn't move to XP because they required all of the changes from win2000 or even win98... they would have been happy with tens of thousands of lines of code, for new drivers etc.... or not even that much for those NT4 machines which are now in "screw you" mode.
Making the leap from shell programming to lower level languages is simple.
*laughs*, I think not. One could argue that going from ksh to perl is relatively easy, but I wouldn't call perl "lower level" than ksh... and I'm also not sure how easy it really is. sh scripts tend to be fairly small and self contained so indentation etc. doesn't become a factor.
It also goes against the idea that for a good understanding you want to go "up" like... HW -> asm -> C -> perl
My specific gripe was Redhat 7.1, I think -- the ssh version is woefully out of date and vulnerable.
And my probably badly phrased retort was that if the entire OS is out of date there is an obvious solution: upgrade the OS. The 95% use case (IMO) is being able to upgrade certain application(s) (those that the user(s) care significantly more about than the rest) when most of the system is at the latest stable version... ergo, security updates should almost certainly be got in the normal way.
This is what the external Fedora trees, and debian unstable/testing try and do... and they are much closer to something that people can realistically use and not screw themselves with.
From a system standpoint, it doesn't really matter how much spaghetti is under the hood, does it?
I would have to say yes, it does, esp. when talking about LD_LIBRARY_PATH etc. which have to be preserved in all the environments you want to run an application. One of the minor problems on my boxes atm. is that "ssh foo cmd" stopped working last OS upgrade because "cmd" is in $HOME/bin and for some reason that isn't in the PATH when just running the command over ssh (but if I ssh and then run the command manually it works fine). Also setuid apps. will just not honor LD_LIBRARY_PATH.
I think you totally missed the point. With this system, you can install as many new/old/cvs versions of openssl you wish.
This is not true, maybe I didn't explain it very well: libraries link to openssl... so if libA links to openssl and appB links to libA and openssl, you cannot just compile appB if you have previously upgraded openssl (and the symlinks are using the new version). Your examples used static libraries, but I assume that was just a typo... and it doesn't help you anyway, it'll just not give a compile time error some of the time.
The biggest hit you currently take here is gtk/gnome/app. But you have to be able to manage this problem all the way down to libc changing the "public interface ABI" -- and realistically you should try managing some of the private ABI changes of glibc (which I'm guessing would be a lot worse in a stow based system).
Ever try installing a current SRPM of openssh onto an older Redhat release? It's a nightmare! The RPM requires a current version of openssl, but the KDE libraries all require openssl 0.9.5 (or some such). You just cannot get it to work.
Without upgrading you mean, ok fair enough. Although I'm not sure why I'd care about what version openssh is (unless it's a security errata -- in which case it's coming from my vendor anyway). But using a better example of evolution or whatever...
So you may now be thinking to yourself, "Okay, that's kinda useful. But when you have hundreds (or thousands) of apps, your PATH would be insanely long. This just won't work."
That's a good point -- but there's a solution. Use the "lndir" command from within /mfs, to link your desired package into the root /mfs directory
Ta-da you've now got the same problem plus a billion symlinks. You could argue that putting a serious amount of crack in the PATH and/or LD_LIBRARY_PATH could save you... I'd disagree, openssl is pretty notorious for breaking backward compat. on library upgrades. So you are likely to end up with something that installs but doesn't work. And the deps. won't be half as good as any of the major distros, as you are basically doing download tarball, type ./configure -- see what fails -- then, eventually, run and see what fails.
But another great benefit is that you can change the stable version on the fly, on a live system, simply by pointing that stable link to the version you wish it to now point to (yes, sometimes between versions you gain or lose files, so this is not perfect).
Understatement of the century. Be very afraid of any "package management" system that hand waves away problems like this (and god help you if you are doing anything like "obsoletes" markings... which again all the major package management solutions have).
Yes, it runs across more platforms, but the core code across all of them is strikingly similar.
Actually not even that is true, apache 1.3.x and 2.0.x are very different (and currently you can still buy something with one or the other from all major Linux vendors).
Most Apache exploits to date have been completely cross-platform exploits, meaning that it really is more of a monoculture than you might think.
Err, I don't think so. I have seen ones that had a table of different "known" offsets for FreeBSD and Linux... but I haven't seen any that worked on ppc, Sun and alpha as well as ia32... and then you have some people using their own compile maybe with StackGuard/exec-shield/mudflap features turned on. Sure you can still play the percentages, but it's far from the same thing IMO.
Sure, I can write unit tests for functions that take data and spit out data, but for functions that take sql and spit out html, I just can't seem to find a good solution. I have tried to maintain a "testing" sql database with carefully selected data, but it quickly becomes out of sync with the real development databases.
First, IMO, html is the UI... and the best way to test that isn't with a unit test (that I've seen). You want all your UI functions to just glue sql/data commands to the UI. Then you can "easily" test the sql/data commands with a test database and a bunch of function calls... then you'll know that any bugs you find in the html part must be the fault of the UI code (which should be fairly small).
As far as the testing DB goes, I'd recommend having the testing data outside the DB and using something to init. the database just before you run your unit tests. This should be easier to keep up to date, and solves problems of testing alterations to the DB.
To get concrete about this, I can write a function that adds a user to the database by calling a function with predetermined values. Except, once I call it, I can't use it again because the system detects the existence of that user account and acts differently. We don't delete accounts (by policy requirements), but just disable them, so I can't even just yank the account.
Having an external representation of the DB that can be reloaded should solve this problem.
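As a rough sketch of what I mean, with a std::map standing in for the real database (an assumption purely to keep the example self-contained): the fixture lives outside the "DB", and reload() throws away whatever the last test left behind. UserTable, kFixture, reload() and add_user() are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Stand-in for the real database: user name -> "disabled" flag.
typedef std::map<std::string, bool> UserTable;

// Test data kept outside the "DB"; the contents are made up.
struct Fixture { const char *name; bool disabled; };
static const Fixture kFixture[] = {
    { "alice", false },
    { "bob",   true  },   // disabled rather than deleted, as per the policy
};

// Throw away whatever the last test left behind and reload the fixture.
void reload(UserTable &db)
{
    const std::size_t n = sizeof kFixture / sizeof kFixture[0];
    db.clear();
    for (std::size_t i = 0; i < n; ++i)
        db[kFixture[i].name] = kFixture[i].disabled;
    assert(db.size() == n);   // fixture names are expected to be unique
}

// Code under test: refuses to add an account that already exists.
bool add_user(UserTable &db, const std::string &name)
{
    if (db.count(name))
        return false;
    db[name] = false;
    return true;
}
```

Calling reload() at the top of every test means the "add a user" test can run as often as you like, and the disabled accounts are just more rows in the fixture.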
If you know tools/methods/whatever that address this kind of programming, I'm eager to check them out!
If you look at the code coverage report for the parent example, you'll see that a common cause of less than 100% code coverage is multiple return points in functions.
That's a simplification, a significant portion of it now seems to be gcc coverage bugs (i.e. lines marked not covered that I can step through in a debugger), but yes, a lot of the difficult areas are to do with branches.
If you write your function to have only a single return point at the end of the function, you can increase the code coverage, while making the code easier to read. You then have to ensure your function's control structures send control to the end of the function (no gotos, you understand!). If you try to achieve this, your programs become more logically structured, easier to test and maintain.
And I'd argue that this isn't the same thing. Moving from "if (!foo1()) return NULL; if (!foo2()) return NULL; return bar();" to "if (ret && !foo1()) ret = FALSE; if (ret && !foo2()) ret = FALSE; if (ret) return bar(); return NULL;" doesn't buy you anything IMO. But if you can move to "foo1(); foo2(); return bar();" then yes, that's much more likely to be working code (as you've divided the number of possible paths by 3).
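Spelled out as a sketch, with foo1(), foo2() and bar() replaced by trivial hypothetical stand-ins so that it compiles: both versions have exactly the same three paths, the second one just hides them behind a flag.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for foo1(), foo2() and bar().
static bool foo1(int x) { return x > 0; }
static bool foo2(int x) { return x % 2 == 0; }
static const char *bar() { return "ok"; }

// Early returns: three exits, three paths.
const char *early(int x)
{
    if (!foo1(x)) return NULL;
    if (!foo2(x)) return NULL;
    return bar();
}

// Flag variable: one exit at the bottom, but the same three paths remain.
const char *flagged(int x)
{
    bool ret = true;
    if (ret && !foo1(x)) ret = false;
    if (ret && !foo2(x)) ret = false;
    return ret ? bar() : NULL;
}

// The two forms agree on whether a given input succeeds.
void check_equivalent(int x)
{
    assert((early(x) == NULL) == (flagged(x) == NULL));
}
```

The tests you need for flagged() are the same three you need for early(), which is why I don't count it as an improvement.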
On one project where we set ourselves a 100% coverage target, and followed the above coding standard (along with others), we produced some excellent code.
Did you hit 100%? If so then I'm impressed, out of curiosity what was the ratio of lines of code to lines of tests?
Of course, it's really hard to exercise some exceptional states, and 100% coverage doesn't mean you have bullet-proof code, but it's another metric to use to assess your code quality.
Sure, because there's that single (sic) path through the code that goes to all the same places but has a different state when it gets to the end :(. It'd be interesting to see something with 100% code path coverage.
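A made-up example of the difference: scaled() below has two independent branches, so four paths. Calling it with (true, false) and (false, true) executes every line and both outcomes of each if, which a line coverage report will show as 100%, yet the (true, true) path is never run and that's the one that trips the assert.

```cpp
#include <cassert>

// Hypothetical function with two independent branches, so four paths.
int scaled(bool reset, bool divide, int x)
{
    int divisor = 2;
    if (reset)
        divisor = 0;            // only safe if we never divide afterwards
    if (divide) {
        assert(divisor != 0);   // trips on the (reset, divide) path
        return x / divisor;
    }
    return x;
}
```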
#1: No, the bugs aren't in the UI itself, the bugs originate through the UI. Users can do things in such a way that you simply can't predict.
Then the unit tests could/should have been better. Sorry, it really is that simple.
#2: But you know, if you have time, there are LOTS of things you can do that might help, or might not. I stopped writing unit tests thinking that they one day *might* catch a bug.
True, and if you can get a large amount of free debugging (like a prominent OSS project) then any personal testing (like writing unit tests) might not be worth it for you, because getting X hundred people to just use it is going to exercise the normal paths of code pretty well. Of course that isn't going to find subtle bugs or security flaws (and unit testing isn't guaranteed to either, but IMO it's much more likely).
I don't hack away at code, so I don't think unit tests will save me from something I don't do. My interfaces are developed exactly at the level of quality called for; no more and no less. So in the sense that unit tests can help me "think"...I never found myself lacking in that area. Perhaps it's something that helps less experienced programmers ramp up...I don't know. I've been programming for 20+ years and I'm alright in that regard.
Of course you don't intentionally write bad code, and neither do I (and I'll admit I assumed I made far fewer mistakes before I tested it all). But I presume you wouldn't compile C without any warnings turned on, or not use prototypes in header files. Why? Because the computer is much better at telling you that... Duh! fprintf("%s\n", foo); isn't what my brain wanted my hands to type. This is how I think of unit tests: they aren't there to catch design mistakes, because I'm good at that... they are there to "prove" the implementation of the design "works".
I'm also pretty sick of seeing/using other people's code and it having obvious bugs in it... "testing is not having to say you're sorry". For instance, when testing Vstr I found 4 or 5 bugs in glibc with some of the more esoteric uses of the *printf() functions when using the double formatters. When I found them, half of me was happy... the other half depressed.
My main OSS project is a window manager. How do you unit test that? It is nearly impossible without writing your own test X server and the like. Just not worth it.
Creating mock objects is much simpler than creating a "test X server", although admittedly a wm is slightly harder than normal as you can't take the easy route of running a separate version on your main display.
However, taking a quick look at blackbox, textPropertyToString() is the only thing in Util.cc that couldn't trivially be unit tested, and at least all of i18n.cc and Timer.cc could be as well. That's 3% with basically no changes.
I did unit testing (once upon a time), and even developed my own test suite for C++, but I find that it catches VERY few bugs and I end up spending time writing unit tests AS WELL AS hunting down bugs the same old ways I always have.
Sorry to let you know but, you didn't write good unit tests and probably did waste your time. I've found very close to 100% of the bugs in Vstr, a network IO string library, using unit tests. That includes a couple of ones that would have been damn hard to track down otherwise.
However, it's been over a year since 1.0.0, which had a unit test for every function and every function option, and getting from there to the last release, which has over 99% code coverage, found a couple of weird corner case issues (not just bugs, but optimizations that could never be reached for some reason). And going from 98% coverage to 99% coverage took a significant time investment, and required significant thinking about how the tests should be written.
As with much software development, it's easy to write simple tests that don't show much and aren't very useful. It's much harder to write tests that find bugs (and you have to approach writing the tests with a very different mindset to how you approach writing the code you are testing). This is not even close to being "Like picking lint from your belly-button."
I'm far from convinced that TDD is actually a good approach. Although it's pretty obvious that without testing the code is often trivially buggy, and unit testing is the cheapest way to perform testing. For instance this kind of thing is all too easy to do with TDD.
For unit tests you want to write your code, and then look at the best set of unit tests to do complete code coverage. For an OSS example of that you can look at the Vstr string library and the code coverage for that project.
Over half thought it was a real virus, and clicked it to see "What would happen" or "If it would work." Please note that this was only a couple weeks after "I Love You." infected half the computers on the network, and a company wide meeting about NOT opening attachments that you weren't expecting.
Half of them thought it was a real virus and opened it anyway.
My guess is that they'd seen how they'd basically got "time off" when the computers/network went down. And so, like rats pressing the button when the light comes on, they did the same again next time the opportunity came along.
We didn't follow the LSB because it was a standard specific.
Who is this "we" you are talking about... in your opinion all the *BSD variants, or a specific BSD that you are involved with? If the latter, did you actually try and become involved with LSB to fix the points you found too specific?
I'd also argue that something like LSB needs to be as specific/detailed as possible. SuS etc. are pretty general in places, and I'd like something that said if you want to configure your system clock go alter /etc/sysconfig/clock... want to add a daemon, call chkconfig --add. But it isn't even that specific.
Re: The replacement is already here (on "United Linux Dead")
I get more done because of my chutzpah and sometimes, I admit, arrogance. You gotta get attention for ideas to get them done.
This is the same mantra that RMS cries: correcting people to say "GNU/Linux" gets attention and possibly gets people who will help. It misses the point that every time you do it that way you alienate people too.
We can get Oracle on board. It might take some time, but we can get their customers to bring them there.
Oracle is one of the more important ISVs, but there are more than a few ISVs certified for RHEL. And just saying "user demand" will solve all your problems is provably not true... where's the debian or gentoo certification? I know United Linux had a hell of a time getting certification from Oracle, and they had something that looks like the std. RHEL box that corporate users are used to.
If a read() uses a page-aligned buffer, from a page-aligned source, then why wouldn't the OS map a page directly into the application space? (Assuming that the area had not been mmap'd shared). The same optimization can be made on write() calls.
Because the app. doesn't share the data with the OS, so if the app. alters the data the OS needs to have set up COW so the data it sees stays the same. It is also very rare for applications to use page aligned buffers to read or write, and very common for them to change the buffer just after calling read or write. This makes it a bad trade off to set up COW mappings in the general case, as it hurts all the normal apps.; the ones that want this kind of zero copy operation are already using sendfile() and/or mmap().
To make sure that I wasn't just blowing steam, I did a little bit of looking around, reading some research papers...
I don't have all of the links, but a lot of them were within a few clicks of the so-called The Memory Management Reference website
Well it's hard to argue with no links :). But your point that large objects are easier on a malloc()/free() seems like common sense. It's much more often that you'd write a custom allocator for small objects.
Refcounting is actually expensive, more so than GC. In multi-threaded apps, you have to use an interlocked instruction to update the refcount, or perhaps even make a function call.
Again with the threads? :). Yes, I've heard this argument before, however... 1) You don't need to do locked refcounting. If you do, you are almost certainly sharing too much information between the threads. Locking should not be done "automatically" inside objects. This will almost certainly lead to the threads serializing against each other. 2) GC needs something similar so you can have quick destruction. This is often especially noticeable when an app. is written assuming GC: at this point the application will assume that doing "x = new foo();" is just as cheap as "foo x;", then objects of very different lifetimes will be intermixed and the GC needs to know to free the "stack like" objects quickly. For instance python has both a GC and a ref counting system, and this seems to make most people happier.
The function calls to malloc and free tend to blow the TLB and cache almost as effectively as collections. The function calls to specialized allocators often do so as well. The difference is that GC is only called once in a while (relatively), so the TLB and cache are only blown a few times a second instead of every few hundred allocations.
Yes, malloc/free are not simple functions. However I'd disagree that custom allocators can blow the cache. Often the "allocation" happens with a test and two pointer assignments (for instance Vstr does this -- actually one pointer assignment per object, and another for the entire group), and deallocation with just two pointer assignments. I fail to see how a GC could compete with this. In threaded programs this is even more pronounced as you don't have to do any locking.
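For anyone who hasn't written one, a minimal sketch of the kind of free list I mean. This is not the actual Vstr code; Node, NodePool, pool_get() and pool_put() are names invented for the example, and it assumes a single thread owns the pool, so there is no locking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct Node {
    Node *next;      // links spare nodes together while on the free list
    char  buf[64];   // hypothetical payload
};

struct NodePool { Node *spare; };   // initialise with { NULL }

// Fast path: one test and two pointer assignments.
Node *pool_get(NodePool &pool)
{
    Node *n = pool.spare;
    if (n) {
        pool.spare = n->next;
        n->next = NULL;
        return n;
    }
    // Slow path: only reached when the free list is empty.
    n = static_cast<Node *>(std::malloc(sizeof(Node)));
    if (n)
        n->next = NULL;
    return n;
}

// Deallocation: two pointer assignments, nothing is handed back to malloc.
void pool_put(NodePool &pool, Node *n)
{
    assert(n != NULL);
    n->next = pool.spare;
    pool.spare = n;
}
```

Once the pool has warmed up, the general purpose allocator is out of the picture until the program asks for more nodes than it has ever held at once.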
Then you look at things like filesystems, which have had GC like properties for a long time... and the ones that are the best at managing space are always the ones that do the most work at allocation time. The filesystems that have been designed to "make allocation fast" tend to royally screw up management of the space under a bunch of conditions. And I don't see why GC would be any better at this.
The one paper on stats. that was linked directly from the Boehm C collector was Memory Allocation Costs in Large C and C++ Programs (1994), which is almost 10 years old :(. But I guess Boehm hasn't changed much (although I'm not sure I'd say the same for malloc/free).
This paper doesn't suggest to me that GC is very close to malloc/free, with it being worse in both space and time (perl taking 125% of the time and 275% of the memory). Xfig was the only app. that was faster CPU wise (by 0.3 of a second) and it was almost half a MB bigger.
Personal observation also suggests this, as when gcc recently moved from malloc/free to using Boehm it got both slower and bigger.
However I would probably be happy to take the hit on some of those packages tested... mainly due to the fact that they aren't performance sensitive to me.
Aggregation of the info is irrelevant. The fact that some system makes it easy to collate and/or find the information doesn't change the fact that the information is *already* out there. How can it have any privacy concerns beyond its public existance in the first place?
Making it easier is a big thing though. For instance it's possible for someone to find out my social security or credit card numbers by just stealing information from the right place(s). This is not particularly well kept information, I'd imagine most people on /. could do it. However I'm not likely to post them on my website, as that makes it much easier.
In the same way, in the example I gave in the previous post. Matching up all the information, who I am... where I live based on the photos, etc. is non-trivial for someone to do manually. However, if you can just say give me a list of people who match profile X. Then that is a significant difference.
Let me prefix this by saying that I didn't want to get into a GC flamewar; as I said, I think GC can be used in some applications where you don't care about the negative side effects. I only replied to you as you seemed to be giving one of the better "Let's burn everything and rebuild using only GC" arguments. Possibly some of the difference is due to you working for MSFT and me being at RHAT, but I doubt it... and if you don't hold it against me I'll do the same :). So anyway...
Caching slab, lookaside lists, whatever, you're still calling some kind of allocator with some kind of cost. There are definitely applications where memory allocation/deallocation follows a pattern that allows that kind of optimization to work well.
All applications follow a pattern that allows custom allocators to work better than general ones, like GC or just calling malloc(). This is because the custom allocator is a very specific version of the former, and so they have more innate knowledge of when to allocate, free, etc. Custom allocators also tend to have programmer control built in, so you can do things like unshare data if the reference count is one... which is very hard to do easily in a general purpose GC.
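A small sketch of the "unshare if the reference count is one" trick, using a plain unlocked counter on the assumption that a single thread owns the buffer. Buf and the buf_*() functions are hypothetical, and the fixed 32 byte payload is only there to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Buf {
    unsigned    refs;      // plain counter: one owning thread, no locking
    std::size_t len;
    char        data[32];  // hypothetical fixed size payload
};

Buf *buf_new(const char *text)
{
    Buf *b = new Buf;
    b->refs = 1;
    b->len = std::strlen(text);
    assert(b->len < sizeof b->data);
    std::memcpy(b->data, text, b->len + 1);
    return b;
}

Buf *buf_ref(Buf *b) { ++b->refs; return b; }

void buf_unref(Buf *b) { if (--b->refs == 0) delete b; }

// Return a buffer the caller may modify: reuse it in place when nobody
// else holds a reference, otherwise drop ours and hand back a private copy.
Buf *buf_writable(Buf *b)
{
    if (b->refs == 1)
        return b;
    Buf *copy = buf_new(b->data);
    buf_unref(b);
    return copy;
}
```

The point is that the allocator can see the count, so the common unshared case costs one comparison and no copy. A general purpose GC doesn't hand you that information cheaply.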
You seem to be under the impression that running the GC isn't going to blow your cache... why is that?
If the garbage collector itself causes additional page faults, then yes, it isn't an advantage. But the argument was that when compared with page faults (something we've all been willing to accept, something that hasn't caused an unacceptable performance hit), GC is small potatoes. In addition, modern GC implementations take the amount of memory available on the machine into account, and they actively work to keep memory usage under the level that would start causing page faults.
I said cache, page faults are a similar but different case. If your GC is on another thread then it can be doing cross CPU invalidation which will stall your application. If it's running in the same thread, when it runs it will evict some of your application text/data from cache... also possibly introducing latencies.
Collectors do "waste" a LOT of CPU cycles moving memory around, but it only happens once in a while. Your CPU can move memory at speeds measured in hundreds of megabytes per second, so the compaction only takes a few milliseconds, and it only happens every few seconds at worst.
That 100s of MB a second assumes one linear piece of memory, and this is generally not the case... but even assuming it can just eat CPU to move it and not affect the cache, and you have the CPU to spare... then you have to have an algorithm to work out where it goes, without blowing the cache, taking a fault or taking long enough to introduce latency to the app.
I suppose that your programs never wait for anything? Never block on IO? You always have something worthwhile to calculate while you're waiting for that IO operation to complete? Somehow I doubt that. The vast majority of applications spend most of their time waiting for events of some sort -- waiting for a socket connection, a keystroke, file IO, etc. Most CPUs are idle 99% or more.
It obviously depends on the application. But yes I've written/advised a few applications where CPU speed was a factor. Sure my text editor could probably be in Java instead of C and I wouldn't care... and I probably wouldn't care if my spreadsheet was too, but someone who runs a large calculation might.
If you actually want threads... which many of us don't.
Welcome to the 21st century, buddy. Threads are here, they're useful, and if used properly, they increase performance,
IBM's code is based on Sun's code, so IBM can't change the license unless Sun says so. Kaffe is unmaintained and so pretty worthless, AIUI. gcj isn't bad at what it can do, and there is the Java -> mono converter. But generally everyone uses either the Sun or IBM JVMs.
Of course we don't have any problems, no one is using it. Sure internal company code is often using J2EE, but that is heavily influenced by comming from Solaris boxes. The only core application using it is Eclipse, which is heavily backed by IBM pushing their view of the world (and might be available in FC2) ... even the Java in OpenOffice is just causing all the distributions to patch out all the code again.
You're right. For some reason I thought both C and C++ had come out in the same year.
That is not true, the original C std. was basically a statement of what was already implemented ... thus. the strncpy() and puts() warts.
STLPort is not a complete environment, if you're compiler doesn't support templates properly (or at all) then STLPort isn't a solution. Also very few implementors do "ports" by taking the entire runtime with them from platform to platform ... mainly because everything you use (Ie. third party libraries) now also need to use the new runtime.
Personally I've dealt with C++ code that's been "in production" for 8-10 years or more.
So now you agree? What you originally said isn't true. std::string implementations on Solaris/NT seem to do the same thing for .data() and .c_str() ... so when moving to Linux say, you can have bugs (it also doesn't help that RogueWave's string type has .data() as the .c_str() equivilent).
I never said "I do", and it doesn't matter if the code still has to run on platforms that still use the older behaviour.
And it was first released in a real product, RHEL 3, in Oct. 2003. You seem to be confused between the real world, and "writing toy apps. in my bedroom". If you think C++ is wonderful, that's fine, have your delusions ... and if you don't care that it's been 10 years in the making and there are only starting to be something closer to C++ compilers now, that delusion is also fine. But to suggest that the implementation of the Runtime is as similar across platforms as Java, .net or even python and perl is just an obvious and easily refuted untruth.
Well unless you use .data() instead of .c_str() where NT/Solaris use the same implementation for both and libstdc++ doesn't ... or you want to use <strstream.h> er I mean <strstream> ... errr I mean <stringstream>
Put another way I would class it as fundamentally deficient that C++ only got its first std. in 1999 and the first time you could get something that comes close to a std. C++ implementation is within the last few months.
You should separate the syntax of pipes from the implementation of calling pipe(). It should be trivial for the shell to transform "foo | bar" into "foo -> __internal; bar -< __internal;", it also seems like it shouldn't be that hard to use the pipe() on the backend but they should at least get the syntax right.
I'm also not convinced that you wouldn't want a more high level view, so you can move the script around machines ... and possibly add dependencies between machines (like Tivoli). And, unsurprisingly, the throw/catch type error detection is just as buggy as real error detection.
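For reference, here is a rough sketch of what "use pipe() on the backend" amounts to for "foo | bar", assuming a POSIX system and C++11. The function name run_pipeline() is hypothetical, and a real shell does a lot more (job control, signals, redirections).

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs "producer | consumer" and returns the consumer's exit status,
// or -1 if the plumbing itself failed.
int run_pipeline(char *const producer[], char *const consumer[])
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t left = fork();
    if (left == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (left == 0) {
        // Child 1: stdout becomes the write end of the pipe.
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(producer[0], producer);
        _exit(127); // only reached if exec failed
    }

    pid_t right = fork();
    if (right == -1) {
        close(fds[0]);
        close(fds[1]);
        waitpid(left, nullptr, 0);
        return -1;
    }
    if (right == 0) {
        // Child 2: stdin becomes the read end of the pipe.
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(consumer[0], consumer);
        _exit(127);
    }

    // The parent must close both ends, or the consumer never sees EOF.
    close(fds[0]);
    close(fds[1]);

    int status = 0;
    waitpid(left, nullptr, 0);
    if (waitpid(right, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Nothing in this depends on how the user spelled the pipeline, which is the point: the syntax and the backend are separate decisions.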
Do you know where you read that? Or what the subject etc. was?
I assume they mean that adoption is slow but steady. Whereas the adoption of win32 was fast, but is now declining.
How great is that PC without RAM, or a hard disk? (you going to run XP from floppies?)
It's also very misleading to call it "tens of millions of lines of code", most people didn't move to XP because they required all of the changes from win2000 or even win98 ... they would have been happy with tens of thousands of lines of code, for new drivers etc. ... or not even that much for those NT4 machines which are now in "screw you" mode.
*laughs*, I think not. One could argue that going from ksh to perl is relatively easy, but I wouldn't call perl "lower level" than ksh ... and I'm also not sure how easy it really is. sh scripts tend to be fairly small and self contained so indentation etc. doesn't become a factor.
It also goes against the idea that for a good understanding you want to go "up" like... HW -> asm -> C -> perl
Actually they do play music videos, it's just not that often (and at weird times like 7am and 3am). TiVo kind of solves the problem.
And my probably badly phrased retort was that if the entire OS is out of date there is an obvious solution: upgrade the OS. The 95% use case (IMO) is being able to upgrade certain application(s) (those that the user(s) care significantly more about than the rest) when most of the system is at the latest stable version ... ergo security updates should almost certainly be got in the normal way.
This is what the external Fedora trees, and debian unstable/testing try and do ... and they are much closer to something that people can realistically use and not screw themselves with.
I would have to say yes, it does, esp. when talking about LD_LIBRARY_PATH etc. which have to be preserved in all the environments you want to run an application. One of the minor problems on my boxes atm. is that "ssh foo cmd" stopped working last OS upgrade because "cmd" is in $HOME/bin and for some reason that isn't in the PATH when just running the command over ssh (but if I ssh and then run the command manually it works fine). Also setuid apps. will just not honor LD_LIBRARY_PATH.
This is not true, maybe I didn't explain very well: libraries link to openssl ... so if libA links to openssl and appB links to libA and openssl you cannot just compile appB if you have previously upgraded openssl (and the symlinks are using the new version). Your examples used static libraries, but I assume that was just a typo ... and it doesn't help you anyway, it'll just not give a compile time error some of the time.
The biggest hit you currently take here is gtk/gnome/app. But you have to be able to manage this problem all the way down to libc changing the "public interface ABI" -- and realistically you should try managing some of the private ABI changes of glibc (which I'm guessing would be a lot worse in a stow based system).
There is already stow.
Without upgrading you mean, ok fair enough. Although I'm not sure why I'd care about what version openssh is (unless it's a security errata -- in which case it's coming from my vendor anyway). But using a better example of evolution or whatever...
Ta-da you've now got the same problem plus a billion symlinks. You could argue that putting a serious amount of crack in the PATH and/or LD_LIBRARY_PATH could save you ... I'd disagree, openssl is pretty notorious for breaking backward compat. on library upgrades. So you are likely to end up with something that installs but doesn't work. And the deps. won't be half as good as any of the major distros, as you are basically doing download tarball type ./configure -- see what fails -- then, eventually, run and see what fails.
Understatement of the century. Be very afraid of any "package management" system that hand waves away problems like this (and god help you if you are doing anything like "obsoletes" markings ... which again all the major package management solutions have).
Actually not even that is true, apache 1.3.x and 2.0.x are very different (and currently you can still buy something with one or the other from all major Linux vendors).
Err, I don't think so. I have seen ones that had a table of different "known" offsets for FreeBSD and Linux ... but I haven't seen any that worked on ppc, Sun and alpha as well as ia32 ... and then you have some people using their own compile maybe with StackGuard/exec-shield/mudflap features turned on. Sure you can still play the percentages, but it's far from the same thing IMO.
First, IMO, html is the UI ... and the best way to test that isn't with a unit test (that I've seen). You want all your UI functions to just glue sql/data commands to the UI. Then you can "easily" test the sql/data commands with a test database and a bunch of function calls ... then you'll know that any bugs you find in the html part must be the fault of the UI code (which should be fairly small).
As far as the testing DB, I'd recommend having testing data outside the DB and use something to init. the database just before you run your unit tests. This should be easier to keep up to date, and solves problems of testing alterations to the DB.
Having an external representation of the DB that can be reloaded should solve this problem.
The only DB thing I've seen is for Java that was on testdriven.com.
That's a simplification, a significant portion of it now seems to be gcc coverage bugs (i.e. lines marked not covered I can step through in a debugger), but yes a lot of the difficult areas are to do with branches.
And I'd argue that this isn't the same thing. Moving from "if (!foo1()) return NULL; if (!foo2()) return NULL; return bar();" to "if (ret && !foo1()) ret = FALSE; if (ret && !foo2()) ret = FALSE; if (ret) return bar(); return NULL;" doesn't buy you anything IMO. But if you can move to "foo1(); foo2(); return bar();" then yes, that's much more likely to be working code (as you've divided the number of possible paths by 3).
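To make that concrete, here is a small sketch of the two shapes side by side. The Steps struct is a hypothetical stand-in for foo1(), foo2() and bar(), with flags so a test can drive each branch and a counter to record how many steps ran.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for foo1(), foo2() and bar().
struct Steps {
    bool foo1_ok;
    bool foo2_ok;
    int calls; // how many of foo1/foo2/bar actually ran

    bool foo1() { ++calls; return foo1_ok; }
    bool foo2() { ++calls; return foo2_ok; }
    const char *bar() { ++calls; return "bar"; }
};

// Early-return style: three paths (fail at foo1, fail at foo2, success).
const char *early_return(Steps &s)
{
    if (!s.foo1()) return NULL;
    if (!s.foo2()) return NULL;
    return s.bar();
}

// Single-exit style: the same three paths, just spelled with a flag.
const char *single_exit(Steps &s)
{
    bool ret = true;
    if (ret && !s.foo1()) ret = false;
    if (ret && !s.foo2()) ret = false;
    if (ret) return s.bar();
    return NULL;
}
```

Both functions call the same steps in the same order for every input, so the rewrite changes the layout and not the number of paths you have to test.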
Did you hit 100%? If so then I'm impressed, out of curiosity what was the ratio of lines of code to lines of tests?
Sure because there's that single (sic) path through the code that goes to all the same places but has a different state when it gets to the end :(. It'd be interesting to see something with 100% code path coverage.
Then the unit tests could/should have been better. Sorry, it really is that simple.
True, and if you can get a large amount of free debugging (like a prominent OSS project) then any personal testing (like writing unit tests) might not be worth it for you, because getting X hundred people to just use it is going to exercise the normal paths of code pretty well. Of course that isn't going to find subtle bugs or security flaws (and unit testing isn't guaranteed to either, but IMO it's much more likely).
Of course you don't intentionally write bad code, and neither do I (and I'll admit I assumed I made much fewer mistakes before I tested it all). But I presume you wouldn't compile C without any warnings turned on, or not use prototypes in header files. Why? Because the computer is much better at telling you that... Duh! fprintf("%s\n", foo); isn't what my brain wanted my hands to type. This is how I think of unit tests, they aren't there to catch design mistakes because I'm good at that ... they are there to "prove" the implementation of the design "works".
I'm also pretty sick of seeing/using other people's code and it having obvious bugs in it ... "testing is not having to say you're sorry". For instance, when testing Vstr I found 4 or 5 bugs in glibc with some of the more esoteric uses of the *printf() functions when using the double formatters. When I found them, half of me was happy ... the other half depressed.
Creating mock objects is much simpler than creating a "test X server", although admittedly a wm is slightly harder than normal as you can't take the easy route of running a separate version on your main display.
However taking a quick look at blackbox, textPropertyToString() is the only thing in Util.cc that couldn't trivially be unit tested and at least all of i18n.cc and Timer.cc. That's 3% with basically no changes.
Sorry to let you know but, you didn't write good unit tests and probably did waste your time. I've found very close to 100% of the bugs in Vstr, a network IO string library, using unit tests. That includes a couple of ones that would have been damn hard to track down otherwise.
However, between 1.0.0 over a year ago, which had a unit test for every function and every function option, and the last release, which had over 99% code coverage, I found a couple of weird corner case issues (not just bugs, but optimizations that could never be reached for some reason). And going from 98% coverage to 99% coverage took a significant time investment, and required significant thinking about how the test should be written.
As with much software development, it's easy to write simple tests that don't show much and aren't very useful. It's much harder to write tests that find bugs (and you have to approach writing the tests with a very different mindset to how you approach writing the code you are testing). This is not even close to being "Like picking lint from your belly-button."
I'm far from convinced that TDD is actually a good approach. Although it's pretty obvious that without testing the code is often trivially buggy, and unit testing is the cheapest way to perform testing. For instance this kind of thing is all too easy to do with TDD.
For unit tests you want to write your code, and then look at the best set of unit tests to do complete code coverage. For an OSS example of that you can look at Vstr string library and the code coverage for that project.
My guess is that they'd seen how they'd basically got "time off" when the computers/network went down. And so like rats pressing the button when the light comes on, they did the same again next time the opportunity came along.
Who is this "we" you are talking about ... in your opinion all the *BSD variants, or a specific BSD that you are involved with? If the latter did you actually try and become involved with LSB to fix the points you found too specific?
I'd also argue that something like LSB needs to be as specific/detailed as possible. SuS etc. are pretty general in places, and I'd like something that said if you want to configure your system clock go alter /etc/sysconfig/clock ... want to add a daemon call chkconfig --add. But it isn't even that specific.
This is the same mantra that RMS cries, correcting people to say "GNU/Linux" gets attention and possibly gets people who will help. It misses the point that every time you do it that way you alienate people too.
Oracle is one of the more important ISVs, but there are more than a few ISVs certified for RHEL. And just saying "user demand" will solve all your problems is provably not true ... where's the debian or gentoo certification. I know United Linux had hell to get certification from Oracle, and they had something that looks like a std. RHEL box that the corporate users are used to.
Because the app. doesn't share the data with the OS so if the app. alters the data the OS needs to have setup COW so the data it sees is the same. And it is very rare for applications to use page aligned buffers to read or write, it is also very common to change the buffer just after calling a read or write. This makes it a bad trade off to setup COW mappings in the general case, as it hurts all the normal apps. which are using sendfile() and/or mmap() to do this kind of zero copy operation.
Well it's hard to argue with no links :). But your point that large objects are easier on a malloc()/free() seems like common sense. It's much more often that you'd write a custom allocator for small objects.
Again with the threads?:). Yes, I've heard this argument before, however ... 1) You don't need to do locked refcounting. If you do you are almost certainly sharing too much information between the threads. Locking should not be done "automatically" inside objects. This will almost certainly lead to the threads serializing against each other. 2) GC needs something similar so you can have quick destruction. This is often especially noticeable when an app. is written assuming GC, at this point the application will assume that doing "x = new foo();" is just as cheap as "foo x;", then objects of very different lifetimes will be intermixed and the GC needs to know to free the "stack like" objects quickly. For instance python has both a GC and a ref counting system, and this seems to make most people happier.
Yes, malloc/free are not simple functions. However I'd disagree that custom allocators can blow the cache. Often the "allocation" happens with a test and two pointer assignments (for instance Vstr does this -- actually one pointer assignment per object, and another for the entire group), and deallocation with just two pointer assignments. I fail to see how a GC could compete with this. In threaded programs this is even more pronounced as you don't have to do any locking.
Then you look at things like filesystems, which have had GC like properties for a long time ... and the ones that are the best at managing space are always the ones that do the most work at allocation time. The filesystems that have been designed to "make allocation fast" tend to royally screw up management of the space under a bunch of conditions. And I don't see why GC would be any better at this.
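To show what "a test and two pointer assignments" looks like, here is a minimal free-list pool. This is not Vstr's actual code; Node, NodePool and the 56 byte payload are made up for the sketch, and it assumes a single thread owns the pool (so no locking).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
    Node *next;       // free-list link while the node is unused
    char payload[56]; // arbitrary size, chosen for the example
};

class NodePool {
public:
    explicit NodePool(std::size_t count) : storage_(count), free_(NULL)
    {
        // Thread every node onto the free list up front.
        for (std::size_t i = 0; i < count; ++i)
            release(&storage_[i]);
    }

    // Allocation: one test and two pointer assignments.
    Node *acquire()
    {
        Node *n = free_;
        if (!n)
            return NULL; // pool exhausted
        free_ = n->next;
        n->next = NULL;
        return n;
    }

    // Deallocation: two pointer assignments.
    void release(Node *n)
    {
        n->next = free_;
        free_ = n;
    }

private:
    std::vector<Node> storage_; // never resized, so node addresses stay valid
    Node *free_;

    NodePool(const NodePool &);            // non-copyable
    NodePool &operator=(const NodePool &); // non-copyable
};
```

The most recently released node is the next one handed out, so it is also the one most likely to still be in cache.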
The one paper on stats. that was linked directly from the Boehm C collector was Memory Allocation Costs in Large C and C++ Programs (1994), which is almost 10 years old :(. But I guess Boehm hasn't changed much (although I'm not sure I'd say the same for malloc/free).
This paper doesn't suggest to me that GC is very close to malloc/free, with it being worse in both space and time (perl taking 125% of the time and 275% of the memory). Xfig was the only app. faster CPU wise (by 0.3 of a second) and was almost half a MB bigger.
Personal observation also suggests this, as when gcc recently moved from malloc/free to using Boehm it got both slower and bigger.
However I would probably be happy to take the hit on some of those packages tested ... mainly due to the fact that they aren't performance sensitive to me.
Making it easier is a big thing though. For instance it's possible for someone to find out my social security or credit card numbers by just stealing information from the right place(s). This is not particularly well kept information, I'd imagine most people on /. could do it. However I'm not likely to post them on my website, as that makes it much easier.
In the same way, in the example I gave in the previous post. Matching up all the information, who I am ... where I live based on the photos, etc. is non-trivial for someone to do manually. However, if you can just say give me a list of people who match profile X. Then that is a significant difference.
Let me prefix this by saying that I didn't want to get into a GC flamewar, as I said I think GC can be used in some applications where you don't care about the negative side effects. I only replied to you as you seemed to be giving one of the better "Let's burn everything and rebuild using only GC" arguments. Possibly some of the difference is due to you working for MSFT and me being at RHAT, but I doubt it ... and if you don't hold it against me I'll do the same :). So anyway...
All applications follow a pattern that allows custom allocators to work better than general ones, like GC or just calling malloc(). This is because the custom allocator is a very specific version of the former, and so they have more innate knowledge of when to allocate, free, etc. Custom allocators also tend to have programmer control built in, so you can do things like unshare data if the reference count is one ... which is very hard to do easily in a general purpose GC.
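As a sketch of the "unshare data if the reference count is one" kind of control, here is a copy-on-write wrapper that only copies when someone else still holds the data. SharedText is a hypothetical class, it assumes C++11, and it assumes one thread owns all the handles (use_count() is not a reliable test across threads).

```cpp
#include <cassert>
#include <memory>
#include <string>

class SharedText {
public:
    explicit SharedText(const std::string &s)
        : data_(std::make_shared<std::string>(s)) {}

    const std::string &get() const { return *data_; }

    // Appends in place when this handle is the only owner, otherwise takes
    // a private copy first. Returns true when a copy had to be made.
    bool append(const std::string &more)
    {
        bool copied = false;
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
            copied = true;
        }
        data_->append(more);
        return copied;
    }

private:
    std::shared_ptr<std::string> data_;
};
```

The decision is made by the code that knows the data, at the moment it writes, which is the sort of knowledge a general purpose collector doesn't have.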
I said cache, page faults are a similar but different case. If your GC is on another thread then it can be doing cross CPU invalidation which will stall your application. If it's running in the same thread, when it runs it will evict some of your application text/data from cache ... also possibly introducing latencies.
That 100s of MB a second assumes one linear piece of memory, this is generally not the case ... but even assuming it can just eat CPU to move it and not affect the cache and you have the CPU to spare ... then you have to have an algorithm to work out where it goes, without blowing the cache, taking a fault or taking too long to introduce latency to the app.
It obviously depends on the application. But yes I've written/advised a few applications where CPU speed was a factor. Sure my text editor could probably be in Java instead of C and I wouldn't care ... and I probably wouldn't care if my spreadsheet was too, but someone who runs a large calculation might.