Eivind Eklund · Slashdot Mirror

Re:JRuby versus Java [code comparison - short] on JRuby Great Addition To Java Development · 2004-09-14 03:10 · Score: 3, Informative

... and we have a loser writing.

When you have zero clue how to write the language, I suggest you shut up.

The equivalent Ruby is (sorry about the lack of indentation - Slashdot seems to eat it):

require 'java'
include_package 'javax.swing'

class Calculator < JFrame
def initialize
super("Slashdot Rul3z")
setSize(400, 400)
setVisible(true)
end
end

Calculator.new()

That's exactly one require-line and two blank lines more than your non-idiomatically-indented Java with missing newlines. It is one line less than what I'd consider a normal amount of spacing - a newline after the import statement, a newline between the methods, and the {}-block for main actually spaced out.

Eivind.

Re:Project Scoop? on Genesis: Data in good condition · 2004-09-14 02:52 · Score: 1

Yeah, I've watched The Andromeda Strain many times - the way the pages riffle is really fascinating.

Re:Too Far? on Independent Developers Fight Piracy & Lose · 2004-09-14 01:13 · Score: 1

I both create works that I consider art for free and have lived off the proceeds of work I've done that has been mass marketed.

I agree with the basic premise "Quite honestly, I think the creator of a work should be able to indefinitely control his work no matter how long that may be.". However, I've got an And to add:

And this of course only applies as long as the author keeps his work to himself. The moment he releases it to the public, the value created is in the intersection between the artist and the public. The influence of the piece comes from the audience, and the piece becomes a part of the culture - it uses a small part of the brain of each audience member, and derives part of its commercial value from this.

Let's use Backstreet Boys as an example. It seems quite clear that the value of their music is a cultural value; it's not because the music in itself is of great and lasting value, it is because it is something that people can listen to and associate with a particular culture. The influence associated isn't in the creation or the music itself - it is in the consumption and the consumers. A much larger chunk of "brainspace" is used by the audience than the performers.

As such, I see it as reasonable to regard that audience as being, to a larger degree than the performers, owners of the work. The work (in context) is the song as heard by, and influencing, the audience and society. We, as a society, may want to let parts of society (creators) take some control of what they release into it. This control has a number of beneficial effects (getting more or better works produced, for instance).

However, it is the audience that is the important part of culture. It is, in a way, the audience that owns the culture. It is their - our - heads.

Eivind.

Re:Donald Becker on Unsung Heroes of Open Source Software? · 2004-09-08 07:16 · Score: 4, Informative

... and Donald Becker got the Dr. Dobb's "Programming Excellence" award and is one of the most loudly acclaimed people of Open Source.

If you want "unsung heroes", I'd look elsewhere. (In the same space, Bill Paul of FreeBSD has my vote - more drivers, better code quality. That's my opinion from having hacked the code of drivers from both. But Bill has also gotten a fair amount of public recognition, especially after his Project Evil - supporting NDIS drivers on FreeBSD.)

Eivind.

Re:I like perl on Live Nightclub Hacking · 2004-09-03 04:08 · Score: 1

I've programmed with it (in Perl) and without it (in Ruby); given a language with the other capabilities of Perl (ie, Ruby), the loose typing has turned out to be a liability for large systems in my real life experience.

Note that I work on fairly large Perl projects - several in the 50kloc+ range. For smaller projects, it's a wash, and for really small projects, Perl is excellent :-)

Eivind.

Re:I like perl on Live Nightclub Hacking · 2004-09-03 01:57 · Score: 1

I used to feel just like you. I sort of still do - Perl is, in its way, a great tool (I won't go so far as to say a great language).

However, a lot of the features of Perl are targeted towards making small programs even smaller. Take for instance my favourite hate-object: autovivification. Perl turns whatever you use as a hash reference into a hash reference, so you don't have to do it yourself. Great for small, short programs where you are building data structures. A liability for large programs, where the problems with this (e.g., $stuff = undef; if (exists($stuff->{foo})) turns $stuff into {} - an empty hashref) make debugging the programs harder.
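For contrast, here is a minimal Ruby sketch of the same probe. The names `stuff` and `table` are made up for illustration (`stuff` mirrors `$stuff` above). It only shows that probing a key on nil raises instead of quietly turning nil into a hash; it is not a claim about how any particular program should be structured.

```ruby
# Illustrative only: `stuff` plays the role of $stuff in the Perl example.
stuff = nil

# Probing a key on nil raises NoMethodError; nil is not turned into a hash.
begin
  stuff[:foo]
  probe_result = :no_error
rescue NoMethodError
  probe_result = :raised
end

# A hash has to be created explicitly, and key? never adds an entry.
table = {}
had_key = table.key?(:foo)
```

After this runs, `stuff` is still nil and `table` is still empty, so the typo-style probe fails loudly at the point of use.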

Or the automatic conversion of numbers to and from strings, requiring that there are different operators for comparing strings (eq, ne, cmp) and numbers (==, !=, <=>). This is great for small, simple programs - removing the need for writing explicit conversions is a boon, and automatically being able to mix data coming from a text file and a database is great. However, it gets in the way of writing generic code, and leads to small inconsistencies when one tries to work around it. As an example: Yesterday I had to write code that works with IDs in a sorted order, and had to use regexp checks to find out whether it should do numeric or string compares. That still won't be stable for sets that mix integers and strings - fortunately, I could just disallow that for the time being.
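The regexp-check workaround described above might look roughly like this. It is a sketch in Ruby rather than the Perl the original code was written in, and `sort_ids` is a hypothetical helper name. It compares numerically only when every ID looks like an integer, and falls back to string order otherwise.

```ruby
# Hypothetical helper: pick numeric or string ordering with a regexp check.
def sort_ids(ids)
  if ids.all? { |id| id.to_s.match?(/\A-?\d+\z/) }
    # Every ID looks like an integer: compare by numeric value.
    ids.sort_by { |id| id.to_s.to_i }
  else
    # At least one non-numeric ID: compare everything as strings.
    ids.sort_by(&:to_s)
  end
end
```

For example, `sort_ids(%w[10 9 2])` orders by value, while `sort_ids(%w[b 10 a])` falls back to plain string order. As the post notes, a set that mixes both kinds still has no obviously right order.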

I still love the power a language like Perl gives me. However, I've found another "language like Perl" where most of the traps are removed, the syntax is cleaned up, the standard library is object oriented and clean, the data structures are orthogonalized[1], and "normal scale code" ends up half[2] the size: Ruby.

I still use Perl for work (for legacy reasons), and I still use Perl for one-liners and throwaway conversion scripts.

But I could easily live without it, and I feel it inappropriate for any new project that ends up more than a hundred lines.

Eivind.

[1] For instance, hashes can use any sort of object as a key, not just scalars. Arrays can contain any sort of object. There is no distinction between an array and an arrayref (there is no way to declare anything but a ref), numbers and strings can both be compared with <=>, etc.

[2] The style of code I write when doing "clean perl" ends up fairly exactly half the size when directly translated to Ruby. Code I write directly in Ruby is generally even more succinct, due to using Ruby idioms. See CVSFile-0.2.tar.gz for an example of code I've had this experience with.
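The points in footnote [1] can be illustrated with a short Ruby sketch. `Point`, `distances` and `mixed` are made-up names for the example and do not come from the CVSFile code.

```ruby
# A hypothetical value object; Struct supplies hash/eql? based on its members.
Point = Struct.new(:x, :y)

# Any object can be a hash key, including structs and arrays.
distances = { Point.new(0, 0) => 0.0, [1, 2] => "an array key" }

# Arrays hold any mix of objects.
mixed = [1, "two", :three, Point.new(3, 4), nil]

# The same <=> operator compares numbers and compares strings.
number_order = (1 <=> 2)
string_order = ("apple" <=> "banana")
```

Looking up `distances[Point.new(0, 0)]` with a freshly built key finds the stored value, because the key is compared by content rather than by identity.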

Re:hmm on Live Nightclub Hacking · 2004-09-03 01:25 · Score: 1

The Slashdot way is going to nightclubs and *thinking* about getting rejected by women. Talking to them so you get rejected is too scary.

Re:Memory Footprints and Performance on FreeBSD 5.3-BETA2 available · 2004-09-02 07:21 · Score: 1

Answers to your questions:

(A) That beta is still running with our debugging code enabled, which will slow you down a lot. That's done intentionally, as it helps us catch bugs. This is why it is slow.

(B) That on-disk footprint sounds like you've added ports or src; that will take a lot of space, but the space does not increase based on the number of apps installed.

(C) When you are looking at how much memory is free, free memory is wasted memory. FreeBSD uses as much of your memory as it can. "Free" is memory that FreeBSD at the moment has no data in, and thus memory that is wasted. I have less than 100MB free on a 2.5GB system here; but I have a heap of "Inactive" memory, which means the memory is used for disk cache.

Eivind.

Re:Fact 37 - code reviews catch errors on Facts and Fallacies of Software Engineering · 2004-09-01 01:52 · Score: 1

Hi Tom! We're usually eye-to-eye on software development issues, but in this case, I think you're missing the target (and possibly shooting people behind you).

Automated tests and code review partially fill the same space, but not at all completely. I'm strongly in favour of both. (I have experience using both the techniques, in various forms, but little experience with using both at the same time except when pair-programming with tests.)

Tests give you:

Less likelihood of bugs in the moment, because of the actual tests
A mechanical form of documentation that is, to the degree it is there, up to date (as long as the tests pass)
An extra consumer for the code, often forcing better code structure

These are very good qualities. I love them.

Inspections have different good qualities:

Less likelihood of bugs in the moment, because of the actual inspection
An extra pair of eyes for the code, often forcing better code structure
Better naming and readability for the code, directly from human feedback
Ensuring that in-code documentation (whether that is comments or well-named symbols) is present and matches the code
Simplifying the design (by direct suggestions from somebody with a different point of view)
Helping to find places where the code duplicates functionality elsewhere in the system (instead of writing tests for this new code, use that old, well-tested method that's over in this Util class...)
Training "rookies" in the craft issues of software development, including stuff like the above
Finding bugs in code that is not covered by the tests (by some sort of mistake)
Finding bugs that are both in the test cases and the code. This is in my experience a fairly small but still noticeable number of bugs. (An interesting datum here is that when NASA ran with several separate, non-communicating teams, they still had a fair number of the same bugs. Something like 20-30%, if I remember correctly.)
It teaches the person that reviews the code somewhat about how the code works, decreasing the "truck count" (number of problems you get when somebody gets run over by a truck)
The "super-anal nitpicker" will likely also know that the reason that serial device driver you're hacking has an extra 2ms delay if bit 3 of the status register is set is that there was a bug in the hardware delivered by Crowynx in the first half of 1993. Or (to take an example where I've been the nitpicker) that the semantics of the software protection level macros have changed since the architecture book was written, and they are now masks instead of levels. Thus, the seemingly correct locking code will lead to semi-random crashes roughly every three weeks. Or that said locking is necessary at all, something that no amount of unit tests will find if the person writing the unit tests isn't aware of it.

There is another point missing in here, too: Writing tests is more expensive than doing reviews, as to write tests, you need to first review the code to find out how it works, and then write the tests to verify that it ACTUALLY works that way. This is a factor-of-ten difference in amount of work.

OBTW: Feel free to hail me on IRC to continue discussing this (if you feel like using a different medium).

Eivind AKA eek/#ruby-lang.

etcmerge - mergemaster replacement on FreeBSD 5.3 Beta1 · 2004-08-26 22:11 · Score: 1

I find mergemaster to be a pain. It forces the user to do manual merges between files because it retains too little history, and that's always annoyed me (because I knew there was a better way).

To scratch that itch, I've written a replacement that works on full directories, and uses so-called 3-way merges. This means that it retains a copy of the unmodified /etc from the install point, and uses that to automate the entire process. The replacement is available from /usr/ports/sysutils/etcmerge
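The idea behind a 3-way merge can be sketched in a few lines of Ruby. This is not etcmerge's actual implementation: it merges simple key/value settings rather than whole files, and `three_way_merge` with its `base`, `mine` and `theirs` arguments is a made-up name for illustration. A missing key is treated as nil, so values themselves are assumed to be non-nil.

```ruby
# Hypothetical three-way merge over key/value settings.
# base:   pristine copy from install time
# mine:   the locally edited version
# theirs: the version shipped with the new release
def three_way_merge(base, mine, theirs)
  merged = {}
  conflicts = []
  (base.keys | mine.keys | theirs.keys).each do |key|
    b, m, t = base[key], mine[key], theirs[key]
    if m == t
      value = m            # both sides agree (or both removed the key)
    elsif m == b
      value = t            # only the new release changed it
    elsif t == b
      value = m            # only the local admin changed it
    else
      conflicts << key     # both changed it differently: needs a human
      next
    end
    merged[key] = value unless value.nil?
  end
  [merged, conflicts]
end
```

Without the `base` copy, the second and third branches cannot be told apart, which is why a merge with too little history has to ask the user about every difference.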

In case you did not take a backup of /etc before the installer got to do any mods (and who does?), I've made tarballs of the reference etc directories for a series of releases available from http://people.freebsd.org/~eivind/etc/.

Feel free to mail me with any questions about it etc.

Eivind.

Re:My answer, based on my experiences on Communication Within Programming Teams? · 2004-08-22 23:19 · Score: 1

Thanks for the answer. I had a long reply, but an ACPI crash took down my web browser :-/

Summary: I've looked for this bug for ten years. In that time, I've seen it twice from other people, and have never made it myself. My personal conclusion is that the extra space consumed costs more than those errors.

As for the if (ptr) vs if (ptr != NULL) variant: I feel the extra space and "= NULL" decrease the signal/noise ratio of my code. The significant part, the stuff I need to keep attention on, is the ! (or lack of such) and which variable this is done to. Context will tell me if that variable is a pointer or not (if I need to care).

Oh, and BTW (please take this as friendly information): free() doesn't need that guard; it has one internally, specified by the ANSI89 standard (and all subsequent standards.) "Unwritten code has no bugs" (to quote Simon Shapiro). fclose() needs the guard, though.

Eivind.

Re:My answer, based on my experiences on Communication Within Programming Teams? · 2004-08-16 23:33 · Score: 1

That always bracketing statements leads to fewer bugs is definitely not true for everybody, and it may be a pure myth.

I personally do not curly brace until I have to, because I want my code to be compact. The more compact the code, the more I get into a screen, and thus the more context I can see when I make a change. Seeing more context decreases the chance of bugs.

When working on code that is properly indented (all mine is), I have not once made the mistake this defends against in over twenty years of programming.

Other common wisdom is to write C statements as if (CONSTANT == variable) ... instead of if (variable == CONSTANT). The latter form feels more natural to me, and I've been using it for ten years, while keeping careful track of how many more bugs it leads to. I've had one (1) detected bug from using that form in ten years, and it was easy to diagnose and fix.

If you have the choice between two otherwise equivalent styles where one is safer than the other, by all means choose the safe one. However, there are usually other influences. I especially feel succinctness often gets ignored.

Eivind.

Re:You haven't heard about "taint mode"? on Gosling on Computing · 2004-08-13 00:31 · Score: 1

Taint mode is not unique to perl. It also exists in Ruby, at least.

As for SQL injection problems: I'm avoiding them by using a self-built API that is based more closely on the relational model than SQL is. That makes it trivial to write safe code, even without prepared statements. The basic problem of incompetent programmers still remains, of course - it is non-trivial for somebody to learn a relational API instead of SQL, and writing a relational API is even less trivial (and I'm not going to release mine until I feel it "complete", as I can't field the noise involved.)

Eivind.

Re:Type inference (was Re:No chance in hell) on Parrots, Pythons And Things That Go Splat · 2004-08-08 23:27 · Score: 1

It IS fairly hard to do, but some things have come of it. You can look at e.g. Sun Microsystems Laboratories Technical Report 96-52 (SMLI TR-96-52), "Concrete Type Inference: Delivering Object-Oriented Applications" by Ole Agesen for examples of what has been done around Self (to take an example of type inference done around proto-based languages).

Another form of type "inference" done at run time is type feedback; that also gives reasonable performance, and IMO much has come of it. There just isn't a lot of noise being made about it, because it is fairly complex and the performance "mainstream" still does type declarations.

The complete type inference problem for proto based languages is NP-complete, but solving limited versions of this problem gives interesting results. I noted this in the first post I did.

The approach I described is a variant of the Vitek, Horspool and Uhl data flow system described in Vitek, J., N. Horspool, and J. S. Uhl. Compile-Time Analysis of Object-Oriented Programs. In Proceedings of CC'92, 4th International Conference on Compiler Construction, p. 237-250, Paderborn, Germany, October 1992. Springer-Verlag (LNCS 641).

Talking about the cost of looking up slots assumes you are not doing type inference or anything like it. My basic premise was that the original poster (great grandparent of this post) was ignoring type inference and partial evaluation, and was assuming that the structure of the code generated was equal to the code structure in the input program. The parent of this post keeps assuming that the same structure has to be used.

Dynamic loading is relevant if we assume that as a property of the language - it does naturally block inference before the code is loaded. I do not consider dynamically loading the code to be executed at run time as a feature of a language; that's a feature of an implementation, and may or may not be relevant for a particular use. For high performance programs, I consider whole program compilation (or almost whole program compilation, where defined aspects are left for run time loading - see Zoran Budimlic's "Compiling Java For High Performance and the Internet") as the only relevant case.

I've not been able to find relevant performance comparisons for the different cases right now; the closest was the high performance Java paper above, showing that it was necessary to do specialization to get Java to compile to within an order of magnitude of FORTRAN, but I found no real data. I know that compiler writers are silently doing type inference from having looked this up previously, but I cannot find the references right now. They've been limited to a sentence or two in the manuals of various compilers and anecdotes from people working on commercial compilers.

Eivind.

What IBM can do to contribute much more ... on IBM Has 'No Intention' of Using Patents Against Linux · 2004-08-05 02:44 · Score: 1

... is to use their patents to DEFEND free software. IBM has estimated the value of their patent portfolio to force cross licensing at one order of magnitude more than licensing fees.

The promise they give here (worth the wobbles in the air it was transmitted by) is to not use the offensive capacity - it would be much, much more interesting if they started using their much higher value defensive capacity.

Oh, and if they covered *BSD - as I'm totally uninterested in using Linux ;-)

Eivind.

Type inference (was Re:No chance in hell) on Parrots, Pythons And Things That Go Splat · 2004-08-04 03:27 · Score: 3, Insightful

Moderators, PLEASE DO NOT MODERATE INSIGHTFUL UNLESS YOU KNOW THE SUBJECT AREA! The text quoted below (parent of this article) is full of simplistic (and wrong) assumptions, and does NOT deserve an "Insightful".

These languages make certain assumptions about typing and binding that Python and Perl do not. Additionally, Java's class structure is much *much* more time-efficient (though I rather like it less) than Python's memory-efficient proto-based object structure. It's the nature of the languages. It's why they're SCRIPTING languages. They traded speed for ease of coding. Which is just fine. But don't oversell them, you just look foolish.

Having looked reasonably deeply at the type inference space, this comment seems wrong to me. The division between "scripting languages" and "compiled languages" is superficial. The idea that doing type declarations etc makes a crucial difference is superficial. And the idea that a class is implemented the same way in the compiled code as in the executed code - that's extremely superficial.

The static (visible, programming level) declarations of type information in Java/C++/etc make it much easier to write a compiler that generates reasonably fast code. You just follow the type specifications, leave all abstract types as the abstract types, and use a doubly indirected jump table to resolve abstract methods.

I've seen a non-optimizing C compiler done in three days of intense work by one programmer - and the inheritance/object/abstract methods addition here is fairly trivial.

Similarly, a simple interpreter is easy to write. I've done one in a day for a simple Lisp dialect. And yeah, they're slow.

However, this stuff is 1950s technology. FORTRAN came in 1956 (with an amazing optimizing compiler that competed with hand-written assembly), Lisp with an interpreter in 1958-1959 (depending a bit on how you interpret the history). What's interesting today is what we can do with type inference, constant propagation, partial evaluation, memory and cache use optimization, etc. And then the picture changes.

To be able to handle abstract types effectively, you need to do type inference to find out what actual method will be called by each abstract method call, in order to be able to do partial evaluation and constant propagation through the methods. And - guess what - that's the same stuff you need to do for all methods and variable lookups in a fully dynamic language.

In other words: The same problem is there for compiler writers, and has to be solved for a fast compiler. You get SOME extra information from the type declarations, but this information is usually the same you would get for the first pass or three of the type inference engine.

So, for good compiling/execution of Java/C#, I would start doing speculative execution with partially evaluated expressions taking the place of variables in my execution path, resolving these to constants when I can. Different code prefixes (relevant system state parts) result in re-evaluation of the pseudo method call, noting if this makes a difference in evaluation, and either registering the prefix on a new resolution block or adding it to an old resolution block. When all resolution blocks have become non-changing (no more information is being acquired through partial evaluation), I'd stop the partial evaluation and do (virtual machine) code generation.

For good compiling/execution of Ruby/Python/Perl, I would start doing speculative execution with partially evaluated expressions taking the place of variables in my execution path, resolving these to constants when I can. Different code prefixes (relevant system state parts) result in re-evaluation of the pseudo method call, noting if this makes a difference in evaluation, and either registering the prefix on a new resolution block or adding it to an old resolution block. When all resolution blocks have become non-changing (no more information is being acquired through partial ev

Perl is for one liners and legacy code on Paul Graham On 'Great Hackers' · 2004-07-30 06:20 · Score: 1

It is quite hard to use Perl to create really "stable, maintainable code bases", as perl is filled with various forms of traps almost made to make this hard.

Perl has OO - yes - but this OO mixes up classes and name spaces, it mixes up object methods and class methods, it mixes up default handlers and AUTOLOADing, it does not support method signatures - not even to the level of being able to check the count of parameters. And, of course, the standard API is not at all object oriented - there are extra packages that reimplement much of it in an OO fashion, but then you lose even the little checking Perl usually has.

Of course, Perl has nice points. It used to be one of my favourite languages, because it was so flexible, implemented such high level constructs, allowed me to do meta-programming, etc. Only occasionally would I curse it for its auto-vivification[1], or the fact that it GCs the fileglobs incorrectly when you use the old recommended syntax[2], or the fact that => is just an alias for comma[3], or the lack of parameter count validation[4]. The flexibility was so nice.

Alas, to my daily sorrow, I learned Ruby. I learned that it is possible to do the things Perl makes sort of convenient - and still keep everything beautiful.

And I work in Perl daily, because I get paid for it, because there is a legacy codebase we need to support, there are promises to keep.

And I hurt for every push(@{$this->{'FriksCount'}}, scalar(grep { /Friks/ } @{$this->GetFrikable})); I need to write.
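For comparison, a rough Ruby counterpart of that Perl line might look like the sketch below. The class name `Thing`, the method `record_friks` and the way the frikable list is passed in are all assumptions made for the example, since the surrounding Perl class is not shown.

```ruby
# Hypothetical class standing in for whatever $this refers to in the Perl line.
class Thing
  attr_reader :friks_count

  def initialize(frikable)
    @frikable = frikable   # list of strings, like GetFrikable's result
    @friks_count = []      # history of counts, like the FriksCount array
  end

  # Append the number of entries matching /Friks/ to the history.
  def record_friks
    @friks_count << @frikable.grep(/Friks/).size
  end
end
```

The work being done is the same; only the dereferencing and sigil noise is gone.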

Eivind.

[1] "AUTOVIVIFICATION" is when Perl makes up a hash entry from your typo, because OF COURSE you would want to save a line of code the one time out of 20 this happens when it was intentional. This is brilliant for one-liners, though.
[2] Supposedly "impossible" to fix inside the present interpreter structure; I've discussed it with the maintainers. Or at least without doing a nasty grammar hack.
[3] A nasty grammar hack done to make it simple to write the backend to support both , and => as separators in hashes.
[4] Because it would have been too much work to actually *fully add functions/subroutines to the language*, so instead they used a hack based on an implicitly declared array that varies from function to function. There is a new hack to allow parameter checking, though. Unless you use OO, of course, because then you obviously are writing a small program and know exactly what is happening everywhere.

Re:Yeah... and? on Oxford Students Hack University Network · 2004-07-16 01:35 · Score: 1

The belief that morality and cultural value are relative is a Western European and, from there (and to an even larger degree), American belief.

What says that that relative belief is better than the other beliefs that say it is absolute?

I don't believe in your relativity of morals. I believe there is something closer to absolutes, and that there are differences in quality between cultures. Not that I think that the one I am in necessarily is the best - it is just the one I am in. I do, however, dispute its belief that there is no such thing as quality of culture.

Eivind.

Re:Just to be clear... on An Online ID Registry · 2004-07-11 22:50 · Score: 1

I think this is all done in a wrong "frame" (set of thinking patterns). As a lot of people have pointed out, you cannot guarantee that people do not bypass the system. What you can do is make it expensive to bypass the system, and the expense will usually be divided between investment and per-identity costs.

As an example, let's say you use a text message to a cellular phone to verify identity (a popular way to verify identity at least here, where I do not know ANY person that uses the Internet and does not have a cellular phone). With this, the attacker can effectively purchase IDs for $20 (the cost of a new cash subscription) plus transaction costs (going out and getting that subscription). These are anonymous and sold at any gas station etc.

Or the attacker can purchase/crack an ISDN P(A)BX and a set of phone numbers - this will bring the cost per ID down (especially if cracking it), but will have a higher "investment cost" (legal issues if caught, higher risk, payment if buying, etc).

Or the attacker can purchase/crack a complete phone number series for a phone company. This will bring the cost per ID to virtually zero (a few cents in practice), but again carries higher investment/risk.

This is the way to analyse a system. Those of us that work or have worked in the security business know that you generally CAN'T guarantee anything - you can just change the cost profile for attacks.

Eivind.

perhaps@yes.no on Where Do Dummy Email Addresses Go? · 2004-07-11 21:10 · Score: 1

Dummy e-mail addresses aren't. Or at least not that one. It reaches me, including the questions about whether it is real. (But not too often recently - I've mostly stopped reading it due to people abusing it and it thus getting way too much spam...) maybe@yes.no belongs to a friend of mine and is much worse, though.

Eivind.

Re:Two awful suggestions on Alternatives to Autoconf? · 2004-05-24 01:25 · Score: 1

As a non-Linux packager, I want autoconf to die. It makes building a software package into a random event - either it works (by magic), or debug hell is afore you. Somewhat like installing Windows software.

I actually found it easier porting software back before anybody made any attempt at making it fully automatic. autoconf's use of sh as a back-end language for a compiler for an auto-detect language often makes it necessary to muck about in the "object files" (sh files), and reverse engineering these is harder than reverse engineering normal object files.

My opinion is that autoconf must die. Its model ("each package shall run a 10,000+ lines shell script to attempt to guess what is installed") is wrong, and the release engineering done by the autoconf team does not seem to be particularly good, either (there are a bunch of incompatibilities between minor versions).

The classic portability systems had one configuration file per system type, and had the user select one of these. This was much easier to deal with. If autoconf had been used only to write out a sample configuration file that should match the user's system, it would IMO be reasonably OK. As it is, it often is an utter pain unless it works on first try.

Eivind.

Re:Wrong Approaches on OptInRealBig Wins Restraining Order On SpamCop · 2004-05-13 08:18 · Score: 1

I'll just skip the part about how to calculate signatures as it is an attack on a straw man. (Signature computation is cheap enough that it is unlikely to affect anything over time unless we're specifically attempting to do a hashcash implementation, so it won't change anything.)

By an amazing coincidence, if I were a spammer and I had to attach monetary postage to each of eleventy zillion messages, I'd (all together now) hijack other people's computers because I couldn't afford to buy enough postage for my own account.

This is actually almost a legitimate counter-argument. However, there are two issues that crop up here:

Postage is money. Thus, it starts to directly affect the finances of the people who get broken into, and this will influence how security is handled at that level.
We can regulate the basic price. At some payment per message, I'm perfectly willing to handle spam. Heck, if each spam included a $1, I'd be perfectly DELIGHTED to accept it. At my present rate of perhaps 100 spams per day, it would be a welcome addition of petty cash :-)

Just the addition of cash to each mail changes the picture enormously - some people will try to get as much spam as they can to some addresses, in order to gather the cash from it, never looking at the e-mail addresses. The war turns around a lot - instead of people not wanting to have their addresses enter the spammer lists, and instead of the spammers wanting as many addresses as possible because each mail is essentially free, the spammers want to prune their lists to be as good a match as possible inside a sea of fake addresses.

As of the present, the cost per mail from a bulk e-mail company is about $0.0012[1]. Charging $0.02 as a "stamp" for e-mail would make the spammers steal 20x more *and spend it to send the e-mail*. It seems obvious that they would be much more interested in stealing the money directly than using it to spam you and me - so the spam problem is basically gone, just by increasing the cost per mail.
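The figures above can be checked with a few lines of arithmetic. This sketch uses only the two per-mail prices quoted in the post and the 500,000-mail batch size from footnote [1]; the variable names are made up, and Rational is used to avoid floating-point rounding.

```ruby
# Prices from the post, as exact fractions of a dollar.
bulk_cost_per_mail = Rational(12, 10_000)  # $0.0012 per mail today
stamp_per_mail     = Rational(2, 100)      # $0.02 proposed "stamp"
mails              = 500_000               # batch size from footnote [1]

# How many times more expensive the stamp is than today's per-mail cost.
ratio = stamp_per_mail / bulk_cost_per_mail   # 50/3, i.e. about 16.7

# Total cost of one batch under each price, in dollars.
bulk_total  = bulk_cost_per_mail * mails      # 600
stamp_total = stamp_per_mail * mails          # 10000
```

So a single 500,000-mail run goes from about $600 to $10,000 in postage alone, which is the order-of-magnitude jump the argument relies on.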

This basically changes which resources are scarce, which is a standard way to change behaviour in an economic system.

As to the bottom line: Any rule will influence legitimate users, somehow. Whether the appropriate rule is to pay the recipient per mail, I don't know - but as far as I can tell, it would stop spam.

Eivind.

[1] $0.0012 is the cost per mail for 500,000 mails, based on the top hit for bulk email cost on Google.

Re:Success due to Bitkeeper? on Bitkeeper News Redux · 2004-05-13 07:44 · Score: 1

I'll just give a quick comment on the "recover from error"-issue: I don't see recovering from error as a particularly important side of a version control system.

In practice, I find the history, the ability to see what changes I have in the active space easily, and the ability to manage several spaces (including merge from a moving baseline) much much more important. I hardly ever need to recover from errors that are committed - but I use the other features all the time.

I haven't tried using distributed branching yet, but I've seen a large number of places where it would apply. When it gets done right, I think it will revolutionize free software development.

Eivind.

Re:Wrong Approaches on OptInRealBig Wins Restraining Order On SpamCop · 2004-05-13 05:42 · Score: 1

[On enforcing identity mapping.]

Nope; in this case it's trivial (when fifty people all want to send the same "v1agr@" ad, they ain't fifty people).

In a little haphazard order:

Identifying "the same" is difficult. Spammers already individualize messages; this would just be done everywhere. Identifying which messages are "equal" and which are not is a non-trivial problem. And if 50 people worldwide write messages to people they know with similar content describing the latest Taliban bombing, we don't want their mail blocked as spam. Not even if they quote from the same news source.

The protocol I described (which was the one you said was trivial to change to use identity mapping) already automatically individualizes every message by joining it up with the receiver address before checksumming.
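
To make that step concrete, here is a minimal sketch of "join with the receiver address, then checksum". The protocol details are not spelled out in this comment, so the hash function (SHA-256), the NUL separator, the HMAC standing in for Trent's signature, and all names below are assumptions for illustration only.

```python
import hashlib
import hmac

# Hypothetical secret held by Trent; a real scheme would more likely use
# public-key signatures so that receivers can verify without the secret.
TRENT_KEY = b"trent-demo-key"


def stamp_digest(recipient: str, body: str) -> bytes:
    """Checksum of the message joined with the receiver address."""
    # The NUL separator keeps ("ab", "c") distinct from ("a", "bc").
    return hashlib.sha256(recipient.encode() + b"\x00" + body.encode()).digest()


def trent_sign(digest: bytes) -> str:
    """Trent signs only the digest, so he never sees the mail content."""
    return hmac.new(TRENT_KEY, digest, hashlib.sha256).hexdigest()


def verify(recipient: str, body: str, stamp: str) -> bool:
    """Check that `stamp` was issued for exactly this recipient and body."""
    expected = trent_sign(stamp_digest(recipient, body))
    return hmac.compare_digest(expected, stamp)


if __name__ == "__main__":
    body = "Identical text sent to two people"
    a = trent_sign(stamp_digest("alice@example.org", body))
    b = trent_sign(stamp_digest("bob@example.org", body))
    print(a != b)                                # stamps differ per receiver
    print(verify("alice@example.org", body, a))  # stamp is valid for alice
    print(verify("bob@example.org", body, a))    # and not reusable for bob
```

Because the receiver address goes into the digest, the same body sent to two people yields two unrelated stamps, so Trent cannot tell from the digests alone that the bodies match.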

Even if we assume that Trent has a copy of all the messages, it is still non-trivial to find which messages are "too similar" and which are "distinct enough" to be counted as different. It is a deep statistical problem, and it is not clear that it is easier to solve this problem than to just classify mail as "99% certainly spam" or "99% certainly not spam" directly.
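
A small illustration of why "too similar" is hard to pin down, using a deliberately naive measure (Jaccard similarity over word sets). This is not something proposed in the thread; the measure, the sample messages, and the function name are all made up for the example.

```python
def jaccard(a: str, b: str) -> float:
    """Share of distinct words that two messages have in common."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


# Two unrelated people quoting the same news report.
honest_a = "did you see this the news says a bomb went off downtown this morning"
honest_b = "terrible the news says a bomb went off downtown this morning stay safe"

# The same ad, reworded and padded by the spammer.
spam_a = "buy cheap pills now xqz"
spam_b = "purchase discount meds today kjw"

if __name__ == "__main__":
    print(f"honest pair: {jaccard(honest_a, honest_b):.3f}")
    print(f"spam pair:   {jaccard(spam_a, spam_b):.3f}")
```

With these samples the honest pair scores well above one half while the reworded spam pair scores zero, so any threshold on this measure would block the wrong mail. Better measures exist, but each one gives the spammer a new target to individualize against.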

In reality, people are much more likely to trust Trent with 2 cents "postage" to sign their e-mail than they are to trust him with the contents of their mail, and forcing the disclosure of mail content to another entity is a definite hardship. In fact, it is one that I think would automatically kill any proposal along these lines. And identifying semi-identical mail in a case where the mail is scrambled enough that it is not possible to extract other information from it is (A) definitely a research level problem, and (B) most likely effectively impossible.

To go a trifle ad hominem: Please go through your arguments and analyze *how to break them* before posting. In the case of a spam protection system, it is a good idea to turn about and say "If this was implemented and I were a spammer, what would I do?".

Eivind.

Re: Large CVS projects on Bitkeeper News Redux · 2004-05-13 04:35 · Score: 1

I'm a FreeBSD developer/committer (and have been for 7 years).

We use CVS for the primary repository, and Perforce for an extra repository with a gazillion branches where a lot of the larger projects are done.

The primary reasons we use CVS are (A) habit (it was the only relevant alternative when we started), and (B) the distribution architecture we have for it (cvsup).

Given that we use cvsup as one of our primary methods of distributing continuous updates, we would have to re-train about a million users (guesstimate). We also have thousands of developers that are familiar with CVS; these would also need re-training.

Those of us that have looked carefully at the version control/configuration management issues believe we are paying a continuous and fairly high cost for using CVS. We are probably going to do the switch at some point, but before we do, the relevant infrastructure needs to be in place (distribution of repository copies and checkout a la cvsup is probably the most important), and we should get something with all of the features we need, so we don't have to do one more switch.

The most important features that we presently lack are cheap branching, distributed branching (being able to create a branch that only exists in a mirrored repository), and idempotent merges (the same change can be merged through several branches and only enter the system once). Tracking directory metadata and helping with renames etc would be nice, but isn't that important for FreeBSD. (I believe this to be more important than the other shortcomings of CVS for smaller projects, BTW).
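
As a toy model of what "idempotent merges" means here, the sketch below tracks changes by a globally unique id, so a change that arrives over two different routes is applied only once. It is an assumption-laden illustration, not how any particular version control system works: the `Branch` class and its methods are hypothetical, and conflicts and file contents are ignored entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Branch:
    name: str
    # Changes keyed by a globally unique id, in the order they were applied.
    changes: Dict[str, str] = field(default_factory=dict)

    def commit(self, change_id: str, description: str) -> None:
        """Record a new change on this branch."""
        self.changes[change_id] = description

    def merge_from(self, other: "Branch") -> List[str]:
        """Apply every change from `other` that is not already present.

        Returns the ids that were actually applied; merging the same
        change again, by any route, applies nothing.
        """
        applied = []
        for change_id, description in other.changes.items():
            if change_id not in self.changes:
                self.changes[change_id] = description
                applied.append(change_id)
        return applied


if __name__ == "__main__":
    main, feature, mirror = Branch("main"), Branch("feature"), Branch("mirror")
    feature.commit("c1", "fix driver")
    print(mirror.merge_from(feature))  # change reaches the mirror branch
    print(main.merge_from(feature))    # and main, directly from feature
    print(main.merge_from(mirror))     # same change via mirror: nothing to do
```

The mirror branch in the usage example also hints at distributed branching: it can exist only in a mirrored repository and still feed changes back without duplicating them.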

A final interesting thought: I believe that the sharp distinction between NetBSD, OpenBSD, FreeBSD, and DragonFly BSD is a result of the lack of distributed branching and easy merging. The personality issues and different goals are there, of course, but I believe these would have been "overrun" by the ability to easily move changes back and forth. The single issue that has resulted in the "deep" splits is the effort involved in moving changes.

Eivind.
