Diomidis+Spinellis · Slashdot Mirror

Re:So what? on LinkedIn Password Hashes Leaked Online · 2012-06-06 04:06 · Score: 4, Informative

I've occasionally daydreamed a fun academic paper would be to collect sets of password hashes, rub them up against a rainbow table, and make graphs and correlations and wild assumptions about the correlation coeff of IQ and rate of easily cracked pwd vs site etc etc. Sounds like fun so its probably been done before.

Yes, it's been done on 70 million passwords. See http://www.cl.cam.ac.uk/~jcb82/doc/B12-IEEESP-analyzing_70M_anonymized_passwords.pdf

Re:adoption associated with.less productive employ on How Big US Firms Use Open Source Software · 2012-03-22 23:56 · Score: 1

The rules of academic publishing are that you have to cite relevant related work. This includes both fresh results and old classics. Where possible, we tried to cite the most recent studies. Some studies that appear dated indicate a research opportunity to update the corresponding area. Also, it would be wrong to dismiss a paper because of its age. Some of the older studies we cite present theoretical frameworks of enduring value and importance, demonstrated by the thousands of citations they have received over the years. For instance, the 2003 study by Venkatesh and his colleagues on the user acceptance of information technology, which we cite, has received almost five thousand citations. It would be wrong to ignore it just because of its age.

Re:adoption associated with.less productive employ on How Big US Firms Use Open Source Software · 2012-03-22 21:37 · Score: 2

You have a point here. And you haven't mentioned the huge cost associated with procurement processes for proprietary software, especially in the public sector. These can drag on for months. In contrast, acquiring an open-source product is often simply a matter of a one-click download. Even if the organization's legal department has trouble understanding open source licenses, this is a hurdle you have to overcome just once.

Re:Not required, just makes it easier. on Does Wiretapping Require Cell Company Cooperation? · 2011-04-27 18:14 · Score: 1

The article cited refers to software planted on the phone exchange, not the towers. The rogue wiretapping software was essentially a rootkit, complete with a backdoor for future access and detection countermeasures.

Re:Paper summary on Researchers Outline Targeted Content Poisoning For P2P Data · 2009-07-23 19:16 · Score: 1

Very well put. I didn't have space to explain this in the submission's summary, but this is the gist of the paper.

Re:"Code quality" is bunk on Code Quality In Open and Closed Source Kernels · 2008-05-17 09:01 · Score: 1

A few hours after replying to the "code quality is that it 'works'" comment, I read Joseph Bergin's Do the Right Thing design pattern in an IEEE Software article. I found it quite funny.

The absolute worst part of critiques like yours is the ideas it gives pin headed MBAs who bungee jump into engineering departments, book in hand, with no practical experience. The ideas spouted by the book become the drive, not the product. It is an almost certainty the project will be dreadfully late or never finished.

I absolutely agree.

Re:Is it just me? on Code Quality In Open and Closed Source Kernels · 2008-05-16 20:03 · Score: 3, Informative

Please let me clarify here:

I can not extrapolate the agreeable portions of your thought to the seemingly obvious short comings of the Windows operating system. On any facet, whether it is security, stability, functionality or reliability. Windows is, far behind on all fronts.... aside from secrecy from a Microsoft point of view.

I'm not claiming anything regarding these external quality attributes of Windows; the metrics I collected just show that there are no vast differences in the code's quality.

Or, perhaps, the WRK has been a meticulous focus at Microsoft before it's release... this is likely possible, as it's WIDELY known, from nearly ALL examples of closed source proprietary software being released to the Open Source, that it takes years just to clean up and prepare for the ultra high standards of the OS community.

This is entirely possible. In fact, a README file in the distribution states:

The primary modifications to WRK from the released kernel are related to cleanup and removal of server support, such as code related to the Intel IA64.

Re:"Code quality" is bunk on Code Quality In Open and Closed Source Kernels · 2008-05-16 19:52 · Score: 1

Economy is an attribute different from quality, and this is where engineering comes in. The engineer has to balance the various demands on quality against factors like cost, time to market, and customer demands. All your arguments are perfectly valid, and they are engineering decisions.

Re:The data measured is just noise on Code Quality In Open and Closed Source Kernels · 2008-05-16 19:46 · Score: 1

To my surprise there was no clear winner or loser..

Forget what you *think* you're measuring (code quality). Instead, consider whether you're measuring anything at all. That is, is there any information in the data you've measured?

In the past other researchers have used a few of the metrics I used to measure what they called a system's maintainability, and they were able to match this with the subjective perceptions of developers at HP regarding the code's quality. So these measures are not just noise.

For another indication, consider this figure, showing a trend that matches our expectations: how the maintainability of the FreeBSD system is, in general, falling over time. Again, this is derived from some of the metrics I used to compare the four kernels. These metrics do not yield noise.

Re:statistical wash-out? on Code Quality In Open and Closed Source Kernels · 2008-05-16 19:37 · Score: 1

My personal opinion is that if statistics are a wash-out in general, then the researcher is asking the wrong questions. I know that the author pre-defined his metrics in order to avoid bias, but that's not necessarily good science. Scientific questions should be directed toward answering specific questions, and the investigatory process must allow the scientist to ask new questions based on new data. There is clear non-anecdotal evidence that these operating systems behave differently (and, additionally, we assign a qualitative meaning to this behavior), so the question as I understand it is: is this a result of the development style of the OS programmers? The author should seek to answer that question as unambiguously as possible. If the answer to that question is "it is unclear", then the author should have gone back and asked more questions before he published his paper, because all he has shown is that the investigatory techniques he used are ill-suited to answering the question he posed.

Wait a minute here: being unable to prove a hypothesis is a long-established scientific path. Due to the tiny number of samples (four kernels) I could not prove my case with statistical rigor, but publishing results that show I could not find a difference is still the scientifically honest thing to do. Reformulating the questions until you find an answer that suits you distorts the picture due to the file drawer problem.

Re:The 99% Solution on Code Quality In Open and Closed Source Kernels · 2008-05-16 12:30 · Score: 4, Informative

I've put the data and the SQL queries on the web. It is therefore easy for you to do what you suggest, because the filenames are stored in the database. Just perform a cascade delete for the files that you think don't belong to each system's core and rerun the queries. I'd be interested to know the results.
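A minimal sketch of what such a cascade delete and rerun could look like. The schema here (tables `files` and `functions`, columns `fid`, `name`, `nline`) and the sample rows are hypothetical stand-ins, not the layout of the published database; SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical schema and data: the published database may be laid out differently.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces cascades when enabled
conn.executescript("""
    CREATE TABLE files (
        fid  INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE functions (
        id    INTEGER PRIMARY KEY,
        fid   INTEGER NOT NULL REFERENCES files(fid) ON DELETE CASCADE,
        name  TEXT NOT NULL,
        nline INTEGER NOT NULL
    );
""")
conn.executemany("INSERT INTO files VALUES (?, ?)", [
    (1, "kern/sched.c"),
    (2, "dev/usb/driver.c"),
])
conn.executemany("INSERT INTO functions VALUES (?, ?, ?, ?)", [
    (1, 1, "schedule", 40),
    (2, 1, "wakeup", 20),
    (3, 2, "usb_attach", 300),
])


def mean_function_length(db):
    """Stand-in for one of the metric queries: average function length in lines."""
    return db.execute("SELECT AVG(nline) FROM functions").fetchone()[0]


before = mean_function_length(conn)

# Remove the files judged not to belong to the core; dependent rows go with them.
conn.execute("DELETE FROM files WHERE name LIKE 'dev/%'")

after = mean_function_length(conn)
remaining = conn.execute("SELECT COUNT(*) FROM functions").fetchone()[0]
print(before, after, remaining)  # 120.0 30.0 2
```

The point of the `ON DELETE CASCADE` constraint is that one `DELETE` on the file table is enough; every per-function (or per-identifier, per-comment) row tied to those files disappears before the metric query is rerun.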

Re:question on Code Quality In Open and Closed Source Kernels · 2008-05-16 10:23 · Score: 3, Interesting

what was the most foul comment you encountered :D ? and where did it reside

Decency laws in various parts of the world do not allow me to answer this question. However, I can say that in total the C files of the four kernels contain 18389 comments marked XXX. The most famous Unix comment is of course the well-known "You are not expected to understand this". See dmr's page for more details. This is also an interesting comment, especially considering the current troubles of the person who wrote it.
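For readers who want to try a similar count on their own code, here is a rough sketch. It is not the tooling used for the study: the regular expression is a naive assumption that ignores comment-like text inside string literals, and `count_marked_comments` is a name made up for this example.

```python
import re

# Naive C comment matcher: block comments or line comments.
# Assumption: no "/*" or "//" sequences hidden inside string literals.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def count_marked_comments(source: str, marker: str = "XXX") -> int:
    """Count the comments in a C source string that contain the marker."""
    return sum(1 for m in COMMENT_RE.finditer(source) if marker in m.group(0))


sample = """
/* XXX: this should be locked */
int f(void) { return 0; }  // fine
/* ordinary
   comment */
int g(void) { return 1; }  // XXX revisit
"""

print(count_marked_comments(sample))  # 2
```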

Re:Preprocessing: here we go again on Code Quality In Open and Closed Source Kernels · 2008-05-16 08:29 · Score: 1

This is a very perceptive comment. It goes deeper: Linux and FreeBSD can be (and often are) configured by end-users. These can tailor the kernel in hundreds of ways. In FreeBSD 6.2 I measure more than 340 kernel options. These are mostly handled by the preprocessor.

Thanks for pointing this out.

Re:Not that surprising on Code Quality In Open and Closed Source Kernels · 2008-05-16 07:36 · Score: 4, Interesting

I don't think that my results can support us in making arguments regarding 'slightly' higher quality, or 'exactly the same quality'. My figures are based on possibly interdependent, unweighted, and unvalidated metrics. Therefore they only allow us to draw conclusions involving large differences.

Re:Is it just me? on Code Quality In Open and Closed Source Kernels · 2008-05-16 07:31 · Score: 3, Interesting

The preprocessor algorithm I described in the Dr. Dobb's article is the one I used for parsing the code of this study. A strange preprocessor construct in the Linux kernel caused the macro-expansion algorithm I used previously to fail.

Re:Weird logic on Code Quality In Open and Closed Source Kernels · 2008-05-16 07:14 · Score: 1

What I'm saying is that when we're looking at maintainability of a large operating system (FreeBSD) there are few outliers. Therefore, one can make the case that in another similarly large operating system we can get a representative picture of its maintainability by looking at a subset of its code. My conclusion is related to the Law of large numbers, nothing deeper or more complex.

Think of my argument as looking at the people living in China and seeing that there are no areas occupied by giants or dwarfs. I then say that based on that fact, I can obtain the average height of people living in America, by looking at the people living in California.

It is not a water-tight argument, but it is the best argument I could make. I really wish Microsoft would supply me with more code (and ideally also process data) to study, but this is the best I could do with the available code.
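The law-of-large-numbers argument can be illustrated with a small simulation. This is only a sketch under strong assumptions: the per-file scores are synthetic, drawn from a single normal distribution (so, by construction, there are no regions of "giants or dwarfs"), and the numbers 100 and 15 are arbitrary.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic per-file maintainability scores for a large system.
population = [rng.gauss(100, 15) for _ in range(100_000)]

# A subset standing in for the part of the code that is available for study.
subset = population[:5_000]

whole_mean = statistics.mean(population)
subset_mean = statistics.mean(subset)
print(f"whole: {whole_mean:.2f}  subset: {subset_mean:.2f}")
```

With 5,000 samples the standard error of the subset mean is about 15 / sqrt(5000), roughly 0.2, so the two means land close together. The argument weakens exactly where the simulation's assumption fails: if the subset came from a region with a different distribution, its mean would no longer track the whole.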

Re:"Code quality" is bunk on Code Quality In Open and Closed Source Kernels · 2008-05-16 07:04 · Score: 1

With a liberal reading of "if it works" you're right. You can say that if the code is functional, reliable, usable, efficient, maintainable, and portable, then it is of high quality. But this is a circular definition, because this is how software quality is defined. As somebody else posted earlier, the quest for quality can lead you to an endless motorcycle trip on America's back-roads.

Re:So.... on Code Quality In Open and Closed Source Kernels · 2008-05-16 06:52 · Score: 2, Interesting

The way you license code can't directly affect its quality, but the way you develop it can. Here are some possible ways in which a company can affect (positively or negatively) the quality of the software:

Have managers and an oversight group control quality (+)
Through its bureaucracy remove incentives to find creative solutions to quality problems (-)
Pay for developers to attend training courses (+)
Provide a nice environment free of distractions that allows developers to focus on developing quality software (+)
Buy expensive tools that can detect quality problems (+)
Developers take their paycheck for granted and lose interest in what they are doing (-)
Developers write obfuscated code for job security (-)

And here are some possible ways in which an open source development effort can affect (positively or negatively) the quality of the software:

Volunteers are more motivated than paid employees (+)
Nobody takes responsibility for the overall quality of the code; responsibility is diffused (-)
Working conditions can be suboptimal (-)
Developers work part-time (-)
Developers eat their own dog food and therefore care about their code (+)
There are many eyeballs to spot code problems (+)
There are no marketing pressures to deliver substandard work (+)
Developers are geographically dispersed and can't communicate easily (-)

Both lists can be expanded, and many of the arguments can be refuted. Still, you get the idea: the inputs to the two development processes differ substantially and this could affect quality.

Re:The winner is still open source on Code Quality In Open and Closed Source Kernels · 2008-05-16 06:39 · Score: 1

This is a very interesting comment. I had not thought of my results in this light, because, based on my experience as a (minor) FreeBSD committer and as a Windows user, I was expecting to see a large difference in favor of open source code. Yes, in the way you put it, open source is a winner.

Re:No Clear Winner, but... on Code Quality In Open and Closed Source Kernels · 2008-05-16 06:09 · Score: 2, Insightful

Ten years ago I wrote an article criticizing the Windows API. Most of what I wrote then continues to be true today. Based on that external view of Windows, and the BSODs I regularly see, I was expecting to find many worse things in the kernel. The header file you mention is a clear manifestation of an inappropriate design, and I suspect that at higher levels of system functionality (say OLE or the GDI) there will be more parts of similarly bad quality.

Re:KLOCs? on Code Quality In Open and Closed Source Kernels · 2008-05-16 05:54 · Score: 2, Insightful

You can automatically recognize some bad smells of poor quality code. However, this will still let through poor quality code that has been explicitly written to guard against the bad smells. So, you can say for sure that some code stinks, but you can't (and I suspect you will never be able to) tell that some code excels.

Re:Is it just me? on Code Quality In Open and Closed Source Kernels · 2008-05-16 05:49 · Score: 2, Insightful

This is a very good point...

Re:"Code quality" is bunk on Code Quality In Open and Closed Source Kernels · 2008-05-16 05:28 · Score: 3, Interesting

Coding to achieve some code quality metrics is dangerous, but so is saying that code that works is good. Let me give you two examples of code I wrote a long time ago, and that still survives on the web.

This example is code that works and also has some nice quality attributes: 96% of the program lines (631 out of the 658) are comment text, rendering the program readable and understandable. With the exception of the two include file names (needed for a warning-free compile) the program passes the standard Unix spell checker without any errors.

This example is also code that works, and is quite compact for what it achieves.

I don't consider either of the two examples quality code. And sprucing up bad code with object orientation, design patterns, and a layered architecture will not magically increase its quality. On the other hand, you can often (but not always) recognize bad quality code by looking at figures one can obtain automatically. If the code is full of global variables, gotos, huge functions, copy-pasted elements, meaningless identifier names, and automatically generated template comments, you can be pretty sure that its quality is abysmal.
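As a rough illustration of "figures one can obtain automatically", the sketch below counts `goto` statements and measures the longest top-level brace block in a C source string. It is a crude stand-in and not the study's tooling: `smell_report` and its threshold are invented for this example, comments and string literals are not stripped, and a top-level block may be a struct or initializer and not a function.

```python
import re


def smell_report(source: str, max_block_lines: int = 50) -> dict:
    """Crude smell indicators for a C source string."""
    gotos = len(re.findall(r"\bgoto\b", source))

    depth = 0        # current brace nesting depth
    start = None     # line where the current top-level block opened
    longest = 0      # longest top-level block seen, in lines
    for lineno, line in enumerate(source.splitlines(), start=1):
        for ch in line:
            if ch == "{":
                if depth == 0:
                    start = lineno
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0 and start is not None:
                    longest = max(longest, lineno - start + 1)

    return {
        "gotos": gotos,
        "longest_block_lines": longest,
        "has_huge_block": longest > max_block_lines,
    }


sample = """
int f(int x)
{
    if (x < 0)
        goto fail;
    return x;
fail:
    return -1;
}
"""

report = smell_report(sample)
print(report)
# {'gotos': 1, 'longest_block_lines': 7, 'has_huge_block': False}
```

High counts from a check like this can flag code worth a closer look; low counts say nothing about whether the code is good.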

Re:Stupid metrics on Code Quality In Open and Closed Source Kernels · 2008-05-16 05:15 · Score: 3, Insightful

It took me about two months of work to collect these metrics. Yes, also running the code of the four kernels through a static analysis tool would have been even better, but this would have been considerably more work: you need to adjust each tool to the peculiarities of the code, add annotations to the code, weed out false positives, and even then you only get one aspect of quality, the one related to bugs such as deadlocks and null pointer dereferences.

Using one of the tools you propose, you will still not obtain results regarding the analysability, changeability or readability of the code.

Re:Is it just me? on Code Quality In Open and Closed Source Kernels · 2008-05-16 05:06 · Score: 5, Interesting

It's not a very good summary, but the paper is well-written, which is interesting considering that the author is the one who submitted the summary to Slashdot. I suspect that he assumes we have more familiarity with the subject than we actually do.

My submission did not include the last sentence of the "summary", which, I agree, is completely incomprehensible in the form it appears.
