gringer · Slashdot Mirror

Re:So? on Wi-Fi Direct Gets Real With Product Certification · 2010-10-25 09:30 · Score: 1

Doesn't OLPC XO-1 use 802.11s for ad-hoc/mesh networking?

Re:Spinning disks have left this customer on Are Consumer Hard Drives Headed Into History? · 2010-10-24 12:21 · Score: 1

look at those small files in /lib - they're symlinks to larger files

The command line that I ran dereferenced symlinks (du -L -b -a), as I've previously mentioned. Due to the command I ran, the small files in /lib are either files or directories. If they are directories, then the number is the total size of the files within the directories, so the files within the directories are no larger than that.

Following your prompting, my previous posts have looked at a number of locations on my computer(s) and in all cases found a substantial proportion of small files. I don't claim to be a normal user, but suggest that, based on my evidence, your analysis and interpretation of results may not be statistically sound.

/home
/home/Desktop
/lib
[lsof output]

Re:Spinning disks have left this customer on Are Consumer Hard Drives Headed Into History? · 2010-10-24 09:20 · Score: 1

Then again, go into any user's desktop directory ... most have LOTS of big files there.

Desktop? really? Okay:
$ du -L -b -a
318 ./Konsole.desktop
4508 ./Home.desktop
659 ./Braid.lnk
73 ./.directory
197 ./trash.desktop
5963 .
They look like pretty small files to me.

Or do like I did -go look in /lib, where most of your programs actually live. The only files at 4k or under are symlinks and directory entries

Fine, if you want:

du -L -b -a /lib | awk '{print $1}' > ~/libfiles.txt [-L: dereference symlinks, -b: apparent size in bytes, -a: all files]

[analysis using R]:
> a <- read.table("libfiles.txt")
> mean(a$V1)
[1] 76875.4
> sd(a$V1)
[1] 2258044
> median(a$V1)
[1] 3776
> sum(a$V1 <= 4096)
[1] 8875
> sum(a$V1 > 4096)
[1] 8428
> 8875 / (8875 + 8428)
[1] 0.5129168

This reports directory sizes as the total size of the files they contain, which will skew the results towards larger than actual sizes. Despite this, in /lib, 51% of my files are 4kiB or less in size (median file size 3776 bytes, mean 76875.4 bytes). This is probably due to the Linux kernel tree being in there on that computer. So I'll try the Eee PC that I have (stripped down to a pretty minimal system):

> b <- read.table("libfiles.txt")
> mean(b$V1)
[1] 155920.5
> sd(b$V1)
[1] 2595565
> median(b$V1)
[1] 13651.5
> sum(b$V1 <= 4096)
[1] 350
> sum(b$V1 > 4096)
[1] 3318
> 350 / (350+3318)
[1] 0.09541985

So now we get the number of files of 4096 bytes or less as 9.5% of the total files. Quite different from my other desktop, but I'll still stick with my statement that the frequency of small files on my computer(s) is not insignificant -- even when looking at /lib, there are still a reasonable number of files with a size of 4kiB or less.
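The proportion worked out in the R sessions above can also be computed directly in the shell. This is only a minimal sketch, assuming the sizes arrive one per line in the first column (as in du -b output); the function name small_fraction is made up for illustration and is not part of the original analysis:

```shell
# Print the fraction of input sizes (first column, in bytes) at or below a limit.
# small_fraction is a hypothetical helper name; the default limit of 4096
# matches the 4kiB threshold discussed above.
small_fraction() {
    awk -v limit="${1:-4096}" '
        { total++; if ($1 + 0 <= limit + 0) small++ }
        END { if (total > 0) printf "%.4f\n", small / total }
    '
}

# Possible use with the same du flags as above (not run here):
# du -L -b -a /lib | small_fraction 4096
```

Because directories are reported with the total size of their contents, the result is a lower bound on the share of small files, as noted above.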

Re:very telling on Rounding the Bases Faster, With Math · 2010-10-23 21:39 · Score: 1

From the article:

A path that follows a circle turned out to be a whopping 25 percent faster.

That's a pretty big performance boost. It'd need to get to 33% faster to turn a 3rd base run into a home run every time, but there may be times when 25% is all you need.

Re:Spinning disks have left this customer on Are Consumer Hard Drives Headed Into History? · 2010-10-23 21:01 · Score: 1

No - I said average because I meant average.

Sure, but which average did you mean?

If you're talking about the average function in Excel/Calc, then that's the arithmetic mean, which is not useful for explaining how many of your files are under a particular file size (as I mentioned previously). To reiterate an often-mentioned issue with the mean, in the case where you have a small number of really large files (e.g. ISOs, DVD rips), the mean will be affected to a large degree.

Since you're not so happy about the home directory usage because it's an "exception", let's try lsof (the currently open files on my computer, lsof -s -b -F ns0 > usedfiles.txt, analysed using R). Here are some statistics:

mean file size: 456807 bytes (~450kiB)
SD of file size: 2551370 bytes (i.e. ~2MiB!)
median file size: 56536 bytes (~50kiB)

The mean and median, in this case, are quite different, and suggest a substantial skew towards low-size files. So 50% of the files currently in use on my computer are more than 50kiB. Hence it is likely that "most" of the files are over 4kiB. I can verify this with counts:

number of open files with file size > 4kiB: 2972
number of open files with file size <= 4kiB: 475 (13.8%)

13% is less than 50%, granted, but it's not insignificant. Your comment was "Almost NO file on your file system is under 4k in size", and again I suggest that at least on my computer, this statement is incorrect.

Re:Spinning disks have left this customer on Are Consumer Hard Drives Headed Into History? · 2010-10-23 16:41 · Score: 1

my home directory, the average file size is 19,065,740 bytes

Given that you said "average", I presume you mean mean, which is not a good indicator of the most frequently present file. Median would be better, if you want to say "50% of my files are under this size".
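To show how far apart the two can be, here is a rough sketch that prints both for a list of sizes. The helper name mean_and_median and the sample sizes are invented for the example; they are not taken from anyone's actual file system:

```shell
# Print the arithmetic mean and the median of sizes read one per line.
# mean_and_median is a hypothetical helper name used only for illustration.
mean_and_median() {
    sort -n | awk '
        { v[NR] = $1; sum += $1 }
        END {
            if (NR == 0) exit
            if (NR % 2) median = v[(NR + 1) / 2]
            else median = (v[NR / 2] + v[NR / 2 + 1]) / 2
            printf "mean=%.1f median=%.1f\n", sum / NR, median
        }
    '
}

# Made-up sizes: four small files and one 700 MB ISO.
printf '%s\n' 1024 2048 3072 4096 700000000 | mean_and_median
# prints: mean=140002048.0 median=3072.0
```

A single large file drags the mean up to about 140 MB, while the median stays at 3072 bytes, which is much closer to what most of the files look like.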

Re:SSD vs HDD terminology on Are Consumer Hard Drives Headed Into History? · 2010-10-23 15:57 · Score: 1

... and then I re-read the summary to see this confusing statement:

...may have the clout to shift the market away from hard drives, even if they're still an order of magnitude cheaper

SSD drives are "hard drives". Arguably, they're harder than HDDs because they can have less air in them (required for moving parts to move).

SSD vs HDD terminology on Are Consumer Hard Drives Headed Into History? · 2010-10-23 15:52 · Score: 1

Well, I was going to whisper into the cacophony, "can we please assert that SSDs are also HDDs?" Then, just before writing this out, I expanded the acronyms and realised that "solid state drives" are not "hard disc drives". No doubt this will not be realised by most consumers -- I talk about bad computer memory and they get confused, or ask me if the files were backed up; another common confusion is hard drive == case + motherboard.

Re:Godwin's Law, regular edition on Are Consumer Hard Drives Headed Into History? · 2010-10-23 15:36 · Score: 1

That evokes an interesting question: does a person lose if they hint at the famous dictator, but don't mention him specifically?

Re:Spinning disks have left this customer on Are Consumer Hard Drives Headed Into History? · 2010-10-23 15:13 · Score: 2, Informative

No, small random reads are NOT the primary pattern in desktop usage. Almost NO file on your file system is under 4k in size, which is the "chunk" size for most 8mb to 64mb hd caches.

I differ in that respect. Not sure if my use is typical, but here's a dump of the counts for the smallest file sizes in my home directory:
~$ du .* --apparent-size -a 2>/dev/null | awk '{print $1}' | sort | uniq -c | sort -n -k 2,2 | head -n 10
40006 1
11237 2
6862 3
4831 4
3554 5
2964 6
2783 24
2619 7
2477 8
2229 22
In other words, the highest frequency file size is 1kB (blocks are 1kB in my version of du), next highest 2kB, and so on. I get an odd jump at 24kB and 22kB (and FWIW 0kB comes in at #18), but in general the smaller a file is, the more frequent it is.
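For anyone wanting to try a similar count, here is one way to tabulate sizes with the most frequent first. It is a sketch only: the helper name size_histogram is made up, and ordering by count is an assumption about how a listing like the one above would be sorted, not a record of the exact command that produced it:

```shell
# Count how often each size occurs and list the most frequent sizes first.
# size_histogram is a hypothetical helper name; input is one size per line
# in the first column (for example, du output in 1kB blocks).
size_histogram() {
    awk '{ print $1 }' |
        sort -n |
        uniq -c |
        awk '{ print $1, $2 }' |      # normalise to "count size"
        sort -k1,1nr -k2,2n           # highest count first, ties by smaller size
}

# Possible use on a home directory (not run here):
# du --apparent-size -a ~ 2>/dev/null | size_histogram | head -n 10
```

Each output line is a count followed by a size, in the same layout as the listing above.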

Re:Common misconception on US Elections Dominated By Closed Source. Again. · 2010-10-20 09:44 · Score: 1

I admit I'm impressed if you actually make that work in the UK. If it does work just like that for you, then sure, no reason to change it. Sadly, experience has shown it's not nearly so well done in the US

Seems to work in New Zealand as well. When I was a polling officer, it was just one person from one party who was there, but they were sitting next to our table recording the names (actually page/row numbers) of the people who voted. No talking to them -- that could get the scrutineers kicked out.

It's a bit trickier with postal ballots (and I'm not quite sure how they're scrutinised). However, we've just had a bit of an upset in Wellington with the underdog green candidate ending up as mayor because too many preferred her over the current mayor (our mayor is voted under an STV system). The difference was some small number of votes (fewer than 500, I think), but the incumbent team doesn't seem to be crying foul over the election itself.

Re:Because... on US Elections Dominated By Closed Source. Again. · 2010-10-20 09:30 · Score: 1

Oh, and take a snapshot when (if) it gets to +5 troll. If enough people did that it might be believable -- surely /. wouldn't commit voter fraud.

Re:How about a revoke? on NRO Warns They Are On Final IPv4 Address Blocks · 2010-10-18 13:39 · Score: 1

Don't you mean 1/4 every month? Remember! Always simplify your fractions.

It's 1/7 every month. Computer maths is a little stranger than normal maths.

Re:I'm Ray Ozzie, on Ray Ozzie To Step Down From His Role At Microsoft · 2010-10-18 13:19 · Score: 2, Funny

Its up to ten million and it hasn't found any more one digit UIDs, just the first ten.

Have you checked to make sure that there aren't any in the vast space between two whole numbers? That sounds like it could be quite a complicated exercise.

Re:Open office != MS Office on Why Microsoft Is So Scared of OpenOffice · 2010-10-17 23:00 · Score: 1

If I get a 5mm screw from "Scott's screws" and decide to one day switch to "Sam's Screws" I don't have to worry about retraining staff for how to use them

Yeah, but what if "Scott's screws" had a little kink just near the head that gave the screw a little extra bite, so your staff got used to tightening those screws a little less than most other screws?

Re:Modelling real disease? on Microsoft Eyes PC Isolation Ward To Thwart Botnets · 2010-10-07 13:11 · Score: 1

That's not quite how our immune system works, but I agree with the idea.

I consider the whitelist to be equivalent to the process of selection against autoimmune antibodies, mentioned at the end of this section. B cells won't ordinarily progress through to maturation if they generate antibodies with affinity for self signatures.

Modelling real disease? on Microsoft Eyes PC Isolation Ward To Thwart Botnets · 2010-10-07 12:17 · Score: 4, Informative

If you want to model how our body recognises and deals with disease, you need to concentrate on whitelists, rather than blacklists. Vaccinations are similar to a community blacklist, but for most pathogens our own immune system can work out what things are appropriate to reject.

Re:DON'T DO IT! You'll get fired on Simple Virus For Teaching? · 2010-10-06 14:59 · Score: 3, Informative

No where was it mentioned about creating one. Ever.... actually read the summary ffs.

I think you may have missed this part of the summary:

do I try to write one my self

File size on Bittorrent To Replace Standard Downloads? · 2010-10-03 14:38 · Score: 4, Insightful

Why? Because for small files (as I expect most software updates would be), downloading directly is quicker and safer.

Re:Vanishing People on Copyrights and CD-Rs Endanger Audio History · 2010-10-02 01:02 · Score: 1

We will be a mystery to archaeologists of the future.

You people from the future are a mystery to us here in New Zealand. 9/11 hasn't happened for us yet.

Article link on Scientists Stack Up New Genes For Height · 2010-09-30 10:39 · Score: 2, Informative

Took me a bit of time to find, but here's the link to the actual research paper (requires a Nature subscription):
http://www.nature.com/nature/journal/vaop/ncurrent/full/nature09410.html

From the abstract:

Our data explain approximately 10% of the phenotypic variation in height, and we estimate that unidentified common variants of similar effect sizes would increase this figure to approximately 16% of phenotypic variation (approximately 20% of heritable variation)

The introduction of the paper states that "80% of the variation [for height] within a given population is estimated to be attributable to additive genetic factors, but over 40 previously published variants explain less than 5% of the variance." While this paper pushes that to 16%, it's nowhere near the limit of what can be detected.

I find it interesting that they've got a sample size of around 100,000 individuals for this study (actually a meta-analysis of summary statistics from 46 GWAS of 133,653 individuals), but still claim a need for more individuals. I suspect that'll still be said when a study is done on 10 million individuals, or a billion.

Re:Obligatory on HDCP Encryption/Decryption Code Released · 2010-09-29 14:40 · Score: 1

Don't worry, a couple of minutes with the HDCP Encryption/Decryption Code, and everyone will be able to see it again.

Re:100% buzzword compliance on UK Goverment IT Chief Backs Open Source Suppliers · 2010-09-22 16:31 · Score: 0

a utility-maximising foray into language improvement optimisation techniques to obviate the degredation of A) core procedural goals and B)reconstruction of enlightened creative thought processes

FTFY

Good thing on Aussie Student Responsible For Twitter Exploit · 2010-09-22 12:08 · Score: 1

It's a good thing they just used onmouseover rather than onload. That would have been quite a chaotic mess.

Re:Awesome stuff, but it doesn't take off like a b on First Human-Powered Ornithopter · 2010-09-22 11:46 · Score: 2, Informative

You can open the login link in a new tab (or window, if that takes your fancy). Then when you preview/submit, you'll be logged in.

Comments · 792