Vesvvi · Slashdot Mirror

What about HPN? on OpenSSH Has a New Cipher — Chacha20-poly1305 — from D.J. Bernstein · 2013-12-11 06:33 · Score: 3, Insightful

This is all well and good, but what's the status of seeing HPN-SSH or similar incorporated? FreeBSD has incorporated it, but it's still messy on Linux systems.

Re:if you can access it on a website on Storing Your Encrypted Passwords Offline On a Dedicated Device · 2013-12-08 18:06 · Score: 1

It doesn't specifically solve any of those problems (except forbidden punctuation marks), although it simplifies them a bit.

Required characters (uppercase, punctuation, numbers) can be added post-hash as an insecure suffix to meet site requirements. These don't add any security, so you can carry them around with you, put them on a public website, or leave them on a sticky note on your monitor: "work suffix: #U1_. Github suffix: (#$JHi/."
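As a rough sketch of the suffix idea, assuming the hashed password has already been produced some other way: the table reuses the two example suffixes from this comment (reading the trailing periods as sentence punctuation), while the function name and the placeholder hash are made up purely for illustration.

```python
# Public, non-secret suffixes that exist only to satisfy site rules.
# The two entries reuse the examples from the comment above.
PUBLIC_SUFFIXES = {
    "work": "#U1_",
    "github": "(#$JHi/",
}


def add_public_suffix(hashed_password, site):
    """Append the site's public suffix to an already-hashed password.

    The suffix adds no security; it only supplies required character classes.
    Sites without an entry get the hashed password back unchanged.
    """
    return hashed_password + PUBLIC_SUFFIXES.get(site, "")


# "0123456789abcdef" is a placeholder, not a real derived password.
print(add_public_suffix("0123456789abcdef", "work"))
```

This prints 0123456789abcdef#U1_: the secret-derived part is untouched and the suffix can stay on the sticky note.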

The same thing can be said for length issues, although I've found that most systems these days are happy with 16. Admittedly, with the character set restriction it would be better to keep it long, but I would argue that by avoiding sending plaintext to the servers, we're avoiding the vast majority of vulnerabilities.

Expiration is made more simple by making it easier to remember passwords: changing one isn't a big deal. This continues onto your next point as well: you'll never have an error message that your new password is too similar to your old one.

I think your last point shows another benefit of terminating private passwords as soon as possible. On insecure hardware, your public (hashed) password is exposed, and of course it could be captured for future use. But that will limit exposure to a single service, and it won't reveal any hints about your password trends.

You actually overlooked the most important point: if we're hashing passwords on a secured and user-controlled device, it's very easy (space-, energy-, speed-efficient) to get the public/hashed passwords off (LCD), but it's still a bit annoying to get the private passwords onto the device. UI concerns are a problem: I can do it extremely fast and efficiently if I'm working on a desktop, but it's a bit slower even on a tablet. The further we go towards hardware which can be fully locked down (keyfob with a single chip), the harder it is to get the data onto the device.

Re:if you can access it on a website on Storing Your Encrypted Passwords Offline On a Dedicated Device · 2013-12-08 08:20 · Score: 1

I don't understand why there is so much effort placed on storing passwords. We already know what to do with passwords from the perspective of the server: discard them as soon as possible!

The password should be salted and hashed immediately, and it should never be stored in plaintext. So let's not store them at all: let the user remember the risky password, and encrypt it as soon as possible. It's a validated methodology, and it removes many/most of the trust issues of the user/server relationship: I don't care if the server fails to salt my password if it's already encrypted.

Now take this to the next step. The user-side "passwords" can be pretty weak, since they need to be memorable but not high-entropy. We don't want to re-use the same "password" everywhere (different sites/services), since that's a risk, but we can come up with a weak per-site salt that's easy to remember. Combine that with a relatively weak password and we have a winner.

Use-everywhere password: invsqrt
Site: slashdot.org. "Salt": modmadness. Full password: invsqrtmodmadness
hashlib.sha256(getpass.getpass().encode()).hexdigest()[::2][:16]
Password sent to server: "dee4ea048518f588"

Use-everywhere password: invsqrt
Site: stackexchange.com. "Salt": xyproblem. Full password: invsqrtxyproblem
hashlib.sha256(getpass.getpass().encode()).hexdigest()[::2][:16]
Password sent to server: "be6065c67f055583"

Yes, I know it's just a hash, but this is a simple example. There's some loss of strength from key vs hash lengths, re-using "passwords" etc, and I've thrown in some complication, but I think the general idea is sound. The most important fact is that insecure, memorable, secret information never leaves my brain. Ok, in practice it does: I enter it onto an offline encryption device, but it never goes anywhere else.
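For anyone who wants to try it, here is a minimal Python 3 sketch of the one-liner above. It takes the password and salt as arguments instead of prompting with getpass, so it runs non-interactively; the function name and the `length` parameter are my own additions, and I make no promise that it reproduces the exact strings quoted above.

```python
import hashlib


def site_password(master, salt, length=16):
    """Derive the string sent to a site from a memorable password and salt.

    Mirrors the one-liner above: SHA-256 the concatenation, keep every
    second hex character, then truncate to `length` characters.
    """
    full = (master + salt).encode("utf-8")  # Python 3 hashes bytes, not str
    digest = hashlib.sha256(full).hexdigest()  # 64 hex characters
    return digest[::2][:length]


slashdot = site_password("invsqrt", "modmadness")
stackexchange = site_password("invsqrt", "xyproblem")
print(slashdot, stackexchange)
```

In real use you would still read the password with getpass so it is never echoed or left in shell history.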

There is no private key to lose.
I don't have to store private information.
The public-side "passwords" are high-entropy and pseudo-random.
The user-side "passwords" are highly memorable.
An offline encryption device adds security, but it isn't necessary: in an emergency I can generate hashes nearly anywhere, since I carry my secure passphrases around with me in my brain.

You can stack additional levels of complication to make it more robust, but even the crudest implementation puts you in the top 0.01% of hardest-to-crack passwords. For example, your encryption fob can contain a private key: smash the fob and you have securely destroyed the ability to re-create passwords. It also would make the outgoing passwords much more secure.
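One way the fob variant could look, as a sketch only: I'm assuming the "private key" is a random secret stored on the device and swapping the plain hash for HMAC-SHA256 keyed with that secret. The function names and the 32-byte key size are my choices for illustration, not part of any existing device.

```python
import hashlib
import hmac
import secrets


def new_device_key():
    """Generate a random 32-byte key that would live only on the fob."""
    return secrets.token_bytes(32)


def keyed_site_password(device_key, master, salt, length=16):
    """HMAC-SHA256 the memorable password under the device key.

    Without the key the output cannot be recomputed, so destroying the
    fob destroys the ability to regenerate every derived password.
    """
    message = (master + salt).encode("utf-8")
    digest = hmac.new(device_key, message, hashlib.sha256).hexdigest()
    return digest[:length]


key = new_device_key()
print(keyed_site_password(key, "invsqrt", "modmadness"))
```

The trade-off is the one already mentioned: with a device key in the mix, you can no longer regenerate passwords from memory alone in an emergency.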

Please don't re-invent the wheel. on A Protocol For Home Automation · 2013-11-01 05:18 · Score: 4, Insightful

Please don't re-invent the wheel unless you need to. By that, I mean to say that automation and interconnection of "gadgets" is a well-established field in industry and tech. For example, vehicle ECU and sensor systems, factory automation, and data acquisition systems are all now decades old, and we should have a really solid idea of how to do these things properly.

Of course these existing systems aren't the same as what we're talking about here, with modules that span different physical link layers, protocols, etc. I just hope that we can take the best lessons from existing "gadget integration" attempts to make forward progress more successful and not just something doomed to rapid obsolescence.

For some fun and background, have a look at the old HPIB/GPIB physical/protocol standard (http://en.wikipedia.org/wiki/IEEE-488), which was used in many different pieces of scientific equipment. When that somewhat died out it was replaced by CAN (http://www.team-cag.com/support/theory/chroma/hplc_bas_at/system/cableConnections.html). Agilent uses that for their HPLCs (maybe test equipment, too?), and Waters uses the same physical link, but with a different protocol? Other vendors still work with contact-closure, and USB is becoming more popular, but that pushes so much onto the host computer and really enforces lock-in.

I will personally be watching this closely from the perspective of someone who operates a lot of data-acquisition equipment. Could this be the foundation for better interop between different vendors at the more commercial/research level, in addition to the consumer? I hope so.

Re: Right move on DNA Sequence Withheld From New Botulism Paper · 2013-10-28 06:13 · Score: 1

It looks like you are correct: the costs have dropped low enough that it's feasible to just make it synthetically. I guess that really shouldn't be surprising, given that this is basically a "parallelizable" technology.

what this will look like: on PubMed Commons Opens Up Scientific Articles To User Comments · 2013-10-23 10:04 · Score: 4, Interesting

I'm going to go out on a limb and predict where this will go first: improved metadata and citation networking. I'm an eligible author with pretty good experience with the system.

The initial comments will not be excessively negative. As I've mentioned before on Slashdot, publications are a summary of findings and never the full story: the authors are always holding back. On average, if it looks like they've overlooked something (from the standpoint of the reader), it's more likely to be an error or oversight on the reader's part than the authors'. I think people generally appreciate this point, so they'll be conservative in their criticism to avoid looking foolish.

Getting cited is a really big deal, and not being cited (when your work is highly relevant to the topic) is considered a serious slight. I've seen nasty phone and email messages bounced around because of this. So in the context of comments, you're going to see a lot of things along the lines of "They should have considered author X, work Y from 2003 because it is highly relevant." This is a safe comment to make, but it can also be used to make a subtle point, drawing attention to competing work the authors chose to ignore, etc.

There won't be a lot of novel observations/data/interpretations being presented. Online comment pages will not be considered a place to stake your claim on an idea. Hence, people won't want to be "scooped", and they will reserve key insights for themselves.

There will be a lot of referencing preprint sources as they become more popular. This will be a new form of citation: retroactive citation of "future" (current) works, and it will greatly improve the citation network. This is important because that network is critical (besides in-person networking) to follow the development of a research field.

Re:Right move on DNA Sequence Withheld From New Botulism Paper · 2013-10-19 12:07 · Score: 1

There is a chance I'm wrong (I buy proteins/peptides, not DNA), but I doubt it.

Notice on the page you linked that they are always describing "genes" and not generic sequences. Also note that the two categories are "human/mouse/rat" and "other", and that they specify "for ORF genes present in existing NCBI database". This is not a coincidence: they can offer these products because they know that it can be cloned out of the host species, after which "mutagenesis is starting from $149/mutation".

To my knowledge there is still no magic bullet for long DNA synthesis, although it appears I was wrong about the scale. Genscript will sell oligos in the range of 15-60, not 5-20, so that will substantially reduce the amount of work to assemble a bunch of them together.

Re:I know the scientist... on DNA Sequence Withheld From New Botulism Paper · 2013-10-19 05:45 · Score: 1

BSL-3 labs will attract DHS-type attention when they don't follow the rules carefully. Botulinum of any kind is a "select agent": http://www.selectagents.gov/Select%20Agents%20and%20Toxins%20List.html

On the other hand, there are a lot of "loopholes" (maybe not the best term). I've been surprised to see how simple it was to get samples out of BSL-4 and into an unregulated environment, even while following all the rules to the letter.

Re:Terrists on DNA Sequence Withheld From New Botulism Paper · 2013-10-19 05:33 · Score: 2

Sorry, that reference doesn't mean what you think it means. GP wants to know what it takes to go from arbitrary data to protein. The Science paper you linked describes what it takes (more than a decade ago) to take existing proteins and deposit them in an organized pattern onto a surface, which is a completely different topic.

I am not current on the data->protein problem, but to the best of my knowledge the current state of the art, at scale, is to engineer an organism to do it for you. All of the in vitro work ("synthetic" protein production machinery in a test tube, without live cells) will not scale to useful quantities: it's still academic.

Re:Right move on DNA Sequence Withheld From New Botulism Paper · 2013-10-19 05:24 · Score: 4, Informative

You and the previous few generations of comments are both correct and wrong.

The comment 3-up is wrong that anyone can do it: even with the sequence, it would be extremely difficult for even top-level professionals to do it from scratch.

The comment 2-up is wrong to say that it's hard, because if you can get the DNA construct then it's extremely easy. This deserves clarification: nearly everyone here (Slashdot audience, not molecular biologists) is going to assume that there's a magic black box that will turn a sequence into a real physical DNA construct, and they are mistaken. Data/sequence to DNA construct, absent of anything else, is extremely hard.

You are correct about nearly everything, except that it is not simple to just buy big sections of DNA. If you want 5-20 bases, that's not a problem. But this protein is ~450 bases long. You can't just order something like that, and "stitching it together" is possible but would probably take years to get right, even for a pro.

But the idea behind your comment is still valid, because this gene will not be a from-scratch, random sequence. It's going to be 95+% identical to existing sequences, so instead of splicing together 60 synthetic sequences (purchased from a company), you only need to splice together maybe 2-4 big pieces. Those pieces could be purchased, or possibly isolated if you can get the bacteria.

Re: Is this the right move? on DNA Sequence Withheld From New Botulism Paper · 2013-10-19 05:06 · Score: 1

No, it will slow down professionals as well.

Without the sequence, what can you do? It's pretty much guaranteed that the new strain produces a toxin with extremely high sequence homology to existing strains, so you know that to make the new toxin you just have to add/delete/exchange a few amino acids, or maybe add an insertion.

But there is no way to know or guess what should be altered. There are ways to create libraries of mutants, but then they will need to be screened, and that will not be a fast, simple, or safe process.

Without access to the original strain, there's not much you can do, and the few things you can attempt are no better than starting from scratch.

Re:The other issue with much of modern science on How Science Goes Wrong · 2013-10-17 13:15 · Score: 3, Insightful

It's resource intensive, but also just plain difficult. For example, publications are never a full description of an experiment, just the highlights. It takes a skilled researcher to fill in the gaps and then a second level of skill to accurately carry it out.

Looking at it from another perspective, ignoring scientific developments which are the result of inspired genius (which I would argue are rare), every new publication is the most novel and difficult work that has been conducted to date. If it weren't, it would have been done already.

So how can you expect someone else (who wasn't able or interested to carry out the work themselves) to immediately duplicate cutting-edge work based on an incomplete description? It's a bit amazing that up to 50% of publications could be replicated at all.

Re:Lord Forgive me, but on How Science Goes Wrong · 2013-10-17 13:09 · Score: 2

Scientists and researchers are not hamstrung by NDAs. If anything, things are going the other direction: university libraries are setting up self-publishing, open-access projects to disseminate the work being conducted by the researchers.

I've only seen NDAs and similar come up in one situation: when a researcher employed by the university is a guest or collaborator with a private company. Then the company might try to introduce such things, but the university legal is very hostile to that. I can't think of a single situation in which I've seen a university require an NDA, even when dealing with inventions and IP. They do require disclosure to the university, especially after the Stanford case ("will assign" vs "hereby assigns").

I would have expected this to change substantially with the America Invents Act, but to date no lawyers I've talked to have indicated that anything has or will change, which I think is a bit odd.

Re:Yeah, but it does depend on the area of science on How Science Goes Wrong · 2013-10-17 12:53 · Score: 1

If you have accurately and fully described the requirements of that department, she's better off not getting a degree from such a sham institution. I have never seen requirements like that.

Re:More details? on Ask Slashdot: Best Language To Learn For Scientific Computing? · 2013-10-17 06:49 · Score: 1

R is terrific but it's also horribly overused (much like Excel). In my experience R is best when called for very specific calculations within the context of a larger package written in something more general-purpose.

We are the ones in need of a network on Extreme Complexity of Scientific Data Driving New Math Techniques · 2013-10-11 10:20 · Score: 5, Insightful

I like some of the more subtle details in the title and summary: new math "techniques", "researchers need new mathematical tools", etc.

I find it hard to believe that our sciences are driving the math fields, as mature and well-developed as the math community is. But it is true that existing knowledge and tools from mathematics drive huge advances in the sciences when they are brought to bear. The sad truth is that scientists just don't play terribly well with others (maybe no one does): interdisciplinary work is rare and difficult, and so we end up re-inventing the wheel over and over again. The reality is that the "wheel" being created by the biologist in order to interpret their data is a poor copy of the one already understood by the physicist across campus.

What can we do about this? I'm not sure, but I think it's safe to say that our greatest scientific advances in the next few decades will be the result of novel collaborations, and not novel math or (strictly speaking) novel science.

Re:Sorry, this is a botched study. on PCBs Cause Birds To Sing a Different Tune · 2013-09-23 03:53 · Score: 1

I don't understand the point of your comments.

They averaged birds together by location, and compared that to song.

We already know that "location->song" probably has some kind of causation. We think that "PCB->song" may have a causation. So why would you try to stack "PCB->location->song"? There is no question that it introduces biases.

Can we remove those biases? Maybe, if we're careful. Can we remove those biases if we discard variability within location via averaging? NO!

Should the direct "PCB->song" relationship be presented? YES!!
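To make the averaging problem concrete, here is a toy sketch with made-up numbers (not data from the paper). In this invented data set song is completely unrelated to PCB inside every location, yet the location averages line up perfectly. The site names, values, and helper functions are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up (PCB level, song score) pairs per bird, grouped by location.
BIRDS = {
    "site_a": [(1, 10), (2, 12), (3, 12), (4, 10)],
    "site_b": [(6, 8), (7, 10), (8, 10), (9, 8)],
    "site_c": [(11, 6), (12, 8), (13, 8), (14, 6)],
}


def location_means(birds):
    """Collapse each location to its mean PCB level and mean song score."""
    means = []
    for pairs in birds.values():
        pcb = sum(p for p, _ in pairs) / len(pairs)
        song = sum(s for _, s in pairs) / len(pairs)
        means.append((pcb, song))
    return means


def within_location_r(birds):
    """Per-bird PCB-vs-song correlation computed inside each location."""
    return {
        site: pearson([p for p, _ in pairs], [s for _, s in pairs])
        for site, pairs in birds.items()
    }


means = location_means(BIRDS)
r_means = pearson([p for p, _ in means], [s for _, s in means])
print("r across location means:", r_means)
print("r within each location:", within_location_r(BIRDS))
```

Across the three location means the correlation comes out at -1, while within each site it is 0. Once the per-bird values are averaged away, there is no way to tell those two situations apart.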

Re:Sorry, this is a botched study. on PCBs Cause Birds To Sing a Different Tune · 2013-09-21 15:18 · Score: 1

Actually, I'm a professional. I make mistakes just like anyone: I overlooked Table 1 when I made my original statements.

But I do tend to review about 10 papers like this per month, I'm fairly good at it, and I have valid points.

Re:Sorry, this is a botched study. on PCBs Cause Birds To Sing a Different Tune · 2013-09-21 09:26 · Score: 1

You're partially right, I did overlook their "analysis". Table 1 gives us their conclusions, but there is no data. There are regression plots and PCA for other comparisons, but they left out everything relevant to PCB-vs-song.

So they didn't show the song data per bird. They did describe how they reduced song data to a high/low binary value ("trill performance"). They didn't show "trill performance" per bird. They didn't show the models. They didn't show any evaluation of the models. They did show the relative evaluation of the models.

I just don't understand why they left so much out! In the end they used a continuous variable (PCB) to predict a binary high/low song value, when they could have just kept and used the original song data. Maybe that's what they did? It would make sense, but that's not what they described in the paper.

Furthermore, there are all kinds of oddities in the Supplemental Table 1. It's presented as averaged per-region, but the data is filtered according to their individual-bird LOD/LOQ: filtering should be at the bird level, not after averaging. The error in their quantitation just so happens to always top out at 100%, which shows they've massaged that as well. They used the LOQ to arbitrarily set values to zero: at minimum these need to be treated as exceptions in the analysis. The values below LOQ have errors set to zero, while these values should have the largest relative error of all.
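A small sketch of why the below-LOQ handling matters, using invented readings and an invented LOQ (nothing here comes from the Supplemental Table). It compares zeroing the low values before averaging against keeping them as flagged exceptions; the function names are mine.

```python
LOQ = 5.0  # hypothetical limit of quantitation, arbitrary units

# Made-up per-bird PCB readings for one region.
readings = [2.0, 4.0, 8.0, 10.0]


def mean_with_zeroing(values, loq):
    """Replace every value below the LOQ with zero, then average."""
    return sum(v if v >= loq else 0.0 for v in values) / len(values)


def summarize_censored(values, loq):
    """Keep below-LOQ birds as flagged exceptions instead of zeros.

    Returns the mean of quantifiable values and the count of censored ones.
    """
    quantified = [v for v in values if v >= loq]
    censored = len(values) - len(quantified)
    mean = sum(quantified) / len(quantified) if quantified else None
    return mean, censored


print(mean_with_zeroing(readings, LOQ))
print(summarize_censored(readings, LOQ))
```

With these numbers the zeroed average is 4.5, which sits below the LOQ itself, while the flagged summary reports a mean of 9.0 with 2 birds censored. Neither is "right", but the second at least keeps the exceptions visible to the analysis.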

None of this directly invalidates the analysis, but it's just bizarre and sloppy. Considering that it's the entire cornerstone of their hypothesis, I still think that it implies poor work or deception.

Re:Sorry, this is a botched study. on PCBs Cause Birds To Sing a Different Tune · 2013-09-21 04:59 · Score: 1

I read the PLoS article: it wasn't there.

They took very substantial trouble to match the individual bird's song to the blood sample: that's excellent work! But where did they analyze that data? They didn't.

Sorry, this is a botched study. on PCBs Cause Birds To Sing a Different Tune · 2013-09-21 04:09 · Score: 1

I don't want to belittle the scientific work of others, because I know how hard it can be, but they completely dropped the ball on this one.

They measured:
geographic location of the birds
PCB levels of the birds
song patterns of the birds
some other stuff that doesn't matter (in the big picture)
All of these were measured on a per-bird basis.

They concluded by comparing geographic region to song patterns.

WHY?! Why didn't they directly compare the PCB levels to the songs?! Now, we are left wondering if the song patterns are due to geographic influences (local dialects?), or if it really is due to the PCBs. It's either a sloppy, sloppy omission, or they didn't like what those results showed, and I'm leaning towards the latter. This is the kind of public-interest stuff that the "top tier" journals love, so I have a feeling it didn't make it there for a reason.

Re:Advatages of ZFS over BTRFS? on OpenZFS Project Launches, Uniting ZFS Developers · 2013-09-17 17:55 · Score: 2

I had an upgrade path similar to yours, starting with RAIDZ and moving to a group of mirrors. I try not to let any pool get too big, so there are maybe 20 drives/pool. It's always the small files that are lost (see post above). I think each server does about 6 PB/year in each direction on these highly-accessed files, so I think it's reasonable to drop ~1MB of non-critical files (they basically store notes of data analysis).
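For a sense of scale, here is the back-of-the-envelope arithmetic as a sketch. It assumes decimal units and simply divides the ~1 MB lost by the ~6 PB moved in one direction per year, which is a crude ratio rather than a proper reliability figure; the helper names are mine.

```python
import math

PB = 10**15  # bytes in a petabyte (decimal)
MB = 10**6   # bytes in a megabyte (decimal)


def loss_fraction(bytes_lost, bytes_moved):
    """Fraction of transferred bytes that ended up unrecoverable."""
    return bytes_lost / bytes_moved


def nines(fraction_lost):
    """Number of 'nines' of reliability implied by a loss fraction."""
    return -math.log10(fraction_lost)


fraction = loss_fraction(1 * MB, 6 * PB)
print(f"{fraction:.2e} lost, about {nines(fraction):.1f} nines")
```

That works out to roughly 1.67e-10 of the bytes, or a bit under ten nines measured against traffic.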

So far I've never had a problem with VM images, but now we're mitigating that by adding redundant but isolated storage servers. I'm sure you could manage this without ZFS snapshots and send/recv, but I wouldn't want to try.

Re:Advatages of ZFS over BTRFS? on OpenZFS Project Launches, Uniting ZFS Developers · 2013-09-17 17:30 · Score: 4, Interesting

This is correct.

It is statistically assured that you will lose some data with anything less than obscene redundancy. I've run the numbers and we've settled on what's acceptable to us: we have offline backups far more frequently than 2 times/year for everything, so dropping about 2 files/year that are completely unrecoverable without backups isn't a big deal.

These systems are running a moderate number of very large static files, mixed with a very large number of very small files. The small files are SQLite-style records, and we churn through them very rapidly. I don't know exactly why, but it is always these small files that we lose: there is clearly a bias towards things that are written frequently. The analyst in me is quick to point out that this implies failures in ZFS itself, beyond just the disks and "bit rot", but the accelerated failure isn't enough to worry about. So our non-failure rate is easily 6-nines or better per year on the live storage system, but it's still a bit uncomfortable to know that some data is going to be gone, despite that.

With a minimal amount of effort you can get hardware and software which is no longer the biggest threat to your data. I am personally the most likely source of a catastrophic failure: operator error is more likely than an obscure hardware failure. ZFS has allowed me to reduce that operator error (snapshots, piping filesystems, nested datasets with inheritance), and simultaneously it's outperforming other options on both speeds and security. Overall, I'm extremely pleased.

Re:Advatages of ZFS over BTRFS? on OpenZFS Project Launches, Uniting ZFS Developers · 2013-09-17 12:59 · Score: 5, Informative

I don't have any practical experience with BTRFS, but I use ZFS heavily at work.

The advantage of ZFS is that it's tested, and it just works. When I started with our first ZFS testbed, I abused that thing in scary ways trying to get it to fail: hotplugging RAID controller cards, etc. Nothing really scratched it. Over the years I've made additional bad decisions such as upgrading filesystem versions while in a degraded state, missing logs, etc, but nothing has ever caused me to lose data, ever.

The one negative to ZFS (if you can call it that) is that it makes you aware of inevitable failures (scrubs catch them). I'll lose about 1 or 2 files per year (out of many many terabytes) just due to lousy luck, unless I store redundant high-level copies of data and/or metadata. Right now I use stripes over many sets of mirrored drives, but it's not enough when you read or write huge quantities of data. I've run the numbers and our losses are reasonable, but it's sobering to see the harsh reality that "good enough" efforts just aren't good enough for 100% at scale.

Re:IP Rights on Former Lockheed Skunkworks Engineer Auctioning a Prototype "Spy Rock" · 2013-08-25 11:03 · Score: 3, Informative

http://yro.slashdot.org/story/05/09/23/2022243/eminent-domain-applied-to-ip-due-to-state-secrets

User: Vesvvi

Comments · 98