I would happily support DRM that actually cared about customers' rights. I want the guarantee that, like physical media, DRM-protected content will be available in the far future. Blu-ray already fails this test, and I only purchase Blu-rays to strip the DRM and save the content in a long-term format. I want the ability to gift, loan, or sell any media that I possess the rights to. I don't want to possess merely a ticket which grants me admittance to content for a limited time, under limited conditions, subject to the dissolution of whatever producer, licensor, or operator manages the DRM scheme.
Because piracy has absolutely no effect on 99% of customers, I am fairly certain that what content producers/licensors truly fear is "casual piracy" and fair use like loans and libraries, where market forces drive the resale cost of digital media down to its natural price in the free market.
It's perfectly natural to resist inferior DRM schemes by refusing to make them standard. If you want me to support an open DRM standard then it needs to be capability based with normal customers like you or me represented as first class owners of those capabilities and implement a durable scheme for transfer of those capabilities into the indefinite future.
For example, consider an ownership-based scheme where producers issue N digitally-signed capabilities to a particular copyrighted work and sell them to customers on an electronic marketplace. Bitcoin has proven that it's possible to maintain a globally consistent transaction ledger of ownership of individual tokens, and a much cheaper implementation could maintain ownership and facilitate programmatic transfer of capabilities to digital works (to support sales, gifts, and even temporary loans) because the marginal value of acquiring more than one capability to the same work is zero and so there will be little need to spend gigawatts of electricity maintaining the blockchain against adversaries. The copyrighted work doesn't even have to be encrypted. Just make standards-compliant devices/software require current ownership of a capability to use the work. Yes, this is an easily defeated scheme for pirates, but so is every other DRM scheme. At least this respects individual property rights, the first sale doctrine, fair use, and libraries for the vast majority of users.
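A rough sketch of what the ownership side of that could look like, in Python. Everything in it is made up for illustration (the Ledger class, the issue/transfer/loan/may_play names, integer clock ticks for time), and it leaves out the digital signatures and the distributed consensus entirely; it only shows the bookkeeping a compliant player would consult.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Toy in-memory ledger; every name here is hypothetical."""
    owner: dict = field(default_factory=dict)  # capability id -> current owner
    loans: dict = field(default_factory=dict)  # capability id -> (borrower, expiry tick)

    def issue(self, work, count, producer):
        """Producer mints `count` capabilities for one work and owns them initially."""
        caps = [f"{work}#{i}" for i in range(count)]
        for cap in caps:
            self.owner[cap] = producer
        return caps

    def _borrower(self, cap, now):
        """Return the active borrower, or None if there is no unexpired loan."""
        borrower, until = self.loans.get(cap, (None, 0))
        return borrower if now < until else None

    def transfer(self, cap, seller, buyer, now):
        """Sale or gift: only the owner may transfer, and not while it is lent out."""
        if self.owner.get(cap) != seller or self._borrower(cap, now) is not None:
            raise PermissionError(f"{seller} cannot transfer {cap}")
        self.owner[cap] = buyer

    def loan(self, cap, lender, borrower, now, until):
        """Temporary loan: the lender gives up use of the work until tick `until`."""
        if self.owner.get(cap) != lender or self._borrower(cap, now) is not None:
            raise PermissionError(f"{lender} cannot lend {cap}")
        self.loans[cap] = (borrower, until)

    def may_play(self, cap, user, now):
        """The check a standards-compliant player would make before playback."""
        borrower = self._borrower(cap, now)
        return user == (borrower if borrower is not None else self.owner.get(cap))


ledger = Ledger()
caps = ledger.issue("some-film", 3, "studio")
ledger.transfer(caps[0], "studio", "alice", now=0)      # a sale
ledger.loan(caps[0], "alice", "bob", now=1, until=10)   # a temporary loan
print(ledger.may_play(caps[0], "bob", now=5))     # True: the loan is active
print(ledger.may_play(caps[0], "alice", now=5))   # False: alice lent it out
print(ledger.may_play(caps[0], "alice", now=10))  # True: the loan has expired
```

Note that the loan takes the capability away from the lender for its duration, which is what makes it behave like lending a physical disc rather than copying one.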
After dealing with channel bonding where one link would spontaneously become 100% busy while the other three hung around 10% busy every time one VM wanted to copy files somewhere, I decided channel bonding is basically useless.
Frankly I doubt concealed carry on a plane could possibly be very effective. If five terrorists can get box cutters on a plane then five terrorists can get concealed carry permits and outnumber any other concealed carry citizens. Not to mention that explosive decompression is no fun and I doubt plane windows were intended to resist more than a couple shots from the inside before catastrophically failing. Explosives and planes don't mix well, and given that a fair number of gun injuries are actually accidents (not to mention uncounted accidental firings) I would imagine the first time a plane has issues due to gunfire it would be accidental from an otherwise legal concealed handgun.
In the best case scenario some idiot terrorist without a gun gets shot a couple times in the center of mass with no over-penetration before setting off a bomb. Virtually any other scenario doesn't require firearms to handle; terrorists on a plane are literally at arm's length from a horde of people who hate terrorists. Terrorists taking overt action need room to maneuver or the ability to barricade themselves, or else the ability to instill overwhelming fear and inaction in everyone surrounding them. Taking over a plane with the threat of personal violence is pretty much the hardest thing a terrorist could accomplish at this point.
1.) all the photos I took of her seemed incredibly important at the time but are never looked at any more
Yeah, photos have a weird W-shaped utility: they get shared and looked at a lot when brand new. After 6 months to a year they sit in boxes/drives for years, and after about 20 years the utility climbs again until ~150 years later when no living relatives remember the people in the photos. Then after a few more decades they have historical value. Hence the need to plan for long-term storage.
Take UDF. Expand it to the PB realm, not the existing 2TB. Add some ZFS features like ditto blocks, 64-128 bit CRCs, cryptographically signed writes with public keys, standard encryption, standard compression, ability to duplicate the filesystem as an image (so rsync utilities are usable to preserve hierarchy), snapshot directories a la OneFS/WAFL,
ZFS is probably your best bet for now. Oracle built filesystem-level encryption into the Solaris offering, no luck for the free versions. No cryptographic signing of writes, but that is imho overkill when you have to trust the whole kernel and filesystem layer and so whole-disk encryption plus SHA256 checksums gives basically the same assurance that no data has been modified. You can hold snapshots in ZFS to prevent them from being accidentally deleted and treat them as basically WORM.
So from your smallest box 3x 3TB = 9 TB of data, and Glacier and Google Nearline (maybe others too?) are charging $0.01/GB-month, so about $90/month if you back up the whole thing. I don't know how much you pay for electricity in both locations, but if a box can run/idle at 100W and you leave it on all the time you spend ~900 kWh a year. At $0.20/kWh that's about $180/year per server. Disks every 3 years (if you get HGST's warranty) is $140/year (using $0.035/GB rough cost today), or $27/month per server for ongoing costs not including replacing the other hardware periodically. $54/month vs. $90/month? Sure, it's a little cheaper. If you wanted one box and one online service it makes running your own look better; $120/month vs $54/month. What about connectivity at both sites? If you are already paying an ISP for other reasons at both ends that's one thing, otherwise throw another ~$50/month on top of at least the backup server cost. AWS and Google appear to currently charge $0/GB for incoming transfers. Of course if you can get deals on cheap drives and run them past the warranty in a state with cheap electricity (or in a dorm room with free Internet/electricity) it's a lot cheaper.
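If you want to plug in your own numbers, the arithmetic above fits in a few lines of Python. The function and parameter names are just mine; the defaults are the figures from this comment (100W, $0.20/kWh, $140/year in disks, 9 TB at $0.01/GB-month), and it ignores connectivity and other hardware replacement just like the estimate does.

```python
HOURS_PER_YEAR = 24 * 365


def self_hosted_monthly(watts=100, usd_per_kwh=0.20, disk_usd_per_year=140):
    """Ongoing monthly cost of one always-on box: electricity plus disk replacement."""
    kwh_per_year = watts * HOURS_PER_YEAR / 1000
    power_usd_per_year = kwh_per_year * usd_per_kwh
    return (power_usd_per_year + disk_usd_per_year) / 12


def cloud_monthly(gigabytes=9000, usd_per_gb_month=0.01):
    """Monthly storage bill for keeping the whole data set in cold cloud storage."""
    return gigabytes * usd_per_gb_month


box = self_hosted_monthly()
cloud = cloud_monthly()
print(f"one box:             ${box:.0f}/month")
print(f"two boxes:           ${2 * box:.0f}/month")
print(f"cloud only:          ${cloud:.0f}/month")
print(f"one box + one cloud: ${box + cloud:.0f}/month")
```

Run unrounded, the self-hosted figures come out a little lower than the ones quoted (about $26, $53, and $116 a month), because 100W around the clock is 876 kWh a year rather than a round 900.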
As for security, encrypt before copying anywhere. You might as well be running local disk encryption too so you never have to worry about returning a disk with plaintext for warranty repair. I don't trust any company to keep the data I upload secret (FISA courts, NSA, bla bla bla), so encrypting incremental ZFS snapshots and uploading them is an efficient way of maintaining an offsite backup. I only have 1TB I care to back up this way so it's less sticker shock each month, but I still find it amusing that the first box I built was 4*320GB RAID5 and now that costs $9/month.
1500 VMs isn't that crazy for 3000 people when you have to use Windows. Every individual piece of software is going to want its own VM, often two or more for redundancy/load balancing, plus an equal number for the test environment, and often a few more for dev/upgrade environments. Many software packages with a server component are big cumbersome globs of many .exes that the vendor "recommends" be run on separate VMs because the developers have no clue how to write software and rebooting Windows is the first solution to half the issues. Think a 3000 person company doesn't have the necessary ~200 apps to reach 1500 VMs by this measure? There are usually several software applications that are specific to each department, and there are lots of departments: purchasing, accounting, distribution/receiving, each core business unit, HR, PR, engineering/plantops, business office, sales, and last but not least IT, which is guaranteed to run dozens if not hundreds of separate apps to do its job. Sure, not all of them require a server, but many do, even if it's just a ridiculous license server. Data? Anyone processing video or images is just going to have a crapload of data, period. Same for some raw scientific data from instrumentation. That said, it really does depend on the industry; I can imagine a 3000 person company where most employees are sales/warehouse/factory drones not needing that much software. Basically if most employees are "knowledge workers" (or shoehorned into it like healthcare where doctors and nurses are required to use atrocious piles of software to record minutiae about patient care) then IT is going to be bigger than others.
That said, you could probably use a synchronized random number generator as the shared pad data. The other side would only be able to decrypt messages for as long as they buffer the random number data; after which the message is lost to everyone for eternity. This could work for a TLS session where messages are exchanged with only a couple minutes (or preferably seconds) delay so that the buffer does not need to be very big.
That's roughly the definition of a stream cipher (e.g. RC4 or a block cipher in Counter mode). Only a cryptographically secure random number generator works, which is why such a thing is called a stream cipher and not just a "pseudo-random one time pad". In any case it's not a true one time pad because the entropy of the stream of pseudorandom data is limited to the entropy of the internal state of the cipher, and further limited by the entropy of the key. That means stream ciphers can be broken given only the ciphertext, as opposed to using a one time pad. Stream ciphers also share the same weakness as one time pads; reusing the same stream cipher key is just as bad as reusing a one time pad (virtually automatic recovery of all plaintexts encrypted with the same pad/stream).
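To make the key-reuse point concrete, here is a small Python demonstration. The keystream generator is a toy I made up for the example (SHA-256 over key plus counter); it is not a vetted cipher and only stands in for "some stream cipher". The point is that XORing two ciphertexts made with the same key cancels the keystream, no matter how good the generator is.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy counter-mode keystream: SHA-256(key || counter) blocks, concatenated."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Stream encryption: plaintext XOR keystream (decryption is the same operation)."""
    return xor(plaintext, keystream(key, len(plaintext)))


key = b"reused key"
p1 = b"attack at dawn"
p2 = b"retreat at six"
c1 = encrypt(key, p1)
c2 = encrypt(key, p2)

# The attacker never sees the key, only c1 and c2.
leak = xor(c1, c2)
assert leak == xor(p1, p2)      # the keystream has cancelled out completely

# Knowing (or guessing) one plaintext immediately reveals the other.
print(xor(leak, p1))            # b'retreat at six'
```

With real traffic the attacker usually doesn't know a whole plaintext, but known headers and language statistics are typically enough to peel both messages apart.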
For high throughput/IOPS requirements build a Lustre/Ceph/etc. cluster and mount the cluster filesystems directly on as many clients as possible. You'll have to set up gateway machines for CIFS/NFS clients that can't directly talk to the cluster, so figure out how much throughput those clients will need and build appropriate gateway boxes and hook them to the cluster. Sizing for performance depends on the type of workload, so start getting disk activity profiles and stats from any existing storage NOW to figure out what typical workloads look like. Data analysis before purchasing is your best friend.
If the IOPS and throughput requirements are especially low (guaranteed < 50 random IOPS [for RAID/background process/degraded-or-rebuilding-array overhead] per spindle and what a couple 10gbps ethernet ports can handle, over the entire lifetime of the system) then you can probably get away with just some SAS cards attached to SAS hotplug drive shelves and building one big FreeBSD ZFS box. Use two mirrored vdevs per pool (RAID10-alike) for the higher-IOPS processing group and RAIDZ2 or RAIDZ3 with ~15 disk vdevs for the archiving group to save on disk costs.
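The disk-cost tradeoff between those layouts is easy to see with a quick calculation. This is only a back-of-the-envelope Python sketch: the usable_fraction helper is my own name, and it counts raw parity/mirror overhead only, ignoring ZFS metadata, padding, and the free space you want to keep in reserve.

```python
def usable_fraction(layout: str, disks_per_vdev: int) -> float:
    """Fraction of raw capacity left after redundancy, for a single vdev."""
    redundant_disks = {
        "mirror": disks_per_vdev - 1,  # an N-way mirror stores one disk's worth of data
        "raidz2": 2,                   # two disks' worth of parity per vdev
        "raidz3": 3,                   # three disks' worth of parity per vdev
    }[layout]
    return (disks_per_vdev - redundant_disks) / disks_per_vdev


for layout, width in [("mirror", 2), ("raidz2", 15), ("raidz3", 15)]:
    print(f"{layout:7s} {width:2d} disks/vdev: {usable_fraction(layout, width):.0%} usable")
```

So the mirrored pool pays half its raw capacity for the IOPS, while 15-disk RAIDZ2 or RAIDZ3 vdevs give up only 2 or 3 disks out of every 15.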
Plan for 100% more growth in the first year than anyone says they need (shiny new storage always attracts new usage). Buy server hardware capable of 3 to 5 years of growth; be sure your SAS cards and arrays will scale that high if you go with one big storage box.
The only thing you're missing is support for arbitrary SIP-level proofs beyond type safety (e.g. support for arbitrary proofs of SIP behavior such as time/space complexity, halting, semantic properties, etc.), and a formally verified self-verifying proof-checker to make sure the compiler is generating correct code and proofs. It looks like you're looking into PCC and TAL, so once you can ship the verifier with its own proof and self-verify during the boot process, you can be fairly certain that hardware errors are the only problem left. I assume you're already executing with a subset of the x86(_64) instruction set for easier verification. I figure that limiting code generation to the smallest set of opcodes can take advantage of formal verification done by Intel/AMD/others in processor design, while excluding all the complex protected-mode and virtualization instructions. Turning off SMM and injection of other arbitrary BIOS/EFI code would also be handy. The hardest part to model and prove correct will probably be the multi-processor cache coherency behavior, but hopefully Intel at least has done some of that work already and can guarantee adherence to the specs.
With a warrant, that is. Same with webmail and any other hosted service. Warrants describing a particular place and person have a way of producing encryption keys from service providers. When warrants aren't fast enough for them, then you know they're doing something very, very wrong. Unlike movies where Jack Spy decrypts the terrorists' plans in real-time to thwart them, our jokers can barely even share high-priority bulletins about suspected terrorists planning to board a plane in a day or two. It's ludicrous to suggest that they need faster access to information when they can't even manage what they have already.
How about this; the 9 supreme court justices post their public keys on www.supremecourt.gov, keep their private keys safe, and I'll voluntarily split copies of my private keys into 5-of-9 shares using Shamir's secret-sharing scheme and encrypt each share to one justice and post the ciphertext publicly. Then the NSA can stop introducing weaknesses in the free software I use, and heaven forbid they need to peek at my shopping list, but if they do they can convince some actual judges to let them see it.
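For anyone who hasn't seen Shamir's scheme, the 5-of-9 split itself is only a few lines. This is a minimal Python sketch under my own assumptions: the split/combine names and the choice of prime field are illustrative, the secret is treated as an integer smaller than the prime, and the step of encrypting each share to a justice's public key is left out. It needs Python 3.8+ for the modular inverse via pow().

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split(secret: int, threshold: int, shares: int):
    """Hide `secret` as the constant term of a random degree-(threshold-1) polynomial."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule, mod PRIME
            acc = (acc * x + c) % PRIME
        return acc

    # Share i is the point (i, f(i)); x = 0 is never handed out.
    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def combine(points):
    """Recover f(0) by Lagrange interpolation over any `threshold` distinct shares."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


private_key = 123456789  # stand-in for real key material
shares = split(private_key, threshold=5, shares=9)
print(combine(shares[:5]) == private_key)   # any five shares recover the key
print(combine(shares[4:]) == private_key)   # a different five work just as well
```

Any four shares on their own are consistent with every possible secret, which is the whole point: fewer than five justices learn nothing.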
I dunno, I'm happy enough with my voluntary free association with the United States. I'm free to leave if I stop liking it, as are you.
What anti-state people don't seem to grasp is that the very same people who you hate in the government, the people who want to control your life and take things from you, weren't made that way by big government. Just look at Mexico. Big drug cartels (who may or may not be entirely the creation of anti-drug big government) are more powerful than the government. Wherever there is an advantage to be had by banding together and robbing the weaker or more honest people, you'll find that niche being filled. The job of government is to fill that niche with the least harmful and most inept robbers. That overpaid, uncooperative, unfriendly civil servant that you despise? Give them a gun and a posse and see how well that turns out for you.
Yeah, assuming you're not doing anything at all with the array while it's rebuilding, and none of the sectors have been remapped causing seeks in the middle of those long reads/writes.
To throw out one more piece of advice; RAID6 is useless without periodic media scans. You don't want to discover that one of your drives has bit errors while the array is rebuilding another failed drive. RAID6 can't correct a known-position error and an unknown-position error at the same time. raidz2 has checksums that should detect the bit flip and reconstruct the stripe from the N-2 known good copies, but at these sizes you should probably start worrying about the possibility of two bit flips in the same stripe.
Putting nuclear bombs on the tips of rockets and programming them to hit other parts of the Earth is also mere tool use. Tools are not inherently safe, and never have been. Autonomous tools are even less inherently safe. The most likely outcome of a failed singularity isn't being ruled by robot overlords, it's being dead.
Both Capitalism and Communism are supposed to be about maintaining the work force, so guess where we all are today?
A nominally capitalist country pays a communist country for much of its manufacturing because it's cheaper, instead of employing its own citizens. So the logical next step is to just buy the robot factory workers from China to replace workers in the U.S. to save on shipping costs.
The machine has no fucking clue about what it is translating. Not the media, not the content, not even what to and from which languages it is translating (other than a variable somewhere, which is not "knowing"). None whatsoever. Until it does, it has nothing to do with AI in the sense of TAFA. (The alarmist fucking article)
How would you determine this, quantitatively? Is there a series of questions you could ask a machine translator about the text that would distinguish it from a human translator? Asking questions like "How did this make you feel?" is getting into the Turing Test's territory. Asking questions like "Why did Alice feel X" or "Why did you choose this word over another word in this sentence?" is something that machines are getting better at answering all the time.
To head off the argument that machine translation is just using a large existing corpus of human-generated text, my response is that this is pretty much what humans do: interact with a lot of other humans and their texts to understand the meaning. Clearly humans have the tremendous advantage of actually experiencing some of what is written about to ground their understanding of the language, but as machine translation shows, it is not a necessity for demonstrating an understanding of language.
For the argument that meaning must be grounded in conscious experience for it to be considered "intelligence" I would argue that machine learning *has* experience spread across many different research institutions and over time. Artificial selection has produced those agents and models which work well for human language translation, and this experience is real, physical experience of algorithms in the world. Not all algorithms and models survived, the survivors were shaped by this experience even though it was not tied to one body, machine, location, or time. Whether machine translation agents are consciously aware of this experience, I couldn't say. They almost certainly have no direct memory of it, but evidence of the experience exists. Once a system gets to the point that it can provide a definite answer to the question "What have machine translation agents experienced?" and integrate everything it knows about itself and the research done to create it, then we'll have an answer.
Everything humans do is simply a matter of following a natural-selection-generated set of instructions, bootstrapping from the physical machinery of a single cell. Neurological processes work together in the brain to produce intelligence in humans, at least as far as we can tell. Removing parts of the human brain (via disease, injury, surgery, etc.) can reduce different aspects of intelligence, so it's not unreasonable to think that humans are also a pile of algorithms united in a special way that leads to general intelligence and that AI efforts are only lacking some of the pieces and a way of uniting them. As researchers put together more and more of the individual pieces (speech and object recognition, navigation, information gathering and association, etc.) the results probably won't look like artificial general intelligence until all the necessary pieces exist and it's only the integration that remains to be done. For example there's another article today about the claustrum in a woman that appears to be an effective on-off switch for her consciousness, strengthening the evidence for consciousness being an integration of various neural subsystems mediated by other regions that produce consciousness.
It's important to consider that AGI may act nothing like human or animal intelligence, either. It may not be interested in communication, exploration, or anything else that humans are interested in. Its drives or goals will be the result of its algorithms, and we shouldn't discount the possibility of very inhuman intelligence that nonetheless has a lot of power to change the world. Expecting androids or anthropomorphic robots to emerge from the first AGI is wishful thinking. The simplest AGI would probably be most similar to bacteria or other organisms we find annoying; it would understand the world well enough to improve itself with advanced technology but wouldn't consider the physical world to consist of anything but resources for its own growth. It may even lack sentient consciousness.
Producing human-equivalent AGI is a step or two beyond functional AGI. Implementing all of nature's tricks for getting humans to do the things we do in silicon will not be a trivial task. Look at The Moral Landscape or similar for ideas about how one might go about reverse engineering what makes humans "human" so that the rules could be encoded in AGI.
Unless all the code running on the machine is absolutely type-safe and only allows "safe" reflection then trying to hide sensitive data from other bits of code in your address space is a lost cause. Code modification, emulation, tracing, breakpoint instructions, hardware debugger support, etc. are all viable ways for untrusted code with access to your address space to steal your data.
Wiping memory is only effective for avoiding hot or cold boot attacks against RAM, despite its frequent use for hacking terrible operating systems to hope/pretend that userspace software isn't leaking data into other processes either directly via attacks or accidentally through kernel mishandling of memory.
Confidence bands/intervals don't make statements about the probability of certain outcomes. They make statements about the interval itself. At best 95% of the bands calculated will include the "true value". No, this is not a nitpick.
Mod up. There is quite a difference between being 95% certain of a particular outcome and a particular outcome being within a 95% confidence interval. When rolling a D20 a 10 is within the 95% confidence interval [2-20] but rolling a 10 sure as hell isn't 95% likely.
A credible interval (sometimes called a Bayesian confidence interval) predicts how likely it is that the true value lies within the interval.
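Both points are easy to check numerically. Below is a small Python sketch; the function name, sample sizes, seed, and the choice of a normal population with known sigma are all my own assumptions for the demo. The first part is the D20 arithmetic from above, the second repeatedly computes a 95% interval for a mean and counts how often the interval procedure captures the true value.

```python
import random
from fractions import Fraction
from statistics import NormalDist, mean

# Part 1: the D20. The range [2, 20] contains 95% of rolls,
# yet any single face inside it is only 5% likely.
p_face = Fraction(1, 20)
in_interval = sum(p_face for face in range(1, 21) if 2 <= face <= 20)
p_ten = p_face
print(in_interval, p_ten)  # 19/20 1/20


# Part 2: the 95% describes the procedure. Across many repeated samples,
# roughly 95% of the computed intervals contain the true mean.
def coverage(trials=2000, n=25, mu=10.0, sigma=2.0, seed=1):
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)      # about 1.96
    half_width = z * sigma / n ** 0.5    # known-sigma interval for the mean
    hits = 0
    for _ in range(trials):
        sample_mean = mean([rng.gauss(mu, sigma) for _ in range(n)])
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / trials


print(coverage())  # close to 0.95
```

Any one of those intervals either contains the true mean or it doesn't; the 95% is a long-run property of the whole batch.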
There's a problem if being without official citizenship is an automatic one-way ticket to GITMO. The obvious solution is to give him a work visa in the U.S., with the option of citizenship.
So, sure, whitelisting might prevent your users from running unapproved browsers at work, but it will not secure a computer system against actual attackers. Not to mention that a good chunk of would-be whitelisted binaries actually have embedded language environments (macros, JavaScript, shell/batch scripts, Java, VBScript, etc.) that would also need to be added to the whitelisting framework.
I would happily support DRM that actually cared about customers' rights. I want the guarantee that, like physical media, DRM-protected content will be available in the far future. Blu-ray already fails this test, and I only purchase Blu-rays to strip the DRM and save a long-term format. I want the ability to gift, loan, or sell any media that I possess the rights to. I don't want to possess merely a ticket which grants me admittance to content for a limited time, under limited conditions, subject to the dissolution of whatever producer, licencor, or operator manages the DRM scheme.
Because piracy has absolutely no effect on 99% of customers I am fairly certain that what content producers/licencors truly fear is "casual piracy" and fair use like loans and libraries where market forces drive the resale cost of digital media down to its natural price in the free market.
It's perfectly natural to resist inferior DRM schemes by refusing to make them standard. If you want me to support an open DRM standard then it needs to be capability based with normal customers like you or me represented as first class owners of those capabilities and implement a durable scheme for transfer of those capabilities into the indefinite future.
For example, consider a ownership-based scheme where producers issue N digitally-signed capabilities to a particular copyrighted work and sell them to customers on an electronic marketplace. Bitcoin has proven that it's possible to maintain a globally consistent transaction ledger of ownership of individual tokens, and a much cheaper implementation could maintain ownership and facilitate programmatic transfer of capabilities to digital works (to support sales, gifts, and even temporary loans) because the marginal value of acquiring more than one capability to the same work is zero and so there will be little need to spend gigawatts of electricity maintaining the blockchain against adversaries. The copyrighted work doesn't even have to be encrypted. Just make standards-compliant devices/software require current ownership of a capability to use the work. Yes, this is an easily defeated scheme for pirates, but so is every other DRM scheme. At least this respects individual property rights, the first sale doctrine, fair use, and libraries for the vast majority of users.
After dealing with channel bonding where one link would spontaneously become 100% busy while the other three hung around 10% busy every time one VM wanted to copy files somewhere, I decided channel bonding is basically useless.
In the best case scenario some idiot terrorist without a gun gets shot a couple times in the center of mass with no over-penetration before setting off a bomb. Virtually any other scenario doesn't require firearms to handle; terrorists on a plane are literally at arms-length to a hoard of people who hate terrorists. Terrorists taking overt action need room to maneuver or the ability to barricade themselves, or else the ability to instill overwhelming fear and inaction in everyone surrounding them. Taking over a plane with the threat of personal violence is pretty much the hardest thing a terrorist could accomplish at this point.
1.) all the photos I took of her seemed incredibly important at the time but are never looked at any more
Yeah, photos have a weird W-shaped utility; They get shared and looked at a lot when brand new. After 6 months to a year they sit in boxes/drives for years and after about 20 years the utility climbs again until ~150 years later when no living relatives remember the people in the photos. Then after a few more decades they have historical value. Hence the need to plan for long-term storage.
Take UDF. Expand it to the PB realm, not the existing 2TB. Add some ZFS features like ditto blocks, 64-128 bit CRCs, cryptographically signed writes with public keys, standard encryption, standard compression, ability to duplicate the filesystem as an image (so rsync utilities are usable to preserve hierarchy), snapshot directories a la OneFS/WAFL,
ZFS is probably your best bet for now. Oracle built filesystem-level encryption into the Solaris offering, no luck for the free versions. No cryptographic signing of writes, but that is imho overkill when you have to trust the whole kernel and filesystem layer and so whole-disk encryption plus SHA256 checksums gives basically the same assurance that no data has been modified. You can hold snapshots in ZFS to prevent them from being accidentally deleted and treat them as basically WORM.
So from your smallest box 3x 3TB = 9 TB of data, and Glacier and Google nearline (maybe others too?) are charging $0.01/GB-month, so about $90/month if you back up the whole thing. I don't know how much you pay for electricity in both locations, but if a box can run/idle at 100W and you leave it on all the time you spend ~900KW a year. At $0.20/KWh that's about $180/year per server. Disks every 3 years (if you get HGST's warranty) is $140/year (using $0.035/GB rough cost today), or $27/month per server for ongoing costs not including replacing the other hardware periodically. $54/month vs. $90/month? Sure, it's a little cheaper. If you wanted one box and one online service it makes running your own look better; $120/month vs $54/month. What about connectivity at both sites? If you are already paying an ISP for other reasons at both ends that's one thing, otherwise throw another ~$50/month on top of at least the backup server cost. AWS and Google appear to currently charge $0/GB for incoming transfers. Of course if you can get deals on cheap drives and run them past the warranty in a state with cheap electricity (or in a dorm room with free Internet/electricity) it's a lot cheaper.
As for security, encrypt before copying anywhere. You might as well be running local disk encryption too so you never have to worry about returning a disk with plaintext for warranty repair. I don't trust any company to keep the data I upload secret (FISA courts, NSA, bla bla bla), so encrypting incremental ZFS snapshots and uploading them is an efficient way of maintaining an offsite backup. I only have 1TB I care to back up this way so it's less sticker shock each month, but I still find it amusing that the first box I built was 4*320GB RAID5 and now that costs $9/month.
1500 VMs isn't that crazy for 3000 people when you have to use Windows. Every individual piece of software is going to want its own VM, often two or more for redundancy/load balancing, plus an equal number for the test environment, and often a few more for dev/upgrade environments. Many software packages with a server component are big cumbersome globs of many .exes that the vendor "recommends" be run on separate VMs because the developers have no clue how to write software and rebooting Windows is the first solution to half the issues. Think a 3000 person company doesn't have the necessary ~200 apps to reach 1500 VMs by this measure? There's usually several software applications that are specific to each department, and there are lots of departments: purchasing, accounting, distribution/receiving, each core business unit, HR, PR, engineering/plantops, business office, sales, and last but not least IT which is guaranteed to run dozens if not hundreds of separate apps to do their jobs. Sure; not all of them require a server, but many do, even if it's just a ridiculous license server. Data? Anyone processing video or images is just going to have a crapload of data period. Same for some raw scientific data from instrumentation. That said, it really does depend on the industry; I can imagine a 3000 person company where most employees are sales/warehouse/factory drones not needing that much software. Basically if most employees are "knowledge workers" (or shoehorned into it like healthcare where doctors and nurses are required to use atrocious piles of software to record minutia about patient care) then IT is going to be bigger than others.
That said, you could probably use a synchronized random number generator as the shared pad data. The other side would only be able to decrypt messages for as long as they buffer the random number data; after which the message is lost to everyone for eternity. This could work for a TLS session where messages are exchanged with only a couple minutes (or preferably seconds) delay so that the buffer does not need to be very big.
That's roughly the definition of a stream cipher (e.g. RC4 or a block cipher in Counter mode). Only a cryptographically secure random number generator works, which is why such a thing is called a stream cipher and not just a "pseudo-random one time pad". In any case it's not a true one time pad because the entropy of the stream of pseudorandom data is limited to the entropy of the internal state of the cipher, and further limited by the entropy of the key. That means stream ciphers can be broken given only the ciphertext, as opposed to using a one time pad. Stream ciphers also share the same weakness as one time pads; reusing the same stream cipher key is just as bad as reusing a one time pad (virtually automatic recovery of all plaintexts encrypted with the same pad/stream).
For high throughput/IOPS requirements build a Lustre/Ceph/etc. cluster and mount the cluster filesystems directly on as many clients as possible. You'll have to set up gateway machines for CIFS/NFS clients that can't directly talk to the cluster, so figure out how much throughput those clients will need and build appropriate gateway boxes and hook them to the cluster. Sizing for performance depends on the type of workload, so start getting disk activity profiles and stats from any existing storage NOW to figure out what typical workloads look like. Data analysis before purchasing is your best friend.
If the IOPS and throughput requirements are especially low (guaranteed < 50 random IOPS [for RAID/background process/degraded-or-rebuilding-array overhead] per spindle and what a couple 10gbps ethernet ports can handle, over the entire lifetime of the system) then you can probably get away with just some SAS cards attached to SAS hotplug drive shelves and building one big FreeBSD ZFS box. Use two mirrored vdevs per pool (RAID10-alike) for the higher-IOPS processing group and RAIDZ2 or RAIDZ3 with ~15 disk vdevs for the archiving group to save on disk costs.
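For a feel of the trade-off, here's a back-of-the-envelope Python sketch. It assumes the mirrors are two-way, ignores ZFS metadata/padding overhead and hot spares, and the 60-spindle figure is just a made-up example; the 50 IOPS per spindle budget is the number from above.

```python
def usable_fraction(disks_per_vdev: int, redundancy_disks: int) -> float:
    """Fraction of raw capacity left after redundancy (ignores filesystem overhead)."""
    return (disks_per_vdev - redundancy_disks) / disks_per_vdev

def iops_budget(spindles: int, per_spindle: int = 50) -> int:
    """Conservative random-IOPS budget for the whole box."""
    return spindles * per_spindle

layouts = {
    "2-way mirror": usable_fraction(2, 1),       # assumption: two-disk mirrors
    "RAIDZ2, 15 disks": usable_fraction(15, 2),
    "RAIDZ3, 15 disks": usable_fraction(15, 3),
}

for name, fraction in layouts.items():
    print(f"{name}: {fraction:.0%} of raw capacity usable")

# Hypothetical example: 60 spindles at the conservative per-spindle budget.
print("IOPS budget for 60 spindles:", iops_budget(60))
```

Mirrors cost half the raw capacity in exchange for more vdevs to spread random I/O over, while wide RAIDZ2/RAIDZ3 vdevs keep 80% or more of the raw space, which is why they suit the archive group.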
Plan for 100% more growth in the first year than anyone says they need (shiny new storage always attracts new usage). Buy server hardware capable of 3 to 5 years of growth; be sure your SAS cards and arrays will scale that high if you go with one big storage box.
And let the ones smart enough to hide breed even smarter kids?
The only thing you're missing is support for arbitrary SIP-level proofs beyond type safety (e.g. support for arbitrary proofs of SIP behavior such as time/space complexity, halting, semantic properties, etc.), and a formally verified self-verifying proof-checker to make sure the compiler is generating correct code and proofs. It looks like you're looking into PCC and TAL, so once you can ship the verifier with its own proof and self-verify during the boot process, you can be fairly certain that hardware errors are the only problem left. I assume you're already executing with a subset of the x86(_64) instruction set for easier verification. I figure that limiting code generation to the smallest set of opcodes can take advantage of formal verification done by Intel/AMD/others in processor design, while excluding all the complex protected-mode and virtualization instructions. Turning off SMM and injection of other arbitrary BIOS/EFI code would also be handy. The hardest part to model and prove correct will probably be the multi-processor cache coherency behavior, but hopefully Intel at least has done some of that work already and can guarantee adherence to the specs.
and your HR department is paying "competitive wages" at the 50th percentile?
Let me know how that works out for you.
With a warrant, that is. Same with webmail and any other hosted service. Warrants describing a particular place and person have a way of producing encryption keys from service providers. When warrants aren't fast enough for them, then you know they're doing something very, very wrong. Unlike movies where Jack Spy decrypts the terrorists' plans in real-time to thwart them, our jokers can barely even share high-priority bulletins about suspected terrorists planning to board a plane in a day or two. It's ludicrous to suggest that they need faster access to information when they can't even manage what they have already.
How about this; the 9 supreme court justices post their public keys on www.supremecourt.gov, keep their private keys safe, and I'll voluntarily split copies of my private keys into 5-of-9 shares using Shamir's secret-sharing scheme and encrypt each share to one justice and post the ciphertext publicly. Then the NSA can stop introducing weaknesses in the free software I use, and heaven forbid they need to peek at my shopping list, but if they do they can convince some actual judges to let them see it.
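For anyone who hasn't seen it, the 5-of-9 split is only a few lines. This is a minimal Python sketch of Shamir's scheme over a prime field, not production crypto; the function names and the choice of prime are mine, and it leaves out the step of encrypting each share to a justice's public key.

```python
import secrets

# Assumption: a Mersenne prime comfortably larger than the secret being split.
PRIME = 2**521 - 1

def split_secret(secret: int, threshold: int, num_shares: int):
    """Split `secret` into points on a random polynomial of degree threshold-1."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    # The constant term is the secret; the other coefficients are random.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):      # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover_secret(shares):
    """Lagrange-interpolate the polynomial at x = 0 from the given shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

private_key = secrets.randbits(256)          # stand-in for a real private key
shares = split_secret(private_key, threshold=5, num_shares=9)
assert recover_secret(shares[:5]) == private_key   # any five shares suffice
```

Any five of the nine shares reconstruct the key; four or fewer reveal nothing about it, which is the whole point of needing a majority of the court.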
I dunno, I'm happy enough with my voluntary free association with the United States. I'm free to leave if I stop liking it, as are you.
What anti-state people don't seem to grasp is that the very same people who you hate in the government, the people who want to control your life and take things from you, weren't made that way by big government. Just look at Mexico. Big drug cartels (who may or may not be entirely the creation of anti-drug big government) are more powerful than the government. Wherever there is an advantage to be had by banding together and robbing the weaker or more honest people, you'll find that niche being filled. The job of government is to fill that niche with the least harmful and most inept robbers. That overpaid, uncooperative, unfriendly civil servant that you despise? Give them a gun and a posse and see how well that turns out for you.
Yeah, assuming you're not doing anything at all with the array while it's rebuilding, and none of the sectors have been remapped causing seeks in the middle of those long reads/writes.
To throw out one more piece of advice; RAID6 is useless without periodic media scans. You don't want to discover that one of your drives has bit errors while the array is rebuilding another failed drive. RAID6 can't correct a known-position error and an unknown-position error at the same time. raidz2 has checksums that should detect the bit flip and reconstruct the stripe from the N-2 known good copies, but at these sizes you should probably start worrying about the possibility of two bit flips in the same stripe.
Putting nuclear bombs on the tips of rockets and programming them to hit other parts of the Earth is also mere tool use. Tools are not inherently safe, and never have been. Autonomous tools are even less inherently safe. The most likely outcome of a failed singularity isn't being ruled by robot overlords, it's being dead.
Why opponents hate basic income but love individual retirement accounts is beyond me.
Both Capitalism and Communism are supposed to be about maintaining the work force, so guess where we all are today?
A nominally capitalist country pays a communist country for much of its manufacturing because it's cheaper, instead of employing its own citizens. So the logical next step is to just buy the robot factory workers from China to replace workers in the U.S. to save on shipping costs.
The machine has no fucking clue about what it is translating. Not the media, not the content, not even what to and from which languages it is translating (other than a variable somewhere, which is not "knowing"). None whatsoever. Until it does, it has nothing to do with AI in the sense of TAFA. (The alarmist fucking article)
How would you determine this, quantitatively? Is there a series of questions you could ask a machine translator about the text that would distinguish it from a human translator? Asking questions like "How did this make you feel?" is getting into the Turing Test's territory. Asking questions like "Why did Alice feel X" or "Why did you choose this word over another word in this sentence?" is something that machines are getting better at answering all the time.
To head off the argument that machine translation is just using a large existing corpus of human-generated text, my response is that this is pretty much what humans do: interact with a lot of other humans and their texts to understand the meaning. Clearly humans have the tremendous advantage of actually experiencing some of what is written about to ground their understanding of the language, but as machine translation shows it is not a necessity for demonstrating an understanding of language.
For the argument that meaning must be grounded in conscious experience for it to be considered "intelligence" I would argue that machine learning *has* experience spread across many different research institutions and over time. Artificial selection has produced those agents and models which work well for human language translation, and this experience is real, physical experience of algorithms in the world. Not all algorithms and models survived, the survivors were shaped by this experience even though it was not tied to one body, machine, location, or time. Whether machine translation agents are consciously aware of this experience, I couldn't say. They almost certainly have no direct memory of it, but evidence of the experience exists. Once a system gets to the point that it can provide a definite answer to the question "What have machine translation agents experienced?" and integrate everything it knows about itself and the research done to create it, then we'll have an answer.
Everything humans do is simply a matter of following a natural-selection-generated set of instructions, bootstrapping from the physical machinery of a single cell. Neurological processes work together in the brain to produce intelligence in humans, at least as far as we can tell. Removing parts of the human brain (via disease, injury, surgery, etc.) can reduce different aspects of intelligence, so it's not unreasonable to think that humans are also a pile of algorithms united in a special way that leads to general intelligence and that AI efforts are only lacking some of the pieces and a way of uniting them. As researchers put together more and more of the individual pieces (speech and object recognition, navigation, information gathering and association, etc.) the results probably won't look like artificial general intelligence until all the necessary pieces exist and it's only the integration that remains to be done. For example there's another article today about the claustrum in a woman that appears to be an effective on-off switch for her consciousness, strengthening the evidence for consciousness being an integration of various neural subsystems mediated by other regions that produce consciousness.
It's important to consider that AGI may act nothing like human or animal intelligence, either. It may not be interested in communication, exploration, or anything else that humans are interested in. Its drives or goals will be the result of its algorithms, and we shouldn't discount the possibility of very inhuman intelligence that nonetheless has a lot of power to change the world. Expecting androids or anthropomorphic robots to emerge from the first AGI is wishful thinking. The simplest AGI would probably be most similar to bacteria or other organisms we find annoying; it would understand the world well enough to improve itself with advanced technology but wouldn't consider the physical world to consist of anything but resources for its own growth. It may even lack sentient consciousness.
Producing human-equivalent AGI is a step or two beyond functional AGI. Implementing all of nature's tricks for getting humans to do the things we do in silicon will not be a trivial task. Look at The Moral Landscape or similar for ideas about how one might go about reverse engineering what makes humans "human" so that the rules could be encoded in AGI.
Unless all the code running on the machine is absolutely type-safe and only allows "safe" reflection then trying to hide sensitive data from other bits of code in your address space is a lost cause. Code modification, emulation, tracing, breakpoint instructions, hardware debugger support, etc. are all viable ways for untrusted code with access to your address space to steal your data.
Wiping memory is only effective for avoiding hot or cold boot attacks against RAM, despite its frequent use for hacking terrible operating systems to hope/pretend that userspace software isn't leaking data into other processes either directly via attacks or accidentally through kernel mishandling of memory.
Confidence bands/intervals don't make statements about the probability of certain outcomes. They make statements about the interval itself. At best 95% of the bands calculated will include the "true value". No, this is not a nitpick.
Mod up. There is quite a difference between being 95% certain of a particular outcome and a particular outcome being within a 95% confidence interval. When rolling a D20 a 10 is within the 95% confidence interval [2-20] but rolling a 10 sure as hell isn't 95% likely.
A credible interval (sometimes called a Bayesian confidence interval) predicts how likely it is that the true value lies within the interval.
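If the distinction still seems like hair-splitting, a quick simulation helps. This is a sketch under assumptions I picked for convenience (normal data, known sigma, a z-interval, fixed seed); the parameter values are arbitrary. It counts how often the computed interval covers the true mean, then does the D20 arithmetic from above.

```python
import random
import statistics

def coverage(true_mean=10.0, sigma=2.0, n=30, trials=2000, seed=1):
    """Fraction of 95% z-intervals (known sigma) that contain the true mean."""
    rng = random.Random(seed)
    z = 1.96                                  # two-sided 95% normal quantile
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        if mean - half_width <= true_mean <= mean + half_width:
            hits += 1
    return hits / trials

print("share of intervals covering the true mean:", coverage())

# The D20 example: the range 2..20 holds 95% of outcomes,
# but any single face inside it is still only 5% likely.
faces = range(1, 21)
in_interval = [f for f in faces if 2 <= f <= 20]
p_interval = len(in_interval) / len(faces)   # 19/20
p_ten = 1 / len(faces)                        # 1/20
print(p_interval, p_ten)
```

The 95% describes the long-run behavior of the procedure that builds the intervals, not the probability of any particular value or any one interval being right.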
There's a problem if being without official citizenship is an automatic one-way ticket to GITMO. The obvious solution is to give him a work visa in the U.S., with the option of citizenship.
So, sure, whitelisting might prevent your users from running unapproved browsers at work, but it will not secure a computer system against actual attackers. Not to mention that a good chunk of would-be whitelisted binaries actually have embedded language environments (macros, javascript, shell/batch scripts, java, vbscript, etc.) that would also need to be added to the whitelisting framework.