If you're after data from two different stripe positions on a RAID 1 set, sending one drive to each place will get you the data faster than having both go to one place then both go to the other.
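A toy model of that point, as a rough sketch only: head position is just a number, "time" is the distance the serving head travels, and the nearest-head rule is my assumption, not how any particular controller schedules reads.

```python
def independent_travel(targets, heads):
    """Each read goes to whichever mirror's head is nearest; only that drive seeks."""
    heads = list(heads)
    total = 0
    for t in targets:
        # Pick the mirror whose head is closest to the requested position.
        i = min(range(len(heads)), key=lambda k: abs(heads[k] - t))
        total += abs(heads[i] - t)
        heads[i] = t
    return total


def lockstep_travel(targets, start):
    """All drives seek together, so every read pays the full distance from the last one."""
    pos, total = start, 0
    for t in targets:
        total += abs(pos - t)
        pos = t
    return total


if __name__ == "__main__":
    reads = [100, 900, 110, 910]  # two hot spots on the mirror, made-up positions
    print(independent_travel(reads, [0, 0]))  # 1020
    print(lockstep_travel(reads, 0))          # 2490
```

Once one head has settled near each hot spot, the independent case barely seeks at all, which is the whole advantage.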
So, I have seven database servers, all with identical copies of the data. Do I really care if I lose all the data on one of them because one drive in a RAID 0 set fails? The completely redundant systems do the job better than any RAID setup can.
You consider RAID 0 when you don't care about losing the data if there's a drive failure and want the benefits of striping and the extra space available for a given number of drive bays, compared to other RAID levels. RAID 5 can get you some of the space but it's slower for database work.
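To put numbers on the space trade-off, here is a minimal sketch of usable capacity per level. It is idealized: it ignores controller metadata and hot spares, assumes two-way mirrors for RAID 10, and the 100 GB drive size is only an example figure.

```python
def usable_capacity(level: str, drives: int, drive_size_gb: float) -> float:
    """Idealized usable space in GB for a few common RAID levels."""
    if level == "0":
        # Pure striping: every drive holds data, no redundancy.
        return drives * drive_size_gb
    if level == "1":
        # Every drive is a copy of the same data.
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return drive_size_gb
    if level == "10":
        # Striped two-way mirrors: half the drives' worth of space.
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        return drives / 2 * drive_size_gb
    if level == "5":
        # One drive's worth of space goes to parity.
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * drive_size_gb
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    for level in ("0", "5", "10"):
        print(level, usable_capacity(level, 6, 100.0))
```

With six bays that is 600 GB for RAID 0, 500 GB for RAID 5 and 300 GB for RAID 10.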
Since I wanted some facts, Wikipedia ordered two systems for database service, both dual Opterons with 4GB of RAM and six drives. One with 10,000 RPM SCSI drives and one with 10,000 RPM SATA drives. The SATA system, without NCQ, was generally faster and ended up with a higher proportion of the site load assigned to it. The SCSI system was sometimes faster in mixtures which included lots of writes with lots of reads and that made it lag a bit less in replication of bulk update operations, so newer systems have been SCSI. If more drive bays had been available, adding another couple of SATA drives would probably have made the SATA set faster for that case as well and still cheaper.
If lower access times are needed, SCSI drives beat SATA drives just because you can only get 15,000 RPM with a SCSI interface. May also make sense to have 15,000 RPM drives if you're already spending a lot of money on 16GB of RAM.
The question about this drive which interests me is whether drive write caching can be easily turned off and will stay off, so you don't lose database data when the database thinks the data has been flushed to the surface but it hasn't really been flushed. If you can't do that, it's unsuitable for a lot of database work - certainly unsuitable for use with RAID controllers with battery backed up write caches, where you have the battery to make sure you don't lose cached data if the power goes off. Anyone who thinks colo power and UPS will protect against loss of power hasn't suffered enough yet...:)
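For anyone wondering where the "database thinks it's flushed" part comes from, this is roughly what the software side looks like. A minimal sketch: the function name is made up for illustration, and a real database does a lot more than this.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data, then ask the OS to push it out to the storage device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # empty Python's userspace buffer into the OS
        os.fsync(f.fileno())  # ask the OS to flush its cache to the device


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        durable_write(os.path.join(d, "commit.log"), b"transaction 42 committed\n")
```

When os.fsync() returns, the application treats the data as safe. Whether it is actually on the platters at that point depends on the drive and controller honoring the flush, which is exactly why a write cache that can't be reliably turned off is a problem.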
Right on all points. The same would apply to Wikipedia as well, because it's also used generically for the encyclopedia and is essential for everyone else who needs to identify their distribution as being a distribution of Wikipedia.
By producing FOSS I mean people like you and me producing things for the benefit of all and trying to maximise code reuse and ability to customise things to meet our needs and preferences.
Just using the end result is a different matter. Talking here about code reuse and sharing, so we can do things like customise a printer driver to our needs by adding some copyleft code, if the driver is PD or BSD or even under another copyleft license. BSD or PD you can in theory solve by using the copyleft license of the code you want to integrate. But if the other license is copyleft, but a different copyleft, you're pretty much stuck with hoping the author(s) aren't advocates of one particular copyleft license and are willing to be part of the community and work together.
Copyleft licenses are far worse than the public domain. The typical type blocks all reuse except by something with exactly the same license, making each license a walled garden, only compatible with things having exactly the same license. Public domain or even a license like BSD lets just about any person producing FOSS use the work, without dividing the OSS world.
The story is quite sad. We start with the public domain and BSD making it really easy for hackers to share and improve programs with all of them being easy to merge with projects with other license types so everyone can work together.
Then RMS arrives on the scene and sees a commercial company not doing that sharing. So he comes up with a license (contrary to his objective of not signing a license or NDA) which tries to force unwilling people to do good things.
Unfortunately, it has a side effect. It's no longer possible for all the other licenses the good people are using to work with it. The moment they use a bit of software with the new license, they lock out all the rest of the good people, as their work ceases to be public domain or BSD and easy for everyone to use and integrate.
It'll be interesting to see how the future turns out and whether RMS becomes most known for splitting the free software community.
Personally, I wish some people would stop making life hard for the other good people in their quest to try to force the less good people to do what they want. Better to work with the other good people than continually fracture the community.
Linux is what the applications run on. It's the applications and the kernel they use to interact with the world that businesses are focussed on. The command line utilities are incidental and most end users of corporate applications or home users don't see them.
That's the world view which makes the OS called Linux, not GNU/Linux. It makes perfect sense for a corporation or home user. It's less so for a person who lives in a CLI and is using the tools routinely but most users aren't sysadmins. Of course, I do live in the sysadmin world but that doesn't mean that I don't see why it's normally called Linux.
You'd need to be prepared to deal with people hardwiring USB keyloggers to the motherboard or inserting them into the keyboard itself. Or inserting whatever into any other bits of the computer which are available. Add more when you might have to deal with actual professionals in the business of compromising such systems to get at their contents or install bugs for audio. Sounds like a really poor concept to try mixing use. But do ask the real experts, who I assume are your customers.
Quite right. It's far from certain that those cures are worth their price in terms of other medication required and the life consequences that brings. Still, even with flaws, they are cures and a way to do at least one of them could conceivably be found with reduced rejection risk. Maybe, eventually. Or not. Caveats because it's been said before and cure talk is an old, old story.:)
Both islet cell and pancreas transplants work to cure type 1 diabetes. They aren't viable for type 1s in general at present because of the great imbalance between the numbers of diabetics and donors but they are effective treatments. They may never be viable for most - certainly pancreas transplants won't be.
This is one of the proposals. Variations on time exist but it can help to reduce the reward for vandalism by making it impossible to show your friends what you just did, as well as increasing the time available to fix. But short periods, in the few minutes range, are more likely than hours, because part of the power of a wiki is that there are many eyes, including anonymous eyes, who can read and correct vandalism.
Improvements in tools to highlight edits which show indications of being possible vandalism are also likely. Those should increase the speed and accuracy of human review of edits.
The Wikipedia community is easily capable of doing it, given the presence of a sufficient number of people with the required vision to see that local encyclopedic content is of tremendous value. At present, those without that vision tend to dominate. If you want geo-coding and such, there are people there with an interest but there's still some development work needed in the MediaWiki software to support it.
Once that's done, and cell phone companies integrate it, we could all have an encyclopedia making available information about most things near to us.
Anthony was referring to the objection a sufficient number of authors have to the inclusion of articles about individual hotels, bars, pubs, cafes, elementary schools, malls, shops, streets and bus stops. In general those who object have successfully prevented the inclusion of most examples of such things in the encyclopedia, barring it from being an effective reference work at the fine-grained level contemplated here. That is: you will be able to find out about Burger King but it won't have information about the one on the next block, so it won't be able to tell you about it as you approach or identify the local eating places and provide their history and that of any chains they belong to.
Re:Excellent RAID reference: not (on Basics of RAID, Score: 2, Informative)
Consider:
RAID 10 disadvantage: "All drives must move in parallel to proper track lowering sustained performance". In fact each drive can seek independently for reads and only pairs must seek together for writes.
RAID 1 advantage: "Transfer rate per block is equal to that of a single disk"
RAID 5 disadvantage: "Individual block data transfer rate same as single disk"
Would be nice if it was consistent about whether that's good or bad.
RAID5: "Highest Read data transaction rate" except for RAID 10, of course, where you've less chance of being bottlenecked because there are two sources for each stripe.
RAID5: "Medium Write data transaction rate": actually the lowest of all except RAID 50, because every write means calculating parity and writing it to a second drive.
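The parity cost is easy to see in a sketch. This assumes the usual XOR parity and the read-modify-write path for a single-block update; the block contents are made-up bytes and the function names are mine.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New parity for a one-block update: P' = P xor D_old xor D_new.

    The array has to read the old data and old parity first, then write
    the new data and new parity: two reads plus two writes per update.
    """
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)


if __name__ == "__main__":
    d0, d1, d2 = b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"
    parity = xor_blocks(xor_blocks(d0, d1), d2)

    new_d1 = b"\xff\x00"
    new_parity = raid5_small_write(d1, parity, new_d1)

    # Same result as recomputing parity across the whole stripe.
    assert new_parity == xor_blocks(xor_blocks(d0, new_d1), d2)
```

A mirrored pair just writes the block twice with no reads first, which is why RAID 10 comes out ahead for write-heavy database work.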
As the operator of the access point said, he made a deliberate choice, being aware of his options, to leave the access point configured so anyone in the area could use it and the services available through it. Also, so that they _would_ use it if it happened to be the strongest signal their computer found when auto-selecting the AP to use.
It could have been one of his own neighbors, in the neighbor's own home, if his AP happened to have a better signal than the AP the neighbor had set up for their internet connection. In which case I suppose his neighbor might be having _him_ arrested for intercepting the communications the neighbor was trying to make through his own access point?
No intent necessary here, it's just how this particular technology works when you have it configured to:
1. Reply to the pings asking if anyone out there is offering service, identifying yourself as being available. An optional feature you can turn off if you don't want to advertise your service as being available.
2. Offer wireless connectivity to anyone who stops by with a wireless card configured as they often are to automatically connect to the strongest signal around. Easy to turn off by enabling WEP.
3. Grant an IP address to anyone when requested. Easy to turn off by turning off the DHCP server.
4. Presumably not use a network name/SSID indicating that it's supposed to be private with something like the word "private" in it.
5. Offer whatever services are then available for onward connectivity. Which might or might not be contrary to the agreement made with the upstream provider/ISP, but the person who has at this point had equipment automatically request service three different times and been told yes each time has no way to know that.
The gentleman's AP was apparently configured to offer the services it offered to anyone passing by. That's a choice he made. One he knew about and could easily, in many ways, have made differently with no more than a second or two of work, like changing the SSID to include the word private.
Now, the intent question in this one is a different matter. The guy in the SUV apparently concealed his access, suggesting that he believed his access was unauthorized. And that makes all of the foregoing irrelevant, because it establishes an intent to use it without authorization. Had he said what he was doing when first approached, if first approached, and not tried to conceal his activity, that would be completely different and would lack that sign of intent to have unauthorized service.
Not for the list but for checking every address you want to send to against the list. You don't get the list because that would tell you a list of addresses of children. Since you generally don't know where an address is located that means you have to run every address in a worldwide mailing through it or break the law.
The problem isn't filtering it at source but preventing a modified peer client from inserting bad things which it then sends on to end users. Hard to prevent that because the attacker has control of everything which is being sent from the peer.
Wikipedia is useful enough that we'd probably end up with the major ISPs hosting a Coral-type cache on their own network, if it was a plug it in and forget about it arrangement.
15 cents per thousand emails, assuming 50 states do it with the charge given in the Michigan law, sounds rather expensive for just checking database records. Cheap enough way to get contact points of children, though.
Yahoo was first: about a year before the Google thing, Yahoo arranged some content linking. Then ON THE SAME DAY both Google and Yahoo agreed to provide hardware. The Google news leaked, making it appear as though Google was first when it was actually as close to simultaneous as these things can be. Each is being accepted and used in the order which works out most conveniently for Wikipedia.
Both Yahoo and Google deserve approximately equal kudos for being helpful to the projects. Thanks!
Averaging 60-70 megabits per second over a whole month. Peaks at 320 megabits per second in extreme cases. Typical daily peaks in the 120 megabit per second range. 6 months ago it was more than 200 million database queries per day and it's probably several times that today.
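For a sense of scale, those rates convert like this. Back-of-envelope only: it assumes a 30-day month, decimal units (1 TB = 10^12 bytes) and 65 megabits per second as the midpoint of the 60-70 range.

```python
SECONDS_PER_DAY = 86_400


def monthly_transfer_tb(avg_mbit_per_s: float, days: int = 30) -> float:
    """Total data moved in a month at a given average rate, in decimal terabytes."""
    bytes_per_second = avg_mbit_per_s * 1_000_000 / 8
    return bytes_per_second * SECONDS_PER_DAY * days / 1e12


def queries_per_second(queries_per_day: float) -> float:
    """Average query rate implied by a daily total."""
    return queries_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(round(monthly_transfer_tb(65), 2))       # 21.06
    print(round(queries_per_second(200_000_000)))  # 2315
```

So roughly 21 TB a month, and the six-month-old query figure averages out to around 2,300 database queries per second.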
I'm wondering about setting up a network of boxes running the Coral software. Those have built in fault tolerance so it wouldn't take lots of admin work and would allow accepting many small bandwidth offers, in countries with comparatively low traffic. Makes most content even closer to the end users and spreads the bandwidth load around. Nothing actually happening on this front yet, though.
A very large number of places with full database servers and page builders, like this Yahoo announcement, would have too much admin overhead - 3-6 of those places is about right.
P2P is a security problem. People can always modify P2P programs to add nasty content and Wikipedia has already seen people trying to upload that and has filters in place to catch and block some things.
The donor has a data center and people to look after the servers there. I assume that they looked carefully and concluded that it was an excellent site from which to serve the region for their own operations, and getting the benefit of their operational experience is a good thing.
Wikipedia doesn't need that. It needs more - those aren't enough to handle the full load.:) They should be enough for the Asia-Pacific region for a few months at least. Wikipedia growth is still limited by performance when it comes to viewing pages not in cache and editing (adding and changing content).
Because the technical team at Wikipedia includes the developers and we know that there are sure to be problems as it is introduced to full service. Anything from outright bugs to database queries with unacceptable load properties. It'll probably be released for a general audience in four to eight weeks, once it's been very thoroughly tested at its biggest user site.
That would be an over-reaction. The employee apparently had no ill intent and no lasting harm, or even significant harm, was done.