Slashdot Mirror


User: Jamesday

Jamesday's activity in the archive.

Stories
0
Comments
325

Comments · 325

  1. Re:permastor on Best Way to Back Up Photos and Video? · · Score: 1

    Interesting products, particularly the OfficeStor with VPN and remote mirroring support. Makes me wonder whether I should suggest a set of these for Wikipedia. Sizes are potentially a bit small (but we currently need only about 400GB). I'll think more about it.

    Some questions you might consider answering on the site:

    How many hard drives and which capacity individual drives are used for each available size?

    What RAID level/type/whatever is used within the system?

    Any compression of data over the VPN links? During normal copying? During copying from one OfficeStor to another? Thinking here of really copying 400-600GB datasets over the net, which is significant even for a place routinely supplying tens of terabytes a month.

    Can hard drives be user-replaced as capacities grow (thinking that several Wikipedia root admins might have one of these boxes).

    What's the form factor? Any rackmount option if we were to put one or two in a few colocation sites?

    An explicit statement about what's different between HomeStor and OfficeStor (seems to be VPN and remote copying/backup between systems) would be helpful.

    Knowing which tools are used internally, assuming you're using standard software components, would be nice.

    Knowing shipping weights, even approximate, would be nice. FedEx is faster than a home broadband connection and maybe even than 340 megabits/s internet connections between colos.
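    The courier-versus-network comparison above comes down to simple arithmetic. A rough sketch follows; the 1 Mbit/s home upstream rate and the `efficiency` parameter are assumptions for illustration, not figures from the comment, and decimal units are used throughout.

```python
def transfer_hours(size_gb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move size_gb (decimal gigabytes) over a link_mbps link.

    efficiency is the assumed fraction of the nominal rate that is sustained.
    """
    megabits = size_gb * 8 * 1000  # 1 GB = 8000 megabits in decimal units
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


# Dataset sizes from the comment; the home upstream rate is an assumed value.
links = (("home broadband, assumed 1 Mbit/s up", 1.0), ("340 Mbit/s colo link", 340.0))
for size_gb in (400, 600):
    for label, mbps in links:
        print(f"{size_gb} GB over {label}: {transfer_hours(size_gb, mbps):.1f} hours")
```

    At the full nominal rate the colo link finishes in a few hours, so whether a shipped box wins there depends on how much of that rate is actually sustained between sites.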

  2. Re: In-house punishments please! on Felony Charges For H.S. Hacking · · Score: 1
    The state constitution and federal law almost certainly allow you to establish rules that it would be a felony to violate.

    "You may use this computer for purposes A, B and C. All other uses are prohibited and not authorized."

    Anyone knowingly exceeding the access you have permitted to that system is then potentially subject to felony computer trespass charges.

  3. a heck of a long time is how long? on BBC Launches Linux Powered Weather Format · · Score: 1
    Real time to recover from corruption of a 180GB database in a top 100 web site: about two hours. Time to switch from a master database server to a slave as the new master so people don't notice: seconds to minutes. How long is a heck of a long time, how much bigger than that is the BBC database and if they are using replication, as the story says, are they doing the obvious and using replication for a standby server or two?

    Time given is for a complete copy of the Wikipedia database from one replicating slave to another. Switching master to continue taking data takes seconds to minutes, depending on degree of automation.

  4. Please refrain from misrepresentations on MPAA CEO Dan Glickman on the Broadcast Flag · · Score: 2, Interesting

    Mr. Glickman, with respect, please refrain from misrepresenting the benefit and effect of the broadcast flag.

    "The challenges lie in protecting that content so that it is not stolen and resold or rebroadcast by video pirates. ... Broadcast flag technology protects the content of our shows from redistribution over the Internet."

    As we know, broadcast television shows movies after cinemas, pay per view, and video tape/DVD sell-through. Those present four opportunities to make and distribute copies of the works, two of which provide a digital picture stream identical to the broadcast stream. There is also the widely used pre-cinema opportunity, which results in distribution before first cinema showing even in the US. Please explain why you believe that those you seek to inhibit will choose to wait for broadcast television instead of doing what they currently do and using the earlier opportunities.

    For two of those earlier opportunities, cable and video tape, the studios or broadcasters have previously gone to the Supreme Court arguing that they would destroy their business. Please identify the businesses they destroyed after those cases were lost, since it appears that both are actually major revenue streams, and explain why you believe your arguments in this instance are of greater accuracy in predicting the future benefits to your members' businesses than those your predecessors made with their predictions of doom.

    "The sole purpose and effect of broadcast flag is to assure a continued supply of high-value programming to off-air"

    I have rejected the TiVo technology as insufficiently flexible. It limits me to a narrow range of playback devices and restricts my ability to do things like editing to remove offensive content before playing to others, such as children. Compatibility between different implementations by different vendors in fights to achieve market dominance is also a concern. Capturing a video stream and producing more tools, provided secrecy and restrictions on protocols is not required, is a very promising market. The controls of the broadcast flag regime appear to kill this market for intelligent filtering and editing tools developed by a very wide range of small producers, often single individuals with limited funds, like the college student who developed the well known Virtual Dub video editing program.

    Today I can time shift a video broadcast from home to my computer and then to an airplane or hotel room on a business or other trip, using a single portable computer for both this and the business activities. It appears that the restrictions of the broadcast flag will block this existing very useful capability or require the entirely impractical approach of taking the main family recording device with me.

    "The basic outline of the broadcast flag was approved in principle by a large and diverse group of consumer electronics, computer technology and video content companies. This consensus was reached after a thorough process involving all affected parties."

    That list of parties misses the most broadly affected group: end users of the video at home watching it on their home digital televisions with the great potential of ubiquitous home digital networks and home recording. It also appears to lack broadcast television stations. Perhaps consultation with the most affected parties would be of use - the ones who dislike this because they know it will fundamentally limit their opportunities for uninfringing use of the content?

    Today, the threat of the broadcast flag is one of the factors which discourages me from purchasing or using digital television equipment. The sooner that threat is gone, the sooner it is that I'm likely to be interested in purchasing something which will no longer threaten to dramatically limit my legitimate uses of the content being broadcast. Congress acting today to prohibit the use of the broadcast flag or similar systems would be of significant help in encouraging my adoption of digital televisio

  5. Re:AMEN.... on Publishers Protest Google Library Project · · Score: 4, Informative

    There should indeed be choice by the author. These academic publications generally prohibit the author from making any other choice than assigning copyright to them, effectively tying the spread of knowledge to the financial interests of the publication.

  6. Re:The copyleft JVM should have fixed its issues on FSF, OpenOffice.org Team Reach Agreement on Java · · Score: 1

    Dodging any Sun-only libraries makes sense. Standards are important. Far better to encourage that, and to work on improving the capabilities of the non-Sun JVMs, than to encourage forking, which that work can make unnecessary.

  7. Re:Ignorance on FSF, OpenOffice.org Team Reach Agreement on Java · · Score: 1
    Yes, forking is a critically important tool but it does take care over deciding when to do it and when to pursue the available more productive courses. That's why the initial reactions were inappropriate - there was a more productive course available than forking, one which will be of far greater benefit to the community: working on free and copyleft JVMs so all projects can use them when java is the most appropriate tool for the problem at hand.

    Personally, I don't think that a fork over this would have made any sense at all, because it would have been, at best, a symptom of impatience: an unwillingness to accept an intermediate state while the other JVMs were developed to the point that they could run the code. Yes, it's frustrating not to have everything now - but that's still not a reason for a major fork.

  8. Re:Ignorance on FSF, OpenOffice.org Team Reach Agreement on Java · · Score: 1
    The complaint wasn't irrational. Suggesting throwing away the resources of the community by forking the project instead of trying to recruit more JVM developers was.

    Wanting copyleft is fine, but free software is supposed to be more efficient and such forks both split the community and waste its resources. Really bad thing for one of the leaders of the community to be suggesting. Let's at least try to work together efficiently!

  9. The copyleft JVM should have fixed its issues on FSF, OpenOffice.org Team Reach Agreement on Java · · Score: 5, Insightful

    The FSF was being irrational. There was a JVM licensed with an FSF license which wasn't compatible with the latest Java standards. Instead of advocating fixing the broken code, Stallman was apparently advocating not using anything which didn't work with the broken code, to the point of forking a major project to avoid fixing that broken code. That's hardly an example of good programming ethics. Fix the bugs, don't complain about others not working around them.

  10. Re:I know it's necessary, but... on Yahoo! Search Providing Support to Wikipedia · · Score: 1
    If those scenarios happen we'll do what we are doing anyway, to balance things: ask for donations from the public. Last time we cut the fund drive early after exceeding our $75,000 target by some $15,000. Expect us to consciously and deliberately work to ensure that NO single party can seriously harm the sites and to recognise that donations from the general public are a key part of that prevention picture.

    At the moment the biggest single party vulnerability is the Wikimedia Foundation, since it owns all of the hardware. I'd like to see some diversity in that, so it's impossible to lose all of the hardware from a single court decision. Not that I expect it to suddenly become evil or vanish but it's good to diversify, in part to make the Foundation a less attractive legal target (so taking it out won't take the works out).

  11. Re:good news on Yahoo! Search Providing Support to Wikipedia · · Score: 1
    One little correction:

    Wikimedia isn't licensing or releasing under any license. The authors are licensing and releasing to Wikimedia (and everyone else).

    The difference doesn't seem that great until you consider who can change the license (not Wikimedia), who can send takedown notices (not Wikimedia) and who can republish their work under non-GFDL licenses elsewhere (not Wikimedia). As one of the authors I've done things like granting other licenses to other people for my work, something Wikimedia just can't do.

    Remembering the distinction between the authors and Wikimedia becomes important if Wikimedia ever suffers a major legal defeat which costs it its assets (it's only a licensee, so it can't lose the work!) or somehow became taken over by bad actors and tried to abuse the GFDL by over-zealous interpretation of it to try to create a monopoly by financial and legal pressure on other licensees. At present it can't do that because it has no legal rights to use to apply that legal pressure.

    Of course, I don't expect these things to happen, but they are significant factors to be protected against as we think more than a few years ahead, past the point at which we know who is in control of Wikimedia and the computers which host the works. All part of long-term contingency preparation.

  12. Re: How about from two? on Yahoo! Search Providing Support to Wikipedia · · Score: 1

    It isn't currently on their servers. When some requests are being served on the equipment donated, they won't be getting first crack at new versions or indexing because of the server location. They might, if located at the same data center, get faster ping times. Neither Google, Yahoo nor any other sponsor will be saying that we have to withhold something so they get first look.

  13. Re:Can someone explain the MySQL license? on Open Source Licensing - Cuts Both Ways? · · Score: 1

    1) No need to pay. Up to you to decide whether you want support or not.

    2)(a) If it's FLOSS no license needed. Though you or your customers might want one anyway, for the support or the Certified Partner package or marketing things. Up to you and them.

    2)(b) If it's not FLOSS, see section C7 of the partner FAQ and the Certified Technology Partner program; you'll find it's not too painful. At least, I hope that $595 isn't too painful for any ongoing business - you'd be bankrupt if it was! :)

  14. Re:foreign keys? try write-ahead logging on 'Most Important Ever' MySQL Reaches Beta · · Score: 1
    FWIW. I disagree with the Microsoft recommendation to turn off torn page detection when you have a battery backed up hardware caching controller. Both Wikipedia and Livejournal had those controllers and experienced what Microsoft calls torn pages. Better to know than not. To give some idea, since the problems at Livejournal and Wikipedia:

    One controller vendor has released a new firmware update, required for all users of battery backed up cache. Anyone with a 3-Ware SATA controller and battery backup should get their latest firmware. Wikipedia has two of these controllers.

    A SCSI controller vendor that didn't turn off hard drive write caching (!!), and didn't provide any way to get through the controller to do it, has released a utility to let you turn off the hard drive write cache on some models, with more to come. Not turning off drive caches made the battery almost completely useless. Wikipedia has three of these controllers.

    A phone call to a drive vendor pre-sales support said that their 400GB SATA drives would still cache writes if told not to cache writes. When I pointed out that this effectively guaranteed data loss the support person checked then transferred me to their labs, which said that yes, the drive really would respect the cache off setting, so it actually is safe. Glad they got it right in the end (I hope - but I'll still be testing!).

    Not related: Apple was found to ignore fsync flushes, caching instead. MySQL now has a workaround for that. And some FC3 versions apparently don't properly respect fsyncs either.

    MySQL, recognising the sad reality of the OS and hardware layers, is doing something more about it (beyond write-ahead logging, doublewrite and page checksums), judging by blogs and their public responses to my public comments on this.

    I'm distinctly unimpressed by the layers below the database and their respect for what's needed to have reliable storage. If you're serious about reliability, don't even think of trusting the OS, controller and drive layers. Test, with real power disconnects and active disk writes going on at the time.

    If you are worried about power problems and are using racks, here's one easy tip: power distribution unit with meter or comparable but with remote on-off. It's quite irritating to lose power because of an overloaded circuit. Wikipedia saw it happen once at rack level when more machines were being added and possibly a second time related to a circuit breaker trip.
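    The torn pages and page checksums mentioned above can be illustrated with a small sketch. The 64-byte page and trailing CRC32 layout are assumptions chosen for illustration; this is not the on-disk format of MySQL, SQL Server or any other real database.

```python
import zlib

PAYLOAD_SIZE = 60  # assumed toy size; real database pages are much larger


def seal_page(payload: bytes) -> bytes:
    """Pad the payload and append a CRC32 so a partial write can be detected later."""
    body = payload.ljust(PAYLOAD_SIZE, b".")
    return body + zlib.crc32(body).to_bytes(4, "big")


def page_is_intact(page: bytes) -> bool:
    """Recompute the checksum over the body and compare it with the stored one."""
    body, stored = page[:-4], page[-4:]
    return zlib.crc32(body).to_bytes(4, "big") == stored


old_page = seal_page(b"balance=100")
new_page = seal_page(b"balance=250")

# Simulate power loss half way through overwriting old_page with new_page:
# the first half of the page is new, the second half (with the checksum) is old.
torn_page = new_page[:32] + old_page[32:]

for name, page in (("old", old_page), ("new", new_page), ("torn", torn_page)):
    print(name, "intact" if page_is_intact(page) else "TORN")
```

    A check like this only reports the damage; repairing it still needs a good copy from somewhere else, such as a log or a second copy of the page.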

  15. Re:foreign keys? try write-ahead logging on 'Most Important Ever' MySQL Reaches Beta · · Score: 2, Informative

    The book chapter you quoted from isn't current, though it's still very useful as an overview. Recent versions handle this, both logging consistently and rolling back consistently, including to replicating slaves. See the manual section on the binary log.

  16. Re:being a paying customer... on 'Most Important Ever' MySQL Reaches Beta · · Score: 1
    Google uses Oracle as well. They are currently hiring Oracle DBAs. If not there now, they have also been hiring MySQL DBAs. Doesn't mean they use either for their main search tool, of course.

    Like Wikipedia and LiveJournal, Slashdot uses MySQL for its main database servers. The five main Wikipedia database servers see a total of some 200 million queries per day.

  17. Up to Google to say that on 'Most Important Ever' MySQL Reaches Beta · · Score: 1

    I don't track Google's activities that much.

  18. WAL on 'Most Important Ever' MySQL Reaches Beta · · Score: 1

    I used roughly one seek/rotation per update and some fudge factor, for the fsyncs. Assumed update=transaction, which isn't necessarily so, just worst case.

    The InnoDB engine in MySQL uses the write-ahead logging approach. Writes to its log and does an fsync to flush it but caches the database page updates in RAM. Flushes those as time allows, at a checkpoint or when the number of dirty (modified) pages exceeds the user-configured threshold.

    Of course, write caching disk controllers and more fancy storage systems change the numbers substantially. Was trying to give those who haven't looked at this sort of thing some idea of what was being discussed. I'm interested to know what it's currently running on.
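    The write-ahead logging pattern described above (fsync the log on every update, keep modified pages in RAM, flush them once too many are dirty) can be sketched as a toy. The `ToyWal` class, the file names and the JSON record format are all hypothetical, and nothing here reflects how InnoDB is actually implemented.

```python
import json
import os
import tempfile


def recover(directory):
    """Rebuild page contents from the last flushed pages plus a replay of the log."""
    pages = {}
    data_path = os.path.join(directory, "pages.json")
    log_path = os.path.join(directory, "redo.log")
    if os.path.exists(data_path):
        with open(data_path, encoding="utf-8") as f:
            pages = json.load(f)
    if os.path.exists(log_path):
        with open(log_path, encoding="utf-8") as f:
            for line in f:  # replaying is idempotent: the last write to a page wins
                record = json.loads(line)
                pages[record["page"]] = record["value"]
    return pages


class ToyWal:
    def __init__(self, directory, dirty_limit=3):
        self.data_path = os.path.join(directory, "pages.json")
        self.dirty_limit = dirty_limit  # stand-in for a user-configured threshold
        self.pages = recover(directory)  # page cache held in RAM
        self.dirty = set()
        self.log = open(os.path.join(directory, "redo.log"), "a", encoding="utf-8")

    def update(self, page_id, value):
        # 1. Append to the log and force it to disk before acknowledging the update.
        self.log.write(json.dumps({"page": page_id, "value": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. Change the page only in RAM and remember that it is dirty.
        self.pages[page_id] = value
        self.dirty.add(page_id)
        # 3. Flush pages only when too many are dirty.
        if len(self.dirty) > self.dirty_limit:
            self.checkpoint()

    def checkpoint(self):
        tmp_path = self.data_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.pages, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.data_path)
        self.dirty.clear()

    def close(self):
        self.log.close()


with tempfile.TemporaryDirectory() as directory:
    wal = ToyWal(directory, dirty_limit=2)
    for page_id, value in (("a", 1), ("b", 2), ("a", 3)):
        wal.update(page_id, value)
    # No checkpoint has run yet, but the log alone is enough to rebuild the pages.
    print(recover(directory))
    wal.close()
```

    The toy never truncates its log and does not handle a partially written final log record; a real engine has to deal with both.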

  19. Re:Opinions... on 'Most Important Ever' MySQL Reaches Beta · · Score: 1
    20,000 inserts per second is the interesting number there. How are you doing that now on the hardware side? How long is that insert rate sustained? How much data volume in those 20,000 inserts per second?

    5.0 has a more greedy query optimizer and better index use so it may help. Or not.

    Have a chat with the MySQL performance team before giving it another try. Sounds like an interesting project.

    For anyone who doesn't know why those inserts are interesting, a single 15,000 RPM hard drive can do about 300 seeks per second (fudging a bit because writes are slower than reads), so this is talking about a peak insert rate which, if sent directly to the drives, would require every seek available from 70 15,000 RPM hard drives. Fortunately, there's RAM page caching and logs, the logs can be written sequentially and write caching disk controllers exist. But it's still interesting. The reads are far easier to handle. For scale, Wikipedia runs at perhaps 30 updates per second average, with about 2500-3500 mostly simple queries per second.
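    The drive-count estimate above is a single division. A minimal sketch, using the comment's figure of about 300 seeks per second per 15,000 RPM drive and the same worst-case assumption of one seek per insert:

```python
import math


def drives_needed(ops_per_second: float, seeks_per_drive: float = 300) -> int:
    """Drives required if every operation costs one seek and nothing is cached."""
    return math.ceil(ops_per_second / seeks_per_drive)


print(drives_needed(20_000))  # the peak insert rate discussed above
print(drives_needed(30))      # Wikipedia's average update rate
```

    The first result lands just under the "70 drives" quoted above, which was a rounded figure.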

  20. Some Wikipedia corrections on 'Most Important Ever' MySQL Reaches Beta · · Score: 1

    We're actually only preparing for 400GB of disk space on our current generation of Wikimedia database servers. Once all of our compression work is finished we'll be down to perhaps 100GB in the database. Without compression we'd be at about 300GB already. Of course, people are just going to keep editing, so we'll use the extra space and compression power soon enough. Next generation will be after a terabyte of usable RAID 10 space.

  21. Re:being a paying customer... on 'Most Important Ever' MySQL Reaches Beta · · Score: 1
    You might want to visit the MySQL User's Conference and ask one of the keynote speakers for that conference how Google is using MySQL.

    Also see this news story: "And MySQL's popularity seems to be growing. Yahoo and Google use the software to run many parts of their Web sites"

  22. Insignificant. See Bender v. West Publishing. on SCO Website Using Groklaw's Content · · Score: 2, Informative
    The writing involved is insignificant. It's all been thoroughly explored in Feist v. Rural (the phone directory case). The article also covers Matthew Bender v. West Publishing Co. (a legal publisher, denied copyright on its numbering and organising schemes for public domain legal writing). Also Assessment Technologies v. WIREdata, which ruled that a copyright holder in a compilation of public domain data cannot use that copyright to prevent others from using the underlying public domain data, but may only restrict the specific format of the compilation, if that format is itself sufficiently creative. A scan won't be sufficiently creative.

    SCO is simply a Bender and is fully entitled to do as it has done.

    Forget a takedown notice as well. The documents are public domain and there are penalties (including legal fees) for filing a false DMCA takedown notice.

    It's amusing, but that's all it is.

    Do read the WIREdata (PDF) decision. It's an excellent and readable decision giving an overview of the principles and key cases involved.

  23. Untrustworthy Computing Platform on BBC on DRM and Trusted Computing · · Score: 1
    The key point for consumers to remember is that the Trusted Computing Platform makes their computers an Untrustworthy Computing Platform.

    It does so by allowing vendors to take back things you have already purchased (like the TiVo and Apple examples) and by making it harder to keep the works you purchase as you change computers every three to five years and find incompatibilities or changes in operating system or application vendors locking you out of your own property.

  24. Re:Delayed edit visibility on Google Goes to Answers.com · · Score: 1

    Right. Or at least how to think more than one move ahead in the chess game.:)

  25. Re:Thanks for your report. Your image removed. on Google Goes to Answers.com · · Score: 1

    It's my view that all images should have proper photo credits. In this case, crediting you as part of the image caption.

    Many US people appear to have difficulty understanding moral rights questions like the right to be associated with your work, perhaps because moral rights are quite limited in the US, particularly for text.

    Many at en.wikipedia.org accept links to Wikipedia for practical reasons: it's much more convenient than making all reusers make available the full history of every version of every item. Still, I do think that that is what a prudent mirror should do, in part because an image or article could be deleted from en.wikipedia.org at any time, removing all publicly visible evidence that a license was ever granted.

    Yes, the effect of the practice is to deny credit to the creators for their work and instead give it to Wikipedia and eventually, I suppose, to the Wikimedia Foundation instead of the real creators.

    This and anything else which serves to have wikipedia.org or the Wikimedia Foundation take credit for the work of others is one of the least pleasant aspects of producing the work. It's still worth doing, at least at present, but it is discouraging at a place which is, in many other ways, trying to do the right thing.

    In my opinion this is also a contributor to the perception some have that the work lacks authority, for it's quite hard to track the authorship information and realise that yes, you do actually consider that a particular author has authority and can be trusted.

    Version control systems for programmers may have a feature called "blame", which tracks the actual author of each piece of text in the work. "Blame" because one common use is to work out who is to blame for (who wrote) a particular bug. :-) That's not currently available in the Mediawiki software. It is one enhancement I'd like to see, for it would significantly simplify the matter of giving proper credit to the authors.

    Do note, however, that these views are mine and do not necessarily reflect the views of en.wikipedia.org, the Wikimedia Foundation or anyone else.
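    The "blame" idea mentioned above can be sketched with Python's standard `difflib`. This is a toy under stated assumptions: the `blame` function and the sample revisions are hypothetical, and it is not how Mediawiki or any particular version control system implements the feature.

```python
import difflib


def blame(revisions):
    """Attribute each line of the latest revision to the author who introduced it.

    revisions is a list of (author, lines) pairs, oldest first.
    Returns a list of (author, line) pairs for the final revision.
    """
    attributed = []
    for author, lines in revisions:
        previous = [line for _, line in attributed]
        matcher = difflib.SequenceMatcher(a=previous, b=lines, autojunk=False)
        updated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines keep their original author.
                updated.extend(attributed[i1:i2])
            elif tag in ("replace", "insert"):
                # New or rewritten lines are credited to this revision's author.
                updated.extend((author, line) for line in lines[j1:j2])
            # Deleted lines simply drop out of the result.
        attributed = updated
    return attributed


history = [
    ("Alice", ["First sentence.", "Second sentence."]),
    ("Bob", ["First sentence.", "Second sentence.", "Third sentence."]),
    ("Carol", ["First sentence, reworded.", "Second sentence.", "Third sentence."]),
]
for author, line in blame(history):
    print(f"{author:6} {line}")
```

    Working at line granularity means a small rewording moves the whole line's credit to the latest editor; tracking smaller units would give fairer credit at the cost of more bookkeeping.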