Slashdot Mirror


User: crschmidt

crschmidt's activity in the archive.

Stories
0
Comments
35

Comments · 35

  1. Re:This is an efficiency issue on Google Permits India To Download YouTube Content Overnight (thestack.com) · · Score: 2
    Hi,

    Google's GGC program -- our in-ISP caching program, more info at https://peering.google.com/#/o... -- is targeted at ISPs with > 1Gbps of cachable end-user traffic. This is simply a matter of practicality: there are tens of thousands of ISPs in the world, and in cases with < 1Gbps of traffic, there simply isn't enough value to deploy in an ISP network. ("Our edge node offering was designed for end-user networks with greater than 1Gbps of peak Google traffic. Google encourages networks with less than 1Gbps peak traffic to Google to join a local Internet Exchange or peer directly with us." -- Google Peering FAQ). If you are an ISP smaller than that, you're right that you'll have some difficulty getting access to in-ISP caching.

    If you have more usage than that, and have not been able to get a response to an expressed interest via the GGC page, I'm happy to take your information and try and see why that is. I'll admit that I know less about South American GGC deployments than I do about other parts of the world, simply because I tend to work less often with folks who work on that part of the problem, so it's possible that there's more to it than I'm aware of. You can email me at crschmidt@google.com; if you do, please include your ASN number.

    I think that there is a known need for the ability to scale caches down to smaller sizes -- e.g. to make it cost effective to deliver more caches to smaller ISPs. I don't have anything to say, but I will say that I think that we are aware that this is a gap in our coverage, and we don't like it any more than you do.

    As for making it "difficult to cache" -- I'm not sure exactly what you mean, but generally speaking, there's two things I can think of:
    • We use SSL for video streams. Protecting users is the most important thing we can do at Google, and without SSL, bad actors were able to use unencrypted YouTube streams as a means of invading the privacy of our users (U.S. firm helped the spyware industry build a potent digital weapon for sale overseas). Obviously this isn't the only reason to go SSL, but non-SSL communications simply aren't an option in the modern internet anymore.
    • We use signed URLs with relatively short expiry. This is largely to protect the CDN from abuse.

    Neither of these is *targeted* at cache-busting, but both have that effect; with the GGC program in place, we don't make it a primary goal to make the raw streams cachable, because we simply don't wish to have ISPs do caching that way, and instead prefer GGCs, which give better user performance where we can use them.
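
    To illustrate why short-lived signed URLs have a cache-busting side effect, here is a minimal sketch. It is not Google's actual scheme: the key, the parameter names (`expires`, `sig`), the HMAC-SHA256 construction, and the example host are all assumptions made for illustration.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical shared secret; a real CDN would manage and rotate keys.
SECRET_KEY = b"demo-secret"


def _signature(base_url, expires):
    """HMAC-SHA256 over the URL and its expiry time (illustrative layout)."""
    payload = f"{base_url}?expires={expires}".encode()
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def sign_url(base_url, ttl_seconds, now=None):
    """Return base_url with an expiry timestamp and a signature appended."""
    now = int(time.time()) if now is None else int(now)
    expires = now + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(base_url, expires)})
    return f"{base_url}?{query}"


def verify_url(url, now=None):
    """Accept the URL only if the signature matches and it has not expired."""
    now = int(time.time()) if now is None else int(now)
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["sig"][0]
    except (KeyError, ValueError):
        return False
    base_url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
    expected = _signature(base_url, expires)
    return hmac.compare_digest(signature, expected) and now <= expires
```

    Two URLs issued for the same video at different times differ in both `expires` and `sig`, so a third-party cache keyed on the full URL would treat them as unrelated objects even though the bytes behind them are identical.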

    In any case, if you are having trouble finding someone to talk to, please feel free to let me know, and I'll see what I can do, if anything.

    -- Christopher Schmidt, YouTube Quality of Experience

  2. Re:"guided" disassembly on Kids Review the OLPC · · Score: 1

    One thing the video doesn't show is that after taking the laptop apart, and putting it back together -- the keyboard didn't work.

    So the adult in question said "Fix it" and walked away: and came back 15 minutes later to the kids playing with the fixed laptop.

    I'm not sure why that wasn't demonstrated as the more important part of the video here -- I only know it because SJ told me about it while I was taking the pictures in the Flickr set.

  3. Re:Interesting concept on Marc Andreessen's Social Platform: Ning · · Score: 0, Offtopic

    I know at *least* 2 females on Slashdot.

    But I considered just making the choices "male" or "CowboyNeal". Since those two females wouldn't want to date anyone from here anyway.

  4. Re:"Redirection limit exceeded" on Marc Andreessen's Social Platform: Ning · · Score: 2, Informative

    Turn on Cookies.

  5. Re:Slashdot dating on Marc Andreessen's Social Platform: Ning · · Score: 5, Interesting

    Create an account. Apply for beta developer status. Click "Clone This" button on dating.ning.com. Type in that title, add a few extra fields ("What programming languages do you know?" "Who is your ideal BOFH?")

    It's that easy.

    That's the power of cloning, and the primary force behind Ning.

    Want Proof? I just did it: SlashDot Dating.

  6. Re:bigger explination on LiveJournal Servers Go Down · · Score: 1

    I doubt that they'll move, but that's personal opinion only. They've already got two remote sysadmins in Seattle: Barring another incident like this, there's not really much that can't be done remotely, I don't think.

  7. Re:bigger explination on LiveJournal Servers Go Down · · Score: 1

    LiveJournal The Company is based in Portland (although I think the office address is still a PO Box in Beaverton), and will soon be moving to San Francisco. However, their data is indeed at the Seattle center.

  8. Re:bigger explination on LiveJournal Servers Go Down · · Score: 1

    My cluster knowledge has gone out of date since much of the data was recently switched to 64-bit hardware, but here's a guess, which is probably low: 8 user clusters, each with a "Master" and 2 slaves. Most machines running dual masters in case one fails. Every machine runs memcached in its spare RAM: 6 months ago, this amounted to 40 gig, but now it's way bigger, although I don't have estimates. About 60 diskless web slaves. 40 backend database servers. Approximately 300-400 MBps, although that's, again, a guess. I don't work there: these numbers are relatively well informed guesses, but may be off. Do not take my word as gospel: I am an uninformed user.

  9. Re:bigger explination on LiveJournal Servers Go Down · · Score: 1

    300 Mbps, 2000 dynamic page views per second, 300 journal updates a minute, probably 10 times that in comments posted.

    That's at a slow time, of course, but that's the numbers that I pull from on-site statistics and slides from Brad's various talks.

  10. Re:Disclaimer: I am Not an Electrical Engineer on LiveJournal Servers Go Down · · Score: 1

    Based on past experience, LiveJournal is very generous in compensating users when something on LiveJournal is broken. During a pretty widespread DDoS, an A block was blocked upstream for 24 hours - and all paid users received an extra three days for the trouble.

    I may be slightly misremembering details, but like I said, LiveJournal has always been generous in making sure its paid customers are happy.

    The free users, of course, are shit out of luck.

  11. Re:But how do I use this semantic data? on On Finding Semantic Web Documents · · Score: 1

    Check out http://crschmidt.net/semweb/ for info on some of the projects I've worked on which use the semantic web.

    The most interesting one, in my opinion, is lorebot. Lorebot sits in a channel, and associates identified users to their FOAF files. Once it does this, it links them to a human-readable HTML description of them, and, if possible, displays an image for them. Example output: online users, personal output.

    There's also things like the FOAF or DOAP a matic: both of which take RDF and spit out a machine readable description. The Firefox plugins on my semweb page let you see in the corner of your browser when you have that information available.

    There's more tools out there, but they don't tend to be as down to earth, because a lot of RDF data is in high-level stuff. The demonstrations are becoming a lot more usable though, and I expect that to continue over the next year.

  12. Re:It's not about the filename on On Finding Semantic Web Documents · · Score: 1

    Preferably application/rdf+xml. Anything else is not appropriate for RDF-serialized triples. text/xml and application/xml are both wrong for this kind of data.

    This will become more important as resources are represented in multiple ways, for tools to consume: they ask for a specific type, and fallback may fall the wrong way if people start telling their webservers that RDF is something it's not.
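
    As a rough sketch of what "asking for a specific type" looks like from the tool's side: the client sends an Accept header naming application/rdf+xml and then checks the Content-Type it gets back. The helper names and the example URL below are made up for illustration; only the media type itself comes from the comment above.

```python
from urllib.request import Request

RDF_XML = "application/rdf+xml"


def build_rdf_request(url):
    """Build (but do not send) a request that asks specifically for RDF/XML."""
    return Request(url, headers={"Accept": RDF_XML})


def is_rdf_xml(content_type_header):
    """True only if the server labelled the response as application/rdf+xml.

    Parameters such as "; charset=utf-8" are ignored; text/xml and
    application/xml are deliberately rejected.
    """
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type == RDF_XML
```

    A consumer written this way would skip a FOAF file served as text/xml, which is exactly the failure mode a misconfigured webserver causes.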

  13. LiveJournal and other weblogging services on On Finding Semantic Web Documents · · Score: 3, Informative

    Every user of a LiveJournal-based website running recent code has a FOAF file. Let's look at how many users that is:

    * LiveJournal.com: 5751567
    * GreatestJournal.com: 717406
    * DeadJournal.com: 474435
    * Weedweb.net: 22650
    * InsaneJournal.com: 12970
    * JournalFen.net: 7629
    * Plogs.net: 7086
    * journal.bad.lv: 4530

    (This list is most likely incomplete.)

    In addition to this, every Typepad user has an account: according to the 6A merger stories, that's another million users. Add in the RDF from all the Typepad RSS files, and that's another 1 million.

    All Wordpress blogs have a feed, located at /feed/rdf or /wp-rdf.php, which is in RDF. Movable Type comes preinstalled with an RSS 1.0 feed. Each of these has at least a couple thousand users.

    So, we've got, just as a guess, about 9 million RDF files out there in the blogging world alone. Throw in a hell of a lot of scientific data, and everything on RDFdata.org, and you start to get an idea that the world is a lot more Semantic Web enabled than you seem to think it is.
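
    For anyone who wants to check the arithmetic behind the "about 9 million" guess, here is the sum written out. The per-site counts are the ones listed above; treating each of the two Typepad figures as exactly one million is a simplification, and the Wordpress and Movable Type feeds ("at least a couple thousand" each) are left out.

```python
# User counts for LiveJournal-based sites, as listed in the comment.
USER_COUNTS = {
    "LiveJournal.com": 5751567,
    "GreatestJournal.com": 717406,
    "DeadJournal.com": 474435,
    "Weedweb.net": 22650,
    "InsaneJournal.com": 12970,
    "JournalFen.net": 7629,
    "Plogs.net": 7086,
    "journal.bad.lv": 4530,
}

# Rounded figures for Typepad, assumed to be exactly one million each.
TYPEPAD_FOAF_FILES = 1_000_000
TYPEPAD_RSS_FILES = 1_000_000

livejournal_family_total = sum(USER_COUNTS.values())
estimated_rdf_files = livejournal_family_total + TYPEPAD_FOAF_FILES + TYPEPAD_RSS_FILES

print(f"LiveJournal-based sites: {livejournal_family_total:,}")
print(f"Estimated RDF files:     {estimated_rdf_files:,}")
```

    The LiveJournal-based sites alone come to just under 7 million, and the two Typepad figures bring the total to just under 9 million.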

  14. Re:LiveJournal is more interesting than you think on LiveJournal Buyout Rumor · · Score: 1

    That data is old. The current is more along the lines of 3 times that.

    Slashdot's traffic, according to the maintainers of /. and LJ, is significantly less than LJ's. Not to mention the fact that Slashdot is dealing with a minuscule amount of data in comparison to LJ.

  15. Re:Oh no! on LiveJournal Buyout Rumor · · Score: 1, Funny

    Like you had a chance in the first place.

  16. Re:What Does 42 Mean for Privacy? on Tim Berners-Lee and the Semantic Web · · Score: 2, Informative

    There are several solutions to the problems you describe. I'll address the few I'm most comfortable with responding to - not because the others are unsolvable, simply because I don't want to provide inadequate information.

    All information on the web should be taken with, as they say, a grain of salt. Depending on what you are looking at, it has more or less value. For example, something on Wikipedia can probably be assumed to be relatively accurate, whereas something on Joe Schmo's website on Geocities will probably be considered to be less accurate in general. The semantic web allows for you to see who is saying something in a number of ways, and to verify this information:

    • URI Source - If the source of data about Chevy Trucks is at chevy.com/trucks.rdf, you'll probably have a pretty good reason to trust it.
    • dc:creator - a self-assigned name for the creator of the document
    • Most importantly, wot:assurance: a signature, using standard public/private key encryption, of a document, assuring that the signer indeed did create the information

    Each of these methods of determining where information is coming from has its own special place in assigning credence to the document in question. Thus, if a document signed by crschmidt@crschmidt.net says that the person "Christopher Schmidt" owns the email address crschmidt@crschmidt.net - it's probably safe to trust that person.

    Once the data is available on the web, it is easy to find other data: one of the basic terms is "seeAlso" - a way for providing other URLs to look for data at. Once the web starts, it is easy to link it, and to do so is to increase the data. You don't need something smart or intelligent - simply wander around, collect all the rdfs:seeAlso links, and download those - and continue from there. This process, known as "scuttering", is an easy way to start creating a relatively large data store.

    Using descriptions of when information is updated allows tools to understand when they should check back for more information. Similar to the way RSS feeds (which are a part of the Semantic Web) can inform tools that they will be updated in 2, 4, 6, 24 hours, general RDF documents can do the same thing - saying "check me again in a week" or more.

    There are currently tools for working with the semantic web in a small scale. Although this is nothing like the big dream - having almost everything described, so that computers can really understand the world around them - these tools do have their usefulness. I can now ask "What is the name of the person whose aim name is cr5chmidt", and be told the answer. Although it's not perfect - very little about the semantic web is perfect yet - it doesn't need to be. For more information, see my post on the bot I created to spider semweb data in my blog.

    As you said, it won't be easy. However, it is possible, and it seems to me more and more likely each day that working on these tools and increasing the amount of semantic data in every little way can help.

  17. Re:yes on Corporate Servers Spreading IE Virus [Updated] · · Score: 1

    Personally, I find that w3m just does a better job of rendering in general than links does, but I haven't used links a whole lot. Mostly, I use w3m for fetching headers or source, with dump_head or dump_source - quite useful for determining what server something is running on, and it's quick.

    I'm sure there's better ways to do it, but that's just the one I use.

  18. Re:yes on Corporate Servers Spreading IE Virus [Updated] · · Score: 1

    I've heard of those, I think. I've also heard of images, though, which is why I use w3m.

  19. Re:interesting... on Welcome to the 'Plogging' World · · Score: 1

    Speaking as ex-development manager for the site, we were working under the definition "People Log" - an alternative to LiveJournal, at the time. More oriented toward hardcore bloggers - Trackback, and other things, included. Plogs.net has always offered a cheap alternative for those people who want a large scale product but don't want to set it up themselves.

  20. Re:Never made sense on Trekkie Communicators Now a Reality · · Score: 2, Funny

    Congratulations, you have just asked the exact same question that someone else already asked earlier in the thread, in addition to ignoring even the short description of the article.

  21. Re:Slashdotted on Tracking Social Networking In Shakespeare Plays · · Score: 1

    Jibbler got his server moved to a different box by his webhost, so you should be much better off now.

  22. Re:Here's the best solution: on Getting Power to a Rack Enclosure? · · Score: 1

    #5 (or is it 6?) Learn to count?

  23. Re:windowsupdate.microsoft.com Breakins? on Gentoo rsync Server Compromised [updated] · · Score: 1

    Not quite accurate:

    [crschmidt@peanut ~]$ host www.microsoft.com
    www.microsoft.com is an alias for www.microsoft.akadns.net.
    www.microsoft.akadns.net is an alias for www2.microsoft.akadns.net.

    Microsoft is still using Akamai, whose servers report as Linux, last I checked. Which it seems like, from this comment, you think it isn't. Either I'm misunderstanding, or you are, but Microsoft definitely does use a level of OSS between them and their servers.

    Of course, this level of separation has no effect on security - it simply passes on what needs to be passed on. However, it is still there.

  24. Re:You don't remember correctly on Gentoo rsync Server Compromised [updated] · · Score: 1

    Not quite accurate:

    [crschmidt@peanut ~]$ host www.microsoft.com
    www.microsoft.com is an alias for www.microsoft.akadns.net.
    www.microsoft.akadns.net is an alias for www2.microsoft.akadns.net.

    Microsoft is still using Akamai. Which it seems like, from this comment, you think it isn't. Either I'm misunderstanding, or you are, but Microsoft definitely does use a level of OSS between them and their servers.

    Of course, this level of separation has no effect on security - it simply passes on what needs to be passed on. However, it is still there.

  25. Re:Hmmm... on Spammer DDoS-By-Virus On spamhaus.org · · Score: 1

    www.spamhaus.org: Server: Apache/2.0.47 (Unix) SinkBot/0.6a AttACK/0.5
    Spamcop.net Server: Apache/1.3.27 (Unix) mod_perl/1.27