I totally called this, back in 2007, when LiveJournal started to use SpinVox's services.
I was suspicious at the time, and started to look for information. What I found made me absolutely sure that at least part of it wasn't actually as automated as it was made out to be, and in fact, gave me the distinct feeling that it was mostly manually done by humans.
I started to write an article on the subject that I was going to publish in the LJ community "no_lj_ads". Being a Support volunteer, I had access to the feature before it was released for general use, and I was able to make some observations. However, although I made good progress on the article, it was never finished. There were lots of points to make, and it wasn't long after that that LiveJournal was the subject of a controversy known as "Strikethrough". The article got buried on my computer and forgotten about, half-finished.
In 2008, I dug up the article again, completed it using notes that I had left, and reposted it to my LiveJournal. I'll reproduce it here, too, because I think people will be interested.
Remember, this article was originally made in 2007. Because of that, some of the links are now defunct. The article has been slightly edited in places in order to note where this is the case; these edits will be noted [2008: Like this!] or [2009: Like this!], depending on whether I noticed it in my 2008 reposting, or in this 2009 reposting.
On to the article!
The Problem of Logistics
First, let me address the obvious problem of logistics. Yes, logistics are a big problem. LiveJournal has tons of users, to put it mildly, and SpinVox already has quite a lot of clients, I believe. If SpinVox weren't fully automated, how could they solve this problem? Is SpinVox some sort of sweatshop?
To tell the truth - I don't know how SpinVox solve that. It seems like for that reason alone SpinVox would be an automated system, and I'll be the first to admit that it's a good question that deserves an answer, and a good reason to believe it's automated. On the other hand, though, I believe I have evidence that shows pretty strongly that all is not automated. I'll be covering that evidence here.
The Evidence
Well, let's get started with some obvious points. The first thing to do is to look at some random people's journals and check out the quality of the transcription for yourself, so go check out the post in paidmembers and click through to some random commenters' journals. Chances are, most of them will have made a voice post by now to test the system, and auto-transcription only occurs on public entries, so you have a good chance of finding some. Heck, some commenters link to their posts for you. Go check them out. I'll be here when you get back. (If you want, you can also try this Google Blog Search search for recent voice posts, but not all of them will be from paid members, and Google doesn't pick up all of them.)
Okay, you're back? Cool. You've probably noticed that the quality of the transcriptions is really pretty good, but obviously it still makes mistakes. That's okay - it's to be expected, from an automated system, right? And yes, it *is* to be expected. No automated system is perfect. Mistakes will always be made. I encourage you to bear this in mind and be skeptical about what I have to say. Analyse it for yourself; don't let me brainwash you. Be skeptical, it's healthy for you.
Pros
Having said that, however, SpinVox is still very awesome, if we consider it to be automated:
1. It understands a wide variety of accents.
2. It understands when you speak quickly.
3. It works over the phone.
4. It doesn't mind background noise, or quiet voices.
5. It knows when and how to
The RIAA deals with music. The MPAA deals with movies. Neither of them deal with software.
If you're going to make an argument like that, at least get your company names right.
Hopefully you don't actually use those; they can be derived.
I use PuTTYcyg, which lets you use Cygwin with PuTTY's own terminal.
To me, this is by far the best way; PuTTY does pretty much everything I want, and is *miles* better than cmd.exe.
It's more analogous to counting people as drivers of cars when in reality they're only ever passengers.
Nobody's saying that Linux servers aren't used. What the GP is saying is that you can't count *every single user* of some popular site as a user of the OS that site runs on.
Or to put it another way: Let's say 70% of the Web-browsing public uses GMail (which, of course, is a number I pulled straight out of my ass). Does that mean 70% of the Web-browsing public are Linux/GoogleOS/whatever-OS-GMail-runs-on users? No, and to try to say otherwise is just outright skewing the numbers. They're GMail users, and that's all you can say about them. It makes no sense in this case to say that Linux use is up from a user perspective.
Now, had you framed it in the context of the servers themselves - with more users of the service equating to needing more Linux servers to cope with the load - then you might have a point. (though even then, it's still only use by one company.)
It was a joke on number types - integer vs. real. It didn't actually have anything to do with this discussion.
But in the long run, you get lots *more* spam, since each address you give out can count as a separate entry in some spammer's email address list.
Depending on where I'm tracking, I generally use either FastTracker 2.08 or SoundTracker - not the original, but a new program that is basically an XM editor for Linux.
Most people in the US do.
Except that the standard way of storing times as a POSIX timestamp (number of seconds since the epoch) specifically doesn't include leap seconds.
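To make that concrete, here's a minimal Python sketch. It assumes the leap second that was inserted at 2016-12-31 23:59:60 UTC as the example; the helper name posix_seconds is just something made up for illustration. Two real seconds pass between 23:59:59 and the following midnight, but the timestamps only differ by one.

```python
import calendar

def posix_seconds(year, month, day, hour, minute, second):
    """Convert a UTC calendar time to a POSIX timestamp (hypothetical helper)."""
    return calendar.timegm((year, month, day, hour, minute, second))

# A leap second (23:59:60 UTC) was inserted at the end of 2016-12-31.
before = posix_seconds(2016, 12, 31, 23, 59, 59)
after = posix_seconds(2017, 1, 1, 0, 0, 0)

# Two SI seconds actually elapsed, but POSIX time only advances by one,
# because every POSIX day is defined as exactly 86400 seconds long.
elapsed_posix = after - before
print(elapsed_posix)
print(after % 86400)
```

The second print shows that midnight still lands on an exact multiple of 86400, which is only possible if the leap second was never counted.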
Being able to inspect the source isn't the be-all and end-all. In some cases there may be more than you bargained for.
It depends very much on specific circumstances, of course, and with the fast progress of software nowadays you'd really need to be in control of both the compiler source and the target's source to pull this off. But the possibility is there.
The best first programming paradigm is one that doesn't involve the word "paradigm".
(You should watch out, you'll become a manager that way.)
The Regular mode uses Hibernate, which they say takes 20-22 seconds. The Fast mode uses Suspend.
Apparently motivated by his recent trip into space, perhaps he has found a higher purpose while orbiting so high above the earth. o/~ far beneath the ship the world is mourning they don't realize he's alive no one understands but Major Tom sees now the life commands this is my home I'm coming home o/~
I wasn't personally suggesting that YouTube was a search engine. In fact, I agree with you that it's a specific dataset. I was merely responding to the idea that grep would normally be used in a non-specific dataset. I don't believe many people normally do a grep -r / - since you do, that's a bad assumption on my part, and I apologise. But I think you get the rest of what I was saying.
Have you ever done a grep -r /, grep -r ~, or similar?
Because if not, you're illustrating the GP's point. You know what files you're looking in, but you don't know what file something is in. You're searching a specific set of data, in this case, files that reside in a particular place.
Similarly with locate. Your set of data is a list of pathnames, and not only that, but pathnames that have already been filtered by the locate command. Again, you know you want to look in a specific place - a list of pathnames.
The for/wget/grep is more like the search engine, but even then it's a non-typical use, analogous to using the site: operator in Google, which most people don't.
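For what it's worth, here's a rough Python sketch of what I mean by searching a specific set of data. It's not how grep itself is implemented; grep_tree is a made-up name, and plain substring matching stands in for regular expressions. The point is that the caller always supplies the root, so the place being searched is known even when the file isn't.

```python
import os

def grep_tree(root, needle):
    """Return the sorted paths of files under root that contain needle."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Ignore undecodable bytes so binary files don't abort the search.
                with open(path, "r", errors="ignore") as handle:
                    if any(needle in line for line in handle):
                        matches.append(path)
            except OSError:
                continue  # unreadable file: skip it and carry on
    return sorted(matches)
```

Calling grep_tree(os.path.expanduser("~"), "something") is the rough equivalent of grep -r ~: you don't know which file will match, but you've already decided where to look.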