Funny, last time I watched BBC (a few days ago) there was no closed captioning. And AFAIK, very few Chinese (CCTV) or Japanese (NHK) stations have it.
As far as CC getting keywords right -- have you LOOKED at closed captioning lately? You need to be careful -- something like CNN's captioning will be very good if a story has been repeated in their news cycle for several hours. As the news is repeated, the closed captioning is improved via an editing process, since the news doesn't change much. But on breaking stories, or anything the transcriber hasn't seen before -- it's decidedly worse. Also, watch something like Firing Line or a sportscast for a much better view of realtime closed captioning.
And, more to the point, generating closed captioning is VASTLY more expensive both in dollar cost and labor. Speech recognition currently attains 90%+ accuracy on a problem like broadcast news, for a completely trivial cost compared to human transcribers (i.e. buy a machine, plug it in, it works forever).
Guess what -- only US TV is closed-captioned, and for a lot of interesting stuff (say, breaking news on CNN) the closed-captioning BITES. Even when it's good, it's not an "accurate" transcript. Speech Rec can get closer, even with errors. And it's essential if you're not interested in US TV.
Subject says it all...!
It really depends on the speech recognition you are trying to do. If your problem domain is recognizing 50 words over a phone, sure -- no problem. If your problem domain is dictation or transcription in the 64k-128k vocabulary space, you need much more bandwidth, because your training data will have to be of higher quality to get the accuracy you need.
I doubt people moshing in a pit somewhere exactly qualify as audiophiles :-)
Well, given that they recommend off-the-shelf hardware to route and broadcast the packets, it would sound like it. The Gibson article didn't make any such claim.
Right, and I acknowledge that, and it's all well and good. But don't call it "Ethernet". The reason they did that is annoying: they want to piggyback on the idea of computer networking. It's a marketing maneuver. But hey, we're all engineers here, and when we say a word, we have a particular meaning in mind.
A few other notes:
- Since they're running point-to-point, with no broadcasts, over high-speed ethernet, they should have no problem with bandwidth. I do realtime speech recognition with audio streamed in realtime over an ethernet, WITH TCP/IP, and we do just fine. And don't try to convince me that speech recognition is less quality- and latency-sensitive than a guitar!
- Despite all the people telling me to "read the fscking article", the phrase "Gibson did this by modifying the Ethernet networking protocol..." makes me highly suspicious. But then again, it is really marketing fluff we're arguing about, so all bets are off. :-)
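A back-of-the-envelope check on the bandwidth point in the notes above. The sample rates, sample widths, and the 100 Mbit/s link speed are illustrative assumptions, not figures from the Gibson article or from anything stated in this thread:

```python
def pcm_bitrate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Raw (uncompressed) PCM bitrate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def link_share(stream_bps: int, link_bps: int) -> float:
    """Fraction of a link's nominal capacity that one stream occupies."""
    return stream_bps / link_bps


# Assumed formats: 16 kHz / 16-bit mono for speech,
# 48 kHz / 24-bit mono for an instrument channel.
SPEECH_BPS = pcm_bitrate(16_000, 16)
GUITAR_BPS = pcm_bitrate(48_000, 24)
LINK_BPS = 100_000_000  # assumed 100 Mbit/s Ethernet link

if __name__ == "__main__":
    for name, bps in (("speech", SPEECH_BPS), ("guitar", GUITAR_BPS)):
        print(f"{name}: {bps} bit/s, {100 * link_share(bps, LINK_BPS):.3f}% of the link")
```

Under those assumptions a single uncompressed channel uses on the order of one percent of the link, before any protocol overhead is counted. Latency and jitter are a separate question that this arithmetic says nothing about.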
Right, I did fail to normalize for press-release-itis and tech-journalism-itis. [but then again, so does 95% of the staff of slashdot, heh]
However, to respond to your comments:
- If you change the spec, and the thing no longer performs all the functions in the spec, you have done a bad thing. Hence my comment about running it over a regular networking fabric, with other traffic. So, the fact that the helicopter still flies means it's still a helicopter. But if you can't do all the other fun stuff that ethernet does, it ain't ethernet.
What the hell -- they "modified" ethernet? Sorry, then it's not ethernet. Can you broadcast other data over the same fabric and have it work? Then MAYBE I'll believe it's ethernet. Other than that, they ripped off some ideas. But why do people keep reinventing the wheel like that? I bet they could have used EXACTLY ethernet and it would have just worked.
And this means what? 802.11 networks are (I believe) Part 15 devices, which means that they are free for anyone to use, as long as they stay under a certain emitted power and don't interfere with other equipment. Unless the cable companies can get this law changed (good luck) I don't see how this has any bearing on the point in question.
At the risk of being gauche and following up to my own post:
http://www.computers4sure.com/linksys/store/att_startup.asp
This is a link to a page I got to via http://www.broadband.att.com. Sign up with AT&T broadband, and they'll dropship you the Linksys NAT of your choice (wired or wireless). Tah dah!
Sorry, but internet technology is NOT regulated by the FCC.
But the funny thing is: I don't see why this guy's got his undies in a bundle. AT&T was SELLING Linksys NAT boxes in a promotion this summer in my area (Cambridge, MA -- ex-Mediaone). Big flyers! Network your entire house! Share your connection! Granted, they didn't mean with your neighbors, but there you go. I doubt anyone has shelled out the extra money for their vastly overpriced extra IP service.
The CD keys are only used in multiplayer games, and they don't authenticate with the server you are connecting to, but with a master server that ID maintains -- else, how could they run a blacklist of banned keys?
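A toy sketch of why the check has to live on a central server. The class and method names are hypothetical and this is not how id's actual master server works internally; it only shows the shape of the argument, that a ban recorded in one place is seen by every game server:

```python
class MasterServer:
    """Hypothetical stand-in for the central key-checking service."""

    def __init__(self, banned_keys=()):
        self._banned = set(banned_keys)

    def ban(self, cd_key: str) -> None:
        self._banned.add(cd_key)

    def is_allowed(self, cd_key: str) -> bool:
        return cd_key not in self._banned


class GameServer:
    """Hypothetical game server that keeps no key list of its own."""

    def __init__(self, master: MasterServer):
        self._master = master

    def admit(self, cd_key: str) -> bool:
        # Every join attempt is checked against the shared master server.
        return self._master.is_allowed(cd_key)


if __name__ == "__main__":
    master = MasterServer()
    server_a, server_b = GameServer(master), GameServer(master)
    master.ban("EXAMPLE-KEY-0001")  # made-up key
    print(server_a.admit("EXAMPLE-KEY-0001"), server_b.admit("EXAMPLE-KEY-0001"))
```

If each game server kept its own list instead, a banned key would simply move to a server that had not heard about the ban.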
You should still be able to play single player w/o the CD.
That depends entirely on how you author your SOAP application. SOAP describes (at a high level) how you can make applications talk to each other, a la TCP/IP. It does not (and should not) define what the two applications say to each other.
At the risk of replying to a troll, I'd say that if your organization is so fscked that you'd rather obscure what you're doing than attempt to cooperate with another arm of your organization -- well, you've got more severe problems than SOAP security! :-)
I, for one, am rather glad that SOAP is light on the security front. Security is a discipline all its own, with its own science and technologies. If you start to intermingle it with too many other things, not only is it more burdensome, but you run the risk of missing things. If you are worried, for instance, about people spying on your SOAP traffic, don't build an "encrypted SOAP"; use one of the 6.02E23 other transport protocol security mechanisms that have been designed and thoroughly thought out (SSL, IPsec, you-name-it). Let SOAP (or whatever you are developing) do what IT does well, and let the security do what IT does well.
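A minimal sketch of that separation using Python's standard library, assuming the service is reachable over HTTP or HTTPS. The function names and the host are made up for illustration. The message bytes are identical either way; only the transport object changes:

```python
import http.client
import ssl


def open_transport(host: str, port: int, use_tls: bool) -> http.client.HTTPConnection:
    """Return an unconnected HTTP(S) connection; nothing is sent until .request()."""
    if use_tls:
        # The default context verifies the server certificate and hostname.
        context = ssl.create_default_context()
        return http.client.HTTPSConnection(host, port, context=context)
    return http.client.HTTPConnection(host, port)


def post_message(conn: http.client.HTTPConnection, path: str, message: bytes):
    """Send the same bytes whichever transport was chosen (not called in this sketch)."""
    conn.request(
        "POST",
        path,
        body=message,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    return conn.getresponse()


if __name__ == "__main__":
    # soap.example.com is a placeholder host; no network traffic happens here.
    secure = open_transport("soap.example.com", 8443, use_tls=True)
    print(type(secure).__name__, secure.port)
```

Nothing in `post_message` knows or cares whether encryption is in play, which is the point: the security layer can be swapped or upgraded without touching the message format.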
Right, I should have written "the most common binding". However, I would hazard a guess that 90%+ of SOAP implementations are using HTTP, and not DIME, BEEP/BXXP, or (gasp) SMTP(!).
Exactly, there has been much gnashing of teeth about this on the xml-dist-app list (a SOAP standardization list).
Although SOAP is bound to HTTP, there is no requirement that you use port 80 -- it's just a well-known HTTP port. As long as the people who need to use your service agree to it, you can use port 12345 if you want. If you are really paranoid, you should be running HTTP over something more secure, like a VPN between you and the service requestor, and not the public (great unwashed) internet.
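To make the port point concrete, here is a sketch that builds (but does not send) a SOAP 1.1 request aimed at port 12345. The host, path, method element, and SOAPAction value are invented for illustration; the envelope namespace and the SOAPAction header are the ones used by the SOAP 1.1 HTTP binding:

```python
import urllib.request

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_request(host: str, port: int, path: str, body_xml: str,
                  soap_action: str) -> urllib.request.Request:
    """Wrap body_xml in a SOAP 1.1 envelope and address it to host:port."""
    envelope = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<soap:Envelope xmlns:soap="{SOAP_ENV}">'
        f"<soap:Body>{body_xml}</soap:Body>"
        "</soap:Envelope>"
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}{path}",
        data=envelope,
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{soap_action}"',
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical service and method, for illustration only.
    req = build_request(
        "soap.example.com", 12345, "/quote",
        '<m:GetQuote xmlns:m="urn:example:quotes"><symbol>XYZ</symbol></m:GetQuote>',
        "urn:example:quotes#GetQuote",
    )
    print(req.get_method(), req.full_url)
```

The port is just part of the URL both sides agree on; nothing in the envelope changes.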
My recommendation: find an ADT sticker, scan it, print a few, and stick it on your house. You'll get plenty of security that way.
Avoid ADT like the plague: They will try to get you to commit to a $22/mo *3 year* contract, that will automatically roll over for two further years after that. Now, remember what they do: the alarm goes off, they call your house. If there is no answer, or the person who answers doesn't give the passcode, they call the cops. THAT'S IT. It is NOT an active monitoring system, the alarm in your house calls them -- so you are paying $22/mo for someone to answer the phone and screen false alarms.
Oops, my mistake -- but it is on a path to be ratified -- and XML-RPC isn't even there. That gives it a huge leg up in my opinion.
The biggest difference is that SOAP is a W3C-ratified protocol, while XML-RPC is not.
SOAP also isn't that hard to use -- using a package prevents you from ever having to look at the wire protocol. SOAP::Lite makes using SOAP completely trivial.
Ok, wiseass :-)
Check out the following PDF:
http://spaceflight.nasa.gov/station/assembly/component_view.pdf
You will note that there are on the order of 35 modules. All but 8 of them are US or Russian (78%). The 8 remaining are among the smallest physical modules in the station -- two arms, two labs, two "logistics modules" and some miscellany.
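A quick check of that percentage, since "on the order of 35" is only approximate. The helper name is made up, and the totals tried here are guesses around the stated figure, not counts taken from the PDF:

```python
def us_russian_share(total_modules: int, other_modules: int) -> float:
    """Fraction of modules that are not from the other partners."""
    return (total_modules - other_modules) / total_modules


if __name__ == "__main__":
    for total in (35, 36):
        pct = round(100 * us_russian_share(total, 8), 1)
        print(f"{total} modules, 8 from other partners: {pct}% US or Russian")
```

With exactly 35 modules the share comes out a little over 77%, and with 36 it rounds to 78%, so the quoted figure is consistent with a total in that neighborhood.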
I would be willing to guess that my numbers are largely correct by most any measurement you care to name: percent of work, percent of modules, percent of budget (which is what the article was talking about -- remember, it's only us and the Russians actually sticking that stuff up there).
It's worth noting that the other nations taking part in the ISS are Japan, Brazil, Canada, and the EU and Italy. Since Italy is part of the EU, I'm not sure why the distinction here, but there you go. Japan is the 3rd biggest contributor, at 4 modules.
Only the 45% that we paid for and built, and the 45% that we paid Russia for and they built... I guess the other 10% belongs to someone else!
It's also worth noting that there are many large projects that DON'T get completed on time/on budget. A prime example of this is the 10+ year 15+ BILLION DOLLAR "Big Dig" here in Boston. I won't go into details; you can get more info on the fiasco on the web.
One distinction between many software projects and the examples you give above is that after the first example of a new thing, no "new" engineering is done. The 5,000th B-747 is, to a first order, exactly like #s 1-4,999 (or at the very least, exactly like all the others in its model family). However, building the FIRST B-747 is extremely complicated (and, if you read any of the history of Boeing, was a "bet the company" project!).
Skyscrapers and bridges are similar in that many have been built before, and there's usually a PENALTY for innovation beyond a certain degree. For instance, skyscrapers usually look aesthetically different, but structurally they are very similar to the one down the block. The same goes for bridges.
Can the same be said for software? Not always. The SCSI driver you write tomorrow is probably a lot like the one you wrote 4 years ago, but things like the first browser, MP3 player, Napster, Gnome/KDE, Quake -- more groundbreaking pieces of software -- probably didn't have a lot to go on.
This is one area where Open Source can make a huge contribution, by letting people spend time on the innovative areas of their project and draw from a stable toolkit of features and technologies that they don't have to reinvent.
I agree with your risk management comment, and with a later poster who mentioned fixing the endpoint, but I'm not sure I agree with your claim that it can't be pinpointed with any degree of accuracy.
After ~15 years in the industry, I've found that one thing that makes a huge difference is the experience of the team, and the familiarity between the actual engineers and the project management.
As you have experience solving a variety of classes of problems, you can predict with increasing accuracy the time it'll take you to solve later problems. And as your management sees you getting increasingly accurate in your estimates (based on past projects) they can create better and better schedules and estimates for the project as a whole, and have a better intuition for the gray areas of development, or the greener developers.
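One simple way to turn that track record into numbers, as a sketch only. Using the median overrun ratio is my choice for illustration, not an established method, and the history below is made up:

```python
from statistics import median


def calibration_factor(history):
    """history: (estimated_days, actual_days) pairs from finished tasks."""
    return median(actual / estimated for estimated, actual in history)


def calibrated_estimate(raw_estimate_days: float, history) -> float:
    """Scale a fresh estimate by how far off past estimates tended to be."""
    return raw_estimate_days * calibration_factor(history)


if __name__ == "__main__":
    # Invented track record: this engineer tends to run about 1.5x over.
    past = [(10, 15), (20, 26), (5, 10), (8, 12)]
    print(calibrated_estimate(12, past))
```

The more finished tasks in the history, and the more they resemble the new work, the more the factor is worth. For a wholly green team there is no history to draw on, which is exactly the problem.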
Projects that tend to go off into the weeds have included (in my experience) wholly green teams, wholly green management, or areas of development that are outside the areas of expertise of one or both.
Might I ask why your requirements include arbitrary retrieval time? I would put it to you that if you have arbitrary retrieval time, you DON'T have to save everything. Are you saying that if I built you a system that saved everything at your highest bandwidth constraints, but took a year to do a retrieval, that is acceptable? I seriously doubt it.
If you are saving that much and care that little about it, you don't need to save that much.