mysterons · Slashdot Mirror

Doesn't this already happen? on Canadian Court Orders Google To Remove Websites From Its Global Index · 2014-06-17 00:51 · Score: 1

For example, in Germany https://en.wikipedia.org/wiki/... and Thailand https://en.wikipedia.org/wiki/...

It is all about followers on Data Mining Reveals How Wording Influences Tweet Propagation · 2014-05-15 05:36 · Score: 3, Interesting

We did a study on predicting when a tweet would be retweeted (this paper cites us). The dominant factor is not what you write, but how many followers you have.

Basically, a famous person can write anything and it will be retweeted. An unknown person can write the same tweet and it will be ignored.

Link to paper:

Sasa Petrovic, Miles Osborne and Victor Lavrenko. RT to win! Predicting Message Propagation in Twitter. ICWSM, Barcelona, Spain. July 2011. http://homepages.inf.ed.ac.uk/...

Re:Cheapskate? on How Amazon Keeps Cutting AWS Prices: Cheapskate Culture · 2014-04-15 02:27 · Score: 1

People here are forgetting the costs associated with flying senior (i.e. expensive) people around. There is an argument that if you are billing a client three-figure sums a day, you had better ensure that the person flying arrives in good shape so they can work straight from the flight. Sending people coach can be a false economy.

Re:This is relevant to my interests on Twitter's Fake Followers Watching IPO Closely · 2013-11-10 06:50 · Score: 1

RT

And I will claim this as a fake first post. #slashdot

Re:A simple tech solution on Twitter's Fake Followers Watching IPO Closely · 2013-11-10 06:41 · Score: 1

Well, Twitter could simply stop making public the number of followers an account has. Or not even reveal that number to the user in question.

There are simple ways to stop this. Whether Twitter does this is another matter.

Re:What's the point? on Twitter's Fake Followers Watching IPO Closely · 2013-11-10 05:48 · Score: 1

This is a proxy for what marketers call "reach". The more followers you have, the more people will read your posts. Except here the followers are not real, and so people buying this SEO snake oil are being ripped off.

To be expected on Twitter's Fake Followers Watching IPO Closely · 2013-11-10 05:46 · Score: 1

This is the same as any other optimisation task (e.g. link farms for PageRank). People will try it and (eventually) Twitter will work out how to clamp down on it.

Rinse and repeat.

Why is this news?

Economics on Hoax-Proofing the Open Access Journals · 2013-11-04 05:03 · Score: 2

A major problem with open-access journals is that there is no motivation for them to reject submissions. If anything, the more they publish the more money they make. Likewise, peer reviewers (at least in my field --natural language processing and machine learning) are never paid to review. This is not a good combination. I cannot see any reason for journals nowadays. Either publish in conferences (which in some fields are competitive and very tightly reviewed) or, better still, publish papers on arXiv and have some kind of citation / comment system as a way to crowd-source quality control.

Re:replication on How Science Goes Wrong · 2013-10-18 05:09 · Score: 1

If you want to go to the other extreme, look at SIGIR. They have extremely demanding standards for experimentation, along with an associated conservative nature. It is very hard to get something non-incremental (e.g. using some new dataset) published there. But I agree, experiments at ACL tend to be quite sloppy.

Re:replication on How Science Goes Wrong · 2013-10-18 05:04 · Score: 1

Being plausible and being reproducible are not sufficient and necessary conditions. Science is a community, with an expectation of what a believable result should look like. This comes from actually understanding the field, including what is written and what is not written down. It is very rare for there to be some genuinely implausible result and Good Science typically seems obvious in hindsight.

Moses decoder on Google Deprecates Translation API · 2011-05-27 10:28 · Score: 1

There is an academic statistical machine translation system: http://demo.statmt.org/index.php This is open source. Help improve it!

Re:Abuse? on Google Deprecates Translation API · 2011-05-27 10:09 · Score: 1

You can use this to produce spam.

does it blend? on A "Throne" Fit For a Tech King · 2011-04-18 05:00 · Score: 1

that is the question

Regret is a standard term in statistics on Google Teaches Computers "Regret" · 2011-04-17 19:36 · Score: 1

There is a vogue for such terms: an improper prior is one that does not sum to one; loss is when probability mass cannot be reached.

+1 on Google Ties Employee Bonuses To +1 Success · 2011-04-07 22:45 · Score: 0

do I get my bonus now?

Re:I don't think so.... on Why Eric Schmidt Left As CEO of Google? · 2011-01-23 09:38 · Score: 1

You can look at LinkedIn to get a rough estimate of the number of ex-Google people at Facebook.

This article

http://www.insidefacebook.com/2010/10/15/as-source-for-current-facebook-employees-google-has-big-lead-on-yahoo-microsoft-oracle/ suggests there are 277 ex-Googlers at Facebook. (The numbers from other big tech employers are smaller.)

Re:Facebook: Hot Tech Company — Explain??? on Why Eric Schmidt Left As CEO of Google? · 2011-01-23 07:38 · Score: 5, Informative

--stock options: Facebook is/was pre-IPO. If you want to get rich as an engineer you would work there. You will never get that rich at Google.

--freedom: Google is a large company and it is hard to get stuff done. Facebook is small.

--Google is perceived as no longer being the place where the best people work.

Verification and Science on Predicting Election Results With Google · 2010-10-31 06:38 · Score: 1

Results from query logs are great, but until the raw data is made public, no-one can verify or reproduce these results. Until that is done they remain a curiosity at best.

Re:End of Science on The Big Promise of 'Big Data' · 2010-09-14 08:17 · Score: 1

Reproducing experiments can be hard if no-one else has the data (this can happen --for example if you are Google and publish results using large fractions of the Web as data) or even if something as trivial as moving the data from one site to another requires a lot of effort. This is not really a question of storage costs --it is a question of having the data in the first place and the mechanics of moving it around. Models are used in Science as idealisations; but if you really really want to model the long tail of effects, then your model becomes the data. And this relates to summary statistics: all they do is capture aspects of the data (a summary is, after all, only a summary). If you want the whole truth, then you can't summarise. Fernando Pereira and Peter Norvig have a nice paper on this: http://googleresearch.blogspot.com/2009/03/unreasonable-effectiveness-of-data.html [The Unreasonable Effectiveness of Data]

End of Science on The Big Promise of 'Big Data' · 2010-09-14 07:28 · Score: 2, Informative

Related to using Big Data in Business is Big Data in Science. Wired ran a nice series of articles looking at this (http://www.wired.com/wired/issue/16-07). This raises all sorts of problems (for example, how can results be reproduced? What if the model of the data is as complex as the data? Are all results obtained with Small Data simply artefacts of sparse counts?).

Re:Attach the stupid URL as metadata on Why Twitter's T.co Is a Game Changer · 2010-09-13 05:35 · Score: 2, Informative

oh you mean twitter annotations http://techcrunch.com/2010/06/02/twitter-annotations-testing/

Re:Ensemble learning on New Leader In Netflix Prize Race With One Day To Go · 2009-07-26 03:28 · Score: 1

Well, you really want to think about bias/variance reduction, which brings the ideas of averaging and of using better classifiers together. For example, "bagging" can be thought of as a variance-reduction technique; "boosting" does both if I recall.
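
To make the variance-reduction reading of bagging concrete, here is a minimal sketch rather than a definitive implementation. Everything in it is an assumption chosen for illustration: the target function (a sine curve with Gaussian noise), the base learner (1-nearest-neighbour, picked because it is unstable), the query point, and the sizes. It draws many independent training sets and compares how much the prediction at one point varies for the single learner and for the bagged one.

```python
import math
import random
import statistics


def nearest_neighbour_predict(train, x0):
    """Predict y at x0 from the single closest training point (a high-variance learner)."""
    return min(train, key=lambda point: abs(point[0] - x0))[1]


def bagged_predict(train, x0, rng, num_resamples=25):
    """Bagging: average the base learner over bootstrap resamples of the training set."""
    preds = []
    for _ in range(num_resamples):
        # A bootstrap resample draws len(train) points with replacement.
        resample = [rng.choice(train) for _ in train]
        preds.append(nearest_neighbour_predict(resample, x0))
    return statistics.fmean(preds)


def compare_variance(num_trials=200, train_size=30, x0=3.0, seed=0):
    """Return (variance of single-learner predictions, variance of bagged predictions)."""
    rng = random.Random(seed)
    single, bagged = [], []
    for _ in range(num_trials):
        # Hypothetical data: y = sin(x) + Gaussian noise, x uniform on [0, 6].
        train = []
        for _ in range(train_size):
            x = rng.uniform(0.0, 6.0)
            train.append((x, math.sin(x) + rng.gauss(0.0, 0.5)))
        single.append(nearest_neighbour_predict(train, x0))
        bagged.append(bagged_predict(train, x0, rng))
    return statistics.variance(single), statistics.variance(bagged)


if __name__ == "__main__":
    single_var, bagged_var = compare_variance()
    print(f"single learner variance: {single_var:.3f}")
    print(f"bagged learner variance: {bagged_var:.3f}")
```

Under these assumptions the bagged predictor should vary noticeably less from one training set to the next. The sketch only looks at variance; it says nothing about bias, and nothing about boosting.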

Ensemble learning on New Leader In Netflix Prize Race With One Day To Go · 2009-07-26 02:42 · Score: 1

I'm actually surprised that this hasn't been done before. You can prove that using multiple models will on average produce better results than using any single model in isolation. For example, each Netflix system will make different errors; using multiple systems will tend to average out these errors and the consensus decision is most likely to be correct.
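
A toy simulation of the averaging argument, as a sketch under strong assumptions: every model's error is unbiased, equally noisy and independent of the others. Real Netflix Prize systems share data and ideas, so their errors are correlated and the gain would be smaller than shown here. The rating range, noise level and function name are all hypothetical.

```python
import random
import statistics


def simulate(num_models=10, num_items=2000, noise=1.0, seed=0):
    """Compare each model's RMSE with the RMSE of the averaged prediction."""
    rng = random.Random(seed)
    single_sq_errors = [0.0] * num_models
    ensemble_sq_error = 0.0
    for _ in range(num_items):
        truth = rng.uniform(1.0, 5.0)  # hypothetical "true" rating
        # Assumption: each model is the truth plus independent Gaussian error.
        preds = [truth + rng.gauss(0.0, noise) for _ in range(num_models)]
        for i, pred in enumerate(preds):
            single_sq_errors[i] += (pred - truth) ** 2
        consensus = statistics.fmean(preds)  # simple unweighted average
        ensemble_sq_error += (consensus - truth) ** 2
    single_rmse = [(err / num_items) ** 0.5 for err in single_sq_errors]
    ensemble_rmse = (ensemble_sq_error / num_items) ** 0.5
    return single_rmse, ensemble_rmse


if __name__ == "__main__":
    single_rmse, ensemble_rmse = simulate()
    print(f"best single model RMSE: {min(single_rmse):.3f}")
    print(f"averaged ensemble RMSE: {ensemble_rmse:.3f}")
```

With independent errors the averaged prediction's error shrinks roughly with the square root of the number of models, which is why the consensus beats every individual model in this setup.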

Re:But for how long? on Google Outlines the Role of Its Human Evaluators · 2009-06-07 18:45 · Score: 1

That's a pretty good way to think about it --this really is like a kind of reverse hashing.

+1 insightful

Re:But for how long? on Google Outlines the Role of Its Human Evaluators · 2009-06-07 09:30 · Score: 1

There is another possibility: automatic search (attempting to find relevant pages for everyone) has reached a plateau in terms of performance and if you want to do better, you will need to employ raters. Clearly Google would like to find "the next best thing" in Search, but that sounds quite uncertain. Employing lots of people is a much surer way to improve results.

Comments · 32