...in addition to the 20TB worth of dictionary data that it would take
to hold 5,000 4GB movie files to use as the snippet library. I could
get outstanding compression if I ship a Redbox machine to your house
and send you the symbol "Sharknado".
I tried to ignore the issue of having a complex decoder. However,
looking at the math, the argument still works for a single 2 hour
250MB reference movie. (The number of reference movies only appears
in the log2() factor.)
Still there is a strong intuition that one can cheat with a huge
decoder. However, even with a massive decoder, one can't escape the
counting argument based on the number of distinguishable movies. I
think the reason is probably a Shannon versus Kolmogorov distinction.
Roughly, I can create enormously complex objects with high Kolmogorov
complexity but represent them with a few symbols if my goal is just to
select between them. This is similar to your Redbox example that
stores a lot of "information" but only a small number of movies. A
realistic compression solution would have to handle both issues.
I think it's possible to give some reasonable lower bounds on
lossy compression using a little bit of perceptual hand waving.
The key is to generate a large set of "different" movies. These
don't have to be good movies, but it needs to be obvious to a human
that they are different. Let's take the top 5000 movies of all time
and break them up into 10 second pieces. Assuming on average each
movie is around 2 hours long, this gives one an alphabet of 3,600,000
symbols. Now I can use this alphabet to create a bunch of 2 hour
movies by looking at all sequences of 720 symbols from this alphabet.
This gives me 3600000^720 movies (including properly edited versions
of Pulp Fiction and Memento).
Clearly a human can tell the difference between these movies, so
any respectable lossy compression scheme needs to be able to uniquely
identify them. Therefore the compression scheme better give a
different file for each movie. This requires lg(3600000^(720))/8
bytes, so around 2kB.
Notice that the key is not so much the 5000 original movies but the
size of the slices. Using 1 second slices makes it much harder for a
human to follow, but one could still detect differences. In this
case, the compressed files would need to be around 22kB. I think this
is a lot bigger than Sloot was claiming and it's not close to
including all the movies a human can easily differentiate.
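For anyone who wants to check the arithmetic, here is a rough sketch of the counting bound. The function name and parameters are my own invention for illustration; the only assumptions are the ones stated above (5000 movies, 2 hours each, fixed-length slices, every sequence of slices counts as a distinct movie).

```python
import math

def min_compressed_bytes(num_movies, movie_seconds, slice_seconds):
    """Lower bound in bytes on the file size needed to give every
    spliced-together movie its own distinct compressed file."""
    slices_per_movie = movie_seconds // slice_seconds
    # Every slice of every reference movie is one symbol of the alphabet.
    alphabet_size = num_movies * slices_per_movie
    # A movie is a sequence of slices, so there are
    # alphabet_size ** slices_per_movie movies; take log2 for bits.
    bits = slices_per_movie * math.log2(alphabet_size)
    return bits / 8

two_hours = 2 * 60 * 60
print(min_compressed_bytes(5000, two_hours, 10))  # 10 second slices
print(min_compressed_bytes(5000, two_hours, 1))   # 1 second slices
print(min_compressed_bytes(1, two_hours, 10))     # single reference movie
```

The last call is the single reference movie case: the bound shrinks but stays in the same ballpark, because the number of reference movies only enters through the log2() factor.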
I was thinking that negative prices were a nice way to encourage people to invest in home batteries (Powerwall). With a big battery one can better exploit peak pricing (or in this case spot pricing). However, your idea is better/complementary.
The real issue is what is a good way to deal with variable generated power. You either need to waste it, store it, or use it. Seems like a good business opportunity to come up with some clever ways to use it as it becomes available. Is this the kind of stuff Enron did? That can't be good.
Privatized prisons need to pull a profit. If they don't pull a bigger
profit than last year, few will invest in a non-growing market
segment.
I'd really like to understand this dynamic. Getting a constant 10%
return on an investment would be great for most people. Isn't this a
constant profit? (This assumes I get the money as a dividend,
otherwise things get more complicated.)
The tax payers pay for these profits. It's like a stock ticker where
you can buy your taxes back as income.
Just like a company, you can save money by doing it in house. In
other words, we can save money by just having the government do it.
It should be obvious why one shouldn't privatize state
functions. Eventually they become (to support the business needs) more
expensive than an organization that doesn't carry the burden of being
profitable. Yes, there are inefficiencies in government, but these are
balanced by the inefficiencies of industry.
Right. The argument is that competing capitalist industry is better
at being efficient. However, one needs to do a careful analysis to
see when that holds true. It is the natural/necessary tendency of
companies to try and distort the free market to maximize profits.
Also one needs to understand how maximizing profits is connected to
the goals of service. Maximizing the number of people in prison is
profitable for private prisons, but is not great for the country.
We pay too much to keep people in prison. (And spiteful people seem to want to keep them there.) Still, the changes in California are not unreasonable. They show a 6% yearly increase. Given that the prison population is shrinking, it's not surprising that the fixed costs that are built into the system are going to give a number that is higher than inflation, which is about 2% over that timespan.
I have no doubt biologic drugs should be more expensive. The argument is about what kind of markups they get and what kind of economic pressures will limit these markups. One traditional drug mentioned "costs" $80 a year and is "sold" for $3036 a year. It would not surprise me to find out that a particular biological drug costs $1000 and is sold for $100,000. Of course, I'm just making this up, but without evidence otherwise, it seems in line with what these companies do.
As my link shows, even generics are often outrageously overpriced. I'm sorry, without independent evidence I find it hard to believe claims made by pharmaceutical companies on price. If generics sometimes have markups of over 5000%, then why would they decide to have only a 100% markup on a patent-protected medication? Given that Medicare must pay, the only thing stopping the pharmaceutical companies is public outcry.
Do you have any references that show these drugs cost $50K to make? The pharmaceutical industry has a history of jacking up the prices of drugs well beyond cost. This is true even for generics.
http://www.lifeextension.com/m...
I have no doubt that they will increase the price even more for products with monopoly protection that they can bill to health care providers.
Honestly I didn't know he had his own definition until reading the mentioned Wikipedia page. I think his distinction between weak and strong AI is more useful, but it's unfortunate that he used terms that have already been taken in this area of research.
No, that's a naive definition, good enough for a dictionary, I guess.
A word can have many definitions. Just agree on one and avoid
confusion and equivocation. What definition are you using?
If you want to understand the issue, you need to understand the
difference between weak and strong AI. At least read the Wikipedia
article, it will give you a good overview of the topic. AlphaGo is
weak AI, not even the creators claim it is strong AI.
Are you using the research definition of strong versus weak AI or the
Kurzweil popularized definition? Most people who know about this
topic use the older research definition. As researchers, of course
the AlphaGo people refer to their work as weak AI. They probably
don't regard the philosophical idea of strong AI as relevant.
Thanks. I knew they did this for Deep Blue, but I thought AlphaGo was hands off. It's definitely cheating. In the long run, just like chess, they will probably not need human input.
Carnivore (predator) pets like dogs and cats tend to be much more sensitive to motion.
They will *perceive* motion on TV, it will just look more choppy and flickering to them.
I would assume prey would be more sensitive. I'd rather miss a meal than be a meal.
I understand what these techniques do; what I don't understand is what you think they need to do. Perhaps you can relate this to what you speculate humans do.
I guess the only way for me to interpret what you are saying is that you think strong AI is implied by something being able to learn from its own mistakes. This is an interesting claim. What do you mean by being able to learn from its own mistakes?
There are many areas of machine learning. Try looking up co-training or multi-view learning. In essence, these techniques can label mistakes and improve performance without supervised labels coming from the outside.
Your problem seems to be based on what people call this technology.
Most people in machine learning don't think these programs are
intelligent in the human sense. Roughly speaking, machine learning is
often good at solving problems that are difficult to code but are
solvable by humans. As for artificial intelligence, a whale shark is
not both a mammal and a fish.
For example, we do only observe actual intelligence in connection with
consciousness. Seeing them as separate is hence not a scientifically
sound approach.
I don't agree. There are very few things we call intelligent. I'm
sure they have lots of incidental correlations between them.
And we have even less of an idea what consciousness is. According to
the current scientific state-of-the-art, there is no physical
mechanism for consciousness, yet it clearly exists.
This is a good point. We have no scientific definition for
intelligence or consciousness. Trying to reason about them is just an
exercise in contradiction and equivocation.
How do you program for even every physical condition a stop sign may
find itself in?
This assumes the AI even needs to see the stop sign. A driverless
car has many advantages over a human. It can have a database of the
locations of all stop signs. It can have telemetry information from other
nearby cars. It can have 360 degree sensors that include cameras and
lidar. It doesn't get tired or drunk. It can receive updates based on
"mistakes" made by other driverless cars.
Even if there are problems with some of the information, the
system can still perform an action based on the total information that
is safe for the people in the situation. For example, even if it doesn't
see a new stop sign, it might still have enough information to see
that there is another car entering the intersection.
Of course, it will make mistakes, but it just has to make
significantly fewer mistakes than humans. Honestly, given the pace of
progress, that doesn't seem too hard.
You wouldn't even go about training a machine learning algorithm that
way as it would be pointless. The idea is to let it make better
predictions, not train it to make the same predictions as an existing
person.
Actually, much of machine learning is about trying to do as well as a
human. Humans are expensive. Google could hire lots of professional
language translators to handle every query, but it would cost a lot of
money. Ideally, you want the algorithm to do as well as the existing
people who created the gold standard training data. But not only does
the algorithm do worse, it also reflects the bias of the training
data.
Rejected applications are pointless for training as you don't know
whether they were a good or bad rejection, whereas if you just give it
approved loans and the outcome (i.e., was the loan defaulted on) then
the AI can try to develop a set of rules.
What you suggest would create badly biased (in the statistical sense)
data. You need to do something more sophisticated. Maybe create a hold
out set where you approve everybody and see how they do. This would
be great for this problem as it would remove any human bias in the
labels. A less expensive option (in terms of money lost on defaults)
is to use a more sophisticated algorithm that does more than simple
batch induction. Perhaps a contextual bandit algorithm or an apple
tasting algorithm...
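To make the statistical bias concrete, here is a toy sketch. The applicant pool, the score-to-default formula, and the 650 cutoff are all made up for illustration, and taking every 5th applicant is just a deterministic stand-in for a random hold out set.

```python
# Hypothetical applicant pool: (credit_score, true_default_probability).
# The linear relation between score and default risk is invented.
applicants = [(score, max(0.02, 0.5 - (score - 500) * 0.002))
              for score in range(500, 801, 10)]

def mean_default(pool):
    """Average default probability over a pool of applicants."""
    return sum(prob for _, prob in pool) / len(pool)

# Training only on loans a human approved (score >= 650) sees a
# truncated, unrepresentative slice of the population.
approved_only = [a for a in applicants if a[0] >= 650]

# Hold out set: approve every 5th applicant regardless of score.
hold_out = applicants[::5]

print("population   :", mean_default(applicants))
print("approved only:", mean_default(approved_only))
print("hold out     :", mean_default(hold_out))
```

The approved-only estimate comes out far below the population rate, while the hold out estimate lands close to it. That is the sense in which the approved-loans-only data is biased.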
If you truly wanted to avoid racial or gender bias you would just
remove that information from what you feed into the algorithm, at
which point it can't a priori be biased against anyone because it
can't even evaluate them based on those criteria.
In general, it depends on the labels. If a human labeled the data
and has a bias then the hypothesis learned will reflect those biases.
As explained in the article, for complex problems based on ideas such
as word embeddings, these biases can also show up as a result of
things not obviously connected to labels.
I do agree it's a good idea to remove features that can be used
for bias. A machine learning algorithm can use any features that are
correlated with the label. Even if we are dealing with simple batch
learning and unbiased labels, "bad" features can make the learned
hypothesis biased. Assume race is correlated with poverty which is
correlated with loan default rate. If there is a race feature, the
algorithm might give some influence/weight to that feature. Now we
have a model that is biased. A black man might just miss the cutoff
because of his race, while he would have gotten the loan if he was
white. This might even be logical when given a Bayesian
interpretation; given a lack of other information, the algorithm uses
the prior information associated with his race to infer this missing
information and determine he is a loan risk.
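Here is a small sketch of that Bayesian reading. All the probabilities and the group labels are hypothetical numbers I picked for illustration. In this toy model default depends only on poverty, never directly on group membership.

```python
# Hypothetical probabilities, invented for illustration only.
P_POOR = {"group_a": 0.10, "group_b": 0.30}  # P(poor | group)
P_DEFAULT = {True: 0.20, False: 0.05}        # P(default | poor or not)

def default_risk(group, poor=None):
    """P(default | whatever the model gets to see).

    If poverty is observed, group membership adds nothing. If it is
    missing, the model falls back on the group prior and marginalizes.
    """
    if poor is not None:
        return P_DEFAULT[poor]
    p_poor = P_POOR[group]
    return p_poor * P_DEFAULT[True] + (1 - p_poor) * P_DEFAULT[False]

print(default_risk("group_a"), default_risk("group_b"))
print(default_risk("group_a", poor=False), default_risk("group_b", poor=False))
```

With poverty hidden, two otherwise identical applicants get different risk scores purely from the group feature. With poverty observed, the scores are identical, which is why the group feature is only standing in for the missing information.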
But let's suppose you do that and then look at the results after
the fact, add that data back in and come to the startling conclusion
that your AI is disproportionately rejecting candidates from some
group. It can't possibly be because it knows they're a member of that
group, but because that group happens to have worse outcomes.
If the labels are biased then the model is probably biased. Even if
you remove "biased" features, the algorithm might learn a model that
is based on features that are correlated with your biased labels. For a
simple batch induction problem, it might be enough to remove any
biased features and to ensure that you have labels that are
generated by some type of unbiased process.
But the authors of the article are making such a statement, they just
have nature completely backwards. They believe mankind, separated from
"society" is naturally non-racist, non-sexist, non-gendered even, and
that the outcomes of race, gender, or class groups is imposed on the
formless humans by society, to where the concepts themselves of race
and gender are "social constructs," and if we smash them everything
will just...be great.
I would actually claim the opposite. Man can be racist, sexist, etc.,
but "good" societies set up rules to prevent those qualities
from discriminating against people. This seems consistent with the
article.
Smash the Patriarchy and gender equality will simply
emerge. If it doesn't, well, it must be because there's still evil
sexists hiding around here and they need to be identified and purged.
This is a good point. Someone can always point out differences, and
this is not a solid argument that things are unfair. I think people
need to be reasonable and logical in coming up with rules of society
to try to make things fair.
I think that although we presented this pretty liberally we were also
pretty open minded and clear about the fact that language communicates
all associations, learning the associations is called "bias" in ML and
bias is what you need, it's the signal you've found in all the noise
of the universe.
While you can call that bias, the term is already pretty
overloaded in ML. I first learned bias in the sense of Tom Mitchell's
inductive bias work. Here the basic idea is to get around the No Free
Lunch Theorems by assuming things about the problem, e.g., restricting
the concept space. An older ML related definition is based on statistics
in terms of the bias variance decomposition...
I hope you mean the Guardian article not the Science article?
Unfortunately, I only had a chance to read the Guardian article.
Still it seemed fairly reasonable. One concern was their claim that
humans might lie about why they made a biased decision. I would think
it's more likely that they don't know why they made a decision and
just rationalized an answer when questioned. This is part of the
reason why expert systems failed so badly.
The other thing that seems hard, which they acknowledge, is how to
correct for bias. You can remove features that could directly lead to
bias, such as race or gender, but ML is all about correlation. The
system might learn concepts that are correlated with race, but still
not causal. For example, it could learn that people who eat
sauerkraut are horrible drivers and should pay higher insurance rates.
No, they're still paying more taxes. Far more, in most cases. The fact
that those taxes make up a lower fraction of their income does not
mean they are paying less taxes than those with lower income.
True.
Compared to the value they get for those taxes, which does not vary
much from one individual to another based on income, they are
significantly overpaying.
This is debatable. I would say the US spends a lot of money in the
interest of rich people. Around 40% of income taxes are spent on the
military, which protects the assets of rich people (among other
things). If someone conquered the US, it's doubtful they would let
Richie Rich keep his mansion.
Moreover, that portion they don't spend on taxable goods is being
invested, which does far more good for society than one could
reasonably expect to result from handing it over to the government.
There must be a limit. As this process concentrates wealth, we must
eventually reach this limit. Does the money leave the country to
invest in other opportunities? Is this better than letting the
government redistribute the money, which helps drive our own economy?
You're proposing to seize those "excess" earnings and distribute them
as a handout, which at best would just drive up prices
Did the parent propose that? Before we talk inflation, why don't we
start by paying down the national debt...
Punishing saving and investment in particular is a lousy way to help
the average citizen, ensuring that the next generation will be worse
off than its predecessors.
Some income redistribution could help direct this investment. By
creating extra demand for less expensive products, businesses would
have an incentive to help the average citizen.
Do you have any references that specialists and engineers tweak AlphaGo between matches?
This is somewhat controversial. https://en.wikipedia.org/wiki/...
For state schools, the evidence points at decreased subsidizing by the federal and state government.
Your definition of "Weak" AI is not standard and is not how machine learning works.