It wouldn't be healthy if mozilla's share were 99% either. But that's neither here nor there: it isn't, and there's no plausible scenario in which they turn into a monopolist, not to mention the fact that they're a non-profit whose interests are perhaps less likely to push them to abuse than companies peddling your private data and opinions to the highest bidder.
One thing that article glosses over a little too quickly (at least - I'm not convinced) is the bit depth, and specifically the claim that appropriate dither means that 16bit is enough (and by explicit implication, that more is wasteful).
So, for uncompressed audio... sure!
But almost nobody listens to uncompressed audio, and the argument was about "what kind of format should my music files be in".
Compressing dither isn't trivial. Usually dithered signals don't compress as well. That's certainly true in visual applications; I'm not sure about audio - but I'd be surprised if it were different.
And that suggests that if 16bit is transparent because of dither, then you're possibly better off using 17-20bits without dither, and (if necessary) dithering post-decompression to playback on a 16bit output.
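A rough way to see the compression cost of dither, as a sketch only: the snippet below quantizes a sine wave to 16 bits with and without TPDF dither and compares how well a general-purpose lossless compressor (zlib) handles each. zlib is a stand-in here, not a real audio codec such as opus or flac, and the sample rate, frequency and amplitude are arbitrary assumptions.

```python
import math
import random
import struct
import zlib

# All of these values are arbitrary choices for illustration.
SAMPLE_RATE = 8000
FREQ_HZ = 100
AMPLITUDE = 10000
N_SAMPLES = 8000


def render(dither, seed=0):
    """Return a 16-bit little-endian PCM sine wave, optionally TPDF-dithered."""
    rng = random.Random(seed)
    samples = []
    for i in range(N_SAMPLES):
        value = AMPLITUDE * math.sin(2 * math.pi * FREQ_HZ * i / SAMPLE_RATE)
        if dither:
            # Difference of two uniforms: triangular noise within +/- 1 LSB.
            value += rng.random() - rng.random()
        samples.append(round(value))
    return struct.pack(f"<{N_SAMPLES}h", *samples)


plain_size = len(zlib.compress(render(dither=False), 9))
dithered_size = len(zlib.compress(render(dither=True), 9))
print("undithered:", plain_size, "bytes; dithered:", dithered_size, "bytes")
```

The undithered signal repeats almost exactly every period, so it compresses far better than the dithered one, whose low bits are noise. Whether a lossy perceptual codec behaves the same way is exactly the open question above.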
Then again, I rarely use headphones, so I'm never going to hear the difference anyhow. A 64kbps opus file sounds perfect to me ;-).
Dealing with rendering layers is a tricky optimization process. All browsers have had notable issues with it over the years, and almost certainly will for the foreseeable future, including most definitely chromium - as a job I maintained a chrome+website plugin for a few years and ironically chrome had the most issues of all browsers in this regard. If you've ever tried to optimize an HTML layout for animation, and low interactive jank, you may have run into issues with similar root causes.
It's just not that trivial to figure out which few potential layers - amid the thousands (and sometimes more) in a typical web page - the browser should materialize as one of its highly scarce, usually on-GPU layers, and which it should flatten onto another one.
So that hidden div can be nasty, because the browser can't predict if it will *stay* hidden. I'm going to go out on a limb here and guess that the div wasn't actually `display: none`, but instead used one of the many less complete forms of hiding, which is probably why whatever heuristics edge used went wrong. It may have been non-hidden in a previously rendered frame. Who knows?
Seriously though - if a huge site like youtube can't be bothered to seriously test in edge, then something is wrong. The only reason to continue supporting edge for MS then would be strategic. And they apparently just decided: screw that.
Wherein browsers are actually simply video players.
What do you know; wikipedia has statistics on risk factors, such as drunk driving: https://en.wikipedia.org/wiki/... -- even at 4 times the legal limit (beyond which there are no stats) a drunk human would be unlikely to cause a fatality in as few miles as this.
So yeah, evidence suggests autonomous vehicles aren't all that safe yet.
Going by the numbers, and considering how widespread the idea is that self-driving cars are safer, it would appear most people overestimate the driving ability of autonomous vehicles by at least a factor of 10.
So far, the record isn't good, and that's with backup drivers for tricky bits, cherry-picked circumstances, and not counting safety interventions that prevented what would otherwise have been accidents, so let's not get too exuberant about how great this tech is - at this point, anyhow.
This is simply factually incorrect. Current statistics suggest autonomous vehicles are *orders of magnitude* more dangerous, though autonomous vehicles have so few miles driven that the sample size is low.
Consider that a little more than 1 fatality per 100 million miles travelled occurs (ref: https://en.wikipedia.org/wiki/...), and that includes all those drunk distracted meatbags that happen to be using their phone too.
Uber just had their first fatality, and they're nowhere near 100 million miles. And even that is being unreasonably charitable, because those cars have backup drivers that deal with complicated situations the car can't handle, and safety drivers that can and do intervene if the car appears to be making a mistake; if the cars had to drive regardless of the circumstances (the way a human does), and without someone to correct mistakes, the safety record may well be much, much worse. For some perspective: if people caused as many fatalities per mile and traffic levels remained unchanged, then traffic fatalities would be the cause of more than half of all deaths; it would cause a massive reduction of life expectancy, by many *decades* (!)
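To put rough numbers on that comparison, here's a minimal sketch. It assumes fatalities follow a Poisson process at the human baseline of about 1 per 100 million miles quoted above. The fleet mileage is a made-up placeholder, not Uber's actual total.

```python
import math

# Human baseline quoted above: roughly 1 fatality per 100 million miles.
HUMAN_RATE_PER_MILE = 1 / 100_000_000


def prob_at_least_one_fatality(miles, rate_per_mile=HUMAN_RATE_PER_MILE):
    """P(at least one fatality) over `miles`, assuming a Poisson process."""
    expected_fatalities = miles * rate_per_mile
    return 1.0 - math.exp(-expected_fatalities)


# Hypothetical fleet mileage, chosen only for illustration.
hypothetical_miles = 3_000_000
p = prob_at_least_one_fatality(hypothetical_miles)
print(f"chance a human-level driver has a fatality in that distance: {p:.1%}")
```

For a fleet with a few million miles, a human-level driver would have only about a 3% chance of having caused a fatality by then, which is why a single fatality this early is such a bad sign (small sample or not).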
Tesla's record too is poor - although their accident rate for the autodrive is similar to that of a human, it simply doesn't work in complicated situations at all. And in highway-style traffic, where the system *is* used and had its first fatality, human error is even rarer; and again, consider that the human driver is there and supposed to intervene, so this too likely underestimates the actual risk caused by the autopilot.
Waymo has no serious accidents, but with so few miles driven (it's not much more than uber), it's too early to tell. If they drive at *least* 100 times more than the total they have so far (without serious error), you might cautiously venture a hope that they really are safer than human drivers, but even that wouldn't be statistically sound.
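How many clean miles would it take? A sketch under the same Poisson assumption: with zero fatalities observed in n miles, the one-sided 95% upper bound on the true rate is -ln(0.05)/n, roughly 3/n (the "rule of three"). The function names are mine, purely for illustration.

```python
import math


def upper_bound_rate(fatality_free_miles, confidence=0.95):
    """Upper confidence bound on the fatality rate per mile after zero events."""
    return -math.log(1.0 - confidence) / fatality_free_miles


def miles_needed(target_rate_per_mile, confidence=0.95):
    """Fatality-free miles needed before the bound drops below the target rate."""
    return -math.log(1.0 - confidence) / target_rate_per_mile


human_rate = 1 / 100_000_000  # baseline quoted earlier in the thread
print(f"{miles_needed(human_rate) / 1e6:.0f} million fatality-free miles")
```

So you'd need on the order of 300 million fatality-free miles before you could claim, at 95% confidence, to be better than the human average. And that still ignores the cherry-picked conditions and the safety drivers.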
It's totally reasonable to expect autonomous vehicles to become safer than drunk-meatbag vehicles at some point, but they clearly are not yet. I'm not even sure they're safer than an actual drunk driver!
In fact, even MD5 hasn't been broken for this use case. Pre-image attacks are very hard to pull off.
Fortunately, Mr. Petrov didn't follow your advice or you may well not have been alive today.
Unfortunately, it's not unthinkable for language to be misused to the point of becoming meaningless. It may well be that the phrase "this is not a drill" is headed that way. This certainly isn't the first time I've heard obvious drills begin with "this is not a drill" - usually followed by sheepish announcements immediately thereafter that, eh, sorry, it kind of was a drill.
I don't think you're going to be able to avoid the need for people to simply use common sense, and *not* follow instructions sometimes, if there's reason not to. (No idea if that was the case here!)
I think you should consider whether those guidelines are really followed in contemplative silence. Around 4'33" would do just nicely.
Birkenstock doesn't sell on amazon precisely because of a falling out in which it claims amazon doesn't effectively prevent counterfeiters on amazon's own marketplace.
Related: http://newsroom.wiley.com/pres...
It's not quite clear why there are different people for both prizes, though.
Background: https://en.wikipedia.org/wiki/...
If you think EU fines have dubious beneficiaries (not unjustifiably so), consider that due to the existence of punitive damages in the US, the US fines far more heavily overall. E.g. banks have been fined 321 billion (!) dollars; mostly by the US due to the financial shenanigans in the crash (see e.g. https://www.bloomberg.com/news...). Similarly, VW is likely to pay a lot more in fines than a US firm would in the EU. (Not that it's weird for VW to be fined so heavily, it's just that the law isn't symmetric).
Frankly, I think the EU fines are absurdly low, especially fines for behavior such as this, which undermines the whole point of capitalism in the first place. Firms have grown absurdly large, causing competition to cease in significant portions of the economy - particularly in large homogeneous markets such as the US. And as you might expect, such firms engage in rent-seeking behavior: their profits soar, while customers stagnate (again, as economics 101 dictates).
I do agree that it's problematic that there is this perverse incentive for a prosecutor to "capture" as many spoils of war as they can (on a somewhat related note, WTF asset forfeiture). Part of the problem here is the voting public - the very sentiment you're now feeling; where you're probably quietly relieved that VW is fined a lot in the US (hey, it's foreign!) but indignant that intel is fined elsewhere. You'd want that to be fixed, but how? There is at least some solace in that anticompetitive behavior is much, much more harmful to the US than a relatively piddling fine. Just be happy that the far more questionably fair punitive damages haven't (yet) arrived in the EU; even if the concept is fair, the distribution of "loot" surely is not.
With his microsoft OS running on an intel chip...
A prediction market is a prediction method that runs on a bunch of humans, not a computer. It must obey the same convergence laws as all other such processes, from other markets, to evolution, to human learning, and indeed machine learning.
It definitely does need to make assumptions about smoothness to be able to find even a local optimum - in the infinite space of possible prediction methods the market is exploring, if the optimum method is almost identical to a bunch of other terrible methods, then the market isn't likely to find that solution. If, by contrast, the nearby methods show promise but aren't *quite* as good, then normal market action works: people will see the winners getting rich and try to beat them at their own game by trying variations. Some of those variations might be better than the original; etc. This process is obviously highly complicated in practice; "nearby" isn't even a clearly defined term, and yet it still applies.
Philosophically, I doubt a prediction market will ever find a global optimum - not sure that it matters, however. There's no way we can tell, I suspect.
The point isn't that a market "is" a prediction method - sure, from some perspective it is; yet there are lots of interesting differences too. The point is that it's risky to assume a prediction market will always beat any other prediction method; i.e. that it is an "optimal swarm intelligence". It's one possible and known effective way to leverage a diversity of other prediction methods and ongoing research into new ones. But it may not be optimal. We don't know that. In several ways, we *know* it's not optimal: not only are there known issues with markets in general, but more specifically, a market necessarily reacts only after some of its actors already have: markets are slow. It can be worth a lot of money to be faster, and other methods that don't rely on actors to interpret information for them can have an edge there.
What you call meta-prediction is really just a variation of https://en.wikipedia.org/wiki/... - a prediction market isn't radically different. Indeed even within one model, combining separate predictions is useful - low-level density estimation layers in deep learning have some similarity.
The overlap between markets and ML (and e.g. evolution) is that they're all complex optimization problems, where finding the "true" solution is generally infeasible. There are various approaches to come up with a best guess, but they all need to make various assumptions about the problem space to work - for one, that the problem space is "smooth" in some sense (so by exploring the current solution and nearby choices, you have a chance of going in the right direction), and e.g. that there aren't too many local minima.
Not all problems are like that, and sometimes a problem takes some massaging until it is a candidate. And these techniques - including markets - do *not* generally converge to the global optimum; they converge to some local optimum. If you're lucky, or under some non-obvious preconditions, then it's global.
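A toy illustration of the local-vs-global point, not a model of any real market: greedy hill climbing on a made-up one-dimensional landscape with two peaks. Where you end up depends entirely on where you start.

```python
# Made-up landscape: a small peak at index 3 and a higher peak at index 12.
LANDSCAPE = [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 4, 5, 6, 5, 4]


def hill_climb(landscape, start):
    """Greedily step to the better neighbour until no neighbour improves."""
    pos = start
    while True:
        neighbours = [p for p in (pos - 1, pos + 1) if 0 <= p < len(landscape)]
        best = max(neighbours, key=lambda p: landscape[p])
        if landscape[best] <= landscape[pos]:
            return pos  # no improving neighbour: a local optimum
        pos = best


print(hill_climb(LANDSCAPE, 1))  # gets stuck on the small peak
print(hill_climb(LANDSCAPE, 8))  # happens to start on the slope of the big one
```

Both runs follow the same rule and both "converge"; only one of them finds the global optimum, and the climber itself has no way to tell which case it's in.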
Note that a prediction market is not particularly more likely to be accurate than any other machine learning technique. If there's been one thing that's been demonstrated time and time again over the years, it's that there are many techniques that can work, but that to get truly excellent results, appropriate data collection, selection, filtering etc. is critical. It's easy to get charmed by techniques that have a great story and convincing argument they'll work - but that doesn't mean they're the best.
High-end GPUs have been larger than CPUs for many, many years now. It's a matter of perspective whether you find 471 mm^2 a significant step up from 195mm^2.
You might e.g. compare https://en.wikipedia.org/wiki/... die sizes and https://en.wikipedia.org/wiki/...
Really old generations aren't listed with die sizes there, but even the first generation that is (geforce 8xxx) includes e.g. the GeForce 8800 GTS (nov 2006) at 484 mm^2.
Even a 10-core (modern) broadwell-E chip is around half that (http://hothardware.com/reviews/intel-core-i7-6950x-extreme-edition-10-core-cpu-review-broadwell-e-arrives lists that as 246mm^2)!
When I last evaluated zxcvbn (2 years ago) it was, however, a denial of service waiting to happen: it tries to estimate entropy by brute forcing its way through a bunch of different strategies for predicting structures in passwords. At the time it was possible to make a single (server-side) check take minutes of CPU time by carefully constructing your password. It may have improved, but I'd be careful if you really want to deploy it. Preferably use some client-side port; at least that way you just chase away a user with bad habits rather than let anyone who wants to DOS you do so.
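If you do end up running an estimator server-side, one common mitigation is to cap how much input it ever sees. This is a sketch only: `estimator` is whatever strength function you use (the toy one below is a hypothetical stand-in, not zxcvbn's API), and the 64-character cap is an arbitrary assumption.

```python
MAX_CHECKED_LENGTH = 64  # assumed cap; pick whatever your threat model allows


def guarded_strength_check(password, estimator, max_length=MAX_CHECKED_LENGTH):
    """Run `estimator` on a bounded prefix so worst-case cost stays bounded."""
    return estimator(password[:max_length])


def toy_estimator(password):
    """Hypothetical stand-in for a real estimator: counts distinct characters."""
    return len(set(password))


print(guarded_strength_check("correct horse battery staple", toy_estimator))
```

This doesn't make a pathological estimator fast, it only limits how bad a crafted input can get; a hard timeout around the call is still worth having.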
The medieval warm period wasn't as warm as you're suggesting (I can't find any citations for more than 2 degrees, and the delta may well be less), and it wasn't world-wide: northern Europe (and some other parts of the northern hemisphere) was warmer, and as it turns out, europe ended up writing a disproportionate part of modern history, so that was remembered.
Globally, temperatures were lower than they are now.
This isn't a secret, nor is the information hard to find; e.g. https://en.wikipedia.org/wiki/...
There may be some truth to the inevitability of global warming, but make no mistake: our generation sure is screwing over the future thoroughly. Even under optimistic assumptions, it seems likely that greenland will lose most of its ice; which sounds to me like the world is likely to experience sea level rises of at least 10 meters (since greenland isn't the only glacier on the planet, and because warm water expands).
The question is whether that takes thousands of years - so cultures and populations get to adapt relatively calmly - or something scarier than that.
People aren't great at dealing with rapid change.
A test aiming to measure support for modern "html5" should not award bonus points for non-standard (speech apis)
Webaudio is a W3C standard.
At issue are the speech (synthesis+recognition) APIs, not the audio APIs. However...
outright rejected features (websql).
In fact it does not award points for it: it is listed, but its inclusion does not award any points. Firefox does not have it and it still gets 35/35 points in that test.
You're right - I was misled by the fact that the feature is listed as providing 5 points, but that seems to be in error. The same also goes for the speech APIs, incidentally.
The test isn't as bad as it seemed at first glance (though it's unfortunate that it's unclear what counts for what). Nevertheless, it counts proposed and experimental features, misdetects at least keygen (which doesn't bode well for others), fails to do even basic validation of whether a feature is implemented correctly, and doesn't clearly make the distinction between html5 and the living spec, going so far as to link to the w3c spec for features like datetime inputs, even though that's not in the spec, but is in the whatwg living spec (from which later iterations such as 5.1 will likely emerge). It largely follows the living spec, but not everywhere (e.g. keygen, as you point out.)
In short: it's still not a good idea to read anything much into these numbers.
You're quoting out of context. What you say is true, but doesn't affect the validity of my argument that html5test is poorly designed.
Note that if you're going to exclude the living spec in an attempt to rationalize html5test's behavior, be aware that many features it checks for aren't present in the static html5, only in the living spec.
There are multiple perspectives here. As you point out, keygen wasn't always deprecated, and it hasn't been removed from the standard yet. So, as you point out, it's OK for a browser to support that. And I totally agree with that.
But also look at the context - the suggestion here is that a low score means a browser that is lagging in standards support. And that's clearly misleading. There may be nothing wrong with supporting keygen; but clearly the aim is to *remove* it, and there should certainly not be anything wrong with actually doing that. I understand that webkit+blink need to deal with a lot of legacy, but we shouldn't be cheering that on, just like we weren't cheering on all the IE6 quirks that lingered for years on the web.
If html5test wants to promote a modern, standards-compliant web that keeps up to date with the standards - and clearly that *was* once its aim - then it too should deprecate keygen. It's understandable to support keygen (and if you do - follow the deprecated standard). But it's best to move on and drop support.
Incidentally, evaluating keygen due to this conversation leads me to question html5test even more - I tested keygen in chrome+firefox+edge, and it actually works in chrome and firefox, even though html5test suggests it works only in chrome. In other words, the test isn't just misguided, it's buggy too...
That would only further demonstrate the misleading nature of html5test. A test aiming to measure support for modern "html5" should not award bonus points for non-standard (speech apis), deprecated (keygen) or outright rejected features (websql).