I was clearly in your second group a few years back...
Maybe you just were not sexy enough, hm?
Except of course that it IS mathematics.
And the obligatory: xkcd
There will always be some outliers/exceptions, but it should be possible to define the rules and vocabulary of a given system specifically enough, possibly by breaking it down further into facets/perspectives and then mapping the relations and constraints.
So then you could have many ontologies, which will gradually converge over time. I'm talking long-term, of course. The annotation part could also require consensus, or vetting, by multiple recognized entities. All in all, the result would still be more or less a fluid body, but then so is everything around us, as the only constant in our world is that everything is changing.
And I agree with you that ML and annotation/classification & co. are complementary tools. And it will take a lot of work to have end users semantically enrich their output.
Where I disagree is in your definition of a model, which is not necessarily an incorrect representation. It's just a representation; the level of detail varies from use case to use case.
So anyway, the big question is how to get there...
I admit that I took that quote a bit out of context. I apologize.
But as mentioned above, I think we just lack a killer feature. And people do use semantically enriched data (also in addition to ML), mostly in research, but some do actual work with it.
And if I've misrepresented rockmuelle, or misunderstood your question, qpqp, it's because I don't have an exact model of what you're saying.
Come now, don't blame everything on me!
What I meant by exact model is of course a predictable, and in a sense deterministic process; inasmuch as that is possible for the given case.
Even with machine learning you create a representation of the surveyed system, but this model will (currently, and in most cases) always be an approximation.
By mapping concepts, their (often ambiguous) meanings, usage scenarios and other relations from different areas to each other, supported by these approximations, it should in time be possible to avoid the issues related to the fuzziness and create a truly smart and adaptive system.
Of course, our universe (as far as we know) is (inherently?) non-deterministic. And obviously, if that is so, you'd have to somehow cheat (e.g. be able to observe our universe from more than the 4 dimensions we can perceive) to get a truly exact model, assuming that some (reachable) abstraction point is deterministic.
What I'm suggesting is that with some effort it should be possible for us to come up with something with the ability to understand something (like you did with my question, despite lacking an exact model;) ). And while ML is quite crude and more like a sledgehammer, an accurate definition is more like a chisel. At least with respect to the model(s).
Assuming such a system is created, it will have limitations similar to those of humans with regard to the ability to understand something, since, as far as I am aware, we do not know everything.
But anyway, the librarians didn't have the technical capability to create the kind of multi-dimensional mess that we currently can, so maybe these things we're talking about just have their own math whose rules we still need to understand. It's all metadata anyway, but currently, I guess the closest we have to an exact model is in the hands of the NSA...
why did it fizzle out?
I think it's too early to say that it did. Scholar has 10.5k hits for articles from this year alone...
I'm not sure I really follow your argument
Well the other services (except for email, obviously) are largely run by volunteers and don't even have ads (spam notwithstanding).
Quality in the things that are not important to contributors, but are important to many of the people who do not contribute? Not so high.
Now I'm not sure that I follow. Sure, there's lots of stuff that lacks the polish of countless missing man-hours, but we've all come a really long way since the 80s/90s. I'm sure we'll get there if we don't fuck up before that.
I've also seen lots of examples of features that were unimportant to the contributors, but since there was an itch to scratch e.g. in getting recognition from their users, a similar level of rigor was applied to satisfy them.
(Certainly, there's lots of negative examples too, but the point stands, that there was little "physical" value that some devs received for their work and yet still the projects thrive(d). I was, of course, assuming that you meant money when you said "paying for things" in your original post.)
I agree that the tools are currently insufficient (though quite powerful, e.g. Protege), but I also believe that it's quite possible to achieve a high level of accuracy by combining better tools, dividing the problem space and working on killer features that require this higher level of abstraction.
Ideally, people (at first for industrial applications) would recognize the need for a proper machine-readable representation of the different states of a specific environment, so that eventually the different ontologies could be mapped to each other.
An exhaustive (i.e. universal) categorization of all possible states (of everything) is largely unnecessary, as even now, when we communicate with each other, we use the respective vocabulary of the specific topic/area/system and only (comparatively) rarely need to "interface" or intersect with other areas/vocabularies, e.g. when we want to draw parallels to a similar concept in a different system. With time, I'm sure we could even get to a meta-ontology and evolve our language and understanding accordingly.
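To make the "map the vocabularies to each other" idea a bit more concrete, here's a minimal sketch in Python. Everything in it is made up for illustration: the two vocabularies, the `concept:` IDs and the `translate` helper are my own assumptions, not any existing ontology or tool. The point is only that each area keeps its own terms, and you go through the shared concepts only when you actually need to cross over.

```python
from typing import Dict, Optional

# Hypothetical domain vocabularies: each maps a local term to a shared concept ID.
# Both the terms and the concept IDs are invented for this example.
HVAC: Dict[str, str] = {
    "setpoint": "concept:target",
    "thermostat": "concept:controller",
}
AUTOMOTIVE: Dict[str, str] = {
    "cruise speed": "concept:target",
    "ECU": "concept:controller",
    "odometer": "concept:distance_log",
}


def translate(term: str, source: Dict[str, str], target: Dict[str, str]) -> Optional[str]:
    """Return the target-vocabulary term that shares a concept with `term`, if any."""
    concept = source.get(term)
    if concept is None:
        return None  # the term is unknown in the source vocabulary
    for candidate, candidate_concept in target.items():
        if candidate_concept == concept:
            return candidate
    return None  # no counterpart: the two areas simply don't intersect here


print(translate("setpoint", HVAC, AUTOMOTIVE))
print(translate("odometer", AUTOMOTIVE, HVAC))
```

The second lookup comes back empty, which is the common case I was describing: most terms never need a counterpart in another area, so nobody has to categorize everything up front.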
Thanks for the WDC link. This is awesome!
[...] the current Big Data and Machine Learning techniques [...] trump the whole categorization and knowledge extraction / data mining process [...]
Could you please explain how a statistical approximation can trump an exact model? I think that big data & co. is a step in the right direction with the means that we currently have available and that we'll get there eventually. There are too many benefits that would result from doing it properly to neglect the required effort.
But, please, don't give me a blinking and whirling semantic web whereby every move of the mouse updates your AHDH-laden site.
FTFY. The semantic web is a vision that has little to do with what you described:
According to the W3C, "The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries".[2] The term was coined by Tim Berners-Lee for a web of data that can be processed by machines.[3] While its critics have questioned its feasibility, proponents argue that applications in industry, biology and human sciences research have already proven the validity of the original concept.[4]
(From the related Wikipedia article.)
It's our fault.
It's Eternal September all the way down.
Where people are in the habit of paying for things, the providers of those things worry about quality.
Bullshit. The Internet was a fine place before youtube and google and continues to be so now. It just became more convenient, for everyone. Including the parasites.
Go look at other segments of the Internet: email, ftp, irc, jabber, torrents... dominated by quality-oriented mentality!
Look at Linux (the systemd debacle notwithstanding;) ), BSD, the open source community in general... Sure, a lot is paid for, but even more is driven by enthusiasm first and foremost.
This is a great hack. Thanks a lot for the link!
The obligatory Tripp Crosby
So, you're saying that it's so simple to get SHA256 collisions that thousands of people getting sued for torrenting can fuck these copyright companies right over?
I don't think I quite believe you and last time I checked I needed quite a server farm to (reliably) produce one collision in a meaningful amount of time.
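For a feel of the numbers, here's a rough sketch (my own toy example, not anything from the parent's post). It brute-forces a collision on SHA-256 truncated to 24 bits, a size I picked only so that it finishes quickly. By the birthday bound, an n-bit digest needs on the order of 2^(n/2) attempts, so 24 bits takes a few thousand hashes, while the full 256 bits would take around 2^128.

```python
import hashlib
from itertools import count
from typing import Dict, Tuple


def truncated_sha256(data: bytes, bits: int) -> int:
    """Return the top `bits` bits of SHA-256(data) as an integer."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int) -> Tuple[bytes, bytes, int]:
    """Hash the messages b"0", b"1", ... until two share a truncated digest."""
    seen: Dict[int, bytes] = {}
    for i in count():
        msg = str(i).encode()
        h = truncated_sha256(msg, bits)
        if h in seen:
            # Two different messages with the same truncated digest.
            return seen[h], msg, i + 1
        seen[h] = msg
    raise AssertionError("unreachable: count() never ends")


# 24 bits is an arbitrary toy size; the full 256 bits is far out of reach this way.
a, b, tries = find_collision(24)
print(f"{a!r} and {b!r} collide on 24 bits after {tries} hashes")
```

Every extra bit of digest multiplies the expected work by roughly the square root of two, which is why a toy like this says nothing encouraging about the full-length hash.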
have had just about enough of it
Who cares?
Fortunately, these days Germans don't have an army capable of attacking anyone, unless it's with broomsticks. So they'd have to live with it.
They could also pivot towards the EEU and/or SCO, both of which would probably welcome Greece with open arms.
The oil markets will not accept local currencies; it has to be dollars or in some cases, euros.
You're forgetting Rubles, which is a serious oversight currently.
Also, Greeks do have their own oil, if they start drilling. They have their own electricity too. It'd be difficult, but possible.
They need to start working with the plan that was made for them to clear up the mess and other nations will help them.
Wrong. I wouldn't work with a plan that forces me to cut off my own legs (sell state-owned, important infrastructure to the likes of Telekom, Vattenfall, etc.) either. Instead, I'd look for a proper way out (grow - not shrink).
Yeah, also check that the wifi is off... No shit.
In Soviet Russia, you watch TOR!
err... Wait a minute..
In Soviet Russia, government watches you using TOR!
Wait. That isn't news... For Russia maybe... hmm...
In Soviet Russia, you watch the government!
That about right...? I'm confused.
Dude, I'm trying real hard to explain the other perspective to you, you know, the one where there's not a one-sided synchronization of all^H^H^Hmost available media, but, you know, one based on certain historical facts and tidbits you might not have known, because they are not presented by the western propaganda machine (which is working just as well as the Russian one).
"military invasion of Eastern Ukraine, Russian incursions into Moldova, Azerbaijan, Syria, Latvia, Estonia, Japan, and Sweden as what, "humanitarian"?" [citation needed, citation needed, citation needed, ...]
Please be so kind and provide something that actually proves what you are claiming. I know this is an old argument, and is often used to discredit someone using it as a "Russian puppet," but please, for god's sake, fucking provide EVIDENCE! You know, hard, tangible evidence, not some bullshit rhetoric. It's been what, almost a year, and there's not a single reliable satellite photo depicting the alleged thousands of troops and hundreds of tanks and APCs. WTF?! And the other countries... just WTF are you talking about? Japan? Latvia? Estonia? Sweden? WTF, WTF, WTF are you fucking talking about?? The "submarine", which was later found out to be a hoax? What?
And please, fucking J-A-P-A-N? WHAT THE FUCK?! You mean the friggin' Kuril Islands that they've been disputing for decades? The fuck you talking about?
The Baltic countries? They're in NATO. Do you understand the absurdity of your statement that Russia had incursions into their territory? Do you know what collective security is and means? Man... You're waaaaaay out of your comfort zone. This is ridiculous.
Azerbaijan... Hmm, please read about the frozen conflict there, please understand the genocide of the Armenians that happened with the help of Azerbaijani mercenaries orchestrated by Turkey. You're conflating things that have absolutely nothing to do with each other, even if we assume your invalid premise of Russian imperialism.
As opposed to you, I did provide at least rudimentary links to non-media sites that back up the claims I make. But you're just lacking them altogether.
Like: "Hell, just a couple of weeks ago Russia flew a pair of nuclear bombers only a few miles off the coast [...]" [...] in international waters [...] [!!!!!!!!!!] Here, fixed that for you. Or did you miss the memo that you can do whatever the fuck you want there? Really, please do read some of the numerous sources from your country that are more critical of The Media Script (tm).
If you want, look up in historical sources how often the UK scrambles their jets for stuff like this. Or the US. Or Russia. This is fucking normal! (Not that I don't think it's batshit crazy to do, but it's happening regularly!)
"If you can't see the hypocrisy in the entirety of your argument with your desperate primarily US focus then you're beyond hope"
Bullshit. It's not an anti-US focus; read it again and again, until you finally fucking get it. At least look at some of the proof/backup for the arguments that I use! Understand that you're being manipulated just as much as you allege I am. You're just plain wrong on so many counts that I don't know what else to tell you other than to get a history book.
Apparently you're out of arguments. As expected.
So you're saying this doesn't happen anymore?
Whoosh!