The other day I had the urge to draw a graph of the GDP of European countries.
Rather than just type data into a spreadsheet, I thought: wouldn't it be cool if I could somehow easily pull this data off the web, for instance from Wikipedia or the CIA's World Factbook?
This got me thinking about how the semantic web could all fit together to achieve this and similar tasks. There could be a number of components:
Client applications: graphing is just one possibility. Think of spreadsheets filled with data from the web, or even a garden design program pulling in data about different sorts of plants.
Servers: there are lots of these about with all sorts of cool data, such as world facts, government statistics and the performance of supercomputers.
Semantic search engine: say you want to graph world population. It could be possible to build a search engine which lists the semantic info available from all the different sources out there. For example, you could type in "Countries:Population" and get back the URLs of all the web services which hold information on the populations of countries.
Repository of screen-scrapers: over the years I've written a number of small scripts to pull data out of various HTML web pages. Why not share these? This might be a quicker way to semanticise the web than waiting for servers to publish their sites in XML.
Pull these together with some appropriate standards and imagine the possibilities. Homework assignments in geography, chemistry or any subject with quantitative data could be a breeze.
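As a rough idea of what one of those shared screen-scrapers might look like, here is a minimal sketch using only the Python standard library. It assumes the page holds a plain HTML table with a header row; the `TableScraper` and `scrape_table` names, the sample page and its figures are all made up for illustration and are not taken from any real site.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return one dict per body row, keyed by the header row's cell text."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Made-up page fragment; the country names and numbers are placeholders.
SAMPLE = """
<table>
  <tr><th>Country</th><th>GDP</th></tr>
  <tr><td>Exampleland</td><td>100</td></tr>
  <tr><td>Samplestan</td><td>250</td></tr>
</table>
"""

records = scrape_table(SAMPLE)
gdp = {r["Country"]: float(r["GDP"]) for r in records}
print(gdp)
```

The resulting dictionary is the sort of thing a graphing client could consume directly. Real pages are messier (nested tables, footnotes, units in the cells), which is exactly why a shared repository of per-site scrapers would be useful.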
I get the same, but there is absolutely nothing in the page. All I get is <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html><head><title></title></head><body></body></html> i.e. zilch in the body. It's serving something, as there is a favicon.
I have the same problem with Yahoo, so I guess it's something to do with my browser: Firefox 1.0.4 on Win 98.
There is a danger in trying to draw moral lessons from the likes of Star Wars and Star Trek. First off, we must remember that these are works of fiction,
often constructed by merging several basic plots.
A particular favourite is the 'overcoming the monster' plot, where a man in a lowly place
manages to overcome a vast evil and good triumphs in the end.
They are often constructed to appeal to some basic emotional responses,
providing the feel-good factor when you leave the cinema.
Star Trek, like much of American TV, likes to end with a nice moral
to the story.
But do these plots really reflect our normal mundane lives? Should we base our lives on the products of the movie industry? Is life really as simple as the good vs evil divide so common in these offerings?
Take Star Wars: Lucas does draw something from eastern philosophy (non-attachment) and martial arts training, but these are much watered down and simplified. At least episode three did delve a bit deeper into the fall of man,
and the dilemmas of Anakin are maybe closer to reality than most.
If you want to learn non-attachment, find a Buddhist; if you want to become a Jedi, join a martial arts class. Just don't get them from a movie.
Two other good maths encyclopedias are
PlanetMath and
Wikipedia.
Both are open content, open source etc. PlanetMath is peer reviewed and at a high level.
The problem here is that she is learning to follow step-by-step instructions - and not learning to abstract what is actually happening. I notice this a lot when I'm helping non-techy people.
There are a lot of theories of learning which explain this sort of behaviour. Broadly, there are people who prefer step-by-step learning and others who grasp abstract concepts more easily. Techies seem to fall into the latter group, the rest of the world into the former. This might be a big usability question, as computers have mainly been designed by techies and so fit their preferred learning style but not that of people who need a more step-by-step approach. Possibly there is too much choice in how to do things: for instance, I have at least seven ways to get from
here to the Slashdot home page. That's seven different things to remember, hence confusion.
I've seen and used these a fair bit. Often the best use is browsing websites: there's some great material out there for education, and the electronic whiteboard can really help. It's great if you want to show a demo of some software, better than getting a class to huddle round a computer. It's also great for media-related subjects; I've seen some very powerful videos on a whiteboard.
For the most part a projector would do just fine. But on a couple of occasions I've made use of the interactive nature. The best fun has been Singsurf, a 3D program for displaying mathematical objects. Here it really opens up the idea of tactile computing. You can touch an object with your finger and drag it round; it almost feels like you're holding the object. The students responded really well to this.
well, paying the students WOULD be a great motivation to come to class everyday
Interestingly, in the UK they have started paying students. For 16-21 year olds there is a thing called EMA which gives students £30/week, provided they attend all their classes and behave themselves. It seems to be working as a motivational tool.
Re:Same old GNU/God Complex on Drafting GPL3 · Score: 1
To quote from the article
* The GPL is the Literary Work of Richard M. Stallman
Some copyright licenses are no doubt known, in the restricted circle of one firm or law office, as the achievement of a single author's acumen or insight. But it is safe to say that there is no other copyright license in the world that is so strongly identified with the achievements, and the philosophy, of a single public figure. Mr. Stallman remains the GPL's author, with as much right to preserve its integrity as a work representative of his intentions as any other author or creator. Under his guidance, the Free Software Foundation, which holds the copyright of the GPL, will coordinate and direct the process of its modification.
This does seem to be a little odd and possibly contradictory. The GPL
is copyright FSF but RMS is identified as its author. It is a similar situation
to most books, where the publisher holds the copyright. I'd be more worried if RMS held the copyright.
I've seen similar situations before, where founders in their later years get a bit worried that their ideas will be watered down. Indeed, I've had a bust-up with a similar founder over these issues, where it was less clear who had the copyright.
There are advantages to this situation: it makes it clear who has the final say, which can help break deadlocks. It is similar for Linux, where Linus has the final say.
Whether RMS is flexible enough to cope with a changing climate and really take on board ideas from the community remains to be seen.
"but an old growth forest is irreplaceable." hmm that looks like a moral judgement to me
Scientifically, irreplaceable means that the habitat, once lost, cannot be replaced,
for which there is a lot of evidence. At its simplest, it would take 180 years for the canopy to reform, and this would only happen if the land was managed with habitat recreation in mind. What is more likely is that the land will be managed as commercial forestry with perhaps a 100 year cycle, never reaching the same level of biodiversity. Then there are issues of habitat fragmentation. For more on the subject see Old growth forest
and the numerous links there.
Morally, yes, I could say that I view the loss of biodiversity and habitats
as one of the major problems facing today's world. Your view of private property
as all-important is just as much a moral stance, and one I suspect the Native Americans (who were the original inhabitants of the land) might disagree with.
When activities on private land have global consequences, then yes, I think some regulation is in order.
what makes you think you can push your life style on me and the government as well
I believe there is something in the US constitution about my right to express my opinion. I also believe it is the business of a democratic government to listen to the opinions of the people they represent and also the opinions of those they share the world with, although the current US Gov does not seem to be very good at that, preferring to force their belief system on others.
One critique of these maps is that they are not comparing like with like.
The forest clearing shown in
http://www.ers.usda.gov/Briefing/LandUse/Gallery/map1.htm
is happening mainly in the old growth forest in the Rockies.
New planting in the east is often plantations of pine trees and other
commercial forestry.
While it is good that total forest cover in the US is increasing,
an old growth forest has a much greater biodiversity than a
commercial plantation. Old growth forests will have many different species
of trees at a variety of different ages, and they will support many sorts
of wildlife: bears, wolves, rare owls, and all manner of other plant and insect life. A coniferous plantation can be close to monoculture, with rows
and rows of a single species; often the dense planting and the blanket of needles suppress any low growth. Thankfully there is a trend towards better forest management today, but an old growth forest is irreplaceable.
It could be useful to see that homework is really two different things:
1. Independent study, with the student working through problems on their own.
2. Work done after school hours.
While there may not be a need for "homework" in the sense of 1 and 2 combined, there
is definitely a need for 1. This is one of the most effective ways of learning,
and often where the learning takes place. It does suit some students
better than others, and does have some problems, e.g. motivation.
But when you can make it work it's great.
Whether independent study should be done after school is a different question.
There are advantages: the chances of actually getting it done are probably better
than if it's done at school, it involves parents in the learning process,
and it extends the school day. There are other options, say
cutting down the number of straight classroom hours and devoting time in the school day to it.
There are quite a few things where people might want to use some semantic mark-up:
Creative Commons: use RDF to specify copyright and licence info about a page. You can now search on this using special pages on Google and Yahoo.
Anyone who wants to sell something will be interested in making their content easy to find. A little bit of semantic mark-up could help them shift units.
Anything pulled out of a database. Here it's relatively easy to modify the code to add some extra mark-up.
Tagging: this seems to be all the rage, with sites like del.icio.us et al.
Specialised content: stuff like dates and contact info.
My guess is we'll see more and more semantic mark-up creep in through the back door. In a few years' time we'll be griping about how MS has invented its own tag format.
My guess is that we'll see
p.s. Why is this in the Hardware section?
If there are any geeks out there interested in all things plant-like
and informatics,
then they might be interested in the
permaculture.info project.
We're hoping to build a community-driven online database of plants
and their relationships, together with a host of
related information and features. There have been quite a lot of
interesting ideas floating around about visual representation of
data, distributed events and link systems. There are a good few challenges ahead, especially in the realms of knowledge representation. Email me or see the website for details.
Allow me to introduce a problem I'd like to solve.
I'm part of a large international permaculture movement.
Many of the different groups have their own websites
with space for events. However, there is no communication between
the websites, so the task of actually finding an event is rather tricky,
involving searching numerous websites.
What I'd like to do is introduce a distributed events system, so that
information on an event could be submitted at one site and it could
propagate around the network keeping all the listings up to date.
The main requirement for the system is simplicity, i.e. setting up a node should
be very easy (most groups are not very computer literate), so recommendations
for easy-to-install software would be good. The software would need to
integrate easily with existing websites, so nice configurable PHP scripts
would be good. Maybe it needs a slightly richer format than iCal, so it's easy to search for
events in England or a particular county, plus some tagging features
to allow for certain types of events.
I'd also like to do something similar for links to websites.
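To make the idea a bit more concrete, here is a minimal sketch of how listings might propagate between nodes. It is written in Python purely for illustration (the real thing would more likely be PHP, as said above). It assumes each event carries a globally unique id and a revision counter bumped by the site where it was submitted; the `Event`, `merge` and `search` names and the sample events are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    uid: str          # globally unique id, assigned by the submitting site
    title: str
    country: str
    tags: frozenset
    revision: int     # bumped by the submitting site on every edit


def merge(local, incoming):
    """Return local listings updated with any newer or unseen incoming events."""
    merged = dict(local)
    for uid, event in incoming.items():
        known = merged.get(uid)
        if known is None or event.revision > known.revision:
            merged[uid] = event
    return merged


def search(listings, country=None, tag=None):
    """Filter listings by country and/or tag, sorted by id."""
    return sorted(
        (e for e in listings.values()
         if (country is None or e.country == country)
         and (tag is None or tag in e.tags)),
        key=lambda e: e.uid,
    )


# Two nodes with partly overlapping, partly stale listings (made-up events).
node_a = {
    "a-1": Event("a-1", "Intro course", "England", frozenset({"course"}), 1),
}
node_b = {
    "a-1": Event("a-1", "Intro course (new date)", "England",
                 frozenset({"course"}), 2),
    "b-1": Event("b-1", "Forest garden open day", "Wales",
                 frozenset({"open-day"}), 1),
}

# Each node pulls the other's feed; afterwards both hold the same listings.
node_a = merge(node_a, node_b)
node_b = merge(node_b, node_a)
print([e.title for e in search(node_a, country="England")])
```

Because the merge only ever keeps the highest revision of each event, nodes can exchange feeds in any order and as often as they like and still end up with the same listings. Deletions and clashing edits from two sites are not handled here.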
Most posts here seem to be down on this idea.
In the maths world at least there does seem to be quite a push
for interoperability between the many maths 4GLs: Mathematica, Maple,
Matlab. There's a thing called MathML which is specifically designed
so that there is some common way for these different platforms to
communicate. In theory you can write some code in Mathematica, convert it to
MathML, send it to a colleague using Maple and they can convert it to Maple code and run it.
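The round trip can be pictured with a toy example: a tiny expression tree serialised to Content MathML and read back. This is only a sketch. The nested-tuple expression format and the `to_mathml`/`from_mathml` helpers are invented for illustration, only three operators are covered, and the `<math>` wrapper and namespace are left out. The real exporters in Mathematica or Maple are not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical mini expression format: a number, a variable name, or
# (operator, operand, operand, ...) with operators named as in Content MathML.
OPERATORS = {"plus", "times", "power"}


def to_mathml(expr):
    """Turn a nested tuple expression into a Content MathML element."""
    if isinstance(expr, (int, float)):
        node = ET.Element("cn")
        node.text = str(expr)
        return node
    if isinstance(expr, str):
        node = ET.Element("ci")
        node.text = expr
        return node
    op, *args = expr
    if op not in OPERATORS:
        raise ValueError(f"unsupported operator: {op}")
    node = ET.Element("apply")
    ET.SubElement(node, op)                       # e.g. <plus/>
    node.extend([to_mathml(a) for a in args])     # operands follow the operator
    return node


def from_mathml(node):
    """Inverse of to_mathml, for the same small subset."""
    if node.tag == "cn":
        text = node.text
        return float(text) if "." in text else int(text)
    if node.tag == "ci":
        return node.text
    op, *args = list(node)
    return (op.tag, *[from_mathml(a) for a in args])


expr = ("plus", ("power", "x", 2), 1)             # x^2 + 1
xml_text = ET.tostring(to_mathml(expr), encoding="unicode")
print(xml_text)
round_tripped = from_mathml(ET.fromstring(xml_text))
```

The sender and receiver only have to agree on the MathML in the middle, not on each other's syntax.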
Maybe mathematics is a special domain, in that the 4GLs we are seeing
are really trying to represent the same 7GL (i.e. maths notation),
so the problem is intrinsically easy.
I'm getting kind of worried by the conservatism of the Slashdot crew.
Any new idea coming along (databases yesterday, 4GLs today)
seems to get shot down. Are we actually going to see new ideas
coming from the open source movement, or is it just going to be limited
to reimplementing existing programs?
As to alternatives, ideas from the semantic web get somewhere close.
This is actually a real problem for me. We're trying to produce an open source plant "database", permaculture.info.
We keep running into big problems with data representation:
any schema we use seems to be very weak, with holes in it, and instantly limits what can be represented. Flexibility and extensibility are central
requirements and, as you say, an RDBMS is probably the wrong tool.
Is language that broken? Language is sort of words plus grammar, both of which have shown great potential for flexibility and extensibility.
I think I agree with the parent. Databases are methods of storing and retrieving data. Trying to make queries fuzzy, or less structured is just wrong.
It's not so much a question of the data but more of the way the actual data is structured. Databases impose a rigid structure on the data, or
a leaky abstraction.
If the data you have
does not fit a rigid structure, then the underlying assumptions of a database
do not fit your needs well.
There are certain problems where this sort of rigid structure works well, say
a catalogue of car parts. But there are others where the rigid structure
can prove to be more of a pain than an advantage.
Take for example
representing the medical records of a person. This has the potential to be
a very complicated structure. Indeed, any structure imposed may well fall down at some point. There will be patients with rare conditions, who will
have some specialised diagnostic tests. The database designer will not know
a priori how to represent the results from these tests. Fitting
such data into the structure could be a maintenance nightmare (maybe this is
why so many of the big government IT systems have such a hard time).
We could play a fun game here: you propose a data structure, I'll come up
with a counterexample which will not fit.
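Here is one round of that game as a minimal sketch. It sets a fixed-column record against a loose store of (subject, attribute, value) statements, which is roughly the direction the semantic web ideas point in. The `PatientRow` and `TripleStore` names, the columns and the sample facts are all invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PatientRow:
    """Rigid structure: every patient has exactly these columns."""
    patient_id: str
    blood_pressure: str
    cholesterol: float


class TripleStore:
    """Loose structure: any (subject, attribute, value) statement is allowed."""

    def __init__(self):
        self._facts = defaultdict(list)

    def add(self, subject, attribute, value):
        self._facts[subject].append((attribute, value))

    def about(self, subject):
        return list(self._facts.get(subject, []))

    def subjects_with(self, attribute):
        return sorted(s for s, facts in self._facts.items()
                      if any(a == attribute for a, _ in facts))


store = TripleStore()
store.add("patient-1", "blood_pressure", "120/80")
# The counterexample: a result nobody planned a column for (made-up values).
store.add("patient-2", "rare_test_result", {"marker": "X", "level": 3})

try:
    PatientRow(patient_id="patient-2", blood_pressure="", cholesterol=0.0,
               rare_test_result={"marker": "X", "level": 3})
    fits_rigid_schema = True
except TypeError:
    # The dataclass has no such field, so the record is rejected outright.
    fits_rigid_schema = False

print(fits_rigid_schema, store.about("patient-2"))
```

The loose store takes the odd case without a schema change, but at a price: nothing stops a misspelled attribute, and every query has to cope with facts that may or may not be there.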
I guess the problem is we are trying to represent the world here.
It's known that there is no ontology (classification system) which will
serve all purposes. There are philosophical problems about the nature of language
and how we can represent grammar and draw inferences. Yet what do
databases offer us to represent textual data? A block of text! Fifty
years of computers and the best method we've come up with for representing
a richly structured piece of writing like a Wikipedia article is a block of text. OK, there's a bit of markup there, but it's all just stuffed together in a text blob. I might as well just dump it in a file for all the
good a database does; indeed, that's what Yahoo does.
The most pressing open question about π is whether it is a normal number -- whether any digit block occurs in the expansion of π just as often as one would statistically expect if the digits had been produced completely "randomly". This must be true in any base, not just in base 10. Current knowledge in this direction is very weak; e.g., it is not even known which of the digits 0,...,9 occur infinitely often in the decimal expansion of π.
Bailey and Crandall showed in 2000 that the existence of the above mentioned Bailey-Borwein-Plouffe formula and similar formulas imply that the normality in base 2 of π and various other constants can be reduced to a plausible conjecture of chaos theory. See
Bailey's web site for details.
Nice computer science, but poor pure mathematics.
I thought the point of the article is a warning against the assumption that because pi is known to be irrational, then it follows that it is a good source for a random seed. There may be applications that use this assumption, and it may not be yielding sufficiently random results.
Point taken, i.e. it's a useful result for comp sci. In pure mathematics
it should be a conjecture: "The digits of pi are not randomly distributed",
which is neither proved nor disproved.
The scientists took approximately the first 100 million digits of pi, broke the string up into 10-digit segments, and gave the segments a form that defines a point somewhere within a cube with sides one unit long. To specify each point, three such segments are necessary - one for each dimension. For example, the sequence 1415926535 was given the form 0.1415926535, which specifies the point's distance along the x-axis. Similarly, the two subsequent sequences give the point's y and z coordinates. All of the sequences thus became coordinates between zero and one, giving millions of points that lay within the imaginary cube.
So they have only taken a very small number of the digits of pi.
Would they get the same result if they took the first googolplex (10^(10^100))
digits? Even that would prove nothing about the randomness of pi's digits.
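For anyone who wants to see the construction described in the quoted passage, here is a minimal sketch of the digits-to-points step only, not the statistical test the scientists then ran. The `digits_to_points` name is my own, and the digit string is just the first 60 decimals of pi, enough for two points.

```python
# First 60 decimal digits of pi after the "3.", enough for two points.
PI_DIGITS = (
    "1415926535" "8979323846" "2643383279"
    "5028841971" "6939937510" "5820974944"
)


def digits_to_points(digits, width=10):
    """Cut the digit string into width-sized segments and group them in threes."""
    usable = len(digits) - len(digits) % (3 * width)   # drop any ragged tail
    coords = [int(digits[i:i + width]) / 10 ** width
              for i in range(0, usable, width)]
    return [tuple(coords[i:i + 3]) for i in range(0, len(coords), 3)]


points = digits_to_points(PI_DIGITS)
for point in points:
    print(point)
```

Each point uses up 30 digits, so 100 million digits give a little over 3.3 million points in the unit cube. That is plenty for a statistical test, but still a finite sample of an infinite expansion.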
Finally a use for all the ACs who post to Slashdot!
Any ideas?
Rich
Creative Commons have done good work along this line. They produce a CC-GPL which is the GPL + Human readable code + Metadata.
However, we share our planet with two other intelligent species: mice and dolphins.
Any suggestions?
Somehow it seems strange. I'm more than happy to pay for a printed newspaper/magazine, but not for an online one. Why is this?
Not really food, but it can damage a keyboard. Lots of my keys have tiny little melt marks where a hot bit of ash has fallen.
Interesting time.
To my mind databases are broken beyond belief.