Domain: moreover.com
Stories and comments across the archive that link to moreover.com.
Comments · 21
-
Open Source Intelligence
What's wrong with using what is traditionally referred to as Open Source Intelligence (OSINT) against publicly available sources?
This has been done for years, and is a time-honored and respected mechanism for gathering intelligence. What's wrong with then leveraging technology to more effectively search larger volumes of information and weed out individual pieces of information for further analysis, to identify trends, and so on?
The Open Source Center, formerly the Foreign Broadcast Information Service, already does this with foreign broadcast media, and is able to collect and transcribe, on the fly, information from foreign radio and television broadcasts in a variety of languages and dialects with incredible accuracy, and then make the resultant material searchable. The new initiative would go one step further and apply artificial intelligence techniques to automated searching, so that it can more easily target and bring to light trends or time-critical information.
Different business and governmental entities do this globally; it's traditionally referred to as "current awareness", and many academic and corporate entities offer current awareness services. All of these services will leverage technology, live realtime searching and alerting, and so on, to make the information more timely, valuable, and relevant.
Remember, this is publicly available and published information.
Also, submitter is a little misguided when he says "No hint is given as to how this would apply to syndicated articles written in the US and published abroad." That misunderstands the purpose of this; the program is designed to look at foreign media sources as one component of OSINT, because they are a valuable source of such information, and can reflect local trends and patterns, and may reveal changing or growing (or waning) sentiments on particular topics on the part of a local populace or media outlet, or even a government in the case of state-controlled media. We generally don't get that kind of information from US-based media, and this has nothing to do with whether US-based media outlets publish abroad. It's already public information and has been published publicly. The restrictions are geared to prevent an appearance of overt US press monitoring.
OSINT is a one-way source of intelligence information: from it, to the gathering entity. Any assumption that viewing already-public information implies a commensurate attempt to silence it (especially when the information isn't under our control; we can't "silence" things like Iran's state media) both makes a fallacious logical leap and grossly misunderstands the purpose and scope of OSINT.
All the critics can say is that it's "creepy and Orwellian," but of course, there's nothing wrong with the government or its intelligence components reading, viewing, or collecting publicly available and indeed overtly publicly published information. The intelligence community gets ripped when it doesn't gather enough information, and will no doubt get ripped for gathering "too much" in a "creepy" way, even when it's from overtly and intentionally public sources, and especially if it uses technology to do it.
There is a real concern about the growing use of automated and electronic intelligence gathering in lieu of human intelligence, but ultimately, both are valuable. Unfortunately, electronic and signals intelligence is often much more costly, and sometimes gets more attention in some parts of the intelligence community while human intelligence needs languish. -
Re:Prior art ...
Prior art - Moreover provide me with free RSS news feeds - they make the first item an advert, the advert is dependent on the search term.
HTML
XML
Google themselves advertised / provided the adverts on this when it first came out I believe - this might be where they started a year and a half ago - I can't see any evidence of Google involvement now though. Is there a Moreover/Google relationship? -
At least they link to /.
Go to: http://uk.newsbot.msn.com/search/?nq=Minnesota+Senator+:
Minnesota Senator Says Email Tax Might Reduce Spam
Slashdot - 19 Nov.
We are not responsible for them in any way. Read this story
Perhaps they need a bit more work on picking out relevant parts of the article to post as a blurb. :) -
Checksum used by nytimes
The nytimes website has a number of partner websites like this through which you can access the nytimes content without having to register. The partner websites use a URL containing a checksum which the nytimes website verifies before giving access. Checksum access only works for a limited time, about one week after creation.
- Question 1: What is the 16-digit hexadecimal checksum?
- Question 2: From what input is the checksum calculated, e.g. some concatenation of the article's text, date, URL, plus random data?
A valid nytimes URL, with the checksum as the value of 'en':
(http://www.nytimes.com/2002/07/22/international/europe/22EURO.html?ex=1028001600&en=9bcfc7fe702d6bd0&ei=5040&partner=MOREOVER)
By the time you read this the URL may no longer work; I got it as a redirect from this. You can find other working nytimes URLs by visiting www.moreover.com/news and looking for nytimes stories. Just grab the URL you see after moreover.com re-directs your browser to nytimes.com.
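For anyone who wants to poke at these URLs, here is a minimal Python sketch that pulls the parameters apart. It does not answer either question; it only confirms that 'en' is 16 lowercase hex digits (64 bits). Reading 'ex' as a Unix timestamp is a guess on my part, not something nytimes documents, though for this URL it decodes to 30 July 2002, about a week after the article date, which would fit the one-week lifetime. The function name and the returned keys are made up for illustration.

```python
import re
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# The example URL from the comment, with the forum's inserted spaces removed.
URL = (
    "http://www.nytimes.com/2002/07/22/international/europe/22EURO.html"
    "?ex=1028001600&en=9bcfc7fe702d6bd0&ei=5040&partner=MOREOVER"
)


def inspect_partner_url(url):
    """Split out the query parameters and sanity-check the 'en' value."""
    params = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}
    en = params.get("en", "")
    ex = params.get("ex", "")
    return {
        "partner": params.get("partner"),
        "en": en,
        # 16 hex digits would be a 64-bit value.
        "en_is_16_hex": re.fullmatch(r"[0-9a-f]{16}", en) is not None,
        # ASSUMPTION: 'ex' is an expiry time in Unix seconds.
        "ex_as_utc": datetime.fromtimestamp(int(ex), tz=timezone.utc) if ex.isdigit() else None,
    }


info = inspect_partner_url(URL)
print(info)
```

Comparing 'ex' and 'en' across several freshly redirected URLs would be the obvious next step for Question 2.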
-
Re:3400+ Slashdotters Can't Be Wrong...
-
SOAP me up baby !
I was just thinking about incorporating a Google search on my websites after an impressive experience with a few websites that employed their Free WebSearch plus SiteSearch feature.
This is even better. With this feature, I'll be able to SSI and/or push results using something as simple as SoapLite to get the job done.
I sure hope other content providers are taking note. Imagine how useful (not to mention fun) it would be to snap up stuff from places like MoreOver.Com? -
Copy of Senate hearing speeches & press covera
-
Re:Alternatives
I think you mean moreover.com.
Alternatively, you can check out XMLTree.com
-
Speaking of SW toys
-
Ontologies: handmade vs. automated
From the CYC website:
CYC's knowledge base is built upon a core of over 1,000,000 hand-entered assertions (or "rules") designed to capture a large portion of what we normally consider consensus knowledge about the world.
As the interview with Google on /. yesterday brought out, one of the great challenges of the moment is how to take enormous quantities of easily available data, and store it for retrieval in ways that reflect an understanding of the real world. (One might try to quantify the "intelligence" of a database by the extent to which it can achieve this kind of data association / data reduction.) Good ontologies are a big part of this -- identifying and distinguishing different contexts, associated with their likely possible properties.
The work CYC have done in finding good ways to represent such ontologies is important, but only goes so far -- in particular it seems to be essentially static. What impresses me more is some of the work that has been done elsewhere to automate the process of the discovery and maintenance of ontology -- extracting it dynamically from the associations revealed in a large pile of documents.
One example of a site which is an end user of such technology is the well known news portal moreover.com, powered mostly (I believe) by Autonomy.
-
Re:Sounds like rdf...
Yes, it is RDF. There are many areas of the SW work where it's not clear what the final technology will be (notably the schema expression tools, such as RDF Schema vs. OIL or DAML or DAML+OIL), but RDF itself seems almost certain to be used - there's just nothing else offering itself as a competitor in that niche.
Some clarifications: XML isn't RDF, and RDF isn't XML. RDF is fundamentally a data model, whereas XML is just a serialisation of a much simpler infoset model. As RDF doesn't have its own serialisation (how you write it down), then the convention has been that it's done in XML. You could serialise RDF into anything you like, but I've yet to see a non-XML one.
XML Schema isn't the same as most other schema languages in this field. XML Schema is concerned about structure and operational matters, not about expressing semantics. XML Schema would be a very bad choice for expressing the semantics of the SW. It works OK for Ariba and XrML, because they're quite limited applications of discussion (an invoice is an invoice is an invoice). Even with MPEG-7, XML Schema has run out of steam and the MPEG group have had to invent their own schema expression language. Using XML Schema for bureaux like BizTalk is extremely limiting, and a bad move long-term.
DTDs are dead. Use XML Schema instead.
RSS (the site-summary format used for Moreover newsfeeds and to make the Slashboxen work) isn't RDF. It's expressed in RDF and defined in RDF Schema, but it's just one RDF application out of many.
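To make the "RDF is a data model, XML is just one way of writing it down" point concrete, here is a small Python sketch that holds one statement as a (subject, predicate, object) triple and writes it out twice: once as RDF/XML and once as N-Triples, a line-based W3C format. The triple itself is made-up sample data, the function names are hypothetical, and the code assumes plain literals with no characters needing escapes and predicate namespaces that end in "/".

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# The data model: a list of (subject URI, predicate URI, literal object).
# Sample data for illustration only.
triples = [
    ("http://www.moreover.com/", DC_NS + "title", "Moreover"),
]


def to_ntriples(triples):
    """One statement per line; assumes literals need no escaping."""
    return "\n".join(f'<{s}> <{p}> "{o}" .' for s, p, o in triples)


def to_rdfxml(triples):
    """The same statements as RDF/XML, one rdf:Description per triple."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    for s, p, o in triples:
        desc = ET.SubElement(root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": s})
        # Split the predicate URI into namespace and local name
        # (assumes the namespace ends in "/").
        namespace, local = p.rsplit("/", 1)
        prop = ET.SubElement(desc, f"{{{namespace}/}}{local}")
        prop.text = o
    return ET.tostring(root, encoding="unicode")


print(to_ntriples(triples))
print(to_rdfxml(triples))
```

Both outputs carry exactly the same statement; only the notation differs.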
-
more over....
try http://www.moreover.com they provide news feeds for several sites.
-
focus
deja and google are great and all, but i still like focused resources ie slashdot, moreover and silicongod.com
-
now what we need is climate control
These types of transportation devices -- small, one-person stand-up units, relatively slow, energy-efficient -- could be the future of the modern city. Look at Chinese cities - a sea of bicycles. If even 10% more of those people had cars, there'd be total chaos: the population is too dense to be able to accommodate large vehicles.
Compare this with the modern US city: cars, on average, getting bigger all the time; traffic jams getting worse. Christ, there's a whole industry in traffic jams, with news helicopters and on-board computers. Meanwhile, cities continue to get denser and populations continue to rise. There's a clear end to this: cars have to go as the primary means of inner-city transportation.
Some cities are taking steps already: Portland OR has expanded its bike lanes over the last 15 years and they're now pretty pervasive. Other OR cities, like Salem and Eugene (college towns), have even more aggressive bike lane programs and laws. Bike lanes are a clear sign of policy and popular support for smaller, more economical short-distance transportation.
What's to replace cars? Scooters, perhaps (cf. Ginger), or something similar. What are the major objections to scooters?
- Safety. First, the system has to separate cars and small vehicles - they can't interact. Also, there has to be a licensing program, like for cars - we do NOT need thousands of untrained scooter riders - one fuck-up would take out a crowd. Finally, remember the dire predictions about cars? Thousands of deaths? Environmental destruction? (Well, ok, most of those seem to have come true, but why let that stop us?)
- Social engineering. No, most people won't trade their V-16 Ford Luxohemoth in for a battery-powered skateboard anytime soon, but that can certainly change (look how well the US government managed people in WWII). People don't think they're as susceptible to propaganda anymore, but they're wrong. We just call it advertising now, and it works really well. I don't think this will be a real problem.
- Climate. Here's the stake through the heart of this little idea. In Portland we get 36" of rain a year, and it's spread out very thinly - 5 or 6 days a week in winter are overcast and somewhat "moist." No way are people going to tool around in the open in a climate like that. So what do we need? Climate control.
Thus, as I submit in the title, these types of transportation won't become widely used until we have pretty well-established nanotechnology. Unless, of course, the world economy collapses or gets spread veeeeery evenly, in which case I guess even US citizens will be happy to ride bikes to work in the rain.
Just a thought...
question: is control controlled by its need to control?
answer: yes -
What Google and Yahoo! are missing.....
It's currently very hard to search for information on the Web that is less than 2 weeks old. When you're keeping up with current events and industry developments, 2 weeks is just too long to wait for information.
That's where specialty search engines like Moreover come in. Eventually, sites like this will let you search those bits of the Web that change often (news sources, weblogs, discussion groups, sites like Slashdot, message boards, financial news, etc.), allowing people to keep up with things as they happen.
Existing search engines are great at finding things that are archived on the Web, but poor at keeping up with what's currently happening. Looking for all the articles on the latest Shuttle mission, as well as what people are saying about it? You might find one or two things about it on Yahoo! or Google, but a search engine like Moreover will find the fluff article on CNN, the more in-depth article on Space.com, and a discussion about the mission on Slashdot. That's pretty powerful.
-
Re: more *very* useful uses of XML
If anyone is interested in integrating XML delivered content into their application, at Moreover.com we've just given free access to all of our headlines from 1500 sources, in a variety of flavors of xml: moreoverxml; wddx; rss and. See Moreover News Categories
From our own perspective, what is interesting is that some of the more sophisticated XML-based initiatives for syndication of XML content such as ICE are overly complex for many applications. Some much simpler definitions such as wddx allow very speedy integration of content and metadata into a database.
-
Export your database records
This is a problem I came up against a few years ago and tried to solve at the time. I never got far enough down the path to release a standard or anything, but I'll explain the thinking.
Basically the problem is that search robots can get stuck in loops on your site if it's a database-driven one. Equally, if the database contains something like stock quotes or postcodes, it would just succeed in filling the engines with contextless gibberish.
So instead, the plan was to get people to manually export their database into a flat text file referenced from the robots.txt file. The text file would in some way have a data field and an address field. So the data field has the content itself as plain text and the address tells the search engines where they should send people, rather than referring them to that text file.
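Since no standard was ever released, the following is only a sketch of what such an export might look like, in Python: one record per line, with the address field and the plain-text data field separated by a tab. The tab-separated layout and the function names are my own assumptions for illustration, not an existing format that any search engine reads.

```python
import csv
import io


def export_records(records):
    """Write (address, data) pairs as tab-separated lines, one record per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for address, data in records:
        # Collapse runs of whitespace so each record stays on a single line.
        writer.writerow([address, " ".join(data.split())])
    return buf.getvalue()


def parse_export(text):
    """What a crawler might do: read back (address, data) pairs."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [(row[0], row[1]) for row in reader if len(row) == 2]


# Hypothetical records from a database-driven site.
records = [
    ("http://example.com/item?id=1", "First record,\nspread over two lines"),
    ("http://example.com/item?id=2", "Second record"),
]
exported = export_records(records)
print(exported)
```

The engine would index the data field but send visitors to the address field, so nobody lands on the flat file itself.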
Now there's another big problem that the author hasn't mentioned. How do you find real-time information?
Here's the scenario: You hear from a friend that a school in Dublin, Ireland has just been closed down due to sexual harassment by the principal. Since this is your field, you want to find out more. The major sources (CNN, BBC, ABC etc) aren't covering it. You know that some local news site will cover it though. So how do you find it?
Right now, the only way would be to find the Brand in a subject index like Yahoo, then hope they cover it. Looking for Dublin Time might work. But why can't you search for sexual harassment school dublin?
The answer lies in a real-time database of news, requiring the news services to either update a file with all the news in it or perhaps in some way push information into the search engines.
One approach to this problem is Moreover, who index news themselves without the benefit of metadata. These guys are very clued in about metadata though.
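As a toy illustration of the "push information into the search engines" idea, here is a minimal in-memory index in Python where a story becomes searchable the moment it is pushed. The class and method names are hypothetical, the URLs and headlines are invented, and a real service would need persistence, ranking and much more; this is only meant to show the shape of it.

```python
import re
from collections import defaultdict


class NewsIndex:
    """Toy inverted index: word -> ids of the stories containing it."""

    def __init__(self):
        self._items = []                    # (url, headline), in arrival order
        self._postings = defaultdict(set)   # word -> set of item ids

    def push(self, url, headline):
        """Called by a news service as soon as a story is published."""
        item_id = len(self._items)
        self._items.append((url, headline))
        for word in re.findall(r"[a-z0-9]+", headline.lower()):
            self._postings[word].add(item_id)

    def search(self, query):
        """Return stories containing every query word, newest first."""
        words = re.findall(r"[a-z0-9]+", query.lower())
        if not words:
            return []
        ids = set.intersection(*(self._postings.get(w, set()) for w in words))
        return [self._items[i] for i in sorted(ids, reverse=True)]


index = NewsIndex()
index.push("http://example.com/a", "Dublin school closed after complaint")
index.push("http://example.com/b", "Shuttle mission launches")
print(index.search("school dublin"))
```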