aharth · Slashdot Mirror

Unstable identifiers problematic? on Why We Need OpenStreetMap (Video) · 2014-01-29 10:34 · Score: 1

OpenStreetMap identifiers are not stable (at least according to a 2011 post), which makes reusing and linking OpenStreetMap data a bit challenging. Did that change?

Latency vs. bandwidth on Google and OpenDNS Work On Global Internet Speedup · 2011-08-30 06:42 · Score: 1

There are two factors that affect the performance of web (HTTP) lookups: latency and bandwidth. Latency depends on the distance between client and server. You won't be able to send data faster than the speed of light. Bringing the data closer to the client helps to reduce latency, especially for small lookups. Bandwidth becomes the limiting factor when you transfer (large amounts of) data over under-dimensioned pipes. In general, I'd be a much happier person if people used HTTP caching headers (Expires and such) more often, as then a Squid proxy can bring substantial performance gains.
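To put a number on the speed-of-light point, here is a rough sketch of the lower bound on round-trip time for a given distance. The 2/3 factor for signal speed in optical fibre and the example distances are assumptions for illustration, not measurements; real latency is higher because of routing, queueing and server time.

```python
# Lower bound on round-trip time imposed by the speed of light.
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBRE_FACTOR = 2 / 3               # assumption: signal speed in fibre is about 2/3 of c


def min_rtt_ms(distance_km: float, factor: float = FIBRE_FACTOR) -> float:
    """Return the minimum round-trip time in milliseconds for a one-way distance."""
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_S * factor)
    return 2 * one_way_seconds * 1000


# Illustrative (assumed) client-server distances in kilometres.
for label, km in [("nearby cache", 50), ("same continent", 4_000), ("other continent", 10_000)]:
    print(f"{label}: at least {min_rtt_ms(km):.1f} ms per round trip")
```

Under these assumptions a server 10,000 km away costs roughly 100 ms per round trip before any data is processed, which is why moving small lookups closer to the client pays off.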

Now what? How to actually monitor and debug? on What Makes Parallel Programming Difficult? · 2011-05-27 08:44 · Score: 1

Parallel programming is hard. Bummer. I'd rather be interested in an article that talks about *monitoring and debugging* parallel programs (I currently struggle to monitor parallel algorithms implemented in Java). Anybody?

Linked Data? on Open Data Needs Open Source Tools · 2010-03-09 09:06 · Score: 1

Semantic Web technologies (in particular RDF, a graph-structured data format) are ideally suited for publishing data. Also, these technologies facilitate the integration of separate pieces of information; integration is what you want to do if thousands of people start publishing structured data. Linked Data (RDF using HTTP URIs to identify things) is already used by the NYT and the UK government to publish data online.

Re:Data Search Interface on Attractive Open Source Search Interfaces? · 2010-01-20 12:08 · Score: 1

Yes, the better the data the better the system will work. However, VisiNav works quite well on relatively scruffy web data due to the integrated ranking component.
The underlying data has to be in graph-structured format (in RDF syntax); reasoning, most notably object consolidation, is supported via OWL. Once the data is indexed, users can search and browse right away. There's no configuration needed, because the ordering of data is done based on the calculated ranks. The UI can be configured via XSLT and CSS for adding a logo or changing the look and feel.
We've developed VisiNav as part of a research project, and the university owns (and manages) the IP. I guess they will make it available free of charge for educational and research organisations, but commercial applications would require a license.

Data Search Interface on Attractive Open Source Search Interfaces? · 2010-01-13 12:50 · Score: 1

Hi, there's also VisiNav which lets you assemble complex queries over data, covering keyword search and faceted browsing (as Flamenco) and a bit more (path navigation). Drag and drop UI, where people who don't know facets or path navigation can do keyword search without being distracted. -- Andreas. Disclaimer: I'm one of the developers of VisiNav.

Future Internet Symposium 2009 on Who Will Fix the Internet? No One, Apparently · 2009-08-26 02:52 · Score: 1

There's the Future Internet Symposium 2009 (http://www.fis2009.org/) in Berlin next week, which targets exactly the topic of the post. From the call for papers: "With over a billion users today's Internet is arguably the most successful human artifact ever created. The Internet's physical infrastructure, software, and content now play an integral part of the lives of everyone on the planet, whether they interact with it directly or not. Now nearing its fifth decade, the Internet has shown remarkable resilience and flexibility in the face of ever increasing numbers of users, data volume, and changing usage patterns, but faces growing challenges in meeting the needs of our knowledge society. Yet, Internet access moves increasingly from fixed to mobile, the trend towards mobile usage is undeniable and predictions are that by 2014 about 2 billion users will access the Internet via mobile broadband services. This adds a further layer of complexity to the already immense challenges."

Re:ah yes, semantic web via RDF is the future on The Web of Data, Beyond What Google and Yahoo Show · 2009-07-26 21:20 · Score: 1

The field has come a long way since 2001 or 2003.

The main obstacle to "this golden future" so far has been an insufficient amount of data published online. Many organisations sit on their data like hens sit on their eggs, and publishing data right requires some effort.

That's slowly changing, especially with more openness and transparency -- voluntary or forced -- in all kinds of organisations and agencies (data.un.org, data.gov, data.gov.uk...), more people getting the idea of open data, and the establishment of simplified best practices on how to publish data on the web following the Linked Data paradigm.

It's about time that Yahoo and Google finally start to take note and add open data to their systems (which don't exploit the full power of these technologies but hey you've got to start somewhere).

Re:Linux FS for SDD drives? on Creating a Low-Power Cloud With Netbook Chips · 2009-04-19 05:25 · Score: 1

You misunderstand how that's supposed to work. You don't "free main memory" to SSD. The idea is to use SSD as a pre-buffer for RAM, so it's quicker to access than reading from disk.

Sure.

But there's something wrong if the Linux kernel buffers SSD I/O in main memory and swaps code fragments to disk. At least that's what happened in my experiments.

Linux FS for SDD drives? on Creating a Low-Power Cloud With Netbook Chips · 2009-04-16 11:55 · Score: 1

I've been toying around with a Samsung 16GB SSD. Performance improvement over spinning disks in an I/O-heavy scenario was negligible. Also, it seemed as if the Linux kernel was still using memory to buffer SSD I/O, which somewhat negates the argument of using SSDs to free main memory for other stuff.

Any idea what type of OS/filesystem combination they were using?

Linked Data on Data.gov To Launch In May · 2009-04-05 07:31 · Score: 1

I really hope they publish the stuff as Linked Data.

Proper data first, then cool applications on Untangling Web Information · 2008-10-27 03:23 · Score: 2, Insightful

In a nutshell, the goal of the Semantic Web is to bring knowledge representation to the Web (using graphs, networks, binary predicates, however you want to call it).

I've been trying to apply data from the Semantic Web for a few years now.
I can see two roadblocks to mainstream adoption:

* Web data is immensely scruffy. If thousands of people contribute to a dataset without any restrictions, you get a mess (e.g. multiple URIs used to denote the same class or individual, which results in fractured data). Having said that, I can see some convergence happening on reusing URIs (for classes that has been happening for a while now; for instances it is getting better every day).
* Without proper data, it's hard to show the benefit of having a web-wide knowledge base. Right now, my marketing pitch for our semantic web search engine is to go "from documents to objects", i.e. you want to locate objects (the person CmdrTaco) rather than documents matching keywords.
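The "multiple URIs for the same individual" problem in the first point is usually handled by consolidating identifiers. Below is a minimal sketch of that idea using union-find over pairs of URIs stated to be equivalent. It is not how our system does it; the example.org URIs and the pair list are made up for illustration.

```python
from collections import defaultdict


def consolidate(same_as_pairs):
    """Group URIs that are stated to denote the same thing.

    Returns a dict mapping a canonical URI (the lexicographically smallest
    in each group) to the set of all URIs in that group.
    """
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving keeps lookups short
            uri = parent[uri]
        return uri

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller URI as root so the canonical choice is deterministic.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    groups = defaultdict(set)
    for uri in list(parent):
        groups[find(uri)].add(uri)
    return dict(groups)


# Hypothetical equivalence statements harvested from different sources.
pairs = [
    ("http://example.org/a", "http://example.org/b"),
    ("http://example.org/b", "http://example.org/c"),
    ("http://example.org/x", "http://example.org/y"),
]
for canonical, members in sorted(consolidate(pairs).items()):
    print(canonical, "<-", sorted(members))
```

Once every URI is mapped to a canonical one, statements about the same object from different sources end up attached to a single node instead of several fragments.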

Once you have achieved a web-wide knowledge base of decent quality, you can start thinking about how to navigate that information space to actually answer questions (and I don't mean natural language understanding, but a point-and-click, menu-based interface): CmdrTaco's phone number, people he knows, blog posts he's written, and so on.

The chicken-and-egg circle is slowly breaking up. For a demo, our system is online at http://swse.deri.org/.

Also works with WiMax on A DIYer's Quick Guide To Cheap Wireless Extension · 2008-07-17 06:31 · Score: 2, Interesting

Just tried on my balcony: WiMax box in front of old sat dish = ~ 30% higher transfer rate!

Re:Random read ops? on Samsung 256GB SSD is World's Fastest · 2008-05-26 04:09 · Score: 2, Interesting

How about random reads? I've benchmarked a 16G Samsung SSD and the standard Linux file systems (ext2, ext3) seem to cache read blocks in the (main memory) file system buffers.

Doing so seems to diminish some of the possible overall system performance improvements - if I have an SSD, I want to use the main memory for either HD I/O caching or programs. Caching disk blocks from the fast SSD in main memory seems suboptimal.

Persistent storage? on Amazon and Hardware As a Service · 2007-10-29 18:45 · Score: 1

I've heard this before and didn't understand it.

What does "no persistent storage in ec2" mean?

Here's the Tech Report on Super-Fast RDF Search Engine Developed · 2007-05-04 03:19 · Score: 5, Informative

Hello, I am one of the main developers of SWSE. True, the press release is vague, but there is only so much you can say in a press release aimed at the general public.

We have a Technical Report available at http://www.deri.ie/fileadmin/documents/DERI-TR-2007-04-20.pdf that should answer most of the technical questions.

From the abstract:

"We present the architecture of an end-to-end search engine that uses a graph data model to enable interactive query answering over structured and interlinked data collected from many disparate sources on the Web.

In particular, we study distributed indexing methods for graph-structured data and parallel query evaluation methods on a cluster of computers.

We evaluate the system on a dataset with 430 million statements collected from the Web, and provide scale-up experiments on 7 billion synthetically generated statements."

Standardised APIs? on Social Networking Sites Opening Their APIs · 2007-02-13 22:51 · Score: 1

Some social networking sites (e.g. livejournal.com and d.hatena.ne.jp) already provide basic data export in the FOAF (Friend-of-a-Friend) vocabulary. Search engines such as Swoogle and SWSE aggregate some of the content published in RDF. The problem is that crawling large database-driven sites with millions of files takes years when adhering to the Robots Exclusion Protocol. On the other hand, an API can provide on-demand integration, but with every site building its own API, a lot of schema wrapping (e.g. via XSLT) is needed to aggregate data. Vocabularies such as SIOC could provide a standardised API and data format for all sorts of community sites, which would facilitate the integration of data from multiple places.
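The "takes years" claim is simple arithmetic. A back-of-the-envelope sketch, assuming a single polite crawler that fetches one page at a time and waits a fixed delay between requests to the same site; the page count and the 10-second delay are assumed figures, not values taken from any particular site:

```python
SECONDS_PER_DAY = 86_400


def crawl_days(num_pages: int, delay_seconds: float) -> float:
    """Days needed to fetch num_pages sequentially with a fixed delay per request."""
    return num_pages * delay_seconds / SECONDS_PER_DAY


# Assumed figures: 10 million pages, one request every 10 seconds.
days = crawl_days(10_000_000, 10)
print(f"{days:.0f} days, or about {days / 365:.1f} years")
```

With these numbers a single site takes a bit over three years to crawl completely, which is why an export API or a bulk dump is so much more practical for aggregators.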

Indexing component is key for performance on MySQL to Counter Oracle's Purchase of InnoDB · 2005-11-22 22:41 · Score: 2, Informative

From experiments with my database-like system in Java (http://sw.deri.org/2004/06/yars/yars.html), I learned the hard way that the indexing/storage component is the key piece of any system dealing with large amounts of data. Performance depends to a great extent on the underlying storage mechanisms used (and different implementations of e.g. B+-Trees vary greatly in performance and functionality). Implementing a fast B+-Tree with transactions is a non-trivial task.

Let's hope MySQL AB finds a good replacement for InnoDB.

Semantic Desktop is a research topic on A Glimpse at the Linux Desktop of the Future · 2005-07-05 00:39 · Score: 1

There is research going on in Europe in the area of next-generation PIM and collaboration. One project is the networked social semantic desktop; there's a workshop about the topic in November 2005: http://www.semanticdesktop.org/

RDF Crawlers on RDF and OWL Are W3C Recommendations · 2004-02-10 06:02 · Score: 5, Informative

A lot of RDF out there is in FOAF and RSS 1.0 vocabularies. Increasingly, people link their RDF files to one another, which makes it possible to have RDF crawlers ("scutters") harvest RDF from the web. I have an RDF aggregator service running that crawls the semantic web. There's a lot of useless broken RDF out there, so if you put RDF on your web site please use W3C's RDF Validator to check for valid RDF.
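For those wondering what a scutter boils down to, here is a minimal sketch of the crawl loop: a breadth-first traversal that visits each RDF document once and follows its links to further documents. This is not the code of my aggregator; fetch_links is a hypothetical caller-supplied function, and the toy in-memory "web" with example.org URLs stands in for real HTTP fetching and RDF parsing.

```python
from collections import deque


def scutter(seed_urls, fetch_links, max_docs=1000):
    """Breadth-first crawl over linked RDF documents.

    fetch_links(url) is supplied by the caller (hypothetical here) and returns
    the URLs of RDF documents linked from url; it should return an empty list
    for broken or unparsable documents so the crawl simply moves on.
    """
    seeds = list(seed_urls)
    seen = set(seeds)
    queue = deque(seeds)
    visited = []
    while queue and len(visited) < max_docs:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:  # never enqueue the same document twice
                seen.add(link)
                queue.append(link)
    return visited


# Toy stand-in for the web: document URL -> URLs it links to (all made up).
WEB = {
    "http://example.org/alice.rdf": ["http://example.org/bob.rdf", "http://example.org/carol.rdf"],
    "http://example.org/bob.rdf": ["http://example.org/alice.rdf"],
    "http://example.org/carol.rdf": ["http://example.org/dave.rdf"],
}

order = scutter(["http://example.org/alice.rdf"], lambda url: WEB.get(url, []))
print(order)
```

A real crawler would add politeness delays, robots.txt checks and validation of what it fetched, but the visited set and the queue are the core of it.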

P2P network not only for filesharing on Modelling P2P Networks · 2002-02-28 01:33 · Score: 1

The focus on P2P networks in the past has been solely on filesharing. But you can also exchange other data using a P2P network. There's an open source project with a P2P network for exchanging recommendations for web resources. Help us test the scalability of our network, just grab the tar.gz and run the software!

recommendation instead of seeking on Is The Web Becoming Unsearchable? · 2001-03-27 05:33 · Score: 2

A totally new approach could be that you don't search, but that interesting web resources get recommended to you by your personal agent. We are currently working on a peer-to-peer system that doesn't exchange files but exchanges recommendations for web sites.

It's much like a good friend suggesting that you look at an interesting web site. You can see all the marketing blurb at http://www.iowl.net/. At the moment this is a seminar paper by some people (including me) at the Wuerzburg University of Applied Sciences. We have a working prototype that will hopefully be released in about a month or so.

User: aharth

Comments · 22