Ask Slashdot: Knowledge Management Systems?
Tom writes: Is there an enterprise level equivalent of Semantic MediaWiki, a Knowledge Management System that can store meaningful facts and allows queries on it? I'm involved in a pretty large IT project and would like to have the documentation in something better than Word. I'd like it to be in a structured format that can be queried, without knowing all the questions that will be asked in the future. I looked extensively, and while there are some graphing or network layout tools that understand predicates, they don't come with a query language. SMW has both semantic links and queries, but as a wiki is very free-form and it's not exactly an Enterprise product (I don't see many chances to convince a government to use it). Is there such a thing?
Kinda offtopic, but bear with me. Enterprise grade is what closed source rolled out once they started losing sales to well maintained and stable open source projects. It comes with support contracts and licenses, but not much else. Just as many closed vendors will disappoint you with their support as open source. You could argue Wikimedia is enterprise grade, because it supports 1.21 million accounts. But unless and until the business is committed to defining exactly what they mean by "enterprise grade", you have nothing to go on other than "software that requires a purchase order and recurring license".
That having been said, check out Foswiki. Search and control are both pretty good.
Just store a bunch of documents somewhere with a search feature that does full text indexing. Or use a simple Wiki system.
Anything more complicated than that and you'll be the only one using it. Other people won't care enough to spend their time entering data into specific fields and learning a query system.
Atlassian Confluence may be the thing you're looking for.
Is there an enterprise level equivalent of Semantic MediaWiki, a Knowledge Management System that can store meaningful facts and allows queries on it?
I asked my Knowledge Management System and it said that no, no there isn't.
From what you asked and the example you gave, it seems like SharePoint or a system like it is what you're looking for, or are you just so Microsoft averse that you never even looked at their products?
SharePoint. Create a form that takes in the information and saves it to a list library. That is just one way I can think of off the top of my head. There are probably a dozen more ways using SharePoint if I think about it. Make sure that if you are using SharePoint 2010, you have FAST installed for search queries. Heck, you can even load up your Word files and still search based off of those. With SharePoint 2013, you just need to have Enterprise Search enabled; they took the FAST guts out and added them to SharePoint search.
SharePoint can do anything, and anything it can't do, just slap in a bit of jQuery.
The Oracle Knowledge product, which was InQuira Knowledge Management until it was acquired by Oracle ~2012. We've built an integrated knowledge management / troubleshooting tool that's deployed to 100k call center agents.
A big issue to be aware of for information management systems is the large training effort to use them and the effort to move your documentation into them. We have had the problem that a new system is brought in, it takes literally a few years for employees to get their work into the new system and get comfortable with it, and then a newer latest and greatest system comes out. We now have 20 years of documentation in half a dozen different places - each of those places originally declared as our "permanent solution".
You need to budget a lot of training and content transfer time. If you just hope it happens naturally, you will be very disappointed.
If you don't want to spend time and $ on training and moving documents, your best bet IS just files in a directory tree with a normal OS-provided content search. If people use keywords in documents, that is good enough for 95% of all documentation uses, and it's free.
You are not managing knowledge, you are managing laziness. Writing something up, even in a crude way, is fun and refreshing. After all the specialist can claim to have shared his knowledge and get praise from the team lead. This applies to all teams where the definition of "done" includes documentation.
The next day, however, maintaining documentation is as hard as maintaining code and releases. It's boring work, just like fixing old bugs. The rockstars have moved on, and code rots like the documentation.
The essence is that there needs to be a clear focus on the value of it, if it has any. Otherwise it will be a waste of effort down the line, regardless of the system used.
The number one problem is at the input interface: people will only use it if it's useful or there is someone standing over them with a cosh. So how do you do that? By finding applications they find useful for their knowledge or for sharing knowledge. Progress reports, interface specs, requests for changes or whatever the knowledge generators want. So it's a management problem.
Say to management, "I have this as a solution, I think it's the most flexible, can we give it a try? Look! I've piloted it on my latest project and see what it can do... Think how useful if..." When management champions it there is some chance of it working. Until then paddle your own canoe and offer to show people how clever you are.
It's a good overall question, but exactly the same issues apply to 'Enterprise' (whatever that is) and novelists trying to keep track of places, people, timelines, todos, feedback etc. Until you've really put any solution to the test by actual use, you won't understand the practicalities. (I've tried all sorts over 35 years and keep coming back to a book of notes or a master notes document.) The human brain is a pretty good filter if you can do basic organisation and remember to make notes/put things in the right place.
I'm not sure why you're discounting Semantic MediaWiki as an option. Sure, it's not as adaptable as a proper relational database system and a custom interface built on top of it, but on the other hand it is more free-form, higher-level, and you can encode whatever information you like into the individual pages, link between them, and then query the pages based on that information and present the results of those queries to the user. It's not hard to use the CSS to customize the appearance a bit. It's not perfect, but I've used it for a project with ~10000 individual pages and tens of thousands of links between them all (over 50000). Works fine. Mind you, I didn't establish all those links manually, so an "untrained user" situation might not work as well. Hardware-wise I've got it on a vanilla i3-class machine with an SSD, a Linux install, Apache web server and database. It's nothing special. With that setup it isn't slow at all, even when a page display generates a few hundred individual queries at a time, but then I haven't tried it under very heavy load (probably wouldn't survive a slashdotting). Together with MediaWiki it has pretty effective caching options to speed up pages that don't change often. I only found one thing it didn't do that I wanted, so I modded the code in one module to implement the new feature. The source code is open so it was relatively easy to do once I understood how Semantic MediaWiki and MediaWiki itself work together.
It's not clear to me how much you've experimented with this option, but if you haven't done so yet it is worth setting up a test case and hammering it with some load tests matching your expected "enterprise" demands.
At my last job, we used DokuWiki for all the teams to store their documentation. IT used it to document how to set up systems with kickstart, modify DNS, add the node to puppet, etc. It had most of the features and was web-based, so you could get to it from anywhere. It was easy to maintain and back up since it was just static files and a web server.
My current job uses Confluence, which has the same features as DokuWiki but is integrated with their bugtracker and source code control offerings. It needs a lot more juice to run and maintain and should have been put on something other than a small VM. With 500 people banging on it, it frequently crashes.
If all you want is something quick and easy that will scale to 100 users easily, DokuWiki should work for you.
(I don't see many chances to convince a government to use it)
The government uses a tremendous amount of open source software, so I don't see any reason they wouldn't consider MediaWiki. Plus, everyone's heard of Wikipedia, so it's a pretty easy sell: "You've used Wikipedia, right? We're going to use the same exact software that runs Wikipedia - and it's free!"
I developed a large information-based system recently and we used Drupal 7 and a plugin to push the content to OpenCalais, which then tagged the content with the semantic info back into the Drupal system. You can then use a faceted search which will allow you to drill down to your data.
Seeing as I've seen Tom reject every single suggestion anyone has had, I guess the answer to his question is "No."
... Howtos + 5-minute Screencasts are what you're looking for.
Most KMSes/DMSes are crap - whether FOSS or not. Don't burden yourself with an extra system that is more trouble than use. Verbose opening comments of classes, API docs with examples, documented use cases double-checked by the users, Howtos and Screencasts are what you need and want.
Once you've generated the final docs, give them a nice design, some search-thingy with elasticsearch or something, and put the Howtos and Screencasts front and center along with some intros for n00b users.
All that is done best with textfiles and API doctools + proper versioning. Perhaps some diagrams of architecture, setup and main use cases might help.
KMSes are the Fallout of 2000s mid-execs bullshit-bingo sessions and IMHO hardly ever worth the hassle.
My 2 cents.
In the title. Thank you and have a great day.
Even a basic wiki or any kind of system (let's say internal IM or some stuff to schedule when the meeting room gets used) may get approved, set up and then virtually unused. Or in stronger terms it will be unused.
E.g. the tags attached to Slashdot stories. At least I've noticed that today clicking on them brings up a list of stories (it used to not work, I think). But it is likely that 80% of stories (or a lot more) that would warrant relating to a given tag are missing, and many tags were one-time snarky remarks. Now that they don't fail they do seem to bring up very interesting content though.
Semantics technology seems ideal for e.g. a database of animal or botanical species with people paid to exhaustively maintain the data. Or a collection of towns, some "booming", some "decaying", some linked to others in a certain way?
Thus you may want to define some areas of knowledge where the semantical features will really be used more than in others, and somehow get it enforced through policy?
Looking at http://semantic-mediawiki.org/... there are a few examples that can definitely be considered enterprise users, including some high-risk government users (NASA uses it to plan EVAs for the ISS for example).
The "enterprise mentality" makes most of the alternatives too cumbersome to actually be effectively used - ultimately you have to have buy-in from your users, or what management wants is not going to matter - if it's not pleasant to use you'll be back to emailing 70 different versions of the same Word document around in a few months' time with file renames as your only version control (if you are that lucky).
I don't see many chances to convince a government to use it
In fact, the UK Environment Agency uses SMW for its Restore Rivers project: https://restorerivers.eu/wiki/index.php?title=Main_Page
So apparently it is possible to convince government to use it.
http://protege.stanford.edu/ Java desktop application, used to define/manage ontologies. Not sure if they have a web version in the meantime or if it comes close to what you need. However, it supports plugins, so perhaps the frontend can be adapted to access a centralized DB. Oh, found it: http://semanticweb.org/wiki/We...égé.html
This is an info page with an overview of various tools: https://en.wikipedia.org/wiki/...
Did you stumble over this: http://www.w3.org/2001/sw/wiki...? Dozens of various tools mentioned.
Another tool I stumbled over, but have not used yet: http://oboedit.org/
And then there is https://jena.apache.org/docume...
But that is more a programming API to dynamically create classes to store/manage data in an ontology-described database. (I have not used it yet, but it looks promising.)
And then we have this: http://semanticweb.org/wiki/To...
BTW, I can offer remote programming/assistance in such tools.
You might want a tool storing claims in the form of RDF triples; these can be used as the basis for deductive reasoning. One standard in this area is the Web Ontology Language. Several software reasoners are named in the Wikipedia article. Also, for existing knowledge bases, see: https://en.wikipedia.org/wiki/Web_Ontology_Language#Public_ontologies
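To make the "triples as a basis for deductive reasoning" idea a bit more concrete, here is a minimal sketch in plain Python. It is not RDF or OWL proper, and it is not how any of the reasoners mentioned work internally; the service names, the `dependsOn` predicate and the helper functions are all made up for illustration. It only shows subject/predicate/object facts, a wildcard pattern query, and one deduction rule (transitivity).

```python
from typing import Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def match(triples: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def infer_transitive(triples: Iterable[Triple], predicate: str) -> Set[Triple]:
    """Apply the rule (a p b) and (b p c) => (a p c) until nothing new appears."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        links = [(s, o) for (s, p, o) in facts if p == predicate]
        for a, b in links:
            for b2, c in links:
                if b == b2 and (a, predicate, c) not in facts:
                    facts.add((a, predicate, c))
                    changed = True
    return facts


# Hypothetical project facts; nothing here comes from a real system.
facts = {
    ("billing-service", "dependsOn", "auth-service"),
    ("auth-service", "dependsOn", "user-db"),
    ("billing-service", "ownedBy", "team-payments"),
}

closed = infer_transitive(facts, "dependsOn")
deps = sorted(o for (_, _, o) in match(closed, s="billing-service", p="dependsOn"))
print(deps)
```

The point is that the indirect dependency on `user-db` was never entered by anyone; it falls out of the rule. A real triple store plus reasoner does the same kind of thing with a standard vocabulary and a proper query language.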
Surely you just say something like... Look, the cost is in the staff who you already have, it's Open Source and sits on top of an Open Source application, it's free, the platform to host it is either free or low cost commodity, plenty of people use it already (proven technology), it'll look good that the government is investing time and not necessarily money from tax payers, and it's using Open Source and Open Standards, so you're not tied into some niche technology only supported by a select few large corporations that, when things go wrong, give you zero support or guarantees (contrary to the contract). That last one is just my rant from an experience I've had already, when I failed to convince a company to use Open Source and things went south big time.
When shit hits the fan get some of these https://youtu.be/pY-GncsZ-UE
There was a discussion on Slashdot a couple of months ago about a more sophisticated file system.
In my opinion we should extend file systems instead of replace them because users and support staff are used to them and their stuff is already there.
If file systems easily allowed meta data to be attached to files and folders, then semi-structured queries etc. could be done on them. "Views" could be made of combinations of folder trees, similar to RDBMS views. Rough example:
While there is an existing standard on file meta-data, it appears to be inconsistent across vendors/OSes/versions, poorly supported by file APIs, and poorly tested.
I'm thinking of adding a secondary system on top of the file system to store meta-data rather than depend on vendors' meta-data. I have a rough-draft for an open-source product. (It won't be very fast, but if it catches on, optimizers could be added.)
It could also serve light-duty CRUD, such as specialized tracking systems.
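For what it's worth, a secondary meta-data layer on top of an ordinary file system can be sketched in a few lines. This is only an illustration of the idea, not the rough-draft product mentioned above: the `.meta.json` sidecar convention, the function names and the sample fields are all assumptions. A "view" here is just the set of files, from anywhere in the tree, whose meta-data matches a filter.

```python
import json
import tempfile
from pathlib import Path

SIDECAR_SUFFIX = ".meta.json"  # hypothetical naming convention for sidecar files


def sidecar_for(path: Path) -> Path:
    """Location of the meta-data file that sits next to a given file."""
    return path.with_name(path.name + SIDECAR_SUFFIX)


def get_meta(path: Path) -> dict:
    """Read a file's meta-data; files without a sidecar have none."""
    sidecar = sidecar_for(path)
    return json.loads(sidecar.read_text()) if sidecar.exists() else {}


def set_meta(path: Path, **fields) -> None:
    """Merge the given fields into a file's meta-data."""
    current = get_meta(path)
    current.update(fields)
    sidecar_for(path).write_text(json.dumps(current, indent=2))


def view(root: Path, **wanted) -> list:
    """All files under root whose meta-data matches every wanted field."""
    hits = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and not path.name.endswith(SIDECAR_SUFFIX):
            meta = get_meta(path)
            if all(meta.get(key) == value for key, value in wanted.items()):
                hits.append(path)
    return hits


# Demo in a throwaway directory with made-up files and fields.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "specs").mkdir()
    (root / "notes").mkdir()
    spec = root / "specs" / "interface.txt"
    meeting = root / "notes" / "meeting.txt"
    todo = root / "notes" / "todo.txt"
    for f in (spec, meeting, todo):
        f.write_text("placeholder content")

    set_meta(spec, project="apollo", status="approved")
    set_meta(meeting, project="apollo", status="draft")
    set_meta(todo, project="zeus", status="draft")

    apollo_view = [p.relative_to(root).as_posix() for p in view(root, project="apollo")]
    draft_view = [p.relative_to(root).as_posix() for p in view(root, status="draft")]

print(apollo_view)
print(draft_view)
```

As said above, a full scan like this won't be very fast; an index could be bolted on later without changing how users store their files.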
I've found a couple companies that might be what you're looking for. I'd be interested in talking (erin at hyperbuddha.net) if you find something that fits your requirements; a good semantic system is something I've wanted for systems engineering of certain projects, but I haven't found one yet.
I'm treating Semantic+CMS as kind of equivalent to a KMS, but maybe I'm misunderstanding there.
http://www.webnodes.com/
http://redlink.co/ which I found through http://www.iks-project.eu/
http://flow.li/ seems to be using a KMS to organize data on the backend for the purpose of making publishing easier.
Git + Grep is fully decentralized, works offline and easily scales to the amount of information you are likely to have. It's modular (you can replace grep with other tools) and it allows users to use their preferred editor. It keeps backups and a great version history. It's also tamper resistant, though be aware of the potential weaknesses with SHA-1. It's portable between many OSes, and there are several existing tools for exporting data from it for use in other systems. There are also private or hosted web UIs available, a lot of people who already know how to use it, and lots of documentation and good long-term support.
http://www8.hp.com/us/en/softw...
I set up a wiki for engineering, and it's been a huge success. All the key information for a project (design documents, vendor data sheets, build procedures, test procedures) is all there for all of engineering to see. If you ask a question and the answer isn't on the wiki, the policy is to add it to the wiki and reply with a link.
I'd be interested to hear about where approaches like this have failed.
Disclaimer: I work on the project, don't get any kickback from sales.
The first thing that comes to my mind when talking about enterprise-grade KMS is ibm.com/support/knowledgecenter: you write HTML | DITA | EHS type content and index/share it using KC. There is an offline version available as well.
There are tools for that, http://www.x-media-project.org...
Experience.
As for the query "all the cities with more than a million people, in French-speaking countries": we are not there yet. But could it be submitted as city.country.legalLanguages = "French", city.population >= 1M? If yes, and you are fine with the fact that numerical values will only come from some "fact box", then it could be done. I think even a sort of auto-complete for users wouldn't be too hard to implement.
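A rough sketch of how such dotted conditions could be evaluated, just to show it is not much machinery. Everything here is hypothetical: the class and field names simply mirror the query above, the cities are placeholders, and the populations are invented "fact box" values, not real figures.

```python
import operator
from dataclasses import dataclass
from typing import List


@dataclass
class Country:
    name: str
    legalLanguages: List[str]


@dataclass
class City:
    name: str
    population: int
    country: Country


OPS = {"=": operator.eq, ">=": operator.ge, "<=": operator.le}


def resolve(city: City, dotted: str):
    """Follow a path like 'city.country.legalLanguages' starting from a City."""
    obj = city
    for attr in dotted.split(".")[1:]:  # drop the leading 'city'
        obj = getattr(obj, attr)
    return obj


def holds(city: City, path: str, op: str, value) -> bool:
    """Check one condition; '=' against a list field means 'contains'."""
    actual = resolve(city, path)
    if isinstance(actual, list) and op == "=":
        return value in actual
    return OPS[op](actual, value)


def query(cities, conditions):
    """Names of the cities that satisfy every (path, op, value) condition."""
    return [c.name for c in cities
            if all(holds(c, path, op, value) for path, op, value in conditions)]


# Placeholder data with invented numbers.
country_f = Country("Country F", ["French"])
country_c = Country("Country C", ["English", "French"])
country_d = Country("Country D", ["German"])
cities = [
    City("City A", 2_100_000, country_f),
    City("City B", 500_000, country_f),
    City("City C", 1_700_000, country_c),
    City("City D", 3_600_000, country_d),
]

result = query(cities, [
    ("city.country.legalLanguages", "=", "French"),
    ("city.population", ">=", 1_000_000),
])
print(result)
```

Auto-complete would fall out of the same structure: after the user types `city.`, offer the field names of `City`, and so on down the path.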
Cms
Take a look at the book:
Jörg Rech; Björn Decker; Eric Ras: Emerging Technologies for Semantic Work Environments: Techniques, Methods, and Applications
IGI Global, 2008
Do not be fooled by the word "semantic" in it; some chapters will let you understand the requirements for a KMS better. And when the requirements are understood, it will be much simpler to come to a suitable solution.
It's not cheap, but it's very, very good.
But have you had a look at Collibra? There is a lot of noise around it for data management and governance where I work. Not sure if you have the ability to query it though.
DEVONThink with database synchronisation comes close to what you are asking for. Mac only (although there is a webserver built in). Fantastic AI augmented classification, searching and document association.
I worked in KM systems for years. What we found was:
* 2 people will edit a wiki article. Everyone else will read and complain.
* document management systems without a strong, smart, structure suck. Metadata in documents are pretty much worthless without manual validation and double checks.
The short answer is to get a librarian and use a DMS with full text search, but do NOT allow normal document files. Only allow text and simple HTML. Page layout crap like powerpoint, doc, xls ARE NOT knowledge. The words are. Simple lists (ordered/unordered/nested) are amazing organization tools. If you lose the fight to avoid "office files" (and I expect you will), try to get the data in ODF files so people aren't stuck using a specific version of a paid application for access. PDF may seem like a good idea, but it is not.
Options are Xerox Docushare, if you have money, and Alfresco, if you have even more money. The average cost of a free Alfresco deployment is $250K.
However, if you can make a wiki something that people will edit as part of their jobs, and encourage edits internally, great! Don't make anyone use wiki-markup. Only 2% of nerds will, not anyone else. Heck, I set up our MediaWiki system and found the wiki syntax too difficult - I already use other markdown languages and didn't want to pollute my existing knowledge.
Well, I had written out an incredibly long response, on my phone, and then lost it when I went to look something up. You are lucky I actually care about this topic a lot or I would have just blown the whole thing off...
Anyway...
First, a clarification question: Will your users actually be submitting whole documents (whether in the form of a wiki page, html content, a .PDF file, or a word processing document) and THEN supposedly selecting multiple snippets of that document and adding metadata about those snippets? Or, will they be submitting those same documents and then merely adding metadata about the document as a whole? {Though from your description of the kinds of queries you want to be able to do, this does not seem to be what you are doing.} OR, will they be merely entering independent "factlets" of information and you want those to be structured?
Your original question led all of us to believe that the first option is what you want. That is the option that is almost impossible to get done reliably.
I will wait for your answer before writing more.
Oracle has a Semantic layer over their RDBMS that comes with their Spatial package. My only knowledge of it comes from talking to a product manager about it a long time ago. My son works in a shop which uses SMW for their LIMS. I'd say that it's an enterprise application with him as the main developer and a few other non-IT/CS/EE folks writing queries against it. He's like a developer + DBA and everything goes through him and he writes almost all of the forms and reports from requests by his department. I've been fairly impressed with the size of their application that's been developed by a pretty small and mostly non-technical team. He has to fix SMW problems from time to time too. SMW had some rather severe problems with maintenance and I think that the group that used to do it is in Germany. There is a place to get maintenance I think but I don't know how reliable it is. He goes in and fixes problems with SMW when there's a problem that they can't work around. That's part of the fun with Open Source.
You can always convert your data into RDF triples using Jena and store them in Fuseki (like SQL Server but for facts). You can then query the data using SPARQL. However, you need an ontology first to organize your facts. There is an ongoing effort to move US government data to the semantic web (NIH repos for example). This is the vision Tim Berners-Lee has for the future, and he has several vision papers on this subject (the benefits of moving to linked data etc.) which can be a good starting point for convincing people to follow the semantic web principles.
From:
http://radar.oreilly.com/2013/07/why-choose-a-graph-database.html
"Instead of de-normalizing for performance, you would normalize interesting attributes into their own nodes, making it much easier to move, filter and aggregate along these lines. Content and asset management, job-finding, recommendations based on weighted relationships to relevant attribute-nodes are some use cases that fit this model very well."
An example of this is/was Freebase:
https://www.freebase.com/
(Look at the query examples.)
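For anyone who wants to see what "normalize interesting attributes into their own nodes" looks like without installing a graph database, here is a toy sketch in plain Python. It is not how any particular graph database or Freebase works; the `Graph` class, the `hasTopic` relation and the document names are invented for illustration. Topics are nodes rather than text fields, so filtering, counting and "related documents" all become walks along edges.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory graph: edges are stored per (node, relation) pair."""

    def __init__(self):
        self.edges = defaultdict(set)

    def link(self, source, relation, target):
        """Add an edge and its reverse, so it can be walked in both directions."""
        self.edges[(source, relation)].add(target)
        self.edges[(target, "~" + relation)].add(source)

    def neighbours(self, node, relation):
        """Nodes reachable from node over one edge of the given relation."""
        return self.edges.get((node, relation), set())


g = Graph()
# Each topic is its own node instead of a tag string buried in a document.
g.link("doc:install-guide", "hasTopic", "topic:dns")
g.link("doc:install-guide", "hasTopic", "topic:puppet")
g.link("doc:runbook", "hasTopic", "topic:dns")
g.link("doc:faq", "hasTopic", "topic:kickstart")

# Filter: every document about DNS.
dns_docs = sorted(g.neighbours("topic:dns", "~hasTopic"))

# Aggregate: number of documents per topic.
topics = ["topic:dns", "topic:puppet", "topic:kickstart"]
counts = {t: len(g.neighbours(t, "~hasTopic")) for t in topics}


def related(doc):
    """Documents sharing at least one topic node with the given document."""
    found = set()
    for topic in g.neighbours(doc, "hasTopic"):
        found |= g.neighbours(topic, "~hasTopic")
    found.discard(doc)
    return sorted(found)


print(dns_docs)
print(counts)
print(related("doc:install-guide"))
```

In a real graph database the same walks would be written in its query language and backed by indexes, but the modelling decision is the same.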
http://www.thebrain.com/