Slashdot Mirror


Independent Data and Formatting with Microformats

IdaAshley writes to tell us IBM DeveloperWorks is running an article about how to best utilize microformats to embed data within standard XHTML code. From the article: "Microformats are a pragmatic approach to solving the issue of structured data on the Web. Is it as architecturally pure as XML-encoded data separated from its formatting through a mechanism such as XSLT style sheets? No. But I think this approach is a realistic middle step that will help build a more intelligent Web that is easier to use and provides better search and data integration."

10 of 99 comments

  1. Tagging in Text by inKubus · · Score: 1, Informative

    This is just tagging in text; it's exactly what you do for CSS: you're saying this text is of a certain class, and you contain it in a box. All this is doing is using the same stuff, storing a little variable name, and using it later. One might argue you are already doing that with CSS; it's just formatting you're attaching to the variable rather than, ah, data structure.

    I do like the idea of being able to move XML around without having to parse to view the basic file in a formatted fashion. So, you're mixing HTML with a tag. Again, SO WHAT? But what about the encapsulated text, what's the point? If you're going to use a viewer eventually (because you have the encapsulated text), use a viewer. This would only help in reading the actual data, but not in bug fixing, because the XML is that much more unreadable.

    On the other hand, this is kind of like the PDF format, with text as text. The PDF client renders it as a font bitmap, but it's rendered from TEXT in the PDF, so you can do things like cut/paste/etc. This takes it a step further by adding a data structure around it, which allows you to import rows of things. Pretty sweet, I might use this somewhere. I can see it being useful in mobile stuff, so you don't have to muck with a client parser.

    --
    Cool! Amazing Toys.
    1. Re:Tagging in Text by Mr_Tulip · · Score: 4, Informative
      The thing that makes Microformats stand out from homebrew versions is the attempt to standardize the formats, allowing others to easily work out what microformat you are using and integrate them into their own site.

      The article mentions the wiki, but doesn't link to it, except at the very bottom of the resources section.

  2. Re:META headers by Anonymous Coward · · Score: 4, Informative

    Get off your hobby-horse, Jorn. At some point, please realise that you are clueless about markup. Only then will you be able to learn a bit about what you are so high-and-mighty about.

    Firstly, <meta> is an element type, not a header. It doesn't do your credibility much good when you don't even know what it is.

    Secondly, <meta> is an astonishingly limited element type. It's scoped to the page not particular parts of it, and it has a plain-text content model because it uses attributes instead of child elements.

    Thirdly, I anticipate you saying that you could fix this by changing the <meta> element type. Sure you could. You could fix it by changing it to a set of element types that describe content more accurately and changing it so that it could appear in other parts of the document. And you know what you'd have then? The structured HTML that you despise so much. That's right, microformats embody the very thing you are criticising.

    Finally, given that HTML hasn't changed recently to allow microformats, everything that is possible today with microformats was possible five years ago with microformats. It's a design strategy, not a new technology.

    Again, please learn a bit about something before you turn your nose up at it. You might be smart in other respects, but when it comes to markup, you are dumb. Please accept this so you can change it.

  3. Re:Geez, man... by ChaoticChowder · · Score: 2, Informative

    I just wrote a Java program to do all that in one step last week. I even took it a step further and used the Sun classes for parsing HTML and Xerces for XHTML. Anyone who has ever had to do a datamining project knows how to do this. I don't really think this is a big deal at all. Just another excuse to apply a Web 2.0 buzzword to a technique that's been around for quite a while. Tutorials on the web these days are getting to be pretty lame. Maybe I'll write a couple myself, at least I have the chance of being recognized on /.

  4. Re:Standardization is the problem by TedTschopp · · Score: 4, Informative

    So now why is this "vevent" class special, and who decided it would be "vevent" and not "scheduledevent" or "calendarevent" or "microsoftcalendarhassomethingforyoutodotoday"?

    The idea is to leverage standards that are already out there, and in this case it would be the iCalendar standard.
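
    To make the mapping concrete, here is a rough Python sketch of how hCalendar class names line up with iCalendar properties. The sample markup, the HCAL_TO_ICAL table, and the helper names are made up for illustration. Dates are copied through unchanged and the VCALENDAR wrapper is omitted, so a real converter would need more than this.

```python
from html.parser import HTMLParser

# hCalendar class names and the iCalendar property each one mirrors.
HCAL_TO_ICAL = {
    "summary": "SUMMARY",
    "dtstart": "DTSTART",
    "dtend": "DTEND",
    "location": "LOCATION",
}

# Hypothetical sample markup (no void elements such as <br> inside the event).
SAMPLE = """
<div class="vevent">
  <span class="summary">Microformats meetup</span> on
  <abbr class="dtstart" title="2006-10-05">October 5</abbr> at
  <span class="location">Room 101</span>
</div>
"""


class VEventParser(HTMLParser):
    """Collects known properties from a single class="vevent" element."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # nesting depth inside the vevent element
        self.current = None  # property whose text is being collected
        self.props = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self.depth == 0:
            if "vevent" in classes:
                self.depth = 1
            return
        self.depth += 1
        for name in classes:
            if name in HCAL_TO_ICAL:
                if attrs.get("title"):
                    # Machine-readable value carried in the title attribute.
                    self.props[name] = attrs["title"]
                else:
                    self.current = name
                    self.props[name] = ""

    def handle_data(self, data):
        if self.depth and self.current:
            self.props[self.current] += data

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1
            self.current = None


def to_icalendar(html):
    """Returns a bare VEVENT block built from the parsed properties."""
    parser = VEventParser()
    parser.feed(html)
    lines = ["BEGIN:VEVENT"]
    for name, ical_name in HCAL_TO_ICAL.items():
        if name in parser.props:
            lines.append(f"{ical_name}:{parser.props[name].strip()}")
    lines.append("END:VEVENT")
    return "\r\n".join(lines)


print(to_icalendar(SAMPLE))
```

    The point is only that "vevent", "summary", and "dtstart" are borrowed from an existing vocabulary rather than invented per site.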

    --
    Fantasy remains a human right; we make in our measure and in our derivative mode... -- JRR Tolkien
  5. Re:I don't get it... by Karma+Farmer · · Score: 5, Informative
    The class attribute was never intended to be limited to CSS. From the HTML 4.01 specification:
    The class attribute... assigns one or more class names to an element; the element may be said to belong to these classes. A class name may be shared by several element instances. The class attribute has several roles in HTML:
    • As a style sheet selector (when an author wishes to assign style information to a set of elements).
    • For general purpose processing by user agents.
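    As a small illustration of the second role, the Python sketch below indexes elements by class name with no style sheet involved. The ClassIndex name and the sample markup are invented for the example; it only shows that one element can carry a styling class and a data class side by side.

```python
from html.parser import HTMLParser


class ClassIndex(HTMLParser):
    """Maps each class name to the tags that carry it."""

    def __init__(self):
        super().__init__()
        self.index = {}

    def handle_starttag(self, tag, attrs):
        # The class attribute holds a whitespace-separated list of names.
        for name in (dict(attrs).get("class") or "").split():
            self.index.setdefault(name, []).append(tag)


indexer = ClassIndex()
indexer.feed('<p class="highlight vevent">Talk: <span class="summary">Ada</span></p>')
print(indexer.index)
```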
  6. Re:META headers by Karma+Farmer · · Score: 4, Informative
    How much of this could have been done 5 years ago
    All of it. Microformats use features introduced with HTML 4.0 in 1997, so all of this was possible nearly 10 years ago.

    How much of microformats could have been done using META
    None of it. META tags and microformats serve two entirely separate purposes, and neither is in any way a replacement for the other.
  7. Re:META headers by oneiros27 · · Score: 2, Informative
    So I guess I have to ask again: How much of microformats could have been done using META, given that it's scoped to the page (which is no problem for the most important page semantics), and uses attributes?

    Very little. For instance, if I had a full-page calendar display, I couldn't include an event record for each individual event, because META is scoped to the whole page. I'd have to have the person go to a 'more information' link, and then give the event information there. If I wanted them to do that, I could've just given them an iCal file. Microformats allow the semantic markup to sit alongside the format presented to the user. (We would assume that the person wouldn't want to pull down all events from the calendar. Think of something like registering for classes in college, where you might only want one or a few from the full list of events.)

    And many times, even when there is a single event mentioned within a document, it would not be semantically correct to say that the event applies to the entire page. It may only be a section of the page that is relevant to the event (e.g., the front page of a website, with info about a company, and then an upcoming event announcement).
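
    A rough Python sketch of the scoping difference, under made-up assumptions (the PAGE markup, the FIELDS tuple, and the EventLister class are all hypothetical): the page has one META description, but each list item is its own event record, so a reader can pull out just the one they want.

```python
from html.parser import HTMLParser

FIELDS = ("summary", "location")  # properties collected per event

# Hypothetical calendar page: one page-level META, two element-level events.
PAGE = """
<html><head><meta name="description" content="Fall course calendar"></head>
<body><ul>
  <li class="vevent"><span class="summary">Algebra</span>, <span class="location">Hall A</span></li>
  <li class="vevent"><span class="summary">Biology</span>, <span class="location">Hall B</span></li>
</ul></body></html>
"""


class EventLister(HTMLParser):
    """Builds one dict per class="vevent" element found in the page."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.depth = 0     # nesting depth inside the current vevent
        self.field = None  # field whose text is being collected

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth == 0:
            if "vevent" in classes:
                self.events.append({})
                self.depth = 1
            return
        self.depth += 1
        for name in classes:
            if name in FIELDS:
                self.field = name
                self.events[-1][name] = ""

    def handle_data(self, data):
        if self.depth and self.field:
            self.events[-1][self.field] += data

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1
            self.field = None


lister = EventLister()
lister.feed(PAGE)

# Pick a single event out of the full listing.
wanted = [e for e in lister.events if e["summary"] == "Biology"]
print(wanted)
```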

    I personally didn't like the examples given in the IBM article. Some of the past examples that I've seen include embedding semantic detail within a paragraph of text (eg, a movie review), so that different review formats could then be processed in an automated way.

    --
    Build it, and they will come^Hplain.
  8. Wheel of re-incarnation strikes again... by sreekotay · · Score: 2, Informative

    Mixing presentation and data - good... bad... good. But it gets better a little, each time (maybe more of a spiral than a wheel).

    We're using them on aim pages for module development (I cover it a bit here). It's a nice, simple standard, and the idea needed SOME name - don't make more of it than it is.
    -----
    graphically speaking

  9. JSON (Javascript over the wire) by c0d3r · · Score: 2, Informative

    Look into JSON. It's basically JavaScript data structures that you eval on the client. Why bother assembling thick XML that needs to be parsed on the client? XML is slow, and even slower if you have to XSLT it out of the XHTML.
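
    For a sense of what that looks like, here is a minimal Python sketch with an invented payload. It uses the standard json module's parser rather than eval, which is an assumption of this example and not the browser-side approach described above.

```python
import json

# Hypothetical event record as it might arrive over the wire.
payload = '{"summary": "Microformats meetup", "dtstart": "2006-10-05", "tags": ["web", "xhtml"]}'

# The text maps directly onto native dicts and lists; no tree walking needed.
event = json.loads(payload)
print(event["summary"], event["tags"][1])
```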