Microsoft Opening Outlook's PST Format
protosage writes to tell us that Microsoft Interoperability is working towards opening up Outlook's .pst format under their Open Specification Promise. This should "allow anyone to implement the .pst file format on any platform and in any tool, without concerns about patents, and without the need to contact Microsoft in any way." "In order to facilitate interoperability and enable customers and vendors to access the data in .pst files on a variety of platforms, we will be releasing documentation for the .pst file format. This will allow developers to read, create, and interoperate with the data in .pst files in server and client scenarios using the programming language and platform of their choice. The technical documentation will detail how the data is stored, along with guidance for accessing that data from other software applications. It also will highlight the structure of the .pst file, provide details like how to navigate the folder hierarchy, and explain how to access the individual data objects and properties."
Another sign of the Apocalypse - and it's a doozy. I always figured hell would freeze over before Microsoft opened up something like the .pst specs.
"Data portability has become an increasing need for our customers and partners as more information is stored and shared in digital formats. One scenario that has come up recently is how to further improve platform-independent access to email, calendar, contacts, and other data generated by Microsoft Outlook.
On desktops, this data is stored in Outlook Personal Folders, in a format called a .pst file"
Straight from the link in the summary.
What is .pst used for exactly?
The 'PST' or 'Personal STore' file contains the Outlook/Outlook Express Message Mail Box.
It's MS's overly complicated version of a mail spool file.
It's nice, but like everything else related to MS, they wouldn't be doing it unless they had something to gain, and anything good for MS is bad for everyone else in the long run.
If we had actually wanted it, we would have gone ahead and figured it out for ourselves.
Never trust an atom. They make up everything.
Dunno. Think it's something the "need-a-machine-to-run-my-life" types use.
It's the file that's supposed to contain the imbecile users' mail archives and private folders as a local backup when they save them as such. Only thing is, users keep forgetting to put a mark in "include subfolders" and so they lose 3 years of mail when the EEEEEVIIIIIILL supporter shows up to swap their old POS for a new shiny one. BOOOOOHOOOOO.
If you quote this signature there'll be 72 copies of Windows ME waiting for you in Heaven.
Should make migration to Zimbra easier.
Outlook Express never used PST files (but it could import them).
Nerd rage is the funniest rage.
This is yet ANOTHER example of Microsoft's continual battle against the open source community! This company is EVIL and needs to be destroye..
Oh wait.
Remember OOXML? ... http://tech.slashdot.org/article.pl?sid=08/04/21/1821251
But regardless, open is a good thing.
I don't see much use for it though.
Sent from my PDP-11
Um, ok, then explain this
http://kb.mozillazine.org/Import_.pst_files
and this
http://www.five-ten-sg.com/libpst/rn01re01.html
Outlook Express never used PST files
Sorry, My Bad...
It's good to see Microsoft open up the Outlook PST format, if only to improve importing into other mail clients like Thunderbird etc.
But honestly, using the PST format in other applications sounds like a terrible idea to me: those monolithic PST files, which Outlook uses to store mail data, get corrupted easily (at least in my experience), and storing all your email data in one gigantic file always struck me as a really bad design choice anyway.
Attitudes are changing at Microsoft, but they are still a business and can't go 'open' overnight. I don't see much use for opening up PST either... maybe I'm missing something.
I'd wager that Microsoft is willing to do this because the .pst format is becoming irrelevant. Medium and large businesses already want nothing to do with them due to issues with performance and management. That leaves small businesses and a small number of home users. With hosted exchange options becoming more common among small businesses, the need for .pst files is going away very quickly.
What happened to the obligatory tag that gets added on Slashdot to a post about Microsoft "opening up" something, the "itsatrap" tag?
Here are some prime examples:
Microsoft Partially Opens Proprietary XML Format
(mainly because this happened: Microsoft Open Document Standard Not So Open)
Microsoft Releases Linux Device Drivers As GPL
In fact, there are plenty of other examples in the "itsatrap" tag-egory.
Their motive is probably to make money, like always -- and like any business. Even RedHat. Sure, RH may employ kernel devs, Gnome devs, etc., but at the end of the day it's just to make the system that they sell better.
Opening PST means being able to more freely move Outlook data between mail programs such as Evolution. The more interoperable the mail client is, the less it matters if all your engineers are on Linux and all your marketers are on Windows, as this is likely just a step towards being able to have say, Evolution, fully support being able to talk with an Exchange server. If you can get all of the features of Exchange across platforms at the expense of opening specs of a mail client that they don't really make that much money off of anyway, then they'll likely be able to make some more sales of Exchange server.
From a purely technical point of view, that may or may not be optimal, but if every part of the business could tie in with the Exchange server regardless of what operating system they need to run for the rest of their tasks, then it makes it all the more attractive from a business standpoint.
I could just be off base though, but it seems like that is a possible eventuality. This just has to do with data storage I think, but even being able to import contact lists, mail boxes, etc, more smoothly is a good start, I'd say.
the fun part is when they let the pst grow to 1G or so and the file corrupts itself.
"We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
So where's the link to the RFC or other plain text document describing the .PST file?
now trojans and spyware can scan outlook data for private information with ease... but yeah openness also means issues with format will be public and fixable.
With a 1980s-era 3-character file extension, to be sure.
A documented binary format is better than an undocumented one, but it would be better to enable import/export of XML files or some other standard encapsulation.
That's like saying a blimp is an overly complicated way to cross the street.
Try upgrading to a version of Outlook that isn't so archaic. I have 10GB+ PSTs laying around.
Need help treating your acne? Come here!
People who program different migration utilities benefit from this, and of course users of such tools. Even wild ideas like Fuse filesystem that mounts it as Maildir.
So, converters, importers, exporters, indexing tools, repair/forensics, optimize/defragment/find duplicates tools, sort, grep.
Also, if it's a standard, then it needs to be STANDARDIZED, so no special treatment for their own products.
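To make the converter idea above a little more concrete, here is a minimal sketch of the Maildir half of a PST-to-Maildir exporter, using only Python's standard `mailbox` module. The `export_to_maildir` function and the `(folder_name, message)` input shape are hypothetical: they stand in for whatever a real PST reader would yield, and no PST parsing happens here.

```python
import mailbox
from email.message import EmailMessage


def export_to_maildir(messages, maildir_path):
    """Write (folder_name, message) pairs into a Maildir tree.

    `messages` is a stand-in for the output of some PST reader
    (hypothetical); each folder name becomes a Maildir subfolder.
    Returns a dict mapping folder name to number of messages written.
    """
    root = mailbox.Maildir(maildir_path, create=True)
    folders = {}
    counts = {}
    for folder_name, msg in messages:
        if folder_name not in folders:
            # Creates the ".<folder_name>" subfolder under the root.
            folders[folder_name] = root.add_folder(folder_name)
        folders[folder_name].add(msg)
        counts[folder_name] = counts.get(folder_name, 0) + 1
    return counts


def make_message(subject, body):
    """Build a small test message (illustrative helper)."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg.set_content(body)
    return msg
```

Nested folder hierarchies, calendar items and contacts are deliberately left out; a real tool would need a naming scheme for subfolders and a decision about non-mail objects.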
Sorry, but I don't see any evidence of Microsoft's attitude changing.
I hear lots of talk and activities such as the Codeplex Foundation, but scratch a little under the surface and it all looks like more of the same old microsoft: crush competitors, destroy alternatives to Microsoft dominance on the desktop, make tactical partnerships and strategically ruin the partner.
Basically when Microsoft holds out the hand of friendship, first check if there's a knife in the other hand.
First thing many corporations turn off is the ability to save mail in PST files. One of the better Group Policies IMHO.
Make your named socket a .pst file and outlook can access your real email database through the defined interface.
Nice and spiffy and you don't end up tied to the Microsoft format.
the most witty comment I've ever read on slashdot. thank you for a hearty laugh :)
Asking people to think is like asking them to buy you a new car
Try upgrading to a version of Outlook that isn't so archaic. I have 10GB+ PSTs laying around.
Seriously? And this is brag-worthy how again?
"A government is a body of people usually -- notably -- ungoverned." -Shepherd Book
Embrace
Extend
Extin... oh wait
A documented binary format is better than an undocumented one
As long as
A) the documentation describes the stuff that exists in the real world, rather than what it would look like in some alternate universe (as is MS's usual tactic.)
and
B) the documentation isn't a bunch of "OOXML"-like "implement this like Outlook 97 did"
"I have 10GB+ PSTs laying around."
Ah, it's you who has been eating up all the cycles in the Cray.
NO SIG
You have to upgrade to Outlook 2007 to get sane behavior, though. For some odd reason, Outlook 2003 refuses to use the 2k3 PST format when doing anything BUT Exchange. If, e.g., you are communicating with an IMAP server, it still uses the old Outlook 97 format for its cache file. This means that if you're say, the kind of person who might like to move large amounts of email around (e.g., IT person or other mail administrator), you cannot use Outlook 2003 unless you want to remove and re-add the IMAP account to Outlook every few minutes. Outlook 2007 fixes that one, but it took, what, 10 years to fix it?
In Outlook 2007, you still can't select an entire mail folder (where message count exceeds something like 300 messages) and expect to move that mail to another folder. Outlook complains that it is out of memory. This bug has been in Outlook forever. This is a joke-- a freshman CS student should know how to solve that one.
Outlook 2007 made my machine grind to a halt. So I'm back on 2003, because I HAVE to use Outlook at work.
You forgot about the lazy admin that painstakingly documents the step by step process (including screenshots) of creating and maintaining PST files for users but neglects to inform them to never ever EVER copy a PST file that is open in an Outlook session. Oh and my favorite scenario is when there are hundreds of PST files on a single file server that open with Outlook and remain open/active all day/night without forced logoff policy. I love when I stumble on admins that recommend that they use the PST on a network share so they can be "backed up".
"Keep at least 3-6 full bottles of hard alcohol on hand, a 2 week resignation notice,..." - Poetmatt
What? .pst is a import/archive format, it has absolutely no relation to Evolution talking to Exchange.
I have an .ost file on my laptop you insensitive MS clods. Does this great revelation include them?
Some mornings it's hardly worth chewing through the restraints to get out of bed.
People have been creating plugins for $10 a pop to do this for nearly a decade now. How about instead of opening a broken format, they open up some Exchange connectivity so that we can use a proper mail client (ie: NOT Outlook) with Exchange? TBird comes to mind. I know that there are workarounds, but why should one mail server be married to one mail client?
Firstly, the reason it saves it in "Such an old format" is because it is the least common denominator. ODBC for data storage if you will.
Nextly, the "bug" is likely the way the underlying libraries handle the situation, not the application itself. That doesn't make it any better, but MS is great about keeping old code around. Nobody has probably looked at that code in literally years. No freshman CS student is going to be able to outcode management's blind decisions.
Lastly, no one in 2009 is still plagued with these problems unless they live under a rock. Unfortunately, there are still way too many of them.
"When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
Yes, folks who are running older versions of Outlook do indeed have a 2 GiB limit - and it will approach and hit that limit without warning and corrupt. Newer versions of Outlook have a PST format that doesn't have such a limit, but even with the newer Outlook executable if the users are still using files in the old format the 2 GiB limit applies. For new files, in the last couple versions of Outlook - not so much.
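For anyone wanting to check which kind of file they have before it hits the ceiling, here is a rough sketch in Python. It assumes the header layout used by libpst and by the format documentation Microsoft published later: a `!BDN` magic at offset 0 and a 16-bit little-endian version field at offset 10, with 14 or 15 meaning the old (ANSI) format and 23 or higher the newer (Unicode) one. Treat those offsets and values as assumptions to verify against the spec; `pst_format` is a made-up name.

```python
import struct

ANSI_VERSIONS = {14, 15}     # assumed: old format with the ~2 GiB ceiling
UNICODE_MIN_VERSION = 23     # assumed: newer large-file format


def pst_format(header: bytes) -> str:
    """Classify a PST from the first 12 bytes of the file."""
    if len(header) < 12 or header[:4] != b"!BDN":
        return "not-a-pst"
    # Version field: little-endian unsigned 16-bit integer at offset 10.
    (version,) = struct.unpack_from("<H", header, 10)
    if version in ANSI_VERSIONS:
        return "ansi"
    if version >= UNICODE_MIN_VERSION:
        return "unicode"
    return "unknown"
```

Usage would be along the lines of `pst_format(open(path, "rb").read(12))`; an "ansi" result is the case where the size is worth watching.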
Outlook personal data files (*.pst) hold the archived data. The copy of the Exchange database is in the *.ost file. Let Microsoft release that file format and we might be able to replace the Exchange data store.
zenray
To update to Thunderbird, or Pronto like I use. It's particularly useful for business users wanting to migrate off Outlook and have access to a decent code monkey.
Understanding the scope of the problem is the first step on the path to true panic.
And the iconoclastic tree of RMS bears another fruit. You can bet that without the pressure exerted by free and/or open source software and its advocates this would never have happened...
(I now await moderation punishment for having mentioned the name of him who is not to be named...)
The hard limit is 2GB, but you can have amusing things happen at any size. Heck, use Outlook for IMAP and it's pretty much guaranteed to corrupt your IMAP store's PST. The recommended solution? Exchange Server.
The part that galls me, though, is how users gasp "But how could this possibly happen?" and then get really twitchy about attempting to fix it. People place, I feel, too much faith in computers in general, but Outlook has an incredible white-knight reputation. It's literally the Teflon application: no matter how much it fucks up, corrupts data, gets compromised, fails to run, stalls, kills small children, etc., its reputation cannot be besmirched among the general population. I'm boggled, I really am.
--srj/mmv
I thought that, until I joined an organisation that used Lotus Notes.
PST oh how I miss thee.
Sara
Designer, Gamer, Macgrrl in an XP World
So now we can write open source tools to fix corrupt PST files!
Don't even think about doing anything open source with PST files, until you have a tool to fix the files when they go corrupt.
Outlook Express uses .DBX as its file format.
And .PST includes contacts as well as mail.
Not really, Exchange data is still in (ugly) JET databases on the server, and the OST is only the Offline Folders on the client (for when Cached Exchange Mode is enabled).
The Exchange RPC and EWS specs are open, so you could just implement those and dump the contents in an SQLite database or something instead.
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".
No, it's because PST is of no value. In Microsoft's opinion, "disk is cheap. Give your users bigger mailboxes already".
"Data portability has become an increasing need for our customers and partners as more information is stored and shared in digital formats. One scenario that has come up recently is how to further improve platform-independent access to email, calendar, contacts, and other data generated by Microsoft Outlook.
As a Linux mail admin, I'm excited that there may soon be a possibility for Dovecot to deliver mail directly into a 2 GB .pst file sitting on my mail server because the PST format*snort* is so*choke* superior to maildHAHAHAHAHA! Sorry--I couldn't keep a straight face.
There's no place like
I avoid windows at work but guys who VPN in from home have to load that two gig file across the WAN just to check their mail. I tell them to use rdp or another remote desktop instead.
http://michaelsmith.id.au
Hahaha, yeah SQLite better than Jet Blue, *laugh*. Sorry, but while Exchange data store corruption CAN occur, it doesn't very often, and hasn't much at all since Exchange 2003. In fact, for systems that lack the care and feeding that a DBA gives to your typical large-scale database, I think they do remarkably well. I have a love-hate relationship with Exchange (I've been certified since 5.5), and while I did on occasion get called in to clean up a nasty Exchange-related problem, I only had to deal with two major data problems in all those years. I've seen much worse problems with Notes, Groupwise, and Maildir-based solutions than I have with Exchange.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
I have seen massive PST files, and though they seem to work, I don't want to be the one to try and recover them once corruption sets in. Also, sometimes Outlook (2007) takes forever to open them. One of our executives never closes her mail only for that reason.
There's a little-known piece of middleware from IBM called DAMO: Domino Access for Microsoft Outlook. Domino is the server behind the loathsome Notes client. Basically it maps Notes fields to a PST. Then you can pull all your Notes email and calendars into an Outlook .PST. You'll need to pay IBM $100 for the privilege and they're not going to support it for much longer, but if you hate (hate, hate, hate) Notes and need to hit a Domino server, this is cool. For me it's been $100 for three years of sanity for my PIM and no need to deal with Notes. Even the latest 8.5 version seems to be a bunch of badly done Java emulating Outlook.
If you go this route, stay under the radar and don't hip the IT guys to what you're doing. Unless they're particularly eagle-eyed they probably won't notice what you're up to. You among thousands of users. They don't have the time. Don't ask for support from them. Figure it out on your own. Get into the VPN, figure out the IP address of your email server and keep your notes id handy for when the prompt asks. Expect it to take a little fiddling and do lots of backups.
Firstly, the reason it saves it in "Such an old format" is because it is the least common denominator. ODBC for data storage if you will.
So what? You should have the option to save in (or default to) the new format without using Exchange 2k3.
Nextly, the "bug" is likely the way the underlying libraries handle the situation, not the application itself.
So what? MS can't version libraries now?
Reboot macht Frei.
the "bug" is likely the way the underlying libraries handle the situation, not the application itself. That doesn't make it any better, but MS is great about keeping old code around. Nobody has probably looked at that code in literally years. No freshmen CS is going to be able to outcode management's blind decisions.
Actually this should be possible. If you can move ONE message, then the freshman should be able to fix the program so that it divides the selection up into individual messages and tells the database to do each move independently. Unless a single message is too big, this will work.
I think this is what the original poster was getting at, and I agree it should be possible to fix this bug no matter what the back end is doing.
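The fix being described is ordinary batching. A minimal sketch in Python, assuming some backend call that can move a small list of messages without running out of memory; `move_batch` is a hypothetical callback here, not any real Outlook or MAPI API:

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunks(items: List[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def move_in_batches(
    message_ids: Iterable[T],
    move_batch: Callable[[List[T]], None],
    batch_size: int = 100,
) -> int:
    """Move a large selection as a series of small, independent moves.

    `move_batch` is whatever the mail store offers for moving a short
    list of messages (hypothetical). Returns the number of messages moved.
    """
    moved = 0
    for batch in chunks(list(message_ids), batch_size):
        move_batch(batch)
        moved += len(batch)
    return moved
```

With `batch_size=1` this degenerates into exactly the one-message-at-a-time approach from the comment above; larger batches only trade round trips against memory.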
And calendar, and tasks, and memos, and pretty much any kind of data you managed with Outlook.
Personally, Outlook is not my favorite e-mail client, but for integration of PIM features with email, I've yet to find something that works as well as it does.
No sig
Running SQL Server (Express or otherwise) on the desktop is a BAD IDEA. Trust me, I used to support an app that did just this. SQL Server is not designed for the desktop environment and, if you can even get it to install, it causes more problems than it solves.
An embedded database, like SQL Server Compact, might be a good idea, though.
Companies like Zarafa will benefit from this if they act quickly enough. It's already a valid alternative to Exchange. If they allow you to import .pst mail archives it'll be so much easier to make the switch. Well, not easier, but surely a lot fewer complaints from users.
You say that as though you think Microsoft is unique. That's the general attitude of most successful businesses. Their shareholders don't really care who they "backstab" if it takes care of the bottom line. It's not like that hasn't gone both ways throughout history.
For instance, while the crowd around here celebrates Dell installing Ubuntu on their laptops... that's Dell backstabbing Microsoft. Of course, MS is always the "bad guy" so presenting them as the victim is frowned upon.
Or maybe Intel refusing to upgrade the graphics on many of their platforms to comply with the "Vista ready" status, just so they could make a couple extra bucks while screwing MS. I know, I know, the horror that someone could try to take an unbiased view of the situation!
Quick, someone write an app that syncs my contacts from Outlook 2007 (or WM6.1/6.5) to Thunderbird... Google's buggy Exchange Activesync implementation is driving me bat shit crazy. :(
It isn't, but what's surprising is how EASY it is to hit 10GB with a PST file - you'd think that the incredible slowness in Outlook was caused by some sort of mega compression that reduces the file size to a tenth of what it was, but nooooo, I guess it's just a feature - gotta have time to drink some coffee and have a donut while switching between IMAP folders...
And you require the latest version in order to handle files this large? The only thing stopping unix machines from 20+ years ago having gigs and gigs of mail is the physical capacity of the disks.
http://spamdecoy.net - free throwaway anonymous email - avoid spam!
I wonder if someday they are going to grow some balls and actually make their own (Microsoft) stuff compatible. Because Entourage, which is the Mac Outlook, sucks. It can't open PST files. It is a pain in the ass to migrate mail from Outlook to Entourage.
I have successfully used libpst (http://www.five-ten-sg.com/libpst/) to import pst files. I cannot remember since when, but it was more than a year ago at least.
So this was already possible (and not thanks to them, by the way).
Natxo Asenjo
Please dear god someone now save me from Entourage hell and rewrite something that works...
I think therefore I can't be ~TTNH
Or maybe Intel refusing to upgrade the graphics on many of their platforms to comply with the "Vista ready" status, just so they could make a couple extra bucks while screwing MS.
I was in the middle of preparing this huge response with background, citations, the lot. But then it dawned on me that you are a crazy person. Good luck with that.
But obviously this is not the case. Otherwise, you'd see things like Asp.Net run well on an Apache stack (without extra mods), a version of SQL Server native to Linux, XBoxes that share media via a standard network interface, Zunes that sync using standard USB mass storage, MSN Messenger for Macs, etc etc.
[...] laying around.
Yeah, have you tried actually using them?
Exactly.
Brain surgery - it's not rocket science!
Maybe next they should fix Outlook to send attachments in the standard way, rather than embedding them in .tnef files.
Cut that out, or I will ship you to Norilsk in a box.
Mod me down for being cynical, but: given MS's history, the fact that no release date is given, and that presumably they just have to edit/issue an already internally documented format, it just sounds like hot air aimed at naive users and EU legislators.
One big use I can see is a PST rebuilder. MS tells you to copy anything you want to keep out after repairing a corrupt PST with scanpst, but I've found out the hard way that sometimes Outlook can read a mail in a PST but will fail when it tries to copy it to another PST.
note: I'm known as plugwash most places but I screwed up registering that here somehow in the past and now can't register
In my experience the "fixed" pst files, while just about readable, have a lot of problems. Things like lots of mail being dumped in one folder, and being able to view mails but not copy them out of the bad pst files.
Has this improved since Outlook 2K (the last version I used)?
Or all this has been possible for quite some time.
Evolution has been able to load pst files for 6 months now, and has MAPI support with the openchange plugin.
All this really means to us users is they are probably not going to sue us over it.
excuse me while I do some cartwheels down the hall...
Cheap storage VM.
There is nothing to prevent your custom apps from spreading that data over multiple different .PST files and then displaying it as if it were one database.
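As a rough illustration of that idea, here is a sketch in Python of presenting several stores as one list. Each store is modelled as a plain list of `(timestamp, subject)` tuples already sorted newest first; that shape, and the `merged_view` name, are assumptions for the example and have nothing to do with how a .PST is actually laid out.

```python
import heapq
from operator import itemgetter


def merged_view(*stores):
    """Lazily merge several newest-first stores into one newest-first stream.

    Each store must already be sorted by timestamp, largest first.
    Nothing is copied up front; items are pulled from each store on demand.
    """
    return heapq.merge(*stores, key=itemgetter(0), reverse=True)
```

Because the merge is lazy, a viewer could show the first screenful of a very large combined mailbox without reading every underlying file to the end.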
I have been using Outlook for over a decade. I synchronize the data between a desktop and a laptop as well as hotsync it to a Palm Pilot (Sony Clie NR70v). I have only had one or two times when the database got corrupted. I just restored from the copy that was on my laptop and was on my way.
I think this is a great thing as I have no intention of giving up Outlook any time soon but I am about to give up on the Palm OS platform and move on to Android. Unless Chapura comes out with an Android version of Keysuite, I will likely have to create my own apps for Android and was concerned about figuring out the Outlook .PST format. Now I can get it straight from the horse's mouth. Sure, I know MS isn't always completely forthcoming. Duh. But it is better than relying on hacking my way through from scratch.
You wish.
I have moved thousands upon thousands of messages from one folder to another.
It took freaking forever, but it worked. Definitely not the best-supported scenario though.
Moving messages from one PST to another, now THAT is painful! Functional but painful.
(I have in excess of 30GB of PST files total, I may have hit almost every single limitation in the PST format so far, including Outlook 2k7's 20GB PST limit)
Which hasn't been an issue if you installed updates that were 5 years old or more at this point.
And the size was 2G, not one.
Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
As you rightly point out, if you're already working in a live Windows / Outlook / Exchange environment, then having the PST format spec is largely irrelevant.
However, There are huge amounts of government data locked up in PST files. I work in digital preservation - our use case is to profile the information contained in these files, both to determine if their contents should be preserved for accountability and historic interest reasons, and to be able to process their contents on a server and apply preservation policies to them.
So roll on the PST spec - it will be a major help in unlocking government data, assuring accountability, and preserving it for the ages. When so much vital information is locked up in an undocumented format, there are always lots of use cases that will be important to a significant group of people. They may be corner cases to you, but they will be vitally important to others.
No, the size was 1G back in the day. Judging from the comments here, even that isn't safe with the current version of the file. Best to use a DB/spool if you need a lot of mail kept around.
"We returned the General to El Salvador, or maybe Guatemala, it's difficult to tell from 10,000 feet"
I don't really hate Exchange at all. I use it myself, and loathe all the constant bashing about "constant Exchange DB corruption" (seriously, I've never seen it happen). But you can't say (earlier versions of) JET is a good format. I still reckon they should let you stick it in MSSQL databases.
Also, the SQLite was a joke, though OSS folks would I'm sure love to do that.
People using Entourage (Outlook (lite) for Mac) had to live with that surprise when their OS X Leopard with Time Machine went insane with 1-2 GB backups hourly. Some didn't figure out what was going on until TM started to delete old backups for space. MS, as usual, didn't even bother to tell the people using the "enemy OS". I think they still have to exclude their mail from backups and use a different application for backing it up.
Apple went from mbox (flat) to one file per mail almost instantly when they enabled Spotlight, since Spotlight is not suitable for a single-file mailbox. The Opera guys did the same in recent versions for safety, indexing and backup reasons. It is only MS Entourage, which is amazingly expensive, that doesn't offer that choice.
Nonsense. Slashdotters like a laugh as much as anyone. Except perhaps my wife, and most of her friends.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."