Organizing Data Across a Heterogeneous Net?

Database and rsync+ssh by ddkilzer · 2002-05-31 05:26 · Score: 3, Informative

Without knowing more about the type of data you're storing, I would recommend putting it in a database. I like PostgreSQL 7.x myself.

For the software, I would organize it in a directory structure and use rsync+ssh to mirror it as needed.
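
A rough sketch of the mirroring step, in Python only for illustration. The function names, the remote name and the paths are made up; the rsync options used (-a, -z, -e ssh, --delete) are the standard ones, and this assumes rsync and ssh are installed on both ends.

```python
import subprocess


def rsync_mirror_cmd(src_dir, remote, dest_dir, delete=False):
    """Build an rsync-over-ssh command that mirrors src_dir to remote:dest_dir."""
    # -a preserves permissions and times, -z compresses, -e ssh picks the transport
    cmd = ["rsync", "-az", "-e", "ssh"]
    if delete:
        # --delete removes destination files that no longer exist in the source
        cmd.append("--delete")
    # A trailing slash on the source copies the directory's contents
    cmd += [src_dir.rstrip("/") + "/", f"{remote}:{dest_dir}"]
    return cmd


def run_mirror(src_dir, remote, dest_dir, delete=False):
    """Run the mirror; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(rsync_mirror_cmd(src_dir, remote, dest_dir, delete), check=True)
```

Something like run_mirror("/srv/software", "me@otherbox", "/srv/software") could then be called by hand or from cron.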

For backup software, use Amanda.

For file sharing, use Samba.

'Nuff said.

Re:Database and rsync+ssh by anwnn · 2002-05-31 05:29 · Score: 3, Informative

the one problem with rsync/ssh, for Mac OS X at least, is that it will munge the resource forks of most files. for some this isn't a problem, but if you do a lot of mac work, on files with forks, then it most definitely is. setfileinfo can help sometimes, but that can be rather tedious.
Re:Database and rsync+ssh by angst_ridden_hipster · 2002-05-31 05:39 · Score: 4, Insightful

I currently use rdiff-backup for backups, which is pretty cool. I probably should have mentioned that.
Unfortunately, much of the data I have is not sufficiently structured for an RDBMS. To be more specific, I have about 5 GB of digital photographs / scanned negatives, 1 GB of email archives, 1 GB of various and sundry text files, 100 MB of assorted MS Office-type documents, 100 MB of source code (only about half of which is in CVS), 500 MB of web site source material (Photoshop files, HTML, etc).
So I figure that the filesystem is the best database for this kind of information. But I could well be wrong!

--
Eloi, Eloi, lema sabachtani?
www.fogbound.net
Re:Database and rsync+ssh by Oculus+Habent · 2002-05-31 09:06 · Score: 4, Informative

Well, there are several angles to look at. I'm going to hazard a few guesses at the situation, and hopefully I won't be too far off.

Accounts: You mentioned many accounts, so part of the problem could be different users on different boxes (not saying that you don't know, just that I don't). It's initially easier to use groups to clear up these issues, and tackle account changes later. Create some extra users to make usernames match from box to box, and then group them together so they all can access the appropriate files. This still leaves room for account name matching later.

File System Uniformity: Some people will probably think this is an awful solution, but if you use a single directory (like /mnt) and mount/link everything to identical naming on each box, you won't have the location problems. Sure, it's cyclical to have / linked to /mnt/mylinuxbox on your linux box, but you will always know that your MP3s are in /mnt/mylinuxbox/mp3 (or wherever the hell they are).

Remote Access to your Filesystems: I'm not really qualified for this one, but the NFS/SSH combo is secure and tried. If you don't mind the at-home network traffic, you can make life easier by mounting everything on one computer, and then mounting that one computer remotely. Not recommended for heavy use, but it's easier than managing four connections.

Mirroring is OK if you have specific, regular downtime that the computers can spend, or you have an OC-3 from home to work and great drive access times. The problem mirroring can present is synchronization lag. Unless you specifically set up your mirroring to sync ASAP, what will you do if you make it home before your data does? Live access does two things: you only transfer the files you need, and you don't have to worry about sync'ing. Plus, what's the point of the Internet if not to make information available? : )

Organization: I've been re-organizing my files for years now, and the best thing I've done for most files is to just simplify. I used to make subdirectories for everything. Just recently I have realized the real intent of the "filing cabinet" metaphor...

Filing cabinets are only ever four layers deep. Department (what the cabinet is for - cabinets and drawers are physical limitations, not part of the concept), Group (Hanging Folders), Project (Manila Folders) and then files. Sure, you may end up with a lot of "Groups", but that is what alphabetization is for.

Mind you, I haven't managed to change over all of my filing systems to this format. It takes time to sit down and think about what should be where. But it seems (at least to me) like a good thought for personal file organization.
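
To make the four-layer idea concrete, here is a small sketch in Python. The names filing_path and too_deep, and the example folder names, are mine and purely illustrative; the only rule taken from the above is department / group / project / file.

```python
from pathlib import PurePosixPath

MAX_DEPTH = 4  # department / group / project / file


def filing_path(department, group, project, filename):
    """Build a path that follows the four-layer filing cabinet layout."""
    return PurePosixPath(department, group, project, filename)


def too_deep(paths, max_depth=MAX_DEPTH):
    """Return the relative paths that have more than max_depth components."""
    return [p for p in paths if len(PurePosixPath(p).parts) > max_depth]
```

Running too_deep over a listing of relative paths gives a to-do list of files that still sit in the old, deeper layout.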

Good Luck.

--
That what was all this school was for... to teach us how to solve our own problems. -- janeowit

Hey, thats my question by SirSlud · 2002-05-31 05:27 · Score: 3, Insightful

I've been thinking of tackling this problem for awhile too. The best I can do is that you abstract the 'directory' (the list of what you have), for replication, accessibility (with convenience as the priority, especially). Then, when you need to do something with that data, your directory knows where it is and how to get at it. In this case, the convenience of accessibility isn't as crucial, and thus the need to transparently glue all these platforms and protocols, etc together isn't quite as important.

For me, I'd just like a top down, real time view with convenient access of what I have - getting it anywhere and anytime isn't quite as crucial for me.

Maybe you make a little daemon that can monitor your data repositories at several sources and 'merge' the data listings at a central source for publishing to multiple sources again?
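
Just to sketch what the 'merge' step might look like (Python, with invented function names, and ignoring the actual daemon and transport parts): each source produces a listing of its files, and the central side folds them into one index that records which sources hold each path.

```python
import os


def list_repository(root):
    """Return {relative_path: size_in_bytes} for every file under root."""
    listing = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            listing[os.path.relpath(full, root)] = os.path.getsize(full)
    return listing


def merge_listings(listings):
    """Merge {source_name: listing} into {relative_path: [source_name, ...]}."""
    merged = {}
    # Sorting the sources keeps the per-path source lists in a stable order
    for source, listing in sorted(listings.items()):
        for path in listing:
            merged.setdefault(path, []).append(source)
    return merged
```

The merged index only says where things are; fetching a file would still go through whatever protocol each source speaks.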

--
"Old man yells at systemd"

Re:Hey, thats my question by SirSlud · 2002-05-31 06:11 · Score: 2

> How hard is it for you to transfer your kiddie porn pics from linux to mac to amiga or whatever the hell you use.

Pretty tough when you're confined to doing it all with one hand. With your level of insight, I'd have thought you could have figured that out!

--
"Old man yells at systemd"

not enough info by gmhowell · 2002-05-31 05:27 · Score: 2

Not enough info to answer the question. How much data total? How much needs to go out of the house? Do you want common accounts to various machines? What machines do you use most? What kind of data are you storing/want access to? What is your backup medium, what os is it linked to?

--
Jesus was all right but his disciples were thick and ordinary. -John Lennon

Re:not enough info by SirSlud · 2002-05-31 05:30 · Score: 2

I almost think he's wishing (probably fruitlessly, but hey, lots of software started out sounding like it was going to attack impossible, lofty goals) that this solution would be independent of those details. I think he wants to slap down a data listing/retrieval abstraction layer over his real deal, so presumably something with an extension-like feature that could let you plug your data repository listing/searching software into new platforms/protocols as they develop.

Sounds like an intriguing idea, something I'd almost consider trying to hack around with ..

--
"Old man yells at systemd"
Re:not enough info by angst_ridden_hipster · 2002-05-31 05:45 · Score: 2

Sorry. To answer the questions:

How much data? I've got about 5 GB of digital photographs / scanned negatives, 1 GB of email archives, 1 GB of various and sundry text files, 100 MB of assorted MS Office-type documents, 100 MB of source code (only about half of which is in CVS), 500 MB of web site source material (Photoshop files, HTML, etc).

How much of it do I normally need at any given site at any given time? Not much. But when I need it, I want it available.

Common accounts? Yes, when I can manage it. Unfortunately, I don't have absolute control of all the machines, so I have to have "similar" accounts on some.

Use most? Depends on the task. I tend to pretty much round-robin.

What kind of data? See above.

Backup Medium? Hard drive in a spare Debian Linux box, using rdiff-backup.

--
Eloi, Eloi, lema sabachtani?
www.fogbound.net
Re:not enough info by gmhowell · 2002-05-31 06:25 · Score: 2

It's at this point I should mention I'm not a techie, I'm a shirt. But one who knows the right questions to ask:)

That said, I would probably put as much stuff as possible on the Debian box. I assume you have total, or near total, control over this box. Set up methods of choice for access, and set up appropriate aliases for outside accounts.

Given the mix, you are going to be limited in protocol. If you can, I'd consider Amanda for backups (put server on box with tape, put clients on other machines).

Good luck getting useful answers.

--
Jesus was all right but his disciples were thick and ordinary. -John Lennon

Sharing data across machines and locations by PhysicsGenius · 2002-05-31 05:28 · Score: 2, Funny

I put my porn collection on a couple of Jaz drives and just carry them around. I call them my "jiz disks".

What about Amoeba? by raddan · 2002-05-31 05:28 · Score: 2, Interesting

I was browsing through the XFree86 changelog yesterday, and I noticed that they had dropped support for "Amoeba". What is Amoeba?, I thought.

I found this on google: Amoeba WWW Home Page

This seems to me to be a unique way of sharing data, since it isn't machine centric. Rather, it focuses on the user and the user's data. I have no experience with Amoeba, but on the face, it seems to answer this person's question.

My question is this: Why has interest for Amoeba dried up? (Or has it?) What with the proliferation of alternative OS'es over the past few years, why hasn't Amoeba caught on?

AFS? by alsta · 2002-05-31 05:28 · Score: 5, Informative

IBM has released Transarc's AFS as OpenAFS (http://www.openafs.org). Don't know if that is what you're looking for, but it is pretty nice. It's also portable, so it runs on various unices as well as Windows. Most can be found as binaries if you don't want to roll your own.

AFS is an NFS style implementation though, so you would have to save your files onto a special mount.

--
Wealth is the product of man's capacity to think. -Ayn Rand

Re:AFS? by sparcv9 · 2002-05-31 06:34 · Score: 3, Insightful

AFS is an NFS style implementation though, so you would have to save your files onto a special mount.
No, AFS is a global file system (a.k.a. a distributed file system). NFS shares can be mounted anywhere in your directory hierarchy, but afs space is always found under /afs. The AFS client software automatically "mounts" your home cell's filespace under /afs/<cellname>/. With a published CellServDB file (a list of other organizations' AFS servers) or the new DynRoot feature of OpenAFS (DNS record type AFSDB is used to locate a cell's servers), you have instant, transparent access to the entirety of public AFS space, as well. Transarc's cell can be found in /afs/transarc.com, MIT's found in /afs/mit.edu, CMU's found in /afs/cmu.edu, etc. -- all completely transparently.

--

This is not a Fugazi .sig
Re:AFS? by C+Joe+V · 2002-05-31 07:23 · Score: 2, Informative

I use OpenAFS between work/school and home. It is very convenient to access at work, where a fast Ethernet connects me to the AFS server, but quite slow from home over DSL. Examples: when Emacs auto-saves to AFS, I have to stop typing for a while; I try hard to avoid compiling things (or running TeX) where the code is in AFS; when I kept my email in AFS, sylpheed took a really freakin' long time to scan my inbox and was much slower to incorporate new messages than it ought to have been.

Also, I was frustrated by the process of compiling OpenAFS for my Mandrake 8 box (GCC version crap), and if I ever try to mount AFS when anything is wrong with the network, I know I am in for a serious crash later on. Perhaps these are just my fault, of course.

Hope this helps.
Re:AFS? by mpb · 2002-05-31 08:17 · Score: 2

Yes! AFS is excellent for this.
I work in a multi-national team
we use AFS to share access to one filespace
using AIX, Linux, Windows.
IMHO, it's simply brilliant.

The AFS transport protocol "Rx" is optimised
to work well over WAN. It's definitely *not* NFS
and has a whole bunch of systems management tools.

A very neat thing about AFS is that it is scalable - it can be grown to meet your needs dynamically: you can add new servers to your AFS cell and move data between servers "live" with no outages.

You can also use RAM cache instead of disk cache
for faster access to cached files.

Love it love it love it

AFS is dependent on your network which needs
to be reliable and fast.

IBM sells the original Transarc version
and now you can also have access to the OpenAFS
source ( http://www.openafs.org )

"The universe is full of magical things patiently
waiting for our wits to grow sharper." --Eden Phillpots

Use the fish by aldjiblah · 2002-05-31 05:31 · Score: 3, Interesting

From the kio_fish homepage:
"kio_fish is a kioslave for KDE 2/3 that lets you view and manipulate your remote files using just a simple shell account and some standard unix commands on the remote machine. You get full filesystem access without setting up a server - no NFS, Samba, ... needed."

It works through SSH, so everything is encrypted.

I use this with the konqueror file browser, but all KDE apps can transparently access files on remote hosts using this amazing utility, which required no special setup on either end, at least on my systems.

Solved all my data sharing needs - and andromeda solved the rest :)

--
sig sig sputnik

It's called a server by FozzTexx · 2002-05-31 05:31 · Score: 5, Insightful

What you need is something known as a "server." A server is where you can store all your files, and in some cases, account information.

With the right kind of server, it can do AppleShare, NFS, and SMB, allowing all your other machines to mount the shares and make them appear as local drives. This keeps all your data in one place, allowing for easy backups, and also makes it easy to get at the same files from any computer.

My personal preference is a Linux computer with several cheap IDE drives each on their own IDE controller (no slave drives). The drives are configured as software RAID 5 and ext3. Regular backups are set up through cron to a tape drive. Samba handles file sharing, printing, roaming profile, and PDC duties for Windoze. Netatalk 1.6cvs handles file sharing duties for pre-OSX systems. NFS is used for file sharing to *nix systems. The only thing I'm missing is a NetInfo daemon for Linux so it can act as a complete configuration server for NeXTSTEP, OPENSTEP, and MacOS X systems.

Re:It's called a server by tzanger · 2002-05-31 07:56 · Score: 2
The drives are configured as software RAID 5 and ext3.

Ext3 on a production server? You're braver than I... it's still an experimental FS (from 2.4.18 make config):
- Ext3 journalling file system support (EXPERIMENTAL)
IDE will work fine for most small offices and you've got the "no slave devices" right for RAID. My personal preference for small offices is two large IDE disks in RAID1 with a tape backup.
Re:It's called a server by mprinkey · 2002-05-31 09:17 · Score: 3, Informative

I also agree that a server makes the most sense. I would amplify these recommended transport mechanisms to include a few others that will allow remote connectivity.

First is a secure IMAP server for centralized email. This will allow any SSL-enabled IMAP client to access your mailbox. Also, Squirrelmail running on an SSL web server can give you access to your centralized mail repository from any web browser.

SMB and NFS are the obvious choices for LAN-based access, but WAN access needs more care. I think that a VPN setup using CIPE is a good approach. Once the CIPE links are built, you can use most services as if you were located on your wired LAN.

The other need might be for file access from "arbitrary" locations. In addition to the normal scp and sftp apps in OpenSSH, there is a nice SCP client for windows, WinSCP. Lastly, if you have a SSL web server there already, Web-FTP will give you access to your files via https.

This sounds like a lot. In the end, you would need to expose SSH, SSL IMAP, SSL Apache, and CIPE servers. I am midway through this deployment myself, but it has stalled a bit because one of my primary Internet access points started disallowing outgoing SSH.
Re:It's called a server by Eil · 2002-05-31 11:58 · Score: 2

Alternatively...

I've been using Reiser on my production machines with not a single hiccup, and I know of many others who do the same. For that matter, ext3 is used (reliably) in a lot of places as well. rpmfind.net is one that comes to mind.

I know at least one guy who absolutely swears by XFS since it's not a "new" fs like Reiser and ext3 and has actually been used in production for years now. I'm thinking of giving it a try soon.

It's really hard to go wrong with any of the journaling filesystems available in Linux these days. The visible differences between them are fairly small and which one you choose will depend mainly on whether you have any special needs. (For example, ext3 is forwards and backwards compatible with ext2, NFS is noted to be more cranky with some filesystems than others, etc.)

I use WebDAV by marick · 2002-05-31 05:32 · Score: 4, Informative

I'd say what you need is an internet-enabled file system. Some might say NFS, and that seems like a fine solution.

On the other hand, if you have a computer that is always on, that can run Apache, you can have your own personal WebDAV server instead. Simply install mod_dav, and access it through mod_ssl, and have a secure web-based filesystem.
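
For the curious, listing a WebDAV folder from a script is just an HTTP request with the PROPFIND method and a Depth header. The sketch below uses only the Python standard library; the function names are made up, the host and path are whatever your own server uses, and authentication is left out entirely, so treat it as an outline and not a finished client.

```python
import http.client
import xml.etree.ElementTree as ET

DAV_NS = "{DAV:}"  # XML namespace used by WebDAV responses


def parse_multistatus(xml_data):
    """Return the href of every resource in a 207 Multi-Status body."""
    root = ET.fromstring(xml_data)
    return [
        href.text.strip()
        for response in root.iter(DAV_NS + "response")
        for href in response.findall(DAV_NS + "href")
    ]


def list_collection(host, path):
    """List one level of a WebDAV collection over HTTPS (no auth shown)."""
    conn = http.client.HTTPSConnection(host)
    try:
        # Depth: 1 asks for the collection itself plus its immediate children
        conn.request("PROPFIND", path, headers={"Depth": "1"})
        reply = conn.getresponse()
        body = reply.read()
        if reply.status != 207:
            raise RuntimeError(f"unexpected status {reply.status}")
        return parse_multistatus(body)
    finally:
        conn.close()
```

In practice you would mount the share with one of the clients mentioned below and never touch the protocol directly; this is only to show there is no magic underneath.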

Better than NFS, you can mount it on Windows (through web folders), Linux (through davfs) and Mac OSX (through the native DAV file system client that is designed to run with iDisk).

NOTE: I work for Xythos software, and we make an enterprise-level WebDAV server called the Xythos WebFile Server. It's significantly more expensive than free, and we run in-house copies of the product (y'know eat your own dogfood), so that's where I keep my shared data, but if I didn't, I'd have mod_dav running right now.

Re:I use WebDAV by kwerle · 2002-05-31 05:38 · Score: 2

Last time I looked at webdav, it sounded a lot like a fair read solution, but writes sounded really iffy. Has it improved (in the past 9 months)?

I use vtun - vtun.sf.net.
I understand that openvpn.sf.net is nice, too.
Re:I use WebDAV by marick · 2002-05-31 06:32 · Score: 2

Actually, yes, I have tried it. Web Folders does work. It mounts a WebDAV-enabled HTTP "directory" on your windows box. Don't believe me? Feel free to try it yourself with a free account on Xythos's free evaluation-server first before taking the plunge with your own mod_dav server.

Re:Exchange Server by Jobe_br · 2002-05-31 05:35 · Score: 2, Funny

And this posted to a Slashdot forum ... yikes.

I don't know if this applies by slaker · 2002-05-31 05:38 · Score: 3, Interesting

I have twelve computers in my apartment and use all of them for something-or-other. Several are just test machines but even with those, I used to run into situations all the time where I saved something on one machine and forgot to do anything with it.

My solution was to write a series of little scripts to copy data from common share points on each machine to a large, central data store, and into a "backed-up" directory on the workstations. Presently my central data store is 600GB of IDE disks in a RAID1 array (10 disks, total). If I lose the central fileserver, all my data, and the scripts needed to recreate the information in that 600GB, are sitting out on my workstations.
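
Not my actual scripts, but the general shape is something like this Python sketch. The function name and the directory arguments are invented for the example, and moving the original into the "backed-up" directory (instead of copying it a second time) is an assumption made here to keep the share point empty after each run.

```python
import shutil
from pathlib import Path


def collect(share_point, central_store, backed_up_dir):
    """Copy every file under share_point to central_store, then move the
    original into backed_up_dir so the workstation keeps its own copy."""
    share_point = Path(share_point)
    central_store = Path(central_store)
    backed_up_dir = Path(backed_up_dir)
    collected = []
    # Build the full file list first, because files are moved during the loop
    for src in sorted(p for p in share_point.rglob("*") if p.is_file()):
        rel = src.relative_to(share_point)
        for base in (central_store, backed_up_dir):
            (base / rel).parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, central_store / rel)           # copy to the central store
        shutil.move(str(src), str(backed_up_dir / rel))  # local "backed-up" copy
        collected.append(rel.as_posix())
    return collected
```

This assumes the backed-up directory lives outside the share point; otherwise the next run would pick the same files up again.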

It's kind of a brute force approach, but it works OK. I'm not sure how well it would work for non-local systems, though.

I'm sure there are better ways to do what I do, too, but it's nice to have a single place to look for my MP3s or whatever, while knowing they're backed-up in multiple locations as well. :)

--
-- I wanna decide who lives and who dies - Crow T. Robot, MST3K

Re:I don't know if this applies by No+Such+Agency · 2002-05-31 07:12 · Score: 3, Funny

I have twelve computers in my apartment and use all of them for something-or-other.

Your apartment looks like the one in "Pi", doesn't it? Are any of these computers currently calculating a 216-digit number that you'll use to predict the stock market?

:-)

--
Freedom: "I won't!"
Re:I don't know if this applies by 4of12 · 2002-05-31 07:45 · Score: 2

Your apartment looks like the one in "Pi", doesn't it?
Such an apartment would be incomplete without stimulant pills in the bathroom medicine cabinet to replace food in the refrigerator.

Oh, and don't forget the electric drill!

--
"Provided by the management for your protection."
Re:I don't know if this applies by slaker · 2002-05-31 08:50 · Score: 2

I wish. A couple of them are duplicating configurations of hardware used by my clients. A couple are notebooks. A couple do nothing but serve files and run backups. The rest consist of a Win98 game machine, a 2000 server for testing, a HTPC (including projector and presentation monitor), a couple Linux boxes and an OpenBSD machine doing firewall duties.

But yeah, with a full-size APC rack, a pair of IBM RAID cabinets, a Cisco 5005 and five machines in a walk-in closet, yeah, I can definitely get into the pi sort of atmosphere.

--
-- I wanna decide who lives and who dies - Crow T. Robot, MST3K
Re:I don't know if this applies by EvilStein · 2002-05-31 13:29 · Score: 2

You filled up a closet? My girlfriend filled them all with shoes. I put the computers in the living room instead. :-)

It's a bit loud, but we got used to it..
Re:I don't know if this applies by Pfhor · 2002-05-31 14:07 · Score: 2

What the hell are the 6100s doing?

And what appears to be either Beige G3s or 72/3/5/6 00's.

Damn man, what are you using those machines for?

(tempted, because my own college dorm room is starting to look like that, which is why I am probably going to have to move off campus junior year)
Re:I don't know if this applies by EvilStein · 2002-05-31 16:11 · Score: 2

heh, believe it or not, the 61xx class machines are DNS servers running QuickDNS. Haven't had a single problem with either of them, they're fast (they both have Sonnet G3 cards in them) and they've been rock solid.

You're right. There's a G3 (AppleShare IP server) and a 7200 that used to be running NNTP software, but my upstream news server sucks, so now it just sits there. :)

Some of them are co-located, but others are web servers/etc... typical stuff.
Re:I don't know if this applies by slaker · 2002-05-31 17:27 · Score: 2

Looks like you need to invest in some cable management. I like Panduit's stuff myself.

I'd mod your comment +1 Funny if I hadn't already posted. I got the closet only after I promised to move all my stuff *OUT* of the living room. :)

--
-- I wanna decide who lives and who dies - Crow T. Robot, MST3K
Re:I don't know if this applies by EvilStein · 2002-05-31 18:24 · Score: 2

You probably have more closet space than I. My apartment seriously lacks storage space. ;)

cable MISmanagement is more fun. ;)

Separate home and work! by bluGill · 2002-05-31 05:38 · Score: 5, Insightful

First of all, separate your home life and work life. Then separate the data. I understand that once in a while you need data from one place at the other, but avoid those situations.

At work: that is IS's problem. Store all work data on the work machines, and make IS do the backups. Use SSH, or other VPN when you want to work from home. Compile (or whatever) at work as much as possible. If you have data that you need on the road, get a laptop or PDA for work, and synchronize that when you are at work.

At home: set up a Linux box (a 386 is enough, though you might want more) with a big disk, a UPS, and a network card. Put it in a closet or on a shelf. Install Samba and Netatalk, with NFS built in (there are better options than NFS if you look, but NFS is there). Use one login for all machines.

Laptops are a problem, because you often want to use them where you can't get to the network. The first solution to that problem is 802.11. Use it at home, and look for open access on the road. With a good VPN (ssh+nfs) you can get to your network server from many places. I manually synchronize only the files I need, but my laptop is rarely used outside of 802.11 areas. If you travel often, then you might need more. (CODA? AFS?)

Re:Separate home and work! by TheNecromancer · 2002-05-31 06:02 · Score: 3, Insightful

At work: that is IS's problem. Store all work data on the work machines, and make IS do the backups.

For Pete's sake, this is a recipe for disaster! In the 5 or 6 companies I've worked for, every time the IS department managed someone else's data, they screwed it up! No one knows the value and purpose of your data better than you, so why on Earth would you allow someone who doesn't give a rip about it to manage it?

I would suggest using the IS departments resources and knowledge to help you manage your data yourself. Then, you have control of the backups, etc.

--
Attention all planets of the Solar Federation! We have assumed control! - Neil Peart
Re:Separate home and work! by nolife · 2002-05-31 06:46 · Score: 2

I am using a similar approach but since I am still on a dialup, I can only dialin to get remote access, which works but SLOW..

I have roughly 10 computers in the house and 4 users.
I have one main Linux machine with several 40GB drives that basically holds everything for the Linux and Windows clients (some dual boot, some static) and the web server (another Linux machine).

The main Linux machine has Samba and NFS. All other Linux machines mount a single /home and the Win clients map the same homes through a login script. I use IMAP locally and Fetchmail for remote mail. I choose not to use roaming profiles in Windows but I still use domain logins and modified the registry to store Favorites, My Documents, Temporary Int Files, Cookies, and a few others in the homes share also. K-meleon is just as easy. I have access to any of my files, email, bookmarks, Open Office docs, any fonts and whatever else I need from any Linux or Win computer in the house simply by logging in as myself.. I have a separate share for my MP3's which are available to all clients and NFS'd over to my web server where I use a php script called Andromeda to format and make them look nice from a web browser, there are tons that use SQL but I'm not that advanced yet..
The only thing I need to back up is /home and a few of the other smaller shares on the main server. My tape drive is only 2GB so I simply tgz it and send it over to my Linux web server weekly via cron.
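
The tgz step is nothing fancy. As an illustration only (I actually do it with tar from cron), the same thing in Python looks roughly like this; make_tgz is a made-up name, and copying the archive to the other machine is not shown.

```python
import tarfile
from pathlib import Path


def make_tgz(source_dir, archive_path):
    """Write source_dir into a gzip-compressed tar archive and return its path."""
    source_dir = Path(source_dir)
    archive_path = Path(archive_path)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname stores entries relative to the directory name, not the full path
        tar.add(str(source_dir), arcname=source_dir.name)
    return archive_path
```

Keep the archive outside the directory being archived, or it will try to swallow itself.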

--
Bad boys rape our young girls but Violet gives willingly.

Mac n' Windows by peatbakke · 2002-05-31 05:38 · Score: 2, Informative

I'm not sure if this is entirely applicable to your situation, but here's what I do, and it works reasonably well.

I have a server on a public IP address that runs SAMBA, but only accepts connections from 'localhost'. From my Windows box and iBook (running OS X), I just do a bit of SSH tunneling, and I'm able to mount the machine from anywhere I happen to be.
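
The tunnel itself is a single ssh port forward. Here is a sketch that just builds the command in Python; the function name, the user and host, and the local port 1139 are assumptions for the example, while 139 is the usual NetBIOS session port that SMB listens on. The -L and -N options are standard OpenSSH.

```python
def smb_tunnel_cmd(user, server, local_port=1139, remote_port=139):
    """Build an ssh command that forwards local_port to the server's SMB port."""
    return [
        "ssh",
        "-N",  # no remote command, only keep the forward open
        # "localhost" is resolved on the server, which is why a Samba that
        # only accepts connections from localhost still sees a local client
        "-L", f"{local_port}:localhost:{remote_port}",
        f"{user}@{server}",
    ]
```

With the tunnel up, the client mounts the share from its own localhost on the forwarded port instead of from the server's public address.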

As far as I can tell, it's reasonably secure, and it works just fine for general files.

I also have a CVS repository on the server for my development projects, but that doesn't work so well for binary files like images and Word documents. :P

One of my friends keeps his files synchronized via an htaccess protected website which allows him to download and upload files. If you're interested, I'll see what I can do to track down his PHP script ...

first impression by Alien54 · 2002-05-31 05:46 · Score: 3, Insightful

It almost sounds like, "I would like to have the advantages of a centralized database while keeping it distributed across random machines"

Now this is not totally fair, since it implies a pointy haired boss situation. All it really means is that you would have to have a better definition of the problem.

What it seems that you really need is an application, a database, that would constantly monitor in realtime the status and availability of your various resources. This would tie into your other dataservices so that when you do a query on "XP sourcecode", or whatever, one of the results you get is from this resource monitor database saying that "the resource is offline" or "the data is available, but you don't have access rights", etc. depending on the resource status, and other realtime situations.

It occurs to me that clever design of the database may be able to do the resource availability query in advance of the actual access of the data, so that you do not get a crash or whatever if a child record or whatever is unavailable.
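
A toy version of that idea, in Python, just to pin down what I mean. Everything here is hypothetical: the Status names, the catalog layout, and the two probe callables per resource (one for "is it online", one for "can I read it") are invented for the sketch and would have to be backed by real checks.

```python
from enum import Enum


class Status(Enum):
    AVAILABLE = "the data is available"
    OFFLINE = "the resource is offline"
    DENIED = "the data is available, but you don't have access rights"


def check_resource(is_online, can_read):
    """Classify one resource using two caller-supplied probe functions."""
    if not is_online():
        return Status.OFFLINE
    if not can_read():
        return Status.DENIED
    return Status.AVAILABLE


def query(catalog, term):
    """Return {name: Status} for every catalog entry whose name contains term."""
    return {
        name: check_resource(*probes)
        for name, probes in catalog.items()
        if term.lower() in name.lower()
    }
```

The point is that the status comes back before any attempt to open the data, so an unreachable box shows up as "offline" in the results instead of as a hang or a crash.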

Currently, I do not know of any tool that does this, although obviously this is not my area of expertise.

--
"It is a greater offense to steal men's labor, than their clothes"

Re:first impression by Alien54 · 2002-05-31 06:44 · Score: 2

I'm gonna call troll on myself. - That sounds a lot like Active Directory.
So then the question becomes how workable is Active Directory? And what are alternatives that could be open source?
I have heard a certain amount of discontent with it.

--
"It is a greater offense to steal men's labor, than their clothes"

Use what already works... by Anonymous Coward · 2002-05-31 05:47 · Score: 2, Interesting

I might be off base here, but..

Why not use Gnutella or a similar P2P system? There are clients for basically any OS out there, the files don't have to reside in a central location.

It works for the internet - Why not your own 'mini-internet'?

One modification you would want to make is to get it to make a listing of all that you have.

Could you use SSH tunneling with a system like that?

huh? use a standard file server. by Telastyn · 2002-05-31 05:49 · Score: 2

Put a small raid 5 partition on your *nix machine. Store everything there, and use your access of choice to use the data on your other machines. I like ssh/scp for access, as it works mostly everywhere, and is encrypted, but then again most of my data needs accessed in a CVS manner. If you are constantly editing word docs and the such, samba would perhaps be a better option.

This way the data is in one spot, but it's much less vulnerable to hdd failure. Plus since it's on a *nix machine, you can export it to your clientele.

don't use NFS by Kunta+Kinte · 2002-05-31 05:49 · Score: 5, Informative

Unless you want to share your data with lots of 'friends' you just haven't met yet.

NFS is used very often to mount home directories. But what is stopping someone from unplugging the workstation, plugging in a linux laptop with the IP of the legitimate workstation and mount the share, "su - user", and voila, you now have all the user's files.

That's just the simplest way. The problem is that most NFS implementations don't have *any* authentication except for IP authentication. So DNS attacks would work as well.

I am surprised that the most widely used network file system implementation for linux and most posix OSes has no real authentication. There *has* been authentication built into the protocol since version 3, but last time I checked, it was not supported on linux. I was told by one guy working on the project that the problem was that there's no crypto in the kernel.

I used secure NFS on Solaris 8 for a while but I constantly lost the mounts. That might be fixed now, I don't know.

Use AFS, CVS, rsync, intermezzo, or something. But I would stay away from NFS.

--
Based on upvotes, Ageism is the only "-ism" Slashdotters care about and think isn't SJW

Re:Mirroring concept by angst_ridden_hipster · 2002-05-31 05:49 · Score: 2

Good point.

I was thinking more about the hardware failure issue.

--
Eloi, Eloi, lema sabachtani?
www.fogbound.net

How I do it by MaxVlast · 2002-05-31 05:50 · Score: 3, Informative

Yep. Unified access to e-mail via IMAP is definitely the linchpin of a good arrangement.

I've been trying to deal with the same problems as you for several years. I have a Mac running Mac OS X, Windows PC, Linux server, and a NeXT around my desk. I have two large hard drives. One is in the Mac and that holds my home directory, and the Linux machine has all my MP3s. My home is exported via NFS and is mounted on the Linux box and on the NeXT so I always have live access to my files. The Windows box only does my TV program and Kazaa, so I'm content to simply have it use FTP to copy files back and forth (I haven't found a decent Windows NFS program.)

It all gets the job done, and it all works smoothly. Printing is done by IP printing to my big 'ol LaserJet. All the mail is kept either on my server at school, or on the cyrus server on the Linux box. It's a delight =)

--
There should be a moratorium on the use of the apostrophe.
Max V.
NeXTMail/MIME Mail welcome

Re:How I do it by MaxVlast · 2002-05-31 05:52 · Score: 2

Oh, I forgot about my PowerBook. Conveniently, all FireWire Apple products can go into 'hard drive mode' and act like a dumb hard drive. To sync, I connect the PowerBook as a hard drive and use a cool Mac OS X syncing program (whose name I forget right now) to sync up the files. rsync and tar don't work because they slaughter the resource forks of HFS files. We're moving away from that problem, but it still exists.

--
There should be a moratorium on the use of the apostrophe.
Max V.
NeXTMail/MIME Mail welcome

It's called "The World Wide Web" by Bert690 · 2002-05-31 05:51 · Score: 2, Interesting

There's this great standard for sharing files over the internet called the World Wide Web. Perhaps you've heard of it?

Seriously -- run a webserver + WebDAV on each of your machines. Then you can read/write from anywhere, and with any platform.

Systems like YouServ/uServ provide a webserver, access control, and mirroring/replication support in a single package. This way as long as only some of your machines are online, the data from every machine remains accessible. Unfortunately the system is not available for general public use, but the system may be in open source soon.

Re:It's called "The World Wide Web" by Peyna · 2002-05-31 07:57 · Score: 2

Then you become reliant on the connection to the Internet. If that connection is lost, and you need access to resources that are there, what do you do? You also have to trust the remote server with your data. You have to trust every person that has physical access to the place where your data is stored. I would much prefer keeping my data somewhere close by, preferably right by my feet or in my closet.

--
What?

Back it all up using gnutella by anthony_dipierro · 2002-05-31 05:52 · Score: 2

That's what it's for, right?

Seriously, I think it would be great if there was a P2P backup system. Private files could be encrypted, and everything could be uploaded to multiple peers. Obviously some sort of trust system would have to be worked out, but it could work. Even if I just connected to myself and two or three real life friends with DSL connections, it'd be great to have my files accessible everywhere, almost all the time.

Shared filesystems & well named directories by ryanwright · 2002-05-31 05:52 · Score: 2

I have two large drives sitting on a Linux box doing RAID mirroring. For remote access, I use ssh/scp. For local access from other Linux machines, I use NFS, and for Windows machines, SMB.

The point is, everything is stored and "backed up" centrally, but accessed using a different mechanism depending on where I'm at when I need my data. Since I don't delete files accidentally, mirroring works fine for a backup - I'm really only concerned with drive failure.

I then structure the directories according to type of file. I've got a documents directory where I keep anything I create myself. Specific projects that require multiple files generally go under documents/projectname. I've got a music directory, and many subdirectories under it:

music/fullalbums/artistname/albumname/files.mp3
music/singletracks/cache1/files.mp3
music/musicvideos/ ..

Etc. Then software. apps/isos. apps/windows. apps/linux. And so on, and so forth.
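
A tree like this can be created in one go. What follows is only a rough sketch: the helper name make_layout and the example root path are made up for illustration, and the directory names are just the ones mentioned in this post.

```shell
# Create the directory layout described above under a chosen root.
# make_layout is a hypothetical helper name; edit the list to taste.
make_layout() {
    root=$1
    for dir in \
        documents \
        music/fullalbums \
        music/singletracks \
        music/musicvideos \
        apps/isos \
        apps/windows \
        apps/linux
    do
        # -p creates missing parents and is harmless if the dir exists
        mkdir -p "$root/$dir" || return 1
    done
}

# Example (path is a placeholder): make_layout /srv/share
```

Since mkdir -p does nothing for directories that already exist, running it again on a populated tree should be safe.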

--
-Ryan, with the unoriginal sig

rsync + ssh + logout scripting + cron by Ashurbanipal · 2002-05-31 05:54 · Score: 3, Interesting

Use the excellent rsync from Paul Mackerras (of pppd fame) and Andrew Tridgell (samba team) in combination with OpenSSH and SSH for windows (both based on Tatu Ylonen's work; OpenSSH is maintained by an expert team including Markus Friedl and the recently monkey-cracked Dug Song, among others).

Set up your accounts to rsync-upload changes to whichever server is most secure when you log out, and use a cron job on that server to rsync-download to all the other servers nightly. You can make a tar backup part of the system also.
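
Roughly, the two halves of that could look like the sketch below. Everything named here is an assumption for illustration: the host alias "central", the data/ directory, the function names, and the script path in the crontab comment. The rsync options (-a, -z, -e ssh) are standard ones.

```shell
# Push local changes to the central server. Meant to be called from a
# logout script such as ~/.bash_logout.
# "central" and data/ are placeholder names.
push_on_logout() {
    rsync -az -e ssh "$HOME/data/" "central:data/"
}

# Run nightly on the central server: copy the data out to each host
# given as an argument. Example crontab line (02:30 every night),
# assuming this is wrapped in a script at a hypothetical path:
#   30 2 * * * /home/me/bin/fan-out.sh
fan_out() {
    for host in "$@"; do
        rsync -az -e ssh "$HOME/data/" "$host:data/" \
            || echo "sync to $host failed" >&2
    done
}
```

Neither call uses --delete, so files removed on one machine will linger on the others; whether that is what you want depends on how much you trust the sync.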

You will have to remember what's going on so you don't modify the same file differently on two different systems within 24 hours. If you want to overcome that shortcoming by making this work on an immediate sync basis rather than periodically, you'll need something like SGI's fam (included with recent linux distros) to trigger the updating processes.

You should already be 90% there if you have your ssh keys set up for passwordless login. Passwordless PKI logins are not significantly less secure than passworded logins in most situations (granted hostile system management can get you, but the BOFH can trojan your login anyway).

Lots of people use this technique to sync CVS trees over slow links. Rsync is very efficient for that kind of thing (large volume of files, low number of changed bytes).

Can there be only one? by rwa2 · 2002-05-31 06:00 · Score: 5, Informative

Well, here's my approach...

First, I try to adhere loosely to the FHS for ideas on overall organization. Even though it's mostly intended for POSIX systems, following their philosophy will really help you separate your data from your platform-dependent program files and libraries.

Most of my important stuff goes on the Linux server in /home (on an IDE software RAID1). However, I try to limit files in here to stuff that's absolutely essential to keep the size down. I occasionally mirror this offsite to my friends' servers with rsync (with the private stuff pgp encrypted). I try to make browser caches, etc. symlinks to dirs in /tmp . Try to keep only the stuff you created yourself in here.

I keep media and downloads on a plain partition under /home/ftp/pub (which is also symlinked from the http document root). That way, all my computers can easily get access to music and installers and junk.

Samba helps win32 boxes access the /home and /tmp directories.

NFS exports /home to the other UNIXen, as well as /usr for the other machines with the same CPU arch. It should be acceptable to export /usr/share to other UNIXen with different architectures.

I'd like to set up CODA, since it seems to support more different kinds of clients than Intermezzo. These support disconnected operation and are good for laptops. For the meantime, I just use rsync to mirror home dirs onto my laptop, though (and just keep track of stuff that I change on the road manually :/ )

No thoughts on how to combine everything into a distributedFS so you could have parts of, say, a music archive living over several machines. There are several projects for Linux-only (PVFS) or Win32-only (more advanced network-neighborhoods). I'd say your best bet for convenience is just to make sure everything is visible from your one server and reexport it from there (invest in a switch so it doesn't deadlock your network). Until better DFSes exist, though, I think you'll get better performance and less confusion from running everything from one beefed-up server with a RAID (or two if you want failover).

Unison works, perfectly. by Ecyrd · 2002-05-31 06:01 · Score: 4, Insightful

Here's my situation: I have a dual-booting Linux/Win98 machine at home, a Win98 laptop, a Linux server sitting in some network in a galaxy far, far away; and a bunch of other computers around the world.

At one point, managing all my data (I would change a bit here, and a bit there, then try to copy and synchronize by hand) was manageable, but I got real tired of it real fast. I considered putting together a CVS server, and then synchronizing that way, but it's really overkill and not a very user-friendly solution anyway.

Enter Unison. Now I just have a few directories designated as shared, and they get synchronized by Unison automatically. At home, my data is on a FAT partition, which is accessible to both Linux and Win98.

The good thing about this is that since I synchronize with the laptop when I'm connected, I get to use my data even when I'm on the move - not so with NFS. And I get free backups as well - I do have roughly 2Gigs of data, which would be a hassle to backup any other way. Besides, if I took tape backups, I would have to manually carry them off-site in case of a fire; now Unison takes care of backups to and from my remote machines.

inferno! by rpeppe · 2002-05-31 06:04 · Score: 2

inferno was designed for exactly this kind of thing. it provides secure connectivity between heterogeneous boxes across heterogeneous networks. the security architecture is highly modular (i.e. once you have a reliable data connection, you can securely access any sort of resources provided by an inferno instance at the other end). that includes all kinds of devices as well as files.

the connectivity and security are as versatile (or more so) as unix pipes; also you can write programs for it that run without change (really!) on any supported platform ('cos it provides an OS level view of everything rather than trying to shoehorn itself into the parent environment like java).

the security model is public-key based and because it's end to end, you don't need to worry at all about little things like 802.11 insecurities...

plus it's all small, clean and beautiful as befits something coming from CSRG at bell labs.

centralize and distribute. by rusty0101 · 2002-05-31 06:04 · Score: 2, Informative

For those systems that are on all the time, select one system to be a common server. I personally recommend a Linux box, though xBSD or OSX may provide the features you need as well.

In your home directory, create a folder you are going to put your mount points in to mount the data stores you need.

On all the other systems, create a share that will contain the data you want to access "anywhere". On the central server, mount all of these shares in that sharesmount folder. This may be nfs or cifs as the architecture of the servers dictates.

As this is all mounted in your home directory, you can go to just about any system in the network and remotely mount all of your folders by mounting your home folder from your primary server.

To remotely access this storage center, use either nfs over ssh, or build appropriate links into your web pages, and run a secure variant of apache.

I also recommend keeping your work data in a separate storage area from your personal/home data. You may recall that Northwest Airlines successfully sued to get the personal computers of Flight Attendants who they believed co-operatively negotiated a sick-out strike. Keeping your personal data completely separate would reduce the likelihood of losing your entire computer setup if someone at work files a complaint that they believe you are doing something wrong.

There are other advantages to this kind of a setup. By centralizing your data storage tree, it is easier to perform backups, you will only need to backup the one server's home directory, tracing into the peripheral servers. If you wish to set up a thin client in a bedroom, or someplace where you don't want to have a lot of fans going, this gives you a platform ready made for your storage needs, as well as a reasonable terminal server. I think you get the idea.

-Rusty

--
You never know...

My approach by captaineo · 2002-05-31 06:12 · Score: 2

Keep your valuable files on a Linux machine (running on good hardware you can trust, a stable kernel, a journaling filesystem, and software RAID if you want to go that far). Do backups from there. Run NFS to serve all non-Windows clients. Run SAMBA to serve Windows clients.

Sharing data files is easy with the NFS/SAMBA combination - e.g. non-Windows machines mount my home directory as /home/foo and on Windows it's H:\ - all the files are there.

Sharing software is less easy since none of the common UNIXy filesystem layouts really let you have binaries for multiple platforms available at once. There are unconventional layouts that do this, but you'll have to compile a lot of things yourself and mess with configure scripts a lot... I've given up on sharing binaries and libs; I just run Debian on as many of my systems as possible, and run a script now and then that ensures the same packages are installed on each machine.
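
One way such a package check could work is sketched below. This is not necessarily the script described above: the helper name missing_packages and the file names are assumptions, while dpkg --get-selections, dpkg --set-selections and apt-get dselect-upgrade are the stock Debian tools.

```shell
# On the reference machine, dump the package list:
#   dpkg --get-selections > selections.txt
# On another machine, the list can be applied (as root) with:
#   dpkg --set-selections < selections.txt && apt-get dselect-upgrade

# Print packages marked "install" in the reference list ($1) that are
# not marked "install" in the local list ($2). Both files are in
# `dpkg --get-selections` format. missing_packages is a hypothetical name.
missing_packages() {
    ref_sorted=$(mktemp) || return 1
    loc_sorted=$(mktemp) || { rm -f "$ref_sorted"; return 1; }
    awk '$2 == "install" { print $1 }' "$1" | sort > "$ref_sorted"
    awk '$2 == "install" { print $1 }' "$2" | sort > "$loc_sorted"
    # comm -23 prints lines that appear only in the first file
    comm -23 "$ref_sorted" "$loc_sorted"
    rm -f "$ref_sorted" "$loc_sorted"
}
```

Reviewing the difference before applying the selections is probably wise, since dselect-upgrade will install everything on the list in one sweep.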

For remote work I use SSH to set up a VPN. However, unless I'm on a very low-latency connection, I find it difficult to use a shell remotely, much less NFS. I usually end up manually rsync'ing the files I need.

cvs & rsync by joey · 2002-05-31 06:14 · Score: 4, Insightful

I use cvs for all of my home directory except for large data files which are rsynced around using Makefiles checked out of cvs. For a long explanation of the CVS part of it, see CVS Homedir.

This works well for me to keep about 30 accounts in sync, most of them just get a minimal checkout of my home directory (5 mb or so), while 3 or 4 get the whole home directory and rsynced files (5 gb). The CVS repository is about half a gigabyte in size these days.

Once something that allows proper file rename tracking, like subversion, comes along, I plan to stop using rsync altogether, and just check all the files in.

As has been noted elsewhere in this thread, one of the key things is coming up with a consistent directory structure and sticking with it.

--
see shy jo

root directory: by edrugtrader · 2002-05-31 06:15 · Score: 2

/porn/
/video/
/pics/
/warez/
/gamez/
/appz/
/audio/
/mp3/
/rock/
/hiphop/
/jazz/

oh filter, why must thee filter my comment

--
MARIJUANA, SHROOMS, X: ONLINE?! - E

I tried the same thing at school... by nadador · 2002-05-31 06:17 · Score: 2

When I was still in school I tried to figure out some good way of being able to work on my research project at home and at school, or at least massage the code and the data in both places.

Being an engineer, I thought of a bunch of ways of setting up complicated distributed ways of doing this, but settled on just leaving the data in one place, and SSH'ing to that box.

The benefits of keeping it simple were:

1. No new work, which is good for the lazy^H^H^H^Hefficient among us.
2. Data coherency. If it's only ever in one place it's hard to mess up.
3. Backups are easy, since you're only backing up one data set.
4. Did I mention no new work?

As much as data sharing on a heterogeneous network would have been nice (Linux box at home, Suns in the lab, Windows at my parents' place, iBook in my backpack), the marginal utility of that data sharing was low compared to the marginal cost of actually doing the work to make it happen.

My vote is for keeping the data in one place and remembering how much you love the terminal. Not a sexy solution, but it works.

--

Outside of a dog, a book is a man's best friend. Inside a dog, its too dark to read.

RSYNC by dbarclay10 · 2002-05-31 06:21 · Score: 3, Informative

Aha, been through this myself ;)

Okay, you *could* use some form of networked file system, but a) your laptop and other machines would need to be connected to use it, and b) I hope you are willing to fight to get a good implementation to work, and c) I hope you aren't playing with big files :)

I use rsync. I have ~/Makefile, 'make sync' works wonders. Here's the contents:

On the laptop:

get:
	rsync -avuz --exclude "*~" willow:/home/david/data /home/david
put:
	rsync -avuz --exclude "*~" /home/david/data willow:/home/david
sync: get put

Works like a charm :)

--

Barclay family motto:
Aut agere aut mori.
(Either action or death.)

Re:AFS? Not suitable by sparcv9 · 2002-05-31 06:23 · Score: 3, Informative

angst_ridden_hipster asked for something that runs on OS X

OpenAFS *does* run on OS X.

--

This is not a Fugazi .sig

We need a HOWTO by jaaron · 2002-05-31 06:26 · Score: 2, Interesting

This question (or ones like it) has come up many times. This isn't the first time something like this has been posted on Slashdot. I'm currently looking at doing something like this myself and I'm obviously not the only one. While that lays the ground for a good open source project (ie- a distro that is set up for something like this, or a project that easily combines several tools to do this kind of thing), what I think we really need is a good HOW-TO. Maybe there already is one or are several related HOW-TO about setting up this type of file access. There have already been a number of good suggestions posted here on Slashdot. We need to get these and others together and put into a HOWTO so that it's not a research project every time someone starts exploring this idea of distributed data and somehow consolidating the mess. (And no, I'm not volunteering yet since I haven't done this yet and currently don't have the resources. But if something doesn't happen in a while, maybe I will...). If you know of a HOWTO or other site that covers this info, you should post it somewhere here.

--
Who said Freedom was Fair?

Re:We need a HOWTO by teamhasnoi · 2002-05-31 07:23 · Score: 2

I am with you on that. I am trying to connect my home (xp and beos) and work (OSX, 9) w/ VNC through 2 router/fwls and am feeling like a big more-on. Have had no luck so far; looked on google and couldn't find any info that is applicable to me. I know it's a simple? thing, but having to do all my home config at lunch, and hoping it works when I get back to work, sucks.
Re:We need a HOWTO by JabberWokky · 2002-06-01 06:20 · Score: 2

It's called System Administration, and is a field unto itself. The profession values experience above all else because that's how you learn to tame these beasts. A HOWTO would be obsolete within two years, and wouldn't apply to any particular configuration. Sure, there are good rules of thumb, but they are covered in plenty of O'Reilly books, and the key is how you apply those general rules. One of the most important things is that you learn rigor (to the point of being anal about caps in filenames) and flexibility (letting the users do it their way whenever you can to make them happy - after all, all BOFH jokes aside, you're there for them). Learning when to apply either is a multifaceted decision involving politics, technology and social skills. Even when you are the only user ("Am I going to get too lazy to keep up this directory structure?").
Sure, a HOWTO could be written, but you could also write a HOWTO about how to be an attorney or a plumber - it wouldn't do the field justice. You need, at the *very* least, a good solid book. Or a turnkey solution (wherein you are trading money for experience) like the netstorage boxes that you plug into the network and speak a half dozen protocols for all your systems to talk to.
--
Evan

--
"$30 for the One True Ring. $10 each additional ring!" -- JRR "Bob" Tolkien

Re:AFS? Not suitable by Matthias+Wiesmann · 2002-05-31 06:27 · Score: 2

Please don't call people names.
Instead, read this page:

http://www-personal.umich.edu/~srb/openafs/

Re:Exchange Server by VP · 2002-05-31 06:32 · Score: 3, Funny

He-he, nice way to bring attention to this news item.

Excellent Example of WebDAV by Spencerian · 2002-05-31 06:36 · Score: 2

WebDAV is used practically by many Mac OS users in the form of the iDisk service on Apple's iTools network services. Being an open standard, there must be some commonality that makes it practical to set up WebDAV services on any or all boxes for basic file sharing, or even a common location. iDisk itself isn't the solution, of course, but it shows the practicality of a WebDAV solution.

--
Vos teneo officium eram periculosus ut vos recipero is.

Web Design by Lando · 2002-05-31 06:40 · Score: 3, Informative

I'm not really sure what type of work you're doing where you need access to your files... I can relate my knowledge on dealing with unison over the past year though.

I do a lot of back end web development. As such I usually like to copy the entire site down to a local machine, work on the system, upload to a test machine, test, and then move to a production machine. Unison has made my job a lot easier than using a bunch of ssh scripts, since unison automatically checks for changes and only copies over files with changes.

A sample script is as follows:

From my local file system $HOME/web/(website) I execute the following script

unison -auto -batch include ssh://user@somehost.com//www/(website)/include

unison -auto -batch www ssh://user@somehost.com//www/(website)/www

This script pulls all my programming work in include and the website-accessible files in www to my local system... I then work on the files and upload using the following script

unison -auto -batch include ssh://user@testhost.com//www/(website)/include

unison -auto -batch www ssh://user@testhost.com//www/(website)/www

I then check the code on the test host; when I get it to the point I want, I upload it to the production machine...

If I have problems on the test host, I can go in and remove all files on my development system and pull a fresh copy of files from the live site...

Since I don't need to program and compile on different systems, just upload to the test and production machines, it works well.

Recently I took a trip and did not have access to my local system. I was able to borrow a windows system and after installing PuTTY, WinSCP and unison I was up and running within 10-15 minutes at the remote site, which allowed me to get back to work.

The problem with using a remote mounting system is that you have to maintain network connectivity while working on files, not always an option, plus you are working with the live production files...

So basically I use unison just like a cp command except that it does not copy files that already are synced between systems and it automatically keeps my permissions sync'd as well.

Hope that helps

--
/* TODO: Spawn child process, interest child in technology, have child write a new sig */

Here's how I stay organized: by oni · 2002-05-31 06:51 · Score: 5, Funny

I keep all the porn in a separate directory. That seems to work pretty well.

Re:Try CVS. by dossen · 2002-05-31 06:52 · Score: 2, Interesting

If you use cvs for this kind of thing, cron-jobs can make things a lot smoother when dealing with program settings and the like (I use it for stuff like my bookmarks). Just make the cron-job sync all your machines to a central repository at short intervals. That way you should be able to maintain consistent files in all machines, and if something goes wrong you can roll back to an earlier version.

Add a bit of clever scripting, and you might also handle whole dirs automagically (cvs works on individual files).

One word of caution: Be careful with binary files, and programs that restructure files, since that's not what cvs is made for (you can set files as binary though).

Segregate the data, manage each. by jmanning2k · 2002-05-31 06:56 · Score: 4, Informative

I agree with you. Your question though, was overly general.
There are really three (or more) separate data issues that you have to deal with.

Like most, I have many accounts, and just manage them on the fly. My data is retrieved manually when I need it. SSH (and scp), VNC, etc. This usually does the job.

Not the easiest way to do it. Especially when I recently changed jobs and had to setup new data and profiles - I thought, there must be a better way to do it.

So, here's a breakdown of the problems, and suggested fixes.

Break it down into 3 separate sets of data:
1. Profile data - Your shell scripts, .bashrc, environment, ssh directory, pgp keys, etc.
2. Daily Documents - My Documents folder, data directory. Limit this to stuff you need in ALL locations (though you could have a personal and a work version...) and on a regular basis.
3. Archived files - Infrequently used, but you occasionally need to access them from various places.

Then, the problem becomes much simpler. Instead of a grand scheme to manage all three of these at once, you have three smaller, simpler problems.

Here's my suggestions:
1. Profile info - Wasn't originally my idea, but the best thing I've found is to use CVS to manage the files. You'll also have to setup your shell scripts to detect the OS / machine you are on and run OS / machine specific versions.
For example: .bashrc
Detects OS, runs ~/.profile.d/linux, ~/.profile.d/win32, ~/.profile.d/macosx, etc.
Detects hostname, runs ~/.profile.d/hostname.
Put core stuff in the .bashrc, put specific things in the separate files.

The rest, usually doesn't change.

Add it all to CVS on a personal server. Then just checkout to each account you have. cvs update will keep it up to date if you change the master copy. You might need a special .cvsignore to make sure it only manages the files you want it to.
Then, you have the same profile files on all of your machines. Got a new .emacs macro, or shell prompt tweak? Edit one account, cvs commit, cvs update the rest.

2. Daily use Documents. This is a mix. Perhaps you could use a separate CVS repository. Or, use rsync and rdiff type backup sync programs. The key here is to keep this to a minimum. How much do you really need, and how much *must* be in sync between all your machines at all times? Again, this is fairly easy for a small number of documents, so don't let it get out of hand. If you don't use the file all the time, and don't need to maintain changes, then push it to archives.
This is the issue that most other posts address, so I won't get into too much detail. All those solutions are much easier with a small number of documents.

3. Archived files. This is probably what you were really asking about with regards to NFS and sharing files. These are the files you need every so often, stuff like your mp3 collection, downloaded software, extended (non category 2) documents, and the like.
For these, it depends on your setup and level of network access (the speed is important too). rsync might work if you need a locally cached copy, but this is much easier if you leave it in one place. Setup a gateway on your home network with IPSec or PPTP. Or, find WebDAV or some internet accessible filesystem you can use (NFS or SMB even, depends on your security needs). Then, connect to the central repository when you need these files.
This can be large, but keep it so that you don't need to synchronize frequently, and preferably only in one direction. You listen to your mp3's, but you don't change them frequently. Same with your downloaded tar/zip files of software you've collected. (Face it, having a single directory with cygwin, mozilla, etc - all the software you have installed at each location - is much easier than finding and downloading them all from their various sites each time.)
Or, for these files, if you really don't need them all the time, leave them on the central server, and scp them when you need them.
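
For the profile part (item 1 above), the OS and hostname dispatch in .bashrc could be sketched something like this. The ~/.profile.d file names are the ones from item 1; the function names, the PROFILE_D override, and the mapping from uname output to those names are my assumptions.

```shell
# Map `uname -s` output to the per-OS file names used in ~/.profile.d.
# The mapping itself is an assumption; extend it for other systems.
profile_os_name() {
    case "$1" in
        Linux)   echo linux ;;
        Darwin)  echo macosx ;;
        CYGWIN*) echo win32 ;;
        *)       echo unknown ;;
    esac
}

# Source the OS-specific file, then the host-specific file, if present.
# PROFILE_D is a hypothetical override for the directory location.
load_profile_d() {
    dir=${PROFILE_D:-$HOME/.profile.d}
    for f in "$dir/$(profile_os_name "$(uname -s)")" \
             "$dir/$(uname -n)"; do
        # Skip files that do not exist on this account.
        if [ -r "$f" ]; then . "$f"; fi
    done
}

# In ~/.bashrc, after the shared core settings:
#   load_profile_d
```

Loading the host file last lets a single machine override whatever the OS-wide file set.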

--

So, that pretty much covers it. I hope these suggestions are useful. There comes a time where managing it on the fly just gets too cumbersome. (You'll know that time - it usually happens right after you wipe out some vitally important data because you didn't synchronize the files.)

Beyond this, you can always add all kinds of stuff. Some examples: ACAP (a configuration file server, I use it with mulberry, my IMAP client. It lets me set preferences), Kerberos for common authentication, LDAP for an address book or netscape roaming profiles, the list goes on and on.

What would be nice is a set of scripts to help manage this.
Imagine, getting a new account and typing "pullprofile", and having your environment and data all retrieved, pulled from your central server. Then you could have login and logout scripts to synchronize the data, or just manually (possibly remotely if you forgot to sync before you left work) run them. A cron job to synchronize the big data store overnight.

I'll keep dreaming, and keep looking on freshmeat and sourceforge for a project like this. Maybe one day I'll get up the energy to start it myself, but don't count on it.

;-)

~Jonathan

AFS + kerberos by iocc · 2002-05-31 07:03 · Score: 2, Informative

Use AFS and kerberos. Works for mit.edu, Ericsson, kth.se and MANY others so it should work for you too.

http://www.openafs.org
http://www.pdc.kth.se/heimdal

Samba + VPN by AntiChristX · 2002-05-31 07:22 · Score: 2, Informative

My mom's office had the same types of problems so here's what I did:

1. Set up samba on the reliable (linux) machine, with proper tape backup, etc.
2. Firewalled the segment (which included their desktops) with a WatchGuard SOHO router (about $500 for 25 user support, runs linux :)
3. Set up Mobile User VPN on the firewall, and any laptops that might travel out of the office.

Samba and SMB are not the world's fastest solutions, but it is nice to be able to have the directory browsing in winders and macos. Samba is easy to set up, my first install of a samba PDC only taking about 3-4 hours (and never touch it again). If you need real speed for transferring over large files, you can always use SSH and SCP (putty and pscp for windows, niftytelnet for mac). Just always attempt to maintain a central data server, back it up as needed, and you'll be successful in clearing the data clutter.

--
AntiChristX
Daring to remain below 5 karma indefinitely

Re:Exchange Server by Tim+Ward · 2002-05-31 07:26 · Score: 2

Cost to me would be £0 - I need the MSDN Universal anyway, so it doesn't cost me anything extra to use more bits of it.

I'm sure a whole bunch of slashweenies will now accuse me of voluntarily paying a Microsoft tax, but I can assure you that I've made lots more money out of Microsoft than I've ever paid them, and I get a jolly good ROI on the subscription.

Re:Alternative to IMAP by MaxVlast · 2002-05-31 07:27 · Score: 2

I really (really!) like Mail.app on Mac OS X. It's one of the reasons I kept my NeXTs for so long. It caches all of the messages locally, so I don't have to worry about connectivity. If I want to use the pleasant text-based options, they're still available.

--
There should be a moratorium on the use of the apostrophe.
Max V.
NeXTMail/MIME Mail welcome

Server yes! And NetInfo vs. LDAP by plsuh · 2002-05-31 07:34 · Score: 4, Informative

This response is dead on. The original asker needs a file server that speaks multiple protocols. Once you have a server, it is much easier to create the necessary ssh or ssl tunnels that you need for total security.

Trying to maintain coherency of data via replication across multiple machines is begging for trouble -- this is a hard problem that to my knowledge has not been solved in a clean, cheap way.

If you want to use NetInfo for Mac OS X, create a new port from the Open Darwin sources. There's a port of an old NetInfo server module for Linux floating around, but it's not what I'd call up to date.

A better choice would be to use OpenLDAP, as Mac OS X is designed to pull directory service info from an LDAP data source. Windows systems can also pull from a LDAP, as can Linux and *BSD and Solaris and so on.

--Paul

Re:Server yes! And NetInfo vs. LDAP by TheAJofOZ · 2002-05-31 12:36 · Score: 2

A better choice would be to use OpenLDAP, as Mac OS X is designed to pull directory service info from an LDAP data source. Windows systems can also pull from a LDAP, as can Linux and *BSD and Solaris and so on.
Now what about when there's a laptop in the mix? It would be simple to flag specific files as "current" and have them copy over the laptop regularly (use rsync and do an update just before leaving), but what about user accounts? How easy is it to have the laptop use a remote NetInfo or LDAP server when available but use a local one when on the road or plugged into a remote network? Obviously the local one would have to sync to the real one regularly as well.
Re:Server yes! And NetInfo vs. LDAP by plsuh · 2002-05-31 14:56 · Score: 2

Use common uid's for the laptop local users vs. the network directory services users. I do this myself across three domains: Apple's NetInfo network, my own NetInfo network at home, and my TiBook's local NetInfo domain.

--Paul

KISS by softsign · 2002-05-31 09:00 · Score: 2

What you need is a 20 GB VST Firewire drive. Set up a coherent filing system, store all your data on there, and take it with you wherever you go. These things are tiny and lightning quick - no network storage can compare. If you've got proper Firewire ports (bus powered), you don't even need to carry a power adapter - just the Firewire cable.

My supervisor swears by one of these things... He used to have a complete mess of redundant files all over the place and could never remember which was the most current. Now it's easy. The VST drive is the definitive version.

Of course, there is an outside chance that you could lose the drive or that the data could be destroyed, so make a habit of backing up (using rsync or something similar) on a weekly, or even nightly, basis to a more secure machine (a desktop, for example). You could probably set up a nightly cron job that would check whether the drive is connected and back up if it is. That way, backups for you would be as simple as connecting the drive when you get home...
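
A minimal sketch of such a cron job, under some loud assumptions: the drive mounts at `/Volumes/VST`, the backup lives in `~/vst-backup`, and "connected" means the mount point exists and is not empty. Those paths, the function names, and the `RSYNC_BIN` override are all made up for illustration.

```shell
#!/bin/sh
# Nightly backup sketch. DRIVE and DEST are hypothetical; adjust to taste.
DRIVE="${DRIVE:-/Volumes/VST}"
DEST="${DEST:-$HOME/vst-backup}"

# Treat the drive as connected if its mount point exists and is not empty.
drive_connected() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# backup_if_connected SRC DST
# Copy SRC into DST with rsync, or report that the backup was skipped.
# RSYNC_BIN is a hypothetical override, handy for dry runs.
backup_if_connected() {
    if drive_connected "$1"; then
        "${RSYNC_BIN:-rsync}" -a "$1/" "$2/"
    else
        echo "drive not connected, skipping backup"
    fi
}

backup_if_connected "$DRIVE" "$DEST"

# Example crontab entry (hypothetical install path), runs at 02:00 every night:
#   0 2 * * * /usr/local/bin/backup-vst.sh
```

Note that an empty but mounted drive would be skipped under this check; testing the output of `mount` instead would be stricter.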

tar and ssh by Cyno · 2002-05-31 10:38 · Score: 2

Well, let's say you're working on a unix system and it crashes or loses its configuration and the network underneath it gets reconfigured. I find the best solution for moving data and preserving permissions is a tar pipe through ssh or rsh. Cpio and other stuff might work better, since tar has problems with deep directory trees. But here's what I use:

tar -cf - . | ssh backuphost 'cd /mnt/backupdir && tar -xpf -'

But AFS looks cool. Does anyone know how secure it is?

Re:Exchange Server by Master+Bait · 2002-05-31 11:21 · Score: 2

Sheesh. So many complex answers to such a simple question. Just put all the data on a Linux or BSD box and serve it with Samba and netatalk and nfs. Keep all the applications local, and use the fileserver for your data. When you save a file on the Mac, simply add a filename extension to the name. Netatalk handles Mac resource forks transparently and has been out for years and years.

Use imap for email clients and keep your email on the fileserver.

--
"Only in their dreams can men truly be free 'twas always thus, and always thus will be."
--Tom Schulman

SQL Database by Jester99 · 2002-05-31 14:56 · Score: 2

Where I work, there are gigs of data stored in massive Oracle SQL databases.

Obviously, if you are asking this question, you don't need such a high-powered system as this (we have a big-iron Sun machine that does the serving).

However, buying a powerful Dell server and running Access on Win2K would give you a decent SQL system to work with.

Applets can be written for any platform which will all use SQL and can then translate the results of the query into native stuff for the computer it's on.

Furthermore, look into Macromedia ColdFusion. CF can be used to quickly create web-based systems which interface with an SQL database ridiculously easily. (My department does just this.)

You can use a web app and a database to retrieve data and upload data, perform authentication, all sorts of great stuff.

Raid1? Are you on crack? by Inoshiro · 2002-05-31 17:31 · Score: 2

If you have enough disks to make 600GB with the 100% overhead of RAID1, I hate to think how much space you're wasting that you'd have free if you used something smarter like RAID5.

RAID5's overhead is a fraction of RAID1's overhead, and as long as you don't have a lot of drives fail at once (which is rare, and RAID isn't a replacement for backups anyway), you're much better off.
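
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes identical disks, RAID1 built from mirrored pairs, and RAID5 as a single array that gives up one disk's worth of space to parity. The 100 GB disk size and the function name `usable_gb` are hypothetical.

```shell
#!/bin/sh
# usable_gb LEVEL DISKS SIZE_GB
# Print the usable capacity in GB for RAID level 1 or 5.
usable_gb() {
    case "$1" in
        1) echo $(( $2 * $3 / 2 )) ;;      # mirroring: half the raw space
        5) echo $(( ($2 - 1) * $3 )) ;;    # parity costs one disk's worth
        *) echo "unsupported RAID level: $1" >&2; return 1 ;;
    esac
}

# Examples with hypothetical 100 GB disks:
#   usable_gb 1 12 100   # 600:  twelve disks mirrored
#   usable_gb 5 12 100   # 1100: the same twelve disks as RAID5
#   usable_gb 5 7 100    # 600:  RAID5 reaches 600 GB with seven disks
```

So under these assumptions the same 600GB takes twelve disks mirrored but only seven with RAID5.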

--
Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.

Control of technology by Tim+Ward · 2002-05-31 20:36 · Score: 2

None of us is in control of the technology we use.

I've long since given up reading the hardware specs for the processors I'm using and expecting to understand every wire on the circuit board and every byte of code in the PROM. (Yes, I used to do this.) It's just all too complicated, and one does wish to have some time left to use the stuff.

It all got too much for me when processors started caching stuff internally, so you could no longer see what they were doing by watching the data fetches with a logic analyser; it was at this point that you could no longer calculate how long a processor would take to do something, because the same instruction might take a different number of cycles depending on cache history; you had to just run the code several times and measure it.

So the fact that I don't have a copy of several million lines of source code that I have no desire at all to spend time reading doesn't bother me in the slightest.

Re:Control of technology by Beliskner · 2002-06-02 04:01 · Score: 2

I've long since given up reading the hardware specs for the processors I'm using and expecting to understand every wire on the circuit board and every byte of code in the PROM. (Yes, I used to do this.) It's just all too complicated, and one does wish to have some time left to use the stuff.
It all got too much for me when processors started caching stuff internally
You poor poor guy. You're a computer fanatic fallen from grace. Go to University (a good one like Imperial) and do a computing course taking the Simulation and Modelling module. Then you'll be able to calculate execution based on cache latency and associativity, LRU cache management, etc. At least the signals on the system bus haven't changed much. I think you're probably the only person who'd be against SiS integrating Northbridge and Southbridge onto one chip. I'll feel sorry for you when hyperthreading CPUs come out bwa ha ha haaaa! Look, if it goes over your head just stick with the worst case and put speed limiters in if the worst case is missed by caching.
Ironically your worst nightmare is Transmeta and Itanium - where the CPU modifies the code. You'll just have to face up to the fact that to model a mechanism this complicated you'll have to write a computer program yourself to predict these complexities. Remember, all chip and chipset manufacturers aren't breaking the laws of physics. How the heck do you get a logic probe onto the pin of a P4 without creating a pin to pin short anyway? I think it's possible if you use a hypodermic needle not to inject yourself but instead as a logic probe.
What I find funny is that you say you're a computer freak and yet on your CV you state you use Micro$oft Access and Microsoft C++ <Krusty the Clown> Bwa ha ha ha, huh, huhhhhhhhhhhh </Krusty the Clown>
Presumably your embedded development sparked your interest in knowing what the CPU does, well you'd be pretty stupid to use a P4 in embedded RealTime situations, unless it's supposed to double as a hand-warmer. Make your own CPU using an Altera FPGA.

--
A caveman dreams of being us, the incalculable power and riches. We dream of being Q, then what?

My setup by UncleFluffy · 2002-05-31 20:59 · Score: 2

Machines:

Dev box (PC - W2K/Linux)
Server (PC - Linux)
Firewall (PC - Linux)
Laptop (PC - W2K/Linux)
GF's machine (old Mac)
PDA (Psion 5)

All my working data lives on the server and is available to the other machines via Samba, NFS or netatalk. Backup via DDS3 on the server using afbackup, "minimal restore information" encrypted and mailed to a free webmail account so I can get to it if, say, the server catches fire.

Laptop has a directory under my ~ called "mirrored" which contains my current working set of stuff from the server. This is synced using unison whenever I come back from / head off on a trip to the office (I work from home 3 days a week).
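
As a sketch of that sync step: unison takes two roots, and a remote root can be written as an `ssh://` URL with a double slash before an absolute path. The host name `server`, the remote path, the function name, and the `UNISON_BIN` override below are hypothetical stand-ins, not my actual configuration.

```shell
#!/bin/sh
# sync_mirrored LOCAL_DIR REMOTE_ROOT
# Run unison between the laptop's working set and the server copy.
# -batch asks no questions; conflicting changes are skipped, not resolved.
# UNISON_BIN is a hypothetical override, handy for dry runs.
sync_mirrored() {
    "${UNISON_BIN:-unison}" "$1" "$2" -batch
}

# Example (hypothetical host and path):
#   sync_mirrored "$HOME/mirrored" ssh://server//home/me/mirrored
```

Drop `-batch` to have unison ask about each conflict interactively.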

GF has a home dir on the server which is visible on the mac desktop and has been told "put stuff here, it gets backed up, put stuff anywhere else, it's your problem."

Dev box and laptop are dual-boot linux/W2K, with a VMWare install running inside linux set up to boot from the physical W2K install, which can see both ~ on the host machine and on the server (if connected).

PDA syncs with Outlook (no email, just calendar and tasks) on the dev box - VMWare or real hardware, works just the same and the data is visible in both as it's the same physical drive.

Everything works very smoothly, except for:

- Unison, which dies if it tries to use more than 64M of RAM to do the sync. This has only happened to me once, when trying to sync about 40-50,000 files in one go. For normal day-to-day jobs, I've never had a problem with it.

- The W2K VMWare session on the dev box losing the serial port occasionally, which means I need to reinstall the port or boot into native W2K before I can sync the PDA. Not really a problem as it only happens very occasionally.

--

What would Lemmy do?

Re:AFS? Not suitable by sparcv9 · 2002-06-01 05:16 · Score: 2

Heh, I actually remember your posts to the OpenAFS lists. You're right, the server software doesn't quite work on OS X, and the Windows version is kinda dodgy. But, you only need the client software to actually access AFS-space, which works fine on both OS X and Windows. Put the server software on a couple of UNIX machines, and access the filespace from any OpenAFS-supported platform.

--

This is not a Fugazi .sig

Re:Alternative to IMAP by Matthew+Weigel · 2002-06-01 05:43 · Score: 2

...pine, Emacs's vm, etc.)

It's funny, but they support IMAP too. So does mutt, in fact. There's no reason to not use IMAP just because you only provide shell access... and to follow Ashley's line, if you have a laptop, then IMAP with locally cached messages gives you much better access to your mail if you travel.

And, if you ever provide mail for people in a different state or country, a mail system that's not dependent upon a constant and fast connection to your machine is pretty much necessary.

Take the taste test: consider setting up a super-small machine to host your mail for a little while, on IMAP; configure vm to use IMAP; go ahead and download the imap-utils package from UWash (it gives you things like icat, which cats messages from the server). See if you notice a real difference or not. A little Sparc IPX would be enough for this, with a tiny 3-4G drive. Just give it a try... heck, email me if you need help.

--
--Matthew

Slashdot Mirror
