What is the Worst Tech Mistake You Ever Made?
"In the interest of full disclosure, this is mine:
I was working at a Fortune 50 bank as a consultant. I was due to go on vacation for a week and the company did not have webmail. I decided that I would try forwarding emails to my corporate account. (I know this was a bad idea, and probably against several corporate policies.) I set it up so that any email that came in would forward to my consulting company's account. My mistake was I also left Delivery Receipt on. This was not Microsoft, it was Lotus Notes. The system began forwarding the incoming mail to my account. But then it would get a Delivery Receipt, which in turn would be forwarded to my account, which would generate another delivery receipt, ad infinitum. When I got back from vacation they claimed I had brought down the email system for 4 hours. This incident caused the bank to stop allowing consultants to set up email rules. What's your story?"
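For anyone wondering how a forwarding rule avoids this kind of loop: the usual trick is to refuse to forward machine-generated mail. A minimal sketch, assuming the receipts carry either an `Auto-Submitted` header (RFC 3834) or a `multipart/report` content type; whether Lotus Notes receipts of that era carried either is an assumption, and `should_forward` is a made-up name for illustration.

```python
from email.message import EmailMessage


def should_forward(msg: EmailMessage) -> bool:
    """Return False for machine-generated mail that could feed a forwarding loop."""
    # RFC 3834: automatic responses carry Auto-Submitted with a value other than "no".
    auto = (msg.get("Auto-Submitted") or "no").strip().lower()
    if auto != "no":
        return False
    # Delivery status notifications are sent as multipart/report.
    if msg.get_content_type() == "multipart/report":
        return False
    return True
```

With a check like this in front of the rule, the first receipt is dropped instead of being forwarded, so no second receipt is ever generated.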
During my IBM mainframe, COBOL, VSAM, CICS days at a large midwestern bank, I crashed the entire CICS online system for about a day because of a screw up when allocating memory within the CICS region.
About 300 branches were not happy.....
I launched SkyNet, right before my daughter and future husband rushed in to warn me. Boy was my face red!
"I'll say it again for the logic-impaired." -- Larry Wall.
staking mine and my family's needs in a technical career!
When I worked for a library I noticed a lot of files with red ! as their icons. I determined that they must be errors or duplicates. So I removed them. Turns out that in Windows 95 a red ! means that it is a critical system file.
And the library did not have the system source media anymore, so we spent the next day looking for any machines with a similar version of the deleted files and moving them back by hand.
Back in the mid 80s I was a jnr op on an old mainframe. Not much disk space so we used to save old audit trails to tape and remove them. Another pertinent fact is the DB starts UDX* and the audit trails start UDXA*
I wonder what might have happened if a certain jnr op had not been paying attention and thought he knew it all.
Yep, there go the audit trails and the database
:-( God knows why they kept me around.
"What is the biggest technology related mistake you have ever made?"
Statement by Slashdotters after the subpoenas start rolling in: "Posting an admission of wrongdoing on a semi-anonymous public forum, whose owners will most likely cooperate with law enforcement when asked about an admission of wrongdoing in a semi-anonymous public forum."
Vote in November. You won't regret it.
First computer, got it in 1993 or so. Still had it sitting around as recently as a month ago, tried to boot it, and it failed. Finally tossed it. Not too surprised. It had trouble from day one. Most significant was the bad 8MB RAM module that couldn't be removed because it was soldered on the motherboard.
rm -rf * in the wrong directory. god that sucked, i lost weeks of work. been keeping daily backups since then and aliased rm to 'rm -i'
My biggest mistake was finding this website. I've wasted more time here that could have been spent doing my job and getting actual work done.
Yoda of Borg am I! Assimilated shall you be! Futile resistance is, hmm?
One time we did "rm -rf /" just to test a backup on a server. Well, it turned out the best backup we could get was a month-old tape. It wasn't a production machine so we just had fun with it.
I'm reminded of when the metric/standard error lost the Mars probe. A labmate commented, "Boy, at least I'll never feel bad about screwing up a $20 experiment again!"
What I'm listening to now on Pandora...
Never, never, never, never commit to a schedule that is not realistic. If you know it isn't realistic before you get started, imagine what happens when you discover the unknown problems.
No matter how much that guy in marketing wants to meet his roadmap, he will not help you design, code, or test your product. If you are lucky, he will complete the requirements before you are supposed to ship the product.
Thankfully, that's the worst I've done so far.
Prevent email address forgery. Publish SPF records for y
The next day someone powered up the monitor to my old desktop (still at the office) and what did he see?
SQL Query Analyzer maximized with: (I still don't remember doing it.)
...five minutes later after coming back from getting coffee: D'oh!
I actually did this once... while logged in as root... at the top level in /home... on a production server. Thank baphomet for nightly backups!
Hopefully none of my clients are reading this. :-)
one day i was caught reading slashdot
I always had good luck with Maxtors, though I had only used old and small drives, 80-200 MB or so. When I was putting together a new computer for myself in '99, I thought I'd get another trusty Maxtor, a 6 GB. Pfft, bad idea. Thing failed in less than a year, taking all of my music with it; 5 years of dorky industrial music, recently copied over from a huge stack of ZIP disks. 100 songs.
grr.
Working toward a usable PDA environment in the spirit of Newton OS: Dynapad
Buying a PCjr. (Actually my dad made that mistake, and got it for me for christmas, about a month before IBM discontinued production. Fortunately, it had real keys at that point. Heh)
-Sean
None that I've done come to mind - I tend to make lots of little stupid mistakes rather than occasional huge cock-ups. But I had a client that had a CIO who was actively hostile to the idea of any kind of computer security whatsoever. Waste of time and money for a made-up threat, he said.
They were running 13 servers at remote locations (and I mean remote, as in out in the boonies 4 hours from nowhere on back roads) and these servers were unpatched, had out-of-date or inactive anti-virus and were connected to the net via a combination of satellite and dedicated (always on) dialup. Their communications were secured with nothing more than Windows 2000's built-in VPN.
Needless to say, my audit report told them that they had big beefy powerful angels on their side since they hadn't yet had a noticeable intrusion. (They had no way of detecting one, but at least the servers weren't hosting porn sites.) I warned them that a virus or worm would come along though and knock the whole thing out. The CIO scoffed at my report, called me an alarmist and said that my opinions were right up there with the Y2K doomsayers.
When Slammer hit, I had described the vulnerabilities and outcome so accurately that this guy actually accused me of writing it myself. Took the whole corporate network down and they couldn't bring it back up until their techs visited each site. It took two teams seven days to get to all the sites. The company lost 6 business days, three customers and a month's worth of transaction records.
Needless to say the CIO was demoted (they didn't fire him, which I consider a major tech mistake in itself) and had me re-issue my audit report, which they then followed to the letter, taking every precaution I suggested.
nooooo
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
When I was first learning Linux/Unix I installed RH5.something on my computer (Cyrix 6x86 133+ iirc). Anyway, I was having weird issues with several programs, so I decided I needed a fresh start; those darn dot files must be corrupted. (I was running as root. It was my personal box, what could be the harm?)
So I typed:
rm -rf .*
The disk started churning
about 30 seconds later
... ... damn, I must have a LOT of those dot files.
the disk is still churning
about a minute later
.... *DOH*
CTRL-C
Where did all the files go? DAMNIT! I recursively deleted ..
I learned my lesson very well:
CREATE AND USE USER ACCOUNTS!! DONT RUN AS ROOT IF YOU CAN AVOID IT!
Thoughts on tech, Software Engineering, and stuff
Telling that twerp Bill that he should quit school and try his hand in the computer industry.
I was young (around 8 at the time, can't remember) and I was bored one afternoon. I started fiddling around with the back of the computer, the PSU, to be exact. The red button looked fun to play with.
It was on 220v. I turned the computer on. It worked. Then I tried putting it on 110v and turning it on. Nothing. Then I switched it back to 220v, turned it on, and switched it to 110v while it was on.
Boom.
Moral of the story is, trial and error isn't the best way to learn hardware, and don't throw water on the smoking PSU while it's still live.
Founder of Mirror Moon - Tsukihime Game Trans
i think i'll post this anonymously.
i work for a financial transaction company. i had just started and i was nursing along a system the previous programmer had written. the code had this "feature" where a nightly job would rotate its logs by just opening and truncating a file with a date encoded in it. it also had a bug where it would get stuck. so the first time we hit that bug, i just killed and restarted the job. and again. and again.
the problem is that the log file contained the day's transactions. oops.
so i spent about 24 hours straight figuring out how to recreate the logs - in the end i did, with the sole caveat that i could only retrieve the date, not the time. luckily our bills only showed the date, not the time (which i set to noon that day).
ah, stress? what stress?
obviously all code i write does not just blindly open and truncate log files. that was just a vague rule before, now i'm rather fanatical about it. plus i record important info in multiple places. disk space is cheaper than 24 straight hours of stress.
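A rough sketch of the "don't blindly open and truncate" rule from the story above: move the old log aside under a dated name and never overwrite an existing archive, so killing and restarting the job can't wipe the day's records. `rotate_log` and the naming scheme are made up for illustration, and the sketch ignores races between concurrent jobs.

```python
import os
import tempfile
from datetime import date
from typing import Optional


def rotate_log(path: str, day: date) -> Optional[str]:
    """Move `path` aside to a dated name instead of truncating it in place."""
    archived = f"{path}.{day.isoformat()}"
    n = 0
    # Never overwrite an earlier archive: a restarted job gets a new suffix.
    while os.path.exists(archived):
        n += 1
        archived = f"{path}.{day.isoformat()}.{n}"
    moved = None
    if os.path.exists(path):
        os.rename(path, archived)
        moved = archived
    # Append mode creates the file if missing and never discards existing data.
    with open(path, "a"):
        pass
    return moved
```

Running it twice on the same day leaves two archives side by side rather than one empty file.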
I think the articles implication of "the more we learn, the less we think" is wrong.
Back in 1998, we were working on site deploying a new product to a customer. The product required us to create a new database on MS SQL Server. Well because of the size of this database, it takes over 5 hours to create. We could not continue on with the deployment until this was finished.
Well when it finished, in a rush to get out of there, I accidentally deleted the database and had to restart the process all over again. Many a cow-orker was pissed at me. Had to stay an extra day to complete the deployment.
Mid-Eastern Pennsylvania Gaming Convention
I've lost my machine to cheap power supplies. The first time I thought was just a freak accident (blew the motherboard, CD drives, hard drive), since then I go for the Enermax and not some unbranded power supply.
Archie - CIO-for-hire
So this wasn't a production machine I screwed up or anything, but I'm still a moron.
I had a Linux workstation that was ultimately adopted by the development group I worked with in the late 90's. Anyway, for some reason I needed to make a boot disk from an image. For some other reason, while typing in my command line, I was thinking fd0 but managed to type hda. So my line was dd if=/wherever/whatever.img of=/dev/hda.
Anyway, before looking at what I had typed, I hit enter. About 2ms later, I glanced up at what was on the screen and exclaimed something along the lines of "holy fscking shit!" and simultaneously hit a ctrl+c. Interestingly enough, the drive still kind of worked. I tried copying the contents of the disk over to another device, but I found that with each command - nay, each disk access, the filesystem would disintegrate further. I was able to save /home -- but I otherwise had to reOS the system.
I guess I've done much more stupid things with production machines -- but these were better machines, with storage on a NetApp NAS, which all had snapshots, so recovery was nearly instantaneous.
These are not things that I include on my resume. (So -- anyone want to hire a disaster waiting to happen?) ;)
-Turkey
mv some.file /dev/hda1
no wait, i meant /mnt/hda1!
aw crap.
These are partly the fault of the company, as the development/qa/production all occur on the same machine, but anyway, here are a couple.
/* do stuff */
I deleted a client's production database. I got lucky with this one, since even though it was a major client, the backup was 6 hours old and they were only using it for the last 90 minutes or so.
When experimenting with how to use the fork() system call, I wrote a program that was more or less equivalent to:
#include <unistd.h>   /* fork() */

int main(void)
{
    while ( 1 )
    {
        fork();   /* every pass doubles the number of processes */
    }
}
which brought the server (production) to a screaming halt. (For those not familiar, this will basically spawn processes indefinitely. We had to reboot.)
William of Ockham had no beard. The most likely explanation is that it was chewed off by squirrels every morning.
Oh man. So I was a grad student, right? I was always trying to portray myself as a very serious, dedicated student to my thesis advisor. And he had the fastest computer in the department (a Sparc10!) and he gave me permission to use it for batch runs. So I pretty much kept one of my xterms as a remote terminal to his machine.
Anyhow, one day I found this funny .au (sound) file and wanted to play it for my office mates. So I did a 'cat naked.au > /dev/audio'. Nothing happened. So I turned up the volume and tried it again. Still nothing. Then I screeched in horror! I was typing this command in on the xterm I use for my advisor's machine! Sure enough, two seconds later an email comes trickling in from my advisor stating 'Please note that you are logged into my machine so your sound file is coming through my speakers.'
So what was this sound file that I had inadvertently played for my advisor?
Butthead: "Whoa! Naked chicks!"
Beavis (excitedly): "Yeah! Naked chicks! Naked chicks!"
GMD
watch this
Okay, so "back in the day" (Amiga!), I got this really spiffy new floppy disk copying program, and decided that would be a good time to make backups of all my floppies (no HD back then). So I fired up the new copying program, and made backups of all my floppies.
Unfortunately, I hadn't taken quiiiiiite enough time to learn how this new copy program worked - it LOOKED pretty easy to use, but, well...no.
Instead of making backups of all my floppies, it had reformatted each of the floppies I put in to make backups OF, and formatted the 'copies' I thought I was making, thus wiping out my entire floppy collection.
*sigh*
But I did feel a bit stupid watching my school spend $160/box to roll out 16-bit Microsoft TCP/IP for WFWG 3.11 on all of the machines. A week after the order was made, a free 32-bit beta that worked quite a bit better was announced :-(
One of the worst personal mistakes is when I installed Linux 0.98 from the boot/rootdisk combination to my hard drive (previous versions I had to use from floppy due to missing support for my SCSI adapter). I had a partition set up for it and everything. When the thing asked where I wanted to install, I answered /dev/sda instead of /dev/sda2 :-( (where sda1 was my DOS partition with code I didn't want to lose, of which I had no backups :( )
Of course, that episode made my transition to Linux as the primary OS a lot faster, since I had to restart those projects from scratch and started using gcc instead of Borland while I was at it :-)
I was planning an update to our global network to convert from RIP to OSPF so we could do some trivial load balancing.
Step 1: I telnetted into our core router and turned off RIP.
Step 2: Turn on OSPF.
The problem was that turning off RIP first killed all of our traffic world-wide. Oops. 10 seconds later OSPF brought the network back up. Sigh.
LOAD "SIG",8,1
LOADING...
READY.
RUN
My mistake was to give the techie "thumbs up" under pressure. I folded to the "We needed this yesterday" argument despite my misgivings about the software. I paid for that mistake for the next year in slavish tech support. We became the software company's test bed as we found bug after bug. The software "worked", but operator efficiency dropped, and uptime was sub-optimal. "Customization" caused problems, etc., etc.
The second mistake I made was to attempt to use VPN over broadband with Citrix MetaFrame. Although MetaFrame was a pretty secure and slim protocol for remote desktops, the Internet provider on the remote site had horrible latency problems and was run by a group of amateurs. I should have stuck with the original Sprint frame relay proposal.
Morals of the story: don't let PHB push you into a solution you don't trust, and when network reliability is important, pay for assured quality of frame relay.
assert(expired(knowledge));
I was a young pup in the Army, during a training exercise. My Commander told me to kill the network, to "simulate" its loss. We were operating a frequency hopping radio network, which of course is based on time. As the master node, I controlled the time. I pumped my transmitter to full power, and slowly pulled the stations that could receive my signal out of time. Lowered power, pulled a smaller number of stations even farther out of time. Wash, rinse, repeat.
Commander thought I was brilliant, and so did I. I had fractured our network into at least 10 different domains. No one could talk to anyone, effectively "simulating" an enemy jamming attempt. It would take hours to restore the network, with many mad commo guys having to drive about with Pluggers, early GPS devices, to restore each radio to proper time.
Then a tank flipped. Someone died. No one could call for help. I am so damn smart.
No-moon black, at 2 in the morning, in an upside-down tank, the gunner figured out how to put his radio in plain text to call for help. It took him almost half an hour.
If voting were effective, it would be illegal by now.
A couple of years back (using a Mandrake distro that was still basically Red Hat with some changes) I was messing around with the fstab file, trying to get a cdwriter to work, and decided that I had screwed it up beyond repair. Since I was coming from the "Wonderful World of Windows" I expected these files to autogenerate themselves on bootup (how? why? etc etc). So I deleted the fstab file . . and upon reboot found that, no, the absence of an fstab file is not a quick way to have a new one generated.
Posting your worst tech mistake ever on slashdot and forgetting to check "Post Anonymously".
I worked in K-12 for a few years as a network engineer and programmer (weird combo, but it's served me well).
The particular district I worked for supported three school districts on a single Data General minicomputer. This meant three separate student databases. When we wrote new apps, we would install the app and then create a symbolic link to the student database for which we were installing it. In this way, a single new app could be easily rolled out to all three.
Once... (just once!), I rolled out the new app and linked to the wrong student database. Without remembering that I had changed directories, I deleted the symbolic link to the student database... or so I thought until the phone started ringing. I had deleted the entire student database - not the link.
Fortunately, we had a good backup from the night before and this had taken place in the morning...
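One way to guard against the link-versus-database mix-up is to check that the thing being deleted really is a symbolic link before touching it. A minimal sketch for a POSIX system; `remove_link_only` is a made-up helper, not anything from the Data General box in the story.

```python
import os
import tempfile


def remove_link_only(path: str) -> None:
    """Delete `path` only if it is a symbolic link; never touch a real file or directory."""
    if not os.path.islink(path):
        raise ValueError(f"{path!r} is not a symbolic link; refusing to delete")
    os.unlink(path)  # removes the link itself, not what it points to
```

Pointed at the real database directory by mistake, it raises instead of deleting.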
Not really a tech blunder but...
Many years ago when I was an operator of an IBM 370 MF, I was showing a junior operator around the system console. My famous last words were "never, ever, ever press this button," as I proceeded to depress the system halt button. Alarms went off, managers came running into the room. It was a mystery to all but me and the junior. We still laugh about that.
-Dumbass
rm -f * ~
(Should have been 'rm -f *~')
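To spell out why one space matters here: the shell splits on whitespace first, so `* ~` is two separate arguments and the `*` matches everything. A small sketch that mimics this with `shlex` and `fnmatch`; the file names are invented, and `fnmatch` is only an approximation of shell globbing (a real shell skips dotfiles for `*` and expands a bare `~` to the home directory).

```python
import fnmatch
import shlex
from typing import List


def files_matched(command: str, names: List[str]) -> List[str]:
    """List which of `names` the glob arguments of `command` would match."""
    args = shlex.split(command)[1:]  # drop the program name
    patterns = [a for a in args if not a.startswith("-")]
    matched = set()
    for pattern in patterns:
        matched.update(fnmatch.filter(names, pattern))
    return sorted(matched)


names = ["report.txt", "report.txt~", "notes", "notes~"]
print(files_matched("rm -f *~", names))   # only the editor backups
print(files_matched("rm -f * ~", names))  # every file in the list
```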
Bot Assisted Blogging
Obviously the thing crashed before me plus a couple of colleagues that were working with me on this box. All of us were all like, 'ooops' or 'shit...' The router was configured with something like 20 BGP peers, carrying multi-gigabit internet traffic across one of the largest East Coast backbones.
Lessons learned: targeted traffic debugs (against an ACL), disabling console logging (only buffered) before launching heavy debugs.
installing and using Redhat 6.1 with Gnome.
:)
I got to use the rm * -fr command though
Not about computers, but: While working for a Physics research lab, I made a laser water jacket without a bulge at the end of the inlet pipe. The water pressure rose at night, the tubing slipped off, and the 2nd floor and part of the first was flooded, including expensive test gear like oscilloscopes.
I did something similar on my beloved C=64.
I borrowed a game from a friend, and wanted to copy it. Of course, it had the classic 'deliberate bad checksum' anti-copy protection, which meant nothing more than loading a disk copying program that would handle it.
About half way through the first phase of copying, it suddenly dawned on me that I was using my disk copying floppy as my destination disk. I immediately pulled it out of the drive, thus ensuring I had neither a copy of the game nor a copy of the software required to try again!
Working as a consultant I turned up at a new customer (moderate sized pharmacy) to see what they needed. Walked in, all confident, the local tech guy met me, and I asked to look at their server room (I always liked seeing the hardware).
Anyway, as we are standing there, I think, well, let's see how many users they have, so I ask if I could look at the Name & Address book. Opened up the people view, hit Control-A to see the count at the bottom of the screen of the number of records. Unfortunately it was a very small Compaq keyboard, hit delete as I turned to the local tech..
"I see. The fact that you...`can't explain'.. explains everything."
Harddrives and PCI cards are NOT hotpluggable.
-------
Support Indy Music. Buy
I was working on a Cisco access list one day, and was working on blocking IM clients per my manager. I tested it when I was done and sure enough, it blocked the clients, they could no longer log in. Satisfied with myself I went to browse to slashdot. "hmmmm Slashdot must be down" I thought. So I tried somewhere else, maybe my computer is having problems. Then the IS phone started ringing off the hook. Dammit! Everyone is down. Nobody can do anything. Email is down, Web is down, VPN is down. I went to go look at my access list, but my terminal connection to the Cisco wasn't working either. Walked back to the server room, and rebooted the router (I never EVER save to nvram for at least a week). Everything came back online. I went and looked at my access list and saw I had forgotten the allow rule at the end. All Cisco access lists have an invisible "deny all" at the end, and I just forgot to say "allow all". Oops.
The end of the day tally? at least 7 locations and 2 states.
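For those who haven't been bitten by the invisible "deny all": access lists are evaluated first match wins, and a packet that matches nothing is dropped. A toy model of that behaviour, not real router syntax; the rule format is made up, and port 5190 just stands in for whatever port the IM client used.

```python
from typing import List, Tuple

ANY = -1  # wildcard port for this toy model
Rule = Tuple[str, int]  # (action, destination port)


def evaluate(acl: List[Rule], port: int) -> str:
    """First matching rule wins; if nothing matches, the packet is denied."""
    for action, rule_port in acl:
        if rule_port == port or rule_port == ANY:
            return action
    return "deny"  # the implicit deny at the end of every list


broken = [("deny", 5190)]                  # blocks IM, then falls through to deny
fixed = [("deny", 5190), ("permit", ANY)]  # blocks IM, lets the rest through
```

With `broken`, web traffic on port 80 falls off the end of the list and is denied along with the IM traffic, which is exactly the outage described above.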
/* oops I accidentally made a comment, sorry */
Luckily I was able to reconstruct it, pretty much from memory and using other pages.
yep. good times.
Years ago, I once put together an html prototype app that had bogus links href'd as "xzy" or "xxx" - it seems that around that time Netscape started doing you the favor of embedding it in www.-.com, so a curious user found himself at a porn site.
Actually, he was a consultant from a competing company, and he made a *huge* deal out of it, cc'ing the client's department heads and VPs. There was a big uproar for about an hour, but in the end, no one really cared, and the other consultant came out looking like a grandstanding byotch.
My friend and I were attempting to put together a computer for him about a year ago. The first motherboard we got was faulty (couldn't get a post), so we got a new one. The first few times we tried it we got some system beeps, so we looked them up and got rid of them. Finally, the bios screen came up and we both threw up our arms in jubilation, then the screen went black like it was shut off. I asked "I didn't shut it off, did you?"
"No"
"What's that smell?"
"Hrmm...looks like the fan's loose. Oh well, I guess I fried the processor"
-- Political fascism requires a Fuhrer.
Actually, this was fairly recent:
The Chemical Engineering Department at my university purchased a 42 node rackmount beowulf through my department (well, the one I work for), the Research Computing Department. We were to assemble the cluster and house it in our server room.
Since my boss was busy working on a visualization project and my co-worker was out sick, I was handed the whole job. I decided that the best way to organize the cluster would be to put the 42 nodes in one rack and the head node, switch, and UPSs in the other. Well, when I put the 42 nodes in the rack, I had forgotten to drop the anchors at the base of the rack. The wheels on the left side at the base gave out under the weight (each node weighs in around 40-45 lbs) almost causing the whole thing to fall over. I almost trashed a $100k beowulf because of a stupid mistake.
Luckily I was able to fix the rack and rearrange all of the nodes before the guys at Chem. Eng. came to inspect. That would have been the end of my job.
I stuck a bare CD into a drive that required a caddy, before I knew that kind of CD drive existed. This was at school in the 7th grade (I'm a college senior now). If anyone had seen it, there would have been an uproar, because I was a well known, but not well liked (at the time) computer nerd. As I was too rushed to get it out to think carefully, I committed another foolish act by fishing it out with a paper clip. Luckily I didn't cause many scratches.
As I got home from work, I got a call from the company president saying that there was a power failure in the neighborhood and all of the UPS'es were going. We didn't have any intelligent processes, like auto shutdown, etc, so he wanted me to remote in and safely poweroff the production and development servers. I knew there was a time limit, because we were stressing the UPS'es as it was, and the power had already been out for a while.
I got in through the firewall and started ssh'ing to various machines to poweroff. Unfortunately, at one point, I exited out of one shell and ran poweroff before logging into the next. Lo and behold, I was powering off the machine that I was connecting through. DOH!
Luckily, the power did return shortly, and the ones that I didn't get to survived long enough to stay on UPS.
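A cheap guard against powering off the box you are sitting on is to make the script compare the host it is actually running on with the host you think you are on. A minimal sketch; `safe_to_power_off` is a made-up helper, and comparing only the short host name is an assumption that may not suit every site.

```python
import socket


def safe_to_power_off(expected_host: str, actual_host: str = "") -> bool:
    """True only if the machine we are on is the one we meant to shut down."""
    actual = actual_host or socket.gethostname()
    # Compare short names so "db1" and "db1.example.com" count as the same host.
    return actual.split(".")[0].lower() == expected_host.split(".")[0].lower()
```

A wrapper would call this with the intended target and bail out when it returns False, before ever reaching poweroff.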
I have misplaced my pants.
I was doing phone support for a national bank in Canada. One of the problems we routinely had was a connection would freeze up on a teller's terminal in one of the 1000s of bank branches across the country. We'd have to go into a program running on our AS-400 and reset the connection. On the odd occasion it wasn't just one terminal but several at the branch. We'd have to get all the tellers to exit out of their terminals for a second, then, in the program, we'd essentially hit the 'back' button, be up one level so we saw all the connections by bank branch instead of by terminal, hit 'backspace' to send the command to reset the connection and then 'y - enter' to confirm.
I got one of these calls, and I went one level up the tree, got distracted by something, and without thinking hit up-backspace-y-enter, going up two levels in the tree instead of one. This reset all the connections for the whole network, to all the banks, all across Canada.
Every phone in the call center started ringing. Every LED that could flash red did so. Everyone in the call centre looked around frantically. I looked at my terminal and almost died on the spot.
Not only had I reset all the terminal connections, but trying to bring them back online flooded the network, so as soon as they tried to come back up they all went offline again. It took several hours to get things stabilized so the banks could start serving customers again.
Fortunately my boss was a decent guy. He saw it as an accident and something that no one should be able to accidentally do. The command to reset the entire network was modified so you had to type in your password to confirm, instead of just 'y-enter'
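The fix described above asked for a password. A related variation, sketched here under my own assumptions rather than taken from the bank's system, is to make the operator type the name of the scope being reset whenever more than one connection is affected, so muscle-memory 'y-enter' can't confirm a network-wide reset. `confirm_reset` and its arguments are invented for illustration.

```python
def confirm_reset(scope: str, affected: int, typed: str) -> bool:
    """Allow a reset only if the confirmation matches the size of the blast radius."""
    if affected <= 1:
        # A single terminal: a plain yes is enough.
        return typed.strip().lower() in ("y", "yes")
    # Anything wider: the operator must type the exact scope name.
    return typed.strip() == scope
```

Here `confirm_reset("ALL-BRANCHES", 1000, "y")` is refused, while a single-terminal reset still goes through on 'y'.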
I had my Windows partition mounted under a Linux directory. I decided to move the mount point to another location, so I tried to delete the current mount point. I forgot that it was already mounted and ended up deleting the entire Windows partition.
Ooops. So, don't drink and root. Important safety note.
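A small sanity check would have caught the still-mounted partition. A minimal sketch using `os.path.ismount`; `remove_mount_point` is a made-up helper. Using `os.rmdir` adds a second safety net, since it only removes empty directories.

```python
import os
import tempfile


def remove_mount_point(path: str) -> None:
    """Remove an unused mount-point directory, refusing if something is mounted on it."""
    if os.path.ismount(path):
        raise RuntimeError(f"{path!r} is still mounted; unmount it first")
    os.rmdir(path)  # fails on a non-empty directory instead of recursing
```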
"He who would learn astronomy, and other recondite arts, let him go elsewhere. " -- John Calvin, commenting on Genesis 1
(comic book guy voice)
By far, my worst tech mistake was dropping out of college to take a full time job as an outsourced computer admin. Not having my degree has kept me from being competitive for better jobs with larger companies.
I love my job now, but I don't have much room to grow, being as I'm the top IT guy in a 70-person company that's family owned (and I'm not in the family). I'm working on finishing my degree now so that when the time comes to move on, I'll be able to find jobs that have room for growth.
Blogging Weight Loss, Distance Education, and more at verlin.com
Forgetting to back up my system before installing a new OS on the HD 0 (when I meant to install on HD 1).
In 2000, a co-worker was migrating a large Catholic diocese (one of the top ten, say) from Novell to Microsoft (I still don't know why), as I had somewhat purposefully (on my part) been asked not to come back for a while (but that's another, dumber story).
Anyway, not having done any such migrations before, after thoroughly RTFM, he set up, almost entirely correctly, the migration service and began moving users. The syncing tool was set to run just before backups, so that the backup would reflect that day's migrations and updates.
It was supposed to go like this: copy all files from the Novell directory, nightly, to the new user directories on Microsoft shares unless the Microsoft file was newer (hence indicating that user was migrated), and eventually all users, over the course of a week, would be migrated and the sync turned off. Everyone transparently suddenly works with Microsoft shares and la di da off they go.
It was an excellent plan with the exception of forgetting to check the little box that made sure that newer files were not overwritten with the old ones from the (now defunct) Novell servers during syncing. So every night the old files would overwrite the newer ones. People started to complain about the third day that their changes to documents and such weren't "sticking", and on the last day of the migration, we figured out what had happened.
So every night, before backups, the newer files were being overwritten and then backed up. This included the Accounting, Newspaper articles, judgements, spreadsheets, EVERYTHING. For a whole week, 1600 users lost their data and it wasn't backed up on purpose. Oops. Funny thing though, our company kept the account and what remains of that company still works on it to this day!!
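What the "little box" was supposed to do can be sketched in a few lines: copy a file only when the destination is missing or older than the source. This is an illustration of the rule, not the actual migration tool; `sync_file` is a made-up name and the comparison relies on file modification times being trustworthy.

```python
import os
import shutil
import tempfile


def sync_file(src: str, dst: str) -> bool:
    """Copy src over dst unless dst is already at least as new. Returns True if copied."""
    if os.path.exists(dst) and os.path.getmtime(dst) >= os.path.getmtime(src):
        return False  # the user's newer copy wins
    shutil.copy2(src, dst)  # copy2 keeps the source's modification time
    return True
```

Leave out the `getmtime` comparison and every nightly run quietly puts the old Novell copy back on top of the day's work.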
What happened to the co-worker? Well we all just kinda laughed it off and that 19 year old kid became the second youngest CCIE up to that point in time, and a year later got his second CCIE in security and is making comfortably north of 120k/yr now.
-- This sig has a cholesterol count of 680... higher is better right?
My best single mistake was after a long night of re-installing an updated version of Solaris on a SparcServer 2. I needed to clear out the /tmp dir for some stupid reason. So, I did the old: "mkdir newdir ; mv * newdir"
I wasn't in /tmp. I was in /.
My next command was 'ls'. It returned: unable to find /usr/lib/libc.so.0
AAAAARGH!
I now know how to solve that under Solaris. Under /usr/sbin/static there are 5 statically compiled binaries: cp, ln, mv, rcp, and tar. /newdir/usr/sbin/static/mv /newdir/* / would have fixed it.
Ever since then, my prompt has had my current directory in it. That experience certainly made me more careful.
Better (or worse) was when a stupid service rep came in to replace a bad CPU on a sun e10000. The idiot shut down the sub-system, and powered off the board correctly. He then managed to pull out the wrong board, despite the blinken lights. Of course it was the peoplesoft domain. Running year end reporting.
AAAAARGH!
Zapman
We learn from our mistakes? Speak for yourself bitchtard. - Geccie
Freshly out of college I linked ls to l. But I reversed the parms to ln and linked ls to nothing. Not fun. Try getting around Unix without ls.
I was a newbie sysadmin who didn't fully grasp the concept of the tar command. I untared an entire
I knew I made a mistake when exclamations, gasps, and quizzical mutterings began to fill an otherwise quiet graduate laboratory. Before I had a chance to fully grasp what I did, my three supervisors walked in the door, having seen my command scroll across their syslog terminals.
Michael.
Linux : Mac
One time I pulled out a 1U rack mount chassis from our rack only to hear it power down as it got about halfway out. When we installed the system, we didn't use the cable management system in the back to snake the power cable through, so when the chassis was pulled out to install a hot-swappable device, it yanked the power cord out.
Luckily it wasn't a really important server.
Lesson learned: Use cable management systems, or give yourself enough slack so that when you need it, you'll have it. No sense in having hot-swappable components if you power down the system when you pull it out of the rack.
There are 01 types of people in this world. Those that understand binary, and me.
Unfortunately, the library I defragmented was SYS1.LINKLIB -- out of which most of the operating system was executed as well as the defragment utility itself. Very quickly after the utility started moving programs around on the disk the operating system ground to a halt.
The real systems programmer, who had been up all night, was NOT amused. He had to come in and I helped him perform a DR. The mainframe system was down all day -- I wasn't very popular.
Somehow I didn't lose my job. In fact, I eventually came to own technical support and operations there. But 10+ years after my screwup people still liked to remind me about it. Guess they figured it kept me humble.
I had never used anything besides unix. Once a brand new alpha 500 arrived for me to play with on my first week as a PhD student.
While trying to cope with stupid licence restrictions, the fact that $HOME for root is / and that rm is non-interactive, I accidentally deleted (I think) the kernel of the machine. The fact is that it wouldn't boot!!!!
Sweating like a pig I tried to explain to my supervisor and unfriendly IT guru how after 3-4 hours of unpacking the most expensive computer in the lab was dead.
I plugged it in, the little green light came on, the fan spun up, and I noticed the '220V' sticker right next to the power socket. So, I promptly yanked the plug and hoped that nothing bad had happened. After a second, I shrugged, figuring that putting in half the power it was expecting probably wouldn't do anything.
I walked out and found a 110 -> 220 transformer. I plugged it in and plugged the computer into it. Again, the light came on, the fan spun up, and...
POW!
Suddenly, it was very quiet, because everybody quiets down when they hear a loud noise. It was especially quiet because all of the computers in the room were now shut off. A second later, I hear someone in the next room over ask, "Hey, did your computer just shut off?".
At this point, people are looking at me, because I was the source of the noise. I promptly (and of course, futilely) yanked the cord again.
About this time, I noticed that the power supply didn't have a 110 <-> 220 slider like most of them do. I looked more carefully, and found it. It was set to 110. I had missed it the first time because it was CONCEALED UNDER THE '220V' STICKER!
It turned out that the room I was in was almost overloaded, and that computer blowing out was enough to do it in. We rerouted some extension cords and all was well.
You know, [CTRL]+[Z]?
In Windows Explorer, it undoes the last file operation....
I was at a client's site a few years ago moving some data files around on their NT server and archiving old things that weren't needed anymore. As a part of this, I renamed a "particularly named" folder and hit [ENTER] just as the thought occurred to me that I needed the exact spelling of the original folder name before I changed it. So, without thinking I hit [CTRL]+[Z] to change the name back figuring I could just copy the name to the clipboard and rename it again. HOWEVER, instead of un-doing the recent "folder rename" operation, it undid a much earlier "folder COPY" without telling me which folder or where it was located!
Luckily, I was able to retrace my steps and figure it out well enough to put everything right, but NEVER AGAIN HAVE I [CTRL]+[Z]'d in Explorer!
"Lawyers are for sucks."
- Doug McKenzie
I work for a telephone directory publisher. A few years back, we were pushing a deadline and the man was not happy with the completeness of zip (postal) code info in the book. I purchased a new zip coding utility, ran it against the listings, and told the production dept to proceed with pagination, thinking that the army of proofreaders we have would notice any errors introduced by the new software.
I mean, what, I'm supposed to proofread the entire phone book by myself?
Anyway, the software used some kind of crazy soundex routine to "fix" addresses that it wasn't able to resolve, and thousands of people ended up with completely incorrect address information. The book went to press, was distributed, and a day later the phones were ringing off the hook. We had to pick up the old books, fix the data, schedule more press time (no easy feat), re-print, and re-distribute.
Total cost to correct was around $1M, got my ass chewed royally, but managed to keep my job anyway.
Must be doing *something* right!
One time I was installing a new hard drive for a customer. He wanted the data transferred from the old drive to the new one and the old drive discarded. The company I worked for had just obtained a copy of Ghost and it was our first time using it. I mixed up the two drives and copied the larger (new) drive to the smaller one and destroyed all the client's data.
Breaking out the Wayback machine, some interesting war stories can be found in the first article I submitted to Slashdot, "Embedded Computer War Stories".
"Prepare for the worst - hope for the best."
We had a drive in a RAID5 container go bad at work a few years back. This was a raid controlled by an old Mylex DAC960. A guy I worked with went to replace the drive. He powered the machine down, took out the bad drive, put in a new spare, and powered the machine back up. In the controller setup he was presented with two options: rebuild and make online. Well he chose "make online" instead of rebuild. This caused the controller to render the raid volume completely useless. He chose poorly.
-----BEGIN GEEK CODE BLOCK----- Version: 3.12 GIT d? s: a-- C++++ UL++++ P++ L+++ E- W++ N o-- K- w--- O- M+ V PS+ P
You might be able to hear the mournful call of the endangered species known as "actual irony".
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
One time I accidentally low-level formatted the wrong hard drive. I wanted to do the 1st (Primary) hard drive and the BIOS started its drive enumeration at zero. So of course without thinking I put in a 1 instead of a 0.
I think the most important thing that was on it was my collection of digital music that I had spent over 1.5 years downloading from random FTP sites (this was before Napster) via a 28.8 and later a 56k modem.
Fortunately, a friend had a recent copy of my music from our recent LAN party.
This user account is inactive account replaced by the PDA
Last year a friend gave me a Pentium 200 MMX that he couldn't get working. Since my parents were in need of a firewall I figured I would drop a couple of NICs into this box and build one for them.
The first thing I did was plug in a keyboard, monitor, and turn the box's power on to see if it would reach the POST.
Smoke started coming from the box, and soon open flame. For a brief moment I just stood there looking at it thinking, "That's interesting. First time I've seen a computer catch fire." Then I pulled the plug from the wall and the flames soon stopped.
I looked into the case to see what went wrong. It seems that the power supply connector for a floppy drive is roughly the same size as a speaker connector on the sound card. My friend had plugged the power supply into the sound card which seems to have caused the fire when the power was turned on. I suppose I should have checked for something like this instead of just plugging in the machine.
For a while about a year ago, my friend had no Airport card on his laptop like I did. So when he was over at my place, I'd be using my laptop's wireless to get online and do internet sharing on the wired network card. Well, that worked great at home...
But then I'd come into work the next day, hook up my laptop to the network and it would start giving out IP addresses through DHCP to all of the hundred some workstations at my work. Well, suffice it to say, those IP addresses didn't work too well, and it wasn't too hard to track it down. After this incident happened three times, I'm no longer allowed to connect my laptop to the network.
So, I bought my friend (who is also a co-worker) an Airport card, and now he shares his wired network connection with me. So it's all good.
Trying out Kmail was my biggest mistake, because it had a different interpretation of the file OUTBOX than did my previous mailer. My previous mailer stored every email (6 years worth) in OUTBOX. And Kmail took OUTBOX to be the file where messages written offline were temporarily stored until next coming online. The first time I fired up Kmail, an indeterminate-time progress bar came up, and it kinda hung. I went to get a coke, giving it time to snap out of its funk. Unbeknownst to me, during that time it re-sent every email I'd ever sent. When I got back and checked my INBOX, I screeched in horror.
Funny thing is, people from my previous job were getting work related emails from me again, and they didn't seem to mind that (1) they were on outdated topics and (2) the company was defunct, they played right along and replied stuff like "yeah what ever happened to that issue?".
That one has fucked me every time. The funny part is I still do that.
Computers are useless. They can only give you answers.
-- Pablo Picasso
When I was a CS student @ WWU a few years ago, we were writing some simple tcp/ip client/server applications in our unix lab. All of the machines' home drives were NFS mounted from a central server, and there were no user quotas.
My server would log an error if a client was unable to connect, and one of my classmates had decided to 'mess' with my server by hammering it. I guess he hammered for quite some time, as my log file eventually filled up the NFS share and all of the machines using it (consisting of many people doing their assignments!) decided to halt.
I consider it more of his big mistake than mine; it was a pretty early class and it definitely wasn't production code, but it still caused some damage.
In another life, I was a developer/lead on an embedded system that controlled the cabin lighting, reading lights and attendant call lights on a popular passenger aircraft. Basically, all automated functions (except for in flight entertainment - movies, video games and such) that are used by anyone in the passenger cabins. One input was a discrete that indicated when a toilet valve was thrown (flushed). The discrete would stay set if the waste tanks were filled with, er, waste. Our system would light the "toilet occupied" light above the door, which we normally only lit when another discrete indicated the door was locked.
For various reasons, we were upgrading the computers to an embedded 80186, a workhorse 16 bit processor that my company had a lot of experience with. As a technical lead, I ported an executive (it would be misleading to call it an "embedded operating system," but it did the same job) that we had used on another 80186 based system.
The discrete driver used a linked list to handle queued inputs, chaining a number of structures together, each about 20 bytes big. I did not need a lot of the features the driver had, so I pared it down to a three byte structure. What I didn't know was that the garbage collection routine did a type casting trick with a four byte pointer (instead of putting the pointer into the structure itself or using a union) which would overwrite the three byte structures. The net result was that the chaining was worthless: the driver interrupt routine could only handle one input each time it ran (at 60Hz).
During our in house testing, we hit the discretes as fast as we could, but we never caused the bug to show up. Our customer, however, had a dedicated lab that simulated an entire 400 passenger aircraft. To allow a single operator to stress test the system, a lot of the inputs (passenger reading light switches, attendant call switches, etc) were wired together. One such arrangement was wiring two "toilet waste tank full" discretes together.
So, shortly after we delivered our first hardware and software to the lab, the lab guys were checking out the wiring on their test stands. One of them threw the ganged switch (that is, flushed two toilets). My driver received two inputs in the same 1/60 of a second. It read the first, processed it just fine, then tried to read the second via a corrupted pointer and... BAM!!... my computer reset.
The lab guys thought it was hilarious. My boss was less amused, I'm afraid. Ironically, I now work for my former customer, albeit at a different division. I guess word never got out...
"Prepare for the worst - hope for the best."
While working as a Systems Consultant for a short while my boss (the COO) decided to increase revenues by contracting us out to be on call for the local International Airport. We had to carry a pager and needed to give them a 1 hour response time. Problem is--my boss signed us up for the contract but never got us any training or received any documentation on the Miltope printers used to print plane tickets and baggage tickets. I was called out to fix this Miltope printer because it was misaligned. Our support line was no help at all; none of the tech support people had ever heard of, used, or fixed a Miltope printer. So I decided to disassemble it to see if it was physically out of alignment. Anyway, as I took out a screw from the machine, I heard a shaft near the print head drop down. Unfortunately, there was a metal plate preventing me from putting it back into place. The only way to remove the metal plate was to strip the entire machine down to just the case (take out all electronics etc). Luckily, with some persuasion I was able to get the part back in place without going through the trouble (about 1.5 hours later), BUT I still hadn't fixed the alignment issue. I finally got someone on tech support who told me I had to push the buttons on the front in a special combination, but he didn't know the combination. Another 30 minutes later I had luckily done something right and the printer was realigned. That was the first and last time I worked on a Miltope printer.
Well, this is more of a graphical error than a text error, but it's still amusing. My company developed a technology where you can watch video from multiple angles. (note: this is going back a few years.) So we were pumping out demos like mad. At one point, we got some stock footage of a horse show or something. It had a horse jumping over a fence, filmed from different angles. I had to insert the words "click here" at the bottom of the video because I was going to make that clickable. If you click there, then you get s'more info about our software.
Back then, we didn't do letterboxing like Media Player does. If the window you play the video in isn't the same as the aspect ratio of the video, then cropping occurs. I did not consider this little fact about our player, rather I got it up on the site as fast as I possibly could. Then, I went to lunch.
When I got back from lunch, I noticed the CEO was looking at the demo. So I poked my head in to say hi. He says "Why is this video telling me to lick it?" Wha? I go up to the screen, look at what he's watching, and... eep. The c in click here was perfectly cropped out of the shot. I mean perfectly. I mean you didn't know it was missing. So here's a horse, reared up on its hind legs, with the words "LICK HERE" just below its.. uh.. tail.
I am so glad that we had the one CEO in our industry that understood what took place.
"Derp de derp."
Now I don't have time left over to make any more mistakes.
Syntax error: loose != lose, affect != effect, then!=than
This is the same guy who would routinely check in about a gigabyte of PDF documentation, that he downloaded from another company website, into our version control system.
As a joke, I once set the transporter to low resolution. The Captain was not amused.
Years back now I had a heavily upgraded Amiga 4000. 68060 processor (about as fast as a P100 but with a much lower OS overhead than Windows), third party GFX card that fitted in the Zorro slots and let me run 1152*864&16bpp for my desktop. Hmm, getting slightly nostalgic just sitting here typing about that machine :-)
Well, that GFX card needed drivers which were initialised by the OS, on load. If the OS didn't boot for any reason, no GFX drivers. I didn't have a monitor that could connect to the default graphics output...
Well, one day I wasn't concentrating when I did an install. Got something horribly wrong through blind stupidity and arrogance, enough to stop the machine booting. Boom! No output! At all!
I ended up having to get that machine back up by taking my other Amiga and writing custom bootdisks on it. These disks had nothing but the boot bit set and a custom startup script on them (ahh, to have machines that would run that!) that interrogated the drive in various ways while piping the output to a text file on the disk. So, write disk on the live machine, power up the dodo and stick the disk in, then wait until the floppy access light stopped for more than a second or so. Then take the disk out, stick it in the other machine and interrogate the output. From that, try and work out whether I now needed to copy a new file over from the live machine, move something that was already there or get more information. Lather, rinse, repeat until you've got a machine that actually boots. Took me ages, that did, but got the machine back eventually.
Or how about this one. The boss was going on holiday so wanted an autoreply on his e-mail address. Being a small company, he was also a member of several e-mail groups set up on the server, including the generic contact group. Predictably someone mailed this.
It was at this point that we discovered a major flaw in our mail server - it didn't send the autoreply to the original mail sender but to the account that had forwarded the mail. Which forwarded that message, which triggered a reply, which forwarded the message, which triggered a reply....
We noticed the server was being a bit slow but didn't think anything of it. It's an old machine, these things happen. We noticed a funny noise coming from the server HDD but didn't think anything of it - being a bit slow ourselves.
Boss then rings in and asks us to check his mail. We try - mail program crashes, takes server with it. So, we look up the server log file. What would normally sit below 100kb/day was sitting well clear of 100MB. OK, something's wrong...
When we were able to stop this and work out what was happening, we found it had been sending e-mails in a loop for around 48 hours, averaging around 3 per second. Yeah, it crashed.
(Scary thing: Someone once forgot to turn off a group membership and did the same to an account that was set to forward to a webmail box. Thankfully not for very long but we weren't popular with the provider and had to talk sweetly to get ourselves taken off the spammers list...)
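For anyone wiring up autoreplies today, here is a minimal sketch of the kind of check that breaks this sort of loop. It is not what that mail server did; it assumes the messages carry the standard Auto-Submitted header (RFC 3834) and the common List-Id and Precedence headers, and the function name and the own_addresses set are made up for illustration.

```python
from email.message import EmailMessage
from email.utils import parseaddr


def should_auto_reply(msg, own_addresses):
    """Return True only if sending an automatic reply looks safe.

    own_addresses is a hypothetical set of lowercase addresses that belong
    to our own server (accounts, groups, forwarders).
    """
    # RFC 3834: never answer a message that was itself machine-generated.
    auto_submitted = str(msg.get("Auto-Submitted") or "no").strip().lower()
    if auto_submitted != "no":
        return False
    # Mailing-list and bulk traffic should not trigger vacation replies.
    if msg.get("List-Id") is not None:
        return False
    precedence = str(msg.get("Precedence") or "").strip().lower()
    if precedence in {"bulk", "junk", "list"}:
        return False
    # Do not reply to our own accounts or groups; that is how loops start.
    sender = parseaddr(str(msg.get("From") or ""))[1].lower()
    if not sender or sender in own_addresses:
        return False
    return True


def build_auto_reply(original, body):
    """Build a reply that is itself marked as automatic."""
    reply = EmailMessage()
    reply["To"] = parseaddr(str(original.get("From") or ""))[1]
    reply["Subject"] = "Auto: " + str(original.get("Subject") or "")
    # Marking the reply lets other well-behaved responders ignore it.
    reply["Auto-Submitted"] = "auto-replied"
    reply.set_content(body)
    return reply
```

The second function matters as much as the first: a reply that announces itself as automatic gives the other end a chance to stay quiet too.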
Final example: Boss couldn't remember the password to his NT laptop, hadn't used that one for a while. We'd tried getting the SAM file and cracking but when the fastest machine we could dedicate to this had managed 3% in a weekend we realised that wasn't viable. So, a colleague decided to sit at the keyboard and manually try as many passwords as he could think of to get in. NT doesn't appear to have any exponentially increasing timelock to protect against this so that's a viable attack. Miracle of miracles, within 10 minutes he gets in! Stupidity overtakes me and for some reason I still don't quite understand, I decided it was then necessary to confirm he had the right password, so logged back out. No, I can't think why either. It's at this point that he reveals he'd been guessing and typing so fast he hasn't got any clue what password actually got him in.........
We got back eventually. I went and hid from him, apologising profusely.
Greg
(Inside a nuclear plant)
Aaaarrrggh! Run! The canary has mutated!
Dangers of a Poorly Designed Sandbox: My first tech job was working on a cd/dvd manufacturer's software system. We had test and production databases, but codewise they were mostly the same. While I was debugging an error and later testing my fix (using dev), I started and then cancelled all the orders for one of our major customers several times. After spending several hours doing this, the department head/VP came over to talk to me, the lowly intern. Apparently, the testing code was sending out automatic emails to essentially every single station in the manufacturing process to let them know the jobs had been cancelled or restarted. All of these were old jobs (the dev db had been snapped from prod a few months earlier), but there had to have been thousands of messages and it undoubtedly caused a lot of confusion. The VP realized it was an honest mistake and left mumbling something like, "Well it's good you're testing."
Why we Don't Use SQL commands to update single records: This one wasn't me, but rather a very experienced friend of mine. One day a request came in to reset an individual's password within our web application. My friend, who wrote about half the code in this particular $20 million business, executed the following code in Query Analyzer: update user_account set password = 'newpasswd'. Oops, no WHERE clause.... All of the passwords were wiped and due to the nature of our backup system, it took several hours to restore and what we got was a few days old. In the meantime, we had to make temporary passwords for all of our hundred or so employees while fielding a constant stream of calls from Fortune 500 customers.
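A rough sketch of the guard that would have caught this: run the update inside a transaction and refuse to commit unless exactly one row changed. This uses Python's built-in sqlite3 module rather than SQL Server, and the username column is an assumption (the story only names the user_account table and the password column). It stores the password as plain text only to mirror the story.

```python
import sqlite3


def reset_password(conn, username, new_password):
    """Update exactly one account, or roll back and raise.

    The 'username' column is hypothetical; the table and 'password'
    column names come from the story above.
    """
    cur = conn.cursor()
    try:
        cur.execute(
            "UPDATE user_account SET password = ? WHERE username = ?",
            (new_password, username),
        )
        # rowcount is the number of rows the UPDATE touched.
        if cur.rowcount != 1:
            raise RuntimeError(
                "expected to change 1 row, changed %d" % cur.rowcount
            )
        conn.commit()
    except Exception:
        # Anything unexpected: undo the change before re-raising.
        conn.rollback()
        raise
```

The same habit works by hand in a query window: begin a transaction, run the statement, look at the affected row count, and only then commit.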
Fun With Backups: I jointly administer a machine with a software company. They handle anything related to their application and I get the rest. Well one day, the MySQL database had grown to the point where it no longer fit on its partition. The company moved the data to a new partition and created a softlink from the old location and then sent us a note about the change (which was promptly forgotten). The individual who designed our backups didn't know a lot about MySQL and was basically just copying the database files after stopping the server. Well, nobody updated the backup script after the data files were moved. Sure enough, a week later, someone did something through the (brand new) application that fubared a lot of data. When I went to restore from our tapes, I untarred everything and verified the data was there (ls). Then I moved all of the current/fubar data out of the MySQL data directory and went to copy the backup in. Only when I got to the backup directory, all of the backup data was gone! It then took me embarrassingly long to figure out that what I had untarred was actually a soft link to the production data and not backed up production data. Hence when I moved the current/fubar stuff, the data in my backup directory "disappeared."
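To make the symlink trap concrete, here is a small sketch using Python's standard tarfile module. It is not the backup script from the story, just an illustration under the assumption that the backup is a plain tar of a directory path; the function names are made up. By default tar stores a symlink as a link, and the dereference option makes it store the data the link points to.

```python
import os
import tarfile


def symlink_members(tar_path):
    """Return the names of archive members stored as symlinks, not data."""
    with tarfile.open(tar_path) as tar:
        return [m.name for m in tar.getmembers() if m.issym()]


def backup(source, tar_path):
    """Archive source, following symlinks so the real files are stored."""
    with tarfile.open(tar_path, "w", dereference=True) as tar:
        tar.add(source, arcname=os.path.basename(source))
```

Running something like symlink_members over a finished archive as a verification step would have flagged the problem a week before the restore, since a data directory that shows up as a single symlink entry contains no data at all.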
I don't see how you can call that a mistake. That's more like quitting cold turkey.
In Republican America phones tap you.
And I'd rather not remember either.
Sattinger's Law: It works better if you plug it in.
Boy, do I feel stupid now.
My beliefs do not require that you agree with them.
1. I installed Linux on my computer
2. I bought an iPod
3. I got an iRiver MP3 player as a gift (ok, someone else's mistake)
4. I worked at BMC Software for Steve Wesling.
1. I was working on the development database but my boss needed a quick count of a number of checks so I opened a new window (Query Analyzer for SQL Server) to the production database and gave him his count. I then proceeded to finish what I was doing on development... without switching windows back to the development server.
:)
TRUNCATE TABLE Checks
TRUNCATE isn't a logged operation but thankfully Log Explorer Pro from Lumigent can retrieve truncated data if you move fast enough. As well we had a backup that wasn't so very old handy. Out of 1.3 million checks we only lost 34000, but I was so stressed out.
2. Way, way, way back when we had just gotten a new Dell server. I was showing the server to an interviewee who, I had found out, I had known when I was younger. So, joking around I said, "Want to see a hot swap of a drive?" He was like, heh, that'd be cool. So I pulled the drive out of the RAID 5 array. Alarm klaxons started going off from inside the machine, I swear. I stuffed the drive back in but even though the drive officially -was- hot swap we hadn't purchased the high end Dell with an array controller that could dynamically rebuild the data. We'd gotten the cheap version. 8 hours later - with the machine beeping constantly at us - the rebuild was done.
3. This one's not mine but a guy I work with. I had asked him to migrate some databases to a backup server so he set up a DTS job to do the migration. Unfortunately he did two things wrong: the destination was the same server as the source, our primary production machine, and he set the DTS process to execute nightly instead of once. We ended up filling 300Gb of drive space and not having a clue as to what happened to cause it. When we found it we were giggling (it is funny :)) but beating ourselves in the head.
4. Another one that's not mine. New network administrator was installing Windows NT 4.0 (this was ~6 years back? Roughly?). He was complaining about it taking forever to install and I asked him what he was doing. "Well, shit, NT has like 35 disks man." I asked him why he wasn't installing off the CD and he just hung up on me. He didn't know the NT CD would allow you to do that.
5. On a similar vein my original boss when I started here was I thought a technical God. It's fun to see how that belief fades over time. In my case he was showing me how to install Netware 3.12 and configure it the way he wanted it to be configured. He sent me off on my own the next week to install a new office. The week at home I had burned all the Netware 3.12 files to a CD so I wouldn't have to cart around all those floppies. Apparently the load time off CD blew my boss out of the water because he didn't believe I'd installed the server already when he called to see how things were progressing.
6. I'm walking my COO through hooking up a new modem in our Kansas City office. He's getting mad at me and asking me if I know what I'm doing because we can't get a response from the modem. (I'm working blind over the phone.) I had asked him earlier if he had hooked up all the cables like they were to the old one and he had indicated that he did. Finally I said, "Look, don't take this the wrong way but let's check the cabling. You should have a phone cable to the wall, a power cable to the power, and an interface cable to the computer. These should all be coming from the modem." He had forgotten to hook up the RS-232 cable. To this day I razz him about modems telepathically communicating with machines.
7. My CEO is one of the brightest people I've ever met in my life and has my eternal respect for his intelligence and moral integrity. He called me and indicated he couldn't print. I told him to not get insulted but I was going to start with the basics. "Is the printer plugged in?" "Yes." "Is the power on?" "Thanks Brian, I'll call you if I have any more problems."
8. I had just come off the road from setting up our Texas operations - a 4 mont
My reality check bounced.
I was caught making a political statement. My company made technology that allowed the user to watch video from multiple views. Around the time this happened, a presidential candidate for the 2000 election made a speech. I got video of that speech and used a frame of it to illustrate a point in the manual.
I was explaining video compression. The lower the quality, the more quantized a video gets, etc. With this photo of not-yet-president Bush, I noticed that using a lot of compression made his mouth turn into two rectangular black pixels. So, in order to illustrate what the effect of lower quality is on the codec, I drew an arrow at the 2 large black pixels and wrote "the mouth loses definition."
Sadly, it wasn't subtle enough to slip by my boss.
"Derp de derp."
Was getting hooked on reading Slashdot every morning.
"Eve of Destruction", it's not just for old hippies anymore...
Say no more. And a bit more subtle than the omnipresent rm -rf /
What the heck are these files doing on E: on this machine? Fsck! Ok... let's delete them...
Sudden realization that it wasn't local after all.. it was our main server for the ISP we ran, 25 miles away!
Hopped in car, had to reinstall, got it back up and running about 2 sweat filled hours later.
Moral: Always be mindful of WHERE the command is running.
--Mike--
Years ago, I taught Mathematics at a small college. The newer faculty - I think I had been there about a year - had begun a jolly little discussion - just like this one - talking about their misconceptions. I, being a sociable fellow like everyone else there, mentioned a problem I had had with a calculus concept as a graduate student. We all thought it was amusing.
I didn't think it was amusing a couple of weeks later when I was speaking with the dean - a chemist - and he mentioned the problem I had had and hinted that permanent faculty really should have better backgrounds than I did.
It turns out that one of the "sociable fellows" was really a sneaky little bastard who gathered up our charming, cheerful, self-deprecating stories and used them in a minor smear campaign on all the people in that group - except himself, of course..
Moral: Learn from your mistakes but be careful to whom you admit them. Someone will use them against you.
One time, I brought down the external network interface on the firewall at a remote site without scripting it to come back up again. (I was ssh'd in). There were no tech staff in that office, and it was two time zones earlier than where I was, so not only had everyone already gone home, they would be in earlier than I would. The remote office just happened to be the head office - you know, CEOs and stuff. In the morning, after they spent a couple hours cursing me, I talked one of the marketdroids through running "ifup eth0" as root to bring them back online. The result of that one was that we installed a modem with dial-back so that I could get a serial console if something like that happened again.
Another time, I messed up the NAT rules by putting "192.168.1.0/4" instead of "192.168.1.0/24" somewhere. The result was that about half the internet was inaccessible for a couple hours.
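For a sense of how much one missing digit changes, here is a quick sketch with Python's standard ipaddress module. The helper names and the minimum-prefix threshold are just assumptions for illustration; nothing like this was in the original rule set.

```python
import ipaddress


def describe(cidr):
    """Return the real network a CIDR string covers and its size."""
    # strict=False masks off the host bits instead of raising an error,
    # which is roughly what a firewall does with a sloppy prefix.
    net = ipaddress.ip_network(cidr, strict=False)
    return str(net), net.num_addresses


def too_broad(cidr, min_prefix=16):
    """Flag a prefix shorter than a (hypothetical) sanity threshold."""
    return ipaddress.ip_network(cidr, strict=False).prefixlen < min_prefix


print(describe("192.168.1.0/24"))  # ('192.168.1.0/24', 256)
print(describe("192.168.1.0/4"))   # ('192.0.0.0/4', 268435456)
print(too_broad("192.168.1.0/4"))  # True
```

With the typo the rule stops describing one office LAN and matches everything from 192.0.0.0 through 207.255.255.255, one sixteenth of the IPv4 address space.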
My worst tech related mistake was going into computing, instead of going to med school.
There is no God, and Dirac is his prophet.
Just because you can do something doesn't mean it's a good idea.
No one could access the shares on that particular fileserver anymore ("Can not log in from this workstation"). Thing is that the same error message is used when there's an encryption policy mismatch between client, samba share, and/or DC, which I spent 6 hours trying to track down.
And I thought I would be saving time with the switch to domain instead of server/share security.
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
The very first fiber run in Phoenix went from one federal building to another. I'm not sure which, but they must have been important.
If you've ever seen a phone cable room underground, you know that the cables are straight, so straight that you can easily follow them across the room, and usually clearly labeled. Well some dumbass manager went down into this one cable room underground in Phoenix, and saw this great big looping yellow piece of shit cable run and wanted it fixed pronto!! So he gets some new hire (been on the job less than a month) to go down there and I quote "Fix that Fu**ing thing! I want it to look just like the rest of the cable down there, and I'm gonna get the guy who installed it fired!!" (yes, he does come off as a jackass doesn't he?)
So this poor newbie goes down into the manhole and starts hammering, and tying down, this 'cable' run. He's using pliers, 3 pound mauls (why won't this stuff stay flat?) and whatever else he could do and wouldn't you know it, after 4 hours or so of this, it looks beautiful, just like the rest of the runs and even re-labeled!
Well, when this guy pokes his head out of the manhole, there are like 20 officers from the FBI, State DPS, County sheriff, ATF, and whoever else waiting for him with guns drawn!!!! Poor guy is fired on the spot and questioned for over 2 days, telling them he's not a saboteur and that his boss told him to do this. The boss doesn't fess up until the 3rd day of questioning, at which point HE is fired and the pleb gets his job back.
The second first fiber run in Phoenix was back up shortly, and the other workers were educated about its "don't take a hammer to this shit" properties.
--
This sig writes better than I do.
I have made a few pretty bad tech mistakes myself. The worst occurred when I was in grade 11 and the main admin of the school forum server. Because the forum server was inside the school network, it was also behind the firewall, and we wanted a whole bunch of stuff that the firewall blocked to be available from the outside for remote administration. I decided that instead of arguing with the school board to open a few more ports, it would be simpler to just tunnel through port 22 (which was open). Because I was by far the most computer literate of all the admins on the server, I was writing a step-by-step tutorial on how to get tunnelling working. I finished the Unix half of the tutorial and wanted to write the Windows part. At this time I was still dual booting with Windows, so I figured the best way to test it with Windows was to reboot into Windows. No problem: type reboot in the nearest open console and I'm done! Nope. About a millionth of a second after I hit enter on the reboot command, I realised that I had not typed it on my computer, but in an open ssh connection to the server! Normally this wouldn't be a problem; I had everything that we needed to start in scripts and it would come back up after a minute of downtime. Unfortunately we had just finished installing Linux on this computer. While we were installing Linux on it, the computer was on a desk in one of the classrooms like any other computer, and to prevent people from screwing around with it we put a BIOS password on it ... and forgot to take it off. As soon as I rebooted, the machine stopped, started, then hung at the BIOS password. Unfortunately I did this on a Friday night, so it sat at the BIOS password over the weekend.
When one of the teachers asked me why the forum server was down for such a long time I said that the storm we had over the weekend must have made the power flicker. We should probably get a UPS. :p
Then there was the time on my own computer where I meant to restore the MBR from a backup file. I typed
dd if=mbr.bak of=/dev/hda1
when I meant
dd if=mbr.bak of=/dev/hda
This was when I was still on windows and /dev/hda1 happened to be my NTFS windows partition. Whoops
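A pre-flight check along these lines might have caught the slip. It is a rough sketch, not a real tool: `check_mbr_restore` is a made-up helper, and the "name ends in a digit means partition" test assumes the old IDE naming used in the post (/dev/hda for the disk, /dev/hda1 for a partition), which does not hold for every device naming scheme.

```python
MBR_SIZE = 512                # an MBR is a single 512-byte sector
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of a valid MBR


def check_mbr_restore(backup: bytes, target: str) -> None:
    """Raise ValueError if this restore looks wrong; return None if it passes."""
    if len(backup) != MBR_SIZE:
        raise ValueError(f"backup is {len(backup)} bytes, expected {MBR_SIZE}")
    if backup[-2:] != BOOT_SIGNATURE:
        raise ValueError("backup does not end with the 0x55AA boot signature")
    # Assumption: /dev/hdX is a whole disk, /dev/hdXN is a partition.
    if target[-1:].isdigit():
        raise ValueError(f"{target} looks like a partition, not a whole disk")


fake_mbr = bytes(510) + BOOT_SIGNATURE   # stand-in for the contents of mbr.bak

check_mbr_restore(fake_mbr, "/dev/hda")  # passes silently
try:
    check_mbr_restore(fake_mbr, "/dev/hda1")
except ValueError as err:
    print(err)  # /dev/hda1 looks like a partition, not a whole disk
```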
History will be kind to me, for I intend to write it - Sir Winston Churchill
Aliasing rm to rm -i will do nothing if you use the -f flag, as you did. -f overrides -i.
However, accidentally separating a wildcard from text is an infrequent mistake that can cause much pain. For example, typing rm -rf * .jpg
Zsh, by default, will complain at you and ask you if you *really* mean it if you use a bare wildcard with an rm command. Invaluable, and has saved my ass a few times.
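Why one stray space hurts so much is easier to see when the command line is split into words. The sketch below uses Python's `shlex` and `fnmatch` as stand-ins for the shell's word splitting and globbing; the file names are a hypothetical directory listing, and a real shell's `*` would also skip hidden files.

```python
import fnmatch
import shlex

intended = shlex.split("rm -rf *.jpg")
slipped = shlex.split("rm -rf * .jpg")

print(intended)  # ['rm', '-rf', '*.jpg']
print(slipped)   # ['rm', '-rf', '*', '.jpg']

names = ["notes.txt", "photo.jpg", "thesis.tex"]  # hypothetical directory
print(fnmatch.filter(names, "*.jpg"))  # ['photo.jpg']
print(fnmatch.filter(names, "*"))      # ['notes.txt', 'photo.jpg', 'thesis.tex']
```

With the space, `*` becomes a word of its own and matches everything, while `.jpg` is just a file name that probably does not exist.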
May we never see th
I was at one of our main sites, trying to figure out why a new fibre switch I'd just fitted, which was planned to replace an older copper model, wasn't working correctly - the thing would constantly reboot when VLANs were presented to its trunk to a remote switch in the same building. The existing switch carries voice and data between two large sites.
After trying literally everything including using a serial terminal into the old switch to compare QoS and VLAN settings, I decided to reset the thing to factory defaults.
Called up the menu in the terminal, reset to factory default, confirm.
Almost immediately I got a call on my mobile telling me that "the phones had stopped working" for the site. "That's odd" I thought, and proceeded to troubleshoot the problem. I traced it to the old switch I was trying to replace, and was surprised to see a serial cable connected into it. The other end of course was connected into my laptop, which was displaying a confirmation of a reset to factory defaults...
here's the story:
i got all the components for my shiny new pc around christmas 2001.
during setup i followed the manual of my asus a7v266-e mainboard. there is a section that reads: to clear cmos remove the jumper xy, wait a few minutes, put it back on and power up.
i didn't have a clue, back then. otherwise i would have sensed, that there is something wrong.
but i did what the manual said and nothing happened. tried a few times and somehow got it working... the only problem is that the board now loses all bios settings after power down. that pretty sucks!
about a year later i stumble across user comments at different hardware sites. all state that there is an error in the manual. you must not power up with that jumper placed. if you do, the board will be damaged... there is still no statement from asus.
i really hate asus for that one. they make great products, but that was SO unnecessary!
the computer is online
i am not at it
what a waste of resources
I was messing around, and wrote a small script, let's call it "hose". Hose contained one line: "./hose &"
After chmodding it, for some reason I typed "hose &" and hit Enter. At the time, we had 10 people per server, and I always had a load monitor running. It skyrocketed. People started complaining that the server ground to a halt. I went over to our IT person and told them what I did, but they couldn't get in fast enough. The process was spawning a new process so fast that we couldn't catch it. It was hard to even log in because the CPU was pegged. We thought - well, it will crash the server I guess. But it didn't. The load peaked, and stayed there. Just before the IT person was going to go into the server room and shut it down, I said "hold on!". I went back to my desk, typed in one command into the window I was still logged into, and everything went back to normal.
What did I type?
rm hose
My beliefs do not require that you agree with them.
I can remember hacking away on a box I thought was a testing box, when it was a production box. When I typed, "reboot" and hit enter there were screams all through the building.
As a runner up, I created a set of object oriented macros for C (pre C++). Now that was a disaster. Just try debugging macros of macros of macros of macros of macros of macros.
I used to wonder what was so holy about a silent night, now I have a child.
So a while back I was working as a network operations tech at a regional ISP with some very juicy colocated and dedicated server customers (including some large national companies). One of my responsibilities (a minor one, really) was to set up some anti-spoofing filters to the colocated network each time a new box (and sometimes subnet for some virtual hosts) was added. For those who have not done it, you are basically denying any traffic which claims to be coming from an IP which rightfully should be on the other side of the interface and permitting everything else through (other filters handled other security risks, like telnet). However, at the time, we were doing it the hard way - by hand. Copying out the old filter, editing it, and pasting in the new filter (we did change this later, for reasons which will become apparent).
So, a request for a new subnet associated with a dedicated server came in. I allocated the IPs, jumped into the router, changed the access list, wrote the access list to memory and jumped out.
Little did I realize I'd forgotten to include the permit any at the bottom of the filter. And, since I was accessing it over a different interface, I had no idea that hundreds of sites just stopped getting any IP traffic.
At the time, we monitored our servers by...Sun Net Manager, I want to say? It's been a while. Anyways, it was set up to alert us after five missed pings, because for whatever reason Cisco really doesn't like reliably sending back ICMP packets. Each test was a minute apart, so it was a few minutes until finally we get notified that something went down. And by something, of course, I mean everything. Also, the work on the access list wasn't completely fresh in my head (I'd gone to reconfigure someone's Pipeline 50 in the meantime)...so we flail about for a few seconds.
Boy, did I feel stupid when I realized what it was. From that point forth, the filter lists were kept in RCS to make sure no one blew it like I did.
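For anyone who hasn't been bitten by it: a router access list of this kind is evaluated top to bottom, the first match wins, and anything that matches nothing is dropped. A toy model in Python shows why the missing final permit took everything down. The rule format and the 10.1.0.0/16 range are invented for illustration; this is not router syntax.

```python
import ipaddress


def acl_allows(rules, src):
    """First matching rule wins; if nothing matches, the packet is denied."""
    addr = ipaddress.ip_address(src)
    for action, network in rules:
        if addr in ipaddress.ip_network(network):
            return action == "permit"
    return False  # the implicit deny at the end of every list


# Hypothetical: 10.1.0.0/16 stands in for our own colocated range, which
# should never appear as a source address arriving from the outside.
spoof_filter = [("deny", "10.1.0.0/16")]
fixed_filter = spoof_filter + [("permit", "0.0.0.0/0")]

print(acl_allows(spoof_filter, "198.51.100.7"))  # False: legitimate traffic dropped
print(acl_allows(fixed_filter, "198.51.100.7"))  # True
print(acl_allows(fixed_filter, "10.1.2.3"))      # False: spoofed source still denied
```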
displague
monkeyballs
mistakes? i don't make em...
Personally, I never make mistakes. I hardly even watch what I type these days....
Marques Johansson
Just today, I discovered that rollback is not an option with DROP TABLE commands.
Whoopsee...
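Whether a DROP TABLE can be rolled back depends on the database engine, and the post doesn't say which one was involved. Some engines (MySQL is a well-known example) commit implicitly when they see DDL, while others treat it like any other statement inside a transaction. As a small illustration of the second kind, here is SQLite via Python's standard `sqlite3` module; the `orders` table is made up.

```python
import sqlite3

# isolation_level=None turns off the module's implicit transaction handling,
# so the BEGIN and ROLLBACK below are exactly what SQLite sees.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (id INTEGER)")
conn.execute("INSERT INTO orders VALUES (1)")

conn.execute("BEGIN")
conn.execute("DROP TABLE orders")
conn.execute("ROLLBACK")

rows = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(rows)  # 1
```

Trying this sort of thing on a scratch database before trusting ROLLBACK in production is a cheap experiment.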
I have no future, I have only a destiny: that of being nothing but a memory. It's for tomorrow.
While in college I became an intern at a large Fortune 500 company. As interns we used net send to chat between the various interns. We also discovered that you could net send to an entire workgroup. What we decided to do was place all our intern machines in the same workgroup so that we could chat to all interns more easily. However, it wouldn't work right, only sending to machines near your own.
Anyway, we came to the conclusion that workgroups would not traverse a subnet. To test our theory we placed a few machines on a separate subnet and proceeded to net send to our default workgroup (which we didn't realize was the site domain). Well, net send to a workgroup will not traverse a subnet, but net send to a domain sure as hell will... Anyway, 2000 net send messages later, and a good talking to by the domain admins, we were a bit wiser.
Hoping that nobody that got the detentions or the sysadmins are reading this, but here goes:
:)
:)
I had read/write access on one of the folders on the public drives because I requested Apache and PHP for one of my assignments that other people wrote in notepad with all those annoying javascript tricks they ripped from sites and the IT staff couldn't figure out how to install it. So, they gave me a folder and I installed it myself. Needless to say, I think they forgot I had read/write access on that folder, and I started to hide various files in there, some legitimate like Dev-C++. And some not so legitimate like some games. Some I made myself, and some not (Like this really addictive game where you have to dodge all these little dots. Sounds simple, but ain't, especially with some special dots that home in on you, 2x speed dots, etc etc. It was Japanese tho, but yeah, great game).
Anyway, like in the second-last week of school, they started catching people who were playing games. In one day, they caught 40-odd people, and thank God I wasn't one of them because I was too busy practicing for a coding competition. I was able to get out of my class to get into the computer room where another class that was the same grade was there, playing a game that involved insane amounts of clicking. It was so obvious, with all that mouse-clicking going on, why wouldn't the teacher notice? They were supposed to be working on javascript...
Anyway, the next day our grade had to go to the hall while a teacher called out student IDs... It's surprising people didn't hate me for that, heh
And I managed to get some more people in trouble after a really simple HTML page I coded that was able to show the photo and past subject details of a person when you gave it a student ID was passed around. Apparently it was confidential data, but there was this link on the school intranet that said "Student Profile" that led to a page that worked like mine, and it worked two years ago. They found out and removed the text box, but...
One day I was bored, and went to that page again and viewed the source and found a line that contained a hidden value in a form named 'student_id'. So I just coded a page that posted to the same page, with a textbox for student_id, and voila! It worked.
The original page had "Example of a SQL query in ASP" for the title, too. It's amazing how bad the IT department is at my school...
P.S. I haven't been caught for either of the events, yet
Founder of Mirror Moon - Tsukihime Game Trans
This reminds me of a pal's worst tech mistake. Dr. Bob was the best programmer I ever worked with. One day I saw him open a letter he'd received, crumple up the page and start cursing a blue streak. When I asked him what was wrong he told me about the time he'd had his first programming job at a major university, writing and maintaining an application to record & track information on a very large medical research experiment. Thousands of subjects in the experiment were to be tracked for many years by this software. He said he wrote a really cool app, in PL/1 (it was a long time ago), and everything was going good. Unfortunately, one day, he wrote an update program for it, and due to carelessness with the name variables, he inadvertently replaced the name of every person in the database with that of the first person...who was David Alexander. Even more unfortunate...the backups were no good...they didn't fire him, but he felt so bad about it, he quit in disgrace. For decades after, his friends, whenever they wanted to razz him about it, would send him mail, addressed from "David Alexander", and he'd reliably fly into a rage. The letter I'd seen him just get was one of those.
There is no God, and Dirac is his prophet.
Oops, it was corrupted on the floppy, but I didn't get any warning during the transfer. So here I am on Friday afternoon in rural Italy without the critical software I need to debug the problem. I try to dial in to the company modem pool, but the Italian phone system doesn't respond to the dial tones my modem puts out, and I don't remember the magic AT commands to force a connection. The company office in Italy is a hundred miles to the south and won't be able to hand-deliver anything until Monday.
Eventually I managed to find a computer store in town that has Internet email, and they allowed me to have the company send them a copy of the software, which I load onto my laptop three times with three different floppies.
We did finally get the customer's problem resolved, but I've never even remotely trusted a floppy disk since. Boy am I glad they're gone...
PHEM - party like it's 1997-2003!
Not me, though. The CEO was really interested in building a customizable UI so we could sell it as a "co-branded" site. So I spent about a month perfecting the customized "tag language" for cobranding - and only about 2 days on the mail storage ("Hey, let's just store it in the database")
Needless to say, we moved to a new display system in about 6 months but used the email storage system for nearly 3 years. 3 years of painful, painful maintenance nightmares (migrations to bigger databases, indexes overrunning, no quotas, MIME decoding problems, etc.)
The worst part, of course, is that I'm still known as "the guy who built that stupid mail architecture." (despite the other, better things I've done.)
Getting a BS in CS in 2000.
In the summer of 2000, our server was running SCO Unix, and I decided to try and get the PS/2 mouse working so I could use the Windows 3.0 looking thing that was their implementation of the X-Window System. I ran the script, which seemed to compile a kernel module, and rebooted the machine.
Big Mistake! The machine wasn't Y2K compliant and failed to reboot because it had some problem with some init script or other (some stupid thing like syslogd or somesuch). And it just hung there.... took the whole company's main production system down. Whatever expensive system recovery solution it was that SCO had sold us proved to be useless, and the company implored me to fix it rather than making us resort to calling in the SCO whiteshirts.
It took me about 6 hours to find a workaround, which ultimately involved booting a Debian Linux rescue cd, loading the UFS kernel module, mounting the root partition, executing a shell, and commenting the offending line out of the init script. Apparently the machine hadn't been rebooted since the turn of the millennium, which was fairly impressive, but needless to say I was not expecting a Y2K bug on a SCO Unix system. The whole thing caught me completely off-guard and made me look like a fool. The fact that I was able to resolve the problem without calling in technical support was probably the only thing that saved my job.
Clickety Click
No shit, after a few blunders in my youth (like constantly rebooting a server which was funneling traffic for thousands of customers, during peak hours), I now:
1) formulate a plan, broken down into SMALL REVERSIBLE steps. I use OmniOutliner to document.
2) come up with a test strategy for each step, put it in the outline
3) execute step #1, test. Document any surprises or other information in the outline. Repeat until all steps followed. test again. ask the customer to test.
4) check outline into CVS with a title including date and client
I hardly make any big mistakes any more. Of course I can't re-install BSD remotely in a different time zone over an SSH connection as fast as I used to, but I also never get any angry phone calls.
As you learn more, your brain can no longer take "shortcuts", because you simply know too many different ways to do things. Be careful, formulate a plan, document everything.
The 2nd...10th...100th time you do this, you will appreciate the docs from last time.
After that, and my contract was over, I was offered full time permanent work, so they weren't too upset with that little incident.
* When I was but a young tyke experimenting with the then obsolete Integer Basic setup for Apple 2, I discovered a copying utility. Now, under another copying program called Copy2+, a surefire way to correct errors on a floppy was to copy the disk onto itself. Not so in this other one - it systematically formatted the destination disk (which also happened to be the source disk) and wiped out its own source data. Oops.
* Much, MUCH later, when running OS/2 (!), I wiped out my config.sys file. Twice. Once when I wordwrapped it in emacs, another time by accidentally doing 'del *.*' in the / of the C: drive. I started mirroring the file onto a spare floppy shortly after the second incident.
* In 2001, I forwent OS/2 and installed Mandrake Linux. Now, Mandrake Linux (or is that GNU/Linux by SCO now? =^_^= ) by default installs an alias to 'rm -i' for rm. I found it annoying as hell at first, but it has more than once saved my sorry ass from rm -rf'ing my ~, so I leave the alias in place and let my lusers alter their own aliases.
Which brings me to my most recent technical gaffe ever. Taking on users, running a web server, running my own NS1, and cutting into time that could be better spent doing things like (say) cuddling with my wife.
This sig no verb.
Back in the old days, when computers were enormous and water-cooled, instead of small and water-cooled, I worked with a guy who never bathed. He stank, something awful. From what I've been told, nobody would really have been surprised or even bothered in a Unix shop, but I worked in an IBM shop, and everybody (but me) wore a jacket and tie to work, and everybody (but him) bathed regularly. One day, his personal odor field was so bad that his boss finally got fed up with the situation. He told this guy, in no uncertain terms, that he was to go home and bathe, and that he wouldn't be allowed back into the building until his stench was removed. The poor, smelly man got very offended over this obviously unfair situation and stormed out of the computer room, slamming the door as hard as he could on the way out. The door was in one of those lovely little non-load bearing walls that exist in many commercial buildings...a wall so insubstantial that you could see it wiggle a bit, every time the door was opened or closed. Right next to the door, affixed to the aforementioned flimsy wall, was the EMERGENCY POWER OFF switch. The slamming door somehow tripped the emergency off button, immediately removing power from every piece of equipment in the room, and from the coolant pumps in the basement below. In the completely dark and mostly silent room, you could hear the disk heads smashing down into the platters, and the gurgle of water draining down into the basement.
To make things even more fun, something went wrong, down there in the basement. One of the coolant reservoirs was apparently not up to snuff, and it leaked. There was chemically treated coolant water all over the floor, to a depth of two or three inches. On the floor, surrounding the pumps, we had just started storing the toner for our brand new laser printer, the size of a small city bus (the printer, not the toner). We didn't have all that much toner, and we hadn't stacked it all that high, so at least a third of it was ruined by the water.
It took over 24 hours to get everything back online, and about a week to replace the laser toner. Surprisingly, nobody got fired.
I edited the /etc/passwd file on a Sun system and locked out the super user. This wouldn't be a problem on a workstation, but this was on a Sunfire 3800 server which costs almost a quarter-million. It was also our workgroup server.
The fix is to crash the system, and reboot from a Solaris install CD, which will allow you to edit the file. It takes a long time to fsck multiple RAID volumes though.
It's good to use your head, but not as a battering ram.
how do you turn this thing on? C:>_ C:>del *.* C:>
we purchased a new box and i spent an entire friday getting the thing running and configured. because nw 4.11 was ipx, the new server had ipx and ip running for communications purposes. that evening i migrated the nds tree and files to the new server, checked that it was working properly, downed the old server, brought up the new, and went home happy.
the next morning, the owner of the company called me at home. he couldn't get any of the computers to connect to the server. i just figured it was something simple, because the night before i had tested some of the client stations to make sure they connected. i went into the office figuring it would be a quick fix. damn my naivete.
it turns out when i had tested the clients the night before ipx was still running on the server. i took all of the ipx parameters out of the autoexec.ncf after that. i honestly don't remember it happening in that order. at this point in my career my understanding of tcp/ip was considerably lacking; i thought it was like ipx, you just plug it in and it works. i had no concept of what it actually took to get all of the clients to communicate with the server. oops. being a stubborn idiot, i spent that entire saturday and the following sunday trying to get the damn thing to work. i finally got the clients to connect, but never did get the database server (pervasive.sql/btrieve) to talk to the clients. after two days of straight work, at 11PM on sunday, i fired ipx back up and everything worked. to say i was embarrassed would be putting it lightly.
i still don't understand why i wasn't fired for that one.
But now none of the tests would work. It took me a while to figure it out and then a while longer to stop swearing. The master test data wasn't in a directory off the root, as it was in production, it was under the home directory. The person who had set it up had used an alias to make it look like it was in the normal spot. I had just wiped out four man-months of work.
I explained what had happened to those in charge and -- after much wailing and gnashing of teeth (and more than a few meetings) -- it was decided that we had done quite enough Y2K testing. They declared victory and let the matter drop.
===== Murphy's Law is recursive. =====
I used to work for a company that produced mylar orthophoto plots. Wrote a program to take digital aerial photos and produce a set of a hundred or so EPS files to send off for printing. Did some proofing on a paper plotter, but nobody recognized a mistake with some "map ticks" (probably due to the plate separation process). Files were sent to another company that could handle the "E" sized plots to mylar. Originally we were going to just get 2 plots done and proof them, but the project manager decided to over-ride my request and print them all. Well after about $6K, the project manager managed to push all the blame back on me (for what should have been an $80 mistake) and pretty much let me go (I was already in my 2 weeks notice from getting a new job anyways).
I decided to add more disk to a Novell 3.1 server (a long time ago). Took two 4GB disks from another server, but not before breaking the Compaq hardware RAID, swapping drives' places, re-making the hardware RAID to make sure they're blank. Installed the drives into the server I was upgrading.
Run server.exe . It's reporting that volume SYS: is damaged. Hmmm. Proceed to fix the volume. Mount. What do you mean?, the SYS: volume is blank??? But, it's supposed to have a bunch of user data, the main cc:Mail database and the Novell system stuff on it ???!!!?
Apparently, the 2 disks had remnants of the old SYS: volume from the other server. Lesson learned: don't trust the Compaq hardware utilities to remove data from the disks.
I had to run a 6-hour restore job. Luckily, I did a full backup right before. Got home at 2 AM.
I guess I should have double-checked my query. What's that rule of thumb? Measure twice, cut once. I should have remembered that one. The restore took quite some time.
I should have posted this anonymously, heh heh.
The meme police, They live inside of my head
When I was working for one of the school districts around here, I plugged a 4Mbps token ring card into the 16Mbps network. Then I left the site. I had just started, and nobody had told me what would happen if you did that.
BOY, did I find out!
I think these two tie for my dumbest moments...
1) So, I'm working as a very low-level computer grunt co-op student in IS at an office, and basically my job is to install Office 2000 on every computer in the building, all day, that's it. How hard can that be to screw up, right? A trained monkey could do it... But, the ONE thing they told me beforehand was that if I was installing it on a drive partition greater than 7 GB, things could go "screwy" due to Project because of the way paging worked in NT or something like that. So, the ONE thing I have to do is check to make sure the partition is smaller than 7 GB before I install. After a couple weeks of never finding a computer with a big enough partition, I stop checking. Little did I know that the computers I had been checking were in the under-funded departments like Tech Pubs and HR. So, I move on to the nicer computers in the big money departments, and install Office on a whole bunch of computers during their lunch break. And, yup, installing Frontpage 2000 on a Windows NT machine on a partition greater than 7 GB causes some interesting problems like not even booting up to Windows. Needless to say, I was very busy doing ghostcasts to restore these people's computers the rest of the day, along with most of the other co-ops. I was not exactly Mr. Popular that summer. But, hey, I didn't get fired and I learned my lesson.
2) Same place, same summer, same IS co-op lackey job. I have to call a bunch of manager-type people to ask them when I can come and install the new version of Office for them. Now, I'm in Calgary and the company has another building in Ottawa too. The phone directory does not specify which location people are at, and the company's internal phone system does not require area codes or anything. So, I call a whole bunch of upperbrass managers (including the company's most important VP) and book times to do this. Then, when I find out they're a few thousand kilometres away, I have to phone them all back and tell them "Uh... nevermind...". lol, those were good times.
My biggest (to date) mistake was enabling NT (4.0) software mirroring on our SQL server. This server housed our proprietary internal application's database....irreplaceable data.
...and it sounded like _such_ a good idea at the time.
When the server crashed from a (then) unknown issue with Compaq storage drivers and the MP kernel, I spent about five hours trying to resuscitate the volumes...unsuccessfully.
-PONA-
+that's funny...I don't FEEL tardy.+
Telnetting to a server over the internet (back then, ssh was only commercially available), I decided I needed to change the settings for a network device. So I typed "ifconfig eth0 down"... It took me a few seconds before I realized what I'd done.
Hi,
I'm a versatile geek, with experience at the systems level on many OS's. That experience in most cases was "hard won".
HP 3000 running MPE 3000, before HP got the UNIX religion. Using the BASIC interpreter used by freshmen at school, I created a two line program that basically GOTO'd itself, disallowing use of the break key. It put the CPU into so tight a loop that they couldn't even shut it down from the console. The admins had no idea what was going on. They called HP Support, which SENT A REP OUT, who puzzled about it for an hour or two and somehow eventually resolved the issue. From that point forward, I was never again to hack in peace at that school.
So having wrecked my reputation in the HP user community there, time to switch systems. I chose MVS, the IBM 370 mainframe OS (the one used by banks and such). I figured, nice big target, different users who didn't get pissed off by my stupid "what would this do?" act on the HP, I can have fun here. Well, let's see.... I taught a user how to delete undeletable files, he shared this demo with another friend and (oops) used a system library as the example target..... and was relieved of his continuing student status. For my part, I eventually broke (in the hacker sense, not the PHB sense) ACF2 (expensive security package add-on) ... totally by accident.... and the computer center's attempt to fix the problem I identified brought the system to a crawl. Over and over, they re-IPL'd (rebooted), and over and over, it slowed and no powerful ID's could log in. Somehow I deduced that this was related to their attempts to fix the problem I found, I called them up, promised I'd never use or reveal the information again as long as I lived, they pulled the patch and re-IPL'd and things were fine ever after. And no, I never used the vulnerability again -- and lost a bit of my hacker reputation that day, when the computer center realized they could no longer tease me endlessly with no fear of repercussion, because I, basically, 0wned their 3081 mainframe, and instead had to treat me as a fellow clued person responsible for not screwing something up by misusing trusted information.
Next, UNIX. After 0wning the mainframe, and acquiring some responsibility to use it with extreme care, it was no longer much fun, so I moved to yet another OS. Well, the best thing I did was to carefully construct an rm statement to be run in some old user's directory, under /usr, whose dot files I wanted to delete: "rm -rf ./.?* ./*". Do note the careful use of "." to restrict it to the current directory. Do also note that ".." is a match for the pattern "./.?*". Yeppers, I took out /usr on the company's main development system. I was already on my way to the backup tapes by the time I heard the first scream of horror from a fellow programmer.
Windows. No OS is immune from my power. Actually, this wasn't really an OS issue, this was a network operations center issue. I worked in the op center of a 700+ station network. It was full of servers. It contained a UPS. My desk was in front of the (large, 6' x 4' x 3') UPS, and I'd often "put things" on top of this, as a handy counter behind my desk. "Things" included a big red wool cape I enjoyed wearing in the winter time. AGAINST ALL POSSIBLE ODDS, one day, as my cape accidentally slid off the UPS.... IT CAUGHT THE MASTER POWER SWITCH FOR THE UPS and DOWN WENT THE WHOLE NETWORK OPERATIONS CENTER IMMEDIATELY. Tell ya, there's nothing quite like the sound of 50+ servers full of disks all spinning down at the same time, and *knowing* that You Did It. They had to call the UPS company to get the UPS reset. It wasn't a simple "turn it back on and it will work" (err, tried that before I walked out and "confessed"; yeah, like I hoped no one had noticed... ROTFL!). Still continuing to work for that client for a couple years after that incident, eventually being given a 3x raise? PRICELESS.
:-) I was, however, forever forbidden from that point forward, from bringing my cape into the computer room.
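The glob slip in the /usr story above can be checked without touching any files. This is a minimal sketch, assuming a POSIX-style shell; glob_matches is a hypothetical helper name, and it uses case so that only pattern matching is exercised and nothing is deleted:

```shell
#!/bin/sh
# glob_matches NAME PATTERN: succeed if NAME matches the shell PATTERN.
# Hypothetical helper for illustration; it only compares strings.
glob_matches() {
    case $1 in
        $2) return 0 ;;
        *)  return 1 ;;
    esac
}

# ".?*" means: a dot, any one character, then anything. ".." fits.
glob_matches '..' '.?*' && echo '".?*" matches ".."'

# A common workaround is the pair ".[!.]*" and "..?*", which together
# cover dot files but cannot match "." or "..".
for name in . .. .profile ..data; do
    if glob_matches "$name" '.[!.]*' || glob_matches "$name" '..?*'; then
        echo "would match: $name"
    else
        echo "skipped:     $name"
    fi
done
```

POSIX now asks rm to refuse operands whose last component is "." or "..", and some newer shells skip those names during expansion, but it seems safer not to depend on either when typing a recursive delete as root.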
rm -rf from the command line? Lucky bastards!
I once added the following to a cronjob
rm -rf $foo/*
My intention was to wipe contents of a directory that I was reusing. Unfortunately "foo" was unset. The cronjob ran overnight with rm -rf traversing every NFS mounted drive in the company. I remember coming in at 10 the next morning and thinking "christ what kind of idiot deleted all of my files?", and then "shit! that idiot deleted everyone's files" and then "shit that idiot is me!".
Ever since then I usually do something like
rm -rf ${foo:?}
mkdir $foo
Later as I recovered my composure I started thinking "Now why can't those idiots set their umask correctly?".
The only positive aspect of what happened was that it revealed a weakness in the backup procedures being followed by the IS department.
Personally I count myself lucky to have had the benefit of such a humbling experience w/out losing my job.
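The ${foo:?} guard from the cron story above can be wrapped up roughly like this. It is only a sketch under assumptions: wipe_dir is a hypothetical helper name, the path is quoted (the original lines were not), and the demo runs in a throwaway mktemp directory:

```shell
#!/bin/sh
# wipe_dir DIR: empty DIR by removing and recreating it.
# Hypothetical helper name; the guard is the ${var:?} expansion.
wipe_dir() {
    target=$1
    # If target is unset or empty, ${target:?...} prints the message and
    # makes a non-interactive shell exit before rm ever runs.
    rm -rf -- "${target:?wipe_dir: refusing to run with an empty path}"
    mkdir -p -- "$target"
}

# Demo in a throwaway directory created by mktemp.
scratch=$(mktemp -d)
mkdir "$scratch/work"
touch "$scratch/work/old-file"
wipe_dir "$scratch/work"
ls -A "$scratch/work"        # the directory exists again and is empty

# Run the failing case in a subshell so the abort does not end this script.
( wipe_dir "" ) 2>/dev/null || echo "empty path rejected"
rm -rf -- "$scratch"
```

Removing the directory itself and recreating it, rather than expanding "$foo/*", means an empty variable can never turn the argument into "/*". The guard only catches unset or empty values; a variable that is set to the wrong path still gets deleted.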
We had 2000+ Node Controllers that reset with CTRL - ALT - J, and put the standby memory online. A firmware upgrade changed the reset trigger to CTRL - ALT - Q. Later, a network technician wrote a minor config change to all the upgraded controllers' standby memory. That night a NOC technician saw the Q, "corrected" it to J in the reset script and triggered it. CTRL - ALT - J erased the standby memory, reset the Node Controllers, put the standby memory online, erased the standby memory, reset the Node Controller, put the standby memory online, on and on and on. Each Node Controller had to be physically powered off to stop the loop and configured.
The only thing new in this world is the history that you don't know.[Harry Truman]
Worked for a Life Insurance Company.
"640 K ought to be enough for anybody." -- Bill Gates, 1981
After years of weekly "fixing" of a cranky drive interface by power down/remove/reset/repower to clean the contacts, one day I forgot the power down part. I lost my favorite machine, my very first Apple II. I've never made that mistake again.
"I may be synthetic, but I'm not stupid." -- Bishop 341-B
I was working level 2 for a company that had software that started at 6 figures and easily went into millions. Got a call from a state library system. Contract for the entire state (paid for by Bill Gates charity btw) for our software. Customer told me that previous tech had told her to do something that wiped out all data. She told me what he said, and I couldn't believe it. So I asked her if she had this in an email or something.
She did. In plain text, to be turned into a .bat file meant to be run on the server, was the command:
del q:\allimportantdatadirectory\*.*
I got the email within seconds of when she said she'd send it to me (she had mine from previous calls, 16 million dollars gets a few privileges). I then spent the next week over the phone helping her recover what had been done. The mistake - not lobotomizing the flatworm who did this. Two weeks later he was found to be hosting kiddyporn off the corp servers (we were an educational software co). Only then did they finally fire the idiot. I did mention that it was Bill Gates that paid the 16 million, right?
Your real mistake was leaving the office machine logged in. That means anyone could have walked over and run arbitrary commands, and the logs would show that you did it.
You probably have an enemy in the office, who killed a table using your ID to make you look bad. (Another explanation is that an insufficiently gruntled employee wanted to annoy the company, and you were just a convenient scapegoat. But in that case I'd expect a lot more damage.)
from a now defunct dot com
The C++ class to discount items on the commerce site was overloaded -- item.discount(number) would accept floats or integers. The problem was they had completely different behaviors; a float would discount as a percentage and an integer would discount in dollars. So when I was told to discount a set of items by five dollars it ended up discounting them to (1.00-5.00)*price. The sanity checking code that made sure all prices were above $1.99 had been commented out because someone had wanted to do a 99 cent sale, so some items ended up being set to -400% of their price. Adding the items to your shopping cart GAVE you money. My boss was paged out of bed when the changes went live at midnight: the system alerted him because daily revenue went to NEGATIVE 50k.
.sig Karma out the wazoo, better to spend points elsewhere if this is above 2 or below 0
first big blunder when i left uni, set up a remote site with 2 intel isdn routers (1 redundant hot fail). these were configured to dial into the boss's home isdn line because the client hadn't had the other end set up yet. when they did, i dialed in and reconfigured them, well i thought i had..... :-{
:-)
one of the routers tried to dial out but it was still configured to the boss's house, the boss's router picked up the line and then disconnected it.
that wouldn't have been too bad, but it was doing this 5 times every 30 seconds with a minimum call charge of 5.2 pence for 3 months
the total bill 10k
lets just say the client wasn't best pleased
the second big cockup....
dialed into a customer to make some changes to their test system. i cleared the table that holds the day's transactions, not too bad thinks i, until i realise that i'm logged into the live database and i've just binned 3 million pounds of transactions.
the database wasn't sql, there was no roll back. all of a sudden there was no blood left in the top half of my body.
thankfully i had done a backup by accident and restored much of the binned data, and got the customer to resend everything from the consoles. they were none the wiser
I was working as an "Analyst" for a data center company's "Professional Services" division. IOW, they rented me out as a consultant.
One of the sales guys* comes to me with a job. Some outfit wants us to load test one of their servers. I guess they bought an app and were skeptical of the performance claims from the vendor. Our company was doing this as a freebie to prove what bad-asses we were, in the hopes of generating business. (This "made sense" in the dot-com days.)
I was to run the test from the office. We had two OC-3 connections that were basically completely unused (This "made sense" in the dot-com days.) so that was totally feasible. Dates and times (the middle of the night) were arranged. The sales guy's marching order to me was "crater the box."
I thoroughly tested all of my scripts and had a cron job set to fire them off at the appointed time (1am as I recall). The next morning I come into work to find a shitstorm. Something like:
Sales guy: You crashed their system!
Me: It's an NT box, right? I'm not surprised.
Sales guy: They're pissed. They couldn't get a hold of anyone to stop the test. Their system just kept crashing every time they brought it up for four hours.
Me: You gave me a four hour window. Why are they pissed?
Sales guy: It is a production server!
Me:?!
I don't know what the moral of the story is. Don't let the sales guy stand between you and the client?
-Peter
* The "Sales guy" is named Tracy, and he is a cool guy. I hate that he comes off as a sales-twit in the story. In reality he is quite smart, fairly technical, and hardly cheesy at all!
Since I am an impatient guy, I wanted to make my external USR Sportster 33.6k modem dial faster with initialization string parameters.
:(
Well, it was the middle of the night (3 am?) and I was a teenager. I made the modem dial fast, but one of the BBS phone numbers started out with 914... Well, the modem accidentally dialed 911. I didn't have the modem speaker loud enough so I didn't hear the 911 operator. Then, a cop came by after a few minutes. My folks weren't happy that day.
DOH!
Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
We all know how this ended. So I won't bother with the details. Much apologizing ensued on Kenny's part and much "Don't worry about it" on mine. I had backed up the files just before I asked for help. Thank god he didn't do "rm -r*" at / like I thought I saw.
"If a quarter is two bits, then a dollar's a byte." -R Deric Miller
Site A was a popular Windows shareware download site (rhymed with CaveDentral.com).
Site B is an even more popular open source download site (rhymes with freshfeet).
I was given the task of upgrading Site A to run on top of the PHP codebase developed for Site B. Nearing completion of this project, I began toying with the automated newsletter update features of the codebase. Unfortunately, since the codebase hadn't been designed with the idea that it would ever be used as an extensible framework, the newsletter posting address was hardcoded in an obscure corner of the include files. Or something like that -- it's been a few years.
Anyway, end result was Site B's subscribers began receiving a multitude of strange emails with the subject 'Testing -- Visit Site A for Windows Shareware!'
Oops!
really? wow... that's reallywow.
was on a main bank pc (some years ago) and wanted to format her floppy disk. She got to the DOS prompt and typed "format c:" and waited. And waited and waited. Then she started getting a sickly feeling that it was taking too long. She said she almost had a heart attack when they told her what she had done. Luckily they were able to reassemble the FAT table and all ended well.
As for me, my only tech mistakes have been working for evil PHB's. I *swear* that *two* of them actually looked like the PHB in Dilbert.
I mean c'mon, yer either taking care of the details or yer not. Also, any serious mistakes have been subconsciously auto-purged from my memory. Ahh, the joy of convenient amnesia!
Peace & Blessings,
bmac
www.mihr.com, check it out!
Good story. :)
Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
This one's not mine, and I called my buddy about it but he said no way he was posting it! heh
Back several years ago we were both middle school teachers and our school had just got a T1 so I was downloading a lot of mp3s there (pre-napster...all FTP and binary groups) and then taking them home (unplugging the drive). Soon I was running out of space. Our IT director asked me to do some training and I said yeah, if i could get a bigger drive.
So I got a 10 gig installed and moved all my mp3s over to it. By the end of this school year I had on that drive like 1000-some mp3s, all gotten with blood/sweat/tears through FTP sites or ripping them from my students' CDs.
I got a job the next year in our district IT dept and my fellow teacher buddy got my computer, with my extra drive with all the mp3s. Anyway, he wants a fresh install of 98 and calls me up and asks which drive are the mp3s/system on and I'm like the "mp3s on the big one and system on the small one". Well, he took my word for it and didn't check or whatever and formatted the drive with the mp3s. Had something to do with I told him the mp3s were on the "bigfoot" drive (he was kindof a noob then).
Napster showed up that fall though so no big deal...
So, I'm working on tracking down some random bug, and login to the production server to see if I can recreate it. I go scrolling through the logs, looking for my username, when lo and behold, there it is, looking something like this:
2003-12-11 14:30:07 UI: Adding UserName --> mylogin
2003-12-11 14:30:07 UI: Adding LoginType --> LDAP
2003-12-11 14:30:07 UI: Adding UserPassword --> cleartextpassword
Needless to say, I had a new priority.
Now, that's pretty bad, and at least I can say that this bug wasn't my fault. Then I made the mistake of mentioning it to the guy in the next cube. It was all of ten seconds before the tech lead found out, and got busy writing a script to delete all the passwords from the log archives. Thus, the greatest password harvesting scheme of my life was stillborn.
Yup, been there done that.
It hurt too, as it was the newest server in the park, and almost about to pass its 365 day uptime mark. I had been looking forward to that for a long time. :-(
So people, fix your inittabs. :-)
What would an EWOULDBLOCK block, if an EWOULDBLOCK could block would? -- me
I work for a Fortune 250 company. We started an Oracle ERP implementation project. Nice Sun 4000 box loaded with processors for a dev box, and a half-loaded E10K for the production and testing domains, all on an EMC Symmetrix array. I am the newly-appointed "Unix" guy at the data center, as this is the first Unix they had there. I am trying desperately to work well with the DBA's who work in another state. Got the picture?
Part of the project plan called for layering Veritas foundation suite on the EMC disks. Notice that we already have Solaris and its native VM, JNI fiber controller drivers, and EMC PowerPath running on the stack, and now I'm going to add Veritas File System and Volume Manager. I'm just sayin'. On top of an E10K, it's a lot of stuff.
I read the docs over about 2 days. I prepare all the scripts over another half. I am supposed to be doing about half the data disks in the first pass. Easy, right? You just specify those LUN's in a special file, and it keeps them from being touched by any commands you throw at EMC's PowerPath software. (Wait for it...) I stare at my navel for a while, then throw the switch... and convert ALL the disks, including the stuff they were currently using, to the new file system.
After all I had done to prepare, it totally escaped my attention that I would be touching the disks with Veritas software, NOT EMC software, so the exclusion file was moot. I was scared witless. I'm sure that those of us who've been through something like this can commiserate with this almost indescribable sinking feeling in the pit of your stomach.
Amazingly enough, the "live" data disks I wiped out were all due to be cleaned off anyway, with the exception of a couple, and the DBA team lead just considered it a good chance to put their data restoration skills to the test. Talk about a nice group of people!
Here's another. I've left that group due to... financial issues... and taken up with my old group to do custom programming for the engineering department. Everything's great. Now that I don't need to run EMC's Windows-only management software, I'm back to running Linux as much as I can.
The need arises to set up a SQL Server box for some engineers to develop some other apps on. No problem. I get it all set up. But then, I check up on it after a break, and I see TONS of specialized User Account Policies. My first thought is that someone with a normal account -- with a login script that calls SMS setup stuff -- has logged in, and the "domain" has taken over the whole machine. So I angrily start undoing these changes. Problem? By default, User Manager looks at the domain. I'm a domain admin. It lets me delete something like a hundred entries.
I'm down to the last few when I realize what I've actually been doing. Yeah. Screwing up batch files and service account privileges across our entire multi-national domain with thousands of accounts. Me? I start sorting my stuff from my company's stuff in my office. I know what has to happen; I'd do the same thing.
I make the call to a good friend who's heavily involved in the domain administration back at the aforementioned data center, and sit on pins and needles. I can scarcely believe it when he tells me that it's no big deal, and he can get one of his guys to repair these permissions in a few minutes, from memory! (Turns out that this might not be the first time that someone has done serious damage to these settings.) WHEW! But, boy, was my face red.
I think it's a positive testimony about the company that I get to do things that stretch my abilities and not get the axe for the occasional mistake. Even over and above folks overlooking these blunders, I'm incredibly lucky to have this job. It's a fantastic place to work.
Acts 17:28, "For in Him we live, and move, and have our being."
Seems innocent enough, right? I complained about a frivolous project. (Some PHB wanted the entire company to do a 1 hour presentation on what we do. For some reason, upper management was able to get out of it.) I had real work to do with an ambitious deadline, and somebody who wasn't even my boss was telling his employees to tell me to do some waste-of-time project that could easily be rescheduled.
So why was this a 'tech blunder'? Because I was running Windows 98 on relatively new hardware. You see, hard drives became significantly faster shortly after the release of Windows 98 due to increased buffers and what not. One of the 'improvements' of Windows 98 was that it could shutdown and restart a lot faster than 95. The problem? I discovered shortly after arguing with my coworkers about how dumb an 'urgent project' was that Windows 98 could actually shut down the computer so fast that the data in the HD buffer wasn't done writing to disk. When the computer came back up, 'No OS found.' Obviously, my project wasn't going to get done that day.
The coworker who was being pressured by her boss for me to get this done heatedly accused me of sabotaging my computer so I wouldn't have to do the project. She really thought I broke my own computer because I made such a big deal about not wanting to do the project. I can sort of see where she's coming from, but I was rather offended that she thought I'd sink to that. I am very much against doing unnecessary work, but I won't disobey my boss.
I was soon vindicated. No, another 98 machine didn't crash. Rather, when the machine was restored, I did the stupid project. I was thoughtful about how to present it, moreso than my other coworkers. Lots of praise, yadda yadda yadda.
The moral of the story? Be very careful about bitching about your job. Those of you who frequent here know that people like to jump to conclusions, and they usually operate with critically low information. Once I called in sick after working a late night. My boss called me and said "You should have just said you needed to sleep in." Sadly for her, she got to hear the details of the digestive problem I suffered that morning.
"Derp de derp."
...to feed my cats while I went on vacation and leaving my ICQ chat logs intact and not passworded.
I will never do that again.
One of my worst technology stories: I needed a multiplexing board to get 8 serial COM ports on a computer (OS/2 at the time). I bought a DigiBoard PC/8e (I still have it) for $700 apiece, only to realise that there was something else available for $150 that could have done the same. Now, before buying hardware equipment, I ask everybody if there is something else that can do the same thing.
Periodically, the spooler would start to get full, so one of us good-hearted individuals would run a command to delete the oldest few spooler jobs.
One day, I noticed that the spooler got full, so I took on the task of deleting old jobs. I typed the wrong command, and then left for home. After about 5 minutes on the road, I realized my mistake and knew that it was deleting every spooler job. I raced back to my office, and aborted the command.
Nobody even noticed.
HCG 50a = 2MASX J11170638+5455016
11h17m06.4s +54d55m02s
It's not that -f overrides -i. It's that whatever option comes last on the command line wins.
One trick is to go into an important directory and "touch ./-i". This creates a file called dash-i. Next time you "rm -f *", the C locale's collating order will put dash-i at the front of the expanded list, and the command actually executed will be "rm -f -i foo bar baz ...". Now -i wins.
For added space savings, hardlink all your empty dash-i files together.
I've never tried this suggestion (my friend is backups, not mother-may-I intentional stumbling blocks), so I can't report on its effectiveness.
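For anyone curious what the dash-i trick above actually does to the argument list, here is a small sketch that only inspects the expansion and never calls rm on real data. It assumes a POSIX-style shell; first_glob_match is a hypothetical helper name and the file names are the ones from the comment:

```shell
#!/bin/sh
# first_glob_match DIR: print the first name that "*" expands to in DIR
# when sorted in the C locale. Hypothetical helper for illustration.
first_glob_match() (
    cd "$1" || exit 1
    LC_ALL=C
    export LC_ALL
    set -- *
    printf '%s\n' "$1"
)

demo=$(mktemp -d)
touch "$demo/foo" "$demo/bar" "$demo/baz"
touch "$demo/-i"             # same idea as: cd "$demo" && touch ./-i
first_glob_match "$demo"     # the name rm would see first
rm -rf -- "$demo"
```

The protection is thin: any file whose name starts with a character that sorts before "-" in the C locale (a space or "+", for instance) would come first, other locales may order names differently, and nothing here helps when the directory itself is named on the command line.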
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
I was showing a guy how to use CVS and he kept messing up. So I showed him how to remove his dir out of the CVS repository so that he could start over. Well on one of his tries he accidentally checks in a new directory with a space on the end (it was a GUI client). So he goes to the CVS repository and does "ls -F" and gets:
somedir /
So he copies the dir and "rm -rf"s it. Comes to me and says, "it's been removing my directory for about 5 minutes now." I go, see what he typed and start frantically hitting control-c. You guessed it: "rm -rf somedir /". Whoops.
Another guy was the lab manager where I used to work as lab assistant (a long time ago). He bought a brand new state of the art 486dx2 chip for around $500. Apparently back then they weren't keyed and he plugged it into the mother board rotated 90 degrees. Fried it good. So he goes to the store and buys another. He studies the pin 1 indicator on the chip and board good and long. Plugs it in. Turns it on... Nothing. Turns out he got it wrong AGAIN! $1000 in 2 hours. Whoops.
-David
There. Now go play some cool javascript games!
i was in a networking class and the teacher (who was going to resign in 3 days) gave the class the novell administrator's password. everyone in the class logged in and started poking around. at this point the teacher left the room. the class proceeded to delete accounts and computers and stuff :)
:)
well, someone deleted the administrator account everyone was logged into.... boom! everyone got a warning that they had no access anymore. 10 minutes later the teacher returned, looked at the screen and screamed. then he laughed because he knew it wasn't his problem.
cost me my last job, in addition to building up a coworker to back me up in case of emergency.
No sh*t!
- Hubert
Former boss was staying with his grandmother when he was a child. Since he was young and healthy, it was one of his jobs to go out to the mailbox by the street to get/put envelopes.
So one day he walks out to the mailbox and notices something he hadn't noticed before: a small box on an adjacent telephone pole, with a little lever and the words "Pull Here". There were some other words, but he couldn't read those. But he knew what "pull here" meant, and he was a responsible boy who did what he was told, so he smartly pulled down on the lever, took the mail , and went inside.
Shortly afterwards he was fascinated by the big trucks with the flashing lights driving around.
The next day, he walks out, pulls down the lever, gets the mail, walks back inside. Trucks drive up and down the street again. No cause-and-effect thoughts are passing through his mind (why would they?).
Happens every day for a few more days. No electronic records back then, so it takes some bright guy at the firehouse watching the clock to notice the pattern. One day the truck just shows up early and waits. Young Boss walks out, gets the mail, reaches for the lever, gets stopped.
"Young man, have you been pulling this lever every day?"
"Yes sir! I'm just big enough to read it!" *huge grin*
"Ah." *sigh, smile* "No harm done, but let me talk to your folks, young man."
Young Boss got the stuffing beat out of him by grandma, but he says getting to tell the story these days to his laughing kid, wife, and employees was worth it.
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
It must be noted that I had previously been used to the MS-DOS command-line, and got into the bad habit of copying files without specifying where to. This makes them go into the current directory on MS-DOS so no problem.
But now I was on Unix. And so I make a sibling directory, and go into this directory and issue the command:
but where the hey are the files? ah! yes, I'd forgotten to say where to put them... try again:
Ok, now I got the ndss.c and ndss.h files. But... why doesn't this seem to compile? Hmm... but the previous one did? So I check... Nope!
What's happened here?! I take a look at the files, my ndss.c looks ok and ndss.h looks ... eh? wasn't this ndss.c that I just looked at? What's going on here?
Realization dawns. My ndss.h is lost and I got four copies of ndss.c. So, OK, I can recreate that knowing the function names and stuff from the C file... but damn if I ever issue a copy statement without specifying where to anymore!
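The actual commands are missing from the post above, so what follows is a guess at the mechanism, not a reconstruction: if a wildcard expands to exactly two file names and no destination is given, cp treats the second name as the destination and overwrites it. A minimal sketch of one way to make the destination impossible to forget, where copy_into is a hypothetical wrapper name and the ndss files are stand-ins created in a temp directory:

```shell
#!/bin/sh
# copy_into DEST FILE...: copy files into DEST, which must already be a
# directory. Hypothetical wrapper; the destination comes first so it
# cannot be silently replaced by the last source file.
copy_into() {
    dest=$1
    if [ ! -d "$dest" ]; then
        echo "copy_into: '$dest' is not a directory" >&2
        return 1
    fi
    shift
    if [ "$#" -eq 0 ]; then
        echo "copy_into: no files to copy" >&2
        return 1
    fi
    cp -- "$@" "$dest"/
}

# Demo with stand-in files in a throwaway directory.
work=$(mktemp -d)
mkdir "$work/old" "$work/new"
echo 'c source' > "$work/old/ndss.c"
echo 'header'   > "$work/old/ndss.h"
copy_into "$work/new" "$work"/old/ndss.*
ls "$work/new"
rm -rf -- "$work"
```

If the first argument is accidentally one of the source files, the directory check fails and nothing is copied.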
Another fun tech mistake: Put the cache memory chips in a 486-era box backwards. These are 28-pin narrow DIP format, (61256s) and work one way and fail spectacularly the other way, but they fit either way.
SIGBUS @ NO-07.308
Had to rebuild the tables from scratch...
Your reality is lies and balderdash and I'm delighted to say that I have no grasp of it whatsoever. - Baron Munchausen
I mentioned to some bigwig at SCO that some code in the Linux kernel looked a lot like SCO's unix. :)
I think that what's really being said, is that the more you learn, in general, the higher up you go in responsibility chains.. and therefore, when you make a screw up, it can be an order of magnitude worse than if you had been responsible for a smaller area/subset/whatever.
"Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
That was my big mistake. Of course, it was compounded: They didn't have a very recent backup (about a week old). After a stern talking from the president/ceo/owner of the shop, my supervisor and I spent a VERY late evening restoring the data from the backup, rebuilding what we could from the leftover two tables (due to the way the system was structured, a lot of the data for this one table could be gleaned/inferred from the other two tables - had I screwed up one of the other two - I would have been even more hosed), and the rest the client had to rekey.
In all, it caused this company to lose about a day's worth of work, a half day or more of employee downtime/productivity, and untold dollars. Lesson learned?
ALWAYS DOUBLE CHECK WHAT YOU TYPE BEFORE PRESSING RETURN - ESPECIALLY IF YOU ARE PERFORMING A DELETE COMMAND.
To this day, I am amazed I wasn't fired...
Reason is the Path to God - Anon
While working at HP I did a NET SEND command to get whoever was logged into one of the servers I was using to log out of PCAnywhere. Unfortunately, I missed one of the parameters and sent the message to everyone in the login domain (ie. a few thousand users).
After hitting ENTER, I hear a hundred Windows 'dings', and everyone in cubicle-land starts prairiedogging. I got a few nasty replies asking who I was, and a very nice one saying "Don't worry: once I sent 'I know you don't have any pants on' to most of HP Belgium".
Worst thing was, the guy clogging up the server was my cubicle-mate who'd gone out to get coffee.
I was typing for RMS back when he needed volunteer typists because his wrists hurt too much.
He said "rm" "-rf" "." "config".
Or so I thought. He probably thought he said "rm -rf .config". But what he said afterwards was "NO, NO, Ctrl-C, Ctrl-C!!" And then he said "Did you get enough sleep last night?" As if *I* was at fault!
-russ
Don't piss off The Angry Economist
from Darl.
Once upon a time, there was a small, innocent server room. A few servers for the company's knowledge and document management, main part of the intranet, were connected to two UPSes, and everything was good. One day, the power failed for a few minutes, the UPSes switched to battery power, and nobody noticed it until the servers suddenly stopped responding. Both batteries were empty, the UPSes could not tell the servers to shut down because nobody had wired the serial port of the UPSes to the servers. But the best fact: Both UPSes were connected in parallel to a single 16 amps automatic circuit breaker, and both UPSes tried to recharge their batteries immediately after the power returned while still supplying power to the servers, drawing much more than 16 amps. Guess what happens: The breaker switched off and the UPSes were discharged until they could not even light a single LED.
The next server room had 12 circuit breakers rated 16 amps each, a well-calculated power distribution, and the UPSes could tell all servers (a few more than in those good old times) about their battery states.
Tux2000, happy not to be responsible for that old server room
Denken hilft.
I've made two major blunders on the job.
The first time was in '97 when I was sent to a vendor site to fix a PC application. I talked to the supervisor there before I got started and was told something to the effect of "do whatever you need to - just get it running". I went to work on it and determined the OS (Win95) had some serious issues and decided a reinstall was most expedient. I reformatted, reinstalled, and put the application on. When I was done I told the supervisor what I had done and that everything was hunky-dory. That's when he tells me other customers' application data was on there and 'I had no business doing that'. They had no backup. My company was charged for the rebuild work, and I ate crow.
The other big mistake was when I was doing a hardware upgrade on a database server. I was new at this division of the company, having transferred only a few weeks before. IIRC I was repartitioning a RAID array. I started around 8pm. I made my backup of the database, added disks, and went to restore. The SQL database backup was corrupted. I panicked hard. Why hadn't I tested the backup?! I called Microsoft support but a full restore was not possible. At 4 am, I ended up going back to the previous night's backup. It actually gets worse, but it's too long to document here and it depresses me to think about it.
On my own machines at home, I'm always switching hard disks, reinstalling OS's, etc., and many times I've forgotten I had documents or digital pictures tucked away somewhere and lost them for good. It's harder to get myself to apply due diligence on my own stuff, but I'll learn eventually.
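The untested backup in the story above is the kind of thing a cheap automated check can catch before the destructive step. A minimal Python sketch, assuming the backup is a plain file copy; `sha256_of` and `backup_and_verify` are hypothetical helper names, not anything from the poster's setup. Matching hashes only show the copy is byte-identical. They do not prove a database backup will restore, which takes an actual test restore.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source: Path, backup: Path) -> bool:
    """Copy source to backup, then confirm both files hash identically."""
    shutil.copy2(source, backup)
    return sha256_of(source) == sha256_of(backup)
```

The idea is simply to refuse to touch the original until `backup_and_verify` has returned True.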
I needed to blow away a tree of directories under /tmp on our development systems. Since they were all owned by various other developers, I needed root permission to clean out the space. I figured I would su, and then delete it.
:)
10 seconds after starting the rm, I figured it was taking too long. I looked back up the log, and I noticed that I had done a "su -" instead of just a "su"! Freaked me out completely. Destroyed the whole machine, we had to rebuild from scratch. Thankfully, I hadn't gotten to the home directories yet.
Jason Pollock
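One way to guard against a recursive delete running from the wrong directory, as in the "su -" story above, is to make the delete refuse anything outside an expected root. A rough Python sketch; `safe_rmtree` and its arguments are hypothetical names for illustration, not a standard tool.

```python
import shutil
from pathlib import Path


def safe_rmtree(target: str, allowed_root: str) -> None:
    """Delete target only if it resolves to a path strictly inside allowed_root."""
    root = Path(allowed_root).resolve()
    victim = Path(target).resolve()
    # Refuse the root itself and anything that is not underneath it.
    if victim == root or root not in victim.parents:
        raise ValueError(f"refusing to delete {victim}: not inside {root}")
    shutil.rmtree(victim)
```

Because both paths are resolved to absolute form first, the check does not depend on the current working directory, which is exactly what changed unexpectedly in the story.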
The very first chip I worked on came back only partially functional, and the part that was broken was my logic :-(
;-)
At the time, the respin cost was about $50K!
After 6 weeks of working on the problem, we verified a timing problem in my logic. Now - for those in the know, you would expect static timing analysis to find this - uhm, the company didn't use STA, but rather depended on the library's back-annotated delays and verilog simulations saying a Flop input didn't meet setup or hold timing requirements. So if you didn't excite the slow path in your simulations you wouldn't see it...
We also found that our libraries were 40% slower than the actual silicon we received, i.e. there wasn't any ground-truth in the simulations we were doing anyway.
To make a long story longer - the company wound up making huge changes in their methodology due to this mistake - and I was exonerated because I followed the extant methodology and couldn't have possibly found the problem with the tools in place at the time.
Talk about 6 weeks of hell!
This is a classic story that REALLY happened, cause it happened to someone I went to college with. I got the story from two other folks that saw it happen.
This is back in the days when people used "Disk Packs". These were multiple platter disks that you could actually remove from the drive. The guy decided to do some preventive maintenance on the disk drive, so he turned the drive off to remove the pack. Only one problem - the system was hot with real live software developers editing and saving files. The disk pack contents were ruined, and from what I understand, they took over two weeks to recover everything, i.e. getting back to where they were when he did his thing. He didn't work there after that.
The last story I've got is along a similar vein and involves the same disk packs. This happened at my first job a few years before I was hired in. A disk pack failed on a drive, so the tech downs the drive and installs the disk pack on another drive. Well that one doesn't work either so he moves it to another drive... so on, and so on, until he ruined 9 separate drives with a crashed disk pack!
Have you compiled your kernel today??
With that aggravating beauty, Lulu Walls.
I wrote a recursive Bourne shell script in 1983 -- it called itself, twice, before exiting. Blew out the 4.2 process table and crashed the VAX. Luckily it was an academic system, not a business production system.
Not enough EE's represented here, so I'll take a shot:
Took the tantalum cap that we used to decouple the +5v power supply and used it to decouple a +12v power supply. About two months later, after we had about a dozen products in the field, I found out that it was rated at 9 volts.
I worked really hard to write the recall letter without using the words "explode" or "burst into flame" while trying to convey the thought that this part probably wouldn't catch their whole system on fire.
HIV Crosses Species Barrier... into Muppets
deltree /y \windows
;-)
But then again, I guess most of you don't think that's a terribly big mistake
The factory consumed wood and produced pulp, which is like thick blotting paper and is used to make paper. Basically you chop up the wood, and digest it to form an aqueous suspension of cellulose and lignin, which is then sprayed over a moving metal mesh belt to filter out the cellulose, which is the bit you want. As this factory was in a region with a poor power supply, it produced its own electrical power by concentrating the waste lignin, burning it in a steam generator, and powering a turbine from the steam.
This is all very elegant, but there's an obvious bootstrap problem. My colleague had to do some maintenance on the control system for the turbine, and of course it couldn't be shut down without major problems rebooting the factory. Unfortunately this meant that when he got the byte sex of an integer the wrong way around, he found out the hard way when $2M of live turbine expired.
I met him some time after the event - fortunately still working for the same company.
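For anyone who hasn't been bitten by byte order yet, here is a small Python illustration of how the same two bytes decode to very different integers. The value 3000 and the function names are made up for the example; nothing here reflects the actual turbine controller.

```python
def encode_setpoint(value: int) -> bytes:
    """Pack a 16-bit unsigned value in big-endian order (hypothetical wire format)."""
    return value.to_bytes(2, byteorder="big")


def decode_setpoint(raw: bytes, byteorder: str) -> int:
    """Interpret the raw bytes using the given byte order ('big' or 'little')."""
    return int.from_bytes(raw, byteorder=byteorder)


raw = encode_setpoint(3000)              # bytes 0x0B 0xB8
right = decode_setpoint(raw, "big")      # 3000
wrong = decode_setpoint(raw, "little")   # 47115, read as 0xB80B
```

A reader that assumes the wrong order gets a plausible-looking but wildly different number, which is why such bugs tend to surface only on live equipment.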
About 5 years ago I was a fan of the NT bootloader since it enabled me to simply use the arrow keys to decide whether to boot Linux or NT.
And since I had for some reason or another re-installed LILO, with the expected effect that it overwrote the Master Boot Record with its signature, I figured I might as well try and put the NT boot loader there instead.
I had read somewhere that the MBR was the first 512 bytes of the disk, and since I had somehow worked out in my head that LILO booted DOS by running some code from the first sector I naturally came to the conclusion that I could easily restore the NT bootloader as the default bootloader by simply copying the first 512 bytes of the NT partition to the front of the disk.
The command I then came up with was the following dd command:
dd if=/dev/hda1 of=/dev/hda bs=512 count=1
The system worked of course just fine after copying these few bytes from one place to the next and it wasn't until I rebooted the machine that I found out that something had gone terribly wrong.
I can't recall what the error message was exactly, but what I do remember is the feeling I got in my stomach when I saw it.
At first I just said to myself that I had simply hosed the MBR. But that bad feeling in my stomach grew worse and worse when a Linux Rescue CD I booted from actually found NO partitions at all on the hard drive.
So, let this be a lesson to everyone not to run dd unless you're REALLY sure you know what you're doing.
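For context on why no partitions were found: in the classic PC layout, sector 0 holds up to 446 bytes of boot code, then a 64-byte partition table, then the 2-byte signature 0x55 0xAA, so overwriting all 512 bytes replaces the partition table too. A hedged Python sketch that works on in-memory byte strings only and never touches a disk; `install_boot_code` is a hypothetical helper, and the sketch makes no claim that a partition boot sector is usable as MBR boot code.

```python
BOOT_CODE_SIZE = 446   # bytes 0..445: boot code area
SECTOR_SIZE = 512      # bytes 446..509: partition table, 510..511: signature


def install_boot_code(current_mbr: bytes, new_boot_sector: bytes) -> bytes:
    """Return a new sector 0 with replaced boot code and the old partition table."""
    if len(current_mbr) != SECTOR_SIZE:
        raise ValueError("current_mbr must be exactly 512 bytes")
    if len(new_boot_sector) < BOOT_CODE_SIZE:
        raise ValueError("new_boot_sector is too short")
    # Keep everything from offset 446 onward: partition table plus signature.
    return new_boot_sector[:BOOT_CODE_SIZE] + current_mbr[BOOT_CODE_SIZE:]
```

Keeping a copy of the original 512 bytes somewhere off the disk before writing anything would also have made the mistake reversible.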
---
Then a couple of years later I was working as a programmer, and I too did what I guess a lot of programmers have done at one point or another in their career.
That is, "DELETE FROM table;". But I was fairly lucky in that it was not on a production server, and the data was easily restored from backup.
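A common defence against the unqualified DELETE is to run it inside a transaction and check how many rows it touched before committing. A minimal sketch using Python's built-in sqlite3 module; `guarded_delete`, the `jobs` table and the `max_rows` limit are assumptions for illustration, and other databases or drivers may report row counts differently.

```python
import sqlite3


def guarded_delete(conn: sqlite3.Connection, sql: str, params: tuple, max_rows: int) -> int:
    """Run a DELETE, rolling back if it affects more rows than expected."""
    cursor = conn.execute(sql, params)   # sqlite3 opens a transaction implicitly
    if cursor.rowcount > max_rows:
        conn.rollback()
        raise RuntimeError(
            f"DELETE touched {cursor.rowcount} rows, expected at most {max_rows}"
        )
    conn.commit()
    return cursor.rowcount
```

Called with "DELETE FROM jobs WHERE id = ?" and a limit of 1 it commits; called with a bare "DELETE FROM jobs" on a populated table it rolls back and raises instead.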
At my first real programming job I was in charge of maintaining several different versions of UNIX software for different platforms. I didn't have dedicated development boxes for most of them and in particular I had to share the SunOS4, AIX and Digital UNIX boxen which were on the other side of the building.
Anyhow I generally would telnet into these machines and then use NFS to mount my development directory on my box, then do the compilation and testing.
One day I was looking for a file on the SunOS box but the find command syntax was a little different. So I decided to just NFS mount the drive from the SunOS box on my Linux box so that I could use the find command I was more familiar with. I didn't normally do this, so I didn't have an fstab entry or established mount point. I figured this would just be a one time thing, so I just quickly did a mkdir /tmp/sunos and mounted the volume there.
Later that day I decided to clean out the temporary directory on my Linux box after installing some new software.
cd /tmp ... pwd ... (yep, we're safe) ... rm -rf * ... hey, why is this taking so long?
The SunOS box was slow, and it took a long time to restore the backup. Lost a day of work and the IT guys laughed about that one for quite a while.
-=Ivan
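The forgotten mount under /tmp can be spotted before deleting by comparing device numbers: a directory that lives on a different filesystem than its parent is a mount point. A rough Python sketch of that check; `foreign_mounts` and the injectable `device_of` parameter are hypothetical names added so the idea can be tried without a real NFS mount.

```python
import os


def foreign_mounts(root, device_of=lambda path: os.lstat(path).st_dev):
    """List directories under root that sit on a different device than root."""
    root_device = device_of(root)
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in list(dirnames):
            path = os.path.join(dirpath, name)
            if device_of(path) != root_device:
                found.append(path)
                dirnames.remove(name)   # prune: do not descend into the mount
    return found
```

If the returned list is not empty, the cleanup should stop and ask a human, since an "rm -rf *" would happily follow those directories onto someone else's disk.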
I have a theory about this...
;-)
See, the idea is that Real Life (TM) is just a giant RPG. Therefore, you complete tasks supplied by the DM, and gain experience for good work and good RPing. The DM of Real Life (TM) has to make the game interesting enough that people will want to keep playing, but not so hard that they just give up.
The conclusion I've come up with is that the DM will, from time to time, give you challenges to overcome. NumLock problem, CR 0. Windows reinstall, CR 2. Exposed wire on the IDE cable shorting and forcing a reboot every time you burn a CD, CR 10 or so. In order to keep you engaged in the game, the DM has to make the challenges appropriate to your current level. Also, when you get enough experience, you go up a level and gain nifty powers.
Thus, as you gain in levels, simple problems become less appropriate, and the DM has to throw in some real show-stoppers to keep you on your toes. =)
The department that I worked for had two SunOS servers (CS and IS depts). One day I was given the job of walking across campus, shutting down one, plugging something in and starting it up again - I can't even remember what I had to plug in.
On the IS machine I typed the shutdown command in wrong and it didn't work, so I tried to read the manpage. Manpages were NFS mounted from the other machine, but had now been dismounted, so I rsh'ed into the CS machine, RTFM'd and then typed in the correct command... Ooops, forgot to exit the rsh.
Boss was very kind, the comment was "everyone is allowed *one* bad mistake"
Second bad was spinning around on a swivel chair and having my knee hit the power switch on a NetWare server. Hurt like hell as well as switching off a running server!
So I used to work the tech support lines for a local ISP and I was trying to get a new customer up and running: Windows 3.11, Trumpet Winsock, etc. To this day I don't know what this person did to their computer, but it took us HOURS to get that damn thing connected. Of course, them having only one phone line ("Okay, I'll try that and call you right back..") didn't help any.
:).
Once the torture was finally over, I got the dreaded question: "Now what do I do?". So I tell him about setting up email, "surfing the web", etc. He replies "That sounds great. How do I do that?".
So I start walking him through ftp'ing over to netscape in the hopes of downloading a nice browser. I should have hung up the phone and shot myself in the head instead. Another 30 min. goes by with frustrations mounting on both ends of the phone (Only one phone line, remember).
Finally, I realize that enough's enough, download the file for them and start up wu-ftpd on our shell/mail server so that I can see what's happening when they attempt to connect. I give the customer the new ip address to try, hang up, and start watching the logs.
Radius says they connect, authenticate, yadda yadda yadda.
Syslog on my local box says someone's connecting... YAY! On the first try it works. By now it's waaay past time to go home, so I tell the next shift guy what's up and head on home.
When I get into work the next day, my manager calls me into his office. I'm ready for a commendation, or a medal, or something for going way beyond the call of duty. He asks me "Do you trade warez?". I say "Not really.". He says "Explain this then.".
Of course, I forgot to turn off the ftp server after the customer had downloaded netscape. And of course, someone had scanned it overnight and started filling it with all sorts of stuff. And of course, they filled up the disk, wreaking havoc.
I quickly explain the situation and I somehow manage to keep my job.
But we did get some kick ass warez out of the deal, so it wasn't a total wash
--
Mando
(all 3 of you)
Notice how many of the tech mistakes posted so far involve problems with 'rm.' Unix vendors listened to these complaints for decades and ignored them, and look where they are now. Maybe just because rm (and a lot of other command-line stuff) has been brain-damaged for such a long time doesn't mean it's without fixable faults.
The worst mistake of MINE that comes to mind is getting drunk while on call.
The chances of anything going wrong that night were slim to nil (null) but, as my luck would have it, the system went down. System fixes on our system required many people working together. I was sysadmin and in charge of the techs. Needless to say, that night, I was NOT a sight for sore eyes but rather a sore sight for eyes. Not to mention having sore eyes myself!
I DID get the problem fixed rather quickly, but ended up going to sleep under my desk for the remainder of the night. Worst of all, I didn't wake up until people were already in the office the next morning! Oops!
-definitely posted by Anonymous Coward!
Back when I had my good old 286 running DOS, I was having a little trouble running out of room on the hard drive (barely a few megs then!), so I went into a directory of a game I no longer used (at least, so I thought), and confidently told the computer to erase everything in the directory I was in, c:\...
I recommended that we install RedHat 8 on a co-located server to the company that I work for. It was kind of embarrassing to tell them three months later they should migrate away from the solution I just recommended. Thank god for Debian!
This tidbit from Lars Wirzenius is a part of Linux Lore:
"Linus also got some other stuff via mail. For example, a pair of 40 megabyte hard disks. That was really nice, since it meant that Linus was finally able to keep some backups. Not that he did, of course. One of his well-known quotes is: "Backups are for wimps. Real men upload their data to an FTP site and have everyone else mirror it." He said that even after dialling his hard disk.
"At one point, Linus had implemented device files in /dev, and wanted to dial up the university computer and debug his terminal emulation code again. So he starts his terminal emulator program and tells it to use /dev/hda. That should have been /dev/ttyS1. Oops. Now his master boot record started with "ATDT" and the university modem pool phone number. I think he implemented permission checking the following day."
I once did something similar -- I was going to back up my MBR to a floppy. Using the 'dd' utility. I got the command line options backwards, and overwrote the first 1.44MB of the hard drive with the contents of a blank floppy disk. Required a low-level format of the hard drive to reuse the sucker. Thankfully, it had no critical or irreplaceable data on it.
Give me my freedom, and I'll take care of my own security, thank you.
I tried to install a PCI video card (which cost twice as much as the game [Enter the Matrix] that I was buying it to play). My BIOS didn't have an option to shut off onboard video. After installing, I noticed my taskbar had been removed. Unable to retrieve it, I reinstalled Windows over my old copy. It turns out I'd already corrupted drivers by installing the video card, and this just made things a lot worse. USB devices, networking, almost everything just quit on me. For a while, that game was, I joked, the only thing on my computer that worked. Thankfully my ASPI layer was in good enough shape to burn my current school project to a CD. Eventually, I backed up onto a stack of CD-R's and reformatted, re-installing. After that, I broke down and decided to call NVidia's tech support, instead of Dell's, which had told me that I had to disable onboard video, delete my USB drivers, reinstall any various items, reformat, and that it would not work. I spent countless hours trying to work with people with little English training to solve my futile crusade. NVidia's tech support guy had the whole thing done with me in under 30 minutes. (Disable onboard video by configuring it as a separate monitor.) And he spoke English. I also had a nice clean HD to work with. I nearly lost my mind, and I have never been able to recover the scanner drivers I lost in that reformat. But elsewise, my computer works great. In retrospect, the game in question was the most expensive game I ever bought, counting that hardware ($120 in total) and I was, come to think of it, not too impressed. Thankfully, I moved on to other games that can make good use of the card.
I still remember getting a rush job assignment as an entry-level web developer "develop an email to 50 of our best clients and blast them with our new service"...
I code the whole thing up in ASP thinking I could use that as a quick and dirty bulk mailer. Unfortunately I didn't realize a condition in my code caused an infinite loop around the send command...
Unfortunately with no QA (remember I was young and stupid then) I hit the execute button. Five minutes later no response from the page...then the phone call came. The worst was one client who got about 17,000 emails advertising our new product in a single in-box.
Believe it or not I got to keep my job, and we were able to spin most of those clients to stay with us, but damn that was a scary day.
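A bulk mailer is much harder to turn into a 17,000-message loop if it iterates once over a fixed, de-duplicated list and refuses to start when the list is bigger than expected. A minimal Python sketch of that shape (the original was ASP, and this is not the poster's code); `send_campaign`, `send` and `max_messages` are hypothetical names, and `send` can be a stub for a dry run.

```python
from typing import Callable, Iterable


def send_campaign(recipients: Iterable[str], send: Callable[[str], None], max_messages: int) -> int:
    """Send one message per unique recipient, refusing oversized runs."""
    unique = list(dict.fromkeys(recipients))   # de-duplicate, keep order
    if len(unique) > max_messages:
        raise ValueError(f"{len(unique)} recipients exceeds the cap of {max_messages}")
    for address in unique:                      # bounded: exactly one pass
        send(address)
    return len(unique)
```

Passing a function that only appends to a list as `send` gives a dry run whose output can be eyeballed before any real mail goes out.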
...in bed
Well, one night, I'm bored, and I notice there's a handy little feature (don't remember exactly what) from hitting Ctrl-A. So I start going through the alphabet... get to Ctrl-N, and I'm at a dos prompt. It's 11pm, no one's around, and none of the other managers would know what the hell to do anyways. Can't get anyone at corporate.
"DIR" reveals a list of some 100 .EXE files. I start scanning for one, find one that sounds like it might start the system, run it. Yeah, it started the system. From scratch. Everything, and I mean everything, was gone -- employee records, store sales history, boom. No backups. A very sad night.
Not my bad, but I had to fix it.
A DBA decided to auto-boot to "Fix" a problem on a 3TB vxfs filesystem housing the production oracle DB. For some reason it spent the next 6 hours renaming every file on the filesystem to the inode # and put it all in /lost+found. One 3TB directory full of numbers...
Then we found out the backups were shit, and I had to go fishing for a tarball in the lost+found dir. It was a 26 hour day, but I got it all back with only 1 day of data loss.
Afterwards, I went out and got royally pissed out of my gourd.
When I was about seven or eight, my dad decided we kids were watching too much TV. So he decided to implement an early version of the V chip. :-)
He replaced the male plug at the end of the tv power cord with a female plug, and built a male-to-male power adapter that he kept control over. He would give us the adapter for two hours per night, then confiscate it again.
Worked pretty well - with no way to plug in the tv, there was no way to turn it on! We got bored watching the blank screen pretty quickly.
Can you see the flaw in his plan? :-)
One day, the TV didn't turn on - but I was pretty sure the adapter was in place. So I reached behind the stand without looking, to plug it in - and accidentally grabbed one end of the adapter - the end that wasn't plugged in to the wall...
YOWSA!...
no permanent damage, thank Gd (twitch twitch)
I never told him about it, and I was always much more careful around electrical stuff after that.
Actually, that's not strictly true. A few months later, I was home sick, without the adapter.
I couldn't imagine an *entire* day staring at a non-functioning tv, so I got into his toolbox and built myself a replacement male-to-male adapter.
Being 8 years old, I did a piss-poor job - a couple of strands of wire weren't properly tucked into their screw post, and touched some other strands (causing a short circuit).
sparks everywhere, scared the heck out of me.
So I hit the circuit breaker, cleaned up the strands, plugged it back in and watched TV till 5 minutes before my mom was due back. :-)
I was working at a Fortune 500 company in an R&D job. In order to provide demonstrations of new network services under a variety of conditions, I had built a real-time network emulator. I was testing it, on a test subnet, which was supposed to be isolated by the gateway router from the company's production network. I wanted to "soak" things, looking for odd error conditions, and left it running on Saturday morning with every intention of coming back Sunday evening to shut it down. A sick kid kept me at home Sunday night so I didn't get back to it until 7:30 Monday morning. I walked in, the packets-per-second being forwarded was off the scale, hit the interrupt key and everything shut down gracefully.
The problem involved multicast packets and some bug in the Linux kernel which caused, occasionally, packets to be reported as being received from a different interface than where they had actually arrived. One line of added code fixed things nicely, but as it turned out, I had code running on multiple boxes that caused such packets to be bounced back and forth forever. Complicate the situation further when it turns out that the gateway router was not configured as I had been told, and it only isolated unicast traffic; my multicast flood had leaked out onto the production network. I had intentionally run a small TTL so the packets didn't propagate past the Denver headquarters (could have been ugly if they had gotten to Atlanta, Boston, LA, etc). I'm in early, got the problem shut down, and think that no one has noticed, right?
Some weeks later, I was at a presentation done by the network support group, talking about the kinds of things they measure and monitor. The guy doing the talk, whom I don't know at all, puts up a slide showing an enormous spike in traffic early one Monday morning. "We're not sure what caused this," the guy says, "but we think it was Mike up in Westminster." Embarrassing when problems that cannot be otherwise explained are attributed to you.
I worked for a fortune-100 company as a UNIX admin/general systems geek.
At this company, the security wannabes gave no one root access, but gave sudo privs to all UNIX admins. No big deal, huh? Well, they gave permission to everything in /usr/bin and /usr/sbin, save su to the admins. You can do things like 'sudo chown' and 'sudo rm'. Psssht.
We noticed that one of the filesystems that held the log files for an Oracle Application Server (two machines, shared storage) was filling up.
Anyhow, my boss asked me to clear out the rotated logs in an attempt to free up some space.
I logged on to one of the two boxes and went to the directory in question. I typed "rm *.*"... Permission denied. Bummer. I guess I'll have to use sudo.
I typed in 'sudo chown [myid] .' and then the previous command. After some time, I got my prompt back. I did a quick 'df -k .' to check my work and noticed that the filesystem was WELL within acceptable limits. I was so pleased with myself (and shocked by the tens of gigs of rotated logs) that I went to tell my boss that it was taken care of and to state my amazement at the amount of space that was being taken up.
I got my 'attaboy' and continued working.
After about an hour, we went to lunch (boss went to lunch with me almost daily.) He gets a call on his cell from the PHB (although, to be fair, 'balding head boss' would be more appropriate.) He said that the OAS cluster for the largest app we supported was down.
After about 30 minutes of investigation and head-scratching on the part of my teammates still at the office, my boss got another call. One of my teammates asked him "who is [my id here]?"
My boss asked me if I knew, and my heart nearly exploded. I told him it was me.
I didn't even think to mention the change I made as a possible cause because so much crap happened every day that I forgot about one project about 5 minutes after completing it. I always fess up immediately when I make a mistake, so my boss knew I wasn't trying to hide anything...
Apparently, the server crashed when it had to rotate the log file (too large) and couldn't write to the directory. It wouldn't come back up again (with a completely non-descript error message, of course) after the crash for the same reason.
I'd left the directory permissions set to my user id. D'oh!
What makes this funny (in that sick kinda way) is that this app server crashed constantly, and the higher-ups tried to make themselves look good by being concerned (even though no business loss was actually incurred.) They always wanted a root cause analysis for every crash, and they were all the same - "unknown. vendor support not available because software is past end of life."
The higher-up jumped on this opportunity to make a freaking "oh my God, this guy is so dangerous" case out of it because it gave him something concrete to go to his higher-ups with, after so much "idunno" action.
I was given a written warning (my boss was forced to do so.) He smiled and laughed with me over the stupidity of it.
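The forgotten step in this story was putting the directory's owner back after the cleanup. One way to make that automatic is to record the owner, group and mode first and restore them no matter how the work ends. A hedged Python sketch for Unix-like systems; `preserved_ownership` is a hypothetical helper, and restoring a different owner generally needs the same elevated privileges that changed it.

```python
import os
import stat
from contextlib import contextmanager


@contextmanager
def preserved_ownership(path):
    """Remember owner, group and mode of path, and restore them on exit."""
    before = os.stat(path)
    try:
        yield before
    finally:
        # Runs even if the cleanup inside the block raises an exception.
        os.chown(path, before.st_uid, before.st_gid)
        os.chmod(path, stat.S_IMODE(before.st_mode))
```

The cleanup then goes inside a `with preserved_ownership(log_dir):` block, where `log_dir` stands for whatever directory is being temporarily taken over.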
Plain and simple, ever using AOL was my greatest tech-related mistake of all time.
I learned a hard lesson in August. One of my clients had a server that had gotten rather crash happy. I spend one day there every week-- about an hour fixing accumulated problems since my last visit, a little time checking the servers to make sure everything's cool, and the rest of the day hanging out and flirting with the hotties who work there.
Anyway, on this particular day the server was especially cantankerous, crashing a few times before lunch. It was a busy period for the client, and I didn't want to bring the department to a halt to run utilities on the thing, so I told them I would stay late and do it after 5pm.
At 4:30pm, the server crashed hard, and the RAID-5 unit with all their stuff on it was hosed beyond the ability of any utility in my bag of tricks to recover the data.
After taking a few moments to suppress the vomit, I hooked the RAID up to the backup server and began recovering 150GB of data from their AIT2 tapes.
I spent the entire night there. I determined the server hardware was to blame, grabbed a spare workstation off the floor (the server was old, so the newer workstation was more than up to the task of server duty) and rebuilt the server from scratch on it.
After I put the last tape of the backup set into the tape drive to be recovered, I took an hour nap on the floor of the server room.
I hooked the RAID back up to the new server and made sure everything was shared out correctly, then I stayed until about 10am while everyone checked a bit of their data. Everything looked okay, so I called my office and told them I was going home to sleep.
Over the next two weeks, a few job folders turned up missing. Unfortunately, it turns out that the server's flakiness had resulted in the last week of backups before the crash being slightly fux0red. Most people kept local copies of their active jobs on their machines and just copied them over to the server every night to get backed up, so in the end very few job folders had to be recreated from scratch.
Lesson learned. Next time a server so much as twitches on my watch, it's going down for maintenance ASAFP, busy department or not.
It was a rack of servers and I hit the power button on the WRONG server. However, I noticed this while I was doing it and... I held it down knowing that if I released it, EVERYONE would lose their work (300 users; I was on the file server, not the backup domain controller machine). Just like a landmine... pick your foot up and the bomb goes off. Too far from the phone, I stood there for 20 minutes before I could flag someone else down but they couldn't get into the room (locked and they can hear me through the glass). They tracked someone else down 10 minutes later, and then they gave a 30, 20, 15, 10 and 5 minute warning before releasing me and the button. Of course, by the time this all happened, about 20 people got to stand there and laugh at me... I can't really move.... ugh.
I've deleted /var/lib by accident. Only, the kicker was that the nightly backup cron job was running, and had the rotated backups mounted under /var/lib/backup. All I can remember is coming to a stark realization that I had just done something wrong, and hearing a loud noise as I banged my head against the keyboard.
No comment.
Anyways: back in my post-college, pre-moving-to-Portland days, I worked at Radio Shack, and had unofficial but responsible assistant manager status after a year or so. Among other things, closing duties included putting a long-play videotape in the VCR attached to the store security cameras. No big deal, it was right by the PC in the back office where you closed everything out, impossible to forget and nothing ever happened anyway. Until, of course, the night I forgot to do it, which also happened to be the night I got a call from security around 1 AM, to let me know the alarms had been triggered and I'd have to go down to meet the police and see what had happened. About a $1000 loss in stolen display merchandise, and no evidence. Oops...
I run the backups for the company where I work. We have one dedicated backup server that has about a terabyte of RAID that holds just over a week's worth of backups. I needed to nuke one folder that was too large to open via Windows Explorer. So... Off to a command prompt; I type in `del /y foldertonuke` and hit enter. This command took HOURS to run. I left for the night and upon return in the morning the ENTIRE terabyte drive was wiped clean. The directory structure was still intact, but the files were gone! I about crapped myself...
Not to mention the backup server's tape drive had been out for repair for over a week.
Yep... That was a bad day.
1) I remember way back in middle school when I got my brand new Amiga 2000 w/ its whopping 48M HD, and trying to use one of those old file explorer/graphical shell type programs to free up some space by deleting a bunch of pesky *.info files that were littered around, taking up space. Go back to the desktop & open my drive, oh shit- nothing's there! A call to the local Amiga BBS's sysop informed me that those .info files are the icons.
2) A little more serious- few years back at a large client site, I was running some thing to generate a bunch of temporary files from the db & do something with them. I tested it on a few rows, then set it to run on the whole database. Most of the system was on RAID-5, which was quite slow, I'd learned, so I just stuck all the temporary files in /tmp. And I figured I'd leave them there instead of deleting each one as it was done, so I could make sure they all ran properly. The next day I was informed that /tmp is what Solaris uses for its swap (I'd only had a bit of experience on HP & Linux), and I'd brought down the machine.
3) Several other instances involved working on the web-based data entry system at the same client site & accidentally doing something wrong that brought down the whole machine, causing all the DE people to be unable to work until the machine rebooted, but that was mostly due to 1) the in-house PHP-like parser we were using not handling unclosed tags & continuing to read past the file until it used up all available memory (the guy who wrote that fixed it after that happened a few times), and 2) not actually having a separate development/testing environment, & just editing the UI on production, while people were working on it (that changed, too).
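For item 2, a job that dumps many temporary files can check how much room is left before each batch and stop while there is still headroom. A small Python sketch using `shutil.disk_usage`; `enough_room`, `can_write_temp` and the 25% reserve are assumptions for illustration, and on a swap-backed /tmp the right reserve would depend on the machine.

```python
import shutil


def enough_room(free_bytes: int, total_bytes: int, needed_bytes: int,
                reserve_fraction: float = 0.25) -> bool:
    """True if writing needed_bytes still leaves reserve_fraction of the volume free."""
    return free_bytes - needed_bytes >= total_bytes * reserve_fraction


def can_write_temp(directory: str, needed_bytes: int,
                   reserve_fraction: float = 0.25) -> bool:
    """Apply enough_room to the filesystem that holds directory."""
    usage = shutil.disk_usage(directory)
    return enough_room(usage.free, usage.total, needed_bytes, reserve_fraction)
```

Deleting each temporary file as soon as it has been processed, rather than keeping them all for later inspection, avoids the problem from the other direction.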
shit, are you in high school? (now, not then)
On the same Windows domain that the corporate (fortune 500 pharma...), I foolishly did a "net send /domain dood!" ...which sent our techs off to check various CxO computers and hide the popups.
my old boss's email & checking he can receive email i thought wahey, let's try to send a test email, he had a funny little Dell keyboard, with slightly tighter packed keys than I'm used to, so instead of typing test i ended up with twat, my left hand just a little too far to the left. when he opened it later he wasn't very happy.....
I used to work in Holography Research and as we were trying to make holograms in water we used a big Argon ion laser, as this produces the ideal wavelength in the visible spectrum and has the power to keep exposure times down to a few seconds. This was a big beast which is basically an Argon filled fluorescent light tube with some optics on the outside of the tube to make it lase. Because it gets pretty hot it needs a water cooling system around the tube.
:-)
I came into the lab one morning ready to expose some holograms on a set-up I'd done the night before and switched the laser on and left it to warm up while I fetched a mug of tea. When I came back I heard the laser making some funny noises. It was then that I realised that I'd forgotten to turn on the water cooling. I quickly turned it on, forgetting what effect cold water coming into contact with hot glass can have on the glass. The result was a sound rather like that of standing on a frozen puddle and a jet of steam coming out of the end of the laser where a nice green beam should have been emanating from.
Luckily, the University's building insurance covered the damage or I could still be paying it back now, twenty years later.
No but, yeah but, no but...
that would be pretty bad.
...getting a job with SCO.
...and he grinned, like a fox eating shit out of a wire brush.
We had just moved buildings a couple months prior from a site with no central UPS to a site with central UPS, massive backup generator, and a warning system for said kit. Well I'm sitting in my cubicle across from the datacenter when the red strobe light for the UPS starts going off along with the loud siren, plus the blue light for the generator isn't lit! I rush into the datacenter to figure out the problem and so that I can think straight I press the surface mount button to silence the alarm, well the surface is greasy and my finger slides down, to the power button! Stupid Liebert UPS had a surface mount power button with no confirmation and no lockout shield like every other freaking UPS I've ever seen. Let me tell you there is nothing worse than the sound of silence in the middle of a multimillion dollar datacenter! I powered the UPS back up and things mostly came back on their own, only a couple of SUN workstations in the development cluster failed to come back on their own and those just needed an fsck.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
We *do* have a PDU, but that box isn't plugged into it!
At work, I was working with a fiscal printer. It ran on 110V, and here the electricity is 220V. The transformer worked both ways, with a switch to choose the direction of the voltage change. So I set the switch to the wrong position. Long story short, I plugged the 110V printer into 440V and killed the printer's power supply almost instantly. I still remember the delicate smell of the power source burning up.
I was a co-op, and a SysAdmin. Bad combo. I was erasing a directory, and there were a bunch of annoying hidden files and directories. So I said "No Prob!" rm -rf .* would do it!
Wiped out 2TB. That may not sound like too much now, but this was 1996! This was a server that had just consolidated about two weeks before (onto an Auspex machine) from 5 or 6 Sun servers. The fones started ringing... FAST!
For those not in the know, that recursively follows every .. directory until it hits root, then goes forward from there!
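For anyone wondering why that pattern is so dangerous, here is a rough sketch. It uses Python's fnmatch as a stand-in for shell globbing (an assumption: real shells and rm implementations differ, and some newer ones skip or refuse "." and ".."), and the directory listing is made up for illustration.

```python
from fnmatch import fnmatchcase

def matches(pattern, names):
    """Return the names that a shell-style pattern would match."""
    return [name for name in names if fnmatchcase(name, pattern)]

# Hypothetical directory listing, including the two special entries.
entries = [".", "..", ".profile", ".cache", "notes.txt"]

# ".*" means "a dot followed by anything", which includes ".." (the parent).
risky = matches(".*", entries)

# ".[!.]*" requires a non-dot second character, so "." and ".." are excluded.
# (It also skips names like "..foo", which is usually acceptable.)
safer = matches(".[!.]*", entries)

print("risky:", risky)
print("safer:", safer)
```

Running an echo with the pattern first, before handing it to rm -rf, shows the same thing on the real machine.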
- RR
I should put something clever here. Maybe someday.
I once had a job doing backend processing for trades in government bonds at the largest insurer in the US. A bunch of us had to handwrite vouchers every day to move money around between accounts to accommodate the trades. The signed, authorized vouchers would then be sent to a girl who worked after school keying them in. I have terrible handwriting, and she frequently made keying errors. So I wrote a small app to print out completed vouchers. They asked me to share the app with the other workers, installing it on a centralized PC running DOS (this was 1992). They also asked me to capture all of the transaction details in a record file, which would then be fed directly to the mainframe application. By reducing the processing time and eliminating keying errors, I was credited with saving the division $100k annually, and given a $1,000 check.
One day, an account manager came to me and said her group uses 4-digit account numbers instead of the 3 my app accepted. I took her word for it and changed the app to accept and store 4-digit account numbers. Well that broke the feeder file and crashed the mainframe app during one of the nightly feeds. Boy was I slapped around for that one.
Intelligent Life on Earth
Someone made an installation program, to install on a floppy. It formatted A: to start.
Well, it went to Japan, and someone lost the data on his hard drive on a NEC. It would seem that on Japanese NECs A: is the hard drive and C: is the floppy.
Have you read my journal today?
Line 11,000 in a 25,000-line ipf config file (yes yes I know, but it used to be a lot longer before I got there and was a lot shorter by the time I left. Big international bank, don't touch anything, you know how it is.) contained a , instead of a . 11:45 at night, not on call, drunk off my ass, about to enter house, frantic call from manager TRADING ISN'T WORKING TAKE TAXI GET DOWN GO GO GO take taxi, stumble into office, see most of senior management standing behind desk chewing fingernails, sit down, unix guy runs in, looks around, comes back with cup full of cigarettes and another cup full of coffee, "you might need this", runs out, and at 5 a.m. we stumbled bleary-eyed out of the building having saved the day. "Ooops".
Also, not mine, but a colleague's, while we were getting rid of the aforementioned firewall abomination and migrating the entire (huge) environment. Same guy from the unix team on IRC during horrible weekend-long infrastructure move: "Ooops". Me: "Dude, what 'Oops'?!?" Him:
# cd
32,019
# mkdir
"Hmm that's taking a long time to copy.."
Cole's Law: Thinly sliced cabbage
I once deleted all of the alternate keys in a medium-sized insurance company's Claims file, preventing them from managing any claims. Since my "enhancement" was run before backups, I had to recreate the keys by writing a COBOL program that parsed a nightly report and figured out which records had which keys.
OK, so about a year and a half ago I was a software tester working on testing out the mainframe billing engine that used COBOL, JCL, stored procedures, and a whole ton of other stuff to send out automated online billing to our customers.
One day I am required by my coworker to test 'the dinosaur' by setting up some generic bill requests to send through the APIs to test the billing engine. So I input all the parameters a few times (there were like 30+ that you had to hand-key into each API call) and start getting bored. Now when she had told me to input these parameters, I was supposed to use the reference website "www.testbills123.com/index.html" (just an example of our actual URL) as the 'view your bill' field setting since that would use that URL in the bill when it was sent, so that if you viewed this online bill in your email inbox and tried to click on the link it would just resolve back to our home webpage.
Well, she didn't really explain the reason for using that particular URL to me in detail. So after a few API calls, I'm getting bored and decide to make my mundane software testing life that day at least enjoyable. Being the sarcastic dork that I am, I start inputting "www.thedinosaursucks.com" and "blowme.org" as the URLs in the 'view your bill' field. Yeah, I thought I was pretty funny, because these would show up on her report of all the 'bills' sent through the billing engine and make her chuckle.
Well, later that day I'm looking at the test bills through our online email viewing site. My URLs don't appear, but the link button that uses those URLs does. For some reason I thought that my URLs didn't matter and were just there to satisfy the API calls.
You probably know how the story ends. 1 minute later I'm in a panic and thinking that our system is somehow resolving addresses incorrectly and redirecting 'view bills' requests to Japanese pr0n sites. 20 minutes later I am still very red in the face because everyone in my 40+ person department now knows of my "pr0n habit" when testing software. I still get razzed about that to this day.
There's always a problem when you're looking at 3-5 different machines in a bazillion terminal windows. I had one window updating the kernel on my laptop, and another in which I was checking some files on the server. My kernel finished and I cleared that window, then went on doing some other stuff I wanted to finish before the reboot.
And of course, I type the "reboot" command on the wrong machine. Then it was a mad rush to the server to try a shutdown -a, stampeding through a room where students were doing finals (I work in schools). Alas I was a bit late, and the server was past redemption. Not much damage was done, except for some Windows machines that had open files on the server, which of course crashed... but at least no critical files were killed mid-write.
I made a mass-mailing script which we used to email our customers with important info. Since they're all bulletins, the subject was hardcoded and the body dynamic. Unfortunately, I forgot to remove the "All is fine and well" subject in favor of "Company XXX - Bulletin" when I was done testing it, and the first few emails actually going out to customers detailing several bugs, problems etc were a bit contradictory between subject and body.
Other odd doings are like when creating emails for experts-exchange, I used an alias that forwarded to my main account. Of course, with an alias of expertSEXchange@mydomain.com (caps are just for emphasis), the next person to check my alias was slightly amused.
I got an MIS degree instead of a Computer Science degree. No idea why.
When I started my current job I was assigned to testing one of our apps. I was given 2 databases to log into and was told to "exhaustively test" the entire system (adding/deleting records.) So I started adding all sorts of odd stuff (I was going to delete them anyway.) Now I'm not dumb enough to name them dirty things (after all this was my 2nd week) but I was still typing silly things.
Anyway about an hour into it I got an email that was addressed to the whole department that said
"To whomever is modifying the market database while I'm trying to demo it to one of our most important customers: not funny."
Working as a contractor at a now-defunct game site, I was making space on the dev server and deleted a file directory that was named with my first name. At that time I was the only person there by that name, and I knew I hadn't stored anything meaningful on that machine.
However, it turned out that months earlier there had been a key developer there with the same first name as me, and for some strange reason they were still using his old directory to store the only copies of the files for a massive website upgrade.
Fortunately the people doing this dev work managed to dig up local copies of most of it on their machines, so not a whole lot was lost. But the webmaster made a point of blatantly treating me like a retard forever afterwards. He was one of those untouchables who can walk away from a smoking pile of debris and be presumed blameless, so there was never any mention of his lack of backups or his refusal to bother with any sort of source control system. Eventually they went under for various other reasons. Dumb-asses.
The spare tube had been in storage for over a year, and they sometimes get gassy. So I borrowed a portable high voltage power supply from EEV and proceeded to fire up the tube in the crate to burn off filament deposits and burn out the gas.
Except I misread the color code on the leads and hooked 8000 volts to the filament contact on the tube. The filaments are supposed to run around 8 volts.
The meters on the power supply pegged, the circuit breaker tripped, and the tube became a $45,000 crate of junk in about half a second.
Wanna try explaining that to the General Manager during your Monday morning staff meeting???
The number 1 problem of working in a cubicle - 23 power cords, 1 outlet...
I worked for a large elevator manufacturing company formerly HQ'd in the US, but it got bought out by a company in Finland. Anyway, the admins there didn't have DHCP set up properly, and laptops brought into the US from Finland wouldn't release/renew their IPs properly. They also didn't allow laptop users to have admin access on their own laptops. Result: laptop that you couldn't log onto anything with. The only thing you could do is copy the SAM file and crack it with L0phtCrack. This saved many Powerpoint presentations for a large number of VIPs from Finland. One day I was backing up some stuff from my laptop to the corp. network because the HDD was getting ready to fail. The genius that configured the corporate anti-virus had it set to identify L0phtCrack as a virus/trojan horse. Within five minutes of detection, the network security manager (note: this was a former SALESPERSON who knew NOTHING about network security) comes running into the room... "WHO HAS LOW-FAT HERE??? I KNOW SOMEONE HAS LOW-FAT." I didn't have the heart to tell him he was a f'ing idiot :)
These batteries are not fused. The inrush current of larger mechanical Telephone exchanges was over 2500 amps @50VDC - so though the voltage is low it's DC and when it arcs then kiss goodbye to whatever is now welded to the bussbar. It'll stay arcing until it's welded itself into slag. Thus I was very cautious of any metal objects.
You normally check batteries by doing a discharge cycle. You discharge through a heater at say 1/10th (or 1/15th) of the total Amp hr rating and monitor Specific Gravity (about 40 cells, reading every 15 minutes, so it takes a while and it's hot work with say an 850Amp x 50 VDC heater running and fans).
One day I was paying usual close attention to my own body parts and the buss bars but failed to notice I'd just coupled the 50VDC heater to one of the Telex battery banks. These are two banks + and - 80 VDC with respect to earth. A full +/-80VDC to earth Telex signal hurts real bad whereas 50VDC just tingles. Remember I = V/R but power = V squared /R so 80 VDC compared to 50VDC would provide about 2 1/2 times the power in watts for any given resistance, thus you used a different heater.
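The arithmetic behind that "about 2 1/2 times" can be checked with a quick sketch. It assumes the heater behaves as a fixed resistance sized for the "say 850 amp at 50 VDC" figure above, which is a simplification (real heater elements change resistance as they heat up), and the function name is just for illustration.

```python
def power_watts(volts, ohms):
    """Power dissipated in a resistance: P = V^2 / R."""
    return volts ** 2 / ohms

# Assumed fixed resistance, derived from the 850 A at 50 VDC figure (R = V / I).
heater_ohms = 50 / 850

power_at_50 = power_watts(50, heater_ohms)   # what the heater is built for
power_at_80 = power_watts(80, heater_ohms)   # same heater on an 80 VDC bank
ratio = power_at_80 / power_at_50            # (80 / 50) squared

print(f"50 VDC: {power_at_50 / 1000:.1f} kW")
print(f"80 VDC: {power_at_80 / 1000:.1f} kW")
print(f"ratio:  {ratio:.2f}")
```

Under that assumption the 50 VDC heater goes from roughly 42.5 kW to roughly 108.8 kW on the 80 VDC bank, which is why the two banks needed different heaters.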
Went back to heater (this is a mobile cage thing about the size of a very big chest freezer). Set starting current and turned on. Flash - bang. I nearly shit myself. Fortunately the heater's sections are fused but remember this heater is designed to run continuous at 850 amps current at 50VDC so each section shares quite high current loads and being DC it arcs really bad so it has springs to whack the bits away and break the arc. Had to stop work for the day 'cause I was sweating. Next time I'll check which heater and not set a starting load when turning on.
ps: worked 8 years in Telephone exchanges. Only major issue was the one above (apart from wrapping a company car around a power pole before that). Spent first 4 years learning electromechanicals then next 4 learning digital and got lots of overtime ripping out the old equipment. I did get impaled through shoulder by an equipment rack that dropped on me when gutting an exchange that was going to digital - so we're evens in Karma !
chmod -R -x /*
missing one little '.'
sigh.
biggest goof i've done:
I was adding a new box to my home network, moving my old main drive to another system, adding a brand new drive in my main workstation, formatting the new drive, transferring the files from the old drive to that one, and finally reformatting the old drive and installing a new system.
The time was 9pm and i was a bit on the tired side.
I started installing the new drive's OS, and absentmindedly thought i'd run the format on the other drive. The boxes were in two different rooms 50 walking feet apart. i started the old drive (filled with critical data) formatting BEFORE i copied the data. i realized this after the new box was up and i went to access the old data for transfer. fortunately the drive format had only hit 2% (NTFS). i spent the next 6 hours recovering data from that nearly doomed drive... got 70% back plus stuff from the drive's first life, who knew.
Since then, I do ONE drive format at a time. In addition, all uber critical files are kept on a RAID5 linux server (dupe files are all over the place as the really important stuff doesn't exceed 50MB).
Oh yeah, and any maintenance to the server follows the following procedure:
shutdown samba/NFS.
unmount raid array.
stop raid array.
backup raidtab.
learned my lesson.
Logistical Chaos Officer http://www.slagg.org - LAN Gaming in Sarasota FL,USA
I set up a filter to forward all messages coming to my sister's ISP account to her webmail account. The only problem was that the original Cc field was left intact. There happened to be some spam to the users of that ISP where recipient e-mails were in Cc. As a result, when a spam message was forwarded, it had Cc field intact and a copy went to every other user. And to sister's ISP account too. And then it was forwarded again. And again. And then again. I can't say I brought down the email system, but it did piss off other users. :)
Future Wiki -- If you don't think about the future, you cannot have one.
Well, I guess my worst technology mistake was saying "Jobs are not an American's god-given right" while on an open mike at the Consumer Electronics Show and snickering. OMG wait. I'm NOT Carly Fiorina. Never mind.
first mistake: ...
... (156 MB ++ now) ...
denying admin access to %systemroot% on windows xp.
(reinstall)
second mistake:
limiting "temporary internet files" to 5 MB on internet explorer
didn't have a computer at that time and was
...
... i don't know about other people ... but after another commit ...
...
...
planning to get a 486 66 MHz 'cause i desperately
wanted to play wing commander on it.
anyway i knew from another friend that it would need
himem.sys and emm386.sys to work.
so i went to another friend's home who had a lowly
386 at the time and asked him if i could try the
himem.sys and emm386.sys thingy
anyway it was all trial and error at that time.
first thing i did on his portable with external
screen computer was go into the BIOS and enable
video BIOS shadowing. i don't remember why i did
this but after committing the changes the screen
would just stay black after reboot.
"oh-oh... now what i can't see the BIOS menu"
luckily my friend's dad still had the manual for
the computer where all the BIOS menu and settings
were described. so re-boot press delete and
blindly navigate the BIOS menu to the shadow video
BIOS setting
but i get a lot of hot flashes when i screw up
stuff on other people's computers and i was
definitely sweating
and reboot the screen would start up
anyway this escapade used the whole afternoon and
i had to give up learning himem.sys and emm386.sys
on my friend's computer
got a new 486 anyway and it was a breeze to get
himem.sys and emm386.sys working. 624 kb free
memory was the record i think
I had just gotten my first job as a UNIX admin, and my make-work learning task was to resurrect a machine which had just been replaced by a new, faster machine. The old box was still in its rack next to the new one, because the new box had all the old disks cabled to it. I was supposed to be getting the old one to net-boot off a root on the new server (a great way to learn, by the way, since you have access to the root filesystem even when you screw it up so much it won't boot).
Anyway, I hung the machine somehow, hard enough that our console server's break didn't drop into the boot monitor, so I called up operations and asked them to powercycle it. I described in exquisite detail where it was, where the power switch was, etc. Took them a damn long time but finally my machine started to boot and we got off the phone. Later I mentioned how long it took to my boss and he had me repeat my explanation. "You just described the new machine," he told me. Sure enough, I had mixed up left/right because I was trying to send them to the back instead of the front. Luckily they were smart enough not to reboot the production machine.
I thought I had it, renamed aw_host5.sys to aw_host5.sys and rebooted.. turned out to be the keyboard driver.
did I mention I have PC anywhere on this machine, so the ONLY way to boot is by hitting CTRL+ALT+DELETE (or inserting a card I don't have) at bootup?
every day http://en.wikipedia.org/wiki/Special:Random
that make more sense?
every day http://en.wikipedia.org/wiki/Special:Random
I agree, but it's been happening for a long time now.
The last set of ads that Sally Struthers did the voice-overs for had "PC repair" and "Computer programmer" as options.
Not to mention the vocational-school industry's widespread expansion into IT over 98-now.
The outsourcing of IT workers to third-world countries was something I expected to happen 5 years ago. "Taps" was appropriate in 2000, but now it would be blowing a horn over a grave.
Some time ago, I needed to write a cronjob on a friend's box to do nightly updates of the locate database. Not being familiar with the syntax, I looked it up quickly and wrote this cronjob:
* 05 * * * updatedb
I thought that this would run 'updatedb' at 5 o'clock, and the minute isn't important. What this actually means is 'run updatedb EVERY minute of the 5 o'clock hour', i.e. 60 times between 05:00 and 05:59. Imagine my friend waking up in the middle of the night because of the disk making lots of noise (updatedb is heavy on the hard drive), as my cronjob had generated loads of updatedb processes. The system load was 70.00 at the time my friend arrived to check out what was happening.
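To see the difference between the entry above and the one that was probably intended, here is a small sketch that counts how often a cron line fires in a day. It is a toy matcher that only understands '*' and plain numbers in the minute and hour fields (real cron also handles ranges, lists and steps), and the function names are made up for illustration.

```python
def field_matches(field, value):
    """True if a cron field ('*' or a single number) matches the given value."""
    return field == "*" or int(field) == value

def runs_per_day(minute_field, hour_field):
    """Count how many minutes of a 24-hour day match the two fields."""
    return sum(
        1
        for hour in range(24)
        for minute in range(60)
        if field_matches(minute_field, minute) and field_matches(hour_field, hour)
    )

# "* 05 * * *": any minute, hour 5, so it fires for the whole hour.
every_minute_of_hour_five = runs_per_day("*", "05")

# "0 5 * * *": minute 0, hour 5, so it fires once at 05:00.
once_at_five = runs_per_day("0", "5")

print(every_minute_of_hour_five, once_at_five)
```

So pinning the minute field, as in "0 5 * * * updatedb", gives one run per night instead of a pile-up of overlapping ones.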
You solved their disk space problems. Definitively.
I only work for a college, so nothing hugely drastic has ever happened.
:)
But two classics do spring to mind - but whilst I Was there to witness them, neither of them were my fault.
First one was me teaching a work colleague a bit about Linux. He'd been using it on and off for a few months and was starting to get quite comfortable and happy using it. One of the things he had learned was how to map Novell drives which was pretty handy as that was where most of our work and stuff was stored. But one day he made the fatal mistake of trying to remove a mountpoint by using rm -Rf - the mountpoint just happened to be the SYS volume on one of the core servers.
I remember the network just stopping running, and someone coming in from the server room with a serious look of concern on his face saying "MARS is asking for its name! It won't boot up properly!" - that's when the cold sweat suddenly starts to break out
Second story was.. a few of us were trying to rerack some gear in the server room, one of the items was a Cisco Callmanager. Problem was we couldn't take it offline but all we needed to do was nudge it up the rack a few U's. All going great until the guy who was trying to hold it whilst it was unbolted slipped and it just crashed to the floor from about 4ft up. It was almost like watching a Clown Car as the top lid just sprang open and bits of the chassis buckled. We all panicked and ran out of the room with the CallManager.. took it to the office to perform some corrective surgery (bending the brackets back into place, reseating stuff) and then went and plugged it back in.
Glad to say it worked, and still does a year on.. not funny at the time tho
"Hey! Unless this is a nude love-in, get the hell off my property!!"
I made exactly the same mistake TWICE.
When I designed an AI operating system I made the worst mistake I could make: designing the system assuming unlimited disk space and then using a floppy disk drive for storage.
The database grew wildly then (per my design) when it had access problems it erased the whole database and created a fresh blank database.
I repeated this mistake on my BBS only this time the database took months to fill up but with the same results. It couldn't modify the file so it erased it and created a new file.
At some point I'd come to realise that having the program automatically erase files (just because it couldn't write to them) was a bad idea.
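For what it's worth, one common way to avoid that trap is sketched below: write the new copy to a temporary file next to the old one and only swap it in once the write has succeeded, so a failed write never costs you the existing data. This is not how the original systems worked; the function name and the return-a-flag convention are assumptions for illustration.

```python
import os
import tempfile

def save_database(path, data):
    """Write data (bytes) to path; on any failure, leave the existing file untouched."""
    directory = os.path.dirname(os.path.abspath(path))
    try:
        # Temp file in the same directory so the final rename stays on one filesystem.
        fd, tmp_path = tempfile.mkstemp(dir=directory)
    except OSError:
        return False  # could not even start; the old file is still there
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_path, path)  # swap in the new copy only after a full write
        return True
    except OSError:
        # Disk full or similar: discard the partial copy, keep the old database.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        return False
```

The caller gets False on failure and can decide what to do (warn, retry, stop), rather than the program quietly starting over with a blank database.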
I don't actually exist.
On the other hand, a coworker and I were doing some work on one of the Bell Labs Murray Hill machines that did weather, and I did a typo when removing something, and trashed some of the files that made our internal weather-report processing system work. It was in some directory like /usr/local, and we found that nobody had backed it up in five years, and the stuff was really gone. Nothing we could do about it except apologize.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
This was also in the days when video projectors were big ceiling-mounted things that cost a lot of money. Ours was in the conference room attached to the computer room, and you were supposed to turn it off when you weren't using it. Usually no big deal, but somebody once forgot that after a Friday meeting and we had a Saturday power hit, so Monday morning when we came in, the projector's cooling fan was blowing 140-degree air through itself to cool itself down, and there was a puddle of oil on the table....
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks