Friday, December 02, 2005

Top 10 System Administrator Truths

I figure with enough time and effort, anyone could be a System Administrator. Really, it's not hard; it just takes practice, methodology, and trial and error. A lot of trial and error. These truths will certainly get you on your way. Let's get started.

#1 - Users Lie

Oh yes, they do. Don't think you're immune either. Have you ever been on a tech support call, convinced that you know the problem, when the guy on the phone says something like "Would you put in the recovery CD, restart, and scan your memory?" "Oh, I've tried that," you say with eyes rolling. Believe it or not, sometimes we crazy admin peeps suggest these fixes because they work. When a user protests my assessment, the best approach is to politely insist that they do what was asked until the doing is done.

#2 - Email is the Lifeblood of Non-Techies

I love my non-techie brethren--I mean, how else would I know what happened on The OC and Gilmore Girls? But at the end of the day, email is #1 in their book. Now a lot of it is business related, and certainly that shouldn't be taken lightly, but most likely they were waiting on a warm, fuzzy message from their daughter or sister and really needed their email back up ASAP ("I'm waiting on a proposal!" they screech -- see #1).

#3 - Printers Suck

Ever had to clean a laser or, God forbid, an inkjet printer? It's like stabbing yourself in the eye. It's not just the grime either; it's the fallacy that a little chunk of ink could make the machine just stop working. 90% of the time (or better), this isn't the case (instead, check the fuser/print heads). In terms of network troubles, HP's JetDirect cards have a pretty solid reputation for failing every few years, so expect to shell out $200+ for those on a semi-regular basis, depending on what kind of printers you run in your office. For those with network cards integrated into the printer mainboard: what were you thinking?

#4 - Cleanliness is Godliness

Ever open up a PC and see the Ghost of Dust Bunnies Past in there? It's scary stuff, I tell you. I've seen some PCs begin to lock up for seemingly no reason while the innards tell you different. Sure, Peggy in Accounting wasn't stuffing her machine full of cloth, but that blanket she keeps at her feet will slowly shed and the PC fans suck that stuff right up. When you're completely stumped, make sure there isn't something inside gunking up the works.

#5 - Backups are Crucial

This needs to be said. I've been caught with my pants down on this one a few times myself. Back up, back up, back up! Nothing (and I mean nothing) will bite you in the ass like a piss-poor backup scheme. If your server dies right now as you read this post, what are you going to do about it? Do you know where the install discs are, do you have a configuration backup, do you know who to contact regarding tech support on that box? If not, you need to get your act together before you have a disaster and a lot of excuses and apologies following it. I use Retrospect at my job and consider it better than Backup Exec. It has amazing Macintosh support and is cheaper too.
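If you want a quick way to answer the "do you have a configuration backup" question honestly, here's a minimal sketch of one approach: restore the backup somewhere harmless and compare it against the live files by checksum. This is not how Retrospect or Backup Exec do it; the function names are made up for illustration, and it assumes both copies are plain directories you can read.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_restore(original: Path, restored: Path) -> dict[str, list[str]]:
    """Report files that are missing, unexpected, or different in the restore."""
    before, after = snapshot(original), snapshot(restored)
    return {
        "missing": sorted(set(before) - set(after)),
        "extra": sorted(set(after) - set(before)),
        "changed": sorted(
            name for name in set(before) & set(after) if before[name] != after[name]
        ),
    }
```

Point `compare_restore` at the live config directory and at wherever you restored the backup (both paths are yours to pick); three empty lists mean the restore matches. Files that legitimately changed since the backup ran will show up under "changed", so run it soon after a backup or expect some noise.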

#6 - Switches and Hubs (Usually) Die One Port At A Time

You can spend hours tracking down a bad network card or cable just to figure out that a port in a switch has died. You're pinging and pinging and looking, the lights are on but there's nobody home. The trick here is to know that a single port doesn't spell the end of the hardware, quite the contrary. Don't throw the baby out with the bathwater. If a port does go out, that hub or switch may work for years without another outage, but do be sure to stuff an RJ45 connector in that bad port so you don't forget (and chase down phantom problems) in the future.

#7 - No One Ever Got Fired For Buying Microsoft

So sad but so true. This old saying used to reference IBM, but oh how times have changed. Linux may be powerful, but the command prompt and configuration files and filesystem obscurity will just as soon get you a pink slip if something goes wrong and no one knows how to fix it but yourself. Even so, with as much stupid crap as we admins have to put up with on a daily basis, configuring some of the 'high end' Microsoft software is enough to drive you insane. Ever tried installing Exchange Server or, worse, installing Exchange Server and migrating a 5.5 install to Exchange 2000? I feel your pain, oh how I feel your pain.

#8 - Politeness is greater than Brevity

You can come up with all sorts of analogies for this one. You'll get more bees with honey, a spoonful of sugar, etc. But generally, you probably have very little day-to-day contact with end users. This means that when you do finally get to speak to one of those souls fortunate enough to log in to your domain (both figuratively and literally), you should be sure to be as polite as possible about it. Even if the network is down. Even if the server is having weird, irrational problems. Use please, thank you, I'm sorry, and don't be too proud to apologize or 'make nice' with those who may ultimately influence your career path down the line. The peon you insult today with an "I sent an email about this, do you not check your own email?" could very well climb the corporate ladder and let your rude ass go in a few years. Mind your manners, peeps.

#9 - Know Your Needs

This one could also be called 'Learn Linux.' Many admins get wooed into the idea that 'managed solutions' are always the correct ones. A web interface on a switch is cute, but rarely useful. A huge Cisco router may not always be necessary; sometimes a 'lo-fi' approach is best. When you want a spam solution, before looking at $5,000 servers and huge licensing fees for Windows Server software, take a look at one of those old 'junk' PCs you have in the closet, download your favorite distro of Linux, and install procmail and spamassassin. You (and your budget) will thank me later.

#10 - The Holy Grail of Tech Support

...is the reboot. Rebooting can cure ailments of all sorts: it can stop network troubles, fix crashing computers, find missing documents, and rescue cats in trees. System admins all over the world have, by and large, trained their users to reboot before even calling support. I mean, when's the last time you didn't reboot to see if it cured a problem? If you don't, then you're either stubborn or you're an admin who knows better. Rebooting doesn't cure all ailments, but it cures so many of them it's hard to not throw out a "Can you reboot for me?" to the end user when they call with some off-the-wall issue. Use and abuse as necessary.

I hope we all learned something.

Thanks for reading, I'll speak with you wonderful peeps Monday.

And his heart was cold, so very cold
You believe it might never have beat

Update 12/13/05 - Wow, I've been slashdotted!

Update 12/05/05 - I've been farked! Welcome everyone!

Update: For those who like this, check out Flipside: 8 End-User Troubleshooting Tips.

May I also take this time to mention to such a huge, intelligent audience that I help develop new content for a great video game called Star Chamber. It's a fantastic strategy game that is relaunching with a brand new client and publisher. Those out there who enjoy (and empathize with) this list will love it. You can download a demo and try it for yourself.



Update 6/10/06: Fixed the horrid formatting.

126 Comments:

Azmeen said...

Nice list which I can definitely relate to. Brings back lots of memories when I was a sysadmin/on-site tech support.

Thank you for taking the time to come up with the list and sharing it.

1:32 PM, December 02, 2005  
AT said...

Excellent list, man. If only all IT were as consciencious as yourself, and with such good spelling.

1:39 PM, December 02, 2005  
Anonymous said...

Cool List, and I totally agree with the point "Switches and Hubs (Usually) Die One Port At A Time". I have a BEFSR41 router and port 3 died on me while unplugging it during a gaming session. The rest of the ports work and am still using those last 3 ports today.

Anyways cool stuff.

2:53 PM, December 02, 2005  
The Mad Tech said...

Great Post!!!

3:22 PM, December 02, 2005  
Sliver said...

haha, yeah... I actually HAVE migrated Exchange 5.5 to Exchange 2000. So I really know what you mean.
Good article.

3:34 PM, December 02, 2005  
Anonymous said...

Cool and True stuff Man

3:35 PM, December 02, 2005  
Joshua said...

As far as point #1 goes, one might also consider implementing a remote service solution. Anything in the realm of VNC, GotoMyPC, Windows Remote Desktop...you name it would go a long way in not having to ask customers to do anything for you. Just take control of the system and do it yourself. Granted, there are some things you can't do with remote services. The majority of the time, though, it is a lifesaver. Good read.

3:41 PM, December 02, 2005  
misterorange said...

I agree Joshua, and have used VNC quite extensively in the past. It's a lifesaver in and out of the office, particularly with family members.

Probably should've included that one in there somewhere, but hey, it's good advice nonetheless.

3:44 PM, December 02, 2005  
Ben Bishop said...

Man I wouldn't touch exchange if I could possibly help it - that surely aint any fun. No.10 - spot on! - its the first rule of tech support!

4:05 PM, December 02, 2005  
LarryTheDwarf said...

If you admin Apple machines, the Apple Remote Desktop software is absolutely essential. It's expensive, but well worth the money.

4:16 PM, December 02, 2005  
miscblogger said...

i totally agree with your last one. reboot is like a cure all for many people.

4:19 PM, December 02, 2005  
Anonymous said...

"If you admin Apple machines, the Apple Remote Desktop software is absolutely essential"

2 words... Netopia: Timbuktu

4:23 PM, December 02, 2005  
Anonymous said...

"with as much stupid crap as we admins have to put up with?"

95% of sysadmins I have met were total idiots.

4:26 PM, December 02, 2005  
Anonymous said...

The great thing about reboot is that even if it doesn't cure the problem it gives you an extra minute to think while the end user still sees forward progress.

4:40 PM, December 02, 2005  
dogg said...

I love it.
I'm an IT dude and I've been through all of this. The worst is going into someone else's shop, where they've set up Exchange incorrectly, the backup schedule is useless, they've made the firewall all wrong, and have somehow found the time to store a couple gigs of porn on the gateway box, in D:\temp (seen it twice)

If I may, I'd like to discuss how amazingly often those 'broken monitors' or 'keyboard wont work' or 'cant get to my q drive' or 'wont print' problems end up being physical cabling. It's uncanny.
Anyone remember that IBM commercial, "Did you jiggle the cable?" hehehe. It's so true.
If a problem is not solved by a reboot, and it's not the physical wiring/cabling, then that's when you earn your salary.

4:40 PM, December 02, 2005  
Steve said...

If you're tech and you have lots of machines to support with corporate images on them....

Save yourself thousands of headaches by looking into a program called Deep Freeze. It restores a computer's configuration to a known state every time the PC is rebooted no matter what (virtually) is done to it. This includes all program settings, registry settings, and even adding and deleting programs.

This program is awesome. www.faronics.com

4:54 PM, December 02, 2005  
Anonymous said...

The basic problem, from someone who has been both an administrator and a user, is that the administrator also will not take responsibility and assumes the user is the idiot. And the Microsoft thing...don't get me started. Our company was so standardized on one platform that it suffered through several periods of horrible down systems due to viral/trojan attacks, thanks to their myopic Windows-for-everyone policy. Meanwhile the graphic arts division with Macs just hummed along while the Melissa virus brought the business side to a halt.
Does anyone get fired for that? No, because they followed the corporate guidelines. Meanwhile the business goes in the toilet. Great forward thinking. We are still on Exchange 2001 because no one on the IT side will take any responsibility to change things because of "pink slip" fear. So, sysadmins...it's not just idiot users, it's also lemming-like IT people who can't think outside the corporate safety zone that are the problem too. Too often IT thinks they are running the business, not supporting it, and have to protect the users from themselves.

4:54 PM, December 02, 2005  
dean said...

Rebooting really does fix all the weird problems. The important thing to remember is, if a reboot ever does fix a weird problem, make sure you act like you knew what you were doing the entire time.

good stuff!

5:01 PM, December 02, 2005  
Anonymous said...

Anyone remember that IBM commercial, "Did you jiggle the cable?" hehehe. It's so true.
How scary but true, happening at my home. I invited two friends over for a mini LAN party, and for some reason, one laptop cannot find the WiFi AP half the time, while the other, identical, company-issued laptop has no problem. But on the second laptop, the Ethernet cable needs to be massaged and jiggled before it'll work, even with different cables. And the same cables work fine on the first.
...or maybe Dell just secretly hate both of them.

5:42 PM, December 02, 2005  
Anonymous said...

Holy cow man, spot on. All the comments are spot on as well, I can totally relate. It is comforting to see that other admins have the same problems I do. If I had sat down to write this list it would be very similar. I can also recommend Deep Freeze.

6:18 PM, December 02, 2005  
Anonymous said...

Don't know where these would go in your list, but possibly under

lies:

Find a way to ask at least 3 questions. Sometimes the 1st question is just repeating what they told you (affirmation that they are being heard is big psych, plus it puts them in a position where they are part of the solution, not "the problem"). Once saw a fellow tech ask questions until the user realized what he had done.. "solved himself."

politeness:

It costs little to follow up with a user if you run into them later.. a quick "everything still working okay?" etc.

I once supported ~200 computers at five buildings on a campus. We used pro support software, but it always seemed to impart a good feeling that I "actually cared" if I checked back with them personally when I'd run into them.

Plus, sometimes, it's easy to solve the symptom but not the problem.

6:23 PM, December 02, 2005  
J said...

You're my new hero.

6:43 PM, December 02, 2005  
Loyd said...

I'm laughing as I bang my head on my desk because they're ALL TRUE!

6:49 PM, December 02, 2005  
MaZa said...

I have used this one on quite a number of occasions.....

Support: "Could you please restart your workstation for me"

User: "NO, I can't, I am extremely busy and cannot afford to!!!!"

Support: "That's fine, I will run a few diags over your workstation (A blatant lie, but they won't know)"

This is when I instigate a remote shutdown (shutdown.exe). :-D. Usually giving it a 60 second window.

User: "My PC is shutting down!!!"

Support: "Sorry, it appears to be due to the problem you have been having, your OS has had a fatal error and would have instigated a shutdown"

Doesn't always work, but is quite funny :)

8:07 PM, December 02, 2005  
Darren Stalder said...

I've got at least one JetDirect card that's been humming along for 10 years now.

8:35 PM, December 02, 2005  
Anonymous said...

Coming from both sides of the fence, I can see how you get to these conclusions -- these are the cases that drive a tech insane.

However, having seen the flipside, I can't quite agree with your list, namely...

- Users lying: often true, not always. Nothing is more aggravating than relating everything that's been done, only to have the tech start a script from the top, doing everything that was just mentioned.

- Email importance: depends on the business. For many, it *is* crippling. Nothing worse than having clients breathing down the necks of the sales folk, who are just trying to fulfill the orders. No money = no business, and some clients won't wait.

- Microsoft: no one ever got fired, but maybe they should. There are many headaches in switching from non-Microsoft to Microsoft, relating to maintenance, uptime, and security, it's true. I regretted my organization's switch from qmail to Exchange, that's for sure. However, the kicker is the cost. Part of an administrator's job is making sure the costs are minimized. Don't set up stuff that no one will ever be able to figure out, but don't be a spendthrift either.

8:55 PM, December 02, 2005  
Anonymous said...

I could not agree with your e-mail comments any more. The less technical knowledge a person has, the more they think the world will stop if they don't have e-mail. I wonder if these people remember what a stamp is half the time, not to mention the 8 fax machines within spitting distance of their desks.

Good article. While you're at it, write something up on setting up Nagios (server monitoring software for Linux). Seriously takes some patience to do that.

Exchange 5.5 -> 2000 tough on you? Try upgrading from Exchange 2000 -> 2003 with a new domain on new hardware, with users that refuse to delete or archive e-mail even though their mailboxes are in the 3 GB range (over 2 GB, Exchange tends to toss errors). Everybody that could make or break me had the errors in their e-mail because of course they couldn't delete those e-mails from 1998 about the Christmas party. Sigh.

9:58 PM, December 02, 2005  
Anonymous said...

Sound advice my friend! You might also want to check out STReeTJeSUS's blog page. He has some interesting survival tips for IT \ Support environments.

http://streetjesus.blogspot.com/2005/12/survival-tips-for-it-professionals.html

10:03 PM, December 02, 2005  
Anonymous said...

Spot on for the most part bud. Made me laugh!

1. Why do they lie? My job would be so much easier if I knew what was wrong right away. :(

2. Definitely. Email is always the number one priority.

3. Printers are the lifeblood of marketing, especially color printers. Save yourself some trouble and buy a Dell or Xerox, something with support included so you don't have to support. Because 9/10 times they'll just give you a new replacement the next business day.

4. Have not come across this yet, but I imagine it sucks. :(

5. Also think about Ghost or some other imaging software for quick setups if you have to nuke a server.

6. Dunno, never experienced this.

7. So true. If in doubt, or if your bosses say "We're a Microsoft house," always buy Microsoft. Yeah, it may suck and make your job a bit harder, but Windows sysadmins are a dime a dozen. You CAN be replaced very easily...

8. This should be the number one concern of everyone who works in IT. We have to remember that not everyone knows a lot about computers, if anything at all. Politeness can go a super long way, and it should be the charge of every IT admin to get rid of the dirty sysadmin-who-lives-in-the-dark-and-is-mean and-performs-rituals-to-keep-everything-running image. We're really nice people, I swear!

9. Saw this very recently. Our senior admin decided that a Pix 500 series would serve our needs better, rather than just upgrading to a 2600 series router. Aww well, it's not my money they're spending. Make sure you document anytime you disagree on some big purchase like this in case it ever comes back to bite you in the ass.

10. YES!

Thanks for the good post! :D

10:05 PM, December 02, 2005  
Anonymous said...

'only to have the tech start a script from the top, doing everything that was just mentioned.'

If I read this right, you're referring to a tech trying to fix a problem and the user saying he's already done it? Yeah, you have to. They don't get paid to fix computers, so when they say, "I already did this," you have to do it again. It's proper troubleshooting procedure. Why? Half the users in the world insist they know the problem when in all reality they are:
1. lying about what they did to 'fix' the problem because they think this will somehow make the repair process faster
2. Have absolutely no clue what they are talking about.

If something in my building breaks, I fix it and I start with the simplest repair/fix possible. I'd rather spend 20 minutes fixing tiny things and find out it's a complex problem than spend 20 minutes doing complex procedures only to find out how simple the problem really was.

10:05 PM, December 02, 2005  
kevintest@yahoo.com said...

It was so cool to find this because I've been training our techs for years by teaching them "Rule number one is - Users Lie!" People always think it is a bit hostile at first, but when I explain that it is the beginning of your troubleshooting process, they start to get it. Anyway, it was great to see a confirmation of my past rantings.

1:28 AM, December 03, 2005  
David Humphrey said...

As a student in college studying Network Administration, I found this to be a great article. I have used a few of these points in the past against some of my teachers. For instance, the rebooting: I had a teacher tell me once that rebooting doesn't ever really fix anything. It just simply puts it aside for it to fail later. And the dead port: I have fallen into a lot of free networking hardware because they were throwing it out and it just had a dead port on it. Tape it off and keep using it.

2:51 AM, December 03, 2005  
Pauk said...

I'd say Backup is #1 not #5 and the rest of the list is common sense (and experience)!

8:49 AM, December 03, 2005  
John said...

Good points -- but please don't teach users to reboot if it is your job to find a root cause :) While it fixes things with magical precision, sometimes the dust clears and people want to know "Whaa happen?".

9:19 AM, December 03, 2005  
Drew said...

Fark is evil.

11:24 PM, December 04, 2005  
Anonymous said...

fark is way scary enough, but fark AND slashdot would be worse;)

{romana}

12:51 AM, December 05, 2005  
Tripps said...

Can I take a guess and say you work for ClientLogic?

12:54 AM, December 05, 2005  
misterorange said...

No, I do not work at Client Logic. I'm an admin for a small business currently. I've worked for financial institutions in the past.

12:57 AM, December 05, 2005  
Anonymous said...

i sometimes prefer the BOFH approach.. it can be more entertaining =)

http://bofh.ntk.net/Bastard.html

1:31 AM, December 05, 2005  
Anonymous said...

"95% of sysadmins I have met were total idiots."

That's funny. They told me they weren't too impressed with you either.

1:42 AM, December 05, 2005  
kelebek }{ said...

Oh, dust bunnies. That reminds me I better get one of them spray thingies for my laptop! Thanks for the reminder. And YaY for Fark!

1:53 AM, December 05, 2005  
Anonymous said...

I blame Dos 5.2b and everything after it - long live the snorf

1:57 AM, December 05, 2005  
Anonymous said...

Hey, if you want to make some decent money, work overseas, and have your soul sucked out of you as an Admin, then work for EPS.

You will get treated like crap...not from users but from your employer...then you will learn the true meaning of stress and dealing with customers.

Just make up something on your resume (or use the default resume template and change the name) and get a job without interview, and "learn stuff" so when you go back to the states you can put the guys who has had 10 years experience out of a job!

http://www.epscorp.com/Careers/careers.cfm?subdir=1&career=1

They are seriously hurting and you can start at 70K tax free. Ask for Wayne!

2:06 AM, December 05, 2005  
Mark Blair said...

Very good list and very true indeed. While reading this post I found myself shaking my head in agreement. :)

2:12 AM, December 05, 2005  
Anonymous said...

Company I work for supports multiple small businesses. VNC is installed on every workstation, all the firewalls are linux boxes and every site is on a 10. subnet that is locally routable from our workstations.

I'm sitting at home right now, on our VPN logged into a client system to do some work on the backups there.

When a user has a desktop issue, they love that VNC login - the computer just fixes itself.

JC

2:13 AM, December 05, 2005  
Carl said...

Discovering your blog was a treat. Thanks. I discovered and then used VNC to remotely control a camera in my home while vacationing in Turkey earlier this year. Of course, I have no idea what I would have done if someone had been looking back at me.

3:52 AM, December 05, 2005  
Steven Dickenson said...

Great post, particularly the bit about printers... God I hate those things.

Anyway, I would've added a few more:

Always pay attention to your systems. Look at logs, ensure your backup jobs are running and your AV software is being updated. Look at your firewall logs regularly, so you can get an idea of what normal day-to-day traffic looks like.

Automation is key. Script tasks to avoid errors and speed things up. Scripts can serve as a form of documentation as well.

Every shop should have a testbed. Take a few desktop-class PCs, install VMWare, put them on their own subnet, and re-create your basic environment. Use this to test new applications, security patches, and configuration changes.

6:59 AM, December 05, 2005  
wyckedone said...

Dust bunnies are bad but I once worked on a system that was so full of tar that the entire inside was brown! Nasty stuff.

Good list.

7:01 AM, December 05, 2005  
lak_0f_sleep said...

Ahhh... Wonderful list. I've come to find that my users don't so much lie, as they tend to diagnose problems themselves, and then email me the problem (CCing the boss, of course), usually sending me on at least a half-hour wild goose chase, before I finally get back to their machine, unf*** whatever setting they've f***ed with, and restore order in the universe...

7:38 AM, December 05, 2005  
Anonymous said...

I agree with backups being number one. Having an easy solution saves soooooo much time and headache. We use LiveBackup for PC backup which is really cool.

8:03 AM, December 05, 2005  
Anonymous said...

All printers are indeed evil. I've been a sysadmin for 10+ years and the majority of the problems I've dealt with have been printer related.

BTW, Steve is 100% correct. Deep Freeze is the shizznit!

8:45 AM, December 05, 2005  
Dave said...

Excellent Post! *LOL* As an IT person myself, the truth should be told.

10:22 AM, December 05, 2005  
Anonymous said...

Truth #11: You WILL relearn everything you know about technology every few years.

10:55 AM, December 05, 2005  
Anonymous said...

Awesome list I'm in the navy and my sonar system is basically its own network running over 65 SMPs and 20some apple G4 clones with redhat and yellowdog. Plus I have three different interfacing networks a GIGe ATM and FCS...sometimes it can be a real headache but not only do I have to fix it but I have to use it too and when it goes down the submarine has problems...on the good side though I never have to be polite if the guy is an idiot who broke it I can tell him so!

11:03 AM, December 05, 2005  
Anonymous said...

Well, nice start, but it looks slightly more like a helpdesk tech support truth list than a sysadmin truth list, but that's cool. There is some definite commonality there. Users do lie, and with astonishing regularity, so this is something any IT person can relate to. And the email thing. Yeah. I've had users tell me they needed an email restored because it contained an important planning document, only to discover that it was a love letter someone had written. And then there are the programmers and end users who try to get root access on systems, claiming that they need it to do their job.

11:05 AM, December 05, 2005  
TheTuna said...

Wonderful and insightful post. I disagree only with recommending an open source Linux/SpamAssassin solution on your server. While it's cheap, it isn't as secure as a gateway email firewall/IDS that's hardened against network level attacks, DHA, malformed MIME exploits, etc.
Reference: www.secunia.com, http://www.openpkg.org/security/OpenPKG-SA-2005.015-spamassassin.html, etc.

By dropping 50% of the inbound EHELO packets at the network layer, thru RDNS, RBL and other reputation checks, your systems will work less hard and you won't have to process all that spam on your server. It also frees up bandwidth on your internet pipe. Just a thought. Enjoyed it!

11:13 AM, December 05, 2005  
Anonymous said...

I'm not a network admin, but I know how it feels... I work with drafting software that is supposedly 'cutting edge', which means a bug pops up every so often or you run into dead-end programming... the whole 'reboot solves a bunch o' problems' thing is completely true...
The program will often go haywire, with you not being able to select anything on screen, or certain features shut down. When people come whining to me, the first thing I always say is "shut down and restart and then we'll dig into it." 99 times out of 100 that solves the problem.

11:32 AM, December 05, 2005  
Anonymous said...

VNC!

The bane of my existence outside of work.. especially with family members, relatives, acquaintances.

Its the best, really.
Except it gets easier going to their place and fix their computer than trying to get them to install VNC over the phone >_<

11:34 AM, December 05, 2005  
Anonymous said...

Great List!

But #8 should be #1.

Now, if this was my list that you were reading about my business you would answer "So." I appreciate that you recognize the importance of "politeness" but it's really more of an industry attitude shift that needs to take place. We all live and die by the function you perform and we'll all die without water too. But If I've got an expensive plumber with a lordgod attitude, he goes to the top of my re-org list. The world's full of good sysadmins.

12:07 PM, December 05, 2005  
Anonymous said...

Where's "Vendors Lie"????????

12:16 PM, December 05, 2005  
Anonymous said...

Good list, mostly common sense. I do disagree with the comment about the Exchange migration; I have done probably a dozen of them and never had a problem. As far as backups go, you also need to go further and have a disaster recovery plan, and for God's sake take your backups off site. If your server room were to flood or catch fire and your backups are in there, not a whole lot of good they will do, right?

12:26 PM, December 05, 2005  
Anonymous said...

Wow, I cannot count the amount of times I have told my coworkers that when I die, I am going to have "Reboot" put on my grave stone. Good List!

12:28 PM, December 05, 2005  
Anonymous said...

It's true that nobody gets fired for buying Microsoft. However, it's not because it's Microsoft, per se. It's because ANY idiot can figure out how to reboot (troubleshooting step #1) without screwing anything up, and then call tech support if that doesn't work.

Where people get into trouble is if non-Microsoft products are used and only one or two people know how it all works. Sad as it is, when these non-Microsoft folks do get lured away to more lucrative jobs, it leaves chaos. So... policies are then put into place... Microsoft-only. Why? Not because it's better, but because they can always find someone just smart enough to call Microsoft and be walked through a solution.

Microsoft has its place, but it is far from the end-all, be-all solution for any number of reasons. In many cases, simpler is far better, but simpler and cheaper also usually means "different", and that spells trouble for myopic executives that don't understand how time-consuming troubleshooting Windows can really be in the end.

Here's one example... have you ever been in a situation where you KNEW things were hosed to the point where if you rebooted that Windows server, it wasn't going to come back up, but the only way to fix the root problem was to first reboot? Rare is the time when that happens to a Linux/Unix admin. They can either fix it without a reboot, or work around the problem such that the machine will come back up to a point where the problem can be fixed. You can't unload system DLLs or stop a core service to fix/replace files and if one is corrupted or a bad version, a reboot screws you.

The difference between a good admin and a crappy one is that one thinks Windows works well, and the other knows why and how Windows works poorly. I'd much rather work with someone that actively seeks out the poor qualities. Why? That guy will likely know how to fix the problem or get around it... griping the whole time, but better than the other idiot shrugging and speed-dialing Microsoft because "that's odd... it should be working".

1:38 PM, December 05, 2005  
Saskboy said...

I found your blog through Fark.com

That's a good list, especially the bit about backups, and the plan for restoring the system. I like to make a ghost image of a server in its working condition, then stow the backup in a fireproof safe. If one ever dies, then I can get running again in an hour [after the hardware is fixed], and have the data restored from daily tape runs.

3:19 PM, December 05, 2005  
Anonymous said...

Don't forget to add to #5 - Backups are crucial - but USELESS if you can't restore them. Do a test restore of a backup every so often on a test machine or directory.
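
A test restore is only useful if you actually compare what came back against the original. Here is a minimal sketch of that check in Python, assuming you have restored the backup into a scratch directory next to a copy of the source tree. The function names and the choice of SHA-256 are illustrative assumptions, not part of any particular backup product.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(source_dir: str, restored_dir: str) -> list[str]:
    """Compare a restored tree against its source (hypothetical helper).

    Returns one entry per file that is missing from the restore or whose
    contents differ. An empty list means every source file matched.
    """
    source = Path(source_dir)
    restored = Path(restored_dir)
    problems = []
    for src_file in (p for p in source.rglob("*") if p.is_file()):
        rel = src_file.relative_to(source)
        dst_file = restored / rel
        if not dst_file.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif file_digest(src_file) != file_digest(dst_file):
            problems.append(f"differs: {rel.as_posix()}")
    return problems
```

Run it after each test restore and treat any non-empty result as a failed backup. Note that it only checks files present in the source, so files that changed since the backup was taken will show up as differences.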

4:40 PM, December 05, 2005  
Sparky said...

Actually..the Exchange 5.5 to Exchange 2.x migration isn't bad---IF you have the luxury of doing a swing upgrade (meaning new hardware). I've done more than a dozen of them. OH..and watch those public folders.

Bad switch ports get stuffed with RJ-45 connectors that went bad..that's why I clip those ends off and stuff 'em in a bag. You just never know.

As to "Low Tech" solutions. I just replaced a very expensive Cisco Pix Firewall that died without a SmartNet contract, with a $300 firewall that has a VPN option they need.

*sigh*..I wish my company would subscribe to the procmail/spamassassin solution. Instead, we'd rather upsell that wonderful GFI product...

In the words of the immortal Bill the Cat:

ACK..PHPPT.

7:18 PM, December 05, 2005  
Michael said...

Nice list, sorry you got farked but hey, it happens.

10:25 PM, December 05, 2005  
Anonymous said...

I do IT support myself. My biggest problem is with employees expecting us to help with personal computer/tech issues. It's not that I don't want to. But we are just like 100% of the IT depts out there: we are just swamped all the time. We are understaffed and it's always, always just crazy busy.

And then these same people wonder why we 'never pick up the phone'.... Well, we FUCKING CAN'T because we're too busy trying to tell you why we can't advise you on what HDTV to buy.

Or why your Treo keeps needing a hard reset (stop monkeying with it).

Or why your clients' email keeps going into your own spam folder (do business with people who aren't so cheap they use hotmail or yahoo email).

And so on.....

1:16 AM, December 06, 2005  
Justin said...

So very true. A little windows centric maybe.

10:14 AM, December 06, 2005  
Anonymous said...

USERS LIE!!!

Lol, I always told the trainees that they should never believe the customer. Saying a user lied is mean, and if someone higher up hears you put it that way, you may not be there much longer.

11:26 AM, December 06, 2005  
Jane said...

I'm done with Sergio.
He treats me like a rag doll.

/Boobies!

1:12 PM, December 06, 2005  
B. Stabby said...

the geekery here astounds, I think I just grew buck teeth and a broken pair of glasses. but you're right! I'm the guy all the broads in my office go to before we have to call in the IT big guns. Geek on, ya bunch a dorks.

10:52 PM, December 06, 2005  
mytdawg said...

The difference between a good admin and a crappy one is that one thinks Windows works well, and the other knows why and how Windows works poorly. I'd much rather work with someone that actively seeks out the poor qualities. Why? That guy will likely know how to fix the problem or get around it... griping the whole time, but better than the other idiot shrugging and speed-dialing Microsoft because "that's odd... it should be working".

That guy worked for us for a while. Had all sorts of certifications but no field experience. His answer to everything was "call support". WE ARE SUPPORT! (sorry about the outburst).

My concept of support is that it's my job to keep the computers working. Don't try to make me learn accounting or business administration and I won't try to make you learn any more computerese than you want to know.

Being all high and mighty just sets you up for a long fall when you screw something up (and you will). It's hard enough without baiting your own trap.

I don't consider computer illiteracy a problem, it keeps me employed. I laugh at them sometimes but not in an accusatory way, the way people see the cause and effect is hysterical at times.

You have to laugh when they call and say that their ATM PIN is the same as their system password and now they can't get any money out of the machine in the lobby. It's just funny. "Yes mam you'll have to call your bank, no mam - the systems aren't connected, yes mam - they are both computers."

It's not a science, it's an art...

12:06 AM, December 07, 2005  
Anonymous said...

I certainly concur with the 'reboot' problem-solving method.

A problem continues to be a problem only if a reboot can't remedy it.

True, rebooting means you probably will not find the root cause of the problem. But in the real world, so many things can go wrong... the hardware, the software, etc. Sometimes the effort put in to find the root cause of the problem isn't worth it after all.

8:57 AM, December 07, 2005  
MS myth busting said...

Your "Top 10 System Administrator Truths" is certainly a blast to read and is pretty well right on the mark, but #7 - "No One Ever Got Fired For Buying Microsoft" is wrong.

A bank down under migrated from the old 16-bit OS/2 to the 'new' 32-bit Windows NT (3.x) about 10 years ago at the instigation of a CIO-level guy. It bombed big time and that CIO-level guy was FIRED for it, as Windows just couldn't do some of the things that they needed that even that earliest version of OS/2 could readily do. (interesting that it got very hard to follow this story in the media, especially just before the Web really got going)

A client of mine got a new CEO who was totally Linux focused. The existing IT staff were given the mandate to learn Linux for their migration to it or else. (fortunately they were all keen on that idea and had been learning Linux already, so those staff still have their positions, but if they stuck with Microsoft they clearly would have been gone)


beware of absolutes. you make a good point, just please soften it some to something like "Very Few Ever Get Fired For Buying Microsoft"
none of us know all the data and an absolute that is wrong really diminishes your whole point.

2:14 PM, December 08, 2005  
Anonymous said...

why the hell is everyone advertising his junk in this thread? I missed point about how admins HATE advertising.

2:53 AM, December 13, 2005  
Charles Lacroix said...

Haha, as a full time tech support, I've seen case #1. Very funny: you ask them "Is the firewall disabled?" and they will assure you without a doubt that it's disabled. Once you try connecting and it's still not working after 15-20 minutes of head scratching (since the client is paying me for support), I ask the client to really check it. And problem is solved. This happens at least once a week here at work :)

9:25 AM, December 13, 2005  
David said...

Though I was plentifully amused by your article, I have to say I must respectfully disagree with the general "reboot" fix-all. As a technical lead with a production support team, I never let them get away with simply rebooting a server. Things fail for a reason, and if you reboot before collecting enough information to figure that out, you'll be rebooting for the rest of your career.

Of course, we don't use Windows servers, where it's practically impossible to troubleshoot reboot-worthy problems, and where so many of the problems are actually with the OS or its drivers, but in the Unix world, you always have many tools at your disposal to figure out what's going wrong with something.

If something needs to be restarted, it's buggy and it needs to be looked at more closely.

11:03 AM, December 13, 2005  
Anonymous said...

"If it ain't broke - don't fix it"

11:05 AM, December 13, 2005  
The Spoonman said...

That guy worked for us for a while. Had all sorts of certifications but no field experience. His answer to everything was "call support".

Then, he never worked for you. The original poster was making the point that a good admin knows what they're doing. Certs are not proof of that, certs are proof that you can pass a test. I've got certs, I didn't take any classes or "study". How? Easy, certs test the BASICS of what you should know. They're slightly above "here's the power switch".

My concept of support is that it's my job to keep the computers working. Don't try to make me learn accounting or business administration and I won't try to make you learn any more computerese than you want to know.

Then, you've got the wrong concept. Your job is to make everyone else's in the company easier. You are there to make them more efficient and help them find solutions. To that end you need to know about accounting and business administration and payroll and human resources and marketing and so on. How can you possibly provide technology to people if you don't know what they're going to do with it?

Frankly, it's people like you (and ANYONE on this list who says an Exchange 5.5 to 2000 migration is hard) that make companies hate their IT departments. You come in everyday with the expectation that you're there just for the technology or worse that the technology is there for you.

11:07 AM, December 13, 2005  
Crocket said...

While I enjoyed this list, a well known web site comes to mind...http://www.theregister.co.uk/odds/bofh/

And of course..the history: http://bofh.ntk.net/Bastard.html

ALL admins should read these....

11:09 AM, December 13, 2005  
Marc said...

Good idea about stuffing the bad ports with a blank connector. I need to remember to do that next time a switchport fails on me. At that point, when I see myself avoid the BAD port I marked with the connector, I'll owe you a cup of coffee.

11:27 AM, December 13, 2005  
Anonymous said...

I disagree with #10 - the only problem with training your users to reboot is that you often lose insight into problems they're having. Sure, it's a great fix for a symptom, but by giving them the go ahead to apply that fix, you lose the opportunity to gain insight into the nature of the disease.

I run into this a lot. People reboot their own machines whenever something starts to act odd, takes longer than they perceive to be normal, or a program crashes. By the time something goes catastrophically wrong and I get called in, I invariably find the cause is something I could have done something about if I'd been informed they were having trouble, instead of just rebooting.

#8 is bang on, though. Sure, users do some dumb stuff now and again, and can be confrontational when frustrated, but you're an IT professional. Act like one, say please and thankyou and sorry - and remember, these people aren't as dumb as you think. Sure, they don't have a clue about the intricacies of packet-routing, but then again, I don't know crap about marketing.

11:45 AM, December 13, 2005  
Anonymous said...

A word of caution on the "reboot first" mantra. This only applies to workstations, printers and unmanaged hubs/switches.

If servers go fubar, it is for a reason and you should try and figure out "why", first. With many high end switches and routers rebooting wipes the in-RAM logs and you've just lost all clues as to what went wrong.

As a telecom tech, we used to joke about which people came from the PC world. We could always tell because their first response was to reboot the switch. :-)

11:58 AM, December 13, 2005  
Anonymous said...

This is one of the weakest, most poorly written and boring articles I have ever been compelled to read. I want my two minutes back.

12:02 PM, December 13, 2005  
Topher2798 said...

Great list, but you contradicted yourself on items 7 and 9.

12:13 PM, December 13, 2005  
Anonymous said...

"#5 - Backups are Crucial"

No. Being able to restore is crucial.

I've seen a site totally lost because although backups were being taken (a combination of full and incremental), no-one had any idea which backups to restore in what order.

The restore strategy comes first. Then work out what backups will be needed to carry it out.
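
To make the "which backups, in what order" question concrete, here is a rough Python sketch. It assumes a simple scheme where each incremental holds the changes since the previous backup of any kind, so a restore means the most recent full backup followed by every later incremental in date order. The `Backup` record and its labels are made up for illustration; real backup software keeps its own catalog.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Backup:
    label: str      # hypothetical tape or file label
    taken_on: date
    kind: str       # "full" or "incremental"


def restore_chain(backups: list[Backup], target: date) -> list[Backup]:
    """Return the backups to restore, in order, to recover the state at `target`.

    Assumes every incremental after the chosen full backup is required.
    """
    # Ignore anything taken after the point we want to recover to.
    usable = sorted((b for b in backups if b.taken_on <= target),
                    key=lambda b: b.taken_on)
    full_positions = [i for i, b in enumerate(usable) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup on or before the target date")
    # Start from the latest full; everything after it is an incremental.
    return usable[full_positions[-1]:]
```

Writing the chain down (or printing it from a script like this) before disaster strikes is the point: the restore order should be known in advance, not worked out during the outage.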

12:15 PM, December 13, 2005  
Anonymous said...

I decided not to go into system administration because of #10. I hate watching computers reboot; it drives me crazy!

12:20 PM, December 13, 2005  
Tom said...

Along the lines of "users lie", I'd like to add "users get it wrong". I wish I had a dollar for every time a user told me that "the server is down" - on networks that had four or five servers. Sysadmins need to learn how to translate user gibberish into a sensible statement about the situation.

12:30 PM, December 13, 2005  
Anonymous said...

hmmm. i found that my exchange migration from 5.5 to 2000 to 2003 went very smoothly. maybe it's because i followed the migration guide... who knows? ;)

12:33 PM, December 13, 2005  
Anonymous said...

The saddest fact of being a sysadmin is that if you do your job really well, nothing ever goes wrong. So all the users (and more important - your boss) assume you're doing nada but sit around all day websurfing.

It took me several years of being a dedicated admin ace before I realized the error in my ways. Now I contrive a random network fault at least every quarter. I run around looking hassled for a few hours, then miraculously solve the problem in full management view. Not only does it help shed a few pounds accumulated reading email, but I also get a big pat on the back from the PHB for being a trooper working late to get it sorted.

12:34 PM, December 13, 2005  
Steve Magruder said...

Of course, the reason rebooting is oftentimes a solution to weird PC problems is that a lot of software leaks memory, and those leaks pile up to the point where many apps will get stuck, throw out memory errors, the hard drive will "chug, chug, chug" and so forth. Only a reboot can (temporarily) alleviate that leak buildup.

To reduce the instances of the need for rebooting, sysadmins need to be *vigilant* about keeping spyware and other crept-in software on users' systems, as obviously, these things suck up memory.

Also, it helps to keep a diary or handbook around as to the utilization of memory by the common apps that your users are using. Identifying/tracking those apps that "leak like a sieve" will be very useful, and creates good evidence for reporting those problems to the developers of those apps, so perhaps the leaks can be fixed.
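
One way to keep that kind of diary is to jot down each app's memory use at intervals and let a script point out the ones that only ever grow. The Python sketch below assumes you already have the readings (in MB) from whatever tool you use; the app names and the 50 MB threshold are hypothetical, and steady growth is only a hint of a leak, not proof.

```python
def leak_suspects(samples: dict[str, list[float]],
                  min_growth_mb: float = 50.0) -> list[str]:
    """Flag apps whose memory readings never shrink and grow past a threshold.

    `samples` maps an app name to its memory readings in MB, oldest first.
    Apps with fewer than three readings are skipped as too little evidence.
    """
    suspects = []
    for app, readings in samples.items():
        if len(readings) < 3:
            continue
        # True when each reading is at least as large as the one before it.
        never_shrinks = all(b >= a for a, b in zip(readings, readings[1:]))
        growth = readings[-1] - readings[0]
        if never_shrinks and growth >= min_growth_mb:
            suspects.append(app)
    return sorted(suspects)
```

The list it returns is a starting point for the handbook, and the raw readings behind it are the evidence to send along with a bug report.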

12:37 PM, December 13, 2005  
Steve Magruder said...

Meant to say "sysadmins need to be *vigilant* about keeping spyware and other crept-in software _off_ users' systems"

12:39 PM, December 13, 2005  
Nishi said...

All you said is so true, these are the day to day activities of a System Admin. Currently I am working as a Network Admin and love every bit of what I am doing. The list helps soothe the pain.

12:40 PM, December 13, 2005  
Nougat said...

I am an MCRS: Microsoft Certified Reboot Specialist.

12:40 PM, December 13, 2005  
Anonymous said...

FYI, you've been slashdotted.

12:40 PM, December 13, 2005  
Anonymous said...

what about installing Linux.
no more problems then

12:58 PM, December 13, 2005  
Anonymous said...

Funniest user-comment I've ever heard - guy had a loose cable on his monitor, and every so often the green signal would drop, causing the screen to go purplish. His solution was to shake the monitor until the color became normal again. I'm walking by and see this, give him a funny look, and he says "I hate it when it turns purple and you have to shake it!" I just smiled at him and he was embarrassed about it for weeks.

1:00 PM, December 13, 2005  
operagost said...

Your suggestion to avoid Linux because "no one ever got fired for buying Microsoft" conflicts with your later suggestion to load Linux on old boxes to use as a firewall. Fortunately, there is really no contradiction as being the only guy to know a system is more like job security (as long as you maintain a good work ethic and relationships) than a risk.

1:02 PM, December 13, 2005  
Anonymous said...

I think the MS bashing was interesting. Other than that, nice post.

What needs to be said though, is that IT supports the end users, who ultimately know nothing but MS applications and OSes. As for Exchange, I'll be honest and admit to not knowing the cost, but as far as technical issues it has been solid with the exception of third party software farking up the database.

1:14 PM, December 13, 2005  
Anonymous said...

Awesome List...you couldnt be more right.

1:51 PM, December 13, 2005  
Anonymous said...

I use VNC for the Apple Network I admin. It is nice for doing things so that I don't have to run between offices, can access it from any type of machine (Linux, Apple, or if need be MS) and it is CHEAPER than Apple Remote Desktop.

2:00 PM, December 13, 2005  
Anonymous said...

#7 states:

Linux may be powerful, but ... will just as soon get you a pink slip ...

yet #9 says:

take a look at one of those old "junk" PCs you have in the closet, download your favorite distro of Linux, and install procmail and spamassassin.

These 2 statements seem to contradict each-other. I'd leave #9 as it is, and propose the solution to #7 is to document everything. Make sure there are written instructions on how to bring that Linux box back from anything, and perhaps train someone else on its inner workings.

2:25 PM, December 13, 2005  
Anonymous said...

My company is migrating from Exchange 5.5 to 2003 early next year, so I will soon feel your pain and then some!

2:34 PM, December 13, 2005  
erik said...

Deep Freeze is garbage. uninstalling it is impossible because if you do you cannot use windows update or firewall and if you start editing the registry it will kill the pc on the next boot.

sys admins who use this dont know what they are doing. This could all be controlled by group policy on active D.

3:10 PM, December 13, 2005  
esm said...

never do anything important on friday.

unless of course you LIKE coming in on weekends to fix stuff.

5:35 PM, December 13, 2005  
SoreGums said...

Melissa virus taking down the business due to microsoft solution. Art department humming along due to being on apple platform

You know that it is just shit sys admin.... melissa only infected the people who have nfi, regardless of all M$ or not.....

7:01 PM, December 13, 2005  
Anonymous said...

ehehe

no one ever got fired for buying Microsoft :)

well some companies (including mine) will _not_ hire if you ever used Microsoft :)

take that double click boy :)

7:49 PM, December 13, 2005  
mytdawg said...

Frankly, it's people like you (and ANYONE on this list who says an Exchange 5.5 to 2000 migration is hard) that make companies hate their IT departments. You come in everyday with the expectation that you're there just for the technology or worse that the technology is there for you.

Ah yes, the unavoidable flame. I am the reason everyone hates IT. I've tried so hard to keep it a secret.

Because I really don't want to know how to depreciate assets to minimize capital gains, I must not be a "team player".

I thought it was the arrogant pretentious jackasses that were the problem. Thanks for clearing that up. All that and an Exchange expert - we're not worthy.

Nice list anyways - sorry about screwing up that whole "IT" thing for everybody... sheesh.

7:50 PM, December 13, 2005  
nrunge said...

"...but the command prompt and configuration files and filesystem obscurity"

Something as well documented as Linux cannot be obscure. It can, however, be misinterpreted. Much like your definition of the word obscure. I guess this is the "error" part of your approach to systems administration.

7:54 PM, December 13, 2005  
Rob said...

When in doubt, it's always the name service. :-)

Nice collection of (in my opinion) truisms...

8:40 PM, December 13, 2005  
Ben said...

I'm not sure where you got your jetdirect cards, but my company has around 100 HP LaserJets with jet direct cards and in 4 years I've replaced 2 cards.

9:18 PM, December 13, 2005  
Anonymous said...

I considered Retrospect for my company, but we needed a super advanced backup system (digital photography, a few terabytes of information, added hourly, multiple backups including off-site), which is when I found SyncBack SE (http://www.2brightsparks.com/syncback/sbse.html)

It's only $20 and blows all backup software away with quick tech support, but is mainly for advanced users.

10:31 PM, December 13, 2005  
Anonymous said...

Something else to be said about the reboot, which ties into the politeness too:
That reboot gives you a couple minutes (sometimes up to five) to ponder the problem, ask further questions, and make that end user feel a lot better.

I've had plenty of panic calls where I've spent that reboot telling the user "Everything will be fine" and "The worst that can happen is we just pull some files off of that thing and get you running elsewhere for a while."

11:30 PM, December 13, 2005  
Anonymous said...

If users are calling you with broken monitors and busted printers you are not an admin. You are a support technician. Admins are domain or enterprise admins and users rarely contact those people directly. Admins are the people that support techs talk to when the problem cannot be fixed on the client machine.

11:54 PM, December 13, 2005  
Darcio Prestes said...

Perfect. You travelled elegantly around personal and technical issues of a systems administrator professional routine. I encourage you to write a book.

8:11 AM, December 14, 2005  
Anonymous said...

Great list. One addition to the reboot remedy: specify a cold reboot (for PCs and Windows, anyway). I was embarrassed this week because installing Norton Internet Security caused the DVD-ROM on my wife's PC to stop working. I spent a couple of days trying to fix it by turning the software off, and even trying to get the drive to work while the machine booted. We took it to Best Buy (I know; we have a service contract...) The "geek" turned on the machine, pressed the eject button. The darned thing worked! Stupid machine!

3:59 PM, December 14, 2005  
Anonymous said...

Pffff, man! This list has no sense!!!

get a grip. This is not even SysAdministration. At most a support technician with a root password.

4:15 PM, December 14, 2005  
Dennis Smith said...

awesome blog. It inspired me to work on a blog called Top Ten Recruiting Truths.

Keep up the good work.

6:12 PM, December 14, 2005  
Anonymous said...

Nice list. Although it should be noted that #1 is our own fault, thanks to people in support positions who have zero business being there, and to the lazy tech scapegoat of blaming Windows and reloading the machine instead of saying "I don't know" and then finding the real problem (although, judging by the bills my customers have received from other support companies, it's a great way to pad an invoice). Time and again I see a "support" person tell someone to use the recovery disks for something that could easily be remedied without forcing the user through hours of annoyance reloading everything. I roll my eyes too, just like we all do when the DSL goes down and we are told "turn off the modem, turn off the PC, turn the modem back on first, then the PC" right after we had to give them an impromptu on-the-phone class on IP and NAT. Who doesn't just "yes man" the tech when we call about a hardware device we KNOW to be bad and in need of an RMA while they have us going through the "let me hold your hand while you update the firmware that won't load again" song and dance? Meantime you are stuck telling the support clown what features and specs the device actually has, because of the two of you, you are the only one who bothered to look at that handy tidbit of information.

Lazy techs tossing sticks of TNT to swat flies just messes up the reputation of the industry. Just because something works doesn't make it the best solution.

Could you imagine what would happen if every time a car had something wrong under the hood the mechanic said "let's rebuild the engine"?

4:10 AM, December 15, 2005  
David Gerard said...

I've been a Solaris admin for five years.

One thing I am very careful never to let the NT admins know is how often "reboot" is the quick answer for Solaris as well. Unlike on Windows there's probably an actual solution if you go hunting, but it's still embarrassing.

3:34 PM, December 17, 2005  
Anonymous said...

What you and other sysadmins miss about rebooting is that it takes a lot of time. If you've got a dozen or so applications open, recovering the state in all of them is a big effort and a big disruption to your workflow. And all because some sysadmin values their time over the user's and can't be bothered to fix the real problem. Okay, sometimes you need to reboot; but not every time. The prime example of this is when Lotus Notes hangs. Rebooting works because it kills two processes and removes a temp file, but you could do the same with Task Manager and save the user half an hour. But no, it's the nuclear option every time.

9:36 AM, December 29, 2005  
Cathy Podd said...

#10 is my #1
After many times of asking our staff to do this before asking me, I finally made up a really bad jingle so they would remember:

"Restart it (clapclap clap)
Restart it (clapclap clap)
If your PC's bein' mean
Restart it (clapclap clap)
Restart it (clapclap clap)
If the fax is goin' wax
Restart it (clapclap clap)
Restart it (clapclap clap)
When the printer stops the center
Restart it (clapclap clap)
Restart it (clapclap clap)"

I haven't had pre-questions in a long time.

3:26 PM, January 27, 2006  
Anonymous said...

What's up with the '?' ???? LOL!!!


#1 ? Users Lie

Oh yes, they do. Don?t think you?re immune either. Have you ever been on a tech support call, convinced that you know the problem and the guy on the phone says something like ?Would you put in the recovery CD, restart, and scan your memory?? ?Oh, I?ve tried that,? you say with eyes rolling. Believe it or not, sometimes we crazy admin peeps suggest these fixes because they work. When a user is protesting my assessment, the best is to politely insist them to do what was asked until the doing is done.

11:20 AM, May 02, 2006  
Anonymous said...

I can definitely identify with the old fix all --- REBOOT (init 6)!

Customer's response - Okay. How long until it comes back up? Me: Depends if the RAIDs FSCK

6:37 AM, October 12, 2006  
