Programming Things I Wish I Knew Earlier
theodp writes "Raw intellect ain't always all it's cracked up to be, advises Ted Dziuba in his introduction to Programming Things I Wish I Knew Earlier, so don't be too stubborn to learn the things that can save you from the headaches of over-engineering. Here's some sample how-to-avoid-over-complicating-things advice: 'If Linux can do it, you shouldn't. Don't use Hadoop MapReduce until you have a solid reason why xargs won't solve your problem. Don't implement your own lockservice when Linux's advisory file locking works just fine. Don't do image processing work with PIL unless you have proven that command-line ImageMagick won't do the job. Modern Linux distributions are capable of a lot, and most hard problems are already solved for you. You just need to know where to look.' Any cautionary tips you'd like to share from your own experience?"
Thank you for this great tip!
Put enough comments in your code so that five years from now you (and others) can remember what you intended the code to do. Remember that comments are not for describing what the code technically does (that is what the code is for); comments are for what the code is intended to do. Try to comment the decisions you made when developing the code, specifically why you took the approach you did and why you didn't use other options.
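To make the "why, not what" point concrete, here is a minimal sketch. The function name and the scenario (items shown to users in entry order) are made up for illustration; the only thing that matters is what the comment records.

```python
def unique_in_order(items):
    """Drop duplicates while keeping the first occurrence of each item."""
    # Why not set(items)? In this (hypothetical) app the result is shown to
    # users in the order they entered it, and a set would lose that order.
    # dict keys keep insertion order (guaranteed since Python 3.7), so this
    # stays a one-liner without pulling in anything extra.
    return list(dict.fromkeys(items))
```

The code already says what happens; the comment says which alternative was rejected and why, which is the part you won't remember in five years.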
The truth is that the "hard" way of doing things is often more fun, because you have the challenge of learning a new tool or API. Plus sometimes it's actually easier in the long run because you've engineered a solution for the outer bounds conditions of scalability, so if your application takes off, it can handle the load.
I guess the real issue is that you have to engineer a "good enough" solution rather than a "worst case" solution.
I do not fail; I succeed at finding out what does not work.
Don't ask for advice about programming on slashdot unless you have a pile of salt grains ready.
Am I part of the core demographic for Swedish Fish?
Sometimes it's easier and faster to code from scratch than it is to use off-the-shelf software - especially in the age of "frameworks".
In that train of thought, it's often better to toss and rewrite (or write new programs) than it is to extend existing programs.
It's easier to implement a whole new framework than it is to convince your boss that writing anew is actually faster.
Python already has a library to do the details for you. Perl does too, but the documentation may or may not be as nice and may require a separate install.
I guess I'm an idiot, but.. err... did someone REALLY use mapreduce to solve an argument passing problem (the domain of xargs), or is the writing just shit?
A database is not a bitbucket. Re-building basic database functionality in an external app is not a good idea. Applications, frameworks, languages come and go; data remains forever [1]. Business logic is part of the database. If you find yourself adding more and more "application servers" to get performance, then you have a fundamental problem with your architecture (and probably a fundamental misunderstanding of how databases work). While it is not impossible to learn and implement good data management/database development practices using Microsoft tools, such a result is seldom seen in the wild.
sPh
[1] Per Tom Kyte of Oracle, whose first database job at the Department of Agriculture involved working with datasets stretching back to 1790.
unless tcl/tk won't do the job
If you are writing a program that touches more than two persistent data stores, it is too complicated.
I disagree. Is a program too complicated if it has 1. input, 2. output, and 3. logging? Is a program to prepare images for an online store too complicated if it reads 1. raw source images and 2. an overlay image and writes 3. finished images?
If Linux can do it, you shouldn't.
That'd be fine if we all ran Linux. But in an organization that already has to run Microsoft Access for other reasons, we have to take Windows into consideration. And I don't think Ted Dziuba was talking about just using Windows as a shell to run Linux in VirtualBox OSE either.
Don't do image processing work with PIL unless you have proven that command-line ImageMagick won't do the job.
Our programmer is far more experienced in Python than in bash, and if I felt like it, I could benchmark PIL against subprocess.Popen(['convert', ...]).
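For what it's worth, that benchmark is only a few lines. This is a rough sketch, not a measured result: `time_call` and `resize_with_convert` are names I made up, it assumes ImageMagick's `convert` is on the PATH, and the PIL side is left as whatever function you already have.

```python
import shutil
import subprocess
import time

def time_call(fn, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

def have_convert():
    """True if an ImageMagick `convert` binary is on PATH (an assumption, so check)."""
    return shutil.which("convert") is not None

def resize_with_convert(src, dst, size="50%"):
    """Shell out to ImageMagick; raises CalledProcessError if convert fails."""
    subprocess.run(["convert", src, "-resize", size, dst], check=True)

# Usage (file names are placeholders):
#   time_call(lambda: resize_with_convert("in.png", "out.png"))
# and time your existing PIL function the same way before picking a side.
```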
if the physical machine is not the bottleneck, do not split the work to multiple physical machines.
Yet PC game developers split a 4-player game across four PCs when one could do, and increasingly, PS3 and Xbox 360 game developers are following the same path.
It is far more efficient to buy your way out of a performance problem than it is to rewrite software. When running your app on commodity hardware, don't expect anything better than commodity performance.
If you are writing software to be used internally, sometimes springing for better hardware is worth it. But if you are writing software to distribute to the public, you can generally assume your customer has commodity hardware unless your software costs at least 1000 USD a seat.
I just RTFA. It isn't that good. These aren't many tips and they also don't seem to be too specialized. Most of them are already known or predictable. I can say that I didn't learn anything from TFA.
I think that anything that reads "___ things to know about ____" or similar gets instant hits on /.
-1, Boring from me; hope it helps others.
Have you heard about SoylentNews?
And that's before you get into the really difficult stuff (that very few have managed to master) of getting a website that is easy to navigate and intuitive to use
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
Don't listen to other people; they are the biggest source of painful misconceptions.
Don't program in C if Python will do the job.
But in a lot of cases, Python does not do the job. It doesn't do the job on iOS because Apple has explicitly banned everything but Objective-C++. It doesn't do the job on Xbox 360 or Windows Phone 7 because IronPython uses Reflection.Emit, and the version of .NET used by XNA doesn't support Reflection.Emit. And it doesn't do the job on Nintendo DS because the runtime uses up a lot of the available 4 MB of RAM.
Don't assume that, even six months from now, you're going to remember why you did things a certain way.
And the corollary: Don't assume you're going to be the one modifying the code a year or two from now.
Either way: Add comments liberally. Even if you're a conservative.
#DeleteChrome
You might learn something from doing things the hard way, but all you'll achieve is a version #1. As we all know (or will learn) version #1 of pretty much everything should be thrown away and should NEVER see the light of a production server. However, timescales being what they are as soon as an application gets close to functional it gets snatched away and put live - no matter how ugly it is. After that, all you ever have time for is to patch the worst parts. Doing a complete rewrite from the ground up, to do it right, is a luxury few of us experience.
I guess I'm an idiot, but.. err... did someone REALLY use mapreduce to solve an argument passing problem (the domain of xargs), or is the writing just shit?
xargs and its successor GNU parallel implement the "map" part of MapReduce.
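A rough sketch of that split, in Python rather than shell so both halves are visible. The names `count_words` and `word_count` are made up; the thread pool stands in for what `xargs -P` does with processes, and the final merge is the part xargs leaves to the rest of the pipeline.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def count_words(text):
    """Map step: one independent unit of work per input, like one xargs invocation."""
    return Counter(text.split())

def word_count(texts, workers=4):
    """Run the map step in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, texts))  # roughly `xargs -P 4`
    # Reduce step: xargs has no equivalent; in a shell pipeline this would be
    # whatever consumes the combined output (sort | uniq -c, awk, and so on).
    return reduce(lambda a, b: a + b, partials, Counter())
```

On one machine that is about all there is to it; what this sketch does not give you is distribution across machines or recovery from failed workers.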
That's why you test new stuff, idiot.
Hang on a minute, new stuff is not worth the risk, apparently. Perhaps when he said "Upstart" he meant "mon", or "monit", or one of the other well established process monitoring tools?
I'd like to see this wonderkid take his non parallel code and parallelize it in five years time when he realises his non-parallel implementation doesn't scale.
Crap, crap, crap, crap, crap.
Do not make things super-modular and generic unless they 100% have to be. In 99.9% of projects no one, including yourself, will use your stupid dependency injection, and logging / access control can be done just fine without AOP. Don't layer patterns where there's no need. Aim for the simplest possible design that will work. Don't overemphasize extensibility and flexibility unless you KNOW you will need them. Follow the YAGNI principle (you ain't gonna need it).
"Modern Linux distributions are capable of a lot, and most hard problems are already solved for you. You just need to know where to look."
First off, I must say this piece says a lot about the Linux ecosystem, specifically that the system's documentation is anemic at best. Why can't we have something like:
"What do you want to do?" ... with an associated answer. This kind of arrangement surely cannot hurt the Linux ecosystem.
You know, I find that as I get older, I am able to avoid overengineering things a lot better than when I was twenty-something. There's a nasty side effect, though: I'm learning a lot less in depth about systems than I normally would.
Overengineering is terrible for a project, but it often is highly educational.
... When you're working on a really, really good team with great programmers, everybody else's code, frankly, is bug-infested garbage, and nobody else knows how to ship on time. When you're a cordon bleu chef and you need fresh lavender, you grow it yourself instead of buying it in the farmers' market, because sometimes they don't have fresh lavender or they have old lavender which they pass off as fresh.
I read this in Joel Spolsky's article (http://www.joelonsoftware.com/articles/fog0000000007.html) but I think he has a point.
Visual Basic - Don't. Just don't.
It's always great to learn a language then have the company change it so drastically in the next version that all your knowledge of the language is useless. I don't believe it'll be the last time that happens either. I do know I will never bother to learn another MS programming language again.
Good luck to all you C# programmers when they switch to C#.NET, or whatever they call the next one. Hope you like reading!
This sentence no verb.
Most of that stuff isn't even part of "linux", strictly speaking. No, I can't resist pointing out this nitpickery after having been annoyed to death by "gn00/linex/debjan" and all the overlong designations that crowd of painfully annoying pedants insisted on slapping onto systems that worked Just Fine without their oh-so-open goo. Like, just about all modern members of the BSD family. Thanks so much for that, guys. Anyway. If we're to be properly open-minded about our tools, one could argue that "linux" is just as much part of "Unix" these days as the entire AT&T Unix and BSD families and perhaps even minix and a bunch of others are. Why be so centric on a single mass of code?
Some oversimplified philosophy, some good hints. Programmers and SysAdmins who do a lot of resource management eventually become managers. This isn't necessarily a bad thing, as the world needs more managers with extensive experience with that which they are managing, and the respect of those people they are managing. It's true that it's silly to adopt some software, technology or process just because it's new. But Ted seems to be resistant to any change, which is not good either. The problem with "don't fix it if it ain't broke" reasoning is: what do you do when it eventually breaks? This is a mistake made by many in process control / automation environments: failure of a part which is so obsolete that it has become difficult and expensive to obtain a replacement. Just try to find a new motherboard with an ISA bus these days. Or a composite monitor. The same thing can happen with software and the OS. Where are you going to find a guy who knows enough about that old Kaypro which was running some COBOL software on CP/M, which controlled the electroplating machinery? This is why companies have lifecycle management, so that the pain of switching to newer software / hardware comes with predictable cost and timetables instead of sudden, possibly prolonged unavailability and expensive, awkward, band-aid fixes.
This flows into the idea of organizational amnesia, where important processes become lost. This is perhaps best illustrated by the US DoE forgetting how to make this secret substance called FOGBANK, which is a critical component of H-bombs. Upper management felt as though, because there was no need for additional H-bombs, the process was unimportant, and didn't take into account that H-bombs become (more) dangerous with advancing age, and eventually these needed to be replaced. It took considerable time and money to re-engineer FOGBANK.
These are both examples of failure to consider that all equipment wears out, and failure to plan for long-term needs.
"The purpose of software engineering is to manage complexity, not to create it."
Bill Catambay, Pascal Programmer on Macintosh and Open VMS
But in a few cases, Python does not do the job.
FTFY.
I've been building a website. Mostly it's just a big pile of bash scripts building everything. I have one script that calls about a dozen others. It builds Apache from source, PHP from source, MySQL from source, and my content manager Typo3 from source, plus a dozen support packages all from source, patches (hardens) PHP and adds in a security module, patches Apache (modsecurity2), configures everything (all scripts and configuration files get configured from this one main script), and also builds an ODBC driver plus builds ODBC connectors for OpenOffice, all from the one main script. With nothing else running, my Core i7-920 runs at 800% for about 20 minutes. I have a backup script mostly using rsync to back everything up once per week. I wrote the bash scripts, but I didn't write the web server, database engine, interpreted web language or content management system (although I have heavily configured all of them). If I ever needed functionality exceeding the bounds of what the pre-built packages can do, then I would write a program (likely in C) to provide the functionality. It's just a matter of efficiency. If I can get 90% or 95% or 99% of the functionality I need at 1% of the effort (or whatever the ratio is between writing a software package yourself versus just configuring a pre-built one), I get much more done in a smaller amount of time. Likewise, it's easier to modify the Apache source code than to build my own web server from scratch. If I really wanted to build my own web server from scratch, I suppose I would. But alas, the desire is not within me.
Funny thing is sometimes you wait a year or two and:
1) The C guy still hasn't finished the job either (at least not got all the annoying bugs out).
2) Intel and friends now let you use Python to do the job ;).
Not usually true (hopefully ;) ) but, a few years ago I wrote some TCP/IP servers/services in Perl and they performed OK, it wasn't my stuff that was falling apart regularly due to load, unexpected/malicious input, etc.
Thanks go out to Intel, AMD and the DRAM bunch :).
The smart phones are already more powerful than many old PC desktops still creaking on merrily...
You're wrong. C is more like a swiss army knife which has a chainsaw, a sword, a lightsabre, a rod of destiny, and a diamond tipped blade. Python is like a playdoh toy, and that's why it's so popular with the Ubuntu developers.
I was hoping for some actual programming tips or tricks.
This was completely irrelevant to me, and aside from the NoSQL review, not even that interesting.
If the only way you can accept an assertion is by faith, then you are conceding that it can't be taken on its own merits
q since that makes no sense. (Yes, I've seen an open source project with a stack that had the name 'q'.)
Did you know 80 to 90% of the moderators on slashdot wouldn't recognize a troll even if one dragged them under a bridge.
I don't know if Linux has everything Unix has, but in C on Unix, you can call popen() to pipe to a subprocess from within compiled code. That allows you to use any utility that reads from stdin or writes to stdout. That goes along the "don't do stuff that Linux already does" and Python subprocess.Popen sentiments.
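For the Python side of that, a minimal sketch: pipe some lines through the system `sort -u` instead of writing the logic yourself. It assumes a Unix-like system with `sort` on the PATH; the wrapper name `sort_unique` is made up, and `LC_ALL=C` is set only so the ordering is predictable.

```python
import os
import subprocess

def sort_unique(lines):
    """Pipe lines through the system `sort -u` and return the result as a list."""
    proc = subprocess.run(
        ["sort", "-u"],
        input="\n".join(lines) + "\n",   # fed to the child's stdin
        capture_output=True,             # collect the child's stdout/stderr
        text=True,
        check=True,                      # raise if sort exits non-zero
        env={**os.environ, "LC_ALL": "C"},
    )
    return proc.stdout.splitlines()
```

The same shape works for any utility that reads stdin and writes stdout, which is the whole point of the popen() approach.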
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Hey
I know these might sound odd but hear me out. Start by trying to rewrite the basic libraries: make your own printf, strcpy, strlen, etc. Write your own copies of linked list and tree storage methods, and above all really, really start to understand how memory works.
Another really, really important thing that I have learned is to stay FAR FAR away from OO programming until you're really, really comfortable in lower-level languages. The reason is that too many students and beginners sit there trying to figure out why their variable started with value X and ended up with value Y, only to find out that their object bashed some memory earlier on.
Basically just grab a good C compiler, I mean a C COMPILER, not C++, not C#, not F#, and start to learn how all the functions you use on a daily basis work. It will give you new insight into why and how you can quickly avoid and fix problems when and before they happen. It's also important to get a really, really good handle on using a CLI over a GUI. Stay away from Visual Studio and other similar tools. Use GCC and CC, make sure you look at how LD works, and understand how compilers optimize and improve your code. This post is talking about grabbing a tool to do image processing and perform functions that already have working solutions. However, taking the time to see how the solutions work and why they work will give you good insight into not only great code design but great programming methods. It might seem odd for me to suggest that a beginner try to rewrite strcpy or strcmp, but once you see how they really work you'll be far less likely to make the simple mistakes that can keep your program or project from working. It's the same with a beginner figuring out how malloc works and where memory gets taken from and put to. All of these suggestions come from the way I learned to program in C and other languages.
Feel free to throw any of these away or take any of them into your own programming adventure, but one thing is for sure: when you can figure out how the basic functions you use every day work, it will save you hours and days of troubleshooting and leave you with a greater palette of tools to use in the classroom and on the job. I welcome anyone who wants to add ideas to this post and attack it with their own viewpoints.
How does xargs replace hadoop, even in trivial examples?
Unless you are talking about running hadoop in a single machine, which is just waaay too trivial an example.
I have a shock collar around my neck hooked to my computer, it activates if I attempt to use any IDE, compiler, or script interpreter.
No. Sorry, I strongly disagree.
C is like a swiss army knife that can be used to assemble a chainsaw, a sword, a lightsaber, etc. You'll have to carve out all the pieces, file them down and put them together yourself, but the swiss army knife will help you along the way.
But don't forget the bandaids.
Consider making it a function, especially when you've repeated the same 8 damn lines over 70 times. (Yes, I've seen that one too. Yes it bit the guy writing it in the ass and no it wasn't me, I had to code review it.)
Look, the PHB usually wants the code to run on HIS server for HIS use only. That's what he pays you for. Not to code it in the most cross-platform friendly language-du-jour and take 2 years to iron out all the bugs.
He doesn't give a shit if it'll run on the Xbox 360 or a Linux-ready Dead Badger.
And neither should you, unless you are some kind of anal retentive who spends all day arguing the merits of absolute versus relative positioning, fixed vs percentage tables, and worrying whether your code will run on every machine conceived in the next 50 years.
I have news for you, it won't.
Hell, I got an N900 6 months ago, and it's already EOL'd as far as updates to the OS are concerned.
I suppose I could wait till there's a port of Android or something, because the three coders on the project are doing a fine job, in less than a year I'll have the same functionality as a 3210.
I read his page and the comments here and I can't seem to find any arguments on this?
I'm in a different boat from most commenters here, I think, because I am a scientist writing simulations; some simulations run a long time and create a lot of data which would be costly to reproduce, and what I wish someone had told me early on was that I should comment my *data files*, not just my code. Each file should include the exact parameters used to create it, an explanation of what each column represents, and preferably there should be a way of knowing what version of your simulation code was used to create it. A couple of times in grad school I had to toss out months of data after I discovered a bug in my code, and didn't know when the bug showed up and which data was affected by it.
(I'd welcome other advice from simulationists too; I've never had an advisor who was particularly programming-savvy, even though programming was always a large part of my research, and so I always had to make it up as I went along.)
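One way to do the self-describing data file, as a sketch only. The header layout ('#' lines, then CSV) and the names `write_run` and `read_params` are my own assumptions, not any standard; the code version string would come from wherever you track versions (a commit hash, for instance).

```python
import csv
import io
import json

def write_run(out, params, columns, rows, code_version):
    """Write a self-describing data file: '#' header lines, then CSV rows."""
    out.write(f"# code_version: {code_version}\n")
    out.write(f"# params: {json.dumps(params, sort_keys=True)}\n")
    for name, meaning in columns:
        out.write(f"# column {name}: {meaning}\n")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([name for name, _ in columns])
    writer.writerows(rows)

def read_params(lines):
    """Recover the parameter dict from the header of a file written above."""
    for line in lines:
        if line.startswith("# params: "):
            return json.loads(line[len("# params: "):])
    return None

# Example with made-up values, written to memory instead of disk:
buf = io.StringIO()
write_run(buf, {"dt": 0.01, "seed": 42},
          [("t", "time in s"), ("x", "position in m")],
          [(0.0, 1.0), (0.01, 0.99)], code_version="abc123")
```

When a bug turns up later, the code_version line tells you which files it could have touched, so you throw away only those.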
Sometimes you have a command, an app, something that makes things trivial, or that is already installed and easy to use, instead of installing a complex heavy app to do the same thing. With a good number of pipes and command-line software installed by default you can do complex processing on a lot of data, or write a fairly trivial Python/Perl script for it. But sometimes you don't know exactly which tool to use, or learning how to do it would take more time than the "non-optimal" way. The priority is to solve problems: if it takes too long to learn how to do it the "right" way, you must first solve it, and then learn how to do it right for the next time.
we could not figure out whether the author was an incredibly elaborate troll or just a run-of-the-mill idiot.
Reading this comment of his reminds me of something I read recently:
Physicists stand on each other's shoulders. Engineers dig each other's graves.
I've never understood why so many software developers feel the need to disparage one another in an attempt to prove their intelligence/superiority. There are plenty of tough problems out there and we all can learn something from one another, no? I've definitely been guilty of this in my tech career but lately I'm wondering more and more, why does the person who has a different solution always have to be an "idiot?" Why isn't he/she just someone who has a different take on solving this particular problem?
Now, I'm not saying that engineers do this more than any other group but out of all of my friends (some of whom are doctors, lawyers, teachers, etc.) it certainly seems like a more common event among software developers.
Resist the urge to start coding and think hard about the problem your program will solve. If it can't be solved without coding, resist again, and design every piece of the program carefully, and, paraphrasing Dijkstra, think about LOCs as lines spent, not lines produced.
You're not that great.
Even if you think you're the best person in your department, there are other departments.
Even if you think you're in the best department, there are other firms.
Even if your organisation's top of its league, empires rise and fall, and so will yours.
There is no silver bullet. You are no Superman. You're not going to change the world.
So shut up, listen and chill. Feel free to do your best, but remember to be nice. Money buys you hookers, but love gives you peace.
I'd like to see this wonderkid take his non parallel code and parallelize it in five years time when he realises his non-parallel implementation doesn't scale.
From reading his other articles, you deal with the lack of scaling once you have a working product that people actually use. The huge majority of the time the load levels never appear to make scalability remotely relevant.
But I'm sure you've built lots of apps with millions of users, right? That's why you're hanging out on slashdot...
flock() isn't POSIX, doesn't have a standard behaviour across Linux/Unix/BSD platforms, can and will deadlock a process against itself, doesn't play well with either fcntl() or lockf(), doesn't work on NFS mounts, but does work on o2cb-based ocfs2 clusters, and God alone knows what the version(s) of Python / Perl / Java on your target system(s) are using to do locking under the hood - and He sure won't have documented it for you.
Doing it the easy way works 95% of the time, but if you don't take the time to find out what's inside that handy black box, you'll have a hard time putting it all back together (often on a customer site, and always at 11pm at night) when it does finally explode on you.
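If you want to look inside the black box, here is a small sketch using Python's `fcntl.flock`. It is Unix-only, the helper names are made up, and the result shown is what I'd expect on a local Linux filesystem; per the caveats above, NFS and other platforms may behave differently.

```python
import fcntl
import os
import tempfile

def try_lock(path):
    """Try to take an exclusive advisory lock; return the open file or None."""
    handle = open(path, "a")
    try:
        # LOCK_NB makes this fail immediately instead of blocking forever.
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle

def release(handle):
    """Drop the lock and close the file."""
    fcntl.flock(handle.fileno(), fcntl.LOCK_UN)
    handle.close()

def demo():
    """Show that a second open() of the same file cannot take the lock."""
    path = os.path.join(tempfile.mkdtemp(), "demo.lock")
    holder = try_lock(path)
    contender = try_lock(path)   # same process, separate open file description
    release(holder)
    retry = try_lock(path)       # succeeds once the first lock is released
    got_retry = retry is not None
    if retry is not None:
        release(retry)
    return holder is not None, contender is None, got_retry
```

Drop the LOCK_NB flag from the second attempt and you get exactly the "deadlock a process against itself" case: the process waits on a lock it is holding through another descriptor.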
If you were blocking sigs, you wouldn't have to read this.
And the percentage of lines of code written on each of those platforms is what exactly compared to the total?
Python frequently does do the job. So does PERL, or AWK or other interpretive languages.
For the record, and I always find this interesting, Jak & Daxter for the PS2 was written in LISP.
- Michael T. Babcock (Yes, I blog)
Put everything in version control. Everything. EVERYTHING!
Well. You could skip /home, but I know a roll back of /etc has saved me a couple of times on config upgrades.
Remember that once code is deleted, you can't get it back. However, version control changes that. Version control is one of the most vital tools for anyone developing/working with a computer.
Oh and git rocks and stuff :)
Penguins can be fascists too
in the case of a 4-player game, the physical machine is the bottleneck. You can't have 4 players simultaneously use the same set of input peripherals
For over a decade, PCs have had USB ports into which four gamepads can be plugged through a hub.
display
Bomberman series, Tetris series, Mario Kart series, GoldenEye 007, Mario Party series, and Smash Bros. series put four players on one display. Sometimes it's split; sometimes all players' characters are in the same room. All six of those have appeared on Nintendo 64, whose GPU is noticeably less powerful than even the Voodoo3-class GMA in a netbook hooked up to a TV. So what bottleneck were you talking about?
I wish I'd known about LISP 25 years ago. Stupid people told me it was "for processing lists." If only I'd known better. Functional programming gives you wings and a jet engine.
I wish I hadn't paid too much attention to people with limited imaginations. Just because they're older, have more money and shout louder doesn't mean they are clever or wise.
C++ is way over-rated but it's worth knowing because it's so widely used. Don't let it distract you from mastering C and learning scripting languages. Understanding object-oriented design is more important than knowing the latest trendy language.
Objective-C.
Just because software is Free/Open doesn't mean it's "cheap" and poor quality. I could have saved myself 2-3 years there.
Ignore Windows and it will ignore you.
Stick Men
Reinventing the wheel unnecessarily and many times over, then forking the project, is part of the fun. This is why Linux as a whole is not wanting for labour hours, but despite this brute-force army of coders in thousands of open source projects all over the world, the year of desktop Linux is always a few years away. Because when it's finished there will be nothing to do.
After logging in slashdot still does not take you back to the page you were on. It's been that way for 20 years.
I've built systems that can handle hundreds of thousands of users, yes thanks.
Yes, because obviously with an intellect as massive as mine it's impossible to ever finish work for the day and sit around posting on the internet. What a truly stunning insight you have had.
Because Linux is NOT "just as easy as Windows".
I don't like M$ products and I don't like Apple's attitude. But, they are both easier to use for non-technical users than Linux. They are even easier to use for technical users, in many cases - for instance, if one wants to use an audio or video peripheral, finding a Linux driver for it often means hoping another Linux user wrote it, or you writing it yourself.
Linux is great and does a great many things, but it's not ready to replace Windows or OS X for the every day business user or at-home consumer.
The Invisible Hand of the Free Market is what punches workers in the nuts.
Look, the PHB usually wants the code to run on HIS server for HIS use only.
Then what's the product? For example, in the case of an online game that runs entirely on the PHB's servers, graphics and all, I don't think every one of the PHB's customers has an Internet connection fast enough to run something like OnLive with acceptable picture quality and latency. And certainly, mobile devices in the United States market don't have the bandwidth, with the typical cellular data plan capped in the low single digit GB per month.
I like to use an interpreted language to make a proof of concept and play around with that for a bit. Then once I feel I have a solid grasp of the problem I'm trying to fix and the solution I want to use I'll do a rewrite in C if performance or other reasons warrant it.
People replying to my sig annoy me. That's why I change it all the time.
Nice.
It must have been something you assimilated. . . .
A few rules of thumb for a startup environment:
1. Don't overengineer! Overengineering wastes time on things that may never be used. Features should be customer driven.
2. Functions and methods should be as small as possible. You should make it an obsession to split methods and functions into the smallest possible components. Only then can you have good code reuse. Don't start thinking I will split it when I need it, you never will!
3. Never ever reinvent the wheel. Reinventing things that exist is overengineering.
4. Don't optimize ahead of time. When I say that I don't mean don't use a hash table instead of an array where it makes sense. I mean don't try to avoid exception handling or function calls or other minor optimizations. If it has an impact on readability don't do it. Optimization always comes last. Often you'll find there are only 1 or 2 "hotspots" in your code. If you spend time optimizing these "hotspots" after your application is built, that's when you'll get the best return on your investment. Another gotcha with optimization is using technologies that can't deliver the level of performance you expect. You should test to make sure the underlying components you plan to use will perform as expected before you start coding.
5. Don't cram as much code in a single statement as possible. Every compiler I know about today will produce identical code whether it's one statement or 5 statements. It makes it hard to read so don't do it!
6. Allocate time for testing. No one writes perfect code. You want to give a good impression to your customer so don't skip this step.
7. Make unit testing an obsession. Always add unit tests for new code; they reveal errors in your code. When you find a bug in your code, add a unit test that reproduces it. Then if someone later decides to rewrite a function or method you wrote because it's not elegant enough, they will not reintroduce old bugs.
8. Don't rewrite code if you can avoid it. Refactoring is almost always easier and less error prone.
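Rule 7 is easier to follow with an example in front of you. Here is a minimal sketch in Python, assuming an imagined bug report ("average of an empty list crashes"); the function `average` and the test names are made up for illustration and are not from any real project.

```python
import unittest


def average(values):
    """Hypothetical helper: arithmetic mean of a sequence of numbers."""
    values = list(values)
    if not values:
        # The fix for the imagined bug: empty input used to raise ZeroDivisionError.
        return 0.0
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_empty_input_returns_zero(self):
        # Pins the bug report so a later "more elegant" rewrite cannot bring it back.
        self.assertEqual(average([]), 0.0)

    def test_normal_input_still_works(self):
        self.assertEqual(average([2, 4]), 3.0)
```

Saved as a module, this runs with `python -m unittest`. The point is only that the test outlives whoever fixed the bug.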
The smart phones are already more powerful than many old PC desktops still creaking on merrily
A lot of people carry a DS Lite or iPod touch and a cheapo feature phone instead of an Android phone precisely to avoid the 70 USD/mo voice and data plan associated with a smartphone.
Like everyone else I went ahead and used third party code and libraries to accelerate the development of my various sites. Everything went as well as possible I guess, until one of my sites was defaced one day. Not because of my code (though by no means I'm pretending my code is perfect, far from it), but because one of these libraries had a security vulnerability and they didn't even have a security mailing list. That vulnerability was big enough that it would show in about every single site that would use this particular library, and was of course exploited very quickly by all the script kiddies in the world (I still see it being scanned automatically from time to time, years later).
Lesson learned: I don't ever use any 3rd party code unless there is a security announcement mailing list somewhere on their site, and even then I'd rather write the code myself if possible. Not because it will be perfect, but because at least my site won't be vulnerable to an automated attack targeting a 3rd party thing I put in there and totally forgot about.
And of course, don't get me started on phpBB and stuff like that, using such apps a few years ago was either having open doors for hackers, or a nightmare of patching.
If you are writing a program that touches more than two persistent data stores, it is too complicated.
Others have already mentioned cases where multiple datastores make sense. A trivial example: One database to handle user data, another to handle blobs (image conversions, etc) -- bonus if the second store can do its own conversions; a third to handle logging -- that's already three, and that's before we start considering things like RESTful services, which can function as intelligent datastores of their own...
If Linux can do it, you shouldn't.
Unless you're not on Linux. And, specifically:
Don't do image processing work with PIL unless you have proven that command-line ImageMagick won't do the job.
If you're doing something that truly works as a shell script, and isn't part of a larger app, I agree. However, PIL likely performs better, and it removes the shell as an issue -- if you thought SQL injection was bad, wait till you have people exploiting your shell commands. You can do it safely, but why would you bother, when you've got libraries that accept Python (or Perl, or Ruby) native arguments, rather than forcing you to deal with commandline arguments? Why do you want to check return values, when you can have these native libraries throw exceptions?
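For anyone who does end up shelling out to ImageMagick from Python, here is a rough sketch of the "you can do it safely" route: pass an argument list and never a command string. The function names are made up for illustration, and it assumes the `convert` binary is on the PATH.

```python
import subprocess


def build_convert_argv(src, dst, width, height):
    """Build an argument list for ImageMagick's `convert` without involving a shell."""
    size = f"{int(width)}x{int(height)}"  # int() rejects non-numeric sizes early
    # Each element reaches the program as exactly one argument; no shell parses it,
    # so a name like "a.png; rm -rf ~" stays a (strange) file name.
    # Caveat: ImageMagick still applies its own argument syntax to file names,
    # so this removes the shell as an issue, not every issue.
    return ["convert", src, "-resize", size, dst]


def resize(src, dst, width, height):
    # check=True turns a non-zero exit status into CalledProcessError,
    # so you get an exception instead of a return value to inspect.
    subprocess.run(build_convert_argv(src, dst, width, height), check=True)
```

That covers the injection and return-value complaints, but it is still a fork/exec per image, which is the other half of the argument for a native library.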
Parallelize When You Have To, Not When You Want To
If you don't at least think about parallelization in the planning stage, it's going to be painful later on. It's easy to build a shared-nothing, stateless architecture and run it in a single-threaded way. It's hard to build a stateful web service with huge, heavyweight sessions, and then make it run on even two application servers in the future. Possible, but awkward, to say the least.
For example, if you are doing web crawling, and you have not saturated the pipe to the internet, then it is not worth your time to use more servers.
...unless, maybe, it's CPU-bound? And this is odd to mention in a section about parallelization -- wouldn't slow servers be a prime candidate for some sort of parallelization, even on a single machine, even if it's evented?
If you have a process running and you want it to be restarted automatically if it crashes, use Upstart.
Cool, but it looks like Upstart is becoming a Maslow's Hammer for this guy. Tools like Nagios, Monit, and God exist for a reason -- one such reason is knowing when and why your processes are dying even if they're spread across a cluster.
NoSQL is NotWorthIt
People who have read my other posts likely know where I stand on this, but...
Redis, even though it's an in-memory database, has a virtual memory feature, where you can cap the amount of RAM it uses and have it spill the data over to disk. So, I threw 75GB of data at it, giving it a healthy amount of physical memory to keep hot keys in...
So you found out an in-memory database wasn't suitable when you have far more data than physical memory? Great test, there.
Redis was an unknown quantity...
Maybe so, but that wasn't terribly hard to guess.
Yes, maybe things could have been different if I used Cassandra or MongoDB...
So maybe you should've benchmarked a NoSQL database which is actually designed to solve the problem you're trying to solve? Just a thought.
especially if something like PostgreSQL can do the same job.
If PostgreSQL could do the same job, the current generation of NoSQL databases wouldn't have been invented. Unless something's changed, PostgreSQL can't scale beyond a single machine for writes, unless you deliberately shard at the application layer, which would violate his rule about multiple datastores, wouldn't it?
It seems like the attitude is to no
Don't thank God, thank a doctor!
eh... "input, output, and logging" as you describe is _one_ data store in the original post, namely the disk cum filesystem. That's why the original article said (paraphrased) "the disk being one data store." If you intelligently apply the rule then disk + database is okay, disk + net is okay, database + net is okay, but disk + database + net is probably unnecessarily complex.
eh (further)... the article presumed linux, and I will skip the *NIX vs MS debate here. The rule espoused is better stated If you have a wheel already don't try to invent it. *NIX vs MS is, in this context, only a difference of how many wheels you have and how well they work together.
eh (further still)... benchmarking subprocess.Popen(['convert', ...]) is a myopic response, foremost because if you are CPU bound then you may have already surpassed the predicate condition on the choice. Since ImageMagick is also a _library_ there is no need to call it through fork/exec anyway. So since you didn't investigate the tools maybe you shouldn't be critiquing their use or offering benchmark advice.
eh (yet further still)... games, at the level you are speaking of, are always CPU bound for rendering. By definition the machine (and the network latency of transferring pre-rendered frames etc) makes the machine the bottleneck in the game if it were monolithic. Your example supports the need for the predicate "if" rather than counters it. That is, you have provided an example of when you _do_ need to distribute the processing load rather than engage in the argument about when you _shouldn't_.
eh (yet again)... you give an example of where you _cannot_ buy your way out of a performance problem rather than discuss a situation where you _could_ buy your way out and then discussing the merits of performing the purchase rather than rewriting the code. That is, you have brought everyone and their car as a citation into a discussion of trans-Atlantic travel. The OP is discussing the use of upgrading to heavy haulers to cross an ocean and you yell "but most people only drive cars" as if it had the slightest bearing on the conversation.
So, having failed utterly to understand the article's points about using existing technology and keeping things manageable, you have made several observations of the "but since sometimes you cannot keep things simple, you should never try to" or some such nonsense. Citing already accounted-for corner-cases in apparent inability to understand the scope of the general case doesn't really add much to the discussion of the general case.
There is a reason the original article was filled with predicates like "if" and "when". Giving examples outside those predicates is, in no way, applicable to the discussion within those predicates.
[snark]Please post your CV so that I can make sure I never hire you to do a requirements analysis...[/snark]
Innocent people shouldn't be forced to pay for inferior software development.
--"Code Complete" Microsoft Press
Be wary of toolkits and APIs that claim to solve difficult problems. They might solve it in the best way for small-scale deployments, but won't scale to fit your needs. ORM is a good example of being careful when choosing an API that claims to solve a difficult problem. Hibernate can quickly complicate traversing foreign-key relationships if it's used as a replacement for knowing how to use a database. In contrast, a team that knows how to do database programming might be happier with a much simpler ORM API.
Following up on the ORM example: A popular ORM technology is no replacement for good database design and a team member (or members) who know how to program a database. Even if you're using an embedded database that'll grow to 40 gigabytes, someone on the team needs to be comfortable programming with it.
Stick with tools that "do one thing and do it well." If the tool does many things, there should be a decent ecosystem that developed the "many" things. jQuery is a great example. It nicely abstracts browser differences and gives a helpful wrapper for dynamic HTML. There's also a healthy ecosystem of community plugins.
Avoid tools that get in the way of how you normally program, except in confined areas that can be refactored without starting from scratch. Spring.Net's aspect-oriented programming plugins for C# can leave lots of layers of auto-generated code on the stack trace, and require lots of additional work outside of normal C# in order to get it to work. Node.js's asynchronous approach is getting a lot of attention, but if your program crashes, you won't have a usable stack trace, and it's difficult to do a try-catch-finally around a non-blocking IO call. In both cases, when new technology brings awkwardness, limit its use to one-off utilities or a small part of a larger whole. This way if the new technology proves too immature, your risk is minimized.
Code generators can be useful timesavers, but their scope should be limited to one or two layers of a program. Code generators that hit all layers of a program can become too inflexible to handle changing needs outside of what were originally anticipated. Likewise, code generators can become so complicated that it's easier to avoid using the generator altogether.
Runtime code generators need to be simple and well-tested. (These typically implement .Net or Java interfaces at runtime, or inherit from classes at runtime.) Bugs in code generators can be difficult to find and fix, because you'll have an incomprehensible stack trace that doesn't lead back to the bytecode generator, and because bytecode programming is quite time consuming.
Don't be afraid to write one or two utilities yourself, even if there are pre-existing libraries. As long as you have a good justification, it's a helpful learning experience.
There's a reason why many different APIs exist that appear to do the same thing. .Net has three different XML handling APIs; each is optimized for something different and introduces valid tradeoffs. Choosing between APIs, utilities, or libraries often isn't a matter of which one is best, but instead is a matter of which will meet the design needs of your program.
No, I will not work for your startup
Macro in C/C++. (That auto code generation.) That sounds like it kind of sucks in Java. (The guy I'm talking about did it in C/C++. To make matters worse, he got his code wrong so he had to fix it 70 times and then retest.)
Did you know 80 to 90% of the moderators on slashdot wouldn't recognize a troll even if one dragged them under a bridge.
They are there to be aggressive advocates for the user. They can also save your arse, because guess who is the one person in any meeting who took minutes when a manic depressive Project Manager is blaming you for not following through on an order that he failed to give.
I just finished a project where the other developer was a noob that thought he was uber and my code was crap.
He lobbied the boss behind my back and they started pushing me away, not including me in design meetings and reducing my coding responsibilities. Meanwhile the "uber" dev overcomplicated the code by throwing factories and inheritance all over the place without any regard to design patterns and coupling....
So yeah, keep it simple but make sure the boss understands your design decisions.
HTML is obsolete. It's time for a new, simpler and richer markup language.
Thank you, thank you, Mr. Ultra Obviousman. Open source usually means closed, archaic, obsolete or entirely missing FM-for-which-it-is-impossible-to-RT.
``Tension, apprehension & dissension have begun!'' - Duffy Wyg&, in Alfred Bester's _The Demolished Man_
Don't do image processing work with PIL unless you have proven that command-line ImageMagick won't do the job.
I think the worst mistake I made as Mr. XEmacs was attempting to unify our graphics support to call ImageMagick libraries instead of the custom stuff we were using (and later restored when ImageMagick was backed out).
Does it work any better now? The last time I looked at display(1) a couple of years ago, it still wasn't close to long lost and patent challenged xv(1) that got shut down by the GIF patent war.
"input, output, and logging" as you describe is _one_ data store in the original post, namely the disk cum filesystem. That's why the original article said (paraphrased) "the disk being one data store."
The database is stored on the disk.
disk + database + net is probably unnecessarily complex.
I have one program that scrapes information from a web site into a database. (I have another that makes reports on the disk.) Does this mean the logging should also be to the database?
Since ImageMagick is also a _library_ there is no need to call it through fork/exec anyway.
In that case, PIL vs. PMW may be a wash. But in context, I understood the gist as the following: if shell is good enough, don't use Python.
games, at the level you are speaking of, are always CPU bound for rendering.
I thought they were GPU bound. But if they're fill-rate bound (as opposed to vertex shader bound), putting four 960x540 pixel windows on one screen won't tax the pixel shaders any more than one 1920x1080 pixel window. And some game styles don't even need four different views: look at Bomberman.
and the network latency of transferring pre-rendered frames
Why would you need to transfer frames over a network? You just display all four players' frames in windows on one monitor. Goldeneye 007 for Nintendo 64 does this. Or you render one frame that's suitable for all four players and again display that on one monitor. Smash Bros. does this.
you give an example of where you _cannot_ buy your way out of a performance problem
I took the article to mean that buying your way out is the preferred solution, and I gave a counterexample.
"but since sometimes you cannot keep things simple, you should never try to"
That is not what I meant. I meant only this: "but since sometimes you cannot keep things simple, you should never force yourself to."
There is a reason the original article was filled with predicates like "if" and "when". Giving examples outside those predicates is, in no way, applicable to the discussion within those predicates.
The problem comes when people read "if A then B", either gloss over the "if A" or forget the "if A" months later, and then blindly apply B without considering A. For example, consider Dijkstra's "Go To Statement Considered Harmful", which refers to "the unbridled use" of GOTO, but people forgot "unbridled". So if !A is a common occurrence, of course discussing how !A relates to B is on-topic.
It's good to design with some speed in mind. Structure your database for the kinds of queries you'll be running.
I've seen people waste effort producing unreadable code to shave nanoseconds off a system that then spends seconds or so waiting for a database to catch up. They worry about the cost of memory allocation or iterating over collections and go through hoops to avoid them, but this is nothing in comparison with the network and disk latency the job they are doing involves. Best optimise code later when you can get a profiler to work out what's really making it slow. Chances are it's your database queries.
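To make "get a profiler to work out what's really making it slow" concrete, here is a small Python sketch using the standard `cProfile` and `pstats` modules. The functions `slow_query`, `tidy_loop` and `handle_request` are stand-ins I made up; the sleep is only pretending to be a database round trip.

```python
import cProfile
import io
import pstats
import time


def slow_query():
    """Stand-in for a database round trip (hypothetical)."""
    time.sleep(0.05)
    return list(range(10))


def tidy_loop(rows):
    """Stand-in for the in-memory work people tend to micro-optimize."""
    return sum(r * r for r in rows)


def handle_request():
    return tidy_loop(slow_query())


def profile_report(func):
    """Run func under cProfile and return the stats sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(10)
    return out.getvalue()


report = profile_report(handle_request)
print(report)
```

In a report like this the time should pile up under the simulated query rather than the loop, which is the whole argument against shaving nanoseconds off the loop first.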
Don't use PIL if you have ImageMagick? If I'm using Python anyway, why add another dependency? If I'm not using Python then obviously I'd use IM. Does this guy actually roll his own for no good reason? Lulz what a geek.
Sure, it was written in LISP, and then the program got bottlenecked on the only programmer in the world who could understand the code, and they had to deal with 15 minute long garbage collection interrupting development and two games later they had to scrap the whole engine because the one guy in the world who understood it doesn't work for them anymore.
if it takes five years for your solution to stop scaling, then you've probably done a pretty good job.
Advanced users are users too!
Hey, you can do everything in assembly!
-- dnl
I wish someone had suggested I and the team I'm in work in a flexible and agile way a long time ago. I'm a recent convert, so excuse my fanaticism.
The manifesto recommends:
Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan
in light of the following principles:
I had to deal with massively asynchronous code. My head would have blown out without proper comments and a flow chart for the code.
Tomorrow is another day...
While it is not impossible to learn and implement good data management/database development practices using Microsoft tools, such a result is seldom seen in the wild.
I've got a funny story that might give you a heart attack.
:-P
I peeked at the helpdesk one afternoon to see that there was a ticket from one of the programmers. One of the web pages they built was throwing a script error and it was likely a "permissions issue." I noticed the browser had a UNC path in it (to the web server, no less) and figured that had to be the problem. So I hop on the web server and drill down to the folder and find the URL for it, pop it in my browser... still not working. I crack the HTML file and get no help there, so I start tracing through it with Chrome... where I find the SQL connection string in the included javascript file and nearly have an aneurysm.
After the programmers assured me that "that's just how it's done there" and not knowing enough VB/ASP nor it being my job, I hopped on the SQL server and added the user group to the stored procedures necessary to get the pages to work.
At least it was using Windows Authentication and not "sa" right in the javascript file
Boot Windows, Linux, and ESX over the network for free.
by .Bruce Perens
Someone didn't pay much attention to Ender's Game. Period belongs at the end of the name. Of course, an underscore might have been better here.
2. Bathe periodically.
I bathe all the time.
5. Parents will eventually charge rent.
So do landlords, incidentally.
7. CTL-C, CTL-V. Took me 15 years until someone showed me that shortcut.
This ought to blow your mind then: CTL-X
8. The sun, it burns.
Funny, I figured that out long before I owned a computer.
9. As far as your family is concerned, "network admin," "programming," "database management," ... to them, it's all "computers."
It's good practice for talking to Management later.
I wish parent had learned about if statements.
Next up is a rework of the database IO codebase so that it becomes feasible to plug-n-play different databases.
Why? Pick a widely used database that works and stick with it. Less work, simpler code, easier to test and a shorter route to maturity.
That was truly a forgettable book. I had a copy and even tried to read it through several times. I cannot recall anything about it other than the title.
That makes me sad in a way, because I usually can read something end to end.
Regarding the quote: It's not that COBOL was such a bad language for its time. It wasn't. Of the three languages COBOL, FORTRAN and Lisp, only Lisp survives (in new code) in anything resembling its ancestor. COBOL survives in legacy code that will probably never be retired. FORTRAN has mutated into something unrecognizable and arguably didn't last.
We can be grateful for FORTRAN because it buried the notion that compilers could never beat hand coded assembly.
We can be grateful for Lisp because it later spawned The One True Editor.
We can be grateful for COBOL because it made mistakes that were so glaringly obvious, no one ever made them again.
Well, you're both right. Business logic belongs in the database--but our databases are too weak when it comes to integrating with general-purpose programming languages, which is the whole reason why we have domain layers. Our relational databases come from the world of COBOL, and it shows in their built-in languages (e.g., SQL, PL/SQL).
We'd be better served by a relational database system that embedded its features inside a good programming language. I'll be bold and sneak in the suggestion that this ought to be a functional language; after all, the relational algebra is a simple kind of functional language.
Are you adequate?
No. He asked if you've built systems that are actually used by millions of users. The first user system I ever wrote could theoretically handle hundreds of thousands of users, but it was only ever used by 7.
Hell, I got an N900 6 months ago, and it's already EOL'd as far as updates to the OS are concerned.
I suppose I could wait till there's a port of Android or something, because the three coders on the project are doing a fine job, in less than a year I'll have the same functionality as a 3210.
NITdroid already provides voicecalls over gsm so it's coming along nicely.
Maemo seems EOLed because Nokia and Intel are focusing on MeeGo, which will be released for the N900 as well.
At least with the N900 you get a platform which is open enough to modify to your liking, rather than being locked in a format which is force-fed from 'central command'.
I wish parent had learned about if statements.
Perhaps I phrased it wrong. Let me try again: "I agree that C++ isn't always the right tool for the job, but then neither is Python. It is still wise to learn C++ because you're likely to need it sometime in your career." Is that phrased better?
I've built systems that can handle hundreds of thousands of users, yes thanks.
Which doesn't change the fact that you're the only one using it, making the scalability redundant. If you read again he said with millions of users, not just able to support them. Some in-house software that will only ever need to support up to 200 users shouldn't care if it can run for 2 million or not. If that suddenly becomes a requirement then there's much more than scalability to worry about.
Not that I agree that scalability shouldn't be thought about. It should, but in the background, unless it's in the client's top 3 requirements. Good, decoupled code will always allow you to come back and optimise for scalability later.
You are a strangely literal person. And your comeback sucked, I hope you don't think you were at all witty.
My main question is, Who the fuck is Ted Dziuba? My next question is, Why should I listen to what he has to say?
A co-worker proudly demonstrated his brilliance to myself and the CEO - spending two days writing a program on linux that swapped bytes to read data from old tapes. That is all it did. His input was piped from dd.
What could I say apart from "that's nice", instead of "why didn't you spend two minutes reading the dd manual and get dd to do it all for you instead of two days on a very simple task - or just ask me instead".
In B4 Persai
There are no karma whores, only moderation johns
There is a reason that everyone writes "hello world". There is a reason that it is actually ok to reinvent the wheel, if only to figure out how the wheel works so you can use it better.
I gotta say I cannot believe this hit /. frontpage, are there no more "do it yourself" programmers around?
So, where exactly did you get that information from and why do you think it is real?
No, just bored.
Hell, I got an N900 6 months ago, and it's already EOL'd as far as updates to the OS are concerned.
I suppose I could wait till there's a port of Android or something, because the three coders on the project are doing a fine job, in less than a year I'll have the same functionality as a 3210.
Or you can read a little farther and see that upgrades are already available: http://en.wikipedia.org/wiki/MeeGo
perl wisdom.pl
That's the wrong tool for the job. Just use a modern file system with snapshotting capabilities, like btrfs.
The Russians have won. They have made the world a cesspool of distrust, greed, fear and hate.
This is insanity! Bloat that code! Abstract those abstractions even more! Pile on the software! More! More! Then blame the hardware when it's slow!
Otherwise how can you keep creating jobs?
Software = planetary cancer. It will keep consuming resources until the host dies.
There are tons of awful, awful people working in programming. Fakers, posers, kiss-asses, sociopaths, narcissists, liars, and otherwise generally crappy people.
Snoop around in the shell histories of any account on the system you have access to.
Look at how other people do things. Some of them are smarter than you. Some of them are more experienced than you. Some of them even have some good porn you can snag.
Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
Yes, obviously. But still, someone would use a distributed multi-machine solution to do something that xargs could easily do on one machine??? That seems like extreme over-engineering. If someone is really using Hadoop to grep through a list of files... they have serious problems that learning about Linux commands will not solve.
1) get requirements from those that will actually use the software (not their boss, not your boss, not the bean counters)
2) specify the requirements
3) document the specification
4) code to the documentation
5) test to the documentation
6) deliver software that works and does what the users want and can be compared to human (user) readable documentation
7) goto 1
1. Don't over-engineer.
2. If the code is no longer used remove it.
3. Write defensive code (assertions, check arguments, etc..)
4. Think of how you're going to debug it in the field (use logs)
5. Write unit tests (Unfortunately I don't do this enough)
6. Think about error handling right from the beginning
7. Think about security right from the beginning
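Items 3, 4 and 6 in the list above can fit in a dozen lines. A minimal Python sketch, assuming a made-up `apply_discount` function and a made-up logger name:

```python
import logging

logger = logging.getLogger("billing")  # hypothetical module name


def apply_discount(price, percent):
    """Return price reduced by percent, rejecting bad input up front."""
    # Check arguments at the boundary so errors surface where they are caused.
    if not isinstance(price, (int, float)) or price < 0:
        raise ValueError(f"price must be a non-negative number, got {price!r}")
    if not 0 <= percent <= 100:
        raise ValueError(f"percent must be between 0 and 100, got {percent!r}")
    result = price * (100 - percent) / 100
    # Log enough context that a field report can be reproduced later.
    logger.debug("apply_discount(price=%r, percent=%r) -> %r", price, percent, result)
    # Internal sanity check: a discount should never raise the price.
    assert result <= price
    return result
```

The `ValueError`s are for callers who pass garbage; the `assert` is for catching your own mistakes during development.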
Some years ago I had these three laws of software development:
I still believe that these rules are the essence of good code.
Open Source Alternatives
Don't write a program called "test" and then try to run it by typing "test" at the console.
Doing this can cost you about half a day of printf() debugging because the program will work when you try running through GDB.
At least I did this only once :-).
Installed the Bubblemon yet?
because nokia and intel are focusing on Meego which will be released for N900 aswell.
Citation please, for my peace of mind.
( concept somewhat stolen from "Corps Business: the 30 MANAGEMENT PRINCIPLES of the US Marines" by David H. Freedman, Forbes senior-editor )
If you don't know the INTENT of the work, how are you going to know if
a) it's the right intent
b) you're solving it the wrong way, or
c) it's overwrought?
This *doesn't* mean one has to over-detail one's documentation...
but documenting the Intent is part of test-first, isn't it?
E.G.
This (whatever) is for PRESENTING the INFO needed FOR PURCHASING,
when UPDATING AN ORDER
and NOT at any OTHER TIMES,
while PASSING-THROUGH their UPDATES/amendations, to the current-orders (whatever).
Documents the Intent, isn't over-detailed *for its level*, and means if someone's reading just these comments, then they can get a GOOD overview, *quick*.
If people won't solve what the intent is,
then I'll bet they are producing buggier-than-necessary code.
Steve McGuire, was it? "Code Complete"...
Don't let bugs IN!
For example, if you are doing web crawling, and you have not saturated the pipe to the internet, then it is not worth your time to use more servers. This guy got me thinking about it, he's doing "Large-scale HTTP fetching" in Clojure. He talks about parallelizing with some queueing silliness, but never mentions how much data is moving down the pipe on any one machine. If you have a 100 megabit connection to the internet, and your fetcher is using 700 kilobits, then figure out why your fetcher sucks.
Erm -- if your process is not taking up all of the bandwidth, AND you have available CPU, grabbing more sites concurrently would use some of that additional bandwidth. I don't know about this specific example, but generally speaking you might be able to max out your CPU without maxing out your bandwidth depending on the type of processing you were doing.
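A quick way to see the "grab more sites concurrently" point on one machine, sketched in Python with the standard `concurrent.futures` module. `fake_fetch` is a stand-in that just sleeps, and the URLs are placeholders; nothing here touches the network.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url):
    """Stand-in for an HTTP GET: mostly waiting, very little CPU."""
    time.sleep(0.1)
    return url, len(url)


def crawl(urls, workers):
    """Fetch all urls with at most `workers` requests in flight.

    Returns the results (in input order) and the elapsed seconds.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fake_fetch, urls))
    return results, time.perf_counter() - start


urls = [f"http://example.invalid/page{i}" for i in range(8)]
_, serial = crawl(urls, workers=1)
_, parallel = crawl(urls, workers=8)
print(f"1 worker: {serial:.2f}s, 8 workers: {parallel:.2f}s")
```

When the work is mostly waiting, more requests in flight means more of the pipe gets used without adding servers. If the per-page processing were CPU heavy, threads in CPython would not help much and you would be back to the CPU-bound case.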
Finishing the article solidified my conclusion. The author ranted about how nosql sucks (not really a programming thing), how to monitor a process (useful, but not programming - because he's not telling you how to do it within a program), that hardware matters for performance (erm, duh...), and then some mumbling about event loops done by taking a quote out of context without really explaining what he was talking about.
(As a side note, I was talking about that post with Milo's prolific systems administrator, and we could not figure out whether the author was an incredibly elaborate troll or just a run-of-the-mill idiot.)
Hmm - I'm sure there's a law similar to Muphry's for this scenario...
When writing code the only thing you should be doing is generating a list of reasons why everything you are doing is not only wrong but total shit.
If you stick with it long enough you may learn and perhaps even somewhat improve the quality of whatever you're working on.
Writing software is no fun.. I only do it because I get paid. Dealing with absurdly inflated egos is also no fun. No chicks in the office also no fun.
From someone who works where the coding "standard" is indentation set to 2 and spaces it's a nightmare. If it were tabs then your indent size would be up to you, but nooooooooooo.
If you don't risk failure you don't risk success.
Just great. Become another programmer who'll call malloc() and memcpy() where a constructor and a copy constructor should be. Become another programmer to introduce buffer overruns for lack of familiarity with C++ strings. Become another programmer who thinks toString() methods are a good idea. (Admittedly the latter applies to programmers who have heard of Java and don't understand C++ streams.)
Something that's worked pretty well for me: write your comments first.
That is, describe with comments what each step is going to do logically. When your logic is sound, fill in the code.
Now, realistically, I'm not quite smart enough to get it right all the time before I put the code down (writing the code sometimes shows my logic errors). But then, engaging in a bit of revisionist history, in the same voice as the original comments works well.
I only 'comment the [actual] code' when I'm forced to write some kind of code that is ugly for performance reasons or it appears to be non-intuitive after being written.
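Roughly what "comments first" looks like once the code has been filled in, as a Python sketch. The function and its data are invented for illustration; the three numbered comments were the outline, and the code under each came afterwards.

```python
def top_customers(orders, limit):
    """Return the `limit` customers with the highest total spend."""
    # 1. Add up what each customer spent across all of their orders.
    totals = {}
    for customer, amount in orders:
        totals[customer] = totals.get(customer, 0) + amount

    # 2. Rank customers from biggest spender to smallest.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

    # 3. Keep only as many as the caller asked for.
    return ranked[:limit]
```

If step 2 had turned out to be wrong while coding, the comment gets rewritten in the same voice, as described above.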
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
Yes, comments are good, but NOTHING beats a good self-descriptive variable, struct, class, enum, function or typedef symbolic name.
Not the "Hungarian notation" crap either - that's too strongly tied into the underlying type, and can cause confusion when you change the underlying type but not the symbolic name.
If you can't come up with a good descriptive name, it may indicate that there is something wrong with your underlying code - that it's either trying to be a "god class" or a "spaghetti monster" or some such.
Nothing worse than seeing test1(), test2(), test3(), etc., along with comments telling you what each test is. Every time you see it, you have to refer back to the comments. Stupid, but people still do this. "Oh, I'll clean it up tomorrow." Tomorrow never comes.
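A rough illustration of the naming point, in Python. parse_port is a hypothetical function invented for this example; what matters is that the test names say what is being checked, so no comment lookup is needed.

```python
def parse_port(text):
    """Hypothetical function under test: parse a TCP port number."""
    port = int(text)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


# Instead of test1() and test2() with a comment above each:
def test_parse_port_accepts_valid_number():
    assert parse_port("8080") == 8080


def test_parse_port_rejects_zero():
    try:
        parse_port("0")
    except ValueError:
        return  # expected: zero is not a usable port
    raise AssertionError("expected ValueError for port 0")
```

When one of these fails, the name in the failure report already tells you which behavior broke.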
I am not inclined to waste much time agonizing over things that will help my replacement do his job better than I did mine. I assumed that the purpose for the "tips" was to help *me* be a better programmer and avoid the mistakes of others. To that end, here are my suggestions, derived from 35 years of experience in the real world:
1. Programmers always bitch about deadlines because they insist on coding to some puritanical design guidelines and attempt to use all the latest features of an OS. I never, ever worried about deadlines and always met them because I trained myself to size my coding efforts to the time available.
2. In the end, when the clock runs out, the *only* thing that counts is whether you have something that works that you can show the client. Excuses will not keep you employed.
3. As you learn new coding techniques in the course of a project, resist the temptation to revisit your earlier code to "clean it up" - at least until you have archived a complete working version of the application. Only then - and only if you have time - should you revisit and improve your code.
4. Never, ever get defensive about your code. There is always someone that can code more stylishly than you or can comment better or knows the API by heart. Instead, accept suggestions with grace, be a good team player when you are part of a team, and concentrate on never missing a deadline. Better to have working ugly code than pretty code that is a day too late.
5. Always jump at the opportunity to be a part of something new, because you will expand your knowledge, your resume, and (usually) your wallet.
6. If you are a team leader, be generous with your time and knowledge, work one-on-one, and avoid organizing meetings that have no concrete agenda and time limit.
7. Have a life and a hobby that does not involve computers. It is important to be able to decompress.
You need to learn to use libraries :P
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
Eh, isn't that the basic difference between a programmer and an engineer?
A good programmer is a good engineer. A bad programmer is not. The false dichotomy between programming and engineering is nothing more than a crutch and a sorry excuse for having incompetent programmers around.
"What's wrong with stored procedures? 1) It means we need to support two languages instead of one, with all that that entails (a proper debugger, expert knowledge, etc)"
Wrong! *You*, not we. Repeat again: data is the company's property, not the application's. You can bet your company's data *will* be accessed by more than one application over more than one era if it's of any use (if it isn't, the app shouldn't be written to start with). It is the application side that will add the need to support more than one language, debugger and set of experts, while the data will still use the same SQL language, the same tools and the same domain experts.
"2) Stored procedures cannot do ALL business logic"
True. Nor is it expected to. Data-bound logic should be kept at the data engine level, if only to be sure every accessor complies with it. Your specific app logic can and should be developed within the application. Sometimes it takes some cleverness to find the boundaries, usually not.
"3) Because you don't want to have to deal with Oracle's horrid error messages anymore than absolutely necessary"
This is an argument *against* your position, not supporting it. The way to be free from Oracle's whatever is to ensure that data management functionality is bound as tightly as possible to the data management engine (encapsulation, you know the concept). That way it will be the DBA's problem, not yours.
Atta boy, you built a strawman that was completely unrelated to the parent post. Also, the strawman was built like a dead horse, then you grabbed your rhetorical e-dick and beat it to death with it. Pat yourself on the back and have a cigar, for it is misshum' accomplished.
yes
you can much more easily inject nasty things into perl or php than into shell.
Have you ever seen a shell executing data? I would need to call "sh" or "eval" to make this work.
On the other side is perl, which calls "sh" without you knowing on things as simple as opening a file.
So sorry, perl is more prone to injections than shell.
Georges
Atari rules... ermm... ruled.
A lot of people apparently don't know what they're talking about because they can use the exact same plan on a smartphone
In the United States of America, it's hard to buy a smartphone without buying a plan. Most people don't know to look online, and online stores charge a restocking fee. And even if you can find one, the price on Google Product Search for an unlocked Android smartphone is often twice that of an otherwise comparable iPod touch. For example, a Samsung Galaxy S is $600 while Apple's iPod touch 4 is $300.
I got started doing CLI image processing with netpbm decades ago, and the approach is still valid. It was designed to work as a CLI program, and is a great solution for batch processing. In addition, you have libraries to manipulate data in the native format, so it's possible to add some functionality which you need by just a small program doing the operation, with all the i/o and many of the transforms done for you.
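To give a feel for how small such an add-on program can be, here is a sketch in Python rather than C with the netpbm libraries, so it is a stand-in for the approach and not the library's API. It inverts a plain-text PGM image (magic number P2) and assumes the input has no '#' comment lines; the function name is made up.

```python
def invert_plain_pgm(text: str) -> str:
    """Invert a plain-text PGM (P2) image. Assumes no '#' comment lines."""
    tokens = text.split()
    if not tokens or tokens[0] != "P2":
        raise ValueError("not a plain PGM (P2) image")

    # Header: width, height, maximum gray value; everything after is pixels.
    width, height, maxval = (int(t) for t in tokens[1:4])
    pixels = [int(t) for t in tokens[4:]]
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match header")

    # The actual transform: one line.
    inverted = [maxval - p for p in pixels]

    # Write the image back out, one raster row per line.
    rows = [
        " ".join(str(v) for v in inverted[r * width:(r + 1) * width])
        for r in range(height)
    ]
    return "\n".join(["P2", f"{width} {height}", str(maxval), *rows]) + "\n"
```

Wrapped in a script that reads stdin and writes stdout, something like this would slot into a shell pipeline next to the stock tools.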
I'm not being a troll, I honestly mean this: those tips/experiences aren't that great. How the heck did someone think this important enough to be put on Slashdot? It's oriented towards someone's very specific experience, which for the most part doesn't translate well to other people.
Just because the U.S. is a republic does not mean it is not a democracy. Democracy/republic are not mutually exclusive.
The trouble with advice is that you don't know it's good until you don't try it.
I see lots of posts here telling programmers they should do this and do that, but the truth can change quite a bit for each developer depending on their situation. Some software will have millions of lines of code and countless developers who, over several years, have written, commented and rewritten the code. The lessons learned by developing, testing, and maintaining that codebase will be completely different for a developer who maintains a codebase that only a handful of developers have ever touched.
Here is my advice.
1. Keep it simple. If it's not simple you didn't spend enough time thinking it through. If you can barely understand it now you'll never be able to figure it out later.
2. Code comments should not explain why or how, only what the intention of the code is. The why is irrelevant and the how will change.
3. Revisit your code from time to time even if you don't have to. Pick what you remember as your best and worst and take a look at them again.
4. Don't stop learning, ever. Keep expanding your knowledge and don't stick to just one language or platform. Having a specialty is fine, but much can be learned from examining other areas.
5. Listen to the advice of senior developers.
6. Make sure you, your superiors, and your clients are all crystal clear on the requirements before any coding begins.
7. There are exceptions to every rule and times when it's prudent to break them, especially these.
10: PRINT "Everything old is new again."
20: GOTO 10
"Don't use Hadoop MapReduce until you have a solid reason why xargs won't solve your problem. Don't implement your own lockservice when Linux's advisory file locking works just fine. Don't do image processing work with PIL unless you have proven that command-line ImageMagick won't do the job. Modern Linux distributions are capable of a lot, and most hard problems are already solved for you. You just need to know where to look."
Ah, this last sentence is what made it a challenge. You get out of school, where most assignments were to code up projects from scratch, writing it all yourself.
Now you are a professional, and for me there was so much preexisting stuff _that_I_was_unaware_existed_.
I knew how to program using the languages and libraries of the language (C, C++, Java). I learned about Design Patterns. Yet it does not seem that there is a way for people to find reference material, or perhaps a catalog, of preexisting libraries or modules, or even other programs that can be used.
I now know Perl has one (CPAN). I didn't know about it when I was new.
I know PHP has one (PEAR). I didn't know about it when I was new.
etc.
So for the article's point about knowing there are things pre-built to use, like ImageMagick: how does the programmer community go about informing each other about creating and using such repositories of preexisting programs? Freshmeat?!
Example: Maybe I can study the source code of Pidgin (I have) and see they are using gtk, and have a library in libpurple of how the program can interact with IM clients.
But are we then supposed to have and find the time to analyse lots of programs, case-by-case, to then ferret out the existence of things we can use that preexist?
I do not think I ever had that much time, unless I never get sleep, and never see my family, and work 20hr days... forever?
Uh, Linux geek since 1999.
...don't be such an early-adopter that you invest so much time in something (like Linux) that it's not (yet) profitable.
I remember seeing Linux consultancies go down the toilet 10 years ago because there wasn't enough of a market for them; perhaps now there is. But I shouldn't have spent so much time, at the time, learning Linux and FreeBSD, when there were piles more money to be made in the Windows & .NET world.
That said, the grass is starting to look greener on the Linux side again, in large part because Google is making it relevant both server-side and on mobile devices, and in part because programming still seems like fun there. In the MSFT world, it's all about slavish devotion to sometimes half-baked tools and half-baked requirements. There's more of a scientific & engineering culture in the Linux camp than the MSFT camp... (And I say that with deep personal and professional experience on both sides.)
Maybe I just want to be a coder in my personal time, and not also my professional time...
Is Capitalism Good for the Poor?
95% of the work programmers do is like designing a building on the blueprints. The only part that is *at all* like actually building a building, is when you press "Compile" at the end, or when you package your app up and burn it on a DVD and ship it to customers in a box. The programming part is all design.
The source code IS the design.
Comments should just explain the assumptions, references for the algorithms, and other miscellaneous stuff that is not obvious to someone who is simply reading the code.
Write your code as simply and clearly as possible. Choose names for functions and variables carefully. Make sure not to give them misleading names. If the code has evolved and the old name no longer communicates clearly what it's for, then refactor the name. Abbreviations should only be used if there is consensus about what they mean (e.g. a prefix for a certain library that is used on all of its macros and functions).
Resist the urge to do "clever" hacks, unless they are badly needed for some reason (like performance) and in that case you better explain your "clever" code with a comment explaining WHY it is the way it is.
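A small sketch of what a "WHY" comment on a clever bit of code might look like, in Python. The hot-path claim is an assumption made up for the example; in real code you would want a measurement behind it.

```python
def is_power_of_two(n: int) -> bool:
    # WHY: a power of two has exactly one bit set, so clearing the lowest
    # set bit with n & (n - 1) must leave zero. Used instead of a loop of
    # divisions because (we assume here) this sits on a hot path.
    # ASSUMPTION: callers pass integers; zero and negatives are rejected.
    return n > 0 and (n & (n - 1)) == 0
```

Without the comment, the next reader has to rediscover the bit trick; with it, they can check the reasoning and the assumption in a few seconds.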
From someone who works where the coding "standard" is indentation set to 2 and spaces it's a nightmare. If it were tabs then your indent size would be up to you, but nooooooooooo.
Tabs for indentation are a good theory, and in the context of a single editor, work fine. However step out of that context and it's a roll of the dice.
Go ahead and set up your editor to the preference of your choice... let's say 3 for this example. Save your code. Use "cat" or "type" from the command line and what indentation do you see? Typically 8 space indents. Bring it up in a different editor and it's a crap shoot. May be configured for 2 or 3 or 4 or just default to 8. Maybe you want to print it from a box that you don't have "your" editor configured on. Again, roll the dice. Want to do a diff on the command line? Probably 8 spaces again. Diff in a tool? Again, roll the dice.
If you indent with spaces, then there's no question: no matter what computer, what application, you will see 3 space indents. Write new code with whatever you want, modify existing code with whatever it was written in. Just be consistent. Tabs don't give you cross-tool consistency.
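The effect is easy to simulate. A quick Python sketch (the helper name is made up) that renders the same line the way tools with different tab widths would, using str.expandtabs:

```python
def indent_width(line: str, tab_width: int) -> int:
    """Columns of leading whitespace once tabs are expanded at tab_width."""
    rendered = line.expandtabs(tab_width)
    return len(rendered) - len(rendered.lstrip(" "))


tabbed = "\tif ready:"      # indented with one tab
spaced = "   if ready:"     # indented with three spaces

for tab_width in (2, 3, 4, 8):
    print(tab_width, indent_width(tabbed, tab_width), indent_width(spaced, tab_width))
```

The tab-indented line comes out at whatever width the tool uses, while the space-indented line stays at 3 columns every time.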
CVS (or any other version control system) lets me create a copy of my analysis routines as they existed at a given date, e.g. the month before a certain conference three years ago. Thus I know that I can recreate exactly whatever I did then. I feel more free to rip out and rewrite things knowing that the old version is never lost. Can the same thing be done by archiving all old versions of routines, keeping careful changelogs, and (depending on your setup) frequently running rsync? Of course, but not as conveniently.
In practice, I give code to people I work with and when they complain about bugs, I fix them. I believe that on the local system, cvs checkin and checkout access is governed by unix file permissions, no server required.
People who don't work in this sort of environment don't understand the differences in mindset. Not many people want easy access to code and documents created over a decade ago, but I find this to be common in scientific work. Why do some scientists like TeX and FORTRAN? Because they can still use their work 20 years later. I figured that CVS would also be around forever.
I won't argue with anyone saying that there are better solutions, but using CVS was easy and good enough.
I couldn't disagree more. Getting into the nuts and bolts of coding first thing and reinventing wheels isn't the way to start out. You need to learn basic code organization and good practices first, and it's hard to do that when you're spending all your energy on memory management or what have you.
Put another way, you can take the 'learn how stuff works' approach you're suggesting as far back as you like. If it's great to know how printf & linked lists work, wouldn't you want to know how your C code breaks into assembly? Then why not take it a step further to machine language, and then how about the processor and registers? What about the electronics? I saw a /. story about cosmic rays affecting memory, shouldn't you know how to troubleshoot that? See, there's really no end to it :). And anyway, as Microsoft has proven time and again, good enough is good enough.
Of course, part of me is also responding to changing times in the US. It's damn near impossible to get the sort of cushy corporate programming job where having general skills pays off. Yeah, if I worked for IBM I'd want to know how to roll my own linked lists in case they handed me a project where I couldn't just pull an OSS library, but IBM isn't hiring in America except at the top end. All the programmer grunt work is in India. Yeah, it's easy to say I should become a superb programmer like you suggest above, but frankly I don't want to work that hard. I'd rather spend my time learning what I need to know to write my own programs and start my own business
Hi! I make Firefox Plug-ins. Check 'em out @ https://addons.mozilla.org/en-US/firefox/addon/youtube-mp3-podcaster/