What Knowledge Gaps Do Self-Taught Programmers Generally Have?
BeardedChimp writes "I, like many others here, have learned to program by myself. Starting at a young age and learning through fiddling I have taught myself C++, Java, python, PHP, etc., but what I want to know is what I haven't learned that is important when taught in a traditional computer science curriculum. I have a degree in physics, so I'm not averse to math. What books, websites, or resources would you recommend to fill in the gaps?"
Being self-taught myself, I think the biggest downside is missing some of the strategies and standards that are taught in the mainstream curriculum (e.g., how to properly use the object-oriented model). Especially when I first started out, my code got the job done, but it wasn't the cleanest or best approach. Luckily, I think it will come to you in time if you focus on improving your code.
testing and architecting (building frameworks) etc.
read your gamma and fowler...
design patterns
metageek
Although in practice most of the time advanced data structures and algorithms are not used, it is useful to study them and implement them yourself at least once. Dijkstra's algorithm, Prim's, Kruskal's, maximum flow, and other basic graph-operating algorithms are a good example.
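For anyone who wants to try the "implement it yourself at least once" advice, here is a minimal sketch of Dijkstra's algorithm in Python. The adjacency-list format and the names `dijkstra` and `example_graph` are illustrative assumptions, not something from the comment above, and the sketch assumes non-negative edge weights.

```python
import heapq


def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable node.

    graph maps each node to a list of (neighbor, weight) pairs.
    Weights are assumed to be non-negative.
    """
    dist = {source: 0}
    heap = [(0, source)]  # (distance so far, node)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical example graph, made up for illustration.
example_graph = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 1)],
    "C": [("B", 2), ("D", 5)],
    "D": [],
}

print(dijkstra(example_graph, "A"))
```

Writing it by hand makes the role of the priority queue obvious: the direct A-to-B edge costs 4, but the route through C costs 3, and the heap is what guarantees the cheaper route is settled first.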
No, I'm not being a smart ass.
Others have slightly different styles and conventions, and ways of solving problems. Working on something like a large open source project could teach you about working on a team, where one person can't "own" a whole part of a program. (And cleaning up others' code will greatly help you learn about documenting and formatting your own code.) One good team assignment we got: each person worked on one part of a program. Away from computers, on a whiteboard or whatever, we had to decide the inputs and outputs between the parts, then code separately. The grade on that assignment was how well the program behaved when the teacher, in front of the class, compiled the separate parts and ran them combined for the first time.
What are we going to do tonight Brain?
Design Patterns: common "template" solutions to regularly encountered problems and variations on those problems. Be careful when learning these that you don't fall victim to "when you have a hammer, everything is a nail". Also learn the anti-patterns; Wikipedia has a good list of them.
Algorithms & Data Structures: Analysis, average running time Big O is most important, but understanding worst-case runtime is important too. Designing algorithms vs knowing when to leverage an existing one.
The C++ standard library provides a great many of these: it has a high-efficiency sort (std::sort, from <algorithm>) and good collection data structures (vectors, linked lists, maps, etc.).
Object-Oriented Analysis and Design: Knowing when to make something an object, when and how to use inheritance and polymorphism, when not to make something an object. Plain old data objects. Separation of responsibility: UI is not logic, logic is not UI.
Threading: proper thread synchronization techniques (mutexes, semaphores, conditions, etc.), threading patterns such as Producer-Consumer, inter-process communication
Automata & Computability: (Deterministic|Nondeterministic) Finite State Machines, Regular Languages, Turing Machines
Programming Languages: LL language parsing & rules authoring.
Computer Architecture: Processor design, pipelining, caching, function calling conventions, etc - how to use this knowledge to write more efficient programs
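As one concrete instance of the Threading item in the list above, here is a rough Python sketch of the Producer-Consumer pattern using the standard library's thread-safe `queue.Queue`. The function name `run_pipeline`, the squaring "work", and the queue size are all made up for illustration.

```python
import queue
import threading


def run_pipeline(items, num_consumers=2):
    """Square each item using one producer thread and several consumers."""
    work = queue.Queue(maxsize=4)   # bounded: the producer blocks when it is full
    results = []
    results_lock = threading.Lock()  # protects the shared results list
    sentinel = object()              # tells a consumer there is no more work

    def producer():
        for item in items:
            work.put(item)
        for _ in range(num_consumers):
            work.put(sentinel)       # one stop signal per consumer

    def consumer():
        while True:
            item = work.get()
            if item is sentinel:
                break
            squared = item * item
            with results_lock:
                results.append(squared)

    threads = [threading.Thread(target=producer)]
    threads += [threading.Thread(target=consumer) for _ in range(num_consumers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)  # arrival order is nondeterministic, so sort


print(run_pipeline(range(10)))
```

The queue does the synchronization between producer and consumers; the explicit lock is only there to show where a mutex would guard shared state that the queue does not cover.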
Experience helps, but the real killer deal is experience backed by a CS/Eng. degree.
If you know your algorithms and data structures, and have a firm grasp of the architecture of modern computer systems, you'll be way ahead of a depressingly large proportion of people with degrees in CS that come past me in interviews.
The most informative and entertaining book I can recommend on algorithms is Bentley's "Programming Pearls".
"Skill shows through where genius wears thin." -Wittgenstein || Religion: uniting aviation and architecture.
What gaps do schooled programmers have that self-taught programmers don't? While a self-taught programmer might go about getting the job done differently, I can almost always count on him to get it done. Programmers coming out of school often still have a horrible work ethic, especially when compared to their self-taught peers. Granted, I have very limited experience, so I wouldn't cast that judgment over all of them, but I would be curious to hear what others think.
I have a CS degree from a major university. I have to disagree with most of the comments I've seen so far. Things like design patterns, proper object modeling, even advanced data structures and algorithms can be picked up on your own with a bit of effort as you need them, and experience building real software that is used in production is the key to honing those skills.
IMHO there are two things that I got from school. The first is how to properly analyze code (in terms of processing time, memory usage, ...) so that I could accurately predict how it would behave under different conditions (especially when some of those conditions can't easily be tested). The second is an introduction to a large swath of computer science terms and facets, so that years later, when something comes up, I have a faint understanding of where to start looking.
The code quality, design, and ability to apply [insert new hot term of the day] correctly all come from real-world experience. And I do think you have to get that experience in a professional setting (I would consider much of the open source world professional, just FYI); hobbyist work just won't let you grow the way you need to.
I've been doing this for a few years and the one gap I'm seeing more and more of doesn't actually have anything to do with programming techniques, "design patterns" or anything else that's hugely technical. All of these things are pretty well-known and accepted by everyone, and you can always be sure that there'll be someone around pushing one or another of them as the be-all and end-all of Programming.
The one gap you might have as a self-taught programmer is in fact in the _history_ of computer science. There's a lot of stuff that has happened and in fact people keep finding and solving the same problems, never realizing that the problems have been encountered and solved many times. (An example that's particularly relevant to me at the moment has to do with extent-based file systems; ext4 has extents and so do a number of new file systems. Great idea, right, particularly for large file systems? Thing is, extent-based file systems have been used at least since the 70s in mainframe operating systems. Odd that it took 40 years to get it into Unix.)
But don't feel bad that your self-teaching has skipped the history of computing. It appears that most university computer science programs neglect that little bit of background as well, in favor of jumping straight into C++ or Java.
Maybe I'm an old fart but that half-semester of history I took back in 1981 made a small but significant improvement in my ability as a software engineer.
I've found that self-taught programmers can actually be quite productive. However, I've noticed (in general) the following deficiencies, which I think are both rooted in the fact that the need to memorize seemingly arbitrary facts about a system is inversely proportional to the depth of understanding of that system (see graph):
-Design Patterns (noted earlier by others): There is a tendency of self-taught programmers to follow a design pattern more doggedly than others. This can be tied back to the fact that for the self-taught a particular design pattern represents what programming is to them. They memorize a series of facts that support the design pattern they use rather than understand the nature of a design pattern itself. They tend to have steeper learning curves when presented with new structures and design patterns because using a new design pattern requires the abandonment of the facts they've memorized and starting anew with memorizing a new set of facts.
-Adaptability: Self-taught programmers tend to reach a certain level of comfort with technology (i.e., languages/libraries/etc.) and attach themselves to it. The thought of using a different language, library, or system is daunting (or even aggressively resisted) since, again, changing requires a new memorization of facts around the technologies (see graph).
Much of what you should learn formally from a CS degree is WHAT a programming language is or WHAT a design pattern is, not merely HOW to program or HOW to use a particular design pattern.
That said, there's nothing stopping a self-taught individual from learning these things on their own. It's just that when you're teaching yourself a trade you, naturally, immediately (and sometimes exclusively) focus on things that allow you to compete on a particular level or with a particular technology. Learning design patterns or what programming is in the abstract doesn't seem to have an immediate payoff (clients aren't going to ask you about those things). But they are skills which allow you to be competitive across technologies or design patterns which is especially important in the rapidly changing world of computers.
Faith is a willingness to accept something w/o complete proof and to act on it. Reason allows you to correct that faith.
As a self-taught PHP and C# Developer, the biggest trouble has already been outlined as limited exposure to new concepts. The bigger question, however, is how to gain exposure.
#1 - User Groups I personally don't attend user groups because I have 2 jobs, and 2 kids, however, the Ruby community has shown again and again that it works, not just for the new stuff, but for the old stuff. They just overhauled Rails and as long as the community keeps talking, they'll do it again and again to perfection.
#2 - Contracting It's a large assumption, but if you have the time to learn a language, you've got time to find small contracts, and hopefully ones that will introduce you to new people with different foci (focuses?). Also, digging into other people's code helps you think outside of the structures that you taught yourself - you might even get some extra cash. Check out Craigslist, Elance, etc.
#3 - Open Source Not as good for your wallet up front, but if you think you have a unique perspective that is applicable to an existing project, donate some code. Bug fixes are just as valuable as new features.
#4 - Publications I use this in the loosest sense of the word possible. I "camp" PHP.net because there are new functions popping up all the time. Their search database is fairly decent, so when you're thinking like a PHP dev, put a word or two in and see what pops up. MSDN isn't too bad either, but the naming conventions vary, and it's so large that simply searching for keywords is a challenge (they have an "OrderedDictionary", but not a UniqueList...?)
#5 - Inspiration (& Perspiration) Nothing develops without the willpower and simply getting things done. Going back to #3, you can simply start your own project or feature. Lots of things are pluggable these days, and if your desired functionality doesn't exist, don't cry about it - build it! PHP doesn't have events, because events don't make a lot of sense on the Web.... HOWEVER, if you're writing a PHP-JS-AJAX framework, then they make a LOT of sense. No one says you HAVE to release your code either... managing a repo is a lot of work. The point is to build something, find the pain points, then ask yourself "Is there a better way to do this?" Find the better way, build it, and make your life easier... then share it if you can.
I'd say that formally taught programmers may not have much experience with maintenance programming, especially with legacy systems that have been running for years. They are used to 'blank sheet' programming assignments that allow them to control the entire project from start to finish. They don't have to deal with code that has been modified dozens of times over the years, often without much documentation.
I got my professional start in programming doing maintenance programming under the supervision of a senior programmer who had spent years working with the systems as they evolved. While I had done some programming in college, all 'blank sheet' stuff, doing maintenance programming was much more educational. You had to make sure that things were done right, otherwise you could cause big, real world problems.
You almost certainly already have some grasp of Complexity Theory since it governs why e.g. mergesort is faster than bubblesort. I personally found it a somewhat dull topic but it is probably worth delving into a bit for "self improvement" purposes.
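To put rough numbers on the mergesort-versus-bubblesort point, here is a small Python sketch that counts comparisons in both. The function names are invented for this example, and the bubble sort is the plain textbook version without an early-exit check, so it always performs n(n-1)/2 comparisons.

```python
def bubble_sort_comparisons(values):
    """Sort a copy of values with bubble sort; return (sorted, comparisons)."""
    data = list(values)
    comparisons = 0
    n = len(data)
    for i in range(n - 1):
        for j in range(n - 1 - i):
            comparisons += 1
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
    return data, comparisons


def merge_sort_comparisons(values):
    """Sort a copy of values with merge sort; return (sorted, comparisons)."""
    data = list(values)
    if len(data) <= 1:
        return data, 0
    mid = len(data) // 2
    left, left_count = merge_sort_comparisons(data[:mid])
    right, right_count = merge_sort_comparisons(data[mid:])
    merged = []
    comparisons = left_count + right_count
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged, comparisons


reversed_input = list(range(200, 0, -1))
_, bubble_count = bubble_sort_comparisons(reversed_input)
_, merge_count = merge_sort_comparisons(reversed_input)
print(bubble_count, merge_count)
```

On 200 elements the bubble sort makes 19,900 comparisons, while the merge sort stays well under 200 * 8 = 1,600. Doubling the input roughly quadruples the first number but only a bit more than doubles the second, which is the whole argument in miniature.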
Functional programming is worth playing around with. US universities tend to focus on Lisp, I think. ML and Haskell are often used in the UK and have a very interesting type system (proponents say that it's about the most advanced one out there) that it's also worth being aware of. Haskell is also a lazy language, which is interesting although you're unlikely to encounter it anywhere else! Some of my ML programming course dealt with how to build lazy data structures without explicit language support, which was potentially a useful technique.
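The trick of building lazy data structures without language support usually comes down to wrapping the "rest" of a structure in a function that is only called when needed. Below is a hedged sketch of that idea in Python rather than ML; `Stream`, `integers_from`, `smap`, and `take` are hypothetical names chosen for this example, not from any course or library.

```python
class Stream:
    """A lazy linked list: the head is known, the tail is computed on demand."""

    def __init__(self, head, tail_thunk):
        self.head = head
        self._tail_thunk = tail_thunk  # zero-argument function producing the tail
        self._tail = None
        self._forced = False

    @property
    def tail(self):
        # Evaluate the thunk at most once and cache the result (memoization).
        if not self._forced:
            self._tail = self._tail_thunk()
            self._tail_thunk = None
            self._forced = True
        return self._tail


def integers_from(n):
    """An infinite stream n, n+1, n+2, ... that never builds more than asked."""
    return Stream(n, lambda: integers_from(n + 1))


def smap(fn, stream):
    """Lazily apply fn to every element of a stream."""
    return Stream(fn(stream.head), lambda: smap(fn, stream.tail))


def take(stream, count):
    """Return the first count elements of a stream as a regular list."""
    out = []
    while count > 0 and stream is not None:
        out.append(stream.head)
        count -= 1
        if count:  # do not force a tail we will never read
            stream = stream.tail
    return out


print(take(smap(lambda x: x * x, integers_from(1)), 5))  # prints [1, 4, 9, 16, 25]
```

The infinite stream is safe to define because nothing past the requested elements is ever evaluated, which is the behavior Haskell gives you by default.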
Others have mentioned design patterns. I guess it's worth looking at those since even though you might instinctively know some, it's easier in an interview if you can *name* them so they know you know what you're talking about.
I'd say you need to learn enough mathematics to get an appreciation for what goes into the discovery of an O(nlog[n]) algorithm -vs- its [naive] O(n^2) counterpart.
This is a gap I've seen with self-taught programmers: they didn't take algorithms classes where you analyze algorithms for efficiency and write complicated algorithms for (mostly) academic problems. Even amongst university-taught programmers, most people see this class as a waste of time. I've taken it twice (undergrad and graduate level) and find it helps me in my job.
You could teach people all they need to know about big O and common algorithms in an afternoon.
Sorry, but I gotta call B.S. on that one.
You need YEARS of mathematical training to grok this stuff.
Have you ever tried teaching college level programming to recent American high school graduates?
I have had young adults [some already with bachelor's degrees who were coming back to school to brush up] who couldn't reliably compute anything in Base-16 [hexadecimal].
They need the better part of a decade's worth of intensive mathematics training to get to the point that they could really grok the difference between what goes into a "slow" O(n^2) algorithm and its "fast" O(nlog[n]) counterpart.
And let's face it, lots of people doing basic HTML or VBA [Visual Basic for Applications] probably don't have sufficiently high IQs to make that transition.
And even if they do have sufficiently high IQs, then summoning the self-discipline [not to mention just the spare time] to tackle this stuff is going to require a really formidable application of the will.
Which is not to say that it can't be done, but the odds are definitely stacked against them.
That kind of thing isn't necessary in most programming. Niche knowledge ftw.
I disagree. Maybe you don't need to understand FFTs in your line of work (I don't), but analyzing algorithms for efficiency is absolutely a real-world skill.
If you don't at least understand these concepts, then you don't understand why one algorithm is crap in the real world and another algorithm is preferred. If you don't understand why one algorithm is crap and another algorithm is good -- even though both provide the correct results -- then you have no business writing code professionally. This is not just an academic exercise. I have seen programmers (and I'm using the term very, very loosely here) who could not get the concept that nesting for loops -- while conceptually simple -- was an exceptionally poor way to do what they were trying to do. Then they couldn't understand why all our users were complaining that "the network" or "the server" was slow, when in fact the network and the server were fine; it was the developer's piss-poor code that made a snappy network and bleeding-edge hardware appear to be so sluggish.
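A made-up but typical instance of the nested-loop problem described above: finding which IDs in one list also appear in another. Both functions below return the same answer; the names and data are hypothetical, and the only difference is how the work grows with input size.

```python
def common_ids_nested(a, b):
    """O(len(a) * len(b)): for every element of a, scan b from the start."""
    result = []
    for x in a:
        for y in b:
            if x == y:
                result.append(x)
                break  # found a match; stop scanning b for this x
    return result


def common_ids_hashed(a, b):
    """O(len(a) + len(b)) on average: build a set once, then do fast lookups."""
    seen = set(b)
    return [x for x in a if x in seen]


orders = [1, 2, 3, 4]
shipped = [3, 4, 5]
print(common_ids_nested(orders, shipped))
print(common_ids_hashed(orders, shipped))
```

With a few dozen records nobody notices the difference. With 100,000 records on each side, the nested version can need billions of comparisons while the set-based version does a few hundred thousand operations, and that is when users start blaming "the server".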
MCSE? No, sir...I don't do Windows. Yes, I am an idealist. What's your point?
I wanted a multiparadigm language to help me learn different approaches to coding.
And for that, you chose C++? Really?
Not Ruby, not Python, not even Javascript? C++?
learn C++ at an adequate level and then most other computer languages should be easy to comprehend.
Except Lisp. And purely-functional languages, like Haskell. And interesting things like Erlang -- immutable, non-shared memory plus message-passing. And...
Learn one language well, and other languages will be easier to pick up, yes. But calling C++ "multiparadigm" is like calling Java "Object-Oriented". It makes you sound smart, and it even makes sense when you know a little about the subject. Then you learn what the term actually means, and you see a language which actually does that successfully -- Ruby is object-oriented in a way Java can only dream of.
Don't thank God, thank a doctor!