Slashdot Mirror


Ask Slashdot: Best Programs To Learn From?

First time accepted submitter camServo writes "I took C++ classes in college and I have played around with some scripting languages. We learned the basics of how to make C++ work with small programs, but when I see large open source projects, I never know where to even start to try and figure out how their code works. I'm wondering if any of you have suggestions for some nice open source projects to look at to get an idea for how programming works in the real world, so I can start giving back to the FOSS community." Where would you start?

19 of 329 comments

  1. Pick small libraries/utilities by Nethemas the Great · · Score: 4, Insightful

    As a general rule, the more lines of code a project has, the harder it is to get started. Just find a small library that provides support for something you have an interest in. Tinker with it.

    --
    Two of my imaginary friends reproduced once ... with negative results.
  2. Re:The kernel by kthreadd · · Score: 5, Insightful

    Off you go

    Oh yeah, Good ol' BSD kernel. The best one in town.

  3. Really good question. by Georules · · Score: 5, Insightful

    I have often wondered the same thing. People tell me, "read the code and submit patches!" It may sound like hand-holding to experienced developers, but many new coders could really use an introduction to becoming a part of a community around a project.

    1. Re:Really good question. by Nethemas the Great · · Score: 3, Insightful

      If you really want to help, start with QA testing and filing bug reports. Graduate to identifying the bug in the code (and reporting your findings). Graduate from that to actually fixing these bugs and submitting the fix. Not only will you be helping the project, but in the process you will be making connections and establishing yourself with the development team. Very few groups will give you the time of day if one day you--a total unknown--just happen by and drop a bunch of code in their lap.

      --
      Two of my imaginary friends reproduced once ... with negative results.
    2. Re:Really good question. by LateArthurDent · · Score: 4, Insightful

      I have often wondered the same thing. People tell me, "read the code and submit patches!" It may sound like hand-holding to experienced developers, but many new coders could really use an introduction to becoming a part of a community around a project.

      I wouldn't call myself an open-source developer by any means, but I've submitted patches to open source projects on occasion, and it wasn't too hard, even back when I had no experience with any large program. The trick is in the approach. Here's my recommendation:

      Don't just download the code and start reading trying to figure out how everything works. That's when you drown in too much information, become frustrated, and decide you can't do it. It's a large, complex program. If you don't have a purpose, you can't navigate it. Instead:

      1. Find an open source program which you use and like. This helps keep you interested.
      2. Pick some small bug that annoys you, or a small feature you wish your pet program had. Emphasis on small here: you don't want to commit yourself to rewriting large portions of code. First, that would be overly challenging; second, the main developers of the project are unlikely to trust a huge infusion of code from someone who never contributed before, unless you can show that it really kicks ass. That's going to lead to a lot of talking back and forth, when you really just want to code.
      3. If your program uses an issue tracker, go there and see if your bug / feature is listed, and if anyone else is working on it. If so, you can post there and offer to work with that person, if they're willing to help you out. This can also save you headaches, as the posts might explain that the simple bug you've chosen has an underlying complex cause which makes it a hard-to-solve problem.
      4. Try to find the location in the code responsible for the small area which you want to change. Knowing your way around gdb or a frontend to gdb can be helpful here.
      5. If you start getting lost in the code and can't find what you need, contact a developer, tell them what you want to work on, and ask if they can point you to where in the code you should be looking. I generally find that it's a lot easier to ask a developer a specific question about a specific problem than a generic, "how can I help out?" The latter will typically get you a response such as, "check out the issue tracker, pick something, and go for it." It's a good answer, but it feels daunting for a beginner. So contact the developer with a purpose and specific questions, and they'll generally be extremely helpful in guiding you through your problems. If you've demonstrated that you tried to read the code on your own first, they'll also be much more likely to take the time to offer you more detailed guidance.
  4. Start with slashcode by Anonymous Coward · · Score: 3, Funny

    And then do the opposite.

    Works every time.

  5. Easy! by aglider · · Score: 4, Informative

    For C++ I would suggest Qt.
    For C I would suggest Minix3.

    --
    Sent as ripples into the electromagnetic field. No single photon has been harmed in the process.
  6. From stuff I've seen... by ilsaloving · · Score: 3, Funny

    If you want examples of 'real world' programming, take a bowl of spaghetti, add some additional ingredients that you wouldn't normally expect to see in spaghetti, and then fling the whole thing against a wall.

    That's what the vast majority of modern day code looks like, especially if the organization that wrote the code tried to outsource the development effort 'to save money' at some point during the dev cycle.

  7. The code *you* wrote a year ago ... by perpenso · · Score: 4, Insightful

    Besides looking at the code of others, be sure to look at the code you wrote a year ago and haven't looked at since. You should learn something about good comments and documentation. You probably will have ideas on how to better implement things. There is some truth to the notion that programmers don't really like the code they wrote for a project until they have thrown it out and rewritten it from scratch for the third time.

  8. Your Own by clinko · · Score: 3

    A tip I always give:
    1. Start writing something you want. (It'll keep you interested)
    2. Google the SMALLER hard parts (String Parsing, data models, misc functions, etc)
    3. Use that code. (No one is going to blame you for copypasta on your own project.)

    Eventually you'll understand how the copied code works. After a few projects you end up writing your own version because you're better than "that guy you copied from".

  9. Re:The kernel by serviscope_minor · · Score: 3, Informative

    Um the "kernel" (by which I assume you mean Linux) is not written in C++.

    It should be, but it isn't.

    I mean, it's full of objects with derivation and virtual functions, and structs on which constructors and destructors have to be called for everything to remain in one piece. Seems odd not to use a language which is every bit as efficient, has a familiar syntax and yet does a large number of common tasks automatically and without errors.

    Oh, and the other thing is that Linux has the vtable inside the classes rather than a vptr, presumably because the syntactic overhead of a vptr is too high. C++ is by default significantly more memory efficient in this regard.

    --
    SJW n. One who posts facts.
  10. Good luck with that by ultranova · · Score: 4, Insightful

    I took C++ classes in college and I have played around with some scripting languages. We learned the basics of how to make C++ work with small programs, but when I see large open source projects, I never know where to even start to try and figure out how their code works. I'm wondering if any of you have suggestions for some nice open source projects to look at to get an idea for how programming works in the real world,

    I think you already do.

    This is the difference between C and C++: in C, whatever the code of a function says it does, it does; in C++, whatever the code of a function says it does is subject to being changed by templates, operator redefinitions, etc. Because of this it is impossible to make small changes without reading and understanding the entire codebase first.

    Basically, if you want to get involved in a large C++ project, you either have a tour guide or very good documentation or make the huge investment of learning the entire superstructure of the program before making any changes to any part of it. It's kinda interesting how C++ encourages this kind of greater dependency between different parts of a program than C.

    --

    Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

  11. Re:The kernel by BenoitRen · · Score: 3, Insightful

    I mean, it's full of objects with derivation and virtual functions, and structs on which constructors and destructors have to be called for everything to remain in one piece. Seems odd not to use a language which is every bit as efficient, has a familiar syntax and yet does a large number of common tasks automatically and without errors.

    The special treatment that C++ gives to constructors and destructors makes things harder, though. They don't return any value. If the constructor fails your only option is to throw an exception. But C++ exceptions make code execution slower. Another alternative is to check the object through a method after construction, which a lot of STL objects do, but that's kind of messy.

    Don't get me wrong; I program in C++. But this is one of the dilemmas that I've faced and researched, and I still don't know what to do about it.

  12. techniques by lkcl · · Score: 4, Funny

    the most important thing is to have techniques that allow you to find your way. the language doesn't actually matter, but it does definitely definitely help if the code is documented in some fashion. i tried, for example, to work on fontforge with my usual techniques, and the code was so incredibly dense and uncommented that it was absolutely impossible to understand. but, exceptions aside, here's a starting point for getting into large projects:

    * use vi. do not use graphical editors. do not use emacs
    * get a damn big monitor (or 2 monitors). open xterms at 80x60, as many wide as you can get.
    * use a multi-window desktop manager (i use fvwm2 and i run a 6x4 grid: that's 24 desktops).
    * be prepared to open (and background) up to 200 simultaneous files, across multiple windows.
    * make sure that you open the files from the *root* of the project.
    * open the files "by name", explicitly, so that you can do "jobs | grep {filename}"
    * run "ctags -R" - it is your friend. then use ctrl-] on a function you don't know, and read about it.
    * remember to use :e # to go _back_ to the file you were originally editing (after using ctrl-])
    * be prepared to print out the ENTIRE codebase, and flip through it, off-line, very very quickly.
    * be prepared to do page-down, page-down, very very quickly, through as many files as you can stand

    the main thing to do is to get a vague map of the code into your subconscious, as quickly as possible. then you will go "i've seen that before..." and you stand a chance of being able to hunt for it and find it.

    you *don't* have to memorise the entire codebase - you *don't* have to even understand all of it. but you *do* need to at least have the techniques which will allow you to jump to wherever it is that you want to go.

    ultimately, though, you need a goal. what, exactly, is it that you want to achieve? if you have no goal, you are pissing in the wind.

    i added NT Domains Security to freedce - that's a good, simple goal. FreeDCE is 250,000 lines of code, and very well laid-out. it was therefore quite straightforward to add 6,000 lines of code to do NTLMSSP. took a couple of weeks.

    i added python bindings to webkit - that's a good simple goal (ok, it was horrendous, requiring over 12 different skillsets, including c, c++, python, perl, autoconf, gtk, python c modules, IDL files parsing - the list just went on and on). webkit is a massive project, and also very well laid-out and structured. the first version of the python bindings took about 8 weeks, and the 2nd (faster, better) version took only 2. the reason why the 2nd version took only 2 weeks is because i hunted down the mozilla xulrunner IDL file parser, hunted down python-gobject's code generator, adapted the xulrunner IDL file parser to understand the webkit IDL file-format (2 days), then spent the rest of the time hacking codegen.py to spew out the data types from webkit, and to create a standard python c module.

    so you say "you don't know how to get familiar with a free software project", well, i am not - i wasn't familiar with webkit, but that didn't stop me. i wasn't familiar with xulrunner, but that didn't stop me. i wasn't familiar with python-gobject's codegen, but that didn't stop me. i just got on with it, and just trusted that the surrounding code would do its job, and trusted that the bit of code that i picked up could be adapted.

    so in many ways, tackling a large codebase is more about overcoming your own fear and feelings of inadequacy. sometimes not even i can do that, and sometimes i can.

  13. Re:The kernel by b4dc0d3r · · Score: 3, Insightful

    I finally found you. I hate you. Not personally, but this kind of thinking. A TCP class raises a disconnected exception, the stream class raises an interrupted exception, the object class raises an error exception, and the application says "There was an error." What kind, and how do I fix it?

    OOP error handling and code reuse can be done well, but it generally is not. The basic idea of a "return code", giving some sort of information or context about the error, is very important. Even if it's just preserving the exception information to bubble up.

    I've collected probably a hundred Microsoft-specific error messages that don't mean what they say they mean. They add helpful text to say what you might fix, but that's a red herring. There is an underlying error which is caught but not bubbled up, and it leaves the user with little or no idea what to do.

    You have to have the idea, if not the implementation, of returning something to the user.

    And, I take exception to your assertion that exceptions don't make the code slower. Each class wraps its code in try/catch and has to deal with fairly complicated Exception objects in many cases. Did the file open? fopen() returns null, and you can get more information if you want it. OOP says you have to make an exception object and run catch code, and go up the stack and to the exceptions there.

    Code that experiences no exceptions will not be noticeably slower, but code that relies on exception processing to try alternate methods or re-try will be a lot slower. This from someone who looks at C++ code at the (dis)-assembler level.

  14. Re:The kernel by walshy007 · · Score: 4, Insightful

    Um the "kernel" (by which I assume you mean Linux) is not written in C++. It should be, but it isn't.

    There are reasons the kernel doesn't have any c++ in it (link is about git, but same deal for the kernel).

  15. Re:From stuff I've seen... if you mean businesses by b4dc0d3r · · Score: 3, Interesting

    If "the real world" means the corporate world, do this. Take an application you don't care about and don't know how to use, and assign yourself a bug to fix, and give yourself a deadline pulled from a RNG.

    Code is developed this way:

    Start development
    Shrink the team
    Fire all but 1 guy who does all the maintenance
    Bring in contractors
    Add a few people
    Shrink the team
    Fire the 1 guy who knows everything
    Scramble to find someone who knows the application
    Bring in contractors
    Select any step above at random

    I'm not trying to be funny. End result is quirks, inconsistencies, inexplicable code blocks, bugs, performance issues, and all kinds of other bad things.

  16. Re:The kernel by Imagix · · Score: 3, Interesting

    And if the code that *has* to be called to make the object valid fails, how do you prevent that object from being used when it fails? Can't use a return code as the programmer may simply ignore the return code and then blithely try to use the object that is now failing its invariants. With the constructor throwing an exception, at least the code block that the variable was declared in will be exited, causing that variable to no longer be in scope, and thus cannot be accidentally used.

  17. Re:The kernel by emt377 · · Score: 3, Insightful

    Exception enabled code used to be slow, especially in GCC. These days it is much faster.

    Yes, exception-enabled code is fast, until you actually have to process an exception. Merely enabling exceptions typically also doubles the footprint of code, without a single actual exception handler or exception thrown, simply because the compiler has to emit cleanups to unwind each and every stack frame, from each and every scope, in case an exception *is* thrown somewhere. Code to actually work with exceptions then adds on top of this.

    Given the outrageous expense of processing exceptions, anything that resembles normal or non-exceptional should never be handled as an exception. This includes things like TCP FIN (*all* connections end with it), EOF, poll timeouts, event processing (yes, I've seen it done!), not to mention regular C library and syscall error returns. I've seen people wrap things like mkdir() with an error check that throws an exception if it returns -1. Well, the program that used the wrapper of course used mkdir() all over the place just to make sure a directory existed (and to create it if it didn't). So it got an exception in the TYPICAL case. Every call point was then wrapped in an exception handler that, well, did nothing! I consider that borderline incompetent. But I digress.

    On the other hand, the things that really ARE exceptional - heap corruption, out of memory, deadlock detection, out of file descriptors, thread creation failure, unmounted root fs, kernel resource errors, etc, etc - there's nothing to be done about. And any attempt to do anything at all will likely aggravate the problem. In the case of, say, a heap corruption or stack overflow, it's unlikely that attempting to process an exception is going to do anything more than crash. And it serves only to make it harder to debug, because it crashed somewhere in a runtime routine that walks a table with links to procedures to unwind the stack, not the place where it was actually first discovered. You're IMO better off simply calling a panic routine that stops then and there rather than attempting to do anything else that would only aggravate the problem further (possibly leading to real data loss - "oh, a corrupt heap... let's try to save the document before bailing" or similar brilliance).

    Between the two, and given the footprint overhead and the inevitable abuse in the absence of adult supervision, it's best not to use them at all. The sliver of cases between the two where exceptions are useful is so narrow that it's no longer a meaningful formalism.

    Note that you typically don't get away from the error check of a return value. Instead you move it further down the tree, to the position where you conditionally throw the exception, instead of at the return of the function. The difference is there, but trivial.