Slashdot Mirror


Secure, Efficient and Easy C programming

cras writes "Feeling like a bit of a masochist today. First thing in the morning I wrote the Secure, Efficient and Easy C Programming Mini-HOWTO. And since I already spent a few hours on it, I figured I might just as well see what Slashdot people would think about it."

16 of 347 comments

  1. a little short?? by AmigaAvenger · · Score: 4, Interesting
    I think this might be a little bit too soon to have it posted on Slashdot. You only have a few pages covering memory and string handling, and nothing groundbreaking at that.

    It does look like a good start, add a few more chapters and you will be halfway there...

    1. Re:a little short?? by crimsun · · Score: 4, Interesting

      I agree that it's a good start. I would also add in links to vsftpd's design & implementation documentation; see here, here, and here. For what it's worth, MaraDNS's string library is worth examining as well.

  2. Unportable? by Anonymous Coward · · Score: 5, Interesting

    I found strlcat and strlcpy easily ported - simply toss them in the same .c file and dump it into the makefile!

    On a more serious note, why in Bob's name don't these two functions exist, standard, in Linux? IMO, they should be added, and gcc should give deprecation warnings about the use of non-safe buffer handling functions - sprintf, strcat, strcpy, etc. No offense to purists, but screw the standard. I'll sacrifice some portability of software and such for security.
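    For readers who have not met these functions, here is a minimal sketch of the contract that strlcpy() and strlcat() follow: never write more than the destination size, always NUL-terminate, and return the length the full result would have had so that truncation can be detected. The names bounded_copy and bounded_append are hypothetical stand-ins written for illustration; this is not the BSD source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper modeled on the strlcpy() contract: copy at most
 * dstsize - 1 bytes, always NUL-terminate when dstsize > 0, and return
 * strlen(src) so the caller can detect truncation. */
size_t bounded_copy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen;

    assert(dst != NULL && src != NULL);
    srclen = strlen(src);
    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Hypothetical helper modeled on the strlcat() contract: append src to
 * the string already in dst without exceeding dstsize bytes in total. */
size_t bounded_append(char *dst, const char *src, size_t dstsize)
{
    const char *end;
    size_t dstlen;

    assert(dst != NULL && src != NULL);
    /* Find the current length without reading past dstsize bytes. */
    end = memchr(dst, '\0', dstsize);
    if (end == NULL)
        return dstsize + strlen(src);   /* dst was not terminated */
    dstlen = (size_t)(end - dst);
    return dstlen + bounded_copy(dst + dstlen, src, dstsize - dstlen);
}
```

    A return value greater than or equal to the destination size means the result was truncated, which the caller can treat as an error instead of silently overflowing.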

    Oh, and on a side note, you may take my malloc() when you pry it from my cold dead fingers. ;) Eh, I suppose we all have a certain way of doing things that we don't wish to part with. (*points at the unsafe buffer people*)

  3. Definitely useful by ttfkam · · Score: 5, Interesting

    in that folks who use C can avoid common pitfalls. But so much of this seems like it has been tackled by C++. Only C++ did it cleaner. C++ is complex though. So this only leaves (horrors) a higher level language that removes all of these implementation details that lead to insecure programs.

    Do it in a higher-level language first. Make sure your algorithms are clean and efficient. If and only if you see a performance or resource problem do you rework portions(!!!) in C. As a bonus, the higher level language acts as a code template for faster C development.

    Once you are at that point, this Mini-HOWTO will definitely be a great resource to use.

    --

    - I don't need to go outside, my CRT tan'll do me just fine.
    1. Re:Definitely useful by hobuddy · · Score: 5, Interesting

      ttfkam wrote:
      Do it in a higher-level language first. Make sure your algorithms are clean and efficient. If and only if you see a performance or resource problem do you rework portions(!!!) in C. As a bonus, the higher level language acts as a code template for faster C development.

      Amen.

      Kragg wrote (in his reply to ttfkam):
      Prototyping in a higher-level language (c# is easy, java everyone knows) is a superb idea, provided you
      - can release the final product as interpreted, with slow execution speed


      Most programs spend 90% of their CPU time executing 10% of their code. If that 10% is optimized in a low-level language such as C, a large-scale interpreted program can boast performance that's virtually indistinguishable from an equivalent program written entirely in a low-level language. However, there's likely to be a huge difference in programmer productivity.

      As a reference, see this Dr. Dobb's article, which states:
      """ ... 90 percent of the software's running time occurs in only 10 percent of the code. This is the whole basis for virtual memory: Potentially, a program can run at full speed with only 10 percent of itself--or whatever the working set is--loaded into memory at any given time. Unlike that nasty segment stuff, the programmer does not specify any of this in advance. The operating system "discovers" a program's working set on-the-fly, through page faults.
      """

      - can afford the time to port all to C, in which case DO, this is an excellent way to make a watertight C program

      Why port 90% of the application's code to a low-level and less productive programming language, when that 90% will inevitably evolve and require maintenance as the program is utilized in unforeseen ways? I've never written a large program that didn't end up having features added incrementally over a long period of time after the initial release.

      - are happy to learn how to make managed code/vm code call to native and vice-versa (this is far from a trivial problem)

      If it's "far from a trivial problem", you're using the wrong tool.

      Take Python, for example: it's simple to interface between Python and C using Python's C API. Recently, a tool named Pyrex has appeared that makes it almost trivial. Pyrex is amazing.

      Kragg suggested prototyping in C# or Java, but Python surpasses both of those as a prototyping tool. Python is higher-level than C# or Java (and thus better suited to prototyping and/or malleable fusion with C) because it features:
      - dynamic typing ("dynamic", not "weak" like Perl)
      - no obsession with a particular programming paradigm; use procedural, functional, or OO as appropriate
      - high-level data structures built into the language
      - more convenient dynamic code loading
      - interactive development at a "Python prompt" (the value of this cannot be overestimated)
      - no separate compilation step in the edit-test-debug cycle
      - more concise syntax
      - excellent interface capabilities to C (or C++ via Boost.Python, or Java via Jython)

      I suggest that the fusion of a truly high-level (higher than Java-level) language with C is far more broadly applicable than Kragg claims.

      --
      Erlang.org: wow
  4. Re:Voluntary slashdotting by cras · · Score: 2, Interesting

    Well, I didn't link it to the original site, but from the look of this so-called Slashdot effect I think I might just as well have. Come on, I don't see any kind of load at all!

    [cras@foo] ~$ ps ax|grep apache|wc -l
    60

    [cras@foo] ~$ uptime
    20:32:54 up 127 days, 10:58, 56 users, load average: 0.23, 0.41, 0.37

    Those loads were pretty much the same before slashdotting.

  5. data stacks by larry+bagina · · Score: 5, Interesting
    What I haven't yet seen used anywhere outside my own software and some programming languages' internals (e.g. calling Perl code from C), is using a data stack for temporary memory allocations. It has the most important advantage of garbage collectors: allocate memory without worrying about freeing it. It also has a few gotchas, but I'd say its advantages are well worth it.

    The way it works is simply letting the programmer define the stack frames. All memory allocated within the frame are freed at once when the frame ends. This works best with programs running in some event loop so you don't have to worry about the stack frames too much. Here's an example program:

    That sounds a little like the NSAutoReleasePool in Cocoa/OpenStep. Objects use reference counting, when the count reaches 0, they deallocate themselves. When an object is created, it can get added to the most recent pool. When the pool is deleted, it decrements the reference count of all the objects within it, causing deallocation unless it needs to be kept around longer.

    --
    Do you even lift?

    These aren't the 'roids you're looking for.

    1. Re:data stacks by Handyman · · Score: 2, Interesting

      I've seen an especially nice version of these dynamic stack frames in Cyclone. For those of you who don't know it, Cyclone is kind of a type-safe, pointer-safe version of C, mixed with some good features found in languages like ML. It takes only simple transformations to compile C code as Cyclone (mostly having to do with minor pointer syntax differences).

      Cyclone has region-based memory protection, which means that you can't do stuff like return pointers to local variables etc. because it statically checks the pointer lifetime using region tags that are part of the pointer type. Example: you can have a pointer-to-memory-belonging-to-region-foo, where foo is a function or some other static scope, (written sometype * `foo, although the default region tag is usually correct, then it's just sometype *) which can point to heap memory because that's garbage-collected and guaranteed to live at least as long as function foo's memory, but you can't have a pointer-to-memory-belonging-to-region-Heap pointing to a variable on the local stack: if you have a local variable x and take the address, the type of that is pointer-to-memory-in-region-foo, and that type is not allowed to be cast to pointer-to-memory-in-region-Heap because foo's memory doesn't necessarily live as long as Heap does.

      They combined this region-based mechanism with dynamic "stack frames": You can open a dynamic region to open up a new "stack frame" or separate heap of memory bound to a scope in the program, so when an exception is thrown or when you exit the scope the memory is automatically deallocated. The good thing is, you can't go wrong, because the region-based pointer lifetime checking will prohibit you from casting a pointer into that specific region to a pointer into a region with a longer lifetime, so you will never have dangling pointers into such a dynamic region: you will get a compile-time error when you attempt to do this.

  6. Maybe being in the business world by PaddyM · · Score: 3, Interesting

    But I can't follow this HOWTO very well. Does he have a global variable stored in the file with t_push and t_pop so that t_sprintf can use that variable? But if he has a global variable, then all he's really doing is allocating the maximum amount of memory his program will ever need at the beginning, and managing his memory.

    Perhaps working until 4 in the morning on C code has drained my ability to understand.

  7. Devils' Advocate by windex · · Score: 4, Interesting

    Okay, let's preface. This guy has a good idea in the memory allocation department.

    Problem 1:

    It's not easy, nor fast to write. Errors are severe if present and undetected. Code required to be reliable might not be a good place to test this allocation method.

    Problem 2:

    I'm not entirely sure these concepts are very portable outside of GCC. May not be a big deal to most, but uh, multiplatform code is required in some environments.

    Problem 3:

    Any speed increase without massive resource wasting is pure dumb luck during heavy usage, unless used in an application that takes little user input or has limits on the amount of input.

    Just my $0.02.

  8. You just reinvented alloca() by lamontg · · Score: 3, Interesting

    From the FreeBSD manpage:

    ALLOCA(3) FreeBSD Library Functions Manual ALLOCA(3)

    NAME
    alloca - memory allocator

    LIBRARY
    Standard C Library (libc, -lc)

    SYNOPSIS
    #include <stdlib.h>

    void *
    alloca(size_t size);

    DESCRIPTION
    The alloca() function allocates size bytes of space in the stack frame of
    the caller. This temporary space is automatically freed on return.

    RETURN VALUES
    The alloca() function returns a pointer to the beginning of the allocated
    space. If the allocation failed, a NULL pointer is returned.

    SEE ALSO
    brk(2), calloc(3), getpagesize(3), malloc(3), realloc(3)

    BUGS
    The alloca() function is machine dependent; its use is discouraged.

    FreeBSD 5.0 June 4, 1993 FreeBSD 5.0
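    To make the manpage concrete, here is a small sketch of the one thing alloca() is good for: scratch space that is used and discarded within a single function call. The function is_palindrome and the MAX_SCRATCH limit are hypothetical, and the header selection is an assumption (FreeBSD declares alloca() in <stdlib.h> as shown above, while glibc-style systems provide <alloca.h>).

```c
#include <assert.h>
#include <string.h>
#ifdef __FreeBSD__
#include <stdlib.h>   /* FreeBSD declares alloca() here, per the manpage */
#else
#include <alloca.h>   /* assumption: glibc-style systems ship this header */
#endif

#define MAX_SCRATCH 256   /* arbitrary cap so the stack cannot be blown */

/* Hypothetical example: returns 1 if s reads the same backwards,
 * 0 if not, and -1 if s is too long for the scratch buffer. */
int is_palindrome(const char *s)
{
    size_t len, i;
    char *reversed;

    assert(s != NULL);
    len = strlen(s);
    if (len >= MAX_SCRATCH)
        return -1;

    reversed = alloca(len + 1);   /* released automatically on return */
    for (i = 0; i < len; i++)
        reversed[i] = s[len - 1 - i];
    reversed[len] = '\0';

    /* The buffer must be consumed before returning; returning the
     * pointer itself would hand the caller a dangling pointer. */
    return strcmp(s, reversed) == 0;
}
```

    The last comment is the crux of the disagreement in this thread: memory obtained this way cannot be passed back up the call stack, which is exactly what the data stack in the HOWTO is meant to allow.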

  9. Re:Or Otherwise Known as by cras · · Score: 2, Interesting
    How to program in an amazing unthread safe manner!

    But then again, threads are useless for most applications, especially the ones I've written so far. Besides, it's easy to make it thread safe with per-thread data stacks and adding locks to other stuff.

  10. Re:These are common tricks by cras · · Score: 2, Interesting
    I'm not seeing the difference between your data stack and a memory pool,

    You could think of it that way if you wanted. I actually called them "temporary memory pools" before learning it had an existing name.

    A context frame should probably always map to a single function invocation. Or to put it another way, a data frame pushed in a particular function call should always be popped by that same function call. And that kind of defeats the purpose of being able to return stack-allocated data UP the call stack.

    Yes, but there's no need to create a new frame for each function call. You may not need to create more than one frame in the entire program if you know you're not allocating too much memory out of it. That's what makes it better than alloca(). You can do for example:

    t_push(); ret = alloc_some_data_from_stack(); /* do stuff with ret */ t_pop();

    All very simple. Sure, there's still the possibility of breakage, but it's not very common, and you know when you're doing it wrong. Simply forgetting a t_pop() call will be noticed at the bottom-level t_pop(), which would then kill the program: nothing overflowed, but it might have allocated memory excessively.
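    For anyone trying to picture the mechanism, the following is a rough sketch of how such a data stack could work, not the HOWTO's actual implementation. Only the names t_push and t_pop come from the discussion; t_malloc, t_used, t_concat, the fixed-size global buffer, and the limits are all assumptions made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Allocation unit chosen so every returned pointer is suitably aligned. */
typedef union { long double ld; long long ll; void *p; } align_unit;

#define STACK_UNITS 1024   /* arbitrary fixed capacity for this sketch */
#define MAX_FRAMES  32

static align_unit stack_mem[STACK_UNITS];   /* global backing store */
static size_t stack_used;                   /* units currently in use */
static size_t frame_start[MAX_FRAMES];      /* stack_used at each t_push() */
static size_t frame_count;

/* Open a frame: remember how much of the stack is in use right now. */
void t_push(void)
{
    assert(frame_count < MAX_FRAMES);
    frame_start[frame_count++] = stack_used;
}

/* Close a frame: everything allocated since the matching t_push() goes. */
void t_pop(void)
{
    assert(frame_count > 0);   /* unbalanced t_pop() aborts the program */
    stack_used = frame_start[--frame_count];
}

/* Hypothetical allocator: bump-allocate from the current frame. */
void *t_malloc(size_t size)
{
    size_t units;
    void *p;

    assert(frame_count > 0);   /* allocating outside any frame is a bug */
    if (size > (STACK_UNITS - stack_used) * sizeof(align_unit))
        return NULL;
    units = (size + sizeof(align_unit) - 1) / sizeof(align_unit);
    p = &stack_mem[stack_used];
    stack_used += units;
    return p;
}

/* Hypothetical accessor, handy for checking that frames balance. */
size_t t_used(void)
{
    return stack_used;
}

/* A function that returns data-stack memory UP the call stack; the
 * caller never frees it, the enclosing t_pop() does. */
char *t_concat(const char *a, const char *b)
{
    size_t alen = strlen(a), blen = strlen(b);
    char *out = t_malloc(alen + blen + 1);

    if (out == NULL)
        return NULL;
    memcpy(out, a, alen);
    memcpy(out + alen, b, blen + 1);   /* includes the terminating NUL */
    return out;
}
```

    The state here is global, which is why this version is not thread safe as written; one such stack per thread would be needed. A fixed buffer is used only for brevity, and a real implementation could grow its storage instead of returning NULL.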

    In contrast alloca() is a simple manipulation of the hardware stack pointer, which will be automatically undone by the hardware itself at the end of the call frame (on any sane architecture, that is). There's no possibility for abuse.

    alloca() simply doesn't do what I want. I want to return dynamically allocated memory from functions without worrying about freeing it. Data stack and GC are the only possibilities for that.

    Any strlcat(), strlcpy(), etc. don't solve the underlying problem in all string operations, which is making sure you always have enough room.

    I'm not proposing strlcpy() either. I only mentioned them as being much better than strncpy/strncat which they definitely are. I've never used them though.

  11. Re:aggressive use of glib by Anonymous Coward · · Score: 1, Interesting

    And it's very lightweight, fast and efficient.

    But portable??? I'm serious, does it currently run on BSD, OS X, Windows, and Linux? And if so, then how much bloat does it add?

    See, that's the problem.

  12. Jeez, just learn C++ already by Moses+Lawn · · Score: 3, Interesting

    OK, so that subject's a little inflammatory. But my god, I don't see why anybody is still writing new code in C in this day and age. C++ has been a fast, stable, standardized language for what - 10 years now? All the problems with buffer overflows that require hokey, kludgey workarounds in C are cleanly solved with any well-written string library (like, say, the one in the STL). Memory pools can be nicely wrapped with a class that pushes in the ctor and pops in the dtor, so you don't have to remember to call them in the right order everywhere (just declare an object at the top of the block).

    The arguments I've seen against C++ seem to fall into the following categories:

    * It adds bloat and it's slow
    No, not since optimizing compilers were perfected in the 90s. You can add a lot of overhead to your app by abusing the STL, but for non-trivial applications, you'll never notice it. GCC (at least for the pre-3.0 series) has a really unoptimized template implementation, where "Hello, world" using cout would make a multi-megabyte executable (and be forever compiling it), but more modern compilers, like VC++ and Intel's compiler, do a lot better. Either way, for a real-world app, any size increase will be unnoticeable. As for speed, with an optimizing compiler and judicious use of inlining, a C++ program will run just as fast as one written in C.

    These complaints may have been true in the days of the Cfront preprocessor, but not today. I don't know about you, but I no longer write code for a 386 with 4 meg of memory.

    * I don't like/need/want to learn OOP
    You don't need OOP to use C++, but it helps. A class is just a struct where everything's private by default. If you know C, it takes about a day to learn the basics of constructors and destructors, references, and exceptions. Templates and STL will take a bit longer. One great thing about C++ is that you can just use small bits here and there if you don't want a full-blown OO program.

    * It's not as good as Perl/Python/Ocaml/Eiffel/Java/whatever
    That's not the point. It's not supposed to be. It's supposed to be as good or better than C. If you want a standalone-executable without linking in a complete interpreter and you don't need a lot of string parsing or regexps, you were using C anyway.

    * It won't let me write libraries that work with other languages
    Just declare all of your external APIs using 'extern "C"' and make sure they only use C types in their signatures. Done.
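    As an illustration of that last point, this is the usual shape of a header that both C and C++ compilers can consume. The library name textutil and the function textutil_count_words are made up for the example. The __cplusplus guard hides the linkage block from C compilers, so the same declarations work from either language.

```c
#include <assert.h>
#include <stddef.h>

/* ---- contents of a hypothetical public header, textutil.h ---- */
#ifdef __cplusplus
extern "C" {   /* seen only by C++ compilers: turns off name mangling */
#endif

/* Only C types appear in the signature, so any language with a C
 * foreign-function interface can call this. */
size_t textutil_count_words(const char *s);

#ifdef __cplusplus
}
#endif

/* ---- implementation; in a real project this could be a C++ source
 * file that uses classes and templates internally ---- */
size_t textutil_count_words(const char *s)
{
    size_t words = 0;
    int in_word = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        int is_space = (*s == ' ' || *s == '\t' || *s == '\n');

        if (is_space) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;   /* first character of a new word */
            words++;
        }
    }
    return words;
}
```

    C++ exceptions must not be allowed to escape through such a function, since callers written in C have no way to handle them.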

    The main reason not to use C++ in new development seems to be "I don't want to learn it" or "I don't know anything about it". If you use either one, I don't ever want to work with you.

    --

    What if life is just a side effect of some other process and God has no idea we exist?

    1. Re:Jeez, just learn C++ already by Anonymous Coward · · Score: 1, Interesting

      * It's not as good as Perl/Python/Ocaml/Eiffel/Java/whatever
      That's not the point. It's not supposed to be. It's supposed to be as good or better than C. If you want a standalone-executable without linking in a complete interpreter and you don't need a lot of string parsing or regexps, you were using C anyway.


      Why? I know Eiffel, Java, Ada, Common Lisp, and Scheme can produce compiled binaries without linking in anything that could be considered a complete interpreter. I know I moved from C++ to Ada because C++ didn't seem to fix any of the problems with C.