Microsoft To Banish Memcpy()
kyriacos notes that Microsoft will be adding memcpy() to its list of function calls banned under its secure development lifecycle. This reader asks, "I was wondering how advanced C/C++ programmers view this move. Do you find this having a negative impact on the flexibility of the language, and do you think it will restrict the creativity of the programmer?"
Do you find this having a negative impact on the flexibility of the language, and do you think it will restrict the creativity of the programmer?"
You can replace memcpy entirely with memmove (the latter is slightly slower and handles overlaps), and nothing in the article suggests that memmove is banned.
But, no, it shouldn't hurt creativity--they're introducing a memcpy_s, which is the same aside from taking a size parameter for the destination. That's something that is generally easy to track in new code (obviously this secure developement lifecycle is not backwards compatible).
rage, rage against the dying of the light
Someone already explained this better than I could.
No its not. This is only banned under Microsoft's Security Development Lifecycle, which means you only care about this if you're following those set of development guidelines. Its still in the language. And you can always use memcopy_s:
Developers who want to be SDL compliant will instead have to replace memcpy() functions with memcpy_s, a newer command that takes an additional parameter delineating the size of the destination buffer.
you didnt read.
MSFT is banning it from their development process, not the language, use it as much as you like.
Internally to Microsoft, "banned" means that no products can be shipped using these functions. Externally, this is just a recommendation.
Are you high? It already takes a size argument. If this were about strcpy(3), then you'd have a point, but I do not think memcpy(3) means what you think it means.
I'm not saying you can't get yourself into trouble with inappropriate use of memcpy(3), but buffer overruns aren't the go-to threat every time.
Just like removing printf, scanf, and most other copy/string functions. There are safe versions of memcpy that work just fine and are just as easy to use...
There's nothing unsafe about printf (since compilers started doing format type checking), as long as you don't use user input as the format string. To print user input, you use printf("%s", user_input).
strcpy() is unsafe because you don't know how many bytes you are going to be copying. strncpy() is completely safe as long as you aren't brain dead and set the 'n' to the size of the destination buffer (as opposed to strlen(src) which would be brain dead) and then slap an '\0' into the last index of the dest. sprintf, same deal, just use snprintf and tell it the max bytes it can print.
So what's unsafe about memcpy()? You explicitly specify the number of bytes to copy. If that number of bytes is greater than the known size of the destination buffer, then you've got a problem that simply adding a second 'size of dest' paramater to the copy won't fix because you already screwed the pooch on figuring that out now didn't you?
Yes memcpy() doesn't work if src and dest overlap. When that's happening, you typically know about it (you've got some clever in-situ array modification going on) and can use memmove(). memmove(), on the other hand, is equally unsafe if you can't properly specify the number of bytes to copy.
Bottom line: There's no such thing as a "safe" copy in C when we're assuming the programmer can't figure out the destination buffer size.
The enemies of Democracy are
The only "safer" thing about memmove is it can handle moves between overlapped regions. This is only reason to use memmove over memcpy. There is nothing safer about that.
That's physically impossible, even given infinite time. Read up on the halting problem.
However, programming a framework in which we may rule out certain things, for example a process jumping over and altering the OS, is perfectly possible. It just has to be verified through reasoning, rather than testing. The unit testing methodology is really the problem here. You cannot unit test everything.
Don't get me wrong, testing is a good start, but it's no proof of security, and a proof of security, while very hard, is possible. Kudos to Microsoft.
And to expand on the GP for those that didn't RTFA, they replaced Memcpy with a memcpy that forced you to state the size of the destination buffer, which is a constant time operation, and a much needed one. So this only forces C coders to make their code a little more clear.
And when you're being intentionally unclear to the computer in addition to the reader, your code has no place in a secure production setting.
The problem is memcpy returns a void *. If this is dynamically cast, it needs to be checked at runtime and may even be set to a value the programmer never intended (say unsigned 16 bit values instead of unsigned 8 bit characters). It may be an issue with updating the code - say the code was originally written for 8 bit ASCII and got updated to, say UTF-16 (16 bit). A dynamically cast void* doesn't care what the size is, it just shoves the values in the buffer. This may work fine in basic testing even, because you never overflow the buffer with 1-2 characters, and maybe even gets past a QA team, but once you go past 1/2, you've got a buffer overrun.
As I understand it, __restrict wouldn't work in a C++ program using dynamic_cast because it doesn't know the size at compile time (sorry, I'm not sure what is done in C as I haven't kept up with the language, so I have to use a C++ example). My guess is memcpy_s does runtime bounds checking (it isn't specified on the memcpy_s page, maybe the security ref - too busy to read it though).
Not quite "just as bad," in my opinion. Writing outside of your boundary is much more likely to cause problems (overwriting other things) than reading out of bounds. A 40-byte null-terminated string, for instance, wouldn't be hurt by another 40 bytes of heap data, so long as the null terminator was intact. You may still throw an error if it's not your memory that's being read. Just saying... writing (approx. equals) 100% trouble, reading < writing.
There's nothing unsafe about printf (since compilers started doing format type checking), as long as you don't use user input as the format string. To print user input, you use printf("%s", user_input).
%n writes to the stack. It's disabled by default in VS2005 onwards. More at http://weblogs.asp.net/george_v_reilly/archive/2007/02/06/printf-n.aspx and http://julianor.tripod.com/bc/formatstring-1.2.pdf
/George V. Reilly
This is not the first time MS has done this. They have plenty of other standard functions that they have deprecated.
Yes, you read that right. Microsoft is deprecating parts of an ISO Standard all by themselves.
No, Microsoft isn't deprecating "parts of an ISO Standard" - only the standard committee can do that, by marking those parts as deprecated in the next version of the standard. Microsoft has enabled warnings on use of those "unsafe" functions by default, yes, but it is very much not the same thing.
Regarding "all by themselves" part - do you realize that all those "safe" *_s functions are actually covered by an ISO C99 TR?. There's also a FOSS implementation available under the MIT license.
And the warnings are irritating. You can't write a nice cross-platform library without either spewing tons of warnings or having to put in a bunch of #defines to shut the compiler up.
You don't have to use #defines for that purpose, you use compiler flags in your makefile. You'll have to write one specifically for MSVC anyway (since it uses its own Make), so it's not a big deal.
And if you do that, your users get irritated if they depend on these warnings because you just turned them off
That doesn't make sense. If you turn them off for the code of your library, they're obviously not turned off for code of your users - unless you put #defines in your headers, which is obviously a dumb thing to do for many reasons (and I've already explained the proper way to do this above).
Eh? the 'n' in memcpy call is number of -bytes-. not "things" you're trying to copy. it doesn't matter if you give it an array of signed 8 bit characters and copy it over to 32bit unsigned longs... you just specify n to be number of -bytes- to copy.
How can this be confusing?
"If anything can go wrong, it will." - Murphy
Have a look at strlcpy. It's non-standard, sure, having originated in OpenBSD. But it can now be found in the libc of all the *BSDs, Mac OS X, and even Solaris.
It guarantees the destination is always nul-terminated and it makes it easy to check if your destination buffer was short.
So why is strncpy in the banned function list?
I think this is just Microsoft trying to embrace and extend. There's no better way to do that then making most existing C and C++ code invalid. The quickest alternative, of course, is to write it in C# or some other embraced language.
Hypocritically, Microsoft did NOT add memset to the banned list despite it having almost exactly the same problems as memcpy. Why? Almost every MSDN example begins with "memset(somestruct,0,sizeof(somestruct))" and invalidating every MSDN example would probably look bad.
As you pointed out, the size of the destination buffer makes no sense when dealing with pure pointers. Often memcpy is used to move memory around inside larger buffers, which completely invalidates memcpy_s as a safe replacement. memcpy is also often used to copy smaller buffers into larger ones, and accidentally copying the uninitialized (or carefully crafted by some exploit) data that comes after the source object can be just as dangerous. The correct replacement, memcpy_overkill(void *source_object, size_t source_size, size_t source_offet, void *dest_object, size_t dest_size, size_t dest_offset, size_t count) is what they're REALLY looking for, but this is impractical primarily because of the heavy use of context-less pointers (to objects within arrays, or within some other structure; the void * in memcpy's prototype hints at further possibilities) in C and C++.
Obviously you have never made a parser of any kind. Any time you read a file in, or use a data stream (cin, cout, cerr, etc.), and many more situations (printf, aprintf, aprintf, etc. not to mention document editors, web browsers, etc) you need to be able to have a dynamically sized buffer to at least manage the data between states.
Now, if you can always guarantee that you'll read a file with the same exact format every single time you run the application, then yes, you can determine it at compile time.
However, most applications are more like web browsers in that respect - their input changes every time they run; the format changes. (Think of typing in your post - is there any way the computer could have predicted how much text you'd type? Or the server that it would receive that amount of text from you? Or...the list goes on.)
Please turn in your nerd badge as the XmlHttpRequest hits you while you close the page...
Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)
He didn't say "how to make your code safe." He said "how to make your code comply with the safety standards." Rarely are the two the same. It's perfectly possible to safely use memcpy(), just like it's perfectly possible to abuse about a billion other system calls.
You must wait a little bit before using this resource; please try again later.
Or the server that it would receive that amount of text from you?
Actually, yes.
memcpy() isn't a syscall.
Perhaps that was a little bit ambiguous of me. What I was referring to were programming languages which reduce the possibility of error (eg: ADA) and/or which are designed to enforce good programming practice and rigorous standards (eg: Occam).
I consider these to be "secure by design" because they were designed to make the more common security flaws impossible and were also designed to make it possible to validate the software. (Both, if I understand the histories correctly, were linked to military efforts to produce highly robust, highly secure code.)
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)