Microsoft To Banish Memcpy()

No mention of memmove... by pthisis · 2009-05-15 03:32 · Score: 5, Informative

Do you find this having a negative impact on the flexibility of the language, and do you think it will restrict the creativity of the programmer?

You can replace memcpy entirely with memmove (the latter is slightly slower and handles overlaps), and nothing in the article suggests that memmove is banned.

But, no, it shouldn't hurt creativity--they're introducing a memcpy_s, which is the same aside from taking a size parameter for the destination. That's something that is generally easy to track in new code (obviously this secure development lifecycle is not backwards compatible).
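To make that concrete, here is a minimal sketch in C of the kind of check the extra parameter enables. checked_copy is a made-up name, and this is not Microsoft's memcpy_s (whose exact failure behavior is not reproduced here); it only shows the count being compared against the destination size before anything is copied.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the real memcpy_s: refuse the copy when the
 * requested count does not fit in the destination. */
int checked_copy(void *dst, size_t dst_size, const void *src, size_t count)
{
    if (dst == NULL || src == NULL)
        return EINVAL;
    if (count > dst_size)
        return ERANGE;      /* would overrun dst, so copy nothing */
    memcpy(dst, src, count);
    return 0;
}
```

The check is only as good as the dst_size the caller passes in, which is why it helps most when that value comes straight from the allocation.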

--
rage, rage against the dying of the light

Re:No mention of memmove... by pthisis · 2009-05-15 03:53 · Score: 3, Informative

Okay, I'm obviously missing something here. How is having an extra parameter for the destination size any safer? I always thought the third parameter to memcpy was the amount of data to copy, and since obviously it should never be set to anything larger than the size of the destination, how will having the destination size explicitly passed in help any?
That's the error that this is trying to fix. I'm skeptical as to how much this will help; if you're that lazy, you can just set the destination size parameter to the same value as the amount to copy.
But it might be easier to enforce at a code-review level in the organization: destination size always has to be a size tracked based on memory allocation.

--
rage, rage against the dying of the light

Re:No mention of memmove... by pthisis · 2009-05-15 03:58 · Score: 3, Informative

Now developers will write
memcpy_s(dst, sizeof(dst), src, sizeof(dst));
I get the feeling that this is mainly for Microsoft internally developed code which conforms to their security guidelines. As such, it's probably mainly intended to help in code reviews. Still pretty dubious.
Now the coders that have been using something like
MIN(sizeof(dst), bytes_to_copy)
for the last parameter for years will have to change their code.
That fails in the common case of dst being a real pointer (whether it's indexing into a static array or dynamically allocated memory or whatever).
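A small sketch of why it fails, with hypothetical function names: once the buffer is passed to a function it is just a pointer, and sizeof reports the size of the pointer rather than the size of the allocation.

```c
#include <stddef.h>

/* An array argument decays to a pointer, so sizeof here measures the
 * pointer itself, not the buffer the caller allocated. */
size_t size_seen_by_callee(char *dst)
{
    return sizeof(dst);
}

/* Where the array is declared, sizeof still yields the full buffer size. */
size_t size_seen_at_declaration(void)
{
    char buf[64];
    return sizeof(buf);
}
```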

--
rage, rage against the dying of the light

Re:No mention of memmove... by Tony+Hoyle · 2009-05-15 04:54 · Score: 2, Informative

Given that dst is a pointer, sizeof(dst) is generally going to be 4 or 8, and not do what you want.
It's more likely that programmers will just pass len to both parameters, defeating the point. Unless you define a pointer type that contains a length attribute (which wouldn't be a bad idea, but MS haven't done that) you're just relying on lengths being passed around the code being accurate, which isn't any safer.
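For what it's worth, a pointer type that carries its length could look something like the sketch below. struct span and span_copy are invented names for illustration; nothing like this ships with the C library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical "pointer plus length" type. */
struct span {
    unsigned char *data;
    size_t len;
};

/* Copy src into dst only if it fits; returns 0 on success, -1 otherwise. */
int span_copy(struct span dst, struct span src)
{
    if (src.len > dst.len)
        return -1;
    if (src.len == 0)
        return 0;           /* nothing to copy, avoid touching data */
    memcpy(dst.data, src.data, src.len);
    return 0;
}
```

The length travels with the pointer, so the callee does not depend on a separately passed size being accurate.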
A bad programmer will always be a bad programmer. Someone who would use strcpy on user data, or memcpy without knowing the destination length, is the same kind of person that's going to work around this. For everyone else it's just a pain.

Re:No mention of memmove... by Tony+Hoyle · 2009-05-15 04:56 · Score: 2, Informative

memcpy_s(dst, sizeof(dst), src, sizeof(dst));
Whilst that will work, it probably doesn't do what you think.
Hint: dst and src are pointers.

What an idiotic idea. by Anonymous Coward · 2009-05-15 03:35 · Score: 5, Informative

Someone already explained this better than I could.

Re:What an idiotic idea. by iluvcapra · 2009-05-15 05:45 · Score: 2, Informative

The more important step is that it will encourage programmers to actually attempt to track both size1 and size2.

I think you misunderstand his point... the 'size' parameter isn't the number of bytes in either buffer, it's the number of bytes you want to move. Obviously this has a lot to do with the size of either allocated buffer, but it's not the same thing.
memcpy doesn't know what a buffer is-- no, it really doesn't. At its heart, all it does is copy a byte from one pointer to another and increment both, until it's done 'size' of them. There's no requirement that the pointers passed to memcpy be pointers to arrays. They just have to be pointers.
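Roughly, in C, the loop described above looks like this. naive_memcpy is a made-up name and real library implementations copy in larger units, but the contract is the same: two pointers and a byte count, with no notion of a buffer.

```c
#include <stddef.h>

/* Illustrative only: real implementations are far more optimized. */
void *naive_memcpy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;   /* copy one byte, advance both pointers */
    return dest;
}
```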
The idea that 'size' actually corresponds to the number of allocated elements in an array is a leaky abstraction. Obviously you never want to copy more bytes than either buffer has the capacity for, but that's really not the language's responsibility to enforce, because the language has no first-class concept of a "buffer" of known length. By making the function 'safer,' all they're doing is encouraging the leaky abstraction and confusing people who think memcpy's job is to copy "buffers" when all it really does is copy n bytes. These developers who have written memcpy_s may have made it "safer," but at the expense of giving it another error condition that must be tested for, and making the actual workings of the function more obscure.
Let's say I only want to copy the first 3 bytes of src to dest: do I put in "3" for the source length or the destination length? And I'd better test that against the length of both first, because if the function fails, was it because there's only 2 bytes available in dest, or because there's only 2 available in src? Or, if I'm copying from byte 2 of src to dest, do I put "1" in for srcLength, because that's all that's left in the array, or "3," because that's how big the array is? I know it probably should be the first one, but the people who don't know better, the people this function is theoretically trying to engineer against, probably will make the wrong choice, because they have it in their head the array is length "3" and this function wants to know the length of the array, right?

--
Don't blame me, I voted for Baltar.

Re:Python is done by Rycross · 2009-05-15 03:37 · Score: 4, Informative

No, it's not. This is only banned under Microsoft's Security Development Lifecycle, which means you only care about this if you're following that set of development guidelines. It's still in the language. And you can always use memcpy_s:

Developers who want to be SDL compliant will instead have to replace memcpy() functions with memcpy_s, a newer command that takes an additional parameter delineating the size of the destination buffer.

Re:Isn't security the programmer's responsibility? by Anonymous Coward · 2009-05-15 03:39 · Score: 5, Informative

You didn't read.

MSFT is banning it from their development process, not from the language. Use it as much as you like.

Re:No - there are plenty of safer alternatives by Anonymous Coward · 2009-05-15 03:54 · Score: 3, Informative

Internally to Microsoft, "banned" means that no products can be shipped using these functions. Externally, this is just a recommendation.

Re:No - there are plenty of safer alternatives by Anonymous Coward · 2009-05-15 04:16 · Score: 5, Informative

I'd say it's a good move - passing the size of the destination buffer is usually not that complicated.

Are you high? It already takes a size argument. If this were about strcpy(3), then you'd have a point, but I do not think memcpy(3) means what you think it means.

I'm not saying you can't get yourself into trouble with inappropriate use of memcpy(3), but buffer overruns aren't the go-to threat every time.

NAME
       memcpy - copy memory area

SYNOPSIS
       #include <string.h>

       void *memcpy(void *dest, const void *src, size_t n);

DESCRIPTION
       The memcpy() function copies n bytes from memory area src to memory area dest. The memory areas should not overlap. Use memmove(3) if the memory areas do overlap.

Re:No - there are plenty of safer alternatives by Chris+Burke · 2009-05-15 04:19 · Score: 5, Informative

Just like removing printf, scanf, and most other copy/string functions. There are safe versions of memcpy that work just fine and are just as easy to use...

There's nothing unsafe about printf (since compilers started doing format type checking), as long as you don't use user input as the format string. To print user input, you use printf("%s", user_input).
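A minimal sketch of the same rule, using snprintf instead of printf so the result lands in a buffer that can be inspected. format_user_text is a hypothetical name; the point is only that the user's text is passed as an argument to "%s" and never as the format itself.

```c
#include <stdio.h>

/* The user's text is data for %s, never the format string, so any
 * % sequences inside it are copied literally, not interpreted. */
int format_user_text(char *out, size_t out_size, const char *user_input)
{
    return snprintf(out, out_size, "%s", user_input);
}
```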

strcpy() is unsafe because you don't know how many bytes you are going to be copying. strncpy() is completely safe as long as you aren't brain dead and set the 'n' to the size of the destination buffer (as opposed to strlen(src) which would be brain dead) and then slap an '\0' into the last index of the dest. sprintf, same deal, just use snprintf and tell it the max bytes it can print.
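Spelled out as a sketch (bounded_copy is an invented helper name): bound the copy by the destination size, then terminate by hand, since strncpy leaves the destination unterminated when the source is too long.

```c
#include <stddef.h>
#include <string.h>

/* Copy at most dst_size bytes, then force a terminator into the last
 * index so dst is always a valid string. */
void bounded_copy(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return;
    strncpy(dst, src, dst_size);
    dst[dst_size - 1] = '\0';
}
```

With a 4-byte destination and the source "abcdefgh", this leaves "abc" in dst.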

So what's unsafe about memcpy()? You explicitly specify the number of bytes to copy. If that number of bytes is greater than the known size of the destination buffer, then you've got a problem that simply adding a second 'size of dest' parameter to the copy won't fix because you already screwed the pooch on figuring that out now didn't you?

Yes memcpy() doesn't work if src and dest overlap. When that's happening, you typically know about it (you've got some clever in-situ array modification going on) and can use memmove(). memmove(), on the other hand, is equally unsafe if you can't properly specify the number of bytes to copy.
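A typical in-situ case, as a sketch with a hypothetical helper name: dropping the front of a buffer by sliding the rest down. Source and destination overlap, so this is a job for memmove, and the byte count still has to be right.

```c
#include <stddef.h>
#include <string.h>

/* Drop the first k bytes of buf in place. The regions overlap, so
 * memmove is required; memcpy would be undefined behavior here. */
void drop_prefix(char *buf, size_t len, size_t k)
{
    if (k > len)
        k = len;
    memmove(buf, buf + k, len - k);
}
```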

Bottom line: There's no such thing as a "safe" copy in C when we're assuming the programmer can't figure out the destination buffer size.

--

The enemies of Democracy are

Re:"memmove()" is safer than "memcpy()". by Anonymous Coward · 2009-05-15 04:33 · Score: 1, Informative

The only "safer" thing about memmove is it can handle moves between overlapped regions. This is the only reason to use memmove over memcpy. There is nothing safer about that.

Re:No - there are plenty of safer alternatives by FlyingBishop · 2009-05-15 04:56 · Score: 5, Informative

That's physically impossible, even given infinite time. Read up on the halting problem.

However, programming a framework in which we may rule out certain things, for example a process jumping over and altering the OS, is perfectly possible. It just has to be verified through reasoning, rather than testing. The unit testing methodology is really the problem here. You cannot unit test everything.

Don't get me wrong, testing is a good start, but it's no proof of security, and a proof of security, while very hard, is possible. Kudos to Microsoft.

And to expand on the GP for those that didn't RTFA, they replaced memcpy with a variant that forces you to state the size of the destination buffer, which is a constant time operation, and a much needed one. So this only forces C coders to make their code a little more clear.

And when you're being intentionally unclear to the computer in addition to the reader, your code has no place in a secure production setting.

Re:They should go one better... by Creepy · 2009-05-15 05:29 · Score: 4, Informative

The problem is memcpy returns a void *. If this is dynamically cast, it needs to be checked at runtime and may even be set to a value the programmer never intended (say unsigned 16 bit values instead of unsigned 8 bit characters). It may be an issue with updating the code - say the code was originally written for 8 bit ASCII and got updated to, say UTF-16 (16 bit). A dynamically cast void* doesn't care what the size is, it just shoves the values in the buffer. This may work fine in basic testing even, because you never overflow the buffer with 1-2 characters, and maybe even gets past a QA team, but once you go past 1/2, you've got a buffer overrun.

As I understand it, __restrict wouldn't work in a C++ program using dynamic_cast because it doesn't know the size at compile time (sorry, I'm not sure what is done in C as I haven't kept up with the language, so I have to use a C++ example). My guess is memcpy_s does runtime bounds checking (it isn't specified on the memcpy_s page, maybe the security ref - too busy to read it though).

Re:How to easily ... by jdoverholt · 2009-05-15 05:49 · Score: 3, Informative

Not quite "just as bad," in my opinion. Writing outside of your boundary is much more likely to cause problems (overwriting other things) than reading out of bounds. A 40-byte null-terminated string, for instance, wouldn't be hurt by another 40 bytes of heap data, so long as the null terminator was intact. You may still throw an error if it's not your memory that's being read. Just saying... writing (approx. equals) 100% trouble, reading < writing.

Re:No - there are plenty of safer alternatives by George+Reilly · 2009-05-15 06:08 · Score: 4, Informative

There's nothing unsafe about printf (since compilers started doing format type checking), as long as you don't use user input as the format string. To print user input, you use printf("%s", user_input).

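%n writes through a pointer argument, and a hostile format string makes printf take that pointer from whatever is on the stack. It's disabled by default in VS2005 onwards. More at http://weblogs.asp.net/george_v_reilly/archive/2007/02/06/printf-n.aspx and http://julianor.tripod.com/bc/formatstring-1.2.pdf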

--
/George V. Reilly

Re:When will MS learn? by shutdown+-p+now · 2009-05-15 06:43 · Score: 2, Informative

This is not the first time MS has done this. They have plenty of other standard functions that they have deprecated.

Yes, you read that right. Microsoft is deprecating parts of an ISO Standard all by themselves.

No, Microsoft isn't deprecating "parts of an ISO Standard" - only the standard committee can do that, by marking those parts as deprecated in the next version of the standard. Microsoft has enabled warnings on use of those "unsafe" functions by default, yes, but it is very much not the same thing.

Regarding the "all by themselves" part - do you realize that all those "safe" *_s functions are actually covered by an ISO C99 TR? There's also a FOSS implementation available under the MIT license.

And the warnings are irritating. You can't write a nice cross-platform library without either spewing tons of warnings or having to put in a bunch of #defines to shut the compiler up.

You don't have to use #defines for that purpose, you use compiler flags in your makefile. You'll have to write one specifically for MSVC anyway (since it uses its own Make), so it's not a big deal.

And if you do that, your users get irritated if they depend on these warnings because you just turned them off

That doesn't make sense. If you turn them off for the code of your library, they're obviously not turned off for code of your users - unless you put #defines in your headers, which is obviously a dumb thing to do for many reasons (and I've already explained the proper way to do this above).

Re:They should go one better... by Prof.Phreak · 2009-05-15 06:52 · Score: 2, Informative

Eh? The 'n' in a memcpy call is the number of -bytes-, not the number of "things" you're trying to copy. It doesn't matter if you give it an array of signed 8-bit characters and copy it over to 32-bit unsigned longs... you just specify n to be the number of -bytes- to copy.

How can this be confusing?
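For instance, a sketch with 16-bit elements (copy_u16 is a made-up name): the element count gets scaled by the element size before it is handed to memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* memcpy counts bytes, so an element count must be multiplied by the
 * element size. Returns the number of bytes copied. */
size_t copy_u16(uint16_t *dst, const uint16_t *src, size_t count)
{
    size_t nbytes = count * sizeof *src;   /* 2 bytes per element */
    memcpy(dst, src, nbytes);
    return nbytes;
}
```

Pass the bare element count instead and only half of the data gets copied; widen the element type without updating the scaling and the same kind of mismatch appears.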

--

"If anything can go wrong, it will." - Murphy

Re:No - there are plenty of safer alternatives by ZerothAngel · 2009-05-15 06:56 · Score: 2, Informative

Have a look at strlcpy. It's non-standard, sure, having originated in OpenBSD. But it can now be found in the libc of all the *BSDs, Mac OS X, and even Solaris.

It guarantees the destination is always nul-terminated and it makes it easy to check if your destination buffer was short.
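On a platform without it, the behavior can be approximated along these lines. This is a sketch under a hypothetical name (sketch_strlcpy), not the OpenBSD source; prefer the libc version where one exists.

```c
#include <stddef.h>
#include <string.h>

/* strlcpy-style semantics: always terminate (when dst_size > 0) and
 * return strlen(src), so a result >= dst_size means truncation. */
size_t sketch_strlcpy(char *dst, const char *src, size_t dst_size)
{
    size_t src_len = strlen(src);

    if (dst_size != 0) {
        size_t n = src_len < dst_size - 1 ? src_len : dst_size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

The short-buffer check is then a single comparison of the return value against the destination size.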

Re:No - there are plenty of safer alternatives by DamnStupidElf · 2009-05-15 07:04 · Score: 3, Informative

So why is strncpy in the banned function list?

I think this is just Microsoft trying to embrace and extend. There's no better way to do that than making most existing C and C++ code invalid. The quickest alternative, of course, is to write it in C# or some other embraced language.

Hypocritically, Microsoft did NOT add memset to the banned list despite it having almost exactly the same problems as memcpy. Why? Almost every MSDN example begins with "memset(somestruct,0,sizeof(somestruct))" and invalidating every MSDN example would probably look bad.

As you pointed out, the size of the destination buffer makes no sense when dealing with pure pointers. Often memcpy is used to move memory around inside larger buffers, which completely invalidates memcpy_s as a safe replacement. memcpy is also often used to copy smaller buffers into larger ones, and accidentally copying the uninitialized (or carefully crafted by some exploit) data that comes after the source object can be just as dangerous. The correct replacement, memcpy_overkill(void *source_object, size_t source_size, size_t source_offset, void *dest_object, size_t dest_size, size_t dest_offset, size_t count) is what they're REALLY looking for, but this is impractical primarily because of the heavy use of context-less pointers (to objects within arrays, or within some other structure; the void * in memcpy's prototype hints at further possibilities) in C and C++.
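One possible reading of that signature, as a sketch only. The int return value and the use of memmove (so that copies within a single larger buffer may overlap) are assumptions made for this example, not part of the signature above.

```c
#include <stddef.h>
#include <string.h>

/* Returns 0 on success, -1 if the requested range falls outside
 * either object. Both offsets are checked before anything moves. */
int memcpy_overkill(void *source_object, size_t source_size, size_t source_offset,
                    void *dest_object, size_t dest_size, size_t dest_offset,
                    size_t count)
{
    if (source_offset > source_size || count > source_size - source_offset)
        return -1;   /* would read past the end of the source */
    if (dest_offset > dest_size || count > dest_size - dest_offset)
        return -1;   /* would write past the end of the destination */
    memmove((char *)dest_object + dest_offset,
            (char *)source_object + source_offset, count);
    return 0;
}
```

Every caller still has to know both object sizes, which is exactly the context a bare pointer does not carry.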

Re:No - there are plenty of safer alternatives by TemporalBeing · 2009-05-15 07:05 · Score: 2, Informative

memcpy_s's buffer size argument should be used for the actual size of the destination buffer. It shouldn't be variable based on user input / program flow, and can likely be determined at compile time.

Obviously you have never made a parser of any kind. Any time you read a file in, or use a data stream (cin, cout, cerr, etc.), and many more situations (printf, aprintf, aprintf, etc. not to mention document editors, web browsers, etc) you need to be able to have a dynamically sized buffer to at least manage the data between states.

Now, if you can always guarantee that you'll read a file with the same exact format every single time you run the application, then yes, you can determine it at compile time.

However, most applications are more like web browsers in that respect - their input changes every time they run; the format changes. (Think of typing in your post - is there any way the computer could have predicted how much text you'd type? Or the server that it would receive that amount of text from you? Or...the list goes on.)
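A sketch of what such a dynamically sized buffer tends to look like in C. struct dynbuf and dynbuf_append are hypothetical names; the capacity is decided at run time, but it is still tracked alongside the allocation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer: cap is recorded at allocation time,
 * so the destination size is known even though it varies at run time. */
struct dynbuf {
    char *data;
    size_t len;
    size_t cap;
};

/* Append n bytes, growing the allocation as needed. Returns 0 on
 * success, -1 if memory could not be obtained (buf is unchanged). */
int dynbuf_append(struct dynbuf *buf, const void *src, size_t n)
{
    if (n == 0)
        return 0;
    if (n > buf->cap - buf->len) {
        size_t new_cap = buf->cap ? buf->cap : 16;
        while (new_cap - buf->len < n) {
            if (new_cap > (size_t)-1 / 2)
                return -1;          /* doubling would overflow */
            new_cap *= 2;
        }
        char *p = realloc(buf->data, new_cap);
        if (p == NULL)
            return -1;
        buf->data = p;
        buf->cap = new_cap;
    }
    memcpy(buf->data + buf->len, src, n);
    buf->len += n;
    return 0;
}
```

Start from a zero-initialized struct dynbuf and free(buf.data) when done.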

Please turn in your nerd badge as the XmlHttpRequest hits you while you close the page...

--
Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)

Re:How to easily ... by cool_story_bro · 2009-05-15 07:32 · Score: 5, Informative

He didn't say "how to make your code safe." He said "how to make your code comply with the safety standards." Rarely are the two the same. It's perfectly possible to safely use memcpy(), just like it's perfectly possible to abuse about a billion other system calls.

--
You must wait a little bit before using this resource; please try again later.

Re:No - there are plenty of safer alternatives by chkn0 · 2009-05-15 11:47 · Score: 2, Informative

Or the server that it would receive that amount of text from you?

Actually, yes.

Re:How to easily ... by Anonymous Coward · 2009-05-15 16:52 · Score: 1, Informative

memcpy() isn't a syscall.

Re:No - there are plenty of safer alternatives by jd · 2009-05-15 18:41 · Score: 2, Informative

Perhaps that was a little bit ambiguous of me. What I was referring to were programming languages which reduce the possibility of error (e.g. Ada) and/or which are designed to enforce good programming practice and rigorous standards (e.g. Occam).

I consider these to be "secure by design" because they were designed to make the more common security flaws impossible and were also designed to make it possible to validate the software. (Both, if I understand the histories correctly, were linked to military efforts to produce highly robust, highly secure code.)

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
