Microsoft To Banish Memcpy()
kyriacos notes that Microsoft will be adding memcpy() to its list of function calls banned under its secure development lifecycle. This reader asks, "I was wondering how advanced C/C++ programmers view this move. Do you find this having a negative impact on the flexibility of the language, and do you think it will restrict the creativity of the programmer?"
Just like removing printf, scanf, and most other copy/string functions. There are safe versions of memcpy that work just fine and are just as easy to use...
Lame story (Trying for flamebait here?)
I have mod points and I am not afraid to use them
This should have been done thirty years ago.
Those are also dangerous functions. And also array indexing! That should also be eliminated.
Figures, Microsoft had to go kill of python and do it all in the name of security. No more accessing MEMory in C structures from our .PY files, damn it this really pisses me off.
Do you find this having a negative impact on the flexibility of the language, and do you think it will restrict the creativity of the programmer?"
You can replace memcpy entirely with memmove (the latter is slightly slower and handles overlaps), and nothing in the article suggests that memmove is banned.
But, no, it shouldn't hurt creativity--they're introducing a memcpy_s, which is the same aside from taking a size parameter for the destination. That's something that is generally easy to track in new code (obviously this secure developement lifecycle is not backwards compatible).
rage, rage against the dying of the light
...and pop up a message box asking the user to confirm they want to copy the memory, and if they press OK then they should have to enter a captcha.
Seriously though, how is it supposed to make your code safer if you pass the size you think your destination buffer is? With memcpy, that size is implicitly greater or equal to the copy size and it's the caller's responsibility to make sure this is the case. Putting bounds checking into the copy function is ridiculous if you're responsible for passing the bounds yourself, and it goes against basic good design. I'm surprised they aren't passing the source buffer size too, just to be extra safe. Also, what happened to the __restrict keyword? It's strangely absent from the memcpy_s function declaration.
=Smidge=
Is it just my observation, or is eldavojohn an idiot?
First they came for gets, then they took scanf and strcpy, now they want memcpy? Outrageous! How are virus writers going to be able to take advantage of buffer overflows if I'm continuously keeping track of how big my buffers are? I may have to start lying about their size just to give hackers a chance.
Someone already explained this better than I could.
I have often used memcpy instead of strcpy when I have known the length of the strings, and also known the destination to be large enough.
I'm guessing many developers will just #define memcpy to something else and continue as nothing happened.
/ The Arrow
"How lovely you are. So lovely in my straightjacket..." - Nny
If you consider memcpy too dangerous then you should be using something besides C. If you're using C++ and memcpy then you really do need to know what both you and the compiler are doing.
The problem with strcpy() and sprintf() and like functions is that you don't know when calling them the length of the source to be copied into the supplied buffer. But with memcpy() you specify this length.
Frequently, the size of the target is calculated at run time, so bugs in memcpy() tend to be in the area of this calculation, rather than in not checking if the source fits the target.
Any lack of memcpy() would be easy to overcome, just use
memcpy_s (dst, len, src, len)
which is functionally identical to
memcpy (dst, src, len)
you didnt read.
MSFT is banning it from their development process, not the language, use it as much as you like.
Most any security problem can be traced back to this function.
As Windows products are now (and have been) mainstream products used extensively in banks and other financial institutions, reliability and security (RS) have prime importance. The speed that "memcpy()" gets you is not worth the price of reduced RS.
So, Ben... or is it Peter? Do you always copy your comments verbatim from the linked article, or only when you agree with them?
This is nothing like sprintf. In sprintf there is no way to know how much data will be created ahead of time, so limit on buffer size is useful to make sure there is no buffer overrun.
With memcpy it is *precisely* known how much data will be copied. It is right there, 3rd parameter. If a developer can't do "if (sizetocopy = sizeofdstbuffer)", it is just as unlikely that he will be able to properly state that additional parameter that specifies the destination buffer size.
Of course if Microsoft is so concerned with security, why the heck did it take them years to add snptinf()? All this is is another attempt to make crossplatform development that much harder (much like all those "obsolete" POSIX functions that will barf warnings unless you use a cryptic define).
That said, if this silliness ever becomes a rule, I have an easy solution:
#define memcpy(dst, src, size) memcpy_s((dst), (src), (size), (size))
Problemo solved, now let's go actually write some real code.
This is not the first time MS has done this. They have plenty of other standard functions that they have deprecated.
Yes, you read that right. Microsoft is deprecating parts of an ISO Standard all by themselves. Not that this should surprise anyone. I would have absolutely no objection to them proposing to WG14 to deprecate those functions; heck, I'd encourage it! But besides going out and deciding to 'deprecate' parts of the standards, the replacement functions actually violate those same standards.
And the warnings are irritating. You can't write a nice cross-platform library without either spewing tons of warnings or having to put in a bunch of #defines to shut the compiler up. And if you do that, your users get irritated if they depend on these warnings because you just turned them off (and of course, if you don't, they'll complain that your library is unsafe).
Screw Microsoft.
If = is ambiguous, then you must have a habit of abusing it. While you can overload = to mean anything you want, I suppose, it would seem like you should try to preserve the general notion of the assignment operator in C and C++, which is that = never modifies the right-hand side of the equal sign, only the left hand side.
"Does it mean duplicate the contents, transfer the contents and clear the original copy, or just swap the contents of the items, which might be quicker."
I would say if you are trying to stay true to the way the operator behaves in the language when you overload it, the last two operations would NOT be overloaded to the equal sign, because they modify the right-hand argument. Give them a meaningful method name instead.
#define memcpy(s1, s2, n) memcpy_s((s1), (n), (s2), (n))
As a competent developer, I get extremely annoyed by this sort of shit.
Removing/banning memcpy doesn't change a damn thing cause the first thing I do with things that have to compile in VisualStudio now is add the following defines which turn this shit off:
_CRT_SECURE_NO_WARNINGS
_CRT_NONSTDC_NO_DEPRECATE
If the remove that option I'll simply add memcpy to my standard MS compatibility library that deals with all the other bullshit MS decides to do.
You can't fix stupid. Stop trying. People fuck up VB and C# apps just as much as the fuck up C and C++ apps. So they don't do it with a buffer overflow, they do it by shear stupidity. You'll be more secure by taking away languages that allow non-programmers to pretend to be programmers than making it harder on those of us that are just going to work around what you do anyway.
You're not going to fix broken shitty apps with exploits by removing functions, the functions aren't the problem they do exactly as they are told (or atleast they are supposed to :). You need to fix the programmers who can't clarify what they want done.
http://www.xkcd.com/568/
Second pane:
You'll never find a programming language that frees you from the burden of clarifying your ideas.
Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
= never modifies the right-hand side of the equal sign, only the left hand side.
std::auto_ptr would like a word with you. This was one of the dumbest decisions the committee made.
Just write a one-liner that replaces all calls to memcpy with a call to memcpy_s, duplicating the size parameter.
I'm only half-joking. This is exactly how people will (mis)use memcpy_s. If you want safe memory access, you need to ban the entire C language. For those cases where you need C, you'll just have to make sure your programmers know what they're doing.
There's no failure quite as dissatisfying as a complete and total solution to the wrong problem.
How to easily make your code compliant with the new safety requirements:
#define memcpy(dest,src,len) memcpy_s(dest,len,src,len)
Some of these reactions are quite funny.
The goal of asking you to specify the length of the destination buffer is to force you to think about the data you're working with *while* you're writing the code and not afterwards in an unconnected security audit. Furthermore, it provides documentation to other people reading the code who may not have the same mental model of what's going on as you do. And as usual "other people" includes you, six months after you wrote the code.
Uh-huh. Because everyone is not just going to add a line of code to one of the base headers:
#define memcpy(dst, len, src) memcpy_s((dst),(len),(src),(len))
C is not a safe language, and stupid programmers will always find ways to mess it up. There are safer languages where you can't hang yourself as easily, and if you don't understand C, you should use them. Microsoft can't fix this problem.
Warning: Opinions known to be heavily biased.
Firstly, the specification of C anf C++ standard library is governed by the corresponding standard commitee. Microsoft has absolutely no authority to "banish" anything from neither C nor C++. They can deprecate it in their .NET code, C# etc., but it has absolutely no relevance to C and C++ languages. So, why would the author of the original question direct it to "advanced C and C++" programmers is beyond me. In general, C and C++ programmers will never know about this "interesting" development.
Secondly, the tryly unsafe and useless functions in the C standard library are the functions like "gets", which offer absolutely no protection agains buffer overflow, regardless of how careful the develoiper is. Functions like 'memcpy', on the other hand, offer sufficient protection to a qualified developer. There's absolutely no sentiment against these functions in C/C++ community and there is absolutely no possiblity of these functions to get deprecated as long as C language exists.
Now, all that's going to happen is that programmers are going to write their own memcpy-like routines using a quicky for-loop or something. It'll be just as bug prone, and harder to detect via automated source code analysis.
...but it's being eaten...by some...Linux or something...
The idea that smart people don't make mistakes is thoroughly ridiculous.
They make mistakes, they don't make that mistake.
Smart people recognize that they make mistakes, so they create systems that help them catch and prevent their own mistakes. If you're foolish enough to believe that you can't make mistakes, then you should just turn off all the warnings on the compiler and not bother with lesser workarounds like redefining a single symbol.
That's not the issue. The new function is still relying on the programmer not making a mistake. What makes you think that anyone who would make a mistake on the number of bytes to copy wouldn't make a mistake on the size of the buffer? Alternatively, what happens if the src buffer size isn't large enough? Do we need to add another length parameter here?
I have nothing against making a language safer, but if the language can't ensure the size of the buffers itself (and it can't in this case), relying on the programmer to pass in the correct number is exactly as safe as relying on the programmer to do the bounds checking. Exactly as safe, no safer.
Warning: Opinions known to be heavily biased.
Until you're trying to do something for which Java has no standard API. Last time I checked, USB joysticks were like this. There is JInput, but if you include JInput in your distribution, your project is no longer 100% Pure Java and will not run as an applet.
But your C/C++ code is never able to be run as an applet, while Java code might. So, on these grounds Java is better.
Or unless your target platform is incapable of running managed code. Some handheld platforms have only 4 MB of RAM, not enough for a JIT compiler and the bytecode in addition to the translated code except in trivial examples that act a generation old.
And for that we have Java ME edition.
Okay, there are devices that are even more tiny in which C or assembly is the only option. I've also dealt with a processor that couldn't be programmed in C without extensive non-standard C extensions. Either way, this is a small minority of situations -- there is no one solution that suits all purposes.
And per Apple's developer agreement, the only managed language that can run on an iPod Touch is JavaScript in Safari.
And they allow just any old C code to run?!!
Or unless your existing program's model is written in an unmanaged language, and you want to reuse the old code so that you can be sure that the model is bit-for-bit accurate. How hard is it to automatically translate C code to, say, C++/CLI?
Eeek! Don't do this!
If you want bit-by-bit accuracy, you absolutely don't want to try translating C code to C++! You even have to worry about changing C compilers if you are doing that. (And yes, I have worked with C code which would break if we used a different compiler. It was not fun.)
What you should be thinking of doing is to write an interface between the existing model code and your new one. This can be done in both C++ and Java. When the new code has to manipulate the model, it goes through this interface and calls the old code to do the necessary operation.
And for [platforms with small CPU and small RAM] we have Java ME edition.
That is, if your platform vendor offers a port of Java ME.
If you want bit-by-bit accuracy, you absolutely don't want to try translating C code to C++! You even have to worry about changing C compilers if you are doing that.
I already test my code on GCC targeting x86 and ARM. Or did you mean compilers that a hobbyist looking to start a business can't easily afford?
What you should be thinking of doing is to write an interface between the existing model code and your new one. This can be done in both C++ and Java. When the new code has to manipulate the model, it goes through this interface and calls the old code to do the necessary operation.
Mostly I was thinking about trying to port an existing video game to a phone that runs Java ME MIDP or a game console that runs XNA, while preserving frame-for-frame accuracy of the physics. You can't run the old model on such hardware because you don't have the certificate to digitally sign native code. Or did your "interface" include communication over a network to a server running the model?
There have been several suggestions to replace memcpy with memcpy_s as the safer alternative. That's fine, I guess, if memcpy_s is part of the ANSI/ISO standard for C, which as far as I know, it is not; just like all the *_s functions.
Microsoft says your code is safer when using the *_s, but it will no longer be portable, it'll be Microsoft-only. They put in a warning in the compiler from VS2005 onwards about using "unsafe" functions, and that you should use *_s, which is a pain because you have to disable it as the project level, there doesn't seem to be anywhere that I've found that can just turn it off permanently. Even using the STL that comes with VS2008 will generate these warnings, even if you never do any explicit memory stuff yourself.
Microsoft did the same thing with the _* functions; a lot of them are just wrappers around their ANSI-compliant versions (_sprintf -> sprintf), but are also not portable; I worked with a guy who wrote/tested all his C code in VS6 then gave it to me to port to Unix and VMS, and the compilers would choke on not having these particular functions.
Microsoft is trying to get lock-in at the language level instead of providing a good set of Win32 API-based functions that make using memcpy() unnecessary.
Foolish mammal, they cannot be defeated so easily. http://xkcd.com/292/
If you wrote a program that used 8+ gigs of memory that means you're an incompetent code monkey.
I have an hourly job that processes about 8GB of input data files. We found that copying the data to a tmpfs filesystem instead of leaving it on a HDD cut the work time from 20 minutes to about 30 seconds because the job necessarily requires an enormous number of random seek()s. Since mmap()ing a file on tmpfs is roughly identical to read()ing the whole file into RAM, I guess that makes me an incompetent code monkey.
Of course, my boss who got a 40x speedup in exchange for $250 worth of RAM might see things differently from your inexperienced little self. Sometimes the Real World throws pretty big datasets at you.
Dewey, what part of this looks like authorities should be involved?
If a developer can't do "if (sizetocopy = sizeofdstbuffer)"
Uh oh, we'd better ban the = operator too, so no one can mistake it for == in an if statement ever again.
Yes and yes. I've been a developer for over 30 years now(nearly all C or C++). How about you?
I guess Microsoft is trying to get people that don't understand C++ to program better in the language. IMHO, this will not solve fundamental problems with people programming C++, just cause them to learn the language slower by not having an experience working through a bastard of a bug. I think the world (and not just MS) needs to realize that writing a piece of complex software is difficult. Bugs like an errant memcpy of a subclass into a baseclass instance are *very* easy bugs to solve if you're looking for them. Memory overwrites are easy to detect if you are looking for them. Bad pointers are easy to detect if you're looking for them.
Such problems will always exist in software. I've found a fair number in my own code. What we *really* need to do is train a better caliber of programmer. OTOH, Microsoft seems hell-bent on trying to make writing software easy to do
(I have 20+ years proffesional programming experience, a good chunk of it in C/C++, and in recent years have spent a lot of time debugging other peoples code, especially ported, re-ported and generally hacked around code !)
IMHO a lot of the posts here are missing the point when they say, "these functions are safe, you just need to check the size before hand" etc. The fact is, in the Real World, not everyone is an ace-programmer, deadlines loom, mistakes get made. Problems do not appear in testing, but can further down the line. IMHO a language simply should *not allow* the programmer to do crazy things. Potential buffer overflows, de-referencing null pointers etc should not get past the *compiler*.
To be honest I do not have a prefered language to suggest, but I do not think that tweaking C is the answer, but better, high level, languages are needed.