I tried compiling with a number of different optimization options and I got the same thing for all the mixes I tried. Since nothing I tried generated something even close to what I was expecting, I went back to only /Ox /G6 /GA /EHsc. (/Ox = full speed-wise optimization enabled.)
And yes, this was for a release build. Other than the stack-checking prolog/epilog code, the debug version of that function is nearly identical.
As for my VS skills, I think they have little to do with how the alleged optimizing compiler is miserably failing to optimize this trivial function.
> Ignoring the absolutely scary fact that you're using Hungarian...
With my rather bad memory, it probably spared me hours of re-reading header files, waiting for tooltips or otherwise wasting time to look up trivial stuff. Autocomplete saves me a fair amount of typing so I can afford being somewhat more verbose.
The first convention is to be consistent and use the dominant convention of the development environment. I mostly work with COM, ATL and MFC and am mostly happy with their conventions, end of story.
There are two shift keys on a conventional keyboard and getting the underscore usually requires pressing a shift key anyway.
In general, AddRef/Release are not horribly high-frequency calls.
The point, though, is that VC is doing a horrible job at fairly trivial stuff. Many trivial functions are also high-frequency, and for those, this can hurt big time.
BTW, my "optimal" code was in fact still three instructions short of truly optimal... the truly optimal version is:
mov eax, 0FFFFFFFFh
lock xadd dword ptr [ecx+m_nRefs], eax
dec eax
jg nofinal
mov eax, dword ptr [ecx]
call dword ptr [eax+0x10]
nofinal:
ret
This makes it 16 instructions for VC's version vs. 7 instructions for mine, so VC emits about 129% more instructions than optimal.
XAdd is a simple wrapper for the ASM instruction of the same name. Because Interlocked* are implemented as runtime library calls, I opted for my own inlinable equivalent.
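For readers without the original source, here is a minimal sketch of what such a wrapper could look like in portable C++. It is an assumption, not the code from this post: the original presumably used inline assembly on a plain int, while this stand-in uses std::atomic<int>::fetch_add, which has the same "return the old value" behaviour as the XADD instruction and is normally inlined by the compiler.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the XAdd wrapper: atomically adds `delta` to
// `*target` and returns the value `*target` held BEFORE the addition, which
// is what the x86 XADD instruction leaves in its source register.
inline int XAdd(std::atomic<int>* target, int delta) {
    return target->fetch_add(delta, std::memory_order_seq_cst);
}
```

Because the old value comes back, a caller that wants the new count has to apply the delta itself, which is where the "- 1" in Release() comes from.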
Hmm... I actually missed another optimization... I could save a further three instructions (push ebx; pop ebx; lea ebx, [ecx+m_nRefs]) by simply doing:
xadd [ecx+m_nRefs], eax
This makes VC's output 70% larger than my optimized version.
For the flags:
/Ox /G6 /GA /FD /EHsc /MT /W3 /nologo /c /Zi /TP
Unless I am mistaken, this is pretty much as aggressive as one can be without ditching exception handling... and nothing about my optimizations changes any of that compared to the VC-generated code.
> I wonder how much Microsoft even bothers to optimize for the different extensions such as SSE and 3DNow, I imagine they don't do much at all
Optimize for extensions? To me, it seems Microsoft has yet to master basic 80386 optimizations... in trivial functions, this generates some impressive and unexpected overhead.
For example, I use intrusive reference counting in my code and here is what my Release() function looks like:
int CRefCount::Release(void) {
    int i = XAdd(&m_nRefs, -1) - 1;
    return i > 0 ? i : FinalRelease();
}
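To make the function above easier to follow in isolation, here is a hedged sketch of a complete intrusive reference-counting base class. Only CRefCount, AddRef, Release, FinalRelease and m_nRefs come from the post; the initial count of 1, the "delete this" body of FinalRelease and the use of std::atomic in place of XAdd are assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reconstruction for illustration only.
class CRefCount {
public:
    virtual ~CRefCount() = default;

    // fetch_add returns the old count, so adding 1 gives the new count.
    int AddRef() { return m_nRefs.fetch_add(1) + 1; }

    // Same shape as the Release() in the post: decrement, and hand over to
    // FinalRelease() once the new count is no longer positive.
    int Release() {
        int i = m_nRefs.fetch_add(-1) - 1;
        return i > 0 ? i : FinalRelease();
    }

protected:
    // Assumed behaviour: destroy the object and report 0 references left.
    virtual int FinalRelease() {
        delete this;
        return 0;
    }

private:
    std::atomic<int> m_nRefs{1};  // assumed: the creator holds the first reference
};
```

An object derived from this class would be created with new, shared through AddRef(), and destroyed by whichever Release() call brings the count to zero. The virtual FinalRelease() is what turns into the indirect vtable call in the listings that follow.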
The ASM code from VC7.1 and VC8:
push ecx
; Why push a register that is not written to anywhere?
push ebx
lea eax,[ecx+4]
mov dword ptr [esp+4],eax
mov ebx,dword ptr [esp+4]
; The above three lines have the same net effect as:
; LEA ebx, [ecx+4]
; It seems VC loves creating and using unnecessary stack variables.
; I tried the 'register' keyword but, as MSDN states, VC simply ignores it.
mov eax,0FFFFFFFFh
lock xadd dword ptr [ebx],eax
dec eax
test eax,eax
; DEC already sets the flags register, testing afterwards is redundant.
pop ebx
jg CRefCount::Release+24h (4058FBh)
mov eax,dword ptr [ecx]
add esp,4
; discard ECX that was saved on the stack
jmp dword ptr [eax+10h]
pop ecx
ret
My optimal hand-written Release() goes like this:
push ebx
lea ebx,[ecx+4]
mov eax,0FFFFFFFFh
lock xadd dword ptr [ebx],eax
dec eax
jg nofinal (406392h)
mov eax,dword ptr [ecx]
call [eax+0x10]
; Note: 0x10 is the vtable offset for the fifth CRefCount virtual function, FinalRelease.
; I am using 'call' because I have no control over the automatic VC prolog/epilog.
pop ebx
ret
All in all, the VC version is 50% larger both in instruction count and code size than the optimal (99% of the time) code.
Last time I checked, it was illegal to trademark generic terms and numbers.
Also, last time I checked, a 'tiger' was some furry critter with claws and teeth.
A company cannot own a generic word... otherwise, the only legal dictionaries would be blank dictionaries - which is probably just as well since good spelling and decent grammar are apparently deprecated.
This is why most companies graft their name onto their generics... like "Microsoft Windows", "Norton Antivirus", etc.
I sort of feel sorry for the script kiddies that successfully hacked 127.0.0.1. There is indeed no better example of how little they know about what they are playing with.
On the other hand, they do get what they deserve for being clueless nuisances. I was merely extending the rod to slightly more clueful nuisances :)
I did mean 1000%.
Some people who analyzed the RIAA/MPAA's numbers a few years ago concluded that there must be at least five downloads for every sale.
One of them mentioned something along the lines of "to make their numbers come true, Americans would have to buy 100 CDs per year."
How many sane people buy (anywhere near) this many audio titles each year?
Well, one could simply write a simple proxy that connects back to the origin:port. Assuming the attacker has no firewall, this would have almost the same net effect as 127.0.0.1 except for the extra delays and bandwidth limitations. If there is a way to catch all connections to closed ports and accept them in some other way, the proxy would not have to listen on all common attack ports.
This version of "127.0.0.1" would give hackers the false impression of having a valid target and only the more clueful ones would realize that they are engaging in the same sort of mirror match.
How would the white-balance information in people's digital photos qualify as Nikon-owned property?
AFAIK, the DMCA is there to protect copyright, and data collateral to taking a photo should technically be owned by the camera's operator.
Nikon can own the patents or trade secrets behind how to use the data but the actual data's ownership/copyright should clearly belong to whoever took the snaps.
This is not too many steps away from Microsoft claiming it owns all code and software written or compiled using Visual Studio tools.
The exercise here was to remind people that the RIAA/MPAA only report inflated absolute worst case (for them) scenarios where something like 1000% of illegal downloads are lost sales.
There is a reason why studios rarely go after file sharers themselves: actual lost sales are not of the same order of magnitude as what the RIAA & co. want politicians & co. to believe, and the bad publicity from going directly after "Joe Sixpack" sharers would result in many more REAL lost sales. Scaring one's customer base is not a good business practice. This is why studios let an anonymizing/racketeering service (RIAA/MPAA) do most of this.
The difference is that stolen goods count as a sale as far as the MPAA/RIAA are concerned - retailers or their respective insurance company ends up paying for it but it changes nothing to the studios' revenues.
On the other hand, financial losses due to unlicensed downloads are in no small part hypothetical - I suspect something like half (or more) of such downloads would never have been sales anyway - the "Download because I can but otherwise do not care (much)" attitude.
"It's just broken."
This is what Windows power users will say when, because of the way "It just works" works, nothing they do works the way they expect anymore. Also, the more stuff happens auto-magically, the more likely stuff is to disappear auto-unexpectedly.
One major problem with Videotron: it is owned by Quebecor, one of the biggest media companies in Canada.
(As if being a cable TV provider and video club chain were not already sufficient motivation for wanting to keep online media "locked up".)
It sort of sucks and I would cancel my Videotron service if any comparable service were available... but right now, the next best thing is 75% slower and nearly as expensive.
Same clock rate?
The fastest AMD chip for the near future is 2.6GHz while the slowest P4 for the last year or so is 2.8GHz. If you want to compare based on clock speed, the Pentium-M (aka P3-v2) is a much fairer comparison. It has been a well known fact since the P4's launch that the P4's IPC (instructions per clock) sucks when compared to the P3's. (And even more so when compared to the PM's.)
I have both an A64-3000+ and a P4-3G. My typical workloads usually contain a number of non-trivial tasks. While my A64 does complete most single tasks faster than my P4, it is nowhere near as smooth-running once non-trivial tasks start piling up while I am doing (or trying to do) anything serious.
Computers are not the only things that use compiled C/C++/Java/etc. code... many of the most basic modern microcontrollers can also be programmed using GCC or some of its variants.
Since microcontrollers are everywhere (the average household probably has well over 100), the number of GCC-targetable microcontrollers currently deployed very likely exceeds the number of desktops, servers, etc. currently in use by an order of magnitude or two.
Or...
"And next time I get mod points, you will also be overrated, redundant, flamebait and a troll."
At least, The Incredibles is one of those titles worth watching... quite good for an action-comedy animation.
> They would blame piracy. "Nobody's going to theaters because they're downloading movies off the net!"
And independent research would then point out that download activity on major movie trackers and other systems used to trade movies has dropped by 40%.
Yes, I know, none of this will ever happen because too few would be willing to put up with any sort of inconvenience for a potentially greater good.
Best Buy might want to know what is going on... and decide to solve the problem by adopting a credit-note or 90% cash/charge-back refund policy. (Getting 100% cash refund is still possible but often requires a spare afternoon and enough knowledge of retail laws to scare managers.)
People deserting movie theaters, video clubs and the movie aisles of all major stores would send a much clearer message.
> That'd be 10,000 bloated drivers. Most drivers are FAR from being 100KB.
Fine, you are right, few kernel drivers are over 100KB... but even the most trivial drivers are typically over 10KB, and there probably are millions of more-or-less standard devices out there. Even with 10KB trivial drivers, this would be GBs worth of potential extra code.
As I wrote, the only reason the current method (full trees) works is that the kernel has limited device support. Add every not-quite-standard device's 10KB supplement and a monster will be born.
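The figures traded in this exchange are easy to sanity-check. The sketch below assumes decimal units (1 KB = 1,000 bytes, 1 GB = 1,000,000,000 bytes) and uses a hypothetical helper, treeSize, that is not from the discussion. The one-million device count is an assumed example of "millions", not a measured number.

```cpp
#include <cassert>
#include <cstdint>

// Decimal units are assumed here.
constexpr std::uint64_t KB = 1000;
constexpr std::uint64_t GB = 1000 * 1000 * 1000;

// Total source size if `drivers` drivers each add `bytesPerDriver` bytes.
constexpr std::uint64_t treeSize(std::uint64_t drivers, std::uint64_t bytesPerDriver) {
    return drivers * bytesPerDriver;
}

// 10,000 drivers at 100 KB each is exactly the 1 GB figure.
static_assert(treeSize(10'000, 100 * KB) == 1 * GB, "10,000 x 100 KB = 1 GB");

// One million trivial drivers at 10 KB each (assumed count) is 10 GB.
static_assert(treeSize(1'000'000, 10 * KB) == 10 * GB, "1,000,000 x 10 KB = 10 GB");
```

So the 1GB estimate does imply roughly 10,000 drivers at 100KB each, and the "GBs worth" claim needs device counts in the hundreds of thousands or more at 10KB each.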
An unopened copy simply goes back on the shelves until someone else picks it up. Buying and returning unopened wastes everyone's time and otherwise has the same net effect as not buying in the first place. At most, it might generate some transactional noise if retail outlets report sales in real-time... but if they report only once a day/week, noise from such returns would vanish as long as final sales exceed fake sales.
How do we combat profit?
Boycott?
If enough people start putting up with the 'inconvenience' of not buying crippled DVDs and not watching them, studios will have to review their business practices or sink.
Then again, they would almost certainly blame downloads first, long and hard, before realizing the truth that they are facing a boycott. This would hurt their bottom line much worse than "sales lost to piracy" ever have... most real evidence so far actually says downloads promote sales - at least for the better stuff.
The US also rents pipeline capacity from Canada to shift oil from the coasts to inner states since this is safer, cleaner and cheaper than the trucks that would otherwise be necessary. This certainly inflates exchanged oil volumes.
The current size may still be manageable for most... but imagine 5-10 years down the road if more manufacturers start submitting drivers and enhancements for their hardware.
If every manufacturer submitted 100KB worth of code to support every feature of their respective gadgets, the kernel would probably reach 1GB pretty quickly. Would this still be acceptable?
Having everything in one place now might be convenient but it is ultimately not scalable. It works - for now - only because there is relatively limited hardware support.
But before one can compile the kernel, one has to download and un-gzip/tar it, configure it then build it and then hope it works - assuming it builds, which is not always the case.
How many people actually use I2O, HAM and all that exotic hardware Linux can support? Spinning off all the exotic sections into separate downloads would seriously reduce the average download size. For fairness to server people, I suppose even the sound system could be dumped into a separate archive.
If "make *config" conveniently removes sections whose directory does not exist from the menu, not having the directories in question becomes a convenient way of disabling all items within that section.
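As a rough illustration of the idea, the sketch below filters a list of configuration sections down to those whose directory is actually present in the source tree. It is purely hypothetical: availableSections is an invented helper, the section names in the usage note are examples, and the real kernel configuration tools are not implemented this way.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Returns the subset of `sections` that exist as directories under `root`,
// preserving their original order. A menu built from the result would simply
// never show sections that were not downloaded.
std::vector<std::string> availableSections(const fs::path& root,
                                           const std::vector<std::string>& sections) {
    std::vector<std::string> present;
    for (const auto& name : sections) {
        if (fs::is_directory(root / name)) {
            present.push_back(name);
        }
    }
    return present;
}
```

Given a tree that contains only drivers/net and sound, asking for {"drivers/net", "drivers/i2o", "sound"} would return just the two sections that are present, so the missing one is disabled without any extra configuration step.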