- you don't know what code the compiler will emit, especially if you're optimizing. Given the bajillion optimization flags most compilers have, and the fact that they change across versions (for example gcc 4.4 has the graphite framework, gcc 4.5 has link-time optimization), you're either very smart and brave to take a guess, or just ignorant. You can look at the output of course, but that's hardly "predicting", and we're lazy anyway.
;)
Yes, but the more output you inspect, the more you get to know your compiler, and thus predicting the output becomes easier. (Actually you don't need to predict the output for 99.99% of cases, you only need to worry about C/C++ constructs that will cause performance problems. Once you've identified those sorts of bottlenecks [via a profiler of course!], avoiding them in future is fairly easy). I'm not sure about you, but personally I don't go and blindly enable every single new optimization flag every time I get a new version of a compiler! I think the fact they change is largely moot, to be honest.
- you don't know the state of RAM, with regards to swap
- you don't know the state of the CPU cache
- you don't know what microcode the CPU has, or what it's actually doing under the hood
A touch pedantic imho (and not entirely accurate). Those are specific to programming on a modern CPU/modern OS, and apply to any language. If you want to know all of those things, don't use a modern CPU or OS....
- at any given point in time, you don't know how long it will take to execute the next instruction (hint: scheduling, possibly powersave)
What does scheduling have to do with instruction timing? It sounds like you are working on the assumption that a thread can be suspended mid-op? Anyhow, if an instruction executes, it executes in 'n' cycles, with a latency of 'p' and a throughput of 't'. Those are known values for all CPUs. If the CPU has been throttled back via SpeedStep, then it still takes 'n' cycles, with a latency of 'p', and the throughput remains the same. The only thing that changes is the time for a cycle, but do you really want to be measuring your code performance in seconds? Fine for us console developers I guess, but not so great for your average PC.
Also worth pointing out that there is always a counter-example. If you want to measure your code in terms of ops, get an ATOM ;)
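To put the cycles-versus-seconds point above into numbers, here is a minimal sketch. The struct, the function names and the latency/throughput figures used with it are made up for illustration (they are not taken from any real CPU's tables), and the pipelining model is deliberately crude.

```cpp
// Cycle counts belong to the code and the core; seconds also depend on the clock.
struct InstructionCost {
    double latency_cycles;     // cycles until one result is available
    double throughput_cycles;  // cycles between issuing independent instructions
};

// Rough cycle count for `count` independent instructions issued back to back:
// the first result arrives after the latency, then one more per throughput interval.
constexpr double cycles_for(InstructionCost c, long count) {
    return count <= 0 ? 0.0
                      : c.latency_cycles + (count - 1) * c.throughput_cycles;
}

// Convert a cycle count to wall-clock time at a given clock frequency.
constexpr double seconds_at(double cycles, double clock_hz) {
    return cycles / clock_hz;
}
```

With a hypothetical cost of `{4.0, 1.0}`, 1000 instructions come to 1003 cycles whether the core is running at 2 GHz or has been throttled to 1 GHz. Only the value returned by `seconds_at` doubles.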
I guess one of the benefits of AVX is the new register sizes are supposed to give transparent speed increases. So a program made for 256bit AVX will automatically see faster calculations when the new 512bit AVX registers come out.
Afraid not (well, there are ways if you are willing to litter your code with C++ templates). Yes, the instructions will process 8 floats, however you're only going to see a nice linear speed-up if you are already using SOA data structures. For a lot of the 'traditional' SSE code you'll tend to see (i.e. AOS vector3/matrix classes etc), the AVX instructions will be of little use. In effect they duplicate all of SSE->SSE4.1 for the 256 bit register types, i.e. SSE has _mm_add_ps, AVX has _mm256_add_ps, and any new 512 bit instruction will be _mm512_add_ps (which incidentally is one of the larrabee instructions). You'll have to modify just as much code porting SSE to AVX as you will porting the 256bit AVX instructions to 512bit AVX ones. The only advantage is that we know what the new 512 bit instructions will look like, and can plan for the future!
I've already had a crack at porting a fair amount of existing code to AVX (Intel compiler comes with an emulator - I don't have any hardware obviously!). For code optimised in an SOA layout, coding is going to be quite fun. If you have a load of Vector3/Matrix type classes, then you aren't going to get much performance benefit. The nicest thing about AVX for that kind of code is that it allows you to convert to double, perform the operation, convert back to floats, and the resulting code should run at about the same speed as the SSE equivalent..... (a touch slower, but not enough to care....)
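For anyone wondering what the "litter your code with C++ templates" route might look like with an SOA layout, here is a rough sketch. `Vec3SoA`, `add_lanes` and `add_vec3` are hypothetical names, and the inner lane loop is only a portable stand-in for a single intrinsic such as `_mm_add_ps` (W = 4) or `_mm256_add_ps` (W = 8). A real port would specialise on the width rather than rely on the compiler to vectorise the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// SOA: one contiguous array per component, instead of an array of {x, y, z} structs.
struct Vec3SoA {
    std::vector<float> x, y, z;
    explicit Vec3SoA(std::size_t n) : x(n), y(n), z(n) {}
    std::size_t size() const { return x.size(); }
};

// The register width is a template parameter, so only W changes when moving
// from 4 lanes to 8 or 16. The inner loop stands in for one SIMD add.
template <std::size_t W>
void add_lanes(const float* a, const float* b, float* out, std::size_t n) {
    assert(n % W == 0);  // sketch only: no handling of a leftover tail
    for (std::size_t i = 0; i < n; i += W)
        for (std::size_t lane = 0; lane < W; ++lane)
            out[i + lane] = a[i + lane] + b[i + lane];
}

// Component-wise add of two SOA vector arrays, W elements at a time.
template <std::size_t W>
void add_vec3(const Vec3SoA& a, const Vec3SoA& b, Vec3SoA& out) {
    assert(a.size() == b.size() && a.size() == out.size());
    add_lanes<W>(a.x.data(), b.x.data(), out.x.data(), a.size());
    add_lanes<W>(a.y.data(), b.y.data(), out.y.data(), a.size());
    add_lanes<W>(a.z.data(), b.z.data(), out.z.data(), a.size());
}
```

Calling `add_vec3<4>` or `add_vec3<8>` on the same data gives the same result. With an AOS Vector3 class there is no equivalent one-parameter change, which is the point being made above.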
Billions of years in the ground, and only a few centuries on the roof and all of the radioactivity is gone! Wow!
it was blessed....
no, because we run ssd's.... :p
Reminds me of a situation Yann L found himself in (see his post here). For those who don't like clicking links:
We've had a very annoying situation in the past, where someone (we know who he is, but I won't disclose any details for obvious reasons) actually took the detailed explanations about a novel technique a colleague of mine and myself gave on various occasions (presentations, and also from posts on these boards), and tried to file a patent in his own name on them, with the intention to sue us afterwards for patent infringement on 'his' invention. We sued, and won on prior art, but it cost us a lot of money (we won't get anything back from the guy, he doesn't have anything). Both myself and also my company believe that technological innovation should be freely available, and that's why we were always pretty open about it in the past. But if some people try to use this against us with the intent of making a quick buck (some passages from some of my older posts in the GP&T forum were almost quoted 1:1 in this individual's patent application!), then we have to be much more careful about what we say on public platforms. Even though a simple post in this forum can classify as prior art (and more so a published paper), it is still upon us to prove it - and that drains our resources.
It's a real shame - the guy knows his stuff, and every game dev out there would like some more detailed posts on his research. I hate patents more than most, but unfortunately relying on prior art is foolish in this day and age (unless you have an extremely happy looking bank balance!). Save yourself the bother and apply for the patent if you really think it's worth it.....
not even as a beermat?
but my current motherboard graphics can only drive 1280x1024, which is lame, so I'll need a graphics card. And almost all of the AGP graphics cards are lame too - they won't do 1920x1280, though some might do 1680x1050, so basically I'd need to upgrade the motherboard to do PCI-express.
ATI to the rescue sir! That card will do what you need (no problems driving 2x1080p monitors). DX10 + GL3.3 support, handles blu ray nicely. I've got one in a circa 2002 dell 530 precision workstation (as well as sata raid + IDE RAID + USB2 controller cards). Runs absolutely fine..... Hell, you are more than welcome to buy it all off me if you *really* want!
I've got an Atom 330 + ION as well, and to be fair, it can't keep up with the old dell (dual 2.2GHz HT xeon). The ION chip isn't bad (although it's not great either), and can run VMWare (although the ATOM *really* struggles to the point of not bothering).
> Although it was also the lack of hardware T&L that eventually killed it, once the critical mass of games taking advantage of it was surpassed.
I'm hoping you realise that HW TnL 'just works'. Plug in the card, and all those OpenGL calls are suddenly hardware accelerated? Zero coding effort required.... IIRC with D3D at the time, I think we had to wait for a new SDK update - but thankfully someone erased my memory of D3D pre version 9 ;)
Simple fact is, register combiners (and then vertex programs with the GeForce3) killed off the competition. Imho, we're lucky that ATI managed to survive that period! (Others like S3D, Diamond, 3DLabs, and Matrox failed to keep up!).
not any more. the igp's are on the cpu's these days.
Open standard it may be(come), but I'd rather take a working standard (albeit proprietary) which works today, and lets me chat to my colleagues on the other side of the Atlantic (and has time and again proved that it works for that purpose for the past 5+ years or so). Where I work Skype is as much of an essential coding tool as visual C++, gcc or gdb is. Programmers *know better* than to believe hype about 'what may become' or 'what might be'. So until we have an open standard for FaceTime, have clients that provide more than the Skype client, and have found that Skype is no longer fit for purpose, we won't be changing anytime soon. Until that time, Skype has served our purpose for years, we've built our company practices around it, so why change?
Yeah, last time I used e-bay I had to pay some guy £50 to buy a camera lens off him! The cheek! I even had to pay the royal mail to deliver it. Jesus, is nothing free anymore? You know what really takes the biscuit?!? Last time I spoke to my fiancée on skype, apparently I've got to pay for the wedding now! I hate e-bay's model of making me pay for things - it sucks! Too right about lock in! If it wasn't for e-bay, I wouldn't be engaged! (..... actually that last bit, that's not actually true!)
Smaller by what metric? Number of devices in circulation? Or number of new devices likely to be sold this year/next ten years? Businessmen only care about the latter I'm afraid.... Are you honestly surprised Adobe were worried about iOS not supporting Flash? (well let's be honest, Adobe doesn't care, but their shareholders do.....)
You must be the oldest person on slashdot!
Well done sir. You win today's award for completely failing to understand what HW virtualisation is....
I've got an atom 330 (Dual core HT), and have run a guest OS on it yes. The performance was sodding awful for a number of reasons.....
1) Only 1 core is available to guest OSes (even though it can handle 4 HW threads). Why? No HW virtualisation support.
2) It is impossible to dynamically assign CPU cores to the guest OSes. Why? No HW virtualisation support.
3) It is impossible to run 64bit OSes as a guest. Why? No HW virtualisation support.
4) When running a guest OS, performance is appalling. Why? No HW virtualisation support, meaning it can't execute the instructions directly.
So, on your D510, are you seriously trying to suggest that you have magically managed to overcome all the *PHYSICAL* limitations of the CPU? You can run a 64bit guest can you? You can dynamically assign multiple cores can you?
Christ, next you'll be claiming Intel graphics cards support OpenGL4.0.....
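If you want to check rather than argue, on Linux the CPU's advertised features show up on the `flags` line of `/proc/cpuinfo`: `vmx` is Intel VT-x and `svm` is AMD-V. A minimal sketch that scans text in that format follows. The function name is made up for this example, it takes the text as a string rather than opening the file itself, and a flag being present only means the CPU advertises the feature (the firmware can still have it switched off).

```cpp
#include <sstream>
#include <string>

// Returns true if any "flags" line in /proc/cpuinfo-style text lists
// vmx (Intel VT-x) or svm (AMD-V). Hypothetical helper, not a library API.
bool has_hw_virtualisation(const std::string& cpuinfo_text) {
    std::istringstream lines(cpuinfo_text);
    std::string line;
    while (std::getline(lines, line)) {
        if (line.compare(0, 5, "flags") != 0) continue;  // only the flags lines
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos) continue;
        std::istringstream flags(line.substr(colon + 1));
        std::string flag;
        while (flags >> flag) {
            // Whole-token match, so e.g. "svm_lock" alone does not count.
            if (flag == "vmx" || flag == "svm") return true;
        }
    }
    return false;
}
```

Feed it the contents of `/proc/cpuinfo` (read with `std::ifstream`, say) and you'll know which side of the argument your own box is on.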
Almost, but not quite. The things that suck about the Atom:
1. double precision. Use a double, and the Atom will grind to a halt.
2. division. Use rcp + mul instead.
3. sqrt. Same as division.
All of those produce unacceptable stalls, and annihilate your performance immediately. So don't use them!
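As a rough idea of what "rcp + mul" means in practice, here is a sketch using the SSE intrinsics. `div4_approx` is a made-up name. `_mm_rcp_ps` only gives an estimate of the reciprocal (roughly 12 bits), so this version adds one Newton-Raphson step to recover most of the precision. That refinement is my addition, not something stated above, and whether it is needed depends on how much accuracy you can give up. The scalar fallback is only there so the sketch builds on non-SSE targets.

```cpp
#include <cstddef>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAVE_SSE 1
#else
#define SKETCH_HAVE_SSE 0
#endif

// Approximates out[i] = num[i] / den[i] for four floats without a divide.
inline void div4_approx(const float* num, const float* den, float* out) {
#if SKETCH_HAVE_SSE
    const __m128 n = _mm_loadu_ps(num);
    const __m128 d = _mm_loadu_ps(den);
    __m128 r = _mm_rcp_ps(d);  // fast estimate of 1/d
    // One Newton-Raphson step: r = r * (2 - d * r)
    r = _mm_mul_ps(r, _mm_sub_ps(_mm_set1_ps(2.0f), _mm_mul_ps(d, r)));
    _mm_storeu_ps(out, _mm_mul_ps(n, r));  // num * (1/den)
#else
    // Portable fallback: plain scalar reciprocal, then multiply.
    for (std::size_t i = 0; i < 4; ++i) out[i] = num[i] * (1.0f / den[i]);
#endif
}
```

The result is close to, but not bit-identical with, a true divide, so compare against a tolerance rather than for equality.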
Now, you'd imagine those are insurmountable, but you'd be wrong. If you use the Intel compiler, restrict yourself to float or int based SSE instructions only, avoid the list of things that kill performance, and make extreme use of OpenMP, they really can start punching above their weight. Sure they'll never come close to an i7, but they aren't *that* bad if you tune your code carefully. In fact, the biggest problem I've found with my Atom330 system is not the CPU itself, but good old fashioned memory bandwidth. The memory bandwidth appears to be about half that of Core2 (which makes sense since it doesn't support dual channel memory), and for most people that will cripple the performance long before the CPU runs out of grunt.
The biggest problem with them right now is that they are so different architecturally from any other x86/x64 CPU that all apps need to be re-compiled with relevant compiler switches for them. Code optimised for a Core2 or i7 performs terribly on the Atom.
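A minimal sketch of the "float only, lots of OpenMP" style described above. `scale_add` is a made-up example, not code from any real project. Built with OpenMP enabled (e.g. `-fopenmp` on gcc) the loop is split across the hardware threads. Without it the pragma is ignored and the loop simply runs serially.

```cpp
#include <cstddef>
#include <vector>

// out[i] = a[i] * k + b[i], single precision only: no doubles, divides or sqrt.
// Assumes a, b and out all have the same length.
void scale_add(const std::vector<float>& a, const std::vector<float>& b,
               float k, std::vector<float>& out) {
    const long n = static_cast<long>(out.size());  // signed index for older OpenMP
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        const std::size_t idx = static_cast<std::size_t>(i);
        out[idx] = a[idx] * k + b[idx];
    }
}
```

A loop like this does very little arithmetic per byte loaded, so on a machine that is short of memory bandwidth, extra threads stop helping fairly quickly.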
Absolute tosh. You need CPU hardware virtualisation support to do those things you speak of. Those features are nowhere to be found in the Atom. You will not be running VMs on any Atom based system because they are simply not up to the job. (I speak from experience here)
The ATOM doesn't support virtualisation in hardware, so at best, you are limited to one 32bit OS as a guest per core (no 64bit since you need CPU support for virtualisation) - and even then, the performance is so bad that you might as well just not bother. Having tried VirtualBox on an Atom330, I can assure you it's really not worth waiting for the guest OS to finish installing....
You'd be better off inserting them in cows.....
You're right. All cars sold in England these days are required by law to be rated using cubic fathoms per inch. My smart car averages 2 when driving through London.....
hmmmmm.... slashdotting spam off the face of the earth? Why did no one think of this before? Tomorrow is a new dawn for nerds everywhere:
"No, I'm not time wasting, I'm slashdotting for the benefit of mankind!"
the worm or the pc?
The OP didn't say anything about employees - he said workplace. Ever worked in a university? It's far easier to ghost the machines at the end of every day or session than deal with hundreds of queries a day from the vast majority of the 20,000 students who struggle to understand the basic concepts of computer security.
There's another little thing I discovered today whilst dicking around with a false account. If you add someone as a friend, you can see their 'new friends' news feed items before they've added you as a friend. Not sure if it applies to other news feed items. It was purely an accidental discovery, and the window of opportunity had passed before I could investigate further. Going to play around a bit more tomorrow.
That's nothing new I'm afraid. I've been making new IP for over 10 years.....
Ferrari has had some experience dealing with the weight problem with their KERS system this year. I'm guessing the hybrid will be taking a lot of lessons learned from the F1 team...