maraist · Slashdot Mirror

Re:Oh gee on Aspect-Oriented Programming Considered Harmful · 2005-04-24 03:40 · Score: 2, Insightful

AOP does have the ability to inject arbitrary code in your function that can change its semantics and you cannot see this when reading the affected code.

Well, except that AOP can only modify code in a structured way. You can replace a method (no more obscure than method overriding), or you can wrap a method, turning its body into a nested scope (again, similar to a method which calls super). Yes, they have before and after methods, but these are mere conveniences over wrapping a method.

My argument is only that while AOP does hurt readability, it's not MUCH different from holding a reference to a base class and not realizing that your instance is really a derived class which overrides your method. If you don't have the proper tools to warn you that a method is overridden (IntelliJ IDEA or, I believe, Eclipse), then you aren't privy to the details of execution. Moreover, there ARE AOP analysis tools which can warn you that an aspect has wrapped this method. Thus, given sufficiently intelligent tools vs. sufficiently obscure inheritance points in basic OOP, the whole readability point is moot.
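
A minimal sketch of that analogy, assuming made-up Account / AuditedAccount classes (hypothetical names, not from any real library). The caller holds a base-class reference, and nothing at the call site reveals that a derived class has wrapped the method the way an aspect might:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: what the reader of the calling code thinks runs.
class Account {
    protected int balance = 100;

    int withdraw(int amount) {
        balance -= amount;
        return balance;
    }
}

// Hypothetical subclass that wraps the parent method, much as an aspect
// wrapping a method would: code before, original body, code after.
class AuditedAccount extends Account {
    final List<String> log = new ArrayList<>();

    @Override
    int withdraw(int amount) {
        log.add("before withdraw " + amount);    // the "before" part
        int result = super.withdraw(amount);      // original body as a nested scope
        log.add("after withdraw -> " + result);   // the "after" part
        return result;
    }
}

class OverrideDemo {
    // Only the base type is visible here; which body runs is decided at runtime.
    static int callThroughBase(Account a) {
        return a.withdraw(30);
    }

    public static void main(String[] args) {
        AuditedAccount audited = new AuditedAccount();
        System.out.println(callThroughBase(audited));
        System.out.println(audited.log);
    }
}
```

Reading callThroughBase alone tells you nothing about the logging, which is roughly the readability cost being compared to an aspect.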

My only concern with AOP is in defining an aspect that bites off more than it can chew. It may initially solve your problem VERY elegantly, but as the scope of the aspect grows, you can find more and more hacks which dramatically reduce the understandability of the code and, worse, allow obscure bugs to appear. As I've said in other posts, I prefer aspects which do not directly affect the ability of a code block to do what it is conceptually designed to do. They must have side effects which are useful to the aspect, and nothing else. Logging/tracing are perfect examples. The problem is finding other examples. Moreover, these are "aspects" which could easily be implemented by a given VM.

Re:I like GOTO! on Aspect-Oriented Programming Considered Harmful · 2005-04-24 03:31 · Score: 1

Goto is NEVER required. There are aways structured ways around using it even in your example.

While I'm a long-time student of goto == spaghetti, I'd stay away from the "NEVER" statement.

Imagine, for example, an embedded controller which has a hot-off-the-shelf C compiler. It's brand new, and C is great for new architectures because it's so close to the metal that it's relatively easy to make a compiler generate assembly code from the language.

In such a circumstance, where driver performance and memory usage are critical, a goto is still acceptable, though you might want to wrap the functionality in macros at that point. That way you still look like you're calling functions, for clarity's sake.

Thankfully, low-power, low-cost, high-performance computing is virtually upon us, and thus the need for this is diminishing.

Re:I like GOTO! on Aspect-Oriented Programming Considered Harmful · 2005-04-24 03:25 · Score: 2, Interesting

More and more languages support named continue/break statements. Don't remember if gcc supports it yet. Such statements are more powerful than goto because they imply cleanup code that may or may not be implied by a goto. For example, you know that on a labeled continue the for loop's post (update) block will be invoked and the conditional re-run.

FOO: for (..) {
    BAR: for (..) {
        if (..)
            continue FOO;
        if (..)
            break BAR;
    }
}

Perl was nice because it had even more flow-control statements for clarity. "redo", for example, would go to the start of the loop block without running those two for-blocks (the update and the conditional). This is nothing particular to Perl, just a conceptual clarity which was valued by the compiler writer, to avoid justifying the use of a goto.
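
The labeled continue/break behaviour described above can be tried out in Java, which supports both. This is only a sketch; the class name, loop bounds and recorded strings are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class LabeledLoops {
    // Records which (i, j) pairs are visited so the control flow is observable.
    static List<String> visit() {
        List<String> seen = new ArrayList<>();
        outer:
        for (int i = 0; i < 3; i++) {
            inner:
            for (int j = 0; j < 3; j++) {
                if (j == 2) {
                    continue outer; // skips the rest of outer's body, then runs i++ and i < 3
                }
                if (i == 1) {
                    break inner;    // leaves only the inner loop
                }
                seen.add(i + "," + j);
            }
            // Reached only when the inner loop ends without "continue outer".
            seen.add("end-of-outer-body " + i);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(visit());
    }
}
```

Unlike a bare goto, the labeled continue still goes through the outer loop's update and condition, which is the implied cleanup the comment is talking about.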

Re:I like GOTO! on Aspect-Oriented Programming Considered Harmful · 2005-04-24 03:18 · Score: 1

Breaking the freeing code into functions is a bit heavy solution

See, you obviously recognized the "correct" solution. With the comment "don't want to duplicate code", you immediately have to think of writing a function (which is the only reason functions exist; well, that and easier-to-read recursion).

The problem with your justification is that it isn't true. I challenge you to write a Java snippet with a while loop of a million iterations: one version with an inlined code block, one with a private method invocation, and one with a public method invocation.

Guess what, once the JIT has warmed up they'll all take about the same amount of time. That's the point of a JIT. A C compiler could see the private method and say, hey, the insert is small, let me inline it. But it CAN'T inline the public method unless it wants to triple the size of the code (two inlines + the original method). In a VM, though, you can do whatever the hell you want, because the actual runtime environment looks NOTHING like the original byte-code. You are actually generating native assembly code based on the byte-code. Moreover, the VM keeps re-optimizing that generated code as the program runs. As for the public method, the VM knows whether anybody is actually referencing it, be it a call or an overriding method. And at the point in time that you're iterating over the loop, the VM can decide to erase the public method and inline it, keeping a tight piece of code. IF a constructor is later called from a class which DOES override or invoke the public method, then the optimization is invalidated and new analysis begins.
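
For anyone who wants to take up the challenge, here is a rough sketch of the three variants. The class and method names are invented, and a naive System.nanoTime loop like this is not a rigorous benchmark (warm-up, dead-code elimination and timer noise all interfere), so treat the printed timings as a hint at best:

```java
class InlineBench {
    // Same tiny operation three ways; (i & 7) keeps the sum well inside int range.
    private int addPrivate(int acc, int i) { return acc + (i & 7); }

    public int addPublic(int acc, int i) { return acc + (i & 7); }

    int runInlined(int n) {
        int acc = 0;
        for (int i = 0; i < n; i++) {
            acc = acc + (i & 7);        // body written directly in the loop
        }
        return acc;
    }

    int runPrivate(int n) {
        int acc = 0;
        for (int i = 0; i < n; i++) {
            acc = addPrivate(acc, i);   // private method call
        }
        return acc;
    }

    int runPublic(int n) {
        int acc = 0;
        for (int i = 0; i < n; i++) {
            acc = addPublic(acc, i);    // public (overridable) method call
        }
        return acc;
    }

    public static void main(String[] args) {
        InlineBench b = new InlineBench();
        int n = 1_000_000;
        // Several rounds: the early ones mostly serve to warm up the JIT.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            int a = b.runInlined(n);
            long t1 = System.nanoTime();
            int p = b.runPrivate(n);
            long t2 = System.nanoTime();
            int q = b.runPublic(n);
            long t3 = System.nanoTime();
            System.out.printf("round %d: inlined=%dus private=%dus public=%dus (sums %d %d %d)%n",
                    round, (t1 - t0) / 1000, (t2 - t1) / 1000, (t3 - t2) / 1000, a, p, q);
        }
    }
}
```

The sums are printed so the loops cannot simply be optimized away; whether the three timings really converge will depend on the JVM and its settings.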

That's the beauty of VMs. You can write really inefficient code that focuses on clarity and maintainability. The language is written for HUMANs or, more and more, for analysis code. Optimizations are delegated to the JIT.

Now with the pushing of multi-CPU architectures, the arguments against GCs are going to start falling away. SUN's JVM, for example, has certain GC optimizations which are demonstrably only useful once you get to 4 or more CPUs. With hyper-threading-enabled P4s, we're already partway there (two logical CPUs). With dual-core hyper-threading, we're that much closer to a practically JIT'd world where the detriments of virtualization become disguisable.

Sorry, this was a rant mostly about performance which really should have been directed at a different post. But it was kind of related. :)

Re:I like GOTO! on Aspect-Oriented Programming Considered Harmful · 2005-04-24 03:08 · Score: 1

You know, some people are coding things like HD Controllers or similar. Then you want to control *exactly* when to free the bus

Perhaps there is an aspect of what you're saying that I'm not experienced enough with (not having written/modified many drivers). But if by "freeing" a bus you mean something equivalent to closing a file or breaking a network connection, you're absolutely NOT supposed to let the garbage collector do this. You are supposed to explicitly call exit, close, disconnect, or what-have-you. Unless the object puts these statements in its "finalize" method, the GC doesn't even do this. Moreover, in Java you are never guaranteed to have "finalize" called anyway (certain optimizations can often be made, such as a copying collector that looks only at live objects, not at unreferenced ones).
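
A minimal sketch of releasing a resource explicitly instead of leaving it to the collector. BusHandle is a made-up stand-in for whatever is being held (a file, a socket, a bus); only java.io.Closeable is a real API here:

```java
import java.io.Closeable;

// Hypothetical resource; close() stands in for "free the bus".
class BusHandle implements Closeable {
    private boolean open = true;

    boolean isOpen() { return open; }

    void write(int value) {
        if (!open) throw new IllegalStateException("bus already released");
        // a real driver would push 'value' onto the bus here
    }

    @Override
    public void close() { open = false; }
}

class ExplicitRelease {
    // Uses the handle and releases it at a known point, even if the work fails.
    static BusHandle useAndRelease(boolean fail) {
        BusHandle bus = new BusHandle();
        try {
            bus.write(42);
            if (fail) throw new RuntimeException("simulated failure");
        } catch (RuntimeException e) {
            // swallowed only to keep the demo short
        } finally {
            bus.close(); // runs on both the normal and the failing path
        }
        return bus;
    }

    public static void main(String[] args) {
        System.out.println(useAndRelease(false).isOpen());
        System.out.println(useAndRelease(true).isOpen());
    }
}
```

The point of the pattern is that the release happens at a place you chose in the code, not whenever (or if ever) the garbage collector gets around to the object.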

And if what you mean is that you're releasing memory to the system and allowing the driver to exit, then that's no different than finding a way to unload your application/applet when you're done anyway.

I am aware of Java-based driver/controller situations. Mostly with embedded hardware that use whatever micro-java OS was popular in the 90's; never worked with it personally, so I don't know the limitations. But in these circumstances, you can use hardware assisted garbage collection which can be rather powerful.

I'm basically just asking for clarification of what you think is inherently limiting about garbage collection, when in fact GC, to my mind, enhances the long-term resource maintainability of any piece of software. The only caveat is that if you need real-time performance, you have to use a lower-throughput GC which has an upper bound on CPU utilization.

If you're really real-time, then you're going to adhere to the mantra of keeping two sets of objects: one base set of long-term objects and one set of transients. The long-term objects' fields only reference other long-term objects, never transient material. Transient references only live as locals and parameters. To prevent aging of transients, the call stack should periodically exit all the way out to the root of the thread.

Thus the venerable objects will soon be tenured into a region of memory which is hardly ever garbage collected, and transient objects will only ever live in the eden (with a scant few ever accidentally being tenured). Thus each garbage collection will find that most of the eden should be flushed, which involves almost no work. It is only when the eden can't flush out 80% of itself that it spills into tenured memory space. And it is only when tenured memory space is full that it has to perform a full GC and thereby wake up into cache the entire tenured section of memory (less large arrays of primitives).

The above are only suggestions to improve performance / throughput. Your GC is still chosen so as to meet real-time constraints.

Re:best tool for the task on Aspect-Oriented Programming Considered Harmful · 2005-04-24 02:50 · Score: 2, Insightful

It's very rare, in Java code at least, not to call a million functions to perform even the most basic operations. Want to iterate over a collection? Most likely you use collections instead of arrays for EVERYTHING these days. Why? Because there are a million tools which enhance collections: filters, queries, dynamic expansion/contraction, nice default toString capability, arbitrary nesting, etc. So if what you're trying to do is print something in a while loop, you can custom-tailor your aspect to capture the exact looping point.

Likewise with the dozens of other "no-function-is-larger-than-a-screen" operations. If you want a logging point, make sure you throw it into a method.

Ok, what about the overhead of calling functions in loops? Got you covered, it's called JIT. Performance is no longer the reason to use C over VMs. A good C compiler might not know the optimal time to inline code versus leave functions separate. But a JIT can. Sure, it'll take many iterations to figure it out.

This applies to aspects as well. Even though you're adding large amounts of code to each method (entry/exit points), it's likely that the most commonly used points will be inlined, re-arranged for optimal execution, etc.

That being said, I haven't been happy with using aspects for anything that directly affects functionality. So the example of grabbing a semaphore would make me ill. In a previous life, we used aspects as a security model. Every method invocation of a certain category had its returned contents access-control-restricted. The problem was that functionality started relying on it. We wouldn't redundantly check for things, but that meant that there was magic in the code. We justified it by saying that we were effectively segregating the database. You physically can't get to data that you don't directly or indirectly own, so you can simplify your programming by assuming that you have access to everything. Of course this doesn't translate into SQL calls that return counts and what-have-you, so the limitations crept up too often.

But logging has no direct effect on an application. Neither does statistics gathering, or run-time debugging of hard-to-reach code. Trace paths. Possibly even event generators. Things that at the very least don't affect the operation of what a class logically wants to do or provide. After all, most event generators are hidden from the general work-flow. They have an internal fireXEvent and side-interfaces which expose registerX or what-have-you.

That being said, current AOP adoption is too shaky. Either you have to manually wrap objects with proxies, or you have to run a post-compiler, which plays havoc with other post-compilers (like cactus). You can only really safely have one post-compiler, and so you should probably have no post-compilers; it's a broken concept. Something like jboss with its class-loader runtime bytecode enhancement is cool, but not everybody wants to run under jboss.
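
To make "manually wrap objects with proxies" concrete, here is a sketch using the JDK's own java.lang.reflect.Proxy to add trace logging around an interface. Greeter, PlainGreeter and TracingProxy are invented names, and this is plain JDK reflection, not how any particular AOP framework does it:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical business interface and implementation.
interface Greeter {
    String greet(String name);
}

class PlainGreeter implements Greeter {
    public String greet(String name) { return "hello " + name; }
}

class TracingProxy {
    // Trace entries are collected in a list so the side effect is easy to inspect.
    static final List<String> TRACE = new ArrayList<>();

    // Wraps 'target' so every call through 'iface' is traced on entry and exit.
    @SuppressWarnings("unchecked")
    static <T> T wrap(final T target, Class<T> iface) {
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                TRACE.add("enter " + method.getName());
                try {
                    // Note: exceptions from the target arrive wrapped in
                    // InvocationTargetException; a fuller version would unwrap them.
                    return method.invoke(target, args);
                } finally {
                    TRACE.add("exit " + method.getName());
                }
            }
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, handler);
    }

    public static void main(String[] args) {
        Greeter g = wrap(new PlainGreeter(), Greeter.class);
        System.out.println(g.greet("world"));
        System.out.println(TRACE);
    }
}
```

The tracing never changes what greet returns, which fits the kind of aspect argued for above; the catch is that someone has to remember to call wrap for every object, which is the "manual" part.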

So unless the language itself adopts it, I'm not a huge fan of AOP right now.

Re:Devil's advocate on Michael Robertson Says Root is Safe · 2005-04-18 15:37 · Score: 1

Your argument is basically "Linux distros are safe because all of your applicaitons come from a single vendor you can trust".

Well, distribution == security wasn't my argument, but that's not bad either, I guess. It's more that Linux distros can have everything you'd want to run installed at OS-installation time, a time when you don't need to log in. We're talking about potential purchasers of something like Lindows or Linspire, so keep that in mind. Lay people like my father aren't going to do anything with root; they're going to look at something like knoppix and say "what's in my menu of fun today". They'll see a word processor, a browser, an IM client, etc. If the stock equipment represents what the target audience needs, then after initial installation and configuration of certain hardware like printers / network connections, there is no further need for root use.

I also acknowledged that this is a crippled way of general computing, as there isn't room for 3rd parties (outside of the distribution's choice or free licensing).

Re:Okay now... on Michael Robertson Says Root is Safe · 2005-04-18 13:23 · Score: 4, Insightful

There are some good replies here, but nobody's talked about "su" and friends. I know su's not a user-friendly application, but dammit, I use it all the time. After several OS upgrades, whenever something fishy is going on with an application, I open a terminal window, log in as a dummy user and run the application from there with a fresh configuration. Voilà, proper settings; it must be my dot-files being mangled in the upgrade. Time to hunt, save, and rm -r that dot-directory. Harder to do in gnome since they're all in a common tree. And yes, this is more of a power-user thing.

But if I want to visit some illicit web site, and I don't trust that my cookie files won't be sought out by some clever Ajax tricks (hey, it's new, we can fear it), I at least launch a different one of the dozens of installed browsers, or if I'm really paranoid, I log in as the dummy user (again, takes half a second from a terminal window). With the exception of X-atom-based consolidation of browsers, so long as I run a different base application (Epiphany, Mozilla, Firefox, Galeon, etc.), I can have two different users displaying graphics on the X session.

Again, I know.. power-user stuff.. But you could have (as I've pushed for in other posts) applications on the task bar launching applications of different users.. Especially if you're the distribution writer.. And ESPECIALLY if you're a single-user-signon distribution.

Re:Okay now... on Michael Robertson Says Root is Safe · 2005-04-18 13:09 · Score: 1

While what you say is true, I think you're missing the argument. It isn't that having multiple user accounts for different tasks isn't good or useful. The article (and its supporters) are saying the data is king, and the OS itself holds no data of interest to people who would purchase Linspire or even Win XP Home. The data is the only thing of value. And since the data is owned by the same person who would have to log in, there is no greater security than if there were only one user for the whole system.

I have argued several times in this thread that there are other reasons just as great as the data (namely the integrity of the hardware that you've purchased or spent time setting up, or register-per-install software). But as for the issue of your personal data (your spreadsheet files, word-processor documents, pictures, music, bookmarks, etc.), all this is available to any hacked application run by the user. So an open-office macro that runs file-system operations, or an exploited firefox instance, or interesting perl-hooked-into-GNORBA tainted input, can all be just as dangerous as on Windows, where everyone gets a free Admin-car! You get an Admin, and you get an Admin...

Really, there are still ways you can sell pre-configured machines that are easy to use but hard to exploit. Have a separate user account for each major application suite. Your gnucash owned by gnuuser. Your open-office by oouser. Your desktop (KDE/Gnome) by another user. Especially your browser and email reader as their own users. Even if you set their passwords all the same or establish a single sign-on, each program won't have the ability to directly touch the files of the other applications. So long as only GUI applications can make use of such inter-user file access (via behind-the-scenes sudoing or what-have-you), you're about as safe as you can be. Of course this doesn't work so well with the gnome mentality of taskbar+email+firefox integration. But that's the price you pay.

Re:Devil's advocate on Michael Robertson Says Root is Safe · 2005-04-18 12:52 · Score: 1

Unix lacks a security model rich enough to be truly useful to everyday users, and by extension, companies like Linspire that cater to them.

Frankly, I don't see the security model as having anything to do with usability. It's not like you take the application CD and install it into the CD-ROM and say "give me Quicken" when you use Linux.. Pretty much the beauty of Linux these days is that everything is already installed.. Why? Because the only things you can even FIND for it are open-source, and thus there's no reason why you shouldn't try to fit it onto the 4 CD installation set.

That says something negative about Linux, since we haven't enticed 3rd-party commercial for-profits our way. A good video game here and there is worth paying for, you know?

But my point is that with this current situation, what is there that you need to run as root on the desktop? The only things I can think of are finite and pretty well established. So you set up sudo accounts to let any user run the "printer-setup-wizard" or even "system-config-network". There aren't that many exploits that I can think of, and you should be able to spend some think-time coming up with more locked-down scripts that are sudo-world worthy.

So our user-wants-this-configured-this-way can run as root, but in a highly isolated fashion. And everything else runs as a non-privileged user.

Is there really a problem here?

Re:Okay now... on Michael Robertson Says Root is Safe · 2005-04-18 12:22 · Score: 1

When's the last time you met an elevator operator?

It's called your computer.. HA! I crack myself up.

Re:Okay now... on Michael Robertson Says Root is Safe · 2005-04-18 12:08 · Score: 1

Well, fortunately you're not making the decisions. The "users should have to learn" mentality is what keeps computers complicated and difficult to use.

Ok, so people SHOULD NOT have to learn how to drive a car. It should be as intuitive and easy to use as possible. We shouldn't confuse them with keys or remote-entry button-dethingies; you're discriminating against older people who don't have good tactile agility. You shouldn't require that people take tests to drive cars. They're consumers, let them do what they want, so long as they give us good business. Don't FORCE them to have air-bags, because we're worried about how air-bags kill babies or, worse, cost a lot of money to repair the dashboard if you get into a minor fender-bender (I was one of these paranoid types for a while, mind you).

Ok, so a computer and a car are different. I get your point. I'll grant you that. With computers, you can't steal people's social security numbers, bank-account numbers, secret-access-codes-to-terrorist-interested-facilities, you can't cause an MS-Windows-oriented network-connected medical device to crash, you can't infect a hard drive or a cell phone and cause it to overload and self-destruct (yes, you read correctly), you can't be held responsible because a hacker has 0wn3d your computer and used it to perform a highly illegal activity (and in this day and age, the government isn't paranoid about suspending civil liberties in its crusade against Muslims (I mean terrorists)).

So no, I see your point. Test people for planes, trains and automobiles, because they have immediate visible consequences to the consumer for misuse, but leave our virtual lives to the playground of incompetents.

That rant being made...

rm -Rf / as nonroot will make you give a sigh of relief.

That sounds like a workaround to make up for a design flaw in the command-line interface to me.

Except that there are good reasons why you'd want to rm -r the root of a partition. Or even of the "chroot"ed directory. I won't go further into detail. Don't claim ignorance and then profess a flaw.

ActiveX and a lot of spyware is contained in windows when running as non-administrator.

I don't know the first thing about spyware or Active X or Windows, so I certainly don't care. But since this isn't Windows we're talking about here, I fail to see how this is applicable.

Because Linspire is aspiring to have universal interoperability of applications LIKE ActiveX. Look at open-office, or gnome, or KDE. Here applications all talk with each other. They're all part of one big global community of trusted software. They're trusted because they're of the same user-id. BUT, each user has independent application-spaces (for gnome/kde, it's by user-name-named UNIX sockets). I think open-office literally is just one running application. Whenever a central widget in KDE dies, all of my KDE apps become unresponsive. So I use a mixture of gnome for desktop and KDE for applications. Thus at least I can click on my stupid task-bar. The EXACT same phenomenon occurs in windows. One shared component stalls for any reason and, like a traffic jam, every other intersection clogs. Ok, fine, great. But if you're running these applications as ROOT! I'll elaborate no more.

Any exploitable program you run as another user will still need a local escilation exploit in order to do anything harmful.

That's fine, but he has a point. How much actual real-world good does that do?

Well, UNIX wasn't built in a Windows-wannabe world. 90% of the Linux machines I use are not sitting in front of me. They're servers or peer-work-stations. And likewise, others at my company or part of my social demographic or whatever are likewise on those same machines. It isn't a one machine one

Re:Okay now... on Michael Robertson Says Root is Safe · 2005-04-18 11:30 · Score: 1

try
cd some-dir
ls
rm -r some-other-dir
cd some-other-other-dir
ls
rm -r some-other-other-other-dir
cd /usr/local
ls
rm -r some-o-o-o-dir
cd /var
ls
rm -r /etc

oops, dyslexic brain fart!

I've seen this done on a server.. It's really really fun to see what still runs; you'd be amazed.

Re:Okay now... on Michael Robertson Says Root is Safe · 2005-04-18 11:23 · Score: 4, Insightful

Don't forget, as a smart businessman, he knows how to sell his product. Logging in is REALLY hard to sell, even for XP users (notice the pretty typing-free login icons in XP). If XP required people to memorize passwords to do anything, then people would be used to it and wouldn't bitch about it in Linux. Thus, to have people adopt his product, he needs to soften the hard-core UNIX advocates' argument. Plus XP has one thing over Lin-whatever-the-hell-they-call-themselves: XP has a super-root account which nobody but MS has access to. It just isn't needed for any software/hardware installation. I'm speaking out of my hat; I don't even know much about win-Administrator.

Re:Okay now... on Michael Robertson Says Root is Safe · 2005-04-18 11:19 · Score: 4, Informative

I should be able to specify that a particular UID can listen on ifname:80

Have you looked into SELinux? I don't know if it allows port 80 access from an initially non-root user, but it allows you to run a locked-down root process. The problem is that it's apparently very complicated, so it only supports a scant few products out of the box. But web serving is one of them.

Re:Any reason why you are building it yourself? on Best Motherboard for a Large Memory System? · 2005-04-18 01:31 · Score: 1

I'll make a cheap reply by not checking the facts out for myself, but doesn't the Opteron have 4 external HyperTransport buses? The diagrams I remember looking at implied the 4-way symmetrical external interconnect could be attached to either adjacent CPUs or memory buses. So I was under the impression that each HyperTransport bus could be used to connect to 4 separate memory devices, or 4 CPUs. And THAT was why there were so many pins. If this is the case, then pin-out isn't the limitation. I'll accept that the internal memory controller may be limited to two buses, but a single CPU's pin-out should support 4 (in single-CPU mode).

Now, again, I'll cheat by not checking the facts, but I know that there are 3 Opteron models which support 1, 2 or 4 CPU configurations, which to me implies different pin-counts (or at least internal wiring). So it may be true that you'd have to purchase a 4-way CPU to get 4 [hyper-transport] banks ANYWAY. And in such a situation, it would make the most sense to simply produce motherboards that support 4-way. Which answers all our questions.

Re:Any reason why you are building it yourself? on Best Motherboard for a Large Memory System? · 2005-04-17 12:15 · Score: 2, Interesting

Plus what sane company is going to go with some homebuilt machine that, presumably, is important to the business when product is out there?

Start-ups, where dollars matter. What I've seen people linking to are $3,000 machines at their cheapest. Then the IBM and other solutions are probably in the tens of thousands of dollars. Yes, what is your time worth? If you work for a Fortune 500 company, then it and the consequences are worth a few measly 10 grand. But if, as I suspect, the author DOESN'T need CPU horsepower and DOESN'T need high-end disk arrays, and just needs memory, then you do the math. The likelihood of the project needing a 20-fold increase in price ($500 to $10,000) to save a few man-hours...

I quote $500 because a premium MB + 1G memory + 3000+ CPU can be had for much under $500. If you want a rack, sure, you're going to pay more, but considering that he was bitching about not wanting a multi-CPU server, I seriously doubt he was looking at racks.

And why would you want memory instead of hardware? Because there are an increasing number of applications which are memory-space intensive, not CPU or even disk intensive. Running VMware sessions, user-mode-linux, running many-many java servlet engines (each JVM can happily consume 1Gig of memory), caching your database. These are various instances that I personally run into on a regular basis where memory is NEVER enough. Especially when I hear people claim "memory is cheap". No it isn't! Not when you hit artificial limits like 4 DIMMs per AMD64 CPU. It's a fixed resource, and a programming environment which assumes an infinite amount of available memory can, in certain configurations and situations, turn it into a huge bottleneck.

The most reasonable solution is to purchase multiple machines to divide the work. But this doesn't scale too well, as your administrative overhead grows (as well as the cost of UPSs, power/heat constraints, shelf/rack space, etc.).

One of the points of going to a 64-bit architecture was to remove the architectural limit on addressing. Well, much like the early Motorolas, you can address significantly higher than the specific architecture can facilitate. This is to be expected, but to realistically have an upper bound of 16 chips or 64GB (assuming 4Gig chips, which I don't even know if exist; certainly not wholesale) is just kind of sad (considering 1Gig servers are the norm, and 64-bitness is well over a decade old).

A 2G chip costs at least $200 (for a crappy server-need-not-apply brand). Half-gigs start at $30 (or $120 for the 2Gig counterpart). Server-worthy half-gigs start at $60. Often, however, vendors that sell memory with their system charge enormous premiums (nearly a thousand dollars) for upgrading to the larger chip sizes. If, instead of being limited to 4 slots / CPU, the architecture facilitated 8 or 16 slots, then vendors wouldn't have as much justification for upgrading to the more expensive DIMMs and thereby passing the premium on.

It's simply impractical to afford purchasing a system with 16Gig of memory.. There are many other cheaper solutions which provide a significantly faster machine (including using a swap-file and RAIDing the hell out of it). But this is insane, there is no good reason why a larger memory solution should cost so much as to cause system administrators and developers to have to spend hundreds of thousands of development/deployment dollars working around such limits.

I recognize that you can't just put more memory slots on a BUS, but I don't see why the x86-64 architecture didn't facilitate fully banking the memory out into 4 parallel busses of 4 slots each.. To my knowledge, the most that is supported are two banks of two slots each.

The AMD64, while a GREAT chip series, seems just a little on the cheap side to me.

Re:Non-von Neumann Memory Architecture on AMD's New Venice Core Shows Overclocking Potential · 2005-04-08 17:45 · Score: 1

Again, very learned responses.. But I feel there are still miscommunications.

Executing both paths isn't free. It's way more power.

To my understanding, in the Itanium you don't execute both paths indefinitely, only so long as the predicate register is "pending". As soon as the register is no longer pending, all instructions bound to the false predicate are skipped over. So the only times you're executing "twice the workload" are the periods which have latency. While I can't speak for the Itanium, from a conceptual view this type of architecture means that if 90% of the time you're bound to an ultra-fast cache lookup, then you're likely to know the result of the conditional instantaneously (i.e. was the speculatively loaded memory value a zero). Thus the "false" execution paths are merely skipped over. Moving forward, there is still architectural head-room to speculatively execute one of the paths; the Itanium's initial incarnations simply choose not to perform speculative execution. The original discussion was about future progression of architectures, not the Itanium specifically. I'm merely pointing to what I thought were innovative features of the Itanium.

Furthermore - your code size blows up ...

Predicated execution is done via the consolidation of super-block code chunks into hyper-block code chunks (where the super-block rule of never having branch points into the code block is violated so that the "if" and "else" branches can merge back together within the same block). Compilers such as the open64 compiler, originally by SGI, make intelligent trade-offs for predicate-register architectures: code blocks are continuously consolidated so as to eliminate branches in favor of predicated instructions. BUT the compiler then looks at the resource capabilities of the generalized architecture and splits these hyper-blocks up into resource-effective sizes. So to say that "this explodes exponentially" is to ignore the responsibility of the compiler.

This means that your caches need to be MUCH larger to hold more code.

Larger cache is a fact of life. And with multi-megabyte caches finding their way into home PCs, asking for a larger instruction cache isn't the end of the world to me. Not to mention, last I checked, it was the data cache with the greater contention.

x86 has prefetch instructions.

Prefetch yes, but what I don't know is whether the x86 has consequence-delayed instructions (I haven't read an architecture book for the most recent x86 line). Does the latest x86 allow you to load a potentially invalid memory address and not throw an exception if the memory is never used? The concept is

if (a != null)
{
    x = a->b;
}

Here the CPU can immediately start loading a->b before it knows whether "a" is a valid pointer. It begins the null test and the load of b simultaneously. Only if the conditional is true does it commit the memory-load (which is the point at which an invalid address would cause a seg-fault). Tricks like this are designed to reduce the latency of data-dependency.

Perhaps dynamic out-of-ordering can do this as well; I'm not aware. The last I heard, the Pentium-Pro architecture had the ability to safely pre-load something into the cache. But you still needed to load it into a register when you were ready to use it (another instruction and thereby delay).

I think I see where you're coming from on the memory-bottleneck issue. I tried to work out a convincing example of what I was trying to say, but it doesn't seem to work out well. The best I could do was consider an architecture that assumes that all cache misses are going to take a lot of time and therefore try to do real work while waiting. But in the process it is effectively context switching (perhaps because it's out-of-ordering the instructions, or like the P4 line has multi-threading). Thus when the memory is ready with the data, the original instruction is not immediately ready to process it. Thus a critical

Re:Lowers the threshold on Indian Call Center Employees Hack US Bank Accounts · 2005-04-08 06:31 · Score: 1

Those were multi-billion dollar fraud, these are minor amounts of money.

I disagree. As the article said, $350k is a LOT in India. It might as well have been a multi-million-dollar coup (which is equivalent to what the heads of Enron probably earned from the scandal).

The issue, in my mind is that there were bright individuals in India without a creative outlet to financially succeed. There simply isn't the infrastructure and commercial-basis for becoming a multi-millionaire thanks to out-of-the-box thinking. Thus very bright individuals wind up working in crappy little high-paying jobs. There is little opportunity for advancement, and anything that would be challenging isn't likely to pay off. So you have intelligent minds taking your credit card numbers and other personal info.

If, on the other hand, intelligent people flocked to industry-creating ventures, then those that man the phones would be the under-achievers. Their motivational structure is different, and thus they are generally only capable of pulling off petty crimes, which are generally preventable by corporate governance.

The problem is not with India, or even China. The problem is that the US is tapping into this semi-lawful society, and expecting them to work just like min-wage Americans, when the dynamic is completely different.

You can still outsource to them, but treat them more as if you were outsourcing to a minimum-security prison (which many American firms do).

Re:Non-von Neumann Memory Architecture on AMD's New Venice Core Shows Overclocking Potential · 2005-04-08 06:15 · Score: 1

Good reply.. Just a few comments.

If the instructions are dependent, they are dependent, there's nothing you can do to reduce latency other than increasing frequency.

I was talking about algorithms being parallelizable. It's possible to compile many different assembly instruction sequences for the same algorithm. The instruction sequence often produces artificial dependencies. My comment also tried to say that modern CPUs are very good at trying to overcome artificial dependencies (register renaming, etc).

Moreover, when I refer to latency, I'm referring to bottlenecks which constrain the high-level algorithmic step. If an algorithmic step has two intermediate steps that depend on each other, then different architectures may have different ways of minimizing this dependency. The Itanium, for example, in the case of a conditional computation can begin execution as if it were true and then back off as soon as the dependency is resolved. While other architectures can do this as well, they must stick with a single path of execution. The Itanium executes all possible execution paths (that are serialized; no branches) until one or more paths are resolved. Thus in a two-way logical branch, if it takes 20 instructions to resolve the dependency, and 10 instructions were the true path and 10 instructions were the false path, you would have no risk of having "branch predicted" the wrong path. Put that into a 15+ stage architecture and a wrongly chosen path starts hurting.

Additionally, the Itanium has nice memory pre-fetching instructions with delayed consequences. I know the Pentium Pro added a couple of pre-cache loading instructions; I don't know if they've incorporated the equivalent though.

Additionally, the register-rolling capability of the Itanium was interesting, theoretically providing an optimal for-loop where, excluding setup and tear-down, every iteration of the loop took as little as 1 logical clock-tick. (At least as far as the software is concerned; the hardware had room to grow in EPIC-bundles / clock).

The point is that there is still a lot that can be advanced in latency resolution. I agree that the Itanium was a LOT of hardware thrown at a problem, so it's difficult to isolate how well a particular "feature" advanced its performance capability. But I don't necessarily buy that its performance / clock was due solely to large register sets and cache.

Dynamic information is needed and is VERY important for extracting parallelism. That's why out-of-order CPUs routinely trounce in-order VLIWs like Itanium.

The problem is that out-of-order (OO) analysis adds to the latency of an instruction. VLIW gives you the ability to do at least SOME of that computation before-hand. Dependency resolution doesn't need to be calculated by the CPU. The catch is that giving that info to the CPU takes up valuable instruction-space. EPIC was nice because you only wasted an extra bit to specify data-independent chains of instructions. Of course, I'm discounting the wasted NOP space in the VLIW instructions that are fully data-dependent; but that cost is already paid by the low-latency dispatch capability of VLIW. The issue is that there is no reason why compilers can't be intelligent. But it's generally considered too expensive to pass this info to the CPU. Instead, most architectures (like the Alpha) produce simple suggestive-flags which the CPU can ignore if its dynamic analyzer wants something different.

I've toyed around with designing architectures which produce massive batch-instructions (multi-kilo-byte instructions). The over-head of compile-time information is reduced in such circumstances.

Going forward, the key is that some information is expensive to compute; as much of that information as can be computed statically should be delegated to the compiler.

By lowering the CPU MHZ, you reduce the latency to the all-important main-memory.
Frequency has nothing to do with l

Re:Non-von Neumann Memory Architecture on AMD's New Venice Core Shows Overclocking Potential · 2005-04-07 15:31 · Score: 5, Informative

something revolutionary and cheap, maybe a new optical memory

Revolutionary and cheap.. You don't ask for much do you? Optical is coming slowly, but I'm not convinced it's ever going to replace electric current/voltage-based computing. At least not for general computing.. The problem is shrinking optical paths; you need a wave-guide for optical paths; for electric current, all you need is a string of closely spaced ionized atoms. Theoretically you could get down to a couple-atoms thick of wire with electric current.

Moreover, photons are only slightly faster than electric current. Electrical signals propagate at between 0.6 and 0.9 times the speed of light. What photons are really good at is traveling long distances without dispersing as heat. Electrons move only a couple of atoms before bouncing into something. But you can do lots of really useful things with electrons that you can't do with photons... Having photons mimic the functionality of electrons might not be doable on the same scale (meaning by the time you get 30 million photonic transistors on a die, you could probably get a billion electric transistors).

Quantum computing has the same density dilemma as photonic computing. But at least quantum computing does more than electric or photonic switching, so it doesn't need as many functional units. Don't expect to see an Intel Q4 any time soon.

As for a more practical architecture: if practical and economic are what you want, then Pentium 3's with a flat-bus multi-CPU architecture are where it's at. Lots of cheap cores on as simple an architecture as you can get.

The problem of course is in the mathematical algorithms that we use to do real work. Most steps of computational algorithms are inherently dependent on the results of previous steps, and are thus not parallelizable. Single-threaded CPUs have gotten VERY good at parallelizing individual instructions. The compilers aren't well suited for helping the CPU out, so things like the Itanium were supposed to exploit such parallelism. But the loss of backward compatibility (and the Itanium's focus on floating point) spelled the death knell for that architecture.

IBM, Intel, AMD are all pushing multi-threaded execution, basically giving up on figuring out how to make a particular algorithm work. They're pretending that a CPU which works well on a high-end server with lots of independent jobs (web pages, database transactions, IO requests, etc) can be sold to a market which is trying to scroll the mouse wheel on an Excel spreadsheet with a thousand rows. The spreadsheet navigation is extremely sequential. A dual-core CPU will be noticeable since there are periodic background tasks which often "get in the way" of your foreground task. But a 3rd/4th CPU is not likely to be useful at all to non-workstation end-users. (My workstation generally has 8 visible applications, all actively running).

Personally, I think the answer is taking a step back from MHZ and pipelining. Go back to a 3, 4 or 5 stage pipeline with MASSIVE read-ahead decompilation of instructions (similar to Transmeta). Get lots of high-speed cache on board with as little latency as possible (current large-cache architectures have HUGE latencies). By lowering the CPU MHZ, you reduce the latency to the all-important main-memory. Advance the state-of-the-art in power-consumption (I've read of several very novel approaches, including decreasing power to the point of statistically acceptable and correctable errors occurring in the computation). Perhaps put a second core on the CPU, but don't just put two identical masks. Make use of the fact that a CPU has hot and cold regions. Rewire both devices so they're really one big device with two functional CPUs.

Develop better heat-dissipation techniques. They've been very creative over the years: flipping the chip so the silicon directly presses against the heat-sink, for example. They've introduced lower-resistance copper as the main wire interconnect, which was a major material-science challenge. Newer exotic materials may provide for better heat conductivity and voltage regulation. The cooler you run a CPU, the higher the power it can dissipate, the more power you can shove into it, the more work you can ask it to do.

-Cheers

Re:Duh! on AMD's New Venice Core Shows Overclocking Potential · 2005-04-07 15:01 · Score: 3, Insightful

That is assuming you're compute-bound, instead of memory-bandwidth, harddrive-bandwidth, or some other kind of IO bound

Being hard-disk bound is hardly ever a factor for system upgrades. If you're HD bound, it's unmistakable, and you usually are doing something worth the money of upgrading the disk-system. 3D graphics-card bottlenecks, on the other hand, are real and subtle.

As for memory bound, I'm not aware of any benchmark (other than synthetic memory-testers) that failed to improve semi-linearly with CPU speed merely because it was memory bound. Increasing CPU speed these days generally means increasing the cache-speed, which implies speeding up critical memory paths.

Re:Still energy on Car Powered by Compressed Air · 2005-04-03 21:06 · Score: 2, Insightful

Electric engines have the disadvantage of having little power. . .

Beg pardon? Not to mention the fact that their torque curves are the stuff that give drag racers wet dreams.

Actually, you're nit-picking. When he says electric engines have little power, what he means is that the entire package provided to us in such a small form-factor as a car has too little power. It is more correct to say that the amount of electric power being provided by the battery-source is not sufficient to warrant a high-torque-capable electric motor, but it would be too long-winded to say the same effective thing.

Re:Drive Extension on Gmail's Birthday Presents · 2005-04-01 09:54 · Score: 1

2 gigabytes of storage might be pointless for just email

Are you kidding me? Apparently you don't subscribe to mailing lists. And apparently you don't have relatives that send you lots of pictures. With a 10 Meg attachment limit, you'd only have to receive about 200 such attachments over the course of a year or two and you're saturated.

Yes you can delete stuff, but gmail advertises "never delete your email again". How many years can you use an email account before it becomes over-saturated? By using google-search on your "archives" you can always retrieve a little thing that you didn't think was important enough to keep at the time.

Re:Better Question... on AMD Demos Dual-Core Athlon 64 · 2005-02-24 04:08 · Score: 1

Is this going to be another AMD innovation that will receive no support from the software industry

Sorry, but UNIX platforms have been supporting/encouraging multi-threading for well over a decade. Solaris has been a key player. You can't sell a 16-way SMP server unless your software can actually make use of it.

Hell, even Windows NT has been pushing MT for a long time (selling 2, 4 and 8 processor versions). Both Windows and most UNIXes have very well established MT software bases. Even simple things like web browsers, office applications, etc, have long adopted MT since it improves user-response-time when the system is otherwise too slow. Look at the old Mac OS 9. That was the last major OS that wasn't fully MT. Up through 9, you had cooperative multi-tasking (you had to manually yield the processor back to the OS); not full-blown pre-emptive multi-threading/tasking.

So, servers are fully ready to accept dual-core in any platform. More threads means more throughput under heavy load (if you aren't currently heavily loaded, then you're not purchasing a new machine, now are you?). For desktops, you will still see a noticeable difference in performance with MT. If you've ever had the system seem to hang for a few seconds when doing something heavy (virus check, disk-defrag, etc), there are two things that could be causing it: disk thrashing (no help from CPU here), and CPU-starvation. If the CPU is starved, then context switching back to the desktop still produces noticeable delays, especially since x86 chip designs aren't the most efficient at ctx-switching.

While it's true that most games aren't going to take advantage of SMP, there are some which were specifically designed to have multiple independent threads (AI v.s. graphics). Quake 3 comes to mind. Dual-core initially is being targeted at the server market ($1k+ chips). As killer-app designers demonstrate real performance benefits to dual core, we'll see more main-stream dual-core implementations... Eventually we'll see the retirement of single-core chips.

The article makes a very important case. Moore's law is all about the performance increasing over time through innovation. We're hitting a road-block with merely continuing in a straight line (shrinking transistor-size, using more exotic materials, adding larger caches, having different clock speeds for different portions of the core, VLIW, etc). Each of these provides some performance enhancement, but most is provided in the first generation of the technology. You have to keep coming up with radically different ideas which orthogonally enhance performance. While we can currently make 4GHZ CPUs with 2Meg L2 cache (see Intel), they have very long pipelines and the overhead (performance loss due to design trade-offs) provides tremendous diminishing returns. Moreover, the power dissipation is a very real problem. So much of the current technology is centered around managing power. Too much power load causes voltage variations and signal latencies (which prevent simultaneous arrival of signals on 64-bit wide busses; same reason we went from parallel IDE to serial ATA).
