Mr+Z · Slashdot Mirror

Re:lawyers on GPL Issues Surrounding Commercial Device Drivers? · 2002-11-05 11:56 · Score: 1

Clarify what you mean by 'use.' You can run a program that is licensed under the GPL without running afoul of any of GPL's terms. You can run it however you like, providing whatever input you like, and so on. You can delete it. You can reverse the contents of all its files. You can pipe it through a GIMP filter to make desktop wallpaper.

You cannot, however, redistribute a GPL or GPL-derived work without adhering to the terms of the GPL. That's a very narrow, but important subset of "use."

GPL is viral only in the sense that you cannot annex or incorporate GPL-licensed code into a non-GPL application and redistribute the result under a license that is not GPL compatible. What's interesting to note is that you can't even place the combined result into the Public Domain (since the GPL'd portion would lose its GPL license). (You can, of course, place your diffs in the public domain though.)

--Joe

[ot] apple ][ reminiscence on How About Drivers In Devices? · 2002-11-03 11:29 · Score: 1

Oh, how many times have I rebooted with FAA6G.... Or hooked CIN/COUT... I actually wrote a 'macro package' that hooked CIN and expanded certain control keys out to strings for you. It was one of the first programs I wrote after getting a proper assembler. Prior to that, I hand coded opcodes directly in the monitor. 2C 30 C0.... 4C 00 03.... ah the memories... (Ok, sometimes, if I was programming under DOS 3.3, I'd load Integer BASIC into the language card and do F666G. But it was a PITA.)

But now I'm extremely off topic...

--Joe

Re:Use a VM! on How About Drivers In Devices? · 2002-11-01 06:25 · Score: 1

It's called Open Firmware, and the 'VM' is a Forth interpreter/compiler.

--Joe

Re:Acorn RISC OS "Podules" on How About Drivers In Devices? · 2002-11-01 06:22 · Score: 3, Informative

Apple ]['s also stored the device firmware on the card for their add-in cards. You had 256 bytes of ROM allocated to each card at $Cx00, where 'x' was the slot number. For cards that needed more storage, you could use the 2K of space from $C800 - $CFFF, but I forget the exact rules for sharing that space.

PR #3 anyone? (Enables 80-column mode on a machine with one installed.) How about PR #1 (to print) or PR #6 (to reboot)?

--Joe

Don't forget PS2 and XBox's software subsidy. on Why Do Graphics Cards Cost So Much? · 2002-10-30 15:57 · Score: 2, Insightful

First of all. PS2 and XBOX systems have custom chips designed to be cheap to produce and are married to logic boards that eke every last bit of performance out of them. Couple that with long production runs and you get a cheap per unit cost. PS2's and XBOX's, unlike PC's, are not locked into the hardware upgrade cycle. Instead, they have a product lifespan of just a few years.

The mass production angle is an important one, but not the only one. PS2 is (or at least was) sold at a break-even point. XBox was sold at a loss. Neither has enough GPM (gross profit margin) to recoup its development expenses. Thus, it's important to remember that PS2 and XBox are subsidized by software sales, since their respective manufacturers collect licensing fees on games sold for those platforms. In contrast, NVidia et al. do not charge heavy licensing fees to game developers to get them to use their cards. Heck, it's more like the card developers go out and evangelize their specific technology tweaks to try to get buy-in -- they practically beg for people to develop to their cards, as opposed to slapping them with onerous license fees.

--Joe

Customers can't easily improve the optimizer on As Languages Evolve... · 2002-10-27 05:55 · Score: 1

If you're a customer and you're stuck with a particular tool chain for a given device, then you're stuck with whatever your vendor provides for optimization. For many embedded devices, there are only one or two compilers available.

Thus, your suggestion to improve the compiler isn't all that useful. The vendor needs to do that. Meanwhile, the customer either attempts compiler black-magic (e.g. tweaking their C code in obscure ways to try to help the compiler) or reduces their code to assembly.

For the vendor, to accurately measure the performance of the tools relative to what can be achieved with a given device, it is extremely useful to have hand-optimized benchmarks that represent "optimal" and measure versus them. (This is especially true for 'heavy compute' functions.) That is an activity I actually get to participate in -- I get to write some of the highly optimized assembly code that we benchmark our compiler against. I get to do that, though, because I work for the vendor. Our customers are stuck optimizing their code with no chance to improve the optimizer.

--Joe

Re:Google has a monopoly on Google Sued over Page Ranking · 2002-10-21 07:53 · Score: 1

Yeah but if KFC refused to serve chicken to employees of Popeye's, that would be illegal.

Non sequitur. As far as I know, Google lets employees of SearchKing perform searches and buy ads on Google the same as anyone else. It's more like KFC telling Popeye's employees that they need to buy some food before they can grab napkins and plastic-ware, which is fully within KFC's rights.

--Joe

Re:Baseless claim on Google Sued over Page Ranking · 2002-10-21 06:18 · Score: 1

You know, suing Google because they lowered your PageRank is kinda like suing a movie critic because they slammed your sequel even though they lauded the original movie.

Say the original movie was great, then they load up on product placement and tired gags in a sequel. The original will get great reviews, the sequel probably won't. Movie reviews carry great value -- a bad one devalues your movie. So, do the studios get to sue the critics for bad reviews? No, unless the reviews are libelous or slanderous. There's no requirement of objectivity or anything else either. Likewise goes for all the hapless companies whose products were placed in that hypothetical sequel. Shame on them for lacking taste -- they can't sue the critic for the bad PR they indirectly generated.

Free trade threat my ass. (I'm agreeing w/ you here Drizzten.)

--Joe

Re:Google has a monopoly on Google Sued over Page Ranking · 2002-10-21 05:56 · Score: 2, Informative

Google has a monopoly on its PageRank technology.

Yeah, and KFC has a 'monopoly' on its "Original Recipe" with "11 herbs and spices." And Coke has a 'monopoly' on the particular formulas used to make Coke Classic and Diet Coke. So what? It's called trade secret, and it's an accepted, established part of doing business.

Where the rules change, as several other people have pointed out, is when your business is ruled to be a monopoly. Then you fall under regulation so that you cannot use your trade secrets to exert undue influence. It's basically modern capitalism's way of saying "You won this market, you've got the biggest pile. Now play nice with the little guys."

Unfair trade practices don't come into play here. Using one of my examples above, just because Popeye's Chicken can't use KFC's Original Recipe doesn't mean KFC's wronged them. And if KFC accepts competitors' coupons, still no problem. And if KFC launches an advertising campaign saying "we taste better than Popeye's," I'm pretty sure you're still ok.

Unfair trade practices would be something like KFC making deals with poultry distributors so that Popeye's couldn't buy chicken at a decent price. Totally different kind of problem. For instance, Google would be guilty of unfair trade practices only if they went to SearchKing's ISP and exerted muscle on them to degrade SearchKing's connectivity, raise SearchKing's costs, or otherwise affect them. That's totally different than tweaking a private algorithm to cut out the freeloading and search-engine abuse.

--Joe

Re:loose versus lose on Killing Clutter With The Antidesktop · 2002-10-16 05:49 · Score: 1

Either he lost his pants or he loosened his pants. In modern American English, so far as I know, there is no loosed. Although, to be fair, Webster's Revised Unabridged Dictionary has some interesting and archaic-sounding examples of loosed:

Loose \Loose\, v. n. [imp. & p. p. Loosed; p. pr. & vb. n. Loosing.] [From Loose, a.] 1. To untie or unbind; to free from any fastening; to remove the shackles or fastenings of; to set free; to relieve.

Canst thou . . . loose the bands of Orion? --Job. xxxviii. 31.

Ye shall find an ass tied, and a colt with her; loose them, and bring them unto me. --Matt. xxi. 2.

2. To release from anything obligatory or burdensome; to disengage; hence, to absolve; to remit.

Art thou loosed from a wife? seek not a wife. --1 Cor. vii. 27.

Whatsoever thou shalt loose on earth shall be loosed in heaven. --Matt. xvi. 19.

3. To relax; to loosen; to make less strict.

The joints of his loins were loosed. --Dan. v. 6.

4. To solve; to interpret. [Obs.] --Spenser.

Joints of his loins? Uhm, yeah. Now bring me that ass. ;-) I'd say "loosed" is not part of the current vernacular. Or "loosing", for that matter. It says something that only the unabridged dictionary lists those strange words.

--Joe

slightly ot: a shortfall of pipelines on Phoenix 0.3 Is Out · 2002-10-16 04:16 · Score: 1

The Unix command line philosophy more closely resembles functional programming: data goes in one end of a component, and comes out the other end suitably transformed. This makes the protocol for fitting components together easy to understand.

One thing I pine for occasionally with pipelines (whether they're text or binary data) is the ability to specify more than one input, or tee off the output to more than one destination application. The Unix tee program lets me tap off of a pipeline to a file, but not to another program. I can kludge around it all with mkfifo and tee into that, but not all programs that accept file inputs like to do so from a named pipe. It also starts to get rather complicated to set it all up correctly.

Basically, what I'm saying is that stdin and stdout are a good start, but it would be even more useful if you could specify a vector of inputs and had a better way to fan out your outputs.
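Here is a rough sketch of the fan-out half of that wish, in Python rather than shell. It assumes the whole input fits in memory, so it is closer to "tee into several programs" than to true streaming. The `fan_out` helper and the two consumer commands are hypothetical names made up for this illustration.

```python
import subprocess
import sys
import threading


def fan_out(data, commands):
    """Send the same bytes to every command's stdin; return each stdout."""
    procs = [
        subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
        for cmd in commands
    ]
    outputs = [None] * len(procs)

    def feed(index, proc):
        # communicate() writes the input, closes stdin, and drains stdout,
        # so one slow consumer cannot deadlock the others.
        outputs[index], _ = proc.communicate(data)

    threads = [
        threading.Thread(target=feed, args=(i, p)) for i, p in enumerate(procs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outputs


# Hypothetical consumers, written as inline Python so the sketch runs anywhere.
UPPERCASE = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
COUNT_LINES = [sys.executable, "-c",
               "import sys; print(len(sys.stdin.read().splitlines()))"]

if __name__ == "__main__":
    upper, count = fan_out(b"a\nb\n", [UPPERCASE, COUNT_LINES])
    print(upper, count)
```

Each consumer sees its own copy of the input on an ordinary pipe, so nothing has to be taught to read from a named pipe. The vector-of-inputs half of the wish is not covered here.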

One thing that object models have over pipelines is that you build more of a web-like structure, not a linear concatenation of operations. I suppose that's also a drawback, when analysing the complexity of the system. :-)

--Joe

Re:No, it's her twin sister! on Microsoft Tries a "Switch" Campaign · 2002-10-14 14:07 · Score: 1

But then she'd be umop apisdn (upside down).

A two-fisted, no bullshit approach to working. on Slack · 2002-10-14 04:44 · Score: 1

I personally try to keep my maximum loading on any given day at or below 70%, but I also have a tendency to do some work on the weekends. It's no accident that 5/7ths is about 70%, so scaling a 5 day workweek to 7 days gives 100%. At least in theory. In the position I'm in, there's an infinite amount of work I could be doing. I work in bursts, get lots done, and then coast until the next burst. It works out great, because there are usually a lot of 'fires' that erupt during my coasting periods, and if I were working slavishly, I wouldn't be able to 'firefight.'

Ultimately, I think I do a better job of serving the company, as I'm able to work on projects and activities that are orthogonal to my "critical path tasks", and that helps out the productivity of everyone around me. I can spend the 10 minutes to look at someone else's code and spot a silly bug, or float an idea past someone about some project they're working on and so on. I love it.

I actually spend probably 1/2 of my day hopping between email, browsing websites to keep up on the news, avoiding conference calls, and generally ruminating about the state of the universe as it applies to our group. The other half of the day, I'm slacking off. And then on the weekend, I churn out code. ;-) For some reason, they keep promoting me. (It sometimes has an Office Space feel to it -- "You're firing Michael and Samir, and you're giving me a raise?" -- but really, I'm not that bad.)

Ok, I'm not *quite* that slacked all the time. But when I'm coasting, it's not too different. It balances the occasional mania-induced 14hr days and code-a-thons. I much prefer the work-in-bursts sprinting to sustained drudgery. It keeps it more interesting in the long run. And I am more likely to maintain a healthy reserve of slack.

--Joe

Re:Straight from the source on Intel Must Pay $150M for Patent Infringement · 2002-10-14 00:02 · Score: 1

I was arguing that Intergraph's patent was a bit on the "obvious side". As far as hardware is concerned that is.

I tend to agree.

But the problem is a lot harder at compile time since you can't restrict your analysis to comparing addresses.

Scheduling instructions isn't horribly difficult until you either get to a memory reference you can't disambiguate or a non-looping branch. Thank goodness for me, most of the code I write is "computational code", which is largely "compute intensive loops." In a compiler context, branchy code that cannot be converted to predication requires aggressive techniques such as trace scheduling or treegion scheduling. Bleah... to do that right needs a path profile and lots of luck.

--Joe

Re:Why? on Revolutionizing x86 CPU Performance · 2002-10-12 04:10 · Score: 1

I said:

The Pentium 4 has 128 rename registers anyway, so it seems like adding more 'architectural registers' is more an opcode formality than anything else.

Chris replied:

Not at all! Physical registers are no replacement for architectural. Physical registers are essentially the size of your window. They don't stop you from having to store values to memory because you don't have enough architectural registers.

I think you may have misunderstood me. What I was saying is that you could add architectural registers without necessarily adding any physical registers. It seems like the additional hardware to support new architectural registers should be minimal, and largely limited to decoding the new opcodes that specify them. The Pentium 4's rename register file is 128 registers. Instead of having only 8 GPR names that get mapped onto that set, why not have 16 or 32?
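To make the "more names, same pool" point concrete, here is a toy rename table in Python. It is only a sketch under simplifying assumptions: a single free list, no checkpointing for branch recovery, and nothing specific to how the Pentium 4 really manages its rename file. `RenameTable` and its methods are invented for this illustration.

```python
class RenameTable:
    """Toy register alias table: architectural names -> physical registers."""

    def __init__(self, num_arch, num_phys=128):
        if num_arch > num_phys:
            raise ValueError("need at least one physical register per name")
        # Each architectural name starts out mapped to its own physical reg.
        self.mapping = list(range(num_arch))
        self.free = list(range(num_arch, num_phys))

    def read(self, arch):
        """Source operand: look up the current physical register."""
        return self.mapping[arch]

    def write(self, arch):
        """Destination operand: allocate a fresh physical register."""
        if not self.free:
            raise RuntimeError("out of physical registers; issue would stall")
        old = self.mapping[arch]
        self.mapping[arch] = self.free.pop(0)
        return self.mapping[arch], old  # 'old' is freed when the write retires

    def retire(self, old_phys):
        """Return a superseded physical register to the free list."""
        self.free.append(old_phys)


if __name__ == "__main__":
    for names in (8, 16, 32):
        table = RenameTable(names)
        print(names, "names,", len(table.free), "free physical registers")
```

Going from 8 to 32 names only changes the length of `mapping`; the physical pool is untouched, which is the sense in which the extra architectural registers are mostly an encoding question.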

By the way, is it just me, or does anyone else think that Hammer's 16 register extension is shooting behind the duck? Other high-end RISCs have 32 to 64 registers. The machine I program has 64 and could make use of more in some cases. Perhaps because x86 is fundamentally a memory-operand instruction set, it can get by with fewer registers more easily? RISC-like instruction sets, with their load/store architecture, do end up needing a few more registers for values that are loaded and used immediately.

--Joe

Re:Why? on Revolutionizing x86 CPU Performance · 2002-10-11 18:32 · Score: 1

And what about conditional branches nearby? You don't know until the instruction commits what the register names will be. Imagine code which simply conditionally branches around a remap instruction. How do you handle that sanely?

I personally think the remap idea is insane. You're essentially adding register-file mode bits, and mode bits are just ugly in too many ways to describe. Just add more architectural registers already! The Pentium 4 has 128 rename registers anyway, so it seems like adding more 'architectural registers' is more an opcode formality than anything else.

--Joe

Re:Straight from the source on Intel Must Pay $150M for Patent Infringement · 2002-10-11 16:30 · Score: 1

The key to Intergraph's claims is the concept of a "parallel bit" in the opcode which says the hardware may (or may not) decide to parallelize this instruction with the one that is next to it. They also show a functional implementation of how an instruction cache + fetch pipeline can use this information.

Their idea is similar to the 'p-bit' idea on TI's TMS320C6000, except that the hardware isn't required to honor the bit. Thus, packets can be arbitrarily long, but instructions within a parallel packet may not have dependencies. On the C6000, instructions within a parallel packet may not necessarily be safely serializable (one instr in the packet may destroy the operand of a later instr in the packet if issued serially), and are bounded to a maximum length defined by the machine's functional units.
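As a sketch of how a parallel bit carves an instruction stream into packets, here is a small Python illustration. It assumes the convention that a set bit on one instruction means the next instruction may issue alongside it, and it ignores packet-length limits and dependence checks entirely. The `(mnemonic, p_bit)` tuples and the function name are made up for the example.

```python
def split_into_packets(instructions):
    """Group (mnemonic, p_bit) pairs into parallel packets.

    p_bit == 1 means "the next instruction may issue alongside this one";
    p_bit == 0 ends the current packet.
    """
    packets, current = [], []
    for mnemonic, p_bit in instructions:
        current.append(mnemonic)
        if not p_bit:
            packets.append(current)
            current = []
    if current:  # stream ended while the last p-bit was still set
        packets.append(current)
    return packets


if __name__ == "__main__":
    stream = [("ADD", 1), ("MPY", 1), ("LDW", 0),
              ("SUB", 0),
              ("B", 1), ("NOP", 0)]
    for packet in split_into_packets(stream):
        print(" || ".join(packet))
```

In the optional-hint version described above, hardware would be free to issue any of these packets one instruction at a time, which is why the instructions inside a packet must be independent.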

EPIC's bundle system looks like it's taken from Clipper's parallel bit system. EPIC's difference relative to PIC is that they compress the 'parallel bits' and unit information into fixed templates.

(As an aside, IBM's DAISY Tree VLIW is similar to Intergraph's scheme, in that a parallel packet of instructions cannot have any interdependencies that prevent it from being issued serially. IBM's twist is that the parallel packet represents a decision-tree structure, and so they'll execute down as many paths of the tree as possible, pruning branches as they come up dead. A narrower machine will execute fewer 'dead branches' and will end up being more efficient in terms of fewer speculated instructions. A wider machine can speculate down many paths and so get some gains on otherwise serialized code. Google turns up lots of nice DAISY hits. I can't find IBM's original pages... I think they've been down for a while.)

--Joe

Minor brain fart. on Revolutionizing x86 CPU Performance · 2002-10-11 14:08 · Score: 1

I said:

The cache line holding top-of-stack is in exclusive state

Actually, it would have to be in either exclusive or modified states. If it couldn't be in the Modified state, then how would you use these regs?

--Joe

Re:Why? on Revolutionizing x86 CPU Performance · 2002-10-11 14:03 · Score: 1

You'd have to tie it to ESP, not EBP, since GCC and other compilers will (with appropriate flags) use EBP as a general-purpose register. (Consider GCC's -fomit-frame-pointer.) And I'm perfectly aware that accessing the stack frame happens with instrs other than push or pop. Indeed, I'd assumed that this "top of stack looks like rename regs" idea applied only to memory references of the form [ESP + constant_offset] as either src or dst to another instruction. (And for simplicity, limit it to 4-byte-aligned offsets and 4-byte wide accesses.)
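A minimal sketch of that eligibility test, in Python. The 16-word window is the figure floated elsewhere in this thread and is only an assumption here; `shadow_slot` and the register-name strings are invented for illustration, and real hardware would make this decision in the decoder rather than in software.

```python
SHADOW_WORDS = 16  # assumed size of the shadowed top-of-stack window
WORD_BYTES = 4


def shadow_slot(base_reg, offset, width):
    """Return the shadow-register index for a memory operand of the form
    [base_reg + offset], or None if the access must go to memory as usual."""
    if base_reg != "ESP":
        return None  # EBP may be a general-purpose register
    if width != WORD_BYTES or offset % WORD_BYTES != 0:
        return None  # only aligned, 4-byte-wide accesses qualify
    if not 0 <= offset < SHADOW_WORDS * WORD_BYTES:
        return None  # outside the shadowed window
    return offset // WORD_BYTES


if __name__ == "__main__":
    print(shadow_slot("ESP", 8, 4))   # aligned word inside the window
    print(shadow_slot("EBP", 8, 4))   # wrong base register
    print(shadow_slot("ESP", 6, 4))   # misaligned
```

Anything that returns `None` simply stays an ordinary memory access.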

The shadowing would have to work like a write-through cache, and you *do* run into some sharing issues in a multiprocessor setup. In order to make refs to the top-of-stack eligible for rename aliases, you would need the following conditions:

The cache line holding top-of-stack is in exclusive state, not shared or invalid. (I think the first 'pushes' would make this line become 'exclusive' fairly quickly, since the caches are write-allocate.)
No push/pop instructions in progress.
All push/pop instructions and non-32-bit alignment/width accesses deferred until shadow writethrus occur. (Honor memory dependences relative to the stack-relative-accesses-turned-rename-registers.)
Any change to ESP invalidates the rename registers.

You still have some issues if you generate ESP-relative addresses into other registers. (For example, generating a pointer to a local value on the stack.) EBP-relative accesses could often overlap ESP-relative accesses if a program uses EBP and ESP for accessing the allocated stack frame. We already need hardware for resolving these memory dependences. Since accesses via these alternate paths are essentially *forced* to go to memory, it's not a big deal. We just need to remember to make them dependent on the writethrus that our top-of-stack shadow provides.

If you think about it, compilers nowadays tend to migrate their stack frame allocation to the top of the function and the stack frame release to the bottom of the function. All spills are converted to ESP or EBP relative addressing, not push/pop. This allows arbitrary access to spills. Thus, the current compilation model already matches well to this rename idea.

I could blather on with more ideas (there's one particularly neat one that I'd like to share), but I think I'd be violating my company's IP to talk about it here. *sigh*

BTW, the original article's content (the software-controlled register-rename-on-steroids-and-acid idea) seems to me pretty typical of a programmer's perspective of the hardware that ignores hardware realities. It essentially ignores the fact that the effect of one instruction on later instructions might be on a pipeline stage other than the execute stages, so there's a pipeline bubble that develops between two such instructions. Register names are resolved for dependence generation many pipeline stages ahead of the execute stage, so you have a gigantic barrier generating effect between anything that changes the naming and the stuff that uses the names. Basically, all name changes will happen in the execute stages, but anything that relies on the naming will be stalled in the earlier dependence tracking stages.

(For those who want a concrete example of "result of instruction A affects a different pipeline stage of instruction B", read up on AGI stalls -- Address Generation Interlock stalls. These occurred on 486s and Pentium I's. On these machines, instructions generated memory addresses for accesses one pipeline stage before the instruction itself executed. So, if you issued, say, "MOV EAX, value" followed by "OP reg, [EAX + offset]", you'd take an AGI stall, because the EAX value would get updated about the same time the second op needed to use it for address generation. Later CPUs hide it better by scheduling out-of-order. This page gives a reasonable explanation of AGI stalls. Google turns up a lot of useful links. This concept is easy to explain w/ a pipeline diagram, but alas, Slashdot would probably eat such a diagram.)
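In lieu of a pipeline diagram, here is a small Python sketch that counts AGI stalls in an instruction list. It assumes a simplified single-issue, in-order model in which only the immediately preceding instruction can cause the interlock; dual-issue pairing on the Pentium is ignored. The tuple format and the function name are invented for the example.

```python
def count_agi_stalls(instructions):
    """Count address-generation interlocks in program order.

    Each instruction is (dest_reg, address_regs): the register it writes
    (or None) and the registers it needs for address generation.
    """
    stalls = 0
    prev_dest = None
    for dest, address_regs in instructions:
        # The address is formed one stage before execute, so a register
        # written by the previous instruction is not ready in time.
        if prev_dest is not None and prev_dest in address_regs:
            stalls += 1
        prev_dest = dest
    return stalls


if __name__ == "__main__":
    back_to_back = [
        ("EAX", ()),         # MOV EAX, value
        ("EBX", ("EAX",)),   # OP  EBX, [EAX + offset]
    ]
    spaced_out = [
        ("EAX", ()),         # MOV EAX, value
        ("ECX", ()),         # unrelated instruction fills the slot
        ("EBX", ("EAX",)),   # OP  EBX, [EAX + offset]
    ]
    print(count_agi_stalls(back_to_back), count_agi_stalls(spaced_out))
```

Putting an independent instruction between the two is the classic hand-scheduling fix, and it is the reordering an out-of-order core finds on its own.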

One nice thing about the 'convert ESP-relative accesses to rename-register accesses' idea is that if ever you don't know if it's safe to use the rename-reg aliases, you can always leave these as memory accesses, and it "just works." So, you can eliminate that dependence-ambiguity stall. Just issue the instruction as-is, rather than retargeting it to read/write a rename reg.

--Joe

Stack machines? Ack! Ptooey! on Revolutionizing x86 CPU Performance · 2002-10-11 09:02 · Score: 1

Stack machines are efficient in terms of economy of opcodes and economy of specification. They're inherently serial beasts, though, unless you want to work extra hard and "registerize" the stack.

Registerizing the stack is basically register renaming that has to take into account that every instruction might rename the entire register set.

For a non-performance critical embedded system with tight power constraints, it might be a good match. For top-speed computational performance, you just don't get the parallelism out of a stack machine. At least, I can't see how without jumping through a lot of hoops.

Of course, I might be biased. The CPU I program lets you issue 8 instructions per cycle and has 64 32-bit registers. It can read about 30 registers and write 18 registers every cycle. I just can't imagine trying to write the highly parallel code I write on a stack machine!

--Joe

Re:Why? on Revolutionizing x86 CPU Performance · 2002-10-11 08:42 · Score: 1

The fix there would be to shadow the few words at top-of-stack in the rename registers. Tell compiler writers that, say, the 16 32-bit words at the top of stack are eligible for hardware register shadowing, and are as cheap as registers. In fact, give them names and a hacked assembler, so that you can read and write things that look like register names. You don't even add any instructions. Voila!

I'd say that's MUCH cheaper than this "register windows on steroids and acid" technique.

--Joe

Re:Cache is the key on Revolutionizing x86 CPU Performance · 2002-10-11 08:35 · Score: 2, Informative

It wasn't die area so much as clock rate. At smaller and smaller geometries, the transit time for a bit starts going up at some point due to transmission line effects. RC delay goes up since R goes up (your wire got smaller) and C goes up (you got closer to the other wires).

--Joe

Re:Ah yes, discrete math. on Math Toolkit for Real-Time Programming · 2002-10-10 10:51 · Score: 1

There is still some heavy math in engineering applications that arise in embedded systems. Linear algebra factors very strongly into 3G cellular standards. Stuff like beamforming, etc. So, if you want to get into those kinds of spaces (and still program), get yourself some signal processing background. :-)

In general programming, yes, abstract algebra doesn't come up a whole lot, nor does calculus. It's usually the problem domains themselves, not the programming, which will require these if they're used. If you're in a problem space that doesn't need calculus, you most likely won't need calculus to write your programs in that problem space.

It comes down to the fact that programming is just a tool for solving particular problems. The skills required to write a program to solve your problem are just a small superset of the skills required to solve the problem directly. The difference of these two sets is the minimal skills required to program. That set can actually be pretty small (that is, until you start needing programs of increasing sophistication).

BTW, if you ever decide to write a video game, you'll be surprised how much mathematics does come up. Anything with realistic or semi-realistic physics in it will need buttloads of at least the basics. If you can't remember the relationships between inertia and momentum, or acceleration, velocity and position (hint: integrals and derivatives, boys and girls), then you're hosed. And then there's systems with feedback in them -- discrete differential equations anyone? Of course you can cheese it, but it's good to know what the right answer is so you know how safely you can get away with your short-cuts.
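To show the integrals-and-derivatives hint in code, here is a bare-bones Python sketch of the usual game-loop trick: step velocity from acceleration, then position from velocity. It uses semi-implicit Euler with a fixed time step, which is one of the cheesy short-cuts rather than the right answer. The function name and parameters are made up for illustration.

```python
def simulate(accel, dt, steps, position=0.0, velocity=0.0):
    """Integrate constant acceleration with semi-implicit Euler steps."""
    for _ in range(steps):
        velocity += accel * dt      # v is the running integral of a
        position += velocity * dt   # x is the running integral of v
    return position, velocity


if __name__ == "__main__":
    # One second of free fall at 1 ms per step.
    pos, vel = simulate(accel=-9.81, dt=0.001, steps=1000)
    exact = 0.5 * -9.81 * 1.0 ** 2  # closed form: x = a * t^2 / 2
    print(pos, vel, exact)
```

Knowing the closed form is what tells you how far the short-cut drifts; here the stepped position lands within about half a centimetre of the exact value after one second, and the gap grows with larger time steps.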

--Joe

Re:Follow-up on Math Toolkit for Real-Time Programming · 2002-10-10 09:51 · Score: 1

BTW, I think you can factor out the "69.1*" out of all those equations. Basically, the sqrt(power(69.1*x, 2) + power(69.1*y, 2)) is the same as 69.1*sqrt(power(x, 2) + power(y, 2)). And since you're sorting, only relative magnitude matters, so you can just omit the 69.1.
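A quick Python check of that claim, using made-up coordinate offsets: the scaled and unscaled distances put the points in the same order.

```python
import math


def scaled(dx, dy):
    """Distance with the 69.1 factor applied to each term."""
    return math.sqrt((69.1 * dx) ** 2 + (69.1 * dy) ** 2)


def unscaled(dx, dy):
    """Same distance with the constant factored out and dropped."""
    return math.sqrt(dx ** 2 + dy ** 2)


# Hypothetical (dx, dy) offsets in degrees, purely for illustration.
OFFSETS = [(0.5, 0.2), (0.1, 0.1), (1.0, 0.0), (0.3, 0.9)]

if __name__ == "__main__":
    by_scaled = sorted(OFFSETS, key=lambda p: scaled(*p))
    by_unscaled = sorted(OFFSETS, key=lambda p: unscaled(*p))
    print(by_scaled == by_unscaled)
```

For sorting alone you could go one step further and drop the square root as well, since it is monotonic too.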

This equation seems to be based around Euclidean distance: sqrt(dx*dx + dy*dy), where dx = x1-x2 and dy = y1-y2. Note, however, that they're finding the distance in terms of latitude and longitude. By using latitude and longitude, they're actually measuring distance along the surface of the sphere. Also, if you notice, the longitude is being adjusted by the cosine of the latitude. The net effect of this is to adjust for the fact that longitude lines are closer together near the poles and farther apart near the equator.
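For reference, here is a Python sketch of this kind of flat approximation. It is not the equation from the parent post: it scales the longitude difference by the cosine of the mean latitude, which is one common choice, and it keeps the 69.1 miles-per-degree constant. The function name is invented for the example.

```python
import math

MILES_PER_DEGREE = 69.1


def flat_distance_miles(lat1, lon1, lat2, lon2):
    """Euclidean distance on a locally flat patch of the globe."""
    mean_lat = math.radians((lat1 + lat2) / 2.0)
    # A degree of longitude shrinks with the cosine of the latitude.
    dx = (lon1 - lon2) * math.cos(mean_lat)
    dy = lat1 - lat2
    return MILES_PER_DEGREE * math.sqrt(dx * dx + dy * dy)


if __name__ == "__main__":
    print(flat_distance_miles(0.0, 0.0, 0.0, 1.0))    # one degree at the equator
    print(flat_distance_miles(60.0, 0.0, 60.0, 1.0))  # one degree at 60 N
```

One degree of longitude at 60 degrees north comes out at half the equatorial figure, which is the whole point of the cosine term.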

So, I'd say this equation already is doing a passable job of taking into account the curvature of the earth, at least for short paths. One thing that does seem kinda bogus is using only cos(zipcode.lat). Seems like you should use (zipcode.lng*cos(zipcode.lat/57.3) - $long1*cos($lat1/57.3)). It's still not a great approximation, but it seems a bit more balanced to me.

If you really want to take into account the curvature of the earth correctly, you should do a search on "Great Circle distance." I did one just now, and at least this page has some JavaScript that shows how it's done.
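For anyone who would rather not dig through JavaScript, here is a Python sketch of one standard way to compute it, the haversine formula. The mean Earth radius of 3958.8 miles is an assumed constant, and the Earth is treated as a perfect sphere. The function name is made up for the example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean radius of a spherical Earth


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # One degree of longitude along the equator.
    print(great_circle_miles(0.0, 0.0, 0.0, 1.0))
    # Halfway around the equator.
    print(great_circle_miles(0.0, 0.0, 0.0, 180.0))
```

For nearby points this agrees closely with the flat approximation; the two only drift apart over long paths or near the poles.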

--Joe

Ah yes, discrete math. on Math Toolkit for Real-Time Programming · 2002-10-10 09:09 · Score: 2, Informative

What's sad is that discrete math isn't really taught in public school. (At least, it wasn't when I was in school.) One day, I found a Discrete Math textbook at the local library in the 'For sale, $0.25' bin. I opened it up and thought "Oh my goodness, this is a programming and algorithms book!" To my mind, 'math' had always meant either calculation (symbolic or otherwise, your typical Algebra and Calculus), or geometry and proofs. While geometric proofs may border on discrete math, they really seem different to me. They're not algorithms.

Discrete Math branches into useful concepts such as graph theory (you couldn't do network routing successfully without it!), some of the basics of sorting, and so on. Basically, it was the math of "machines" -- that branch of mathematics which concerns itself with stepwise algorithms. Dijkstra's algorithm (least cost path through a weighted graph), Prim's and Kruskal's algorithms (minimum cost spanning trees) were all in there. I thought the book was great.
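As an example of how directly that material turns into code, here is a short Python sketch of Dijkstra's algorithm using a binary heap. It assumes non-negative edge weights and a graph given as an adjacency dictionary; the sample graph and names are made up for illustration.

```python
import heapq


def dijkstra(graph, source):
    """Least-cost distance from source to every reachable node.

    graph maps each node to a list of (neighbor, weight) pairs, with
    non-negative weights.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a cheaper path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


if __name__ == "__main__":
    network = {
        "A": [("B", 1), ("C", 4)],
        "B": [("C", 2), ("D", 5)],
        "C": [("D", 1)],
        "D": [],
    }
    print(dijkstra(network, "A"))
```

The cheapest route from A to D here goes through B and C rather than taking either direct-looking edge, which is the sort of thing a routing protocol has to discover.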

And, of course, not a single line of code in it. (At least, not in any computer programming language.) But I still thought of it as a programming book.

--Joe


User: Mr+Z

Comments · 3,254