tomstdenis · Slashdot Mirror

Re:Start an OSS project on Getting the Most Out of a CS Curriculum? · 2007-03-31 02:01 · Score: 1

Who says you can't treat your project properly? For the most part in my projects, I'd fix bugs in a timely fashion [usually under 24 hours] and would announce in advance release dates [and hit them].

My users appreciated that they not only have a say on how the projects advance [through suggestions, fixes, and submissions], but also that they can depend on things progressing smoothly. It also helps if you go the whole nine yards. In my case, I have a 300-page user manual, tons of examples, heavily commented code, organized nicely, etc. It's not just a scrap of code with no other support.

Keep in mind my comment is for those in school. The chances of getting a useful development job while in school are fairly slim. And even if you do, it's rarely worth the effort. You're treated as sub-junior, paid next to nothing, get no say in how the project develops, etc. I should know: I worked for a half dozen firms during my college career.

I should point out I got my current job only a couple months out of college. They were long term users of my crypto library and decided they wanted that talent on staff. I didn't have to jobhunt, or apply or anything. In fact, I was in France on other business when they contacted me. So I was kept fairly busy because of my OSS projects.

Tom

Re:CS-type degree course? on Getting the Most Out of a CS Curriculum? · 2007-03-31 01:48 · Score: 1

I leave you with a quote:

Snobby academics are not in high demand. If you want to get a job out of college you need both academics and practical experience. -- Me, right now.

That is, unless you want to work for the uni. I get that on the whole CS isn't about programming, but I disagree that you shouldn't have it there. I mean, you take English writing and other courses during a CS degree. What does English grammar have to do with sorting or searching? If you think you can get a job related to the creation of software without knowing how to program, you live in a fantasy world.

I should point out you get your hands dirty in a lot of other degrees. In medicine you tear into cadavers, in geology you perform surveys, in chemistry you perform experiments in a lab, etc. Academia isn't just about what you can learn from a book.

And for the record, my college program was called "Computer Science and Engineering." I suspect many other colleges have similarly long-winded titles. So you're not likely to be in a pure CS degree anyways.

Tom

Re:Start an OSS project on Getting the Most Out of a CS Curriculum? · 2007-03-31 01:03 · Score: 1

Perhaps. The benefit of an OSS project is that it's a very public thing and it's easy to control all aspects of it (e.g. design something that really shows off your smarts).

In my case I stuck my name in my project name ... LibTomCrypt. So it's very handy as both advertising and as a catchy name :-)

Tom

Re:CS-type degree course? on Getting the Most Out of a CS Curriculum? · 2007-03-31 00:52 · Score: 2, Insightful

I disagree with the notion that a CS degree should have no programming aspects to it.

Yes, I agree the fundamentals (algorithms, data structures, numerical analysis, compiler theory, graph theory, calculus, etc) are more important than say a class on C... But at the end of the day, if you can't develop software you're not a very useful computer scientist (aside from working at the uni).

I do agree, however, that many CS degrees tend to focus too much on programming languages/tools (things that competent developers really ought to learn on their own anyways).

Tom

Start an OSS project on Getting the Most Out of a CS Curriculum? · 2007-03-31 00:48 · Score: 2, Interesting

Find a problem that isn't adequately solved and do it. Nothing says serious developer more than someone who can develop, maintain, document, and support a useful OSS project. Bonus points: Your future employer may be a user (worked out for me :-)).

People who coast through uni without really taking the initiative are a dime a dozen. If you want to stand out you have to get yourself organized and build a portfolio of public projects that demonstrate you're a competent fellow.

Tom

Re:It All Depends on Context on Static Code Analysis Tools? · 2007-03-30 04:49 · Score: 1

No *one part* of SAP should be 100K+ lines long.

What would it be doing such that the code can't be refactored into manageable and verifiable libraries?

Tom

Re:Ideas on Static Code Analysis Tools? · 2007-03-30 03:15 · Score: 1

500K came from the summary.

Frankly, I'd be disappointed if any one part was larger than 100K lines of code.

Tom

Re:Ideas on Static Code Analysis Tools? · 2007-03-30 02:42 · Score: 1

I didn't say it's good, I said it's being refactored. It may or may not get better but it's a start. GCC at least has some comments in the code. Which is more than I can say for most other OSS.

Anyways, point being, you shouldn't have 500K lines in any single part of a project. It makes testing and verification impossible.

Re:Ideas on Static Code Analysis Tools? · 2007-03-30 02:35 · Score: 1

Part of IBM's problem is turnover. Many of the developers are new to DB2 and fresh out of uni. The hash template I saw was a prime example of "I found this in a textbook somewhere." It was complete overkill since it's only used to hash arrays of bytes (why a template?) and the Montgomery reduction used to perform the bucketing is not needed since the hash is invoked only upon startup/shutdown.

Whoever wrote that code obviously failed "problem statement" 101. Worse yet, the code had bugs in it and wasn't being maintained. I don't mean to pick on IBM, I'm sure this happens everywhere else. And while most of the folk there are smart, and experienced, the code I saw didn't reflect a growing concern over code size.

We can see this in OSS as well. For example, OpenOffice. Not only do they include their own copies of shared objects, but it's a mix of Java, Python, Perl, C and C++. All in one application. OpenOffice for all its virtues is a SHITTY PROGRAM that no sane, properly experienced developer would have come up with.

Re:Ideas on Static Code Analysis Tools? · 2007-03-30 02:30 · Score: 1

Chances are very good that if you have >100K lines of code, and they're not all tables, or just plain wasted white space, that you have functionality that can be broken off and re-used through a library. Do you even know what 500K lines of code is? That's a ridiculous amount of code.

If you look at things like the kernel or GCC they're already split up into mini libraries inside the host project. So yeah, all of GCC may be several million lines of code (I don't know the exact numbers) but it's not just one project.

By "project" I mean a work task, not a .dsp file. As in, there isn't one person who actively works on the *entire* GCC code base on a routine basis. Most people focus on a specific part of it. So if your part of the project is a separate block of 10K lines of code, you don't claim you're actually working on 20M lines of code.

So if this guy is actively working on a 500K line project, as in, he's actively developing parts of the entire 500K of code, chances are he needs to refactor the code and look at the design documents again. Most huge projects start off with the requisite support and grow into a final application.

For example, suppose you were writing a winamp clone from scratch. You'd start with the mp3 decoder, test that out, once it's working, package it up, then start on the output plugin, once they're working, package that up, then on the gui, etc... And if you actually look at Winamp, it's a central exe with a bunch of DLLs that do the grunt work. I seriously doubt Justin would on a daily basis be working on code from every aspect of the project. Likely, things like the MP3 DLL sat untouched for months on end [if not longer].

The point is, you'd test/verify the portions of the code as they're written. You wouldn't be looking at the entire mess of 500K lines all at once. That's just unmanageable from a verification point of view.

Tom

Re:Ideas on Static Code Analysis Tools? · 2007-03-30 02:05 · Score: 1

Why would you put mp3/flac/vorbis/etc in the same project? Why not just link them in like you're supposed to? As for mp3 codecs [and probably vorbis], most of that is unrolled DCT-like transforms and tables.

That's, I think, part of the problem: people think they have to have all of the source in one build to make a project.

A hello world program execution is the result of a kernel, shell, standard C library, etc... none of which you count as lines of code in the program.

Tom

Re:Ideas on Static Code Analysis Tools? · 2007-03-30 01:15 · Score: 1

Windows is not a single "project". It's comprised of dozens of applications, hundreds of libraries (DLLs), and hundreds of drivers.

Would you consider a Fedora Core installation a single project? No, it's the amalgamation of hundreds of independent OSS projects.

No one DLL or application should be 500k lines of code. If it is, it's either a lot of tables, or shitty code that finds new and inventive ways of doing things you don't need.

Tom

Re:Ideas on Static Code Analysis Tools? · 2007-03-30 01:06 · Score: 1

I assumed that he meant lines of C and/or C++ code.

Look at something like my LibTomCrypt. It covers a wide range of cryptographic algorithms, it's only ~48K lines of code, quite a bit of which are tables for the ciphers/hashes. There are also plenty of comments, etc. Of actual code there is probably only ~30K or so.

And in that 30K I do symmetric ciphers, hashes, prngs, MACs, RSA (with PKCS #1), ECC (DSA/DH), DSA (DSS) and a decent subset of ASN.1.

Would it be more impressive if I did all that in 100K lines?

Re:Ideas on Static Code Analysis Tools? · 2007-03-30 01:02 · Score: 1

Yes, some things take a lot of code, but more often than not the excess code is a result of new coders contributing to a project for which they don't really have a grasp of the big picture. So they re-invent the wheel or add way more than is needed to what should be a simple task.

For example, I worked on DB2 for a while. I routinely saw 3000 line files that implement such complicated things as hash lists. Then there was another 2000 line file that performs modular reduction in a dozen different ways because they didn't want to use a hash to sort their data into buckets, etc... Not saying DB2 is shite (cuz I never really used it I can't say anyways), but if DB2 were written properly and with an eye towards code size, it'd be probably 1/4th the size if not smaller.

If people bragged about the fewest lines of code with the most functionality, maybe we'd not be buying gigs of ram to run an OS ...

To me, when I hear that someone worked on a project with 10M lines of code or whatever, I'm rarely impressed. Not only because they were most likely a small player in a huge project, but also because chances are the 10M line program is 10x larger than it needs to be.

Tom

Ideas on Static Code Analysis Tools? · 2007-03-30 00:35 · Score: -1, Offtopic

1. If you have 500k lines in a single project, consider re-factoring it into separate libraries that you can divide and conquer. Also, if you have 500k lines of code, consider cleaning it up, re-factoring it, etc. Fewer lines of code is more impressive than more.

2. Google for David Wagner and David Molnar, they seem to be up on that sort of work.

Re:Von Neuman bottleneck on Intel Next-Gen CPU Has Memory Controller and GPU · 2007-03-29 23:39 · Score: 1

Well putting aside the fact that you can't really scale the 8086 design, it also had a very low performance characteristic. No cache, very slow multiplier, no FPU, etc.

Also, even if you could ramp an 8086 up by an order or two of magnitude (recall they originally ran at 4.77MHz) you'd have gate explosions to deal with the seriously lengthy critical paths. So it wouldn't stay at "5000" transistors (btw: according to wiki, the original i8086 had 29,000 transistors). But the fact remains that the design doesn't scale. I think the fastest 8086-like processor on the market tops out at 50MHz at most, which is only a single order of magnitude. Still a good 60x slower clock.

Then you take into account that there is simply no cache. Every hit to data or code is a hit to main memory. Couple that with the single-issue, non-pipelined ALU, and while 1/10000th was an exaggeration it's probably not far off.

Let's see, the 8086 could do a single 16-bit add from memory in [iirc] 3 cycles, or if you wanted to do a 64-bit add, you'd need to perform 4 loads, 4 add/store, and a single store [off the top of my head]. So that's roughly 4*3 + 4*10 + 3 = 55 cycles, whereas the core2 can do that in 1/3 of a cycle if the data is in the cache (effectively, since it can do two others at the same time).
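To make the "64-bit add on a 16-bit machine" point concrete, here is a minimal C sketch of the limb-by-limb addition a 16-bit CPU has to perform. The function names (`add64_limbs`, `add64_via_16bit`) are made up for illustration, and this only models the arithmetic, not the actual 8086 instruction sequence or its timings.

```c
#include <stdint.h>

/* Add two 64-bit values held as four 16-bit limbs (index 0 is the least
 * significant). Each step adds one pair of limbs plus the carry from the
 * previous step, which is what an ADD followed by three ADCs would do. */
static void add64_limbs(const uint16_t a[4], const uint16_t b[4], uint16_t out[4])
{
    uint32_t carry = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t sum = (uint32_t)a[i] + (uint32_t)b[i] + carry;
        out[i] = (uint16_t)sum;   /* keep the low 16 bits */
        carry = sum >> 16;        /* 0 or 1, carried into the next limb */
    }
    /* The final carry is dropped, so the result wraps modulo 2^64. */
}

/* Hypothetical wrapper: split into limbs, add, and reassemble. */
uint64_t add64_via_16bit(uint64_t x, uint64_t y)
{
    uint16_t a[4], b[4], r[4];
    uint64_t result = 0;

    for (int i = 0; i < 4; i++) {
        a[i] = (uint16_t)(x >> (16 * i));
        b[i] = (uint16_t)(y >> (16 * i));
    }
    add64_limbs(a, b, r);
    for (int i = 0; i < 4; i++) {
        result |= (uint64_t)r[i] << (16 * i);
    }
    return result;
}
```

On a 64-bit processor the whole thing collapses to a single add instruction, which is where the large per-operation gap comes from.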

That's 165x slower. So 545x slower clock * 165x slower operation == 89925x slower 64-bit add (assuming it ran at the original clock). Suppose you got your 8086 somehow running at 100MHz, that's still 26x * 165x = 4290x slower. And addition is the simplest operation, wanna try multiplication?
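For anyone who wants to check the arithmetic, here is a small C sketch of the estimate. The per-step cycle costs are the from-memory guesses in the comment, not datasheet values, and the helper names are hypothetical. The 545x and 26x clock ratios are taken as given; they are consistent with a roughly 2.6GHz core2 against 4.77MHz and 100MHz, but that clock speed is an inference, not something stated above.

```c
/* Rough 8086 cost of a 64-bit add, per the comment's estimate:
 * 4 loads at ~3 cycles, 4 add/store steps at ~10 cycles, and one
 * final store at ~3 cycles. */
long cycles_8086_add64(void)
{
    return 4 * 3 + 4 * 10 + 3;
}

/* Per-operation slowdown. The core2 is credited with 3 adds per cycle
 * (1/3 of a cycle each), so dividing by 1/3 is multiplying by 3. */
long per_op_slowdown(long old_cycles, long new_adds_per_cycle)
{
    return old_cycles * new_adds_per_cycle;
}

/* Overall slowdown is the clock ratio times the per-operation ratio. */
long total_slowdown(long clock_ratio, long op_ratio)
{
    return clock_ratio * op_ratio;
}
```

With these inputs, `total_slowdown(545, per_op_slowdown(cycles_8086_add64(), 3))` reproduces the 89925x figure, and swapping in a clock ratio of 26 gives the 4290x figure for the hypothetical 100MHz part.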

Next take into account the fact that effective address generation was not free. Something like

lea ax, [bp+si]

took extra cycles, and the EA was more restricted (couldn't do scaling, or use as many registers). Then take into account that there is no MMU or FPU. Modern processors also have more [and larger] registers, more debug support, more thermal support, etc.

Now you can start seeing why comparing a micro-risc processor to a core2 is meaningless. They're not even in the same ballpark of functionality.

If you need a low power processor an AVR32, MIPS, or ARM is a better choice. They're not only more MIPS/Watt efficient than the 8086 design but also faster as well.

Tom

Re:Two problems on Intel Next-Gen CPU Has Memory Controller and GPU · 2007-03-29 22:41 · Score: 1

I still don't see it being viable. So instead of one fast thread you'll have two slow ones that are bumping into each other all the time. It just doesn't make sense. The core2 and AMD64 don't have enough bubbles to warrant it.

Alpha may have considered it, but the fact that neither the Alpha nor the Athlon has it suggests they abandoned the idea.

Consider this benchmark. Notice how the P4 has the highest clock cycle counts across the board for all of the tests? That's because there are gaping wide holes in the pipeline where another thread could easily fit.

In the case of the core2 and AMD64, I often see an IPC approaching 2 (and higher) when doing bignum math. That's the type of scheduling you can't do when you are injecting another stream of opcodes.

Tom

Re:Von Neuman bottleneck on Intel Next-Gen CPU Has Memory Controller and GPU · 2007-03-29 22:36 · Score: 1

You couldn't clock the 8086 at "today's speeds" though; that's just the point. Let's see ... um, it was basically a non-pipelined core. So that means the clock period has to be at least as long as the shortest instruction, and that also includes the fetch because the 8086 didn't have a cache. Which means you're basically locked to the memory ADDRESS speed (DDR/2).

By comparison, a PPC can typically encrypt with AES at 460 cycles per block (forget the exact model but it has 16/16 KB of cache and a dedicated load/store unit) whereas a core2 can do that in ~260 cycles (at also 10x the clock rate). So the core2 is 20x faster than a PPC, a PPC which is probably a million or two transistors.
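The 20x figure is a rounded one. A quick C sketch of the ratio, using only the numbers quoted above (460 and 260 cycles per AES block, and a 10x clock advantage); `relative_speed` is a made-up helper name:

```c
/* Relative throughput of a fast chip over a slow one: fewer cycles per
 * block and a higher clock both multiply into the speedup. */
double relative_speed(double slow_cycles, double fast_cycles, double clock_ratio)
{
    return (slow_cycles / fast_cycles) * clock_ratio;
}
```

Calling `relative_speed(460.0, 260.0, 10.0)` gives a bit under 18x, so "20x" is in the right ballpark.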

You're telling me that a 5000 transistor 8086 can perform just the same as a million transistor PPC?

OMG!!! Let's start the revolution! Put the wonderful 8086 in everyzing! everyzing!

Tom

Re:One more thing... on Intel Next-Gen CPU Has Memory Controller and GPU · 2007-03-29 22:29 · Score: 1

No, *you* buy PCs and gripe about low performance. I buy custom middle of the road gear, put Gentoo on it, and enjoy the utility of my computer.

It's not my fault that you run some bloatware OS like Vista [or whatever] and then gripe about how things are "teh sux." Maybe if you didn't use "My first Fisher Price OS" to run your computer you might notice it's more lively?

Most low end boxes are slower because they have the lowest rung processors, usually single channel memory, low end chipsets (re: via), etc.

Tom

Re:Where's the Software? on Intel Next-Gen CPU Has Memory Controller and GPU · 2007-03-29 06:03 · Score: 1

I know hardware isn't designed at the gate level (hint: I work for a hardware design firm), but the parallelism from hardware comes directly from the fact that they're defining the platform.

As for the conditional moves and what not, GCC already supports them. But instead of extending the language in non-portable ways they just use them at the optimization stage.

For instance,

int f(int a, int b)
{
    if (a > b) return a + 3; else return b + 4;
}

Turns into:

        .globl f
        .type f, @function
f:
.LFB2:
        leal 3(%rdi), %eax
        leal 4(%rsi), %edx
        cmpl %esi, %edi
        cmovle %edx, %eax
        ret

That was on my AMD64 box with "gcc -O3". The code has no branches, and on the AMD64 it computes both additions in parallel. In fact the first three opcodes can all execute simultaneously in one cycle. This function should take only 2 cycles to compute the return value (in reality it probably won't all the time).
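To see what the compiler did, here is the same function next to a hand-written C version that mirrors the assembly: compute both candidate results first, then select one. `f_select` is a hypothetical name, and this is only a sketch of the idea. Whether a given compiler actually emits a conditional move for either form depends on the compiler version, target, and flags.

```c
/* The original function, with an explicit branch in the source. */
int f(int a, int b)
{
    if (a > b) return a + 3; else return b + 4;
}

/* Mirrors the generated code: both sums are computed unconditionally,
 * then one is selected. Assumes a + 3 and b + 4 do not overflow, since
 * unlike f() this version evaluates both of them every time. */
int f_select(int a, int b)
{
    int x = a + 3;            /* leal 3(%rdi), %eax */
    int y = b + 4;            /* leal 4(%rsi), %edx */
    return (a <= b) ? y : x;  /* cmpl + cmovle      */
}
```

The two sums do not depend on each other or on the comparison, which is why the hardware is free to execute them side by side.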

Re:Why not from the get-go? on Sony May Be Planning 80GB PS3 · 2007-03-29 05:30 · Score: 2, Insightful

first, they're not buying parts from BestBuy or CompUSA. They're getting deep discounts to buy hard drives from companies using older production lines. Hence the 8GB drives for the original xbox.

second, they can get idiots to buy the super duper upgrade piecemeal costing more.

third, I want a wii.

Re:Two problems on Intel Next-Gen CPU Has Memory Controller and GPU · 2007-03-29 05:21 · Score: 1

A set of DCT blocks would almost certainly be more universally accepted than a GPU. Think about it: those gates for the GPU consume power whether you use it or not. So if you buy a box and then install your own GPU, you're wasting power on the on-die GPU.

I think this will have a home in laptop markets, but I also think Intel isn't wise to broaden the product line.

Tom

Re:Von Neuman bottleneck on Intel Next-Gen CPU Has Memory Controller and GPU · 2007-03-29 05:18 · Score: 1

And your 5000 transistor processor would have 1/10000th the MIPS of that core 2 duo. Big deal? And not all tasks are inherently parallel. So for things that are mostly sequential you're going to lose. Not to mention managing 3000 processors takes resources too.

BTW the core 2 duo comes with 4MB of L2 and 128KB of L1. That's 207,618,048 transistors for the SRAM cells. The 2MB versions have 106,954,752 transistors in their SRAM cells. So really the core of the processor itself is probably only around 45M transistors. Or about 23M per core.
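The SRAM figures above work out exactly if you assume 6 transistors per bit (a standard 6T SRAM cell) and count data bits only. That cell type is an inference from the numbers, not something stated in the comment, and tag arrays, ECC, and decoders would add more. A minimal C sketch, with `sram_transistors` as a made-up helper:

```c
#include <stdint.h>

/* Transistors in the data array of a cache, assuming a 6-transistor
 * (6T) SRAM cell per bit and ignoring tags, ECC, and support logic. */
uint64_t sram_transistors(uint64_t cache_bytes)
{
    const uint64_t bits_per_byte = 8;
    const uint64_t transistors_per_bit = 6;  /* assumed 6T cell */
    return cache_bytes * bits_per_byte * transistors_per_bit;
}
```

Feeding it 4MB + 128KB (4,325,376 bytes) gives 207,618,048, and 2MB + 128KB (2,228,224 bytes) gives 106,954,752, matching the numbers quoted.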

SRAM [cache] is used to compensate for the fact that main memory is just plain slower. Disregarding the refresh cycles and all that. The onboard memory controller has nothing to do with DMA access. OMG go take a course on processor architectures. On die memory controllers reduce the latency between the processor and the memory since there isn't a distinct chip dealing with memory over a [relatively] long bus (in terms of multiple GHz). It's also removing a distinct pipeline step in the access to the memory.

Tom

Re:Where's the Software? on Intel Next-Gen CPU Has Memory Controller and GPU · 2007-03-29 05:12 · Score: 1

You missed my point. Verilog developers get parallelism by *defining* the hardware. If they want to add two 64-bit numbers in parallel, they put two adders down. If they want to process a DCT block while performing huffman decompression, they put two logic blocks down, etc.

In software, by definition, you have to run on something that has been well defined in advance. Designing in advance is a risk as you may miss target markets/applications which is why they go for strong overall performance first.

It isn't a language thing. Even if you coded in pure assembler to get down to the metal, you still couldn't make efficient use of a dual core processor for trivial sequences.

As you noted, SPEs are hard to program precisely because they don't behave as processors. They're hardware add-ons that the OS can deal with, but they don't behave like a native processor you can just load a task on.

Even in the SPE case they're loaded with things that run more than a few instructions before making output.

Anyways, the point is, you can't have a general, flexible processor that is also optimal for one specific task.

Tom

Re:Two problems on Intel Next-Gen CPU Has Memory Controller and GPU · 2007-03-29 03:33 · Score: 1

My point was that it isn't like you can just drop random parts into a cpu and then tape out and sell millions without problems.

The more flexibility in the configuration the more expensive verification becomes. A big part of chip design is meeting efficiency with performance. That is, keep the gate count low but the efficiency high.

I agree that an on-chip GPU would likely take less power, and be easier to integrate, if that was the only processor you made.

By starting to mix in what many consumers look at as "generic" devices into the core you run the risk of making processors that people just don't want.

It'd be like putting the OS in the processor, and having a "Windows only" x86 processor. How useful would that be overall?

Tom

User: tomstdenis

Comments · 6,870