Sun Wins Top Tech Innovation Award
Carl Bialik from WSJ writes "Sun's DTrace trouble-shooting software won top prize in the Wall Street Journal's 2006 Technology Innovation Awards competition. It's the second time in three years that Sun took the top award. From the article, which also names a dozen other winners: 'Where most debugging takes place as software is being developed, DTrace analyzes problems with systems that are in production — running a company's database, say, or executing stock trades. It does this with a process called "dynamic tracing," which enables a developer or systems administrator to run diagnostic tests on a system without causing it to crash. Before DTrace, such tests often took days or weeks to reproduce the problem and identify the cause. With DTrace, performance problems can be tracked to their underlying causes in hours, even minutes.'"
They are pretty much completely unrelated. I think you could get dtrace to do what strace does, but strace is a special-purpose tool of very limited scope. If you think they are comparable then you don't know anything about dtrace.
Yea, this is the Wall Street Journal. It's like that old joke about Hollywood Squares: "According to Redbook, what is Plank's constant?" Not really an authoritative source on technical innovation.
DTrace has a degree of OS integration that makes it non-trivial to copy. Linux's alternatives don't even come close, even though a tool like this would be very useful in Linux.
For the foreseeable future, if you want to have this type of debugging on your server then the server has to run Solaris. And if your server is bigger than a 4-way then it makes sense that it's a Sun server.
There is value in premium gear, and while it won't make Sun the next Dell, it can hopefully help improve their standing in their core market.
And this is /. where folk think strace == dtrace
With strace can you trace everything from I/O operations through to system calls to monitor your live application without taking anything offline and get almost no performance hit?
Like it or not, dtrace is a huge innovation - it's also open sourced and coming really soon to an operating system near you. I think anyone involved in major application deployments is going to welcome dtrace and think it worthy of the award.
Sun's DTrace trouble-shooting software won top prize in the Wall Street Journal's 2006 Technology Innovation Awards competition. It's the second time in three years that Sun took the top award.
Sounds like they've put those HP founders to work, instead of just parading them around in t-shirts.
The theory of relativity doesn't work right in Arkansas.
You know who's really not an authoritative source on technical innovation?
People who compare strace to dtrace. In this case, the Wall Street Journal knows a hell of a lot more than both you and the grand parent poster.
"A Lisp programmer knows the value of everything, but the cost of nothing." - Alan Perlis
However, inline analyzers have existed. Intel's VTune is clunky, limited in supported architectures but useful where it applies. Parallel developers might well use DAKOTA and KOJAK to do the same for MPI applications, which traditional analyzers can't handle at all. I also would not advise anyone to just use analyzers. You would be wise to monitor events - there are patches for Linux, such as evlog, which give you very flexible event logging. Linux also provides the ability to monitor all kinds of other statistics - either as standard or through patches such as Web100 (for the network) or LTT-ng (for profiling).
Does this mean I think Sun don't deserve the award? I've not used that tool, so I'm not in a position to say. It would have to do a lot in addition to basic analysis to earn the right to be innovative, never mind the title of "top technical innovation". If it can, that's great and can Sun kindly port it to Linux. If it can't, then all I can say is that the competition must've sucked this year.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
After all, it takes a considerable amount of insight to pick a code analyzer (admittedly one as brilliant as dtrace) as important and newsworthy. Good job, guys! It shows you can look deeply at a topic and understand what makes computer systems valuable. A lesser effort would award something from Microsoft, Google or Apple, whose products are great, but lack the sophistication of many Sun innovations.
There exists no way of exchanging information without making judgments. --Bene Gesserit Axiom
strace is more like Solaris's truss, except truss is quite a lot better. IMO dtrace is for more serious debugging; tools like truss & strace are quick and dirty tools for easy-to-solve problems where just knowing the system calls and their return values is enough to diagnose the issue.
Several people have mentioned strace, but I have yet to see anyone mention oprofile. I haven't used dtrace before, but oprofile allows you to see where an application is spending its time transparently, with negligible performance hit, and without restarting the application.
oprofile has been around since late 2002 it seems, so it's not particularly new either. How does dtrace compare to oprofile?
Game! - Where the stick is mightier than the sword!
A tech award from them would surely be an insult to any true geek!
Yes, much like personal hygiene or the concept of heterosexuality.
I noted in my article Boxing in the LLRing, which despite positive responses Slashdot rejected in favor of Roland Piquepaille's daily column and various political commentary, that Squeak has an amazing debugger (I am not going to call it a full-blown analyzer) that allows you to debug applications as they are running on the very interesting Seaside application server.
As described in this paper (pdf), Seaside provides multiple control flows and a high level of abstraction that is very useful to web app developers.
The 4500 word article is coverage of a 300 developer "Lightweight Languages" all-day seminar held in a real boxing ring in Tokyo, covering 30 languages and frameworks including Perl, Python, Ruby, Haskell, OCaml, Squeak, and many others.
But a few points.
1) You need to boot bsd specially into a dtrace mode to use this. That presumably means that the BSD version either slows the system or isn't of production quality. When my database server is dying under the load, rebooting it isn't high on the list of things I want to do.
2) FreeBSD are pretty nimble at developing this kind of thing. I'm more curious to see how long it takes MS or Dell to have something comparable.
3) Sun provided the source and a development machine; presumably because of FreeBSD's favorable licensing. I'm not sure that's an option for any closed source product.
With strace can you trace everything from I/O operations through to system calls to monitor your live application without taking anything offline and get almost no performance hit?
With strace you can get system calls without taking anything offline (you can attach to a running process); I don't know about any performance hit (the man page appears to say there is some). I/O operations are usually system calls, so they're covered.
and maybe after it is ported to linux/*bsd and ten years have gone by, admins will actually start using it to its full potential. Now, if someone were to code a nice gui frontend to dtrace, that'd be innovation, because it would take an absolute master of UI design to turn using dtrace into something that was easy-to-do for the uninitiated.
How we know is more important than what we know.
Admins are not necessarily coders, are they? You have to be a fairly savvy developer to be able to use DTrace.
You ultimately need to fix the code and need someone to modify it.
Having said that...I am sure it will get easier to use in the future. I for one welcome all the help I can get. Admins included!!
Dynamic instrumentation (you know -- the "D" in DTrace's name) has been in use on the live air traffic control systems of several countries (http://www.ocsystems.com/cs_memoryleak.html, http://www.ocsystems.com/cs_injectingfaults.html) for more than a decade.
Your worry about bugs in the dynamic instrumentation tool affecting the production system is no different from worrying about bugs in the operating system affecting the production system, and it is addressed the same way -- by seriously thorough testing.
strace is a tool very similar to truss or sotruss on Solaris. These tools can be used to watch an application as it runs, monitoring system calls, but they are somewhat archaic.
dtrace is a monitoring tool that provides access to the entire system without worrying about causing any damage to a running system. I can choose what to monitor, from everything on the system down to watching an individual thread execute. I can even access every public variable in the kernel, and it is easy to use.
DTrace provides much of the functionality of adb, mdb, truss, tnfextract, tnfdump, and lockstat, plus much, much more.
It was released about 4/25, but doesn't show up when you look for dtrace - it works great in Linux/UNIX environments for tracing errors through different packages / libraries.
great job theif!
-Iridium
"During times of universal deceit, telling the truth becomes a revolutionary act" -- George Orwell
If Sun was good at hyping their products, their stock would be trading above $5/share.
To paraphrase the old saw... That award and $2.95 ought to cover a cup of coffee - er, I mean, a cup of Java!...
This issue is a bit more complicated than you think.
Oprofile is more for profiling.
LTT helps you analyse events as they happen over time.
Dprobes is one possible source of LTT events.
http://dprobes.sourceforge.net/
http://www.opersys.com/LTT/
http://dprobes.sourceforge.net/documentation/man/
Careful with the accusations of Wall Street's credibility on the subject. The award was decided by a jury of fairly distinguished members of ye olde programming community. And just to be fair, "None of them voted on any entries in which their companies or organizations may have had an interest."
Peter Marshall: Paul, according to Redbook, what is "Plank's Constant?"
Paul Lynde: Well, if Plank were all that constant, he wouldn't be needing that Ex-Lax, would he?
(Uproarious laughter from the studio audience.)
http://www.classicsquares.com/lyndesquares.html
* * * * *
It's only when you look at an ant through a magnifying glass on a sunny day that you realise how often they burst into flames.
--Harry Hill
Total troll. DTrace is indeed an amazing innovation. Having used it a bit myself, I can tell you how immensely powerful it is. Sun has problems, but DTrace is definitely NOT one of them.
So good, in fact, that the announcement that it will be in the next release of OS X has gotten Apple a new customer.
"A language that doesn't affect the way you think about programming, is not worth knowing" - Alan Perlis
It's so reliable you never need to look for problems.
The closest linux equivalent is the Systemtap project, which is based on the kprobes low level hooking API. These aren't yet billed as ready for production systems, but they'll get there soon enough. They look quite slick, also.
That said, the WSJ award seems to me to be maybe a little overstated. While Sun fanboys will shout to the heavens (with some justification, even) that DTrace is an amazing tool with absolutely no counterpart in the linux world, the fact remains that DTrace is at best an incrementally amazing tool. System performance tuning is a hard task, requiring smart developers and lots of work. System performance tuning with DTrace is a hard task requiring smart developers and a little less work.
System performance tuning using DTrace and a typical Solaris IT wonk (a population that tends to correlate highly with the fanboys pushing DTrace the hardest) is a recipe for disaster.
If you find someone telling you that DTrace is a must have tool and indispensable to the systems developer, apply salt. But yeah, it's pretty slick.
Please check out the Chime project, which is about visualization software for DTrace. You can find more information at http://www.opensolaris.org/os/project/dtrace-chime/
For those who think that DTrace is old news, I really suggest that you download one of the OpenSolaris-based distributions
http://www.opensolaris.org/os/about/distributions/
and play around with DTrace. Yes, its CLI is aimed at the geek in all of us, but there is software like Chime and MacOS X's upcoming Xray which will help those who prefer a different sort of UI.
System performance tuning using DTrace and a typical Solaris IT wonk (a population that tends to correlate highly with the fanboys pushing DTrace the hardest) is a recipe for disaster.
Care to expound on how that could be? It sounds like the most you've done with dtrace is read some online docs about it (or worse, the wikipedia entry), said "hmph" to yourself, and quickly moved on. I'm curious as to why you feel that you must spare no effort in disparaging "Solaris IT wonks" for whatever reason. Let us talk about "Linux IT wonks" and what gets them all hot and bothered to the point of fanboyism, shall we? There certainly wouldn't be a shortage of subject matter for that discussion.
The door swings both ways, bud. Good tools are good tools and there's no harm in letting the people who like the tool also be proud of the tool. To frown on that stinks of sour grapes.
For those who, like me, had heard of dtrace but little more (is it like strace, for example?), this is a very handy article written by one of the authors in Communications of The ACM:
http://www.acmqueue.org/modules.php?name=Content&pa=showpage&pid=361&page=1
Yeah, it's 5 pages long, so those who won't RTFA are even less likely to read this, but it's a good read covering motivation, history, solution compromises and some anecdotes that could qualify for http://thedailywtf.com/
I spent a lot of money on booze, birds and fast cars. The rest I just squandered. - George Best
I think what he meant was that DTrace is a tool that most system admins would have no clue how to use. If they did start digging in code, attempting to "optimize" it, things would probably break...hard. He's not necessarily downing Solaris IT wonks, as much as he is the vast sea of IT wonks that are really bad at their job, but don't realize it. Basically, the majority of the IT industry. Since no one else (except maybe a few FreeBSD wonks) has DTrace, it's fairly safe to narrow it down to Solaris wonks, and if you would trust a typical admin with DTrace and source code, you're a braver soul than most of us.
This is one of the things I am really looking forward to in XCode 3.0. DTrace by itself is pretty amazing, but DTrace plus the UI that Apple have created for it is beyond comparison.
I am TheRaven on Soylent News
Sun definitely deserves an innovation award this year, but I would not have said it was for DTrace. DTrace is an incredibly nice tool, but I would put it well behind ZFS. ZFS is the first filesystem I have looked at in detail and liked everything I've seen. BeFS came close (I only found one thing I disagreed with in the design there), but ZFS does much, much more.
The UltraSPARC T1 is also a very nice chip, and possibly deserves this kind of thing, although I am more interested in the T2 since I tend to do a lot of FPU-intensive things.
I am TheRaven on Soylent News
DTrace does more than just tracing system calls. You can trace any function entry and exit if you desire. Many so-called providers offer hooks you can trigger on. Whatever you like to instrument in your live system, name it, script it, get results. You can query the function parameters of any function call (not just system calls, btw) and act on it.
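To make the entry/exit idea above concrete, here is a rough analogy in Python rather than DTrace's own scripting language. It is only a sketch: `sys.setprofile` is the standard Python hook for call/return events, while `probe`, `handle_request` and `trace` are made-up names for illustration. The shape (a predicate on the function name, a per-event count, and a peek at the arguments on entry) loosely mirrors what the comment describes, but it instruments one Python process, not a live kernel.

```python
import sys
from collections import Counter

calls = Counter()   # aggregation keyed by (function name, event)
seen_args = []      # arguments captured at each function entry

def probe(frame, event, arg):
    # sys.setprofile calls this hook on function call/return events.
    if event not in ("call", "return"):
        return  # ignore c_call / c_return / c_exception
    name = frame.f_code.co_name
    if name != "handle_request":  # predicate: only the function we care about
        return
    calls[(name, event)] += 1
    if event == "call":
        # At entry, the frame's locals are the function's arguments.
        seen_args.append(dict(frame.f_locals))

def handle_request(size):
    # Hypothetical application function; it knows nothing about the probe.
    return size * 2

def trace(fn, *args):
    # Switch the probe on around one call, then switch it off again.
    sys.setprofile(probe)
    try:
        return fn(*args)
    finally:
        sys.setprofile(None)

for n in (1, 2, 3):
    trace(handle_request, n)

print(dict(calls))
print(seen_args)
```

The traced function is never edited or restarted. The hook is attached and detached from outside, which is the part of the idea that carries over.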
I think what he meant was that DTrace is a tool that most system admins would have no clue how to use.
Any good sysadmin loves to be able to look under the hood. Yes, most will not go ahead and fix any code but at least they have a handle on what's wrong and have a lot more info they can hand over to tech support if needed. I sure hope that any decent Solaris admin is able to use DTrace. It is a wonderful tool.
Couldn't agree more. Of all the new stuff in Solaris 10, I find zfs far and away the most useful in the real world. Damn, I hope it gets the success it deserves and consigns ODS, VxVM, LVM etc to the nostalgia pages.
DTrace is amazing, but to get the most of it you need to understand your system at a far lower level than most sys-admins honestly do. I use it and I love it, but I barely scratch the surface of what it can do, because it can produce a level of detail that's so far over my head as to be useless. I imagine it's priceless for developers.
LOVING all the "we have strace" comments! Best /. cluelessness in ages.
press: "state of the art Ferrari wins automotive award"
slashdotters: "So what? We have that car from The Flintstones."
Sun stock has been in the toilet for so long, they have to do shit like this to try and raise it enough to get out.
Glad to see you have rightfully been modded down....
IANAL but write like a drunk one.
Geesh, why does no one link to the original USENIX paper on DTrace:
Dynamic Instrumentation of Production Systems
Quite a fascinating read, actually.
Oh, I worry about bugs throughout the infrastructure. OS bugs, compiler bugs, system library bugs, firmware bugs - all of these can turn even a 100% perfect application (were such a thing to exist) into a smouldering heap of junk. They are unpredictable and almost impossible to trace in those situations where the programmer only has the application to look at. Dynamic instrumentation is, I believe, slightly worse in that non-fatal bugs in a system call, for example, would eventually be inferred by observing that data is mangled after such a call in all places in the system. With dynamic instrumentation that uses embedded operations in the code, the same holds true. Instrumentation that runs in parallel and dips in at intervals is much more of a problem, as there is then almost certainly no correlation between anything in the code and side-effects from the instrumentation. You can eventually deduce that the errors must be external to the code (and all functions linked to it), but in any seriously large application, or if the OS is complex (or, worse, black-box), this can take a hellish long time.
Probably the worst-case scenario is where the side-effects aren't direct. The instrumentation might very occasionally add a delay that, on rare occasions, causes a time-sensitive component of the application to miss a critical deadline. The bug would then not be in the code of either program, but would be in the sequence of operations of successive time-slices. Sequencing bugs are bloody murder, because they are not programming bugs. The code can be 100% clean and still have this class of bug. (Sequencing bugs are much more general than, say, race conditions, and would typically be at a much lower level.)
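A toy model of that indirect side-effect, in Python with purely virtual time. All names and numbers here are invented for illustration and do not describe any real instrumentation tool. Each period has a fixed time budget, the task's cost varies from period to period, and a hypothetical probe adds a small delay on every Nth period. Neither the task nor the probe is wrong on its own; a deadline is missed only when the probe happens to land on a heavy period.

```python
def missed_deadlines(work_units, period, probe_cost=0, probe_every=0):
    """Return the indices of periods in which the task overran its budget.

    work_units  -- cost of the task in each period (virtual time units)
    period      -- time budget available in each period
    probe_cost  -- extra delay added when the hypothetical probe fires
    probe_every -- the probe fires on every Nth period (0 means never)
    """
    missed = []
    for i, cost in enumerate(work_units):
        if probe_every and i % probe_every == 0:
            cost += probe_cost  # the instrumentation dips in on this period
        if cost > period:
            missed.append(i)
    return missed

work = [7, 8, 9, 7, 8, 9, 7, 8, 9, 7, 8, 9]

without_probe = missed_deadlines(work, period=10)
with_probe = missed_deadlines(work, period=10, probe_cost=2, probe_every=5)

print(without_probe)  # []
print(with_probe)     # [5]
```

The probe fires on periods 0, 5 and 10, but only period 5 is heavy enough (9 + 2 > 10) to overrun. Staring at either piece of code in isolation would not reveal the problem.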
Debugging programs is extremely difficult and time-consuming to do right, because by the very fact that you are running in a debugging environment, you have changed the characteristics of the environment the program is running in (unless it is ALWAYS running in such a mode). Even disregarding all the above problems, I feel certain that the vast majority of programmers have encountered bugs that cease to exist when debugging information is added, or where the program is placed in a debugger... or, for that matter, ONLY exist when debugging information is added. The last of these is particularly nasty for those in rigid work environments. If there's some bug X that users are seeing that is obscured by bug Y that is introduced by debugging/instrumentation data, there are workplaces where fixing bug Y is not permitted as it is not a user-documented bug and so no time/money has been budgeted for fixing it. That can make fixing X really fun. You can spot projects that are likely a victim of the "no complaint, no fix" attitude - they eventually work just well enough but no better, ran way over on time and are likely to be fragile under unusual conditions.
Surprisingly, this is not a dig at the Usual Target of Slashdot Gripe. Rather, I've seen this attitude when employed within the public sector, which is notorious for producing an amazing amount of crap. Which is ironic, because the less formal and informal projects from the public sector are equal to or better than commercial projects. Sure, there are a lot of crap projects on Freshmeat, but if you look at the really good stuff, you'll see a lot comes from Government research groups, Universities and - occasionally - the US DoD, but they're all projects managed by geeks, not wannabe accountants. (How bad does a person have to be if they can only pretend to be an accountant?)
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)