Domain: sourceforge.net
Stories and comments across the archive that link to sourceforge.net.
Stories · 1,414
-
Google Hires Gaim's Main Developer
astrab writes "According to Dirson's blog, Google's just hired Sean Egan (the main developer of Gaim open IM client), just the same day Yahoo! and Microsoft plan to link their respective proprietary IM networks." From the post: "While Yahoo! and Microsoft link their proprietary networks for Instant Messaging, Google bets on Open Protocols to make information universally accessible ... Currently, Google uses XMPP/Jabber specs, but they claim to be supporting open server-to-server federation, and work "to hear from other people in the communications industry about how best to build a federation model that is open, scalable". In fact, there are this month several tests with firms like EarthLink, Sipphone or PeopleCall. " -
Named Innovators/Developers of Color?
i_c_andrade asks: "Apple and other tech companies were in the past called to task for the lack of Hispanics and African Americans on their boards of directors, so after doing some research I came to the conclusion that I just did not know of many named IT/OSS/Web/CS innovators/developers who were not white (or American), specifically Hispanic or African American. The first (and only) name that I could think of was Miguel de Icaza. Can I only blame my own ignorance for not knowing any more, or are there others? I know there is a big BSD movement in Brazil (they created the FreeBSD LiveCD Project), but where else are there developers 'of color' and what are they working on?" -
Bad Movies to Blame for Box Office Slump
macklin01 writes "The LA Times is reporting that box office executives are finally fessing up and taking the blame. Poor box office receipts over the summer weren't caused by surging fuel costs, changes in audience preferences, or anything else. As Slashdot readers might have put it (and as it comes out in the article), 'It's the movies, stupid.'" -
Pepping Up Windows
PhairOh writes "Tom's Hardware has an article about improving Windows with free and Open Source Software. It features everything from the obvious, like Gimp and OpenOffice, to some interesting choices like Virtuawin. From the article: 'The average Windows user tends to be less than satisfied with Windows. And that's no surprise, either, given the rather woeful state of its default applications.'" -
CA Sec. of State Panel on Open Source Elections
goombah99 writes "The Open Voting Consortium has announced that California Secretary of State Bruce McPherson is forming a panel to investigate using open source software in elections. Suggested Panel members include Security expert Bruce Perens and Python guru David Mertz who is associated with the sourceforge EVM2003 voting machine project. This is big since a favorable outcome could help fund prototypes of true open source election equipment and systems." -
Slashdot HTML 4.01 and CSS
After 8 years of my nasty, crufty, hodge podged together HTML, last night we finally switched over to clean HTML 4.01 with a full complement of CSS. While there are a handful of bugs and some lesser used functionality isn't quite done yet, the transition has gone very smoothly. You can use our sourceforge project page to submit bugs and we'd really appreciate the feedback. Thanks to Tim Vroom for putting the HTML in place, Wes Moran for writing the HTML in the first place, and Pudge for writing the code to convert 900k users, 60k stories, and 13 million comments to comply. And for the brave, download the stylesheet and start experimenting with new themes and designs for Slashdot: some sort of official contest to re-design Slashdot is coming soon, so you can get a head start now.
Response to some reader notes in the forum:
- There are a handful of validation errors. Some will be fixed in the next day or so. Others are external HTML that is out of our hands. We may never totally validate with zero errors. Yes, we're comfortable with that.
- We're not going to XHTML for the same reasons as above: we control almost all of our HTML, but some of it (like the ads, and imports from other sites) just isn't ours to muck about with. We could go to XHTML, and someday we might, but today we're happy to just get to HTML 4.01 and CSS.
- Light Mode will be back in some form or another. The problem is that light mode served two purposes: Low Bandwidth, and Simplified Design. The latter will probably be handled with a CSS theme (we have a handheld theme already). Low Bandwidth is a little trickier, but we will resolve that soon.
- All of our code is beta tested on www.slashcode.com and use.perl.org. Unfortunately there are always a few issues going from those tiny, tiny sites to the giant behemoth that is Slashdot itself.
-
Google Putting Crowd Wisdom to Work
daveperry writes "The Google Blog has a post about their use of prediction markets to forecast certain events that are relevant to their business. From the article: "Our search engine works well because it aggregates information dispersed across the web, and our internal predictive markets are based on the same principle: Googlers from across the company contribute knowledge and opinions which are aggregated into a forecast by the market. Sometimes, just feeling lucky isn't enough, and these tools can help." In related news, some software was recently open sourced that enables people to set up their own prediction markets." -
MMO-Like Quake Is Possible
An anonymous reader writes "OptimalGrid is a self-contained middleware designed for developers to create grid-enabled parallel applications without themselves becoming experts in grid or high-performance computing (article). The Linux compatible middleware now includes automatic distribution and provisioning on to Grid nodes. See how the first release of Quake II was made massively multi-player [pdf] by running on a Grid. Get modified Quake II from Sourceforge to run with OptimalGrid and let the massive Grid games begin." Update: 09/19 16:12 GMT by Z : Marked the pdf as such. -
Artist Suggesting Ways Around Copy Protection
fanboyslayer writes "Switchfoot's new album Nothing Is Sound shipped from Sony with copy protection software on the CD, much to the dismay of thousands of iPod-wielding fans. The band posted a response on their official forum apologizing for the protection and detailing ways to circumvent the protection and rip their songs to PC. Switchfoot linked to open-source program CDex's download page with instructions on disabling the autorunning protection and ripping the files to MP3. Many of Switchfoot's fans have been upset by the copy protection measures, and it's nice to know the artists seem to care about the issue." -
MethLabs Shuts out PeerGuardian
Lost&Confused writes to tell us Slyck News is reporting that most of Methlabs.org administration and development staff have been forced out of their own website. For the time being PeerGuardian is being hosted on sourceforge. However, users are advised to stop using the Methlabs.org and Blocklist.org hosted blocklists in favor of the Bluetack list until they can sort things out. -
A Useful Grammar Checker?
burtdub asks: "With the amount of raw text data available, there seems to be no shortage of ambitious language projects on the horizon, from Universal Language Translators to Junk Email Filtering. However, the mess that is the English language still seems to elude commercial attempts while being relatively ignored by the open source community. What would it take to make a useful, functional grammar checker?" -
Why Does Current Clustering Require Recoding?
AugstWest asks: "I've been doing some research into what the available clustering options are for pooling CPU resources, and it looks like most of the solutions I've found require that programs be re-written to take advantage of the cluster. Since there are virtualization apps like Bochs and VMWare, where the applications just make use of a virtual CPU as if it was a real CPU, why aren't there clustering solutions that do this as well?" -
Help Beta Test Slashdot CSS
After almost 8 years, Slashdot's HTML is finally getting an overhaul. For now the changes are almost entirely under the hood, as we migrate the current skin to CSS. Slashdot itself will migrate in the next few weeks, but for now, we'd appreciate it if people who understand CSS could take a look at Slashcode. If you use a browser that lets you select a stylesheet, you can take a look at that site with the Slashdot CSS Skin. Keep in mind that Slashcode doesn't look exactly like Slashdot, so there will be some differences between that site and the final version that will appear on Slashdot. We're mainly looking for feedback on compatibility issues and blatant bugs. You can use our SF bug tracker to submit bug reports. Thanks for your help. Once we move Slashdot, work will begin on a new look & feel. If you have ideas, you could start playing with the CSS stylesheets now! -
OpenGL Programming Guide
Martin Ecker writes "The Red Book, also known as the OpenGL Programming Guide, is back in its fifth edition. It received the name Red Book because of the nice red book cover, and possibly also because it has remained the standard introductory text on the OpenGL graphics API for years, and always referring to it as "OpenGL Programming Guide" is too long. This fifth edition now also covers new features introduced with versions 1.5 and 2.0 of the OpenGL standard. So let me take you on a tour through the pages of this book to see what it has to offer." Ecker's review continues below.
OpenGL Programming Guide (5th Ed.) - The Official Guide to Learning OpenGL, Version 2
- author: Dave Shreiner, Mason Woo, Jackie Neider, Tom Davis
- pages: 838
- publisher: Addison-Wesley Publishing
- rating: 8
- reviewer: Martin Ecker
- ISBN: 0321335732
- summary: A very complete and thorough introduction to OpenGL
I should mention that the last edition I read of the Red Book was the first edition, and a lot of material has been added to the book in the meantime. Just as the first edition, however, the fifth edition is still incredibly complete and thorough. It contains explanations of pretty much every feature of OpenGL, even the rarely used ones. You want to know about the new occlusion queries added to OpenGL recently? It's in this book. You want to know about the accumulation buffer and its uses? It's in this book. You want to know about the (mostly deprecated) use of indexed color buffers? It's in this book. The only thing the book does not cover in detail is vertex and fragment shaders because they have their own book, the Orange Book (aka The OpenGL Shading Language) -- see my previous Slashdot review.
The Red Book is aimed at the beginning to intermediate graphics programmer who is not yet familiar with OpenGL. It assumes a basic background in computer graphics theory and working knowledge of the C programming language. The book consists of 15 chapters and 9 appendices that together span approximately 800 pages.
The first chapter gives a brief introduction to the basic concepts of OpenGL and describes the rendering pipeline model used in the API. GLUT, a cross-platform library that makes it easy to create OpenGL applications, is also briefly discussed, together with a program that shows GLUT in action. The following chapters proceed to explain the basic geometric primitives, such as lines and polygons, supported by OpenGL and how to render them in different positions and from different viewpoints using the various OpenGL matrix stacks. Here the authors also discuss the basics of using colors, fixed-function lighting, framebuffer blending, and fog.
Chapter seven contains a description of display lists, a unique feature of OpenGL that allows OpenGL API calls to be stored for efficient repeated use later in a program. Chapter eight then moves on to discuss what an image is for OpenGL, which brings us straight to chapter nine on texture mapping, one of the largest chapters in the book. This chapter discusses everything you need to know about textures, from specifying texture images in uncompressed and compressed form to applying textures to primitives using the various kinds of supported texture filters. Depth textures and their application as shadow maps are also presented.
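To make the display-list idea a little more concrete, here is a rough analogy in plain Python rather than real OpenGL code. It only illustrates the record-once, replay-many-times pattern; the `CommandList` class and the `draw_vertex` function are hypothetical names invented for this sketch and are not part of OpenGL or of the book.

```python
from typing import Callable, List, Tuple


class CommandList:
    """Toy stand-in for a display list: record calls once, replay them later."""

    def __init__(self) -> None:
        self._commands: List[Tuple[Callable[..., None], tuple]] = []

    def record(self, func: Callable[..., None], *args) -> None:
        # Store the call instead of executing it immediately.
        self._commands.append((func, args))

    def replay(self) -> None:
        # Execute every stored call in the order it was recorded.
        for func, args in self._commands:
            func(*args)


drawn = []


def draw_vertex(x: float, y: float) -> None:
    # Hypothetical drawing call; a real program would issue OpenGL calls here.
    drawn.append((x, y))


# Record a triangle once...
triangle = CommandList()
for point in [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]:
    triangle.record(draw_vertex, *point)

# ...then replay it as often as needed without rebuilding the call sequence.
triangle.replay()
triangle.replay()
```

In real OpenGL the stored commands can be kept in a form the implementation finds convenient, which is where the efficiency gain comes from; this sketch does not model that part.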
In chapter ten the authors discuss the buffers that make up the framebuffer, such as the color buffer, depth buffer, and stencil buffer. This chapter summarizes some of the things already presented in the earlier chapters and then describes the various framebuffer operations in more detail. The accumulation buffer and its uses, such as motion blur and depth of field effects, are also discussed. Chapters eleven and twelve cover the tools provided by GLU, the GL utility library, in particular tessellators, quadrics, evaluators, and NURBS. GLU is nowadays rarely used in production code, so these chapters mostly demonstrate just how complete the Red Book is in its coverage of OpenGL. This also applies to chapter thirteen on selection and feedback, which are rarely used features, mostly because of the lack of hardware acceleration.
Finally, chapter fourteen is a collection of topics that didn't fit into the other chapters, such as error handling and the OpenGL extension mechanism. Additionally, this chapter presents various higher level techniques and tricks, for example how to implement a simple fade effect, how to render antialiased text, and some examples of using the stencil buffer. The final chapter of the book -- newly added in the fifth edition -- is a short introduction to the OpenGL Shading Language (GLSL, for short). Even though the OpenGL API functions required to use GLSL are presented, this is only a quick overview of how programmable shaders are used in OpenGL. For a more detailed description of GLSL the reader is referred to the Orange Book.
The book closes with quite a few appendices on the order of operations in the OpenGL rendering pipeline, the state variables that can be queried, the interaction of OpenGL with the operating system-specific windowing systems, a brief discussion of homogeneous coordinates as used in OpenGL, and some programming tips. Also a reference of the built-in GLSL variables and functions is included, which is a bit odd considering that the Red Book actually doesn't really concentrate on programmable shaders or GLSL. It's a good reference nevertheless.
The book contains a large number of images and diagrams, all of them in black and white except for 32 color plates in the middle of the book. The illustrations are of high quality and generally help make the explained concepts and techniques easier to understand. Most of the color plates depict spheres, teapots, and other simple geometric objects, so they aren't overly eye-catching but do serve their purpose of showing what can be achieved with OpenGL.
The Red Book remains the definitive guide to learning OpenGL. Whenever someone asks me "What book should I read first to learn OpenGL?" this is the book I refer them to. Apart from being a good introduction, it also contains many interesting tips and tricks that make the experienced OpenGL programmer come back to it often. If you've read through this book in its entirety you pretty much know everything there is to know about OpenGL.
Martin Ecker has been involved in real-time graphics programming for more than 9 years and works as a games developer for casual arcade games. In his rare spare time he works on a graphics-related open source project called XEngine. You can purchase OpenGL Programming Guide (5th Ed.) - The Official Guide to Learning OpenGL, Version 2 from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Nintendo DS Wireless Game Roundup
ImaNumber writes "Brittlefish has posted a nice roundup of the major multiplayer games currently available for the Nintendo DS. They make their picks on which ones have good wireless play and which ones just added it in as a gimmick." From the post: "If you have 2 Nintendo DS's or you know someone else who has one you've probably played some multiplayer games. And you want more. But which games are worth buying with the incentive of good multiplayer?" -
JBoss - A Developer's Notebook
Pankaj Kumar writes "Controversies aside, JBoss has emerged as a credible alternative to commercial J2EE App Servers for developing and deploying Java based server applications. Besides the usual advantages of open source and GPL licensing, what sets it apart is its JMX based microkernel, a light-weight framework to run independently developed Java programs within a single JVM. Together, these make it possible for one to pick and choose components and assemble a custom server anywhere between the two extremes (and beyond!) of a simple Servlet Container and a full-fledged J2EE Server. JBoss - A Developer's Notebook by Norman Richards, a JBoss developer at JBoss, Inc., and Sam Griffith, Jr., a software consultant and trainer, is a no-fluff How-To guide on doing stuff with JBoss in O'Reilly's new Developer Notebook format." Read on for Kumar's review of the book.
JBoss - A Developer's Notebook
- author: Norman Richards & Sam Griffith, Jr.
- pages: 150
- publisher: O'Reilly
- rating: 7
- reviewer: Pankaj Kumar
- ISBN: 0596100078
- summary: A How To Guide on Working With JBoss
True to the format, this book doesn't waste pages on paeans to architectural elegance, internal design or conceptual deliberations, and limits itself to the basic needs of most professionals -- how do I do this or that with JBoss, where to start, what steps to carry out or what code to write, and what happens behind the curtains.
Books dealing with J2EE products tend to be fat and bulky, but this (note)book doesn't fall in that category. By covering only JBoss specific aspects and avoiding general J2EE topics, this rather thin book has managed to include a good deal of difficult-to-find information about JBoss. In fact, while going through its pages, I got a feeling that the authors have taken care to be different and complementary to the online documentation available in the JBoss Application Server Guide and JBoss Wiki.
In support of the above claim, let me compare the coverage of how to deploy applications under JBoss, an important activity with any J2EE container, in the JBoss Guide, JBoss Wiki, and the book under review. The JBoss Guide covers application deployment as part of the JMX based microkernel architecture and design, describing, in excruciating detail, the internal components responsible for the deployment and how they interact. The JBoss Wiki takes a more externally focused approach, talking about hot deployment capability, relevant directories and configuration files in an installed system, and steps in a typical deployment process. In contrast, Developer's Notebook goes through the whole process of creating the deployable WAR file for a web application, deploying that to JBoss by copying the created file to JBoss's deploy directory, and verifying successful deployment or looking for errors. It even talks about how to modify a deployed application. Needless to say, the last one is most useful to someone who just wants to deploy his or her application.
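As a rough illustration of the copy-to-deploy step described above (this is not taken from the book), the following Python sketch copies a WAR file into a deploy directory. The `deploy_war` helper, the `hello.war` file name, and the `server/default/deploy` layout are assumptions made for the example; a temporary directory stands in for a real JBoss installation.

```python
import shutil
import tempfile
from pathlib import Path


def deploy_war(war_file: Path, deploy_dir: Path) -> Path:
    """Copy a WAR archive into the deploy directory and return the new path."""
    if war_file.suffix != ".war":
        raise ValueError(f"expected a .war file, got {war_file.name}")
    deploy_dir.mkdir(parents=True, exist_ok=True)
    target = deploy_dir / war_file.name
    shutil.copy2(war_file, target)
    return target


# Demonstration against a throwaway directory tree, not a real server.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    war = root / "build" / "hello.war"  # hypothetical build output
    war.parent.mkdir()
    war.write_bytes(b"placeholder archive contents")

    # Assumed layout of the deploy directory; adjust for an actual install.
    deploy_dir = root / "jboss" / "server" / "default" / "deploy"
    deployed = deploy_war(war, deploy_dir)
    deployed_name = deployed.name
    deployed_ok = deployed.exists()
```

Verifying the deployment, as the book goes on to do, would still mean checking the server's own output for success or errors; the sketch covers only the file copy.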
True to its lab notebook style, the book makes important, though not integral, observations about specific topics in the page margins. For example, a note in the margin of deployment steps tells you that you can include a deployment package within another deployment package, up to an arbitrary level of nesting, a la Russian doll packaging. I found this informal way of communicating relevant stuff quite effective.
Another noteworthy aspect of this book is that it makes generous use of appropriate tools, such as Ant and XDoclet, to get things done. This can be either good or bad, depending upon your familiarity with these tools. For me, it turned out to be a mixed bag. I know Ant and am happy writing Ant scripts for packaging and deployment. It is different with XDoclet, which I haven't had a chance to use so far. But perhaps the authors know better and one should just get familiar with it before working on any project involving JBoss and Enterprise Java Beans.
It is difficult, if not impossible, to cover each and every aspect of software as feature rich and complex as JBoss in any single book. This leaves the somewhat unpleasant task of choosing topics to the authors and editors, for the selection may or may not match the needs of a particular reader. At the same time, it increases the responsibility of a reviewer like me who must help a prospective buyer decide for or against making a purchase, based on her needs.
Let me attempt to do that by making two lists: first, what is included and then, what is not.
What is included (paraphrased Table of Contents):
- How to install, start, examine (through JMX Console) and shut down JBoss Server.
- How to package, deploy, observe and undeploy an application.
- How to create a web application with database access and user authentication.
- How to use MySQL as database for a JBoss application.
- How to set up a user database and login modules, and enable SSL.
- How to configure logging for various components of JBoss.
- How to map schema, objects and relations to database tables.
- How to monitor and manage a JBoss application with MBeans.
- How to create a custom JBoss with modules that your application needs.
A similar, comprehensive list of what is not included is simply not possible. Still, I have gone ahead and created the following based on my experience with JBoss. Keep in mind that these reflect the kind of applications I have worked on and may not be representative of your needs.
- How to use JBoss as a J2SE container.
- How to develop Web services with JBoss.
- How to create, package and deploy an application consisting of JBoss services, web applications and web services.
- How to troubleshoot class loading problems.
- How to isolate applications within a single JBoss server instance.
- How to profile for performance bottlenecks.
- How to run multiple instances of JBoss Server on a single machine.
I can only hope that the authors will take this as reader feedback and include some of the above in a future edition.
So, what else is there not to like about this book? One thing that caught my attention was the relative absence of insight into why things worked the way they worked: What are the underlying patterns and how can the awareness about these patterns be applied to other similar situations? These are the things I look for in a new product or technology, and have found them to be much more helpful than just a compilation of step-by-step descriptions of doing things. Perhaps the Developer's Notebook format doesn't allow for such digressions, still I think inclusion of such insights would have improved the book.
Overall, I would say that JBoss - A Developer's Notebook is a good introductory book for those who are thinking of getting started or are just getting started with JBoss. If you have already worked on JBoss and are looking for more advanced or esoteric stuff, then this book is perhaps not for you.
You can purchase JBoss - A Developer's Notebook from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
E-Mail Server Setup Advice?
dhammala asks: "I am responsible for setting up and maintaining a mail server for a small web-hosting type business. We currently host about 75 domains and around 100 mailboxes, and due to the efforts of our sales team, we want to get ready for some great increases in those numbers. I am worried about my current configuration and ease of administration. More important (well, at least to the customers) is email deliverability -- it seems that messages delivered to some big players are being marked as SPAM or disappearing altogether. I am asking the Slashdot community for its insight and advice on 1) whether my current choice of software/configuration is a good match for this situation and 2) whether there are any additional measures I might take to ensure email deliverability?" "Here is an overview of our current setup:
- We lease servers at ev1servers.net.
- The servers are running RHEL ES3.
- We chose to use Postfix and have it configured to support virtual users and domains mapped in MySQL tables. The reference I used to configure this setup is located here. We initially chose Postfix over qmail because it was open and over sendmail because the config files are actually readable.
- I have added in SQLGrey grey-listing for Postfix to provide a simple level of SPAM detection for our users. We are not wanting to deal with the customer service and higher box loads of mail scanning at this time. We might choose to use a 3rd party vendor to do this as needed.
- Messages are delivered locally via maildrop in maildir format.
- Courier IMAP is running to support both IMAP and POP access to the mailboxes.
- Postfix Admin was set up for easy mailbox administration.
- I have verified that our reverse IP records are correct
- I have created SPF records for all of the domains
- I have verified that our server is not listed in any blacklists (great scanner at dnsstuff.com)
- I have started to install DomainKeys for Postfix
I have not yet been able to get DomainKeys to work with Postfix. It was during my configuration attempts that I started to question this setup and wondered if this was the best setup for our situation. This inquiry has led to this posting.
In a perfect world, I would have an email server that:
- is easy to administer,
- supports automated mailbox setup/removal (currently I can just insert rows into my tables and the mailbox setup is done)
- supports current technologies, like grey-listing, DomainKeys, etc
- is secure
- makes the best use of system resources -- I want to get the 'best bang for the buck'
Are there any other technologies or configurations that I need to implement to support the best deliverability rates?" -
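The grey-listing mentioned in the setup above can be illustrated with a minimal Python sketch. It is not SQLGrey's code or configuration; the class name `Greylist`, the 300-second delay, and the in-memory dictionary are assumptions chosen for illustration. The idea is that the first delivery attempt for an unseen (client IP, sender, recipient) triplet is temporarily deferred, and a retry after the minimum delay is accepted, since legitimate mail servers normally retry.

```python
import time


class Greylist:
    """Toy greylist keyed on the (client IP, sender, recipient) triplet."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay   # seconds a new triplet must wait (assumed value)
        self.clock = clock           # injectable clock, so the logic is testable
        self.first_seen = {}         # triplet -> time of first delivery attempt

    def check(self, client_ip, sender, recipient):
        """Return "accept" or "defer" (a temporary 4xx-style rejection)."""
        key = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Remember the first attempt; later attempts reuse that timestamp.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return "accept"
        return "defer"
```

A real deployment would persist the triplets (SQLGrey uses a database) and expire old entries; both are omitted here.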
Lucene in Action
Simon P. Chappell writes "I don't know about you, but I hardly bother with browser bookmarks any more. I used to have so many bookmarks, back in the early days of Netscape's 4 series, that I would have to regularly trim and edit my bookmark file to prevent my browser from crashing on startup -- that's a lot of bookmarks, folks! Now, I go to my favourite web search engine, enter a couple of appropriate search terms and voila, there's my page! Search engines are so ubiquitous that we rarely give much thought to the technology that powers them. Lucene in Action by Otis Gospodnetic and Erik Hatcher, both committers on the Lucene project, goes behind the HTML and takes you on a guided tour of Lucene, one of a generation of powerful Free and Open-Source search engines now available." Read on for the rest of Chappell's review.
Lucene in Action
author: Gospodnetic and Hatcher
pages: 421 (7 pages of index)
publisher: Manning
rating: 9
reviewer: Simon P. Chappell
ISBN: 1932394281
summary: Solid introduction to Lucene
Who's it for?
Lucene is a library and framework, rather than a complete application. It truly is an engine, around which you are expected to build and extend your own application. Like Lucene, the book is targeted at those who are looking for a tool to build their own search facility application rather than just "download and go." The book does include a number of case studies of Lucene usage (including at least one download and go search engine) but those are included to show how to use and adapt Lucene to fit differing environments rather than as ends in themselves.
The Structure
The book is sensibly divided into two parts. The first part looks at "Core Lucene" functionality, while the second part addresses "Applied Lucene".
Part one has six chapters, covering the central components and inner workings of Lucene. It's here that the book starts with a tutorial introduction, familiarising the reader with the concepts of Lucene as a search engine around which you wrap your own code. The other five chapters move steadily through good search engine fare, with indexing getting the whole of chapter two to itself. The discussion of how to retrieve text from the documents being indexed is mentioned here but postponed until chapter seven, where it is dealt with exhaustively. Chapter three covers searching, and especially how Lucene ranks documents.
Chapter four examines analysis. In its chapter introduction, the book explains that "Analysis, in Lucene, is the process of converting field text into its most fundamental indexed representation, terms." This process is performed by an analyser, which tokenises text according to its own built-in rules. Each analyser has a different emphasis: some want only dictionary words, others might explicitly include acronyms, and sometimes you'll want an analyser that blocks stop words (those words in a language that are part of its structure, but that add nothing to the information being conveyed by the text; classic examples of stop words in English include "a", "and" and "the").
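The analysis step described above can be sketched in a few lines of Python. This is not Lucene's API (Lucene is a Java library with its own analyser classes); the function name `analyze`, the lowercase-and-split tokenising rule, and the three-word stop list are assumptions made purely to illustrate turning field text into terms while blocking stop words.

```python
import re

# Assumed stop list: just the three examples named in the review.
STOP_WORDS = frozenset({"a", "and", "the"})


def analyze(text, stop_words=STOP_WORDS):
    """Convert raw field text into a list of index terms."""
    # Tokenise: lowercase the text and keep runs of letters and digits.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    # Filter: drop structural words that carry no information.
    return [token for token in tokens if token not in stop_words]
```

Different analysers would differ mainly in the tokenising rule and in the stop list passed in.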
Chapter five looks at advanced search techniques; everything from sorting search results, searching on multiple fields to filtering searches. Many free or open source software tools are extensible, and Lucene is no exception. Chapter six addresses creating and using custom components within Lucene, everything from custom sort methods to custom filters.
Part two, the final four chapters, covers Applied Lucene. It is dedicated to practical uses of Lucene and answers the question "So, what can I do with a search engine?" Chapter seven covers ways and means to parse common, non-plain text document formats. The primary formats covered are RTF, XML, PDF, HTML and Microsoft Word. The ability to parse and index these file formats will cover the search engine needs of the majority of Lucene users. Chapter eight looks at a number of Lucene tools and extensions that are available, many of them free and open source software. Chapter nine covers ports of Lucene. While for many users Lucene being a Java library is not a problem, some users want its functionality in environments that do not have Java. The chapter looks at ports written in C++, C#, Perl and Python. Lastly, chapter ten takes a thorough look at seven Lucene case studies. Perhaps the "star" case study is the one about Nutch, a download and go search engine written by Doug Cutting, the original author of Lucene.
There are three appendices. The first offers installation advice for Lucene, a useful addition that those newer to working with Java libraries will surely appreciate. The second appendix has a very well explained description of the Lucene index format. This is the kind of information that can be hard to find, so it is welcome in a book of this sort. The last appendix contains a number of categorised resource references. The number and breadth of the resources provided could provide quite an incredible education in information retrieval theory if the reader were inclined to read them all.
What's to Like?
There are several things to like about this book. Let's start with the fact that the authors are part of the core development team of Lucene. This gives them both credibility and an excellent understanding of the internal workings of Lucene. Co-author Erik Hatcher is a fantastic writer, having previously been a co-author of the only Ant book worth bothering with, Manning's Java Development with Ant. (Full disclosure: I do know Erik personally.)
The structure of the book is well thought out and each chapter does seem to move your understanding forward when combined with what you learned from the preceding ones. The division into core and applied Lucene is also helpful. While you'd hope that this was the case, it often isn't; hence I note it as a positive.
I especially appreciate that this book does not fill up page after page with API documentation. The authors appear to have grasped that if you have Internet access to download the software, you might just be able to access the documentation online; rather, they concentrate on the way to use the software. What a concept!
As a part of Manning's "in Action" series, the book has excellent layout and has obviously been thoroughly edited by both technical evaluators and copyeditors. This might seem to be a small thing to some, but a well-edited book stands out clearly from the crowd.
What's to consider?
If you are looking for a book on using and configuring a download and go style of search engine, this book would be less suitable. While the case study on Nutch is of good length, it would be too short to be useful as a configuration guide.
Conclusion
I enjoyed reading this book. If you have any text searching needs, this book will be more than sufficient equipment to guide you to successful completion. Even if you are just looking to download a pre-written search engine, this book will provide a good background to the nature of information retrieval in general and text indexing and searching specifically.
You can purchase Lucene in Action from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
TI Calculators Play Movies
ipapusha writes "TI Calculator enthusiasts rejoice. A few weeks ago, Dan Englender released a new flash application, usb8x. Usb8x is a driver that interfaces with the On-the-Go USB port in the TI-84 Plus and TI-84 Plus Silver Edition. It is designed to be used by other programmers to create drivers for a variety of USB peripherals, including a keyboard and mouse. Already, ticalc.org's own Michael Vincent has interfaced his Lexar JumpDrive to play The Matrix's famous lobby scene. (mirror)." -
Convincing Your Superiors to GPL the Code?
jakobgrimstveit asks: "At work I've been developing an intranet/extranet portal framework in PHP based on many other people's work, including quite a few PEAR modules. I've always wanted to release the framework under the GPL and publish it on SourceForge, and my boss has - impressively enough - not been too negative about this. This has been going around in the organization for quite a while now, and finally the reply from the company's president was (not surprisingly): 'Why should we do so?' I now have the task of writing a document listing the main reasons for GPLing the code, and this is where I turn to the highly competent Slashdot crowd: How do I convince my bosses to GPL the code I've written? I assume many other developers have the same problems trying to convince their bosses to open up their code." -
Ending Spam
Shalendra Chhabra writes "Jonathan Zdziarski has been fighting spam since before the first MIT spam conference in 2003, and has now released a full-on technical book, Ending Spam, on spam filtering. Ending Spam covers how the current and near-future crop of heuristic and statistical filters actually work under the hood, and how you can most effectively use such filters to protect your inbox." Read on for the rest of Chhabra's review.
Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification
author: Jonathan A. Zdziarski
pages: 312
publisher: No Starch Press
rating: 8
reviewer: Shalendra Chhabra
ISBN: 1593270526
summary: Very Good Book Covering Statistical Models and Techniques Implemented in Current Spam Filters
Spam (unsolicited commercial email) and phishing (fraudulent emails) are causing losses of billions of dollars to businesses. Many initiatives are currently underway for fighting this challenge. On the legal front, a Virginia court recently sentenced a prolific spammer, Jeremy Jaynes, to nine years in prison, and a Nigerian court sentenced a woman to two and a half years for phishing. Michigan and Utah have both passed laws creating "do-not-contact" registries in July/August 2005, covering e-mail addresses, instant messaging addresses and telephone numbers. Technical initiatives to fight spam include server- or client-side spam filtering, using Lists (Blacklists, Whitelists, Greylists), Email Authentication Standards (IIM, DK, DKIM, SPF, SenderID), and emerging sender reputation and accreditation services.
Ending Spam is the first book explaining the fine details of the theoretical models and machine-learning algorithms implemented in these filters. The book is divided into three parts: introduction to spam filtering, fundamentals of statistical filtering, and advanced concepts of statistical filtering.
The first section of the book discusses the history of spam, spam kings, different approaches for fighting spam such as blacklisting, whitelisting, heuristic filtering, challenge response, throttling, collaborative filtering, Authenticated SMTP, Sender Policy Framework and SenderID, spammer fingerprinting, etc. However, the author omitted any mention of locally-sensitive hash functions (such as Nilsimsa Hash) to counter spammers' random insertion of words, the use of CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart), Greylisting, Identified Internet Mail, and Domain Keys (now Domain Keys Identified Mail).
In the next chapter, the author clearly explains various components of a Language Classifier Pipeline, including the Historical Dataset (aka wordlist, database, dictionary, filter memory), Tokenizer, and the Analysis Engine with its feedback loop. However, the process flow of a language classifier could have been more generalized, e.g. incorporating an initial text-to-text transformer. This chapter also covers the advantages and disadvantages of various training modes for filters, such as Train Everything (TEFT), Train-on-Error (TOE), and Train Until No Errors (TUNE). This part concludes with the description of Paul Graham's famous spam-filtering technique using Bayesian classification (as described in "A Plan for Spam"), Gary Robinson's Geometric Mean Test, Fisher-Robinson's Inverse Chi-Square (including the source code for the inversion function), and some other tricks for optimizing spam-filtering accuracy.
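Two of the combining techniques named above can be sketched in Python. This is a minimal illustration, not the book's source code: the function names are assumptions, the first two functions follow the formulas Paul Graham published in "A Plan for Spam", and the last two follow the commonly published inverse chi-square combining scheme, which is assumed here to be what the review means by Fisher-Robinson. Token probabilities are assumed to lie strictly between 0 and 1, and both corpus sizes are assumed to be positive.

```python
from math import exp, log, prod


def token_probability(good, bad, ngood, nbad):
    """Graham-style spam probability for one token, or None if it is too rare."""
    g, b = 2 * good, bad          # good counts are doubled to bias against false positives
    if g + b < 5:
        return None               # too little evidence to score this token
    bad_ratio = min(1.0, b / nbad)
    good_ratio = min(1.0, g / ngood)
    return max(0.01, min(0.99, bad_ratio / (good_ratio + bad_ratio)))


def graham_combine(probs, most_interesting=15):
    """Combine the token probabilities that lie farthest from the neutral 0.5."""
    chosen = sorted(probs, key=lambda p: abs(p - 0.5), reverse=True)[:most_interesting]
    spam = prod(chosen)
    ham = prod(1.0 - p for p in chosen)
    return spam / (spam + ham)


def chi2q(x2, dof):
    """Inverse chi-square: probability that a chi-square value with an even
    number of degrees of freedom `dof` is at least `x2`."""
    m = x2 / 2.0
    total = term = exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def chi_square_combine(probs):
    """Inverse chi-square combining: near 1 means spam, near 0 means ham."""
    n = len(probs)
    spamminess = 1.0 - chi2q(-2.0 * sum(log(1.0 - p) for p in probs), 2 * n)
    hamminess = 1.0 - chi2q(-2.0 * sum(log(p) for p in probs), 2 * n)
    return (1.0 + spamminess - hamminess) / 2.0
```

One visible difference between the two: a message with mixed strong evidence tends to land near 0.5 under the chi-square scheme, which gives a natural "unsure" range, while Graham's product formula tends to push scores toward 0 or 1.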
The second part of this book deals with the fundamentals of statistical filtering. The author explains HTML and Base64 encoding, followed by a detailed description of tokenization techniques (e.g. Sparse Binary Polynomial Hashing). Then there's a discussion of the various tricks that spammers use for penetrating filters. Although these tactics are mentioned in John Graham-Cumming's "Spammers Compendium," Jonathan has very elegantly explained why some tricks work for spammers and some don't. This part concludes by addressing some of the resource, storage and scaling concerns raised by the large number of features generated from tokenization techniques.
The third part of this book deals with advanced concepts of statistical filtering. This includes the testing criteria for measuring accuracy of an email filter, and some advanced tokenization concepts, e.g. chained tokens (taking word-pairs and phrases into account, instead of individual words) generated using a sliding 5-byte window as mentioned in Sparse Binary Polynomial Hashing. The next chapter describes the Markovian Model implemented in the CRM114 Discriminator, but the author fails to describe different weighting schemes for features implemented in the Markovian-based version of CRM114. The author then describes the Bayesian Noise Reduction Technique for purging "out of context" data from the mail text. This chapter concludes with a very nice summary of collaborative algorithms and techniques, such as Message Inoculation, Streamlined Blackhole List, Fingerprinting, Automatic Whitelisting, URL Blacklisting, and Honeypot email addresses for snaring spammers' address harvesting bots.
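The chained-token idea mentioned above (word pairs in addition to single words) can be illustrated with a short Python sketch. The function name `chained_tokens` and the space-joined pair format are assumptions for illustration; this is the simple adjacent-pair case only, not Sparse Binary Polynomial Hashing, which generates many more features from each sliding window.

```python
def chained_tokens(words):
    """Return the single words followed by every adjacent word pair."""
    pairs = [f"{first} {second}" for first, second in zip(words, words[1:])]
    return list(words) + pairs
```

Pairs let a filter tell "click here" apart from the two words appearing separately, at the cost of a much larger feature set, which is the storage and scaling concern the review mentions.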
The most interesting part of this book is the appendix, where the author presents interviews with John Graham-Cumming of POPFile, Brian Burton of SpamProbe, Marty Lamb of TarProxy, Bill Yerazunis of CRM114 Discriminator, and Jonathan Zdziarski of DSPAM (himself). I loved this section.
The salient points of the book: it's very easy to read; each chapter begins with a very thought-provoking introduction and concludes with a crisp "final thoughts" section. Technical errors are very few in this printing, and the illustrations are of good quality. Since the book is geared more toward the Bayesian and statistical generation of spam filters, the absence of certain spam-busting technologies is acceptable. However, a noticeable omission is the lack of discussion about measuring spam-filter accuracy, and what impact this has on setting filtration thresholds. A section on the economics of tradeoffs, and the use of a Receiver Operating Characteristic (ROC) curve, would have been very helpful.
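To make the threshold tradeoff concrete, here is a minimal Python sketch of how the points of an ROC curve are computed. The function name `roc_points` and the convention that a message is flagged when its score is at or above the threshold are assumptions for illustration; the input is assumed to contain at least one spam and one legitimate message.

```python
def roc_points(scores, is_spam, thresholds):
    """Return (false positive rate, true positive rate) for each threshold."""
    spam_total = sum(1 for flag in is_spam if flag)
    ham_total = len(is_spam) - spam_total
    points = []
    for threshold in thresholds:
        # Spam correctly flagged at this threshold.
        caught = sum(1 for s, flag in zip(scores, is_spam) if flag and s >= threshold)
        # Legitimate mail wrongly flagged at this threshold.
        false_alarms = sum(1 for s, flag in zip(scores, is_spam) if not flag and s >= threshold)
        points.append((false_alarms / ham_total, caught / spam_total))
    return points
```

Lowering the threshold catches more spam but flags more legitimate mail; plotting these points shows how much of one is being traded for the other.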
Overall, by putting together Ending Spam, Jonathan Zdziarski has made another significant contribution (after DSPAM) to the anti-spam community. Whether you are a system administrator, anti-spam researcher, engineer or a newbie interested in fighting spam, this book is a great reference.
William S Yerazunis and Richard Jowsey also contributed to this review. Shalendra Chhabra is a graduate student in the Department of Computer Science and Engineering at the University of California, Riverside. He is on the development team of CRM114 Discriminator and has presented his work at the MIT Spam Conference 2005, Cisco Systems, and Stanford University. You can purchase Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Ending Spam
Shalendra Chhabra writes "Jonathan Zdziarski has been fighting spam since before the first MIT spam conference in 2003, and has now released a full-on technical book, Ending Spam, on spam filtering. Ending Spam covers how the current and near-future crop of heuristic and statistical filters actually work under the hood, and how you can most effectively use such filters to protect your inbox." Read on for the rest of Chhabra's review. Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification author Jonathan A. Zdziarski pages 312 publisher No Starch Press rating 8 reviewer Shalendra Chhabra ISBN 1593270526 summary Very Good Book Covering Statistical Models and Techniques Implemented in Current Spam Filters
Spam (unsolicited commercial email) and phishing (fraudulent emails) are causing losses of billions of dollars to businesses. Many initiatives are currently underway for fighting this challenge. On the legal front, a Virginia court recently sentenced a prolific spammer, Jeremy Jaynes, to nine years in prison, and a Nigerian court sentenced a woman to two and a half years for phishing. Michigan and Utah have both passed laws creating "do-not-contact" registries in July/August 2005, covering e-mail addresses, instant messaging addresses and telephone numbers. Technical initiatives to fight spam include server- or client-side spam filtering, using Lists (Blacklists, Whitelists, Greylists), Email Authentication Standards (IIM, DK, DKIM, SPF, SenderID), and emerging sender reputation and accreditation services.
Ending Spam is the first book explaining the fine details of the theoretical models and machine-learning algorithms implemented in these filters. The book is divided into three parts: introduction to spam filtering, fundamentals of statistical filtering, and advanced concepts of statistical filtering.
The first section of the book discusses the history of spam, spam kings, and different approaches for fighting spam, such as blacklisting, whitelisting, heuristic filtering, challenge response, throttling, collaborative filtering, Authenticated SMTP, Sender Policy Framework and SenderID, spammer fingerprinting, etc. However, the author omitted any mention of locality-sensitive hash functions (such as Nilsimsa Hash) to counter spammers' random insertion of words, the use of CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart), Greylisting, Identified Internet Mail, and Domain Keys (now Domain Keys Identified Mail).
In the next chapter, the author clearly explains various components of a Language Classifier Pipeline, including the Historical Dataset (aka wordlist, database, dictionary, filter memory), Tokenizer, and the Analysis Engine with its feedback loop. However, the process flow of a language classifier could have been more generalized, e.g. incorporating an initial text-to-text transformer. This chapter also covers the advantages and disadvantages of various training modes for filters, such as Train Everything (TEFT), Train-on-Error (TOE), and Train Until No Errors (TUNE). This part concludes with the description of Paul Graham's famous spam-filtering technique using Bayesian classification (as described in "A Plan for Spam"), Gary Robinson's Geometric Mean Test, Fisher-Robinson's Inverse Chi-Square (including the source code for the inversion function), and some other tricks for optimizing spam-filtering accuracy.
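As a rough illustration of two of the combining rules named above, here is a minimal Python sketch. It is not the book's source code: the function names (graham_combine, chi2q, fisher_robinson_combine) and the clamping constant are assumptions made for this example, and the inputs are taken to be per-token spam probabilities that some earlier training step has already produced.

```python
import math

def _clamp(p, eps=1e-9):
    # Keep probabilities strictly inside (0, 1) so logs and ratios stay finite.
    # The eps value is an arbitrary choice for this sketch.
    return min(max(p, eps), 1.0 - eps)

def graham_combine(probs):
    """Naive-Bayes style combination in the spirit of "A Plan for Spam"."""
    spam = math.prod(_clamp(p) for p in probs)
    ham = math.prod(1.0 - _clamp(p) for p in probs)
    return spam / (spam + ham)

def chi2q(x2, dof):
    """Chance that a chi-square variable with an even `dof` exceeds x2."""
    if dof % 2 != 0:
        raise ValueError("dof must be even")
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def fisher_robinson_combine(probs):
    """Inverse chi-square combination: near 1.0 is spammy, near 0.0 is hammy."""
    ps = [_clamp(p) for p in probs]
    n = len(ps)
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in ps), 2 * n)
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in ps), 2 * n)
    return (1.0 + s - h) / 2.0

if __name__ == "__main__":
    tokens = [0.99, 0.95, 0.80, 0.20]  # made-up per-token probabilities
    print(graham_combine(tokens), fisher_robinson_combine(tokens))
```

One practical difference is that the chi-square form tends to land near 0.5 when the evidence is mixed, which gives a filter a natural "unsure" band, whereas the product form is pushed toward 0 or 1 by a few strong tokens.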
The second part of this book deals with the fundamentals of statistical filtering. The author explains HTML and Base64 encoding, followed by a detailed description of tokenization techniques (e.g. Sparse Binary Polynomial Hashing). Then there's a discussion of the various tricks that spammers use for penetrating filters. Although these tactics are mentioned in John Graham-Cumming's "Spammers Compendium," Jonathan has very elegantly explained why some tricks work for spammers and some don't. This part concludes by addressing some of the resource, storage and scaling concerns raised by the large number of features generated from tokenization techniques.
The third part of this book deals with advanced concepts of statistical filtering. This includes the testing criteria for measuring accuracy of an email filter, and some advanced tokenization concepts, e.g. chained tokens (taking word-pairs and phrases into account, instead of individual words) generated using a sliding 5-byte window as mentioned in Sparse Binary Polynomial Hashing. The next chapter describes the Markovian Model implemented in the CRM114 Discriminator, but the author fails to describe different weighting schemes for features implemented in the Markovian-based version of CRM114. The author then describes the Bayesian Noise Reduction Technique for purging "out of context" data from the mail text. This chapter concludes with a very nice summary of collaborative algorithms and techniques, such as Message Inoculation, Streamlined Blackhole List, Fingerprinting, Automatic Whitelisting, URL Blacklisting, and Honeypot email addresses for snaring spammers' address harvesting bots.
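To make the tokenization ideas concrete, here is a small Python sketch of chained tokens and of SBPH-style feature generation. It is only an illustration under stated assumptions: the window here counts whitespace-separated words rather than bytes, the "<skip>" placeholder and both function names are invented for the example, and a real implementation would hash each feature instead of keeping the strings.

```python
def chained_tokens(words):
    """Adjacent word pairs, so "cheap pills" counts as one feature."""
    return [f"{a} {b}" for a, b in zip(words, words[1:])]

def sbph_features(words, window=5, skip="<skip>"):
    """For each word, emit every combination of the preceding words in the
    window (kept or replaced by a placeholder), always ending in that word."""
    features = []
    for end, newest in enumerate(words):
        older = words[max(0, end - window + 1):end]
        for mask in range(2 ** len(older)):
            parts = [w if (mask >> i) & 1 else skip for i, w in enumerate(older)]
            features.append(" ".join(parts + [newest]))
    return features

if __name__ == "__main__":
    text = "buy cheap pills".split()
    print(chained_tokens(text))
    print(sbph_features(text))
```

With a full window of five words, each new word yields 16 features, which is why storage and scaling become a concern for this family of tokenizers.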
The most interesting part of this book is the appendix, where the author presents interviews with John Graham-Cumming of POPFile, Brian Burton of SpamProbe, Marty Lamb of TarProxy, Bill Yerazunis of CRM114 Discriminator, and Jonathan Zdziarski of DSPAM (himself). I loved this section.
The salient points of the book: it's very easy to read; each chapter begins with a very thought-provoking introduction, and concludes with a crisp "final thoughts" section. Technical errors are very few in this printing, and the illustrations are of good quality. Since the book is geared more toward the Bayesian and statistical generation of spam filters, the absence of certain spam-busting technologies is acceptable. However, a noticeable omission is the lack of discussion about measuring spam-filter accuracy, and what impact this has on setting filtration thresholds. A section on the economics of tradeoffs and the use of a Receiver Operating Characteristic (ROC) curve would have been very helpful.
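For readers unfamiliar with the ROC idea, the following Python sketch shows roughly what such a discussion would cover: sweeping the spam threshold, recording the false-positive and true-positive rates at each setting, and choosing the threshold that minimizes an assumed cost. The scores, labels, cost weights, and function names are all made up for illustration and do not come from the book.

```python
def roc_points(scores, labels):
    """Return (threshold, false_positive_rate, true_positive_rate) tuples.
    labels[i] is True when message i really is spam; a message is flagged
    as spam when its score is >= the threshold."""
    spam = sum(1 for y in labels if y)
    ham = len(labels) - spam
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        points.append((t, fp / ham, tp / spam))
    return points

def cheapest_threshold(scores, labels, fp_cost=100.0, fn_cost=1.0):
    """Pick the threshold with the lowest total cost, assuming a lost
    legitimate mail (false positive) costs far more than a missed spam."""
    spam = sum(1 for y in labels if y)
    ham = len(labels) - spam
    best_t, best_cost = None, float("inf")
    for t, fpr, tpr in roc_points(scores, labels):
        cost = fp_cost * fpr * ham + fn_cost * (1.0 - tpr) * spam
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t

if __name__ == "__main__":
    scores = [0.95, 0.9, 0.8, 0.6, 0.4, 0.2, 0.1]
    labels = [True, True, False, True, False, False, False]
    for point in roc_points(scores, labels):
        print(point)
    print(cheapest_threshold(scores, labels))
```

Changing fp_cost and fn_cost moves the chosen threshold along the curve, which is the economic tradeoff the review has in mind.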
Overall, by putting together Ending Spam, Jonathan Zdziarski has made another significant contribution (after DSPAM) to the anti-spam community. Whether you are a system administrator, anti-spam researcher, engineer or a newbie interested in fighting spam, this book is a great reference.
William S Yerazunis and Richard Jowsey also contributed to this review. Shalendra Chhabra is a graduate student in the Department of Computer Science and Engineering at the University of California, Riverside. He is on the development team of CRM114 Discriminator and has presented his work at the MIT Spam Conference 2005, Cisco Systems, and Stanford University. You can purchase Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Summer Internships - The Good, and the Bad?
loquacious d asks: "This has been a spectacular summer for open-source student internships. Google funded a huge variety of open-source projects through the Summer of Code, including GCC-CIL and other improvements to Mono, new features and fixes for Gaim, and even new packages for Common Lisp. Joel Spolsky at Fog Creek hired four interns to produce a highly modified version of VNC called Fog Creek Copilot, and Paul Graham's new venture capital firm Y Combinator helped students create their own tech companies. What internships did people enjoy this summer, and which ones didn't work out so well? Which ones would you recommend to next year's applicants, and which should they avoid?" -
Fun Stuff at OSCON 2005
OSCON 2005 was held in a convention center this year, instead of a hotel, because it just got too big (2000+ people). Too big, in fact, for pudge and myself to cover more than a fraction of the talks and the ideas flitting around the hallways. But here's some of what I found cool last week. And if you attended or presented at OSCON and want to tell us about all the neat stuff we missed, please, share your thoughts in the comments, or submit a fact-rich writeup and we'll maybe do a followup story later.
Mike Shaver's talk on writing Firefox extensions was packed to the walls. If you've been wanting to try it, Firefox 1.5 makes development easier, and should be out soon, so now's a good time. This talk and the tutorial on Ajax persuaded me to start using the DOM Inspector and debugging some JavaScript to get a better understanding of webpage manipulation.
Aaron Boodman's talk on his extension Greasemonkey was a walkthrough of writing a simple GM user script, a discussion of what's coming up, and some Q&A. Greasemonkey 0.5 ("Now With Security!") is in beta: there are multiple security changes that suggest someone really has sat down and thought the whole model through. GM works with Firefox, Seamonkey, Opera, and Windows MSIE (but not, oh please somebody correct this oversight, Safari).
Ruby on Rails is hot; if you want to develop a web app quickly you can't ignore it. It stresses "convention over configuration" with reasonable defaults. The tutorial went from installation to the "hello world" of the web, a blog (!), in a few hours. Anyone have a real-world example of Rails scaling to a large project and lots of traffic?
DarwinBuild is an open-source project from Apple that aids in building the open-source components of Darwin/Mac OS X. Given a build number of Mac OS X, it will fetch and build the software for that version, allowing you to modify the source as needed, making it easy for any developer to modify everything from the kernel to various utilities (just remember to reapply the modifications after running Software Update, if necessary). You can read more about it from, in addition to the web site, the presentation slides.
Google and O'Reilly gave out the 2005 open source awards, with $5000 attached to each. Congratulations to the winners.
Tony Baxter's Shtoom is a cross-platform VoIP client and software framework, written in Python, for writing your own phone applications.
Novell is still moving its employees from Windows to Linux, which we first heard at last year's OSCON. The migration from Microsoft Office to OpenOffice is complete, and the big step, from Windows to Linux, is 50% complete, projected to be 80% by November. Miguel de Icaza gave flashy demos of some Linux desktop applications that didn't impress this cynical observer very much.
PlaceSite is an open-source project looking to bring physical proximity awareness to Internet access at coffeeshops and other meetingplaces: think "local-only Friendster" and you're not far off. They got feedback from a monthlong trial earlier this year and are working on a new version that will be easy to deploy. Could be neat.
In a great 2-hour session on Wednesday, we got to hear from representatives of four leading open source databases about what they've been working on lately. Here are the summaries...
Ingres r3 has an impressive list of big features. Ingres was just open-sourced by Computer Associates this summer, and it's gotten a lot of attention for being a full-featured enterprise database. Ingres supports table partitioning that can be either range-based or hash-based, which can greatly improve performance in many cases. Its optimizer can now come up with parallel execution plans, which can be useful even on single-CPU machines and non-partitioned tables. There's also federated data storage (one can access data stored in another RDBMS through Ingres) and replication. And they're working on a concurrent access cluster, to allow data to be manipulated not just by multiple threads on one machine, but multiple machines.
A side note: Computer Associates was invited by O'Reilly to talk about its recently open-sourcing Ingres. Its representative, while confessing that introducing a new license was "probably the wrong thing to do," said that other licenses wouldn't have worked for them (the GPL "was seen as viral"). The one question that the audience had time to ask was "is Ingres a dump" -- is CA making it open-source to transfer the responsibility of support from the company to the community? The three-part "no" answer was that there are more CA developers working on Ingres now, that Ingres is at the core of their new releases, and that they've sponsored a "million-dollar challenge" to foster community interest. Time will tell I guess.
Firebird 2.0 has been in alpha since January and a beta is expected soon. Since 2000 much of their development has been aimed at making the product easy to install, and making the code easy for a distributed group of developers to work on. This year they're building features on that groundwork. Their design includes 2-phase commits (since the beginning), cooperative garbage collection (as a transaction encounters unneeded data, it removes it) and self-balancing indexes. Backup has been improved. When 2.0 gets to beta, I'm going to check this out, it sounds like very interesting technology (and apparently it will install with four clicks!).
MySQL 5.0 is in beta, and has been feature-frozen since April. Back in 4.1, its abstracted table type was put to good use with unusual engines like Archive (insert only, no update); Blackhole for fast replication; and an improvement to MyISAM for logging (allowing concurrent selects with inserts-at-table-end). Their Connector/MXJ lets you run a native MySQL server embedded inside a Java application. In 5.0 we're seeing stored procedures per the SQL:2003 standard, triggers, updatable views, XA (distributed transactions), SAP R/3 compatible server-side cursors, fast precision math, a federated storage engine, a greedy optimizer for better handling of many-table joins, and an optional "strict mode" to turn some of MySQL's friendly nonstandard warnings into compliant errors. And they're working on partitioning, ODBC, and letting MySQL Cluster's non-indexed columns be stored on disk.
PostgreSQL 8.1 is expected to be released in November or December, after a feature-freeze in July -- and it's an impressive list of new features. Their optimizer will make use of multiple indexes when appropriate, which is pretty darn exciting. The recommendation will be that in most cases it will be most efficient to have only single-column indexes and let the optimizer figure out which combination to use. They're implementing a 2-phase commit, they're bringing the automatic vacuum into the core code, and they removed a global shared buffer lock so they're now getting "almost linear" SMP performance scaling. I've never felt the need for Postgres, but I'm definitely going to look at 8.1.
-
Another Step Towards BSD on the Desktop
linuxbeta writes "DesktopBSD is the latest easy to install BSD aimed squarely at the desktop. Installation screen shots. From their site: 'DesktopBSD aims at being a stable and powerful operating system for desktop users. DesktopBSD combines the stability of FreeBSD, the usability and functionality of KDE and the simplicity of specially developed software to provide a system that's easy to use and install.' DesktopBSD joins the ranks of PC-BSD and FreeSBIE." -
DooM Remix Project - The Dark Side of Phobos
djpretzel writes "The Dark Side of Phobos is the latest in a series of site projects at OverClocked ReMix, which each provide unofficial, non-commercial fan arrangements of entire game soundtracks (Sonic 2, Kirby, Donkey Kong Country, and Super Metroid, to date). This latest addition covers id software's perennial classic, the original DooM, with 23 tracks by 19 artists. More information is available at doom.ocremix.org, or simply download the torrent with both mp3 and lossless FLAC, site unseen. Mars never sounded so good." -
Microsoft Linux Lab Manager Responds
Bill Hilf, Microsoft's Linux Lab Manager, got his answers to your questions back to us in time to publish them just before the San Francisco LinuxWorld, where he is speaking. Before you ask: Yes, Microsoft PR had a look at his answers before he sent them. So if you have any follow-up questions for Mr. Hilf, please post them below and I'll try to ask at least a few of them in person at LinuxWorld.
1) Start with the obvious
by Raul654
Dear Mr. Hilf - Surely by now you have to have been accused of helping Microsoft try to exterminate Linux. How do you respond to such accusations?
Bill:
I get that occasionally, you bet. But usually after I explain what I'm actually doing, it helps clear up the conspiracy theories (of which, there are quite a few). The truth is my job is to help Microsoft have a clear, unbiased and knowledgeable understanding of Open Source Software (OSS): the technology, the development models, how the community works, the pros and cons, and the mechanics of the overall process. So, no, Microsoft is not out to exterminate Linux or Open Source, Linux and Open Source Software will continue to be part of the software industry. My job is to help Microsoft have an understanding of the Open Source technology world.
In fact, Microsoft has benefited from OSS, has participated in OSS projects, and feels that OSS will continue to have an important role in the ecosystem. Both commercial and open source offer specific advantages. And several development models can and should coexist in healthy competition. After many years of working in both environments, a mantra I've seen pay off numerous times is "choose technology to fit the need" not based on a belief or religion: in other words, if the software doesn't solve the problem in a cost effective way, belief and religion won't stop the IT guys' cell phones and pagers from ringing at 2 AM, and that goes for *any* technology, regardless of the development model.
2) Open Standards
by Oriumpor
How does Microsoft internally deal with Open Standards and Open Document Formats?
I suppose more generally: In your testing is it solely relegated to Linux in the Server role, or do you address End-User issues as well?
Bill:
We are interested in all sorts of distributions, commercial and non-commercial, of Linux and we test many types of Open Source software overall.
We are very active in helping our product teams test out their open standards implementations. For example, we are currently doing this with Windows Server R2 (a release of Windows Server due out later this year) and its support for NFS and NIS. In a broader answer to this question, Microsoft strongly supports the promotion of open standards. Microsoft's participation in standards bodies such as IETF, W3C and OASIS, and our royalty-free contributions of technology to Web Services standards supports this commitment.
That said, Open Source does not equal Open Standards. It surprises me that this is an issue that (some) people still don't really comprehend. Let's break it down:
* The term "open standards" describes the results of a process for establishing uniform technical specifications (when used in the broader sense);
* While the term "open source," by contrast, refers to a software development and licensing model.
* Open standards may be implemented by software developed under any development and licensing model - non-OSS and OSS alike.
The VCR is a good example of a standards-based product that allowed any video tape* to play on any player - providing a marketplace of competitive VCR implementations, competitive tape media suppliers, and commercial opportunities.
*go ahead, someone say "Hey, but what about Betamax?" - but you get my point.
3) Penguin Aid?
by deathcloset
No doubt one of the activities of microsoft's linux lab is testing the security of linux.
My question is this: if you find a security vulnerability in linux, do you inform the linux community about it?
Bill:
We definitely look at security technologies in OSS in general, including Linux, but we do not actively do security code audits on Linux/OSS. We do occasionally stumble on bugs by accident in various products, and we always email the parties concerned, and it's up to them to do the right thing from that point on.
Let me give you some examples. Michael Howard, one of our security gurus here at Microsoft, has come across some issues in some projects, such as Apache.
As a company, we strongly believe in and encourage responsible disclosure of vulnerabilities. The practice of reporting vulnerabilities directly to a vendor is beneficial to everyone. It helps to ensure that customers receive high-quality software updates for security vulnerabilities, without exposure to malicious attackers while the update is being developed.
In my team's day to day work, we have discovered bugs and submitted fixes upstream. For example, the smbtorture test suite included with Samba had a bug that we identified. We provided a backtrace to the developers, and it was fixed and committed.
We also found some problems with the GAIM Instant Messaging client. GAIM's MSN via HTTP feature didn't work. The bug was noticed by our team because we had a real need for MSN via HTTP on our Linux desktops. So we fixed the issue and submitted the patch upstream.
4) Can Microsoft Ever Give Us Free As In Freedom?
by nurhussein
We've heard a lot about MS having a lower TCO etc., and who knows it may even be true in some cases, but does Microsoft realise that the reason some of us are on Linux is for the "Free as in Freedom" part? This may matter not to the PHBs, but some of the Linux users MS is trying to court such as HPC consist of engineers and scientists who operate things like particle accelerators and are unfazed by the "complexity" of Linux and appreciate the freedom to be able to customise it to their needs?
Can Microsoft ever be as liberal with their operating system as Linux developers are with Linux?
Bill:
Great question, and as someone who has spent time in the academic world as well as in the HPC world, I very much understand your point.
There's always a trade-off between modularity and integration, or said another way, there is always a balance between the ability to customize anything and everything and the ability to deliver a consistent, tested and supported software solution to a broad base of users.
This is not a Windows vs. Linux thing but more of a software design issue. The key is realizing that there's a continuum of possible trade-offs. With increased integration you have certain advantages and disadvantages, and conversely with increased modularity you have other advantages and disadvantages. As an operating system designer, you can pick where you want to be on this modularity/integration spectrum.
Microsoft has found that pursuing a balance, rather than one extreme, is a successful approach that fits the needs of our users and customers in a broad and effective way.
For the global software ecosystem, the best environment for innovation is the coexistence of OSS and commercial software. There is a good review of this successful interaction between software models here.
We try to provide the transparency and flexibility you describe through our Shared Source program. The Microsoft Shared Source Initiative is a range of programs and licenses to make Microsoft source code more broadly available to customers, partners, developers, governments, academics and other people who are interested. Shared Source now serves more than 1.5 million developers through source code access programs. What surprises most people when I tell them about our Shared Source program is that 99% of the >70 programs have full redistribution and modification rights.
5) Stranger in a strange land
by winkydink
Doesn't working at MS isolate you somewhat from the OSS community? What do you do to keep your OSS perspective and skills current?
Bill:
Believe it or not, I use more different types of OSS here at Microsoft than I've ever used before. Our team uses over 40 different flavors of Linux and BSD, plus several commercial Unix variants. Beyond this, we use an ever-growing number of OSS applications. In my spare time, I'm even learning some stuff about Windows :)
I also interact with the OSS community and am in contact with many people in the OSS development community from all sorts of different projects. It's important to keep open lines of communication. We may not always agree, but the dialogue is always open and friendly.
6) Why doesn't Microsoft release Microsoft Linux?
by amper
The subject says it all (mostly).
One of the primary reasons Linux is somewhat inferior to commercial offerings when considered as a general-purpose desktop operating system is that there is a lack of a single guiding human interface standard for the various groups to work toward. Companies such as Apple Computer and Microsoft have invested large amounts of money in human interface studies, and although much of this information has been made readily accessible to the public, it would appear that very little of that information has been put to good use by F/OSS developers.
With Apple using the BSD branch of software as its operating system core, do you see a future for a Microsoft-branded Linux distribution, using a Microsoft-developed HCI design?
Though there is a large amount of enmity in the F/OSS community toward Microsoft, it cannot be denied that Microsoft's development methods are demonstrably capable of producing quality software. Could Microsoft serve as a catalyst for consolidation within the community, while remaining true to the F/OSS philosophy? Could such a strategy be profitable for Microsoft?
Bill:
Without question, our strategic bet is on Windows. Windows Vista and Longhorn mark the threshold of our next wave of innovation. This might sound a bit like an 'I drank the Kool-Aid' type answer but I've seen what we've built and are in the process of building, and I've seen what we're architecting. Our developers are creating products and technologies that are redefining what is possible with software. It's an exciting time to be at Microsoft.
But you raise a good point, which is: can there be a positive reciprocal relationship between Microsoft and the OSS development community? I strongly believe the answer is "Yes" and I spend a lot of time trying to help this relationship mature. There is a great amount we can learn from one another, and we have just begun to explore the potential of this relationship.
7) Samba
by miltimj
Is one of your projects to assist in analyzing Samba source code to help coworkers better understand the SMB protocol?
Bill:
This is not something we do, but as I mentioned above, we do use the smbtorture test suite in our labs and we do test for Samba interoperability.
8) Execs trying Linux?
by unsinged int
Have you ever managed to get any of the big shots (for example, Gates) to sit down and try Linux for a few minutes? If so, what did they say? If not, why not? Did they have an allergic reaction and try to run away from you, or have you not asked?
I think it would be interesting to hear the opinions of people at Microsoft who actually have tried Linux (with KDE, OpenOffice, Firefox, etc.), versus the standard "Linux is evil" public relations line.
Bill:
All of our executives see and occasionally use non-Microsoft technologies. This is certainly going to get me flamed, but the Microsoft executives I have worked with are typically very technical, sometimes extraordinarily so. They grasp new technologies very quickly. Sometimes they say "Hey, that problem was solved five years ago - is that it?" -- other times they say "We've got some work to do". I personally have not had an experience here where someone said 'Linux is evil!' Microsoft is a company with deep roots in technology, so most people here approach technology - our own or others - with a technologist's curiosity and interest. Easily one of my favorite things about Microsoft is its culture of curiosity about technology and its potential.
9) Windows Services for Unix
by dtfinch
Microsoft has long offered Services for Unix free for download to provide a unix-like environment on Windows. I've seen rumors and speculation that SFU will be included by default in Windows Vista, with some GPL'd portions replaced or rewritten to maintain compliance. If it's true, what level of functionality and compatibility can we expect?
Bill:
You should attend my LinuxWorld session this week :)
I can't confirm what functionality will be in what version of Windows Vista. However, I can confirm that the next-generation of several components of Services for UNIX are being integrated into Windows Server 2003 R2. The NFS client, NFS Server, User/Name Mapping, Telnet Server & Client, Password Sync and NIS Server components of Services for UNIX are all present in the Windows Server 2003 R2 builds. In addition, a revamped POSIX subsystem, the "Subsystem for UNIX-based Applications" or "SUA" is also available as an optional install in R2.
Integrating this functionality in Windows Server 2003 R2 provides native support of cross-platform management tools, Windows/UNIX interoperability and UNIX to Windows application portability. This is a big help for many of the customers I talk to and something I will demonstrate at my LinuxWorld session this week.
10) Beat em or Join em?
by jdehnert
Having been in IT a looong time, I'm pretty familiar with all of the major players.
All of them have their +'s and -'s, but one of my biggest gripes about Microsoft is that instead of trying to leverage OSS, they continually try to crush or marginalize it. Over time I find myself less and less likely to consider a Microsoft solution because I know that over time Microsoft will try and make that solution less interoperable with all of my other solutions.
Microsoft would sell more software to me if I could be sure that they are NOT going to try and lock out all of my other platforms going forward.
Given your current position, does it look as if Microsoft will continue to try and marginalize OSS, or will they do an about face and work to try and ensure ongoing interoperability?
Bill:
If there's one thing that I'd like people to take away from this interview, it's that we can, and should, cooperate and learn from one another.
We love to write great software. One thing Microsoft knows well is the art of 'co-opetition' - competing and also cooperating. Both Microsoft and OSS technologies will continue to be around. We can compete - and competition is healthy - but just as important, we also need to cooperate and make sure that we pursue interoperability as a common goal. We need to be comfortable doing both, simultaneously. We need to have an open, mature relationship.
The key to making this happen is to have open lines of communication. If someone in the OSS community runs into a technical interoperability problem with Microsoft products, I want to know about it. In many cases, we'll be able to do something to resolve the issue. There may be a solution that already exists. Or the problem could be related to an issue that might need to be addressed by one of our product teams. But at the very least, I'll try my best to help and give you a straight answer.
One of my first demos to a high-level executive involved showing some standards-based Linux/Windows interoperability scenarios. I expected to receive an "If it's not built here, then I don't care" kind of response.
To my surprise, his reaction was the opposite: "This is good--we should do more of this type of thing." And I've seen this commitment from many others here at Microsoft, in a variety of roles. At the end of the day, we want software to "just work" too. That's what great software is all about.
If you'd like to contact me directly, I can be reached at billhilf at microsoft dot com.
------ -
The "Google Hack" Honeypot
An anonymous reader writes "On the heels of Google Hacking for Penetration Testers, and Johnny Long's talks at Blackhat/Defcon over the weekend, comes the "Google Hack" Honeypot, a honeypot designed to lure in malicious search engine activity. They had a second release of their tools on Monday, according to their site." -
DHTML Utopia
Bruce Lawson submits the review below of Stuart Langridge's DHTML Utopia: Modern web design using JavaScript and DOM, writing "Don't be put off by the title: the DHTML here bears no resemblance to the stupid Web tricks of the late 90s that allowed animated unicorns to follow your cursor or silly Powerpoint-like transitions between Web pages." Read on for the rest.
DHTML Utopia: Modern web design using JavaScript and DOM
author: Stuart Langridge
pages: 300
publisher: Sitepoint
rating: 8
reviewer: Bruce Lawson
ISBN: 0957921896
summary: Excellent guide to creating dynamic web pages; scalable and sensible.
This book is the opening salvo in the latest battle in the Web Standards war -- the battle for unobtrusive JavaScripting, or Unobtrusive DOMscripting as many call it, in order to rid it of all the negative connotations that "DHTML" and "JavaScript" bring. Combined with the non-standard XMLHttpRequest object, it's sometimes referred to as "Ajax". Terminology aside, though, what are the substantive differences between the old-skool and the "modern" of the title?
- Graceful degradation. A great example of this is Google Suggest in which the DOMscripting enhances functionality by making the page feel more responsive, but if you don't have JavaScript for some reason, the page still works.
- Separation of structure, presentation and behaviour. The DOMscript deals with the behaviour in the same way as CSS defines the presentation in the brave new Web standards world, and the three remain separate. The html has no JavaScript in it at all -- everything is handled in separate code files.
- No browser sniffing. This aims to future-proof code by testing for features rather than sniffing for browser name and version. So, before using the TimeTravelCureCancer method, the current browser is tested to see whether it's supported. If it is, the script continues. If it isn't, the script silently fails with graceful degradation.
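The feature-testing idea in the last bullet can be sketched roughly as below. This is a minimal illustration in TypeScript, not code from the book: the helper names `supports` and `enhance` are hypothetical, and `modernDoc` and `oldDoc` are plain objects standing in for two browsers' `document` objects so that the sketch runs without a browser.

```typescript
// Hypothetical helper: true only if `host` exposes a callable member `name`.
// No browser name or version is consulted anywhere.
function supports(host: unknown, name: string): boolean {
  if (host === null || (typeof host !== "object" && typeof host !== "function")) {
    return false;
  }
  return typeof (host as Record<string, unknown>)[name] === "function";
}

// Run the enhancement only when every feature it needs is present.
// Otherwise do nothing, so the un-enhanced page keeps working.
function enhance(host: unknown, needed: string[], run: () => void): boolean {
  for (const name of needed) {
    if (!supports(host, name)) {
      return false; // silent, graceful fallback
    }
  }
  run();
  return true;
}

// Illustrative stand-ins for the `document` object in two different browsers.
const modernDoc = {
  getElementById: (_id: string) => null,
  createElement: (_tag: string) => ({}),
};
const oldDoc = {
  write: (_html: string) => undefined,
};

let enhanced = 0;
enhance(modernDoc, ["getElementById", "createElement"], () => { enhanced += 1; });
enhance(oldDoc, ["getElementById", "createElement"], () => { enhanced += 1; });
```

Only the first call runs its enhancement; the second returns without touching the page, which is the graceful degradation the list describes.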
Chapter 1 has a brief (6-page) overview of the importance of valid code and separating presentation into CSS, and a short description of the unobtrusive nature of Langridge's scripts: no script in the mark-up at all; instead, the .js files contain "event listeners." The reasons why this is desirable are promised later.
Chapters 2-4: The basics
Now that document.write in the html is no longer needed, you need to know the "proper" way to add text or elements to a Web document. So Langridge gives us a tour of the DOM, showing how to walk the DOM tree and create, remove and add elements to the tree. It's methodical, and by the time I was beginning to get a bit tired of theory and thinking that you'll have to prise document.write out of my cold, dead hands, we get an "Expanding form" which allows us to expand a form ad infinitum to sign up as many friends as you want to receive free beer, without ever going back to the server. (You can see such a thing in action in Gmail, when you want to attach multiple documents to an email.)
I started to warm to the author and his style. 33 pages into the book, and we get a real-world working example to examine (I like my theory liberally garnished with practice). I also feel a kinship with authors who fantasise about mad millionaire philanthropists giving away beer.
By chapter 3, we've really got going. Apart from one rather pedantic edict (the event is mouseover, the event handler is onmouseover and we should separate the nomenclature, even though it makes no practical difference), the focus here is on real-life browsers. And, as we all know, in Web dev books, real-life browsers means grotesque exceptions to our ideal-world rules. Strangely -- and oddly satisfyingly to this PC user -- the culprit isn't only the perennially despised IE/Win; shiny Safari comes in for a good bit of stick!
The real-world example here is a data table that highlights the whole row and column of any cell that's being moused-over. Now, in any modern browser except for IE/Win, the row could be given a hover pseudoclass (IE/Win only allows :hover on anchors). But as there is (weirdly) no HTML construct for a column, this effect can only be achieved through DOMscripting. What the script does is to dynamically append a class name to every cell in the row and column at run time -- and the pre-defined CSS file determines the styling of that class.
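The class-appending technique described above might look something like the following. It is a sketch under stated assumptions rather than Langridge's script: the `Cell` interface is a stand-in for a DOM table cell (only its `className` string is modelled), the function names are hypothetical, and the class name "hover" is an arbitrary choice that a stylesheet would be expected to define.

```typescript
// Minimal stand-in for a DOM table cell: only the `className` string matters here.
interface Cell {
  className: string;
}

// Append `name` to the cell's space-separated class list if it is not already there.
function addClass(cell: Cell, name: string): void {
  const names = cell.className.split(/\s+/).filter((n) => n.length > 0);
  if (names.indexOf(name) === -1) {
    names.push(name);
  }
  cell.className = names.join(" ");
}

// Remove `name` from the cell's class list, leaving any other classes intact.
function removeClass(cell: Cell, name: string): void {
  cell.className = cell.className
    .split(/\s+/)
    .filter((n) => n.length > 0 && n !== name)
    .join(" ");
}

// Mark every cell that shares a row or a column with (row, col) and clear the rest.
// The script only toggles a class; the look of that class lives in the CSS file.
function highlight(grid: Cell[][], row: number, col: number, name: string): void {
  grid.forEach((cells, r) => {
    cells.forEach((cell, c) => {
      if (r === row || c === col) {
        addClass(cell, name);
      } else {
        removeClass(cell, name);
      }
    });
  });
}

// A 3x3 table whose cells all start with an existing class, "data".
const grid: Cell[][] = [0, 1, 2].map(() => [0, 1, 2].map(() => ({ className: "data" })));

// Simulate the mouse entering the cell in row 1, column 2.
highlight(grid, 1, 2, "hover");
```

In a real page the `highlight` call would sit inside a mouseover listener attached from the script file, so the html itself stays free of JavaScript.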
Herein lies an advantage in Unobtrusive DOMscripting: you could just take this script and plug it into a Web site without changing any of the html (except to add a link to the script file in the head). But the script is relatively complex for a newbie to code, and for the techniques to be widely used, I suspect that the billion old-skool cut'n'paste JavaScript sites will need to be replaced with a single, canonical library of modern scripts for people to cut and paste from. For those who find CSS challenging, JavaScript is probably even more complex.
Chapters 5-7: blurring the division between Web UI and application UI
It's a truism that the Web has set back UI development some years -- in fact, back to the old dumb-terminal paradigm of filling in a screen full of data, pressing the button to send it back to the mainframe and waiting for the next page to be sent -- or the old one returned with errors noted.
Langridge shows that we can make the experience smarter than this, going beyond traditional JavaScript client-side validation. The examples that struck me were animation that lets text fade in and out over time, and tooltips styled to be sexier than the default yellow box, which gently appear into view rather than snapping between the browser's default on/off states.
When I first read these, I thought they were cheesy gimmicks -- the modern equivalent of the cursor-following unicorn -- until I considered more deeply and realised that many of the UI elements that we enjoy in modern desktop apps are precisely these small, cosmetic effects: abrupt transitions, lack of transparency, sharp edges to UI widgets all feel like old operating systems or clunky Web pages.
It's not all touchy-feely; we get auto-complete text entry, degradable calendar pop-ups, flyout menus and lessons in OOP, encapsulating code for re-usability, and avoiding Internet Explorer memory leaks.
Chapters 8-10: seamlessly working with the Server
So far, so client-side. Where Unobtrusive DOMscripting really gets developers' juices flowing is the ability to communicate with the server without obviously refreshing the page. Chapter 8 takes you through a variety of methods. Some, like the hacky iframe method or hideous 204 piggyback method, are so gruesome that I breathed a sigh of relief loud enough to wake the cat when I finally turned the page to read "XMLHTTP". This method (which is non-standard and introduced by Microsoft) has ushered in the Next Great Web Thing: asynchronous communication with the server. Langridge walks through using the Sarissa library to make a user registration form that checks whether the user name you choose is taken and, if so, suggests some alternatives without refreshing the page.
There are a lot of unresolved accessibility problems with the Ajax method (how does a screenreader alert the listener to the fact that something new has appeared on the page? How do they navigate and hear the new stuff in context?) and while it is laudable that Langridge notes these issues exist, I'd hoped he would have suggested some solutions. He doesn't, but as he's a member of the Web Standards Project's DOMscripting task force I'm guessing it's being worked on.
The project that really kicks ass in this section is a file manager, like the one in most people's Web site control panels, where you can actually drag and drop the icons, like an operating system, and the server does the work. Langridge carefully goes through all of the steps, all of the pitfalls and all of the code needed to make this happen in any modern browser.
It doesn't take a lot of imagination to realise just how this could revolutionise the Web experience. Drag and drop products into a shopping cart. Drag the shopping cart to the checkout icon. Move money around bank accounts in some integrated internet banking application. The possibilities are huge.
Conclusion
The whole technique of unobtrusive DOMscripting needs further research before it's ready for prime time -- particularly from an accessibility point of view, but then as an accessibility bore you'd expect me to say that. I think it's beyond question that there are ideas in here that radically enhance the usability of Web-based applications by making them more intuitive and more like the drag-and-drop interface we know from our desktops.
This is a good-humoured, thoroughly-researched book that combines theory with practical learn-by-doing examples. To this reviewer, the code appears scalable and sensible. This book is never going to appeal to the quivering aesthete designers -- probably because it's fundamentally about code. But precisely because it proposes a complete separation of code and design, it facilitates the advancement of the Web.
You can purchase DHTML Utopia: Modern web design using JavaScript and DOM from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Inkscape 0.42: The Ultimate Answer
bulia byak writes "After several months of frantic work by the ever-growing developer community, the aptly numbered Inkscape 0.42 is out. The number of new features in this version is astounding. Quoting from the (gigantic!) Release Notes, "while some of the new features simply fill long-standing functionality gaps, others are truly revolutionary". Check out the screenshots and grab your package for Linux, Windows, or OSX." The screenshots are pretty mind-blowing; this isn't a 1.0 release, but I think you'll agree it's worth checking out. -
Can a Bayesian Spam Filter Play Chess?
martin-boundary writes "The typical Bayesian spam filters learn to distinguish ham from spam just by reading thousands of emails, but is this all they can do? This essay shows step by step how to teach a Bayesian filter to play chess against a human, on Linux, with XBoard." -
Pocket PC vs. Palm Showdown
Espectr0 writes "TuxTops has a small review comparing the Pocket PC handhelds against the Palm ones (no pun intended), with advantages and disadvantages of each. The conclusion? If you are after gaming, multimedia, good WiFi+Bluetooth support, a lot of accessories and versatility, go with Pocket PC. If you are after small and stylish devices with good battery life, simple interface and simple PIM apps, go with PalmOS." -
Next-Gen Game of Life
SQL31337 writes "Jecology is a life simulator created in the spirit of Conway's Game of Life. It touches on many topics such as cellular automata, ecological balance, and the food chain. There is only one type of creature in Conway's Game of Life (CGoL). They reproduce, but do not mutate or evolve. They do not have to find food, but instead simply die based on scarcity or overpopulation. Jecology encompasses these aspects of ecology with a more complex simulation, but retains much of the elegant simplicity found in CGoL. Jecology is not merely a life simulator, but an ecology simulator. It is also an example of a complex system arising from simple rules, as described in A New Kind of Science. Screenshots and info about Jecology here." -
Create New Atari 2600 Games With BASIC
2600fan writes "Fred Quimby, a game developer for the Stella Atari 2600 emulator, has released a new BASIC compiler that can be used to create games to run on Stella and potentially on the 2600 itself. The compiler generates efficient assembly-language code in DASM syntax, then uses DASM to make a binary. A bitmapped playfield and easy-to-use scrolling routines are included, and programmers can also add inline assembly language and directly access TIA registers if they please." -
Project Gizmo Challenges Skype
valmont writes "The Register is offering an interesting introduction to Project Gizmo, a new player in the Voice over IP field, poised to challenge Skype with its ability to interoperate with others thanks to the SIP protocol it complies with. Whereas Skype has selectively licensed usage of an API that offers limited insight into a closed protocol, a closed ecosystem solely controlled by one organization, the SIP protocol is open. Free open-source proxy/server implementations are sprouting up, and many developers are actively working on SIP clients. The Gizmo Project is the first to bring a truly-usable, user-friendly, cross-platform SIP client (Mac, Windows, Linux coming soon) to market. Meanwhile, theappleblog.com is already offering a Gizmo Project Wish-List to promote better interoperability between current and upcoming SIP providers, to make it more practical for users of disparate SIP clients to communicate with one another." -
Columba Developers Interview
Anonymous Coward writes "Scott Delap's ClientJava.com has an interview with the developers behind Columba, an open source Java email client. They answer questions about Columba development and general Java/Swing issues desktop Java applications face nowadays." -
DECnet Isn't Dead
Ronald Dumsfeld writes "The odds of folks under the age of 25 on Slashdot having heard of DECnet are pretty slim. This article over at Datamation gives some insight into people who've not given up on it. Poke around and find the documentation for the OSI-compliant version, or download the Linux version of the older DECnet IV and bask in the Security Through Obscurity." -
A Glimpse at the Linux Desktop of the Future
hisham writes "Every now and then we see articles pointing out "what's wrong with Linux on the desktop." This one gives a nice overview not only of the problems we all know, but also where to look for solutions (app dirs, smarter filesystems) and what's out there (projects trying to change the face of Linux, like Klik, Zero Install and GoboLinux). Still, it usually boils down to things that Mac OS X already has or that are/were touted for inclusion on MS Longhorn. Fortunately, the major desktops stopped playing catch and are focusing on forward-looking Linux projects, like KDE Plasma and Gnome Beagle. Interesting times ahead." -
GPL Violations of Miranda IM
Eesh writes "The Miranda project developers have recently posted to their development blog about two GPL violations of companies using their code - vBuzzer and StarMessenger. Today, they also posted that vBuzzer are taking steps to correct that violation. Hopefully this will work out fine. Miranda 0.401 stable was released recently" -
OpenSolaris Code Released
njcoder writes "C|net's news.com.com has reported that Sun Microsystems is releasing parts of the OpenSolaris code today, licensed under the OSI-approved CDDL. The release consists of over 5 million lines of code for the base system OS/Net (kernel and networking). OpenSolaris is based on Solaris 10, the current version of Sun's Unix Operating System. Back in January, Sun released the code for DTrace, a dynamic tracing tool for analyzing and debugging kernel and userland events. DTrace is one of the big features in Solaris 10. Some other highlights include the GRUB bootloader and SMF (Service Management Facility), which replaces init.d scripts; it starts up processes in parallel for faster boots (7 second boot on a dual Opteron workstation, I think that was the setup) and provides features for automatically restarting services. OpenSolaris provides support for x86/x86-64 processors as well as Sparc. The Blastwave guys are working on Polaris, which is an OpenSolaris port to PowerPC. Sun has been working on opening Solaris for over a year now. The OpenSolaris project started with a pilot group of Sun and non-Sun users. During the pilot program a lot of info including screenshots could be found on various OpenSolaris member blogs. (My favorite is Ben Rockwood's blog). Teamware, the source code management system Sun uses for Solaris and OpenSolaris, was designed by Larry McVoy (now of BitKeeper) while he was at Sun. No word yet on if Teamware will be available for OpenSolaris developers or not. Sun also uses CollabNet for its Open Source project websites so that might be a possibility as well."