cpan.org · Domains · Slashdot Mirror

Learn About the FRDCSA 'Weak AI' Project (Video)

Developers · 2013-06-13 04:53 · posted by Roblimo · from the let-us-run-your-life-for-you dept. · 52 comments

Today's interviewee, Andrew Dougherty, has a Web page that says he is "...an autodidact mathematician and computer scientist specializing in Artificial Intelligence (AI) and Algorithmic Information Theory (AIT). He is the founder of the FRDCSA (Formalized Research Database: Cluster Study & Apply) project, a practical attempt at weak AI aimed primarily at collecting and interrelating existing software with theoretical motivation from AIT. He has made over 90 open source applications, 400 (unofficial) Debian GNU/Linux packages and 800 Perl5 modules (see http://frdcsa.org/frdcsa)." Tim Lord says Andrew's project "brings together a lot of AI algorithms, collects large sets of data for those algorithms to chew on, and writes software to do things like ... guide your whole life." As you might guess, Andrew occupies a pretty far edge of the eccentric programmer world, as you'll see from this video (and transcript). He calls himself "a serious Stallmanite" (his word), and has chosen the GPL for his software in the hopes that it will therefore help the greatest number of people. (Speaking of help, he's looking for interesting data sets and various "life rules" that can be integrated with his planning software, and one of the reasons he presented at the recent YAPC::NA was to solicit help in putting his hundreds of Perl modules onto CPAN.)

Perl 5.16.0 Released

Developers · Perl · 2012-05-21 08:47 · posted by samzenpus · from the check-it-out dept. · 192 comments

An anonymous reader writes "Perl 5.16.0 is now available with plenty of improvements all around. You can view a summary and all the change details here. With Perl on an annual release schedule, and projects like Mojolicious, Dancer, perlbrew, Plack, and Moose continuing to gain in popularity, are we in the middle of a Perl renaissance?"

RubyGems' Module Count Soon To Surpass CPAN's

Developers · Perl · 2010-12-19 23:33 · posted by timothy · from the we-finally-know-who's-counting dept. · 206 comments

mfarver writes "According to the data gathered by modulecounts.com, the total number of modules checked into RubyGems (18,894, and growing at about 27/day) will probably exceed CPAN (18,928, and growing about 8/day) this week."
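The "this week" estimate can be checked with a quick linear extrapolation. This is a minimal sketch that uses only the four figures quoted above and assumes both growth rates stay constant, which the source does not claim.

```python
import math

# Figures quoted in the summary above (modulecounts.com, December 2010).
rubygems_count, rubygems_per_day = 18_894, 27
cpan_count, cpan_per_day = 18_928, 8

# Assumption: both growth rates stay constant (simple linear extrapolation).
gap = cpan_count - rubygems_count                # modules RubyGems trails by
closing_rate = rubygems_per_day - cpan_per_day   # net modules gained per day
days_to_overtake = gap / closing_rate

print(f"gap={gap} modules, closing at {closing_rate}/day")
print(f"break-even after about {days_to_overtake:.1f} days, "
      f"so RubyGems is ahead by day {math.ceil(days_to_overtake)}")
```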

Something For (Almost) Every Developer

Developers · Programming · 2010-04-13 13:22 · posted by kdawson · from the get-coding dept. · 263 comments

First up, reader martinjlogan sends along a tutorial for setting up a workable Erlang/OTP development environment on a Mac. Next, reader acid06 notes news of Perl 5.12, including what may be the first delivered fix for the Y2K38 bug. (Hit the Read More link below for some details on Perl's new release strategy.) "After two years of development, the new major version of Perl is now available. Notable new features are: better Unicode support, proper support for time after the Y2038 barrier, new APIs to allow developers to extend Perl with 'pluggable' keywords and syntax, warnings for deprecated features and more. From the linked post: You can get it from the CPAN right now or wait for a platform-specific release (such as Strawberry Perl for Windows)." Finally, from reader snydeq: "InfoWorld's Martin Heller provides an in-depth review of Visual Studio 2010 and finds Microsoft taking several large steps away from its legacy IDE code. 'Visual Studio 2010 is a major upgrade in functionality and capability from its predecessor. Developers, architects, and testers will all find areas where the new version makes their jobs easier. Despite the higher pricing for this version, most serious Microsoft-oriented shops will upgrade to Visual Studio 2010 and never look back,' Heller writes. Chief among the improvements are Microsoft's revamping the core editing and designer views to use WPF, its overhaul of IntelliSense and support for test-driven development, and its intelligent support for multiple versions of the .Net Framework."
Re: Perl. This release cycle marks a change to a time-based release process. Beginning with version 5.11.0, we make a new development release of Perl available on the 20th of each month. Each spring, we will release a new stable version of Perl. One month later, we will make a minor update to deal with any issues discovered after the initial ".0" release. Future releases in the stable series will follow quarterly. In contrast to releases of Perl, maintenance releases will contain fixes for issues discovered after the .0 release, but will not include new features or behavior.

Microsoft Bots Effectively DDoSing Perl CPAN Testers

Developers · Microsoft · 2010-01-18 00:48 · posted by timothy · from the stuck-in-a-rut dept. · 332 comments

at_slashdot writes "The Perl CPAN Testers have been suffering issues accessing their sites, databases and mirrors. According to a posting on the CPAN Testers' blog, the CPAN Testers' server has been being aggressively scanned by '20-30 bots every few seconds' in what they call 'a dedicated denial of service attack'; these bots 'completely ignore the rules specified in robots.txt.'" From the Heise story linked above: "The bots were identified by their IP addresses, including 65.55.207.x, 65.55.107.x and 65.55.106.x, as coming from Microsoft."

Where's the "IronPerl" Project?

Developers · Perl · 2008-10-07 19:05 · posted by kdawson · from the more-than-one-way dept. · 390 comments

pondlife writes "A friend asked me today about using some Microsoft server components from Perl. Over the years he's built up a large collection of Perl/COM code using Win32::OLE and he had planned on doing the same thing here. The big problem is that as with many current MS APIs, they're available for .NET only because COM is effectively deprecated at this point. I did some Googling, expecting to find quickly the Perl equivalent of IronPython or IronRuby. But to my surprise I found almost nothing. ActiveState has PerlNET, but there's almost no information about it, and the mailing list 'activity' suggests it's dead or dying anyway. So, what are Perl/Windows shops doing now that more and more Microsoft components are .NET? Are people moving to other languages for Windows administration? Are they writing wrappers using COM interop? Or have I completely missed something out there that solves this problem?"

Slashdot's Setup, Part 2 - Software

News · Meta · 2007-10-26 04:51 · posted by CmdrTaco · from the its-all-just-ones-and-zeros dept. · 151 comments

Today we have Part 2 in our exciting two-part series about the infrastructure that powers Slashdot. Last week Uriah told us all about the hardware powering the system. This week, Jamie McCarthy picks up the story and tells us about the software... from pound to memcached to mysql and more. Hit that link and read on.

The software side of Slashdot takes over at the point where our load balancers -- described in Friday's hardware story -- hand off your incoming HTTP request to our pound servers.

Pound is a reverse proxy, which means it doesn't service the request itself, it just chooses which web server to hand it off to. We run 6 pounds, one for HTTPS traffic and the other 5 for regular HTTP. (Didn't know we support HTTPS, did ya? It's one of the perks for subscribers: you get to read Slashdot on the same webhead that admins use, which is always going to be responsive even during a crush of traffic -- because if it isn't, Rob's going to breathe down our necks!)

The pounds send traffic to one of the 16 apaches on our 16 webheads -- 15 regular, and the 1 HTTPS. Now, pound itself is so undemanding that we run it side-by-side with the apaches. The HTTPS pound handles SSL itself, handing off a plaintext HTTP request to its machine's apache, so the apache it redirects traffic to doesn't need mod_ssl compiled in. One less headache! Of our other 15 webheads, 5 also run a pound, not to distribute load but just for redundancy.

(Trivia: pound normally adds an X-Forwarded-For header, which Slash::Apache substitutes for the (internal) IP of pound itself. But sometimes if you use a proxy on the internet to do something bad, it will send us an X-Forwarded-For header too, which we use to try to track abuse. So we patched pound to insert a special X-Forward-Pound header, so it doesn't overwrite what may come from an abuser's proxy.)
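The header trick in the trivia note can be sketched roughly as follows. This is not pound's or Slash::Apache's actual code: the function names, the `TRUSTED_PROXIES` set, and the sample addresses are hypothetical, and headers are modeled as a plain case-sensitive dict for brevity. Only the header names come from the text.

```python
# Hypothetical internal address of a pound instance; not from the article.
TRUSTED_PROXIES = {"10.0.0.5"}

def proxy_add_headers(headers, peer_ip):
    """Proxy side: record the real peer in a separate header.

    Any X-Forwarded-For sent by an upstream proxy is left untouched,
    so it stays available for abuse tracking.
    """
    out = dict(headers)
    out["X-Forward-Pound"] = peer_ip
    return out

def resolve_client(headers, connection_ip):
    """Backend side: return (client_ip, upstream_claim).

    The proxy's header is only believed when the TCP connection really
    comes from one of our own proxies (an assumption of this sketch).
    """
    if connection_ip in TRUSTED_PROXIES:
        client_ip = headers.get("X-Forward-Pound", connection_ip)
    else:
        client_ip = connection_ip
    upstream_claim = headers.get("X-Forwarded-For")  # untrusted, kept for logging
    return client_ip, upstream_claim

# A request that arrived via an outside proxy claiming to forward for 203.0.113.9.
incoming = {"X-Forwarded-For": "203.0.113.9"}
forwarded = proxy_add_headers(incoming, peer_ip="198.51.100.7")
print(resolve_client(forwarded, connection_ip="10.0.0.5"))
```

Keeping the two headers separate means the backend knows both who actually connected and who the outside proxy said it was acting for.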

The other 15 webheads are segregated by type. This segregation is mostly what pound is for. We have 2 webheads for static (.shtml) requests, 4 for the dynamic homepage, 6 for dynamic comment-delivery pages (comments, article, pollBooth.pl), and 3 for all other dynamic scripts (ajax, tags, bookmarks, firehose). We segregate partly so that if there's a performance problem or a DDoS on a specific page, the rest of the site will remain functional. We're constantly changing the code and this sets up "performance firewalls" for when us silly coders decide to write infinite loops.
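As an illustration of that segregation, here is a minimal sketch of routing by request type. The pool sizes follow the text (2, 4, 6, 3); the host names, the URL patterns, and the round-robin choice within a pool are assumptions of this sketch, not pound's actual configuration.

```python
import itertools
import re

# Pool sizes match the article; host names are hypothetical.
POOLS = {
    "static": [f"static{i}" for i in range(1, 3)],
    "homepage": [f"home{i}" for i in range(1, 5)],
    "comments": [f"comments{i}" for i in range(1, 7)],
    "other": [f"other{i}" for i in range(1, 4)],
}

# Hypothetical URL patterns, checked in order; anything unmatched goes to "other".
RULES = [
    (re.compile(r"\.shtml$"), "static"),
    (re.compile(r"^/(index\.pl)?$"), "homepage"),
    (re.compile(r"^/(comments|article|pollBooth)\.pl$"), "comments"),
]

# Assumption: simple round-robin inside each pool.
_round_robin = {name: itertools.cycle(hosts) for name, hosts in POOLS.items()}

def choose_backend(url):
    """Return (pool_name, host) for a request URL."""
    path = url.split("?", 1)[0]  # ignore the query string when matching
    for pattern, pool in RULES:
        if pattern.search(path):
            return pool, next(_round_robin[pool])
    return "other", next(_round_robin["other"])

for url in ("/index.shtml", "/", "/comments.pl?sid=1", "/ajax.pl"):
    print(url, "->", choose_backend(url))
```

Because each request type has its own pool, a runaway script in one pool can exhaust only that pool's webheads.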

But we also segregate for efficiency reasons like httpd-level caching, and MaxClients tuning. Our webhead bottleneck is CPU, not RAM. We run MaxClients that might seem absurdly low (5-15 for dynamic webheads, 25 for static) but our philosophy is if we're not turning over requests quickly anyway, something's wrong, and stacking up more requests won't help the CPU chew through them any faster.
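A back-of-the-envelope model shows why a low MaxClients costs nothing when CPU is the bottleneck. This is an idealized sketch: it assumes every request needs a fixed amount of CPU time, that the cores are shared evenly, and that the server is always full. The core count and per-request cost are made-up numbers, not Slashdot's.

```python
def cpu_bound_model(concurrency, cores, cpu_seconds_per_request):
    """Idealized CPU-bound webhead: returns (requests/sec, seconds per request)."""
    busy_cores = min(concurrency, cores)       # extra requests just wait for CPU
    throughput = busy_cores / cpu_seconds_per_request
    latency = concurrency / throughput         # Little's law: L = throughput * latency
    return throughput, latency

# Hypothetical webhead: 4 cores, 0.1 CPU-seconds per dynamic page.
for max_clients in (4, 10, 100):
    rps, seconds = cpu_bound_model(max_clients, cores=4, cpu_seconds_per_request=0.1)
    print(f"MaxClients={max_clients:3d}: {rps:.0f} req/s, {seconds:.2f} s per request")
```

In this model, throughput stops growing once every core is busy. Raising MaxClients past that point only makes each request take longer.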

All the webheads run the same software, which they mount from a /usr/local exported by a read-only NFS machine. Everyone I've ever met outside of this company gives an involuntary shudder when NFS is mentioned, and yet we haven't had any problems since shortly after it was set up (2002-ish). I attribute this to a combination of our brilliant sysadmins and the fact that we only export read-only. The backend task that writes to /usr/local (to update index.shtml every minute, for example) runs on the NFS server itself.

The apaches are version 1.3, because there's never been a reason for us to switch to 2.0. We compile in mod_perl, and lingerd to free up RAM during delivery, but the only other nonstandard module we use is mod_auth_useragent to keep unfriendly bots away. Slash does make extensive use of each phase of the request loop (largely so we can send our 403s to out-of-control bots using a minimum of resources, and so your page is fully on its way while we write to the logging DB).

Slash, of course, is the open-source perl code that runs Slashdot. If you're thinking of playing around with it, grab a recent copy from CVS: it's been years since we got around to a tarball release. The various scripts that handle web requests access the database through Slash's SQL API, implemented on top of DBD::mysql (now maintained, incidentally, by one of the original Slash 1.0 coders) and of course DBI.pm. The most interesting parts of this layer might be:

(a) We don't use Apache::DBI. We use connect_cached, but actually our main connection cache is the global objects that hold the connections. Some small chunks of data are so frequently used that we keep them around in those objects.

(b) We almost never use statement handles. We have eleven ways of doing a SELECT and the differences are mostly how we massage the results into the perl data structure they return.

(c) We don't use placeholders. Originally because DBD::mysql didn't take advantage of them, and now because we think any speed increase in a reasonably-optimized web app should be a trivial payoff for non-self-documenting argument order. Discuss!

(d) We built in replication support. A database object requested as a reader picks a random slave to read from for the duration of your HTTP request (or the backend task). We can weight them manually, and we have a task that reweights them automatically. (If we do something stupid and wedge a slave's replication thread, every Slash process, across 17 machines, starts throttling back its connections to that machine within 10 seconds. This was originally written to handle slave DBs getting bogged down by load, but with our new faster DBs, that just never happens, so if a slave falls behind, one of us probably typed something dumb at the mysql> prompt.)

(e) We bolted on memcached support. Why bolted-on? Because back when we first tried memcached, we got a huge performance boost by caching our three big data types (users, stories, comment text) and we're pretty sure additional caching would provide minimal benefit at this point. Memcached's main use is to get and set data objects, and Slash doesn't really bottleneck that way.
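The reader selection described in (d) can be sketched as a weighted random choice with periodic reweighting. Slash itself is Perl and its real logic is not shown here: the class name, the linear lag-to-weight formula, the 30-second cutoff, and the host names below are all assumptions made for illustration.

```python
import random

class ReaderPool:
    """Weighted random choice among read replicas (hypothetical sketch)."""

    def __init__(self, base_weights, rng=None):
        self.base_weights = dict(base_weights)   # manual weights
        self.weights = dict(base_weights)        # current, possibly throttled
        self.rng = rng if rng is not None else random.Random()

    def pick(self):
        """Pick one replica; call once per request and reuse the result."""
        candidates = [(name, w) for name, w in self.weights.items() if w > 0]
        if not candidates:
            raise RuntimeError("no readable replica available")
        names, weights = zip(*candidates)
        return self.rng.choices(names, weights=weights, k=1)[0]

    def reweight(self, lag_seconds, max_lag=30.0):
        """Throttle lagging replicas; lag of None means replication has stopped.

        Assumption: weight falls linearly with lag and hits zero at max_lag.
        """
        for name, base in self.base_weights.items():
            lag = lag_seconds.get(name)
            if lag is None or lag >= max_lag:
                self.weights[name] = 0.0
            else:
                self.weights[name] = base * (1.0 - lag / max_lag)

pool = ReaderPool({"db1": 1, "db2": 1, "db3": 2})
pool.reweight({"db1": 0, "db2": None, "db3": 15})  # db2's replication is wedged
print(pool.weights, "->", pool.pick())
```

Picking once per request means everything read during that request comes from the same replica.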

Slash 1.0 was written way back in early 2000 with decent support for get and set methods to abstract objects out of a database (getDescriptions, subclassed _wheresql) -- but over the years we've only used them a few times. Most data types that are candidates to be objectified either are processed in large numbers (like tags and comments), in ways that would be difficult to do efficiently by subclassing, or have complicated table structures and pre- and post-processing (like users) that would make any generic objectification code pretty complicated. So most data access is done through get and set methods written custom for each data type, or, just as often, through methods that perform one specific update or select.
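To make the "custom get and set per data type" pattern concrete, here is a small sketch of a hand-written getter with a cache in front of the database, plus one method that performs a single specific update. It is not Slash code: the `UserStore` class, the table layout, and the method names are hypothetical. A dict stands in for memcached, SQLite stands in for MySQL, and the queries use placeholders, unlike the approach in item (c).

```python
import sqlite3

class UserStore:
    """Hand-written accessors for one data type (hypothetical sketch)."""

    def __init__(self, conn, cache):
        self.conn = conn
        self.cache = cache      # dict standing in for a memcached client
        self.db_reads = 0       # counts cache misses that reached the database

    def get_user(self, uid):
        key = f"user:{uid}"
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        row = self.conn.execute(
            "SELECT uid, nickname, karma FROM users WHERE uid = ?", (uid,)
        ).fetchone()
        if row is None:
            return None
        user = {"uid": row[0], "nickname": row[1], "karma": row[2]}
        self.cache[key] = user
        return user

    def set_karma(self, uid, karma):
        """One specific update; drop the cached copy so readers see it."""
        self.conn.execute("UPDATE users SET karma = ? WHERE uid = ?", (karma, uid))
        self.conn.commit()
        self.cache.pop(f"user:{uid}", None)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (uid INTEGER PRIMARY KEY, nickname TEXT, karma INTEGER)")
conn.execute("INSERT INTO users VALUES (1, 'example_user', 10)")
store = UserStore(conn, cache={})
store.get_user(1)
store.get_user(1)               # second read is served from the cache
print("database reads:", store.db_reads)
```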

Overall, we're pretty happy with the database side of things. Most tables are fairly well normalized, not fully but mostly, and we've found this improves performance in most cases. Even on a fairly large site like Slashdot, with modern hardware and a little thinking ahead, we're able to push code and schema changes live quickly. Thanks to running multiple-master replication, we can keep the site fully live even during blocking queries like ALTER TABLE. After changes go live, we can find performance problem spots and optimize (which usually means caching, caching, caching, and occasionally multi-pass log processing for things like detecting abuse and picking users out of a hat who get mod points).

In fact, I'll go further than "pretty happy." Writing a database-backed web site has changed dramatically over the past seven years. The database used to be the bottleneck: centralized, hard to expand, slow. Now even a cheap DB server can run a pretty big site if you code defensively, and thanks to Moore's Law, memcached, and improvements in open-source database software, that part of the scaling issue isn't really a problem until you're practically the size of eBay. It's an exciting time to be coding web applications.