Domain: cpan.org
Stories and comments across the archive that link to cpan.org.
Comments · 1,172
-
Re:Why not for Windows people?
And also there is PAR.
It works really well now, and I frequently deliver pp-built exe files to clients for run-once processing jobs.
It's much easier than vbs for file processing, will leave no footprint on the server, and is still very easy to maintain as the exe is nothing but a compressed archive with the source code easily accessible -
Re:Strategies for complex perl code bases
All you'll end up storing is an index into a non-persistent table, unless yet-another-glue-module is written to provide for the serialization of such objects.
It's even easier than that. All you need to do is add freeze() and thaw() methods, and you can get those almost for free from Class::Std.
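To illustrate the point about freeze() and thaw(), here is a minimal sketch in Python rather than Perl. It assumes an "inside-out" style class, where the object itself only holds a key into class-level tables; the Account class, its fields, and the JSON format are all hypothetical and are not taken from Class::Std.

```python
import itertools
import json


class Account:
    """Inside-out style object: the instance only carries a key."""

    _next_key = itertools.count(1)
    _owner = {}    # key -> owner name (non-persistent table)
    _balance = {}  # key -> balance (non-persistent table)

    def __init__(self, owner, balance):
        self.key = next(Account._next_key)
        Account._owner[self.key] = owner
        Account._balance[self.key] = balance

    def owner(self):
        return Account._owner[self.key]

    def balance(self):
        return Account._balance[self.key]

    def freeze(self):
        # Serialize the real attribute values, not the table index.
        return json.dumps({"owner": self.owner(), "balance": self.balance()})

    @classmethod
    def thaw(cls, frozen):
        # Rebuild a fresh object (with a new key) from the frozen data.
        data = json.loads(frozen)
        return cls(data["owner"], data["balance"])
```

Storing only `key` would lose the data once the process exits, which is the problem the parent describes; freeze() and thaw() carry the attribute values across instead.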
-
Re:Should I read this or continue with sed/awk?
Because Perl is a general purpose programming language, it can do a lot more than sed or awk. Learning all three is useful if you do a lot of Unix administration or command-line work, but you can get by with Perl alone if you only learn one.
... apparently Perl is better for smaller/one line regexp manipulation in scripts, and python for building large applications.
It depends on the application. I appreciate perl2exe (though I've heard good things about PAR), and wxPython has better documentation than wxPerl, but Perl also has the CPAN. I've had no small success building large applications in Perl.
-
Re:XML::LibXML is where it's at
If you know how to tell XML::Simple how to treat various tags you end up with a fairly simple and consistent data structure. And if the ForceArray, ForceContent and other options do not give you enough power you may try XML::Rules. There you can specify exactly what part of the data you are interested in for each tag, simplify the data structure as the XML is parsed, handle the data whenever you have all you need etc. All without having to access tag content and attributes through methods with long names. And without loading the whole document into memory and even converting it into a huge maze of interlinked objects
:-P
In short ... use the module that works for you. -
Re:As a longtime(past tense) PHP developer I can s
Personally I wouldn't think of Perl as a substitute, even though I understand in many respects Perl is superior to PHP, I just find it difficult to make the right choices when it comes to picking a module for a given job. I think Perl suffers from a lot of duplicated effort, there is no concerted effort to establish a de facto framework of modules.
In perl's defense, I think that's sometimes the biggest reason I choose perl over the others. The good thing about duplicating or reinventing the wheel is sometimes you need different wheels for different tasks. It also facilitates evolution and a plethora of ideas that you would otherwise miss if you had put everyone together and told them to make THE ONE best module because there will always be drawbacks to even particularly good implementations. The result is perl's library (cpan.org) is massive. When I look at other things like Ruby, though I like the language features, the library work simply isn't there yet.
As a side note, there are some really cool libraries out for perl. Check out Moose or Catalyst for example.
-
Use Quantum::Superpositions;
Parent would be insightful, actually. Guess not many know about Perl's Quantum::Superpositions module.
Just when you thought that Perl couldn't possibly get any more confusing... :-) -
Re:OSS idealogical differences... what a crock!
You see something like that in the Perl world though, over at the CPAN.
Anyone can upload a CPAN module. First to use the module name gets to own it.
If the original author is gone, the CPAN admins will let anyone take over a module.
You also see authors there making increasing use of "(Almost) anyone can get commit" svn repositories. People like Audrey Tang and Adam Kennedy are quite infamous for giving anyone and everyone commit access to their code, and between them they have 200 and something modules, plus Pugs (the Perl 6 in Haskell thing). -
Re:I go to Sourceforge after I learn about a progr
I'm a big fan of http://plone.org/ which is a CMS that sits on top of the http://www.zope.org/ application server. All of which is OSS. I can't speak to OSS CRM but others here have. There is plenty of fantastic server-side OSS software out there that boosts developer productivity.
- Try http://jakarta.apache.org/ for lots of Java libraries.
- I find http://www.springframework.org/ is a great framework extension for Java.
- I like Spring better, but http://www.hibernate.org/ provides an ORM for both Java and .NET developers.
- If you are working in Perl, then http://www.cpan.org/ is the place for you.
When it comes to client side software there is a huge amount of great OSS apps.
- I believe that http://sourceforge.net/projects/ganttproject/ is great for project management.
- I have used http://sourceforge.net/projects/freemind/ for years and know it to be a great mind mapping tool.
- I believe that http://live.gnome.org/Dia/ is a great diagramming tool.
- I'm a big fan of http://www.umlet.com/ and find it to be very useful for creating UML diagrams.
- I switched from sodipodi to http://www.inkscape.org/ which is fantastic for drawing vector images.
- I am also a big fan of http://www.gimp.org/ which is used to draw raster images.
I have used all of these projects for years and would most definitely label them as quality, winner OSS.
-
Re:Once again blocked by the install instructions
See the INSTALL_BASE argument to MakeMaker.
-
Re:what a coincidence (re: XML::Tiny)
Not another one!
Gurg, I can understand the usefulness of an all-perl implementation of an XML parser generator, but a number of them such as XML::Mini have existed for a looong time... -
Re:What?
What are "CPAN docs"? If you can figure out how to install one module you can figure out how to install them all. Also since you know how to read the docs for XML::Simple you know how to read them for the other one as well.
Perhaps XML::Simple is easier to use, but your line of reasoning is like saying it is harder to buy a Ford than a Chevy so buy the Chevy. Built on extremely flawed reasoning.
I'm not saying buy anything or use anything or whatever. I'm just saying the article is pointless.
CPAN docs are the documentation that goes along with a module. In this case:
http://search.cpan.org/~grantm/XML-Simple-2.16/lib/XML/Simple.pm
Yes, if you can install one perl module, you can install any other (to a degree). So what's the point in explaining how to install a perl module? Point the reader to one of the millions of other places that already explain it.
But more importantly, the XML::Simple documentation is very good and very easy to understand. Why is there an article that basically regurgitates the documentation? Just point the reader to the documentation instead!
I'm afraid I don't understand your analogy at all. -
what a coincidence (re: XML::Tiny)
I just today noticed the announcement of XML::Tiny.
-
What?
This is the most pointless article I've seen linked from slashdot in a long time (and yes, I've seen a lot of crap here). What is the point of posting a run of the mill tutorial on something that's been covered many times before?
Having spent a lot of time playing with this crap lately, can I just butt into this pointless thread and say screw XML, use YAML or JSON instead. XML is a steaming, clumsy overrated turd. I benchmarked XML::Simple against YAML::Syck - the latter encoded 2.5 times faster and parsed nine times faster than XML::Simple. The syck library is indeed aptly named.
"Leverage the power of XML" by deprecating it wherever you can for a more sensible cross platform format.
</rant>
-
Re:AJAX is a silly acronym
AJAX is a silly name, but we're probably stuck with it.
Well, if you ask me, it's just a blatant wannabe move. Wa-ay back in the mists of 2001, the inimitable Damian Conway created the Acme::Bleach Perl module. Part of the stunningly [sic] inspired Acme series of Perl modules, it creates the cleanest code ever in the history of programming.
Now some web wanker with a re-tread idea from the nineties indulges in a bit of shameless self-promotion, whoring himself first to Microsoft, then to Google, and when he needs to come up with a name, he - again, shamelessly - stands on the shoulders of Giants like Professor Conway and dilutes the namespace with a pale echo of Damian's greatest masterpiece since his translation of Perl into Klingon.
Note to the humour-impaired: Follow links before modding Troll or Flamebait.
-
Re:Devil's in the Contracts
If we're going to generalize like that, I would say that calling any American music good is also insane. 99% of commercial music is crap, regardless of country. Stupidity is global.
However, I don't see any modules on CPAN for non-Japanese bands:
Acme::MorningMusume -
Re:Heaven help!
Catalyst FTW.
I used to do a lot of work with Catalyst. I still haven't seen anything as flexible and easy to use. Rails is nice, but it's a bit too "write and drool"[1] for my taste. Oddly enough, it isn't as straightforward as it could be, either.
I highly recommend The Intro to anyone interested. The Tutorial goes into a lot more depth.
[1] I don't mean to be pretentious. Ruby is a fantastic language for creating domain specific languages, as Rails shows. But Rails' greatest strength is also a major weakness. For the record, my professional work currently involves prototyping number crunching applications with Ruby. -
Re:Perl & CSV
DBD::CSV does SQL on CSV files... It's relational, right?
:-) -
"Turn your key, sir!"
-
Perl Rocks!
I remember seeing something to solve this elegantly on CPAN a while ago... The module is called Data::Encrypted and does almost exactly what you ask.
http://search.cpan.org/~amackey/Data-Encrypted-0.07/Encrypted.pm
DESCRIPTION
===========
Often when dealing with external resources (database engines, ftp, telnet, websites, etc), your Perl script must supply a password, or other sensitive data, to the other system. This requires you to either continually prompt the user for the data, or to store the information (in plaintext) within your script. You'd rather not have to remember the connection details to all your different resources, so you'd like to store the data somewhere. And if you share your script with anyone (as any good open-source developer would), you'd rather not have your password or other sensitive information floating around.
Data::Encrypted attempts to fill this small void with a simple, yet functional solution to this common predicament. It works by prompting you (via Term::ReadPassword) once for each required value, but only does so the first time you run your script; thereafter, the data is stored encrypted in a secondary file. Subsequent executions of your script use the encrypted data directly, if possible; otherwise it again prompts for the data. Currently, Data::Encrypted achieves encryption via an RSA public-key cryptosystem implemented by Crypt::RSA, using (by default) your own SSH1 public and private keys.
RSA Authentication
==================
Data::Encrypted uses RSA authentication to encrypt and decrypt its data. It achieves this by reading the user's public and private RSA keys. By default, Data::Encrypted assumes these files are stored in the .ssh subdirectory of their home directory (found using File::HomeDir), but you can provide alternative key files yourself, either by supplying alternative key filenames, or by building Crypt::RSA::Key's yourself: -
Re:Ever wondered
-
Re:If not PHP, then what?
If you've started learning PHP, then yeah, Perl's probably going to be the easiest switch, since PHP has it as its syntactic parent. It's a good language for pragmatists as well.
Ruby is cute, but it's still relatively young, it's slowish, and no tainting last time I checked. Ruby 2.0 should be much better though. Damned good in fact.
And there seems to be a fair amount of crossover between the Perl and Ruby community, with good ideas stolen in both directions.
But if you go down the Perl route remember the golden rule.
"90% of every program you will ever need to write already exists on the CPAN."
So if you get NOTHING else beyond reading say "Programming Perl", you should spend half your life at http://search.cpan.org/.
Perl's true place in the world hasn't been about the language syntax itself for a long time. It's all about the 20 million lines of code in the CPAN.
In fact, most of the entire Perl world probably has the same "cpan" search bookmark in Firefox at this point :) -
Re:Question from a .NET developer trying to go OSS
Another, more recent sollution [sic] would be Ruby on Rails, which has some realy niffty [sic] features.
Rails is pretty cute. A more functional (but less "shiny") alternative is Catalyst. It's written in Perl, which means you get the benefit of over 10,000 extension libraries from the CPAN to draw upon. Perl also has some nice features that Ruby or PHP lack, like full native Unicode support and automatic taint checking. It's also faster, because it's had 10 years to mature. Sadly people seem to be ignoring Perl these days, but with recent improvements it's nearly as cool as Ruby (check out "Moose").
Also, if you'd like to access a database with compound primary keys, ActiveRecord won't support that, but Catalyst's ORM (DBIx::Class) supports it fine.
Rails is good for quick apps like a wiki or a blog, but for more complicated internal applications, Catalyst is where it's at. Stop by the website, check out our advent calendar, or perhaps try the tutorial. Join us in #catalyst on irc.perl.org if you have any questions! -
More Than One Way To Do It Again
Perl already does QM programming. Maybe the entanglement timemachine experiment in Spring 2008 will have been successful, and Perl hackers willam haven been sending code through the loop back to the 2002 CPAN?
-
perl and cpan are the answer
Like most things, just use perl and a cpan module. Apache2::Geo::IP
-
Banking and Secrecy
I used to work for BankOne (now JPMorganChase) in Chicago doing Perl work for their Capital Markets trading systems. This meant writing about 40,000 lines of Perl code to decode the bank's internal reports and load 'em to a data warehouse. In the process, I had to decode some report formats that were not proprietary, as well as code up some helper modules, like ones that retrieved files (recursively) from FTP sites, verified they were complete, etc., based on configuration files.
Much of this code could have been released to http://cpan.org/ nicely.
However, I signed a nondisclosure agreement (NDA) when I joined the bank saying, more or less, I won't release any of the bank's intellectual property to any third party without prior (probably written) approval from umpteen layers of management.
This kind of NDA has been more-or-less standard when joining a new firm as a developer. People don't like it when you release code to the world that gives your company a competitive edge, or might present a security risk if people knew how you were doing things (I know all those rules about security through obscurity being useless, but that's different than posting to a cracking website the protocols you use to get data from servers around the bank).
The problems of being able to contribute back this worthwhile code are legion. Many organizations are not set up to deal with this kind of problem yet. Over time, when managers come to understand that there are definite gains to be made by releasing a module to the wild, and actually find that other people like it and contribute-to/improve it, then word will get around.
I would counsel slow, persistent, quite isolated pushes for very clearly non-business-critical components to be put under the GPL into CPAN or the like. No excited "let's do this" will get the idea through. Use calm, rational arguments that a component is broadly useful elsewhere and that releasing it may mean someone else (whom you don't have to pay) will fix the small bugs we don't have time for.
I think this is going to take hold at smaller companies MUCH more quickly. I work at a startup now, and we regularly contribute patches to several of the open source (mostly Python) projects we use. Why? Because we want our changes incorporated into the tree so we don't diverge too much from the standard release (which would require much more work to update when they release a new branch).
After a while, larger companies will get the message, too, and understand this business model. Compare this to flying airplanes - pilots all talk, and contribute info, so everyone is safer. Your competitive advantage is the systems you build, and how you run them, not the fact that everyone else crashes more than you do, because whenever anyone crashes, everyone suffers. -
Re:DIY Onetime Addresses
Perhaps Mail::TempAddress might work for you.
-
Re:Good luck
AlbumData.xml is easily parseable. If you don't want to get down and dirty with the XML itself, you can use, for example, Mac::iPhoto. It's only slightly buggy.
-
Re:Re-inventing a square wheel
Indeed. For most of my simple spidering needs I've found Perl's WWW::Mechanize to be a dream. I say what I mean: go get this page, find a link labeled "Today's Story" and follow it, on the resulting page find the second form and fill in the username and password fields with $username and $password, click submit, return the resulting page. I've found it useful for scraping sites with regular updates that have unpredictable URLs but constant links. Perl.com's "Screen-scraping with WWW::Mechanize" is a good introduction, then check out the full documentation.
-
Re:Re-inventing a square wheel
Finally, there is a Python script. At first glance, it looks slightly better. It uses what appears to be the Python equivalent of HTML::Parse to get links. But a closer look reveals that, to find links, it just gets the first attribute of any a tag and uses that as the link. Never mind if the 1st attribute doesn't happen to be "href".
What bugs me the most about this article is that the author keeps using the most generic libraries he can find instead of something written for this exact task. He should have used WWW::Mechanize for Perl or mechanize for Python. I'm sure there's something like this for Ruby, too.
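The link-extraction bug described above (taking the first attribute of an a tag instead of looking for "href") can be avoided even with a generic parser. What follows is a minimal sketch using Python's standard html.parser module; the LinkCollector and extract_links names are made up for illustration and are not the article's code.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag, wherever the attribute sits."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # attrs is a list of (name, value) pairs in document order,
        # so look the attribute up by name instead of by position.
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)
                break


def extract_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

An anchor such as `<a class="nav" href="/story">` is handled correctly, and a named anchor with no href at all is simply skipped.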
-
Re:Oh my - mandatory perl plug
Of course that could only happen if there were a central source where many people tested the components, and could provide fixes if needed. Something like cpan.org.
I recently put together a browser like tool in about an hour that would do a task in a matter of minutes that would have taken a week to do manually.
http://search.cpan.org/~petdance/WWW-Mechanize-1.21_03/lib/WWW/Mechanize.pm
Of course I could have coded it in C, built my own bootable OS and something to do the TCP/IP, then parsed the results into something recognizable as English text and written in hundreds of potential responses to whatever I got back, and it wouldn't have taken more than a decade. Yeah, that would have been "code reuse" too, since I didn't write C, but I'm not sure that even with a decade I could have done it in binary.
Are we there yet?
-
Re:As if PERL wasn't hard enough to read...
Yes, we could have used Python. Or Ruby. Both these languages have better threading support by leaps and bounds.
Er, how ? Because they don't really use threads ? Sure, they're fast and lightweight...but since they don't use the underlying OS's threads implementation (ie, kernel-compatible threads), they're only marginally useful on multiCPU and/or multicore systems.
2. Perl threads are still quite unstable.
What's your basis for that statement ? Have you tested the latest versions of the threads and threads::shared modules ? Some significant effort has been applied in the past year to improve stability, as well as reduce footprint...you might want to give it a look...
Perhaps if your org can get some funding, you might throw some money at the TPF to get iCOW implemented ? Which should vastly improve thread startup and reduce footprint. threads::shared remains a bit of a challenge, but that issue can be addressed by some carefully crafted XS (which I'm told Stas is pretty good at ;^). -
Re:As if PERL wasn't hard enough to read...
It's easy to do threads in perl 5.8.
-
Re:Humans and dictionaries define random different
Except for mathematicians and programmers, most think of "random" in a *very* different way from its technical definition. To most humans, saying that a particular sequence is "random" means *guaranteeing* certain things about it. Among them: the same element does not occur back-to-back, EVER, even if there are only a few elements total to choose from.
Yes exactly. I think the lesson here is that you should never use a mathematically random algorithm for esthetic purposes. If you're trying to get something to seem "mixed-up" to a user, you need to simulate a world where the "gambler's fallacy" is not a fallacy: you need a "randomization" function that has memory, and is weighted against "streaks".
I wrote a CPAN module called Text::Capitalize that includes a function called "scramble_case" that works this way (for when you want capitalization with a "wEiRDly sCRaMbLeD aPpEaREncE").
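As a rough illustration of a "randomization" function that has memory and is weighted against streaks, here is a minimal Python sketch. It is not the algorithm used by Text::Capitalize; the max_run and first_upper_prob parameters and the halving rule are assumptions made up for this example.

```python
import random


def scramble_case(text, max_run=2, first_upper_prob=0.25, rng=None):
    """Randomly mix upper and lower case while discouraging same-case streaks."""
    rng = rng or random.Random()
    out = []
    last_upper = None  # case of the previous letter, None before the first one
    run = 0            # length of the current same-case streak
    for ch in text:
        if not ch.isalpha():
            out.append(ch)  # non-letters pass through and do not reset the streak
            continue
        if last_upper is None:
            # A capitalized first letter is made less likely on purpose.
            upper = rng.random() < first_upper_prob
        elif run >= max_run:
            upper = not last_upper  # streak is long enough: force a switch
        else:
            # The longer the streak, the less likely it is to continue.
            keep = rng.random() < 0.5 ** (run + 1)
            upper = last_upper if keep else not last_upper
        run = run + 1 if upper == last_upper else 1
        last_upper = upper
        out.append(ch.upper() if upper else ch.lower())
    return "".join(out)
```

A truly independent coin flip per letter would sometimes produce long all-lowercase or all-uppercase stretches; the streak counter is the "memory" that rules those out.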
If certain elements stand out from the others in some significant way, they can neither occur first nor last. (For instance, if test questions are being drawn from a question bank, neither the easiest nor the hardest question should be first or last; if it is, people will say the order was not random.)
That's an interesting way of formulating the principle. For my "scramble_case" I weighted the probability against getting a capitalized first letter, because that looks Too Normal.
I could go on and on, but what it really amounts to is that when most people say "random" they mean "carefully arranged in a thoroughly mixed-up order". This is almost the *opposite* of what a mathematician or computer programmer thinks the word "random" means.
This, by the way, is essentially the conclusion that Steven Levy comes to... this isn't a bad article at all, though it's a bit verbose, and doesn't get down to the point until two-thirds of the way through. (Which probably makes it Pulitzer Prize material). -
Re:What about a Spam Filter
It appears to work with no false positives at all, however I admit I don't check the logfiles very often. The best I can say is that I've never seen it make a mistake, and nobody has ever complained to me about it either.
(For reference the code is built on top of the perl Algorithm::NaiveBayes module.)
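For readers who have not met the technique, a naive Bayes text classifier can be sketched in a few lines of Python. This is a generic illustration with Laplace smoothing, not the Algorithm::NaiveBayes API and not the filter described above; the class name, method names, and whitespace tokenizer are all assumptions.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes classifier over whitespace-split words."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus the sum of log likelihoods (add-one smoothing).
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

A real filter would need a much larger training set and a better tokenizer before a "no false positives" claim could be checked against the logs.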
-
The Playback Machine
There's a system out there that does exactly what you're looking for. It's called the Playback Machine. I currently use it to power the television station for BayCon, which is a science fiction convention in San Jose, California. It's available from CPAN, here: http://search.cpan.org/CPAN/authors/id/S/ST/STEPHEN/Video-PlaybackMachine-0.03.tar.gz
Here's how it works. You enter your schedule using a web-based front end, picking from a list of movies from a database. Whenever it's time to play a movie, the PM will play it. Whenever it isn't time, PM will play music while showing picture slides, announcements, and "up next"s. You can change the schedule while the system is running. The database backend makes it impossible to inadvertently schedule two overlapping entries.
The main issue with it is that it's rather difficult to set up. Since I'm the primary developer and primary user, there's been little work on making it easy to install for other people. However, if others are interested in using this system, I would love to work with them. Please send me a private message if you're interested.
Another system which does the same thing is VideoKeg, which is written up here: http://ian.blenke.com/projects/videojukebox/perl/tk/mplayer/xfree86/epia/touchscreen/whitepaper/VideoKeg.html. -
Re:Turn Key solutions broken?
Let's face it, PHP hasn't had a standard database API for quite a long time (I guess there's PDO now). The most common database that PHP is used with is MySQL, which did not have bind parameters for quite a long time, so PHP programmers resort to mysql_real_escape_string (or, before that, mysql_escape_string) and then concatenate the query and the parameters, while better databases have supported bind parameters since PHP 3. And even if the database API does not natively have separate parsing and binding stages, at least it would make sense to emulate them, because if the database API later gains the ability to bind parameters / send parameters separately from the statement, the users don't have to change their code. If mysql_query was used like this:
mysql_query($db, "select * from products where (color = ?) and (price < ?)", $color, $maxprice);
Then no-one would complain about escaping statements being "hard". Instead we have things like magic_quotes_gpc... -
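For comparison, this is roughly what the bind-parameter style looks like in an API that supports it natively. The sketch uses Python's standard sqlite3 module with an in-memory database; the products table, its rows, and the find_products helper are invented for illustration.

```python
import sqlite3


def find_products(conn, color, max_price):
    """Query with ? placeholders; values are sent separately from the SQL text."""
    cursor = conn.execute(
        "SELECT name, color, price FROM products "
        "WHERE (color = ?) AND (price < ?) ORDER BY name",
        (color, max_price),
    )
    return cursor.fetchall()


# Hypothetical sample data, kept in memory only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, color TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("pen", "red", 2.5), ("bike", "red", 120.0), ("mug", "blue", 6.0)],
)

cheap_red = find_products(conn, "red", 10)
# A hostile value is treated as plain data, so it matches no rows.
injected = find_products(conn, "red' OR '1'='1", 10)
```

No escaping function appears anywhere in the calling code, which is the point the comment makes.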
Re:Return of the Flat File
-
Re:There ARE other scriping languages besides PHP
There is already a wrapper function. Its name: IO::Select.
It has been part of the core since Perl 5.00307. -
Maybe it's just me but isn't 515 pages too much?
Does anyone remember "Moby Dick" (hint: "Call me Ishmael...")? It weighed more than Roseanne Barr/Arnold/Thomas because publishers charged more money for heavier books and thus encouraged the writers to write big books.
Now, we have a regular-expression primer that has 515 pages. Is the publisher earning more money by producing a bigger book?
The only information that most people need is contained on a small web page. Armed with the information on that web page, the beginner can learn best by doing: writing various regular expressions in short Perl programs and determining whether they do what you want them to do.
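The "learn by doing" loop can be as small as the following sketch, written here in Python rather than Perl. The try_pattern helper and the sample phone-number pattern are made up for illustration.

```python
import re


def try_pattern(pattern, samples):
    """Report, for each sample string, whether the pattern matches anywhere in it."""
    compiled = re.compile(pattern)
    return {sample: bool(compiled.search(sample)) for sample in samples}


# Does this pattern accept exactly "three digits, dash, four digits"?
results = try_pattern(r"^\d{3}-\d{4}$", ["555-1234", "5551234", "x555-1234"])
```

Changing one piece of the pattern at a time and rerunning against the same samples shows quickly which part of the expression is responsible for each match or miss.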
-
What about network backup?
I've become interested lately in network-based backup. That is, I don't have any dedicated "backup media", but instead I just ensure that my important stuff ends up stored on the hard disks of at least two computers somewhere. Right now my solution to this is lame: I just copy the stuff about manually. This means I don't have any idea where a file can be found, there's no record of which copy is the "definitive" copy (ie, the one I should edit), and there's no automated way to restore everything. I've been hoping that someone would come along and make a tool to make this easier.
I found a half-finished thing called Brackup which looks like it was trying to go in this direction, with the extra ability to back up to things like Amazon's network storage service, and with encryption so that you can (with the appropriate amount of caution) back up to systems you don't necessarily trust with your data. Then you just need to back up the much-smaller "index file" to some removable media and store it somewhere safe.
Ideally, though, I want something that's one level above that where it just figures out itself where everything should be replicated to based on conditions like "all of my documents must be stored at at least two premises", and it'd then know that it's not sufficient to back up stuff from the desktop machine in my house to the server in my house -- it must use a host outside my LAN. It would also keep track of the amount of space allotted to backups on each machine and avoid using up too much space, warning me if it was unable to satisfy my criteria so that I can either add more targets or increase the allotted storage space.
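To make that wish concrete, here is a rough sketch of the placement rule, not a description of Brackup or any existing tool. It assumes each host is tagged with a premises name and a backup quota; `Host`, `plan_replicas`, and all field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Host:
    name: str
    premises: str    # physical location, e.g. "home" or "office" (assumed tag)
    free_bytes: int  # space still allotted to backups on this host

def plan_replicas(size, source_premises, hosts, min_premises=2):
    """Choose targets so the data ends up at min_premises distinct premises.

    The source's own premises counts as one. Returns the chosen hosts,
    or None when the rule cannot be met and the user should be warned.
    """
    covered = {source_premises}
    targets = []
    # Try the emptiest hosts first so quotas fill up evenly.
    for host in sorted(hosts, key=lambda h: h.free_bytes, reverse=True):
        if len(covered) >= min_premises:
            break
        if host.premises in covered or host.free_bytes < size:
            continue  # same building, or not enough allotted space
        covered.add(host.premises)
        targets.append(host)
    if len(covered) < min_premises:
        return None
    for host in targets:
        host.free_bytes -= size  # reserve the space on each chosen host
    return targets

if __name__ == "__main__":
    hosts = [Host("server", "home", 10**9), Host("office-pc", "office", 500)]
    plan = plan_replicas(400, "home", hosts)
    print([h.name for h in plan] if plan else "warning: add targets or space")
```

The server in the same house is skipped even though it has the most room, which is exactly the "must use a host outside my LAN" condition.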
The main thing I like about network backup is that I don't have to fumble about with physical media. All of my computers have got spare disk space, and the disks are already there and plugged in, so why screw about with DVD media or tapes? Backups need to just happen automatically in the background or I'll never bother to make them.
-
Re:Google Geocode API inaccurate.
Or... you could download the TIGER/Line data from the government, do the geocoding yourself with Perl, and then feed the long/lat to Google instead of using their geocoding API. Just the maps!
http://search.cpan.org/~sderle/Geo-Coder-US-1.00/US.pm
It rocks. I've been using it with my voter lists for door to door.
-
Re:OK But...
> perl -MCPAN -e 'install Slashdot::Karma'
Psh, all the cool kids use CPANPLUS now. Among other things, it will build packages for your distro, and keep those up to date for you. Check it out.
-
Re:Yes it's a dupe, but lets get something straigh
Their research seems to deal mostly with the third problem, which is one of the biggest barriers to use in real life. Many of the algorithms used on these types of problems are NP, or require ridiculous amounts of (expensive) labeled data to train from. Also there are problems with generalization and overfitting.
They're often convergence algorithms - you run them until the answer is sufficiently accurate for your purposes. The problem is therefore a combination of 'more speed' and 'more accuracy', combined with the need to construct a topic model (a conceptual description of what a 'topic' actually is) that reflects the structure of the text closely enough to say something useful.
There is no freeware software that can compete with this type of algorithm under these conditions - over 300,000 articles in just a few hours.
Most research software is available under free licenses. This paper is using a method based on Blei's LDA model, which is available under the GPL, combined with some existing code for name recognition to do some preprocessing (Lingua::EN::Tagger, GPL), and the Griffiths/Steyvers method for using Gibbs sampling to model LDA (I think it's this stuff, free for non-commercial use only). The actual topic modelling in this paper is nothing new (it's a couple years old now and widely known); the paper is about preprocessing for better accuracy. Actually it's not a bad idea, but it's not a particularly interesting one and doesn't have much to do with the subject of topic modelling.
All that being said, I'm waiting for the paper, along with more technical specifics, to be released so I can really see what this is about
RTFA. There's a link to the paper in it. If you want the executive summary:
Use Lingua::EN::Tagger to preprocess proper nouns into single tokens.
Use LDA with Gibbs sampling to identify topics and classify documents into them.
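For anyone who wants to see those two steps in miniature, here is a toy sketch in Python. It is not the paper's code: `merge_proper_nouns` is a crude capitalized-word regex standing in for a real tagger such as Lingua::EN::Tagger, and `lda_gibbs` is a textbook collapsed Gibbs sampler with made-up defaults for `alpha`, `beta`, and the iteration count.

```python
import random
import re

def merge_proper_nouns(text):
    """Crude stand-in for a tagger: join runs of capitalized words with '_'."""
    return re.sub(r"\b([A-Z][a-z]+(?: [A-Z][a-z]+)+)\b",
                  lambda m: m.group(1).replace(" ", "_"), text)

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; docs is a list of token lists.

    Returns (per-token topic assignments, per-document topic counts).
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [dict() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []
    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] = topic_word[k].get(w, 0) + 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Take this token out of the counts before resampling it.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Weight is proportional to
                # (n_dt + alpha) * (n_tw + beta) / (n_t + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t].get(w, 0) + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] = topic_word[k].get(w, 0) + 1
                topic_total[k] += 1
    return assignments, doc_topic

if __name__ == "__main__":
    texts = ["New York has pizza and bagels", "perl regex perl module regex"]
    docs = [merge_proper_nouns(t).split() for t in texts]
    _, doc_topic = lda_gibbs(docs, num_topics=2)
    print(doc_topic)  # topic counts per document
```

Each document is then classified by whichever topic holds most of its tokens. The regex will also glue a capitalized sentence opener onto a following name, which is one reason a real tagger is used for the preprocessing step.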
As far as I can tell, this is about publicity, and 'proving' to non-researchers that it can be done (which just means doing what researchers do all the time, and showing it to the press). Presumably they want more funding.