smellotron · Slashdot Mirror

Re:If only... on PHP 4 End of Life Announcement · 2007-07-14 18:04 · Score: 1

Here's a few things that frustrate me (as an ex-professional PHP developer):

lack of decent module namespacing. Everything lives in the same global scope, which means any public module needs to "hack" this in with naming conventions, such as pg_{connect|query|execute} instead of language-supported alternatives such as pg.connect or pg::connect
interpreter weaknesses. Only recently has it been possible to chain method calls such as $a->getFoo()->doBar(), and it's still not possible to do something like $a->results()[0]. The result is (IMO) unnecessary temporary variables, which end up being the bane of literate programming. Plus, now that I've been spoiled by Python and Lua's abilities to pass in named parameters to functions, it's frustrating to not have that anymore (e.g. x = Window(x=100, y=100, width=200, height=200), which is much easier to understand sans reference than $x = Window(100, 100, 200, 200))
frustratingly bad precedence rules for short-circuit operators. Consider this:
$a = true;
$b = false;
$x = $a and $b; // parses as ($x = $a) and $b, so $x = true
$y = $a && $b; // parses as $y = ($a && $b), so $y = false

This is an AWFUL gotcha. I can't think of any benefit to justify the behavior. Backwards compatibility should be deliberately broken to fix this.
the typical arguments about poor function standardization. I won't elaborate on them.
Poor coordination between PHP core and PEAR/PECL repositories. I've seen a lot of buck-passing between the two camps. PEAR is treated both as a core part of successful project libraries and as a completely independent repository, depending on whichever is more convenient at the time.
Poor version-numbering schemes. Major behavior changes occur during minor revisions, and changelogs don't always do a good job of explaining the impact of certain revisions.
Poor default error-handling. I can override the default error handler myself, but there are certain failures that can't be overridden (a command-line switch to say "treat all errors as exceptions" would be nice).
Community encouragement of subpar programming practices. PHP/*SQL tutorials are still being written that show gaping injection vulnerabilities. Everyone writing a web framework seems obsessed with the Front Controller design pattern, without realizing that Apache/Lighttpd is the Front Controller [note: Rasmus Lerdorf is awesome in this regard]. Some people even try to relive other languages within PHP, emulating Java's XML-everything-configuration whilst completely forgetting how that works in Java, when the configuration is loaded once for the lifetime of the server, instead of every page load.

Whooh. That's a lot of reasons to dislike PHP. I still love it, though, because the language provides the best way I've seen thus far to evolve a project from a static web page to a full-blown application in baby steps. Most other languages rely on fairly heavy web frameworks, which gain a lot of power at the expense of a huge jump in complexity. With a PHP project, I can choose my own level of complexity. IMO, that's the greatest strength of PHP. The ability to use it however I want.

Re:Oh PLEASE GOD NO on Dot-Com Work Culture Making a Comeback? · 2007-07-03 19:06 · Score: 1

Have you ever looked at the Lua programming language? From what I gather, the small size of the interpreter makes it attractive for scripting embedded devices.

Re:Not loosing the will on Top Linux Developers Losing the Will To Code? · 2007-07-02 14:03 · Score: 1

I'm loosing the will to spell.
Keep commenting like that. I am not English, and this is the only way I can learn it. Anyway, it is not like I have anything to loose...

Kudos for multilingualism. The grandparent may have commented with the assumption that English is your native language, so don't take the jab too heavily. However, don't dismiss it. If Slashdot is a forum that helps you learn a language, consider it a lesson learned that "loose" is different from "lose". In fact, you do stand to lose from poor spelling; on a forum where your entire character is based on your text, you will (for better or worse) be judged on the quality of that text.

Re:Have you seen a cymbal waveform?? on Music Listeners Test 128kbps vs. 256kbps AAC · 2007-05-31 17:52 · Score: 1

What about the frequency response curve? There's still a general pitch difference after the initial hit between crashing on a splash, or a normal crash, or a big honkin' china. Cymbals may be very similar to noise on an absolute scale, but the quality is in the differences.

Re:Measure Attractiveness? on Computers Outperform Humans at Recognizing Faces · 2007-05-31 17:36 · Score: 1

I wonder if this could be used to measure attractiveness? Of course this is somewhat of a subjective thing, because I might find certain women attractive, while my friend prefers different women.

Maybe. There is a certain amount of math in attractiveness. While you may prefer blondes and your friend may prefer redheads, it's at least fairly easy to weed out ugly people. You know... the ones with their eyes way too far apart, or a disproportionately tall nose. Brad Pitt and Jennifer Aniston both have very symmetric faces, which is believed to be part of the reason for their attractiveness. On the other hand, check out Stephen Colbert's ears. Not saying he's ugly, but both a computer and my girlfriend can agree that his ears are less than ideal.

Re:The true test on Computers Outperform Humans at Recognizing Faces · 2007-05-31 17:29 · Score: 1

If it can work on Michael Jackson, I will be really impressed.

It will never work on Michael Jackson. Like a vampire, Michael Jackson is invisible to software algorithms. He just shows up as a bunch of zeros.

Re:Quite impressive.. on Computers Outperform Humans at Recognizing Faces · 2007-05-31 17:24 · Score: 1

The job of measuring the features of a face that is presented to it, then comparing it to a database, is a lot lot easier than finding a face in the midst of a big jumble of non-face, and then recognising it. When a computer can do that, I will be really impressed.

Be impressed. Computer vision at the moment is largely based upon segmenting the image into shapes, and then trying to look for sets of shapes that make a larger object. This is how people-detection works—It assumes that a person is essentially a rigid stick-figure, and it tries to reconstruct joints. Traffic signs are typical false positives, as they look like they have a head and appendages coming out of the bottom. It's certainly not a solved problem, but modern research in this area is more advanced than you realize.

Of course, the problems with these algorithms are similar to the problems with small children: No object-permanence, and a difficult time guessing shapes from awkward angles. Frontal facial detection is wayyyyy easier than side face detection.

Re:Not that impressive on Computers Outperform Humans at Recognizing Faces · 2007-05-31 17:15 · Score: 1

Translation: Throw enough hardware at it, and the machines win? Whatever a computer has been successfully programmed to do, it's usually bloody fast at it. It sounds like a well parallelizable task that should scale easily for many years to come.

Imagine a cluster of processors in a box attached to a camera: One pulls in the video and converts it to a stream. Four others sift through that stream and extract localized human-like video segments. Sixteen others take those segments and generate semi-rigid 3d models of the humanesque things (faces and general body segments). There's our eyes, largely. Now you start to divide out higher-level tasks to the other processors. Given a large database of "training videos" that describe actors as angry, happy, aggressive, tired, etc., classify expected moods. In particular, it helps to make a 3d model of the entire scene and attempt to do eye tracking. You can learn a lot about a scene just by seeing where people are looking; in fact, one of the wonders of good photography is capturing implicit information in a scene by capturing hints on faces.

So yeah, throw enough processors at the right algorithms, and there's a crazy load of information that a computer can infer. We'll get there, no doubt. All we have to do is figure out how to generate computer algorithms for the tasks our brains do.

Re:Hmmm... on Computers Outperform Humans at Recognizing Faces · 2007-05-31 17:01 · Score: 1

If my friend, who has no facial hair, puts on a fake mustache, I can still tell that it's my friend. Will an algorithm be able to distinguish fake markings?

Maybe it's just nitpicking your words, but yes: an algorithm is able to distinguish fake markings. You use an algorithm to identify your friend, even if you couldn't write it down. However, the complexity of that algorithm is rather astounding in comparison to modern facial recognition. Some of the issues facing a humanesque detection algorithm:

Identifying and ignoring ancillary data, such as bangs (hair), glasses, facial scruff
Identifying motion patterns based on facial muscle groups (which won't change, even if the shape of the face changes)
Dealing with tremendous amounts of bandwidth (a friend is more identifiable in person than in a photograph, for humans and computers)
Pulling in all of the other gobs of information that humans use to identify each other that aren't involved in facial detection. Aside from facial expressions, this involves overall body language, inferred intentions (do they make eye contact and turn to you if they see you? That probably means they recognize you), and gobs of other things our brain is great at that computers aren't (yet).

Re:Really? on A Windows-Based Packaging Mechanism · 2007-05-29 04:07 · Score: 1

Sure, you may see a lot of stuff when you look at what's included, but there's really no tractable way to see what's not included. I bet there's a lot that's not a part of the Debian repo that you or I have never even heard about
True, but that's really not important...

That depends on your perspective. From my point of view, there is a huge gap in complexity between apt-get install foo (which rarely fails and is wonderful) and wget http://foo.invalid/foo.tgz && tar zxvf foo.tgz && cd foo && ./configure && make && sudo make install (which typically doesn't fail, but performs some pretty spectacular death-throes when it does). I really appreciate when packages are made available as precompiled RPMs/DEBs/works-out-of-the-box-tarballs. That's something Windows users take for granted - any application they download is easy to install.

I don't expect esoteric research projects to show up in a distribution (though to give credit, I was pleasantly surprised to see UMFPACK in the repo). However, failing that, in Windows, most of these research projects that are worthwhile end up with MSI installers, or as precompiled zip files. In Linuxland, there isn't a good standard distribution method better than the configure/make/make-install process. It wouldn't be difficult to make tarballs that follow the Linux Standard Base, and then place those into /usr/local, but that just isn't very common.

Regarding Java for research... I hate Java as a language, personally. Some things like JAMA have been ported over to C++, but oftentimes a research project needs to stand on the legs of other packages for complex algorithms like FFT, SVD/Eigenstructure, or PCA. Still, the only place where I've personally seen Java rule the research world is software development tools, such as Molhado Ref and the multitude of Eclipse plugins; and then only really because Java is miles easier to parse and statically analyze than C++.

Someone else already mentioned apt-pinning, which is actually exactly what I'd be interested in for the mix-n-match stability issue. My point isn't that apt is bad, it's that there isn't a very good popular standardized spot in between apt and source tarballs.

Re:A simple starting point on Is Parallel Programming Just Too Hard? · 2007-05-29 03:36 · Score: 1

for (x = 0; x < n; x++) a[x] += b[x];
Which is faster, serial, parallel, vector [e.g. SIMD] or some combination thereof? Don't know? Why would your compiler know either?

On the contrary, I would expect that the compiler knows the architecture it is compiling for, and thus knows how to optimize that loop. Modern compilers (commercial, at least) will look at that loop and understand

every loop iteration is independent
how "wide" the parallel processing can be (2 concurrent iterations? 4? 128?)
how to properly parallelize (loop unrolling? software pipelining? SIMD?)

One of the big difficulties in compiler optimization is taking high-level concepts that have been mapped to low-level implementations like C and then attempting to re-construct the high-level concept in order to see parallelization opportunities. This is one of the nicest things about higher-level languages (even the humble PHP foreach loop). The closer the code mirrors an abstract model, the easier it is to optimize that model.

As an example, these Python snippets all take sequences a and b and do the same as the above C code. You can see that they're all at different levels of abstraction, potentially giving different levels of optimization.

for i in range(len(b)): a[i] += b[i]
a = list(map(lambda ai, bi: ai + bi, a, b))
a = [ a[i] + b[i] for i in range(len(b)) ]
a = [ ai + bi for ai, bi in zip(a,b) ]
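
As a rough sanity check, the four forms can be wrapped in functions and compared on sample data. The function names are made up for illustration; note the map form here passes both sequences to map directly, which is what a two-argument lambda needs.

```python
# Four ways to add b into a element-wise; function names are illustrative only.

def add_indexed_loop(a, b):
    # Lowest level: explicit index arithmetic on a copy of a.
    out = list(a)
    for i in range(len(b)):
        out[i] += b[i]
    return out

def add_map(a, b):
    # map with two iterables calls the lambda with one item from each.
    return list(map(lambda ai, bi: ai + bi, a, b))

def add_indexed_comprehension(a, b):
    return [a[i] + b[i] for i in range(len(b))]

def add_zip_comprehension(a, b):
    # Highest level: no indices at all, just pairs of elements.
    return [ai + bi for ai, bi in zip(a, b)]

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
variants = (add_indexed_loop, add_map,
            add_indexed_comprehension, add_zip_comprehension)
results = [f(a, b) for f in variants]
print(results[0])
```

The last two say nothing about iteration order, which is the kind of freedom an optimizer would want.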

Re:Are Serial Programmers Just Too Dumb? on Is Parallel Programming Just Too Hard? · 2007-05-29 02:54 · Score: 1

Functional programming discourages the use of state whenever possible, and attempts to separate it out in the cases when it is required.

If you're at all familiar with Design Patterns (the book or the concept), I'd like to point you towards Domain-Driven Design. In particular, "Side-Effect-Free Function" and "Value Object" are two patterns for imperative programming languages that discourage unnecessary state-dependence. One of the requirements for effective unit-testing in imperative languages is the ability to reduce state-dependence (it's much safer to test return values of objects than to try to test internal state).

Maybe it's just a little bit of the functional-programming mentality getting absorbed into the mainstream (OO/imperative), but the goal of reducing side-effects is increasingly common with modern software engineering methodologies.
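
To make the two patterns concrete, here's a minimal sketch in Python. The Money class and its plus method are hypothetical names picked for illustration, not anything taken from the book.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Money:
    """Value Object: immutable, compared by value rather than identity."""
    cents: int
    currency: str

    def plus(self, other):
        # Side-Effect-Free Function: returns a new value, mutates nothing.
        if other.currency != self.currency:
            raise ValueError("currency mismatch")
        return Money(self.cents + other.cents, self.currency)

price = Money(1000, "USD")
shipping = Money(250, "USD")
total = price.plus(shipping)  # price and shipping are left untouched
```

A unit test only has to compare return values (total == Money(1250, "USD")); there is no internal state to poke at.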

Re:Are Serial Programmers Just Too Dumb? on Is Parallel Programming Just Too Hard? · 2007-05-29 02:44 · Score: 1

C++ doesn't put any parallelization dirty-work onto the compiler, and we C++ programmers still use pthreads. Or do you come from some magic fairy-land where C++ programmers can't use extern "C"?

Re:Patience is not a goal here. on Is Parallel Programming Just Too Hard? · 2007-05-29 02:31 · Score: 1

Out of curiosity, what on earth would Word get from parallel programming?

Nonblocking save, regardless of the size of the document
Sophisticated (and therefore more CPU-intensive) nonblocking spelling and grammar checkers, potentially even using a separate service on the system (imagine a dbus/dcop-like service that could perform grammar and spell checking)
Nonblocking integration with live remote data sources

Sure, the task of writing a document is serial for most individuals, but there's plenty of work that MS Word does that should be (is already?) farmed out to coarse-grained worker threads. I would particularly be excited about a system-wide spelling/grammar service that any application could invoke asynchronously (it would be better than a library like aspell for an integrated desktop environment, and it would reduce DLL/dependency hell).

Re:Nope. on Is Parallel Programming Just Too Hard? · 2007-05-29 02:05 · Score: 1

What consumer-level apps out there really need more processing power than a single core of a modern CPU can provide?

Anything that spends more than 100ms processing a user's request (disclaimer: that number mostly pulled out of my ass). User interfaces more and more need to be separated out from back-end processing on consumer applications. I hear OS X's Finder still has UI problems with blocking because of thumbnail generation. Some obvious examples that come to mind:

Word-processors or other "office productivity" applications could do nonblocking saves (create a snapshot of the data, then save the snapshot in a different process/thread). Of course, this would also require alternative feedback mechanisms, since most users are (rightly) accustomed to waiting for these operations to complete.
Digital content-creation applications usually do some fairly heavy (and "embarrassingly parallelizable") work. This includes audio, video, and image editing, as well as mesh or CAD modeling.
Media players require a certain amount of synchronization, but I'd really love it if my mp3 visualization ran in a different process that could be niced down to a lower priority. Splitting something like this into a different process/thread allows much better control over system resources.
Web browsers should be rendering all page content in separate threads or processes. Just yesterday, YouTube had my dad's Firefox segfaulting. If the page rendering and plugin activation were in a separate process, I would have only lost the one tab, instead of the entire program. Again, process segregation allows better customization of processes (imagine configuring Firefox to restrict CPU usage per-tab or per-domain).
Email clients can (and do) download new messages without blocking the UI.

That right there is a large majority of the use for computers. Every task I've outlined has wonderful benefits from a UI perspective for essentially becoming nonblocking, and you can see that some of them have already moved in that direction. UI latency is a huge issue in making a computer feel snappy, and every modern kernel scheduler is tuned to deal with GUIs, so why not harness real parallelism for that goal?
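
The snapshot-then-save idea from the word-processor item might look roughly like this in Python. The Document class and save_async method are invented for the sketch; a real application would also need the alternative feedback mechanism mentioned above.

```python
import copy
import os
import tempfile
import threading

class Document:
    def __init__(self):
        self.paragraphs = []

    def save_async(self, path):
        # Snapshot on the calling (UI) thread so later edits can't leak into the file.
        snapshot = copy.deepcopy(self.paragraphs)

        def write():
            with open(path, "w", encoding="utf-8") as f:
                f.write("\n".join(snapshot))

        worker = threading.Thread(target=write)
        worker.start()
        return worker  # caller can join() or poll is_alive() for feedback

doc = Document()
doc.paragraphs.append("first draft")
path = os.path.join(tempfile.mkdtemp(), "doc.txt")
pending = doc.save_async(path)
doc.paragraphs.append("typed while saving")  # not part of the snapshot
pending.join()
with open(path, encoding="utf-8") as f:
    saved = f.read()
```

The copy is the price of not blocking: the UI thread pays for a deepcopy instead of waiting on the disk.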

Re:Nope. on Is Parallel Programming Just Too Hard? · 2007-05-29 01:43 · Score: 4, Insightful

But a lot of problems don't fit these models, and need a LOT of thought put into how to parallelize them. It's likely that some problems in P are not efficiently parallelizable.

I would venture a guess that most problems would benefit from parallelizing basic data structure tasks:

anything (comparable) can be sorted using divide-and-conquer mergesort
scanning through an array-based collection (*not* a linked list) can be divided among processors—this is frequently done in hardware, e.g. for CPU cache hashtable lookups
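
Both ideas can be sketched in a few lines of Python. The function names and the worker count of 4 are assumptions for the example. Threads in CPython share one interpreter lock, so this shows the structure of the split rather than a real speedup; a process pool would be the thing to swap in for CPU-bound work.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def split(items, workers):
    # Cut the list into at most `workers` contiguous slices.
    size = -(-len(items) // workers)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

def parallel_merge_sort(items, workers=4):
    """Sort each slice independently, then merge the sorted runs."""
    if len(items) < 2:
        return list(items)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        runs = list(pool.map(sorted, split(items, workers)))
    return list(heapq.merge(*runs))

def parallel_count(items, wanted, workers=4):
    """Scan each slice of an array-backed list separately and add up the hits."""
    if not items:
        return 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda part: part.count(wanted),
                            split(items, workers)))

data = [5, 3, 9, 1, 3, 7, 3, 0, 8]
sorted_data = parallel_merge_sort(data)
threes = parallel_count(data, 3)
```

The split only works cheaply because slicing an array is cheap; doing the same to a linked list means walking it first, which is the point of the "*not* a linked list" caveat.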

Further, there's a few other obvious ways to parallelize:

Split program into a chain of filters between producers and consumers, and give each filter its own thread/process. For example, create an event receiver thread, a "do-stuff" thread, and a display thread. At the very least, this will reduce UI response latency.
Split program processing into "1 event dispatcher + N worker threads", like Apache or Squid. This by itself would be a good way to reduce blocking in most applications. Why should the interface be locked up when expensive processing is happening in a program? Maybe while Photoshop/GIMP runs some filter on my image, I'd like to browse the help documents or scroll around the viewport.
Re-evaluate any processing as a dependency tree, and code it in something like Twisted, where every piece of code executes nonblocking snippets, and a reactor thread dispatches between them (this is basically like very lightweight threads)

The reason the problems don't fit these models is more that we're used to thinking about algorithms as an ordered list of steps, rather than a set of workers on an assembly line (operating as fast as the slowest individual worker).
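
Here's a bare-bones sketch of the chain-of-filters idea in Python, with one thread per stage and queues in between. The stage function, the DONE sentinel, and the toy "events" are all made up for the example.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream

def stage(func, inbox, outbox):
    """Run func on every item from inbox and pass the result downstream."""
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)
            return
        outbox.put(func(item))

events, processed, displayed = queue.Queue(), queue.Queue(), queue.Queue()
threads = [
    threading.Thread(target=stage, args=(str.strip, events, processed)),
    threading.Thread(target=stage, args=(str.upper, processed, displayed)),
]
for t in threads:
    t.start()

# The "event receiver" side: push raw input and carry on.
for raw in ["  click ", " scroll", "keypress  "]:
    events.put(raw)
events.put(DONE)

# The "display" side: drain whatever comes out of the last filter.
output = []
while True:
    item = displayed.get()
    if item is DONE:
        break
    output.append(item)
for t in threads:
    t.join()
```

Each stage only knows about its own inbox and outbox, so throughput is set by the slowest stage, just like the assembly line.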

Re:Really? on A Windows-Based Packaging Mechanism · 2007-05-29 01:14 · Score: 1

> Right up until the software you want isn't in the repo, or is broken. Then it falls way, way behind.
I disagree. First, if you're running Debian then there is very little that isn't in the repository.

The problem with that statement is the same as evaluating a search engine. Sure, you may see a lot of stuff when you look at what's included, but there's really no tractable way to see what's not included. I bet there's a lot that's not a part of the Debian repo that you or I have never even heard about, due to various obvious reasons:

it's not free software (not sure if that's a true restriction for Debian distribution, but it's at least a psychological restriction)
neither the developer nor any contributing user has the skill/time to commit to supporting a .deb package
it's simply not popular or complex enough to require easy installation

The only things I use that aren't available through aptitude are some very specialized niche programs developed by academics to solve very particular problems. Most of these are in Java, and so the installation process is identical on Linux and MS.

I've run into plenty of academic packages that weren't Java (actually, I've never run into graphics-related research that wasn't implemented in C or C++). I've used a number of Python packages that were more up-to-date than the Debian distribution, but setup.py makes installation rather easy. One problem shows up because of how Debian's stable/testing/unstable segregation works, and how that's not effective for most people's desktops. I want the latest stable KDE, but I want bleeding-edge PidginIM (because I trust the development cycle) or SciPy (because I do development on my local desktop).

Don't get me wrong, I love Debian and apt, but it's not as ideal as you suggest.

Re:Why is this needed at all? on Top 15 Free SQL Injection Scanners · 2007-05-20 16:47 · Score: 1

The availability of robust packages like those still doesn't stop newbie (and veteran) PHP programmers alike from just using the raw MySQL API subset known as the mysql_* functions

I'm a veteran PHP programmer, and I've worked in a dedicated Postgres environment (we were more likely to switch our scripting language from PHP to Python than our database from Postgres to anything). We used PEAR::DB, and frankly, I think it sucked. It didn't support modern Postgres features at all (including using addslashes instead of pg_escape_string, which itself was discovered as a security flaw for some non-ASCII databases). It was slow and bulky, to the point of being the #1 consumer of CPU time according to Xdebug and KCacheGrind.

I enjoy PDO (with a few additions that I can manage with subclassing), and it is efficient. Using PEAR::DB made me want to claw my eyeballs out, and IMHO the Postgres environment I mentioned earlier should have been using the pg_* functions.

Re:Why is this needed at all? on Top 15 Free SQL Injection Scanners · 2007-05-20 16:35 · Score: 1

Because PHP is stateless, there is no connection pooling and SQL statements must be re-prepared at every script execution. In my experience, there are very few cases where a query gets executed multiple times in a single script, so the benefit of preparing goes wayyyyy down. If you really need a complex query prepared, turn it into a procedural function in your database. This pushes the expensive query-parsing to the first function call for PL/pgSQL (Postgres), and turns all of your SQL queries into stuff like "SELECT * FROM my_select_function(?, ?, ?, ?)".

Re:Why is this needed at all? on Top 15 Free SQL Injection Scanners · 2007-05-20 16:13 · Score: 1

Personally, I like either of these two cases (syntax off the top of my head):

// Using PHP5's PDO
$stmt = $pdo->prepare('SELECT f1, f2, f3 FROM foo WHERE x = ?');
$stmt->execute(array($x));

// Using native Postgres calls (ADOdb does this internally); $conn comes from pg_connect()
pg_prepare($conn, 'select_foo', 'SELECT f1, f2, f3 FROM foo WHERE x = $1');
$result = pg_execute($conn, 'select_foo', array($x));

However, there are cases where prepared statements are insufficient, though they are certainly not mainstream. You can't prepare an incomplete statement. If the SQL query itself varies (e.g. what tables you JOIN and what constraints you put in your WHERE clause), you have some choices:

Prepare every possible query. This is secure by default, but tedious to enumerate all possibilities (in some cases increasing the chance of bugs due to duplicated SQL code)
Use the most detailed query, crafted in such a way that unnecessary JOINs and conditionals can be no-ops. This doesn't always work, but you can craft something like "WHERE (foo = ? OR ? IS NULL)" and pass in $foo to both parameters, to allow conditional checking of foo.
Dynamically generate the SQL query (it should still be parametrized). The downside is that there is no single block of text that contains the exact query, but it is consistently going to have the best performance and least duplication required.

It's perfectly acceptable for someone to say "There will be no dynamic SQL generation in this project," the same way it's perfectly acceptable to say "We don't use #define macros ever." You're just denying yourself a useful tool.
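
For illustration, here are the second and third options sketched in Python against an in-memory SQLite database (chosen only so the example runs anywhere; the table foo and the helper names are made up). In both cases the user-supplied values only ever travel as parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, x TEXT, y INTEGER)")
conn.executemany("INSERT INTO foo (x, y) VALUES (?, ?)",
                 [("a", 1), ("a", 2), ("b", 3)])

def find_noop(conn, x=None):
    # One fixed query; passing None turns the condition into a no-op.
    sql = "SELECT id FROM foo WHERE (x = ? OR ? IS NULL) ORDER BY id"
    return [row[0] for row in conn.execute(sql, (x, x))]

def find_dynamic(conn, x=None, y=None):
    # Build the WHERE clause from fixed fragments; values stay parameters.
    clauses, params = [], []
    if x is not None:
        clauses.append("x = ?")
        params.append(x)
    if y is not None:
        clauses.append("y = ?")
        params.append(y)
    sql = "SELECT id FROM foo"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]

all_rows = find_noop(conn)
only_a = find_noop(conn, "a")
a_and_2 = find_dynamic(conn, x="a", y=2)
hostile = find_dynamic(conn, x="a' OR '1'='1")
```

The dynamic version never concatenates a value into the SQL text, only fragments written by the programmer, which is what keeps it parametrized.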

Re:feature catch up on No Competition Between Open and Closed Source? · 2007-05-07 18:43 · Score: 1

Microsoft invented the Al Gore.

Re:No competition between open and closed? on No Competition Between Open and Closed Source? · 2007-05-07 18:40 · Score: 2, Insightful

Have you ever chosen between using Apache and IIS?

No, but I've chosen between Apache and Lighttpd.

Have you ever chosen between using MySQL and DB2?

No, but I've chosen between MySQL and Postgres.

Have you ever chosen between using PHP and Active Server Pages?

No, but I've chosen among PHP, Perl, and Python.

Huh, I guess all of the choices I listed are open-source.

Re:No competition between open and closed? on No Competition Between Open and Closed Source? · 2007-05-07 18:37 · Score: 1

Thunderbird vs. Outlook

You haven't used Outlook to its full potential if you think Thunderbird is an appropriate matchup. Functionality-wise, you're better off matching these:

Thunderbird vs. Outlook Express
Evolution vs. Outlook

Though, I agree in spirit.

Re:Why are websites still doing anything? on Why are Websites Still Forcing People to Use IE? · 2007-04-19 05:07 · Score: 1

There are frameworks for session management which will fall-back to URL session-ids if cookies fail. They're just as easy to work with as cookies.

Yeah, and it's usually recommended to avoid using them, because of the added risk in including sensitive information in the querystring. I am a huge fan of graceful degradation, but IMHO it's not "graceful" to degrade from cookie-based session management to URL-based session management for that reason.

From PHP session documentation:

URL based session management has additional security risks compared to cookie based session management. Users may send a URL that contains an active session ID to their friends by email or users may save a URL that contains a session ID to their bookmarks and access your site with the same session ID always, for example.

Re:Missing from the list on Top 10 Firefox Extensions to Avoid · 2007-04-11 03:13 · Score: 1

It doesn't seem like a huge effort to me to tell it you want to trust a site you use regularly -- you only have to do it one time.

You're missing my point. You wouldn't mind doing this (whitelisting sites for Javascript by default). I wouldn't mind doing this. I believe we are in the minority; most people I know don't really even understand the concept of disabling scripting, let alone care about managing which sites are enabled by default.

Maybe NoScript could be configured with a live-updated whitelist by default, so that the people who do care (power users who manage their own noscript settings) can provide the benefit of selection for people who don't care. Otherwise, the people who don't care will just say "Firefox is broken" and switch back to IE. To these people, tabbed browsing (which IE7 now has) is more important than NoScript's security benefit.

User: smellotron

Comments · 1,466