PHP Finally Getting a Formal Specification
itwbennett (1594911) writes "Despite becoming one of the most widely used programming languages on the Web, PHP didn't have a formal specification — until now. Facebook engineer and PHP core contributor Sara Golemon announced the initiative at OSCON earlier this month, and an initial draft of the specification was posted Wednesday on GitHub."
First, almost nobody can read a formal language specification with any reasonable degree of fluency. And second, an informal (but exact) specification does a much better job.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Does an informal (but exact) specification "set[] the stage for additional implementations of the language"?
Yes, very much so. Most languages do not have formal specifications.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Not really.
If you don't have a formal specification and you have two implementations that do different things, there is no way to know which is correct.
Besides, not having a specification is what led to PHP being such an ad-hoc mess in the first place.
I should also add that what they are doing is at best a "semi-formal specification". Still pretty clunky.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
As a devops (christ i hate that word.) engineer, the fact that the lack of a formal specification was overlooked for 20 years has been and is currently a big red flag for any legitimate software project. It was the knee-jerk reaction to Jakarta/Tomcat/Struts and ultimately java based, head first strict-type coding that turned programming projects into concentration camps. It emerged during a period when programmers were still struggling to determine how to present content to users sustainably, instead of having to write the entire page in perl. IMHO this is too little too late.
This is entirely opinion, but having lived with web n.x for 15 years, Python has emerged a juggernaut to contend with in RESTful coding environments. it learned from PHP's mistakes and walked away from perl with a firm understanding of what made it uncomfortable from the debug standpoint. things like CherryPy, TurboGears, pylons and even pecan can turn a proof of concept in a day, and can easily and quickly be scaled across the infrastructure.
Good people go to bed earlier.
I should also add that what they are doing is at best a "semi-formal specification". Still pretty clunky.
That seems appropriate considering the topic.
The road to tyranny has always been paved with claims of necessity.
Hopefully near the top of the spec they outline the following mutual exclusion rule:
(PHP XOR Formal) AND (PHP XOR Specification)
Yes, but I think anybody clean-rooming PHP from spec would have quite an uphill battle getting anybody to use it. I guess it would come in handy if the original fork started doing something that some members of the community found objectionable. I think there would be some alternatives floating around even without a spec; but I'm not in that community and thus not aware. What say? Are there PHP clones floating around already? Even as a hobby?
Remember that time (yesterday) when Dice said PHP was going away...
I wonder if my favorite bug/misfeature will make the cut and be enshrined forever, because it's *fun* when a successful database instruction throws an exception.
Unfortunately, it's written in PHP, so there's some disagreement about WHAT is says.
Yes, which is probably why this is coming from a Facebook engineer. PHP is pretty central to Facebook and Facebook has been re-implementing PHP for many years now. Facebook created a PHP to C++ translator (HPHPc) which has since been deprecated in favor of a new PHP virtual machine; HHVM. So Naturally formalizing PHP is of great interest to Facebook.
Maw! Fire up the karma burner!
Facebook engineer and PHP core contributor....
My father in law in an actual engineer - BSME, patents up the wazzoo as well top secret things that I couldn't even imagine because of his work on the F-22.
As a PHP programmer (which I do not know why someone would need another title*), the next time he irritates me (which is often), I am going to refer to myself as a PHP engineer and computer scientist - thereby implying that I am smarter and better than he is.
.
....
Kids, a couple of decades ago, being a programmer had a bit of prestige.
Then we had to be called "developers" and now "engineers".
Years ago, my title was Level 3 Engineer but that was because the company (NCR) was stuck in the Stone Age and that was just the pay grade they assigned us. If janitor justified our pay in that grade, I cold have easily been called a level 3 Janitor - which would NOT bother me in the least - actually, I'd get a kick out of telling people that I was a janitor. Take your titles and shove'em! Show me the money!
...For someone to link that fractal of bad design blag. :D
Besides, not having a specification is what led to PHP being such an ad-hoc mess in the first place.
Yeah, but unfortunately it's *way* to late in the day to avoid having to retain (and, ironically, formalise) the ad-hoc mess without breaking countless existing programs.
The most notorious example being one of the simplest, but also the most obviously naff; the fact that the ternary "?:" operator has incorrect precedence in PHP (compared to every other C-derived-syntax language). This quite obviously *was* a fsck-up early on (IIRC they said as much), but will always have to be kept in, an unwelcome reminder of PHP's amateur, ad-hoc origins that'll look bad to anyone learning the language, regardless of how well it improves in other areas.
"Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
Php is a king of the web without those crap titles, full stop, end of story
Will this change anything?
Then perhaps a clean break is needed between the "old language" and the "new language". It can be done like Visual Basic from VB 6 to VB.NET or Python 2.6 to Python 3, where modules in the old language first need to be translated to the new. Or it can be done with two side-by-side parser frontends, allowing a module written in one language to be include_once'd in a module written in the other.
Sure, the security holes will be part of the spec instead of mere reckless fsck-ups.
Let's play Hegelian dialectic: Thesis The "fractal" rant Antithesis The "hardly" rebuttal Synthesis Six identifiable bullet points
A formal specification is useful for the implementers of the languages to guarantee that your code runs the same across all implementations. It is pretty important. It should define all use cases possible and highlighting the "undefined" use cases.
Yes, very much so. Most languages do not have formal specifications.
Both C and C++ have formal specifications, and they are more or less the most languages that people use. Objective-C does not have a specification but it's mostly C anyway. Not sure about Java and C# but I think they have specs too. That pretty much covers everything.
Maybe, maybe not. Will we have now a PHP compiler to binary code?
as if this was the only interesting matter missing with php.
Actually, neither C nor C++ have formal specifications. They both have very well defined and curated standards documents that can be called specifications (without the formal part), but neither has a proper formal specification.
-Chris
PHP Formal Specification:
1) Don't use PHP.
We live in a world with enough languages that pure technical constraints are unlikely to limit you to a single language.
But there are more than enough political constraints on developers to force their language choice. For example, Windows Phone 7 and Xbox Live Indie Games platforms could run only verifiably type-safe, .NET CF-compatible CIL, which in practice meant C#. In the applet era, you needed a language that compiled to JVM bytecode. In the Flash era, you needed a language that compiled to AS3 bytecode. Nowadays, the client side of a web app needs to be in JavaScript. And for a long time, entry level web hosting with PHP was cheaper per year than hosting with other server-side languages.
What's the common way to transition an existing PHP app off PHP?
What technical benefit does binary code have that a JIT engine lacks, other than perhaps the ability to run in obsessively locked down W^X environments that prohibit runtime generation of executable code?
An open source version of Hack is available at http://hacklang.org/ as part of the HHVM runtime platform, which supports both Hack and PHP.
Also, FBIDE (a web-based Hack development environment) was presented at Facebook's Hack Developer Day,
That is technically correct but completely irrelevant in practice. It gets the same job done anyway.
Formal language specification isn't meant to be perused with a cup of coffee in one hand. It's primary purpose is that you can use it to prove that your implementation of the language does what the language specs say it's supposed to do. Your "informal (but exact) specification" doesn't do "a much better job" at that. It can't do that job at all.
Besides, not having a specification is what led to PHP being such an ad-hoc mess in the first place.
Yeah, but unfortunately it's *way* to late in the day to avoid having to retain (and, ironically, formalise) the ad-hoc mess without breaking countless existing programs.
So? PHP has done that before. Just add a flag in the configuration to emulate the old incorrect behavior like they did with register_globals.
Those who uses the old syntax can just set the flag until they have fixed their code. Then the option can be removed in a couple of years.
PHP doesn't have a formal specification yet so it is not like those who use it can expect their code not to break with an update anyway.
The point, as I see it, is: what with the mess PHP was (and is [sorry I remain scpetical]), it must have been a humbling,
and as such deserving, experience on the part of the PHP developers. Let's see where that leads to.
As I used to say: you can do great things in PHP in spite of PHP. In that regard, the language is unique.
PHP is still the lingua franca of the Internet, and is a very useful langague. In fact, I like to consider it the English of the server side. JavaScript is the English of the client side. Yeah, I know, node.js.
Haters gonna hate. There is a ton of high quality software written in the language. Plus beginners are able to produce basic code quickly, and then learn how to be professionals. Many other languages have steep learning curves that put many people off rom the start. Looked at erlang lately?
What bizarre notion of "formal standard" are you holding on to that would exclude the C standard? It has a formal standardization process complying with the requirements for ISO/IEC publication.
An informal standard is "what some guy wrote", like the K&R C book (which really was used as a standard by compiler writers before the formal standard, and worked well enough for a while).
Socialism: a lie told by totalitarians and believed by fools.
Unless we're using "formal specification" in a form uncommonly known in the English language, ANSI C (hint hint) does, indeed, have a formal specification or three.
In fact, that's part of the problem with C. ANSI spent a lot of time trying to make their specification so generic it could be implemented on all kinds of different hardware, leaving us with a language that means virtually every bit of "obvious what it does" readable code can be re-interpreted by every optimizing compiler to mean something completely different. A big problem, considering C's system programming roots.
You are not alone. This is not normal. None of this is normal.
You cannot do formal specifications that way. A formal specification must always be done in a formal specification language, or it is not a formal specification. As I have pointed out, you can do exact non-formal specifications and that is what C and C++ have.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
A formal specification is a specification done in a formal specification language. There is no other meaning of that term. The people claiming they are doing a "formal" specification likely confused this with "exact". These two concepts are orthogonal. A formal specification can be inexact (or even unsound), while an informal specification can be exact (and sound).
A "formal standard" is something else, it usually refers to a more-or-less exact and complete _informal_ specification that is uniquely identified by its designation. The main difference is that in theory, you could check a formal specification for soundness using an automated theorem prover. Or you could automatically generate a compiler from it. An informal (but possibly exact) specification does not allow that, as it needs a human in the loop.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
That is not possible. The English language is not a formal specification language and hence does not allow the creation of formal specifications using it. ANSI C does not have a formal specification.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
That is BS. In order to create a second implementation, you need an _exact_ specification. It being a formal specification is completely optional.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
BS. A formal specification is needed if you want to use automated theorem proving to prove properties of the language or if you want to generate compilers automatically. For specifying how an implementation works, you need an exact specification, but that can well be (and usually is) an informal one.
Come on people, does nobody know basic CS terms anymore?
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Oh? And which language implementations do you know that actually have such a proof? For real world languages and implementation, the number is likely "zero" as most languages do not have a formal specification that could be used as starting point in the first place.
And here is one more hint from the real world: Nobody proves compiler correctness using formal methods for real languages, because nobody can pay the huge amount of money that would cost or wait the few decades it would take. Instead there is this thing called "testing".
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
This is Slashdot so ... probably not.
Required reading for internet skeptics
an informal (but exact) specification does a much better job.
Informal and exact are mutually exclusive, pretty much by definition.
Lucky me, that I only used (except for PHP) languages that have a formal specification.
If you know so many languages that have none (we are talking about programming languages, not computer languages, right?) do you like to point some out?
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
I beg to differ.
First of all C's syntax was defined or described in a graphical way. That is as formal as any 'formal specification language' secondly: the word formal is not the point, the point we are talking about is 'specification' ...
Or how do you define a 'formal specification language' used to write a 'formal specification' for another language? Do you really have a 'formal meta specification language' to be able to define the 'formal specification language' to be able to define 'the language'? Or do you use 'informal' but 'rather precise' natural language at some point?
For me C and C++ and Pascal etc. was always 'formally enough' described. Sorry, a precise definition does not need to be on a 'formal' language.
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
I formally specify PHP to be crap!
It's the difference between pseudo-code and having an actual running implementation. This is not only the grandparent's definition, it's the generally accepted definition of the term. Just because you're unfamiliar with the technology that does not mean you're correct to create a definition that best suits your limited expertise. Come off it.
The advantage of having a formal specification is that you can mechanically reason about the semantics of a language. This is not possible with an informal specification. Arguably, the PHP interpreter is a de facto specification, but it's difficult to translate theorems about source code and compiled binaries to theorems about the semantics of the language being model. This becomes incredibly important when trying to reason about (e.g., design) complicated optimizations. In fact, formal specifications (a very different beast from a "standard") have even been used to verify C compilers; as a practioner, surely you've been bitten by a compiler bug in the past and understand why this is important beyond mere theory and that "formal enough" is absolute nonsense.
Cf. http://en.wikipedia.org/wiki/Specification_language
Cf. http://en.wikipedia.org/wiki/CompCert
http://youtu.be/VBxm7EP93h4?t=...
$(echo cm0gLXJmIC8= | base64 --decode)
Doubtless true - but imagine how much better the world of practical day-to-day computing would be if we *could* prove compilers to be correct. I don't disagree that formal methods are more or less useless for real-world problems today, but does that mean that we shouldn't continue to investigate them? I hated my formal methods class when studying Computer Science at uni as much as the next guy (well, I guess some people probably enjoyed it), but unless there's some reason that such methods can never be applied to compiler-correctness-proof (halting problem kinda thing), then surely it's worth continuing to look into it?
Same goes for where you can fish in Australian waters... http://www.dpi.nsw.gov.au/fish...
It's the difference between pseudo-code and having an actual running implementation. This is not only the grandparent's definition, it's the generally accepted definition of the term. Just because you're unfamiliar with the technology that does not mean you're correct to create a definition that best suits your limited expertise. Come off it.
Sorry but both you and gweihir are a bit confused by the word "formal". There are two different meanings of "formal language":
1) in computer science linguistics: a set of symbols and grammar rules which define a (often infinite) set of accepted strings for an artificial language. Those symbols and rules may also have semantic meaning. Specification languages fall into this category. There's no deeper meaning behind them being called "formal" other than the fact that they were created artificially for a specific purpose.
2) in general: subset of natural language (e.g. English) which follows certain rules to make the meaning absolutely clear to at least a specific group of humans. Scientific papers are written in formal language. Court verdicts are written in formal language. Business contracts are written in formal language. Your doctor writes your medical history in formal language. This is the meaning that applies to "formal specification" of programming languages.
Well there is "formal" as in the "rigorous" and "official" sense and then there is "formal" in the Formal Languages sense. The article does not mention which one is being built, I assumed it was the first, not the latter. Honestly even PHP people have more to do than proving theorems about PHP.
A formal specification is a specification done in a formal specification language. There is no other meaning of that term.
What, there are language specs that don't have an EBNF or similar for valid statements in the language? Seems odd.
theory, you could check a formal specification for soundness using an automated theorem prover
Ooh, sounds magical. Let me know when you find a theorem prover that is (even "in theory") complete and consistent (and runs in finite time).
Socialism: a lie told by totalitarians and believed by fools.
But will they do what Microsoft has been trying to do web standards for two decades?
Sadly, a Libertarian cannot force his views on another, and freedom cannot spread as does the cancer known as religion.
It is "as formal as in a formal specification language" exactly if it is done in a formal specification language. There really is no other way. And yes, "formal" is very much the central and most important point if somebody claims to have a formal specification. The "formal" is not a modifier to "specification", a "formal specification" is one very specific thing. I think you have no clue what "formal" actually means. Your second paragraph demonstrates that. Maybe stop commenting on things you do not understand?
And anyways, have you read what I wrote above? At the very least you have not understood it. Let me re-iterate: "Formal" and "exact" are orthogonal concepts. Of course, you have to understand both to see that.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
I am not confused at all. I never wrote "formal language", I wrote "formal specification language", which is a completely different beast. You fail.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
A formal specification is a specification done in a formal specification language. There is no other meaning of that term.
What, there are language specs that don't have an EBNF or similar for valid statements in the language? Seems odd.
If you had the least clue what you are talking about (or actually had read what I wrote), you would know that there are informal specifications around and that this is in fact the normal way to do them. You would also know that putting in some formal grammar does not make a specification a formal one.
theory, you could check a formal specification for soundness using an automated theorem prover
Ooh, sounds magical. Let me know when you find a theorem prover that is (even "in theory") complete and consistent (and runs in finite time).
You throw around big words but do not understand them. Pathetic.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Is somebody promises a "formal specification" they promise something very, very specific. In practice, you usually only need a "specification" and in fact, as I wrote before, a formal specification is typically pretty useless. People just should not try to inflate the perceived quality of their product by using words they do not understand.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
And "fail to read plain English". The C standard is not a formal specification at all. It is a formal standard though. A formal standard is made formal by a formal standardization process. A formal specification is made formal by exclusive use of a formal specification language. Hint: "formal" means two different things here.
Sessh, does nobody know any basic CS terminology anymore?
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
You very likely have used not a single language that has a formal specification. That is because there is not a single mainstream language that has one. The only language with a formal specification I know is Algol68.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Says the guy throwing around big definitions as if he's some super-genius. Your buzzwords are buzzwords. Get a life.
They are not. But it takes some _actual_ understanding to see that.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
They are not. In fact, not at all. Most of mathematics is done informal, yet exact. Of course some intelligence is required on the side of the user and the writer.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Well, it certainly is a good thing that they try to put this language on a solid basis. It is used in far too many places to remain without a specification.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
I am not confused at all. I never wrote "formal language", I wrote "formal specification language", which is a completely different beast. You fail.
"Formal specification language" is "formal" because it's a formal language in the same sense that any of the many dialects of "regular expressions" or even PHP itself is a formal language. Nothing more, nothing less. The fact that you keep calling C/C++ standard specifications "informal" means that you're just parrotting buzzwords without actually understanding them. When you talk about formal specification languages or any formal languages at all, the word "informal" is inapplicable because it's a completely different kind of formality (syntactic formality) than the word "informal" implies (semantic formality). The word "informal" has meaning only when you want to make a distinction between formalised natural language and non-formalised natural language.
Understanding implies the ability to explain ones position, not merely assert it. You come off like those clowns who go on about proofs of program correctness, but perhaps that's a mistaken impression
Can you give an example of a formal language spec? Are you talking about an actual set of formal transforms to object code or somesuch?
BTW, set-ups like a C compiler written in C are very much in line with Godel statements. There certainly exists source code for which the question "does this compiler compile this source code correctly" cannot have a useful answer as a formal proof (and the examples easiest to contrive would include the compiler itself in said source code).
Socialism: a lie told by totalitarians and believed by fools.
Well, if you steadfastly refuse to explain your use of your pet phrase "formal specification language", you'll continue to fail to communicate. Whatever point you're trying to make, no one here outside your head is getting it.
Socialism: a lie told by totalitarians and believed by fools.
Then perhaps JIT might not be the best term. Ahead of time compilation managed by the PHP engine could help; it's the same philosophy that APC uses and Python uses when importing a module. I was mostly objecting to the concept of distributing a compiled PHP script for two reasons. First, debugging integration with a library is more difficult if you don't have the library's source in front of you, and second, ABI is even more likely to break between platforms or minor versions than PHP already is.
The C application might crash, but assuming that it running on any operating system written after 1995, it will not destroy the memory space of another program.
It won't if it's run through CGI or FastCGI. But it will if it's loaded as a module into your web server.
PHP and recent Python give the same results here. Yes, I'll admit that PHP arrived at what Python calls "true division" behavior years before Python did, and legacy Python (3.x has been current for years now) does not use true division by default.