Slashdot Mirror


User: tangi

tangi's activity in the archive.

Stories
0
Comments
11

Comments · 11

  1. Re:PHP doesn't scale? on Scriptiing The Enterprise With Java And PHP · · Score: 3, Insightful
    More than anything I was curious about server side Java in particular, which you claim is more scalable because it shares memory. I'm interested in hearing some more details about this - why you think it's so and any references to back it up.

    Java is not more scalable than PHP on its own just because it shares memory. Java enables and simplifies the design of scalable applications, which is not exactly the same thing. If there is nothing to share, then the execution model doesn't matter. If you can capitalize on things created once and for all, or at least reusable several times, then being able to share memory has a big impact.

    "Java-based SEDA Web server outperforms Apache and Flash (sld12)" because of a design aimed at limiting object reinstantiations and context switching. These two pains obviously occur when you do the same things on many concurrent threads: you'd better do it once and share the result.

    There is really nothing special with Java and multi-threading about that. The same is true for multi-process Apache C modules programmed to use shared memory.
    In fact all four components of the LAMP architecture make extensive internal use of shared memory (for i in linux apache mysql php; do google "shared memory" $i ; done) simply because CPU cycles and memory allocations are expensive, and high performance objectives mean not wasting them. If PHP had a higher-level API than its existing one for managing shared memory, web programmers could easily extend the benefit of shared memory to the application itself.
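
    A minimal sketch of "do it once and share the result", written in Python only to keep it short. The names SharedResults and render_header are made up for the illustration, and the "expensive" work is a stand-in. Several threads ask for the same value, but it is computed a single time.

```python
import threading


class SharedResults:
    """Results computed once and reused by every thread in the process."""

    def __init__(self, compute):
        self._compute = compute      # hypothetical expensive function
        self._lock = threading.Lock()
        self._results = {}
        self.computations = 0        # how many times compute() really ran

    def get(self, key):
        # Holding the lock during the computation keeps the sketch simple;
        # it also serializes callers, which a real design might avoid.
        with self._lock:
            if key not in self._results:
                self._results[key] = self._compute(key)
                self.computations += 1
            return self._results[key]


def render_header(site):
    # Stand-in for expensive work (parsing a template, querying a database...).
    return "<h1>" + site.upper() + "</h1>"


shared = SharedResults(render_header)
pages = []


def handle_request():
    pages.append(shared.get("slashdot"))


threads = [threading.Thread(target=handle_request) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

    Eight "requests" are served and render_header runs once. With one process per request and nothing shared, it would run eight times.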

    I shouldn't end my post with flamebait, but I believe that if a web developer suffers from Java's drawbacks (bytecode/JVM, performance cost of native UTF-16 strings, garbage collection, ...), he's 99% likely to be under-using its strengths (great thread API, servlet model, great libraries, ...). Well used, they enable really performant designs. So many times I've seen applications refactored from C to Java perform several times faster, just because it was easy to do things smarter in Java, while very risky in C (Never had a SIGSEGV in a large multi-threaded C application? Happy debugging, and next time you'll keep it stupid!).

  2. Re:PHP doesn't scale? on Scriptiing The Enterprise With Java And PHP · · Score: 2, Informative
    So Java's answer is just to hoover up all available memory and then share it?;-)

    Yes ;-) and no. PHP provides a bit less control over choosing the appropriate trade-off between size and speed: an issue born with data structures and algorithms, much older than Java.

    How is this scalable, say, to multiple computers?

    Performance is a matter of software design and not of language or bytecode or whatever. It's like the "don't debug, verify correctness" principle of eXtreme Programming. Here it's: "don't optimize your code, design it to be fast and scalable".
    On multiple computers, your design should also reflect the same principle: avoid reprocessing the same things again and again. The difference is only about granularity. Caesar's "divide and conquer" principle is very useful too.
    You may specialize some machines on some subsets of data so each can hold in memory the data it needs most of the time, let each work on it, and finally merge all sub-results. That's how Google manages its huge index, for instance.
    You may also keep databases but use them through Active Objects to "decouple method execution from method invocation to enhance concurrency". I often use an LRU cache to address the issue of what should be kept in memory and what should not.
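
    For the LRU cache mentioned above, a minimal sketch in Python (the class name LRUCache and its capacity parameter are assumptions for the illustration, not any particular library's API). The least recently used entry is dropped once the cache is over capacity.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` entries, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)          # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # "a" becomes the most recently used entry
cache.put("c", 3)     # over capacity: "b" is evicted
```

    The capacity is the knob for the size/speed trade-off: a bigger cache means fewer trips to the database and more memory held.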

    Can you point to a document that explains some of this stuff without using words like 'enterprise enabled'?

    That's a very wide topic covering most of computer science: data structures and algorithms, design patterns, architectures of OS, ...
    With some experience, I have discovered that OOP and system programming are very similar. Programmers of both worlds often argue although they in fact agree: they simply don't use the same words for the same things!
    I therefore suggest you have a look at Design Patterns from Gamma et al., Pattern Languages of Program Design from Vlissides et al., read any good book about the architecture of modern OS (paper on I/O aspect), and for God's sake keep off "The ultimate /my-favorite-language/ Programming Bible" ;-).

    IMHO, every programmer should know about design patterns, even if he never intends to practice OOP, and about how wonderfully operating systems are designed, even if he never intends to leave Java to write a device driver or an I/O library in C.
    In addition to improving the software around us, it would also work a true miracle: ending the weekly Biggest Di*k Contest about languages on /.

  3. Re:PHP doesn't scale? on Scriptiing The Enterprise With Java And PHP · · Score: 5, Insightful
    PHP is a very performant and handy language but it lacks a shared memory model, which is the most important source of scalability.
    Rule #1 of scalability is to avoid doing the same thing twice and instead store the result where it can be reused (by other threads here).

    One may call this a "limited support for object-oriented programming" because it's indeed impossible to implement most of the common OO design patterns in pure PHP but this has in fact very little to do with OOP: shared memory is a system notion and storing intermediate results is what variables exist for! Storing data for later reuse by another thread is not fundamentally different from introducing a variable before a loop to store a constant expression used within this loop instead of recalculating it at each iteration.
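
    The loop analogy, as a small Python sketch (function names are invented for the example). Both versions return the same list; the second stores the constant expression in a variable before the loop instead of recalculating it at each iteration.

```python
import math


def scale_naive(values, angle_degrees):
    # The constant expression is recomputed at every iteration.
    return [v * math.cos(math.radians(angle_degrees)) for v in values]


def scale_hoisted(values, angle_degrees):
    # Same result, but the constant expression is stored once before the loop.
    factor = math.cos(math.radians(angle_degrees))
    return [v * factor for v in values]
```

    Sharing a result between threads is the same move at a coarser granularity: the "variable" simply lives longer than one request.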

    You can't do the former in PHP unless you use an RDBMS (not as fast as direct memory access) or... C/C++ extensions, which is what Yahoo does (Making the Case for PHP at Yahoo!). Through such extensions, PHP enables the implementation of something similar to a servlet instance member.

    But that's much more complicated than in Java, even more so if you're trying to implement a generic extension, because of type mapping issues between PHP and the extension (C/C++ being strongly typed). Yahoo can of course afford the effort, but the result is light-years away from common PHP usage: most of us can't claim to be doing what Yahoo does just because we also use PHP.

    This is to say that PHP is a wonderful language. It simply has some drawbacks, like all the others.

  4. Don't forget CS101 basics when dealing with XML on Using XML in Performance Sensitive Apps? · · Score: 1
    XML can be very verbose and this can be a problem, especially with parsers doing a lot of copying and languages allocating memory slowly.
    Java strings do a lot of copying; the point is to get as close as you can to a zero-copy XML parser.
    This is true for C++ as well. The std::string(const char *) constructor copies the string. This can be a performance bottleneck with a C++ wrapper around a C parsing API. I therefore use a patched version of Arabica (C++ SAX2 wrapper to Expat) relying on a ConstString class. It just rocks (used on a top 5 French search engine).

    But that's an evil optimization unless you've already designed your DTD to limit memory allocations. You wouldn't put detailed client information in every order item record when using a database, would you?

    I once attended an international conference where the speaker "proved" XML/XSLT had poor performance... with an example doing a simple lookup in an XML file. The XML data was shamefully unstructured and the lookup algorithm was O(n^2)!

    Design your DTD with the care you naturally take with databases, design your code to avoid multiple passes over the XML, and everything should be OK. Never forget that things usually go pretty well with databases only thanks to SQL optimizers: most tables and queries are badly designed.

    • Avoid redundant data: factorize them by using a relational-like XML structure, use entities for constants
    • Get rid of any data you could retrieve another way: just put ids in the XML and store detailed persistent data in a hashtable.
    • Don't misuse XML: avoid over-nested structure, prefer attribute to sub-element for singletons, use the ID/IDREF mechanism.
    With these common sense rules, you can achieve several hundred requests per second per CPU. Go to XML Patterns for further reading.
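
    A sketch of the "ids plus hashtable" rule, in Python with the standard xml.etree.ElementTree module. The document layout, the element and attribute names, and the function totals_by_client are all invented for the example. Client details appear once, orders only carry a client id, and a single pass with two dictionaries replaces any nested search. This is not a zero-copy or SAX parser, only an illustration of the data layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical relational-like layout: orders reference clients by id.
ORDERS_XML = """
<orders>
  <client id="c1" name="Alice"/>
  <client id="c2" name="Bob"/>
  <order id="o1" client="c2" total="40"/>
  <order id="o2" client="c1" total="15"/>
  <order id="o3" client="c2" total="25"/>
</orders>
"""


def totals_by_client(xml_text):
    root = ET.fromstring(xml_text)
    names = {}      # client id -> name
    totals = {}     # client id -> sum of order totals
    for element in root:             # single pass over the children
        if element.tag == "client":
            names[element.get("id")] = element.get("name")
        elif element.tag == "order":
            client_id = element.get("client")
            totals[client_id] = totals.get(client_id, 0) + int(element.get("total"))
    # Ids are resolved to names only once, after the pass.
    return {names[cid]: total for cid, total in totals.items()}
```

    Each lookup is a dictionary access, so the cost grows linearly with the number of elements instead of quadratically.
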
  5. Re:Not Black box, but *PERFECT* black box on Programmers and the "Big Picture"? · · Score: 1
    Nobody shocked by the "2.0 - (1.0 + 1.0) < 0.001" test?

    I so have ABSolutely succeeded in demonstrating the danger of sourcing perfection out of the box ;-)

  6. Re:Not Black box, but *PERFECT* black box on Programmers and the "Big Picture"? · · Score: 1
    The ranges of imperfection of electronic components are part of their specs, right? Just like an exception or rounding error in SE.
    Those components sometimes fail to comply with their specs or simply burn, right? Just like our code then.
    So it's really a matter of paranoid mindset...
    ... or strategy. Should we enforce the fault tolerance of our code using defensive programming or prevent errors by extensive unit testing?

    I believe defensive programming tends to hide imperfections which may mature and explode later but harder.
    This approach makes sense in EE where the components are manufactured in large series but for hand-written software components, I'd be more skeptical. I'd rather check twice whether my box is appropriate and really black, not only dark grey.
    If 1.0 + 1.0 != 2.0 then I'd write a real black box where 1.0 + 1.0 == 2.0, instead of writing 2.0 - (1.0 + 1.0) < 0.001 everywhere.
    This can unfortunately be done only if the specs of the system clearly say that 1.0 + 1.0 != 2.0.

    The issue is then in fact about packaging the imperfect black boxes in a perfect way, rather than assuming everything could be incorrect. Never had a bug in a defensive test ;-) ? I have, and I now limit this strategy to debug mode, to cause avalanches rather than unnoticeable slips.
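
    One reading of the "2.0 - (1.0 + 1.0) < 0.001" test quoted in this thread, as a Python sketch (both function names are made up). Without an absolute value the defensive test accepts any result that is too large, which is exactly the kind of bug a defensive test can carry. Putting the tolerance rule in one small box keeps it out of every call site.

```python
def naive_equal(a, b, tolerance=0.001):
    # The test as written in the thread: no absolute value,
    # so any b larger than a passes.
    return a - b < tolerance


def nearly_equal(a, b, tolerance=0.001):
    # One place that owns the tolerance rule; callers never repeat it.
    return abs(a - b) < tolerance


wrongly_accepted = naive_equal(2.0, 5.0)     # 2.0 - 5.0 is negative, so it "passes"
correctly_rejected = nearly_equal(2.0, 5.0)
```

    The tolerance of 0.001 is taken from the quoted test; the right value belongs in the specs of the box.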

  7. the need, the problem and the solution on Programmers and the "Big Picture"? · · Score: 1
    Do Slashdot readers think that the theories used to teach (and learn) programming lead to programmers that tend to approach problems with a 'black box', or 'virtual machine' mentality without considering the entire system?
    I don't think CS teaching is directly responsible for this situation.

    The real issue may rather be that most people tend to think in terms of solutions rather than problems.
    The prerequisite to the solution is unfortunately the ability to state the problem correctly.
    Even when we do so carefully, we usually lack an accurate awareness of the real need.
    Unconscious needs lead to poorly stated problems which then lead at best to inappropriate solutions.
    These inappropriate solutions are finally stacked one atop the other and called "layers of abstractions".

    CS teachers, students and programmers are just human. Nothing wrong with that provided they all know their biggest weakness.

  8. Re:still no silver bullets on Has Software Development Improved? · · Score: 2, Insightful
    I may know of a silver bullet.

    Nothing new under the sun, I'm unlikely to be the first one to face any problem. Among those who already faced it, some managed to solve it. Many generous solvers live on the internet, share their production and I can usually find it in no time using Google. Isn't the systematic plundering of others' solution to the issue I'm facing the biggest improvement of the last quarter century? I definitely think so.

    Logic requires careful thought, and careful thought requires time.

    ... But others may already have thought carefully.

  9. Re:They passed on Java because FreeBSD is crappy? on Yahoo Moving to PHP · · Score: 1

    Java is for sure more scalable than PHP. A sufficient justification for this statement is its servlet model, which allows threads to share memory (objects), and that is the only real key to scalability. The price is the need for a CS degree.
    Yahoo's situation is quite special: adding another language to their toolkit was simpler than changing the OS on such a huge web farm. So, yes, PHP, being a very efficient brute-force scripting language, may be better for Yahoo.
    Conclusion:
    • the best language is the one that suits your needs.
    • if two people argue about the BEST programming language, they don't know enough about programming basics.

  10. Re:Multi-threading on Pet Bugs II - Debugger War Stories · · Score: 1
    I met that pet too but with two different Solaris thread libraries.

    Strangely, my code worked fine with the first one and froze with the second one... until I trussed the process, which made it carry on :-o
    It took me some time to figure out that, with the second thread library, a thread owning a mutex, releasing it and immediately claiming it back always remains active without giving any chance to another one to be scheduled and thus to acquire the mutex.
    It wasn't a bug of the library as POSIX doesn't guarantee a descheduling on mutex release, but a lack of an explicit yield in my code.
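
    The shape of the fix, sketched in Python rather than with Solaris threads (the worker function and the thread names are invented, and Python's scheduling is not the same as either Solaris library, so this only shows where the yield goes). Each thread releases the mutex and then explicitly yields before trying to claim it back.

```python
import threading
import time

lock = threading.Lock()
progress = {"greedy": 0, "other": 0}


def worker(name, rounds):
    for _ in range(rounds):
        with lock:
            progress[name] += 1
        # Explicit yield after releasing the mutex, so that a waiting
        # thread gets a chance to be scheduled and to acquire it.
        time.sleep(0)


threads = [threading.Thread(target=worker, args=(name, 1000)) for name in progress]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

    Nothing here proves fairness; a run that happens to finish is exactly the kind of experiment that says little about multi-threaded code.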

    Just another illustration of experimentation being definitely not a good strategy when dealing with multi-threading.

  11. Re:from the terms and conditions on Amazon Introduces Web Services Interface · · Score: 2, Informative

    Just off the top of my head: you could run your PHP within a dedicated instance of your web server, limit the number of processes/threads to 1, and have your script sleep for the difference between 1 second and Amazon's response time.
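
    The "sleep the difference" part, sketched in Python instead of PHP to keep it short. The names pause_needed and throttled_call are hypothetical, the one-second interval is the figure from the comment above, and request stands for whatever function actually calls the web service.

```python
import time


def pause_needed(elapsed, min_interval=1.0):
    # Time left to wait so that calls are at least min_interval apart.
    return max(0.0, min_interval - elapsed)


def throttled_call(request, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
    # Run the request, then sleep for whatever remains of the interval.
    started = clock()
    response = request()
    sleep(pause_needed(clock() - started, min_interval))
    return response
```

    This only limits the rate if calls are made one at a time, hence the single process/thread in the dedicated server instance.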