Slashdot Mirror


Extending and Embedding Perl

habit forming writes "Enjoy using Perl? Ever marvel at how Perl can "do the right thing" but still be written in C? Extending and Embedding Perl aims to take the black magic out of understanding our favorite language. In fact, the authors flat out admit they think it is unfair that so few of us get to have one foot in Perl and one in C. Tim Jenness and Simon Cozens attempt to break down that barrier with lots of annotated code examples, direct analogies from the structures in Perl to those in C, and a fine-grained look at XS and what it takes to robustly use a Perl interpreter in C."

Extending and Embedding Perl
author: Tim Jenness and Simon Cozens
pages: 375
publisher: Manning (http://www.manning.com/)
rating: 8.7 out of 10
reviewer: habit forming
ISBN: 1930110820
summary: Get in touch with the inner Perl.

What's that up your sleeve?

In my experience, many situations require us to "look under the hood" of a solution to understand how best to use it. Perl is no exception. The ability to bring such a force as Perl to a project at the proper time is a valuable skill to possess. However, wading chest-deep into XS and the Perl internals is not for the faint of heart. Jenness and Cozens ease this process by stepping in lightly at first.

What's in it?

The book begins with simple C examples that are then related back to the reader's knowledge of Perl. Then the text seems to throw us a curve by leaping off into building Perl modules. But there is method to the madness: building Perl modules correctly is inextricably linked to XS. Light introductions to XS follow, and the reader is soon well on the way to building .so extensions for any .pm.

After building a very specific foundation of simple C examples, module building, and some XS, the text returns to C to introduce pointers, arrays, file I/O and memory management. With these new skills, we begin to explore the structure and implementation of Perl variable types. Chapter 4 provides many useful diagrams of how Perl variables "look" and what C structures they translate into.
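
To give a flavour of the kind of mapping described here, below is a minimal sketch of a dynamically typed scalar expressed as a C struct. It is not Perl's actual SV layout and is not taken from the book: the names ToyScalar, toy_new_int, toy_new_str, toy_as_int and toy_free are hypothetical, and the design (a type tag plus a union) is an assumption chosen for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified scalar: a type tag plus a union.
 * Perl's real SV structures are considerably more involved. */
typedef enum { TOY_INT, TOY_STR } ToyType;

typedef struct {
    ToyType type;
    union {
        long  iv;   /* integer value */
        char *pv;   /* heap-allocated string value */
    } u;
} ToyScalar;

/* Create a scalar holding an integer. Returns NULL on allocation failure. */
ToyScalar *toy_new_int(long value) {
    ToyScalar *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->type = TOY_INT;
    s->u.iv = value;
    return s;
}

/* Create a scalar holding a private copy of the given string. */
ToyScalar *toy_new_str(const char *text) {
    ToyScalar *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    size_t len = strlen(text) + 1;
    s->u.pv = malloc(len);
    if (s->u.pv == NULL) { free(s); return NULL; }
    memcpy(s->u.pv, text, len);
    s->type = TOY_STR;
    return s;
}

/* Numeric view of the scalar: strings are converted on demand,
 * loosely imitating how a scripting language coerces "42" to 42. */
long toy_as_int(const ToyScalar *s) {
    assert(s != NULL);
    return s->type == TOY_INT ? s->u.iv : strtol(s->u.pv, NULL, 10);
}

/* Release the scalar and, for strings, the buffer it owns. */
void toy_free(ToyScalar *s) {
    if (s == NULL) return;
    if (s->type == TOY_STR) free(s->u.pv);
    free(s);
}
```

A caller would create a value with toy_new_str("42"), read it numerically with toy_as_int, and release it with toy_free. The point of the sketch is only that one C struct can stand behind a variable whose type is decided at run time.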

Still following a logical and consistent order, we explore the Perl 5 API, learning how to store and retrieve information in the variable types explored in the previous chapter. Much as it might seem like one, this is not a rehash of the perlapi doc. It is consistent with the perlapi doc, but Jenness and Cozens provide extensively annotated C code examples.

Casting deeper still, we add the advanced C of pointers, arrays, file I/O and memory management to our knowledge of XS. At this point we have everything we need to effectively extend Perl, but the text goes further by exploring how XSUBs interface with Perl's internals. It is only the clearly documented, step-by-step explanations of this chapter that make it manageable for an average user like myself. Chapter 7 ends our stint with XS by discussing some alternative XS (or equivalent code) generation suites.

Switching gears entirely, we grab libperl.a and stuff it into a C program. Chapter 8 begins the task of embedding Perl into a C program. Jenness and Cozens continue the embedding discussion through a case study in Chapter 9 and end with a look through the Perl internals in Chapter 10.

The final chapter (Chapter 11) details some of Perl's history, its development process, how we could become involved and what the future of Perl and Perl 6 may entail.

Final Thought

This book was indispensable in gaining a good foothold on using Perl in, from, and around C. I found it a good reference guide as well as an easy, linear read. It is not a replacement for the perlguts, perlapi and perlxs documentation, but then again, it doesn't try to be. The annotated code examples, with every line explained, make it a lot easier to follow the book while developing your own solution than some other books do. However, the in-depth explanations can be a bit frustrating for the impatient.

You can purchase Extending and Embedding Perl from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

26 of 145 comments

  1. XS Isn't the only way. by gorilla · · Score: 5, Informative

    For many applications, you might find it easier to use Inline::C or SWIG. Neither gives you the total power that XS does, but they're much easier to get into.

    1. Re:XS Isn't the only way. by Hornsby · · Score: 3, Informative

      And thank god it's not! XS can be a royal PITA. The marriage of Perl and C is a shotgun wedding at best, and it seems that the groom is wasted before the ceremony even starts.

      Python makes it much easier to interact with C libraries, and Ruby has the nicest C library support of all. Also, for embedding a programming language into an application, why not use Scheme? It was designed to be embedded from the beginning and should impose much less overhead. Since it's a functional language, it's also very well suited for AI, which makes it a good choice for games and such.

      I'm not knocking Perl. It has a special place in my heart as the first language I really learned; however, it's best used for what it's really good at, and that's scripting.

      --
      A musician without the RIAA, is like a fish without a bicycle.
  2. XS Mechanics by swm · · Score: 5, Informative
    For an overview of XS, see

    XS Mechanics

  3. Re:What's with the Related Links? by jamie · · Score: 2, Informative

    It's a bug. I'm fixing it.

  4. My $0.02 by dknj · · Score: 4, Informative

    This book explains how to expand the functionality and usefulness of the Perl programming language. This guide delves into the complex issues using real code examples from the Perl source. It details how to use Perl from C programs, such as writing interfaces to C libraries, implementing Perl callbacks for C libraries, and passing Perl hashes and arrays between Perl and C. Additionally, developers are provided with an API reference for the internal C interface to Perl and a reference on the typemap system.

    It's amazing how much this book covers: Not only does Sam Tregar show how object-oriented Perl modules are architected, how to write regression test suites, and how to extend Perl modules with C code, but he also gets the community aspects right -- how does your module get really popular? You can tell that Sam is a successful Perl module author himself.

    -dk

    1. Re:My $0.02 by chromatic · · Score: 4, Informative

      You seem to be thinking of Writing Perl Modules for CPAN. There are similarities, but the Jenness/Cozens book goes into more detail about XS than the Tregar book. That's to be expected.

  5. Re:Perl 5 API??? by Reality+Master+101 · · Score: 5, Informative

    Isn't Perl 6 coming out soon?

    "Soon"? Considering that they haven't even finished deciding the features and changes of Perl 6, I think it's safe to say that a release version is at least a few years off, with 50% adoption being another three years plus after that.

    --
    Sometimes it's best to just let stupid people be stupid.
  6. Re:Perl 5 API??? by WWWWolf · · Score: 2, Informative

    True, Perl 6 is coming, but the shape of the language is still being discussed, the virtual machine isn't doing that much yet, and there's not really anything substantial yet... it may take a year or two until Perl 6 is out (not sure about the developers' actual schedule, though).

    As far as I know, there aren't many (if any?) books that discuss XS or Perl embedding. It sure isn't covered that widely in the Camel or Ram, and the only reference has been "go RTF 'perldoc perlxs'"... =)

    And most importantly, the Perl 6 folks have not said a word about how XS and embedding stuff works in Perl 6. (I suspect that it will be radically different, because of the Parrot...)

  7. Re:Perl 5 API??? by MarcoAtWork · · Score: 3, Informative

    Some observations:

    Maybe the author will make more $$$ releasing the perl 5 book now, and the 'revised' perl 6 version next year :)

    Also don't forget the sometimes extremely long lead times for book publishing; it is entirely possible that the author finished this book 6+ months ago.

    And last but not least, yeah, perl 6 is going to come out soon, but do you really think I'm going to use it for production code right away? I really don't think so, perl 5 will be the tool of choice for quite a while longer.

    --
    -- the cake is a lie
  8. Re:Perl 5 API??? by hamsterboy · · Score: 5, Informative

    AFAIK, Perl 6 is a whole other beast. It's a complete rewrite, with changes to the core language.

    You'll still be able to run your Perl 5.x scripts under 6, but not vice-versa. Thus, with all the Perl 5.x scripts out in the wild, having a Perl 5 book around may still be handy.

    If you like analogies: why would you buy a C book when C++ has been around for years?

    -- Hamster

  9. Re:Perl: Fitting into the Big Picture by sohp · · Score: 5, Informative

    Java does everything that C++ does

    Uh, no. Thanks for playing. There are things that C++ does that Java does not -- some of which I'm thankful do not exist in Java (preprocessor) and some of which I miss (generics). But despite its C-like syntax and superficial resemblances (finalizers seem like destructors but aren't) Java is more like Smalltalk than C++.

    Take a quick gander at the section For C, C++ Fans in Peter van der Linden's Java Programmers FAQ.

    But then, why am I arguing over the relative merits of Perl, Java, C++, and C# with a user having the handle "Microsoft Research" who posts pure FUD?

  10. I think Perl5/XS will be with us for long time... by truth_revealed · · Score: 4, Informative
    Don't worry about Perl5/XS becoming obsolete - it will be with us for a long time since the Parrot (perl 6 VM) project seems to be going nowhere fast. Parrot is suffering from kitchen sink/second system syndrome.

    Recently, the DotGNU project made an overture to use the Parrot runtime for their C# compiler but found that Parrot needs a lot of work to get to the point where they could use it.

    Some Parrot VM problems:

    • no calling conventions yet for subroutines. There is no hope of offering mixed language support unless they do this.
    • no conversion opcodes for various builtin types (float, char, short, int)
    • non-perl languages expected to provide additional support in the form of C code libraries for their opcodes. This would nix any hope of having a single standard universal virtual machine.
    • no way to call out to C code
    • no equivalent of Java's jar file or CLR's assemblies for parrot library distribution
    • way too many registers: their register based VM (32 int registers, 32 double registers, 32 string registers, 32 PMC registers plus various stacks) requires a sophisticated compiler to do proper register allocation and needlessly complicates their VM.
    • no consideration of threads in their design. How will they handle synchronization, for example?

    The points above are not coding issues, but issues of design. It seems that Parrot is too hung up on making the VM efficient and is not seeing the bigger picture - to get the features in place first so that high-level languages can work. Or perhaps they should simply concentrate on getting Perl6 to work first. They need more focus. The project tries to be all things to all people, but ends up satisfying no one.
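
    For readers unfamiliar with the register-versus-stack distinction raised in the list above, here is a minimal sketch of a register-based interpreter loop. It is not Parrot's instruction set or design: the opcodes OP_SET, OP_ADD, OP_MUL and OP_END, the Instr layout and the function vm_run are all hypothetical. Only the figure of 32 integer registers is taken from the post.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_REGS = 32 };  /* 32 integer registers, the figure quoted above */

/* Hypothetical opcodes for illustration only. */
typedef enum { OP_SET, OP_ADD, OP_MUL, OP_END } Opcode;

typedef struct {
    Opcode op;
    int dst;   /* destination register */
    int a;     /* source register, or the immediate value for OP_SET */
    int b;     /* second source register (unused by OP_SET) */
} Instr;

static int valid_reg(int r) { return r >= 0 && r < NUM_REGS; }

/* Runs the program until OP_END and returns the value left in register 0. */
long vm_run(const Instr *prog) {
    long reg[NUM_REGS] = {0};
    for (size_t pc = 0; prog[pc].op != OP_END; pc++) {
        const Instr *i = &prog[pc];
        assert(valid_reg(i->dst));
        switch (i->op) {
        case OP_SET:
            reg[i->dst] = i->a;
            break;
        case OP_ADD:
            assert(valid_reg(i->a) && valid_reg(i->b));
            reg[i->dst] = reg[i->a] + reg[i->b];
            break;
        case OP_MUL:
            assert(valid_reg(i->a) && valid_reg(i->b));
            reg[i->dst] = reg[i->a] * reg[i->b];
            break;
        case OP_END:
            break;
        }
    }
    return reg[0];
}
```

    To evaluate (2 + 3) * 4, whoever generates the program has to decide that 2, 3 and 4 go in registers 1 to 3 and that the result lands in register 0. That choice is the register allocation the post refers to. A stack machine would push and pop operands instead and leave no such choice to make.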

  11. Save some money by Anonymous Coward · · Score: 3, Informative

    bn.com has the book for $35.96; Amazon has it for $31.47.





    1. Re:Save some money by marklark · · Score: 5, Informative

      also available from the publisher as an e-book for $22.47 (half SRP)

      http://www.manning.com/jenness/index.html

  12. Re:interpreter for other applications by Avalonia · · Score: 5, Informative

    We use the PerlStream classes to integrate a Perl interpreter into our C++ applications - it was covered in the January 2002 C/C++ Users Journal - with great success.

    Our ASCII file import parsers are written in Perl and the data read into Perl data-structures. The contents of these data structures can then be accessed directly from C++.

    The code is on the web (it has some subtle C++ bugs that need fixing with the base-from-member initialisation idiom) here

  13. umm... hello? by Ender+Ryan · · Score: 4, Informative
    There are plenty of reasons to embed Perl in C. The first thing that comes to my mind is using Perl as a scripting language for a game, which I have done. Write all the low level graphics stuff in C, then write the high level game logic in Perl.

    As for C in Perl... Perl is a scripting language, it's simply not fast enough for everything, and you're going to need C to access different things, like joysticks, video, graphics libraries, etc...

    --
    Sticking feathers up your butt does not make you a chicken - Tyler Durden
  14. pcre by larry+bagina · · Score: 4, Informative

    Why would you want to embed perl into a C application? Probably to access perl's powerful regular expressions. If that's the case, the Perl Compatible Regular Expressions library is a more ideal solution. PHP, Python, and Ruby all use it to support perl regexes.

    --
    Do you even lift?

    These aren't the 'roids you're looking for.

  15. Interfacing Perl with C... by Snake · · Score: 5, Informative
    Interfacing Perl with C has its uses, depending on what your current project is.

    In my case, I'm part of a large scale C++ project. I have the ownership of a module with clearly defined interfaces with the other modules written in this project.

    Since my module relies heavily on XML and strings, I have always wanted to pair it with the power of Perl for testing purposes.

    Among various possible solutions (XS, SWIG, etc.), I settled on SWIG because it could handle 'shallow' classes (allowing me to expose my module as a perl object).

    This has been the best decision I have made over the last year: when I get a bug case, I simply write a perl script to try to reproduce the problem, add some loops to get some combinatorial coverage, then check the result. This drastically cuts down on the time spent on debugging my module (or the modules used by it, for that matter :)

    Pros:

    • SWIG: Relatively easy generation of stub code (by using interface files)
    • SWIG: It is possible to use the same interface files to generate stub code for Java, Python (though, I didn't test this feature)
    • SWIG: Excellent doc.
    • Perl: you can leverage the CPAN/PPM modules to do some truly magical hackings
    Cons:

    • While SWIG does a good job of hiding the gory parts of Perl Internals, you still need to brush up on them to better understand how it works, if only to avoid memory leaks
    • Perl: I haven't been able to properly handle the passing of strings (I managed to do it by using a horrible hack that seems to work). I'm probably not smart enough to understand the documentation or the samples.
    • SWIG: the stub code is dependent on the version of perl used. It is therefore difficult to release it. This is mostly a dev tool.

    Summary: If you are a C/C++ developer and your code can use XML/text files/strings, consider using SWIG or XS for testing purposes.

    PS: if you want to Quantify/Purify your module/Perl script, using ActiveState Perl, you need to recompile Perl with the -DPURIFY option toggled on.

  16. Re:I think Perl5/XS will be with us for long time. by Elian · · Score: 5, Informative
    You missed the mark in just a few places here.
    1. "no calling conventions yet for subroutines." Alas, not true. They're in place and have been for quite some time.
    2. "no conversion opcodes for various builtin types" Also incorrect--we've got them and have since the very beginning. What we don't have is low-level support for specific size integers and floats, since our target languages (perl, python, ruby) don't have them. Adding them adds no overhead, though, so they're going in since it'll make base .NET and JVM compatibility simpler.
    3. "non-perl languages expected to provide additional support in the form of C code libraries for their opcodes." Once again, incorrect. No external libraries are or will be required. We're Turing complete and generally have a richer set of base semantics than .NET or the JVM does. This doesn't mean that someone can't choose to require an alternate set of opcodes if they want, but the engine doesn't require it.
    4. "no way to call out to C code" This one's actually true. We've not gotten to that part yet, though it's on the list.
    5. "no equivalent of Java's jar file or CLR's assemblies for parrot library distribution" Incorrect, bytecode files work just fine for this. That part of the design is somewhat incomplete, though.
    6. "way too many registers: their register based VM ... requires a sophisticated compiler to do proper register allocation and needlessly complicates their VM" Wrong yet again, sorry. Doesn't require a sophisticated compiler at all. At best it requires some sophisticated register allocation algorithms. Luckily for us, those algorithms are old, known territory, which is why we've got them implemented already. So what if it makes the VM slightly more complex? (And it is only slightly more complex because of it) We're not writing something for CS101 here.
    7. "no consideration of threads in their design." Once again, incorrect, if you'd bothered to read any of the design documents. Threading isn't, at the moment, implemented because other things have been more important, but it has been thought about in the design.
    1.5 for 7. Not too good there. Oh, and this:
    seems that Parrot is too hung up on making the VM efficient
    What drugs are you on? If we piss away efficiency in the design, no amount of clever coding will ever get it back. Maybe you're willing to sacrifice a factor of two in speed for "clarity" but so are nine of your friends. (To misquote the author of make) I, on the other hand, am not.

    Finally

    Or perhaps they should simply concentrate on getting Perl6 to work first. They need more focus. The project tries to be all things to all people, but ends up satisfiying no one.
    If you'll look, you'll notice that perl 6 isn't fully designed yet, but the bits that are have been implemented.

    Just because you can't (or won't, or don't want to) see the focus doesn't mean it's not there. It is, thanks very much, and we're well on track to do what we need and do it well. The design's flexible enough to pick up things like JVM or .NET compatibility without a loss of focus or efficiency, so there's no reason not to.

  17. The trouble with mixed-language work by Animats · · Score: 3, Informative
    If you interface a language with automatic memory management with one with manual memory management, like Java or Perl with C or C++, the low-level stuff has to be very, very carefully written to prevent breaking the memory management system. Most application programmers aren't good enough to write the bulletproof code needed to do that right.

    The result, of course, is undebuggable random crashes in the high-level part of the system. Here are some typical bug reports from mixed Perl/C work:

    • #1: Okay, it seems to be some kind of conflict between mod_perl/Embperl and PHP and perhaps Apache::DBI. My Embperl stuff works if there's no database access. It also works if I don't load libphp4.so. I guess the best solution is to either build everything statically or run separate servers for PHP and mod_perl.
    • #2: Following a large number of updates to our database, slapd is prone to crashing when reading values back. We load a database of about 3800 users with slapadd, then modify a single attribute of every 'person'. Then slapd is likely to crash on reading values back. Restarting slapd seems to make it work again. Just prior to the crash, slapd will give incorrect query results. ... We have a large client site limping along due to this kind of problem ... so any help would be welcome.
    After this, you begin to understand the logic behind Microsoft's C# mixed-language run-time environment. That's ugly, too, but more maintainable, because the toolset has some support for mixed-language work.

    I'd like to see safe inter-language calls across a protection boundary. CORBA is about as good as it gets, but it's slow, because it marshals the data into a stream and pumps it through a socket to the other side. There are faster approaches (look at Multics protection rings) but they need some hardware support, which we don't have today.
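
    The bookkeeping hazard described at the top of this comment can be sketched in plain C. This is a toy reference-counted object and an assumption about the general technique, not Perl's or Java's actual API: RcObj, rc_new, rc_retain, rc_release and the rc_live counter are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted value, standing in for an object
 * owned by a scripting-language runtime. */
typedef struct {
    int refcount;
    int payload;
} RcObj;

/* Number of objects currently alive; used only to observe leaks. */
static int rc_live = 0;

RcObj *rc_new(int payload) {
    RcObj *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    o->refcount = 1;        /* the creator holds the first reference */
    o->payload = payload;
    rc_live++;
    return o;
}

/* C code that keeps a pointer beyond the current call must retain it. */
RcObj *rc_retain(RcObj *o) {
    assert(o != NULL && o->refcount > 0);
    o->refcount++;
    return o;
}

/* Every retain (and the initial reference) needs exactly one release.
 * Too few releases leak; too many free memory that is still in use. */
void rc_release(RcObj *o) {
    assert(o != NULL && o->refcount > 0);
    if (--o->refcount == 0) {
        rc_live--;
        free(o);
    }
}
```

    If the low-level code stores a pointer without calling rc_retain, the runtime may free the object while C still points at it. If it retains without a matching rc_release, the object never goes away. Neither mistake fails at the call site, which is consistent with the crashes showing up later in the high-level part of the system.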

  18. Tcl and other languages by DavidNWelton · · Score: 2, Informative
    Tcl certainly isn't as cool as Perl, but the implementation is very, very beautifully written C, and combining C and Tcl is a pleasure. It's fun, simple and easy. You also have access to a lot of neat internal features.


    Perl, IMO, is the worst of the scripting languages to combine with C... the interface is not pretty. Other languages like Python aren't bad. Lua is good if you want something really small and fast.

  19. Re:I think Perl5/XS will be with us for long time. by Elian · · Score: 5, Informative

    You seem to be missing the point with some regularity. PMCs don't have to be written in C--they will be doable in parrot code if need be. And just because 64 bit integers will be done with PMCs doesn't mean that they won't be part of the core distribution, or a recommended library.

    There's nothing particularly wrong with saying "You must have the X library/module/kit to do Y". Requiring the install of the .NET library to get full .NET functionality will undoubtedly be needed. (We're certainly not going to ship the full .NET core library with Parrot any more than we're going to ship the full Java core library) If we don't ship 64 bit ints as part of the core, they'll either be in the .NET library, or in an extended data type library.

    What's next, will you start complaining that we're going to require installing Postgres to access Postgres databases? (Or will the next complaint be about the bloated size of the distribution to provide the features that match your expectations?)

  20. Re:I think Perl5/XS will be with us for long time. by Anonymous Coward · · Score: 1, Informative

    x86 lacks key builtin opcodes for universal language support - like support for 64 bit ints.

    PPC G4 lacks key builtin opcodes for universal language support - like support for 64 bit ints.

    Java VM lacks key builtin opcodes for universal language support - like support for 64 bit ints.

    You do understand that Parrot is a VM, don't you?

  21. Re:The author talks about alternatives by egarland · · Score: 2, Informative

    Chapter 7 is alternatives to XS. He talks about SWIG and Inline::C as well as some other lesser known things like PDL::PP.

    --
    set softtabstop=4 shiftwidth=4 expandtab nocp worlddomination
  22. Re:Perl 5 API??? by bcrowell · · Score: 3, Informative
    And most importantly, the Perl 6 folks have not said a word about how XS and embedding stuff works in Perl 6.
    They have said a lot about it: XS is going away completely, and interfacing C to Perl is going to get a lot easier. (The quote I remember from their FAQ was something like "how could it be any harder than it already is?")

    As far as embedding Perl, well, the Perl interpreter is going to be written in Perl starting with Perl 6. Instead of embedding an interpreter, I think you'd just embed a Parrot VM and hook your compiled Parrot bytecode into your program.

  23. Re:Not intended as a flamebait, but... by chromatic · · Score: 2, Informative

    It's a good thing, according to Larry, because Perl more closely maps to spoken language patterns. His theory is that the computer should do extra work to make the life of a programmer easier.

    I don't understand your comment about interpreting that code as a boolean construct. That's exactly how Perl does evaluate it. See B::Deparse for clarification.