# $Id: GUIDE,v 1.7 2005/07/22 20:17:44 michal Exp $ THE PIRATE GUIDE understanding the code even if you don't know python ############ INSTALLATION ############ Install parrot and python. Pirate should work with the latest versions of parrot and any any python version 2.2 or above. ## PARROT ## When you build Parrot, make sure --prefix is set correctly (e.g., --prefix=`pwd`) when you run Configure.pl, so that parrot can properly locate the python PMC's. For example: svn co https://svn.perl.org/parrot/trunk parrot cd parrot perl Configure.pl --prefix=`pwd` make ## PYTHON ## You can compile python 2.3 or above to work as a shared library with: ./configure --enable-share make Once you've got a decent parrot and a decent python, run the tests: ################# RUNNING THE TESTS ################# The next thing to do (once you have parrot and pirate installed) is run the tests. If everything is working, it looks like this: % python PirateTest.py ............................................ -------------------------------------------- Ran 56 tests in 3.354s OK (where 56 is however many tests there are) If the tests aren't working on a fresh install of pirate and parrot, it either means: - something in parrot has changed - someone checked in code that didn't pass the tests. - your python compiler package is incompatible with mine (2.3 is latest python, I'm using 2.2.3) But, hopefully it's working and we can move on. :) In most cases, creating a test is easy. Simply place a testname.py file in the test/python directory and PirateTest will automatically run the test through the Python interpreter and capture the output, and then compile the same source to Parrot, run it, and compare the outputs. If they are the same, the test passes; if not, the test fails... what can be simpler? You can also run any individual test like this: % python pirate.py test/python/print.py hello, world! There may also be some pirate specific functions that require testing... for example, the PARROT_INLINE function. For these, simply check in both the python source and the expected output (with an extension of .out) into the test/pyrate directory. Tests that require additional code to implement are placed directly in PirateTest.py. If you look in there, you'll see there's a utility function called trim() (nice for indenting multi-line strings in python code) and then a class called PirateTest with a bunch of test_xyz methods. Each of these methods calls self.run() on a chunk of python code. run() compiles the code and invokes parrot to see what the code does. Once we have the output, we assert what the result should be. If the assertion holds true, you get a dot when you run the tests. If the assertion fails, you'll get a traceback instead. For example, if I break test_print so that it looks like this: def test_print(self): res = self.run( """ print 'hello,', print 'world!' """) self.assertEquals(res, "it's broke!\n") Then the tests show something like this: ..............F......... ====================================================================== FAIL: test_print (__main__.PirateTest) ---------------------------------------------------------------------- Traceback (most recent call last): File "PirateTest.py", line 47, in test_print self.assertEquals(res, "it's broken!\n") File "/usr/local/lib/python2.2/unittest.py", line 286, in failUnlessEqual raise self.failureException, \ AssertionError: 'hello, world!\n' != "it's broken!\n" ---------------------------------------------------------------------- Ran 23 tests in 1.443s FAILED (failures=1) Yuck. So keep the tests running! :) #################################### THE PIRATE CODE GENERATOR FOR PYTHON #################################### Because so many languages have similar control structures, the plan is to eventually have a generic code generator that can be used for multiple languages. But for now, the code generator is python-specific. If you open pirate.py and scroll through, you will see that most of the file is one huge class called PirateVisitor and it has a bunch of visitXxxxx and yyyyExpression methods. This class is meant to be used by the python compiler package. The compile() function near the bottom of the file returns a tree-like structure (called "ast" for abstract syntax tree) and we instantiate a PirateVisitor to walk through it. NOTE: See Klaas-Jan Stol's report for an in-depth look at how parsers, ASTs, and compilers in general work. His report is about a Lua->Parrot compiler (also called pirate) but most of it applies to python too: http://members.home.nl/joeijoei/parrot/report.pdf All the work of building the AST gets done in python's parser and compiler packages. We just walk the tree for each node, and call the appropriate method on our PirateVisitor instance. The PirateVisitor builds up a list of strings as we go, where each string is a line of parrot intermediate code. Each visitXxxx method corresponds to a keyword or operator in the python language, and contains a template for generating the corresponding intermediate code. Let's look at two examples. First, find the "visitPrint" method in pirate.py. You will see that visitPrint calls the __py__print routine (which is written in parrot and found in the file pirate.imc )... No big deal. Now find "visitWhile". This one is a template for a "while" loop and is slightly more complex. In fact, the code is a little hard to read. Currently there's a lot of list-appending and string interpolation going on, so these templates can be a little messy. It might help to just try it out. ## examining the compiled code ############################# Create a simple python file like this: while 1: pass Save that as while.py and run this: ./pirate.py -d while.py The -d option tells pirate to dump out the source code. Unless visitWhile() has been improved, this should print: .sub __main__ new_pad 0 newsub P0, .Exception_Handler, __py__catch set_eh P0 newclass P0, "PythonIterator" setline 1 while0: unless 1 goto elsewhile0 # (visitWhile:712) noop # (visitPass:816) goto while0 # (visitWhile:714) elsewhile0: endwhile0: end # (compile:1019) .end .include 'pirate.imc' Which, if you ran it through parrot, would loop forever. :) Let's look a little closer at the generated code. The lines at the top without comments are boilerplate code that set up a default exception handler. The middle lines show the intermediate code that pirate created. The comments on the right indicate which method and line in the *compiler* resulted that particular line of intermediate code. The last line includes the pirate standard library, which defines runtime support routines written in parrot intermediate code. The lines ending in colons are labels, that allow you to jump to different places in the code. You'll notice the labels all end with 0. That's a product of PirateVisitor.gensym(), which appends a unique number to a string so we don't get a bunch of symbol clashes in code we pass to parrot. (No good saying "goto endwhile" if there's 50 "endwhile" labels in the code) ## dumping code for the tests ######################## In addition to pirate's -d flag for dumping a file, you can You can also dump the code that each of the test cases uses. Most of the tests pass dump=0 to self.run (and if one doesn't, you can add it in). Change the 0 to 1 and you'll see the code. The other parameter, lines, will show line numbers for the IMC code -- helpful if imcc gives you a nasty message about something on line 23. :) ## more on PirateVisitor ############################## Okay, moving back to pirate.py: the other major thing to look at in PirateVisitor is the expression() and various .xxExpression() (soon to be expressXXXX) methods. As the expression() docstring explains, standard python uses a stack-based machine, so the standard bytecode just push and pop values on and off a stack. In pirate, we use registers instead, so when we visit an expression we have to pass in a destination register, telling parrot where to store the value. The expression() method takes care of this, and you'll notice these methods all have a "dest" parameter. ## subroutine compiler: PirateSubVisitor ############## About the only other tricky thing right now is PirateSubVisitor. This is actually just a subclass of PirateVisitor, so it can do anything the normal visitor does. The only difference is that when .genCode() (which returns the IMC code) finally gets called, it adds some code to handle the parrot calling conventions. PirateSubVisitor is used for lambdas and subroutines. It works exactly the same as PirateVisitor except the generated code is just a little bit different. The main PirateVisitor invokes PirateSubVisitor whenever it sees a "lambda" or "def" statement, collects the generated code in a separate list called self.subs. Anyway, that's all for now. Hopefully it'll get you started working with Pirate. :)