
Planet Python

Last update: September 03, 2010 02:47 AM

September 03, 2010


John Cook

Bug in SciPy’s erf function

Last night I produced the plot below and was very surprised at the jagged spike. I knew the curve should be smooth and strictly increasing.

My first thought was that there must be a numerical accuracy problem in my code, but it turns out there’s a bug in SciPy version 0.8.0b1. I started to report it, but I saw there were similar bug reports and one such report was marked as closed, so presumably the fix will appear in the next release.

The problem is that SciPy’s erf function is inaccurate for arguments with imaginary part near 5.8. For example, Mathematica computes erf(1.0 + 5.7i) as -4.5717×10^12 + 1.04767×10^12 i. SciPy computes the same value as -4.4370×10^12 + 1.3652×10^12 i. The imaginary component is off by about 30%.
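The 30% figure can be checked with a few lines of arithmetic. This is only a sketch that plugs in the two values quoted above; it does not call SciPy or Mathematica, and the helper name relative_error is made up for the illustration.

```python
# Values quoted in the post for erf(1.0 + 5.7i).
mathematica = complex(-4.5717e12, 1.04767e12)   # reference value
scipy_080b1 = complex(-4.4370e12, 1.3652e12)    # value from SciPy 0.8.0b1

def relative_error(approx, exact):
    """Relative error of approx with respect to exact (hypothetical helper)."""
    return abs(approx - exact) / abs(exact)

real_err = relative_error(scipy_080b1.real, mathematica.real)
imag_err = relative_error(scipy_080b1.imag, mathematica.imag)

print("real part off by %.1f%%" % (100 * real_err))
print("imaginary part off by %.1f%%" % (100 * imag_err))
```

The real part is off by roughly 3%, while the imaginary part is off by roughly 30%, which matches the claim above.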

Here is the code that produced the plot.

from scipy.special import erf
from numpy import linspace, exp, sqrt  # sqrt is needed by g()
import matplotlib.pyplot as plt

def g(y):
    z = (1 + 1j*y) / sqrt(2)
    temp = exp(z*z)*(1 - erf(z))
    u, v = temp.real, temp.imag
    return -v / u

x = linspace(0, 10, 101)
plt.plot(x, g(x))
plt.show()

September 03, 2010 12:19 AM

September 02, 2010


Menno's Musings

IMAPClient 0.6.1 released

I've just released IMAPClient 0.6.1.

The only functional change in the release is that it now automatically patches imaplib's IMAP4_SSL class to fix Python Issue 5949. This is a bug that's been fixed in later Python 2.6 versions and 2.7 but still exists in Python versions that are in common use. Without this fix you may experience hangs when using SSL.

The patch is only applied if the running Python version is known to be one of the affected versions. It is applied when IMAPClient is imported.

The only other change in this release is that I've now marked IMAPClient as "production ready" on PyPI and have updated the README to match. This was prompted by a request to clarify the current status of the project. Seeing that all current functionality is solid and that I don't plan to change the existing APIs in backwards-incompatible ways, I've decided to indicate the project as suitable for production use.

As always, IMAPClient can be installed from PyPI (pip install imapclient) or downloaded from the IMAPClient site. Feedback, bug reports and patches are most welcome.

September 02, 2010 09:56 PM


Matthew Rollings

Find words with the most anagrams efficiently using python

Following my previous post about 9 letter anagrams I am posting the final code I have created, taking into account suggestions/snippets from Michael, Toby and Martin. I added two variables to make it nice and easy to modify what to look for.

Code

# -*- coding: utf-8 -*-
from time import time
from collections import defaultdict  

ag_len = 10 # Anagram word length
ag_min = 2  # Min # of anagrams
dictionary_path = '/usr/share/dict/british-english'
tic = time()

wd = defaultdict(set)
for l in open (dictionary_path, 'r'):
	l=l.strip()
	if ag_len==len(l):
		wd["".join(sorted(l))].add (l)

for ws, wl in wd.iteritems():
	if len ( wl ) >= ag_min:
		print " ".join ( wl )

toc = time()
print toc-tic,'s'

Explanation
The dictionary file is filtered by length into a dictionary. The key for the dictionary is the letters of the word sorted in order, i.e.:

"".join(sorted('arranging')) = 'aagginnrr'

The value is a set of the unsorted words. Because words that are anagrams of each other are identical once their letters are sorted, adding each word to the set stored under its sorted key makes all anagrams share the same key. E.g.:

When the loop reaches megatons in the word list, it creates a new key in the dictionary like so:
{'aegmnost': set(['megatons'])}

Then to magnetos
{'aegmnost': set(['magnetos', 'megatons'])}

Then to montages:
{'aegmnost': set(['magnetos', 'megatons', 'montages'])}

Then we loop over all the items in the dictionary we created and check whether the number of words stored under each key is at least the minimum we are looking for.

All done, a very elegant and simple method to find words with several anagrams for a given word length.
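The same idea can be tried without a dictionary file. The sketch below applies the sorted-letters key to a small in-memory word list; the function name group_anagrams and the sample list are made up for the illustration, and the words come from the examples in this post.

```python
from collections import defaultdict

def group_anagrams(words, length, minimum):
    """Group words of the given length by their sorted letters.

    Returns the groups with at least `minimum` members, each group sorted
    alphabetically, and the list of groups sorted as well so the output
    is deterministic.
    """
    groups = defaultdict(set)
    for word in words:
        if len(word) == length:
            groups["".join(sorted(word))].add(word)
    return sorted(sorted(g) for g in groups.values() if len(g) >= minimum)

sample = ["megatons", "magnetos", "montages",
          "painters", "pertains", "triangle", "anthologies"]
result = group_anagrams(sample, 8, 2)
for group in result:
    print(" ".join(group))
```

Here "triangle" is dropped because no other word in the sample shares its key, and "anthologies" is dropped by the length filter.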

Results

I was going to post the interesting 10 letter anagrams I found; however, I couldn’t find any with more than 2 anagrams with the dictionary I was using.

There is an 11 letter triple anagram:

anthologies anthologise theologians

and some 8 letter with 4 or more anagrams:

painters pertains pantries repaints
resident nerdiest inserted trendies
salesmen lameness nameless maleness
strainer restrain terrains retrains trainers
altering triangle relating integral alerting
rangiest ingrates angriest gantries
parroted predator teardrop prorated
iterates teariest treatise treaties
trounces counters recounts construe

September 02, 2010 09:39 PM


Yaniv Aknin

Python’s Innards: Hello, ceval.c!

The “Python’s Innards” series owes its existence, at least in part, to hearing one of the Python-Fu masters in my previous workplace say something about a switch statement so large that it had to be broken up just so some compilers won’t choke on it. I remember thinking then: “Choke the compiler with a switch? Hrmf, let me see that code.” Turns out that this switch can be found in ./Python/ceval.c: PyEval_EvalFrameEx and it switches over the current opcode, invoking its implementation. If I had to summarize all of CPython into one line, I’d probably choose that switch (actually I’d refuse, but humour me by assuming I was at gunpoint or something). This choice is rather subjective, as arguably there are more complex/interesting bits in Python’s object system (explored here and there) or parser/compiler related code. But I can’t help seeing that line, and its surrounding function and file, as the ‘do-work’ heart of CPython.

The reason I didn’t start the series from this heart is that I thought it would be too hard (mostly for the author…). Thanks to what we (well, at least I) learned in the previous posts, I think we can now understand it quite well. I’ll try to link backwards as necessary throughout the article, but if you haven’t followed the series so far, you’d probably do much better if you went back and read some of the previous articles before tackling this one. Also, for brevity’s sake in this post, I won’t qualify the file ./Python/ceval.c and the function PyEval_EvalFrameEx in it. Finally, remember that usually in the series when I quote code, I may note that I edited it, and in that case I often prefer clarity and brevity over accuracy; this is true for this post as well, only much more so, excerpts here might bear only slight resemblance to the real code.

So, where were we… Ah, yes, monstrous switch statement. Well, as I said, this switch can be found in the rather lengthy file ceval.c, in the rather lengthy function PyEval_EvalFrameEx, which takes more than half the file’s lines (it’s roughly 2,250 lines, the file is about 4,400). PyEval_EvalFrameEx implements CPython’s evaluation loop, which is to say that it’s a function that takes a frame object and iterates over each of the opcodes in its associated code object, evaluating (interpreting, executing) each opcode within the context of the given frame (this context is chiefly the associated namespaces and interpreter/thread states). There’s more to ceval.c than PyEval_EvalFrameEx, and we may discuss some of the other bits later in this post (or perhaps a follow-up post), but PyEval_EvalFrameEx is obviously the most important part of it.

Having described the evaluation loop in the previous paragraph, let’s see what it looks like in C (edited):

PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
{
    /* variable declaration and initialization stuff */
    for (;;) {
        /* do periodic housekeeping once in a few opcodes */
        opcode = NEXTOP();
        if (HAS_ARG(opcode)) oparg = NEXTARG();
        switch (opcode) {
            case NOP:
                goto fast_next_opcode;
            /* lots of more complex opcode implementations */
            default:
                /* become rather unhappy */
        }
        /* handle exceptions or runtime errors, if any */
    }
    /* we are finished, pop the frame stack */
    tstate->frame = f->f_back;
    return retval;
}

As you can see, iteration over opcodes is infinite (forever: fetch next opcode, do stuff); breaking out of the loop must be done explicitly. CPython (reasonably) assumes that evaluated bytecode is correct in the sense that it terminates itself by raising an exception, returning a value, etc. Indeed, if you were to synthesize a code object without a RETURN_VALUE at its end and execute it (exercise to reader: how? [1]), you’re likely to execute rubbish, reach the default handler (raises a SystemError) or maybe even segfault the interpreter (I didn’t check this thoroughly, but it looks plausible).
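To get a feel for the shape of such a loop without reading C, here is a toy model in Python. It is only a sketch under heavy assumptions: instructions are (name, argument) tuples rather than real bytecode, the function run and its parameters are invented for the illustration, and nothing here mirrors CPython's actual data structures.

```python
def run(code, consts, names, global_vars):
    """Toy evaluation loop: fetch an opcode, dispatch on it, repeat forever."""
    stack = []
    pc = 0  # index of the next instruction
    while True:  # the loop never ends by itself; an opcode has to end it
        opcode, oparg = code[pc]
        pc += 1
        if opcode == "NOP":
            continue
        elif opcode == "LOAD_CONST":
            stack.append(consts[oparg])
        elif opcode == "LOAD_GLOBAL":
            stack.append(global_vars[names[oparg]])
        elif opcode == "BINARY_SUBTRACT":
            w = stack.pop()
            v = stack.pop()
            stack.append(v - w)
        elif opcode == "RETURN_VALUE":
            return stack.pop()
        else:
            # the 'become rather unhappy' branch
            raise SystemError("unknown opcode: %r" % (opcode,))

# Roughly 'return a - 5' with the global a set to 10.
code = [("LOAD_GLOBAL", 0), ("LOAD_CONST", 1),
        ("BINARY_SUBTRACT", None), ("RETURN_VALUE", None)]
result = run(code, consts=(None, 5), names=("a",), global_vars={"a": 10})
print(result)
```

In this toy, a program with no RETURN_VALUE simply runs off the end of the list and raises IndexError; the real interpreter has no such safety net, which is why the consequences described above are nastier.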

The evaluation loop may look fairly simple so far, but I kept back an important piece: I snipped about 1,450 lines of opcode implementations from within that big switch, all of them presumably more complex than a NOP. In order for you to be able to get a feel for what more serious opcode implementations look like, here’s the (edited) implementation of three more opcodes, illustrating a few more principles:

            case BINARY_SUBTRACT:
                w = *--stack_pointer; /* value stack POP */
                v = stack_pointer[-1];
                x = PyNumber_Subtract(v, w);
                stack_pointer[-1] = x; /* value stack SET_TOP */
                if (x != NULL) continue;
                break;
            case LOAD_CONST:
                x = PyTuple_GetItem(f->f_code->co_consts, oparg);
                *stack_pointer++ = x; /* value stack PUSH */
                goto fast_next_opcode;
            case SETUP_LOOP:
            case SETUP_EXCEPT:
            case SETUP_FINALLY:
                PyFrame_BlockSetup(f, opcode, INSTR_OFFSET() + oparg,
                           STACK_LEVEL());
                continue;

We see several things. First, we see a typical value manipulation opcode, BINARY_SUBTRACT. This opcode (and many others) works with values on the value stack as well as with a few temporary variables, using CPython’s C-API abstract object layer (in our case, a function from the number-like object abstraction) to replace the two top values on the value stack with the single value resulting from subtraction. As you can see, a small set of temporary variables, such as v, w and x are used (and reused, and reused…) as the registers of the CPython VM. The variable stack_pointer represents the current top of the stack (the next free slot in the stack). This variable is initialized at the beginning of the function like so: stack_pointer = f->f_stacktop;. In essence, together with the room reserved in the frame object for that purpose, the value stack is this pointer. To make things simpler and more readable, the real (unedited by me) code of ceval.c defines several value stack manipulation/observation macros, like PUSH, TOP or EMPTY. They do what you imagine from their names.
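The "pointer plus reserved room" idea can be mimicked in Python with a preallocated list and an integer index. This is a rough sketch, not CPython's implementation: the class ValueStack and its method names are invented here, loosely modelled on the macro names mentioned above.

```python
class ValueStack(object):
    """Toy value stack: preallocated room plus an index of the next free slot."""

    def __init__(self, size):
        self.slots = [None] * size  # the room reserved for the stack
        self.sp = 0                 # next free slot, playing stack_pointer

    def push(self, value):          # like *stack_pointer++ = value
        self.slots[self.sp] = value
        self.sp += 1

    def pop(self):                  # like *--stack_pointer
        self.sp -= 1
        return self.slots[self.sp]

    def top(self):                  # like stack_pointer[-1]
        return self.slots[self.sp - 1]

    def set_top(self, value):       # like stack_pointer[-1] = value
        self.slots[self.sp - 1] = value

    def empty(self):
        return self.sp == 0

def binary_subtract(stack):
    """Same sequence of steps as the BINARY_SUBTRACT excerpt above."""
    w = stack.pop()
    v = stack.top()
    stack.set_top(v - w)

stack = ValueStack(4)
stack.push(10)
stack.push(5)
binary_subtract(stack)
print(stack.top())
```

Two values go in and one comes out, so the index ends up one slot above where it started.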

Next, we see a very simple opcode that loads values from somewhere into the value stack. I chose to quote LOAD_CONST because it’s very brief and simple, although it’s not really a namespace related opcode. “Real” namespace opcodes load values into the value stack from a namespace and store values from the value stack into a namespace; LOAD_CONST loads constants, but doesn’t fetch them from a namespace and has no STORE_CONST counterpart (we explored all this at length in the article about namespaces). The final opcode I chose to show is actually the single implementation of several different control-flow related opcodes (SETUP_LOOP, SETUP_EXCEPT and SETUP_FINALLY), which offload all details of their implementation to the block stack manipulation function PyFrame_BlockSetup; we discussed the block stack in our discussion of interpreter stacks.

Something we can observe looking at these implementations is that different opcodes exit the switch statement differently. Some simply break, and let the code after the switch resume. Some use continue to start the for loop from the beginning. Some goto various labels in the function. Each exit has different semantic meaning. If you break out of the switch (the ‘normal’ route), various checks will be made to see if some special behaviour should be performed – maybe a code block has ended, maybe an exception was raised, maybe we’re ready to return a value. Continuing the loop or going to a label lets certain opcodes take various shortcuts; no use checking for an exception after a NOP or a LOAD_CONST, for instance.

That’s pretty much it. I can’t really say we’re done (not at all), but this is pretty much the gist of PyEval_EvalFrameEx. Simple, eh? Well, yeah, simple, but I lied a bit with the editing to make it simpler. For example, if you look at the code itself, you will see that none of the case expressions for the big switch are really there. The code for the NOP opcode is actually (remember this series is about Python 3.x unless noted otherwise, so this snippet is from Python 3.1.2):

        TARGET(NOP)
            FAST_DISPATCH();

TARGET? FAST_DISPATCH? What are these? Let me explain. Things may become clearer if we’d look for a moment at the implementation of the NOP opcode in ceval.c of Python 2.x. Over there the code for NOP looks more like the samples I’ve shown you so far, and it actually seems to me that the code of ceval.c gets simpler and simpler as we look backwards at older revisions of it. The reason is that although I think PyEval_EvalFrameEx was originally written as a really exceptionally straightforward piece of code, over the years some necessary complexity crept into it as various optimizations and improvements were implemented (I’ll collectively call them ‘additions’ from now on, for lack of a better term).

To further complicate matters, many of these additions are compiled conditionally with preprocessor directives, so several things are implemented in more than one way in the same source file. In the larger code samples I quoted above, I liberally expanded some preprocessor directives using their least complex expansion. However, depending on compilation flags, these and other preprocessor directives might expand to something else, possibly a more complicated something. I can understand trading simplicity to optimize a tight loop which is used very often, and the evaluation loop is probably one of the more used loops in CPython (and probably as tight as its contributors could make it). So while this is all very warranted, it doesn’t help the readability of the code.

Anyway, I’d like to enumerate these additions here explicitly (some in more depth than others); this should aid future discussion of ceval.c, as well as prevent me from feeling like I’m hiding too many important things with my free spirited editing of quoted code. Fortunately, most if not all these additions are very well commented; actually, some of the explanations below will be just summaries or even taken verbatim from these comments, as I believe that they’re accurate (eek!). So, as you read PyEval_EvalFrameEx (and indeed ceval.c in general), you’re likely to run into any of these:

“Threaded Code” (Computed-GOTOs)

Let’s start with the addition that gave us TARGET, FAST_DISPATCH and a few other macros. The evaluation loop uses a “switch” statement, which decent compilers optimize as a single indirect branch instruction with a lookup table of addresses. Alas, since we’re switching over rapidly changing opcodes (it’s uncommon to have the same opcode repeat), this would have an adverse effect on the success rate of CPU branch prediction. Fortunately gcc supports the use of C-goto labels as values, which you can generally pass around and place in an array (restrictions apply!). Using an array of addresses in memory obtained from labels, as you can see in ./Python/opcode_targets.h, we create an explicit jump table and place an explicit indirect jump instruction at the end of each opcode. This improves the success rate of CPU prediction and can yield as much as 20% boost in performance.

Thus, for example, the NOP opcode is implemented in the code like so:

        TARGET(NOP)
            FAST_DISPATCH();

In the simpler scenario, this would expand to a plain case statement and a goto, like so:

        case NOP:
            goto fast_next_opcode;

But when threaded code is in use, that snippet would expand to (I highlighted the lines where we actually move on to the next opcode, using the dispatch table of label-values):

        TARGET_NOP:
            opcode = NOP;
            if (HAS_ARG(NOP))
                oparg = NEXTARG();
        case NOP:
            {
                if (!_Py_TracingPossible) {
                    f->f_lasti = INSTR_OFFSET();
                    goto *opcode_targets[*next_instr++];
                }
                goto fast_next_opcode;
            }

Same behaviour, somewhat more complicated implementation, up to 20% faster Python. Nifty.

Opcode Prediction

Some opcodes tend to come in pairs. For example, COMPARE_OP is often followed by JUMP_IF_FALSE or JUMP_IF_TRUE, themselves often followed by a POP_TOP. What’s more, there are situations where you can determine that a particular next-opcode can be run immediately after the execution of the current opcode, without going through the ‘outer’ (and expensive) parts of the evaluation loop. PREDICT (and a few others) are a set of macros that explicitly peek at the next opcode and jump to it if possible, shortcutting most of the loop in this fashion (i.e., if (*next_instr == op) goto PRED_##op). Note that there is no relation to real hardware here, these are simply hardcoded conditional jumps, not an exploitation of some mechanism in the underlying CPU (in particular, it has nothing to do with “Threaded Code” described above).
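The effect of such a shortcut can be imitated in a toy interpreter. The sketch below is an illustration only: the opcode names COMPARE_LT and POP_JUMP_IF_FALSE are simplified stand-ins, LOAD_CONST carries its constant directly as the argument, and the trips counter is invented to make the saving visible. Python has no goto, so the predicted opcode's work is repeated inline instead of jumping to a label.

```python
def run(code, predict=True):
    """Run a toy program; return (result, trips through the top of the loop)."""
    stack = []
    pc = 0      # index of the next instruction
    trips = 0   # how often we paid for the 'outer' part of the loop
    while True:
        trips += 1
        opcode, oparg = code[pc]
        pc += 1
        if opcode == "LOAD_CONST":
            stack.append(oparg)
        elif opcode == "COMPARE_LT":
            w = stack.pop()
            v = stack.pop()
            stack.append(v < w)
            # PREDICT-style shortcut: peek at the next opcode and, if it is
            # the expected jump, run it right here instead of looping again.
            if predict and code[pc][0] == "POP_JUMP_IF_FALSE":
                target = code[pc][1]
                pc += 1
                if not stack.pop():
                    pc = target
        elif opcode == "POP_JUMP_IF_FALSE":
            if not stack.pop():
                pc = oparg
        elif opcode == "RETURN_VALUE":
            return stack.pop(), trips
        else:
            raise SystemError("unknown opcode: %r" % (opcode,))

# Roughly: return "yes" if 1 < 2 else "no"
program = [
    ("LOAD_CONST", 1),
    ("LOAD_CONST", 2),
    ("COMPARE_LT", None),
    ("POP_JUMP_IF_FALSE", 6),
    ("LOAD_CONST", "yes"),
    ("RETURN_VALUE", None),
    ("LOAD_CONST", "no"),
    ("RETURN_VALUE", None),
]

with_prediction = run(program, predict=True)
without_prediction = run(program, predict=False)
print(with_prediction, without_prediction)
```

Both runs return the same value, but the predicted run passes through the top of the loop one time fewer, which is the whole point of the shortcut.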

Low Level Tracing

An addition primarily geared towards those developing CPython (or suffering from a horrible, horrible bug). Low Level Tracing is controlled by the LLTRACE preprocessor name, which is enabled by default on debug builds of CPython (see --with-pydebug). As explained in ./Misc/SpecialBuilds.txt: when this feature is compiled-in, PyEval_EvalFrameEx checks the frame’s global namespace for the variable __lltrace__. If such a variable is found, mounds of information about what the interpreter is doing are sprayed to stdout, such as every opcode and opcode argument and values pushed onto and popped off the value stack. Not useful very often, but very useful when needed.

This is what the low level trace output looks like (slightly edited):

>>> def f():
...     global a
...     return a - 5
...
>>> dis(f)
  3           0 LOAD_GLOBAL              0 (a)
              3 LOAD_CONST               1 (5)
              6 BINARY_SUBTRACT
              7 RETURN_VALUE
>>> exec(f.__code__, {'__lltrace__': 'foo', 'a': 10})
0: 116, 0
push 10
3: 100, 1
push 5
6: 24
pop 5
7: 83
pop 5
# trace of the end of exec() removed
>>>

As you can guess, you’re seeing a real-time disassembly of what’s going through the VM as well as stack operations. For example, the first line says: at bytecode offset 0, do opcode 116 (LOAD_GLOBAL) with the operand 0 (expands to the global variable a), and so on, and so forth. This is a bit like (well, little more than) adding a bunch of printf calls to the heart of the VM.
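Reading the raw numbers gets easier with a small decoder. The sketch below annotates the trace lines shown above; the opcode table is copied by hand from the dis listing in this post (Python 3.1.2) and should be treated as an assumption, since opcode numbers differ between Python versions. The function name annotate is made up.

```python
# Opcode numbers as they appear in the trace and dis listing above.
OPNAMES = {
    116: "LOAD_GLOBAL",
    100: "LOAD_CONST",
    24: "BINARY_SUBTRACT",
    83: "RETURN_VALUE",
}

def annotate(trace_line):
    """Turn '3: 100, 1' into 'offset 3: LOAD_CONST 1'; pass other lines through."""
    head, sep, rest = trace_line.partition(": ")
    if not sep or not head.isdigit():
        return trace_line  # a 'push ...' or 'pop ...' line
    fields = [int(field) for field in rest.split(", ")]
    name = OPNAMES.get(fields[0], "<%d>" % fields[0])
    if len(fields) > 1:
        return "offset %s: %s %d" % (head, name, fields[1])
    return "offset %s: %s" % (head, name)

trace = ["0: 116, 0", "push 10", "3: 100, 1", "push 5",
         "6: 24", "pop 5", "7: 83", "pop 5"]
annotated = [annotate(line) for line in trace]
for line in annotated:
    print(line)
```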

Advanced Profiling

Under this heading I’d like to briefly discuss several profiling related additions. The first relies on the fact that some processors (notably Pentium descendants and at least some PowerPCs) have built-in wall time measurement capabilities which are cheap and precise (correct me if I’m wrong). As an aid in the development of a high-performance CPython implementation, Python 2.4’s ceval.c was instrumented with the ability to collect per-opcode profiling statistics using these counters. This instrumentation is controlled by the somewhat misnamed --with-tsc configuration flag (TSC is an Intel Pentium specific name, and this feature is more general than that). Calling sys.settscdump(True) on an instrumented interpreter will cause the function ./Python/ceval.c: dump_tsc to print these statistics every time the evaluation loop loops.

The second advanced profiling feature is Dynamic Execution Profiling. This is only available if Python was built with the DYNAMIC_EXECUTION_PROFILE preprocessor name. As ./Tools/scripts/analyze_dxp.py says, [this] will tell you which opcodes have been executed most frequently in the current process, and, if Python was also built with -DDXPAIRS, will tell you which instruction _pairs_ were executed most frequently, which may help in choosing new instructions. One last thing to add here is that enabling Dynamic Execution Profiling implicitly disables the “Threaded Code” addition.
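The kind of statistics Dynamic Execution Profiling gathers can be imitated on a made-up list of executed opcodes. This is a sketch of the counting idea only; it does not use or reproduce the real instrumentation, and the function name profile and the sample sequence are assumptions for the illustration.

```python
from collections import Counter

def profile(executed):
    """Count single opcodes and adjacent opcode pairs in an execution sequence."""
    singles = Counter(executed)
    pairs = Counter(zip(executed, executed[1:]))
    return singles, pairs

# A made-up record of opcodes in the order they were executed.
executed = ["LOAD_GLOBAL", "LOAD_CONST", "BINARY_SUBTRACT",
            "LOAD_GLOBAL", "LOAD_CONST", "BINARY_SUBTRACT",
            "RETURN_VALUE"]
singles, pairs = profile(executed)

for opcode, count in sorted(singles.items()):
    print(opcode, count)
for pair, count in sorted(pairs.items()):
    print(pair, count)
```

A pair that shows up very often in such counts is a natural candidate for a combined instruction or a prediction shortcut.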

The third and last addition in this category is function call profiling, controlled by the preprocessor name CALL_PROFILE. Quoting ./Misc/SpecialBuilds.txt again: When this name is defined, the ceval mainloop and helper functions count the number of function calls made. It keeps detailed statistics about what kind of object was called and whether the call hit any of the special fast paths in the code.

Extra Safety Valves

Two preprocessor names, USE_STACKCHECK and CHECKEXC, include extra assertions. Testing an interpreter with these enabled may catch a subtle bug or regression, but they are usually disabled as they’re too expensive.

These are the additions I found, grepping ceval.c for #ifdef. I think we’ll call it a day here, although we’re by no means finished. For example, I’d like to devote a separate post to exceptions, which is where we can discuss the tail of the evaluation loop (everything after the big switch and before the end of the big for), which we merely skimmed today. I’d also like to devote a whole post to locking and synchronization (including the GIL), which we touched upon before but never covered properly. Last but really not least, there’s about 2,000 other lines in ceval.c which we didn’t cover today; none of them are as important as PyEval_EvalFrameEx, but we need to talk at least about some of them.

All these things taken into account, I think we can say that today we finally conquered the evaluation loop. This isn’t the end of the series, far from it, but I do see it as a milestone. “Hooray”, I believe the saying goes. I hope you’re enjoying the show, thanks for the supportive comments (they keep me going), and I’ll see you in the next post.


I would like to thank Nick Coghlan for reviewing this article; any mistakes that slipped through are my own.

[1] Lazy or timid readers may choose to defer to Nick Coghlan’s example of one way he did it; I urge you not to look there and solve it on your own, it’s rather easy.


Tagged: bytecode, code objects, evaluation, evaluation loop, frame object, internals, python

September 02, 2010 06:19 PM


Duncan McGreggor

HCI at Canonical


uTouch

Back in March, I blogged about future possibilities (in a blue-sky sense) of multi-touch, mentioning the project management I was doing for MT hardware kernel driver support in Lucid (and then proceeding to dive into the deep end of speculation). It's now an Ubuntu cycle later, and holy crap... I'm having a hard time finding the words. I think the blog title says it all. But I'll try to elaborate :-)

Unless you've been living under a rock, you've probably noticed the big announcements we made a few weeks ago:
For the next few days, we were all over Google news. This was quite a shock, given that we'd been heads-down into the project for so long and hadn't really come up for air nor fully anticipated the impact (to others or ourselves). Needless to say, after the intense amount of work that the team had engaged in over the previous couple months, this was quite gratifying, if somewhat unexpected.

There has been a lot of discussion in blog posts, mail lists, IRC (#ubuntu-touch on freenode.net), Launchpad bugs and merge proposals, etc., so much so that touchscreens now pursue me feverishly when I sleep at night. I'm really not interested in writing more of the same :-)

As such, I want to mix things up a bit...

HCI Remixed

I've been reading an amazing anthology of essays on human-computer interaction. I still haven't finished the book (yeah, I've got about 10 in-progress titles on my nightstand), but am relishing every word in this particular collection. The book is HCI Remixed: Reflections on Works That Have Influenced the HCI Community.

While doing some research at the beginning of the Maverick development cycle, I came across HCI Remixed at the local library -- the title intrigued me and I couldn't resist. Weeks later, after having maxed out the number of times I could renew the book, I just purchased it -- I simply couldn't get enough of the book. Every essay I'd read up to that point was fantastic; each one provided volumes of information, experiences, insights, ideas for follow-up, etc. Whenever I finished one essay, I spent days and sometimes weeks reading up on references, pondering the past and future of human-computer interaction.

Due to the unusual nature of the book, describing it is surprisingly difficult. That being said, the MIT Press page gives you a great taste:
Over almost three decades, the field of human-computer interaction (HCI) has produced a rich and varied literature. Although the focus of attention today is naturally on new work, older contributions that played a role in shaping the trajectory and character of the field have much to tell us. The contributors to HCI Remixed were asked to reflect on a single work at least ten years old that influenced their approach to HCI. The result is this collection of fifty-one short, engaging, and idiosyncratic essays, reflections on a range of works in a variety of forms that chart the emergence of a new field.
If you're into HCI, learning from others, and discovering new sources of inspiration for your own work, this is simply a must-have book :-)

A Small Piece of History

By the time I checked the book out of the Golden public library, it was May and we had begun building the MT team. By July -- once it became clear how astounding the team's work was -- I realized that in 10 or 20 years I could very well be writing an article about Henrik, Chase, Stephen, Ikbel, and Rafi. Much like those in the book, I could be sharing the conversations I'd had with Stéphane Chatty, Mark Shuttleworth, Neil Patel, David Siegel, and John Lea. And that's only the crew with which I was collaborating or discussing directly. There are a lot of folks who've been working very hard on multi-touch infrastructure solutions and exploring ways of integrating these for several years (e.g., Peter Hutterer and Carlos Garnacho).

Though many foundations have been laid, as of yet (to the best of my knowledge), no Linux distribution has released a multi-touch stack that integrated gestures in a unified manner across everything from applications to window managers and beyond. This was something that Mark wanted us to provide to the open source world. In this spirit, the multitouch team hasn't just hacked things together to get a product out in time. A lot of generative, creative thought and care has gone into uTouch. A lot of original problem solving has taken place. Physics PhDs, kernel hackers, X.org hackers, driver creators, application integrators, toolkit gurus -- all of this knowledge was concentrated, applied, and used to distill a first approximation of what a gesture stack in Linux could look like, using the latest available technology and methodologies.

To be honest, we weren't really sure we could pull it off. There was a very good chance we could have failed at our task, quietly chalking up the loss as a lesson learned. Now that we've managed to shape these ideas into actual software, taken the threads of dreams and woven something real, we are thrilled to be engaging with others to see where all of us can take multi-touch and gestures from here.

Thanks to expert input from the wider open source community, we're already looking at ways in which we can improve upon the first version, ways of bringing new ideas and experiences to developers and users of multi-touch hardware running Linux. Things are only just warming up, and the greatest contributions have yet to be made. Every single person in the community has before them a world of possibilities for getting involved and creating the future human-computer interfaces for the free and open source world in the coming weeks and months. These are indeed exciting times.

September 02, 2010 06:56 AM


Python User Groups

pyCologne Python User Group Cologne - Meeting, September 08, 2010, 6.30pm

Hello,

The next meeting of pyCologne will take place:

Wednesday, September 8th
starting about 6.30 pm - 6.45 pm
at Room 0.14, Benutzerrechenzentrum (RRZK-B)
University of Cologne, Berrenrather Str. 136, 50937 Köln, Germany


Agenda:

 - Cooking Eggs - A distutils & setuptools recipe book (Christopher Arndt)
 - Lightning Talks


Further discussion topics, news, book presentations etc. are welcome at each of our meetings!

At about 8.30 pm we will as usual enjoy the rest of the evening in a nearby restaurant.

Further information, including directions to the location, can be found at:

http://www.pycologne.de


(Sorry, the web-links are in German only.)

Best Wishes,

Andi (on behalf of pyCologne)

September 02, 2010 03:52 AM


Matthew Rollings

9 letter words with several anagrams

While perusing the statistics of wordcube, I was wondering how many 9 letter words have multiple anagrams (using all the letters in a single word) and what was the maximum number of anagrams. So I wrote a quick and dirty python program to find out. I will first show the results as they are interesting followed by my coding and methods to improve the efficiency of it.

Results
Here are all the nine letter words with more than 2 anagrams:

I only found 12 sets of 3; there may be more with a larger dictionary. I was also disappointed, though not entirely surprised, that there were no words with 4 anagrams. My personal favourite is number 10.


Python

I recycled an anagram checking function that I have used before:

# -*- coding: utf-8 -*-

# Anagram checking function
def anagramchk(word,chkword):
	for letter in word:
		if letter in chkword:
			chkword=chkword.replace(letter, '', 1)
		else:
			return 0
	return 1 
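One caveat about this function: it only checks that every letter of the first word can be found in the second, so it also accepts a shorter word against a longer one. That does not matter when every word in the file has 9 letters, but it is worth knowing. The sketch below repeats the function and compares it with a stricter check; is_anagram is a name invented for the illustration.

```python
def anagramchk(word, chkword):
    # Same logic as the function above: consume one matching letter
    # of chkword for every letter of word.
    for letter in word:
        if letter in chkword:
            chkword = chkword.replace(letter, '', 1)
        else:
            return 0
    return 1

def is_anagram(a, b):
    """Stricter check: both words must use exactly the same letters."""
    return sorted(a) == sorted(b)

print(anagramchk("abc", "abcd"))         # accepted, although not an anagram
print(is_anagram("abc", "abcd"))         # rejected
print(is_anagram("triangle", "integral"))
```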

First program

Firstly I created a dirty program that created a loop to cycle through the 9 letter word dictionary and another loop nested inside to check against every word in the dictionary again. This is a terrible and inefficient method and will create duplicates, I will follow with a more efficient method.

g=open('eng-9-letter', 'r')
for l in g:

	wordin=l.strip()

	f=open('eng-9-letter', 'r')
	count=0
	w=""
	for line in f:
		line=line.strip()
		if anagramchk(line,wordin):
			count+=1
			w+=" "+line
	f.close()
	if count>2:
		print wordin, count, "(",w,")"

g.close()

This program took 80.42s to find the 12 solutions. On the path to better coding I decided to load the dictionary into memory; this sped the code up by about 17s, to 63.88s.

# Load dictionary into memory
dic=[]
f=open('eng-9-letter', 'r')
for line in f:
	dic.append(line.strip())
f.close()

I then attempted to create a method that loops over and removes words from the dictionary as it loops, however I don’t know the correct way (if there is one?) of modifying the loop variable while inside the loop without causing problems.

for word in dic:
	if ....:
		dic.remove(word)

If anyone knows a good method of doing this please let me know! I did manage to hack together something using slices so that I could modify the dictionary each time, however I imagine this is still quite inefficient.

for word in dic[:]:
	w=""
	count=0
	for word2 in dic[:]:
		if anagramchk(word,word2):
			count+=1
			dic.remove(word2)
			w+=word2+" "
	if count>2:
		print w

Even so, this method now avoids duplication of results and completes in 31.87s (machine running at 3.15 GHz). Please let me know of any improvements you think can be made and I’ll happily benchmark to see how much better it is.
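One improvement that is often suggested for this kind of problem is to avoid comparing words pairwise at all. The sketch below is not the method used above: it sorts the letters of each word into a "signature" and groups words that share one, so the dictionary is only walked once and nothing has to be removed from a list while looping over it. The function names anagram_groups and load_words are my own, and it has not been benchmarked against the timings quoted here.

```python
from collections import defaultdict


def anagram_groups(words, min_size=3):
    """Group words that are anagrams of each other.

    Two words are anagrams exactly when their sorted letters match,
    so the sorted letters serve as a dictionary key.
    """
    groups = defaultdict(list)
    for word in words:
        signature = ''.join(sorted(word))
        groups[signature].append(word)
    # Keep only groups with at least min_size members (the word itself counts)
    return [group for group in groups.values() if len(group) >= min_size]


def load_words(path):
    """Read one word per line, skipping blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]
```

Calling anagram_groups(load_words('eng-9-letter')) should return the same sets as the nested loops, each one exactly once.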

September 02, 2010 12:06 AM

September 01, 2010


Evan Fosmark

SSL support in asynchat.async_chat

A while back I needed to be able to use SSL connections in async_chat, but I found it to be horribly incompatible. After quite a bit of investigation I found a suitable solution.

import asynchat
import asyncore
import socket
import ssl
import errno
 
class async_chat_ssl(asynchat.async_chat):
    """ Asynchronous connection with SSL support. """
 
    def connect(self, host, use_ssl=False):
        self.use_ssl = use_ssl
        if use_ssl:
            self.send = self._ssl_send
            self.recv = self._ssl_recv
        asynchat.async_chat.connect(self, host)
 
    def handle_connect(self):
        """ Initializes SSL support after the connection has been made. """
        if self.use_ssl:
            self.ssl = ssl.wrap_socket(self.socket)
            self.set_socket(self.ssl)
 
    def _ssl_send(self, data):
        """ Replacement for self.send() during SSL connections. """
        try:
            result = self.write(data)
            return result
        except ssl.SSLError as why:
            if why.args[0] in (asyncore.EWOULDBLOCK, errno.ESRCH):
                # Nothing could be written right now; report zero bytes sent
                return 0
            else:
                raise
 
    def _ssl_recv(self, buffer_size):
        """ Replacement for self.recv() during SSL connections. """
        try:
            data = self.read(buffer_size)
            if not data:
                self.handle_close()
                return ''
            return data
        except ssl.SSLError as why:
            if why.args[0] in (asyncore.ECONNRESET, asyncore.ENOTCONN,
                               asyncore.ESHUTDOWN):
                self.handle_close()
                return ''
            elif why.args[0] == errno.ENOENT:
                # Required in order to keep it non-blocking
                return ''
            else:
                raise

It should work as a drop-in replacement for typical uses of asynchat.async_chat. To use SSL, just set the flag when connecting:

connect(host, use_ssl=True)

It would be nice if SSL support with asynchat.async_chat worked by default. Hopefully I'm not the only one who finds the above solution useful.

And as always, if you see any errors above, I encourage you to post a comment explaining them!

September 01, 2010 10:02 PM


Mario Boikov

Python Koans - A Great Way to Learn Python!

I just found out about Python Koans by Greg Malcolm (thanks dude) after listening to the from python import podcast podcast (which I find amusing, thanks guys).

It's an awesome way to learn Python. Instead of just reading tutorials and/or books you learn Python by coding.

The interactive tutorial is built around unit tests: you advance and gain new skills by passing tests, and it's really fun. You do learn a lot about the Python language when doing the Koans, so I recommend it even if you've been using Python for a while.

Another cool thing is that you learn how to do unit testing in Python, if you're not already familiar with it.



September 01, 2010 10:48 PM


Vern Ceder

Python for Linux at OLF

I’m in the final (but not as final as I would like) stages of preparing for my day-long tutorial at Ohio LinuxFest. OLF, as we call it, is a great event, with some good keynotes, interesting talks, and even maddog. Not to mention first rate tutorials, such as, oh… “Python for Linux System Administration”.

The morning session I’ll spend on basics – writing scripts that illustrate control flow, lists, dictionaries, strings, etc. from the point of view of some basic sysadmin scenarios. I’ll also introduce the basics of the subprocess module to call other Linux tools.

Then in the afternoon session, we’ll look at some more involved tasks, like traversing file systems, regular expressions, daemons, using the network, etc.

I’m looking forward to it – I think it will be a blast.

So if anyone has any cool intersections between Python and Linux sysadmin you wouldn’t mind me stealing, or any other suggestions or words of wisdom, by all means let me know.


September 01, 2010 07:58 PM


Imaginary Landscape

Our Django Server Setup: How and Why

 

One of the most important decisions you make in the process of building a new Django application is what software stack you use to serve it to the world. You're not lacking for options: people run Django on Apache, lighty, nginx, and Cherokee. You also need to decide how ...

September 01, 2010 05:38 PM


Roberto Alsina

Goodreads+webcam+python+zbar == hackfun!

I am a big fan of GoodReads, a social network for people who read books.

I read a lot, and I like that I can see what other people think before starting a book, and I can put my short reviews, and I can see what I have been reading, and lots more.

In fact, goodreads is going to be a big part of a project I am starting with some PyAr guys.

One thing I have been lazy about is adding my book list to goodreads, because it's a bit of a chore.

Well, chore no more!

Here's how to do it, the hacker way...

  1. Get zbar
  2. Get a cheap webcam
  3. Get a book
  4. Get a 7-line python program (included below)

Now watch the video...

Cute, isn't it?

Here's the code:

import os

p=os.popen('/usr/bin/zbarcam','r')
while True:
    code = p.readline()
    print 'Got barcode:', code
    isbn = code.split(':')[1]
    os.system('chromium http://www.goodreads.com/search/search?q=%s'%isbn)
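If you want something a little more defensive, here is a sketch of the parsing step as its own function. It assumes, as the split(':') above implies, that each zbarcam line looks like "type:data"; the sample line used in the note below and the name search_url_from_scan are my own inventions for illustration. It strips the trailing newline, returns None for empty or unexpected lines instead of crashing, and URL-quotes the value.

```python
from urllib.parse import quote

SEARCH_URL = 'http://www.goodreads.com/search/search?q=%s'


def search_url_from_scan(line):
    """Return the Goodreads search URL for one scanner output line, or None."""
    symbology, sep, data = line.strip().partition(':')
    if not sep or not data:
        # Empty line (end of input) or a line without "type:data" shape
        return None
    return SEARCH_URL % quote(data)
```

A scanned line such as 'EAN-13:9780132350884' would then map to a search for 9780132350884, and the loop can stop cleanly when the function returns None.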

September 01, 2010 04:36 PM


Brett Cannon

What will forever be exclusive to Python 3?

[2010-08-26: remove PEP 3109 and 3110 as they are both syntactically supported in Python 2.6
 2010-09-01: remove mention of built-ins returning iterators]

A question on Stack Overflow about what is exclusive to Python 3 came up and I realized that there is no clear list of big changes that you cannot access in Python 2.7 through a __future__ import. So I figured I would go through the What's New docs for Python 3.0, 3.1, and 3.2a1 (although the What's New doc has not been written yet) and see what has (not) been backported of significance.

If something is available in Python 2.6 without a __future__ import I will not list it here (e.g., new octal literals, bytes literal, and str.format). I also don't touch the C API. Otherwise stuff that is crossed out has been backported in Python 2.7 or is in Python 2.6 with a __future__ import. Everything else you have to make the switch to Python 3 to get the feature.

So what does the list tell us? First, a ton of syntactic cleanup only appears in Python 3, which is not surprising. Second, there are still plenty of reasons, from both a development perspective and a performance one, to look forward to moving over to Python 3 when you can.

September 01, 2010 04:02 PM


Richard Tew

Roguelike MUD progress #2

Previous post: Roguelike MUD progress.

I wasn't feeling very motivated when the time I had set aside tonight to work on this came around, but once I got into it I made pretty good progress. The bugs I listed yesterday are now fixed, and additionally the field of view emphasis is working. It's annoying that the stupid mistakes are the ones that take the longest to track down. In this case, absently writing or instead of and.

if y in self.drawRangesNew:
    minX, maxX = self.drawRangesNew[y]
    if x >= minX or x <= maxX:    # the mistake: "or" should be "and"
        return True

Maybe I should reconsider minX <= x <= maxX, although it never quite seems right.
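To make the difference concrete, here is a small standalone sketch (the function names are mine, not from the game code). With "or", every x passes as long as minX is not greater than maxX, because any number is either at least minX or at most maxX. The "and" form and the chained comparison are equivalent to each other.

```python
def in_range_buggy(x, min_x, max_x):
    # True for every x when min_x <= max_x: one side always holds
    return x >= min_x or x <= max_x


def in_range_fixed(x, min_x, max_x):
    # Both bounds must hold
    return x >= min_x and x <= max_x


def in_range_chained(x, min_x, max_x):
    # Python evaluates this as (min_x <= x) and (x <= max_x)
    return min_x <= x <= max_x
```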


The FoV changes are not as optimal as they could be. Every tile in the post-move FoV tile set is individually marked up with escape codes; I should instead simply mark each row of qualifying tiles in one go.
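A rough sketch of what row-based markup could look like follows. It is not the game's code: it assumes each display row is a plain string, that the visible columns of a row form one contiguous range (as the minX/maxX pairs suggest), and that emphasis is done with the ANSI bold sequence. The name emphasize_row is hypothetical.

```python
BOLD = "\x1b[1m"    # ANSI "bold on"
RESET = "\x1b[0m"   # ANSI "reset attributes"


def emphasize_row(row_text, min_x, max_x):
    """Wrap columns min_x..max_x (inclusive) of one row in a single escape pair."""
    return (row_text[:min_x]
            + BOLD + row_text[min_x:max_x + 1] + RESET
            + row_text[max_x + 1:])
```

This emits two escape sequences per row rather than two per tile.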

Current TODO list:
  1. Clean up the FoV emphasis to be row-based, rather than tile-based.
  2. Add some objects and entities to the world, that can be interacted with. Entities should move to add life to the world.

September 01, 2010 02:37 PM


Montreal Python User Group

Next Montréal

The technology scene of Montréal is a very vibrant one. With groups such as ourselves, OWASP, JS-Montreal, Montreal.rb, and PHP-Québec, and with events such as WordCamp, PodCamp, Startup Drinks, and Startup Camp, you end up with weeks during which all your evenings are booked before lunchtime on Monday. Yet it also happens that you have a guest in town and want to show them how active your city is, without knowing exactly where to take them.

Fortunately, some members of the community decided to take matters into their own hands and to show everyone what is going on in the tech and startup scene here in Montréal. Next Montréal is a blog featuring news and opinion from the Web, mobile, and gaming communities. The site is piloted by a handful of Montréal entrepreneurs, engaging us with interviews with the local players and giving us a good feel for who’s working on what and what’s the next big thing. Beyond interviews, Next Montréal brings together the community by posting job opportunities and a calendar of events.

Next Montréal is a great initiative and we hope to see more Python projects featured there.

September 01, 2010 02:21 PM


Salman Haq

Sneak Peek: GAE Channel API

At Google IO 2010, the app engine team announced that they had a Channel API in the works. This week I got invited by Moishe Lettvin of the Channel API team to join a handful of developers to try it out. The api is undocumented at the moment and can be considered in private alpha.

I haven’t had the chance to actually use the api yet but I have studied the examples (there are only two) and browsed the private mailing list to better understand it. Here are a few things that I have understood about the api:

The basic javascript client code is quite simple:

var channel = new goog.appengine.Channel(channel_id);
var socket = channel.open();
socket.onopen = function() {
  window.setTimeout(function() { sendMessage('connected'); }, 100);
};
socket.onmessage = function(evt) {
  var o = JSON.parse(evt.data);
  // ... app logic ...
};

and the server-side Python code is not too bad either:

from google.appengine.api import channel
from google.appengine.api import users

# creating a channel
user = users.get_current_user()
id = channel.create_channel(user)

# sending a message to that channel later in the code
channel.send_message(user, "json formatted message")

As you can see, the API is quite simple. Hopefully it will prove to be a boon for developers trying to build distributed multi-user/multi-player applications using the app engine platform.
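Since the client calls JSON.parse on whatever arrives, the string handed to send_message needs to be valid JSON. Here is a minimal sketch of building such a message with the standard json module; the "kind" and "payload" field names and the helper names are my own invention, not part of the Channel API.

```python
import json


def encode_message(kind, payload):
    """Serialise a message so the browser's JSON.parse(evt.data) can read it."""
    return json.dumps({"kind": kind, "payload": payload})


def decode_message(raw):
    """Inverse of encode_message, handy for testing on the server side."""
    return json.loads(raw)
```

The encoded string would be passed as the second argument to channel.send_message in place of the placeholder text above.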

September 01, 2010 01:50 PM


Grig Gheorghiu

MySQL InnoDB hot backups and restores with Percona XtraBackup

I blogged a while ago about MySQL fault-tolerance and disaster recovery techniques. At that time I was experimenting with the non-free InnoDB Hot Backup product. In the meantime I discovered Percona's XtraBackup (thanks Robin!). Here's how I tested XtraBackup for doing a hot backup and a restore of a MySQL database running Percona XtraDB (XtraBackup works with vanilla InnoDB too).

First of all, I use the following Percona .deb packages on a 64-bit Ubuntu Lucid EC2 instance:


# dpkg -l | grep percona
ii  libpercona-xtradb-client-dev      5.1.43-xtradb-1.0.6-9.1-60.jaunty.11 Percona SQL database development files
ii  libpercona-xtradb-client16        5.1.43-xtradb-1.0.6-9.1-60.jaunty.11 Percona SQL database client library
ii  percona-xtradb-client-5.1         5.1.43-xtradb-1.0.6-9.1-60.jaunty.11 Percona SQL database client binaries
ii  percona-xtradb-common             5.1.43-xtradb-1.0.6-9.1-60.jaunty.11 Percona SQL database common files (e.g. /etc
ii  percona-xtradb-server-5.1         5.1.43-xtradb-1.0.6-9.1-60.jaunty.11 Percona SQL database server binaries


I tried using the latest stable XtraBackup .deb package from the Percona downloads site but it didn't work for me. I started a hot backup with /usr/bin/innobackupex-1.5.1 and it ran for a while before dying with "InnoDB: Operating system error number 9 in a file operation." See this bug report for more details.

After unsuccessfully trying to compile XtraBackup from source, I tried XtraBackup-1.3-beta for Lucid from the Percona downloads. This worked fine.

Here's the scenario I tested against a MySQL Percona XtraDB instance running with DATADIR=/var/lib/mysql/m10 and a customized configuration file /etc/mysql10/my.cnf. I created and attached an EBS volume which I mounted as /xtrabackup on the instance running MySQL.

1) Take a hot backup of all databases under that instance:

/usr/bin/innobackupex-1.5.1 --defaults-file=/etc/mysql10/my.cnf --user=root --password=xxxxxx /xtrabackup

This will take a while and will create a timestamped directory under /xtrabackup, where it will store the database files from DATADIR. Note that the InnoDB log files are not created unless you apply step 2 below.

As the documentation says, make sure the output of innobackupex-1.5.1 ends with:

100901 05:33:12 innobackupex-1.5.1: completed OK!

2) Apply the transaction logs to the datafiles just created, so that the InnoDB logfiles are recreated in the target directory:

/usr/bin/innobackupex-1.5.1 --defaults-file=/etc/mysql10/my.cnf --user=root --password=xxxxxx --apply-log /xtrabackup/2010-09-01_05-21-36/
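
If you run these steps from a script, the "completed OK!" check from step 1 can be automated. Below is a sketch in Python under a few assumptions of mine: the helper names are hypothetical, the script's stdout and stderr are merged so the final message is captured whichever stream it goes to, and a run counts as successful only if the exit status is zero and the last non-blank line ends with "completed OK!".

```python
import subprocess

INNOBACKUPEX = "/usr/bin/innobackupex-1.5.1"


def backup_completed_ok(output):
    """True if the last non-blank line of the output ends with 'completed OK!'."""
    lines = [line for line in output.splitlines() if line.strip()]
    return bool(lines) and lines[-1].rstrip().endswith("completed OK!")


def run_innobackupex(args):
    """Run innobackupex with the given argument list and report success."""
    proc = subprocess.Popen([INNOBACKUPEX] + list(args),
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    output, _ = proc.communicate()
    return proc.returncode == 0 and backup_completed_ok(output)
```

The arguments are the same ones shown in the commands above, passed as a list of strings.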

At this point, I tested a disaster recovery scenario by stopping MySQL and moving all files in DATADIR to a different location.

To bring the databases back to normal from the XtraBackup hot backup, I did the following:

1) Brought back up a functioning MySQL instance to be used by the XtraBackup restore operation:

i) Copied the contents of the default /var/lib/mysql/mysql database under /var/lib/mysql/m10/ (or you can recreate the mysql DB from scratch)

ii) Started mysqld_safe manually:

 mysqld_safe --defaults-file=/etc/mysql10/my.cnf

This will create the data files and logs under DATADIR (/var/lib/mysql/m10) with the sizes specified in the configuration file. I had to wait until the messages in /var/log/syslog told me that the MySQL instance was ready and listening for connections.

2) Copied back the files from the hot backup directory into DATADIR

Note that the copy-back operation below initially errored out because it tried to copy the mysql directory too, and it found the directory already there under DATADIR. So the 2nd time I ran it, I moved /var/lib/mysql/m10/mysql to mysql.bak. The copy-back command is:

/usr/bin/innobackupex-1.5.1 --defaults-file=/etc/mysql10/my.cnf --user=root --copy-back /xtrabackup/2010-09-01_05-21-36/

You can also copy the files from  /xtrabackup/2010-09-01_05-21-36/ into DATADIR using vanilla cp.

3) If everything went well in step 2, restart the MySQL instance to make sure everything is OK.

At this point, your MySQL instance should have its databases restored to the point where you took the hot backup. If that instance is used in replication, you will most likely need to adjust MASTER_LOG_FILE and MASTER_LOG_POS (via CHANGE MASTER TO) so that it gets back in sync with its master.

Note that XtraBackup can also run in a 'stream' mode useful for compressing the files generated by the backup operation. Details in the documentation.

September 01, 2010 01:28 PM


Isotoma

Annoying CSS3 Baseline Alignment Problem in Firefox

CSS3 transform enables the rotation of elements including HTML text. If you intend to use it you should be aware that Firefox 3.6.8 and below have very poor baseline alignment.

The heading is just about acceptable. Content text is not.

Firefox 3.6.8:

Webkit:

So be warned if you intend to use CSS transform on text.

September 01, 2010 09:04 AM


S. Lott

Using SCons

In looking at Application Lifecycle Management (see "ALM Tools"), I had found that SCons appears to be pretty popular. It's not as famous as all the make variants, or Apache Ant or Apache Maven, but it seems to have a niche in the forest of Build Automation Software.


While it looks nice, parts of SCons are confusing. I struggled until I found a simple use case.

More: Empirical Comparison of SCons and GNU Make.
"SCons proved to be more accurate, mostly due to its stateful, content-based signature model.

On the other hand, GNU Make proved to be more resource friendly, especially regarding the memory footprint. SCons needs to address this problem to be a viable alternative to Make when building large software projects."
[Also, it appears that a lot of build and test automation has been reframed as "Continuous Integration". Which isn't really a bad thing. But it can be confusing because there are too many categories into which general-purpose tools can be fit.]

While SCons looks cool, I haven't had a huge need for it at work. Working in Python, there's no real "build". Instead our continuous integration boils down to unit testing. Our "build" is an SVN checkin. Our deployment is an SVN checkout and `python setup.py install`.

At some point, I would like to create an SConstruct file that runs our integration test suite. But it's trapped at a low priority.

SCons and Sphinx

I did find an SConscript example that automated a document build using Sphinx. This sphinx-scons was quite cool. However, it was challenging to customize. The SCons documentation requires real work to understand. I could see the value, but it was a lot of work.

I'm hoping that No Starch Press finds someone to write a tidy introduction to SCons.

SCons and RST and LaTeX (oh, my!)

Sphinx has made me a total fanboi of ReStructured Text. While I know MS Word and iWork Pages quite well, I have no patience with all the pointing and clicking. Getting consistency requires consistent pointing and clicking; some people can do it, but some of us find that manual pointing and clicking is sometimes irreproducible. Semantic markup is a huge pain in the neck because we have to stop typing to click on the proper style hint for the various words.

I also know DocBook XML and LaTeX quite well. I've used very cool XML editors including XML Mind XML Editor (which is very nice.) I no longer have any patience with any of these tools because there's too much GUI.

RST is fun because you write in plain text. There are a few directives and a few bits of inline roles for semantic markup. But your work can focus on the content, leaving presentation aside. A command-line tool (with templates) emits HTML or LaTeX or whatever. The style considerations can be (a) secondary and (b) completely consistent.

RST will easily produce complex LaTeX from plain text. What a joy. LaTeX, of course, isn't the goal, it's just an intermediate result that leads to DVI which leads -- eventually -- to a PDF.

Because of the Unicode and font selection on the Mac, I'm a user of XeTeX and XeLaTeX. I have some problems with getting my copy of Blackadder ITC to work, but generally I'm able to write without much fussing around.

SCons has a great deal of the TeX/DVI/PDF tool chain already installed. However, it doesn't have either the rst2latex script or the XeTeX tools.

An SConscript

While my first attempts to understand SCons didn't work out well, looking at RST and XeLaTeX was a much better use case.

I wound up with this.

import os

# Builder that turns foo.rst into foo.tex
rst2latex = Builder(action="rst2latex.py $SOURCES >$TARGET",
                    suffix='.tex', src_suffix='.rst')
# Builder that turns foo.tex into foo.pdf; xelatex runs twice to resolve references
xelatex = Builder(action=["xelatex $SOURCES", "xelatex $SOURCES"],
                  suffix='.pdf', src_suffix='.tex')
env = Environment(ENV=os.environ,
                  BUILDERS={'rst2latex': rst2latex, 'xelatex': xelatex})

env.rst2latex('someDoc')
env.xelatex('someDoc' )
env.rst2latex('anotherDoc')
env.xelatex('anotherDoc')

Getting this to work was quite pleasant. I can see how I could further optimize the document production pipeline by combining the two Builders.

[And yes, the xelatex step is run twice to guarantee that the references are correct.]

Now my whole workflow is: write, run `scons`, and review the resulting PDF. It's fast and it produces a nice-looking PDF with very little work and no irreproducible pointing and clicking.

Given this baseline, I can now dig into SCons for ways to make this slightly simpler.

September 01, 2010 09:00 AM


James Polera

smtp_toolkit

Speaking SMTP to mail servers with Python!

In my daily work, I often find the need to test various mail servers: verify that they are responding, see if they support TLS, check what the max supported message size is, etc. This is usually an exercise in running a telnet session to port 25 of the mail server and inspecting from there.

Seeing as telnet isn’t installed by default on some operating systems these days (I’m looking at you Windows 7), writing a Python class seemed to be the right thing to do. I can incorporate it into scripts, schedule checks, work it into mxutils.com… The list goes on.

It’s pretty straightforward to use, and I’ve made the code available under the BSD license at http://github.com/polera/smtp_toolkit.

Here are some basic examples of usage:

from smtp_toolkit import SMTPServerTest

# setup a list of servers to check
server_list = ['smtp.gmail.com']

for server in server_list:
    print(server)
    s = SMTPServerTest(server)
    # server connection results are returned as a dict
    print(s.results)
    # get the EHLO options (i.e. what would be returned after an ehlo command)
    print("EHLO options %s" % ", ".join(s.ehlo_options))
    # see if the server supports TLS (based on the EHLO response)
    print("TLS Supported? %s" % s.server_supports_tls)
    # what is the max message size that this server will handle (also from EHLO)
    print("Max message size: %d MB" % s.server_max_message_size)
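
For anyone curious what such a check involves, here is a rough sketch of the same idea using only the standard library's smtplib. This is not how smtp_toolkit is implemented; the function names are mine. It sends EHLO and turns the reply into a dictionary of advertised extensions, from which TLS support (STARTTLS) and the maximum message size (SIZE) can be read.

```python
import smtplib


def parse_ehlo_features(ehlo_response):
    """Turn EHLO reply text (one extension per line) into {KEYWORD: params}."""
    features = {}
    # The first line is the server's greeting, not an extension
    for line in ehlo_response.splitlines()[1:]:
        parts = line.strip().split(None, 1)
        if parts:
            features[parts[0].upper()] = parts[1] if len(parts) > 1 else ""
    return features


def probe(host, port=25, timeout=10):
    """Connect, send EHLO, and return the advertised extensions."""
    server = smtplib.SMTP(host, port, timeout=timeout)
    try:
        code, response = server.ehlo()
        return parse_ehlo_features(response.decode("ascii", "replace"))
    finally:
        server.quit()
```

With the result of probe(), 'STARTTLS' in features answers the TLS question, and features.get('SIZE') holds the advertised size limit in bytes, if the server announces one.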

I plan on building this out to support more features in the near future, so if you’re interested, keep an eye on the github repo.

Now, go test your servers!

September 01, 2010 04:00 AM


Montreal Python User Group

ConFooBBQ

This year again, ConFoo is going to be a major conference on Web development bringing together many of the local communities. To celebrate this synergy, everyone is invited to ConFooBBQ, the BBQ for developers and other actors of the Web.

The BBQ will take place on 2010-09-11 starting at 1:00 PM.

On the menu: hot-dogs, chips, salad, soft-drinks, cookies, and lots of fun. In line with our beer inspired events, Montréal-Python will bring a keg of Charmeuse de Serpents, a special batch of India Pale Ale with a very assertive character.

To help us plan adequate supplies, please send an email to board@confoo.ca if you plan to attend. Don’t forget to mention if you come with others. If you can’t find the group once you’re on the site, feel free to give the crew a call: 1-888-679-8466 option 0.

Details of the event:

September 01, 2010 02:46 AM


Richard Tew

Roguelike MUD progress

I finally found some more time to work on my roguelike MUD project. Tonight I managed to get proper multi-player support in, so that concurrently logged in players have their view of the game updated as other game objects change position. Up to this point the players mostly shared the world representation, and had separate state defining what was in it.


While the changes required are more or less fine, some didn't fit well into the existing game framework. I need to put some thought into cleaning those up.

Achievable next steps should hopefully be:

  1. Cleaning up the bugs.
    - Fix the incorrect menu related message that appears in the top telnet window shown in the screenshot.
    - When a player quits the game, the display of other observing players does not update to reflect it.
    - When an object moves, observing players are defined as those who have visited the tile the object moved to and currently have it on their display. Instead only observing players who have the object in their field of vision should see the movement.
  2. Polish the field of vision support.
    - The tiles that are in the field of vision should be distinct from those that are not (Smart Kobold uses this approach to good effect).
  3. Widen the corridors so players can pass each other.
  4. Add entities controlled by AI that move around of their own volition.
Really, in this day and age a game like this should be browser based (like 91). But getting sidetracked into a non-game-related endeavour to implement an equivalent canvas-based display with an AJAX or COMET connection to the backend server is something I have no time for just now.

September 01, 2010 03:31 AM


Mike Driscoll

Another GUI2Exe Tutorial – Build a Binary Series!

This is the last article of my “Build a Binary Series”. If you haven’t done so already, be sure to check out the others. For our finale, we are going to look at Andrea Gavana’s wxPython-based GUI2Exe, a nice graphical user interface to py2exe, bbfreeze, cx_Freeze, PyInstaller and py2app. The latest release of GUI2Exe is 0.5.0, although the source may be slightly newer. Feel free to run from the tip as well. We’ll be using the example scripts that we used for several of the previous articles: one console and one GUI script, neither of which does much of anything.

Getting Started with GUI2Exe

Quite some time ago, I wrote another article on this cool tool. However, the look-and-feel of the application has changed quite a bit, so I felt I should re-write that article in the context of this series. To follow along with this article, you’ll need to hit Google Code for the source. Let’s begin, shall we? Here are some step-by-step directions for making the console script using py2exe via GUI2Exe:

  1. Download the source and unzip it in a convenient location
  2. Run the “GUI2Exe.py” file (you can use your favorite editor, open it via the command line or whatever)
  3. Go to File, New Project. A dialog will appear asking you to name the project. Give it a good name! Then hit OK.
  4. Click in the “Exe Kind” column and change it to “Console”
  5. Click in the “Python Main Script” column and you’ll see a button appear.
  6. Press the button and use the file dialog to find your main script
  7. Fill out the other optional fields however you like
  8. Hit the compile button on the lower right
  9. Try the result to see if it worked!

If you followed the directions above, you should now have an executable file (and a few dependencies) in a “dist” folder at the location of the main script. As you can see in the screenshot above, there are all the typical options that you would set in your setup.py file. You can set your excludes list, the includes, the optimize and compressed settings, whether or not to include a zip, packages and much more! You can tweak to your heart’s content and hit the “Compile” button whenever you’re ready to see the result. If I’m experimenting, I usually change the output directory’s name so I can compare the results to see which is the most compact.

If you want to use bbfreeze, cx_Freeze, PyInstaller or py2app, just click the respective name in the column on the right. This will cause the middle part of the screen to change according to your choice and show the corresponding options. Let’s take a quick visual tour!

GUI2Exe in Pictures!

The following is a snapshot of the py2app options:

Next is a shot of the cx_Freeze options:

gui2exe_cxfreeze.png

And here are PyInstaller’s settings:

gui2exe_pyinst.png

Finally, we have bbfreeze’s options:

gui2exe_bbfreeze.png

There’s also a VendorId screen, but I don’t know much about that one, so we’ll be skipping it.

GUI2Exe’s Menu Options

As you might guess, all these options work the same way as they do when you do it all yourself in code. If you ever need to check out the setup.py file that GUI2Exe is making for you, just go to the Builds menu and choose View Setup Script. If you want to see a handy listing of the files it output and where it put them, go to Builds, Explorer and you should see something like the screenshot below:

gui2exe_explorer.png

Other handy options in the Builds menu include the Missing Modules and the Binary Dependencies menu items. These show you what may be missing from the dist folder that you may need to include should you distribute your masterpiece.

The Options menu controls options for GUI2Exe itself and a few custom items for the build process, like setting the Python version, deleting the build and/or dist folders, setting the PyInstaller path and more. The other menus are pretty self-explanatory and I leave them for the adventurous readers.

Wrapping Up

If you’ve read my other tutorials in the “Build a Binary Series” then you should be able to take that knowledge and use it productively with GUI2Exe. I find GUI2Exe to be very helpful when it comes time for me to build an executable and I used it to help me figure out the options for some of the other binary builders in this series. I hope you enjoyed this series and found it helpful. See you next time!

Further Reading

September 01, 2010 12:09 AM

August 31, 2010


BioPython News

Biopython 1.55 released


The Biopython team is proud to announce Biopython 1.55, a new stable release, about three months after our last stable release (Biopython 1.54) and the beta release earlier in August.

A lot of work has gone towards Python 3 support (via the 2to3 script), but unless we broke something you shouldn’t notice any changes ;)

In terms of new features, the most noticeable highlight is that the command line tool application wrapper classes are now executable, which should make it much easier to call external tools. This is described in the updated documentation.

Additionally GenBank and EMBL parsing has been sped up, the BioSQL classes act more like Python dictionaries, and Bio.PDB should handle model numbers and a missing element column better.

Note we are phasing out support for Python 2.4. We will continue to support it for at least one further release (i.e. Biopython 1.56). This could be delayed given feedback from our users (e.g. if this proves to be a problem in combination with other libraries or a popular Linux distribution).

(At least) 12 people have contributed to this release, including 6 new people – thank you all:

Source distributions and Windows installers are available from the downloads page on the Biopython website (biopython.org).

As usual, feedback is most welcome on the mailing lists (or bugzilla).

August 31, 2010 10:49 PM


Mikko Ohtamaa

Testing if hostname is numeric IPv4

I had to resort to this hack when testing a hybrid web/mobile site which uses site hostname based device discrimination. In production mode we can have m.yoursite.com and www.yoursite.com hostnames. However, when running the site locally, on your development computer and in LAN, this does not work very well: one cannot spoof hostnames for web browsers on devices like iPhone/iPod/other mobile phones unless you install a DNS server. And installing a DNS server for LAN is something you don’t want to do…

So, I figured out that I can use  hostname spoofing on desktop computers (/etc/hosts file) and I always access the site via numeric IP (IPv4 over ethernet) when testing over WLAN on mobile devices.

And,… dadaa,… here is my magical code to test whether a hostname is a numeric IPv4 address. I couldn’t find a ready-made function in the Python standard library.

import re

ipv4_regex_source = r"^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$"
ipv4_regex = re.compile(ipv4_regex_source)

def is_numeric_ipv4(hostname):
    """
    http://answers.oreilly.com/topic/318-how-to-match-ipv4-addresses-with-regular-expressions/

    @param hostname: Hostname as a string.

    @return: True if the given string is numeric IPv4 address
    """
    # match() returns a match object or None; convert it to a real boolean
    return bool(ipv4_regex.match(hostname))
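
For readers on newer Pythons: since this was written, the standard library has gained the ipaddress module (Python 3.3 onwards), which can do the same check without a regular expression. A minimal sketch follows; the function name is my own, and edge cases such as leading zeros may be treated differently from the regex above.

```python
import ipaddress


def is_numeric_ipv4_stdlib(hostname):
    """Return True if hostname is a dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(hostname)
        return True
    except ValueError:
        # AddressValueError is a subclass of ValueError
        return False
```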

August 31, 2010 02:51 PM