
Planet Python

Last update: February 09, 2010 08:43 PM

February 09, 2010


Ben Bangert

Deploying a Pylons App to Production, Step-by-Step


This is one of the many ways to deploy a Pylons application. Hopefully, with tools like toppcloud, the Python web world as a whole can start to converge on a ‘best practices’ methodology to ease deployment pains. So far, whether you’re deploying a Pylons, Django, repoze.bfg, Zope, or TurboGears app, many of the same deployment pains crop up. Since all of them can be deployed as WSGI apps, it feels like by now we should at least have a ‘best practice for deploying a Python web app’ document that works for any Python web app.

February 09, 2010 06:52 PM


Geert Vanderkelen

Don't forget the COMMIT in MySQL

Yes, MySQL has transactions if you use InnoDB or NDB Cluster for example. Using these transactional storage engines, you'll have to commit (or roll back) your inserts, deletes or updates.

I've seen it a few times now: people are surprised that no data is going into their tables. It's not such a silly problem in the end. If you are used to the defaults in MySQL, you don't have to commit anything since it is done automatically for you.

Take the Python Database Interfaces for MySQL. PEP-249 says that, by default, auto-commit should be turned off. You could turn it back on, but it's good practice to be explicit and commit in your code. Remember the Zen of Python!

Here is a small example to show it. It uses MySQL Connector/Python, but it should also work with the other MySQL database interfaces:


import mysql.connector
cnx = mysql.connector.connect(db='test')
cur = cnx.cursor()
cur.execute("""CREATE TABLE innodb_t1 (
    id INT UNSIGNED NOT NULL,
    c1 VARCHAR(128),
    PRIMARY KEY (id)
) ENGINE=InnoDB""")
ins = "INSERT INTO innodb_t1 (id,c1) VALUES (%s,%s)"
cur.execute(ins,
            (1, 'MySQL Support Team _is_ already the best'))
cnx.commit()
cur.close()
cnx.close()
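To see why the commit matters without a MySQL server at hand, here is a minimal sketch that uses the standard library's sqlite3 module as a stand-in (an assumption made for portability; it is not MySQL Connector/Python, and it is written for Python 3). A second connection plays the role of another client: it does not see the inserted row until the first connection commits. The table name and file path are made up for the demonstration.

```python
import os
import sqlite3
import tempfile

# sqlite3 is only a stand-in here for a transactional MySQL engine.
path = os.path.join(tempfile.mkdtemp(), 'demo.db')

writer = sqlite3.connect(path)
reader = sqlite3.connect(path)   # plays the role of another client

writer.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, c1 TEXT)")
writer.commit()


def count_rows(cnx):
    # fetchall() runs the statement to completion so no read lock lingers
    return cnx.execute("SELECT COUNT(*) FROM t1").fetchall()[0][0]


# The INSERT implicitly opens a transaction on the writer connection.
writer.execute("INSERT INTO t1 (id, c1) VALUES (?, ?)", (1, 'not committed yet'))
before_commit = count_rows(reader)   # the other client sees nothing yet

writer.commit()
after_commit = count_rows(reader)    # now the row is visible

print(before_commit, after_commit)

writer.close()
reader.close()
```

Leaving out the commit() call and simply closing the writer would discard the insert, which is the surprise described above.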

February 09, 2010 06:40 PM


Ludvig Ericson

A little off-topic

Oh man, this one's great.

 


February 09, 2010 04:30 PM


Roberto Alsina

Marave 0.4 is out!

Version 0.4 of Marave, a distraction-free fullscreen editor is out at http://marave.googlecode.com

This version includes several bugs fixed and features implemented since 0.4:

Marave is free software released under the GPL, and should work on all major desktop platforms.

I would love feedback on this release, as well as ideas for Marave's future, so if you want to help, please join the mailing list:

http://groups.google.com/group/marave-discuss

Of course, if you like Marave, feel free to give me money

February 09, 2010 04:04 PM


Logilab

SCons presentation in 5 minutes


Building software with SCons requires Python and SCons to be installed.

As SCons is made only of Python modules, its sources may be shipped with your project if your clients cannot install dependencies. All the following examples can be downloaded at the end of this post.

A building tool for every file extension

First, a Fortran 77 program made of two files will be built:

$ cd fortran-project
$ scons -Q
gfortran -o cfib.o -c cfib.f
gfortran -o fib.o -c fib.f
gfortran -o compute-fib cfib.o fib.o
$ ./compute-fib
 First 10 Fibonacci numbers:
  0.  1.  1.  2.  3.  5.  8. 13. 21. 34.

The '-Q' option tells SCons to be less verbose. To clean the project, add the '-c' option:

$ scons -Qc
Removed cfib.o
Removed fib.o
Removed compute-fib

From this first example, it can be seen that SCons finds the 'gfortran' tool from the file extension. Have a look at the user's manual if you want to set a particular tool.

Describing the construction with Python objects

A second project, written in C, runs its test program directly from the SCons file by adding a test command:

$ cd c-project
$ scons -Q run-test
gcc -o test.o -c test.c
gcc -o fact.o -c fact.c
ar rc libfact.a fact.o
ranlib libfact.a
gcc -o test-fact test.o libfact.a
run_test(["run-test"], ["test-fact"])
OK

However running scons alone builds only the main program:

$ scons -Q
gcc -o main.o -c main.c
gcc -o compute-fact main.o libfact.a
$ ./compute-fact
Computing factorial for: 5
Result: 120

This second example shows that the construction dependencies are described by passing Python objects. An interesting point is the possibility of adding your own Python functions to the build process.

Hierarchical build with environment

A third project, written in C++, creates a shared library used by two different programs: the main application and a test suite. The main application can be built by:

$ cd cxx-project
$ scons -Q
g++ -o main.o -c -Imbdyn-src main.cxx
g++ -o mbdyn-src/nodes.os -c -fPIC -Imbdyn-src mbdyn-src/nodes.cxx
g++ -o mbdyn-src/solver.os -c -fPIC -Imbdyn-src mbdyn-src/solver.cxx
g++ -o mbdyn-src/libmbdyn.so -shared mbdyn-src/nodes.os mbdyn-src/solver.os
g++ -o mbdyn main.o -Lmbdyn-src -lmbdyn

It shows that SCons handles for us the compilation flags needed to create a shared library with the chosen tool (-fPIC). Moreover, extra environment variables have been given (CPPPATH, LIBPATH, LIBS), which are all translated for the chosen tool. All those variables can be found in the user's manual or in the man page. The test suite is built and run by giving an extra variable:

$ TEST_CMD="LD_LIBRARY_PATH=mbdyn-src ./%s" scons -Q run-tests
g++ -o tests/run_all_tests.o -c -Imbdyn-src tests/run_all_tests.cxx
g++ -o tests/test_solver.o -c -Imbdyn-src tests/test_solver.cxx
g++ -o tests/all-tests tests/run_all_tests.o tests/test_solver.o -Lmbdyn-src -lmbdyn
run_test(["tests/run-tests"], ["tests/all-tests"])
OK

Conclusion

It is rather convenient to build software by manipulating Python objects; moreover, custom actions can be added to the process. SCons also has a configuration mechanism that works like autotools macros, which is described in the user's manual.

February 09, 2010 03:38 PM


Atamert Olcgen

Top 5 Untrends According To Me

My dear friend Ochronus posted an article titled Top 5 trends and technologies in software development that got me thinking. My thoughts below. Go check Ochronus’s blog if you haven’t, he is the lead developer at Arukereso.hu.

I agree with the suggestions from the original article. Yet, I would like to change the order a little bit; DVCS and then agile (with lowercase a) and then the rest. None of my points below are cool trends, in fact I can guarantee most of you will find them boring. But I think they are all important. OK, I hope you are all psyched now. Here we go:

1. Be Careful With The Buzz

Trends are cool. What could be wrong about following cutting edge stuff? We all want to be up to date, no? I think it’s good to follow the trends if you have the experience and the ability to filter the BS. I know a young developer who was constantly going back and forth between Rails/Ruby and Django/Python. I haven’t heard from him for a while, but he is probably still doing that same dance. Why? Because his considerations were solely based on buzz, not on simple requirements analysis or technical comparisons or personal experience.

2. Learn And Use An Old-Fashioned Modern Low-Level Scripting Language

To all the scripting people out there, like me: you need to have an understanding of what’s happening under the hood. At the least to appreciate our high-level environments, at the most to become genuinely good programmers. Being a Python person myself, I think the best low-level language for me to be proficient in is C. Many other high-level languages have C interfaces too, so investing the time to learn C should pay off one way or the other.

3. Do Less Web Programming

Aren’t we doing a lot of web programming these days? Actually I think doing X development exclusively is bad for your programming muscles. Web programming, enterprise work or system scripting, it doesn’t matter. But web programming happens more than anything else. Maybe some of you have only been playing with it, but there are a huge number of us doing nothing but web programming. This is so sad, both on an individual level and for the community at large.

4. Learn How To Educate Yourself

What is a noob? Here is a definition and disambiguation (from newbie):

Newbs are those who are new to some task and are very beginner at it, possibly a little overconfident about it, but they are willing to learn and fix their errors to move out of that stage. n00bs, on the other hand, know little and have no will to learn any more. They expect people to do the work for them and then expect to get praised about it, and make up a unique species of their own.

Make an active effort not to be a noob. Learn how to ask smart questions, how to communicate with others and how to seek help. Being polite is good, but actually improving and being a valuable member of the community is much, much better.

5. Open Source Properly

It’s great to open source your project. But please do it properly. There are already too many unmaintained, undocumented projects out there that no one seems to care about. Do you really have to add to that? The “as is” argument doesn’t make much sense today. But if you really have to make an open source dead drop, please at least document the status of your project and your intentions clearly.

I wouldn’t be surprised if some of you think these are all obvious. But if they are so obvious, then why are they so widely ignored? Is it because they are under-retweeted, under-reddited and therefore not trendy?

Related posts:

  1. django-formfieldset
  2. Nominate Qooxdoo for SourceForge Community Choice Awards
  3. Sad State of Web Development Industry in Türkiye

February 09, 2010 03:21 PM


Mike Driscoll

Enabling Screen Locking with Python

A few months ago, my employer needed to lock down some of our workstations to be compliant with some new software we were installing from another government organization. We needed to force those machines to lock after so many minutes had elapsed, and we needed to make it so that the user could not change those settings. In this article, you’ll find out how to do this and, as a bonus, I’ll also show you how to lock your Windows machine on demand with Python.

Hacking the Registry to Lock the Machine

To start, we’ll take a look at my original script and then we’ll refactor it a bit to make the code better:

from _winreg import CloseKey, CreateKey, EnumKey, SetValueEx
from _winreg import HKEY_CURRENT_USER, HKEY_USERS
from _winreg import REG_DWORD, REG_SZ
 
try:
    i = 0
    while True:
        subkey = EnumKey(HKEY_USERS, i)
        if len(subkey) > 30:
            break
        i += 1
except WindowsError:
    # WindowsError: [Errno 259] No more data is available
    # looped through all the subkeys without finding the right one
    raise WindowsError("Could not apply workstation lock settings!")
 
keyOne = CreateKey(HKEY_USERS, r'%s\Control Panel\Desktop' % subkey)
keyTwo = CreateKey(HKEY_CURRENT_USER, r'Software\Microsoft\Windows\CurrentVersion\Policies\System')
 
# enable screen saver security
SetValueEx(keyOne, 'ScreenSaverIsSecure', 0, REG_DWORD, 1)
# set screen saver timeout
SetValueEx(keyOne, 'ScreenSaveTimeOut', 0, REG_SZ, '420')
# set screen saver
SetValueEx(keyOne, 'SCRNSAVE.EXE', 0, REG_SZ, 'logon.scr')
# disable screen saver tab
SetValueEx(keyTwo, 'NoDispScrSavPage', 0, REG_DWORD, 1)
 
CloseKey(keyOne)
CloseKey(keyTwo)

It took a while to discover this, but to set the right key, we need to find the first sub-key that is larger than 30 characters in length under the HKEY_USERS hive. I’m sure there’s probably a better way to do this, but I haven’t found it yet. Anyway, once we’ve found the long key, we break out of the loop and open the keys we need or create them if they don’t already exist. This is the reason that we use CreateKey since it will do just that. Next, we set four values and then we close the keys to apply the new settings. You can read the comments to see what each key does. Now let’s refine the code a bit to make it into a function:

from _winreg import *
 
def modifyRegistry(key, sub_key, valueName, valueType, value):
    """
    A simple function used to change values in
    the Windows Registry.
    """
    try:
        key_handle = OpenKey(key, sub_key, 0, KEY_ALL_ACCESS)
    except WindowsError:
        key_handle = CreateKey(key, sub_key)
 
    SetValueEx(key_handle, valueName, 0, valueType, value)
    CloseKey(key_handle)
 
try:
    i = 0
    while True:
        subkey = EnumKey(HKEY_USERS, i)
        if len(subkey) > 30:
            break
        i += 1
except WindowsError:
    # WindowsError: [Errno 259] No more data is available
    # looped through all the subkeys without finding the right one
    raise WindowsError("Could not apply workstation lock settings!")
 
subkey = r'%s\Control Panel\Desktop' % subkey
data = [('ScreenSaverIsSecure', REG_DWORD, 1),
        ('ScreenSaveTimeOut', REG_SZ, '420'),
        ('SCRNSAVE.EXE', REG_SZ, 'logon.scr')]
 
for valueName, valueType, value in data:
    modifyRegistry(HKEY_USERS, subkey, valueName,
                   valueType, value)
 
modifyRegistry(HKEY_CURRENT_USER,
               r'Software\Microsoft\Windows\CurrentVersion\Policies\System',
               'NoDispScrSavPage', REG_DWORD, 1)

As you can see, first we import everything in the _winreg module. This isn’t really recommended as you can accidentally overwrite functions that you’ve imported, which is why this is sometimes called “poisoning the namespace”. However, almost every example I’ve ever seen that uses the _winreg modules does it that way. See the first example for the correct way to import from it.

Next, we create a general purpose function that can open the key, or create the key if it’s not already there. The function will also set the value and close the key for us. After that, we do basically the same thing that we did in the previous example: we loop over the HKEY_USERS hive and break appropriately. To mix things up a bit, we create a data variable that holds a list of tuples. We loop over that and call our function with the appropriate parameters and for good measure, we demonstrate how to call it outside of a loop.
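The search for the long sub-key is the part of these scripts that is easiest to get wrong, so it can help to pull it out into a function that can be exercised away from a Windows machine. The following is only a sketch under a few assumptions: find_long_subkey and fake_enum_key are hypothetical names, the sub-key names in FAKE_HIVE are made up, and the code is written for Python 3, where the module is called winreg and WindowsError is an alias of OSError.

```python
def find_long_subkey(enum_key, min_length=30):
    """Return the first sub-key name longer than min_length characters.

    enum_key is any callable that takes an index and returns a name,
    raising OSError once the index runs past the last sub-key
    (which is how EnumKey behaves).
    """
    index = 0
    while True:
        try:
            name = enum_key(index)
        except OSError:
            raise LookupError(
                "no sub-key longer than %d characters" % min_length)
        if len(name) > min_length:
            return name
        index += 1


# Stand-in for EnumKey(HKEY_USERS, i); these names are invented.
FAKE_HIVE = ['.DEFAULT', 'S-1-5-18',
             'S-1-5-21-1111111111-2222222222-3333333333-1001']


def fake_enum_key(index):
    try:
        return FAKE_HIVE[index]
    except IndexError:
        raise OSError(259, 'No more data is available')


print(find_long_subkey(fake_enum_key))
```

On a real Windows machine the same function could be called with functools.partial(winreg.EnumKey, winreg.HKEY_USERS) in place of fake_enum_key.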

Locking the Machine Programmatically

Now you may be thinking that we already covered how to lock the machine programmatically. Well, we did in a sense; but what we really did was set up a timer to lock the machine sometime in the future when the machine has been idle. What if we want to lock the machine now? Some of you are probably thinking we should just hit the Windows key plus “L” and that is a good idea. However, the reason I created this script is because I have to remotely connect to my machine with VNC from time-to-time and I need to go through multiple steps to lock the machine when using VNC whereas if you have Python set up correctly, you can just double-click a script file and have it do the locking for you. That’s what this little script does:

import os
 
winpath = os.environ["windir"]
os.system(winpath + r'\system32\rundll32 user32.dll, LockWorkStation')

This three-line script imports the os module, grabs the Windows directory from its environ mapping and then calls os.system to lock the machine. If you were to open a DOS window on your machine and type the following into it, you would get exactly the same effect:


C:\windows\system32\rundll32 user32.dll, LockWorkStation
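As an alternative to shelling out to rundll32, the same user32 function can be called directly through ctypes. This is a different technique from the script above, shown only as a sketch: lock_workstation is a hypothetical helper name, and the platform guard is an addition so that the function does nothing on other operating systems.

```python
import sys


def lock_workstation():
    """Lock the current Windows session; return False on other platforms."""
    if sys.platform != 'win32':
        return False
    import ctypes
    # LockWorkStation() returns nonzero when the lock request was accepted.
    return bool(ctypes.windll.user32.LockWorkStation())


if __name__ == '__main__':
    lock_workstation()
```

Saved as a script, this can be double-clicked in the same way as the os.system version.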

Wrapping Up

Now you know how to lock your machine with Python. If you put the first example in a login script, then you can use it to lock down some or all of the machines on your network. This is very handy if you have users who like to wander off or go to lots of meetings but leave their machines logged in. It protects them from snooping and can protect your company from espionage.

February 09, 2010 03:11 PM


Huy Nguyen

Exceptions are your Friends

Robust code cries often and loudly as soon as something is not right. It does not cower away in corners of obscurity hoping that no one will notice, until one day, shit hits the fan. Any serious python code contains proper use of exceptions, errors, and asserts. In fact, I would argue that their presence defines [...]

February 09, 2010 03:00 PM


Jesse Noller

Say Hello – Nasuni Launches Today!

The company I’ve worked for since July of last year – Nasuni Corporation (a startup in Massachusetts) – has gone live! This is the culmination of a lot of hard, but exceedingly fun and exciting work over the past months.

The Nasuni team is an excellent one – and one I am very, very proud to be a part of. Our product is called the Nasuni Filer – a simple-to-use, versioned, encrypted and cloud-storage backed virtual NAS (network attached storage) server (click here for more information).

Without going into all of the features, our goal in making this was to make cloud storage simple, accessible and secure – and I know we’ve accomplished all three. All you do is download it, boot it and start using it – once you do so you have access to truly unlimited storage. It’s an unlimited filesystem for the cloud. Here’s the elevator pitch:

Nasuni has developed a virtual file server, called the Nasuni Filer, that delivers unlimited file storage and complete file protection for businesses. Working in partnership with leading cloud storage vendors, the Nasuni Filer leverages the vast capacity of the cloud to store and protect company files offsite, while retaining the local functionality and performance of a traditional NAS.

This technology allows businesses to use the cloud provider of their choice as a replacement for traditional primary storage. Snapshots, file versioning, and offsite storage are integrated into the file server itself – ensuring business files are safe and secure at all times. No need to manage complex backup and DR schemes – if the file server is running, files are protected.

We’ve launched the Beta of the product today – anyone can sign up, download and use it. Anyone can give us feedback and suggestions – I encourage all of you who might need something like this to download and give it a try. If you want – go check out the videos we’ve put together showcasing the Filer (and better yet – check out the awesome animated cartoon we have on the front page).

Most of you know that my blog is mainly Python oriented. Suffice it to say, Nasuni – and the Nasuni Filer make use of Python for a wide range of tasks. We use Python, Django and as much of the Python ecosystem as we can to drive everything from the website, to the GUI on the appliance itself – Python is part of the DNA of the company, and it has served us well. Without Open Source and Python – I don’t think it would have been possible to build what we have built in as little time as we have.

We have a strong dedication to not just Python, but open source in general (and a fair number of us will be at PyCon this month). As time progresses, now that we’re exiting stealth mode we plan on possibly open sourcing stuff we feel would benefit the community. Some of us already push patches back where and when we can, but as I said – as time progresses this involvement will only increase.

So not only am I proud to announce the product, to be part of this team and to see what we’ve made, I’m also happy to thank the many people in the Python and OSS community who have helped us reach this point.

So go – check it out, let us know what you think.

February 09, 2010 01:35 PM


Christopher Denter

Multi-Touch: PyMT 0.4 released

Multi-Touch helps to visualize and interact with medical data (image)

The awesome PyMT library has just been released in version 0.4.

This is a major release that brings a ton of cool new stuff, including a new animation framework, speed & stability improvements and much more. Take a look at the release notes to see what’s new in this release.

I’m using PyMT for my thesis (see picture above) and I love it. Make sure to check the new website, too! (There’s also a new demo video in the works. I will update this posting as soon as it’s available.)

February 09, 2010 01:23 PM


Ned Batchelder

A preventable Python packaging peeve

Python packaging is a common theme on which to complain, and rightly so. It's no one's first love, so it tends not to get the devoted attention of, say, NumPy. And it's a hard problem to solve well. So we have a mish-mash of tools that each do about 75% of the job.

But there's one small aspect of Python packaging that could easily be solved well if people just attended to it: Not enough Python projects clearly state what versions of Python they run on.

For example, suppose you are in the market for a mock object library for your tests. There's no shortage. Less than a minute at PyPI produces mock, MiniMock, mocktest, Mocky, pmock, mocker, mockito, and ludibrio. Some of those PyPI pages have extensive documentation. Not a single one explicitly mentions the versions of Python supported. And I don't mean 2.x vs. 3.x. I want to know if it will run on 2.4 or not. Ludibrio and Mocky offer a slight clue in that they are available for download as an egg, for 2.5 and 2.4 respectively. pMock mentions >= 2.3 support on the home page linked from the PyPI page.

On top of all the other well-known difficulties people have with Python packaging, at the very least, we should be able to manage this: clearly state what versions of Python you support. This is a simple three-step process:

  1. Decide what versions you want to support.
  2. Test your code on those versions.
  3. Add a sentence like this to your PyPI documentation: "SpockMocker runs on Python 2.5 and 2.6".
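Beyond the sentence in the documentation, the same information can be kept in code so that it feeds both the package metadata and a runtime check. The sketch below is only one way to do it: SUPPORTED_VERSIONS, trove_classifiers and is_supported are hypothetical names, and the version list simply mirrors the "SpockMocker" sentence above. The generated strings follow the "Programming Language :: Python :: X.Y" classifier format used on PyPI.

```python
import sys

# Hypothetical support list, matching the "SpockMocker" sentence.
SUPPORTED_VERSIONS = [(2, 5), (2, 6)]


def trove_classifiers(versions):
    """Build the per-version classifiers to pass to setup()."""
    return ['Programming Language :: Python :: %d.%d' % v for v in versions]


def is_supported(version_info=sys.version_info, versions=SUPPORTED_VERSIONS):
    """True if the (major, minor) part of version_info is in the list."""
    return tuple(version_info[:2]) in versions


print(trove_classifiers(SUPPORTED_VERSIONS))
print(is_supported((2, 6, 4)))
```

The list returned by trove_classifiers can be passed as the classifiers argument of setup(), which makes the supported versions show up on the project's PyPI page.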

The Python community will thank you.

February 09, 2010 12:58 PM


Richard Tew

Mailman-style mailing list archives

I have the posts made to several mailing lists in a variety of non-standard formats. Converting them to a standard mbox file is a matter of parsing and is a different process for each. Once I have each parsed, what I would like to do is generate Mailman-style list archives.

I've downloaded Mailman and tried to get it to take my mbox file, and output the list archives. But the process is to some degree tied to Unix-style platforms, relying on functionality that is not supported on Windows. But to a larger degree, it is tied into the quality of being a proper Mailman hosted mailing list. Even changing the code to address or work around these things is not the cleanest of processes. There must be a better way.

Any suggestions?

Have some hacky code while I am at it:


import os
import cPickle

# The post omits its other imports: the Mailman names used below
# (Mailman.mm_cfg, Mailman.MailList, Site, HyperArchive) and the
# filePath variable are assumed to be imported or defined elsewhere.

WORKING_PATH = r"D:\MailingList"
MAILMAN_PATH = os.path.join(WORKING_PATH, "mailman-2.1.13")

class MailList:
    def __init__(self, basePath, fileName):
        self.basePath = basePath
        self.fileName = fileName
        self.SetVars()

    def SetVars(self):
        self._internal_name = "mud-dev"
        self._fullpath = "/resource/MUD-Dev/"
        self.host_name = "localhost"
        self.subject_prefix = "[MUD-Dev] "
        self.real_name = "MUD-Dev"

    def fullpath(self):
        return self._fullpath

    def archive_dir(self):
        return self.basePath

    def internal_name(self):
        return self.fileName

    def ArchiveFileName(self):
        return os.path.join(self.basePath, self.internal_name() + ".mbox")

    def GetScriptURL(self, *args, **kwargs):
        return args[0]

    def GetListEmail(self):
        return "no-list-email"

class SuperDuperArchive(HyperArchive):
    def GetArchLock(self): return 1

    def DropArchLock(self): pass

def fake_symlink(src, dst):
    # Windows stand-in for os.symlink and os.link: copy the file instead.
    if os.path.exists(src):
        open(dst, "w").write(open(src, "r").read())

os.symlink = fake_symlink
os.link = fake_symlink
Mailman.mm_cfg.TEMPLATE_DIR = os.path.join(MAILMAN_PATH, "templates")
Mailman.mm_cfg.LIST_DATA_DIR = WORKING_PATH
Mailman.mm_cfg.PUBLIC_ARCHIVE_FILE_DIR = WORKING_PATH
Mailman.mm_cfg.PRIVATE_ARCHIVE_FILE_DIR = WORKING_PATH

mlist = MailList(os.path.join(filePath, "archives"), "mud-dev")
mlist.preferred_language = 'en'

listPath = os.path.join(Mailman.mm_cfg.LIST_DATA_DIR, mlist.internal_name())
if not os.path.exists(listPath):
    os.makedirs(listPath)

class DummyClass:
    def internal_name(self):
        return "mud-dev"

    def archive_dir(self):
        return Site.get_archpath(self.internal_name())

    def GetScriptURL(self, *args, **kwargs):
        return args[0]

    def GetListEmail(self):
        return "no-list-email"

instance = DummyClass()

Mailman.MailList.MailList.InitVars.im_func(instance, "mud-dev")
for baseclass in Mailman.MailList.MailList.__bases__:
    if hasattr(baseclass, 'InitVars'):
        baseclass.InitVars.im_func(instance)

MailList.SetVars.im_func(instance)

listConfigPath = os.path.join(listPath, "config.pck")
if not os.path.exists(listConfigPath):
    cPickle.dump(instance.__dict__, open(listConfigPath, "wb"))

archive = SuperDuperArchive(mlist)
archive.processListArch()
archive.close()

February 09, 2010 06:37 AM


Heikki Toivonen

Pulling Android Market Sales Data Programmatically

Android Market handles sales through Google Checkout. I haven’t tried selling anything else online before, but what this setup provides for me as the seller leaves a lot to be desired. One issue you will have trouble with is getting the data needed to file taxes.

Google provides a Google Checkout Notification History API that lets you programmatically query sales data. For my purposes the API requests are really simple: just post a small XML document with the date range I am interested in, and get back XML documents that contain my data. If there is more data than fits in a single response, look for the element that specifies the token for the next page and keep pulling until you have all the data.

Below is a really simple Python script that uses M2Crypto to handle the SSL parts for the connection (needed since Python doesn’t do secure SSL out of the box). You will also need to grab certificates. You should save the script as gnotif.py, save the certificates as cacert.pem and create gnotif.ini as described in the script below all in the same directory. When you execute it, it will ask for start and end date (in YYYY-MM-DD format) and then fetch all the data, saving them in response-N.xml files, where N is a number.

#!/usr/bin/env python
# Script to query Google Checkout Notification History
# http://code.google.com/apis/checkout/developer/Google_Checkout_XML_API_Notification_History_API.html
 
# Supporting file gnotif.ini:
#[gnotif]
# merchant_id = YOUR_MERCHANT_ID_HERE
# merchant_key = YOUR_MERCHANT_KEY_HERE
 
import base64
import re
from ConfigParser import ConfigParser
 
from M2Crypto import SSL, httpslib
 
ENVIRONMENT = "https://checkout.google.com/api/checkout/v2/reports/Merchant/"
XML = """\
<notification-history-request xmlns="http://checkout.google.com/schema/2">
%(query)s
</notification-history-request>
"""
 
config = ConfigParser()
config.read('gnotif.ini')
MERCHANT_ID = config.get('gnotif', 'merchant_id')
MERCHANT_KEY = config.get('gnotif', 'merchant_key')
 
rawstr = r"""<next-page-token>(.*)</next-page-token>"""
compile_obj = re.compile(rawstr, re.MULTILINE)
 
auth = base64.encodestring('%s:%s' % (MERCHANT_ID, MERCHANT_KEY))[:-1]
 
ctx = SSL.Context('sslv3')
# If you comment out the next 2 lines, the connection won't be secure
ctx.set_verify(SSL.verify_peer | SSL.verify_fail_if_no_peer_cert, depth=9)
if ctx.load_verify_locations('cacert.pem') != 1: raise Exception('No CA certs')
 
start = raw_input('Start date: ')
end = raw_input('End date: ')
 
data = XML % {'query': """<start-time>%(start)s</start-time>
<end-time>%(end)s</end-time>""" % {'start': start, 'end': end}}
 
i = 0
 
while True:
    c = httpslib.HTTPSConnection(host='checkout.google.com', port=443, ssl_context=ctx)
    c.request('POST', ENVIRONMENT + MERCHANT_ID, data,
             {'content-type': 'application/xml; charset=UTF-8',
              'accept': 'application/xml; charset=UTF-8',
              'authorization': 'Basic ' + auth})
 
    r = c.getresponse()
 
    f=open('response-%d.xml' % i, 'w')
    result = r.read()
    f.write(result)
    f.close()
 
    print i, r.status
 
    c.close()
 
    match_obj = compile_obj.search(result)
    if match_obj:
        i += 1
        data = XML % {'query': """<next-page-token>%s</next-page-token>""" % match_obj.group(1)}
    else:
        break

As you take a look at the data you will probably notice that you are only getting the sale price information, but no information about the fees that Google is deducting. Officially it is a flat 30%, but I have found out a number of my sales have the fee as 5%. So we need to get this information somehow. Luckily you can toggle a checkbox in your Google Checkout Merchant Settings. Unfortunately there is a bug, and the transaction fee shows as $0 for Android Market sales. I have reported this to Google, and they acknowledged it, but there is no ETA on when this will be fixed.

I also haven’t found any way to programmatically query when and how much Google Checkout actually paid me. (I can get this info from my bank, but it would be nice to query for that with the Checkout API as well.)

Last but certainly not least, working with the monster XML files returned from Google Checkout API is a real pain. If someone has a script to turn those into a format that could be imported into a spreadsheet or database that would be nice…
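One possible starting point for such a script is a generic flattener that turns every element of a given name into one CSV row, using only the standard library. This is a rough sketch written for Python 3, not a tested converter for real Checkout responses: notifications_to_csv, flatten and local_name are hypothetical helper names, and the element names in SAMPLE are placeholders chosen for illustration that have not been checked against an actual response file.

```python
import csv
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tags."""
    return tag.rsplit('}', 1)[-1]


def flatten(element, prefix=''):
    """Yield (dotted-path, text) pairs for every leaf under element."""
    for child in element:
        path = prefix + local_name(child.tag)
        if len(child):
            for pair in flatten(child, path + '.'):
                yield pair
        else:
            yield path, (child.text or '').strip()


def notifications_to_csv(xml_text, row_tag):
    """Return CSV text with one row per element named row_tag.

    Repeated leaf paths inside one row overwrite each other, so
    multi-item orders would need extra handling.
    """
    root = ET.fromstring(xml_text)
    rows = [dict(flatten(el)) for el in root.iter()
            if local_name(el.tag) == row_tag]
    columns = sorted(set(key for row in rows for key in row))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator='\n')
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Illustrative input only; element names are placeholders.
SAMPLE = """\
<notification-history-response xmlns="http://checkout.google.com/schema/2">
  <notifications>
    <new-order-notification>
      <google-order-number>1001</google-order-number>
      <order-total>0.99</order-total>
    </new-order-notification>
    <new-order-notification>
      <google-order-number>1002</google-order-number>
      <order-total>1.99</order-total>
    </new-order-notification>
  </notifications>
</notification-history-response>
"""

print(notifications_to_csv(SAMPLE, 'new-order-notification'))
```

The same function could be run over each response-N.xml file in turn, with the rows appended to a single spreadsheet.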

February 09, 2010 04:48 AM


Vern Ceder

Get the most out of PyCon – VOLUNTEER

PyCon Atlanta is now less than 2 weeks away, and things are coming together. My big concern, the poster session, is pretty much set to go. Transportation, check. Hotel, check. Conference registration, time off from work, talks I want to catch, tentative open space plans: check, check, check, and check. Yesterday I added the final [...]

February 09, 2010 04:30 AM


Calvin Spealman

DeferArgs on GitHub

A while ago I wrote a library called DeferArgs, and I used it when I was still in Twisted code every day. I no longer have that fun, but I was reminded of the code and decided to throw it onto GitHub for anyone who cares for it.


http://github.com/ironfroggy/DeferArgs

Here is an example usage, where foo could take any deferreds and would be called when they all fire:

@deferargs
def foo():
    assert False
@catch(AssertionError)
def onAssert(error): 
    print "OOPS"     
@catch()             
def onOthers(error): 
    print "I WOULD BE REACHED FOR ANYTHING NOT CAUGHT ABOVE."
@cleanup                                                    
def _(r):                                                   
    print "The result was: ", r
 

February 09, 2010 03:44 AM

February 08, 2010


Eric Florenzano

How do we kick our synchronous addiction?

Asynchronous programming is superior both in memory usage and in overall throughput when compared to synchronous programming. We've known this fact for years. If we look at Django or Ruby on Rails, arguably the two most promising new web application frameworks to emerge in the past few years, both of them are written in such a way that synchronous programming is assumed. Why is it that even in 2010 we're still writing programs that rely on synchronous programming?

The reason that we're stuck on synchronous programming is twofold. Firstly, the programming model required for straightforward asynchronous implementations is inconvenient. Secondly, popular and/or mainstream languages lack the built-in language constructs that are needed to implement a less-straightforward approach to asynchronous programming.

Asynchronous programming is too hard

Let's first examine the straightforward implementation: an event loop. In this programming model, we have a single process with a single loop that runs continuously. Functionality is achieved by writing functions to execute small tasks quickly, and inserting those functions into that event loop. One of those functions might read some bytes from a socket, while another function might write a few bytes to a file, and yet another function might do something computational like calculating an XOR on the data that's been buffered from that first socket.

The most important part about this event loop is that only one thing is ever happening at a time. That means that you really have to break your logic up into small chunks that can be performed incrementally. If any one of our functions blocks, it hogs the event loop and nothing else can execute during that time.
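To make the model concrete, here is a toy event loop in Python. It is only a sketch of the idea described above, not how Twisted or Tornado are implemented: EventLoop, call_soon and count are names invented for this example, and there is no I/O involved. Each task does one small step and then re-inserts itself into the loop, which is what lets two tasks take turns.

```python
from collections import deque


class EventLoop:
    """A single queue of callbacks, run one at a time."""

    def __init__(self):
        self._ready = deque()

    def call_soon(self, callback, *args):
        self._ready.append((callback, args))

    def run(self):
        # Only one callback ever runs at a time; a slow one stalls the rest.
        while self._ready:
            callback, args = self._ready.popleft()
            callback(*args)


log = []


def count(loop, name, n, limit):
    """Do one small step, then reschedule the next step."""
    log.append((name, n))
    if n < limit:
        loop.call_soon(count, loop, name, n + 1, limit)


loop = EventLoop()
loop.call_soon(count, loop, 'a', 1, 2)
loop.call_soon(count, loop, 'b', 1, 2)
loop.run()
print(log)
```

The log shows the two tasks interleaving step by step. If count looped to the limit in a single call instead of rescheduling itself, task 'b' would not start until task 'a' had finished, which is the hogging problem in miniature.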

We have some really great frameworks geared towards making this event loop model easier to work with. In Python, there's Twisted and, more recently, Tornado. In Ruby there's EventMachine. In Perl there's POE. What these frameworks do is twofold: provide constructs for more easily working with an event loop (e.g. Deferreds or Promises), and provide asynchronous implementations of common tasks (e.g. HTTP clients and DNS resolution).

But these frameworks stop very short of making asynchronous programming easy for two reasons. The first reason is that we really do have to completely change our coding style. Consider what it would take to render a simple blog web page with comments. Here's some JavaScript code to demonstrate how this might work in a synchronous framework:

function handleBlogPostRequest(request, response, postSlug) {
    var db = new DBClient();
    var post = db.getBlogPost(postSlug);
    var comments = db.getComments(post.id);
    var html = template.render('blog/post.html',
        {'post': post, 'comments': comments});
    response.write(html);
    response.close();
}

Now here's some JavaScript code to demonstrate how this might look in an asynchronous framework. Note several things here: We've specifically written this in such a way that it doesn't become nested four levels deep. We've also written these callback functions inside of the handleBlogPostRequest function to take advantage of closure so as to retain access to the request and response objects, the template context, and the database client. Both the desire to avoid nesting and the closure are things that we need to think about as we write this code, that were not even considerations in the synchronous version:

function handleBlogPostRequest(request, response, postSlug) {
    var context = {};
    var db = new DBClient();
    function pageRendered(html) {
        response.write(html);
        response.close();
    }
    function gotComments(comments) {
        context['comments'] = comments;
        template.render('blog/post.html', context).addCallback(pageRendered);
    }
    function gotBlogPost(post) {
        context['post'] = post;
        db.getComments(post.id).addCallback(gotComments);
    }
    db.getBlogPost(postSlug).addCallback(gotBlogPost);
}
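
For readers who want to see the moving parts behind addCallback, here is a rough Python sketch of the same handler. The Deferred class is a minimal stand-in written for this example (it is not Twisted's real Deferred), and FakeDB with its canned data is entirely hypothetical. Template rendering is reduced to a format string to keep the sketch short.

```python
class Deferred(object):
    """Minimal stand-in for a promise-like object (not Twisted's Deferred)."""

    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._result = None

    def addCallback(self, func):
        # If the result is already available, call back right away;
        # otherwise remember the function until callback() fires.
        if self._fired:
            func(self._result)
        else:
            self._callbacks.append(func)
        return self

    def callback(self, result):
        self._fired = True
        self._result = result
        for func in self._callbacks:
            func(result)


class FakeDB(object):
    """Hypothetical database client whose calls return Deferreds."""

    def __init__(self):
        self.pending = []  # (deferred, value) pairs waiting for "IO"

    def get_blog_post(self, slug):
        d = Deferred()
        self.pending.append((d, {"id": 1, "slug": slug}))
        return d

    def get_comments(self, post_id):
        d = Deferred()
        self.pending.append((d, ["first!", "nice post"]))
        return d

    def deliver_all(self):
        # Simulate the event loop delivering IO results one at a time.
        while self.pending:
            d, value = self.pending.pop(0)
            d.callback(value)


def handle_blog_post_request(db, output, post_slug):
    context = {}

    def got_comments(comments):
        context["comments"] = comments
        output.append("%s: %d comments" % (context["post"]["slug"],
                                           len(comments)))

    def got_blog_post(post):
        context["post"] = post
        db.get_comments(post["id"]).addCallback(got_comments)

    db.get_blog_post(post_slug).addCallback(got_blog_post)


db = FakeDB()
output = []
handle_blog_post_request(db, output, "hello-world")
assert output == []   # the handler has only registered callbacks so far
db.deliver_all()      # now the simulated IO completes and callbacks fire
```

Note that the handler returns before any result exists; everything interesting happens later, inside the callbacks.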

I've chosen JavaScript here to prove a point, by the way. People are very excited about node.js right now, and it's a very cool framework, but it doesn't hide all of the complexities involved in doing things asynchronously. It only hides some of the implementation details of the event loop.

The second reason why these frameworks fall short is because not all IO can be handled properly by a framework, and in these cases we have to resort to bad hacks. For example, MySQL does not offer an asynchronous database driver, so most of the major frameworks end up using threads to ensure that this communication happens out of band.

The inconvenient API, the added complexity, and the simple fact that most developers haven't switched to using this style of programming lead us to the conclusion that this type of framework is not a desirable final solution to the problem (even though I do concede that you can get Real Work done today using these techniques, and many people do). That being the case, what other options do we have for asynchronous programming? Coroutines and lightweight processes, which brings us to our next major problem.

Languages don't support easier asynchronous paradigms

There are a few language constructs that, if implemented properly in modern programming languages, could pave the way for alternative methods of doing asynchronous programming that don't have the drawbacks of the event loop. These constructs are coroutines and lightweight processes.

A coroutine is a function that can suspend and resume its execution at certain, programmatically specified, locations. This simple concept can serve to transform blocking-looking code to be non-blocking. At certain critical points in your IO library code, the low-level functions that are doing IO can choose to "cooperate". That is, they can choose to suspend execution in order for another function to resume execution and continue on.

Here's an example (it's Python, but fairly understandable for all I hope):

def download_pages():
    google = urlopen('http://www.google.com/').read()
    yahoo = urlopen('http://www.yahoo.com/').read()

Normally the way this would work is that a socket would be opened, connected to Google, an HTTP request sent, and the full response would be read, buffered, and assigned to the google variable, and then in turn the same series of steps would be taken for the yahoo variable.

Ok, now imagine that the underlying socket implementation were built using coroutines that cooperated with each other. This time, just like before, the socket would be opened and a connection would be made to Google, and then a request would be fired off. This time, however, after sending the request, the socket implementation suspends its own execution.

Having suspended its execution (but not yet having returned a value), execution continues on to the next line. The same thing happens on the Yahoo line: once its request has been fired off, the Yahoo line suspends its execution. But now there's something else to cooperate with--there's actually some data ready to be read on the Google socket--so it resumes execution at that point. It reads some data from the Google socket, and then suspends its execution again.

It jumps back and forth between the two coroutines until one has finished. Let's say that the Yahoo socket has finished, but the Google one has not. In this case, the Google socket just continues to read from its socket until it has completed, because there are no other coroutines to cooperate with. Once the Google socket is finally finished, the function returns with all of the buffered data.

Then the Yahoo line returns with all of its buffered data.

We've preserved the style of our blocking code, but we've used asynchronous programming to do it. Best of all, we've preserved our original program flow--the google variable is assigned first, and then the yahoo variable is assigned. In truth, we've got a smart event loop going on underneath the covers to control who gets to execute, but it's hidden from us due to the fact that coroutines are in play.
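
The back-and-forth described above can be approximated with Python generators. This is only a sketch under loud assumptions: the "downloads" are canned lists of chunks rather than real sockets, the fetch and run_all names are invented for the example, and the suspension points are spelled out with an explicit yield instead of being hidden inside the socket library.

```python
from collections import deque


def fetch(name, chunks, results, log):
    """Pretend download: suspend (yield) while waiting for each chunk."""
    log.append("%s: request sent" % name)
    buffered = []
    for chunk in chunks:
        yield                      # cooperate: let another coroutine run
        buffered.append(chunk)
        log.append("%s: read %r" % (name, chunk))
    results[name] = "".join(buffered)


def run_all(coroutines):
    """Tiny round-robin scheduler: resume each coroutine in turn."""
    ready = deque(coroutines)
    while ready:
        coro = ready.popleft()
        try:
            next(coro)             # run until the next suspension point
        except StopIteration:
            continue               # this coroutine has finished
        ready.append(coro)


log = []
results = {}
run_all([
    fetch("google", ["<html>", "G", "</html>"], results, log),
    fetch("yahoo", ["<html>Y</html>"], results, log),
])
```

In this run the Yahoo request is sent before any Google data is read, Yahoo finishes first, and Google then keeps reading on its own because there is nothing left to cooperate with.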

Languages like PHP, Python, Ruby, and Perl simply don't have built-in coroutines that are robust enough to implement this kind of behind-the-scenes transformation. So what about these lightweight processes?

Lightweight processes are what Erlang uses as its main concurrency primitive. These are processes that are mostly implemented in the Erlang VM itself. Each process has approximately 300 words of overhead and its execution is scheduled primarily by the Erlang VM, sharing no state at all amongst processes. We don't have to think twice about spawning a process, as it's essentially free. The catch is that these processes can only communicate via message passing.

Implementing these lightweight processes at the VM level gets rid of the memory overhead, the context switching, and the relative sluggishness of interprocess communication provided by the operating system. Since the VM also has insight into the memory stack of each process, it can freely move or resize those processes and their stacks. That's something that the OS simply cannot do.
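
To give a feel for the programming model (not for Erlang's implementation), here is a toy sketch in Python in which each "process" is a generator with a private mailbox and the only way to influence another process is to send it a message. Everything here (Scheduler, spawn, send, counter, collector) is invented for illustration; there is no preemption, no isolation enforced by the runtime, and none of the VM-level memory management described above.

```python
from collections import deque


class Scheduler(object):
    """Toy scheduler for generator-based 'processes' with private mailboxes."""

    def __init__(self):
        self.processes = {}
        self.mailboxes = {}
        self.next_pid = 0

    def spawn(self, func, *args):
        pid = self.next_pid
        self.next_pid += 1
        proc = func(self, pid, *args)
        next(proc)                       # run up to the first receive (yield)
        self.processes[pid] = proc
        self.mailboxes[pid] = deque()
        return pid

    def send(self, pid, message):
        # Messages to processes that no longer exist are silently dropped.
        if pid in self.mailboxes:
            self.mailboxes[pid].append(message)

    def run(self):
        # Deliver one message per process per round until all mailboxes drain.
        progressed = True
        while progressed:
            progressed = False
            for pid in list(self.processes):
                box = self.mailboxes[pid]
                if box:
                    progressed = True
                    try:
                        self.processes[pid].send(box.popleft())
                    except StopIteration:
                        del self.processes[pid]
                        del self.mailboxes[pid]


def counter(sched, pid, report_to):
    total = 0                            # private state, never shared
    while True:
        message = yield                  # receive: wait for the next message
        if message == "stop":
            sched.send(report_to, ("total", total))
            return
        total += message


def collector(sched, pid, results):
    # `results` is shared only so the example can observe the outcome.
    while True:
        message = yield
        results.append(message)


results = []
sched = Scheduler()
collector_pid = sched.spawn(collector, results)
counter_pid = sched.spawn(counter, collector_pid)
for n in (1, 2, 3):
    sched.send(counter_pid, n)
sched.send(counter_pid, "stop")
sched.run()
```

The counter never exposes its running total; the collector learns the final value only because a message carrying it was sent.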

With this model of lightweight processes, it's possible to again revert back to the convenient model of using a separate process for all of our asynchronous programming needs. The question becomes this: can this notion of lightweight processes be implemented in languages other than Erlang? The answer to that is "I don't know." To my knowledge, Erlang takes advantage of some features of the language itself (such as having no mutable data structures) in its lightweight process implementation.

Where do we go from here?

The key to moving forward is to drop the notion that developers need to learn to think about all of their code in terms of callbacks and asynchrony, as the asynchronous event loop frameworks require them to do. Over the past ten years, we can see that most developers, when faced with that decision, simply choose to ignore it. They continue to use the inferior blocking methodologies of yesteryear.

We need to look at these alternative implementations like coroutines and lightweight processes, so that we can make asynchronous programming as easy as synchronous programming. Only then will we be able to kick this synchronous addiction.

February 08, 2010 10:15 PM


Roberto Alsina

Marave 0.3 is out!

Version 0.3 of Marave, a distraction-free fullscreen editor, is out at http://marave.googlecode.com

This version includes several bugs fixed and features implemented since 0.2:

Marave is free software released under the GPL, and should work on all major desktop platforms.

I would love feedback on this release, as well as ideas for Marave's future, so if you want to help, please join the mailing list:

http://groups.google.com/group/marave-discuss

Of course, if you like Marave, feel free to give me money

February 08, 2010 09:17 PM


John Cook

Twitter daily tip news

I have five Twitter accounts that send out one tip per day, including a new one I just added last week.

Regular expressions

@RegexTip started over today. It’s a cycle of tips for learning regular expressions. It sticks to the regular expression features common to Python, Perl, C#, and many other programming languages. This account posts Monday through Friday.

Keyboard shortcuts

@SansMouse gives one tip a day on using Windows without a mouse. By practicing one keyboard shortcut a day, you can get into the habit of using your mouse less and your keyboard more. This cycle of tips started over January 29 with the most common and most widely useful shortcuts. I’m also sprinkling in a few extra tips that are less well known. This account also posts Monday through Friday.

Math

I have three mathematical accounts. These post seven days a week.

@AlgebraFact just started February 2. It will be a mixture of linear algebra, number theory, group theory, etc.

@ProbFact gives one fact per day from probability. Usually these facts are theorems, but sometimes they include a note on history or applications.

@AnalysisFact gives facts from real and complex analysis. The topics range from elementary to advanced.

What if I don’t use Twitter?

You can visit the page for a Twitter account just like any other web page. And every Twitter account has an RSS feed link allowing you to subscribe just as you would subscribe to a blog.

How do you write these?

I write up content for these accounts in bulk. I may sit down on a Saturday and come up with several weeks' worth of tips. Then I use HootSuite to schedule the tips weeks in advance. Sometimes I’ll post something spontaneously, such as a link to something relevant, but most of the work is done in advance. I use my personal Twitter account for live interaction.

Related links:

Using Windows without a mouse

Regular expressions in

Chart of probability distribution relationships

February 08, 2010 04:26 PM


Ned Batchelder

21st century life in transition

Sitting at the breakfast table, my wife Susan was reading the paper, and when she got to the end of a story, dragged her finger down the paper to try to scroll the newspaper.

I've sat in a movie theater watching trailers, and glanced at the bottom of the screen to try to see the progress bar to see how much time was left in the short clip.

Max said when he's writing on paper with a pencil, and makes a mistake, his left hand twitches as if to hit cmd-Z.

February 08, 2010 12:44 PM


Jonathan Ellis

Distributed deletes in the Cassandra database

Handling deletes in a distributed, eventually consistent system is a little tricky, as demonstrated by the fairly frequent recurrence of the question, "Why doesn't disk usage immediately decrease when I remove data in Cassandra?"

As background, recall that a Cassandra cluster defines a ReplicationFactor that determines how many nodes each key and associated columns are written to. In Cassandra (as in Dynamo), the client controls how many replicas to block for on writes, which includes deletions. In particular, the client may (and typically will) specify a ConsistencyLevel of less than the cluster's ReplicationFactor, that is, the coordinating server node should report the write successful even if some replicas are down or otherwise not responsive to the write.

(Thus, the "eventual" in eventual consistency: if a client reads from a replica that did not get the update with a low enough ConsistencyLevel, it will potentially see old data. Cassandra uses Hinted Handoff, Read Repair, and Anti Entropy to reduce the inconsistency window, as well as offering higher consistency levels such as ConsistencyLevel.QUORUM, but it's still something we have to be aware of.)

Thus, a delete operation can't just wipe out all traces of the data being removed immediately: if we did, and a replica did not receive the delete operation, when it becomes available again it will treat the replicas that did receive the delete as having missed a write update, and repair them! So, instead of wiping out data on delete, Cassandra replaces it with a special value called a tombstone. The tombstone can then be propagated to replicas that missed the initial remove request.
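
A toy model makes the failure mode easier to see. The sketch below is not Cassandra code: replicas are plain dicts mapping a key to a (timestamp, value) pair, and write, repair and read are hypothetical helpers invented for this illustration. It compares a naive delete with a tombstone delete when one replica misses the operation.

```python
TOMBSTONE = object()   # marker meaning "this key was deleted"


def write(replica, key, value, timestamp):
    """Apply a write (or a delete, when value is TOMBSTONE) if it is newer."""
    current = replica.get(key)
    if current is None or timestamp > current[0]:
        replica[key] = (timestamp, value)


def repair(replicas, key):
    """Toy repair: copy the newest known version of `key` to every replica."""
    versions = [r[key] for r in replicas if key in r]
    if not versions:
        return
    newest = max(versions, key=lambda v: v[0])
    for r in replicas:
        write(r, key, newest[1], newest[0])


def read(replica, key):
    entry = replica.get(key)
    if entry is None or entry[1] is TOMBSTONE:
        return None
    return entry[1]


# Naive delete: erase the data outright on the replicas that are reachable.
a, b, c = {}, {}, {}
for r in (a, b, c):
    write(r, "user:1", "alice", timestamp=1)
del a["user:1"]
del b["user:1"]                       # replica c was down and missed this
repair([a, b, c], "user:1")
naive_result = read(a, "user:1")      # the deleted value has been "repaired" back

# Tombstone delete: record the deletion as a newer write.
a, b, c = {}, {}, {}
for r in (a, b, c):
    write(r, "user:1", "alice", timestamp=1)
for r in (a, b):
    write(r, "user:1", TOMBSTONE, timestamp=2)   # c misses this one too
repair([a, b, c], "user:1")
tombstone_result = read(c, "user:1")  # the tombstone won, so c reports no data
```

In the naive case the stale replica wins the repair and the data comes back; with a tombstone the deletion is simply the newest version and propagates like any other write.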

There's one more piece to the problem: how do we know when it's safe to remove tombstones? In a fully distributed system, we can't. We could add a coordinator like ZooKeeper, but that would pollute the simplicity of the design, as well as complicating ops -- then you'd essentially have two systems to monitor, instead of one. (This is not to say ZK is bad software -- I believe it is best in class at what it does -- only that it solves a problem that we do not wish to add to our system.)

So, Cassandra does what distributed systems designers frequently do when confronted with a problem we don't know how to solve: define some additional constraints that turn it into one that we do. Here, we defined a constant, GCGraceSeconds, and had each node track tombstone age locally. Once it has aged past the constant, it can be GC'd. This means that if you have a node down for longer than GCGraceSeconds, you should treat it as a failed node and replace it as described in Cassandra Operations. The default setting is very conservative, at 10 days; you can reduce that once you have Anti Entropy configured to your satisfaction. And of course if you are only running a single Cassandra node, you can reduce it to zero, and tombstones will be GC'd at the first compaction.
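
The age-based rule can be sketched in a few lines. Again this is an illustration and not Cassandra's implementation: the replica is a dict of key to (timestamp in seconds, value), and compact and GC_GRACE_SECONDS are names chosen for the example.

```python
GC_GRACE_SECONDS = 10 * 24 * 60 * 60   # 10 days, the default mentioned above
TOMBSTONE = object()                   # marker meaning "this key was deleted"


def compact(replica, now, gc_grace_seconds=GC_GRACE_SECONDS):
    """Drop tombstones that have aged past the grace period; keep the rest."""
    for key in list(replica):
        written_at, value = replica[key]
        if value is TOMBSTONE and now - written_at >= gc_grace_seconds:
            del replica[key]


DAY = 24 * 60 * 60
replica = {
    "live": (0, "some value"),          # ordinary data is never collected here
    "old": (0, TOMBSTONE),              # deleted 11 days before `now`
    "fresh": (5 * DAY, TOMBSTONE),      # deleted 6 days before `now`
}
compact(replica, now=11 * DAY)
after_default = sorted(replica)

# With a grace period of zero, every tombstone goes at the first compaction.
compact(replica, now=11 * DAY, gc_grace_seconds=0)
after_zero = sorted(replica)
```

Each node can run this check using only its own clock and its own data, which is the point of trading a coordination problem for a time constraint.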

February 08, 2010 12:08 PM


Isotoma

Beginning development with Plone 4 & Dexterity

Over the past few days, I’ve been tinkering with the latest alphas of Plone 4, particularly with an eye to trying out Dexterity on the latest version.

I started out, as many people will, by downloading the unified installer which will install Python 2.6, Zope 2.12 and the Plone 4.0 alpha for you. After a few teething problems with multiple versions of Python on my Hardy host, I had my Plone install up and running.

First impressions among myself and my colleagues here at Isotoma were that firstly, it was a heck of a lot faster than its predecessor. In fact, John Stahl recently blogged that Plone 4 is potentially three times faster than Drupal, Joomla and Wordpress. The other main, marked difference was the default theme, which is a lot slicker, though in my own opinion with its blocks of bright colours and rounded corners, a little too overtly “Web 2.0” (insert air-quotes here).

My next stop was Martin Aspeli’s Dexterity developer manual which whilst up-to-date for the current stable release of Plone, required some tweaking to get going with Plone 4.

The unified installer, by default, makes use of several config files for buildout, which keeps a lot of the core settings in separate files (base.cfg & versions.cfg). I hear that roadrunner is almost ready for Plone 4, but it’ll be a little while before we’re getting it without checking out the source so that had to be chopped. The extends entry for Dexterity also required updating to the latest alpha.

Otherwise, things went very straightforwardly. My buildout.cfg for use with the unified installer can be found below the fold.

[buildout]
extends-cache = extends
extends = http://good-py.appspot.com/release/dexterity/2.0-next
    base.cfg
    versions.cfg

http-address = 8080

eggs =
    Plone
    Products.PDBDebugMode
    Products.LinguaPlone
    plone.reload

zcml =
    plone.reload

develop =
    src/example.project

debug-mode = off

backups-dir=${buildout:directory}/var

user=admin:password

parts =
    productdistros
    instance
    zopepy
    zopeskel
    backup
    unifiedinstaller
    chown
    omelette
    test

extensions =
    mr.developer
    buildout.dumppickedversions

[versions]
Cheetah = 2.2.1
Paste = 1.7.2
PasteScript = 1.7.3
ZopeSkel = 2.15
collective.recipe.backup = 1.3
plone.recipe.command = 1.0
plone.recipe.distros = 1.5
plone.recipe.unifiedinstaller = 4.0a1
PasteDeploy = 1.3.3

[omelette]
recipe = collective.recipe.omelette
eggs = ${instance:eggs}
packages = ./

[test]
recipe = zc.recipe.testrunner
eggs = example.project
extra-paths =
defaults = ['--exit-with-status', '--auto-color', '--auto-progress']

February 08, 2010 11:18 AM


Simon Willison

Integrate Tornado in Django

Integrate Tornado in Django. A handy ./manage.py runtornado management command for firing up a Tornado server that serves your Django application.

February 08, 2010 11:12 AM


Geert Vanderkelen

Python, oursql and MacOS X 10.6 (Snow Leopard)

This post explains how to compile oursql and install it on MacOS 10.6. oursql is a Python database interface for MySQL, an alternative to MySQL for Python (i.e. MySQLdb) and MySQL Connector/Python.

First, find out which MySQL you installed. This can be either the 32-bit or the 64-bit version. To make sure, find the mysqld (e.g. in /usr/local/mysql/bin) and do the following in a Terminal window:


shell> file /usr/local/mysql/bin/mysqld
.../mysqld: Mach-O 64-bit executable x86_64

If you see x86_64, you have a 64-bit build; otherwise it is 32-bit. If you see both, then you have a universal build. This is important for specifying the ARCHFLAGS when building.

Download oursql from Launchpad and unpack it into some directory. Using the information from above, you'll have to do the following for a 64-bit platform (or universal build) in a Terminal window:


shell> ARCHFLAGS="-arch x86_64" python setup.py build
shell> sudo python setup.py install

For 32-bit, you'll have to do:


shell> ARCHFLAGS="-arch i386" python setup.py build
shell> sudo python setup.py install

The following error will be reported when you don't specify the correct ARCHFLAGS:


ld: warning: in .../lib/libmysqlclient.dylib,
file is not of required architecture
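
If you want to script the choice, the rule above (x86_64 in the output means 64-bit or universal, anything else means 32-bit) can be written as a small Python helper. The function name is made up for this sketch, and the 32-bit sample line used in the usage example is an assumption about what file prints, not output copied from a real machine.

```python
def archflags_for(file_output):
    """Pick an ARCHFLAGS value from the output of `file` run on mysqld."""
    # A 64-bit or universal build mentions x86_64 somewhere in the output.
    if "x86_64" in file_output:
        return "-arch x86_64"
    return "-arch i386"


flags_64 = archflags_for(".../mysqld: Mach-O 64-bit executable x86_64")
# Assumed shape of the 32-bit output, for illustration only.
flags_32 = archflags_for(".../mysqld: Mach-O executable i386")
```

The returned string is what you would put in the ARCHFLAGS environment variable before running python setup.py build.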

Tips:

February 08, 2010 11:09 AM


Geek Scrap

Integrate Tornado in Django

Tornado is a nice Python WSGI-compliant web server developed by the guys at FriendFeed. It’s primarily intended as an application server for Python web frameworks and, according to FriendFeed benchmarks, it’s blazing fast thanks to its non-blocking connections. There are already some how-to’s on the web on plugging the Django web framework into the Tornado web server. A quick recap:

  1. A tutorial on Tornado, Django and nginx by Jeremy Bowers.
  2. How to import django framework inside a Tornado project by Lincoln Loop.
  3. A snippet by lawgon.

My approach is slightly different as I wanted to run Tornado using Django management command-line interface.

The 3 easy steps are:

  1. Add the Tornado module to your Django setup. If you use buildout, add a Tornado git checkout to buildout.cfg using the minitage.recipe.fetch recipe, like this:
    [buildout]
    ...
    parts =
    ...
        tornado
        django
    ...
     
    [tornado]
    recipe = minitage.recipe.fetch
    urls = git://github.com/facebook/tornado.git | git | | ${buildout:parts-directory}/tornado
     
    [django]
    recipe = minitage.recipe.scripts
    initialization =
        import os
        os.environ['DJANGO_SETTINGS_MODULE']='project.settings.development'
    scripts =
        django
    eggs =
        Django
        ...
    entry-points=
        django=django.core.management:execute_from_command_line
    extra-paths =
        ${buildout:directory}
        ${tornado:location}
    ...
  2. Next, create a command-line extension hierarchy in your project’s main app:
    $ mkdir project/myapp/management
    $ touch project/myapp/management/__init__.py
    $ mkdir project/myapp/management/commands
    $ touch project/myapp/management/commands/__init__.py
  3. Last, add a runtornado.py script in the project/myapp/management/commands/ folder with the following content:
    from django.core.management.base import BaseCommand, CommandError
    from optparse import make_option
    import os
    import sys
     
    class Command(BaseCommand):
        option_list = BaseCommand.option_list + ()
        help = "Starts a Tornado Web."
        args = '[optional port number, or ipaddr:port]'
     
        def handle(self, addrport='', *args, **options):
            import django
            from django.core.handlers.wsgi import WSGIHandler
            from tornado import httpserver, wsgi, ioloop
     
            # Reopen stdout and stderr unbuffered so output appears immediately.
            sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
            sys.stderr = os.fdopen(sys.stderr.fileno(), 'w', 0)
     
            if args:
                raise CommandError('Usage is runtornado %s' % self.args)
            if not addrport:
                addr = ''
                port = '8000'
            else:
                try:
                    addr, port = addrport.split(':')
                except ValueError:
                    addr, port = '', addrport
            if not addr:
                addr = '127.0.0.1'
     
            if not port.isdigit():
                raise CommandError("%r is not a valid port number." % port)
     
            quit_command = (sys.platform == 'win32') and 'CTRL-BREAK' or 'CONTROL-C'
     
            def inner_run():
                from django.conf import settings
                print "Validating models..."
                self.validate(display_num_errors=True)
                print "\nDjango version %s, using settings %r" % (django.get_version(), settings.SETTINGS_MODULE)
                print "Server is running at http://%s:%s/" % (addr, port)
                print "Quit the server with %s." % quit_command
                application = WSGIHandler()
                container = wsgi.WSGIContainer(application)
                http_server = httpserver.HTTPServer(container)
                http_server.listen(int(port), address=addr)
                ioloop.IOLoop.instance().start()
     
            inner_run()

To run your Tornado web server, you just need to call your usual management program, like manage.py, with the runtornado command, using the same syntax as runserver. In my case, I run the production server using supervisord, with a command like this:

$ ./bin/django runtornado --settings=project.settings.production 8000
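
The optional argument is either a bare port or an ipaddr:port pair. As a sketch, the parsing rules used inside handle can be pulled out into a standalone function, which makes the defaults easy to check. The name parse_addrport is my own, and it raises ValueError where the command raises CommandError so that it runs without Django installed.

```python
def parse_addrport(addrport=""):
    """Return (addr, port) using the same defaults as the runtornado command."""
    if not addrport:
        addr, port = "", "8000"
    else:
        try:
            addr, port = addrport.split(":")
        except ValueError:
            # No colon (or too many): treat the whole argument as the port.
            addr, port = "", addrport
    if not addr:
        addr = "127.0.0.1"
    if not port.isdigit():
        raise ValueError("%r is not a valid port number." % port)
    return addr, int(port)


default = parse_addrport()
port_only = parse_addrport("8080")
full = parse_addrport("0.0.0.0:9000")
```

With no argument the server binds to 127.0.0.1 on port 8000, just like runserver.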

If you found this quick how-to useful, remember to follow me on Twitter or subscribe to my feed for more django tips.

Related posts:

  1. .gitignore for Django buildout
  2. Django dynamic template paths
  3. Install wxPython in buildout

February 08, 2010 09:32 AM


Virgil Dupras

Embedded PyObjC

When people think of a PyObjC application, they usually think of a Python application that uses Objective-C libraries. However, it's also possible to do the opposite: An Objective-C application that embeds Python code through a plugin. Building an application this way has advantages (speed, integration and memory usage) and should be used more often. This article explains why and how to achieve this. More

February 08, 2010 08:52 AM