
Planet Python

Last update: July 24, 2014 02:48 PM

July 24, 2014


Logilab

EP14 Pylint sprint Day 1 report


We've had a fairly enjoyable and productive first day in our little hidden room at EuroPython in Berlin! Below are some notable things we worked on and discussed.

First, we discussed and agreed that while we should at some point cut the cord to the logilab.common package, it will take some time, notably because of the usage of logilab.common.configuration, which would be somewhat costly to replace (and is working pretty well). There are some small steps we can take, but basically we should move the pylint/astroid-specific parts of logilab.common back into astroid or pylint. This should be partly done during the sprint, and the remaining work will go into tickets in the tracker.

We also discussed release management. The point is that we should release more often, so every pylint maintainer should be able to do that easily. Sylvain will write a document about the release procedure and ensure access is granted to the pylint and astroid projects on PyPI. We shall release pylint 1.3 / astroid 1.2 soon, and those release branches will be the last ones supporting Python < 2.7.

During this first day, we also had the opportunity to meet Carl Crowder, the guy behind http://landscape.io, as well as David Halter, who is building the Jedi completion library (https://github.com/davidhalter/jedi). Landscape.io runs pylint on thousands of projects, and it would be nice if we could test beta releases on part of this panel. On the other hand, there is probably a lot of code to share with the Jedi library, such as the parser and AST generation, as well as a static inference engine. That deserves a sprint of its own though, so we agreed that a nice first step would be to build a common library for import resolution that doesn't rely on the Python interpreter, while handling most of Python's dark import features like zip/egg imports, .pth files and so on. Indeed, those may be two nice future collaborations!

Last but not least, we got some actual work done:

July 24, 2014 02:39 PM


Andrew Dalke

Lausanne Cheminformatics workshop and contest

Dietrich Rordorf from MDPI sent an announcement to the CHMINF mailing list about the upcoming 9th Workshop in Chemical Information. It will be held on 12 September 2014 in Lausanne, Switzerland. It seems like it will be a nice meeting, so I thought I would forward information about it here. They also have a software contest, with a 2,000 CHF prize, which I think will interest some of my readers.

The workshop has been around for 10 years, so I was a bit surprised that I hadn't heard of it before. Typically between 30 and 50 people attend, which I think is a nice size. The preliminary program is structured around 20 minute presentations, including:

If you know the authors, you might recognize that one is from Strasbourg and another from London, with the rest from Switzerland. I can understand why. From where I live in Sweden it will cost over US $300 to get there, and Lausanne doesn't have its own commercial airport, so I would need to fly into Geneva or Bern, and my local air hub doesn't fly there directly.

But I live in a corner of Europe, and my constraints aren't yours.

Source code contest

I had an email conversation with Luc Patiny about an interesting aspect of the workshop. They are running a contest to identify the best open source cheminformatics tool of the year, with a prize of 2,000 CHF. That's about 1,650 EUR, 2,200 USD, 1,300 GBP, or 15,000 SEK, which is more than enough for someone in Europe, or even the US, to travel there! They have a time slot set aside for the winner of the contest to present the work. The main requirement is that contest participants are selected from submissions (1-2 pages long) to the open access journal Challenges. (And no, there are no journal page fees for this contest, so it doesn't seem like a sneaky revenue-generating trick.)

The other requirement is that the submission be "open source". I put that in quotes because much of my conversation with Luc was to understand what they mean. They want people to be able to download the (unobfuscated) software source code for no cost and be able to read and review it to gain insight.

I think this is very much in line with classical peer review thought, even though it can include software which is neither open source nor free software. For example, software submissions for this contest could be "for research purposes only" or "not for use in nuclear reactors", or state that "redistributions of modified versions are not permitted."

Instead, I think their definition is more in line with what Microsoft terms "shared source".

In my case, my latest version of chemfp is 'commercial open source', meaning that those who pay me money for it get a copy of it under the MIT open source license. It's both "free software" and "open source", but it's not eligible for this workshop because it costs money to download it.

But I live in a corner of open source, and my constraints aren't yours. ;) If you have a free software project, open source software project, or shared source software project, then you might be interested in submitting it to this workshop and journal. If you win, think of it as an all-expenses paid trip to Switzerland. If you don't win, think of it as a free publication.

July 24, 2014 12:00 PM


Europython

EuroPython Society Sessions at EuroPython 2014

We are having three EuroPython Society (EPS) sessions today at EuroPython 2014. They are all held in room B09.

All EuroPython attendees are invited to join in to these sessions and to become EuroPython Society members.

If you would like to support the EuroPython Society and want to sign up for membership, please visit our membership application form.

Membership is free and we’d like to get as many EuroPython attendees signed up as members as possible, because the EuroPython conference series is all about its attendees.

Enjoy,

EuroPython Society

July 24, 2014 09:02 AM




End Point

Python Imports

For a Python project I'm working on, I wrote a parent class with multiple child classes, each of which made use of various modules that were imported in the parent class. A quick solution to making these modules available in the child classes would be to use wildcard imports in the child classes:

from package.parent import *

however, PEP8 warns against this, stating that wildcard imports "make it unclear which names are present in the namespace, confusing both readers and many automated tools."

For example, suppose we have three files:

# a.py
import module1

class A(object):
    def __init__(self):
        pass

# b.py
from a import *  # wildcard import pulls in A and module1
import module2

class B(A):
    def __init__(self):
        super(B, self).__init__()

# c.py
from b import *  # wildcard import pulls in B, module1, and module2

class C(B):
    def __init__(self):
        super(C, self).__init__()

To someone reading just b.py or c.py, it is unknown that module1 is present in the namespace of B and that both module1 and module2 are present in the namespace of C. So, following PEP8, I just explicitly imported any module needed in each child class. Because in my case there were many imports, and because it seemed repetitive to duplicate all those imports in each of the many child classes, I wanted to find out if there was a better solution. While I still don't know if there is, I did go down the road of learning how imports work in Python, at least for 3.4.1, and will share my notes with you.

Python allows you to import modules using the import statement, the built-in function __import__(), and the function importlib.import_module(). The differences between these are:

The import statement first "searches for the named module, then it binds the results of that search to a name in the local scope" ( Python Documentation). Example:

Python 3.4.1 (default, Jul 15 2014, 13:05:56) 
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re
<module 're' from '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/re.py'>
>>> re.sub('s', '', 'bananas')
'banana'

Here the import statement searches for a module named re then binds the result to the variable named re. You can then call re module functions with re.function_name().

A call to function __import__() performs the module search but not the binding; that is left to you. Example:

>>> muh_regex = __import__('re')
>>> muh_regex
<module 're' from '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/re.py'>
>>> muh_regex.sub('s', '', 'bananas')
'banana'

Your third option is to use importlib.import_module(), which, like __import__(), performs only the search:

>>> import importlib
>>> muh_regex = importlib.import_module('re')
>>> muh_regex
<module 're' from '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/re.py'>
>>> muh_regex.sub('s', '', 'bananas')
'banana'

Let's now talk about how Python searches for modules. The first place it looks is in sys.modules, which is a dictionary that caches previously imported modules:

>>> import sys
>>> 're' in sys.modules
False
>>> import re
>>> 're' in sys.modules
True
>>> sys.modules['re']
<module 're' from '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/re.py'>

If the module is not found in sys.modules, Python searches sys.meta_path, a list that contains finder objects. Finders, along with loaders, are objects in Python's import protocol. The job of a finder is to return a module spec, using its find_spec() method, containing the module's import-related information, which the loader then uses to load the actual module. Let's see what I have in my sys.meta_path:

>>> sys.meta_path
[<class '_frozen_importlib.BuiltinImporter'>, <class '_frozen_importlib.FrozenImporter'>, <class '_frozen_importlib.PathFinder'>]

Python will use each finder object in sys.meta_path until the module is found and will raise an ImportError if it is not found. Let's call find_spec() with parameter 're' on each of these finder objects:

>>> sys.meta_path[0].find_spec('re')
>>> sys.meta_path[1].find_spec('re')
>>> sys.meta_path[2].find_spec('re')
ModuleSpec(name='re', loader=<_frozen_importlib.SourceFileLoader object at 0x7ff7eb314438>, origin='/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/re.py')

The first finder knows how to find built-in modules and since re is not a built-in module, it returns None.

>>> 're' in sys.builtin_module_names
False

The second finder knows how to find frozen modules, which re is not. The third knows how to find modules from a list of path entries called an import path. For re the import path is sys.path but for subpackages the import path can be the parent's __path__ attribute.

>>> sys.path
['', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/site-packages/distribute-0.6.49-py3.4.egg', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python34.zip', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/plat-linux', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/lib-dynload', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/site-packages', '/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/site-packages/setuptools-0.6c11-py3.4.egg-info']
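
To illustrate the subpackage case, here's a quick sketch (the paths and object address below are from my machine and will vary): the email package exposes a __path__ attribute, which PathFinder can use to locate email.mime:

>>> import email
>>> email.__path__
['/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/email']
>>> sys.meta_path[2].find_spec('email.mime', email.__path__)
ModuleSpec(name='email.mime', loader=<_frozen_importlib.SourceFileLoader object at 0x7ff7eb314588>, origin='/home/miguel/.pythonbrew/pythons/Python-3.4.1/lib/python3.4/email/mime/__init__.py')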

Once the module spec is found, the loading machinery takes over. That's as far as I dug, but you can read more about the loading process in the documentation.

July 24, 2014 10:00 AM


PyCon PL Conference

We are starting Call for Workshop Proposals

We have an additional Call for Proposals, but this time only for workshops/tutorials. The deadline for the CfP is 15 August.

July 24, 2014 08:44 AM


S. Lott

Building Probabilistic Graphical Models with Python

A deep dive into probability and scipy: https://www.packtpub.com/building-probabilistic-graphical-models-with-python/book

I have to admit up front that this book is out of my league.

The Python is sensible to me. The subject matter -- graph models, learning and inference -- is above my pay grade.

Asking About a Book

Let me summarize before diving into details.

Asking someone else if a book is useful really isn't going to reveal much. Their background is not my background. That they found it helpful/confusing/incomplete/boring isn't really going to indicate anything about how I'll find it.

Asking someone else for a vague, unmeasurable judgement like "useful" or "appropriate" or "helpful" is silly. Someone else's opinions won't apply to you.

Asking if a book is technically correct is more measurable. However. Any competent publisher has a thorough pipeline of editing. It involves at least three steps: Acceptance, Technical Review, and a Final Review. At least three. A good publisher will have multiple technical reviewers. All of this is detailed in the front matter of the book.

Asking someone else if the book was technically correct is like asking if it was reviewed: a silly question. The details of the review process are part of the book. Just check the front matter online before you buy.

It doesn't make sense to ask judgement questions. It doesn't make sense to ask questions answered in the front matter. What can you ask that might be helpful?

I think you might be able to ask completeness questions. "What's omitted from the tutorial?" "What advanced math is assumed?" These are things that can be featured in online reviews.

Sadly, these are not questions I get asked.

Irrational Questions

A colleague had some questions about the book named above, some of which were irrational. I'll try to tackle the rational questions, since they emphasize my point about ways not to ask questions about books.


2.  Is the Python code good at solidifying the mathematical concepts? 

This is a definite maybe situation. The concept of "solidifying" as expressed here bothers me a lot.

Solid mathematics -- to me -- means solid mathematics. Outside any code considerations. I failed a math course in college because I tried to convert everything to algorithms and did not get the math part. A kindly professor explained that "F" very, very clearly. A life lesson. The math exists outside any implementation.

I don't think code can ever "solidify" the mathematics. It goes the other way: the code must properly implement the mathematical concepts. The book depends on scipy, and scipy is a really good implementation of a great deal of advanced math. The implementation of the math sits squarely on the rock-solid foundation of scipy. For me, that's a ringing endorsement of the approach.

If the book reinvented the algorithms available in scipy, that would be reason for concern. The book doesn't reinvent that wheel: it uses scipy to solve problems.

4. Can the code be used to build prototypes? 

Um. What? What does the word prototype mean in that question? If we use the usual sense of software prototype, the answer is a trivial "Yes." The examples are prototypes in that sense. That can't be what the question means.

In this context the word might mean "model". Or it might mean "prototype of a model". If we reexamine the question with those other senses of prototype, we might have an answer that's not trivially "yes." Might.

When they ask about prototype, could they mean "model?" The code in the book is a series of models of different kinds of learning. The models are complete, consistent, and work. That can't be what they're asking.

Could they mean "prototype of a model?" It's possible that we're talking about using the book to build a prototype of a model. For example, we might have a large and complex problem with several more degrees of freedom than the text book examples. In this case, perhaps we might want to simplify the complex problem to make it more like one of the text book problems. Then we could use Python to solve that simplified problem as a prototype for building a final model which is appropriate for the larger problem.

In this sense of prototype, the answer remains "What?"  Clearly, the book solves a number of simplified problems and provides code samples that can be expanded and modified to solve larger and more complex problems.

To get past the trivial "yes" for this question, we can try to examine this in a negative sense. What kind of thing is the book unsuitable for? It's unsuitable as a final implementation of anything but the six problems it tackles. It can't be that "prototype" means "final implementation." The book is unsuitable as a tutorial on Python. It's not possible this is what "prototype" means.

Almost any semantics we assign to "prototype" lead to an answer of "yes". The book is suitable for helping someone build a lot of things.

Summary

Those two were the rational questions. The irrational questions made even less sense.

Taking the other irrational questions into account, it appears that the real question might have been this.

Q: "Can I learn Python from this book?"

A: No.

It's possible that the real question was this:

Q: "Can I learn advanced probabilistic modeling with this book?"

A: Above my pay grade. I'm not sure I could learn probabilistic modeling from this book. Maybe I could. But I don't think that I have the depth required.

It's possible that the real question was this:

Q: "Can I learn both Python and advanced probabilistic modeling with this book?"

A: Still No.

Gaps In The Book

Here's what I could say about the book.

You won't learn much Python from this book. It assumes Python; it doesn't tutor Python. Indeed, it assumes some working scipy knowledge and a scipy installation. It doesn't include a quick-start tutorial on scipy or any of that other hand-holding.

This is not even a quibble with the presentation; it's just an observation: the examples are all written in Python 2, and small changes are required for Python 3. Scipy works with Python 3 (http://www.scipy.org/scipylib/faq.html#do-numpy-and-scipy-support-python-3-x). Reworking the examples seems to involve only small changes to replace print statements. In that respect, the presentation is excellent.




July 24, 2014 09:00 AM


End Point

Python Subprocess Wrapping with sh

When working with shell scripts written in bash/csh/etc., one of the primary tools you rely on is a simple method of piping output and input between subprocesses, allowing the script to build complex logic to accomplish its goal. When working with Python, the same method of calling subprocesses to redirect input/output is available, but the overhead of using it is cumbersome enough to make Python a less desirable scripting language. In effect you would be implementing large parts of the I/O facilities yourself, and potentially even writing replacements for the existing shell utilities that perform the same work. Recently, Python developers attempted to solve this problem by updating an existing Python subprocess wrapper library called pbs into an easier-to-use library called sh.

Sh can be installed using pip, and the author has posted some documentation for the library here: http://amoffat.github.io/sh/

Using the sh library

After installing the library into your version of Python, there are two ways to call any existing shell command available to the system. First, you can import the command as though it were itself a Python library:

from sh import hostname
print(hostname())

Alternatively, you can call the command directly by referencing the sh namespace prior to the command name:

import sh
print(sh.hostname())

When running this command on my linux workstation (hostname atlas) it will return the expected results:

Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sh
>>> print(sh.hostname())
atlas

However, at this point we are merely replacing a single shell command which prints output to the screen; the real benefit of shell scripts is that you can chain commands together to create complex logic that helps you do work.

Advanced Gymnastics

A common use of shell scripts is to give administrators the ability to quickly filter log file output and search for specific conditions within those logs, alerting in the event that an application starts throwing errors. With piping in sh, we can create a simple log watcher in Python, capable of calling anything we desire when the log file contains any of the conditions we are looking for.

To pipe together commands using the sh library, you would encapsulate each command in series to create a similar syntax to bash piping:

>>> print(sh.wc(sh.ls("-l", "/etc"), "-l"))
199

This command is equivalent to the bash pipe "ls -l /etc | wc -l", indicating that the long listing of /etc on my workstation contained 199 lines of output. Each piped command is encapsulated inside the parentheses of the command that precedes it.

For our log listener we will use the tail command along with a Python iterator to watch for a potential error condition, which I will represent with the string "ERROR":

>>> for line in sh.tail("-f", "/tmp/test_log", _iter=True):
...     if "ERROR" in line:
...         print line

In this example, once executed, Python will call the tail command to follow a particular log file. It will iterate over each line of output produced by tail, and if any of the lines contain the string we are watching for, Python will print that line to standard output. At this point, this is similar to using the tail command and piping the output to a string search command like grep. However, you could replace the third line of the Python with a more complex action: emailing the error condition to a developer or administrator for review, or perhaps initiating a procedure to recover from the error automatically.
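
Here's a minimal sketch of such an action, assuming a local SMTP server; the addresses are placeholders you would adjust for your own site:

import smtplib
from email.mime.text import MIMEText

import sh

def notify(line):
    # Email the offending log line; the host and addresses are placeholders.
    msg = MIMEText(line)
    msg["Subject"] = "ERROR in /tmp/test_log"
    msg["From"] = "logwatcher@example.com"
    msg["To"] = "admin@example.com"
    server = smtplib.SMTP("localhost")
    server.sendmail(msg["From"], [msg["To"]], msg.as_string())
    server.quit()

for line in sh.tail("-f", "/tmp/test_log", _iter=True):
    if "ERROR" in line:
        notify(line)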

Conclusions


In this manner, with just a few lines of Python, much like with bash, one can create a relatively complex process without recreating the shell commands which already perform this work, and without a convoluted wrapping process passing output from command to command. This combination of the existing shell commands and the power of Python gives you all the functions available to any Python environment, with the ease of using shell commands to do some of the work. In the future I will definitely be using this Python library for my own shell scripting needs, as I have generally preferred the syntax and ease of use of Python over that of bash, but now I will be able to enjoy both at the same time.

July 24, 2014 07:58 AM

July 23, 2014


Daniel Greenfeld

Want to work for Eventbrite? (part 2)

For various reasons I had to change some things about my Eventbrite test. The new version is listed here and the previous blog post has been updated:

104, 116, 116, 112, 115, 58, 47, 47, 103, 105, 115, 116, 46, 103, 105, 116, 104, 117,
98, 46, 99, 111, 109, 47, 112, 121, 100, 97, 110, 110, 121, 47, 97, 56, 55, 100, 57,
54, 49, 54, 102, 49, 50, 55, 52, 48, 48, 97, 57, 55, 97, 52

Good luck!

July 23, 2014 07:42 PM

Want to work for Eventbrite?

Join me, Andrew Godwin (South, Django migrations), Simon Willison (co-founder of Django, co-founder of Lanyrd), and many other talented people at Eventbrite. We have great challenges, the kind that inspire you to rise to the occasion. We need you to help us overcome them.

I should mention that Eventbrite is committed to giving back to the community. Most notably, Eventbrite just contributed £5000 (about US$8500!) to the Django Rest Framework Kickstarter. We're a frequent sponsor of events around the world. And it doesn't stop there: during the discussion of any tool outside our domain of running events, Eventbrite managers will ask, "When can we open source this?"

As someone who loves working on open source, Eventbrite is the place to be. I say this because I know what we're planning to do in the future. If you join us, you'll find out sooner rather than later. ;)

What's Eventbrite like as a company? Well, we're rated in the top 20 of best places to work in the United States. We get full benefits, free lunch, educational opportunities, and much more. In addition, I have to say that my co-workers are friendly, intelligent, always learning, and love to do things the right way, even if it's the hard way.

Applying for Eventbrite Python positions

Sure, you could go to the official Eventbrite job site, but this method is a fun challenge that proves to us you have the basics down. All you need to do is pass this little test of mine. If you fail any portion of this test we can't consider hiring you.

  1. Can you work in San Francisco (USA), Nashville (USA), or Mendoza (Argentina)?
  2. Do you know Python? Sorry, we aren't looking for FoxPro coders. Experience with git-scm, CSS, JavaScript, Django, and MySQL are definite pluses.
  3. Are you able to communicate in both written and verbal English?
  4. Are you a coder? I will throw away anything from a recruiter.
  5. Can you figure out how to contact me? Eventbrite doesn't believe in testing applicants with puzzle logic questions. Instead, we ask you meaningful technical questions or to solve a coding problem. With that in mind, use the following to contact me:
104, 116, 116, 112, 115, 58, 47, 47, 103, 105, 115, 116, 46, 103, 105, 116, 104, 117,
98, 46, 99, 111, 109, 47, 112, 121, 100, 97, 110, 110, 121, 47, 97, 56, 55, 100, 57,
54, 49, 54, 102, 49, 50, 55, 52, 48, 48, 97, 57, 55, 97, 52

hints: chr, problem solving, list comprehension
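
As a nudge on the technique, here's the same idea applied to a different, made-up list, so the actual puzzle above stays intact:

>>> codes = [104, 105]
>>> ''.join([chr(n) for n in codes])
'hi'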

Note: This is the updated test that is identical to my next blog post.

Good luck!


July 23, 2014 07:00 PM


Brendan Scott

Kivy – Problems with on_new_intent in Apps

In an earlier post I wrote about the wonderful on_new_intent hook which allows your application to receive intents from other apps on an Android device.  When the other app sends an intent your app will process it in whatever callback you bound to on_new_intent (which I will call on_new_intent() for the sake of simplicity).  To […]

July 23, 2014 06:16 AM


Wichert Akkerman

Lingua 2.4 released

This is a bugfix release which fixes several problems reported by users.

Lingua is a Python package that helps you find translatable texts in your code and generate a POT file for them. Think of it as xgettext on steroids for Python.

Read entire article.

July 23, 2014 01:00 AM

July 22, 2014


Enthought

Webinar: Work Better, Smarter, and Faster in Python with Enthought Training on Demand

Join us for a webinar: we'll demonstrate how Enthought Training on Demand can help both new Python users and experienced Python developers be better, smarter, and faster at the scientific and analytic computing tasks that directly impact their daily productivity and drive results. Space is limited! Click a webinar session link below to reserve […]

July 22, 2014 11:40 PM


Data Community DC

Confire: A new Python library

Announcing the release of a new open source library: Confire is a simple but powerful configuration scheme that builds on the configuration parsers of Scapy, elasticsearch, Django and others. The basic scheme is to have a configuration search path that looks for YAML files in standard locations. The search path is hierarchical (meaning that system configurations are overridden by user configurations, etc.). These YAML files are then added to a default, class-based configuration management scheme that allows for easy development.

Full documentation can be found here: http://confire.readthedocs.org/

Confire on PyPI

In a fit of procrastination, I put my first project on PyPI (the Python Package Index): Confire, a simple app configuration scheme using YAML and class-based defaults. It was an incredible lesson in how much work goes into making it possible for Python developers to simply pip install something! I wanted to go the whole nine yards and set up documentation on Read the Docs and an open source platform on GitHub, and even though it took a while, it was well worth the effort!

There are many configuration packages available on PyPI – it seems that everyone has a different way of doing it. However, this is my preferred way, and after I had copied and pasted this code into more than 3 projects, I decided it was time to make it available as a dependency via PyPI. The configuration builds on what I've learned/done in configuring Scapy, elasticsearch, and Django, and builds on these principles:

  1. Configuration should not be Python (sorry Django). It’s too easy to screw stuff up, and anyway, you don’t want to deal with importing a settings file from /etc!
  2. Configuration should be on a per-system basis. This means that there should be an /etc/app.yaml configuration file as well as a $HOME/.app.yaml configuration file that overwrites the system defaults for a particular user. For development purposes there should also be a $(pwd)/app.yaml file so that you don’t have to sprinkle things throughout the system if not needed.
  3. Developers should be able to have reasonable defaults already written in code if no YAML file has been provided. These defaults should be added in an API like way that is class based and modularized.
  4. Accessing settings from the code should be easy.

So there you have it, with these things in mind I wrote confire and I hope you enjoy it!
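
To give a flavor of the scheme, here is a minimal sketch; the class name, paths, and settings are invented for illustration, so check the documentation linked below for the actual API:

import os

from confire import Configuration

class MyAppConfiguration(Configuration):
    # Defaults live in code; YAML files later in the search path override them.
    CONF_PATHS = [
        '/etc/myapp.yaml',                    # system-wide configuration
        os.path.expanduser('~/.myapp.yaml'),  # per-user overrides
        os.path.abspath('conf/myapp.yaml'),   # development overrides
    ]

    debug = False
    workers = 4

settings = MyAppConfiguration.load()
print(settings.debug)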

The awesome details:

The link on PyPI is here: https://pypi.python.org/pypi/confire
The docs on Read the Docs: http://confire.readthedocs.org/
The docs on PythonHosted: http://pythonhosted.org/confire/

The regular details:

The repository is here: https://github.com/bbengfort/confire
And the bug tracker, of course: https://github.com/bbengfort/confire/issues
The agile board: https://waffle.io/bbengfort/confire

Needless to say, this means you can do the following:

pip install confire

Hopefully this means you will be able to quickly deploy sensible configuration defaults in your code!

The post Confire: A new Python library appeared first on Data Community DC.

July 22, 2014 06:45 PM


Ian Ozsvald

IPython Memory Usage interactive tool

I’ve written a tool (ipython_memory_usage) to help my colleague and I understand how RAM is allocated for large matrix work, it’ll work for any large memory allocations (numpy or regular Python or whatever) and the allocs/deallocs are reported after every command. Here’s an example – we make a matrix of 10,000,000 elements costing 76MB and then delete it:

IPython 2.1.0 -- An enhanced Interactive Python.
In [1]: %run -i  ipython_memory_usage.py
In [2]: a=np.ones(1e7)
'a=np.ones(1e7)' used 76.2305 MiB RAM in 0.32s, 
peaked 0.00 MiB above current, total RAM usage 125.61 MiB 
In [3]: del a 
'del a' used -76.2031 MiB RAM in 0.10s, 
peaked 0.00 MiB above current, total RAM usage 49.40 MiB


The more interesting behaviour is checking the intermediate RAM usage during an operation. In the following example we've got three arrays costing approx. 760MB each; we assign the result to a fourth array, and overall the operation adds the cost of a temporary fifth array, which would be invisible to the end user if they're not aware of the use of temporaries in the background:

In [2]: a=np.ones(1e8); b=np.ones(1e8); c=np.ones(1e8)
'a=np.ones(1e8); b=np.ones(1e8); c=np.ones(1e8)' 
used 2288.8750 MiB RAM in 1.02s, 
peaked 0.00 MiB above current, total RAM usage 2338.06 MiB 
In [3]: d=a*b+c 
'd=a*b+c' used 762.9453 MiB RAM in 0.91s, 
peaked 667.91 MiB above current, total RAM usage 3101.01 MiB
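
A minimal sketch of sidestepping that temporary with numpy's in-place operations (I haven't shown the tool's output for this variant, but the peak should drop to near zero):

In [4]: d = a * b  # one allocation, for the result array
In [5]: d += c     # in-place add: no temporary fifth array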


If you’re running out of RAM when you work with large datasets in IPython, this tool should give you a clue as to where your RAM is being used.

UPDATE – this works in IPython for PyPy too, so we can show off PyPy's homogeneous memory optimisation:

# CPython 2.7
In [3]: l=range(int(1e8))
'l=range(int(1e8))' used 3107.5117 MiB RAM in 2.18s, 
peaked 0.00 MiB above current, total RAM usage 3157.91 MiB

And the same in PyPy:

# IPython with PyPy 2.7
In [7]: l=[x for x in range(int(1e8))]
'l=[x for x in range(int(1e8))]' used 763.2031 MiB RAM in 9.88s, 
peaked 0.00 MiB above current, total RAM usage 815.09 MiB

If we then add a non-homogeneous type (e.g. appending None to the ints) then the list gets converted back to a list of regular, heavy-weight Python objects:

In [8]:  l.append(None)
'l.append(None)' used 3850.1680 MiB RAM in 8.16s, 
peaked 0.00 MiB above current, total RAM usage 4667.53 MiB

The inspiration for this tool came from a chat with my colleague where we were discussing the memory usage techniques I discussed in my new High Performance Python book and I realised that what we needed was a lighter-weight tool that just ran in the background.

My colleague was fighting a scikit-learn feature matrix scaling problem where all the intermediate objects leading to a binarised matrix took >6GB on his 6GB laptop. As a result I wrote this tool (no, it isn't in the book; I only wrote it last Saturday!). During discussion (and later validated with the tool) we got his allocation to <4GB, so it ran without a hitch on his laptop.

I’m probably going to demo this at a future PyDataLondon meetup.


Ian applies Data Science as an AI/Data Scientist for companies in ModelInsight and Mor Consulting, founded the image and text annotation API Annotate.io, co-authored SocialTies, programs Python, authored The Screencasting Handbook, lives in London and is a consumer of fine coffees.

July 22, 2014 03:57 PM


ShiningPanda

Requires.io introduces changelogs

Requires.io is proud to introduce a new feature: changelogs for your requirements!

Knowing that a dependency is outdated is only the first step towards upgrading: you also want to check the changelog before moving forward, unless of course your CI is top notch. Requires.io brings these changelogs right on your requirements page!

Changelog of Kenneth Reitz' requests.

You can check any public project monitored by requires.io to see what it looks like; the changelog of Kenneth Reitz' requests, shown above, is one example.

It is not quite perfect yet: some changelogs are not parsed correctly, or are missing because we do not know how to find them automatically. Expect incremental improvements as we refine the crawling/parsing.

For Package Maintainers

This is a short guide for package maintainers who wish to have the changelog of their library on requires.io in case it isn't there already.

We look for changelogs in two different places:

  1. PyPI Descriptions

    We look at the description of the package hosted on pypi. It is encoded in reStructuredText.

    An example that works very well for us is Kenneth Reitz' requests.

  2. Changelog Files on GitHub/Bitbucket

    We look for files named changes or changelog (case insensitive), with or without extension, at the root of the package directory if it is hosted on BitBucket or GitHub.

    Of course, for us to find the file, you first need to make sure we actually know where your project is hosted! This can be done by setting the url in the metadata of the project (for instance as home_url; see the sketch after this list), or by putting a link in the description of the project: we will find it provided the project name on GitHub/Bitbucket matches the name of the package on PyPI.

    We then parse the file as reStructuredText or Markdown according to the file extension: so changes.md will be parsed as markdown while changes.rst will be parsed as reStructuredText. If there is no extension, we try to guess... with varying degrees of success.

    Examples of projects that work well for us are:

    • lxml (it would be even better with .rst as an extension rather than .txt),
    • six (as you can see it works even without any extension at all).
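
As promised above, a minimal sketch of pointing the package metadata at the repository; the package name and URL here are hypothetical:

from setuptools import setup

setup(
    name="mypackage",  # hypothetical package name
    version="0.1.0",
    # A repository link in the metadata lets crawlers locate changelog files.
    url="https://github.com/example/mypackage",
)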

I just updated my Changelog, how do I update it on requires.io?

Drop us an email here: support@requires.io! We will do a re-run of the crawling and parsing of your package.

July 22, 2014 01:45 PM


PyCon Australia

DjangoGirls: Free programming workshop for women

Following the success of the amazing DjangoGirls at EuroPython, we're running a workshop on the first day of sprints.

So come along to pick up an Open Source project to work on in a supportive environment or just sprint with other girl programmers for the day.

No Django experience is required, just an interest in learning.

There'll be cupcakes and excellent coaches/mentors.

We'd love to have anyone who identifies as a woman come along!

Register here: http://djangogirls.org/australia.html

July 22, 2014 09:19 AM

Dinner Speaker: Paul Gampe

Netwhois. logfilerotate. logfileradius. What do these three things have in common? They were all written in Perl by Paul Gampe to solve problems faced by ISPs in the 90s. Those are problems he would know about, being the man who connected Japan to the internet.

Paul Gampe’s dinner talk is entitled “The Rise and Fall of Perl: Lessons for Python from an old Perl guy”. Paul will share war stories from his long career in open source software, illustrating how Perl got big by being the language of choice for solving a wide range of software problems. Paul believes that Python now fills that role, and sounds a cautionary note, warning that “If you forget where you came from, you might wind up like Java”.

He will also draw on experience gained through his involvement with Open Source Industry Australia (OSIA), especially in preparing a response to the proposed Trans-Pacific Partnership. Paul knows first-hand the need to be ever vigilant about the complexity of these multilateral agreements, which pose immense challenges for an economy such as Australia's, which needs to preserve the existing flexibilities and freedoms Australians enjoy under our own legal frameworks.

Paul Gampe currently directs information management, security, applications and software development, communications technology and service management for Australia’s leading Data-Centre-as-a-Service (DCaaS) provider, NEXTDC Limited. Before that, Paul was Vice President of World-Wide Engineering Services and Operations for Red Hat, where he directed a global function delivering software engineering services and pioneered development of Red Hat’s Asian language products. You might have seen him at linux.conf.au talking about how to make money from open source software.

July 22, 2014 08:41 AM

July 21, 2014


Mike Driscoll

Python 101: An Intro to Pony ORM

The Pony ORM project is another object-relational mapper package for Python. It allows you to query a database using Python generator expressions. The project also provides an online ER Diagram Editor that is supposed to help you create a model, and it is one of the only Python packages I've seen with a multi-licensing scheme, where you can develop under a GNU license or purchase a license for non-open-source work. See the website for additional details.

In this article, we will spend some time learning the basics of this package.


Getting Started

Since this project is not included with Python, you will need to download and install it. If you have pip, then you can just do this:


pip install pony

Otherwise you’ll have to download the source and install it via its setup.py script.


Creating the Database

We will start out by creating a database to hold some music. We will need two tables: Artist and Album. Let’s get started!

import datetime
import pony.orm as pny
 
database = pny.Database("sqlite",
                        "music.sqlite",
                        create_db=True)
 
########################################################################
class Artist(database.Entity):
    """
    Pony ORM model of the Artist table
    """
    name = pny.Required(unicode)
    albums = pny.Set("Album")
 
########################################################################
class Album(database.Entity):
    """
    Pony ORM model of album table
    """
    artist = pny.Required(Artist)
    title = pny.Required(unicode)
    release_date = pny.Required(datetime.date)
    publisher = pny.Required(unicode)
    media_type = pny.Required(unicode)
 
# turn on debug mode
pny.sql_debug(True)
 
# map the models to the database 
# and create the tables, if they don't exist
database.generate_mapping(create_tables=True)

Pony ORM will create our primary key for us automatically if we don't specify one. To create a foreign key, all you need to do is pass the model class into a field of another table, as we did in the Album class. Each Required field takes a Python type. Most of our fields are unicode, with one being a datetime object. Next we turn on debug mode, which will output the SQL that Pony generates when it creates the tables in the last statement. Note that if you run this code multiple times, you won't recreate the tables: Pony checks whether they exist before creating them.

If you run the code above, you should see something like this get generated as output:

GET CONNECTION FROM THE LOCAL POOL
PRAGMA foreign_keys = false
BEGIN IMMEDIATE TRANSACTION
CREATE TABLE "Artist" (
  "id" INTEGER PRIMARY KEY AUTOINCREMENT,
  "name" TEXT NOT NULL
)
 
CREATE TABLE "Album" (
  "id" INTEGER PRIMARY KEY AUTOINCREMENT,
  "artist" INTEGER NOT NULL REFERENCES "Artist" ("id"),
  "title" TEXT NOT NULL,
  "release_date" DATE NOT NULL,
  "publisher" TEXT NOT NULL,
  "media_type" TEXT NOT NULL
)
 
CREATE INDEX "idx_album__artist" ON "Album" ("artist")
 
SELECT "Album"."id", "Album"."artist", "Album"."title", "Album"."release_date", "Album"."publisher", "Album"."media_type"
FROM "Album" "Album"
WHERE 0 = 1
 
SELECT "Artist"."id", "Artist"."name"
FROM "Artist" "Artist"
WHERE 0 = 1
 
COMMIT
PRAGMA foreign_keys = true
CLOSE CONNECTION

Wasn’t that neat? Now we’re ready to learn how to add data to our database.


How to Insert / Add Data to Your Tables

Pony makes adding data to your tables pretty painless. Let’s take a look at how easy it is:

import datetime
import pony.orm as pny
 
from models import Album, Artist
 
#----------------------------------------------------------------------
@pny.db_session
def add_data():
    """"""
 
    new_artist = Artist(name=u"Newsboys")
    bands = [u"MXPX", u"Kutless", u"Thousand Foot Krutch"]
    for band in bands:
        artist = Artist(name=band)
 
    album = Album(artist=new_artist,
                  title=u"Read All About It",
                  release_date=datetime.date(1988, 12, 1),
                  publisher=u"Refuge",
                  media_type=u"CD")
 
    albums = [{"artist": new_artist,
               "title": "Hell is for Wimps",
               "release_date": datetime.date(1990,07,31),
               "publisher": "Sparrow",
               "media_type": "CD"
               },
              {"artist": new_artist,
               "title": "Love Liberty Disco", 
               "release_date": datetime.date(1999,11,16),
               "publisher": "Sparrow",
               "media_type": "CD"
              },
              {"artist": new_artist,
               "title": "Thrive",
               "release_date": datetime.date(2002,03,26),
               "publisher": "Sparrow",
               "media_type": "CD"}
              ]
 
    for album in albums:
        a = Album(**album)
 
if __name__ == "__main__":
    add_data()
 
    # use db_session as a context manager
    with pny.db_session:
        a = Artist(name="Skillet")

You will note that we need to use a decorator called db_session to work with the database. It takes care of opening a connection, committing the data, and closing the connection. You can also use it as a context manager, which is demonstrated at the very end of this piece of code.


Using Basic Queries to Modify Records with Pony ORM

In this section, we will learn how to make some basic queries and modify a few entries in our database.

import pony.orm as pny
 
from models import Artist, Album
 
with pny.db_session:
    band = Artist.get(name="Newsboys")
    print band.name
 
    for record in band.albums:
        print record.title
 
    # update a record
    band_name = Artist.get(name="Kutless")
    band_name.name = "Beach Boys"

Here we use the db_session as a context manager. We make a query to get an artist object from the database and print its name. Then we loop over the artist’s albums that are also contained in the returned object. Finally, we change one of the artist’s names.

Let’s try querying the database using a generator:

result = pny.select(i.name for i in Artist)
result.show()

If you run this code, you should see something like the following:


i.name
--------------------
Newsboys
MXPX
Beach Boys
Thousand Foot Krutch

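Generator queries can also carry filter conditions. Here's a minimal sketch along those lines, reusing the models above (illustrative, so verify against the Pony docs):

import datetime
import pony.orm as pny

from models import Album

with pny.db_session:
    # albums released after 1990, via a filtered generator expression
    recent = pny.select(a for a in Album
                        if a.release_date > datetime.date(1990, 1, 1))
    for album in recent:
        print album.title, album.release_date
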
The documentation has several other examples that are worth checking out. Note that Pony also supports using SQL itself via its select_by_sql and get_by_sql methods.


How to Delete Records in Pony ORM

Deleting records with Pony is also pretty easy. Let’s remove one of the bands from the database:

import pony.orm as pny
 
from models import Artist
 
with pny.db_session:
    band = Artist.get(name="MXPX")
    band.delete()

Once more we use db_session to access the database and commit our changes. We use the band object's delete method to remove the record. You will need to dig a little to find out whether Pony supports cascading deletes, where deleting an Artist also deletes all the Albums connected to it. According to the docs, if the field is Required, then cascade is enabled.
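
A minimal sketch of spelling the cascade out explicitly on the relationship rather than relying on that default (the cascade_delete option comes from Pony's attribute documentation; double-check it against your version):

class Artist(database.Entity):
    name = pny.Required(unicode)
    # Explicitly delete an artist's albums when the artist is deleted.
    albums = pny.Set("Album", cascade_delete=True)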


Wrapping Up

Now you know the basics of using the Pony ORM package. I personally think the documentation needs a little work, as you have to dig a lot to find some of the functionality that I felt should have been in the tutorials. Overall though, the documentation is still a lot better than that of most projects. Give it a go and see what you think!


Additional Resources

July 21, 2014 05:15 PM


Europython

Video Streams of EuroPython 2014

All talks and keynotes will be streamed.

Please check out the video streaming page for related streams by room, quality and device.

July 21, 2014 07:16 AM


Wingware News

Wing IDE 5.0.8 Released: July 21, 2014

This release supports creating multiple selections, adds ability to step over statements or blocks in the debugger, adds support for Stackless 2.7.8, fixes debugging 32-bit Python on OS X, and makes many other minor improvements and bug fixes.

July 21, 2014 01:00 AM

July 20, 2014


Europython

Evaluation of talks and trainings

This year we will proceed with the evaluation and feedback of talks and trainings as follows:

[evaluation process images]

July 20, 2014 11:30 AM


PyCon PL Conference

Hynek Schlawack will be speaker on PyCon PL 2014!

Hynek Schlawack has joined the PyCon PL speakers. He will address security issues like the Heartbleed attack and TLS. Hynek Schlawack is an infrastructure and software engineer from Berlin, Germany, a PSF fellow, and a contributor to a wide variety of open source projects, including high-profile ones such as Twisted and CPython. Currently he works at the Potsdam-based Variomedia AG, redoing its infrastructure with Python-based solutions, mostly using Twisted and Pyramid. His main areas of interest are security, networks, and robust software. Get a ticket for the conference here.

July 20, 2014 11:12 AM


Europython

Cash-only payments at the registration desk

If you want or need to pay for one-day passes, partner programme tickets, etc. at the conference registration desk, you will have to pay in cash. We do not accept cards (neither Maestro cards nor credit cards).

July 20, 2014 10:28 AM

Registration desk opening hours

The registration desk is open today (Sunday) from 17:00 to 20:00, and registration starts on Monday morning at 8:30.

July 20, 2014 10:05 AM