Archive for the ‘Python’ Category

64-bit Python on Macs

Saturday, June 7th, 2008

There was a question recently on the ROOT mailing list from someone who was having a problem using the python executable that comes with Mac OS X 10.5 together with 64-bit libraries. I went digging around, and noticed a strange discrepancy. The compiled python libraries that ship with Leopard are four-architecture universal binaries:

stan@Rover:/usr/lib/python2.5/config$ file libpython2.5.a 
libpython2.5.a: Mach-O universal binary with 4 architectures
libpython2.5.a (for architecture ppc7400):
    Mach-O dynamically linked shared library ppc
libpython2.5.a (for architecture ppc64):
    Mach-O 64-bit dynamically linked shared library ppc64
libpython2.5.a (for architecture i386):
    Mach-O dynamically linked shared library i386
libpython2.5.a (for architecture x86_64):
    Mach-O 64-bit dynamically linked shared library x86_64

(Reformatted to avoid spilling into my sidebar…)

However, the python executable is not compiled for 64-bit architectures:

stan@Rover:/usr/bin$ file python2.5
python2.5: Mach-O universal binary with 2 architectures
python2.5 (for architecture ppc7400):
    Mach-O executable ppc
python2.5 (for architecture i386):
    Mach-O executable i386

I hadn’t noticed this since my MacBook is the early Core Duo model, rather than Core 2 Duo, so the hardware does not support x86_64.  Apple may have good reasons to force all python scripts to run as 32-bit applications even on 64-bit systems, but I don’t know what they are.

If you find yourself wanting 64-bit python, it’s very easy to make your own, since all the Python libraries on Leopard are already 32/64-bit universal.  Just go grab the very short python64.c from the ROOT svn repository, and compile it like this:

gcc -arch ppc64 -arch x86_64 -arch i386 -arch ppc  \
    -o python64 -I/usr/include/Python2.5 -lpython2.5 python64.c

(Note that this has nothing to do with the ROOT libraries. If you have no idea what ROOT is, the above will still work.)

Now you can check the python64 executable:

Rover:tmp stan$ file python64
python64: Mach-O universal binary with 4 architectures
python64 (for architecture ppc64):
    Mach-O 64-bit executable ppc64
python64 (for architecture x86_64):
    Mach-O 64-bit executable x86_64
python64 (for architecture i386):
    Mach-O executable i386
python64 (for architecture ppc7400):
    Mach-O executable ppc

All four architectures are now present. I haven’t got a 64-bit Mac to try this out on, so I don’t know if it actually runs correctly there. Being universal, this binary works just fine on my 32-bit Mac, of course.
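
If you do get to run it on a 64-bit machine, one quick way to see which slice actually got loaded is to ask the interpreter for its pointer size. This is just a minimal sketch and is not part of python64.c; the helper name pointer_bits is made up for illustration:

```python
import platform
import struct
import sys


def pointer_bits():
    """Return the width of a C pointer in this interpreter, in bits."""
    # struct's "P" format is a native void*, so its size reflects
    # whether the running process is 32-bit or 64-bit.
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("%d-bit Python %s on %s" % (
        pointer_bits(), sys.version.split()[0], platform.machine()))
```

Run under the stock /usr/bin/python2.5 and under python64, the reported width should differ only if the 64-bit slice was really chosen.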

OS X Leopard Roundup

Tuesday, November 6th, 2007

After spending 4 days with the new 10.5 release of Mac OS X, I’ve been pretty impressed. Visually, things have improved, except for the obvious problems with the dock and the menu bar. I had the same initial negative reaction to the translucent 3D dock that most other people did, but it has grown on me slowly. The translucent menu bar, however, is simply atrocious if you do not have a very color-neutral (black, white or grey) background. The most common way to correct this is to add a white stripe to the top of your background image. OpaqueMenuBar does this for you automatically whenever your background changes.

Looks aside, I think Leopard’s biggest advance is the amount of attention Apple has shown to developers. (Clearly, OS X is stealing the love from the iPhone…)

X11.app 2.0 is generally a huge improvement. Now based on the X.org 7.2 code base, it draws on the XDarwin code base for X/OS X integration, rather than the source of the old Panther/Tiger X11.app, which was based on XFree86. As a result, there have been a number of regressions, but it sounds like the future of X11.app will be much better. In particular, the source is now being hosted in the X.org git repository, and the main developer is committed to engaging with the user community.

There are a number of glitches, though. Using a fullscreen X desktop (which I suspect is not terribly common) is broken, as is dragging an X window to another display if you have two monitors on your computer. Most annoying, the patch to fix the yellow cursor bug was dropped on the floor, and didn’t make it into X11.app 2.0. The author has since fixed this in an alpha release on the XDarwin wiki page. The Xquartz binary he posts there works great for me, so I’m happy for now.

The launchd program, which is like init/rc/cron/at/inetd all rolled together, is used to pull off two neat tricks in Leopard. First, the $DISPLAY variable is set to a socket that launchd monitors, so the X server now starts automatically on demand. This means you can start up Terminal, do some work, and as soon as you start an X application, you’ll see X11.app appear. The second trick is that an ssh-agent is now started on demand when you use SSH. Apple’s ssh-agent can fetch passphrases for your keys from the OS X Keychain as well. You don’t need to use SSHKeychain any more (which is good since it had a major memory leak on my system). The only downside to the ssh-agent Keychain support is that there is no obvious way to expire the ssh keys in the agent when the Keychain locks. Once those keys are decrypted into the ssh-agent memory, they stay valid even after you lock your Keychain.

Python has been updated to 2.5.1, which is great because it solves a linker problem I had with the Python bindings for ROOT. The Leopard install of Python includes easy_install, numpy, twisted, and some other handy stuff. In addition, there are new Objective-C/Cocoa bindings, and it comes along with py2app, for generating proper-looking Mac applications entirely written in Python!

The Cocoa programmers are probably excited about Objective-C 2.0, which adds garbage collection and some other improvements, like a compact syntax for looping over an iterator. I’ve been reading up on Objective-C, and the message passing style of object-orientation reminds me greatly of Python’s duck-typing. I find the syntax unspeakably ugly looking, but that’s really just a matter of taste. You can get used to anything, really. :)

Parallel Processing in Python with processing

Sunday, October 7th, 2007

There seems to be lots of discussion these days about concurrency going around the Python blogs. I first clued into the arguments when I read Bruce Eckel’s critique of Python 3000 and Guido’s discussion of why the GIL persists in Python. Ignoring the philosophical question of processes vs. threads, you will, as a practical matter, probably need to use multiple processes in Python to scale your programs to more than one core.

I just finished reading the free first issue of Python Magazine, where Doug Hellmann reviewed three options for running processes in parallel in Python: the subprocess module, the parallel python package, and the processing module. (The article is very good, as is the rest of the magazine. Go read it! I only wish that the US dollar were doing a little better as the magazine is priced in CAN$.)

In a bit of serendipity, Fredrik Lundh posted an article where he optimizes a log parsing program in Python, starting from a single process implementation, moving up to threaded and multi-process implementations. Remembering the processing module from Doug’s article, I decided to take a whack at porting one of Fredrik’s versions to use it.

The processing module is designed to be a nearly drop-in replacement for threading, so I started with the threaded version of the log parser. The changes were very simple, as you can see from the diff:

--- wf-4.py 2007-10-07 00:08:04.000000000 -0500
+++ wf-4-processing.py 2007-10-07 00:37:12.000000000 -0500
@@ -1,6 +1,7 @@
 import re, sys
 from collections import defaultdict
-import threading, Queue
+import processing
+

 # 1: 2.7 seconds
 # 2: 2.5 seconds
@@ -28,14 +29,13 @@
         if not s:
             break

-class Worker(threading.Thread):
+class Worker(processing.Process):
     def run(self):
         while 1:
             chunk = queue.get()
             if chunk is None:
                 break
-            result.append(process(*chunk))
-            queue.task_done()
+            result_queue.put(process(*chunk))

 # --------------------------------------------------------------------

@@ -51,7 +51,8 @@
 except:
     count = 2

-queue = Queue.Queue()
+queue = processing.Queue()
+result_queue = processing.Queue()
 result = []

 for i in range(count):
@@ -59,10 +60,13 @@
     w.setDaemon(1)
     w.start()

+chunk_count = 0
 for chunk in getchunks(FILE):
     queue.put((FILE, chunk))
+    chunk_count += 1

-queue.join()
+for i in range(chunk_count):
+    result.append(result_queue.get())

 count = defaultdict(int)
 for item in result:

Basically, I had to add a results queue to pass the parsed log results back to the parent process, since the child processes no longer had direct access to a list object in the parent memory space. I could have used processing.Namespace to create a shared list, but a queue to feed back to the parent seemed more natural. It also provided a convenient way to test when all of the workers were finished. (Send one chunk out, look for one item in the return queue)
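
The same pattern, boiled down to a self-contained sketch. It is written against multiprocessing, the standard-library module that processing later became, and it assumes a POSIX system where the "fork" start method is available. The process() function here is only a stand-in that counts words, not Fredrik's parser, and the names worker and run are placeholders:

```python
import multiprocessing

# Assumption: a POSIX system (OS X, Linux) with the "fork" start method.
ctx = multiprocessing.get_context("fork")


def process(chunk):
    """Stand-in for the real log parser: count words in one chunk."""
    counts = {}
    for word in chunk.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def worker(task_queue, result_queue):
    """Pull chunks until the None sentinel, push one result per chunk."""
    while True:
        chunk = task_queue.get()
        if chunk is None:
            break
        result_queue.put(process(chunk))


def run(chunks, workers=2):
    task_queue = ctx.Queue()
    result_queue = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(task_queue, result_queue))
             for _ in range(workers)]
    for p in procs:
        p.start()

    # Send the work out, remembering how many results to expect back.
    chunk_count = 0
    for chunk in chunks:
        task_queue.put(chunk)
        chunk_count += 1

    # One result comes back per chunk; arrival order is not guaranteed.
    results = [result_queue.get() for _ in range(chunk_count)]

    # Tell each worker to exit, then wait for them and tidy up.
    for _ in procs:
        task_queue.put(None)
    for p in procs:
        p.join()
    task_queue.close()
    task_queue.join_thread()

    # Merge the per-chunk counts in the parent.
    total = {}
    for counts in results:
        for word, n in counts.items():
            total[word] = total.get(word, 0) + n
    return total


if __name__ == "__main__":
    print(run(["a b a", "b c", "a"]))
```

Counting the chunks on the way out is what lets the parent know when it has seen every result, without needing anything like Queue.join().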

Update: Based on Fredrik’s comment below, I’ve updated to his new test version, which reports the wallclock time difference with both time() and clock(). clock() doesn’t measure the right quantity on OS X, so I’m using the time() numbers, which make more sense now.
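
The distinction matters for any multi-process benchmark: on Unix-like systems clock() reports CPU time used by the current process, so time the parent spends waiting on its workers never shows up. A small sketch of the difference, using time.process_time() as a stand-in for the old clock() (that substitution is an assumption for illustration, since clock() is gone from recent Pythons), with measure as a made-up helper name:

```python
import time


def measure(func):
    """Return (wall_seconds, cpu_seconds) spent running func()."""
    wall_start = time.time()
    cpu_start = time.process_time()
    func()
    return time.time() - wall_start, time.process_time() - cpu_start


if __name__ == "__main__":
    # Sleeping stands in for a parent that waits on worker processes:
    # the wall clock advances, but this process burns almost no CPU.
    wall, cpu = measure(lambda: time.sleep(0.2))
    print("wall: %.3f s, cpu: %.3f s" % (wall, cpu))
```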

The speed improvement was pretty good! Here are the times on my MacBook 1.83 GHz Core Duo system for the other versions that were posted, as well as the processing version:

wf-1-gala.py: 6.40 sec (original by Santiago Gala)
wf-2.py: 2.43 sec (optimized, single threaded version)
wf-3.py: 2.25 sec (chunked version)
wf-4.py: 2.10 sec (chunked version with threads)
wf-5.py: 4.95 sec (2 processes using the subprocess module)
wf-6.py: 1.28 sec (2 processes, subprocess module, using memory-mapped files)
wf-4-processing.py: 1.34 sec (chunked version with processing module, 2 worker processes)

I should note that to get wf-5.py and wf-6.py to run, I originally had to add a call to flush() on the file object in the putobject() function. The flush() was shown in the blog entry, but was not present in the actual source files. That flush problem is fixed now! Strangely, wf-5.py does very poorly, but wf-6.py does much better.

So it looks like the processing version does almost as well as the final, fancy version of the log parser, but with a lot less code. I think that counts as a win. :)

You can download wf-4-processing.py here, and the README over here explains how to download the test data set.

Update: As another aside: I saw very little benefit, and usually some penalty, when increasing the number of processes in any of the multi-process tests beyond two. I suspect this is because, unlike Linux, OS X is a little sluggish with process creation.

Update #2: I finally tried this code on Ubuntu (single proc virtual machine, so no benchmarks) and discovered that the default implementation of processing.Queue uses a POSIX message queue when available. The objects passed around in this program are too large for such a queue, so you need to explicitly ask for a PipeQueue. The code has been updated to reflect this.

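Update #3: And as proof that you can screw up even dead-easy parallel tasks, the above code in fact has a deadlock. A Queue.Queue object has no length limit by default, so put() never blocks. However, processing.PipeQueue.put() can block if the pipe fills up. Since the parent queues every chunk before it reads a single result, the workers can stall writing results while the parent stalls writing chunks.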

The obvious way around this would be to fix processing.PipeQueue so that it has the same semantics as Queue.Queue. Ironically, one could solve that problem by having an extra thread in each process that monitored the pipes. Perhaps something like select() could be used instead to avoid having to introduce threads. That’s probably not portable to Windows, so more thinking here might be required.

Update #4: The processing test code in fact includes a program called test_worker.py which shows how to avoid this problem using a thread in the main process. Would be nice if there was a way to hide that in the Queue implementation though.
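
For what it’s worth, the general shape of that workaround looks something like the following. This is not the code from test_worker.py, just a minimal sketch written against multiprocessing (the standard-library descendant of processing), assuming a POSIX system with the "fork" start method. The queues are deliberately tiny and bounded so that put() really can block, and the names square, feed and run are placeholders:

```python
import multiprocessing
import threading

# Assumption: a POSIX system where the "fork" start method exists.
ctx = multiprocessing.get_context("fork")


def square(task_queue, result_queue):
    """Worker: square numbers until the None sentinel arrives."""
    while True:
        n = task_queue.get()
        if n is None:
            break
        result_queue.put(n * n)


def feed(task_queue, items, workers):
    """Runs in a thread of the parent; may block on put() without harm."""
    for item in items:
        task_queue.put(item)
    for _ in range(workers):
        task_queue.put(None)


def run(items, workers=2, maxsize=2):
    items = list(items)
    # Small bounded queues: put() blocks once maxsize items are waiting.
    task_queue = ctx.Queue(maxsize)
    result_queue = ctx.Queue(maxsize)
    procs = [ctx.Process(target=square, args=(task_queue, result_queue))
             for _ in range(workers)]
    for p in procs:
        p.start()

    # Feed from a thread so the main thread is free to drain results.
    # Filling task_queue here in the main thread instead would deadlock
    # once both queues were full.
    feeder = threading.Thread(target=feed, args=(task_queue, items, workers))
    feeder.start()

    results = [result_queue.get() for _ in items]

    feeder.join()
    for p in procs:
        p.join()
    task_queue.close()
    task_queue.join_thread()
    return sorted(results)


if __name__ == "__main__":
    print(run(range(10)))
```

The worker processes are started before the feeder thread, so nothing is forked while a second thread is running in the parent.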

Getting rid of magic

Thursday, April 20th, 2006

The Djangoistas have declared the magic-removal branch open for beta testing. This is very exciting as the model magic in Django was a bit of a wart. I can understand why it was initially implemented, but it frequently surprised people by making modules containing your models behave in unexpected ways. (Just try importing something in your model or splitting your models into separate files, and you’ll see what I mean.)

The physics experiment for which I created my Document Management System in Django some months back was (effectively) cancelled last week, before I even got to deploy the system to the rest of the group. That sucks greatly, but the upside is now I have no reason not to dig in and make some larger changes to the DMS. Top of the list will be porting to the magic-removal branch, followed by fixing a race condition using the new database transaction support, and also replacing my crappy home-grown indexer with a real one, like Hype. Hype was pretty impressive in my brief testing, and should be MUCH faster than my indexer/search algorithm.

Then I can face the scary prospect of how to make this code easy to deploy. This is one thing that I don’t understand how to do very well for Django apps. Right now, there’s too much that needs to be changed for each new installation. Once I can fix that, then I will be ready to post the DMS on the Internet and hope that it turns out to be useful for someone.

IMAP Backup

Monday, March 27th, 2006

Today I noticed that my IMAP accounts were getting close to quota, so I decided it was time to back up the mailboxes, stash those tar.gz files somewhere, and clean out the old stuff.

Step 1 was to find a command line tool to copy the contents of an IMAP folder, with all nested folders, to some sort of mbox/maildir/whatever local disk format. This sounds easy, but I bounced around between a bunch of perl and python tools to do this, all of which were *WAY* more high powered than what I wanted. Most everything was geared toward IMAP-to-IMAP copy or IMAP to local synchronization, with the ability to detect and propagate changes in either copy. Even after getting one of the sync tools to do a simple copy, I soon discovered that Python’s imaplib has some MemoryError bug, and cannot download multi-MB emails. That pretty much killed every Python based tool. (I couldn’t find anything based on Twisted, and my brain is too weak to write a Twisted-based IMAP tool myself.)

So then I went to Perl. The problem here is that one of my IMAP servers only supports SSL connections, and the default implementation of Mail::IMAPClient does not make it easy to establish an SSL connection (basically, they make you do the socket setup and IMAP login by hand). Thus, most all of the Perl-based tools refused to connect. (Interestingly, Debian has a patched Mail::IMAPClient with an SSL option, but by that point I had given up.)

So, having eliminated Perl and Python tools, I finally just caved in and used Thunderbird. Set up your IMAP account, enable offline browsing on all your folders, and then go to offline mode. Click “Download” when it asks, and *bang*, in your profile directory under ImapMail, you will have an mbox tree of all your mail. Tar, gzip, and go home.
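
Before cleaning anything off the server, it is probably worth a quick sanity check that those mbox files really contain your mail. A rough sketch using Python’s standard mailbox module: the function name count_messages is made up, and it assumes that the .msf files in the tree are Thunderbird’s index files while everything else is an mbox.

```python
import mailbox
import os
import sys


def count_messages(root):
    """Map each mbox file under root to the number of messages in it."""
    counts = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # Assumption: .msf files are Thunderbird's index files and
            # everything else in the tree is an mbox.
            if name.endswith(".msf"):
                continue
            path = os.path.join(dirpath, name)
            box = mailbox.mbox(path, create=False)
            try:
                counts[os.path.relpath(path, root)] = len(box)
            finally:
                box.close()
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the account directory under ImapMail as the only argument.
    for folder, n in sorted(count_messages(sys.argv[1]).items()):
        print("%6d  %s" % (n, folder))
```

If the counts roughly match what the server shows for each folder, the backup is probably good enough to tar up.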
