Archive for May, 2005

Or maybe Nokia figured it out

Saturday, May 28th, 2005

Looks like Nokia is planning to release an Internet tablet that comes closer to my hoped-for tablet. Being open to third-party developers and using a Linux/GNOME-based platform is very nice indeed. And at an expected retail price of $350, it’s at least priced closer to “digital accessory” than “computer”.

Unfortunately, it’s missing out on two things: size and storage. The device is 5.6 in x 3.1 in, which would put the screen at about the size of a 3×5 index card. At that size, there isn’t even room for a microdrive, which is why you only get 64 MB of user-accessible flash storage. The WiFi will have to be your primary information conduit.

I was hoping for something a little closer to maybe 75% of the size of a letter sheet of paper. That would probably drive the cost up to > $600, and price it out of the range of most people. On the flip side, this thing has a resolution of 800×480, which translates to a screen with ~150 dpi. That (assuming your vision or glasses are good) will make even small text pretty readable. Hopefully some gadget store near me will get these when they come out and I can go take a look. I need to see in person if reducing the size of a PDF by 33% to make the horizontal dimension fit on a 5 inch wide screen produces something you could read on the bus.
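For what it’s worth, here is the arithmetic behind that ~150 dpi figure as a quick Python sketch. It assumes the screen fills the whole 5.6 in x 3.1 in face of the device (the real panel would be smaller and therefore denser), and the 10 pt body text is just an illustrative assumption, not a measurement:

```python
# Back-of-the-envelope check of the screen density estimate.
# Assumption: the 800x480 panel covers the full 5.6 in x 3.1 in device face.
WIDTH_PX, HEIGHT_PX = 800, 480
WIDTH_IN, HEIGHT_IN = 5.6, 3.1

dpi_horizontal = WIDTH_PX / WIDTH_IN
dpi_vertical = HEIGHT_PX / HEIGHT_IN

# Shrinking a page by 33% shrinks the type by the same factor.
# The 10 pt body text is a hypothetical example, not a measured value.
REDUCTION = 0.33
body_pt = 10.0
scaled_pt = body_pt * (1 - REDUCTION)

# Height of the scaled text in pixels (1 pt = 1/72 in).
scaled_px = scaled_pt / 72 * dpi_horizontal

print(f"horizontal: {dpi_horizontal:.0f} dpi, vertical: {dpi_vertical:.0f} dpi")
print(f"{body_pt:.0f} pt text becomes {scaled_pt:.1f} pt, about {scaled_px:.0f} px tall")
```

Whether type that small is comfortable on a moving bus is exactly the thing that needs checking in person.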

Tablet Mac?

Tuesday, May 10th, 2005

Looks like Apple got my earlier memo, and has filed a design patent application. Thanks, guys! I’ll be checking my mailbox for the prototype soon. Extra props for the touch screen!

(But seriously, I do hope this makes it to market, though I fear it will be overdesigned and overpriced. Think “accessory,” not “desktop replacement.”)

Fun with Tiger

Saturday, May 7th, 2005

Been having much fun with Tiger, like some other Mac-heads I know. Lots of neat things about it, but I’ve been most fascinated by Spotlight. Pervasive metadata on your files sounds like a great idea, and a long overdue one.

The model appears to be: Volumes have a metadata store which contains information about most files on disk. The metadata for a given file is represented by a dictionary, with keys like kMDItemAudioSampleRate. The metadata is generated by plugins (extension .mdimporter) which are invoked via just one method. The method is given the pathname of the file and returns a dictionary with the attributes of that file. Pretty simple. I want to try making one, but I need to think of a useful plugin idea first. My first thought was Ogg Vorbis files, but those seem to be handled by the QuickTime importer. (The only part of the Vorbis plugin that still works, but more on that later.)
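To make that model concrete, here is a toy version in Python. This is only a sketch of the shape of the thing, not Apple's API: the function names, the xLineCount key, and the idea of picking an importer by file extension are all assumptions made up for illustration (I haven't checked how Spotlight really chooses a plugin).

```python
import os
import tempfile

# Hypothetical stand-in for an .mdimporter plugin: one function that takes
# a pathname and returns a dictionary of attributes for that file.
def text_importer(path):
    with open(path, "r", errors="replace") as f:
        text = f.read()
    return {
        "kMDItemFSName": os.path.basename(path),
        "kMDItemFSSize": os.path.getsize(path),
        "xLineCount": text.count("\n"),  # made-up attribute key
    }

# Hypothetical registry mapping file extensions to importers.
IMPORTERS = {".txt": text_importer}

# Toy per-volume metadata store: pathname -> attribute dictionary.
metadata_store = {}

def import_file(path):
    """Run the matching importer (if any) and record its output."""
    ext = os.path.splitext(path)[1].lower()
    importer = IMPORTERS.get(ext)
    if importer is None:
        return None
    metadata_store[path] = importer(path)
    return metadata_store[path]

# Small demo on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "notes.txt")
    with open(sample, "wb") as f:
        f.write(b"hello\nworld\n")
    attrs = import_file(sample)
    skipped = import_file(os.path.join(tmp, "movie.mov"))  # no importer registered
    print(attrs)
```

The point is just how small the plugin contract is: pathname in, dictionary out, and the store does the rest.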

Complex metadata is pretty time-consuming to calculate, so the UNIX locate technique of running every night with a cron job just wouldn’t work. (Even for locate, which only indexes filenames, reindexing from scratch every night is annoying.) Instead, the kernel informs the metadata system whenever a file on disk changes, and it invokes the appropriate metadata importer plugin. Now the indexing work is basically distributed over the idle time of the machine, and only repeated as needed.
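A toy comparison of the two strategies, counting how often the (expensive) importer has to run. Everything here is hypothetical, including the class and method names and the fake file list; it only illustrates the bookkeeping, not how the kernel notification actually works:

```python
# Toy comparison of nightly full reindexing versus change-driven indexing.
# All names here are hypothetical; this is not how Spotlight is implemented.

class Index:
    def __init__(self, importer):
        self.importer = importer      # callable: path -> attribute dict
        self.store = {}               # path -> attribute dict
        self.importer_calls = 0       # running count of importer invocations

    def _import(self, path):
        self.importer_calls += 1
        self.store[path] = self.importer(path)

    def reindex_all(self, paths):
        """The locate/cron approach: redo every file, changed or not."""
        for path in paths:
            self._import(path)

    def on_file_changed(self, path):
        """The notification approach: redo only the file that changed."""
        self._import(path)


def fake_importer(path):
    # Stand-in for an expensive metadata extraction.
    return {"name": path.rsplit("/", 1)[-1]}


files = [f"/demo/file{i}.txt" for i in range(1000)]
changed_today = files[:3]

nightly = Index(fake_importer)
nightly.reindex_all(files)                 # one night's cron run

incremental = Index(fake_importer)
incremental.reindex_all(files)             # one-time initial scan
baseline = incremental.importer_calls
for path in changed_today:                 # notifications during the day
    incremental.on_file_changed(path)
incremental_cost = incremental.importer_calls - baseline

print("nightly run:", nightly.importer_calls, "importer calls")
print("change-driven:", incremental_cost, "importer calls")
```

After the initial scan, the change-driven index only pays for the files that were actually touched.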

Of course, Apple hides all this stuff behind a very simple user interface. You type in words, and it shows you files. However, if you want to search by a particular attribute, no GUI is provided. In the vast majority of cases, you don’t need it, so Apple opted not to distract you with the options. Fortunately, if you open up a Terminal window, there are several command line utilities to query the metadata and really see what’s going on:

  • mdimport - Manually run importer plugins on files.
  • mdls - Show the metadata attributes stored for a file.
  • mdutil - General management functions, like erasing the metadata store or disabling indexing on a volume.
  • mdfind - Query the metadata store.

(Links to man pages provided because it looks like the man pages are not installed by default. Update: I just borked my MANPATH, the man pages are there.)

For example, here’s the info for a QuickTime movie I downloaded with Safari:

Rover:~/Desktop stan$ mdls RvB_Episode55_LoRes.mov
RvB_Episode55_LoRes.mov -------------
kMDItemAttributeChangeDate = 2005-05-07 11:13:36 -0500
kMDItemAudioBitRate = 127832
kMDItemAudioChannelCount = 2
kMDItemCodecs = (AAC, "Sorenson Video 3")
kMDItemContentCreationDate = 2005-05-03 11:50:43 -0500
kMDItemContentModificationDate = 2005-05-03 11:50:43 -0500
kMDItemContentType = "com.apple.quicktime-movie"
kMDItemContentTypeTree = (
"com.apple.quicktime-movie",
"public.movie",
"public.audiovisual-content",
"public.data",
"public.item",
"public.content"
)
kMDItemDisplayName = "RvB_Episode55_LoRes.mov"
kMDItemDurationSeconds = 302.735
kMDItemFSContentChangeDate = 2005-05-03 11:50:43 -0500
kMDItemFSCreationDate = 2005-05-03 11:50:43 -0500
kMDItemFSCreatorCode = 0
kMDItemFSFinderFlags = 0
kMDItemFSInvisible = 0
kMDItemFSLabel = 0
kMDItemFSName = "RvB_Episode55_LoRes.mov"
kMDItemFSNodeCount = 0
kMDItemFSOwnerGroupID = 501
kMDItemFSOwnerUserID = 501
kMDItemFSSize = 24302686
kMDItemFSTypeCode = 0
kMDItemID = 5274552
kMDItemKind = "QuickTime Movie"
kMDItemLastUsedDate = 2005-05-03 10:50:43 -0500
kMDItemMediaTypes = (Sound, Video)
kMDItemPixelHeight = 240
kMDItemPixelWidth = 360
kMDItemStreamable = 0
kMDItemTotalBitRate = 639232
kMDItemUsedDates = (2005-05-03 10:50:43 -0500)
kMDItemVideoBitRate = 511400
kMDItemWhereFroms = (
"http://files.redvsblue.com/3x55shisno/RvB_Episode55_LoRes.mov",
"http://www.redvsblue.com/archive/"
)

So you see the list includes all sorts of information about when the content in the file was created (this is separate from the creation date of the file), bitrates, dimensions, etc. Especially interesting is the last attribute, kMDItemWhereFroms. Both the original URL of the file and the page which linked to it are included as part of the file metadata. I saw it reported on some blog (which I cannot now locate) that files downloaded with Safari have the URL information included with them, but I haven’t figured out how this is being achieved. Because Safari does this with any file you download, it must either manually inject this information into the metadata store somehow, OR it is writing some sort of extended attribute information to the filesystem directly, which is later picked up by the metadata importers. (Incidentally, this is how Beagle stores file metadata in general.)

Of course, you can turn it around and search for stuff like “What files have I downloaded from redvsblue.com?”:

Rover:~/Desktop stan$ mdfind "kMDItemWhereFroms == *redvsblue.com*"
/Users/stan/Desktop/RvB_Episode55_LoRes.mov

or “What rock songs do I have that are less than 3 minutes long?”:

Rover:~/Desktop stan$ mdfind "kMDItemMusicalGenre == Rock && kMDItemDurationSeconds < 180"
/Users/stan/Music/iTunes/iTunes Music/Chuck Berry/Blues/12 Route 66.m4p

Of course, you have to be a little careful with this, because it would require files to follow some sort of attribute standard. This is the nasty, endless argument that every metadata tagging standard has to deal with. That’s probably why Spotlight doesn’t let you select specific attributes to search on. Then it doesn’t matter if some kinds of files call the producer of creative work the “Artist,” and others the “Author,” or whatever. Hurling all the attribute values into a big bag and ignoring the keys is remarkably effective.
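Here is a minimal sketch of the “big bag of values” idea in Python. The files and attribute keys are made up, and this is only my guess at the general approach, not a description of what Spotlight does internally:

```python
# Toy "bag of values" search: ignore attribute keys, match on values only.
# The files and attribute keys below are made-up examples.
files = {
    "song.m4a": {"Artist": "Chuck Berry", "Genre": "Rock"},
    "paper.pdf": {"Author": "Chuck Berry", "Title": "Notes on guitar"},
    "photo.jpg": {"Creator": "Someone Else"},
}

def search(store, term):
    """Return the paths where any attribute value contains the term."""
    term = term.lower()
    hits = []
    for path, attrs in store.items():
        if any(term in str(value).lower() for value in attrs.values()):
            hits.append(path)
    return sorted(hits)

print(search(files, "chuck berry"))
```

Both the song and the paper come back, even though one file says “Artist” and the other says “Author”. The search never looks at the keys, so the naming argument never comes up.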

I’m still investigating other questions, like:

  • Can multiple mdimporters be run on a single file? The attributes for a file would just be the union of the dictionaries produced by all the relevant plugins. This would allow existing mdimporters to be extended by just making a new importer that extracts some additional attributes. Not as efficient, but useful if you can’t modify the old importer.
  • How is textual content indexed? Spotlight clearly returns results based on the contents of PDF/Text/RTF files, but mdls doesn’t show any attributes corresponding to keywords used in the document. Is this information stored somewhere else?

What’s Your Vector, Victor?

Monday, May 2nd, 2005

Along with playing with gcj, I’ve been playing around with gcc 4.0. I don’t expect to use it on any production code in the near future, but I want to know what’s coming. In general, reports have been mixed, at least on x86 and AMD64 architectures. I’m still waiting for my university to get the site-licensed copies of Tiger, so I can’t test how GCC 4.0 performs on the iBook G4. I anxiously await benchmarks on the PPC architecture.

In the meantime, I wanted to try out what I consider to be the most interesting addition to GCC: the loop auto-vectorizer. The possibility of using SIMD instructions on CPUs to accelerate loops sounds very promising. This optimization is NOT enabled by default, and while trying to figure out how to turn it on, I ended up reviewing the SIMD options on x86/AMD64:

  • MMX - Eight 64-bit wide registers. Integer ops only.
  • 3DNow! - Eight 64-bit wide registers. Integer and single-precision float ops.
  • SSE - Eight 128-bit wide registers. Single-precision float ops only.
  • SSE2 - Eight 128-bit wide registers. Integer, single and double-precision float ops.
  • SSE2/AMD64 - Sixteen (!) 128-bit wide registers.

To actually get the vectorization optimizations to be used, you need to use -OX -ftree-vectorize, where X = 1, 2, or whatever your favorite optimization level is. If you leave out -O, the vectorizer will be skipped. On AMD64, this is enough to get the vectorizer going. On x86, you might also need -msse or -msse2. For added fun, you can throw in -fdump-tree-vect -ftree-vectorizer-verbose=8 and check out the .vect file for a detailed explanation of what the compiler is doing when it analyzes loops.

I’ll spare you crappy benchmarks for now. My test code is basically the simplest array loop possible, and totally meaningless. I’m hoping to turn this compiler loose on our FORTRAN code and see what it does with our array loops.

(Update) One thing I will say about performance: With SSE2 on the Opteron, the benefit of vectorizing simple loops appears to be linear in the number of variables you can pack into one SIMD register. So, for 16-bit shorts, the speedup is 8x, for 32-bit floats it is 4x, and for 64-bit doubles it is 2x. This makes sense, but I was surprised to see it work out almost exactly. A little hunting in the gcc manual showed this is because on AMD64, gcc actually uses the SSE registers by default, even for scalar floating point math, and just wastes most of the register.
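The ideal numbers fall straight out of the register width. A tiny sketch, assuming 128-bit registers and scalar code that handles one element per instruction:

```python
# Ideal speedup from packing values into one 128-bit SSE2 register,
# assuming the scalar code handles one element per instruction.
REGISTER_BITS = 128

element_bits = {"short": 16, "float": 32, "double": 64}

# Number of elements that fit side by side in one register.
lanes = {name: REGISTER_BITS // bits for name, bits in element_bits.items()}

for name, count in lanes.items():
    print(f"{name}: {count} values per register, ideal speedup {count}x")
```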

Next Scientific Computing Language?

Sunday, May 1st, 2005

As the resident “computer enthusiast” in our lab, I was asked some time ago by my advisor what I thought the next big scientific computing language would be. We do experimental particle physics, which involves lots of Monte Carlo and LOTS of CPU time. Simulations are always either statistics limited or only represent very simplified models of our detectors. The rise of cheap, fast, commodity CPUs has been very good to us, but we always need more FLOPS than we are given. :)

So, any scientific computing language is going to have to put speed ahead of almost all other considerations. This is why FORTRAN 77 has been so popular for so long. It is a very simple language, and it is damn fast when compiled. Simple structures, static variable allocation, and a total lack of pointers all help compilers a lot.

The de facto successor to FORTRAN in the particle physics community is now C++. (FORTRAN 90 and 95 address many of the reasons why you might want to drop FORTRAN, but I’m told the standard was too late in coming to stop the abandonment of FORTRAN.) I can see how this might be seen as a great improvement. Objects and dynamic memory allocation are very nice to have, and C++ compilers have gotten a lot better. However, in both my own code and others’, I’ve seen that C++ is also a double-edged sword. Pointer mistakes are all too easy to make, especially for inexperienced programmers. (As we all are. We went into physics and not CS for a reason.) C++ offers too many ways to blow our collective legs off, and we spend a lot of time trying to deal with the products of our own ignorance.

Thus I come back to the original question: what’s next? This is really hypothetical, since the community is unlikely to change languages just to keep up with current fads. But pretend for the moment that we could pick the language of the future. Since speed is important, the language must be compiled in some fashion. That kills Python and Perl, and related languages. (A pity too, Python would have been my first choice except for that. I still think it makes good glue and scripting code, driving compiled components.)

What’s left? I actually cannot think of a language designed to be compiled to native code that has come out since C++. I guess Java finally convinced everyone that bytecode and virtual machines are the way to go. That’s fine for most programs (which are seldom CPU bound on today’s fast CPUs), but a virtual machine for scientific computing? You must be joking!

However, I remembered GCJ, a GCC front-end which can produce native binaries from Java source code (and even compiled class files). I thought maybe compiled Java would be a decent language for particle physics. You get to have object-oriented language features, but the pointer/memory leak problems are taken care of for you. As a compiled language, surely it would be close to C++ in performance. Especially if you could turn off safety checks in production code, like array bounds checking and such.

I wanted a very simple program that I could easily port to both C/C++ and Java that would be relevant for particle physics, so I picked a random number generator. Lots of tight loops and arithmetic operations. I started with the Mersenne Twister random number generator implemented in C. Then I did a fairly straightforward port to Java (my Java is pretty rusty).

Then I compiled the code (gcc 3.3.5) maxing out the optimizations:

g++ -O3 -o benchmark_c++ Benchmark.cxx
gcj -O3 -fno-store-check -fno-bounds-check --main=Benchmark -o benchmark_java Benchmark.java
gcj -O3 -C Benchmark.java

And finally timed the code on a mostly idle machine:

Compiler/VM                   CPU time (sec)
g++ (native binary)            3.334
gcj (native binary)           10.440
gcj (.class) + Sun JVM 1.4    13.544
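To put those timings in relative terms, here is the same table as ratios against the g++ time (plain arithmetic on the numbers above, nothing newly measured):

```python
# CPU times (seconds) copied from the table above.
times = {
    "g++ (native binary)": 3.334,
    "gcj (native binary)": 10.440,
    "gcj (.class) + Sun JVM 1.4": 13.544,
}

baseline = times["g++ (native binary)"]

# How many times longer each run took compared to the g++ binary.
slowdown = {name: t / baseline for name, t in times.items()}

for name, factor in slowdown.items():
    print(f"{name}: {factor:.2f}x the g++ time")
```

So the tuned gcj binary takes roughly three times as long as g++, and the Sun JVM roughly four times.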

Note that the first few times I did this, I wasn’t using -O3 -fno-store-check -fno-bounds-check with gcj. Stock gcj was actually 50% slower than the Sun JVM. That just blew me away. Clearly Sun has been working hard on their JVM implementation.

Of course, this benchmark is quite lame. The code only really tests bit operations on integers and small loop performance. Fortunately, I found a very comprehensive set of Java/C#/C++ benchmarks which was very enlightening. Sun’s JVM performance on Linpack is stunning. Both Java and C# (on Sun’s JVM 1.5 and MS’s .NET, respectively) were able to get within 5% of the C performance for 1000×1000 matrices.

This little exercise has definitely cleared up some prejudices I had. While Java for GUI work generally sucks, for non-GUI code it is much faster than I expected. Sun’s JVM is way faster than I even thought was possible, and Java (or C#) for scientific computing isn’t nearly as crazy as I thought it would be. If Java is still around in 8 years, I could see it (or a related language) taking over from C++ in particle physics.
