auto-detect file Unicode encoding using BOM
April 2nd, 2006

As anyone who has worked with Unicode probably knows, Unicode files can be written using an array of different encodings. A special byte sequence known as the BOM (Byte Order Mark) is (usually) put at the beginning of the file to declare its encoding explicitly.
The other day I needed a Python routine that could read files in all these encodings properly. Simply using codecs.open doesn't work as expected: it does not always strip the BOM, as someone already found out. After reading the documentation for Mark Pilgrim's chardet module and a related ASPN recipe, I came up with the following code:
import codecs

def detect_unicode_encoding(fd):
    '''Peeks inside the file stream to guess the correct variant of Unicode
    encoding via the Byte Order Mark (BOM) tag.

    Requires a binary-mode file stream that supports backward and forward
    positioning via .seek(). Leaves the read position just past the BOM,
    or resets it to 0 (start of the file) if no BOM is found.'''
    # The UTF-32 BOMs must be tested before the UTF-16 ones: the UTF-32LE
    # BOM starts with the same two bytes as the UTF-16LE BOM.
    encodings_map = [
        (3, codecs.BOM_UTF8, 'UTF-8'),
        (4, codecs.BOM_UTF32_LE, 'UTF-32LE'),
        (4, codecs.BOM_UTF32_BE, 'UTF-32BE'),
        (2, codecs.BOM_UTF16_LE, 'UTF-16LE'),
        (2, codecs.BOM_UTF16_BE, 'UTF-16BE'),
    ]
    buf = fd.read(4)
    for (offset, bom, name) in encodings_map:
        if buf[:offset] == bom:
            fd.seek(offset)  # skip the byte order mark
            return name
    fd.seek(0)  # return to the beginning - no BOM found
    return None
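As a quick sanity check, you can exercise the routine against in-memory byte streams (io.BytesIO) instead of real files; any seekable binary stream will do. The function is repeated here only so the snippet runs on its own:

```python
import codecs
import io

def detect_unicode_encoding(fd):
    '''Same detection routine as above, repeated so this snippet is self-contained.'''
    encodings_map = [
        (3, codecs.BOM_UTF8, 'UTF-8'),
        (4, codecs.BOM_UTF32_LE, 'UTF-32LE'),
        (4, codecs.BOM_UTF32_BE, 'UTF-32BE'),
        (2, codecs.BOM_UTF16_LE, 'UTF-16LE'),
        (2, codecs.BOM_UTF16_BE, 'UTF-16BE'),
    ]
    buf = fd.read(4)
    for (offset, bom, name) in encodings_map:
        if buf[:offset] == bom:
            fd.seek(offset)  # skip the byte order mark
            return name
    fd.seek(0)  # return to the beginning - no BOM found
    return None

# A UTF-8 stream with a BOM: the encoding is reported and the BOM is
# skipped, so a subsequent read starts at the payload.
sample = io.BytesIO(codecs.BOM_UTF8 + b'hello')
print(detect_unicode_encoding(sample))  # UTF-8
print(sample.read())                    # b'hello'

# A stream with no BOM: None is returned and the position is reset to 0.
plain = io.BytesIO(b'no bom here')
print(detect_unicode_encoding(plain))   # None
print(plain.read())                     # b'no bom here'
```

Note that this is only a sketch of how I test it; in real use you would pass a file opened with open(filename, 'rb') and hand the returned encoding name to a decoder.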