Monday, December 22, 2014

Exporting to MS Excel format (XLS)

from Python is relatively easy with the xlwt package. The linked page says that xlwt is a
Library to create spreadsheet files compatible with MS Excel 97/2000/XP/2003 XLS files, on any platform, with Python 2.3 to 2.7
To that I can only add that it's fast, too!
I've heard it doesn't support creation of XLS files with formulas, though I haven't tested that myself. The package does ship an xlwt.Formula class, so the claim may well be outdated.

Installation is easy as usual:
$ pip install xlwt
Here's the link to a PDF tutorial covering xlwt along with xlrd (read Excel files from Python) and xlutils (utilities for both xlwt and xlrd). You may also want to check out this site for some information.

To create yourself an Excel file, you basically do this:
import xlwt
from datetime import datetime


book = xlwt.Workbook()
sheet = book.add_sheet('SheetName', cell_overwrite_ok=True)  # Ability to overwrite may be extremely handy 

# You might want to change column widths. 700 is 0.21" as I've found out experimentally
sheet.col(0).width = 700

# Or row heights, but in this case enabling height_mismatch helps you achieve the desired effect.
sheet.row(0).height_mismatch = 1
sheet.row(0).height = 260  # 0.18"

sheet.write(0, 0, u"Current date/time:",
            xlwt.easyxf('font: name Arial, height 160; align: vertical bottom, horizontal left; '
                        'pattern: fore_colour white, pattern solid;'))
sheet.write(0, 2, datetime.now(),
            xlwt.easyxf('font: name Arial, height 160; '
                        'align: vertical center, horizontal center; '
                        'borders: left thin, right thin, top thin, bottom thin; '
                        'pattern: fore_colour white, pattern solid;',
                        num_format_str="DD/MM/YY H:MM:SS;@"))
sheet.col(2).width = 4000  # Hopefully this is enough to show datetime

sheet.write_merge(1, 1, 0, 2,
                  123.55,
                  xlwt.easyxf('font: name Arial Cyr, height 160; '
                              'align: vertical center, horizontal right; '
                              'pattern: fore_colour white, pattern solid;',
                              num_format_str="#,##0.00"))

book.save("out.xls")
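
The width and height numbers above look arbitrary, but they seem to line up with the units xlwt is documented to use: row and font heights in twips (1/20 of a point) and column widths in 1/256 of the width of the character '0' in the default font. The helper names below are hypothetical and not part of xlwt; this is only a sketch of the arithmetic under that assumption.

```python
# Hypothetical helpers (NOT part of xlwt) for the unit arithmetic.
# Assumption: heights are in twips (1/20 pt), widths in 1/256 of a '0' character.

TWIPS_PER_POINT = 20
POINTS_PER_INCH = 72
WIDTH_UNITS_PER_CHAR = 256


def points_to_height(points):
    """Convert a size in points to the integer used for row or font height."""
    return int(round(points * TWIPS_PER_POINT))


def height_to_inches(height):
    """Convert a height value back to inches."""
    return height / float(TWIPS_PER_POINT * POINTS_PER_INCH)


def chars_to_width(n_chars):
    """Convert a width in default-font characters to a column width value."""
    return int(n_chars * WIDTH_UNITS_PER_CHAR)


print(points_to_height(13))             # 260, the row height used above
print(round(height_to_inches(260), 2))  # 0.18
print(points_to_height(8))              # 160, the font height used above
print(chars_to_width(16))               # 4096, close to the 4000 used for the datetime column
```

Under this reading, "height 160" in the easyxf strings is an 8 pt font, and the 260 row height is 13 pt, which matches the 0.18" noted in the comment.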

Need to create a nice PDF from Python?

pdfkit is a package that can help. This package can cook you some PDFs from HTML. It's a wrapper around wkhtmltopdf, so make sure you install that. On Debian/Ubuntu I wouldn't apt-get it from the standard repository: as far as I know, that build comes without the wkhtmltopdf Qt patches, and those patches sure provide nice functionality.

Installing pdfkit is as simple as
$ pip install pdfkit
Then you can do this:
import pdfkit

pdfkit.from_url('http://google.com', 'out1.pdf')
pdfkit.from_file('test.html', 'out2.pdf')  # Provided you have test.html in your current folder
pdfkit.from_string('Hello!', 'out3.pdf')
pdfkit.from_string('<html><table><tr><th>Header</th></tr><tr><td>Row 1</td></tr><tr><td>Row 2</td></tr></table></html>', 'out4.pdf')

Must validate

your Python dictionary against a certain set of rules you know it must comply with (a schema)? Voluptuous is a Python data validation library! (Despite the name.)

Say you've got a dictionary from json.loads. Just define your schema and check if what you've got is good:
from voluptuous import Schema, Required, All, Length, Range
schema = Schema({
    Required('q'): All(str, Length(min=1)),
    Required('per_page', default=5): All(int, Range(min=1, max=20)),
    'page': All(int, Range(min=0)),
})
"q" is required:
from voluptuous import MultipleInvalid, Invalid
try:
    schema({})
    print("MultipleInvalid not raised")
except MultipleInvalid as e:
    print(e)
required key not provided @ data['q']
...must be a string:
try:
    schema({'q': 123})
    print("MultipleInvalid not raised")
except MultipleInvalid as e:
    print(e)
expected str for dictionary value @ data['q']
...and must be at least one character long:
try:
    schema({'q': ''})
    print("MultipleInvalid not raised")
except MultipleInvalid as e:
    print(e)
length of value must be at least 1 for dictionary value @ data['q']
Note how per_page is assigned its default value:
try:
    s = schema({'q': '#topic'})
    print("s = {0}".format(s))
except MultipleInvalid as e:
    print(e)
s = {'q': '#topic', 'per_page': 5}

Gedit as a developer's editor

A good text editor is an essential tool for anyone who writes code. Gedit is okay-ish out of the box, but many nice features are lacking. Here are the plugins I use to make life with Gedit easier:

  1. Gedit Source Code Browser: adds a new side panel that shows symbols (functions, classes, variables, etc.) in the code you're editing. This is a common feature found in virtually any IDE out there.
    Sadly, this plugin didn't work out of the box, so I had to make several modifications to make it work:

    Went to ~/.local/share/gedit/plugins/ and in sourcecodebrowser.plugin replaced
    Loader=python
    
    with
    Loader=python3
    

    Because of the switch to Python 3, I also had to replace the following line in sourcecodebrowser/ctags.py, in parse(self, command, executable=None):
    symbols = self._parse_text(p.communicate()[0])
    
    with
    symbols = self._parse_text(p.communicate()[0].decode("utf-8"))
    

    I'm not sure if this was really necessary, but it seemed logical to me not to supply the 'ctags' executable name in the parameters string (since it's given separately as self.ctags_executable anyway). In sourcecodebrowser/plugin.py:
    command = "ctags -nu --fields=fiKlmnsSzt -f - '%s'" % path
    
    replaced with
    command = "-nu --fields=fiKlmnsSzt -f - '%s'" % path
    
    Once again, not sure this was needed.
  2. Gedit Restore Tabs. Does what it says: Gedit automatically opens all the files you had open in your previous session. Not having to recall where exactly the files I was editing are helps me concentrate on the code.
    Once again I had to modify the plugin definition file here. In restoretabs.plugin replaced
    Loader=python
    
    with
    Loader=python3
    
    But that was it. No source-code modification was required.

    Important update: I've encountered a text document that somehow causes Restore Tabs to crash Gedit on startup. A workaround is to remove this document from the list of files to be loaded on Gedit startup using the dconf-editor tool (if you don't have it, do sudo apt install dconf-editor). To find the list, run dconf-editor and go to org -> gnome -> gedit -> plugins -> restoretabs. Remove the offending file from "uris", or just clear the list if you don't know which file is causing the problem.
  3. gedit-file-search: Gedit plugin to search a text in all files in a directory. No tweaking needed, nice!
  4. Plugins in gedit-plugins package, like Color Scheme Editor, Draw Spaces, File Browser Panel, Smart Spaces, Text Size and even a Python Console!
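
The decode() change in the Source Code Browser item above comes down to a general Python 3 behaviour: a child process's captured output arrives as bytes, not str. Here is a minimal sketch of that, where sys.executable stands in for the ctags binary and the printed line is made up for illustration (neither is taken from the plugin).

```python
import subprocess
import sys

# Run a child process and capture its standard output.
# Assumption: sys.executable is used instead of ctags so the sketch runs anywhere.
p = subprocess.Popen(
    [sys.executable, "-c", "print('main function line:1')"],
    stdout=subprocess.PIPE,
)
raw = p.communicate()[0]

# On Python 3 this is bytes, so it has to be decoded before text parsing.
text = raw.decode("utf-8")

fields = text.strip().split(" ")
print(type(raw).__name__)  # bytes on Python 3
print(fields)              # ['main', 'function', 'line:1']
```

Any code written for Python 2 that splits or searches the output with str arguments will raise a TypeError on Python 3 until the output is decoded this way.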

Sunday, December 21, 2014

Adding command line arguments to your Python program

is so simple using the Click package! From the Click documentation page:
import click

@click.command()
@click.option('--count', default=1, help='Number of greetings.')
@click.option('--name', prompt='Your name',
              help='The person to greet.')
def hello(count, name):
    """Simple program that greets NAME for a total of COUNT times."""
    for x in range(count):
        click.echo('Hello %s!' % name)

if __name__ == '__main__':
    hello()
And what it looks like when run:
$ python hello.py --count=3
Your name: John
Hello John!
Hello John!
Hello John!
It automatically generates nicely formatted help pages:
$ python hello.py --help
Usage: hello.py [OPTIONS]

  Simple program that greets NAME for a total of COUNT times.

Options:
  --count INTEGER  Number of greetings.
  --name TEXT      The person to greet.
  --help           Show this message and exit.
It's magic! Installation is super-simple, too:
pip install click
And you're set!
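
If you want to check such a command without typing at a prompt, Click ships a small test helper, click.testing.CliRunner, that invokes the command in-process. The sketch below reuses the hello command from the documentation example; the argument values are just an illustration.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option('--count', default=1, help='Number of greetings.')
@click.option('--name', prompt='Your name', help='The person to greet.')
def hello(count, name):
    """Simple program that greets NAME for a total of COUNT times."""
    for _ in range(count):
        click.echo('Hello %s!' % name)


# Invoke the command in-process, passing the arguments as a list.
# Supplying --name explicitly means the prompt is never shown.
runner = CliRunner()
result = runner.invoke(hello, ['--count', '2', '--name', 'John'])

print(result.exit_code)  # 0
print(result.output)     # two "Hello John!" lines
```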

Python logging

module has several interesting types of handlers, useful if you want your log files to roll over, like RotatingFileHandler and TimedRotatingFileHandler. The former is useful if you want to limit log file size, but I wanted my logs to roll over at midnight, so I went with the latter.
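
For reference, here is a minimal sketch of how the two handlers are typically set up. The file names, size limit, backup counts and logger name are arbitrary values picked for illustration, and a temporary directory is used so the sketch doesn't touch real logs.

```python
import logging
import logging.handlers
import os
import tempfile

log_dir = tempfile.mkdtemp()  # throwaway directory for this sketch

# Roll over when the file reaches about 1 MB, keeping 5 old files.
size_handler = logging.handlers.RotatingFileHandler(
    os.path.join(log_dir, "size.log"), maxBytes=1024 * 1024, backupCount=5)

# Roll over at midnight, keeping 7 old files.
time_handler = logging.handlers.TimedRotatingFileHandler(
    os.path.join(log_dir, "timed.log"), when="midnight", backupCount=7)

logger = logging.getLogger("rollover_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(size_handler)
logger.addHandler(time_handler)
logger.info("written to both files")

# Detach and close the handlers so the files are released.
for handler in (size_handler, time_handler):
    logger.removeHandler(handler)
    handler.close()
```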

What I didn't know is that TimedRotatingFileHandler has to be created before the moment of rollover and be present during or after it for the rollover to occur. So if your code runs occasionally and you create your TimedRotatingFileHandler only when you need it, it's not of any help to you.

But there's a simple solution: when creating my FileHandler I just give it a log file name that includes the current date!

import logging
import os
from datetime import datetime
log_fname = "log-{0}.log".format(datetime.now().strftime("%Y-%m-%d"))
log_full_fname = os.path.join(LOG_PATH, log_fname)  # LOG_PATH: your log directory
file_handler = logging.FileHandler(log_full_fname)
It works for me since I create the handler every time I need to log something.
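
One thing to watch with this approach: if the handler is attached to the same logger on every call and never removed, each message ends up written several times. Below is a sketch of one way to wrap the idea in a function. The log_message name, the "dated_demo" logger name and the temporary LOG_PATH are assumptions made for illustration, not part of my original code.

```python
import logging
import os
import tempfile
from datetime import datetime

LOG_PATH = tempfile.mkdtemp()  # assumption: replace with your real log directory


def log_message(message, now=None):
    """Append one message to the log file named after the given (or current) date."""
    now = now or datetime.now()
    log_fname = "log-{0}.log".format(now.strftime("%Y-%m-%d"))
    log_full_fname = os.path.join(LOG_PATH, log_fname)

    logger = logging.getLogger("dated_demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    file_handler = logging.FileHandler(log_full_fname)
    logger.addHandler(file_handler)
    try:
        logger.info(message)
    finally:
        # Detach and close so repeated calls do not stack up handlers.
        logger.removeHandler(file_handler)
        file_handler.close()
    return log_full_fname


path = log_message("hello", now=datetime(2014, 12, 21, 23, 59))
print(os.path.basename(path))  # log-2014-12-21.log
```

The optional now argument is only there to make the date in the file name easy to check; in normal use you would call log_message("...") and let it pick up the current date.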