Work in progress....

Sapo Codebits 2008: First Project – irnotify

Posted: November 14th, 2008 | Author: lrei | Filed under: Programming, Python

The first project I’ve made (well, technically the second, since it’s a support lib for another project) is now available (via SVN) on Google Code – irnotify.

irnotify is a Python library for sending notifications. It currently implements notifications via XMPP (Jabber), SMTP (mail) and Twitter direct messages.

Here’s the example code:

Twitter DM

from twitternotify import TwitterNotify
n = TwitterNotify("twitter.conf")
n.notify("lrei", "How do you feel, Rei?")

XMPP message

from xmppnotify import XMPPNotify
n = XMPPNotify("xmpp.conf")
n.notify("luis.rei@gmail.com", "How do you feel, Rei?")



Noise Sample

Posted: October 1st, 2008 | Author: lrei | Filed under: Programming, Python

How Digg Works is a nice post on the Digg technology blog that gives an overview of the Digg network architecture: load balancers -> application servers (Apache, memcached, Gearman) -> database servers (MySQL) / file servers (MogileFS). Quoting:

“Digg uses Debian GNU/Linux across the board with a mixture of MySQL, Memcached, MogileFS, Python, PHP, Apache, Gearman and various appliances to serve up billions of requests a month (and more every day!)”

What’s New In Python 2.6. My highlights: the new documentation format using Sphinx (which btw looks incredibly powerful – read “complex”), the awesome with statement to replace try…finally blocks (it no longer needs the from __future__ import that Python 2.5 required), the multiprocessing package, and more.
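
As a minimal sketch of the with point above: the two helpers below write a file the old way and the new way. The helper names and the scratch file are made up for illustration, and the snippet is written for Python 3, although the with form itself also runs on 2.6.

```python
import os
import tempfile

def write_with_finally(path, text):
    # Classic form: the cleanup has to be spelled out by hand.
    f = open(path, "w")
    try:
        f.write(text)
    finally:
        f.close()

def write_with_statement(path, text):
    # The file object is a context manager, so it is closed when the
    # block exits, whether normally or through an exception.
    with open(path, "w") as f:
        f.write(text)
    return f.closed

# Hypothetical scratch file, only used for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
write_with_finally(path, "try/finally\n")
closed = write_with_statement(path, "with\n")
print(closed)
```

Both helpers leave the file closed; the second one just has less to get wrong.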

I had more stuff to put in this post but it has been in my “draft” folder for more than a week now so I’m just gonna post it like it is.


MNUM – SimpsonCsv

Posted: May 10th, 2008 | Author: lrei | Filed under: Programming, Python

This version of Simpson’s rule reads its values from a CSV file containing experimental data instead of evaluating a function.
Link to my previous implementation of Simpson’s rule.


import csv

class simpsonCsv:
    def __init__(self, filename):
        reader = csv.reader(open(filename, "rb"))

        xlist = []
        ylist = []
       
        for row in reader:
            try:
                x = float(row[0])
                y = float(row[1])
            except (IndexError, TypeError, ValueError):
                # skip headers, blank rows and malformed values
                continue
            xlist.append(x)
            ylist.append(y)
       
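        self.len = len(xlist) - 1   # number of sub-intervals in the table
        self.xstart = xlist[0]
        self.xend = xlist[self.len]
        # spacing between consecutive samples (assumes evenly spaced x values)
        self.interval = (self.xend - self.xstart) / self.len

        self.data = list(zip(xlist, ylist))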
   
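    def f(self, x):
        # Compare with a tolerance: x is computed as xstart + i*h, so it
        # rarely matches a tabulated value bit for bit.
        for (u, v) in self.data:
            if abs(u - x) <= 1e-9 * max(1.0, abs(u)):
                return v

        raise ValueError("Bad x value (probably bad n): %s" % (x))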
       
       
   
    def simpson(self, n):
        "Approximate the definite integral of f from a to b by Simpson's rule."

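        # Simpson's rule needs an even n, and the sample points must fall
        # on rows of the table, so n must also divide self.len.
        if n % 2 != 0 or self.len % n != 0:
            print "Error: n = %d must be even and divide %d." % (n, self.len)
            return -1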
       
       
        h  = float(self.xend - self.xstart)/n
       
   
        si = 0.0
        sp = 0.0
        xk = 0.0
   
        for i in range(1, n, 2):
            xk = self.xstart + i*h
            si += self.f(xk)
   
        for i in range(2, n, 2):
            xk = self.xstart + i*h
            sp += self.f(xk)
       
       
        s = 2*sp + 4*si + self.f(self.xstart) + self.f(self.xend)

        return (h/3)*s
   

filename = "integral_tabela_CO2_csv.csv"
s = simpsonCsv(filename)

print s.simpson(20)
print s.simpson(40)
print s.simpson(60)
print s.simpson(100)
print s.simpson(300)
print s.simpson(600)
print s.simpson(900)
print s.simpson(1800)
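
The class above finds each sample by comparing x values. A minimal alternative sketch, assuming the table is evenly spaced, picks rows by index instead and so never compares floats. The function name simpson_table and the sample data are made up for illustration; the snippet is written for Python 3.

```python
def simpson_table(xs, ys, n):
    """Composite Simpson's rule over tabulated, evenly spaced samples.

    xs, ys -- sample coordinates (len(xs) - 1 sub-intervals in the table)
    n      -- sub-intervals to use; must be even and divide len(xs) - 1
    """
    m = len(xs) - 1
    if n <= 0 or n % 2 != 0 or m % n != 0:
        raise ValueError("n must be even and divide %d" % m)

    step = m // n                      # table rows skipped per sub-interval
    h = (xs[-1] - xs[0]) / float(n)

    odd = sum(ys[i * step] for i in range(1, n, 2))
    even = sum(ys[i * step] for i in range(2, n, 2))
    return (h / 3.0) * (ys[0] + 4.0 * odd + 2.0 * even + ys[-1])

# Made-up data: y = x**2 sampled every 0.25 on [0, 2].
xs = [i * 0.25 for i in range(9)]
ys = [x * x for x in xs]
result = simpson_table(xs, ys, 4)
print(result)
```

Since Simpson's rule integrates polynomials up to degree three exactly, the result here should match the analytic value 8/3 up to rounding.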

MNUM – Simpson’s Rule in Python

Posted: April 11th, 2008 | Author: lrei | Filed under: Programming, Python

def simpson(f, a, b, n):
    "Approximate the definite integral of f from a to b by Simpson's rule."

    if n % 2 != 0:
        print "Ups: n must be even!"
        return -1
        
    h  = (float(b) - a)/n
    
    si = 0.0
    sp = 0.0
    
    for i in range(1, n, 2):
        xk = a + i*h
        si += f(xk)
    
    for i in range(2, n, 2):
        xk = a + i*h
        sp += f(xk)
        
        
    s = 2*sp + 4*si + f(a) + f(b)

    return (h/3)*s

def f(x):
    return x**4

ni = 50
nf = 1000000
n = ni
a = -20
b = 0
s = []
qc = []
ec = []
t = 0.0
div = 0.0
i = 0.0

while n < nf:
    t = simpson(f, a, b, n)
    s.append(t)
    n = n * 2

for i in range(0, len(s)-2):
    div = (s[i+2] - s[i+1])
    if div == 0:
        break
    
    t = (s[i+1] - s[i]) / div
    qc.append(t)
    t = div / 15
    ec.append(t)

for i in range(0, len(qc)):
    print "%.12f, %.12f, %.12f => qc=%.12f, e=%.12f" % (s[i], s[i+1], s[i+2], qc[i], ec[i])

MNUM – Gauss

Posted: April 4th, 2008 | Author: lrei | Filed under: Programming, Python

# This requires NumPy; get it from http://numpy.sf.net

from copy import deepcopy
from numpy import *

# this function, swapRows, was adapted from
# Numerical Methods in Engineering with Python, Jaan Kiusalaas
def swapRows(v,i,j):
    """Swaps rows i and j of vector or matrix [v]."""
    if v.ndim == 1:
        v[i],v[j] = v[j],v[i]
    else:
        temp = v[i].copy()
        v[i] = v[j]
        v[j] = temp

def pivoting(a, b, k):
    """Partial pivoting for step k: swaps row k with the row at or below it
    that has the largest absolute value in column k (changes a and b)."""

    n = len(b)

    p = int(argmax(abs(a[k:n, k]))) + k
    if (p != k):
        swapRows(b, k, p)
        swapRows(a, k, p)

def gauss(a, b, t=1.0e-9, verbose=False):
    """ Solves [a|b] by gauss elimination"""

    n = len(b)

    # make copies of a and b so as not to change the values in the arguments
    tempa = deepcopy(a)
    tempb = deepcopy(b)

    # check if matrix is singular
    if abs(linalg.det(tempa)) < t:
        return -1

    for k in range(0,n-1):
        # pivot on the partially eliminated column, not on the original
        # matrix, so the pivot reflects the values actually divided by
        pivoting(tempa, tempb, k)
        for i in range(k+1, n):
            if tempa[i,k] != 0.0:
                m = tempa[i,k]/tempa[k,k]
                if verbose:
                    print "m =", m
                tempa[i,k+1:n] = tempa[i,k+1:n] - m * tempa[k,k+1:n]
                tempb[i] = tempb[i] - m * tempb[k]

    # Back substitution
    for k in range(n-1,-1,-1):
        tempb[k] = (tempb[k] - dot(tempa[k,k+1:n], tempb[k+1:n]))/tempa[k,k]

    return tempb

def residue(a, b, c):
    """Calculates the residue of a system solved by gauss elimination"""
    n = len(b)

    t = a * c # t is the A with the values of x replaced (an [n x n] matrix)

    s = []
    for i in range(0, n):
        s.append(sum(t[i])) # s[i] is row i of A times x (the rebuilt right-hand side)

    res = b - s # res is the residue

    return res

#a = array([[1.0, 2.0, 0.0],[-1.0, 2.0, 3.0],[1.0, 4.0, 1.0]])
#b = array([3.0, -1.0, 4.0])
#a = array([[-1.414214, 2, 0],[1, -1.414214, 1], [0, 2, -1.414214]])
#b = array([1.0,1.0,1.0])
#a = array([[2.0, 2.0, 2.0],[1.0, 1.0, 5.0], [2.0, 5.0, 1.0]])
#b = array([6.0, 7.0, 8.0])
a = array([[1.001, 2.001, 3.001],[0.999, 2.0, 2.999], [1.002, 1.999, 2.999]])
b = array([4.003, 4.001, 3.999])

x = gauss(a, b)
print "Solution = ", x

#sol = linalg.solve(a, b)
#print "linalg Solution = ", sol

y = residue(a, b, x)
print "Residue = ", y

u = gauss(a, y)
print "Residue distribution = ", u

z = gauss(a, b+y)
print "New Solution (with added residue) = ", z

y2 = residue(a, b+y, z)
print "Residue of new solution = ", y2

if linalg.norm(y2) < linalg.norm(y):
    print "New solution has a smaller residue."
else:
    print "Original solution has a smaller residue."

Numerical Methods and Python

Posted: March 27th, 2008 | Author: lrei | Filed under: Programming, Python

I finally decided to take the Numerical Methods (MNUM) course. It turns out it’s a lot more fun than I thought. There is programming involved but you can choose whatever language you want. This is yet another nice excuse for me to use Python instead of C++ or Java. Last semester I was able to use Python to implement the game logic for Software Application Laboratory (LAS), which is mostly an OpenGL course with IPC via sockets thrown into the mix, and to write an article on dynamic languages (focusing mostly on Python) for Software Engineering (ESOF).

But back to this semester: 3 classes in, the teacher has already said something like “I’m going to learn Python now. I didn’t believe it when I heard someone saying it was the best language in the world, but now I see there might be some truth to that claim”. That, and I suspect his next laptop might be a MacBook, but that’s another story.

There are a few things that make Python great for Numerical Methods. In my opinion, Python’s clear, easy-to-understand syntax is the most important one. It makes algorithms easier to implement, and the code ends up very close to the language-neutral pseudocode found in numerical methods books. Python’s built-in datatypes, as well as those provided by libraries, can also be very useful.

The following code implements the material in chapter 2 (determining zeros) of the course. The methods implemented are Bisection, Rope (also known as false position or regula falsi) and Newton. Each function returns both the solution and the number of iterations needed to reach it.

UPDATE: forgot the book - Numerical Methods in Engineering with Python

Appendix A – mnum2.py

from math import log

def bisect(f, a, b, e):
	""" Determines zero between a and b using Bisection. """
	n = 0
	fa = f(a)
	if fa == 0.0: return (a, n)
	fb = f(b)
	if fb == 0.0: return (b, n)

	while (abs(a-b) > e):
		c = 0.5*(a+b)
		fc = f(c)

		if fc == 0.0: return (c, n)
		n = n + 1
		if fb*fc < 0.0:
			a = c
			fa = fc

		else:
			b = c
			fb = fc

	if abs(fa) < abs(fb):
		return (a, n)
	else:
		return (b, n)

def rope(f, a, b, e):
	""" Determines zero between a and b using the Rope method. """
	n = 0
	fa = f(a)
	if fa == 0.0: return (a, n)
	fb = f(b)
	if fb == 0.0: return (b, n)

	# One end of the bracket often stays fixed in this method, so stop
	# when the estimate stops moving instead of waiting for abs(a-b) <= e.
	c_old = a
	while True:
		c = (a*fb - b*fa) / (fb - fa)
		fc = f(c)
		if fc == 0.0: return (c, n)
		n = n + 1
		if abs(c - c_old) <= e: return (c, n)
		c_old = c
		if fb*fc < 0:
			a = c
			fa = fc

		else:
			b = c
			fb = fc

# Note: must verify that for the function f and guess c
#		the method will _converge_.
def newton(f, df, c, t):
	""" Determines a zero using Newton's method, starting from guess c; t is the tolerance on the step size. """
	n = 0
	fc = f(c)
	if fc == 0.0: return (c, n)

	while (True):
		fc = f(c)
		dfc = df(c)
		if dfc == 0:
			print "dfc is 0"
			return (0, -1)

		dc = -fc/dfc

		c = c + dc
		n = n + 1
		if abs(dc) < t: return (c, n)

##Tests
#def f(x): return -log(x)+4.0
#def df(x): return -1.0/x
#x= bisect(f, 1, 70, 0.00000001)
#print x
#x = rope(f, 1, 70, 0.00000001)
#print x
#x = newton(f, df, 0.1, 0.0001)
#print x
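
The note above newton says convergence has to be checked by hand. A minimal sketch of one way to guard against a bad guess is to cap the number of iterations. The name newton_capped and the max_iter parameter are made up for illustration and are not part of mnum2.py; the snippet is written for Python 3.

```python
from math import log

def newton_capped(f, df, c, t, max_iter=50):
    """Newton's method that gives up after max_iter steps.

    Returns (estimate, iterations) on success and (None, iterations) when
    the derivative vanishes or the iteration cap is reached.
    """
    for n in range(1, max_iter + 1):
        dfc = df(c)
        if dfc == 0:
            return (None, n - 1)
        dc = -f(c) / dfc
        c = c + dc
        if abs(dc) < t:
            return (c, n)
    return (None, max_iter)

# Same test function as in the commented-out tests above.
def f(x): return -log(x) + 4.0
def df(x): return -1.0 / x

root, steps = newton_capped(f, df, 0.1, 1e-10)
print(root, steps)
```

Returning None instead of a number makes a failed run hard to mistake for a solution, which the (0, -1) convention does not.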

ACM ICPC

Posted: March 25th, 2008 | Author: lrei | Filed under: Programming, Python

I just got an email (via the faculty-wide spam network) about the ACM International Collegiate Programming Contest (ICPC) or, more precisely, about SWERC 08. For a few moments I thought “hey, this might be fun”. A second later I read something like

“Languages allowed: C, C++, Java or Pascal”

Yay it’s 1995 again!!!

Pascal? Didn’t Pascal die like more than a decade ago?

C++ and Java are so 90’s… where are the cool languages? Python? Ruby? Heck even Erlang and Haskell?

Considering the “retro” theme, I was expecting Lisp and Fortran to be in there. (Not (that I have anything against (lisp))).

So basically the point of the contest is seeing how well you do with a handicap? Kinda like the Olympic 100m with a broken leg (or 2 if you’re using C)?

Sounds very interesting but I’d rather go cut myself with the Portuguese Tokio Hotel fans (via Tiago Farrajota)…


Out of Wordpress.com and Into Evernote.com

Posted: March 24th, 2008 | Author: lrei | Filed under: Programming, Python

I mentioned in a previous post that I was using a private Wordpress blog to keep my notes. Not anymore. I migrated to Evernote.

Thanks to Maria Joao Valente for sending me the invite to Evernote.

Evernote is a note organizer, similar to Journler, which I used a while back.

Check out the About Evernote and their screencast. My highlights:
* Web client
* Desktop client
* Works with Mobile Devices
* Painless, automatic synchronization (think gmail + IMAP but better)
* Notes can be found by searching and filtering for text within images
* Clip (via bookmarklet) or email entire webpages into your account
* Can import html files (you’ll see why this was important for me)

See also: Wired Review and TUAW Review.

Migrating between applications has never been an easy task. In this case I needed to migrate from a Wordpress blog to Evernote. I could have manually clicked “Clip to Evernote” for each post on that blog, written a simple AppleScript to do it, found a way to do it in Javascript, or taken advantage of the “clip” feature in some other way. But of course I chose the hardest way possible – I wrote a Python script to convert the Wordpress XML Export File to multiple HTML notes and then dragged those files into Evernote. At least it was fun, if a colossal waste of time…

Anyway here’s the python script in case you ever want to convert a wordpress blog (or more accurately a Wordpress XML Export File) to html files.

wpdepress.py


# Copyright (c) 2008 Luis Rei
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.

# Notes:
# - currently does not handle images, attachments or comments
# - was only tested on MacOS X (10.5)
# - not "carefully" developed e.g. poor exception handling, little testing, ...
# - see also http://wordpress.com/blog/2006/06/12/xml-import-export/

import string, os, sys, getopt
from xml.dom import minidom

__author__ = 'Luis Rei (luis.rei@gmail.com)'
__homepage__ = 'http://luisrei.com'
__version__ = '1.0'
__date__ = '2008/03/23'

def convert(infile, outdir, authorDirs, categoryDirs):
    """Convert Wordpress Export File to multiple html files.

    Keyword arguments:
    infile -- the location of the Wordpress Export File
    outdir -- the directory where the files will be created
    authorDirs -- if true, create different directories for each author
    categoryDirs -- if true, create directories for each category

    """

    # First we parse the XML file into a list of posts.
    # Each post is a dictionary

    dom = minidom.parse(infile)

    blog = [] # list that will contain all posts

    for node in dom.getElementsByTagName('item'):
    	post = dict()

    	post["title"] = node.getElementsByTagName('title')[0].firstChild.data
    	post["date"] = node.getElementsByTagName('pubDate')[0].firstChild.data
    	post["author"] = node.getElementsByTagName(
    	                'dc:creator')[0].firstChild.data
    	post["id"] = node.getElementsByTagName('wp:post_id')[0].firstChild.data

    	if node.getElementsByTagName('content:encoded')[0].firstChild != None:
    	    post["text"] = node.getElementsByTagName(
    	                    'content:encoded')[0].firstChild.data
    	else:
    	    post["text"] = ""

    	# wp:attachment_url could be use to download attachments

    	# Get the categories
    	tempCategories = []
    	for subnode in node.getElementsByTagName('category'):
    		 tempCategories.append(subnode.getAttribute('nicename'))
    	categories = [x for x in tempCategories if x != '']
    	post["categories"] = categories

    	# Add post to the list of all posts
    	blog.append(post)

    # Then we create the directories and HTML files from the list of posts.

    # The "base" directory
    outdir += "/wordpress/"
    if os.path.exists(outdir) == False:
        os.makedirs(outdir)
    os.chdir(outdir)

    for post in blog:
        # The "category" directories
        path = ""
        if authorDirs == True:
            path += post["author"].encode('utf-8') + "/"

        # This creates a path for the file in the format
        # category1/category2/category3/file. Note that the category list was
        # sorted.

        if categoryDirs == True:
            if (post["categories"] != None):
                path += string.join(post["categories"],"/")

        if os.path.exists(path) == False and path != "":
            os.makedirs(path)

        # And finally the file itself
        path = outdir + path
        title = post["title"].encode('utf-8')
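        # Encode the id as well: mixing a unicode id with the UTF-8 encoded
        # title would fail for titles containing non-ASCII characters.
        filename = path + "/" + post["id"].encode('utf-8') + ' - ' + title \
                    + '.html'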

        # Add a meta tag to specify charset (UTF-8) in the HTML file
        meta = """<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />"""

        f = open(filename, 'w')
        f.write(meta+"\n")

        # Add "HTML header"
        start = "\n\n\n\n\n"
        f.write(start)

        # Convert the unicode object to a string that can be written to a file
        # with the proper encoding (UTF-8)
        text = post["text"].encode('utf-8')

        # Replace simple newlines with <br /> + newline so that the HTML file
        # represents the original post more accurately
        text = text.replace("\n", "<br />\n")

        f.write(text)

        # Finalize HTML
        end = "\n\n"
        f.write(end)

        f.close()

def usage(pname):
    """Displays usage information

    keyword arguments:
    pname -- program name (e.g. obtained as argv[0])

    """

    print """python %s [-hac] [-o outdir] infile
    Converts a Wordpress Export File to multiple html files.

    Options:
        -h,--help\tDisplays this information.
        -a,--authors\tCreate different directories for each author.
        -c,--categories\tCreate directory structure from post categories.
        -o,--outdir\tSpecify a directory for the output.

    Example:
    python %s -c -o ~/TEMP ~/wordpress.2008-03-20.xml
        """ % (pname, pname)

def main(argv):
    outdir = ""
    authors = False
    categories = False

    try:
		# -a and -c are flags; only -o/--outdir takes a value
		opts, args = getopt.getopt(
		    argv[1:], "hao:c", ["help", "authors", "outdir=", "categories"])
    except getopt.GetoptError, err:
		print str(err)
		usage(argv[0])
		sys.exit(2)

    for opt, arg in opts:
		if opt in ("-h", "--help"):
			usage(argv[0])
			sys.exit()
		elif opt in ("-a", "--authors"):
			authors = True
		elif opt in ("-c", "--categories"):
		    categories = True
		elif opt in ("-o", "--outdir"):
		    outdir = arg

    infile = "".join(args)

    if infile == "":
	    print "Error: Missing Argument: missing wordpress export file."
	    usage(argv[0])
	    sys.exit(3)

    if outdir == "":
	    # Use the current directory
	    outdir = os.getcwd()

    convert(infile, outdir, authors, categories)

if __name__ == "__main__":
	main(sys.argv)
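
The script builds file names straight from post titles, so a title containing "/" would be read as a directory separator. A minimal sketch of one way to clean titles first follows. The helper safe_filename is hypothetical and not part of wpdepress.py, and the snippet is written for Python 3.

```python
import re

def safe_filename(post_id, title, ext=".html", max_len=100):
    """Build a file name from a post id and title (hypothetical helper)."""
    # Replace path separators and other characters that are awkward on
    # common file systems with a single underscore.
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]+', "_", title).strip(" ._")
    if not cleaned:
        cleaned = "untitled"
    return "%s - %s%s" % (post_id, cleaned[:max_len], ext)

# Made-up post id and title.
name = safe_filename("42", "Unicode and/or Python: a look?")
print(name)
```

The result keeps the "id - title.html" pattern used by convert while staying inside a single directory level.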

A Look at Dynamic Languages

Posted: February 20th, 2008 | Author: lrei | Filed under: Programming, Python

Since there’s (apparently) been some discussion (I missed) @ PrintScreen about what’s the best programming language for beginners I’ll leave here my opinion:

Python

I don’t feel like copy-pasting & summarizing a bunch of text so I’ll just leave the link for a (draft) paper I co-authored which looks at the issue in some detail

A Look At Dynamic Languages

Abstract

This paper aims to present an overview of Dynamic Languages in comparison with the more traditional languages, namely Java and C++. It defines the term “dynamic language” and explains what is commonly understood by it nowadays. It then lists the most common features of these languages, along with the advantages and disadvantages pointed out by their proponents and opponents. It also enumerates some of the domains in which dynamic languages have been most successful, with the reasons behind that success, as well as the domains in which there are few or no reports of success. Finally, some common examples of fanaticism surrounding dynamic languages are given.

If you’re just interested in why Python is the best choice for a first language jump to chapter 3 – Dynamic Languages in Science and Education.


Unicode and Unicode in Python

Posted: December 2nd, 2007 | Author: lrei | Filed under: Programming, Python

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

Unicode in Python
How to Use UTF-8 with Python
Unicode HOWTO

Updated: added more links