Python module for processing a file line-by-line

Note: Since writing this post, I've learned about the fileinput module, which turns most of the following into a one-liner:

import fileinput
for line in fileinput.input():
	print(line.rstrip("\n"))  # replace this with whatever you want to do to each line

It works both with input you pipe into the program and with filenames you pass as arguments.
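As a slightly fuller sketch of the same idea (written for Python 3), the loop can be wrapped in a function so the per-line work is easy to swap out. The names run and proc_line and the out parameter are my own choices for illustration here, not part of fileinput:

```python
import fileinput
import sys


def proc_line(line):
    # Illustrative processing step: uppercase the line.
    return line.upper()


def run(files=None, out=sys.stdout):
    # With files=None, fileinput reads the files named in sys.argv[1:],
    # and falls back to standard input when no filenames are given.
    with fileinput.input(files=files) as lines:
        for line in lines:
            # Each line keeps its trailing newline, so write it unchanged.
            out.write(proc_line(line))
```

Calling run() with no arguments gives the behaviour described above: piped input and filename arguments both work.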

Read on for my old post.

In this post I'll show how you can process a file line by line with some "pluggable" Python code (see code listings below):

cat some_file | <line_proc_module> <line_proc_module> ...

In the example above, a "line proc module" is any Python module that contains a function called proc_line, which takes a string and returns another string. For example:

# Example of proc_line
def proc_line(line):
	return line # Hint: Do something to the line before returning
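One detail to keep in mind when writing a proc_line: the main script in the code listings reads with sys.stdin.readline(), so each line it hands over still ends with its newline. Here is a small sketch of a module that reverses each line while keeping the line ending in place. The reversing behaviour is just an illustration I made up, not one of the modules from this post:

```python
def proc_line(line):
    # Split off the line ending so it stays at the end after reversing.
    body = line.rstrip("\r\n")
    ending = line[len(body):]
    return body[::-1] + ending
```

Without the split, the newline would end up at the front of every reversed line.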

Example: Uppercasing

In this example the user has written a "line proc module" called "uppercase" (see code listings below):

$ echo "bla bla bla" | ./ uppercase

Example: Chaining

Here the user has written two "line proc modules", and chained them together (see code listings below):

$ echo "bla bla bla" | ./ uppercase leet

First "uppercase" is applied, then "leet" is applied, to each line.
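Chaining boils down to feeding the output of one proc_line into the next. A minimal sketch of that idea, with both functions defined inline instead of living in separate modules (the chain helper is a name I picked for illustration):

```python
def uppercase(line):
    return line.upper()


def leet(line):
    return line.replace("A", "4").replace("a", "4")


def chain(funcs, line):
    # Apply each function in order, passing the result along.
    for func in funcs:
        line = func(line)
    return line


print(chain([uppercase, leet], "bla bla bla"))  # prints: BL4 BL4 BL4
```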

Code listings

# uppercase.py
def proc_line(line):
	return line.upper()

# leet.py
def proc_line(line):
	return line.replace("A", "4").replace("a", "4")

import sys

def main(argv):
	usermodules = []
	if len(argv) < 2:
		print("Usage: <name-of-module> ...")
		print("module must contain a function proc_line(line)")
		return 1
	# import every module named on the command line
	for i in range(1, len(argv)):
		try:
			usermodules.append(__import__(argv[i]))
		except ImportError:
			print('Failed to import module "%s"' % (argv[i]))
			return 1
	try:
		line = sys.stdin.readline()
		while line:
			# do something to the line and print the result
			for mod in usermodules:
				line = mod.proc_line(line)
			# line still ends with its newline, so don't add another one
			sys.stdout.write(line)
			# fetch new line
			line = sys.stdin.readline()
	except EOFError:
		return None
	return 0

if __name__ == "__main__":
	sys.exit(main(sys.argv))
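
If I were writing the loader today (Python 3), I would probably reach for importlib.import_module and check up front that each module really has a proc_line. This is only a sketch of that variation; load_processors and process are names I made up for it:

```python
import importlib


def load_processors(names):
    # Import each named module and collect its proc_line function.
    processors = []
    for name in names:
        module = importlib.import_module(name)  # raises ImportError if not found
        func = getattr(module, "proc_line", None)
        if not callable(func):
            raise AttributeError('module "%s" has no proc_line(line) function' % name)
        processors.append(func)
    return processors


def process(lines, processors):
    # Run every line through all processors, in order.
    for line in lines:
        for func in processors:
            line = func(line)
        yield line
```

With these two helpers, the main loop shrinks to writing each result of process(sys.stdin, load_processors(sys.argv[1:])) to sys.stdout.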
