Python Toys

The following files are in the public domain except where otherwise noted. THESE FILES COME WITH ABSOLUTELY NO WARRANTY.


Articles


Crappy tools

webapp.py and runapp.py
A simplistic Web application framework and its testing server.
clitrans.py
Generate a simply formatted text reference for the C# standard libraries.
sortbydate.py
Sort files by date and display the older (or newer) files first.
vnc2flv
Another screen recorder.
Pyntch
Python static code analyzer. Detects possible runtime errors before actually running the code.
PDFMiner
PDF parser and interpreter written entirely in Python. Useful for text extraction / analysis for PDF.
homesync.py
An Rsync wrapper. Useful to manually synchronize directories which are *almost* identical.
des.py
SSLeay des.pl function ported into Python.
flv2mp3.py
Extract an MP3 file from an FLV (Flash Video) file.
Logweeder
A log analyzer. It scans UN*X logs (syslog), categorizes common events, and discovers uncommon events automatically.
jacus.py
Simple Jabber client. You need jabber.py.
PyVnc2swf
Tiny screen recorder. Converts a VNC session into a Macromedia Flash file. Pygame is required.
Webstemmer
HTML layout analyzer which extracts the body and title from news sites without mixing them up with banners or ads.
textcrawler.py (now included in Webstemmer package)
A skeleton of a tiny web crawler. Gzip encoding, HTTP persistent connections, robots.txt, cookies, and multi-threading are supported.
mp3cat.py
Tiny MP3 frame editor. It can concatenate or splice MP3 frames without loss of audio quality.
unagi.py
P2P-like system monitoring tool.
s.py
A simple port forwarder using asyncore.
netmonitor.py
A network monitor for Linux that flickers a pixel at the top-left corner of the screen. Requires Python 1.5.2 or newer and Tkinter.
html2txt.py (now included in Webstemmer package)
Extract texts from HTML. It removes JavaScript and stylesheets. Each paragraph is combined into one line. Python 2.0 or newer.
pycdb.py
A Python implementation of cdb. For learning.
unimap.py
Generate a Unicode character map in EUC-JP encodings.
pyfetchmail.py
Fetch emails via POP3 and deliver them to a Maildir in your home directory. The qmail package is needed.
watcher.py
Show a new-mail notification on GNU Screen's status line. Used with pyfetchmail.py.
PyOne
Python One-liner helper.
pgrep.py
A grep for millions of parallel patterns.
makegif.py
Create a transparent gif from a bitmap. Python Imaging Library is required.
fuser.py
Display all files opened by any process. I wouldn't have written this if I had known about lsof.

Libraries

iters.py
Common iterator/generator functions.
bidi.py
Bidirectional File Object. It supports backward readline.
pdfparser.py
Dumb PDF parser. It just parses and doesn't do anything useful. (This is integrated into PDFMiner package above.)
swfparser.py
Extracts text strings from a Macromedia Flash file.
encdet.py
An experiment for automatic encoding detection.
abstfilter.py
A simple framework for pipelining between objects.
pstring.py
An extended string class whose characters remember their origin. Each character knows the file and position it came from and keeps that information even when the string is split.
regpat.py
Regular expression matcher for general sequences. The test that matches an element can be a procedure, and so can the action performed on a match. Reverse matching will be supported in the future.
texparser.py
A simple parser of LaTeX. Similar to sgmllib.
lcs.py
Calculate the Longest Common Subsequence.
sexpr.py
An S-expression parser. Convert S-exps into Python lists.
reg.py
Easy-to-use regexp wrapper. You don't need to compile regexp patterns.

Tips

forall (does the function pred return true for all elements in seq?):

def forall(pred, seq):
  return reduce(lambda r,x: r and pred(x), seq, True)

exists (is there an element in seq for which the function pred returns true?):

def exists(pred, seq):
  return reduce(lambda r,x: r or pred(x), seq, False)
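Since Python 2.5 the built-in functions all() and any() cover the same ground. They stop at the first deciding element, whereas the reduce versions above keep iterating over the rest of the sequence. A minimal sketch that keeps the same (pred, seq) signature:

```python
def forall(pred, seq):
    # all() stops at the first element for which pred is false.
    return all(pred(x) for x in seq)

def exists(pred, seq):
    # any() stops at the first element for which pred is true.
    return any(pred(x) for x in seq)

print(forall(lambda x: x > 0, [1, 2, 3]))   # True
print(exists(lambda x: x > 2, [1, 2, 3]))   # True
print(forall(lambda x: x > 0, []))          # True (vacuously)
```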

Mapping multiple arrays with enumerate, zip:

Copying one array to another.

for (i,x) in enumerate(seq1):
  if condition(x):
    seq2[i] = x

When assignment is not needed:

for (x,y) in zip(seq1, seq2):
  if condition(x,y):
    ...
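When both the index and the paired elements are needed, enumerate and zip can be combined. The sketch below is only an illustration; copy_where and its arguments are hypothetical names, not part of the files listed above.

```python
def copy_where(condition, src, dst):
    # Copy src[i] into dst[i] wherever condition(src[i], dst[i]) holds.
    # zip stops at the shorter sequence, so extra elements are left alone.
    for i, (x, y) in enumerate(zip(src, dst)):
        if condition(x, y):
            dst[i] = x
    return dst

seq1 = [5, 1, 7, 2]
seq2 = [3, 3, 3, 3]
print(copy_where(lambda x, y: x > y, seq1, seq2))  # [5, 3, 7, 3]
```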

Handling sets:

Calculate the equivalence classes of a set, given a function feqv(x, y) that tells whether x and y are equivalent:

def equivalence(feqv, objs):
  eqs = []
  for x in objs:
    eq1 = [x]
    i = 0
    while i < len(eqs):
      if exists(lambda y:feqv(x,y), eqs[i]):
        eq1.extend(eqs[i])
        del eqs[i]
      else:
        i += 1
    eqs.append(eq1)
  return eqs

print equivalence(lambda x,y:(x%5)==(y%5), [1,1,2,3,4,5,1,2,3,4,0])
# => [[1,1,1], [2,2], [3,3], [4,4], [0,5]]
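The function above needs only a pairwise test, so it compares each new element against the members of the classes found so far. When the relation can be expressed as "same key", as in the x%5 example, a dictionary can do the grouping in a single pass. This is a sketch under that assumption, using a different technique from the one above; equivalence_by_key and fkey are illustrative names.

```python
def equivalence_by_key(fkey, objs):
    # Group objects whose fkey values are equal.
    classes = {}
    order = []                      # keys in order of first appearance
    for x in objs:
        k = fkey(x)
        if k not in classes:
            classes[k] = []
            order.append(k)
        classes[k].append(x)
    return [classes[k] for k in order]

print(equivalence_by_key(lambda x: x % 5, [1,1,2,3,4,5,1,2,3,4,0]))
# => [[1, 1, 1], [2, 2], [3, 3], [4, 4], [5, 0]]
```

The classes are the same as before, but their order and the order of members inside each class can differ from the pairwise version.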

Transpose a matrix (= sequence of sequences):

Use zip with argument unpacking (apply has been deprecated since Python 2.3).

>>> zip(*[[1,2], [3,4]])
[(1, 3), (2, 4)]
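On Python 3, zip returns an iterator rather than a list, so the result has to be materialized. A small sketch (transpose is a hypothetical helper name) that also returns the rows as lists instead of tuples:

```python
def transpose(matrix):
    # zip(*matrix) pairs up the i-th elements of every row.
    # Rows of unequal length are truncated to the shortest one.
    return [list(col) for col in zip(*matrix)]

print(transpose([[1, 2], [3, 4]]))        # [[1, 3], [2, 4]]
print(transpose([[1, 2, 3], [4, 5, 6]]))  # [[1, 4], [2, 5], [3, 6]]
```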

Getting an effect similar to python -u (unbuffered standard streams) from inside a script:

import sys, os
sys.stdin = os.fdopen(0, "rb", 0)
sys.stdout = os.fdopen(1, "wb", 0)
sys.stderr = os.fdopen(2, "wb", 0)
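The snippet above produces byte-oriented streams. On Python 3, where sys.stdout is expected to accept str, one possible equivalent is to wrap the unbuffered binary stream in an io.TextIOWrapper with write_through=True. This is a sketch only; unbuffered_text and the UTF-8 default are assumptions made for the example.

```python
import io
import os

def unbuffered_text(fd, encoding="utf-8"):
    # Open the descriptor as a raw, unbuffered binary stream ...
    raw = os.fdopen(fd, "wb", 0)
    # ... and pass every write straight through the text layer.
    return io.TextIOWrapper(raw, encoding=encoding, write_through=True)

# Possible use (not executed here):
#   import sys
#   sys.stdout = unbuffered_text(1)
#   sys.stderr = unbuffered_text(2)
```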

Typical use pattern of getopt:

if __name__ == "__main__":
  import sys, getopt
  def usage():
    print "usage: foo.py [-a] [-b] [-c arg] [-d arg] [file ...]"
    sys.exit(2)
  try:
    opts, args = getopt.getopt(sys.argv[1:], "abc:d:")
  except getopt.GetoptError:
    usage()
  (opta, optb, optc, optd) = (False, False, None, None)
  for (k, v) in opts:
    if k == "-a": opta = True
    elif k == "-b": optb = True
    elif k == "-c": optc = v
    elif k == "-d": optd = v
  doit(args)
  sys.exit(0)
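The same command line can also be described with argparse (in the standard library since Python 2.7), which generates the usage message itself and exits with status 2 on a bad option. A sketch that mirrors the options above; parse_args and the attribute name files are choices made for this example.

```python
import argparse

def parse_args(argv):
    parser = argparse.ArgumentParser(prog="foo.py")
    parser.add_argument("-a", action="store_true")   # flag, default False
    parser.add_argument("-b", action="store_true")   # flag, default False
    parser.add_argument("-c", metavar="arg")         # takes a value, default None
    parser.add_argument("-d", metavar="arg")         # takes a value, default None
    parser.add_argument("files", nargs="*", metavar="file")
    return parser.parse_args(argv)

opts = parse_args(["-a", "-c", "x", "one.txt", "two.txt"])
print(opts.a, opts.b, opts.c, opts.d, opts.files)
# True False x None ['one.txt', 'two.txt']
```

In a script, pass sys.argv[1:] to parse_args.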

Last Modified: Sat Aug 14 06:59:31 UTC 2010

Yusuke Shinyama