One Does Not Simply Document Code
Or: Documenting Python code using pylint, doxygen and doxypy
First let's remember the words of an old Jedi Master: One does not simply document code.
This text will give a brief introduction to pylint, doxygen and doxypy. Their combined strength is almost as good as the triforce.
With pylint you can check your code before running it, and you are told if you miss some documentation. With doxygen you can generate documentation from the source code in a number of formats, including HTML, LaTeX, RTF and so on. And, finally, doxypy allows you to document things once. The pythonic way.
A short introduction to pylint
From the man-page: "pylint - python code static checker, is a Python source code analyzer which looks for programming errors, helps enforcing a coding standard and sniffs for some code smells [...]", and wikipedia also mentions that "[pylint] follows the style recommended by PEP 8, the Python style guide".
What's a static code analyzer?
So, pylint is a python static code checker, what the hell is that? According to our friends at wikipedia: "Static program analysis (also static code analysis or SCA) is the analysis of computer software that is performed without actually executing programs built from that software [...]. In most cases the analysis is performed on some version of the source code [...]. The term is usually applied to the analysis performed by an automated tool, with human analysis being called program understanding, program comprehension or code review." In other words, pylint reads your code and checks if it is good or bad, and boy - my code is usually really bad.
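To make "without actually executing" a bit more concrete, here is a minimal sketch of a static check, assuming Python 3 and nothing but the standard library ast module. This is not how pylint works internally; the function name find_missing_docstrings and the sample source are made up for illustration. The sample is only parsed, never run, so the raise inside it never fires.

```python
import ast

# Hypothetical sample source: it is parsed into a syntax tree, never executed.
SOURCE = '''
class AbstractFileWrapper:
    """The mother of all wrappers."""

    def write(self, text):
        """Write some text."""

    def copy(self, path):
        raise RuntimeError("never reached, the code is not run")
'''


def find_missing_docstrings(source):
    """Return sorted (line, name) pairs for classes/functions lacking a docstring."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef)):
            if ast.get_docstring(node) is None:
                missing.append((node.lineno, node.name))
    return sorted(missing)


for line, name in find_missing_docstrings(SOURCE):
    print("C: %2d:%s: Missing docstring" % (line, name))
```

A real checker does far more than this (name resolution, type inference, style rules), but the principle is the same: look at the code, don't run it.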
A minimal installation and configuration of pylint
- I just ran the canonical sudo apt-get install pylint and pressed y for hit me baby
- After installation I made a configuration file: pylint --generate-rcfile > ~/.pylintrc
- I don't want the reports, so I updated the config file: "reports=no".
- That's it, pretty simple.
Let's look at this little example. A code file with a class having four methods.
     1 """A wrapper for a file - a file-like object.
     2 """
     3
     4 class AbstractFileWrapper:
     5     """The mother of all wrapper - to use for inheritance.
     6     """
     7
     8     def __init__( self, filename = Mone ):
     9         """Constructor with optional filename
    10         """
    11         pass
    12
    13     def write( salf, text ):
    14         """Writes some text to the "file".
    15         """
    16
    17     def copy( self, path ):
    18         pass
    19
    20     def close( self ):
    21         """
    22         """
    23         pass
    24

As you can see there is no code here yet - just a skeleton for later classes to inherit from. Do you think that stops pylint from finding issues? Nope.
Running pylint from the command line is as easy as 1-2-3:
    $ pylint main.py
    ************* Module main
    E:  8:AbstractFileWrapper.__init__: Undefined variable 'Mone'
    E: 13:AbstractFileWrapper.write: Method should have "self" as first argument
    C: 17:AbstractFileWrapper.copy: Missing docstring
    C: 20:AbstractFileWrapper.close: Empty docstring

As you can see, the E's are errors and the C's are convention messages (coding-standard violations). The format of this output can be modified in the configuration file. It is often helpful to have pylint show the id of each error, convention or warning message, since pylint is sometimes incorrect and the id lets you suppress the message.
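Suppressing a message can be done right in the source with a "# pylint: disable=..." comment. Below is a minimal sketch, assuming a pylint version that accepts this inline syntax with message ids. The function close_handle is a made-up example, and W0613 (unused argument) is picked only to have something to silence.

```python
import io


def close_handle(handle, force=False):
    """Close the handle; 'force' is accepted but deliberately ignored."""
    # pylint: disable=W0613
    # The comment above silences "unused argument" inside this function only.
    handle.close()


buffer = io.StringIO(u"one does not simply document code")
close_handle(buffer, force=True)
print(buffer.closed)
```

Keeping the suppression next to the code it excuses (instead of switching the message off globally in ~/.pylintrc) means the check still works everywhere else.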
I correct the above problems and add a real class in the same file:
    27 class StringFile( AbstractFileWrapper ):
    28     """A string that acts like a file
    29     """
    30
    31     def __init__( self, filename = None ):
    32         AbstractFileWrapper.__init__( self, filename )
    33         self.text = ""
    34
    35     def write( self, text ):
    36         self.text += text
    37
    38     def copy( self, path ):
    39         filehandle = open( path, 'wt' )
    40         filehandle.write( self.text )
    41         filehandle.close()
    42
    43     def __len__( self ):
    44         return len( self.text )
    45
    46     def count_char( self, item ):
    47         """Count the number if instances of item in the file
    48         """
    49         tot = 0
    50         for i in xrange( len(self) ):
    51             tot = self.text.count( item )
    52
    53         return tot
This real class now works pretty ok:
    Python 2.7.2+ (default, Oct 4 2011, 20:03:08)
    [GCC 4.6.1] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from main import StringFile
    >>> sf = StringFile( 'dev-noll.txt' )
    >>> sf.write( 'one does not simply document code' )
    >>> len( sf )
    33
    >>> sf.copy( 'dev-mordor.txt' )
    >>> sf.close()
    >>> exit()

    $ cat dev-mordor.txt
    one does not simply document code
But pylint will still find some issues to complain about:
    $ pylint main.py
    ************* Module main
    W0612: 50:StringFile.count_char: Unused variable 'i'
Let's update that i to _ to show the reader that it is an unused variable. (If you found the possible optimization you gain karma).
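For the karma hunters, here is a sketch of both variants, assuming Python 3 (so range instead of xrange) and a trimmed-down stand-in for StringFile that only keeps what the example needs. The name count_char_loop is made up, and the loop-free version is just one reading of the hinted optimization.

```python
class StringFile(object):
    """Trimmed-down stand-in for StringFile: just text, write, len and counting."""

    def __init__(self):
        self.text = ""

    def write(self, text):
        self.text += text

    def __len__(self):
        return len(self.text)

    def count_char_loop(self, item):
        """Same logic as before, with the unused loop variable renamed to _."""
        tot = 0
        for _ in range(len(self)):  # xrange in Python 2
            tot = self.text.count(item)
        return tot

    def count_char(self, item):
        """Loop-free variant: str.count already scans the whole text once."""
        return self.text.count(item)


sf = StringFile()
sf.write("one does not simply document code")
print(sf.count_char_loop("o"), sf.count_char("o"))
```

The loop recomputes the very same count once per character and throws all but the last result away, so dropping it changes nothing except the running time.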
A short introduction to doxygen
From the man-page: "doxygen - documentation system for various programming languages" and from wikipedia: "Doxygen is a tool for writing software reference documentation [...] within code."
See more in:
- wikipedia: 
What's a tool for writing software reference documentation within code?
It is similar to the idea of docstrings that you should already be familiar with - see Python Module With Doctest or Python Doctest And Docstring. In short: comments in the code and structures in the code are used to generate documentation.
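The idea can be sketched in a few lines of Python, assuming Python 3 and the standard library inspect module. This is only a toy and has nothing to do with how doxygen is implemented; the summarize function and the docstring of write are made up for the example.

```python
import inspect


class StringFile(object):
    """A string that acts like a file"""

    def write(self, text):
        """Append text to the in-memory file."""

    def copy(self, path):
        """Clone to file by writing it to the hard drive."""


def summarize(cls):
    """Return one 'name: first docstring line' entry per documented public method."""
    entries = []
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        doc = inspect.getdoc(member)
        if doc and not name.startswith("_"):
            entries.append("%s: %s" % (name, doc.splitlines()[0]))
    return entries


print(inspect.getdoc(StringFile))
for entry in summarize(StringFile):
    print("  " + entry)
```

The structure (class and method names) and the text (docstrings) both come straight from the code, which is exactly the raw material a documentation generator works with.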
For another example that relates to doxygen, see below.
A not so minimal installation and configuration of doxygen
Installing doxygen is simple but takes space on your hard drive and requires a big download. When I typed the usual sudo apt-get install doxygen, I was recommended to also install doxygen-doc, doxygen-gui, graphviz, auctex, debhelper, perl-tk, dvidvi, fragmaster, latexmk, purifyeps, xindy, psutils, t1utils, texpower and dot2tex, so I did that:

    sudo apt-get install doxygen doxygen-doc doxygen-gui graphviz auctex debhelper perl-tk dvidvi fragmaster latexmk purifyeps xindy psutils t1utils texpower dot2tex

It required a whopping 862 MB - but relax, it's worth it.
After the installation you will need to make a configuration file per project: doxygen -g generates a good enough file called Doxyfile. I tweaked it a bit and updated some variables:
- PROJECT_NAME = "One Does Not Simply Document Code"
- PROJECT_NUMBER = 0.1
- PROJECT_BRIEF = "The dev in /dev/mordor"
To my big surprise I did not even have to mention that I wanted python files - I ran doxygen Doxyfile in the same folder as my code and it generated about 50 files in an html folder (which it creates itself) and 14 files in a latex folder. The output in index.html will look something like this when viewed in Firefox:
This is excellent - what is the problem with this?
No, it is not excellent - it is merely pretty good. I want something awesome! Something like this:
The problem is that to achieve this I need to comment the code in an unpythonic way using the doxygen-style comments you see below. Notice that the documentation is inside comments and not inside the docstring!!!
    ## Clone the file to another file.
    #
    # @param path The path needed to write the file to a file.
    # @returns Trace amounts of file on the hard drive.
    def copy( self, path ):
        """Clone to file by writing it to the hard drive.
        """
        filehandle = open( path, 'wt' )
        filehandle.write( self.text )
        filehandle.close()
Let's summarize the problems with this:
- The documentation is inside comments and not inside the docstring.
- You need to write the comments in more than one place.
- Some documentation is hidden from the excellent python help that is interactive and just awesome when coding in the python shell (see below).
    >>> dir( StringFile )
    ['__doc__', '__init__', '__len__', '__module__', 'close', 'copy', 'count_char', 'write']
    >>> help( StringFile )
    class StringFile(AbstractFileWrapper)
     |  A string that acts like a file
     |
     |  Methods defined here:
     |
     |  __init__(self, filename=None)
     |
     |  __len__(self)
     |
     |  copy(self, path)
     |      Clone to file by writing it to the hard drive.
     |
     |  count_char(self, item)
     |      Count the number if instances of item in the file
     |
     |  write(self, text)
     |
     |  ----------------------------------------------------------------------
     |  Methods inherited from AbstractFileWrapper:
     |
     |  copy(self) and the other inherited methods aside, note:
     |
     |  close(self)
     |      Close the file.
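The reason the @param lines never show up in help() is that help() is built from the __doc__ attribute, and comments never end up there. A minimal sketch, assuming Python 3 and a cut-down StringFile with only the copy method:

```python
class StringFile(object):
    """A string that acts like a file"""

    ## Clone the file to another file.
    #
    # @param path The path needed to write the file to a file.
    def copy(self, path):
        """Clone to file by writing it to the hard drive."""


# Only the docstring is stored on the function object; the comment block
# above the def is discarded when the source is compiled.
print(StringFile.copy.__doc__)
print("@param" in StringFile.copy.__doc__)
```

So whatever you put in the doxygen-style comment block is visible to doxygen only, and whatever you put in the docstring is visible to the python shell only.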
A short introduction to doxypy
The man-page says it all: Doxypy is an input filter for Doxygen. It reformats Python comments to conform to Doxygen documentation blocks. This makes it possible to use the Doxygen/Javadoc syntax inside of docstrings when writing code documentation and automatically generate API documentation out of it instead of being forced to use non-Python documentation blocks or to document code redundantly.
See more in:
- Doxypy home page 
How you want to document code:
What you would like is of course to write the documentation once, and get it in both doxygen and the python docstrings, in a way that pleases pylint. Something more or less like this:
    def copy( self, path ):
        """Clone to file by writing it to the hard drive.
        @param path The path needed to write the file to a file.
        @returns Trace amounts of file on the hard drive.
        """
        filehandle = open( path, 'wt' )
        filehandle.write( self.text )
        filehandle.close()
And using the built-in help you'd get:
    >>> help( StringFile )
    class StringFile(AbstractFileWrapper)
     |  A string that acts like a file
     |
     |  Methods defined here:
     |
     |  [...]
     |
     |  copy(self, path)
     |      Clone to file by writing it to the hard drive.
     |      @param path The path needed to write the file to a file.
     |      @returns Trace amounts of file on the hard drive.
     |
     |  [...]
The solution is doxypy
The solution is to use doxypy as an input filter to doxygen - that way there is a conversion made when running doxygen.
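To get a feeling for what such a conversion could look like, here is a toy sketch, assuming Python 3 and the standard library ast module. It is not doxypy's actual implementation; the function name docstrings_to_doxygen_comments and the sample source are made up. It copies each docstring into a "##" comment block above the class or function it belongs to, which is the form doxygen understands.

```python
import ast

# Hypothetical input: a method documented once, inside the docstring.
SAMPLE = '''class StringFile(object):
    def copy(self, path):
        """Clone to file by writing it to the hard drive.
        @param path The path needed to write the file to a file.
        @returns Trace amounts of file on the hard drive.
        """
        filehandle = open(path, "wt")
'''


def docstrings_to_doxygen_comments(source):
    """Return source with every docstring repeated as a '##' block above its def/class."""
    lines = source.splitlines()
    blocks = {}  # 0-based index of the def/class line -> comment lines to put above it
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            continue
        doc = ast.get_docstring(node)
        if not doc:
            continue
        def_line = lines[node.lineno - 1]
        indent = def_line[:len(def_line) - len(def_line.lstrip())]
        doc_lines = doc.splitlines()
        block = [indent + "## " + doc_lines[0]]
        block += [(indent + "# " + text).rstrip() for text in doc_lines[1:]]
        blocks[node.lineno - 1] = block
    out = []
    for index, line in enumerate(lines):
        out.extend(blocks.get(index, []))
        out.append(line)
    return "\n".join(out)


print(docstrings_to_doxygen_comments(SAMPLE))
```

Doxygen calls an input filter with the name of the source file as argument and reads the filtered source from the filter's standard output, so a real filter would read the file named in sys.argv[1] and print the result. The source file on disk is never changed, which is why the docstrings stay pythonic.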
First let's find out where to find doxypy:
    $ which doxypy
    /usr/bin/doxypy
And now update another variable in the doxygen configuration file typically called Doxyfile:
- INPUT_FILTER = /usr/bin/doxypy
The epic result is of course a screenshot!
This introduction taught you how to install and configure pylint, doxygen and doxypy. The installation was very simple using apt-get (on Debian GNU/Linux-based systems such as Ubuntu). The configuration was also pretty simple - for pylint and doxygen we generated configuration files that we mildly tweaked (a lot of options remain to be tested). doxypy only needed installation.
There is in fact no end to the possibilities here - the code itself is the best place to put the documentation.
But remember: One Does Not Simply Document Code.