Treating Python Functions as First Class Objects
For this Bench Press post, I wanted to discuss a relatively unused feature in Python which I found to be a big help while doing some Benchside-related coding. For many coders, the programming experience centers around C, C++, and/or Java. While these three languages are still widely used and are as prominent as ever, the transition from these “C-like” syntaxes to Python can be a bit tough. Oftentimes, programmers learn only a portion of Python’s syntax and proceed to write “Java-like” Python. As a result, their Python code can become unnecessarily long, inefficient, and un-Pythonic, mainly because the best way to write a program in Java isn’t always the best way to write it in Python.
While there are many ways to write Pythonic code, one of the key features most people typically don’t employ is the use of functions as first-class objects. To see what I mean by first-class object, let’s take a look at the following Python functions:
def sumOfSquares(a, b):
    return a**2 + b**2
def sumOfCubes(a, b):
    return a**3 + b**3
def sumOfNegatives(a, b):
    return (-a) + (-b)
While these functions will do the job, each function essentially performs the same duty, only with minor adjustments. As a programmer, you should have alarms going off, as one of the cardinal sins in programming is to write redundant code. While there are solutions to this in C, C++, or Java (such as using templates), none are as clean, in my opinion, as Python’s:
def square(a):
    return a**2
def cube(a):
    return a**3
def negative(a):
    return -a
def sum(a, b, function):
    return function(a) + function(b)
For our examples, sumOfSquares(a, b) becomes sum(a, b, square), sumOfCubes(a, b) becomes sum(a, b, cube), and sumOfNegatives(a, b) becomes sum(a, b, negative). What makes this Pythonic solution more impressive is that sum actually takes a function as an argument (in our case, square, cube, or negative)! Not only does this let sum generalize to other functions (for example, square root or double), it also means that instead of writing out sumOfSquares, sumOfCubes, etc., we simply invoke sum with the proper function.
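As a minimal runnable sketch of the idea above (assuming Python 3): the name sum_of is hypothetical, chosen here only so that Python's built-in sum is not shadowed, and math.sqrt stands in for the "square root" case mentioned in the text.

```python
import math

def square(a):
    return a ** 2

def cube(a):
    return a ** 3

def negative(a):
    return -a

def sum_of(a, b, function):
    """Apply `function` to each argument and add the two results."""
    return function(a) + function(b)

print(sum_of(3, 4, square))       # 9 + 16 = 25
print(sum_of(3, 4, cube))         # 27 + 64 = 91
print(sum_of(3, 4, negative))     # -3 + -4 = -7
print(sum_of(9, 16, math.sqrt))   # 3.0 + 4.0 = 7.0
```

Any one-argument callable works as the third argument, including functions from the standard library.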
In addition to being passed as an argument, first-class functions can also be stored within data types (like lists and dictionaries), returned from functions, and referenced by variables. If you are new to this style of programming, I recommend reading a few Python tutorials on how to use higher order functions and experimenting on your own.
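The three other uses just listed can be sketched in a few lines (assuming Python 3). The names op, operations, and make_power are hypothetical, picked only for illustration.

```python
def square(a):
    return a ** 2

def negative(a):
    return -a

# Referenced by a variable: `op` is another name for the same function object.
op = square
print(op(5))                       # 25

# Stored within a data type: a dictionary mapping labels to functions.
operations = {"square": square, "negative": negative}
print(operations["negative"](5))   # -5

# Returned from a function: make_power builds and returns a new function.
def make_power(exponent):
    def power(a):
        return a ** exponent
    return power

cube = make_power(3)
print(cube(2))                     # 8
```

Note that none of these uses put parentheses after the function name until the function is actually called; square is the object, square(5) is the call.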
Extra for experts: For those C/C++/Java wizards out there, yes, I realize C and C++ have similar functionality using function pointers, and that Java has a built-in function object interface. However, these generally require much messier implementations and aren’t usually taught in standard courses. Furthermore, you will find that Python’s support of higher order functions is much more expansive and well-documented, including built-in support for map, reduce, and filter operations.
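A brief sketch of the map, filter, and reduce operations mentioned above, assuming Python 3, where map and filter return iterators and reduce lives in the functools module. The helper names is_even and add are hypothetical.

```python
from functools import reduce

def square(a):
    return a ** 2

def is_even(n):
    return n % 2 == 0

def add(x, y):
    return x + y

numbers = [1, 2, 3, 4]

squares = list(map(square, numbers))     # apply square to every element
evens = list(filter(is_even, numbers))   # keep only elements where is_even is true
total = reduce(add, squares)             # fold the list into a single value

print(squares)   # [1, 4, 9, 16]
print(evens)     # [2, 4]
print(total)     # 30
```

Each of the three takes a function as its first argument, which is the same pattern as passing square or cube into a summing function.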
Also, some of you may feel that my example above skimped a bit, because I haven’t actually declared a sumOfSquares or sumOfCubes function. I agree: while sum(a, b, square) is equivalent to calling sumOfSquares(a, b), to actually declare a sumOfSquares function object I’ll need to introduce lambda. In short:
sumOfSquares = lambda a, b: sum(a, b, square)
sumOfCubes = lambda a, b: sum(a, b, cube)
Lambda creates objects known as “anonymous functions,” which can be assigned to a variable name like sumOfSquares. Called with the same arguments, our new version of sumOfSquares returns the same results as the one that we first defined at the top.
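To check that equivalence rather than take it on faith, here is a small sketch (assuming Python 3) that compares a def-based and a lambda-based version over a range of inputs. The names sum_of, sum_of_squares_def, and sum_of_squares_lambda are hypothetical.

```python
def square(a):
    return a ** 2

def sum_of(a, b, function):
    return function(a) + function(b)

# The version written out by hand, as at the top of the post.
def sum_of_squares_def(a, b):
    return a ** 2 + b ** 2

# The version built from the general function with lambda.
sum_of_squares_lambda = lambda a, b: sum_of(a, b, square)

# Both return the same value for every pair of inputs tried here.
for a in range(-3, 4):
    for b in range(-3, 4):
        assert sum_of_squares_def(a, b) == sum_of_squares_lambda(a, b)

# One visible difference: a lambda has no name of its own.
print(sum_of_squares_def.__name__)      # sum_of_squares_def
print(sum_of_squares_lambda.__name__)   # <lambda>
```

The two behave the same when called; the only difference shown here is the function's recorded name, which mostly matters when reading tracebacks.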
Build from source in Windows
While Bench Press is a big fan of open source, we realize that it can be intimidating for the lay-scientist (or layperson for that matter) to build code from an open source repository when asking a question might quickly get the asker labeled a n00b and not taken seriously. This problem is especially relevant to us Windows users who don’t have ready access to UNIX-style command line-fu and are dependent on kind open source community members to create Windows-specific installers.
This weekend, while working on integrating the open source database SQLite into some code I was writing for Benchside, I realized that the only way to integrate SQLite’s Full Text Search capability was to recompile it from source code, something I had never done before. Since I run Windows, I wasn’t able to use the UNIX command line instructions from Michael Trier’s post on how to integrate SQLite’s Full Text Search capability into a Python program.
A few hours of research and trial and error later, I finally worked out how to do it on Windows in a way which should generalize to building other open source projects out there. Hopefully I can lay the steps out so that other “n00bs” out there no longer need to feel left out when someone gives them some source code:
- Install MinGW and MSYS – MinGW is the Minimalist GNU for Windows software pack which gives Windows computers the development tools (like gcc and make) that users of UNIX-style operating systems (Linux, Mac) take for granted. MSYS provides a command line interface which emulates the UNIX command line, at least as far as is needed to build from source code. Both packages are open source and have Windows installers for download at their respective pages (to keep things simple). These installers are probably a sub-version or two behind (as the updated tools need to be packaged together in Windows installers), but with MinGW and MSYS, it should be relatively easy to download and build the newer tools from source! In any event, install MinGW first; afterwards, when you install MSYS, it will ask where you placed the MinGW install and link directly to those compiler tools.
- (optional step) Install GnuWin32 – GnuWin32 is, like MinGW, a set of tools which emulate many of the other commands that UNIX-style operating systems have. The primary use for these would be other command line tricks which the UNIX guys have access to (e.g. downloading and de-compressing source code files from the command line). This is unnecessary, as most commercial compression software (or open source software like 7zip) can handle most of the de-compressing that you need.
- (optional step) Install the Windows Open Command Window Here PowerToy, which lets you right-click on a folder and open a command line window right there. This often makes things more convenient.
- Add MinGW and MSYS to your system path – Now that you’ve installed MinGW and MSYS (and possibly GnuWin32), you need to make sure that your tools can be accessed from anywhere by command line and not just in the folders that you installed them to. To do this, pull up the “System” panel in your Control Panel. You can do this by clicking on “Start”, “Run”, typing in “control sysdm.cpl” and hitting [ENTER]. A window should pop up called “System Properties”. Click on the “Advanced” tab and then click on the “Environment Variables” button. Another window should pop up. In the System variables panel (the one on the bottom, see image below), select the row that has “Path” in the Variable column and click on the “Edit” button. This should bring up a text-editing dialog box where you should add the full directory paths to the MinGW bin and MSYS 1.0 folders, separated by semicolons. In my case, for instance, the paths to these folders were “C:\MinGW\bin” and “C:\msys\1.0”, and so I added “C:\MinGW\bin;C:\msys\1.0;” to the end of the Variable Value text box.
- Download and de-compress the source code – This usually comes in the form of a gzipped tarball file (*.tar.gz) which you can unpack with software like 7zip or at the command line (if you have the appropriate tools installed) using the gzip and tar commands.
- Run MSYS and use the cd command to go to the directory where you’ve de-compressed the source code – You should be able to do this, assuming you added MSYS to your path properly, from anywhere in the command line by just typing msys and hitting [ENTER]. Use the cd command (syntax: “cd <name of directory>”) to move to the directory where you unzipped the source code. In my case it was: “cd C:/sqlite3/sqlite-3.6.22/”
- Set any compiler flags that need to be set and run ./configure – For SQLite, I had to set the -DSQLITE_ENABLE_FTS3 and -DSQLITE_ENABLE_FTS3_PARENTHESIS flags before compiling. To do this, I simply typed in: CFLAGS="-DSQLITE_ENABLE_FTS3=1 -DSQLITE_ENABLE_FTS3_PARENTHESIS=1" ./configure and hit [ENTER]. This does two things. First, it sets the environment variable CFLAGS, which is passed along to the compiler to set any preprocessor flags it needs before building (-DXXX=Y sets the preprocessor variable XXX to the value Y). Second, it calls ./configure, which runs a quick diagnostic to see if your system has all the development tools it needs to compile. If an error pops up, this might be a sign that your MinGW installation is incomplete or that you did not set the system path correctly.
- Follow any instructions in the README – Source code packs usually come with a README text file which gives instructions for how to build the software package. You should definitely read those (as they may also tell you which compiler flags you might want to set in the previous step). These usually end with you entering the instruction “make” (and then [ENTER]) and subsequently “make install” (and then [ENTER]) to make sure that the compiled code is installed in its proper location on the system.
- Post-setup – Depending on the software, there may be further configuration/setup steps that need to take place. In the case of my custom SQLite build, I had to copy the Windows DLL file (libsqlite3-0.dll) from the ./lib folder that was created by the build process into my Python installation’s DLLs folder and rename it sqlite3.dll to replace my old built-in Python-SQLite setup.
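Two of the steps above lend themselves to a quick sanity check from Python itself: whether the build tools are reachable through the system path, and whether the SQLite library that Python ends up loading really has Full Text Search compiled in. The following is only a sketch, assuming Python 3; the function names missing_tools and fts3_available are hypothetical, and the FTS3 probe simply tries to create a throwaway virtual table in an in-memory database.

```python
import shutil
import sqlite3

def missing_tools(tools=("gcc", "make")):
    """Return the names in `tools` that cannot be found on the system path."""
    return [tool for tool in tools if shutil.which(tool) is None]

def fts3_available():
    """Return True if the SQLite library Python loaded can create FTS3 tables."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE probe USING fts3(body)")
        return True
    except sqlite3.OperationalError:
        # Raised when the fts3 module was not compiled into the library.
        return False
    finally:
        conn.close()

if __name__ == "__main__":
    print("Missing build tools:", missing_tools() or "none")
    print("SQLite library version:", sqlite3.sqlite_version)
    print("FTS3 available:", fts3_available())
```

Running this before the build shows whether gcc and make are visible on the path; running it again after replacing sqlite3.dll shows which SQLite version Python picked up and whether the FTS3 flag took effect.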
The process above is probably not 100% fool-proof and skims over some details which may be important for different source code types, but hopefully my painfully self-taught lessons in building from source in Windows-land will be helpful to those of you out there who find yourselves needing to build something by compiling from source code.
