Wednesday, August 22, 2012

Computing Time in a Method Call


I was trying to do some comparisons on how fast a particular algorithm was calculating Checksum values. In my case, I have Adler32, CRC32, and PureCrc32 (Apache Hadoop) to check. This is my particular case, but in general this question comes up for a number of cases; performance optimization being one of them. Since we are calculating the Checksum on files, the problem is complicated by I/O considerations. File caching, and operating system constraints (resources) being the big ones. So how do most people do it?

The Ugly Truth

The general methodology that most developers use is to wrap the method call of interest with start and end time variables, and calculate the difference. They may repeat the process a couple of times, and average it. If they are fancy, they may randomly execute the method to provide some variance. This pseudo randomness gives the feel of more objective results.

Does it work?

The truth is about precision. This will give you a rough time frame. Perhaps that is all you need. My college mathematics profession, Dr. Alden Monberg, always talked about scale and precision. A tolerance of ± 25mm may work on a ship section weighing a 50 MT., but the same value on a pacemaker, or heart value would be unacceptable. A precision of ± 10nm would work for the heart value, but would be impossible for the ship.

Getting nanometer precision (or nanosecond in our case)

The results from the calculations above will give "thumb in the air" results, but is too imprecise to be useful when you may be calculating 10,000 file, or 10,000,000 files. The overhead of the method call is lost in the I/O, but it still there. The greater the number of files, the more optimized the algorithm, the less time it consumes. We can't control the I/O per se, so lets control our algorithm.

Take a different approach

We still want to do the calculation, but we want to do it on the void update(byte[] b, int off, int len) method which is the root of our algorithm being called. I decided to try to use AOP, and try to wrap the method invocation. I know there are some folks who use Spring, but I do not. I wanted to use an AOP framework directly instead of using it wrapped in Spring. I tried a number of frameworks to see which one I thought would be the least invasive and easiest to use.
After trying all of them, I found that Guice was the easiest to use. I was not able to use it directly out of the box though. The "preferred" method for Guice is to annotate the classes you want to test. In my case, I can not annotate the JDK, nor Hadoop classes. So I had to use another methodology which I will detail.

How do you do it?

I had not originally considered Guice until I found this blog post: Guice Tutorial – part 2: method interception which details performing method interception. This was the jackpot find that got me on track. The example code on that tutorial has the methodology which I used. However, it falls a little short because Guice does not have method matcher which matches on the method name. A little more searching came up with an issue/enhancement posted in Issue tracker for Guice. Issue #164 method name matcher enhancement. The issue is closed as a WONTFIX, but the submitter (Aaron Whiteside) proposed some code to do it. I modified the code slightly, but it was a near perfect fit. Google Groups has an explanation why Bob Lee thinks it is a bad idea: Guice/AOP - intercepting a method thanks to its name. I disagree, but that is his choice. The code Aaron mostly works, and my code below does work. So be it. OK, so we now have most of the pieces in place. The cool thing about this method is that we can use it anywhere. I am going to use it on a JUnit test like the tutorial in my code, but it is very general. So here is what I needed:
  1. A MethodInterceptor implementation
  2. A Matcher implementation
  3. An AbstractModule implementation
  4. A JUnit test



However one test does not tell the whole story. When I actually ran this as a more general performance calculation. The results were on curves. On small files like the 10MB one in my test, the Adler32 was fastest. The PureCrc32 code performed the best as the file sizes got larger. Please don't accept the output here as a performance test without considering other factors. It is cool though that we got more precise results rather than brute force attempting it.