Ben J. Christensen

Software Development and Other Random Stuff

Dynamic Stacked Bar Chart Using d3.js

A prototype of a stacked bar chart that can dynamically add/remove bars and update the data for each bar implemented using d3.js.

It represents data freshness (time since update) per bar using an opacity decay, so a bar fades away if it doesn’t receive fresh data.

The examples I found elsewhere represent static data, so my model of implementation is different in that I don’t use data binding or d3.layout.stack() because I couldn’t figure out how to make those work with dynamic data (if someone can show me a better way, I’ll gladly accept the guidance). Thus, my implementation directly adds/removes the bars and determines the bar widths and x-position itself.
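As a rough illustration of the two calculations described above (freshness shown as opacity, and computing each bar's x-position and width by hand), here is a minimal sketch written in Java rather than d3.js. The linear fade, the `fadeMillis` window, the fixed padding and every name in it are assumptions made for illustration; the prototype's actual code is in the gist.

```java
// Hypothetical helper: the names, the linear fade and the padding are assumptions.
final class BarMath {
    private BarMath() {}

    // Opacity falls linearly from 1.0 (just updated) to 0.0 once fadeMillis have passed.
    static double opacity(long nowMillis, long lastUpdateMillis, long fadeMillis) {
        long age = Math.max(0L, nowMillis - lastUpdateMillis);
        return Math.max(0.0, 1.0 - (double) age / fadeMillis);
    }

    // Width of each bar when barCount bars share chartWidth with `padding` pixels between them.
    static double barWidth(double chartWidth, int barCount, double padding) {
        return (chartWidth - padding * (barCount - 1)) / barCount;
    }

    // Left edge of the bar at `index` (0-based).
    static double barX(int index, double chartWidth, int barCount, double padding) {
        return index * (barWidth(chartWidth, barCount, padding) + padding);
    }
}
```

With this approach, adding or removing a bar amounts to recomputing `barX` and `barWidth` for every remaining bar using the new `barCount`.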

The use case I intend to apply this prototype for is to visualize a stream of realtime data.

Functionality not implemented in this prototype includes hover and click actions to show details of the data a bar represents.

Here are links to the code and working example:

http://bl.ocks.org/1488375
https://gist.github.com/1488375

Filed under: Code, User Interface

Animated Circle Using d3.js

While working on visualizing data (application traffic) in realtime I used circles with size representing volume and color representing health.

Here are the basic examples I used as building blocks: circles of varying sizes and colors, animated so that both change dynamically.

Here are links to the code and working example:

http://bl.ocks.org/1473535
https://gist.github.com/1473535

Filed under: Code, User Interface

Making the Netflix API More Resilient

A new Netflix Tech Blog post by my manager (Ben Schmaus) discusses how we’ve been making the Netflix API more resilient through the use of circuit breakers, bounded thread-pools and realtime decision making:

Here are some of the key principles that informed our thinking as we set out to make the API more resilient.

  1. A failure in a service dependency should not break the user experience for members
  2. The API should automatically take corrective action when one of its service dependencies fails
  3. The API should be able to show us what’s happening right now, in addition to what was happening 15-30 minutes ago, yesterday, last week, etc.
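To make the first two principles concrete, here is a minimal circuit-breaker sketch in Java. It is not the Netflix API's implementation; the class name, the consecutive-failure threshold and the fixed retry delay are all assumptions chosen to keep the example short.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical, simplified circuit breaker (not the Netflix implementation).
final class SimpleCircuitBreaker {
    private final int failureThreshold;   // consecutive failures before the circuit opens
    private final long retryAfterMillis;  // how long to short-circuit before trying again
    private final AtomicInteger consecutiveFailures = new AtomicInteger();
    private final AtomicLong openedAtMillis = new AtomicLong(-1L); // -1 means closed

    SimpleCircuitBreaker(int failureThreshold, long retryAfterMillis) {
        this.failureThreshold = failureThreshold;
        this.retryAfterMillis = retryAfterMillis;
    }

    boolean isOpen() {
        long openedAt = openedAtMillis.get();
        return openedAt >= 0 && System.currentTimeMillis() - openedAt < retryAfterMillis;
    }

    // Runs the dependency call, or returns the fallback when the call fails or the circuit is open.
    <T> T call(Supplier<T> dependency, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // corrective action: skip the failing dependency
        }
        try {
            T result = dependency.get();
            consecutiveFailures.set(0);
            openedAtMillis.set(-1L);
            return result;
        } catch (RuntimeException e) {
            if (consecutiveFailures.incrementAndGet() >= failureThreshold) {
                openedAtMillis.set(System.currentTimeMillis());
            }
            return fallback.get(); // the caller still gets a response
        }
    }
}
```

The fallback is what keeps a dependency failure from breaking the user experience, and the open state is the automatic corrective action: once the threshold is hit, calls stop reaching the failing dependency until the retry delay has passed.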

A video showing the realtime monitoring dashboard is on Vimeo:

Filed under: Code, Production

JUnit Tests as Inner Classes

For several years on multiple Java projects I have written my unit tests as inner classes of the class they are testing. I have never liked or bought into the idea of putting unit tests in a separate class in a separate source folder. I am fine with and like having functional and/or system tests off in that separate ./test/ source folder – just not the unit tests.

Following are my reasons why I find it so much more productive and beneficial to write the unit tests as inner classes.

Note on terminology: I will use “concrete class” to represent the class that needs to be tested.

Low Friction

Unit testing should not feel like a burden; if it does, many will likely not do it. Sure, developers often agree on the surface that writing unit tests is “the right thing to do”, but in practice most developers don’t do it. There are many reasons, but I believe one of them is that the friction for doing it is too high in most cases. Partly this is because unit tests off in another source folder have no context to the concrete class being worked on and rely upon human memory and tedious process to find, manage and keep in sync with the concrete class through the development cycle – especially when it’s someone other than the original developer of the class and tests.

Putting the unit tests in an inner class greatly reduces the friction of writing and maintaining unit tests for the following reasons:

  • I only have to deal with a single class in my file/package/class navigator instead of two
  • I don’t have to open and manage 2 editor windows/tabs for every class I want to edit (this is a big deal, especially when most developers have dozens of classes open at any given time)
  • When I use keyboard shortcuts to open that single class file, the unit tests are right there with it; I don’t have to do twice the work of opening first the concrete file and then the test file
  • I don’t have to try and maintain the naming convention of 2 separate files, especially through refactorings (more on this below)
  • I can immediately execute the unit tests for the class I’m viewing without going off and searching for another class
  • All developers see the unit tests when they open the class; they don’t need to remember to go looking in case there happens to be one (especially in code bases where most classes don’t have tests, so developers will almost certainly never bother to go looking after failing to find any the first several times)
  • Maintainability is improved because the non-original developers who open a class see the tests and are thereby reminded and encouraged to use and add to them as they edit the class

In short, unit tests as inner classes are easy to find, run, work on and maintain. This in turn encourages adoption and maintenance of them.

Context

Another benefit is that the unit tests as an inner class are very contextual to what they are testing. Many of the points of the previous section related to “low friction” are due to the context an inner class has with the concrete class. The tests are “in your face” and can’t be missed. They are obviously intended for testing the class currently being viewed and immediately prompt the developer to use them as part of their development process.

This in turn enforces them being “unit tests” and not becoming system tests by developers starting to test multiple concrete classes from a single test class just because it’s easier to keep adding tests to an existing test class than to create a new test class for every concrete class. Tests in a separate source folder have a very loose relationship to the concrete class, thus it’s very easy for the context to be lost and have it start testing interaction between classes, rather than only unit testing the class it was originally intended to test.

The mental model of unit tests in an inner class is very clear – these tests are for this class and only this class.

Encapsulation

Unit testing should not result in the weakening of encapsulation. This easily starts to happen when the tests are in a separate class and trying to gain full access to a non-trivial concrete class to set up mocks and perform assertions.

Methods and constructors start being made public or package accessible that should have remained private just so that hooks can be provided for the test class.

Arguments are plentiful about whether private methods and inner classes should be unit tested, and in many cases the arguments are valid that only public methods should need to be tested and they will internally exercise the private members.

Unfortunately there are plenty of use cases I have come across where this ideal is not a reality on non-trivial classes due to a desire to keep things encapsulated and hide implementation details.

Here are 2 examples:

  • Example of ‘wanting’ to test privates: lots of internal logic in private methods where building the class via test-driven-development (TDD) is easier by testing the private methods as you go (like building blocks with simple progressive tests), rather than trying to write all the code then test only the public methods at the end. (Yes, it can theoretically all be done via TDD by only going via the public method, and yes I understand the theory of it. In practice however I have found it beneficial to have some types of private methods tested so they are covered as “building blocks” rather than relying on the top level public method test failing and digging into what internal private method failed.)
  • Example of ‘needing’ to test privates: an inner class which runs a daemon thread to perform background cache refreshes. This is something I want fully encapsulated and not exposed in any way, but I need to test that it correctly runs, does what it’s supposed to do and mock it out for other unit tests.

Specifically on the second example of an inner class and background thread, these could easily be made testable by an external class by exposing things via publics or package private methods or variables, or even by pulling the inner class for the thread into a separate class – but all of those break the encapsulation I was striving for. I do not want the package structure or javadocs to know anything about the implementation details. I want it all private, not package private and certainly not public.

Thus, the only way to get access without breaking encapsulation and good object design is to put the tests inside the concrete class.

Unit tests as inner classes allow for testing without opening member variables or methods to package or public access which then leak the implementation details.
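A minimal sketch of the pattern follows. The `Cache` class and its private `normalizeKey` method are hypothetical, and the JUnit `@Test` annotation is left as a comment so the snippet compiles without any dependencies; in a real project the test methods would be annotated and run by JUnit.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical concrete class: only put() and get() are part of its public surface.
class Cache {
    private final Map<String, String> entries = new HashMap<>();

    public void put(String key, String value) {
        entries.put(normalizeKey(key), value);
    }

    public String get(String key) {
        return entries.get(normalizeKey(key));
    }

    // Implementation detail that stays private.
    private static String normalizeKey(String key) {
        return key.trim().toLowerCase(Locale.ROOT);
    }

    // Compiles to Cache$UnitTest.class.
    public static class UnitTest {
        // @Test  (org.junit.Test; omitted here to keep the sketch dependency-free)
        public void normalizeKeyTrimsAndLowercases() {
            // The nested class can call the private method directly.
            if (!"abc".equals(normalizeKey("  ABC "))) {
                throw new AssertionError("normalizeKey should trim and lowercase");
            }
        }

        // @Test
        public void getFindsValueRegardlessOfCase() {
            Cache cache = new Cache();
            cache.put("Key", "value");
            if (!"value".equals(cache.get("KEY"))) {
                throw new AssertionError("lookup should ignore case");
            }
        }
    }
}
```

Nothing about `normalizeKey` had to be widened to package or public access for the first test to reach it.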

Refactoring

If and when the concrete class has its name or package refactored, the unit tests as an inner class just go along for the ride.

No one needs to remember to go find the associated unit tests and also rename them.

This is particularly important in large codebases maintained by many developers where the person working on the code likely is not the one who originally wrote it.

Otherwise, the unit tests become orphans, in the wrong package with the wrong name. Yes, the tests likely will still work (unless they depend on package access, in which case a compilation error would have flagged it) but they are now less maintainable than ever since the naming convention that was holding them together is gone and nobody will know to go looking for them in future edits to the concrete class.

In short, when unit tests are done as an inner class, everything is fully contained and goes along for the ride regardless of where the concrete class goes or how it’s named.

Self-documenting

When a class has all of its tests as an inner class they act as built-in documentation of what the class is supposed to do, regardless of whether the person looking for the code knows or cares to go looking for unit tests.

They can’t be missed – they are staring the developer in the face at the bottom of the class and in the outline as “UnitTest” with a bunch of methods declaring the functionality that is expected.

Arguments Against This Approach

Unfortunately the use of inner classes for unit testing is not more common so I sometimes get opposition when I work with new teams and they see my tests as inner classes.

Here are the common questions/concerns and my perspective on them:

What About Shipping Test Code to Production?

The small amount of bytecode that gets shipped is negligible compared to the third-party JARs in most deployments, so it hardly dents the size of the WAR files being shipped around. And since the classes are never referenced or invoked in production, they are never loaded by the class loaders and thus never take up PermGen space.

And if there is a philosophical or actual real reason to not ship test code they can simply be stripped by a build process since they all compile to $UnitTest.class (if that naming convention is used, which I follow and recommend) and can then be easily filtered before building the JAR files.
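As a sketch of what that filter could look like, the following Java helper decides which compiled class files to keep. The `$UnitTest` naming convention comes from the text above; the class and method names are hypothetical, and in practice the same rule would normally be expressed as an exclude pattern in the build tool.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical build-step helper: drops compiled test classes by name.
final class TestClassFilter {
    private TestClassFilter() {}

    // True for Foo$UnitTest.class and for classes nested inside it, such as Foo$UnitTest$1.class.
    static boolean isUnitTestClass(String fileName) {
        return fileName.endsWith(".class")
                && (fileName.contains("$UnitTest.") || fileName.contains("$UnitTest$"));
    }

    // Returns the class files that should go into the production JAR, in their original order.
    static List<String> productionClasses(List<String> classFiles) {
        List<String> kept = new ArrayList<>();
        for (String file : classFiles) {
            if (!isUnitTestClass(file)) {
                kept.add(file);
            }
        }
        return kept;
    }
}
```

Matching `$UnitTest$` as well matters because anonymous or nested classes declared inside the test class compile to their own files.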

Apache Doesn’t Do It This Way

Or otherwise said: ‘Why would you put test classes there!?’.

Apache/Maven advocates having a ./src/main and ./src/test folder and they have enough clout in the industry to have made it the most known place of putting test classes.

Just because it works for them doesn’t mean it’s the best way of doing it and in practice I have found it to be detrimental. I agree that in a theoretical “standard directory layout” it makes sense, but in practice the lack of context, increased friction, maintainability issues and impacts on encapsulation make it less-than-ideal for the developers writing the tests.

Summary

Inner classes are a great home for unit tests when writing Java.

This pattern reduces friction of both writing and maintaining unit tests which in turn increases code coverage, speeds up development, improves maintainability, increases velocity and enables adopting practices such as continuous deployment.

 

Update (Jan 17 2012):

One drawback of unit tests as inner classes is that they show up in Javadocs. The solution is to filter them out using a custom ‘Doclet’ implementation.

Example can be found here: https://gist.github.com/1410681

Filed under: Code, Tools

AtomicCircularArray Wins My Concurrency Throughput Test

I recently made several implementations of a RollingNumber (allowing a sum to be calculated over a 10 second period with 1 second increments with continual updates) to determine what would perform best under high-concurrency.

In my case, concurrency means 4,000-8,000 writes/second from hundreds of threads on an 8-core machine and only a handful of reads per second.

In the end, a circular-array (using AtomicReferenceArray internally) won my throughput tests and also achieved my side goal of being non-blocking.

The code is on GitHub and the winning implementation is RollingNumberViaTryLockWithAtomicCircularArray.

Note that this code is NOT what I would deploy in production, since I forced each implementation to conform to the same interface so they could all be run by my test harness.

However, a modified version of RollingNumberViaTryLockWithAtomicCircularArray is being run in production processing billions of writes per day.

Another aspect of the test is the use of ReentrantLock.tryLock() instead of a synchronized block. The tryLock() works well in this use-case, allowing only 1 thread to get the lock and routing all others around it instead of blocking them.
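To show how the two ideas fit together, here is a simplified sketch of a rolling sum backed by an `AtomicReferenceArray` and rolled over with `tryLock()`. It is not the `RollingNumberViaTryLockWithAtomicCircularArray` code from the repository. The class name `RollingSum`, passing the timestamp in explicitly (so the example is deterministic), and sending threads that lose the `tryLock()` race to the previous bucket are all assumptions. It also assumes timestamps never go backwards.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified rolling sum (not the implementation from the repository).
final class RollingSum {
    private static final class Bucket {
        final long windowStart;                   // start of the interval this bucket covers
        final AtomicLong sum = new AtomicLong();
        Bucket(long windowStart) { this.windowStart = windowStart; }
    }

    private final int bucketCount;
    private final long bucketMillis;
    private final AtomicReferenceArray<Bucket> buckets;
    private final ReentrantLock rollLock = new ReentrantLock();

    RollingSum(int bucketCount, long bucketMillis) {
        this.bucketCount = bucketCount;
        this.bucketMillis = bucketMillis;
        this.buckets = new AtomicReferenceArray<>(bucketCount);
    }

    void add(long nowMillis, long value) {
        currentBucket(nowMillis).sum.addAndGet(value);
    }

    private Bucket currentBucket(long nowMillis) {
        long windowStart = nowMillis - (nowMillis % bucketMillis);
        int index = (int) ((nowMillis / bucketMillis) % bucketCount);
        while (true) {
            Bucket bucket = buckets.get(index);
            if (bucket != null && bucket.windowStart == windowStart) {
                return bucket; // common case: the bucket for this interval already exists
            }
            if (rollLock.tryLock()) { // only one thread rolls the window forward
                try {
                    bucket = buckets.get(index); // re-check now that we hold the lock
                    if (bucket == null || bucket.windowStart != windowStart) {
                        bucket = new Bucket(windowStart);
                        buckets.set(index, bucket); // overwrite the expired slot
                    }
                    return bucket;
                } finally {
                    rollLock.unlock();
                }
            }
            // Lost the race: route around the lock by using the previous bucket if it is still live.
            Bucket previous = buckets.get((index + bucketCount - 1) % bucketCount);
            if (previous != null && previous.windowStart == windowStart - bucketMillis) {
                return previous;
            }
            Thread.yield(); // nothing to fall back to yet; retry without waiting on the lock
        }
    }

    // Sum over the buckets that are still inside the rolling window ending at nowMillis.
    long sum(long nowMillis) {
        long oldestLive = nowMillis - (nowMillis % bucketMillis) - (bucketCount - 1) * bucketMillis;
        long total = 0;
        for (int i = 0; i < bucketCount; i++) {
            Bucket bucket = buckets.get(i);
            if (bucket != null && bucket.windowStart >= oldestLive) {
                total += bucket.sum.get();
            }
        }
        return total;
    }
}
```

With `new RollingSum(10, 1000)` this gives a sum over 10 seconds in 1 second increments. The trade-off in this sketch is that a write arriving exactly while the window rolls may be counted in the previous second's bucket, in exchange for never parking a writer on the lock.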

Test Results

Here are charts (higher is better) showing the differences in performance between implementations on two different machines:

MacBook Pro 2.2GHz Intel Core i7, 8GB Memory, SSD Drive, OSX Lion 10.7.1

$ uname -a
Darwin 11.1.0 Darwin Kernel Version 11.1.0: Tue Jul 26 16:07:11 PDT 2011; root:xnu-1699.22.81~1/RELEASE_X86_64 x86_64

$ java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03-383-11A511)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02-383, mixed mode)

The winner is the blue bar which is the highest.

Amazon EC2 m2.4xlarge
High-Memory Quadruple Extra Large Instance 68.4 GB of memory, 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each), 1690 GB of local instance storage, 64-bit platform

$ uname -a
Linux 2.6.21.7-2.ec2.v1.3.fc8xen #1 SMP Sat Sep 25 01:16:50 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

$ /usr/java/latest/bin/java -version
java version "1.6.0_25"
Java(TM) SE Runtime Environment (build 1.6.0_25-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode)

The winner is the orange bar which is the highest.

Running the Test

You can run the test using an executable JAR with the command:

java -Xmx1g -jar RollingNumberThroughputTest.jar

The test also validates that the code does what it should and is able to find non-thread-safe implementations by looking for incorrect math caused by concurrency bugs.
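One way to picture that validation is to hammer a counter from many threads and compare the final total with the arithmetic answer. The sketch below is hypothetical and much smaller than the actual test harness; it uses `AtomicLong` as a stand-in for the implementation under test.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical check: all names here are illustrative, not taken from the test harness.
final class ConcurrencyCheck {
    private ConcurrencyCheck() {}

    // Returns true when the counter saw every write, i.e. no updates were lost.
    static boolean totalsMatch(int threads, int writesPerThread,
                               Runnable increment, LongSupplier total) throws InterruptedException {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    start.await(); // release all threads together to maximize contention
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < writesPerThread; i++) {
                    increment.run();
                }
            });
            workers[t].start();
        }
        start.countDown();
        for (Thread worker : workers) {
            worker.join();
        }
        return total.getAsLong() == (long) threads * writesPerThread;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicLong safe = new AtomicLong();
        System.out.println(totalsMatch(8, 100_000, safe::incrementAndGet, safe::get));
    }
}
```

A non-thread-safe counter (for example a plain `long` field incremented with `++`) will often, though not always, come up short under the same load, which is the kind of incorrect math the test looks for.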

Conclusion

I found it very interesting how different the results were on the 2 machines (the operating systems and CPUs are very different, and the JVM implementations are also different). In both cases, though, the AtomicCircularArray performed better, and far better on the EC2 instance running the Sun JVM.

These tests provided me with the data to choose AtomicCircularArray and as mentioned above it is similar to what I’m now using in production.

Filed under: Code, Performance

Simple Sparkline using SVG Path and D3.js

I’ve been playing with SVG visualization and the d3.js library (the successor to Protovis).

As a starting point this is a simple line chart used as a sparkline. The HTML and JavaScript provide boilerplate from which more complex visualizations and charts can be built.

Here are links to the code and working example:

http://bl.ocks.org/1133472
https://gist.github.com/1133472

To make the size more suitable for inline use as a sparkline, decrease the ranges:

var x = d3.scale.linear().domain([0, 10]).range([0, 20]);
var y = d3.scale.linear().domain([0, 10]).range([0, 10]);

 

UPDATE: I added another version that shows animations with transformations and transitions.

http://bl.ocks.org/1148374
https://gist.github.com/1148374

Filed under: Code, User Interface

Tweet Archiver

I wanted to learn Ruby, so I wrote a script to do something I’ve wanted for a while – a simple mechanism to retrieve all of my Tweets using the Twitter API.

https://github.com/benjchristensen/TweetArchiver

Eventually I’ll add the functionality I want that will expand all shortened links and perhaps even download pictures that I’ve linked to or posted via services such as TwitPic.

Anyway, Ruby is kind of perfect for this type of hacking it seems and now I can archive my Twitter history easily (and yes I know there are lots of other people who have done this before … don’t bother telling me, I just needed something more than “Hello World”).

Filed under: Code

Dynamic Directory of Jar Files in Classpath via Eclipse Plugin

I came across a scenario where I needed Eclipse to dynamically add a folder of jar files to the classpath and found out that Eclipse doesn’t support this out of the box (no idea why … IntelliJ does).

So I began Googling and found “how” to solve it by writing a classpath container, but could not find a pre-packaged solution.

The how was explained here: https://www.ibm.com/developerworks/opensource/tutorials/os-eclipse-classpath/

I took that example and tweaked it into a working plugin. It now lives at GitHub where the plugin and source can be downloaded.

Once the plugin is installed in Eclipse, edit the “Java Build Path” on a project and click “Add Library” and choose “Directory Container”:

Then choose the folder (a subfolder of the project so it’s relative) and define which file extensions it should include:

Once saved this library will show up like any other Eclipse classpath library and show all Jar files from the selected folder … and most importantly will dynamically update the classpath when refreshed to whatever is in that folder.
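The core of that behaviour is re-listing the folder on each refresh. The sketch below shows that step in plain Java; it deliberately does not use the Eclipse classpath container API, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, independent of Eclipse: finds the archive files in a folder.
final class DirectoryEntries {
    private DirectoryEntries() {}

    // Lists the files directly inside `dir` whose extension (lower case, without the dot) is in `extensions`.
    static List<Path> list(Path dir, Set<String> extensions) throws IOException {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                String name = entry.getFileName().toString().toLowerCase(Locale.ROOT);
                int dot = name.lastIndexOf('.');
                if (dot >= 0 && Files.isRegularFile(entry)
                        && extensions.contains(name.substring(dot + 1))) {
                    matches.add(entry);
                }
            }
        }
        Collections.sort(matches); // deterministic order so entries do not shuffle between refreshes
        return matches;
    }
}
```

Calling `list` again after a jar has been dropped into the folder returns the new file as well, which is the "dynamic" part: the folder, not a hand-maintained list, is the source of truth.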

I hope this helps someone else needing the same behavior in Eclipse!

Filed under: Code, Tools

Trigger Native Javascript Events with Prototype.js

After much searching and playing with various solutions, I found one at this page:

http://stackoverflow.com/questions/590289/javascript-event-that-fires-without-user-interaction/590339#590339

This provides an easy method for programmatically invoking native events such as “onchange”.

Here it is:


// this supports triggering native events such as 'onchange',
// whereas prototype.js Event.fire only supports custom events
function triggerEvent(element, eventName) {
    // safari, webkit, gecko
    if (document.createEvent) {
        var evt = document.createEvent('HTMLEvents');
        evt.initEvent(eventName, true, true); // bubbles, cancelable
        return element.dispatchEvent(evt);
    }

    // Internet Explorer
    if (element.fireEvent) {
        return element.fireEvent('on' + eventName);
    }
}

This is primarily to deal with prototype.js not allowing the firing of native events (which jQuery does).

Here is another approach I found but have not tried which adds the capability to prototype.js: event.simulate.js

Filed under: Code

Technical Debt Quadrant

Martin Fowler wrote a blog entry this week that explains the concept of “technical debt” and classifies its different kinds very well.


Some favorite portions:

“A mess is a reckless debt which results in crippling interest payments or a long period of paying down the principal.”

“The prudent debt to reach a release may not be worth paying down if the interest payments are sufficiently small – such as if it were in a rarely touched part of the code-base.”

“Not just is there a difference between prudent and reckless debt, there’s also a difference between deliberate and inadvertent debt. The prudent debt example is deliberate because the team knows they are taking on a debt, and thus puts some thought as to whether the payoff for an earlier release is greater than the costs of paying it off. A team ignorant of design practices is taking on its reckless debt without even realizing how much hock it’s getting into.”

“while you’re programming, you are learning. It’s often the case that it can take a year of programming on a project before you understand what the best design approach should have been. Perhaps one should plan projects to spend a year building a system that you throw away and rebuild, but that’s a tricky plan to sell. Instead what you find is that the moment you realize what the design should have been, you also realize that you have an inadvertent debt.”

Filed under: Architecture, Code, Management & Leadership
