Design patterns in the real world: Strategy

Quite some software engineers think that design patterns are some overly complicated, mythical, abstract things that bring no practical value to software development. This is unfortunate. In order to prove they are indeed something real, in this (and some upcoming) post(s) we are going to take a look on a few examples on how real software products implement some of the GoF design patterns. Today, we are going to visit Strategy, from HotSpot’s point of view. (See the previous post about Flyweight here).

Strategy defined

Wikipedia defines Strategy as follows:

In computer programming, the strategy pattern (also known as the policy pattern) is a software design pattern that enables an algorithm’s behavior to be selected at runtime. The strategy pattern

  • defines a family of algorithms,
  • encapsulates each algorithm, and
  • makes the algorithms interchangeable within that family.

A usually good starting example is sorting arrays/collections. Let’s say we have a requirement to sort an array of numbers, so we implement BubbleSort, because it’s relatively easy. In a few days, we realize the input data sometimes is much larger then we expected, so we implement QuickSort instead, to speed things up. Now, QuickSort turns out to be great for the large data sets, but for small arrays it implies a lot of overhead. What if we took the best of both approaches and picked the right sorting strategy  at runtime, based on the information we have about the data set? Let’s take a look at the diagram below:

strategy (1)

We simply extracted a common interface of the two sorting algorithms. The interface (or abstract class, depending on implementation) is called the abstract strategy, while BubbleSorter and QuickSorter are the concrete strategies.  Worker can pick the right sorting strategy once it encounters the data to be sorted; that strategy can be passed in to the Client through the constructor or a method argument.

Collections and Arrays classes

Now this may come as no surprise, one of the most evident Strategy implementations in the JDK is introduced by the Comparable interface. And just like in the example above, it has to do with sorting. Just look at the following diagram; it presents how sorting is implemented in Java for Lists and arrays:

sort

The similarities are easy to spot between the two diagrams. Comparator interface represents the abstract strategy, BooleanComparator, TestComparator and UserDefinedComparator are the concrete strategies, while strategy’s clients are represented by Arrays and Collections in this case. Of course, starting from Java 8, classes like UserDefinedComparator are getting rarer and rarer, as Comparator<T> is a functional interface, so the implementation class itself can be replaced with a simple lambda expression.

Let’s take a look on this -at first sight- rather strange dependency of Collections.sort on Arrays.sort. The former method first converts the list into an array, using its toArray method. Next, Arrays.sort method is called, and in a last step the elements of the list are overwritten with the elements of the array, using a ListIterator. So there is no specific sort method implemented for lists, Arrays.sort is reused.

Note that  Collections.sort() can not sort any kind of collection as many think; it can only sort a list. Well, this makes sense if you think about it. If one needs to sort a set, they should use a sorted set in the first place. The same is true for maps – there’s SortedMap that sorts entries based on their keys. For queues, a sort method would contradict its main use case – put new elements at one end, take at the other – or take from both ends in case of Dequeues. The only collection for which it makes sense to have a sort method is List.

But back to the Strategy implementation, Arrays.sort is another interesting thing. This method first checks whether the Comparator passed in is null or not. If it is so, natural order of the elements will be used for sorting. Note that the natural ordering (implemented as an inner class in Arrays, with the name NaturalOrder) assumes the elements of the array are comparables (aka they implement Comparable<T>). If they do not, for some reason, a really unpleasant ClassCastException will be thrown.

Next, the sort method is selected. In most cases, Tim Sort will be used, unless legacy Merge Sort is explicitly requested. The latter is not a particularly good idea, as it is going to be removed in a future release, as per its comment. (In order to request merge sort the JVM has to be started using the -Djava.util.Arrays.useLegacyMergeSort=true VM argument).

I was curious how the two sort methods compare to each other in terms of speed. I did not use any sophisticated measurement methods, I’ve only used increasingly large arrays with random Integer values and measured the time required for sorting. The running times below reflect the time taken by the sort method only, they do not include the time needed to create the test data:

TimSort (ms) Legacy Sort (ms)
10 16 0
100 0 0
1,000 0 0
10,000 15 47
100,000 78 140
1,000,000 1347 628
10,000,000 8048 7348

As you can see, TimSort is usually a bit faster (except for arrays with very few elements) until we reach a pretty large data set; from that point on, the legacy algorithm outperforms the new one.

Java 8 functional interfaces

It is not hard to find more examples of the Strategy design pattern in the JDK, and it is even simpler starting from version 8. Making a quick reference search for the new functional interfaces (Predicate, Function, Consumer, Supplier etc) makes it clear that this pattern is used literally everywhere.

Just one example: ArrayList defines a method forEach, that takes as parameter a Consumer. The ArrayList class has no knowledge of the specific Consumers, it is only aware of the interface. The client of the collection has the responsibility to pass in the desired implementation of that interface. Of course, these Consumers usually do not manifest as a class, but are rather represented as lambda expressions.

HashMap and ConcurrentHashMap also rely heavily on the functional interfaces Function and BiFunction for computing and merging entries.

Advertisements

Author: tamasgyorfi

Senior software engineer, certified enterprise architect and certified Scrum master. Feel free to connect on Twitter: @tamasgyorfi

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s