Last week: Karate Chop
This is going to be a longer post because my goal was to follow the kata description and try out five different approaches. I will explain all of them:
Approach 1: Object oriented PHP
My goal in this approach was to use as little PHP functions as possible and implement the algorithm in an object oriented way
The final solution had the following classes:
interface IntegerHaystack { public function positionOf(int $needle) : int; public function isEmpty() : bool; } class SortedIntegers implements IntegerHaystack { public function __construct(int ...$integers) { ... } ... } class SortedIntegersRange implements IntegerHaystack { public function __construct(SortedIntegers $sortedIntegers, int $indexFrom, int $indexTo) { ... } ... } class SortedIntegersElement implements IntegerHaystack { public function __construct(SortedIntegers $sortedIntegers, int $index) { ... } ... } class ListEmpty extends \RuntimeException { } class NotFound extends \RuntimeException { }
SortedIntegers
is an immutable list of integers and SortedIntegersRange
represents a range within a SortedIntegers
object. I’ll explain SortedIntegersElement
later.
The test case itself was one very simple PHPUnit test with data provider. I created a function according to the specification that looked like this in the end:
function chop(int $needle, int ...$haystack) : int { try { $sorted = new SortedIntegers(...$haystack); return $sorted->positionOf($needle); } catch (NotFound $e) { return -1; } catch (ListEmpty $e) { return -1; } }
Before my final refactoring, SortedIntegersRange::positionOf
looked like this:
public function positionOf(int $needle) : int { if ($this->indexTo > $this->indexFrom) { switch ($this->sortedIntegers->at($this->indexMiddle()) <=> $needle) { case 1: return $this->subRangeTo($this->indexMiddle() - 1)->positionOf($needle); case -1: return $this->subRangeFrom($this->indexMiddle() + 1)->positionOf($needle); default: return $this->indexMiddle(); } } if ($this->sortedIntegers->at($this->indexFrom) === $needle) { return $this->indexFrom; } throw new NotFound(); }
I was not happy with it because all the conditions. For example it was not obvious that the second if statement is made for the case that the range only has one element. After I made it explicit with a private method isSingleElement()
, I noticed that it would not make a difference to always check the first element before halving the range if we assume that a comparison between two integers in the list costs the same than a comparision between two indexes.
Thus, the next iteration was:
public function positionOf(int $needle) : int { try { if ($this->sortedIntegers->at($this->indexFrom) === $needle) { return $this->indexFrom; } switch ($this->sortedIntegers->at($this->indexMiddle()) <=> $needle) { case 1: return $this->subRangeTo($this->indexMiddle() - 1)->positionOf($needle); case -1: return $this->subRangeFrom($this->indexMiddle() + 1)->positionOf($needle); default: return $this->indexMiddle(); } } catch (\OutOfRangeException $e) { throw new NotFound(); } }
I handled the case of a one-element range with no matching element by throwing out of range exceptions in subRangeFrom()
and subRangeTo()
if the start or end point are not within the current range.
Now the method looked a bit clearer, but using exceptions for control flow seemed like cheating, so I gave the previous approach another try. I introduced a new class for a range with one element, SortedIntegersElement
, a method isEmpty()
that returns true if from > to
and made the rest a bit more explicit.
I considered replacing the switch statement and the spaceship operator with “if” conditions, but switching the arguments and reordering the cases already helped a lot to make it more clear:
public function positionOf(int $needle) : int { if ($this->isEmpty()) { throw new NotFound; } if ($this->isSingleElement()) { return $this->firstElement()->positionOf($needle); } switch ($needle <=> $this->valuePivot()) { case -1: return $this->firstHalf()->positionOf($needle); case 0: return $this->indexPivot(); case 1: return $this->secondHalf()->positionOf($needle); } }
Now it is clear that there are three cases, empty range, single element range and range with >1 elements. Also all lower level details are hidden in private methods, following the “Single Level of Abstraction” principle.
Off-by-one Errors
As the Kata description predicted, I ran into an off-by-one error because I first used the position of the pivot element as new border for the subset instead of its neighbors. This resulted in infinite recursion in some cases.
Approach 2: Esoteric
I knew it would be difficult to come up with five different approaches, so I tried to think outside the box early and solved the Kata in CJam, an esoteric programming language (What’s this?). CJam is stack based, but also has variables, and all operators/functions consist of only one or two characters.
This is the program
ri:A;{r:I}{RIi+:R;}wR,1-:N;W:F;{FW=TN1+<e&}{NTm2/iT+:P;RP=A=P-1?:F;RP=A<{P1+:T;}{P1-:N;}?}wF
I “worked” with languages like this before for fun, but bringing TDD to them was new to me. Since CJam does not have its own testing capacities, I wrote the test in Bash:
#!/usr/bin/bash function test { RES=`echo $1 | java -jar ./cjam-0.6.5.jar ./kc.cjam` if [ "$RES" == "$2" ]; then echo -n "." else FAILURES=$FAILURES"Input $1 - expected $2, got $RES\n" echo -n "F" fi } FAILURES="" test "3" -1 test "3 1" -1 test "1 1" 0 test "1 1 2" 0 test "2 1 2" 1 test "2 3 5 7" -1 test "3 3 5 7" 0 test "4 3 5 7" -1 test "5 3 5 7" 1 test "6 3 5 7" -1 test "7 3 5 7" 2 test "8 3 5 7" -1 echo echo -e $FAILURES
And using TDD worked really well here. So, having no test framework is no excuse to write no tests!
Errors
I avoided the same off-by-one error from the first approach, lesson learned! I still ran into infinite loops sometimes, but only because I still was figuring out how the various CJam operators work.
Approach 3: Recursive Ruby Function
On the third day I implemented a recursive function in Ruby that passes a part of the array to itself:
def chop needle, haystack begin chop_recursive needle, haystack rescue return -1 end end def chop_recursive needle, haystack pos = (haystack.length / 2).floor if haystack[pos] == needle then return pos elsif haystack.length <= 1 then raise 'not found' elsif haystack[pos] > needle then return chop_recursive(needle, haystack[0..pos-1]) else offset = pos+1 return offset + chop_recursive(needle, haystack[offset..-1]) end end
Errors
I did not want to add offset
as parameter to the recursive function, and so one issue that occured was that if the needle was not found in the second half of the array, the offset was added to -1
. I solved it using a “not found” exception eventually.
Approach 4: Plain old loop in PHP
On the fourth day, I went back to PHP, but in an old school procedural way.
<?php declare(strict_types=1); namespace Kata\Loop; function chop(int $needle, int ...$haystack) : int { $left = 0; $right = count($haystack) - 1; do { $pos = middle($left, $right); if ($left > $right || !isset($haystack[$pos])) { return -1; } if ($haystack[$pos] < $needle) { $left = $pos + 1; } else { $right = $pos - 1; } } while ($haystack[$pos] !== $needle); return $pos; } function middle(int $left, int $right) : int { return $left + (int)floor(($right - $left) / 2); }
I realized that the algorithm is pretty much what I did in CJam (while loop, with $left
and $right
pointers), just in a different language.
I started with a single “if”, which worked for 0-1 elements, then refactored it into a loop. What was interesting is that as soon as I got the loop working for two elements, it also worked with more elements, without further ado. I’m not sure if I did too much in advance or the form of a non-linear while loop forced me to do it right.
Approach 5: –
I’m honest, I did not have time to come up with a really different approach. I thought about using a functional language or get familiar with ES6, but the initial effort did not fit into my schedule. So instead of doing nothing, I wrote a recursive function again, this time in PHP. Not much to tell here, except that it’s quite ugly, with array_slice()
and so on.
Sixth Kata: Roman Numerals
Our next kata is “Roman Numerals”: Use TDD to implement a converter of (arabic) numbers to roman numbers (I, III, III, IV, V, …). The system is described here. With which numbers will you start? As a bonus, try to implement the reverse as well, a converter from roman to arabic.
My personal goals this week
Nothing special, I’ll try to do it in a few different ways in both directions.