Code Kata - Roman Numerals: C# - Part 2

Following on from completing the first part of the Roman Numerals kata, time to tackle the reverse - converting from Roman Numerals to Arabic.

I'll be following a similar pattern, so again, let's start simple:

[Test]
public void When_passed_i_should_convert_to_1()
{
  Assert.AreEqual("1", RomanNumeralConverter.Convert("I"));
}

This fails of course, so we'll add a quick fix to the top of the Convert method:

if (arabicOrRomanNumeral == "I") return "1";

Back to green, now refactor. There's no requirement to handle error conditions so if the input is an integer, use the existing convert method, otherwise, go down a new code path to convert from Roman Numerals to Arabic:

int arabic;
if (int.TryParse(arabicOrRomanNumeral, out arabic))
  return new RomanNumeralConverter(arabic)
    .ToRomanNumerals()
    .Aggregate(new StringBuilder(),
        (sb, romanNumeral) => sb.Append(romanNumeral.Numeral))
    .ToString();
else
  return "1";

Not much of a change, but I feel I could end up in a long refactoring session without any clear direction. I have some ideas of further changes, but let's add a few more tests to see if this clears the path ahead:

[Test]
public void When_passed_ii_should_convert_to_2()
{
  Assert.AreEqual("2", RomanNumeralConverter.Convert("II"));
}

It's red again, so add a simple fix to the else clause:

return arabicOrRomanNumeral == "II" ? "2" : "1";

Now I think it's time for a bit of a bigger refactoring attempt. First off, I'm going to copy some of the original code and then hopefully be able to simplify it down later. To start with, the return statement will look similar to the Arabic -> Roman Numeral counterpart:

return new RomanNumeralConverter(0)
  .ToRomanNumerals(arabicOrRomanNumeral)
  .Sum(x => x.ArabicValue)
  .ToString();

The first line I use the existing constructor, passing a dummy value. This is a hack and I have made a note to come back to this at the next opportunity. The ToRomanNumerals method is now overloaded - one taking no args which parses the Arabic value, and now this new method which takes the Roman Numeral string. Again, this needs urgent refactoring as it is badly named and confusing, but at this point I need to get back to a successful compile and make sure nothing is broken. Another note is made.

To complete this change, I add the ToRomanNumerals method:

public IList<RomanNumeral> ToRomanNumerals(string romanNumerals)
{
  var parsedRomanNumerals = new List<RomanNumeral>();
  while (romanNumerals.Length > 0)
  {
    var romanNumeral = (from numeral in RomanNumerals.All
                        where numeral.Matches(romanNumerals) != null
                        select numeral.Matches(romanNumerals)).First();
    romanNumerals = romanNumerals.Substring(romanNumeral.Numeral.Length);
    parsedRomanNumerals.Add(romanNumeral);
  }
  return parsedRomanNumerals;
}

I could have simply made this method the return arabicOrRomanNumeral == "II" ? "2" : "1"; statement, but feel I can make a bigger leap. This is very similar to the previous ToRomanNumerals method, but in this case take off Roman Numeral characters (as opposed to Arabic values) until there are no more characters left. One more method missing, RomanNumeral.Matches which takes a string:

public RomanNumeral Matches(string romanNumeral)
{
  if (this.Numeral[0] == romanNumeral[0]) return this;
  return null;
}

As the ToRomanNumerals method was a big jump, I've temporarily ignored the substitution rules for the Matches method. I will be coming back to this.

With this Matches method, we're back to compiling, and running the tests, we're green. Now I could either go back and fix some of the points I noted down during the initial implementation or go ahead and add some more tests. I plump for the former. Tackling the first issue, I need to fix the constructor argument. For this, I will simply move the argument to the ToRomanNumerals method. In the Convert method, move arabic argument from the constructor to the ToRomanNumerals method.

The constructor can be removed entirely, along with the Arabic field.

Compiling and the tests still pass. Next, I noted the ToRomanNumerals method. I rename it to Parse. There is also some duplication - most notably the search through the RomanNumerals.All collection. Finally, I decide to not return the parsed list of RomanNumeral objects, but save them into a field. Therefore, I add an additional accessor for this field:

private IList<RomanNumeral> parsed;
public RomanNumeralConverter() { }
public IList<RomanNumeral> Parsed
{
  get { return this.parsed; }
}

public RomanNumeralConverter Parse(int number)
{
  var parsed = new List<RomanNumeral>();
  while (number > 0)
  {
    var romanNumeral = RomanNumerals.FindFirstMatching(numeral => numeral.Matches(number));
    number -= romanNumeral.ArabicValue;
    parsed.Add(romanNumeral);
  }
  this.parsed = parsed;
  return this;
}

public RomanNumeralConverter Parse(string romanNumerals)
{
  var parsed = new List<RomanNumeral>();
  while (romanNumerals.Length > 0)
  {
    var romanNumeral = RomanNumerals.FindFirstMatching(numeral => numeral.Matches(romanNumerals));
    romanNumerals = romanNumerals.Substring(romanNumeral.Numeral.Length);
    parsed.Add(romanNumeral);
  }
  this.parsed = parsed;
  return this;
}

The FindFirstMatching method is extracted into the RomanNumerals class:

public static RomanNumeral FindFirstMatching(Func<RomanNumeral, RomanNumeral> matcher)
{
  return (from numeral in RomanNumerals.All
          let matched = matcher(numeral)
          where matched != null
          select matched).First();
}

The Convert method has to be changed to now call the Parsed property after the Parse method call. There are still a number of things I'm not happy with - the Convert method has two pretty long chained method calls. Clearly the API isn't quite right forcing the client to make too many calls. Looking at the calls, both are making a Parse, followed by a Parsed - this should be one call. Additionally, the Aggregate/Sum calls could be pushed down into instance methods. I had to do this in a number of steps, but I'll just jump to the final version:

public static string Convert(string arabicOrRomanNumeral)
{
  var converter = new RomanNumeralConverter();
  int arabic;
  if (int.TryParse(arabicOrRomanNumeral, out arabic))
    return converter.WithArabic(arabic).RomanNumeral;
  else
    return converter.WithRomanNumerals(arabicOrRomanNumeral).Arabic.ToString();
}

I'm happier with this Convert method. This involved changing the rest of the class in a number of small ways.

First off, I removed the Parsed getter - this is no longer needed. The Parse methods are renamed WithArabic and WithRomanNumerals as appropriate. The Sum and Aggregate calls are pushed down into separate properties:

public string RomanNumeral
{
  get
  {
    return parsed
      .Aggregate(new StringBuilder(),
        (sb, romanNumeral) => sb.Append(romanNumeral.Numeral))
      .ToString();
  }
}

public int Arabic
{
  get { return parsed.Sum(x => x.ArabicValue); }
}

At this point, I'm reasonably happy with the implementation. There remains similarities between the WithArabic and WithRomanNumerals methods, but after some fruitless attempts at refactoring, I'll leave it as is.

Finally, I can get back to the rest of the tests. I feel quite confident except I made a note that I ignored the substitution logic within the RomanNumeral.Matches(string) method. I add a test for III -> 3, which passes and then IV -> 4, which indeed fails. In the Matches method, I need to first see if a substitution applies:

public RomanNumeral Matches(string romanNumeral)
{
  var modifiedNumeral = this.Modified;
  if (modifiedNumeral != null && romanNumeral.StartsWith(modifiedNumeral.Numeral))
    return modifiedNumeral;

  if (this.Numeral[0] == romanNumeral[0]) return this;
  return null;
}

private RomanNumeral Modified
{
  get
  {
    if (canBeModifiedBy == null) return null;
    return new RomanNumeral(canBeModifiedBy.Numeral + this.Numeral, this.ArabicValue - canBeModifiedBy.ArabicValue);
  }
}

I added a helper Modified property. I also change the Matches(int) version to make use of this property. The IV test now passes. I add tests for V to VIII and the reverse selection of the random set of numbers. All pass without further changes.

Phew! That's it, all tests are green, I can convert from Arabic to Roman Numerals and vice versa. Here's the final code listing. As noted previously, the one part of code I'm not particularly happy with is the similarities between the WithArabic and WithRomanNumerals implementations, but as mentioned in the intro in part 1, sometimes you get design leaps which greatly improve the software - this didn't come to me while doing the problem this time around. I may well try this problem again - perhaps attacking the problem differently - to see if the solution is vastly different.

While doing this second part, I felt I spent much more time refactoring than implementing "new" logic. Effectively, doing the reverse is very similar so it was almost a case of copy/paste/refactor. I think next time I'd try a more 'parallel' approach to see if this 'flows' better.

In summary, I think I need more practice! Which is good, as this is one of the aims of doing these katas. I will be trying this again, in all probability in a different language first before coming back for another go in C#.

Till next time, I encourage you to keep practicing too!


Comment Guidelines
See the FAQ for details on the full rules and guidelines. No Spam. Write clearly and thoughtfully - no bad language.