Test Driven Development by Example

 

See the c# version of this tutorial here.

This is a tutorial on test driven development using google test.  My two favorite references on TDD are “Test Driven Development by Example”, by Kent beck, and this short interview transcription from Joel Spolsky. The book covers the topic very well, and is example based, which is why I like it. The interview transcription tempers things down to reality a bit.

test_driven_development_by_example

Here are a couple sentences from the book that best define Test Driven Development for me.

“Clever play on words in the title. Test driven development is development by example. This book is also structured by example”.


The two rules of Test Driven Development

  “Test-Driven Development:
              • Don’t write a line of new code unless you first have a failing automated test.
              • Eliminate duplication”

  “The two rules imply an order to the tasks of programming:
               • Red — write a little test that doesn’t work, perhaps doesn’t even compile at first
               • Green — make the test work quickly, committing whatever sins necessary in the process
               • Refactor — eliminate all the duplication created in just getting the test to work

               Red/green/refactor. The TDDs mantra “

I have come to really like TDD as an individual programmer ( I have not used it in a team setting ), that is projects in which I am the only programmer, be it for a personal project, for practice when learning a new language, or even as part of a team that doesn’t require unit testing.  I  like test driven development because it makes me more productive. Using TDD improves the quality of my code, but the real attraction for me is that I simply get more done, and I don’t mean more lines of code ( TDD naturally results in more lines of code, mostly from the testing code itself). This is somewhat paradoxical, I produce more results in less time while writing more code. If I want to write a program that does x and y, using TDD  I will finish the project faster. I’ll give four reasons why I think this is so.

1) it avoids wasted effort
2) makes it is easier to get started.
3) enables design by coding
4) lends itself to bottom up programming

This is subjective, and won’t be true for everyone, but it works for me.

1) The number on reason I like it is because it avoids wasted effort, before I adopted TDD I would often write a function or custom data structure, call it done and then a few hours later realize that I made a fundamental design error and now need to go back and refactor it. I have found that by insisting on writing a test case first I get things correct early on. You’ll never completely remove wasted effort, but using test driven development has really reduced it for me.

2) I find that with some problems, it can be a struggle to get started, or choose the right place to start. Using a test driven development approach forces me to produce small testable units which has the effect of very quickly separating the necessary from the unnecessary, and gets me into a productive mode.

3) Before taking up TDD, I would always tend to design a program using pen and paper first, then once I had a rough overview, I would start to code. TDD has changed my approach, now I immediately start by writing test cases, then pick up pen and paper periodically as I progress, I am always writing code this way. This combination of coding and pen and paper again results wasting less effort, and I get the design right sooner than if I had not been doing TDD.

4) The workflow of test driven development embodies a bottom-up style of programming. I write a lot of small programs where a bottom up approach works great all the way through to completion, but I find I like to take a bottom up approach even with object-oriented design. I find that when doing OOP,  I end up wasting a lot of effort in creating code that I end up throwing away. If I first take a bottom-up approach write some bit of functionality to wrap my head around the problem space, then I do a much better job when designing classes, and end up keeping the code I wrote at first.

That’s why I like it, now let me show you a project that is both fun, and a great example for learning about Test Driven Development using Google Test or gtest.

This is a simple console program in which you type a word and it displays all anagrams of that word that are in the dictionary. I chose it because I stumbled upon the algorithm to do it, and thought it was really cool.

If you look at different programming books they often discuss anagrams when writing code that generates permutations, as that’s all anagram is. The brute force approach would generate all the permutations of the string, and then look for each of those words in a dictionary. This will work but is very inefficient, first of all generating the permutations is of order n * n!, and the required number of dictionary searches is also of order n * n!.


The algorithm used here makes use of the fundamental theorem of arithmetic, which says that every number can be uniquely written as a product primes, so what we can do is assign each character in the alphabet to a prime number, anagrams of word will have the same set of characters, and thus they will have the same prime factorization, and the fundamental theorem of arithmetic guarantees that no other set of characters will have this factorization . We can create a dictionary data structure to map prime factorizations to lists of words. To compute anagrams we will need to compute the prime factorization of the word then we just do a dictionary search. This approach will have a little higher up front cost in generating the word look up table but the actual computation will be very fast, it will consist of n multiplications per on a n character string, and a single dictionary look up.

You can just follow along here, cutting and pasting code as you go, or you can obtain this code from my github account here.

First you need to download and build gtest. You can download it at https://code.google.com/p/googletest/downloads/list, or get it from the the github package.

To build gtest, extract the executable and go into the msvc folder ( e.g. C:\programming\gest-1.7.0\msvc ). Open and build the gtest.sln or gtest-md.sln, depending on whether you to build gtest  as a static (gtest.sln) or dynamic (gtest-md.sln) library.

First take a look at the README file which explains how the project is structured.

integrates gtest with visual studio express 2010.

there are 3 projects within anagram.sln.
    1) anagramMain - think of this as your "application", i.e. no 
                     testing code will be included or referenced.

    2) anagramTest - this project contains your test cases, and test
                     data. Has refs to anagramLib, and gtest

    3) anagramLib - this project contains the functions that are tested
                    using gtest, and called from your main() 
                    application

 

If you are actually writing code to be released, then you will want to exclude your testing code from the released executable. To simulate this you can create a separate “testing” project which contains all your testing code, in this example, this project is called anagramTest. The actual released executable is anagramMain, which contains no references to testing code ( i.e. does not link with the gtest binary, or contain unit tests ).  The code that you actually want to be covered under unit tests is placed in anagramLib. This approach makes it easy isolate code that is not covered under your unit tests, for example if you were not testing some of your GUI code (GUI code is challenging to unit test ) into the “Main” program, or place in another project.

One more advantage of TDD that is potentially huge is the ease with which it allows a new developer to come up to speed on your code. By going through the unit tests for a project, you get to see into the authors mind, and follow the original development path. So to understand how this project works, let’s start with anagramTest.cpp.

These test cases appear in the order that I wrote them. The individual functions were incorporated later into two classes class anagram, and class anagramUtils

int anagramUtils::getPrime( char c )

which takes a char and returns the associated prime.

unsigned long anagramUtils::getPrimeFactorization(string word) 

which takes a string and returns an integer.

static void createWordIndex( string path, map<unsigned long, vector<string>> &wordIndex); 

is the function that creates and initializes the dictionary  structure, this is done by reading a file that contains a list of words ( named Wordlist.txt, and stored in the project directory), and computing the prime factorization, then populating the map structure.

vector<string> getAnagrams();

returns a vector of anagrams.


// anagramTest.cpp : Defines the entry point for the console application.
//
#pragma once
#include "anagramLib.h"
#include <iostream>
#include <algorithm>
// keep last in include list
#include "gtest/gtest.h"

using namespace std;

TEST(anagramTestCase, getPrimeTest)
{
  int primeA = anagramUtils::getPrime('A');
  int primeE = anagramUtils::getPrime('e');
  int primeQ = anagramUtils::getPrime('Q');
  int primeS = anagramUtils::getPrime('s');
  EXPECT_EQ(primeA, 2);
  EXPECT_EQ(primeE, 11);
  EXPECT_EQ(primeQ, 59);
  EXPECT_EQ(primeS, 67);
}

TEST(anagramTestcase, getPrimeFactorization)
{
  string teststr1 = "one";
  string teststr2 = "two";
  string teststr3 = "parsec";
  EXPECT_EQ(22231, anagramUtils::getPrimeFactorization(teststr1));
  EXPECT_EQ(276971, anagramUtils::getPrimeFactorization(teststr2));
  EXPECT_EQ(23827210, anagramUtils::getPrimeFactorization(teststr3));
}

TEST(anagramTestCase, createWordIndex)
{
  string wordListPath = "../WordLookup.txt";
  map<unsigned long, vector<string>> wordIndex;
  anagramUtils::createWordIndex(wordListPath, wordIndex);

  ASSERT_FALSE(wordIndex.empty());

  // find a prime factorization
  map<unsigned long, vector<string>>::iterator it = wordIndex.find(9409346);
  if( it != wordIndex.end())
   {
    // find a word in the vector associated with that prime factorization
    EXPECT_NE(find(it->second.begin(), it->second.end(), "stain"), it->second.end());
    EXPECT_NE(find(it->second.begin(), it->second.end(), "saint"), it->second.end());
    EXPECT_NE(find(it->second.begin(), it->second.end(), "satin"), it->second.end());
   }
}

TEST(anagramTestCase, createClassInstance)
{
  // generate word index
  string wordListPath = "../WordLookup.txt";
  map<unsigned long, vector<string>> wordIndex;
  anagramUtils::createWordIndex(wordListPath, wordIndex);

  anagram testAnagram = anagram::anagram("parsec", wordIndex);
  vector<string> testVector = testAnagram.getAnagrams();
  EXPECT_NE(find(testVector.begin(), testVector.end(), "parsec"), testVector.end());
  EXPECT_NE(find(testVector.begin(), testVector.end(), "capers"), testVector.end());
  EXPECT_NE(find(testVector.begin(), testVector.end(), "scrape"), testVector.end());
  EXPECT_NE(find(testVector.begin(), testVector.end(), "spacer"), testVector.end());
  EXPECT_NE(find(testVector.begin(), testVector.end(), "escarp"), testVector.end());
  EXPECT_NE(find(testVector.begin(), testVector.end(), "sparce"), testVector.end());

}

int main(int argc, char** argv)
{
  testing::InitGoogleTest(&argc, argv);
  RUN_ALL_TESTS();
  std::getchar(); // keep console window open until Return keystroke
  return 0;
}

 

anagramLib.h and anagramLib.cpp contain the declarations and definitions for the two classes.

 


#pragma once
#include <string>
#include <vector>
#include <map>

using namespace std;

class __declspec(dllexport) anagram
{
  private:
    // word to find anagrams of
    string initialWord;

    // prime factorization of the initial word
    unsigned long prime_factorization;

    // list of anagrams for the initial word
    vector<string> anagrams;

  public:
    // getter methods
    string getInitialWord();
    vector<string> getAnagrams();

    // ctor
    anagram(string word, map<unsigned long, vector<string>> &wordIndex);
};

/* utility class, private ctor, and static methods only */
class __declspec(dllexport) anagramUtils
{
  private:
    // private ctor
    anagramUtils() { /* never implement this function */  };

  public:
    // return a prime associated with the supplied character
    static int getPrime( char c );

    // returns a prime factorization of the supplied word
    static unsigned long getPrimeFactorization( string word );

    // populate the supplied wordIndex
    static void createWordIndex( string path, map<unsigned long, vector<string>> &wordIndex);
};

 

anagramMain is the “application”, it contains a loop that prompts the user for a word, then calls getAnagram() on that word and displays the output.

 


// anagramMain.cpp : Defines the entry point for the console application.
//
#pragma once
#include "stdafx.h"
#include "AnagramLib.h"
#include <iostream>

int _tmain(int argc, _TCHAR* argv[])
{
  string word;
  // generate word index
  cout << "Generating word Index ......" << endl;
  string wordListPath = "../WordLookup.txt";
  map<unsigned long, vector<string>> wordIndex;  // would prob be better if file path and index were global
  anagramUtils::createWordIndex(wordListPath, wordIndex);

// main loop body
do
{
  cout << "Enter a word or (-1) to exit" << endl;
  getline(cin, word);

  // create the anagram object and display
  anagram newAnagram = anagram(word, wordIndex);
  vector<string> tempAnagrams = newAnagram.getAnagrams();
  for(vector<string>::iterator vecIt = tempAnagrams.begin(); vecIt != tempAnagrams.end(); vecIt++)
    {
      cout << *vecIt << endl;
    }

} while (word != "-1");

 return 0;
}

 

 Posted by at 5:16 pm

 Leave a Reply

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

(required)

(required)


Time limit is exhausted. Please reload CAPTCHA.