Monthly Archives: September 2013

A Trie in PHP

A trie is a tree used to compactly represent a natural-language dictionary. The nodes themselves contain only a part of the information they represent: the node’s value must be appended to its parent’s key in order to obtain the information it represents. Most nodes in a trie (in some tries, all nodes) contain a single character. A node’s children contain characters which may be appended to its key in order to form a word in the dictionary.

That description is probably better if you already understand what a trie is supposed to do. A picture works better.

A trie representing the words 'cart,' 'care,' 'cat' and 'coins'

A trie storing the words ‘cart,’ ‘care,’ ‘cat’ and ‘coins’

The trie above could just as well store ‘car,’ as well, so long as we give each node a flag to determine whether its key is a word in our dictionary.

Here’s the code for the trie and the trienode:

And the code to try it out:

Which gives the expected output.

A Binary Tree in Java

A tree is a data structure in which each node contains links to zero or more child nodes, but only zero or one parent nodes. Furthermore, in any tree, there should be only one node with zero parents. We refer to the parent-less node as the ‘root’ of the tree. Nodes without children are called ‘leaf nodes’ or simply ‘leaves.’

A tree

A tree

A binary tree is a tree in which a node may have only zero, one or two children.

A binary tree 

Limiting the number of child nodes to two allows us to store data for more efficient searching later, especially if we are careful about how we insert new nodes into the tree.

Much like with the linked list, we define a node class which stores the data and the relationship between each node and its children:

In fact, it’s safe to think of the node as being a tree in and of itself: instead of nodes with left and right child nodes, we could speak of trees with left and right subtrees. We could put everything for the entire binary tree concept in a single class, and it wouldn’t be that hard to understand. That’s not possible for all data structures, and I’d like to keep my conventions similar across this entire project, so I’m going to stick with the node/structure class distinction.

Since the node  itself contains the bare minimum functionality for the tree, the bare minimum tree class is somewhat spartan.

And let’s round things out with some client code.

Which gives the expected output:

Implementing just a binary tree as such is a little bit silly. The real advantages of the structure become clear with binary search trees, in which we add the restriction that a node’s left child (and their left children) should have values less than the node itself, and that the right children (and their right children) should have values greater than the node itself. This restriction allows us to search the tree very quickly.

Simple Linked List in C#

Before I start interviewing in earnest, I need to be well-versed on some widely-used data structures and algorithms. We covered all of them (and more) as part of UM’s CS curriculum, but that was long enough ago that I wouldn’t pass the finals now. Implementing these data structures will grant me enough familiarity with them that I’ll feel comfortable answering relevant questions in an interview.

The necessary data structures are linked lists, binary trees, tries, stacks, queues, vectors, and hash tables, as suggested by McDowell’s Cracking the Coding Interview. I intend to cover them in order, rotating between C#, Java and PHP. As I continue this process, I may repeat the exercises in other languages.

I’m not going to duplicate the functionality available in .NET, Java or PHP’s built-in data structures. Even if I could write a better ArrayList than Microsoft, the effort would not bear much fruit. Instead, I’m going to illustrate my understanding of the core functionality of each data structure, with no regard for whether the results are useful.

At its core, a linked list is a collection of nodes, each of which stores a piece of data and a link to another node. In order to illustrate the concept, the list class should also have methods to add or delete nodes. I’ll define each node’s data as a generic in order to build good habits, and provide a GetEnumerator() method in order to more quickly display the contents of the list.

A linked list with two nodes

A linked list with two nodes

Here’s the linked list itself:

And a simple test in the console:

The output of the test is as expected:

Because we Insert() nodes at the head of the list, we end up enumerating the nodes in the opposite order of insertion, as if we’d pushed and popped to and from a stack. We also Delete() from the head of the list.

Making this basic linked list into something we’d actually use for a more extensive application would mean adding most or all of the methods Microsoft exposes in their LinkedList(T) class. Perhaps my most notable omission is that I don’t track the last (or ‘tail’) node, which I’d want to do in order to use this list as anything other than a stack. After adding a ‘last’ member, I’d add InsertAtLast() and DeleteLast() methods, then rename the existing Insert() and Delete() to InsertFirst() and DeleteFirst(). I’d also certainly want Find(), AddAfter() and Remove(T) methods.

The code could also be made more robust with some exception handling, especially for dealing with empty lists.

Microsoft’s Introduction to Generics and the StackOverflow question “How do I implement a Linked List in Java?” helped me write this post.

FizzBuzz!

Jeff Atwood of Coding Horror wrote a post back in 2007 lamenting the incompetence of applicants to programming jobs. He used “FizzBuzz” as an example of a problem that every programmer ought to be able to solve. The statement is simple:

Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.

Readers took this as a challenge, and posted their solutions in the comment section. Some of these solutions did not satisfy the requirements. I think most of those programmers could come up with workable solutions if they understood the spec.

Atwood’s follow-up post instructed the readers not to bother with FizzBuzz, since that wasn’t the point of the discussion, but I think publicly presenting solutions to simple problems like this will help me get a job when I finally need one.

As such, I present my FizzBuzz solution in PHP. This isn’t optimized at all, in part because the goal of the exercise is to show the interviewer (or reader) how I think.

In plan English:

  1. Repeat the following steps for the integers from 1 to 100, inclusive.
  2. If the integer is a multiple of 3, write “Fizz,” and make a note of the fact that you wrote something.
  3. If the integer is a multiple of 5, write “Buzz,” and make a note of the fact that you wrote something.
  4. If you didn’t already write something for this integer, write the integer itself.
  5. Write the next thing on a new line.

Some other correct solutions treat multiples of 15 as a special case, which allows them to use ‘else if’ and refrain from making a note of whether we’re supposed to write the integer. That’s probably no less robust, but my immediate instinct upon reading the problem statement is what I wrote above.