Monday, April 15, 2013

My attempt in solving Conway's game of life - a post mortem


Recently, I'd applied for a Senior Developer position in Thoughtworks. Like they usually do, I was asked to solve any one of the problems from this list
- Merchant's guide to Galaxy
- Conference track management
- Conway's Game of Life

I chose Conway's Game of Life. One of my friends had applied for Thoughtworks a few months ago and I remember discussing at length with him about the problem. To begin with, I had a fair idea of what to do. The problem looks fairly simple. You'll be given a cell pattern (comprising of live and dead cell). The cell pattern would grow/shrink based on the neighbouring cell's state. There are rules to govern that.

For instance, any live cell with more than 3 live neighbours will die due to over crowding.
Any live cell with less than 2 live neighbours will die of loneliness. Etc
Check out this wiki link for more details about the game. 

My approach
I wanted to decouple the way in which the rules are applied to the cells so that one rule need not know what the other one does and can be executed exclusively. I came up with this approach of listing down the rules in a properties file in this manner and load them at run time to a list.


1
2
3
#Add Rules this way
#Rule_Name = Rule_Implementation_Class
TestRule = com.java.tw.gol.rules.handlers.TestRuleHandler

Each cell in the universe will be subjected to the rules from the list all at once. 

I was asked to come down to their office for the pair programming round. Two other employees sat with me to analyze the code. I was asked to explain my design and I was asked questions on the rationale/logic behind my design. They gave an extension to the problem and my design handled it #likeABoss without much code change.

YET, I FAILED TO CLEAR THE PAIR PROGRAMMING ROUND. It lasted for about 1.5 hrs after which the HR told me this (verbatim)

They were impressed with your design and the way in which you have thought through the problem. But you must concentrate on finer nuances of OOPS.

Pain points in my design

They asked me why I'd added an attribute private String cellValue. I told them that I added it purely to represent a cell(x for a live cell and - for a dead cell).
The actual problem lies not in having an attribute for display purpose but in the way I'd handled the attribute.

Now, imagine we create a cell using this constructor.


1
2
3
4
5
6
7
public Cell(int x, int y, boolean cellState, String value)
{
 this.xPos = x;
 this.yPos = y;
 this.isAlive = cellState;
 this.cellValue = value;
}


Cell c = new Cell(0,0,true,"-"); 

See what I'd done there? We are setting the isAlive attribute to true and representing the cell using '-'. This kind of breaks the whole thing right?


public Cell(int x, int y, boolean cellState)
{
  this.xPos = x;
  this.yPos = y;
  this.isAlive = cellState;
  //Fix
  if(cellState)
   this.cellValue = "X";
  else
   this.cellValue = "-";
} 

My GolUtils class had methods like getNeighbouringLiveCellCount and checkIfCellIsAlive which do the chunk of the business logic. Placing them in Util class was really bad in their perspective. 

CellPatter.java has getters and setters which are exposed publicly. As in, anybody can change the cell state. By exposing the setters we are running into the danger of raw data being tampered. 

Nice to have feature: The user should be able to run the game for n no of times. 

Over all, its was 90 mins well spent. I have learnt to look things the way, which I would have not even bothered earlier. 

Game of Life - Github URL

Do let me know if you see any basic flaws/any improvements that can be done. Please feel free to fork it on Github.

Happy Coding :)
~ cheers.!

Wednesday, April 10, 2013

generate very large excel spreadsheets using apache poi and jdbc

In our last project we had a requirement to generate an excel report. Looks like a straight forward requirement, right? Even though the requirement looks simple on paper, we ran into quite a few issues. The number of records in the excel sheet varied between 1L and 6L. We had a logic in place to put these records across sheets with each sheet containing not more than 60k records. But we faced a few issues while generating the excel.

Problems:
- The system ran out of heap space during excel generation and we ran into OOM errors several times.
- We couldn't paginate the query that we used to pull the data as it was generated dynamically. We had to run the query all at once and pull the data.

Couple of observations:
- setFetchSize - to pull more records from the DB.
The default number of rows that Oracle returns when you query the DB is 10. So, in our case, with a dataset containing well over 2L rows, it meant 20,000 DB hits. We were able to gain a significant improvement in performance by overriding the setFetchSize(int n) before running the query. In our case, we'd set the fetch size to 2000. 
- Using SXSSF implementation instead of the usual XSSF/HSSF implementations.
When we are dealing with excel sheets containing large data sets it's always better to use Apache POI's SXSSF Implementation so that we don't run into OOMs.

Do let me know if there are any better ways to deal with such requirements.

Happy coding.
~ cheers.!