
  • Sunday, July 03, 2016

    NDepend Pro review

    Background

    NDepend is a tool for analysing code from several angles: code complexity, code coverage, static analysis similar to what StyleCop does, dependency management and many other useful features. A typical enterprise application uses several different tools to achieve these things. I have worked mostly with the .NET Framework in my professional career, where Microsoft Visual Studio is the default option for many of these tasks. I have also used other tools such as NCover for measuring code coverage. Back in 2011 I used NDepend specifically to measure the cyclomatic complexity of different modules in our application.

    How did I come across NDepend Pro?

    Recently I was approached by the developer of NDepend to try the Pro version and evaluate its features. This post is about my experience of using NDepend again (after a gap of almost 5 years) and seeing how it has evolved over that time.

    I started with a very small codebase which I had developed for simulating the Martingale theory. The code is available on GitHub. The codebase is very simple. It consists of a library which computes the amount for a particular trade based on the payout percentage. There is a set of unit tests which exercise the functions of the library. There is also a console application which acts as the client for invoking the library functions.
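
    To give a rough idea of the kind of calculation involved, here is a minimal sketch of a Martingale-style stake calculation. It is not the code from the repository: the class name MartingaleCalculator, the method NextStake and the formula (stake = (losses so far + target profit) / payout ratio) are assumptions chosen for illustration.

```csharp
using System;

// Hypothetical sketch, not the library from the repository.
public static class MartingaleCalculator
{
    // Returns the stake needed so that one winning trade recovers all
    // previous losses and still earns the target profit.
    // payoutPercentage is the profit paid on a win as a percentage of the stake (e.g. 80).
    public static decimal NextStake(decimal lossesSoFar, decimal targetProfit, decimal payoutPercentage)
    {
        if (payoutPercentage <= 0)
            throw new ArgumentOutOfRangeException("payoutPercentage", "Payout must be positive.");

        return (lossesSoFar + targetProfit) / (payoutPercentage / 100m);
    }
}

public static class Program
{
    public static void Main()
    {
        decimal losses = 0m;
        const decimal target = 10m;
        const decimal payout = 80m;

        // Simulate three losing trades in a row and print the stake for each.
        for (int trade = 1; trade <= 3; trade++)
        {
            decimal stake = MartingaleCalculator.NextStake(losses, target, payout);
            Console.WriteLine("Trade {0}: stake {1:F2}", trade, stake);
            losses += stake;
        }
    }
}
```

    The stake grows quickly after each loss, which is exactly the behaviour a simulation of this strategy is meant to expose.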

    Analysis results

    Below is the output of the analysis done using NDepend Pro. Let's look at some of the details available via the different tabs.

    Dashboard

    NDepend Analysis Results

    The summary on the dashboard breaks down into the following categories:

    1. # Lines of code
    2. # Types
    3. Comment %
    4. Method complexity
    5. Code Coverage by Tests (Allows you to import code coverage data from other tools)
    6. Third-party usage
    7. Code Rules

    Personally, I find the Code Rules section very useful, as it gives you hyperlinks to drill down further into the details.

    Dependency Graph

    dependency graph

    This graph gives a visual representation of the relationships between different assemblies. What I like best is the interactivity of the graph. You can hover over a node and the affected nodes are dynamically highlighted in different colors. In the above example we have only 3 namespaces, but you can imagine how useful this can be in a real project with hundreds of classes and many namespaces involved.

    There are multiple options to customize the way the Dependency Graph is represented. The default is based on the number of lines of code, but you can change it to any of the options shown below.

    Dependency graph options

    Dependency Matrix

    The Dependency Matrix gives a matrix view of how different assemblies depend on one another. I find this feature very helpful as it gives, at a glance, a quick representation of the links between different assemblies in the application. The tools which I have used in the past, like Visual Studio and NCover, do not provide such a feature.

    Dependency Matrix

    A well designed application will have a good distribution of classes across different assemblies. You will also be able to see the impact of replacing one thing with another. Let's take an example. Assume you use a third-party component like Infragistics in your application and for some reason you wish to replace it with something else. Using the dependency matrix you can find out which assemblies depend on Infragistics.

    There are multiple options available via the context menu which give in-depth analysis of the code. I have not explored these options so far.

    Metrics Heatmap

    Metrics Heatmap

    The heatmap shows how classes are spread across different namespaces based on cyclomatic complexity. The default measure of cyclomatic complexity can be changed to various other options like IL Cyclomatic Complexity, Lines of Code, Percentage Comment etc.

    For a small codebase like my sample Martingale theory tester, this analysis using NDepend is quite interesting. To make full use of the wonderful features the tool provides, I intend to use it on a much larger codebase. I was recently referring to the CQRS Journey code from the Microsoft patterns & practices team. Let me see what I can discover from this decent-sized codebase using NDepend. I will keep the readers of this blog posted with the details in future posts.

    Conclusion

    I have always been a fan of code quality tools, and NDepend has a lot to offer in this area. I particularly liked the Dashboard and the Dependency Matrix along with the Dependency Graph. I have only scratched the surface and I am excited to try the other features offered by the tool. The feature I am most interested in exploring in future posts is the benchmarking of a codebase. Based on my past experience and the little I have seen so far of the latest version, I would highly recommend NDepend for code analysis. I personally like the integrated nature of the tool, which brings so many aspects of code quality together in one place. You can choose to run it as a standalone application, which is what I did in this instance, or you can integrate it within the Visual Studio IDE. I also like the fact that it integrates nicely into the build process, which is a must in today's world. Interoperability with other tools like TFS, TeamCity and SonarQube is another benefit.

    I personally like the options offered to customize the default settings and configurations. The rules, for example, are mainly derived from what Visual Studio uses by default. You can always choose to filter out the rules not relevant to your analysis. Another nice feature is the code query language offered by NDepend, which is based on LINQ (CQLinq). This gives you a great ability to explore your code using queries.
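
    As a rough illustration of what querying code with LINQ looks like, the sketch below runs a plain LINQ query over a hypothetical MethodMetric class to list the methods above a complexity threshold. MethodMetric, ComplexityReport and the sample data are made up for this example. NDepend's own queries run against its code model rather than a hand-built list, so treat this as an analogy and not as NDepend syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the per-method metrics a code analysis tool collects.
public class MethodMetric
{
    public string Name { get; set; }
    public int CyclomaticComplexity { get; set; }
    public int LinesOfCode { get; set; }
}

public static class ComplexityReport
{
    // Returns the methods whose complexity exceeds the threshold, most complex first.
    public static List<MethodMetric> TooComplex(IEnumerable<MethodMetric> methods, int threshold)
    {
        return (from m in methods
                where m.CyclomaticComplexity > threshold
                orderby m.CyclomaticComplexity descending
                select m).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Sample data invented for the example.
        var methods = new List<MethodMetric>
        {
            new MethodMetric { Name = "CalculateStake", CyclomaticComplexity = 4, LinesOfCode = 12 },
            new MethodMetric { Name = "RunSimulation", CyclomaticComplexity = 23, LinesOfCode = 90 },
            new MethodMetric { Name = "ParseArguments", CyclomaticComplexity = 15, LinesOfCode = 40 }
        };

        foreach (var m in ComplexityReport.TooComplex(methods, 10))
        {
            Console.WriteLine("{0}: complexity {1}, {2} lines", m.Name, m.CyclomaticComplexity, m.LinesOfCode);
        }
    }
}
```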

    There is so much to explore in NDepend that it is impossible to do it in one blog post. One feature which I did not cover in this post is the Queries and Rules Explorer. In my opinion it deserves a dedicated post. I will try to cover some of these features in future. Until next time, Happy Programming.

    Monday, January 04, 2016

    Configure Standalone Spark on Windows 10

    Background

    It's been almost 2 years since I wrote a blog post. Hopefully the next ones will be much more frequent. This post is about my experience of setting up Spark as a standalone instance on a Windows 10 64-bit machine. I got back to a bit of programming after a long gap, and it was quite evident as I struggled to configure the system. Someone else coming from a .NET background and new to the Java way of working might face the same difficulties that I faced over a day to get Spark up and running.

     

    What is Spark

    Spark is an execution engine which is gaining popularity due to its ability to perform in-memory parallel processing. It claims to be up to 100 times faster than Hadoop MapReduce. It also fits well into the distributed computing paradigm of the big data world. One of the positives of Spark is that it can be run in standalone mode without having to set up nodes in a cluster. This also means that we do not need to set up a Hadoop cluster to get started with Spark. Spark is written in Scala and supports the Scala, Java, Python and R languages as of writing this post in January 2016. Currently it is one of the most popular projects among the different tools used as part of the Hadoop ecosystem.

     

    What is the problem in installing Spark in stand alone mode on Windows machine?

    I started by downloading a copy of the Spark 1.5.2 distribution (dated Nov 9 2015) from the Apache website. I chose the version which is pre-built for Hadoop 2.6 and later. If you prefer, you can also download the source code and build the whole package. After extracting the contents of the downloaded file, I tried running the spark-shell command from the command prompt. If everything is installed successfully, we should get a Scala shell to execute our commands. Unfortunately, on a Windows 10 64-bit machine Spark does not start cleanly. This seems to be a known issue, as there are multiple resources on the internet which talk about it.

    When the spark-shell command is executed, multiple errors are reported on the console. The error which I received showed problems with the creation of SqlContext. There was a big stack trace which was difficult to understand.

    Personally, this is one thing which I do not like about Java. In my past experience I always found it very difficult to debug issues, as the error messages often pointed at something that was not the real source of the problem. I hope Java-based tools and applications will be easier to deploy in future. In one sense it is good that it makes us aware of many of the internals, but on the other hand sometimes you just want to install the stuff and get started without spending days configuring it.

    I was following the Pluralsight course on Apache Spark fundamentals. The getting started and installation module of the course was helpful as the first step in resolving the issue. As suggested in the course, I changed the verbosity of Spark's output from INFO to ERROR, and the amount of information on the console reduced a lot. With this change I could immediately see the error about the missing Winutils, a utility required specifically on Windows systems. This is reported as issue SPARK-2356 in the Spark issue list.

    After copying the Winutils.exe file from the Pluralsight course into the bin folder of the Spark installation, I started getting a permissions error for the tmp/hive folder. As recommended in different online posts, I tried changing the permissions to 777 using chmod. This did not seem to fix the issue. I tried running the command with administrative privileges. Still no luck. I updated the PATH environment variable to point to the Spark\bin directory. As suggested, I added SPARK_HOME and HADOOP_HOME to the environment variables. Initially I had put the Winutils.exe file in the Spark\bin folder. I moved it out to a dedicated directory named Winutils and updated the HADOOP_HOME environment variable to point to this directory. Still no luck.

    As many people had experienced the same problem with the latest version, Spark 1.5.2, I thought of trying an older version. Even in 1.5.1 I had the same issue. I went back to the 1.4.2 version released in November 2014 and that seemed to create the SqlContext correctly. But that version is more than a year old, so there was no point in sticking with an outdated version.

    At this stage I was contemplating the option of getting the source code and building it from scratch. Having read in multiple posts about setting the JAVA_HOME environment variable, I thought of trying this approach. I downloaded the Java 7 SDK and created the environment variable to point to the location where the JDK was installed. Even this did not solve the problem.

     

    Use right version of Winutils

    As a last option, I decided to download Winutils.exe from a different source. In the downloaded contents I got Winutils and some other DLLs as well, like Hadoop.dll, as shown in the figure below.

    Winutils with hadoop dlls

    After putting these contents in the Winutils directory and running the spark-shell command, everything was in place and SqlContext was successfully created.

    I am not really sure which step fixed the issue. Was it installing the JDK and setting the JAVA_HOME environment variable? Or was it the update of the Winutils exe along with the other DLLs? All this setup was quite time consuming. I hope this is helpful for people trying to set up a standalone instance of Spark on Windows 10 machines.

    While I was trying to get Spark up and running, I found the following links, which might be helpful in case you face similar issues.

    The last one was really helpful; from it I took the idea of separating the Winutils exe into a different folder and also of installing the JDK and Scala. But setting the Scala environment variables was not required, as I was able to get the Scala prompt without a Scala installation.

    Conclusion

    Following are the steps I followed for installing a standalone instance of Spark on a Windows 10 64-bit machine:

    • Install the JDK (version 7 in my case)
    • Download the Spark distribution
    • Download the correct version of Winutils.exe along with the accompanying DLLs
    • Set the environment variables JAVA_HOME, SPARK_HOME and HADOOP_HOME

    Note: When running the chmod command to set 777 permissions on the tmp/hive directory, make sure to run the command prompt with administrative privileges.
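
    Coming from a .NET background, one way to double check the environment before launching spark-shell is a small console program like the sketch below. The class SparkSetupCheck and its FindProblems method are hypothetical names for this example. The check for winutils.exe under HADOOP_HOME\bin is an assumption based on the "null\bin\winutils.exe" style of error reported for this issue, so adjust the path if your layout differs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical pre-flight check for the environment variables listed above.
public static class SparkSetupCheck
{
    // Returns a list of readable problems; the list is empty when nothing obvious is wrong.
    // getEnv and fileExists are passed in so the check can be tried without touching the real machine.
    public static List<string> FindProblems(Func<string, string> getEnv, Func<string, bool> fileExists)
    {
        var problems = new List<string>();

        foreach (var name in new[] { "JAVA_HOME", "SPARK_HOME", "HADOOP_HOME" })
        {
            if (string.IsNullOrWhiteSpace(getEnv(name)))
                problems.Add(name + " is not set");
        }

        var hadoopHome = getEnv("HADOOP_HOME");
        if (!string.IsNullOrWhiteSpace(hadoopHome))
        {
            // Assumption: the executable is expected in the bin folder under HADOOP_HOME.
            var winutils = Path.Combine(hadoopHome, "bin", "winutils.exe");
            if (!fileExists(winutils))
                problems.Add("winutils.exe not found at " + winutils);
        }

        return problems;
    }
}

public static class Program
{
    public static void Main()
    {
        var problems = SparkSetupCheck.FindProblems(Environment.GetEnvironmentVariable, File.Exists);

        if (problems.Count == 0)
            Console.WriteLine("Environment looks ready for spark-shell.");

        foreach (var problem in problems)
            Console.WriteLine("Problem: " + problem);
    }
}
```

    This only verifies that the variables and the file are present. It does not tell you whether the Winutils build matches your Hadoop version, which is what seemed to matter in my case.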

    Sunday, March 24, 2013

    Software Craftsmanship

    Background

    After a gap of a few months, I went through the list of items in my Google Reader feed. As I was reading through more than 400 blogs, I came across many posts in quick succession related to a topic which remains very close to my heart. It is Software Craftsmanship, as the title of this post suggests. I have been thinking about writing my views on this topic for quite some time. The thoughts expressed by other writers in their posts encouraged me to express my own views on this subject.

    Software Craftsmanship is a term I became familiar with by reading the posts and books and watching the videos of Uncle Bob Martin. To me it is a state that a professional gets into where he starts treating his day-to-day work as a craft. For him it is not merely a way of earning a monthly salary to pay off the bills, but a way of showcasing his skills, talents and creativity and bringing out the best in himself. If you are from the software industry you will be able to relate to my thoughts much better than people from other professions. But I believe that it is prevalent in other professions as well. How many times do you come across individuals whom you meet for the first time and you immediately know that there is something special about them? This will not be visible every time, but if not the first time, within a couple of meetings or discussions with them you know that they are different from the usual crowd. In my experience there are very few people like this whom we can call craftsmen (at least in the software world).

    What are the qualities which make these special breed of people so different?

    There are numerous qualities that these people exhibit. To me the single most important thing that distinguishes craftsmen from experts is their passion. There are so many experts in this field. But the passion displayed by the people whom I consider to be craftsmen is simply unmatched. It almost seems like their passion becomes their motive in life. They are driven by the desire to do something different, something which others can probably only dream of. These craftsmen put a lot of time and effort into building their passion and honing their craft. Their dedication towards constantly improving, learning and spreading their craft is something to look up to. They are not afraid of sharing their ideas with others. At the same time, they are not afraid of accepting others' ideas and correcting their own mistakes.

    You will easily be able to identify a craftsman among a herd of ordinary people. They will be willing to showcase their expertise when the need arises. Usually they are not the people who will talk for the sake of talking. They know very well that sometimes silence is the best wisdom. There are people who tend to give their views and opinions irrespective of whether they are required or not. But a craftsman will know exactly when to express his concerns and when to stamp his authority with a killer punch, demonstrating his superiority through his work rather than through his words.

    Far too often I find management and consulting companies comparing individuals based on number of years of experience. In my opinion, a job description which asks for a person with X number of years of experience is the first step towards comparing an apple to an orange. In the software industry you can never get two individuals with the same years of experience and the same skill sets producing the same output. Let me take an example. In one of the projects I was working on, there were two professionals working with me, one with more than 8 years of experience and the other with just about 4 years of experience. But the output that I was getting in terms of quality and efficiency from the person with 4 years of experience was much more than that from the person with 8 years. In an ideal world you would expect the output of an individual with 8 years of experience to be at least 1.5 times that of someone with 4 years, if not exactly double. Why do you think this happens?

    One of the greatest challenges in the software industry is constantly changing technology. Some people become obsessed with one technology and they just don’t want to change. Even if they realize that the new technology is better, they are reluctant to put in the effort required to upskill themselves. They are too comfortable with what they know. This could be one scenario. Another scenario is that they somehow got hold of one technology and it's too much of a burden for them to move on to something else. In my experience, there is another extreme where some people are very quick in adjusting to new tools and technologies. But they think it is only their responsibility to know about it and they do not share it with others. Even if you ask these people for help, they will not give you complete information, or will purposely give you misleading information. My experience says that these people are mostly protective about their jobs and are afraid of giving the correct information, fearing that someone might take their job.

    I remember having a conversation with a close friend of mine. We were discussing people changing jobs frequently and the impact that can have on the dynamics of a team or organization. He told me what his boss had said to him. His boss was of the opinion that people who are insecure about their jobs are the ones who play all these tactics. Even in the worst market conditions, good people will be able to find their way.

    So coming back to the topic of craftsmanship, one thing I have seen in craftsmen is that they are very adaptive in nature. They adapt very quickly to changing needs. They can quickly grasp new things and find the right places to make use of their knowledge. This does not mean that they jump on to each and every new buzzword. They are able to filter out the noise and concentrate on things which really matter. They are usually the ones who try to understand why things work in certain ways rather than saying it should work because so-and-so says it works for him. They try to keep things as simple as possible.

    How would you identify the work of a craftsman?

    It doesn’t take much effort to recognize the work of a craftsman. A craftsman will choose the simplest solution to the most complex problem that is presented to them. The reverse is also true: from others you will find the most complex solution to the simplest problem in the world. That is one way of identifying craftsmanship. Their work is the hallmark of their craftsmanship. Simplifying things for themselves as well as for others who work with them is what they are good at. They can do it time and again. They are not the ones who will wait for others to tell them what needs to be done next. They will be eager to get on with the next thing that needs to get done. If they are stuck somewhere in between, they are not afraid of taking help from others. There is a difference between finding the perfect solution and finding the best solution. Usually people will strive to find the so-called perfect solution. In an attempt to find the perfect solution, they will tend to complicate things much more than what is required. A craftsman, on the other hand, will be the one looking for the best possible solution rather than the perfect solution.

     

    Some of the craftsmen I have known so far

    Here is a small list of people whom I consider craftsmen in their own right

    Robert Martin also known as Uncle Bob

    Martin Fowler

    Kent Beck

    Ron Jeffries

    Scott Guthrie

    Scott Hanselman

    Phil Haack

    Steve Sanderson

    Joel Spolsky

    Jon Skeet

    Laurent Bugnion

    These are some of the most famous ones. If you are associated with software, you would have heard about at least one of these guys. Apart from these people who are known worldwide for their work, there are some people from my close circle who are doing great work in their own right. Some of these people I have met personally and many through other forums like the BDotNet community group. Some I have not met personally but still consider them craftsmen and my mentors.

    Vinod Kumar


    I have been following Vinod’s blog for quite some time. I wonder where this guy gets all his ideas and inspiration from. Every blog post of his is so full of information. Keep up the good work Vinod, I am sure you have inspired many software developers of our generation through your writings.

     

     

     

     

     

    Pinal Dave

    Do a Google search for a question related to SQL Server and most likely you will find a post about it by Pinal Dave on his blog http://blog.sqlauthority.com/. He is a SQL authority, as the name of his blog suggests.

    Lohith G N


    Lohith is my ex-colleague. We worked together for a very short duration in my first workplace. Although we have both moved on since then, we have been in touch through emails and social media. Lohith’s simplicity and his hunger for acquiring knowledge have always amazed me. He has been very active in community events of late and has taken a lot of initiatives in spreading knowledge within and across DotNet communities in India. I am sure that, being part of the Telerik team in India, he will be able to inspire many youngsters to do good things in future. Way to go dude. Keep up the good work.

     

    Conclusion

    It is not easy to be an evangelist or a craftsman. It takes a lot of effort and dedication. If you are willing to put in the effort, life will reward you in proportion to it. If you wish to follow in the footsteps of these or any other person whom you consider a craftsman, find a mentor who can guide you in the right direction. With the advances in technology it is possible to communicate with almost anybody who has a presence on the internet. Make use of the tools and technologies to get a step closer to being a craftsman.

    Wednesday, August 15, 2012

    Memento Design Pattern

    In this post I’ll explore the Memento Design Pattern, which is a behavioural pattern. If this is the first time you are coming to this site, you can also check out my earlier posts related to Design Patterns and Enterprise Patterns. The Gang of Four define the Memento pattern as a way to capture the internal state of an object without violating encapsulation. It allows us to restore the state at a later point in time.

    Problem statement

    Assume we are building shopping cart software for a retailer. The user has an option to persist the shopping cart and amend it on subsequent visits to the retailer's website. Until the user checks out the order and makes the payment, he or she is allowed to amend the order any number of times. While modifying the cart the user can also decide not to persist the changes, in which case any unsaved changes should be reverted and the order should be set back to the last persisted state.

    Shopping Cart without Memento

    We start with a very basic ShoppingCart class which consists of different methods for manipulating the cart items collection.

        public class ShoppingCart
        {
            // Snapshot of the cart items taken when editing starts.
            private IList<CartItem> currentCartItems;

            public ShoppingCart()
            {
                CartItems = new List<CartItem>
                    {
                        new CartItem { Id = 1, ProductName = "Lays Chips", Quantity = 2, UnitPrice = 5 },
                        new CartItem { Id = 2, ProductName = "Coca Cola", Quantity = 1, UnitPrice = 15 }
                    };
            }

            public int Id { get; set; }

            public IList<CartItem> CartItems { get; set; }

            public void AddItem(CartItem cartItem)
            {
                CartItems.Add(cartItem);
            }

            public void RemoveItem(CartItem cartItem)
            {
                CartItems.Remove(cartItem);
            }

            public void EditCart()
            {
                // Remember the current items so that the edit can be cancelled.
                currentCartItems = new List<CartItem>(CartItems);
            }

            public void CancelEditing()
            {
                // Restore the items remembered by EditCart.
                CartItems = new List<CartItem>(currentCartItems);
            }

            public void SaveCart()
            {
                // persist changes to database
            }
        }

    In the above code snippet the constructor initializes the CartItems collection with two items. This simulates fetching the existing items from the database. Assume that there is some service which takes care of populating these CartItem objects with the values from the persistent medium.

    The ShoppingCart class exposes the following methods to manipulate the contents of the cart.

    • AddItem adds a new CartItem to the items collection
    • RemoveItem removes an existing item from the collection
    • EditCart changes the state of the cart from read-only mode to update mode
    • CancelEditing cancels the changes which are not yet saved into the database
    • SaveCart persists the changes to the database

    The workflow is as follows. The user chooses to put the cart into edit mode by invoking the EditCart method. At this point a copy of the existing items is saved into the variable currentCartItems. The user can modify the cart by adding or removing items. The changes can then be persisted by invoking the SaveCart method or discarded by invoking the CancelEditing method.

                ShoppingCart shoppingCart = new ShoppingCart();

                Console.WriteLine("Print initial state");
                PrintCartDetails(shoppingCart.CartItems);

                shoppingCart.EditCart();

                shoppingCart.AddItem(new CartItem { Id = 3, ProductName = "Pepsi", Quantity = 1, UnitPrice = 2 });

                Console.WriteLine("Print after adding 1 cart item");
                PrintCartDetails(shoppingCart.CartItems);

                shoppingCart.CancelEditing();

                Console.WriteLine("Print after cancelling edit");
                PrintCartDetails(shoppingCart.CartItems);

    In the above code snippet, we added a new item to the shopping cart and cancelled the change. Since persisting changes to the database is outside the scope of this post, I’ll not touch upon that topic here.

    The ShoppingCart class gets the job done as per the requirement. This would be sufficient in most cases. Do you see any flaw with this approach? It is not a major flaw, but this approach does pose a problem because the state which can be restored later is stored within the same class. Since the data is available to all the class methods, some other method might change the saved state unintentionally. How can we avoid such unintentional modification of the internal state of the object?

    The Memento Design Pattern comes in handy in the situation explained above. We can externalize the storage of CartItems instead of storing them in the currentCartItems variable. At the point where we wish to restore the collection, we restore it from the external source. By providing read-only access to the state of the external object we can avoid tampering with the intermediate state.

    Refactoring towards Memento Pattern

    We start off with defining a class which will store the data that we are interested in. This happens to be the cart items collection. We define the CartItemsMemento class as shown below

        public class CartItemsMemento
        {
            private readonly IList<CartItem> _cartItems;

            public CartItemsMemento(IList<CartItem> cartItems)
            {
                _cartItems = new List<CartItem>(cartItems);
            }

            public IList<CartItem> CartItems
            {
                get
                {
                    return _cartItems;
                }
            }
        }

    The only responsibility for this class is to hold onto the state of an object which can be restored later. Next step is to get this data structure out of the ShoppingCart class. Here is the refactored ShoppingCart implementation.

        public class ShoppingCart
        {
            public ShoppingCart()
            {
                CartItems = new List<CartItem>
                    {
                        new CartItem { Id = 1, ProductName = "Lays Chips", Quantity = 2, UnitPrice = 5 },
                        new CartItem { Id = 2, ProductName = "Coca Cola", Quantity = 1, UnitPrice = 15 }
                    };
            }

            public int Id { get; set; }

            public IList<CartItem> CartItems { get; set; }

            public void AddItem(CartItem cartItem)
            {
                CartItems.Add(cartItem);
            }

            public void RemoveItem(CartItem cartItem)
            {
                CartItems.Remove(cartItem);
            }

            public void EditCart()
            {
                // changes state from readonly to edit mode
            }

            public void CancelEditing()
            {
                // reverts back to readonly mode
            }

            public CartItemsMemento CreateMemento()
            {
                return new CartItemsMemento(CartItems);
            }

            public void RestoreCartItems(CartItemsMemento memento)
            {
                CartItems = new List<CartItem>(memento.CartItems);
            }

            public void SaveCart()
            {
                // persist changes to database
            }
        }

    We have got rid of the currentCartItems private variable. Instead we have added two methods, CreateMemento and RestoreCartItems. CreateMemento returns a new instance of CartItemsMemento with the current shopping cart items. RestoreCartItems is used to restore the state stored inside the memento object.

    With these changes we have the source of the state and the place where it needs to reside temporarily. How do we glue them together? To bring the pieces together we have another intermediate class which acts as the caretaker of this data. I named it CartItemsCareTaker.

        public class CartItemsCareTaker
        {
            public CartItemsMemento CartItemsMemento { get; set; }
        }

    The only job of this class is to temporarily hold onto the internal state stored inside of the memento object. We are almost done with these changes. Here is the client code which makes use of these classes.

                ShoppingCart shoppingCart = new ShoppingCart();

                Console.WriteLine("Print initial state");
                PrintCartDetails(shoppingCart.CartItems);

                CartItemsCareTaker careTaker = new CartItemsCareTaker { CartItemsMemento = shoppingCart.CreateMemento() };

                shoppingCart.AddItem(new CartItem { Id = 3, ProductName = "Pepsi", Quantity = 1, UnitPrice = 2 });

                Console.WriteLine("Print after adding 1 cart item");
                PrintCartDetails(shoppingCart.CartItems);

                shoppingCart.RestoreCartItems(careTaker.CartItemsMemento);

                Console.WriteLine("Print after cancelling edit");
                PrintCartDetails(shoppingCart.CartItems);

    Note that we invoke the shoppingCart.CreateMemento and shoppingCart.RestoreCartItems methods here.
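
    The round trip above can be condensed into one self-contained sketch. The shape of CartItem and CartItemsMemento here is an assumption made for illustration (in particular the decimal UnitPrice and the memento copying the list in its constructor); the classes in the downloadable solution may differ.

    ```csharp
    using System;
    using System.Collections.Generic;

    public class CartItem
    {
        public int Id { get; set; }
        public string ProductName { get; set; }
        public int Quantity { get; set; }
        public decimal UnitPrice { get; set; } // assumed type
    }

    // Assumed shape of the memento: it copies the list so that later
    // additions to the cart cannot leak into the snapshot.
    public class CartItemsMemento
    {
        public CartItemsMemento(IEnumerable<CartItem> cartItems)
        {
            CartItems = new List<CartItem>(cartItems);
        }

        public IList<CartItem> CartItems { get; private set; }
    }

    // Originator: creates and consumes mementos.
    public class ShoppingCart
    {
        public ShoppingCart()
        {
            CartItems = new List<CartItem>();
        }

        public IList<CartItem> CartItems { get; private set; }

        public void AddItem(CartItem cartItem) { CartItems.Add(cartItem); }

        public CartItemsMemento CreateMemento() { return new CartItemsMemento(CartItems); }

        public void RestoreCartItems(CartItemsMemento memento)
        {
            CartItems = new List<CartItem>(memento.CartItems);
        }
    }

    // Caretaker: only holds the memento, never inspects it.
    public class CartItemsCareTaker
    {
        public CartItemsMemento CartItemsMemento { get; set; }
    }

    public static class Program
    {
        public static void Main()
        {
            var cart = new ShoppingCart();
            cart.AddItem(new CartItem { Id = 1, ProductName = "Coke", Quantity = 2, UnitPrice = 2 });

            // Snapshot before editing.
            var careTaker = new CartItemsCareTaker { CartItemsMemento = cart.CreateMemento() };

            cart.AddItem(new CartItem { Id = 3, ProductName = "Pepsi", Quantity = 1, UnitPrice = 2 });
            Console.WriteLine(cart.CartItems.Count); // prints 2

            // Cancel the edit by restoring the snapshot.
            cart.RestoreCartItems(careTaker.CartItemsMemento);
            Console.WriteLine(cart.CartItems.Count); // prints 1
        }
    }
    ```

    Note that the copy in this sketch is shallow: the list is duplicated but the CartItem objects are shared. If an edit changes an existing item in place (for example its Quantity), the memento would need to copy the items as well.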

    Conclusion

    Although using the Memento pattern increases the number of classes in the solution, it improves maintainability by splitting the responsibilities between the Originator (ShoppingCart), the Caretaker (CartItemsCareTaker) and the Memento (CartItemsMemento) classes. Each class has a single responsibility. The internal state of the object is captured in an encapsulated manner and held outside the object by the caretaker class. Any time there is a need to back up and restore the internal state of an object, we should consider using the Memento pattern instead of managing the saved state within the same class.

    The complete working solution is available for download: Memento Design Pattern Demo.zip

    Until Next time Happy Programming.


    Monday, August 13, 2012

    Distributed Version Control System

    Over the past decade or so there has been a sharp rise in the usage of Distributed Version Control Systems (DVCS). This post is about my experience of using GitHub and Bitbucket, two hosting services for DVCS repositories, and of setting them up on a Windows 7 PC.

    GitHub & BitBucket

    There is already so much information available on the history and evolution of Version Control Systems that I would not like to repeat it here. There has been a lot of talk of late about Distributed Version Control Systems. Initially DVCS were considered to be mostly suitable for Open Source projects. Recently I was part of a group discussion where one of the members suggested using a DVCS for an enterprise application. The advantages cited were the ease of merging and better support for branching. Although we did not progress much on that discussion, it was worth noting that there are other enterprises who use this approach and leverage the benefits offered by DVCS.

    I was looking for a way to store the source code I use for my blog entries somewhere on the internet. I have been using Dropbox for a long time. The problem with Dropbox is that it is a file sharing system and not a source control repository. Apart from the blog source code, I also have some personal projects which I keep updating with the changes in technology. This is where I found GitHub useful for storing the source code.

    GitHub

    GitHub is built on top of Git and offers various pricing options. It is free for open source projects: you can store your code in public GitHub repositories at no cost. GitHub promotes itself as a social coding website. There are various GUI based clients for Git on Windows. Here is a link which gives a step by step guide to setting up GitHub and in turn Git on a Windows PC.

    There are different ways of working with a DVCS on your PC. If you are a geek and like to work off the command line, you can work with GitHub using the shell. Unfortunately I am not a command line freak and prefer to have a GUI to work with GitHub. There are multiple options available. I have tried the msysGit Git GUI and TortoiseGit. The Git GUI comes bundled with the Git installation package. TortoiseGit is a port of TortoiseSVN for Git.

    Setting up GitHub can take some time. After installing the software you need to set up the SSH keys (an RSA key pair). If the keys are not set up correctly, you will not be able to push anything to GitHub over SSH. Once the Git software is installed on the PC we are ready to create a repository and push the changes to GitHub. These are the steps I follow to set up a new repository in GitHub. I’ll skip the steps related to registration with GitHub as it is a one time activity. Assuming you are a registered GitHub user, these are the steps to follow.

    1 – Create a new Repo in GitHub

    Git Homepage

    We can create a new repository in GitHub using any one of the highlighted options in the above screenshot: the Create a Repository link under the Bootcamp section (indicated by 1), the link under Welcome to GitHub below the Bootcamp section (indicated by 2), the New repository button next to Your Repositories (indicated by 3), or the Create new repo button at the top right of the page (indicated by 4). All except the first link redirect us to the repository creation page as shown below

    Create Repository Screen

    I want to add one of the directory containing source code available on my hard disk to GitHub. I created a repository named DecoratorDesignPattern

    decorator repo on GitHub

    Note the three highlighted areas in the above screenshot. At the top of the page we have the URL of the newly created repository. Below it we can follow either option 2 or option 3. If we are starting from scratch we can use the Create a new repository on the command line option. If we have already created the repository on the local hard drive we can use the third option, that of pushing an existing repository from the command line.

    2 – Create new repository on command line

    I need to set up a local copy of the remote repository. So I’ll use the Create a new repository from command line option. In Windows Explorer navigate to the directory where you wish to create the new repository and right click on the folder.

    image

    From the context menu select the Git Bash Here option. It opens up the Git command prompt. Type the commands shown on the GitHub page for the newly created repository (the earlier screenshot)

    image

    Please note that the GitHub URL is case sensitive. After the repository is created, we can push the changes to the remote server. If everything goes fine you should see a screen like the one below

    image

    With this step we have successfully pushed the contents of the local folder to a remote repository. If you don’t want to use the command line tools, you can use the GUI tools. Here is a screenshot of the default GUI provided by Git.

    Git Gui Context Menu

    Right click on the directory in Windows Explorer which contains the Git repository and select the Git GUI Here option from the context menu. You’ll be able to perform all the operations that we performed using the command line from the GUI as well.

    Git GUI

    The Git GUI is a good option to start with. There are other client applications which allow us to work with Git using a GUI. I have also used TortoiseGit. This client offers many more options in the context menu, as shown in the screenshot below

    tortoise Git GUI

    One of the major limitations of GitHub is that all your repositories will be public unless you subscribe to one of their monthly or yearly plans. As always nowadays there are alternatives available and at times you are spoiled for choice. Recently Bitbucket started offering a service similar to GitHub with DVCS support.

    BitBucket

    Bitbucket supports multiple DVCS, including Git and Mercurial. There is also an option of importing existing repositories from other providers like GitHub. The advantage it has over GitHub is that you can have an unlimited number of private repositories. The process of setting up the system is also simpler compared to GitHub. You can follow the steps in this post to set up a Bitbucket repository with either Git or Mercurial as the DVCS. If you choose Mercurial, the download comes with the full featured TortoiseHg GUI.

    The steps for setting up a Mercurial repository and storing it remotely are relatively simple compared to Git. The link related to setting up of Bitbucket talks about creating new repository so I would not like to repeat it again. Following is the screenshot of the Tortoise Hg client for BitBucket

    Tortoise Hg client

    We can use the dashboard in the Bitbucket web site to create new repositories or import existing repositories from other sources like GitHub

    BitBucket dashboard

    Currently there are multiple sources supported by BitBucket to import your existing code as shown below

    Bitbucket Import repo

    Current support is provided for CodePlex, git/GitHub, Google Code, Mercurial, SourceForge and Subversion repositories. You can choose either Git or Mercurial as the target type and also make it a private repository.

    Conclusion

    Not everybody can afford to invest in a full fledged source control system. If you are in a small start-up or are an individual freelancer, using these services can give you the facilities of a commercial SCM. As for me, I use multiple laptops, and online storage saves me from copying files across devices. It also helps me maintain the history of changes even for personal projects using a lightweight version control system. I can now keep copies of all my trial projects in a central repository. Apart from the source code related to my blog posts, I can also store my personal projects as private repositories using Bitbucket.

    With a DVCS you have a complete local copy of the repository, and speed is a major advantage of that design. After the repository is cloned from the central server, we do not need a connection to it for everyday work. Operations like comparisons and browsing the version history can be done without network connectivity.

    Until next time Happy Programming.


    Wednesday, August 08, 2012

    Composite Design Pattern

    In this post I’ll demonstrate the Composite Design Pattern. This is the next post in the Design Pattern Series which falls under the category of Structural Design Patterns. In the previous post we looked at the Builder Design Pattern which is used to build a complex or a composite object using a series of steps. Composite Design Pattern is very helpful when we have a tree structure and there is a need to treat the parent as well as child object in the same manner.

    Problem Statement

    Assume we are building software for a renowned retailer. The retailer has stores across the country. We need to calculate the profit for each City where the retailer operates. Also the profits need to be calculated at the State level. We can extend this example further by saying that various states can be grouped together into regions and so forth. For simplicity we will stop at the State level.

    Profit Calculator Without Composite Pattern

    Let's start with defining the domain objects related to this problem statement. We can map the requirement to different classes like Store, City and State. The City class will contain a list of Stores within that city. Similarly the State class will contain a list of Cities. We start with the simplest class, Store.

        public class Store

        {

            public int Id { get; set; }

     

            public string Name { get; set; }

     

            public int Profit { get; set; }

        }

    The Store class defined above is very simple and self explanatory. To keep things simple, we assume that the Profit at Store level is computed using some complex financial calculations which are outside the scope of this post. We don’t bother about how it is calculated; we only assume the value is available. Let's move on to the City class which has a collection of Stores.

        public class City

        {

            public City()

            {

                CityStores = new List<Store>();

            }

     

            public int Id { get; set; }

     

            public string Name { get; set; }

     

            public IList<Store> CityStores { get; private set; }

     

            public void AddStore(Store store)

            {

                CityStores.Add(store);

            }

     

            public void RemoveStore(Store store)

            {

                CityStores.Remove(store);

            }

     

            public int GetCityProfit()

            {

                int profit = 0;

     

                foreach (Store store in CityStores)

                {

                    profit += store.Profit;

                }

     

                return profit;

            }

        }

    We have a CityStores collection which is manipulated using the AddStore and RemoveStore methods. We also have the GetCityProfit method which iterates all the stores and adds up the profits for them. Similarly we have the State class which is almost the same but operates at the City level while aggregating the profits.

        public class State

        {

            public State()

            {

                Cities = new List<City>();

            }

     

            public int Id { get; set; }

     

            public string Name { get; set; }

     

            public IList<City> Cities { get; private set; }

     

            public void AddCity(City city)

            {

                Cities.Add(city);

            }

     

            public void RemoveCity(City city)

            {

                Cities.Remove(city);

            }

     

            public int GetStateProfit()

            {

                int profit = 0;

     

                foreach (City city in Cities)

                {

                    foreach (Store store in city.CityStores)

                    {

                        profit += store.Profit;

                    }

                }

     

                return profit;

            }

        }

    AddCity and RemoveCity methods help in managing the list of Cities related to a particular State. The GetStateProfit method iterates over two collections. The first iteration is for the cities within the state and the next loop is for all the stores within a city.

    Limitations of this approach

    As we can see from the above code, every additional level of hierarchy requires an additional level of looping in order to compute the profit. For example, imagine what will happen if we were to calculate the profit at the regional level by combining multiple states together. In another scenario, imagine that we have a big metropolitan city which we wish to split further into smaller groups.
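
    To make the first scenario concrete, here is a sketch of what a regional calculation would look like with the classes as they stand. Region and GetRegionProfit are hypothetical names that are not part of the original solution, and Store, City and State are trimmed down to the members needed for the calculation.

    ```csharp
    using System;
    using System.Collections.Generic;

    public class Store
    {
        public int Profit { get; set; }
    }

    public class City
    {
        public City() { CityStores = new List<Store>(); }
        public IList<Store> CityStores { get; private set; }
    }

    public class State
    {
        public State() { Cities = new List<City>(); }
        public IList<City> Cities { get; private set; }
    }

    // Hypothetical class: one more level in the hierarchy.
    public class Region
    {
        public Region() { States = new List<State>(); }
        public IList<State> States { get; private set; }

        public int GetRegionProfit()
        {
            int profit = 0;

            // One nested loop per level of the hierarchy.
            foreach (State state in States)
            {
                foreach (City city in state.Cities)
                {
                    foreach (Store store in city.CityStores)
                    {
                        profit += store.Profit;
                    }
                }
            }

            return profit;
        }
    }

    public static class Program
    {
        public static void Main()
        {
            var city = new City();
            city.CityStores.Add(new Store { Profit = 100 });
            city.CityStores.Add(new Store { Profit = 250 });

            var state = new State();
            state.Cities.Add(city);

            var region = new Region();
            region.States.Add(state);

            Console.WriteLine(region.GetRegionProfit()); // prints 350
        }
    }
    ```

    Every new level adds another nested loop, and Region has to know the internal structure of State and City to reach the stores.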

    There exists a hierarchy between a top level element and its children, like State and City, and also City and individual Store elements. All we are doing is performing a similar operation, be it at the parent level or the child level. This could be simplified if we could treat the parent, which contains a list of child objects, and the child nodes themselves in the same manner. This is exactly the kind of situation tailor made for the composite design pattern.

    Refactoring towards Composite Pattern

    The crux of the composite pattern is that we can treat the composite object and the individual leaf object in exactly the same way. In order to do that we can define an interface which can be implemented by both the parent and the child nodes.

        public interface IProfitable

        {

            int GetProfit();

     

            void AddChild(IProfitable child);

     

            void RemoveChild(IProfitable child);

        }

    The IProfitable interface defines a method for getting the profit and methods to manage the child elements. All the interested objects will implement this interface. This interface definition seems OK for those objects which have child elements associated with them, like State or City. The problem comes when we implement the same interface at the leaf node level, which happens to be Store in our context. Let's see the Store class implementation.

        public class Store : IProfitable

        {

            public int Profit { get; set; }

     

            public int GetProfit()

            {

                return Profit;

            }

     

            public void AddChild(IProfitable child)

            {

                throw new NotImplementedException();

            }

     

            public void RemoveChild(IProfitable child)

            {

                throw new NotImplementedException();

            }

        }

    We implement only the methods of IProfitable which are relevant at the leaf level, which is GetProfit in this case. GetProfit simply returns the value of the Profit property. The child management methods AddChild and RemoveChild are not implemented and throw a NotImplementedException if the client tries to invoke them. Let's continue to the City class implementation.

        public class City : IProfitable

        {

            public City()

            {

                Stores = new List<IProfitable>();

            }

     

            public IList<IProfitable> Stores { get; private set; }

     

            public int GetProfit()

            {

                int profit = 0;

     

                foreach (IProfitable store in Stores)

                {

                    profit += store.GetProfit();

                }

     

                return profit;

            }

     

            public void AddChild(IProfitable store)

            {

                if (store is Store)

                {

                    Stores.Add(store);

                }

            }

     

            public void RemoveChild(IProfitable store)

            {

                Stores.Remove(store);

            }

        }

    In case of the City class the code is mostly the same as it was previously, with the exception that the list as well as the parameters to the AddChild and RemoveChild methods are now typed using the IProfitable interface. Because of this we have to add type checking logic in the AddChild method to make sure that the parameter which is passed is a Store; anything else is silently ignored. RemoveChild needs no such check, since removing an object that was never added has no effect. Now let's look at the refactored State class

        public class State : IProfitable

        {

            public State()

            {

                Cities = new List<IProfitable>();

            }

     

            public IList<IProfitable> Cities { get; private set; }

     

            public int GetProfit()

            {

                int profit = 0;

     

                foreach (IProfitable city in Cities)

                {

                    profit += city.GetProfit();

                }

     

                return profit;

            }

     

            public void AddChild(IProfitable city)

            {

                if (city is City)

                {

                    Cities.Add(city);

                }

            }

     

            public void RemoveChild(IProfitable city)

            {

                Cities.Remove(city);

            }

        }

    The State class is almost identical to the City class. The difference from the earlier version comes in the implementation of the GetProfit method. We are no longer looping multiple times. We have only one level of looping which iterates over the child elements calling their respective GetProfit method. You can imagine how simple it would be to accommodate the two scenarios described above, that of grouping stores within a metropolitan city or of computing profit at the regional level comprising multiple states. Also note that we no longer have methods like GetCityProfit or GetStateProfit. I agree that this discrepancy in names could have been solved in the earlier implementation as well by using the same name GetProfit for all the classes.
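
    As a rough illustration of the regional scenario, the sketch below adds a hypothetical Region class on top of the refactored classes. Region is not part of the original solution, and City and State are condensed here (a private list instead of the public property) to keep the example short.

    ```csharp
    using System;
    using System.Collections.Generic;

    public interface IProfitable
    {
        int GetProfit();
        void AddChild(IProfitable child);
        void RemoveChild(IProfitable child);
    }

    public class Store : IProfitable
    {
        public int Profit { get; set; }
        public int GetProfit() { return Profit; }
        public void AddChild(IProfitable child) { throw new NotImplementedException(); }
        public void RemoveChild(IProfitable child) { throw new NotImplementedException(); }
    }

    public class City : IProfitable
    {
        private readonly IList<IProfitable> stores = new List<IProfitable>();

        public int GetProfit()
        {
            int profit = 0;
            foreach (IProfitable store in stores) { profit += store.GetProfit(); }
            return profit;
        }

        public void AddChild(IProfitable store) { if (store is Store) { stores.Add(store); } }
        public void RemoveChild(IProfitable store) { stores.Remove(store); }
    }

    public class State : IProfitable
    {
        private readonly IList<IProfitable> cities = new List<IProfitable>();

        public int GetProfit()
        {
            int profit = 0;
            foreach (IProfitable city in cities) { profit += city.GetProfit(); }
            return profit;
        }

        public void AddChild(IProfitable city) { if (city is City) { cities.Add(city); } }
        public void RemoveChild(IProfitable city) { cities.Remove(city); }
    }

    // Hypothetical new level: still a single loop, no knowledge of cities or stores.
    public class Region : IProfitable
    {
        private readonly IList<IProfitable> states = new List<IProfitable>();

        public int GetProfit()
        {
            int profit = 0;
            foreach (IProfitable state in states) { profit += state.GetProfit(); }
            return profit;
        }

        public void AddChild(IProfitable state) { if (state is State) { states.Add(state); } }
        public void RemoveChild(IProfitable state) { states.Remove(state); }
    }

    public static class Program
    {
        public static void Main()
        {
            var downtown = new City();
            downtown.AddChild(new Store { Profit = 100 });
            downtown.AddChild(new Store { Profit = 250 });

            var suburb = new City();
            suburb.AddChild(new Store { Profit = 50 });

            var state = new State();
            state.AddChild(downtown);
            state.AddChild(suburb);

            var region = new Region();
            region.AddChild(state);

            Console.WriteLine(downtown.GetProfit()); // prints 350
            Console.WriteLine(state.GetProfit());    // prints 400
            Console.WriteLine(region.GetProfit());   // prints 400
        }
    }
    ```

    Adding the new level did not require touching Store, City or State, and the client calls GetProfit in the same way at every level.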

    Conclusion

    The Composite design pattern helps us to treat composite and individual objects in a uniform manner. It is used in situations where part-whole hierarchies in the code result in a tree structure.

    Many people prefer not to add the AddChild and RemoveChild or similar methods related only to the composite object to the interface. If we take this approach there is no need to throw exceptions from the leaf node class like we did for Store class. The composites can inherit from an abstract class which has the methods specific to composite objects.
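
    A minimal sketch of that alternative could look like the following. The ProfitableComposite base class is a hypothetical name introduced for this example; it is not part of the downloadable solution.

    ```csharp
    using System;
    using System.Collections.Generic;

    // The interface now carries only what leaves and composites share.
    public interface IProfitable
    {
        int GetProfit();
    }

    // Leaf: no child management methods, so nothing to throw.
    public class Store : IProfitable
    {
        public int Profit { get; set; }
        public int GetProfit() { return Profit; }
    }

    // Hypothetical base class holding the composite-only behaviour.
    public abstract class ProfitableComposite : IProfitable
    {
        private readonly IList<IProfitable> children = new List<IProfitable>();

        public void AddChild(IProfitable child) { children.Add(child); }

        public void RemoveChild(IProfitable child) { children.Remove(child); }

        public int GetProfit()
        {
            int profit = 0;

            foreach (IProfitable child in children)
            {
                profit += child.GetProfit();
            }

            return profit;
        }
    }

    public class City : ProfitableComposite { }

    public class State : ProfitableComposite { }

    public static class Program
    {
        public static void Main()
        {
            var city = new City();
            city.AddChild(new Store { Profit = 100 });
            city.AddChild(new Store { Profit = 250 });

            var state = new State();
            state.AddChild(city);

            Console.WriteLine(state.GetProfit()); // prints 350
        }
    }
    ```

    The trade-off in this sketch is that the base class accepts any IProfitable as a child, so the type checks from the earlier City and State classes are gone. A client that holds only an IProfitable reference also has to cast to the composite type before it can add or remove children.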

    As always, the complete source code is available for download: Composite Design Pattern Demo.zip.

    Until next time Happy Programming.

    Further Reading

    Here are some books I recommend based on the topic discussed in this blog.