Monday, January 18, 2010

The Unit of Work Design Pattern

"Everything that we do in this grid or in these fields, or in this group of tabs cannot be saved to the system until the Save button is actually pressed", say the business analysts.

This made Brian Carr and me sit way back in our chairs and go "Hmmm... How can we solve this problem given our architecture?". The piece of software we're developing is built around SOA - our front end user interface is strictly ExtJS that communicates over RESTful web services to our platform. Not a lick of CF on the front end. Shortly after those requirements-gathering meetings we buckled down to analyze the problem at hand.

Here were some of our main considerations:

  1. ExtJS grids come with their own dirty/clean mechanism but that only existed for grids. Our client's requirement existed for any form control, not just grids. That option was out of the picture, especially in a SOA.
  2. Even if that functionality existed for all form controls, how would the platform handle saving these data points simply, with minimal overhead and a clean API? In a traditional application that doesn't use data mappers or ORM, you'd usually have a post page that inserted/saved everything about that entity - for instance a User. You'd have a web form that, when posted, would update all 20 fields even though all you needed to update was one email address. In an enterprise level environment this is a HUGE waste, both in sending data over the pipe and in causing unnecessary database overhead.
  3. We already had our own ActiveRecord implementation in place, so handling CRUD operations from our domain model was extremely simple.
  4. We had already implemented the Observer Pattern, which I've detailed here, to handle saving composites among other things.
  5. At a systems level, we wanted to make small, light and quick web service requests to our platform whenever a form field was updated.
  6. It must be transactional. 
We knew that when it came to persisting an object singularly, it was solid. The API was such that each object could save itself, load itself etc. We also knew that, via our Observer Pattern implementation, when an object saved itself and there were any composite relationships attached to that instance, those composites would also save themselves. So the only two problems were these:
  1. How do we logically group disparate objects in memory that are part of a unit of work - like a front end data grid or a group of data separated by tabs - to ensure that the underlying objects that encapsulate these data points can be saved, deleted or rolled back (removed from memory) when needed, all while using our ActiveRecord and Observer patterns?
  2. ALL of this must be managed across multiple independent http requests to the platform.
As usual Brian and I banged our heads against the wall trying to come up with a solid solution. Then it happened... enter the Unit of Work design pattern by Martin Fowler.

"Maintains a list of objects affected by a business transaction and coordinates the writing out of changes and the resolution of concurrency problems."


Let's jump right in. I won't go over the exact implementation that we used in our software, but I'll provide a quick overview of this design pattern and how you might get started with it.

As always, here is the class diagram first.



So you'll see here that we've got:
  1. An IPersistent interface with save and delete methods that each return a string for example purposes.
  2. A UnitOfWork (UoW) object.
    1. that has 3 fields to store the new, deleted and changed stacks, each of which should take an array of objects, preferably strongly typed if your language can support it.
    2. methods for adding to each stack.
    3. a method for committing and deleting the UoW.
    4. a get() method that will pull out an object from the UoW registry.
The idea is simple really. The UoW acts as a container/wrapper for disparate objects that will need to take part in a single logical transaction. 

Here is the code for the UnitOfWork object:


Pretty straightforward here. Class members are created to hold objects (the stacks) and methods are there to throw objects onto a stack. Notice how the entirety of commit() is transactional - it is all or nothing. We iterate over all the stacks and call the appropriate CRUD method. Also, the get() method will pull out the object from the registry based on hashcode. You can do this however you like, depending on how you uniquely identify objects in your system.

Here's the scratch code:


Which produces:


In this test we:
  1. created three users
  2. displayed the unique hashcodes for each user
  3. assigned user1 and user2 to the UoW new stack.
  4. assigned user3 to the UoW delete stack.
  5. then called commit() on the UoW.
Cool? On uow.commit() we've executed the appropriate database method to handle persistence and printed the save/delete messages to the screen. Of course in a real implementation this would be whatever persistence mechanism you have in place - DAO, ActiveRecord, data mappers, whatever.
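To make the walkthrough concrete, here is a minimal Python sketch of the idea rather than the actual implementation. The names Persistent, User, register_new, register_changed and register_deleted are made up for illustration, id() stands in for the hashcode, and save/delete only return strings, as in the example interface. The commit() loop is not truly transactional here; the comment marks where a real implementation would open a database transaction.

```python
from abc import ABC, abstractmethod


class Persistent(ABC):
    """Stand-in for the IPersistent interface: save and delete return a string."""

    @abstractmethod
    def save(self):
        ...

    @abstractmethod
    def delete(self):
        ...


class User(Persistent):
    def __init__(self, name):
        self.name = name

    def save(self):
        return f"saved {self.name}"

    def delete(self):
        return f"deleted {self.name}"


class UnitOfWork:
    def __init__(self):
        # One stack per kind of pending work.
        self._new = []
        self._changed = []
        self._deleted = []

    def register_new(self, obj):
        self._new.append(obj)

    def register_changed(self, obj):
        self._changed.append(obj)

    def register_deleted(self, obj):
        self._deleted.append(obj)

    def get(self, hashcode):
        # Assumption: id(obj) plays the role of the per-instance hashcode.
        for obj in self._new + self._changed + self._deleted:
            if id(obj) == hashcode:
                return obj
        return None

    def rollback(self):
        # Forget everything that was registered; nothing is persisted.
        self._new.clear()
        self._changed.clear()
        self._deleted.clear()

    def commit(self):
        # A real implementation would run this whole method inside a single
        # database transaction so that it is all or nothing.
        messages = [obj.save() for obj in self._new + self._changed]
        messages += [obj.delete() for obj in self._deleted]
        self.rollback()  # the stacks are empty once the work is written out
        return messages


user1, user2, user3 = User("user1"), User("user2"), User("user3")
uow = UnitOfWork()
uow.register_new(user1)
uow.register_new(user2)
uow.register_deleted(user3)
messages = uow.commit()
for message in messages:
    print(message)
```

Swapping the string-returning methods for a DAO, ActiveRecord or data mapper call leaves the UnitOfWork itself unchanged, which is the point of hiding persistence behind one small interface.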

Here are a couple key things to consider that this post didn't cover:
  • You will most likely need a singleton object, usually called the UnitOfWorkManager, that is responsible for managing your UnitOfWork objects. This plays into the feature of supporting UoW across multiple HTTP requests.
  • You will need a method on the UoW object that will remove objects from certain stacks altogether. This is for situations where the user clicks "new user" and a new User object gets registered, but then decides that a new user is no longer needed and removes it, while there are still other data changes that need to be persisted for that UoW. Don't confuse this with rollback() or registerDeleted().
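The two points above can be sketched roughly as follows. This is a Python illustration under assumptions, not the real thing: the token-based lookup, the begin/get/discard method names, and the unregister method are all hypothetical, and the in-memory dictionary only survives across requests inside a single server process.

```python
import uuid


class UnitOfWork:
    """Trimmed-down unit of work holding only what this sketch needs."""

    def __init__(self):
        self.stacks = {"new": [], "changed": [], "deleted": []}

    def register(self, stack, obj):
        self.stacks[stack].append(obj)

    def unregister(self, stack, obj):
        # Drop one object from one stack; everything else stays pending.
        self.stacks[stack] = [o for o in self.stacks[stack] if o is not obj]


class UnitOfWorkManager:
    """Singleton that keeps units of work alive between HTTP requests."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._units = {}
        return cls._instance

    def begin(self):
        # Hypothetical: the client sends this token back on every request.
        token = uuid.uuid4().hex
        self._units[token] = UnitOfWork()
        return token

    def get(self, token):
        return self._units[token]

    def discard(self, token):
        self._units.pop(token, None)


# Request 1: the form opens and the client receives a token.
token = UnitOfWorkManager().begin()

# Request 2: the user edits an email and clicks "new user".
email_change = {"user": 42, "email": "new@example.com"}
draft_user = {"name": "draft"}
uow = UnitOfWorkManager().get(token)
uow.register("changed", email_change)
uow.register("new", draft_user)

# Request 3: the user abandons the draft; the email change stays pending.
UnitOfWorkManager().get(token).unregister("new", draft_user)
```

Unlike a rollback, unregister touches a single object in a single stack, so the rest of the unit of work can still be committed later.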
This is one of those design patterns where you can find yourself getting really deep and customizing like crazy. Be careful with that, and make sure to keep it simple...

Hopefully you can use this as your jumping off point... Happy UnitOfWorking!

-Micky


Wednesday, January 13, 2010

My Brother's Term Paper - Strengths and Weaknesses of Object-Oriented and Procedural Programming



My 17-year-old little brother, Gian Dionisio, is currently working on getting into college to pursue a computer science degree and has always expressed a genuine interest in software engineering. I'm on his back 24/7 when it comes to good grades, school etc., so it was a pleasant holiday present when he showed me his senior term paper, on which he received a 98 out of 100. Glad he hasn't caught "senioritis" like his older brother did! I'm posting this more for selfish reasons, as it's something I'd like to point people to for a totally objective view when it comes to object oriented vs. procedural programming. I've probably given Gian about a 1 minute dump of my own thoughts on the subject, but as you'll read, his essay is based entirely on his own research. So here it is. Good work Gian, very good work.

Strengths and Weaknesses of Object-Oriented and Procedural Programming by Gian Dionisio

The process of designing a program is broken down into four steps: analysis, design, implementation, and maintenance. Analysis refers to determining if a problem is solvable, design refers to the process of determining how to solve the problem, implementation is the actual coding of the program, and maintenance refers to the debugging and modification of the program. As the process progresses from one phase to the next, costs rise higher and higher, with maintenance being the most expensive (Lambert). Currently, many programmers use one of two paradigms of programming when designing and implementing a program: procedural and object-oriented. However, many programmers have begun to stray away from procedural programming in favor of object-oriented because of the way object-oriented programs handle data, which allows for more dynamic and flexible programs. Even though object-oriented programming offers more flexibility than procedural programming, programmers should not perceive procedural programming as obsolete because both paradigms have strengths and weaknesses that do not make one necessarily better than the other.

Procedural programming is best described as a one-way system, using only variables and functions, with no major data grouping, which is the primary strength of object-oriented programming (gabehabe). This paradigm emphasizes conciseness and reaching the end result as quickly as possible, solving only the problem presented and nothing more, making procedural programs very straightforward (Weisfield). John Barton of IBM notes that generations of programmers have essentially been taught to create a list of instructions for the machine on how to read data (gather ingredients), how to use equations and functions with subroutines (cook the meal), and how to display the product (a finished meal) (Waldrop).

While the design and implementation of a procedural program may be straightforward, especially in smaller scale projects, as the program grows larger in scale, both design and implementation become increasingly more difficult as different subroutines are being performed on the same set of data at the same time (Waldrop). When the program is finally functional, modification is made either difficult or impossible because changes in one part of the program may affect the rest of the program as a whole (Archwing). Because procedural programs are not able to be readily modified, maintenance costs become very high. For example, think of a 10-story apartment complex with 10 apartments on each floor, and a development lot for 100 houses; the lots of houses may cost more to construct than the apartment complex, but the houses are, for the most part, self-contained with their own utilities and walls, versus an apartment, which shares utilities and walls; therefore adding another house would be far less expensive than adding another apartment to an apartment complex (Orr). This problem of modification is solved with the principles emphasized in object-oriented programming.

The main strength of object-oriented programming is its ability to handle data by taking advantage of the way we perceive our environment with the use of objects (Parr). Objects can be seen as separate entities, each with their own characteristics. These entities all share two characteristics: states and behaviors (Sun). For example, a dog has 'states' (name, color, breed) and 'behaviors' (barking, fetching) (Sun). Objects "encapsulate" these qualities, meaning that they contain data, otherwise known as states, and methods, otherwise known as behaviors (Orr). These objects inherit these qualities from blocks of code known as "classes"; a class can be seen as a blueprint or template of an object, much like how biologists group organisms into classes such as "bird" or "mammal" (Waldrop). This use of objects offers a way to design large scale programs by writing small-scale software units that communicate with each other, rather than a huge system (Parr). This tactic of "divide and conquer" is employed in both procedural programs, with the use of subroutines, and object-oriented programs, with the use of objects (Nemirovsky). However, in a procedural program, the programmer must tell the computer how to perform an action and when to execute it, in contrast to an object-oriented program, where the programmer only tells the object what to do, and then the object performs it (Sherer). This means that a programmer doesn't need to know how or why an object works, but merely what it does, demonstrating the principle of information hiding, a method of encapsulation in which only certain details are able to be seen (Orr). Information hiding allows for large scale projects to run much more smoothly because programmers don't need to know the implementation of a class to use it, and allows changes to be made without drastically affecting the entire program, as well as a certain degree of consistency within the application, as programmers working on the project are using the same base classes (Lindsay).

When code is segmented into smaller sections, each with their own separate individual functions, programs become easier to work with. For example, in an object-oriented program, if a program needs to know the month in a two-digit format, it sends a message to a Date object with the message asking for the date in that format, rather than accessing the date from a variable floating somewhere within the program (Waldrop). Within that same program, another object can ask for the date in a three-letter format by sending a message that corresponds to that demand to the same Date object, and the Date object would return that date. Object-orientation also allows the developer to assemble his or her code in the same manner that he or she thinks about the problem (Waldrop). The developer can create an object named "Airfoil" and define a function that instructs the object how to handle data, and then the developer can send a message to the Airfoil object to calculate something, such as wind resistance (Chastain). By developing and debugging small components independent from the program, developers are able to isolate and test code more efficiently, and can assume that the program works as advertised, rather than guess where the problem may lie (as is the case with a procedural program) (Chastain). By breaking up code into smaller pieces, object-oriented programs become easier to modify, and because of this many companies are interested in reaping the benefits of the object-oriented approach, that is, reusable and easily modifiable programs (Patrizio). Object-oriented programming can be considered better at representing the real world than procedural programming because object-orientation allows for more intricate and dynamic interactions as well as allowing non-technical workers to better understand and participate in the maintenance of a program because objects better appeal to the natural human cognition patterns (Archwing).
An example of an object-oriented program's ability to be modified can be demonstrated in a payroll program; if the program was written procedurally, the area that assesses the employee's paycheck would be in an "if-then" form, versus an object-oriented program, which would send messages to all the employee objects, which would then calculate all the paychecks (Sherer). Any modifications done to the procedural program might cause a chain effect in the program, making unintentional changes. Object-oriented programs also become easy to build upon in case the scope of the project grows larger, because the principles stressed in object-oriented programming, such as encapsulation and inheritance, allow the application to expand quickly, and pre-existing classes do not have to be modified in any way because objects interact with each other through their methods and messages (Chastain).

Object-orientation offers many tools and principles that allow programmers to create dynamic programs; however, learning to use these tools can prove difficult. An entire way of thinking was built off of the procedural method, and “reprogramming” the programmers may take a lot of effort on the developer’s part before they can fully reap the benefits of clarity and reusability (Waldrop). Procedural programming is action-oriented and solves problems through a series of logical steps, versus object-oriented programming, which looks at the entire problem as a whole and then derives a solution with the use of a series of reusable classes (Patrizio). The object-oriented approach presents a radically different method of approaching software: code and data become merged into one single, cohesive unit (the object), which is an abstraction of a set of real-world things, such as "date" or "employee" (Archwing). Systems are made up of multiple objects, such as date, or processing, and make requests with messages which ask for specific pieces of information (Archwing). This different way of thinking can be met with resistance by some, as in the case of Chevron’s programming staff, who have difficulty grasping the concepts (Moser).

A method for tackling the problem of learning how to use object-oriented programming and concepts is through the use of graphical user interfaces (Moser). Graphical user environments have been in use by high school computer classes in hopes of enticing students to learn object-oriented programming by showing the fun, enjoyable side of programming without having to touch the dense terminology of the industry (Demski). For example, many high schools have begun to use the BlueJ graphical user interface, which displays coded boxes and arrows that allow students to watch the concepts in action and get a clear understanding of the concept (Demski).

However, learning how to use object-oriented programming is not the only problem; knowing when to use it is another. Object-oriented programming, while very good at managing data complexity, is difficult to apply, which may be a major reason for programmers not to adopt object-oriented programming. Marc Funaro, a computer programmer, made a blog post detailing his experience in trying to incorporate object-oriented programming principles into his own programs, and stated that doing so had almost ruined his business (Funaro). Marc was under the impression that by incorporating aspects of object-oriented programming, his programs would become much more efficient and reusable (Funaro). However, this is only partly true, because in order to fully take advantage of object-orientation, one needs to learn how to use object-oriented analysis and design; using an object-oriented language will not yield the promised benefits unless one learns to think in the appropriate manner (Buckier). Brian Carr, a seasoned software architect, stated that programmers who use an object-oriented language but are not thinking in an object-oriented way are essentially procedural programmers; additionally, learning how to use object-oriented programming does not make one a better programmer, but rather, adds another tool to the programmer's disposal (CF OOP Debate). Carr also notes that object-oriented programming is better than procedural in one aspect: managing complexity. Funaro also complained about design and implementation time when using object-oriented programming, and that a project done procedurally would be completed much more quickly (Funaro). Mike Chandler, another object-oriented programmer, commented on this, saying, “There are some projects that procedural programming would be better suited than OOP, especially those that are smaller scale. When you’re designing a program, you really need to think about the customer’s needs. Will he want to modify or extend the program? That’s when you need OOP” (CF OOP Debate).

In this age of technology, advancements are being made very quickly; however, just because progress is being made doesn’t mean we should abandon older methods. Object-oriented programming, though powerful and very flexible, is not without its drawbacks, suffering from longer design and implementation times. Procedural programs, though inflexible, are quick to design and implement, and may even prove superior to object-oriented programming in smaller scale projects if fast delivery is key. However, for the mid to enterprise level software systems, the object oriented approach does cost a little bit more up front but extraordinary gains are realized during the maintenance and enhancement phase, which is where most cost is incurred.

Friday, January 8, 2010

Fundamental Lightwire Bug?

**UPDATE 01/09/10** Peter Bell, the creator of Lightwire, has addressed this issue. You can find his blog post about it here. I've also updated this post as well.


This bug seems too fundamental to be a real bug, so my initial thought is that I'm doing something completely wrong. An issue like this would have been discovered by now.


Brian Carr and I were working on some front end development for our project, which leverages Quicksilver and Lightwire, when we came across a bug in the UI. Whenever a form was submitted that contained form validation errors, it appeared to be returning correctly the first time. However, if you were to submit the same form again, the error messages would continue to duplicate and stack. Initially we thought there was an issue with Quicksilver not configuring the beans correctly, or perhaps we had tagged our Errors object with a singleton annotation. After some digging into our bean configuration annotations, we found everything was annotated correctly.


So the next step was to switch our dependency injection provider to ColdSpring, which Quicksilver allows you to do with one line. What happened, to our surprise? It worked! The errors were displaying correctly. Ok, so now we knew that the issue was Lightwire specific, but we didn't know whether it was a Quicksilver or Lightwire problem, so we dug into Quicksilver's injection provider service for Lightwire. Perhaps we were wiring something incorrectly. After a good amount of investigation we verified Quicksilver was configuring beans correctly for Lightwire consumption.


Our next step was to bypass Quicksilver altogether. If we could configure these beans using just Lightwire and a scratch pad and still see this error, then we could safely deduce that it was a problem in Lightwire. So here is what we found...


First, here is the gist of the classes involved. Both are TRANSIENT objects.




Simply, a BusinessServiceResponse has an Errors object. For this example, the hashcode member is a unique hash that is attached to each unique instance.


Here is the code that we are using to test:





  1. Line 4 - 14 Configure the two transients - Errors and BusinessServiceResponse
  2. Line 16-20 Configure setter dependency. BusinessServiceResponse has an Errors property to be injected with the Errors transient object.
  3. Line 24-38 On the first iteration we are getting the bean for the first time then displaying the hashcode on the BusinessServiceResponse and the Errors. It's important to note that Errors hashcode is accessed through the BusinessServiceResponse on line 37.
Here are the results:


It appears that Errors is being handled as a singleton, as the hashcodes are identical across all three BusinessServiceResponse creations. Now that we knew what the problem was, we dug into the Lightwire code. We found the issue in Lightwire.cfc. It loads setter dependencies as singletons by default??? Nooo, can't be.


On line 273, it's creating each bean as a singleton. After replacing line 273 with a singleton/transient check like so:


It worked. Let's rerun our tests.


The hashcodes are now different, meaning the Errors objects are transients as well.
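For anyone who wants to see the shape of the problem without Lightwire in front of them, here is a toy Python container that reproduces the same symptom. It is only an analogy under assumptions: none of this is Lightwire's code, the Container class and its honor_scope flag are invented for illustration, and the hashcode is just a counter.

```python
import itertools

_ids = itertools.count(1)


class Errors:
    def __init__(self):
        self.hashcode = next(_ids)  # unique per instance
        self.messages = []


class BusinessServiceResponse:
    def __init__(self):
        self.hashcode = next(_ids)
        self.errors = None

    def set_errors(self, errors):
        self.errors = errors


class Container:
    """Toy bean factory; honor_scope=False mimics the behaviour we observed."""

    def __init__(self, honor_scope):
        self.honor_scope = honor_scope
        self.config = {}  # name -> (class, is_singleton, {setter: dependency})
        self.cache = {}

    def add(self, name, cls, singleton=False, setters=None):
        self.config[name] = (cls, singleton, setters or {})

    def get_bean(self, name):
        cls, singleton, setters = self.config[name]
        if singleton and name in self.cache:
            return self.cache[name]
        bean = cls()
        for setter, dep_name in setters.items():
            getattr(bean, setter)(self._resolve_dependency(dep_name))
        if singleton:
            self.cache[name] = bean
        return bean

    def _resolve_dependency(self, dep_name):
        if self.honor_scope:
            # The singleton/transient check: reuse the normal lookup path.
            return self.get_bean(dep_name)
        # Symptom: every setter dependency is cached as if it were a singleton.
        if dep_name not in self.cache:
            self.cache[dep_name] = self.config[dep_name][0]()
        return self.cache[dep_name]


def error_hashcodes(honor_scope):
    container = Container(honor_scope)
    container.add("Errors", Errors)  # transient
    container.add("BusinessServiceResponse", BusinessServiceResponse,
                  setters={"set_errors": "Errors"})  # transient
    return [container.get_bean("BusinessServiceResponse").errors.hashcode
            for _ in range(3)]


buggy = error_hashcodes(honor_scope=False)
fixed = error_hashcodes(honor_scope=True)
print(buggy, fixed)
```

With honor_scope off, all three responses share one Errors instance, which is why validation messages pile up across submissions; with it on, each response gets its own.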





*** UPDATE 01/09/10 *** Do NOT use the code I posted as a fix as there has since been an official patch to fix this issue. Grab it from here. Many thanks to Peter Bell and Brian Rinaldi for their help and quick response on this!

Please guys, help me see the light...