Down in the Jungle Room
For last week’s coding dojo at CoreMedia, I prepared a small exercise on concurrent programming. As a rule, it is a bad idea to write concurrency algorithms like a lock or a thread pool yourself. Get a proven library and use it! So the exercise had to deal with a more realistic problem: apply some changes to legacy code that is supposed to be concurrency-proof.
The given source code simulated – in the most elementary form – a company that organizes safaris. The original source code that was presented to the participants can be found here: https://github.com/okummer/ConcurrencyJungle
The task was to fix scalability issues in the code. The exact extend of the scalability issues was not mentioned, which initially did not seem to disconcert anyone. It was allowed to change the code in any way. It was pointed out that the developer of the existing code was no longer with the company.
The participants decided to try mob programming for this task. This led to very interesting discussions and provided me with a natural way to drop the occasional comment on possible alternate solutions. See http://marcabraham.wordpress.com/2014/02/05/what-on-earth-is-mob-programming for details on mob programming.
If you want to do this exercise in your dojo, it is a good idea to have someone with experience in concurrent programming attend the dojo, so that bad solutions can be spotted and hints can be given.
Spoiler alert: The remainder of this text contains hints that might lessen the experience of solving the problem oneself.
The code was sprinkled with various style issues and small inconsistencies that were supposed to serve as warm-up exercises and as distractions during the actual programming.
However, the participants ignored the small issues, hardly looking at the code. Instead, they headed straight to the test class, which actually seemed to test relevant aspects of the program. Of course it didn’t. It took some time until all of the blunders in the (green) tests were discovered and the tests were rewritten. It became quite obvious how existing code, even code that one should be suspicious of, can be accepted at face value.
After much test writing, some risen eyebrows, and finally a debugging session, it became clear why the code did not scale: nearly every method was synchronized. After removing the central synchronization point, which allowed exactly one safari at any given time, the program immediately ran into a deadlock. It was designed that way, of course, but the surprising takeaway was that you can get deadlocks by removing locks.
Only now did the participants look more closely at the entire set of classes. The code was a jungle of interrelated code that exhibited a highly interwoven call structure. Of course, this is bad news for concurrency requirements. Furthermore, the code suggested that this is really a thoroughly chosen architecture that closely matches the application domain. Thoroughly chosen, yes, but …
At this point, some approaches were considered and it was decided to reduce the potential for deadlock by removing those locks that were provably not necessary, because they protected immutable data or data that was accessed in one thread, only. The participants were very careful in this phase, avoiding a trap that would have led to an inconsistent object graph when synchronizing too little. I was quite proud of them.
When all was done, we talked a little about the result of the refactorings and about general approaches to deadlock avoidance.
It was a lot of fun to do this exercise. It turned out to contain the right amount to trickery, so that a robust solution was found before time was up. If there had been more time, code cleanup and further testing would surely have been possible.