I’ve been hired to help refactor code. The lead developer and project owner know they have a code mess and they want it fixed. Rewriting the application is not an option, but the code base is tightly coupled and has turned into a ball of mud. Where does one start?
At first (and still am to a certain degree) perplexed about the process. Core features are intertwined in a whimsical manner where no thought of abstraction or single responsibility has ever taken place. Procedural code runs rampant among threads, inherited classes (abstract or not) and a myriad of if statements. And as developers have tried to “improve” the methodology of making calls into the bowels of the communication layer it has quickly become a congealed mess.
While I have faced similar issues with code in the past, I’ve never been given the green light to really fix the problems and make it work. I’ve argued unsuccessfully to rewrite projects because the code base is so messed up that only a proper burial into the sea of electricity was fitting. This does not sit well with people who have many years of time and money invested into the project(s). It requires more creative thinking to get the code base into a true usable state.
I’ve told the project manager that we need to stop current development. Adding more features will only serve to create more threads of illegitimate code. The second step I’ve taken is getting the source code repository into a better managed state so we know what code is getting developed, tested and in production. I wouldn’t say the previous state of source control was horrible, but it lacked the ability to exist in these different states independent of each other.
Next I traversed through the code and began to pick at the low hanging fruit. Simple things like removing needless comments (almost all of them), unused variables, unused methods and classes, clean up using declarations and other tidying refactors to make the code a bit better to read. In Fowler’s book, “Refactoring: Improving the Design of Existing Code” he indicates that code not placed with unit testing is considered legacy code and thus needs to be refactored. This means the start point for refactoring is getting the code wrapped in unit tests.
This means the developers need to have the ability to write unit tests. This also means we need to have good coding standards and reviews as we move forward. Since the company is hiring real Sr. Developers we have a nice core team that understands these principals. We now need to get the existing team to understand these are not suggestions, but requirements if we are to bring the code base back to life.
My fear is that while we might do a good job of cleaning up the cuts and bruises we will need to use the electric paddles a couple of times in this process to keep the pulse going.
I finally started my new job today, about 3 weeks later than I really expected. I'm not sure what the real hold up was, but involved lawyers, insurance and I'm sure money. No matter, I was able to start at 1pm today and had an interesting discussion with the PM and the Lead developer.
They are committed to making this development group one of the best in Wisconsin and they want me to be a pivotal player in this development. They want to hire 5 developers in the next few weeks. They have already have two others on board, they are needing IT to help with equipment and such.
We talked a lot about design and moving the application in the right direction. They have a developer in Colorado who phoned into our room and I got a chance to see the code he was working on. Not very good, but then again I expect that in most applications. Even when developers are well meaning a lot of bad code slips through the cracks as release dates are nearing and it needs to be done now.
It is going to be interesting working with them. I already warned them that I usually come with both guns blazing wanting to make the software better, but I believe my recent experiences has taught me to temper how I go about it. They responded that they welcome the input. Be careful what you wish for!
This could be a gold mine of opportunity for me. I think the goals, the conditions and the software that is wanting to be written has the potential to be really neat and exciting.
And I'll be able to post about something more than a file copy system!
Yesterday (every time I type that word I think of the Beatles song...) I had an interview with the PM/Lead. I was given the opportunity to complete a calculator program. I had two hours and while I didn't complete it (I was too worried about removing the dependency from the UI to the BLL (Core) layers. If I wouldn't have been so anal I would have been able to implement an abstract class for the base methods for the operators and completed it easily. I think sometimes I get all caught up in the details I forget to get it working and then re-factor.
I used tests to help me along, but when I got side-tracked in my writing I also got side tracked in my tests. I'm learning more and more to trust the tests and not myself. As I learned from my music project, the tests will tell me what I need to do. Testing will then find the missing implementations or the erroneous errors that exist in computers that one doesn't always think about. But then you fix it with a test, then fix the code and you are back riding into the sunset again.
I had a quick code review of what I had written and it made sense to him. I was at least able to show him that the calculating worked, there was error handling and undo functionality, but I had the framework completed for that, just needed about 30 more minutes. I think I was too stressed to work efficiently. Not having a job and needing one does that I believe.
Sooo... we talked about the team. Right now it is pretty small, but they have a ton of work that needs done. The lead told me the code base wasn't pretty, but they have done a good job of removing some technical debt. He also discussed doing code reviews and weekly technical lectures where the team is teaching each other.
I had told them that I started that kind of program at my previous employer and even put the lesson notes on a wiki. I offered they should start video taping it. I think people are more often willing to watch something rather than read something.
We talked about source control as they are currently using SubVersion but are looking at Team Foundation System which I had installed and configured at said last place of employment.
I was given a tour of the software. It's some pretty impressive stuff (a WPF/C# program that uses USB to chat to modules inside of boat engines). They have a ton of features that are needed and are looking to add (hence why they were interviewing me).
On the way home, I was called an told I got the job if I wanted it. Pretty cool software system, working on boats (yes going on the lake with them!) and a team lead that is pushing the team towards software craftsmanship. I can handle a place like that.
The other day I asked the blog world about TDD. And while I didn't get many comments in the blog :P I did get some private feedback statements about how I could handle it. I swear developers are teachers at their heart, because rarely does any give a straight answer to a question. It is always something abstract (no programming pun here). So I banged my head against the keyboard and came up with my approach.
TDD is about letting the tests help you design. If you toss the tests away then the design doesn't really matter because it is untestable. I'm not going to discuss what should be testable or not, but I believe if you are doing TDD in the 'logic' part of your program then it should be wrapped in tests. If I change the behavior of the method how then would I know if my tests pass at the individual function level? Again this is where the Agile manifesto comes into play, I'm not scared of change because if I change something I know where it breaks.
My solution was to understand what I wanted from the facade class. You could call it a controller class really so it is going to handle all of the dependencies and while I'm a fan of cutting them down as much as possible, you have to have dependencies someplace. Since this is such a small program I'm fine putting them there because now I can test the underlying functions in a more precise way. I'm not just testing a return value, but I'm testing the intent of the function.
So I created an interface to the Tag.cs file. This will allow me to write to the interface and open it up for testing. I also removed the dependency in the tag file for the MP3 reader and put it in the facade. By doing this I opened up my testing to mocking that object. I was then easily able to test that the function was returning the correct data and formatting it in the way I wanted.
Before I wrote any UI code, I knew my files where getting renamed in the manner I wanted and they were going to get copied to the correct network share based on the Artist's name. All before ever copying a single file to the media server. Clicking he button calls the facade and it simply worked.
After I realized I wanted to change the way I had named the mp3 file, I simply changed the test to the output I wanted and ran the test. Obviously it failed. But without running the debugger I knew exactly which function failed, fixed the format and ran the test again. This time it passed.
As I tested the program through the interface I realized that I had illegal characters in my file name and pathing since music artists have to be weird. Write some tests with some embedded in the title and artist and bam I now have tests against them.
The only issue I'm having now is that the mp3 tagger library I'm using doesn't support v2.4 of ID3. I think I have some older mp3's that have different versions of ID3 and so I have to do some manual fudging with iTunes to convert the tags to the right version.
I didn't learn much more with TDD, but I did learn more about interfaces, abstract and OOP.
Working on my music project and I decided that I needed a more direct tool to rename my mp3 files. The best way to do that is to read the tagging info embedded in the file and name and place the file in the right location on the media server. The idea is to eliminate as much manual copying/naming as possible.
One thing I realized was that it wasn't Pollux that named the files, I made that decision myself and had MediaMonkey name them only by title. And that mistake was documented below.
I knew the basic premise of how mp3 files work, but I needed more specifics. I googled mp3 tags and thought about writing my own extraction program, but why redesign the wheel when I figured there had to be some libraries out there.
I also figured this would be a good chance to explore visual studio 2010. Since 90% of the jobs in North East Wisconsin are .net, its best to keep those skills current. Although there is a certain romance to do it in Ruby or some other new language, but I don't want to spend any more time on this.
The first library I found seemed like it was going to work, but it didn't update all the tags. It also wasn't open sourced so I couldn't update to do the things I wanted. So I thought I'd use Reflector to see if I could gleam anything useful. Taking a look at the code I knew it wasn't going to work for me.
The next library I found was on source forge called, C# ID3 Library not only did this include the source, but it was wrapped within a couple of dialogs to use as examples. It even had some test projects.
I started writing my own tests to extract the file's location. I abstracted the reading of the tags into its own file, passing the file's location. I then created the new file name along with its new location doing some string manipulation.
I did some Mocking with Moq with NUnit and using an interface I was able to mock the object getting the correct tags and creating the right file path.
I then wanted to wrap the whole process within a facade. As I developed my routines, I wrote unit tests to make sure each function behaved as predicted by the tests. But then I realized that I didn't want the facade to have the dependency of the Mp3file class that reads the file.
So I wanted to make Tag an Internal class so that only the facade can access it. As I'm typing this I should make an ITag interface file to access the Tag.cs and abstract that out. That way if I want to reuse it, then all I need to do is call the interface.
Anyways as I was refactoring last night I realized that I'd have to add attributes to allow my test project access my internal classes so I could keep my unit tests. My thought however was to write a unit test around the facade method RenameMP3File. But this got me thinking, if I did that then I would lose the ability to mock the MP3Tags object. The Tags.cs constructor instantiates the MP3Tags file with the file name and fills the properties I need. The facade is doing a great job of hiding this functionality, but now I can't mock my MP3Tags object.
So I have two options that I can think of, exposing the Tags class so I can continue to use my unit tests as already configured. Or change my unit tests to test only the facade method, but losing the ability to mock my object up thus my tests are brittle. I suspect there is a third way, but I think I'm missing it.
I know in the end that this is a small project and won't expand much, but I'm using as an exercise to try to keep my skills up and active since I'm looking for work. While I know I can get this to work, I don't want to fall out of the TDD methodology to slap it in to work. I know this changed when I added the facade logic outside of TDD, but then I would have had to add that dependency up the line. Is that a bad thing?
A bit of a preface is needed here. I am organizing my mp3 music collection, which by most people's standards is a lot (around 40k mp3 files). Simply stated is that I have a ton of music and I'm trying to figure a way to best organize it.
I started with a hard drive full of ripped mp3s that were haphazardly thrown for storage and I wasn't keen on keeping good play lists so I organized most of my music for each person by windows folders. It was easy to sync to multiple devices. (don't roll your eyes at me!). I ended up with a mess of duplicate songs. I spent the last few days "flattening" out the songs into a few directories located on my media center.
I copied them to my local drive and then ran the Pollux application to make sure my tags were clean and adding the lyrics. For $10, its a good app for what it does, but I did wish it had more options to help me with getting my files organized.
I had a problem copying files from subdirectories to a parent directory on the mediacenter. I looked at robocopy and others, but I didn't want to spend three hours writing a script for another program, so I took about thirty minutes and wrote a quick wpf app to set parent source (it spidered through all subdirectories looking for any mp3's) and a destination drop on the media center. I put a check for duplicate filenames to ignore and diskspace. All was well.
Having 10K plus songs in a single directory got a bit overwhelming if I needed to search without MediaMonkey(MM)/ITunes. So I had MM rename/copy the tracks and put them into a folder structure that was like \\mediacenter\drive C\A\Aerosmith then all of Areosmith's songs would be in a single folder. If I wanted to sort by name or by album I can view that in windows explorer or better yet in MM or ITunes where I would be managing the files 90% of the time.
I completed the task!
All my songs were nicely arranged on my mediacenter drives with extra space to spare amongst them. However I noticed a couple of issues as I loaded the mass of mp3s into my MM.
The first issue when flattening out the files and copying to the mediacenter pc, I ignored if a song was already there by the same name. I was trying to eliminate the 500 Beatles songs into 300 since I have many cds that have the same Beatles song. Trying to save on space and reduce the volume of files to shift through. But a band can remix the song by the same name and I'd end up with only one version of the song.
The second mistake was if the two different bands had the same name for a song. For example Breathe is a popular song title. Since I renamed all the mp3's to its simplist form (only its title), my fix for duplicate files backfired again.
So I realized I needed to go back to my source and recopy these again, but I needed to do this smarter since it takes a while to copy and arrange all these songs. So I decided to write another program that did more than simple copy files and move them around. I need something smarter.
I need to start from scratch since I have no idea which songs I've lost. But now I have some "requirements" to my program.
I have three problems to solve here:
1. I need to re-run Pollux to retag my music. I need to make sure they do not get renamed in such a way that I lose duplicate songs.
2. I want to keep the directory structure I currently have Drive/alphabet/artist/ but I want to reduce the steps it takes me to get from my mess on my portable drive to my media center.
3. I need to come up with a naming convention for my songs to avoid the problems above, but I need to ask the end user if having the same song from different albums is really a problem? Since I'm trying to automate this organization, I'm going to have to say that I will have to live with having duplicates from different albums so I don't lose different versions, but I will ignore having the same song from the same album, which is a possibility at times.
Tomorrow I will post about item #3 and my discovery with TDD and how I started and how it ended up.