Happiness and other technical requirements – Finale

And here we are, looking at the grand finale of this (now) four-part software development story.

We left our team in the middle of its defining challenge, facing doubts and concerns.

Now, your first instinctive reaction – if you find yourself in a similar situation – is denial. It’s a natural defence mechanism.

– There are no problems! They are just overreacting! Besides, it all looks easy to me: you just need to do this and that, and that’s it…

Why do we do that? We don’t want to admit there are issues. We invested much of our ego in the project, and when problems arise, we ignore them. The bigger the problems, the more we tend to ignore them. And together we all keep walking our death march.

It takes some skill to actually stop and be honest with ourselves. Stop and listen.

What helped me acknowledge the problem was my focus on listening to and respecting my team’s opinions.

What helped us to work it out and solve it was our unwritten (but practised) belief in respecting and eliciting each other’s feedback.

So we made a plan and plotted some practical actions.

To shed some light on the root cause of those bugs, we introduced an extensive logging solution that tracked everything happening in the HTML5 app, storing it in the device’s storage and syncing it back to the server for further analysis.
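In spirit, it was something like the following sketch – the storage key and the /api/logs endpoint are invented here for illustration, not our actual code:

```javascript
// Sketch of a client-side logger: capture app events and unhandled errors,
// buffer them in the device storage, and sync them back to the server.
var LOG_KEY = 'app-logs'; // hypothetical storage key

function log(entry) {
  var logs = JSON.parse(localStorage.getItem(LOG_KEY) || '[]');
  logs.push({ time: new Date().toISOString(), entry: entry });
  localStorage.setItem(LOG_KEY, JSON.stringify(logs));
}

// Capture errors we did not throw ourselves (e.g. scripts injected by a proxy).
window.onerror = function (message, source, line) {
  log({ type: 'error', message: message, source: source, line: line });
};

// Periodically push the buffered logs to the server for further analysis.
setInterval(function () {
  var logs = localStorage.getItem(LOG_KEY);
  if (!logs || logs === '[]') { return; }
  var xhr = new XMLHttpRequest();
  xhr.open('POST', '/api/logs'); // hypothetical endpoint
  xhr.setRequestHeader('Content-Type', 'application/json');
  xhr.onload = function () {
    if (xhr.status === 200) { localStorage.setItem(LOG_KEY, '[]'); }
  };
  xhr.send(logs);
}, 60000);
```

Recording the source file of every error turned out to be the detail that mattered most, as you are about to see.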

Funky bugs

What we found was quite surprising. Some of the most obscure bugs were only happening over a mobile connection, and not over WiFi.

We looked at the extensive logs, and we found some weird error messages that were definitely not generated by our code.

It turned out that some network operators, to optimise users’ bandwidth, re-scale the images contained in a web page and then inject some custom JavaScript at the bottom of the HTML in order to serve the full-quality version only when the user taps on an image.

That piece of random JavaScript, written 8 years ago and totally inadequate for modern single-page applications, was hijacking our event handlers and throwing odd errors.

The elevator test

Our testers did an amazing job in identifying this and other potential bugs.

In contrast with the situation we found at the very beginning of this story, this time the collaboration between developers and testers really paid off.

As developers, instead of rejecting our testers’ concerns that couldn’t be proved, we built them tools that allowed them to automatically record what was happening and easily report it.

As for them, they used their dedication and creativity to come up with clever new ways of testing.

That’s how we invented the SouthWest Train commute test (aka: what happens to your mobile app with an intermittent connection) and the elevator test.

Taking the elevator from our second floor to the ground floor, facing an internal corner of the compartment, was a very accurate and reliable way to cut off all 3G signal.

These tests raised a lot of questions, and highlighted important assumptions we had missed. Is your app able to properly detect its connectivity status? Is it blowing up in the middle of a server request? Is it able to recover from that?
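In spirit, the kind of detection and recovery we had to add looked something like this hedged sketch (all names are invented, not our real code):

```javascript
// Sketch: react to connectivity changes and recover from a request
// that blows up mid-flight (tunnel, lift, dead spot).
function isOnline() {
  // navigator.onLine is only a hint, not a guarantee – hence the elevator test.
  return navigator.onLine;
}

window.addEventListener('offline', function () {
  // Pause outgoing sync and queue user actions locally instead of failing them.
});

window.addEventListener('online', function () {
  // Flush the local queue once the 3G signal comes back.
});

function requestWithRetry(url, attempts, onSuccess, onGiveUp) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url);
  xhr.onload = function () { onSuccess(xhr.responseText); };
  xhr.onerror = function () {
    if (attempts > 1) {
      // Back off briefly and try again before giving up.
      setTimeout(function () {
        requestWithRetry(url, attempts - 1, onSuccess, onGiveUp);
      }, 2000);
    } else {
      onGiveUp();
    }
  };
  xhr.send();
}
```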

So, you could sometimes spot a developer and a tester going up and down in the elevator until some obscure bug was ruled out…

Agility

All our technical choices and process decisions were oriented towards flexibility and adaptability: we chose an HTML5 app, so that changing device or operating system would not be an issue; we chose REST protocols and a loosely typed data format such as JSON; we chose a NoSQL database, which allowed us to quickly evolve our data; we chose Ruby and JavaScript, dynamic programming languages; we chose a cloud platform, which let us delay performance concerns thanks to super-easy scalability; and we wrapped everything in robust but loose test suites, so that we could constantly and safely refactor and reshape our code.

Towards the end of our development, when the user trial was approaching, these choices really paid off. As we became more confident, our velocity increased. Our business sponsors were noticeably satisfied by the pace of our delivery.

We used retrospectives and demos to get fast feedback and steer the direction of our efforts, and – as a lubricant – we encouraged open communication and relied on mutual trust.

I happened to witness a brilliant example of that: after a demo which raised some discussion, our two business owners stopped for a chat with one of the developers. Looking at the screen, they were able to quickly take a better decision, implement it, and commit and deploy the new feature. They were happily shocked. You must know they were used to having long discussions with business analysts and technical architects before a change could even be scheduled by a project manager, then waiting for weeks, and sometimes getting something completely different from what they were looking for.

Good and open communication was really the magic that kept the engine working so well: both internal communication between developers, testers and our business analyst, and external communication, establishing great relationships with the DevOps teams, our department’s technical directors, and our business sponsors and project manager.

The last obstacle

We finally launched our user trial, with 20 sales agents using our iPad application in their daily job. We built them an easy way to report their feedback, and we put in place analytics to track the outcome.

The results were encouraging. They all loved it.

No bugs ever reached our users, except… except one user started to complain about losing all his data.

For better network and battery efficiency, and to keep functioning in areas without a 3G connection, our application was storing a lot of data in offline storage and synchronising it on demand.
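The pattern was roughly the following – a simplified sketch, with the storage key and endpoint invented for illustration:

```javascript
// Sketch: keep records in offline storage, and sync them to the server only on demand.
var PENDING_KEY = 'pending-records'; // hypothetical storage key

function saveRecord(record) {
  var pending = JSON.parse(localStorage.getItem(PENDING_KEY) || '[]');
  pending.push(record);
  localStorage.setItem(PENDING_KEY, JSON.stringify(pending));
}

function syncPending(onDone) {
  var pending = localStorage.getItem(PENDING_KEY) || '[]';
  if (pending === '[]') { return onDone(); }
  var xhr = new XMLHttpRequest();
  xhr.open('POST', '/api/records/sync'); // hypothetical endpoint
  xhr.setRequestHeader('Content-Type', 'application/json');
  xhr.onload = function () {
    // Clear the local copy only after the server has confirmed receipt.
    if (xhr.status === 200) { localStorage.setItem(PENDING_KEY, '[]'); }
    onDone();
  };
  xhr.send(pending);
}
```

With a scheme like this, everything not yet synced lives only in the device’s offline storage – which is exactly why losing that storage hurt so much.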

Losing that data was a huge problem. Luckily no sales were lost, but all the logs of the daily activity – which meant so much to the agents, to their managers, and for our understanding of the success or failure of the experiment – were gone.

While it was an isolated report, we blamed it on the agent (he may have accidentally cleared the offline storage, after all).
But when a second agent reported the same issue, our worries became real.

How was that possible? We tested the application for months, both manually and automatically, on different networks and connections, using the same physical device used by the field agents, and we never ever found a similar problem.

Our attempts to reproduce the issue, involving a lot of creativity and hard work, kept failing.

Finally, after more research, hacking devices, and inspecting the internals, we discovered the cause of the issue.

Months before our launch, Apple had introduced a change in the way iOS treated the local storage of HTML5 applications. When the operating system decided to run a “garbage collection” – because of low memory or as part of a routine clean-up – it could effectively wipe that data out. Local storage for HTML5 applications was now treated as temporary.

The only reliable solution, guess what, involved adding an extra new technology to our stack: a native application wrapper for iOS, written in Objective-C. As you may imagine at this stage, the worst that learning a new technology could do was to galvanise our team even more. Such are the benefits of working with a team like that.
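Conceptually, the wrapper exposed a durable store to the web view, and the JavaScript side could fall back gracefully when running in a plain browser. A rough sketch – window.nativeStorage is a hypothetical bridge name, not our real interface:

```javascript
// Sketch: prefer a durable store exposed by the native iOS wrapper,
// falling back to plain localStorage when running outside the wrapper.
function persist(key, value) {
  if (window.nativeStorage) {
    // Hypothetical bridge injected by the Objective-C wrapper: it writes to a
    // store that the operating system will not garbage-collect.
    window.nativeStorage.setItem(key, value);
  } else {
    localStorage.setItem(key, value);
  }
}

function restore(key) {
  return window.nativeStorage
    ? window.nativeStorage.getItem(key)
    : localStorage.getItem(key);
}
```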

We quickly built a prototype that proved our solution, and with the support of native developers from the Mobile department, we refined it and released it to our agents. The bug was squashed.

Mission accomplished

So, as I promised you four blog posts ago, there is a happy ending. And here it is. : )

At the end of our user trial our business sponsors were hugely satisfied, so much so that they tried to assign us other parts of a bigger project. We delivered much more than we promised, by listening and constantly adapting to their requirements.

Our agents and final users loved the application, and loved being involved in its development.

We pioneered new technologies and showcased them to the rest of the department, making our technical directors happy.

We went from being the Cinderella of the department to its rising star.

What happened after that to our team, the hero of our story?

Soon after, as a result of some high level restructuring, the team was moved to another department and bound to merge with another one.

Somehow, we all felt that our experience as a team was over, and in a few months most of us went looking for new adventures.

And so the story ends, with our heroes walking towards the sunset.

But before closing the curtain, I’d like to spend a final post on this topic.

So, if you had the patience to follow it up until now, there is an extra bonus in preparation…

Soon… : )
