Is the Agile approach right for R&D and Data Science?

Egor Korneev
Oct 18, 2022

A path to rapid discovery or a pipe dream?

The Agile methodology of software development has ushered in a revolution. It replaced tedious and protracted specification development with a dynamic process that engages developers, stakeholders, and end users throughout the lifecycle of the product. It shortened development timeframes, brought products to market faster, and limited costly errors through the cycle of realignment inherent in Agile sprint planning. It has spread to most branches of software and is making inroads into advanced data science and R&D. Is it the right trajectory?

The benefits of Agile are understood and appreciated, especially by those of us unfortunate enough to have built under the old regime. I will take the benefits for granted and pledge my allegiance to the new software development ways.

Of course, the legacy hyper-detailed, specification-driven development approach still has its place. It may be the best, if not the only, way to build flight control software, a nuclear reactor operating module, or the firmware driving the James Webb telescope. Iterating through code versions while a multi-billion dollar piece of equipment is floating at a Lagrange point, a million miles away from Earth, is too risky. The old ways have their rightful place.

So is Agile great for everything else? Not so fast. In our work we found a conflict between the expectations stakeholders have under the Agile regime and the realities of building something truly new.

The Agile Alliance lists twelve principles underlying the Agile approach in the Agile Manifesto. I list them here for convenience.

1. Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
2. Welcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage.
3. Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.
4. Business people and developers must work together daily throughout the project.
5. Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.
6. The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.
7. Working software is the primary measure of progress.
8. Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.
9. Continuous attention to technical excellence and good design enhances agility.
10. Simplicity–the art of maximizing the amount of work not done–is essential.
11. The best architectures, requirements, and designs emerge from self-organizing teams.
12. At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.

Almost all are good practices under any regime, even outside of technology. But items 1, 3, and 7 run into trouble in the difficult world of R&D. What is the difference?

The key is the degree of uncertainty present in R&D projects compared with software development.

Software development generally deals with well-understood spaces where the fundamental methods are seldom in question. The speed of execution and the quality of new projects depend on the strength of the team. The development schedule is predictable. Workflows are tried and tested on prior projects.

Mistakes, of course, happen. A wrong architecture choice can derail the schedule. An incorrect platform selection can undermine scalability. Poor intra-team communication may lead to misses in the interface design. And a combination of errors can lead to a project disaster. But the structure of Agile and Scrum is built to catch the misses early and correct them at minimal cost.

By contrast, Research and Development deals with fundamental methods that have not been vetted in the context of the project, if at all. This uncertainty forces development onto unexplored roads, many of which lead to cul-de-sacs. The consequence is an unpredictable schedule, which breaks the key principle of Agile: deliver tangible value to customers with every sprint.

In R&D projects, the team and the customer must accept a protracted period of “fruitless” exploration before an approach yields results. Fruitless, of course, it is not, as each pass decreases uncertainty. Progress should be measured with more sophistication than checking the boxes next to the completed features that make up the MVP. This is difficult to reconcile with the Agile requirement for concrete features bookending a sprint.

The degree to which we can use the Agile principles for R&D projects is proportional to how well understood the technology and its underlying research are. This uncertainty can be expressed as a Technology Readiness Level (TRL). The lower the number, the less mature the technology or algorithm. The Agile approach may not be appropriate for a TRL below eight, or in some cases seven. So R&D and Agile are in uneasy tension, although many Agile principles still apply.
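The rule of thumb above can be sketched in a few lines of Python. This is only an illustration, not a formal method. The function name `suggest_approach`, the `strict` flag, and the returned labels are hypothetical. The thresholds (eight, or seven in some cases) come from the paragraph above, and the 1 to 9 range is the standard TRL scale.

```python
def suggest_approach(trl: int, strict: bool = True) -> str:
    """Suggest a process style from a Technology Readiness Level (1-9).

    Hypothetical helper: the thresholds follow the rule of thumb in the
    text (Agile from TRL 8, or from TRL 7 in some cases when strict=False).
    """
    if not 1 <= trl <= 9:
        raise ValueError("TRL must be between 1 and 9")
    threshold = 8 if strict else 7
    return "agile" if trl >= threshold else "exploratory R&D"


if __name__ == "__main__":
    # Compare the strict and relaxed thresholds for a few sample levels.
    for level in (4, 7, 9):
        print(level, suggest_approach(level), suggest_approach(level, strict=False))
```

In practice, the TRL of a project is itself a judgment call, so a function like this is a starting point for a conversation with stakeholders rather than a verdict.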

Can we use Agile for data science? The conclusion is similar to the above, with the proviso that as data science itself matures, the number of projects ripe for Agile grows. At Ordinal Science, we learned to separate projects into R&D and rote data science, then execute each type using the appropriate approach.

As I finish the article, I wonder what I will think of this post in a year. I suspect my perspective may shift. And this is Agile in action: implement, learn, update. Do it again. Never stick to aging ideas; embrace change instead.
