Updated: Jul 14
Deriving the ORA Score
Part 1 - Probability of Technical Success
You are proposing a project with a promising return on investment, a positive impact on the business, and marketing potential that could distinguish the company from the competition. Good so far. You continue and clarify that it is unknown which technical hurdles may undermine the project before it is completed. And even if it is completed, you understand that your company may not choose to deploy it. But regardless of this latter possibility, could you have a few hundred thousand dollars added to your budget to complete this project? Well? A kind boss may send you back to your desk to do some homework. A typical boss may just stare at you in bewilderment. But would they? As it turns out, many data science, AI, IoT, and other “sexy” technology projects are approved under exactly these circumstances.
Ordinal Science has walked into a few such projects. Their leaders begin with a healthy optimism and elevated expectations driven by an excessive “inside view” (see Daniel Kahneman, Thinking, Fast and Slow). But their vision ultimately does not survive the harsh realities of poor data, immature precursor technologies, insufficient expertise, and a company culture resistant to innovation.
Why were the projects started in the first place? The answer is not simple. Most of the companies we encountered conducted an assessment that attempted to quantify and measure risk. But such assessments suffered from an inability to evaluate AI and data science technologies that had only recently burst onto the scene. These companies lacked objective measures and a mechanism to block projects with a low probability of success. In the end, they were seduced by the hyped-up, if at times dubious, success of competitors. Afraid of being left behind, these companies defied caution and lurched into development.
In the remainder of this post, I will discuss the objective measures that Ordinal Science uses to evaluate the probabilities of technical and commercial success. I will introduce the resulting ORA score, designed to insert a measure of objectivity into the decision to launch, or abandon, a project. There is some math involved, but it is less important than the ideas and principles at play. Data science, AI, and applied mathematics can deliver substantial returns, but which projects you choose, and under what charter, determines how much they do for your company.
The Feasibility Study determines the likelihood of a project’s success. It delivers a single measure of this probability but comprises two independent evaluations. The first is the probability of Technical Success, P(T). It focuses on the project prototype or proof of concept. The second is the probability of Commercial Success, P(C). It evaluates the likelihood that the prototype can be scaled to production and adopted by the company or its customers. The product of these two measures is the Probability of Success: P(S) = P(T)P(C). We treat the two measures as independent: when evaluating the probability of commercial success, we assume that the prototype has already been successfully developed, so P(C) is not discounted a second time for technical risk.
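A quick arithmetic sketch of how the two estimates combine. The function name and the sample values are illustrative assumptions, not figures from an actual assessment.

```python
def probability_of_success(p_technical: float, p_commercial: float) -> float:
    """Combine the two feasibility estimates: P(S) = P(T) * P(C).

    p_commercial is estimated on the assumption that the prototype
    has already been built successfully.
    """
    for name, p in (("P(T)", p_technical), ("P(C)", p_commercial)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {p}")
    return p_technical * p_commercial


# Hypothetical inputs: a 70% technical and a 60% commercial estimate.
p_s = probability_of_success(0.70, 0.60)
print(f"P(S) = {p_s:.2f}")  # prints "P(S) = 0.42"
```

Two estimates that each look comfortable on their own can still multiply out to a project that is more likely to fail than to succeed.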
Before unpacking the derivation for each score I should clarify that our approach is seldom static. Most data science and research projects do share characteristics and benefit from a consistent approach to risk assessment. However, it is often necessary to adapt the details of an assessment to the organizational peculiarities of a project, and to take into account the nuances of industry verticals. Please consider, as we must with every project, what fits your needs. Do so with the intent to remain objective and to avoid the temptation to adjust the parameters of an assessment so that your favorite projects return a better, but inflated, score.
Technical Success Evaluation
Estimating the probability of technical success—designated as P(T) from now on—is an exercise in identifying the salient risk factors, then scoring each individually on a consistent scale before plugging the values into a mathematical function. Easy. But what exactly is “technical success”?
The definition of P(T) is as follows: the likelihood that, given the existing data, the available technologies, and the required research and development, the prototype will be completed and will perform the defined function in a curated laboratory or constrained production environment. P(T) is given as a percentage.
The definition is important as it sets expectations for the deliverables. We limit the scope of P(T) to the prototype designed to test the technologies and convince a moderate sceptic that the approach works. The tests should run in a limited but representative case and use a vetted data set. Why the limitations? They decrease the time needed to develop the prototype without compromising the evaluation. They reduce the chance of misalignment with final goals by facilitating minor, continual adjustments. They reduce the risk of investment. At the same time, the process still validates the difficult components and algorithms required for the production solution.
We defined the specifics of the prototype in the prior engagement stage: Concept Development. That is a subject for a different article. Here we assume that our prototype has a clear purpose, an associated user persona, a definition of function, and a list of unacceptable failure modes.
Now, we evaluate what it takes to build the prototype. I will compress the process into a series of questions which elicit investigation and result in a score from 1 to 10, where 1 is a categorical no and 10 is a categorical yes. We group the evaluation into four equally weighted categories: Data, Technology, Business, and Expertise.
Data
1. We have all required data features
2. The data for each feature is complete
3. The data for each feature is clean and accurate
4. We have the ability to synthetically simulate sample data
5. We can access data without bureaucratic roadblocks
Technology
6. The Technology Readiness Level (TRL) of the majority of the required technology (see NASA: https://www.nasa.gov/directorates/heo/scan/engineering/technology/technology_readiness_level)
7. The lowest TRL of an individual required component
8. We have access to source code or libraries for a related solution
9. We have access to comprehensive documentation for the low TRL components
10. We have the computational capacity to achieve adequate results
11. We have a clear path to validate the correctness of the result
Business
12. The project has a strong internal champion
13. The project has a strong internal ROI
14. The project team has a defined charter
15. The project has an adequate budget
Expertise
16. The client has an internal subject matter expert
17. The team (Ordinal Science) has expertise exactly aligned with the low TRL components
18. The team has successfully concluded research and prototype development in a conceptually similar space
19. The team has access to external resources for consulting on low TRL components
20. The team is granted a degree of independence during the initial research phase
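The checklist above can be captured as a simple score sheet. The sketch below only validates the 1 to 10 scale and summarizes each category. It is not the ORA formula, and the assignment of parameter numbers to categories, the helper names, and the sample scores are assumptions made for illustration.

```python
from statistics import mean

# Parameter numbers (1-20, matching the list above) grouped by category.
# The split of parameters across categories is an assumption of this sketch.
CATEGORIES = {
    "Data": range(1, 6),
    "Technology": range(6, 12),
    "Business": range(12, 16),
    "Expertise": range(16, 21),
}


def validate_scores(scores: dict) -> None:
    """Check that all 20 parameters are present and scored from 1 to 10."""
    expected = set(range(1, 21))
    if set(scores) != expected:
        missing = sorted(expected - set(scores))
        extra = sorted(set(scores) - expected)
        raise ValueError(f"missing parameters {missing}, unexpected {extra}")
    for number, score in scores.items():
        if not 1 <= score <= 10:
            raise ValueError(f"parameter {number}: score {score} is outside 1-10")


def category_means(scores: dict) -> dict:
    """Average score per category: a descriptive summary, not the ORA formula."""
    validate_scores(scores)
    return {
        name: mean(scores[n] for n in numbers)
        for name, numbers in CATEGORIES.items()
    }


# Made-up scores for illustration: strong everywhere except data quality.
example_scores = {n: 8 for n in range(1, 21)}
example_scores.update({2: 4, 3: 3})

for name, value in category_means(example_scores).items():
    print(f"{name}: {value:.1f}")
```

A per-category summary like this makes it easy to see where an investigation should dig deeper before any single number is computed.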
We assign a score to each of the 20 parameters, then plug the numbers into the formula we developed. We call the resulting number the “ORA Score” (Ordinal Risk Assessment Score).
In this formula:
s is an array of individual parameter scores
N is the number of parameters, in our case N = 20
C is a compression coefficient, C = aN, where a = 3 (empirically chosen)
w is an array of parameter weights that determine the criticality of each parameter. Setting w = 6 for a single parameter whose score is s = 1 will suppress P(T) below 0.5.
w = 1 for every parameter is the safest choice, unless the reasons for particular weights are well understood.
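The inputs to the formula can be assembled as shown below. This is only a sketch of the bookkeeping and does not implement the ORA formula itself. The helper name build_weights, the unit-weight baseline, and the choice of parameter 3 as the critical one are illustrative assumptions.

```python
from typing import Dict, List, Optional

N = 20      # number of parameters, as stated in the post
A = 3       # empirically chosen constant a, as stated in the post
C = A * N   # compression coefficient, C = aN


def build_weights(critical: Optional[Dict[int, float]] = None) -> List[float]:
    """Return one weight per parameter, starting from a neutral weight of 1.

    `critical` maps a parameter number (1 to N) to a heavier weight.
    This helper is hypothetical; it only prepares the weight array w.
    """
    weights = [1.0] * N
    for number, weight in (critical or {}).items():
        if not 1 <= number <= N:
            raise ValueError(f"parameter number {number} is outside 1-{N}")
        weights[number - 1] = weight
    return weights


# Assumed example: treat parameter 3 (clean, accurate data) as critical.
w = build_weights({3: 6.0})
print(f"C = {C}")                # prints "C = 60"
print(f"w[3] = {w[2]}")          # prints "w[3] = 6.0"
print(f"len(w) = {len(w)}")      # prints "len(w) = 20"
```

Keeping the weights in one explicit array makes every departure from the neutral baseline visible and easy to challenge in review.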
If you said “Huh?” to the above, no worries: give us a call and we will do the Feasibility assessment for you. That is our function. If you followed along, then a word of caution on the weights. They matter very much and must make sense in the context of your business. They are not equal to 1 for all features all the time. How we derive our weight vectors is a bit of a trade secret, but with much experimentation you may arrive at a reasonable set for your context.
The primary purpose of defining P(T) is to decide whether to continue to the next phase. What is the right threshold for moving forward? I don’t know. We generally gate our projects at a 70% likelihood of technical success, but context matters, as does the risk tolerance of the client. The proper threshold should be determined in a business discussion with stakeholders.
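A gate of this kind reduces to a single comparison. In the sketch below, the function name, the "proceed" and "revisit" labels, and the choice to treat a score exactly at the threshold as passing are assumptions. Only the 70% default comes from the text.

```python
def gate_decision(p_technical: float, threshold: float = 0.70) -> str:
    """Return "proceed" if P(T) meets the gate, otherwise "revisit".

    The threshold is a business decision; 0.70 is the default mentioned
    in the post. A score equal to the threshold passes (an assumption).
    """
    if not 0.0 <= p_technical <= 1.0:
        raise ValueError(f"P(T) must be between 0 and 1, got {p_technical}")
    if not 0.0 < threshold <= 1.0:
        raise ValueError(f"threshold must be in (0, 1], got {threshold}")
    return "proceed" if p_technical >= threshold else "revisit"


# The same hypothetical project under two different risk tolerances.
print(gate_decision(0.72))                  # prints "proceed"
print(gate_decision(0.72, threshold=0.80))  # prints "revisit"
```

The same P(T) can pass for one client and fail for another, which is why the threshold belongs in the stakeholder discussion rather than in the formula.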
I will end Part 1 of the Feasibility post here, to resume with “Part 2 - Probability of Commercial Success” in the next post.