AI, AI Everywhere, not a test in sight

Who's minding your AI?


I think it's beyond question that the single biggest buzzword cutting across the tech landscape today is AI. From the tech giants in Silicon Valley and beyond, to 1-person startups on the other side of the world, everyone is investing in AI and trying to deliver software that's smarter and more intuitive.

The cloud has made the necessary compute available. The algorithms now come in nice packages just waiting to be deployed. Every single day, I see proofs of concept doing incredible things with just a few lines of code. Object recognition and tracking now seem almost easy. Route optimization? Why, I'll have some of that.

Of course, when I call AI a buzzword, I do it a disservice. It's not a flash-in-the-pan marketing gimmick that will be forgotten next year. In fact, I would say that over the last ten years, the cloud and AI have been the two paradigm-shifting events in the technology landscape, and they are very real.

Also, in case you've been living under a rock for a while, AI is well beyond the proof-of-concept phase. It's deeply embedded in a number of systems the entire world uses every single day. Google Maps has AI driving its route calculations, Siri uses it for natural-language processing and search, and Amazon uses it for recommendations on its site.

These are just the most obvious consumer-facing examples. There's a lot more happening on the government and enterprise side. From the criminal justice system to the global supply chain, AI is everywhere.

It would not be an overstatement to say that the goal today is to make all our software smarter by applying AI-based analysis to our problems and helping us make intelligent decisions, faster.


Which brings me to my key point: how do we know it's working?

More specifically, how do YOU know your expensively assembled AI system isn't quietly causing problems in some cases while showing great results in others?

AI systems are inherently black boxes. At its core, most of the AI we talk about is machine learning: algorithms crunching through data sets to find patterns, held in giant matrices of weights that the program assembles on the fly.

The program itself doesn't understand the context of the patterns it finds or the connections it makes. It is given a goal to achieve and works its way toward it, positively weighting events that align with the goal and negatively weighting those that don't. And on it goes, scoring, weighting, and converging on an ever-smaller set of answers that appear to meet the goal.
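To make that loop concrete, here's a toy sketch in Python: a logistic regression trained by gradient descent on made-up data. It's nothing like a production system in scale, and every number and feature in it is invented, but the principle is the same: the program only chases a numeric goal and never asks what the features mean.

```python
# A toy illustration of the "score, weigh, repeat" loop described above.
# Real systems are vastly bigger, but the shape is the same: the program
# only chases a numeric goal (the loss), with no notion of context.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                 # 1000 examples, 5 anonymous features
true_w = np.array([1.5, -2.0, 0.0, 0.7, 0.3])  # hidden pattern the data follows
y = (X @ true_w + rng.normal(scale=0.5, size=1000) > 0).astype(float)

w = np.zeros(5)                                # the "giant matrix" in miniature
for step in range(500):
    p = 1 / (1 + np.exp(-(X @ w)))             # score every example
    grad = X.T @ (p - y) / len(y)              # how far each weight is from the goal
    w -= 0.5 * grad                            # nudge the weights toward the goal
    # The loop never asks *why* a feature helps, only whether it lowers the loss.

print("learned weights:", np.round(w, 2))
```

Run it and the weights land close to the hidden pattern, which looks like success. Whether that pattern is one you actually want learned is a question the loop never asks.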

This is how AI works. And this is how we end up with systems we don't understand, giving us what look like bloody good results.

The problem is, how do we know the result set is good in all conditions? Remember, it's the closest thing to a genuine black box in all of testing, because we probably couldn't work out its logic even if we wanted to.

Worse still, we as a society operate under an assumption of machine neutrality: we trust machine-generated results more than human-curated ones because we believe machines will be more neutral, when in fact the machines are reproducing and even amplifying the very biases we operate with. Crucially, with machine-generated results, our guard is even lower.

As these things get deployed at global scale in ever-widening use cases, it becomes imperative to have a foundation that allows us to be sure the AI is working as it's supposed to. Because when these applications don't work, they can expose you to complete nightmare scenarios.

There are already some pretty epic examples of how these systems fail BIG:

  1. There's the 2015 case where Google's image classification algorithm wrongly classified Black people as gorillas. To my knowledge, this is still not fixed: https://www.theverge.com/2018/1/12/16882408/google-racist-gorillas-photo-recognition-algorithm-ai
  2. In 2016, algorithms deployed at the NZ passport office incorrectly rejected photos from applicants of Asian descent, telling them their eyes were closed and asking them to open their eyes: https://www.reuters.com/article/us-newzealand-passport-error/new-zealand-passport-robot-tells-applicant-of-asian-descent-to-open-eyes-idUSKBN13W0RL
  3. Less than four months ago, Amazon scrapped its SECRET internal AI-powered recruiting tool after it showed bias against women: https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G
  4. Here's a truly disturbing investigation from ProPublica showing that the algorithms used to determine a defendant's risk profile, and thus possible sentencing, were deeply biased and inaccurate: https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm

The examples just keep coming, to the point where it's well established today that AI-based algorithms are extremely susceptible to bias.

The term Garbage In, Garbage Out has never rung truer. It's the nature of the systems themselves that causes this issue: they learn whatever the data teaches them, warts and all.
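If you want to see how easily that happens, here's a deliberately simple, entirely synthetic demonstration (the scenario, features, and numbers are all invented): train a model on historically biased decisions and watch it reproduce the bias for two otherwise identical applicants.

```python
# A toy "garbage in, garbage out" demo: the training labels encode a
# historical bias (group B was approved far less often for identical
# scores), and the model faithfully learns to reproduce it.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000
score = rng.normal(size=n)                 # a genuinely relevant feature
group = rng.integers(0, 2, size=n)         # 0 = group A, 1 = group B
# Biased historical decisions: group B penalized regardless of score.
approved = (score - 1.5 * group + rng.normal(scale=0.3, size=n) > 0).astype(int)

model = LogisticRegression().fit(np.column_stack([score, group]), approved)

# Identical applicant, only the group flag differs:
print("approval prob, group A:", model.predict_proba([[0.5, 0]])[0, 1])
print("approval prob, group B:", model.predict_proba([[0.5, 1]])[0, 1])
# The gap is pure inherited bias -- nothing about the applicant changed.
```

Nothing in that code is malicious, and nothing in it is wrong by the model's own measure of success. The bias came in with the data and went out with the predictions.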


In fact, today may be a momentous day. If you've followed the 10-K Microsoft filed for the last quarter of 2018, the company just warned investors that its investments in AI are not only huge and continuing to grow, they also introduce significant investor risk because of the possibility of extremely large-scale screw-ups: https://1reddrop.com/2019/02/06/microsoft-forewarns-investors-about-its-ai-efforts-going-awry-heres-the-real-story/

You certainly don't read THAT every day, and I believe investors should pay heed. The risk is real.


What does all this mean? 

As one of my gifted colleagues put it: "Blah, blah, blah... but who's going to test it all?"

I've spent time with some big brains while researching this problem from a testing viewpoint, and even the leaders in the space are flummoxed at best.

Elon Musk thinks we need a federal agency overseeing all AI development. While that may be quite out there, my interactions have led me to believe that a neat answer isn't available yet. In the end, it will come down to taking a safety-first approach and staying vigilant.

This isn't meant to scare people off or to be a piece against the use of AI. I think the benefits far outweigh the risks, but the risks are hard to predict, and the smart organization is aware of them and guards against them to the appropriate degree.

So, by all means, go ahead and deploy these algorithms in your applications. But have a robust testing strategy in place. In fact, have an ongoing monitoring strategy in place too, as most of the issues will only surface in real-world deployments (a simple sketch of what that can look like follows below). I would strongly urge using only open-source algorithms. If you can, make the data sets open as well, at least to researchers who can audit the systems and hopefully detect issues before your customers do.
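For what it's worth, here's a minimal sketch of the kind of monitoring check I mean: slice recent production outcomes by a sensitive attribute and raise a flag when the gap crosses a threshold. The data, column names, threshold, and the alert hook are all placeholders you'd replace with your own pipeline.

```python
# A minimal sketch of an ongoing monitoring check: compare positive-outcome
# rates across groups in recent production traffic and flag large gaps.
# The threshold, column names, and sample data below are illustrative only.
import pandas as pd

DISPARITY_THRESHOLD = 0.10   # example threshold; tune for your domain and metric

def check_outcome_disparity(df: pd.DataFrame, group_col: str, outcome_col: str) -> None:
    """Report per-group outcome rates and warn when the spread is too wide."""
    rates = df.groupby(group_col)[outcome_col].mean()
    gap = rates.max() - rates.min()
    print(rates.to_string(), f"\nmax gap: {gap:.3f}")
    if gap > DISPARITY_THRESHOLD:
        # Replace this print with your own paging / ticketing integration.
        print(f"ALERT: outcome disparity {gap:.2%} exceeds {DISPARITY_THRESHOLD:.0%}")

# Example with a synthetic "last week of predictions":
recent = pd.DataFrame({
    "group":    ["A"] * 500 + ["B"] * 500,
    "approved": [1] * 400 + [0] * 100 + [1] * 300 + [0] * 200,
})
check_outcome_disparity(recent, "group", "approved")
```

The point isn't this particular metric; it's that the check runs continuously against real traffic, not once against a test set before launch.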

As always, the amount of effort spent testing and monitoring should be directly tied to the scope of the implementation and the exposure. Just know that AI systems have a propensity for the spectacular, on both the success and the failure front.

