
Tuesday, 8 October 2019

Hype Will Harm Artificial Intelligence

After exploring AI deployment in some depth and chairing the SCL's overview of AI in Dublin in September, I've been particularly conscious of the gap between hype and reality. Nobody should deny that narrow artificial intelligence is here to stay - for good and bad. We just have to be realistic about its capabilities and shortcomings - and how to detect their consequences - so that AI is developed and deployed responsibly.

In a recent report on 'smart cities', for example, the Oliver Wyman Forum found that no city on Earth is ready for the disruptive effects of artificial intelligence.

Talk of 'killer robots' and beating humans at board games is also all the rage, but Barry O'Sullivan assured us in Dublin that robots take ages to 'train' for any one sequence, can't cope with door handles and their batteries soon run down. It took $50m in electricity to train a computer to beat a human at Go. 

AI can be used for good, but it can also be 'weaponised' against a population, and 'hacked' by subtly altering the appearance of objects or people in ways that fool the model - without actually interfering with the AI system itself.
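
To make the 'hacking' point concrete, here's a minimal sketch of one well-known attack of this kind - the fast gradient sign method - which nudges each pixel of an image just enough to flip a classifier's answer. The model, image and label objects are hypothetical placeholders, not anything from the Dublin talks:

```python
# A hedged sketch of an adversarial perturbation (fast gradient sign
# method, FGSM). 'model' is an assumed PyTorch image classifier and
# 'image'/'true_label' are placeholders - none of this comes from the
# talks; it just illustrates the kind of subtle tampering described.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, image, true_label, epsilon=0.01):
    """Return a copy of image nudged so that model may misclassify it."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # Step each pixel slightly in the direction that increases the loss;
    # a small epsilon keeps the change invisible to a human observer.
    return (image + epsilon * image.grad.sign()).clamp(0, 1).detach()
```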

In the 'real world' of AI, the genuine concerns are inaccuracy, lack of explainability and the inability to remove bias. And there remain vast challenges associated with the reliability of AI-derived evidence and how to resolve disputes arising from its use.

That means we have to challenge the use of AI where the consequences of false positives or negatives are fatal or otherwise unacceptable - denying fundamental rights or compensation for loss, for example.


Being realistic about AI and its shortcomings also has implications for how it is regulated. Rather than risk an effective ban on AI by regulating it according to the hype, regulation should instead focus on certifying how AI is developed, and on transparency that lets us understand its shortcomings, so that we can decide where it can appropriately be developed and deployed.

Sunday, 16 June 2019

Of Caution And Realistic Expectations: AI, ANN, BDA, ML, DL, UBI, PAYD, PHYD, PAYL...

A recent report into the use of data and data analysis by the insurance industry provides some excellent insights into the pros and cons of using artificial intelligence (AI) and machine learning (ML) - or Big Data Analytics (BDA). The overall message is to proceed with caution and realistic expectations...

The report starts by contrasting in detail the old and new types of data being used by the motor and health segments in the European insurance industry: 
  • Existing data sources include medical files, demographics, population data, information about the item or person insured ('exposure data'), loss data, behavioural data, the frequency of hazards occurring and so on;
  • New data sources include data from vehicles and other machines or devices like phones, clothing and other 'wearables' (Internet of Things); social media services; call centres; location co-ordinates; genetics; and payment data.
Then the report explains the analytical tools being used, since "AI" is a term used to refer to many things (including some not mentioned in the report, like automation, robotics and autonomous vehicles). Here, we're talking algorithms, ML, artificial neural networks (ANN) and deep learning networks (DLN) - the last two being the main focus of the report.

The difference between your garden-variety ANN and a DLN is the number of "hidden" layers of processing that the inputs undergo before the results pop out the other end. In a traditional computing scenario you can more readily discover that a wrong result was caused by bad data ("shit in, shit out", as the saying goes), but this may be impracticable with even a single hidden layer of computing in an ANN, let alone in a DLN with its multiple hidden layers and greater "challenges in terms of accuracy, transparency, explainability and auditability of the models... which are often correlational and not causative...".
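
To see why those hidden layers resist inspection, here's a toy forward pass - my illustration, not the report's - in which the same input flows through one hidden layer (the garden-variety ANN) and then through ten (a deep network). The weights are random placeholders, and none of the intermediate values maps to a human-readable rule, which is the explainability problem in a nutshell:

```python
# Toy illustration of the 'hidden layer' point: the same input passes
# through one hidden layer (a plain ANN) or many (a DLN). The weights
# are random stand-ins; nothing here maps back to an inspectable rule.
import numpy as np

rng = np.random.default_rng(0)

def forward(x, n_hidden_layers, width=8):
    for _ in range(n_hidden_layers):
        W = rng.normal(size=(width, x.size))   # hidden-layer weights
        x = np.tanh(W @ x)                     # non-linear activation
    w_out = rng.normal(size=x.size)
    return w_out @ x                           # single output score

x = np.array([0.2, -1.3, 0.7])                 # some input features
print(forward(x, n_hidden_layers=1))           # 'garden variety' ANN
print(forward(x, n_hidden_layers=10))          # DLN: ten hidden layers
```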

Of course, this criticism could be levelled at the human decision-making process in any major financial institution, but let's not go there...

In addition, "fair use" of algorithms relies on data that has no inherent bias. Everyone knows the story about the Amazon recruitment tool that had to be shut down because they couldn't figure out how to kill its bias against women. The challenge (I'm told) is to reintroduce randomness to data sets. Also:
As data scientists find themselves working with larger and larger data sets and working harder and harder to find results that are just slightly better than random, they will also have to spend significantly more time and effort in accurately determining what exactly constitutes true randomness in the first place.
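
As an entirely invented illustration of the bias point, here's a minimal sketch of the kind of audit-and-rebalance step a data science team might attempt; the column names and figures are hypothetical:

```python
# A hedged sketch of auditing for bias and naively rebalancing.
# The data is invented: 80 male / 20 female applicants, with 'hired'
# outcomes skewed in favour of men (as in the Amazon story).
import pandas as pd

df = pd.DataFrame({
    "gender": ["m"] * 80 + ["f"] * 20,
    "hired":  [1] * 60 + [0] * 20 + [1] * 5 + [0] * 15,
})

# Selection rate per group - the disparity a fairness audit would flag.
print(df.groupby("gender")["hired"].mean())   # m: 0.75, f: 0.25

# One blunt mitigation: sample the same number of rows from each group,
# so the training set no longer encodes the historical skew.
n = df["gender"].value_counts().min()
balanced = df.groupby("gender").sample(n=n, random_state=0)
print(balanced["gender"].value_counts())      # 20 of each
```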
Alarmingly, the insurers are mainly using BDA tools for pricing and underwriting, claims handling, sales and distribution - so you'd think it pretty important that their processes are accurate, transparent, explainable and auditable; and that they understand what results are merely correlated as opposed to causative...

There's also a desire to use data science throughout the insurance value chain, particularly on product development using much more granular data about each potential customer (see data sources above). The Holy Grail is usage-based insurance (UBI), which could soon represent about 10% of gross premiums: 
  • pay-as-you-drive (PAYD): premium based on kms driven;
  • pay-how-you-drive (PHYD): premium based on driving behaviour; and
  • pay-as-you-live (PAYL): premium based on lifestyle tracking.
This can enable "micro-segmentation" - many small risk pools with more accurate risk assessments and relevant 'rating factors' for each pool - so pricing is more risk-based with less cross-subsidy from consumers who are less likely to make claims. A majority of motor insurers think the number of risk pools will increase by up to 25%, while few health insurers see that happening. 
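
For a flavour of what UBI pricing might look like under the hood, here's a toy sketch that combines PAYD mileage pricing with a crude PHYD-style behaviour factor; every rate and segment factor here is invented, and real actuarial models are far more involved:

```python
# A toy pay-as-you-drive (PAYD) premium, combined with a crude
# micro-segment adjustment from telematics behaviour data. All numbers
# are hypothetical placeholders for illustration only.

BASE_PREMIUM = 200.0        # EUR per year, hypothetical
RATE_PER_KM = 0.015         # EUR per km driven, hypothetical

SEGMENT_FACTOR = {          # behaviour-based risk pools (PHYD)
    "smooth":     0.85,     # gentle braking/acceleration: discount
    "average":    1.00,
    "aggressive": 1.30,     # harsh braking, speeding: loading
}

def payd_premium(km_driven: float, segment: str) -> float:
    """Annual premium from distance (PAYD) and driving style (PHYD)."""
    return (BASE_PREMIUM + RATE_PER_KM * km_driven) * SEGMENT_FACTOR[segment]

print(payd_premium(8_000, "smooth"))       # low-mileage, careful driver
print(payd_premium(25_000, "aggressive"))  # high-mileage, risky driver
```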

Of course, micro-segmentation could also identify customers to whom insurers decide not to offer insurance at all (though many countries - Spain, the Netherlands, Luxembourg, Belgium, Romania and Austria among them - have rules requiring inclusion, or public schemes for motorists who can't otherwise get insurance). Some insurers say it's just a matter of price - e.g. using telematics to allow young high-risk drivers to literally 'drive down' their premiums by showing they are sensible behind the wheel.

An increase in the number of 'rating factors' is likely to be more prevalent in the motor insurance segment, where 80% of factors (vs 67% in health) are said to have a direct causal link to the premium (currently driver/vehicle details, or age in health insurance), rather than an indirect one (such as location or affluence).

Tailoring prices ('price optimisation') has also been banned or restricted on the basis that it can be unfair - indeed, the FCA has explained the factors it considers when deciding whether or not price discrimination is unfair.

Apparently 2% of firms apply BDA to the sales process, resulting in "robo-advice" (advice to customers with little or no human intervention). BDA is also used for "chatbots" that help customers through initial inquiries; to forecast volumes and design loyalty programmes to retain customers; to prevent fraud; to assist with post-sales support and complaints handling; and even to try to "introduce some demand analytics models to predict consumer behaviour into the claims settlement offer."

Key issues include how to determine when a chatbot becomes a robo-adviser; and the fact that some data is normally distributed (data about human physiology) while other data is not (human behaviour).
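
The distribution point matters because a model that quietly assumes normality will misjudge the tails of skewed behavioural data - and in insurance, the tails are where the big claims live. A quick, standard check is a normality test, sketched here with synthetic stand-ins for each kind of data:

```python
# Hedged sketch: physiological data is often roughly normal, while
# behavioural data (e.g. claim sizes) is often skewed. A Shapiro-Wilk
# test distinguishes them. The data below is synthetic, for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
heights = rng.normal(170, 10, size=500)        # physiology-like data
claim_sizes = rng.lognormal(7, 1.2, size=500)  # behaviour-like, skewed

for name, data in [("heights", heights), ("claim_sizes", claim_sizes)]:
    stat, p = stats.shapiro(data)
    # A small p-value means 'reject normality'; expect that for claims.
    print(f"{name}: Shapiro-Wilk p = {p:.3g}")
```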

All of which raises the question: how do you govern the use of BDA?

Naturally, firms that responded to the report claim they have no data accuracy issues and have robust governance processes in place: they don't use discriminatory variables and their outputs are unbiased. Some firms say third party data is less reliable and only use it for marketing, while others outsource BDA altogether. But none of this was verified for the report, let alone whether the outputs of ANN or DLN were 'correct' or 'accurate'.

Some firms claim they 'smoothed' the output of ML with human intervention or caps to prevent unethical outcomes.
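
One plausible reading of 'smoothing with caps' - my illustration, not any firm's actual method - is simply clamping the model's quote into a band around a reference price, so no customer sees an extreme outcome:

```python
# Hedged sketch: clamp a raw ML price into a band around a reference
# price. The 25% band width is a hypothetical governance parameter.
def capped_premium(model_price: float, reference: float,
                   max_deviation: float = 0.25) -> float:
    """Limit the ML-quoted price to reference * (1 +/- max_deviation)."""
    lower = reference * (1 - max_deviation)
    upper = reference * (1 + max_deviation)
    return min(max(model_price, lower), upper)

print(capped_premium(900.0, reference=500.0))  # capped at 625.0
print(capped_premium(420.0, reference=500.0))  # within band: unchanged
```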

Others were concerned that it may not be possible to meet the privacy law (GDPR) requirements to explain the means of processing or the output where ANN or DLN is used.

All of these concerns lead some expert legal commentators to suggest that ANN and DLN are more likely to be used to automate decision-making where "the level of accuracy only needs to be "tolerable" for commercial parties [who are] interested only in the financial consequences... than for individuals concerned with issues touching on fundamental rights." And there remain vast challenges in how to resolve disputes arising from the use of BDA, whether in the courts or at the Financial Ombudsman.

None of this is to say, "Stop!" But it is important to proceed with caution and for its users to be realistic in their expectations of what BDA can achieve...