
Sunday, 16 June 2019

Of Caution And Realistic Expectations: AI, ANN, BDA, ML, DL, UBI, PAYD, PHYD, PAYL...

A recent report into the use of data and data analysis by the insurance industry provides some excellent insights into the pros and cons of using artificial intelligence (AI) and machine learning (ML) - or Big Data Analytics (BDA). The overall message is to proceed with caution and realistic expectations...

The report starts by contrasting in detail the old and new types of data being used by the motor and health segments in the European insurance industry: 
  • Existing data sources include medical files, demographics, population data, information about the item or person insured ('exposure data'), loss data, behavioural data, the frequency of hazards occurring, and so on;
  • New data sources include data from vehicles and other machines or devices such as phones, clothing and other 'wearables' (the Internet of Things); social media services; call centres; location co-ordinates; genetics; and payment data.
Then the report explains the analytical tools being used, since "AI" is a term used to refer to many things (including some not mentioned in the report, like automation, robotics and autonomous vehicles). Here, we're talking algorithms, ML, artificial neural networks (ANN) and deep learning networks (DLN) - the last two being the main focus of the report.

The difference between your garden-variety ANN and a DLN is the number of "hidden" layers of processing that the inputs undergo before the results pop out the other end. In a traditional computing scenario you can more readily discover that a wrong result was caused by bad data ("shit in, shit out", as the saying goes), but this may be impracticable with even a single hidden layer of computing in an ANN, let alone in a DLN with its multiple hidden layers and greater "challenges in terms of accuracy, transparency, explainability and auditability of the models... which are often correlational and not causative...".
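For the non-specialist, here's a minimal sketch in Python of what those "hidden" layers amount to (the layer sizes, weights and inputs are all invented for illustration): each layer is just a matrix multiplication followed by a non-linearity, and every extra layer makes it harder to trace an output back to the inputs that produced it.

    import numpy as np

    def relu(x):
        # Simple non-linearity applied after each layer.
        return np.maximum(0, x)

    def forward(inputs, layers):
        # Pass the inputs through a stack of (weights, bias) layers.
        activation = inputs
        for weights, bias in layers:
            activation = relu(activation @ weights + bias)
        return activation

    rng = np.random.default_rng(0)

    # A 'garden variety' ANN: a single hidden layer.
    shallow = [(rng.normal(size=(4, 8)), np.zeros(8)),
               (rng.normal(size=(8, 1)), np.zeros(1))]

    # A deep network: the same mechanic, stacked - and correspondingly
    # harder to explain or audit.
    deep = [(rng.normal(size=(4, 16)), np.zeros(16)),
            (rng.normal(size=(16, 16)), np.zeros(16)),
            (rng.normal(size=(16, 16)), np.zeros(16)),
            (rng.normal(size=(16, 1)), np.zeros(1))]

    x = np.array([0.2, 1.5, -0.3, 0.7])  # e.g. four 'rating factors'
    print(forward(x, shallow), forward(x, deep))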

Of course, this criticism could be levelled at the human decision-making process in any major financial institution, but let's not go there...

In addition, "fair use" of algorithms relies on data that has no inherent bias. Everyone knows the story about the Amazon recruitment tool that had to be shut down because they couldn't figure out how to kill its bias against women. The challenge (I'm told) is to reintroduce randomness to data sets. Also:
As data scientists find themselves working with larger and larger data sets and working harder and harder to find results that are just slightly better than random, they will also have to spend significantly more time and effort in accurately determining what exactly constitutes true randomness in the first place.
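One hypothetical way to picture that problem in Python: before trusting a model that beats chance by a sliver, compare its score against shuffled baselines (all the data below is made up):

    import numpy as np

    rng = np.random.default_rng(1)

    # Invented example: outcomes, and predictions only slightly better than chance.
    actual = rng.integers(0, 2, size=1000)
    predicted = np.where(rng.random(1000) < 0.52, actual, 1 - actual)

    observed = np.mean(predicted == actual)

    # Permutation baseline: how often do shuffled predictions score this well?
    shuffled = [np.mean(rng.permutation(predicted) == actual) for _ in range(1000)]
    p_value = np.mean([s >= observed for s in shuffled])

    print(f"accuracy={observed:.3f}, p-value vs random baseline={p_value:.3f}")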
Alarmingly, the insurers are mainly using BDA tools for pricing and underwriting, claims handling, sales and distribution - so you'd think it pretty important that their processes are accurate, transparent, explainable and auditable; and that they understand what results are merely correlated as opposed to causative...

There's also a desire to use data science throughout the insurance value chain, particularly in product development, using much more granular data about each potential customer (see the data sources above). The Holy Grail is usage-based insurance (UBI), which could soon represent about 10% of gross premiums (a rough pricing sketch follows the list): 
  • pay-as-you-drive (PAYD): premium based on kms driven;
  • pay-how-you-drive (PHYD): premium based on driving behaviour; and
  • pay-as-you-live (PAYL): premium based on lifestyle tracking.
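Here's the promised sketch in Python - the base premium, weights and scores are entirely invented for illustration, not drawn from the report:

    def ubi_premium(base, kms_driven=0.0, driving_score=1.0, lifestyle_score=1.0):
        # Toy usage-based premium; all factors and weights are invented.
        #   kms_driven      - PAYD: distance driven in the period
        #   driving_score   - PHYD: 1.0 = average; lower = safer driving
        #   lifestyle_score - PAYL: 1.0 = average; lower = healthier lifestyle
        payd = 0.02 * kms_driven  # pay per km driven
        return (base + payd) * driving_score * lifestyle_score

    # A careful, low-mileage driver 'drives down' the premium...
    print(ubi_premium(base=300, kms_driven=4000, driving_score=0.8))   # 304.0
    # ...while a high-mileage, riskier driver pays more.
    print(ubi_premium(base=300, kms_driven=20000, driving_score=1.3))  # 910.0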
Such granular data can enable "micro-segmentation" - many small risk pools with more accurate risk assessments and relevant 'rating factors' for each pool - so pricing is more risk-based, with less cross-subsidy from consumers who are less likely to make claims. A majority of motor insurers think the number of risk pools will increase by up to 25%, while few health insurers see that happening. 

Of course, micro-segmentation could also identify customers to whom insurers decide not to offer insurance at all (though many countries - Spain, the Netherlands, Luxembourg, Belgium, Romania and Austria among them - have rules requiring inclusion, or public schemes for motorists who can't otherwise get cover). Some insurers say it's just a matter of price - e.g. using telematics to allow young, high-risk drivers to literally 'drive down' their premiums by showing they are sensible behind the wheel. 

An increase in the number of 'rating factors' is likely to be more prevalent in the motor insurance segment, where 80% of factors (versus 67% in health insurance) are said to have a direct causal link to the premium - currently driver/vehicle details in motor, or age in health - rather than an indirect one (such as location or affluence).

Tailoring prices ('price optimisation') has also been banned or restricted on the basis that it can be unfair - indeed, the FCA has explained the factors it considers when deciding whether or not price discrimination is unfair.

Apparently 2% of firms apply BDA to the sales process, resulting in "robo-advice" (advice to customers with little or no human intervention). BDA is also used for "chatbots" that help customers through initial inquiries; to forecast volumes and design loyalty programmes to retain customers; to prevent fraud; to assist with post-sales support and complaints handling; and even to try to "introduce some demand analytics models to predict consumer behaviour into the claims settlement offer."

Key issues include how to determine when a chatbot becomes a robo-adviser, and the fact that some data is normally distributed (data about human physiology, for example) while other data is not (human behaviour).

All of which raises the question: how do you govern the use of BDA?

Naturally, firms that responded to the report claim they have no data accuracy issues and robust governance processes in place; they don't use discriminatory variables and their outputs are unbiased. Some firms say third party data is less reliable and only use it for marketing, while others outsource BDA altogether. But none of this was verified for the report, let alone whether the outputs of ANN or DLN were 'correct' or 'accurate'.

Some firms claim they 'smoothed' the output of ML with human intervention or caps to prevent unethical outcomes.
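What such a cap might look like in practice - a purely illustrative Python sketch, with invented limits rather than anything reported by the firms:

    def smooth_premium(model_output, previous_premium, max_increase=0.15, floor=50.0):
        # Cap the model's suggested premium at a maximum year-on-year rise,
        # and never let it fall below a minimum - limits a human would set.
        capped = min(model_output, previous_premium * (1 + max_increase))
        return max(capped, floor)

    print(smooth_premium(model_output=900.0, previous_premium=600.0))  # capped at 690.0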

Others were concerned that it may not be possible to meet the privacy law (GDPR) requirements to explain the means of processing or the output where ANN or DLN is used.

All of these concerns have led some expert legal commentators to suggest that ANN and DLN are more likely to be used to automate decision-making where "the level of accuracy only needs to be 'tolerable' for commercial parties [who are] interested only in the financial consequences... than for individuals concerned with issues touching on fundamental rights." And there remain vast challenges in how to resolve disputes arising from the use of BDA, whether in the courts or at the Financial Ombudsman.

None of this is to say, "Stop!" But it is important to proceed with caution, and for users to be realistic in their expectations of what BDA can achieve...


Monday, 27 May 2019

Trends In Digital Regulation

There are so many initiatives designed to control the digital world that I'm struggling to keep track. 

There's also plenty of overlap and commonality in the issues and regulatory solutions, as well as the digital environments and problems the solutions seek to address. 

So I put together a few slides for ready comparison. 

Interesting to see what leaps out...


Tuesday, 19 May 2015

Of #Smart Contracts, Blockchains And Other Distributed Ledgers

Seems I caught Smart Contract Fever at last week's meeting of the Bitcoin & Blockchain Leadership Forum. So rather than continuing to fire random emails at colleagues, I've tried to calm myself down with a post on the topic.

For context, it's important to understand that 'smart contracts' rely on a cryptographic technology or protocol which generates a 'ledger' that is accessible to any computer using the same protocol. One type of 'distributed ledger' is known as a 'blockchain', since every transaction which is accepted is 'hashed' (condensed into a fixed-length string of letters and numbers) and included with other transactions in a single 'block', which is itself hashed and added to a series or chain of such blocks. The best-known distributed ledger is the Bitcoin blockchain, which underpins the virtual currency of the same name. But virtual currencies (commodities?) are just one use-case for a distributed ledger - indeed, the Bitcoin blockchain is being used for all sorts of non-currency applications, as explained in the very informative book, Cryptocurrency: How Bitcoin and Digital Money are Challenging the Global Economic Order. As Jay Cassano also explains, another example is Ripple, which is designed to be interoperable with other ledgers to support the wider payments ecosystem; while Ethereum is even more broadly ambitious in its attempt to use smart contracts as the basis for all kinds of ledger-based applications.
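A minimal sketch of that hash-and-chain mechanic in Python (a toy illustration of the structure described above, not Bitcoin's actual format):

    import hashlib
    import json

    def sha256(data: str) -> str:
        # Condense arbitrary data into a fixed-length string of letters and numbers.
        return hashlib.sha256(data.encode()).hexdigest()

    def make_block(transactions, previous_hash):
        # Bundle hashed transactions into a block, chained to its predecessor.
        block = {
            "tx_hashes": [sha256(json.dumps(tx, sort_keys=True)) for tx in transactions],
            "previous_hash": previous_hash,
        }
        block["block_hash"] = sha256(json.dumps(block, sort_keys=True))
        return block

    # Each accepted batch of transactions extends the chain; tampering with an
    # earlier transaction would change every subsequent block's hash.
    genesis = make_block([{"from": "alice", "to": "bob", "amount": 5}], previous_hash="0" * 64)
    block2 = make_block([{"from": "bob", "to": "carol", "amount": 2}], genesis["block_hash"])
    print(block2["previous_hash"] == genesis["block_hash"])  # True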

Generally speaking, the process of forming a 'smart contract' would be started by each party publishing a coded bid/offer or offer/acceptance to the same ledger or 'blockchain', using the same cryptographic protocol. These would be like two (or more) mini-apps specifying the terms on which the parties were seeking to agree. When matched, these apps would form a single application encoding the terms of the concluded contract, and this would also be recorded in the distributed ledger accessible to all computers running the same protocol. Further records could be 'published' in the ledger each time a party performed or failed to perform a contractual obligation. So the ledger would act as its own trust mechanism to verify the existence and performance of the contract. Various applications running off the ledger would be interacting with the contract and related performance data, including payment applications, authentication processes and messaging clients of the various people and machines involved as 'customers' or 'suppliers' in the related business processes. In the event of a dispute, a pre-agreed dispute resolution process could be triggered, including enforcement action via a third party's systems that could rely on the performance data posted to the ledger as 'evidence' on which to initiate a specific remedy. 
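As a thought experiment, the bid/offer matching step might look something like this in Python - every name and structure below is invented to illustrate the flow, not taken from any actual protocol:

    from dataclasses import dataclass, field

    @dataclass
    class Ledger:
        # Toy stand-in for a distributed ledger visible to all parties.
        entries: list = field(default_factory=list)

        def publish(self, entry):
            self.entries.append(entry)
            return entry

    ledger = Ledger()

    # Each party publishes a coded offer/acceptance to the same ledger.
    offer = ledger.publish({"party": "seller", "item": "widget", "price": 100, "type": "offer"})
    acceptance = ledger.publish({"party": "buyer", "item": "widget", "price": 100, "type": "acceptance"})

    # When the terms match, a single record encoding the concluded contract
    # is itself published for all participants to verify.
    if (offer["item"], offer["price"]) == (acceptance["item"], acceptance["price"]):
        ledger.publish({"type": "contract", "terms": {**offer, "accepted_by": acceptance["party"]}})

    # Performance (or breach) would be published the same way, as 'evidence'
    # for any later dispute resolution process.
    ledger.publish({"type": "performance", "obligation": "payment", "status": "performed"})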

Some commentators have suggested this will kill off various types of intermediaries - lawyers, courts and so on. But I think the better view is that existing roles and processes in the affected contractual scenarios will adapt to the new contractual methodology. Some roles might be replaced by the ledger itself, or become fully automated, but it's likely that the people or entities occupying today's roles would somehow be part of that evolution (if they aren't too sleepy). The need for a lot of human-readable messages would also disappear, signalling the demise of applications like email, SMS and maybe even the humble Internet browser. Most data could flow among machines, and they could alert humans in ways that don't involve buttons and keyboards.

So what are the benefits?

Well, it might take significant investment to set up such a process, but it should produce great savings in time, cost, record-keeping and so on throughout the lifetime of a contract. And, hey, no more price comparison sites or banner ads! Crypto-tech distributed ledgers would enable you to access and use a 'semantic web' of linked-data, open data, midata, wearables, smart meters, robots, drones and driverless cars - the Internet of Things - to control your day-to-day existence.

The downside?

This might also play into the hands of the Big Data crowd (if they find a way to snoop on your encrypted contracts), or even the machines themselves. So it's critical that we figure out the right control mechanisms to 'keep humans at the heart of technology' - the topic of the SCL's Tech Law Futures Conference in June, for example.

Meanwhile, I'm reviewing my first smart contract, which is proving rather like being involved in the negotiation of a software development agreement - which it is, of course. I'll post on that in due course, confidentiality permitting...

