The greatest obstacle to discovery is not ignorance—it is the illusion of knowledge

So said Daniel J. Boorstin. It’s been an interesting week for those, like me, who follow the development of interaction between humans and machines. Specifically, people seem shocked that voice assistants are being used for health questions, also that the companies who make them employ people to listen to samples of voice recordings to make them better.

Before diving into that, let’s just zoom out a bit and remind ourselves that the average level of digital literacies in the general population is pretty poor. Sometimes I wonder how on earth VC-backed companies manage to burn through so much cash. Then I remember the contortions that those who design visual interfaces go through so that people don’t have to think.

Discussing ‘fake news’ and our information literacy problem in Forbes, you can almost feel Kalev Leetaru‘s eye-roll when he says:

It is the accepted truth of Silicon Valley that every problem has a technological solution.

Most importantly, in the eyes of the Valley, every problem can be solved exclusively through technology without requiring society to do anything on its own. A few algorithmic tweaks, a few extra lines of code and all the world’s problems can be simply coded out of existence.

Kalev Leetaru

It’s somewhat tangential to the point I want to make in this article, but Cory Doctorow makes a a good point in this regard about fake news for Locus

Fake news is an instrument for measuring trauma, and the epistemological incoherence that trauma creates – the justifiable mistrust of the establishment that has nearly murdered our planet and that insists that making the richest among us much, much richer will benefit everyone, eventually.

Cory Doctorow

Before continuing, I’d just like to say that I’ve got some skin in the voice assistant game, given that our home has no fewer that six devices that use the Google Assistant (ten if you count smartphones and tablets).

Voice assistants are pretty amazing when you know exactly what you want and can form a coherent query. It’s essentially just clicking the top link on a Google search result, without any of the effort of pointing and clicking. “Hey Google, do I need an umbrella today?”

However, some people are suspicious of voice assistants to a degree that borders on the superstitious. There’s perhaps some valid reasons if you know your tech, but if you’re of the opinion that your voice assistant is ‘always recording’ and literally sending everything to Amazon, Google, Apple, and/or Donald Trump then we need to have words. Just think about that for a moment, realise how ridiculous it is, and move on.

This week an article by VRT NWS stoked fears like these. It was cleverly written so that those who read it quickly could easily draw the conclusion that Google is listening to everything you say. However, let me carve out the key paragraphs:

Why is Google storing these recordings and why does it have employees listening to them? They are not interested in what you are saying, but the way you are saying it. Google’s computer system consists of smart, self-learning algorithms. And in order to understand the subtle differences and characteristics of the Dutch language, it still needs to learn a lot.

[…]

Speech recognition automatically generates a script of the recordings. Employees then have to double check to describe the excerpt as accurately as possible: is it a woman’s voice, a man’s voice or a child? What do they say? They write out every cough and every audible comma. These descriptions are constantly improving Google’s search engines, which results in better reactions to commands. One of our sources explains how this works.

VRS NWS

Every other provider of speech recognition products does this. Obviously. How else would you manage to improve voice recognition in real-world situations? What VRS NWS did was to get a sub-contractor to break a Non-Disclosure Agreement (and violate GDPR) to share recordings.

Google responded on their blog The Keyword, saying:

As part of our work to develop speech technology for more languages, we partner with language experts around the world who understand the nuances and accents of a specific language. These language experts review and transcribe a small set of queries to help us better understand those languages. This is a critical part of the process of building speech technology, and is necessary to creating products like the Google Assistant.

We just learned that one of these language reviewers has violated our data security policies by leaking confidential Dutch audio data. Our Security and Privacy Response teams have been activated on this issue, are investigating, and we will take action. We are conducting a full review of our safeguards in this space to prevent misconduct like this from happening again.

We apply a wide range of safeguards to protect user privacy throughout the entire review process. Language experts only review around 0.2 percent of all audio snippets. Audio snippets are not associated with user accounts as part of the review process, and reviewers are directed not to transcribe background conversations or other noises, and only to transcribe snippets that are directed to Google.

The Keyword

As I’ve said before, due to the GDPR actually having teeth (British Airways was fined £183m last week) I’m a lot happier to share my data with large companies than I was before the legislation came in. That’s the whole point.

The other big voice assistant story, in the UK at least, was that the National Health Service (NHS) is partnering with Amazon Alexa to offer health advice. The BBC reports:

From this week, the voice-assisted technology is automatically searching the official NHS website when UK users ask for health-related advice.

The government in England said it could reduce demand on the NHS.

Privacy campaigners have raised data protection concerns but Amazon say all information will be kept confidential.

The partnership was first announced last year and now talks are under way with other companies, including Microsoft, to set up similar arrangements.

Previously the device provided health information based on a variety of popular responses.

The use of voice search is on the increase and is seen as particularly beneficial to vulnerable patients, such as elderly people and those with visual impairment, who may struggle to access the internet through more traditional means.

The BBC

So long as this is available to all types of voice assistants, this is great news. The number of people I know, including family members, who have convinced themselves they’ve got serious problems by spending ages searching their symptoms, is quite frightening. Getting sensible, prosaic advice is much better.

Iliana Magra writes in the The New York Times that privacy campaigners are concerned about Amazon setting up a health care division, but that there are tangible benefits to certain sections of the population.

The British health secretary, Matt Hancock, said Alexa could help reduce strain on doctors and pharmacists. “We want to empower every patient to take better control of their health care,” he said in a statement, “and technology like this is a great example of how people can access reliable, world-leading N.H.S. advice from the comfort of their home.”

His department added that voice-assistant advice would be particularly useful for “the elderly, blind and those who cannot access the internet through traditional means.”

Iliana Magra

I’m not dismissing the privacy issues, of course not. But what I’ve found, especially recently, is that the knowledge, skills, and expertise required to be truly ‘Google-free’ (or the equivalent) is an order of magnitude greater than what is realistically possible for the general population.

It might be fatalistic to ask the following question, but I’ll do it anyway: who exactly do we expect to be building these things? Mozilla, one of the world’s largest tech non-profits is conspicuously absent in these conversations, and somehow I don’t think people aren’t going to trust governments to get involved.

For years, techies have talked about ‘personal data vaults’ where you could share information in a granular way without being tracked. Currently being trialled is the BBC box to potentially help with some of this:

With a secure Databox at its heart, BBC Box offers something very unusual and potentially important: it is a physical device in the person’s home onto which personal data is gathered from a range of sources, although of course (and as mentioned above) it is only collected with the participants explicit permission, and processed under the person’s control.

Personal data is stored locally on the box’s hardware and once there, it can be processed and added to by other programmes running on the box – much like apps on a smartphone. The results of this processing might, for example be a profile of the sort of TV programmes someone might like or the sort of theatre they would enjoy. This is stored locally on the box – unless the person explicitly chooses to share it. No third party, not even the BBC itself, can access any data in ‘the box’ unless it is authorised by the person using it, offering a secure alternative to existing services which rely on bringing large quantities of personal data together in one place – with limited control by the person using it.

The BBC

It’s an interesting concept and, if they can get the user experience right, a potentially groundbreaking concept. Eventually, of course, it will be in your smartphone, which means that device really will be a ‘digital self’.

You can absolutely opt-out of whatever you want. For example, I opt out of Facebook’s products (including WhatsApp and Instagram). You can point out to others the reasons for that, but at some point you have to realise it’s an opinion, a lifestyle choice, an ideology. Not everyone wants to be a tech vegan, or live their lives under those who act as though they are one.

2 Comments

Add yours →

  1. Nice summary of some of the vexing issues that have raised concerns on personal data privacy. It is the case that many of the big players have played fast and loose with ‘our data’. That speaks to the Doctorow quote as applied to complicit corporations.

    I think it’s worth emphasizing that the BBC Box approach you shared is fundamentally different in that it is decentralized and operates within the oversight and physical proximity to the individual user. Even from what I could discern the processing takes place locally. Decentralized system designs show great promise for returning control and insuring the privacy of personal data.

    • Doug Belshaw

      22 July 2019 — 09:24

      Thanks Phil, and as someone leading a decentralised project, I definitely agree with you.

      One of the main problems is data literacy, as in how do people *know* where their data is coming from and where it’s going?

Comments are closed.