Thinking About Data

Recently a friend noted that he wanted to talk soon about his use of ChatGPT for better writing, and I’ll be interested to see the results – without being tempted to try it out myself, for reasons that follow. My first encounter with artificial intelligence was through Kevin Kelly’s The Inevitable: Understanding the 12 Technological Forces That Will Shape Our Future, which I read soon after its publication in 2016. AI came in as number two, in a chapter called Cognifying – after a chapter called Becoming, which deals with the inevitability of always having to upgrade our devices.

I’ve just read a bit of the book again. I enjoy telling young persons advising me on the latest technology that I first turned on a desktop in 1983, was on CompuServe by 1989, added the web the next year, and had my own website in 1995 – a one-pager with no images. Kelly was a couple of decades ahead of me, but I relate well to his sense of excitement about where we have been – and I take seriously his notions of where we might be going. Like Brian Arthur, he notes that technology frames us after we adapt it to our own purposes, and that it creates the cultural era. I look forward to re-reading the whole book.

Kelly points out that AI technology was already here as a force when he wrote the book – because of the networks of information already in existence. He said, “It will be hard to tell where its thoughts begin and ours end.” He also pointed out how its utter ubiquity hid it from us even then. AI is the ultimate disrupter, and suddenly the ability to deal with vast quantities of data is in our hands. I can remember when a mainframe was the size of a dining room, and now I have instant data available on my phone. As Kelly points out, the magic has been the combining of computer, phone and internet – and it has happened in less than half my lifetime.

When the latest New Yorker came through the door, I tried not to add it to the pile of others immediately and turned to the article by Jill Lepore, their excellent staff writer who tackles many current topics. Lepore received her Ph.D. in American studies from Yale in 1995 and is the David Woods Kemper ’41 Professor of American History at Harvard University. The article’s title is Data Driven, and it resonates particularly as I serve as a volunteer secretary for an institution which is reviewing its immediate past and deciding where it wants to go. Online Zooms and surveys have become part of the scene, and its steering group has been given quantities of data to sort through. I don’t have to read it all, but like the other volunteers, I browse through. From workers recovering from the pandemic there are many responses of “poor me”. There is also dissatisfaction with reduced revenue and attendance, a typical desire for more from the young, and a general sense of foreboding. It’s hard to find the desired glimmers of hope – and one solution has been to turn the data over to a bot to sort it out.

Not precisely a bot – the task has been turned over to an outside consultant with a bot that attempts to put single words in larger contexts. “Day” is a simple word; “Good day” and “Bad day” mean something different. I was pleased to see her acknowledge that the coding of the data into larger sets was something she had done entirely subjectively. Putting the data through this process was helpful even though the sample of the total population she was drawing from was very small – about 2.4%. But we are encouraged to listen to the data to predict the future, even as we really know we can’t.
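
For readers who wonder what “putting single words in larger contexts” might look like in practice, here is a toy sketch in Python – my own illustration, not the consultant’s actual tool; the word lists and the score_response function are invented for the example. It scores a short survey response by reading each word alongside its neighbours, so that “not a good day” counts differently from “a good day”.

```python
# A toy illustration, not the consultant's actual tool: score short survey
# responses by reading each word in context rather than in isolation.

# Invented word lists for the example; a real tool would use far larger ones.
POSITIVE = {"good", "hopeful", "improving"}
NEGATIVE = {"bad", "poor", "worried"}
NEGATORS = {"not", "no", "never"}


def score_response(text: str) -> int:
    """Return a rough score: +1 per positive word, -1 per negative word,
    with the sign flipped when a negator appears just before it ("not good")."""
    words = text.lower().split()
    score = 0
    for i, word in enumerate(words):
        # Look back up to two words so "not a good day" still flips "good".
        window = words[max(0, i - 2):i]
        sign = -1 if any(w in NEGATORS for w in window) else 1
        if word in POSITIVE:
            score += sign
        elif word in NEGATIVE:
            score -= sign
    return score


if __name__ == "__main__":
    for reply in ["It was a good day", "Not a good day", "Attendance is poor"]:
        print(reply, "->", score_response(reply))
```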

Lepore starts her article with an entertaining fantasy of a millionaire trying to develop a new plan for universal knowledge. He recruits 500 college grads to read three hundred books a year for five years. When they turn up to be paid for their efforts, their brains are instead removed and wired up to a radio and a typewriter. Lepore then turns to the latest gizmo for universal knowledge, ChatGPT, and asks it to write an essay on toadstools. The essay comes out right away. She liked what she saw, but also imagined the missing shadow side of the instruction to eat them – some toadstools are poisonous.

She goes on to imagine an old-fashioned small steel case, like the one in my closet still holding document-filled file folders and also serving as the base for an all-in-one printer that is used less and less. Her cabinet has four drawers labelled “Mysteries”, “Facts”, “Numbers” and “Data”. The labels might suggest the contents are similar, but each follows a different logic. She describes them:

  • Mysteries are things that only God knows – top drawer because it is closest to heaven. The point of collecting them is the search for salvation, and the discipline that studies them is theology.

  • One collects Facts to find the truth through discernment. In contrast to the previous drawer, they are associated with secularization and liberalism; the disciplines are law, the humanities, and the natural sciences.

  • Numbers are gathered as statistics through measurement. They are associated with administration; their disciplines are the social sciences.

  • Feeding Data into computers leads to the discovery of patterns to make predictions. Data is associated with late capitalism, authoritarianism, techno-utopianism – and the discipline known as data science.

All of these, Lepore points out, are good ways of knowing – and the best thing to do in any situation is to open all four drawers. But we are now in an era where we tend to want to open only the bottom one. Citing a recent book, How Data Happened: A History from the Age of Reason to the Age of Algorithms, by Chris Wiggins and Matthew L. Jones, she notes how statistics, numbers and data have been used to support previous biases in fields like intelligence, race, crime and eugenics. Some of us are old enough to remember sets of Books of Knowledge and encyclopedias – often bought by parents on the installment plan in the hopes that their offspring would thrive. Now the indicted cryptocurrency entrepreneur Sam Bankman-Fried is quoted as having famously said, “I would never read a book”.

Technocracy – a movement chiefly of engineers – promised a new world following the Depression, though it fell out of favour in the 1940s. As data storage became more available and information became digitized, data science came to be perceived as the only tool in the cabinet.

But should it be? In an article yesterday in the Washington Post, Tatum Hunter notes three things that everyone is getting wrong about AI. A thousand people have asked that AI experiments like these large language models slow down – though Kevin Kelly would probably wish them good luck. There may be other necessary ways to deal with them.

First and most important, we should not project human qualities onto AI. When my iPhone’s really basic AI prompts me to change a word as I write, I’m tempted to say, “Don’t be stupid, that isn’t what I mean” – as though I were talking to another human being. Instead I should be saying, “This platform’s algorithm is not sound in the information it has searched for.” I’m not ever likely to look to AI for emotional support. Sadly, the most vulnerable are those who receive their information from questionable real people, and they are the ones likely to put their faith in words drawn from equally questionable sources by a machine.

Second, what is coming down the pipe is not one technology but a whole sequence of them, with different building blocks. Who the builder is and what the purpose is will vary. Different AI platforms will have different values, rules and priorities. Some specialized ones may indeed have their positive uses. Some will start well, with high values, and become commercially greedy. Hello, Facebook; hello, Google. We are not very good right now at holding the creators of algorithms to account for all those advertisements on the social media platforms we use every day. That might not be a bad place to start educating ourselves.

Third – and following from this – always be skeptical. As I learn today of the indictment of a former US president, I can only imagine what a chatbot might come up with as an answer to a question about it. The danger I see immediately in an amalgamation of information, neatly returned in good English and tidy paragraphs, is trusting its accuracy because it looks so professional. I use Google all the time to research a topic, but at least I can see the source it is drawing from, and I can make judgments about that source. I won’t say that I am without bias, but at least I know what the source is. Tatum Hunter, at the end of her article, rather optimistically lists the reliable sources – newspapers, government and university websites, academic journals. Sometimes, yes. What she might add in all these cases is to look for a diversity of views within the sources themselves, and for clear attribution. As I write, you can at least see mine.

 

 
