Language has always evolved naturally, but what happens when it’s unnatural?

Generative AI tool ChatGPT is about to tell us

By Robert Stevenson

Monday, 12th of December 2022

2022 has been a break-out year for generative AI tools. 

They are quickly moving from niche tech interest to mainstream awareness, albeit with relatively modest practical usage as yet. 

We are, however, seeing just the start of their adoption and the extent of their capabilities – and they’re already astonishing.

Want a quick overview of how GPT (Generative Pre-trained Transformer) tools work?

Firstly, a neural network is trained on an enormous dataset of content (not Google search results). Given a prompt, the tool then draws on what it has learnt, using the network to predict likely text and produce a result. 
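As a loose illustration of that "learn from a corpus, then continue a prompt" idea, here is a toy sketch in Python. It is emphatically not how GPT works at scale: simple bigram word counts stand in for the neural network, and the corpus is a single made-up sentence.

```python
import random
from collections import defaultdict, Counter

def train(corpus: str) -> dict:
    """'Training': count which word tends to follow which in the corpus."""
    counts = defaultdict(Counter)
    words = corpus.split()
    for current, nxt in zip(words, words[1:]):
        counts[current][nxt] += 1
    return counts

def generate(model: dict, prompt: str, length: int = 5, seed: int = 0) -> str:
    """Given a prompt, repeatedly pick a likely next word from the learnt counts."""
    rng = random.Random(seed)
    out = prompt.split()
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:  # no known continuation for this word
            break
        words, weights = zip(*followers.items())
        out.append(rng.choices(words, weights=weights)[0])
    return " ".join(out)

corpus = "the cat sat on the mat the cat ate the fish"
model = train(corpus)
print(generate(model, "the cat"))
```

A real GPT model replaces the counting with billions of learnt parameters and operates on sub-word tokens rather than whole words, but the generation loop is conceptually similar: predict the next token, append it, repeat.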

If you want to know more, the details are available on the OpenAI website.

The GPT-3.5 model has 175 billion parameters, compared with GPT-2's already vast 1.5 billion, which goes some way to explaining the significant advancement of this release. GPT-4, due in early 2023, is expected to be another leap.

Ease of use, accessibility (many AI tools are free for now, or offer free credits before a paid subscription) and the ability to gather inspiration and quickly spin up ideas are likely to lead to adoption across industries – including, or perhaps even spearheaded by, the brand and marketing industries.

ChatGPT is conversational, with call and response

Prompt-to-text generative AI tools are now able to produce anything from concise brand straplines and website copy right through to business plans and entire novels. 

These tools may dent Google's search dominance – TikTok already has. 

When search results look less like a menu you have to assess and choose from, and more like a conversation or a recommendation video from a friend, it’s apparent why people may favour this over traditional search. 

The ability for users to ask follow-up questions to initial searches is intuitive and leaves Google results pages looking rather antiquated. Google does have a standalone AI unit, DeepMind, which "taught" a computer to beat a human champion of the notoriously complex Chinese board game Go. So, while Google is by no means out of the race, it is facing some serious competition.

Coders, who often rely quite heavily on Google to tell them how to improve their code (no judgement), have found that AI tools can outclass existing search results. This alone means that every business that relies on code can be more efficient and produce arguably better products, faster.

Of course, there are huge challenges too.

Envisioning weird futures with AI is my new favourite thing

Data Protection

Firstly, there’s data protection, licensing and remuneration for creators whose source material is the basis for AI generated text and images. Elon Musk, who co-founded OpenAI (the company responsible for DALL-E and ChatGPT), has stepped away citing data protection issues.

Ethics

Secondly, there’s the ethics. A child could, in theory, do their homework in minutes with little investment or understanding – and this carries through to the world of work. To what extent could AI replace a human workforce? 

Accuracy

Thirdly, accuracy. While AI draws from almost any available source and should, in theory, average out, programming quirks and the provenance of its data give rise to serious trust issues. 

As fast as factually incorrect content can be produced with AI, it can be published, distributed and consumed. 

There is opaque validation within the code and little validation outside of it. With ChatGPT, instead of “computer says no”, you get a rough approximation of an answer. OpenAI does state upfront: “ChatGPT sometimes writes plausible sounding but incorrect or nonsensical answers”.

Despite these concerns, speed is an enormous advantage and the low barrier to entry presents egalitarian opportunities to collaborate with AI – in theory.

Yesterday morning, for example, I was on the London Underground and found myself staring at two posters next to each other for the entire journey (full 8am zombie vibes). They were for a Muslim marriage app called Salams and an African mobile money transfer company called NALA. Both had short-form, large-font copy that used English in a dialect highly relevant to their intended audiences, but which would mean almost nothing to others.

This prompted me to think about localisation. Could the language of niche communities, minority groups and marginalised sections of society be suppressed, leaving them behind or out of this generative AI revolution? If AI language becomes the first port of call in brand and marketing, the opportunity for nuance and cultural distinctiveness may be squeezed. 

In the future, what effect would this have on the language we interact with on a daily basis? Is it possible that communities already fighting for their voices to be heard and seen are about to be drowned out by the white noise of AI? 

Further into the future, the growing body of AI-produced content may itself become de facto source material. This has the potential to upend existing language structures, especially when you consider how quickly language is already adopted and adapted by school kids, in music, in games and across borders.

Another OpenAI product, "Whisper", is an ASR (automatic speech recognition) model "trained on 680,000 hours of multilingual and multitask supervised data collected from the web… the use of such a large and diverse dataset leads to improved robustness to accents, background noise and technical language". 

I am intrigued, as there is a distinction between accent and dialect that cannot be overlooked. Reinforcement learning, supervised learning and open-sourcing mean that specialisms can be honed and trained, but this requires input.

Ultimately, if an equilibrium is to be reached, early measures should be taken and emphasis placed on the assistive capabilities of AI alongside the value of human creativity and quality.

Creativity is in essence most alive when it is drawing from disparate sources and producing something new. Outside of translation, how can localisation of dialect and culture be included and factored into this before content multiplies to an extent that it is detrimental?

I will be continuing to put AI through its paces and trying to understand how my own neural pathways work while doing so. There is also a lot of fun to be had! Try it for yourself here: ChatGPT