Tracing the (Surprisingly) Long History of Language-Based AI

It’s a regular Monday morning. Your fitness tracker wakes you up at just the right moment - because who wants to start the week feeling groggy? You unlock your phone with a quick glance (thanks, facial recognition), check your notifications, and ask Siri about the weather. Sound familiar? That’s my morning routine, too. It’s a perfect reminder of how deeply artificial intelligence (AI) is woven into our daily lives.

From Ray-Ban’s smart glasses that “talk” to you to Tesla’s self-driving cars that are so good you could confidently open your laptop and have ChatGPT’s language model edit your blog post while merging onto the highway (I said could, ok? Not suggesting or endorsing!). Kidding aside, AI is everywhere - but have you ever stopped to reflect on what artificial intelligence actually is? This new blog series aims to answer that question in a light, digestible format, with an obvious focus on language-based AI, which is what we know best.

Let’s start from the top. Most people think of AI as an umbrella term for machines that perform tasks requiring “human-like” intelligence. We can agree on that definition for now, but it’s important to note that there’s much more happening under the hood - in the “backend” where these diverse systems are built.

There are multiple approaches to building AI systems. Some use rule-based methods, where predetermined rules analyze data; others learn and evolve their “rules” from input data using Machine Learning (ML). What might be surprising is that AI - or at least the notion of a “thinking machine” - has been around for longer than most people think. Take Siri answering your weather question, for example - a perfect use of language-based AI, specifically natural language processing (NLP). NLP comprises methods to analyze, understand, and generate human language. While Siri debuted in 2011, the history behind NLP can be traced all the way back to 1883.
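To make the distinction concrete, here is a toy sketch contrasting the two approaches on a tiny “spam detection” task. Everything in it (the banned words, the example data, the scoring scheme) is invented for illustration - real ML systems use far more sophisticated models.

```python
# Rule-based approach: the "rules" are written by hand, in advance.
def is_spam_rule_based(subject: str) -> bool:
    banned = {"winner", "free", "prize"}
    return any(word in subject.lower() for word in banned)

# Machine-learning approach: the "rules" (here, simple word weights)
# are derived from labeled examples instead of being hand-written.
def train_spam_weights(examples):
    """examples: list of (subject, is_spam) pairs."""
    weights = {}
    for subject, label in examples:
        for word in subject.lower().split():
            weights[word] = weights.get(word, 0) + (1 if label else -1)
    return weights

def is_spam_learned(subject: str, weights) -> bool:
    score = sum(weights.get(w, 0) for w in subject.lower().split())
    return score > 0
```

The rule-based version never changes unless a human edits it; the learned version changes its behavior whenever it is retrained on new examples - which is exactly the distinction above.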

1883

French philologist Michel Bréal developed the concept of semantics while studying word relationships in language. Bréal’s work laid the groundwork for understanding the meaning and interpretation of language.

1906-1911

Ferdinand de Saussure, often considered the father of modern linguistics, introduced the idea of languages as “systems” of signs - thinking that later influenced structural approaches in computer science.

1950

Computer scientist and philosopher Alan Turing published a paper proposing the question “Can machines think?” and describing a test for a “thinking” machine that could imitate a human. He also predicted that within half a century, human interrogators would not reliably distinguish machines from humans in the described test.

1966

MIT computer scientist Joseph Weizenbaum developed ELIZA, one of the first programs to use NLP. This early chatbot identified keywords in the user’s input and responded with pre-programmed answers. In 2022, it won a Peabody Award as a Legacy Winner in the Digital and Interactive category, and its source code is now publicly available (don’t be shy - test it out!).
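The keyword-and-template trick is simple enough to sketch in a few lines. This is a toy ELIZA-style responder, not Weizenbaum’s actual program - the patterns and canned replies here are invented for illustration:

```python
import re

# Each rule pairs a keyword pattern with a reply template.
# The captured text from the user's input is echoed back into the reply.
RULES = [
    (r"\bI feel (.+)", "Why do you feel {0}?"),
    (r"\bmy (mother|father)\b", "Tell me more about your {0}."),
    (r"\bhello\b", "Hello! How are you feeling today?"),
]

def eliza_reply(text: str) -> str:
    for pattern, template in RULES:
        match = re.search(pattern, text, re.IGNORECASE)
        if match:
            return template.format(*match.groups())
    return "Please, go on."  # fallback when no keyword matches

print(eliza_reply("I feel anxious about Mondays"))
```

No understanding happens anywhere in this loop - just pattern matching and fill-in-the-blank - yet early users famously attributed real empathy to ELIZA.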

1986

David E. Rumelhart, Geoffrey Hinton, and Ronald J. Williams introduced Recurrent Neural Networks (RNNs), a new type of neural network trained to process sequential data - a natural fit for common NLP tasks.

1991 

The World Wide Web (WWW) was made public, beginning the accumulation of a massive amount of publicly available text data.

1997

Long Short-Term Memory (LSTM), a new type of RNN, was introduced to address earlier RNNs’ inability to retain information over long sequences.

2007

NVIDIA released CUDA, a parallel computing platform that opened its GPUs to general-purpose programming and dramatically accelerated data-intensive computation.

2010

Stanford CoreNLP was released and went on to become one of the most widely used natural language analysis toolkits.

2011

IBM’s Watson, a question-answering computer system, used its NLP abilities to compete on the hit US quiz show Jeopardy!. In the same year, Apple introduced its first virtual assistant, Siri.

2012

Computer scientist Geoffrey Hinton and two of his students at the University of Toronto built an AI system using Convolutional Neural Networks (CNNs) that roughly halved the error rate of existing visual recognition models, winning the ImageNet competition.

2017

Transformer-based models were introduced as an improvement over RNNs and LSTMs. The following year, Google built on the Transformer architecture to create one of the first widely adopted Large Language Models: Bidirectional Encoder Representations from Transformers (BERT).

Fast forward to today, and the most well-known examples of LLMs include OpenAI’s GPT series and Google’s BERT. These modern models combine deep learning and NLP to tackle tasks that once seemed impossible - GPT generates human-like responses, while BERT excels at understanding and analyzing language. These innovations demonstrate the power of neural networks and the transformative intersection of deep learning and NLP. Take a look at this simple graph representing the AI terminology mentioned so far and where Large Language Models (LLMs) performing NLP tasks fit:

Stay tuned for our next blog post, where we’ll dive into transformer-based LLMs and uncover how they’ve redefined the boundaries of intelligence and creativity.

More Blog Posts Coming Soon!

Sign up to receive alerts when we publish more blog posts about how NLPatent has brought success to the field of IP and AI.
