Natural Language Processing – An Overview

What is NLP?

“Natural Language Processing (NLP) investigates the use of computers to process or to understand human (i.e., natural) languages for the purpose of performing useful tasks. NLP is an interdisciplinary field that combines computational linguistics, computing science, cognitive science, and artificial intelligence. From a scientific perspective, NLP aims to model the cognitive mechanisms underlying the understanding and production of human languages. From an engineering perspective, NLP is concerned with how to develop novel practical applications to facilitate the interactions between computers and human languages. Typical applications in NLP include speech recognition, spoken language understanding, dialogue systems, lexical analysis, parsing, machine translation, knowledge graph, information retrieval, question answering, sentiment analysis, social computing, natural language generation, and natural language summarization.”

The approach holding that knowledge of language in the human mind is fixed in advance by genetic inheritance dominated most NLP research before the late 1980s. This view has been called the rationalist approach.

In contrast to the rationalist approach, the empirical approach assumes “that the human mind only begins with general operations for association, pattern recognition, and generalization. Rich sensory input is required to enable the mind to learn the detailed structure of natural language.”

Advances in artificial intelligence, and in particular the rapid progress in deep learning, are the major driving force behind the current state of NLP.

Basic concepts of NLP

Embeddings

Embeddings are one of the most important concepts in NLP. An embedding is a procedure for converting input data into a vector representation. A word embedding, for example, is a real-valued vector representation of a word.
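
As a minimal sketch, the example below uses PyTorch's nn.Embedding layer with a hypothetical toy vocabulary; the words, vocabulary, and embedding dimension are assumptions chosen only for illustration, and in practice the vectors are learned during training (e.g., word2vec, GloVe, or as part of a larger neural network).

```python
import torch
import torch.nn as nn

# Toy vocabulary (hypothetical): each word index maps to a learnable
# real-valued vector of dimension 8.
vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)

sentence = ["the", "cat", "sat"]
indices = torch.tensor([vocab[w] for w in sentence])

vectors = embedding(indices)   # one 8-dimensional vector per word
print(vectors.shape)           # torch.Size([3, 8])
```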

Transformer

“Transformers are deep learning models that have achieved state-of-the-art performance in several fields such as natural language processing, computer vision, and speech recognition.” (Refer to the article “Attention Is All You Need” by Ashish Vaswani and others.)
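
As a rough illustration, the sketch below stacks two of PyTorch's built-in Transformer encoder layers over a random input; the model dimension, number of heads, number of layers, and sequence length are arbitrary assumptions chosen for the example.

```python
import torch
import torch.nn as nn

# Two stacked Transformer encoder layers over a toy input sequence.
layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2)

x = torch.randn(1, 10, 64)   # (batch, sequence length, model dimension)
out = encoder(x)             # contextualized representations, same shape
print(out.shape)             # torch.Size([1, 10, 64])
```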

Encoder-Decoder

The job of the encoder is to encode the information from the input sequence into a numerical representation that is often called the last hidden state. This state is then passed to the decoder, which generates the output sequence.
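
A minimal sketch of this idea, assuming a GRU-based encoder and decoder and arbitrary toy dimensions, is shown below: the encoder's last hidden state is passed to the decoder, which is then conditioned on it while producing the output sequence.

```python
import torch
import torch.nn as nn

# Encoder compresses the input sequence into its last hidden state;
# the decoder starts from that state to generate the output sequence.
encoder = nn.GRU(input_size=16, hidden_size=32, batch_first=True)
decoder = nn.GRU(input_size=16, hidden_size=32, batch_first=True)

src = torch.randn(1, 7, 16)              # input sequence (e.g., source embeddings)
_, last_hidden = encoder(src)            # last hidden state: (1, 1, 32)

tgt = torch.randn(1, 5, 16)              # target-side inputs (e.g., shifted outputs)
dec_out, _ = decoder(tgt, last_hidden)   # decoder conditioned on the encoder state
print(dec_out.shape)                     # torch.Size([1, 5, 32])
```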

Attention

What an attention mechanism computes is a context (query)-dependent summary of the input. The issue with RNN-like models is that they struggle to carry long-range context: they suffer from vanishing and exploding gradient problems, and while LSTMs mitigate the gradient issues, they still process tokens one at a time and so do not solve the parallelization problem of RNNs. With attention, every position can attend to every other position directly, which resolves the parallelization problem.

(Refer to the article “Attention Is All You Need” by Ashish Vaswani and others.)
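
The scaled dot-product attention from the paper above can be written down in a few lines; the sketch below uses PyTorch, with toy shapes chosen only for illustration.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, as in Vaswani et al."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # similarity of each query to each key
    weights = torch.softmax(scores, dim=-1)            # context-dependent weights over the input
    return weights @ v                                 # weighted summary of the values

# Toy shapes (arbitrary assumptions): 4 query positions, 6 key/value positions, d_k = 8.
q = torch.randn(4, 8)
k = torch.randn(6, 8)
v = torch.randn(6, 8)
print(scaled_dot_product_attention(q, k, v).shape)     # torch.Size([4, 8])
```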

Transfer Learning

Transfer learning is a technique in machine learning (ML) in which knowledge learned from a task is re-used in order to boost performance on a related task. For example, for image classification, knowledge gained while learning to recognize cars could be applied when trying to recognize trucks. (From Wikipedia)
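
In NLP, a common form of transfer learning is to reuse a pre-trained encoder and fine-tune it on a target task. The sketch below, using the Hugging Face transformers library, is one illustrative possibility; the model name, label count, and example sentence are assumptions for demonstration only.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Reuse a pre-trained BERT encoder; only the new classification head on top
# needs to be trained for the target task (e.g., sentiment analysis).
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

inputs = tokenizer("This movie was great!", return_tensors="pt")
outputs = model(**inputs)        # logits for the 2 target-task classes
print(outputs.logits.shape)      # torch.Size([1, 2])
```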

Foundation model

A foundation model is a machine learning model that is pre-trained on a large amount of unlabeled data, typically using a transformer architecture. With fine-tuning, a foundation model can be put to use in many different functional areas.

Large Language Model (LLM)

An LLM is a type of foundation model that is primarily based on the Transformer architecture and is meant for understanding and generating human (natural) language.
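
As a small, hedged illustration, the sketch below uses the Hugging Face pipeline API with GPT-2, a small, freely available model standing in for a full-scale LLM; the prompt and generation length are arbitrary choices.

```python
from transformers import pipeline

# Generate a continuation of a prompt with a small pre-trained language model.
generator = pipeline("text-generation", model="gpt2")
result = generator("Natural Language Processing is", max_new_tokens=20)
print(result[0]["generated_text"])
```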

Reinforcement Learning from Human Feedback (RLHF)

After an LLM has been pre-trained, human feedback is sometimes used as part of the fine-tuning steps: human preferences over model outputs are collected and used, via reinforcement learning techniques, to further align the model. (Refer to the article on Reinforcement Learning.)
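
One ingredient of RLHF is a reward model trained on human preference pairs. The sketch below shows only that piece, the standard pairwise preference loss, with made-up reward scores used purely for illustration; the full RLHF pipeline (reward modeling plus reinforcement-learning fine-tuning of the LLM) is considerably more involved.

```python
import torch
import torch.nn.functional as F

# Reward-model scores for human-preferred ("chosen") and less-preferred
# ("rejected") responses; the numbers are made up for illustration.
reward_chosen = torch.tensor([1.7, 0.3])
reward_rejected = torch.tensor([0.9, -0.2])

# Pairwise preference loss: pushes the reward model to score the chosen
# response higher than the rejected one.
loss = -F.logsigmoid(reward_chosen - reward_rejected).mean()
print(loss)
```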

Generative AI

Generative AI builds on earlier successes such as Generative Adversarial Networks (refer to the article on GANs). It refers to artificial intelligence techniques that learn representations from existing artifacts (data) and use the learned models to generate new artifacts.

References

  1. Transfer Learning for Natural Language Processing, Manning Publications
  2. Real World Natural Language Processing, Manning Publications
  3. Attention Is All You Need by Ashish Vaswani and others
  4. Transformers for Machine Learning by Uday Kamath and others
  5. Deep Learning in Natural Language Processing by Li Deng and others
  6. A Comprehensive Survey of AI-Generated Content (AIGC) by Yihan Cao and others