AI Concepts

Artificial Intelligence, deep learning, machine learning: whatever you're doing, if you don't understand it, learn it. Because otherwise you're going to be a dinosaur within 3 years.
-- Mark Cuban

What are Large Language Models (LLMs)? (5:29)

Large Language Models

HOW AI IS CHATTING WITH YOU

AI is getting very good at conversation. This is all thanks to a powerful kind of neural network called a large language model, or LLM. LLMs enable computers to understand and generate language better than ever before. Anyone can get started building with them, whether you're a developer or not.

WHAT ARE LARGE LANGUAGE MODELS (LLMS)?

LLMs are machine learning models that are really good at understanding and generating human language. They are based on Transformers, a type of neural network architecture invented by Google. One model can be used for a whole variety of tasks, like chat, translation, summarization, brainstorming, code generation. You can prototype language applications incredibly fast with LLMs.
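The "one model, many tasks" point above can be sketched in a few lines: the task is encoded entirely in the prompt, not in the model. The `complete` function below is a hypothetical stand-in for any real LLM API call, and the template wording is illustrative.

```python
# Sketch: one LLM, many tasks -- each task is just a different prompt.
# `complete` is a hypothetical placeholder for a real LLM API call.
def complete(prompt: str) -> str:
    return f"<model output for: {prompt[:40]}...>"  # placeholder response

def make_prompt(task: str, text: str) -> str:
    # Illustrative task templates; real prompts would be tuned per use case.
    templates = {
        "summarize": "Summarize the following text:\n{text}",
        "translate": "Translate the following text to French:\n{text}",
        "code": "Write Python code that does the following:\n{text}",
    }
    return templates[task].format(text=text)

for task in ("summarize", "translate", "code"):
    print(complete(make_prompt(task, "LLMs are versatile.")))
```

Swapping the template is all it takes to repurpose the same model, which is why prototyping with LLMs is so fast.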

Large Language Models: Part 1 (10:12)

LEARNING LARGE LANGUAGE MODELS

In this video, you will learn all about large language models. The goal of a language model is to assign a probability to every sentence. To really model language, we need to somehow model things like grammar and style.

GRADIENT DESCENT IN NEURAL NETWORKS

To train our network, we need the gradient of the error function. This is a vector with eight partial derivatives, one for each weight. This process of calculating partial derivatives in a network is called backpropagation. It's the workhorse of neural networks.
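The eight-weight setup described above can be made concrete with a 2-input, 2-hidden, 2-output network with no biases (exactly eight weights). Real backpropagation computes the partial derivatives analytically by the chain rule; as a sketch, the code below approximates the same gradient vector with finite differences, then takes one gradient-descent step. All numbers here are illustrative.

```python
import math, random

# Tiny network: 2 inputs -> 2 hidden units -> 2 outputs, no biases,
# so exactly eight weights. We approximate the gradient of the error
# with finite differences (backprop would compute it analytically).
random.seed(0)
w = [random.uniform(-1, 1) for _ in range(8)]
x, target = [0.5, -0.2], [1.0, 0.0]

def error(w):
    h = [math.tanh(w[0] * x[0] + w[1] * x[1]),   # hidden layer
         math.tanh(w[2] * x[0] + w[3] * x[1])]
    y = [w[4] * h[0] + w[5] * h[1],              # output layer
         w[6] * h[0] + w[7] * h[1]]
    return sum((yi - ti) ** 2 for yi, ti in zip(y, target))

eps = 1e-6
before = error(w)
grad = [(error(w[:i] + [w[i] + eps] + w[i + 1:]) - before) / eps
        for i in range(8)]          # eight partial derivatives
print(grad)

lr = 0.1                            # one gradient-descent step
w = [wi - lr * gi for wi, gi in zip(w, grad)]
after = error(w)
print(before, "->", after)
```

Stepping the weights against the gradient lowers the error, which is the whole training loop in miniature.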

NEURAL NETWORKS: UNIVERSAL

Neural networks are universal. You can fit any function. But how about this function, which models language? How do we go about designing a neural network to model language?

Large Language Models from scratch
https://www.youtube.com/watch?v=lnA9DMvHtfI

Large Language Models: Part 2 (7:15)

NEURAL NETWORKS FOR LANGUAGE

Given some text like this, we would like to predict the last word. What if we could train a neural network to solve this attention problem? We can do this by stacking many of these prediction-attention layers; stacking more layers gives the network a lot more capacity.
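The attention layers mentioned above are built from one core operation: scaled dot-product self-attention, where each position mixes information from every position, weighted by query-key similarity. A minimal single-head sketch in pure Python, with made-up token vectors:

```python
import math

# Minimal single-head self-attention, the core of each Transformer layer.
def softmax(xs):
    m = max(xs)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    d = len(K[0])
    out = []
    for q in Q:
        # similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)     # attention weights sum to 1
        # output = weighted average of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# three toy "token" vectors of dimension 2; Q = K = V for simplicity
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(attention(X, X, X))
```

Because the weights are a softmax, every output row is a convex combination of the value vectors; stacking many such layers (plus feed-forward sublayers, omitted here) is what gives a Transformer its capacity.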

TRAINING LARGE LANGUAGE MODELS IN PYTHON

Today's large language models have read half a trillion words. Training GPT-3 would take 355 years on a single GPU. But Transformers are designed to be highly parallelizable, so spread across many GPUs, training can be done in about a month. These models are not perfect, but if you need cooking ideas, they can help.
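The 355-years-to-one-month claim implies a rough GPU count, which is easy to check with back-of-the-envelope arithmetic (this assumes perfect parallel scaling, which real training only approximates):

```python
# If one GPU needs ~355 years, how many GPUs finish in ~1 month,
# assuming (unrealistically) perfect parallel scaling?
single_gpu_years = 355
target_months = 1

gpus_needed = single_gpu_years * 12 / target_months
print(f"~{gpus_needed:.0f} GPUs")  # 355 * 12 = 4260
```

So "about a month" corresponds to a cluster of roughly four thousand GPUs under this idealized assumption.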

Why Large Language Models Hallucinate (9:25)

THREE FACTS THAT ARE NOT REAL

The distance from the Earth to the moon is 54 million km. Before working at IBM, I worked at a major Australian airline. The James Webb telescope took the very first pictures of an exoplanet outside of our solar system. All three statements are examples of hallucinations by a large language model.

WHAT ARE HALLUCINATIONS IN LARGE LANGUAGE MODELS?

Large language models can generate fluent and coherent text on various topics and domains. But they are also prone to making things up: plausible-sounding nonsense. Hallucinations are outputs of LLMs that deviate from facts or contextual logic. They can range from minor inconsistencies to completely fabricated or contradictory statements.

WHY DO LLM HALLUCINATIONS HAPPEN?

There are a number of common causes. One is data quality: LLMs are trained on large corpora of text that may contain noise, errors, biases, or inconsistencies. Another common cause of hallucinations is input context. As LLM reasoning capabilities improve, hallucinations tend to decline.

HOW TO PREVENT LLM HALLUCINATIONS

What can we do to reduce hallucinations in our own conversations with LLMs? One thing we can certainly do is provide clear and specific prompts to the system. Another is multi-shot prompting, which can be particularly useful in tasks that require a specific output format.
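Multi-shot prompting is just string construction: prepend a few worked examples so the model infers the required output format before answering. The examples, labels, and layout below are illustrative, not any specific API's format.

```python
# Sketch of multi-shot (few-shot) prompting: worked examples teach
# the model the expected output format. Content is illustrative.
examples = [
    ("Paris is the capital of France.", "fact"),
    ("The moon is made of cheese.", "not a fact"),
]
question = "Water boils at 100 C at sea level."

prompt = "Label each statement as 'fact' or 'not a fact'.\n\n"
for text, label in examples:
    prompt += f"Statement: {text}\nLabel: {label}\n\n"
prompt += f"Statement: {question}\nLabel:"   # model completes from here
print(prompt)
```

Ending the prompt mid-pattern ("Label:") nudges the model to complete it in the same constrained format, which is why this helps with format-sensitive tasks.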


Why Large Language Models Hallucinate
https://www.youtube.com/watch?v=cfqtFvWOfg0

Risks of Large Language Models (8:17)

  • Risk 1: Hallucinations (aka Falsehoods)
    • Strategy: Explainability
  • Risk 2: Bias 
    • Strategy: Culture and Audit
  • Risk 3: Consent 
    • Strategy: Accountability
  • Risk 4: Security
    • Strategy: Education

Risks of LLMs
https://www.youtube.com/watch?v=r4kButlDLUc

What Are Transformers? (5:37)

Transformers, composed of multiple self-attention layers, hold strong promise as a generic learning primitive applicable to different data modalities, including the recent breakthroughs in computer vision achieving state-of-the-art accuracy.

Large Language Models Are Zero Shot Reasoners (7:46)

When you create a prompt for a large language model, are the answers sometimes wrong? It may be you, or more accurately, the way you are formulating your question.
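One reformulation the "Zero-Shot Reasoners" work is known for is appending the trigger phrase "Let's think step by step." to the prompt, which elicits step-by-step reasoning without any worked examples. A minimal sketch (the question is invented for illustration):

```python
# Sketch of zero-shot chain-of-thought prompting: the only change is
# a reasoning trigger phrase appended to the prompt.
def zero_shot(question: str) -> str:
    return f"Q: {question}\nA:"

def zero_shot_cot(question: str) -> str:
    # Trigger phrase from the "Zero-Shot Reasoners" paper.
    return f"Q: {question}\nA: Let's think step by step."

q = "If I have 3 apples and buy 2 more, how many do I have?"
print(zero_shot(q))
print(zero_shot_cot(q))
```

Same model, same question; only the formulation changes, which is exactly the point the video makes.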