AI Concepts

Artificial Intelligence, deep learning, machine learning: whatever you're doing, if you don't understand it, learn it. Because otherwise you're going to be a dinosaur within 3 years.
-- Mark Cuban

What are Large Language Models (LLMs)? (5:29)

Large Language Models

HOW AI IS CHATTING WITH YOU

AI is getting very good at conversation. This is all thanks to a powerful kind of neural network called a large language model, or LLM. LLMs enable computers to understand and generate language better than ever before. Anyone can get started building with them, whether you're a developer or not.

WHAT ARE LARGE LANGUAGE MODELS (LLMS)?

LLMs are machine learning models that are really good at understanding and generating human language. They are based on Transformers, a type of neural network architecture invented by Google. One model can be used for a whole variety of tasks, like chat, translation, summarization, brainstorming, code generation. You can prototype language applications incredibly fast with LLMs.
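The "one model, many tasks" point above can be sketched in a few lines: the task is encoded entirely in the prompt, not in the model. The `complete` function below is a hypothetical stand-in for any real LLM API call, and the template wording is illustrative.

```python
# Sketch: one LLM, many tasks -- each task is just a different prompt.
# `complete` is a hypothetical placeholder for a real LLM API call.
def complete(prompt: str) -> str:
    return f"<model output for: {prompt[:40]}...>"  # placeholder response

def make_prompt(task: str, text: str) -> str:
    # Illustrative task templates; real prompts would be tuned per use case.
    templates = {
        "summarize": "Summarize the following text:\n{text}",
        "translate": "Translate the following text to French:\n{text}",
        "code": "Write Python code that does the following:\n{text}",
    }
    return templates[task].format(text=text)

for task in ("summarize", "translate", "code"):
    print(complete(make_prompt(task, "LLMs are versatile.")))
```

Swapping the template is all it takes to repurpose the same model, which is why prototyping with LLMs is so fast.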

Large Language Models: Part 1 (10:12)

LEARNING LARGE LANGUAGE MODELS

In this video, you will learn all about large language models. The goal of a language model is to assign a probability to every sentence. To really model language, we need to somehow model things like grammar and style.

GRADIENT DESCENT IN NEURAL NETWORKS

To train our network, we need the gradient of the error function. This is a vector with eight partial derivatives, one for each weight. This process of calculating partial derivatives in a network is called backpropagation. It's the workhorse of neural networks.
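The eight-weight setup described above can be made concrete with a 2-input, 2-hidden, 2-output network with no biases (exactly eight weights). Real backpropagation computes the partial derivatives analytically by the chain rule; as a sketch, the code below approximates the same gradient vector with finite differences, then takes one gradient-descent step. All numbers here are illustrative.

```python
import math, random

# Tiny network: 2 inputs -> 2 hidden units -> 2 outputs, no biases,
# so exactly eight weights. We approximate the gradient of the error
# with finite differences (backprop would compute it analytically).
random.seed(0)
w = [random.uniform(-1, 1) for _ in range(8)]
x, target = [0.5, -0.2], [1.0, 0.0]

def error(w):
    h = [math.tanh(w[0] * x[0] + w[1] * x[1]),   # hidden layer
         math.tanh(w[2] * x[0] + w[3] * x[1])]
    y = [w[4] * h[0] + w[5] * h[1],              # output layer
         w[6] * h[0] + w[7] * h[1]]
    return sum((yi - ti) ** 2 for yi, ti in zip(y, target))

eps = 1e-6
before = error(w)
grad = [(error(w[:i] + [w[i] + eps] + w[i + 1:]) - before) / eps
        for i in range(8)]          # eight partial derivatives
print(grad)

lr = 0.1                            # one gradient-descent step
w = [wi - lr * gi for wi, gi in zip(w, grad)]
after = error(w)
print(before, "->", after)
```

Stepping the weights against the gradient lowers the error, which is the whole training loop in miniature.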

NEURAL NETWORKS: UNIVERSAL

Neural networks are universal. You can fit any function. But how about this function, which models language? How do we go about designing a neural network to model language?

Large Language Models from scratch
https://www.youtube.com/watch?v=lnA9DMvHtfI

Large Language Models: Part 2 (7:15)

NEURAL NETWORKS FOR LANGUAGE

Given some text like this, we would like to predict the last word. What if we could train a neural network to solve this attention problem? We can do this by stacking many of these prediction-attention layers; stacking more layers gives the network a lot more capacity.
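The attention layers mentioned above are built from one core operation: scaled dot-product self-attention, where each position mixes information from every position, weighted by query-key similarity. A minimal single-head sketch in pure Python, with made-up token vectors:

```python
import math

# Minimal single-head self-attention, the core of each Transformer layer.
def softmax(xs):
    m = max(xs)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    d = len(K[0])
    out = []
    for q in Q:
        # similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)     # attention weights sum to 1
        # output = weighted average of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# three toy "token" vectors of dimension 2; Q = K = V for simplicity
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(attention(X, X, X))
```

Because the weights are a softmax, every output row is a convex combination of the value vectors; stacking many such layers (plus feed-forward sublayers, omitted here) is what gives a Transformer its capacity.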

TRAINING LARGE LANGUAGE MODELS IN PYTHON

Today's large language models have read half a trillion words. Training GPT-3 would take 355 years on a single GPU. But Transformers are designed to be highly parallelizable, so spread across many GPUs, training can be done in about a month. These models are not perfect, but if you need cooking ideas, they can help.
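The 355-years-to-one-month claim implies a rough GPU count, which is easy to check with back-of-the-envelope arithmetic (this assumes perfect parallel scaling, which real training only approximates):

```python
# If one GPU needs ~355 years, how many GPUs finish in ~1 month,
# assuming (unrealistically) perfect parallel scaling?
single_gpu_years = 355
target_months = 1

gpus_needed = single_gpu_years * 12 / target_months
print(f"~{gpus_needed:.0f} GPUs")  # 355 * 12 = 4260
```

So "about a month" corresponds to a cluster of roughly four thousand GPUs under this idealized assumption.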

Why Large Language Models Hallucinate (9:25)

THREE FACTS THAT ARE NOT REAL

The distance from the Earth to the moon is 54 million km. Before working at IBM, I worked at a major Australian airline. The James Webb telescope took the very first pictures of an exoplanet outside of our solar system. All three statements are examples of hallucinations by a large language model.

WHAT ARE HALLUCINATIONS IN LARGE LANGUAGE MODELS?

Large language models can generate fluent and coherent text on various topics and domains. But they are also prone to making things up: plausible-sounding nonsense. Hallucinations are outputs of LLMs that deviate from facts or contextual logic. They can range from minor inconsistencies to completely fabricated or contradictory statements.

WHY DO LLM HALLUCINATIONS HAPPEN?

There are a number of common causes. One is data quality: LLMs are trained on large corpora of text that may contain noise, errors, biases, or inconsistencies. Another common cause of hallucinations is input context. As LLM reasoning capabilities improve, hallucinations tend to decline.

HOW TO PREVENT LLM HALLUCINATIONS

What can we do to reduce hallucinations in our own conversations with LLMs? One thing we can certainly do is provide clear and specific prompts to the system. Another is multi-shot prompting, which can be particularly useful in tasks that require a specific output format.
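Multi-shot prompting is just string construction: prepend a few worked examples so the model infers the required output format before answering. The examples, labels, and layout below are illustrative, not any specific API's format.

```python
# Sketch of multi-shot (few-shot) prompting: worked examples teach
# the model the expected output format. Content is illustrative.
examples = [
    ("Paris is the capital of France.", "fact"),
    ("The moon is made of cheese.", "not a fact"),
]
question = "Water boils at 100 C at sea level."

prompt = "Label each statement as 'fact' or 'not a fact'.\n\n"
for text, label in examples:
    prompt += f"Statement: {text}\nLabel: {label}\n\n"
prompt += f"Statement: {question}\nLabel:"   # model completes from here
print(prompt)
```

Ending the prompt mid-pattern ("Label:") nudges the model to complete it in the same constrained format, which is why this helps with format-sensitive tasks.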


Why Large Language Models Hallucinate
https://www.youtube.com/watch?v=cfqtFvWOfg0

Risks of Large Language Models (8:17)

  • Risk 1: Hallucinations (aka Falsehoods)
    • Strategy: Explainability
  • Risk 2: Bias 
    • Strategy: Culture and Audit
  • Risk 3: Consent 
    • Strategy: Accountability
  • Risk 4: Security
    • Strategy: Education

Risks of LLMs
https://www.youtube.com/watch?v=r4kButlDLUc

What Are Transformers? (5:37)

Transformers, composed of multiple self-attention layers, hold strong promise as a generic learning primitive applicable to different data modalities, including the recent breakthroughs in computer vision achieving state-of-the-art accuracy.

Large Language Models Are Zero Shot Reasoners (7:46)

When you create a prompt for a large language model, are the answers sometimes wrong? It may be you, or more accurately, the way you are formulating your question.
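One reformulation the "Zero-Shot Reasoners" work is known for is appending the trigger phrase "Let's think step by step." to the prompt, which elicits step-by-step reasoning without any worked examples. A minimal sketch (the question is invented for illustration):

```python
# Sketch of zero-shot chain-of-thought prompting: the only change is
# a reasoning trigger phrase appended to the prompt.
def zero_shot(question: str) -> str:
    return f"Q: {question}\nA:"

def zero_shot_cot(question: str) -> str:
    # Trigger phrase from the "Zero-Shot Reasoners" paper.
    return f"Q: {question}\nA: Let's think step by step."

q = "If I have 3 apples and buy 2 more, how many do I have?"
print(zero_shot(q))
print(zero_shot_cot(q))
```

Same model, same question; only the formulation changes, which is exactly the point the video makes.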