Question:
Transformer - Add & Norm
Author: Christian N
Answer:
Add & Norm: the output of the previous sub-layer (e.g., the attention block) is added to that sub-layer's input (a residual connection; in the first block this is the input embedding), and the sum is passed through layer normalization.
Benefits - faster training, reduced bias, prevents weight explosion.
Types of normalization - batch normalization and layer normalization. *Layer normalization is preferable for transformers, especially for natural language processing tasks.
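A minimal sketch of the Add & Norm step, assuming PyTorch; the class name AddAndNorm, the model dimension of 512, and the tensor shapes are illustrative assumptions, not taken from any particular implementation.

import torch
import torch.nn as nn

class AddAndNorm(nn.Module):
    """Residual connection ("Add") followed by layer normalization ("Norm")."""
    def __init__(self, d_model: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)

    def forward(self, sublayer_input: torch.Tensor, sublayer_output: torch.Tensor) -> torch.Tensor:
        # "Add": sum the sub-layer's output with the input that entered it
        # "Norm": layer-normalize the sum over the feature dimension
        return self.norm(sublayer_input + sublayer_output)

# Usage example with hypothetical shapes (batch=2, sequence length=10, d_model=512)
x = torch.randn(2, 10, 512)          # input to the attention block (e.g., embeddings)
attn_out = torch.randn(2, 10, 512)   # stand-in for the attention block's output
add_norm = AddAndNorm(d_model=512)
y = add_norm(x, attn_out)            # same shape as the inputs

Note that layer normalization normalizes each token's feature vector independently, which is why it is preferred over batch normalization for variable-length language inputs.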