From course:

Intro to AI 2

Question:

Transformer - Architecture

Author: Christian N



Answer:

Transformers are the basic architecture used in NLP (chatbots, translators).
• Unlike LSTMs, they do not process the input sequentially -> high parallelization
• They need positional encoding to specify the position of each word
• They use multiple attention layers to keep track of important information across sentences
• The output sentence is produced word by word
• At each step, the output of the network is a probability distribution over all words in the dictionary, used to predict the next word in the sentence
• The process stops only when <EOS> is predicted
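The card does not say which positional encoding scheme is meant; a common choice is the fixed sinusoidal encoding from the original Transformer paper. A minimal sketch in plain Python (the function name and sizes are illustrative):

```python
import math

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding:
    pe[pos][2i]   = sin(pos / 10000^(2i / d_model))
    pe[pos][2i+1] = cos(pos / 10000^(2i / d_model))
    Each position gets a unique vector, so the model can tell
    word order apart even though it processes tokens in parallel."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe
```

These vectors are simply added to the word embeddings before the first attention layer.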

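The attention layers mentioned above are built from scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. A dependency-free sketch on nested lists (real implementations use tensor libraries; this only illustrates the arithmetic):

```python
import math

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: lists of row vectors. Returns, for each query,
    a weighted sum of value vectors where the weights are a
    softmax over the scaled query-key dot products."""
    d_k = len(K[0])
    # scores[i][j] = (q_i . k_j) / sqrt(d_k)
    scores = [[sum(q * k for q, k in zip(qi, kj)) / math.sqrt(d_k)
               for kj in K] for qi in Q]
    # Row-wise softmax (shift by the max for numerical stability).
    weights = []
    for row in scores:
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        weights.append([e / z for e in exps])
    # Weighted sum of the value vectors.
    return [[sum(w * V[j][d] for j, w in enumerate(wr))
             for d in range(len(V[0]))] for wr in weights]
```

Because every query attends to every key in one matrix product, all positions are processed at once, which is the source of the parallelization advantage over LSTMs.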

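The word-by-word generation loop can be sketched as greedy decoding: at each step take the most probable word from the distribution and stop at <EOS>. The toy model below is purely illustrative (its name and probability table are made up for the example):

```python
def greedy_decode(step_fn, vocab, eos="<EOS>", max_len=20):
    """Autoregressive decoding: step_fn maps the words generated so
    far to a probability distribution over the vocabulary; pick the
    argmax each step and stop once <EOS> is predicted."""
    out = []
    for _ in range(max_len):
        probs = step_fn(out)  # one probability per vocabulary word
        word = vocab[max(range(len(vocab)), key=probs.__getitem__)]
        if word == eos:
            break
        out.append(word)
    return out

# Hypothetical stand-in for a trained model: favors "hello",
# then "world", then <EOS>.
vocab = ["hello", "world", "<EOS>"]
def toy_step(prefix):
    table = [[0.8, 0.1, 0.1],
             [0.1, 0.8, 0.1],
             [0.1, 0.1, 0.8]]
    return table[min(len(prefix), 2)]
```

Real systems often replace the argmax with beam search or sampling, but the stopping condition on <EOS> is the same.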