From Encoder-Decoder to Transformer

Images are from https://zh-v2.d2l.ai/

Encoder-Decoder

Seq2Seq Learning

Attention Mechanism & Attention Score

Seq2Seq with Attention

Self-Attention

Positional Encoding

Multi-Head Attention

Transformer

Annotated Graph