Temporal Processing in TFT

Temporal processing refers to how the Temporal Fusion Transformer (TFT) explicitly models time dependencies, capturing patterns such as seasonality, trends, and event effects across multiple time horizons.

Detailed Explanation

Temporal processing within TFT involves:

1. Sequence Encoding:

Historical inputs are encoded into hidden representations by LSTM layers, which helps the model capture sequential patterns and long-term dependencies:

$$h_t = \text{LSTM}(x_t, h_{t-1})$$

where $h_t$ denotes the hidden state at time $t$ and $x_t$ the input at that step.
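
A minimal PyTorch sketch of this encoding step (the layer sizes and tensor shapes are illustrative assumptions, not TFT's actual hyperparameters):

```python
import torch
import torch.nn as nn

# Minimal sketch of the LSTM sequence encoder: h_t = LSTM(x_t, h_{t-1}).
class SequenceEncoder(nn.Module):
    def __init__(self, input_size: int = 8, hidden_size: int = 16):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, features) -> hidden state h_t for every time step
        hidden_states, _ = self.lstm(x)
        return hidden_states  # (batch, time, hidden_size)

history = torch.randn(32, 48, 8)   # 32 series, 48 past steps, 8 features
h = SequenceEncoder()(history)     # per-step hidden states h_t
```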

2. Multi-headed Attention:

A multi-head attention layer captures temporal relationships across time steps, allowing the model to focus on the historical periods most relevant to the forecast:

$$\text{Attention}(Q, K, V) = \text{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V$$

where $Q$, $K$, and $V$ are the query, key, and value matrices derived from the temporal features, and $d_k$ is the key dimension.
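
The formula translates directly into code. The single-head sketch below keeps the arithmetic visible (TFT itself uses an interpretable multi-head variant; all shapes here are illustrative):

```python
import math
import torch

def scaled_dot_product_attention(Q, K, V):
    # softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)
    weights = torch.softmax(scores, dim=-1)  # attention over time steps
    return weights @ V, weights

Q = torch.randn(32, 48, 16)  # queries/keys/values over 48 time steps
K = torch.randn(32, 48, 16)
V = torch.randn(32, 48, 16)
context, attn = scaled_dot_product_attention(Q, K, V)
```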

3. Temporal Fusion:

The network integrates information from the historical and known-future inputs through attention and gating mechanisms, creating rich, time-aware representations that support accurate forecasts.
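
A hedged sketch of this fusion step, in the spirit of TFT's gated residual connections (the GLU-style gate and the dimensions are assumptions for illustration, not the paper's exact module):

```python
import torch
import torch.nn as nn

# A sigmoid gate decides how much of the attention output to blend
# back into the temporal features, followed by a residual connection.
class GatedFusion(nn.Module):
    def __init__(self, d_model: int = 16):
        super().__init__()
        self.proj = nn.Linear(d_model, d_model)
        self.gate = nn.Linear(d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, temporal: torch.Tensor, attended: torch.Tensor) -> torch.Tensor:
        gated = torch.sigmoid(self.gate(attended)) * self.proj(attended)  # GLU-style gate
        return self.norm(temporal + gated)  # residual add + normalization

fused = GatedFusion()(torch.randn(32, 48, 16), torch.randn(32, 48, 16))
```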

Example

For a retail demand forecasting problem:

  • Historical data (sales, promotions, events) and known future data (upcoming promotions, holidays) are encoded.
  • Attention mechanisms highlight influential past periods (e.g., past holidays).
  • Temporal fusion blends these insights to enhance predictions (the sketch below wires the three steps together).
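
Putting the pieces together, a simplified end-to-end sketch reusing the components above (it collapses TFT's separate encoder/decoder LSTMs into one shared encoder and omits attention masking; all names and shapes are hypothetical):

```python
import torch

past = torch.randn(32, 48, 8)    # 48 weeks of history: sales, promotions, events, ...
future = torch.randn(32, 12, 8)  # 12 future weeks of *known* inputs: promos, holidays

encoder = SequenceEncoder()                    # step 1: encode history
h_past = encoder(past)
h_future = encoder(future)                     # known-future covariates, same feature space
h_all = torch.cat([h_past, h_future], dim=1)   # full 60-step temporal axis

context, attn = scaled_dot_product_attention(h_all, h_all, h_all)  # step 2
fused = GatedFusion()(h_all, context)                              # step 3
forecast_repr = fused[:, -12:, :]              # representations for the 12 forecast steps
```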

Visualization Example

  • Past Holidays: High attention during similar past holidays.
  • Seasonality: Recurrent patterns recognized and prioritized across years or seasons.
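
Assuming the `attn` weights from the sketch above, a small matplotlib snippet can surface such patterns, for example the attention the final forecast step pays to each past week (peaks around past holidays would suggest the model is reusing those periods):

```python
import matplotlib.pyplot as plt

# Series 0, final query position, attention over the 48 past keys.
weights = attn[0, -1, :48].detach().numpy()
plt.figure(figsize=(8, 3))
plt.bar(range(48), weights)
plt.xlabel("Past week")
plt.ylabel("Attention weight")
plt.title("Where the forecast looks in the history")
plt.tight_layout()
plt.show()
```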