pipeline/processors/tda.md
12 additions & 9 deletions
@@ -1,9 +1,9 @@
1
1
# TDA
2
2
3
-
The `tda` processor applies **Topological Data Analysis (TDA)** – specifically, **persistent homology** – to Fluent Bit’s metrics stream and exports **Betti numbers** that summarize the shape of recent behavior in metric space.
4
-
5
-
This processor is intended for detecting **phase transitions**, **regime changes**, and **intermittent instabilities** that are hard to see from individual counters, gauges, or standard statistical aggregates. It can, for example, differentiate between a single, one-off failure and an extended period of intermittent failures where the system never settles into a stable regime.
3
+
The `tda` processor applies **Topological Data Analysis (TDA)**—specifically, **persistent homology**—to Fluent Bit's metrics stream and exports **Betti numbers** that summarize the shape of recent behavior in metric space.
6
4
5
+
This processor is intended for detecting **phase transitions**, **regime changes**, and **intermittent instabilities** that are difficult to detect from individual counters, gauges, or standard statistical aggregates.
6
+
It can, for example, differentiate between a single, one-off failure and an extended period of intermittent failures where the system never settles into a stable regime.
7
7
Currently, `tda` works only in the **metrics pipeline** (`processors.metrics`).
8
8
9
9
---
@@ -48,7 +48,8 @@ On each metrics flush, `tda`:
48
48
To stabilize very different magnitudes and bursty traffic, each rate is mapped to
49
49
`norm = log1p(|rate|)`, and the sign of `rate` is reattached. This yields a vector that is roughly scale-invariant but still sensitive to relative changes in rates across groups.
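The signed normalization described above can be sketched in a few lines of Python. This is only an illustration of the formula `norm = log1p(|rate|)` with the sign reattached; the function names `signed_log1p` and `normalize` are hypothetical and do not come from the plugin source.

```python
import math


def signed_log1p(rate: float) -> float:
    """Return sign(rate) * log1p(|rate|) for a single rate value."""
    # copysign reattaches the sign of `rate` to the compressed magnitude.
    return math.copysign(math.log1p(abs(rate)), rate)


def normalize(rates):
    """Apply the signed log1p mapping to every component of a rate vector."""
    return [signed_log1p(r) for r in rates]
```

Because `log1p` grows logarithmically, a rate of 10 and a rate of 10,000 end up a few units apart instead of several orders of magnitude, while a rate of zero still maps to zero.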
50
50
51
-
The resulting normalized vector is written into a **ring buffer window** (`tda_window`), implemented via a lightweight circular buffer (`lwrb`) that stores timestamped samples. The window maintains at most `window_size` samples; older samples are dropped when the buffer is full.
51
+
The resulting normalized vector is written into a **ring buffer window** (`tda_window`), implemented through a lightweight circular buffer (`lwrb`) that stores timestamped samples.
52
+
The window maintains at most `window_size` samples; older samples are dropped when the buffer is full.
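The drop-oldest behavior of the window can be illustrated with a bounded deque. This is a Python stand-in for the idea, not the `lwrb`-based C implementation; the `Window` class and its `push` method are hypothetical names used only for this sketch.

```python
from collections import deque


class Window:
    """Fixed-capacity window of (timestamp, vector) samples."""

    def __init__(self, window_size: int):
        # A deque with maxlen silently discards the oldest entry when full.
        self.samples = deque(maxlen=window_size)

    def push(self, timestamp: float, vector) -> None:
        """Append a timestamped sample, evicting the oldest one if needed."""
        self.samples.append((timestamp, list(vector)))

    def __len__(self) -> int:
        return len(self.samples)
```

After pushing five samples into `Window(3)`, only the three most recent remain.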
52
53
53
54
### 2. Sliding window and delay embedding
54
55
@@ -65,7 +66,7 @@ $$
65
66
66
67
where each `x_·` is the **D-dimensional normalized metrics vector** at that time. This yields embedded points in $\mathbb{R}^{mD}$.
67
68
68
-
Because we need all lags to be inside the window, the number of embedded points is:
69
+
Because all lags must be inside the window, the number of embedded points is:
69
70
70
71
$$
71
72
n_{\text{embed}} = n_{\text{raw}} - (m - 1)\tau
@@ -77,8 +78,8 @@ This embedding follows the idea of **Takens’ theorem**, which states that, und
77
78
78
79
Intuitively:
79
80
80
-
* `embed_dim = 1`: you see only the current “snapshot” geometry.
81
-
* `embed_dim > 1`: you expose **loops and recurrent trajectories** in the joint evolution of metrics, which later show up as **H₁ (Betti₁) features**.
81
+
* `embed_dim = 1`: only the current "snapshot" geometry is visible.
82
+
* `embed_dim > 1`: **loops and recurrent trajectories** in the joint evolution of metrics become visible, which later show up as **H₁ (Betti₁) features**.
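A minimal sketch of the delay embedding may help make the point count concrete. It assumes each embedded point is the concatenation of `m` window samples spaced `tau` apart; the ordering of lags within a point and the names `delay_embed`, `m`, and `tau` are assumptions of this example, not taken from the plugin source.

```python
def delay_embed(samples, m: int, tau: int):
    """Build delay-embedded points from a list of D-dimensional samples.

    Returns n_raw - (m - 1) * tau points, each of dimension m * D,
    or an empty list when the window is too short for all lags.
    """
    n_embed = len(samples) - (m - 1) * tau
    if n_embed <= 0:
        return []
    points = []
    for i in range(n_embed):
        point = []
        for k in range(m):
            # Concatenate the sample at lag k * tau onto the embedded point.
            point.extend(samples[i + k * tau])
        points.append(point)
    return points
```

With 10 raw samples, `m = 3`, and `tau = 2`, this yields `10 - (3 - 1) * 2 = 6` embedded points, each three times the dimension of a raw sample. With `m = 1` the embedding reduces to the raw samples.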
* The off-diagonal distances are collected, sorted, and several quantiles are evaluated, e.g. `q ∈ {0.10, 0.20, …, 0.90}`.
100
+
* The off-diagonal distances are collected, sorted, and several quantiles are evaluated, for example `q ∈ {0.10, 0.20, …, 0.90}`.
100
101
* For each candidate quantile `q`, a threshold `r_q` is chosen and Betti numbers are computed using Ripser.
101
102
* The plugin prefers the scale where **Betti₁** (loops) is maximized; if all Betti₁ are zero, it falls back to Betti₀ as a secondary indicator.
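The quantile-based choice of scale can be sketched as follows. This example only covers the distance quantiles and Betti₀ (the number of connected components at a given threshold); Betti₁ requires a persistent homology library such as Ripser and is not reproduced here. The nearest-rank quantile rule, the use of the quantile value itself as `r_q`, the `<=` comparison, and all function names are assumptions of this sketch.

```python
import math
from itertools import combinations


def sorted_distances(points):
    """Return all off-diagonal pairwise Euclidean distances, sorted."""
    return sorted(math.dist(p, q) for p, q in combinations(points, 2))


def quantile(sorted_vals, q: float) -> float:
    """Nearest-rank style quantile of an already sorted list (assumed rule)."""
    idx = min(len(sorted_vals) - 1, int(q * len(sorted_vals)))
    return sorted_vals[idx]


def betti0(points, r: float) -> int:
    """Count connected components when points within distance r are linked."""
    parent = list(range(len(points)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= r:
            parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})
```

For two well-separated pairs of points, a small threshold reports two components and a large one merges them into a single component, which is the kind of scale dependence the quantile sweep is probing.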
102
103
@@ -174,7 +175,9 @@ Some practical patterns:
174
175
* The trajectory in phase space forms **loops**: metrics move away from the healthy region and then return, many times.
175
176
* Betti₁ (and occasionally Betti₂) increases noticeably while this behavior persists, reflecting the emergence of non-trivial cycles in the metric dynamics.
176
177
177
-
In the sample output, as the HTTP output oscillates between success and various `Connection refused` / `broken connection` errors, `fluentbit_tda_betti1` and `fluentbit_tda_betti2` grow from small values to larger plateaus (e.g., Betti₁ around 10–13, Betti₂ around 1–2) while Betti₀ also increases. This is a direct signature of a **phase transition** from a stable regime to one with persistent, intermittent instability.
178
+
In the sample output, the HTTP output oscillates between success and various "Connection refused" and "broken connection" errors.
179
+
As this occurs, `fluentbit_tda_betti1` and `fluentbit_tda_betti2` grow from small values to larger plateaus (for example, Betti₁ around 10–13, Betti₂ around 1–2) while Betti₀ also increases.
180
+
This is a direct signature of a **phase transition** from a stable regime to one with persistent, intermittent instability.
178
181
179
182
These interpretations are consistent with results from condensed matter physics and dynamical systems, where persistent homology has been used to detect phase transitions and changes in underlying order purely from data (References 1 and 2).