CTI Mathematical Methodology
How the Composite Threat Index is calculated, what changed, and why. We publish our mathematics openly so that every number on the dashboard can be independently verified.
What is the CTI?
The Composite Threat Index is a single 0–100 number computed daily from four sub-indices. Every input comes from the 22 public data sources listed in our guide – no classified data, no editorial judgement. Given identical inputs, the formula always produces the same score.
Mathematical Concepts Explained
Not a mathematician? Here is what each technique does and where to learn more.
- **Sigmoid function**: An S-shaped curve that smoothly maps any number to a bounded range (like 0–1). Think of it as a dimmer switch instead of an on/off light – tiny input changes never cause sudden score jumps.
- **CUSUM (Cumulative Sum)**: Adds up small daily deviations from normal. When the running total crosses a threshold, a change has occurred – even if no single day looked alarming. Proven to be the fastest possible detector for a given false alarm rate.
- **EWMA**: Exponentially Weighted Moving Average. A smoothed average where yesterday matters more than last week. We use it to estimate the current "normal" level of the threat index.
- **Shannon entropy**: Measures concentration vs diversity. High entropy = many narratives active (organic). Low entropy = one narrative dominates (likely a coordinated campaign).
- **KL divergence**: Kullback–Leibler divergence. Measures how different today is from the baseline. Zero = "exactly normal." Large value = "something shifted" – which narrative is being pushed harder than usual?
- **Dempster-Shafer theory**: Combines evidence from sources that may disagree. Unlike averaging, it tracks how much sources conflict – which is itself intelligence. Satellite says HIGH, OSINT says LOW? The disagreement is reported, not hidden.
- **z-score**: How many standard deviations from average. A z-score of 3 = "this is very unusual." We use it to flag when any sensor reading is far outside its normal range.
- **Logarithm**: Makes large numbers manageable. Going from 100 to 200 signals matters more than 1000 to 1100 – early growth matters, later growth has diminishing returns.
What Changed
We replaced hand-tuned heuristics with calibrated, mathematically proven formulas. The goal: the same score, but one you can trust – with the maths to back it. Below is what changed and why.
**Before:** Step functions with hard jumps. Inflation at 4.99% scored 0 points; at 5.01% it jumped to 3 points. One data-point change could swing the CTI by 3.
**After:** Sigmoid calibration functions. The score rises smoothly through the threshold – no discontinuities, no phantom jumps. The curve is fit to the same expert thresholds, so the midpoints match.
Inflation example: $S(\text{CPI};\,1.0,\,4.0,\,3)$ – a smooth S-curve centred at 4%, max 3 pts. Beyond the expert-chosen midpoint and slope, the sigmoid encodes no additional assumptions about how the score should rise.
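The calibration curve fits in a few lines. This is an illustrative sketch, assuming the parameter order $S(x;\,k,\,x_0,\,L)$ (steepness, midpoint, maximum) implied by the examples on this page:

```python
import math

def sigmoid_score(x: float, k: float, x0: float, L: float) -> float:
    """Logistic calibration S(x; k, x0, L): rises smoothly from 0 to L,
    crossing L/2 at the midpoint x0; k controls the steepness."""
    return L / (1.0 + math.exp(-k * (x - x0)))

# Inflation example S(CPI; 1.0, 4.0, 3): exactly half the 3-point maximum at 4%
print(sigmoid_score(4.0, 1.0, 4.0, 3))  # 1.5
# A 0.02-point CPI move near the old 5% threshold now shifts the score by ~0.01
print(sigmoid_score(5.01, 1.0, 4.0, 3) - sigmoid_score(4.99, 1.0, 4.0, 3))
```

The same function, with different $(k, x_0, L)$ triples, drives every smoothed component below.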
**Before:** Fixed ±8-point threshold: "if today is 8+ points above last week, call it RISING." Could not detect slow escalation (1 point/day over 2 weeks), and sometimes fired on random noise.
**After:** Dual CUSUM + EWMA. CUSUM (Cumulative Sum) is mathematically proven optimal for detecting the smallest possible shift at a given false alarm rate (Lorden, 1971). EWMA smooths the level estimate.
$k = \sigma/2$ (half the natural standard deviation), $\lambda = 0.2$ (5-day effective window). A change-point is signalled when $S^+_t$ or $S^-_t$ exceeds $4\sigma$. This catches gradual escalation 5–10 days earlier while keeping the false alarm rate at roughly 1 per year.
**Before:** Volume only: $\log_2(\text{count}/300) \times 3$. Could not distinguish between 600 signals spread evenly across 5 narratives (normal monitoring) and 600 signals all pushing one narrative (coordinated campaign).
**After:** Shannon entropy measures how concentrated the narrative distribution is. KL-divergence compares today against the 30-day baseline to detect anomalous shifts.
$H = 0$ when all signals push the same narrative (full coordination). $H = \log_2 k$ when signals are evenly distributed across $k$ narratives (organic). $D_{\text{KL}} \geq 0$ always (Gibbs' inequality), with equality only when the current distribution matches the baseline exactly. Together they separate genuine campaigns from routine noise.
**Before:** All signals within a 7-day window carry equal weight. A 6-day-old RSS article and a 1-hour-old satellite detection contribute the same.
**After:** Polynomial decay, based on the MISP threat intelligence framework (Mokaddem et al., 2019). Each source type gets its own lifetime and decay speed.
$\tau$ = source lifetime (hours), $\delta$ = decay speed. Example: ADS-B military flights decay over 24 hours ($\tau=24$, fast). Sanctions data persists for 30 days ($\tau=720$, slow). Unlike exponential decay, polynomial decay reaches exactly zero at $\tau$ – stale signals produce zero noise.
**Before:** Weighted average of 6 sources: satellite × 0.30 + OSINT × 0.25 + FIRMS × 0.15 + GDELT × 0.15 + ACLED × 0.10 + milwatch × 0.05. When satellite says HIGH and OSINT says LOW, the average just says MEDIUM – hiding the fact that the sources disagree.
**After:** Dempster-Shafer evidence theory. Each source produces a belief mass assignment; sources are combined using Dempster's rule, which measures how much they conflict. The conflict mass $K$ is itself intelligence – high $K$ means "we are not sure, investigate further."
Each source is discounted by its reliability: satellite ($\alpha=0.90$), OSINT ($\alpha=0.65$), FIRMS ($\alpha=0.80$). The combination rule is commutative and associative (Shafer, 1976) – order doesn't matter. Result: a belief value plus an uncertainty range, not just a point estimate.
Current Formulas
The exact mathematics behind every number on the dashboard, as currently deployed.
1. Security Indicators – $S_{\text{sec}} \in [0,\,40]$
Each indicator $i$ from MILITARY, MARITIME, NATO, and DIPLOMATIC categories:
$$v_i \in \{\text{GREEN}=0,\;\text{YELLOW}=3,\;\text{ORANGE}=7,\;\text{RED}=10\}$$ $$c_i \in \{\text{HIGH}=1.0,\;\text{MEDIUM}=0.7,\;\text{LOW}=0.4\}$$ $$S_{\text{sec}} = \min\!\left(\frac{40}{50}\sum_{i} v_i \cdot c_i ,\; 40\right)$$The normalisation base of 50 assumes ~5 indicators at mixed severities as a practical ceiling.
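The security sub-score reduces to a clipped weighted sum. A minimal sketch (the indicator list below is hypothetical example data):

```python
def security_score(indicators) -> float:
    """S_sec = min((40/50) * sum(v_i * c_i), 40) over (severity, confidence) pairs."""
    VALUE = {"GREEN": 0, "YELLOW": 3, "ORANGE": 7, "RED": 10}
    CONFIDENCE = {"HIGH": 1.0, "MEDIUM": 0.7, "LOW": 0.4}
    raw = sum(VALUE[v] * CONFIDENCE[c] for v, c in indicators)
    return min(40.0 / 50.0 * raw, 40.0)

# Two RED/HIGH, one ORANGE/MEDIUM, one YELLOW/LOW:
# raw = 10 + 10 + 4.9 + 1.2 = 26.1, scaled to 0.8 * 26.1 = 20.88
print(security_score([("RED", "HIGH"), ("RED", "HIGH"),
                      ("ORANGE", "MEDIUM"), ("YELLOW", "LOW")]))
```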
2. FIMI / Disinformation – $S_{\text{fimi}} \in [0,\,25]$
Four sub-scores over a 7-day rolling window. Tag count $T$, narrative code distribution $P = \{p_1, \ldots, p_k\}$, 30-day baseline distribution $Q$, campaigns $C$:
a) Volume
$$S_{\text{vol}} = \begin{cases} 0 & T < 100 \\ \min\!\big(\max\!\big(0,\;3\,\log_2(T/300)\big),\; 8\big) & T \geq 100 \end{cases}$$The $\max$ clamp keeps counts between 100 and 300 (where $\log_2(T/300) < 0$) from scoring negative. b) Entropy concentration – high score when one narrative dominates:
$$H = -\sum_{i=1}^{k} p_i\,\log_2 p_i, \qquad \hat{H} = 1 - \frac{H}{\log_2 k}$$ $$S_{\text{entropy}} = S\!\big(\hat{H};\,8,\,0.5,\,6\big)$$c) KL-divergence shift – how today differs from the 30-day baseline (Laplace-smoothed):
$$D_{\text{KL}}(P\,\|\,Q) = \sum_{i} \tilde{p}_i\,\log_2\frac{\tilde{p}_i}{\tilde{q}_i}$$ $$S_{\text{KL}} = S\!\big(D_{\text{KL}};\,3,\,1.0,\,5\big)$$d) Campaigns
$$S_{\text{camp}} = \min(1.5\,C,\; 6)$$ $$S_{\text{fimi}} = \min\!\big(S_{\text{vol}} + S_{\text{entropy}} + S_{\text{KL}} + S_{\text{camp}},\; 25\big)$$Entropy-based scoring separates genuine campaigns (low $H$, high $D_{\text{KL}}$) from organic discussion (high $H$, low $D_{\text{KL}}$). All four components are live.
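The four components above translate into a short sketch. The exact smoothing details (add-one Laplace counts over the union of narrative codes) are an assumption; the live pipeline may smooth differently:

```python
import math

def S(x, k, x0, L):
    """Logistic calibration S(x; k, x0, L)."""
    return L / (1.0 + math.exp(-k * (x - x0)))

def fimi_score(tag_counts, baseline_counts, campaigns):
    """Sketch of S_fimi. tag_counts / baseline_counts map narrative code -> count
    (7-day vs 30-day windows); campaigns is the active campaign count C."""
    T = sum(tag_counts.values())
    if T == 0:
        return 0.0
    # a) Volume, clamped so 100 <= T < 300 cannot score negative
    s_vol = 0.0 if T < 100 else min(max(0.0, 3 * math.log2(T / 300)), 8.0)
    # b) Entropy concentration: 1 - H/log2(k) is near 1 when one narrative dominates
    k = len(tag_counts)
    probs = [c / T for c in tag_counts.values()]
    H = -sum(p * math.log2(p) for p in probs if p > 0)
    h_hat = 1 - H / math.log2(k) if k > 1 else 1.0
    s_ent = S(h_hat, 8, 0.5, 6)
    # c) KL divergence against the baseline, add-one (Laplace) smoothed
    codes = set(tag_counts) | set(baseline_counts)
    Tb = sum(baseline_counts.values())
    p_t = {c: (tag_counts.get(c, 0) + 1) / (T + len(codes)) for c in codes}
    q_t = {c: (baseline_counts.get(c, 0) + 1) / (Tb + len(codes)) for c in codes}
    d_kl = sum(p_t[c] * math.log2(p_t[c] / q_t[c]) for c in codes)
    s_kl = S(d_kl, 3, 1.0, 5)
    # d) Campaigns, then the 25-point cap
    s_camp = min(1.5 * campaigns, 6.0)
    return min(s_vol + s_ent + s_kl + s_camp, 25.0)
```

A concentrated day (500 signals on one narrative, 50 each on two others, against a uniform baseline) scores several points above a uniform day with the same total volume.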
3. Hybrid Threats – $S_{\text{hyb}} \in [0,\,20]$
a) HYBRID category indicators with doubled weight:
$$H_{\text{ind}} = \sum_{i \in \text{HYBRID}} 2\,v_i\,c_i$$b) Anomaly boost (last 72h). Raw anomaly score is sigmoid-smoothed to prevent saturation:
$$R = \sum_j a_j, \quad a_j \in \{\text{INFO}=1,\;\text{WARNING}=2,\;\text{ALERT}=3,\;\text{CRITICAL}=4\}$$ $$H_{\text{anom}} = \min\!\Big(S(R;\,0.3,\,10,\,5),\; 5\Big)$$c) Structural signals – sigmoid-smoothed counts from specialised collectors (FIRMS thermal, GPS jamming, satellite IMINT, AIS shadow fleet):
$$H_{\text{struct}} = \min\!\bigg(\underbrace{S(F;\,0.15,\,20,\,3)}_{\text{FIRMS}} + \underbrace{S(G;\,0.5,\,4,\,3)}_{\text{GPS jam}} + \underbrace{S(V;\,0.02,\,100,\,3)}_{\text{shadow fleet}} + \underbrace{S(I;\,0.5,\,3,\,3)}_{\text{satellite}},\; 12\bigg)$$ $$S_{\text{hyb}} = \min\!\big(H_{\text{ind}} + H_{\text{anom}} + H_{\text{struct}},\; 20\big)$$Each structural sigmoid is calibrated to the sensor's normal range. For example, $S(F;\,0.15,\,20,\,3)$ produces ~0.5 for 10 FIRMS detections (routine) and ~2.9 for 40+ (active burning). Structural signal counts are decay-weighted (see §6 below) – a 1-hour-old FIRMS detection counts more than a 2-day-old one.
4. Economic Stress – $S_{\text{econ}} \in [0,\,15]$
Sigmoid-smoothed economic indicators. Currently active: energy prices and supply disruption signals.
$$E_{\text{price}} = S(\overline{P};\,0.03,\,100,\,5) \quad \text{(avg EUR/MWh, 7-day)}$$ $$E_{\text{supply}} = S(D;\,1.0,\,2,\,5) \quad \text{(disruption/outage signal count, 7-day)}$$ $$S_{\text{econ}} = \min\!\Big(E_{\text{price}} + E_{\text{supply}},\; 15\Big)$$Energy price sigmoid: midpoint at 100 EUR/MWh (crisis threshold), max 5 pts. Normal Baltic prices (~50) score ~1; extreme spikes (~200+) saturate near 5. CPI, unemployment, and consumer confidence planned when data feeds are added.
5. Composition & Trend
Final composite:
$$\text{CTI} = \min\!\Big(S_{\text{sec}} + S_{\text{fimi}} + S_{\text{hyb}} + S_{\text{econ}},\; 100\Big)$$Trend via dual CUSUM + EWMA (up to 30 days of history):
$$Z_t = \lambda\,x_t + (1-\lambda)\,Z_{t-1}, \quad \lambda = 0.2$$ $$S^+_t = \max\!\big(0,\; S^+_{t-1} + x_t - Z_t - k\big), \quad k = \sigma/2$$ $$S^-_t = \max\!\big(0,\; S^-_{t-1} - x_t + Z_t - k\big)$$ $$\text{Trend} = \begin{cases} \blacktriangle\;\text{RISING} & S^+_t > 4\sigma \\ \blacktriangledown\;\text{FALLING} & S^-_t > 4\sigma \\ \rightarrow\;\text{STABLE} & \text{otherwise}\end{cases}$$EWMA ($\lambda=0.2$) gives a ~5-day effective window for the "normal" level. CUSUM accumulates deviations from normal; the threshold $h=4\sigma$ keeps the false alarm rate at roughly 1 per year. Catches gradual escalation that the old ±8 threshold missed.
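The trend rules translate almost line-for-line into code. A sketch, assuming the EWMA is seeded at the first observation and $\sigma$ is estimated elsewhere:

```python
def cti_trend(history, sigma, lam=0.2):
    """Dual CUSUM on deviations from an EWMA level estimate.
    history: daily CTI values, oldest first; sigma: natural day-to-day std dev."""
    k = sigma / 2.0          # allowance: ignore drifts smaller than sigma/2
    h = 4.0 * sigma          # decision threshold h = 4*sigma
    Z = history[0]           # EWMA seeded at the first observation (an assumption)
    s_pos = s_neg = 0.0
    for x in history[1:]:
        Z = lam * x + (1 - lam) * Z
        s_pos = max(0.0, s_pos + x - Z - k)
        s_neg = max(0.0, s_neg - x + Z - k)
    if s_pos > h:
        return "RISING"
    if s_neg > h:
        return "FALLING"
    return "STABLE"

# Slow escalation of 1 point/day, invisible to a fixed +/-8 jump test:
print(cti_trend([50 + d for d in range(20)], sigma=2.0))  # RISING
```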
6. Temporal Signal Decay
Each signal is weighted by polynomial decay based on its age, per the MISP framework (Mokaddem et al., 2019):
$$w(t;\,\tau,\,\delta) = \max\!\Big(0,\; 1 - \Big(\frac{t}{\tau}\Big)^{1/\delta}\Big)$$Source-specific parameters:
| Source | $\tau$ (hours) | $\delta$ | Description |
|---|---|---|---|
| ADS-B | 24 | 1.5 | Military flights – fast decay |
| AIS | 48 | 2.0 | Vessel tracking – slow initial decay |
| FIRMS | 72 | 1.5 | Thermal detections |
| GPSJam | 48 | 1.5 | Electronic warfare |
| Satellite | 168 | 2.5 | Imagery – 7 days, very slow |
| Telegram | 72 | 1.0 | Social media – fast |
| RSS | 120 | 2.0 | News – 5 days |
| OSINT | 336 | 3.0 | Perplexity/milbase – 14 days |
| Sanctions | 720 | 3.0 | 30 days, very slow decay |
Unlike exponential decay, polynomial decay reaches exactly zero at $\tau$ – stale signals produce zero noise. Applied to hybrid structural signal counts.
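The decay weight and the table's parameters can be sketched as:

```python
def decay_weight(age_hours: float, tau: float, delta: float) -> float:
    """Polynomial decay w(t; tau, delta) = max(0, 1 - (t/tau)**(1/delta)).
    Hits exactly zero at t = tau, unlike an exponential tail."""
    return max(0.0, 1.0 - (age_hours / tau) ** (1.0 / delta))

# ADS-B (tau=24 h, delta=1.5): a fresh flight counts fully, a day-old one not at all
print(round(decay_weight(1, 24, 1.5), 2))   # 0.88
print(round(decay_weight(12, 24, 1.5), 2))  # 0.37
print(decay_weight(24, 24, 1.5))            # 0.0
```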
7. Baseline Anomaly Detection
For each sensor, 7-day rolling statistics detect deviations:
$$z = \frac{x_{\text{today}} - \bar{x}_7}{\sigma_7}$$ $$\text{Severity} = \begin{cases} \text{ALERT} & z \geq 4.0 \\ \text{WARNING} & z \geq 3.0 \\ \text{INFO} & z \geq 2.0 \end{cases}$$Tracked: narrative tag volume, Telegram messages, satellite HIGH sites, GPS jamming zones, shadow fleet vessels, FIRMS detections. Runs 3×/day.
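A sketch of the severity mapping. It uses the population standard deviation over the window; whether the live system uses the population or sample estimator is an assumption:

```python
import statistics

def anomaly_severity(today, window):
    """z-score of today's reading vs a 7-day rolling window; None below 2 sigma."""
    mean = statistics.fmean(window)
    sd = statistics.pstdev(window)
    if sd == 0:
        return None  # flat baseline: no variance to score against
    z = (today - mean) / sd
    if z >= 4.0:
        return "ALERT"
    if z >= 3.0:
        return "WARNING"
    if z >= 2.0:
        return "INFO"
    return None

window = [10, 10, 10, 10, 10, 12, 8]   # mean 10, sd ~1.07
print(anomaly_severity(15, window))    # ALERT (z ~ 4.7)
print(anomaly_severity(12.5, window))  # INFO  (z ~ 2.3)
```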
8. Military Base Evidence Fusion (Dempster-Shafer)
Each source produces a discounted basic probability assignment (BPA) over $\Theta = \{\text{HIGH, MODERATE, LOW, NONE}\}$:
$$m_\alpha(A) = \alpha \cdot m(A), \qquad m_\alpha(\Theta) = 1 - \alpha$$Source reliability $\alpha$:
| Source | $\alpha$ | Rationale |
|---|---|---|
| Satellite | 0.90 | Direct observation, highest confidence |
| FIRMS | 0.80 | Thermal is unambiguous when present |
| ACLED | 0.75 | Ground truth but delayed reporting |
| OSINT | 0.65 | LLM-generated, may hallucinate |
| GDELT | 0.60 | Media volume, indirect |
| Milwatch | 0.50 | Aggregated news, least specific |
Sources are combined using Dempster's rule (commutative and associative – order doesn't matter):
$$m_{12}(A) = \frac{1}{1-K}\sum_{B \cap C = A} m_1(B)\,m_2(C)$$ $$K = \sum_{B \cap C = \varnothing} m_1(B)\,m_2(C)$$The conflict mass $K$ is reported per site. High $K$ means sources disagree – which is itself actionable intelligence ("investigate this site further"). The remaining uncertainty $m(\Theta)$ indicates how much is still unknown. Result: a belief value plus an uncertainty range, not just a point estimate.
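Discounting and Dempster's rule fit in a short sketch. For illustration it assumes each source emits mass on a single activity level, with the withheld mass moved to $\Theta$ by discounting:

```python
THETA = frozenset({"HIGH", "MODERATE", "LOW", "NONE"})

def discount(masses, alpha):
    """m_a(A) = alpha * m(A); the withheld mass 1 - alpha goes to Theta."""
    out = {frozenset({h}): alpha * m for h, m in masses.items()}
    out[THETA] = out.get(THETA, 0.0) + (1.0 - alpha)
    return out

def combine(m1, m2):
    """Dempster's rule: intersect focal sets, renormalise by 1 - K."""
    joint, K = {}, 0.0
    for A, mA in m1.items():
        for B, mB in m2.items():
            inter = A & B
            if inter:
                joint[inter] = joint.get(inter, 0.0) + mA * mB
            else:
                K += mA * mB  # conflict mass: evidence on disjoint hypotheses
    fused = {A: v / (1.0 - K) for A, v in joint.items()}
    return fused, K

# Satellite (alpha 0.90) says HIGH; OSINT (alpha 0.65) says LOW:
sat = discount({"HIGH": 1.0}, 0.90)
osint = discount({"LOW": 1.0}, 0.65)
fused, K = combine(sat, osint)
print(round(K, 3))                           # 0.585 -- high conflict, reported per site
print(round(fused[frozenset({"HIGH"})], 3))  # 0.759
```

Because discounting always leaves some mass on $\Theta$, $K < 1$ whenever either $\alpha < 1$, so the renormalisation is well defined.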
Mathematical Principles
The standards every formula on this platform must meet.
- **Boundedness**: Every sub-score has a proven minimum and maximum. No formula can produce a value outside its stated range, regardless of input.
- **Monotonicity**: More threatening input always produces a higher score. No formula has regions where increasing threat decreases the score.
- **Continuity**: No step functions or hard thresholds in the scoring pipeline. All mappings are continuous – small input changes produce small output changes.
- **Determinism**: Same inputs, same outputs, every time. No randomness, no editorial discretion. The formula is the formula.
- **Transparency**: Every formula is published on this page. If our score disagrees with your calculation given the same inputs, file a bug.
Academic References
The peer-reviewed papers and frameworks behind our formulas.
- Lorden, G. (1971). "Procedures for reacting to a change in distribution." Annals of Mathematical Statistics, 42(6). – CUSUM optimality proof
- Page, E.S. (1954). "Continuous inspection schemes." Biometrika, 41(1–2). – Original CUSUM
- Mokaddem, S. et al. (2019). "Taxonomy driven indicator scoring in MISP." arXiv:1902.03914. – Signal decay model
- Shafer, G. (1976). A Mathematical Theory of Evidence. Princeton. – Dempster-Shafer fusion
- OECD/EC/JRC (2008). Handbook on Constructing Composite Indicators. – Normalisation & aggregation
- Adams, R.P. & MacKay, D.J.C. (2007). "Bayesian Online Changepoint Detection." arXiv:0710.3742. – Future: regime detection
- Shannon, C.E. (1948). "A Mathematical Theory of Communication." Bell System Technical Journal. – Entropy foundations
- Mazziotta, M. & Pareto, A. (2013). "Methods for constructing composite indices." RIEDS. – Future: non-compensatory aggregation