Dynamic Algorithm Notes

Core Idea

The algorithm uses two building blocks:

  1. Hash scheme – isolates matching vertices into separate buckets.
  2. $\ell_0$-samplers – sample a random edge from a given bucket pair.

$\ell_0$-Sampler

Notation reference: l0-Sampler Notation

Let $0 < \delta < 1$ be a parameter.

Let \(\mathcal{S} = (i_1,\Delta_1),\ (i_2,\Delta_2),\ \dots,\ (i_p,\Delta_p)\) be a stream of updates for $x \in \mathbb{R}^n$, where:

  • $i_j \in [n]$ – coordinate index
  • $\Delta_j \in \mathbb{R}$ – update value

An update $(i_j, \Delta_j)$ sets $x_{i_j} \leftarrow x_{i_j} + \Delta_j$.

Vector View

Notation reference: l0-Sampler Notation

Think in terms of a vector $\mathbf{x} \in \mathbb{R}^n$ indexed by edges $e_0, e_1, e_2, \dots$

\[\mathbf{x}_j = \text{number of times } e_j \text{ has been inserted (minus deletions)}\] \[\|\mathbf{x}\|_0 = \text{number of nonzero coordinates (edges currently present)}\]

Example

Edges:

e0 = [1,2],  e1 = [1,3],  e2 = [2,3],  e3 = [3,4]
| Operation | $\mathbf{x}$ | $\Vert\mathbf{x}\Vert_0$ |
| --- | --- | --- |
| Initial | [0, 0, 0, 0] | 0 |
| Insert [1,2] | [1, 0, 0, 0] | 1 |
| Insert [2,3] | [1, 0, 1, 0] | 2 |
| Delete [1,2] | [0, 0, 1, 0] | 1 |
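The sequence of updates above can be traced with a minimal sketch (the edge list and helper names are illustrative assumptions, not part of the algorithm):

```python
# Vector view: x[j] counts net insertions of edge e_j.
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]      # e0, e1, e2, e3
index = {e: j for j, e in enumerate(edges)}    # edge -> coordinate in x

x = [0] * len(edges)

def update(edge, delta):
    """Stream update (i_j, Delta_j): x[i_j] <- x[i_j] + Delta_j."""
    x[index[edge]] += delta

def l0_norm(v):
    """||v||_0 = number of nonzero coordinates = edges currently present."""
    return sum(1 for c in v if c != 0)

update((1, 2), +1)   # Insert [1,2]: x = [1, 0, 0, 0], norm 1
update((2, 3), +1)   # Insert [2,3]: x = [1, 0, 1, 0], norm 2
update((1, 2), -1)   # Delete [1,2]: x = [0, 0, 1, 0], norm 1
```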

Sketch-Based $\ell_0$-Sampler

Notation reference: l0-Sampler Notation

The $\ell_0$-sampler is sketch-based. We maintain the product $Ax$, where $A$ is a random matrix.

Insertion of edge $e$: \(Ax \leftarrow Ax + (\text{column } e \text{ of } A)\)

Deletion of edge $e$: \(Ax \leftarrow Ax - (\text{column } e \text{ of } A)\)


Sketch Representation

$A$ is never stored explicitly. Instead:

  • Store a seed of $O(\log n)$ bits
  • Use a PRF (Pseudorandom Function):
\[A[i][j] = \text{PRF}(\text{seed},\ i,\ j)\]

Matrix dimensions: \(\underbrace{A}_{m \times n} \cdot \underbrace{x}_{n \times 1} = \underbrace{Ax}_{m \times 1} \quad \Rightarrow \quad O(m) \text{ space to store } Ax\)
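A minimal sketch of this linear update rule, with SHA-256 standing in for the PRF (an assumption for illustration; the actual construction uses a $k$-wise independent hash):

```python
import hashlib

def prf(seed, i, j):
    """A[i][j] = PRF(seed, i, j): generate a matrix entry on demand from the seed."""
    digest = hashlib.sha256(f"{seed}:{i}:{j}".encode()).digest()
    return int.from_bytes(digest[:4], "big")

class LinearSketch:
    """Maintain Ax in O(m) space; the m x n matrix A is never stored."""
    def __init__(self, seed, m):
        self.seed, self.m = seed, m
        self.Ax = [0] * m

    def update(self, e, delta):
        # Insertion (delta = +1): Ax += column e of A.
        # Deletion  (delta = -1): Ax -= column e of A.
        for i in range(self.m):
            self.Ax[i] += delta * prf(self.seed, i, e)

s = LinearSketch(seed=42, m=8)
s.update(0, +1)   # insert e0
s.update(3, +1)   # insert e3
s.update(0, -1)   # delete e0
```

By linearity, the final state equals the sketch of a vector containing only $e_3$, regardless of the order in which updates arrived.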


The $\ell_0$-sampler operates across $O(\log n)$ levels, each thinning the edge set by half:

  • One 1-sparse recovery structure per level (a 2D array of cells, each $O(\log n)$ bits) $\to$ total $O(\log^2 n \cdot \log(s/\delta))$ bits
  • The matrix $A$ is never stored explicitly – a $k$-wise independent hash function serves as a PRF at cost $O(\log n)$, dominated by the recovery structures

Proposition 2.2

There exists an $\ell_0$-sampler algorithm that, given a dynamic graph stream, either:

  • Returns FAIL with probability at most $\delta$, or
  • Returns an edge chosen uniformly at random from all edges currently in the stream.

Complexity

  Bound
Space $O(\log^2 n \cdot \log(1/\delta))$ bits
Update time $O(\text{polylog}(n))$

Hash Scheme

Notation reference: HashScheme Notation

Level 1

Notation reference: HashScheme Notation

Pick a hash function $f$ from a $\lceil 12 \ln k \rceil$-wise independent family: \(f : U \to [d_1]^-\)

where $d_1 = 2^d$ and $d$ is the smallest integer satisfying: \(\frac{k}{\ln k} \le 2^d \quad \Longrightarrow \quad \frac{k}{\ln k} \le d_1 < \frac{2k}{\ln k}\)

This partitions $U$ into $d_1$ groups $U_0, U_1, \dots, U_{d_1-1}$.
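The choice of $d_1$ can be sketched directly (the function name is an assumption):

```python
import math

def level1_width(k):
    """Return d1 = 2**d for the smallest d with k/ln k <= 2**d,
    which guarantees k/ln k <= d1 < 2k/ln k."""
    target = k / math.log(k)
    d = math.ceil(math.log2(target))
    return 2 ** d

# k = 1000: k/ln k ~ 144.8, so d1 = 256, and 144.8 <= 256 < 289.5 holds.
d1 = level1_width(1000)
```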


Proposition 2.1

Given $X_1,\dots,X_j$ that are 0-1 random variables, let $X = \sum X_i$ and $\mu = \mathbb{E}[X]$.

If the variables are $\lceil \mu\delta \rceil$-wise independent: \(\Pr[X \ge \mu(1+\delta)] \le \begin{cases} e^{-\mu\delta^2/3} & \text{if } \delta < 1 \\ e^{-\mu\delta/3} & \text{if } \delta \ge 1 \end{cases}\)


Applying Proposition 2.1 to Level 1

Let $X_j = \lvert S \cap U_j\rvert$. Each element of $S$ lands in bucket $j$ with probability $1/d_1$, so: \(\mathbb{E}[X_j] = \frac{|S|}{d_1} = \frac{k}{d_1}\)

Since $\frac{k}{\ln k} \le d_1 < \frac{2k}{\ln k}$, we get: \(\frac{\ln k}{2} < \mathbb{E}[X_j] \le \ln k\)

Choose $\delta = \dfrac{12 \ln k}{\mathbb{E}[X_j]}$. Since $\mathbb{E}[X_j] \le \ln k$, we have $\delta \ge 12 > 1$, and: \((1+\delta)\,\mathbb{E}[X_j] = \mathbb{E}[X_j] + 12 \ln k \le 13 \ln k\)

Note that $\lceil \mathbb{E}[X_j]\,\delta \rceil = \lceil 12 \ln k \rceil$, exactly the independence of the family from which $f$ was drawn, so Proposition 2.1 (case $\delta \ge 1$) applies: \(\Pr[X_j \ge 13 \ln k] \le \Pr\!\left[X_j \ge (1+\delta)\,\mathbb{E}[X_j]\right] \le e^{-\mathbb{E}[X_j]\,\delta/3} = e^{-4 \ln k} = k^{-4}\)

So with high probability: \(|S \cap U_j| < 13 \ln k\)

Each group stays small – bounded by $O(\log k)$.


Level 2

Notation reference: HashScheme Notation

Pick $d_2$ hash functions from a universal family: \(H = \{h_1, h_2, \dots, h_{d_2}\}, \quad h_i : U \to [d_3]^-\)

where: \(d_2 = \lceil 8 \ln k \rceil, \qquad d_3 = \lceil 13 \ln k \rceil^2\)

Proposition 2.4 – Why $d_2 = \lceil 8 \ln k \rceil$ and $d_3 = \lceil 13 \ln k \rceil^2$

Define two events:

  • $\mathcal{E}'$ – every level-1 bucket is small: $\lvert S \cap U_j\rvert < 13 \ln k$ for all $j$
  • $\mathcal{E}''$ – for every bucket $U_j$, at least one $h_i \in H$ is perfect w.r.t. $S \cap U_j$

Step 1 – Per-bucket failure probability

Given $\mathcal{E}'$, each bucket has $\lvert S \cap U_j\rvert < 13 \ln k \le \sqrt{d_3}$ elements. A single $h_i$ chosen from a universal family mapping to $[d_3]$ is then perfect w.r.t. $S \cap U_j$ with probability $\geq 1/2$: for a set of size $s \le \sqrt{d_3}$, the expected number of collisions is at most $\binom{s}{2}/d_3 < s^2/(2d_3) \le 1/2$, so by Markov's inequality no collision occurs with probability $\geq 1/2$.

\[\Pr[\text{all } d_2 \text{ functions fail for } U_j \mid \mathcal{E}'] < \left(\frac{1}{2}\right)^{d_2}\]

Since $d_2 = \lceil 8 \ln k \rceil$:

\[\left(\frac{1}{2}\right)^{d_2} = 2^{-8\ln k} = k^{-8\ln 2} \approx k^{-5.5} < \frac{1}{k^4}\]

So $d_2 = \lceil 8 \ln k \rceil$ is just large enough to drive the per-bucket failure probability below $1/k^4$.
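A quick numeric check of this bound (the value of $k$ is illustrative):

```python
import math

k = 100
d2 = math.ceil(8 * math.log(k))     # d2 = ceil(8 ln k) = 37 for k = 100
per_bucket_fail = 0.5 ** d2         # probability that all d2 functions fail
# 0.5**37 ~ 7.3e-12, comfortably below 1/k**4 = 1e-8
```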


Step 2 – Over all buckets

There are $d_1 \leq \dfrac{2k}{\ln k}$ buckets. By the union bound:

\[\Pr[\exists\, j \text{ with no perfect function} \mid \mathcal{E}'] \leq d_1 \cdot \frac{1}{k^4} < \frac{2k}{\ln k} \cdot \frac{1}{k^4} = \frac{2}{k^3 \ln k}\] \[\Pr[\mathcal{E}'' \mid \mathcal{E}'] \geq 1 - \frac{2}{k^3 \ln k}\] \[\Pr[\mathcal{E}' \cap \mathcal{E}''] = \Pr[\mathcal{E}'' \mid \mathcal{E}'] \cdot \Pr[\mathcal{E}'] \geq \left(1 - \frac{2}{k^3 \ln k}\right) \cdot \Pr[\mathcal{E}']\]

By saying $\mathcal{E}' \cap \mathcal{E}''$ holds, we are saying every bucket has a perfect hash function isolating its elements of $S$.


Combined Hash (Offsets)

Notation reference: HashScheme Notation

Composed function: \(h_i^+(x) = \underbrace{f(x) \cdot d_2 d_3}_{\text{group offset}} + \underbrace{(i-1) \cdot d_3}_{\text{function offset}} + \underbrace{h_i(x)}_{\text{level-2 value}}\)

This maps $U$ into: \([d_4], \quad \text{where } d_4 = d_1 d_2 d_3 = O(k \log^2 k)\)

The offsets ensure the $d_1 \cdot d_2$ image sets are pairwise disjoint.


Example

Level-1 buckets: $U_0 = \{A,B\}$, $U_1 = \{C,D\}$, with $d_2 = 1$, $d_3 = 4$.

Without offsets, $h_1(U_0)$ and $h_1(U_1)$ could collide in $[d_3]$.

With offsets, the ranges are forced apart:

Index:   0  1  2  3  |  4  5  6  7
Bucket:      U0      |      U1

Each bucket occupies its own disjoint slot – no cross-bucket collisions possible.
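The offset arithmetic in this example can be checked directly (the level-1 and level-2 hash functions below are toy stand-ins, not the real families):

```python
def h_plus(f, hs, d2, d3, x, i):
    """h_i^+(x) = f(x)*d2*d3 + (i-1)*d3 + h_i(x)."""
    return f(x) * d2 * d3 + (i - 1) * d3 + hs[i - 1](x)

# Toy instance matching the example: U0 = {A,B}, U1 = {C,D}, d2 = 1, d3 = 4.
f = lambda x: 0 if x in ("A", "B") else 1   # level-1 bucket of x
h1 = lambda x: "ABCD".index(x) % 4          # stand-in level-2 hash
vals = {x: h_plus(f, [h1], 1, 4, x, 1) for x in "ABCD"}
# U0 lands in indices 0..3 and U1 in 4..7: the ranges are forced apart.
```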

Theorem 4.2 – Hash Scheme Guarantee

Notation reference: HashScheme Notation, Probability Events

Let $S = \{u_1, v_1, \dots, u_k, v_k\}$ be the $2k$ vertices of any $k$-matching $M_{\max}$.

With probability at least: \(1 - \frac{4}{(2k)^3 \ln(2k)} = 1 - \frac{1}{2k^3 \ln(2k)}\)

there exist $2k$ pairwise disjoint subsets $H^+_{\text{inv}}(i_1),\ H^+_{\text{inv}}(i_1'),\ \dots,\ H^+_{\text{inv}}(i_k),\ H^+_{\text{inv}}(i_k')$ such that: \(u_j \in H^+_{\text{inv}}(i_j) \quad \text{and} \quad v_j \in H^+_{\text{inv}}(i_j') \quad \text{for all } 1 \le j \le k\)

That is: every matching edge has its two endpoints in different, disjoint preimage sets.

This is event $\mathcal{E}'$ in the correctness proof – contributing failure probability $\dfrac{1}{2k^3 \ln(2k)} = \dfrac{10}{20k^3\ln(2k)}$.

Dynamic Algorithm

Notation reference: Dynamic Algorithm Notation

Preprocessing

Notation reference: Dynamic Algorithm Notation

  1. Initialize $L = \emptyset$ (empty collection of $\ell_0$-samplers).

  2. Set failure parameter:

\[\delta = \frac{1}{20 k^4 \ln(2k)}\]

Derivation: The overall failure probability is bounded by: \(\Pr[\text{failure}] \le \underbrace{\frac{1}{2k^3 \ln(2k)}}_{\text{hashing}} + \underbrace{k\delta}_{\text{samplers}}\) Setting $k\delta = \dfrac{1}{20k^3 \ln(2k)}$ gives $\delta = \dfrac{1}{20k^4 \ln(2k)}$, and: \(\Pr[\text{success}] \ge 1 - \frac{11}{20 k^3 \ln(2k)}\)

  3. Build hash scheme:
\[H^+ = \{h_1^+, \dots, h_{d_2}^+\}, \quad d_2 = \lceil 8 \ln(2k) \rceil\]
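The preprocessing constants can be computed directly (the function name is an assumption):

```python
import math

def preprocess(k):
    """Steps 2-3 constants: failure parameter delta and hash-function count d2."""
    delta = 1 / (20 * k**4 * math.log(2 * k))
    d2 = math.ceil(8 * math.log(2 * k))
    return delta, d2

delta, d2 = preprocess(10)
# Check the derivation: k * delta = 1 / (20 k^3 ln(2k)).
```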

Updating

Notation reference: Dynamic Algorithm Notation

While stream $S$ is not ended:

Step 4.1 – Get next element

Read $([u,v],\ wt(u,v),\ op)$ from the stream. Time: $O(1)$.


Step 4.2 – Compute hash images

\[H^+(u) = \{h_1^+(u),\dots,h_{d_2}^+(u)\}, \qquad H^+(v) = \{h_1^+(v),\dots,h_{d_2}^+(v)\}\]
  • Each $h_i^+(x)$ takes $O(\text{polylog}(n))$ by Theorem 4.1.
  • $d_2 = O(\log k)$ evaluations per vertex.
  • Time: $O(\log k \cdot \text{polylog}(n)) = O(\text{polylog}(n))$.

Step 4.3 – Update samplers for each bucket pair

Loop over all $i \in H^+(u)$, $j \in H^+(v)$:

\[|H^+(u)| \times |H^+(v)| = d_2^2 = O(\log^2 k) \text{ pairs}\]

For each pair $(i, j)$:

  1. 4.3.1 – If $\mathbf{L}_{i,j,w} \notin L$: create the $\ell_0$-sampler and add to $L$. Search 2D array for $C[i][j]$ (balanced BST keyed by weight): $O(\log W)$.

Note on data structure: C is the full 2D array of size $r \times r$ (indexing is $O(1)$). C[i,j] is a balanced BST keyed by edge weight, storing all $\ell_0$-samplers for vertex-subset pair $(i,j)$. $\mathbf{L}_{i,j,w}$ is the individual $\ell_0$-sampler retrieved from the BST in $O(\log W)$.

  2. 4.3.2 – Run sampler update with $\langle [u,v],\ op \rangle$: $O(\text{polylog}(n))$ by Proposition 2.2.

Time per pair: $O(\text{polylog}(n))$. Total: $O(\log^2 k \cdot \text{polylog}(n)) = O(\text{polylog}(n))$.
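Steps 4.1–4.3 can be sketched as follows; `h_plus`, `make_sampler`, and the dict standing in for the weight-keyed BST are hypothetical stand-ins for the structures described above:

```python
def process_update(C, u, v, w, op, d2, h_plus, make_sampler):
    """One stream element ([u,v], wt, op): update the sampler of every bucket pair."""
    Hu = [h_plus(u, i) for i in range(1, d2 + 1)]   # Step 4.2: hash images of u
    Hv = [h_plus(v, i) for i in range(1, d2 + 1)]   # ... and of v
    for i in Hu:                                    # Step 4.3: d2^2 bucket pairs
        for j in Hv:
            bst = C[i][j]                 # stands in for the BST keyed by weight
            if w not in bst:              # 4.3.1: create the sampler on demand
                bst[w] = make_sampler()
            bst[w].update((u, v), op)     # 4.3.2: l0-sampler update

class RecordingSampler:                   # toy sampler that just records updates
    def __init__(self): self.ops = []
    def update(self, e, op): self.ops.append((e, op))

r = 4
C = [[dict() for _ in range(r)] for _ in range(r)]
process_update(C, 1, 2, 5, "+", 2, lambda x, i: (x + i) % r, RecordingSampler)
# d2^2 = 4 bucket pairs each received one sampler update for weight 5.
```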


Query

Step 5 – Initialize edge set

Set $E_0 = \emptyset$.


Step 6 – Sample from each sampler

For each $\mathbf{L}_{i,j,w} \in L$:

  1. 6.1 – Apply $\mathbf{L}_{i,j,w}$ with parameter $\delta$ to sample an edge $e$.
  2. 6.2 – If $\mathbf{L}_{i,j,w}$ does not fail, add $e$ to $E_0$.
  • Each sampler fails independently with probability $\le \delta$.
  • By union bound over $k$ critical samplers: $\Pr[\text{any fail}] \le k\delta = \dfrac{1}{20k^3\ln(2k)}$.

Time: $O(\text{polylog}(n))$ per sampler by Proposition 2.2.
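Steps 5–6 amount to one sampling attempt per stored sampler; the sampler interface below (returning `None` on FAIL) is an assumption for illustration:

```python
def query(samplers, delta):
    """Collect E_0: one edge per non-failing l0-sampler."""
    E0 = set()                    # Step 5: start with the empty edge set
    for s in samplers:            # Step 6: sample from each sampler
        e = s.sample(delta)       # 6.1: returns an edge, or None on FAIL
        if e is not None:         # 6.2: keep successful samples
            E0.add(e)
    return E0

class FixedSampler:               # toy sampler with a predetermined outcome
    def __init__(self, edge): self.edge = edge
    def sample(self, delta): return self.edge

E0 = query([FixedSampler((1, 2)), FixedSampler(None), FixedSampler((3, 4))], 0.01)
# The failing (None) sampler is skipped; E0 contains the two sampled edges.
```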


Step 7 – Extract matching

Compute a maximum weighted $k$-matching in the induced subgraph $G(E_0)$.

  • $\lvert E_0\rvert \le r^2 W$ (at most one edge per sampler). Space: $O(r^2 W)$.
  • Under events $\mathcal{E}' \cap \mathcal{E}''$, the $k$ sampled edges share no endpoints (disjoint subsets) and each has the correct weight, so $G(E_0)$ contains a maximum weighted $k$-matching.