Algorithms

Notation Reference for the Paper

A comprehensive table of all notation used in the paper, organized by topic.


Basic Math Notation

Notation Definition Where introduced
$[i]$ The set $\{1, 2, \ldots, i\}$ Section 2
$[i]^-$ The set $\{0, 1, \ldots, i-1\}$ Section 2
u.a.r. Uniformly at random Section 2
$\Pr[\cdot]$ Probability of an event Section 2.3
$\operatorname{Exp}[\cdot]$ Expected value of a random variable Section 2.3
$\Vert\mathbf{x}\Vert_0$ $\ell_0$-norm: the number of non-zero coordinates of vector $\mathbf{x}$ Section 2.4
$\lceil x \rceil$ Ceiling of $x$ (smallest integer $\geq x$) Standard
$\ln k$ Natural logarithm (base $e$) of $k$ Standard
$\log k$ Logarithm base 2 of $k$ (used for counting hash functions) Standard
$\mathbb{R}_{\geq 0}$ The set of non-negative real numbers Section 3
$\lambda$-wise independent A set of random variables where any $\lambda$-size subset is mutually independent Section 2.3

$\ell_0$-Sampler Notation (Section 2.4)

Notation Definition Where introduced
$\mathbf{x} \in \mathbb{R}^n$ The coordinate vector maintained by the $\ell_0$-sampler. Each coordinate $\mathbf{x}_j$ tracks the net count of edge $e_j$ (insertions minus deletions). Section 2.4
$(i_j, \Delta_j)$ An update step: increment coordinate $i_j$ of $\mathbf{x}$ by $\Delta_j$ (positive for insertion, negative for deletion) Section 2.4
$\mathcal{S}$ ($\ell_0$-sampler context) Stream of updates $\mathcal{S} = (i_1,\Delta_1),(i_2,\Delta_2),\ldots$ for vector $\mathbf{x}$. Note: the same symbol $\mathcal{S}$ is used for the graph stream (Section 2.1); context distinguishes the two. Section 2.4
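
The update/query interface above can be illustrated with an idealized (non-sketch) $\ell_0$-sampler that stores $\mathbf{x}$ exactly; the paper's sampler uses polylogarithmic space and may FAIL with probability at most $\delta$. A minimal Python sketch (the class name is illustrative):

```python
import random

class ExactL0Sampler:
    """Idealized l0-sampler: stores x exactly. A real sketch uses
    polylog space and may FAIL with probability <= delta."""

    def __init__(self, n):
        self.x = [0] * n  # coordinate vector; x[j] = net count of edge e_j

    def update(self, i, delta):
        # stream update (i_j, Delta_j): positive = insertion, negative = deletion
        self.x[i] += delta

    def query(self):
        # return an index drawn u.a.r. from the support of x (l0-sampling)
        support = [j for j, v in enumerate(self.x) if v != 0]
        if not support:
            return None
        return random.choice(support)
```

An edge that is inserted and later deleted leaves its coordinate at zero, so it can never be sampled; this is exactly why $\ell_0$-samplers suit the dynamic model.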

Graph Notation

Notation Definition Where introduced
$G = (V, E)$ A graph with vertex set $V$ and edge set $E$ Section 2
$V(G)$ Vertex set of graph $G$ Section 2
$E(G)$ Edge set of graph $G$ Section 2
$[u, v]$ An edge with endpoints $u$ and $v$ Section 2
$H$ A subgraph of $G$ (general parameter). Used as the input in compact/reduced subgraph constructions. In the streaming algorithm, concretely $G_{\hat{s}}^f \cup G_{\hat{s}+1,s}$ (previous summary + current batch). Not the same as $G$ – always a subset of $G$’s edges. Section 3
$\Vert G\Vert$ or $\Vert H\Vert$ Size of a graph: number of vertices + number of edges Section 2
$n$ Number of vertices, $V = [n]^-$ Section 2.1
$M$ A matching (set of edges with no shared endpoints) Section 2
$k$-matching A matching consisting of exactly $k$ edges Section 2
$wt(e)$ Weight of edge $e$ under the weight function $wt: E \to \mathbb{R}$ Section 2
$wt(M)$ Weight of matching $M$: sum of weights of its edges Section 2
$W$ Number of distinct edge weight values in graph $G$ Theorem 4.4
$R_{wt}$ Ratio of maximum edge weight to minimum edge weight Theorem 4.5
Maximum weighted $k$-matching A $k$-matching with the largest total weight among all $k$-matchings Section 2
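
The matching definitions above are easy to check mechanically. A small Python sketch (the helper names `is_k_matching` and `wt_M` are illustrative, not from the paper):

```python
def is_k_matching(M, k):
    """M is a k-matching iff it has exactly k edges and no two edges
    share an endpoint."""
    ends = [x for (u, v) in M for x in (u, v)]
    return len(M) == k and len(ends) == len(set(ends))

def wt_M(M, wt):
    """wt(M): sum of the weights of M's edges under weight function wt
    (here a dict from edges to reals)."""
    return sum(wt[e] for e in M)
```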

Streaming Model Notation

Notation Definition Where introduced
$\mathcal{S}$ A graph stream: sequence of edge operations $\{e_1, e_2, \ldots\}$ Section 2.1
$(e, op)$ A stream element: edge $e$ with operation $op$ (insert or delete) Section 2.1
Insert-only model Stream where every operation is an edge insertion Section 2.1
Dynamic model Stream where operations can be insertions or deletions Section 2.1
Update time Processing time per stream element (per edge arrival) Section 1
p-Matching Problem: find a $k$-matching in an unweighted graph stream Section 2.2
p-wMatching Problem: find a maximum weighted $k$-matching in a weighted graph stream Section 2.2
Size-$k$ Constraint Assumption that the maximum matching size is bounded by $k$ at every instant Section 1.1

Hash Function Notation

Notation Definition Where introduced
$f$ A hash function mapping $V \to [4k^2]^-$ (insert-only model) Section 3
Perfect w.r.t. $S$ A hash function $h$ is injective on subset $S$ (no collisions within $S$) Section 2.5
$h \stackrel{\text{u.a.r.}}{\longleftarrow} \mathcal{H}$ Hash function $h$ chosen uniformly at random from family $\mathcal{H}$ Section 2.5
Universal family Hash family where $\Pr[h(x) = h(y)] \leq 1/r$ for distinct $x, y$ Section 2.5
$\kappa$-wise independent family Hash family where any $\kappa$ outputs are fully independent Section 2.5
$h_{a,b,r}(x)$ $((ax + b) \bmod p) \bmod r$ – a specific universal hash function Proposition 2.3
$\mathcal{V} = \{V_0, V_1, \ldots, V_{4k^2-1}\}$ Partition of $V$ by hash function $f$: $V_i = \{v : f(v) = i\}$ Section 3
Nice matching (w.r.t. $f$) A matching in which no two matched vertices lie in the same bucket $V_i$ Section 3
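
The concrete hash family of Proposition 2.3 and the bucket partition can be sketched directly; the helper names `make_hash`, `buckets`, and `is_perfect` are illustrative (in the sketch, `p` is assumed to be a prime larger than the universe):

```python
def make_hash(a, b, p, r):
    """Universal hash h_{a,b,r}(x) = ((a*x + b) mod p) mod r
    (Proposition 2.3), with p prime, 1 <= a < p, 0 <= b < p."""
    return lambda x: ((a * x + b) % p) % r

def buckets(f, V, r):
    """Partition V into V_0, ..., V_{r-1} with V_i = {v : f(v) = i}."""
    part = [set() for _ in range(r)]
    for v in V:
        part[f(v)].add(v)
    return part

def is_perfect(f, S):
    """f is perfect w.r.t. S iff it is injective on S (no collisions)."""
    return len({f(x) for x in S}) == len(S)
```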

Edge Ordering and Subgraph Notation (Section 3)

Notation Definition Where introduced
$\beta(e)$ For edge $e = [u,v]$ with $u < v$: $\beta(e) = (wt(e), u, v)$. Gives a unique value to every edge and defines a total order (lexicographic). Used to break weight ties. Section 3
“$i$-th heaviest edge” The edge with the $i$-th largest $\beta$-value in an edge set Section 3
$\mathcal{C}_f(H)$ Compact subgraph of $H$ under $f$: for each pair of buckets $(V_i, V_j)$, keep only the single heaviest edge between them Section 3
$\mathcal{R}'_f(H)$ Intermediate graph: from $\mathcal{C}_f(H)$, for each bucket $V_i$, keep only the $2k$ heaviest incident edges; an edge survives only if it is among the top $2k$ for both of its endpoint buckets. Section 3
$\mathcal{R}_f(H)$ Reduced compact subgraph of $H$ under $f$: from $\mathcal{R}'_f(H)$, keep only the overall $4k^2$ heaviest edges Section 3
$E_i$ The set of all surviving edges (those with endpoints in different buckets) that have at least one endpoint in bucket $V_i$. Constructed during Lemma 3.2’s algorithm by placing each cross-bucket edge $[u,v]$ into both $E_{f(u)}$ and $E_{f(v)}$. Acts like an adjacency list for bucket $V_i$ – used to find the heaviest edge per bucket pair (Step 2) and the $2k$ heaviest incident edges (Step 3). Lemma 3.2 proof
“incident to $V_i$” An edge has at least one endpoint in bucket $V_i$ Lemma 3.1 proof
“covers $V_i$” A matching has an edge incident to bucket $V_i$ Lemma 3.1 proof
$\mathcal{M}_k^f(H)$ Set of all nice $k$-matchings in subgraph $H$ w.r.t. hash function $f$. Used in Lemma 3.1 to state that existence and optimal weight of nice $k$-matchings is preserved under reduction. Lemma 3.1
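
The $\mathcal{C}_f \to \mathcal{R}'_f \to \mathcal{R}_f$ pipeline can be sketched in a few lines. This is a simplified reading, not the paper's implementation: edges are encoded as tuples $(wt, u, v)$ with $u < v$, so Python's tuple order is exactly the lexicographic $\beta$-order, and same-bucket edges are dropped (they cannot appear in a nice matching, matching Lemma 3.2's construction, which only keeps cross-bucket edges):

```python
def compact(H, f):
    """C_f(H): per bucket pair (V_i, V_j), keep only the beta-heaviest edge.
    Edges are (wt, u, v) tuples, so tuple comparison is the beta-order."""
    best = {}
    for e in H:
        wt, u, v = e
        i, j = sorted((f(u), f(v)))
        if i == j:
            continue  # same-bucket edge: never part of a nice matching
        if (i, j) not in best or e > best[(i, j)]:
            best[(i, j)] = e
    return set(best.values())

def reduce_(H, f, k):
    """R_f(H): from C_f(H), keep per bucket only the 2k heaviest incident
    edges (an edge must survive on both sides), then keep the overall
    4k^2 heaviest edges."""
    C = compact(H, f)
    top = {}  # bucket i -> set of the 2k heaviest edges incident to V_i
    for i in {f(u) for _, u, v in C} | {f(v) for _, u, v in C}:
        inc = [e for e in C if f(e[1]) == i or f(e[2]) == i]
        top[i] = set(sorted(inc, reverse=True)[:2 * k])
    Rp = {e for e in C if e in top[f(e[1])] and e in top[f(e[2])]}  # R'_f(H)
    return set(sorted(Rp, reverse=True)[:4 * k * k])
```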

Insert-Only Algorithm Notation (w-Match_ins, Figure 1)

This is where $s'$ appears. These notations track positions and subgraphs within the stream.

Notation Definition Where introduced
$e_s$ The $s$-th edge in the stream $\mathcal{S}$ Section 3
$G_s$ Subgraph of $G$ consisting of the first $s$ edges: ${e_1, \ldots, e_s}$ Section 3
$G_{r,s}$ Subgraph consisting of edges $e_r, e_{r+1}, \ldots, e_s$ (a contiguous slice of the stream) Section 3
$\hat{s}$ The largest multiple of $4k^2$ strictly less than $s$. Formally: $\hat{s} = 4k^2 \cdot i$ where $s = \hat{s} + q$ with $1 \leq q \leq 4k^2$. Even if $s$ is a multiple of $4k^2$, $\hat{s} < s$. Section 3
$G_s^f$ Recursive reduced compact subgraph at position $s$: $G_s^f = \mathcal{R}_f(G_{\hat{s}}^f \cup G_{\hat{s}+1,s})$, with base case $G_0^f = \emptyset$ Section 3, Eq. (1)
$\epsilon$ Fixed constant bounding the error probability (e.g., 0.01) Figure 1
$\mathcal{H}_\epsilon$ Set of $\lceil \log(1/\epsilon) \rceil$ hash functions, each picked u.a.r. from a universal family mapping $V \to [4k^2]^-$ Figure 1, Step 1
$G_{1,s}$ The first batch: edges $\{e_1, \ldots, e_s\}$ where $s \leq 4k^2$ Figure 1, Step 3
$s'$ The ending index of the current batch of edges being read: either $s' = s + 4k^2$ (a full batch was read) or $s' < s + 4k^2$ (the stream ended before a full batch). Figure 1, Step 4.2
$G_{s+1,s'}$ The next batch of edges: $\{e_{s+1}, e_{s+2}, \ldots, e_{s'}\}$, containing at most $4k^2$ edges Figure 1, Step 4.2
$G_f$ The final subgraph for hash function $f$ at query time: $G_f = \mathcal{R}_f(G_{\hat{s}}^f \cup G_{\hat{s}+1,s})$ Figure 1, Step 5.1
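
The batching bookkeeping above ($\hat{s}$ and the Eq. (1) recursion) can be sketched as follows; `reduce_fn` is a hypothetical parameter standing in for $\mathcal{R}_f$:

```python
def s_hat(s, k):
    """Largest multiple of 4k^2 strictly less than s (so s_hat < s even
    when s itself is a multiple of 4k^2)."""
    B = 4 * k * k
    return ((s - 1) // B) * B

def run_insert_only(stream, f, k, reduce_fn):
    """Process the stream in batches of at most 4k^2 edges, maintaining
    G_s^f = R_f(G_{s_hat}^f union G_{s_hat+1, s}) per Eq. (1).
    reduce_fn(H, f, k) is assumed to implement R_f."""
    B = 4 * k * k
    G_summary = set()  # base case: G_0^f is empty
    batch = []
    for e in stream:
        batch.append(e)
        if len(batch) == B:            # a full batch was read: s' = s + 4k^2
            G_summary = reduce_fn(G_summary | set(batch), f, k)
            batch = []
    # stream ended mid-batch (s' < s + 4k^2): one final reduction at query time
    return reduce_fn(G_summary | set(batch), f, k)
```

The per-batch memory is one summary of at most $4k^2$ edges plus one in-progress batch of at most $4k^2$ edges, which is the point of the construction.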

HashScheme Notation (Figure 2, Section 4.1)

Notation Definition Where introduced
$U$ Universal set of elements (the vertex set, so $\Vert U\Vert = n$) Section 4.1
$S$ An unknown $k$-subset of $U$ (e.g., the $2k$ vertices of $M_{\max}$) Section 4.1
$d_1$ $2^d$ where $d$ is smallest integer with $k/\ln k \leq 2^d$. So $d_1 = \Theta(k/\ln k)$. Number of Level-1 buckets. Figure 2
$d_2$ $\lceil 8 \ln k \rceil$. Number of Level-2 hash functions. Figure 2
$d_3$ $\lceil 13 \ln k \rceil^2$. Range size for each Level-2 hash function. Figure 2
$d_4$ $d_1 \cdot d_2 \cdot d_3 = O(k \log^2 k)$. Total range of the combined hash functions $h_i^+$. Figure 2
$f$ (in HashScheme) Level-1 hash function: $f: U \to [d_1]^-$, picked from a $\lceil 12 \ln k \rceil$-wise independent family Figure 2
$U_j$ Level-1 bucket: $U_j = \{x \in U : f(x) = j\}$ Section 4.1
$H = \{h_1, \ldots, h_{d_2}\}$ Level-2 hash functions: each $h_i: U \to [d_3]^-$, from a universal family Figure 2
$h_i^+(x)$ Combined hash function: $h_i^+(x) = f(x) \cdot d_2 \cdot d_3 + (i-1) \cdot d_3 + h_i(x)$. Maps $U \to [d_4]^-$. Figure 2
$H^+ = \{h_1^+, \ldots, h_{d_2}^+\}$ The set of combined hash functions Figure 2
$H_{\text{inv}}^+(q)$ The “inverse” of $H^+$ on value $q$: the set of $x \in U$ such that $h_i^+(x) = q$ for some $h_i^+ \in H^+$ Section 4.1
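
The parameter choices and the offset arithmetic of $h_i^+$ can be sketched directly from the table (assuming $k \geq 2$; function names are illustrative, and the Level-1/Level-2 functions are passed in rather than drawn from the stated families):

```python
import math

def hash_scheme_params(k):
    """d1..d4 as in Figure 2: d1 = 2^d for the smallest d with
    k/ln k <= 2^d, d2 = ceil(8 ln k), d3 = ceil(13 ln k)^2, d4 = d1*d2*d3."""
    d = math.ceil(math.log2(k / math.log(k)))
    d1 = 2 ** d
    d2 = math.ceil(8 * math.log(k))
    d3 = math.ceil(13 * math.log(k)) ** 2
    return d1, d2, d3, d1 * d2 * d3

def combined_hash(f, hs, d2, d3):
    """Build h_i^+(x) = f(x)*d2*d3 + (i-1)*d3 + h_i(x) for each level-2
    function h_i. Distinct (bucket, i) pairs land in disjoint length-d3
    blocks of [d4]^-, so the level-2 functions never collide across blocks."""
    return [lambda x, i=i, h=h: f(x) * d2 * d3 + (i - 1) * d3 + h(x)
            for i, h in enumerate(hs, start=1)]
```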

Dynamic Algorithm Notation (w-Match_dyn, Figure 3)

Notation Definition Where introduced
$\mathcal{L}$ Collection of $\ell_0$-samplers used by the algorithm Figure 3, Step 1
$\delta$ Failure probability parameter for each $\ell_0$-sampler: $\delta = 1/(20k^4 \ln(2k))$ Figure 3, Step 2
$\ell_0$-sampler A sketch that either FAILs (prob $\leq \delta$) or returns a uniformly random surviving edge Section 2.4
$H^+(u)$ $\{h_1^+(u), \ldots, h_{d_2}^+(u)\}$: all $d_2$ hash values for vertex $u$ Figure 3, Step 4.2
$\mathbf{L}_{i,j,w}$ An $\ell_0$-sampler tracking edges of weight $w$ between vertex subsets $H_{\text{inv}}^+(i)$ and $H_{\text{inv}}^+(j)$ Figure 3, Step 4.3.1
$r$ Range of $H^+$: $r = d_1 \cdot d_2 \cdot d_3 = O(k \log^2 k)$ Section 4.2
$[r]^-$ $\{0, 1, \ldots, r-1\}$: the range each $h_i^+$ maps into Section 4.2
$M_{\max}$ A fixed maximum weighted $k$-matching in $G$ (used in analysis) Section 4.2
$E_0$ Set of edges successfully sampled from all $\ell_0$-samplers during the query phase Figure 3, Step 5
$G(E_0)$ Subgraph of $G$ induced by the sampled edge set $E_0$ Figure 3, Step 7
$C[1..r, 1..r]$ 2D array organizing $\ell_0$-samplers; $C[i,j]$ is a balanced search tree over weights Theorem 4.4 proof
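
The $\mathbf{L}_{i,j,w}$ indexing can be sketched as a lazily allocated bank of samplers. This is a simplification, not the paper's implementation: the $C[1..r,1..r]$ array of balanced search trees is replaced by a plain dict, and `make_sampler` is an assumed factory returning objects with an `update(coordinate, delta)` method:

```python
class SamplerBank:
    """One l0-sampler per triple (i, j, w): edges of weight w between
    the vertex sets H_inv^+(i) and H_inv^+(j)."""

    def __init__(self, make_sampler):
        self.make_sampler = make_sampler
        self.C = {}  # (i, j) -> {w -> sampler}; stands in for C[1..r,1..r]

    def get(self, i, j, w):
        i, j = min(i, j), max(i, j)        # unordered pair of hash values
        by_w = self.C.setdefault((i, j), {})
        if w not in by_w:
            by_w[w] = self.make_sampler()  # allocate L_{i,j,w} on first use
        return by_w[w]

    def feed(self, Hu, Hv, w, edge_index, delta):
        """Route a stream update for edge e (coordinate edge_index, weight w,
        endpoints u, v with hash-value sets Hu = H^+(u), Hv = H^+(v)) to
        every sampler L_{i,j,w} with i in Hu and j in Hv."""
        for i in Hu:
            for j in Hv:
                self.get(i, j, w).update(edge_index, delta)
```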

Probability Events (used in proofs)

Notation Definition Where introduced
$\mathcal{E}'$ (Theorem 4.2) Event that all Level-1 buckets have $< 13 \ln k$ elements from $S$ Theorem 4.2 proof
$\mathcal{E}''$ (Theorem 4.2) Event that for every bucket, $H$ contains a perfect hash function Theorem 4.2 proof
$\mathcal{E}'$ (Lemma 4.3) Event that $H^+$ produces $2k$ pairwise disjoint subsets isolating $M_{\max}$'s vertices Lemma 4.3 proof
$\mathcal{E}''$ (Lemma 4.3) Event that all $k$ relevant $\ell_0$-samplers succeed Lemma 4.3 proof
$X_{x,j}$ 0-1 random variable: 1 iff $f(x) = j$ Theorem 4.2 proof
$X_j$ $\sum_{x \in S} X_{x,j}$ = number of elements of $S$ hashed to bucket $j$ Theorem 4.2 proof

Lower Bound Notation (Appendix A)

Notation Definition Where introduced
Index problem Alice has $m$-bit string $x$, Bob has index $z \in [m]$, goal is to compute $x_z$ Appendix A.1
$k_1$ $\lceil \sqrt{m} \rceil$: used to construct the reduction graph Theorem A.1 proof
$\chi$ An injection $[m] \to [k_1] \times [k_1]$ Theorem A.1 proof
$V_L, V_R$ Two disjoint vertex subsets of size $2k_1$ each in the reduction graph Theorem A.1 proof