\end{itemize}

\noindent
\textbf{Extensions}:
\begin{itemize}
\item There exists a deterministic version of the JL Lemma for finding $f$
through derandomization techniques \cite{sivakumarderandom}.

The deterministic extension is derived from the randomized construction in
which, for a $d \times D$ matrix $R$ with entries
$R_{ij} \sim \text{DiscreteUniform}\{-1,+1\}$ and for $u \in \bbR^D$, we set
$f(u) = Ru$. The paper shows that this construction can be made
deterministic with runtime $(2^b D n / \epsilon)^{O(1)}$ and space
$O(b + \log D + \log \epsilon^{-1} + \log n)$, where $b$ is the bit
precision of the entries of $u$.
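
The randomized construction above is easy to try numerically. The sketch
below (an illustration of the randomized map $f(u) = Ru$ only, not of the
derandomized algorithm in the cited paper) draws a random $\pm 1$ matrix,
includes the usual $1/\sqrt{d}$ normalization so that squared distances are
preserved in expectation, and reports the worst relative distortion over all
pairs of a small point set; the sizes and tolerance chosen below are
arbitrary.
\begin{verbatim}
# Sketch of the randomized construction f(u) = R u with R_ij uniform in
# {-1,+1}; the 1/sqrt(d) factor normalizes squared lengths in expectation.
import numpy as np

rng = np.random.default_rng(0)
n, D, d, eps = 50, 1000, 300, 0.25

X = rng.normal(size=(n, D))                # n data points in R^D
R = rng.choice([-1.0, 1.0], size=(d, D))   # random sign matrix
Y = (X @ R.T) / np.sqrt(d)                 # f applied to every point

worst = 0.0
for i in range(n):
    for j in range(i + 1, n):
        orig = np.linalg.norm(X[i] - X[j])
        proj = np.linalg.norm(Y[i] - Y[j])
        worst = max(worst, abs(proj / orig - 1.0))
print("worst relative distortion:", worst)
print("within tolerance eps:", worst <= eps)
\end{verbatim}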
\item We cannot improve the JL Lemma result asymptotically: the target
dimension must satisfy
$$d \geq \Omega\left(\frac{\log n}{\epsilon^2}\right)$$

\item \textbf{Question:} What if $S$ is of infinite size?

In general, we cannot achieve a bound on the number of dimensions $d$
required to embed an infinitely sized $S$ in Euclidean space.

However, if $S$ is particularly structured, then progress can be made. That
is, it is possible to construct an embedding of $S$ that achieves some bound
on dimensionality.

\indent
\textbf{Example 1:} $S$ is a fixed (but unknown) $k$-dimensional affine
subspace of $\bbR^D$. If this is the case, there is a random linear map
$$f : \bbR^D \rightarrow \bbR^d$$
where $d < D$, such that mapping $S$ through $f$ preserves inter-point
distances (a small numerical sketch follows the remarks below).

Remarks and Observations:
\begin{itemize}
\item $d > k$. It is not possible to place a $k$-dimensional object into
$k-1$ dimensions and preserve inter-point distances.
\end{itemize}
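
A minimal numerical sketch of Example 1 follows. It assumes a Gaussian
random linear map (one standard choice of random linear map; the notes do
not fix a particular distribution) and checks that pairwise distances of
points drawn from a $k$-dimensional affine subspace of $\bbR^D$ are roughly
preserved, with a target dimension $d$ chosen as a function of $k$ rather
than of $n$ or $D$; all sizes below are illustrative.
\begin{verbatim}
# Sketch of Example 1: points on a fixed (but unknown) k-dimensional
# affine subspace of R^D, projected by a random Gaussian linear map
# f(u) = G u / sqrt(d).  Pairwise differences of points lie in a
# k-dimensional linear subspace, so a target dimension driven by k
# (not by the number of points n) suffices.
import numpy as np

rng = np.random.default_rng(1)
D, k, d, n = 2000, 5, 400, 200

basis = rng.normal(size=(k, D))               # spans the subspace
offset = rng.normal(size=D)                   # affine shift
X = rng.normal(size=(n, k)) @ basis + offset  # n points on the subspace

G = rng.normal(size=(d, D)) / np.sqrt(d)      # random linear map
Y = X @ G.T

ratios = [np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j])
          for i in range(n) for j in range(i + 1, n)]
print("distance ratios lie in [%.3f, %.3f]" % (min(ratios), max(ratios)))
\end{verbatim}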

However, $k$-dimensional affine spaces are very stringent structures. We
could extend random linear mapping to a more flexible notion of structure.

\indent
\textbf{Example 2:} $S$ is a \textit{curved} $k$-dimensional object (e.g., a
$k$-manifold), and its dimensionality and the distances between its points
depend on how it bends in $\bbR^D$. Here, we care about geodesic distance
(shortest distance \textit{on the manifold}).

In this case, if we take a random linear map of this $k$-dimensional
manifold, we can preserve distances approximately. In fact, a random linear
mapping preserves \textbf{both geodesic and shortest/Euclidean distance},
with $d = \Omega(\frac{k}{\epsilon^2})$.

Taking a random linear map might seem counter-intuitive because the object
can twist and turn in $\bbR^D$, or can form loops or knots, so it seems
there should be some distortion under a random linear mapping. The
$1/\epsilon^2$ factor in $d = \Omega(\frac{k}{\epsilon^2})$ accounts for
this distortion.
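
As a sanity check of this claim, the sketch below embeds a circle (the
simplest curved $1$-dimensional manifold) into $\bbR^D$, applies a Gaussian
random linear map (again, one standard choice), and compares both chord
(Euclidean) and arc-length (geodesic) distances before and after projection.
Geodesic distances along the projected curve are approximated by summing
segment lengths of a fine discretization; all sizes below are illustrative.
\begin{verbatim}
# Sketch of Example 2: a circle living in the first two coordinates of
# R^D, projected by a random Gaussian linear map.  Both Euclidean (chord)
# and geodesic (arc-length) distances come out approximately preserved.
import numpy as np

rng = np.random.default_rng(2)
D, d, m = 500, 60, 2000                   # ambient dim, target dim, samples

t = np.linspace(0, 2 * np.pi, m, endpoint=False)
X = np.zeros((m, D))
X[:, 0], X[:, 1] = np.cos(t), np.sin(t)   # unit circle embedded in R^D

G = rng.normal(size=(d, D)) / np.sqrt(d)  # random linear map
Y = X @ G.T

def arc(P):
    """Cumulative arc length along a discretized curve."""
    seg = np.linalg.norm(np.diff(P, axis=0), axis=1)
    return np.concatenate([[0.0], np.cumsum(seg)])

sX, sY = arc(X), arc(Y)
i, j = 100, 700                           # two sample points on the curve
eu = np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j])
ge = (sY[j] - sY[i]) / (sX[j] - sX[i])
print("Euclidean ratio:", eu)
print("geodesic ratio: ", ge)
\end{verbatim}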

More generally, we have
$$ d = \Omega\left(\frac{k}{\epsilon^2} + \log(V\alpha)\right) $$
where $\alpha$ is a curvature parameter and $V$ is the $k$-dimensional
volume of the object (how large it is). Although we do not prove it here,
the above can be considered a lower bound on the minimum number of
dimensions required to embed a $k$-dimensional object curved in $\bbR^D$.

We have so far considered flat objects in $\bbR^D$ or some $k$-dimensional
manifold in $\bbR^D$. What are other structures?

\indent
\textbf{Example 3:} $S$ could be sparse: every data point in a $k$-sparse
$S$ has no more than $k$ nonzero entries. \\
More specifically, every $x \in S \subseteq \bbR^D$ satisfies, for a budget
$\rho \in \bbR^+$,
$$\sum\limits_{i=1}^{D} |x_i| \leq \rho$$

If $S$ has the above property, then the entire object can be embedded in
$\ell_2$. The embedding and the dimensionality would depend on the budget
$\rho$.

\item Alternative measure for defining metric spaces of non-Euclidean
nature: growth rate / doubling dimension.

For a point $x$ in a metric space $(X, \rho)$, let
$B(x,r) = \{y \in X \mid \rho(x,y) \leq r\}$ be the ball of radius $r$
around $x$. We consider, as a measure of this non-Euclidean space, the rate
at which the number of points in the ball $B$ changes as we change the
tolerance $r$.

Define the ``doubling dimension'' $\dim(X,\rho)$ as $\lceil \log_2(k)
\rceil$, where the ``doubling constant'' $k$ is the smallest integer such
that every ball $B(x, 2r)$ can be covered by at most $k$ balls of radius
$r$. For example, $k$-dimensional Euclidean space has doubling dimension
$\Theta(k)$, so the notion generalizes the usual notion of dimension.
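
To make the definition concrete, the sketch below computes a rough empirical
estimate of the doubling constant of a finite Euclidean point set: for a
sample of centers $x$ and a fixed radius $r$, it greedily covers $B(x,2r)$
with $r$-balls centered at data points. The helper
\texttt{greedy\_cover\_count} is ours for illustration only, and because the
covering centers are restricted to data points the estimate can over-count
the true doubling constant.
\begin{verbatim}
# Rough empirical estimate of the doubling constant of a finite point set.
import numpy as np

def greedy_cover_count(points, center, r):
    """Greedily cover B(center, 2r) with r-balls centered at data points."""
    dists = np.linalg.norm(points - center, axis=1)
    ball = points[dists <= 2 * r]           # the points of B(center, 2r)
    uncovered = np.ones(len(ball), dtype=bool)
    count = 0
    while uncovered.any():
        c = ball[np.argmax(uncovered)]      # pick any uncovered point
        uncovered &= np.linalg.norm(ball - c, axis=1) > r
        count += 1
    return count

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))              # point cloud in 3 dimensions
k_est = max(greedy_cover_count(X, x, 0.5) for x in X[:50])
print("estimated doubling constant :", k_est)
print("estimated doubling dimension:", int(np.ceil(np.log2(k_est))))
\end{verbatim}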

\textit{Negative result}: Even if the doubling dimension of $S$ is small, we
still do not have a guarantee that we can do a random projection into a
reasonably low number of dimensions (linear in the doubling dimension) and
preserve inter-point distances. That is, if the doubling dimension is $k$,
then for a $p$-normed space $\ell_p^{O(k)}$ we cannot bound the distortion
by a function of $k$ alone \cite{doubling}.

Other extensions using this idea include bounding the distortion of
embeddings of tree metrics \cite{boundedgeometries}.

\end{itemize}

\noindent
\textbf{Aside: A list of concentration inequalities}
\begin{itemize}
\item \textit{Markov}: For a nonnegative random variable $X$ and $a > 0$,
$$P(X \geq a) \leq \frac{\E[X]}{a}$$
\item \textit{Chebyshev}: For a random variable $X$ with $\E[X] = \mu$ and
$\Var[X] = \sigma^2$, and for $k > 0$,
$$P(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}$$
(Apply Markov to the nonnegative variable $(X - \mu)^2$ with $a = k^2\sigma^2$.)
\item \textit{Chernoff}: An extension of the Markov Inequality to $e^{tX}$
and moment generating functions: for $t > 0$,
$$P(X \geq a) = P(e^{tX} \geq e^{ta}) \leq \frac{\E[e^{tX}]}{e^{ta}}$$
\item \textit{Hoeffding}: Bounds sums of bounded independent random
variables $X_1, \ldots, X_n$ with $X_i \in [a,b]$: for any $t > 0$,
$$P\left( \left| \frac{1}{n}\sum_{i=1}^{n} X_i - \E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] \right| \geq t \right) \leq 2\exp\left(-\frac{2nt^2}{(b-a)^2}\right)$$
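
For instance (a standard plug-in example): if $X_1, \ldots, X_n$ are
independent fair coin flips taking values in $\{0,1\}$, so that $b - a = 1$
and $\E[\frac{1}{n}\sum_i X_i] = \frac{1}{2}$, then
$$P\left( \left| \frac{1}{n}\sum_{i=1}^{n} X_i - \frac{1}{2} \right| \geq 0.1 \right)
\leq 2e^{-2n(0.1)^2} = 2e^{-n/50},$$
which for $n = 500$ is at most $2e^{-10} \approx 9.1\times 10^{-5}$.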