
Commit d52d38c

Doubling Dimension + Bounds
1 parent 284a12e commit d52d38c

2 files changed: +151 -22 lines changed

chapter_5/2_bourgain.tex

+105 -22
@@ -552,18 +552,42 @@ \section{Dimensionality Reduction in $\ell_2$}
\end{itemize}

\noindent
\textbf{Extensions}:
\begin{itemize}
\item There exists a deterministic version of the JL Lemma, in which $f$ is
found through derandomization techniques \cite{sivakumarderandom}.

The deterministic extension is derived from the randomized construction: for a
$d\times D$ matrix $R$ with entries $R_{ij}$ drawn uniformly from $\{-1,+1\}$
and $u \in \bbR^D$, take $f(u) = Ru$. The paper above shows that this
construction can be made deterministic with runtime $(2^b D n/\epsilon)^{O(1)}$
and space $O(b + \log D + \log \epsilon^{-1} + \log n)$, where $b$ is the bit
precision of the entries of $u$. (A sketch of the randomized construction
appears after this list.)

\item We cannot improve the JL Lemma result asymptotically: any such embedding requires
$$d \geq \Omega\!\left(\frac{\log n}{\epsilon^2}\right)$$

\item \textbf{Question:} What if $S$ is of infinite size?

In general, we cannot achieve a bound on the number of dimensions $d$
required to embed an infinite set $S$ in Euclidean space.

However, if $S$ is particularly structured, then progress can be made.
That is, it is possible to construct an embedding of $S$ that achieves
some bound on dimensionality.

\indent
\textbf{Example 1:} $S$ is a fixed (but unknown) $k$-dimensional affine subspace of $\bbR^D$. If this is the case, there is a random linear map
$$f : \bbR^D \rightarrow \bbR^d$$ where $d < D$, such that mapping $S$ through $f$ preserves inter-point distances.

Remarks and Observations:
\begin{itemize}
\item $d > k$. It is not possible to place a $k$-dimensional object into $k-1$ dimensions and preserve inter-point distance.
@@ -572,32 +596,91 @@ \section{Dimensionality Reduction in $\ell_2$}
\end{itemize}


However, $k$-dimensional affine spaces are very stringent structures. We could extend random linear mapping to a more flexible notion of structure.

\indent
\textbf{Example 2:} $S$ is a \textit{curved} $k$-dimensional object (e.g. a $k$ manifold), and the dimensionality and distance between points are functions of its bend in $\bbR^D$. Here, we care about geodesic distance (shortest distance \textit{on the manifold}).

In this case, if we take a random linear map of this $k$-dimensional manifold,
we can preserve distances approximately. In fact, a random linear mapping
preserves \textbf{both geodesic and shortest/Euclidean distance},
with $d = \Omega(\frac{k}{\epsilon^2})$.

Taking a random linear map might be counter-intuitive because the object can
twist and turn in $\bbR^D$, or can create loops or knots. It seems like there
should be some distortion when doing a random linear mapping. The $\epsilon^2$
in $d = \Omega(\frac{k}{\epsilon^2})$ accounts for this distortion.

More generally, we have
$$ d = \Omega\left(\frac{k}{\epsilon^2} + \log(V\alpha)\right)$$

where $\alpha$ is a curvature parameter and $V$ is the $k$-dimensional volume
of the object (how large it is). Although not proven, the above can be
considered a lower bound for the minimum number of dimensions required to
embed a $k$-dimensional object curved in $\bbR^D$.

We've so far considered flat objects in $\bbR^D$ or some $k$-dimensional manifold in $\bbR^D$. What are other structures?

\indent
\textbf{Example 3:} $S$ could be sparse. Every data point in a $k$-sparse $S$ has no more than $k$ nonzero entries. \\
More specifically, in a relaxed ($\ell_1$-budget) version, $\forall x \in S$ and $\rho \in \bbR^+$ (a.k.a.\ the budget) $$ \sum\limits_{i =1}^{D} \mid x_i \mid \leq \rho$$

If $S$ has the above property, then the entire object can be embedded in $\ell_2$. The embedding and dimensionality would depend on the budget $\rho$.

\item An alternative measure for defining metric spaces of a non-Euclidean
nature: growth rate / doubling dimension.

Let there be a ball $B(x,r) = \{y \in X \mid \rho(x,y) \leq r\}$.
We consider a measure of this non-Euclidean space as the rate
at which the number of points in the ball $B$ changes as we change
the radius $r$.

Define the ``doubling dimension'' $\text{dim}(X,\rho)$ as
$\lceil \log_2(k) \rceil$, where the ``doubling constant'' $k$
is the smallest integer such that every ball $B(x, 2r)$ can be
covered by at most $k$ balls of radius $r$. (A sketch of estimating the
doubling constant of a finite point set appears after this list.)

\textit{Negative result}: Even if the doubling dimension $k$ of $S$ is small, we still
do not have a guarantee that we can do a random projection into reasonably low
dimension $O(k)$ and preserve interpoint distances. That is, for
the $p$-normed spaces $\ell_p^{O(k)}$ with $p > 2$, we cannot bound the distortion by a
function of $k$ alone \cite{doubling}.

Other extensions using this idea include bounding the distortion of tree metrics \cite{boundedgeometries}.

\end{itemize}
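
\noindent
\textbf{Illustrative sketch.} A minimal NumPy implementation of the randomized
$\pm 1$ construction $f(u) = Ru$ from the derandomization item above. The
target dimension $d \approx 8 \ln(n)/\epsilon^2$ and the $1/\sqrt{d}$ scaling
are assumptions made for this sketch, not constants taken from the notes or
from \cite{sivakumarderandom}.
\begin{verbatim}
import numpy as np

def jl_project(points, eps=0.25, seed=0):
    """Randomized JL-style map f(u) = (1/sqrt(d)) R u, R_ij uniform in {-1, +1}.

    points: (n, D) array; returns an (n, d) array with d ~ log(n) / eps^2.
    """
    rng = np.random.default_rng(seed)
    n, D = points.shape
    d = max(1, int(np.ceil(8 * np.log(n) / eps ** 2)))  # illustrative constant
    R = rng.choice([-1.0, 1.0], size=(d, D))
    return points @ R.T / np.sqrt(d)

# Distortion check: a pairwise distance should be preserved up to a
# (1 +/- eps) factor with high probability.
X = np.random.default_rng(1).normal(size=(200, 10000))
Y = jl_project(X, eps=0.25)
orig = np.linalg.norm(X[3] - X[17])
proj = np.linalg.norm(Y[3] - Y[17])
print(f"original {orig:.3f}  projected {proj:.3f}  ratio {proj / orig:.3f}")
\end{verbatim}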
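
\noindent
\textbf{Illustrative sketch.} An empirical probe of the doubling constant of a
finite point set: greedily cover one ball $B(x, 2r)$ with radius-$r$ balls
centred at data points and count how many are used. Restricting to data-point
centers and to a single $(x, r)$ pair are assumptions of this sketch; the true
doubling constant takes the worst case over all centers and radii.
\begin{verbatim}
import numpy as np

def greedy_cover_count(points, center, r):
    """Greedily cover B(center, 2r) with radius-r balls centered at data points.

    Returns the number of radius-r balls used for this one (center, r) pair;
    the doubling constant k takes the worst case over all centers and radii.
    """
    dists = np.linalg.norm(points - center, axis=1)
    ball = points[dists <= 2 * r]              # the points of B(center, 2r)
    uncovered = np.ones(len(ball), dtype=bool)
    count = 0
    while uncovered.any():
        c = ball[np.argmax(uncovered)]         # pick any still-uncovered point
        uncovered &= np.linalg.norm(ball - c, axis=1) > r
        count += 1
    return count

rng = np.random.default_rng(0)
X = rng.uniform(size=(2000, 2))                # a 2-D set: doubling dimension ~ 2
k = greedy_cover_count(X, X[0], r=0.1)
print("balls used:", k, " log2:", round(float(np.log2(k)), 2))
\end{verbatim}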

\noindent
\textbf{Aside: A list of concentration inequalities}
\begin{itemize}
\item \textit{Markov}: For a nonnegative random variable $X$ and $a > 0$ $$P(X \geq a) \leq \frac{\E[X]}{a}$$
\item \textit{Chebyshev}: For a random variable $X$ with $\E[X] = \mu$ and $\Var[X] = \sigma^2$, and any $k>0$,
$$\Pr[|X-\mu| \geq k\sigma ] \leq \frac{1}{k^2}$$
\item \textit{Chernoff}: An extension of the Markov Inequality applied to $e^{tX}$ (for $t > 0$) and moment generating functions: $$P(X \geq a) \leq \frac{\E[e^{tX}]}{e^{ta}}$$
\item \textit{Hoeffding}: Bounds sums of independent bounded random variables $X_1, \ldots, X_n$ with $X_i \in [a,b]$ (a numerical check appears after this list):
\begin{align*}
\E\left[\exp(\lambda(X_i - \E[X_i]))\right] &\leq \exp \left( \frac{\lambda^2(b-a)^2}{8}\right)\\
\Pr\left( \frac{1}{n} \sum_{i=1}^n (X_i - \E[X_i]) \geq t\right) & \leq \exp \left( -\frac{2nt^2}{(b-a)^2}\right)
\end{align*}
\item \textit{Bernstein}: For independent zero-mean random variables $X_i$ with $|X_i| \leq 1$ and $\Var[X_i] \leq 1$ (one common form),
$$\Pr \left( \left| \frac{1}{n} \sum_{i=1}^n X_i \right| > \varepsilon\right) \leq 2 \exp \left( - \frac{n \varepsilon^2}{2(1+\frac{\varepsilon}{3})}\right)$$
\item \textit{Efron--Stein}: A moment bound on variance. Define $X^{(i)}$ as $X$ with the $i$-th coordinate replaced by an independent copy of $X_i$.
$$\Var(f(X)) \leq \frac{1}{2} \sum_{i=1}^n \E [(f(X) - f(X^{(i)}))^2]$$
\item \textit{Azuma}: A Hoeffding-type bound for martingales with bounded differences.
\item \textit{McDiarmid}: A Hoeffding-type bound for functions of independent random variables with bounded differences.
\item \textit{Talagrand}: Concentration of convex Lipschitz functions of bounded independent variables around their median (the convex-distance inequality).
\item \textit{Jensen's Inequality}: Used in the derivation of the EM Algorithm; for a convex function $f$,
$$f(\E[X]) \leq \E[f(X)]$$
\end{itemize}
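
\noindent
\textbf{Illustrative sketch.} A quick numerical sanity check of the Hoeffding
bound above, assuming $X_i \sim \mathrm{Uniform}[0,1]$ (so $a = 0$, $b = 1$)
and $t = 0.1$; the sample sizes are arbitrary choices for the illustration.
\begin{verbatim}
import numpy as np

# Compare the empirical tail P( mean(X) - E[mean(X)] >= t ) against the
# Hoeffding bound exp(-2 n t^2 / (b - a)^2) for X_i ~ Uniform[0, 1].
rng = np.random.default_rng(0)
n, t, trials = 100, 0.1, 100000
means = rng.uniform(size=(trials, n)).mean(axis=1)
empirical = float(np.mean(means - 0.5 >= t))
bound = float(np.exp(-2 * n * t ** 2))
print(f"empirical tail: {empirical:.5f}   Hoeffding bound: {bound:.5f}")
\end{verbatim}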

chapter_5/references.bib

+46
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
@inproceedings{sivakumarderandom,
  author    = {Sivakumar, D.},
  title     = {Algorithmic Derandomization via Complexity Theory},
  booktitle = {Proceedings of the Thirty-Fourth Annual ACM Symposium on Theory of Computing},
  series    = {STOC '02},
  year      = {2002},
  isbn      = {1-58113-495-9},
  location  = {Montreal, Quebec, Canada},
  pages     = {619--626},
  numpages  = {8},
  url       = {http://doi.acm.org/10.1145/509907.509996},
  doi       = {10.1145/509907.509996},
  publisher = {ACM},
  address   = {New York, NY, USA},
}

@inproceedings{boundedgeometries,
  author    = {Anupam Gupta and Robert Krauthgamer and James R. Lee},
  title     = {Bounded Geometries, Fractals, and Low-Distortion Embeddings},
  booktitle = {44th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2003)},
  year      = {2003},
  pages     = {534},
  doi       = {10.1109/SFCS.2003.1238226},
  publisher = {IEEE Computer Society},
  address   = {Los Alamitos, CA, USA}
}

@article{doubling,
  author        = {Yair Bartal and Lee{-}Ad Gottlieb and Ofer Neiman},
  title         = {On the Impossibility of Dimension Reduction for Doubling Subsets of $\ell_p$, $p > 2$},
  journal       = {CoRR},
  volume        = {abs/1308.4996},
  year          = {2013},
  url           = {http://arxiv.org/abs/1308.4996},
  archivePrefix = {arXiv},
  eprint        = {1308.4996},
}
