Entry | Johnson–Lindenstrauss lemma |
Definition |
In mathematics, the Johnson–Lindenstrauss lemma is a result named after William B. Johnson and Joram Lindenstrauss concerning low-distortion embeddings of points from high-dimensional into low-dimensional Euclidean space. The lemma states that a set of points in a high-dimensional space can be embedded into a space of much lower dimension in such a way that distances between the points are nearly preserved. The map used for the embedding is at least Lipschitz, and can even be taken to be an orthogonal projection. The lemma has uses in compressed sensing, manifold learning, dimensionality reduction, and graph embedding.

Much of the data stored and manipulated on computers, including text and images, can be represented as points in a high-dimensional space (see vector space model for the case of text). However, the essential algorithms for working with such data tend to become bogged down very quickly as dimension increases.[1] It is therefore desirable to reduce the dimensionality of the data in a way that preserves its relevant structure. The Johnson–Lindenstrauss lemma is a classic result in this vein. The lemma is also tight up to a constant factor: there exists a set of points of size $m$ that needs dimension $\Omega(\varepsilon^{-2}\log m)$ in order to preserve the distances between all pairs of points.[2]

Lemma

Given $0 < \varepsilon < 1$, a set $X$ of $m$ points in $\mathbb{R}^N$, and a number $n > 8\ln(m)/\varepsilon^2$, there is a linear map $f\colon \mathbb{R}^N \to \mathbb{R}^n$ such that

$(1-\varepsilon)\,\|u-v\|^2 \;\le\; \|f(u)-f(v)\|^2 \;\le\; (1+\varepsilon)\,\|u-v\|^2$

for all $u, v \in X$. The formula can be rearranged:

$(1+\varepsilon)^{-1}\,\|f(u)-f(v)\|^2 \;\le\; \|u-v\|^2 \;\le\; (1-\varepsilon)^{-1}\,\|f(u)-f(v)\|^2.$

One proof of the lemma takes $f$ to be a suitable multiple of the orthogonal projection onto a random subspace of dimension $n$ in $\mathbb{R}^N$, and exploits the phenomenon of concentration of measure. An orthogonal projection will, in general, reduce the average distance between points, but the lemma can be viewed as dealing with relative distances, which do not change under scaling. In a nutshell, one rolls the dice to obtain a random projection, which reduces the average distance, and then scales the distances up so that the average distance returns to its previous value. Rolling the dice repeatedly yields, in randomized polynomial time, a projection for which the (scaled) distances satisfy the lemma. (A small numerical sketch of this random-projection approach is given after the section on speeding up the transform.)

Alternate statement

A related lemma is the distributional JL lemma. It states that for any $0 < \varepsilon, \delta < 1/2$ and positive integer $d$, there exists a distribution over $\mathbb{R}^{k \times d}$ from which the matrix $A$ is drawn such that for $k = O(\varepsilon^{-2}\log(1/\delta))$ and for any unit-length vector $x \in \mathbb{R}^d$, the claim below holds:[3]

$\Pr\bigl(\,\bigl|\,\|Ax\|_2^2 - 1\,\bigr| > \varepsilon\bigr) < \delta.$

One can obtain the JL lemma from the distributional version by setting $x = (u-v)/\|u-v\|_2$ and $\delta < 1/m^2$ for a pair $u, v \in X$; the JL lemma then follows by a union bound over all such pairs.

Speeding up the JL transform

Given $A$, computing the matrix–vector product $Ax$ takes $O(kd)$ time. There has been some work on deriving distributions for which the matrix–vector product can be computed in less than $O(kd)$ time. There are two major lines of work. The first, the Fast Johnson–Lindenstrauss Transform (FJLT),[4] was introduced by Ailon and Chazelle in 2006. Another approach is to build a distribution supported over matrices that are sparse.[5][6]
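The following is a minimal numerical sketch of the random-projection idea described above, not part of the lemma itself. It draws a dense i.i.d. Gaussian matrix scaled by $1/\sqrt{k}$ (one standard way to realize a map $f$; the lemma only asserts that some linear map exists) and checks empirically that all pairwise squared distances are preserved within a factor of $1 \pm \varepsilon$. The point set, the constants, and the sparse variant at the end (in the spirit of Achlioptas's "binary coins" projection listed under Further reading) are illustrative assumptions, not prescribed by the original result.

```python
# Illustrative sketch only: a random Gaussian projection as one way to realize
# the map f from the lemma, followed by an empirical check of the distortion.
import numpy as np

rng = np.random.default_rng(0)

m, d, eps = 50, 10_000, 0.25              # number of points, ambient dimension, distortion
k = int(np.ceil(8 * np.log(m) / eps**2))  # target dimension, mirroring n > 8 ln(m)/eps^2

X = rng.normal(size=(m, d))               # any m points in R^d (random here for the demo)
A = rng.normal(size=(k, d)) / np.sqrt(k)  # scaling makes E||Ax||^2 = ||x||^2
Y = X @ A.T                               # embedded points in R^k

def pairwise_sq_dists(P):
    """Squared Euclidean distances between all pairs of rows of P."""
    G = P @ P.T
    sq = np.diag(G)
    return sq[:, None] + sq[None, :] - 2 * G

iu = np.triu_indices(m, 1)                # each unordered pair of points once
ratios = pairwise_sq_dists(Y)[iu] / pairwise_sq_dists(X)[iu]
print(f"k = {k}, ratios in [{ratios.min():.3f}, {ratios.max():.3f}]")
# Typically every ratio lies in [1 - eps, 1 + eps] = [0.75, 1.25]; the lemma's
# probabilistic argument shows a successful draw is found quickly by redrawing.

# A sparser alternative in the spirit of Achlioptas's binary-coin projections:
# entries +1, 0, -1 with probabilities 1/6, 2/3, 1/6, scaled so the expected
# squared norm is preserved; two thirds of the entries are zero, so the
# matrix-vector product is correspondingly cheaper to apply.
S = rng.choice([1.0, 0.0, -1.0], size=(k, d), p=[1/6, 2/3, 1/6])
Y_sparse = X @ (np.sqrt(3.0 / k) * S).T
ratios_sparse = pairwise_sq_dists(Y_sparse)[iu] / pairwise_sq_dists(X)[iu]
print(f"sparse: ratios in [{ratios_sparse.min():.3f}, {ratios_sparse.max():.3f}]")
```

The check only examines one random draw; as noted above, if a draw fails one simply redraws, and the concentration-of-measure argument guarantees this succeeds quickly in expectation.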
Notes

1. For instance, writing about nearest neighbor search in high-dimensional data sets, Jon Kleinberg writes: "The more sophisticated algorithms typically achieve a query time that is logarithmic in n at the expense of an exponential dependence on the dimension d; indeed, even the average case analysis of heuristics such as k-d trees reveal an exponential dependence on d in the query time." Kleinberg, Jon M. (1997). "Two Algorithms for Nearest-neighbor Search in High Dimensions". Proceedings of the Twenty-ninth Annual ACM Symposium on Theory of Computing (STOC '97). New York: ACM, pp. 599–608. doi:10.1145/258533.258653. ISBN 0-89791-888-6.
2. Larsen, Kasper Green; Nelson, Jelani (2017). "Optimality of the Johnson–Lindenstrauss Lemma". Proceedings of the 58th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 633–638. doi:10.1109/FOCS.2017.64.
3. Johnson, William B.; Lindenstrauss, Joram (1984). "Extensions of Lipschitz mappings into a Hilbert space". In Beals, Richard; Beck, Anatole; Bellow, Alexandra; Hajian, Arshag (eds.), Conference in Modern Analysis and Probability (New Haven, Conn., 1982). Contemporary Mathematics, vol. 26. Providence, RI: American Mathematical Society, pp. 189–206. doi:10.1090/conm/026/737400. ISBN 0-8218-5030-X. MR 737400.
4. Ailon, Nir; Chazelle, Bernard (2006). "Approximate nearest neighbors and the fast Johnson–Lindenstrauss transform". Proceedings of the 38th Annual ACM Symposium on Theory of Computing. New York: ACM Press, pp. 557–563. doi:10.1145/1132516.1132597. ISBN 1-59593-134-1. MR 2277181.
5. Kane, Daniel M.; Nelson, Jelani (2014). "Sparser Johnson–Lindenstrauss Transforms". Journal of the ACM 61 (1): 1. doi:10.1145/2559902. arXiv:1012.1577. MR 3167920.
6. Kane, Daniel M.; Nelson, Jelani (2012). "Sparser Johnson–Lindenstrauss Transforms". Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms. New York: Association for Computing Machinery (ACM), pp. 1195–1203. ISBN 978-1-611972-11-5. MR 3205284.

Further reading
Achlioptas, Dimitris (2003). "Database-friendly random projections: Johnson–Lindenstrauss with binary coins". Journal of Computer and System Sciences 66 (4): 671–687. doi:10.1016/S0022-0000(03)00025-4. MR 2005771. Journal version of a paper previously appearing at PODC 2001.
Baraniuk, Richard; Davenport, Mark; DeVore, Ronald; Wakin, Michael (2008). "A simple proof of the restricted isometry property for random matrices". Constructive Approximation 28 (3): 253–263. doi:10.1007/s00365-007-9003-x. MR 2453366. http://dsp.rice.edu/files/cs/JL_RIP.pdf
Dasgupta, Sanjoy; Gupta, Anupam (2003). "An elementary proof of a theorem of Johnson and Lindenstrauss". Random Structures & Algorithms 22 (1): 60–65. doi:10.1002/rsa.10073. MR 1943859. http://cseweb.ucsd.edu/~dasgupta/papers/jl.pdf