Entry | Memoization |
Definition |
In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again. Memoization has also been used in other contexts (and for purposes other than speed gains), such as in simple mutually recursive descent parsing.[1] Although related to caching, memoization refers to a specific case of this optimization, distinguishing it from forms of caching such as buffering or page replacement. In the context of some logic programming languages, memoization is also known as tabling;[2] see also lookup table.

Etymology

The term "memoization" was coined by Donald Michie in 1968[3] and is derived from the Latin word "memorandum" ("to be remembered"), usually truncated as "memo" in American English, and thus carries the meaning of "turning [the results of] a function into something to be remembered." While "memoization" might be confused with "memorization" (because they are etymological cognates), "memoization" has a specialized meaning in computing.

Overview

A memoized function "remembers" the results corresponding to some set of specific inputs. Subsequent calls with remembered inputs return the remembered result rather than recalculating it, thus eliminating the primary cost of a call with given parameters from all but the first call made to the function with those parameters. The set of remembered associations may be a fixed-size set controlled by a replacement algorithm or a fixed set, depending on the nature of the function and its use. A function can only be memoized if it is referentially transparent; that is, only if calling the function has exactly the same effect as replacing that function call with its return value. (Special case exceptions to this restriction exist, however.) While related to lookup tables, since memoization often uses such tables in its implementation, memoization populates its cache of results transparently on the fly, as needed, rather than in advance.

Memoization is a way to lower a function's time cost in exchange for space cost; that is, memoized functions become optimized for speed in exchange for a higher use of computer memory space. The time/space "cost" of algorithms has a specific name in computing: computational complexity. All functions have a computational complexity in time (i.e., they take time to execute) and in space.

Although a space-time trade-off occurs (i.e., space used is speed gained), this differs from some other optimizations that involve time-space trade-offs, such as strength reduction, in that memoization is a run-time rather than compile-time optimization. Moreover, strength reduction potentially replaces a costly operation such as multiplication with a less costly operation such as addition, and the resulting savings can be highly machine-dependent and non-portable across machines, whereas memoization is a more machine-independent, cross-platform strategy.

Consider the following pseudocode function to calculate the factorial of n:

    function factorial (n is a non-negative integer)
        if n is 0 then
            return 1 [by the convention that 0! = 1]
        else
            return factorial(n – 1) times n [recursively invoke factorial with the parameter 1 less than n]
        end if
    end function

For every integer n such that n ≥ 0, the final result of the function factorial is invariant; if invoked as x = factorial(3), the result is such that x will always be assigned the value 6.
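For concreteness, the pseudocode above corresponds to a short recursive function in an ordinary language; here is a minimal Python sketch:

    def factorial(n: int) -> int:
        """Plain recursive factorial: every call recomputes the full chain."""
        if n == 0:
            return 1  # by the convention that 0! = 1
        return factorial(n - 1) * n

    x = factorial(3)  # x is always 6; the function is referentially transparent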
In a non-memoized implementation, every top-level call to factorial includes the cumulative cost of the entire chain of recursive calls below it, so work proportional to the initial value of n is repeated in full on every invocation. A memoized version of the factorial function follows:

    function factorial (n is a non-negative integer)
        if n is 0 then
            return 1 [by the convention that 0! = 1]
        else if n is in lookup-table then
            return lookup-table-value-for-n
        else
            let x = factorial(n – 1) times n [recursively invoke factorial with the parameter 1 less than n]
            store x in lookup-table in the nth slot [remember the result of n! for later]
            return x
        end if
    end function

In this particular example, if factorial is first invoked with 5, and then invoked later with any value less than or equal to 5, those return values will also have been memoized, since factorial will have been called recursively with the values 5, 4, 3, 2, 1, and 0, and the return value for each of those will have been stored. If it is then called with a number greater than 5, such as 7, only two recursive calls will be made (7 and 6), and the value for 5! will have been stored from the earlier call. In this way, memoization allows a function to become more time-efficient the more often it is called, thus resulting in eventual overall speed-up.

Some other considerations

Functional programming

Memoization is heavily used in compilers for functional programming languages, which often use a call-by-name evaluation strategy. To avoid overhead in calculating argument values, compilers for these languages heavily use auxiliary functions called thunks to compute the argument values, and memoize these functions to avoid repeated calculations.

Automatic memoization

While memoization may be added to functions internally and explicitly by a computer programmer in much the same way the above memoized version of factorial is implemented, referentially transparent functions may also be automatically memoized externally. In programming languages where functions are first-class objects (such as Lua, Python, or Perl), automatic memoization can be implemented by replacing (at run-time) a function with its calculated value once a value has been calculated for a given set of parameters. The function that does this value-for-function-object replacement can generically wrap any referentially transparent function. Consider the following pseudocode (where it is assumed that functions are first-class values):

    function memoized-call (F is a function object parameter)
        if F has no attached array values then
            allocate an associative array called values;
            attach values to F;
        end if;

        if F.values[arguments] is empty then
            F.values[arguments] = F(arguments);
        end if;

        return F.values[arguments];
    end function

In order to call an automatically memoized version of factorial using the above strategy, rather than calling factorial directly, code invokes memoized-call(factorial(n)). Each such call first checks whether a holder array has been allocated to store results, and if not, attaches that array. If no entry exists at the position values[arguments] (where the arguments are used as the key of the associative array), a real call is made to factorial with the supplied arguments, and the result is stored; finally, the entry in the array at the key position is returned to the caller.

The above strategy requires explicit wrapping at each call to a function that is to be memoized. In those languages that allow closures, memoization can be effected implicitly by a functor factory that returns a wrapped memoized function object.
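A minimal Python sketch of such a factory follows; the helper name memoize is an illustrative choice rather than a standard API, while functools.lru_cache is the standard-library decorator that packages the same idea:

    from functools import lru_cache

    def memoize(f):
        """Functor factory: return a version of f that caches its results."""
        values = {}                      # associative array of remembered results
        def memoized_version(*args):
            if args not in values:
                values[args] = f(*args)  # real call only on a cache miss
            return values[args]
        return memoized_version

    def factorial(n):
        return 1 if n == 0 else factorial(n - 1) * n

    factorial = memoize(factorial)       # recursive calls now hit the cache too

    @lru_cache(maxsize=None)             # standard-library equivalent
    def fib(n):
        return n if n < 2 else fib(n - 1) + fib(n - 2)

In this sketch the closure over f plays the role of the alias in the pseudocode that follows: the wrapper can still reach the original function even after the public name factorial has been rebound to the memoized version.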
In pseudocode, such a functor factory can be expressed as follows:

    function construct-memoized-functor (F is a function object parameter)
        allocate a function object called memoized-version;

        let memoized-version(arguments) be
            if self has no attached array values then [self is a reference to this object]
                allocate an associative array called values;
                attach values to self;
            end if;

            if self.values[arguments] is empty then
                self.values[arguments] = F(arguments);
            end if;

            return self.values[arguments];
        end let;

        return memoized-version;
    end function

Rather than call factorial directly, a new function object memfact is created as follows:

    memfact = construct-memoized-functor(factorial)

The above example assumes that the function factorial has already been defined before the call to construct-memoized-functor is made. From this point forward, memfact(n) is called whenever the factorial of n is desired. In languages that allow a function to be replaced by a new function with the same name, the memoized version can even be installed under the original name, as in factorial = construct-memoized-functor(factorial). Essentially, such techniques involve attaching the original function object to the created functor and forwarding calls to the original function being memoized via an alias when a call to the actual function is required (to avoid endless recursion), as illustrated below:

    function construct-memoized-functor (F is a function object parameter)
        allocate a function object called memoized-version;

        let memoized-version(arguments) be
            if self has no attached array values then [self is a reference to this object]
                allocate an associative array called values;
                attach values to self;

                allocate a new function object called alias;
                attach alias to self; [for later ability to invoke F indirectly]
                self.alias = F;
            end if;

            if self.values[arguments] is empty then
                self.values[arguments] = self.alias(arguments); [not a direct call to F]
            end if;

            return self.values[arguments];
        end let;

        return memoized-version;
    end function

(Note: Some of the steps shown above may be implicitly managed by the implementation language and are provided for illustration.)

Parsers

When a top-down parser tries to parse an ambiguous input with respect to an ambiguous context-free grammar (CFG), it may need an exponential number of steps (with respect to the length of the input) to try all alternatives of the CFG in order to produce all possible parse trees, which eventually would require exponential memory space. Memoization was explored as a parsing strategy in 1991 by Norvig, who demonstrated that an algorithm similar to the use of dynamic programming and state-sets in Earley's algorithm (1970), and tables in the CYK algorithm of Cocke, Younger and Kasami, could be generated by introducing automatic memoization to a simple backtracking recursive descent parser to solve the problem of exponential time complexity.[1] The basic idea in Norvig's approach is that when a parser is applied to the input, the result is stored in a memotable for subsequent reuse if the same parser is ever reapplied to the same input.
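To make the memotable idea concrete, here is a minimal Python sketch (the rule, helper names, and return convention are illustrative assumptions, not Norvig's code) that memoizes the application of a rule at an input position:

    # Illustrative sketch of a memotable: the result of applying a rule at a
    # given input position is stored and reused if the rule is reapplied there.
    memotable = {}

    def apply_rule(rule, pos, text):
        """Apply rule at pos, reusing any result already in the memotable."""
        key = (rule.__name__, pos, text)
        if key not in memotable:
            memotable[key] = rule(pos, text)
        return memotable[key]

    def rule_x_plus(pos, text):
        """Toy ambiguous rule matching one or more 'x'; returns possible end offsets."""
        if pos >= len(text) or text[pos] != 'x':
            return set()
        return {pos + 1} | apply_rule(rule_x_plus, pos + 1, text)

    print(sorted(apply_rule(rule_x_plus, 0, "xxxab")))  # [1, 2, 3]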
Richard Frost also used memoization to reduce the exponential time complexity of parser combinators, which can be viewed as a "purely functional top-down backtracking" parsing technique.[6] He showed that basic memoized parser combinators can be used as building blocks to construct complex parsers as executable specifications of CFGs.[7][8] Memoization was again explored in the context of parsing in 1995 by Johnson and Dörre.[9][10] In 2002, it was examined in considerable depth by Ford in the form called packrat parsing.[11]

In 2007, Frost, Hafiz and Callaghan described a top-down parsing algorithm that uses memoization to avoid redundant computations and thereby accommodate any form of ambiguous CFG in polynomial time (Θ(n⁴) for left-recursive grammars and Θ(n³) for non-left-recursive grammars). Their top-down parsing algorithm also requires polynomial space for potentially exponential numbers of ambiguous parse trees through "compact representation" and "local ambiguities grouping"; their compact representation is comparable with Tomita's compact representation of bottom-up parsing.[12] Their use of memoization is not limited to retrieving previously computed results when a parser is applied to the same input position repeatedly (which is essential for the polynomial time bound); it is also specialized to perform additional tasks, such as accommodating ever-growing direct left-recursive parses by imposing depth restrictions, and determining the reusability of a saved result by comparing its computational context with the parser's current context.
Frost, Hafiz and Callaghan also described the implementation of the algorithm in PADL'08 as a set of higher-order functions (called parser combinators) in Haskell, which enables the construction of directly executable specifications of CFGs as language processors. The ability of their polynomial algorithm to accommodate "any form of ambiguous CFG" with top-down parsing is vital for syntax and semantics analysis during natural language processing. The X-SAIGA site has more about the algorithm and implementation details.

While Norvig increased the power of the parser through memoization, the augmented parser was still as time-complex as Earley's algorithm, which demonstrates a case of the use of memoization for something other than speed optimization. Johnson and Dörre[10] demonstrate another such non-speed-related application of memoization: the use of memoization to delay linguistic constraint resolution to a point in a parse where sufficient information has been accumulated to resolve those constraints. By contrast, in the speed-optimization application of memoization, Ford demonstrated that memoization could guarantee that parsing expression grammars could parse in linear time even those languages that resulted in worst-case backtracking behavior.[11]

Consider the following grammar:

    S → (A c) | (B d)
    A → X (a | b)
    B → X b
    X → x [X]

(Notation note: In the above example, the production S → (A c) | (B d) reads: "An S is either an A followed by a c or a B followed by a d." The production X → x [X] reads: "An X is an x followed by an optional X.")

This grammar generates one of the following three variations of string: xac, xbc, or xbd (where x here is understood to mean one or more x's). Next, consider how this grammar, used as a parse specification, might effect a top-down, left-right parse of the string xxxxxbd:

The rule A will recognize xxxxxb (by first descending into X to recognize one x, and again descending into X until all the x's are consumed, and then recognizing the b), and then return to S, and fail to recognize a c. The next clause of S will then descend into B, which in turn again descends into X and recognizes the x's by means of many recursive calls to X, and then a b, and returns to S and finally recognizes a d.

The key concept here is inherent in the phrase again descends into X. The process of looking forward, failing, backing up, and then retrying the next alternative is known in parsing as backtracking, and it is primarily backtracking that presents opportunities for memoization in parsing. Consider a function RuleAcceptsSomeInput(Rule, Position, Input), where the parameters are as follows: Rule is the name of the rule under consideration; Position is the offset currently under consideration in the input; and Input is the input under consideration.
Let the return value of the function RuleAcceptsSomeInput be the length of the input accepted by Rule, or 0 if that rule does not accept any input at that offset in the string. In a backtracking scenario with such memoization, the parsing process proceeds as follows:

When the rule A descends into X at offset 0, it memoizes the length 5 against that position and the rule X. After the first alternative of S has failed at d, B then, rather than descending again into X, queries position 0 against rule X in the memoization engine, and is returned a length of 5, thereby avoiding an actual second descent into X, and carries on as if it had descended into X as many times as before.

In the above example, one or many descents into X may occur, allowing for strings such as xxxxxxxxxxxxxxxxbd. In fact, there may be any number of x's before the b. While the call to S must recursively descend into X as many times as there are x's, B will never have to descend into X at all, since the return value of RuleAcceptsSomeInput(X, 0, xxxxxxxxxxxxxxxxbd) will be 16 (in this particular case).

Those parsers that make use of syntactic predicates are also able to memoize the results of predicate parses, thereby reducing such constructions as:

    S → (A)? A
    A → /* some rule */

to one descent into A.

If a parser builds a parse tree during a parse, it must memoize not only the length of the input that matches at some offset against a given rule, but also must store the sub-tree that is generated by that rule at that offset in the input, since subsequent calls to the rule by the parser will not actually descend and rebuild that tree. For the same reason, memoized parser algorithms that generate calls to external code (sometimes called a semantic action routine) when a rule matches must use some scheme to ensure that such rules are invoked in a predictable order.

Since, for any given backtracking or syntactic-predicate-capable parser, not every grammar will need backtracking or predicate checks, the overhead of storing each rule's parse results against every offset in the input (and storing the parse tree if the parsing process does that implicitly) may actually slow down a parser. This effect can be mitigated by explicit selection of those rules the parser will memoize.[13]
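As a rough illustration of the walkthrough above, the following Python sketch recognizes the example grammar and memoizes only the rule X (selective memoization), in the style of RuleAcceptsSomeInput, returning the accepted length or 0; the function names and structure are our own illustrative choices:

    # Recognizer for: S -> (A c) | (B d), A -> X (a|b), B -> X b, X -> x [X]
    # Only rule X is memoized, keyed by (input, offset).
    # Each rule returns the length of input it accepts, or 0 on failure.

    x_memo = {}

    def rule_X(inp, pos):
        key = (inp, pos)
        if key not in x_memo:
            if pos < len(inp) and inp[pos] == 'x':
                x_memo[key] = 1 + rule_X(inp, pos + 1)  # x followed by optional X
            else:
                x_memo[key] = 0
        return x_memo[key]

    def rule_A(inp, pos):
        n = rule_X(inp, pos)
        if n and pos + n < len(inp) and inp[pos + n] in 'ab':
            return n + 1
        return 0

    def rule_B(inp, pos):
        n = rule_X(inp, pos)  # a table lookup if X was already tried here
        if n and pos + n < len(inp) and inp[pos + n] == 'b':
            return n + 1
        return 0

    def rule_S(inp, pos=0):
        n = rule_A(inp, pos)  # try S -> A c
        if n and pos + n < len(inp) and inp[pos + n] == 'c':
            return n + 1
        n = rule_B(inp, pos)  # backtrack: try S -> B d; X is not re-descended
        if n and pos + n < len(inp) and inp[pos + n] == 'd':
            return n + 1
        return 0

    print(rule_S("xxxxxbd"))  # 7: the whole string is accepted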
References

1. Norvig, Peter, "Techniques for Automatic Memoization with Applications to Context-Free Parsing" (https://people.eecs.berkeley.edu/~fateman/294-8/norvig/cfparser.ps), Computational Linguistics, Vol. 17 No. 1, pp. 91–98, March 1991.
2. Warren, David, "Tabling and Datalog Programming". Accessed 29 May 2009.
3. Michie, Donald, "Memo Functions and Machine Learning" (https://www.cs.utexas.edu/users/hunt/research/hash-cons/hash-cons-papers/michie-memo-nature-1968.pdf), Nature, No. 218, pp. 19–22, 1968.
4. Hoffman, Berthold, "Term Rewriting with Sharing and Memoïzation", Algebraic and Logic Programming: Third International Conference, Proceedings, H. Kirchner and G. Levi (eds.), pp. 128–142, Volterra, Italy, 2–4 September 1992.
5. Mayfield, James, et al., "Using Automatic Memoization as a Software Engineering Tool in Real-World AI Systems", Proceedings of the Eleventh IEEE Conference on Artificial Intelligence for Applications (CAIA '95), pp. 87–93, 1995.
6. Frost, Richard and Szydlowski, Barbara, "Memoizing Purely Functional Top-Down Backtracking Language Processors", Sci. Comput. Program. 27(3): 263–288, 1996.
7. Frost, Richard, "Using Memoization to Achieve Polynomial Complexity of Purely Functional Executable Specifications of Non-Deterministic Top-Down Parsers", SIGPLAN Notices 29(4): 23–30.
8. Frost, Richard, "Monadic Memoization towards Correctness-Preserving Reduction of Search" (https://www.researchgate.net/profile/Richard_Frost/publication/221442121_Monadic_Memoization_towards_Correctness-Preserving_Reduction_of_Search/links/09e4150f60f01c6ad2000000.pdf), Canadian Conference on AI 2003, pp. 66–80.
9. Johnson, Mark, "Memoization of Top-Down Parsing", Computational Linguistics, Vol. 21 No. 3, pp. 405–417, September 1995.
10. Johnson, Mark & Dörre, Jochen, "Memoization of Coroutined Constraints", Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, Cambridge, Massachusetts, 1995.
11. Ford, Bryan, "Packrat Parsing: a Practical Linear-Time Algorithm with Backtracking" (https://dspace.mit.edu/bitstream/handle/1721.1/87310/51972156-MIT.pdf;sequence=2), Master's thesis, Massachusetts Institute of Technology, September 2002.
12. Tomita, Masaru, Efficient Parsing for Natural Language, Kluwer, Boston, MA, 1985.
13. Acar, Umut A., et al., "Selective Memoization", Proceedings of the 30th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, New Orleans, Louisiana, pp. 14–25, 15–17 January 2003.