词条 | Boyer–Moore majority vote algorithm |
释义 |
The Boyer–Moore majority vote algorithm is an algorithm for finding the majority of a sequence of elements using linear time and constant space. It is named after Robert S. Boyer and J Strother Moore, who published it in 1981,[1] and is a prototypical example of a streaming algorithm. In its simplest form, the algorithm finds a majority element, if there is one: that is, an element that occurs repeatedly for more than half of the elements of the input. However, if there is no majority, the algorithm will not detect that fact, and will still output one of the elements. A version of the algorithm that makes a second pass through the data can be used to verify that the element found in the first pass really is a majority. The algorithm will not, in general, find the mode of a sequence (an element that has the most repetitions) unless the number of repetitions is large enough for the mode to be a majority. It is not possible for a streaming algorithm to find the most frequent element in less than linear space, when the number of repetitions can be small.[2] DescriptionThe algorithm maintains in its local variables a sequence element and a counter, with the counter initially zero. It then processes the elements of the sequence, one at a time. When processing an element {{mvar|x}}, if the counter is zero, the algorithm stores {{mvar|x}} as its remembered sequence element and sets the counter to one. Otherwise, it compares {{mvar|x}} to the stored element and either increments the counter (if they are equal) or decrements the counter (otherwise). At the end of this process, if the sequence has a majority, it will be the element stored by the algorithm. This can be expressed in pseudocode as the following steps:
Even when the input sequence has no majority, the algorithm will report one of the sequence elements as its result. However, it is possible to perform a second pass over the same input sequence in order to count the number of times the reported element occurs and determine whether it is actually a majority. This second pass is needed, as it is not possible for a sublinear-space algorithm to determine whether there exists a majority element in a single pass through the input.[2] AnalysisThe amount of memory that the algorithm needs is the space for one element and one counter. In the random access model of computing usually used for the analysis of algorithms, each of these values can be stored in a machine word and the total space needed is {{math|O(1)}}. If an array index is needed to keep track of the algorithm's position in the input sequence, it doesn't change the overall constant space bound. The algorithm's bit complexity (the space it would need, for instance, on a Turing machine) is higher, the sum of the binary logarithms of the input length and the size of the universe from which the elements are drawn.[3] Both the random access model and bit complexity analyses only count the working storage of the algorithm, and not the storage for the input sequence itself. Similarly, on a random access machine the algorithm takes time {{math|O(n)}} (linear time) on an input sequence of {{mvar|n}} items, because it performs only a constant number of operations per input item. The algorithm can also be implemented on a Turing machine in time linear in the input length ({{mvar|n}} times the number of bits per input item). CorrectnessIf there is a majority element, the algorithm will always find it. For, supposing that the majority element is {{mvar|m}}, let {{mvar|c}} be a number defined at any step of the algorithm to be either the counter, if the stored element is {{mvar|m}}, or the negation of the counter otherwise. Then at each step in which the algorithm encounters a value equal to {{mvar|m}}, the value of {{mvar|c}} will increase by one, and at each step at which it encounters a different value, the value of {{mvar|c}} may either increase or decrease by one. If {{mvar|m}} truly is the majority, there will be more increases than decreases, and {{mvar|c}} will be positive at the end of the algorithm. But this can be true only when the final stored element is {{mvar|m}}, the majority element. See also
References1. ^{{citation|first1=R. S.|last1=Boyer|author1-link=Robert S. Boyer|first2=J S.|last2=Moore|author2-link=J Strother Moore|contribution=MJRTY - A Fast Majority Vote Algorithm|editor-first=R. S.|editor-last=Boyer|title=Automated Reasoning: Essays in Honor of Woody Bledsoe|series=Automated Reasoning Series|publisher=Kluwer Academic Publishers|location=Dordrecht, The Netherlands|year=1991|pages=105–117|url=http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA131702|doi=10.1007/978-94-011-3488-0_5}}. Originally published as a technical report in 1981. {{DEFAULTSORT:Boyer-Moore majority vote algorithm}}2. ^{{citation | last1 = Cormode | first1 = Graham | last2 = Hadjieleftheriou | first2 = Marios | date = October 2009 | doi = 10.1145/1562764.1562789 | issue = 10 | journal = Communications of the ACM | page = 97 | title = Finding the frequent items in streams of data | volume = 52 | quote = no algorithm can correctly distinguish the cases when an item is just above or just below the threshold in a single pass without using a large amount of space}}. 3. ^1 {{citation|title=Notes on streaming algorithms|url=http://theory.stanford.edu/~trevisan/cs154-12/notestream.pdf|first1=Luca|last1=Trevisan|author1-link=Luca Trevisan|first2=Ryan|last2=Williams|author2-link=Ryan Williams (computer scientist)|date=January 26, 2012|work=CS154: Automata and Complexity|publisher=Stanford University}}. 1 : Streaming algorithms |
随便看 |
|
开放百科全书收录14589846条英语、德语、日语等多语种百科知识,基本涵盖了大多数领域的百科知识,是一部内容自由、开放的电子版国际百科全书。