Hopkins statistic

  1. Preliminaries

  2. Definition

  3. Interpretation

  4. Variations

  5. Notes and references

  6. External links


The Hopkins statistic (introduced by Brian Hopkins and John Gordon Skellam) measures the cluster tendency of a data set.[1] It belongs to the family of sparse sampling tests. It acts as a statistical hypothesis test in which the null hypothesis is that the data were generated by a Poisson point process and are thus uniformly randomly distributed.[2] A value close to 1 tends to indicate that the data are highly clustered, and randomly distributed data tend to give values around 0.5. Regularly spaced (evenly distributed) data tend to give values below 0.5, approaching 0.[citation needed]

Preliminaries

A typical formulation of the Hopkins statistic follows.[2]

Let $X$ be the set of $n$ data points.

Consider a random sample (without replacement) of $m \ll n$ data points with members $x_i$.

Generate a set $Y$ of $m$ uniformly randomly distributed data points.

Define two distance measures,

$u_i$, the distance of $y_i \in Y$ from its nearest neighbour in $X$, and

$w_i$, the distance of $x_i \in X$ from its nearest neighbour in $X$ (other than $x_i$ itself).
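The sampling and distance steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names `nearest_distance` and `hopkins_distances` are hypothetical, the metric is assumed to be Euclidean, and the uniform points are assumed to be drawn from the axis-aligned bounding box of the data (the formulation above does not fix a sampling window).

```python
import math
import random


def nearest_distance(point, data, skip_index=None):
    """Euclidean distance from `point` to its nearest neighbour in `data`.

    `skip_index` excludes one element of `data`, so that a data point
    is not matched with itself. Brute force, for illustration only.
    """
    return min(
        math.dist(point, other)
        for j, other in enumerate(data)
        if j != skip_index
    )


def hopkins_distances(X, m, seed=0):
    """Return (u, w): nearest-neighbour distances for m uniform points
    and for m data points sampled without replacement from X."""
    rng = random.Random(seed)
    d = len(X[0])

    # Assumption: the sampling window is the bounding box of the data.
    lows = [min(p[k] for p in X) for k in range(d)]
    highs = [max(p[k] for p in X) for k in range(d)]

    # Random sample of m data points, without replacement.
    sample_idx = rng.sample(range(len(X)), m)

    # m uniformly distributed points inside the window.
    Y = [
        tuple(rng.uniform(lows[k], highs[k]) for k in range(d))
        for _ in range(m)
    ]

    u = [nearest_distance(y, X) for y in Y]
    w = [nearest_distance(X[i], X, skip_index=i) for i in sample_idx]
    return u, w
```

The brute-force search costs on the order of n distance evaluations per query point. For large data sets a spatial index would be the usual replacement, but the sketch keeps to the standard library.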

Definition

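With the above notation, if the data are $d$-dimensional, then the Hopkins statistic is defined as:

$$H = \frac{\sum_{i=1}^{m} u_i^d}{\sum_{i=1}^{m} u_i^d + \sum_{i=1}^{m} w_i^d}$$

Here $u_i$ is the nearest-neighbour distance of the $i$-th uniformly generated point, $w_i$ is that of the $i$-th sampled data point, and $m$ is the sample size.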
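As a small worked illustration, the statistic can be computed directly once the two lists of nearest-neighbour distances are known. The sketch below assumes the common form of the statistic: the sum of the d-th powers of the distances from the uniform points, divided by that sum plus the sum of the d-th powers of the distances from the sampled data points. The function name `hopkins_statistic` and the numeric inputs are made up for the example.

```python
def hopkins_statistic(u, w, d):
    """Hopkins statistic from nearest-neighbour distances.

    u: distances from the uniformly generated points to the data
    w: distances from the sampled data points to the rest of the data
    d: dimensionality of the data
    """
    if len(u) != len(w):
        raise ValueError("u and w must have the same length m")
    sum_u = sum(ui ** d for ui in u)
    sum_w = sum(wi ** d for wi in w)
    # Note: if every distance is zero the ratio is undefined
    # and this line raises ZeroDivisionError.
    return sum_u / (sum_u + sum_w)


# Equal distances on both sides: no evidence of clustering.
h_random = hopkins_statistic([0.4, 0.5], [0.4, 0.5], d=2)

# Uniform points sit far from the data while data points sit close
# to each other: 8 / (8 + 2), which suggests clustering.
h_clustered = hopkins_statistic([2.0, 2.0], [1.0, 1.0], d=2)
```

With these inputs `h_random` is 0.5 and `h_clustered` is 0.8, in line with the reading that values near 1 suggest clustering.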

Notes and references

1. Hopkins, Brian; Skellam, John Gordon (1954). "A new method for determining the type of distribution of plant individuals". Annals of Botany. Annals Botany Co. 18 (2): 213–227.
2. Banerjee, A. (2004). "Validating clusters using the Hopkins statistic". IEEE International Conference on Fuzzy Systems. pp. 149–153. doi:10.1109/FUZZY.2004.1375706.

External links

  • http://www.sthda.com/english/wiki/assessing-clustering-tendency-a-vital-issue-unsupervised-machine-learning

