Multi-label classification
In machine learning, multi-label classification and the strongly related problem of multi-output classification are variants of the classification problem in which multiple labels may be assigned to each instance. Multi-label classification is a generalization of multiclass classification, the single-label problem of categorizing instances into precisely one of more than two classes; in the multi-label problem there is no constraint on how many of the classes an instance can be assigned to. Formally, multi-label classification is the problem of finding a model that maps inputs x to binary vectors y, assigning a value of 0 or 1 to each element (label) of y.

Problem transformation methods

Several problem transformation methods exist for multi-label classification; they can be roughly broken down into:

- Binary relevance, which trains one independent binary classifier per label;
- Label powerset, which treats each observed combination of labels as a single class in a multiclass problem;
- Classifier chains, which link binary classifiers so that each classifier receives the predictions of the earlier classifiers in the chain as additional features.
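Two problem transformation methods, binary relevance and classifier chains, are available directly in scikit-learn; a minimal sketch on synthetic data (dataset sizes and base learner chosen here purely for illustration):

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

# Synthetic multi-label data: X is (200, 20), Y is a (200, 5) binary indicator matrix.
X, Y = make_multilabel_classification(n_samples=200, n_classes=5, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

# Binary relevance: one independent binary classifier per label.
br = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X_train, Y_train)

# Classifier chain: each classifier also sees the preceding labels as extra features,
# so label dependencies can be exploited.
chain = ClassifierChain(LogisticRegression(max_iter=1000), random_state=0).fit(X_train, Y_train)

print(br.predict(X_test).shape)     # binary vector y per test sample
print(chain.predict(X_test).shape)
```

Both estimators return one binary vector per instance, matching the formal definition above.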
Adapted algorithms

Some classification algorithms/models have been adapted to the multi-label task without requiring problem transformation. Examples include:

- ML-kNN, a multi-label extension of the k-nearest neighbors classifier;
- Multi-valued and multi-labeled decision trees, which generalize decision-tree learning to instances carrying several labels;
- BP-MLL, an adaptation of back-propagation neural networks to multi-label learning.
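ML-kNN itself is not part of scikit-learn, but the nearest-neighbour idea it adapts can be illustrated with scikit-learn's KNeighborsClassifier, which accepts a multilabel indicator matrix directly (ML-kNN additionally applies a Bayesian correction to the neighbour counts; this sketch omits that step):

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data: Y is a (150, 4) binary label-indicator matrix.
X, Y = make_multilabel_classification(n_samples=150, n_classes=4, random_state=1)

# KNeighborsClassifier handles the multi-label task natively:
# each label is predicted by majority vote among the k nearest neighbours.
knn = KNeighborsClassifier(n_neighbors=5).fit(X, Y)
pred = knn.predict(X[:3])
print(pred.shape)  # one 4-element binary vector per queried sample
```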
Learning paradigms

Based on learning paradigms, existing multi-label classification techniques can be classified into batch learning and online machine learning. Batch learning algorithms require all data samples to be available beforehand: the model is trained on the entire training set and then used to predict the test samples. Online learning algorithms, on the other hand, incrementally build their models in sequential iterations. In iteration t, an online algorithm receives a sample xt and predicts its label(s) ŷt using the current model; the algorithm then receives yt, the true label(s) of xt, and updates its model based on the sample-label pair (xt, yt). Recently, a new learning paradigm called the progressive learning technique has been developed.[16] The progressive learning technique is capable not only of learning from new samples but also of learning new labels introduced to the model, while retaining the knowledge learnt thus far.[17]

Multi-label Stream Classification

Data streams are possibly infinite sequences of data that grow continuously and rapidly over time.[18] Multi-label stream classification (MLSC) is the version of the multi-label classification task that takes place in data streams; it is sometimes also called online multi-label classification. Here the difficulties of multi-label classification (an exponential number of possible label sets, capturing dependencies between labels) are combined with the difficulties of data streams (time and memory constraints, addressing an infinite stream with finite means, concept drift). Many MLSC methods resort to ensemble methods in order to increase their predictive performance and to cope with concept drift. Among the most widely used ensemble methods in the literature are:

- Online bagging and boosting, which approximate bootstrap resampling on a stream;
- Ensembles of pruned sets and other committee-based multi-label stream classifiers;
- Ensembles paired with a drift detector such as ADWIN, which replace ensemble members when concept drift is detected.
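The online protocol described above (predict on xt, receive yt, update) can be sketched with one incremental per-label learner, here scikit-learn's SGDClassifier and its partial_fit method; the binary-relevance decomposition and data sizes are illustrative choices, not a prescribed MLSC method:

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import SGDClassifier

# A finite stand-in for a stream: samples arrive one at a time.
X, Y = make_multilabel_classification(n_samples=300, n_classes=3, random_state=0)

# One online binary learner per label (binary relevance over the stream).
learners = [SGDClassifier(random_state=0) for _ in range(Y.shape[1])]

errors = 0
for t in range(X.shape[0]):
    x_t = X[t:t + 1]
    if t > 0:
        # Predict ŷt with the current model before seeing the true labels.
        y_hat = np.array([clf.predict(x_t)[0] for clf in learners])
        errors += int(np.any(y_hat != Y[t]))
    # Receive yt and update each per-label model incrementally.
    for j, clf in enumerate(learners):
        clf.partial_fit(x_t, Y[t:t + 1, j], classes=[0, 1])

print(f"prequential error rate: {errors / (X.shape[0] - 1):.2f}")
```

Evaluating each sample before training on it, as here, is the prequential (test-then-train) scheme commonly used for streams.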
Statistics and evaluation metrics

The extent to which a dataset is multi-label can be captured in two statistics:

- Label cardinality, the average number of labels per example;
- Label density, the label cardinality divided by the total number of labels, i.e. the average fraction of labels assigned per example.
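Both statistics are straightforward to compute from a binary label-indicator matrix:

```python
import numpy as np

# Toy label matrix: 3 samples, 4 labels.
Y = np.array([[1, 0, 1, 0],
              [0, 1, 0, 0],
              [1, 1, 1, 0]])

# Label cardinality: average number of labels per sample.
cardinality = Y.sum(axis=1).mean()   # (2 + 1 + 3) / 3 = 2.0

# Label density: cardinality normalised by the number of labels.
density = cardinality / Y.shape[1]   # 2.0 / 4 = 0.5

print(cardinality, density)
```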
Evaluation metrics for multi-label classification performance differ inherently from those used in multi-class (or binary) classification. If T denotes the true set of labels for a given sample, and P the predicted set of labels, then the following metrics can be defined on that sample:

- Precision: |T ∩ P| / |P|;
- Recall: |T ∩ P| / |T|;
- F1 score: the harmonic mean of precision and recall;
- Accuracy (Jaccard index): |T ∩ P| / |T ∪ P|;
- Exact match (subset accuracy): 1 if P = T and 0 otherwise.
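These per-sample, set-based metrics translate directly into set operations; a worked example on one hypothetical sample:

```python
T = {"news", "politics", "economy"}   # true label set for one sample
P = {"news", "sports", "economy"}     # predicted label set

precision = len(T & P) / len(P)                     # 2/3: share of predictions that are correct
recall = len(T & P) / len(T)                        # 2/3: share of true labels recovered
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
jaccard = len(T & P) / len(T | P)                   # 2/4 = 0.5
exact_match = int(T == P)                           # 0: the sets differ

print(precision, recall, f1, jaccard, exact_match)
```

Dataset-level scores are obtained by averaging these quantities over all samples.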
Cross-validation in multi-label settings is complicated by the fact that the ordinary (binary/multiclass) form of stratified sampling does not work; alternative ways of approximate stratified sampling have been suggested.[28]

Implementations and datasets

Java implementations of multi-label algorithms are available in the Mulan and Meka software packages, both based on Weka. The scikit-learn Python package implements some multi-label algorithms and metrics. The binary relevance method, classifier chains, and other multi-label algorithms with many different base learners are implemented in the R package mlr (https://mlr-org.github.io/mlr/articles/tutorial/multilabel.html).[29] A list of commonly used multi-label datasets is available at the Mulan website.
References

1. Read, J.; Pfahringer, B.; Holmes, G.; Frank, E. (2011). Classifier Chains for Multi-label Classification. Machine Learning 85(3). Springer.
2. Read, J.; Martino, L.; Luengo, D. (2014). Efficient Monte Carlo methods for multi-dimensional learning with classifier chains. Pattern Recognition 47(3): 1535–1546. doi:10.1016/j.patcog.2013.10.006. arXiv:1211.2190.
3. Read, J.; Martino, L.; Olmos, P. M.; Luengo, D. (2015). Scalable multi-output label prediction: From classifier chains to classifier trellises. Pattern Recognition 48(6): 2096–2109. doi:10.1016/j.patcog.2015.01.004. arXiv:1501.04870.
4. Heider, D.; Senge, R.; Cheng, W.; Hüllermeier, E. (2013). Multilabel classification for exploiting cross-resistance information in HIV-1 drug resistance prediction. Bioinformatics 29(16): 1946–1952. doi:10.1093/bioinformatics/btt331.
5. Riemenschneider, M.; Senge, R.; Neumann, U.; Hüllermeier, E.; Heider, D. (2016). Exploiting HIV-1 protease and reverse transcriptase cross-resistance information for improved drug resistance prediction by means of multi-label classification. BioData Mining 9: 10. doi:10.1186/s13040-016-0089-1.
6. Soufan, O.; Ba-Alawi, W.; Afeef, M.; Essack, M.; Kalnis, P.; Bajic, V. B. (2016). DRABAL: novel method to mine large high-throughput screening assays using Bayesian active learning. Journal of Cheminformatics 8: 64. doi:10.1186/s13321-016-0177-8.
7. Spolaôr, N.; Cherman, E. A.; Monard, M. C.; Lee, H. D. (2013). A Comparison of Multi-label Feature Selection Methods using the Problem Transformation Approach. Electronic Notes in Theoretical Computer Science 292: 135–151. doi:10.1016/j.entcs.2013.02.010.
8. "Discrimination Threshold — yellowbrick 0.9 documentation". www.scikit-yb.org. Retrieved 2018-11-29.
9. Tsoumakas, G.; Vlahavas, I. (2007). Random k-labelsets: An ensemble method for multilabel classification. ECML.
10. Zhang, M.-L.; Zhou, Z.-H. (2007). ML-KNN: A lazy learning approach to multi-label learning. Pattern Recognition 40(7): 2038–2048. doi:10.1016/j.patcog.2006.12.019.
11. Madjarov, G.; Kocev, D.; Gjorgjevikj, D.; Džeroski, S. (2012). An extensive experimental comparison of methods for multi-label learning. Pattern Recognition 45(9): 3084–3104. doi:10.1016/j.patcog.2012.03.004.
12. Chen, Y.-L.; Hsu, C.-L.; Chou, S.-C. (2003). Constructing a multi-valued and multi-labeled decision tree. Expert Systems with Applications 25(2): 199–209. doi:10.1016/S0957-4174(03)00047-2.
13. Chou, S.; Hsu, C.-L. (2005). MMDT: a multi-valued and multi-labeled decision tree classifier for data mining. Expert Systems with Applications 28(4): 799–812. doi:10.1016/j.eswa.2004.12.035.
14. Li, H.; Guo, Y.-J.; Wu, M.; Li, P.; Xiang, Y. (2010). Combine multi-valued attribute decomposition with multi-label learning. Expert Systems with Applications 37(12): 8721–8728. doi:10.1016/j.eswa.2010.06.044.
15. Zhang, M.-L.; Zhou, Z.-H. (2006). Multi-label neural networks with applications to functional genomics and text categorization. IEEE Transactions on Knowledge and Data Engineering 18: 1338–1351.
16. Dave, M.; Tapiawala, S.; Er, M. J.; Venkatesan, R. (2016). A Novel Progressive Multi-label Classifier for Class-incremental Data. arXiv:1609.07215.
17. Venkatesan, R. Progressive Learning Technique for Multi-label Classification. http://rajasekarv.wixsite.com/rajasekar-venkatesan/progressive-learning-multi-label
18. Aggarwal, C. C., ed. (2007). Data Streams. Advances in Database Systems 31. doi:10.1007/978-0-387-47534-9.
19. Oza, N. (2005). Online Bagging and Boosting. IEEE International Conference on Systems, Man and Cybernetics.
20. Read, J.; Pfahringer, B.; Holmes, G. (2008). Multi-label Classification Using Ensembles of Pruned Sets. IEEE Computer Society, pp. 995–1000. doi:10.1109/ICDM.2008.74.
21. Osojnik, A.; Panov, P.; Džeroski, S. (2017). Multi-label classification via multi-target regression on data streams. Machine Learning 106(6): 745–770. doi:10.1007/s10994-016-5613-5.
22. Sousa, R.; Gama, J. (2018). Multi-label classification from high-speed data streams with adaptive model rules and random rules. Progress in Artificial Intelligence 7(3): 177–187. doi:10.1007/s13748-018-0142-z.
23. Read, J.; Bifet, A.; Holmes, G.; Pfahringer, B. (2012). Scalable and efficient multi-label classification for evolving data streams. Machine Learning 88(1–2): 243–272. doi:10.1007/s10994-012-5279-6.
24. Bifet, A.; Gavaldà, R. (2007). Learning from Time-Changing Data with Adaptive Windowing. Proceedings of the 2007 SIAM International Conference on Data Mining, pp. 443–448. doi:10.1137/1.9781611972771.42.
25. Büyükçakir, A.; Bonab, H.; Can, F. (2018). A Novel Online Stacked Ensemble for Multi-Label Stream Classification. ACM, pp. 1063–1072. doi:10.1145/3269206.3271774. arXiv:1809.09994.
26. Spyromitros-Xioufis, E.; Spiliopoulou, M.; Tsoumakas, G.; Vlahavas, I. (2011). Dealing with concept drift and class imbalance in multi-label stream classification. AAAI Press, pp. 1583–1588. doi:10.5591/978-1-57735-516-8/IJCAI11-266.
27. Godbole, S.; Sarawagi, S. (2004). Discriminative methods for multi-labeled classification. Advances in Knowledge Discovery and Data Mining, pp. 22–30.
28. Sechidis, K.; Tsoumakas, G.; Vlahavas, I. (2011). On the stratification of multi-label data. ECML PKDD, pp. 145–158.
29. Probst, P.; Au, Q.; Casalicchio, G.; Stachl, C.; Bischl, B. (2017). Multilabel Classification with R Package mlr. The R Journal 9(1): 352–369.