Hyperparameter (machine learning)

Contents

  1. Considerations
      1.1 Tunability
      1.2 Robustness
  2. Optimization
  3. Reproducibility
      3.1 Services
      3.2 Software
  4. See also
  5. References

{{About|hyperparameters in machine learning|hyperparameters in Bayesian statistics|Hyperparameter}}

In machine learning, a hyperparameter is a parameter whose value is set before the learning process begins. By contrast, the values of other parameters are derived via training.

Different model training algorithms require different hyperparameters, while some simple algorithms (such as ordinary least squares regression) require none. Given these hyperparameters, the training algorithm learns the parameters from the data. For instance, LASSO is an algorithm that adds a regularization hyperparameter to ordinary least squares regression; this hyperparameter must be set before the parameters are estimated by the training algorithm.
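
As an illustration, a minimal sketch using scikit-learn's Lasso estimator on a synthetic dataset: the regularization strength alpha is a hyperparameter fixed before fitting, while the coefficients are the parameters learned from the data.

  from sklearn.datasets import make_regression
  from sklearn.linear_model import Lasso

  X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=0)

  model = Lasso(alpha=0.1)   # hyperparameter: chosen before training begins
  model.fit(X, y)            # parameters (model.coef_, model.intercept_) are learned
  print(model.coef_)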

Considerations

The time required to train and test a model can depend upon the choice of its hyperparameters.[1] A hyperparameter is usually of continuous or integer type, leading to mixed-type optimization problems.[1] The existence of some hyperparameters is conditional upon the value of others, e.g. the size of each hidden layer in a neural network can be conditional upon the number of layers.[1]
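
For example, a minimal sketch of sampling from such a mixed-type, conditional search space (the names and ranges here are illustrative, not from any specific library):

  import random

  def sample_configuration(rng: random.Random) -> dict:
      n_layers = rng.randint(1, 4)                   # integer hyperparameter
      # Conditional hyperparameters: one hidden-layer size exists per layer,
      # so these only come into being once n_layers has been drawn.
      hidden_sizes = [rng.choice([32, 64, 128]) for _ in range(n_layers)]
      learning_rate = 10 ** rng.uniform(-4, -1)      # continuous hyperparameter
      return {"n_layers": n_layers,
              "hidden_sizes": hidden_sizes,
              "learning_rate": learning_rate}

  print(sample_configuration(random.Random(0)))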

Tunability

Most performance variation can be attributed to just a few hyperparameters.[2][1][3] The tunability of an algorithm, hyperparameter, or interacting hyperparameters is a measure of how much performance can be gained by tuning it.[4] For an LSTM, the learning rate, followed by the network size, is its most crucial hyperparameter,[5] whereas batching and momentum have no significant effect on its performance.[6]

Although some research has advocated the use of mini-batch sizes in the thousands, other work has found the best performance with mini-batch sizes between 2 and 32.[7]

Robustness

The inherent stochasticity of learning implies that a hyperparameter's empirical performance is not necessarily its true performance.[1] Methods that are not robust to simple changes in hyperparameters, random seeds, or even different implementations of the same algorithm cannot be integrated into mission-critical control systems without significant simplification and robustification.[8]

Reinforcement learning algorithms, in particular, require measuring their performance over a large number of random seeds, and also measuring their sensitivity to choices of hyperparameters.[8] Their evaluation with a small number of random seeds does not capture performance adequately due to high variance.[8] Some reinforcement learning methods, e.g. DDPG (Deep Deterministic Policy Gradient), are more sensitive to hyperparameter choices than others.[8]
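
A minimal sketch of such an evaluation (train_and_evaluate is a hypothetical stand-in for a real training run): the same hyperparameters are scored across many seeds and summarized by mean and spread rather than by a single run.

  import random
  import statistics

  def train_and_evaluate(seed: int, learning_rate: float) -> float:
      # Stub: a real run would train an agent here; the noise term
      # mimics the stochasticity of training.
      rng = random.Random(seed)
      return 100 * learning_rate + rng.gauss(0, 5)

  def evaluate_over_seeds(learning_rate: float, n_seeds: int = 20):
      returns = [train_and_evaluate(seed, learning_rate) for seed in range(n_seeds)]
      return statistics.mean(returns), statistics.stdev(returns)

  mean, std = evaluate_over_seeds(learning_rate=0.1)
  print(f"mean return {mean:.1f} +/- {std:.1f} over 20 seeds")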

Optimization

{{main|Hyperparameter optimization}}

Hyperparameter optimization finds a tuple of hyperparameters that yields an optimal model which minimizes a predefined loss function on given test data.[1] The objective function takes a tuple of hyperparameters and returns the associated loss.[1]
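
A minimal sketch of the simplest form, grid search; the quadratic objective below is a hypothetical stand-in for training a model and scoring it on held-out data.

  from itertools import product

  def objective(learning_rate: float, n_layers: int) -> float:
      # Stub: a real objective would train a model with these
      # hyperparameters and return its loss on test data.
      return (learning_rate - 0.01) ** 2 + (n_layers - 2) ** 2

  search_space = product([1e-3, 1e-2, 1e-1], [1, 2, 3])
  best = min(search_space, key=lambda hp: objective(*hp))
  print("best hyperparameters:", best)   # (0.01, 2) for this stub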

Reproducibility

Apart from tuning hyperparameters, machine learning involves storing and organizing the parameters and results, and making sure they are reproducible.[9] In the absence of a robust infrastructure for this purpose, research code often evolves quickly and compromises essential aspects like bookkeeping and reproducibility.[10] Online collaboration platforms for machine learning go further by allowing scientists to automatically share, organize and discuss experiments, data, and algorithms.[11]

A number of relevant services and open source software exist:

Services

Name | Interfaces
[https://www.comet.ml/ Comet.ml][12] | Python[13]
[https://www.openml.org/ OpenML][14][11][15][16] | REST, Python, Java, R[17]

Software

Name | Interfaces | Store
[https://github.com/openml/openml-docker-dev OpenML Docker][14][11][15][16] | REST, Python, Java, R[17] | MySQL
[https://github.com/IDSIA/sacred sacred][9][10] | Python[18] | file, MongoDB, TinyDB, SQL
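
As a concrete illustration, a minimal sketch of a sacred experiment, based on sacred's documented decorator API: each run captures its configuration and result, which is what makes the bookkeeping reproducible.

  from sacred import Experiment

  ex = Experiment('lasso_demo')

  @ex.config
  def config():
      alpha = 0.1   # hyperparameter; recorded automatically with each run

  @ex.automain
  def run(alpha):
      from sklearn.datasets import make_regression
      from sklearn.linear_model import Lasso
      X, y = make_regression(n_samples=100, n_features=10, random_state=0)
      model = Lasso(alpha=alpha).fit(X, y)
      return float(model.score(X, y))   # stored as the run's result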

See also

  • Hyper-heuristic
  • Replication crisis

References

1. Claesen, Marc, and Bart De Moor. "Hyperparameter Search in Machine Learning." arXiv preprint arXiv:1502.02127 (2015).
2. Hutter, Frank, Holger Hoos, and Kevin Leyton-Brown. "An Efficient Approach for Assessing Hyperparameter Importance." International Conference on Machine Learning (2014). http://proceedings.mlr.press/v32/hutter14.html
3. van Rijn, Jan N., and Frank Hutter. "Hyperparameter Importance Across Datasets." arXiv preprint arXiv:1710.04725 (2017).
4. Probst, Philipp, Bernd Bischl, and Anne-Laure Boulesteix. "Tunability: Importance of Hyperparameters of Machine Learning Algorithms." arXiv preprint arXiv:1802.09596 (2018).
5. Greff, Klaus, et al. "LSTM: A Search Space Odyssey." IEEE Transactions on Neural Networks and Learning Systems (2017). http://ieeexplore.ieee.org/abstract/document/7508408/
6. Breuel, Thomas M. "Benchmarking of LSTM Networks." arXiv preprint arXiv:1508.02774 (2015).
7. Masters, Dominic, and Carlo Luschi. "Revisiting Small Batch Training for Deep Neural Networks." arXiv preprint arXiv:1804.07612 (2018).
8. Mania, Horia, Aurelia Guy, and Benjamin Recht. "Simple Random Search Provides a Competitive Approach to Reinforcement Learning." arXiv preprint arXiv:1803.07055 (2018).
9. Greff, Klaus, and Jürgen Schmidhuber. "Introducing Sacred: A Tool to Facilitate Reproducible Research." (2015). https://indico.lal.in2p3.fr/event/2914/contributions/6476/subcontributions/169/attachments/6034/7159/Sacred_3.pdf
10. Greff, Klaus, et al. "The Sacred Infrastructure for Computational Research." SciPy Conference Proceedings (2017). http://conference.scipy.org/proceedings/scipy2017/pdfs/klaus_greff.pdf
11. Vanschoren, Joaquin, et al. "OpenML: Networked Science in Machine Learning." arXiv preprint arXiv:1407.7722 (2014).
12. "Comet.ml – Machine Learning Experiment Management." KDnuggets (2018). https://www.kdnuggets.com/2018/04/comet-ml-machine-learning-experiment-management.html
13. "comet-ml." PyPI. https://pypi.python.org/pypi/comet-ml
14. Van Rijn, Jan N., et al. "OpenML: A Collaborative Science Platform." Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Lecture Notes in Computer Science, vol. 7908, pp. 645–649. Springer (2013). doi:10.1007/978-3-642-40994-3_46. ISBN 978-3-642-38708-1.
15. Vanschoren, Joaquin, Jan N. van Rijn, and Bernd Bischl. "Taking Machine Learning Research Online with OpenML." Proceedings of the 4th International Conference on Big Data, Streams and Heterogeneous Source Mining, vol. 41. JMLR (2015). http://www.jmlr.org/proceedings/papers/v41/vanschoren15.pdf
16. van Rijn, J. N. "Massively Collaborative Machine Learning." Doctoral dissertation, Leiden University (2016). https://openaccess.leidenuniv.nl/handle/1887/44814
17. "OpenML." GitHub. https://github.com/openml
18. "sacred." PyPI. https://pypi.python.org/pypi/sacred

Categories: Machine learning | Model selection
