Comparative analysis of machine learning methods by the example of the problem of determining muon decay
- Authors: Gevorkyan M.N.1, Demidova A.V.1, Kulyabov D.S.1,2
- Affiliations:
- Peoples’ Friendship University of Russia (RUDN University)
- Joint Institute for Nuclear Research
- Issue: Vol. 28, No. 2 (2020)
- Pages: 105-119
- Section: Computer Science
- URL: https://journal-vniispk.ru/2658-4670/article/view/315316
- DOI: https://doi.org/10.22363/2658-4670-2020-28-2-105-119
- ID: 315316
Abstract
The history of using machine learning algorithms to analyze statistical models is quite long, and the development of computer technology has given these algorithms new life. Nowadays deep learning is the mainstream and most popular area of machine learning. However, the authors believe that many researchers try to apply deep learning methods beyond their area of applicability. This happens because of the widespread availability of software systems that implement deep learning algorithms and the apparent simplicity of such research. All this motivated the authors to compare deep learning algorithms with classical machine learning algorithms. A Large Hadron Collider experiment was chosen for this task, because the authors are familiar with this scientific field and because the experimental data are openly available. The article compares various machine learning algorithms applied to the problem of recognizing the decay reaction τ− → μ− + μ− + μ+ at the Large Hadron Collider. The authors use open-source implementations of machine learning algorithms. We compare the algorithms with each other based on calculated metrics. As a result of the research, we conclude that all the considered machine learning methods are quite comparable with each other (with respect to the selected metrics), while different methods have different areas of applicability.
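The comparison workflow summarized above can be sketched with scikit-learn alone: train several classical classifiers on a binary signal-vs-background task and rank them by a common metric such as ROC AUC. The synthetic dataset and model settings below are illustrative assumptions, not the actual decay features or hyperparameters used in the article.

```python
# Sketch of a metric-based comparison of classical classifiers.
# The Kaggle "Flavours of Physics" data is not bundled here, so a
# synthetic dataset stands in for the real decay features (an assumption).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic "signal vs. background" classification problem.
X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Illustrative model zoo; hyperparameters are placeholders.
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model and score it on the held-out set with ROC AUC.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    proba = model.predict_proba(X_test)[:, 1]
    scores[name] = roc_auc_score(y_test, proba)
    print(f"{name}: ROC AUC = {scores[name]:.3f}")
```

In the article itself the same pattern extends to deep learning models (Keras) and boosted trees (XGBoost); because every model exposes the same fit/predict interface, adding a method to the comparison only requires another entry in the dictionary.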
About the authors
Migran Gevorkyan
Peoples’ Friendship University of Russia (RUDN University)
Corresponding author.
Email: gevorkyan-mn@rudn.ru
Candidate of Sciences in Physics and Mathematics, Assistant Professor of Department of Applied Probability and Informatics
6, Miklukho-Maklaya St., Moscow, 117198, Russian Federation
Anastasia Demidova
Peoples’ Friendship University of Russia (RUDN University)
Email: demidova-av@rudn.ru
Candidate of Sciences in Physics and Mathematics, Assistant Professor of Department of Applied Probability and Informatics
6, Miklukho-Maklaya St., Moscow, 117198, Russian Federation
Dmitry Kulyabov
Peoples’ Friendship University of Russia (RUDN University); Joint Institute for Nuclear Research
Email: kulyabov-ds@rudn.ru
Docent, Doctor of Sciences in Physics and Mathematics, Professor at the Department of Applied Probability and Informatics
6, Miklukho-Maklaya St., Moscow, 117198, Russian Federation; 6, Joliot-Curie St., Dubna, Moscow region, 141980, Russian Federation
References
- M. N. Gevorkyan, A. V. Demidova, T. S. Demidova, and A. A. Sobolev, “Review and comparative analysis of machine learning libraries for machine learning,” Discrete and Continuous Models and Applied Computational Science, vol. 27, no. 4, pp. 305-315, Dec. 2019. DOI: 10.22363/2658-4670-2019-27-4-305-315.
- L. A. Sevastianov, A. L. Sevastianov, E. A. Ayrjan, A. V. Korolkova, D. S. Kulyabov, and I. Pokorny, “Structural Approach to the Deep Learning Method,” in Proceedings of the 27th Symposium on Nuclear Electronics and Computing (NEC-2019), V. Korenkov, T. Strizh, A. Nechaevskiy, and T. Zaikina, Eds., ser. CEUR Workshop Proceedings, vol. 2507, Budva, Sep. 2019, pp. 272-275.
- P. Langacker, The standard model and beyond, ser. Series in High Energy Physics, Cosmology and Gravitation. CRC Press, 2009.
- I. Lakatos, “Falsification and the Methodology of Scientific Research Programmes,” in Criticism and the growth of Knowledge, I. Lakatos and A. Musgrave, Eds., Cambr. University Press, 1970, pp. 91-195.
- R. Aaij et al., “Search for the lepton flavour violating decay τ− → μ− + μ+ + μ−,” Journal of High Energy Physics, vol. 2015, no. 2, p. 121, Feb. 2015. DOI: 10.1007/JHEP02(2015)121. arXiv: 1409.8548.
- (2018). “Flavours of Physics: Finding τ → μμμ (Kernels Only),” [Online]. Available: https://www.kaggle.com/c/flavours-of-physics-kernels-only.
- F. Pedregosa et al., “Scikit-learn: Machine Learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825-2830, 2011.
- F. Chollet. (2020). “Keras,” [Online]. Available: https://keras.io/.
- (2020). “XGBoost Documentation,” [Online]. Available: https://xgboost.readthedocs.io.
- (2020). “Hep_ml,” [Online]. Available: https://arogozhnikov.github.io.
- (2020). “CNTK official repository,” [Online]. Available: https://github.com/Microsoft/cntk.
- Theano Development Team, “Theano: A Python framework for fast computation of mathematical expressions,” arXiv e-prints, vol. abs/1605.0, 2016.
- I. H. Witten, E. Frank, M. A. Hall, and C. J. Pal, Data Mining: Practical Machine Learning Tools and Techniques, ser. The Morgan Kaufmann Series in Data Management Systems. Elsevier, 2011. DOI: 10.1016/C2009-0-19715-5.
- A. Bruce and P. Bruce, Practical Statistics for Data Scientists: 50 Essential Concepts. O’Reilly Media, 2017.
- J. VanderPlas, Python Data Science Handbook: Essential Tools for Working with Data. O’Reilly Media, 2016.
- (2020). “Scikit-learn home site,” [Online]. Available: https://scikit-learn.org/stable/.
- D. W. Hosmer, S. Lemeshow, and R. X. Sturdivant, Applied Logistic Regression, ser. Wiley Series in Probability and Statistics. Wiley, 2013.
- J. M. Hilbe, Logistic Regression Models, ser. Chapman & Hall/CRC Texts in Statistical Science. Chapman and Hall/CRC, May 2009. DOI: 10.1201/9781420075779.
- D. Ruppert, “The Elements of Statistical Learning: Data Mining, Inference, and Prediction,” Journal of the American Statistical Association, Springer Series in Statistics, vol. 99, no. 466, p. 567, 2004. DOI: 10.1198/jasa.2004.s339.
- R. Collins, Machine Learning with Bagging and Boosting. Amazon Digital Services LLC - Kdp Print Us, 2018.
- J. H. Friedman, “Greedy function approximation: A gradient boosting machine,” Annals of Statistics, vol. 29, no. 5, pp. 1189-1232, 2001. DOI: 10.2307/2699986.
- A. W. Kemp and B. F. J. Manly, Randomization, Bootstrap and Monte Carlo Methods in Biology, ser. Chapman & Hall/CRC Texts in Statistical Science 4. CRC Press, Dec. 1997, vol. 53. DOI: 10.2307/2533527.
- O. Soranson, Python Data Science Handbook: The Ultimate Guide to Learn How to Use Python for Data Analysis and Data Science. Learn the Essential Tools for Beginners to Work with Data, ser. Artificial Intelligence Series. Amazon Digital Services LLC - KDP Print US, 2019.
- M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, and J. Dean. (2015). “TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems,” [Online]. Available: http://tensorflow.org/.
- (2020). “TensorFlow home site,” [Online]. Available: https://www.tensorflow.org/.
- A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, “Automatic differentiation in PyTorch,” in 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 2017.
