ACTIVATION FUNCTIONS OF NEURAL NETWORKS, THEIR VISUALIZATION AND APPLICATION IN PROGRAMMING USING THE TENSORFLOW LIBRARY
DOI: https://doi.org/10.36773/1818-1112-2026-139-1-123-130

Keywords: neural networks, activation function, deep learning, nonlinearity, vanishing gradient, adaptivity, ReLU, TensorFlow

Abstract
This paper presents a comprehensive study of activation functions (AFs), a fundamental component of artificial neural networks that determines a node's output from its input. The paper systematically examines the role of AFs as the critical source of nonlinearity that enables deep architectures to learn complex patterns and act as universal approximators.
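To make the role of nonlinearity concrete, the following minimal sketch (an illustration rather than code from the paper; the layer sizes and random data are assumptions) contrasts a stack of purely linear layers, whose composition collapses to a single linear map, with the same stack given a ReLU activation:

```python
import tensorflow as tf

x = tf.random.normal((8, 3))  # illustrative batch of 8 three-feature inputs

# Two stacked linear layers with no activation: their composition is
# still a single linear map, so depth adds no expressive power.
linear_stack = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation=None),
    tf.keras.layers.Dense(1, activation=None),
])

# The same stack with a ReLU between the layers is nonlinear and,
# with enough units, can approximate complex functions.
nonlinear_stack = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation=None),
])

print(linear_stack(x).shape, nonlinear_stack(x).shape)  # (8, 1) (8, 1)
```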
The paper traces the evolution of mathematical methods for signal transformation: from the simple linear identity function, suitable only for scalar regression problems, to complex adaptive and non-monotonic constructions. It analyzes S-shaped curves (Sigmoid, Tanh), the ReLU family, exponential units, learnable functions, and a diverse range of experimental designs.
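For reference, the classic families named above are all available as built-in TensorFlow operations; the sketch below (the evaluation grid is an illustrative assumption) tabulates their values on a small range of inputs:

```python
import tensorflow as tf

x = tf.linspace(-4.0, 4.0, 9)  # illustrative evaluation grid

# Built-in TensorFlow implementations of the classic curves.
activations = {
    "identity": tf.identity,      # linear: suitable only for regression outputs
    "sigmoid":  tf.math.sigmoid,  # S-shaped, saturates in (0, 1)
    "tanh":     tf.math.tanh,     # S-shaped, zero-centred, saturates in (-1, 1)
    "relu":     tf.nn.relu,       # piecewise linear, non-saturating for x > 0
    "elu":      tf.nn.elu,        # exponential unit, smooth for x < 0
}

for name, fn in activations.items():
    print(f"{name:>8}: {fn(x).numpy().round(2)}")
```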
Gradient flows are also analyzed. The paper examines how the mathematical properties of AF derivatives give rise to the vanishing and exploding gradient problems. It demonstrates how the switch from saturating functions (sigmoid) to ReLU-like functions accelerated the training of deep networks, and how functions such as GELU have become an industry standard for modern transformer architectures (BERT, GPT).
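The derivative behaviour behind these effects can be inspected directly with automatic differentiation. In this minimal sketch (the sample points are illustrative assumptions), the sigmoid derivative is nearly zero at large |x|, while the ReLU derivative stays at 1 for positive inputs:

```python
import tensorflow as tf

x = tf.constant([-6.0, -2.0, 0.0, 2.0, 6.0])

def elementwise_grad(fn, x):
    # For an elementwise fn, the gradient of sum(fn(x)) w.r.t. x
    # equals the pointwise derivative fn'(x).
    with tf.GradientTape() as tape:
        tape.watch(x)
        y = fn(x)
    return tape.gradient(y, x)

print("sigmoid':", elementwise_grad(tf.math.sigmoid, x).numpy())  # ~0 at |x| = 6
print("relu'   :", elementwise_grad(tf.nn.relu, x).numpy())       # 1 for x > 0
print("gelu'   :", elementwise_grad(tf.nn.gelu, x).numpy())       # smooth, ReLU-like
```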
The study reveals the potential of modern solutions such as Swish and Mish. These functions, discovered through automated search and systematic analysis, propagate information more efficiently through ultra-deep networks thanks to their smoothness and non-monotonicity. The section on specialized functions discusses probabilistic normalization tools such as Softmax and its sparse alternatives (Sparsemax, Entmax).
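Both functions have short closed forms. The sketch below is an illustration, not the paper's code: Swish is available in TensorFlow as tf.nn.silu, while Mish is written out explicitly here because it may not be available as a built-in in every TensorFlow version:

```python
import tensorflow as tf

def swish(x):
    # Swish (SiLU): x * sigmoid(x); tf.nn.silu is the built-in equivalent.
    return x * tf.math.sigmoid(x)

def mish(x):
    # Mish: x * tanh(softplus(x)); smooth and non-monotonic near zero.
    return x * tf.math.tanh(tf.math.softplus(x))

x = tf.linspace(-3.0, 3.0, 7)
print("swish:", swish(x).numpy().round(3))
print("silu :", tf.nn.silu(x).numpy().round(3))  # matches swish above
print("mish :", mish(x).numpy().round(3))
```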
The practical significance of the work is supported by implementation examples in the TensorFlow environment. The article demonstrates how automatic differentiation mechanisms (GradientTape) and visualization tools (TensorBoard) allow the researcher to effectively monitor activation dynamics and training stability. The key findings of the paper highlight that the choice of activation function is not simply a technical parameter, but a strategic architectural decision that directly impacts the accuracy, convergence rate, and generalization ability of artificial intelligence.
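As a minimal sketch of that monitoring workflow (the model, data, and the ./logs directory are illustrative assumptions), activation and gradient distributions can be logged with tf.summary.histogram and then viewed in TensorBoard:

```python
import tensorflow as tf

# Hypothetical log directory; inspect with: tensorboard --logdir ./logs
writer = tf.summary.create_file_writer("./logs")

layer = tf.keras.layers.Dense(32, activation="relu")
x = tf.random.normal((64, 10))  # illustrative batch

with tf.GradientTape() as tape:
    activations = layer(x)
    loss = tf.reduce_mean(tf.square(activations))

grads = tape.gradient(loss, layer.trainable_variables)

with writer.as_default():
    # Histograms of activations and gradients reveal saturation
    # (values piling up at zero) and vanishing or exploding gradients.
    tf.summary.histogram("activations", activations, step=0)
    tf.summary.histogram("kernel_grad", grads[0], step=0)
```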
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.