Robust Training of Artificial Neural Networks via p-Quasinorms
- Data used for machine learning often contain errors. In this thesis, p-quasinorms (p < 1) are employed as loss functions to make training algorithms for artificial neural networks more robust to such errors. Numerical issues arising from these loss functions are addressed via enhanced optimization algorithms (proximal point methods and Frank-Wolfe methods) based on the (non-monotone) Armijo rule. Numerical experiments comprising 1100 test problems confirm the effectiveness of the approach. Depending on the parametrization, an average reduction of the absolute residuals of up to 64.6% is achieved (aggregated over 100 test problems).
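To illustrate why a p-quasinorm loss with p < 1 can be less sensitive to erroneous data than a squared loss, the following is a minimal sketch, not the thesis's implementation. It assumes the loss takes the form of the sum of |r_i|^p over the residuals r_i (the p-th power of the quasinorm); the function names, the choice p = 0.5, and the sample residuals are hypothetical and chosen only for illustration.

```python
def p_quasinorm_loss(residuals, p=0.5):
    """Assumed loss form: sum of |r_i|**p for 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must lie in the open interval (0, 1)")
    return sum(abs(r) ** p for r in residuals)


def squared_loss(residuals):
    """Classical least-squares loss, used here as a baseline."""
    return sum(r * r for r in residuals)


def outlier_share(loss_terms):
    """Fraction of the total loss contributed by the last term."""
    return loss_terms[-1] / sum(loss_terms)


# Hypothetical residuals: three small ones and one gross outlier.
residuals = [0.1, -0.2, 0.1, 10.0]
p = 0.5

squared_terms = [r * r for r in residuals]
quasinorm_terms = [abs(r) ** p for r in residuals]

share_squared = outlier_share(squared_terms)
share_quasinorm = outlier_share(quasinorm_terms)

print(f"outlier share, squared loss:     {share_squared:.4f}")
print(f"outlier share, p-quasinorm loss: {share_quasinorm:.4f}")
```

In this toy example the single outlier accounts for almost the entire squared loss but for a noticeably smaller share of the p-quasinorm loss, so it pulls less strongly on the fit. The price is that |r|^p is non-convex and not differentiable at r = 0, which is one plausible source of the numerical issues the abstract mentions.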