TRAINREC is a system for training feedforward and recurrent neural networks that incorporates several ideas. It uses the conjugate-gradient method, which is demonstrably more efficient than traditional backward error propagation. We assume epoch-based training and derive a new error function with several desirable properties absent from the traditional sum-of-squared-error function. We argue for skip (shortcut) connections where appropriate and for a sigmoidal function yielding values over the [-1,1] interval. The input feature space is often over-analyzed, but singular value decomposition can condition the input patterns for better learning, often with a reduced number of input units. Recurrent networks, in their most general form, require special handling: they cannot be obtained simply by re-wiring the architecture without a corresponding revision of the derivative calculations. A careful balance is required among the network architecture (specifically, hidden and feedback units), the amount of training applied, and the ability of the network to generalize. These issues often hinge on selecting the proper stopping criterion. Discovering methods that work in theory as well as in practice is difficult, and we have spent substantial effort evaluating and testing these ideas on real problems to determine their value. This paper encapsulates a number of such ideas, ranging from those motivated by a desire for efficiency of training to those motivated by correctness and accuracy of the result. While this paper is intended to be self-contained, several references are provided to other work upon which many of our claims are based.
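Two of the architectural choices named in the abstract, skip (shortcut) connections and a sigmoidal function with values in [-1,1], can be illustrated with a minimal forward-pass sketch. This is not TRAINREC's code: the function name `forward`, the weight layout, and the omission of bias terms are assumptions made for illustration, and `tanh` stands in as one common symmetric sigmoid because the abstract does not state which function the report uses.

```python
import math


def forward(x, w_hidden, w_out_hidden, w_out_skip):
    """Forward pass for one input pattern (hypothetical layout, no biases).

    w_hidden[j][i]      weight from input i to hidden unit j
    w_out_hidden[k][j]  weight from hidden unit j to output unit k
    w_out_skip[k][i]    skip weight from input i directly to output unit k
    """
    # Hidden activations: tanh squashes each weighted sum into (-1, 1).
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
              for row in w_hidden]

    outputs = []
    for row_h, row_s in zip(w_out_hidden, w_out_skip):
        # Contribution that passes through the hidden layer.
        net = sum(w * h for w, h in zip(row_h, hidden))
        # Skip connections: the inputs also feed the output unit directly.
        net += sum(w * xi for w, xi in zip(row_s, x))
        outputs.append(math.tanh(net))
    return outputs


# Illustrative values only; they are not taken from the report.
x = [0.5, -1.0]
w_hidden = [[0.1, -0.2], [0.4, 0.3]]
w_out_hidden = [[0.7, -0.5]]
w_out_skip = [[0.2, 0.1]]
y = forward(x, w_hidden, w_out_hidden, w_out_skip)
```

With all hidden-layer weights set to zero, the output reduces to the symmetric sigmoid of the skip-connection sum alone, which shows that the shortcut path gives the network a direct input-to-output mapping alongside the hidden-layer path.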
Kalman, Barry L. and Kwasny, Stan C., "High-Performance Training of Feedforward & Simple Recurrent Networks" Report Number: WUCS-94-29 (1994). All Computer Science and Engineering Research.
Permanent URL: http://dx.doi.org/10.7936/K7KH0KJW