Can PAC Learning Algorithms Tolerate Random Attribute Noise?

Sally A. Goldman, Washington University in St. Louis
Robert H. Sloan, University of Illinois at Chicago


This paper studies the robustness of PAC learning algorithms when the instance space is {0,1}^n and the examples are corrupted by purely random noise affecting only the instances (and not the labels). Previous results on this subject conflict: the "best agreement" rule can tolerate only small amounts of noise, yet in some cases large amounts of noise can be tolerated. We show that the truth lies somewhere between these two alternatives. For uniform attribute noise, in which each attribute is flipped independently at random with the same probability, we present an algorithm that PAC learns monomials for any (unknown) noise rate less than 1/2. In contrast to this positive result, we show that product random attribute noise, in which each attribute i is flipped randomly and independently with its own probability p_i, is nearly as harmful as malicious noise: no algorithm can tolerate more than a very small amount of such noise.
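
To make the two noise models concrete, the following is a minimal Python sketch of how an instance in {0,1}^n might be corrupted under each one. The function names, the `rates` parameter, and the use of a seeded `random.Random` are illustrative assumptions and do not come from the paper; only the instance is altered, and the label is left untouched, as in the setting described above.

```python
import random
from typing import List, Sequence


def uniform_attribute_noise(x: Sequence[int], p: float,
                            rng: random.Random) -> List[int]:
    """Flip each bit of x independently with the same probability p.

    Hypothetical helper: models uniform attribute noise, where a single
    noise rate applies to every attribute.
    """
    return [bit ^ int(rng.random() < p) for bit in x]


def product_attribute_noise(x: Sequence[int], rates: Sequence[float],
                            rng: random.Random) -> List[int]:
    """Flip bit i of x independently with its own probability rates[i].

    Hypothetical helper: models product attribute noise, where each
    attribute i has its own noise rate p_i.
    """
    if len(x) != len(rates):
        raise ValueError("x and rates must have the same length")
    return [bit ^ int(rng.random() < p_i) for bit, p_i in zip(x, rates)]


if __name__ == "__main__":
    rng = random.Random(0)          # seeded only for reproducibility
    instance = [1, 0, 1, 1, 0]      # an example instance with n = 5
    print(uniform_attribute_noise(instance, 0.2, rng))
    print(product_attribute_noise(instance, [0.0, 0.1, 0.4, 0.05, 0.3], rng))
```

In this sketch, uniform noise is the special case of product noise in which every entry of `rates` equals the same value p; the negative result in the abstract concerns the general case, where the rates may differ across attributes.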