TY - JOUR

T1 - Computational capabilities of restricted two-layered perceptrons

AU - Priel, Avner

AU - Blatt, Marcelo

AU - Grossmann, Tal

AU - Domany, Eytan

AU - Kanter, Ido

PY - 1994

Y1 - 1994

N2 - We study the extent to which fixing the second-layer weights reduces the capacity and generalization ability of a two-layer perceptron. Architectures with N inputs, K hidden units, and a single output are considered, with both overlapping and nonoverlapping receptive fields. We obtain from simulations one measure of the strength of a network: its critical capacity, αc. Using the ansatz τmed ∝ (αc - α)^(-2) to describe the manner in which the median learning time diverges as αc is approached, we estimate αc in a manner that does not depend on arbitrary impatience parameters. The CHIR learning algorithm is used in our simulations. For K=3 and overlapping receptive fields we show that the general machine is equivalent to the committee machine with the same architecture. For K=5 and the same connectivity the general machine is the union of four distinct networks with fixed second-layer weights, of which the committee machine is the one with the highest αc. Since the capacity of the union of a finite set of machines equals that of the strongest constituent, the capacity of the general machine with K=5 equals that of the committee machine. We were not able to prove this for general K, but believe that it does hold. We investigated the internal representations used by different machines, and found that high correlations between the hidden units and the output reduce the capacity. Finally, we studied the Boolean functions that can be realized by networks with fixed second-layer weights. We discovered that two different machines implement two completely distinct sets of Boolean functions.

UR - http://www.scopus.com/inward/record.url?scp=0000884439&partnerID=8YFLogxK

U2 - 10.1103/physreve.50.577

DO - 10.1103/physreve.50.577

M3 - Article

AN - SCOPUS:0000884439

SN - 1063-651X

VL - 50

SP - 577

EP - 595

JO - Physical Review E

JF - Physical Review E

IS - 1

ER -