TY - JOUR
T1 - Computational capabilities of restricted two-layered perceptrons
AU - Priel, Avner
AU - Blatt, Marcelo
AU - Grossman, Tal
AU - Domany, Eytan
AU - Kanter, Ido
PY - 1994
Y1 - 1994
N2 - We study the extent to which fixing the second-layer weights reduces the capacity and generalization ability of a two-layer perceptron. Architectures with N inputs, K hidden units, and a single output are considered, with both overlapping and nonoverlapping receptive fields. We obtain from simulations one measure of the strength of a network: its critical capacity, α_c. Using the ansatz τ_med ∝ (α_c - α)^(-2) to describe the manner in which the median learning time diverges as α_c is approached, we estimate α_c in a manner that does not depend on arbitrary impatience parameters. The CHIR learning algorithm is used in our simulations. For K=3 and overlapping receptive fields we show that the general machine is equivalent to the committee machine with the same architecture. For K=5 and the same connectivity the general machine is the union of four distinct networks with fixed second-layer weights, of which the committee machine is the one with the highest α_c. Since the capacity of the union of a finite set of machines equals that of its strongest constituent, the capacity of the general machine with K=5 equals that of the committee machine. We were not able to prove this for general K, but we believe that it does hold. We investigated the internal representations used by different machines and found that high correlations between the hidden units and the output reduce the capacity. Finally, we studied the Boolean functions that can be realized by networks with fixed second-layer weights. We discovered that two different machines implement two completely distinct sets of Boolean functions.
UR - http://www.scopus.com/inward/record.url?scp=0000884439&partnerID=8YFLogxK
U2 - 10.1103/PhysRevE.50.577
DO - 10.1103/PhysRevE.50.577
M3 - Article
AN - SCOPUS:0000884439
SN - 1063-651X
VL - 50
SP - 577
EP - 595
JO - Physical Review E
JF - Physical Review E
IS - 1
ER -