FREE ELECTRONIC LIBRARY - Online materials, documents

Pages:   || 2 |

«Herbert Wiklicky Centrum voor Wiskunde en Informatica P.O.Box 4079, NL-1009 AB Amsterdam, The Netherlands· e-mail: herbert Abstract We prove ...»

-- [ Page 1 ] --

On the Non-Existence of a Universal Learning

Algorithm for Recurrent Neural Networks

Herbert Wiklicky

Centrum voor Wiskunde en Informatica

P.O.Box 4079, NL-1009 AB Amsterdam, The Netherlands·

e-mail: herbert@cwi.nl


We prove that the so called "loading problem" for (recurrent) neural networks is unsolvable. This extends several results which already demonstrated that training and related design problems for neural networks are

(at least) NP-complete. Our result also implies that it is impossible to find or to formulate a universal training algorithm, which for any neural network architecture could determine a correct set of weights. For the simple proof of this, we will just show that the loading problem is equivalent to "Hilbert's tenth problem" which is known to be unsolvable.


It seems that there are relatively few commonly accepted general formal definitions of the notion of a "neural network". Although our results also hold if based on other formal definitions we will try to stay here very close to the original setting in which Judd's NP completeness result was given [Judd, 1990]. But in contrast to [Judd, 1990] we will deal here with simple recurrent networks instead of feed forward architectures.

Our networks are constructed from three different types of units:.E-units compute just the sum of all incoming signals; for II -units the activation (node) function is given by the product of the incoming signals; and with E)-units - depending if the input signal is smaller or larger than a certain threshold parameter fl - the output is zero or one. Our units are connected or linked by real weighted connections and operate synchronously.

Note that we could base our construction also just on one general type of units, namely what usually is called.EII -units. Furthermore, one could replace the II -units in the below 432 Wiklicky construction by (recurrent) modules of simple linear threshold units which had to perform unary integer multiplication. Thus, no higher order elements are actually needed.

As we deal with recurrent networks, the behavior of a network now is not just given by a simple mapping from input space to output space (as with feed forward architectures). In geneml, an input pattern now is mapped to an (infinite) output sequence. But note, that if we consider as the output of a recurrent network a certain final, stable output pattern, we could return to a more static setting.

2 THE MAIN RESULT The question we will look at is how difficult it is to construct or train a neural network of the described type so that it actually exhibits a certain desired behavior, i.e. solves a given

learning task. We will investigate this by the following decision problem:

Decision 1 Loading Problem INSTANCE: A neural network architecture N and a learning task T.

QUESTION: Is there a configuration C for N such that T is realized by C?

By a network configuration we just think of a certain setting of the weights in a neural network. Our main result concerning this problem now just states that it is undecidable or unsolvable.

–  –  –

The decision problem (as usual) gives a "lower bound" on the hardness of the related constructive problem [Garey and Johnson, 1979]. If we could construct a correct configuration for all instances, it would be trivial to decide instantly if a correct configuration exists at

all. Thus we have:

Corollary 2 There exists no universal learning algorithm for (recurrent) neural networks.

3 THE PROOF The proof of the above theorem is by constructing a class of neural networks for which it is impossible to decide (for all instance) if a certain learning task can be satisfied. We will refer for this to "Hilbert's tenth problem" and show that for each of its instances we can construct a neuml network, so that solutions to the loading problem would lead to solutions to the original problem (and vice versa). But as we know that Hilbert's tenth problem is unsolvable we also have to conclude that the loading problem we consider is unsolvable.

–  –  –

Our reference problem - of which we know it is unsolvable - is closely related to several famous and classical mathematical problems including for example Fermat's last theorem.

On the Non-Existence of a Universal Learning Algorithm for Recurrent Neural Networks 433

–  –  –

with each term d i of the form di( 3:1,.1:2,...,.1: rt ) = r.i. J: i •. J: iz..... J : im, where the indices {i I, £2,..., ; rrt} are taken from {I, 2,..., 11 } and the coefficient r.i E Z.

The concrete problem, first formulated in [Hilbert, 1900] is to develop a universal algorithm how to find the integer solutions for all D, i.e. a vector (3: J,.1:2,...,3:,1) with.1: i E Z (or IN), such that D( 3: 1,3:2,...,.1: rt) = O. The corresponding decision problem therefore is the


Decision 2 Hilbert's Tenth Problem INSTANCE: Given a diophantine equation D.

QUESTION: Is there an integer solutionfor D?

Although this problem might seem to be quite simple - it formulation is actually the shortest among D. Hilbert's famous 23 problems - it was not until 1970 when Y. Matijasevich could prove that it is unsolvable or undecidable [Matijasevich, 1970]. There is no recursive computable predicate for diophantine equations which holds if a solution in Z (or N) exists and fails otherwise [Davis, 1973, Theorem 7.4].


The construction of a neural network IV for each diophantine D is now straight forward (see FigJ). It is just a three step construction.

First, each variable.1: i of D is represented in IV by a small sub-network. The structure of these modules is quite simple (left side of Fig.1). Note that only the self-recurrent connection for the unit at the bottom of these modules is "weighted" by 0.0 'II! 1.0.

All other connection transmit their signals unaltered (i.e. w = 1.0).

Second, the terms di in D are represent by Il-units in IV (as show in Fig.1). Therefore, the connections to these units from the sub-modules representing the variables.1: i of D correspond to the occurrences of these variables in each term d i.

Finally, the output signals of all these Il-units is multiplied by the corresponding coefficients C:i and summed up by the ~-unit at the top.


The fundamental property of the networks constructed in the above way is given by the simple fact that the behavior of such a neural network IV corresponds uniquely to the evaluation of the original diophantine D.

First, note that the behavior of N only depends on the weights Wi in each of the variable modules. Therefore, we will take a closer look at the behavior of these sub-modules.

Suppose, that at some initial moment a signal of value 1.0 is received by each variable module. After that the signal is reset again to 0.0.

434 Wiklicky Wi.

The "seed" signal starts circling via With each update circle this signal becomes a little bit smaller. On the other hand, the same signal is also sent to the central 8-unit, which sends a signal 1.0 to the top accumulator unit as long as the "circling" activation of the bottom unit is larger then the (preset) threshold 0,. The top unit (which also keeps track of its former activiations via a recurrent connection) therefore just counts how many updates it takes before the activiation of the bottom unit drops below 0,.

The final, maximum, value which is emitted by the accumulator unit is some integer.1:, for

which we have:

–  –  –

To conclude the proof, we now have to demonstrate the equivalence of Hilbert's tenth problem and the loading problem for the discussed class of recurrent networks and some learning task.

The learning task we will consider is the following: Map an input pattern with all signals equal to 1.0 (presented only once) to an output sequence which after afinite number of steps On the Non-Existence of a Universal Learning Algorithm for Recurrent Neural Networks 435 is constant equal to 0.0. Note that - as discussed above - we could also consider a more static learing task where a final state, which detennines the (single) output of the network, was detennined by the condition that the outgoing signals of all 8-units had to be zero.

Considering this learing task and with what we said about the behavior of the sub-modules it is now trivial to see that the constructed network just evaluates the diophantine polynomial for a set of variables ;r i corresponding to the (final) output signals of the sub-modules (which are detennined uniquely by the weight values !lii) if the input to the network is a pattern of all 1.0s.

If we had a solution.1.' i of the original diophantine equation D, and if we take the corresponding values Wi (according to the above relation) as weights in the sub-modules of N, then this would also solve the loading problem for this architecture. On the other hand, if we knew the correct weights Wi for any such network N, then the corresponding integers 3:i would also solve the corresponding diophantine equation D.

In particular, if it would be possible to decide if a correct set of weights Wi for N exists (for the above learning task), we could also decide if the corresponding diophantine D had a solution 3: i E :IN (and vice versa). As the whole construction was trivial, we have shown that both problems are equivalent.


We demonstrated that the loading problem not only is NP-complete - as shown for simple feed fOIward architectures in [Judd, 1990], [Lin and Vitter, 1991], [Blum and Rivest, 1992], etc. - but actually unSOlvable, i.e. that the training of (recurrent) neural networks is among those problems which "indeed are intractable in an especially strong sense" [Garey and Johnson, 1979, P 12]. A related non-existence result concerning the training of higher order neural networks with integer weights was shown in [Wiklicky, 1992, WIklicky, 1994].

One should stress once again that the fact that no general algorithm exists for higher order or recurrent networks, which could solve the loading problem (for all its instances), does not imply that all instances of this problem are unsolvable or that no solutions exist. One could hope, that in most relevant cases - whatever that could mean - or, when we restrict the problem, a sub-class of problems things might become tractable. But the difference between solvable and unsolvable problems often can be very small.

In particular, it is known that the problem of solving linear diophantine equations (instead of general ones) is polynomially computable, while if we go to quadratic diophantine equations the problem already becomes;V P complete [Johnson, 1990]. And for general diophantine the problem is even unsolvable. Moreover, it is also known that this problem is unsolvable if we consider only diophantine equations of maximum degree 4, and there exists a universal diophantine with only 13 variables which is unsolvable [Davis et al., 1976].

But we think, that one should interpret the "negative" results on NP-complexity as well as on undecidability of the loading problem not as restrictions for neural networks, but as related to their computational power. As it was shown that concrete neural networks can be constructed, so that they simulate a universal Turing machine [Siegelmann and Sontag, 1992, Cosnard et al., 1993]. It is mere the complexity of the problem one attempts to solve which simply cannot disappear and not some intrinsic intractability of the neural network approach.

436 Wiklicky Acknowledgement This work was started during the author's affiliation with the "Austrian Research Institute for Artificial Intelligence", Schottengasse 3, A-101O Wien, Austria. Further work was supported by a grant from the Austrian "Fonds zur Forderung der wissenschaftlichen Forschung" as Projekt J0828-PHY.

References [Blum and Rivest, 1992] Avrim L. Blum and Ronald L. Rivest. Training a 3-node neural network is NP-complete. Neural Networks, 5:117-127,1992.

[Cosnard et al., 1993] Michael Cosnard, Max Garzon, and Pascal Koiran. Computability properties of low-dimensional dynamical systems. In Symposium on Theoretical Aspects of Computer Science (STACS '93), pages 365-373, Springer-Verlag, BerlinNew York, 1993.

[Davis, 1973] Martin Davis. Hilbert's tenth problem is unsolvable. Amer. Math. Monthly, 80:233-269, March 1973.

[Davis et aI., 1976] Martin Davis, Yuri Matijasevich, and Julia Robinson. Hilbert's tenth problem - diophantine equations: Positive aspects of a negative solution. In Felix E.

Browder, editor, Mathematical developments arising from Hilbert, pages 323-378, American Mathematical Society, 1976.

[Garey and Johnson, 1979] Michael R. Garey and David S. Johnson. Computers and Intractability -A Guide to the Theory of NP-Complete ness. W. H. Freeman, New York, 1979.

[Hilbert, 1900] David Hilbert. Mathematische Probleme. Nachr. Ges. Wiss. G6ttingen, math.-phys.Kl., :253-297, 1900.

[Johnson, 1990] David S. Johnson. A catalog of complexity classes. In Handbook of Theoretical Computer Science (Volume A: Algorithms and Complexity), chapter 2, pages 67-161, Elsevier - MIT Press, Amsterdam - Cambridge, Massachusetts, 1990.

[Judd, 1990] J. Stephen Judd. Neural Network Design and the Complexity of Learning.

MIT Press, Cambridge, Massachusetts - London, England, 1990.

[Lin and Vitter, 1991] Jyh-Han Lin and Jeffrey Scott Vitter. Complexity results on learning by neural networks. Machine Learning, 6:211-230,1991.

[Matijasevich, 1970] Yuri Matijasevich. Enumerable sets are diophantine. Dokl. Acad.

Nauk., 191:279-282, 1970.

[Siegelmann and Sontag, 1992] Hava T. Siegelmann and Eduardo D. Sontag. On the computational power of neural nets. In Fifth Workshop on Computational Learning Theory (COLT 92), pages 440-449, 1992.

[Wiklicky, 1992] Herbert Wiklicky. SyntheSis and Analysis of Neural Networks - On a Framework for Artificial Neural Networks. PhD thesis, University of Vienna Technical University of Vienna, September 1992.

[WIklicky, 1994] Herbert Wiklicky. The neural network loading problem is undecidable.

Pages:   || 2 |

Similar works:

«Death by Committee? An Analysis of Delegation in Corporate Boards ∗ Ren´ e B. Adams† e Vanitha Ragunathan Robert Tumarkin UNSW Australia and ECGI University of Queensland UNSW Australia December 24, 2016 Abstract Did the corporate governance reforms of the early 2000s have unintended consequences? While readily observable board characteristics have not changed much over time, boards have increasingly delegated responsibilities to sub-committees staffed entirely by independent directors....»

«Andrés Portilla Director, Regulatory Affairs Department September 30, 2013 Mr. Svein Andresen Secretary General Financial Stability Board Bank for International Settlements Centralbahnplatz 2 CH-4002 Basel Switzerland Submit via email to: fsb@bis.org RE: IIF Response to FSB’s Principles for Effective Risk Appetite Framework Dear Mr. Andresen: The Institute appreciates the opportunity to comment on the proposed FSB Principles (“Principles”). Since 2008, the Institute has devoted...»

«University and College Union University of Bradford Local Association President: Andy Scally Minutes of the Annual General Meeting held in room D01.27, Horton Building on Wednesday, 13th June 2007 Chair: The President 1. Apologies for Absence. Were received from C Alder, E Clement, M Dailey, J Fletcher, J Goddard, M Haley, S Harrop, H Marshall, B Ochieng, A Persaud, J Porter, J Radice, D Regas, I Robshaw, E Simpson, D Twigger, J Whitelock.2. Minutes of the AGM held on 14th June 2006. The...»

«Why Basel III matters for Latin American and Caribbean financial markets Jaime Caruana General Manager, Bank for International Settlements ASBA-FSI High-Level Meeting on “The emerging framework to strengthen financial stability and regulatory priorities in the Americas” Antigua, Guatemala, Friday 19 November 2010 It is a great pleasure to address this high-level meeting. I would like to thank our co-hosts, the Association of Banks of Supervisors of the Americas and the Financial Stability...»

«Procuring clean and efficient road vehicles Clean Fleets Guide Date of Publication: November 2014 Source: Viorel Sima, Dreamstime Authors: Simon Clement, Natalie Evans (ICLEI – Local Governments for Sustainability) Contributions & acknowledgements: Clean Fleets project partners (TTR, City of Stockholm, City of Bremen, Transport for London, City of Rotterdam, City of Palencia, TÜV Nord, URTP, Zagreb Holding, VAG Freiburg, ISIS, City of Sofia), Giles Liddell (Bristol City Council), Orlando...»

«VIRTUAL PRESENTERS: TOWARDS INTERACTIVE VIRTUAL PRESENTATIONS Anton Nijholt Human Media Interaction Group University of Twente Enschede, the Netherlands anijholt@cs.utwente.nl Abstract – We discuss having virtual presenters in virtual environments that present information to visitors of these environments. Some current research is surveyed and we will look in particular to our research in the context of a virtual meeting room where a virtual presenter uses speech, gestures, pointing and...»

«The Prosperous Tradesman’s Easy Online Marketing Guide *** A winning formula for successfully marketing your contracting or service business online even if you have no idea where to start Please Share This Guide With Fellow Tradesmen. The More Who Get It The Better! The Prosperous Tradesman Marketing Guide | Larry G. Maguire Contents Contents Who Is Larry G. Maguire Making A Commitment Introduction Some Assumptions Before We Start Here’s How This Guide Breaks Down Why You Need To Build Your...»

«1 Curriculum Vitae Sydney Anne Rice, MD Office Address: University of Arizona Department of Pediatrics Division of Pediatric Neurology 1501 N. Campbell Tucson, AZ 85724 520-626-6615 E-mail: srice@peds.arizona.edu CHRONOLOGY OF EDUCATION Education: 1982-1986 Stanford University, BS 1987-1991 University of Arizona, MD University of Virginia, MS 1-2003 Post-Graduate Training: July 1991-June 1994 Resident, Pediatrics, University of Arizona, Tucson, AZ July 1994-June 1995 Chief Resident, Pediatrics,...»

«Celebrating over 27 Years of Services and Solutions October 2016 BREAKING NEWS In This Issue Job Posting: Director, Somerville Senior • Job Posting: Director, Citizen Housing Somerville Senior Citizen Housing Location: Somerville Senior Citizen Housing, Somerville, NJ • RHIIP Listserv #363 – 2017 Title: Director, Somerville Senior Citizen Housing. Operating Cost Adjustment Factors (OCAFs) Published Direct Reports: • SAVE TPS Automatic Extension • Director, Social Services Issue...»

«Customer’s Responses to Crowded Restaurant Environment: Cross Cultural Differences between American and Chinese Dae-Young Kim, Ph. D.* Department of Food and Hospitality Systems University of Missouri-Columbia, Missouri, USA kimdae@missouri.edu Sangwon Park Department of Food and Hospitality Systems University of Missouri-Columbia, Missouri, USA Submitted to the Journal of Hospitality & Leisure Marketing for the special issue on “Marketing in Pan-Asia” *For correspondence, please contact...»

«Digital Relationship Management Digital Commerce Everywhere. Elastic Commerce Online selling of digital content such as software, e-books and periodicals, games and information services is very different from traditional retail ecommerce. It should be frictionless: The sale should happen in the context of life. Ordering the sequel as one finishes an e-book, additional resources in the context of a game, an upgrade or subscription renewal while using an application, or access to related...»

«TH E LIF E O F T H E PROPHET (P B U H) SARWAT SAULAT Source: www.prophetmuhammadforall.org 1 CONTENTS Introduction Chapter 1 – The World on the Eve of the Advent of the Last Prophet Chapter 2 – The Birth of the Prophet Muhammad (PBUH) and His Early Life Chapter 3 – Beginning of the Prophetic Mission Chapter 4 – Public Preaching and Persecution of Muslims Chapter 5 – Social Boycott, Journey to Taif and the Year of Sorrow Chapter 6 – Hopeful Signs from Madinah Chapter 7The Hijrah...»

<<  HOME   |    CONTACTS
2017 www.thesis.dislib.info - Online materials, documents

Materials of this site are available for review, all rights belong to their respective owners.
If you do not agree with the fact that your material is placed on this site, please, email us, we will within 1-2 business days delete him.