This is the talk page for discussing improvements to the Multilayer perceptron article. This is not a forum for general discussion of the article's subject.
This article is rated Start-class on Wikipedia's content assessment scale.
From the Applications section: "They are useful in research in terms of their ability to solve problems stochastically, which often allows one to get approximate solutions for extremely complex problems like fitness approximation." MLPs do not have any stochastic processes. In other words, there isn't a random element to an MLP. This is either an error or someone intended something else. -- Joseagonzalez ( talk) 04:21, 9 August 2010 (UTC)
XOR can easily be represented by a multilayer perceptron with a linear activation function.
It is just (X1 OR X2) AND NOT (X1 AND X2). All of these can easily be represented by perceptrons, and putting them together simply requires more layers. What is this nonsense about non-linear activation functions being required to make them any different? The difference is that you can compose multiple functions by having each input go to more than one middle node.—Preceding unsigned comment added by 96.32.175.231 ( talk • contribs)
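For illustration, here is a minimal sketch of the decomposition described above, built from individual perceptron-style threshold (step) units wired into two layers. This is not from the article; the weights and biases are just one arbitrary choice that happens to work.

```python
# Sketch: XOR as (x1 OR x2) AND NOT (x1 AND x2), composed from perceptron units
# across two layers. Weights/biases are illustrative, not from the article.

def step(z):
    """Heaviside step activation: fires (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0

def unit(x1, x2, w1, w2, bias):
    """A single perceptron unit: weighted sum plus bias, then step."""
    return step(w1 * x1 + w2 * x2 + bias)

def xor(x1, x2):
    h_or = unit(x1, x2, 1, 1, -1)     # hidden unit: x1 OR x2
    h_and = unit(x1, x2, 1, 1, -2)    # hidden unit: x1 AND x2
    return unit(h_or, h_and, 1, -1, -1)  # output unit: h_or AND NOT h_and

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor(a, b))  # prints 0 0 0 / 0 1 1 / 1 0 1 / 1 1 0
```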
Agree with you. Matlab's nntool Network Manager confirms this through the possibility of training a multilayer network with purelin (a linear activation function). Dzid ( talk) 15:56, 13 June 2010 (UTC)
See this ref 1, where they describe using MLPs in marine energy conversion work. -- Billymac00 ( talk) 03:13, 2 July 2011 (UTC)
There is a discussion on this talk page (Only difference is non-linear activation... what?!) which demonstrates that an activation function is a linear combination of the input nodes, **not an on-off mechanism**. The on-off mechanism is only used on the final output node, not the hidden layer nodes.
As described in the other discussion, an on-off mechanism (in the hidden layer) is an example of a nonlinear activation. However, since the binary function is not differentiable, a neural net using the binary function for activation can't be trained with back-propagation.
These are very important distinctions, and obviously a source of confusion (as demonstrated by the referenced discussion). Unfortunately, I don't have time to fix it at the moment. — Preceding unsigned comment added by Qiemem ( talk • contribs) 02:47, 19 February 2013 (UTC)
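To make the differentiability point above concrete, here is a small sketch (my own illustration, not from the article) contrasting a step ("on-off") activation, whose derivative is zero almost everywhere, with a sigmoid, whose nonzero derivative lets an error signal flow backwards during training.

```python
# Sketch contrasting a step activation with a differentiable sigmoid.
# Names and sample values are illustrative, not from the article.
import math

def step(z):
    return 1.0 if z >= 0 else 0.0           # derivative is 0 everywhere except at z = 0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)                     # nonzero gradient -> usable for backpropagation

for z in (-2.0, 0.5, 3.0):
    print(z, step(z), sigmoid(z), sigmoid_grad(z))
```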
MLP utilizes a supervised learning technique called backpropagation for training the network.[1][2]
The source [1] was written in the 1960s, whereas backpropagation was published in the context of AI by Rumelhart et al. (PDP group): Rumelhart, D., and J. McClelland (1986), Parallel Distributed Processing, MIT Press, Cambridge, MA.
Suggested solution: Remove reference [1]. — Preceding unsigned comment added 15:04, 6 May 2014 (UTC)
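For reference, a minimal sketch of what backpropagation training of a one-hidden-layer MLP looks like. This is my own illustration; the layer sizes, learning rate, sigmoid activations, and XOR toy data are assumptions, not taken from the article or the cited sources.

```python
# Minimal backpropagation sketch for a one-hidden-layer MLP (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # toy inputs
y = np.array([[0], [1], [1], [0]], dtype=float)                # XOR targets

W1 = rng.normal(scale=0.5, size=(2, 4)); b1 = np.zeros(4)      # input -> hidden
W2 = rng.normal(scale=0.5, size=(4, 1)); b2 = np.zeros(1)      # hidden -> output
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: propagate the error from the output layer back to the hidden layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # gradient-descent updates
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.round(2))   # after training, typically close to [[0], [1], [1], [0]]
```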
The current picture shows MLP behavior, but there is no image of an MLP itself. -- Bojan PLOJ ( talk) 09:04, 17 April 2020 (UTC)
I believe it is the standard citation; see, for example, the second sentence of the transformer article. Pogogreg101 ( talk) 04:44, 18 August 2023 (UTC)
27 October 2023: I added the citation. — Preceding unsigned comment added by Pogogreg101 ( talk • contribs) 03:51, 28 October 2023 (UTC)
Hi,
There is a mistake in the first paragraph. It says that modern neural networks use a nonlinear activation function while the original perceptron uses the Heaviside step function. The Heaviside step function is definitely nonlinear. The difference is that there is now a hidden layer, not that the original perceptron didn't have nonlinear outputs. It did. I would also say it is not a misnomer to call it a multi-layer perceptron as a result. CireNeikual ( talk) 19:09, 4 January 2024 (UTC)
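A quick check (my own example, not from the article) that the Heaviside step function is indeed nonlinear: a linear function would have to satisfy both homogeneity and additivity, and the step function violates both.

```python
# Check (own example): a linear f satisfies f(a*x) == a*f(x) and f(x + y) == f(x) + f(y).
def heaviside(x):
    return 1.0 if x >= 0 else 0.0

print(heaviside(2 * 1.0), 2 * heaviside(1.0))                    # 1.0 vs 2.0 -> homogeneity fails
print(heaviside(1.0 + 1.0), heaviside(1.0) + heaviside(1.0))     # 1.0 vs 2.0 -> additivity fails
```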