This article is rated Start-class on Wikipedia's content assessment scale. It is of interest to the following WikiProjects:
This page seems unnecessarily complex. I am finding it hard to understand, even knowing the underlying terms. Am I alone in this? Perhaps it needs a few images or figures to complement learning? Dm1911 ( talk) 21:26, 26 May 2015 (UTC)
I will be seriously reworking much of this article to make it more understandable and legible. The priority will be to explain (with examples) TD(0), n-step TD and TD(lambda).
Help / talk appreciated! Dm1911 ( talk) 10:20, 29 May 2015 (UTC)
Hello fellow Wikipedians,
I have just added archive links to one external link on Temporal difference learning. Please take a moment to review my edit. If necessary, add {{cbignore}} after the link to keep me from modifying it. Alternatively, you can add {{nobots|deny=InternetArchiveBot}} to keep me off the page altogether. I made the following changes:
When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).
This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 5 June 2024).
Cheers.— cyberbot II Talk to my owner:Online 14:55, 23 March 2016 (UTC)
Sutton, R. S.; Barto, A. G. (1990). "Time Derivative Models of Pavlovian Reinforcement" (PDF). Learning and Computational Neuroscience: Foundations of Adaptive Networks: 497–537. 156.40.255.18 ( talk) 15:31, 22 September 2023 (UTC)
This article is lacking a (non-mathematical) description of how the algorithm works. 109.49.139.84 ( talk) 13:44, 22 October 2023 (UTC)
As a mergist Wikipedian, I believe we should add some section here about policy iteration. SpiralSource ( talk) 13:22, 18 April 2024 (UTC)