This article is rated C-class on Wikipedia's content assessment scale. It is of interest to the following WikiProjects:
On the topic of LISP..
Not so: a thunk stored in a SASL list allows evaluation to be delayed until the value is actually needed. A common example is the list of all prime numbers, where the elements are represented by a thunk that derives the primes, filling in the contents as the thunk provides them. Repeated accesses of the same list elements then take constant time, because those elements have already been evaluated. The thunk requires access to the list indices (the thunk being implemented as a lambda function).
Actually, a lambda function can be used to create a list of thunks, but the thunks are determined by the lambda function. It's been a while since I've coded LISP, ever since I learned the language; it's great for learning, but I wouldn't code in it. Ask Suzanne Sluizer, she knows.
-- Rofthorax 09:56, 24 August 2005 (UTC)
No No No! Thunking is more general than that! Probably needs a reference to Algol implementations which used thunking... -- 81.79.64.46 11:14, 3 May 2004 (UTC)
For a good discussion, see:
http://compilers.iecc.com/comparch/article/98-03-043
The legend that I heard was that, generically, a "thunk" is a function (or procedure) which takes no arguments, and returns no values, and that it was coined by Donald Knuth in The Art of Computer Programming, who came up with thunk as an anagram of his surname, and as a way to describe a minimal function. I don't have my copy handy to verify. But this seems like a general definition which would apply to invoking the continuation of closures, which usually take much longer than a little thunk should.
According to the Internet's famous jargon file:
File: jargon.info :thunk: /thuhnk/ n. 1. "A piece of coding which provides an address", according to P. Z. Ingerman, who invented thunks in 1961 as a way of binding actual parameters to their formal definitions in Algol-60 procedure calls. If a procedure is called with an expression in the place of a formal parameter, the compiler generates a {thunk} to compute the expression and leave the address of the result in some standard location. 2. Later generalized into: an expression, frozen together with its environment, for later evaluation if and when needed (similar to what in techspeak is called a `closure'). The process of unfreezing these thunks is called `forcing'. 3. A {stubroutine}, in an overlay programming environment, that loads and jumps to the correct overlay. Compare {trampoline}. 4. People and activities scheduled in a thunklike manner. "It occurred to me the other day that I am rather accurately modeled by a thunk --- I frequently need to be forced to completion." --- paraphrased from a {plan file}. Historical note: There are a couple of onomatopoeic myths circulating about the origin of this term. The most common is that it is the sound made by data hitting the stack; another holds that the sound is that of the data hitting an accumulator. Yet another holds that it is the sound of the expression being unfrozen at argument-evaluation time. In fact, according to the inventors, it was coined after they realized (in the wee hours after hours of discussion) that the type of an argument in Algol-60 could be figured out in advance with a little compile-time thought, simplifying the evaluation machinery. In other words, it had `already been thought of'; thus it was christened a `thunk', which is "the past tense of `think' at two in the morning".
As far as my mind can tell (which is quite far, I believe), two meanings of the word "thunk" are used on this page. I feel that each should have a separate page... I dunno why, though :-) -- Ihope127 03:42, 22 August 2005 (UTC)
I say split. Thunks are a mechanism for delayed computation of values. The OOP-related thing and the Microsoft-specific "hack" stuff definitely have nothing to do with that---those sound more like "stubs", which is the term I would use for bits of code that stand in for calling other code. Of course I'm not saying we should get rid of that info, but it could be split into separate articles or perhaps merged into something more relevant. --cos —Preceding unsigned comment added by 129.67.53.183 ( talk) 23:40, 22 October 2009 (UTC)
I agree, there is a need to split. The third point is definitely specific to Windows and should be on another page than the first two generic notions. Should we first agree on that? (And on dialectics also, of course!) —Preceding unsigned comment added by 134.157.168.24 ( talk) 16:47, 26 February 2010 (UTC)
I removed the following and replaced it with a shorter summary based on the first half:
A flat thunk consists of a pair of dlls (one 32bit and one 16bit) that are used to translate calls from 32bit code to 16bit code. To allow the two dlls to communicate, some intermediate code must be used to translate memory addresses (pointers) between platforms. If you have any past experience with 16bit process memory calls, you may recall that they consist of a pointer to a memory segment and the offset into that memory segment. This is different than a 32bit process memory pointer which consists of an absolute address to the memory being accessed. So, the problem, in a nutshell, is translating segment + offset pointers into absolute addresses. VB programmers don't usually need to worry about things like memory pointers, but the problem is that ALL software is ultimately based on memory pointers. It is the IDE and the programming language that hide these ugly details from us but when you get right down to it, every variable, function, sub and etc... that you write (in any language) consists of an address in memory. Now, imagine a 16bit dll being loaded into a 32bit process where none of the memory addresses match up on either side of the function calls. It just plain can't be done without proper translation. By Gaurav Bhaskar Microsft Support v-2gabha@mssupport.microsoft.com
Cammy 19:25, 29 December 2006 (UTC)
I deleted this fragment. I'm putting it here in case anybody wants to expand and properly cite it into an encyclopedic paragraph. -- shadytrees 17:50, 16 July 2007 (UTC)
It does not belong under a heading of "delayed computation", as constantly is a function and hence its argument is evaluated when the "thunk" is created, not when used. 70.111.106.99 ( talk) 21:00, 3 January 2008 (UTC)
The code examples in this article are too obscure to be informative. They don't so much demonstrate the use of thunks as show bits of Algol or some such. It would perhaps be more useful to say that a thunk is in fact a data structure that contains a reference to code as well as an environment in which to execute that code (the environment might be a stack frame or dictionary pointer -- it's a place that somehow holds mappings between variables and values).
For instance:
variable a = 7;
def foo(a);        // NB: two vars called a
    print a + 2;
enddef;
foo(a + 10);
When foo is called in an eager language, the expression (a + 10) is evaluated before the function is called, and the value passed in is 17. foo will print the value of (17 + 2) = 19 (not 7 + 2), as it uses the latest definition of a, which is the one local to the function.
In a lazy language, instead of evaluating (a + 10) before passing it into foo, the code of the expression is wrapped up in a thunk along with the current variable mappings, as [(a + 10), {a -> 7}], and that gets passed in. When foo tries to print the value of (a + 2), the thunk now held in the local variable a is executed using the environment it carries with it, not the one of foo, so we get (17 + 2), which is printed as 19. The point to note is that in principle you shouldn't have the same bit of code executing in environments it doesn't belong to (this relates to lambda calculus and how you can't substitute RHS expressions that use the same variables as the LHS expressions).
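This can be sketched in Python, where a zero-argument lambda plays the role of the thunk and the closure it captures plays the role of the saved environment (the names foo_eager and foo_lazy are invented for the illustration):

```python
a = 7

def foo_eager(x):
    # eager: the caller has already evaluated the argument expression
    return x + 2

def foo_lazy(thunk):
    # lazy: the argument arrives as a thunk; forcing it uses the
    # environment captured when the thunk was created, not foo_lazy's
    return thunk() + 2

print(foo_eager(a + 10))          # 19: (7 + 10) evaluated at the call site
print(foo_lazy(lambda: a + 10))   # 19: the closure carries {a -> 7}
```

Both calls print 19, but in the lazy case the addition happens only when the thunk is forced inside foo_lazy.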
Thunks can help with some infinite loops. Consider the following Haskell code (a lazy language):
-- takes two args and returns the first
first x y = x

-- infinite recursion
infinity = let loop k = loop (k+1) in loop 0

first 5 infinity    -- prints 5
The infinite recursion is never evaluated, as it's never used by the program. It only ever uses the first arg of the function, and the second just stays a thunk and is not expanded. An eager language would try to get a value for infinity before calling first. At this point it may be worth looking into innermost vs outermost reduction.
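The same behaviour can be imitated in a strict language by passing the dangerous argument as an explicit thunk; a minimal Python sketch (the names first and infinity are taken from the Haskell example above):

```python
def first(x, y):
    # uses only its first argument; the second is never forced
    return x

def infinity():
    # a thunk that would loop forever if anyone actually called it
    k = 0
    while True:
        k += 1

# the function object is passed but never called, so this terminates
print(first(5, infinity))   # prints 5
```

Passing infinity (the function) rather than infinity() (its result) is exactly the thunk trick: the computation is named but never performed.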
While thunks can be useful in this way, they have their disadvantages: creating and destroying thunks all the time can make programs slow. Thunks that contain side-effects (e.g. print a value to the screen) may not execute in the order the programmer intended as their contents are only evaluated whenever the function called tries to use their value (if at all).
There are some interesting exceptions in common langs like C and Java where this sort of lazy thing is used (but not using thunks). For instance, if statements only execute one of their branches, which is a bit of forced lazy evaluation.
Just some points that could be used to spruce this article up, or that people can read up on if they're interested in this sort of thing.
--cos
—Preceding unsigned comment added by 129.67.53.183 ( talk) 19:05, 22 October 2009 (UTC)
The Algol 60 thunk is not the same thing as lazy evaluation because the semantics are that it is re-evaluated for each use, not only on first use (the article "lazy evaluation" makes this distinction very early). That is why Jensen's device "works".
Nor is it a "delayed computation" because its value is the value at the point of use not at some earlier time. It is not in general possible to evaluate it any earlier, so it is not delayed. Of course, in a given use, one or both of these things might be true but that is a property of the program, not of the thunk itself.
BTW, if I remember correctly, and to nit pick, addresses (aka "names") are not part of the value domain in Algol 60, in which case, these thunks do not return values.
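The re-evaluated-on-every-use semantics behind Jensen's device can be sketched in Python with a pair of thunks, one to write the by-name index and one to re-evaluate the by-name expression; this is only a rough imitation (algol_sum and the env dictionary are invented for the sketch, since Python has no call-by-name):

```python
def algol_sum(lo, hi, set_i, term):
    # set_i writes the by-name index variable; term re-evaluates the
    # by-name expression against the current environment on each use
    total = 0
    for k in range(lo, hi + 1):
        set_i(k)
        total += term()
    return total

env = {"i": 0}
# sum of i*i for i = 1..4; each call of the term thunk re-reads env["i"]
result = algol_sum(1, 4,
                   lambda k: env.__setitem__("i", k),
                   lambda: env["i"] * env["i"])
print(result)   # 30, i.e. 1 + 4 + 9 + 16
```

If the term thunk were evaluated only once (true lazy evaluation), the sum would use a single stale value of i; it is the repeated forcing that makes the device work.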
The C thing that cos mentions I would not describe as lazy evaluation but consecutional evaluation. The difference being that in the consecutional case, the expression is never referenced (i.e. bound), as against the value never being needed but the expression having been referenced. The difference may appear subtle but it is conceivable that in a (imagined) language with precise exception semantics the lazy case might throw an exception at the point of being bound while the consecutional case might not.
There should, perhaps, be an article for consecutional semantics.
Noticing the similarities between these concepts is useful, but we should also be careful to notice their differences. -- dlm —Preceding unsigned comment added by 192.55.54.38 ( talk) 13:50, 17 May 2011 (UTC)
Please can someone write a section explaining why running 32-bit programs on 64-bit Windows is not considered thunking. —Preceding unsigned comment added by 198.54.202.114 ( talk) 15:43, 18 September 2010 (UTC)
Do people think (no semi-pun intended) that it would be useful to put an example of the assembly code a compiler would generate for the wrapper function in the example? It talks about how the function saves time on average (one expensive operation and one cheap operation in one case, and the addition of only a heap branch in another case), but it's kind of hard to see that without knowing what exact instructions are generated. I don't quite understand it well enough to write such an example myself :) - but I've seen this kind of code (Newton ROM - ARM Norcroft-compiled C++ [0.43/C4.68 (rel195/Newton Release 9); 1996]). I can say that it looks like a mess of jumps, and I think seeing a "simple" assembly compilation of this pattern would help understanding (to the extent that Wikipedia CS articles are a pseudo-textbook..)
Also, why doesn't the Talk page for Thunk_(object-oriented_programming) automatically redirect here to Talk:Thunk? — Preceding unsigned comment added by 98.223.232.121 ( talk) 16:42, 21 December 2011 (UTC)
The Thunk (compatibility mapping) page (which I'm merging back here) has these two examples.
I don't know what this paragraph is talking about, nor are there sources that might help me understand. The second one is maybe trying to say that "thunking" = "event-driven programming", but then it's wrong and by all appearances original research. Nonetheless I'm leaving the examples here in case I'm mistaken. 50.136.204.132 ( talk) 10:10, 7 March 2014 (UTC)
This article was split in Feb. 2011. This was its state at the time: disjointed, poorly formatted, jargon-heavy, filled with examples but very little explanation. So little explanation that editors were convinced that the article was talking about several unrelated concepts. It was not. All the meanings of "thunk" here (except the ones I pasted above) are practical evolutions of Ingerman's "computations as parameters" concept. They differ with regard to the reason that a computation is needed, and how it is generated (by the compiler, another compile-time tool, or a run-time service). There should be one article, and now there is. 50.136.204.132 ( talk) 07:36, 9 March 2014 (UTC)
IMHO, the example in this section is needlessly complicated, since it uses a class A that merely appears in the example while being irrelevant to the issue discussed.
To achieve the improvement requested in the header of the "Thunk" page, class A could be removed and classes B and C renamed to class A and class B respectively, thus making the intrinsic issue of the case shown clearer.
This would yield something like:
Thunks are useful in object-oriented programming platforms that allow a class to inherit multiple interfaces, leading to situations where the same method might be called via any of several interfaces. The following code illustrates such a situation in C++.
class A {
public:
    int value;
    virtual int access() { return this->value; }
};

class B : public A {
public:
    int better_value;
    virtual int access() { return this->better_value; }
};

int use(A *a) {
    return a->access();
}

// ...
void example() {
    A someA;
    use(&someA);
    B someB;
    use(&someB);
}
In this example, the code generated for each of the classes A and B will include a dispatch table that can be used to call access on an object of that type, via a reference that has the same type. Class B will have an additional dispatch table, used to call access on an object of type B via a reference of type A. The expression a->access() will use A's own dispatch table or the additional B table, depending on the type of object a refers to. If it refers to an object of type B, the compiler must ensure that B's access implementation receives an instance address for the entire B object, rather than just the inherited A part of that object.
[1]
As a direct approach to this pointer adjustment problem, the compiler can include an integer offset in each dispatch table entry. This offset is the difference between the reference's address and the address required by the method implementation. The code generated for each call through these dispatch tables must then retrieve the offset and use it to adjust the instance address before calling the method.
The solution just described has problems similar to the naïve implementation of call-by-name described earlier: the compiler generates several copies of code to calculate an argument (the instance address), while also increasing the dispatch table sizes to hold the offsets. As an alternative, the compiler can generate an adjustor thunk along with B's implementation of access that adjusts the instance address by the required amount and then calls the method. The thunk can appear in B's dispatch table for A, thereby eliminating the need for callers to adjust the address themselves.
[2]
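As a very rough illustration of the adjustor idea, here is a toy model in Python; the list-based "memory", the offset constant, and the function names are all invented for the sketch and have nothing to do with real C++ object layout. The adjustor thunk is just a tiny wrapper that fixes up the instance address before delegating:

```python
# Toy layout of one B object containing an A subobject:
#   slot 0: B's own field (better_value)
#   slot 1: the inherited A part (value)
A_IN_B_OFFSET = 1   # invented offset of the A subobject inside B

def b_access(memory, addr):
    # the real implementation: expects addr to point at the start of B
    return memory[addr]          # returns better_value

def b_access_adjustor(memory, addr):
    # adjustor thunk: the caller held an A-typed reference, i.e. the
    # address of the A subobject; subtract the offset, then delegate
    return b_access(memory, addr - A_IN_B_OFFSET)

memory = [42, 7]                 # one B object: better_value=42, value=7
# a caller holding an "A*" sees address 1; the thunk adjusts it back to 0
print(b_access_adjustor(memory, 1))   # prints 42
```

Callers through the A-flavoured dispatch table invoke the adjustor; callers that already hold a B address invoke b_access directly, so neither pays for an adjustment it does not need.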
One of the uses for thunks is a numeric computation which repeatedly invokes a subcomputation with different inputs, e.g., an integration routine. In ALGOL 60 such a routine would normally be written as a procedure with a call-by-name parameter for the subcomputation. In some other languages it would probably be coded with a procedure parameter, avoiding the need for a thunk. I believe that Thunk#Applications should have a subsection with examples of such uses, but am not sure what it should be called. Shmuel (Seymour J.) Metz Username:Chatul ( talk) 17:45, 3 May 2016 (UTC)
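For illustration, the procedure-parameter version described in the comment above might look like the following in Python; trapezoid_integrate is an invented name, and any quadrature rule would serve equally well:

```python
def trapezoid_integrate(f, a, b, n=1000):
    # repeatedly invokes the subcomputation f with different inputs,
    # playing the role the call-by-name thunk plays in ALGOL 60
    h = (b - a) / n
    total = (f(a) + f(b)) / 2
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

# integral of x^2 over [0, 1] is 1/3
approx = trapezoid_integrate(lambda x: x * x, 0.0, 1.0)
print(round(approx, 4))   # about 0.3333
```

Because Python functions are first-class values, the subcomputation is passed directly as a parameter and no compiler-generated thunk is needed.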
To the text "The address and environment of this helper subroutine," Chatul added the annotation that the environment of the helper routine (the thunk) is not the environment in question, but that it is some other environment which is unclear from the wording of the note.
Possibly incorrect text should be corrected, not annotated with a "clarification" that contradicts it. In this case, the text previously said "the address of the helper subroutine." This ought to be enough. Whether the environment of a function travels with it is an implementation question. An environment should be included to deal with the funarg problem, but thunks have seen limited use in languages such as C++ that do not include one. When the funarg problem is solved, the environment passed for the thunk routine is — almost by definition — the thunk routine's environment. What other routine's environment could it be? 73.71.251.64 ( talk) 19:02, 24 May 2020 (UTC)