In the third paragraph of Project significance, the sentence "The conformational states and the short simulations between them are then compiled into statistical Markov state models (MSMs), which essentially serve as a map of the protein's energy landscape and kinetic and equilibrium thermodynamics properties and illustrate folding pathways" (italics added) is problematic. The part I emphasized should be either split out or otherwise disconnected from the preceding long object: it requires rereading to be properly comprehended. Done
The last sentence of the same paragraph starts with "The Pande lab has used these MSMs to", as does the first sentence of the next (fourth) paragraph, which significantly breaks the text flow. Done
The first sentence of the fourth paragraph of the same section reads "The Pande lab has used these MSMs to parallelize simulations that overall require more than 10 million CPU hours." The amount of work that a CPU hour represents depends on the properties of the CPU the statement is based on (e.g. 10m CPU hours on an i486SX are far less work than 10m CPU hours on a modern server CPU). Either a note about the CPU this phrase refers to should be given, or the phrase itself should be generalized to resolve the ambiguity. Done
What is the "NTL9 protein" (in the next sentence)? If the exact protein is important for the topic, a wikilink should be here. Done
In the second sentence of the first paragraph of Biomedical research, the phrase "therapeutic intervention can be the next step, which take the form of molecules that alter the production of a certain protein" implies that the intervention takes the form of molecules; the sentence should be slightly reworded to avoid this impression. BTW, "takes". Done
In the third sentence of the third (last) paragraph of Cancer, the sentence "IL-2 binds to these pulmonary cells differently than it does to T cells, a key approach for IL-2 research" should be reworded, as the second phrase is logically disconnected. Done
In the fourth sentence of Osteogenesis imperfecta, the sentence "The Pande lab has produced a publication which uses quantum mechanical techniques to improve upon previous simulations of collagen, which may be useful for future computational studies of collagen" should be rephrased: a publication can describe the use of quantum mechanical techniques, but it doesn't actually use them. Done
The seventh sentence of the first paragraph of Viruses is incomplete: "limited to several orders of magnitude shorter" than what? Done
In Participation, the whole native/x86 FLOPS issue should be explained. Though this is partly done in the last paragraph of PetaFLOPS milestones, the reader should be able to understand what the "native FLOPS" are native to and whether summing them up gives a reasonable figure. Though I didn't dive in, I get the impression that the inadequacy of a total of differently counted FLOPS was the reason for the project's introduction of x86 FLOPS; if so, "native" FLOPS should probably be omitted for all the data available in x86 FLOPS, and a note explaining the issue should be added. Done
The last sentence of the Participation section should be expanded to explain that users were asked to donate CPU time. As it stands, I read the current text as a suggestion to donate money.
Also, either the text or (preferably) the footnote for this sentence (via |at=) should specify the exact update. Done
The sentence in the Software section (that single one, which doesn't belong to any subsection) gives the impression of an exclusive list of components. Probably the "consists of three components" should be changed to something like "involves three primary concepts". Done
The article refers to GROMACS as "GROMACS" and "Gromacs". One case should be used. Done
The second sentence of the Multi-core processing client section has the phrase "These cores work together to complete a single WU significantly faster". The word "core" needs to be disambiguated between the "core" in "multi-core processor" and the program's component. If the former is meant, some explanation is required to avoid leaving the impression that the effect is reached without specific code optimization. Done
I would note that, contrary to Wikipedia's practice, several articles are linked from this one more than once. Though I would ordinarily require that second, third and later occurrences be unlinked, this article imposes a substantial load on the reader, so repeated links may actually facilitate reading it. Please don't take this comment as a call to add further duplicate links, though.
2a. it contains a list of all references (sources of information), presented in accordance with the layout style guideline.
2b. reliable sources are cited inline. All content that could reasonably be challenged, except for plot summaries and that which summarizes cited content elsewhere in the article, must be cited no later than the end of the paragraph (or line if the content is not in prose).
Citation templates require more work: the authors' first and last names are ordered inconsistently, the |work= and |publisher= are used (and wikilinked) inconsistently, and the dates should be in MDY format. Done
The image in the infobox is reported to be distributed under CC0 conditions, but no proof is provided. The fact that the uploader's name looks like the name of the author (as written on the description page) doesn't prove the licensing information on its own. Done
The caption of the image in the Participation section includes unnecessary details: "shown, by device type, in teraFLOPS as recorded semi-daily from November 2006 until September 2007." Done
7. Overall assessment.
Discussion
Please refer to the issues in the table above by their numbers (e.g. 1a1 for the first issue with the "prose" criterion).
1a3: The exact phrase in the publication is "We created MSMs based on molecular dynamics simulations of a fast-folding double-norleucine HP35 mutant, extending prior calculations to include more than 1 ms of simulation and requiring more than 10 million CPU hours of computation." Since it also says "Folding@Home donors provided computer resources", I think it's reasonable to assume that the calculations were run on a broad range of hardware and that the measurement is simply the sum of everyone's CPU time.
Jesse V. (talk) 15:37, 6 April 2012 (UTC)
The problem is that this article will still be there in, say, 2050; I think we may safely assume that by then 10 million CPU hours would be an amount of computation far greater than the author of this quote was thinking of. Thus at least the year should be specified, e.g. "that overall required more than 10 million CPU hours as of 2011".
Dmitrij D. Czarkoff (talk) 21:00, 6 April 2012 (UTC)
1a4: I think it's important to specify which protein. An expert reading this article will probably be familiar with the chemical properties of that protein, but I'm not. There's no Wikipedia article on NTL9. The name is so short that IMO there's hardly a readability difference between "the NTL9 protein" and "a protein", but I can remove the name if you think that's important.
Jesse V. (talk) 15:37, 6 April 2012 (UTC)
1a5: I added the qualifier "can", which I think also fixed the grammar issue. Molecular interventions are things like drugs, antibiotics, etc.
Jesse V. (talk) 15:37, 6 April 2012 (UTC)
Correct me if I'm wrong, but as I understand it, a therapeutic intervention is an action, so it can only take the form of some action (molecular intervention in this case).
Dmitrij D. Czarkoff (talk) 21:00, 6 April 2012 (UTC)
1a9: I made an initial stab at clarifying the difference, and put the explanation above any mention of "native FLOPS". Removing this native FLOPS measurement would obliterate the info in the PetaFLOPS milestones section, almost all of which is based on native FLOPS from before x86 FLOPS were displayed. I've seen nothing to indicate that summing the native FLOPS is particularly bad, only that there's a difference between the two measurements. I personally don't see this as a big deal, but please let me know if there's something further I need to fix or clarify here.
Jesse V. (talk) 01:25, 7 April 2012 (UTC)
Well, I dismiss this issue, as the accurate information is unverifiable. I hope some improvements can come in the future. For the record: when the text says the total in native petaFLOPS is 5.7, and in x86 petaFLOPS it is 8, this means that replacing all x86 units with PS units of equal native petaFLOPS performance will result in no change in native petaFLOPS and a large increase in x86 petaFLOPS. That specifically means that one of these measurement systems, namely native petaFLOPS, is wildly inaccurate. That's why "[b]y reporting both, Folding@home attempts to even out these hardware differences."
Dmitrij D. Czarkoff (talk) 10:02, 7 April 2012 (UTC)
1a10: The Pande lab does accept monetary donations, did you know that? :P In any case, I've clarified it. I'm very unfamiliar with the "at = " field. Can you explain?
Jesse V. (talk) 15:37, 6 April 2012 (UTC)
1a13: It now says "CPU cores". Since it states that they work together, isn't that a sufficient explanation? The section then goes into details as to the specific techniques used to make them work together.
Jesse V. (talk) 21:40, 6 April 2012 (UTC)
I would be more explicit here, e.g. "the ability to use several CPU cores simultaneously allows to complete..." This wording isn't perfect; I believe you could choose a better one.
Dmitrij D. Czarkoff (talk) 22:17, 6 April 2012 (UTC)
2b1: I've inserted "work =" and "publisher =" information where it was missing (diff); does that address the problem? If an article exists for an author or publisher, I try to wikilink to it. Please specify where I'm "inconsistent" in that regard. Also, per this page the current format of "FirstName LastName" should be fine, no? Or do I need to do "LastName, FirstName" using first1 and last1 and all that? If you were referring to in-text dates, then I believe I've fixed this with this edit.
Jesse V. (talk) 02:56, 6 April 2012 (UTC)
Looks OK to me, though before applying for FA you may want to convert the dates in footnotes to MDY format, as the ISO 8601 date format you are using is deprecated. This is outside WP:GACR, though, so I'm not asking it of you now.
Dmitrij D. Czarkoff (talk) 08:24, 6 April 2012 (UTC)
6a1: I see your point. However, I know for a fact that that user is who he says he is, and that the statement is accurate. He is a Pande lab member, and this is his foldingforum.org account. I've also written to him via email, and he sent me the GIMP files for that image. Please advise as to how he can confirm his identity.
Jesse V. (talk) 02:56, 6 April 2012 (UTC)
Though the ideal solution is a link to the official website page stating the usage rights for this image, an e-mail message from the project (stating that they don't claim copyright, or that they release it under CC0) would be OK. See, if this image was produced while the author was working for the copyright owner of this software, the copyright may belong to the project without the author realizing it. Similarly, the way he submitted it may have transferred the rights.
Dmitrij D. Czarkoff (talk) 07:26, 6 April 2012 (UTC)
Notice: I understand that there are certain things that you are looking for and that some improvement suggestions would be outside the scope of this GA review. Therefore, if you or anyone else feels that there's something specific that needs to be done to further improve the quality of this article, especially if it can be brought really close to FA standards, feel free to open a topic on the Talk page. I would be eager to hear such suggestions, and if I couldn't take care of it myself I'm sure others could. Thanks,
Jesse V. (talk) 04:50, 6 April 2012 (UTC)
The article frequently uses constructions similar to "Pande lab using Folding@home". Could they be generalized to omit the Pande lab, or is there a reason to constantly mention it?
Dmitrij D. Czarkoff (talk) 09:20, 6 April 2012 (UTC)