![]() | This article is rated C-class on Wikipedia's
content assessment scale. It is of interest to the following WikiProjects: | |||||||||||||||||||||||||||
|
I've added notability and COI tags to the article. I've been unable to find articles on this topic independent of the group that invented the data structure. Without independent, in-depth reliable sources per WP:RS, this article may fail to satisfy notability guidelines per WP:GNG. The COI tag comes from the article creator also being one of the inventors of the algorithm and also an author on two of the cited papers: "Cache-Oblivoius streaming B-trees" and "The TokuFS Streaming File System". Per WP:COI, this can lead to problems with the neutrality of the article. Editors knowledgable about the topic should review the content for neutrality--although without independent reliable sources to rely on, this may be difficult. Thanks, -- Mark viking ( talk) 18:24, 23 April 2014 (UTC)
I've removed the notability tag. Database indexing is an important topic, B-Trees and their variants are very widely used despite their limitations. A new technique that overcomes them is notable.
I too would like to see more independent sources. Paul Foxworthy ( talk) 05:55, 1 April 2016 (UTC)
The article says "There are several choices for how the buffers are flushed, all leading to similar I/O complexity."
It seems you are hinting at whether all buffers in a node are flushed or only the fullest buffer is flushed, and comparing/contrasting the costs resulting due to one of the 2 decisions. Is that so? If that is the case, then it is totally not clear that it is what is meant. Some more elaboration would be super helpful! — Preceding unsigned comment added by Dhruvbird ( talk • contribs) 02:06, 28 April 2014 (UTC)
There are a bunch of choices for how and when to flush: as you point out, you can flush only the fullest buffer or all. But you can also flush-on-full only or also flush-on-query. None of these change the asymptotic analysis. It's all an engineering choice, based on typical workloads. Does that help?
Farach (
talk)
20:30, 28 April 2014 (UTC)
Thanks! That makes sense. However, I would like to see some proof or a sketch (maybe hand-wavey) of why it doesn't affect the asymptotics and whether the constants involved are different for either case.
Dhruvbird (
talk)
21:03, 30 April 2014 (UTC)
However, there are no links to /info/en/?search=Cache-oblivious_algorithm — Preceding unsigned comment added by Dhruvbird ( talk • contribs) 15:42, 28 April 2014 (UTC)
Although the COLA is cache oblivious, the current version of the FTI is not. It is based on the B-epsilon tree, which explicitly codes for a block transfer size.
Farach (
talk)
20:28, 28 April 2014 (UTC)
Ah! Yes, I forget - thanks for the clarification.
Dhruvbird (
talk)
21:03, 30 April 2014 (UTC)
You mentioned in one of the previous comments that the FT was earlier implemented using a COLA, but subsequent implementations have since been based on the B-epsilon tree. Are there any references to why this decision was made? What were the tradeoffs and why the current choice is a better one? — Preceding unsigned comment added by Dhruvbird ( talk • contribs) 04:10, 1 May 2014 (UTC)
![]() | This article is rated C-class on Wikipedia's
content assessment scale. It is of interest to the following WikiProjects: | |||||||||||||||||||||||||||
|
I've added notability and COI tags to the article. I've been unable to find articles on this topic independent of the group that invented the data structure. Without independent, in-depth reliable sources per WP:RS, this article may fail to satisfy notability guidelines per WP:GNG. The COI tag comes from the article creator also being one of the inventors of the algorithm and also an author on two of the cited papers: "Cache-Oblivoius streaming B-trees" and "The TokuFS Streaming File System". Per WP:COI, this can lead to problems with the neutrality of the article. Editors knowledgable about the topic should review the content for neutrality--although without independent reliable sources to rely on, this may be difficult. Thanks, -- Mark viking ( talk) 18:24, 23 April 2014 (UTC)
I've removed the notability tag. Database indexing is an important topic, B-Trees and their variants are very widely used despite their limitations. A new technique that overcomes them is notable.
I too would like to see more independent sources. Paul Foxworthy ( talk) 05:55, 1 April 2016 (UTC)
The article says "There are several choices for how the buffers are flushed, all leading to similar I/O complexity."
It seems you are hinting at whether all buffers in a node are flushed or only the fullest buffer is flushed, and comparing/contrasting the costs resulting due to one of the 2 decisions. Is that so? If that is the case, then it is totally not clear that it is what is meant. Some more elaboration would be super helpful! — Preceding unsigned comment added by Dhruvbird ( talk • contribs) 02:06, 28 April 2014 (UTC)
There are a bunch of choices for how and when to flush: as you point out, you can flush only the fullest buffer or all. But you can also flush-on-full only or also flush-on-query. None of these change the asymptotic analysis. It's all an engineering choice, based on typical workloads. Does that help?
Farach (
talk)
20:30, 28 April 2014 (UTC)
Thanks! That makes sense. However, I would like to see some proof or a sketch (maybe hand-wavey) of why it doesn't affect the asymptotics and whether the constants involved are different for either case.
Dhruvbird (
talk)
21:03, 30 April 2014 (UTC)
However, there are no links to /info/en/?search=Cache-oblivious_algorithm — Preceding unsigned comment added by Dhruvbird ( talk • contribs) 15:42, 28 April 2014 (UTC)
Although the COLA is cache oblivious, the current version of the FTI is not. It is based on the B-epsilon tree, which explicitly codes for a block transfer size.
Farach (
talk)
20:28, 28 April 2014 (UTC)
Ah! Yes, I forget - thanks for the clarification.
Dhruvbird (
talk)
21:03, 30 April 2014 (UTC)
You mentioned in one of the previous comments that the FT was earlier implemented using a COLA, but subsequent implementations have since been based on the B-epsilon tree. Are there any references to why this decision was made? What were the tradeoffs and why the current choice is a better one? — Preceding unsigned comment added by Dhruvbird ( talk • contribs) 04:10, 1 May 2014 (UTC)