This is the talk page for discussing improvements to the Context-free grammar article. This is not a forum for general discussion of the article's subject.
For example, if we take the following grammar:
(1) S → S + S
(2) S → 1
(3) S → a
and the string "1 + 1 + 1" then the left derivation of this string is the list [ (1), (1), (2), (2), (2) ]. Analogously the rightmost derivation is defined as the list that we get if we always replace the rightmost nonterminal first. In this case this would be the list [ (1), (2), (1), (2), (2)].
It is necessary to add that the arrows outgoing from a node are ordered. AlexShkotin 05:14, 6 September 2006 (UTC)
and the string '1 + 1 + a'. Still the parsing process is not determined (neither rightmost nor leftmost derivation) because there always is a choice between first expanding (via rule (1)) or terminating (via rules (2) or (3)). Although leftmost and rightmost derivation are clearly distinguished by the order of application of rules (2) and (3), this is not clear from the text. I am not sure whether an altogether different example is useful or if it is sufficient to point out that there are different derivation paths for each type of derivation. -- Larsipulami ( talk) 15:20, 10 December 2009 (UTC)
        +                +
       / \              / \
      +   a            1   +
     / \                  / \
    1   1                1   a

and both project to "1 + 1 + a": the first as (1 + 1) + a, the second as 1 + (1 + a).
Is it possible to decide whether a given context-free grammar is equivalent to a given regular grammar? In particular, is it possible to decide whether there is a single string that doesn't belong to the context-free language (that is, whether or not it is equivalent to .*)? Taw 14:41 Apr 6, 2003 (UTC)
On the page Context-free_language#Decidability_properties it is stated:
And judging from Regular_grammar and Regular_language it seems that regular languages can easily be described by context-free grammars (which is natural, as any regular language is also context-free), so a right-linear CFG or a left-linear CFG generates a regular language. So there are *some* CFGs for which regularity is decidable, right? I am not questioning the theory, I just think it maybe could be explained more clearly? -- Lasse Hillerøe Petersen ( talk) 20:21, 7 December 2014 (UTC)
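To illustrate that last point with a hedged sketch (toy code, nothing from the article): a right-linear grammar whose rules are restricted to the forms A → aB, A → a, and A → ε translates directly into a finite automaton, which is why such a grammar can only generate a regular language.

    def right_linear_to_nfa(rules, start="S"):
        """rules: list of (nonterminal, right-hand side) pairs, with the rhs of
        the form 'aB', 'a', or '' (epsilon). Returns a membership test."""
        delta = {}                        # (state, terminal) -> set of next states
        accepting = {"FINAL"}
        for lhs, rhs in rules:
            if rhs == "":                 # A -> epsilon: A itself is accepting
                accepting.add(lhs)
            elif len(rhs) == 1:           # A -> a: consume 'a', go to the final state
                delta.setdefault((lhs, rhs), set()).add("FINAL")
            else:                         # A -> aB: consume 'a', go to state B
                delta.setdefault((lhs, rhs[0]), set()).add(rhs[1])

        def accepts(word):
            states = {start}
            for ch in word:
                nxt = set()
                for q in states:
                    nxt |= delta.get((q, ch), set())
                states = nxt
            return bool(states & accepting)

        return accepts

    # Toy right-linear grammar for the regular language a*b: S -> aS, S -> b
    accepts = right_linear_to_nfa([("S", "aS"), ("S", "b")])
    print(accepts("aaab"), accepts("aba"))   # True False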
Context-free grammars are important because they are powerful enough to describe the syntax of programming languages
(Well...context-free grammars can describe most of the syntax of programming languages. For example, any programming language that requires that variables be declared before use is not fully describable via a context-free grammar, essentially because there can be arbitrary amounts of source code between declaration and use--see the reference to the "pumping lemma" below. Such constraints are typically swept over into the corner, labelled "semantics," and dealt with outside of the parser mechanism proper. In the example, "semantic action" code is attached to the grammar rule that recognizes identifiers to check against a symbol table.) --24.36.158.101
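For what it's worth, here is a minimal sketch of the kind of "semantic action" described above (the function names and the symbol-table representation are invented for illustration, not taken from any particular parser generator):

    declared = set()                      # a toy symbol table

    def on_declaration(name):
        """Action attached to the grammar rule that recognizes a declaration."""
        declared.add(name)

    def on_use(name):
        """Action attached to the rule that recognizes an identifier use;
        the declare-before-use constraint lives here, outside the grammar."""
        if name not in declared:
            raise NameError(f"'{name}' used before declaration")

    # A CFG-driven parser would invoke these actions as it reduces rules:
    on_declaration("x")
    on_use("x")                           # fine
    try:
        on_use("y")                       # never declared
    except NameError as error:
        print(error)                      # 'y' used before declaration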
Phrase structure grammar and context-free grammars
There is a redirect from Phrase structure grammar to this page.
Does this mean that Phrase structure grammars are just context-free grammars in any case? Isn't there more to say about Phrase structure grammars?
It ought to be a stub if they are not equivalent. Otherwise it is misleading. -- Sundar 05:08, Feb 15, 2005 (UTC)
I did some more looking around on this. Here's the score from the most reputable sources I could find:
So this seems like a disputed usage. At any rate, no one uses the term in any current work. The stub idea seems reasonable. kraemer 00:39, 20 June 2007 (UTC)
It would be nice to include some statement about whether natural languages are context-free in this article. On this topic, the article Bambara language has the following note:
I wonder what this statement is really supposed to say, because I find it a bit difficult to believe. Can it really be so hard to find an example showing that English is not context-free? -- Saforrest 12:37, Apr 19, 2005 (UTC)
The article states that "BNF (Backus-Naur Form) is the most common notation used to express context-free grammars.", but still the article makes use of production rules like V→w instead of V::=w. Is that because production rules and BNF are interchangeable? -- Robsy 11:32, 2 May 2006 (UTC)
I doubt that Lojban is actually context-free. It would probably be context-free if none of the terminator cmavo were elidable. Icek 22:53, 28 June 2006 (UTC)
You either understand all this jargon, in which case you already know what a "context free grammar" is...or you don't, and this is Greek to you.
Useless.
The comment above is useless. I found the explanation on this page clear and helpful. -- Whisk3rs 00:07, 17 May 2007 (UTC)
I tried to add a simpler explanation that most people could understand to the introduction just now. — Preceding unsigned comment added by Programmers are evil ( talk • contribs) 20:24, 27 October 2015 (UTC)
I agree that an informal motivation would be a good idea; I'm thinking about a natural language example along the lines of the image shown here. We could give a small excerpt of an English grammar that just generates the sentence. - Jochen Burghardt ( talk) 22:20, 19 March 2016 (UTC)
A first attempt to realize my above suggestion is shown to the right. I hope the tree image might be widely familiar from English school education. The image caption could explain some issues of CFGs. - Presentation problems are: the grammar takes too much horizontal space; however, when the syntactical categories are abbreviated (like "VP" for "Verb_Phrase") to save space, they'd need to be explained. commons:Category:Syntax trees in English contains 159 syntax trees, but none of them fits perfectly here; maybe we need to design a 160th one. - Jochen Burghardt ( talk) 20:04, 25 March 2016 (UTC)
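To give an idea of what such an excerpt could look like, here is a hedged sketch (the category names and words are chosen arbitrarily, not taken from the proposed image) of a tiny English fragment, with a few lines of Python that expand it into a single sentence:

    GRAMMAR = {
        "Sentence":   [["NounPhrase", "VerbPhrase"]],
        "NounPhrase": [["Determiner", "Noun"]],
        "VerbPhrase": [["Verb", "NounPhrase"]],
        "Determiner": [["the"]],
        "Noun":       [["dog"], ["cat"]],
        "Verb":       [["chased"]],
    }

    def expand(symbol):
        """Leftmost expansion, always taking the first listed alternative."""
        if symbol not in GRAMMAR:
            return [symbol]               # terminal word
        return [word for part in GRAMMAR[symbol][0] for word in expand(part)]

    print(" ".join(expand("Sentence")))   # the dog chased the dog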
A starting point would be to add an explanation of the symbol S used in the first example. Does it represent a grammar? In that case, would G be more appropriate? No? Why? Also, the use of the arrow symbol (production?) could be linked to an article describing that symbol, etc. The use of symbols instead of human language warrants easy reference material for the symbols used. Otherwise the use of symbols serves only an obfuscation purpose, being informative only to those already adept in the topic, which is doubtfully the purpose of Wikipedia.
This particular language can be generated by a parsing expression grammar, which is a relatively new formalism that is particularly well-suited to programming languages.
I've cut this from the head of the article. PEGs are relatively new, and I see no reason why they would be differentiated from any of the number of well-established formalisms that can also accept the language a^n b^n c^n, especially since this is the canonical type 1 language (context-sensitive).
68.122.42.35 ( talk) 03:33, 25 October 2008 (UTC)
In the article "context-free grammar" the first paragraph could use a little clarity.
The following sentence: "The term "context-free" expresses the fact that nonterminals can be rewritten without regard to the context in which they occur" could really use an example or another sentence or two to describe it. I was unable to figure out what this meant, possibly not because I'm stupid. (i.e.: I'm not stupid, which may or may not be why I couldn't understand the above sentence :P )
I am not formally trained, but then, that could be seen as permissible for Wikipedia.
Thanks to all you geniuses who are teaching me math and computer science.
I stumbled upon a tool that produces images from context free grammars. Here is the link http://www.contextfreeart.org/index.html 140.203.154.11 ( talk) 18:37, 11 November 2008 (UTC) mihai andrei
Under the formal definition section, the last item in Additional Section 5 says "no cycles". In this context, what exactly is a cycle? 66.69.234.193 ( talk) 22:05, 24 January 2009 (UTC)
The only reasonable cycle that comes to my mind is something like this: S->BA, B->S. If you have no cycles of that kind, you can generate just a finite amount of products from the CFG and thus the generated language is star-free. One might want to use that kind of CFG to compress text. —Preceding unsigned comment added by Hiihammuk ( talk • contribs) 12:44, 25 January 2009 (UTC)
I have read this article a few times over the months, and never noticed that it mentioned (in the Normal Form section) that the membership problem is decidable (i.e. determining whether a string is generated by a given grammar). I think this should get more prominent mention in the article. Maybe in the summary? It at least shouldn't only be buried at the end of the normal form section. --pdf23ds 24.174.103.112 ( talk) 18:17, 17 May 2009 (UTC)
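For context, the decidability of the membership problem is usually shown via the CYK algorithm on a grammar in Chomsky normal form. A hedged sketch, using an assumed toy CNF grammar for { a^n b^n : n >= 1 } rather than anything from the article:

    def cyk(word, binary_rules, terminal_rules, start="S"):
        """Return True iff `word` is generated by the given CNF grammar."""
        n = len(word)
        if n == 0:
            return False                  # this CNF fragment has no epsilon rule
        # table[i][length] = set of nonterminals deriving word[i:i+length]
        table = [[set() for _ in range(n + 1)] for _ in range(n)]
        for i, ch in enumerate(word):
            table[i][1] = {A for A, t in terminal_rules if t == ch}
        for length in range(2, n + 1):
            for i in range(n - length + 1):
                for split in range(1, length):
                    for A, (B, C) in binary_rules:
                        if B in table[i][split] and C in table[i + split][length - split]:
                            table[i][length].add(A)
        return start in table[0][n]

    # Assumed CNF grammar: S -> AB | AX, X -> SB, A -> a, B -> b
    binary = [("S", ("A", "B")), ("S", ("A", "X")), ("X", ("S", "B"))]
    terminal = [("A", "a"), ("B", "b")]
    print(cyk("aabb", binary, terminal))   # True
    print(cyk("aab", binary, terminal))    # False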
I like Likebox's edits a lot in general, but I'm not so happy about the references to Noam Chomsky's beliefs. The article should provide facts about cfgs, not beliefs. And while they are at the core of Chomsky's frameworks for describing syntax, they were always complemented with transformations in some form, so the reader shouldn't be left with the impression that syntax == context free grammar in Chomsky's eyes. Rp ( talk) 19:32, 11 January 2010 (UTC)
I've modified the section slightly, because Chomsky has never believed that language is strictly context free, but rather that it has a context free component whose output is manipulated by further decidedly non-CF mechanisms (at the time, transformations, and currently, movement). This is clear in his work, when he addresses the inadequacy of CFG for various phenomena. -- Augur ( talk) 08:18, 1 February 2010 (UTC)
There is a distinction between the grammar of parentheses matching (()()) and the grammar of matching two types of parens (([][])[()]). The first language can be approximately parsed by a finite automaton exponentially well, meaning that while it is technically context free and requires a stack, a finite automaton with an n-bit counter can match strings of matching parens up to a depth of 2^n. Effectively, this means that a regular grammar will work to match the strings, and I wouldn't call that a real context free grammar. I would call it a "log context free grammar". I don't know what the official term is, if there is one.
On the other hand, with two types of parens, n bits can only work up to a depth of n. This is a true context free grammar, it can't be approximated well by a regular grammar, since you need to push the type onto the stack. The languages of interest, natural and computer languages, are true context free grammars. Likebox ( talk) 13:55, 19 January 2010 (UTC)
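A small sketch of the distinction being drawn here (nothing official, just an illustration): a single depth counter handles one kind of parenthesis, but it cannot remember which kind was opened, whereas a stack-based recognizer can.

    def counter_match(s):
        """Depth counter only: correct for a single bracket type, but it
        wrongly accepts interleaved types such as ([)]."""
        depth = 0
        for ch in s:
            if ch in "([":
                depth += 1
            elif ch in ")]":
                depth -= 1
                if depth < 0:
                    return False
        return depth == 0

    def stack_match(s):
        """Pushes the bracket type, as a pushdown recognizer would."""
        stack, pairs = [], {")": "(", "]": "["}
        for ch in s:
            if ch in "([":
                stack.append(ch)
            elif ch in ")]":
                if not stack or stack.pop() != pairs[ch]:
                    return False
        return not stack

    print(counter_match("([)]"), stack_match("([)]"))   # True False
    print(counter_match("([])"), stack_match("([])"))   # True True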
Never? How then does one account for such a construction (Catullus):
Sed mulier cupido quod dicit amanti in vento et rapida scribere oportet aqua
Thank you. I just so happen to be a linguist. By "transliterate" I suppose that you meant "translate". So here:
Word for word :
But [a] woman eager what [she] tells [a] lover in [the] wind and swift [to] write one-should water
In English now:
But what a woman tells an eager lover ought to be written in the wind and in swift water. —Preceding unsigned comment added by JacquesGuy ( talk • contribs) 00:12, 25 January 2010 (UTC)
"Cupido" is masculine dative, "eager" in "a woman is eager" is feminine nominative, i.e. "cupida"
Go learn Latin. End of discussion. —Preceding unsigned comment added by 61.68.162.228 ( talk) 06:15, 26 January 2010 (UTC)
The article in question points out that Swiss German has a non-context-free construction, which involves the following:
Jan says that (we-1) (Mary-1) (the house-2) helped-1 paint-2
The numbers say which nouns go with which verb. This is not stack based, otherwise it would be:
Jan says that (we-1) (Mary-1) (the house-2) (painting-2) (helped-1)
So agreed, it's not context free. But you need to go several levels deep to make this work. The claim that they make is that you can go on in this pattern in Swiss German with more nouns and more verbs
Jan says that we-1 Mary-1 Martha-2 Sally-3 the house-4 told-1 tell-2 help-3 paint-4
If you find something like this, that would be the clincher. But I have not been able to verify this schema as grammatical in Dutch German, and the gloss provided by the authors for their analogous sentence was insufficient to determine whether it was like the above.
But the construction in question is peculiar to Swiss German, and Dutch, while Chomsky was working with English. It is manifestly obvious that English grammar is context free. I don't know how this article can be used to support the statement that "Chomsky's observations about the non-context freeness of natural language has held up", especially since I don't see any examples analogous to the above in Chomsky, and I don't know of any in English. Likebox ( talk) 13:00, 9 February 2010 (UTC)
The example of an overlapping construction:
Is not as overlapping as the Martian
Which is truly inhuman. The second is a better example, I think, because the first is too natural--- it could be produced. The second shows what a truly non-context-free language would look like.
This example of an overlapping construction is very good--- I came up with much worse examples. But it never occurs in the New York Times (I don't think), and it has a colloquial flavor (at least to my ears), precisely because the dangling "with green headlights" is not in the proper place. I feel like it's a "John saw a car with green headlights in the ad yesterday", mangled up as someone said it. But I might have been programming a computer too long.
It would be good to have two examples--- a truly Martian language, with tons of overlapping constructions everywhere (it's easy to make one up), and an example in English like the one above. Unfortunately the one above sounds very colloquial and marginally grammatical. Do you know of a similar example in formal writing? Likebox ( talk) 05:59, 21 February 2010 (UTC)
Article currently states: "In later work (e.g. Chomsky 1981), the idea of formulating a grammar consisting of explicit rewrite rules was abandoned."
Could someone (who is more familiar than myself with the subject matter) please rewrite this sentence to eliminate the ambiguity and the passive voice. "Was abandoned" ... by whom? Chomsky?
Karl gregory jones ( talk) 19:02, 9 February 2011 (UTC)
Article states: "It is hard to find cases of possible overlap and many such cases can be explained in an alternative way that doesn't assume overlap."
1. "Hard to find" sounds un-encyclopedic to my ear. Can someone rewrite this to sound more professional?
2. "Hard to find" means uncommon but extant. Can someone provide an example, or examples plural?
3. Can someone provide examples that have been explained in alternative ways?
Karl gregory jones ( talk) 19:11, 9 February 2011 (UTC)
In the last paragraph of subsection "Proper CFGs" we find the following:
This makes it clear that . The language is context-free, but as shown in Example 4.8, it is not regular.
Should we remove this hard-coded reference to "Example 4.8"? Perhaps we might add an actual reference for this result.
(Sorry for the minor and possibly irrelevant points, I'm a newbie contributor.) — Preceding unsigned comment added by Marcelkcs ( talk • contribs) 16:39, 29 May 2011 (UTC)
I don't think it makes sense to require that there is at least one production rule for the non-terminal S. All that counts is that the transitive closure of the production rules, filtered to terminals, is taken when defining the accepted language L(G). If there is no production rule for S, then we obviously have L(G) = {}, i.e. the accepted language is empty. This is not to be confused with L(G) = {eps}.
But there are also other context-free grammars that lead to empty languages, and there are some that do have a rule for the start symbol S. For example, the following grammar also has an empty accepted language:
S --> S
So the requirement that there is at least one rule for S does not imply that the accepted language L(G) is non-empty, so it doesn't make any sense.
Janburse ( talk) 20:09, 20 June 2011 (UTC)
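In support of the point above, here is a hedged sketch of the usual emptiness test: compute the set of productive nonterminals as a fixpoint and check whether the start symbol is among them. The rule format is assumed for illustration, not taken from the article.

    def is_empty_language(rules, start="S"):
        """rules: {nonterminal: list of right-hand sides, each a list of symbols};
        a symbol that is not a key is treated as a terminal."""
        productive = set()
        changed = True
        while changed:
            changed = False
            for lhs, rhss in rules.items():
                if lhs in productive:
                    continue
                for rhs in rhss:
                    if all(sym in productive or sym not in rules for sym in rhs):
                        productive.add(lhs)
                        changed = True
                        break
        return start not in productive

    print(is_empty_language({"S": [["S"]]}))              # True: S --> S derives nothing
    print(is_empty_language({}))                          # True: no rule for S at all
    print(is_empty_language({"S": [["a"], ["a", "S"]]}))  # False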
I'm currently studying CFGs in school; I got rather confused reading this article, when presented with this notation:
S → SS
S → (S)
S → ()
I hadn't seen a CFG presented that way before and I didn't know what it meant. Then later in the article I arrived at
S → bSbb | A
A → aA | ε
which is what I'm used to seeing.
After hunting through the article numerous times I finally searched the page for pipes and found "It is common to list all right-hand sides for the same left-hand side on the same line, using | (the pipe symbol) to separate them. Hence the grammar above can be described more tersely as follows:" arbitrarily hidden under the "A regular grammar" section.
There absolutely needs to be a "Notation" section before the "Examples" section that clearly explains both styles of notation to the reader. Not explaining your first notation and switching to another notation midway through with an easily-missed warning is a recipe for confusion for those not fully familiar with the subject. I would write a notation section myself, but I don't feel I grasp the concept and terminology enough to write accurately about it. Some guy ( talk) 05:52, 3 March 2013 (UTC)
The lead (or lede in Wiki-speak?) is terrible, IMO. The Background section does a better job of explaining what this is all about. I am a SW engineer, and remember a little about this from college many years ago, but that lead section did nothing in helping me remember the overall purpose and meaning of a CFG. I can't imagine how it could be expected to help a lay-person. Nerfer ( talk) 17:18, 23 March 2020 (UTC)
Until the section "Well-formed parentheses", commas are used to separate production rules. But from there on, no commas are used! I think this should be done in a uniform way. — Preceding unsigned comment added by Doubletimer ( talk • contribs) 17:55, 27 November 2020 (UTC)
I don't have editing experience, so I thought I'd bring this up here and let someone else address it. I buy that the language in this section (balanced but not properly nested strings of parentheses and brackets) is not context-free, but this cannot be shown through the Pumping Lemma. Indeed, any nonempty string in this language will contain either the pair '(' and ')' or the pair '[' and ']' - either of these pairs of substrings will satisfy the conclusion of the Pumping Lemma. The observation that this language contains the sublanguage (^n [^n )^n ]^n is irrelevant. The complexity of a language puts no bounds on the complexity of its subsets (e.g. the regular language (a)* of all strings of a's contains the subset {a^n | n is in your favorite non-recursively-enumerable set of natural numbers}). Mersenne44 ( talk) 21:07, 26 September 2022 (UTC)
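One way to square this (a hedged sketch of the standard argument, which may or may not be what the article intends): context-free languages are closed under intersection with regular languages, and intersecting the language of this section with the regular language (^* [^* )^* ]^* gives

    { (^i [^j )^i ]^j | i, j >= 0 },

which the pumping lemma does rule out directly, so the original language cannot be context-free either.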
Minor nitpick here.
The article says:
My suggestion is to avoid calling | the "pipe symbol", which might cause confusion to people interpreting it as the Bash pipe ( /info/en/?search=Vertical_bar#Pipe) and think that is somehow fed into .
Since the vertical bar ( /info/en/?search=Vertical_bar) goes by a lot of other names, I think it would be more appropriate to refer to it as either:
130.192.232.228 ( talk) 12:21, 30 September 2022 (UTC)
In the "Examples" section, the aforementioned sentence is confusing to me (I understand a lot of the rest of the page). The ε-production itself is defined in the context of the article earlier (further up) and no mention is made alluding to why there'd be [later] some improper something with it. It is only described what an ε-production is. I can't find any information that should somehow hint at or explain why "it is not proper". Transientdifficulties ( talk) 20:10, 6 April 2023 (UTC)