This pseudocode is no longer a proposed standard. See User:Dcoetzee/Wikicode.
This issue has generated much discussion, and here seems to be the best place to continue that discussion, but for background reading, you should also see Talk:Pseudocode, Talk:Algorithms on Wikipedia, and User talk:Dcoetzee/Wikicode.
(I've added the above for newcomers, like me, not existing protagonists. Mark Hurd 10:16, 10 Oct 2004 (UTC))
Discussion originally placed on project page follows
: int
demonstrate, they're just not required. You may be looking for them before and not after (see second point).
Derrick Coetzee
03:12, 6 May 2004 (UTC)
This is all just some random thoughts, so feel free to disregard what I say as you see fit ;) Dysprosia 23:09, 5 May 2004 (UTC)
Thanks a lot for your careful examination and comments, will make some related changes soon. Derrick Coetzee 23:31, 5 May 2004 (UTC)
I don't think brackets on a function call should be optional. Why have two ways to do the same thing?
CGS
23:44, 5 May 2004 (UTC).
I can see how something like this would be useful to illustrate simple algorithms. But even there, you can't necessarily compare the efficiency of two algorithms without knowing a little bit more about how data structures are meant to be implemented. For instance, "lists" in real programming languages are sometimes linked lists (as in Lisp) and sometimes vectors (as in Python). With linked lists, inserting an element in the middle is cheap, but taking the list's length is expensive; with vectors, it's just the other way around.
map
type is. My philosophy here is that the operations I describe are just an interface to an abstract data structure, and that in translating to an executable code (or performing analysis) you would choose an appropriate real data structure based on the set of operations being used. On the other hand this might be potentially confusing to the reader; I'm not sure which way to go on this. I certainly don't want a proliferation of collections for every concrete collection data structure. What do you think?
Derrick Coetzee
15:43, 6 May 2004 (UTC)I also don't see anything here about pass-by-value vs. pass-by-reference. This matters, too: can functions alter their parameters in the enclosing scope of the call?
At least one more operation on maps is needed: keys(map)
, to extract a set or list of the keys. And how is the map unassigned
value distinct from the record null
value? Is one meant to smell like a Lisp NIL and the other like an SQL NULL? :)
keys
operation myself, fixed that. As for unassigned
versus null
, the reason null
is not used for both is so that a key can map to null
without being unassigned.
Derrick Coetzee
15:22, 6 May 2004 (UTC)null
in a map, can I store a map unassigned
in a record?" Thus, each different subscriptable type ends up with its own way of saying "nobody home"?defined
. In Python, you can use the in
operator on any collection type -- for maps ("dictionaries") it asks if a key is defined in the map. --
FOo
20:52, 6 May 2004 (UTC)The keywords tagged union
and cases
are related to one another, but don't look related from the names. Perhaps type union
and type case
?
--
FOo
14:18, 6 May 2004 (UTC)
Thanks a lot for the feedback, will address your other points a bit later. Derrick Coetzee 15:22, 6 May 2004 (UTC)
For block delimiting, I would prefer to use parentheses (like in
ML) to indicate a list of statements. Shown below are examples without delimiters, with begin and end tags, and with parentheses.
sum := 0 while n > 0 sum := sum + n n := n - 1
sum := 0 while n > 0 begin sum := sum + n n := n - 1 end
sum := 0 while n > 0 ( sum := sum + n n := n - 1 )
Out of these examples, I like the parentheses best.... jaredwf 16:18, 8 May 2004 (UTC)
But with the indentation the parentheses are redundant. CGS 09:29, 9 May 2004 (UTC).
May I remind you still another way that saves one line of code
sum := 0 while n > 0 sum := sum + n n := n - 1 endwhile
or a variation:
sum := 0 while n > 0 sum := sum + n n := n - 1 elihw
Moreover, a disadvantage of parentheses is that they visually "blur" the boundaries of a block, you know, kind of gray-scale halftones between black and white. Mikkalai 21:24, 13 May 2004 (UTC)
These examples look like Fortran 90/95/2003, where one would write
do while (n > 0) sum = sum + n n = n - 1 end do
Fortran could be used as a model.
Whatever approach, IMO block termination keyword is a must. Suppose you had
while n > 0 sum := sum + n read_next(n) report(n)
By a single keystroke, I convert it into
while n > 0 sum := sum + n read_next(n) report(n)
It requires a man who knows an algorithm in depth to catch the bug. Mikkalai 23:31, 13 May 2004 (UTC)
Just to add my opinion to the blocks debate: I think the maintenance concern is, frankly, moot, because we will not be writing large amounts of code using this. Keep that in mind. We will certainly be changing the code, but it should generally be expected to fit comfortably on a screen, if not less. I generally favour indention alone for this reason.
As for Mikkalai's example, it's true that a relatively small edit could effect a big difference in behaviour, but the resulting problem is highly visible, and so should be quickly fixed. I've seen sneaky vandalism on Wikipedia that was much more subtle than this, and still got quickly fixed.
As my final point in favour of indention alone, I note that, when block delimeters are used, if a line of code were improperly indented, we would nevertheless consider this a potential readability problem and so something that we would have to fix anyway. The added flexibility is not only unnecessary but undesirable.
However, I also realise that many existing programmers are used to some sort of block delimeter, and so this helps them read code. Whether this outweighs the brevity and reduced clutter gained by omitting them, I am unsure. In any case, we would certainly use some form of punctuation, and not cluttersome begin/end keywords.
Derrick Coetzee 02:30, 14 May 2004 (UTC)
I like the endwhile example by
Mikkalai since it saves some space, but still "blocks" off the block.... If we use block delimiters, "improperly indented" code may have a readability problem, but the code will still be correct and it will be obvious that there is improper indention. Like the example from above, if we use endwhile we have
while n > 0 sum := sum + n read_next(n) endwhile report(n)
and if we don't we get
while n > 0 sum := sum + n read_next(n) report(n)
It is extremely obvious from the first example that the indention is wrong, but for the second it is not as clear. The block delimiter only adds one more line and I think it aids clarity.
-- jaredwf 04:35, 14 May 2004 (UTC)
Instead of endwhile why not use Donald Knuth's proposal? Like so:
while n > 0 sum := sum + n read_next(n) repeat report(n)
The keyword repeat is nice because when you read to it, you realize you are reading the last statement of a looping block. This is better than "end" which could be confused with an if (also, it makes it sound like the loop will end, as if it were a break). I too agree with the others here who think that Python's experiment in formatting is a failed one at best, and that some delimiter (even if it's small like fi and repeat) is better than nothing. (Mind you, Python is an enjoyable programming language to program in, and it's very useful. I just think they made a bad decision with this one issue.) -- MShonle 19:15, 21 Dec 2004 (UTC)
This proposal is worthwhile and I think some such is overdue. Lots of work, but .... I see a potential problem however. In the interest of doing a really good job, there will surely be here the dreaded committee effect. Accretion of feature after beloved feature until the result is the camel designed by the horse design committee. This being WP, where all is hostage to all, I can't think of a solution.
Wirthian feature parsimony is fine in many respects, but not ideally suited for instant transparency. C character parsimony is certainly not. COBOLic verbosity is self-defeating. And so on. There may not be an ideal examplar amongst real languages.
The proposal as it exists now is admirable. Best possible, certainly not -- there can be no way to identify it even if we had one. Perhaps a reasonable way would be to have an open period, followed by a vote, followed by a discussion changes until another vote. And repeat.
Sort of like Linus does with the kernel. But, who will bell the cat by taking the Linus role? The vote procedure at least has the virtue of precedent here on WP.
Comment? ww 22:54, 6 May 2004 (UTC)
You know what? Perhaps each of us should come up with a solution/wikicode proposal, and let the community vote on one proposal, and then chip away/discuss/improve that? Dysprosia 23:02, 6 May 2004 (UTC)
I agree that the way I've set it up right now is bound to introduce bias, not the least of which being the tendency to not change reasonable-looking things I wrote into the original proposal. I think the most important thing however is getting something through the process and into use - hopefully something good, but consistency is my main concern. Any method that would garner sufficient interest among computer science contributors to take a standard through to official status is good enough for me.
I was hoping that by encouraging others to edit the proposal itself we could avoid the bias of it being written by a single entity, but for some reason people seem unwilling to make even small changes, instead suggesting them in the discussion. We have a page history, don't we?
My remaining concern is a bit elitist, but I still feel compelled to voice it: when designing a pseudocode, people often favour features and even syntactical conventions of languages they know. I'm sure your average stock programmer would design a pseudocode disturbingly similar to C and Java. This does have the advantage of familiarity, which helps readability and writability, but may be worse in the long run, as popular languages change.
In any case, let's choose some method and stick with it. One possible method is to create a small number of proposals, individually improve each of them through the wiki process, and then merge them into a finished product; if we can get enough people behind the process to carry this out, I think it would be good.
Derrick Coetzee 17:58, 7 May 2004 (UTC)
The pseudocode would have to be an imperative language, right? Or, is a functional language a possibility? jaredwf 16:44, 8 May 2004 (UTC)
In my proposal I included a mechanism for closures, which may come in handy. There are three arguments I would make for an imperative pseudocode, in order of importance:
Derrick Coetzee 01:00, 12 May 2004 (UTC)
Don't start from scratch. Start from Python. It already looks like pseudocode, and supports both imperative and functional paradigms. I can't think of any reason the pseudocode presented here is superior to Python for any of the features Python supports.
Certainly, Python has the one big advantage that people can test the correctness of the pseudocode before they publish it. You can't do that with a fictional pseudocode, and therefore all such code would be perpetually untrustworthy. - Doradus 02:53, 9 Jun 2004 (UTC)
I said it before and I'll say again... While NPOV is important, I can't see the logic in inventing a brand new language that looks almost identical to Python but isn't actually executable Python code. It's like inventing a whole new mathematical language because if I write then I might alienate people who prefer or a÷b. Or rolling our own pronunciation guide system because choosing SAMPA would be biased against those who prefer the International Phonetic Alphabet.
If you open up old CS texts, you may find examples written in ALGOL or COBOL, languages that would certainly not be one's first choice today. You may also find others written in a myriad different and independently-invented pseudocode notations, with subtle differences and subtle ambiguities. Given the choice, I'd take the ALGOL or COBOL, because if you find that you don't understand a piece of made-up pseudocode, where do you turn for further clarification?
Now, we don't have to go with Python per se; it's just that Python is so similar to the pseudocode we are using that it presents itself as the obvious choce. If we can find a better existing language, let's start with that instead! And if we find flaws in Python that make it unsuitable for use as pseudocode, let's use a "purified" variant of Python and call that wikicode. But let's not gratuitously inflict yet another arbitrary pseudocode on the world.
"Wikipedia is not a Python library" -- perhaps, but it stands to become something even less useful: a library of un-executable, ambiguous, probably-correct-but-never-actually-tested code. -- Doradus 04:13, Sep 22, 2004 (UTC)
Wikicode does not fit the definition of pseudocode, which in computing is defined as 'using language similar to plain English to explain the purpose of each line of a program' (or something along those lines). In other words, it would be something like
Making up a pretend programming language does NOT fit the definition of pseudocode and does NOT perform any useful purpose on wikipedia. Python would meet with the same problems as I have outlined above. If an article requires pseudocode, it requires PSEUDOcode, not programming code! -- Cynical 15:51, 24 Sep 2004 (UTC)
Hello, all. :) I think this is a wonderful project, and the approach is well organized. I have a few thoughts:
At the top of this page, a stated goal is "Universality: code should be clear to programmers familiar with nearly any common language, and not too similar to any one." I am a little concerned with code that is only clear to "programmers." I think that some effort should be made to create a syntax that is slightly more intuitive to beginning programmers and to non-programmers who are simply interested in algorithms.
For example, instead of Wirth's := operator, how about an "arrow" showing the direction of assignment/data movement:
n <- 5
The use of the word "function" seems problematic to me, both because of the many different names that are used, (function, subroutine, procedure, handler, etc.), and because of possible confusion that may arise with mathematically-inclined, yet non-programming-inclined readers. The article at subroutine uses an interesting and nonpartisan word: "subprogram".
subprogram sort( list, comparisonPredicate ) (indented body of function)
I've had a few more thoughts... but I'll wait to see what the reaction to these two are. :) AdmN 06:02, 25 Aug 2004 (UTC)
n<-5
looks like a Boolean expression to me: "is n less than negative five?" Since the colon does not have the ambiguity that the left angle bracket does, := seems clearer. Heck, if we wanted to we could use the AppleScript notations set n to 5
or copy 5 to n
... :)x ← 2
repeat with i from 1 to 10
It occurred to me, thinking of AdmN's concerns, that it might be preferable to avoid putting so many operators in a single C-style function call notation, and instead to use sentences or phrases where appropriate, inserting variable names in reserved slots, in a manner more similar to Smalltalk and English text. Example:
Template: insert value at end of list Use example: insert x at end of primesList
I would also allow (but not mandate) the definition of functions that can be called in this manner:
function kill numBunnies : int bunnies using weapon (bunny-slaughtering code)
What are your thoughts on this? Derrick Coetzee 18:20, 25 Aug 2004 (UTC)
-- Notes: -- AppleScript 'lists' and strings are 1-based. -- '<' and '>' can be written as "less than" and "greater than" -- '<=' and '>=' can also be expressed with 1 character in MacRoman and Unicode, -- as well as by the longer statements "less than or equal to", etc. -- Possession can be written in two forms: -- item i of a -- a's item i -- on insertion_sort( a ) set n to length of a -- Initially, the first item is considered 'sorted' -- i divides a into a sorted region, x < i, and an unsorted one, x >= i -- repeat with i from 2 to n set v to item i of a -- Select the item at the beginning of the as yet unsorted section set j to i -- Work backwards through the array, finding where v should go repeat while item ( j - 1 ) of a > v -- If this element is greater than v, move it up one set item j of a to item ( j - 1 ) of a set j to j - 1 if ( j <= 1 ) then exit repeat end repeat set item j of a to v -- Stopped when a[ j - 1 ] <= v, so put at position j end repeat end insertion_sort
The main page should eventually have a "cheat sheet", a concise listing of the conventions for quick reference by those who just need an aid to memory. I could start something like this later tonight, if there are no objections. AdmN 20:05, 25 Aug 2004 (UTC)
for
syntax to for each i from 1 to 10.
Derrick Coetzee 20:55, 25 Aug 2004 (UTC)General Comments:
My preference would be something that is obvious to the unlearned reader. Therefore, I would prefer something that looks more like C/C++, Java, etc. rather than ML. Far more people are familiar with the first group than ML.
I would think that clarity is more important than consistency / ambiguity. In other words, even if two variations are somewhat ambiguous, as long as a "reasonable person" could deduce the intent, that is fine with me. The intent is to convey information, not produce syntactically correct code.
Specific comments:
I think that the use of parens for comments could get confused with function args. I would recommend double slashes for comments. Italics would be a good addition, but might be a problem if people forget the closing tags, so I'm not sure if italics would be worth the effort. Parens would be fine if the entire line was a comment.
I would rather see the data type before the formal arguments where necessary.
I also would prefer some kind of delimeter for blocks, although for simple cases, indents alone might be fine.
For example:
function add(int value, list) { // add value to end of list (indented body of function) }
Statements can also be simply an english comment, ignoring all syntax issues, such as:
(add 5 to end of numberlist)
Variables need not be declared, nor initialized. Comments can clarify usage
ent := maxent // set entry to highest possible value
I would prefer := or = for assignment rather than ← because of the extra typing, and because it will be less obvious to a "newbie" how to type that.
:=
was used originally but changed to enhance readability. I think most readers understand :=
though, and it's easier to edit.
Derrick Coetzee 17:35, 29 Aug 2004 (UTC)The for loop should not be so similar to the for each
for i = 0 to 10 { // loop through table (do things with table) } for each x in list { // iterative through list (do something with list entry) }
I would prefer simple characters for operators. Again, it is less typing, and a "newbie" would know how to type it.
I also don't think we want a large collection of "standard" operators. If a given topic will be using any specialized operators, provide a list at the top of the article explaining their uses. (I would recommend this even if we do settle on a standard set of operators. The reader may not know what they are.)
Rather than trying to define a global set of functions (such as the list and set functions), I would suggest that an article could provide a list of functions applicable to the subject matter.
For example, on the Linked List page, we might have:
function isGreater(node1, node2) // true if value of node 1 > node 2
Function definitions would only be necessary if they were germain to the topic.
=
and another using :=
would be a source of confusion.=
and you use :=
, and if you have one set of assumed routines and I use a different set of assumed routines, as long as we both list our assumptions, I doubt if anyone will get confused.
I have reasons behind most of the above suggestions (not just because I think so), and would be glad to expand if any of them if asked. (Its getting late, and I have to get up early tomorrow morning.)
I think the thing we should remember is that the reader may not be familiar with whatever standard we set forth, and we should not require them to look up the standard to understand the examples.
Okay, just to avoid getting too bogged down in arguments over details, let me say:
I hope this helps discussion and helps move the proposal towards something useful for everyone. RSVP, and be bold. Derrick Coetzee 01:25, 5 Sep 2004 (UTC)
Gee, I had expected more changes, especially when you said I might not agree with some. ;^)
I think we are basically done with this standard. A couple of minor comments:
I started out using the K&R formatting for curly braces: opening brace on the same line as the function/if/whatever, and the closing brace on its own line. I switched originally because emacs automatically formatted it that way, but then got to prefer it because it is much easier to match open and closing braces. Using K&R, it was too easy not to notice that I forgot to include the opening brace. So, I would prefer having the open & close braces line up.
My thoughts on "add x to the end of the list" was that we should allow a descriptive statement, but wouldn't need to back it up with details. In particular, if you need to do something, but the details aren't germane to the discussion, I would like to see a vague description of what has to happen. True, that could be done with a comment statement, but that starts to confuse explanation with process. I would like to see pseudocode explain what happens, and comments to explain why it happens.
I think braces should be permitted in do-while loops to provide a consistent look-and-feel. Blocks should be consistently formed by delimeters or keywords, IMHO. (And I wouldn't object to keywords instead of delimeters.)
Other than that, I have no problems with your changes. And, although I think my above ideas are valid, I could live with what is there now.
OK, what's next? ;^)
wrp103 (Bill Pringle) - Talk 14:39, 21 Sep 2004 (UTC)
There are many applications for which one wants to loop over two or more collections in parallel. Consider, for instance, setting up a mapping for a simple substitution cipher, given plaintext and ciphertext alphabets. The "C-ish" way is to use an artificial index variable into both alphabets:
cipher := emptyMap for index from 0 to size(plainAlphabet) - 1 { cipher[plainAlphabet[index]] := cryptAlphabet[index]] }
This introduces a great place for an off-by-one error, since size(list) is here the number of elements, not the index of the last element. It is also slow when the collection is a list, not an array; and it can have an error if cryptAlphabet is shorter than plainAlphabet.
An alternative is to have a syntax to step through several collections in parallel, like this:
cipher := emptyMap for each plainLetter in plainAlphabet and each cryptLetter in cryptAlphabet { cipher[plainLetter] := cryptLetter }
This can be distinguished from a nested loop by the use of the keyword and each, which indicates that the loop is stepping over several collections in parallel.
If one or more of the collections is unordered (like a set) then the pairing-up of elements is unspecified. This could be used when you need to pair each of a set with some element of a different set, but do not care which one.
The loop terminates when the shortest collection is exhausted.
Here's an alternate syntax, avoiding the and each keyword:
cipher := emptyMap for each plainLetter, cryptLetter in plainAlphabet, cryptAlphabet { cipher[plainLetter] := cryptLetter }
Thoughts? -- FOo 20:47, 23 Sep 2004 (UTC)
record cryptPair { plainLetter, cryptLetter } construct cryptList, a cryptPair list cipher := emptyMap for each pair in cryptList cipher[pair.plainLetter] := pair.cryptLetter
for corresponding plainLetter, cryptLetter in plainAlphabet, cryptAlphabet cipher[plainLetter] := cryptLetter
for (plainLetter, cryptLetter) in pairUp(plainAlphabet, cryptAlphabet) cipher[plainLetter] := cryptLetter
Hello. I see that the Python code in Pascal's triangle has been replaced by Wikicode. Frankly, it's not an improvement: it's not any clearer, and it can't be executed at all, by anyone. I agree that not everyone is familiar with Python; only about 100,000 people, is my guess. This is some orders of magnitude greater than the number of people familiar with Wikicode.
I see also that there's a list of "Pages needing conversion". Needing? Isn't that a bit strong, folks? Will converting into an officially uglified pseudocode improve any of those articles?
In closing: (1) Language design is hard, and there is nothing to gain by reinventing the wheel here, and a lot to lose. (2) If it ain't broke don't fix it.
Regards, Wile E. Heresiarch 06:08, 24 Sep 2004 (UTC)
One more thing- was there ever actually a vote to make Wikicode official WP policy? If not it isn't legitimate and should be on VFD right now Cynical 20:42, 13 Oct 2004 (UTC)
"Wikicode" is a bad idea. Pseudocode can be, and should be, a mish-mash of natural language and any or all programming languages. That's the whole idea. Alternatively, you can write in a real programming language with well-defined syntax and semantics. (Good code examples can be as clear as pseudocode, if written in the right language), Wikicode seems to have all the problems of both, without the advantages of either.
Good languages for programming examples:
Imperative:
Functional languages:
-- The Anome 19:33, 27 Sep 2004 (UTC)
No standard
No use of languages which are invented specifically for Wikipedia
Wikicode (as proposed) as standard instead of source code
Wikicode (as proposed) as standard for pseudocode examples only, source code examples left alone
Python as standard
Java as standard
C/C++ as standard
Something else
The proposal has been updated to address some of the issues the discussion here has raised. If there are still objections, I invite them. My purpose for Wikicode is only to be a helpful way of improving consistency across articles where possible without interfering with clarity, and not to restrict the editor in any way. I hope you will excuse my bold edits, and I invite your reverts and criticism.
To be more specific, my changes are:
Derrick Coetzee 18:14, 1 Oct 2004 (UTC)
We specify an indent style in the wikicode? That's awful. We should let writers use their own indent style. Dysprosia 12:24, 30 Dec 2004 (UTC)
It seems, at least based on the above responses, that the entire idea isn't receiving any support. But it is not wikicode itself I believe in here. My logic at its core is simply that gratuitous inconsistency is bad; this seems manifest to me. I appreciate the vast diversity of our community, but I think limiting superficial expression only improves uniformity, improves understanding, and enhances editors' chances for more profound expression.
So I guess my question is, why isn't anyone happy with wikicode, and how can I help move it towards something consistent that the community will want to use? I'm well aware that a standard without followers is not. The overwhelming agreement seems to be that a standard is unnecessary. If so, why? I really want to understand. Please help. Deco 10:51, 31 Dec 2004 (UTC)
a = b + c a = b + c; a := b + c;
I agree with all who have said that inventing a Wikipedia-specific language is a bad idea. I also believe that it is senseless for articles to contain 10 different implementations of everyone's favorite language and that programming articles should have one consistent programming language. Imagine an introduction to CS textbook that implemented every other example in a new language... not good for learning.
If you look on SourceForge by programming language, the languages there have these project counts:
Which is interesting, but I don't think any of those languages are suitable example languages for many reasons, not the least of which will be endless battles between advocates of different languages, as well as unfairly making editing harder for people not as skilled in that language.
My proposal is simple and is based on these assumptions and premises:
The proposal:
Both parts of the proposal are critical.
Daniel Quinlan 10:05, Apr 11, 2005 (UTC)
More on the proposal:
This paper contains an interesting survey of computer programming teachers and their preferred language choice. Daniel Quinlan 10:23, Apr 11, 2005 (UTC)
Template:Wikicode has been nominated for deletion. You are invited to comment on the discussion at Wikipedia:Templates for deletion#Wikicode. Thank you.
This pseudocode is no longer a proposed standard. See User:Dcoetzee/Wikicode.
This issue has generated much discussion, and here seems to be the best place to continue that discussion, but for background reading, you should also see Talk:Pseudocode, Talk:Algorithms on Wikipedia, and User talk:Dcoetzee/Wikicode.
(I've added the above for newcomers, like me, not existing protagonists. Mark Hurd 10:16, 10 Oct 2004 (UTC))
Discussion originally placed on project page follows
: int
demonstrate, they're just not required. You may be looking for them before and not after (see second point).
Derrick Coetzee
03:12, 6 May 2004 (UTC)
This is all just some random thoughts, so feel free to disregard what I say as you see fit ;) Dysprosia 23:09, 5 May 2004 (UTC)
Thanks a lot for your careful examination and comments, will make some related changes soon. Derrick Coetzee 23:31, 5 May 2004 (UTC)
I don't think brackets on a function call should be optional. Why have two ways to do the same thing?
CGS
23:44, 5 May 2004 (UTC).
I can see how something like this would be useful to illustrate simple algorithms. But even there, you can't necessarily compare the efficiency of two algorithms without knowing a little bit more about how data structures are meant to be implemented. For instance, "lists" in real programming languages are sometimes linked lists (as in Lisp) and sometimes vectors (as in Python). With linked lists, inserting an element in the middle is cheap, but taking the list's length is expensive; with vectors, it's just the other way around.
map
type is. My philosophy here is that the operations I describe are just an interface to an abstract data structure, and that in translating to an executable code (or performing analysis) you would choose an appropriate real data structure based on the set of operations being used. On the other hand this might be potentially confusing to the reader; I'm not sure which way to go on this. I certainly don't want a proliferation of collections for every concrete collection data structure. What do you think?
Derrick Coetzee
15:43, 6 May 2004 (UTC)I also don't see anything here about pass-by-value vs. pass-by-reference. This matters, too: can functions alter their parameters in the enclosing scope of the call?
At least one more operation on maps is needed: keys(map)
, to extract a set or list of the keys. And how is the map unassigned
value distinct from the record null
value? Is one meant to smell like a Lisp NIL and the other like an SQL NULL? :)
keys
operation myself, fixed that. As for unassigned
versus null
, the reason null
is not used for both is so that a key can map to null
without being unassigned.
Derrick Coetzee
15:22, 6 May 2004 (UTC)null
in a map, can I store a map unassigned
in a record?" Thus, each different subscriptable type ends up with its own way of saying "nobody home"?defined
. In Python, you can use the in
operator on any collection type -- for maps ("dictionaries") it asks if a key is defined in the map. --
FOo
20:52, 6 May 2004 (UTC)The keywords tagged union
and cases
are related to one another, but don't look related from the names. Perhaps type union
and type case
?
--
FOo
14:18, 6 May 2004 (UTC)
Thanks a lot for the feedback, will address your other points a bit later. Derrick Coetzee 15:22, 6 May 2004 (UTC)
For block delimiting, I would prefer to use parentheses (like in
ML) to indicate a list of statements. Shown below are examples without delimiters, with begin and end tags, and with parentheses.
sum := 0 while n > 0 sum := sum + n n := n - 1
sum := 0 while n > 0 begin sum := sum + n n := n - 1 end
sum := 0 while n > 0 ( sum := sum + n n := n - 1 )
Out of these examples, I like the parentheses best.... jaredwf 16:18, 8 May 2004 (UTC)
But with the indentation the parentheses are redundant. CGS 09:29, 9 May 2004 (UTC).
May I remind you still another way that saves one line of code
sum := 0 while n > 0 sum := sum + n n := n - 1 endwhile
or a variation:
sum := 0 while n > 0 sum := sum + n n := n - 1 elihw
Moreover, a disadvantage of parentheses is that they visually "blur" the boundaries of a block, you know, kind of gray-scale halftones between black and white. Mikkalai 21:24, 13 May 2004 (UTC)
These examples look like Fortran 90/95/2003, where one would write
do while (n > 0) sum = sum + n n = n - 1 end do
Fortran could be used as a model.
Whatever approach, IMO block termination keyword is a must. Suppose you had
while n > 0 sum := sum + n read_next(n) report(n)
By a single keystroke, I convert it into
while n > 0 sum := sum + n read_next(n) report(n)
It requires a man who knows an algorithm in depth to catch the bug. Mikkalai 23:31, 13 May 2004 (UTC)
Just to add my opinion to the blocks debate: I think the maintenance concern is, frankly, moot, because we will not be writing large amounts of code using this. Keep that in mind. We will certainly be changing the code, but it should generally be expected to fit comfortably on a screen, if not less. I generally favour indention alone for this reason.
As for Mikkalai's example, it's true that a relatively small edit could effect a big difference in behaviour, but the resulting problem is highly visible, and so should be quickly fixed. I've seen sneaky vandalism on Wikipedia that was much more subtle than this, and still got quickly fixed.
As my final point in favour of indention alone, I note that, when block delimeters are used, if a line of code were improperly indented, we would nevertheless consider this a potential readability problem and so something that we would have to fix anyway. The added flexibility is not only unnecessary but undesirable.
However, I also realise that many existing programmers are used to some sort of block delimeter, and so this helps them read code. Whether this outweighs the brevity and reduced clutter gained by omitting them, I am unsure. In any case, we would certainly use some form of punctuation, and not cluttersome begin/end keywords.
Derrick Coetzee 02:30, 14 May 2004 (UTC)
I like the endwhile example by
Mikkalai since it saves some space, but still "blocks" off the block.... If we use block delimiters, "improperly indented" code may have a readability problem, but the code will still be correct and it will be obvious that there is improper indention. Like the example from above, if we use endwhile we have
while n > 0 sum := sum + n read_next(n) endwhile report(n)
and if we don't we get
while n > 0 sum := sum + n read_next(n) report(n)
It is extremely obvious from the first example that the indention is wrong, but for the second it is not as clear. The block delimiter only adds one more line and I think it aids clarity.
-- jaredwf 04:35, 14 May 2004 (UTC)
Instead of endwhile why not use Donald Knuth's proposal? Like so:
while n > 0 sum := sum + n read_next(n) repeat report(n)
The keyword repeat is nice because when you read to it, you realize you are reading the last statement of a looping block. This is better than "end" which could be confused with an if (also, it makes it sound like the loop will end, as if it were a break). I too agree with the others here who think that Python's experiment in formatting is a failed one at best, and that some delimiter (even if it's small like fi and repeat) is better than nothing. (Mind you, Python is an enjoyable programming language to program in, and it's very useful. I just think they made a bad decision with this one issue.) -- MShonle 19:15, 21 Dec 2004 (UTC)
This proposal is worthwhile and I think some such is overdue. Lots of work, but .... I see a potential problem however. In the interest of doing a really good job, there will surely be here the dreaded committee effect. Accretion of feature after beloved feature until the result is the camel designed by the horse design committee. This being WP, where all is hostage to all, I can't think of a solution.
Wirthian feature parsimony is fine in many respects, but not ideally suited for instant transparency. C character parsimony is certainly not. COBOLic verbosity is self-defeating. And so on. There may not be an ideal examplar amongst real languages.
The proposal as it exists now is admirable. Best possible, certainly not -- there can be no way to identify it even if we had one. Perhaps a reasonable way would be to have an open period, followed by a vote, followed by a discussion changes until another vote. And repeat.
Sort of like Linus does with the kernel. But, who will bell the cat by taking the Linus role? The vote procedure at least has the virtue of precedent here on WP.
Comment? ww 22:54, 6 May 2004 (UTC)
You know what? Perhaps each of us should come up with a solution/wikicode proposal, and let the community vote on one proposal, and then chip away/discuss/improve that? Dysprosia 23:02, 6 May 2004 (UTC)
I agree that the way I've set it up right now is bound to introduce bias, not the least of which being the tendency to not change reasonable-looking things I wrote into the original proposal. I think the most important thing however is getting something through the process and into use - hopefully something good, but consistency is my main concern. Any method that would garner sufficient interest among computer science contributors to take a standard through to official status is good enough for me.
I was hoping that by encouraging others to edit the proposal itself we could avoid the bias of it being written by a single entity, but for some reason people seem unwilling to make even small changes, instead suggesting them in the discussion. We have a page history, don't we?
My remaining concern is a bit elitist, but I still feel compelled to voice it: when designing a pseudocode, people often favour features and even syntactical conventions of languages they know. I'm sure your average stock programmer would design a pseudocode disturbingly similar to C and Java. This does have the advantage of familiarity, which helps readability and writability, but may be worse in the long run, as popular languages change.
In any case, let's choose some method and stick with it. One possible method is to create a small number of proposals, individually improve each of them through the wiki process, and then merge them into a finished product; if we can get enough people behind the process to carry this out, I think it would be good.
Derrick Coetzee 17:58, 7 May 2004 (UTC)
The pseudocode would have to be an imperative language, right? Or, is a functional language a possibility? jaredwf 16:44, 8 May 2004 (UTC)
In my proposal I included a mechanism for closures, which may come in handy. There are three arguments I would make for an imperative pseudocode, in order of importance:
Derrick Coetzee 01:00, 12 May 2004 (UTC)
Don't start from scratch. Start from Python. It already looks like pseudocode, and supports both imperative and functional paradigms. I can't think of any reason the pseudocode presented here is superior to Python for any of the features Python supports.
Certainly, Python has the one big advantage that people can test the correctness of the pseudocode before they publish it. You can't do that with a fictional pseudocode, and therefore all such code would be perpetually untrustworthy. - Doradus 02:53, 9 Jun 2004 (UTC)
I said it before and I'll say again... While NPOV is important, I can't see the logic in inventing a brand new language that looks almost identical to Python but isn't actually executable Python code. It's like inventing a whole new mathematical language because if I write then I might alienate people who prefer or a÷b. Or rolling our own pronunciation guide system because choosing SAMPA would be biased against those who prefer the International Phonetic Alphabet.
If you open up old CS texts, you may find examples written in ALGOL or COBOL, languages that would certainly not be one's first choice today. You may also find others written in a myriad different and independently-invented pseudocode notations, with subtle differences and subtle ambiguities. Given the choice, I'd take the ALGOL or COBOL, because if you find that you don't understand a piece of made-up pseudocode, where do you turn for further clarification?
Now, we don't have to go with Python per se; it's just that Python is so similar to the pseudocode we are using that it presents itself as the obvious choce. If we can find a better existing language, let's start with that instead! And if we find flaws in Python that make it unsuitable for use as pseudocode, let's use a "purified" variant of Python and call that wikicode. But let's not gratuitously inflict yet another arbitrary pseudocode on the world.
"Wikipedia is not a Python library" -- perhaps, but it stands to become something even less useful: a library of un-executable, ambiguous, probably-correct-but-never-actually-tested code. -- Doradus 04:13, Sep 22, 2004 (UTC)
Wikicode does not fit the definition of pseudocode, which in computing is defined as 'using language similar to plain English to explain the purpose of each line of a program' (or something along those lines). In other words, it would be something like
Making up a pretend programming language does NOT fit the definition of pseudocode and does NOT perform any useful purpose on wikipedia. Python would meet with the same problems as I have outlined above. If an article requires pseudocode, it requires PSEUDOcode, not programming code! -- Cynical 15:51, 24 Sep 2004 (UTC)
Hello, all. :) I think this is a wonderful project, and the approach is well organized. I have a few thoughts:
At the top of this page, a stated goal is "Universality: code should be clear to programmers familiar with nearly any common language, and not too similar to any one." I am a little concerned with code that is only clear to "programmers." I think that some effort should be made to create a syntax that is slightly more intuitive to beginning programmers and to non-programmers who are simply interested in algorithms.
For example, instead of Wirth's := operator, how about an "arrow" showing the direction of assignment/data movement:
n <- 5
The use of the word "function" seems problematic to me, both because of the many different names that are used, (function, subroutine, procedure, handler, etc.), and because of possible confusion that may arise with mathematically-inclined, yet non-programming-inclined readers. The article at subroutine uses an interesting and nonpartisan word: "subprogram".
subprogram sort( list, comparisonPredicate ) (indented body of function)
I've had a few more thoughts... but I'll wait to see what the reaction to these two are. :) AdmN 06:02, 25 Aug 2004 (UTC)
n<-5
looks like a Boolean expression to me: "is n less than negative five?" Since the colon does not have the ambiguity that the left angle bracket does, := seems clearer. Heck, if we wanted to we could use the AppleScript notations set n to 5
or copy 5 to n
... :)x ← 2
repeat with i from 1 to 10
It occurred to me, thinking of AdmN's concerns, that it might be preferable to avoid putting so many operators in a single C-style function call notation, and instead to use sentences or phrases where appropriate, inserting variable names in reserved slots, in a manner more similar to Smalltalk and English text. Example:
Template: insert value at end of list Use example: insert x at end of primesList
I would also allow (but not mandate) the definition of functions that can be called in this manner:
function kill numBunnies : int bunnies using weapon (bunny-slaughtering code)
What are your thoughts on this? Derrick Coetzee 18:20, 25 Aug 2004 (UTC)
-- Notes: -- AppleScript 'lists' and strings are 1-based. -- '<' and '>' can be written as "less than" and "greater than" -- '<=' and '>=' can also be expressed with 1 character in MacRoman and Unicode, -- as well as by the longer statements "less than or equal to", etc. -- Possession can be written in two forms: -- item i of a -- a's item i -- on insertion_sort( a ) set n to length of a -- Initially, the first item is considered 'sorted' -- i divides a into a sorted region, x < i, and an unsorted one, x >= i -- repeat with i from 2 to n set v to item i of a -- Select the item at the beginning of the as yet unsorted section set j to i -- Work backwards through the array, finding where v should go repeat while item ( j - 1 ) of a > v -- If this element is greater than v, move it up one set item j of a to item ( j - 1 ) of a set j to j - 1 if ( j <= 1 ) then exit repeat end repeat set item j of a to v -- Stopped when a[ j - 1 ] <= v, so put at position j end repeat end insertion_sort
The main page should eventually have a "cheat sheet", a concise listing of the conventions for quick reference by those who just need an aid to memory. I could start something like this later tonight, if there are no objections. AdmN 20:05, 25 Aug 2004 (UTC)
for
syntax to for each i from 1 to 10.
Derrick Coetzee 20:55, 25 Aug 2004 (UTC)General Comments:
My preference would be something that is obvious to the unlearned reader. Therefore, I would prefer something that looks more like C/C++, Java, etc. rather than ML. Far more people are familiar with the first group than ML.
I would think that clarity is more important than consistency / ambiguity. In other words, even if two variations are somewhat ambiguous, as long as a "reasonable person" could deduce the intent, that is fine with me. The intent is to convey information, not produce syntactically correct code.
Specific comments:
I think that the use of parens for comments could get confused with function args. I would recommend double slashes for comments. Italics would be a good addition, but might be a problem if people forget the closing tags, so I'm not sure if italics would be worth the effort. Parens would be fine if the entire line was a comment.
I would rather see the data type before the formal arguments where necessary.
I also would prefer some kind of delimeter for blocks, although for simple cases, indents alone might be fine.
For example:
function add(int value, list) { // add value to end of list (indented body of function) }
Statements can also be simply an english comment, ignoring all syntax issues, such as:
(add 5 to end of numberlist)
Variables need not be declared, nor initialized. Comments can clarify usage
ent := maxent // set entry to highest possible value
I would prefer := or = for assignment rather than ← because of the extra typing, and because it will be less obvious to a "newbie" how to type that.
:=
was used originally but changed to enhance readability. I think most readers understand :=
though, and it's easier to edit.
Derrick Coetzee 17:35, 29 Aug 2004 (UTC)The for loop should not be so similar to the for each
for i = 0 to 10 { // loop through table (do things with table) } for each x in list { // iterative through list (do something with list entry) }
I would prefer simple characters for operators. Again, it is less typing, and a "newbie" would know how to type it.
I also don't think we want a large collection of "standard" operators. If a given topic will be using any specialized operators, provide a list at the top of the article explaining their uses. (I would recommend this even if we do settle on a standard set of operators. The reader may not know what they are.)
Rather than trying to define a global set of functions (such as the list and set functions), I would suggest that an article could provide a list of functions applicable to the subject matter.
For example, on the Linked List page, we might have:
function isGreater(node1, node2) // true if value of node 1 > node 2
Function definitions would only be necessary if they were germain to the topic.
=
and another using :=
would be a source of confusion.=
and you use :=
, and if you have one set of assumed routines and I use a different set of assumed routines, as long as we both list our assumptions, I doubt if anyone will get confused.
I have reasons behind most of the above suggestions (not just because I think so), and would be glad to expand if any of them if asked. (Its getting late, and I have to get up early tomorrow morning.)
I think the thing we should remember is that the reader may not be familiar with whatever standard we set forth, and we should not require them to look up the standard to understand the examples.
Okay, just to avoid getting too bogged down in arguments over details, let me say:
I hope this helps discussion and helps move the proposal towards something useful for everyone. RSVP, and be bold. Derrick Coetzee 01:25, 5 Sep 2004 (UTC)
Gee, I had expected more changes, especially when you said I might not agree with some. ;^)
I think we are basically done with this standard. A couple of minor comments:
I started out using the K&R formatting for curly braces: opening brace on the same line as the function/if/whatever, and the closing brace on its own line. I switched originally because emacs automatically formatted it that way, but then got to prefer it because it is much easier to match open and closing braces. Using K&R, it was too easy not to notice that I forgot to include the opening brace. So, I would prefer having the open & close braces line up.
My thoughts on "add x to the end of the list" was that we should allow a descriptive statement, but wouldn't need to back it up with details. In particular, if you need to do something, but the details aren't germane to the discussion, I would like to see a vague description of what has to happen. True, that could be done with a comment statement, but that starts to confuse explanation with process. I would like to see pseudocode explain what happens, and comments to explain why it happens.
I think braces should be permitted in do-while loops to provide a consistent look-and-feel. Blocks should be consistently formed by delimeters or keywords, IMHO. (And I wouldn't object to keywords instead of delimeters.)
Other than that, I have no problems with your changes. And, although I think my above ideas are valid, I could live with what is there now.
OK, what's next? ;^)
wrp103 (Bill Pringle) - Talk 14:39, 21 Sep 2004 (UTC)
There are many applications for which one wants to loop over two or more collections in parallel. Consider, for instance, setting up a mapping for a simple substitution cipher, given plaintext and ciphertext alphabets. The "C-ish" way is to use an artificial index variable into both alphabets:
cipher := emptyMap for index from 0 to size(plainAlphabet) - 1 { cipher[plainAlphabet[index]] := cryptAlphabet[index]] }
This introduces a great place for an off-by-one error, since size(list) is here the number of elements, not the index of the last element. It is also slow when the collection is a list, not an array; and it can have an error if cryptAlphabet is shorter than plainAlphabet.
An alternative is to have a syntax to step through several collections in parallel, like this:
cipher := emptyMap for each plainLetter in plainAlphabet and each cryptLetter in cryptAlphabet { cipher[plainLetter] := cryptLetter }
This can be distinguished from a nested loop by the use of the keyword and each, which indicates that the loop is stepping over several collections in parallel.
If one or more of the collections is unordered (like a set) then the pairing-up of elements is unspecified. This could be used when you need to pair each of a set with some element of a different set, but do not care which one.
The loop terminates when the shortest collection is exhausted.
Here's an alternate syntax, avoiding the and each keyword:
cipher := emptyMap for each plainLetter, cryptLetter in plainAlphabet, cryptAlphabet { cipher[plainLetter] := cryptLetter }
Thoughts? -- FOo 20:47, 23 Sep 2004 (UTC)
record cryptPair { plainLetter, cryptLetter } construct cryptList, a cryptPair list cipher := emptyMap for each pair in cryptList cipher[pair.plainLetter] := pair.cryptLetter
for corresponding plainLetter, cryptLetter in plainAlphabet, cryptAlphabet cipher[plainLetter] := cryptLetter
for (plainLetter, cryptLetter) in pairUp(plainAlphabet, cryptAlphabet) cipher[plainLetter] := cryptLetter
Hello. I see that the Python code in Pascal's triangle has been replaced by Wikicode. Frankly, it's not an improvement: it's not any clearer, and it can't be executed at all, by anyone. I agree that not everyone is familiar with Python; only about 100,000 people, is my guess. This is some orders of magnitude greater than the number of people familiar with Wikicode.
I see also that there's a list of "Pages needing conversion". Needing? Isn't that a bit strong, folks? Will converting into an officially uglified pseudocode improve any of those articles?
In closing: (1) Language design is hard, and there is nothing to gain by reinventing the wheel here, and a lot to lose. (2) If it ain't broke don't fix it.
Regards, Wile E. Heresiarch 06:08, 24 Sep 2004 (UTC)
One more thing- was there ever actually a vote to make Wikicode official WP policy? If not it isn't legitimate and should be on VFD right now Cynical 20:42, 13 Oct 2004 (UTC)
"Wikicode" is a bad idea. Pseudocode can be, and should be, a mish-mash of natural language and any or all programming languages. That's the whole idea. Alternatively, you can write in a real programming language with well-defined syntax and semantics. (Good code examples can be as clear as pseudocode, if written in the right language), Wikicode seems to have all the problems of both, without the advantages of either.
Good languages for programming examples:
Imperative:
Functional languages:
-- The Anome 19:33, 27 Sep 2004 (UTC)
No standard
No use of languages which are invented specifically for Wikipedia
Wikicode (as proposed) as standard instead of source code
Wikicode (as proposed) as standard for pseudocode examples only, source code examples left alone
Python as standard
Java as standard
C/C++ as standard
Something else
The proposal has been updated to address some of the issues the discussion here has raised. If there are still objections, I invite them. My purpose for Wikicode is only to be a helpful way of improving consistency across articles where possible without interfering with clarity, and not to restrict the editor in any way. I hope you will excuse my bold edits, and I invite your reverts and criticism.
To be more specific, my changes are:
Derrick Coetzee 18:14, 1 Oct 2004 (UTC)
We specify an indent style in the wikicode? That's awful. We should let writers use their own indent style. Dysprosia 12:24, 30 Dec 2004 (UTC)
It seems, at least based on the above responses, that the entire idea isn't receiving any support. But it is not wikicode itself I believe in here. My logic at its core is simply that gratuitous inconsistency is bad; this seems manifest to me. I appreciate the vast diversity of our community, but I think limiting superficial expression only improves uniformity, improves understanding, and enhances editors' chances for more profound expression.
So I guess my question is, why isn't anyone happy with wikicode, and how can I help move it towards something consistent that the community will want to use? I'm well aware that a standard without followers is not. The overwhelming agreement seems to be that a standard is unnecessary. If so, why? I really want to understand. Please help. Deco 10:51, 31 Dec 2004 (UTC)
a = b + c a = b + c; a := b + c;
I agree with all who have said that inventing a Wikipedia-specific language is a bad idea. I also believe that it is senseless for articles to contain 10 different implementations of everyone's favorite language and that programming articles should have one consistent programming language. Imagine an introduction to CS textbook that implemented every other example in a new language... not good for learning.
If you look on SourceForge by programming language, the languages there have these project counts:
Which is interesting, but I don't think any of those languages are suitable example languages for many reasons, not the least of which will be endless battles between advocates of different languages, as well as unfairly making editing harder for people not as skilled in that language.
My proposal is simple and is based on these assumptions and premises:
The proposal:
Both parts of the proposal are critical.
Daniel Quinlan 10:05, Apr 11, 2005 (UTC)
More on the proposal:
This paper contains an interesting survey of computer programming teachers and their preferred language choice. Daniel Quinlan 10:23, Apr 11, 2005 (UTC)
Template:Wikicode has been nominated for deletion. You are invited to comment on the discussion at Wikipedia:Templates for deletion#Wikicode. Thank you.