This is the talk page for discussing improvements to the Word (computer architecture) article. This is not a forum for general discussion of the article's subject.
This article is rated C-class on Wikipedia's content assessment scale. It is of interest to several WikiProjects.
This article has many possible titles, including word, computer word, memory word, data word, instruction word, word size, word length, etc. I believe the general form should be used since none of the specific forms is well established. Of course "word" has to be qualified to disambiguate from non-computing uses. I'm not particularly happy with "(computer science)", though. Perhaps "Word (computing)" would be the best choice overall. - R. S. Shaw 05:00, 6 June 2006 (UTC)
I can't see why not to do this. It would be silly to have a different article for every word length. Merge them all into the same article. I don't mind under what category you qualify word, seems like a bit of an arbitrary choice anyway. -- FearedInLasVegas 22:39, 28 July 2006 (UTC)
OK, I did the merge, but it's fairly unintegrated. Anyone feel free to smooth it out... Lisamh 15:46, 21 September 2006 (UTC)
It's my understanding that the concept of "word" size is a nebulous one, not well defined on some systems. Should the article make this clearer? The way the article makes it sound, computer designers pick a word size and then base other choices around it. While there are good reasons to use the same number of bits for registers as for bus widths and so on, I just don't think this is the reality any more. The x86/x64 architectures, for instance, are a mess of size choices. So what I'm proposing is a reversal of emphasis by saying: 'computers have certain bus widths and register sizes, etc., and the size of a "word" is the number of bits most common to them; this is not always a straightforward determination'. -- Apantomimehorse 16:04, 30 August 2006 (UTC)
"Word" seems quite ambiguous in modern usage, since people very often mean it as 16-bits rather than the actual word size of the processor (likely now to be 32 or 64 bits). There's Intel and Microsoft who, for reasons of backwards compatability, use "word" to mean a fixed size of 16-bits, just as "byte" is used to mean a fixed size of 8-bits. Although in formal contexts, "word" still means the basic native size of integers and pointers in the processor, a huge number of people believe that "word" just means "16-bits". Would it be useful to mention the phrase "machine word" as being a less ambiguous way to refer to that native size, and try to avoid the danger of the fixed 16-bit meaning? JohnBSmall 14:10, 8 October 2006 (UTC)
I have to strongly disagree. To anyone with a mainframe background of any kind this is simply not true. A word might be 32, 36, or 60 bits. 16 bits is a halfword, to call it a word is a recentism. A word on 8086 systems was 16 bits, and Intel didn't want to bother to update their terminology when the 386 was introduced. Peter Flass ( talk) 22:17, 19 November 2014 (UTC)
There were a whole bunch of minicomputers before the VAX. I used Motorola, HP, and DEC. There were a few others in the building for "real time" computing and some pre-PC desktops.
I remember the CDC 6600 as having a 64-bit word, and there were 128 characters available. The 128 characters do not match a 6-bit character code, but rather a 7-bit one. You could run APL on it, and it used a whole bunch of special characters.
I only read that part of the manual once, and there were a few "hall talk" discussions about the size. So, there may be a wet memory error there.
Ralph —The preceding unsigned comment was added by 67.173.69.180 ( talk) 15:33, 6 January 2007 (UTC).
Is "tword" worth mentioning? It is used, at least, in nasm as a ten-byte field. I found a few other references to it (for example, GoAsm). I was looking because the nasm documentation didn't really say how big it is. -- Ishi Gustaedr 17:56, 20 July 2007 (UTC)
the Model II reduced this to 6 cycles, but reduced the fetch time to 4 cycles if one address field, or 1 cycle if both address fields, were not needed by the instruction
Suggestions:
1. The article needs to differentiate more clearly between the general meaning of "word" and Intel's definition. Generally, "word" should mean the maximum amount of data that can be stored in a register, transferred through the memory bus in one cycle, etc. (their width). Intel (or x86 programmers) have coined a new word definition, which is 16 bits. This is only one of the possible word sizes, just as "byte" can mean something other than 8 bits. The IA-32 (since the 80386) natural word size is 32 bits, because that amount of data is the maximum register capacity and is the size with which the system operates. A 16-bit word is only part of the register. As well, the 16-bit word can be divided into high and low parts, and all these together make one 32-bit register. The same goes for x86-64, but not the later IA-64. Furthermore, the C language standards ANSI C and C99 define the size of int as equal to the word size of the particular system; on IA-32 it is 32 bits.
2. The article says "and is still said to be 16 bits, despite the fact that they may in actuality (and especially when the default operand size is 32-bit) operate more like a machine with a 32 bit word size. Similarly in the newer x86-64 architecture, a "word" is still 16 bits, although 64-bit ("quadruple word") operands may be more common." That seems like too much uncertainty ("may", "more like"). —Preceding unsigned comment added by 193.219.93.218 (talk) 13:49, 11 June 2008 (UTC)
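The register-splitting point in suggestion 1 can be sketched in C (a hypothetical model, assuming the little-endian layout of x86): the 32-bit EAX contains the 16-bit AX, which in turn splits into the AH and AL bytes.

```c
#include <stdio.h>
#include <stdint.h>

/* A rough model of an x86 general register, assuming little-endian
   layout: EAX is 32 bits, its low half is AX, and AX splits into the
   low byte AL and the high byte AH. */
union x86_reg {
    uint32_t eax;
    uint16_t ax;                      /* low 16 bits of EAX */
    struct { uint8_t al, ah; } bytes; /* AL and AH within AX */
};

int main(void) {
    union x86_reg r;
    r.eax = 0x12345678u;
    printf("EAX=%08X AX=%04X AH=%02X AL=%02X\n",
           (unsigned)r.eax, (unsigned)r.ax, r.bytes.ah, r.bytes.al);
    /* prints: EAX=12345678 AX=5678 AH=56 AL=78 */
    return 0;
}
```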
The C90 and C99 standards say that a ‘‘plain’’ int object "has the natural size suggested by the architecture of the execution environment (large enough to contain any value in the range INT_MIN to INT_MAX as defined in the header <limits.h>)". That definition of int does not involve the term "word" at all. There were C compilers for the 68000 family that had 16-bit ints and C compilers for the 68000 family that had 32-bit ints, so some compiler writers chose "the width of the CPU data paths" and some compiler writers chose "the size of the data and address registers and of the operands of the widest integer arithmetic instructions". Guy Harris (talk) 20:05, 20 April 2018 (UTC)
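That range-based definition is easy to demonstrate. A small C example (a hypothetical program, using only standard headers) shows that portable code can rely on the INT_MIN/INT_MAX guarantees but not on any particular word width:

```c
#include <limits.h>
#include <stdio.h>

/* The standard only guarantees that int can hold at least
   -32767..32767, so a 68000 compiler could legally pick
   either 16 or 32 bits. */
int main(void) {
    printf("int is %zu bytes, INT_MAX = %d\n", sizeof(int), INT_MAX);
#if INT_MAX == 32767
    puts("this compiler chose a 16-bit int");
#else
    puts("this compiler chose a wider int");
#endif
    return 0;
}
```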
What are the criteria for including an architecture in this table? I don't see the PDP-6/10, which was the dominant architecture on the ARPAnet and the major machine used in AI research in the 60's and 70's; I don't see the SDS/XDS 940, a pioneering early timesharing machine; I don't see the GE/Honeywell 6xx/6xxx series, on which Multics was implemented; I don't see the Manchester Atlas, which pioneered virtual memory. Of course, I could just add them, but perhaps there is actually some logic about the current choice I am missing? For that matter, I'm not sure that having an extensive list is all that valuable; maybe it could be condensed in some useful way? -- Macrakis (talk) 20:16, 27 July 2010 (UTC)
Although I appreciate the idea that this article describes "Word" as a dynamically sized type depending on architecture...
When we're in the world of x86, there's simply a backward compatibility in effect that has fixed the size of the type "WORD" to 16 bits. The problem with this concept, however, is compilers that actually intend fixed types but define them via base types like "int" and "long", which are sized dependent on the underlying CPU architecture.
When in C the type WORD is defined via "#define WORD int", it doesn't mean that a WORD is always the length of an integer; it means that WORD is defined to a type that, at the time it was written, was 16 bits. When you compile the same definition on a compiler that supports 64 bits, it will be compiled to a 32-bit type, which will probably break your software, especially when coupled to the Windows API. Nowadays WORD will probably be defined as a short int, which is 16 bits in most C compilers.
As a result, or a cause (who knows which came first), both Intel and AMD have a clear definition of words in their opcode manuals.
The following quote is from the "Intel® 64 and IA-32 Architectures, Software Developer’s Manual, Volume 2A: Instruction Set Reference, A-M":
The "legacy" opcodes LODSB, LODSW, LODSD for example are made to specifically load 1-byte, 2-byte, 4-byte into registers. Independent of whether the architecture is 32 or 64 bits.
So I personally think that the size of a WORD has been fixed by history at 16 bits, and a DWORD at 32 bits, simply because they're currently heavily used and really need to be binary-compatible types. — Preceding unsigned comment added by Partouf (talk • contribs) 20:54, 15 January 2011 (UTC)
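Partouf's point about the fragile definition can be made concrete with a short C sketch (the OLD_WORD/NEW_WORD names are invented for illustration): a WORD defined via int silently changes size with the compiler, while a fixed-width typedef stays 16 bits by contract.

```c
#include <stdio.h>
#include <stdint.h>

#define OLD_WORD int        /* 16 bits when first written, 32 today */
typedef uint16_t NEW_WORD;  /* fixed at 16 bits on any compiler     */

int main(void) {
    /* On a typical modern compiler int is 32 bits, so structures and
       file formats laid out with OLD_WORD no longer match a 16-bit
       binary interface such as the Windows API expects. */
    printf("OLD_WORD: %zu bytes\n", sizeof(OLD_WORD));
    printf("NEW_WORD: %zu bytes\n", sizeof(NEW_WORD));
    return 0;
}
```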
On the 68000, a word usually refers to 16 bits. The 68000 had a 16-bit data bus, and although it had 32-bit registers, 32-bit operations were slower than 16-bit ones. -OOPSIE- (talk) 08:27, 2 February 2011 (UTC)
A critical limit on any architecture is the width of an address. I admit that there are multiple versions of this: the width of the address bus and the width of the address as far as ordinary programs are concerned (and sometimes the width of the address for programs that are willing to work hard).
Many computer architectures ran out of address bits, grew warts to extend the physical address, and then died when these were not sufficient. Think: PDP-8 with 12-bit addresses, then bank switching to allow more memory (horrible for a program to deal with, OK for an OS). Or the i386 with 32 bit addresses and then PAE.
Reliable sources say that the main motivation for 64-bit architectures (SPARC, MIPS, AMD64, ARMv8) has been to make wider addresses convenient.
I'd like a column added to the table to give the address width. DHR ( talk) 20:07, 28 October 2013 (UTC)
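As a back-of-the-envelope illustration of why address width is the critical limit, a small hypothetical C program tabulates the directly addressable locations for a few historical widths (e.g. 12 for the PDP-8, 32 for the i386, 48 for typical x86-64 virtual addressing):

```c
#include <stdio.h>

/* Each address bit doubles the number of addressable locations
   (bytes, or words on a word-addressed machine like the PDP-8). */
int main(void) {
    const unsigned widths[] = { 12, 16, 24, 32, 48 };
    for (unsigned i = 0; i < sizeof widths / sizeof widths[0]; i++)
        printf("%2u-bit addresses -> %llu locations\n",
               widths[i], 1ULL << widths[i]);
    return 0;
}
```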
Thanks for improving my language regarding "char" and Unicode (and Harvard, but while true shouldn't split-cache be mentioned?). While the history lesson is correct (I can always count on you), my point wasn't to list encodings. UTF-8 is up to 82.4% while Shift JIS is down to 1.2% on the web. [2]
My point was that "char" is still 8-bit (or not, e.g. in Java, but in all those newer languages there is still an 8-bit type) while no longer "usable" for characters when 8-bit. The multibyte encodings didn't influence computer architecture and word size as far as I know. Still, I just thought it should somehow be mentioned that "char" can be bigger (e.g. 32-bit in Julia (programming language), my current favorite) while not having anything to do with the least addressable unit. Then maybe it's OK to shorten the language, to stay on topic, while only mentioning Unicode/UTF-8 (as the only important encoding going forward)? comp.arch (talk) 14:38, 19 January 2015 (UTC)
Recent edits changed the entries for IA-32 and x86-64 to use "word" in the historical-terminology sense rather than the natural-size sense. This has been much discussed during the history of this page.
If, for architectures that evolved from earlier architectures with shorter natural-size words, we're going to consistently use "word" in the historical-terminology sense, then 1) we need to fix the entry for VAX as well (it was a 32-bit architecture, so "the natural unit of data used by a particular processor design", to quote the lede of the article, was 32 bits) and 2) we should update the lede, as it doesn't say "word is a term the developer of an instruction set uses for some size of data that might, or might not, be the natural unit of data used by a particular processor design".
Otherwise, we should stick to using "word" to mean what we say it means, even if that confuses those used to calling a data item that's half the size, or a quarter of the size, of the registers in an instruction set a "word".
(We should also speak of instruction sets rather than processor designs in the lede, as there have been a number of designs for processors that implement an N-bit instruction set but have M-bit data paths internally, for M < N, but that's another matter.) Guy Harris ( talk) 23:38, 20 February 2016 (UTC)
Well I had to scroll about 11 or 12 bits down this page of ever-so cleverness, but finally some sanity, thank you Partouf. Firstly, whoever was responsible for putting together this page, I love it. It's full of really interesting stuff, and I also get the impression you may have a similar coding ethos to myself. I like it. I like you. So I really would like to see this page be re-titled and linked to or included in a page on the way 'word' is used in computing, perhaps under the heading "origins of data size terminology" or something like that. It is all true. You are technically correct, however anybody looking for real-world information can only be confused here. Perhaps a junior school student who is starting computer science and can't remember which one of those funny new terms is which. There are actually comments here from people not wanting readers to get the wrong impression and think a word is 16 bits... It is. Sorry. The real world built a bridge, got over it and moved on with their lives. That poor school kid just wants to know it's 2 bytes; she has enough study to do already.
Perhaps it should be noted somewhere early in the page not to confuse this with the "word size" that some sources still use to describe processor architecture, directing the 1 in every 65536 visitors (including myself) who find that interesting to a section further down the page. The fact that the word has been re-purposed really should rate about one sentence in the opening paragraph before it is clearly described that a word is 2 bytes, 16 bits, and can hold values of zero through 2^16-1. Maybe something like "In computing 'word' was once used to describe the size of the largest individual binary value that could be manipulated by a given system, but with the evolution of processor design and rapid growth of computer use worldwide, it now almost always refers to..." a WORD. This is what languages do; they evolve. If we want to wax philosophical about it, I could postulate that it is probably because computer terminology, and indeed documentation, began to solidify around the time that 16-bit architecture was the mainstream norm. A group of cetologists may very correctly use the term 'cow' to describe 100 tons of migratory aquatic mammal, but WP quite rightly directs that inquiry straight to this page.
The obvious example that immediately sprang to mind has, I see, already been canvassed here by Partouf: the long-ago standardisation of the term in assembly-language programming and its ubiquitous and categorical definition as 2 bytes. From volume 1 of the current Intel® 64 and IA-32 Architectures Software Developer Manuals:
4.1 FUNDAMENTAL DATA TYPES The fundamental data types are bytes, words, doublewords, quadwords, and double quadwords (see Figure 4-1). A byte is eight bits, a word is 2 bytes (16 bits), a doubleword is 4 bytes (32 bits), a quadword is 8 bytes (64 bits), and a double quadword is 16 bytes (128 bits).
Please believe me, this is a page full of really good, interesting stuff. Stuff that I mostly already know, but I still find the chronicled detail of a subject so close to my heart to be a thing of great beauty. But, I suspect like most people who feel that way, I don't need 'word' explained to me; this page should be for those who do. I got here when I searched for 'qword'. The term for the next size up (dqword) momentarily slipped my mind, and I expected a page dealing with data sizes where I could get the information. What I found was more akin to a group of very proper gents all sipping tea and reassuring each other that the word 'gay' really does mean happy, regardless of what those other 5 billion people use it for. In fact, having just looked up the gay page, it's a really good example:
Gay: This article is about gay as an English-language term. For the sexual orientation, see Homosexuality. For other uses, see Gay (disambiguation).
Gay is a term that primarily refers to a homosexual person or the trait of being homosexual. The term was originally used to mean "carefree", "happy", or "bright and showy".
A bit further on there's a 'History' title and some of this, “The word gay arrived in English during the 12th century from Old French gai, most likely deriving…..”
Should anybody find this example to be in some way inappropriate they will have proved themselves well described by my simile. It is the perfect example.
The only sources I am aware of in which 'word' is still used to describe the native data size of a processor are some technical manuals from microcontroller manufacturers originating from regions where English is not the predominant language. Perhaps this page could be accurately titled "Word (microcontroller architecture)", but not "Word (computing)". As "Word (computer science)", it is just plain wrong. I know it was right in 1970, but that's a moot point. I have spent some time programming non-PC processors, of various makes and (dare I say it) word sizes, so I understand why 'word' is so appropriate in its traditional context; it's almost poetic when you imagine the quantum with which a processor speaks.
However, in the real world mostly they don't. They 'talk' to almost anything else using 2-wire buses like SMBus or I²C. SPI, with its lavish 4 wires, is being used less in favor of the more versatile and cheaper-to-implement I²C. With these I find myself not thinking in 8, 16, 32 or 64 bits as I do with PCs, but in serial pulses. You can write the code in C these days; it's mostly interfacing with peripherals that's the challenge, and it's mostly 2 or 4 wire, even in modern smartphones. Typically the only high-bandwidth 'bus' is for graphics, and having the GPU and CPU fused together, they are all but the same device... it's no surprise our eyes are still attached directly to the front of our brains.
After I wrote the start of this, I thought I had made a mistake, that this was a page specifically for processor architecture, that there was a word (unit of data) page also, and that it was a case of a bad link that needed to be redirected. I looked. I was wrong. I couldn't believe it. I'm sorry.
"Recentism"... Peter, that has to be trolling, right? I'm not sure, but you say, "A word might be 32, 36, or 60 bits", immediately followed by "16 bits is a halfword, to call it a word is a recentism." You then go on to explain that the brains trust at Intel were just being lazy when they decided that they wouldn't re-define the length of a cubit to match the new king's forearm. So I'm just going to give you a big thumbs-up for sense of humor, but I should touch on the subject that you have so rightly identified as being at the root of the situation. The only linked reference on this page without the word 'history' in the title was first printed in 1970. Reference number 4 is unlinked, but it shows a glimmer of promise with its title specifying "Concepts and Evolution".
Guy gave a couple of great examples going back to the mid 60's; however, the real question is: what does the word mean? I know what it means to you, and I expect also to our host, but what does it mean to the vast majority of people who use it? Back in the mid 60's, when mainframes and COBOL were all the rage, a tiny fraction of the world's population had access to any sort of computer. PCs arrived a few years later, but were simplistic and prohibitively expensive for most people. It really wasn't until the 80's that even a notable minority of the world had access to one. The ZX81 sold 1.5 million units. By the late 80's the entire range had sold about 7 million. A quick look at my phone, and the original Angry Birds has over 100 million downloads just on Android.
A frivolous example, but I think it's fair to say that the number of people who have daily access to a computer today is comparable to the entire population of the world in 1965, without knowing the exact figures. I could be wrong, but it wouldn't be a ridiculous argument. I think guessing at there being 10 million PCs in 1985, when Intel sanely decided to stick with the cubit they had been using for years, would be generous. Even though it may only be a minority of users who think about data sizes beyond the bare minimum they are taught at school, I would suggest that in 2016 there must be tens, if not hundreds of millions of people who all know that a word is 2 bytes.
Possibly there are as many young people who know that gay can also mean 'happy' or 'bright', possibly not, but regardless of our personal feelings, are any of us old gents sipping our tea really going to tell that 6'6" Hell's Angel at the bar that he looks really gay tonight? Of course not, because like it or not we know what the word means. A word, any word, does not mean what the first person to speak it was thinking at the time; it means what people understand you to be communicating when you use it. The most absolutely stingy comparative estimate I could possibly imagine of people who use 'word' in a computing sense who take it to mean "2 bytes" vs. those who take it to mean "the size of the native unit for a given processing system" would be 1000:1, albeit that I've knocked off a couple of zeros I'm quite sure are there; the point remains valid.
This page describes the historical origins of the use of the term 'word' in computing, not the actual meaning attributed to its usage by almost all the people who use it. The section of the population who understand the word in the way you are presenting it is a very small minority, but more importantly the section of the population who actually use the word in that context is so tiny that it is bordering on statistically irrelevant. You can, of course, point to historical text that will suggest you are technically correct, but at the end of the day what will you have achieved? WP is a truly awesome entity that has already etched itself into the global consciousness. The term 'word' is one of the central concepts in computing, as I suspect you might possibly want to point out, were you not so committed to the second pillar.
One of the things that fellows like me who are not as young as many have to come to terms with is that regardless of how certainly I know that my view on something is the best way to look at it, if an entire generation has decided otherwise, I am not going to sway them. I could stubbornly hold my ground and continue to back my barrow up; the best I could hope for there is that the world just ignores me until I become venerable and expire. Or I could participate in society as it is presented to me, and endeavor to appropriately impart whatever meager wisdom I possess from within something that is far larger than myself.
In my ever-so-humble opinion, the first thing a page should do is give the reader the information they are looking for, and then if there is an even richer context to place around it, so much the better. I believe that if you create a page that is relevant and informative, but then beckons the reader as far into the larger context as suits their individual interests and capacity you will have made something that is truly greater than the sum of its parts that may well outlive the both of us. Perhaps by then people will have decided that gay means 'windy'. Please try to imagine how this page presents to somebody who searches 'qword' or links from here.
... and really, check this out for formatting; it's quite well constructed.
“I say old chap, its rather gay outdoors this afternoon”
60.231.47.233 (talk) 22:54, 12 June 2016 (UTC)
Guy Harris, thanks for editing the table. I suggested non-italics (intentionally changing just some, not all, in case I was wrong; left it inconsistent...), and thanks for not changing (w) and {w} back, and for clarifying why the comments use italics. I can go with the table as is, or just change it and show (again) how it might look. I believe it might be lost on people why some entries are in italics. Would upper case be better? I had a dilemma with "b", as it should stay lower case, but if you want the others to stand out, upper case would do. Also, per MOS:FRAC, there should be a space, I think (maybe an exception; I hope to conserve space?). comp.arch (talk) 15:03, 19 April 2017 (UTC)
While the list of computer architectures is quite thorough, it could use Data General Nova, National Semiconductor NS320xx (NS32016/NS32032), SPARC (32- and 64-bit versions) and PA-RISC entries. 64-bit MIPS would also be nice. I'd add them myself but I don't know all the technical details. Bumm13 ( talk) 10:31, 27 September 2017 (UTC)
I'm also surprised Burroughs isn't mentioned, they pioneered many innovations in architecture and had long word lengths on their small machines. Also the Symbolics Ivory's 40-bit word probably deserves a mention for uniqueness if nothing else. -- 72.224.97.26 ( talk) 23:58, 10 May 2020 (UTC)
There are recent edits regarding removal, or unremoval, of 8 in the list of word sizes. Note that there are seven items in the Uses of Words section, all of which could be definitions of word size. It seems, though, that the number of bits used to describe a processor might not be the word size. VAX is considered a 32 bit processor (32 bit registers, 32 bit addresses, etc.) but has a 16 bit word size. The 8080, widely called an 8 bit processor, doesn't necessarily have an 8 bit word size. It has 16 bit addresses and (some) 16 bit registers. IBM S/360, a 32 bit system, has (in different models) data bus width from 8 to 64, 24 bit addresses, ALU size from 8 to 64 bits, not necessarily the same as data bus width. I think I agree with the removal of 8 from the list, even considering that there are 8 bit processors. (Unless said processors have an 8 bit address space.) Gah4 ( talk) 23:43, 10 February 2019 (UTC)
The article says In general, new processors must use the same data word lengths and virtual address widths as an older processor to have binary compatibility with that older processor. This makes a lot of sense, but many examples show it is wrong. VAX is the 32-bit successor architecture (still with the 16-bit word) to the PDP-11; early models have a compatibility mode to execute PDP-11 code. IBM extended the 24-bit addressing of S/360 to 31 bits in 370-XA and ESA/390, and finally to 64-bit addressing in z/Architecture, binary compatible all the way through. You can run compiled load modules from the early S/360 years under z/OS on modern z/Architecture machines: complete backward binary compatibility. I suspect more people understand the 32-bit and 64-bit extensions to the 16-bit 8086, which again kept the 16-bit word. Gah4 (talk) 05:33, 27 March 2019 (UTC)
These supercomputers should be included in the table of machines due to their unique architectures and/or place in history. For example, the STAR had as its basis concepts from the programming language APL: /info/en/?search=CDC_STAR-100 . And the ASC had as its basis optimized vector/matrix operations; its instruction set was designed to handle triple for loops as a first-class operation: /info/en/?search=TI_Advanced_Scientific_Computer — Preceding unsigned comment added by 137.69.117.204 (talk) 20:13, 10 October 2019 (UTC)
The article says
In general, new processors must use the same data word lengths and virtual address widths as an older processor to have binary compatibility with that older processor.
There are ways of expanding the word lengths or virtual address widths without breaking binary compatibility. The most common way this is accomplished is by adding separate modes, so that the newer processor starts up in a mode without support for the wider word lengths or virtual address widths, and an OS that supports the newer version of the instruction set can enable the widening and, typically, even can run binary programs expecting the narrower widths by changing the mode on a per-process basis. This is how System/370-XA extended addresses from 24 bits (within a 32-bit word, the upper 8 bits being ignored) to 31 bits (again, within a 32-bit word, with the uppermost bit ignored), and how most 32-bit instruction sets were expanded to 64 bits.
In addition, some 64-bit instruction sets, such as x86-64, don't require that implementations support a fully 64-bit instruction set, but require that addresses in which the unsupported bits are significant cause faults (non-zero or, at least in the case of x86-64, not just sign extensions), so that programs can't use those bits to store other data (this was the issue that required the mode bit for System/370-XA). Later implementations can (and have) increased the number of significant bits.
This was an issue, however, for some code for the Motorola 68000 series. The 68000 and 68010 had 32-bit address registers, but only put the lower 24 bits on the address bus. This would allow applications to pack 8 bits of information in the upper 8 bits of an address. That code would run into trouble on the 68012, which put the lower 31 bits on the address bus, and on the 68020 and later processors, which either put all 32 bits on the address bus, or, for the 68030 and later, provided them to the on-chip MMU to translate to a physical address. This was an issue for Macintosh software. Guy Harris ( talk) 18:06, 13 April 2021 (UTC)
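For concreteness, here is a C sketch of the 68000-era trick Guy Harris describes (the function names are invented for illustration; the mask follows the 24-bit bus behaviour he outlines): flags packed into the ignored top byte of an address work on a 68000/68010 but become part of the address on later processors.

```c
#include <stdio.h>
#include <stdint.h>

#define ADDR_MASK 0x00FFFFFFu  /* the 24 bits a 68000/68010 drove */

/* Pack 8 bits of flags into the top byte the 68000 ignored. */
static uint32_t tag_ptr(uint32_t addr, uint8_t flags) {
    return (addr & ADDR_MASK) | ((uint32_t)flags << 24);
}

/* What the 68000 address bus effectively saw; a 68020 or later
   uses all 32 bits, so the tagged value no longer names the same
   location -- exactly the Macintosh breakage described above. */
static uint32_t addr_seen_by_68000(uint32_t tagged) {
    return tagged & ADDR_MASK;
}

int main(void) {
    uint32_t p = tag_ptr(0x00123456u, 0x80);
    printf("tagged=%08X, 68000 sees %06X\n",
           (unsigned)p, (unsigned)addr_seen_by_68000(p));
    return 0;
}
```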
The instruction length cell for IA-64 is rather terse. Does anybody know of an online manual that I could cite in a footnote with |quote= for a fuller explanation? Note that the 128-bit packet contains more than 41-bit instructions. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 03:01, 24 April 2022 (UTC)
The redirect Bitness has been listed at redirects for discussion to determine whether its use and function meets the redirect guidelines. Readers of this page are welcome to comment on this redirect at Wikipedia:Redirects for discussion/Log/2024 March 21 § Bitness until a consensus is reached. Utopes (talk / cont) 03:22, 21 March 2024 (UTC)