Welcome to the Wikipedia Computing Reference Desk Archives
The page you are currently viewing is a transcluded archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.
November 4 Information
Can an Assembly document contain machine code?
I ask this probably weird question for general knowledge, as someone who does only Bash and JavaScript and has never written one sentence in Assembly:
Might there be a case where a programmer writes machine code directly in an Assembly document (Assembly abstraction + machine code; binary/hexadecimal) and, if so, would it be executed directly from Assembly?
There are a small number of rare cases, like when the programmer needs to choose an instruction that isn't normally documented, in which case they just insert it as data in the middle of the code.
Bash is a command language that runs on UNIX computers, and the Bash interpreter cannot understand assembly language. There is a devious way to add inline assembly into C language via the asm keyword, then compile a DSO (Dynamic Shared Object) that Bash can load as a builtin with its enable -f command.
JavaScript is a language primarily for web pages that is interpreted by a browser, which cannot understand assembly language. Be aware that there are many different assembly languages, each specific to a particular computer architecture, unlike Bash and JavaScript, whose interpreters are portable to different processors.
Assembly language consists of a handful of easily remembered, simple commands that act directly on the registers of the target processor. For example, the assembly programmer who wishes to load an 8-bit register with the value 97 writes mov al, 61h into a text file, which his assembler program (e.g. MASM for an Intel 80x86 processor) translates to the machine code B0 61. These are two bytes expressed in hexadecimal. The OP asks about the possibility of a programmer directly writing machine code such as B0 61. It is possible, because the CPU cannot know or care where machine code came from and, if it is correct, will execute it as expected, but we must qualify that as follows:
It is very inadvisable to circumvent the assembler program, which has been developed and tested with full knowledge of the target CPU.
Machine code is almost impossible for a human subsequently to understand, debug or modify without the help of a disassembler program.
Exceptional cases that might justify a direct change in machine code are:
The assembly programmer knows reliably about a new feature of a CPU that has not yet been incorporated in his version of the assembler program.
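To make the mov al, 61h example concrete, here is a minimal Python sketch of what an assembler and a disassembler each do for this one instruction form. It assumes the standard x86 encoding of "MOV r8, imm8" (opcode B0h plus the register number, followed by the immediate byte); the function names are hypothetical and not taken from any real tool.

```python
# Register numbers for the 8-bit registers in the x86 "MOV r8, imm8" encoding.
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]


def encode_mov_r8_imm8(reg: str, value: int) -> bytes:
    """Assemble `mov reg, value`: opcode B0h + register number, then the immediate."""
    if not 0 <= value <= 0xFF:
        raise ValueError("immediate must fit in one byte")
    return bytes([0xB0 + REG8.index(reg.lower()), value])


def decode_mov_r8_imm8(code: bytes) -> str:
    """Disassemble a two-byte `mov r8, imm8` instruction back into text."""
    if len(code) != 2 or not 0xB0 <= code[0] <= 0xB7:
        raise ValueError("not a mov r8, imm8 instruction")
    literal = f"{code[1]:02X}h"
    if literal[0].isalpha():
        literal = "0" + literal  # MASM-style hex literals must start with a digit
    return f"mov {REG8[code[0] - 0xB0]}, {literal}"


machine_code = encode_mov_r8_imm8("al", 97)
print(machine_code.hex(" ").upper())     # B0 61
print(decode_mov_r8_imm8(machine_code))  # mov al, 61h
```

The two bytes carry no trace of how they were produced, which is why the CPU treats hand-written and assembled code alike, and why a human needs the decoding step to read them back.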
Yes, this is pretty common as a feature, although not often needed or used.
An assembly language program is written as source code. There are only two uses for this: editing it manually, or supplying it to an assembler program, which then converts it to binary machine code. You can't load this source file directly onto the target processor.
It's standard that you can also write an assembler 'directive', followed by some numbers (in human-readable format), which says "Don't treat this as assembly language, don't assemble it, just copy it through as raw data". This is most often used to load data look-up tables, but if those numbers are also valid as machine code, then they'll be incorporated as more machine code into the program and can be used as such.
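As a rough illustration of the directive idea, the following Python sketch is a toy two-rule "assembler" that understands only mov al, imm8 and a db (define byte) directive of the kind found in MASM-style assemblers. Everything here (the function names, the tiny subset of syntax) is an assumption made for the example, not how any real assembler is implemented.

```python
def parse_number(text: str) -> int:
    """Parse a MASM-style literal: a trailing 'h' means hexadecimal, otherwise decimal."""
    text = text.strip().lower()
    return int(text[:-1], 16) if text.endswith("h") else int(text)


def assemble_line(line: str) -> bytes:
    """Translate one source line of the toy language into bytes."""
    mnemonic, _, operands = line.strip().partition(" ")
    args = [arg.strip() for arg in operands.split(",")]
    if mnemonic.lower() == "db":
        # Directive: copy the listed byte values through untouched.
        return bytes(parse_number(arg) for arg in args)
    if mnemonic.lower() == "mov" and args[0].lower() == "al":
        # Instruction: mov al, imm8 encodes as opcode B0h followed by the value.
        return bytes([0xB0, parse_number(args[1])])
    raise ValueError(f"unsupported line: {line!r}")


from_mnemonic = assemble_line("mov al, 61h")
from_raw_data = assemble_line("db 0B0h, 61h")
print(from_mnemonic == from_raw_data)  # True
```

Both source lines yield the same two bytes, so once they are in the output nothing distinguishes "data" from "code" except whether execution ever reaches them.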
Whether it gets executed depends on whether anything tries to execute it. Remember that most processors have a fairly dense set of instructions (most numbers map to recognised opcodes), so the processor will do something with it, though probably not anything very useful. All that's needed is for the program counter to be instructed to jump to this address.
This isn't often used for a serious purpose. It sometimes is, if you're writing for a new version of a processor and your assembler doesn't yet support some new instruction which has recently been added. This might even be an instruction (or an operand) which shouldn't work, but it turns out that it does (and might have some weird but useful side-effect).
Assembling such a program involves more than just translating assembler mnemonics into opcodes. One of the main tasks of an assembler is doing the arithmetic needed to calculate jumps, pointers etc. So any embedded machine code like this would need to have that arithmetic done by hand first, and that's really tedious.
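For a feel of the arithmetic involved, here is a small Python sketch for one case only: the x86 short jump (opcode EBh), whose single displacement byte is measured from the address of the instruction that follows the jump. The function name and the example addresses are made up for illustration.

```python
def encode_short_jump(jump_address: int, target_address: int) -> bytes:
    """Encode an x86 short jump (opcode EBh) located at jump_address."""
    instruction_length = 2  # opcode byte + one displacement byte
    # The displacement is relative to the address of the *next* instruction.
    displacement = target_address - (jump_address + instruction_length)
    if not -128 <= displacement <= 127:
        raise ValueError("target is out of range for a short jump")
    return bytes([0xEB, displacement & 0xFF])  # two's-complement byte


print(encode_short_jump(0x100, 0x110).hex(" ").upper())  # EB 0E (forward jump)
print(encode_short_jump(0x110, 0x100).hex(" ").upper())  # EB EE (backward jump)
```

Every time code is inserted or removed between the jump and its target, this displacement changes, which is exactly the bookkeeping you take on when you write the bytes yourself.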
It's even possible to find this facility in some (older) higher-level programming languages. Years back (mid-'80s) I was using CORAL 66, which was obsolete even then. It's a very simple language, and we were writing 'systems' code, which manipulated low-level features of the processor hardware, writing specific codes into specific locations to control the memory-mapping hardware. Fortunately the language compiler had 'CODE' blocks, which allowed us to write codes in directly, just as you've been describing.
Andy Dingley (talk) 20:50, 5 November 2019 (UTC)