Sunday, February 5, 2012

Assembly: explaining the IO with C

Following on from my last post on IO using the C APIs, I am now going to explain how it works.

The biggest difference in the build process is that instead of just linking the object file created by NASM, I am now linking it with gcc.

The reason is that gcc links against the C standard library, which provides the stdio functions used here.

Segment .data
I am defining the variables two_var_output, output, input and hello_world.
Each one holds a char array, defined one byte per character.
The value of two_var_output is %s%s\0.
The value of output is You said \0.
The value of input is Please say something: \0.
The value of hello_world is Hello World!\nSecond blog article code\n\0.
Where \n is the new line character, with the value 10 in ASCII / Unicode.
Where \0 is the null character, with the value 0 in ASCII / Unicode.
In NASM source these escapes are only interpreted inside backquoted strings; with ordinary quotes you write the numbers 10 and 0 after the string instead.
Remember that C strings end in a null terminator char (0), so each value above finishes with one.
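To see the null terminator for yourself, here is a rough sketch of the same four strings written as C char arrays. The helper last_byte is a name I made up for illustration; the C compiler adds the trailing 0 byte to each string literal on its own.

```c
#include <stddef.h>

/* The four strings from the .data segment as C char arrays.
   The compiler appends the terminating 0 byte to each literal. */
const char two_var_output[] = "%s%s";
const char output[]         = "You said ";
const char input[]          = "Please say something: ";
const char hello_world[]    = "Hello World!\nSecond blog article code\n";

/* Hypothetical helper: returns the last byte stored in an array of
   the given size, which for these strings is the null terminator. */
int last_byte(const char *s, size_t size) {
    return s[size - 1];
}
```

Calling last_byte(hello_world, sizeof hello_world) gives 0, and the byte right after "Hello World!" is 10, the new line character.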

Variables (Data directives)
Variables in Assembly which have a value are defined in the following format:
label dx value
Where label is the name given to it, the x part of dx is the size of each element and value is the value given to it.
Which letters are available? Well, take a look:
B = Byte
W = Word
D = Double word
Q = Quad word
T = Ten bytes
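If you think in C, the first four sizes line up roughly with the fixed-width integer types. This is only an analogy, and the SIZE_ names are made up for this sketch; the ten byte size has no standard C type, so it is left out.

```c
#include <stdint.h>

/* Rough C counterparts of the NASM size letters (an analogy only). */
enum {
    SIZE_B = sizeof(uint8_t),   /* db: byte,        1 byte  */
    SIZE_W = sizeof(uint16_t),  /* dw: word,        2 bytes */
    SIZE_D = sizeof(uint32_t),  /* dd: double word, 4 bytes */
    SIZE_Q = sizeof(uint64_t)   /* dq: quad word,   8 bytes */
};
```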
Segment .bss
Now you can have not just dx but also resx.
resx is used when you want to define a variable but not give it a value / initialize it.
Where x is a letter from the list in the Variables section, just like for dx.

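Here I have defined the variable result with resb, so it reserves space without giving it a value.
Each element it holds will not exceed a byte (0-255).
Keep in mind that resb takes the number of bytes to reserve, so reserve enough for whatever the user might type.
The reasons I have done this:

  1. It can save a very small amount of space, as uninitialized data does not need to be stored in the executable file.
  2. We don't know how big the value will be, as it will be given by the user later.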
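The closest thing in C is a global array with no initializer. This is a sketch under my own assumptions: the size of 256 and the names RESULT_SIZE and result_capacity are made up here and are not taken from the Assembly code.

```c
#include <stddef.h>

#define RESULT_SIZE 256  /* assumed size, not taken from the post */

/* No initializer: compilers typically place this in .bss, and C
   guarantees it starts out filled with zero bytes. */
char result[RESULT_SIZE];

/* Hypothetical helper: how many bytes were reserved for result. */
size_t result_capacity(void) {
    return sizeof result;
}
```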
Segment .text
First, in our function definition segment, I declare the function _main (main on e.g. Linux, remember?) as global, which makes it visible to the linker so it can be called when the program is run from the command line.
This didn't matter so much when I was just linking to an executable during the build process, but when you link against C it is not so forgiving. C programs start at a function called main by default. Well, here it is in Assembly.

The next statement (the extern one) is all about "importing" global C functions from the C library that gcc links in.
The particular ones involved here are gets and printf.
Where gets reads data from the terminal and printf outputs it.

In the main function (and the only one at that) there is basically a whole lot of code repeated.

First lot of the main function's code
As a side note, C functions take a char array as its address, and in 32-bit code an address is a dword. So prefix the variable with dword when pushing it.
The array itself can stay defined with db, one byte per character, which also saves a little bit of memory.
First we push the variable hello_world, with the value defined above, on to the stack.
Think of the stack as an array which contains all the variables to be passed to a function when you call it.
With the C calling convention the function leaves them on the stack when it has finished.
So remember, when you push a variable on to the stack it must come off again.
Next we call the printf function. If you were in C you would write:
printf(hello_world);
When you pop a variable off the stack you provide a register / variable to put it in, hence eax.
That finishes the basics of outputting to the terminal, but remember that whenever you finish a line on the terminal you need the new line character to actually go to a new line.

The second lot of code is essentially the same as the first, except it uses the variable input, not hello_world.
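For comparison, here is roughly what those first two lots look like in C. The helper print_text is a made-up name for this sketch. It passes the text through a "%s" format as a defensive habit, whereas the Assembly hands the string to printf directly as the format, which works here because neither string contains a % character.

```c
#include <stdio.h>

const char hello_world[] = "Hello World!\nSecond blog article code\n";
const char input[]       = "Please say something: ";

/* Hypothetical helper: prints text and passes on printf's return
   value (characters written, or a negative value on error). */
int print_text(const char *text) {
    return printf("%s", text);
}
```

Calling print_text(hello_world) and then print_text(input) prints the greeting followed by the prompt, just like the two push / call / pop groups.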

Third lot of code in main function (gets)
This is quite an interesting little function from C.
It reads a line of data from the terminal and returns when you press enter, aka the new line character.
But that is not the interesting part.
Instead of providing it with something like a format, you provide it with the variable to store the result in!
So we push the variable result (to store the value in) on to the stack. This function doesn't ask how big the variable is, it only needs a place to store stuff, so it is up to us to reserve enough room.
Hence result not being initialized. gets fills it in.
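Because gets never checks the size of the place it writes to, it was removed from the C standard in C11. If you were writing this part in C today you would more likely reach for fgets. The sketch below swaps in fgets and is not what the Assembly code calls; read_line is a made-up name, and it strips the new line so the result matches what gets would have stored.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into `buf`, storing
   at most size - 1 characters, and strips the trailing new line.
   Returns buf, or NULL on end of input or error. */
char *read_line(char *buf, size_t size, FILE *in) {
    if (size == 0 || fgets(buf, (int)size, in) == NULL) {
        return NULL;
    }
    buf[strcspn(buf, "\n")] = '\0';  /* gets drops the new line too */
    return buf;
}
```

With a buffer such as char result[256], the call read_line(result, sizeof result, stdin) stands in for gets(result).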

So now we have the value that we entered in a variable. Now what?
Output it back to the user like a good birdie!

Last section of code in the main function (that's changed)
Okay, so here is the problem with this lot: you can easily output one variable with printf, as the first variable is the format.
But with two you need the first to be a format that describes how the other two are to be output.
To keep it simple, you can use %s for a string and %x for a number in hexadecimal (%d gives decimal).
If you want a more thorough understanding of printf's format chars, check out the printf documentation for the stdio library.

So how does it look in C?
printf(two_var_output, output, result);
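If you want to check what the "%s%s" format produces without looking at the terminal, a small sketch with snprintf does the same formatting into a buffer. The helper format_reply is a made-up name; the Assembly code itself calls printf. Also worth knowing: with the C calling convention the arguments are pushed in reverse order, so in Assembly result goes on the stack first and the format last.

```c
#include <stdio.h>

/* Hypothetical helper: formats prefix followed by text into buf,
   exactly as printf("%s%s", prefix, text) would print them.
   Returns the length of the full formatted string. */
int format_reply(char *buf, size_t size,
                 const char *prefix, const char *text) {
    return snprintf(buf, size, "%s%s", prefix, text);
}
```

For example, format_reply(buf, sizeof buf, "You said ", "hi") leaves "You said hi" in buf.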
Okay I think you get my point now.
Thanks for reading!
I hope this helps with how this actually works.
