Saturday, 30 September 2017

NASM x86 Assembly Optimization [Linear congruential generator]

I have tried writing my first assembly program (NASM x86 assembly). It is a linear congruential generator, or at least that was the plan. Because of the limits of my assembly skills, I had to change the starting parameters of the pseudo-random number generator (PRNG):

name   before     after
seed   214567     21456
a      22695477   16807
c      1          0
m      2^32       2^24
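
To make the recurrence concrete, here is a minimal Java sketch of one generator step using the "after" parameters listed above. The names LcgStep, next and nextMasked are hypothetical and not taken from either program discussed here. The masked variant assumes that m stays a power of two.

```java
// Hypothetical sketch: one step of the linear congruential generator
// with the "after" parameters (a = 16807, c = 0, m = 2^24).
final class LcgStep {
    static final long A = 16807L;     // multiplier a
    static final long C = 0L;         // increment c
    static final long M = 1L << 24;   // modulus m = 2^24 = 16777216

    // seed' = (seed * a + c) mod m, computed in 64 bits so the product cannot overflow
    static long next(long seed) {
        return (seed * A + C) % M;
    }

    // Same result for non-negative seeds, but only because m is a power of two:
    // taking the remainder is then the same as keeping the low 24 bits.
    static long nextMasked(long seed) {
        return (seed * A + C) & (M - 1);
    }

    public static void main(String[] args) {
        long seed = 21456L;
        for (int i = 0; i < 3; i++) {
            seed = next(seed);
            System.out.println(seed);
        }
    }
}
```

With the seed 21456, the first step is 21456 * 16807 = 360610992, and 360610992 mod 16777216 = 8289456.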

%include "io.inc"
section .data
;cheating with the numbers by putting a "0." in front of them so they look like floats
msg db "0.%d",10,0
msgDone db "done",10,0
msgStart db "start",10,0
;see http://ift.tt/1MDS8uz
seed dd 21456
a dd 16807
c dd 0
m dd 16777216
section .text
global CMAIN
extern printf
CMAIN:
    mov ebp, esp; for correct debugging
    ;create stack frame
    push ebp
    mov ebp, esp
    ; edi is used for the loop count
    mov edi, 0
    push msgStart
    call printf
    generate:
        ;seed = (seed*a+c)%m
        mov eax, [seed]
        mov ecx,[a]
        ;seed*a (mul leaves the 64-bit product in edx:eax)
        mul ecx
        ;seed*a+c
        add eax,[c]
        mov ecx, [m]
        ;(seed*a+c)%m (div leaves the remainder in edx)
        div ecx
        mov eax,edx
        mov [seed], edx
        ;push to print them out
        push eax;disable for testing
        push msg;disable for testing
        call printf ;disable for testing
        add esp,8
        inc edi
        cmp edi, 12 ; set this to 1000000000
    jne generate
    ;destroy stack frame
    push msgDone
    call printf
    xor eax, eax
    mov esp, ebp
    pop ebp
    ret

I wrote the same code in Java and in NASM x86 assembly and expected the NASM version to be faster than the Java one. In the end it took about three times longer: NASM ~10 s, Java ~3-4 s. My test had each program generate 1,000,000,000 random numbers, of course without the System.out print calls / call printf, because they took a lot of time.
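
The Java side of the comparison is not shown in this post, so the following is only a guess at what such a timing loop might look like. LcgBench, run and the default iteration count are hypothetical. The final seed is returned and printed so that the result is actually used and the loop is not dead code.

```java
// Hypothetical sketch of a Java timing loop for the same generator.
final class LcgBench {
    // Advance the generator `count` times and return the final seed.
    static long run(long seed, long count) {
        for (long i = 0; i < count; i++) {
            seed = (seed * 16807L) % 16777216L;   // c = 0, m = 2^24
        }
        return seed;
    }

    public static void main(String[] args) {
        // Pass 1000000000 as the first argument to match the test described above;
        // the smaller default only keeps a casual run short.
        long count = args.length > 0 ? Long.parseLong(args[0]) : 10_000_000L;
        long start = System.nanoTime();
        long last = run(21456L, count);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
        System.out.println("last seed = " + last + ", elapsed = " + elapsedMs + " ms");
    }
}
```

A single timed run like this also includes JIT warm-up, so the number it prints is only a rough indication.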

I knew from the beginning that my assembly code would be badly optimized, because I had never programmed in assembly before and just wanted to try it, but I never expected the performance to be that bad.

Can someone tell me why my program needs so much time and how assembly programs can be optimized?

Thanks

The code was written with SASM on 64-bit Windows 10.



