Discussion:
Saving registers on the stack
Harald Hallen
2002-11-25 12:00:51 UTC
This is a really elementary question, and I shamelessly ask it in the hope of saving an hour or so of reading manuals.

I used 68000 processors a lot around 10 years ago. In assembly language routines for the 68000, the standard way of saving and restoring multiple registers on the stack was to use the MOVEM instruction, like so:

MOVEM.L D0-D4,-(SP)
.
.
MOVEM.L (SP)+,D0-D4


Now that I'm returning to the architecture in its more modern ColdFire incarnation, it seems that the address register predecrement and postincrement addressing modes are not available with the MOVEM instruction. So my question is: what's the optimal way of saving/restoring multiple registers now? Does one have to use a separate MOVE instruction for each register?


Harald Hallén
Senior Software Engineer
LMI Selcom AB
Box 250
S-433 25 Partille
Sweden
Phone: +46 (0)31 3362523 (direct)


________________________________________________________________________
ColdFire Discussion List
See: <http://www.WildRice.com/ColdFire/>
Rudy Van Hecke
2002-11-25 13:21:27 UTC
Hello Harald,

To save fewer than four registers, it's better to use separate MOVEs.
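
As a minimal sketch of the separate-MOVE approach: plain MOVE.L still supports predecrement and postincrement addressing on ColdFire, so a short save/restore can keep the classic push/pop form. The choice of D0 and D1 is an assumption made only for this example.

```asm
        MOVE.L  D1,-(SP)        ; push D1 (SP drops by 4)
        MOVE.L  D0,-(SP)        ; push D0
        ; ... routine body, free to clobber D0 and D1 ...
        MOVE.L  (SP)+,D0        ; pop in the reverse order of the pushes
        MOVE.L  (SP)+,D1
```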

A solution to save more registers is given by:

LEA.L -20(SP),SP
MOVEM.L D0-D4,(SP)
...
MOVEM.L (SP),D0-D4
LEA.L 20(SP),SP

A LINK instruction can also create space on the stack:

LINK.W A1,#-20
...
MOVEM.L D0-D4,(SP)
...
MOVEM.L (SP),D0-D4
...
UNLK A1

Or use a combination of both.
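
One possible reading of such a combination, offered only as a sketch: LINK sets up a frame with local variables, and LEA then makes room for the saved registers. The frame pointer A6 and the 8 bytes of locals are assumptions for this example, not something stated in the post.

```asm
        LINK.W  A6,#-8          ; frame pointer plus 8 bytes of locals (size is hypothetical)
        LEA.L   -20(SP),SP      ; room for five longwords: D0-D4
        MOVEM.L D0-D4,(SP)      ; save D0-D4 at the new top of stack
        ; ... routine body, locals addressed as -4(A6) and -8(A6) ...
        MOVEM.L (SP),D0-D4      ; SP must be back at the save area here
        UNLK    A6              ; SP := A6, then A6 is popped; this also releases the save area
```

Because UNLK reloads SP from the frame pointer, no matching LEA 20(SP),SP is needed before it in this arrangement.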

regards,
Rudy Van Hecke
Anerma Belgium
Tel: +32 14 26 69 51
Fax: +32 14 26 69 55


Hervé SALVAT
2002-11-26 17:27:48 UTC
You first have to create some room on the stack, then push the registers:

lea -60(%sp),%sp
movem.l %d0-%d7/%a0-%a6,%a7@

Then, after retrieving the registers from the stack, you have to adjust the
stack pointer manually:

movem.l %a7@,%d0-%d7/%a0-%a6
lea 60(%sp),%sp

On V2 core processors without data cache, MOVEM is far more efficient than
separate MOVEs, because it is THE instruction (the only one) that performs
line reads or writes. There is also a great benefit in using it for memcpy.
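
A minimal sketch of what a MOVEM-based copy loop could look like, in GNU assembler syntax. This is not the optimized memcpy discussed elsewhere in the thread; the label and the register assignments are assumptions chosen for the example, and it only handles long-aligned buffers whose size is a multiple of 16 bytes.

```asm
| Assumptions (not from the thread): %a0 = source, %a1 = destination,
| both long-aligned; %d0 = number of 16-byte blocks, at least 1.
| %d1-%d4 are scratch and must be saved by the caller if they are live.
copy16:
        movem.l (%a0),%d1-%d4   | read 16 bytes with one instruction
        movem.l %d1-%d4,(%a1)   | write 16 bytes with one instruction
        lea     16(%a0),%a0     | MOVEM does not advance the pointers,
        lea     16(%a1),%a1     | so step them by hand (LEA leaves the flags alone)
        subq.l  #1,%d0          | one block done; sets the Z flag at zero
        bne     copy16          | loop until the count reaches zero
```

The two LEA instructions are the extra pointer bookkeeping that MOVEM requires; the hope is that their cost is outweighed by moving four longwords per memory instruction.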


Hervé SALVAT
mailto:***@lord-ing.com

LORD Ingénierie
Z.A. de Marly BP110
2 rue Fresnel
91410 CORBREUSE
tel + 33 (0)1 64 551 551
fax +33 (0)1 64 595 595
http://www.lord-ing.com
David Webster
2002-11-27 14:34:52 UTC
Hervé,
I was interested in your comment regarding MOVEM being the most efficient
instruction for multiple reads/writes, and especially for memcpy.

Have you done any tests to quantify how much more efficient it is compared
with a number of "move.l (a0)+,(a1)+" instructions? Especially as, when
using MOVEM, you also need additional instructions to advance the source
and destination "pointers". Do you have a sample memcpy written using
MOVEM?

I'm always looking for ways to squeeze more out of the processor so
thanks for this tip - I think I'll investigate this further.

Regards
Dave Webster
Hervé SALVAT
2002-11-27 17:54:28 UTC
Dave

I ran a simple test on a ***@62.5MHz: copying 2048 words from DRAM to DRAM.

Cache disabled:
    standard memcpy     592 us
    optimized memcpy    355 us
Cache enabled:
    standard memcpy     416 us
    optimized memcpy    167 us

Actually, I developed video-processing algorithms. The trick is to read
data into four data registers at a time with a MOVEM, process the data
register by register, and then write the four processed registers back with
a single MOVEM. Of course, this is assembly code that is not very easy to
read, but the speed improvement is impressive.
You must take care to long-align the data: a misaligned MOVEM works, but
byte by byte, and you lose all the speed improvement.
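
The read, process, write-back pattern might be sketched as follows in GNU assembler syntax. The label, the register assignments, and the use of NOT.L (a simple bitwise inversion) as a stand-in for the real per-pixel processing are all assumptions made for illustration.

```asm
| Assumptions: %a0 = long-aligned buffer processed in place,
| %d0 = number of 16-byte groups, at least 1; %d1-%d4 are scratch.
invert16:
        movem.l (%a0),%d1-%d4   | fetch four longwords at once
        not.l   %d1             | process register by register
        not.l   %d2             | (NOT.L is only a placeholder operation)
        not.l   %d3
        not.l   %d4
        movem.l %d1-%d4,(%a0)   | write the four results back at once
        lea     16(%a0),%a0     | advance to the next group
        subq.l  #1,%d0          | one group done
        bne     invert16        | repeat until the count reaches zero
```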

For the figures above, I used the standard memcpy from clib and an optimized
memcpy by Dave Fiddes, whose source file begins:

/* memcpy.S -- optimised memcpy() for Motorola ColdFire
 *
 * Copyright (c) 2000 David J. Fiddes
 */

There is also a memset. If you cannot find it on the Internet, please mail me.

Regards
Hervé
Charles Esson
2002-11-26 08:37:25 UTC
It depended on the chip, but the rough rule for the 68000 was that you had to
push more than about 3 registers to make the MOVEM instruction worthwhile.

For the ColdFire, if you want to use the MOVEM instruction, you have to do the
stack pointer arithmetic yourself. Five registers (D0-D4) need 20 bytes.

e.g.
SUBA.L #20,SP
MOVEM.L D0-D4,(SP)
.
MOVEM.L (SP),D0-D4
ADDA.L #20,SP

Or you can use LEA

LEA -20(SP),SP
MOVEM.L D0-D4,(SP)

LEA is the better option if the number of registers is greater than 2; use
ADDQ and SUBQ if the count is 2 or less.
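
The cut-off at two registers follows from the quick-immediate range: ADDQ and SUBQ only encode values from 1 to 8, which covers at most two longwords. A small sketch of that case, with D0 and D1 chosen arbitrarily for the example:

```asm
        SUBQ.L  #8,SP           ; room for two longwords (8 is the largest quick immediate)
        MOVEM.L D0-D1,(SP)      ; save D0 and D1
        ; ... routine body ...
        MOVEM.L (SP),D0-D1      ; restore D0 and D1
        ADDQ.L  #8,SP           ; release the space
```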

I use the 68000 rough rule to decide between MOVE and MOVEM; I have never
looked at the ColdFire timing to work out where the right changeover point is.