• Re: Questions on arm-gcc linking: .ARM.exidx and .relocate sections

    From David Brown@21:1/5 to pozz on Thu Jan 9 13:44:42 2025
    On 09/01/2025 09:47, pozz wrote:
    I studied the output files from a build process of Atmel Studio project
    for SAMD20 MCU that is a Cortex-M0+.
    The IDE uses arm-gcc compiler.

    The strange thing I noticed was the last Flash address used: in lss it
    is 2'06a4. Indeed lss file ends with:



    .ARM.exidx      0x000206a4        0x8
     *(.ARM.exidx* .gnu.linkonce.armexidx.*)
     .ARM.exidx     0x000206a4        0x8 c:/program files (x86)/atmel/studio/7.0/toolchain/arm/arm-gnu-toolchain/bin/../lib/gcc/arm-none-eabi/6.3.1/thumb/v6-m\libgcc.a(_udivmoddi4.o)
                    [!provide]                PROVIDE (__exidx_end, .)

    This shows that the ".ARM.exidx" section is being pulled in by the code
    for the "_udivmoddi4.o" object file in libgcc.a. _udivmoddi4 is a
    function that does division and modulo of 64-bit unsigned integers (on
    targets that don't have a matching hardware instruction). But since it
    is a "linkonce" section, it could also be pulled in by many other
    functions - "linkonce" sections get merged automatically.

    My understanding is that this section and the following few bytes are
    required for stack unwinding for C++ exceptions. Even if you are not
    using C++, or using it with exceptions disabled, there is still a very
    small amount of such data generated and included in the C library
    builds, because someone might call these functions in combination with
    C++ exceptions. It is, I would say, too small to worry about in all but
    the tightest memory situations.



    However I noticed another strange thing. The hex file that I use for
    production doesn't end at address 2'06AC, but at address 2'08CC. That
    is another 0x220 = 544 bytes.


    After exploring the output files I found the .relocate section in the
    map file. It seems to be linked to RAM (0x2000'0000):


    .relocate       0x20000000      0x220 load address 0x000206ac
                    0x20000000                . = ALIGN (0x4)
                    0x20000000                _srelocate = .
     *(.ramfunc .ramfunc.*)
     *(.data .data.*)
     .data.memset_func
                    0x20000000        0x4 src/mbedtls/library/platform_util.o
     .data.g_interrupt_enabled
                    0x20000004        0x1 src/ports/samd20/ASF/common/utils/interrupt/interrupt_sam_nvic.o
                    0x20000004                g_interrupt_enabled

    <snip>

                    0x20000220                _erelocate = .

    .bss            0x20000220     0x611c load address 0x000208d0
                    0x20000220                . = ALIGN (0x4)
                    0x20000220                _sbss = .
                    0x20000220                _szero = .


    However I think it is linked in Flash and copied to RAM by the startup
    code.

    Yes.


    From what I understand, they are global/static variables initialized in
    the declaration (with a startup value).


    Yes, that is exactly what it is.

    Uninitialised file-scope and static data in C goes in the ".bss"
    section, linked to ram. There is code in the crt0.o file (or another
    startup file) that clears the .bss to zero.

    Initialised file-scope and static data goes in the ".data" section.
    This is linked to ram (i.e., the addresses of the variables are in ram)
    but there is also a copy in flash with the initialisation data. The
    pre-main startup code copies the data from flash to the .data section.
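
    As a rough illustration of those two startup steps, here is a minimal
    sketch in C that can also be run on a PC. The function names copy_data
    and zero_bss are made up for this example. _srelocate, _erelocate and
    _szero are taken from the map file quoted above, while _etext and _ezero
    are assumed names for the start of the flash image and the end of .bss;
    the real startup file for a given project may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy the initialisation image (in flash) to its run-time home (in ram). */
static void copy_data(const uint32_t *load, uint32_t *start, const uint32_t *end)
{
    assert(start <= end);
    while (start < end) {
        *start++ = *load++;
    }
}

/* Clear the zero-initialised region word by word. */
static void zero_bss(uint32_t *start, const uint32_t *end)
{
    assert(start <= end);
    while (start < end) {
        *start++ = 0u;
    }
}

/*
 * On the target, a reset handler might call these with linker symbols
 * before main() runs. _etext and _ezero are assumed names here:
 *
 *     extern uint32_t _etext, _srelocate, _erelocate, _szero, _ezero;
 *     copy_data(&_etext, &_srelocate, &_erelocate);
 *     zero_bss(&_szero, &_ezero);
 */
```

    Passing the boundaries as parameters, rather than hard-coding the
    linker symbols, is only done so that the loops can be tried out with
    ordinary arrays.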

    It is also possible to link functions to ram - they are copied across in
    the same way (that's the ".ramfunc" section mentioned in your map file).
    You might do this for speed-critical code on a microcontroller with
    slow flash, or for functions used by flash programming routines.
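
    A function can be placed in that section with gcc's "section"
    attribute. The sketch below assumes the linker script collects
    ".ramfunc" into .relocate, as the map file above suggests; the RAMFUNC
    macro and the sum_words function are hypothetical names used only for
    illustration. The attribute is applied only when compiling for ARM so
    that the same file still builds on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Place the function in .ramfunc on ARM targets; plain function elsewhere. */
#if defined(__GNUC__) && defined(__arm__)
#define RAMFUNC __attribute__((section(".ramfunc"), noinline))
#else
#define RAMFUNC
#endif

/* Example of a small routine one might want to run from ram. */
RAMFUNC uint32_t sum_words(const uint32_t *p, size_t n)
{
    uint32_t sum = 0u;

    assert(p != NULL || n == 0u);
    for (size_t i = 0; i < n; i++) {
        sum += p[i];
    }
    return sum;
}
```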

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Brown@21:1/5 to pozz on Fri Jan 10 11:03:11 2025
    On 09/01/2025 22:28, pozz wrote:
    On 09/01/2025 13:44, David Brown wrote:
    On 09/01/2025 09:47, pozz wrote:
    I studied the output files from a build process of Atmel Studio
    project for SAMD20 MCU that is a Cortex-M0+.
    The IDE uses arm-gcc compiler.

    The strange thing I noticed was the last Flash address used: in lss
    it is 2'06a4. Indeed lss file ends with:



    .ARM.exidx      0x000206a4        0x8
      *(.ARM.exidx* .gnu.linkonce.armexidx.*)
      .ARM.exidx     0x000206a4        0x8 c:/program files (x86)/atmel/studio/7.0/toolchain/arm/arm-gnu-toolchain/bin/../lib/gcc/arm-none-eabi/6.3.1/thumb/v6-m\libgcc.a(_udivmoddi4.o)
                     [!provide]                PROVIDE (__exidx_end, .)

    This shows that the ".ARM.exidx" section is being pulled in by the
    code for the "_udivmoddi4.o" object file in libgcc.a.  _udivmoddi4 is
    a function that does division and modulo of 64-bit unsigned integers
    (on targets that don't have a matching hardware instruction).  But
    since it is a "linkonce" section, it could also be pulled in by many
    other functions - "linkonce" sections get merged automatically.

    Ok, but what are those 8 bytes? Code? Values? It is strange that there
    isn't any info in the lss file.


    Unfortunately, if you want an answer to that, you need to dig into the
    murky depths of how exception processing and stack unwinding are done.
    It's complicated, it will involve a great deal of effort searching,
    reading, experimenting, and analysing. And you'll learn pretty much
    nothing of use unless you are thinking of making your own C++ compiler
    from scratch - it's not even particularly useful if you are using C++
    and have exceptions enabled. (Looking at the size of the sections might
    be of interest to see the overhead exceptions have on code size.)

    It would be nice if I could give you a clearer answer, or point you to a
    simple explanation online, but I'm afraid I can't. And while I don't
    know everything about this kind of thing, I know more than most - I have
    an unhealthy interest in the details of toolchains. So if I can't give
    you a full answer, you are probably just going to have to accept it as
    an unexplained mystery unless you want to do a lot of googling. (If
    someone else here actually knows more useful details, please let us know!)
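
    As a starting point for that digging: the ARM exception handling ABI
    (EHABI) document describes each .ARM.exidx entry as two 32-bit words,
    which would account for a size of 8 bytes for a single entry. The first
    word is a 31-bit place-relative offset to the start of a function. The
    second is either the special value 1 (EXIDX_CANTUNWIND), a compact
    unwind description stored inline (bit 31 set), or a 31-bit
    place-relative offset to an entry in .ARM.extab. The sketch below
    decodes such a pair; the type and function names are invented for the
    example, and I have not checked which kind of entry this particular
    build contains.

```c
#include <assert.h>
#include <stdint.h>

enum exidx_kind {
    EXIDX_KIND_CANTUNWIND,  /* second word == 1 */
    EXIDX_KIND_INLINE,      /* bit 31 set: unwind data held in the word */
    EXIDX_KIND_EXTAB        /* bit 31 clear: prel31 offset into .ARM.extab */
};

/* Sign-extend a 31-bit place-relative offset (bit 31 is ignored). */
static int32_t prel31_offset(uint32_t word)
{
    uint32_t v = word & 0x7fffffffu;
    return (int32_t)(v ^ 0x40000000u) - 0x40000000;
}

/* Function start address: address of the first word plus its offset. */
static uint32_t exidx_function_start(uint32_t place, uint32_t word0)
{
    assert((word0 & 0x80000000u) == 0u);  /* bit 31 of word 0 must be clear */
    return place + (uint32_t)prel31_offset(word0);
}

/* Classify the second word of an index table entry. */
static enum exidx_kind exidx_classify(uint32_t word1)
{
    if (word1 == 1u) {
        return EXIDX_KIND_CANTUNWIND;
    }
    if ((word1 & 0x80000000u) != 0u) {
        return EXIDX_KIND_INLINE;
    }
    return EXIDX_KIND_EXTAB;
}
```

    To try it on real data, the two words can be read out of the hex file
    at the section address shown in the map file (0x000206a4 here).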


    My understanding is that this section and the following few bytes are
    required for stack unwinding for C++ exceptions.  Even if you are not
    using C++, or using it with exceptions disabled, there is still a very
    small amount of such data generated and included in the C library
    builds, because someone might call these functions in combination with
    C++ exceptions.

    Thanks for the explanation that I take as is, without fully
    understanding :-)


    Most C functions are "transparent" to C++ exceptions. That is, if a C++
    function "foo" has a try-catch block and calls the C function "bar"
    which in turn calls the C++ function "foobar" which throws an exception,
    then the throw handling will normally jump straight back to "foo" and
    skip "bar" entirely.

    But there are a few things that can complicate the process. gcc
    extensions such as cleanup functions can be used in C code and must be
    "unwound" like C++ destructors. setjmp/longjmp make a mess of
    everything (as they always do). And some other mixes of C and C++
    functions can be a little more complex.

    Thus you end up with a small amount of data for stack unwinding and C++
    exception handling even for C code, so that you can link that C code
    with C++ code and use it freely.
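
    For anyone who has not met the cleanup extension, here is a small
    sketch in plain C. The names release_flag, use_resource and
    g_cleanups_run are invented for the example. The gcc documentation
    states that when -fexceptions is enabled the cleanup function is also
    run during stack unwinding, which is why the compiler has to emit
    unwind data for such a function; this sketch only exercises the
    ordinary scope-exit case.

```c
#include <assert.h>

/* Counts how many times the cleanup handler has run. */
static int g_cleanups_run = 0;

/* Called automatically with a pointer to the variable going out of scope. */
static void release_flag(int *flag)
{
    *flag = 0;
    g_cleanups_run++;
}

static int use_resource(int fail_early)
{
    __attribute__((cleanup(release_flag))) int in_use = 1;

    if (fail_early) {
        return -1;          /* release_flag(&in_use) runs on this path */
    }
    assert(in_use == 1);
    return 0;               /* ... and on this one */
}
```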


    It is, I would say, too small to worry about in all but the tightest
    memory situations.

    Of course, yes. My question was "What is it?", not "How to save these 8
    bytes?" if I don't need it.


    Good, because that would be a much harder question to answer well!


    However I think it is linked in Flash and copied to RAM by the startup
    code.

    Yes.


     From what I understand, they are global/static variables initialized
    in the declaration (with a startup value).


    Yes, that is exactly what it is.

    Uninitialised file-scope and static data in C goes in the ".bss"
    section, linked to ram.  There is code in the crt0.o file (or another
    startup file) that clears the .bss to zero.

    Initialised file-scope and static data goes in the ".data" section.
    This is linked to ram (i.e., the addresses of the variables are in
    ram) but there is also a copy in flash with the initialisation data.
    The pre-main startup code copies the data from flash to the .data
    section.

    I expected to see the non-volatile copy in Flash of .relocate in the map
    file. However only the copy in RAM is shown.


    The map file shows the symbols - and the symbols are all in ram. The
    only things you see for the source copy in flash are its load address
    and the symbols marking the start and end of the block to copy.
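
    The numbers from the map file do add up to the end address seen in the
    hex file. Here is the arithmetic as a small C sketch, using only the
    addresses and sizes quoted in this thread; the struct and function
    names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* One output section as it sits in flash: load address and size in bytes. */
struct flash_section {
    uint32_t load_addr;
    uint32_t size;
};

/* First address after the section's image in flash. */
static uint32_t flash_section_end(struct flash_section s)
{
    assert(s.size <= UINT32_MAX - s.load_addr);  /* no wrap-around */
    return s.load_addr + s.size;
}
```

    With .ARM.exidx at 0x000206a4 (size 0x8) the flash image reaches
    0x000206ac, which is exactly the load address shown for .relocate.
    Adding the 0x220 bytes of .relocate gives 0x000208cc, the end address
    of the hex file.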
