This is safer because the previous code assumed that the start and end
VMAs of .data and .bss were word-aligned, which is not always the case,
so the initialization code could write data outside these sections. The
ROM functions support any address boundary.
This is faster because the ROM functions are ultra optimized, using
realignment and the LDM/STM instructions, which is much better than the
previous simple loops of single word accesses.
This is smaller because the ROM functions don't require to add any code
to the target device other than simple function calls.
This makes the code simpler and more maintainable because standard
functions are not reimplemented and no assembly is used.
Note that this is also faster and smaller than the corresponding
functions from the standard string library.
Signed-off-by: Benoît Thébaudeau <benoit.thebaudeau.dev@gmail.com>
The initialization code clearing .bss is allowed to use the stack, so
the stack can not be in .bss, or this code will badly fail if it uses
the stack.
Signed-off-by: Benoît Thébaudeau <benoit.thebaudeau.dev@gmail.com>
In order to be fast, the reset_handler() function uses word accesses to
initialize the .data output section. However, most toolchains do not
automatically force the alignment of an output section LMA to use the
maximum alignment of all its input sections. Because of that, assuming
that .data contains some words, the LMA of the .data output section was
not word-aligned in some cases, resulting in an initialization performed
using slow unaligned word accesses.
This commit forces the alignment of the LMA of the .data output section
with a word boundary in order to always use fast aligned word accesses
to read the .data load area.
Note that this solution is better than using ALIGN_WITH_INPUT, both
because the latter is a new feature incompatible with older toolchains,
and because it could create a big gap between _etext and the LMA of
.data if strongly-aligned data were added to .data, although only a word
alignment is required here.
The same considerations apply to the VMA of .data. However, it is
already automatically word-aligned, both because .data contains words,
and because the end VMA of the previous output section (.socdata) is
word-aligned. Moreover, if the VMA of .data were forcibly word-aligned,
then a filled gap could appear at the beginning of this section if
strongly-aligned data were added to it, thus wasting flash memory.
Consequently, it's better not to change anything for the VMA of .data,
all the more it's very unlikely that it does not contain any word and
that the end VMA of .socdata becomes non-word-aligned, and this would
only result in a slower initialization.
Signed-off-by: Benoît Thébaudeau <benoit.thebaudeau.dev@gmail.com>
Some toolchains, like Sourcery CodeBench Lite 2013.05-23 arm-none-eabi
(http://sourcery.mentor.com/public/gnu_toolchain/arm-none-eabi/)
automatically force the alignment of an output section LMA to use the
maximum alignment of all its input sections. This toolchain uses GNU
binutils 2.23, and this automatic behavior is the same as the manual
behavior of the ALIGN_WITH_INPUT feature of GNU binutils 2.24+.
This behavior is not an issue per se, but it creates a gap between
_etext and the LMA of the .data output section if _etext does not have
the same alignment, while reset_handler() initialized this section by
copying the data from _etext to its VMA, hence an offset in the
addresses of loaded data, and missing data.
This commit fixes this issue by making reset_handler() directly use the
LMA of the .data section using LOADADDR(.data), rather than assuming
that _etext is this LMA.
Signed-off-by: Benoît Thébaudeau <benoit.thebaudeau.dev@gmail.com>
* Only enable TX by default.
* Add some magic for RX handling. When an input handler is registered:
* Automatically enable RX-related and interrupts
* Automatically lock the SERIAL PD on under all power modes
* Automatically enable the UART clock under sleep and deep sleep
* Automatically undo all of the above when the input handler becomes NULL
* As a result, modules / examples that need UART RX no longer need to clock the UART and manipulate the SERIAL PD. They simply have to specify an input handler
* Don't automatically power on the UART whenever the CM3 is active
* Before accessing the UART, make sure it is powered and clocked
* Avoid falling edge glitches
* Fix garbage characters / Explicitly wait for UART TX to complete
* Implement new style of PD locks
* Use our own shutdown sequence rather than the one provided by cc26xxware
* Shutdown from within the interrupt that requested it. This allows shutdown to take place even if the code is stuck in a loop somewhere else
* Improve DCDC/GLDO/uLDO switching logic
* Explicitly handle oscillators and retentions
Instead of using a separate data structure to request that a PD remain powered during deep sleep,
we do the same within the main LPM data structure through an additional field.
This allows us to maintain only one linked list of LPM modules and overall improves code clarity
This tutorial was written for the older implementation of CoAP, and
while it may be possible to update it, the directions include URLs and
repos that no longer exist, so it's better to just remove it.
Only the interrupt flags that have been handled must be cleared.
Otherwise, if a new interrupt occurs after the interrupt statuses are
read and before they are cleared, then it is discarded without having
been handled. This issue was particularly likely with two interrupt
trigger conditions occurring on different pins of the same port in a
short period of time.
Signed-off-by: Benoît Thébaudeau <benoit.thebaudeau.dev@gmail.com>
Power-up interrupts do not always update the regular interrupt status.
Because of that, in order not to miss power-up interrupts, the ISR must
handle both the regular and the power-up interrupt statuses.
Signed-off-by: Benoît Thébaudeau <benoit.thebaudeau.dev@gmail.com>
Introduce new useful GPIO macros to:
- get the raw interrupt status of a port,
- get the masked interrupt status of a port,
- get the power-up interrupt status of a port.
These macros are cleaner and less error-prone than raw register access
code copied all over the place.
Signed-off-by: Benoît Thébaudeau <benoit.thebaudeau.dev@gmail.com>
Behave just like the CS8900A driver: Both the CS8900A and the LAN91C96 dynamically share a buffer for received packets and packets to be send. If the chip is exposed to a network with a lot of broadcasts the shared buffer might fill quicker with received packets than the 6502 reads them (via polling). So we might need to drop some received packets in order to be able to send anything at all.
OR-ing an offset to a base address instead of adding it is dangerous
because it can only work if the base address is aligned enough for the
offset.
Moreover, if the base address or the offset has a value unknown at
compile time, then the assembly instructions dedicated to 'base +
offset' addressing on most CPUs can't be emitted by the compiler because
this would require the alignment of the base address against the offset
to be known in order to optimize 'base | offset' into 'base + offset'.
In that case, the compiler has to emit more instructions in order to
compute 'base | offset' on most CPUs, e.g. on ARM, which means larger
binary size and slower execution.
Hence, replace all occurrences of 'base | offset' with 'base + offset'.
This must become a coding rule.
Here are the results for the cc2538-demo example:
- Compilation of uart_init():
* before:
REG(regs->base | UART_CC) = 0;
200b78: f446 637c orr.w r3, r6, #4032 ; 0xfc0
200b7c: f043 0308 orr.w r3, r3, #8
200b80: 2200 movs r2, #0
200b82: 601a str r2, [r3, #0]
* now:
REG(regs->base + UART_CC) = 0;
200b7a: 2300 movs r3, #0
200b7c: f8c4 3fc8 str.w r3, [r4, #4040] ; 0xfc8
- Size of the .text section:
* before: 0x4c7c
* now: 0x4c28
* saved: 84 bytes
Signed-off-by: Benoît Thébaudeau <benoit.thebaudeau.dev@gmail.com>