WIP for reliable unicast and BLE software update #136

Merged
merged 27 commits into from
May 19, 2020
190a3c2
filename typo
geeksville May 10, 2020
2fa5955
minor fixups to get nrf52 building again
geeksville May 10, 2020
8b911ab
Cleanup build for NRF52 targets
geeksville May 10, 2020
c12fb69
update protos
geeksville May 10, 2020
86ae69d
refactor so I can track and ignore recent packets of any type
geeksville May 11, 2020
9f05ad2
remove random delay hack from broadcast, since we now do that for all…
geeksville May 11, 2020
b6a202d
runs again with new protobufs
geeksville May 12, 2020
a0b43b9
Send "unset" for hwver and swver if they were unset
geeksville May 13, 2020
140e298
fix rare gurumeditation if we are unlucky and some ISR code is in ser…
geeksville May 14, 2020
14fdd33
move bluetooth OTA back into main tree for now
geeksville May 14, 2020
5ec5248
complete ble ota move
geeksville May 14, 2020
6961853
ble software update fixes
geeksville May 15, 2020
db72fac
Merge remote-tracking branch 'root/master'
geeksville May 15, 2020
95e952b
todo update
geeksville May 16, 2020
ef1463a
have tbeam charge at max rate (450mA)
geeksville May 17, 2020
efc2395
Fix #133 - force deep sleep if battery reaches 10%
geeksville May 17, 2020
ef831a0
Fix leaving display on in deep sleep.
geeksville May 17, 2020
19f5a5e
oops - use correct battery shutoff voltage
geeksville May 17, 2020
53c3d9b
doc updates
geeksville May 19, 2020
26d3ef5
Use the hop_limit field of MeshPacket to limit max delivery depth in
geeksville May 19, 2020
976bdad
sniffReceived now allows router to inspect packets not destined for t…
geeksville May 19, 2020
7aa47cf
Merge remote-tracking branch 'root/master' into reliable
geeksville May 19, 2020
cca4867
want_ack flag added
geeksville May 19, 2020
8bf4919
wip reliable unicast (1 hop)
geeksville May 19, 2020
6ba960c
one hop reliable ready for testing
geeksville May 19, 2020
c65b518
less logspam
geeksville May 19, 2020
71041e8
reliable unicast 1 hop works!
geeksville May 19, 2020
3 changes: 3 additions & 0 deletions bin/build-all.sh
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,9 @@ function do_build {
cp $SRCELF $OUTDIR/elfs/firmware-$ENV_NAME-$COUNTRY-$VERSION.elf
}

# Make sure our submodules are current
git submodule update

# Important to pull latest version of libs into all device flavors, otherwise some devices might be stale
platformio lib update

Expand Down
46 changes: 46 additions & 0 deletions boards/nrf52840_dk_modified.json
@@ -0,0 +1,46 @@
{
"build": {
"arduino": {
"ldscript": "nrf52840_s140_v6.ld"
},
"core": "nRF5",
"cpu": "cortex-m4",
"extra_flags": "-DARDUINO_NRF52840_PCA10056 -DNRF52840_XXAA",
"f_cpu": "64000000L",
"hwids": [["0x239A", "0x4404"]],
"usb_product": "SimPPR",
"mcu": "nrf52840",
"variant": "pca10056-rc-clock",
"variants_dir": "variants",
"bsp": {
"name": "adafruit"
},
"softdevice": {
"sd_flags": "-DS140",
"sd_name": "s140",
"sd_version": "6.1.1",
"sd_fwid": "0x00B6"
},
"bootloader": {
"settings_addr": "0xFF000"
}
},
"connectivity": ["bluetooth"],
"debug": {
"jlink_device": "nRF52840_xxAA",
"onboard_tools": ["jlink"],
"svd_path": "nrf52840.svd"
},
"frameworks": ["arduino"],
"name": "A modified NRF52840-DK devboard (Adafruit BSP)",
"upload": {
"maximum_ram_size": 248832,
"maximum_size": 815104,
"require_upload_port": true,
"speed": 115200,
"protocol": "jlink",
"protocols": ["jlink", "nrfjprog", "stlink"]
},
"url": "https://meshtastic.org/",
"vendor": "Nordic Semi"
}
2 changes: 1 addition & 1 deletion boards/ppr.json
Expand Up @@ -10,7 +10,7 @@
"hwids": [["0x239A", "0x4403"]],
"usb_product": "PPR",
"mcu": "nrf52840",
"variant": "pca10056-rc-clock",
"variant": "ppr",
"variants_dir": "variants",
"bsp": {
"name": "adafruit"
Expand Down
2 changes: 1 addition & 1 deletion docs/README.md
@@ -1,7 +1,7 @@
# What is Meshtastic?

Meshtastic is a project that lets you use
inexpensive (\$30 ish) GPS radios as an extensible, super long battery life mesh GPS communicator. These radios are great for hiking, skiing, paragliding - essentially any hobby where you don't have reliable internet access. Each member of your private mesh can always see the location and distance of all other members and any text messages sent to your group chat.
inexpensive (\$30 ish) GPS radios as an extensible, long battery life, secure, mesh GPS communicator. These radios are great for hiking, skiing, paragliding - essentially any hobby where you don't have reliable internet access. Each member of your private mesh can always see the location and distance of all other members and any text messages sent to your group chat.

The radios automatically create a mesh to forward packets as needed, so everyone in the group can receive messages from even the furthest member. The radios will optionally work with your phone, but no phone is required.

Expand Down
21 changes: 10 additions & 11 deletions docs/software/TODO.md
Expand Up @@ -5,21 +5,16 @@ Items to complete soon (next couple of alpha releases).
- lower wait_bluetooth_secs to 30 seconds once we have the GPS power on (but GPS in sleep mode) across light sleep. For the time
being I have it set at 2 minutes to ensure enough time for a GPS lock from scratch.

- remeasure wake time power draws now that we run CPU down at 80MHz

# AXP192 tasks

- figure out why this fixme is needed: "FIXME, disable wake due to PMU because it seems to fire all the time?"
- "AXP192 interrupt is not firing, remove this temporary polling of battery state"
- make debug info screen show real data (including battery level & charging) - close corresponding github issue

# Medium priority

Items to complete before the first beta release.

- Don't store position packets in the to phone fifo if we are disconnected. The phone will get that info for 'free' when it
fetches the fresh nodedb.
- Use the RFM95 sequencer to stay in idle mode most of the time, then automatically go to receive mode and automatically go from transmit to receive mode. See 4.2.8.2 of manual.
- Use 32 bits for message IDs
- Use fixed32 for node IDs
- Remove the "want node" node number arbitration process
- Don't store position packets in the to phone fifo if we are disconnected. The phone will get that info for 'free' when it
fetches the fresh nodedb.
- Use the RFM95 sequencer to stay in idle mode most of the time, then automatically go to receive mode and automatically go from transmit to receive mode. See 4.2.8.2 of manual.
- possibly switch to https://github.com/SlashDevin/NeoGPS for gps comms
- good source of battery/signal/gps icons https://materialdesignicons.com/
- research and implement better mesh algorithm - investigate changing routing to https://github.com/sudomesh/LoRaLayer2 ?
Expand Down Expand Up @@ -204,3 +199,7 @@ Items after the first final candidate release.
- enable fast lock and low power inside the gps chip
- Make a FAQ
- add a SF12 transmit option for _super_ long range
- figure out why this fixme is needed: "FIXME, disable wake due to PMU because it seems to fire all the time?"
- "AXP192 interrupt is not firing, remove this temporary polling of battery state"
- make debug info screen show real data (including battery level & charging) - close corresponding github issue
- remeasure wake time power draws now that we run CPU down at 80MHz
8 changes: 5 additions & 3 deletions docs/software/cypto.md → docs/software/crypto.md
Expand Up @@ -4,8 +4,8 @@ Cryptography is tricky, so we've tried to 'simply' apply standard crypto solutio
the project developers are not cryptography experts. Therefore we ask two things:

- If you are a cryptography expert, please review these notes and our questions below. Can you help us by reviewing our
notes below and offering advice? We will happily give as much or as little credit as you wish as our thanks ;-).
- Consider our existing solution 'alpha' and probably fairly secure against an not very aggressive adversary. But until
notes below and offering advice? We will happily give as much or as little credit as you wish ;-).
- Consider our existing solution 'alpha' and probably fairly secure against a not particularly aggressive adversary. But until
it is reviewed by someone smarter than us, assume it might have flaws.

## Notes on implementation
Expand All @@ -16,7 +16,7 @@ the project developers are not cryptography experts. Therefore we ask two things

Parameters for our CTR implementation:

- Our AES key is 256 bits, shared as part of the 'Channel' specification.
- Our AES key is 128 or 256 bits, shared as part of the 'Channel' specification.
- Each SubPacket will be sent as a series of 16 byte BLOCKS.
- The node number concatenated with the packet number is used as the NONCE. This counter will be stored in flash in the device and should essentially never repeat. If the user makes a new 'Channel' (i.e. picking a new random 256 bit key), the packet number will start at zero. The packet number is sent
in cleartext with each packet. The node number can be derived from the "from" field of each packet.
Expand All @@ -35,4 +35,6 @@ Note that for both stategies, sizes are measured in blocks and that an AES block
## Remaining todo

- Make the packet numbers 32 bit
- Confirm the packet #s are stored in flash across deep sleep (and otherwise in in RAM)
- Have the app change the crypto key when the user generates a new channel
- Implement for NRF52 [NRF52](https://infocenter.nordicsemi.com/topic/com.nordic.infocenter.sdk5.v15.0.0/lib_crypto_aes.html#sub_aes_ctr)
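
One way to picture the CTR parameters described in crypto.md is the sketch below. It only builds the 16-byte counter block that would be fed to AES for each block of a packet; it does not perform any encryption. The byte layout (field widths, ordering, endianness) and the names `make_ctr_block` and `blocks_needed` are assumptions for illustration, not the firmware's actual format.

```python
import struct

AES_BLOCK_SIZE = 16  # bytes per AES block


def make_ctr_block(node_num: int, packet_num: int, block_index: int = 0) -> bytes:
    """Build one 16-byte CTR input block.

    Assumed layout (hypothetical, not taken from the firmware), little-endian:
    4-byte node number | 8-byte packet number | 4-byte per-packet block counter.
    """
    return struct.pack("<IQI", node_num, packet_num, block_index)


def blocks_needed(payload_len: int) -> int:
    """Number of 16-byte blocks covering a payload (ceiling division)."""
    return (payload_len + AES_BLOCK_SIZE - 1) // AES_BLOCK_SIZE
```

With CTR mode, reusing the same (node number, packet number) pair under the same key would repeat the keystream, which is why the notes above want the packet number persisted in flash and restarted only when a new channel key is generated.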
74 changes: 67 additions & 7 deletions docs/software/mesh-alg.md
@@ -1,19 +1,80 @@
# Mesh broadcast algorithm

FIXME - instead look for standard solutions. this approach seems really suboptimal, because too many nodes will try to rebroast. If
all else fails could always use the stock Radiohead solution - though super inefficient.

great source of papers and class notes: http://www.cs.jhu.edu/~cs647/

reliable messaging tasks (stage one for DSR):

- DONE generalize naive flooding
- DONE add a max hops parameter, use it for broadcast as well (0 means adjacent only, 1 is one forward etc...). Store as three bits in the header.
- DONE add a 'snoopReceived' hook for all messages that pass through our node.
- DONE use the same 'recentmessages' array used for broadcast msgs to detect duplicate retransmitted messages.
- DONE in the router receive path?, send an ack packet if want_ack was set and we are the final destination. FIXME, for now don't handle multihop or merging of data replies with these acks.
- DONE keep a list of packets waiting for acks
- DONE for each message keep a count of # retries (max of three). Local to the node, only for the most immediate hop, ignorant of multihop routing.
- DONE delay some random time for each retry (large enough to allow for acks to come in)
- DONE once an ack comes in, remove the packet from the retry list and deliver the ack to the original sender
- DONE after three retries, deliver a no-ack packet to the original sender (i.e. the phone app or mesh router service)
- DONE test one hop ack/nak with the python framework
- Do stress test with acks
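
The one-hop reliable delivery steps listed above (a pending list, at most three retries, a random delay per retry, ack removes the entry, nak after the last retry) can be sketched roughly as follows. This is a minimal model, not the firmware's ReliableRouter: the class name `ReliableSender`, the callback signatures, and the delay range are all hypothetical.

```python
import random

MAX_RETRIES = 3  # "max of three" retries per the task list


class ReliableSender:
    """Tracks packets awaiting an ack from the immediate next hop."""

    def __init__(self, send, deliver, retry_delay_ms=(3000, 6000), seed=None):
        self.send = send            # callback: send(packet_id)
        self.deliver = deliver      # callback: deliver(packet_id, acked: bool)
        self.retry_delay_ms = retry_delay_ms  # assumed range, not from the source
        self.rng = random.Random(seed)
        self.pending = {}           # packet_id -> [retries_done, next_tx_ms]

    def _next_deadline(self, now_ms):
        lo, hi = self.retry_delay_ms
        return now_ms + self.rng.randint(lo, hi)

    def start(self, packet_id, now_ms):
        """Transmit once and remember the packet until it is acked."""
        self.send(packet_id)
        self.pending[packet_id] = [0, self._next_deadline(now_ms)]

    def on_ack(self, packet_id):
        """An ack arrived: stop retrying and report success to the sender."""
        if self.pending.pop(packet_id, None) is not None:
            self.deliver(packet_id, True)

    def poll(self, now_ms):
        """Retransmit overdue packets, or report a nak once retries run out."""
        for packet_id, entry in list(self.pending.items()):
            retries, deadline = entry
            if now_ms < deadline:
                continue
            if retries >= MAX_RETRIES:
                del self.pending[packet_id]
                self.deliver(packet_id, False)
            else:
                self.send(packet_id)
                entry[0] = retries + 1
                entry[1] = self._next_deadline(now_ms)
```

In this model a packet is transmitted once by `start`, retransmitted up to three times by `poll`, and reported as failed on the poll after the last retry's deadline passes. Comparing timestamps with `<` ignores counter wraparound, which the rollover item in the optimizations list would need to address.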

dsr tasks

- do "hop by hop" routing
- when sending, if destnodeinfo.next_hop is zero (and no message is already waiting for an arp for that node), startRouteDiscovery() for that node. Queue the message in the 'waiting for arp queue' so we can send it later when then the arp completes.
- otherwise, use next_hop and start sending a message (with ack request) towards that node.
- Don't use broadcasts for the network pings (close open github issue)
- add ignoreSenders to radioconfig to allow testing different mesh topologies by refusing to see certain senders
- test multihop delivery with the python framework
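
The hop-by-hop sending rule above (start route discovery when no next hop is known, queue the message, send it once the route resolves) might look something like this. The class `HopRouter` and its callbacks are illustrative names only; `start_route_discovery` stands in for the `startRouteDiscovery()` mentioned in the list.

```python
class HopRouter:
    """Queues messages for destinations whose next hop is not yet known."""

    def __init__(self, send_to, start_route_discovery):
        self.send_to = send_to                      # send_to(next_hop, dest, payload)
        self.start_route_discovery = start_route_discovery  # called with dest
        self.next_hop = {}           # dest -> neighbor; missing or 0 means unknown
        self.waiting_for_route = {}  # dest -> payloads queued until discovery ends

    def send(self, dest, payload):
        hop = self.next_hop.get(dest, 0)
        if hop == 0:
            # Only start discovery if nothing is already waiting for this node.
            if dest not in self.waiting_for_route:
                self.start_route_discovery(dest)
            self.waiting_for_route.setdefault(dest, []).append(payload)
        else:
            self.send_to(hop, dest, payload)

    def route_found(self, dest, hop):
        """Discovery completed: record the route and flush queued messages in order."""
        self.next_hop[dest] = hop
        for payload in self.waiting_for_route.pop(dest, []):
            self.send_to(hop, dest, payload)
```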

optimizations / low priority:

- low priority: think more careful about reliable retransmit intervals
- make ReliableRouter.pending threadsafe
- bump up PacketPool size for all the new ack/nak/routing packets
- handle 51 day rollover in doRetransmissions
- use a priority queue for the messages waiting to send. Send acks first, then routing messages, then data messages, then broadcasts?
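
The priority-queue idea in the last item could be sketched with a binary heap as below. The ordering (acks, then routing, then data, then broadcasts) follows the list item; the names `Priority` and `TxQueue` are hypothetical, and a sequence counter is an added assumption to keep packets of equal priority in FIFO order.

```python
import heapq
import itertools
from enum import IntEnum


class Priority(IntEnum):
    # Lower value is sent first.
    ACK = 0
    ROUTING = 1
    DATA = 2
    BROADCAST = 3


class TxQueue:
    """Transmit queue that pops the most urgent packet first."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: FIFO within one priority

    def push(self, priority, packet):
        heapq.heappush(self._heap, (priority, next(self._seq), packet))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```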

when we receive any packet

- sniff and update tables (especially useful to find adjacent nodes). Update user, network and position info.
- if we need to route() that packet, resend it to the next_hop based on our nodedb.
- if it is broadcast or destined for our node, deliver locally
- handle routereply/routeerror/routediscovery messages as described below
- then free it
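
A rough model of the receive path above, combined with the duplicate detection and hop limit from the reliable messaging list, is sketched here. Everything in it is an assumption made for illustration: the `Packet` fields, the `BROADCAST_ADDR` value, and the choice to record results in lists rather than transmit. Route reply, error, and discovery handling are left out.

```python
from dataclasses import dataclass, replace

BROADCAST_ADDR = 0xFF  # assumed broadcast address, not taken from the source


@dataclass(frozen=True)
class Packet:
    id: int
    sender: int
    dest: int
    hop_limit: int  # 0 means adjacent nodes only


class Node:
    def __init__(self, node_num):
        self.node_num = node_num
        self.seen = set()     # (sender, id) pairs; real firmware would bound this
        self.next_hop = {}    # dest -> neighbor to forward through
        self.delivered = []   # packets handed to the local application
        self.forwarded = []   # (next_hop, packet) pairs queued for resend

    def receive(self, p, heard_from):
        # Sniff first: whoever we heard this from is an adjacent node.
        self.next_hop.setdefault(heard_from, heard_from)
        # Drop duplicates caused by retransmission or flooding.
        key = (p.sender, p.id)
        if key in self.seen:
            return
        self.seen.add(key)
        # Deliver locally if it is broadcast or addressed to us.
        if p.dest in (BROADCAST_ADDR, self.node_num):
            self.delivered.append(p)
        # Forward if it is not ours alone and hops remain.
        if p.dest != self.node_num and p.hop_limit > 0:
            if p.dest == BROADCAST_ADDR:
                hop = BROADCAST_ADDR
            else:
                hop = self.next_hop.get(p.dest)
            if hop is not None:
                self.forwarded.append((hop, replace(p, hop_limit=p.hop_limit - 1)))
```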

routeDiscovery

- if we've already passed through us (or is from us), then it ignore it
- use the nodes already mentioned in the request to update our routing table
- if they were looking for us, send back a routereply
- if max_hops is zero and they weren't looking for us, drop (FIXME, send back error - I think not though?)
- if we receive a discovery packet, we use it to populate next_hop (if needed) towards the requester (after decrementing max_hops)
- if we receive a discovery packet, and we have a next_hop in our nodedb for that destination we send a (reliable) we send a route reply towards the requester

when sending any reliable packet

- if we get back a nak, send a routeError message back towards the original requester. all nodes eavesdrop on that packet and update their route caches

when we receive a routereply packet

- update next_hop on the node, if the new reply needs fewer hops than the existing one (we prefer shorter paths). fixme, someday use a better heuristic
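
The "prefer shorter paths" rule for route replies reduces to a small comparison, sketched below. The route table shape (a dict mapping destination to a `(via, hops)` tuple) and the function name `on_route_reply` are assumptions for illustration.

```python
def on_route_reply(routes, dest, via, hops):
    """Adopt the advertised route only if it needs fewer hops than the current one.

    routes maps dest -> (via, hops). Returns True if the table changed.
    """
    current = routes.get(dest)
    if current is None or hops < current[1]:
        routes[dest] = (via, hops)
        return True
    return False
```

Ties keep the existing route in this sketch, which avoids flapping between equally long paths; the fixme above suggests a better heuristic may replace hop count later.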

when we receive a routeError packet

- delete the route for that failed recipient, restartRouteDiscovery()
- if we receive routeerror in response to a discovery,
- fixme, eventually keep caches of possible other routes.

TODO:

- DONE reread the radiohead mesh implementation - hop to hop acknoledgement seems VERY expensive but otherwise it seems like DSR
- optimize our generalized flooding with heuristics, possibly have particular nodes self mark as 'router' nodes.

- DONE reread the radiohead mesh implementation - hop to hop acknowledgement seems VERY expensive but otherwise it seems like DSR
- DONE read about mesh routing solutions (DSR and AODV)
- DONE read about general mesh flooding solutions (naive, MPR, geo assisted)
- DONE reread the disaster radio protocol docs - seems based on Babel (which is AODVish)
- possibly dash7? https://www.slideshare.net/MaartenWeyn1/dash7-alliance-protocol-technical-presentation https://github.com/MOSAIC-LoPoW/dash7-ap-open-source-stack - does the opensource stack implement multihop routing? flooding? their discussion mailing list looks dead-dead
- REJECTED - seems dying - possibly dash7? https://www.slideshare.net/MaartenWeyn1/dash7-alliance-protocol-technical-presentation https://github.com/MOSAIC-LoPoW/dash7-ap-open-source-stack - does the opensource stack implement multihop routing? flooding? their discussion mailing list looks dead-dead
- update duty cycle spreadsheet for our typical usecase
- generalize naive flooding on top of radiohead or disaster.radio? (and fix radiohead to use my new driver)

a description of DSR: https://tools.ietf.org/html/rfc4728 good slides here: https://www.slideshare.net/ashrafmath/dynamic-source-routing
good description of batman protocol: https://www.open-mesh.org/projects/open-mesh/wiki/BATMANConcept
Expand Down Expand Up @@ -77,7 +138,6 @@ look into the literature for this idea specifically.

FIXME, merge into the above:


good description of batman protocol: https://www.open-mesh.org/projects/open-mesh/wiki/BATMANConcept

interesting paper on lora mesh: https://portal.research.lu.se/portal/files/45735775/paper.pdf
Expand Down
33 changes: 23 additions & 10 deletions docs/software/nrf52-TODO.md
@@ -1,23 +1,18 @@
# NRF52 TODO

## Misc work items

## Initial work items

Minimum items needed to make sure hardware is good.

- test my hackedup bootloader on the real hardware
- add a hard fault handler
- use "variants" to get all gpio bindings
- plug in correct variants for the real board
- Use the PMU driver on real hardware
- Use new radio driver on real hardware
- Use UC1701 LCD driver on real hardware. Still need to create at startup and probe on SPI
- test the LEDs
- test the buttons
- make a new boarddef with a variant.h file. Fix pins in that file. In particular (at least):
#define PIN_SPI_MISO (46)
#define PIN_SPI_MOSI (45)
#define PIN_SPI_SCK (47)
#define PIN_WIRE_SDA (26)
#define PIN_WIRE_SCL (27)

## Secondary work items

Expand Down Expand Up @@ -45,7 +40,6 @@ Needed to be fully functional at least at the same level of the ESP32 boards. At

- use SX126x::startReceiveDutyCycleAuto to save power by sleeping and briefly waking to check for preamble bits. Change xmit rules to have more preamble bits.
- turn back on in-radio destaddr checking for RF95
- remove the MeshRadio wrapper - we don't need it anymore, just do everythin in RadioInterface subclasses.
- figure out what the correct current limit should be for the sx1262, currently we just use the default 100
- put sx1262 in sleepmode when processor gets shutdown (or rebooted), ideally even for critical faults (to keep power draw low). repurpose deepsleep state for this.
- good power management tips: https://devzone.nordicsemi.com/nordic/nordic-blog/b/blog/posts/optimizing-power-on-nrf52-designs
Expand All @@ -62,6 +56,11 @@ Needed to be fully functional at least at the same level of the ESP32 boards. At

Nice ideas worth considering someday...

- Use flego to me an iOS/linux app? https://felgo.com/doc/qt/qtbluetooth-index/ or
- Use flutter to make an iOS/linux app? https://github.com/Polidea/FlutterBleLib
- make a Mfg Controller and device under test classes as examples of custom app code for third party devs. Make a post about this. Use a custom payload type code. Have device under test send a broadcast with max hopcount of 0 for the 'mfgcontroller' payload type. mfg controller will read SNR and reply. DOT will declare failure/success and switch to the regular app screen.
- Hook Segger RTT to the nordic logging framework. https://devzone.nordicsemi.com/nordic/nordic-blog/b/blog/posts/debugging-with-real-time-terminal
- Use nordic logging for DEBUG_MSG
- use the Jumper simulator to run meshes of simulated hardware: https://docs.jumper.io/docs/install.html
- make/find a multithread safe debug logging class (include remote logging and timestamps and levels). make each log event atomic.
- turn on freertos stack size checking
Expand All @@ -72,11 +71,14 @@ Nice ideas worth considering someday...
- in addition to the main CPU watchdog, use the PMU watchdog as a really big emergency hammer
- turn on 'shipping mode' in the PMU when device is 'off' - to cut battery draw to essentially zero
- make Lorro_BQ25703A read/write operations atomic, current version could let other threads sneak in (once we start using threads)
- turn on DFU assistance in the appload using the nordic DFU helper lib call
- make the segger logbuffer larger, move it to RAM that is preserved across reboots and support reading it out at runtime (to allow full log messages to be included in crash reports). Share this code with ESP32 (use gcc noinit attribute)
- convert hardfaults/panics/asserts/wd exceptions into fault codes sent to phone
- stop enumerating all i2c devices at boot, it wastes power & time
- consider using "SYSTEMOFF" deep sleep mode, without RAM retension. Only useful for 'truly off - wake only by button press' only saves 1.5uA vs SYSTEMON. (SYSTEMON only costs 1.5uA). Possibly put PMU into shipping mode?
- change the BLE protocol to be more symmetric. Have the phone _also_ host a GATT service which receives writes to
'fromradio'. This would allow removing the 'fromnum' mailbox/notify scheme of the current approach and decrease the number of packet handoffs when a packet is received.
- Using the preceeding, make a generalized 'nrf52/esp32 ble to internet' bridge service. To let nrf52 apps do MQTT/UDP/HTTP POST/HTTP GET operations to web services.
- lower advertise interval to save power, lower ble transmit power to save power

## Old unorganized notes

Expand All @@ -102,6 +104,17 @@ Nice ideas worth considering someday...
- DONE remove unused sx1262 lib from github
- at boot we are starting our message IDs at 1, rather we should start them at a random number. also, seed random based on timer. this could be the cause of our first message not seen bug.
- add a NEMA based GPS driver to test GPS
- DONE use "variants" to get all gpio bindings
- DONE plug in correct variants for the real board
- turn on DFU assistance in the appload using the nordic DFU helper lib call
- make a new boarddef with a variant.h file. Fix pins in that file. In particular (at least):
#define PIN_SPI_MISO (46)
#define PIN_SPI_MOSI (45)
#define PIN_SPI_SCK (47)
#define PIN_WIRE_SDA (26)
#define PIN_WIRE_SCL (27)
- customize the bootloader to use proper button bindings
- remove the MeshRadio wrapper - we don't need it anymore, just do everything in RadioInterface subclasses.

```

Expand Down
31 changes: 21 additions & 10 deletions platformio.ini
Expand Up @@ -51,6 +51,7 @@ build_flags = -Wno-missing-field-initializers -Isrc -Isrc/mesh -Isrc/gps -Ilib/n
; the default is esptool
; upload_protocol = esp-prog

; monitor_speed = 115200
monitor_speed = 921600

# debug_tool = esp-prog
Expand Down Expand Up @@ -83,7 +84,7 @@ src_filter =
upload_speed = 921600
debug_init_break = tbreak setup
build_flags =
${env.build_flags} -Wall -Wextra
${env.build_flags} -Wall -Wextra -Isrc/esp32
lib_ignore = segger_rtt

; The 1.0 release of the TBEAM board
Expand All @@ -92,7 +93,7 @@ extends = esp32_base
board = ttgo-t-beam
lib_deps =
${env.lib_deps}
AXP202X_Library
https://github.com/meshtastic/AXP202X_Library.git
build_flags =
${esp32_base.build_flags} -D TBEAM_V10

Expand Down Expand Up @@ -122,11 +123,9 @@ board = ttgo-lora32-v1
build_flags =
${esp32_base.build_flags} -D TTGO_LORA_V2


; The NRF52840-dk development board
[env:nrf52dk]
; Common settings for NRF52 based targets
[nrf52_base]
platform = nordicnrf52
board = ppr
framework = arduino
debug_tool = jlink
build_type = debug ; I'm debugging with ICE a lot now
Expand All @@ -136,10 +135,6 @@ src_filter =
${env.src_filter} -<esp32/>
lib_ignore =
BluetoothOTA
lib_deps =
${env.lib_deps}
UC1701
https://github.com/meshtastic/BQ25703A.git
monitor_port = /dev/ttyACM1

debug_extra_cmds =
Expand All @@ -150,3 +145,19 @@ debug_init_break =
;debug_init_break = tbreak loop
;debug_init_break = tbreak Reset_Handler

; The NRF52840-dk development board
[env:nrf52dk]
extends = nrf52_base
board = nrf52840_dk_modified

; The PPR board
[env:ppr]
extends = nrf52_base
board = ppr
lib_deps =
${env.lib_deps}
UC1701
https://github.com/meshtastic/BQ25703A.git


