Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EEPROM Checksum #4167

Merged
merged 5 commits into from
Jul 2, 2016
Merged

Conversation

thinkyhead
Copy link
Member

Implement a checksum in the EEPROM to verify the stored data.

This also bumps the EEPROM version to V24.

@lrpirlet
Copy link
Contributor

@thinkyhead Maybe some reorganiztion of the EEPROM is needed first... Today if I re-compile the same version of Marlin, just changing the number of Mesh Bed Leveling points, I corrupt all subsequent values... See #4086

@@ -324,9 +344,9 @@ void Config_StoreSettings() {
EEPROM_WRITE_VAR(i, dummy);
}

char ver2[4] = EEPROM_VERSION;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Now store checksum in stored_checksum variable
uint16_t stored_checksum = eeprom_checksum;

@Grogyan
Copy link
Contributor

Grogyan commented Jun 29, 2016

Do we have for M500 the ability to erase the EEPROM before doing a "factory reset"?

@thinkyhead
Copy link
Member Author

@Grogyan - Do you mean, write zeros to the EEPROM first before writing the data?

@thinkyhead thinkyhead force-pushed the rc_eeprom_checksum branch 2 times, most recently from d5f3c2e to ffeac2f Compare June 30, 2016 01:07
@Grogyan
Copy link
Contributor

Grogyan commented Jun 30, 2016

@thinkyhead yes, zeros or FF typically.

This isn't the first time EEPROM corruption has happened, hence why i'm inquiring whether an erase is done first.
Eg
If EEPROM_VERSION != V24 Then inform user, erase EEPROM before writing new version

@thinkyhead thinkyhead force-pushed the rc_eeprom_checksum branch from ffeac2f to 00a7d94 Compare June 30, 2016 01:19
@thinkyhead
Copy link
Member Author

thinkyhead commented Jun 30, 2016

@Grogyan It will be interesting to see how often this checksum check actually gets tripped. It should be generally rare, but yes as you point out we need to account for "worst-case" for all data we store. So we should make room for a 10x10 mesh and make that the maximum size allowed. Or use a MAX_MESH_POINTS setting that enforces it.

@thinkyhead
Copy link
Member Author

thinkyhead commented Jun 30, 2016

If EEPROM_VERSION != V24 Then inform user, erase EEPROM before writing new version

The logic Marlin follows, which I think is right, is if the version or checksum fails to match, it will throw an error and just apply the defaults explicitly. Users can then just write to EEPROM with M500 to replace the outdated or malformed data.

In the long run it would be good to be able to specify, from Marlin version to Marlin version, that we only add to the end of the EEPROM structure and never change anything before it (from the previous EEPROM version), so the old data can always be read, even if it doesn't have all the newer data. But I expect we will continue to shuffle the EEPROM around from point-release to point-release.

I've proposed that we just write GCode to the EEPROM instead. There's enough room for all the output of M503 to fit. Then you could just load it and "play it back."

@Grogyan
Copy link
Contributor

Grogyan commented Jun 30, 2016

@thinkyhead that assumes that the EEPROM data recalled for certain functions is in the same location in EEPROM, which in this PR is not, hence if not erased before writing the new EEPROM version.
Keeping the old version of EEPROM with an M500 can also cause corruption issues.

The ideal method, as suggested would be to call the old settings into RAM, erase the EEPROM then load the settings from RAM back into EEPROM on the new version EEPROM. Thereby the correct EEPROM addresses will be correct.

@@ -173,6 +180,9 @@ void Config_StoreSettings() {
char ver[4] = "000";
int i = EEPROM_OFFSET;
EEPROM_WRITE_VAR(i, ver); // invalidate data first

Copy link
Contributor

@MagoKimbra MagoKimbra Jun 30, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now i = 104 you must add 2 for uint16 eeprom-checksum or write eeprom_checksum 0...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again! I'm trying to do too many things at once, obviously.

@thinkyhead thinkyhead force-pushed the rc_eeprom_checksum branch from 00a7d94 to 8c8cae7 Compare June 30, 2016 06:01
@thinkyhead thinkyhead force-pushed the rc_eeprom_checksum branch from 0ad0f28 to 6b2a7b3 Compare June 30, 2016 23:28
@thinkyhead
Copy link
Member Author

thinkyhead commented Jul 1, 2016

The ideal method, as suggested would be to call the old settings into RAM, erase the EEPROM then load the settings from RAM back into EEPROM on the new version EEPROM.

@Grogyan So far we haven't kept track of the old layouts or preserved the old reading code when we update to a new layout. It would be nice if we could do that, but it's not so simple. Should we convert every old version to the latest one, or just the previous one?

If we stored the GCode text for the current configuration, rather than the binary, that would make it much easier, but that would require a complete rewrite of the EEPROM code. Note that we only have 512 bytes (guaranteed) to work with. On larger boards we have plenty of space - at least 1K, and up to 4K - but we want to keep supporting smaller boards.

In the long run, we do want to improve the EEPROM and configuration stuff generally. But I think we need to start over and come up with a good robust system. Perhaps we can use IFF formatting.

Keeping the old version of EEPROM with an M500 can also cause corruption issues.

I don't see that anywhere. M500 overwrites all the space it needs to overwrite, skipping over nothing. M501 won't even try to read an older EEPROM layout.

Here's the code that reads the mesh. Apparently it just reads –but ignores– the existing data if it doesn't match the new mesh dimensions. (The first values written for the mesh are a "mini-header.")

#if ENABLED(MESH_BED_LEVELING)
  mbl.status = dummy_uint8;
  mbl.z_offset = dummy;
  if (mesh_num_x == MESH_NUM_X_POINTS && mesh_num_y == MESH_NUM_Y_POINTS) {
    EEPROM_READ_VAR(i, mbl.z_values);
  } else {
    mbl.reset();
    for (uint8_t q = 0; q < mesh_num_x * mesh_num_y; q++) EEPROM_READ_VAR(i, dummy);
  }
#else
  for (uint8_t q = 0; q < mesh_num_x * mesh_num_y; q++) EEPROM_READ_VAR(i, dummy);
#endif // MESH_BED_LEVELING

The EEPROM-reading code is actually pretty smart about all the issues you raise. I think if you look closely you will not be able to find a snippet that illustrates a potential for corruption. But please point out any that do!

@Grogyan
Copy link
Contributor

Grogyan commented Jul 1, 2016

@thinkyhead it is something worth looking into, IMO after a 1.10 (stable) release.
The EEPROM reading looks OK to me, with my limited coding skills.

@thinkyhead
Copy link
Member Author

thinkyhead commented Jul 1, 2016

The configurability of Repetier-Host and Smoothieware make me a bit jealous. They put Marlin to shame. Of course, Smoothie runs on a much faster board, so they can make everything parametric and still get great performance. We have to explicitly pre-configure a lot more to get decent performance.

I'm sure that after 1.1.0 we'll do a great re-write of the EEPROM code, and we can add an option to save/restore current settings to SD also.

@thinkyhead thinkyhead merged commit ac4f235 into MarlinFirmware:RCBugFix Jul 2, 2016
@thinkyhead thinkyhead deleted the rc_eeprom_checksum branch July 2, 2016 02:22
@thinkyhead thinkyhead mentioned this pull request Jul 8, 2016
@jbrazio jbrazio modified the milestone: 1.1.0 Jul 18, 2016
drewmoseley pushed a commit to drewmoseley/Marlin that referenced this pull request Nov 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants