Replace event_tag FK to get rid of insert and return #710 #731
Conversation
looking good, but a compatibility approach is needed
looking good
Proposal

After several months, I finally have the time to complete this merge. I've thoroughly reviewed the migration changes, and here are my considerations regarding the approach.

In the previous approach, we used the event_id column, a foreign key to the database-generated event_journal.ordering, as the tag key; every write therefore had to insert the journal row and read the generated value back before the tag rows could be written. On the other hand, we can employ the primary key of the event_journal table, (persistence_id, sequence_number), which is already known before the insert. To ensure rolling updates when shifting to the "new way", we propose a phased rollout with steps controlled by a configuration property:
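For a concrete picture, here is a minimal sketch of what such phase switches could look like when loaded via Typesafe Config. The key names follow the redundant-write/redundant-read wording used later in this thread, but the exact config paths are assumptions for illustration only, not the plugin's actual settings:

import com.typesafe.config.ConfigFactory

// Hypothetical phase switches; key names and paths are assumptions.
val phase2Config = ConfigFactory.parseString("""
  jdbc-journal {
    tables.event_tag {
      redundant-write = on  # fill event_id and (persistence_id, sequence_number)
      redundant-read  = on  # query tags via both key shapes
    }
  }
  """)

// e.g. the DAO could branch on these flags:
val redundantWrite = phase2Config.getBoolean("jdbc-journal.tables.event_tag.redundant-write")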
A real-world example using MySQL:

1. Add the new columns before migration:

ALTER TABLE event_tag
    ADD PERSISTENCE_ID VARCHAR(255),
    ADD SEQUENCE_NUMBER BIGINT;

2. Change to redundant write and read via config (see the sketch above).

3. Wait for the projection to catch up, and then:

-- remove rows that only carry the legacy FK key
DELETE
FROM event_tag
WHERE PERSISTENCE_ID IS NULL
  AND SEQUENCE_NUMBER IS NULL;

-- drop old FK constraint
SELECT CONSTRAINT_NAME
INTO @fk_constraint_name
FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS
WHERE TABLE_NAME = 'event_tag';

SET @alter_query = CONCAT('ALTER TABLE event_tag DROP FOREIGN KEY ', @fk_constraint_name);
PREPARE stmt FROM @alter_query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

-- drop old PK constraint
ALTER TABLE event_tag
    DROP PRIMARY KEY;

-- create new PK constraint on the journal PK columns
ALTER TABLE event_tag
    ADD CONSTRAINT
        PRIMARY KEY (PERSISTENCE_ID, SEQUENCE_NUMBER, TAG);

-- create new FK constraint referencing the journal PK
ALTER TABLE event_tag
    ADD CONSTRAINT fk_event_journal_on_pk
        FOREIGN KEY (PERSISTENCE_ID, SEQUENCE_NUMBER)
            REFERENCES event_journal (PERSISTENCE_ID, SEQUENCE_NUMBER)
            ON DELETE CASCADE;

-- make event_id nullable, so we can skip the InsertAndReturn
ALTER TABLE event_tag
    MODIFY COLUMN EVENT_ID BIGINT UNSIGNED NULL;

4. Finally, roll back the redundant config.
it should "migrate event tag to new way" in { | ||
// 1. Mock legacy data on here, but actually using redundant write and read. | ||
withRollingUpdateActorSystem { implicit system => |
During the verify step I commented out these lines and used the original approach to insert the same rows (at that point the table didn't have any of the new columns from the journal table PK). By doing this, the outcome aligns more closely with real-world scenarios.

However, in order to avoid redundancy with the old schema, I'm using the new approach here to simulate the insertion.

I've validated this with databases other than H2, as I believe that anyone using H2 doesn't need a migration.
I did some tests and profiling to verify the performance improvement.

Before this PR: the flame graph provides two pieces of information: the original approach incurs overhead not only during the commit (insert), but also during result conversion. That said, only 6 samples account for the 0.46% + 0.3% of CPU time, so this is not strict proof.

After this PR: the most obvious change is the elimination of the overhead in result conversion.
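To make the difference concrete, here is a minimal Slick sketch of the two write paths. The table and row definitions below are stand-ins assumed for illustration; they mirror the schema's column names but are not the plugin's actual DAO code:

import slick.jdbc.H2Profile.api._
import scala.concurrent.ExecutionContext.Implicits.global

// Stand-in tables (assumed shapes, mirroring the schema's column names).
class Journal(t: Tag) extends Table[(Long, String, Long)](t, "event_journal") {
  def ordering       = column[Long]("ordering", O.AutoInc, O.PrimaryKey)
  def persistenceId  = column[String]("persistence_id")
  def sequenceNumber = column[Long]("sequence_number")
  def * = (ordering, persistenceId, sequenceNumber)
}
class EventTag(t: Tag) extends Table[(Option[Long], Option[String], Option[Long], String)](t, "event_tag") {
  def eventId        = column[Option[Long]]("event_id")
  def persistenceId  = column[Option[String]]("persistence_id")
  def sequenceNumber = column[Option[Long]]("sequence_number")
  def tag            = column[String]("tag")
  def * = (eventId, persistenceId, sequenceNumber, tag)
}
val journal  = TableQuery[Journal]
val eventTag = TableQuery[EventTag]

// Legacy path: insert-and-return. The tag key is the database-generated
// "ordering", so the journal insert must round-trip before tags can be written.
def writeLegacy(pid: String, seqNr: Long, tags: Set[String]): DBIO[Unit] =
  for {
    ordering <- (journal returning journal.map(_.ordering)) += ((0L, pid, seqNr))
    _        <- eventTag ++= tags.map(t => (Some(ordering), Option.empty[String], Option.empty[Long], t))
  } yield ()

// New path: the tag key is the journal's own PK, known up front, so both
// inserts can be issued without reading anything back from the database.
def writeNew(pid: String, seqNr: Long, tags: Set[String]): DBIO[Unit] =
  DBIO.seq(
    journal.map(j => (j.persistenceId, j.sequenceNumber)) += ((pid, seqNr)),
    eventTag ++= tags.map(t => (Option.empty[Long], Some(pid), Some(seqNr), t)))

The flame-graph observation matches this shape: the legacy path pays both for the extra statement round trip and for converting the returned generated key, while the new path is plain batched inserts.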
LGTM, thanks for the great work on this
Looking good.

I think we can improve the migration instructions. We also need a page about it.

For example, we need to ask users to create the new columns and deploy a new version using redundant-write.

Then users still need to run an SQL script to backfill the PID and SeqNr. Something like:

UPDATE event_tag
SET
    persistence_id = event_journal.persistence_id,
    sequence_number = event_journal.sequence_number
FROM event_journal
WHERE event_journal.ordering = event_tag.event_id;

After that, deploy once more with redundant-read plus the changes in the schema.

About redundant reads and writes, I find the naming ambiguous; see my comments below.

I think we can even use one single config, for example legacy-tag-key, set to true by default. Users should add the new columns and update the plugin. The application will start to write all three columns. After backfilling the two new columns (pid and seq_nr) with data from event_journal, the user deploys once again with legacy-tag-key=false.

Now, the application won't write the event_id anymore and will always make the join using pid and seq_nr.
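If it helps, here is a small sketch of how a single legacy-tag-key switch could drive the write shape. The names and row type here are assumptions for illustration, not the plugin's API:

// Hypothetical row shape: which key columns a tag write fills in each phase.
final case class TagWrite(
    eventId: Option[Long],         // legacy key (journal "ordering")
    persistenceId: Option[String], // new composite key, part 1
    sequenceNumber: Option[Long],  // new composite key, part 2
    tag: String)

def tagWrites(legacyTagKey: Boolean, ordering: Long, pid: String, seqNr: Long, tags: Set[String]): Set[TagWrite] =
  if (legacyTagKey)
    // transition phase: fill all three key columns so old and new readers both work
    tags.map(t => TagWrite(Some(ordering), Some(pid), Some(seqNr), t))
  else
    // final phase: event_id is no longer written; joins use pid/seq_nr only
    tags.map(t => TagWrite(None, Some(pid), Some(seqNr), t))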
ADD PERSISTENCE_ID VARCHAR(255),
ADD SEQUENCE_NUMBER BIGINT;

-- >>>>>>>>>>> after projection catch up.
It's not clear to me what you mean by projection catch-up. Are you referring to the migration tool? The tool is intended for migration from previous versions.
With the changes in this PR, we need to run another kind of migration.
Luckily, we can do this migration with plain SQL. For example, for Postgres, we can run the following:
UPDATE event_tag
SET
    persistence_id = event_journal.persistence_id,
    sequence_number = event_journal.sequence_number
FROM event_journal
WHERE event_journal.ordering = event_tag.event_id;
After that, we can create the new foreign key and even drop the event_id column.
"after projection catch-up" refers to the process of waiting for Akka Projection to update the offset to the earliest new column write.
I hadn't considered the option of migrating data using SQL, which is great but I overlooked it. Including this SQL in the migration steps allows us to proceed without waiting for the Projection read "catch up".
I will update PR after fix integration test.
For clarification, there is no projection involved here.
The tags are written when the events are persisted, not when consumed by a Projection.
So basically, the journal stays as it is. Old tags were written with ordering filled while pid and seq_nr stayed empty; new tags will have pid and seq_nr filled. Without an explicit migration, nothing will happen to the old rows.
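To make that coexistence concrete, here is a hedged sketch of a tag query that matches both row shapes; the Slick table definitions are assumed stand-ins (the same shapes as in the earlier write sketch), not the plugin's actual query code:

import slick.jdbc.H2Profile.api._

// Stand-in tables, assumed shapes mirroring the schema's column names.
class Journal(t: Tag) extends Table[(Long, String, Long)](t, "event_journal") {
  def ordering       = column[Long]("ordering", O.AutoInc)
  def persistenceId  = column[String]("persistence_id")
  def sequenceNumber = column[Long]("sequence_number")
  def * = (ordering, persistenceId, sequenceNumber)
}
class EventTag(t: Tag) extends Table[(Option[Long], Option[String], Option[Long], String)](t, "event_tag") {
  def eventId        = column[Option[Long]]("event_id")
  def persistenceId  = column[Option[String]]("persistence_id")
  def sequenceNumber = column[Option[Long]]("sequence_number")
  def tag            = column[String]("tag")
  def * = (eventId, persistenceId, sequenceNumber, tag)
}
val journal  = TableQuery[Journal]
val eventTag = TableQuery[EventTag]

// Old rows only have event_id filled; new rows only have pid/seq_nr filled.
// Joining on either key shape picks up both without any data migration.
def eventsByTag(tagName: String) =
  for {
    tr <- eventTag if tr.tag === tagName
    j  <- journal
    if (tr.eventId === j.ordering) ||
       (tr.persistenceId === j.persistenceId && tr.sequenceNumber === j.sequenceNumber)
  } yield j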
Btw, Akka Projection is a separate project and may not be in use at all. You can use Akka Persistence with this plugin without even using Akka Projection.
Could you add the following to a file:
There are also some test failures from CI, if you can take a look at those?

Of course.
Renato has some suggestions on this PR that sparked some thoughts in me: perhaps we can simplify the rolling-update steps by migrating with SQL. I have some ideas, but I am currently tied up with other issues and don't have time for this at the moment. I will fix and verify them over the weekend.

The migration guide will be added in a new pull request.

@octonato could you please take a look again? Thanks.
// the legacy table schema creation.
if (newDao) {
  addNewColumn();
  migrateLegacyRows();
Using SQL to migrate the old rows, we can then do rolling updates.
LGTM.
I left a few comments, but mostly about things we should consider later, after we cut a release with this change.
newJournalQueries.TagTable ++= tags
  .map(tag =>
    TagRow(
      Some(journalSerializedRow.ordering), // legacy tag key enabled by default.
This is fine, but we could also read the config key and then decide whether to fill this column or not.
Ultimately, it would be nicer if users didn't need to have this column in their table at all.
@@ -18,12 +18,14 @@ CREATE TABLE IF NOT EXISTS "event_journal" (
 CREATE UNIQUE INDEX "event_journal_ordering_idx" on "event_journal" ("ordering");

 CREATE TABLE IF NOT EXISTS "event_tag" (
-    "event_id" BIGINT NOT NULL,
+    "event_id" BIGINT,
It would be better if new users just didn't have this column.
I know, legacy mode is enabled by default and if you just start today with the plugin you will be using it in legacy mode without even noticing it.
I don't have a solution. I'm just mentioning it so we can think together about alternatives.
In the worst case scenario, we leave it as is and people will have a column that is never used.
Probably we need to go with that, and in a next release we simply drop this column and remove the legacy mode flag.
Then users will first need to move to the previous version, run the migration, and then move to the next version. After all that, they will be able to drop the event_id column.
@Roiocam, thanks for all the efforts and patience. I have one more comment that I think we need to address before merging.
We are almost there.
ALTER TABLE event_tag
    ADD persistence_id VARCHAR(255),
    ADD sequence_number BIGINT;
-- migrate rows
I just realised that the script needs to be run in two parts: first add the columns and redeploy with the new version, then run the rest of the script and redeploy once more with legacy mode set to false. See my comment on the migration guide PR: #781 (review)
* Replace event_tag FK to get rid of insert and return akka#710
* support rolling updates akka#710
* remove CRLF akka#710
* optimized migrator akka#710
* fixes oracle test akka#710
* unitTest,SQL for migration akka#710
* fix MigratorSpec akka#710
* chore: typo fix akka#710
* fix: IntegrationTest and clean code akka#710
* fix: compatible legacy tag read akka#673
* chore: mi-ma filter for PR
* fix: optimized migrate step
* fix: dialect for column fill
* fix: update migration sql
* fix: mysql dialect
* fix: dialect syntax
* fix: dialect syntax
* fix: avoid use system table of mysql
* fix: batch insert caused flaky test
* fix: insert less event of large batch
* fix: script fix and strongly express two-step update
References #710