You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
DBAL does not consider KEY(A, B) to fulfill KEY(A), assuming that it should, since the leftmost part matches the whole later index.
Q
A
BC Break
no
Version
2.6.x
Context
I am working in a project where I stumbled upon a DBAL (2.6) behavior that I couldn't understand why it was built that way (the situation happens at least since 2.5 where duplicated indexes started being identified up to newer versions and master).
In this case we use Mysql and I was trying to remove redundant indexes (indexes with columns that match the leftmost part of other indexes).
I noticed that we had a lot of implicit indexes automatically added for foreign keys by Doctrine ORM that became redundant by custom indexes that started with the same column as the foreign key field. So I tried removing those indexes, which Mysql accepts (since we have another index that can be used for the foreign key lookups just the same).
After doing so I noticed that doctrine:schema:validate no longer validated my database as in sync with my mapping. Doctrine auto auto-schema mapping is trying to add again the indexes for the foreign keys, even though there were compound indexes that work just the same (all mapped as well).
Since we wanted to help RDMS performance by removing redundant indexes, now the auto schema tools (doctrine:migration:diff, doctrine:schema:update, doctrine:schema:validate) cannot be properly used since they try every time to add dozens of indexes we know we don't want or need.
How to Reproduce
I am gonna provide and link here a PR with a failing test.
That appears to be there since the beginning of duplicate/overruling indexes: bc9c8c2
Work has been done since then on best identifying duplicated/redundant indexes (custom or implict), but never touched the compound indexes issue I noticed now.
Specifically for FK naming (not our case here, but touches the indexing as well): doctrine/orm#3753
The only report of this same concern I found was the one here: #1195 (comment)
Justification
Longer indexes (KEY(A, B)) make shorter (KEY(A)), leftmost equivalent indexes, redundant. Having many indexes can hurt performance in large tables with a high writing rate, and keeping redundant indexes certainly does not help on that.
The check found in the investigation prevents longer indexes to fulfill others, even though they should. Although the reason stated in the comment 8 years ago seemed like a special case treatment and not a performance-feature blocker.
So redundant indexes should be avoided and that should be true across platforms:
MySQL requires indexes on foreign keys and referenced keys so that foreign key checks can be fast and not require a table scan. In the referencing table, there must be an index where the foreign key columns are listed as the first columns in the same order. Such an index is created on the referencing table automatically if it does not exist. This index might be silently dropped later, if you create another index that can be used to enforce the foreign key constraint. index_name, if given, is used as described previously.
A foreign key must reference columns that either are a primary key or form a unique constraint. This means that the referenced columns always have an index (the one underlying the primary key or unique constraint); so checks on whether a referencing row has a match will be efficient. Since a DELETE of a row from the referenced table or an UPDATE of a referenced column will require a scan of the referencing table for rows matching the old value, it is often a good idea to index the referencing columns too. Because this is not always needed, and there are many choices available on how to index, declaration of a foreign key constraint does not automatically create an index on the referencing columns.
The answer is basically the same as when using an index to police a Primary Key or Unique Key constraint. An index can be used providing the leading columns match those of the constraint (in any order). The index can indeed potentially have additional columns appended (or overloaded) to it.
Conclusion
So given the situation I have written a couple of tests and changed the Index::isFulfilleBy method to incorporate this behavior, while specially treating the original concern that made this feature not available originally (as per the comment made 8 years ago).
I am providing a PR to better illustrate the issue and a hopefully serve as a seed for discussing the approach towards it (if agreed DBAL should fix this behavior).
That PR should be a work in progress (I assume more tests and ideas should be discussed first).
I was not able to test in all platforms (possibly will have some broken tests) and also I understand it is possible that different databases can have different requirements for indexes becoming redundant as well as foreign key special considerations.
If that turns out to be really important and we are unable to safely confirm general compatibility, we should then look at having some platform specific behavior, so where we know it is true we make it happen and ensure best performance practice all the way along regarding redundant indexes.
The text was updated successfully, but these errors were encountered:
Defining proper indexes looks like a job for the database designer, not for the DBAL. Implementing it properly in a portable way is a non-trivial work that requires a ton of time a dedication. The initial assumption that the longer index fulfills the one that it has as a prefix doesn't work for hash indexes while the DBAL is currently unaware of the underlying data structure of indexes.
DBAL does not consider
KEY(A, B)
to fulfillKEY(A)
, assuming that it should, since the leftmost part matches the whole later index.Context
I am working in a project where I stumbled upon a DBAL (2.6) behavior that I couldn't understand why it was built that way (the situation happens at least since 2.5 where duplicated indexes started being identified up to newer versions and
master
).In this case we use Mysql and I was trying to remove redundant indexes (indexes with columns that match the leftmost part of other indexes).
I noticed that we had a lot of implicit indexes automatically added for foreign keys by Doctrine ORM that became redundant by custom indexes that started with the same column as the foreign key field. So I tried removing those indexes, which Mysql accepts (since we have another index that can be used for the foreign key lookups just the same).
After doing so I noticed that
doctrine:schema:validate
no longer validated my database as in sync with my mapping. Doctrine auto auto-schema mapping is trying to add again the indexes for the foreign keys, even though there were compound indexes that work just the same (all mapped as well).Since we wanted to help RDMS performance by removing redundant indexes, now the auto schema tools (
doctrine:migration:diff
,doctrine:schema:update
,doctrine:schema:validate
) cannot be properly used since they try every time to add dozens of indexes we know we don't want or need.How to Reproduce
I am gonna provide and link here a PR with a failing test.
PR: #3146
Investigation
So I narrowed the issue to this check (added back in 2010):
https://github.com/doctrine/dbal/blob/master/lib/Doctrine/DBAL/Schema/Index.php#L209-L213
Background
That appears to be there since the beginning of duplicate/overruling indexes: bc9c8c2
Work has been done since then on best identifying duplicated/redundant indexes (custom or implict), but never touched the compound indexes issue I noticed now.
Specifically for FK naming (not our case here, but touches the indexing as well): doctrine/orm#3753
The only report of this same concern I found was the one here: #1195 (comment)
Justification
Longer indexes (
KEY(A, B)
) make shorter (KEY(A)
), leftmost equivalent indexes, redundant. Having many indexes can hurt performance in large tables with a high writing rate, and keeping redundant indexes certainly does not help on that.The check found in the investigation prevents longer indexes to fulfill others, even though they should. Although the reason stated in the comment 8 years ago seemed like a special case treatment and not a performance-feature blocker.
So redundant indexes should be avoided and that should be true across platforms:
Mysql
https://dev.mysql.com/doc/refman/8.0/en/create-table-foreign-keys.html
Postgresql
https://www.postgresql.org/docs/current/static/ddl-constraints.html
MariaDB
https://mariadb.com/kb/en/library/foreign-keys/
Oracle
(sorry, I failed to find it clear in the documentation)
https://richardfoote.wordpress.com/2014/04/02/indexing-foreign-keys-helden/
Conclusion
So given the situation I have written a couple of tests and changed the
Index::isFulfilleBy
method to incorporate this behavior, while specially treating the original concern that made this feature not available originally (as per the comment made 8 years ago).I am providing a PR to better illustrate the issue and a hopefully serve as a seed for discussing the approach towards it (if agreed DBAL should fix this behavior).
PR: #3147
That PR should be a work in progress (I assume more tests and ideas should be discussed first).
I was not able to test in all platforms (possibly will have some broken tests) and also I understand it is possible that different databases can have different requirements for indexes becoming redundant as well as foreign key special considerations.
If that turns out to be really important and we are unable to safely confirm general compatibility, we should then look at having some platform specific behavior, so where we know it is true we make it happen and ensure best performance practice all the way along regarding redundant indexes.
The text was updated successfully, but these errors were encountered: