Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DBAL-968 - Rebuilt doModifyLimitQuery in SQLServerPlatform and fixed invalid test cases. #658

Closed
wants to merge 6 commits into from

Conversation

billschaller
Copy link
Member

The recent change to SQLServerPlatform.php (17dad30) broke the ORM Paginator's queries on SQL server.

I investigated, and found that some of the test cases for the SQL Server platform weren't actually correct SQL. Also, there were no test cases that covered what the paginator is doing, so I've written test cases for those. I will open a pull request for this issue.

The modifyLimitQuery method in SQLServerPlatform.php should be fixed to pass the fixed old tests and the new tests.

My concern is that that method is becoming too complex, but that's an issue for another day.

@doctrinebot
Copy link

Hello,

thank you for creating this pull request. I have automatically opened an issue
on our Jira Bug Tracker for you. See the issue link:

http://www.doctrine-project.org/jira/browse/DBAL-969

We use Jira to track the state of pull requests and the versions they got
included in.

@Ocramius
Copy link
Member

Hey @zeroedin-bill, did you base your PR on top of 2.4? Should be based on master instead

@guilhermeblanco
Copy link
Member

@Ocramius @zeroedin-bill don't bother... I'll make these aligned to master later today.

@billschaller
Copy link
Member Author

@Ocramius i need to git better at git

@Ocramius
Copy link
Member

You'll git it right next time

@stof
Copy link
Member

stof commented Aug 12, 2014

is it expected that this changes only tests (and breaking them) ?

@billschaller
Copy link
Member Author

Yes - The tests that have been changed were missing FROM clauses in their subqueries, which is invalid. So I changed the test SQL to have a from clause. The tests now fail, revealing the fact that the behavior of the modifyLimitQuery func is not correct.

@billschaller
Copy link
Member Author

@Ocramius i think i git it now...

@stof
Copy link
Member

stof commented Aug 12, 2014

yeah, looks much better. now, if you have an idea to fix it, it will be great

@billschaller
Copy link
Member Author

@stof, there are two issues related to TSQL that make the whole crazy thing necessary in the first place:

  1. No LIMIT -- This is not so bad -- oracle is the same more or less
  2. No ORDER BY in subqueries -- this forces you to use the window function ROW_NUMBER() OVER(ORDER BY xyz) syntax that is radically different from every other DBMS. Oracle has this syntax, but kindly allows you to use ORDER BY in subqueries and the [rownum] pseudocolumn instead if you wish. I think this was just laziness on the part of the SQL Server devs.

The $selectFromPattern regex was changed for some reason, and now it doesn't properly recognize the boundaries of the select list. I think the doModifyLimitQuery method in the SQLServerPlatform is kind of nuts. SQL Query parsing with regex is hard to get right. This

IMO the best way to fix this method would be to rewrite it to do the following:

  1. build a syntax tree from the query... Q.Q
  2. resolve the column aliases from the outermost ORDER BY clause found
  3. kill any residual ORDER BY clauses at all levels of the query
  4. append "ROW_NUMBER() OVER(ORDER BY ...) as doctrine_rownum" to the select list of the top level query
  5. Build the new query as 'SELECT * FROM () AS doctrine_tbl WHERE doctrine_rownum BETWEEN AND <offset + limit>

Return.

It would be nice if everything could just pass an AST to the doModifyLimitQuery method, but that would require rewriting everything so that's a nope.

It could instead be rewritten to accurately separate out the query parts using regex. It would need to do so without relying on finding specific strings that are passed by the ORM or other dependent libs. For example, SQLServerPlatform.php:1211

$isWrapped = (preg_match('/SELECT DISTINCT .* FROM \(.*\) dctrn_result/', $query)) ? true : false;

This relies on finding a specific piece of SQL generated by the ORM. Coupling like this just makes this method worse and harder to debug.

I'm not sure what the best course of action for fixing this method is. I think the tests cover all the use cases now at least.

@deeky666
Copy link
Member

@zeroedin-bill I think we won't get this thing ever fixed in 2.x because of the lack of an AST to rebuild the query from. This method has been adjusted so many times now always fixing one issue and most probably introducing another. So I guess the only way for 2.x is to verify correct rebuilding of the query via regex through tests and always continuesly adding new tests for new (edge) cases.

@Ocramius
Copy link
Member

Fully agree with @deeky666 - regexes are annoying and will never be able to cover all use-cases. We'll have to patch in bugfixes as they come and then find a decent solution for 3.x, which has to be first of all specced out

@billschaller
Copy link
Member Author

@Ocramius @deeky666 Yes, 2.x can be kept going with band-aids. For 3.0 maybe the platform-specific SQL syntax modifiers such as doModifyLimitQuery and appendLockHint could be implemented to work as decorators on an AST instead of having to parse the SQL.

@billschaller
Copy link
Member Author

I rewrote doModifyLimitQuery -- still a WIP as it needs documentation. Works on all the test cases now.

@stof
Copy link
Member

stof commented Aug 13, 2014

@zeroedin-bill decorators on an AST instead of having to parse the SQL ? which AST ? building the AST would require parsing the SQL into the AST first, which is platform-specific (as each platform can have its own specificities in their SQL, which is precisely why we have the Platform classes in DBAL)


$isWrapped = (preg_match('/SELECT DISTINCT .* FROM \(.*\) dctrn_result/', $query)) ? true : false;
// Strip order by out of the query
$tsqlParts = $this->stripOrderByFromTsqlParts($tsqlParts);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does TSQL mean ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TSQL is the SQL Server dialect of SQL

@stof
Copy link
Member

stof commented Aug 13, 2014

Please add some phpdoc for all your private methods to explain what they are about. Reviewing the code which does not have any comment and weird names (I don't understand what TSQL means in the names as said in my other comment for instance) is not easy.

@billschaller
Copy link
Member Author

@stof I'll revise and add documentation and clean up. I'm off for today, but I'll be back on it tomorrow, thanks for the code review.

@billschaller
Copy link
Member Author

@stof Revised... Let me know if you see any other issues and I'll fix em.

/**
* Breaks down a SQL statement into logical parts
*
* @param $sql
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be @param string $sql

@billschaller
Copy link
Member Author

@stof fixed

@billschaller billschaller changed the title DBAL-968 DBAL-968 - Rebuilt doModifyLimitQuery in SQLServerPlatform and fixed invalid test cases. Sep 17, 2014
@billschaller
Copy link
Member Author

@deeky666 @Ocramius I think this PR and #654 pretty much fix SQL server limiting and ordering. I've been running this code for a month and have had no issues.

I could fix up #654 in a separate PR and add test cases for it if you'd like...

As of now SQL Server platform is broken in master, and I'd like to see this get merged or another solution found before 2.5 gets released.

@Ocramius
Copy link
Member

As per discussion with @deeky666, I am not sure if we want to include this level of complexity just to fix MSSQL's syntax compatibility.

I am really puzzled about this: on one side, I see amazing work, on the other side I see impossibility for us to maintain this or to keep it in maintainable shape.

@billschaller
Copy link
Member Author

I just want limit queries to work right...

Whatever solution achieves that and does not break on a regular basis is the right one in my opinion.

As for this PR, I'm not married to my implementation, but the test cases are valid. I'm not sure a simple parser is any less maintainable in the long run than a chain of regexes. The history of the sql server doModifyLimitQuery implementation speaks to the difficulty of achieving a stable solution with regex alone.

@billschaller
Copy link
Member Author

And hey, it was fun to write.

@Ocramius
Copy link
Member

@zeroedin-bill fully agree with that, I'm just really hesitant to merge here (mainly because the ball is on our side afterwards).

@billschaller
Copy link
Member Author

@Ocramius Understandable... I can make a new PR with just the added/fixed tests tomorrow if you'd like...

@Ocramius
Copy link
Member

@zeroedin-bill I'd still want @deeky666's opinion here.

@deeky666
Copy link
Member

@zeroedin-bill first of all I really appreciate your hard work on this. The only concerns I have about this PR is complexity and maintainability as @Ocramius pointed out. Your implementation is more or less a Parser for SQL Server SELECT queries. On the one hand I see the requirement for this kind of complexity, on the other hand I get headaches polluting the platform with it.
The only compromise I would see here is to extract this "Parser" to one or more classes to avoid platform pollution. Still, it feels rather hackish adding such a component to 2.x. @Ocramius, thoughts?

@billschaller
Copy link
Member Author

@deeky666 A change I proposed to the ORM paginator in this PR would make a lot of the extra logic in this thing unnecessary -- If we can just assume that order by clauses inside subqueries are not valid and should be stripped out, then I think we can make this work without the parser stuff.

@Ocramius
Copy link
Member

Linking doctrine/orm#1220

@billschaller
Copy link
Member Author

Not needed due to a better solution. See #818

@deeky666 deeky666 added the Bug label Jan 5, 2016
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 21, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants