Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes a regression when cancelling a recurring with no processor_id #17292

Merged
merged 1 commit into from
May 26, 2020

Conversation

eileenmcnaughton
Copy link
Contributor

@eileenmcnaughton eileenmcnaughton commented May 11, 2020

Overview

processor_id is an optional field fo payment processor specific references. However, the recent switch to the property bag is causing it to hard-fail when it is empty

InvalidArgumentException: "processorID field has max length of 255"

#0 civicrm/civicrm/CRM/Contribute/Form/CancelSubscription.php(219): Civi\Payment\PropertyBag->setRecurProcessorID(NULL)
#1 civicrm/civicrm/CRM/Core/Form.php(484): CRM_Contribute_Form_CancelSubscription->postProcess()
#2 civicrm/civicrm/CRM/Core/StateMachine.php(144): CRM_Core_Form->mainProcess()

Before

Hard fail per above

After

Works

Technical Details

@mattwire @artfulrobot we could make the calling code check if it's empty but I think it should also be possible to null it out after set & we would be making the calling code jump through hoops for purity

Comments

I think we should also port to 5.25 which should be the first affected version
59545b0#diff-30a8e5b11544ab10573dea3f36cb9db2R209

@civibot
Copy link

civibot bot commented May 11, 2020

(Standard links)

@civibot civibot bot added the master label May 11, 2020
@eileenmcnaughton eileenmcnaughton changed the base branch from master to 5.26 May 11, 2020 23:30
@civibot civibot bot added 5.26 and removed master labels May 11, 2020
@eileenmcnaughton
Copy link
Contributor Author

test this please

@artfulrobot
Copy link
Contributor

The purpose of PropertyBag was to make sure there was no fuzziness around what parameters are in use, and to tighten validation of the params.

The concept so far has been that if a property is required it can't be NULL, as evidenced by the getter which will throw an exception if the property is not isset(). So setting things to NULL has not been allowed.

(We introduced legacy support that treats zero-length-strings as if the property was not there: #16532 but that's different because it's a legacy hack. Just thought it would be useful to link that.)

Personally I'd have had it that there are cases where NULL and zero-length-string are each separately valid values for certain properties, but others felt not (although note that NULL is allowed for custom properties).

For RecurProcessorID I think the valid values are: NULL (it is optional), or a non-zero-length string, max 255 bytes. So I think the setter should allow those values.

But the problem is then the getter which will throw an exception if there is a NULL.

I think the options are:

  • A1: we allow NULL as a valid value for RecurProcessorID, but disallow ''
    or
  • A2: we allow NULL and re-cast '' to NULL in setRecurProcessorID.
    or
  • A3 we allow NULL and ''. That is we consider them separate valid values.

And

  • B1: no action. If code is wanting to access an optional property it should check for its existance with
    $bag->has('RecurProcessorID') before calling the getter.
    or
  • B2: we add a getRecurProcessorID function that will return the stored value, NULL or otherwise, without exception.
    or
  • B3: we change the main getter to use array_key_exists instead of isset. This won't make much of a difference since the other setters still don't allow NULL, but would mean less work in future if there's another property for which NULL is valid.

I'd propose: A1/A3 and B1/B3

I find A2 tempting but feels a bit of a backslide, so I'd like to resist that.

B1 keeps closest to the collective wisdom to-date. B2 is a fix for this and this only. B3 seems more generically useful as long as it's not abused in future by sloppy setters.

@eileenmcnaughton
Copy link
Contributor Author

I feel like most of the benefit in adding the property bag was in standardising the parameters and the benefit of not permitting variables to be get or set as null is marginal.

On the other hand every time we attempt to use the propertyBag we hit pain around the tightness on NULL.

I feel like we aren't doing ourselves any favours by being strict around empty values

@artfulrobot
Copy link
Contributor

So what's your pref for the A, B options, @eileenmcnaughton ?

I've got massively stung in other projects by assumptions around NULL, '', zero, FALSE, and not set/existing, which is probably the background to my concern here. I understand that api3 ignores params with NULL, and sets params with '' to NULL, and that DAO converts 'null' to NULL, so it's always an up-hill struggle to improve on this in CiviCRM. But I believe api4 is better on that.

@eileenmcnaughton
Copy link
Contributor Author

@artfulrobot I've implemented A2 above. I think

  1. it should return a string since it is a string variable & there is no meaningful difference between NULL and '' here
  2. I think the backsliding concern is offset by the benefit of making the property bag easier to use & hence making it more adoptable.

We can revisit this but let's get it fixed in the rc at any rate!

@artfulrobot
Copy link
Contributor

artfulrobot commented May 25, 2020

@eileenmcnaughton I agree on getting it fixed at least, and this is certainly the most minimal way to get it not to error.

My final points are:

  • looks like you've implemented the opposite of A2, i.e. you've cast to a string if NULL is passed in, instead of casting to NULL if "" passed in. Meh. If you're treating the two values as one and the same, it doesn't seem to matter which you pick, except if anyone is using IS [NOT] NULL in SQL later on.

  • the database schema says that field can be NULL or "". The original docblock marked the $input as string because it did not accept NULL (it would error). As we're now accepting NULL, that should be marked as mixed (since this PR now allows FALSE, too), although yes, we want a string.

  • the test includes an equality check on NULL, followed by a type check. It would make more sense to check the type first, and for the equality check to specify "" instead of NULL, since that is what you expect.

  • I think "Test that null is valid for recurring contribution ID." is not an accurate description of the test, since we're checking processor_id

  • I think we should have a comment in setRecurProcessorID() explaining the non-standard behaviour we've introduced, like: "For backwards compatibility, we accept NULL and cast it to an empty string".

@eileenmcnaughton eileenmcnaughton force-pushed the prop branch 2 times, most recently from 97a5fe9 to 69306b4 Compare May 25, 2020 12:51
@eileenmcnaughton
Copy link
Contributor Author

OK - I've made some updates that should be enough to get past the 'this doesn't work' part & addresses some of the above.

I just added a comment to point to here from the function rather than specifying the plan because I feel we need to work through converting a few forms to pass the PropertyBag in to doPayment before being sure we want to keep it as strict as it is (this param aside). I've tried a few times but have wound up backing out each attempt, for various reasons.

@artfulrobot
Copy link
Contributor

Fair enough. I don't want to delay a fix, so let's merge this. I'm sorry to hear that the propertyBag aproach has caused headaches.

For the record for other readers, since this conversation now has a code reference: PropertyBag came from a place where not having it was causing headaches, duplication and inconsistency. Its purpose was to assist in providing a consistent, self-documenting and error-free (as in it spots errors in the code that uses it) way to provide the required data for a function. Part of the concept was that data could be given or not, but if given it must be correct for the field.

(Myriad forms) → (PropertyBag) → (Myriad processes) → (Data Models)
    LOGIC           DATA              LOGIC              DATA
                                                          ↓
                              (Processes and reports)   ← /

In form processing, UI-structured data has to be translated into the structure required by the underlying model so that consumers of that data can understand it correctly.

Historically, CiviCRM has come from a largely UI-focussed direction, so a lot of the logic was placed in the "myriad processes" part of the equation - forms would pass in whatever they wanted and each process had to interpret what every conceivable form meant. At first this made sense because there were more forms than processes, and it was more likely that someone would make another form than another process, and you could always just invent a new parameter name from the form side.

But as the project has evolved, in particular to allow extensions to implement payment processors, it's become difficult to have the same logic duplicated correctly throughout all the 3rd party code - it became important to have a single, well defined and unambiguous interpretation of the data, so that rather than N third party payment processors each having to check whether the contact ID was passed in as contact_id, ContactID, or cid, they could reliably check and get the one property.

PropertyBag tries to standardise the data interface to these processes, with the ideal being that a process can know if a property (a) was provided or not, and if so: (b) that it at least has already passed some sanity checks; and (c) that it is the correct type. (c) is important because it's the difference between "please set this to null" and "please leave this alone"; and it avoids persistent, stored ambiguity, e.g. all SQL and/or PHP reporting on processor_id must coalesce NULL to "" since that ambiguity has been stored but it means the same thing - this logic has to be done by everything that interprets that field. Querying for empty data with api3 or api4 is difficult.

It's proved difficult to deal with the difference between exists, is null, is zero, is false, is empty string. In part it's a lack of specification that makes this hard: Nobody knows what it should be anyway, but nobody wants to remove the ambiguity in case there was a difference, or more likely, some code assumed one thing and other another so a change could break existing code! And, of course, most functions don't need to care - in this example, most functions will only need to search on a particular non-empty value of processor_id.

I hope we're able to move, over time, to a position where we do know what things should be and from this definition we can start to insist on tighter standards that save us from further ambiguity problems.

Meanwhile, the code was definitely broken in the real world, and thanks to @eileenmcnaughton, it now works as before. We are still allowing the stored ambiguity of whether processor_id is NULL or "", but we assume every process that cares is checking it for both types of empty.

@eileenmcnaughton
Copy link
Contributor Author

@artfulrobot I'm happy to switch back to the 'set empty string to null' -that's obviously more in line with your design.

What worries me is that the process of converting a form is likely to be a case of having a bunch of parameters that we might not know much about because that's what our forms are like right now - so ideally we could do somethlng like

$propertyBag->setContactID($params['contact_id')):
$propertyBag->setDescription($params['source'] ?? NULL):

etc - because I don't know all the different permutations of how this line has been reached & I ideally want to switch over - but without having to write a wrapper to check if each parameter is empty or losing the visibility of using the setVariable() methods.

You only have to grep for CRM+Utils_Array::value() to see how often we don't know what our params hold...

@eileenmcnaughton
Copy link
Contributor Author

I should note I actually love the propertyBag - it's frustrating me that I'm finding it hard to switch the code over but I think it's a great design

@artfulrobot
Copy link
Contributor

@eileenmcnaughton yes, thanks for saying so and I feel your pain! I don't mean to be a PITA or purist for the sake of it, either: we all appreciate code that works 😄 CiviCRM in places is like someone's messy desk - you can't touch anything because they "know where it is", but neither can you expect a new colleague to work at it. I really appreciate that as someone in core team you have to keep the wheels turning. I'm sure that over time our efforts and frustrations in tiding things up will pay off.

@artfulrobot
Copy link
Contributor

Ideally we could do ... $propertyBag->setDescription($params['source'] ?? NULL):

Just on that one, the crucial distinction to make is: do you think this code says "set the description to my source, or if I don't have it, set the description to NULL", or do you think this code says "set the description to my source, but if it's NULL just don't set it at all"? I'm +1 on the first definition, but I know that this is different to api3's conventions (I believe it's more in line with api4 though).

@eileenmcnaughton
Copy link
Contributor Author

@artfulrobot so I just went through what it would look like on MembershipRenewal form based on current code

Screen Shot 2020-05-26 at 7 48 57 PM

What stands out is that 2 fields are set & then potentially later overwritten. Having identified that I can alter them to be more like

      $paymentParams->setContactID($this->_params['contact_id'] ?? $this->_contributorContactID);
  • which is what the current code does - although I suspect the current code is wrong

However, I think setting to NULL should actually unset the variable if already set. (or set it to NULL if already set) -

@artfulrobot
Copy link
Contributor

artfulrobot commented May 26, 2020

Great. Yes, you can chain ?? too, so $maybe ?? $perhaps ?? NULL will works to select maybe, falling back to perhaps, falling back to NULL. I think this is a good pattern, it's clearer and more efficient than 2 calls.

However, I think setting to NULL should actually unset the variable if already set. (or set it to NULL if already set) -

I might agree, just to clarify:

If a field can be null, e.g. recurProcessorID, then I think:

<?php

// Store NULL in the prop bag:
$pb->setRecurProcessorID(NULL);

// Store something, then overwrite it. Also valid:
$pb->setRecurProcessorID("FRED");
$pb->setRecurProcessorID(NULL);

If a field cannot be null, e.g. contactID, then I think

<?php

// Attempting to store NULL is invalid and should throw exception
$pb->setContactID(NULL);

$pb->setContactID(123);
$pb->setContactID(NULL); // throws exception

@eileenmcnaughton
Copy link
Contributor Author

eileenmcnaughton commented May 26, 2020

@artfulrobot - yes I agree with the above - I'm not sure if we have defined our minimum required fields but obviously contactID is one (ideally contributionID would be too but sadly we still haven't got that consistent enough yet)

amount is another

On deploying this code I'm seeing InvalidArgumentException: 'processorID field has max length of 255'
@eileenmcnaughton
Copy link
Contributor Author

@artfulrobot I've updated it again to be closer to your A2 + B3 - which seems more inline with the idea of setting to NULL rather than empty string. It rejects 'FALSE' or 0 now as it did earlier (not sure whether 0 would ever be valid but....)

@artfulrobot
Copy link
Contributor

@eileenmcnaughton looks good to me 👍

I've run the tests locally and they all pass. I have a suggestion for the tests:

pr-17292-suggestion-1.diff.txt

@eileenmcnaughton
Copy link
Contributor Author

@artfulrobot I'm going to merge this - I think I don't have the enthusiasm to retweak the test at the moment but once we have merged to master we can consider

@eileenmcnaughton eileenmcnaughton merged commit 2a95d4a into civicrm:5.26 May 26, 2020
@eileenmcnaughton eileenmcnaughton deleted the prop branch May 26, 2020 20:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants