-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 640 #43
Issue 640 #43
Conversation
…dded debug config var to return stacktrace on error.
…f hitting up Gemini.
Looks like I've got some checkstyle and other oddities I've introduced with bad merging. Also probably have to update the composer lockfile again using 5.6. |
Gemini/src/UrlMinter/UrlMinter.php
Outdated
substr($context, 2, 2), | ||
substr($context, 4, 2), | ||
substr($context, 6, 2), | ||
]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be reduced to
$segments = str_split($context, 2);
http://php.net/str_split
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@whikloj++
|
||
$urn = $event['object']['id']; | ||
if (preg_match("/urn:uuid:(?<uuid>.*)/", $urn, $matches)) { | ||
if (isset($matches['uuid'])) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary? If the preceding if (preg_match
is successful, this should always succeed... I think. Did you mean to test for emptiness? if (!empty($matches['uuid'])) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm probably being overly cautious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct. If the regex matches, the group is always set in the array, but the results can be empty.
ubuntu@claw:/var/www/html/Crayfish/Gemini$ php -a
Interactive mode enabled
php > $str = "urn:uuid:";
php > $regex = "/urn:uuid:(?<uuid>.*)/";
php > echo preg_match($regex, $str, $matches);
1
php > echo isset($matches['uuid']);
1
php > echo empty($matches['uuid']);
1
'response' => $response, | ||
]); | ||
|
||
return (string)$response->getBody(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am reading this correctly, then the body could be the Exception message passed from the UrlMinter through GeminiController. Are you missing a check for the status Code before continuing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm being lazy here and not try/catching. Probably was just relying on debug mode when testing? It's been a while.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-reading this I see I'm purposefully bubbling up exceptions from Guzzle. MillinerService also lets them bubble and they're eventually handled by the controller.
$jsonld_url, | ||
$fedora_url, | ||
$token | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this is updateContent so the URL was found in gemini, do we need to re-save them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember how/when I saved the urls was tricky because of all the different points of failure, but can't recall the exact reason other than general paranoia. Let me look back and see if I can figure out specifically why, because if not, we shouldn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I can't convince myself we need this. Let's remove it and see what happens during testing.
// Get the file's LDP-NR counterpart in Fedora. | ||
$urls = $this->gemini->getUrls($file_uuid, $token); | ||
|
||
if (empty($urls)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In saveContent
you determine whether to save or update based on whether the item has been mapped in Gemini, but in saveMedia
you assume it already has been mapped and if it hasn't then you fail. Where does it get mapped the first time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually you do the same check in saveFile
too. So this is a unique function and I'm not clear why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually never mapping Media in Gemini, just Files and Content. I'm looking up the Media in a roundabout way since timing issues can leave me in a scenario where I can't link everything together, particuarly when deleting. I was winding up with dangling references to Media that no longer existed in some corner cases. It's pretty extreme, but when doing strange things like creating a File and Media, deleting the Media, and then creating a new Media and associating it with the original file you can run into issues like this.
So I'm just always looking it up every time to guarantee I get it right. It's slower but guaranteed to work.
FYI, Islandora/documentation#721 would cut down on some of the hassle when doing the lookup and provide some caching. It would also drop the requirement of enabling the JSON format for Media GET requests.
Thanks for the review @whikloj. I know massive PRs like this aren't the easiest to jump into. I'll get these test going and then make the changes you suggested. |
Codecov Report
@@ Coverage Diff @@
## master #43 +/- ##
=============================================
- Coverage 97.11% 61.15% -35.97%
- Complexity 68 146 +78
=============================================
Files 6 8 +2
Lines 347 726 +379
=============================================
+ Hits 337 444 +107
- Misses 10 282 +272
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
GitHub Issue: Islandora/documentation#640
What does this Pull Request do?
Total rewrite of both Milliner and Gemini. Milliner has most of the business logic that was previously in Alpaca, but only uses idempotent methods. Gemini now references resources by UUID and stores full URIs, not relative.
How should this be tested?
There's an upcoming Islandora-Devops/claw-playbook PR that we'll use to test all the PRs required for Islandora/documentation#640, since it encompasses Alpaca, Crayfish, Crayfish-Commons, the core Islandora module... pretty much all of it.
Interested parties
@Islandora-CLAW/committers