Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drupal / Gemini File UUID Mismatch #1076

Closed
seth-shaw-unlv opened this issue Apr 4, 2019 · 34 comments
Closed

Drupal / Gemini File UUID Mismatch #1076

seth-shaw-unlv opened this issue Apr 4, 2019 · 34 comments
Assignees
Milestone

Comments

@seth-shaw-unlv
Copy link
Contributor

While testing riprap I discovered several of my Media aren't getting picked up by check_fixity because the uuid being reported by the JSON API (sourced from the Drupal file_managed.uuid field) doesn't match the uuid stored in Gemini for that file. However, it is inconsistent. Some do match while many don't.

For example, (after adding some print statements to riprap's PluginFetchResourceListFromDrupal.php) one of my successful items reports:

Node id: 233784
	media_url: http://dams.library.unlv.edu/node/233784/media
	field_media_file uuid: 9471ac1a-116e-4220-a93b-d03d5aa292a0
	fedora_url: http://localhost:8080/fcrepo/rest/masters/ent/ent001450-056.tif

Indeed, the Gemini db does have the correct uuid for this item:

+----------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+------------------------------------------------------------------------------+------------------------------------------------------------------------------------+---------------------+---------------------+
| fedora_hash                                                                                                                      | drupal_hash                                                                                                                      | uuid                                 | drupal_uri                                                                   | fedora_uri                                                                         | dateCreated         | dateUpdated         |
+----------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+------------------------------------------------------------------------------+------------------------------------------------------------------------------------+---------------------+---------------------+
| 2228c3bf188614f4449821f63328d4c6a7a2dbf698e18270e6388151f739e1cd61546c949950d7ce7b188c1558422700475c3f69752694f92beeeddf5963ae54 | aa30949313684de1b375cc4c687a344cf4d8c38b519e2d5a4c750ae897d2edd03828639a6d10d52248c90d4fada5bf70fb7617430a4a4ea2ee289c40c7d18775 | 9471ac1a-116e-4220-a93b-d03d5aa292a0 | http://dams.library.unlv.edu/_flysystem/fedora/masters/ent/ent001450-056.tif | http://localhost:8080/fcrepo/rest/masters/ent/ent001450-056.tif                    | 2019-03-01 14:07:19 | 2019-03-01 14:07:19 |
+----------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+------------------------------------------------------------------------------+------------------------------------------------------------------------------------+---------------------+---------------------+

Compared to Drupal for the corresponding file:

+-------+--------------------------------------+----------+------+-------------------+----------------------------------------+------------+----------+--------+------------+------------+
| fid   | uuid                                 | langcode | uid  | filename          | uri                                    | filemime   | filesize | status | created    | changed    |
+-------+--------------------------------------+----------+------+-------------------+----------------------------------------+------------+----------+--------+------------+------------+
| 92374 | 9471ac1a-116e-4220-a93b-d03d5aa292a0 | en       | NULL | ent001450-056.tif | fedora://masters/ent/ent001450-056.tif | image/tiff | 95503106 |      1 | 1551457224 | 1551981341 |
+-------+--------------------------------------+----------+------+-------------------+----------------------------------------+------------+----------+--------+------------+------------+

However, one of my unsuccessful ones displays:

Node id: 205112
	media_url: http://dams.library.unlv.edu/node/205112/media
	field_media_file uuid: de77f71f-6321-4cb7-900a-080ffb219cbf
	fedora_url: 

'fedora_url' is blank because $fedora_url = $this->getFedoraUrl($media['field_media_file'][0]['target_uuid']);, which queries Gemini using the uuid, returned false. Indeed, the Gemini database has a different UUID for this item: ('de77f71f-6321-4cb7-900a-080ffb219cbf' in Drupal, '498944af-2773-4d4b-9526-9f7a08d326da' in Gemini)

+----------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+------------------------------------------------------------------------------+------------------------------------------------------------------------------------+---------------------+---------------------+
| fedora_hash                                                                                                                      | drupal_hash                                                                                                                      | uuid                                 | drupal_uri                                                                   | fedora_uri                                                                         | dateCreated         | dateUpdated         |
+----------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+------------------------------------------------------------------------------+------------------------------------------------------------------------------------+---------------------+---------------------+
| 6c2ed41faef22f2ee0cfeda0ce9a5303c5771f3b9900ad3d2713b08b42d6d8a0167b4da1b89fb22f5abe4c668e22575ac7f4d8cbfb7af9eba52569320ff1fb38 | 347fb4ee386c59798276ca8e4576dbf0ace0d48c9a9d43a25705cada1d9761f77f70a36a10aa2009d061373ef599404660bf3f9de4cfe424a1b6df99c5a93c9f | 498944af-2773-4d4b-9526-9f7a08d326da | http://dams.library.unlv.edu/_flysystem/fedora/masters/ent/ent000725-013.tif | http://localhost:8080/fcrepo/rest/masters/ent/ent000725-013.tif                    | 2019-02-19 20:20:10 | 2019-02-19 20:20:10 |
+----------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+--------------------------------------+------------------------------------------------------------------------------+------------------------------------------------------------------------------------+---------------------+---------------------+

Compared to Drupal:

+-------+--------------------------------------+----------+------+-------------------+----------------------------------------+------------+-----------+--------+------------+------------+
| fid   | uuid                                 | langcode | uid  | filename          | uri                                    | filemime   | filesize  | status | created    | changed    |
+-------+--------------------------------------+----------+------+-------------------+----------------------------------------+------------+-----------+--------+------------+------------+
| 57984 | de77f71f-6321-4cb7-900a-080ffb219cbf | en       | NULL | ent000725-013.tif | fedora://masters/ent/ent000725-013.tif | image/tiff | 145856244 |      1 | 1550790127 | 1551798977 |
+-------+--------------------------------------+----------+------+-------------------+----------------------------------------+------------+-----------+--------+------------+------------+

It is not clear to me why some uuids would be correctly recorded in Gemini, while others aren't. Any theories about the cause or strategies for fixing the discrepancies are welcomed.

@dannylamb
Copy link
Contributor

I may be kicking myself soon... is that media uuid instead of the file?

@seth-shaw-unlv
Copy link
Contributor Author

Funny thing, it IS a media UUID for the Service File of a completely different object (ent000817-037).

@seth-shaw-unlv
Copy link
Contributor Author

Actually, it isn't. I thought I found that 498944af-2773-4d4b-9526-9f7a08d326da on another media, but that uuid doesn't match after all. I can't find it in Drupal anywhere.

@mjordan
Copy link
Contributor

mjordan commented Apr 4, 2019

@seth-shaw-unlv are you using Islandora Riprap at all? If so, don't discount the possibility that my coding in one or more of the classes at https://github.com/mjordan/islandora_riprap/tree/master/src/Riprap is the culprit ..... it doesn't write to Gemini but if a file is not found in Gemini by Islandora Riprap, we probably should do a sanity check on how it gets UUIDs from the JSON.

@seth-shaw-unlv
Copy link
Contributor Author

@mjordan I was initially submitting this issue on riprap, and I have submitted the sanity check as a separate issue there, but as I looked into it and double-checked everything (looking in the JSON API output and the database tables) I determined that this is first and foremost a problem with Gemini and Drupal having mismatched UUIDs.

@mjordan
Copy link
Contributor

mjordan commented Apr 4, 2019

Cool, let's tackle one issue at a time.

@dannylamb
Copy link
Contributor

I mentioned Media because we changed how we index them in Gemini a while ago, like a year or so ago. TL;DR there's weird timing issues, and it's really just the file and the node that get indexed in Gemini, not the Media.

And yeah... this is surely just a bug that's been laying in wait until someone uses Islandora 8 enough to find it.

So what's different between the two that would make one get indexed properly and another not? They're both files in Fedora, right? It's not like this is because one's on public or not... And maybe is one a derivative and the other an original file?

@seth-shaw-unlv
Copy link
Contributor Author

That's the thing... there is no difference between them that I am aware of. They are both Original Files saved to the 'fedora://' filesystem. They were added on two different days... perhaps something happened as I was trying to work through Drupal 8 Migrate errors?

@seth-shaw-unlv
Copy link
Contributor Author

Oh... I just compared the times on each of these database records. For the successful one (which was loaded later) the Gemini record was created about 6 hours after the Drupal record (give or take since I'm not sure if any time-zones are being accounted for there). I presume the lag is due to a Camel message queue wait. Or should the Gemini update be more instantaneous with Flysystem? I haven't checked to be certain what the data flows look like.

Anyway, the unsuccessful one shows a Gemini create time nearly two full days before the Drupal create time, which implies to me that I had done a Migration rollback and migrated it again. I presume the roll back would have wiped the Drupal file record but may not have triggered a Gemini Delete. So, when I migrated the file again it had the same path as before but a new uuid, which would explain why I can't find the uuid recorded in Gemini in Drupal anywhere.

Does that hypothesis make sense? Anyone want to test a migrate, rollback, and re-migrate to see what happens to Drupal v. Gemini?

@dannylamb
Copy link
Contributor

@seth-shaw-unlv In theory, it would trigger the events. Whether or not they were processed or when on the other hand, is a totally different scenario. Most likely things would fall apart somewhere downstream.

I can certainly try and replicate. And it might be worthwhile to investigate how we could rebuild a Gemini index.

@seth-shaw-unlv
Copy link
Contributor Author

seth-shaw-unlv commented Apr 5, 2019

@dannylamb Well, strike that theory.

Using a relatively fresh box (built 9 days ago): I migrated a file (Gemini and Drupal's uuids matched), performed a roll-back (confirmed they were gone from both Drupal and Gemini), and ran the migration again (the file had a new uuid, but Drupal and Gemini had the new uuid).

So, yeah, 🤷‍♂️ .

P.S. For some reason Gemini is recording date/times in Central time despite my being in Pacific time while Drupal is recording the epoc value correctly.

@dannylamb
Copy link
Contributor

Well, good to know it works, but that doesn't help us sort out what happened to you. I imagine throughout the course of development that it may be common to make some duds, but still.

So do those UUIDs exist at all in Drupal?

@seth-shaw-unlv
Copy link
Contributor Author

I haven't been able to find them. I checked node uuids, file uuids, and media uuids. I didn't see them anywhere.

Another point of reference, I threw some queries at the database:

select drupal8.file_managed.uuid, drupal8.file_managed.uri INTO OUTFILE 'wrong_uuid_in_gemini.txt' From drupal8.file_managed WHERE drupal8.file_managed.uuid NOT IN (SELECT gemini.Gemini.uuid FROM gemini.Gemini) AND drupal8.file_managed.uri LIKE 'fedora%' ;
Query OK, 39282 rows affected (1.35 sec)

select count(fid) FROM file_managed WHERE uri LIKE 'fedora%';
+------------+
| count(fid) |
+------------+
|      47424 |
+------------+

That means ~82% of my Drupal File uuids don't match Gemini.

I'm going to try using that list of uuids and uris to create a mass Gemini update script to get the correct uuids in place. It feels heavy handed, but I want my riprap running...

@seth-shaw-unlv
Copy link
Contributor Author

So, @dannylamb, to further confirm the Gemini uuid isn't in Drupal I exported the whole Drupal database and grep-ed for the uuids. I can confirm that the working item is in both and, for the non-working one, the Drupal uuid is only in Drupal and the Gemini uuid is only in Gemini.

Between the dates and the non-matching uuids, I really feel like Drupal created new records with new uuids but Gemini failed to either delete or update. I just don't know why.

@seth-shaw-unlv
Copy link
Contributor Author

@dannylamb, I just checked my gemini log. (I feel silly for not looking at it earlier.)

For ent000725-013 (the failed one) the Gemini UUID (498944af-2773-4d4b-9526-9f7a08d326da) shows up as a successful PUT on 2019-02-19:

[2019-02-19 20:20:10] app.INFO: Matched route "{route}". {"route":"PUT_uuid","route_parameters":{"_controller":"gemini.controller:put","uuid":"498944af-2773-4d4b-9526-9f7a08d326da","_route":"PUT_uuid"},"request_uri":"http://dams.library.unlv.edu/gemini/498944af-2773-4d4b-9526-9f7a08d326da","method":"PUT"} []
[2019-02-19 20:20:10] app.DEBUG: Checking for guard authentication credentials. {"firewall_key":"default","authenticators":1} []
[2019-02-19 20:20:10] app.DEBUG: Calling getCredentials() on guard configurator. {"firewall_key":"default","authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-02-19 20:20:10] crayfish.syn.jwt_authentication.DEBUG: Token: eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJpYXQiOjE1NTA2MjkyMTAsImV4cCI6MTU1MDYzNjQxMCwid2ViaWQiOiIxIiwiaXNzIjoiaHR0cHM6XC9cL2RhbXMubGlicmFyeS51bmx2LmVkdSIsInN1YiI6ImFkbWluIiwicm9sZXMiOlsiYXV0aGVudGljYXRlZCIsImFkbWluaXN0cmF0b3IiLCJmZWRvcmFhZG1pbiJdfQ.DwsyftTuj3e1t-l4ZbDaQ9Ff4L4S4hcacFjEcpuGq6TqBNa0FiG9Yv2CjNlfn7bF3jWQ6WJ7ml9quSVlxha7g6V-Iyk0mScFdHxMEZ0maPYwgJ115969_Pw9Gvzv7LXpexfFqXx20VQwn6k_xYJjaOLjIr7RzvHOA3FR2nif4m3pTcnUMi0iT3H246d_qOGEr-JPpr8TUsxxsSzx7xQdXQoASV3cobGtPKR6i4o5pjQhmtS5Ef7pt_ew153KigsMwrhMxqdWut5G5XdDaj4cXv_QlZPhtAe8lr9RlmLkOXCUnHdhhRPhgTbdFRjNt-eZm7AxeNfPDgo4trQjx64xHg [] []
[2019-02-19 20:20:10] app.DEBUG: Passing guard token information to the GuardAuthenticationProvider {"firewall_key":"default","authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-02-19 20:20:10] app.INFO: Guard authentication successful! {"token":"[object] (Symfony\\Component\\Security\\Guard\\Token\\PostAuthenticationGuardToken: PostAuthenticationGuardToken(user=\"admin\", authenticated=true, roles=\"authenticated, administrator, fedoraadmin\"))","authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-02-19 20:20:10] app.DEBUG: Guard authenticator set no success response: request continues. {"authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-02-19 20:20:10] app.DEBUG: Remember me skipped: it is not configured for the firewall. {"authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-02-19 20:20:10] app.DEBUG: > PUT /gemini/498944af-2773-4d4b-9526-9f7a08d326da [] []
[2019-02-19 20:20:10] app.DEBUG: < 201 [] []

But it never shows up in the log again.

The Drupal UUID for the same item (de77f71f-6321-4cb7-900a-080ffb219cbf) pops up in the log two days later:

[2019-02-21 18:27:59] app.INFO: Matched route "{route}". {"route":"PUT_uuid","route_parameters":{"_controller":"gemini.controller:put","uuid":"de77f71f-6321-4cb7-900a-080ffb219cbf","_route":"PUT_uuid"},"request_uri":"http://dams.library.unlv.edu/gemini/de77f71f-6321-4cb7-900a-080ffb219cbf","method":"PUT"} []
[2019-02-21 18:27:59] app.DEBUG: Checking for guard authentication credentials. {"firewall_key":"default","authenticators":1} []
[2019-02-21 18:27:59] app.DEBUG: Calling getCredentials() on guard configurator. {"firewall_key":"default","authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-02-21 18:27:59] crayfish.syn.jwt_authentication.DEBUG: Token: eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJpYXQiOjE1NTA3OTUyNzksImV4cCI6MTU1MDgwMjQ3OSwid2ViaWQiOiIxIiwiaXNzIjoiaHR0cHM6XC9cL2RhbXMubGlicmFyeS51bmx2LmVkdSIsInN1YiI6ImFkbWluIiwicm9sZXMiOlsiYXV0aGVudGljYXRlZCIsImFkbWluaXN0cmF0b3IiLCJmZWRvcmFhZG1pbiJdfQ.ETF4bTORPw-oljNdO9iRPVRP4VOay8MHMtddsjF2ZWhBBwJNJXNMniECWHlhJC9KgoXXhczobnOdBKAqr06QKoamxD3k-tn4ssm0E3fApf_PwBstviz8K_E11CJ1JlSU-6PkFf43Jp-WPN8fPZ3xqn4-XMZhk7vgtWXfrE5yJqoKlj8oP862p76rDtskp8QftWX0hVCdhMJ0wl96X8yqYLvUuDNOX1xwXQSSannRVWKm27nMiLo70FLtXZQWnmKKxpPjzPlNxllo0Iz8EzldwVe1QviT7czrIEVQ2WRi70nYW4Sumj_lUzvMMRRZY0BTW_GOe1P3qrV6-eK1CESgcw [] []
[2019-02-21 18:27:59] app.DEBUG: Passing guard token information to the GuardAuthenticationProvider {"firewall_key":"default","authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-02-21 18:27:59] app.INFO: Guard authentication successful! {"token":"[object] (Symfony\\Component\\Security\\Guard\\Token\\PostAuthenticationGuardToken: PostAuthenticationGuardToken(user=\"admin\", authenticated=true, roles=\"authenticated, administrator, fedoraadmin\"))","authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-02-21 18:27:59] app.DEBUG: Guard authenticator set no success response: request continues. {"authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-02-21 18:27:59] app.DEBUG: Remember me skipped: it is not configured for the firewall. {"authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-02-21 18:27:59] app.DEBUG: > PUT /gemini/de77f71f-6321-4cb7-900a-080ffb219cbf [] []
[2019-02-21 18:27:59] app.DEBUG: < 204 [] []

It returned a 204. At this point Gemini still has the old UUID for this item and has no record of the Drupal UUID it just received in the PUT command. This is evidenced by further gemini log messages where we look for the Drupal UUID and receive 404 responses:

[2019-03-05 10:51:24] app.INFO: Matched route "{route}". {"route":"GET_uuid","route_parameters":{"_controller":"gemini.controller:get","uuid":"de77f71f-6321-4cb7-900a-080ffb219cbf","_route":"GET_uuid"},"request_uri":"http://dams.library.unlv.edu/gemini/de77f71f-6321-4cb7-900a-080ffb219cbf","method":"GET"} []
[2019-03-05 10:51:24] app.DEBUG: Checking for guard authentication credentials. {"firewall_key":"default","authenticators":1} []
[2019-03-05 10:51:24] app.DEBUG: Calling getCredentials() on guard configurator. {"firewall_key":"default","authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-03-05 10:51:24] crayfish.syn.jwt_authentication.DEBUG: Token: eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiJ9.eyJpYXQiOjE1NTE3OTk5NjQsImV4cCI6MTU1MTgwNzE2NCwid2ViaWQiOiIxIiwiaXNzIjoiaHR0cHM6XC9cL2RhbXMubGlicmFyeS51bmx2LmVkdSIsInN1YiI6ImFkbWluIiwicm9sZXMiOlsiYXV0aGVudGljYXRlZCIsImFkbWluaXN0cmF0b3IiLCJmZWRvcmFhZG1pbiJdfQ.m-NFiapTEpW6GH30WnnoSTNjjGiYpVrw_IwSaIsXEvvlj8zMElr0D8i1CLXa-T0Mcr8EMpFSeFKQsCizp4oFwI7OJGd7nQJWcR-9IGQpE7mq0Ygzjjh7M86BHg4OMRNI5RrSIYud_V3cslot5QEmJRQWD439z3QZIAJzsAz2RJsJ-mHXD8UQZEYeZdej6QqyCe83apC97H0R8YMiCgOa7ykjPnEkBYIxRqqpluIgh8SOF4i5Sf7wClt-duSxw3smOsGg6OP7Z3yEFoSkevQK5XMZ9Ip_unRfAvJ3EVs-1cLGoYQTxKOHnmJpb0s2q4PAqkae5QaKJh4TtWN_epsOig [] []
[2019-03-05 10:51:24] app.DEBUG: Passing guard token information to the GuardAuthenticationProvider {"firewall_key":"default","authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-03-05 10:51:24] app.INFO: Guard authentication successful! {"token":"[object] (Symfony\\Component\\Security\\Guard\\Token\\PostAuthenticationGuardToken: PostAuthenticationGuardToken(user=\"admin\", authenticated=true, roles=\"authenticated, administrator, fedoraadmin\"))","authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-03-05 10:51:24] app.DEBUG: Guard authenticator set no success response: request continues. {"authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-03-05 10:51:24] app.DEBUG: Remember me skipped: it is not configured for the firewall. {"authenticator":"Islandora\\Crayfish\\Commons\\Syn\\JwtAuthenticator"} []
[2019-03-05 10:51:24] app.DEBUG: > GET /gemini/de77f71f-6321-4cb7-900a-080ffb219cbf [] []
[2019-03-05 10:51:24] app.DEBUG: < 404 [] []

There are several similar requests for the Drupal UUID, each receiving the same response: 404.

@seth-shaw-unlv
Copy link
Contributor Author

Addendum to the Migrate possibility, I also attempted a drush mim --update. That did change the file uuid, but Gemini was also correctly updated. Or, more accurately, the old UUID record was deleted from Gemini and a new record with the new UUID created.

So, I can't find fault with the Drupal API in my small tests on this account.

I wonder if this is all a symptom of #978; camel crashed, never processed the Gemini delete, and the 204 wasn't really a success but a failure due to an entry already using the fedora or drupal uri. @dannylamb, does Gemini have a database unique constraint on any field other than the UUID that may have resulted in this happening?

@seth-shaw-unlv
Copy link
Contributor Author

Ah hah! Gemini has a unique key on the fedora/drupal uri hashes:

UNIQUE KEY `fedora_hash` (`fedora_hash`,`drupal_hash`)

Since both the Drupal and Fedora URIs shouldn't have changed from one entry to the other the hashes wouldn't have either. The log doesn't show a delete for the old record so the database constraint would have refused the new one. However, the 204 response (SUCCESS, NO CONTENT) doesn't accurately reflect the new record's failure.

@mjordan
Copy link
Contributor

mjordan commented Apr 5, 2019

@seth-shaw-unlv that is some pretty sweet trouble-shootin' 😃

@whikloj whikloj added this to the 1.0.0 milestone Apr 11, 2019
@dannylamb
Copy link
Contributor

@seth-shaw-unlv Looking at the code, it should:tm: update if the record already exists. I'll have to manually send some requests to confirm this behaviour.

@dannylamb
Copy link
Contributor

@seth-shaw-unlv Ok, so after digging deeper (and finding out we don't even have a logger in the controller...) I've managed to step through all this and sort out what's going on. Turns out it's not blowing up because Doctrine considers the update query containing the exact same information to be ok and doesn't throw any exceptions. Hence 204, which is a success as in "Nothing changed... here's no content". Which isn't exactly "Success, something was changed... here's no content".

I can't really think of a status code that would express that off the top of my head. Maybe it could be 200 but with a message?

@whikloj
Copy link
Member

whikloj commented May 7, 2019

@dannylamb so when this SQL command occurs does anything change in Gemini?

@whikloj
Copy link
Member

whikloj commented May 7, 2019

304 Not Modified?

@dannylamb
Copy link
Contributor

@whikloj It makes the update query, which is successfully applied, but it just overwrites it with the exact same data. So it's successful in technicality.

304 is for GET and HEADs and caching, etc... but hijacking it for a PUT response when nothing was changed feels appropriate. It certainly conveys the message better than 200, which I imagine would imply that there was some operation performed. And it's not like the HTTP police are going to come and take us away...

Also, found this when researching:

@whikloj
Copy link
Member

whikloj commented May 8, 2019

So looking back here I am confused.

@dannylamb is indicating that the service is simply overwriting the existing DB record with an exact copy.

Whereas @seth-shaw-unlv seems to be saying that the second insert/update has the same Drupal and Fedora URI but a different UUID. So it is not an exact copy.

I think we need consensus on which it is before we can move forward here.

@whikloj
Copy link
Member

whikloj commented May 8, 2019

Also @dannylamb has somehow commented in the future 👻 So his is always last.

@seth-shaw-unlv
Copy link
Contributor Author

@whikloj is right. It isn't creating an exact copy because the UUID is different. Two separate POSTs, both have the same Drupal and Fedora URIs but with different UUIDs. The second POST isn't saving the new UUID so it shouldn't return success.

@seth-shaw-unlv
Copy link
Contributor Author

Perhaps a 409 (Conflict)?

@dannylamb
Copy link
Contributor

How in the world did I time travel? Github best unlock its secrets to me. I have children, dangit, and need more time for everything!

@dannylamb
Copy link
Contributor

Anyhow, sorry I misunderstood the situation @seth-shaw-unlv. Let me test again and see if it throws an error in that instance, too. Because if it does blow up I'd expect to get a 500 based on how the code is now.

@dannylamb
Copy link
Contributor

@seth-shaw-unlv Confirmed.

vagrant@claw:/var/www/html/drupal/web/modules/contrib/islandora$ curl -i -H "Authorization: Bearer islandora" -X PUT -d '{"drupal":"abc","fedora":"123"}' -H "Content-Type: application/json" localhost:8000/gemini/abc123
HTTP/1.1 204 No Content
Date: Thu, 09 May 2019 15:08:56 GMT
Server: Apache/2.4.18 (Ubuntu)
X-Powered-By: PHP/7.1.28-1+ubuntu16.04.1+deb.sury.org+3
Cache-Control: no-cache, private
Content-Type: text/html; charset=UTF-8

vagrant@claw:/var/www/html/drupal/web/modules/contrib/islandora$ curl -i -H "Authorization: Bearer islandora" -X PUT -d '{"drupal":"abc","fedora":"123"}' -H "Content-Type: application/json" localhost:8000/gemini/def456
HTTP/1.1 204 No Content
Date: Thu, 09 May 2019 15:09:07 GMT
Server: Apache/2.4.18 (Ubuntu)
X-Powered-By: PHP/7.1.28-1+ubuntu16.04.1+deb.sury.org+3
Cache-Control: no-cache, private
Content-Type: text/html; charset=UTF-8

vagrant@claw:/var/www/html/drupal/web/modules/contrib/islandora$ curl -i -H "Authorization: Bearer islandora" localhost:8000/gemini/def456HTTP/1.1 404 Not Found
Date: Thu, 09 May 2019 15:10:33 GMT
Server: Apache/2.4.18 (Ubuntu)
X-Powered-By: PHP/7.1.28-1+ubuntu16.04.1+deb.sury.org+3
Cache-Control: no-cache, private
Content-Length: 36
Content-Type: text/html; charset=UTF-8

Could not locate URL pair for def456

I'm on board. 409 FTW.

@whikloj whikloj self-assigned this May 29, 2019
@whikloj
Copy link
Member

whikloj commented May 29, 2019

So this seems to be the issue, but I am unclear why we can update the URLs? Other than using this to determine if we are creating (201 Created) or updating (204 No Content) do we need this?
https://github.com/Islandora-CLAW/Crayfish/blob/master/Gemini/src/UrlMapper/UrlMapper.php#L70-L77

@dannylamb
Copy link
Contributor

Is this still a thing?

@seth-shaw-unlv
Copy link
Contributor Author

It is still a thing, but it is an edgecase that only shows up when something else broke. So, yeah, a bug, but a minor one.

@dannylamb
Copy link
Contributor

Given that we've ditched Gemini due to reasons just like this, I'm closing the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants