Using Goblin with Azure Cosmos DB #84
Comments
Hi Jordan, I have never used Cosmos DB. I can help look into this, but a couple things first:
Hi @davebshow,

> Do you know what version of TinkerPop Cosmos DB uses?

Cosmos DB uses TinkerPop version 3. According to https://docs.microsoft.com/en-us/azure/cosmos-db/create-graph-gremlin-console the Gremlin console requires TinkerPop 3.2.5 or above, so that's how we inferred the version they use.

> What version of Cosmos DB do you have?

We're using the latest version of Cosmos DB, as we launched it a day ago.

> What version of Goblin, aiogremlin, and gremlinpython you are using? Versions are a little wacky right now, I am working to get things up to speed.

Goblin 2.0.0

PS: Microsoft Azure's website points us to Goblin as the primary Gremlin API to use: https://docs.microsoft.com/en-us/azure/cosmos-db/. It seems like someone got it to work at some point ;)

Thanks for your help!
-Jordan
First things first: update your versions. Like I said, the versions are a bit wacky right now--I've been finishing grad school/moving/starting a new job and I'm still not caught up. In a fresh environment you would do:
This should also give you [...] I don't know if this will solve your problems, but it is a first step. The Goblin 2.0 package tests against TinkerPop 3.2.4, and there have been quite a few changes since.
Hi @davebshow,

We updated the environment using the versions suggested above in a completely new virtualenv. That reduced the errors, but we still get the main one: line 8, in

We can connect to Cosmos Graph DB using the node.js gremlin package, and we can connect to it using the Gremlin console. However, we'd like to use Python, and your library seems to be the best approach; even Microsoft recommends it. Please see https://docs.microsoft.com/en-us/azure/cosmos-db/create-graph-gremlin-console#ConnectAppService for the YAML template that they recommend for connecting Gremlin to Cosmos DB.
After looking again, it seems like this is a configuration issue. You can't pass the gremlin.yaml for the Java driver to Goblin. The Goblin config tries to be as similar as possible to the Java driver's, but for practical reasons it can't be identical (for example, you can't pass Java serializer classes to the Python code). Please try to update your config based on the Goblin app config options: http://goblin.readthedocs.io/en/latest/app.html. Sorry, the docs aren't in the best state right now. Also, it looks like you may need to configure SSL options. I'm not sure if you are really using SSL, but the examples for Cosmos all set
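As a rough illustration of the point above, the sketch below flags settings that come from the Java driver's gremlin.yaml before a config is handed to Goblin. The helper name `find_java_only_keys` and the key set are assumptions made for illustration (the two keys are the Java-style entries in the config posted in this issue); none of this is part of Goblin's API.

```python
# Hypothetical helper, not part of Goblin: flag Java-driver-only settings
# in a config dict before passing it to Goblin.

# Assumption: these two keys are the Java-driver entries seen in the
# config posted in this issue; extend the set as needed.
JAVA_ONLY_KEYS = {"connectionPool", "serializer"}


def find_java_only_keys(config):
    """Return the sorted top-level keys that look Java-driver specific."""
    return sorted(key for key in config if key in JAVA_ONLY_KEYS)


# The config from the original report, written as a Python dict.
issue_config = {
    "hosts": ["gremlin_uri_that_cosmos_gives_us_in_azure_dashboard"],
    "port": 443,
    "username": "/dbs/graphdb/colls/Persons",
    "password": "somepassword",
    "response_timeout": 5,
    "connectionPool": {"enableSsl": True},
    "serializer": {
        "className": "org.apache.tinkerpop.gremlin.driver.ser."
                     "GraphSONMessageSerializerV1d0",
        "config": {"serializeResultToString": True},
    },
}

print(find_java_only_keys(issue_config))
```

Any key this reports would need to be dropped or replaced with the equivalent option from the Goblin app config docs linked above.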
Hi @davebshow,

Good catch. I updated the config based on the Goblin app config in http://goblin.readthedocs.io/en/latest/app.html. Here's what it looks like now:

scheme: 'wss'

However, now it's looking for

/home/clink-im/envs/goblin_test/bin/python /home/clink-im/haydenbeadles/clearinghouse/cl_janus/test_goblin.py

It seems to need the SSL files, since we pass those config options. I can look again, but I didn't see any key or cert files available from Cosmos DB; I think that's all handled internally. I commented out the code like so:

server.py:
It gets past that error. Now I try either of the following.

OGM create vertex script:

import asyncio, datetime
from goblin import element, properties
class Person(element.Vertex):
class Knows(element.Edge):
loop = asyncio.get_event_loop()
app = loop.run_until_complete(Goblin.open(loop, configfile='config.yaml'))
app.register(Person, Knows)
async def go(app):
loop.run_until_complete(go(app))

or the list vertex script:

import asyncio, datetime
loop = asyncio.get_event_loop()
from goblin import DriverRemoteConnection # alias for aiogremlin.DriverRemoteConnection
async def go(loop):
results = loop.run_until_complete(go(loop))

I get back this error for both:

/home/clink-im/envs/goblin_test/bin/python /home/clink-im/haydenbeadles/clearinghouse/cl_janus/test_goblin2.py
Process finished with exit code 1

It fails on await session.flush() and vertices = await g.V().toList() respectively. I'm not sure where to go from here.

It seems to be ignoring my username and password config options, at least until it fails. I am using the right username and password, but as a negative test case I also tried a wrong username and password in the YAML file and never saw a difference: the error above doesn't change.

Again, thank you for your help here. I feel like we might be close?
I think I may have figured that problem out. It was due to:

:param float response_timeout: (optional)

It needs to be a float, and passing response_timeout: None in the YAML isn't a float value. I know your docs put that in there, but it doesn't seem to like it. If I use response_timeout: 60, the app times out at 60 seconds with an error like so:

/home/clink-im/envs/goblin_test/bin/python /home/clink-im/haydenbeadles/clearinghouse/cl_cosmos/test_goblin2.py
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
Process finished with exit code 1

If I instead comment the option out in the YAML file (#response_timeout: None), the app hangs indefinitely.

It never seems to really communicate past "connecting" with Gremlin on Cosmos, at least past the hostname. When I put the wrong hostname in the hosts config option, it fails with Cannot connect to host _hostname_om:443 ssl:True [Name or service not known]. When I put the right one, it seems to get past the connection part. It gets to this:

async def go(loop):

and then just hangs with no return, unless I set a numeric timeout for response_timeout. In the debugger, under .toList(), I see
The self at this point looks like this:

self = {AsyncGraphTraversal} [['V']]

That doesn't look right to me (unable to get repr... doesn't look good, and the graph shows as empty).

Here's what Cosmos DB says needs to be used for the username and password (https://docs.microsoft.com/en-us/azure/cosmos-db/create-graph-gremlin-console#ConnectAppService):

I guess that format, particularly for the username, isn't causing an issue with Goblin? Anyway, if I leave the username and password blank, it still hangs. It seems like it hasn't even got far enough to authenticate. These are just my observations; maybe they'll help.
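On the response_timeout observation above: in YAML, a bare None is loaded as the string 'None' rather than as a null value (YAML spells null as null or ~), which may be why the value was rejected where a float is expected. Below is a minimal sketch of normalizing the raw value before it reaches the driver. The name `normalize_timeout` is a hypothetical helper, not part of Goblin or aiogremlin.

```python
def normalize_timeout(value):
    """Turn a raw config value into a float timeout, or None for no timeout."""
    if value is None:
        return None
    if isinstance(value, str):
        text = value.strip()
        # Treat the common textual spellings of "no value" as no timeout.
        if text.lower() in ("", "none", "null", "~"):
            return None
        value = text
    timeout = float(value)  # raises ValueError for non-numeric strings
    if timeout <= 0:
        raise ValueError("response_timeout must be positive")
    return timeout


print(normalize_timeout("None"))
print(normalize_timeout(60))
```

With this in place, a YAML entry such as response_timeout: None and a missing entry are both treated as "no timeout", while 60 becomes the float 60.0.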
One thing we'll try in the morning is reverting Goblin and its dependencies to the latest stable release, just to cover our bases. We were originally passing YAML config params that Goblin didn't recognize; now that the YAML file is potentially Goblin-safe, I'd like to try it against a stable release. I feel like Cosmos would have kept backwards compatibility with TinkerPop, but maybe I'm wrong. Worth a try...
After reverting to the latest stable versions (the versions you specified earlier), we continue to get errors. Our Goblin connection to Cosmos is still hanging.
Ok, well, I have some good news and some bad news. The good news is that someone from Microsoft contacted me about getting going with Python and Cosmos, so hopefully I will have more info and there will be more documentation soon. The bad news is that the Cosmos DB Gremlin endpoint does not accept bytecode, only strings. This means that Goblin, as well as any GLV code, will not work with Cosmos. Instead, you have to submit queries as a string (shown in this example): http://aiogremlin.readthedocs.io/en/latest/usage.html#using-the-driver-module Hopefully I will have more info for you soon.
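A minimal sketch of the string-submission approach described above, assuming the aiogremlin driver API from the linked usage docs (`Cluster.open`, `cluster.connect`, `client.submit`) and the config option names used earlier in this thread. The helper names `cosmos_username`, `driver_config`, and `submit_script` are hypothetical, the host and password are placeholders, and this has not been verified against a Cosmos DB endpoint.

```python
def cosmos_username(database, collection):
    """Build the username in the /dbs/<database>/colls/<collection> form."""
    return "/dbs/{}/colls/{}".format(database, collection)


def driver_config(host, database, collection, password):
    """Assemble driver options (assumed names: scheme, hosts, port, ...)."""
    return {
        "scheme": "wss",
        "hosts": [host],
        "port": 443,
        "username": cosmos_username(database, collection),
        "password": password,
    }


async def submit_script(loop, config, script):
    """Send a Gremlin query as a plain string and collect the results."""
    # Imported here so the helpers above work without aiogremlin installed.
    from aiogremlin import Cluster

    cluster = await Cluster.open(loop, **config)
    try:
        client = await cluster.connect()
        result_set = await client.submit(script)  # a string, not bytecode
        results = []
        async for msg in result_set:
            results.append(msg)
        return results
    finally:
        await cluster.close()


config = driver_config("<your-cosmos-gremlin-host>", "graphdb", "Persons",
                       "<password>")
print(config["username"])

# Against a real endpoint, usage would look roughly like:
#   import asyncio
#   loop = asyncio.get_event_loop()
#   print(loop.run_until_complete(submit_script(loop, config, "g.V().count()")))
```

The query travels as text, so nothing here depends on the server understanding bytecode; the trade-off is losing the OGM and traversal-building conveniences that Goblin provides.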
Hi @davebshow, Thanks a bunch (to you and Microsoft) for taking point on this! It would be great to use Goblin, so I like the collaboration and effort there. It does sound like some work to get it working with the Cosmos DB Gremlin endpoint, given that it doesn't accept bytecode. That would be a change... Keep us in the loop, we'd appreciate it! Our alternative approach for now (working in parallel) is to use OrientDB with the pyorient OGM and its underlying driver. It has its own hurdles that we're working through. The language-agnostic nature of Gremlin and the fully managed Cosmos graph DB are very enticing.
Update: they will be accepting bytecode in the near future. Also, tomorrow or the next day I will be playing around a bit with Python and Cosmos. I'll let you know what I figure out.
I was going to ask if Microsoft was going to accept bytecode soon, so I'm not surprised by that answer. They should keep to the standard ;) Sounds good!
Hi @davebshow, Any news on Goblin with Cosmos?
Well, like I said, until they support bytecode, Goblin is a no-go with Cosmos. I know that the guys trying things out at Microsoft have got Gremlin Python up and running with Cosmos, and I was going to try some script submission with aiogremlin when I get a chance.
Ok, thanks Dave for the update!
@davebshow: Any news on this? I'm interested in trying this library out with Cosmos DB, but I'm having problems with the configuration as well.
AFAIK Cosmos still doesn't support bytecode, so using a GLV-based solution (like Goblin) still isn't an option. You should be able to connect with the
@davebshow Is this still the case? Any update on this?
Hi,
Has anyone been able to use Goblin with Cosmos DB?
Here's our yaml config:
hosts: ['gremlin_uri_that_cosmos_gives_us_in_azure_dashboard']
port: 443
username: '/dbs/graphdb/colls/Persons'
password: 'somepassword'
response_timeout: 5
connectionPool: {
enableSsl: true}
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { serializeResultToString: true }}
We run
import asyncio, datetime
import goblin
from goblin import Goblin
from goblin import driver, abc, exception
loop = asyncio.get_event_loop()
app = loop.run_until_complete(Goblin.open(loop, configfile='config.yaml'))
app.close()
but it always comes back with
Traceback (most recent call last):
File "/usr/lib/python3.6/asyncio/selector_events.py", line 724, in _read_ready
data = self._sock.recv(self.max_size)
ConnectionResetError: [Errno 104] Connection reset by peer
We've tried with the Gremlin console, and it works somewhat better against Cosmos DB, but after a while we get the connection reset by peer error there too. It seems to be a Cosmos DB/networking issue with Azure, but we're not sure where to look.
We just have a trial edition of Azure at this point, and we're evaluating Goblin with Cosmos DB for graphs.
The Azure documentation for Cosmos DB using Gremlin says to use the Goblin Python driver for Python support, but we can't even connect at the moment :(
Any help would be appreciated.
Thanks,
Jordan