Coordination of server use #377
The run started on the 6th of August has not finished yet, and two tasks failed. Both problems were caused by subst_ids which were in the mv_grid table but, due to the new run of osmTGmod, not part of the etrago buses. When I enforced a re-run of the mv-grid dataset, the tasks finished successfully.
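(For future debugging, a quick way to spot such orphaned IDs could be a join between the two tables. The sketch below is an assumption, not taken from this thread: the schema, table, and column names as well as the connection string are placeholders and may differ from the actual eGon-data database.)

```python
# Hypothetical diagnostic sketch: list mv_grid substations without a matching
# etrago bus. All schema/table/column names and the DSN are placeholders.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg2://user:password@localhost:5432/egon-data")

query = """
    SELECT mv.subst_id
    FROM grid.egon_mv_grid_district AS mv
    LEFT JOIN grid.egon_etrago_bus AS bus
        ON mv.subst_id = bus.bus_id
    WHERE bus.bus_id IS NULL;
"""

orphaned = pd.read_sql(query, engine)
print(f"{len(orphaned)} mv_grid substations without a matching etrago bus")
```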
The new branch for the Friday runs @gnn was talking about does not exist yet, right?
I think he was talking about this branch: https://github.com/openego/eGon-data/tree/continuous-integration/run-everything-over-the-weekend
Thank you, I didn't copy the name during the webco, and the docs have not been updated yet.
Apparently, there has been no run on Friday?!
May I start it today? @gnn
I would find that great, yes!
gnn told me that he started a clean run on Friday, but I haven't checked the results yet.
Ah, I just noticed he didn't use the image we used before but created a new one. But I don't know which HTTP port it's listening on... :(
Got it, it's port 9001 (do you know how to reconfigure the tunnel, @AmeliaNadal?). Apparently, it crashed quite early at some tasks. It's very likely that the first one was caused by insufficient disk space, as there are only 140G free (after cleaning up temp files), and that might not be sufficient for the temp tables created by osmTGmod. So I propose to delete my old setup we used before and re-run the new one. Shall I do so? Any objections, @IlkaCu, @AmeliaNadal?
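(The tunnel setup itself is not documented in this thread; typically it is a local SSH port forward such as `ssh -L 9001:localhost:9001 egondata@<server>`. For scripting, a rough Python sketch using the `sshtunnel` package could look as follows; the host name and key path are placeholders, not taken from this issue.)

```python
# Hypothetical sketch: forward local port 9001 to the Airflow webserver on the
# remote machine. Host, user, and key path are placeholders.
from sshtunnel import SSHTunnelForwarder

tunnel = SSHTunnelForwarder(
    "server.example.org",
    ssh_username="egondata",
    ssh_pkey="~/.ssh/id_rsa",
    remote_bind_address=("127.0.0.1", 9001),
    local_bind_address=("127.0.0.1", 9001),
)
tunnel.start()
# The Airflow web UI should now be reachable at http://localhost:9001
# ... work with the UI ...
tunnel.stop()
```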
I could access the results (thanks for asking, @nesnoj!) and my tasks haven't run. So I have no objection to you re-running the workflow ;)
Done. Update:
I'm done on the server and happy; go ahead, @IlkaCu.
I merged one bug fix into
Great, thank you.
If I see it correctly, the server run in normal mode was successful. 🥳
Awesome!
Generally I'm fine with both options, but I guess that there might be some additional checks necessary (at least in #260) before it can get merged to dev. I reckon there will be some more commits in the branches, so separate merging via PRs seems cleaner to me.
A task of mine failed due to some column name adjustments in 5b7d9f2.
I see that I missed an open question last week. Sorry for that.
Since the CR branch might contain changes which are working but not yet meant to be merged into dev, I'm in favour of merging tested feature branches into dev individually. This also makes it easier to figure out where a change came from, which is important when trying to fix bugs which are discovered later on. Hence my 👍 to @nesnoj's comment. :)
That's exactly what has been most annoying when keeping track of two branches. Thanks for the hint! 🙏 BTW @IlkaCu: some of "your" tasks failed in the current run. Also, we get a
We're experiencing some odd stuff: parts of 2 tasks in
Yes, @nailend had to mark the task failed as it would take too much time without parallelization. The parallelization was lost in the same way as the stuff mentioned above; we assume that someone who merged recently didn't take sufficient care. This unfortunately cannot be fixed in the current clean run (new tasks are not detected properly). We would have to start a versioned run. Is that OK for you, @ClaraBuettner, @IlkaCu, @AmeliaNadal? However, this would raise the problem in #979, right?
Let's give it a try. #979 will not necessarily appear again, I guess.
I don't really see another solution, so that's OK for me too :)
Thanks for the quick replies! I'll take care of it.
Uh, the max_con limit I set before had no consequence, as it was overridden by @gnn's manual script 🤦♂️. But we agreed to set the HP tasks (
It's running, but
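(How the limit was applied is not shown in this thread. One common way to cap how many such tasks run in parallel, and with that how many database connections they open, is an Airflow pool. The following is a rough sketch assuming Airflow 2-style imports; the pool name, slot count, and task names are made up for illustration.)

```python
# Hypothetical sketch: limit concurrency of heat-pump related tasks via an
# Airflow pool. Create the pool once, e.g.:
#   airflow pools set heat_pump_pool 4 "HP tasks"
# All names and numbers below are illustrative, not taken from this thread.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_hp_task():
    # Placeholder for the actual heat-pump dataset function.
    pass


with DAG("hp_pool_demo", start_date=datetime(2022, 1, 1), schedule_interval=None) as dag:
    hp_task = PythonOperator(
        task_id="heat_pump_disaggregation",
        python_callable=run_hp_task,
        pool="heat_pump_pool",  # at most 4 slots, so at most 4 such tasks run at once
    )
```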
Thanks for the notification, this is solved!
Done and cleared.
On the prior run, which finished yesterday, one task failed:
Thanks for notifying; I've removed the sanity checks for eGon100RE in CI.
The run finished today :)
The following tasks failed (I tried to clear them but they failed again):
Hey @AmeliaNadal!
Looks like a certificate problem on the provider's side which you wouldn't be able to solve (I guess it is possible to use a parameter to ignore the certificate temporarily). The cert for that site was renewed 6 days ago, so it is supposed to work; did you stumble across this error today?
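(A rough sketch of what such a temporary workaround could look like, assuming the affected task downloads the file via the `requests` package; the URL and file name are placeholders, and verification should only be disabled as a short-lived measure.)

```python
# Hypothetical workaround sketch: skip TLS certificate verification for a
# single download. URL and file name are placeholders; use only temporarily.
import requests
import urllib3

# Silence the InsecureRequestWarning that verify=False would otherwise emit.
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

response = requests.get("https://data.provider.example/dataset.zip", verify=False)
response.raise_for_status()

with open("dataset.zip", "wb") as f:
    f.write(response.content)
```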
Self-explanatory. Sounds like the Hetzner server, which is 99% full. We have 2 instances (
(By the way, I didn't take part in the last meetings, so sorry if I'm stating something obvious you're already aware of.)
Yes, that run can be deleted.
Done. @AmeliaNadal I restarted the tasks but the SSL error persists.
The old file is still available but not covered by the cert for some reason. So the instance is up and running again.
Hi everyone,
I will have a look.
Hey @AmeliaNadal, I just stumbled across this error in another project; it was a problem with
I checked the versions and the current CI run used
Apparently (see #377 (comment)), this version breaks the `demandregio.insert-cts-ind-demands` task (#1108).
This issue is meant to coordinate the use of the `egondata` user/instance on our server in FL. We already agreed on starting a clean run of the dev branch every Friday. This will (most likely) make some debugging necessary on Mondays. To avoid conflicts while debugging, please comment in this issue before you start debugging and briefly note which datasets / parts of the workflow you will be working on.