Hathifiles daily loading #83
68ea566 to 382fb84
Ignore the vendored boilerplate for kubernetes for now (@daaang is still figuring out patterns for best managing this long-term)
holdings: {
  mysql: {
    port: 3306,
    ip: '10.255.8.249',
I don't love repeating this magic IP address across a bunch of public repos. Consul should soon save us from needing to do this, though.
Is there a timeframe for consul? Does a roadmap for this stuff exist? It's a little exhausting to have half-baked devops dependencies pop up randomly and repeatedly.
I talked some to @daaang and @botimer about this. Right now the top priority is LSP-related stuff, getting to some shared patterns regarding the tanka / jsonnet stuff and github actions, consul after that, and (last I heard) loki and prometheus for shared logging after consul. The major blocker is unrelated things continuing to break. As far as I can tell there is no formal roadmap, project charter, JIRA epic, etc. There are some issues around some of this on the A&E JIRA, which is no longer public to the rest of HT/LIT.
@daaang @botimer I would say regarding tanka/jsonnet and github actions -- I'm happy to contribute some time to that especially as now we have a variety of examples on the HT side for review (this, https://github.com/hathitrust/otis, https://github.com/hathitrust/feed, https://github.com/hathitrust/dex-htrc, https://github.com/hathitrust/hathi_search_client)
I agree that it is exhausting. Unfortunately, we have been pulled into broken thing after broken thing repeatedly for 6 weeks straight. We do have a roadmap that needs some revisions. It's been exceptionally difficult to make and keep a roadmap when 75%+ of our time is on emergencies.
Part of the complication has been the demand for multi-cluster deployments. There is no general solution for this, so the research has been extensive, in order to avoid coming up with something homemade and half-baked. The community is divided between managing multiple clusters completely from the outside (via GitOps and tools like Flux and ArgoCD) and building that support into Kubernetes via federation (working group drafts). We still don't have our plan settled, but it's looking like the Flux/Argo models are pretty well aligned with our general notions.
As for Consul, it's been "number two" on the list for six months. We are working on it when we can, and believe that it will solve a set of problems, but no one should be selling it yet. It's not ready, and we don't yet know when it will be, because there are too many promises made on our time by others. In other words: do whatever else it takes for now.
newest = create(:loaded_file, produced: Date.today - 1, type: "hathifile")
create(:loaded_file, produced: Date.today - 2, type: "hathifile")
create(:loaded_file, produced: Date.today - 1, type: "holding")
newest = create(:loaded_file, produced: Date.today - 1, type: "test")
I was testing this locally, and the tests were fighting with data in the dev database. While we could (and perhaps eventually should) set up separate dev & test databases via docker-compose.yml, it's probably also worthwhile to make sure the tests don't make unnecessary assumptions or generate unnecessary conflict.
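To illustrate why switching the spec to a dedicated type helps, here is a rough sketch in plain Ruby. The Struct and the `latest` helper are hypothetical stand-ins, not the actual LoadedFile model or its query API; the only point is that rows of a type nothing else uses can't collide with whatever is already sitting in a shared database.

```ruby
require "date"

# Hypothetical in-memory stand-in for the LoadedFile model (assumption).
LoadedFile = Struct.new(:produced, :type)

# Hypothetical helper: the most recently produced record of one type.
def latest(records, type:)
  records.select { |r| r.type == type }.max_by(&:produced)
end

# A row that might already exist in a shared dev database.
existing = [LoadedFile.new(Date.today, "hathifile")]

# The spec creates rows with a type that real data never uses.
newest = LoadedFile.new(Date.today - 1, "test")
older  = LoadedFile.new(Date.today - 2, "test")
records = existing + [newest, older]

# The pre-existing "hathifile" row is newer, but it is filtered out by type.
puts latest(records, type: "test") == newest # prints "true"
```

Had the spec used type "hathifile" here, the pre-existing row would have been returned instead and the expectation would fail.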
61898d4 to a91ee69
{
  local config = $._config.search_client,

  phineas:: {
I know "phineas" is a thing, but there doesn't appear to be any documentation for it. I found a reference to https://github.com/mlibrary/phineas-config but that 404s.
Yup. It's another half-baked devops dependency leaking into our projects. All I can say is that this is temporary and there is a plan, but there is no timeline, and we need to keep moving forward with what we're trying to do on our end.
There are two MultiLogger classes.
I remain skeptical of the utility of the sonnets. We are adding 30k lines of vendor code to the repo. Is the added complexity commensurate with the problem we are solving?
@@ -0,0 +1,466 @@
// Override defaults paramters for objects in the ksonnet libs here.
I am still befuddled by vendor stuff in the repo.
I'm also not wild about it, but I think right now our choices are:
- Follow the example from github.com/mlibrary/patron_account, which is the most recent example from A&E of doing this; update our repositories when there is newer guidance
- Strike out on our own -- either with Kustomize (https://kustomize.io/), our own jsonnet/tanka stuff, or a gitops tool like https://fluxcd.io/ -- but I'm not wild about spending a lot of time investigating options here and (likely) ending up out of sync with A&E recommendations.
At the least, I think we're at a place where doing something managed is better than dealing with raw YAML resources, and this is at least something where we can benefit when A&E comes up with more useful recommendations.
One option would be to take all the k8s stuff out of the individual repositories and move it to https://github.com/hathitrust/phineas-config or https://github.com/hathitrust/ht_config_k8s -- that has the benefit of avoiding multiple vendored copies of stuff, at the cost of making it less transparent where the config is. We could also just omit the vendored stuff, but then it requires additional steps to get to the point where you can run tk apply.
- fix missing requires
- logger config
- loading flag config
- file_mutex ensures cleanup
- hathifile produced on a given date is named for the previous day
- LoadedFile tests don't get confused by other stuff in db
This is preparatory for having additional environments for one-off jobs, etc. that also bring in the services, to ensure they're defined
Adds: