Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define an interface for meta-pipeline client #61

Closed
ccojocar opened this issue Jul 29, 2019 · 18 comments
Closed

Define an interface for meta-pipeline client #61

ccojocar opened this issue Jul 29, 2019 · 18 comments
Assignees
Milestone

Comments

@ccojocar
Copy link
Contributor

It would be nice to have a client with a clean interface to start/stop the meta-pipeline. This can be defined for now in jx.

The interface will define the required data struct and the actions which can be performed.

@hferentschik you can assign this to you if you want to work on it.

@hferentschik hferentschik changed the title Define an interface for met-pipeline client Define an interface for meta-pipeline client Jul 29, 2019
@hferentschik
Copy link
Contributor

hferentschik commented Jul 29, 2019

We have this already a little bit in metapipeline.go with func CreateMetaPipelineCRDs(params CRDCreationParameters) . We can extend on this. There is already very little left in step_create_meta_pipeline apart from command line parsing and validation. Having an interface and impl which Lighthouse can call directly instead of using StepCreatePipelineOptions.Run is really a minimum we should aim for.

A further step would be starting to move this client/interface out of pkg and into a top level metapipeline package which Lighthouse could depend on.

That said, what's about avoiding a hard dependency to jx altogether. There are several benefits to this imo. There is less risk of other, non meta pipeline related jx dependencies, sneaking into the code base and the code will imo be cleaner. Also depending on jx on the moment is really hard in terms of dependency management due to all the required replace statements It feels messy.

I guess there are two options to avoid a hard dependency to jx. One is to include the jx binary into the image for Lighthouse and let it call the jx binary via os.exec. Comparing to how we currently build the StepCreatePipelineOptions and then call Run, there is really not such a big difference. You exchange the hard jx dependency against more complexity in building the image.

The second option is to keep the pipelinerunner controller and let Lighthouse make an HTTP request to trigger the pipeline. Doing that you:

  • have no hard dependency of Lighthouse to jx
  • have a clear separation of concerns
  • have a more micro service orientated architecture

Given that pipelinerunner is ready to go and it got an overhaul just recently, there is no additional development overhead for this solution. On the contrary. Only drawback might be the additional required resources to run the controller.

Thoughts?

@hferentschik hferentschik self-assigned this Jul 29, 2019
@ccojocar
Copy link
Contributor Author

I agree that running a separate microservice and using a REST/gRCP interface is the best option to decouple from jx. Another option is to use a CRD between the hook handler and pipelinerunner but this seems to me overkilling at this point. Of course as you mentioned, invoking jx as a binary is also a solution but I am not a big fun of this, it seems patchy - running multiple application process in the container does not seem a robust solution to me.

In my opinion the current pipelinerunner interface is not well designed. It just accepts an entire ProwJob. I believe there is room for improvement here, especially to define a clean interface REST/gRPC in terms of metapipeline domain and needs. The metapipeline since to me the natural owner of that interface. The lighthouse is just a consumer. I have a slight preference for gRPC especially for the type grantees it brings.

@jstrachan is advocating to start directly the meta-pipeline from hook/trigger in order to have less moving parts.

@jstrachan
Copy link
Member

I'm hoping with some refactoring we can keep the codebase of creating the meta pipeline CRDs pretty small with minimal dependencies so it won't be too much of an issue to invoke that code directly from a single auto-scaled pod to keep the footprint of Jenkins X nice and small

@ccojocar ccojocar reopened this Jul 30, 2019
@hferentschik
Copy link
Contributor

I agree that running a separate microservice and using a REST/gRCP interface is the best option to decouple from jx.

Agreed. Also my preferred choice.

Another option is to use a CRD between the hook handler and pipelinerunner but this seems to me overkilling at this point.

+1

Of course as you mentioned, invoking jx as a binary is also a solution but I am not a big fun of this, it seems patchy

Agreed. It is not my preferred solution either. Mainly mentioned it for completeness. Everything is a trade off though, if we'd consider no module dependency to jx important as well as no additonal service, than this is imo a viable option.

In my opinion the current pipelinerunner interface is not well designed. It just accepts an entire ProwJob. I believe there is room for improvement here, especially to define a clean interface REST/gRPC in terms of metapipeline domain and needs.

Absolutely, even Lighthouse aside I would have suggested a change in the interface of the pipelinerunner. So I fully expect to change/adjust the pipelinerunner interface.

The metapipeline since to me the natural owner of that interface. The lighthouse is just a consumer.

Sounds reasonable

I have a slight preference for gRPC especially for the type grantees it brings.

I am not familiar with gRPC, so don't really know atm.

@hferentschik
Copy link
Contributor

I'm hoping with some refactoring we can keep the codebase of creating the meta pipeline CRDs pretty small with minimal dependencies so it won't be too much of an issue to invoke that code directly from a single auto-scaled pod to keep the footprint of Jenkins X nice and small

Do you refer to the pipelinerunner or to talk about keeping the current approach of creating the meta pipeline from within Lighthouse?

@hferentschik
Copy link
Contributor

to keep the footprint of Jenkins X nice and small

I don't think a pipelinerunner service would be hard on the footprint.

@jstrachan
Copy link
Member

jstrachan commented Jul 31, 2019

@hferentschik its not about whether its easy to do - its whether we should do it or not. We're talking about a simple go function call to create the meta pipeline CRDs here; its better footprint wise (1 container, 1 http handler rather than 2 containers and 2 http handler) together with error handling/retry of the 2nd http call & we've only got 1 liveness/readiness check to do & don't have to deal with error handling if the 2nd container is down - plus it'll add a tiny bit of network/CPU overhead with marshalling the data.

But for me the biggest reason to make it 1 simple container rather than 2 is ease of management, observablity + diagnostics for end users. With the 1 container lighthouse option with the meta pipeline code statically linked inside lighthouse; we go from webhook handler -> meta pipeline CRDs in 1 container with a smaller footprint of things that can go wrong & we can easily associate all the logrus logging context through all parts of the lighthouse/meta pipeline code. Then if a webhook doesn't result in a meta pipeline being created we have exactly 1 pod to look in for the logs; not 2 to try figure out what went wrong.

It's already a huge challenge for end users trying to figure out why pipelines don't work with with prow with: hook, pipeline, pipelinerunner, tektoncontroller pods & folks randomly restarting pods to try get things working.

The simplest possible thing to manage is 1 pod with 1 log so you can see if its working; if its not why and restart if needed - so there's only 1 liveness check to worry about etc.

Our #1 aim is the simplest possible software thats rock solid and as simple to manage/maintain as possible with the lowest footprint/cost. If we had tons of different PlumberClient to worry about so we had lots of implementations to switch out; then sure splitting this 1 fairly simple webhook handler into 2 microservices makes sense - but given the fact we will only ever have 1 implementation of the PlumberClient any time soon; the meta factory function call - I'm preferring avoiding increasing the complexity + footprint at this time.

If we need to it will be trivial to re-evaluate or even add 2 different charts/options which use embedded v remote plumber implementations. (If tekton folks really wanna remove the jx dependency we could upstream that to tekton at some later point and have a jx binary distro of lighthouse with jx inlined).

If the motivation for the decoupling is worrying about go module dependency size; lets tackle that head on and make the meta factory function call have the minimal possible dependency tree by refactoring the implementation; rather than refactoring lighthouse into 2 containers/deployments

@ccojocar
Copy link
Contributor Author

its better footprint wise (1 container, 1 http handler rather than 2 containers and 2 http handler) together with error handling/retry of the 2nd http call & we've only got 1 liveness/readiness check to do & don't have to deal with error handling if the 2nd container is down - plus it'll add a tiny bit of network/CPU overhead with marshalling the data.

What does it mean smaller footprint? IMO the footprint is defined in terms of CPU and memory usage. If we are considering that the current jx binary has 220MB that gets statically compiled into the Lighthouse that's a lot of memory introduced by many things which are not required by Lighthouse. So coming back to the footprint question: is it better to have a fat POD which consumes a lot of memory or 2 lighter PODs?

But for me the biggest reason to make it 1 simple container rather than 2 is ease of management, observablity + diagnostics for end users

In a real use case scenario, there are required minimum 2 hook PODs for high availability, otherwise the user will experience downtime quite often if let's say the hook container is somehow down.

It's already a huge challenge for end users trying to figure out why pipelines don't work with with prow with: hook, pipeline, pipelinerunner, tektoncontroller pods & folks randomly restarting pods to try get things working.

We want to avoid doing this mistake again by properly designing things instead of quick hacks. Unfortunatlly pipelinerunner doesn't have too much design though into it, and IMO it proved to be one of the most instable component of the today's stack.

Our #1 aim is the simplest possible software thats rock solid and as simple to manage/maintain as possible with the lowest footprint/cost

I am totally for this but introducing a hard dependency to the whole jx doesn't make the things simple. It brings a lot of transitive dependencies which needs to be updated and maintained. This is already difficult to do so in jx. e.g. jx has still a dependency to kubenetes version 1.11.3 with a lot of overwrites because the entire dependency tree has become unmanageable (https://github.com/jenkins-x/jx/blob/cd9fd3aee92cb10ba7988bbf5eb276b0eadba13c/go.mod#L141).

I would like that we avoid this if possible in Lighthouse.

If we had tons of different PlumberClient to worry about so we had lots of implementations

In a clean implementation, the client needs to be defined by the metapipeline. Lighthouse is only a consumer of this client. Plumber does say much to me other than a generic thing.

If the motivation for the decoupling is worrying about go module dependency size; lets tackle that head on and make the meta factory function call have the minimal possible dependency tree by refactoring the implementation; rather than refactoring lighthouse into 2 containers/deployments

Are we up to extract the pipeline and the metapipline bits in a separate repo outside of jx? This will lead us to a minimum dependency tree.

@jstrachan
Copy link
Member

jstrachan commented Jul 31, 2019

@ccojocar I'm not talking about introducing a hard dependency on the whole jx binary - just the go code for creating the meta pipeline. i.e. refactoring the jx repo so that the meta pipeline code can be easily reused.

I hope we can refactor the jx repo so that the meta pipeline code can be easily consumed with modest dependencies without 200Mb of code (e.g. avoiding the option stuff which brings in pretty much all the jx dependencies) and bloating the lighthouse binary. If that doesn't work then lets try move the CRDs + associated meta pipeline code into a separate repo?

The result should be a tiny single container for lighthouse that we can auto-scale from zero -> N if cost is an issue or have, say, minimum of 3 containers running all the time for HA

@ccojocar
Copy link
Contributor Author

I hope we can refactor the jx repo so that the meta pipeline code can be easily consumed with modest dependencies without 200Mb of code (e.g. avoiding the option stuff which brings in pretty much all the jx dependencies) and bloating the lighthouse binary. If that doesn't work then lets try move the CRDs + associated meta pipeline code into a separate repo?

All of these sounds good to me! I would like us to make this a first priority. This is why in the first place I opened this issue. I would like us to define a clean interface even though it is built-in and defined in jx repository.

I hope that we all agree that the current PlumberJob is not really suited to move forward in a production ready version of Lighthouse.

@jstrachan
Copy link
Member

totally agreed. We really need to refactor jx so its easier to consume small bits of code (e.g. the CRDs + kube package + meta pipeline client) with minimal dependencies.

we should be able to make PlumberJob a small simple struct once we are sure we can use pure tekton for git reporting and remove the old prow reporting stuff

@jstrachan
Copy link
Member

BTW a quick win could be to split up the option/factory stuff (which currently depends on knative build + all kinds of stuff) so folks can just use the jx client / kube client and no other dependencies

@jstrachan
Copy link
Member

even today before we refactor jx, the lighthouse binary is 101Mb - but it should shrink massively if we can move away from the options code & just reuse the real dependencies we need for meta pipeline

@hferentschik
Copy link
Contributor

What does it mean smaller footprint? IMO the footprint is defined in terms of CPU and memory usage. If we are considering that the current jx binary has 220MB that gets statically compiled into the Lighthouse that's a lot of memory introduced by many things which are not required by Lighthouse. So coming back to the footprint question: is it better to have a fat POD which consumes a lot of memory or 2 lighter PODs?

I'd argue in its current form two pods/services (pipelinerunner + lighthouse) is the more lightweight option.

Provided we can refactor the jx code making meta pipeline is easier to consume this might change. However, this option probably takes more time than the pipelinerunner + lighthouse option.

Another reason for two services is the ability to deploy new versions of meta pipeline without impacting the Lighthouse deployment. If these two things are tightly coupled we need to re-build and re-deploy everything on each release.

Last but not least, I seem to recall discussions that Lighthouse should be a "generic" webhook handler which potentially also works outside of Jenkins-X. This would definitely increase the likelihood to get a community around this.

Unfortunatlly pipelinerunner doesn't have too much design though into it, and IMO it proved to be one of the most instable component of the today's stack.

The controller is not really a lot of code. Whatever design concerns we have, I think we can easily address them.

Regarding stability, the main issue was the failing git clone commands and our intentional destruction of the pod in this cases. This should be addressed by the issue #4862 and the corresponding pull request. Looking at the DataDog logs, pipelinerunner is much more stable now. There are still errors, but the ones I looked at where related to not able to merge in changes. That said, all the git handling won't be occurring in the meta pipeline. There all git handling happens in the pipeline.

I am totally for this but introducing a hard dependency to the whole jx doesn't make the things simple. It brings a lot of transitive dependencies which needs to be updated and maintained. This is already difficult to do so in jx. e.g. jx has still a dependency to kubenetes version 1.11.3 with a lot of overwrites because the entire dependency tree has become unmanageable

This is one of me main concerns as well. Even if we extract the meta pipeline code into another package which is easier to consume, we still have to deal with all the overrides. If we are talking about taking things head one, sorting out the jx dependencies is high on my list, but I fear this is not an easy task. Hence, I feel it is better to shield Lighthouse from this at least for now. Maybe further down the road when we dealt with the jx dependencies and refactored the meta pipeline code, we can "merge" the two services.

I am open for the merge of these two services further down the road in case we see benefits in terms of manageability, observabilty or resource consumption. However, I think the safer, cleaner and in fact quicker road to integrate meta pipeline into Lighthouse is via pipelinerunner.

Are we up to extract the pipeline and the metapipline bits in a separate repo outside of jx? This will lead us to a minimum dependency tree.

I think that's something I'd like to avoid for now. That feels like a lot of work. There are several things which would need extracting.

@hferentschik
Copy link
Contributor

I would like us to define a clean interface even though it is built-in and defined in jx repository.

One way or another we should define a clean interface.

I hope that we all agree that the current PlumberJob is not really suited to move forward in a production ready version of Lighthouse.

+1

@jstrachan
Copy link
Member

here's a PR for removing all the cobra / CommonOptions / Options code from lighthouse - it also shrinks the binary from 101 -> 68Mb jenkins-x/jx@4843b3a

@jstrachan
Copy link
Member

@hferentschik here's a PR that simplifies the interface further
#90

FWIW I'm not sure there's many ways to make this 1 function, 1 argument interface much cleaner:

type Plumber interface {
Create(*PlumberArguments) (*PlumberArguments, error)

@ccojocar ccojocar added this to the Sprint 12 milestone Aug 9, 2019
@hferentschik
Copy link
Contributor

See jenkins-x/jx#5051

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants