Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify publish dependencies on deployments #21576

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

lilatomic
Copy link
Contributor

@lilatomic lilatomic commented Oct 27, 2024

Infrastructure deployments can depend on other targets being packaged and published. For example, a Helm chart can deploy Docker images. This capability is similar to how runtime_package_dependencies allows packaging other targets for use in tests (for example, shunit2_test). Pants already has support for publishing dependencies before deploying them. However, this behaviour is not exposed to users and is only filled by the Helm backend when it identifies Docker targets to push.

This MR add a field publish_dependencies which allows users to manually specify targets to publish before deploying. closes #21522 .

Notes:

  • experimental-deploy goals need to hook this field into their targets, but the execution is handled automatically
  • I used a shared field so it's easy to use it for all backends implementing it
  • Dependencies are propagated (eg if a dep in publish_dependencies changes, is the deployment marked as changed)

@lilatomic lilatomic force-pushed the feature/deployment-publish-dependencies branch from b16bb27 to f371525 Compare November 6, 2024 21:28
@lilatomic lilatomic added category:new feature backend: Helm Helm backend-related issues backend: Terraform Terraform backend-related issues labels Nov 6, 2024
@lilatomic lilatomic marked this pull request as ready for review November 6, 2024 21:51
@Nishikoh
Copy link
Contributor

Nishikoh commented Nov 7, 2024

This implementation is exactly what I've been waiting for! I'm absolutely thrilled thank you so much for this cool feature! 🚀

Copy link
Contributor

@alonsodomin alonsodomin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea and in general, the implementation looks good to me. But it looks like that when the publish_dependencies field is used, then the inferred dependencies are ignored...?

I would like to see a behavior more akin to the dependencies field. Meaning that when dependencies are explicitly provided, these are added to the set of inferred ones. Similarly, the publish_dependencies field could be use to exclude inferred ones using the special syntax !path/to:target as the dependencies field does.

set(chain.from_iterable([deploy.publish_dependencies for deploy in deploy_processes]))
if deploy_subsystem.publish_dependencies
else set()
)

publish_targets = inferred_publish_targets | specified_publish_targets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't publish_targets here be the intersection set of these two other sets?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear how users should use these 2 options. Since every deployment now has it's own publish_dependencies, shouldn't we delete the goal's publish_dependencies option?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now we're using --no-experimental-deploy-publish-dependencies in CI, because we publish docker images in a separate step, so we don't need to do it again in deploy step. So yes, if these 2 options are going to be used together, we would need the set intersection

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've missed something. This is how I thought of it

  • inferred_publish_targets : targets a backend infers should be published. For example, Helm can infer Docker images that need to be published before this deployment
  • specified_publish_targets : manually specified dependencies that should be published before this deployment

Like for normal dependencies, backends can infer dependencies and users can manually add some. The result should be both of these (set union).

I would think that --no-experimental-deploy-publish-dependencies would turn both of these off.

@benjyw
Copy link
Contributor

benjyw commented Nov 9, 2024

This is probably fine as a stopgap, but it is yet another case where we have a "type" of dependency (runtime deps, build time deps, and so on), and I wish we had a principled way of expressing those.

@benjyw
Copy link
Contributor

benjyw commented Nov 9, 2024

Actually, now that I think of it, why aren't these just regular dependencies, where we detect that the dependency is a publishable thing, and therefore know to publish?

@lilatomic
Copy link
Contributor Author

I like the idea and in general, the implementation looks good to me. But it looks like that when the publish_dependencies field is used, then the inferred dependencies are ignored...?

I think I might be misunderstanding. But yes, dependencies in the dependencies field are not used to determine what needs to be published. Each backend can still infer publishable dependencies from those, like Helm does.

shouldn't publish_targets here be the intersection set of these two other sets?

We use the union of the 2 sets so that the backend can do its inference but users can still specify overrides with publish_dependencies

@lilatomic
Copy link
Contributor Author

I would like to see a behavior more akin to the dependencies field. Meaning that when dependencies are explicitly provided, these are added to the set of inferred ones. Similarly, the publish_dependencies field could be use to exclude inferred ones using the special syntax !path/to:target as the dependencies field does.

Oh, that's a good idea! I'll see about implementing that.

@lilatomic
Copy link
Contributor Author

Actually, now that I think of it, why aren't these just regular dependencies, where we detect that the dependency is a publishable thing, and therefore know to publish?

I think publishing every publishable item in a dependency would be unexpected. A few examples:

  1. Terraform modules / Helm charts referenced locally : these support local filepath references, but are also (or could also be) publishable. I think it's necessary to allow for not publishing these
  2. Not deployments, but Docker images which reference other images in the same repo can build off of other local images (only package) or remote versions of those images (would require pulling, and at some point a publish)

That said, we could also add that ability (probably with a toggle). I think it might be as simple as filtering all deps for unionmembership of PublishFieldSet.

@alonsodomin
Copy link
Contributor

I'm afraid there is not a general solution to detecting which dependencies can be published. In the case of helm_deployment, scanning the transitive dependencies looking for publishable artifacts can lead to not only finding dependent remote (3rd party) images, but it can also lead to finding other publishable artifacts like a python package, which most likely wouldn't be expected to be published when running a deployment. Hence why the list of publishable artifacts for a Helm deployment is gathered in the Helm backend (and constrained to only local Docker images), as it's specific to the actual deploy goal implementation provided by the backend.

Worth mentioning that I always questioned this approach. The main motivation I had to implement a publish step during the deployment was to prevent getting a "deployment failed" error in the case in which the end user had forgotten to publish the images first (plus the fact that it sort of make sense as part of a Continuous Delivery workflow). This is very handy but the question remained: show actually Pants do that? What if end users want to separate the two steps (publishing and deploying)? How bad would be the end user experience if the deploy goal didn't publish anything and failed telling the users that some of the dependencies need to be publish first?

Since this goal was added we already added a flag to disable the publishing part of the process, meaning that there is a use case already for separating the two steps. So that makes me think probably the best direction is to remove the functionality as it could be that it's more intuitive for the end user that the deploy goal only deploys stuff and does nothing else. This would also simplify the implementation a lot but I believe we need some kind of user feedback here to properly understand how the majority of end users would like to use this feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend: Helm Helm backend-related issues backend: Terraform Terraform backend-related issues category:new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Publish is not executed before deploy is executed
5 participants