Skip to content

Use KAFKA_ as prefix for environment Kafka config#209

Merged
philipp94831 merged 112 commits intov3from
feature/rename-env-config
Jul 23, 2024
Merged

Use KAFKA_ as prefix for environment Kafka config#209
philipp94831 merged 112 commits intov3from
feature/rename-env-config

Conversation

@philipp94831
Copy link
Member

No description provided.

Base automatically changed from feature/topology-builder to v3 July 19, 2024 12:09
@philipp94831 philipp94831 marked this pull request as ready for review July 19, 2024 12:42
@philipp94831 philipp94831 merged commit be8ffc8 into v3 Jul 23, 2024
@philipp94831 philipp94831 deleted the feature/rename-env-config branch July 23, 2024 09:48
* Parse configuration properties of a Kafka app from environment variables
*/
public final class EnvironmentStreamsConfigParser {
public final class EnvironmentKafkaConfigParser {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question: How hard would it be to move this into the cli package? I'm thinking about using something like https://smallrye.io/smallrye-config/Main/ and it already comes with its own support for env variables and their priority.
I guess it wouldn't make a difference because the passed config settings always have a higher priority than the setting created here, but maybe it is still cleaner?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why we would move it to the cli package as it is unrelated to cli. We can replace the whole thing with smallrye if that makes sense. What are the advantages of smallrye?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Smallrye config isn't a replacement for the env variables but for pico-cli that doesn't use arguments as config source but has other sources (like json, yaml, zookeeper or custom source: https://smallrye.io/smallrye-config/Main/config-sources/custom/). The point is that it already considers env variables when parsing the config. So for example if you have the following yaml:

kafka:
  bootstrapServer: my-server:9092

You could overwrite the value with and env variable called KAFKA_BOOTSTRAP_SERVER and therefore there would be no need for our env parser.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to open a PR with a proposal. I think it will likely be a non-breaking change so we can do that also with a release in between

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@torbsto We can actually do so quite easily by adding the env vars here https://github.com/bakdata/streams-bootstrap/blob/v3/streams-bootstrap-cli/src/main/java/com/bakdata/kafka/KafkaApplication.java#L264 and getting rid of the env vars in KafkaPropertiesFactory. Still not sure if we want to do that though as I see it as non-related to CLI. Is there even a conflict with smallrye while this feature is there?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For smallrye it wouldn't make a difference, because it already considers env variables when creating the config. I think the problem of the current approach is that streams-bootstrap's env variables always have a lower priority than what is passed. That works for CLI args, but let's say we have a different source that loads the config from a file (but unlike smallyre doesn't consider env variables), then the env variables have a lower priority than the file config. I'd expect that they have a higher priority

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can switch priority with those specified with CLI param if you want. The only others that have higher priority are bootstrap-servers, schema-registry, unique-app-id and serialization

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just keep it in mind, it is fine for now :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants