Skip to content

chore: various updates to improve debugging apps#123

Open
Fraser999 wants to merge 5 commits intomainfrom
fraser/eng-1943/various-updates
Open

chore: various updates to improve debugging apps#123
Fraser999 wants to merge 5 commits intomainfrom
fraser/eng-1943/various-updates

Conversation

@Fraser999
Copy link
Contributor

@Fraser999 Fraser999 commented Feb 28, 2026

This PR comprises several changes:

  1. disables connection pooling in OAuth job to avoid stale connection errors, and reduces log level from error to warn for failing to refresh the OAuth token (as this is a non-critical error I believe)
  2. avoids recording the current span twice (it's recorded as the last entry in "spans" since we enable with_span_list)
  3. supports including file and line number in log lines by setting the env var TRACING_WITH_FILE_AND_LINE_NO to a non-empty value
  4. a somewhat larger change to env var handling:
    • the existing init4 function is now deprecated in favour of a similar init function which requires the tracing and metrics configs to be passed in. The idea is that callers will add these configs to their main config of their app, and hence all the env vars including the tracing and metrics ones will be visible via the inventory and checkable via the check_inventory methods
    • added the tracing config, which internally holds an optional OTEL config
    • updated check_inventory's error case to also include env vars which are set to an empty value
    • added a set of types like OptionalBoolWithDefault which are designed to be used as members of config structs. These are provided with a const generic arg to be used as the default, allowing the config to not have to deal with Option<T> where the optional value is always set to Some after calling from_env.

Copy link
Contributor Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@Fraser999 Fraser999 marked this pull request as ready for review February 28, 2026 02:17
@Fraser999 Fraser999 changed the title disable conn pooling in oauth job and reduce log level chore: various updates to improve debugging apps Feb 28, 2026
Comment on lines +172 to +184
let duration = tokio::time::Duration::from_secs(self.config.oauth_token_refresh_interval);
let mut interval = tokio::time::interval(duration);
interval.set_missed_tick_behavior(MissedTickBehavior::Delay);

loop {
interval.tick().await;
debug!("Refreshing oauth token");
match self.authenticate().await {
Ok(_) => {
debug!("Successfully refreshed oauth token");
}
Err(err) => {
let mut current = &err as &dyn Error;

// This is a little hacky, but the oauth library nests
// errors quite deeply, so we need to walk the source chain
// to get the full picture.
let mut source_chain = Vec::new();
while let Some(source) = current.source() {
source_chain.push(source.to_string());
current = source;
}
let source_chain = source_chain.join("\n\n Caused by: \n");

error!(%err, %source_chain, "Failed to refresh oauth token");
}
Ok(_) => debug!("Successfully refreshed oauth token"),
Err(error) => warn!(
error = %format!("{:#}", eyre!(error)),
"Failed to refresh oauth token"
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes the behavior to wait before authenticating, but we want to maintain previous behavior, which is authenticate then wait. This ensures fast startups for anything gated by authentication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first tick is immediately ready, so there's no real change in behaviour here.

optional_with_default! {
/// An optional boolean with a const default, for use in config structs.
/// A non-empty value is treated as `true`; missing or empty falls back to the default.
OptionalBoolWithDefault, bool, |env_var, _s| Ok(true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will treat "false" as true, no? That feels counterintuitive. Should handle these types of cases explicitly, e.g. "false" or "0" should actually set false, and then document each case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but that matches the impl FromEnvVar for bool pretty much. The only difference is that for an unset env var OptionalBoolWithDefault::from_env_var will return Ok(default), whereas bool::from_env_var will return Err(NotPresent), and for e.g. TRACING_LOG_JSON, we currently unconditionally map errors to false. (I think it'd be better to only map NotPresent to the default value and pass through NotUnicode errors as OptionalBoolWithDefault does.

I agree that parsing "false" as true is counterintuitive, but it's also very common for env vars. Maybe we could provide a new type which does actual parsing of the env var so that:

  • truthy => true
  • falsy => false
  • unset/empty => default if provided, error if not

Ok(_) => T::from_env_var(env_var).map(Some),
Err(_) => Ok(None),
Err(VarError::NotPresent) => Ok(None),
Err(error) => Err(FromEnvErr::parse_error(env_var, error)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically a small behavior change but previous logic would swallow the other error, which would only be a NotUnicode error in this case, so I think that's okay for the use case at hand. Some callers might have to explicitly handle this slightly different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm - I can certainly revert this change, but I'm not sure I agree that it's the best approach. If a user set the env var to something, they probably expected it to be parsed. I think it's a potential footgun for users to silently treat that case as the same as not setting the env var at all.

TracingConfig::from_env().wrap_err("failed to get tracing config from environment")?;
let metrics_config =
MetricsConfig::from_env().wrap_err("failed to get metrics config from environment")?;
let _guard = init(tracing_config, metrics_config);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will need to ticket out downstream upgrade work to accommodate this config change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure - I can do that. I think the list is:

  • builder
  • filler
  • quincey
  • seal
  • tx-pool

I left the existing init4 function in place (although deprecated) to ease the transition.

signal_hook::flag::register(signal_hook::consts::SIGTERM, Arc::clone(&term))?;
signal_hook::flag::register(signal_hook::consts::SIGINT, Arc::clone(&term))?;

let _guard = init4();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@init4samwise can you list all of the downstream callers of this init4 function in init4tech repos?

@prestwich
Copy link
Member

a somewhat larger change to env var handling:

one idea i have toyed with is making init4<T: ConfigTrait>() -> eyre::Result<T> with trait ConfigTrait: FromEnv + AsRef<TracingConfig> + AsRef<MetricsConfig>. that way we can be opinionated and still pretty simple

thoughts?

@Fraser999
Copy link
Contributor Author

Fraser999 commented Mar 6, 2026

that way we can be opinionated and still pretty simple

Yes, that seems like an improvement, in that the caller doesn't need to do two steps (parse config then call init4). I guess we'd need to make the return type a simple struct containing the parsed T and the OTEL guard.

I made that change in b9f9cad.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants