Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

We had a use case for Datadog::Tracing::Contrib::HTTP::Instrumentation.after_request #3736

Open
NobodysNightmare opened this issue Jun 25, 2024 · 7 comments
Labels
community Was opened by a community member feature-request A request for a new feature or change to an existing one

Comments

@NobodysNightmare
Copy link

NobodysNightmare commented Jun 25, 2024

This is in response to #3367:

If you require this hook, please open a "Feature request" stating your use case so we can asses how to best support it.

Describe the goal of the feature

We were looking for a good way to consistently (i.e. without manual repetition) set the resource name of HTTP requests. This is what we did through the after_request hook. Snippet:

Datadog::Tracing::Contrib::HTTP::Instrumentation.after_request do |span, _http, _req, _res|
  break if HTTPFestival::TraceCalls.current_caller.nil?
  break if HTTPFestival::TraceCalls.current_caller.empty?

  span.resource = HTTPFestival::TraceCalls.current_caller
rescue StandardError
  nil # If we mess up, we should not break the original request
end

It's worth noting that HTTPFestival::TraceCalls.current_caller goes back to a thread-local variable, so we ensure that for each HTTP request that might concurrently occur, we set a proper request name. We had a mechanism in place to automatically set a reasonable default for resource names just before we issued the actual HTTP request.

Describe alternatives you've considered
I am not aware of any alternative that could be used to overwrite the resource name of HTTP calls. This leaves us with unhelpful defaults such as GET 200 or POST 404, when our implementation allowed us to construct names that were specific to the action being executed, without being derailed by details like identifiers in the request path.

Judging from the implementation that we had in place before, something that would alternatively work for us is an API like this:

Datadog::Tracing::Contrib::HTTP.with_resource_name('GET /foos/:id') do
  # do the actual HTTP request here (its resource name in Datadog will be "GET /foos/:id")
end
@NobodysNightmare NobodysNightmare added community Was opened by a community member feature-request A request for a new feature or change to an existing one labels Jun 25, 2024
@NobodysNightmare
Copy link
Author

Apparently this use case of ours isn't too surprising. While searching for after_request in open and closed issues I stumbled upon #277. Key takeaways:

  • @delner introduces after_request as a suggested extension point to set resource names (our use case above)
  • then some more talking
  • @buddhistpirate closes the issue, says they are now also using this undocumented extension and wish that it existed for other http clients as well

So yeah. All I can do is say that the use case for which the feature apparently was built initially still exists 🤷

Maybe you can support it in a simpler/more generic way (see my suggestion above).

@delner
Copy link
Contributor

delner commented Jun 27, 2024

Hi @NobodysNightmare! Thanks for highlighting.

This is a challenging feature, less so from a mechanical perspective, but from a change management one, as we try to balance user customization with other functionality out-of-the-box.

From what I remember, I deliberately introduced this feature as undocumented, because while I recognized its pragmatism for modifying resource, I wasn't sure of the side effects of arbitrary execution (e.g. slower HTTP responses when doing heavy work within after_request), and the precedent of using callback hooks all over the code base ("if you can do this for HTTP, why not for everything else too?") and ending up with spaghetti.

Also, as you alluded to, this is chiefly used for modifying resource, which is a troublesome field because there's no clear "right" value for this... it really varies from case-to-case. If we had a better means of providing a reliably better value to this field, or we replaced its use in the UI (as a label) with some other tags on the span, then maybe this callback becomes unnecessary.


I don't have any concrete solutions to suggest yet, but I do think our team should consider what a solution should look like, and how it would practically improve the experience with resource.

Off the cuff, maybe this could be:

  1. A resource targeted feature as you suggested above.
  2. A better resource heuristic in HTTP instrumentation (doubtful we can do this reliably without exposing PII)
  3. Richer tagging + a user-customized schema (probably in YAML or something) for transforming span data. (I prototyped something for this last year with some success.)
  4. Some kind of managed callback hooks (doubtful for the same reasons after_request had problems)
  5. Something else entirely?

Hopefully we can get some kind of workaround in place short term. Maybe the @DataDog/ruby-guild would like to weigh in on ideas for this, and the long term solution.

@NobodysNightmare
Copy link
Author

Hi,

thanks for the open sharing of your thoughts. I share the doubts you have on two of those ideas, especially the ability to automagically guess a better resource name. Option 3 sounds promising, but also like it might be a "longshot" option because of its complexity.

Pragmatically, I will probably tell people at our company to wait a few months before going to the 2.x line of ddtrace. Eventually we will be forced to migrate over anyways because nobody wants to run with too old dependencies, but maybe this buys us enough time for an alternative solution.

@alexevanczuk
Copy link
Contributor

@delner Wondering if there's been any more movement on this one.

There are two big use cases we have:

  1. We integrate with X providers, each with Y of their endpoints. Right now, we just see GET and POST for the resource names for those providers. We'd like each resource to be one of their endpoints. I understand this is hard to generalize (since URLs can have unique resource identifiers), but would be great if we could even just set the resources manually on our side.
  2. On the other hand, we send webhooks to our N clients, so each of the clients' configured endpoints is one of N services in our services page, which clutters the overall namespace. Instead, I'd like a single service client-webhooks and have each client's webhook endpoint be a separate resource.

In both cases, having a hook into the instrumentation for faraday/http here to override the service and/or resource name would give us the flexibility to do this on our side and vastly improve our observability out of the box.

Any other thoughts here? Would you accept a PR that gives the client the ability to configure a custom service and resource (by implementing a block that yields maybe something related to the faraday client / the request / the span)?

cc @anthonybishopric

@ivoanjo
Copy link
Member

ivoanjo commented Sep 2, 2024

Hey folks! FYI a few folks are out for vacation and whatnot, so we may take a bit to reply on this one, hope that's ok!

@alexevanczuk
Copy link
Contributor

Hey @ivoanjo wanted to bump this 🙏🏼 Would be awesome to be able to override the resource_name and service_name functions invoked on these lines:

span.resource = resource_name(env)

@ivoanjo
Copy link
Member

ivoanjo commented Sep 17, 2024

We're still looking into this, hopefully will have an answer soon! 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community Was opened by a community member feature-request A request for a new feature or change to an existing one
Projects
None yet
Development

No branches or pull requests

4 participants