Ideas for Arc XP


Visibility into content source caching

We're currently using partial caching in content sources to reduce calls to a shared service. We ran into issues with this and had an extremely difficult time figuring out when the content source cache is actually being used. We currently see two possible explanations (a sketch of the calling pattern follows the list):

  • Making a high number (around 20) of parallel calls using partial caching in the same content source creates a bottleneck on the content source Lambda's network. Connections then take a long time and the 5 s timeout is hit.

  • Making more partial-cache calls fills the content source cache and evicts other entries. This causes a spike of traffic to the shared service, so connections take a long time and the 5 s timeout is hit.
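As a rough illustration of the calling pattern we mean (names like `cachedFetch` and the URL are placeholders of ours, not Arc APIs):

```typescript
// Hypothetical helper standing in for a partial-cache call inside a content
// source; `cachedFetch` is a placeholder name, not an Arc XP API.
async function cachedFetch(url: string): Promise<unknown> {
  const res = await fetch(url); // in theory 1, ~20 of these compete for the Lambda's sockets
  return res.json();
}

// Roughly what our content source does: ~20 partial-cache calls in parallel.
export async function resolveSection(ids: string[]): Promise<unknown[]> {
  const started = Date.now();
  const results = await Promise.all(
    ids.map((id) => cachedFetch(`https://shared-service.example.com/items/${id}`))
  );
  // If the combined time approaches 5 s, the content source times out.
  console.log(`fetched ${ids.length} items in ${Date.now() - started} ms`);
  return results;
}
```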

With the existing logs, we have no way to determine whether the content source cache is close to being full. There's also no documentation from Arc on the number of parallel network calls a content source can make. We're requesting more visibility into the content source cache. This could include metrics for current utilization, memory limits, and the number of evictions.
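To make the ask concrete, here is the kind of counter we have in mind, sketched as a toy LRU wrapper. This is purely illustrative of the metrics we'd like exposed, not a claim about how the Engine cache is implemented:

```typescript
// Illustrative only: a tiny LRU with the counters we'd like the Engine
// cache to surface (utilization, size limit, evictions).
class MeteredLruCache<V> {
  private entries = new Map<string, V>();
  public evictions = 0;

  constructor(private maxEntries: number) {}

  set(key: string, value: V): void {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry and count it.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
      this.evictions += 1;
    }
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value !== undefined) {
      // Refresh recency on a hit.
      this.entries.delete(key);
      this.entries.set(key, value);
    }
    return value;
  }

  metrics() {
    return {
      utilization: this.entries.size / this.maxEntries,
      maxEntries: this.maxEntries,
      evictions: this.evictions,
    };
  }
}
```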

  • Guest
  • Feb 7 2025
  • Future consideration
    • Guest commented
      10 Feb 18:48

      That would be great, but I don't know if it's doable. The cache limit wouldn't be hit locally; it would need production levels of traffic to fill it up. The network limit might be something you could simulate (with something like LocalStack), but I'm not sure it would have the same simultaneous socket limits as a Lambda. I believe that limit should be the maximum number of file descriptors in a Lambda (1024), but I'm sure other processes in the Lambda use some of those, which wouldn't be accounted for in local dev.
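      If it helps, one rough way to sanity-check the descriptor theory from inside the Lambda is to count open file descriptors via /proc/self/fd (Linux-only, so it won't work on every local setup):

      ```typescript
      import { readdirSync } from "node:fs";

      // Rough check of how many file descriptors the process has open.
      // Works on Linux (including Lambda); will throw elsewhere.
      function openFdCount(): number {
        return readdirSync("/proc/self/fd").length;
      }

      // Log it around the burst of parallel calls to see how close we get to ~1024.
      console.log(`open fds before the burst: ${openFdCount()}`);
      ```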

    • Admin
      Fatih Yildiz commented
      10 Feb 17:38

      What is the ideal state, Jason? Having the same limits (both the timeout and the cache object size limit) in local environments too?
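      For example, would something like this rough sketch, which just caps the upstream call at the 5 s limit locally using Node's built-in AbortSignal.timeout, be close to what you have in mind?

      ```typescript
      // Sketch: approximate the production 5 s content source timeout in local
      // dev by aborting the upstream call after 5000 ms.
      async function fetchWithTimeout(url: string): Promise<unknown> {
        const res = await fetch(url, { signal: AbortSignal.timeout(5000) });
        return res.json();
      }
      ```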

    • Guest commented
      10 Feb 16:23

      Yes, we're already using request tracing. Unfortunately, this doesn't help with troubleshooting either of our theories because local dev doesn't enforce the same limits.

    • Admin
      Fatih Yildiz commented
      10 Feb 14:30

      Hi Jason,

      Thanks for this suggestion. It makes sense to provide more details about the Engine cache service.

      I assume you have already enabled and used request tracing. While the raw output of request tracing is not easy to read, the request tracing documentation includes a script that renders calls.json (which Engine produces when request tracing is enabled) as a nice TUI table. That table indicates each content source and child partial-cache call, including the cache-proxy service (Engine cache service) calls that read or write cache objects, as well as the actual HTTP calls your content sources make. It gives a much better view of content source and partial-cache behavior.

      That view still lacks cache object size usage and limits, but it does give you the timing and the order of the calls.
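      As a rough illustration only (the actual script is in the request tracing documentation, and the field names below are assumed, not the real calls.json schema), the idea is along these lines:

      ```typescript
      import { readFileSync } from "node:fs";

      // Illustration only: field names (source, type, durationMs, cacheHit)
      // are assumptions for this sketch, not the actual calls.json schema.
      interface TraceCall {
        source: string;
        type: string; // e.g. content source, partial cache, cache-proxy, HTTP
        durationMs: number;
        cacheHit?: boolean;
      }

      const calls: TraceCall[] = JSON.parse(readFileSync("calls.json", "utf8"));

      // Print the calls in order with their timing, similar to the TUI table view.
      console.table(
        calls.map((c) => ({
          source: c.source,
          type: c.type,
          "duration (ms)": c.durationMs,
          cache: c.cacheHit === undefined ? "-" : c.cacheHit ? "hit" : "miss",
        }))
      );
      ```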

      We'll still keep this idea as a future improvement to enhance that view into a more readable version (labels like cache hit/miss, and the size limit).