Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[server][common] Allow store config repo to fetch configs only when needed #1204

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

adamxchen
Copy link
Collaborator

@adamxchen adamxchen commented Sep 27, 2024

Summary

#1132 introduced additional ZK watch. The implementation of HelixReadOnlyStoreConfigRepository reads all store configs, regardless of which cluster, this creates an issue since it would increase the additional ZK watch by store number x host name. In this PR, it adds some logics to lazy fetch the store config and cache.

How was this PR tested?

new unit test

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

@adamxchen adamxchen changed the title [common] Allow store config repo to fetch configs only when needed [server][common] Allow store config repo to fetch configs only when needed Sep 27, 2024
Copy link
Contributor

@gaojieliu gaojieliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking the following refactoring strategy, which I think will be safer:

  1. Keep the existing refresh calls as today.
  2. HelixReadOnlyStoreConfigRepository#refresh will place a child-change listener as today, but it will only get a list of store names, not store configs.
  3. In the child-change listener, whenever there is a child change, it will only mutate the maintained store list and of coz, for deleted stores, we will remove the corresponding entry from the local store config map and corresponding zk watch.
  4. We will only place a zk watch on store config node in lazy fetch function.
  5. In HelixReadOnlyStoreConfigRepository#getStoreConfig function, if the requested store doesn't in the store list, return Optional.empty() directly, and if it is present, fetch the store config on the fly and place the zk watch to monitor data change.
  6. HelixReadOnlyStoreConfigRepository would maintain a storeConfigMap, but this map will be lazily populated by HelixReadOnlyStoreConfigRepository#getStoreConfig function.

WDYT?

Copy link
Contributor

@gaojieliu gaojieliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some more comments and you need to fix the test failures...

@@ -46,7 +46,7 @@ public ZkStoreConfigAccessor(
}

public List<String> getAllStores() {
return dataAccessor.getChildNames(ROOT_PATH, AccessOption.PERSISTENT);
return HelixUtils.listPathContents(dataAccessor, ROOT_PATH);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It provides a bit better error handling, though essentially they are same.

  public static <T> List<String> listPathContents(ZkBaseDataAccessor<T> dataAccessor, String path) {
    try {
      List<String> paths = dataAccessor.getChildNames(path, AccessOption.PERSISTENT);
      return paths == null ? Collections.emptyList() : paths;
    } catch (Exception e) {
      LOGGER.error("Error when listing contents in path: {}", path, e);
      throw e;
    }
  }

this.zkClient = zkClient;
this.accessor = accessor;
this.storeConfigMap = new AtomicReference<>(new HashMap<>());
this.loadedStoreConfigMap = new AtomicReference<>(new HashMap<>());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use VeniceConcurrentHashMap here so that we don't need to recreate a map whenever there is a chance in StoreConfigChangedListener?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use ^ as getStoreConfig can update loadedStoreConfigMap concurrently when store configs are missing from the map.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants