Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ZooKeeper Namespace Id-to-name mapping #4996

Open
wants to merge 9 commits into
base: 3.1
Choose a base branch
from

Conversation

meatballspaghetti
Copy link
Contributor

This PR is part of an ongoing work to improve efficiency and reduce slowdown of retrieving and writing namespace names and table names in ZooKeeper. These particular changes modify the namespace names, and a follow-up PR in the near future will modify the table names. It replaces the use of the /name nodes under namespaces with a namespace id-to-name mapping in a Json format stored under the /namespaces node. With this, when any function wants to retrieve a namespace name, ZooKeeper will only need to read the single mapping instead of iterating through all /name nodes every time.

New files added for implementation:

  • core/src/main/java/org/apache/accumulo/core/clientImpl/NamespaceMapping.java: Class that holds functionality for initializing, retrieving, writing, serializing, and deserializing namespace id-to-name map (and name-to-id map).

Files changed for implementation:

  • ClientContext.java: Allow namespace mapping to be retrieved from ZooKeeper by referencing the context.
  • Namespaces.java: Update namespace name retrieval functionality with new mapping.
  • ZooKeeperInitializer.java: Replace initialization of default and accumulo /name nodes with creation of new mapping.
  • TableManager.java: Update namespace creation functionality to use new mapping.
  • RenameNamespace.java: Replace usage of /name node to instead use new mapping.

Files changed for upgrading:

  • Constants.java: Deprecate ZNAMESPACE_NAME, as the /name node will no longer be used.
  • Upgrader11to12.java: Build namespace id-to-name mapping in ZooKeeper out of current /name nodes. Follow with deletion of /name nodes.
  • Upgrader11to12Test.java: Add test code respective to additions made in Upgrader11to12.java.

Original Issue: #4698

@meatballspaghetti
Copy link
Contributor Author

meatballspaghetti commented Oct 18, 2024

In progress of fixing Upgrader11to12Test which failed the unit test checks.

@meatballspaghetti
Copy link
Contributor Author

Upgrader11to12Test has been fixed and passes now.

@ctubbsii ctubbsii changed the base branch from main to 3.1 October 23, 2024 16:45
@ctubbsii ctubbsii changed the base branch from 3.1 to main October 23, 2024 16:45
@ctubbsii ctubbsii added this to the 3.1.0 milestone Oct 23, 2024
Copy link
Contributor

@keith-turner keith-turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TableManager.removeNameSpace() needs to be updated to remove the namespace id from the new json. Also that function seems like its in the wrong place.

The following comment is not a suggested for this PR its a possible follow on, this was a preexisting problem in the code. The code seems to construct the paths for namespaces in ZK all over the place. It would be nice to corral this path construction into more centralized code like maybe in Namesspace or NamespaceMapping code. The recent service lock path code corraled zk path construction and centralized, makes the code easier to reason about and collocates path construction code w/ ZK update/read code.

@ctubbsii ctubbsii changed the base branch from main to 3.1 October 31, 2024 16:58
@ctubbsii ctubbsii changed the base branch from 3.1 to main October 31, 2024 16:58
- Implement Json-formatted mapping of namespace Ids to Names in
  ZooKeeper. Stored on the "/namespaces" node. Initially created in
  ZooKeeperInitializer.
- Create NamespaceMapping.java to hold methods initializing, adding,
  retrieving, serializing, and deserializing namespace Id to Name map
  in ZooKeeper.
- Modify all reading and writing of the "/name" nodes so that code
  instead reads and writes to the new Json mapping for namespace
  name data.
- Deprecate ZNAMESPACE_NAME constant, used for the "/name" node.
- Build namespace id to name mapping of current ZooKeeper namespaces
  in Upgrader11to12. Delete all current "/name" nodes.
- Add test code to Upgrader11to12Test respective to additions made to
  Upgrader11to12.

Resolves: apache#4698
- Move SuppressWarnings annotation for handling
  deprecated Constants.ZNAMESPACE_NAME

Resolves: apache#4698
- Add functionality to serialize() in NamespaceMapping that sorts
  the serialized mapping. Should not significantly impact the
  issue overall - but resolves the test that was failing in
  Upgrader11to12Test due to AssertionError caused by comparing
  2 unsorted maps.
- Move deletion of name nodes to after the namespace mapping has been
  computed and persisted in ZooKeeper.
- Check for existing namespace mapping in upgrade code. If does not
  exist, compute mapping and delete name nodes. If already exists, do
  not recompute namespace mapping but still delete name nodes.
- Check for empty or null namespace mapping when deserializing. If
  empty or null, explicitly return empty Map<String,String>.

Resolves: apache#4698
- Move Constants.ZNAMESPACE_NAME to Upgrader11to12 instead of
  deprecating.
- Throw AssertionError when deserializing map if data is null.
- Remove call to Zoocache.clear().
- Modify Namespaces.exists() to look up namespace Ids in new
  mapping instead of looking them up in ZK.
- Use mutateExisting() in NamespaceMapping.writeNamespaceToMap().
  Include TODO for checking if namespace name already exists.
- Fix usage of new mapping in use of mutateExisting in
  RenameNamespace.
- Revert NodeExistsPolicy.OVERWRITE to FAIL when creating nodes
  in ZooKeeperInitializer.initialize.
- Move intializeNamespaceMap directly to creation of namespaces node.

Resolves: apache#4698
@ctubbsii ctubbsii changed the base branch from main to 3.1 October 31, 2024 17:07
- Check if /namespaces node holds unexpected data in Upgrader11to12
  code. If found, throw exception.
- Streamline Namespaces.exists().
- Add check for accumulo/default namespace when retrieving updated
  mapping. Throw exception if not present.
- Refactor the following actions on namespaces: create, rename,
  remove. Move all functionality that updates the new mapping to be
  in NamespaceMapping, use mutateExisting(), and add checks for
  namespaces that may already exist/do not exist, depending on the
  action.
- Remove now-unneeded Utils.checkNamespaceDoesNotExist() and
  NamespaceMapping.initializeNamespaceMap().

Resolves: apache#4698
- Inline path String in RenameNamespace, store context.getManager() in variable
  in PopulateZooKeeperWithNamespace

Resolves: apache#4698
@meatballspaghetti
Copy link
Contributor Author

Commit dd8c4fa has moved all namespace mapping actions (create/put, rename, remove) into NamespaceMapping, ensured that mutateExisting() is used in all cases, and gives checks for non-existent/already-existing namespaces (depending on the action). The actual namespace node creation and removal functionalities remain in their respective classes and occur after the mapping is updated.

* Use TreeMap for type token for both serialization and deserialization
* Use global Gson from LazySingletons
* Add checks for built-in namespaces for rename and remove cases
* Move name already existing check for rename into the mutateExisting
  method
* Add unit test
* Remove unused exception from throws
* Inline prepareNewNamespaceState overloaded method
* Revert adding some extra params to TableManager.removeNamespace
* Prevent namespace not found from blocking removal operation of
  namespace
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants