Issue with PVC Creation After JupyterHub Helm Chart Upgrade

I recently updated the JupyterHub Helm chart on our Dev environment from version 3.3.7 to 4.1.0. However, after the upgrade, when an existing user tries to launch an instance, a new Persistent Volume Claim (PVC) is created with the same name as the Pod, instead of the existing PVC being used.

For example, the PVC for an existing user would be something like claim-user-40test-2ecom, but after the upgrade, the new PVC is created with a name like claim-user-test-com---ca189761. The Pod name also appears as jupyter-user-test-com---ca189761.
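
For reference, the old-style name above can be reproduced with a rough sketch of the legacy escaping. It assumes the username is user@test.com (inferred from the PVC name, not stated in this thread) and that the old scheme replaces every character outside lowercase letters and digits with "-" plus its hex byte value. The helper names are hypothetical and this is not kubespawner's actual code. The new-style name with the hash suffix is not reproduced here.

```python
import string

# Assumption: the legacy scheme keeps only lowercase ASCII letters and digits.
SAFE_CHARS = set(string.ascii_lowercase + string.digits)


def legacy_escape(name: str) -> str:
    """Escape each unsafe character as '-' followed by its UTF-8 byte(s) in hex."""
    parts = []
    for ch in name:
        if ch in SAFE_CHARS:
            parts.append(ch)
        else:
            parts.extend(f"-{byte:x}" for byte in ch.encode("utf-8"))
    return "".join(parts)


def legacy_pvc_name(username: str) -> str:
    """Hypothetical helper: build the old-style claim name for a default server."""
    return f"claim-{legacy_escape(username)}"


print(legacy_pvc_name("user@test.com"))  # claim-user-40test-2ecom
```

Here "@" becomes "-40" and "." becomes "-2e", which is why the old claim name looks so different from the new one.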

Interestingly, when I tested the same Helm chart upgrade on our test environment, the behavior was different. In the test environment, when an existing user launches a new instance after the upgrade, it successfully uses the existing PVC, as expected.

One key difference between the Dev and test environments is that the Dev environment was previously upgraded from 3.3.7 to 4.0.0 back in November. However, due to the same issue, we had to revert the changes and continue using 3.3.7. Now that I’ve updated the Dev environment directly to 4.1.0, the issue persists, whereas the test environment (with a fresh upgrade to 4.1.0) works as expected.

Could anyone provide guidance on why this discrepancy is happening, and how I can ensure that the existing PVC is used correctly on the Dev environment after the upgrade?

Thanks in advance for your help!

Is there a chance that back when the dev environment was upgraded, a new pvc was created for this user/server and then after the downgrade the new pvc was not deleted?

For a little background, the problem stems from the fact that:

  1. the default pvc naming scheme was changed, and
  2. the previous version of kubespawner did not persist the pvc name

So in the upgrade, kubespawner has to guess whether to use the old name or the new name. The way the guess works is basically:

  1. if pvc name is persisted (i.e. this is not the first launch since the upgrade), use that, no guessing required, this problem shouldn’t happen again in the future
  2. if no pvc name is persisted (i.e. first launch after upgrade) and a pvc with the new name doesn’t exist, check if the old pvc name exists, and use that
  3. persist whatever we got, so no more guessing after this
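
The guessing steps above can be sketched roughly as follows. This is only an illustration of the logic as described in this post, not the real kubespawner implementation: the function name, its parameters, and the idea of passing in a set of existing PVC names are all assumptions made for the example.

```python
from typing import Optional, Set


def choose_pvc_name(
    persisted: Optional[str],
    new_name: str,
    old_name: str,
    existing_pvcs: Set[str],
) -> str:
    """Hypothetical sketch of the name-guessing steps described above."""
    # Step 1: a persisted name always wins, no guessing required.
    if persisted:
        return persisted
    # Step 2: first launch after upgrade. Fall back to the old name only
    # if the new-style PVC does not exist and the old-style one does.
    if new_name not in existing_pvcs and old_name in existing_pvcs:
        return old_name
    # Otherwise the new name is used (and the PVC is created if missing).
    return new_name


old = "claim-user-40test-2ecom"
new = "claim-user-test-com---ca189761"

# Only the old PVC exists: the existing volume is reused.
first_launch = choose_pvc_name(None, new, old, {old})
# A leftover new-style PVC exists too: it takes precedence over the old one.
with_leftover = choose_pvc_name(None, new, old, {old, new})
# Step 3: the caller persists the result, so later launches skip the guess.
later_launch = choose_pvc_name(first_launch, new, old, {old, new})
print(first_launch, with_leftover, later_launch)
```

The second call shows why a new-style pvc left over from an earlier upgrade changes the outcome even though the old pvc is still there.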

So one way you might get a different result: if the previous upgrade created the new pvc and it wasn’t deleted after the downgrade, then upgrading again would pick up that new pvc. Deleting the unused new pvc before upgrading again should have fixed it. But now that you’ve upgraded, it helpfully remembers the pvc name it found. It’s a bit tricky.

I don’t know how many users you have who haven’t launched yet in the upgraded dev environment, but if you can check:

  1. whether the old pvc name exists, and
  2. whether the new pvc name exists

at upgrade time, and relate that to what pvc gets mounted, that would help narrow it down.

I also wrote this script to try to help remedy the situation. I haven’t found the best way to fix it reliably and robustly for all the ways people might configure JupyterHub.


Thanks so much for your detailed explanation, @minrk! I wanted to provide some more context to clarify the situation and see if we can pinpoint what might be going wrong in the Dev environment.

Context:
After upgrading our Dev environment to JupyterHub 4.0.0 in November 2024, about 10 users tried launching JupyterHub instances. This resulted in the creation of new Persistent Volume Claims (PVCs) instead of using the existing ones. We then downgraded back to 3.3.7 and deleted the newly created PVCs. Once the downgrade was completed, users were able to use their old PVCs without any issues.

During the downgrade process, we had to delete the old JupyterHub database and create a new one, as downgrades are not allowed due to the database schema changes between versions. After the downgrade, I can confirm that no new PVCs are left in the Dev environment.

The Issue:
I recently upgraded the Dev environment to JupyterHub 4.0.1 (last week). I specifically asked a user who had never launched JupyterHub on Dev after the November 4.0.0 upgrade to launch an instance. Unfortunately, the issue persists: a new PVC is created instead of the existing PVC being used.

What I Expected:
Based on the behavior described in the JupyterHub chart 4.0.1, when an existing user tries to launch a JupyterHub instance, they should be able to use their existing PVC. For a new user, a new PVC should be created following the updated naming scheme.

Confusion:
I am unsure why the behavior in the test environment is different. In the test environment, this issue does not occur, and users are able to use their existing PVCs after upgrading. The problem seems specific to the Dev environment, even though we followed the same steps.

Can you diff your config between test and dev? Maybe there’s some pvc or naming-related setting that’s set in one but not the other, causing the difference. Especially useful to know if e.g. pvc_name_template is set and to what.
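
One way to do that comparison is sketched below, assuming both configs have already been loaded into plain Python dicts (for example by parsing the two Helm values files). The function names and the example values are hypothetical and only illustrate the shape of the output.

```python
UNSET = "<unset>"  # marker for a key present in only one config


def flatten(config: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into dotted paths, e.g. 'singleuser.storage.capacity'."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_configs(test_cfg: dict, dev_cfg: dict) -> dict:
    """Return {path: (test_value, dev_value)} for every setting that differs."""
    flat_test, flat_dev = flatten(test_cfg), flatten(dev_cfg)
    return {
        path: (flat_test.get(path, UNSET), flat_dev.get(path, UNSET))
        for path in sorted(flat_test.keys() | flat_dev.keys())
        if flat_test.get(path, UNSET) != flat_dev.get(path, UNSET)
    }


# Hypothetical example values, not taken from any real deployment.
test_cfg = {"singleuser": {"storage": {"capacity": "10Gi"}}}
dev_cfg = {
    "singleuser": {
        "storage": {"capacity": "10Gi", "extraVolumes": [{"name": "shared"}]}
    }
}

for path, (test_value, dev_value) in diff_configs(test_cfg, dev_cfg).items():
    print(f"{path}: test={test_value!r} dev={dev_value!r}")
```

Any storage- or naming-related path that shows up in the output is worth a closer look.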

Thank you @minrk, I couldn’t find any significant differences in the configurations between the test and dev environments. The only notable distinction is that in the dev environment there are some extra volumes attached to the single-user instances, whereas in the test environment there is only a single user volume. Do you think this could be relevant, i.e. could the additional volumes in dev be influencing the PVC creation behavior differently from the test environment? To me, it doesn’t seem like it would be the cause.

I don’t think so, but if you share what your volume config looks like, it might help. Any volume-related config and/or singleuser.storage config might be relevant.