Post save hook in single user with helm in jupyterhub

Hi,

I am trying to add a post-save hook to the single-user server configuration using Helm on Kubernetes. I would like the hook to be called for every user whenever a file is saved. I have tried many configurations, but with no luck.

I am using jupyterhub 4.1.3.

hub:
  extraConfig:
    spawner_conf.py: |
      def post_save_hook(model, os_path, **kwargs):
          print("====== the hook is called =====")
      c.FileContentsManager.post_save_hook = post_save_hook

This doesn’t work, I think because the FileContentsManager is not initialized yet when this hook is run.

I tried many different configuration options with single user like the following:

hub:
  singleuser:
    extraFiles:
      postSave:
        mountPath: /usr/local/bin/start-notebook.d/post-save.py
        mode: 0755
        stringData: |
          #!/opt/conda/bin/python
          import os
          def custom_post_save(model, os_path, **kwargs):
              print("============ hook was called ===========")
              
          print("===== setup is invoked ====")
          try:
              c.FileContentsManager.post_save_hook = custom_post_save
          except Exception as e:
              print(f"c is not available, exception {e}")

The config object c does not seem to be available. I also tried c = get_config(), but it has the same issue.

Would appreciate any help. Thanks.

The first option doesn’t work because configuration you’ve added to the Hub environment isn’t loaded in the singleuser environment.

The second option is close, but start-notebook.d is for scripts that run when the notebook server starts (i.e. it’s running python3 /usr/local/bin/start-notebook.d/post-save.py as its own process during startup), not for configuring the notebook server. If you change the mountPath in singleuser.extraFiles to /usr/local/etc/jupyter/jupyter_server_config.py, your config should be loaded.
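As a rough illustration of what the file mounted at /usr/local/etc/jupyter/jupyter_server_config.py could contain, here is a minimal sketch. The hook name custom_post_save comes from the config above; the returned message and the NameError guard are additions for illustration only (the guard just lets the file be run directly without failing), and logging through contents_manager.log assumes the contents manager passes itself to the hook as a keyword argument.

```python
# Sketch of a jupyter_server_config.py; not a definitive implementation.


def custom_post_save(model, os_path, contents_manager=None, **kwargs):
    """Report each save; meant to be called after a file has been written."""
    message = f"post_save_hook: saved {model.get('type')} at {os_path}"
    if contents_manager is not None:
        # Use the server's own logger rather than print().
        contents_manager.log.info(message)
    # The return value is only here to make the sketch easy to check by hand.
    return message


# `get_config` is only defined when Jupyter loads this file as configuration.
# The guard is an illustrative addition so the file also runs as a plain script.
try:
    c = get_config()  # noqa: F821
except NameError:
    c = None

if c is not None:
    c.FileContentsManager.post_save_hook = custom_post_save
```

Run as a plain script, this only defines the function; the hook is registered only when the server loads the file as configuration.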


@minrk thanks for the reply. I tried the change, redeployed, and restarted the single-user servers. Here are the logs from a save action:

[I 2024-05-22 17:57:16.798 ServerApp] Saving file at /run_query.ipynb
[I 2024-05-22 17:57:16.818 ServerApp] 200 PUT /user/[email protected]/api/contents/run_query.ipynb?1716400636697 ([email protected]@::ffff:10.252.10.18) 69.31ms
[I 2024-05-22 17:57:16.887 ServerApp] 200 GET /user/[email protected]/api/contents/run_query.ipynb?content=0&hash=1&1716400636827 ([email protected]@::ffff:10.252.10.18) 5.97ms
[I 2024-05-22 17:57:16.911 ServerApp] 200 GET /user/[email protected]/api/contents?content=1&hash=0&1716400636827 ([email protected]@::ffff:10.252.10.18) 28.36ms
[I 2024-05-22 17:57:16.956 ServerApp] 200 GET /user/[email protected]/api/contents/run_query.ipynb/checkpoints?1716400636905 ([email protected]@::ffff:10.252.10.18) 5.13ms
[I 2024-05-22 17:57:16.959 ServerApp] 201 POST /user/[email protected]/api/contents/run_query.ipynb/checkpoints?1716400636905 ([email protected]@::ffff:10.252.10.18) 5.96ms
[I 2024-05-22 17:57:19.856 ServerApp] 200 GET /user/[email protected]/api/terminals?1716400639809 ([email protected]@::ffff:10.252.10.18) 1.51ms
[I 2024-05-22 17:57:21.041 ServerApp] 200 GET /user/[email protected]/api/sessions?1716400640993 ([email protected]@::ffff:10.252.10.18) 1.95ms
[I 2024-05-22 17:57:21.045 ServerApp] 200 GET /user/[email protected]/api/kernels?1716400641000 ([email protected]@::ffff:10.252.10.18) 2.26ms

Also, the single-user logs don’t show the file being loaded when the single-user server was restarted.

Do I need to pass any parameter in c.KubeSpawner.args for the configuration to be picked up?

If this is your Z2JH config, it’s incorrect: singleuser should be at the top level, not under hub.

It added the spaces when I copied the config. I had the correct hierarchy, apologies for the confusion.

hub:
  ...
singleuser:
  extraFiles:
    postSave:

But it didn’t trigger the post_save_hook.

Could you show us your current singleuser.extraFiles: for completeness?
Is the file visible, and readable, inside your container?

Here’s the full config:

 singleuser:
    extraAnnotations:
      linkerd.io/inject: enabled
    image:
      name: quay.io/jupyter/scipy-notebook
      tag: "hub-4.1.5"
    startTimeout: 3600
    serviceAccountName: "{{ .Values.kubernetes.namespace }}-jupyterhub-user"
    extraEnv:
      DASK_SCHEDULER_ADDRESS: "dask-scheduler:8786"
      MODIN_ENGINE: "dask"
      MODIN_STORAGE_FORMAT: "pandas"
      # This is the public endpoint.
    extraFiles:
      mamba:
        mountPath: /usr/local/bin/start-notebook.d/20-mamba.sh
        stringData: |
          #!/bin/bash
          set -x
          # mamba install -y modin=0.26.1 s3fs=2023.12.2 boto3=1.33.13 r-irkernel
        mode: 0755
      postSave:
        mountPath: /usr/local/etc/jupyter/jupyter_server_config.py
        mode: 0755
        stringData: |
          import os
          def custom_post_save(model, os_path, **kwargs):
              print("============ hook was called ===========")
          
          print("===== setup is invoked ====")
          try:
              c.FileContentsManager.post_save_hook = custom_post_save
          except Exception as e:
              print(f"c is not available, exception {e}")

    networkPolicy:
      egress:
        - to:
          - namespaceSelector:
              matchLabels:
                name: {{ .Values.kubernetes.namespace }}
          - podSelector:
              matchLabels:
                app: middle

It’s been a few days and this is still not working. Should I try another avenue? Thanks.

This is the right place. Many people on this forum, and in the Jupyter community as a whole, are here voluntarily, so support may not be as fast as from a commercial software vendor.

Have you checked whether the file is visible, and readable, inside your container?

Thanks for the help. I really appreciate it, and the contributions.

I think the file is accessible and also executable, set to 755. Once the server started, I also tried running the script.

Here are the logs from a shell within the server.

jovyan@jupyter-jan-40betteromics-2ecom:~$ ls /usr/local/etc/jupyterhub/jupyterhub_config.d/jupyter_server_config.py
/usr/local/etc/jupyterhub/jupyterhub_config.d/jupyter_server_config.py
jovyan@jupyter-jan-40betteromics-2ecom:~$ ls -al /usr/local/etc/jupyterhub/jupyterhub_config.d/jupyter_server_config.py
-rwxr-xr-x 1 root users 285 Jun  9 00:24 /usr/local/etc/jupyterhub/jupyterhub_config.d/jupyter_server_config.py
jovyan@jupyter-jan-40betteromics-2ecom:~$ python /usr/local/etc/jupyterhub/jupyterhub_config.d/jupyter_server_config.py
===== setup is invoked ====
c is not available, exception name 'c' is not defined
jovyan@jupyter-jan-40betteromics-2ecom:~$ which python
/opt/conda/bin/python
jovyan@jupyter-jan-40betteromics-2ecom:~$ python --version
Python 3.11.9
jovyan@jupyter-jan-40betteromics-2ecom:~$ 
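The NameError above is what one would expect when a config file is run with plain python: as far as I understand it, the name c is supplied by Jupyter's config loader, not by Python itself. Here is a simplified stand-in for that behaviour. The load_config helper and the SimpleNamespace object are hypothetical and only for illustration; this is not how the real loader is implemented.

```python
from types import SimpleNamespace

# Same shape as the config file in this thread, shortened.
CONFIG_SOURCE = """
def custom_post_save(model, os_path, **kwargs):
    print("hook was called")

c.FileContentsManager.post_save_hook = custom_post_save
"""


def load_config(source):
    """Execute config source with a `c` object injected (hypothetical helper)."""
    c = SimpleNamespace(FileContentsManager=SimpleNamespace())
    exec(source, {"c": c})
    return c


# Without injection, `c` is undefined, matching the shell output above.
try:
    exec(CONFIG_SOURCE, {})
    direct_run_error = None
except NameError as e:
    direct_run_error = str(e)

# With injection, the assignment succeeds and the hook is registered.
loaded = load_config(CONFIG_SOURCE)
```

So the direct run failing does not by itself show that the server fails to load the file.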

These are the actual server logs (first 100 lines). Note that 20-mamba.sh is invoked, but there are no logs for the postSave file.

Entered start.sh with args: jupyterhub-singleuser --NotebookApp.tornado_settings={"headers":{"Content-Security-Policy": "frame-ancestors * 'self' http://localhost:5000 "}}
Running hooks in: /usr/local/bin/start-notebook.d as uid: 1000 gid: 100
++ '[' 0 -ne 0 ']'
Sourcing shell script: /usr/local/bin/start-notebook.d/20-mamba.sh
++ echo 'Done running hooks in: /usr/local/bin/start-notebook.d'
Done running hooks in: /usr/local/bin/start-notebook.d
++ id -u
+ '[' 1000 == 0 ']'
+ [[ '' == \1 ]]
+ [[ '' == \y\e\s ]]
++ id -u jovyan
+ JOVYAN_UID=1000
++ id -g jovyan
+ JOVYAN_GID=100
+ whoami
+ [[ jovyan != \j\o\v\y\a\n ]]
+ [[ 1000 != \1\0\0\0 ]]
+ [[ 100 != \1\0\0 ]]
+ [[ ! -w /home/jovyan ]]
+ source /usr/local/bin/run-hooks.sh /usr/local/bin/before-notebook.d
++ '[' 1 -ne 1 ']'
++ [[ ! -d /usr/local/bin/before-notebook.d ]]
+++ id -u
+++ id -g
Running hooks in: /usr/local/bin/before-notebook.d as uid: 1000 gid: 100
++ echo 'Running hooks in: /usr/local/bin/before-notebook.d as uid: 1000 gid: 100'
++ for f in "${1}/"*
++ '[' -e /usr/local/bin/before-notebook.d/10activate-conda-env.sh ']'
++ case "${f}" in
++ echo 'Sourcing shell script: /usr/local/bin/before-notebook.d/10activate-conda-env.sh'
++ source /usr/local/bin/before-notebook.d/10activate-conda-env.sh
Sourcing shell script: /usr/local/bin/before-notebook.d/10activate-conda-env.sh
++++ conda shell.bash hook
+++ eval 'export CONDA_EXE='\''/opt/conda/bin/conda'\''
export _CE_M='\'''\''
export _CE_CONDA='\'''\''
export CONDA_PYTHON_EXE='\''/opt/conda/bin/python'\''

# Copyright (C) 2012 Anaconda, Inc
# SPDX-License-Identifier: BSD-3-Clause
__conda_exe() (
    "$CONDA_EXE" $_CE_M $_CE_CONDA "$@"
)

__conda_hashr() {
    if [ -n "${ZSH_VERSION:+x}" ]; then
        \rehash
    elif [ -n "${POSH_VERSION:+x}" ]; then
        :  # pass
    else
        \hash -r
    fi
}

__conda_activate() {
    if [ -n "${CONDA_PS1_BACKUP:+x}" ]; then
        # Handle transition from shell activated with conda <= 4.3 to a subsequent activation
        # after conda updated to >= 4.4. See issue #6173.
        PS1="$CONDA_PS1_BACKUP"
        \unset CONDA_PS1_BACKUP
    fi
    \local ask_conda
    ask_conda="$(PS1="${PS1:-}" __conda_exe shell.posix "$@")" || \return
    \eval "$ask_conda"
    __conda_hashr
}

__conda_reactivate() {
    \local ask_conda
    ask_conda="$(PS1="${PS1:-}" __conda_exe shell.posix reactivate)" || \return
    \eval "$ask_conda"
    __conda_hashr
}

conda() {
    \local cmd="${1-__missing__}"
    case "$cmd" in
        activate|deactivate)
            __conda_activate "$@"
            ;;
        install|update|upgrade|remove|uninstall)
            __conda_exe "$@" || \return
            __conda_reactivate
            ;;
        *)
            __conda_exe "$@"
            ;;
    esac
}

if [ -z "${CONDA_SHLVL+x}" ]; then
    \export CONDA_SHLVL=0
    # In dev-mode CONDA_EXE is python.exe and on Windows
    # it is in a different relative location to condabin.
    if [ -n "${_CE_CONDA:+x}" ] && [ -n "${WINDIR+x}" ]; then
        PATH="$(\dirname "$CONDA_EXE")/condabin${PATH:+":${PATH}"}"
    else
        PATH="$(\dirname "$(\dirname "$CONDA_EXE")")/condabin${PATH:+":${PATH}"}"
    fi
    \export PATH
...

Finally after a few hundred lines it does say the following:

/opt/conda/lib/python3.11/site-packages/traitlets/traitlets.py:1241: UserWarning: Overriding existing post_save_hook (custom_post_save) with a new one (custom_post_save).
  return self.func(*args, **kwargs)

But when I save a notebook file in the server (I tried both the JupyterHub server and the single-user server), none of these print messages show up.

After some investigation, it seems it was working all along, as @minrk suggested. The print output is just not captured in the server logs, possibly because the post-save hook is called from a thread.

I tried writing to a temporary file when the save is called, and the file is being generated. Thanks for the help.
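For anyone who wants to verify the hook the same way, here is a minimal sketch of the temporary-file check. The marker path and the line format are my own choices for illustration, not anything Jupyter requires.

```python
import os
import tempfile
from datetime import datetime, timezone

# Hypothetical marker file; any path writable by the notebook user would do.
MARKER_PATH = os.path.join(tempfile.gettempdir(), "post_save_hook.log")


def custom_post_save(model, os_path, **kwargs):
    """Append one line per save so the hook can be checked without server logs."""
    stamp = datetime.now(timezone.utc).isoformat()
    with open(MARKER_PATH, "a", encoding="utf-8") as f:
        f.write(f"{stamp} saved {os_path}\n")


# In jupyter_server_config.py the hook is then registered with:
#   c.FileContentsManager.post_save_hook = custom_post_save
```

After saving a notebook, the marker file should gain one line per save, which can be checked from a terminal inside the user pod.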