Tutor troubleshooting notes
This is not an official guide! This is just a log of my (@Kyle McCormick's) experiences running Tutor locally, both for my own reference and for the community’s reference as we work to improve Tutor and its official docs via the Archive: Tutor DevEnv Adoption.
Task 1: Set up Tutor for local development
Goal
Using a branch B, which is based on openedx/edx-platform:master, I want to:
Run LMS and CMS with local code
Run static analysis & tests
Run a management command
Relevant Tutor Docs
“Open edX Development”: https://docs.tutor.overhang.io/dev.html
Particularly, this section: https://docs.tutor.overhang.io/dev.html#setting-up-a-development-environment-for-edx-platform
“Running Open edX on the master branch (“nightly”)”: Obsolete: Tutor nightly — Tutor documentation
Issues Encountered
When working out of a repo generated by tutor dev bindmount:
git state was confusing as a newcomer
no branch information
patches applied on top of master - wasn’t clear where they came from at first
copied static assets and custom settings files made git state dirty
git remote protocol needed to be switched from https to ssh to allow push
git had to be configured to pull relevant branches
When using tutor dev start ... or tutor dev exec ...:
settings weren’t set correctly? Regis mentioned that exec doesn’t set DJANGO_SETTINGS_MODULE properly…
When setting OPENEDX_COMMON_VERSION to my feature branch and trying tutor images build openedx:
If branch has updates, needed to use --no-cache, which took a very long time.
Had to override openedx-dockerfile-git-patches-default to be empty, otherwise cherry-picking would conflict with the state of my branch (because said commits are already on master, it seems?)
Volume mounting requires full path; ~ for home didn’t work.
Had to start from scratch due to < insert requirements problem that carlos had >
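The git remote and branch fixes listed above can be sketched as a couple of shell helpers. This is only a sketch under assumptions: the function names https_to_ssh and fix_remote are made up for illustration, and the remote is assumed to be called origin and to point at GitHub. The git subcommands themselves (remote get-url, remote set-url, config, fetch) are standard.

```shell
# Hypothetical helper: rewrite an HTTPS GitHub URL into its SSH form.
# URLs that do not start with https://github.com/ pass through unchanged.
https_to_ssh() {
    printf '%s\n' "$1" | sed 's#^https://github\.com/#git@github.com:#'
}

# Hypothetical helper: switch a remote to SSH and fetch all of its branches.
# Run it from inside the repository; the remote name defaults to "origin"
# (an assumption, check "git remote -v" first).
fix_remote() {
    remote="${1:-origin}"
    old_url="$(git remote get-url "$remote")" || return 1
    git remote set-url "$remote" "$(https_to_ssh "$old_url")" || return 1
    # Track every branch on the remote, not only the one that was checked out.
    git config "remote.$remote.fetch" "+refs/heads/*:refs/remotes/$remote/*" || return 1
    git fetch "$remote"
}
```

After sourcing these, running fix_remote inside the bind-mounted repository should leave it able to push over SSH and to check out feature branches. It does not address the dirty working tree caused by copied static assets and settings files.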
Successful Approach
# Clone edx-platform and switch to my branch.
$ git clone git@github.com:openedx/edx-platform
$ (cd edx-platform && git checkout $B)
# Install Tutor Nightly.
$ virtualenv tutor-venv
$ source tutor-venv/bin/activate
$ git clone --branch nightly git@github.com:overhangio/tutor
$ cd tutor
$ pip install -r requirements/dev.txt
$ pip install -e .
# Provision Tutor.
$ tutor local quickstart
$ tutor local stop
# Copy edx-platform virtual environment to host.
$ tutor dev start lms
$ tutor dev bindmount lms /openedx/venv
$ tutor dev stop lms
$ tutor dev dc rm
# Configure Tutor to mount your local edx-platform and virtual environment
# by writing a env/dev/docker-compose.override.yml file to your Tutor env.
# The syntax for each mount is $HOST_PATH:$DOCKER_PATH.
# Be sure to substitute in the appropriate $HOST_PATH for each mount.
$ cat > $(tutor config printroot)/env/dev/docker-compose.override.yml <<- EOF
version: "3.7"
services:
  lms:
    volumes:
      - /home/kyle/openedx/edx-platform:/openedx/edx-platform
      - /home/kyle/.local/share/tutor-nightly/volumes/venv:/openedx/venv
  cms:
    volumes:
      - /home/kyle/openedx/edx-platform:/openedx/edx-platform
      - /home/kyle/.local/share/tutor-nightly/volumes/venv:/openedx/venv
  lms-worker:
    volumes:
      - /home/kyle/openedx/edx-platform:/openedx/edx-platform
      - /home/kyle/.local/share/tutor-nightly/volumes/venv:/openedx/venv
  cms-worker:
    volumes:
      - /home/kyle/openedx/edx-platform:/openedx/edx-platform
      - /home/kyle/.local/share/tutor-nightly/volumes/venv:/openedx/venv
EOF
# Install requirements, provision demo course, admin user and static assets.
$ tutor dev run lms make requirements
$ tutor dev run lms npm install
$ tutor dev run lms openedx-assets build --env=dev
$ tutor dev createuser admin admin@example.com --password admin --staff --superuser
$ tutor dev importdemocourse
# (NOT REQUIRED - JUST EXAMPLES) Run tests, linting, and a management command.
$ tutor dev run lms pytest path/to/some/code
$ tutor dev run lms pylint path/to/some/code
$ tutor dev run lms ./manage.py lms run_some_management_command
# Run LMS and CMS.
$ tutor dev start -d lms cms
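The override file in the steps above repeats the same two mounts for four services, and it needs absolute host paths because ~ did not work. A minimal sketch of generating it instead is shown here. The function name print_override is hypothetical, and the layout of the output simply mirrors the file written above.

```shell
# Hypothetical helper: print a docker-compose override that mounts the same
# two host directories into every edx-platform service.
#   $1 = absolute host path to the edx-platform checkout
#   $2 = absolute host path to the copied virtual environment
print_override() {
    platform_dir="$1"
    venv_dir="$2"
    # Refuse anything that is not an absolute path (this also rejects "~/...").
    for dir in "$platform_dir" "$venv_dir"; do
        case "$dir" in
            /*) ;;
            *) echo "error: '$dir' is not an absolute path" >&2; return 1 ;;
        esac
    done
    echo 'version: "3.7"'
    echo 'services:'
    for service in lms cms lms-worker cms-worker; do
        echo "  $service:"
        echo "    volumes:"
        echo "      - $platform_dir:/openedx/edx-platform"
        echo "      - $venv_dir:/openedx/venv"
    done
}
```

For example, print_override "$HOME/openedx/edx-platform" "$(tutor config printroot)/volumes/venv" redirected into "$(tutor config printroot)/env/dev/docker-compose.override.yml" should reproduce the file above. The volumes/venv location is inferred from the paths in these notes, so check it on your own machine. Using $HOME rather than ~ lets the shell expand the path before Docker ever sees it.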
Suggested Improvements
This was just a first pass at suggested improvements. We’ve been iterating on these ideas in some GitHub issues:
# Clone edx-platform and switch to my branch.
$ git clone git@github.com:openedx/edx-platform
$ (cd edx-platform && git checkout $B)
# Install Tutor Nightly.
$ virtualenv tutor-venv
$ source tutor-venv/bin/activate
# Either:
$ git clone --branch nightly git@github.com:overhangio/tutor
$ cd tutor
$ make requirements
# OR:
# tutor-nightly could be a metapackage depending on the latest nightly (`N.dev`) tutor release
$ pip install tutor-nightly
# Configure mounting:
# * from my edx-platform to the container's /openedx/edx-platform, and
# * from the default location in tutor-nightly config to /openedx/venv.
$ tutor config save --set OPENEDX_MOUNTS=/home/kyle/openedx/edx-platform:/openedx/edx-platform,/openedx/venv
# Provision Tutor, to include a default user as well as static assets.
$ tutor dev quickinit
# Run tests, linting, and a management command.
$ tutor dev run lms bash
app@lms$ pytest path/to/some/code
app@lms$ pylint path/to/some/code
app@lms$ ./manage.py lms run_some_management_command
app@lms$ exit
# Run LMS, with virtual environment and application code from host.
$ tutor dev runserver -d lms cms
Task 2: Set up Tutor in a local Kubernetes (k8s) cluster
Goal
Run LMS, CMS and the Learning MFE in a k8s cluster on my local machine using Tutor
Relevant Tutor Docs
Kubernetes deployment — Tutor documentation
Notes
Stream of consciousness… follow-up items are denoted with
Docs recommend Minikube for trying things out. Great.
Minikube setup was easy.
tutor k8s start: Failing. Unable to connect to MySQL.
Tried tutor k8s exec 'mysql mysql --username=... --password=...', could not connect.
Also, tutor k8s exec command arg1 arg2: this doesn't work. Needs to be tutor k8s exec 'command arg1 arg2'.
Went back to dev mode, confirmed that I can connect to MySQL.
Brain: These are two different databases, so I need to run quickstart again.
tutor k8s quickstart:
Hung when it got to discovery.
Disabled discovery for now.
Without discovery plugin enabled, quickstart succeeds 👍🏻
How do I view LMS in the browser?
local.overhang.io : unable to connect.
docs tell me to look at caddy's external IP and configure my DNS server with it.
kubectl --namespace openedx get services/caddy says that EXTERNAL_IP is <pending>
StackOverflow tells me that I need to run minikube tunnel in order for the load balancer to work within minikube
Now, EXTERNAL_IP shows up.
Putting that IP address in the browser hangs in an encouraging way but doesn't show anything.
Caddy's logs show that I am successfully making requests to Caddy.
I'm looking at the Caddyfile. Caddyfile is defined in terms of the hostnames I have from config.yaml. That is, local.overhang.io.
Brain: Caddy is expecting URLs to be in the form of *.local.overhang.io/*
Edit /etc/hosts, pointing local.overhang.io at the value of EXTERNAL_IP
Go to https://local.overhang.io in the browser... hangs still
Brain: Right, I don't have TLS set up
Go to http://local.overhang.io: LMS loads!
Takeaway: Would be good to note the minikube tunnel and /etc/hosts changes that were necessary for this to work locally.
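The minikube tunnel and /etc/hosts steps above can be sketched roughly as follows. The helper names hosts_line and caddy_external_ip are hypothetical. The kubectl call is the same query used in the notes, with a JSONPath output added to extract only the IP address.

```shell
# Hypothetical helper: format an /etc/hosts line for one IP and its hostnames.
hosts_line() {
    ip="$1"
    shift
    printf '%s' "$ip"
    for name in "$@"; do
        printf ' %s' "$name"
    done
    printf '\n'
}

# Hypothetical helper: ask Kubernetes for the external IP of the caddy service.
# While the IP is still <pending> (for example, before "minikube tunnel" is
# running in another terminal) this yields no address.
caddy_external_ip() {
    kubectl --namespace openedx get services/caddy \
        --output jsonpath='{.status.loadBalancer.ingress[0].ip}'
}
```

With minikube tunnel running in a separate terminal, something like hosts_line "$(caddy_external_ip)" local.overhang.io piped into sudo tee -a /etc/hosts should add the entry. Any subdomains your config.yaml defines under local.overhang.io would need to be listed as extra arguments, since /etc/hosts has no wildcard support. The site is then reachable over plain http, as noted above.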
Trying to log in...
Need to make a superuser
tutor k8s exec lms bash ...
./manage.py lms manage_user ... --superuser
Could not load django_debug_toolbar
DJANGO_SETTINGS_MODULE=lms.envs.production ./manage.py lms manage_user ... --superuser
TODO: File an issue for DJANGO_SETTINGS_MODULE being mis-set?
Log in works!
Trying to enroll...
Whoops, no courses.
tutor k8s importdemocourse.
That worked! Enrolled.
Trying to start the learning mfe...
tutor plugins enable mfe
tutor config save
tutor k8s init --limit=mfe -> exited successfully
MFE app isn't running, logs say
Error from server (BadRequest): container "mfe" in pod "mfe-7757f78d77-qcpmr" is waiting to start: image can't be pulled
according to deployments.yml, the assigned image is docker.io/overhangio/openedx-mfe:13.0.2
docker pull docker.io/overhangio/openedx-mfe:13.0.2 yields manifest for overhangio/openedx-mfe:13.0.2 not found: manifest unknown: manifest unknown
Went to openedx-mfe on DockerHub, latest image is 12.01.
Brain: Oh, maybe I'm supposed to build this myself?
tutor images build mfe
Was going to file an issue about the missing 13.x dockerhub image, but then I found Why is the mfe image not pre-built
Image built successfully!
Tutor still wants to use the remote image, which doesn’t exist…
Pushed custom-built image to https://hub.docker.com/r/kdmccormick96/openedx-mfe, and set MFE_DOCKER_IMAGE to point to it.
We should either:
address the lack of dynamic config overrides for MFEs (on frontend-wg’s radar with https://github.com/openedx/frontend-wg/milestone/2 , not currently being worked on, though)
add docs explaining how to work around the MFE issue by building a custom image, and somehow remove the messaging about the MFE image being missing from dockerhub… which is technically correct but doesn’t lead the user to the solution.
tutor k8s stop mfe && tutor k8s start mfe -> The Deployment "mfe" is invalid.
I think it's failing because there are old pods running still. See 1: Old pods aren't getting destroyed
Went to dashboard (minikube dashboard) and selected the openedx namespace.
There are still pods running, but their instance ID is different than the instance ID that tutor k8s stop is using.
Find the instance ID from the dashboard.
Run kubectl delete --namespace openedx --selector=app.kubernetes.io/instance=openedx-INSTANCE_ID_FROM_DASHBOARD,app.kubernetes.io/component!=loadbalancer deployments,services,configmaps,jobs. Resources are deleted.
TODO: what was happening here?
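The stale instance ID can probably also be found without the dashboard. The following is a sketch under assumptions: the helper names are hypothetical, and it relies only on the app.kubernetes.io/instance label that appears in the kubectl delete command above.

```shell
# Hypothetical helper: build the label selector used with "kubectl delete"
# in these notes, for a given Tutor instance ID (the part after "openedx-").
instance_selector() {
    printf 'app.kubernetes.io/instance=openedx-%s,app.kubernetes.io/component!=loadbalancer\n' "$1"
}

# Hypothetical helper: list the distinct instance labels found on deployments
# in the openedx namespace. More than one line of output would suggest that
# resources from an older instance ID are still around.
list_instance_labels() {
    kubectl --namespace openedx get deployments \
        --output jsonpath='{range .items[*]}{.metadata.labels.app\.kubernetes\.io/instance}{"\n"}{end}' \
        | sort -u
}
```

list_instance_labels prints full label values such as openedx-INSTANCE_ID. Strip the openedx- prefix before passing the ID to instance_selector, whose output can then be given to kubectl delete via --selector exactly as in the command above.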
Run tutor k8s start again.
local.overhang.io hangs. Ran tutor k8s logs lms. Get django.db.utils.OperationalError: (1045, "Access denied for user 'openedx'@'172.18.0.1' (using password: YES)").
Ran tutor k8s quickstart again.
Job failed; access denied still.
Stop containers: tutor k8s stop.
Ran tutor k8s quickstart again. Still access denied.
Brain:
This must be an issue with the mysql database itself. Where is mysql data stored?
checking volumes.yml...
it's stored using the default storage class
kubectl get storageclass -> minikube hostPath is the default -> it's somewhere in the minikube container.
maybe restarting minikube will start me off with a fresh database?
minikube stop && minikube start && tutor k8s quickstart -> same error as before
ugg, let's try minikube delete && minikube start && tutor k8s quickstart
TODO: what was happening here?
After destroying and starting over…
Minor issue: External IP had changed. Had to update /etc/hosts
Another issue: Celery 5 seems to break lms-worker and cms-worker. This isn’t specific to k8s, and seems to have come up in the past week.
With the Celery fix in place, Studio and LMS work!
and with my custom image pushed to DockerHub, the course outline in the Learning MFE works!
Unfortunately, courseware in the Learning MFE shows “An error has occurred” with no other explanation. No JS error logs in the console or 5XXs/4XXs in the network log.
Workaround: Toggle on courseware.use_legacy_frontend waffle flag in order to use legacy courseware
TODO: figure out why this didn’t work
Now, using legacy, courseware works.
References
1: Old pods aren't getting destroyed
$ tutor k8s start mfe
kubectl get namespaces openedx
NAME      STATUS   AGE
openedx   Active   12h
Namespace already exists: skipping creation.
kubectl apply --kustomize /home/kyle/.local/share/tutor-nightly/env --selector app.kubernetes.io/name=mfe
The Deployment "mfe" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/instance":"openedx-KXV0satwwOtuGRz7lwgslwU7", "app.kubernetes.io/managed-by":"tutor", "app.kubernetes.io/name":"mfe", "app.kubernetes.io/part-of":"openedx"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
Error: Command failed with status 1: kubectl apply --kustomize /home/kyle/.local/share/tutor-nightly/env --selector app.kubernetes.io/name=mfe
$ tutor k8s stop
kubectl delete --namespace openedx --selector=app.kubernetes.io/instance=openedx-KXV0satwwOtuGRz7lwgslwU7,app.kubernetes.io/component!=loadbalancer deployments,services,configmaps,jobs
service "cms" deleted
service "elasticsearch" deleted
service "lms" deleted
service "mfe" deleted
service "mongodb" deleted
service "mysql" deleted
service "redis" deleted
service "smtp" deleted
configmap "caddy-config-b9gk6kf847" deleted
configmap "openedx-config-kkhkt28hth" deleted
configmap "openedx-settings-cms-bkft5k2b4h" deleted
configmap "openedx-settings-lms-h25dtdkdh2" deleted
configmap "redis-config-fccm65mh4m" deleted
$ kubectl --namespace=openedx get pods
NAME                            READY   STATUS             RESTARTS      AGE
caddy-85589d6669-dnf65          1/1     Running            1 (11m ago)   52m
cms-7dd67c5f55-b4rvb            1/1     Running            1 (11m ago)   52m
cms-job-20220303113051-lrzh5    0/1     Completed          0             51m
cms-worker-67846c7d7b-t2djr     1/1     Running            1 (11m ago)   52m
elasticsearch-d8b6859f7-7rfmr   1/1     Running            1 (11m ago)   52m
lms-788dfd9669-spgdw            1/1     Running            1 (11m ago)   52m
lms-job-20220303113217-gwj82    0/1     Completed          0             50m
lms-worker-794866f475-zhzzb     1/1     Running            1 (11m ago)   52m
mfe-7757f78d77-qcpmr            0/1     ImagePullBackOff   0             49m
mongodb-f4f6dc446-jn7ck         1/1     Running            1 (11m ago)   52m
mysql-7cb5f98d7-tlf76           1/1     Running            1 (11m ago)   52m
redis-7ddff4c6dd-hcxb6          1/1     Running            1 (11m ago)   52m
smtp-7454bf587b-6cl9l           1/1     Running            1 (11m ago)   52m
$ tutor k8s stop
kubectl delete --namespace openedx --selector=app.kubernetes.io/instance=openedx-KXV0satwwOtuGRz7lwgslwU7,app.kubernetes.io/component!=loadbalancer deployments,services,configmaps,jobs
No resources found
$