[BD-19, BD-20] Meeting Notes

2021-01-12

  • course-discovery deployed to prod

    • mostly going well. some small issues surfacing from prospectus builds

  • edX working to clean up the old usages of ES 1.5

2021-01-05

  • talking to DE, we want to halt work on edx-analytics-data-api for now, because DE cannot spare the resources

  • course-discovery

    • try to work on this earlier in the day for edX

    • Hopefully one last issue will be resolved this week

    • If we find others, we will focus on getting RG a good dev setup to run the same tests we are

2020-12-22

  • course-discovery

    • handled rebasing onto master

    • did some more testing on stage

      • some issues arose during testing

      • understand how to address these issues

      • should be handled by the next meeting

  • edx-analytics-pipeline

    • no updates at the moment

    • questions:

      • who can handle questions from DE while Stuart is out?

        • A: not sure, @Diana Huang will check in with DE but it’s possible that they are too understaffed at the moment.

      • test model that handles connection with ES

2020-12-15

  • course-discovery

    • nothing has changed since last week

    • have not had any of Feanil’s time

    • one last question that needs to be addressed, having trouble replicating it

    • can we check the latest commit?

  • edx-analytics-api

    • did a big debugging session between RG and edX

    • made progress on acceptance tests

      • tests are failing and logs will be sent to RG

    • waiting for edX to return feedback/logs

2020-12-08

  • course-discovery had fixes added

    • found some new smaller issues

      • filtering for org field in exclude mode

      • fixes for this have been added

    • waiting for the next phase of testing

    • fix managed to handle other potential problems

    • one unresolved issue left behind

      • don’t have the same number of results between searches

      • have the database dump - trying to recreate on devstack

      • trying the old process of doing a debugging session

        • blocked on Feanil’s time

  • edx-analytics-pipeline

    • asked DE about how the ES is run

      • run on AWS Elasticsearch

    • helpful if we run service tests using the OpenDistro image

    • cannot reproduce the issue locally, but will continue to try it

    • unfortunately, will need to reconfigure ES to work this way

    • check in with DE about what we might be able to do in terms of accelerating testing

2020-12-01

  • Nikolay is back from vacation

    • working on resolving the outstanding issues with course-discovery

    • plans to do another testing session today

    • currently on stage - testing to happen later today

    • needed to handle requirements for the enterprise team(s)

      • add more filters to handle these cases

    • waiting for results

  • edx-analytics-pipeline

    • fixed Travis issues with the PR

    • need edX to run acceptance tests to verify that these changes are working

2020-11-24

  • edx-analytics-pipeline is being worked on

    • Travis CI integration fixed

    • tests being added for integration with ES

    • should be ready to go

    • update acceptance tests to use ES7 cluster settings

    • talk with DE - create a thread in Slack where we can discuss the changes that may need to happen

  • waiting for Nikolay for course-discovery

    • resolving the last few issues

    • aiming for the first couple of weeks of December to deploy

  • edx-search

    • deployed! currently no issues

  • just a note: cs_comments_service ES nodes are crashing periodically.

    • TNL team on edX is investigating why this might be happening - slow queries or similar

    • should we just allocate more memory to the nodes?

2020-11-17

  • New PRs for course-discovery/Haystack replacement

    • RG hasn’t reviewed, but edX can/will do that review

    • Waiting on a few last issues to be resolved

    • Waiting for Nikolay to get back from vacation to address those issues

  • prepared PR to edx-analytics-pipeline

    • the code that was broken should now be fixed

    • once unit tests are green, we can ask DE/Stu to run the acceptance tests again

  • edx-search

    • edX is working on testing this on stage

    • we’re seeing some failures indexing courses on stage. edX will be investigating this more closely and will inform RG if code changes are needed.

2020-11-10

  • no blockers on course-discovery, edx-notes-api

    • waiting for feedback from edX

    • course-discovery

      • waiting for time to do the testing

      • filtering in ES does not work case-insensitively

        • differences between ES1.5 and ES7

        • came up with a solution: keep everything lower case

      • planning on doing some more testing

    • edx-notes-api

      • one issue found

      • getting a bad request into ES

      • tried to do a direct query

      • edX to try to do more testing and investigation into the issue

  • edx-analytics-data-api

    • found strange error - ES library failing request to ES

      • need to investigate this issue

    • writing more tests

  • edx-search

    • results need to be paginated to handle the full result set

      • who should update platform side to support pagination?

        • RG for now

        • will reach out to edX if anything changes
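
The case-insensitive filtering fix noted under course-discovery above (keep everything lower case) can be illustrated with a small sketch. This is not the course-discovery code: `KeywordIndex`, `normalize`, and the sample org values are hypothetical names, and the in-memory dictionary only stands in for an exact-match field. The idea shown is that the same normalization is applied when a value is indexed and when it is used as a filter.

```python
from collections import defaultdict


def normalize(value: str) -> str:
    """Lower-case a value so exact-match lookups ignore case."""
    return value.lower()


class KeywordIndex:
    """Hypothetical in-memory stand-in for an exact-match field."""

    def __init__(self):
        # normalized org value -> set of document ids
        self._postings = defaultdict(set)

    def add(self, doc_id: str, org: str) -> None:
        # Normalize at index time...
        self._postings[normalize(org)].add(doc_id)

    def filter_by_org(self, org: str) -> list:
        # ...and again at query time, so "edX", "EDX" and "edx" all match.
        return sorted(self._postings.get(normalize(org), set()))


index = KeywordIndex()
index.add("course-1", "edX")
index.add("course-2", "EDX")
index.add("course-3", "MITx")

matches = index.filter_by_org("Edx")
```

If only one side were normalized, a filter on "Edx" would miss both stored spellings, which is the kind of mismatch the note describes.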

2020-11-03

  • conversation about accelerating syncing

  • trying to do a daily sync update. this seems to be going well.

  • finding new and unexpected bugs

  • course-discovery

    • last time, one issue with a POST request

    • added a fix - still a small problem

    • waiting to find some time to test the latest fix

  • edx-notes-api

    • found an issue on prod

    • couldn’t figure out why this was happening

    • have some tests for running against the ES instance directly

  • edx-search

    • trying to test with the settings updated

    • found an issue in local yesterday - probably just an earlier version of edx-search being used incorrectly

  • edx-analytics-pipeline

    • RG to resolve the issue that was found by the acceptance tests

    • all calls to ES in this repo are mocked, so the unit tests didn’t catch it

2020-10-27

  • work at RG focused on bugfixing/testing/etc

  • course-discovery - actively working on testing and fixing bugs

    • results showing

    • Feanil prepared tests that needed to pass

    • 500 error on some endpoints for course discovery

    • found one last issue on Monday - POST request issue with search all

    • fixes submitted this morning. we need to test to ensure these work

  • Talking with Robert about forums issues - trying to resolve issues that were found on prod

    • We can’t reproduce these issues locally - sort by vote issues that were found

    • This is the main issue found - can we find a way to reproduce it

    • Found a course, provided json to prove - can RG use this json to reproduce locally? if we need to provide staging comments data, we can probably pass it along

  • Seeing some issues with edx-search and edx-platform - can see if we can fix those issues with ES settings

2020-10-13

  • Focused on two repos currently in deployment: cs_comments_service, course-discovery

  • Two other repos in progress: analytics work, mostly resolving conflicts, addressing requests

    • Adding tests to analytics-pipeline

  • RG to be on vacation on Wednesday

  • course-discovery

    • some progress being made

    • made some more updates and resolved conflicts

    • did some more testing against Stage data

    • lots of communication via Slack - managed to resolve issues

    • only 4 issues left - 3 of them have the same root cause

    • still didn’t have Stage data on RG - using raw ES queries to edX for testing

    • tests have been written for the new issues/cases we have so that we can try to triage

    • manual testing still showing issues

  • edx-analytics-api

    • resolved conflicts

  • cs_comments_service

    • resolved all issues and comments

  • analytics-pipeline

    • adding tests

2020-10-06

  • Actively working on course-discovery

    • Did second round of testing on Stage - produced a list

    • Went through every item on the list - did a thorough evaluation of every issue

    • Submitted fixes and tests

    • Explanation on the issues posted in Slack

    • PR of fixes has some conflicts to resolve against edx master branch - just needs to be rebased - hoping Feanil can review new code today

    • Don’t need access to data for now

  • Resolved issues with edx-search

  • Added alias functionality back to cs_comments_service - responding to questions and review comments

  • Slack threads can be hard to respond to without context

  • Settings - @Diana Huang to submit PRs to update the settings against RG side

    • at least give an example for RG to look at

2020-09-29

  • RG mainly focused on bugs handled on deployed services

    • course-discovery continuing work - addressed issues found on Stage

    • forums feedback from the TNL squad, addressing issues

    • edx-search and edx-platform updates and fixes based on feedback as well

  • course-discovery

    • progress happening here

    • initial testing on stage last week, found issues

    • working between edX and RG to resolve issues in course-discovery

      • more testing on stage with the fixes, some issues found. old ones not entirely fixed and new ones discovered

      • despite local testing with similar data, still seeing issues on stage with stage data

      • unsure if the data edX provided can replicate all the issues we are seeing

        • can we get a dump of data off of stage? - probably some PII concerns

        • edX to talk to DE about the security/PII concerns

      • planning on getting in fixes back for tomorrow

  • edx-notes-api is in production

    • with issues - had problems indexing all the notes

    • having some devstack problems, but ES seems to be working

  • reopened PR against edx-platform

    • resolved test failures on branch

  • opened PR against edx-analytics-pipeline

    • code coverage issues - not enough code coverage of ES sections

    • hopefully the pipeline acceptance tests cover some of this

  • cs_comments_service PR being reviewed by edX

    • most of the issues resolved

    • planning on returning alias functionality to allow for swap between indexes, allowing for quick swapover

  • devstack issues

    • edx-notes-api uses the new ES7 code but still pointing at the old ES container

    • edX working to resolve devstack issues right now

      • focus on skipping notes in the devstack - high priority - @Diana Huang

    • last week - community asked about the devstack issues
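
The alias functionality planned for cs_comments_service above (swapping between indexes for a quick swapover) can be sketched roughly as follows. This is an assumption-laden illustration, not the forums code or a real Elasticsearch client: `FakeCluster`, `point_alias`, and the index names `comments_v1` / `comments_v2` are all hypothetical. It shows readers querying through an alias while a new index is built, then the alias being repointed in one step.

```python
class FakeCluster:
    """Hypothetical in-memory stand-in for a search cluster (not a real client)."""

    def __init__(self):
        self.indexes = {}  # index name -> list of documents
        self.aliases = {}  # alias name -> index name

    def create_index(self, name, docs):
        self.indexes[name] = list(docs)

    def point_alias(self, alias, index_name):
        # Refuse to point at an index that has not been built yet.
        if index_name not in self.indexes:
            raise KeyError(f"unknown index: {index_name}")
        previous = self.aliases.get(alias)
        self.aliases[alias] = index_name  # the swap is this single assignment
        return previous

    def search(self, alias):
        # Readers only ever use the alias, never a concrete index name.
        return self.indexes[self.aliases[alias]]


cluster = FakeCluster()
cluster.create_index("comments_v1", ["old thread"])
cluster.point_alias("comments", "comments_v1")

# Rebuild into a fresh index while readers keep using comments_v1.
cluster.create_index("comments_v2", ["old thread", "new thread"])
old_index = cluster.point_alias("comments", "comments_v2")
results = cluster.search("comments")
```

Because the old index is left in place after the swap, pointing the alias back at it is an equally quick rollback.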

2020-09-22

  • issues with devstack and provisioning that need to get fixed

    • ES7 code deployed but not finished

  • course-discovery test was positive but some issues were raised

    • most of them are easily fixable, but some of them need to be discussed when Mike gets back

  • testing edx-platform and edx-search, PR raised some issues

    • can’t reproduce it locally

    • can see which tests are failing - exploring what the reasons are

      • fixes for edx-search

2020-09-15

  • responded to issues with edx-notes-api and edx-search questions

  • edx-analytics-pipeline PR is being prepared and will be ready to be opened against upstream

  • will aim to get edx-search and course-discovery onto stage to do some manual testing this week

  • more people being added to the project from the edX side, may see new names and new questions.

2020-09-08

  • Updates first, then retrospective

  • edx-search and edx-platform changes

    • last week, we talked about removing the old interface

    • submitted updates and need to test this

  • forums is in testing - resolved some issues

    • unit test passed, but did some small manual testing

    • found that previously created indexes were not removed

    • waiting for review from edX/ @Diana Huang

  • working on edx-analytics-pipeline right now

Retrospective

  • What went well:

    • Generally - good and interesting job, an adventure

    • Did the best to get it done quickly and well

    • Good project for RG to pick up, due to familiarity with ES

    • Managed to handle the volume and complexity of the work well and adapted quickly

    • Team worked well together

    • Good communication channels, found solutions together

      • Good to look at PRs early

      • Good feedback loops between teams

      • Especially across timezones

      • Excellent transparency

      • Weekly checkins were a good cadence for discussion

    • Lots of tests already existed in the Open edX codebase - made it easier to ensure that functionality didn’t break in the upgrade/library change

    • Even if the project was larger, was able to react quickly

    • RG pro-actively suggesting retrospective!

  • What could be improved:

    • Underestimated the size of the project

      • Found a repo very far on into the project that needed to be converted as well

      • Always have to account for “padding” in cases where there are exceptions

    • When switching from Haystack to Elasticsearch-DSL, it was necessary to understand the difference between these tools and how different they are

    • Better documentation around the ES architecture +4

      • Want to take this on to improve the experience for Open edX overall

    • Maybe could have used more tests - better tests? +3

      • Didn’t have time to add tests that we saw were missing

    • RG wanted to do more manual testing +1 - parking lot discussion

      • Wanted to help with manual testing

      • Should be part of the project

      • How does this get delivered?

      • We did manual testing for course-discovery and will for forums

    • Would like to test and deploy work sooner +3

    • Could we have done code in smaller PRs because of the changes? +1

      • PRs were broken into smaller chunks - each commit

Documentation around ES architecture

  • ADRs(?)

    • Good for the decisions in the project

  • More comprehensive documentation

    • difficult to maintain

  • Is there a tool that could generate documentation of the ES schema?

  • HOWTO guides

    • how does this impact Koa?

    • these are easier to keep up to date

  • Put together a proposal of work for documentation for RG

    • timebox - maybe a few days, maybe more

    • can decide on what combinations of work

Manual Testing - or Testing

  • What concrete actions can we take?

  • Local testing for ES - take a look at how it works in action

  • Could we add more unit tests(?)

    • Found some unit tests were missing

    • Three API views without any tests

  • Manual testing will help us to find things immediately

 

2020-09-01

  • Introductions for the team, welcoming Robert to the group

  • Close to the end

  • Want to have a retrospective on the project

    • Discuss what went well and what didn’t

    • Extend Sept 8 meeting for an hour

  • analytics-pipeline

    • Estimate: only 1-2 weeks to discuss

  • course-discovery

    • Who will be doing manual testing?

      • edX will start doing deployment preparation

      • RG will do some limited manual testing

  • Questions:

    • When is schema created?

      • All READMEs in the PRs should have instructions on creating the new schemas using management commands

2020-08-25

  • BD-19