You need to be on the edX VPN to access Splunk at https://splunk.edx.org, and you will also need to file a ticket to have a Splunk account created.
What is Splunk?
Splunk is a query interface for retrieving specific log messages from our web servers. Whenever you see "logger.info", "logger.warning", or "logger.exception" in code, those log messages get written to the log files on the web server instance.
These logs are then indexed by Splunk for us to quickly search.
Splunk orders all log messages by time, so you can use it to view recent trends and to see how often a message occurs when estimating impact.
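To make the logging calls above concrete, here is a minimal sketch of how "logger.info", "logger.warning", and "logger.exception" produce log lines. The logger name, the log format, and the process_order function are hypothetical, chosen only for illustration. The sketch writes to an in-memory buffer rather than a file so it can run anywhere; a deployed service would normally be configured to write to a log file instead.

```python
import io
import logging

# Hypothetical setup: an in-memory stream stands in for the log file.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s [%(name)s] - %(message)s")
)

logger = logging.getLogger("example.checkout")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the output confined to our handler


def process_order(order_id, amount):
    """Illustrative function that logs at three different levels."""
    logger.info("Processing order %s", order_id)
    if amount <= 0:
        logger.warning("Order %s has non-positive amount %s", order_id, amount)
    try:
        return 100 / amount
    except ZeroDivisionError:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("Failed to process order %s", order_id)
        return None


process_order("EDX-0001", 0)
log_text = stream.getvalue()
print(log_text)
```

Each call becomes one log entry, and the "logger.exception" entry carries the stack trace with it, which is what makes those entries useful to look up later.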
Search
The search box at the top of Splunk is where you will probably spend most of your time. It is where you enter the queries Splunk uses to find information in our logs.
We will explain the Splunk query language using a simple edX example:
index=prod-edx service_variant=ecommerce EDX-23025325
- The first clause, "index=prod-edx", specifies which environment we are querying. The index keyword points to an environment. Common values are:
- prod-edx
- stage-edx
- The second clause, "service_variant=ecommerce", specifies which IDA/service we are querying. Common values are:
- lms
- cms
- ecommerce
- credentials
- discovery
- The second clause can also be "source=/edx/var/log/lms/edx.log", which specifies the log file we are querying. Some common log file values are:
- /edx/var/log/lms/edx.log
- /edx/var/log/cms/edx.log
- /edx/var/log/ecomworker/edx.log
- /edx/var/log/ecommerce/edx.log
- /edx/var/log/credentials/edx.log
- /edx/var/log/discovery/edx.log
- The third clause is the keyword you want to search for within the logs (EDX-23025325 in the example above).
Once you have created a query like the one above, select the time range you want to limit the query to. The default is "All Time", which takes a long time to finish, so I recommend limiting the time range to something reasonable.
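The three clauses described above can be assembled mechanically. The sketch below is a hypothetical helper (build_query is not part of any edX or Splunk library) that joins the clauses into a query string. As an assumption beyond this page, it also accepts Splunk's inline "earliest" time modifier as an alternative to choosing the range in the time picker.

```python
def build_query(index, keyword, service_variant=None, source=None, earliest=None):
    """Join Splunk search clauses into one query string.

    Hypothetical helper for illustration only.
    """
    clauses = [f"index={index}"]
    if service_variant:
        clauses.append(f"service_variant={service_variant}")
    if source:
        clauses.append(f"source={source}")
    if earliest:
        # Inline time modifier, e.g. "-24h" for the last 24 hours.
        clauses.append(f"earliest={earliest}")
    clauses.append(keyword)
    return " ".join(clauses)


# Reproduces the example query from this page.
example = build_query("prod-edx", "EDX-23025325", service_variant="ecommerce")

# Same search, limited to the last 24 hours.
recent = build_query(
    "prod-edx", "EDX-23025325", service_variant="ecommerce", earliest="-24h"
)
print(example)
print(recent)
```

The resulting string can be pasted into the search box as-is.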
Drill Down
Once search results are returned, you can inspect a result by clicking the " > " arrow under the " i " column.
If you see the "Event Action" button, you can click it and select "Show Source". That opens a new tab showing the raw log messages in the log file.
The most useful thing about viewing the raw log message is that you can expand it and see the stack trace for error conditions.
Advanced Search
Search is quite powerful. Much like SQL, queries can do aggregations, sums/totals, and comparisons.
See the example query below:
index=prod-edx source=/edx/var/log/ecommerce/edx.log* | rex field=_raw "\] - (?<evt>[a-z_]+): " | dedup evt basket_id | search (evt=order_placed AND contains_coupon=False) OR evt=payment_received | timechart span=10m sum(amount) by evt | where payment_received > 0 and order_placed != payment_received
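Reading the query above left to right: "rex" extracts an event name into a field called evt, "dedup" drops repeated evt/basket_id combinations, "search" keeps coupon-free order_placed events and payment_received events, "timechart" sums amount per event type in 10-minute buckets, and "where" keeps only the buckets in which payments were received but the two totals differ. As a rough illustration of the "rex" step only, the sketch below applies the same regular expression in Python. The sample log line is made up, so the real ecommerce log format may differ. Note that Python writes named groups as (?P<evt>...) where Splunk uses (?<evt>...).

```python
import re

# Same pattern as the rex clause, using Python's named-group syntax.
EVT_PATTERN = re.compile(r"\] - (?P<evt>[a-z_]+): ")


def extract_event(raw_line):
    """Return the event name found in a raw log line, or None if absent."""
    match = EVT_PATTERN.search(raw_line)
    return match.group("evt") if match else None


# Hypothetical log line, invented for illustration.
sample_line = (
    '2016-01-01 12:00:00,000 INFO 1234 [ecommerce.example] - '
    'order_placed: amount="49.00", basket_id="42", contains_coupon="False"'
)

event_name = extract_event(sample_line)
print(event_name)
```

Lines that do not contain the "] - name: " marker yield no evt value, so they drop out at the later "search" step, which filters on evt.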
Alerts
Within Splunk, you can set up alerts: given a pre-defined Splunk query, if the query returns any results, Splunk can automatically email a defined list of addresses.
You can see all the existing alerts we have defined on the "Alert" tab under the Splunk header.
You can click "edit" to see which query each alert uses.
You can also create alerts yourself. If you do not have permission in Splunk to create alerts, please create an ITSupport ticket to request it.