Wednesday, August 24, 2016

WSO2 Data Analytics Server - Monitoring Spark Cluster

Apache Spark provides a set of web user interfaces (UIs) that allow you to monitor and troubleshoot issues in a Spark cluster. The following are the default ports of the main Spark UIs in WSO2 DAS. These UIs are only available when WSO2 DAS is deployed as a Spark cluster.
  • Master UI - 8081
  • Application UI - 4040
  • Worker UI - 11500
These ports can be configured in the <DAS_HOME>/repository/conf/analytics/spark/spark-defaults.conf file.
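For example, the UI port properties look like this (property names as they appear in a typical DAS 3.x spark-defaults.conf; verify them against your own file):

    spark.master.webui.port    8081
    spark.ui.port              4040
    spark.worker.webui.port    11500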

Master UI :
Access the Spark UIs of the active master and the standby master using <node ip>:8081 on each node. It shows information such as the master status, the number of workers, and the running and completed applications.
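The same information can be read programmatically: Spark's standalone master UI also serves its state as JSON at the /json path. A minimal sketch in Python; the node IP is a placeholder, and the field names assume the JSON layout of Spark's standalone master page:

    import json
    import urllib.request

    # Hypothetical active-master address; replace with your node IP.
    MASTER_URL = "http://10.0.0.1:8081/json"

    with urllib.request.urlopen(MASTER_URL, timeout=10) as resp:
        state = json.load(resp)

    print("Master status:", state.get("status"))  # ALIVE on the active master, STANDBY on the other
    alive = [w for w in state.get("workers", []) if w.get("state") == "ALIVE"]
    print("Alive workers:", len(alive))
    for app in state.get("activeapps", []):
        print("Running application:", app.get("name"), "-", app.get("state"))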

Application UI :
When you access a running application from the active master's UI, it redirects you to the Spark application UI, which has the following tabs. (A scripted check using the application's monitoring REST API is sketched after the list.)
  • Jobs - Lists the jobs that have been scheduled or are running. The jobs table displays job, stage, and task progress.
  • Stages - Shows the details of active, completed, and failed stages.
  • Storage - Shows the RDDs that the application has persisted.
  • Environment - Check this tab to verify that all the configuration parameters are set properly.
  • Executors - Shows processing and storage metrics for each executor. You can also look at a thread's call stack by clicking the thread dump link.
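For scripted monitoring, the application UI also exposes a REST API under /api/v1 (available from Spark 1.4 onwards, so it should be present in the Spark version bundled with DAS 3.x). A minimal sketch in Python; the driver address is a placeholder:

    import json
    import urllib.request

    # Hypothetical address of the node running the Spark application UI.
    APP_UI = "http://10.0.0.1:4040"

    def fetch(path):
        with urllib.request.urlopen(APP_UI + path, timeout=10) as resp:
            return json.load(resp)

    # List the applications, then print the status of each of their jobs.
    for app in fetch("/api/v1/applications"):
        for job in fetch("/api/v1/applications/%s/jobs" % app["id"]):
            print(app["name"], "job", job["jobId"], "->", job["status"])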
Worker UI :
Check the Spark UI of each worker using <node ip>:11500 to verify that it has running executors. If a worker has no running executors, or if it is continuously creating executors, this indicates a problem in the Spark cluster configuration.
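Each worker UI serves its state as JSON at /json in the same way, so this check can be scripted as well. A minimal sketch with placeholder worker addresses; the executors field assumes the JSON layout of Spark's standalone worker page:

    import json
    import urllib.request

    # Hypothetical worker addresses; replace with your node IPs.
    WORKER_URLS = ["http://10.0.0.1:11500/json", "http://10.0.0.2:11500/json"]

    for url in WORKER_URLS:
        with urllib.request.urlopen(url, timeout=10) as resp:
            state = json.load(resp)
        executors = state.get("executors", [])
        print(url, "->", len(executors), "running executor(s)")
        if not executors:
            print("  Warning: no executors; check the Spark cluster configuration.")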


