TEZ-4441: TezAppMaster may stuck because of reportError skip send err…#236
abstractdog merged 3 commits into apache:master
|
🎊 +1 overall
This message was automatically generated. |
Utils.getTaskSchedulerIdentifierString(taskSchedulerIndex, appContext) + ": " +
diagnostics);
if (taskSchedulerDescriptors[taskSchedulerIndex].getClassName().equals(yarnSchedulerClassName)) {
if (taskSchedulerDescriptors[taskSchedulerIndex].getEntityName()
I guess this is the actual fix.
Just a note: what are the values of the following in your case?
taskSchedulerDescriptors[taskSchedulerIndex].getClassName()
yarnSchedulerClassName
taskSchedulerDescriptors[taskSchedulerIndex].getEntityName()
Yes, this is the only actual fix.
- Before this PR
| method | return value |
|---|---|
| taskSchedulerDescriptors[taskSchedulerIndex].getClassName() | null |
| yarnSchedulerClassName | "org.apache.tez.dag.app.rm.YarnTaskSchedulerService" |
taskSchedulerDescriptors[taskSchedulerIndex].getClassName() comes from the 'taskSchedulerDescriptors' variable in DAGAppMaster::serviceInit. In DAGAppMaster::parsePlugin, the NamedEntityDescriptor constructed for the Tez YARN plugin always has a null className.
yarnSchedulerClassName is read from tez.am.yarn.scheduler.class, whose default value is "org.apache.tez.dag.app.rm.YarnTaskSchedulerService".
So for the Tez YARN plugin, taskSchedulerDescriptors[taskSchedulerIndex].getClassName() will never equal yarnSchedulerClassName.
- After this PR
taskSchedulerDescriptors[taskSchedulerIndex].getEntityName() will return "TezYarn"
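To make the before/after difference concrete, here is a minimal, self-contained sketch. It is not the Tez code: `Descriptor` is a hypothetical stand-in for NamedEntityDescriptor, and the constant and method names are assumptions made for illustration. Only the two string values come from the discussion above. The sketch uses `Objects.equals` so the null className can be compared safely; in plain Java, calling `.equals` on a null reference would throw a NullPointerException instead.

```java
import java.util.Objects;

public class SchedulerCheckSketch {

    // Hypothetical stand-in for a descriptor holding an entity name and a class name.
    static final class Descriptor {
        private final String entityName;
        private final String className;

        Descriptor(String entityName, String className) {
            this.entityName = entityName;
            this.className = className;
        }

        String getEntityName() { return entityName; }
        String getClassName() { return className; }
    }

    // Values quoted in the review thread; the constant names are assumptions.
    static final String YARN_SCHEDULER_CLASS =
        "org.apache.tez.dag.app.rm.YarnTaskSchedulerService";
    static final String TEZ_YARN_ENTITY_NAME = "TezYarn";

    // "Before" style check: compares the descriptor's class name, which is
    // null for the built-in YARN plugin, so it can never match.
    static boolean matchesByClassName(Descriptor d) {
        return Objects.equals(d.getClassName(), YARN_SCHEDULER_CLASS);
    }

    // "After" style check: compares the entity name, which is always filled in.
    static boolean matchesByEntityName(Descriptor d) {
        return TEZ_YARN_ENTITY_NAME.equals(d.getEntityName());
    }

    public static void main(String[] args) {
        // The built-in YARN plugin descriptor as described above: name set, class name null.
        Descriptor yarn = new Descriptor("TezYarn", null);
        System.out.println("match by class name:  " + matchesByClassName(yarn));
        System.out.println("match by entity name: " + matchesByEntityName(yarn));
    }
}
```

With a descriptor shaped like the one described in this thread, the class-name check is false and the entity-name check is true, which is why the condition was switched to the entity name.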
I see, nice catch:
we simply don't fill the classname, so we should not rely on it, only use it in case of createCustomTaskScheduler
tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestTaskSchedulerManager.java
…or event (#236) (zhengchenyu reviewed by Laszlo Bodor)
…or event (apache#236) (zhengchenyu reviewed by Laszlo Bodor) (cherry picked from commit 55b6031)
https://issues.apache.org/jira/browse/TEZ-4441