双六工場日誌

Plainly chronicling an ordinary everyday life.

A List of Hadoop Metrics

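I could not find a comprehensive list of Hadoop metrics, which caused me trouble when I was setting up monitoring for a Hadoop cluster. On the principle that whoever raises the issue should be the one to fix it, I have put together at least a list for now. Note that the ugi and fairscheduler metrics are omitted. rpc.detailed-metrics is also not listed in full, because enumerating the whole set is difficult*1.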

HBase metrics are already summarized here, so please refer to that page. → http://hbase.apache.org/book/hbase_metrics.html
Among these metrics, the rpc metrics are described here. → https://issues.apache.org/jira/browse/HADOOP-6599


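The jvm and rpc metrics are common to every daemon; the rest are specific to each daemon. There are a lot of them and there is still much I do not fully understand, so I plan to dig into the details over time, although some items are self-explanatory from their names. I would like to talk about the monitoring details at a Zabbix-JP study session, but I have not managed to prepare for it yet...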
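As a rough illustration of the common versus daemon-specific split, here is a minimal Python sketch that classifies a metric name by its leading context. The function names and the "<daemon>" placeholder are my own inventions for illustration, and the sketch assumes names are shaped like the ones in the list below (for example, jvm names with the daemon name as the second component).

```python
# Contexts treated as shared across daemons (an assumption based on the
# jvm.* and rpc.* names in the list).
COMMON_CONTEXTS = ("jvm", "rpc")


def is_common_metric(name: str) -> bool:
    """Return True if the metric name belongs to the jvm or rpc context."""
    return name.split(".", 1)[0] in COMMON_CONTEXTS


def strip_daemon_name(name: str) -> str:
    """Replace the daemon part of a jvm metric name with a placeholder.

    jvm names look like jvm.<daemon>.metrics.<item>; other names are
    returned unchanged.
    """
    parts = name.split(".")
    if parts[0] == "jvm" and len(parts) == 4:
        parts[1] = "<daemon>"
    return ".".join(parts)
```

With something like this, jvm.NameNode.metrics.gcCount and jvm.DataNode.metrics.gcCount collapse to the same key, which may make it easier to share one monitoring template across daemons.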

Namenode
dfs.FSDirectory.files_deleted
dfs.FSNamesystem.BlockCapacity
dfs.FSNamesystem.BlocksTotal
dfs.FSNamesystem.CapacityRemainingGB
dfs.FSNamesystem.CapacityTotalGB
dfs.FSNamesystem.CapacityUsedGB
dfs.FSNamesystem.CorruptBlocks
dfs.FSNamesystem.ExcessBlocks
dfs.FSNamesystem.FilesTotal
dfs.FSNamesystem.MissingBlocks
dfs.FSNamesystem.PendingDeletionBlocks
dfs.FSNamesystem.PendingReplicationBlocks
dfs.FSNamesystem.ScheduledReplicationBlocks
dfs.FSNamesystem.TotalLoad
dfs.FSNamesystem.UnderReplicatedBlocks
dfs.namenode.AddBlockOps
dfs.namenode.CreateFileOps
dfs.namenode.DeleteFileOps
dfs.namenode.FileInfoOps
dfs.namenode.FilesAppended
dfs.namenode.FilesCreated
dfs.namenode.FilesInGetListingOps
dfs.namenode.FilesRenamed
dfs.namenode.GetBlockLocations
dfs.namenode.GetListingOps
dfs.namenode.JournalTransactionsBatchedInSync
dfs.namenode.Syncs_avg_time
dfs.namenode.Syncs_num_ops
dfs.namenode.Transactions_avg_time
dfs.namenode.Transactions_num_ops
dfs.namenode.blockReport_avg_time
dfs.namenode.blockReport_num_ops
dfs.namenode.fsImageLoadTime
jvm.NameNode.metrics.gcCount
jvm.NameNode.metrics.gcTimeMillis
jvm.NameNode.metrics.logError
jvm.NameNode.metrics.logFatal
jvm.NameNode.metrics.logInfo
jvm.NameNode.metrics.logWarn
jvm.NameNode.metrics.maxMemoryM
jvm.NameNode.metrics.memHeapCommittedM
jvm.NameNode.metrics.memHeapUsedM
jvm.NameNode.metrics.memNonHeapCommittedM
jvm.NameNode.metrics.memNonHeapUsedM
jvm.NameNode.metrics.threadsBlocked
jvm.NameNode.metrics.threadsNew
jvm.NameNode.metrics.threadsRunnable
jvm.NameNode.metrics.threadsTerminated
jvm.NameNode.metrics.threadsTimedWaiting
jvm.NameNode.metrics.threadsWaiting
rpc.metrics.NumOpenConnections
rpc.metrics.ReceivedBytes
rpc.metrics.RpcProcessingTime_avg_time
rpc.metrics.RpcProcessingTime_num_ops
rpc.metrics.RpcQueueTime_avg_time
rpc.metrics.RpcQueueTime_num_ops
rpc.metrics.SentBytes
rpc.metrics.callQueueLen
rpc.metrics.rpcAuthenticationFailures
rpc.metrics.rpcAuthenticationSuccesses
rpc.metrics.rpcAuthorizationFailures
rpc.metrics.rpcAuthorizationSuccesses
JobTracker
mapred.jobtracker.blacklisted_maps
mapred.jobtracker.blacklisted_reduces
mapred.jobtracker.heartbeats
mapred.jobtracker.jobs_completed
mapred.jobtracker.jobs_failed
mapred.jobtracker.jobs_killed
mapred.jobtracker.jobs_preparing
mapred.jobtracker.jobs_running
mapred.jobtracker.jobs_submitted
mapred.jobtracker.map_slots
mapred.jobtracker.maps_completed
mapred.jobtracker.maps_failed
mapred.jobtracker.maps_killed
mapred.jobtracker.maps_launched
mapred.jobtracker.occupied_map_slots
mapred.jobtracker.occupied_reduce_slots
mapred.jobtracker.reduce_slots
mapred.jobtracker.reduces_completed
mapred.jobtracker.reduces_failed
mapred.jobtracker.reduces_killed
mapred.jobtracker.reduces_launched
mapred.jobtracker.reserved_map_slots
mapred.jobtracker.reserved_reduce_slots
mapred.jobtracker.running_maps
mapred.jobtracker.running_reduces
mapred.jobtracker.trackers
mapred.jobtracker.trackers_blacklisted
mapred.jobtracker.trackers_decommissioned
mapred.jobtracker.waiting_maps
mapred.jobtracker.waiting_reduces
jvm.JobTracker.metrics.gcCount
jvm.JobTracker.metrics.gcTimeMillis
jvm.JobTracker.metrics.logError
jvm.JobTracker.metrics.logFatal
jvm.JobTracker.metrics.logInfo
jvm.JobTracker.metrics.logWarn
jvm.JobTracker.metrics.maxMemoryM
jvm.JobTracker.metrics.memHeapCommittedM
jvm.JobTracker.metrics.memHeapUsedM
jvm.JobTracker.metrics.memNonHeapCommittedM
jvm.JobTracker.metrics.memNonHeapUsedM
jvm.JobTracker.metrics.threadsBlocked
jvm.JobTracker.metrics.threadsNew
jvm.JobTracker.metrics.threadsRunnable
jvm.JobTracker.metrics.threadsTerminated
jvm.JobTracker.metrics.threadsTimedWaiting
jvm.JobTracker.metrics.threadsWaiting
rpc.metrics.NumOpenConnections
rpc.metrics.ReceivedBytes
rpc.metrics.RpcProcessingTime_avg_time
rpc.metrics.RpcProcessingTime_num_ops
rpc.metrics.RpcQueueTime_avg_time
rpc.metrics.RpcQueueTime_num_ops
rpc.metrics.SentBytes
rpc.metrics.callQueueLen
rpc.metrics.rpcAuthenticationFailures
rpc.metrics.rpcAuthenticationSuccesses
rpc.metrics.rpcAuthorizationFailures
rpc.metrics.rpcAuthorizationSuccesses
Secondary Namenode
dfs.FSDirectory.files_deleted
jvm.SecondaryNameNode.metrics.gcCount
jvm.SecondaryNameNode.metrics.gcTimeMillis
jvm.SecondaryNameNode.metrics.logError
jvm.SecondaryNameNode.metrics.logFatal
jvm.SecondaryNameNode.metrics.logInfo
jvm.SecondaryNameNode.metrics.logWarn
jvm.SecondaryNameNode.metrics.maxMemoryM
jvm.SecondaryNameNode.metrics.memHeapCommittedM
jvm.SecondaryNameNode.metrics.memHeapUsedM
jvm.SecondaryNameNode.metrics.memNonHeapCommittedM
jvm.SecondaryNameNode.metrics.memNonHeapUsedM
jvm.SecondaryNameNode.metrics.threadsBlocked
jvm.SecondaryNameNode.metrics.threadsNew
jvm.SecondaryNameNode.metrics.threadsRunnable
jvm.SecondaryNameNode.metrics.threadsTerminated
jvm.SecondaryNameNode.metrics.threadsTimedWaiting
jvm.SecondaryNameNode.metrics.threadsWaiting
Datanode
dfs.datanode.blockChecksumOp_avg_time
dfs.datanode.blockChecksumOp_num_ops
dfs.datanode.blockReports_avg_time
dfs.datanode.blockReports_num_ops
dfs.datanode.block_verification_failures
dfs.datanode.blocks_read
dfs.datanode.blocks_removed
dfs.datanode.blocks_replicated
dfs.datanode.blocks_verified
dfs.datanode.blocks_written
dfs.datanode.bytes_read
dfs.datanode.bytes_written
dfs.datanode.copyBlockOp_avg_time
dfs.datanode.copyBlockOp_num_ops
dfs.datanode.heartBeats_avg_time
dfs.datanode.heartBeats_num_ops
dfs.datanode.readBlockOp_avg_time
dfs.datanode.readBlockOp_num_ops
dfs.datanode.reads_from_local_client
dfs.datanode.reads_from_remote_client
dfs.datanode.replaceBlockOp_avg_time
dfs.datanode.replaceBlockOp_num_ops
dfs.datanode.volumeFailures
dfs.datanode.writeBlockOp_avg_time
dfs.datanode.writeBlockOp_num_ops
dfs.datanode.writes_from_local_client
dfs.datanode.writes_from_remote_client
jvm.DataNode.metrics.gcCount
jvm.DataNode.metrics.gcTimeMillis
jvm.DataNode.metrics.logError
jvm.DataNode.metrics.logFatal
jvm.DataNode.metrics.logInfo
jvm.DataNode.metrics.logWarn
jvm.DataNode.metrics.maxMemoryM
jvm.DataNode.metrics.memHeapCommittedM
jvm.DataNode.metrics.memHeapUsedM
jvm.DataNode.metrics.memNonHeapCommittedM
jvm.DataNode.metrics.memNonHeapUsedM
jvm.DataNode.metrics.threadsBlocked
jvm.DataNode.metrics.threadsNew
jvm.DataNode.metrics.threadsRunnable
jvm.DataNode.metrics.threadsTerminated
jvm.DataNode.metrics.threadsTimedWaiting
jvm.DataNode.metrics.threadsWaiting
rpc.metrics.NumOpenConnections
rpc.metrics.ReceivedBytes
rpc.metrics.RpcProcessingTime_avg_time
rpc.metrics.RpcProcessingTime_num_ops
rpc.metrics.RpcQueueTime_avg_time
rpc.metrics.RpcQueueTime_num_ops
rpc.metrics.SentBytes
rpc.metrics.callQueueLen
rpc.metrics.rpcAuthenticationFailures
rpc.metrics.rpcAuthenticationSuccesses
rpc.metrics.rpcAuthorizationFailures
rpc.metrics.rpcAuthorizationSuccesses
TaskTracker
jvm.TaskTracker.metrics.gcCount
jvm.TaskTracker.metrics.gcTimeMillis
jvm.TaskTracker.metrics.logError
jvm.TaskTracker.metrics.logFatal
jvm.TaskTracker.metrics.logInfo
jvm.TaskTracker.metrics.logWarn
jvm.TaskTracker.metrics.maxMemoryM
jvm.TaskTracker.metrics.memHeapCommittedM
jvm.TaskTracker.metrics.memHeapUsedM
jvm.TaskTracker.metrics.memNonHeapCommittedM
jvm.TaskTracker.metrics.memNonHeapUsedM
jvm.TaskTracker.metrics.threadsBlocked
jvm.TaskTracker.metrics.threadsNew
jvm.TaskTracker.metrics.threadsRunnable
jvm.TaskTracker.metrics.threadsTerminated
jvm.TaskTracker.metrics.threadsTimedWaiting
jvm.TaskTracker.metrics.threadsWaiting
mapred.shuffleOutput.shuffle_failed_outputs
mapred.shuffleOutput.shuffle_handler_busy_percent
mapred.shuffleOutput.shuffle_output_bytes
mapred.shuffleOutput.shuffle_success_outputs
mapred.tasktracker.mapTaskSlots
mapred.tasktracker.maps_running
mapred.tasktracker.reduceTaskSlots
mapred.tasktracker.reduces_running
mapred.tasktracker.tasks_completed
mapred.tasktracker.tasks_failed_ping
mapred.tasktracker.tasks_failed_timeout
rpc.detailed-metrics.canCommit_avg_time
rpc.detailed-metrics.canCommit_num_ops
rpc.detailed-metrics.commitPending_avg_time
rpc.detailed-metrics.commitPending_num_ops
rpc.detailed-metrics.done_avg_time
rpc.detailed-metrics.done_num_ops
rpc.detailed-metrics.getMapCompletionEvents_avg_time
rpc.detailed-metrics.getMapCompletionEvents_num_ops
rpc.detailed-metrics.getProtocolVersion_avg_time
rpc.detailed-metrics.getProtocolVersion_num_ops
rpc.detailed-metrics.getTask_avg_time
rpc.detailed-metrics.getTask_num_ops
rpc.detailed-metrics.ping_avg_time
rpc.detailed-metrics.ping_num_ops
rpc.detailed-metrics.statusUpdate_avg_time
rpc.detailed-metrics.statusUpdate_num_ops
rpc.metrics.NumOpenConnections
rpc.metrics.ReceivedBytes
rpc.metrics.RpcProcessingTime_avg_time
rpc.metrics.RpcProcessingTime_num_ops
rpc.metrics.RpcQueueTime_avg_time
rpc.metrics.RpcQueueTime_num_ops
rpc.metrics.SentBytes
rpc.metrics.callQueueLen
rpc.metrics.rpcAuthenticationFailures
rpc.metrics.rpcAuthenticationSuccesses
rpc.metrics.rpcAuthorizationFailures
rpc.metrics.rpcAuthorizationSuccesses

Simply lined up like this, there are quite a lot of them...
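Many names in the list above end in _avg_time or _num_ops and appear to come in pairs. One way to shrink the list a little is to collapse each pair into a single operation name. The sketch below does only that string grouping; reading the pair as "average time" and "number of operations" is my assumption from the names, and paired_operations is a hypothetical helper, not part of Hadoop.

```python
from collections import defaultdict

# Suffixes that seem to appear together for timed operations.
SUFFIXES = ("_avg_time", "_num_ops")


def paired_operations(names):
    """Return the sorted base names that appear with both suffixes."""
    seen = defaultdict(set)
    for name in names:
        for suffix in SUFFIXES:
            if name.endswith(suffix):
                # Record which suffix was seen for this base name.
                seen[name[: -len(suffix)]].add(suffix)
    return sorted(base for base, found in seen.items()
                  if len(found) == len(SUFFIXES))


sample = [
    "dfs.namenode.Syncs_avg_time",
    "dfs.namenode.Syncs_num_ops",
    "dfs.namenode.fsImageLoadTime",
    "rpc.metrics.RpcQueueTime_avg_time",
    "rpc.metrics.RpcQueueTime_num_ops",
]
print(paired_operations(sample))
# → ['dfs.namenode.Syncs', 'rpc.metrics.RpcQueueTime']
```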

*1: Entries in rpc.detailed-metrics are added on the fly when the corresponding method is executed, which makes them hard to enumerate exhaustively...
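Because these entries only show up once a method has been called, one practical approach is to accumulate the names seen across repeated polls instead of relying on a fixed list. The following is a minimal sketch of that idea; MetricNameTracker is a hypothetical class, and how the names are actually fetched from a daemon is deliberately left out.

```python
class MetricNameTracker:
    """Accumulate metric names over time and report newly seen ones."""

    def __init__(self):
        self.known = set()

    def update(self, names):
        """Record the given names and return the ones not seen before."""
        new = sorted(set(names) - self.known)
        self.known.update(new)
        return new


tracker = MetricNameTracker()
first = tracker.update([
    "rpc.detailed-metrics.ping_avg_time",
    "rpc.detailed-metrics.ping_num_ops",
])
# A later poll may contain entries for methods called in the meantime.
second = tracker.update([
    "rpc.detailed-metrics.ping_avg_time",
    "rpc.detailed-metrics.getTask_avg_time",
])
print(second)
# → ['rpc.detailed-metrics.getTask_avg_time']
```

The names returned by update could then be registered with the monitoring system as they appear.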