Application Administrator’s Guide
Chapter 1
Introduction to Data Collection
This chapter covers the following topics:
• Data Collection Overview
• Architecture Overview
Data Collection Overview
The Data Collector is a centralized and remotely managed data collection mechanism.
This Java application is responsible for interfacing with backup servers and storage
arrays, gathering information related to storage backup and recovery, and capacity
management.
The Data Collector continuously collects data and sends it, over an HTTP or HTTPS
connection, to another Java application, the Data Receiver. The Data Receiver runs on
the Portal Server and stores the data that it receives in the Reporting Database. When
you use the Portal to generate a report, the Portal requests this information from the
Reporting Database, then returns the results in one of the many available reports.
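The hand-off from the Data Collector to the Data Receiver can be pictured with a minimal sketch. It only illustrates the idea of posting a batch of collected records over HTTPS. The endpoint URL, the JSON payload, and the record fields are assumptions made for this example; the guide does not describe the actual wire format.

```python
import json
import urllib.request


def build_upload_request(receiver_url, records):
    """Build (but do not send) an HTTP POST carrying a batch of collected records."""
    # Assumption: the batch is serialized as JSON. The real format is not documented here.
    body = json.dumps({"records": records}).encode("utf-8")
    return urllib.request.Request(
        receiver_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical receiver endpoint and record fields, for illustration only.
request = build_upload_request(
    "https://portal.example.com/receiver/upload",
    [{"server": "backup01", "job_id": 1234, "status": "completed"}],
)

# Passing the request to urllib.request.urlopen() would transmit the batch.
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the sketch runnable without a live Portal Server.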
The Data Collector obtains all of its monitoring rules from a Data Collector
Configuration File. This file resides in the Reporting Database in XML format. When
the Data Collector first starts, it downloads this file from the Reporting Database. The
Data Collector uses this file to determine the list of backup servers, hosts, or storage
arrays that are to be monitored and included in its data collection process. For details on
how host names are processed, see “Host Name Processing - Filters and Aliases” in
the Application Administrator’s Guide.
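As a rough illustration of how an XML configuration file can drive the list of monitored systems, the sketch below parses a small sample document and groups the targets by type. The element and attribute names (collectorConfig, target, type, host) are invented for this example and do not reflect the real schema of the Data Collector Configuration File.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration content; the real file's schema is not shown in this guide.
SAMPLE_CONFIG = """\
<collectorConfig>
  <target type="backup-server" host="backup01"/>
  <target type="storage-array" host="array-east"/>
  <target type="host" host="app-db-07"/>
</collectorConfig>
"""


def monitored_targets(config_xml):
    """Return a dict mapping each target type to the list of host names to monitor."""
    root = ET.fromstring(config_xml)
    targets = {}
    for element in root.iter("target"):
        targets.setdefault(element.get("type"), []).append(element.get("host"))
    return targets


targets = monitored_targets(SAMPLE_CONFIG)
print(targets)
```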
Data Collector Terminology
Data Collector - This software component interfaces with each of the supported
backup and recovery software systems to extract metadata about the underlying
backup and recovery environment. For example, data can include backup job details
and tape inventory information. In the case of Capacity Manager, the Data Collector
communicates with the storage arrays in your SAN (Storage Area Network).
Data Collection by Backup Product
The Data Collector uses the following collection mechanisms for each backup product:
• EMC Legato NetWorker - The Data Collector uses the Legato administration command-
line utilities, such as mminfo, nsradmin, and nsrinfo.
• IBM Tivoli Storage Manager - The Data Collector interfaces with TSM using the TSM
utility, dsmadmc, collecting data from the underlying TSM databases, including TSM
Archives for LAN-free backups.
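Both mechanisms above follow the same general pattern: run a vendor command-line utility and parse what it prints. The sketch below shows that pattern in generic form. It is not the Data Collector's implementation. The helper name run_and_parse, the comma-delimited output, and the column names are assumptions, and a stand-in command replaces the vendor utility so that the example runs anywhere. The arguments actually passed to mminfo or dsmadmc are not given in this guide and are not shown here.

```python
import csv
import io
import subprocess
import sys


def run_and_parse(argv, columns):
    """Run a command-line utility and parse its comma-delimited output into dicts."""
    # check=True raises CalledProcessError if the utility exits with a non-zero status.
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    reader = csv.reader(io.StringIO(completed.stdout))
    # Skip blank lines; pair each value with its (assumed) column name.
    return [dict(zip(columns, row)) for row in reader if row]


# Stand-in for a vendor utility: prints two made-up, comma-delimited rows.
stand_in = [
    sys.executable,
    "-c",
    "print('client01,full,1024'); print('client02,incr,256')",
]

rows = run_and_parse(stand_in, ["client", "level", "size_kb"])
print(rows)
```

Passing the command as an argument list rather than a shell string avoids quoting problems when host names or queries contain spaces.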