Symantec 7 Computer Drive User Manual


 
The PureDisk plug-in reads the backup image and separates the image into
files.
The PureDisk plug-in separates files into segments and calculates the
fingerprint for each file and segment.
The plug-in compares each fingerprint against the local fingerprint cache. If
the fingerprint is not known in the cache, the plug-in requests that the engine
verify whether the fingerprint already exists.
If the fingerprint does not exist, the segment is sent to the engine. If the
fingerprint exists, the segment is not sent.
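
The steps above can be illustrated with a minimal Python sketch. It is not the PureDisk plug-in's actual code: the Engine class, the backup_file function, the fixed SEGMENT_SIZE, and the use of a plain set as the local fingerprint cache are all assumptions made for illustration. Only the overall flow (fingerprint each segment, check the local cache, ask the engine, send only unknown segments) comes from the text.

```python
import hashlib

SEGMENT_SIZE = 4  # hypothetical tiny segment size so the example is easy to follow


class Engine:
    """Hypothetical stand-in for the deduplication engine's segment store."""

    def __init__(self):
        self.segments = {}  # fingerprint -> segment data

    def has_fingerprint(self, fp):
        return fp in self.segments

    def store(self, fp, segment):
        self.segments[fp] = segment


def fingerprint(data):
    # MD5-based, as the text states; collision handling is not modeled here.
    return hashlib.md5(data).hexdigest()


def backup_file(data, cache, engine):
    """Send only segments whose fingerprints are unknown; return the count sent."""
    sent = 0
    for start in range(0, len(data), SEGMENT_SIZE):
        segment = data[start:start + SEGMENT_SIZE]
        fp = fingerprint(segment)
        if fp in cache:
            continue  # known in the local cache: no need to ask the engine
        if not engine.has_fingerprint(fp):
            engine.store(fp, segment)  # unknown to the engine: send the segment
            sent += 1
        cache.add(fp)  # remember the fingerprint for later segments
    return sent
```

For example, backing up b"aaaabbbbaaaa" with an empty cache and an empty engine sends two segments, because the third segment repeats the first.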
The fingerprint calculations are based on the MD5 algorithm. However, any
segments that have different content but the same MD5 hash key get different
fingerprints. As a result, NetBackup prevents MD5 collisions.
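
The text does not say how different fingerprints are assigned to colliding segments. The following Python sketch shows one possible scheme, purely as an assumption: segments are grouped by hash key, and each distinct content under the same key receives its own numeric suffix. The FingerprintStore class, the hash_fn parameter, and the "key-index" fingerprint format are hypothetical and are not NetBackup's implementation.

```python
import hashlib


def md5_hex(data):
    return hashlib.md5(data).hexdigest()


class FingerprintStore:
    """Hypothetical scheme: same hash key, different content -> different fingerprint."""

    def __init__(self, hash_fn=md5_hex):
        # hash_fn is replaceable so that a collision can be simulated in a test.
        self.hash_fn = hash_fn
        self.by_hash = {}  # hash key -> list of distinct segment contents

    def fingerprint(self, segment):
        key = self.hash_fn(segment)
        bucket = self.by_hash.setdefault(key, [])
        if segment not in bucket:
            bucket.append(segment)  # new content under an existing or new key
        # The suffix distinguishes segments that share a hash key.
        return f"{key}-{bucket.index(segment)}"
```

With a deliberately weak hash_fn that maps every segment to the same key, two different segments still receive different fingerprints, and the same segment always receives the same one.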
Data removal process
The following list describes the data removal process for expired backup images:
NetBackup removes the image record from the NetBackup catalog.
NetBackup directs the NetBackup Deduplication Manager to remove the image.
The deduplication manager immediately removes the image entry and adds a
removal request for the image to the database transaction queue.
From this point on, the image is no longer accessible.
When the queue is next processed, the NetBackup Deduplication Engine
executes the removal request. The engine also generates removal requests for
underlying data segments.
At the next queue processing run after that, the NetBackup Deduplication
Engine executes the removal requests for the segments.
Storage is reclaimed after two queue processing runs; that is, in one day. However,
data segments of the removed image may still be in use by other images.
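
The two-run behavior described above can be modeled with a small Python sketch. It is a simplified illustration, not the NetBackup Deduplication Engine's code: the DedupStore class, its method names, and the use of reference counts to decide whether a segment is still in use by other images are assumptions.

```python
from collections import deque


class DedupStore:
    """Hypothetical model of image entries, segments, and the transaction queue."""

    def __init__(self):
        self.images = {}      # accessible image entries: image id -> fingerprints
        self.pending = {}     # removed images that await queue processing
        self.refcount = {}    # fingerprint -> number of images using the segment
        self.queue = deque()  # transaction queue of removal requests

    def add_image(self, image_id, fingerprints):
        self.images[image_id] = list(fingerprints)
        for fp in fingerprints:
            self.refcount[fp] = self.refcount.get(fp, 0) + 1

    def remove_image(self, image_id):
        # The image entry is removed immediately; the real work is queued.
        self.pending[image_id] = self.images.pop(image_id)
        self.queue.append(("image", image_id))

    def process_queue(self):
        # Only requests present at the start of the run are executed; requests
        # generated during this run wait for the next run.
        for _ in range(len(self.queue)):
            kind, key = self.queue.popleft()
            if kind == "image":
                for fp in self.pending.pop(key):
                    self.queue.append(("segment", fp))
            else:
                self.refcount[key] -= 1
                if self.refcount[key] == 0:
                    del self.refcount[key]  # no image uses it: storage reclaimed
```

In this model, an image is inaccessible as soon as remove_image returns, its segments are released only on the second call to process_queue, and a segment shared with another image survives because its count stays above zero.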
If you manually delete an image that has expired within the previous 24 hours,
the data becomes garbage. It remains on disk until removed by the next garbage
collection process.
See About maintenance processing on page 90.