<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
<channel>
<title>The Backup Blog</title>
<link>http://thebackupblog.typepad.com/thebackupblog/</link>
<description>A blog about backup, recovery, and archiving. But mostly backup and recovery. Written by Scott Waterhouse.</description>
<language>en-US</language>
<lastBuildDate>Tue, 04 Oct 2011 12:12:40 -0600</lastBuildDate>
<generator>http://www.typepad.com/</generator>

<docs>http://www.rssboard.org/rss-specification</docs>
<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/typepad/AEBl" /><feedburner:info uri="typepad/aebl" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
<title>Yet More About TSM Integration With Data Domain</title>
<link>http://feedproxy.google.com/~r/typepad/AEBl/~3/nHJV2fDuFPA/yet-more-about-tsm-integration-with-data-domain.html</link>
<guid isPermaLink="false">http://thebackupblog.typepad.com/thebackupblog/2011/10/yet-more-about-tsm-integration-with-data-domain.html</guid>
<description>I thought I was done with TSM. And just when you think you are out, they pull you back in. Only in this case I am being pulled back in by some feedback from a colleague on my previous column....</description>
<content:encoded>&lt;p&gt;I thought I was done with TSM. And just when you think you are out, they pull you back in.&lt;/p&gt;
&lt;p&gt;Only in this case I am being pulled back in by some feedback from a colleague on my previous column. I didn&amp;#39;t want to relegate his input to a comment where it wouldn&amp;#39;t get adequate notice, so I am reproducing it here in full and unedited. Full credit for this goes to Mario Correia. Mario has some impeccable credentials as a TSM administrator and consultant, and worked with one of the most respected TSM services groups in the industry. Mario had some great feedback on how often to run TSM reclamation on a Data Domain device.&lt;/p&gt;
&lt;p&gt;&amp;quot;You really hit all the points with TSM&amp;#0160; and DD but I disagree on the comment about more aggressive reclamation at the 20%-30% thresholds.&lt;br /&gt;&amp;#0160;&lt;br /&gt;Typically, 50% is the magic number where you hit the point of diminishing returns. If a tape is 50% utilized, I can take two volumes and consolidate onto one. I still have a net gain of 1 volume. But, once I go beyond that point, I’m not really saving anything from a volume perspective plus I’m also working much harder to get that minimal savings. When you throw in the actual space savings post-dedupe, it’s even less.
&lt;/p&gt;
&lt;br /&gt;If we do the math using 100GB volume size, at 20% reclaim, I’d have to read data from 5 volumes or 400GB of data to reclaim that 100GB&amp;#0160; that was ‘wasted’. But if I get 10:1, that amounts to having TSM read 400GB of data to save 10GB of physical space on the DD. TSM has to read the full amount of “active” pre-comp data on that volume so it works just as hard as if it were reading from real tape.&lt;br /&gt;&amp;#0160;&lt;br /&gt;The other potential gotcha, potential is the key word,&amp;#0160; is that TSM stores individual files in bundles called aggregates. When reclamation runs, TSM rebuilds these aggregates to free up space from expired files. Think of compacting your .pst file. I don’t have any numbers on this, but I would be concerned that we would be chipping away at our de-dupe rates because we would have TSM aggressively rebuilding these aggregates. Granted we would do better because of variable block, but I’m not sure it’s worth&amp;#0160; the risk.&lt;br /&gt;&amp;#0160;&lt;br /&gt;My recommendation would be to use the typical 70-90% as my reclamation threshold ( I think DD best practice is 90%), and just add some buffer onto your DD sizing to make up for those inefficiencies.&amp;quot;
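&lt;p&gt;Mario&amp;#39;s arithmetic can be sketched in a few lines of Python. This is only a rough model under the assumptions in his note (100GB volumes, a flat 10:1 deduplication ratio); the function name and parameters are made up for illustration and are not part of TSM or Data Domain.&lt;/p&gt;
&lt;pre&gt;
```python
def reclamation_cost(volume_gb, reclaimable_fraction, dedupe_ratio):
    """Estimate the work needed to free one full volume of logical space."""
    # Volumes that must be processed so their expired space adds up to one volume.
    volumes_read = 1 / reclaimable_fraction
    # TSM reads only the still-valid (pre-compression) data on each volume.
    gb_read = volumes_read * volume_gb * (1 - reclaimable_fraction)
    # Physical space freed on the deduplicated device for that logical volume
    # (assumes the same flat dedupe ratio applies to the expired data).
    physical_gb_freed = volume_gb / dedupe_ratio
    return volumes_read, gb_read, physical_gb_freed


# Compare an aggressive threshold with more conservative ones.
for fraction in (0.2, 0.5, 0.9):
    volumes, gb_read, freed = reclamation_cost(100, fraction, 10)
    print(f"{fraction:.0%} reclaimable: read {gb_read:.0f} GB from "
          f"{volumes:.1f} volumes to free {freed:.0f} GB physical")
```
&lt;/pre&gt;
&lt;p&gt;The amount of data read falls quickly as the threshold rises, while the physical space freed per logical volume stays the same, which is the diminishing return Mario describes.&lt;/p&gt;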
&lt;p&gt;Basically, I agree with everything Mario wrote--with the possible caveat that if you have TSM cycles to burn, you might want to turn up the wick a bit on reclamation beyond what he suggests.&lt;/p&gt;
&lt;p&gt;On the other hand, there is something else interesting that happens here (which Mario alludes to at one point). Strictly speaking, TSM reclamation shouldn&amp;#39;t reclaim all that much physical space on a deduplicated device. Due to the way TSM progressive incremental works, there is not a lot of net new data introduced (post-deduplication). So if I am doing reclamation, and cleaning up objects because there are newer versions, we need to recognize that the newer version may only introduce 10% new segments or less. So when we reclaim that file, we don&amp;#39;t reclaim the entire capacity associated with it, we only reclaim the amount that represents unique segments--10% in this example.&lt;/p&gt;
&lt;p&gt;So from that perspective, more aggressive reclamation doesn&amp;#39;t achieve the same return on investment with deduplicated Data Domain storage as it does with non-deduplicated storage, be it physical tape or virtual tape. Which, I suppose, is just another argument for taking a more moderate approach to reclamation like Mario suggests--because you will just be burning CPU and IO with very marginal returns in terms of physical capacity reclaimed.&lt;/p&gt;</content:encoded>



<dc:creator>Scott Waterhouse</dc:creator>
<pubDate>Tue, 04 Oct 2011 12:12:40 -0600</pubDate>

<feedburner:origLink>http://thebackupblog.typepad.com/thebackupblog/2011/10/yet-more-about-tsm-integration-with-data-domain.html</feedburner:origLink></item>
<item>
<title>More Thoughts on TSM</title>
<link>http://feedproxy.google.com/~r/typepad/AEBl/~3/cnzIswxe4YA/more-thoughts-on-tsm.html</link>
<guid isPermaLink="false">http://thebackupblog.typepad.com/thebackupblog/2011/09/more-thoughts-on-tsm.html</guid>
<description>I thought I would wrap up my thoughts on TSM, for now, by talking briefly about two things TSM users are typically concerned about: migrating TSM clients and getting off TSM. The first one of these is something we talked...</description>
<content:encoded>&lt;p&gt;I thought I would wrap up my thoughts on TSM, for now, by talking briefly about two things TSM users are typically concerned about: migrating TSM clients and getting off TSM.&lt;br /&gt;&lt;br /&gt;The first one of these is something we talked about just recently. And I think I covered off the easiest way to move to a Data Domain target with TSM from a tape or virtual tape target. But one of the things I frequently get asked about is: what about clients backing up using client side compression?&lt;br /&gt;&lt;br /&gt;This practice seems to be relatively common amongst TSM users—I could speculate why, but I won&amp;#39;t. Suffice it to say that it is far more common in TSM environments than it is for users of other backup applications. And it is a pernicious practice that is just not good. In any way. Sure it saves bandwidth. Sort of. But bandwidth is free, more or less. And it isn&amp;#39;t a big enough deal to enable remote backup, so really it just saves local bandwidth. Which is even closer to free. 
&lt;/p&gt;
&lt;br /&gt;But aside from that, it kills deduplication. (And no, not just deduplication on Data Domain, deduplication on anything.) Because, you see, TSM client compression stays with the backup object for as long as that object is retained. Meaning that even if you turn it off, only future versions of backup objects will be backed up without compression, and all the existing versions in your TSM storage pool will remain compressed. Where they will deduplicate poorly, or not at all.&lt;br /&gt;&lt;br /&gt;So, for all of you using TSM client compression, here is my advice: turn it off now. Right now.&lt;br /&gt;&lt;br /&gt;OK, have you done it? Good. &lt;br /&gt;&lt;br /&gt;No such thing as soon enough here. &lt;br /&gt;&lt;br /&gt;Because at the end of the day, you will be moving to a dedup target, if you haven&amp;#39;t already. Sooner or later you will. It is inevitable. Just like death and taxes. Only a heck of a lot more pleasant than either. And the best thing you can do to prepare for this future is stop using client side compression now. At least that will minimize the amount of retained compressed data that will not dedup well.&lt;br /&gt;&lt;br /&gt;The other issue here is how to get off of TSM. We have a large number of clients that have done this and are doing this, and one big question that comes up is: what do I do with my retained data? &lt;br /&gt;&lt;br /&gt;Well, there is no pretty or easy answer to this question. We can put lipstick on the pig, but it still says oink. And the problem here is that whenever you migrate to a different backup application, you pretty much maroon all the data presently retained and catalogued by your existing backup application. (There are alternatives, but they are often very expensive, to the point where I have yet to see more than one or two customers out of hundreds be able to justify the process.)&lt;br /&gt;&lt;br /&gt;So your data is marooned. 
The best thing to do is encapsulate that TSM server in a VM, make a backup of the VM, and turn it off.
&lt;p&gt;And by the way, this answer isn&amp;#39;t significantly different no matter what backup application you choose. Generally, your data is bound to that app for the duration of the retention, and that is that.&lt;/p&gt;
&lt;p&gt;What is better? Well, that is for the next post: why I think there is a better way to do things, and my reasons for recommending it.&lt;/p&gt;</content:encoded>


<category>deduplication</category>
<category>tsm</category>

<dc:creator>Scott Waterhouse</dc:creator>
<pubDate>Thu, 01 Sep 2011 13:28:32 -0600</pubDate>

<feedburner:origLink>http://thebackupblog.typepad.com/thebackupblog/2011/09/more-thoughts-on-tsm.html</feedburner:origLink></item>
<item>
<title>Two Interesting Links</title>
<link>http://feedproxy.google.com/~r/typepad/AEBl/~3/NVKzbXWYixo/two-interesting-links.html</link>
<guid isPermaLink="false">http://thebackupblog.typepad.com/thebackupblog/2011/08/two-interesting-links.html</guid>
<description>Today I have a couple of links for all of you interested in backup: First, the launch of the "official" EMC Backup and Recovery Systems Division blog, The Backup Window. There is lots of great content up already, and I...</description>
<content:encoded>&lt;p&gt;Today I have a couple of links for all of you interested in backup:&lt;/p&gt;
&lt;p&gt;First, the launch of the &amp;quot;official&amp;quot; EMC Backup and Recovery Systems Division blog, &lt;a href="http://thebackupwindow.emc.com/" target="_blank" title="The Backup Window"&gt;The Backup Window&lt;/a&gt;. There is lots of great content up already, and I am really liking some of the thinking about the future stuff that Stephen Manley is posting.&lt;/p&gt;
&lt;p&gt;Second, Chad has a really great and thought-provoking post about the &lt;a href="http://virtualgeek.typepad.com/virtual_geek/2011/08/tech-preview-avamar-vcloud-protector.html" target="_blank" title="VMware backup technology preview"&gt;future of backup for VMware environments&lt;/a&gt;. Really great, and really interesting. Although there is still a really big problem to tackle: data consistency within applications.&lt;/p&gt;
&lt;p&gt;And to add to Chad&amp;#39;s comments, I can think of a few other things I would like to see:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Built-in charge-back modeling. It is great to provide users with the ability to define their own backup and provision it, but it also has some dangers--like users that don&amp;#39;t understand that retaining everything forever entails some risks.&lt;/li&gt;
&lt;li&gt;Built in policy boundaries. There should be a setting that defines what the limits of a user-defined policy are. In other words, don&amp;#39;t let people shoot themselves in the foot. vCD should have settings that limit frequency and retention for backups.&lt;/li&gt;
&lt;li&gt;What-if modeling that shows the resource impacts of changes to backup policy.&lt;/li&gt;
&lt;li&gt;Critical parts of the vCD structures in a multi-tenancy environment should be automatically &amp;quot;pinned&amp;quot; to the backups of that tenant so that those backups are portable and self-defining.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I am sure there are others--but it is great stuff to think about. Make no mistake, this is where backup is heading, in my opinion.&lt;/p&gt;</content:encoded>



<dc:creator>Scott Waterhouse</dc:creator>
<pubDate>Wed, 31 Aug 2011 09:24:37 -0600</pubDate>

<feedburner:origLink>http://thebackupblog.typepad.com/thebackupblog/2011/08/two-interesting-links.html</feedburner:origLink></item>
<item>
<title>Using Data Domain with TSM: Making the Move</title>
<link>http://feedproxy.google.com/~r/typepad/AEBl/~3/mYF28SV3imE/using-data-domain-with-tsm-making-the-move.html</link>
<guid isPermaLink="false">http://thebackupblog.typepad.com/thebackupblog/2011/08/using-data-domain-with-tsm-making-the-move.html</guid>
<description>Once you have decided that your traditional tape infrastructure is not serving your business needs for backup and recovery particularly well, and that your TSM infrastructure requires some updating, what's next? As I discussed last time out, having the ability...</description>
<content:encoded>&lt;p&gt;Once you have decided that your traditional tape infrastructure is not serving your business needs for backup and recovery particularly well, and that your TSM infrastructure requires some updating, what&amp;#39;s next? As I discussed last time out, having the ability to &amp;quot;fix&amp;quot; TSM by dramatically improving the daily schedule, free up TSM server cycles, and enable better business processes—like disaster recovery testing—is a big win. &lt;br /&gt;&lt;br /&gt;But change doesn&amp;#39;t always come easily.&lt;br /&gt;&lt;br /&gt;And particularly change in the form of moving from tape to Data Domain with a TSM server can appear to be a confusing process. After all, for other backup applications that perform periodic fulls, it is &amp;quot;just&amp;quot; a matter of pointing the backup at a new target device, and letting the Data Domain system work its magic. This is almost entirely transparent to the administrator and backup application, and the fact that it is so easy is one of the big virtues of the Data Domain approach.&lt;br /&gt;&lt;br /&gt;However TSM doesn&amp;#39;t do full backups for many clients, and tapes don&amp;#39;t expire in quite the same orderly fashion as they do for other backup applications. (Huge understatement actually. Hopefully no TSM administrators were drinking coffee as they read this as they would likely be trying to get it out of their nose or off their keyboard right now!)
&lt;/p&gt;
&lt;br /&gt;So how do you move, and what is involved? &lt;br /&gt;&lt;br /&gt;There are lots of ways to move to a new media pool in TSM, and few of them are outright bad. There is lots of flexibility, so I don&amp;#39;t want to present this method as the only one. But my preference is to use a process called &amp;quot;reclaim storage pool&amp;quot;. Essentially, this is a process that reclaims the entire contents of a storage pool by moving all valid retained objects to a new storage pool.&lt;br /&gt;&lt;br /&gt;So, for the duration of the migration from tape to Data Domain, you will have two principal on-site storage pools (excepting perhaps your disk pool which is the initial backup target). One will be the old tape pool, the other a new Data Domain pool (which could be either virtual tape via a VTL or a sequential disk pool over NFS). Before the reclamation begins, the Data Domain will become the new primary on-site storage pool. All backups will end up here, either directly, if they come from a LAN-Free client, or via migration, if they go to a disk pool first. All new incremental backup data for all clients will go here. Existing retained data will still be sitting in the tape pool.&lt;br /&gt;&lt;br /&gt;My preference is to let this run for a week to two weeks to begin to capture the most recent active data, and let a portion of the data on tape expire. Every little bit helps.&lt;br /&gt;&lt;br /&gt;Then we begin the reclaim process on the (old) tape storage pool. Reclaiming across storage pools works exactly like reclaiming within a storage pool, except that the valid retained objects will be moved to the new pool. That is, they will end up on Data Domain rather than just another tape. The advantages of using this process are several:&lt;br /&gt;&lt;br /&gt;
&lt;ul&gt;
&lt;li&gt;You can interrupt it at any time or cancel it at the console with absolutely no impact (it can be resumed after an interruption without having to start over).&lt;/li&gt;
&lt;li&gt;It is the same process already used to reclaim space within a storage pool, and therefore its impact and management is already well understood by most TSM administrators.&lt;/li&gt;
&lt;li&gt;It can be done gradually: by gradually lowering the reclamation threshold, you can cause the process to run incrementally. By starting at 90%, then 80%, then 70% and so on, we can gradually move over portions of the valid data in a measured fashion. (You may want to do this by identifying all the volumes that fit the criteria, then reclaiming those volumes too.)&lt;/li&gt;
&lt;li&gt;You can exercise as much or as little fine grained control over the migration as you like. Although the steps above are certainly a good idea, in principle, there is not a lot wrong with just setting the reclamation on the old storage pool to 0 (forcing everything to be reclaimed) and letting it go.&lt;/li&gt;
&lt;/ul&gt;
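&lt;p&gt;To make the staged approach concrete, here is a small Python sketch that groups volumes by the first threshold at which they would qualify for reclamation. It is not TSM syntax: the volume names and reclaimable percentages are invented, and in practice you would pull the real figures from your own TSM server.&lt;/p&gt;
&lt;pre&gt;
```python
from bisect import bisect_left


def staged_reclaim_plan(reclaimable_pct_by_volume, thresholds):
    """Group volumes by the first threshold stage at which they qualify.

    thresholds must be listed from highest to lowest (for example 90, 80, 70).
    """
    # Sort ascending by reclaimable percentage so bisect can find each cut point.
    ordered = sorted(reclaimable_pct_by_volume.items(), key=lambda item: item[1])
    percentages = [pct for _, pct in ordered]
    plan = {}
    upper = len(ordered)  # volumes at or beyond this index are already planned
    for threshold in thresholds:
        # First index whose reclaimable percentage is at least the threshold.
        cut = bisect_left(percentages, threshold, 0, upper)
        plan[threshold] = [name for name, _ in ordered[cut:upper]]
        upper = cut
    return plan


# Hypothetical volumes and their reclaimable (expired) space in percent.
volumes = {"TAPE01": 95, "TAPE02": 85, "TAPE03": 72, "TAPE04": 40, "TAPE05": 10}
for threshold, names in staged_reclaim_plan(volumes, [90, 80, 70, 0]).items():
    print(f"threshold {threshold}%: {names}")
```
&lt;/pre&gt;
&lt;p&gt;Each stage only picks up the volumes that the previous, higher threshold left behind, so the amount of data moved per stage stays bounded. The final stage at 0 corresponds to the &amp;quot;just let it go&amp;quot; option in the last bullet.&lt;/p&gt;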
&lt;p&gt;It is all as simple as designating the Data Domain as the target for reclaim storage pool: your syntax will determine the storage pool which is the source, the reclamation threshold value, the pool which is the target (Data Domain) and the number of parallel processes (generally as many as you would typically use for reclamation). And that’s it.&lt;br /&gt;&lt;br /&gt;All of which brings us to the next part: what are the things to look out for in a TSM migration from tape to Data Domain? That’s up next time.&lt;/p&gt;</content:encoded>



<dc:creator>Scott Waterhouse</dc:creator>
<pubDate>Fri, 26 Aug 2011 08:59:39 -0600</pubDate>

<feedburner:origLink>http://thebackupblog.typepad.com/thebackupblog/2011/08/using-data-domain-with-tsm-making-the-move.html</feedburner:origLink></item>
<item>
<title>Using Data Domain as a Target for TSM</title>
<link>http://feedproxy.google.com/~r/typepad/AEBl/~3/HctOKQvTTxY/using-data-domain-as-a-target-for-tsm.html</link>
<guid isPermaLink="false">http://thebackupblog.typepad.com/thebackupblog/2011/08/using-data-domain-as-a-target-for-tsm.html</guid>
<description>I have been spending a lot of time with TSM accounts lately, so I thought I would take the opportunity to discuss some of the gotchas and lessons learned from these environments, as they moved from a tape-centric backup environment...</description>
<content:encoded>&lt;p&gt;I have been spending a lot of time with TSM accounts lately, so I thought I would take the opportunity to discuss some of the gotchas and lessons learned from these environments, as they moved from a tape-centric backup environment to a disk-based backup environment using EMC Data Domain technology.&lt;br /&gt;&amp;#0160;&lt;br /&gt;The first and major thing each of these customers has realized is the incredible impact the Data Domain systems have on the schedule that TSM operates under. TSM is architected a bit differently (understatement of the year) than most other backup products. And for a lot of reasons, the TSM server is usually busy 24 hours a day. In fact, it is almost always the case that it could be busy for more than 24 hours a day, but there just isn&amp;#39;t enough time and resources to enable it to do everything it wants and needs to do.&lt;/p&gt;

&lt;br /&gt;&amp;#0160;&lt;br /&gt;Let&amp;#39;s take a look at a typical day in the life of a TSM server:&lt;br /&gt;&amp;#0160;&lt;br /&gt;20:00 to 8:00 backups to disk cache (even in 100% tape environments)&lt;br /&gt;21:00 to 9:00 migration from disk cache to tape&lt;br /&gt;06:00 to 12:00 copy activity to create daily off-site (copy pool) tapes&lt;br /&gt;12:00 to 13:00 TSM database backup to tape&lt;br /&gt;13:00 to 13:30 copy activity to create off-site copy of TSM db (copy pool)&lt;br /&gt;13:00 to 16:00 migration from disk cache to tape&lt;br /&gt;16:00 to 17:00 expiration activity&lt;br /&gt;17:00 to 20:00 reclamation activity&lt;br /&gt;&amp;#0160;&lt;br /&gt;Now I have simplified this a bit: normally some of these activities are interwoven. Migration from disk cache to tape is often an ongoing process. Reclamation often spills over into the backup window. But overall, it is a fair generalization to say that almost every TSM server out there using tape needs more hours than there are in a day to get through this activity. &lt;br /&gt;&amp;#0160;&lt;br /&gt;So what usually gets left out? Reclamation. 98% of the time, the shortfall in time and resources is made up by reducing the amount of reclamation that happens. In turn this means that more and more tape is consumed, and the density of data on tape drops. In TSM environments, it is not unusual to see less than 50% of the total tape capacity used by current valid backup data. 
In some cases, I have seen 70-90% of tapes wasted on unreclaimed data.&lt;br /&gt;&amp;#0160;&lt;br /&gt;Now let&amp;#39;s look at a typical schedule after implementing a Data Domain system:&lt;br /&gt;&amp;#0160;&lt;br /&gt;20:00 to 4:00 backups to disk cache (still sometimes required even with Data Domain)&lt;br /&gt;20:00 to 4:00 backups to Data Domain from LAN-free clients&lt;br /&gt;21:00 to 6:00 migration from disk cache to tape&lt;br /&gt;6:30 to 7:30 TSM database backup to Data Domain&lt;br /&gt;8:00 to 11:00 expiration activity&lt;br /&gt;11:00 to 15:00 reclamation activity&lt;br /&gt;&amp;#0160;&lt;br /&gt;So what has happened? First, we have moved larger backup clients to a LAN-free method, sending their data directly to Data Domain. With the much larger number of virtual resources we have available in a Data Domain system (up to 256 virtual tape drives--or 512 on a GDA) than most environments ever have access to in the physical world, we can make this simple architectural change which is enormously beneficial.&lt;br /&gt;&amp;#0160;&lt;br /&gt;Second, we have got rid of the copy pool activities. This is a huge drain on the time and resources of the TSM server during the course of the day. By eliminating this entirely, and replacing it with Data Domain replication, we save many hours of processing. Incidentally, we also reduce the size of the TSM database (because we don&amp;#39;t have entries for every backup object retained offsite and every version of them retained offsite). We cut the reclamation workload of the TSM server in half, because it does not need to do reclamation processing against the copy pool. We reduce the CPU and I/O load on the server. These are all very good things.&lt;br /&gt;&amp;#0160;&lt;br /&gt;Third, database backups and offsite copies are complete by 7:30 in the morning. 
This means that a full disaster recovery copy of the TSM backup pool and database is available for disaster recovery purposes many hours earlier in the day than it would be with physical tape. In fact, depending on the duration of the copy pool job and the timing of your couriers, off-site tape may not make it off-site for 24-36 hours for some TSM users. This means, by the way, that the best RPO that can be achieved is 72 hours. With Data Domain, we have an RPO of no more than 30 hours.&lt;br /&gt;&amp;#0160;&lt;br /&gt;Finally, reclamation is going to run far faster. In general, reclamation from virtual tape on Data Domain is going to run four to ten times faster than reclamation from physical tape. In turn, this means that we can be aggressive with our reclamation policies, and reclaim a tape when it has 20-30% expired data, rather than 70-90%. In turn, this makes for far more efficient use of our backup infrastructure.&lt;br /&gt;&amp;#0160;&lt;br /&gt;The net result here is that we have taken a typical TSM server from requiring 30+ hours to get through its daily activities, to 18-20 hours to get through the same activities.&lt;br /&gt;&amp;#0160;&lt;br /&gt;These are the first and most significant benefits that our customers are realizing when they pair TSM with Data Domain technology. Next up: gotchas and approach.</content:encoded>



<dc:creator>Scott Waterhouse</dc:creator>
<pubDate>Fri, 19 Aug 2011 08:39:56 -0600</pubDate>

<feedburner:origLink>http://thebackupblog.typepad.com/thebackupblog/2011/08/using-data-domain-as-a-target-for-tsm.html</feedburner:origLink></item>

</channel>
</rss>
