<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" version="2.0">

<channel>
	<title>Web Admin Blog</title>
	
	<link>http://www.webadminblog.com</link>
	<description>Real Web Admins.  Real World Experience.</description>
	<lastBuildDate>Tue, 07 Jul 2009 17:27:13 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.feedburner.com/WebAdminBlog" type="application/rss+xml" /><feedburner:emailServiceId xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">WebAdminBlog</feedburner:emailServiceId><feedburner:feedburnerHostname xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">http://feedburner.google.com</feedburner:feedburnerHostname><item>
		<title>Velocity 2009 – Best Tidbits</title>
		<link>http://www.webadminblog.com/index.php/2009/07/06/velocity-2009-best-tidbits/</link>
		<comments>http://www.webadminblog.com/index.php/2009/07/06/velocity-2009-best-tidbits/#comments</comments>
		<pubDate>Tue, 07 Jul 2009 03:43:47 +0000</pubDate>
		<dc:creator>Ernest</dc:creator>
				<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Velocity 2009]]></category>
		<category><![CDATA[velocity]]></category>
		<category><![CDATA[velocityconf]]></category>
		<category><![CDATA[velocityconf09]]></category>

		<guid isPermaLink="false">http://www.webadminblog.com/?p=284</guid>
		<description><![CDATA[Besides all the sessions, which were pretty good, a lot of the good info you get from conferences is by networking with other folks there and talking to vendors.  Here are some of my top-value takeaways.
Aptimize is a New Zealand-based company that has developed software to automatically do the most high value front end optimizations [...]]]></description>
			<content:encoded><![CDATA[<p>Besides all the sessions, which were pretty good, a lot of the good info you get from conferences is by networking with other folks there and talking to vendors.  Here are some of my top-value takeaways.</p>
<p><a href="http://www.aptimize.com/" target="_blank">Aptimize</a> is a New Zealand-based company that has developed software to automatically do the most high value front end optimizations (image spriting, CSS/JS combination and minification, etc.).  We predict it&#8217;ll be big.  On a site like ours, going back and doing all this across hundreds of apps will never happen &#8211; we can engineer new ones and important ones better, but something like this which can benefit apps by the handful is great.</p>
<p>I got some good info from the MySpace people.  We&#8217;ve been talking about whether to run our back end as Linux/Apache/Java or Windows/IIS/.NET for some of our newer stuff.  In the first workshop, I was impressed when the guy asked who all runs .NET and only one guy raised his hand.   MySpace is one of the big .NET sites, but when I talked with them about what they felt the advantage was, they looked at each other and said &#8220;Well&#8230;  It was the most expeditious choice at the time&#8230;&#8221;  That&#8217;s damning with faint praise, so I asked about what they saw the main disadvantage being, and they cited remote administration &#8211; even with the new PowerShell stuff it&#8217;s just still not as easy as remote admin/CM of Linux.  That&#8217;s top of my list too, but often Microsoft apologists will say &#8220;You just don&#8217;t understand because you don&#8217;t run it&#8230;&#8221;  But apparently running it doesn&#8217;t necessarily sell you either.</p>
<p>Our friends from <a href="http://www.opnet.com/" target="_blank">Opnet</a> were there.  It was probably a tough show for them, as many of these shops are of the &#8220;I never pay for software&#8221; camp.  However, you end up wasting far more in skilled personnel time if you don&#8217;t have the right tools for the job.  We use the heck out of their Panorama tool &#8211; it pulls metrics from all tiers of your system, including deep in the JVM, and does dynamic baselining, correlation and deviation.  If all your programmers are 3l33t maybe you don&#8217;t need it, but if you&#8217;re unsurprised when one of them says &#8220;Uhhh&#8230; What&#8217;s a thread leak?&#8221; then it&#8217;s money.</p>
<p><a href="http://www.controltier.com/" target="_blank">ControlTier</a> is nice, they&#8217;re a commercial open source CM tool for app deploys &#8211; it works at a higher level than chef/puppet, more like capistrano.</p>
<p><a href="http://www.engineyard.com/" target="_blank">EngineYard</a> was a really nice cloud provisioning solution (sits on top of Amazon or whatever).  The reality of cloud computing as provided by the base IaaS vendors isn&#8217;t really the &#8220;machines dynamically spinning up and down and automatically scaling your app&#8221; they say it is without something like this (or lots of custom work).  Their solution is, sadly, Rails only right now.  But it is slick, very close to the blue-sky vision of what cloud computing can enable.</p>
<p>And also, I joined the <a href="http://www.eff.org/" target="_blank">EFF</a>!  Cyber rights now!</p>
<p>You can see most of the official proceedings from the conference (for free!):</p>
<ul>
<li><a href="http://velocityconference.blip.tv/" target="_blank">Conference Videos</a></li>
<li><a href="http://www.flickr.com/photos/x180/sets/72157620269837751/" target="_blank">Velocity Photos</a></li>
<li><a href="http://en.oreilly.com/velocity2009/public/schedule/proceedings" target="_blank">Presentation Slides</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.webadminblog.com/index.php/2009/07/06/velocity-2009-best-tidbits/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Velocity 2009 – Monday Night</title>
		<link>http://www.webadminblog.com/index.php/2009/07/06/velocity-2009-monday-night/</link>
		<comments>http://www.webadminblog.com/index.php/2009/07/06/velocity-2009-monday-night/#comments</comments>
		<pubDate>Tue, 07 Jul 2009 03:11:40 +0000</pubDate>
		<dc:creator>Ernest</dc:creator>
				<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Velocity 2009]]></category>
		<category><![CDATA[velocity]]></category>
		<category><![CDATA[velocityconf]]></category>
		<category><![CDATA[velocityconf09]]></category>

		<guid isPermaLink="false">http://www.webadminblog.com/?p=282</guid>
		<description><![CDATA[After a hearty trip to Gordon Biersch, Peco went to the Ignite battery of five minute presentations, which he said was very good.  I went to two Birds of a Feather sessions, which were not.  The first was a general cloud computing discussion which covered well-trod ground.  The second was by a hapless Sun guy [...]]]></description>
			<content:encoded><![CDATA[<p>After a hearty trip to Gordon Biersch, Peco went to the <a href="http://en.oreilly.com/velocity2009/public/schedule/detail/9375" target="_blank">Ignite</a> battery of five minute presentations, which he said was very good.  I went to two Birds of a Feather sessions, which were not.  The first was a general cloud computing discussion which covered well-trod ground.  The second was by a hapless Sun guy on Olio and Fabian.  No, you don&#8217;t need to know about them.  It was kinda painful, but I want to commend that Asian guy from Google for diplomatically continuing to try to guide the discussion into something coherent without just rolling over the Sun guy.  Props!</p>
<p>And then &#8211; we were lame and just turned in.  I&#8217;m getting old, can&#8217;t party every night like I used to.  (I don&#8217;t know what Peco&#8217;s excuse is!)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.webadminblog.com/index.php/2009/07/06/velocity-2009-monday-night/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Velocity 2009 – Scalable Internet Architectures</title>
		<link>http://www.webadminblog.com/index.php/2009/07/06/velocity-2009-scalable-internet-architectures/</link>
		<comments>http://www.webadminblog.com/index.php/2009/07/06/velocity-2009-scalable-internet-architectures/#comments</comments>
		<pubDate>Tue, 07 Jul 2009 02:51:42 +0000</pubDate>
		<dc:creator>Ernest</dc:creator>
				<category><![CDATA[Application Performance Management]]></category>
		<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Velocity 2009]]></category>
		<category><![CDATA[scalability]]></category>
		<category><![CDATA[velocity]]></category>
		<category><![CDATA[velocityconf]]></category>
		<category><![CDATA[velocityconf09]]></category>

		<guid isPermaLink="false">http://www.webadminblog.com/?p=276</guid>
		<description><![CDATA[OK, I&#8217;ll be honest.  I started out attending &#8220;Metrics that Matter &#8211; Approaches to Managing High Performance Web Sites&#8221; (presentation available!) by Ben Rushlo, Keynote proserv.  I bailed after a half hour to the other one, not because the info in that one was bad but because I knew what he was covering and wanted [...]]]></description>
			<content:encoded><![CDATA[<p>OK, I&#8217;ll be honest.  I started out attending &#8220;<a href="http://en.oreilly.com/velocity2009/public/schedule/detail/9025" target="_blank">Metrics that Matter &#8211; Approaches to Managing High Performance Web Sites</a>&#8221; (presentation available!) by Ben Rushlo, Keynote proserv.  I bailed after a half hour to the other one, not because the info in that one was bad but because I knew what he was covering and wanted to get the less familiar information from the other workshop.  Here are my brief notes from his session:</p>
<ul>
<li>Online apps are complex systems</li>
<li>A siloed approach of deciding to improve midtier vs CDN vs front end engineering results in a suboptimal experience for the end user &#8211; you have to take a holistic view.  <em>I totally agree with this; in our own caching project we took special care to do an analysis project first, where we evaluated the impact and benefit of each of these items not only in isolation but together, so we&#8217;d know where we should expend effort.</em></li>
<li>Use top level/end user metrics, not system metrics, to measure performance.</li>
<li>There are other metrics that correlate to your performance &#8211; &#8220;key indicators.&#8221;</li>
<li>It&#8217;s hard to take low level metrics and take them &#8220;up&#8221; into a meaningful picture of user experience.</li>
</ul>
<p><em>He&#8217;s covering good stuff but it&#8217;s nothing I don&#8217;t know.  We see the differences and benefits in point in time tools, Passive RUM, tagging RUM, synthetic monitoring, end user/last mile synthetic monitoring&#8230;  If you don&#8217;t, read the presentation, it&#8217;s good.  As for me, it&#8217;s off to the scaling session.<br />
</em><br />
I hopped into this session a half hour late.  It&#8217;s <a href="http://en.oreilly.com/velocity2009/public/schedule/detail/8859" target="_blank">Scalable Internet Architectures</a> (again, go get the presentation) by <a href="http://lethargy.org/~jesus/" target="_blank">Theo Schlossnagle</a>, CEO of <a href="http://omniti.com/" target="_blank">OmniTI</a> and author of the similarly named book.</p>
<p><em>I like his talk, it starts by getting to the heart of what Web Operations &#8211; what we call &#8220;Web Admin&#8221; hereabouts &#8211; is.  It kinda confuses architecture and operations initially but maybe that&#8217;s because I came in late. </em></p>
<p>He talks about knowledge, tools, experience, and discipline, and mentions that discipline is the most lacking element in the field.<em> Like him, I&#8217;m a &#8220;real engineer&#8221; who went into IT so I agree vigorously.</em></p>
<p>What specifically should you do?</p>
<ul>
<li>Use version control</li>
<li>Monitor</li>
<li>Serve static content using a CDN, and behind that a reverse proxy and behind that peer based HA.  Distribute DNS for global distribution.</li>
<li>Dynamic content &#8211; now it&#8217;s time for optimization.</li>
</ul>
<h3>Optimizing Dynamic Content</h3>
<p>Don&#8217;t pay to generate the same content twice &#8211; use caching.  Generate content only when things change and break the system into components so you can cache appropriately.</p>
<p>Example: a PHP news site.  Articles are in Oracle, there is personalization on each page, and the top new forum posts are in a sidebar.</p>
<p>Why abuse Oracle by hitting it on every page view?  Updates are controlled.  The page should pull user prefs from a cookie.  (P.S. rewrite your query strings.)<br />
But it&#8217;s still slow to pull from the db vs hardcoding it.<br />
All blog software does this, for example.<br />
Check for a hardcoded PHP page &#8211; if it&#8217;s not there, run something that puts it there.  It still dynamically puts in user personalization from the cookie.  In the preso he provides details on how to do this.<br />
Do cache invalidation on content change, and use a message queuing system like OpenAMQ for async writes.<br />
Apache is now the bottleneck &#8211; use APC (Alternative PHP Cache)<br />
or use memcached &#8211; he says no timeouts!  Or&#8230; be careful about them!  Or something.</p>
<h3>Scaling Databases</h3>
<p>1. shard them<br />
2. shoot yourself</p>
<p>Sharding, or breaking your data up by range across many databases, means you throw away relational constraints and that&#8217;s sad.  Get over it.</p>
<p>You may not need relations &#8211; use files, fool!  Or other options like CouchDB, etc.  <em>Or Hadoop, from the previous workshop!</em></p>
<p>Vertically scale first by:</p>
<ul>
<li> not hitting the damn db!</li>
<li> running a good db.  Postgres!  Not MySQL, boo-yah!</li>
</ul>
<p>When you have to go horizontal, partition right &#8211; an OLTP question shouldn&#8217;t need more than one shard to answer it.   If that&#8217;s not possible, consider duplication.</p>
<p>IM example.  Store messages sharded by recipient.  But then the sender wants to see them too and that&#8217;s an expensive operation &#8211; so just store them twice!!!</p>
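<p>A rough sketch of that &#8220;store it twice&#8221; trick, so that both the inbox and the sent view are answered by a single shard.  Note the assumptions: this uses hash-based placement rather than ranges to keep it short, the shards are plain in-memory dicts, and the shard count and function names are made up for the example.</p>
<pre>
```python
import zlib

NUM_SHARDS = 4  # hypothetical shard count
shards = [{"inbox": [], "sent": []} for _ in range(NUM_SHARDS)]


def shard_for(user):
    # crc32 is stable across runs, unlike the built-in hash() for strings.
    return shards[zlib.crc32(user.encode("utf-8")) % NUM_SHARDS]


def send_message(sender, recipient, text):
    msg = {"from": sender, "to": recipient, "text": text}
    shard_for(recipient)["inbox"].append(msg)  # copy 1: on the recipient's shard
    shard_for(sender)["sent"].append(msg)      # copy 2: on the sender's shard


def inbox(user):
    """Answered entirely by the one shard that owns this user."""
    return [m for m in shard_for(user)["inbox"] if m["to"] == user]


def sent(user):
    """Also a single-shard read, thanks to the duplicate copy."""
    return [m for m in shard_for(user)["sent"] if m["from"] == user]
```
</pre>
<p>The cost is double the writes and storage; the payoff is that neither read ever fans out across shards.</p>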
<p>But if it&#8217;s not that simple, partitioning can hose you.</p>
<p>Do math and simulate it before you do it fool!   Be an engineer!</p>
<p>Multi-master replication doesn&#8217;t work right.  But it&#8217;s getting closer.</p>
<h3>Networking</h3>
<p>The network&#8217;s part of it, can&#8217;t forget it.</p>
<p>Of course if you&#8217;re using Ruby on Rails the network will never make your app suck more.  <em>Heh, the random drive-by disses rile the crowd up.</em></p>
<p>A single machine can push a gig.  More isn&#8217;t hard with aggregated ports.  Apache too, serving static files.  Load balancers too.  How to get to 10 or 20 Gbps though?  All the drivers and firmware suck.  Buy an expensive LB?</p>
<p>Use routing.  It supports naive LB&#8217;ing.  Or routing protocol on front end cache/LBs talking to your edge router.  Use hashed routes upstream.  User caches use same IP.  Fault tolerant, distributed load, free.</p>
<p>Use isolation for floods.  Set up a surge net.  Route out based on MAC.  Used vs DDoSes.</p>
<h3>Service Decoupling</h3>
<p>One of the most overlooked techniques for scalable systems.  Why do now what you can postpone till later?</p>
<p>Break transaction into parts.  Queue info.  Process queues behind the scenes.  Messaging!  There&#8217;s different options &#8211; AMQP, Spread, JMS.  Specifically good message queuing options are:</p>
<ul>
<li><a href="http://activemq.apache.org/" target="_blank"> ActiveMQ (Java)</a></li>
<li> OpenAMQ (C)</li>
<li> RabbitMQ (erlang)</li>
</ul>
<p>Most common &#8211; <a href="http://stomp.codehaus.org/" target="_blank">STOMP</a>, sucks but universal.</p>
<p>Combine a queue and a job dispatcher to make this happen.  Side note &#8211; <a href="http://www.danga.com/gearman/" target="_blank">Gearman</a>, while cool, doesn&#8217;t do this &#8211; it dispatches work but it doesn&#8217;t decouple action from outcome &#8211; should be used to scale work that can&#8217;t be decoupled.  (Yes it does, says dude in crowd.)</p>
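<p>Here is a minimal sketch of the queue-plus-dispatcher shape, using an in-process queue and a thread as stand-ins for a real broker like the ones listed above.  The signup handler, the job format and the &#8220;welcome email&#8221; work are all hypothetical; the point is only that the request path returns before the work is done.</p>
<pre>
```python
import queue
import threading

jobs = queue.Queue()   # stand-in for a real message broker
sent_emails = []       # records the work done behind the scenes


def handle_signup(email):
    """Request path: queue the slow work and return right away."""
    jobs.put({"type": "welcome_email", "to": email})
    return "202 Accepted"


def dispatcher():
    """Background worker: drains the queue and runs each job."""
    while True:
        job = jobs.get()
        if job is None:                # sentinel: shut the worker down
            jobs.task_done()
            break
        sent_emails.append(job["to"])  # stand-in for the slow part
        jobs.task_done()


worker = threading.Thread(target=dispatcher, daemon=True)
worker.start()

status = handle_signup("peco@example.com")
jobs.join()      # demo only: wait for the queue to drain
jobs.put(None)
worker.join()
```
</pre>
<p>In a real deployment the queue would live outside the web process so that queued work survives a restart, which this in-memory version does not.</p>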
<h3>Scalability Problems</h3>
<p>It often boils down to &#8220;don&#8217;t be an idiot.&#8221;  <em>His words not mine.  I like this guy.</em> Performance is easier than scaling.  Extremely high perf systems tend to be easier to scale because they don&#8217;t have to scale as much.</p>
<p>e.g. An email marketing campaign with a URL not ending in a trailing slash.  Guess what, you just doubled your hits.  Use the damn trailing slash to avoid 302s.</p>
<p><em>How do you stop everyone from being an idiot though?  Every person who sends a mass email from your company?  That&#8217;s our problem  &#8211; with more than fifty programmers and business people generating apps and content for our Web site, there is always a weakest link.</em></p>
<p>Caching should be controlled not prevented in nearly any circumstance.</p>
<p>Understand the problem.  Going from 100k to 10MM users &#8211; don&#8217;t just bucketize in small chunks and assume it will scale.  Allow a margin for error.  Designing for 100x or 1000x requires a profound understanding of the problem.</p>
<p>Example &#8211; I plan for a traffic spike of 3000 new visitors/sec.  My page is about 300k.  CPU bound.  8ms service time.  Calculate servers needed.  If I varnish the static assets, the calculation says I need 3-4 machines.  But do the math and it&#8217;s about 8 Gbps of throughput (3000 pages/sec at 300 KB each is roughly 900 MB/sec).  No way.  At 1.5MM packets/sec &#8211; the firewall dies.  You have to keep the whole system in mind.</p>
<p>So spread out static resources across multiple datacenters, agg&#8217;d pipes.<br />
The rest is only 350 Mbps, 75k packets per second, doable &#8211; except the 302 adds 50% overage in packets per sec.</p>
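<p>The back-of-envelope math in that example is easy to check.  A small sketch, assuming &#8220;300k&#8221; means 300,000 bytes per page view and that the 8ms of CPU applies once per page view; neither assumption is spelled out in my notes.</p>
<pre>
```python
VISITORS_PER_SEC = 3000
PAGE_BYTES = 300 * 1000      # "about 300k", taken here as 300,000 bytes
SERVICE_TIME_SEC = 0.008     # 8 ms of CPU per page view (assumed)

# Bandwidth needed to ship every page in full
bytes_per_sec = VISITORS_PER_SEC * PAGE_BYTES
gbits_per_sec = bytes_per_sec * 8 / 1e9

# CPU-seconds of work arriving each second = cores kept busy at full load
busy_cores = VISITORS_PER_SEC * SERVICE_TIME_SEC

print(f"{bytes_per_sec / 1e6:.0f} MB/sec")
print(f"{gbits_per_sec:.1f} Gbit/sec")
print(f"{busy_cores:.0f} cores busy")
```
</pre>
<p>That comes to about 900 MB/sec, or 7.2 Gbit/sec before any protocol overhead, and 24 busy cores.  On 8-core boxes (my assumption, not a number from the talk) that is three machines, which fits the &#8220;3-4 machines&#8221; CPU estimate, while the bandwidth figure is what actually sinks the plan.</p>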
<p>Last bonus thought &#8211; use ZFS/DTrace for dbs, so run them on Solaris!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.webadminblog.com/index.php/2009/07/06/velocity-2009-scalable-internet-architectures/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Velocity 2009 – Hadoop Operations: Managing Big Data Clusters</title>
		<link>http://www.webadminblog.com/index.php/2009/07/01/velocity-2009-hadoop-operations-managing-big-data-clusters/</link>
		<comments>http://www.webadminblog.com/index.php/2009/07/01/velocity-2009-hadoop-operations-managing-big-data-clusters/#comments</comments>
		<pubDate>Wed, 01 Jul 2009 21:28:16 +0000</pubDate>
		<dc:creator>Ernest</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Velocity 2009]]></category>
		<category><![CDATA[cloudera]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[hadoop]]></category>
		<category><![CDATA[velocity]]></category>
		<category><![CDATA[velocityconf]]></category>
		<category><![CDATA[velocityconf09]]></category>

		<guid isPermaLink="false">http://www.webadminblog.com/?p=255</guid>
		<description><![CDATA[Hadoop Operations: Managing Big Data Clusters (see link on that page for preso) was given by Jeff Hammerbacher of Cloudera.
Other good references -
book: &#8220;Hadoop: The Definitive Guide&#8221;
preso: hadoop cluster management from USENIX 2009
Hadoop is an Apache project inspired by Google&#8217;s infrastructure; it&#8217;s software for programming warehouse-scale computers.
It has recently been split into three main subprojects [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://en.oreilly.com/velocity2009/public/schedule/detail/7624" target="_blank">Hadoop Operations: Managing Big Data Clusters</a> (see link on that page for preso) was given by <a href="http://jeffhammerbacher.com/" target="_blank">Jeff Hammerbacher</a> of <a href="http://www.cloudera.com/" target="_blank">Cloudera</a>.</p>
<p>Other good references -<br />
book: &#8220;<a href="http://oreilly.com/catalog/9780596521974/" target="_blank">Hadoop: The Definitive Guide</a>&#8221;<br />
preso: <a href="http://wiki.apache.org/hadoop-data/attachments/HadoopPresentations/attachments/Hadoop-USENIX09.pdf" target="_blank">hadoop cluster management from USENIX 2009</a></p>
<p><a href="http://hadoop.apache.org/" target="_blank">Hadoop</a> is an Apache project inspired by Google&#8217;s infrastructure; it&#8217;s software for programming warehouse-scale computers.</p>
<p>It has recently been split into three main subprojects &#8211; HDFS, MapReduce, and Hadoop Common &#8211; and sports an ecosystem of various smaller subprojects (hive, etc.).</p>
<p>Usually a hadoop cluster is a mess of stock 1 RU servers with 4&#215;1TB SATA disks in them.  &#8220;I like my servers like I like my women &#8211; cheap and dirty,&#8221; Jeff did not say.</p>
<p>HDFS:</p>
<ul>
<li>Pools servers into a single hierarchical namespace</li>
<li>It&#8217;s designed for large files, written once/read many times</li>
<li>It does checksumming, replication, compression</li>
<li>Access is from Java, C, command line, etc.  Not usually mounted at the OS level.</li>
</ul>
<p>MapReduce:</p>
<ul>
<li>Is a fault tolerant data layer and API for parallel data processing</li>
<li>Has a key/value pair model</li>
<li>Access is via Java, C++, streaming (for scripts), SQL (Hive), etc</li>
<li>Pushes work out to the data</li>
</ul>
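<p>To make the key/value pair model concrete, here is the classic word count written as map, shuffle and reduce steps.  This is a single-process Python toy, not the Hadoop API; the function names are made up for the example, and on a real cluster the map and reduce calls would run in parallel on the nodes holding the data.</p>
<pre>
```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: collapse all the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```
</pre>
<p>Calling <code>word_count(["hadoop hdfs", "Hadoop hive"])</code> counts &#8220;hadoop&#8221; twice and the other two words once each.</p>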
<p>Subprojects:</p>
<ul>
<li>Avro (serialization)</li>
<li>HBase (like Google BigTable)</li>
<li>Hive (SQL interface)</li>
<li>Pig (language for dataflow programming)</li>
<li>ZooKeeper (coordination for distrib. systems)</li>
</ul>
<p>Facebook used scribe (log aggregation tool) to pull a big wad of info into hadoop, published it out to mysql for user dash, to oracle rac for internal&#8230;<br />
Yahoo! uses it too.</p>
<p>Sample projects hadoop would be good for &#8211; log/message warehouse, database archival store, search team projects (autocomplete), targeted web crawls&#8230;<br />
As boxes you can use unused desktops, retired db servers, amazon ec2&#8230;</p>
<p>Tools they use to make hadoop include subversion/jira/ant/ivy/junit/hudson/javadoc/forrest<br />
It uses an Apache 2.0 license</p>
<p>Good configs for hadoop:</p>
<ul>
<li>use 7200 rpm sata, ecc ram, 1U servers</li>
<li>use linux, ext3 or maybe xfs filesystem, with noatime</li>
<li>JBOD disk config, no raid</li>
<li> java6_14+</li>
</ul>
<p>To manage it -</p>
<p>unix utes: sar, iostat, iftop, vmstat, nfsstat, strace, dmesg, friends</p>
<p>java utes: jps, jstack, jconsole<br />
Get the rpm!  www.cloudera.com/hadoop</p>
<p>config: my.cloudera.com<br />
modes &#8211; standalone, pseudo-distrib, distrib<br />
&#8220;It&#8217;s nice to use dsh, cfengine/puppet/bcfg2/chef for config management across a cluster; maybe use scribe for centralized logging&#8221;</p>
<p><em>I love hearing what tools people are using, that&#8217;s mainly how I find out about new ones!</em></p>
<p>Common hadoop problems:</p>
<ul>
<li> &#8220;It&#8217;s almost always DNS&#8221; &#8211; use hostnames</li>
<li> open ports</li>
<li> distrib ssh keys (expect)</li>
<li> write permissions</li>
<li> make sure you&#8217;re using all the disks</li>
<li> don&#8217;t share NFS mounts for large clusters</li>
<li>set JAVA_HOME to new jvm (stick to sun&#8217;s)</li>
</ul>
<h3>HDFS In Depth</h3>
<p>1.  NameNode (master)<br />
VERSION file shows data structs, filesystem image (in memory) and edit log (persisted) &#8211; if they change, painful upgrade</p>
<p>2.  Secondary NameNode (aka checkpoint node) &#8211; checkpoints the FS image and then truncates edit log, usually run on a sep node<br />
New backup node in .21 removes need for NFS mount write for HA</p>
<p>3.  DataNode (workers)<br />
stores data in local fs<br />
stores data in blk_&lt;id&gt; files, round robins through dirs<br />
heartbeat to namenode<br />
raw socket to serve to client</p>
<p>4.  Client (Java HDFS lib)<br />
other stuff (libhdfs) more unstable</p>
<p>hdfs operator utilities</p>
<ul>
<li> safe mode &#8211; when it starts up</li>
<li> fsck &#8211; hadoop version</li>
<li> dfsadmin</li>
<li> block scanner &#8211; runs every 3 wks, has web interface</li>
<li> balancer &#8211; examines ratio of used to total capacity across the cluster</li>
<li> har (like tar) archive &#8211; bunch up smaller files</li>
<li> distcp &#8211; parallel copy utility (uses mapreduce) for big loads</li>
<li> quotas</li>
</ul>
<p>has users, groups, permissions &#8211; including x, which means nothing for files since there is no execution, but is used for dirs<br />
hadoop has some access trust issues &#8211; used through gateway cluster or in trusted env<br />
audit logs &#8211; turn on in log4j.properties</p>
<p>has loads of Web UIs &#8211; on namenode go to /metrics, /logLevel, /stacks<br />
non-hdfs access &#8211; HDFS proxy to http, or thriftfs<br />
has trash (.Trash in home dir) &#8211; turn it on</p>
<p>includes benchmarks &#8211; testdfsio, nnbench</p>
<p>Common HDFS problems</p>
<ul>
<li> disk capacity, esp due to log file sizes &#8211; crank up reserved space</li>
<li> slow but not dead disks, and NICs flapping to slow mode</li>
<li> checkpointing and backing up metadata &#8211; monitor that it happens hourly</li>
<li> losing write pipeline for long lived writes &#8211; redo every hour is recommended</li>
<li> upgrades</li>
<li>many small files</li>
</ul>
<h3>MapReduce</h3>
<p>use Fair Share or Capacity scheduler<br />
distributed cache<br />
jobcontrol for ordering</p>
<p>Monitoring &#8211; They use ganglia, jconsole, nagios and canary jobs for functionality</p>
<p>Question &#8211; how much admin resource would you need for hadoop?  Answer &#8211; Facebook ops team had 20% of 2 guys hadooping, estimate you can use 1 person/100 nodes</p>
<p>He also notes that this preso and maybe more are on <a href="http://www.slideshare.net/jhammerb" target="_blank">SlideShare under &#8220;jhammerb.&#8221;</a></p>
<p><em>I thought this presentation was very complete and bad ass, and I may have some use cases that hadoop would be good for coming up!</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.webadminblog.com/index.php/2009/07/01/velocity-2009-hadoop-operations-managing-big-data-clusters/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Everything You Need To Know About Cloud Security in 30 Minutes or Less</title>
		<link>http://www.webadminblog.com/index.php/2009/06/25/everything-you-need-to-know-about-cloud-security-in-30-minutes-or-less/</link>
		<comments>http://www.webadminblog.com/index.php/2009/06/25/everything-you-need-to-know-about-cloud-security-in-30-minutes-or-less/#comments</comments>
		<pubDate>Thu, 25 Jun 2009 22:05:31 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[computing]]></category>
		<category><![CDATA[mogull]]></category>
		<category><![CDATA[rich]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://www.webadminblog.com/?p=270</guid>
		<description><![CDATA[The last presentation of the day was by Rich Mogull on &#8220;Everything you need to know about cloud security in 30 minutes or less&#8221;.  It all started with all of the presentations and diagrams having pictures of clouds so some guy decides to sell that.  Makes security practitioners sad.
Why the cloud is a problem for [...]]]></description>
			<content:encoded><![CDATA[<p>The last presentation of the day was by Rich Mogull on &#8220;Everything you need to know about cloud security in 30 minutes or less&#8221;.  It all started with all of the presentations and diagrams having pictures of clouds so some guy decides to sell that.  Makes security practitioners sad.</p>
<p><span style="text-decoration: underline;"><strong>Why the cloud is a problem for security</strong></span></p>
<ul>
<li>Poor understanding of cloud taxonomies and definitions</li>
<li>A generic term, frequently misused to refer to anything on the Internet</li>
<li>Lack of visibility into cloud deployments</li>
<li>Organic consumption</li>
</ul>
<p>Couldn&#8217;t have talked about this stuff 6 months ago because nobody knew about it and it wasn&#8217;t discussed.</p>
<p><span style="text-decoration: underline;"><strong>Security Implications</strong></span></p>
<ul>
<li>Variable control</li>
<li>Variable visibility</li>
<li>Variable simplicity/complexity</li>
<li>Variable resources</li>
</ul>
<p>Control, visibility, and resources go down as simplicity and management go up.</p>
<p>Is the cloud more or less secure than we are now?  It depends.  Some things are more secure and some things are less secure because of all of the variability.</p>
<p><span style="text-decoration: underline;"><strong>SaaS</strong></span></p>
<ul>
<li>Most constrained</li>
<li>Most security managed by your provider</li>
<li>Least flexible</li>
</ul>
<p><span style="text-decoration: underline;"><strong>PaaS</strong></span></p>
<ul>
<li>Less constrained</li>
<li>Security varies tremendously based on provider and application</li>
<li>Shared security responsibility</li>
</ul>
<p><span style="text-decoration: underline;"><strong>IaaS</strong></span></p>
<ul>
<li>Most flexible</li>
<li>Most security managed by your developers</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Specific Issues</strong></span></p>
<ul>
<li>Spillage and data security</li>
<li>Reliability/availability</li>
<li>Capability to apply traditional security controls in a dynamic environment</li>
<li>Lack of visibility into cloud usage</li>
<li>Changing development patterns/cycles</li>
</ul>
<p>How do you use your static and dynamic analysis testing tools in the cloud?</p>
<p>Where do you roll your cloud when it fails?</p>
<p><span style="text-decoration: underline;"><strong>Your Top 2 Cloud Security Defenses</strong></span></p>
<ul>
<li>SLA</li>
<li>Contracts</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Understand Your SLAs</strong></span></p>
<ul>
<li>Are there security-specific SLAs?</li>
<li>Can you audit against those SLAs?</li>
<li>Are there contractual penalties for non-compliance?</li>
<li>Do your SLAs meet your risk tolerance requirements?</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Suggested SLAs</strong></span></p>
<ul>
<li>Availability</li>
<li>Security audits &#8211; including third party</li>
<li>Data security/encryption</li>
<li>Personnel security</li>
<li>Security controls (depends on the service)</li>
<li>User account management</li>
<li>Infrastructure changes</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Understand Your Cloud</strong></span></p>
<ul>
<li>What security controls are in your cloud?</li>
<li>How can you manage and integrate with the controls?</li>
<li>What security documentation is available?</li>
<li>What contingency plans are available?</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Cloud Security Controls to Look For</strong></span></p>
<ul>
<li>Data encryption/security (key management)</li>
<li>Perimeter defenses</li>
<li>Auditing/logging</li>
<li>Authentication</li>
<li>Segregation</li>
<li>Compliance</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Cloud Security Macro Layers</strong></span></p>
<ul>
<li>Network</li>
<li>Service</li>
<li>User</li>
<li>Transaction</li>
<li>Data</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Don&#8217;t Trust</strong></span></p>
<ul>
<li>SAS70 Audits</li>
<li>Documentation without verification</li>
<li>Non-contractual SLAs</li>
</ul>
<p><span style="text-decoration: underline;"><strong>What to Do</strong></span></p>
<ul>
<li>Educate yourself</li>
<li>Engage with developers</li>
<li>Develop cloud security requirements</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.webadminblog.com/index.php/2009/06/25/everything-you-need-to-know-about-cloud-security-in-30-minutes-or-less/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cloud Computing Panel Discussion</title>
		<link>http://www.webadminblog.com/index.php/2009/06/25/cloud-computing-panel-discussion/</link>
		<comments>http://www.webadminblog.com/index.php/2009/06/25/cloud-computing-panel-discussion/#comments</comments>
		<pubDate>Thu, 25 Jun 2009 21:31:43 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[computing]]></category>
		<category><![CDATA[dell]]></category>
		<category><![CDATA[ibm]]></category>
		<category><![CDATA[jim]]></category>
		<category><![CDATA[josh]]></category>
		<category><![CDATA[mogull]]></category>
		<category><![CDATA[panel]]></category>
		<category><![CDATA[rackspace]]></category>
		<category><![CDATA[rich]]></category>
		<category><![CDATA[rymarczk]]></category>
		<category><![CDATA[securosis]]></category>
		<category><![CDATA[zachary]]></category>

		<guid isPermaLink="false">http://www.webadminblog.com/?p=268</guid>
		<description><![CDATA[Next up at the Cloud Computing and Virtualization Security half-day seminar was a Cloud Computing Panel moderated by Rich Mogull (Analyst/CEO at Securosis) with Josh Zachary (Rackspace), Jim Rymarczk (IBM), and Phil Agcaoili (Dell) participating in the panel.  My notes from the panel discussion are below:
Phil: Little difference between outsources of the past and today&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>Next up at the Cloud Computing and Virtualization Security half-day seminar was a Cloud Computing Panel moderated by Rich Mogull (Analyst/CEO at Securosis) with Josh Zachary (Rackspace), Jim Rymarczk (IBM), and Phil Agcaoili (Dell) participating in the panel.  My notes from the panel discussion are below:</p>
<p>Phil: Little difference between outsourcers of the past and today&#8217;s Cloud Computing.  All of that stuff is sitting outside of your environment and we&#8217;ve been evolving toward that for a long time.</p>
<p>Rich: My impression is that there are benefits to outsourced hosting, but there are clearly areas that make sense and areas that don&#8217;t.  This is fundamentally different from shared computing resources.  Very different applications for this.  Complexity goes up very quickly for security controls.  Where do you see the most value today?  Where do people need to be most cautious?</p>
<p>Jim: Internal virtualization is almost necessary, but it impacts almost every IT process.  Technology is still evolving and is far from advanced state.  Be pragmatic and find particular applications with a good ROI.</p>
<p>Josh: Understand what you are putting into a cloud environment.  Have a good understanding of what a provider can offer you in terms of sensitive data.  Otherwise you&#8217;re putting yourself in a very bad situation.  A lot of promise.  Great for social networking and web development.  Not appropriate with enterprises with large amounts of IP and sensitive data.</p>
<p>Jim: We&#8217;ll get there in 4-5 years.</p>
<p>Phil: Let supply chain experts do it for you and then interact with them.  Access their environment from anywhere.  Use a secure URL with a federated identity.  Your business will come back to you and say &#8220;We need to do this&#8221; and IT will be unable to assist them.  Use it as an opportunity to mobilize compliance and InfoSec and get involved.  It&#8217;s going to come to us and we&#8217;re just going to have to deal with it.  There&#8217;s a long line of people with a &#8220;right to audit&#8221;.  Don&#8217;t think that someone is doing the right thing in this space, you have to ask.</p>
<p>Audience: What is the most likely channel for standards?</p>
<p>Phil: Cloud Security Alliance is a step in the right direction.  Want to come up with PCI DSS like checklists.  CSA is working with IEEE and NIST to work along with them.  Goal is to be able to feed the standards process, not become a standards body.</p>
<p>Rich: The market is anti-standards based.  If we get standardized, then all of the providers are only competing based on cost.</p>
<p>Jim: I think it&#8217;ll happen.  We will see ISO groups for standards on cloud quality.</p>
<p>Audience: Moving data between multiple clouds.  How do you determine who gets paid?</p>
<p>Jim: There are proposals for doing that.  All of the resource parameters.</p>
<p>Phil: Should see standards based on federated identity.  Who is doing what and where.  That&#8217;s where I&#8217;ve seen the most movement.  There is no ISO for SaaS.  Remapping how 27001 and 27002 apply to us as a software provider.</p>
<p>Audience: Two things that drive standards.  The market or monopoly (BetaMax).</p>
<p>Rich: We will have monopolistic ones and then 3rd parties that say they use those standards.</p>
<p>Audience: How can you really have an objective body create standards without being completely embedded in the technology?</p>
<p>Jim: You create a reference standard and the market drives that.</p>
<p>Phil: Gravity pulls us to things that work.  Uses SAML as an example.  It&#8217;s the way the internet has always worked.  The strongest will survive and the right standards will manifest themselves.</p>
<p>Rich: What are some of things that you&#8217;re dealing with internally (as consumers and providers) and the top suggestions for people stuck in this situation?</p>
<p>Jim: People who don&#8217;t have all of the requirements do public clouds.  If what you want is available (salesforce.com), it may be irresistible.</p>
<p>Josh: Solution needs to be appropriate to the need.  Consult with your attorney to make sure your contract is in line with what you&#8217;re leveraging the provider for.  It&#8217;s really about what you agree to with that provider and their responsibilities.</p>
<p>Phil: The hurricane is coming.  You can&#8217;t scream into the wind, you gotta learn to run for cover.  Find the safe spot.</p>
<p>Audience: What industries do you see using this?  I don&#8217;t see it with healthcare.</p>
<p>Phil: Mostly providers for us.  Outsourcing service desks.  Government.  Large states/local.</p>
<p>Josh: Small and medium retail businesses.  Get products out there at a significantly reduced cost.</p>
<p>Jim: Lots of financial institutions looking for ways to cut costs.  Healthcare industry as well (Mayo Clinic).  Broad interest across the whole market, but especially anywhere they&#8217;re under extreme cost measures.</p>
<p>Rich: I run a small business that picked an elastic provider that couldn&#8217;t pay for a full virtual hosting provider.  Doing shared hosting right now, but capable of growing to a virtual private server.  Have redundancy.  Able to go full-colocation if they need it.  Able to support growth, but start with the same instance to get there.</p>
<p>Audience: How does 3rd party transparency factor into financial uses?</p>
<p>Jim: Almost exclusively private clouds.  There are use cases playing out right now that will be repeatable patterns.  Use cases.</p>
<p>Phil: When the volume isn&#8217;t there, offload to someone like Rackspace and they&#8217;ll help you to grow.</p>
<p>Audience: Are there guidelines to contracts to make sure information doesn&#8217;t just get outsourced to yet another party?</p>
<p>Phil: Steal the contracts of your largest partners/vendors.  Use them as templates.</p>
<p>Audience: What recourse do you have that an audit is used to verify that security is not an issue?</p>
<p>Rich: Contracts.</p>
<p>Phil: Third party assessment (i.e. the right to audit).  It&#8217;s in our interest to verify they are secure.  It&#8217;s a trend and we now have a long list of people looking to audit against us as a provider.  Hoping for an ISO standard to come up truly for the cloud.</p>
<p>Audience: Is cloud computing just outsourcing?</p>
<p>Rich: It&#8217;s more than that.  For example, companies have internal clouds that aren&#8217;t outsourced at all.</p>
<p>Josh: Most of the time it&#8217;s leveraging resources more efficiently at hopefully a reduced cost.</p>
<p>Audience: How do I know you&#8217;re telling me the truth about the resources I&#8217;m using?  What if I&#8217;m a bad guy who wants to exploit a competitor using the cloud?</p>
<p>Josh: We&#8217;ve seen guys create botnets using stolen credit cards.  What you&#8217;re billed for is in your contract.</p>
<p>Jim: We&#8217;ve had this solved for decades on mainframes.  Precious resources propagated amongst users.  There&#8217;s no technical reason we&#8217;re not doing it today.</p>
<p>Rich: It depends what type of cloud you&#8217;re using.  Some will tell you.</p>
<p>Josh: If you&#8217;re worried about someone abusing you, why are you there in the first place?</p>
<p>Phil: For our service desk we meter this by how many calls, by location.  Monitor servers that were accessed/patched/etc.  Different service providers will have different levels.</p>
<p>Audience: Seeing some core issues at the heart of this.  For businesses, an assessment of core competencies.  Can you build a better data center with the cloud?  Second issue involves risk assessment.  Can you do a technical audit?  Can you pay for it legally?  How much market presence does the vendor have?  Who has responsibility for what?  Notion of transparency of control.  Seems like it distills down to those core basics.</p>
<p>Jim: I agree.</p>
<p>Rich: Well said.</p>
<p>Phil: Yes, yes, yes.</p>
<p>Audience: How do you write a contract for failed nation states, volatility, etc?  Do we say you can&#8217;t put our stuff in these countries?</p>
<p>Phil: This is the white elephant in the room.  How can you ensure that my data is being protected the way I&#8217;d protect it myself?  It&#8217;s amazing what other people do when they get a hold of that stuff.  This is the underlying problem that we have to solve.  &#8220;Moving from a single-family home to a multi-tenant condo.&#8221;  How do we build that now?</p>
<p>Rich: You need to be comfortable with what you&#8217;re putting out there.</p>
<p>Audience: To what extent is the military or federal government using cloud computing?</p>
<p>Jim: They&#8217;re interested in finding ways, but they don&#8217;t talk about how they&#8217;re using it.</p>
<p>Audience &#8211; Vern: They&#8217;re doing cloud computing using an internal private cloud already.  They bill back to the appropriate agency based on use.</p>
<p>Phil: Government is very wary of what&#8217;s going on.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.webadminblog.com/index.php/2009/06/25/cloud-computing-panel-discussion/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Virtualization Security Best Practices from a Customer’s and Vendor’s Perspective</title>
		<link>http://www.webadminblog.com/index.php/2009/06/25/virtualization-security-best-practices-from-a-customers-and-vendors-perspective/</link>
		<comments>http://www.webadminblog.com/index.php/2009/06/25/virtualization-security-best-practices-from-a-customers-and-vendors-perspective/#comments</comments>
		<pubDate>Thu, 25 Jun 2009 20:04:20 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[Virtualization]]></category>
		<category><![CDATA[best]]></category>
		<category><![CDATA[brian]]></category>
		<category><![CDATA[collapsed]]></category>
		<category><![CDATA[customer]]></category>
		<category><![CDATA[engle]]></category>
		<category><![CDATA[inland]]></category>
		<category><![CDATA[perspective]]></category>
		<category><![CDATA[practices]]></category>
		<category><![CDATA[randell]]></category>
		<category><![CDATA[rob]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[separation]]></category>
		<category><![CDATA[temple]]></category>
		<category><![CDATA[trust]]></category>
		<category><![CDATA[vendor]]></category>
		<category><![CDATA[vmware]]></category>
		<category><![CDATA[zone]]></category>

		<guid isPermaLink="false">http://www.webadminblog.com/?p=266</guid>
		<description><![CDATA[The next session during the ISSA half-day seminar on Virtualization and Cloud Computing Security was on security best practices from a customer and vendor perspective.  It featured Brian Engle, CIO of Temple Inland, and Rob Randell, CISSP and Senior Security Specialist at VMware, Inc.  My notes from the presentation are below:
Temple Inland Implementation &#8211; Stage [...]]]></description>
			<content:encoded><![CDATA[<p>The next session during the ISSA half-day seminar on Virtualization and Cloud Computing Security was on security best practices from a customer and vendor perspective.  It featured Brian Engle, CIO of Temple Inland, and Rob Randell, CISSP and Senior Security Specialist at VMware, Inc.  My notes from the presentation are below:</p>
<p><span style="text-decoration: underline;"><strong>Temple Inland Implementation &#8211; Stage 1</strong></span></p>
<p>Overcome Hurdles</p>
<ul>
<li>Management skeptical of Windows virtualization</li>
</ul>
<p>Don&#8217;t Fear the Virtual World</p>
<ul>
<li>First year:
<ul>
<li>Built out development only environment</li>
<li>Trained staff</li>
<li>Developed support processes</li>
<li>Showed hard dollar savings</li>
</ul>
</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Temple Inland &#8211; Stage 2</strong></span></p>
<ul>
<li>Build QA environment</li>
<li>Improve processes</li>
<li>Develop rapid provisioning</li>
<li>Demonstrate advanced functions
<ul>
<li>VMotion</li>
<li>P2V Conversions</li>
</ul>
</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Temple Inland &#8211; Stage 3</strong></span></p>
<p>First production environment</p>
<p>Temple-Inland Implementation</p>
<ul>
<li>Prior to VMware: typical remote facility
<ul>
<li>Physical domain controller</li>
<li>Physical application/file server</li>
<li>Physical tape drive</li>
</ul>
</li>
<li>New architecture
<ul>
<li>Single VMware server</li>
<li>No tape drive</li>
</ul>
</li>
</ul>
<ul>
<li>Desktops
<ul>
<li>Virtualize desktops through VMware</li>
<li>No application issues like Citrix Metaframe</li>
<li>Quick deployment and repair</li>
</ul>
</li>
</ul>
<p><span style="text-decoration: underline;"><strong>How Virtualization Affects Datacenter Security</strong></span></p>
<ul>
<li>Abstraction and Consolidation
<ul>
<li>+Capital and Operational Cost Savings</li>
<li>-New infrastructure layer to be secured</li>
<li>-Greater impact of attack or misconfiguration</li>
</ul>
</li>
<li>Collapse of Switches and servers into one device
<ul>
<li>+Flexibility</li>
<li>+Cost-savings</li>
<li>-Lack of virtual network visibility</li>
<li>-No separation-by-default of administration</li>
</ul>
</li>
</ul>
<p>Temple-Inland split the teams so that there was a virtual network administration team within the server administration team.</p>
<p><span style="text-decoration: underline;"><strong>How Virtualization Affects Datacenter Security</strong></span></p>
<ul>
<li>Faster deployment of servers
<ul>
<li>+ IT responsiveness</li>
<li>-Lack of adequate planning</li>
<li>-Incomplete knowledge of current state of infrastructure</li>
</ul>
</li>
<li>VM Mobility
<ul>
<li>+Improved Service Levels</li>
<li>-Identity divorced from physical location</li>
</ul>
</li>
<li>VM Encapsulation
<ul>
<li>+Ease of business continuity</li>
<li>+Consistency of deployment</li>
<li>+Hardware Independence</li>
<li>-Outdated offline systems</li>
</ul>
</li>
</ul>
<p>Build anti-virus, client firewalls, etc into the offline images so that servers are up-to-date right when they are installed.</p>
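<p>Keeping offline images current starts with knowing which ones have gone stale.  The following is a minimal Python sketch that flags images not patched within a chosen window; the image names, dates, and the 45-day window are all hypothetical values chosen for illustration.</p>

```python
from datetime import date

# Hypothetical inventory: image name -> date the offline image was last patched.
OFFLINE_IMAGES = {
    "web-gold": date(2009, 6, 1),
    "db-gold": date(2009, 2, 15),
    "desktop-gold": date(2008, 11, 3),
}

MAX_AGE_DAYS = 45  # assumed patch window; choose one that matches your policy


def stale_images(images: dict, today: date, max_age_days: int = MAX_AGE_DAYS) -> list:
    """Return the sorted names of images last patched more than max_age_days ago."""
    return sorted(
        name
        for name, last_patched in images.items()
        if (today - last_patched).days > max_age_days
    )


print(stale_images(OFFLINE_IMAGES, today=date(2009, 6, 25)))
```

<p>A report like this could feed whatever process you use to power on, patch, and re-seal the images before they are deployed.</p>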
<p>If something happens to a system, you can&#8217;t just pull the plug anymore.  You have to have policies and processes in place.</p>
<p>With virtualization you can have a true &#8220;gold image&#8221; instead of having different images for all of the different types of hardware.</p>
<p><span style="text-decoration: underline;"><strong>Security Advantages of Virtualization</strong></span></p>
<ul>
<li>Allows automation of many manual, error-prone processes</li>
<li>Cleaner and easier disaster recovery/business continuity</li>
<li>Better forensics capabilities</li>
<li>Faster recovery after an attack</li>
<li>Patching is safer and more effective</li>
<li>Better control over desktop resources</li>
<li>More cost effective security devices</li>
<li>App virtualization allows de-privileging of end users</li>
<li>Better lifecycle controls</li>
<li>Future: Security through VM Introspection</li>
</ul>
<p>Gartner: &#8220;Like their physical counterparts, most security vulnerabilities will be introduced through misconfiguration&#8221;</p>
<p><span style="text-decoration: underline;"><strong>What Not to Worry About</strong></span></p>
<ul>
<li>Hypervisor Attacks
<ul>
<li>ALL theoretical, highly complex attacks</li>
<li>Widely recognized by security community as being only of academic interest</li>
</ul>
</li>
<li>Irrelevant Architectures
<ul>
<li>Apply only to hosted architecture (i.e. Workstation) not bare-metal (i.e. ESX)</li>
<li>Hosted architecture generally suitable only when you can trust the guest VM</li>
</ul>
</li>
<li>Contrived Scenarios
<ul>
<li>Involve exploits where best practices around hardening, lockdown, design, etc. for virtualization are not followed, or</li>
<li>Poor general IT infrastructure security is assumed</li>
</ul>
</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Are there any Hypervisor Attack Vectors?</strong></span></p>
<p>There are no known hypervisor attack vectors to date that have led to &#8220;VM Escape&#8221;</p>
<ul>
<li>Architecture Vulnerability
<ul>
<li>Designed specifically with isolation in mind</li>
</ul>
</li>
<li>Software Vulnerability &#8211; Possible like with any code written by humans
<ul>
<li>Mitigating Circumstances:
<ul>
<li>Small Code Footprint of Hypervisor (~21MB) is easier to audit</li>
<li>If a software vulnerability is found, exploit difficulty will be very high
<ul>
<li>Purpose-built for virtualization only</li>
<li>Non-interactive environment</li>
<li>Less code for hackers to leverage</li>
</ul>
</li>
</ul>
</li>
<li>Ultimately depends on VMware security response and patching</li>
</ul>
</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Concern: Virtualizing the DMZ/Mixing Trust Zones</strong></span></p>
<p>Three Primary Configurations</p>
<ul>
<li>Physical separation of trust zones</li>
<li>Virtual separation of trust zones with physical security devices</li>
<li>Fully collapsing all servers and security devices into a VI3 infrastructure</li>
</ul>
<p>Also applies to PCI requirement</p>
<p><span style="text-decoration: underline;"><strong>Physical Separation of Trust Zones</strong></span></p>
<p>Advantages</p>
<ul>
<li>Simpler, less complex configuration</li>
<li>Less change to physical environment</li>
<li>Little change to separation of duties</li>
<li>Less change in staff knowledge requirements</li>
<li>Smaller chance of misconfiguration</li>
</ul>
<p>Disadvantages</p>
<ul>
<li>Lower consolidation and utilization of resources</li>
<li>Higher cost</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Virtual Separation of Trust Zones with Physical Security Devices</strong></span></p>
<p>Advantages</p>
<ul>
<li>Better utilization of resources</li>
<li>Take full advantage of virtualization benefits</li>
<li>Lower cost</li>
</ul>
<p>Disadvantages (can be mitigated)</p>
<ul>
<li>More complexity</li>
<li>Greater chance of misconfiguration</li>
</ul>
<p>Getting more toward &#8220;the cloud&#8221; where web zone, app zone, and DB zone are all virtualized on the same system, but still using physical firewalls.</p>
<p><span style="text-decoration: underline;"><strong>Fully Collapsed Trust Zones Including Security Devices</strong></span></p>
<p>Advantages</p>
<ul>
<li>Full utilization of resources, replacing physical security devices with virtual</li>
<li>Lowest-cost option</li>
<li>Management of entire DMZ and network from a single management workstation</li>
</ul>
<p>Disadvantages</p>
<ul>
<li>Greatest complexity, which in turn creates highest chance of misconfiguration</li>
<li>Requirement for explicit configuration to define separation of duties to help mitigate risk of misconfiguration; also requires regular audits of configurations</li>
<li>Potential loss of certain functionality, such as VMotion (being mitigated by vendors and VMsafe)</li>
</ul>
<p><span style="text-decoration: underline;"><strong>How do we secure our Virtual Infrastructure?</strong></span></p>
<p>Use the principles of Information Security</p>
<ul>
<li>Hardening and lockdown</li>
<li>Defense in depth</li>
<li>Authorization, authentication, and accounting</li>
<li>Separation of duties and least privileges</li>
<li>Administrative controls</li>
</ul>
<p>Protect your management interfaces (vCenter)!  They are the keys to the kingdom.</p>
<p><span style="text-decoration: underline;"><strong>Fundamental Design Principles</strong></span></p>
<ul>
<li>Isolate all management networks</li>
<li>Disable all unneeded services</li>
<li>Tightly regulate all administrative access</li>
</ul>
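<p>Management network isolation is easy to let drift, so it is worth auditing.  Here is a minimal Python sketch using the standard <code>ipaddress</code> module to flag management interfaces that sit outside a dedicated management subnet; the subnet, interface names, and addresses are assumptions invented for the example.</p>

```python
import ipaddress

# Assumed, illustrative values: a dedicated management subnet and an inventory
# of management interfaces. None of these come from the presentation.
MANAGEMENT_NET = ipaddress.ip_network("10.10.0.0/24")
MANAGEMENT_INTERFACES = {
    "vcenter": "10.10.0.5",
    "esx-01-console": "10.10.0.21",
    "esx-02-console": "192.168.1.40",  # deliberately misplaced for the demo
}


def outside_management_net(interfaces: dict, network) -> list:
    """Return the sorted names of interfaces whose address is not in the subnet."""
    return sorted(
        name
        for name, address in interfaces.items()
        if ipaddress.ip_address(address) not in network
    )


print(outside_management_net(MANAGEMENT_INTERFACES, MANAGEMENT_NET))
```

<p>Anything the check reports is a management interface reachable from a network that was not meant to carry management traffic.</p>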
<p><span style="text-decoration: underline;"><strong>Summary</strong></span></p>
<ul>
<li>Define requirements and ensure vendor/product can deliver
<ul>
<li>Consider culture, capability, maturity, architecture and security needs</li>
</ul>
</li>
<li>Implement under controlled conditions using a defined methodology
<ul>
<li>Use the opportunity to improve control deficiencies in existing physical server areas if possible</li>
<li>Implement processes for review and validation of controls to prevent the introduction of weaknesses</li>
</ul>
</li>
<li>Round corners where your control environment allows
<ul>
<li>Sustain sound practices that maintain required controls</li>
<li>Leverage the technology to achieve efficiency and improve scale</li>
</ul>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.webadminblog.com/index.php/2009/06/25/virtualization-security-best-practices-from-a-customers-and-vendors-perspective/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>About the Cloud Security Alliance</title>
		<link>http://www.webadminblog.com/index.php/2009/06/25/about-the-cloud-security-alliance/</link>
		<comments>http://www.webadminblog.com/index.php/2009/06/25/about-the-cloud-security-alliance/#comments</comments>
		<pubDate>Thu, 25 Jun 2009 19:03:41 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Virtualization]]></category>
		<category><![CDATA[about]]></category>
		<category><![CDATA[alliance]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[framework]]></category>
		<category><![CDATA[guidance]]></category>
		<category><![CDATA[jeff]]></category>
		<category><![CDATA[membership]]></category>
		<category><![CDATA[objectives]]></category>
		<category><![CDATA[reich]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://www.webadminblog.com/?p=261</guid>
		<description><![CDATA[The next presentation at the ISSA half-day seminar was on the &#8220;Cloud Security Alliance&#8221; and Security Guidance for Critical Areas of Focus in Cloud Computing by Jeff Reich.  Here are my notes from this presentation:
Agenda

About the Cloud Security Alliance
Getting Involved
Guidance 1.0
Call to Action

About the Cloud Security Alliance

Not-for-profit organization
Inclusive membership, supporting broad spectrum of subject matter [...]]]></description>
			<content:encoded><![CDATA[<p>The next presentation at the ISSA half-day seminar was on the &#8220;Cloud Security Alliance&#8221; and Security Guidance for Critical Areas of Focus in Cloud Computing by Jeff Reich.  Here are my notes from this presentation:</p>
<p><span style="text-decoration: underline;"><strong>Agenda</strong></span></p>
<ul>
<li>About the Cloud Security Alliance</li>
<li>Getting Involved</li>
<li>Guidance 1.0</li>
<li>Call to Action</li>
</ul>
<p><span style="text-decoration: underline;"><strong>About the Cloud Security Alliance</strong></span></p>
<ul>
<li>Not-for-profit organization</li>
<li>Inclusive membership, supporting broad spectrum of subject matter expertise: cloud experts, security, legal, compliance, virtualization, etc</li>
<li>We believe in Cloud Computing, we want to make it better</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Getting Involved</strong></span></p>
<ul>
<li>Individual membership (free)
<ul>
<li>Subject matter experts for research</li>
<li>Interested in learning about the topic</li>
<li>Administrative &amp; organizational help</li>
</ul>
</li>
<li>Corporate Sponsorship
<ul>
<li>Help fund outreach, events</li>
</ul>
</li>
<li>Affiliated Organizations (free)
<ul>
<li>Joint projects in the community interest</li>
</ul>
</li>
<li>Contact information on website</li>
</ul>
<p>Download version 1.0 of the Security Guidance at http://www.cloudsecurityalliance.org/guidance</p>
<p><span style="text-decoration: underline;"><strong>Overview of Guidance</strong></span></p>
<ul>
<li>15 domains</li>
<li>#1 is Architecture &amp; Framework</li>
<li>Covers Governing in the Cloud (2-7) and Operating in the Cloud (8-15) as well</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Assumptions &amp; Objectives</strong></span></p>
<ul>
<li>Trying to bridge gap between cloud adopters and security practitioners</li>
<li>Broad &#8220;security program&#8221; view of the problem</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Architecture Framework</strong></span></p>
<ul>
<li>Not &#8220;One Cloud&#8221;: Nuanced definition critical to understanding risks &amp; mitigation</li>
<li>5 principal characteristics (abstraction, sharing, SOA, elasticity, consumption/allocation)</li>
<li>3 delivery models
<ul>
<li>Infrastructure as a Service</li>
<li>Platform as a Service</li>
<li>Software as a Service</li>
</ul>
</li>
<li>4 deployment models: Public, Private, Managed, Hybrid</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Governance &amp; ERM</strong></span></p>
<ul>
<li>A portion of cloud cost savings must be invested into provider security</li>
<li>Third party transparency of cloud provider</li>
<li>Financial viability of cloud provider</li>
<li>Alignment of key performance indicators</li>
<li>PII is best suited to a private/hybrid cloud unless significant due diligence has been done on the public cloud provider</li>
<li>Increased frequency of 3rd party risk assessments</li>
</ul>
<p>Important thing to consider is the financial viability of your provider.  You never want to have your data held hostage in a court battle.</p>
<p><span style="text-decoration: underline;"><strong>Legal</strong></span></p>
<ul>
<li>Contracts must have flexible structure for dynamic cloud relationships</li>
<li>Plan for both an expected and unexpected termination of the relationship and an orderly return of your assets</li>
<li>Find conflicts between the laws the cloud provider must comply with and those governing the cloud customer</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Compliance &amp; Audit</strong></span></p>
<ul>
<li>Classify data and systems to understand compliance requirements</li>
<li>Understand data locations, copies</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Information Lifecycle Management</strong></span></p>
<ul>
<li>Understand the logical segregation of information and protective controls implemented in storage, transfers, backups</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Summary</strong></span></p>
<ul>
<li>Cloud Computing is real and transformational</li>
<li>Cloud Computing can and will be secured</li>
<li>Broad governance approach needed</li>
<li>Tactical fixes needed</li>
<li>Combination of updating existing best practices and creating completely new best practices</li>
<li>Common sense is not optional</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Call to Action</strong></span></p>
<ul>
<li>Join us, help make our work better</li>
<li>www.cloudsecurityalliance.org</li>
<li>info@cloudsecurityalliance.org</li>
<li>Twitter: @cloudsa, #csaguide</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.webadminblog.com/index.php/2009/06/25/about-the-cloud-security-alliance/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introduction to Cloud Computing and Virtualization Security</title>
		<link>http://www.webadminblog.com/index.php/2009/06/25/introduction-to-cloud-computing-and-virtualization-security/</link>
		<comments>http://www.webadminblog.com/index.php/2009/06/25/introduction-to-cloud-computing-and-virtualization-security/#comments</comments>
		<pubDate>Thu, 25 Jun 2009 18:31:58 +0000</pubDate>
		<dc:creator>Josh</dc:creator>
				<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[Virtualization]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[computing]]></category>
		<category><![CDATA[issa]]></category>
		<category><![CDATA[vern]]></category>
		<category><![CDATA[williams]]></category>

		<guid isPermaLink="false">http://www.webadminblog.com/?p=257</guid>
		<description><![CDATA[Today the Austin ISSA and ISACA chapters held a half-day seminar on Cloud Computing and Virtualization Security.  The introduction on cloud computing was given by Vern Williams.  My notes on this topic are below:
5 Key Cloud Characteristics

On-demand self-service
Ubiquitous network access
Location independent resource pooling
Rapid elasticity
Pay per use

3 Cloud Delivery Models

Software as a Service (SaaS): Providers applications [...]]]></description>
			<content:encoded><![CDATA[<p>Today the Austin ISSA and ISACA chapters held a half-day seminar on Cloud Computing and Virtualization Security.  The introduction on cloud computing was given by Vern Williams.  My notes on this topic are below:</p>
<p><span style="text-decoration: underline;"><strong>5 Key Cloud Characteristics</strong></span></p>
<ul>
<li>On-demand self-service</li>
<li>Ubiquitous network access</li>
<li>Location independent resource pooling</li>
<li>Rapid elasticity</li>
<li>Pay per use</li>
</ul>
<p><span style="text-decoration: underline;"><strong>3 Cloud Delivery Models</strong></span></p>
<ul>
<li>Software as a Service (SaaS): Provider&#8217;s applications over a network</li>
<li>Platform as a Service (PaaS): Deploy customer-created apps to a cloud</li>
<li>Infrastructure as a Service (IaaS): Rent processing, storage, etc</li>
</ul>
<p><span style="text-decoration: underline;"><strong>4 Cloud Deployment Models</strong></span></p>
<ul>
<li>Private cloud: Enterprise owned or leased</li>
<li>Community cloud: Shared infrastructure for a specific community</li>
<li>Public cloud: Sold to the public, Mega-scale infrastructure</li>
<li>Hybrid cloud: Composition of two or more clouds</li>
</ul>
<ul>
<li>Two types: internal and external</li>
<li>http://csrc.nist.gov/groups/SNS/cloud-computing/index.html</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Common Cloud Characteristics</strong></span></p>
<ul>
<li>Massive scale</li>
<li>Virtualization</li>
<li>Free software</li>
<li>Autonomic computing</li>
<li>Multi-tenancy</li>
<li>Geographically distributed systems</li>
<li>Advanced security technologies</li>
<li>Service oriented software</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Pros</strong></span></p>
<ul>
<li>Lower central processing unit (CPU) density</li>
<li>Flexible use of resources</li>
<li>Rapid deployment of new servers</li>
<li>Simplified recovery</li>
<li>Virtual network connections</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Cons</strong></span></p>
<ul>
<li>Complexity</li>
<li>Potential impact of a single component failure</li>
<li>Hypervisor security issues</li>
<li>Keeping virtual machine (VM) images current</li>
<li>Virtual network connections</li>
</ul>
<p><span style="text-decoration: underline;"><strong>Virtualization Security Concerns</strong></span></p>
<ul>
<li>Protecting the virtual fabric</li>
<li>Patching off-line VM images</li>
<li>Configuration Management</li>
<li>Firewall configurations</li>
<li>Complicating Audit and Forensics</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.webadminblog.com/index.php/2009/06/25/introduction-to-cloud-computing-and-virtualization-security/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Velocity 2009 – Introduction to Managed Infrastructure with Puppet</title>
		<link>http://www.webadminblog.com/index.php/2009/06/24/velocity-2009-introduction-to-managed-infrastructure-with-puppet/</link>
		<comments>http://www.webadminblog.com/index.php/2009/06/24/velocity-2009-introduction-to-managed-infrastructure-with-puppet/#comments</comments>
		<pubDate>Wed, 24 Jun 2009 18:31:40 +0000</pubDate>
		<dc:creator>Ernest</dc:creator>
				<category><![CDATA[Automation]]></category>
		<category><![CDATA[Conferences]]></category>
		<category><![CDATA[Velocity 2009]]></category>
		<category><![CDATA[puppet]]></category>
		<category><![CDATA[velocity]]></category>
		<category><![CDATA[velocityconf]]></category>
		<category><![CDATA[velocityconf09]]></category>

		<guid isPermaLink="false">http://www.webadminblog.com/?p=250</guid>
		<description><![CDATA[Introduction to Managed Infrastructure with Puppet
by Luke Kanies, Reductive Labs
You can get the work files from git://github.com/reductivelabs/velocity_puppet_workshop_2009.git, and the presentation&#8217;s available here.
I saw Luke&#8217;s Puppet talk last year at Velocity 2008, but am more ready to start uptaking some conf management back home.  Our UNIX admins use cfengine, and puppet is supposed to be a [...]]]></description>
			<content:encoded><![CDATA[<p>Introduction to Managed Infrastructure with <a href="http://reductivelabs.com/products/puppet/" target="_blank">Puppet</a><br />
by <a href="http://madstop.com/" target="_blank">Luke Kanies</a>, <a href="http://reductivelabs.com/" target="_blank">Reductive Labs</a></p>
<p>You can get the work files from git://github.com/reductivelabs/velocity_puppet_workshop_2009.git, and the <a href="http://reductivelabs.com/downloads/presentations/velocity_puppet_workshop_2009/project.html" target="_blank">presentation&#8217;s available here</a>.</p>
<p><em>I saw Luke&#8217;s Puppet talk <a href="http://www.webadminblog.com/index.php/2008/06/24/the-velocity-2008-conference-experience-part-vii/" target="_blank">last year at Velocity 2008</a>, but am more ready to start uptaking some conf management back home.  Our UNIX admins use cfengine, and puppet is supposed to be a better-newer cfengine.  Now there&#8217;s also an (allegedly) better-newer one called chef I read about lately.  So this should be interesting in helping to orient me to the space.  At lunch, we sat with Luke and found out that Reductive just got their second-round funding and were quite happy, though they got nervous and prickly when there was too much discussion of whether they were all buying <a href="http://www.teslamotors.com/" target="_blank">Teslas</a> now.  Congrats Reductive!</em></p>
<p>Now, to work along, you git the bundle and use it with puppet.  <em>Luke assumes we all have laptops, all have git installed on our laptops, and know how to sync his bundle of goodness down.  And have puppet or can quickly install it.  Bah.  I reckon I&#8217;ll just follow along.</em></p>
<p>You can get puppet support via IRC, or the puppet-users google group.</p>
<p>First we exercise &#8220;ralsh&#8221;, the resource abstraction layer shell, which can interact with resources like packages, hosts, and users.  Check em, add em, modify em.</p>
<p>You define abstraction packages.  Like &#8220;ssh means ssh on debian, openssh on solaris&#8230;&#8221;  It requires less redundancy of config than cfengine.</p>
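<p>A minimal sketch of that idea (not from Luke&#8217;s slides), assuming the <code>$operatingsystem</code> fact that Facter exposed in 2009-era puppet.  The package names are just the ones from the example above:</p>
<pre># Pick the platform's package name once...
$ssh_package = $operatingsystem ? {
  "Solaris" =&gt; "openssh",
  default   =&gt; "ssh",
}

# ...then declare the resource; puppet uses whatever package provider
# is the default on that platform.
package { $ssh_package:
  ensure =&gt; installed,
}</pre>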
<p>&#8220;puppet&#8221;  consists of several executables &#8211; puppet, ralsh, puppetd, puppetmasterd, and puppetca.</p>
<p>As an aside, <a href="http://cft.et.redhat.com/" target="_blank">cft</a> is a neat config file snapshot thing in red hat.</p>
<p>Anyway, you should use puppet rather than ralsh directly, though the syntax is similar.  Here&#8217;s an example invocation:</p>
<pre>puppet -e 'file { "/tmp/eh": ensure =&gt; present }'</pre>
<p>There&#8217;s a file backup, or &#8220;bucket&#8221;, functionality when you change/delete files.</p>
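<p>The usual way to wire that up looks roughly like the sketch below.  The bucket name <code>main</code>, the server name <code>puppet</code>, and the file contents are assumptions for illustration; substitute your own:</p>
<pre># Define a filebucket that lives on the puppet server...
filebucket { "main":
  server =&gt; "puppet",
}

# ...and make it the default backup target for every file resource.
File { backup =&gt; "main" }

# The old contents of this file go to the bucket before being replaced.
file { "/tmp/eh":
  content =&gt; "new contents\n",
}</pre>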
<p>You make a repository and can either distribute it or run it all from a server.</p>
<p>There is reporting.</p>
<p>There&#8217;s a <a href="http://github.com/albanpeignier/gepetto/tree/master" target="_blank">gepetto</a> addon that helps you build a central repo.</p>
<p>A repo has (or should have) modules, which are basically functional groupings.  Modules have &#8220;code.&#8221;  The code can be a class definition.  init.pp is the top/special one.   There&#8217;s a modulepath setting for puppet.  Load the file, include the class, it runs all the stuff in the class.</p>
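<p>As a rough sketch of that layout, assuming a hypothetical module called <code>ssh</code> sitting in a directory on the modulepath:</p>
<pre># modules/ssh/manifests/init.pp (hypothetical module)
class ssh {
  package { "ssh":
    ensure =&gt; installed,
  }

  # Only manage the service once the package is in place.
  service { "ssh":
    ensure  =&gt; running,
    require =&gt; Package["ssh"],
  }
}</pre>
<p>Any manifest can then pull the class in by name, and puppet locates init.pp through the modulepath:</p>
<pre>include ssh</pre>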
<p>It has &#8220;nodes&#8221; but he scoffs at them.  Put them in manifests/site.pp.  A node is either default or hostname-specific (and a hostname-specific one can inherit default).  But he says you should use a different application, not puppet, to do this.</p>
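<p>For the record, node definitions look something like this sketch.  The hostname and the class names are made up for illustration, and node inheritance is shown as it worked in puppet of this era:</p>
<pre># manifests/site.pp
# Applies to any client without a more specific node definition.
node default {
  include ssh
}

# A hostname-specific node that inherits everything from default.
node "www1.example.com" inherits default {
  include apache
}</pre>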
<p>You have to be able to completely and correctly describe a task for puppet to do it.  This is a feature not a bug.</p>
<p>Puppet uses a client-server pull architecture.  You start a puppetmasterd on a server.  Stick with the SSL defaults, because that stuff is complicated and customizing it will hose you eventually.  Then start a puppetd on a client and it&#8217;ll pull changes from the server.</p>
<p><em>This is disjointed.  Sorry about that.  The session is really just reading the slide equivalent of man pages while flipping back and forth to a command prompt to run basic examples.  I don&#8217;t feel like this session gave enough of an intro to puppet, it was just &#8220;launch into the man pages and then run individual commands, many of which he tells you to never do.&#8221;  I don&#8217;t feel like I&#8217;m a lot more informed on puppet than when I started, which makes me sad.  I&#8217;m not sure what the target audience for this is.  If it&#8217;s people totally new to puppet, like me, it starts in the weeds too much.  If it&#8217;s for someone who has used puppet, it didn&#8217;t seem to have many pro tips or design considerations, it was basic command execution.  Anyway, he ran out of time and flipped through the last ten slides in as many seconds.  I&#8217;m out!</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.webadminblog.com/index.php/2009/06/24/velocity-2009-introduction-to-managed-infrastructure-with-puppet/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
