<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Spark</title>
	<atom:link href="http://spark-project.org/feed/" rel="self" type="application/rss+xml" />
	<link>http://spark-project.org</link>
	<description>Lightning-Fast Cluster Computing</description>
	<lastBuildDate>Thu, 23 May 2013 19:56:48 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4.2</generator>
		<item>
		<title>Transformations and Caching &#8211; Spark Screencast #3</title>
		<link>http://spark-project.org/screencast-3-transformations-and-caching/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=screencast-3-transformations-and-caching</link>
		<comments>http://spark-project.org/screencast-3-transformations-and-caching/#comments</comments>
		<pubDate>Wed, 17 Apr 2013 00:06:22 +0000</pubDate>
		<dc:creator>Andy</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://spark-project.org/?p=402</guid>
		<description><![CDATA[In this third Spark screencast, we demonstrate more advanced use of RDD actions and transformations, as well as caching RDDs in memory. For more information, check out the Spark documentation page.]]></description>
			<content:encoded><![CDATA[<p>In this third Spark screencast, we demonstrate more advanced use of RDD actions and transformations, as well as caching RDDs in memory.</p>
<div class="video-container shadow"><iframe width="755" height="705" src="http://www.youtube.com/embed/T1lZcimvL18?autohide=0&#038;showinfo=0" frameborder="0" allowfullscreen></iframe></div>
<p>For more information, check out the <a href="http://spark-project.org/documentation.html">Spark documentation page</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://spark-project.org/screencast-3-transformations-and-caching/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Spark screencasts published</title>
		<link>http://spark-project.org/spark-screencasts-published/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=spark-screencasts-published</link>
		<comments>http://spark-project.org/spark-screencasts-published/#comments</comments>
		<pubDate>Tue, 16 Apr 2013 23:59:19 +0000</pubDate>
		<dc:creator>Andy</dc:creator>
				<category><![CDATA[News]]></category>

		<guid isPermaLink="false">http://spark-project.org/?p=410</guid>
		<description><![CDATA[We have released the first two screencasts in a series of short hands-on video training courses we will be publishing to help new users get up and running with Spark in minutes. The first Spark screencast is called First Steps &#8230; <a href="http://spark-project.org/spark-screencasts-published/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>We have released the first two screencasts in a series of short hands-on video training courses we will be publishing to help new users get up and running with Spark in minutes.</p>
<p>The first Spark screencast is called <a href="http://spark-project.org/screencast-1-first-steps-with-spark/">First Steps With Spark</a> and walks you through downloading and building Spark, as well as using the Spark shell, all in less than 10 minutes!</p>
<p>The second screencast is a 2-minute <a href="http://spark-project.org/spark-documentation-overview-screencast-2/">overview of the Spark documentation</a>.</p>
<p>We hope you find these screencasts useful.</p>
]]></content:encoded>
			<wfw:commentRss>http://spark-project.org/spark-screencasts-published/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Spark Documentation Overview – Screencast #2</title>
		<link>http://spark-project.org/spark-documentation-overview-screencast-2/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=spark-documentation-overview-screencast-2</link>
		<comments>http://spark-project.org/spark-documentation-overview-screencast-2/#comments</comments>
		<pubDate>Thu, 11 Apr 2013 23:40:22 +0000</pubDate>
		<dc:creator>Andy</dc:creator>
				<category><![CDATA[Screencasts]]></category>

		<guid isPermaLink="false">http://spark-project.org/?p=397</guid>
		<description><![CDATA[This is our 2nd Spark screencast. In it, we take a tour of the documentation available for Spark users online. And for convenience, here are links to the documentation shown in the video: Spark documentation page Amp Camp Mini Course]]></description>
			<content:encoded><![CDATA[<p>This is our 2nd Spark screencast. In it, we take a tour of the documentation available for Spark users online.</p>
<div class="video-container shadow"><iframe width="755" height="705" src="http://www.youtube.com/embed/TikdEfsrFnw?autohide=0&#038;showinfo=0" frameborder="0" allowfullscreen></iframe></div>
<p>And for convenience, here are links to the documentation shown in the video:</p>
<ul>
<li><a href="http://spark-project.org/documentation.html">Spark documentation page</a></li>
<li><a href="http://ampcamp.berkeley.edu/big-data-mini-course-home">Amp Camp Mini Course</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://spark-project.org/spark-documentation-overview-screencast-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>First Steps with Spark &#8211; Screencast #1</title>
		<link>http://spark-project.org/screencast-1-first-steps-with-spark/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=screencast-1-first-steps-with-spark</link>
		<comments>http://spark-project.org/screencast-1-first-steps-with-spark/#comments</comments>
		<pubDate>Wed, 10 Apr 2013 22:15:26 +0000</pubDate>
		<dc:creator>Andy</dc:creator>
				<category><![CDATA[Screencasts]]></category>

		<guid isPermaLink="false">http://spark-project.org/?p=377</guid>
		<description><![CDATA[This screencast marks the beginning of a series of hands-on screencasts we will be publishing to help new users get up and running in minutes. In this screencast, we: Download and build Spark on a local machine (running OS X, &#8230; <a href="http://spark-project.org/screencast-1-first-steps-with-spark/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>This screencast marks the beginning of a series of hands-on screencasts we will be publishing to help new users get up and running in minutes. In this screencast, we:</p>
<ol>
<li>Download and build Spark on a local machine (running OS X, though the process should be similar on Linux or Unix).</li>
<li>Introduce the API using the Spark interactive shell to explore a file.</li>
</ol>
<div class="video-container shadow"><iframe width="755" height="705" src="http://www.youtube.com/embed/KYlLglXD6Ic?autohide=0&#038;showinfo=0" frameborder="0" allowfullscreen></iframe></div>
<p>Check out the next Spark screencast in the series, <a href="http://spark-project.org/spark-documentation-overview-screencast-2/">Spark Screencast #2 &#8211; Overview of Spark Documentation</a>. You can also find the Spark documentation online:</p>
<ul>
<li><a href="http://spark-project.org/documentation.html">Spark documentation page</a></li>
<li><a href="http://ampcamp.berkeley.edu/big-data-mini-course-home">Amp Camp Mini Course</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://spark-project.org/screencast-1-first-steps-with-spark/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Strata exercises now available online</title>
		<link>http://spark-project.org/strata-exercises-now-available-online/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=strata-exercises-now-available-online</link>
		<comments>http://spark-project.org/strata-exercises-now-available-online/#comments</comments>
		<pubDate>Mon, 18 Mar 2013 02:19:25 +0000</pubDate>
		<dc:creator>matei</dc:creator>
				<category><![CDATA[News]]></category>

		<guid isPermaLink="false">http://spark-project.org/?p=367</guid>
		<description><![CDATA[At this year&#8217;s Strata conference, the AMP Lab hosted a full day of tutorials on Spark, Shark, and Spark Streaming, including online exercises on Amazon EC2. Those exercises are now available online, letting you learn Spark and Shark at your &#8230; <a href="http://spark-project.org/strata-exercises-now-available-online/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>At this year&#8217;s <a href="http://strataconf.com/strata2013">Strata</a> conference, the AMP Lab hosted a full day of tutorials on Spark, Shark, and Spark Streaming, including online exercises on Amazon EC2. Those exercises are now <a href="http://ampcamp.berkeley.edu/big-data-mini-course/">available online</a>, letting you learn Spark and Shark at your own pace on an EC2 cluster with real data. They are a great resource for learning the systems. You can also find <a href="http://ampcamp.berkeley.edu/amp-camp-two-strata-2013/">slides</a> from the Strata tutorials online, as well as <a href="http://ampcamp.berkeley.edu/amp-camp-one-berkeley-2012/">videos</a> from the AMP Camp workshop we held at Berkeley in August.</p>
]]></content:encoded>
			<wfw:commentRss>http://spark-project.org/strata-exercises-now-available-online/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Spark 0.7.0 released</title>
		<link>http://spark-project.org/spark-0-7-0-released/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=spark-0-7-0-released</link>
		<comments>http://spark-project.org/spark-0-7-0-released/#comments</comments>
		<pubDate>Wed, 27 Feb 2013 17:06:59 +0000</pubDate>
		<dc:creator>matei</dc:creator>
				<category><![CDATA[News]]></category>

		<guid isPermaLink="false">http://spark-project.org/?p=354</guid>
		<description><![CDATA[We&#8217;re proud to announce the release of Spark 0.7.0, a new major version of Spark that adds several key features, including a Python API for Spark and an alpha of Spark Streaming. This release is the result of the largest &#8230; <a href="http://spark-project.org/spark-0-7-0-released/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>We&#8217;re proud to announce the release of <a href="http://spark-project.org/spark-release-0-7-0/" title="Spark Release 0.7.0">Spark 0.7.0</a>, a new major version of Spark that adds several key features, including a <a href="http://spark-project.org/docs/latest/python-programming-guide.html">Python API</a> for Spark and an <a href="http://spark-project.org/docs/latest/streaming-programming-guide.html">alpha of Spark Streaming</a>. This release is the result of the largest group of contributors yet behind a Spark release &#8212; 31 contributors from inside and outside Berkeley. Head over to the <a href="http://spark-project.org/spark-release-0-7-0/" title="Spark Release 0.7.0">release notes</a> to read more about the new features, or <a href="http://spark-project.org/downloads/">download</a> the release today.</p>
]]></content:encoded>
			<wfw:commentRss>http://spark-project.org/spark-0-7-0-released/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Spark Release 0.7.0</title>
		<link>http://spark-project.org/spark-release-0-7-0/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=spark-release-0-7-0</link>
		<comments>http://spark-project.org/spark-release-0-7-0/#comments</comments>
		<pubDate>Wed, 27 Feb 2013 16:26:19 +0000</pubDate>
		<dc:creator>matei</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://spark-project.org/?p=278</guid>
		<description><![CDATA[The Spark team is proud to release version 0.7.0, a new major release that brings several new features. Most notable are a Python API for Spark and an alpha of Spark Streaming. (Details on Spark Streaming can also be found &#8230; <a href="http://spark-project.org/spark-release-0-7-0/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The Spark team is proud to release version 0.7.0, a new major release that brings several new features. Most notable are a <a href="http://spark-project.org/docs/0.7.0/python-programming-guide.html">Python API for Spark</a> and an <a href="http://spark-project.org/docs/0.7.0/streaming-programming-guide.html">alpha of Spark Streaming</a>. (Details on Spark Streaming can also be found in this <a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-259.pdf">technical report</a>.) The release also adds numerous other improvements across the board. Overall, this is our biggest release to date, with 31 contributors, of which 20 were external to Berkeley.</p>
<p>You can download Spark 0.7.0 as either a <a href="http://spark-project.org/files/spark-0.7.0-sources.tgz">source package</a> (4 MB tar.gz) or <a href="http://spark-project.org/files/spark-0.7.0-prebuilt.tgz">prebuilt package</a> (60 MB tar.gz).</p>
<h3>Python API</h3>
<p>Spark 0.7 adds a <a href="http://spark-project.org/docs/0.7.0/python-programming-guide.html">Python API</a> called PySpark that makes it possible to use Spark from Python, both in standalone programs and in interactive Python shells. It uses the standard CPython runtime, so your programs can call into native libraries like NumPy and SciPy. Like the Scala and Java APIs, PySpark will automatically ship functions from your main program, along with the variables they depend on, to the cluster. PySpark supports most Spark features, including RDDs, accumulators, broadcast variables, and HDFS input and output.</p>
<h3>Spark Streaming Alpha</h3>
<p>Spark Streaming is a new extension of Spark that adds near-real-time processing capability. It offers a simple and high-level API, where users can transform streams using parallel operations like <tt>map</tt>, <tt>filter</tt>, <tt>reduce</tt>, and new sliding window functions. It automatically distributes work over a cluster and provides efficient fault recovery with exactly-once semantics for transformations, without relying on costly transactions to an external system. Spark Streaming is described in more detail in <a href="http://spark-project.org/talks/strata_spark_streaming.ppt">these slides</a> and <a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-259.pdf">our technical report</a>. This release is our first alpha of Spark Streaming, with most of the functionality implemented and APIs in Java and Scala.</p>
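<p>To make the sliding window idea concrete, here is a minimal sketch in plain Python. It is not the Spark Streaming API: the stream is modeled as a list of small batches, and <tt>windowed_reduce</tt>, <tt>window_len</tt>, <tt>slide</tt>, and <tt>zero</tt> are hypothetical names chosen for illustration. For simplicity it only emits results for full windows.</p>
<pre>
```python
# Plain-Python model of a sliding-window reduce over a stream of batches.
# Assumption: the stream is a list of batches; nothing here is a Spark call.

def windowed_reduce(batches, window_len, slide, reduce_fn, zero):
    """Reduce the last `window_len` batches, advancing by `slide` batches."""
    results = []
    for end in range(window_len, len(batches) + 1, slide):
        window = batches[end - window_len:end]
        total = zero
        for batch in window:
            for item in batch:
                total = reduce_fn(total, item)
        results.append(total)
    return results


# Four batches arriving one after another.
batches = [[1, 2], [3], [4, 5], [6]]

# Sum over a window of two batches that slides one batch at a time.
sums = windowed_reduce(batches, window_len=2, slide=1,
                       reduce_fn=lambda a, b: a + b, zero=0)
print(sums)
```
</pre>
<p>Each result covers two consecutive batches, so neighboring windows overlap by one batch.</p>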
<h3>Memory Dashboard</h3>
<p>Spark jobs now launch a web dashboard for monitoring the memory usage of each distributed dataset (RDD) in the program. Look for lines like this in your log:</p>
<p><tt>15:08:44 INFO BlockManagerUI: Started BlockManager web UI at http://mbk.local:63814</tt></p>
<p>You can also control which port to use through the <tt>spark.ui.port</tt> property.</p>
<h3>Maven Build</h3>
<p>Spark can now be built using Maven in addition to SBT. The Maven build enables easier publishing to repositories of your choice, easy selection of Hadoop versions using the Maven profile (<tt>-Phadoop1</tt> or <tt>-Phadoop2</tt>), as well as Debian packaging using <tt>mvn -Phadoop1,deb install</tt>.</p>
<h3>New Operations</h3>
<p>This release adds several RDD transformations, including <tt>keys</tt>, <tt>values</tt>, <tt>keyBy</tt>, <tt>subtract</tt>, <tt>coalesce</tt>, and <tt>zip</tt>. It also adds <tt>SparkContext.hadoopConfiguration</tt> to allow programs to configure Hadoop input/output settings globally across operations. Finally, it adds the <tt>RDD.toDebugString()</tt> method, which can be used to print an RDD&#8217;s lineage graph for troubleshooting.</p>
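<p>As a rough illustration of what some of these transformations compute, the sketch below models them on plain Python lists. The helper functions are hypothetical stand-ins, not Spark calls, and they ignore partitioning entirely (so <tt>coalesce</tt> is left out).</p>
<pre>
```python
# Plain-Python model of several transformations named in the release notes.
# Assumption: lists stand in for RDDs; these helpers are not the Spark API.

def key_by(records, f):
    """Pair each record with a key computed by f."""
    return [(f(r), r) for r in records]


def keys(pairs):
    """Return only the key of each (key, value) pair."""
    return [k for k, _ in pairs]


def values(pairs):
    """Return only the value of each (key, value) pair."""
    return [v for _, v in pairs]


def subtract(left, right):
    """Keep the elements of left that do not appear in right."""
    exclude = set(right)
    return [x for x in left if x not in exclude]


words = ["spark", "shark", "mesos"]
pairs = key_by(words, lambda w: w[0])   # key each word by its first letter
zipped = list(zip(words, [1, 2, 3]))    # pair elements by position

print(pairs)
print(keys(pairs), values(pairs))
print(subtract(words, ["shark"]))
print(zipped)
```
</pre>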
<h3>EC2 Improvements</h3>
<ul>
<li>Spark will now read S3 credentials from the <tt>AWS_ACCESS_KEY_ID</tt> and <tt>AWS_SECRET_ACCESS_KEY</tt> environment variables, if set, making it easier to access Amazon S3.</li>
<li>This release fixes a bug with S3 access that left streams open when they were not fully read (e.g. when calling <tt>RDD.first()</tt> or running a SQL query with a limit), causing nodes to hang.</li>
<li>The EC2 scripts now support both standalone and Mesos clusters, and launch Ganglia on the cluster.</li>
<li>Spark EC2 clusters can now be spread across multiple availability zones.</li>
</ul>
<h3>Other Improvements</h3>
<ul>
<li>Shuffle operations like <tt>groupByKey</tt> and <tt>reduceByKey</tt> now try to infer parallelism from the size of the parent RDD (unless <tt>spark.default.parallelism</tt> is set).</li>
<li>Several performance improvements to shuffles.</li>
<li>Standalone deploy cluster now spreads jobs out across machines by default, leading to better data locality.</li>
<li>Better error reporting when jobs cannot be launched because of insufficient resources.</li>
<li>Standalone deploy web UI now includes JSON endpoints for querying cluster state.</li>
<li>Better support for IBM JVM.</li>
<li>Default Hadoop version dependency updated to 1.0.4.</li>
<li>Improved failure handling and reporting of error messages.</li>
<li>Separate configuration for standalone cluster daemons and user applications.</li>
<li>Significant refactoring of the scheduler codebase to enable richer unit testing.</li>
<li>Several bug and performance fixes throughout.</li>
</ul>
<h3>Compatibility</h3>
<p>This release is API-compatible with Spark 0.6 programs, but the following features changed slightly:</p>
<ul>
<li>Parallel shuffle operations where you don&#8217;t specify a level of parallelism use the number of partitions of the parent RDD instead of a constant default. However, if you set <tt>spark.default.parallelism</tt>, they will use that.</li>
<li><tt>SparkContext.addFile</tt>, which distributes a file to worker nodes, is no longer guaranteed to put it in the executor&#8217;s working directory. Instead, you can find the directory it used with <tt>SparkFiles.getRootDirectory</tt>, or get a particular file with <tt>SparkFiles.get</tt>. This was done to avoid cluttering the local directory when running in local mode.</li>
</ul>
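<p>The new parallelism rule for shuffles can be summarized in a few lines. This is only a sketch of the rule as described above, in plain Python: <tt>shuffle_partitions</tt> and its parameters are hypothetical names, and the real scheduler logic may differ in detail.</p>
<pre>
```python
# Sketch of how a shuffle's partition count is chosen, per the notes above.
# Assumption: function and parameter names are illustrative, not Spark's.

def shuffle_partitions(parent_partitions, explicit=None,
                       default_parallelism=None):
    """Pick the number of partitions for a shuffle operation."""
    if explicit is not None:
        # The caller passed a level of parallelism to the operation.
        return explicit
    if default_parallelism is not None:
        # spark.default.parallelism was set, so it takes precedence.
        return default_parallelism
    # Otherwise follow the parent RDD instead of a constant default.
    return parent_partitions


print(shuffle_partitions(parent_partitions=200))
print(shuffle_partitions(parent_partitions=200, default_parallelism=16))
print(shuffle_partitions(parent_partitions=200, explicit=50,
                         default_parallelism=16))
```
</pre>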
<h3>Credits</h3>
<p>Spark 0.7 was the work of many contributors from Berkeley and outside&#8212;in total, 31 different contributors, of which 20 were from outside Berkeley. Here are the people who contributed, along with areas they worked on:</p>
<ul>
<li>Mikhail Bautin &#8212; Maven build</li>
<li>Denny Britz &#8212; memory dashboard, streaming, bug fixes</li>
<li>Paul Cavallaro &#8212; error reporting</li>
<li>Tathagata Das &#8212; streaming (lead developer), 24/7 operation, bug fixes, docs</li>
<li>Thomas Dudziak &#8212; Maven build, Hadoop 2 support</li>
<li>Harvey Feng &#8212; bug fix</li>
<li>Stephen Haberman &#8212; new RDD operations, configuration, S3 improvements, code cleanup, bug fixes</li>
<li>Tyson Hamilton &#8212; JSON status endpoints</li>
<li>Mark Hamstra &#8212; API improvements, docs</li>
<li>Michael Heuer &#8212; docs</li>
<li>Shane Huang &#8212; shuffle performance fixes</li>
<li>Andy Konwinski &#8212; docs</li>
<li>Ryan LeCompte &#8212; streaming</li>
<li>Haoyuan Li &#8212; streaming</li>
<li>Richard McKinley &#8212; build</li>
<li>Sean McNamara &#8212; streaming</li>
<li>Lee Moon Soo &#8212; bug fix</li>
<li>Fernand Pajot &#8212; bug fix</li>
<li>Nick Pentreath &#8212; Python API, examples</li>
<li>Andrew Psaltis &#8212; bug fixes</li>
<li>Imran Rashid &#8212; memory dashboard, bug fixes</li>
<li>Charles Reiss &#8212; fault recovery fixes, code cleanup, testability, error reporting</li>
<li>Josh Rosen &#8212; Python API (lead developer), EC2 scripts, bug fixes</li>
<li>Peter Sankauskas &#8212; EC2 scripts</li>
<li>Prashant Sharma &#8212; streaming</li>
<li>Shivaram Venkataraman &#8212; EC2 scripts, optimizations</li>
<li>Patrick Wendell &#8212; streaming, bug fixes, examples, docs</li>
<li>Reynold Xin &#8212; optimizations, UI</li>
<li>Haitao Yao &#8212; run scripts</li>
<li>Matei Zaharia &#8212; streaming, fault recovery, Python API, code cleanup, bug fixes, docs</li>
<li>Eric Zhang &#8212; examples</li>
</ul>
<p>Thanks to everyone who contributed!</p>
]]></content:encoded>
			<wfw:commentRss>http://spark-project.org/spark-release-0-7-0/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Spark/Shark Tutorial for Amazon EMR</title>
		<link>http://spark-project.org/run-spark-and-shark-on-amazon-emr/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=run-spark-and-shark-on-amazon-emr</link>
		<comments>http://spark-project.org/run-spark-and-shark-on-amazon-emr/#comments</comments>
		<pubDate>Sun, 24 Feb 2013 22:43:50 +0000</pubDate>
		<dc:creator>matei</dc:creator>
				<category><![CDATA[News]]></category>

		<guid isPermaLink="false">http://spark-project.org/?p=270</guid>
		<description><![CDATA[This weekend, Amazon posted an article and code that make it easy to launch Spark and Shark on Elastic MapReduce. The article includes examples of how to run both interactive Scala commands and SQL queries from Shark on data in &#8230; <a href="http://spark-project.org/run-spark-and-shark-on-amazon-emr/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>This weekend, Amazon posted an <a href="http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923">article</a> and code that make it easy to launch Spark and Shark on Elastic MapReduce. The article includes examples of how to run both interactive Scala commands and SQL queries from Shark on data in S3. Head over to the <a href="http://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923">Amazon article</a> for details. We&#8217;re very excited because, to our knowledge, this makes Spark the first non-Hadoop engine that you can launch with EMR.</p>
]]></content:encoded>
			<wfw:commentRss>http://spark-project.org/run-spark-and-shark-on-amazon-emr/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Spark 0.6.2 released</title>
		<link>http://spark-project.org/spark-0-6-2-released/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=spark-0-6-2-released</link>
		<comments>http://spark-project.org/spark-0-6-2-released/#comments</comments>
		<pubDate>Fri, 08 Feb 2013 03:16:53 +0000</pubDate>
		<dc:creator>matei</dc:creator>
				<category><![CDATA[News]]></category>

		<guid isPermaLink="false">http://spark-project.org/?p=259</guid>
		<description><![CDATA[We recently released Spark 0.6.2, a new version of Spark. This is a maintenance release that includes several bug fixes and usability improvements (see the release notes). We recommend that all users upgrade to this release.]]></description>
			<content:encoded><![CDATA[<p>We recently released <a href="http://spark-project.org/spark-release-0-6-2/" title="Spark Release 0.6.2">Spark 0.6.2</a>, a new version of Spark. This is a maintenance release that includes several bug fixes and usability improvements (see the <a href="http://spark-project.org/spark-release-0-6-2/" title="Spark Release 0.6.2">release notes</a>). We recommend that all users upgrade to this release.</p>
]]></content:encoded>
			<wfw:commentRss>http://spark-project.org/spark-0-6-2-released/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Spark Release 0.6.2</title>
		<link>http://spark-project.org/spark-release-0-6-2/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=spark-release-0-6-2</link>
		<comments>http://spark-project.org/spark-release-0-6-2/#comments</comments>
		<pubDate>Fri, 08 Feb 2013 03:05:16 +0000</pubDate>
		<dc:creator>matei</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://spark-project.org/?p=250</guid>
		<description><![CDATA[Spark 0.6.2 is a maintenance release that contains several bug fixes and usability improvements. You can download it as a source package (2.5 MB tar.gz) or prebuilt package (48 MB tar.gz). We recommend that all Spark 0.6 users update to &#8230; <a href="http://spark-project.org/spark-release-0-6-2/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Spark 0.6.2 is a maintenance release that contains several bug fixes and usability improvements. You can download it as a <a href="http://spark-project.org/files/spark-0.6.2-sources.tgz">source package</a> (2.5 MB tar.gz) or <a href="http://spark-project.org/files/spark-0.6.2-prebuilt.tgz">prebuilt package</a> (48 MB tar.gz).</p>
<p>We recommend that all Spark 0.6 users update to this maintenance release.</p>
<p>The fixes and improvements in this version include:</p>
<ul>
<li>A number of fault tolerance fixes regarding detecting dead nodes, handling missing map output fetches, and allowing failed nodes to rejoin the cluster</li>
<li>Documentation fixes that clarify the configuration for the standalone mode and improve the quick start instructions</li>
<li>A connection reuse bug fix that improves shuffle performance</li>
<li>Support for launching a cluster across multiple availability zones in the EC2 scripts</li>
<li>Support for deleting security groups when an EC2 cluster is terminated</li>
<li>Improved memory configuration for the standalone deploy cluster daemons: instead of using <code>SPARK_MEM</code> for their memory, which often led people to give them much more memory than they intended, they now use a separate variable, <code>SPARK_DAEMON_MEMORY</code>, with a reasonable default of 512 MB</li>
<li>Fixes to the Windows run scripts for Spark</li>
<li>Better detection of a machine&#8217;s external IP address</li>
<li>Several small optimizations and bug fixes</li>
</ul>
<p>In total, eleven people contributed to this release:</p>
<ul>
<li>Stephen Haberman (bug fix)</li>
<li>Shane Huang (shuffle fix)</li>
<li>Fernand Pajot (bug fix)</li>
<li>Andrew Psaltis (bug fix)</li>
<li>Imran Rashid (standalone cluster, bug fix)</li>
<li>Charles Reiss (fault recovery fixes, node re-registration, tests)</li>
<li>Josh Rosen (fault recovery, Java API fixes, deploy scripts)</li>
<li>Peter Sankauskas (EC2 scripts)</li>
<li>Lee Moon Soo (bug fix)</li>
<li>Patrick Wendell (bugs, docs)</li>
<li>Matei Zaharia (fault recovery, UI, docs, bug fixes)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://spark-project.org/spark-release-0-6-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
