Nov 25th, 2016
Deep learning algorithms have traditionally been used in specific applications, most notably computer vision, machine translation, text mining, and fraud detection. Deep learning truly shines when the model is big and trained on large-scale datasets. Meanwhile, distributed computing platforms like Spark are designed to handle big data and have been used extensively. Therefore, with deep learning available on Spark, deep learning can be applied much more broadly, and businesses can take full advantage of its capabilities using their existing Spark infrastructure.
Oct 30th, 2016
Many organizations deploy Alluxio together with Spark for performance gains and data manageability benefits. In this blog post, we investigate how Alluxio helps Spark be more effective. Alluxio increases performance of Spark jobs, helps Spark jobs perform more predictably, and enables multiple Spark jobs to share the same data from memory. Previously, we investigated how Alluxio is used for Spark RDDs. In this article, we investigate how to effectively use Spark DataFrames with Alluxio.
Oct 24th, 2016
Today we’re excited to unveil our first products, which enable organizations to turn data into value with unprecedented ease, flexibility, and speed. We believe our new products will substantially advance Alluxio for both the community and our enterprise customers.
In this blog, I will share with you the challenges that we see application developers and business line owners face today when working with big data, and show how Alluxio addresses these challenges.
Oct 16th, 2016
This is an excerpt from the Accelerating Data Analytics on Ceph Object Storage with Alluxio whitepaper. In addition to the reference architecture in this blog, the whitepaper provides a detailed implementation guide to reproduce the environment.
Sep 20th, 2016
SAN MATEO, CA–(Marketwired – Sep 21, 2016) – Alluxio (formerly Tachyon), developers of the world’s first memory-speed virtual distributed storage system that bridges big data applications and underlying storage systems, will be exhibiting (Booth #P30) at Strata + Hadoop World, taking place Sept. 27 – 29, 2016 at the Javits Convention Center in New York City.
Sep 1st, 2016
Alluxio is the world's first memory-speed virtual distributed storage system that bridges applications and underlying storage systems, providing unified data access orders of magnitude faster than existing solutions. The Hadoop Distributed File System (HDFS) is a distributed file system for storing large volumes of data. HDFS popularized the paradigm of bringing computation to data and the co-located compute and storage architecture.
Aug 27th, 2016
We are excited to announce a big data storage acceleration solution with Huawei. This solution combines Huawei’s FusionStorage with Alluxio’s memory-speed virtual distributed storage system to dramatically enhance the speed and efficiency of big data analytics for the enterprise.
Aug 25th, 2016
Organizations like Baidu and Barclays have deployed Alluxio with Spark in their architecture, and have achieved impressive benefits and gains. Recently, Qunar deployed Alluxio with Spark in production and found that Alluxio enables Spark streaming jobs to run 15x to 300x faster. In this blog, we investigate how Alluxio can make Spark more effective, and discuss various ways to use Alluxio with Spark. Alluxio helps Spark perform faster, and enables multiple Spark jobs to share the same, memory-speed data.
Aug 19th, 2016
This is an excerpt from the Accelerating On-Demand Data Analytics with Alluxio whitepaper, which includes a detailed implementation guide in addition to this high-level overview.
Jul 16th, 2016
At Qunar, we have been running Alluxio in production for over 9 months, resulting in a 15x speedup on average and a 300x speedup at peak service times. In addition, Alluxio’s unified namespace enables different applications and frameworks to easily interact with our data from different storage systems.
Jun 21st, 2016
The Alluxio 1.1 release includes many great features and improvements from the community. Alluxio would not be what it is today without the growing open source community, and we would like to thank everyone involved in this project. With the Alluxio 1.1 release, the community has continued to grow at a rapid pace, reaching over 250 contributors to Alluxio – nearly 3x growth over the last year!
May 30th, 2016
Alluxio, formerly Tachyon, began as a research project at UC Berkeley’s AMPLab in 2012. This year we announced the 1.0 release of Alluxio, the world’s first memory-speed virtual distributed storage system, which unifies data access and bridges computation frameworks and underlying storage systems. We have been working closely with the Alluxio community on realizing the vision of Alluxio to become the de facto storage unification layer for big data and other scale-out application environments.
Apr 24th, 2016
Exponential growth in raw computational power, communication bandwidth, and storage capacity drives continuous innovation in how data is processed and stored. To address the evolving nature of the compute and storage landscape, we are continuously advancing Alluxio, a state-of-the-art memory-centric virtual distributed storage system.
Apr 5th, 2016
Alluxio, formerly Tachyon, provides Spark with a reliable data sharing layer, enabling Spark to excel at performing application logic while Alluxio handles storage. For example, global financial powerhouse Barclays made the impossible possible by using Alluxio with Spark in their architecture. Technology giant Baidu analyzes petabytes of data and realized 30x performance improvements with a new architecture centered around Alluxio and Spark.
Feb 14th, 2016
Alluxio, formerly Tachyon, began as a research project when I was a Ph.D. student at UC Berkeley’s AMPLab in 2012. At the time, Spark and Mesos were taking off. We saw what Spark and Mesos could do for compute and resource management respectively, while the storage piece of this story was missing.