Arimo Leverages Alluxio’s In-Memory Capability, Improving Time-to-Results for Deep Learning Models

Arimo Team Nov 25th, 2016

Deep learning algorithms have traditionally been used in specific applications, most notably computer vision, machine translation, text mining, and fraud detection. Deep learning truly shines when the model is big and trained on large-scale datasets. Meanwhile, distributed computing platforms like Spark are designed to handle big data and have been used extensively. Making deep learning available on Spark therefore broadens its range of applications, and businesses can take full advantage of deep learning capabilities using their existing Spark infrastructure.

Effective Spark DataFrames with Alluxio

Gene Pang Pei Sun Oct 30th, 2016

Many organizations deploy Alluxio together with Spark for performance gains and data manageability benefits. In this blog post, we investigate how Alluxio helps Spark be more effective. Alluxio increases performance of Spark jobs, helps Spark jobs perform more predictably, and enables multiple Spark jobs to share the same data from memory. Previously, we investigated how Alluxio is used for Spark RDDs. In this article, we investigate how to effectively use Spark DataFrames with Alluxio.

Alluxio Launches Industry's First System to Unify Data at Memory Speed

Haoyuan Li Oct 24th, 2016

Today we’re excited to unveil our first products, which enable organizations to turn data into value with unprecedented ease, flexibility, and speed. We believe our new products will substantially advance Alluxio for both the community and our enterprise customers. In this blog, I will share the challenges that we see application developers and business line owners face today when working with big data, and show how Alluxio addresses these challenges.

Accelerating Data Analytics on Ceph Object Storage with Alluxio

Adit Madan Oct 16th, 2016

This is an excerpt from the Accelerating Data Analytics on Ceph Object Storage with Alluxio whitepaper. In addition to the reference architecture in this blog, the whitepaper provides a detailed implementation guide to reproduce the environment.

Alluxio to Showcase Memory-Speed Virtual Distributed Storage System at Strata + Hadoop World in New York Sept. 27 - 29, 2016

Amelia Wong Sep 20th, 2016

SAN MATEO, CA–(Marketwired – Sep 21, 2016) – Alluxio (formerly Tachyon), developers of the world’s first memory-speed virtual distributed storage system that bridges big data applications and underlying storage systems, will be exhibiting (Booth #P30) at Strata + Hadoop World, taking place Sept. 27 – 29, 2016 at the Javits Convention Center in New York City.

Using Alluxio to Improve the Performance and Consistency of HDFS Clusters

Calvin Jia Sep 1st, 2016

Alluxio is the world's first memory-speed virtual distributed storage system that bridges applications and underlying storage systems, providing unified data access orders of magnitude faster than existing solutions. The Hadoop Distributed File System (HDFS) is a distributed file system for storing large volumes of data. HDFS popularized the paradigm of bringing computation to data and the co-located compute and storage architecture.

Alluxio Partners with Huawei to Deliver Big Data Storage Acceleration Solution

Neena Pemmaraju Aug 27th, 2016

We are excited to announce a big data storage acceleration solution with Huawei. This solution combines Huawei’s FusionStorage with Alluxio’s memory-speed virtual distributed storage system to dramatically enhance the speed and efficiency of big data analytics for the enterprise.

Effective Spark RDDs with Alluxio

Gene Pang Pei Sun Aug 25th, 2016

Organizations like Baidu and Barclays have deployed Alluxio with Spark in their architecture, and have achieved impressive benefits and gains. Recently, Qunar deployed Alluxio with Spark in production and found that Alluxio enables Spark streaming jobs to run 15x to 300x faster. In this blog, we investigate how Alluxio can make Spark more effective, and discuss various ways to use Alluxio with Spark. Alluxio helps Spark perform faster, and enables multiple Spark jobs to share the same, memory-speed data.

Accelerating On-Demand Data Analytics with Alluxio

Calvin Jia Aug 19th, 2016

This is an excerpt from the Accelerating On-Demand Data Analytics with Alluxio whitepaper, which includes a detailed implementation guide in addition to this high-level overview.

Qunar Performs Real-Time Data Analytics up to 300x Faster with Alluxio

Xueyan Li Lei Xu Xiaoxu Lv Jul 16th, 2016

At Qunar, we have been running Alluxio in production for over 9 months, resulting in a 15x speedup on average and a 300x speedup at peak service times. In addition, Alluxio’s unified namespace enables different applications and frameworks to easily interact with our data from different storage systems.

What’s new in Alluxio 1.1 Release

Gene Pang Jun 21st, 2016

The Alluxio 1.1 release includes many great features and improvements from the community. Alluxio would not be what it is today without the growing open source community, and we would like to thank everyone involved in this project. With the Alluxio 1.1 release, the community has continued to grow at a rapid pace, reaching over 250 contributors to Alluxio – nearly 3x growth over the last year!

Introducing Alluxio Open Source Project Governance

Haoyuan Li May 30th, 2016

Alluxio, formerly Tachyon, began as a research project at UC Berkeley’s AMPLab in 2012. This year we announced the 1.0 release of Alluxio, the world’s first memory-speed virtual distributed storage system, which unifies data access and bridges computation frameworks and underlying storage systems. We have been working closely with the Alluxio community on realizing the vision of Alluxio to become the de facto storage unification layer for big data and other scale-out application environments.

Unified Namespace: Allowing Applications To Access Data Anywhere

Jiri Simsa Apr 24th, 2016

The exponential growth of raw computational power, communication bandwidth, and storage capacity drives continuous innovation in how data is processed and stored. To address the evolving nature of the compute and storage landscape, we are continuously advancing Alluxio, a state-of-the-art memory-centric virtual distributed storage system.

Getting Started with Alluxio and Spark

Calvin Jia Apr 5th, 2016

Alluxio, formerly Tachyon, provides Spark with a reliable data sharing layer, enabling Spark to excel at performing application logic while Alluxio handles storage. For example, global financial powerhouse Barclays made the impossible possible by using Alluxio with Spark in their architecture. Technology giant Baidu analyzes petabytes of data and realized 30x performance improvements with a new architecture centered around Alluxio and Spark.

Alluxio, formerly Tachyon, is Entering a New Era with 1.0 release

Haoyuan Li Feb 14th, 2016

Alluxio, formerly Tachyon, began as a research project when I was a Ph.D. student at UC Berkeley’s AMPLab in 2012. At the time, Spark and Mesos were taking off. We saw what Spark and Mesos could do for compute and resource management respectively, while the storage piece of this story was missing.
