Mar 20th, 2017
By leveraging Alluxio, Mesos, Minio, and Spark, we have created an end-to-end data processing solution that is performant, scalable, and cost-optimal. We use Alluxio as the unified storage layer to connect disparate storage systems and deliver memory-speed performance, with Minio mounted as the under store to Alluxio to keep cold (infrequently accessed) data and to sync data to AWS S3. Apache Spark serves as the compute engine.
Mar 13th, 2017
Today, we’re excited to announce our partnership with Mesosphere to enable fast on-demand analytics with Alluxio via Mesosphere’s DC/OS in one click. This partnership is a natural extension of the synergy between Alluxio and DC/OS. Alluxio, the world's first system that unifies data at memory speed, allows enterprises to manage and analyze data stored across disparate storage systems on premises and in the cloud at memory speed. Mesosphere brings enterprises the power of cloud native technologies, with the control to run on any infrastructure - datacenter or cloud...
Feb 23rd, 2017
SAN MATEO, Calif., Feb. 23, 2017 (GLOBE NEWSWIRE) -- Alluxio (formerly Tachyon), developers of the world’s first system that unifies data at memory speed, and Kyligence, a leading intelligent big data analytics company formed by the core members of Apache Kylin, jointly announce a strategic partnership. The two companies collaborated to integrate the Alluxio memory-speed virtual distributed storage system with Apache Kylin's ultra-large-scale data analysis technology (OLAP on Hadoop) to further unlock the value of big data for enterprises.
Feb 8th, 2017
Alluxio 1.4.0 has been released with a large number of new features and improvements. This blog highlights some standout aspects of the release.
Jan 16th, 2017
SAN MATEO, CA--(Marketwired - Jan 17, 2017) - Alluxio (formerly Tachyon), developers of the world's first system that unifies data at memory speed, today announced a solution with Alluxio Enterprise Edition (AEE) and Dell EMC's Elastic Cloud Storage (ECS) for big data workloads. The new solution is designed to help Dell EMC ECS enterprise customers deliver more value from data as they transition their businesses to meet the new demands of a digital economy.
Nov 25th, 2016
Deep learning algorithms have traditionally been used in specific applications, most notably, computer vision, machine translation, text mining, and fraud detection. Deep learning truly shines when the model is big and trained on large-scale datasets. Meanwhile, distributed computing platforms like Spark are designed to handle big data and have been used extensively. Therefore, by having deep learning available on Spark, the application of deep learning is much broader, and now businesses can fully take advantage of deep learning capabilities using their existing Spark infrastructure.
Oct 29th, 2016
Many organizations deploy Alluxio together with Spark for performance gains and data manageability benefits. In this blog post, we investigate how Alluxio helps Spark be more effective. Alluxio increases performance of Spark jobs, helps Spark jobs perform more predictably, and enables multiple Spark jobs to share the same data from memory. Previously, we investigated how Alluxio is used for Spark RDDs. In this article, we investigate how to effectively use Spark DataFrames with Alluxio.
Oct 24th, 2016
Today we’re excited to unveil our first products, which enable organizations to turn data into value with unprecedented ease, flexibility, and speed. We believe our new products will substantially advance Alluxio for both the community and our enterprise customers.
In this blog, I will share with you the challenges that we see application developers and business line owners face today when working with big data, and show how Alluxio addresses these challenges.
Oct 16th, 2016
This is an excerpt from the Accelerating Data Analytics on Ceph Object Storage with Alluxio whitepaper. In addition to the reference architecture in this blog, the whitepaper provides a detailed implementation guide to reproduce the environment.
Sep 20th, 2016
SAN MATEO, CA–(Marketwired – Sep 21, 2016) – Alluxio (formerly Tachyon), developers of the world’s first memory-speed virtual distributed storage system that bridges big data applications and underlying storage systems, will be exhibiting (Booth #P30) at Strata + Hadoop World, taking place Sept. 27 – 29, 2016 at the Javits Convention Center in New York City.
Sep 1st, 2016
Alluxio is the world's first memory-speed virtual distributed storage system that bridges applications and underlying storage systems, providing unified data access orders of magnitude faster than existing solutions. The Hadoop Distributed File System (HDFS) is a distributed file system for storing large volumes of data. HDFS popularized the paradigm of bringing computation to data and the co-located compute and storage architecture.
Aug 27th, 2016
We are excited to announce a big data storage acceleration solution with Huawei. This solution combines Huawei’s FusionStorage with Alluxio’s memory-speed virtual distributed storage system to dramatically enhance the speed and efficiency of big data analytics for the enterprise.
Aug 25th, 2016
Organizations like Baidu and Barclays have deployed Alluxio with Spark in their architecture, and have achieved impressive benefits and gains. Recently, Qunar deployed Alluxio with Spark in production and found that Alluxio enables Spark streaming jobs to run 15x to 300x faster. In this blog, we investigate how Alluxio can make Spark more effective, and discuss various ways to use Alluxio with Spark. Alluxio helps Spark perform faster, and enables multiple Spark jobs to share the same, memory-speed data.
Aug 19th, 2016
This is an excerpt from the Accelerating On-Demand Data Analytics with Alluxio whitepaper, which includes a detailed implementation guide in addition to this high-level overview.
Jul 16th, 2016
At Qunar, we have been running Alluxio in production for over 9 months, resulting in a 15x speedup on average and a 300x speedup at peak service times. In addition, Alluxio’s unified namespace enables different applications and frameworks to easily interact with our data from different storage systems.
Jun 21st, 2016
The Alluxio 1.1 release includes many great features and improvements from the community. Alluxio would not be what it is today without the growing open source community, and we would like to thank everyone involved in this project. With the Alluxio 1.1 release, the community has continued to grow at a rapid pace, reaching over 250 contributors to Alluxio – nearly 3x growth over the last year!
May 30th, 2016
Alluxio, formerly Tachyon, began as a research project at UC Berkeley’s AMPLab in 2012. This year we announced the 1.0 release of Alluxio, the world’s first memory-speed virtual distributed storage system, which unifies data access and bridges computation frameworks and underlying storage systems. We have been working closely with the Alluxio community on realizing the vision of Alluxio to become the de facto storage unification layer for big data and other scale-out application environments.
Apr 24th, 2016
The exponential growth of raw computational power, communication bandwidth, and storage capacity drives continuous innovation in how data is processed and stored. To address the evolving nature of the compute and storage landscape, we are continuously advancing Alluxio, a state-of-the-art memory-centric virtual distributed storage system.
Apr 5th, 2016
Alluxio, formerly Tachyon, provides Spark with a reliable data sharing layer, enabling Spark to excel at performing application logic while Alluxio handles storage. For example, global financial powerhouse Barclays made the impossible possible by using Alluxio with Spark in their architecture. Technology giant Baidu analyzes petabytes of data and has realized 30x performance improvements with a new architecture centered around Alluxio and Spark.
Feb 14th, 2016
Alluxio, formerly Tachyon, began as a research project when I was a Ph.D. student at UC Berkeley’s AMPLab in 2012. At the time, Spark and Mesos were taking off. We saw what Spark and Mesos could do for compute and resource management respectively, while the storage piece of this story was missing.