Announcing Alluxio v1.8.0

We are excited to announce the release of Alluxio Enterprise Edition (AEE), Alluxio Community Edition (ACE), and Alluxio Open Source (AOS) v1.8.0. Click HERE to download! This release brings features and enhancements that simplify cloud adoption for analytics and machine learning, including hybrid cloud deployments and migration from HDFS to object storage, and that improve usability.

To help make it easier to get started using Alluxio, we have also collected a set of resources into a starter kit. The kit is located on the download page right below the download button and includes:

  • How-to guide: Install Alluxio on a local machine;
  • How-to guide: Install, plus mount an S3 bucket and accelerate remote reads;
  • Video: Walk-through from install to accelerating remote reads;
  • How-to guide: Running Spark on Alluxio;
  • Learn more: Architecture overview.

The second item is a simple tutorial on how to mount a remote AWS S3 bucket and accelerate data access. If you are not ready to install right now, you can watch the short video that walks you through the steps.

Click here to read the full press release

Specifically, the release brings the following features and enhancements, focused on cloud adoption, tiered locality, and usability:

Tiered Locality

The tiered locality feature allows users to take full advantage of locality in their clusters. Alluxio has always supported node-level locality, but now users may specify arbitrary locality levels such as rack, region, or availability zone. This way, clients can prefer to read from workers in the same rack, taking advantage of higher network transfer speeds and reducing the cost of data transfer.
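The preference described above can be illustrated with a minimal Python sketch. It is not Alluxio's implementation: the tier names, the `pick_worker` helper, and the fallback to the first worker are all assumptions made for illustration. The idea is to walk the tiers from most to least specific and return the first worker that shares a value with the client.

```python
from typing import Dict, List, Optional

# Tiers ordered from most to least specific. The names are illustrative only.
LOCALITY_ORDER = ["node", "rack", "zone"]

Locality = Dict[str, str]


def pick_worker(client: Locality, workers: Dict[str, Locality],
                order: List[str] = LOCALITY_ORDER) -> Optional[str]:
    """Return the worker that matches the client at the most specific tier."""
    for tier in order:
        value = client.get(tier)
        if value is None:
            continue  # the client has no identity at this tier
        for name, locality in workers.items():
            if locality.get(tier) == value:
                return name
    # No tier matched: fall back to the first worker, if there is one.
    return next(iter(workers), None)


# Hypothetical cluster: three workers spread over three racks and two zones.
workers = {
    "worker-a": {"node": "host-a", "rack": "rack-1", "zone": "zone-x"},
    "worker-b": {"node": "host-b", "rack": "rack-2", "zone": "zone-x"},
    "worker-c": {"node": "host-c", "rack": "rack-3", "zone": "zone-y"},
}
client = {"node": "host-z", "rack": "rack-2", "zone": "zone-x"}
chosen = pick_worker(client, workers)  # no node match, so the rack tier decides
```

Here the client shares no node with any worker, so the rack tier decides and the worker in the same rack is chosen ahead of one that only shares the zone.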

Object Store Listing Optimizations

In the past, recursive listing of status (ls) was implemented by requesting the status of each file or directory from the UFS individually. This resulted in many calls to the UFS when a directory contained many files or subdirectories. When the UFS is an object store, the problem is exacerbated because calls to object stores are slower than calls to other UFS types. In v1.8, we use the bulk APIs of object stores and make only one call to the underlying UFS, regardless of how many files and directories are contained in the directory being listed. This results in a significant performance improvement when the user lists the status of large directories.
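To make the difference in call counts concrete, here is a small Python sketch that uses an in-memory stand-in for an object store. `FakeObjectStore` and both listing helpers are hypothetical and do not reflect Alluxio's or any object store's actual API. The sketch also ignores pagination, which real bulk listing APIs typically apply to very large result sets.

```python
from typing import Dict, List


class FakeObjectStore:
    """In-memory stand-in for an object store that counts remote calls."""

    def __init__(self, objects: Dict[str, int]) -> None:
        self._objects = dict(objects)  # key -> size in bytes
        self.calls = 0

    def get_status(self, key: str) -> int:
        """One remote call per object."""
        self.calls += 1
        return self._objects[key]

    def list_prefix(self, prefix: str) -> Dict[str, int]:
        """One remote call that returns every object under the prefix."""
        self.calls += 1
        return {k: v for k, v in self._objects.items() if k.startswith(prefix)}


def list_status_individually(store: FakeObjectStore, keys: List[str]) -> Dict[str, int]:
    # Older approach: ask for the status of each entry separately.
    return {key: store.get_status(key) for key in keys}


def list_status_bulk(store: FakeObjectStore, prefix: str) -> Dict[str, int]:
    # Bulk approach: a single prefix listing covers the whole subtree.
    return store.list_prefix(prefix)


objects = {"data/a.csv": 10, "data/b.csv": 20, "data/sub/c.csv": 30, "other/d.csv": 40}
keys = [k for k in objects if k.startswith("data/")]

slow_store = FakeObjectStore(objects)
slow_result = list_status_individually(slow_store, keys)

fast_store = FakeObjectStore(objects)
fast_result = list_status_bulk(fast_store, "data/")
```

Both approaches return the same statuses, but `slow_store.calls` grows with the number of entries while `fast_store.calls` stays at one.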

Improved Usability

Alluxio Servers Configuration Consistency Checking: v1.8 adds server-side configuration checking to help discover configuration errors and warnings. Suspected configuration errors are reported through the web UI, the doctor CLI, and the master logs.
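One way to picture a consistency check is as a comparison of the values each server reports for the same property. The Python sketch below is an assumption about the general idea, not the actual Alluxio logic. The `find_inconsistencies` helper, the server names, and the `example.*` property names are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


def find_inconsistencies(
    reported: Dict[str, Dict[str, str]]
) -> Dict[str, Dict[str, Set[str]]]:
    """Map each property with conflicting values to {value: servers using it}."""
    by_key: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for server, conf in reported.items():
        for key, value in conf.items():
            by_key[key][value].add(server)
    # Keep only properties for which servers disagree.
    return {key: dict(values) for key, values in by_key.items() if len(values) > 1}


# Hypothetical configuration reported by one master and two workers.
reported = {
    "master-1": {"example.block.size": "128MB", "example.tier.levels": "1"},
    "worker-1": {"example.block.size": "128MB", "example.tier.levels": "1"},
    "worker-2": {"example.block.size": "64MB", "example.tier.levels": "1"},
}
issues = find_inconsistencies(reported)
```

In this example only `example.block.size` is flagged, together with the servers holding each conflicting value, which is the kind of detail a report in a UI or log would need.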

New fsadmin CLI Commands: A new CLI has been added to help with monitoring and debugging from the command line. The CLI exposes most of the information available in the web UI.

Journal Backup and Restore: Users can now take journal backups and re-apply them at a later time.

Simplified Configuration Management: Alluxio v1.8 simplifies configuration management by supporting cluster-wide default configuration. As a result, client applications such as Alluxio Shell commands, Spark jobs, or MapReduce jobs can initialize their configuration, including client-side settings, with the cluster-wide values retrieved from the masters.
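As a rough sketch of how a client might combine cluster-wide defaults with its own settings, consider the Python snippet below. The `init_client_conf` helper and the `example.*` property names are hypothetical. The precedence shown, where a value set explicitly on the client overrides the cluster default, is an assumption for illustration.

```python
from typing import Dict


def init_client_conf(cluster_defaults: Dict[str, str],
                     local_settings: Dict[str, str]) -> Dict[str, str]:
    """Start from cluster-wide defaults, then apply settings given locally."""
    conf = dict(cluster_defaults)  # copy so the defaults are left untouched
    conf.update(local_settings)    # assumed precedence: local values win
    return conf


# Hypothetical defaults, as if retrieved from the masters.
cluster_defaults = {
    "example.client.write.type": "CACHE_THROUGH",
    "example.client.block.size": "128MB",
}
# A single job overrides one value and inherits the rest.
spark_job_conf = init_client_conf(
    cluster_defaults, {"example.client.block.size": "256MB"}
)
```

With this arrangement, an administrator changes a default once on the cluster, and each application only needs to carry the settings that are specific to it.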

Enhanced Metrics: Alluxio v1.8 greatly increases the coverage of metrics reported by the system. All RPC requests are recorded with an associated user, resulting in a detailed set of machine-consumable metrics for monitoring the Alluxio cluster. Alluxio continues to use the CodaHale library for metrics and provides a number of common sinks out of the box, including Graphite, CSV, and Prometheus.
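The idea of recording each RPC with its user can be sketched in a few lines of Python. This is an illustration only. The RPC names, the user names, and the CSV row layout are assumptions and do not reflect Alluxio's actual metric names or sink formats.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def count_rpcs(events: Iterable[Tuple[str, str]]) -> Counter:
    """Count requests per (rpc, user) pair."""
    return Counter(events)


def to_csv_rows(counts: Counter) -> List[str]:
    """Render the counters as 'rpc,user,count' rows in a stable order."""
    return [f"{rpc},{user},{n}" for (rpc, user), n in sorted(counts.items())]


# Hypothetical stream of (rpc, user) events seen by a server.
events = [
    ("GetStatus", "alice"),
    ("GetStatus", "alice"),
    ("ListStatus", "bob"),
]
rows = to_csv_rows(count_rpcs(events))
```

Keying the counters by user as well as by RPC is what lets a monitoring system break load down per tenant rather than only per operation.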

Click here to Download Alluxio