Monday, 7 March 2016

Analyzing Data with Salt Viz Library




Data scientists have many visualization libraries to choose from when building narratives: Plotly, matplotlib, seaborn, ggplot2, D3.js, … the list is long. However, when the data grows large and memory runs low, these libraries struggle to keep up.

Salt is a visualization library that leverages Apache Spark to create big-data visualizations. It is built around two concepts:

1) Dimension reduction: i.e. transforming the data space into a smaller visualization space.

2) Data aggregation: values in the visualization space that are close to each other are grouped via one of seven sample aggregators.

These simple concepts allow for the creation of a variety of powerful big-data visualizations. Several well-explained examples are available in the GitHub repository.
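To make the two concepts concrete, here is a minimal sketch in plain Python. It does not use Salt or Spark, and the function names (`to_bin`, `aggregate_counts`) are made up for illustration rather than taken from Salt's API. It maps 2D points onto a small grid of bins (dimension reduction) and then counts the points that fall into each bin (a simple count aggregation).

```python
from collections import defaultdict


def to_bin(x, y, bounds, bins):
    """Dimension reduction: map a data-space point to a (col, row) bin.

    bounds is ((x_min, y_min), (x_max, y_max)); bins is (n_cols, n_rows).
    Returns None for points outside the bounds.
    """
    (x_min, y_min), (x_max, y_max) = bounds
    n_cols, n_rows = bins
    if not (x_min <= x <= x_max and y_min <= y <= y_max):
        return None
    # Scale into [0, n) and clamp so the upper edge lands in the last bin.
    col = min(int((x - x_min) / (x_max - x_min) * n_cols), n_cols - 1)
    row = min(int((y - y_min) / (y_max - y_min) * n_rows), n_rows - 1)
    return col, row


def aggregate_counts(points, bounds, bins):
    """Data aggregation: count how many points fall into each bin."""
    counts = defaultdict(int)
    for x, y in points:
        key = to_bin(x, y, bounds, bins)
        if key is not None:
            counts[key] += 1
    return dict(counts)


# Hypothetical sample data: five points, one of them out of bounds.
points = [(0.1, 0.1), (0.2, 0.3), (0.9, 0.9), (1.0, 1.0), (5.0, 5.0)]
grid = aggregate_counts(points, bounds=((0.0, 0.0), (1.0, 1.0)), bins=(2, 2))
print(grid)
```

In a Spark setting, the same two steps would presumably run per partition and the partial results would then be merged, which is what lets the approach scale beyond a single machine's memory.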

You will need Docker, a Java compiler, Gradle (a build automation system), and Node + npm installed on your local machine in order to run these examples.
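A quick way to check these prerequisites is to look for each tool on your PATH. The sketch below assumes the usual executable names (`docker`, `javac`, `gradle`, `node`, `npm`); your setup may differ, for example if the project uses a Gradle wrapper script instead of a system-wide Gradle.

```python
import shutil

# Assumed executable names for the prerequisites listed above.
REQUIRED_TOOLS = ["docker", "javac", "gradle", "node", "npm"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All required tools were found on the PATH.")
```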

