The paper "FUn: A Framework for Interactive Visualizations of Large, High Dimensional Datasets on the Web" has been accepted for publication by Bioinformatics.
During the past decade, big data has become a major tool in scientific endeavors. While statistical methods and algorithms are well suited to analyzing and summarizing enormous amounts of data, their results do not allow for a visual inspection of the entire dataset. Current scientific software, including R packages and Python libraries such as ggplot2, matplotlib, and plot.ly, does not support interactive visualizations of datasets exceeding 100,000 data points on the web. Other solutions enable the web-based visualization of big data only through data reduction or statistical representations. However, recent hardware developments, especially advancements in graphics processing units (GPUs), allow for the rendering of millions of data points on a wide range of consumer hardware such as laptops, tablets, and mobile phones. As with the challenges and opportunities that big data has brought to virtually every scientific field, the visualization of and interaction with copious amounts of data are both demanding and hold great promise.
Here we present FUn, a framework consisting of a client module (Faerun) and a server module (Underdark), which facilitates the creation of web-based, interactive 3D visualizations of large datasets and enables record-level visual inspection. We also introduce a reference implementation providing access to SureChEMBL, a database containing patent information on more than 17 million chemical compounds.
The source code and the most recent builds of Faerun, Underdark, and Lore.js, as well as the data preprocessing toolchain used in the reference implementation, are available on the project website (http://doc.gdb.tools/fun/).
Author(s): Daniel Probst and Jean-Louis Reymond