ICDM 2017: IEEE International Conference on Data Mining
Seventh Workshop on
Data Mining in Earth System Science (DMESS 2017)
Co-conveners: Forrest M. Hoffman, Auroop R. Ganguly, Jitendra Kumar, and Richard Tran Mills
Pontalba Room on the Mezzanine Level of
The Roosevelt New Orleans, A Waldorf Astoria Hotel
New Orleans, Louisiana, USA | November 18, 2017
Final Workshop Program (November 18, 2017)
|Time|Title|Presenter|Type|Paper ID|Authors|
|8:30|Introduction to Data Mining in Earth System Science (DMESS)|Forrest M. Hoffman|Introductory Presentation and Panel Charge||Forrest M. Hoffman, Auroop R. Ganguly, Jitendra Kumar, and Richard Tran Mills|
|9:00|Precipitation Estimate from Multi-Satellite Remote Sensing Measurements using Machine Learning Methods|Kuolin Hsu|Invited Keynote Presentation||Kuolin Hsu and Soroosh Sorooshian|
|9:30|Convolutional Neural Network Approach for Mapping Arctic Vegetation using Multi-Sensor Remote Sensing|Zachary Langford|Contributed Paper Presentation|SP19205|Zachary Langford, Jitendra Kumar, and Forrest M. Hoffman|
|10:15|Resolution Reconstruction of Climate Data with Pixel Recursive Model|Sookyung Kim|Invited Paper Presentation|SP19203|Sookyung Kim, Sasha Ames, Chengzhu Zhang, Jiwoo Lee, and Dean Williams|
|10:45|Quantifying Seasonal Patterns in Disparate Environmental Variables Using the PolarMetrics R Package|Bjørn-Gustaf Brooks|Contributed Paper Presentation|SP19206|Bjorn Brooks, Danny Lee, Ankur Desai, Lars Pomara, and William W. Hargrove|
|11:15|Vital Role of Training and Education in Big Data Applications|David A. Yuen and Gabriele Morra|Invited Keynote Presentation||David A. Yuen and Gabriele Morra|
|13:00|A Machine Learning Approach to Non-uniform Spatial Downscaling of Climate Variables|Soukayna Mouatadid|Contributed Paper Presentation|SP19207|Soukayna Mouatadid, Steve Easterbrook, and Andre Erler|
|13:30|Scalable Algorithms for Clustering Large Geospatiotemporal Data Sets on Manycore Architectures|Vamsi Sripathi|Invited Keynote Presentation||Vamsi Sripathi|
|14:00|Deriving Data-driven Insights from Climate Extreme Indices for the Continental US|David Sathiaraj|Contributed Paper Presentation|SP19201|Xinbo Huang, David Sathiaraj, Lei Wang, and Barry Keim|
|14:30|How Can Physics Inform Deep Learning Methods in Earth System Science?: Recent Progress and Future Prospects|Anuj Karpatne|Invited Keynote Presentation||Anuj Karpatne|
|15:15|DMESS Panel Discussion|||||
|19:15|DMESS Workshop Dinner at the Red Fish Grill (115 Bourbon Street). Meet in hotel lobby at 7:15 p.m. You are responsible for your own food and drinks.|||||
Spanning many orders of magnitude in time and space scales, Earth science data, from point measurements to process-based Earth system model output, are increasingly large and complex, and often represent very long time series, making these data difficult to analyze, visualize, interpret, and understand. An “explosion” of heterogeneous, multi-disciplinary data, including observations and models of interacting natural, engineered, and human systems, has rendered traditional means of integration and analysis ineffective, necessitating the application of new analytical methods and the development of highly scalable software tools for synthesis, assimilation, comparison, and visualization. For complex, nonlinear feedbacks among chaotic processes, new methods and approaches for data mining and computational statistics are required for classification and change detection, model evaluation and benchmarking, uncertainty quantification, and incorporation of constraints from physics, chemistry, and biology into analysis. This workshop explores various data mining approaches and algorithms for understanding nonlinear dynamics of weather and climate systems and their interactions with biogeochemical cycles, impacts of natural system responses and climate extremes on engineered systems and interdependent infrastructure networks, and mitigation and adaptation strategies for natural hazards and infrastructure and ecosystem resilience. Encouraged are original research papers describing applications of statistical and data mining methods that support analysis and discovery in climate predictability, attributions, weather extremes, water resources management, risk analysis and hazards assessment, ecosystem sustainability, infrastructure resilience, and geo-engineering.
Rigorous review papers that either have the potential to expose data mining researchers to commonly used data-driven methods in the Earth sciences or discuss the applicability and caveats of such methods from a machine learning or statistical perspective are also desired. Methods may include, but are not limited to, cluster analysis, empirical orthogonal functions (EOFs), extreme value and rare events analysis, genetic algorithms, neural networks and deep learning methods, physics-constrained data analytics, automated data assimilation, and other machine learning techniques. Novel approaches that bring new ideas from nonlinear dynamics and information theory, network science and graphical methods, and the state of the art in computational statistics and econometrics into data mining and machine learning are particularly encouraged.
Program Committee Members:
- Michael W. Berry (Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, Tennessee, USA)
- Bjørn-Gustaf J. Brooks (Eastern Forest Environmental Threat Assessment Center, USDA Forest Service, Asheville, North Carolina, USA)
- Nathaniel O. Collier (Computational Earth Sciences Group, Computational Sciences & Engineering Division and Oak Ridge Climate Change Science Institute (CCSI), Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
- Auroop R. Ganguly (Department of Civil and Environmental Engineering, Northeastern University, Boston, Massachusetts, USA)
- William W. Hargrove (Eastern Forest Environmental Threat Assessment Center, USDA Forest Service, Asheville, North Carolina, USA)
- Forrest M. Hoffman (Computational Earth Sciences Group, Computational Sciences & Engineering Division and Oak Ridge Climate Change Science Institute (CCSI), Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
- Jian Huang (Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, Tennessee, USA)
- Evan Kodra (risQ Incorporated, Cambridge, Massachusetts, USA)
- Jitendra Kumar (Terrestrial Systems Modeling Group, Environmental Sciences Division and Oak Ridge Climate Change Science Institute (CCSI), Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
- Vipin Kumar (Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, USA)
- Miguel D. Mahecha (Department of Biogeochemical Integration, Max Planck Institute for Biogeochemistry, Jena, Germany)
- Richard T. Mills (Intel Corporation, Hillsboro, Oregon, USA)
- Steven P. Norman (Eastern Forest Environmental Threat Assessment Center, USDA Forest Service, Asheville, North Carolina, USA)
- Sarat Sreepathi (Computer Science & Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
- Vamsi Sripathi (Intel Corporation, Hillsboro, Oregon, USA)
- Karsten Steinhaeuser (Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, USA)
- Min Xu (Computational Earth Sciences Group, Computational Sciences & Engineering Division and Oak Ridge Climate Change Science Institute (CCSI), Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA)
Authors are invited to submit manuscripts of up to 10 pages reporting unpublished, mature, and original research and recent developments/theoretical considerations in applications of data mining to Earth sciences by August 28, 2017, in IEEE 2-column format. Accepted papers will be printed in the conference proceedings. Additional details and a link to the manuscript submission system will be provided in the near future. Submission implies the willingness of at least one of the authors to register and present the paper.
Please submit your paper via the website at https://wi-lab.com/cyberchair/2017/icdm17/scripts/submit.php?subarea=SP19&undisplay_detail=1&wh=/cyberchair/2017/icdm17/scripts/ws_submit.php.
|Full paper submission:||August 28, 2017 — Deadline extended!|
|Author notification:||September 4, 2017|
|Camera-ready paper submission:||September 18, 2017 — Deadline extended!|
|DMESS 2017 Workshop:||November 18, 2017|
|ICDM 2017 Conference:||November 18–21, 2017|
Contribution to Computational Science:
This workshop will contribute to the field of Computational Science by creating a forum for original research papers and presentations from leading computational and Earth scientists who are applying data mining techniques on advanced computing platforms (HPC systems, clusters, grids and clouds) to distill knowledge from the massive—and growing—data sets created by the Earth science community.
About the Workshop Co-conveners:
Forrest M. Hoffman is a Senior Computational Climate Scientist at Oak Ridge National Laboratory (ORNL). As a resident researcher in ORNL’s Climate Change Science Institute (CCSI) and a member of ORNL’s Computational Sciences & Engineering Division (CSED), Forrest develops and applies Earth system models (ESMs) to investigate the global carbon cycle and feedbacks between biogeochemical cycles and the climate system. He applies data mining methods using high performance computing to problems in landscape ecology, remote sensing, and large-scale climate data analytics. He founded the workshop series on Data Mining in Earth System Science (DMESS) in 2009 and has served as lead convener for all six prior workshops. Forrest is also a Joint Faculty Professor in the University of Tennessee’s Department of Civil & Environmental Engineering in nearby Knoxville, Tennessee.
Auroop R. Ganguly is a civil and environmental engineer who works at the intersection of three broad areas: (1) Climate Extremes and Water Sustainability, (2) Infrastructural Resilience and Homeland Security, and (3) Applied Data and Computational Sciences. Prior to his current position as a faculty member at Northeastern University in Boston, MA, he was at the US Department of Energy’s Oak Ridge National Laboratory for seven years, at Oracle Corporation for five years, and at a startup subsequently acquired by Oracle for a year. In addition, he has a dual interest in ancient history and science fiction, i.e., the forgotten past and the unknown future. While he has nothing particularly against the here and now, he rarely gets any time to spend there and then. Once upon a time he obtained a PhD from MIT, and currently, other than his day job as the Principal Investigator of the SDS Lab at Northeastern, he takes a group of undergraduate students across India to study climate change, and happens to be the Chief Scientific Adviser for a startup, risQ Inc. (http://www.risq.io/), co-founded with one of his former PhD students.
Jitendra Kumar is a computational hydrologist at Oak Ridge National Laboratory and a Joint Assistant Professor at the University of Tennessee, Knoxville. He conducts research at the intersection of high performance computing, environmental and Earth sciences, and systems analysis and data mining. His research entails data mining, large-scale global optimization, computational hydrology and hydrogeology, landscape ecology, remote sensing, and development of parallel algorithms for large-scale supercomputers.
Richard Tran Mills is an HPC Earth System Models Architect at Intel Corporation, where he leads efforts related to weather, climate, and Earth System models and associated analysis tools on current and next-generation high-performance computing architectures. Prior to joining Intel in 2014, he spent a decade as a research scientist at Oak Ridge National Laboratory, and also held a joint faculty appointment at the University of Tennessee, Knoxville. His work has spanned high-performance scientific computing, geospatiotemporal data mining, computational hydrology, and climate change science. He is one of the original developers of PFLOTRAN, an open-source code for massively parallel simulation of hydrologic flow and reactive transport problems, and has also contributed to the development of PETSc, the Portable, Extensible Toolkit for Scientific Computation, a suite of solvers, data structures, and associated routines for the solution of a wide variety of scientific computing problems. He earned his Ph.D. in Computer Science in 2004 at the College of William and Mary, where he was a Department of Energy Computational Science Graduate Fellow. Prior to that, he studied geology and physics at the University of Tennessee, Knoxville as a Chancellor’s Scholar.
For assistance or additional information, contact Forrest Hoffman (email@example.com)
Last Modified: Monday, 20-Nov-2017 22:06:26 EST
extrapolations of experimental and theoretical knowledge.
Until recently, the geological sciences dealt with the great complexity of the Earth largely by focusing on specific spatial, temporal, or compositional regimes. Subdisciplines studied phenomena that were largely compartmentalized, and influences from outside the domain of study were greatly simplified or ignored. As a result, the various subdisciplines of the field—geology, geochemistry, geophysics, paleontology, geohydrology, geoengineering—were neatly pigeonholed and had little need or incentive to communicate or work with one another.
During the past 25 years, that fragmentation of the geological sciences has been disappearing under the influence of several momentous developments. The result has been a profound change in the way geoscientists study the Earth.
UNIFYING FORCES IN THE GEOLOGICAL SCIENCES
Many advances have simultaneously contributed to a general unification of the geological sciences, but three in particular stand out: the plate tectonic revolution, the enhanced ability to produce images of both the surface and the interior, and the increasing recognition of humanity as a geological agent.
In the 1960s the geological sciences experienced a conceptual revolution that continues to affect the field today. Traditionally, most geologists analyzed the history of the Earth primarily in terms of vertical movements: mountains emerged from a buckling crust and were eroded, sea levels rose and fell, whole areas of continents were uplifted. But a quarter century ago a series of developments in marine geology and paleomagnetism—the record of the magnetic signals preserved in the rocks—resulted in a radically new picture of the Earth. This new picture acknowledged the essential role of large-scale horizontal movements throughout the Earth's evolution as well as that of vertical movements.
This concept of plate tectonics has endured through two decades of scientific scrutiny and is now regarded as an established fact, a situation nearly unthinkable in 1960. Scientists now know that the crust is composed of about a dozen major plates (Plate 3) and several minor plates that constantly move and jostle each other in response to movements in the underlying mantle. Where two plates converge, one may override the other, and the leading edge of the lower plate may melt as it reaches greater depths or may produce melting in the overlying mantle. This convergence creates the oceanic trenches and zones of coastal volcanoes seen on the western edge of South America, in the Southwest Pacific, and elsewhere. If neither plate sinks, collision creates a wrinkled mountain belt, such as the Himalaya or the Urals. Where plates merely sideswipe each other, the boundary shears laterally, as happens along California's San Andreas fault. In each type of plate boundary zone, earthquakes occur when the plates bind, build up stress, and suddenly slip free. A plate may also break apart; two observable examples are a rift across a volcanic center, as at the Red Sea, and a rift through the interior of a plate related to stresses on a distant plate boundary, as at Lake Baikal.
The plate tectonic model enables geoscientists to synthesize formerly separate and enigmatic facets of crustal processes. This ongoing synthesis helps explain the distribution and timing of mountain building, igneous activity, earthquakes, sedimentary basins, ore deposits, and other features of both practical and theoretical import. Plate tectonics also establishes a new view of earth history, in which the plates have moved with time, with consequent effects on the hydrosphere, atmosphere, and biosphere.
Plate tectonics relates activity within the Earth that is directly associated with the movements of plates to characteristics and changes of the surface. For example, midplate hotspots, which are responsible for such features as the volcanic activity at Yellowstone Park and the Hawaiian Islands, are caused by plumes of hot material rising from within the mantle, perhaps from as deep as the core-mantle boundary. Plate tectonics also offers a natural framework for geochemical cycles, which involve the transfer of elements among the various envelopes that form the earth system.
The revolutionary theory of plate tectonics is comparable in power and elegance to the Copernican theory of the sixteenth century, to Newton's theory of gravitation in the seventeenth century, to the establishment of atomic theory at the beginning of the nineteenth century, to the theory of evolution in biology later in that century, and to the development of quantum mechanics and relativity in physics in the twentieth century. Plate tectonic theory is the newest of these major advances in human understanding, and as a result many of its implications have yet to be worked out. Fundamen-