Current Research

Cyberinfrastructure for Intelligent High-Resolution Snow Cover Inference from Cubesat Imagery

The ability to observe the Earth from space at relevant spatial and temporal scales is key to understanding how hydrological and ecological systems will respond to climate change. In particular, high spatial and temporal resolution (meter scale, daily frequency) observations of snow-covered areas in mountain regions are critical as snow is important for water resources, driving the seasonal hydrological regimes of the Western U.S., with significant impacts on ecological communities. Planet Labs, Inc. (Planet) is a promising new source of commercial Cubesat high-resolution imagery that can be used in environmental science, as it has both high spatial (3.0-4.0 m) and temporal (1-2 day) resolution. This project will develop open-source, cloud-based cyberinfrastructure including an automated pipeline for processing, analyzing and interpreting Planet Cubesat image data using a machine learning approach to infer snow cover at meter-scale resolution. All models and data products will be openly available for use and modification by scientific communities. The project will support the training of students, postdocs and other early-career researchers through training events, special interest groups, and incubator programs.

Currently, remotely sensed snow observations with adequate temporal (daily) resolution are either captured at a spatial scale far too coarse to be relevant to high-resolution hydrology and ecology studies (e.g., MODIS, 500 m) or are appropriate in spatial scale (1-10 m) but have inadequate temporal resolution and are cost-prohibitive (e.g., airborne LiDAR). The recent increase in commercial Earth Observation data with high spatiotemporal resolution may bridge the gap between ground-based and low-resolution satellite observation data. This project will focus on using convolutional neural network-based models to couple ground- and airborne-derived snow observations with Planet imagery in three different montane systems in Washington, California, and Colorado. These sites have very good coverage of ground and airborne snow observations at high resolution (3 m) collected by the NASA Airborne Snow Observatory (ASO) and SnowEx missions, which will be used in the training and validation of the models. The project will develop advanced cyberinfrastructure using scalable virtual machines, distributed collaborative architecture, reusable computational frameworks, and replicable machine learning workflows to empower Earth scientists to access, process, and generate high-resolution snow products from Cubesat data. The project will adopt open-source strategies, ensure that all data, algorithms, and architecture comply with FAIR data principles and support reproducibility, and include training materials that promote the adoption of the infrastructure and tools.

Funding: NSF Geoinformatics.

Machine Learning Training and Curriculum Development for Earth Science Studies

Earth system science discoveries increasingly depend on data management, analysis, and inference using powerful machine learning (ML) techniques. Yet the skills required to perform these tasks, and training in the cutting-edge, open-source technology used to build ML models and pipelines and to work with big data and cloud computing, are not covered by the traditional graduate curriculum in the geosciences. To fill these gaps, this project will develop the GeoScience MAchine Learning Resources and Training (GeoSMART) framework, which will build a foundation in open-source scientific ecosystems and in general ML theory, toolkits, and deployment on cloud computing.

This project will bring together a team of geoscience and ML educators to create a novel ML curriculum with a focus on seismology, cryosphere, and hydrology applications. The training materials will be included in an enhanced curriculum that will broaden the impact on emerging ML communities. The project’s implementation plan will provide training in open-source ML toolkits and data science skills. Further, the project will cultivate the development of discipline-specific ML libraries, workflows, and communities of practice to sustain future growth of ML cybertraining opportunities. By building tools on open-source and cloud-accessible platforms, and by partnering with colleges and institutions that lack computing resources for ML workflows, the project will increase access to cybertraining materials and help to solve geoscience challenges.

Funding: NSF Cybertraining.

Evaluation of remotely sensed snow covered area datasets across the Sierra Nevada meadows

We will generate high-resolution snow-covered area (SCA) maps across the Sierra Nevada watersheds and will examine snow mapping performance in and around the meadow areas at several times during the ablation season. Meadow ecosystems of the Sierra Nevada depend on the snowpack, groundwater, and other hydrologic functions that ensure proper habitat for diverse biota. They provide key ecosystem services, exist in a fragile equilibrium, and are at risk from human and hydroclimatic changes. We will use the digitized meadow dataset to identify meadows with an area greater than 1 acre (~4000 m²) and will analyze model performance for areas covering at least 3x3 MODIS pixels. We will compare the fractional snow-covered area (fSCA) reconstructed from the Planet data with MODSCAG fSCA and other forest-corrected fSCA products to diagnose differences in snow data from two satellite image sources with similar temporal resolution (~daily) but very different spatial resolutions.
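
The comparison between high-resolution snow maps and coarse fSCA products can be illustrated with a minimal sketch. It assumes a binary snow mask (1 = snow, 0 = no snow) that has already been co-registered to the coarse grid and that an integer number of fine pixels fits in each coarse pixel. The function name `aggregate_fsca` and the toy 4x4 mask are hypothetical and are not part of the project's pipeline. A real Planet-to-MODIS comparison would also need reprojection and resampling, since 500 m is not an integer multiple of the roughly 3 m Planet pixel size.

```python
import numpy as np


def aggregate_fsca(snow_mask, factor):
    """Block-average a binary snow mask into fractional snow-covered area.

    snow_mask: 2-D array of 0/1 values on the fine grid (hypothetical input).
    factor: number of fine pixels per coarse pixel along each axis.
    """
    mask = np.asarray(snow_mask, dtype=float)
    # Trim edge rows/columns that do not fill a complete coarse pixel.
    rows = (mask.shape[0] // factor) * factor
    cols = (mask.shape[1] // factor) * factor
    trimmed = mask[:rows, :cols]
    # Group fine pixels into (factor x factor) blocks and average each block.
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))


# Toy example: a 4x4 fine-resolution mask aggregated to a 2x2 coarse grid.
mask = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
])
fsca = aggregate_fsca(mask, 2)
print(fsca)
```

Each value in `fsca` is the fraction of fine pixels classified as snow inside one coarse pixel, which is the quantity that can be differenced against a coarse-resolution fSCA product on the same grid.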

Funding: NASA.

What's in a pixel? Snow water equivalent and subpixel variability at multiple spatial resolutions in mountainous terrain

Snowpack melt in spring and summer is critical to meeting water demands in the Western United States. Reliable streamflow forecasting for water management requires robust estimation of the water stored in the snowpack and the application of distributed hydrologic models. Recent progress in acquiring and processing light detection and ranging (lidar) observations to estimate snow depth has made it possible to evaluate how a hydrologic model represents the spatial distribution of snow. Multiple studies have identified causes of model uncertainty across a range of models of varying complexity. Snowmelt model predictions are uncertain primarily because of inadequate or erroneous forcing data, model structure and process representation, and coarse spatial resolution and the representation of subgrid variability. We propose to apply distributed hydrologic models at different resolutions across several mountainous regions where lidar-derived snow depth datasets are available. We will use forcing datasets from multiple sources to run different types of models at spatial resolutions ranging from 30 to 1000 m. We will examine how well the models simulate snow water equivalent at different spatial scales, paying particular attention to their ability to represent subgrid variability, spatial patterns of snow in complex terrain, and snow in forested areas.

Snow in complex terrain is highly variable, so distributed modeling is essential for properly representing snow and streamflow. Remote sensing (using both spaceborne and airborne instruments) provides critical spatial measurements, but to definitively identify the “best” model configuration, multiple measurements are needed at different times of year and at different spatial scales. We combine hydrologic modeling with remotely sensed distributed snow depth and land surface temperature data to diagnose how well models represent distributed snow processes.
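
As a rough illustration of what "subgrid variability" means at different model resolutions, the sketch below computes the mean and the coefficient of variation (CV) of snow depth inside each coarse grid cell for several aggregation factors. The snow depth field here is synthetic (drawn from a gamma distribution purely for demonstration), and the function name `subgrid_stats` is hypothetical. It is not the project's method, only one simple way such variability could be summarized.

```python
import numpy as np


def subgrid_stats(depth, factor):
    """Return per-cell mean and coefficient of variation of snow depth.

    depth: 2-D array of fine-resolution snow depth (hypothetical input).
    factor: number of fine pixels per coarse model cell along each axis.
    """
    d = np.asarray(depth, dtype=float)
    # Trim edges so the grid divides evenly into coarse cells.
    rows = (d.shape[0] // factor) * factor
    cols = (d.shape[1] // factor) * factor
    blocks = d[:rows, :cols].reshape(rows // factor, factor, cols // factor, factor)
    mean = blocks.mean(axis=(1, 3))
    std = blocks.std(axis=(1, 3))
    # CV = std / mean; cells with zero mean depth are assigned CV = 0.
    cv = np.divide(std, mean, out=np.zeros_like(std), where=mean > 0)
    return mean, cv


# Synthetic snow depth field (assumption: gamma-distributed, for demo only).
rng = np.random.default_rng(0)
depth = rng.gamma(shape=2.0, scale=0.5, size=(120, 120))

for factor in (10, 30, 60):
    mean, cv = subgrid_stats(depth, factor)
    print(f"factor={factor:3d}  cells={mean.size:4d}  median CV={np.median(cv):.2f}")
```

With lidar-derived snow depth in place of the synthetic field, the per-cell CV gives a reference value for the variability that a model at that resolution has to represent within each grid cell.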

Funding: NASA.
