Data analysis and visualization are integral parts of scientific discovery. Boasting a vast collection of open-source (free) libraries for diverse data operations, visualization, and statistical analyses, R has become a sought-after skill for researchers and data analysts alike. This two-part workshop will provide hands-on exercises on data visualization using R. In Part I, we will discuss plotting basics, data manipulation, ggplot, text mining, and word clouds. We will also introduce spatial data mapping and interactive plots using Leaflet. In Part II, we will demonstrate geospatial operations such as projection, resampling, spatial extraction, cropping, and masking using rasters, shapefiles, and spatial data frames. Conversion between data formats such as data frames, matrices, rasters, and structured data like NetCDF will also be demonstrated. Advanced topics will include working with data cubes (RasterStack/RasterBrick), layer-wise operations on data cubes, and cell-wise operations on raster time series using user-defined functions. Special focus will be given to the parallel implementation of custom layer-wise and cell-wise operations on RasterStack/RasterBrick objects for large-scale datasets.
Instructor and lead
Vinit Sehgal is a Ph.D. student at the Water Management and Hydrological Science Program at Texas A&M University. His research interests include scaling issues in hydrology, hydroclimatology, remote sensing and soil physics. In his Ph.D. research, Vinit studies the impact of scale and heterogeneity on observable hydrological processes at large spatial scales using satellite data.
Leah Kocian is a Ph.D. student at the Biological and Agricultural Engineering Department at Texas A&M University. Leah received her B.S. from the same program in 2019. Her research focuses on water quality and management and the biochemistry of soils, with an emphasis on contaminant transport in soils.
Shubham Jain is a Ph.D. student at the Water Management and Hydrological Science Program at Texas A&M University and a Research Assistant at the Texas Water Resources Institute. His research interests include GIS applications to hydrologic modeling and analyzing the impacts of climate and land cover change on water resources.
Alan C. Lewis is a Ph.D. candidate at the Water Management and Hydrological Science Program at Texas A&M University. He studies urban outdoor water demand to advance future approaches to water conservation planning. In 2019, BVWaterSmart won statewide recognition for their work on residential water conservation and earned the Texas Blue Legacy Award and the Texas Environmental Excellence Award.
| 1:00 PM – 3:00 PM | Introduction to data visualization in R |
This session covers the basics of data visualization in R. The following topics will be covered:
Plotting basics in R
Data visualization using ggplot
Text mining and word clouds
Introduction to spatial mapping with Leaflet
| 3:00 PM – 3:15 PM | Break |
| 3:15 PM – 5:15 PM | Large-scale geospatial data analysis |
This session focuses on advanced GIS-type operations on geospatial data (rasters/shapefiles) in R, including applications of parallel computing for cell-wise and layer-wise analysis of geospatial data. The following topics will be covered:
Raster data and spatial polygons
Working with data cubes (RasterStack/RasterBrick)
NetCDF/H5 datasets
User-defined cell-wise/layer-wise operations on data cubes (stackApply, calc, cellStats, etc.)
Parallel computing for gridded/layered operations