Projects
Sandia
I've worked in a wide variety of areas in my time at Sandia National Labs.
Visualizations:
- Created or contributed to over a dozen Tableau visualizations
  - Calculated fields
  - Text formatting
  - Swappable sheets via parameters
  - Axis synchronization
  - Dataset replacement
  - Data blending across multiple datasets
  - Table joining
- Help manage Tableau Server
- Created or contributed to over a dozen Shiny visualizations
  - Mapping via Leaflet
  - Graphing via Plotly (bar chart, Sankey, line graph, control chart, etc.)
  - Network diagrams via networkD3
  - Integration with Piwik/Matomo
  - Datatables
  - Custom CSS
  - Shinydashboard, shinyBS, shinyWidgets, etc.
- Help manage RStudio Connect server
Data Transformation and Manipulation:
- Python and R scripting
- ODBC and JDBC connectivity
- MongoDB connectivity via jsonlite and pymongo
- Reading/writing Excel, CSV, feather, .rds, etc.
- Heavy tidyverse (especially dplyr) and Pandas use
- Package management with Anaconda and pip
- Jupyter and RMarkdown notebooks
- REST API web services (e.g. SciVal/Scopus publication data)
- Data virtualization via visual tools
- Writing SQL, Mongo queries, etc.
- Git version control
Data Architecture and Data Governance:
- Member of the working team on data architecture and the analytics pipeline
- Testing and evaluating Cloudera services (CDF, CDP, CDSW, etc.)
- Member of the working team on data governance; helped implement and populate the Collibra data catalog
Dev-Ops and Data Engineering:
- GitLab CI/CD pipeline creation via YAML and GitLab Runners
- Automated deployments
- Basic architecture design
- Docker
- Program and tooling research (Hortonworks/Cloudera, Apache Airflow, Apache NiFi, Rancher, etc.)
- Linux command line
Project Management and Leadership:
- Asked to co-lead analytics sub-team after only ~1.25 years
- Run the Data Sciences Community of Practice
- Presentation of results and demos to leadership
- Work directly with customers on features, functionality, etc.
- Writing funding proposals, fiscal year planning, project planning
- Manage user stories and tasks in Rally, Jira, and GitLab using agile methodology
- Project lead
User Experience (UX):
- User interviews
- User testing, time-on-task, success rates, SUS, etc.
- Survey design
- Heuristic reviews
- Design and mockup creation using Axure
Presentations:
- National Laboratories Information Technology Summit (NLIT) 2022
  - Sandia Insights - A Data Sciences Architecture and Framework (PowerPoint Presentation)
- National Laboratories Information Technology Summit (NLIT) 2019
  - Viz Wars: Tableau vs. Shiny (PowerPoint Presentation)
  - Sandia Insights - A Data Sciences Architecture and Framework (PowerPoint Presentation)
Kort
Kort is a Node.js application for running user experience (UX) surveys and research methods. It currently supports:
- Card Sorting
- Tree Testing
- Product Reaction Cards
- System Usability Scale (SUS)
- Net Promoter Score (NPS)
It uses MongoDB as its database backend, and it's free and open source under the GPL license. I work on it with Leif Berg, a good friend from grad school. We use GitHub Actions to build and test the codebase automatically whenever commits are pushed to GitHub.
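For anyone unfamiliar with how two of these methods are scored, here is a minimal Python sketch of the standard SUS and NPS formulas. It is only an illustration: Kort itself is a Node.js application, and the function names `sus_score` and `nps` are made up for this example rather than taken from its codebase.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Score one System Usability Scale questionnaire (10 items, each rated 1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each SUS response must be between 1 and 5")
    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribute (response - 1).
            total += response - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribute (5 - response).
            total += 5 - response
    # The raw total ranges from 0 to 40; scale it to 0-100.
    return total * 2.5


def nps(ratings: Sequence[int]) -> float:
    """Net Promoter Score: percent promoters (9-10) minus percent detractors (0-6)."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)
```

A participant who answers 3 on every SUS item lands exactly in the middle of the scale, which makes a handy sanity check for any implementation.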
Steam Analysis
Steam is a very popular PC gaming platform and storefront. Using its APIs, I wrote a set of Python scripts that pull game information, player counts, and more into a MongoDB database on a schedule. A Raspberry Pi 3 runs these scripts and hosts the database. I'm interested in analyzing this data to spot trends in game popularity, types of games, price changes, and more.
I've started some basic descriptive analysis in Jupyter notebooks, as well as topic modeling, visualized with the pyLDAvis package.
The code is available on GitHub. In the future, after I've collected more data, I hope to perform additional analysis such as price prediction.
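To give a flavor of the collection step, here is a minimal sketch of one polling call. It is not the code from the repository: the endpoint URL and the `response.player_count` field reflect my understanding of the public Steam Web API and should be treated as assumptions, and the helper names (`build_url`, `to_document`, `fetch_player_count`) are hypothetical.

```python
import json
from datetime import datetime, timezone
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint and response shape; check the Steam Web API docs before relying on them.
PLAYER_COUNT_URL = (
    "https://api.steampowered.com/ISteamUserStats/GetNumberOfCurrentPlayers/v1/"
)


def build_url(app_id: int) -> str:
    """Build the request URL for one game's current player count."""
    return PLAYER_COUNT_URL + "?" + urlencode({"appid": app_id})


def to_document(app_id: int, payload: dict, when: datetime) -> dict:
    """Turn one API response into a record that could be stored in MongoDB."""
    response = payload.get("response", {})
    return {
        "app_id": app_id,
        "player_count": response.get("player_count"),  # None if the field is missing
        "fetched_at": when,
    }


def fetch_player_count(app_id: int) -> dict:
    """Call the API once and return a timestamped document."""
    with urlopen(build_url(app_id), timeout=10) as reply:
        payload = json.load(reply)
    return to_document(app_id, payload, datetime.now(timezone.utc))
```

A scheduler such as cron could call `fetch_player_count` for each tracked game, and each returned dictionary could then be written with pymongo's `insert_one`. Keeping the parsing in `to_document` separate from the network call makes it easy to test without hitting the API.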
Kaggle
Kaggle is a popular Data Science community and competition site where contestants use programming and analysis to work on interesting problems.
I used Pandas and scikit-learn to analyze hundreds of thousands of HTML files as part of the 'Truly Native' Kaggle competition. The objective was to identify whether or not a webpage contained native advertising. I used the Random Forest classifier from scikit-learn to make the predictions. The code is available on GitHub.
Arma
Arma, the popular series of computer games from Bohemia Interactive, has an extensive modding community. For fun, I've worked on a number of projects for it.
- Arma2NetMySQL is a program for connecting to a MySQL or SQLite database from within the game.
- ArmaConnect is a C# program for connecting to the game and using UDP heartbeats to look for a corresponding Android application running on your phone or tablet. Once it finds a connection, it switches to TCP and shows your position on a map of the game world, weather, time, and other information in real-time. This work involved a lot of network negotiation and learning as well as coding across languages (C# and Java).
- ArmaBriefingConversion is a Python script for converting missions from an older version of Arma to a newer version.
- PHPArma2ServerStatus is a PHP script for querying an Arma server and showing information such as people connected, running mods, etc.
- ArmaScriptTrace is a Java program that uses GraphViz to trace the execution of scripts in a mission and show them as a network diagram.
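The UDP-heartbeat-then-TCP pattern described for ArmaConnect can be sketched roughly as follows. This is a Python illustration of the general idea only, not ArmaConnect's actual C#/Java code or wire format: the JSON message layout, the `MAGIC` marker, and all function names are assumptions made up for this example.

```python
import json
import socket
from typing import Optional, Tuple

MAGIC = "arma-heartbeat"  # hypothetical marker, not ArmaConnect's real wire format


def make_heartbeat(tcp_port: int) -> bytes:
    """Encode a heartbeat announcing which TCP port the sender accepts connections on."""
    return json.dumps({"magic": MAGIC, "tcp_port": tcp_port}).encode("utf-8")


def parse_heartbeat(data: bytes) -> Optional[int]:
    """Return the advertised TCP port, or None if the datagram is not a heartbeat."""
    try:
        message = json.loads(data.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        return None
    if not isinstance(message, dict) or message.get("magic") != MAGIC:
        return None
    port = message.get("tcp_port")
    return port if isinstance(port, int) else None


def wait_for_peer(
    udp_sock: socket.socket, timeout: float = 5.0
) -> Optional[Tuple[str, int]]:
    """Wait for a valid heartbeat; return (host, tcp_port), or None on timeout."""
    udp_sock.settimeout(timeout)
    try:
        while True:
            data, (host, _udp_port) = udp_sock.recvfrom(1024)
            tcp_port = parse_heartbeat(data)
            if tcp_port is not None:
                return host, tcp_port
            # Anything else on this port is ignored and we keep listening.
    except socket.timeout:
        return None
```

Once `wait_for_peer` returns an address, the listener can switch to a reliable channel with `socket.create_connection((host, tcp_port))` and stream position, weather, and time updates over TCP. UDP suits the discovery phase because heartbeats are cheap and losing one does no harm.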