wibeasley / RAnalysisSkeleton

License: GPL-2.0
Files and settings commonly used in analysis projects with R

R Analysis Skeleton

No one beginning a data science project should start from a blinking cursor.
...Templatization is a best practice for things like using common directory structure across projects...
-Megan Risdal, Kaggle Product Lead

This project contains the files and settings commonly used in analysis projects with R. A developer can start an analysis repository more quickly by copying these files. The purpose of each directory is described in its README file. Some aspects are more thoroughly described in Collaborative Data Science Practices.

Pipelines

The repo contains two pipelines that aim to be simple enough to understand, yet complex enough to mimic aspects frequently seen in analysis projects.

Cars

The simplest example involves a csv that is lightly groomed and saved as an rds file. A knitr Rmd file analyzes the rds; the text, graphs, and tables are saved as a self-contained html. The html file is very portable; it can be saved on a drive, emailed to a colleague, or publicly served on a website.
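
The groom-and-save step described above can be sketched in a few lines of base R. This is only an illustration, not the repo's actual script: the temporary file paths, the use of the built-in `mtcars` data as a stand-in for the raw csv, and the specific grooming choices are all assumptions.

```r
# Hypothetical paths; the repo's real file names and locations may differ.
path_in  <- tempfile(fileext = ".csv")
path_out <- tempfile(fileext = ".rds")

# Stand-in for the raw csv: write R's built-in mtcars data to disk.
utils::write.csv(datasets::mtcars, path_in, row.names = TRUE)

# Read the raw csv.
ds <- utils::read.csv(path_in, stringsAsFactors = FALSE)

# Light grooming: name the first column, lowercase all names, and
# convert a coded variable to a labeled factor.
names(ds)[1] <- "model"
names(ds)    <- tolower(names(ds))
ds$am        <- factor(ds$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Save the groomed data as an rds so downstream reports can skip the grooming.
saveRDS(ds, path_out)

# A report (e.g., a knitr Rmd) would begin by reading the rds.
ds_report <- readRDS(path_out)
```

Unlike a csv, an rds file preserves column types such as factors, so the report does not need to repeat the grooming decisions.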

[Figure: flow-skeleton-car]

Intra-individual Differences

Most nontrivial data science projects require multiple sources to address a single issue. This example uses three sources: (a) longitudinal measurements for individuals across time (mlm.csv), (b) static county characteristics (county.csv), and (c) longitudinal county-level characteristics (te.csv). Each csv is independently groomed and loaded into its own database table (in db.sqlite) by an ellis lane. Conventional statistical software is not designed to digest multiple data rectangles; a scribe transforms multiple database-normalized tables into a single rds that can be analyzed directly. In this case, the mlm.rds supports two analyses: a conventional report of statistical inferences intended for subject-experts concerned with complex hypotheses, and a dashboard of simplified patterns intended for administrators concerned with operational progress. The te.rds supports a comparison of the time and effort results between counties.
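
As a rough illustration of what a scribe does, the sketch below joins two normalized tables into one analysis-ready rectangle and saves it as an rds. It makes several assumptions: in-memory data frames stand in for the tables that the repo keeps in db.sqlite, and every column name and value is hypothetical toy data rather than anything taken from the repo.

```r
# Hypothetical stand-ins for two database tables; the column names and
# values are illustrative and are not taken from the repo.
ds_county <- data.frame(
  county_id   = c(1L, 2L),
  county_name = c("Adair", "Alfalfa"),
  stringsAsFactors = FALSE
)
ds_measure <- data.frame(
  subject_id = c(101L, 101L, 102L, 103L),
  county_id  = c(1L, 1L, 2L, 2L),
  wave       = c(1L, 2L, 1L, 1L),
  score      = c(10.5, 11.0, 9.2, 12.3)
)

# Join the normalized tables into a single rectangle:
# every measurement row gains its county's static characteristics.
ds <- merge(ds_measure, ds_county, by = "county_id", all.x = TRUE)
ds <- ds[order(ds$subject_id, ds$wave), ]

# Persist the single rectangle for the reports to read.
path_out <- tempfile(fileext = ".rds")
saveRDS(ds, path_out)
```

Keeping the tables normalized until this step means each source is groomed once, while each report still receives the single flat file that most modeling functions expect.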

[Figure: flow-skeleton]

Establishing a Workstation for Analysis

  1. Install and configure the needed software, as described in the Workstation chapter of Collaborative Data Science Practices. Select the programs to meet your needs, and if in doubt, cover the Required Installation section and then pick other tools as necessary.

  2. Download the repo to your local machine. One option is to clone it.

  3. On your local machine, open the project in RStudio by double-clicking RAnalysisSkeleton.Rproj.

  4. Install the packages needed for this repo. Within the RStudio console, execute these two lines. The first line installs a package. The second line inspects the repo's DESCRIPTION file to identify and install the prerequisites.

    remotes::install_github(repo="OuhscBbmc/OuhscMunge")
    OuhscMunge::update_packages_addin()

  5. Execute the entire pipeline of the repo by executing the flow.R file. Open it in RStudio and click the 'Source' button near the top right of the screen. The flow file then tells other files to run in the desired order. Running this file creates the data objects (i.e., the primary objective of this repo). The objects include (a) intermediate data files, (b) analysis-ready data files, and (c) html reports that display the ultimate analyses.

  6. If you'd like to view the database created by this repo's pipeline, install a program that can visually explore a SQLite file. Two of many options are SQLiteStudio and DB Browser for SQLite.
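
The idea behind the flow file in step 5, one script that runs the others in a declared order, can be sketched as follows. This is a minimal illustration and not the contents of the repo's flow.R: the three script names are hypothetical, and the scripts themselves are tiny stand-ins written to a temporary directory so the example is self-contained.

```r
# Create tiny stand-in scripts in a temp directory. Each one appends its
# own name to `run_log`, which makes the execution order visible.
dir_scripts <- tempfile("flow-demo-")
dir.create(dir_scripts)

steps <- c("ellis.R", "scribe.R", "report.R")  # hypothetical file names
for (step in steps) {
  writeLines(
    sprintf('run_log <- c(run_log, "%s")', step),
    file.path(dir_scripts, step)
  )
}

# Run every script inside one dedicated environment so the steps share state
# without touching the global workspace.
env <- new.env()
env$run_log <- character(0)

# The flow: source each file in the declared order. An error in any step
# halts the loop, so later steps never run on incomplete inputs.
for (step in steps) {
  message("Running ", step)
  source(file.path(dir_scripts, step), local = env)
}

print(env$run_log)
```

Declaring the order in a single vector gives collaborators one place to see how the pieces depend on each other.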
