Open-source scientific pipeline framework
Every lab has that folder.
The one full of scripts no one documented, results no one can reproduce, and a pipeline that only works on one person's machine. You inherited it from a grad student who graduated three years ago. Half the dependencies are pinned to versions that no longer exist. The other half aren't pinned at all.
You spend your first month not doing science—just getting the code to run.
GG.Flow is the fix.
GG.Flow is a scientific pipeline framework—a way to define, execute, and reproduce multi-step computational workflows. It handles dependency resolution, environment isolation, provenance tracking, and result caching so you can focus on the science instead of the plumbing.
Every run is logged with its exact configuration, dependencies, and inputs. Someone joins your lab next year? They run the same pipeline and get the same results.
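To make "logged with its exact configuration, dependencies, and inputs" concrete, here is a minimal sketch of what such a record could contain. It uses only Python's standard library and is not GG.Flow's own API; the function names `file_sha256` and `provenance_record`, the record layout, and the example `threshold` parameter are all assumptions made for illustration.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata


def file_sha256(path):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large input files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def provenance_record(config, input_paths, packages):
    """Collect configuration, dependency versions, and input hashes for one run.

    Hypothetical record layout, chosen for this sketch only.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "config": config,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "dependencies": versions,
        "inputs": {path: file_sha256(path) for path in input_paths},
    }


# Example: a run with one (hypothetical) parameter and no input files.
record = provenance_record({"threshold": 0.05}, [], ["numpy"])
print(json.dumps(record, indent=2, sort_keys=True))
```

Hashing the inputs rather than recording only their file names is the point of the sketch: two runs can then be compared on what they actually read, not on what the files happened to be called.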
Environment isolation means your pipeline runs the same way everywhere—your laptop, a cluster, a collaborator's workstation halfway around the world.
Change one parameter and re-run. GG.Flow caches everything upstream that didn't change, so you're not waiting hours for steps that already completed.
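The caching idea can be sketched in a few lines. This is a generic illustration in plain Python, not GG.Flow's implementation: the names `cache_key` and `run_step`, the in-memory cache, and the toy `load` and `scale` steps are all hypothetical. Each step's key is derived from its name, its parameters, and the keys of the steps it depends on, so changing a downstream parameter leaves upstream keys untouched.

```python
import hashlib
import json

_cache = {}   # cache key -> stored result (kept in memory for this sketch)
calls = []    # names of steps that actually executed, for demonstration


def cache_key(step_name, params, upstream_keys):
    """Hash the step name, its parameters, and the keys of its upstream steps."""
    payload = json.dumps(
        {"step": step_name, "params": params, "upstream": upstream_keys},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_step(step_name, func, params, upstream=()):
    """Run func unless an identical invocation is already cached.

    `upstream` is a sequence of (key, result) pairs from earlier steps.
    Returns this step's (key, result) pair.
    """
    key = cache_key(step_name, params, [k for k, _ in upstream])
    if key not in _cache:
        calls.append(step_name)
        _cache[key] = func(*[r for _, r in upstream], **params)
    return key, _cache[key]


# Two toy steps standing in for expensive computations.
def load(n):
    return list(range(n))


def scale(values, factor):
    return [v * factor for v in values]


# First run: both steps execute.
loaded = run_step("load", load, {"n": 4})
first = run_step("scale", scale, {"factor": 2}, [loaded])

# Second run with a changed downstream parameter: only "scale" executes again.
loaded_again = run_step("load", load, {"n": 4})
second = run_step("scale", scale, {"factor": 3}, [loaded_again])

print(calls)  # ['load', 'scale', 'scale']
```

Note what the sketch leaves out: the key does not cover the step's source code or the contents of input files, both of which a real cache would need to account for before a hit could be trusted.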
Your pipeline definition is documentation. New lab members can read it and understand what your analysis does without deciphering a folder of numbered scripts.
We built GG.Flow for our own research on emergent simulation—modeling biological, psychological, and environmental systems at scale. But the framework itself doesn't know or care what you're simulating.
If your work involves batch-processed computational pipelines, GG.Flow works for you: genomics, climate modeling, materials science, epidemiology, or any field where you chain computational steps together and need the results to be reproducible.
Neuroscience is where we started. Science is where it goes.
We built GG.Flow for ourselves. Our own simulation pipelines were getting unwieldy—too many moving parts, too many manual steps, too much time lost to configuration instead of research.
Then we realized the same foundation applies to nearly any batch-processed scientific pipeline. And we think it could do real good.
So we opened it. Not as a marketing strategy or a loss leader for paid products. Because the reproducibility crisis in science is real, and every lab that can reproduce its results reliably is a lab doing better science.
GG.Flow is and will remain free and open-source. The MUSE ecosystem that uses it is proprietary. The infrastructure that makes science more reproducible shouldn't be.
GG.Flow implements a directed acyclic graph (DAG) execution model for scientific workflows. Its key technical characteristics are dependency resolution, environment isolation, provenance tracking, and result caching.
GG.Flow is designed for computational reproducibility in scientific contexts. It complements but does not replace domain-specific tools (e.g., workflow managers for HPC job scheduling). Its primary contribution is making the connection between "I ran this analysis" and "here is exactly how to run it again" automatic rather than aspirational.
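For readers new to the DAG execution model mentioned above, the core idea is that each step declares the steps it depends on, and a valid run order is any order in which every step comes after its dependencies. The sketch below shows this with `graphlib` from Python's standard library (Python 3.9+). It is not GG.Flow's API; the step names and the `execution_order` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
dependencies = {
    "align": {"download"},
    "call_variants": {"align"},
    "qc_report": {"align"},
    "summary": {"call_variants", "qc_report"},
}


def execution_order(graph):
    """Return one order in which every step follows all of its dependencies.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dependencies)
print(order)

# A pipeline whose steps depend on each other cannot be ordered at all.
try:
    execution_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("cycle detected: not a valid pipeline")
```

Steps with no path between them, such as `call_variants` and `qc_report` here, are independent of each other, which is what allows a scheduler to run them in parallel.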
The framework is released under the MIT license. Contributions are welcome. Integration with institutional HPC environments and cloud platforms is on the roadmap.
Open-source, free forever, and ready for your next pipeline. Questions? We'd love to hear from you.
Get in Touch