Big refactor
Summary
Refactor the repository into a conventional project structure.
Current behaviour/setbacks
Everything is in a mess: source code, scripts, notebooks and data files are not organised into any consistent structure, which makes the repository hard to navigate and maintain.
Desired behaviour/advantages
Follow a defined project structure (inspiration: https://github.com/drivendata/cookiecutter-data-science and aliby/skeletons):
- data
  - gemfiles (genome-scale metabolic models)
  - interim (e.g. .pkl files)
  - processed
- docs (dump .org files in here)
- notebooks (naming convention: 01-XXXX.ipynb, for ordering; the number has nothing to do with issue numbers)
- reports (e.g. PDF outputs)
  - figures (e.g. PNG/PDF files)
- poetry.lock & pyproject.toml
- src (source code -- most of yeast8model.py will go into this)
  - __init__.py
  - data (download/generate data)
  - constants
  - gem (dealing with the cobrapy model; naming it gem rather than cobra to prevent confusion; see the sketch after this list)
  - calc (performing various calculations)
  - viz (visualisation, e.g. plots)
- scripts (run scripts, should produce usable outputs)
- dev (dev scripts, used for developing new features & debugging)
- tests (unit tests)
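For the gem sub-module mentioned above, a minimal sketch of what it might contain, assuming the GEM files live under data/gemfiles and are loaded with cobrapy's read_sbml_model; the module layout, file name and function name are placeholders, not a settled design:

```python
# src/gem.py -- minimal sketch only; paths and names are assumptions
from pathlib import Path

from cobra.io import read_sbml_model  # cobrapy

# Resolve the repository root (one level above src/) and point at data/gemfiles.
GEMFILES_DIR = Path(__file__).resolve().parents[1] / "data" / "gemfiles"


def load_gem(filename="yeast-GEM.xml"):
    """Load a genome-scale metabolic model from data/gemfiles.

    The default filename is a placeholder; replace with the actual GEM file.
    """
    return read_sbml_model(str(GEMFILES_DIR / filename))
```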
Implementation sketch
- Isolate all of this on a separate branch (with frequent merges from master?).
- Create sub-directories and sort files according to them.
- Update references to modules (see the import sketch after this list) in:
  - src
  - scripts
  - notebooks
  - dev
- Update references to files in:
  - data/gemfiles
  - data/interim
  - data/lookup
- Add a snippet to all notebooks to allow import of local functions (a minimal snippet is sketched below).
- Test that stuff still works -- start from the highest-use scripts & notebooks; the rest can be fixed as errors arise (a possible smoke test is sketched below).
- (maybe?) Use jupytext to convert notebooks to files that play well with Emacs (example below).
- Convert notebooks that don't have much literate programming into scripts.
- Remove large blocks of repetitive code across the board.
- Low-effort refactoring, i.e. the current issues that have 'refactor' labels.
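To illustrate the module-reference updates, a hypothetical before/after; the class and sub-module names (Yeast8Model, calc, viz) are assumptions drawn from the planned src layout, not a settled API:

```python
# Before: everything is imported from the flat yeast8model.py module.
from yeast8model import Yeast8Model  # class name assumed for illustration

# After: imports point at the relevant src sub-modules instead.
from src.gem import Yeast8Model   # cobrapy model wrapper
from src import calc, viz         # calculations and plotting helpers
```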
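For the notebook-import step, a minimal snippet that could go in the first cell of each notebook; it assumes notebooks live in notebooks/ directly under the repository root and that src/ is a sibling directory (an editable install via poetry would be an alternative):

```python
# First cell of each notebook: make `import src...` work without installing the package.
import sys
from pathlib import Path

# notebooks/ sits directly under the repository root, so the parent of the
# notebook's working directory is the repository root.
repo_root = Path.cwd().resolve().parent
if str(repo_root) not in sys.path:
    sys.path.insert(0, str(repo_root))
```

After this cell, `from src.calc import ...` (or whatever the sub-modules end up being called) should resolve from inside a notebook.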
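For the "test that stuff still works" step, a cheap starting point could be an import smoke test under tests/; this sketch assumes pytest (not specified in this issue) and placeholder sub-module names:

```python
# tests/test_smoke.py -- import smoke test; pytest and the module names are assumptions.
import importlib

import pytest


@pytest.mark.parametrize("module", ["src", "src.gem", "src.calc", "src.viz"])
def test_submodule_imports(module):
    """Each planned sub-module should import without raising."""
    importlib.import_module(module)
```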
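For the optional jupytext step, notebooks can be converted or paired from the command line (e.g. `jupytext --to py:percent notebooks/01-XXXX.ipynb`); a minimal sketch using the Python API, with a placeholder filename:

```python
# Convert a notebook to a py:percent script, which Emacs handles as plain Python.
# The filename is a placeholder; repeat for each notebook worth keeping as text.
import jupytext

notebook = jupytext.read("notebooks/01-example.ipynb")
jupytext.write(notebook, "notebooks/01-example.py", fmt="py:percent")
```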