excel, monitoring, automation, etl, scientific, scraping, federation
etl, development, devops, big-data
- Replace an existing Excel-based workflow in order to:
  - Federate teamwork around a single shared reference data set.
  - Streamline and automate the day-to-day process.
  - Measure and monitor, through continuous integration, the solution's performance against historical data.
- Maintain information about power plants in Europe
  - Data is stored in PostgreSQL, in about 40 tables.
  - Data is loaded and updated with Tanker (https://hg.adimian.com/tanker), mainly from or to Excel files.
- Generate all input files necessary to run the price model
  - Pluggable architecture to handle data transformation (based on pandas, NumPy and some scientific Python libraries).
  - Declarative definition of generated files.
- Collect live information from dozens of sources (either through web scraping or from corporate web services).
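
As an illustration of the pluggable transformation architecture and the declarative file definitions listed above, here is a minimal sketch. It is not the project's actual code. The registry `TRANSFORMS`, the `transform` decorator, the `FILE_SPECS` layout, the column names and the sample rows are all hypothetical. The project's transformations are based on pandas; this sketch uses plain dictionaries and the standard `csv` module only to stay dependency-free.

```python
import csv
import io

# Hypothetical plugin registry: transformation name -> function.
TRANSFORMS = {}


def transform(name):
    """Decorator that registers a transformation under `name`."""
    def register(func):
        TRANSFORMS[name] = func
        return func
    return register


@transform("only_active")
def only_active(rows):
    # Keep only rows whose (assumed) `active` flag is true.
    return [r for r in rows if r["active"]]


@transform("mw_to_gw")
def mw_to_gw(rows):
    # Add a gigawatt column derived from the (assumed) `capacity_mw` column.
    return [{**r, "capacity_gw": r["capacity_mw"] / 1000} for r in rows]


# Declarative definition of the generated files: each entry states
# which steps to apply and which columns to write, not how to do it.
FILE_SPECS = [
    {
        "filename": "capacity.csv",
        "steps": ["only_active", "mw_to_gw"],
        "columns": ["plant", "capacity_gw"],
    },
]


def render(spec, rows):
    """Apply the steps named in `spec` and return the file content as CSV text."""
    for step in spec["steps"]:
        rows = TRANSFORMS[step](rows)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=spec["columns"],
        extrasaction="ignore",  # drop columns the spec does not list
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def generate_all(specs, rows):
    """Return a mapping of filename -> generated content for every spec."""
    return {spec["filename"]: render(spec, rows) for spec in specs}


# Made-up sample data, for illustration only.
SAMPLE_ROWS = [
    {"plant": "A", "capacity_mw": 1200, "active": True},
    {"plant": "B", "capacity_mw": 450, "active": False},
]

if __name__ == "__main__":
    for name, content in generate_all(FILE_SPECS, SAMPLE_ROWS).items():
        print(name)
        print(content)
```

With this layout, adding a new output file means adding an entry to `FILE_SPECS`, and adding a new transformation means registering one more decorated function. The generation loop itself does not change.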