Mid Data Scientist

  • Full time
  • Prague
  • Posted 1 week ago
DBT framework (nice to have)
SQL (regular)
Catboost (regular)
XGBoost (regular)
LGBM (regular)
NumPy (regular)
Pandas (regular)
Hyperopt (regular)
sklearn (regular)
Python (regular)
PROJECT INFORMATION:
  • Industry: e-commerce
  • Location: 100% remote, hybrid, or on-site at the client’s office in Warsaw
  • Project language: Polish, English
  • Assignment type: B2B
  • Remuneration: up to 200 PLN/h + VAT
  • Start: Flexible
 
PROJECT ROLE:
  • E2E delivery of credit risk models including:
– creating new features based on the current models’ errors
– tuning models after the auto-ml phase
– extending the internal ML library
– calibrating scores to the observed probability of default
– optimizing thresholds for automated decisions
  • Similar models already exist; we want to boost them to the limit of the lgbm algorithm’s capability before we move on to more research-oriented approaches (graph models, deep learning)
REQUIREMENTS:
  • Min. 2-3 years of professional experience
  • Very good knowledge of Python (sklearn, hyperopt, pandas, numpy, lgbm)
  • Machine Learning:
– excellent knowledge of gradient boosting algorithms (lgbm, xgboost, catboost)
– excellent knowledge of linear models (logistic regression with regularization)
– knowledge of how the train sample definition influences output generated by the model
– knowledge of model validation techniques
  • Feature Engineering:
– experience with feature (predictor) creation from multiple sources
– ability to write easily extendable and maintainable SQL code
  • Optimization:
– experience with the calculation of optimal thresholds for automated decisions
  • Credit Risk Domain:
– experience in building credit risk scoring models
– familiarity with measures such as DPD, default rate, probability of default, vintage, etc.
 
NICE TO HAVE:
  • Data Engineering:
– experience in building DAGs in Airflow or a similar tool
– knowledge of the DBT framework
– query execution planning
 
  • Machine Learning Engineering:
– at least one classification model deployed to the production environment to make automated decisions
– contributions to Python packages (internal or external)
– unit testing of ML code
– knowledge of Azure Cloud
– knowledge of Kubernetes
– experience as a Python developer
 
SOFT SKILLS:
  • Inquisitive mind – inherent need to understand more and dig deeper
  • Team player – will help out if possible and will ask for help if required
  • Task-oriented, not time-oriented
  • Respectful and professional – does not require manual time management from his/her manager
 
WE OFFER: 
  • Work in an international environment with a Scandinavian business culture
  • Co-financing of private medical care (Medicover) and the Multisport card
  • Recommendation program
  • ProData Consult mobile application – easy reporting of working time, quick access to new offers

To apply for this job please visit cz.talent.com.