Data Steward

  • Full time
  • Prague
  • Posted 4 months ago
Amazon AWS (junior)
Google Cloud Platform (junior)
PostgreSQL (junior)
Oracle (junior)
Bash (regular)
Python (regular)
SQL (advanced)
Join Netguru Talent Marketplace, a proven partner for tech-minded freelancers and experts. Through us, you get access to a range of project-based opportunities and can collaborate with companies across different industries. As a result, you will not only gain more experience but also develop skills you didn’t even know you had.

Currently, we’re developing a project for a real estate franchise company based in Texas, USA.

We are looking for a Data Steward with Python skills to join an ongoing, long-term project for 5-6 months, part-time. The goal of the project is to provide tech support for the listings team working on MLS data mapping. The data is later used in the client’s products for real estate agents. Our work focuses mainly on mapping sources and QAing the mappings. We sometimes assist with production deployments and help spot metadata changes that occur on the source providers’ end.

What are the benefits of working on this particular project?
You will be working with a high-scale living system that processes hundreds of data sources from multiple providers daily and requires constant supervision to provide accurate information. It is one of the most accurate databases of real estate listings in the entire U.S. The environment you will be working in changes rapidly, which makes the work challenging but also exciting. The job is demanding and requires patience and attention to detail, but the team works closely together and Data Stewards can rely on each other.

Required skills: 

  • advanced SQL knowledge; 
  • at least 2-3 years of experience with additional languages such as Python or Bash; 
  • experience in database development (Oracle, PostgreSQL, MongoDB, BigQuery); 
  • ability to develop analytics models (e.g. regression, simulation, statistical) for structured and unstructured data sets, with a focus on data lineage, to proactively identify risk areas and catch outliers, trends, and/or projections where appropriate;
  • cloud experience in Google Cloud Platform (GCP) and/or Amazon AWS.

Nice to have: 

  • experience with data architecture, data modeling, schema design, and/or software development; 
  • experience with GitHub; 
  • experience in GCP: GCS, Cloud Functions, Dataflow, Cloud Composer, BigQuery.

Joining Netguru as a Data Steward on this project means:
  • Working on a challenging product for a big company.
  • Maintaining data loads for 500 disparate sources from multiple providers.
  • Creating, developing and documenting data mapping rules.
  • Working with the downloader development team to give direction on how data should be downloaded, following industry-standard best practices.
  • Developing data cleansing and transformation automation.
  • Improving the transformation process and its logging.
  • Utilizing tools and programs to analyze and convert data to a standardized format.
  • Developing continuous process improvements.
  • Identifying and fixing “data bugs” and improving the overall quality of information.
  • Recommending process improvements.

Apply if you:
  • Are ready to work 40 hours per week, including weekends.
  • Can communicate very well in spoken and written English (CEFR C1).
  • Are a great communicator and team player.
  • Can work cross-functionally with distributed teams, sometimes with team members in different timezones.

To apply for this job please visit