Job Description
We are looking for a professional to design, build, and maintain cloud-based data systems that process large volumes of information and provide reliable data access and transformation capabilities. The candidate will work in a cross-functional environment, helping integrate the platform with analytics, visualization, and downstream applications.
Responsibilities:
Design and build cloud-based data pipelines to ingest, transform and store large volumes of data.
Develop APIs and services that expose reliable data access, transformation, and orchestration capabilities.
Contribute to geospatial data processing workflows using tools such as Apache Sedona and Iceberg's native geospatial functionality.
Help develop and maintain shared platform tools and libraries that improve productivity and promote self-service capabilities across engineering and analytics teams.
Monitor, tune, and troubleshoot performance, cost, and reliability of data infrastructure components.
Collaborate with cross-functional teams to ensure platform systems integrate seamlessly with analytics, visualization, and downstream applications.
Requirements:
3-5 years of data engineering experience, ideally working with large-scale or geospatial systems.
Proficiency in Python and SQL (Spark SQL preferred).
Experience with distributed data processing frameworks such as Spark, Flink or Beam.
Familiarity with orchestration frameworks (Airflow, Temporal, Dagster) and workflow automation principles.
Experience with cloud platforms (AWS, GCP or Azure) and containerization tools (Docker, Kubernetes).
Exposure to batch and streaming architectures, and an understanding of modern data lake or lakehouse patterns (Iceberg, Delta, Hudi).
Knowledge of geospatial data formats, processing libraries, and coordinate systems is an advantage.
Strong problem-solving skills and an interest in building reliable, scalable, and observable data systems.
Salary:
Negotiable