We are looking for a highly skilled and experienced Data Engineer with expertise in Python and the AWS ecosystem to join our dynamic team. In this role, you will design, build, and maintain scalable data pipelines and manage large datasets within AWS infrastructure. You will work closely with data scientists, analysts, and other stakeholders to ensure the availability of clean, reliable, and high-performing data systems to support our analytical and operational needs.

Key Responsibilities:

  • Design and Build Data Pipelines: Develop robust and scalable data pipelines using Python, AWS services, and other technologies to move and process large datasets efficiently. (Java experience is also a plus.)
  • Data Integration and ETL Processes: Implement ETL (Extract, Transform, Load) processes to aggregate data from multiple sources into data lakes and data warehouses.
  • AWS Services Management: Utilize AWS services such as S3, Lambda, EC2, Glue, Redshift, RDS, Athena, and Kinesis to manage and optimize data processing workflows.
  • AWS Services Expertise: Demonstrate comprehensive knowledge and hands-on experience in managing the full range of AWS services required for the organization's data processing workflows, including:
  • AWS Glue: Design, develop, and maintain Glue jobs for data extraction, transformation, and loading (ETL) tasks. Optimize Glue job performance and reliability.
  • AWS Lambda: Architect and implement serverless data processing functions using Lambda, integrating them with other AWS services as needed.
  • Amazon API Gateway: Build and manage secure, scalable RESTful APIs using API Gateway to expose data and functionality.
  • Amazon S3: Utilize S3 for reliable and scalable data storage, including the ingestion of JSON files and integration with downstream services.
  • Amazon EC2: Provision and manage EC2 instances as needed to support the data processing infrastructure, for example to host custom applications or to trigger Lambda functions.
  • AWS IAM: Establish appropriate IAM roles, policies, and permissions to ensure secure access to AWS resources and data.
  • Snowflake Integration: Integrate the data processing workflows with Snowflake, leveraging features like Snowpipe to load data into the data warehouse.
  • Design and implement end-to-end data processing pipelines, seamlessly integrating the various AWS services.
  • Ensure data integrity, security, and compliance throughout the data processing lifecycle.
  • Optimize the performance, cost-effectiveness, and scalability of the data processing infrastructure.
  • Automate and streamline the deployment, monitoring, and maintenance of the AWS-based data processing environment.
  • Collaborate with cross-functional teams to understand requirements and deliver data-driven solutions.
  • Stay up-to-date with the latest AWS service updates, best practices, and industry trends to continuously improve the data processing capabilities.
  • Automation and Monitoring: Automate data ingestion, transformation, and quality checks, ensuring data availability and integrity across platforms. Monitor the health of data pipelines and tune them for performance and cost.
  • Data Storage and Retrieval: Design and manage cloud-based storage solutions, including data lakes and warehouses, ensuring easy retrieval and accessibility of data.
  • Collaboration with Data Teams: Work closely with data scientists, analysts, and business teams to understand their data requirements and design solutions that meet their needs.
  • Data Quality and Consistency: Implement data quality checks and validation mechanisms to ensure the accuracy, completeness, and consistency of data across various systems.
  • Documentation and Reporting: Maintain comprehensive documentation on data systems, pipeline workflows, and solutions. Provide insights and recommendations based on data analysis.
  • Stay Current with Trends: Keep up-to-date with the latest technologies, tools, and best practices in data engineering, cloud computing, and the AWS ecosystem.

Required Skills and Qualifications:

  • 5+ years of experience as a Data Engineer or in a similar role, with expertise in Python and the AWS ecosystem.
  • Proficiency in Python: Strong experience in writing clean, efficient, and scalable Python code for data processing, automation, and pipeline development.
  • AWS Ecosystem Expertise: Deep understanding of and hands-on experience with key AWS services, including S3, Lambda, Glue, EC2, Redshift, RDS, Athena, and Kinesis, as well as API Gateway and EMR clusters.
  • Data Pipeline Development: Proven experience in designing and building scalable and reliable data pipelines, ETL processes, and automating data workflows.
  • Database Management: Experience working with relational (SQL) and NoSQL databases. Familiarity with Amazon Redshift, RDS, DynamoDB, and other database systems.
  • Data Warehousing: Experience with cloud-based data warehousing solutions such as Amazon Redshift or similar.
  • Snowflake: Deep understanding of the Snowflake cloud data warehouse platform, including its features, capabilities, and best practices for data loading, transformation, and querying.
  • Big Data Tools: Familiarity with big data technologies like Apache Spark, Hadoop, and Kafka is a plus.
  • Data Integration: Experience integrating data from multiple sources, including APIs, third-party services, and on-premise data systems.
  • Data Quality Assurance: Experience in ensuring data quality, consistency, and security across data pipelines.
  • Version Control: Proficient in using Git or similar version control systems for code management.
  • CI/CD: Familiar with Continuous Integration and Continuous Deployment practices and tools.
  • Problem-Solving: Strong analytical and troubleshooting skills, with the ability to quickly resolve data issues and optimize pipeline performance.
  • Collaboration Skills: Excellent communication and interpersonal skills with the ability to work effectively across teams and stakeholders.

Preferred Skills:

  • Containerization: Experience with Docker and Kubernetes for containerizing and orchestrating data applications and pipelines.
  • DevOps Practices: Familiarity with infrastructure-as-code tools like Terraform, CloudFormation, and AWS CDK.
  • Data Visualization: Experience with data visualization tools (e.g., Tableau, Power BI) for reporting and presenting data insights.
  • Machine Learning: Familiarity with machine learning concepts or frameworks for integrating data pipelines into ML workflows.
  • Apache Airflow: Experience with workflow orchestration tools like Apache Airflow for managing complex data pipelines.
  • Client Communication: Experience managing project communication or working in a client-facing role.

Education:

  • Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field (or equivalent work experience).

Why Join Us:

  • Innovative Projects: Work on challenging and impactful data engineering projects using the latest AWS tools and technologies.
  • Career Growth: Take advantage of continuous learning opportunities and career development in a fast-growing field.
  • Collaborative Culture: Join a team of talented professionals in a collaborative and supportive work environment.
  • Competitive Compensation: Enjoy a competitive salary and benefits package based on your experience.

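Job Details

Total Positions:
1 Post
Job Shift:
Morning Shift
Job Type:
Job Location:
Johar Town, Lahore, Pakistan
Gender:
No Preference
Education:
Bachelor's (Bachelor of Science) only
Career Level:
Experienced Professional
Experience:
5 years - 8 years
Apply Before:
Feb 10, 2025
Posting Date:
Jan 09, 2025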

Mavric Technology

· 51-100 employees - Lahore
