
AWS Databricks Engineer

India
Remote
Mid-Level: 4 to 6 years
Posted on Feb 01 2023

About the Job

Skills

AWS
Scala
Spark
Databricks

Job Description

This is a remote position.


Requirements

●      Strong experience as an AWS Data Engineer; AWS Databricks experience is a must.

●      Expert proficiency in Spark and Scala; Python and PySpark experience is a plus.

●      Must have experience migrating data from on-premises systems to the cloud.

●      Hands-on experience with Kinesis for processing and analyzing streaming data, and with AWS DynamoDB (see the streaming sketch after this list).

●      In-depth understanding of the AWS cloud and of AWS data lake and analytics solutions.

●      Expert-level, hands-on experience designing and developing applications on Databricks, Databricks Workflows, AWS Managed Airflow, and Apache Airflow is required.

●      Extensive hands-on experience implementing data migration and data processing using AWS services: VPC/SG, EC2, S3, Auto Scaling, CloudFormation, Lake Formation, DMS, Kinesis, Kafka, NiFi, CDC processing, EMR, Redshift, Athena, Snowflake, RDS, Aurora, Neptune, DynamoDB, CloudTrail, CloudWatch, Docker, Lambda, Spark, Glue, SageMaker, AI/ML, API Gateway, etc.

●      Hands-on experience with the technology stack available in the industry for data management, data ingestion, capture, processing, and curation: Kafka, StreamSets, Attunity, GoldenGate, MapReduce, Hadoop, Hive, HBase, Cassandra, Spark, Flume, Impala, etc.

●      Knowledge of different programming and scripting languages

●      Good working knowledge of code versioning tools (such as Git, Bitbucket, or SVN)

●      Hands-on experience using Spark SQL with various data sources such as JSON, Parquet, and key-value pairs (see the sketch after this list).

●      Experience preparing data for Data Science and Machine Learning.

●      Experience preparing data for use in SageMaker and AWS Databricks.

●      Demonstrated experience preparing data and automating and building data pipelines for AI use cases (text, voice, image, IoT data, etc.).

●      Programming language experience with .NET or Spark/Scala is good to have.

●      Experience creating tables, partitioning, bucketing, loading, and aggregating data using Spark Scala, Spark SQL, or PySpark (see the sketch after this list).

●      Knowledge of AWS/Azure DevOps processes like CI/CD as well as Agile tools and processes including Git, Jenkins, Jira, and Confluence

●      Working experience with Visual Studio, PowerShell Scripting, and ARM templates.

●      Strong understanding of data modeling and of defining conceptual, logical, and physical data models.

●      Experience with big data, analytics, information analysis, and database management in the cloud

●      Experience with IoT, event-driven, and microservices architectures in the cloud, and with private and public cloud architectures, their pros and cons, and migration considerations.

●      Ability to stay up to date with industry standards and technological advancements that enhance data quality and reliability and advance strategic initiatives

●      Basic experience with or knowledge of agile methodologies

●      Working knowledge of RESTful APIs, the OAuth2 authorization framework, and security best practices for API gateways
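
For the Kinesis requirement above, here is a minimal Spark Structured Streaming sketch in Scala. It assumes a Databricks notebook where `spark` is the active SparkSession, the Databricks Kinesis connector, and instance-profile authentication; the stream name, schema, and paths are hypothetical.

```scala
// Minimal sketch, assuming a Databricks notebook where `spark` is the active
// SparkSession, the Databricks Kinesis connector is available, and the cluster
// authenticates to Kinesis via its instance profile. Stream name, region,
// schema, and paths below are hypothetical.
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

// Hypothetical schema of the JSON events arriving on the stream
val eventSchema = new StructType()
  .add("device_id", StringType)
  .add("reading", DoubleType)
  .add("event_time", TimestampType)

val raw = spark.readStream
  .format("kinesis")                       // Databricks Kinesis source
  .option("streamName", "iot-events")      // hypothetical stream name
  .option("region", "ap-south-1")
  .option("initialPosition", "latest")
  .load()

// Kinesis delivers each record as binary in the `data` column; decode and parse JSON
val events = raw
  .select(from_json(col("data").cast("string"), eventSchema).as("e"))
  .select("e.*")

// Land the parsed stream in a Delta table; a DynamoDB sink would typically be
// implemented in foreachBatch using the AWS SDK instead
events.writeStream
  .format("delta")
  .option("checkpointLocation", "s3://my-bucket/checkpoints/iot-events")
  .toTable("bronze.iot_events")
```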
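
For the Spark SQL requirement, a minimal sketch that reads JSON and Parquet sources and queries them with Spark SQL; paths, view names, and columns are hypothetical.

```scala
// Minimal sketch, assuming `spark` is an active SparkSession; paths, view names,
// and columns are hypothetical.
val orders    = spark.read.json("s3://my-bucket/raw/orders/")        // JSON source
val customers = spark.read.parquet("s3://my-bucket/raw/customers/")  // Parquet source

orders.createOrReplaceTempView("orders")
customers.createOrReplaceTempView("customers")

// Join the two sources and aggregate with plain Spark SQL
spark.sql("""
  SELECT c.customer_id, c.country, SUM(o.amount) AS total_amount
  FROM orders o
  JOIN customers c ON o.customer_id = c.customer_id
  GROUP BY c.customer_id, c.country
""").show()

// Key-value pair text can be turned into a map column with str_to_map
spark.sql("SELECT str_to_map('region=apac,tier=gold', ',', '=') AS attrs").show(false)
```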
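
For the table-creation requirement, a minimal sketch of a partitioned, bucketed table loaded and then aggregated with Spark Scala and Spark SQL; the database, paths, and columns are hypothetical.

```scala
// Minimal sketch, assuming `spark` is an active SparkSession attached to a
// metastore/catalog; paths, the `sales` database, and column names are
// hypothetical. Bucketing requires saveAsTable and is not supported for Delta
// tables, so the table is written as Parquet here.
spark.sql("CREATE DATABASE IF NOT EXISTS sales")

spark.read.parquet("s3://my-bucket/staging/transactions/")
  .write
  .mode("overwrite")
  .format("parquet")
  .partitionBy("txn_date")            // partition the table by date
  .bucketBy(16, "customer_id")        // bucket within each partition by customer
  .sortBy("customer_id")
  .saveAsTable("sales.transactions")

// Aggregate the loaded table with Spark SQL (the PySpark equivalent is analogous)
spark.sql("""
  SELECT txn_date, SUM(amount) AS revenue
  FROM sales.transactions
  GROUP BY txn_date
  ORDER BY txn_date
""").show()
```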

Responsibilities:

·         Work closely with team members to lead and drive enterprise solutions, advising on key decision points regarding trade-offs, best practices, and risk mitigation

·         Manage data-related requests, analyze issues, and provide efficient resolutions; design all program specifications and perform required tests

·         Design and develop data ingestion using Glue, AWS Managed Airflow, and Apache Airflow, and the processing layer using Databricks.

·         Work with the SMEs to implement data strategies and build data flows.

·         Prepare code for all modules according to the required specifications.

·         Monitor all production issues and inquiries and provide efficient resolution.

·         Evaluate all functional requirements and mapping documents, and troubleshoot all development processes.

·         Document all technical specifications and associated project deliverables.

·         Design all test cases to provide support to all systems and perform unit tests.

 

Qualifications:

●      2+ years of hands-on experience designing and implementing multi-tenant solutions using AWS Databricks for data governance, data pipelines for near real-time data warehousing, and machine learning solutions.

●      5+ years’ experience in a software development, data engineering, or data analytics field using Python, PySpark, Scala, Spark, Java, or equivalent technologies.

●      Bachelor’s or Master’s degree in Big Data, Computer Science, Engineering, Mathematics, or similar area of study or equivalent work experience

●      Strong written and verbal communication skills

●      Ability to manage competing priorities in a fast-paced environment

●      Ability to resolve issues

●      Self-motivated and able to work independently

●      Nice to have:

-        AWS Certified Solutions Architect - Professional

-        Databricks Certified Associate Developer for Apache Spark


About the company


Industry

Human Resources

Company Size

51-200 Employees

Headquarters

Pune
