
Distributed Datastore Lead

Navi Mumbai
Full-Time
Executive: 10 to 30 years
Posted on Jun 05 2024

Not Accepting Applications

About the Job

Skills

Hadoop
Capacity Planning
Docker
Kubernetes
Distributed Computing
Elasticsearch

Education

B.E./B.Tech/MCA in Computer Science or Information Technology


Experience

Minimum of 12-15 years of hands-on experience in distributed computing and data management, with a focus on leading technologies such as Hadoop, Kafka, Elasticsearch, and Spark.


Job Summary

The Distributed Datastore Lead is responsible for overseeing the design, implementation, and management of distributed database environments, including Hadoop, Kafka, Elasticsearch, and Spark. The role emphasizes automation, self-healing capabilities, and containerization, and integrates these services into both Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) models. It requires expertise in these technologies, with a focus on replication strategies, migration, indexing, and capacity planning.


Key Responsibilities:

1. Lead and participate in RFP processes; design and architect distributed datastore solutions that ensure scalability, reliability, and performance to meet evolving business needs.

2. Develop and implement methodologies to offer distributed database services on IaaS and PaaS models, and ensure that these offerings are scalable, secure, and user-friendly.

3. Design, implement, and manage containerized distributed datastore environments using Docker and Kubernetes.

4. Develop container orchestration strategies to improve scalability and manageability of distributed database solutions.

5. Design, implement, and manage automation solutions for database deployment and management using tools such as Ansible and Jenkins.

6. Evaluate and select appropriate distributed technologies and platforms such as Hadoop, Kafka, Elasticsearch, and Spark. Develop and implement a strategy for distributed databases in alignment with business objectives.

7. Design and implement replication strategies to ensure data redundancy, fault tolerance, and disaster recovery capabilities across distributed environments.

8. Plan and execute data migration strategies between different datastore platforms or versions, minimizing downtime and ensuring data integrity throughout the process.

9. Optimize data indexing strategies to improve query performance and data retrieval efficiency in distributed datastore environments.

10. Monitor and analyze data growth patterns, resource utilization, and performance metrics to forecast capacity requirements and plan for future scalability.

11. Conduct performance analysis and tuning of Hadoop, Kafka, Spark, and Elasticsearch components to maximize efficiency and throughput.

12. Create and maintain comprehensive documentation of distributed datastore configurations, procedures, and best practices. Provide training and mentorship to team members as needed.

About the company

We are the force behind the meteoric rise of India's leading telecom operator, Jio, with 400 million+ customers. In addition, we have powered an exhaustive list of digital apps & services that have delivered functionality, usability, engagement, scale, and loyalty. We provide solutions for customers (B2C) and enterprises (B2B). We have an end-to-end 5G solution consisting of 5G Radio, a com ...

Industry

Media & Telecommunication...

Company Size

51-200 Employees

Headquarters

Navi Mumbai, Maharashtra
