Cloud Operations Engineer
Clarivate
2020-03-01-Present
Technical leader of the main logging platform (Fluentd, Open Distro for
Elasticsearch, Kibana) in AWS, with automation built using Docker, Git,
Python, Terraform, JSON, YAML, Ansible, Rundeck, and Jenkins.
• Onboarding and supporting new log ingestions from internal business units:
SSO, ISM policies, indices, monitoring, aggregators, certificates, LB, DNS,
Fluentd configs...
• Leading the migration from the custom Elasticsearch cluster to AWS
Elasticsearch Service.
• Troubleshooting any issues in Production.
• On-Call 24x7.
• Monitoring in SignalFx and Datadog (integration, configuration, management,
and dashboard creation).
• Helping other teams resolve blockers they encounter in AWS.
• Resolving AWS issues (EC2, VPC, Route53, LB, SG, TGW, IAM, S3,
Workspaces...), with tickets managed in Jira.
• AMI management using Packer and Jenkins.
• Improving existing automation (Ansible, Jenkins, Git, Packer, EC2...).
• Governance tasks.
Git
Jenkins
Ansible
Datadog
Terraform
Packer
Fluentd
Elastic Stack
Splunk SignalFx
Site Reliability Engineer
Dealfront
2023-01-01-2024-03-01
- Installed and configured various tools and services (e.g., Vault, cert-manager, OpenSearch, Sentry, MinIO, Qdrant) using Helm, Ansible, and Argo CD to support the data-engineering and data-science teams.
- Supported developers in creating roles and tokens in Vault for secure access to production databases.
- Implemented VictoriaMetrics alerting to route critical alerts to PagerDuty and verified that it worked end to end.
- Established a logging strategy in the Kubernetes cluster using OpenSearch for effective troubleshooting.
- Collaborated with cross-functional teams to maintain the Kubernetes cluster's smooth operation.
- Monitored and managed the Kubernetes cluster using VictoriaMetrics and Grafana, creating custom dashboards.
- Developed and maintained comprehensive documentation for the Kubernetes cluster and associated processes.
- Stayed updated with the latest technologies and continuously improved cluster infrastructure.
- Installed, configured, and maintained production Kafka clusters and Postgres databases, including monitoring and customizations for efficiency and reliability.
- Integrated various tools with Okta for SSO and OpenID.
- Designed and implemented GitLab CI pipelines for automating operational tasks.
Elasticsearch
Kubernetes
Grafana
PagerDuty
Prometheus
CI/CD
Helm
Argo CD
Monitoring Engineer
White Hat Gaming
2021-11-01-2022-12-01
- Standardized, automated, and optimized monitoring, tagging, and integrations in Datadog using Terraform, with Ansible for agent installation.
- Designed and implemented new alerts, metrics, and dashboards in Datadog.
- Established a log archival strategy using AWS S3 and monitored solutions within Kubernetes clusters.
- Assisted developers in integrating custom application metrics into Datadog.
- Set up a logging strategy in Datadog, including a full migration from AWS OpenSearch.
- Led the Systems Team, encompassing the Database and DevTools teams, making critical decisions to enhance infrastructure and team efficiency.
- Governed, reported on, resized, and decommissioned AWS resources and Datadog assets.
- Provided support to other teams in diagnosing and resolving monitoring and logging issues.
- Collaborated closely with SRE and Developers to improve processes and documentation.
- Worked with the NOC team (L1 and L2 support team) to address monitoring and logging requirements.
- Established a functional on-call rotation and escalation policies using OpsGenie and Datadog.
Git
Bash
Ansible
Kubernetes
Datadog
Terraform
Elastic Stack
AWS