Principal Consultant, Co-founder
Crowd Consulting LLC
2016-04-01-Present
Multiple projects in the field of data warehousing and big data engineering. Business development, team augmentation, mentoring, pre- and post-sales solution architecture, engineering, and support.
Python
SQL
Amazon RDS
Apache Spark
Apache Hadoop
AWS EMR
Snowflake
Hortonworks Data Platform
Apache Hive
Amazon Redshift
Big Data Engineer
Boston Consulting Group, GAMMA (via Toptal)
2018-06-01-2019-01-01
Two contracts (via Toptal), both for major pharmaceutical companies. Subcontracted by BCG GAMMA's Advanced Analytics and Data Science division to provide engineering support for BCG's data scientists on DMP and personalization projects. Mostly feature engineering and ETL, plus DevOps tasks: Python utilities, Airflow installation and administration, Python scripting, and Spark-to-Excel Python scripts. Designed and built a dynamic, metadata-driven S3-to-S3 ETL system in Spark/Hive, with the metadata stored in Postgres on RDS. AWS Glue serves as the Hive metastore, Athena is used for querying, and Airflow for scheduling. The ETL system follows modern data warehousing best practices. Documented the system and provided training. Designed and built a feature engineering S3 Data Mart and a multi-layered S3 Customer-360 Data Lake. Proposed and enforced development standards; provided documentation, data validation procedures, operational support, and maintenance guidelines.
Python
Pandas
Apache Spark
Airflow
HDP
Athena
Apache Hive
Glue
VP Data
Enervee
2017-10-01-2018-05-01
Managed the data team and built an analytical system. Built an AWS S3 Data Lake. Loaded Segment.com data from parsed Segment logs, bypassing Segment Warehouse. Validated the data against Segment Warehouse in AWS Redshift. Loaded PostgreSQL product catalog data into the Data Lake, which made it possible to develop site-behavior marketing BI reporting and ML predictive analytics. Designed a framework for an Enterprise Data Warehouse.
Python
Django
MySQL
PostgreSQL
Ansible
Amazon RDS
Segment
Amazon Redshift
AWS EMR
Airflow