Neelotpaul Roy
Data Engineering Manager – Accenture Strategy & Consulting (AI)
MLOps, Cloud Architecture, Data Governance, Engineering & Analytics
[email protected]
+91 9503 412 123
Experience Summary
Neelotpaul is a Manager (Level 7) with Accenture Applied Intelligence, India. He joined Accenture five years ago and has 12 years of overall IT experience as a data practitioner in Application Information Management and ML Engineering, specialising in:
❑ Data Governance (Quality, Cleansing and Enrichment)
❑ Cloud Analytics Platforms Architecture (Serverless)
❑ AI/ML specific Data Engineering
❑ MLOps
❑ CMMI, SAFe and GxP processes
He focuses on building analytics solutions on cloud platforms using serverless computing and secure architecture, enabled for DevOps/MLOps and rendering insightful dashboards, with data governance principles as the pillars.
Neelotpaul actively participates as a Solution Architect for opportunities and RFPs, and has designed and delivered multiple ML engineering projects end to end.
Neelotpaul is the people lead (line manager) for 10+ resources, whom he grooms, mentors and guides as a career counsellor. He also leads 35+ Accenture members in the client engagement.
Technical Acumen
Software/Suite: AWS, Azure, Databricks, Snowflake, Tibco EBX, ELK stack, Informatica (EDC, Analyst), Atlassian Suite, Cloudera/HDP, Hadoop, Oracle/SQL Server/MySQL, Vantage, UniFi, Nautilus
Tools & Services: Lambda, S3, IAM, DynamoDB, Glue, Neptune, Athena, ADLS, ML Studio, .Net, App Services, PowerBI, QlikSense
Solutions: Cloud-based Metadata Store (AWS), Customer Cleanse Solution (Azure), Information Management / Data Quality Strategy Framework (ML based), Universal Parser (Spark), Spark Data Lineage (Spline)
Version Control: GitLab, GitHub, SVN, Perforce
Industry/Domain: Banking, Telecom, Financial Services, Life Insurer, Food, Life Science
DevOps: Terraform, Jenkins, CFT, APIM, SNS, Step Function, Elastic Load Balancer, CloudWatch, ECS, Fargate, Docker, TDD, SonarQube, pytest
Languages: Python, Hive, Scala, Scripting, C#, PL/SQL
Data Management Platform
Formula 1 (Accenture Applied Intelligence)
Technology/Tools:
Databricks, Spark, Scala, Python, Snowflake, Tibco EBX, AWS (Neptune, Glue, Athena, S3, Lambda, RDS, CloudWatch)
● Integral part of the Data Management Platform team at Formula 1, working with multiple use-case, Data Engineering, Platform Security, Analytics, MDM/Governance, Testing and PQM teams to define strategies, implement central metadata management and data governance, and perform security analysis.
● Day-to-day work involves designing the integration of data management capabilities with multiple stakeholders, implementing solutions and writing code, initiating core platform set-up for use-case teams, building Jenkins pipelines to industrialise releases to production, and handling change requests, processes and handovers to the enablement team.
● Perform gap analysis and the feature metadata (technical, operational) capture process for each use case across multiple cloud vendors (AWS/Azure) with PySpark (Databricks) ETL code.
● Design and productionise Snowflake-EBX integration, Glue Data Catalog as the Databricks metastore for ETL, S3 object tagging (see the sketch after this list), GraphDB (Neptune) with Databricks/SageMaker capability, and Spark-based data lineage (Spline).
● Discuss and define the scope of security controls for data management, implement them and collect evidence.
● Automate DevOps pipelines using Jenkins, Git as the code base and SonarQube for automated code quality.
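A minimal sketch of the S3 object tagging step referenced above, using boto3; the bucket, keys and tag values here are hypothetical stand-ins, as the real tagging scheme was project specific.

import boto3

s3 = boto3.client("s3")

def tag_object(bucket: str, key: str, tags: dict) -> None:
    """Attach governance tags (e.g. data domain, classification) to an S3 object."""
    tag_set = [{"Key": k, "Value": v} for k, v in tags.items()]
    s3.put_object_tagging(
        Bucket=bucket,
        Key=key,
        Tagging={"TagSet": tag_set},
    )

# Hypothetical example values, not the real bucket or keys:
tag_object(
    "example-data-lake",
    "raw/telemetry/2021/car_data.parquet",
    {"domain": "telemetry", "classification": "internal"},
)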
Dec 2020 – Apr 2021: Cloud Architecture SME, Customer Data Cleanse Mumbai, India
Johnson Controls (Accenture Applied Intelligence)
Building a Customer Data Cleanse application based on a clustering ML algorithm that deduplicates customer data using user-driven training and configuration. The web application is architected on Azure, with an intuitive user interface and end-to-end dashboards and visualisations for a 360-degree customer view.
Technology/Tools:
Python, ADLS, AD Authentication, APP Services (Django, Angular), ADF, ML Studio, Azure Devops, PowerBI
● An automated solution that enables the creation of a clean, complete and accurate set of golden customer records
● Machine learning and business rules to establish the customer golden record (a clustering sketch follows this list)
● Built the architecture on Azure to work seamlessly and industrialised the solution
● Defined the execution strategy and detailed plan to generate the customer golden record
● Reduced the number of customer records by merging duplicates
● Enabled JCI to drive incremental business benefits through analytics use cases in Marketing, Sales and Pricing
● CI/CD using Azure DevOps
● E2E reporting through PowerBI
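A minimal sketch of the clustering-based deduplication idea, assuming scikit-learn and hypothetical customer fields; the production solution used Azure ML Studio with user-driven training and configuration rather than this exact code.

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import DBSCAN

# Hypothetical customer records; real data came from JCI source systems.
customers = pd.DataFrame({
    "name": ["Acme Corp", "ACME Corporation", "Globex Ltd", "Globex Limited"],
    "city": ["Mumbai", "Mumbai", "Pune", "Pune"],
})

# Character n-grams tolerate spelling variants across duplicate records.
features = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 3)).fit_transform(
    customers["name"] + " " + customers["city"]
)

# DBSCAN with cosine distance groups likely duplicates into clusters;
# each cluster is then merged into a single golden record.
customers["cluster"] = DBSCAN(eps=0.5, min_samples=1, metric="cosine").fit_predict(features)
print(customers)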
Oct 2019 – Mar 2020: Solution Architect, Analytics POD Mumbai, India
McDonalds (Accenture Applied Intelligence)
Design an architecture to create a data mart for the Analytics Platform. Global store-level data must be consolidated and parsed to create segregation-based data layers.
Technology/Tools:
Spark, Scala, Python, AWS – Glue, s3, EMR, Athena, Redshift
● Design an ETL framework to capture global store data (semi-structured) into the data lake.
● Parse, catalogue and create a readable platform leveraged by the Data Science team to gather meaningful insights (models such as Intelligent Menu Master, Transactional Analytical Record, etc.)
● Data profiling, audit and reconciliation framework for data uniqueness and validations using Spark on EMR, S3 as the data layer, Glue to catalogue and Athena/Redshift to query.
● Data transformation using Spark to parse multi-nested XML input data at petabyte scale (a parsing sketch follows this list).
● Runtime and cost optimisation of Spark jobs.
Mar 2019 – Sep 2019: Data Strategy Consultant, Data Governance Mumbai, India
Bank of Baroda (Accenture Applied Intelligence)
Design and implement Metadata Management and Data Quality. With the proposed Data Lake setup and the available gamut of tools, Metadata Management in Hadoop is set up using the Informatica Big Data Bundle. Informatica EDC is used to create end-to-end data lineage from source through to data consumption in the enriched zone. Data Quality strategies cover data auditing, creating a Single Customer View, cleansing and enrichment, using Informatica Analyst for profiling and Hive to inspect the data.
Create and use a Python-based Data Quality tool to remove discrepancies and run data quality checks. The tool connects to Hive to collect data, processes it through PySpark and produces clusters of records with confidence scores (a sketch follows the tool list below).
Technology/Tools:
Informatica EDC, Informatica Analyst, Hive, Python, pyspark
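A minimal sketch of the PySpark data quality checks, assuming a hypothetical Hive table and column names; the real rules and the confidence-scoring model were project specific.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks").enableHiveSupport().getOrCreate()

# Hypothetical Hive table standing in for the bank's customer data.
df = spark.table("enriched.customers")

# Profile each column: null rate per column is a basic input
# for downstream quality scoring.
total = df.count()
profile = df.select(
    *[F.round(F.avg(F.col(c).isNull().cast("int")), 4).alias(f"{c}_null_rate")
      for c in df.columns]
)
profile.show(truncate=False)

# Example rule: flag records failing a completeness check on key fields
# ("customer_id" and "pan_number" are illustrative column names).
failures = df.filter(F.col("customer_id").isNull() | F.col("pan_number").isNull())
print(f"{failures.count()} of {total} records fail the completeness rule")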
Sep 2018 – Mar 2019: Solution Architect, Hyper Personalization Mumbai, India
HDFC Life (Accenture Applied Intelligence)
Design the architecture for cloud-based real-time processing of FLS, Lead and Policy data to feed insights, tasks and triggers to agents through mobile applications. Transformation, modelling, scoring, incentive optimisation, task prioritisation and budget optimisation are done using Python and R scripts, with AWS DynamoDB as the data lake and Lambda (Python), Docker, ECS tasks, Step Functions and R scripts to generate insights.
Technology/Tools: AWS (S3, Lambda Function – Python 3.6, Step Functions State Machine, CloudWatch logs and events, ECS Tasks, DynamoDB, Fargate), R, Docker, Jenkins (CI/CD), GitLab.
● Design the cloud-based solution for real-time processing of sales force (FLS, Lead and Policy) data, transforming it between layers and pushing prioritised tasks to agents on the mobile app.
● Implement a serverless flow using Lambda functions to break batch-mode Python scripts into steps performing transformation and task allocation (a handler sketch follows this list).
● Run R scripts for modelling, scoring and optimisation inside Docker, scheduled through ECS tasks.
● Conceptualise and create design diagrams for near-real-time processing of tasks and triggers based on data updates.
● Integrate modules such as Transformation, Aggregation, Scoring, Modelling, Optimisation and Task & Trigger using several AWS components to provide a near-real-time data feed into the mobile application.
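A minimal sketch of one Lambda step in the serverless flow, assuming a hypothetical DynamoDB table, input shape and prioritisation rule; the real transformation logic and state machine were project specific.

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("agent_tasks")  # hypothetical table name

def handler(event, context):
    """One Step Functions task: transform an incoming policy record
    and persist a prioritised task for the agent."""
    record = event["policy_record"]  # hypothetical input shape
    task = {
        "agent_id": record["agent_id"],
        "task_id": record["policy_id"],
        # Illustrative rule; the real scoring came from the R models.
        "priority": "HIGH" if record.get("lapse_risk", 0) > 0.8 else "NORMAL",
    }
    table.put_item(Item=task)
    # The returned value feeds the next state in the state machine.
    return {"status": "TASK_CREATED", "task": task}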
Mar 2018 – Oct 2018: IM Consultant, Shared Analytics Melbourne, Australia
NBN (Accenture Applied Intelligence)
The role involved designing and implementing a metadata store for the shared analytics platform, storing key metadata from all data sources, lineage for all objects/tables and technical reconciliation for all processes. The data in the data lake is also used by other business units and by the organisation's data science team to generate near-real-time insights.
Technology/Tools: AWS (S3, Lambda Function – Python 3.6, Simple Notification Service, Step Functions State
Machine, Cloudwatch, ELB), Terraform, Jenkins (CI/CD), REST API(Flask), Elasticsearch – Kibana, GitLab, SonarQube,
pytest, coverage.
● Design a cloud-based metadata store for Kafka Schema Registry, S3, DataStax Enterprise and UDS.
● Implement the AWS architecture using dedicated S3 buckets, SNS event notifications and AWS Lambda functions in Python to push the data into Elasticsearch for visualisation in Kibana (a handler sketch follows this list).
● Index mapping of JSON-format metadata into Elasticsearch.
● Create dashboards and visualisations in Kibana.
● Visualise real-time metadata, lineage and reconciliation at object level.
● Write Flask REST APIs to push static metadata into the production S3 bucket: when a JSON file is committed to the GitLab repo, the API picks up the file and pushes it to S3.
● Execute TDD using SonarQube, Jenkins pipelines, pytest and coverage.
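A minimal sketch of the SNS-triggered Lambda that indexes object metadata into Elasticsearch, assuming the 7.x elasticsearch Python client and a hypothetical endpoint and index name; the real document model matched the platform's metadata schema.

import json
from elasticsearch import Elasticsearch

# Hypothetical endpoint and index; assumes the 7.x elasticsearch Python client.
es = Elasticsearch("https://example-es-endpoint:9200")

def handler(event, context):
    """Triggered by an SNS notification for a new S3 object:
    index the object's metadata document into Elasticsearch."""
    for record in event["Records"]:
        # The SNS message body carries the original S3 event JSON.
        message = json.loads(record["Sns"]["Message"])
        for s3_record in message["Records"]:
            doc = {
                "bucket": s3_record["s3"]["bucket"]["name"],
                "key": s3_record["s3"]["object"]["key"],
                "size": s3_record["s3"]["object"]["size"],
                "event_time": s3_record["eventTime"],
            }
            # One document per object; Kibana dashboards read this index.
            es.index(index="object-metadata", body=doc)
    return {"indexed": True}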
June 2017 – Jan 2018: Data Analytics Consultant, Born in Cloud Kolkata, India
Autodesk (Accenture)
Technology/Tools: AWS, EMR, TEZ, Sqoop, Hive, Oozie, Redshift, RDS(MySQL), Spark.
● Design and develop a real-time distributed data warehouse on AWS for structured data from multiple sources such as Pelican, Revpro and MDS.
● Transform the data across three layers and push it through Redshift into SAP BO.
● Implement the Tez execution engine so Hadoop jobs run up to 10x faster (a sketch follows this list).
● Implement EMRFS and upgrade EMR and other tools to the latest versions.
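A minimal sketch of switching a Hive session to the Tez execution engine, assuming the PyHive client; the host, database and query below are hypothetical stand-ins for the real cluster details.

from pyhive import hive

# Hypothetical host and database; real cluster details were environment specific.
conn = hive.Connection(host="emr-master.example.internal", port=10000, database="warehouse")
cursor = conn.cursor()

# Switch this session from MapReduce to Tez before running the job's queries.
cursor.execute("SET hive.execution.engine=tez")
cursor.execute("SELECT source_system, count(*) FROM orders GROUP BY source_system")
for row in cursor.fetchall():
    print(row)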
Oct 2016 – June 2017: Data Analyst, DM and Cloud COE Pune, India
Cybage/Accenture
Technology/Tools: EMR, AWS, Sqoop, Hive, Oozie, Python, Scala, AI/ML (Supervised Learning), Spark.
● Designed and developed an application to gather user information and gain insights to grow the organisational assistance system.
● Designed and implemented a graphical interface in Python using Tkinter (a minimal sketch follows).
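A minimal Tkinter sketch of the kind of data-entry interface described above; the widgets and fields are illustrative, not the original application.

import tkinter as tk

# Illustrative form: collect a user detail and echo it back.
root = tk.Tk()
root.title("User Information")

tk.Label(root, text="Name:").grid(row=0, column=0, padx=5, pady=5)
name_entry = tk.Entry(root, width=30)
name_entry.grid(row=0, column=1, padx=5, pady=5)

status = tk.Label(root, text="")
status.grid(row=2, column=0, columnspan=2)

def submit():
    # In the real application the captured input fed an insights pipeline.
    status.config(text=f"Captured: {name_entry.get()}")

tk.Button(root, text="Submit", command=submit).grid(row=1, column=1, sticky="e", padx=5)
root.mainloop()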
May 2014 – Dec 2014: Senior Developer, CSM & SXP-Web-CPCT Pune, India
AT&T (TechM)
Technology/Tools: Microsoft.Net, MS SQL 2008 R2, MVC 4.0, Jquery, JavaScript, Oracle PL/SQL
● Carried out analysis and requirements gathering for the activity, and accordingly designed and followed the development and unit test case plan.
● Created a new Cost Model for CSM and developed various sub-modules under it.
● Created procedures and functions in PL/SQL as per requirements.
May 2010 – April 2014: Developer (Nautilus, Unifi, Vantage, Autosys) Pune, India
Fiserv
Technology/Tools: Microsoft.Net, MS SQL, Web Services, LINQ, Entity Framework, JavaScript
● Primary DevOps resource for Fiserv's proprietary Nautilus enterprise content management system and UniFi.
● Daily removal of document and process locks in Nautilus; committing batches from the DIP awaiting-commit queue.
● Process coordinator.
Educational Qualifications
AMITY University Lucknow, India
Bachelor of Technology, Computer Science and Engineering May 2010