Role overview
A senior individual contributor role responsible for designing, building, and operationalizing end-to-end AI and machine learning solutions that accelerate CNA's migration to a modern cloud data lakehouse. The engineer works across structured and unstructured data domains (including documents, images, audio, and transactional records) to unlock analytical value through scalable pipelines, RAG architectures, vector databases, and knowledge graphs. The role may also guide others in building complex technical capabilities.
Responsibilities
- Design and build AI solutions that accelerate data migration from legacy systems to the cloud, ensuring scalability, reliability, and governance compliance.
- Design and implement scalable ingestion and transformation pipelines across structured (SQL, relational) and unstructured (documents, images, audio, email, call transcripts) data sources, applying OCR, NLP preprocessing, and document chunking strategies optimized for LLM consumption.
- Implement modern lakehouse patterns on Google Cloud Platform (GCP) — including data governance, cataloging, and lineage tracking — to ensure data is reliably discoverable, auditable, and fit for AI/ML workloads at scale.
- Design and implement vector databases, embedding pipelines, and knowledge graph structures that serve as the foundational retrieval layer for RAG and other AI applications.
- Productionize and operationalize AI solutions and advanced analytics in a DevOps/MLOps environment, including automated testing, monitoring, and rollback capabilities.
- Cultivate innovation by proactively proposing new ideas and identifying the right combination of tools and frameworks to turn business problems into analytics solutions.
- Research, identify, and implement process improvements that address complex technology gaps, and build strong knowledge of technology enablers.
Basic qualifications
- Bachelor's Degree in Computer Science, Engineering, Mathematics, Computational Statistics, Data Science, or a related technical field (or equivalent experience); Master's Degree preferred.
- Typically 7+ years of experience in data engineering, artificial intelligence, or machine learning.
- 2+ years of coding experience, with proficiency in at least one programming language (Python, Java, SQL).
- Applicable certifications preferred (GCP, Data Engineering).
Preferred qualifications
- Experience using Agile methods.
- Experience with the insurance industry and its products and services.
- Experience implementing big data processing technologies, preferably Apache Spark.
Benefits
- CNA offers a comprehensive and competitive benefits package to help our employees – and their family members – achieve their physical, financial, emotional and social wellbeing goals.
About the company
You have a clear vision of where your career can go. And we have the leadership to help you get there. At CNA, we strive to create a culture in which people know they matter and are part of something important, ensuring the abilities of all employees are used to their fullest potential.