Principal Software Developer – AI Data Architect
Caseware • Colombia (Remote)
Caseware is one of Canada's original Fintech companies, having led the global audit and accounting software industry for over 30 years, with more than 500,000 users across 130 countries and software available in 16 languages. While you might not have heard of us (yet), over 36,000 accounting and audit professionals list Caseware as a skill on their LinkedIn profiles!
We are seeking a Principal Software Developer – AI Data Architect to drive the technical vision and architectural strategy of Caseware’s AI-Ready Data Platform. In this role you will define the enterprise data architecture, patterns, and modeling standards that deliver trusted, governed, high-quality data products. These products form the foundational data platform for our cloud offerings, enabling AI capabilities and secure interoperability with customer systems while powering analytics and strengthening our core products.
This role requires hands-on experience delivering data for AI workflows and practical familiarity with modern LLM tooling and AI platform integration patterns. You will apply this experience to build a data foundation that supports AI workflows and agentic capabilities, analytics, and customer interoperability.
This is a key leadership role where you will act as a hands-on architect while mentoring the development team, guiding the long-term technical vision, shaping enterprise data architecture standards across teams, and contributing to crucial AI and data platform projects.
📍 Location: This is a fully remote position located in Colombia.
You will be reporting to:
Contact:
Maira Russo - Senior Talent Acquisition Partner
What you will do:
What you will bring:
• 10+ years of experience in software development and data engineering, with at least 5 years in a senior technical leadership role, preferably as a Principal Developer or Data Architect.
• Demonstrated experience architecting and enabling data for AI workflows in production, such as embedding pipelines, vector-based retrieval, RAG data workflows, and real-time/event-driven data flows that support agentic systems and AI integrations (see the embedding-and-retrieval sketch after this list).
• Experience with AI platform integration and orchestration patterns, including agent workflow orchestration and LLM/agent observability and evaluation, partnering closely with data science and engineering teams to operationalize AI-Ready datasets and pipelines. Familiarity with LangGraph, Langfuse, MCP, AWS Bedrock, AWS AgentCore, and LaunchDarkly is preferred.
• Experience enabling secure interoperability patterns with customer AI systems, including governed data access, tenant-aware controls, and safe integration patterns for customer-managed AI workflows.
• Experience defining AI and data governance and platform adoption standards in large organizations, including controls for privacy, access, auditability, safe reuse, and operational guardrails for AI-Ready datasets and data products.
• Experience designing modern data platforms on cloud-native infrastructure (AWS preferred), including lakehouse patterns, medallion architecture, ETL/ELT pipelines, distributed processing with Spark, Trino, and MapReduce, and analytics and AI-Ready data at scale, with strong operational practices (see the bronze-to-silver sketch after this list).
• Hands-on experience with core data technologies and integration patterns: MongoDB, Amazon DocumentDB, MS SQL Server, DynamoDB, and AWS ElastiCache for Redis or Valkey; event streaming and queueing using SNS/SQS. Familiarity with Postgres, pgvector, and Kafka or Pub/Sub is an asset.
• Hands-on experience with AWS data platform services: S3, S3 Express, Athena, Glue Catalog, Lake Formation, OpenSearch Serverless, S3 Vector Storage, Iceberg, Lambda, Step Functions, EKS, ETL on EMR, and EMR Serverless.
• Proven ability to architect and deliver scalable, reliable data systems and product data architectures, guiding teams in data models, storage and integration architectures, data contracts, data domain taxonomy, schema and event versioning, and resolving performance and scale bottlenecks.
• Proficiency in data movement and performance architecture: experience designing replication, event sourcing, and CDC/change-tracking strategies, safe historical reprocessing patterns, and performance optimization through query analysis, indexing, and partitioning (see the CDC sketch after this list).
• Strong technical leadership: Experience mentoring teams, setting engineering and architecture standards, and influencing technical direction across multiple teams.
• Experience working with DevOps teams, CI/CD pipelines, infrastructure-as-code, and operational tooling to deliver scalable, resilient data platforms and pipelines.
• Strong communication and collaboration skills, in English, to align cross-functional teams and engage with senior leadership on technical strategy, trade-offs, and decisions.
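To give a flavour of the embedding-pipeline and vector-retrieval experience called out above, here is a minimal, non-authoritative sketch: documents are embedded at ingestion time, and a query is embedded and ranked against them by cosine similarity, the retrieval step that would ground an LLM answer in a RAG workflow. The `embed` function is a hypothetical stand-in for a real embedding model call, and the in-memory list stands in for a managed vector store such as OpenSearch Serverless.

```python
# Minimal sketch of an embedding pipeline feeding vector-based retrieval.
# `embed` is a hypothetical placeholder for a real embedding model call;
# the list-based index stands in for a managed vector store.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical embedding call; returns a unit-length vector."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

# Ingestion: store (chunk, vector) pairs.
documents = ["Audit working paper on revenue recognition.",
             "Engagement checklist for inventory counts."]
index = [(doc, embed(doc)) for doc in documents]

# Retrieval: embed the query and return the top-k most similar chunks.
def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    scored = sorted(index, key=lambda pair: float(np.dot(q, pair[1])), reverse=True)
    return [doc for doc, _ in scored[:k]]

print(retrieve("How do we document revenue testing?"))
```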
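For the lakehouse and medallion-architecture experience, the sketch below shows one bronze-to-silver hop in PySpark under assumed paths and column names (not Caseware's actual schema): raw landed events are deduplicated, typed, and written out partitioned. In practice the silver table might be an Iceberg table registered in the Glue Catalog; the sketch writes Parquet to stay self-contained.

```python
# Minimal PySpark sketch of a bronze-to-silver medallion hop.
# Paths and column names are illustrative only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bronze-to-silver").getOrCreate()

# Bronze: raw JSON events as they landed in the lake.
bronze = spark.read.json("s3://example-lake/bronze/engagement_events/")

# Silver: deduplicate, enforce types, and derive a partition column.
silver = (
    bronze
    .dropDuplicates(["event_id"])
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .withColumn("event_date", F.to_date("event_ts"))
)

(silver.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://example-lake/silver/engagement_events/"))
```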
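And for the CDC/change-tracking requirement, a bare-bones sketch of applying change events to a target table: inserts and updates upsert the latest row image, deletes remove it. The event shape (op/key/row) is illustrative; a real feed might come from a database change stream or a Kafka topic.

```python
# Minimal sketch of applying CDC / change-tracking events to a target table.
# The event shape is illustrative, not a specific connector's format.
from typing import Any

target: dict[str, dict[str, Any]] = {}  # keyed by primary key

def apply_change(event: dict[str, Any]) -> None:
    """Upsert or delete one row based on a change event."""
    op, key = event["op"], event["key"]
    if op in ("insert", "update"):
        target[key] = event["row"]   # upsert the latest row image
    elif op == "delete":
        target.pop(key, None)        # remove the row if present

changes = [
    {"op": "insert", "key": "e-1", "row": {"status": "draft"}},
    {"op": "update", "key": "e-1", "row": {"status": "final"}},
    {"op": "delete", "key": "e-1"},
]
for event in changes:
    apply_change(event)

print(target)  # {} – the row was created, updated, then removed
```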