Data Architect - Java

Data Modeling, Data Warehouse, ETL, Load
Description

GSPANN is seeking a highly skilled Data Architect to join our team in Mexico. The ideal candidate will lead and define data architecture, ensure data quality, and establish data governance processes. This role involves handling millions of rows of data daily, solving significant big data challenges, and building top-tier data solutions that drive key business decisions.

Who We Are

GSPANN has been in business for over a decade, has more than 2,000 employees worldwide, and services some of the largest retail, high-technology, and manufacturing clients in North America. We provide an environment that enables career growth while offering direct interaction with company leadership.

Visit Why GSPANN for more information.

Role and Responsibilities
  • Design, implement, and lead data architecture, ensuring high standards of data quality and governance across the organization.
  • Establish and promote data modeling standards and best practices.
  • Develop and advocate for data quality standards and practices.
  • Create and maintain data governance processes, procedures, policies, and guidelines to ensure data integrity and security.
  • Promote the successful adoption of data utilization and self-service data platforms within the organization.
  • Create and maintain critical data standards and metadata to enable data as a shared asset.
  • Develop standards and write template code for sourcing, collecting, and transforming data for both streaming and batch processing.
  • Design data schemas, object models, and flow diagrams to structure, store, process, and integrate data.
  • Provide architectural assessments, strategies, and roadmaps for data management.
  • Implement and manage industry best practice tools and processes, including Data Lake, Databricks, Delta Lake, S3, Spark ETL, Airflow, Hive Catalog, Redshift, Kafka, Kubernetes, Docker, and CI/CD pipelines.
  • Translate big data and analytics requirements into scalable and high-performance data models, guiding data analytics engineers.
  • Define templates and processes for designing and analyzing data models, data flows, and integration.
  • Lead and mentor Data Analytics team members in best practices, processes, and technologies in Data Platforms.
Skills and Experience
  • Bachelor’s or Master’s degree in Computer Science or a related field.
  • Over 10 years of hands-on experience in Data Warehousing, ETL processes, Data Modeling, and Reporting.
  • More than 7 years of experience in productizing and deploying Big Data platforms and applications.
  • Expertise in relational/SQL databases, distributed columnar data stores/NoSQL databases, time-series databases, Spark Streaming, Kafka, Hive, Delta Lake, Parquet, Avro, and more.
  • Hands-on subject-matter expertise in the architecture and administration of Big Data platforms and Data Lake Technologies (AWS S3/Hive), and experience with ML and Data Science platforms.
  • Extensive experience in understanding complex business use cases and modeling data in the data warehouse.
  • Proficiency in SQL, Python, Spark, AWS S3, Hive data catalog, Parquet, Redshift, Airflow, and Tableau or similar tools.
  • Proven experience in building Custom Enterprise Data Warehouses or implementing tools like Data Catalogs, Spark, Tableau, Kubernetes, and Docker.
  • Good knowledge of infrastructure requirements such as Networking, Storage, and Hardware Optimization, with hands-on experience with Amazon Web Services (AWS).
  • Strong verbal and written communication skills, with the ability to work efficiently across internal and external organizations and virtual teams.
  • Demonstrated industry leadership in Data Warehousing, Data Science, and Big Data technologies.
  • Strong understanding of distributed systems and container-based development using Docker and the Kubernetes ecosystem.
  • Deep knowledge of data structures and algorithms.
  • Experience working in large teams using CI/CD and agile methodologies.

Key Details

Location: Mexico
Role Type: Full Time
Published On: 25 June 2024
Experience: 7+ Years

Apply Now