Iván G. Pérez
Data Scientist | ML Engineer
Details
Online Presence
Skills
  • Data Platforms
  • Machine Learning
  • Backend Dev
  • Cloud DevOps
  • Collaboration Process
Languages
  • Spanish
  • English
  • Portuguese
Interests
  • Sports
  • Continuous learning & knowledge sharing
  • Open Source
  • Photography
  • Science Fiction
  • MTG
Profile

Data Scientist with foundations in NLP and machine learning, building data-driven systems in Python and SQL. I enjoy creating practical AI/ML solutions and testing new methods through experimentation, statistical rigor, and continuous learning. At Nike, I developed prototypes, ran A/B experiments, and worked with embeddings, recommenders, and LLM integrations. I'm now applying those skills to an analytics-first SaaS project that integrates commerce and messaging platforms to deliver data-driven insights for micro-entrepreneurs. I'm committed to deepening my statistical and algorithmic foundations while working as a hands-on individual contributor.

Employment History
  • Personal Applied Research Project
    Peru (2025 - Present)
    • Researching conversational AI and LLM evaluation as a hands-on learning project using FastAPI, PostgreSQL, and the WhatsApp Cloud API.
    • Set up architecture and MLOps foundations: Docker Compose, repositories, CI/CD, typing/linting, observability scaffolds, and cost/latency monitoring.
    • Integrated a LiteLLM-based intent classifier with rule-based fallbacks to experiment with evaluation and guardrail strategies for safe dialogue flows.
    • Established structured logging, telemetry, and token budgeting aligned with privacy and security best practices.
    • Continuing development as a personal applied-learning effort to deepen expertise in LLM evaluation, observability, and deployment practices.
  • Data Scientist
    Nike Valiant Labs & Athlete Innovation (via Launchpad.AI) (2024 - 2025)
    • Designed the onboarding and Running Baseline framework for Nike’s Running Coach app, integrating exercise-science modeling with data architecture to initialize personalized training profiles from first use.
    • Developed a cold-start recommender system that generated immediate, personalized run suggestions through shared user/run feature spaces, clustering, and cosine-similarity matching.
    • Built an asynchronous Python prototype that read biometrics and used LLMs with text-to-speech to deliver real-time, context-aware coaching feedback during runs.
    • Engineered Nike’s Cue Engine (a FastAPI and Databricks-powered service orchestrating feature retrieval and LLM prompt generation) to produce personalized, on-tone coaching cues at scale, deployed on AWS ECS with full observability.
    • Created and optimized PySpark/SQL data pipelines on Databricks to support feature engineering, cue personalization, and downstream analytics.
  • Data Scientist
    Nike Global Marketing Sciences (via Launchpad.AI) (2022 - 2024)
    • Created consumer-focused KPIs and dashboards using PySpark, SparkSQL, Pandas, and Matplotlib to analyze engagement and fatigue across marketing channels.
    • Developed time-series forecasting systems to predict next-fiscal-year engagement metrics (email CTR and push open rate) using Databricks, Pandas, Statsmodels, and Holt-Winters exponential smoothing; results guided territory-level growth goals.
    • Analyzed 40M+ member interactions to identify promo engagement drivers and optimal communication frequency, applying PySpark, SQL, and XGBoost to reduce audience fatigue and improve campaign efficiency across email and push.
    • Designed and productionized Nike's universal holdout experimentation framework to measure marketing incrementality across global segments, automating daily Databricks pipelines that precalculated significance tests for millions of segment combinations and fed real-time Tableau dashboards.
    • Performed pre-test and post-hoc power analyses, applied one-sided z-tests, and validated effect sizes to ensure experiments were statistically sound and business-relevant.
    • Built and optimized PySpark/SQL pipelines on Databricks, improving Delta Lake performance and reducing data retrieval time by 20×.
  • ML Engineer
    Nike Sports Research Lab (via Launchpad.AI) (2022)
    • Performed data wrangling, sanity checks, and EDA using SparkSQL, PySpark, and Seaborn in Databricks.
    • Built data-ingestion pipelines integrating geolocation, weather, and perception data through external APIs.
    • Developed predictive models with XGBoost on biomechanical, physiological, and perception datasets.
  • Machine Learning & NLP Fellow
    Launchpad.AI (2022)
    • Explored task-oriented conversational-agent prototypes using seq2seq and Transformer models in PyTorch and TensorFlow.
    • Designed and curated datasets for NLP tasks in the fashion domain.
    • Led a 13-member Conversational AI team using Agile methodology.
    • Conducted research combining vision and language features (Detic, VinVL, CLIP).
  • Earlier Professional Experience (pre-2017)
    Spain / Remote
    • 5+ years as Full-stack Engineer at Accenture, building web/enterprise systems.
    • 6+ years as IT Business Consultant at KPMG, leading digital transformation initiatives.
    • 4+ years in PMO / Project leadership roles at KPMG / Telefónica, supporting cross-functional teams.
    • 3+ years in Digital Marketing & Customer Success at Bespoken.io, driving product-user alignment.

Other relevant work experience is listed on my LinkedIn profile.
Education
  • Computer Science Engineer (top 25%; completed thesis to earn the Engineer title)
    Pontificia Universidad Católica del Perú (PUCP)
  • Artificial Intelligence coursework
    Columbia University (AI Program, 2020) | Stanford University (NLP with Deep Learning, 2021)

Additional courses and certifications are listed on my LinkedIn profile.
Publications