I am Willian Pina

Passionate about data, challenges, and technology.

About Me

Hello,

I am Willian Pina, a Data Scientist and Operations Analytics Coordinator with experience in data science, strategic analysis, and artificial intelligence applied to geospatial contexts.

I work at the intersection of data, technology, and decision-making, coordinating and developing analytical solutions that support operational and strategic choices in complex, real-world environments.

Location: Brazil

Focus: Data Science, Analytics, Operational & Strategic Analysis (Geospatial)

Languages

Portuguese: Native

English: Advanced (Professional proficiency)

Technical Skills

Python 90%
SQL 85%
Machine Learning 80%
Geospatial AI 75%
GIS Tools 70%
Generative AI 65%

My Journey

🎓 Education

MSc in Data Science

University of Colorado Boulder — Computer Science

2023 – 2025

Postgraduate in Data Science (Specialization)

University of Colorado Boulder — Computer Science

2023 – 2024

Postgraduate in Artificial Intelligence

Mackenzie Presbyterian University — Engineering & Technology

2022

Bachelor's in Military Sciences

Brazilian Military Academy (AMAN)

1999 – 2002

💼 Experience

Data Scientist & Operations Analytics Coordinator

CENSIPAM — Brazilian Amazon Surveillance and Protection System
(Ministry of Defense)

2025 – Present

Intelligence Analyst

Geospatial & Operational Intelligence — Brazilian Army

2012 – 2024

Brazilian Army Officer

Leadership, planning, and operations

2003 – 2016

📜 Certifications

Core Certifications
  • IBM Data Science Professional Certificate
    IBM — 2026
Professional Programs
  • Google Data Analytics — 2024
  • Google Advanced Data Analytics — 2024
  • Google Business Intelligence — 2024
  • Google Project Management — 2024
  • AWS Academy — Cloud Foundations — 2024
  • AWS Academy — Cloud Architecting — 2024
Additional Training
  • Fundamentals of GIS (National Geospatial Intelligence College) — 2025
  • Introduction to MongoDB — 2024
  • Google Cybersecurity — 2024
  • Google Prompt Essentials — 2024
  • Google Cloud Data Analytics — 2026

Projects

COVID-19 Impact Analysis

Analysis of the Impacts of COVID-19

Analysis of global COVID-19 data on confirmed cases and deaths, restricted to countries with populations larger than Brazil's.

Master's · Data Science · Statistics · R Language

BBC News Classification

BBC News Classification

Academic project exploring unsupervised learning techniques, with emphasis on matrix factorization, to automatically identify and classify BBC news categories from text data.

Unsupervised Learning · Matrix Factorization · NLP

Cancer Detection

Metastatic Cancer Detection in Digital Pathology Images

Binary classification of histopathological images to detect metastatic cancer in digital pathology patches.

Computer Vision · TensorFlow · Medical Imaging

UK Job Market

Data Scientist Job Market Analysis

Exploratory analysis of data scientist job postings in the UK, investigating salary variation, in-demand skills, and the relationship between company ratings, remote work, and compensation.

Data Analysis · Data Visualization · Job Market Analytics

ENEM 2022

Predictive Analysis of ENEM 2022

Application of supervised learning techniques to analyze and predict student performance in ENEM 2022, considering demographic, social, and educational factors based on official INEP microdata.

Supervised Learning · Data Science

Monet Style GAN

Monet-Style Image Generation with Generative Adversarial Networks

Deep learning project using Generative Adversarial Networks (GANs) to generate images in the style of the artist Claude Monet from paintings and general photographs.

Deep Learning · Computer Vision · TensorFlow

Invasive Species Monitoring

Invasive Species Monitoring

Image classification project applying computer vision and deep learning techniques to identify invasive species in forest environments, supporting large-scale environmental monitoring.

CNN · Computer Vision · TensorFlow

NLP Disaster Tweets

Disaster Tweet Classification

Natural Language Processing (NLP) project that automatically classifies tweets by whether their content refers to a real-world disaster, based on semantic context.

Text Classification · Machine Learning · TensorFlow

NYC Shooting Incidents

New York City Shooting Incident Analysis

Academic project involving exploratory and spatial analysis of historical shooting incident data in New York City, using official NYPD records to identify temporal, geographic, and demographic patterns of gun violence.

Data Analysis · Data Visualization · R Language

Single-Cell Perturbations

Single-Cell Perturbation Analysis

Academic project applying unsupervised learning techniques, with emphasis on matrix factorization, to analyze cellular responses to drug perturbations in single-cell data, integrating data science and biotechnology.

Unsupervised Learning · Bioinformatics · K-Means

Energy Market Analysis Data

Global Renewable Energy Market Analysis and Projections to 2030

Analytical study of the global renewable energy market using UN data to identify wind and solar trends, cluster countries, and project future scenarios.

Time Series · Sustainability · Data Science

CNN Optimization with CIFAR-100

Convolutional Neural Network Optimization on CIFAR-100

Machine learning optimization project improving a CNN for CIFAR-100 image classification using architectural tuning and Snapshot Ensembles.

Deep Learning · Computer Vision · Model Optimization
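
For readers unfamiliar with Snapshot Ensembles, the core idea is a cyclic cosine-annealed learning rate: the rate restarts several times during training, and a model snapshot is saved at the end of each cycle for later ensembling. The sketch below is a generic illustration of that schedule, not the project's actual code; the function names and the example values (100 iterations, 5 cycles, initial rate 0.1) are assumptions chosen for demonstration.

```python
import math


def snapshot_lr(iteration, total_iterations, num_cycles, initial_lr):
    """Cyclic cosine-annealed learning rate for a 0-based iteration index."""
    cycle_len = math.ceil(total_iterations / num_cycles)
    position = iteration % cycle_len  # position inside the current cycle
    return initial_lr / 2 * (math.cos(math.pi * position / cycle_len) + 1)


def is_snapshot_step(iteration, total_iterations, num_cycles):
    """True on the last iteration of each cycle, where a snapshot is saved."""
    cycle_len = math.ceil(total_iterations / num_cycles)
    return (iteration + 1) % cycle_len == 0


if __name__ == "__main__":
    # Hypothetical settings, for illustration only.
    total, cycles, lr0 = 100, 5, 0.1
    for it in range(total):
        if is_snapshot_step(it, total, cycles):
            print(f"iteration {it}: lr={snapshot_lr(it, total, cycles, lr0):.5f} -> save snapshot")
```

In a real training loop the snapshots saved at these steps would be averaged at prediction time; that part depends on the framework and is omitted here.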

Geospatial Terrain Analysis

Geospatial Analysis for Detecting Hidden Illegal Mining in Sararé

Application of geospatial techniques and satellite imagery to identify areas with a high probability of hidden illegal mining machinery, supporting operational and environmental actions in the Sararé Indigenous Territory.

Geospatial AI · Remote Sensing · Satellite Imagery

Geospatial Analysis of Housing Data

Geospatial Analysis of Housing Data — Brasília (DF)

Geospatial analysis of real estate prices in Brasília using property features, location, and points of interest.

Geospatial AI · Real Estate Analytics · OpenStreetMap
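
One common way to turn points of interest into a model feature is the great-circle distance from each property to its nearest point of interest. The sketch below shows that idea with the haversine formula. It is an illustrative assumption about how such a feature could be built, not the project's code; the function names and the coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_poi_km(prop, pois):
    """Distance in kilometers from a property to its closest point of interest."""
    return min(haversine_km(prop[0], prop[1], p[0], p[1]) for p in pois)


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    listing = (-15.80, -47.90)
    points_of_interest = [(-15.79, -47.88), (-15.70, -47.80)]
    print(f"nearest POI: {nearest_poi_km(listing, points_of_interest):.2f} km")
```

The resulting distance can be added as a column alongside property features before fitting a price model.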

Sararé Mining Structure Detection

Illegal Mining Structure Detection in the Sararé Indigenous Territory

Geospatial AI project for detecting illegal mining structures and machinery using satellite imagery and YOLO.

Geospatial AI · Computer Vision · Environmental Monitoring

Airline Reviews

Customer Satisfaction Analysis at British Airways

Analysis of airline customer reviews using clustering and logistic regression to identify satisfaction patterns and key factors influencing customer recommendations at British Airways.

Data Science · Machine Learning · Customer Satisfaction Analysis

PDF Summarizer

PDF Summarizer with Text-to-Speech Conversion

Streamlit application that lets users upload PDF files, automatically summarizes their content, and converts the summaries to audio, making long documents easier to consume.

NLP · Streamlit · Text-to-Speech

Fintech Case Study

Cloud Data Analytics for Fintech with BigQuery and SQL

Case study in cloud data analytics using BigQuery and advanced SQL to analyze loan data, regional distribution, and credit trends in support of financial decision-making at a fintech company.

Cloud Analytics · BigQuery · SQL · Data Analysis