Lucas Pastur-Romay

Manager Data Scientist

Manager Data Scientist with 6+ years of professional experience developing cross-industry solutions to complex problems using machine learning techniques. I manage data science teams and lead projects end-to-end: I develop proofs of concept, oversee technical implementation and code quality, and supervise execution in production environments.

Artificial Intelligence

One of my passions is developing and implementing AI models to solve real-life problems. I have professional experience with a range of supervised ML models: Artificial Neural Networks (FNN, CNN, LSTM, VAE, GAN, transformers, diffusion models), Gradient Boosting Machines, and Random Forest. I also use unsupervised algorithms (k-means, KNN, HDBSCAN) and dimensionality reduction techniques (PCA, t-SNE, UMAP).

Big Data

I have developed and maintained data science projects in Big Data environments for large enterprises. Experience with different cloud providers and platforms (AWS, Google Cloud, Azure, Databricks). Knowledge of tools such as S3, EC2, Airflow, Lambda, and MongoDB. Use of structured and unstructured data sources (text, image, video, geospatial data). I have trained large deep learning models using GPU clusters and supercomputers.

Business

I work at Mango, a multinational fashion company. I have been developing Inspire, a platform for fashion designers. I have experience in different industries: pharmaceutical, healthcare, banking, insurance, automotive, chemical industry, entertainment, food retail, and marketing. Expert in translating client needs into project requirements to maximize impact. I have performed planning, budgeting, and development of over 30 data science projects.

Programming

I have experience in software implementation using Python, PySpark, R, SQL, HTML, and CSS. I have developed highly complex projects on big data architectures. I like writing clean code that follows good programming practices. Use of Gitflow, CI/CD tools (Jenkins), and process orchestration (Airflow). Implementation of agile methodology and use of management tools such as Jira.

Visualization

I enjoy processing and analyzing data to prepare visualizations and dashboards that make the data easier to understand and communicate results to stakeholders. I have developed graphics and web applications using tools such as Plotly, Streamlit, Shiny, and D3.js. Preparation of interactive visualizations using ML models, embeddings, images, and graphs.

Healthcare

I have a bachelor's degree in Biology and a master's degree in Neuroscience. During my PhD I was part of a research group in data science applied to healthcare. Recently, I published a scientific paper on BRAF mutation in melanoma. I participated in a hackathon about neurofibromatosis-related cancer. I have been working on a personal project to translate sign language.

These images were created using the Stable Diffusion model. If you like these illustrations, visit the AI Art Gallery.

My Projects

Personal web page

I have developed this website to improve my knowledge in web application development using Django, HTML and CSS. The application is hosted on an EC2 cloud server on AWS.

PhD Thesis and papers

I hold a PhD cum laude with a thesis on neuroscience and deep learning. I have written scientific papers about neuromorphic chips, brain simulation, deep learning, and big data applications in pharmaceuticals and bioinformatics.

Sign language translator

Application for real-time translation from sign language to text. Use of OpenCV for video processing. Use of landmark detection models and training of classification models.

Hack4NF 2022

Best page award in a hackathon about a rare disease called neurofibromatosis. I trained models to predict cancer types using genomic and clinical data. Use of NLP techniques for clustering genes. Development of a web application to analyze the results.

Sentiment analysis

Sentiment analysis using Machine Learning models (BERT, LightGBM, Logistic Regression). Use of the SHAP explainability technique to identify the most important words. Visualization using word clouds.

COVID data dashboard

Use of COVID data to analyze the temporal evolution by country. Preparation of a dashboard using Plotly and Streamlit. Use of geospatial data to make COVID incidence maps by country.