Preetham Ganesh

Machine Learning Engineer

About Me

"Creativity is seeing the same thing but thinking differently" – Dr. A.P.J. Abdul Kalam

Hello, I'm Preetham Ganesh, an ML Engineer. I'm passionate about building and deploying efficient Machine Learning applications to solve real-world problems. I thrive on seeking new challenges to continuously improve my skills and expertise.

I hold an MS in Computer Science from the University of Texas at Arlington, where I worked at the Vision-Learning-Mining (VLM) Research Lab developing an ASL-to-English translation application, work that fueled my passion for inclusive technology. I later joined Qualitest as a Data Analyst supporting Google, where I analyzed user-submitted search queries. Following that, at Clarium, I engineered ML pipelines to extract and summarize insights from healthcare and financial documents. Currently, I work at Ajace, where I lead the development of full-stack AI applications. My work focuses on building Retrieval-Augmented Generation (RAG) systems, real-time audio transcription pipelines using Whisper, and dynamic tool invocation via LLM function calling.

I'm excited to connect with professionals, share insights, and explore new opportunities. If you're interested in collaborating, discussing exciting projects, or simply connecting over shared interests, please feel free to reach out at [email protected].

Experience

March 2025 - Present

Machine Learning Engineer @ Ajace

(Offer accepted November 2024)

Built & deployed full-stack AI systems, including Whisper-based real-time transcription and RAG pipelines with TimescaleDB & Ollama for contextual document search. Integrated LLM function-calling workflows with Zendesk to automate ticket creation and reduce support latency. Modularized tools for scalable orchestration across speech, retrieval, and action layers.

LLMs
Retrieval-Augmented Generation (RAG)
Whisper
Function Calling
TimescaleDB
Ollama
LangChain
FastAPI
PostgreSQL
Docker
LLM Tooling
Zendesk API

March 2022 - March 2024

Machine Learning Engineer @ Clarium

Designed & deployed advanced ML models, including TensorFlow-based Segmentation & Word Recognition models, as part of a Health & Medical Document Summarization tool. Optimized model selection with MLflow, fine-tuned the Google T5-Small LLM, and boosted model performance with synthetic data augmentation.

TensorFlow
Natural Language Processing (NLP)
Computer Vision
MLflow
Microservices
OpenCV
SQL
Docker
Keras
REST APIs

October 2021 - March 2022

Data Analyst @ Qualitest

Enhanced data quality by 15% with SQL optimizations and created dashboards for improved decision-making. Automated data workflows with Python, reducing manual tasks by 10%, and ensured data consistency across diverse datasets through cross-functional collaboration.

Google BigQuery
Data Analysis
SQL
Knowledge Graphs
Python

February 2020 - May 2021

Graduate Student Researcher @ University of Texas at Arlington - VLM Research Lab

Developed a cascaded ML model system to convert ASL videos into English speech in real time. Migrated OpenPose from OpenCV to PyTorch, reducing extraction time by 90% and enabling real-time analysis. Optimized pipeline performance through hyperparameter tuning, achieving 98% top-5 accuracy on the WLASL benchmark for Video Sign Language Recognition.

TensorFlow
OpenCV
PyTorch
Computer Vision
Natural Language Processing (NLP)
Academic Publishing

March 2018 - April 2019

Undergraduate Student Researcher @ Amrita Vishwa Vidyapeetham

Developed hybrid ensemble models for rainfall prediction in Tamil Nadu using bagging and boosting techniques. Created district-specific and cluster-based models to address regional differences, achieving over 91% accuracy in forecasting.

Regression
Scikit-Learn
Matplotlib
SciPy
NumPy
Pandas

Skills

Programming Languages

Python
R
SQL
MySQL
BigQuery
MATLAB
HTML
CSS
JavaScript
Java

Cloud & DevOps

AWS EC2
AWS S3
AWS SageMaker
AWS ECR
AWS ECS
Azure ML
GCP
Heroku
Docker
Git
GitHub
TensorFlow Serving
GitHub Actions
GitLab CI
CI/CD

Packages & Frameworks

TensorFlow
Keras
Scikit-Learn
PyTorch
MLflow
NLTK
spaCy
NumPy
OpenCV
Flask
Multiprocessing

Data Processing & Visualization Tools

Pandas
PySpark
Power BI
Tableau
Hadoop
Scala
Snowflake

Publications

2021

POS-Tagging based Neural Machine Translation System for European Languages using Transformers

Authors: Preetham Ganesh, Bharat S. Rawal, Alexander Peter, Andi Giri

This study addresses language barriers by proposing a novel Neural Machine Translation (NMT) approach using inter-language word similarity and Part-of-Speech (POS) tagging for model training and testing. Two classical architectures, Luong Attention-based Sequence-to-Sequence and Transformer models, were used, with tokenization by SentencePiece and Subword Text Encoder, respectively. The models were evaluated on Spanish, French, and German datasets with BLEU, Precision, and METEOR scores, showing promising results.

2020

Personalized system for human gym activity recognition using an RGB camera

Authors: Preetham Ganesh, Reza Etemadi Idgahi, Chinmaya Basavanahally Venkatesh, Ashwin Ramesh Babu, Maria Kyrarini

This paper presents a Human Activity Recognition system that uses an RGB camera to classify gym activities (e.g., push-up, squat) with models such as SVM, Decision Tree, KNN, and Random Forest, with Random Forest achieving 98.98% accuracy. A repetition counter was developed using local minima analysis and dynamic time warping to assess workout accuracy per skeletal point. An interactive Android app was also built to give users insights into their workouts.

Projects

FLAIR Abnormality Detection and Segmentation System for Brain MRI Images

Designed and implemented an ML pipeline for FLAIR abnormality detection and segmentation using TensorFlow and MLflow. Trained a CNN classifier (95% accuracy) and a U-Net segmentation model (Dice score: 0.77). Built a Streamlit-based front end with a Flask backend for real-time inference, deployed on a homelab server using Docker, with routing configured via Cloudflare.

Digit Recognizer

Developed and deployed a CNN-based digit recognition model using TensorFlow, trained on the MNIST dataset (accuracy: 98.5%). The system is hosted on a homelab server, with a Streamlit front end for user input and a Flask API backend for real-time inference.

Education

August 2019 - May 2021

Master of Science in Computer Science @ University of Texas at Arlington

Computer Vision
Special Topics in Intelligent Systems
Machine Learning
Neural Networks
Data Mining

July 2015 - April 2019

Bachelor of Technology in Computer Science & Technology @ Amrita Vishwa Vidyapeetham

Intelligent Systems
Natural Language Processing
Software Engineering
Database Management System

Certifications

AWS Certified Cloud Practitioner

Issued by: Amazon Web Services

Credential ID: 411288ab-0712-4957-a73e-13dc4edd5b79
