Name: Pujan Bhatt

Job Role: Data Engineer

Experience: 1 year

Address: Toronto, Canada

Skills

SQL 95%
PySpark 85%
Data Visualization 90%
Cloud 80%
Machine Learning 70%

About


I am a passionate AI professional with a strong foundation in big data tools and cloud technologies, currently honing my skills in AI development at George Brown College. My experience spans a range of data technologies, including Azure, Tableau, and SQL Server.

  • Profile: Data Science and AI professional
  • Education: Master of Computer Information
  • Language: English, Hindi, Gujarati
  • BI Tools: Microsoft Power BI & Tableau
  • Other Skills: Cloud, PySpark, Excel, Git & JIRA
  • Interest: Traveling, Singing, Spiritual reading

20+ projects completed

LinkedIn

Resume


Seeking a role as a Data Science and AI professional to apply expertise in machine learning, deep learning, and LLMs to real-world challenges. Experienced in developing predictive models, computer vision systems, and scalable ETL pipelines.

Experience


2025-Present

AI Engineer

Herb Immortal

  • Developed a cloud-based Language Model pipeline on Google Cloud Platform, leveraging scalable infrastructure for training and optimization.
  • Performed OCR-based text extraction using Google Cloud Vision API and built Python notebooks for data preprocessing and character-level extraction.
  • Fine-tuned transformer-based LLMs using Hugging Face and orchestrated training workflows on Vertex AI.
  • Managed structured datasets in GCS buckets to enable scalable model training and evaluation.

2024-2025

Machine Learning Intern

Kinectrics

  • Developed and optimized YOLO-based computer vision models for real-time defect detection in industrial equipment.
  • Integrated LLMs for automating document processing, enhancing text generation, summarization, and extraction.
  • Automated data pipelines using PySpark for large-scale preprocessing, model training, and evaluation.
  • Leveraged TensorFlow and PyTorch to implement, fine-tune, and deploy machine learning models.

2022-2023

Data Engineer Intern

Open-path Solutions

  • Developed and optimized PySpark transformations to process and analyze large datasets, including ETL (extract, transform, load) from sources such as Azure SQL Database, PostgreSQL, MongoDB, and Snowflake.
  • Implemented complex data pipelines for data cleansing, aggregation, and integration using PySpark, enhancing data quality and performance.
  • Collaborated with cross-functional teams to design and execute scalable data solutions in a distributed environment, improving data accessibility and reducing query times.

August 2024

Software Engineer Intern

J.P. Morgan

  • Set up a local dev environment by downloading the necessary files, tools and dependencies.
  • Fixed broken files in the repository so that the web application displayed its output correctly.
  • Used Perspective, JPMorgan Chase's open-source library, to generate a live graph of a data feed that is clear and visually appealing for traders to monitor.



Education


JAN 2024 - DEC 2024

PG IN APPLIED A.I. SOLUTIONS DEVELOPMENT

GEORGE BROWN COLLEGE, TORONTO

Grade: First class distinction.

JAN 2020 - SEP 2020

PG IN INTERACTIVE MEDIA MANAGEMENT

ALGONQUIN COLLEGE, OTTAWA

Grade: First class distinction.

JAN 2019 - DEC 2019

PG IN BUSINESS MANAGEMENT

SASKATCHEWAN POLYTECHNIC, MOOSE JAW

Grade: First class distinction.

MAY 2016 - OCT 2018

MASTER OF SCIENCE IN INFORMATION TECHNOLOGY

GLS UNIVERSITY, INDIA

Grade: First class distinction.

MAY 2012 - DEC 2015

BACHELOR OF COMPUTER APPLICATION

H.L UNIVERSITY, INDIA

Grade: First class distinction.

Projects


Below are sample data engineering and AI engineering projects built with SQL, Python, and PySpark.

Data processing and integration on Azure Cloud using PySpark

Integrates and processes crime datasets for two major cities using big data tools and technologies to uncover patterns and insights into their crime dynamics.

Converting text into speech using text-to-speech libraries

Creates an audiobook from a PDF using PyPDF2 and pyttsx3.

Image captioning

Generates a textual description of an image, combining Natural Language Processing and Computer Vision to produce the captions.


Object detection for daily life

Object detection on the COCO 2017 validation dataset using Faster R-CNN, Single Shot Detector (SSD), and YOLOv3.

Feature extraction from audio data

Extracts a set of features that are informative about the desired properties of the original audio data.

More projects on GitHub

I love to solve problems & uncover hidden data stories


GitHub

Contact


Here are the details to reach me.

Address

Toronto, Canada

Contact Number

+639-538-3580

Email Address

bhattpujan03@gmail.com

Download Resume



