
Projects
Click on a project title to view its source code (if available). Visit my GitHub profile to see more projects!
Real-time Face Recognition using CCTV Cameras
A production-ready pipeline utilizing state-of-the-art face recognition models, optimized for deployment in CCTV environments. This robust system ensures real-time facial recognition with high accuracy and efficiency, tailored for surveillance applications.
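At its core, a recognition pipeline like this detects faces, extracts embeddings, and matches them against a gallery of known identities. Below is a minimal sketch of the matching step only; the 512-dimensional embeddings, the similarity threshold, and the gallery contents are illustrative assumptions, not details of the deployed system.

```python
import numpy as np

def identify(query_emb: np.ndarray, gallery: dict, threshold: float = 0.5):
    """Match one face embedding against a gallery using cosine similarity."""
    names = list(gallery.keys())
    embs = np.stack([gallery[n] for n in names])
    embs = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    q = query_emb / np.linalg.norm(query_emb)
    scores = embs @ q                                  # cosine similarities
    best = int(np.argmax(scores))
    if scores[best] < threshold:
        return None, float(scores[best])               # unknown face
    return names[best], float(scores[best])

# Usage: embeddings would come from a face-embedding network (e.g. an ArcFace-style model).
gallery = {"alice": np.random.rand(512), "bob": np.random.rand(512)}
print(identify(np.random.rand(512), gallery))
```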

TensorRT Object Detection using EfficientDetLite
A high-performance project implementing EfficientDet Lite models (versions 0 to 4) for object detection. Using ONNX for model inference and TensorRT for optimized engine building, the project enables efficient, rapid deployment of object detection models with support for FP32 and FP16 precision on NVIDIA GPUs. (GitHub)
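As a rough illustration of the ONNX-to-TensorRT step, the sketch below builds a serialized engine with optional FP16 using the TensorRT 8.x-style Python API; the file names are placeholders and the exact API surface differs between TensorRT versions.

```python
import tensorrt as trt

def build_engine(onnx_path: str, engine_path: str, fp16: bool = True):
    """Parse an ONNX model and write an optimized, serialized TensorRT engine."""
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    if fp16 and builder.platform_has_fast_fp16:
        config.set_flag(trt.BuilderFlag.FP16)          # FP16 precision when the GPU supports it

    with open(engine_path, "wb") as f:
        f.write(builder.build_serialized_network(network, config))

build_engine("efficientdet_lite0.onnx", "efficientdet_lite0.engine")
```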
Estimating HeadPose using Deep Learning
A real-time head pose analysis application that predicts the vertical and horizontal orientation of a person's face (looking up/down and left/right/straight) with high precision and efficiency.
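A common way to estimate head orientation (not necessarily the approach used in this project) is to fit 2D facial landmarks to a generic 3D face model with OpenCV's solvePnP and read pitch and yaw off the recovered rotation; the 3D reference points and pinhole camera intrinsics below are standard approximations.

```python
import cv2
import numpy as np

# Generic 3D reference points: nose tip, chin, eye corners, mouth corners (approximate, in mm).
MODEL_POINTS = np.array([
    [0.0, 0.0, 0.0], [0.0, -330.0, -65.0], [-225.0, 170.0, -135.0],
    [225.0, 170.0, -135.0], [-150.0, -150.0, -125.0], [150.0, -150.0, -125.0],
], dtype=np.float64)

def head_pose(landmarks_2d: np.ndarray, frame_w: int, frame_h: int):
    """Return (pitch, yaw, roll) in degrees from six corresponding 2D landmarks."""
    focal = frame_w                                   # rough pinhole approximation
    camera_matrix = np.array([[focal, 0, frame_w / 2],
                              [0, focal, frame_h / 2],
                              [0, 0, 1]], dtype=np.float64)
    _, rvec, _ = cv2.solvePnP(MODEL_POINTS, landmarks_2d.astype(np.float64),
                              camera_matrix, np.zeros(4))
    rotation, _ = cv2.Rodrigues(rvec)
    angles, *_ = cv2.RQDecomp3x3(rotation)            # Euler angles in degrees
    pitch, yaw, roll = angles
    return pitch, yaw, roll                           # e.g. yaw > 15 deg -> "looking right"
```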

ArtGAN: Restoring Damaged Artwork with Conditional GANs
ArtGAN leverages a conditional Generative Adversarial Network to restore damaged artwork, combining custom data augmentation, innovative CResNetBlocks, and patch discriminators for impressive results. (GitHub)
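The patch-discriminator idea scores overlapping image patches as real or restored instead of judging the whole canvas at once. The PyTorch sketch below is a generic PatchGAN-style discriminator for illustration; the project's custom CResNetBlock generator is not reproduced here.

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """PatchGAN-style discriminator: outputs a grid of real/fake logits, one per image patch."""
    def __init__(self, in_channels: int = 3, base: int = 64):
        super().__init__()
        def block(cin, cout, stride):
            return nn.Sequential(nn.Conv2d(cin, cout, 4, stride, 1),
                                 nn.InstanceNorm2d(cout),
                                 nn.LeakyReLU(0.2, inplace=True))
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, base, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
            block(base, base * 2, 2),
            block(base * 2, base * 4, 2),
            block(base * 4, base * 8, 1),
            nn.Conv2d(base * 8, 1, 4, 1, 1),           # per-patch logits
        )

    def forward(self, x):
        return self.net(x)

scores = PatchDiscriminator()(torch.randn(1, 3, 256, 256))
print(scores.shape)                                    # (1, 1, 30, 30): one score per patch
```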

KrishnaConnect: Utilizing the power of RAG to chat with the Bhagavad Gita.
An AI-driven application that merges the teachings of the Bhagavad Gita with modern technology, using a Large Language Model and Retrieval-Augmented Generation to simulate conversations with Lord Krishna. It employs LangChain, LangServe, FastAPI, and a Flutter-based user interface to provide a responsive, intuitive experience, letting users access and interact with ancient wisdom seamlessly. The project aims to bring the timeless guidance of the Gita to a global audience, blending spiritual insight with modern AI and software development.
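The retrieval-augmented flow boils down to: embed the question, fetch the most relevant verses, and pass them to the LLM as context. The sketch below is framework-agnostic (sentence-transformers plus a placeholder call_llm function) rather than the project's actual LangChain/LangServe stack, and the verse snippets are placeholders.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Placeholder corpus: verse-level chunks of the Bhagavad Gita.
verses = ["Chapter 2, Verse 47: You have a right to perform your duty ...",
          "Chapter 4, Verse 7: Whenever righteousness declines ..."]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
verse_embs = encoder.encode(verses, normalize_embeddings=True)

def retrieve(question: str, k: int = 2):
    """Return the k verses most similar to the question (cosine similarity of normalized embeddings)."""
    q = encoder.encode([question], normalize_embeddings=True)[0]
    ranked = np.argsort(verse_embs @ q)[::-1][:k]
    return [verses[i] for i in ranked]

def answer(question: str) -> str:
    context = "\n".join(retrieve(question))
    prompt = (f"Answer in the voice of Lord Krishna, using only this context:\n"
              f"{context}\n\nQuestion: {question}")
    return call_llm(prompt)      # call_llm is a placeholder for the LLM endpoint served via FastAPI
```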

Skeletal Action Recognition
Formulated, trained, and tested a video action recognition pipeline in PyTorch that classifies actions such as jumping, walking, and running from skeletal body poses.
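One simple way to model skeleton sequences (a sketch, not necessarily this project's exact architecture) is an LSTM over per-frame keypoint vectors; the joint count, clip length, and number of classes below are placeholders.

```python
import torch
import torch.nn as nn

class SkeletonLSTM(nn.Module):
    """Classify a sequence of 2D body keypoints into an action label."""
    def __init__(self, num_joints: int = 17, num_classes: int = 5, hidden: int = 128):
        super().__init__()
        self.lstm = nn.LSTM(input_size=num_joints * 2, hidden_size=hidden,
                            num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, poses):                  # poses: (batch, frames, joints, 2)
        b, t, j, c = poses.shape
        feats, _ = self.lstm(poses.reshape(b, t, j * c))
        return self.head(feats[:, -1])         # logits for e.g. walking / running / jumping

logits = SkeletonLSTM()(torch.randn(8, 30, 17, 2))     # 8 clips of 30 frames, 17 joints each
print(logits.shape)                                     # (8, 5)
```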

Intraretinal Cystoid Fluid Detection using UNet
This project showcases the implementation of a UNet architecture for the precise detection of intraretinal cystoid fluid. Leveraging the power of deep learning, the model offers a significant tool in the analysis of retinal images, aiding in the identification of this key marker in ocular diseases. (GitHub)
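For reference, a heavily pared-down UNet for binary fluid segmentation looks like the sketch below (two encoder/decoder levels instead of the usual four; the input size and channel widths are illustrative).

```python
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    """Two-level UNet producing a per-pixel probability of intraretinal fluid."""
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = conv_block(1, 32), conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.up2, self.dec2 = nn.ConvTranspose2d(128, 64, 2, stride=2), conv_block(128, 64)
        self.up1, self.dec1 = nn.ConvTranspose2d(64, 32, 2, stride=2), conv_block(64, 32)
        self.out = nn.Conv2d(32, 1, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return torch.sigmoid(self.out(d1))

mask = TinyUNet()(torch.randn(1, 1, 256, 256))
print(mask.shape)                                             # (1, 1, 256, 256)
```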

Gaussian Mixture Models with Expectation Maximization
Implemented Gaussian Mixture Models (GMM) with the Expectation-Maximization (EM) algorithm on a custom dataset of numerical features representing various leaf colors. The goal was to cluster leaves accurately by color variation, showcasing the power of GMM and EM in extracting meaningful patterns from data. The project walks through the statistics underlying GMM and EM and demonstrates the practicality of applying such methods to nuanced data analysis. (GitHub)
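The EM updates themselves are short enough to sketch directly: the E-step computes each component's responsibility for each point, and the M-step re-estimates weights, means, and covariances from those responsibilities. The three-component count and the RGB feature shape below are assumptions for illustration.

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_em(X, k=3, iters=100, seed=0):
    """Fit a k-component Gaussian Mixture Model to X (n_samples, n_features) via EM."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, k, replace=False)]
    covs = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(k)])
    weights = np.full(k, 1.0 / k)

    for _ in range(iters):
        # E-step: responsibility of each component for each point
        dens = np.stack([w * multivariate_normal.pdf(X, m, c)
                         for w, m, c in zip(weights, means, covs)], axis=1)   # (n, k)
        resp = dens / dens.sum(axis=1, keepdims=True)

        # M-step: re-estimate mixture weights, means, and covariances
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        covs = np.array([((resp[:, j, None] * (X - means[j])).T @ (X - means[j])) / nk[j]
                         + 1e-6 * np.eye(d) for j in range(k)])

    return weights, means, covs, resp.argmax(axis=1)   # hard cluster assignment per leaf

# Usage: X could be an (n_leaves, 3) array of mean RGB values per leaf.
```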

YOLOv8 End-to-End Instance Segmentation Model
A YOLOv8 instance segmentation model in ONNX format that incorporates all post-processing steps directly inside the ONNX model, from Non-Maximum Suppression (NMS) to mask calculations. (GitHub)
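Because NMS and the mask math live inside the graph, inference collapses to a single onnxruntime call; the model path, 640x640 input size, and output layout below are assumptions that depend on how the graph was exported.

```python
import cv2
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("yolov8n-seg-end2end.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Preprocess: resize, BGR -> RGB, HWC -> NCHW, scale to [0, 1].
image = cv2.imread("street.jpg")
blob = cv2.cvtColor(cv2.resize(image, (640, 640)), cv2.COLOR_BGR2RGB)
blob = blob.transpose(2, 0, 1)[None].astype(np.float32) / 255.0

# One call returns already-filtered boxes, scores, class ids, and binary masks.
outputs = session.run(None, {input_name: blob})
for out in outputs:
    print(out.shape)
```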

Gesture Control with MMPose
A project that enables hands-free computer interaction using hand gestures. It harnesses the power of MMPose, a pose estimation library, alongside technologies like OpenCV and PyAutoGUI. You can control your computer, make presentations exciting, enhance gaming experiences, and even navigate virtual worlds, all through simple hand movements. (GitHub)
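The glue between the pose model and the desktop is a small dispatch loop that converts keypoints into PyAutoGUI actions. In the sketch below, get_hand_keypoints is a placeholder for the MMPose inference call, and the fingertip indices and pinch threshold are heuristics, not the project's exact logic.

```python
import numpy as np
import pyautogui

SCREEN_W, SCREEN_H = pyautogui.size()

def dispatch(keypoints: np.ndarray, frame_w: int, frame_h: int):
    """Move the cursor with the index fingertip and click on a thumb-index pinch."""
    index_tip, thumb_tip = keypoints[8][:2], keypoints[4][:2]   # fingertip indices (assumed layout)
    x = int(index_tip[0] / frame_w * SCREEN_W)
    y = int(index_tip[1] / frame_h * SCREEN_H)
    pyautogui.moveTo(x, y)
    if np.linalg.norm(index_tip - thumb_tip) < 30:              # pinch threshold in pixels (heuristic)
        pyautogui.click()

# Per webcam frame: dispatch(get_hand_keypoints(frame), frame.shape[1], frame.shape[0])
```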

Pose-Estimation for Golfers using RTMPose
Deployed the RTMO real-time multi-person pose estimation model to estimate keypoints on the golfer's body and visualize them, along with bounding boxes and skeletons, on the video frames to analyze golf swings (yes, that's me in the image; I love golf).
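The visualization step is plain OpenCV drawing: circles for keypoints, lines for skeleton links, a rectangle for the detection box. The keypoint layout (x, y, score) and the COCO-style link subset below are illustrative, not the exact RTMO output format.

```python
import cv2

# Illustrative subset of COCO-style limb connections (shoulder-elbow-wrist, hip-knee-ankle, ...).
SKELETON = [(5, 7), (7, 9), (6, 8), (8, 10), (5, 6), (11, 13), (13, 15), (12, 14), (14, 16)]

def draw_pose(frame, keypoints, box, conf_thr=0.3):
    """Overlay a bounding box, keypoints, and skeleton links on one video frame."""
    x1, y1, x2, y2 = map(int, box)
    cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    for x, y, score in keypoints:
        if score > conf_thr:
            cv2.circle(frame, (int(x), int(y)), 3, (0, 0, 255), -1)
    for a, b in SKELETON:
        if keypoints[a][2] > conf_thr and keypoints[b][2] > conf_thr:
            cv2.line(frame, (int(keypoints[a][0]), int(keypoints[a][1])),
                     (int(keypoints[b][0]), int(keypoints[b][1])), (255, 0, 0), 2)
    return frame
```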

Extracting clips from a long video based on text queries.
An end-to-end pipeline that lets users extract short clips (10-50 seconds) from long videos (1-2 hours) based on a textual query describing an activity or an object present in the video. This project was built with the help of Hugging Face models.
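The core idea can be sketched with CLIP standing in for whatever Hugging Face model the project actually uses: score sampled frames against the text query, then cut a fixed-length clip around the highest-scoring timestamps. The sampling rate and model choice below are assumptions.

```python
import cv2
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def score_frames(video_path: str, query: str, every_n: int = 30):
    """Return (timestamp_seconds, similarity) for frames sampled every `every_n` frames."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    scores, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % every_n == 0:
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            inputs = processor(text=[query], images=rgb, return_tensors="pt", padding=True)
            with torch.no_grad():
                sim = model(**inputs).logits_per_image.item()
            scores.append((idx / fps, sim))
        idx += 1
    cap.release()
    return scores

# The output clip is a 10-50 s window centred on the top-scoring timestamps, cut with e.g. ffmpeg.
```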

Abandoned/Missing Object Detection
An end-to-end computer vision pipeline that allows users to detect objects and generate alerts for objects that have been left abandoned in a scene, or objects that were initially present in the scene but later went missing.
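Independent of the detector, the alert logic reduces to counting how long each tracked object has stayed put or stayed gone. The thresholds, movement tolerance, and the shape of the detections input below are placeholders for illustration.

```python
from collections import defaultdict
import numpy as np

ABANDON_AFTER = 300    # frames an object must stay put before an "abandoned" alert (placeholder)
MISSING_AFTER = 150    # frames an object must be gone before a "missing" alert (placeholder)
MOVE_TOL = 20          # pixels of drift still counted as stationary (placeholder)

last_pos = {}                   # object id -> last centre position
stationary = defaultdict(int)   # object id -> consecutive frames without significant movement
absent = defaultdict(int)       # object id -> consecutive frames not detected

def update(detections: dict):
    """detections maps tracker ids to (x, y) centres for the current frame; returns alerts."""
    alerts = []
    for obj_id, centre in detections.items():
        moved = obj_id in last_pos and np.hypot(*np.subtract(centre, last_pos[obj_id])) > MOVE_TOL
        stationary[obj_id] = 0 if moved else stationary[obj_id] + 1
        last_pos[obj_id] = centre
        absent[obj_id] = 0
        if stationary[obj_id] == ABANDON_AFTER:
            alerts.append(("abandoned", obj_id))
    for obj_id in set(last_pos) - set(detections):
        absent[obj_id] += 1
        if absent[obj_id] == MISSING_AFTER:
            alerts.append(("missing", obj_id))
    return alerts
```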

Industrial Anomaly Detection using Segmentation, Detection and Machine Vision.
Prepared an augmented dataset for industrial anomaly detection. Formulated, trained, and tested multiple segmentation and object detection models to improve accuracy and performance, finally combining these with classical machine vision to obtain an end-to-end pipeline for accurate inference.
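As a small illustration of combining a learned model with classical machine vision, the sketch below filters a segmentation model's defect mask with contour-area checks; the threshold and mask format are assumptions, not the project's actual parameters.

```python
import cv2
import numpy as np

MIN_DEFECT_AREA = 50   # minimum defect size in pixels to report (placeholder threshold)

def flag_anomalies(defect_mask: np.ndarray):
    """Keep only segmented defect regions that pass a simple machine-vision area filter.

    defect_mask: HxW uint8 mask, 255 where the segmentation model predicts a defect.
    Returns bounding boxes (x, y, w, h) of the surviving regions.
    """
    contours, _ = cv2.findContours(defect_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= MIN_DEFECT_AREA]
```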

Sentiment Analysis Back-end for a Flutter Application
Back-end NLP implementation for a Flutter application that allows users to analyze the sentiments of other users based on their captions. The application uses NLP to analyze captions and outputs positive, negative, and neutral scores. (GitHub)
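The positive/negative/neutral scoring maps naturally onto something like NLTK's VADER analyzer; whether the app uses VADER specifically is an assumption here, but the output format matches the description.

```python
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
analyzer = SentimentIntensityAnalyzer()

def caption_sentiment(caption: str) -> dict:
    """Return positive, negative, and neutral scores for one caption."""
    scores = analyzer.polarity_scores(caption)
    return {"positive": scores["pos"], "negative": scores["neg"], "neutral": scores["neu"]}

print(caption_sentiment("Best sunset I have ever seen!"))
# e.g. {'positive': 0.58, 'negative': 0.0, 'neutral': 0.42}
```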

Waste Classification using FastAI
A classifier that accurately identifies waste types, categorizing them as recyclable or organic, thus facilitating proper disposal and recycling efforts. Additionally, the inclusion of a text-to-speech feature makes this tool accessible to visually impaired individuals, ensuring wider usability and impact. View the video here.
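A minimal sketch of the two pieces: a fastai transfer-learning classifier trained on a folder of organic/recyclable images, and a spoken readout of the prediction. The dataset path, class folder names, and the pyttsx3 text-to-speech backend are assumptions for illustration.

```python
from fastai.vision.all import *
import pyttsx3

# Assumed layout: waste_data/ORGANIC/*.jpg and waste_data/RECYCLABLE/*.jpg
dls = ImageDataLoaders.from_folder("waste_data", valid_pct=0.2, seed=42, item_tfms=Resize(224))
learn = vision_learner(dls, resnet34, metrics=accuracy)
learn.fine_tune(3)

def classify_and_speak(image_path: str):
    """Predict the waste category and read the result aloud."""
    label, _, probs = learn.predict(image_path)
    message = f"This item looks {label}, with {probs.max().item() * 100:.0f} percent confidence."
    engine = pyttsx3.init()
    engine.say(message)
    engine.runAndWait()
    return label

classify_and_speak("sample_bottle.jpg")
```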
