Jonathan Muhire
Software R&D + ML Research

I build ML systems, data pipelines, and research tooling.

I translate research ideas into scalable software with deep learning, multimodal modeling, and distributed systems.

Graduate researcher with open-source ML systems work. Professional background and credentials are listed on LinkedIn.

  • Document AI pipeline work for layout + OCR in RenAIssance.
  • NLP crisis-signal prototype experiments in ISSR.
  • Dataset versioning workflows with MinIO + LakeFS.
Open to: Research Engineer / ML Systems
Focus: ML research + distributed systems
GSoC 2025 repository work · Public GitHub repos · Research notes · Robotics demo clips
Highlights
  • What I built: Layout + OCR pipeline experiments for historical docs (evidence).
  • What I built: Crisis-signal NLP prototype using social text + sentiment features (evidence).
  • What I built: Dataset versioning workflow using MinIO and LakeFS (evidence).
Core Stack
Python · PyTorch · Deep Learning · Distributed Systems · Data Infra

How I think about ML research, data, and infrastructure.

Three system views that reflect how I scope research problems, build data pipelines, and ship models.

Multimodal ML Pipeline
From data intake to evaluation, with feedback loops for continuous learning.
Vision · Language · Ingest · Curation · Training · Eval · Deploy (feedback loop)
Distributed Data Platform
Storage, versioning, and compute aligned for reproducible experiments.
Sources · Object Store · Versioning · Compute · Registry · Metrics (lineage)
Experiment Lifecycle
A research loop optimized for iteration speed and insight capture.
Hypothesis · Baseline · Ablations · Insights · Results · Publish

Full project timeline

Recent ML research highlights are above. This is the complete catalog across research, systems, and software.

Applied ML · NLP

ISSR — Crisis Detection

What I built: NLP prototype for crisis-signal classification using social text, sentiment, and location features.

NLP · Sentiment Analysis · Geospatial Signals
Evidence: Repo · Docs
Additional projects (embodied AI, full-stack, mobile, games)

2025 - ML Research & Systems

PyMyCobot Teleoperation

What I built: Local teleoperation workflow and control scripts for manipulation data-collection experiments using the PyMyCobot stack.

Python · Teleoperation · Data Collection
UMI Multi-Sensor Data Extraction

What I built: Multi-sensor extraction scripts and trajectory analysis workflow around UMI datasets and ORB-SLAM-based tracking.

ORB-SLAM3 · Trajectory Analysis · Sensor Fusion
UMI Data Annotation Pipeline

What I built: Annotation workflow for manipulation episodes, including gripper-pose and keyframe labeling around embodied-CoT style experiments.

Dataset Annotation · Computer Vision · Embodied CoT
MinIO + LakeFS Data Infrastructure

What I built: Object-storage and dataset-versioning setup for robotics datasets with reproducible data snapshots.

MinIO · LakeFS · Dataset Versioning
LeRobot SO-101 Platform

What I built: Local data-collection and policy-training workflow using LeRobot tooling for fast experiment loops.

LeRobot · ROS2 · Policy Training
RenAIssance Document Analysis

What I built: GSoC 2025 repository work on historical document layout analysis and OCR experiments.

PyTorch · LayoutLMv3 · Document AI
ISSR - Mental Health Crisis Detection

What I built: GSoC candidate repository prototype for crisis-signal detection from social text with sentiment and geospatial features.

NLP · Sentiment Analysis · Geospatial Signals
ArtExtract - Art Analysis AI

What I built: CNN-RNN project for artwork classification and style-analysis experiments.

CNN-RNN · PyTorch · Classification

2024 - Full-Stack Development

NutrAI Health Assistant

What I built: Full-stack nutrition app prototype with meal-planning and dietary-analysis features.

React · Python · Web App
CampusBuddy Mobile App

What I built: Flutter app for campus events, dining, and student-resource discovery.

Flutter · Dart · Mobile App
Point of Sales Terminal

What I built: Java POS application with inventory tracking, payment flows, and sales reporting.

Java · Desktop App · Inventory

2023 - Game Development

Space Invader Game

What I built: C++ arcade game project with custom gameplay mechanics and rendering logic.

C++ · SFML · Game Loop
Pacman AI

What I built: Pacman AI project with autonomous agents, pathfinding, and gameplay heuristics.

Python · Pygame · Search Algorithms

Background and research focus

A quick overview of my path and what I am working on now.

About

Learn more about my background, experience, and approach to building intelligent systems.

Research notes and systems breakdowns

Short technical writeups on robotics, ML systems, and multimodal research.

Open to ML research and systems roles.

Looking for teams building large-scale ML systems, deep learning pipelines, or research tooling. Happy to chat about collaboration or roles.

Fastest ways to reach me

Email is best for opportunities; LinkedIn for introductions.