SYS // online 12.9716°N · 77.5946°E lat // bengaluru
vol. 07 · ed. 2026.06 · iss. 042
shimbun · “newspaper” · a field notebook
UTKARSH · SINGH · ML & data engineer · bengaluru, india
>> est. 2021 · rev. 2026.06.02 · IST
LEDE · 序章 · preface

Utkarsh.Singhさん

A computer science engineer building data pipelines, ML systems and real-time processing for teams that actually measure what they ship. Currently at CommentSold; writing clean, maintainable code and the occasional essay.

see projects → · read the notebook · us9044@gmail.com

About jo · preface · opening

§ 01 / 04
~/about

Computer Science Engineer by training, a pipelines person by practice. Currently a Machine Learning & Data Engineer at CommentSold, working on big-data tooling, ML pipelines and real-time processing.

Over the past few years I've deepened my expertise through hands-on projects and online courses — seeking not just knowledge but wisdom. I believe in building meaningful products, not just features: every line of code carries the potential to help or hurt a business, so I commit to clean, maintainable, scalable solutions and regular refactoring to keep technical debt honest.

Outside of code, I'm a tech enthusiast who loves exploring new tools, with a keen interest in AI, ML and data science and a fondness for thoughtful debate: well-chosen discussions spark growth and new perspectives.

based in: bengaluru, india
current: ML & data engineer · CommentSold
languages: en · hi
focus: AI · ML · data science

Work experience gyō · one’s craft or trade

§ 02 / 04
~/work
2025 – now · bengaluru · commentsold

ML & Data Engineer

CommentSold · ML pipelines & real-time data

Working with big data technologies, ML pipelines and real-time data processing, and building data pipelines at scale. Blending discipline with curiosity to ship clean, maintainable systems that earn their keep.

stack: python · spark · ml pipelines · real-time data · big data
2021 – 2025 · bengaluru · datagrokr

Data Engineer

DataGrokr Analytics · pipelines & APIs

Built and scaled data pipelines and APIs for analytics workloads. Four years of hardening the unglamorous middle of the stack — the part that decides whether the graph goes flat or keeps paging people at 2 a.m.

stack: python · sql · data pipelines · api development · cloud

Education gaku · study, learning

§ 03 / 04
~/edu
2023 – 2024 · online · MITx

MicroMasters · Statistics & Data Science

Massachusetts Institute of Technology

Actively working through the program — prioritising depth of understanding over a rushed completion. Goal is to master the concepts before sitting the final capstone exam.

focus: statistics · data science
2021 – 2023 · gurgaon · BML Munjal

M.Sc. · ML, Data Science & AI

BML Munjal University

Deepened expertise across Machine Learning, Artificial Intelligence and Data Science — the theoretical grounding that makes the pipelines at work less mysterious.

focus: machine learning · AI · data science
2017 – 2021 · phagwara · LPU

B.Tech · Computer Science (Hons.)

Lovely Professional University

Solid foundations in the core of computer science. Honours track, systems and algorithms focus.

focus: CS fundamentals

GenAI: Foundation of Generative AI · Udacity · 2025
LLMs: Generative AI with LLMs · Coursera · 2024
DS: DS with Python & Hero Vired DS/ML/AI · 2024

Skills waza · craft & technique

§ 04 / 04
~/skills
hiraku · open / develop

Software Dev.

// build & ship
  • Python: expert
  • Go: strong
  • TypeScript: strong
  • Rust: learning
  • Django / FastAPI: expert
  • Postgres: expert
  • Docker · k8s: daily

ban · foundation / board

Data Engineering

// move · model · store
  • Spark: expert
  • dbt: expert
  • Airflow: strong
  • Kafka · Flink: strong
  • Iceberg / Delta: strong
  • Terraform: daily
  • SQL (pg, presto): expert

seki · analysis / split

Data Science

// measure & reason
  • Pandas · Polars: expert
  • Applied stats: strong
  • Causal inference: comfortable
  • Experiment design: strong
  • scikit-learn: expert
  • Jupyter · Quarto: daily

chi · knowledge / knowing

AI · ML

// learn & evaluate
  • PyTorch: strong
  • LLM evaluation: expert
  • RAG · retrieval: expert
  • Vector DBs: strong
  • Fine-tuning (LoRA): comfortable
  • Ray · vLLM: daily
  • Diffusion: curious
utkarsh@commentsold — ~/skills — zsh
utkarsh@commentsold ~/skills $ grep -r "expert" ./ | wc -l
10 matches · domains: 4
utkarsh@commentsold ~/skills $ cat .motto
“make systems that don’t lie.”

Projects saku · to make · a work

§ 06 works · 2021 → 2026
~/projects
001 · 2026 · AI
RAG · CONTEXT · REPAIR

Kintsugi

金継ぎ “golden repair”

Context-repair layer for production RAG. Catches hallucinations via citation cross-checks, stitches in missing sources.

AI · RAG · 2026 · ● live

002 · 2025 · DATA
CDC · STREAMING · LAKE

Koi

“carp” — streaming CDC

Debezium → Iceberg with schema-aware merges. Handles 60M rows/hour without blinking.

Data · CDC · 2025 · ★ shipped

003 · 2024 · SWE
SSG · RUST · ~2KLOC

Sumi

“ink” — static site generator

Tiny Rust SSG built around Pandoc. Incremental, typed front-matter, ~2k LOC. Powers this site.

SWE · Rust · 2024 · ★ shipped

004 · 2023 · SWE
LOCAL-FIRST · CRDT · E2EE

Neko

“cat” — offline-first notes

Local-first notes with CRDT sync. Markdown, backlinks, full-text search, no server when you don’t want one.

SWE · Local · 2023 · ★ shipped

005 · 2022 · DATA
FEED · TELEMETRY · 200M

Hoshi

“star” — feed telemetry

Near-real-time feed analytics for a 200M-user consumer app. Spark structured streaming + ClickHouse.

Data · 2022 · archived

006 · 2021 → · OSS
~/.zshrc ~/.tmux.conf ~/.config/nvim/init.lua ~/.gitconfig ~/.ghostty/config ~/.config/wezterm/ ~/bin/git-summary

Wabi

“rustic” — dotfiles

Personal dotfiles. Curated, minimal, boringly reliable. 200+ stars somehow.

OSS · 2021 → · ● maintained

every project name is a real word. the gloss sits under the title so visitors learn something too.
→ more on github.com/eternalAbyss · linkedin.com/in/us9044

Blog & notes bun · text, writing, prose

§ 05 essays · 08 notes
~/writing
