Jose Mancera  /  Machine Learning Engineer

I build systems that learn from data, at scale.

Machine-learning engineer at LinkedIn, working on video recommendations — vision-language models, retrieval, and content understanding.

  • U.S. Marine
  • Baylor B.S.
  • Notre Dame M.S.
  • LinkedIn

01 About

Marine first.
Engineer after.

Before any of this, I was a sergeant in the Marine Corps. It taught me to ignore the noise and look for the truth from first principles — which, it turns out, is most of engineering.

I left, studied computer science, and started building machine-learning systems. The work I care about lives where research meets production — models that have to be genuinely good and actually ship to real people.

The ML is the fun part, not the whole point. What I like is hard problems, the people they're for, and doing the unglamorous work well.

02 Toolkit

What I work in

Less a list of tools, more the shape of the problems I like.

  • 01

    Recommendation & Retrieval

    Ranking, embeddings, and the systems that decide what you see next.

  • 02

    Multimodal ML

    Vision-language models and content understanding across video and text.

  • 03

    Models at Scale

    Distributed training and serving large models without the whole thing falling over.

  • 04

    Home Base

    Python and PyTorch, plus whatever the problem actually needs.

03 Path

How I got here

  1. Now

    LinkedIn

    Machine-learning engineer on video recommendations, in Mountain View.

  2. 2023–2024

    Amazon · SpaceX · AWS

    A run of engineering internships across recommendation, infrastructure, and large-model serving.

  3. School

    Notre Dame · Baylor

    An M.S. in progress at Notre Dame; a B.S. from Baylor before it.

  4. Before

    U.S. Marine Corps

    Where it started. Out as a sergeant.

04 Selected Work

Things I've built

A few projects from outside the day job — built to learn, not to impress.

GPT, from scratch

A transformer language model assembled from the raw parts — tokenizer, attention, training loop — in PyTorch. The goal was to understand every line.

PyTorch · Transformers · From scratch

Mancera1/gpt-from-scratch
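
As a rough illustration of the attention piece named above, here is a minimal sketch of single-head causal scaled dot-product attention. It is not code from the repository: the function names (`softmax`, `causal_attention`) and the list-of-lists representation are assumptions, chosen so the example runs in plain Python without PyTorch.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(q, k, v):
    """Single-head causal attention (hypothetical helper, for illustration).

    q, k, v: lists of T vectors (lists of floats); q and k share dimension d.
    Position t attends only to positions 0..t, never to later ones.
    """
    d = len(q[0])
    out = []
    for t in range(len(q)):
        # Scaled dot-product scores against the current and earlier positions.
        scores = [
            sum(qi * ki for qi, ki in zip(q[t], k[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # Weighted sum of the visible value vectors.
        out.append([
            sum(w * v[s][j] for s, w in enumerate(weights))
            for j in range(len(v[0]))
        ])
    return out
```

A real model would add learned query, key, and value projections, multiple heads, and batched tensor operations; the loop form here only makes the causal mask explicit.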

Retrieval-augmented generation

A RAG system that grounds a language model in a real document store and serves it on Kubernetes. Vector search in, sourced answers out.

Milvus · LLMs · Kubernetes

Mancera1/rag-system
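
The "vector search in, sourced answers out" idea can be sketched in a few lines. This is a brute-force, in-memory stand-in rather than the project's code or the Milvus API: `cosine`, `retrieve`, `build_prompt`, and the `(source, text, embedding)` tuple layout are all hypothetical names picked for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query_vec, store, k=2):
    """Return the k (source, text) pairs whose embeddings are closest to query_vec.

    store: list of (source, text, embedding) tuples (assumed layout).
    """
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[2]), reverse=True)
    return [(source, text) for source, text, _ in ranked[:k]]


def build_prompt(question, passages):
    """Put the retrieved passages, tagged with their sources, ahead of the question."""
    context = "\n".join(f"[{source}] {text}" for source, text in passages)
    return f"Answer using only the sources below.\n{context}\n\nQuestion: {question}"
```

Keeping the source label next to each passage is what lets the generator cite where an answer came from; a vector database would replace the `sorted` call with an approximate nearest-neighbour index.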

Fully from-scratch RAG

The two above, wired together: my GPT is the generator inside my RAG — my tokenizer, my retrieval, my transformer, no external APIs — answering over a behavioural-data-science handbook in the demo below.


It's deliberately tiny and barely trained, and that's the point: it shows how the earliest GPTs actually began, and from these foundations you can see exactly where the later improvements come from.


The next step isn't a better architecture; it's training on far more real data, on a GPU. I've only scratched the surface there, but the core idea is proven.

From scratch · RAG · No APIs

Live demo: toggling the untrained model, the trained model, and retrieval on top.