Applied Machine Learning

Teaching Machines to See

I am the founder of Darmis AI. I specialize in training neural networks to map unstructured physical documents to clean, schema-validated structured outputs. I map warped pixels to computational variables, one line of code at a time.

Neural Target DARMIS-X9

Shlomo Stept

Applied ML Architect & Founder

Interactive Scanner

Form-Native OCR Sandbox

Vision Kernel v4.2

Choose Source Document

Resolution Quality 300 DPI

Confidence Gate 85%

Document Skew 0°

Gaussian Blur 0px

EXTRACTED_STRUCTURED_JSON

Model Weights

Case Studies

High-impact neural layers

B2B Document AI

99.2% Acc 12M Params

InvoiceParser-v4

A specialized convolutional neural parser designed to map skewed receipts, multi-page billing sheets, and complex financial layouts to structured output. It extracts nested values with strict schema compliance checks, replacing brittle, template-based scripts.

Character Segmentation Precision 99.2%

Skew Affine Tolerances 15 deg

PyTorch CUDA Vision Transformers JSON Schema
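
As a rough illustration of what a compliance check on extracted values can look like, here is a minimal Python sketch. It is not the InvoiceParser-v4 implementation: the field names, the `INVOICE_SCHEMA` mapping, and the `validate_extraction` helper are all hypothetical, and the type check is a simplified stand-in for real JSON Schema validation. The 0.85 default mirrors the 85% confidence gate shown in the sandbox above.

```python
# Hypothetical schema for illustration: field name -> expected Python type.
INVOICE_SCHEMA = {"invoice_number": str, "total": float, "line_items": list}


def validate_extraction(fields, schema=INVOICE_SCHEMA, min_confidence=0.85):
    """Check extracted fields against a schema and a confidence gate.

    `fields` maps a field name to a (value, confidence) pair, where
    confidence is a float in [0, 1]. Returns (record, errors): `record`
    holds only the fields that passed, `errors` lists why others failed.
    """
    record, errors = {}, []
    for name, expected_type in schema.items():
        if name not in fields:
            errors.append(f"{name}: missing")
            continue
        value, confidence = fields[name]
        if confidence < min_confidence:
            # Low-confidence reads are rejected rather than guessed at.
            errors.append(f"{name}: confidence {confidence:.2f} below gate")
        elif not isinstance(value, expected_type):
            errors.append(f"{name}: expected {expected_type.__name__}")
        else:
            record[name] = value
    return record, errors
```

Calling `validate_extraction({"invoice_number": ("INV-1", 0.97)})` would keep the invoice number and report the other two fields as missing. Returning the errors alongside the partial record, instead of raising, lets a caller decide whether to route the document to manual review.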

Biometric Intelligence

98.8% Acc 32M Params

BiometricMatch-v2

A high-precision verification CNN trained to match scanned passport credentials against live security camera feeds. Employs spatial matrix transformations to realign warped, high-contrast print anomalies in real time.

Facial Embedding Alignment 98.8%

Noise Contrast Gain 88%

TensorFlow OpenCV Affine Transforms FaceNet

Foundry Nodes

Where I've Built

Document AI & ML

2023 — Present

Founder & Chief Architect

Building Darmis AI, a form-native document intelligence platform designed to extract clean, nested structured values directly from complex visual forms without template fragility. Designing end-to-end neural vision pipelines and structural tokenization.

Darmis AI

2020 — 2023

Applied ML Research Engineer

Researched and trained deep convolutional neural networks (CNNs) and transformer models optimized for visual parsing tasks. Refined affine transformation matrices and bounding box matching, improving model inference throughput by 22%.

Computer Vision Labs

2017 — 2020

Software Engineer (ML Core)

Engineered highly performant pixel processing algorithms and structural file tokenization libraries. Built baseline synthetic document datasets that established robust training runs for early Document OCR experiments.

DataSystems Inc

Core Competencies

System Expertise

Neural layout mapping

Neural Mapping Scope

Active Nodes 100 Nodes

Repulsion Field 180px

Focus Target FLOAT_SPACE

Structured Document AI

Extracting nested, schema-compliant JSON objects from complex paper invoice forms and ID formats.

Vision Transformers

Fine-tuning vision transformer backbones to parse character fragments under high-noise inputs.

Affine Alignment Math

Mapping perspective distortions onto coordinate grids to compute corrective transformation matrices for scans.
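
To make the alignment math concrete, here is a minimal Python sketch of the simplest case: undoing a pure rotational skew with a 2x3 affine matrix. The helper names `deskew_matrix` and `apply_affine` are hypothetical, and the sketch assumes the skew angle and rotation center are already known. Full perspective correction needs a 3x3 homography and is not covered here.

```python
import math


def deskew_matrix(angle_deg, cx, cy):
    """Return a 2x3 affine matrix that rotates by -angle_deg about (cx, cy).

    Derived from p' = R (p - c) + c, which expands to p' = R p + (c - R c).
    """
    theta = math.radians(-angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    return [
        [c, -s, cx - c * cx + s * cy],
        [s, c, cy - s * cx - c * cy],
    ]


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )
```

With these helpers, a point skewed by `deskew_matrix(-15, cx, cy)` is returned to its original position by `deskew_matrix(15, cx, cy)`, and the rotation center itself never moves. In practice the same matrix would be handed to an image-warping routine rather than applied point by point.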

Vision Stream

Published Writings

Form-native neural research

October 2025

Teaching Neural Nets to Read Warped Receipts

Applying convolutional mesh grids and perspective transform matrices to fix skewed camera inputs.

June 2025

Structured Outputs: Beyond Regex in Visual Parsers

Why standard character matching falls short on unstructured forms and how native JSON encoders stabilize layouts.

January 2025

Optimizing Vision Transformers for Mobile Browsers

How to structure and pack weights to run real-time boarding pass scanning at 30 fps client-side.

Live Neural Compounding

Model Telemetry Logs

Standard Output Stream

NEURAL_CORE_KERNEL_7a

Connection Ready

Let's Interface
In AFFINE SPACE

LinkedIn GitHub Code
