GLM-OCR

Open OCR model and pipeline for turning complex document images into usable text.

License: MIT (model), Apache-2.0 (code)
Open source
Source: zai-org/GLM-OCR (verified 2026-04-19)

About

GLM-OCR overview

GLM-OCR is an open OCR model and document pipeline from Z.ai, focused on accurate, fast, and comprehensive image-to-text extraction for documents, tables, formulas, and complex layouts.

Document-first model focus

GLM-OCR targets OCR and image-to-text extraction rather than general chat.

Specialization is valuable when a workflow depends on layout, tables, equations, and structured document text.

Open model and pipeline licensing

The repository states MIT licensing for the model and Apache-2.0 licensing for code components.

Clear licensing makes it easier to evaluate for production document workflows.

Useful for agent intake

OCR output can feed downstream agents, search indexes, and retrieval systems.

Agents are only as useful as the documents and screens they can accurately read.
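
To illustrate how OCR output might feed a search index, here is a minimal sketch of an in-memory inverted index over already-extracted text. It does not call GLM-OCR; the document IDs, sample strings, and the `build_index` and `search` helpers are all hypothetical names introduced for this example, and a real system would more likely use a dedicated search engine.

```python
import re
from collections import defaultdict


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase token to the set of document IDs containing it.

    `docs` maps a document ID to text that an OCR step has already produced.
    """
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(doc_id)
    return dict(index)


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return IDs of documents that contain every token in the query."""
    tokens = re.findall(r"\w+", query.lower())
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical OCR output for two scanned pages.
ocr_text = {
    "invoice.png": "Total due: 120 EUR",
    "memo.png": "Total attendance 40",
}
idx = build_index(ocr_text)
matches = search(idx, "total due")
```

Because the index is built purely from extracted text, any OCR error (a misread digit, a dropped table cell) becomes a missed or wrong search hit, which is the practical reason extraction quality matters for agent intake.
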
Use cases

When to use GLM-OCR

PDF and document ingestion

Convert scans and visual documents into text before indexing or summarization.

Research workflow automation

Extract usable text from papers, reports, forms, and tables for downstream analysis.

RAG preprocessing

Use OCR as the first stage before chunking, embedding, and retrieval.
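
The chunking stage that follows OCR can be sketched as below. This is a minimal illustration that assumes the OCR step has already returned one text string per page; it does not call GLM-OCR, and the `Chunk` type, the `chunk_pages` helper, and the `max_chars` and `overlap` defaults are hypothetical choices for the example rather than anything prescribed by the project.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    page: int    # 1-based page number the text came from
    index: int   # position of the chunk within its page
    text: str


def chunk_pages(pages: list[str], max_chars: int = 500, overlap: int = 50) -> list[Chunk]:
    """Split per-page OCR text into overlapping character windows."""
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    step = max_chars - overlap
    chunks: list[Chunk] = []
    for page_no, text in enumerate(pages, start=1):
        text = text.strip()
        for i, start in enumerate(range(0, len(text), step)):
            chunks.append(Chunk(page=page_no, index=i, text=text[start:start + max_chars]))
            # Stop once this window reaches the end of the page, so no
            # trailing chunk is emitted that only repeats overlap text.
            if start + max_chars >= len(text):
                break
    return chunks


# Placeholder strings stand in for real OCR output; the empty page yields no chunks.
sample_pages = ["".join(chr(97 + i % 26) for i in range(120)), "", "short"]
sample_chunks = chunk_pages(sample_pages, max_chars=50, overlap=10)
```

Keeping the page number on each chunk lets a retrieval system cite the source page later. Fixed character windows are the simplest option; splitting on layout units such as headings or table boundaries is often a better fit when the OCR output preserves them.
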

Compare

How it compares

GLM-OCR vs. general vision-language models (VLMs) for document pipelines

A general multimodal model may describe an image, but GLM-OCR is the better starting point when the job is faithful document extraction.

FAQ

Questions

What should I check before using GLM-OCR?

Run GLM-OCR on a fixed set of documents and prompts drawn from your own workflow. Compare extraction quality, latency, context handling, retry behavior, deployment path, and license fit against nearby open models before adopting it.
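
One way to make such a comparison concrete is to score each model's output against hand-checked reference text. The sketch below computes character error rate (CER, edit distance divided by reference length) and mean latency. It is an illustration under stated assumptions: `run_ocr` is a hypothetical callable standing in for however you invoke a given model, and the sample path and strings are placeholders, not real results.

```python
import time
from typing import Callable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                # delete ca
                curr[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),   # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def char_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by the reference length."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return edit_distance(reference, hypothesis) / len(reference)


def evaluate(samples: list[tuple[str, str]], run_ocr: Callable[[str], str]) -> dict[str, float]:
    """Score (image_path, reference_text) pairs with a caller-supplied OCR function."""
    if not samples:
        raise ValueError("samples must not be empty")
    cers, latencies = [], []
    for image_path, reference in samples:
        start = time.perf_counter()
        hypothesis = run_ocr(image_path)   # hypothetical: plug in your own model call
        latencies.append(time.perf_counter() - start)
        cers.append(char_error_rate(reference, hypothesis))
    return {
        "mean_cer": sum(cers) / len(cers),
        "mean_latency_s": sum(latencies) / len(latencies),
    }


# A stub OCR function that misreads one character, for demonstration only.
report = evaluate([("page1.png", "hello")], lambda path: "hallo")
```

Running the same `evaluate` call with a different `run_ocr` for each candidate model, over the same sample set, gives numbers that are directly comparable. CER is a plain-text metric, so tables and formulas may need an additional structure-aware check.
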

Is GLM-OCR open source?

This profile lists GLM-OCR under the MIT license for the model and Apache-2.0 for the code, based on the official source links. Re-check the repository, model card, or docs before production use.

Who should evaluate GLM-OCR?

GLM-OCR is most worth evaluating for builders working on document AI, PDF processing, or knowledge ingestion.

Tags

Capabilities

local inference, tool calling, open source, open weights, developer workflow