pwmt / zathura / Issues / #163
Created Jul 01, 2020 by Elias Haddad (@eliasyoussef)

Copy-paste issue with PDF files having Latin glyphs

Dear all,

First of all, thanks for the amazing software. I use it daily.

Problem

Zathura misinterprets white space when copying text from PDFs whose LaTeX source loads any of the following packages:

  • cmap
  • lmodern
  • fontenc

When copying text from documents built with those packages, the text extraction inserts extra white space between letters and extra line breaks between words.

See the MRE below:

% Created 2020-07-01 Wed 11:00
% Intended LaTeX compiler: pdflatex
\documentclass[11pt]{article}
\usepackage{cmap}
\usepackage{lmodern}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{wrapfig}
\usepackage{rotating}
\usepackage[margin=3cm]{geometry}

\begin{document}
Mão, piúma, êm

This paper covers the literature related to the "new new theory of trade". 

\end{document}

If the document above is compiled with pdflatex, trying to copy and paste the above lines results in:

M
ão, p i ú m
a, ê m
T h i s p ap e r c ov e r s t h e l i t e r at u r e r e l at e d
t o t h e " n e w
n e w
t h e or y
of t r ad e " .

However, those packages are needed for accented characters in Latin-script languages to be encoded correctly in the output. Removing them gives the correct spacing, but breaks the accents:

% Created 2020-07-01 Wed 11:00
% Intended LaTeX compiler: pdflatex
\documentclass[11pt]{article}
\usepackage[utf8]{inputenc}
\usepackage{wrapfig}
\usepackage{rotating}
\usepackage[margin=3cm]{geometry}

\begin{document}
Mão, piúma, êm

This paper covers the literature related to the "new new theory of trade". 

\end{document}

Resulting in:

M˜ao, pi´uma, ˆem
This paper covers the literature related to the ”new new theory of trade”.

I tested the same PDFs with other readers, and xreader successfully retrieves the original text. Since it is open source, it may serve as inspiration for fixing this issue.
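
To compare readers a bit more objectively, the spacing symptom can be quantified from the pasted text. The following is only a minimal sketch in Python: the function names `single_letter_ratio` and `looks_broken` and the 0.3 threshold are my own assumptions for illustration, not part of zathura or any of its plugins.

```python
def single_letter_ratio(text: str) -> float:
    """Return the fraction of whitespace-separated tokens that are one letter long."""
    tokens = text.split()
    if not tokens:
        return 0.0
    singles = sum(1 for tok in tokens if len(tok) == 1 and tok.isalpha())
    return singles / len(tokens)


def looks_broken(text: str, threshold: float = 0.3) -> bool:
    """Heuristic check: flag text where too many tokens are single letters.

    The threshold is an arbitrary assumption; ordinary English prose
    contains few one-letter words ("a", "I"), so it stays well below it.
    """
    return single_letter_ratio(text) > threshold


# Samples taken from the copy-paste results reported in this issue.
broken = "T h i s p ap e r c ov e r s t h e l i t e r at u r e r e l at e d"
clean = 'This paper covers the literature related to the "new new theory of trade".'

if __name__ == "__main__":
    print("broken sample flagged:", looks_broken(broken))
    print("clean sample flagged:", looks_broken(clean))
```

Pasting the text copied from each reader into `looks_broken` gives a quick yes/no on whether that reader shows the extra spacing, without comparing the output by eye.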

Thanks!
