PDF Tools
After a long search (see the section below), I came across this tool for PDF annotation. PDF Tools is a DocView replacement for PDF files, on steroids. It renders PDF files in memory on demand, which allows extended capabilities such as annotation, search, and much more, so a PDF can be handled much like another Org file. See it in action here or in my screencast.
Installation
Update: the declaration discussed in "What would be the correct `use-package` declaration? · Issue #528 · politza/pdf-tools · GitHub" is NOT working for me.
Install pdf-tools from the pinguim06 branch; use the official git repository for a newer version (11/29/2016-04:15:04 PM). let-alist-1.0.1 is needed on Arch Linux in my case. Download let-alist-1.0.1 from the bottom of this link, and install it via M-x package-install-file.

    $ git clone https://github.com/pinguim06/pdf-tools
    $ cd /path/to/pdf-tools
    $ make install-server-deps # optional, does NOT work for Arch Linux
    $ make -s

make produces the ELP file pdf-tools-${VERSION}.tar. This package contains all the necessary files for Emacs and may be installed either by using

    $ make install-package

or by executing the following Emacs command:

    M-x package-install-file RET pdf-tools-${VERSION}.tar RET

NB: rebuild and reinstall if pdf-tools stops working after a poppler update.
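The rebuild-and-reinstall step from the NB above can be wrapped in a small shell function. This is only a sketch: the name rebuild_pdf_tools and the MAKE override are my own assumptions for illustration and are not part of pdf-tools; the two make invocations are the same ones used above.

```shell
# Sketch: rebuild pdf-tools and reinstall the package after a poppler update.
# rebuild_pdf_tools and the MAKE override are hypothetical, not part of pdf-tools.
rebuild_pdf_tools() {
    src_dir="$1"                 # path to the pdf-tools checkout
    make_cmd="${MAKE:-make}"     # lets you substitute another make (or a dry-run command)
    # Run in a subshell so the caller's working directory is left unchanged.
    (
        cd "$src_dir" || exit 1
        "$make_cmd" -s && "$make_cmd" install-package
    )
}
```

Typical use would be `rebuild_pdf_tools /path/to/pdf-tools`, followed by restarting Emacs.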
Then, in the Emacs init file:

    (package-initialize t)
    (package-activate 'pdf-tools)
    (pdf-tools-install)

    ;; grab org-pdfview from https://github.com/markus1189/org-pdfview/blob/master/org-pdfview.el
    (with-eval-after-load 'org
      (require 'org-pdfview)
      ;; org-file-apps is only defined once org has been loaded
      (add-to-list 'org-file-apps '("\\.pdf\\'" . org-pdfview-open))
      (add-to-list 'org-file-apps '("\\.pdf::\\([[:digit:]]+\\)\\'" . org-pdfview-open)))

    ;; bind both M-h and <tab> to "highlight the selected region"
    (eval-after-load 'pdf-view
      '(define-key pdf-view-mode-map (kbd "M-h") 'pdf-annot-add-highlight-markup-annotation))
    (eval-after-load 'pdf-view
      '(define-key pdf-view-mode-map (kbd "<tab>") 'pdf-annot-add-highlight-markup-annotation))
pdf-tools-org
pdf-tools-org is an Emacs package that provides integration between pdf-tools and org-mode. The main features are importing and exporting PDF annotations from/to Org files.

    ;; importing and exporting pdf annotations from/to org files.
    ;; https://github.com/pinguim06/pdf-tools-org
    (add-to-list 'load-path "/home/tan/config/emacs/extend/pdf-tools-org/")
    (require 'pdf-tools-org)

    ;; export the annotations to org whenever the pdf file is saved
    (add-hook 'after-save-hook
              (lambda ()
                (when (eq major-mode 'pdf-view-mode)
                  (pdf-tools-org-export-to-org-mod))))
Modified annotation export function that includes a link back to the PDF.

    ;; modify the original function (see below) to add page link back to pdf file, Tan
    (defun pdf-tools-org-export-to-org-mod ()
      "Export annotations to an Org file."
      (interactive)
      (let ((annots (sort (pdf-annot-getannots) 'pdf-annot-compare-annotations))
            (filename (format "%s.org"
                              (file-name-sans-extension
                               (buffer-name))))
            ;; bound locally instead of via a global setq
            (fname buffer-file-name)
            (buffer (current-buffer)))
        (with-temp-buffer
          ;; org-set-property sometimes never returns if buffer not in org-mode
          (org-mode)
          (insert (concat "#+TITLE: Annotation notes for " (file-name-sans-extension filename)))
          (mapc
           (lambda (annot) ;; traverse all annotations
             (org-insert-heading-respect-content)
             ;; retrieve page # for annotation (number-to-string (assoc-default 'page annot))
             ;; (insert (number-to-string (nth 1 (assoc-default 'edges annot)))) for margin
             (insert (concat (symbol-name (pdf-annot-get-id annot))
                             "\s[[pdfview:" fname "::" (number-to-string (assoc-default 'page annot))
                             "++" (number-to-string (nth 1 (assoc-default 'edges annot)))
                             "][p." (number-to-string (assoc-default 'page annot)) "]]"))
             (insert (concat " :" (symbol-name (pdf-annot-get-type annot)) ":"))
             ;; insert text from marked-up region in an org-mode quote
             (when (pdf-annot-get annot 'markup-edges)
               (insert (concat "\n#+BEGIN_QUOTE\n"
                               (with-current-buffer buffer
                                 (pdf-info-gettext (pdf-annot-get annot 'page)
                                                   (pdf-tools-org-edges-to-region
                                                    (pdf-annot-get annot 'markup-edges))))
                               "\n#+END_QUOTE")))
             (insert (concat "\n\n" (pdf-annot-get annot 'contents)))
             ;; set org properties for each of the remaining fields
             (mapc
              (lambda (field) ;; traverse all fields
                (when (member (car field) pdf-tools-org-exportable-properties)
                  (org-set-property (symbol-name (car field))
                                    (format "%s" (cdr field)))))
              annot))
           (cl-remove-if
            (lambda (annot) (member (pdf-annot-get-type annot) pdf-tools-org-non-exportable-types))
            annots))
          (org-set-tags 1)
          (write-file filename pdf-tools-org-export-confirm-overwrite))))

Save the annotations as an Org file whenever the PDF file is saved.

    ;; export the annotations to org whenever the pdf file is saved
    (add-hook 'after-save-hook
              (lambda ()
                (when (eq major-mode 'pdf-view-mode)
                  (pdf-tools-org-export-to-org-mod))))
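Each heading written by the modified export function starts with a link of the form [[pdfview:FILE::PAGE++OFFSET][p.PAGE]], where OFFSET is the second element of the annotation's edges. As a rough illustration of that link format only, here is a shell sketch; pdfview_link is a hypothetical helper name, and the sample path and numbers are made up.

```shell
# Sketch: print an Org link in the same shape the export function inserts.
# pdfview_link is a hypothetical helper; arguments are FILE, PAGE, OFFSET.
pdfview_link() {
    printf '[[pdfview:%s::%s++%s][p.%s]]\n' "$1" "$2" "$3" "$2"
}

# Made-up sample values, for illustration only.
pdfview_link /path/to/paper.pdf 3 0.25
```

Following such a link from the exported Org file relies on org-pdfview being set up as in the configuration earlier.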
Export the PDF outline as Org headings, extract images of square annotations, and inline them; see the code by myrjola. Not all PDFs have outlines.

    (defun pdf-annot-markups-as-org-text (pdfpath &optional title level)
      "Acquire highlight annotations as text"
      (interactive "fPath to PDF: ")
      (let* ((outputstring "") ;; the text to be returned
             (title (or title (replace-regexp-in-string "-" " " (file-name-base pdfpath))))
             (level (or level (1+ (org-current-level)))) ;; I guess if we're not in an org-buffer this will fail
             (levelstring (make-string level ?*))
             (pdf-image-buffer (get-buffer-create "*temp pdf image*")))
        (with-temp-buffer ;; use temp buffer to avoid opening zillions of pdf buffers
          (insert-file-contents pdfpath)
          (pdf-view-mode)
          (pdf-annot-minor-mode t)
          (ignore-errors (pdf-outline-noselect (current-buffer)))

          (setq outputstring (concat levelstring " Annotations from " title "\n\n")) ;; create heading

          (let* ((annots (sort (pdf-annot-getannots nil (list 'square 'highlight) nil)
                               'pdf-annot-compare-annotations))
                 (last-outline-page -1))
            (mapc
             (lambda (annot) ;; traverse all annotations
               (message "%s" annot)
               (let* ((page (assoc-default 'page annot))
                      (height (nth 1 (assoc-default 'edges annot)))
                      (type (assoc-default 'type annot))
                      (id (symbol-name (assoc-default 'id annot)))
                      (text (pdf-info-gettext page (assoc-default 'edges annot)))
                      (imagefile (concat id ".png"))
                      (region (assoc-default 'edges annot))
                      ;; use pdfview link directly to page number
                      (linktext (concat "[[pdfview:" pdfpath "::" (number-to-string page)
                                        "++" (number-to-string height) "][" title "]]"))
                      ;; The default export is for highlight annotations
                      (annotation-as-org (concat text "\n(" linktext ", " (number-to-string page) ")\n\n")))

                 ;; Square annotations are written to images and displayed inline
                 (when (eq type 'square)
                   (pdf-view-extract-region-image (list region) page (cons 1000 1000) pdf-image-buffer)
                   (with-current-buffer pdf-image-buffer
                     (write-file imagefile))
                   (setq annotation-as-org (concat "[[file:" imagefile "]]" "\n\n(" linktext ", " (number-to-string page) ")\n\n")))

                 ;; Insert outline heading if not already inserted
                 (let* ((outline-info (ignore-errors
                                        (with-current-buffer (pdf-outline-buffer-name)
                                          (pdf-outline-move-to-page page)
                                          (pdf-outline-link-at-pos))))
                        (outline-page (when outline-info (number-to-string (assoc-default 'page outline-info)))))
                   (when outline-info
                     (unless (equal last-outline-page outline-page)
                       (setq outputstring (concat outputstring
                                                  (make-string (+ level (assoc-default 'depth outline-info)) ?*)
                                                  " "
                                                  (assoc-default 'title outline-info)
                                                  ", "
                                                  outline-page
                                                  "\n\n"))
                       (setq last-outline-page outline-page))))

                 (setq outputstring (concat outputstring annotation-as-org))))
             annots)))
        (insert outputstring)))
Note Taking with PDF-Tools
Yet another resource.
Virtual PDFs, from here:
- Enable recognition of *.vpdf files via M-x pdf-virtual-global-mode.
- Open a PDF in a directory with many more PDF files.
- M-x pdf-virtual-create-buffer
- C-c C-c
This would create a virtual PDF representing all (complete) pages of all documents in the directory. And for the most part it acts just like any other PDF document, with the exception that the document is read-only, i.e. no modification of annotations. The file-format is explained in the buffer.
This would be even more fun, if we could display multiple pages (e.g. snippets of multiple math-definitions, which you can’t seem to remember) per window.
Some keybindings
Navigation | |
---|---|
Scroll Up / Down by page-full | space / backspace |
Scroll Up / Down by line | C-n / C-p |
Scroll Right / Left | C-f / C-b |
Top of Page / Bottom of Page | < / > |
Next Page / Previous Page | n / p |
First Page / Last Page | M-< / M-> |
Incremental Search Forward / Backward | C-s / C-r |
Occur (list all lines containing a phrase) | M-s o |
Jump to Occur Line | RETURN |
Pick a Link and Jump | F |
Incremental Search in Links | f |
History Back / Forwards | B / N |
Display Outline | o |
Jump to Section from Outline | RETURN |
Jump to Page | M-g g |
Display | |
Zoom in / Zoom out | + / - |
Fit Height / Fit Width / Fit Page | H / W / P |
Trim margins (set slice to bounding box) | s b |
Reset margins | s r |
Reset Zoom | 0 |
Annotations | |
List Annotations | C-c C-a l |
Jump to Annotations from List | SPACE |
Mark Annotation for Deletion | d |
Delete Marked Annotations | x |
Unmark Annotations | u |
Close Annotation List | q |
Add and edit annotations | via Mouse selection and left-click context menu |
Syncing with AUCTeX | |
Jump to PDF location from source | C-c C-g |
Jump to source location from PDF | C-mouse-1 |
Miscellaneous | |
Refresh File (e.g., after recompiling source) | g |
Print File | C-c C-p |
org-noter [NEW as of 2019.9.6]
Combining org-noter and org-ref
from Notes on Org-noter — Dani
With org-noter and org-ref installed, I find it easier to add a reference from the org-ref insert cite-link interface (C-c ]) in the Org file I'm working with, then press RET with the cursor on the citation link to open the document with PDFView from the PDF Tools package. In a file with multiple references, I can also jump to the notes by adding the following to org-ref-helm-user-candidates:

    (defun org-ref-noter-at-point ()
      "Open the pdf for bibtex key under point if it exists."
      (interactive)
      (let* ((results (org-ref-get-bibtex-key-and-file))
             (key (car results))
             (pdf-file (funcall org-ref-get-pdf-filename-function key)))
        (if (file-exists-p pdf-file)
            (progn
              (find-file-other-window pdf-file)
              (org-noter))
          (message "no pdf found for %s" key))))

    (add-to-list 'org-ref-helm-user-candidates
                 '("Org-Noter notes" . org-ref-noter-at-point))

The "Org-Noter notes" entry will then appear under User functions.
PDF-extract
Original resource from here. This tool is a little gem which is useful to get a list of BibTeX references from a scholarly PDF article.
Install Ruby if it is not already installed:

    sudo pacman -S ruby

Before you use RubyGems, you should add $(ruby -e "print Gem.user_dir")/bin to your $PATH. You can do this by adding the following line to ~/.bashrc:

    PATH="$(ruby -e 'print Gem.user_dir')/bin:$PATH"

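After editing ~/.bashrc and opening a new shell, it is worth checking that the gem directory really ended up on PATH. A minimal sketch of such a check follows; path_contains is a hypothetical helper name of my own, not something RubyGems provides.

```shell
# Sketch: report whether a directory is listed in a PATH-style string.
# path_contains is a hypothetical helper: path_contains DIR [PATHSTRING]
path_contains() {
    dir="$1"
    paths="${2:-$PATH}"          # default to the current PATH
    # Wrap both sides in colons so only whole entries match.
    case ":$paths:" in
        *":$dir:"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `path_contains "$(ruby -e 'print Gem.user_dir')/bin" && echo "gem bin dir is on PATH"`.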
Install pdf-extract

NB: use the cloned pdf-extract (not the same as the zip) to get the extract-bib option; the version installed by gem install pdf-extract does NOT have it, and extract-bib will fail with an error.

    git clone https://github.com/CrossRef/pdfextract
    cd pdfextract
    gem build pdf-extract.gemspec
    gem install pdf-extract-0.1.1.gem

If you run into an error at this point, reinstall pdf-reader at version 1.2.0:

    gem uninstall pdf-reader
    gem install pdf-reader -v 1.2.0

Usage
Extract references and a title from a PDF:

    $ pdf-extract extract --references --titles myfile.pdf

Resolve references to DOIs and output related metadata as BibTeX:

    $ pdf-extract extract-bib --resolved_references myfile.pdf

Enjoy and abuse it…

    pdf-extract extract-bib --resolved_references ./test.pdf
    Found DOI from Text: 10.1002/elps.201100160 (Score: 4.5962505)
    Found DOI from Text: 10.1039/b516119c (Score: 4.4811935)
    Found DOI from Text: 10.1039/b701116d (Score: 5.687472)
    Found DOI from Text: 10.1039/b906257b (Score: 4.2532096)
    Found DOI from Text: 10.1039/c0lc00443j (Score: 3.6910071)
    Found DOI from Text: 10.1063/1.3605509 (Score: 2.6751742)
    Found DOI from Text: 10.1038/nmeth.1481 (Score: 3.1425736)

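If only the DOIs are wanted, the "Found DOI from Text" lines shown above can be filtered with sed. This is a sketch under assumptions: extract_dois is a hypothetical helper name, and it relies on the log lines having exactly the shape shown in the sample output.

```shell
# Sketch: pull the DOIs out of "Found DOI from Text: ... (Score: ...)" lines.
# extract_dois is a hypothetical helper; it reads the log on stdin.
extract_dois() {
    sed -n 's/^Found DOI from Text: \([^ ]*\) (Score: .*$/\1/p'
}
```

Something like `pdf-extract extract-bib --resolved_references ./test.pdf 2>&1 | extract_dois` should then print one DOI per line; the `2>&1` is there because I have not checked which stream the log lines are written to.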
Extract annotations from PDF - my old notes
- Updated poppler together with its dependencies.
- Method 1: install leela from GitHub or the Arch AUR site. Note: the man page is not as up to date as the GitHub page; the command is leela annots [NOT annotations].
- Method 2: Zotero, Mendeley, a tablet, et al. Download the C code and Makefile (papers/books/tmp/folder), run make, then run:

      ./Annot file://`pwd`/life-in-the-frozen-state.pdf | less

- Method 1 with R retrieves lots of information; XSLT is needed to format the data (XML attributes).
- Both methods come up one page short; I guess it is a poppler thing. Method 2 does not tell whether an annotation is a highlight or a link (it does show the pages); Method 1 DOES this well.
- Reference management with Emacs, BibTeX, and Zotero | Tomás S. Grigera